Linux, FreeBSD, Juniper, Cisco / Network security articles and troubleshooting guides


Author: mandrei99
Posted: Fri Jan 16, 2015 5:41 am


Linux TSO (TCP segmentation offload) - what it means and how to enable/disable it

Why am I seeing packets larger than the MTU in tcpdump?


You have probably seen at least one Wireshark or tcpdump capture showing strange packet sizes, well above the usual MSS and MTU values (1460 and 1500 bytes for Ethernet).

Before you begin, read: How the Linux TCP output engine works.
Here is an example:
Code:
03:52:23.511915 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [S], seq 92802589, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 3052623680 ecr 0], length 0
03:52:23.511952 IP 2.2.2.2.80 > 1.1.1.1.31586: Flags [S.], seq 512157486, ack 92802590, win 14480, options [mss 1460,sackOK,TS val 303700 ecr 3052623680,nop,wscale 9], length 0
03:52:23.614824 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [.], ack 1, win 4117, options [nop,nop,TS val 3052623784 ecr 303700], length 0
03:52:23.615236 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [P.], seq 1:116, ack 1, win 4117, options [nop,nop,TS val 3052623784 ecr 303700], length 115
03:52:23.615256 IP 2.2.2.2.80 > 1.1.1.1.31586: Flags [.], ack 116, win 29, options [nop,nop,TS val 303725 ecr 3052623784], length 0
03:52:23.615428 IP 2.2.2.2.80 > 1.1.1.1.31586: Flags [.], seq 1:14481, ack 116, win 29, options [nop,nop,TS val 303726 ecr 3052623784], length 14480
03:52:23.720612 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [.], ack 2897, win 4077, options [nop,nop,TS val 3052623888 ecr 303726], length 0
03:52:23.721956 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [.], ack 5793, win 4095, options [nop,nop,TS val 3052623889 ecr 303726], length 0
03:52:23.721982 IP 2.2.2.2.80 > 1.1.1.1.31586: Flags [.], seq 14481:23169, ack 116, win 29, options [nop,nop,TS val 303752 ecr 3052623889], length 8688
03:52:23.722413 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [.], ack 8689, win 4095, options [nop,nop,TS val 3052623890 ecr 303726], length 0
03:52:23.722431 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [.], ack 11585, win 4095, options [nop,nop,TS val 3052623890 ecr 303726], length 0


The packet at timestamp 03:52:23.615236 comes from the client with the TCP push flag set (it tells the remote host to push the data up the stack, from kernel to application, which involves a context switch). This is most probably the GET request sent by the browser, 115 bytes long.
The packet at 03:52:23.615256 is an empty ACK: the server kernel acknowledges the segment just sent by the client. The next packet, at 03:52:23.615428, is a TCP packet from server to client carrying the actual payload. Note the length: 14480.

A packet of this length cannot be sent over an Ethernet link with an MTU of 1500. Here is the thing: tcpdump and Wireshark capture packets inside the kernel (through the BPF packet filter tap on Linux), so they show what the kernel hands down, not what the network card actually sends. So what does this mean?
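
To spot such packets in a longer capture, the text output of tcpdump can be filtered by its trailing "length N" field. The following is only a sketch: the function name oversized_packets and the default MSS of 1460 are assumptions made for illustration, and the two sample lines are copied from the capture above.

```python
import re

# tcpdump ends each TCP line with "length N" (payload bytes).
LENGTH_RE = re.compile(r"length (\d+)\s*$")

def oversized_packets(lines, mss=1460):
    """Return (timestamp, length) for every line whose payload exceeds mss."""
    hits = []
    for line in lines:
        match = LENGTH_RE.search(line)
        if match is None:
            continue  # not a packet line, or no length field
        length = int(match.group(1))
        if length > mss:
            timestamp = line.split()[0]
            hits.append((timestamp, length))
    return hits

sample = [
    "03:52:23.615236 IP 1.1.1.1.31586 > 2.2.2.2.80: Flags [P.], seq 1:116, "
    "ack 1, win 4117, options [nop,nop,TS val 3052623784 ecr 303700], length 115",
    "03:52:23.615428 IP 2.2.2.2.80 > 1.1.1.1.31586: Flags [.], seq 1:14481, "
    "ack 116, win 29, options [nop,nop,TS val 303726 ecr 3052623784], length 14480",
]

print(oversized_packets(sample))  # [('03:52:23.615428', 14480)]
```

Any hit on an interface with a 1500 byte MTU suggests the capture was taken before segmentation happened.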

TCP Segmentation offload - TSO


To save CPU time, the Linux/FreeBSD/Windows kernel works out the receive window of the TCP client and the send window for this connection, then pushes down as much data at once as these limits permit.

TCP segmentation offload lets the kernel hand one large TCP segment to the NIC (through its driver), which cuts it into MSS-sized segments, instead of the main CPU doing that work in the kernel's TCP stack.

In this case, the client receive window is 4117*2^6 = 263488 bytes (the advertised window of 4117 scaled by wscale 6). The server's initial send (congestion) window is 10 TCP segments, so the kernel prepares a buffer of slightly less than 10*1460 bytes: each segment carries only 1448 bytes of payload because the TCP timestamp option takes 12 bytes of every segment, which gives 10*1448 = 14480 bytes. This buffer is passed from the Linux kernel to the interface driver for the actual segmentation, along with the parameters the NIC needs to split it (such as the MSS).
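
The arithmetic behind these numbers can be checked with a few lines of Python. This is a sketch only: the function names are made up for illustration, and the 12 byte figure assumes the options seen in the capture (a 10 byte timestamp option plus two NOP padding bytes).

```python
# 10-byte TCP timestamp option + 2 NOP bytes, as in "[nop,nop,TS val ... ecr ...]"
TCP_TIMESTAMP_OPTION_BYTES = 12

def scaled_window(advertised, wscale):
    """Real window size: the advertised value shifted left by the scale factor."""
    return advertised << wscale

def payload_per_segment(negotiated_mss, option_bytes=TCP_TIMESTAMP_OPTION_BYTES):
    """Payload left in each segment once TCP options are subtracted from the MSS."""
    return negotiated_mss - option_bytes

def initial_burst(cwnd_segments, segment_payload):
    """Bytes the sender may emit in its first flight."""
    return cwnd_segments * segment_payload

client_window = scaled_window(4117, 6)   # win 4117, wscale 6
payload = payload_per_segment(1460)      # mss 1460 from the SYN
burst = initial_burst(10, payload)       # initial window of 10 segments

print(client_window, payload, burst)     # 263488 1448 14480
```

The burst of 14480 bytes matches the length of the oversized packet in the capture, and it fits easily inside the client's 263488 byte receive window.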

Check if TSO is disabled/enabled


To confirm that TCP segmentation offload is responsible (the NIC, not the kernel's TCP stack, performs the segmentation), use "ethtool":
Code:
root@server:~# ethtool --show-offload eth0      # or: ethtool -k eth0
Features for eth0:
rx-checksumming: off [fixed]
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-unneeded: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
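
When many interfaces or hosts need checking, this output is easy to parse. The sketch below is an illustration, not part of ethtool: the function name parse_offload_features and the returned dictionary layout are assumptions, and the sample text is a shortened copy of the output above.

```python
def parse_offload_features(text):
    """Map each feature name to {'enabled': bool, 'fixed': bool}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Keep only "name: on|off [fixed]" lines; skip headers and prompts.
        if not sep or not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = {
            "enabled": words[0] == "on",
            "fixed": "[fixed]" in words,
        }
    return features

sample = """Features for eth0:
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
generic-segmentation-offload: on
large-receive-offload: off [fixed]
"""

features = parse_offload_features(sample)
print(features["tcp-segmentation-offload"])  # {'enabled': True, 'fixed': False}
print(features["large-receive-offload"])     # {'enabled': False, 'fixed': True}
```

A feature marked [fixed] cannot be changed with ethtool -K, so it is worth keeping that flag alongside the on/off state.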

Disable tcp segmentation offload:


Code:
root@server:~# ethtool -K eth0 tso off
root@server:~# ethtool -K eth0 gso off

Check that TCP segmentation offload is now disabled:
Code:
root@server:~# ethtool -k eth0
Features for eth0:
rx-checksumming: off [fixed]
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-unneeded: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: on
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off
        tx-tcp6-segmentation: off
udp-fragmentation-offload: on
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
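
Comparing the two listings by eye is error-prone. As a sketch, with the on/off states copied by hand from the two outputs above into dictionaries (the helper name changed_features is hypothetical):

```python
def changed_features(before, after):
    """Return {name: (old, new)} for features whose state differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }

# States copied from the two ethtool listings (subset of the features).
before = {
    "tcp-segmentation-offload": "on",
    "tx-tcp-segmentation": "on",
    "udp-fragmentation-offload": "on",
    "generic-segmentation-offload": "on",
    "generic-receive-offload": "on",
}
after = {
    "tcp-segmentation-offload": "off",
    "tx-tcp-segmentation": "off",
    "udp-fragmentation-offload": "on",
    "generic-segmentation-offload": "off",
    "generic-receive-offload": "on",
}

for name, (old, new) in changed_features(before, after).items():
    print(f"{name}: {old} -> {new}")
```

Only the TSO and GSO entries should show up as changed; receive-side offloads such as generic-receive-offload are untouched by the two commands.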


Let's confirm with a tcpdump:
Code:
04:28:07.291329 IP 1.1.1.1.18236 > 2.2.2.2.80: Flags [S], seq 2840543612, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 3054767697 ecr 0], length 0
04:28:07.291383 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [S.], seq 186047052, ack 2840543613, win 14480, options [mss 1460,sackOK,TS val 118394 ecr 3054767697,nop,wscale 9], length 0
04:28:07.397891 IP 1.1.1.1.18236 > 2.2.2.2.80: Flags [.], ack 1, win 4117, options [nop,nop,TS val 3054767805 ecr 118394], length 0
04:28:07.398712 IP 1.1.1.1.18236 > 2.2.2.2.80: Flags [P.], seq 1:116, ack 1, win 4117, options [nop,nop,TS val 3054767805 ecr 118394], length 115
04:28:07.398735 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 0
04:28:07.398896 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 1:1449, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398962 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 1449:2897, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398968 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 2897:4345, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398975 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 4345:5793, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398981 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 5793:7241, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398988 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 7241:8689, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.398995 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 8689:10137, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.399000 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 10137:11585, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.399006 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 11585:13033, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448
04:28:07.399012 IP 2.2.2.2.80 > 1.1.1.1.18236: Flags [.], seq 13033:14481, ack 116, win 29, options [nop,nop,TS val 118421 ecr 3054767805], length 1448


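!!! Turning off TSO alone is not enough. While generic segmentation offload (GSO) is still enabled, the kernel keeps building oversized segments and splits them in software just before handing them to the driver, so tcpdump still shows them. Disable GSO as well (ethtool -K eth0 gso off) !!!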
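
The second capture shows exactly what segmentation does to the 14480 byte burst. A minimal sketch of the split, assuming a payload of 1448 bytes per segment and tcpdump's relative sequence numbers (the function name segment_ranges is made up for illustration):

```python
def segment_ranges(first_seq, total_bytes, segment_payload):
    """Split one large send into (start, end) sequence ranges, tcpdump style."""
    ranges = []
    start = first_seq
    end = first_seq + total_bytes
    while start < end:
        stop = min(start + segment_payload, end)  # last piece may be shorter
        ranges.append((start, stop))
        start = stop
    return ranges

ranges = segment_ranges(1, 14480, 1448)
print(len(ranges))   # 10
print(ranges[0])     # (1, 1449)
print(ranges[-1])    # (13033, 14481)
```

The ten ranges line up with the "seq 1:1449" through "seq 13033:14481" packets in the capture taken with TSO and GSO disabled.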

Disabling TSO in Debian persistently:


There are two possible ways:
1. Add the ethtool commands to /etc/rc.local (straightforward).
2. Edit /etc/network/interfaces and add the "pre-up" lines below "iface eth0 inet static":
Code:
root@server:~# cat /etc/network/interfaces
# Network interface for debian
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
auto eth0
iface eth0 inet static
        pre-up /sbin/ethtool -K eth0 tso off
        pre-up /sbin/ethtool -K eth0 gso off
...
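
With several interfaces, the pre-up lines can be generated rather than typed. This is only a sketch: the helper name pre_up_lines is hypothetical, and it assumes ethtool lives at /sbin/ethtool as in the file above.

```python
def pre_up_lines(features, interface="eth0", ethtool="/sbin/ethtool"):
    """Build indented pre-up lines that switch each offload feature off."""
    return [
        f"        pre-up {ethtool} -K {interface} {feature} off"
        for feature in features
    ]

for line in pre_up_lines(["tso", "gso"]):
    print(line)
```

The printed lines match the two pre-up entries shown for eth0 and can be pasted under the matching "iface" stanza.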




