Virtualization Tip: Always disable checksumming on virtual ethernet devices

I’ve seen this cause problems on both Xen and VMware, though I suspect the problem is more widespread. When a virtual ethernet device has checksumming enabled, it can cause all sorts of network problems. I once got deep into debugging and found that, with checksumming enabled, packets larger than the MTU were being sent, which in turn triggered a stream of fragmentation messages. It was a mess, but easily corrected. Some specifics follow.

To query the checksumming settings of an ethernet device, run:

  • ethtool -k ethX

You might see something like this:

rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
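
On newer systems the list can be much longer, so it may help to print only the features that are still on. This is a minimal sketch that assumes the `name: on/off` layout shown above; the function name `enabled_offloads` is made up for illustration.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin and print the names of the
# features reported as "on" (including lines such as "on [fixed]").
enabled_offloads() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Usage (eth0 is an assumed interface name):
#   ethtool -k eth0 | enabled_offloads
```

For the sample output above, this would list tx-checksumming, scatter-gather and tcp segmentation offload.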

The most important thing is to make sure both RX and TX checksumming are disabled. RX was already off in the output above, so only TX needs to be changed. For Xen I find the following command works quite well:

  • ethtool -K eth0 tx off gso on

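The status then looks like this (scatter-gather and TCP segmentation offload are switched off as a side effect, since they depend on TX checksumming):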

rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: on
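
To confirm the change took effect without reading the listing by eye, the two checksumming lines can be checked in a script. This is only a sketch that assumes the output format shown here; `checksumming_disabled` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin. Exit status is 0 only if both
# rx-checksumming and tx-checksumming are present and start with "off".
checksumming_disabled() {
    awk -F': ' '
        $1 == "rx-checksumming" || $1 == "tx-checksumming" {
            seen++
            if ($2 !~ /^off/) bad = 1
        }
        END { exit (bad || seen != 2) }
    '
}

# Usage (eth0 is an assumed interface name):
#   ethtool -k eth0 | checksumming_disabled || echo "checksumming still on"
```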

This can be set (for Debian or Ubuntu based systems) in the /etc/network/interfaces file as a pre-up or post-up command or as a script in /etc/network/if-pre-up.d or /etc/network/if-up.d.
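
As a rough sketch of the script variant, the following could be saved as an executable file in /etc/network/if-up.d. The file name, the function name `disable_tx_checksum` and the `ETHTOOL` override variable are assumptions made for illustration; ifupdown itself supplies the `IFACE` variable.

```shell
#!/bin/sh
# Hypothetical /etc/network/if-up.d/disable-tx-checksum (name is an assumption).
# ifupdown exports IFACE with the name of the interface being brought up.

disable_tx_checksum() {
    iface="$1"
    # Skip loopback, an empty name, and the "--all" pseudo-interface
    # that ifupdown passes after "ifup -a".
    case "$iface" in
        lo|--all|"") return 0 ;;
    esac
    # Same command as in the post, applied to the interface that came up.
    # ETHTOOL is an assumed override so the binary can be swapped out.
    "${ETHTOOL:-ethtool}" -K "$iface" tx off gso on
}

disable_tx_checksum "${IFACE:-}"
```

The one-line alternative is to add the same command as a post-up line under the interface's stanza in /etc/network/interfaces, for example `post-up ethtool -K eth0 tx off gso on` for eth0.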

tx

Can you be more precise?

Should TX checksumming be disabled on dom0, domU, or both?

Sorry that wasn't clear

I mean for the domU, or for any virtual server running under software other than Xen. I have not found such a change necessary for the dom0 or host OS, but for all virtual servers it seems to fix networking issues. So disable it on any domUs that you have and see if you ever run into problems.

After test

Thanks.

I also tested with the paravirtualized drivers for Windows (GPL).
With these drivers you can toggle checksum offload and scatter/gather.

I tested with file copies to and from a network drive.

Set to on: lots of frames with incorrect checksums. Long frames (1500) are always OK; short frames (40b) are always incorrect (on port 1058, nim; does anyone know what this port is?)

Set to off: no incorrect frames.

Speed: 10% better when set to off.

That's useful information

In my testing I had incorrect frames regardless of packet size. It almost seemed like frames were being jumbled together, as packet sizes were often above 1500.

As for the performance, I think your numbers are accurate. However, in my testing the performance of the virtual drivers got progressively worse over time. What finally forced me to take action were transfer speeds 10x worse than they should have been. But it is reassuring that disabling checksumming seems to improve performance even before the problems become that severe.

Creative Commons License Except where otherwise noted, content on this site is licensed under a Creative Commons by-nc-sa 3.0 License