TCP WTF
February 8, 2012 11:42 AM

How do I fix this fast-then-slow download issue on a Linux machine? It doesn't happen on a Windows machine. TCP tuning nerdery probably required.

My office has a 12 megabit symmetric connection, shared among a dozen or so users via a SonicWall edge router and a 100 Mb Ethernet switch.

When running "yum update" on a Fedora machine, I noticed that downloads were beginning very quickly, but then tapering off to only a few kB/s after a meg or two. E.g., a 50MB download will start at upwards of 600kB/s, but after a few moments will start stalling and eventually end up crawling along at 3-4 kB/s.

Some other users reported similar behavior on Win7 boxes, and fixed it by disabling Windows TCP autotuning. I'm not sure exactly what behavior that disables, though, or what its Linux equivalent would be. I have already reduced the MTU to 1400 (from 1500) just to see if that was the issue, but it didn't have any effect.
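(For reference, I believe the Windows fix the others used was the netsh command below, run as administrator; the Linux MTU change was just ifconfig. The interface name eth0 is an assumption.)

    netsh interface tcp set global autotuninglevel=disabled
    ifconfig eth0 mtu 1400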

Running a speed test on a nearby Windows machine shows 12Mb/s available both up and down, so it's not a matter of there not being enough upstream for the ACKs.

Google turns up a lot of people reporting similar problems, but the typical response is to blame it on ISP throttling, which it definitely is not in my case. (Both because I know the connection isn't throttled or shaped, and because I can download the same file on a Windows box and get a much higher sustained transfer rate.) It definitely seems to be some sort of maladjusted TCP setting. I'm comfortable with Linux but have never had reason to do much TCP tuning before.

Technical details: The test machine is running Fedora 16 (linux 3.1.0-7.fc16.i686 SMP). I can get the same result regardless of what server I'm pulling data from; I've seen it from a variety of Fedora mirrors but also via SCP from a machine at home. It's been a while since I've babysat one of the Linux boxes while it was doing an update, so I'm not sure for how long the issue has been going on. Nothing on the LAN side has changed recently; the edge router and switch are both a few years old and otherwise work fine.
posted by Kadin2048 to Computers & Internet (10 answers total) 4 users marked this as a favorite
 
I'm having a hard time thinking of any TCP setting on a normal Linux box that would produce this behavior. Have you done any weird tinkering? I'd be more inclined to blame the router, in particular any QoS settings it might have enabled.

To diagnose this further I'd try a few things. First, I'd simplify the download test by using wget or scp to copy large files from a known-good download host outside your firewall. Second, I'd make sure other machines on the LAN can download the same file fine and that it's just this one Linux box. If it's just this one machine, then I'd break out Wireshark and start trying to diagnose the TCP connection. That's a lot of complicated work, though; there may be a simpler way.
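Something like this, for example (the URL is just a placeholder; any large file on a host you trust will do):

    wget -O /dev/null http://example.com/bigfile.iso
    scp user@knownhost:bigfile.iso /dev/null

Watching the transfer rate wget reports over the life of the download should make the fast-then-slow pattern obvious.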
posted by Nelson at 12:07 PM on February 8, 2012


Have you blocked ICMP messages such as Source Quench somewhere?
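A quick way to check, assuming iptables with the default chains; any DROP or REJECT rules matching ICMP would show up here:

    iptables -L -n -v | grep -i icmp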
posted by fritley at 12:22 PM on February 8, 2012


Although I'm a geek, my knowledge of networking is spotty. However, I vaguely remember something like this happening to a computer where I worked a while back. It was something to do with a mismatch between the router and the computer in their support for jumbo frames. It's possible that the hardware was ridiculously old and this is not a problem on even halfway decent hardware these days, but you might want to take a look at that.
posted by Betelgeuse at 12:23 PM on February 8, 2012


Response by poster: I've been running some experiments using SCP from a remote machine. Same behavior occurs.

Oddly ... I have an old Ubuntu machine (2.6.15-51-server) and it does not experience the issue -- it can pull stuff down over SCP all day at 200+ kB/sec.

Looking at the configuration differences between the two machines, the Fedora machine has tcp_congestion_control set to "cubic" while the old Ubuntu machine uses "bic". Also, the initial_ssthresh parameter of the algorithm is 100 on Ubuntu, but set to 0 on Fedora.

When I raised the initial_ssthresh parameter to 100 on Fedora, it seemed to take longer for connections to stall than when it was set to zero. E.g., instead of stalling at 5-6MB, it went for more like 13 or 14MB before slowing to a near-halt. (I didn't rerun these tests that many times, so the numbers aren't that reliable. But there did seem to be a difference.)

Not sure what, if anything, that indicates.
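(For anyone following along, these are the knobs in question; the paths assume the cubic module is loaded, and the echo requires root:)

    cat /proc/sys/net/ipv4/tcp_congestion_control
    cat /sys/module/tcp_cubic/parameters/initial_ssthresh
    echo 100 > /sys/module/tcp_cubic/parameters/initial_ssthresh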

Have you blocked ICMP messages such as Source Quench somewhere?

Not intentionally, no. The Fedora machine is a fresh install of FC16 from media, I didn't do any customization of it. Totally OOTB as far as networking config goes.
posted by Kadin2048 at 12:29 PM on February 8, 2012


Best answer: As per Nelson, I would make sure there is no QoS or anything weird in the iptables rules, and simplifying the test would help narrow down where the problem is.

What is your typical round-trip time to most internet sites? Are you going over a satellite? This is a large pipe and you might wish to try some options related to "Long Fat Networks" (LFNs). For example, the TCP window scaling parameter (/proc/sys/net/ipv4/tcp_window_scaling) could be adjusted.
I would also look at adjusting the transmission queue with 'ifconfig eth0 txqueuelen 2048', and there is a similar receive buffer, I believe.
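A sketch of those knobs (the values are illustrative, eth0 is assumed, and the sysctl write needs root):

    sysctl net.ipv4.tcp_window_scaling                  # check whether window scaling is on
    ifconfig eth0 txqueuelen 2048                       # raise the transmit queue
    sysctl -w net.ipv4.tcp_rmem='4096 87380 4194304'    # receive buffer min/default/max, in bytes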

I have found:
1) The auto-tune TCP features are not very good for LFNs.
2) Some drivers will force this issue because they have unacceptable latency and trip the congestion window.

In my experience this behaviour is caused by the sender not getting the ACKs back in the expected time, so the sliding window closes (search "TCP congestion avoidance"). The speed of TCP is all about keeping the sliding window as large as possible.
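Back-of-the-envelope, assuming an RTT of about 20 ms: the link holds bandwidth × RTT of unacknowledged data in flight, i.e. 12 Mbit/s × 0.02 s ≈ 30 kB. If delayed ACKs shrink the window much below that, the transfer can no longer fill the pipe.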
posted by njk at 12:31 PM on February 8, 2012


Response by poster: The network isn't over satellite or RF or anything like that. Round-trip ping to most places on the Internet (nearest Google cluster, MetaFilter, Amazon) is under 100 ms. The ping to the server I'm using for transfer tests is only 19.5 ms with an mdev of 0.9 ms, over 20 pings.
posted by Kadin2048 at 12:38 PM on February 8, 2012


Best answer: this behaviour is caused because it isnt getting the acks back in expected time and the sliding windows is closing

I think you're on to something here. I decided (mostly on a lark) to disable tcp_window_scaling* and the problem went away. Something is still not perfect, because it starts up around 400kB/sec and drops down to around 260kB/s after a few MB, but it doesn't stall out completely like it was doing earlier.

Strangely, tcp_window_scaling is enabled on the Ubuntu machine that isn't exhibiting the problem, so the real issue must be some other parameter, but turning it off completely seems to be a brute-force way of making it less broken.

It seems other people have had similar issues, though I have not yet tested that fix.

* For those reading in the future, as root do: "echo 0 > /proc/sys/net/ipv4/tcp_window_scaling". This is not a persistent fix, though; for that you have to set it in sysctl.conf.
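The persistent version would be something like this (assuming the stock sysctl setup):

    # append to /etc/sysctl.conf, then reload with "sysctl -p" as root
    net.ipv4.tcp_window_scaling = 0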
posted by Kadin2048 at 1:12 PM on February 8, 2012


Best answer: Not entirely sure if this is a fix, but it might contribute to the troubles. Turn your txqueuelen down. NOT to zero. But close. I turned mine to 1 and got some drops and overruns, and adjusted it up to 5 and had no problems.

What purportedly happens is that acks get queued up in the txqueue buffer when downstream packets are coming in fast, and the other end is adjusting its tcp window down every time it doesn't get an ack. Then it gets a dozen of them and speeds back up. This oscillates up and down and the server (technical talk!) gets pissed and eventually just slows traffic down permanently for the session. Cranking down the queue allows tcp to do its windowing properly based on the machine actually being able to flow acks back upstream. Letting TCP do its job without too many buffers in the way lets it go as fast as possible.

This is also non-persistent: "ifconfig ethX txqueuelen #", where X is the interface number and # is your queue length.
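Concretely, something like the following, with eth0 assumed and the value found by trial and error:

    ifconfig eth0 txqueuelen 5         # set a very short transmit queue
    ifconfig eth0 | grep txqueuelen    # verify the change took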
posted by gjc at 4:00 PM on February 8, 2012


I see you've done a lot of work already, but hit up the ICSI Netalyzr for a quick analysis of the basics (including bufferbloat, discussed in the blue recently).
posted by mendel at 6:40 PM on February 8, 2012


Response by poster: For the moment I just turned TCP window scaling off so that I could get the system to update successfully, but I may mess around with the txqueuelen parameter as gjc suggests, as that seems like a much more elegant solution.

Thanks, everyone!
posted by Kadin2048 at 12:14 PM on February 9, 2012


This thread is closed to new comments.