iptables stops?
April 29, 2005 8:42 AM

One of my clients is running a Linux firewall (FC3, kernel 2.6.10-1). Occasionally, the firewall box will stop passing packets on the internal interface for what seems to be no reason at all.

The symptom is that the internal interface simply stops passing packets. There are a lot of "martian" packets on the network, but I don't know how to filter those out or whether they're the cause. Nothing besides the martians shows up in the logs when things freeze, 'cept for the distinct absence of logging messages. The problem, unbelievably enough, goes away if you hit enter a few times on the console and wake the machine back up. Until someone walks up to the console and kicks it, the whole office can't forward packets through the firewall in any way, and the machine is completely unreachable 'cept via console.
I didn't set it up, but since the guy who did has proven incapable of fixing it, they asked me to look at it. Any suggestions? Googling for "iptables stop" didn't turn anything up.
posted by SpecialK to Computers & Internet (4 answers total)
 
Response by poster: Err, that should've been: googling for "iptables stops", "iptables freeze", and the rest of the usual suspects didn't turn anything up... but my caffeine-deprived brain isn't helping my poor google-fu any.
posted by SpecialK at 8:44 AM on April 29, 2005


I'd remove the apmd package (and its equivalent ACPI version) - something is putting parts of the system to sleep. Also check in the BIOS for any power-saving options.
posted by LukeyBoy at 8:50 AM on April 29, 2005


I'd agree with the ACPI/APM suggestion. As for the martians:
if [ -r /proc/sys/net/ipv4/conf/all/log_martians ]; then
  echo "1" > /proc/sys/net/ipv4/conf/all/log_martians
fi
will log them.

There are likely two different subnets (some folks on a 10 net, others on 192.168.0.0/24, perhaps) connected to the same hub/switch that are causing them in the first place; you may be able to track down where they're coming from using the logs.
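Once martians are being logged, the kernel normally follows each "martian source" line with an "ll header:" line holding the raw link-layer header, which is one way to get at a hardware address. Here's a rough sketch for pulling the sender's MAC out of those lines. It assumes plain Ethernet framing (destination MAC in bytes 1-6, source MAC in bytes 7-12) and a log line shaped like the sample in the comment; the function name and the /var/log/messages path are just my assumptions, so adjust to taste.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the source
# MAC found in each "ll header:" line. Assumed line shape (Ethernet only):
#   ... kernel: ll header: ff:ff:ff:ff:ff:ff:00:11:22:33:44:55:08:00
# Fields 1-6 are the destination MAC, 7-12 the source MAC, 13-14 the ethertype.
ll_header_src_mac() {
  # Keep only the colon-separated hex string, then cut out bytes 7-12.
  sed -n 's/.*ll header: \([0-9a-fA-F:]*\).*/\1/p' | cut -d: -f7-12
}

# Example use (log path is an assumption), counting hits per sender:
#   grep 'll header' /var/log/messages | ll_header_src_mac | sort | uniq -c
```

Lines that don't contain an "ll header:" field produce no output, so it's safe to feed it the whole log.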

If removing the power saving bits doesn't help I'd start googling around for problems related to the network driver. I know that there are several which go brain dead in 2.6.x under high load that were recently fixed in the run-up to 2.6.12 (most notably the tg3 driver that's used by many modern HPaqs).
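To see which driver is actually bound to an interface, something like this should work on a 2.6 kernel with sysfs mounted. It's only a sketch: nic_driver is a made-up helper name, eth1 is a placeholder interface, and it assumes the device/driver symlink exists under /sys/class/net for that interface.

```shell
# Hypothetical helper: print the name of the kernel driver bound to a
# network interface by following its sysfs "driver" symlink.
# $1 = interface name; $2 = sysfs mount point (optional, defaults to /sys).
nic_driver() {
  iface=$1
  sysroot=${2:-/sys}
  link="$sysroot/class/net/$iface/device/driver"
  # No symlink means no such interface, or a device without a bound driver.
  [ -L "$link" ] || return 1
  # The link target ends in the driver's directory name, e.g. ".../drivers/tg3".
  basename "$(readlink "$link")"
}

# Example use (interface name is a placeholder):
#   nic_driver eth1
```

Where ethtool is installed, `ethtool -i eth1` reports the driver name and version as well.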
posted by togdon at 11:12 AM on April 29, 2005


Response by poster: Yep, this is definitely a high-load situation. Something like a gig and a half per 6 hour business day down the line, not counting DNS records that are cached at the router.

I'll look at the NIC thing again; the NIC that's doing it is a generic on-board NIC, so who knows what's actually running it.

Re: the martians, actually, they don't have any other subnets. It's a small (~50 ppl) office that uses the internet heavily for research; the firewall/router is plugged into a T-1 and a DSL line. They went with a Linux soft router vs. Cisco because it was much cheaper than what their consultant told them they'd need to do that kind of load-sharing. The martians are supposedly coming from 255.255.255.255, which to me looks like a heavily mangled packet. I think I managed to trace them back to one network connection at the patch panel, but replacing the equipment on the other end of that line doesn't fix anything, and it's kind of hard to trace the location of the packets other than that. I wish I could get a hardware address off of one of the martians, because thanks to DHCP we know all of the hardware addresses in the building, but I can't seem to get a good one.
posted by SpecialK at 12:25 PM on April 29, 2005

