[H-GEN] Improving UDP performance on firewall?

David Jericho davidj at tucanatech.com
Thu Dec 2 20:41:40 EST 2004


David Harrison wrote:

> Anyone have any suggestions on how to improve UDP performance on a 
> Linux firewall (2.4.26 kernel) ? We currently have a number of hosts 
> behind the firewall, and as soon as we start getting a certain amount 
> of traffic, we're seeing poor performance - pings to the firewall and 
> boxes behind the firewall go up and we start dropping packets. The 
> majority of the traffic is UDP (games and streaming media).
>
> I'm fairly confident it's not the router - I can ping the router's
> internal interface and not see any of the performance degradation, but
> at the next hop (the firewall) it starts dropping packets, higher ping
> times, etc. I assume that when pinging the router's internal
> interface, the packets are going through the same routing process as a
> packet going to the next hop would, so based on that I've tentatively
> ruled out the router.


I notice in another email you mention the Broadcom TG3 chips, along with 
an Intel 8255x series board. Firstly, use the bcm5700 driver from 
Broadcom for the Broadcom adapters, and use the Intel e100 module (not 
the standard eepro100 module). Both have proven performance benefits. 
The bcm5700 driver can use the NAPI polling mode in the kernel, leading 
to significantly reduced interrupts on the bus. If you're using an 
interrupt driven mode on your card (which I believe the tg3 driver 
does), you'll see an interrupt for every packet that comes in.
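
To put a rough number on the interrupt load, one approach is to sample
/proc/interrupts and the interface packet counters twice and compare the
deltas. Below is a minimal Python sketch of the parsing side only. The
sample text, the IRQ numbers and the eth0/eth1 names are made up for
illustration, and the layout is assumed to be the usual "IRQ: per-CPU
counts, controller, device names" form.

```python
# Hypothetical /proc/interrupts excerpt, for illustration only.
SAMPLE_INTERRUPTS = """\
           CPU0       CPU1
 16:     100000      50000   IO-APIC-level  eth0
 17:       2000          0   IO-APIC-level  eth1
"""


def irq_counts(proc_interrupts_text, device):
    """Sum the per-CPU interrupt counts on lines that mention `device`."""
    total = 0
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        # Data lines start with "<irq>:"; the CPU header line does not.
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        # Shared IRQs list several devices separated by commas.
        tokens = [t.rstrip(",") for t in fields[1:]]
        if device not in tokens:
            continue
        # The leading numeric tokens are the per-CPU counters.
        for token in tokens:
            if not token.isdigit():
                break
            total += int(token)
    return total


def interrupts_per_packet(irq_delta, packet_delta):
    """Ratio of interrupts to packets between two samples."""
    if packet_delta <= 0:
        raise ValueError("no packets seen between samples")
    return irq_delta / packet_delta
```

A ratio close to 1 would suggest one interrupt per packet; a polling
driver under load should show a noticeably lower figure.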

I use the bcm5700 driver here myself, and on Opteron based machines I
see network throughput reach ~120 MByte/s per direction with the
bcm5700 driver, vs ~90 MByte/s per direction with the tg3 driver.
If I team the interfaces (i.e. link aggregation / bonding, as in
802.3ad), throughput using the tg3 driver drops to around 70 MByte/s,
as opposed to jumping to ~160 - 170 MByte/s using the bcm5700 driver.

The next thing I'd confirm is that your interfaces are indeed running in
full duplex mode. Check the switch, or alternatively run mii-tool on the
firewall itself.
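
If you want to script the duplex check across several interfaces, here
is a small Python sketch that classifies one line of mii-tool output.
The sample lines in the comments are what I'd assume mii-tool typically
prints; treat the exact wording as an assumption and adjust the patterns
to whatever your version emits.

```python
import re


def duplex_from_mii_tool(line):
    """Return 'full', 'half' or 'unknown' for one line of mii-tool output.

    Assumed example lines (not captured from a real box):
        eth0: negotiated 100baseTx-FD flow-control, link ok
        eth1: 10 Mbit, half duplex, link ok
        eth2: no link
    """
    lower = line.lower()
    # Negotiated modes are assumed to end in -FD or -HD.
    if re.search(r"base\w*-fd\b", lower) or "full duplex" in lower:
        return "full"
    if re.search(r"base\w*-hd\b", lower) or "half duplex" in lower:
        return "half"
    return "unknown"
```

Anything that comes back as 'half' or 'unknown' on a firewall interface
is worth a look at the switch port settings.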

You don't mention the packet volume or the actual data rates you're
seeing. Given who you work for, I'm thinking that you'd have a high
packet per second rate, but not that much actual throughput relative to
the number of packets (i.e. lots of small packets).
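
To get those numbers, you can take two snapshots of /proc/net/dev a
known interval apart and work out packets per second, throughput and
average packet size. A minimal Python sketch, assuming the standard
/proc/net/dev column order (receive bytes and packets first, transmit
bytes and packets in the ninth and tenth columns); the counter values
below are invented for illustration.

```python
def parse_net_dev(text, iface):
    """Extract byte and packet counters for `iface` from /proc/net/dev text."""
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, rest = line.split(":", 1)
        if name.strip() != iface:
            continue
        f = rest.split()
        return {
            "rx_bytes": int(f[0]), "rx_packets": int(f[1]),
            "tx_bytes": int(f[8]), "tx_packets": int(f[9]),
        }
    raise KeyError(iface)


def rx_rates(before, after, interval_s):
    """Return (packets/s, Mbit/s, average bytes per packet) for received traffic."""
    packets = after["rx_packets"] - before["rx_packets"]
    nbytes = after["rx_bytes"] - before["rx_bytes"]
    pps = packets / interval_s
    mbit = nbytes * 8 / interval_s / 1e6
    avg_size = nbytes / packets if packets else 0.0
    return pps, mbit, avg_size


# Invented snapshots taken 10 seconds apart.
BEFORE = "  eth0:1000000   10000 0 0 0 0 0 0  500000    5000 0 0 0 0 0 0"
AFTER = "  eth0:7000000   60000 0 0 0 0 0 0  900000    9000 0 0 0 0 0 0"
```

With these made-up snapshots, rx_rates(parse_net_dev(BEFORE, "eth0"),
parse_net_dev(AFTER, "eth0"), 10) works out to 5000 packets/s at only
4.8 Mbit/s, which is the "many small packets" pattern I mean.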

As for pinging as a performance measure, I wouldn't put too much stock
in it. If your firewall rule set has any ICMP rate limiting in it,
that could be skewing any stats you obtain. I'd be more inclined
to use a UDP ping test. Search for a tool called uplog using Google
or freshmeat.net. ICMP delivery is not guaranteed either, and chances
are your router will drop it in preference to other packets.
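
If you can't find a suitable tool, a UDP ping is simple enough to
sketch yourself. The Python below is not uplog, just a minimal
illustration of the idea: an echo responder to run behind the firewall
and a prober that counts losses and round trip times. The function
names, payload format and timeouts are my own choices.

```python
import socket
import threading
import time


def udp_echo_server(sock, stop):
    """Echo every datagram back to its sender until `stop` is set."""
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def udp_probe(host, port, count=20, timeout=0.5):
    """Send `count` numbered datagrams; return (lost, list of RTTs in seconds)."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = str(seq).encode()
            start = time.monotonic()
            s.sendto(payload, (host, port))
            try:
                while True:
                    data, _ = s.recvfrom(2048)
                    if data == payload:  # ignore late replies to older probes
                        rtts.append(time.monotonic() - start)
                        break
            except socket.timeout:
                pass  # counted as lost
    return count - len(rtts), rtts


def local_demo(count=5):
    """Run responder and prober over loopback, purely as a smoke test."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    stop = threading.Event()
    thread = threading.Thread(target=udp_echo_server, args=(server, stop))
    thread.start()
    try:
        return udp_probe("127.0.0.1", server.getsockname()[1], count=count)
    finally:
        stop.set()
        thread.join()
        server.close()
```

In real use you'd run udp_echo_server on a box behind the firewall and
udp_probe from outside, on a port your rule set actually passes.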

Failing that, check the order of your firewall ruleset. Put the most
commonly matched rules at the top so they are evaluated first. There
isn't much you can do otherwise to tune UDP performance. Either the
packet gets there, or it doesn't.

As a further suggestion, I'm not sure of your network topology, but
I'll hazard a guess and say I'd be more inclined to use a TG3 as the
external interface, along with one TG3 internal interface, and the EEPro
interface as the management interface. One final option to consider is
a bridging firewall instead of having the box act as a firewall router.
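
To illustrate the rule ordering point: a chain is walked top to bottom
until a rule matches, so the average number of rules evaluated per
packet depends on where the busy rules sit. A small Python sketch of
that arithmetic follows. The rule names and traffic shares are
hypothetical, and it assumes every packet matches exactly one rule and
that evaluation stops at the first match.

```python
def average_rules_checked(rules):
    """Expected number of rules evaluated per packet.

    `rules` is a list of (name, share) pairs in ruleset order, where
    `share` is the fraction of packets that rule matches.
    """
    expected = 0.0
    for position, (_name, share) in enumerate(rules, start=1):
        # A packet matching the rule at `position` was tested against
        # every rule up to and including that one.
        expected += position * share
    return expected


# Hypothetical ruleset with the busiest rule last.
AS_WRITTEN = [("ssh", 0.01), ("dns", 0.04), ("http", 0.15), ("game-udp", 0.80)]

# Same rules, reordered so the most frequently matched come first.
REORDERED = sorted(AS_WRITTEN, key=lambda rule: rule[1], reverse=True)
```

For these invented shares the average drops from about 3.74 rules per
packet to about 1.26 after reordering. Only reorder rules whose relative
order doesn't change what gets accepted or dropped.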

As a side note, don't bother with a TOE (TCP Offload Engine). Unless
you're doing iSCSI SANs, FC over Ethernet, or have _very_ high
throughput and _very_ low latency requirements, you're wasting your
money. A 1.4GHz Opteron can push and pull data down a TG3 with only a
minor (< 10%) CPU bump once you offload the IP checksumming. Doing full
IP stack offload is a non-trivial matter, and only worth it for massive
filer or database deployments in large SAN environments.

-- 
David Jericho
Senior Systems Administrator, Tucana Technologies




