Measuring 64-byte IP forwarding pps on Linux

I’ve been wondering whether different kernels affect the IP forwarding performance of Linux routers, particularly the RT or ‘realtime’ kernel. NixOS makes it really easy to try different kernels, so I tried a few.

The devices under test are two c2758 boxes with lots of 1000BASE-T interfaces. The source and target are c3558 boxes; each has a link to each router. All are running BGP. The c3558 boxes therefore have two routes to each other:

[root@lanner-c8f0:~]# ip r show 10.0.0.0
10.0.0.0 nhid 101 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2e02 dev enp2s0f0 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca0 dev enp2s0f1 weight 1

I’ll be using iperf(3) to measure throughput of small packets:

iperf3 -c 10.0.0.0 -P8 -u -l36 -b0 -t 60

There is a misconception that to measure 64-byte pps forwarding capability with iperf you can just use -l64.

That is not the size of the IP packet; it is the size of the UDP payload.

To get a packet with Total Length: 64, a 36-byte UDP payload is correct: 20 bytes of IPv4 header plus 8 bytes of UDP header plus 36 bytes of payload. Here it is in Wireshark:

Internet Protocol Version 4, Src: 172.30.190.200, Dst: 10.0.0.1
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
    Total Length: 64
    Identification: 0xbdad (48557)
    010. .... = Flags: 0x2, Don't fragment
    ...0 0000 0000 0000 = Fragment Offset: 0
    Time to Live: 64
    Protocol: UDP (17)
    Header Checksum: 0x0818 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 172.30.190.200
    Destination Address: 10.0.0.1
User Datagram Protocol, Src Port: 57727, Dst Port: 5201
    Source Port: 57727
    Destination Port: 5201
    Length: 44
    Checksum: 0x7525 [unverified]
    [Checksum Status: Unverified]
    [Stream index: 9]
    [Timestamps]
    UDP payload (36 bytes)

This can also be verified in Wireshark under ‘Statistics -> Packet Lengths’. With -l64, they’re all 92 bytes.
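
In other words, the right -l value is just the target IP Total Length minus the 20-byte IPv4 header and the 8-byte UDP header. A quick sketch of that arithmetic in Python (the helper name is mine, not part of iperf):

# The -l value is the UDP payload size, so for a target IPv4 Total Length
# subtract the 20-byte IPv4 header (no options) and the 8-byte UDP header.
IPV4_HEADER = 20
UDP_HEADER = 8

def iperf_payload_for_total_length(total_length):
    return total_length - IPV4_HEADER - UDP_HEADER

print(iperf_payload_for_total_length(64))   # 36
print(64 + IPV4_HEADER + UDP_HEADER)        # 92, which is what -l64 actually produces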

I’ll be using my nixa config management tool to easily reconfigure and bounce the routers:

nixa > nix-shell --run 'python3 nixa --limit datto --action boot'
10.0.0.2 is reachable
10.0.0.3 is reachable
applying template datto.nix to datto: ['10.0.0.2', '10.0.0.3']
10.0.0.2:
---

+++

@@ -9,6 +9,7 @@

   nix.optimise.automatic = true;
   nix.gc.automatic = true;
   system.stateVersion = "24.11";
+  boot.kernelPackages = pkgs.linuxPackages-rt_latest;

   networking = {
     hostName = "spine-green-476d";
rebuilding NixOS on 10.0.0.2
Rebooting 10.0.0.2
10.0.0.2 is reachable
10.0.0.3:
---

+++

@@ -9,6 +9,7 @@

   nix.optimise.automatic = true;
   nix.gc.automatic = true;
   system.stateVersion = "24.11";
+  boot.kernelPackages = pkgs.linuxPackages-rt_latest;

   networking = {
     hostName = "spine-blue-db5d";
rebuilding NixOS on 10.0.0.3
Rebooting 10.0.0.3
10.0.0.3 is reachable

Now that we have the right -l value for 64-byte packets, the Lost/Total Datagrams column in iperf will correspond 1:1 with 64-byte packets. When you flood a device with UDP in iperf, there will be packet loss, so I’ll be taking the received Mbits/sec and the received successful datagrams. The following example works out to (25799702-3584655)/60, or 370,250 pps:

[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
...
[SUM]   0.00-60.00  sec  1.54 GBytes   220 Mbits/sec  0.000 ms  0/25806379 (0%)  sender
[SUM]   0.00-60.00  sec  1.32 GBytes   190 Mbits/sec  0.016 ms  3584655/25799702 (14%)  receiver
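
The same received-pps arithmetic can be scripted by running iperf3 with --json and reading the receiver summary. A rough sketch in Python, assuming the end.sum field names iperf3 uses for UDP tests (they may differ between versions):

import json
import subprocess

# Run the same UDP flood with --json and compute received pps.
# The field names (end.sum.packets, end.sum.lost_packets, end.sum.seconds)
# are assumptions based on iperf3's UDP summary output.
result = subprocess.run(
    ["iperf3", "-c", "10.0.0.0", "-P8", "-u", "-l36", "-b0", "-t", "60", "--json"],
    capture_output=True, text=True, check=True,
)
summary = json.loads(result.stdout)["end"]["sum"]
received = summary["packets"] - summary["lost_packets"]
print(f"{received / summary['seconds']:,.0f} pps received")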

Measurements

First off, here’s the standard kernel over the two paths:

384,003 pps
111 Mbits/sec

Here’s the standard kernel with one path shut down:

219,324 pps
63.2 Mbits/sec

Here’s boot.kernelPackages = pkgs.linuxPackages-rt_latest;:

377,069 pps
109 Mbits/sec

boot.kernelPackages = pkgs.linuxPackages_5_4:

382,472 pps
110 Mbits/sec

Here’s two direct cables between source and dest, no routers:

360,602 pps
104 Mbits/sec
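
As a sanity check, iperf3’s bitrate counts only the UDP payload (36 bytes per datagram here), so each reported Mbits/sec figure should be roughly pps × 36 bytes × 8 bits. A quick check of the numbers above in Python:

# iperf3's bitrate reflects UDP payload only, so Mbits/sec ≈ pps * 36 bytes * 8 bits.
PAYLOAD_BYTES = 36

results_pps = {
    "standard kernel, two paths": 384_003,
    "standard kernel, one path": 219_324,
    "linuxPackages-rt_latest": 377_069,
    "linuxPackages_5_4": 382_472,
    "direct cables, no routers": 360_602,
}
for label, pps in results_pps.items():
    print(f"{label}: {pps * PAYLOAD_BYTES * 8 / 1e6:.1f} Mbits/sec")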

Conclusion

Looking at the performance of the direct crossover, I’d say I need to find more capable hosts to generate packets. I was surprised to see this, as the 14nm c3558 boxes are a generation newer than the 22nm c2758 routers and have dual-channel DDR4 instead of single-channel DDR3. The results here show that these c2758 routers can forward packets at least as fast as these particular endpoints can generate them.

Nathan Hensel



2025-01-12