ipvs Load Balancing on BGP Linux Router

These are notes on testing ipvs on a NixOS router running BGP unnumbered.

We have two web servers, 10.0.0.2/32 and 10.0.0.3/32, running behind router 10.0.0.1/32, which in turn has 10.0.0.0/32 as its gateway. We will serve test files from 10.0.0.2 and 10.0.0.3 and create ipvs rules on 10.0.0.1.


Router 10.0.0.1 has multipath routes up to its gateway as well as down to the web servers:

[root@b55416a9-f3fc-59d5-b5f8-d40f13e815a1:~]# ip r
default nhid 14 proto bgp metric 20
	nexthop via inet6 fe80::290:bff:fea5:e2d0 dev enp2s0f1 weight 1
	nexthop via inet6 fe80::290:bff:fea5:e2d1 dev enp2s0f0 weight 1
10.0.0.0 nhid 14 proto bgp metric 20
	nexthop via inet6 fe80::290:bff:fea5:e2d0 dev enp2s0f1 weight 1
	nexthop via inet6 fe80::290:bff:fea5:e2d1 dev enp2s0f0 weight 1
10.0.0.2 nhid 52 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca3 dev enp8s0f1 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca4 dev enp6s0f0 weight 1
10.0.0.3 nhid 58 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2e05 dev enp8s0f0 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2e06 dev enp6s0f1 weight 1

On each web server, I have created a text file to serve with Python:

[nix-shell:~/http]# cat index.html
node 1
[nix-shell:~/http]# python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

I can curl them from my desktop (through both routers 10.0.0.0/32 and 10.0.0.1/32):

~ > curl 10.0.0.2:8000
node 1
~ > curl 10.0.0.3:8000
node 2

ipvs

10.0.0.1 is running FRR and is configured to redistribute any connected /32 prefixes within 10.0.0.0/16. This lets us inject a ‘VIP’ (virtual IP) into the network by simply adding it to lo on 10.0.0.1:
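As a rough illustration of which addresses would match that redistribution rule, the check can be sketched in Python with the standard ipaddress module. The helper name is_redistributed is hypothetical; the real filtering is done by the FRR configuration, which is not shown in these notes.

```python
import ipaddress

# Assumption: the filter matches only /32 host routes inside 10.0.0.0/16,
# as described in the text. The actual FRR config is not reproduced here.
ALLOWED = ipaddress.ip_network("10.0.0.0/16")


def is_redistributed(prefix: str) -> bool:
    """Return True if prefix is an IPv4 /32 inside ALLOWED (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    return net.version == 4 and net.prefixlen == 32 and net.subnet_of(ALLOWED)


if __name__ == "__main__":
    for p in ("10.0.1.0/32", "10.0.1.0/24", "192.168.1.1/32"):
        print(p, is_redistributed(p))
```

The VIP used below, 10.0.1.0/32, passes this check, which is why adding it to lo is enough to have it announced.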

ip a add 10.0.1.0/32 dev lo

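Now we’ll clear ipvs and add our frontend and backends. No, you cannot do it all in one command: the virtual service is created with -A, and each real server is added with its own -a (here with -m, masquerading, so that port 80 on the VIP can map to port 8000 on the backends):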

ipvsadm -C
ipvsadm -A -t 10.0.1.0:80 -s rr
ipvsadm -a -t 10.0.1.0:80 -r 10.0.0.2:8000 -m
ipvsadm -a -t 10.0.1.0:80 -r 10.0.0.3:8000 -m

And now from my desktop:

~ > curl 10.0.1.0
node 2
~ > curl 10.0.1.0
node 1
~ > curl 10.0.1.0
node 2
~ > curl 10.0.1.0
node 1

It works.
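The alternating responses are what the rr (round-robin) scheduler should produce. A toy model of the rules above, as a sketch only: the VirtualService class and its method names are made up for illustration and are not an ipvs API, and the point at which the real kernel starts its rotation may differ.

```python
class VirtualService:
    """Toy model of one ipvs virtual service with round-robin scheduling."""

    def __init__(self, vip: str, port: int):
        self.frontend = (vip, port)  # what clients connect to
        self.backends = []           # real servers as (ip, port)
        self._index = 0              # position in the rotation

    def add_real_server(self, ip: str, port: int) -> None:
        # Analogous to one `ipvsadm -a ... -r ip:port -m` call.
        self.backends.append((ip, port))

    def schedule(self):
        """Pick the backend for a new connection, rotating through the list."""
        if not self.backends:
            raise RuntimeError("no real servers configured")
        backend = self.backends[self._index % len(self.backends)]
        self._index += 1
        return backend


if __name__ == "__main__":
    svc = VirtualService("10.0.1.0", 80)
    svc.add_real_server("10.0.0.2", 8000)
    svc.add_real_server("10.0.0.3", 8000)
    for _ in range(4):
        # The destination is rewritten from VIP:80 to backend:8000 (NAT mode).
        print(svc.frontend, "->", svc.schedule())
```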


anycast

While a simple in-kernel load balancer is a nice tool to have, there's an even simpler solution possible here.

These nodes are all running BGP, both the routers and the web servers. We can forget ipvs and just deploy the 10.0.1.0/32 IP on lo of both web servers:

[nix-shell:~/http]# ip a add 10.0.1.0/32 dev lo
[nix-shell:~/http]# python3 -m http.server
...
[nix-shell:~/http]# ip a add 10.0.1.0/32 dev lo
[nix-shell:~/http]# python3 -m http.server

Now on the router 10.0.0.1/32, the route to 10.0.1.0/32 has four nexthops, two toward each web server:

[root@b55416a9-f3fc-59d5-b5f8-d40f13e815a1:~]# ip r
default nhid 14 proto bgp metric 20
	nexthop via inet6 fe80::290:bff:fea5:e2d0 dev enp2s0f1 weight 1
	nexthop via inet6 fe80::290:bff:fea5:e2d1 dev enp2s0f0 weight 1
10.0.0.0 nhid 14 proto bgp metric 20
	nexthop via inet6 fe80::290:bff:fea5:e2d0 dev enp2s0f1 weight 1
	nexthop via inet6 fe80::290:bff:fea5:e2d1 dev enp2s0f0 weight 1
10.0.0.2 nhid 52 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca3 dev enp8s0f1 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca4 dev enp6s0f0 weight 1
10.0.0.3 nhid 58 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2e05 dev enp8s0f0 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2e06 dev enp6s0f1 weight 1
10.0.1.0 nhid 73 proto bgp metric 20
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca3 dev enp8s0f1 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2ca4 dev enp6s0f0 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2e05 dev enp8s0f0 weight 1
	nexthop via inet6 fe80::260:e0ff:fe8a:2e06 dev enp6s0f1 weight 1

From my desktop, we lose the port mapping (the VIP now has to be reached on port 8000), but the network itself is indeed doing the load balancing:

~ > curl 10.0.1.0:8000
node 1
~ > curl 10.0.1.0:8000
node 1
~ > curl 10.0.1.0:8000
node 2
~ > curl 10.0.1.0:8000
node 2
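The responses do not strictly alternate, which is consistent with per-flow hashing rather than round-robin. A minimal sketch of the idea, under loud assumptions: it assumes the router hashes on the full 5-tuple (on Linux this depends on the net.ipv4.fib_multipath_hash_policy sysctl, whose value on this router is not shown), it uses SHA-256 as a stand-in for the kernel's hash, and the client address 192.0.2.10 is a placeholder, not my desktop's real IP.

```python
import hashlib

# Nexthops of the 10.0.1.0 route as (device, web server), from the table above.
NEXTHOPS = [
    ("enp8s0f1", "node 1"),
    ("enp6s0f0", "node 1"),
    ("enp8s0f0", "node 2"),
    ("enp6s0f1", "node 2"),
]


def pick_nexthop(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Map a flow to one nexthop. Illustrative hash, not the kernel's algorithm."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return NEXTHOPS[int.from_bytes(digest[:4], "big") % len(NEXTHOPS)]


if __name__ == "__main__":
    # Each curl uses a new ephemeral source port, so each is a new flow.
    for src_port in (40000, 40001, 40002, 40003):
        print(src_port, pick_nexthop("192.0.2.10", src_port, "10.0.1.0", 8000))
```

Every packet of one connection hashes to the same nexthop, so a TCP session stays on one server; only new connections can land somewhere else.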

This will only work if our paths (from the router to the servers) are indeed equal. If our network is poorly connected, we will lose ECMP and traffic will flow to whichever web server is closest (lowest cost).
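The equal-cost condition can be sketched as follows. This is a simplified model: "cost" stands in for whatever BGP uses to rank paths (real best-path selection compares several attributes in order), and the cost values are invented for illustration.

```python
def best_paths(paths):
    """Return every nexthop tied for the lowest cost.

    paths is a list of (nexthop, cost) tuples. More than one result means
    ECMP is possible; a single result means all traffic goes one way.
    """
    if not paths:
        return []
    lowest = min(cost for _, cost in paths)
    return [nexthop for nexthop, cost in paths if cost == lowest]


if __name__ == "__main__":
    # Hypothetical costs: both servers equally far away, so traffic is shared.
    print(best_paths([("node 1", 2), ("node 2", 2)]))
    # Hypothetical costs: node 2 is one hop further, so node 1 gets everything.
    print(best_paths([("node 1", 2), ("node 2", 3)]))
```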

Nathan Hensel

on caving, mountaineering, networking, computing, electronics


2025-01-05