I recently found out about LXD’s web interface and had to try it out.
All of my servers have 10.0.200.0/32 set on lo, advertised to the network via BGP. This provides simple load balancing and failover for common host services. LXD implements a well-designed, modern API, so anycast should be a good solution for cluster access.
Here’s the netplan config for one of these servers:
root@x10slhf-xeon-920ea:~# cat /etc/netplan/bgp-unnumbered.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    lo:
      addresses:
        - 10.0.200.5/32
        - 10.0.200.0/32
    eno1:
      optional: true
      mtu: 9000
    eno2:
      optional: true
      mtu: 9000
  vlans:
    bgpeno1:
      link: eno1
      id: 10
    bgpeno2:
      link: eno2
      id: 10
  tunnels:
    vxlan100:
      mode: vxlan
      id: 100
      local: 10.0.200.5
      mac-learning: false
      mtu: 8950
  bridges:
    br-vxlan100:
      interfaces: ["vxlan100"]
To serve the LXD frontend on 10.0.200.0, we need to set core.https_address on each server. We also enable the UI here:
ssh [email protected] -- "snap set lxd ui.enable=true && /snap/bin/lxc config set core.https_address=10.0.200.0:8443 && systemctl reload snap.lxd.daemon"
...
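Since the same command has to run on every server, a small helper can build it per host. This is only a sketch: the function name and the example hostnames are made up for illustration, and it simply reproduces the three steps from the command above.

```python
def lxd_anycast_command(host, anycast="10.0.200.0", port=8443):
    """Build the ssh invocation that enables the UI and binds LXD to the anycast address."""
    remote = " && ".join([
        "snap set lxd ui.enable=true",
        f"/snap/bin/lxc config set core.https_address={anycast}:{port}",
        "systemctl reload snap.lxd.daemon",
    ])
    # Returned as an argument list so it can be passed straight to subprocess.run().
    return ["ssh", f"root@{host}", "--", remote]


# Hypothetical hostnames; substitute the real cluster members.
EXAMPLE_HOSTS = ["host-a.example", "host-b.example"]
```

To actually apply it, something like `subprocess.run(lxd_anycast_command(h), check=True)` for each `h` in the host list would run the steps one server at a time and stop on the first failure.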
And with that, https://10.0.200.0:8443/ui gets us to the web UI. For the most part, things work as expected. Shorter-lived connections to the API, such as those made by the lxc command-line client, work very well, but console video feeds and realtime shells occasionally have their connections dropped. This comes with the territory: if intermediate routers send our traffic to host B while the shell we have open belongs to a container on host A, the connection will be dropped if host B is bounced, and potentially dropped again when host B comes back online and traffic is routed to it once more.
These shortcomings aside, this is a high bang-for-the-buck solution for basic HA for cluster management that allows our hosts to exist in different broadcast domains.
If anything, I may add a simple systemd daemon on each host to withdraw the route when LXD can’t be reached on 127.0.0.1, which would imply the service itself has failed.
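A rough sketch of what such a watchdog might look like follows. It rests on several assumptions that are not established above: that LXD actually answers on the address being probed (with core.https_address set to 10.0.200.0:8443 it may not listen on 127.0.0.1, so the check address may need changing), that the BGP daemon advertises whatever addresses are present on lo so removing 10.0.200.0/32 withdraws the route, and that nothing else re-adds the address behind the watchdog's back. All function names are illustrative.

```python
import socket
import subprocess
import time

ANYCAST_ADDR = "10.0.200.0/32"  # anycast address from the netplan config
CHECK_HOST = "127.0.0.1"        # assumption: LXD is reachable here when healthy
CHECK_PORT = 8443


def lxd_reachable(host=CHECK_HOST, port=CHECK_PORT, timeout=2.0):
    """Return True if a TCP connection to the LXD API port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def next_action(advertised, healthy):
    """Return 'withdraw', 'announce', or None when no change is needed."""
    if advertised and not healthy:
        return "withdraw"
    if not advertised and healthy:
        return "announce"
    return None


def apply_action(action, addr=ANYCAST_ADDR):
    """Add or remove the anycast address on lo (assumed to drive the BGP advertisement)."""
    verb = {"withdraw": "del", "announce": "add"}[action]
    subprocess.run(["ip", "addr", verb, addr, "dev", "lo"], check=False)


def watch(interval=5.0):
    """Poll LXD forever, touching the address only when the health state changes."""
    advertised = True  # assumption: netplan has already configured the address
    while True:
        action = next_action(advertised, lxd_reachable())
        if action:
            apply_action(action)
            advertised = action == "announce"
        time.sleep(interval)
```

Calling `watch()` from a script started by a systemd unit would give the daemon described above. A single failed probe triggers a withdrawal here; requiring a few consecutive failures would make it less twitchy, at the cost of slower failover.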