SNI on Bare-Metal Kubernetes with cert-manager and ingress-nginx

This is an example of server name indication (SNI) routing implemented with ingress-nginx, using cert-manager for autonomous certificate management. cert-manager has a bit of a learning curve, but it is well worth it.

All of this lets us map subdomains onto deployments in a Kubernetes cluster. Everything here is ‘highly available’, and after the initial setup it takes mere minutes to deploy a new service on a new subdomain with valid HTTPS.

This page is not meant to replace the documentation for any of these tools, but to serve as an example of how something like this can look on physical hardware - since many other write-ups on these topics leave the discussion at ‘refer to your cloud provider’ or ‘just use a NodePort’.

physical environment

We’ll be mapping two subdomains, signal.nih.earth and cdn.nih.earth, to two deployments - hugo and minio. Depending on your environment, this will not be copy-paste-able. Starting with a standard ingress-nginx installation, we need a way for traffic to find its way to the hosts running the ingress controllers. I’ve created the LoadBalancer service ingress-nginx-lb, which causes 10.0.100.12/32 to be provisioned on each node running a controller by way of my own l3lb. Consequently, this address is advertised to the peering top-of-rack routers in anycast fashion, and the network itself performs the load balancing. From there, the nginx controllers forward traffic to either the hugo or minio ClusterIPs, depending on the domain name presented in the TLS handshake. On the public side, DNS is a simple *.nih.earth wildcard record.
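
For reference, the hugo and minio backends that the controllers forward to are ordinary ClusterIP services. The labels and port below are assumptions about my deployments rather than something to copy verbatim, but a rough Terraform sketch of the hugo service looks something like this:

resource "kubernetes_service" "hugo" {
  metadata {
    name = "hugo"
  }
  spec {
    # assumed selector - match whatever labels your hugo pods actually carry
    selector = {
      app = "hugo"
    }
    port {
      name        = "https"
      port        = 443
      target_port = 443
    }
    type = "ClusterIP"
  }
}

The minio-public service is the same idea, exposing port 80 for the web-assets bucket.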

Here is the ingress service itself (Terraform manifests are shown throughout):

resource "kubernetes_service" "ingress-nginx-lb" {
  metadata {
    name = "ingress-nginx-lb"
  }
  spec {
    selector = {
      name = "nginx-ingress-microk8s"
    }
    port {
      name        = "http"
      port        = "80"
      target_port = "80"
    }
    port {
      name        = "https"
      port        = "443"
      target_port = "443"
    }
    type          = "LoadBalancer"
    external_ips  = ["10.0.100.12"]
    session_affinity = "ClientIP"
    external_traffic_policy = "Local"
  }
  wait_for_load_balancer = "false"
}

Having landed the ingress-nginx-lb service, we need to tell the nginx controllers to publish it:

--publish-service	Service fronting the Ingress controller. Takes the form "namespace/name". When used together
			with update-status, the controller mirrors the address of this service's endpoints to the
			load-balancer status of all Ingress objects it satisfies.
$ kubectl edit daemonset nginx-ingress-microk8s-controller -n ingress
...
    spec:
      containers:
      - args:
        ...
        - --publish-service=default/ingress-nginx-lb
        - --update-status
...

With the network and the ingress controllers themselves configured, it’s time to create an ingress:

resource "kubernetes_ingress_v1" "hugo" {
  metadata {
    name = "hugo"
    annotations = {
      "kubernetes.io/ingress.class" = "public"
      "cert-manager.io/issuer" = "letsencrypt-staging"
    }
  }
  spec {
    rule {
      host = "signal.nih.earth"
      http {
        path {
          path      = "/"
          path_type = "Prefix"
          backend {
            service {
              name = "hugo"
              port {
                number = 443
              }
            }
          }
        }
      }
    }
    rule {
      host = "cdn.nih.earth"
      http {
        path {
          path      = "/web-assets"
          path_type = "Prefix"
          backend {
            service {
              name = "minio-public"
              port {
                number = 80
              }
            }
          }
        }
      }
    }
    tls {
      hosts = ["signal.nih.earth", "cdn.nih.earth"]
      secret_name = "hugo-tls"
    }
  }
}
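
The cert-manager.io/issuer annotation above refers to an Issuer named letsencrypt-staging, living in the same namespace as the ingress. I won’t reproduce mine exactly, but a minimal HTTP-01 staging Issuer - sketched here with the kubernetes_manifest resource, with a placeholder email address - would look roughly like this:

resource "kubernetes_manifest" "letsencrypt_staging" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "Issuer"
    metadata = {
      name      = "letsencrypt-staging"
      namespace = "default"
    }
    spec = {
      acme = {
        # Let's Encrypt staging directory - switch to production once this works end to end
        server = "https://acme-staging-v02.api.letsencrypt.org/directory"
        # placeholder address - use one you actually monitor
        email = "admin@example.com"
        privateKeySecretRef = {
          name = "letsencrypt-staging-account-key"
        }
        solvers = [{
          http01 = {
            ingress = {
              # must match the ingress class the nginx controllers answer for
              class = "public"
            }
          }
        }]
      }
    }
  }
}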

Assuming you’ve created the Issuer letsencrypt-staging, cert-manager will first self-check each configured domain name with a GET against its ‘/.well-known/acme-challenge/<token>’ URL before asking Let’s Encrypt to validate it. This means that, from the cert-manager pods’ perspective, your domain names need to resolve to an address that gets traffic to the controllers for the relevant ingress class. For me, this meant setting up split-horizon DNS within my cluster’s coredns deployment, pointing at the ingress frontend load balancer. Otherwise, public DNS resolution would bring the traffic to the WAN-facing interface of my border gateway router, where it would be lost.

$ kubectl -n kube-system get configmap coredns -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
...
        hosts {
            10.0.100.12 signal.nih.earth
            10.0.100.12 cdn.nih.earth
            fallthrough
        }
...
    }

We also need to tell cert-manager to use our own DNS server for these HTTP-01 self-checks; 10.152.183.10 is the ClusterIP of the coredns service.

$ kubectl get deployment.apps/cert-manager -n cert-manager -o yaml
...
    spec:
      containers:
      - args:
        - --v=2
        - --cluster-resource-namespace=$(POD_NAMESPACE)
        - --leader-election-namespace=kube-system
        - --acme-http01-solver-image=quay.io/jetstack/cert-manager-acmesolver:v1.11.0
        - --max-concurrent-challenges=60
        - --acme-http01-solver-nameservers=10.152.183.10:53

Assuming things are working, you should be able to watch the logs of the ingress controllers and see cert-manager’s own self-check GETs, coming from RFC 1918 addresses, returning 200 from the acme-solver pods:

10.1.61.98 - - [19/Jan/2023:08:28:21 +0000] "GET /.well-known/acme-challenge/Kz1G61FQh10KeM8k_wxMeoh9TWvQbpZrTg_ngrOjg9M HTTP/1.1" 200 87 "-" "cert-manager-challenges/v1.11.0 (linux/amd64) cert-manager/1a0ef53b06e183356d922cd58af2510d8885bef5" 264 0.001 [default-cm-acme-http-solver-tj7lg-8089] [] 10.1.61.106:8089 87 0.000 200 40aa7a0419dfab1bdf8e846715781ea1
10.1.61.98 - - [19/Jan/2023:08:28:22 +0000] "GET /.well-known/acme-challenge/Z41OcMu_FW2NFuJ2E8z4FShyXlLs8-mqqwpxUrP_RIk HTTP/1.1" 200 87 "-" "cert-manager-challenges/v1.11.0 (linux/amd64) cert-manager/1a0ef53b06e183356d922cd58af2510d8885bef5" 260 0.000 [default-cm-acme-http-solver-pcm4g-8089] [] 10.1.61.105:8089 87 0.000 200 5393fc4792726e7ffb8a501f53674da1

If those local sanity checks have passed, we’ll see the public IPs of Let’s Encrypt’s validation servers repeat the challenge:

23.178.112.106 - - [19/Jan/2023:08:28:28 +0000] "GET /.well-known/acme-challenge/Kz1G61FQh10KeM8k_wxMeoh9TWvQbpZrTg_ngrOjg9M HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" 265 0.002 [default-cm-acme-http-solver-tj7lg-8089] [] 10.1.61.106:8089 87 0.004 200 bbf515041c82d7f1a0d29be6487a7b52
18.246.33.217 - - [19/Jan/2023:08:28:28 +0000] "GET /.well-known/acme-challenge/Kz1G61FQh10KeM8k_wxMeoh9TWvQbpZrTg_ngrOjg9M HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" 265 0.002 [default-cm-acme-http-solver-tj7lg-8089] [] 10.1.61.106:8089 87 0.000 200 f29b154b0f78fa4903aa3608a4a93553

Nathan Hensel

2023-01-20