Correct, Consistent Kubernetes Projects with kubectl ApplySets

One often-overlooked issue with Kubernetes manifests is that they are in fact not truly declarative in the typical kubectl apply -f workflow: resources you remove or rename in your manifests are never removed from the cluster.

Kubernetes 1.27 introduces ApplySets, an alpha feature that addresses this.


Consider the following. Say we have a project directory default representing the default namespace:

~/kube_prune$ tree
.
└── default
    └── nginx.yaml

2 directories, 1 file

In it, we have nginx.yaml which defines a deployment and service:

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    name: nginx-app
  ports:
    - protocol: TCP
      port: 80

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-app
spec:
  selector:
    matchLabels:
      name: nginx-app
  template:
    metadata:
      labels:
        name: nginx-app
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

We create the resources with kubectl apply -f:

~/kube_prune$ kubectl apply -f default/
service/nginx created
deployment.apps/nginx-app created

Dang, we realize we should rename the service from 'nginx' to 'nginx-app' for consistency. No problem; let's vim the file.
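The edit is just the Service's name field (shown here as a diff for clarity):

 apiVersion: v1
 kind: Service
 metadata:
-  name: nginx
+  name: nginx-app

Now apply again: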

~/kube_prune$ kubectl apply -f default/
service/nginx-app created
deployment.apps/nginx-app unchanged

Hm, wait a second, instead of renaming the resource, kubectl created a new one!

~/kube_prune$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.152.183.1     <none>        443/TCP   3h7m
nginx        ClusterIP   10.152.183.114   <none>        80/TCP    2m33s
nginx-app    ClusterIP   10.152.183.247   <none>        80/TCP    27s

OK, let's delete the project and try again:

~/kube_prune$ kubectl delete -f default/
service "nginx-app" deleted
deployment.apps "nginx-app" deleted

~/kube_prune$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.152.183.1     <none>        443/TCP   3h8m
nginx        ClusterIP   10.152.183.114   <none>        80/TCP    3m58s

Grr, we still have a leftover service that needs to be manually deleted.

ApplySets

The way to stay out of these scenarios is --prune --applyset=<name>. The --applyset flag puts a label, applyset.kubernetes.io/part-of=<id>, on every resource in your project; subsequent runs use that label to diff the existing resources against your desired state. Ultimately, this makes kubectl apply behave more like Terraform: if we delete or rename a resource in our project, kubectl picks up on this and correctly applies the difference. Since ApplySet support is an alpha feature in kubectl 1.27, it also has to be enabled with the KUBECTL_APPLYSET=true environment variable, which you'll see in the commands below.

With --applyset, the above scenario looks like this:

~/kube_prune$ KUBECTL_APPLYSET=true kubectl apply -f default/ --prune --applyset=prod -n default
service/nginx created
deployment.apps/nginx-app created

(rename service/nginx to service/nginx-app)

~/kube_prune$ KUBECTL_APPLYSET=true kubectl apply -f default/ --prune --applyset=prod -n default
service/nginx-app created
deployment.apps/nginx-app unchanged
service/nginx pruned
~/kube_prune$ kubectl get services
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.152.183.1     <none>        443/TCP   3h15m
nginx-app    ClusterIP   10.152.183.217   <none>        80/TCP    10s

ApplySets must be scoped to a namespace (by default the set is tracked by a parent Secret, which has to live in one), hence the -n default.
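We can inspect the bookkeeping directly. The set ID in the label value is a hash that kubectl generates, so the exact value will vary; a quick look (commands only, output omitted):

# the parent Secret that tracks the ApplySet lives in the namespace
~/kube_prune$ kubectl get secrets -n default

# member resources carry the part-of label
~/kube_prune$ kubectl get deployments,services -n default -l applyset.kubernetes.io/part-of --show-labels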

ApplySets and Operators

Kubernetes Operators are a game-changing paradigm for managing complicated modern systems like Elasticsearch and CockroachDB clusters. They solve a lot of problems very well, but I am often worried when I see something like this (from the Elasticsearch ECK install docs):

1. Install custom resource definitions:
    kubectl create -f https://download.elastic.co/downloads/eck/2.9.0/crds.yaml

2. Install the operator with its RBAC rules:
    kubectl apply -f https://download.elastic.co/downloads/eck/2.9.0/operator.yaml

This isn't a long-term solution; it's the curl-to-bash equivalent for Kubernetes. A year down the road, we're probably not going to have a clue what exactly is running or how it was installed.

Let's say we have CockroachDB installed via the 2.8 operator and we want to safely upgrade to operator 2.11.

Here I have crds.yaml and operator.yaml from 2.8 pulled into a git repo:

~/kube_prune/cockroach-operator-system$ ls
crds.yaml  operator.yaml
~/kube_prune/cockroach-operator-system$ git status
On branch master
nothing to commit, working tree clean

We wget crds.yaml and operator.yaml from the 2.11 revision.
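The fetch looks something like this (a sketch; the raw.githubusercontent.com install paths are assumptions based on the upstream cockroach-operator repo layout):

~/kube_prune/cockroach-operator-system$ wget -O crds.yaml https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.11.0/install/crds.yaml
~/kube_prune/cockroach-operator-system$ wget -O operator.yaml https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.11.0/install/operator.yaml

Git confirms there is indeed a diff: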

~/kube_prune/cockroach-operator-system$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   crds.yaml
	modified:   operator.yaml

An ApplySet lets us apply this diff with confidence that we aren't leaving behind any cruft. Over time this helps keep the cluster free of mystery resources that we don't dare delete - because we can't tell whether removing them would take our CockroachDB clusters down with them!

~/kube_prune$ KUBECTL_APPLYSET=true kubectl apply -f cockroach-operator-system/ --prune --applyset=prod -n cockroach-operator-system
customresourcedefinition.apiextensions.k8s.io/crdbclusters.crdb.cockroachlabs.com configured
namespace/cockroach-operator-system unchanged
serviceaccount/cockroach-operator-sa unchanged
clusterrole.rbac.authorization.k8s.io/cockroach-operator-role configured
clusterrolebinding.rbac.authorization.k8s.io/cockroach-operator-rolebinding unchanged
service/cockroach-operator-webhook-service unchanged
deployment.apps/cockroach-operator-manager configured
mutatingwebhookconfiguration.admissionregistration.k8s.io/cockroach-operator-mutating-webhook-configuration configured
validatingwebhookconfiguration.admissionregistration.k8s.io/cockroach-operator-validating-webhook-configuration configured

Nothing was pruned in this case, but it's easy to foresee a situation where something would have been. In operator.yaml there's a ServiceAccount, cockroach-operator-sa. Let's imagine that, for one reason or another, the CockroachDB authors decide to rename it:

~/kube_prune$ sed -i 's/cockroach-operator-sa/cockroach-operator-account/g' cockroach-operator-system/*.yaml
~/kube_prune$ git diff
diff --git a/cockroach-operator-system/operator.yaml b/cockroach-operator-system/operator.yaml
index d19caa9..f843d30 100644
--- a/cockroach-operator-system/operator.yaml
+++ b/cockroach-operator-system/operator.yaml
@@ -23,7 +23,7 @@ kind: ServiceAccount
 metadata:
   labels:
     app: cockroach-operator
-  name: cockroach-operator-sa
+  name: cockroach-operator-account
   namespace: cockroach-operator-system
 ---
 apiVersion: rbac.authorization.k8s.io/v1
@@ -345,7 +345,7 @@ roleRef:
   name: cockroach-operator-role
 subjects:
 - kind: ServiceAccount
-  name: cockroach-operator-sa
+  name: cockroach-operator-account
   namespace: cockroach-operator-system
 ---
 apiVersion: v1
@@ -599,7 +599,7 @@ spec:
           requests:
             cpu: 10m
             memory: 32Mi
-      serviceAccountName: cockroach-operator-sa
+      serviceAccountName: cockroach-operator-account
 ---
 apiVersion: admissionregistration.k8s.io/v1
 kind: MutatingWebhookConfiguration

Here's the before state:

~/kube_prune$ kubectl get serviceaccounts -n cockroach-operator-system
NAME                    SECRETS   AGE
default                 0         88m
cockroach-operator-sa   0         88m
~/kube_prune$ kubectl get pods -n cockroach-operator-system
NAME                                          READY   STATUS    RESTARTS   AGE
cockroach-operator-manager-5489bf9cbc-nn98t   1/1     Running   0          10m

Will the ApplySet handle this correctly?

~/kube_prune$ KUBECTL_APPLYSET=true kubectl apply -f cockroach-operator-system/ --prune --applyset=prod -n cockroach-operator-system
customresourcedefinition.apiextensions.k8s.io/crdbclusters.crdb.cockroachlabs.com configured
namespace/cockroach-operator-system unchanged
serviceaccount/cockroach-operator-account created
clusterrole.rbac.authorization.k8s.io/cockroach-operator-role configured
clusterrolebinding.rbac.authorization.k8s.io/cockroach-operator-rolebinding configured
service/cockroach-operator-webhook-service unchanged
deployment.apps/cockroach-operator-manager configured
mutatingwebhookconfiguration.admissionregistration.k8s.io/cockroach-operator-mutating-webhook-configuration configured
validatingwebhookconfiguration.admissionregistration.k8s.io/cockroach-operator-validating-webhook-configuration configured
serviceaccount/cockroach-operator-sa pruned
~/kube_prune$ kubectl get pods -n cockroach-operator-system
NAME                                         READY   STATUS    RESTARTS   AGE
cockroach-operator-manager-c9b768cc5-swbjm   1/1     Running   0          10s
~/kube_prune$ kubectl get serviceaccounts -n cockroach-operator-system
NAME                         SECRETS   AGE
default                      0         89m
cockroach-operator-account   0         12s

Yes! cockroach-operator-sa was pruned, cockroach-operator-account was created, and the operator itself, cockroach-operator-manager, was reconfigured and its pod restarted. However likely or unlikely these kinds of changes may be, ApplySets let us manage and upgrade CRDs and operators with confidence that we're not making a mess of our cluster.
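As a closing sanity check, the membership label also makes the cluster-scoped pieces of the set easy to audit (a sketch; on a real cluster the label value is a generated hash):

# CRDs and RBAC objects applied as part of the set carry the same label
~/kube_prune$ kubectl get crd,clusterroles,clusterrolebindings -l applyset.kubernetes.io/part-of --show-labels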

Nathan Hensel

on caving, mountaineering, networking, computing, electronics


2023-10-14