You’ve probably seen / had to use Kubernetes to some extent, but maybe you don’t really get it yet? And you’d like to? Then this post is for you. In Part 1, we dug into the basic nuts and bolts of Kubernetes, learning about Pods, Deployments, and Services. Next, we’ll dig into configuration and persistence.

Map of the territory

ConfigMaps & Secrets

You generally don’t want to bake environment-specific values (e.g. a database URL, an API credential, etc.) into your container image, or hardcode them into the Pod specification. Instead, you inject them at deploy time – this lets you run the same image in dev / staging / production, with different configurations.

Setup

First, we’ll create a ConfigMap and look at its contents:

kubectl create configmap app-config --from-literal=GREETING="Hello from ConfigMap" --from-literal=APP_MODE="development"
kubectl get configmap app-config -o yaml

And then we’ll create a Secret:

kubectl create secret generic app-secret --from-literal=DB_PASSWORD="zaphodbeeblebrox"
kubectl get secret app-secret -o yaml

Notice that while the Secret is not shown in plaintext, it is only base64 encoded.

The Secret is not encrypted.
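
To make that concrete, here is a minimal sketch that round-trips the password from above through base64 with nothing but standard shell tools. No key is involved at any point, which is the whole problem.

```shell
# Base64 is an encoding, not encryption: anyone can reverse it without a key.
encoded=$(printf '%s' 'zaphodbeeblebrox' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Against the cluster, the same decoding works on the stored value: kubectl get secret app-secret -o jsonpath='{.data.DB_PASSWORD}' | base64 -d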

It’s also worth noting that we did this imperatively instead of declaratively. And your database password is now in your shell’s history, which also isn’t great. We’ll discuss those points in a bit.

Consuming as env vars

Create a file called manifests/config-pod.yaml with the following contents:

apiVersion: v1
kind: Pod
metadata:
  name: config-test
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "env | grep -E 'GREETING|APP_MODE|DB_PASSWORD'; sleep 3600"]
      env:
        - name: GREETING
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: GREETING
        - name: APP_MODE
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_MODE
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: DB_PASSWORD

Reading through that manifest, you can see that we’re injecting those three values into the Pod’s environment, and then the pod is configured to echo them back out. So, let’s apply it and check its logs to see the output:

kubectl apply -f manifests/config-pod.yaml
kubectl logs config-test

You should see the two configuration values and the secret printed.

Also, while we are managing config values and the secret separately, to the Pod, values from a ConfigMap are indistinguishable from values from a Secret: after pod instantiation, they’re all just strings in environment variables.

Consuming as mounted files

Configurations and secrets can also be made available to the Pod by way of mounted files. This may be preferable if you’re in a situation where environment variables end up a little too visible for comfort (e.g. they end up in logs when a crash happens, or similar).

Create a file called manifests/config-pod-volume.yaml with the following contents:

apiVersion: v1
kind: Pod
metadata:
  name: config-test-vol
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "cat /etc/config/GREETING; echo; cat /etc/secret/DB_PASSWORD; echo; sleep 3600"]
      volumeMounts:
        - name: config-vol
          mountPath: /etc/config
        - name: secret-vol
          mountPath: /etc/secret
  volumes:
    - name: config-vol
      configMap:
        name: app-config
    - name: secret-vol
      secret:
        secretName: app-secret

In this manifest we mount our ConfigMap and our Secret as directories, and we’ve set up our Pod to just cat their contents to stdout. So once again, let’s apply it and check its logs to see the output:

kubectl apply -f manifests/config-pod-volume.yaml
kubectl logs config-test-vol

Each key becomes a separate file in the mount path (e.g. /etc/config/GREETING, /etc/config/APP_MODE) with the value as the file contents. In addition to keeping secrets out of environment variables, this is also particularly useful for full configuration files rather than individual values (e.g. mounting an entire nginx.conf as a single ConfigMap key).
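
As a rough sketch of the “entire config file as one key” idea, the snippet below writes a ConfigMap manifest whose single key is a small nginx.conf. The file name, the ConfigMap name (nginx-config), and the nginx configuration itself are all made up for illustration.

```shell
mkdir -p manifests

# Write a ConfigMap whose one key holds a whole (hypothetical) nginx.conf.
cat > manifests/nginx-configmap.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config   # hypothetical name
data:
  # The key becomes the file name; the block scalar becomes the file contents.
  nginx.conf: |
    events {}
    http {
      server {
        listen 80;
        location / {
          return 200 "hello from a mounted config file";
        }
      }
    }
EOF
```

Mounted as a volume the same way config-vol is mounted above, this would show up as a single file named nginx.conf under the mount path.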

Changes do not propagate

Change the GREETING value in the ConfigMap:

kubectl patch configmap app-config -p '{"data":{"GREETING":"UPDATED GREETING"}}'

Now check both pods:

kubectl exec config-test -- env | grep GREETING
kubectl exec config-test-vol -- cat /etc/config/GREETING

Notice that the environment variables do not auto-update – they were injected when the Container was started and are frozen for the lifetime of the Pod. If you want to update the value, you will need to delete and re-deploy the Pod.

But the mounted file eventually does update. You likely saw the original greeting in both cases when you ran the commands above. But if you run the second one again now (or in a few moments) you’ll get the updated greeting. Kubernetes syncs mounted ConfigMap/Secret volumes periodically (typically within a minute or so), without restarting the Pod.
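
If you’d rather not re-run the command by hand, a small polling helper can do the waiting. This is only a sketch: wait_for_output and POLL_INTERVAL are hypothetical names, not part of kubectl.

```shell
# Hypothetical helper: re-run a command until its output equals an expected
# value, or give up after a fixed number of attempts.
#   $1   expected output
#   $2   maximum number of attempts
#   $3.. the command to run
wait_for_output() {
  expected=$1
  attempts=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
      echo "matched after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-5}"   # seconds between attempts (assumed default)
  done
  echo "gave up after $attempts attempts" >&2
  return 1
}
```

For example: wait_for_output "UPDATED GREETING" 30 kubectl exec config-test-vol -- cat /etc/config/GREETING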

Declarative definitions

You can (and probably should) define your ConfigMap declaratively via a manifest:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  GREETING: "Hello from ConfigMap"
  APP_MODE: "development"

You can also define your Secrets declaratively via a manifest:

apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  DB_PASSWORD: c3VwM3JzM2NyM3Q=

but you probably shouldn’t: you don’t want your plaintext (okay, base64 encoded) secrets committed to your source code repository.

Good practice for secrets here depends on where you want them stored. If you want them stored securely in your repository, encrypt the values before they are committed (e.g. with SOPS, originally from Mozilla); otherwise, use an external secrets manager like Vault / AWS Secrets Manager.
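
As for the shell history problem from earlier, one low-tech option is to create the Secret from a file rather than a literal, using kubectl’s --from-file flag. The wrapper below is a sketch; the function name and the file path in the example are hypothetical.

```shell
# Hypothetical helper: create a Secret from a file instead of a literal, so
# the value never appears on the command line (or in your shell history).
create_secret_from_file() {
  secret_name=$1   # e.g. app-secret
  key=$2           # e.g. DB_PASSWORD
  file=$3          # path to a file containing only the secret value
  kubectl create secret generic "$secret_name" --from-file="$key=$file"
}
```

You’d write the password into a file with your editor, run create_secret_from_file app-secret DB_PASSWORD ./db_password.txt, and then delete the file. The file’s contents are stored verbatim, so watch out for a trailing newline.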

Clean up

Delete your configs, secrets, and pods:

kubectl delete -f manifests/config-pod.yaml
kubectl delete -f manifests/config-pod-volume.yaml
kubectl delete configmap app-config
kubectl delete secret app-secret

Multi-container Pod

So far we’ve been putting each container in its own Pod. But a Pod can actually hold more than one container. Containers in the same pod share a network namespace, which means that they can reach each other via localhost, and can also share volumes.

This construction is often referred to as a “sidecar”: you have a main application, and then a “helper” process that is tightly coupled to the main application’s lifecycle. A common example is a sidecar which ships the main application’s logs to wherever they need to go, or that synchronizes files that the main application reads or writes.

Log-shipping sidecar

Create a file called manifests/sidecar-pod.yaml with the following contents:

apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo
spec:
  containers:
    - name: writer
      image: nginx:1.25
      ports:
        - containerPort: 80
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: sidecar
      image: busybox
      command:
        [
          "sh",
          "-c",
          "tail -f /var/log/nginx/access.log 2>/dev/null || sleep 3600",
        ]
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
  volumes:
    - name: shared-logs
      emptyDir: {}

The first container (“writer”) is our nginx container we know and love. The second container (“sidecar”) reads the log output and prints it to standard out – though in real life, you could imagine it shipping the logs to your logging platform.

They share an ephemeral storage volume called “shared-logs”. The emptyDir is the simplest volume type: it exists for the Pod’s lifetime, is shared by every container in the Pod, and is deleted when the Pod is deleted.

Apply and look around

Apply the manifest and look at the resulting Pod:

kubectl apply -f manifests/sidecar-pod.yaml
kubectl get pod sidecar-demo

Notice that for the first time, the “READY” status shows 2/2, indicating there are two containers running in this Pod.

Now let’s have a look at log output:

kubectl logs sidecar-demo

Note that by default, it shows you the logs of your first container. If you want to view the logs of a specific container in a multi-container pod, you’ll need to pass it in as a -c argument:

kubectl logs sidecar-demo -c writer
kubectl logs sidecar-demo -c sidecar

See how the writer has emitted the typical nginx startup loglines, but the sidecar has not emitted anything yet. This is because nginx has not yet fielded a request.

See the shared resources

Let’s hop into the writer container:

kubectl exec -it sidecar-demo -c writer -- /bin/sh
# ~~ inside the container ~~
curl localhost:80
exit

As you would expect, you can see nginx’s welcome page. Now try from the sidecar:

kubectl exec -it sidecar-demo -c sidecar -- /bin/sh
# ~~ inside the container ~~
wget -O- localhost:80
ls /var/log/nginx
exit

Since the two containers share a network namespace, our network request also loads the nginx welcome page (though since the sidecar is just a busybox image, we had to use wget instead of curl). We can also see the nginx request log in the volume that the two containers share.

Now let’s check logs again:

kubectl logs sidecar-demo -c writer
kubectl logs sidecar-demo -c sidecar

Now the sidecar’s logs contain the nginx requests. Note that they both came from localhost.

While the two containers share a disk volume and the network namespace, they do not share a process namespace. Run ps -A from the sidecar to see this:

kubectl exec -it sidecar-demo -c sidecar -- ps -A

Note that the nginx process is not present.

If you need your containers to share their process namespaces (e.g. in case one needs to send signals to the others’ processes), you can do so by setting spec.shareProcessNamespace: true in the Pod manifest.
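
Here is a sketch of what that could look like, written out as a manifest file. The file name and Pod name (shared-pid-demo) are hypothetical; the only new piece compared to the sidecar manifest is the shareProcessNamespace line.

```shell
mkdir -p manifests

# A two-container Pod whose containers share one process namespace.
cat > manifests/shared-pid-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-demo   # hypothetical name
spec:
  shareProcessNamespace: true   # containers in this Pod see each other's processes
  containers:
    - name: writer
      image: nginx:1.25
    - name: sidecar
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
EOF
```

After kubectl apply -f manifests/shared-pid-pod.yaml, running kubectl exec shared-pid-demo -c sidecar -- ps should list the nginx processes alongside the sidecar’s own.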

Kill one container

Exec into the nginx container and kill it:

kubectl exec -it sidecar-demo -c writer -- /bin/sh
# ~~ inside the container ~~
kill 1
# ~~ note: the container gets killed ~~
kubectl get pod sidecar-demo

Killing PID 1 terminates the container, but notice that the Pod does not die. Instead, it shows 1/2 under the READY column, and immediately restarts the writer container. Run kubectl describe pod sidecar-demo and you’ll see a restart event for only the writer container: by default, individual container restarts within a multi-container pod are independent.

Clean up

Run the following command to delete the pod:

kubectl delete -f manifests/sidecar-pod.yaml

Persistent storage

In our last manifest, we used an emptyDir storage volume. This is fine for scratch space, but it dies with the Pod – not very useful for data that needs to survive a Pod restart, such as in a database.

The two-object model

For persistent storage, Kubernetes tracks two different kinds of objects in order to decouple “what storage do I need” from “where does that storage physically live”.

A PersistentVolume (PV) represents an actual piece of storage (e.g. a disk, an NFS share, a cloud volume, etc), which can be provisioned manually by an admin, or dynamically by a StorageClass (which is more typical in a cloud environment).

A PersistentVolumeClaim (PVC) represents a request for storage allocation (e.g. “I need 1Gi of ReadWriteOnce”) that a Pod can reference. Kubernetes is responsible for binding a PVC to a suitable PV. Manifests will only ever reference PVCs, and are agnostic to what PV may be in use.

kind ships with a default StorageClass that dynamically creates a PV when a PVC calls for one. You can see it here:

kubectl get storageclass

Create a PVC

Create a file called manifests/pvc.yaml with the following contents:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

And then apply it and look at what you’ve got:

kubectl apply -f manifests/pvc.yaml
kubectl get pvc
kubectl get pv

You’ll notice that the PVC sits at “Pending” and the PV doesn’t get created at all. Let’s diagnose what’s happening:

kubectl describe pvc data-pvc

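Look at the Events section, and you’ll see “waiting for first consumer to be created before binding”. Our default StorageClass does not actually allocate the PV until a Pod uses the PVC: its volume binding mode is WaitForFirstConsumer, which you can see in the VOLUMEBINDINGMODE column of kubectl get storageclass.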

Use the PVC

Create a file called manifests/storage-pod.yaml with the following contents:

apiVersion: v1
kind: Pod
metadata:
  name: storage-test
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: data-pvc

Then apply it, and let’s look at our PVC and PV again:

kubectl apply -f manifests/storage-pod.yaml
kubectl get pvc
kubectl get pv

Now you can see the PVC’s status is “Bound”, and the PV backing it was automatically allocated.

Data surviving Pod death

Let’s use our Pod to write a file in our PVC:

kubectl exec -it storage-test -- /bin/sh
# ~~ inside the Pod ~~
echo "Hello there" > /data/test.txt
exit

Now delete the pod, create a new instance of it, and read the file:

kubectl delete pod storage-test
kubectl apply -f manifests/storage-pod.yaml
kubectl exec storage-test -- cat /data/test.txt

The file should still be there, even though our Pod is brand new, because the PVC / PV are independent of the Pod.

So this is how you do Databases

Yep.

So let’s do one.

Create a file called manifests/postgres.yaml with the following contents:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: "testpass"
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
              subPath: pgdata
      volumes:
        - name: pgdata
          persistentVolumeClaim:
            claimName: postgres-pvc

This manifest is a little different from the ones we’ve written so far. Fun fact: you can bundle multiple resources into the same manifest file; just put three dashes on a line between them.

The subPath directive when mounting the PVC means we’re going to use a subdirectory on the volume instead of the volume root. This is because postgres expects a completely empty directory, and without the subpath it’d complain about lost+found.

By the way, it’s worth noting that you can tab-complete your resource names in kubectl commands (provided kubectl’s shell completion is set up). This will save you doing a quick kubectl get pod when you apply the manifest and put something in the database:

kubectl apply -f manifests/postgres.yaml
kubectl wait --for=condition=ready pod -l app=postgres --timeout=60s
kubectl exec -it postgres-[TAB_COMPLETE_HERE] -- psql -U postgres

# ~~ postgres inside the container ~~
CREATE TABLE test (id serial primary key, note text);
INSERT INTO test (note) VALUES ('survived a restart');
exit
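
If tab completion isn’t set up in your shell, you can look the Pod name up by label instead. A small sketch follows; pod_for_label is a hypothetical helper name, while the -l and -o jsonpath flags are standard kubectl.

```shell
# Hypothetical helper: print the name of the first Pod matching a label selector.
pod_for_label() {
  kubectl get pod -l "$1" -o jsonpath='{.items[0].metadata.name}'
}
```

Usage: kubectl exec -it "$(pod_for_label app=postgres)" -- psql -U postgres. Alternatively, kubectl exec accepts a resource reference such as deploy/postgres in place of a Pod name, and picks one of the Deployment’s Pods for you.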

Data survives the Pod

Delete the Pod and wait for a new one to be created. For fun, here’s another way to delete resources: by label.

kubectl delete pod -l app=postgres
kubectl wait --for=condition=ready pod -l app=postgres --timeout=60s

Now query the database:

kubectl exec -it postgres-[TAB_COMPLETE_HERE] -- psql -U postgres

# ~~ postgres inside the container ~~
SELECT * FROM test;
exit

You should see the data you stored in the old pod.

Clean up

Delete the resources we made:

kubectl delete -f manifests/postgres.yaml
kubectl delete -f manifests/storage-pod.yaml
kubectl delete -f manifests/pvc.yaml

Since they eat up disk space, it’s worth confirming your PV / PVCs are gone:

kubectl get pvc
kubectl get pv

StatefulSets

Deployments are great for stateless web servers. A Deployment’s pods are essentially fungible, the whole “cattle not pets” model: any replica can be replaced by any other, names are random hashes, none of them have a persistent individual identity.

That’s fine for stateless web servers, but doesn’t work for things like a database cluster, where you might need to have a primary replica and one or two other replicas that sync from the primary. In cases like that, you need each replica to have a stable identity that survives restarts.

StatefulSets provide stable identity, ordered startup/shutdown, and stable per-replica storage.

Create a headless service

StatefulSets need a headless service (i.e. no ClusterIP) in order to provide stable per-Pod DNS names, rather than just load-balancing across an undifferentiated pool of replicas.

Create a file called manifests/statefulset.yaml with the following contents:

apiVersion: v1
kind: Service
metadata:
  name: web-headless
spec:
  clusterIP: None
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web-headless
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata: { labels: { app: web } }
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports: [{ containerPort: 80 }]
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata: { name: www }
      spec:
        accessModes: ["ReadWriteOnce"]
        resources: { requests: { storage: 100Mi } }

Notice how our StatefulSet has a volumeClaimTemplates section. Instead of every Pod sharing a single PVC, as the replicas of a Deployment would, each Pod here gets its own PVC, created from the template we supply.

Now apply the manifest and watch the pods come up:

kubectl apply -f manifests/statefulset.yaml
kubectl get pods
kubectl get pods
...

You’ll see the three pods come up sequentially and in order. Moreover, instead of having names with randomized hashes in them, they have predictable names of web-0, web-1, and web-2.

Pod deletion

Delete a pod and watch it come back:

kubectl delete pod web-1
kubectl get pods

When the replacement pod is launched, it has the same name as the one you deleted: the StatefulSet guarantees that identity is stable and reused, even with full Pod deletion.

Examine the storage

Look at the PVCs you created:

kubectl get pvc

You’ll see three separate PVCs, each named for the Pod it is bound to. Now let’s write something to them:

kubectl exec web-0 -- /bin/sh -c "echo 'aa' > /usr/share/nginx/html/index.html"
kubectl exec web-1 -- /bin/sh -c "echo 'bb' > /usr/share/nginx/html/index.html"
kubectl exec web-2 -- /bin/sh -c "echo 'cc' > /usr/share/nginx/html/index.html"

And read them back:

kubectl exec web-0 -- cat /usr/share/nginx/html/index.html
kubectl exec web-1 -- cat /usr/share/nginx/html/index.html
kubectl exec web-2 -- cat /usr/share/nginx/html/index.html

Data is stable

Kill a pod, wait for it to come back, and read its data again:

kubectl delete pod web-0
# wait a sec for it to come back...
kubectl exec web-0 -- cat /usr/share/nginx/html/index.html

DNS is also stable

While a Pod’s name is stable, it’s not the same Pod. Let’s redo our pod deletion experiment, but this time keeping an eye on the IP address of the Pod:

kubectl get pods -o wide
kubectl delete pod web-0
kubectl get pods -o wide

As you can see, the IP address of the replacement pod is not the same. However, we still get stable addressability by virtue of each Pod getting its own stable DNS name:

kubectl run debug --image=busybox --rm -it -- /bin/sh
# ~~ inside the temporary container ~~
nslookup web-0.web-headless.default.svc.cluster.local
nslookup web-1.web-headless.default.svc.cluster.local
nslookup web-2.web-headless.default.svc.cluster.local

The name scheme is [pod-name].[service-name].[namespace-name].svc.cluster.local.
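
The scheme is mechanical enough to generate. A minimal sketch, assuming the cluster domain is cluster.local (as in the names above) and using a made-up helper name, pod_dns_name:

```shell
# Build the stable DNS name for a Pod behind a headless Service.
# Assumes the cluster domain is cluster.local.
pod_dns_name() {
  pod=$1
  service=$2
  namespace=${3:-default}   # falls back to the default namespace
  printf '%s.%s.%s.svc.cluster.local\n' "$pod" "$service" "$namespace"
}

# Print the names for the three replicas of our StatefulSet.
for i in 0 1 2; do
  pod_dns_name "web-$i" web-headless
done
```

This prints the same three names we looked up with nslookup.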

Scaling respects order

Scale down the number of replicas, and watch them go:

kubectl scale statefulset web --replicas=1
kubectl get pods

You’ll see that you’re left with just web-0. If you were quick, you’d also have seen that the pods are sequentially removed, one at a time, in reverse order of creation. The idea is that when you scale down, you want to remove the newest or “most expendable” replicas.

So going back to our database example, it’s a good idea to put your primary or leader in the first Pod.
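
The ordering rule is simple enough to model in a few lines of shell. This is just an illustration of the behaviour described above, not something the cluster runs; scale_down_order is a hypothetical name.

```shell
# Model of the scale-down order: the highest ordinal goes first.
#   $1 StatefulSet name, $2 current replica count, $3 target replica count
scale_down_order() {
  name=$1
  from=$2
  to=$3
  i=$((from - 1))
  while [ "$i" -ge "$to" ]; do
    echo "$name-$i"
    i=$((i - 1))
  done
}

# Scaling web from 3 replicas to 1 removes web-2, then web-1.
scale_down_order web 3 1
```

Ordinal 0 never appears in the output unless you scale all the way to zero, which is why it’s the natural home for a primary.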

Clean up

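Unlike our Deployment, whose PVC lived in the same manifest and was deleted along with it, the PVCs created from a StatefulSet’s volumeClaimTemplates are not deleted when you delete the StatefulSet. So you’ll need to delete both: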

kubectl delete -f manifests/statefulset.yaml
kubectl delete pvc -l app=web
kubectl get pvc

Next steps

Check out Part 3.