Kubernetes SecurityContext Overview
To enforce policies at the pod level, we can use the Kubernetes SecurityContext field in the pod specification. A security context defines privilege and access control settings for a Pod, or for individual containers running inside it.
Here are some of the settings which can be configured as part of Kubernetes SecurityContext field:
- runAsUser specifies the UID with which each container will run
- runAsNonRoot is a flag that prevents starting containers that run as UID 0 (root)
- runAsGroup specifies the GID used to run the entrypoint of the container process
- supplementalGroups specifies additional groups (GIDs) applied to the first process in each container
- fsGroup specifies the group (GID) for filesystem ownership and newly created files. This applies to the entire Pod and cannot be set per container.
- allowPrivilegeEscalation controls whether a process inside the container can gain more privileges than its parent process
- readOnlyRootFilesystem mounts the container's root filesystem as read-only
- capabilities controls Linux capabilities, which can be enabled using the 'add' keyword or disabled using the 'drop' keyword for the container
- Seccomp: filter a process's system calls
- AppArmor: use program profiles to restrict the capabilities of individual programs
- **Security Enhanced Linux (SELinux)**: objects are assigned security labels
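To make these settings concrete, here is a minimal sketch of a Pod that combines several of them. The name, image, and ID values below are placeholders for illustration, not taken from the examples later in this article:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod-sketch
spec:
  securityContext:            # Pod-level: applies to all containers
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault    # filter system calls with the runtime's default profile
  containers:
  - name: app
    image: busybox            # placeholder image
    command: ["sleep", "999999"]
    securityContext:          # container-level settings override Pod-level ones
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
```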
Prerequisites
Before you start with Kubernetes SecurityContext, consider the following points:
- You have an up-and-running Kubernetes Cluster
- You will need a Pod Security Policy in place, which will be used to allow the different types of Kubernetes SecurityContext settings such as privileges, capabilities, etc.
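For reference, a minimal permissive PodSecurityPolicy might look like the sketch below; the name is a placeholder. Note that PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25, so this applies only to older clusters:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: example-psp          # placeholder name
spec:
  privileged: false          # disallow privileged containers
  allowPrivilegeEscalation: true
  runAsUser:
    rule: RunAsAny           # no restriction on the UID
  seLinux:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'                      # allow all volume types
```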
Using runAsUser with Kubernetes SecurityContext
In this section we will explore the runAsUser field used with the Kubernetes
SecurityContext. The runAsUser field can be applied at the Pod level or at
the container level. Let me demonstrate both with examples.
Example-1: Define runAsUser for entire Pod
In this example we have a multi-container pod where we define the runAsUser parameter under the Pod-level Kubernetes SecurityContext so that it applies to all the containers running inside the Pod.
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    runAsUser: 1025
  containers:
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
We will create this Pod:
~]# kubectl create -f security-context-runasuser-1.yaml
pod/pod-as-user-guest created
Check the status of the Pod; both of our containers should be in Running
state:
~]# kubectl get pods -n test1
NAME READY STATUS RESTARTS AGE
pod-as-user-guest 2/2 Running 0 4s
We can connect to both the containers and verify the default user:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
As expected, both containers are running with the user ID
defined with runAsUser under the Pod-level Kubernetes SecurityContext.
We will delete this pod:
~]# kubectl delete pod pod-as-user-guest -n test1
pod "pod-as-user-guest" deleted
Example-2: Define runAsUser for container
In this section we will define a different user for each individual container inside the Kubernetes SecurityContext of the Pod definition file:
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  containers:
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    securityContext:
      runAsUser: 1025
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    securityContext:
      runAsUser: 1026
Here I have defined runAsUser separately for both containers
inside the Kubernetes SecurityContext, so the two containers will run as
different users.
Create this Pod:
~]# kubectl create -f security-context-runasuser-1.yaml
pod/pod-as-user-guest created
Check the status:
~]# kubectl get pods -n test1
NAME READY STATUS RESTARTS AGE
pod-as-user-guest 2/2 Running 0 94s
Verify the USER ID of both the containers:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1026(user1) gid=1026(user1) groups=1026(user1)
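A related field is runAsNonRoot: instead of pinning a specific UID, it only asserts that the container must not run as UID 0. A sketch, reusing the image from the examples above (the pod name is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-nonroot       # placeholder name
  namespace: test1
spec:
  containers:
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    securityContext:
      # kubelet refuses to start the container if it would run as UID 0
      runAsNonRoot: true
```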
Define common group for shared volumes in Kubernetes (fsGroup)
When we share volumes across multiple containers, access
permissions can become a concern. In such scenarios we can use fsGroup
under the Kubernetes SecurityContext to define a common group that will
act as the group owner for any such shared volumes.
fsGroup is assigned at the Pod level, so you cannot set it in a
container-level Kubernetes SecurityContext; if you try to assign it at
the container level you will get the error below:
error: error validating "security-context-fsgroup-1.yaml": error validating data: [ValidationError(Pod.spec.containers[0].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext, ValidationError(Pod.spec.containers[1].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext]; if you choose to ignore these errors, turn validation off with --validate=false
This is our sample YAML file to create a pod using fsGroup:
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    fsGroup: 555
  containers:
  # container one
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    # container one running as user id 1025
    securityContext:
      runAsUser: 1025
    # mount the emptyDir under /volume
    volumeMounts:
    - name: shared-volume
      mountPath: /volume
  # container two
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    # container two running as user id 1026
    securityContext:
      runAsUser: 1026
    # mount the emptyDir under /volume
    volumeMounts:
    - name: shared-volume
      mountPath: /volume
  # define the emptyDir volume
  volumes:
  - name: shared-volume
    emptyDir: {}
Create this pod:
~]# kubectl create -f security-context-fsgroup-1.yaml
pod/pod-as-user-guest created
Verify the group ownership on the shared volume:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- bash
[user2@pod-as-user-guest /]$ id
uid=1025(user2) gid=1025(user2) groups=1025(user2),555
[user2@pod-as-user-guest /]$ ls -ld /volume/
drwxrwsrwx. 2 root 555 4096 Sep 3 09:28 /volume/
[user2@pod-as-user-guest /]$ exit
So, on container one, the /volume path is owned by GID 555 as
expected. The id command shows the container is running with user ID
1025, as specified in the pod definition. The effective group ID is
1025(user2), but group ID 555 is also associated with the user.
Let’s verify the same on container two:
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- bash
[user1@pod-as-user-guest /]$ id
uid=1026(user1) gid=1026(user1) groups=1026(user1),555
[user1@pod-as-user-guest /]$ ls -ld /volume/
drwxrwsrwx. 2 root 555 4096 Sep 3 09:28 /volume/
One more thing you should know: with the fsGroup Kubernetes
SecurityContext, any files created inside the shared volume will have
the group ownership of the GID provided in the pod definition file,
while files created elsewhere will not. For example, here I will create
a file inside /tmp:
[user1@pod-as-user-guest ~]$ touch /tmp/file
[user1@pod-as-user-guest ~]$ ls -l /tmp/file
-rw-rw-r--. 1 user1 user1 0 Sep 3 09:42 /tmp/file
As you can see above, the file is owned by the user1 user and group. But
if you create a file inside the shared volume, i.e. the /volume path,
then the group owner of that file will be the same as the fsGroup value,
i.e. 555 in our case:
[user1@pod-as-user-guest ~]$ touch /volume/file
[user1@pod-as-user-guest ~]$ ls -l /volume/file
-rw-rw-r--. 1 user1 555 0 Sep 3 09:42 /volume/file
As you can see, the fsGroup Kubernetes SecurityContext property is
used when the process creates files in a volume (but this depends on the
volume plugin used).
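On newer Kubernetes versions (v1.20 and later), you can also tune how aggressively the kubelet re-applies fsGroup ownership with the fsGroupChangePolicy field. This is a sketch of the relevant fragment, not part of the example above:

```yaml
spec:
  securityContext:
    fsGroup: 555
    # Only change ownership/permissions when the volume root does not
    # already match fsGroup; avoids a full recursive chown on every mount
    fsGroupChangePolicy: "OnRootMismatch"
```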
Define supplementalGroups inside Kubernetes SecurityContext
We can combine fsGroup with supplementalGroups inside the Pod’s
SecurityContext field to define additional groups. In that case the
runAsUser user (or the default image user) will also be added to these
supplementary groups.
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    fsGroup: 555
    # Define additional groups for the default user
    supplementalGroups: [666, 777]
  containers:
  # container one
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
...
We will create this Pod and verify the list of groups assigned to the
container's user:
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1026(user1) gid=1026(user1) groups=1026(user1),555,666,777
So now along with fsGroup, our user has also been added to additional
supplementary groups.
Using allowPrivilegeEscalation with Kubernetes SecurityContext
In this section we will cover different areas related to privileges,
where we will add or remove capabilities for the container. Every
process inside a container uses Linux kernel capabilities to perform
privileged tasks; for example, changing the ownership of a file requires
the CAP_CHOWN capability, changing a process's user or group IDs
requires CAP_SETUID and CAP_SETGID, and mounting or unmounting
filesystems with the mount and umount commands requires CAP_SYS_ADMIN.
With containers you can accordingly add or drop these capabilities so
that the user inside the container has only limited privileges, which is
considered more secure. The allowPrivilegeEscalation flag itself
controls whether a process can gain more privileges than its parent
process; to grant or restrict individual capabilities you use the
capabilities field inside the Pod's Kubernetes SecurityContext.
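Conversely, the hardened default for most workloads is to disable privilege escalation entirely. A minimal container-level sketch, not tied to the examples that follow:

```yaml
securityContext:
  # sets the no_new_privs flag: setuid binaries cannot raise privileges
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]   # start from an empty capability set
```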
The capabilities which a pod is allowed to use are defined using a PodSecurityPolicy.
Example-1: Using allowedCapabilities in Pod Security Policy
The allowedCapabilities field is used to specify which capabilities
pod authors can add in the Kubernetes
securityContext.capabilities field in the container spec.
I have the following PSP currently added to my Kubernetes Cluster, and I
have added some capabilities under allowedCapabilities:
~]# kubectl get psp testns-psp-01 -o yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - SYS_ADMIN
  - NET_BIND_SERVICE
  - CHOWN
  requiredDropCapabilities:
  - ALL
...
The status of the PSP can be checked using the command below:
~]# kubectl get psp | grep -E 'PRIV|testns'
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
testns-psp-01 false SYS_ADMIN,NET_BIND_SERVICE,CHOWN RunAsAny MustRunAsNonRoot RunAsAny RunAsAny false *
So we are only allowing capabilities as mentioned under CAPS section.
We will create a StatefulSet with certain pre-defined capabilities, but the capability we add will not be part of the allowed capabilities in the Pod Security Policy. Here is my definition file:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset
  namespace: testns
spec:
  selector:
    matchLabels:
      app: dev
  serviceName: test-pod
  replicas: 2
  template:
    metadata:
      labels:
        app: dev
    spec:
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          ## privileged mode is disabled by default
          privileged: false
          ## allow the use of capabilities
          allowPrivilegeEscalation: true
          capabilities:
            ## drop all capabilities
            drop:
            - ALL
            ## the pod creation should fail as SUID is not in allowedCapabilities
            add:
            - SUID
Here we are trying to use a capability in the Kubernetes SecurityContext which has not been allowed in the Pod Security Policy, so let’s try to create this statefulset:
~]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
The statefulset has been successfully created but the pods have not come up:
~]# kubectl get statefulset -n testns
NAME READY AGE
test-statefulset 0/2 12m
We can use the kubectl describe statefulset test-statefulset -n testns
command to troubleshoot the issue:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 10s (x6 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added]
Warning FailedCreate 7s (x2 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed]
Warning FailedCreate 2s (x4 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added]
As expected, since the SUID capability was not allowed in the
PodSecurityPolicy, the statefulset failed to create the pods.
Now I have modified my statefulset definition file to use the CHOWN capability instead of SUID, and the pod creation was successful. We can verify this using the following command:
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown+i
Bounding set =cap_chown
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user2)
gid=1025(user2)
groups=
Example-2: Using defaultAddCapabilities in PodSecurityPolicy
Next we will update our Pod Security Policy to also add some
defaultAddCapabilities. All capabilities listed under
the defaultAddCapabilities field will be added to every deployed pod’s
containers. If a user doesn’t want certain containers to have those
capabilities, they need to explicitly drop them in the specs of those
containers.
I have modified my testns-psp-01 using
kubectl edit psp testns-psp-01 -n testns command and added
defaultAddCapabilities field with new capability:
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - SYS_ADMIN
  - NET_BIND_SERVICE
  - CHOWN
  defaultAddCapabilities:
  - NET_RAW
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - ALL
...
So we have marked NET_RAW as a default capability which will be added
to any container using this Pod Security Policy.
Let us quickly create one statefulset using our last pod definition file
and verify if NET_RAW capability is automatically added to the
container:
...
securityContext:
  runAsUser: 1025
  ## privileged mode is disabled by default
  privileged: false
  ## allow the use of capabilities
  allowPrivilegeEscalation: true
  capabilities:
    ## drop all capabilities
    drop:
    - ALL
    ## add CHOWN capability
    add:
    - CHOWN
...
Let’s create this statefulset and verify the list of allowed capabilities:
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_net_raw+i
Bounding set =cap_chown,cap_net_raw
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user2)
gid=1025(user2)
groups=
As expected, although we only added the CHOWN capability, our pod also
contains the NET_RAW capability, which was added as part of
defaultAddCapabilities from the Pod Security Policy.
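A pod author who does not want a default-added capability has to drop it explicitly in the container spec. A sketch of a container securityContext that opts out of NET_RAW:

```yaml
securityContext:
  runAsUser: 1025
  allowPrivilegeEscalation: true
  capabilities:
    drop:
    - NET_RAW   # explicitly drop the capability added by defaultAddCapabilities
    add:
    - CHOWN     # still allowed by allowedCapabilities
```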
Example-3: Using requiredDropCapabilities in Pod Security Policy
The final field in this example is requiredDropCapabilities. The
capabilities listed in this field are dropped automatically from every
container (the PodSecurityPolicy Admission Control plugin will add them
to every container’s securityContext.capabilities.drop field).
We have updated our statefulset definition file, and now we are not dropping or adding any additional capability:
...
securityContext:
  runAsUser: 1025
  ## privileged mode is disabled by default
  privileged: false
  ## allow the use of capabilities
  allowPrivilegeEscalation: true
...
We have also updated our PodSecurityPolicy using
kubectl edit psp testns-psp-01 -n testns and added SYS_ADMIN as
requiredDropCapabilities:
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - NET_BIND_SERVICE
  - CHOWN
  defaultAddCapabilities:
  - NET_RAW
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - SYS_ADMIN
...
Next, we deploy our statefulset and verify the applied Linux capabilities:
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user2)
gid=1025(user2)
groups=
Here you can check that the SYS_ADMIN capability is not available, as it
was removed using requiredDropCapabilities.
What would happen if you explicitly try to add a dropped capability in SecurityContext?
Here we have dropped the SYS_ADMIN capability using
requiredDropCapabilities in the Pod Security Policy. Now, what if we
explicitly try to add the same capability to our statefulset using the
Kubernetes SecurityContext field, as shown below:
...
securityContext:
  runAsUser: 1025
  ## privileged mode is disabled by default
  privileged: false
  ## allow the use of capabilities
  allowPrivilegeEscalation: true
  capabilities:
    add:
    - SYS_ADMIN
...
Next, let’s try to create this statefulset:
~]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
Now let’s check the events of this statefulset for any errors:
~]# kubectl describe statefulset test-statefulset -n testns
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 5s statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
Warning FailedCreate 5s (x3 over 5s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed]
Warning FailedCreate 3s (x6 over 5s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
Hence, if a user tries to create a pod where they explicitly add one of
the capabilities listed in the policy’s requiredDropCapabilities
field, the pod is rejected.
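To get such a pod admitted, the disallowed capability simply has to be removed from the add list; a compliant sketch of the container securityContext:

```yaml
securityContext:
  runAsUser: 1025
  privileged: false
  allowPrivilegeEscalation: true
  # no SYS_ADMIN (or any other dropped capability) in capabilities.add,
  # so the PodSecurityPolicy admits the pod
```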
Summary
In this tutorial we explored the different Kubernetes SecurityContext settings we can use for a Pod and its containers. In a nutshell, we covered the following topics:
- Containers can be configured to run as a different user and/or group than the one defined in the container image.
- Containers can also run in privileged mode, allowing them to access the node’s devices that are otherwise not exposed to pods.
- Containers can be run as read-only, preventing processes from writing to the container’s filesystem (and only allowing them to write to mounted volumes).
- Cluster-level PodSecurityPolicy resources can be created to prevent users from creating pods that could compromise a node.


