Kubernetes SecurityContext Capabilities Introduction
With Kubernetes you can control the level of privilege assigned to each Pod and container. We can use Kubernetes SecurityContext Capabilities to add or remove Linux capabilities for a container, which limits what an intruder can do if the container is compromised. SecurityContext Capabilities are tightly coupled with Pod Security Policy (PSP), which defines the policy for the entire cluster. Later we will use these settings together with a PSP to control the privileges of our Pods.
In this tutorial we will give a brief overview of Pod Security Policy (for a detailed understanding of PSP you can read my older article Create Pod Security Policy Kubernetes [Step-by-Step]). Then we will explore Kubernetes SecurityContext Capabilities in detail with multiple examples covering different scenarios.
Create Pod Security Policy
First we will create the Pod Security Policy which we will use throughout this article. Here is my PSP definition file along with the Cluster Role and Cluster Role Binding:
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: testns-psp-01
spec:
  privileged: true
  allowPrivilegeEscalation: true
  requiredDropCapabilities:
  allowedCapabilities:
  - '*'
  defaultAddCapabilities:
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: testns-psp-01
rules:
- apiGroups:
  - policy
  resourceNames:
  - testns-psp-01
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: testns-psp-01
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: testns-psp-01
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io
Here is the output of my installed PSP:
]# kubectl get psp | grep -E 'PRIV|testns'
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
testns-psp-01 false * RunAsAny MustRunAsNonRoot RunAsAny RunAsAny false *
In this Pod Security Policy we have not added any restrictions, so basically everything is allowed.
How to create a privileged container inside a Kubernetes Pod
In this example we will first create a privileged pod, which should have all the capabilities. In most cases the following Kubernetes SecurityContext definition is enough to start a privileged pod:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset
  namespace: testns
spec:
  selector:
    matchLabels:
      app: dev
  serviceName: test-pod
  replicas: 2
  template:
    metadata:
      labels:
        app: dev
    spec:
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          ## enable privileged mode
          privileged: true
Create this statefulset:
]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
Check the list of allowed capabilities:
]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,35,36,37+i
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,35,36,37
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user1)
gid=1025(user1)
As you can see, all the capabilities are allowed in our container.
In some cases, if you don’t see all the capabilities added to your container, you can use the following Kubernetes SecurityContext Capabilities:
...
        securityContext:
          runAsUser: 1025
          privileged: true
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - ALL
...
This YAML file assumes that the corresponding Pod Security Policy allows all capabilities.
How to create a non-privileged container inside a Kubernetes Pod
Now you may wonder: since setting privileged to true enables all the privileges, will simply setting it to false make the pod run without any privileges?
Let’s test this theory with a practical example. We have updated our statefulset definition file with the following Kubernetes SecurityContext fields:
...
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          privileged: false
          allowPrivilegeEscalation: false
...
So, basically I have disabled privilege and any kind of privilege escalation inside the container. Once we create this statefulset, let’s verify the available capabilities on the pod:
]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user1)
gid=1025(user1)
groups=
As you can see, even with privileged: false, the container still has multiple capabilities enabled. These are the default capabilities granted by the container runtime, so this is not actually a non-privileged pod.
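To see exactly which capabilities privileged: false took away, the two Bounding set lines printed above can be compared. The following is a minimal Python sketch of that comparison. The helper name bounding_set is hypothetical, and the two strings are copied from the capsh outputs shown earlier in this article.

```python
def bounding_set(line):
    """Parse a 'Bounding set =a,b,c' line from capsh --print into a set of names."""
    _, _, caps = line.partition("=")
    return {name.strip() for name in caps.split(",") if name.strip()}


# Output of the privileged container (copied from the first example).
privileged = bounding_set(
    "Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,"
    "cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,"
    "cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,"
    "cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,"
    "cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,"
    "cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,"
    "cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,"
    "cap_mac_admin,cap_syslog,35,36,37"
)

# Output of the container started with privileged: false.
default = bounding_set(
    "Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,"
    "cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,"
    "cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap"
)

# Capabilities that only the privileged container has.
removed = sorted(privileged - default)
print(len(removed), "capabilities removed:")
for name in removed:
    print(" ", name)
```

The remaining 14 entries in the default set are what still has to be dropped to get a truly non-privileged container.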
Solution-1: Drop all capabilities using requiredDropCapabilities inside Pod Security Policy
I would not recommend this solution because a PSP applies to the whole cluster, and it does not make sense to disable all the privileges in the PSP just for one pod. However, you can use RBAC to limit the use of this PSP to certain users, in which case this method can be used.
But either way, I will share the steps to drop all the privileges using a Pod Security Policy and you may choose your preferred method.
We will edit our testns-psp-01 using the kubectl edit psp testns-psp-01 -n testns command, which will open the PSP definition file in your default editor. After the update, this is what the spec section of my PSP looks like:
...
spec:
  allowPrivilegeEscalation: false
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - ALL
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'
So, basically I have removed the allowedCapabilities section and added the requiredDropCapabilities field, which will drop all the default capabilities from the container inside the Pod.
We will re-deploy our statefulset to pick up the new changes. Next verify the available capabilities inside the container:
]# kubectl exec -it test-statefulset-1 -n testns -- capsh --print
Current: =
Bounding set =
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user1)
gid=1025(user1)
groups=
Now as you can see, no capabilities are assigned to our container. So this is now a proper non-privileged container inside a Kubernetes Pod.
Solution-2: Using Kubernetes SecurityContext Capabilities in the Pod definition file
Next we will use the Pod definition file to start a non-privileged container by using Kubernetes SecurityContext Capabilities field. In addition to privileged: false, we must explicitly drop all the capabilities as shown below:
...
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          privileged: false
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
...
Let us re-deploy our statefulset and verify the applied Linux capabilities inside the container:
]# kubectl exec -it test-statefulset-1 -n testns -- capsh --print
Current: =
Bounding set =
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user1)
gid=1025(user1)
groups=
So as expected, the container has dropped all the capabilities and can be used as a non-privileged container in Kubernetes Pod.
How to assign limited Linux capabilities to a container inside Kubernetes Pod
Now that we know how to create a privileged and a non-privileged pod, let me show you an example of creating a pod with limited privileges.
In this example we will only add SYS_TIME capability to our container inside the Kubernetes Pod. To achieve this, I have modified my Pod Security Policy to allow privileged pods and allow all capabilities to be added. We don’t want to restrict this at PSP level, rather we will control this at Pod level.
]# kubectl get psp | grep -E 'PRIV|testns'
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
testns-psp-01 true * RunAsAny MustRunAsNonRoot RunAsAny RunAsAny false *
Here is the snippet of my Kubernetes SecurityContext Capabilities which I will use to first drop all the capabilities and then add only the SYS_TIME capability.
Note that the order can matter here: if you provide the add field with SYS_TIME first and the drop ALL field afterwards, all the capabilities may be dropped from the container. So, make sure you use drop first, followed by add.
...
    spec:
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          privileged: false
          allowPrivilegeEscalation: true
          capabilities:
            drop:
            - ALL
            add:
            - SYS_TIME
...
Let us re-deploy our statefulset and check the applied capabilities:
]# kubectl exec -it test-statefulset-1 -n testns -- capsh --print
Current: = cap_sys_time+i
Bounding set =cap_sys_time
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1025(user1)
gid=1025(user1)
groups=
As expected, the container has dropped all the other capabilities and only applied SYS_TIME.
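The effect of the drop and add lists can be pictured with a small Python sketch. This is only a mental model under stated assumptions, not the code of any container runtime: the function name resolve_capabilities is hypothetical, the default set is copied from the capsh output of the privileged: false container shown earlier, and the drop list is assumed to be applied before the add list.

```python
# Default set copied from the capsh output of the container started with
# privileged: false earlier in this article (an assumption of this sketch).
DEFAULT_CAPS = frozenset({
    "cap_chown", "cap_dac_override", "cap_fowner", "cap_fsetid", "cap_kill",
    "cap_setgid", "cap_setuid", "cap_setpcap", "cap_net_bind_service",
    "cap_net_raw", "cap_sys_chroot", "cap_mknod", "cap_audit_write",
    "cap_setfcap",
})


def resolve_capabilities(drop=(), add=(), defaults=DEFAULT_CAPS):
    """Simplified model: start from the defaults, apply drop, then apply add."""
    caps = set(defaults)
    for name in drop:
        if name.upper() == "ALL":
            caps.clear()
        else:
            caps.discard("cap_" + name.lower())
    for name in add:
        if name.upper() == "ALL":
            # Adding ALL needs the full kernel capability list, which this
            # sketch deliberately does not model.
            raise ValueError("add: ALL is not modelled in this sketch")
        caps.add("cap_" + name.lower())
    return sorted(caps)


print(resolve_capabilities(drop=["ALL"], add=["SYS_TIME"]))  # ['cap_sys_time']
print(resolve_capabilities(drop=["ALL"]))                    # []
```

The first call mirrors the securityContext used in this section, and the second mirrors the drop-only configuration from Solution-2.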
How to check the list of capabilities applied to a container inside Kubernetes Pod
Let me show you different ways to get the list of capabilities applied to your Kubernetes Pod’s container:
Method-1: Check the list of Linux capabilities in a container using capsh --print command
We will use the capsh command to print the list of capabilities applied to any container.
[user1@test-statefulset-1 /]$ capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
Here, we have two fields:
Current: This field contains the capability sets of the current process (here, the shell that ran capsh).
Bounding set: This field contains the limiting set of capabilities that a process can gain during execve(2). A capability that is not in the bounding set cannot be acquired by any system or application process in the container.
You may also notice +i at the end of the Current set of capabilities. The letters after the + sign name the thread capability set to which the listed capabilities belong, so +i means they are in the inheritable set only. There are three different types of thread capability set which can be defined or allocated:
- Effective (e) - the capabilities used by the kernel to perform permission checks for the thread.
- Permitted (p) - the capabilities that the thread may assume (i.e., a limiting superset for the effective and inheritable sets). If a thread drops a capability from its permitted set, it can never re-acquire that capability (unless it exec()s a set-user-ID-root program).
- Inheritable (i) - the capabilities preserved across an execve(2).
A child created via fork(2) inherits copies of its parent’s capability sets. Using capset(2), a thread may manipulate its own capability sets, or, if it has the CAP_SETPCAP capability, those of a thread in another process.
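To make the +i suffix concrete, here is a minimal Python sketch that splits a Current: line into effective, inheritable and permitted sets. The function name parse_current is hypothetical, and the parser is an assumption-laden simplification: it only understands the older capsh text layout shown in this article (a bare = followed by clauses such as cap_sys_time+i) and rejects anything else.

```python
import re

# One clause of the capsh text shown above, e.g. "cap_chown,cap_kill+i".
_CLAUSE = re.compile(r"^([a-z0-9_,]+)\+([eip]+)$")


def parse_current(line):
    """Return {'e': set, 'i': set, 'p': set} for a 'Current:' line."""
    text = line.split(":", 1)[1]
    sets = {"e": set(), "i": set(), "p": set()}
    for clause in text.split():
        if clause == "=":
            # A bare "=" clears every capability from all three sets.
            for members in sets.values():
                members.clear()
            continue
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        names = [name for name in match.group(1).split(",") if name]
        for flag in match.group(2):
            sets[flag].update(names)
    return sets


print(parse_current("Current: = cap_sys_time+i"))
# {'e': set(), 'i': {'cap_sys_time'}, 'p': set()}
```

For the SYS_TIME container from the previous section, the effective and permitted sets come out empty and only the inheritable set contains cap_sys_time.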
Method-2: Check applied capabilities per process
The above command shows the capabilities of the shell in which it is run. We can also list the capabilities of an individual process. For example, my container has the following processes running:
[user1@test-statefulset-1 /]$ ps -ef
UID PID PPID C STIME TTY TIME CMD
user1 1 0 0 17:38 ? 00:00:00 /usr/bin/python /usr/bin/supervisord -c /etc/supervisord.conf
user1 9 1 0 17:38 ? 00:00:00 /usr/sbin/rsyslogd -n -f /tmp/rsyslog.conf -i /tmp/rsyslog.pid
root 10 1 0 17:38 ? 00:00:00 /usr/sbin/sshd -D -f /opt/ssh/sshd_config -p 5022 -E /tmp/sshd.log
user1 643 0 0 17:48 pts/0 00:00:00 bash
user1 1214 643 0 17:58 pts/0 00:00:00 ps -ef
Now I want to check the list of capabilities used by my SSHD process which has PID 10.
[user1@test-statefulset-1 /]$ grep Cap /proc/10/status
CapInh: 00000000a82425fb
CapPrm: 00000000a82425fb
CapEff: 00000000a82425fb
CapBnd: 00000000a82425fb
CapAmb: 0000000000000000
Here,
- CapInh = Inheritable capabilities
- CapPrm = Permitted capabilities
- CapEff = Effective capabilities
- CapBnd = Bounding set
- CapAmb = Ambient capabilities set
So we get a hexadecimal value for each capability set. To convert a hex value into a human-readable list of capabilities we will use the following command:
[user1@test-statefulset-1 /]$ capsh --decode=00000000a82425fb
0x00000000a82425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
So, now we have the list of capabilities used by the SSHD process.
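The hex value is a bitmask in which bit N stands for capability number N. As a rough illustration of what capsh --decode does, here is a minimal Python sketch. The function names decode_caps and read_cap_fields are hypothetical, and the name table follows the numbering in the kernel header linux/capability.h for bits 0 to 37, which covers every capability shown in this article.

```python
# Capability names indexed by bit number (bits 0 to 37).
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read",
]


def decode_caps(hex_mask):
    """Turn a hex bitmask such as '00000000a82425fb' into capability names."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Bits beyond the table are reported by number.
            names.append(CAP_NAMES[bit] if bit < len(CAP_NAMES) else str(bit))
        bit += 1
    return names


def read_cap_fields(status_text):
    """Extract the Cap* fields from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("Cap"):
            fields[key] = value.strip()
    return fields


# Sample text copied from the grep output above.
status = """CapInh: 00000000a82425fb
CapPrm: 00000000a82425fb
CapEff: 00000000a82425fb
CapBnd: 00000000a82425fb
CapAmb: 0000000000000000
"""

fields = read_cap_fields(status)
print(",".join(decode_caps(fields["CapEff"])))
```

Run against the sample text, this prints the same list as the capsh --decode output above. On a live system the status text could be read from /proc/10/status instead of the embedded string.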
How to assign Linux capability to individual file or binary (setcap)
By default many Linux system binaries have some capabilities assigned to them. You can check this using the getcap command. For example, to check the list of capabilities assigned to the ping command we can use:
[user1@test-statefulset-1 /]$ getcap `which ping`
/usr/bin/ping = cap_net_admin,cap_net_raw+p
So the ping command requires cap_net_admin and cap_net_raw to be able to function properly.
Let’s use ping with the default capabilities:
[user1@test-statefulset-1 /]$ capsh -- -c "/bin/ping -c 1 localhost"
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.018 ms
--- localhost ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.018/0.018/0.018/0.000 ms
This seems to be working. Let’s try the same command but without the cap_net_admin capability:
[user1@test-statefulset-1 /]$ capsh --drop=cap_net_admin -- -c "/bin/ping -c 1 localhost"
unable to raise CAP_SETPCAP for BSET changes: Operation not permitted
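As you can see, the command fails with an Operation not permitted error. Note that the message comes from capsh itself: it could not raise CAP_SETPCAP, which it needs to change the bounding set, so ping was never executed.
To add a capability to any file we can use the setcap command. Let us add a capability to the /usr/sbin/sshd binary. Currently, as you can see, there are no capabilities assigned to this binary: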
[user1@test-statefulset-1 /]$ getcap /usr/sbin/sshd
Next I will add the NET_ADMIN capability to this binary file:
[user1@test-statefulset-1 /]$ setcap cap_net_admin+i /usr/sbin/sshd
Verify the same again:
[user1@test-statefulset-1 /]$ getcap /usr/sbin/sshd
/usr/sbin/sshd = cap_net_admin+i
Summary
In this tutorial we explored different areas related to Kubernetes SecurityContext Capabilities. We covered the following topics in this article:
- How to create a privileged and a non-privileged container inside a Kubernetes Pod
- How to add or drop all the capabilities from a Pod
- How to add a single capability or a pre-defined set of capabilities to a container
- Understanding more about Linux capabilities
- How to check which capabilities are assigned to a container
Further Readings
- man page for capabilities
- Linux Capabilities In Practice
- man page for setcap

