In this article we will discuss how to limit CPU resources in Linux using cgroups and slices, with some practical examples.
Now to start with this article: a cgroup, or Control Group, provides resource management and resource accounting for groups of processes. The kernel implementation of cgroups sits mostly outside performance-critical paths. The cgroups subsystem implements a Virtual File System (VFS) type named "cgroup", and all cgroup actions are performed as filesystem operations, such as creating cgroup directories in a cgroup filesystem, reading and writing entries in these directories, and mounting cgroup filesystems.
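For example, here is a minimal sketch of creating and populating a cgroup purely through filesystem operations, assuming a cgroup v1 hierarchy mounted under /sys/fs/cgroup (as shown later in this article); the group name mygroup is just an illustration:
# mkdir /sys/fs/cgroup/cpu/mygroup
# echo 20000 > /sys/fs/cgroup/cpu/mygroup/cpu.cfs_quota_us
# echo $$ > /sys/fs/cgroup/cpu/mygroup/tasks
Here cpu.cfs_quota_us caps the group at 20 ms of CPU time per default 100 ms period (roughly 20% of one CPU), and writing the shell's PID to the tasks file moves the current shell into the group.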

A few pointers on cgroups to limit resources:
- cgroups have been available in the Linux kernel since 2.6.24 and are integrated with systemd in recent Linux versions.
- Control groups place resources in controllers that represent the type of resource, i.e. you can define groups of available resources to make sure an application such as a web server has a guaranteed claim on resources.
- To do so, cgroups work with default controllers, which are cpu, memory and blkio.
- These controllers are divided into a tree structure, where different weights or limits are applied to each branch.
- Each of these branches is a cgroup.
- One or more processes are assigned to a cgroup.
- cgroups can be applied from the command line or from systemd.
- Manual creation happens through the cgconfig service and the cgred process.
- In all cases, cgroup settings are written to /sys/fs/cgroup:
# ls -l /sys/fs/cgroup/
total 0
drwxr-xr-x 2 root root 0 Nov 26 12:48 blkio
lrwxrwxrwx 1 root root 11 Nov 26 12:48 cpu -> cpu,cpuacct
lrwxrwxrwx 1 root root 11 Nov 26 12:48 cpuacct -> cpu,cpuacct
drwxr-xr-x 2 root root 0 Nov 26 12:48 cpu,cpuacct
drwxr-xr-x 2 root root 0 Nov 26 12:48 cpuset
drwxr-xr-x 3 root root 0 Nov 26 12:50 devices
drwxr-xr-x 2 root root 0 Nov 26 12:48 freezer
drwxr-xr-x 2 root root 0 Nov 26 12:48 hugetlb
drwxr-xr-x 2 root root 0 Nov 26 12:48 memory
lrwxrwxrwx 1 root root 16 Nov 26 12:48 net_cls -> net_cls,net_prio
drwxr-xr-x 2 root root 0 Nov 26 12:48 net_cls,net_prio
lrwxrwxrwx 1 root root 16 Nov 26 12:48 net_prio -> net_cls,net_prio
drwxr-xr-x 2 root root 0 Nov 26 12:48 perf_event
drwxr-xr-x 2 root root 0 Nov 26 12:48 pids
drwxr-xr-x 4 root root 0 Nov 26 12:48 systemd
These are the different controllers created by the kernel itself. Each of these controllers has its own tunables, for example:
# ls -l /sys/fs/cgroup/cpuacct/
total 0
-rw-r--r-- 1 root root 0 Nov 26 12:48 cgroup.clone_children
--w--w---- 1 root root 0 Nov 26 12:48 cgroup.event_control
-rw-r--r-- 1 root root 0 Nov 26 12:48 cgroup.procs
-r--r--r-- 1 root root 0 Nov 26 12:48 cgroup.sane_behavior
-r--r--r-- 1 root root 0 Nov 26 12:48 cpuacct.stat
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpuacct.usage
-r--r--r-- 1 root root 0 Nov 26 12:48 cpuacct.usage_percpu
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpu.cfs_period_us
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpu.cfs_quota_us
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpu.rt_period_us
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpu.rt_runtime_us
-rw-r--r-- 1 root root 0 Nov 26 12:48 cpu.shares
-r--r--r-- 1 root root 0 Nov 26 12:48 cpu.stat
-rw-r--r-- 1 root root 0 Nov 26 12:48 notify_on_release
-rw-r--r-- 1 root root 0 Nov 26 12:48 release_agent
-rw-r--r-- 1 root root 0 Nov 26 12:48 tasks
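For instance, assuming the hierarchy shown above, cpuacct.usage reports the total CPU time (in nanoseconds) consumed by all tasks in this hierarchy, while cpu.shares holds the relative weight (1024 by default) that the rest of this article builds on:
# cat /sys/fs/cgroup/cpuacct/cpuacct.usage
# cat /sys/fs/cgroup/cpuacct/cpu.shares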
[Figure: Resource controllers in the Linux kernel]
Understanding slices
By default, systemd automatically creates a hierarchy of slice, scope and service units to provide a unified structure for the cgroup tree. Services, scopes and slices can be created manually by the system administrator or dynamically by programs. By default, the operating system defines a number of built-in services that are necessary to run the system. Also, there are four slices created by default:
- -.slice — the root slice;
- system.slice — the default place for all system services;
- user.slice — the default place for all user sessions;
- machine.slice — the default place for all virtual machines and Linux containers.
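You can inspect this hierarchy of slices, scopes and services on your own system with systemd-cgls, which prints the cgroup tree:
# systemd-cgls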
How to limit CPU using slices?
Let us take an example of CPUShares to limit CPU resources. Assume we assign the following CPUShares values to the slices below:
system.slice -> 1024
user.slice -> 256
machine.slice -> 2048
What do these values mean?
Individually they mean nothing; instead, these values are used as a comparison factor between all the slices. If we assume that the total CPU availability is 100%, then user.slice will get ~7%, system.slice will get 4 times the allocation of user.slice, i.e. ~30%, and machine.slice will get twice the allocation of system.slice, which will be around ~60% of the available CPU resources.
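As a sketch, these example values could be applied at runtime with systemctl set-property (CPUShares= is the legacy property used throughout this article; newer systemd versions use CPUWeight= instead):
# systemctl set-property system.slice CPUShares=1024
# systemctl set-property user.slice CPUShares=256
# systemctl set-property machine.slice CPUShares=2048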
Can I limit CPU of multiple services in system.slice?
This is a valid question. Assume I created three services inside system.slice with the CPUShares values defined below:
service1 -> 1024
service2 -> 256
service3 -> 512
If we sum these up, the total becomes larger than the 1024 that was assigned to system.slice in the above example. Again, these values are only meant for comparison and mean nothing in absolute terms. Here service1 will get the maximum amount of the available resources: if 100% of the resources are available for system.slice, then service1 will get ~57%, service2 will get ~14% and service3 will get ~29% of the available CPU.
This is how cgroup settings relate at the top level between the different slices, and within a slice between the different services.
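The same mechanism works one level down. As a sketch, the per-service values from this example could be applied at runtime (service1, service2 and service3 being the hypothetical services from the example above), or through a CPUShares= line in each unit file as demonstrated later in this article:
# systemctl set-property service1.service CPUShares=1024
# systemctl set-property service2.service CPUShares=256
# systemctl set-property service3.service CPUShares=512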
How to create a custom slice?
- The name of a slice unit corresponds to its path in the hierarchy.
- A child slice inherits the settings of its parent slice.
- The dash ("-") character acts as a separator of the path components.
For example, if the name of a slice looks as follows:
parent-name.slice
it means that the slice parent-name.slice is a subslice of parent.slice. This slice can have its own subslice named parent-name-name2.slice, and so on.
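As a sketch, using the parent-name.slice name from the example above, a custom slice is just a unit file with a [Slice] section (the CPUShares value here is an arbitrary illustration):
# cat /etc/systemd/system/parent-name.slice
[Unit]
Description=Example custom subslice of parent.slice
[Slice]
CPUShares=512
A service can then be placed into this slice by adding Slice=parent-name.slice to its [Service] section.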
Test CPU resource allocation using practical examples
Now we will create two systemd unit files, namely stress1.service and stress2.service, to test whether we are able to limit CPU. These services will utilise all the available CPU on my system.
CPUShares
Using these systemd unit files I will put some CPU load on the system.slice:
# cat /etc/systemd/system/stress1.service
[Unit]
Description=Put some stress
[Service]
Type=simple
ExecStart=/usr/bin/dd if=/dev/zero of=/dev/null
This is my second unit file with the same content to stress the CPU:
# cat /etc/systemd/system/stress2.service
[Unit]
Description=Put some stress
[Service]
Type=simple
ExecStart=/usr/bin/dd if=/dev/zero of=/dev/null
Start these services
# systemctl daemon-reload
# systemctl start stress1
# systemctl start stress2
Now validate the CPU usage using the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1994 root 20 0 107992 608 516 R 49.7 0.0 0:03.11 dd
2001 root 20 0 107992 612 516 R 49.7 0.0 0:02.21 dd
As you see, I have two processes trying to utilise the available CPU. Since both are in the system slice, they share the available resources equally, so both processes get ~50% of the CPU as expected.
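You can also confirm that both dd processes really live under the system slice by checking their cgroup membership; the PID below is taken from the top output above:
# cat /proc/1994/cgroup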
Now let us try to add a new process in the user.slice using a while loop in the background:
# while true; do true; done &
Next check the CPU usage. As expected, the available CPU is now equally divided among the 3 processes; there is no distinction between user.slice and system.slice:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1983 root 20 0 116220 1404 152 R 32.9 0.0 1:53.28 bash
2193 root 20 0 107992 608 516 R 32.9 0.0 0:07.59 dd
2200 root 20 0 107992 612 516 R 32.9 0.0 0:07.13 dd
Now let us enable slicing by setting the below values in /etc/systemd/system.conf:
DefaultCPUAccounting=yes
DefaultBlockIOAccounting=yes
DefaultMemoryAccounting=yes
Reboot the node to activate the changes
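Alternatively, accounting can be switched on per unit without a reboot, assuming your systemd version supports setting these properties at runtime:
# systemctl set-property stress1.service CPUAccounting=yes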
Once the system is back up, we will again start our stress1 and stress2 services along with a while loop in the bash shell:
# systemctl start stress1
# systemctl start stress2
# while true; do true; done &
Now validate the CPU usage using the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2132 root 20 0 116220 1520 392 R 49.3 0.0 1:16.47 bash
1994 root 20 0 107992 608 516 R 24.8 0.0 2:30.40 dd
2001 root 20 0 107992 612 516 R 24.8 0.0 2:29.50 dd
As you see, our slicing has now become effective. The user slice is able to claim 50% of the CPU, while the system slice's 50% is divided at ~25% for each stress service.
Let us now distribute the CPU further between our systemd unit files using CPUShares.
# cat /etc/systemd/system/stress2.service
[Unit]
Description=Put some stress
[Service]
CPUShares=1024
Type=simple
ExecStart=/usr/bin/dd if=/dev/zero of=/dev/null
# cat /etc/systemd/system/stress1.service
[Unit]
Description=Put some stress
[Service]
CPUShares=512
Type=simple
ExecStart=/usr/bin/dd if=/dev/zero of=/dev/null
Now in the above unit files I have given priority to stress2.service, so it will be allowed double the resources allocated to stress1.service.
Next restart the services
# systemctl daemon-reload
# systemctl restart stress1
# systemctl restart stress2
Validate the top output
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2132 root 20 0 116220 1520 392 R 49.7 0.0 2:43.11 bash
2414 root 20 0 107992 612 516 R 33.1 0.0 0:04.85 dd
2421 root 20 0 107992 608 516 R 16.6 0.0 0:01.95 dd
So as expected, out of the available 50% CPU resources for system.slice, stress2 gets double the CPU allocated to the stress1 service.
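To check the numbers: within system.slice, stress2 gets 1024/(1024+512), i.e. ~67% of the slice's ~50% share, which is ~33% of the total CPU, while stress1 gets 512/1536, i.e. ~17% of the total, matching the top output above.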
Keep in mind that CPUShares only take effect when there is contention for CPU: if nothing is consuming CPU in user.slice, then system.slice will be allowed to use up to 100% of the available CPU resources.
Monitor CPU resource usage per slice
systemd-cgtop shows the top control groups of the local Linux control group hierarchy, ordered by their CPU, memory, or disk I/O load. The display is refreshed in regular intervals (by default every 1s), similar in style to the top command.
Resource usage is only accounted for control groups in the relevant
hierarchy, i.e. CPU usage is only accounted for control groups in
the “cpuacct” hierarchy, memory usage only for those in “memory”
and disk I/O usage for those in “blkio”.
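To start it, simply run the command without any options:
# systemd-cgtop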
Path Tasks %CPU Memory Input/s Output/s
/ 56 100.0 309.0M - -
/system.slice - 97.5 277.4M - -
/system.slice/stress3.service 1 59.9 104.0K - -
/system.slice/stress1.service 1 29.9 104.0K - -
/system.slice/stress2.service 1 7.5 108.0K - -
/user.slice - 1.7 10.7M - -
/user.slice/user-0.slice - 1.7 10.4M - -
/user.slice/user-0.slice/session-7.scope 3 1.7 4.6M - -
/system.slice/pacemaker.service 7 0.0 41.6M - -
/system.slice/pcsd.service 1 0.0 46.8M - -
/system.slice/fail2ban.service 1 0.0 9.0M - -
/system.slice/dhcpd.service 1 0.0 4.4M - -
/system.slice/tuned.service 1 0.0 11.8M - -
/system.slice/NetworkManager.service 3 0.0 11.3M - -
/system.slice/httpd.service 6 0.0 4.6M - -
/system.slice/abrt-oops.service 1 0.0 1.4M - -
/system.slice/rsyslog.service 1 0.0 1.5M - -
/system.slice/rngd.service 1 0.0 176.0K - -
/system.slice/ModemManager.service 1 - 3.6M - -
/system.slice/NetworkManager-dispatcher.service 1 - 944.0K - -
Lastly, I hope this article on understanding cgroups and slices, with examples to limit CPU resources on Linux, was helpful. Let me know your suggestions and feedback using the comment section.

