Commit c2b49fe

Add dedicated seccomp node reference
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
1 parent 56e2fb1 commit c2b49fe

5 files changed

+184
-1
lines changed

content/en/docs/concepts/security/linux-kernel-security-constraints.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -90,7 +90,8 @@ profile to a more permissive profile.
9090
{{</note>}}
9191

9292
To learn how to implement seccomp in Kubernetes, refer to
93-
[Restrict a Container's Syscalls with seccomp](/docs/tutorials/security/seccomp/).
93+
[Restrict a Container's Syscalls with seccomp](/docs/tutorials/security/seccomp/)
94+
or the [Seccomp node reference](/docs/reference/node/seccomp/).
9495

9596
To learn more about seccomp, see
9697
[Seccomp BPF](https://www.kernel.org/doc/html/latest/userspace-api/seccomp_filter.html)
@@ -288,3 +289,4 @@ of support that you need. For instructions, refer to
288289
* [Learn how to use AppArmor](/docs/tutorials/security/apparmor/)
289290
* [Learn how to use seccomp](/docs/tutorials/security/seccomp/)
290291
* [Learn how to use SELinux](/docs/tasks/configure-pod-container/security-context/#assign-selinux-labels-to-a-container)
292+
* [Seccomp Node Reference](/docs/reference/node/seccomp/)

content/en/docs/reference/node/_index.md

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,8 @@ This section contains the following reference topics about nodes:
1515

1616
* [Node `.status` information](/docs/reference/node/node-status/)
1717

18+
* [Seccomp information](/docs/reference/node/seccomp/)
19+
1820
You can also read node reference details from elsewhere in the
1921
Kubernetes documentation, including:
2022

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
1+
---
2+
content_type: reference
3+
title: Seccomp and Kubernetes
4+
weight: 80
5+
---
6+
7+
<!-- overview -->
8+
9+
Seccomp stands for secure computing mode and has been a feature of the Linux
10+
kernel since version 2.6.12. It can be used to sandbox the privileges of a
11+
process, restricting the calls it is able to make from userspace into the
12+
kernel. Kubernetes lets you automatically apply seccomp profiles loaded onto a
13+
{{< glossary_tooltip text="node" term_id="node" >}} to your Pods and containers.
14+
15+
## Seccomp fields
16+
17+
{{< feature-state for_k8s_version="v1.19" state="stable" >}}
18+
19+
There are four ways to specify a seccomp profile for a
20+
{{< glossary_tooltip text="pod" term_id="pod" >}}:
21+
22+
- for the whole Pod using [`spec.securityContext.seccompProfile`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context)
23+
- for a single container using [`spec.containers[*].securityContext.seccompProfile`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context-1)
24+
- for a (restartable / sidecar) init container using [`spec.initContainers[*].securityContext.seccompProfile`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context-1)
25+
- for an [ephemeral container](/docs/concepts/workloads/pods/ephemeral-containers) using [`spec.ephemeralContainers[*].securityContext.seccompProfile`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context-2)
26+
27+
{{% code_sample file="pods/security/seccomp/fields.yaml" %}}
28+
29+
The Pod in the example above runs as `Unconfined`, while the
30+
`ephemeral-container` and `init-container` specifically define
31+
`RuntimeDefault`. If the ephemeral or init container had not set the
32+
`securityContext.seccompProfile` field explicitly, then the value would be
33+
inherited from the Pod. The same applies to the container, which runs a
34+
`Localhost` profile `my-profile.json`.
35+
36+
Generally speaking, fields from (ephemeral) containers have a higher priority
37+
than the Pod level value, while containers which do not set the seccomp field
38+
inherit the profile from the Pod.
39+
40+
{{< note >}}
41+
It is not possible to apply a seccomp profile to a Pod or container running with
42+
`privileged: true` set in the container's `securityContext`. Privileged
43+
containers always run as `Unconfined`.
44+
{{< /note >}}
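
The precedence rules above can be sketched in a few lines of Python. This is an illustrative model only, not the kubelet's implementation: the function name `effective_seccomp_type` and the plain-dict representation of the security contexts are assumptions made for this example, and the case where no level sets a profile is deliberately left unmodelled.

```python
from typing import Optional


def effective_seccomp_type(pod_sc: Optional[dict],
                           container_sc: Optional[dict]) -> Optional[str]:
    """Return the seccomp profile type a container ends up with.

    Hypothetical helper: both arguments are plain dicts shaped like the
    `securityContext` objects in a Pod manifest.
    """
    pod_sc = pod_sc or {}
    container_sc = container_sc or {}

    # Privileged containers always run as Unconfined.
    if container_sc.get("privileged"):
        return "Unconfined"

    # A container-level seccompProfile takes priority over the Pod-level one.
    for sc in (container_sc, pod_sc):
        profile = sc.get("seccompProfile")
        if profile:
            return profile["type"]

    # Nothing set at either level: this sketch does not model the default.
    return None


pod = {"seccompProfile": {"type": "Unconfined"}}
print(effective_seccomp_type(pod, {"seccompProfile": {"type": "RuntimeDefault"}}))
print(effective_seccomp_type(pod, {}))
```

With the Pod-level value set to `Unconfined`, the first call yields the container's own `RuntimeDefault`, and the second falls back to the Pod-level `Unconfined`.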
45+
46+
The following values are possible for the `seccompProfile.type`:
47+
48+
`Unconfined`
49+
: The workload runs without any seccomp restrictions.
50+
51+
`RuntimeDefault`
52+
: A default seccomp profile defined by the
53+
{{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}
54+
is applied. The default profiles aim to provide a strong set of security
55+
defaults while preserving the functionality of the workload. It is possible that
56+
the default profiles differ between container runtimes and their release
57+
versions, for example when comparing those from
58+
{{< glossary_tooltip text="CRI-O" term_id="cri-o" >}} and
59+
{{< glossary_tooltip text="containerd" term_id="containerd" >}}.
60+
61+
`Localhost`
62+
: The `localhostProfile` will be applied, which has to be available on the node
63+
disk (on Linux it's `/var/lib/kubelet/seccomp`). The availability of the seccomp
64+
profile is verified by the
65+
{{< glossary_tooltip text="container runtime" term_id="container-runtime" >}}
66+
on container creation. If the profile does not exist, then the container
67+
creation will fail with a `CreateContainerError`.
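
As a rough illustration of how a `localhostProfile` value relates to a file on the node, the following Python sketch joins the value onto the directory named above. The helper names are hypothetical, and the root directory is an assumption taken from the text; a node may be configured with a different location.

```python
import os
from pathlib import PurePosixPath

# Directory named in the text above; treat it as an assumption for your nodes.
SECCOMP_ROOT = "/var/lib/kubelet/seccomp"


def localhost_profile_path(localhost_profile: str, root: str = SECCOMP_ROOT) -> str:
    """Map a `localhostProfile` value to the file expected on the node."""
    return str(PurePosixPath(root) / localhost_profile)


def profile_exists(localhost_profile: str, root: str = SECCOMP_ROOT) -> bool:
    """Check on the local filesystem whether the profile file is present."""
    return os.path.isfile(localhost_profile_path(localhost_profile, root))


print(localhost_profile_path("my-profile.json"))
```

A check like `profile_exists` only makes sense when run on the node itself; the authoritative check is still the one the container runtime performs at container creation.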
68+
69+
### `Localhost` profiles
70+
71+
Seccomp profiles are JSON files following the scheme defined by the
72+
[OCI runtime specification](https://github.com/opencontainers/runtime-spec/blob/f329913/config-linux.md#seccomp).
73+
A profile defines actions based on matched syscalls, and can also match on
74+
specific values passed as arguments to those syscalls. For example:
75+
76+
```json
77+
{
78+
  "defaultAction": "SCMP_ACT_ERRNO",
79+
  "defaultErrnoRet": 38,
80+
  "syscalls": [
81+
    {
82+
      "names": [
83+
        "adjtimex",
84+
        "alarm",
85+
        "bind",
86+
        "waitid",
87+
        "waitpid",
88+
        "write",
89+
        "writev"
90+
      ],
91+
      "action": "SCMP_ACT_ALLOW"
92+
    }
93+
  ]
94+
}
95+
```
96+
97+
The `defaultAction` in the profile above is defined as `SCMP_ACT_ERRNO` and
98+
applies as the fallback for any syscall not matched in `syscalls`. The error is
99+
defined as code `38` via the `defaultErrnoRet` field.
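
To make the fallback behaviour concrete, here is a small Python model of the profile above. It is a simplification for illustration, not how the kernel or libseccomp evaluates filters: the function `action_for` is hypothetical, it ignores `args`, and it simply takes the first rule that lists the syscall name.

```python
PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "defaultErrnoRet": 38,
    "syscalls": [
        {
            "names": ["adjtimex", "alarm", "bind", "waitid",
                      "waitpid", "write", "writev"],
            "action": "SCMP_ACT_ALLOW",
        }
    ],
}


def action_for(profile: dict, syscall: str) -> tuple:
    """Return (action, errno) for a syscall name under this simplified model."""
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            # A rule that lists the syscall decides the action.
            return rule["action"], rule.get("errnoRet")
    # No rule matched: fall back to the profile-wide default.
    return profile["defaultAction"], profile.get("defaultErrnoRet")


print(action_for(PROFILE, "write"))
print(action_for(PROFILE, "mount"))
```

Under this model `write` is allowed because it is listed, while `mount` is not listed and therefore gets `SCMP_ACT_ERRNO` with error code `38`.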
100+
101+
The following actions are generally possible:
102+
103+
`SCMP_ACT_ERRNO`
104+
: Return the specified error code.
105+
106+
`SCMP_ACT_ALLOW`
107+
: Allow the syscall to be executed.
108+
109+
`SCMP_ACT_KILL_PROCESS`
110+
: Kill the process.
111+
112+
`SCMP_ACT_KILL_THREAD` and `SCMP_ACT_KILL`
113+
: Kill only the thread.
114+
115+
`SCMP_ACT_TRAP`
116+
: Throw a `SIGSYS` signal.
117+
118+
`SCMP_ACT_NOTIFY` and `SECCOMP_RET_USER_NOTIF`
119+
: Notify the user space.
120+
121+
`SCMP_ACT_TRACE`
122+
: Notify a tracing process with the specified value.
123+
124+
`SCMP_ACT_LOG`
125+
: Allow the syscall to be executed after the action has been logged to syslog or
126+
auditd.
127+
128+
Some actions like `SCMP_ACT_NOTIFY` or `SECCOMP_RET_USER_NOTIF` may not be
129+
supported depending on the container runtime, OCI runtime or Linux kernel
130+
version being used. There may also be further limitations, for example that
131+
`SCMP_ACT_NOTIFY` cannot be used as `defaultAction` or for certain syscalls like
132+
`write`. All those limitations are defined by either the OCI runtime
133+
([runc](https://github.com/opencontainers/runc),
134+
[crun](https://github.com/containers/crun)) or
135+
[libseccomp](https://github.com/seccomp/libseccomp).
136+
137+
The `syscalls` JSON array contains a list of objects referencing syscalls by
138+
their respective `names`. For example, the action `SCMP_ACT_ALLOW` can be used
139+
to create an allowlist of syscalls as outlined in the example above. It
140+
would also be possible to define another list using the action `SCMP_ACT_ERRNO`
141+
but a different return (`errnoRet`) value.
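
A profile with such a second list might be built and serialized as in the following Python sketch. The choice of syscalls and the value `1` for `errnoRet` are hypothetical, picked only to show the shape of the JSON.

```python
import json

profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "defaultErrnoRet": 38,
    "syscalls": [
        # Allowlist, as in the example above (shortened).
        {"names": ["write", "writev"], "action": "SCMP_ACT_ALLOW"},
        # Hypothetical second list: fail these with errno 1 instead of 38.
        {"names": ["bind"], "action": "SCMP_ACT_ERRNO", "errnoRet": 1},
    ],
}

# Serialize to the JSON form that would be stored as a profile file.
text = json.dumps(profile, indent=2)
print(text)
```

Whether a given combination of rules is accepted depends on the OCI runtime and libseccomp version in use, so treat the output as a starting point to test on a real node.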
142+
143+
It is also possible to specify the arguments (`args`) passed to certain
144+
syscalls. More information about those advanced use cases can be found in the
145+
[OCI runtime spec](https://github.com/opencontainers/runtime-spec/blob/f329913/config-linux.md#seccomp)
146+
and the [Seccomp Linux kernel documentation](https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt).
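
As a sketch of what an `args` condition expresses, the Python model below checks a rule's conditions against a list of syscall argument values. The field names `index`, `value`, and `op` and the `SCMP_CMP_*` operator names follow the OCI runtime spec; the rule itself, the helper `args_match`, and the choice to require every condition to hold are assumptions for illustration, and real OCI runtimes may combine conditions differently.

```python
import operator

# Subset of the comparison operators named in the OCI runtime spec.
OPS = {
    "SCMP_CMP_EQ": operator.eq,
    "SCMP_CMP_NE": operator.ne,
    "SCMP_CMP_LT": operator.lt,
    "SCMP_CMP_LE": operator.le,
    "SCMP_CMP_GE": operator.ge,
    "SCMP_CMP_GT": operator.gt,
}

# Hypothetical rule: allow `personality` only when its first argument is 0.
rule = {
    "names": ["personality"],
    "action": "SCMP_ACT_ALLOW",
    "args": [{"index": 0, "value": 0, "op": "SCMP_CMP_EQ"}],
}


def args_match(rule: dict, call_args: list) -> bool:
    """True if every `args` condition holds for the given argument values."""
    return all(
        OPS[cond["op"]](call_args[cond["index"]], cond["value"])
        for cond in rule.get("args", [])
    )


print(args_match(rule, [0]))
print(args_match(rule, [8]))
```

In this model the rule matches a call whose first argument is `0` and does not match one whose first argument is `8`; a non-matching call would then fall through to other rules or to `defaultAction`.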
147+
148+
## Further reading
149+
150+
- [Restrict a Container's Syscalls with seccomp](/docs/tutorials/security/seccomp/)
151+
- [Pod Security Standards](/docs/concepts/security/pod-security-standards/)

content/en/docs/tasks/administer-cluster/securing-a-cluster.md

Lines changed: 1 addition & 0 deletions
@@ -275,3 +275,4 @@ page for more on how to report vulnerabilities.
275275
## What's next
276276

277277
- [Security Checklist](/docs/concepts/security/security-checklist/) for additional information on Kubernetes security guidance.
278+
- [Seccomp Node Reference](/docs/reference/node/seccomp/)
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
apiVersion: v1
2+
kind: Pod
3+
metadata:
4+
  name: pod
5+
spec:
6+
  securityContext:
7+
    seccompProfile:
8+
      type: Unconfined
9+
  ephemeralContainers:
10+
  - name: ephemeral-container
11+
    image: debian
12+
    securityContext:
13+
      seccompProfile:
14+
        type: RuntimeDefault
15+
  initContainers:
16+
  - name: init-container
17+
    image: debian
18+
    securityContext:
19+
      seccompProfile:
20+
        type: RuntimeDefault
21+
  containers:
22+
  - name: container
23+
    image: docker.io/library/debian:stable
24+
    securityContext:
25+
      seccompProfile:
26+
        type: Localhost
27+
        localhostProfile: my-profile.json
