From Pod Security Policies to Pod Security Standards – a Migration Guide

Pod Security Policies were removed in Kubernetes v1.25 — learn how to migrate from Pod Security Policies to Pod Security Standards

In Kubernetes version 1.21, Pod Security Policies (PSP) were officially deprecated and replaced with Pod Security Admission (PSA). PSA implements Pod Security Standards (PSS), a set of policies describing various security-related characteristics of the workloads in a Kubernetes cluster. With version 1.25, PSA became a stable feature and PSP was completely removed. In this blog, we will discuss PSP-to-PSA migration strategies, offer guidance to help transition from Pod Security Policies to Pod Security Standards, and point out potential migration restrictions and limitations.

Background

In Kubernetes, admission controllers are a crucial security component: they intercept API server requests and apply specific policies to validate, mutate, or reject them. Pod Security Policy is a Kubernetes feature that enables administrators to define security constraints for the creation and deployment of pods, such as restricting privileged access and sensitive host path mounts. However, PSPs were deprecated as of Kubernetes v1.21 in favor of the newer Pod Security Standards, which provide similar functionality with simpler controls.

Pod Security Standards can be used to define security policies at three levels (privileged, baseline, and restricted) for pods at a cluster-wide or namespace level. There are two approaches a cluster administrator may take to enforce Pod Security Standards: using the built-in Pod Security Admission Controller or relying on third-party alternatives. These third-party alternatives validate pod creation requests against the defined policies to ensure that only pods that meet the specified security requirements are deployed. As for Pod Security Admission, it is a built-in validating admission controller applying the policy specified by the cluster admin. The cluster admin can choose to assign one of the three levels to different namespaces, providing limited flexibility. For example, the kube-system namespace can operate at the privileged level, whereas the production app namespace can operate at the restricted level. 
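For reference, the per-namespace assignment is implemented as labels on the Namespace object. The manifest below is a minimal sketch of a namespace opted into the restricted level (the namespace name is hypothetical):

apiVersion: v1
kind: Namespace
metadata:
  name: production-app                          # hypothetical namespace
  labels:
    # Reject pods that violate the restricted profile at admission time
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also return warnings to clients that submit violating pods
    pod-security.kubernetes.io/warn: restricted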

Wiz Research investigated hundreds of cloud environments to understand and quantify the usage of PSPs, PSA, and external admission controllers in clusters. To start with, we have calculated the version distribution numbers across all customers, cluster flavors, and cloud environments:

According to the pie chart, the vast majority of environments are capable of using both PSPs and PSSs (76%). The relatively minimal adoption of v1.25 (2%) suggests that now is the optimal time to migrate the policies. 

Furthermore, we looked closer at the adoption policies on a per-version basis:

The numbers above show that PSP utilization increases with every version. However, PSA adoption does not rise symmetrically. There are two non-exclusive possible explanations: first – users migrate from PSP to external admission controllers (we see some evidence of that); second – users postpone or delay PSS adoption due to its complexity. The following guide attempts to prevent the latter. In fact, the low adoption of version 1.25 and above indicates that there is still time to perform a proper migration. 

Migration scenarios 

When it comes to applying PSA, there are four scenarios in which users can find themselves: 

  1. Migrating brand-new workloads directly to PSA. 

  2. Migrating existing workloads that are not yet under any policy to PSA. 

  3. Migrating existing workloads with simple PSPs to PSA. 

  4. Migrating existing workloads with elaborate PSPs to an external admission controller. 

We discuss scenarios (1) and (2) in the “Onboarding of new and policy-free workloads” section, and scenario (3) in the “Migration of existing workloads” section. The fourth scenario pertains to customers with a complicated PSP policy requiring more flexibility than PSS can offer. In this case, our recommendation is to use an external admission controller providing complex functionality, such as Wiz Admission Controller.

However, it is worth noting that you can always refer to the official Kubernetes migration guide, which describes the steps at a command-by-command level. Here we attempt to simplify and outline the overall process flow, provide additional recommendations for Wiz customers, describe the operational restrictions of PSA in managed clusters, and warn Kubernetes practitioners about potential hurdles in the process.

Onboarding of new and policy-free workloads 

Whether you need to stage a brand-new cluster, add a new workload to an existing cluster, or migrate existing clusters or namespaces to PSA, this section will guide you through the process. When applying PSS to a new or existing PSP-free workload, we can use the following command:

$ kubectl label ns <namespace name> pod-security.kubernetes.io/enforce=<level> --dry-run=server

Note the --dry-run=server flag: with it, the request goes through the full server-side processing, including authentication, authorization, and admission checks, but no changes are persisted. If the PSS level is suitable for the namespace workloads, there will be no warnings in the output. Otherwise, kubectl will helpfully print a list of warnings detailing the specific problems:

$ kubectl label ns default pod-security.kubernetes.io/enforce=restricted --dry-run=server 

Warning: existing pods in namespace "default" violate the new PodSecurity 
enforce level "restricted:latest" 

Warning: andy-dufresne: host namespaces, privileged, allowPrivilegeEscalation 
!= false, unrestricted capabilities, runAsNonRoot != true, seccompProfile 

namespace/default labeled (server dry run)

In this example, the pod andy-dufresne violates multiple restricted-profile checks and consequently blocks a clean application of the restricted policy. At this point, the cluster admin must choose to either modify the workload (a sketch of a compliant pod spec follows), adjust the policy level to baseline or privileged, or ignore this namespace altogether (which is not recommended). 
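If you opt to modify the workload, the restricted profile largely comes down to a handful of securityContext fields. The manifest below is a minimal sketch of a pod that passes the restricted checks (names are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: andy-dufresne-fixed                     # illustrative name
spec:
  securityContext:
    runAsNonRoot: true                          # required by restricted
    runAsUser: 1000                             # any non-zero UID
    seccompProfile:
      type: RuntimeDefault                      # RuntimeDefault or Localhost
  containers:
  - name: app
    image: alpine:latest
    command: ["sleep", "3600"]
    securityContext:
      allowPrivilegeEscalation: false           # must be false
      capabilities:
        drop: ["ALL"]                           # drop all capabilities
  # No hostNetwork/hostPID/hostIPC, no privileged flag, no hostPath volumes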

Finally, after the necessary changes, you can re-run the above command without the --dry-run flag and then verify the successful policy application with the following command:

$ kubectl describe ns default | grep pod-security

pod-security.kubernetes.io/enforce=restricted

Migration of existing workloads

Migrating existing workloads that actively use PSP requires more effort than applying PSA from scratch. There are several issues that need to be avoided when performing such a migration, including irrevocable breakage of running workloads, service disruptions, and the failure to apply a policy to a workload. This migration should therefore be carried out in two stages:

  • Non-enforcing application

Use the audit or warn modes first:  

$ kubectl label --overwrite ns default pod-security.kubernetes.io/warn=restricted 

Once the system has processed the command, you can observe this output when trying to spin up a new pod that violates the policy:

$ cat <<EOF | kubectl apply -f - 
> apiVersion: v1 
> kind: Pod 
> metadata: 
>   name: privpod 
> spec: 
>   containers: 
>   - image: alpine:latest 
>     command: 
>       - "sleep" 
>       - "3600" 
>     imagePullPolicy: IfNotPresent 
>     name: privpod 
>     securityContext: 
>       capabilities: 
>         add: ["NET_ADMIN", "SYS_ADMIN"] 
>       runAsUser: 0 
>   restartPolicy: Never 
>   hostIPC: true 
>   hostNetwork: true 
>   hostPID: true 
> EOF 
Warning: would violate PodSecurity "restricted:latest": host namespaces 
(hostNetwork=true, hostPID=true, hostIPC=true), allowPrivilegeEscalation != false 
(container "privpod" must set securityContext.allowPrivilegeEscalation=false), 
unrestricted capabilities (container "privpod" must set 
securityContext.capabilities.drop=["ALL"]; container "privpod" must not include 
"NET_ADMIN", "SYS_ADMIN" in securityContext.capabilities.add), runAsNonRoot != true 
(pod or container "privpod" must set securityContext.runAsNonRoot=true), runAsUser=0 
(container "privpod" must not set runAsUser=0), seccompProfile (pod or container "privpod" must set securityContext.seccompProfile.type to "RuntimeDefault" or 
"Localhost") 
pod/privpod created 

Several things to note: 

  • Despite the warning, the pod was successfully spun up. 

  • There are no warnings on the existing workloads that violate the warn policy.

Because of the above, we recommend applying the monitored policy in both warn and audit modes to gain additional means of observing violations.
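For example, both non-enforcing modes can be applied to the same namespace in a single command (shown here for the default namespace; audit mode additionally records violations in the cluster audit log):

$ kubectl label --overwrite ns default \
    pod-security.kubernetes.io/warn=restricted \
    pod-security.kubernetes.io/audit=restricted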

  • Enforcing application

Once no warnings appear, the namespace is ready for enforcement. The same label command from the previous section will suffice; note the --overwrite flag, which is required to update an existing level or mode: 

$ kubectl label --overwrite ns default pod-security.kubernetes.io/enforce=baseline

Because PSA and PSP are separate features, cluster operators are encouraged to leave the PSP active until the PSA is enabled in the enforcing mode. This is only possible in clusters with version 1.24 and below. To avoid potential downgrades, it is encouraged to perform the migration before upgrading clusters to v1.25.
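Once enforcement is verified and the cluster is still on v1.24 or below, the leftover PSPs (and the RBAC bindings that grant use on them) can be removed; <psp-name> below is a placeholder:

$ kubectl get podsecuritypolicies 
$ kubectl delete podsecuritypolicy <psp-name>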

Treatment of problematic workloads

The most difficult situation arises when a workload must run with privileges that violate the baseline or restricted profiles. The following options are available to the cluster admin: 

  1. If only a small subset of pods in the namespace requires special privileges, consider splitting the namespace so that the problematic workload does not block the broader migration. 

  2. Apply exemptions to problematic workloads. You can exempt workloads created by a specific user or with a specific RuntimeClassName. You can even exempt an entire namespace, although the latter option is equivalent to not applying PSA at all (see the configuration sketch after this list). 

  3. Even if nothing can be done within a specific namespace, we recommend setting the PSA level to privileged rather than omitting it entirely. This makes it explicit that the decision was deliberate.
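For reference, exemptions live in the PodSecurity admission plugin configuration passed to the API server via --admission-control-config-file, so this option is only practical where you control the control plane (self-managed clusters); managed offerings typically do not expose it. A minimal sketch with hypothetical values:

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    # pod-security.admission.config.k8s.io/v1 on v1.25+, v1beta1 on v1.23–v1.24
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"                # cluster-wide default for unlabeled namespaces
      enforce-version: "latest"
    exemptions:
      usernames: ["system:serviceaccount:ops:privileged-deployer"]   # hypothetical account
      runtimeClasses: []
      namespaces: ["kube-system"]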

Post-deployment steps 

To maintain cluster security hygiene post migration, we recommend the following actions: 

  • Only allow the creation of explicitly labelled namespaces.  

  • Review the permissions to label namespaces and modify the PSS levels. You can run the label command with the --v=8 flag, which shows kubectl's underlying API requests:

$ kubectl label --dry-run=server --overwrite ns default pod-security.kubernetes.io/warn=restricted --v=8 
I0227 15:35:58.424646     630 round_trippers.go:463] GET 
https://XX.XX.XX.XX/api/v1/namespaces/default 
... 
I0227 15:35:58.777590     630 round_trippers.go:463] GET 
https://XX.XX.XX.XX/openapi/v2?timeout=32s 
... 
I0227 15:35:59.151049     630 round_trippers.go:463] PATCH 
https://XX.XX.XX.XX/api/v1/namespaces/default?dryRun=All&fieldManager=kubectl-label 
... 

This means that every Kubernetes user or service account with patch/update permissions on the namespace object can effectively remove the label and relax the policy.  
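A quick way to spot-check who holds that permission is kubectl auth can-i with impersonation (the service account below is hypothetical):

$ kubectl auth can-i patch namespaces --as=system:serviceaccount:default:app-sa 
no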

Restrictions and limitations 

Managed namespaces 

All managed clusters have a default configuration and begin with the minimal set of workloads CSPs deem necessary to install, such as monitoring, logging, and networking infrastructure. These infrastructure workloads typically require above-average privileges and are deployed as a part of kube-system or another namespace exempted from pod security application: 

$ kubectl label ns kube-system pod-security.kubernetes.io/enforce=restricted 
Warning: namespace "kube-system" is exempt from Pod Security, and the policy 
(enforce=restricted:latest) will be ignored 

These namespaces include kube-system and kube-node-lease in AKS and EKS, and gatekeeper-system in AKS.

Image-level settings 

An important thing to remember is that PSA is an admission controller and is thus susceptible to admission-workflow bypasses. For example, the restricted PSS level requires a pod to declare that it runs as a non-root user, e.g., by setting spec.securityContext.runAsNonRoot: true. The admission controller, however, only inspects the pod manifest: if the container image itself is built to run as root and no runAsUser override is provided, the manifest passes the admission check and the violation is only caught by an additional runtime check at container start-up. 
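As an illustration, the sketch below passes the restricted admission checks, yet if the referenced image is built to run as root it will only be rejected by the kubelet when the container starts (the image is hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: runtime-check-demo                      # illustrative name
spec:
  securityContext:
    runAsNonRoot: true                          # admission only sees this declaration
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: registry.example.com/legacy-app:1.0  # hypothetical image whose default user is root
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
    # No runAsUser override: the effective UID comes from the image, so the
    # non-root requirement can only be verified at container start-up.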

Completed workloads 

A privileged workload that has already finished (shown as Completed when listing pods) still triggers violation warnings when the label is applied, as demonstrated below:

$ kubectl run alpine-test --image alpine -n test 

pod/alpine-test created 

$ kubectl get pods -n test 

NAME          READY   STATUS      RESTARTS      AGE 

alpine-test   0/1     Completed   1 (3s ago)    4s 

$ kubectl label namespace test 
pod-security.kubernetes.io/enforce=restricted 

Warning: existing pods in namespace "test" violate the new PodSecurity enforce level 
"restricted:latest" 

Warning: alpine (and 1 other pod): allowPrivilegeEscalation != false, unrestricted 
capabilities, runAsNonRoot != true, seccompProfile 

namespace/test labeled 

Completed pods and Jobs are kept by default in order to report their success or failure status. The correct way to control how long finished workloads are retained is via the TTL-after-finished controller (stable since v1.23); a declarative sketch follows the command below. However, to facilitate the PSS application, a cluster admin can also detect and remove these workloads manually with this command: 

$ kubectl delete pod $(kubectl get pods | grep -Ei "(Completed|CrashLoopBackOff|Terminating)" | awk '{print $1}')
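For Jobs specifically, the retention window can instead be set declaratively; a minimal sketch (name and timing are illustrative):

apiVersion: batch/v1
kind: Job
metadata:
  name: ttl-demo                                # illustrative name
spec:
  ttlSecondsAfterFinished: 300                  # remove the Job and its pods 5 minutes after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: alpine:latest
        command: ["sleep", "10"]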

Takeaways 

  • What is the current state of security policy adoption?  

Data shows that migration from PSP to PSA has been slow. The worst-case scenario is Kubernetes users who currently rely on PSP ending up with no policy at all after upgrading to v1.25. 

  • What can I do about the transition to PSA/PSS? 

There is more than one way to facilitate the migration, but you must start before upgrading to version 1.25. Hopefully, this guide can serve as a starting point.  

  • What should I expect? 

To demonstrate what you should expect when applying PSS to common workloads, we have compiled a table of the default PSS level each one can operate at in a managed environment:

Popular add-on/extension/app | Managed environment | Default PSS level
-----------------------------|---------------------|------------------
Airflow                      | GKE                 | Privileged
Airflow                      | AKS                 | Baseline
ActiveMQ                     | GKE                 | Baseline
Grafana                      | GKE                 | Baseline
Consul                       | GKE                 | Privileged
Consul                       | AKS                 | Baseline
Elasticsearch                | AKS                 | Privileged
Logstash                     | AKS                 | Baseline
Kubecost                     | EKS                 | Baseline

Two patterns emerge: (1) similar applications can have different PSS levels across CSPs, so multi-cloud Kubernetes users should be prepared for the migration process to vary between providers; and (2) none of these applications can operate at the restricted level, which is, after all, rather demanding.

Protecting your environment 

Wiz offers its customers a series of functionalities to aid with the migration process: 

  • A built-in Cloud Configuration Rule to identify workloads running without an assigned PSP at the cluster level. See all the namespaces without an assigned PSP in the image below:

  • Built-in Pod Security Standards frameworks to assess clusters and namespaces without the need for dry runs on each namespace, allowing you to identify the most suitable Pod Security level (baseline or restricted):

Moreover, the following steps are helpful post deployment: 

1. Use Wiz Admission Controller to protect yourself by default with these two rules: 

  1. Kubernetes namespace should have pod security level assigned 

  2. Kubernetes namespace should not have privileged pod security level assigned 

In addition to the Admission Controller hook, these rules also run on the objects retrieved through the API scan to provide a fuller picture. 

2. Find K8s principals that can modify the PSS level on the namespace with this query.
