7 GKE Security Best Practices

7 essential best practices that every organization should start with

Google Kubernetes Engine (GKE), Google Cloud’s managed Kubernetes service, runs containerized applications and ships with built-in security features that help protect sensitive data and reduce risk. This blog post explores vital considerations for securing containerized workloads, with a focus on getting the most out of GKE’s native security features. Let’s jump right in.

Security considerations in GKE architecture

Containerization has many benefits: faster deployment, better usage of resources, and increased portability, to name a few. However, it also brings unique security challenges, such as container escapes, runtime threats, image vulnerabilities, and potential lateral movement. Without proper security guardrails, these vulnerabilities can lead to data breaches, service disruptions, and compliance violations.

In cloud-based managed services like GKE, there is a shared responsibility model that divides the security and compliance responsibilities between the cloud provider and the customer. When it comes to Google Kubernetes Engine, Google takes care of securing the infrastructure (think data centers and hypervisor layers) through measures like encryption, network segmentation, and security audits. Customers are in charge of securing their applications and data within Kubernetes clusters on GKE.

Ensuring security is a top priority in GKE because of the inherent challenges involved in managing containerized workloads. As a managed service, GKE automates many aspects of cluster management, such as provisioning, scaling, and monitoring. It also offers several configurable security settings that can be leveraged to implement proper access controls, enforce security and network policies, and ensure secure container image deployments. 

Next, we’ll take an in-depth look at how to configure these security settings to protect your clusters.

GKE security best practices

1. Workload identity and permissions best practices

Workloads deployed in GKE often need to interact with other Google Cloud services, and workload identity lets them do so securely. It replaces less secure options, such as distributing exported service account keys, and also helps protect sensitive cluster metadata from workloads running in the cluster. Here are some actionable tips:

  • Leverage Workload Identity Federation: In GKE, Workload Identity Federation assigns IAM permissions to your workloads so that they can securely access Google Cloud services such as BigQuery, machine learning APIs, storage, and Compute Engine. In essence, Workload Identity Federation uses Kubernetes service accounts to authenticate to Google Cloud services, links them to a Google Cloud IAM service account, and gives each application in your cluster a distinct identity.

To make the most of Workload Identity Federation, define IAM policies that grant specific permissions to workload identities. This way, your identity access management is centralized and you know that your applications have the correct level of permissions. 

You can use conditional role bindings to further refine access permissions. For example, the IAM policy below grants the Security Reviewer role only until noon UTC on December 31, 2024:

{
  "bindings": [
    {
      "members": [
        "user:project-owner@example.com"
      ],
      "role": "roles/owner"
    },
    {
      "members": [
        "user:travis@example.com"
      ],
      "role": "roles/iam.securityReviewer",
      "condition": {
          "title": "Expires_2024",
          "description": "Expires at noon on 2024-12-31",
          "expression":
            "request.time < timestamp('2024-12-31T12:00:00Z')"
      }
    }
  ],
  "etag": "BwWPmjvelug=",
  "version": 3
}

  • Create service accounts that comply with the principle of least privilege: As we’ve seen, GKE uses Google Cloud service accounts to grant permissions for managing cluster operations and resources within clusters, and for accessing other Google Cloud resources. This is why it’s crucial that service accounts associated with GKE follow the principle of least privilege.

Use Kubernetes role-based access control (RBAC) to limit what service accounts can do, either cluster-wide or within a single Kubernetes namespace, by creating RBAC roles and binding those roles to service accounts and other authenticated entities. You should also ensure that legacy attribute-based access control (ABAC) is disabled so that access control is managed only through Kubernetes RBAC and Google Cloud IAM.
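
As a rough sketch of how these two tips fit together, the manifests below annotate a Kubernetes service account so that it can act as an IAM service account, then grant it read-only access to ConfigMaps in a single namespace. The names app-ksa, app-namespace, and configmap-reader are hypothetical, GSA_NAME and PROJECT_ID are placeholders, and the annotation only takes effect if the IAM service account also allows the Kubernetes service account to impersonate it (the roles/iam.workloadIdentityUser role).

```yaml
# Kubernetes service account linked to an IAM service account
# through Workload Identity Federation. All names are illustrative.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa
  namespace: app-namespace
  annotations:
    iam.gke.io/gcp-service-account: GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
---
# Namespaced role: read-only access to ConfigMaps and nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: app-namespace
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
---
# Bind the role to the service account, in this namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-ksa-configmap-reader
  namespace: app-namespace
subjects:
- kind: ServiceAccount
  name: app-ksa
  namespace: app-namespace
roleRef:
  kind: Role
  name: configmap-reader
  apiGroup: rbac.authorization.k8s.io
```

Pods opt in by setting serviceAccountName: app-ksa in their spec. Using a Role rather than a ClusterRole keeps the grant scoped to one namespace, which is usually the safer starting point.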

2. Cluster security best practices

Securing a GKE cluster is an essential means of protecting the integrity and confidentiality of the containerized workloads hosted in it. Follow these steps to fortify your clusters:

  • Stay on top of Kubernetes upgrades: Consistently update the Kubernetes version in GKE to keep it secure and take advantage of new features and improvements. As a managed service, GKE upgrades the control plane automatically, which makes it easier to stay current on security patches and reduces the risk of known vulnerabilities being exploited. However, in a Standard GKE cluster where node auto-upgrade is turned off, you need a process in place to manually upgrade the worker nodes when new versions or security patches are released.

  • Restrict network access: By default, GKE exposes the control plane, which hosts the API server, controller manager, etcd database, and scheduler, through a public IP address. Here's a sample command that creates an Autopilot cluster and restricts access to the control plane to a set of authorized networks:

gcloud container clusters create-auto CLUSTER_NAME \
    --enable-master-authorized-networks \
    --master-authorized-networks CIDR1,CIDR2,...

  • Use Shielded GKE Nodes: GKE uses Shielded VMs that are hardened to protect against various attack vectors, including rootkits and bootkits. Without Shielded VMs, attackers can exploit pod vulnerabilities to gain access to host nodes and cluster secrets. Shielded GKE Nodes ensure that the node's firmware and boot process are secure, safeguarding against tampering and unauthorized changes. By implementing Shielded GKE Nodes, you improve the safety of your containerized workloads and defend against advanced threats that target the underlying infrastructure. To create a GKE cluster with Shielded VMs, use this command:

gcloud container clusters create CLUSTER_NAME \
    --enable-shielded-nodes

  • Understand Autopilot clusters: Environments with stringent security requirements can use GKE Autopilot mode to let Google optimize the cluster configuration and manage scalability and security settings out of the box. Security best practices for Autopilot clusters, including automatic node upgrades, Workload Identity Federation, Shielded GKE Nodes, secure boot, and logging, are enabled by default and cannot be modified by users. Autopilot clusters also implement strict network, security, and firewall policies so that only necessary communication is allowed within a cluster.

3. Secrets management best practices

Protecting sensitive information like API keys, passwords, and certificates used by applications deployed in Google Kubernetes Engine is a crucial aspect of cluster security. If compromised, these secrets can allow unauthorized access to clusters and applications hosted in them. 

To mitigate these risks, Google offers Cloud Key Management Service (Cloud KMS), a managed encryption service that enables organizations to encrypt secrets at rest using customer-managed encryption keys (CMEK).

By default, your data at rest is encrypted by GKE without any manual intervention. Additionally, you can use a key stored in KMS to encrypt data at the application layer. This process involves creating a Cloud KMS key and then providing a GKE service account access to it so the cluster can encrypt all secrets using the key. The following command enables application-layer encryption in a GKE cluster using a key created in KMS:

gcloud container clusters update CLUSTER_NAME \
    --region=COMPUTE_REGION \
    --database-encryption-key=projects/KEY_PROJECT_ID/locations/LOCATION/keyRings/RING_NAME/cryptoKeys/KEY_NAME \
    --project=CLUSTER_PROJECT_ID

In addition to native Cloud KMS, you can also use third-party secrets management tools such as HashiCorp Vault to protect secrets in GKE clusters.

4. Workload isolation best practices

Ensuring that each workload is isolated from others is an essential best practice for limiting your attack surface. GKE offers various mechanisms for workload isolation:

  • Leverage GKE Sandbox: You can isolate untrusted workloads in a private cluster using GKE Sandbox. GKE Sandbox uses open-source gVisor technology on the backend and can protect against common risks, such as container escapes and privilege escalation attacks.

Use this command to create a new node pool in a GKE cluster with GKE Sandbox enabled:

gcloud container node-pools create NODE_POOL_NAME \
  --cluster=CLUSTER_NAME \
  --machine-type=MACHINE_TYPE \
  --image-type=cos_containerd \
  --sandbox type=gvisor

  • Restrict container process privileges: In addition to using GKE Autopilot to implement built-in security guardrails, you can take advantage of features like security contexts and Docker AppArmor security policies to limit the privileges of processes running in containers. With security contexts, you can control aspects including the ability to escalate privileges, Linux capabilities, the “run as” user, and group access. The Docker AppArmor security policies enabled by default in GKE clusters ensure that containers are not able to directly manipulate sensitive files or file systems in the cluster.
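
To make both isolation mechanisms concrete, here is a minimal sketch of two Pod manifests. The first asks Kubernetes to run the Pod under the gvisor RuntimeClass, which is how a workload lands on a GKE Sandbox node pool. The second runs on regular nodes with a restrictive security context. The Pod names, the user and group IDs, and IMAGE_URI are placeholders chosen for illustration, and your image has to be able to run as a non-root user with a read-only root file system for the second manifest to work.

```yaml
# Untrusted workload: run it in GKE Sandbox via the gvisor RuntimeClass.
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: IMAGE_URI
---
# Workload on regular nodes: limit what its processes are allowed to do.
apiVersion: v1
kind: Pod
metadata:
  name: restricted-app
spec:
  securityContext:
    runAsNonRoot: true        # refuse to start containers as root
    runAsUser: 1000           # the "run as" user
    runAsGroup: 3000          # primary group for container processes
  containers:
  - name: app
    image: IMAGE_URI
    securityContext:
      allowPrivilegeEscalation: false   # block setuid-style escalation
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]                   # drop every Linux capability
```

The two manifests are kept separate on purpose: gVisor does not honor every security context setting, so it is worth testing any combination of the two before relying on it.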

5. Supply chain security best practices

To shift left in your software supply chain, you need to ensure the security and integrity of container images deployed within GKE clusters. The implementation of an image signing and verification process is a critical aspect of this process, and Binary Authorization is the way to achieve it:

  • Enable Binary Authorization: GKE supports image signing and verification through integration with Binary Authorization and Artifact Registry. (You can also use any other container image registry service.) With Binary Authorization enabled, you can attest that a container image is signed with a private key, confirming that it was created using a specified build process. These digital signatures are verified during deployment to ensure that only trusted images are deployed in your cluster. Use the following command to create a GKE cluster with Binary Authorization enabled:

gcloud container clusters create \
    --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE \
    --zone us-central1-a \
    test-cluster
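
Enabling the feature on the cluster is only half of the setup: the project's Binary Authorization policy decides which images are admitted. A minimal sketch of such a policy, assuming a single attestor that already exists, might look like the following. PROJECT_ID and ATTESTOR_NAME are placeholders.

```yaml
# Binary Authorization policy: block any image that lacks an
# attestation from the named attestor, and log the decision.
name: projects/PROJECT_ID/policy
# Let Google-maintained system images through without attestation.
globalPolicyEvaluationMode: ENABLE
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  requireAttestationsBy:
  - projects/PROJECT_ID/attestors/ATTESTOR_NAME
```

A file like this can be loaded with gcloud container binauthz policy import. While you are still rolling out signing, consider starting with the DRYRUN_AUDIT_LOG_ONLY enforcement mode so that unsigned images are logged rather than blocked.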

6. Network security best practices

When it comes to GKE, network security is all about managing the flow of traffic within and outside your cluster. Here are two crucial steps:

  • Customize your network policies: By default, pods within the same cluster can communicate with each other, which is handy for applications that depend on these communications. However, to implement enhanced security, you need to configure network policies to restrict pod communication based on specific criteria.

Network policies use labels to manage the ingress and egress traffic from pods in the cluster. By creating network policies, you get granular control over which pods can send or receive traffic from various sources (like other pods, namespaces, or IP blocks). You can also create implicit deny rules for egress and ingress traffic, ensuring that only explicitly authorized traffic is allowed between pods. 

Here’s a sample network policy that allows ingress traffic only from pods with a specific label:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: sample-policy
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      app: test
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: dev

  • Filter load-balanced communication: When setting up external access for your GKE services with a load balancer, consider filtering incoming traffic for additional security. Filtering can be applied at the node level by kube-proxy: you list the allowed source IP ranges in the Service's loadBalancerSourceRanges field, and only those addresses can reach the external load balancer's public IP.
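
Both steps can be sketched as manifests. The first is a default-deny policy that blocks all ingress and egress for every pod in its namespace, so that traffic flows only where another policy explicitly allows it. The second is a LoadBalancer Service that accepts traffic only from one source range. The resource names, the ports, and the 203.0.113.0/24 range (a documentation-only address block) are illustrative assumptions, and network policies are enforced only if network policy enforcement is enabled on the cluster.

```yaml
# Default deny: selects every pod in the namespace and allows nothing.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# External load balancer reachable only from an allowed source range.
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer
  selector:
    app: test
  ports:
  - port: 443
    targetPort: 8443
  loadBalancerSourceRanges:
  - 203.0.113.0/24
```

Keep in mind that denying all egress also blocks DNS lookups, so a default-deny policy is normally paired with a policy that allows the egress traffic your pods actually need.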

7. Audit logging best practices

GKE’s audit logs provide visibility into user activity and API calls in the cluster, which helps you spot suspicious behavior or possible compromises. Two types of audit logs are worth reviewing on a regular basis: admin activity logs and data access logs.

  • Admin activity logs: Admin activity logs monitor the administrative actions on your clusters, including the creation and deletion of namespaces, deployments, and pods. These logs reveal who performed the actions and the resources involved.

  • Data access logs: Data access logs provide information about how users engage with Kubernetes APIs to access or modify data within clusters. This information includes data about who is reading/writing pod configuration files, secrets, and other cluster resources.
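
One practical difference between the two: admin activity logs are always written, while data access logs are turned off by default for most Google Cloud services and have to be enabled in the project's IAM policy. The fragment below is a sketch of the auditConfigs section that does this. It uses allServices for simplicity, which is an assumption on our part rather than a recommendation, since it enables data access logging for every service in the project and can generate a large volume of logs.

```yaml
# Fragment of a project IAM policy: turn on Data Access audit logs.
# Merge it into the policy returned by "gcloud projects get-iam-policy"
# instead of applying it on its own, or existing bindings will be lost.
auditConfigs:
- service: allServices
  auditLogConfigs:
  - logType: ADMIN_READ
  - logType: DATA_READ
  - logType: DATA_WRITE
```

Once you know which services matter for your clusters, replace allServices with their specific service names to keep log volume and cost under control.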

 

Augment native Kubernetes security with Wiz

While Google Kubernetes Engine offers strong native security features, organizations often require solutions beyond these capabilities to meet complex compliance requirements such as CIS benchmarks. Wiz's cloud-native Kubernetes security solution offers a comprehensive approach to securing your Kubernetes clusters, containerized applications, and the underlying cloud environment. Look to Wiz for:

  • Full visibility and risk prioritization: Wiz’s agentless scanning technology covers clouds, containers, hosts, and clusters; identifies "toxic combinations" based on multiple attack vectors; and provides a single risk-prioritization queue with actionable insights to remediate faster. 

  • Automatic Kubernetes security posture management (KSPM): Through continuous monitoring, compliance assessment, and rule-based assessment, Wiz provides a comprehensive KSPM solution for GKE. 

  • Secured containers throughout their life cycle: With Wiz, you can scan and detect vulnerabilities, secrets, misconfigurations and sensitive data while writing code or while building the container image. And the admission controller prevents the deployment of untrusted images or misconfigured applications before they reach the GKE cluster.

  • Threat detection and response in real time: Wiz can help detect and block malicious behaviors at both host and container levels in real time. In the event of a breach, Wiz facilitates investigations by correlating events from the control plane, hosts, containers, and cloud environment, providing SOC and incident response teams with comprehensive context to enable faster reactions.

See for yourself: Get a demo today to learn more about how Wiz transforms GKE security!
