Sunshine After Rain

Hunting for Bucket Traversals in Google's Client Libraries


Preface

This writeup picks up pretty much where the last one ended: I had found an exploitable instance of a bucket traversal vulnerability and stumbled on an N-day in the Go Cloud Storage client library.

Intrigued by this finding, I decided to audit other Cloud Storage client libraries, focusing solely on variants of similar issues.


Bucket Traversal 101

So what is a Bucket Traversal to begin with?

Asked Gemini 2.5 to do the heavy lifting:
A Bucket Traversal Vulnerability is a class of security weaknesses where an application improperly handles user-supplied input when interacting with cloud storage services (like AWS S3, Google Cloud Storage, Azure Blob Storage, etc.). This flaw allows an attacker to manipulate the input (typically filenames, paths, or object keys) to access or sometimes modify objects within a storage bucket that they are not intended to have access to.

The above definition is not wrong, but it doesn’t mention the vital Cloud IAM prerequisite, which also has to be met:
The Service Account (non-human identity) attached to the target workload has to be granted IAM permissions to read/write/edit bucket(s) other than the intended one.

For the purpose of this writeup, I will differentiate between two subsets of bucket traversal issues:

  1. Application level, i.e. faulty business logic, lack of input validation etc.
  2. Library level, i.e. security bugs in opinionated SDKs, often maintained by Cloud Service Providers

There is little I could add about the practices that prevent issues from class #1 (AppSec 101).
Therefore, I will focus primarily on class #2, that is, cases where the implementation of the vendor-maintained library is itself vulnerable.


Case study

TL;DR

The method google.cloud.storage.transfer_manager.upload_chunks_concurrently() was vulnerable to a variant of a path (bucket) traversal.

Timeline:

  • Faulty function was introduced in version 2.11.0 on September 19th 2023
  • I submitted this issue to Google VRP on July 14th 2024
  • The vulnerability was fixed in version 2.18.1 on August 6th 2024

Here’s the recently disclosed report -> https://bughunters.google.com/reports/vrp/h1K5SciPh

Why is there no GitHub Security Advisory (GHSA) and/or CVE published, you might wonder?
Well, that’s a topic for a separate discussion. I was told that at least a post factum comment will eventually be added.


Overview

Python Client for Google Cloud Storage is an Open Source project maintained by Google.

Excerpt from docs:
Client libraries make it easier to access Google Cloud APIs from a supported language. While you can use Google Cloud APIs directly by making raw requests to the server, client libraries provide simplifications that significantly reduce the amount of code you need to write.

This library is used in many foundational Python-based ML/AI Open Source projects such as:

The relatively modest number of stars on GitHub does not properly reflect its significance.


Technical analysis

Google Cloud Storage exposes three distinct APIs:

I decided to focus on the XML API because of its (subjectively) error-prone schema and its interoperability with Amazon Simple Storage Service (Amazon S3).

After some brief source code review & grey box testing, I pinpointed a spot where a traversal could occur:
https://github.com/googleapis/python-storage/blob/d5d3c68a6e5c6f8cefc59892c1ccceaf181ff32d/google/cloud/storage/transfer_manager.py#L1084-L1087

The issue stemmed from the fact that the URL path was constructed insecurely, without context-specific encoding:

    url = "{hostname}/{bucket}/{blob}".format(
        hostname=hostname, bucket=bucket.name, blob=blob.name
    )

As a result, if blob.name came from user input, an attacker could use the classic dot-dot-slash technique to upload a file to a bucket the victim never intended, e.g. ../bucket/object
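
To make the mechanics concrete, here is a minimal standalone sketch that mirrors the string formatting from the excerpt and compares it with a percent-encoded variant. Everything in it is illustrative: the hostname value, the bucket names, and the helpers build_url_unsafe, build_url_encoded and effective_bucket are hypothetical names of mine, not library code. The encoded variant is not necessarily the fix that shipped in 2.18.1. The sketch also assumes that dot segments get collapsed somewhere between the client and the storage backend, which it simulates with posixpath.normpath.

```python
import posixpath
from urllib.parse import quote, urlsplit

# Illustrative endpoint; the real library derives the hostname itself.
HOSTNAME = "https://storage.googleapis.com"


def build_url_unsafe(bucket_name: str, blob_name: str) -> str:
    # Same pattern as the excerpt above: the blob name is interpolated verbatim.
    return "{hostname}/{bucket}/{blob}".format(
        hostname=HOSTNAME, bucket=bucket_name, blob=blob_name
    )


def build_url_encoded(bucket_name: str, blob_name: str) -> str:
    # Percent-encode every reserved character, including "/", so the blob
    # name cannot introduce additional path segments.
    return "{hostname}/{bucket}/{blob}".format(
        hostname=HOSTNAME, bucket=bucket_name, blob=quote(blob_name, safe="")
    )


def effective_bucket(url: str) -> str:
    # Assumption: dot segments are collapsed before the request is routed.
    # normpath is only a stand-in for that normalization step.
    path = posixpath.normpath(urlsplit(url).path)
    return path.lstrip("/").split("/", 1)[0]


malicious_name = "../other-bucket/flags.json"  # attacker-controlled blob name

unsafe_url = build_url_unsafe("intended-bucket", malicious_name)
encoded_url = build_url_encoded("intended-bucket", malicious_name)

print(effective_bucket(unsafe_url))   # bucket the unsafe URL resolves to
print(effective_bucket(encoded_url))  # bucket the encoded URL resolves to
```

Under these assumptions the unsafe URL resolves to other-bucket, while the encoded one stays inside intended-bucket because the slashes in the blob name no longer act as path separators.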


PoC

Here’s the original PoC recording, based on the official sample snippet.


Attack scenario

Depending on the IAM permissions granted to the underlying Service Account, this could lead to malicious scenarios such as:

  1. overwriting existing files (data & integrity loss)
  2. upload of an object later consumed by the application (config override, XSS etc.)

Potential impact associated with vector #1 is self evident.
I think that scenario #2 is far more interesting.


Diagram of a sample vulnerable application

I prepared a diagram of a sample vulnerable application, GigaUpload, to better convey the idea.

This fictitious service meets the following criteria:

  • large file upload implemented using google.cloud.storage.transfer_manager.upload_chunks_concurrently()
  • client side behaviour (features, flags etc.) managed via config files fetched from a dedicated GCS bucket
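
On the application side, a service like GigaUpload could add a defense-in-depth check before handing a user-supplied name to the library. The sketch below is only one possible policy, and a deliberately strict one: the function is_safe_object_name and the sample names (including the gigaupload-config bucket) are hypothetical and do not come from the library or from the original report.

```python
def is_safe_object_name(name: str) -> bool:
    """Hypothetical guard for user-supplied object names.

    Rejects empty names, backslashes, and any empty, "." or ".." path
    segment, so the name cannot climb out of the intended bucket once it
    is placed into a URL path.
    """
    if not name or "\\" in name:
        return False
    return all(segment not in ("", ".", "..") for segment in name.split("/"))


candidates = [
    "uploads/video.mp4",
    "../gigaupload-config/flags.json",
    "uploads/../../gigaupload-config/flags.json",
    "/leading-slash.bin",
]

for candidate in candidates:
    verdict = "accept" if is_safe_object_name(candidate) else "reject"
    print(f"{verdict}: {candidate}")
```

A check like this does not replace a fixed library version; it only narrows what an attacker can feed into it.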


Summary

Bucket traversal appears to be an underresearched class of vulnerabilities, requiring significant context-specific knowledge for comprehensive understanding.
It exists at the intersection of traditional Application Security (AppSec) and Cloud Security, underscoring the critical need to integrate these two domains.