Is the "/" in S3 Really a Folder? - The Truth About Flat Namespaces and How Prefixes Work
Starting from the fact that S3 has no concept of folders and the "/" in object keys is simply part of the string, this article explains how prefix searches work, the trick the console uses to display folders, and best practices for key design.
S3 Has No Folders
S3's data model is a flat namespace consisting of only two levels: buckets and objects. There is no directory hierarchy like a file system. The object key images/2024/photo.jpg is not "photo.jpg inside the 2024 folder inside the images folder" but rather "a single object named images/2024/photo.jpg." The "/" is part of the key name and is fundamentally different from a file system path separator. There is a simple way to verify this. Running aws s3api list-objects-v2 --bucket my-bucket with the AWS CLI returns all objects as a flat list. The concept of folders does not appear at all. On the other hand, running aws s3 ls s3://my-bucket/ displays the contents as if a folder structure exists. This is because the s3 command (a high-level command) uses "/" as a delimiter and displays common prefixes as if they were folders.
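The flat model can be imitated locally with a plain dictionary. The sketch below is only an illustration and does not call S3: the bucket contents and the helper list_keys are hypothetical, and prefix filtering is modelled as a simple string comparison.

```python
# A hypothetical in-memory "bucket": each key is one plain string.
bucket = {
    "images/2024/photo.jpg": b"...",
    "images/2024/01/cat.jpg": b"...",
    "images-backup.zip": b"...",
    "readme.txt": b"...",
}


def list_keys(store, prefix=""):
    """Return every key that starts with `prefix`, in sorted order."""
    return sorted(key for key in store if key.startswith(prefix))


# Without a delimiter, everything comes back as one flat list.
all_keys = list_keys(bucket)

# A prefix is only a leading substring, so "images" also matches
# "images-backup.zip", which a folder model would never place under "images".
images_keys = list_keys(bucket, "images")

print(all_keys)
print(images_keys)
```

Because the match is purely textual, a prefix of images and a prefix of images/ select different sets of objects, which a real directory tree would not allow.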
The Trick the Console Uses to Display Folders
The S3 screen in the AWS Management Console shows folder icons and even has a "Create folder" button. However, when you "create a folder" in the console, what actually happens is that a zero-byte object with a trailing "/" (e.g., images/) is created. This object is merely a marker that the console displays as a folder. Deleting the marker object itself, for example through the API or CLI, does not affect other objects with that prefix: deleting the images/ object does not remove images/photo.jpg. (Deleting a folder from the console is a different operation, because the console deletes every object under that prefix.) Conversely, uploading images/photo.jpg causes the console to display an images folder even without the images/ marker. The console calls the ListObjectsV2 API with the Delimiter parameter set to "/" and displays the returned CommonPrefixes as folders. Without understanding this mechanism, you may be puzzled by phenomena like "I deleted the folder object but the files inside didn't disappear" or "folders are showing up even though I never created them."
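The grouping that produces these apparent folders can be sketched in a few lines. This is a simplified local imitation of listing with a delimiter, not the actual ListObjectsV2 implementation; the function list_with_delimiter and the sample keys are hypothetical, and pagination is ignored.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Split the keys under `prefix` into (contents, common_prefixes)."""
    contents = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No delimiter after the prefix: list it as an object.
            contents.append(key)
        else:
            # Roll everything up to the next delimiter into one "folder".
            common_prefixes.add(prefix + rest[:cut + len(delimiter)])
    return contents, sorted(common_prefixes)


# Hypothetical bucket contents with no "images/" marker object.
keys = ["images/photo.jpg", "images/2024/cat.jpg", "readme.txt"]

top_contents, top_folders = list_with_delimiter(keys)
inner_contents, inner_folders = list_with_delimiter(keys, prefix="images/")

# Adding the zero-byte marker changes nothing in the top-level listing.
with_marker = keys + ["images/"]
marker_contents, marker_folders = list_with_delimiter(with_marker)

print(top_contents, top_folders)
print(inner_contents, inner_folders)
print(marker_contents, marker_folders)
```

In this model the images/ folder appears as soon as any key starts with images/, and it stays visible whether or not the marker object exists.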
The Relationship Between Prefixes and Performance
S3 prefixes directly affect performance. In 2018, S3 underwent a major performance improvement, and it now supports at least 5,500 GET/HEAD requests and 3,500 PUT/COPY/POST/DELETE requests per second per partitioned prefix. The "prefix" here is not limited to the portion up to the first "/" in the object key: it can be any leading portion of the key, and a convenient way to think about it is everything before the last "/". For example, the prefix of images/2024/01/photo.jpg is images/2024/01. This means that by distributing prefixes, you can scale the overall throughput of a bucket virtually without limit. Even if a single prefix is capped at 5,500 GETs per second, distributing across 100 prefixes enables 550,000 GETs per second. Note that S3 partitions prefixes gradually in response to sustained load, so a sudden spike can still receive 503 Slow Down responses until scaling catches up. Before 2018, S3's partitioning was based on the leading characters of the key, so the best practice was to prepend a random hash to keys (e.g., a1b2c3/images/photo.jpg). This workaround is no longer necessary, as S3 now automatically optimizes partitions.
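The arithmetic above can be written out as a short sketch. It treats everything before the last "/" as the prefix, following the text, and assumes every prefix has already been partitioned and receives an even share of traffic; the function names and sample keys are hypothetical.

```python
GET_PER_PREFIX = 5_500  # GET requests per second per prefix, from the text
PUT_PER_PREFIX = 3_500  # PUT requests per second per prefix, from the text


def prefix_of(key):
    """Return everything before the last "/" ("" when there is none)."""
    head, _, _ = key.rpartition("/")
    return head


def aggregate_rate(prefix_count, per_prefix_rate):
    """Best-case requests per second across evenly loaded prefixes."""
    return prefix_count * per_prefix_rate


# Hypothetical keys spread over two prefixes.
keys = [
    "images/2024/01/photo.jpg",
    "images/2024/01/cat.jpg",
    "images/2024/02/dog.jpg",
]
distinct_prefixes = {prefix_of(key) for key in keys}

print(prefix_of("images/2024/01/photo.jpg"))
print(aggregate_rate(len(distinct_prefixes), GET_PER_PREFIX))
print(aggregate_rate(100, GET_PER_PREFIX))
```

The result is an upper bound under those assumptions, not a guaranteed rate: uneven access patterns concentrate load on a few prefixes and lower the achievable total.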
Best Practices for Object Key Design
Object key design significantly impacts S3 operational efficiency. First, establish a consistent naming convention. Date-based keys (logs/2024/01/15/access.log) are well-suited for managing time-series data. Since S3 lifecycle rules can be scoped to a prefix, keeping time-series data under a dedicated prefix such as logs/ lets you automatically transition old data to Glacier or delete it. Second, avoid special characters in keys. While S3 object keys can contain any UTF-8 character, keys with spaces, Japanese characters, or special symbols can cause URL encoding issues. Sticking to alphanumeric characters, hyphens, underscores, periods, and slashes is the safest approach. Third, be mindful of key length. The maximum object key length is 1,024 bytes, measured in UTF-8 bytes rather than characters, so multi-byte characters use up the budget faster. Using long keys to mimic deep folder structures can approach this limit. Additionally, since ListObjectsV2 responses include key names, longer keys increase response size and consume more network bandwidth.
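These three guidelines can be combined into a small helper. The sketch below is one possible convention, not an S3 requirement: build_log_key and check_key are hypothetical names, and the allowed character set (including the period used in file extensions) is an assumption chosen for this example.

```python
import re
from datetime import date

MAX_KEY_BYTES = 1024  # S3 limit on key length, in UTF-8 bytes

# Assumed "safe" set: alphanumerics, hyphen, underscore, period, slash.
SAFE_KEY = re.compile(r"[A-Za-z0-9_\-./]+")


def build_log_key(day, filename):
    """Build a date-based key such as logs/2024/01/15/access.log."""
    return f"logs/{day:%Y/%m/%d}/{filename}"


def check_key(key):
    """Return a list of problems found in `key` (empty when it looks fine)."""
    problems = []
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        problems.append("longer than 1,024 bytes")
    if not SAFE_KEY.fullmatch(key):
        problems.append("contains characters outside the safe set")
    return problems


good_key = build_log_key(date(2024, 1, 15), "access.log")

print(good_key, check_key(good_key))
print(check_key("logs/my file.log"))
print(check_key("a" * 1025))
```

Measuring the encoded length matters because a key made of multi-byte characters can exceed 1,024 bytes while containing far fewer than 1,024 characters.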
S3 Select and Leveraging the Flat Namespace
S3's flat namespace may feel unfamiliar to developers accustomed to traditional file system concepts, but it offers significant advantages for large-scale data processing. In a flat namespace, there is no directory metadata to manage, so performance does not degrade even when storing billions of objects. In a file system, having millions of files in a single directory can take several seconds just to read the directory metadata. S3 Select is a feature that filters object contents on the S3 side using SQL queries, returning only the data you need. You can run SELECT statements against CSV, JSON, or Apache Parquet objects, which can dramatically reduce data transfer volume. For example, when extracting specific columns from a 1 GB CSV file, S3 Select may transfer only a few megabytes of data. S3 Select pricing is based on data scanned ($0.002/GB) and data returned ($0.0007/GB), making it cost-effective when combined with reduced data transfer charges. Note that AWS closed S3 Select to new customers on July 25, 2024; existing customers can continue to use it.
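The pricing model can be turned into a rough estimate. The sketch below uses only the two per-GB rates quoted above and assumes that "a few megabytes" means 5 MB (0.005 GB); it ignores request charges and regional price differences, and select_cost is a hypothetical helper.

```python
SCAN_USD_PER_GB = 0.002     # data scanned, rate quoted in the text
RETURN_USD_PER_GB = 0.0007  # data returned, rate quoted in the text


def select_cost(scanned_gb, returned_gb):
    """Estimate the S3 Select charge in USD for one query."""
    return scanned_gb * SCAN_USD_PER_GB + returned_gb * RETURN_USD_PER_GB


# Scan a 1 GB CSV object and return an assumed 5 MB of matching columns.
cost = select_cost(scanned_gb=1.0, returned_gb=0.005)

print(f"{cost:.7f} USD")
```

Under these assumptions the scan charge dominates, so the saving comes mainly from moving 5 MB instead of 1 GB over the network rather than from the query price itself.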