Amazon S3

An object storage service with virtually unlimited capacity that provides 99.999999999% (eleven nines) durability by default, supporting a wide range of use cases from static website hosting to data lakes

Overview

Amazon Simple Storage Service (S3) is the most widely used storage service on AWS. It lets you store and retrieve data from anywhere over the internet with no limit on total storage capacity. Individual objects can be up to 5 TB, and data is automatically replicated across a minimum of three Availability Zones, achieving an extremely high durability of 99.999999999% (eleven nines). Multiple storage classes are available to match different access patterns, including S3 Standard, S3 Standard-IA, S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive. Lifecycle policies let you automatically optimize costs over time. S3 is used wherever data needs to be stored, from static website hosting and data lake construction to backup, archiving, and log aggregation.

Storage Classes and Lifecycle Policies

S3 offers a range of storage classes that you can choose from based on data access patterns to optimize costs. S3 Standard is designed for frequently accessed data and costs approximately $0.025 per GB per month (Tokyo region). S3 Standard-IA suits data accessed roughly once a month, cutting storage costs by about 40% relative to S3 Standard, though per-request retrieval charges apply. S3 Intelligent-Tiering automatically analyzes access patterns and moves objects to the most cost-effective tier, making it ideal for data with unpredictable access patterns. The S3 Glacier family is designed for long-term archiving; Deep Archive is the cheapest at approximately $0.002 per GB per month, but retrieval can take up to 12 hours. By configuring lifecycle policies, you can define rules such as transitioning objects to Standard-IA after 30 days and to Glacier after 90 days, reducing costs without manual intervention. Azure Blob Storage also has a four-tier structure (Hot/Cool/Cold/Archive), but it lacks the ability to automatically optimize access tiers at the individual-object level the way S3 Intelligent-Tiering does, requiring manually configured lifecycle management policies instead.
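The 30-day/90-day transition rule described above can be written down as a lifecycle configuration document. Here is a minimal sketch in Python (the rule ID is a hypothetical name chosen for illustration); the resulting dict is the shape S3 expects for a lifecycle configuration:

```python
import json

# Lifecycle rule: Standard -> Standard-IA after 30 days -> Glacier after 90 days.
# The empty prefix filter makes the rule apply to every object in the bucket.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-after-90-days",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},       # match all objects
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

With boto3 you would apply this via `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_config)`; the same JSON can also be pasted into the console or passed to the AWS CLI.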

Bucket Design and Access Control

S3 bucket names must be globally unique, so including project names or account IDs in the naming convention is a practical standard for avoiding collisions. Access control is structured in two layers: bucket policies (resource-based) and IAM policies (identity-based). The recommended approach is to first enable S3 Block Public Access at the account level to prevent unintended exposure, then grant only the necessary access through bucket policies. Server-side encryption with SSE-S3 (S3-managed keys) is enabled by default; choose SSE-KMS when compliance requirements demand key management control. Enabling versioning allows recovery from accidental deletions, and combining it with MFA Delete also prevents malicious deletions.
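To make the resource-based layer concrete, here is a sketch of a common baseline bucket policy that denies any request not made over TLS (the bucket name is hypothetical, used only for illustration):

```python
import json

BUCKET = "example-project-logs"  # hypothetical bucket name

# Bucket policy (resource-based layer): deny all requests sent over plain HTTP.
# Used alongside Block Public Access, not as a replacement for it.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",      # the bucket itself
                f"arn:aws:s3:::{BUCKET}/*",    # every object in it
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

print(json.dumps(bucket_policy, indent=2))
```

With boto3 this would be attached via `s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(bucket_policy))`. Note that both ARNs are listed: bucket-level actions and object-level actions match different resources.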

Performance Tuning and Cost Visibility

S3 supports 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix, and spreading load across prefixes effectively provides virtually unlimited throughput. Multipart uploads are essential for large files: AWS recommends them for objects over 100 MB, and uploading parts in parallel significantly improves transfer speed. S3 Transfer Acceleration speeds up uploads from geographically distant clients by routing through CloudFront's edge network. For cost management, S3 Storage Lens provides a dashboard that visualizes overall bucket usage, helping identify unnecessary objects and optimize storage classes. S3 Inventory periodically exports object listings in CSV or Parquet format, useful for auditing encryption status and storage class distribution across your buckets.
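Multipart uploads have hard limits worth keeping in mind: at most 10,000 parts per upload, each part (except the last) at least 5 MiB, and objects up to 5 TiB. A sketch of the arithmetic for choosing a part size, assuming a hypothetical 100 MiB default chunk:

```python
import math

MIB = 1024 ** 2
MIN_PART = 5 * MIB    # S3 minimum part size (all parts except the last)
MAX_PARTS = 10_000    # S3 limit on parts per multipart upload

def choose_part_size(object_size: int, target_part: int = 100 * MIB) -> int:
    """Pick a part size that keeps the upload within S3's 10,000-part limit."""
    # Start from a target chunk (100 MiB here is an illustrative default),
    # then grow it if the object would otherwise need too many parts.
    part = max(target_part, MIN_PART)
    if math.ceil(object_size / part) > MAX_PARTS:
        part = math.ceil(object_size / MAX_PARTS)
    return part

# A 5 TiB object (the S3 maximum) cannot be uploaded in 100 MiB parts:
size = 5 * 1024 ** 4
part = choose_part_size(size)
print(part // MIB, "MiB per part ->", math.ceil(size / part), "parts")
# prints: 524 MiB per part -> 10000 parts
```

In practice boto3's transfer manager (`s3.upload_file` with a `TransferConfig`) does this sizing and the parallel part uploads for you; the function above only shows why a fixed 100 MiB chunk stops working near the 5 TiB ceiling.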
