Amazon DocumentDB

A fully managed document database with MongoDB-compatible APIs that provides scalable JSON document storage, querying, and indexing

Overview

Amazon DocumentDB (with MongoDB compatibility) is a fully managed document database service that provides APIs compatible with MongoDB 3.6, 4.0, and 5.0. Internally, it uses a distributed storage architecture similar to Aurora's, in which storage is automatically replicated as 6 copies across 3 Availability Zones (AZs). A cluster supports up to 15 read replicas and includes automatic backups, point-in-time recovery, and encryption both at rest and in transit. Because existing MongoDB drivers and tools work with it largely as-is, you can migrate existing MongoDB applications with minimal changes.
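
As a rough illustration of reusing a standard MongoDB driver, the sketch below builds a MongoDB-style connection URI for a DocumentDB cluster. The function name, the placeholder host, and the credentials are hypothetical. The options shown (tls, replicaSet=rs0, retryWrites=false, readPreference=secondaryPreferred) are commonly used with DocumentDB, but check them against your own cluster settings.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user, password, host, port=27017, read_from_replicas=True):
    """Build a MongoDB-style URI for a DocumentDB cluster endpoint (illustrative helper)."""
    options = {
        "tls": "true",            # encryption in transit
        "replicaSet": "rs0",      # connect in replica set mode
        "retryWrites": "false",   # commonly disabled when connecting to DocumentDB
    }
    if read_from_replicas:
        # Prefer read replicas for reads, falling back to the primary.
        options["readPreference"] = "secondaryPreferred"
    # Percent-encode credentials so characters such as '@' or '/' stay valid in the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{host}:{port}/?{urlencode(options)}"


# Placeholder endpoint and credentials, for illustration only.
uri = build_docdb_uri("app_user", "p@ss/word", "example-cluster.example.docdb.amazonaws.com")
print(uri)
```

The resulting string can be passed to an existing driver, for example pymongo's MongoClient, usually together with the CA bundle that TLS connections to the cluster require.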

MongoDB Compatibility and Where It Diverges

DocumentDB is compatible with the MongoDB wire protocol, but it is not MongoDB itself. Understanding this distinction precisely is essential to avoid unexpected behavior after migration. Compatible features include CRUD operations, aggregation pipelines, indexes (single-field, compound, multi-key, text, and geospatial), transactions (with 4.0 compatibility or later), and Change Streams. Incompatible features include the $where operator, server-side JavaScript execution, Capped Collections, and parts of some aggregation stages (for example, certain $merge options). For migrations from MongoDB Atlas, AWS Database Migration Service (DMS) enables online migration with minimal downtime. Before migrating, it is strongly recommended to run DocumentDB's compatibility testing tool to verify that the APIs your application uses are supported. MongoDB 6.0+ features such as Queryable Encryption are not available in DocumentDB, so if your application depends heavily on the latest MongoDB features, consider MongoDB Atlas instead.
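
To make the idea of a pre-migration check concrete, here is a minimal sketch that walks a query or pipeline document and reports operators from a deny-list. It is not the official compatibility tool. The function name is hypothetical, and the deny-list contains only the $where operator named above, so it is far from exhaustive.

```python
# Deny-list taken from the text above; intentionally incomplete.
UNSUPPORTED_OPERATORS = {"$where"}


def find_unsupported_operators(query, unsupported=UNSUPPORTED_OPERATORS):
    """Return a sorted list of deny-listed operator keys found anywhere in `query`."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in unsupported:
                    found.add(key)
                walk(value)
        elif isinstance(node, (list, tuple)):
            # Covers pipelines (lists of stages) and operands of $and / $or.
            for item in node:
                walk(item)

    walk(query)
    return sorted(found)


legacy_query = {"$or": [{"status": "active"}, {"$where": "this.a > this.b"}]}
print(find_unsupported_operators(legacy_query))
```

A check like this only catches operators that appear in query documents you can enumerate. Behavior differences, such as unsupported options on an otherwise supported stage, still need testing against a real cluster.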

Instance Class Selection and Cost Optimization

DocumentDB costs consist of three components: instance hours, storage, and I/O requests. Instance classes include the db.r5 and db.r6g (Graviton2) families, with Graviton2-based db.r6g being approximately 10% cheaper and delivering up to 30% better performance than equivalent db.r5 instances, although exact figures vary by region and workload. Whether the working set fits in instance memory is the critical performance threshold: when it does not, disk I/O increases and latency degrades sharply. If the CloudWatch BufferCacheHitRatio metric drops below 95%, consider scaling up the instance class. For read-heavy workloads, add read replicas and distribute reads from the application side. DocumentDB Elastic Clusters enable horizontal scaling through sharding, handling millions of read/write requests per second and petabyte-scale storage. For development and test environments, using a minimal Elastic Clusters configuration instead of instance-based clusters keeps costs down while developing against the same API as production.
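
The 95% rule of thumb for BufferCacheHitRatio can be expressed as a small check over metric samples. This is a sketch under stated assumptions: the function name is hypothetical, the samples are assumed to be percentages already retrieved from CloudWatch, and the min_breach_fraction parameter (the share of samples that must fall below the threshold) is an added assumption meant to avoid reacting to a single dip.

```python
def should_scale_up(hit_ratio_samples, threshold=95.0, min_breach_fraction=0.5):
    """Suggest scaling up when enough BufferCacheHitRatio samples fall below the threshold.

    hit_ratio_samples: percentages (0-100), e.g. per-period averages.
    min_breach_fraction: assumed tuning knob, not an AWS recommendation.
    """
    if not hit_ratio_samples:
        return False  # no data, no recommendation
    breaches = sum(1 for value in hit_ratio_samples if value < threshold)
    return breaches / len(hit_ratio_samples) >= min_breach_fraction


healthy = [99.1, 98.7, 99.5, 97.9]
pressured = [93.0, 94.2, 96.0, 92.5]
print(should_scale_up(healthy), should_scale_up(pressured))
```

In practice the samples would come from CloudWatch, and a sustained breach is a stronger signal that the working set no longer fits in memory than a single low reading.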

Choosing Between DocumentDB and DynamoDB

DocumentDB and DynamoDB are both NoSQL databases, but their data models and access patterns are fundamentally different. DynamoDB is a key-value/wide-column store that requires you to design access patterns around partition keys and sort keys upfront. Single-table design achieves extremely high throughput with single-digit millisecond latency, but it is not suited to ad-hoc queries or complex aggregations. DocumentDB is a document store that lets you create indexes on arbitrary fields for flexible querying. It offers query flexibility closer to that of a relational database, with aggregation pipelines for complex data transformations, partial updates on nested JSON documents, and regex-based searches. As a guideline, choose DynamoDB when access patterns are well-defined and ultra-high throughput is required, and choose DocumentDB when schemas change frequently and ad-hoc queries are common. Content management systems, user profiles, and catalog data, where document structures are complex and search criteria are diverse, are cases where DocumentDB's flexibility shines.
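
The query styles mentioned above (regex search, partial updates on nested fields, and aggregation) can be sketched as plain MongoDB-style documents. The collection fields used here (name, price, category, in_stock, specs) and the helper names are hypothetical. The operators themselves are standard MongoDB query, update, and aggregation operators.

```python
def build_catalog_search(name_pattern, max_price):
    """Ad-hoc filter: case-insensitive regex on name plus a price ceiling."""
    return {
        "name": {"$regex": name_pattern, "$options": "i"},
        "price": {"$lte": max_price},
    }


def build_nested_update(path, value):
    """Partial update: set one nested field via dot notation, leaving the rest untouched."""
    return {"$set": {path: value}}


# Aggregation pipeline: count in-stock items per category, largest first.
category_totals_pipeline = [
    {"$match": {"in_stock": True}},
    {"$group": {"_id": "$category", "items": {"$sum": 1}}},
    {"$sort": {"items": -1}},
]

print(build_catalog_search("^wireless", 100))
print(build_nested_update("specs.dimensions.width_cm", 42))
```

With a driver such as pymongo, these documents would be passed to collection.find, collection.update_one, and collection.aggregate respectively. Expressing the same queries in DynamoDB would typically require keys or secondary indexes designed for each access pattern in advance.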
