Running MongoDB Workloads as a Managed Service with Amazon DocumentDB - Document Model and Query Design

Manage document data with a MongoDB-compatible API, and achieve read scaling and disaster recovery with up to 15 read replicas and global clusters. This article also covers sharding with Elastic Clusters.

Overview of DocumentDB

Amazon DocumentDB is a fully managed, MongoDB-compatible document database service. It is compatible with the MongoDB 3.6, 4.0, and 5.0 APIs, so most existing MongoDB drivers and tools work without changes, although not every MongoDB feature is supported. Storage scales automatically up to 128 TiB, and six copies of the data are kept across three Availability Zones for high availability. Unlike the relational engines in Amazon RDS, it stores schema-less JSON documents, including complex structures with nested objects and arrays. While DynamoDB is optimized for key-value access, DocumentDB is suited to workloads that need rich queries such as aggregation pipelines, text search, and geospatial queries.
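As a rough illustration of what connecting with an existing MongoDB driver involves, the sketch below builds a connection URI in the shape commonly used for DocumentDB. The host name, user name, password, and the local path of the CA bundle are hypothetical placeholders, and the helper build_docdb_uri is an illustrative name, not part of any library. The options shown (TLS, replica set mode, reads from replicas, retryable writes turned off) are common choices rather than requirements.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(host: str, user: str, password: str, port: int = 27017) -> str:
    """Build a MongoDB connection URI for a DocumentDB cluster endpoint (sketch)."""
    options = {
        "tls": "true",                           # encrypt traffic in transit
        "tlsCAFile": "global-bundle.pem",        # assumed local path to the Amazon CA bundle
        "replicaSet": "rs0",                     # connect in replica set mode
        "readPreference": "secondaryPreferred",  # prefer read replicas for reads
        "retryWrites": "false",                  # DocumentDB does not support retryable writes
    }
    # Percent-encode credentials so characters such as '@' or '/' do not break the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{host}:{port}/?{urlencode(options)}"


# Hypothetical cluster endpoint and credentials, for illustration only.
uri = build_docdb_uri(
    "my-cluster.cluster-example.ap-northeast-1.docdb.amazonaws.com",
    "appuser",
    "p@ss/word",
)
print(uri)
```

The resulting string can be passed to a driver such as pymongo via MongoClient(uri), assuming the client runs inside a network that can reach the cluster.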

Cluster Configuration and Scaling

An instance-based DocumentDB cluster consists of one primary instance and up to 15 read replicas. Read replicas spread the read workload, and if the primary fails, one of them is automatically promoted to take its place. Failover priority can be set per instance, so you can control which replicas are preferred for promotion. Elastic Clusters is DocumentDB's sharding option, offered as a separate cluster type and designed for workloads with millions of reads and writes per second and petabyte-scale storage. You specify a shard key and data is distributed automatically; shards can be added and removed online. Global clusters replicate data to read-only secondary clusters in up to five other regions, typically with sub-second replication lag, enabling global read scaling and disaster recovery.
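To make the idea of shard-key-based distribution concrete, here is a minimal sketch of hash sharding in general. It is not the algorithm Elastic Clusters uses internally; the function shard_for, the MD5-plus-modulo scheme, the shard count of 4, and the customer-ID keys are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def shard_for(shard_key_value: str, shard_count: int) -> int:
    """Map a shard key value to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical workload: 10,000 documents keyed by customer ID, spread over 4 shards.
keys = [f"customer-{i}" for i in range(10_000)]
distribution = Counter(shard_for(key, shard_count=4) for key in keys)

for shard, count in sorted(distribution.items()):
    print(f"shard {shard}: {count} documents")
```

The point of the sketch is that a high-cardinality key spreads documents roughly evenly, while a key with only a few distinct values would concentrate traffic on a few shards. Plain modulo hashing also remaps most keys when the shard count changes, which is one reason a managed service would use a more elaborate scheme.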

Queries and Operations

DocumentDB supports MongoDB aggregation pipelines, text indexes, and geospatial indexes. You can combine aggregation stages such as $match, $group, $sort, and $lookup (cross-collection joins) to run complex analytical queries. Change Streams capture changes to a collection in near real time and can trigger Lambda functions for event-driven processing. Performance Insights visualizes database load and wait states to help identify the queries behind a slowdown; the execution plan of an individual query can be inspected with explain(). Automatic backups are retained for up to 35 days, with point-in-time restore (PITR) to any second within the retention period.
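The stages named above can be combined as in the following sketch, which expresses a pipeline as plain Python data in the form MongoDB drivers accept. The collection names (orders, customers), the field names (status, customerId, amount), and the helper top_customers_pipeline are hypothetical; $limit is added here as a convenience and is not mentioned in the text above.

```python
from typing import Any


def top_customers_pipeline(status: str, limit: int) -> list[dict[str, Any]]:
    """Build a pipeline that ranks customers by total order amount (sketch)."""
    return [
        # Keep only orders in the requested status.
        {"$match": {"status": status}},
        # Sum the amount and count the orders per customer.
        {"$group": {
            "_id": "$customerId",
            "total": {"$sum": "$amount"},
            "orders": {"$sum": 1},
        }},
        # Highest total first, then keep the top N.
        {"$sort": {"total": -1}},
        {"$limit": limit},
        # Join each result with the matching document in the customers collection.
        {"$lookup": {
            "from": "customers",
            "localField": "_id",
            "foreignField": "_id",
            "as": "customer",
        }},
    ]


pipeline = top_customers_pipeline("shipped", 10)
# With pymongo, this list would be passed as: db.orders.aggregate(pipeline)
print([next(iter(stage)) for stage in pipeline])
```

Placing $match first reduces the number of documents the later stages have to process, and running $lookup after $limit keeps the join to a handful of rows.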

DocumentDB Pricing

DocumentDB pricing for instance-based clusters has three main components: instances, storage, and I/O. A db.r6g.large instance (2 vCPU, 16 GiB) costs approximately $0.277/hour (Tokyo region). Storage costs approximately $0.11 per GB/month, and I/O costs approximately $0.22 per million requests. These figures are approximate and change over time, so check the current price list before budgeting. Choosing I/O-Optimized storage eliminates per-request I/O charges in exchange for higher instance and storage rates, which can reduce total cost for I/O-intensive workloads. Elastic Clusters are billed per vCPU, so cost scales with the number and size of shards. Each read replica is billed at the same instance rate as the primary, so size the number of replicas to your read workload.
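The three components can be put together in a back-of-the-envelope estimate like the sketch below. It uses the approximate rates quoted above as defaults; the 730-hour month, the function name estimate_monthly_cost, and the example workload (one primary plus two replicas, 100 GB of storage, 500 million I/O requests) are assumptions for illustration, and backup storage and data transfer are ignored.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def estimate_monthly_cost(
    instances: int,
    storage_gb: float,
    io_millions: float,
    instance_hourly: float = 0.277,   # approx. db.r6g.large rate quoted in the article
    storage_gb_month: float = 0.11,   # approx. storage rate quoted in the article
    io_per_million: float = 0.22,     # approx. I/O rate quoted in the article
) -> dict[str, float]:
    """Estimate the monthly cost of a standard (not I/O-Optimized) cluster in USD."""
    instance_cost = instances * instance_hourly * HOURS_PER_MONTH
    storage_cost = storage_gb * storage_gb_month
    io_cost = io_millions * io_per_million
    return {
        "instances": instance_cost,
        "storage": storage_cost,
        "io": io_cost,
        "total": instance_cost + storage_cost + io_cost,
    }


# Hypothetical workload: 1 primary + 2 read replicas, 100 GB, 500 million I/O requests.
estimate = estimate_monthly_cost(instances=3, storage_gb=100, io_millions=500)
for component, usd in estimate.items():
    print(f"{component}: ${usd:,.2f}")
```

In this example the instances dominate the bill, which is why the replica count deserves the most attention; for a workload where the I/O line is a large share of the total, comparing against I/O-Optimized rates becomes worthwhile.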

Summary

Amazon DocumentDB is a fully managed MongoDB-compatible document database that provides schema-less JSON data storage and rich queries. With up to 15 read replicas, sharding via Elastic Clusters, and disaster recovery through global clusters, it handles workloads ranging from small to large scale.