Practical Guide to Amazon OpenSearch Serverless - OCU Design and Optimization Strategies by Collection Type

Amazon OpenSearch Serverless is a fully managed service that handles search and analytics workloads with auto-scaling while eliminating cluster management. This article explains the OCU billing model, collection type selection criteria, and index design best practices from a practical perspective.

Freedom from Cluster Management - The Operational Challenges OpenSearch Serverless Solves

With traditional Amazon OpenSearch Service, cluster operations such as selecting domain instance types, adjusting shard counts, and rebalancing after node failures were unavoidable. For workloads with high traffic variability, you had to accept either over-provisioning for peak load or latency degradation while scaling caught up. OpenSearch Serverless eliminates this challenge at the root: users focus on collection and index design while AWS manages compute and storage automatically. Internally, it uses an architecture that decouples indexing compute from search compute (and both from storage), allowing OCUs (OpenSearch Compute Units) for each to scale independently as ingest and query traffic fluctuate. Because of this separation, heavy data ingestion and high-frequency search queries are unlikely to interfere with each other's performance even when they occur simultaneously. On a traditional cluster, indexing and search frequently competed for resources on the same nodes; with Serverless, that contention is designed away.

Understanding the OCU Billing Structure - Minimum Costs and the Reality of Auto-Scaling

The factor that most influences OpenSearch Serverless costs is the OCU (OpenSearch Compute Unit) mechanism. One OCU provides 6 GiB of RAM plus corresponding vCPU, and capacity is allocated to two separate pipelines: indexing and search. The critical point is that as soon as you create a collection, a minimum of 2 OCUs for indexing and 2 OCUs for search - 4 OCUs in total - is reserved at all times. As of April 2026 in the Tokyo Region, one OCU costs approximately 0.334 USD/hour, so the minimum configuration comes to roughly 975 USD per month. Overlook this floor and assume "serverless means cheap," and the bill will surprise you. (Note that for development and test collections, AWS has since added the option to disable redundant replicas, which lowers the floor to 0.5 OCU each for indexing and search - 1 OCU in total.) When traffic increases, the service auto-scales up to several hundred OCUs, and when load subsides it scales back down to the minimum. Scaling response is on the order of tens of seconds to a few minutes, so it absorbs sudden spikes relatively quickly. The keys to cost optimization are deleting unneeded collections in development and staging environments to stop the minimum OCU charges, and setting an appropriate maximum OCU limit in production to cap runaway cost.
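The minimum-cost arithmetic above can be captured in a few lines. This is a rough sketch using the 0.334 USD/hour Tokyo figure and a 730-hour month from the paragraph; the function name is illustrative, and actual bills depend on region and real OCU-hour consumption:

```python
def estimate_monthly_ocu_cost(indexing_ocus: float, search_ocus: float,
                              price_per_ocu_hour: float = 0.334,
                              hours_per_month: float = 730) -> float:
    """Rough monthly cost for a steady OCU footprint (price varies by region)."""
    return (indexing_ocus + search_ocus) * price_per_ocu_hour * hours_per_month

# Minimum configuration: 2 indexing + 2 search OCUs are always reserved
print(round(estimate_monthly_ocu_cost(2, 2), 2))  # → 975.28
```

Running the same estimate against your expected peak OCU count is a quick way to sanity-check a maximum OCU limit before setting it.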

Three Collection Types - Selection Criteria Based on Workload Characteristics

OpenSearch Serverless offers three collection types: Search, Time series, and Vector search. Each uses a different internal indexing strategy, so choosing a type that mismatches your workload costs you on both performance and spend. The Search collection suits random-access workloads such as e-commerce product search and document search; replica placement is optimized to minimize search latency, achieving tens-of-milliseconds responses at p99. The Time series collection is designed for workloads dominated by time-axis writes and range queries, such as log analysis and metrics collection; segment merging for older data is optimized, making it strong at high-volume append writes. The Vector search collection is drawing attention as a semantic search engine and RAG knowledge base: it manages k-NN indexes internally, providing fast approximate nearest-neighbor search even for 1536-dimensional embedding vectors. For comparison, Azure AI Search also supports vector search, but OpenSearch Serverless treats it as an independent collection type with vector-specific optimizations applied.

Index Design Essentials - Shard Strategy and Mapping Definition in Practice

Even though OpenSearch Serverless eliminates the need for manual shard configuration, index mapping design still directly impacts performance. First, always create explicit mappings that define field data types. Relying on dynamic mapping can cause fields you intend to treat as numbers to be registered as text type, drastically reducing aggregation query efficiency. The distinction between keyword and text types also matters: use keyword for exact-match filtering and text for full-text search, and when a single field needs both, use a multi-field definition. For Time series collections, including a timestamp in the index name (e.g., logs-2026-04) makes old indexes easy to delete. For Search collections, keeping document size under 100 KB is recommended; large documents drive up indexing OCU consumption and degrade search latency. For Vector search collections, the choice of dimensions and distance metric (in OpenSearch's k-NN terms, cosinesimil, l2, or innerproduct) directly shapes the trade-off between recall and speed, so validate with benchmarks beforehand.
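To make the mapping advice concrete, here is a sketch of an explicit mapping for a product index, together with a helper for the timestamped index names the Time series guidance suggests. All field names are invented for illustration:

```python
# Explicit mapping: numeric, keyword, and text/keyword multi-field types declared up front
PRODUCT_MAPPING = {
    "mappings": {
        "properties": {
            "sku": {"type": "keyword"},                  # exact-match filtering only
            "title": {                                   # full-text search...
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},  # ...plus exact match via title.raw
            },
            "price": {"type": "double"},                 # explicit numeric type for aggregations
            "created_at": {"type": "date"},
        }
    }
}

def monthly_index_name(prefix: str, year: int, month: int) -> str:
    """Timestamped index names (e.g. logs-2026-04) make old data easy to drop."""
    return f"{prefix}-{year:04d}-{month:02d}"
```

Defining the mapping as a reviewable constant like this, rather than letting dynamic mapping infer types on first write, is what prevents the "number registered as text" failure mode described above.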

Security and Access Control - Designing Encryption and Data Access Policies

The security model of OpenSearch Serverless differs fundamentally from traditional OpenSearch Service. Instead of fine-grained access control (FGAC), it uses a three-layer structure: encryption policies, network policies, and data access policies. An encryption policy must be defined before a collection is created, choosing between an AWS-managed key and a customer-managed KMS key; pick a customer-managed key when compliance requirements demand control over key rotation. Network policies determine whether a collection is reachable publicly or only through VPC endpoints; in production, restricting access to VPC endpoints, and applying the same restriction to dashboard access, is recommended. Data access policies grant IAM principals operation permissions at the collection or index level, defined as JSON rules. Because policies can target wildcard patterns of collection names, a consistent naming convention simplifies management; for example, isolating access per team with prefixes like team-a-* works well in practice.
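The prefix-based isolation described above lends itself to generating policies programmatically. The sketch below builds a data access policy document in the JSON rule shape the service expects; the role ARN is a placeholder, and the exact permission set (aoss:ReadDocument and friends) is an assumption you would tighten to your own requirements:

```python
import json

def team_access_policy(team_prefix: str, principal_arns: list[str]) -> str:
    """Data access policy granting a team read/write on its own prefixed resources."""
    policy = [{
        "Rules": [
            {   # collection-level rule, matched by wildcard on the naming convention
                "ResourceType": "collection",
                "Resource": [f"collection/{team_prefix}-*"],
                "Permission": ["aoss:DescribeCollectionItems"],
            },
            {   # index-level rule for documents under the same prefix
                "ResourceType": "index",
                "Resource": [f"index/{team_prefix}-*/*"],
                "Permission": ["aoss:CreateIndex", "aoss:ReadDocument", "aoss:WriteDocument"],
            },
        ],
        "Principal": principal_arns,
    }]
    return json.dumps(policy)

# Example: isolate team A behind the team-a-* prefix (ARN is a placeholder)
doc = team_access_policy("team-a", ["arn:aws:iam::123456789012:role/TeamARole"])
```

Generating the document from the team prefix keeps the naming convention and the access boundary defined in one place, so a new team's policy cannot drift from the pattern.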

Adoption Decision Framework - Where Serverless and Traditional Clusters Diverge

The choice between OpenSearch Serverless and traditional OpenSearch Service should be driven by workload characteristics and cost structure. Serverless suits workloads with high, unpredictable traffic variability, teams with little operations capacity, and cases where a search platform must be stood up quickly. Traditional clusters are more appropriate when traffic is stable and strict cost control matters, or when custom plugins are required. As a cost breakpoint: if your load consistently exceeds the equivalent of 4 OCUs (approximately 975 USD/month), the Serverless minimum is no longer wasted capacity and its benefits become apparent; for small, stable workloads, that fixed minimum is disproportionately expensive. (For comparison, Azure AI Search scales at a different granularity, using replica and partition units.)
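The breakpoint reasoning above can be sketched as a simple comparison. This is a deliberate simplification, assuming billing is the greater of your average OCU footprint and the 4-OCU floor, at the Tokyo price quoted earlier; real bills depend on hour-by-hour scaling:

```python
OCU_PRICE_USD_PER_HOUR = 0.334   # Tokyo Region figure quoted earlier
HOURS_PER_MONTH = 730

def serverless_monthly_usd(avg_ocus: float, floor_ocus: float = 4.0) -> float:
    """Simplified estimate: the 4-OCU minimum applies even when average load is lower."""
    return max(avg_ocus, floor_ocus) * OCU_PRICE_USD_PER_HOUR * HOURS_PER_MONTH

def prefer_serverless(avg_ocus: float, provisioned_monthly_usd: float) -> bool:
    """True when the serverless estimate undercuts a comparable provisioned cluster."""
    return serverless_monthly_usd(avg_ocus) <= provisioned_monthly_usd
```

The max() in the estimate is exactly why small, stable workloads lose: a 1-OCU-equivalent load is still billed as 4, while a workload already above the floor pays only for what it uses.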