Cassandra-Compatible Database - Serverless Distributed Database with Amazon Keyspaces

Learn how to design and operate distributed databases using Amazon Keyspaces (for Apache Cassandra) and DynamoDB.

Apache Cassandra and Amazon Keyspaces

Apache Cassandra is a large-scale distributed NoSQL database known for high write throughput, linear scalability, and multi-region replication. It is used by major services such as Netflix, Apple, and Instagram, but operating a cluster yourself requires deep expertise. The operational workload is extensive: rebalancing data when nodes are added or removed, tuning compaction strategies, managing tombstones, and tuning the JVM.

Amazon Keyspaces is a fully managed, Cassandra-compatible database service that lets you use CQL (Cassandra Query Language) as-is. Existing Cassandra drivers and tools work with minimal modifications, so you keep Cassandra's data model and query patterns while removing most of the operational overhead. Because the service is serverless, you can start reading and writing data as soon as a table is created, with no need to pre-provision capacity.
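As a rough illustration of the "minimal modifications" involved, the sketch below points the open-source Python cassandra-driver at a Keyspaces endpoint over TLS. It assumes the standard commercial-Region endpoint pattern, port 9142, service-specific credentials, and a system CA bundle that trusts the Keyspaces certificate; the function names are hypothetical, and the driver must be installed separately.

```python
import ssl

KEYSPACES_PORT = 9142  # TLS port exposed by Amazon Keyspaces service endpoints


def keyspaces_endpoint(region: str) -> str:
    """Return the service endpoint host (assumes a standard commercial Region)."""
    return f"cassandra.{region}.amazonaws.com"


def build_tls_context() -> ssl.SSLContext:
    """Create a client TLS context that verifies the server certificate."""
    # Assumption: the system CA bundle trusts the Keyspaces certificate chain.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def connect(region: str, username: str, password: str):
    """Open a session with cassandra-driver (hypothetical helper)."""
    # Imported lazily so the helpers above work without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    host = keyspaces_endpoint(region)
    cluster = Cluster(
        contact_points=[host],
        port=KEYSPACES_PORT,
        ssl_context=build_tls_context(),
        # Hostname verification needs the expected server name.
        ssl_options={"server_hostname": host},
        auth_provider=PlainTextAuthProvider(username=username, password=password),
    )
    return cluster.connect()
```

Compared with a self-managed cluster, only the contact point, port, TLS context, and credentials change; the queries issued through the returned session stay ordinary CQL.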

Amazon Keyspaces Features and Architecture

Amazon Keyspaces is serverless and automatically scales table throughput with traffic. In on-demand mode, you pay per read/write request, which minimizes costs during low-traffic periods. Provisioned mode offers more cost-efficient pricing for predictable workloads. Storage expands automatically, and data is replicated across three Availability Zones. The availability SLA is 99.99% within a single Region and 99.999% when multi-Region replication is used.

Encryption is enabled by default for data both at rest and in transit, with the option to use customer-managed keys via AWS KMS. Keyspaces supports the major features of CQL 3.x, and table definitions, data types, and query syntax are compatible with Cassandra. Some features, such as lightweight transactions (LWT) and counter types, have limitations, so check them against your workload before migrating. Point-in-Time Recovery (PITR) lets you restore a table to any point within the past 35 days, making it easy to recover from accidental operations or data corruption.

Here is an example of creating a table in Keyspaces using CQL:

    CREATE TABLE my_keyspace.orders (
        customer_id text,
        order_id timeuuid,
        product_name text,
        quantity int,
        total_amount decimal,
        PRIMARY KEY (customer_id, order_id)
    ) WITH CLUSTERING ORDER BY (order_id DESC)
      AND CUSTOM_PROPERTIES = {'capacity_mode': {'throughput_mode': 'PAY_PER_REQUEST'}};
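The capacity mode and PITR settings described above are controlled through the same CUSTOM_PROPERTIES clause used in the CREATE TABLE example. The following is a minimal sketch that only builds the ALTER TABLE statements as strings; the helper names are hypothetical, and the statements would still need to be executed through cqlsh or a driver session.

```python
from typing import Optional


def _alter_custom_properties(table: str, properties: str) -> str:
    """Wrap a property map body in an ALTER TABLE ... CUSTOM_PROPERTIES statement."""
    return "ALTER TABLE " + table + " WITH CUSTOM_PROPERTIES = {" + properties + "};"


def capacity_mode_cql(
    table: str,
    read_units: Optional[int] = None,
    write_units: Optional[int] = None,
) -> str:
    """Build CQL for on-demand mode (no units given) or provisioned mode (both given)."""
    if read_units is None and write_units is None:
        mode = "{'throughput_mode': 'PAY_PER_REQUEST'}"
    elif read_units is None or write_units is None:
        raise ValueError("provisioned mode needs both read_units and write_units")
    else:
        mode = (
            "{'throughput_mode': 'PROVISIONED', "
            "'read_capacity_units': " + str(int(read_units)) + ", "
            "'write_capacity_units': " + str(int(write_units)) + "}"
        )
    return _alter_custom_properties(table, "'capacity_mode': " + mode)


def enable_pitr_cql(table: str) -> str:
    """Build CQL that turns on point-in-time recovery for a table."""
    return _alter_custom_properties(
        table, "'point_in_time_recovery': {'status': 'enabled'}"
    )


if __name__ == "__main__":
    print(capacity_mode_cql("my_keyspace.orders"))
    print(capacity_mode_cql("my_keyspace.orders", read_units=100, write_units=500))
    print(enable_pitr_cql("my_keyspace.orders"))
```

Keeping the statements as plain strings makes them easy to review before they are applied to a production table.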

Migration Strategy from Cassandra to Keyspaces

A phased approach is recommended for migrating from an existing Cassandra cluster to Keyspaces. First, verify connectivity to Keyspaces using cqlsh or a DataStax driver and confirm schema compatibility. For the data itself, there are two approaches: batch migration using AWS Glue, and gradual migration using dual writes (the application writes to both databases during the transition). With Glue, you configure an ETL job that reads data from Cassandra and writes it to Keyspaces. For large datasets, it is efficient to provision enough write capacity in Keyspaces provisioned mode for the bulk load and switch to on-demand mode once the migration is complete.

Application-side changes are minimal. The main ones are updating the connection endpoint and configuring TLS, which Keyspaces requires. During migration performance testing, verify Keyspaces' consistency behavior (writes use LOCAL_QUORUM; reads can use ONE, LOCAL_ONE, or LOCAL_QUORUM) and its latency characteristics in advance.
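The dual-write approach can be sketched as a thin wrapper in the application's data-access layer. The example below is an assumption-laden illustration, not a prescribed migration tool: DualWriter and its callbacks are hypothetical names, the existing Cassandra cluster is treated as the source of truth, and failed Keyspaces writes are simply kept in memory for a later retry.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
WriteFn = Callable[[Row], None]


@dataclass
class DualWriter:
    """Write each row to the existing cluster first, then to the migration target."""

    write_primary: WriteFn    # e.g. an INSERT against the existing Cassandra cluster
    write_secondary: WriteFn  # e.g. the same INSERT against Keyspaces
    failed: List[Row] = field(default_factory=list)

    def write(self, row: Row) -> None:
        # The primary is the source of truth: its errors propagate to the caller.
        self.write_primary(row)
        try:
            self.write_secondary(row)
        except Exception:
            # Do not fail the request; remember the row so it can be replayed.
            self.failed.append(row)

    def retry_failed(self) -> int:
        """Replay rows the secondary rejected; return how many now succeeded."""
        pending, self.failed = self.failed, []
        for row in pending:
            try:
                self.write_secondary(row)
            except Exception:
                self.failed.append(row)
        return len(pending) - len(self.failed)
```

In practice the two callbacks would wrap prepared statements on two driver sessions, and the retry list would live in a durable queue rather than in process memory.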

Choosing Between Keyspaces and DynamoDB

When selecting a distributed database on AWS, both Keyspaces and DynamoDB are candidates. Keyspaces is ideal as a migration target for existing Cassandra applications because you can apply your CQL knowledge directly. You keep Cassandra-specific data modeling patterns such as composite partition keys, clustering columns for ordering rows within a partition, TTL (Time to Live) for automatic data expiration, and static columns for data shared across a partition.

DynamoDB, as an AWS-native service, integrates seamlessly with Lambda, AppSync, API Gateway, and more. Its strengths lie in that ecosystem integration, including DynamoDB Streams for event-driven architectures, global tables for multi-region replication, and DAX (DynamoDB Accelerator) for microsecond-level caching.

For new development by teams without Cassandra experience, DynamoDB is recommended; for migrating existing Cassandra workloads, Keyspaces is the better choice. Both services support IAM authentication, VPC endpoints, and encryption, meeting enterprise-level security requirements.

Keyspaces Pricing

In on-demand mode, reads cost approximately $0.297 per million read request units, and writes cost approximately $1.4846 per million write request units. In provisioned mode, capacity is billed per hour rather than per month: RCUs cost approximately $0.000742 per hour and WCUs approximately $0.000371 per hour. Storage costs approximately $0.25 per GB per month. Rates vary by Region, so confirm the current figures on the Amazon Keyspaces pricing page before budgeting. The pricing structure is nearly identical to DynamoDB's; the advantage of Keyspaces is lower migration cost for existing Cassandra workloads, since you keep accessing data via CQL.
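To see how the on-demand figures combine, here is a small estimator. It is a sketch under stated assumptions: the default rates are the approximate on-demand and storage numbers quoted above (not authoritative pricing), the workload is already expressed in request units, and the names OnDemandRates and estimate_monthly_cost are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnDemandRates:
    """Approximate rates quoted in this article; replace with your Region's prices."""

    read_per_million: float = 0.297     # USD per million read request units
    write_per_million: float = 1.4846   # USD per million write request units
    storage_per_gb_month: float = 0.25  # USD per GB per month


def estimate_monthly_cost(
    read_units: int,
    write_units: int,
    storage_gb: float,
    rates: OnDemandRates = OnDemandRates(),
) -> float:
    """Return the estimated monthly on-demand bill in USD."""
    reads = read_units / 1_000_000 * rates.read_per_million
    writes = write_units / 1_000_000 * rates.write_per_million
    storage = storage_gb * rates.storage_per_gb_month
    return reads + writes + storage


if __name__ == "__main__":
    # Example workload: 100M read units, 20M write units, 50 GB stored.
    print(f"${estimate_monthly_cost(100_000_000, 20_000_000, 50):.2f}")
```

With these rates the write side dominates quickly: 20 million write units cost about as much as 100 million read units, which is worth keeping in mind for write-heavy Cassandra workloads.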

Summary - Choosing the Right Cassandra-Compatible Database

Choosing between Keyspaces as a Cassandra migration target and DynamoDB as an AWS-native option, based on your workload characteristics and existing assets, lets you build a distributed database strategy that fits. Both services are serverless and fully managed, so the decision comes down to compatibility versus integration: if CQL compatibility is required, Keyspaces is the right choice; if you prioritize tight integration with other AWS services, DynamoDB is the better fit.