Building a Cloud Data Warehouse with Amazon Redshift - Choosing Between Serverless and RA3

Learn the criteria for choosing between Serverless and RA3 provisioned clusters, and how to prevent data silos using data sharing and Spectrum for data lake integration.

About 3 min read
Last updated: 2025-12-31

Redshift Architecture Overview

Redshift is a cloud data warehouse built on columnar storage and a massively parallel processing (MPP) architecture. Because columnar storage reads only the columns a query references, analytical queries incur far less I/O than on a row-oriented RDBMS. The leader node parses queries and generates execution plans, while compute nodes process data in parallel. RA3 nodes separate compute from storage: data lives in Redshift Managed Storage (RMS), which is backed by S3, and frequently accessed data is cached on local SSDs, so most reads of hot data do not pay S3 read latency.
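To make the I/O argument concrete, here is a minimal back-of-the-envelope sketch. The table, column names, and byte widths are hypothetical, and the model ignores compression, block sizes, and zone maps, so it illustrates only the direction of the effect and does not measure Redshift itself.

```python
# Hypothetical table: column name -> average bytes per value (assumed figures).
COLUMN_WIDTHS = {
    "order_id": 8,
    "customer_id": 8,
    "order_date": 4,
    "amount": 8,
    "notes": 200,
}


def bytes_scanned_row_store(row_count, column_widths):
    """A row-oriented layout reads every column of every row it scans."""
    return row_count * sum(column_widths.values())


def bytes_scanned_column_store(row_count, column_widths, needed_columns):
    """A columnar layout reads only the columns the query references."""
    return row_count * sum(column_widths[name] for name in needed_columns)


if __name__ == "__main__":
    rows = 1_000_000
    # e.g. SELECT order_date, SUM(amount) ... GROUP BY order_date
    needed = ["order_date", "amount"]
    row_bytes = bytes_scanned_row_store(rows, COLUMN_WIDTHS)
    col_bytes = bytes_scanned_column_store(rows, COLUMN_WIDTHS, needed)
    print(f"row store:    {row_bytes:,} bytes")
    print(f"column store: {col_bytes:,} bytes")
    print(f"reduction:    {row_bytes / col_bytes:.0f}x")
```

In this toy model the wide `notes` column dominates the row-store scan even though the query never touches it, which is why wide fact tables tend to benefit most from a columnar layout.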

Choosing Between Serverless and Provisioned Clusters

Redshift Serverless automatically scales capacity in RPU (Redshift Processing Unit) increments, and you pay no compute charges while no queries are running (storage is still billed). It's ideal for intermittent workloads such as periodic BI dashboard queries, ad-hoc analysis, and development/test environments. Provisioned clusters (RA3), on the other hand, are suited to always-on workloads. For production environments with 24/7 continuous query execution, combining RA3 with Reserved Instances can be more cost-effective than Serverless. As a rough rule of thumb, if daily query execution time is under 8 hours, Serverless is usually the better choice; if it exceeds 8 hours, RA3 provisioned clusters tend to be more economical. The actual break-even point depends on the RPU capacity, node count, and discounts involved.

Data Sharing and Data Lake Integration with Spectrum

Data sharing lets you share live data between Redshift clusters without copying it. A producer cluster creates a datashare, and a consumer cluster creates a database from it. Because no data is copied, the consumer always sees the producer's latest data. This is effective when departments operate independent clusters but need to share common master data.

Redshift Spectrum lets you run SQL queries directly on data stored in S3. Combined with columnar formats like Parquet or ORC, you can analyze petabyte-scale data lakes without loading data into Redshift. With the Glue Data Catalog as the metadata store, both Redshift and Athena can query the same table definitions.
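The producer/consumer flow and the Spectrum setup can be sketched as the SQL statements each side would run. The Python helpers below only assemble those statements as strings; the share, schema, table, and database names, the namespace IDs, and the IAM role ARN are all hypothetical placeholders, and the helpers do no identifier validation, so treat this as an outline and not a finished tool.

```python
def producer_statements(share, schema, tables, consumer_namespace):
    """SQL run on the producer: create the datashare, add objects, grant usage."""
    statements = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    statements += [
        f"ALTER DATASHARE {share} ADD TABLE {schema}.{table};" for table in tables
    ]
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';"
    )
    return statements


def consumer_statement(local_db, share, producer_namespace):
    """SQL run on the consumer: expose the datashare as a local database."""
    return (
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    )


def spectrum_schema_statement(schema, glue_database, iam_role_arn):
    """SQL that maps a Glue Data Catalog database to an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema} FROM DATA CATALOG "
        f"DATABASE '{glue_database}' IAM_ROLE '{iam_role_arn}';"
    )


if __name__ == "__main__":
    # All identifiers below are placeholders for illustration only.
    for sql in producer_statements(
        "master_share", "public", ["customers", "products"], "<consumer-namespace-id>"
    ):
        print(sql)
    print(consumer_statement("shared_master", "master_share", "<producer-namespace-id>"))
    print(spectrum_schema_statement(
        "lake", "sales_lake_db", "<iam-role-arn-with-s3-and-glue-access>"
    ))
```

Once the external schema exists, tables defined in the Glue database can be queried as `lake.<table>` alongside local tables, and the shared database is read with three-part names such as `shared_master.public.customers`.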

Redshift Pricing

Redshift Serverless is billed by RPU (Redshift Processing Unit) usage. Base capacity can be set as low as 8 RPUs, and compute costs approximately $0.375 per RPU-hour; no compute charges apply when queries aren't running. For provisioned clusters, RA3.xlplus costs approximately $1.086 per node per hour (about $782 per month at 720 hours), with Reserved Instance discounts of up to 64%. Redshift Managed Storage costs approximately $0.024 per GB per month. Prices vary by region and change over time, so check the current pricing page before committing. These rates are the basis for the 8-hour rule of thumb: below roughly 8 hours of daily query execution, Serverless is more economical; for always-on workloads above that, RA3 provisioned clusters are the better choice.
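The rule of thumb can be checked with a short calculation using the approximate prices quoted above. This sketch assumes, purely for illustration, that an 8-RPU Serverless workgroup is compared against a single RA3.xlplus node; the two are not equivalent in capacity, and storage costs are left out because they apply in both cases.

```python
# Approximate prices quoted in this article (USD); they vary by region.
SERVERLESS_PRICE_PER_RPU_HOUR = 0.375
RA3_XLPLUS_PRICE_PER_NODE_HOUR = 1.086
MAX_RESERVED_DISCOUNT = 0.64


def serverless_daily_cost(active_hours, rpus=8):
    """Compute cost for the hours per day in which queries actually run."""
    return active_hours * rpus * SERVERLESS_PRICE_PER_RPU_HOUR


def provisioned_daily_cost(nodes=1, discount=0.0):
    """A provisioned cluster is billed for all 24 hours, busy or idle."""
    return 24 * nodes * RA3_XLPLUS_PRICE_PER_NODE_HOUR * (1 - discount)


def break_even_hours(rpus=8, nodes=1, discount=0.0):
    """Daily active hours at which Serverless cost equals provisioned cost."""
    serverless_hourly = rpus * SERVERLESS_PRICE_PER_RPU_HOUR
    return provisioned_daily_cost(nodes, discount) / serverless_hourly


if __name__ == "__main__":
    print(f"on-demand break-even: {break_even_hours():.2f} h/day")
    print(
        "with max RI discount: "
        f"{break_even_hours(discount=MAX_RESERVED_DISCOUNT):.2f} h/day"
    )
```

Under these assumptions the on-demand break-even lands at roughly 8.7 hours per day, which is consistent with the 8-hour guideline. With the maximum Reserved Instance discount it drops to about 3.1 hours, so the threshold moves considerably once a commitment is in place or when the RPU and node counts differ from this example.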

Summary

Redshift is a cloud data warehouse that delivers high-speed analytics on petabyte-scale data. A phased approach of starting small with Serverless and migrating to provisioned clusters as workloads grow is effective. By leveraging data sharing and Spectrum, you can prevent data silos and achieve unified analytics across your data lake.
