How CloudFront's 600+ PoPs Work - Anycast Routing and the Cache Hierarchy

Learn how CloudFront routes user requests to the nearest PoP using Anycast, the two-tier structure of edge locations and regional edge caches, and the design factors that determine cache hit rates.

What Is a PoP (Point of Presence)?

CloudFront PoPs are clusters of cache servers distributed around the world. As of 2024, more than 600 PoPs are deployed across over 90 cities. A PoP's role is to cache content from origin servers (such as S3 buckets or EC2 instances) and serve it from a location physically close to the user, reducing latency.

If a user in Tokyo accesses an S3 bucket in Northern Virginia directly, the round trip across the Pacific incurs 150-200 ms of latency. If the content is cached at a Tokyo PoP, latency drops to 5-10 ms.

PoP sizes are not uniform. Large PoPs in cities such as Tokyo, London, and Virginia consist of thousands of servers and can cache large volumes of content, while smaller city PoPs have tens to hundreds of servers and limited cache capacity. CloudFront adjusts PoP capacity dynamically based on traffic volume, temporarily adding capacity during major events (sports broadcasts, product launches, etc.).
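The latency figures above can be sanity-checked from first principles. The Python sketch below estimates the best-case round-trip time between Tokyo and Northern Virginia from the great-circle distance and the signal speed in optical fiber (roughly 200 km per millisecond); the coordinates are approximate city centers chosen for illustration, not actual PoP locations.

```python
import math

# Signal speed in optical fiber is ~2/3 the speed of light in vacuum,
# i.e. roughly 200,000 km/s, or 200 km per millisecond.
FIBER_SPEED_KM_PER_MS = 200.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def min_rtt_ms(lat1, lon1, lat2, lon2):
    """Theoretical best-case round-trip time (propagation delay only)."""
    return 2 * great_circle_km(lat1, lon1, lat2, lon2) / FIBER_SPEED_KM_PER_MS

tokyo = (35.68, 139.69)      # approximate city-center coordinates (assumption)
virginia = (38.95, -77.45)   # near the us-east-1 region (assumption)

print(f"Tokyo -> Virginia best-case RTT: {min_rtt_ms(*tokyo, *virginia):.0f} ms")
```

The propagation-only bound comes out around 110 ms; real-world routing detours, queuing, and TCP handshakes push observed round trips toward the 150-200 ms range quoted above.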

Anycast Routing - How Users Are Directed to the Nearest PoP

CloudFront routes user requests to the nearest PoP through a combination of DNS-based routing and Anycast. When a user accesses a CloudFront distribution URL (e.g., d1234.cloudfront.net), DNS resolution happens first: CloudFront's DNS servers estimate the user's geographic location from the DNS resolver's IP address and return the IP address of the nearest PoP.

This is where Anycast comes in. With Anycast, multiple PoPs announce the same IP address simultaneously, and BGP (Border Gateway Protocol) routing automatically delivers packets to the topologically nearest PoP. DNS-based routing alone can select a suboptimal PoP when the resolver's location differs from the user's actual location (for example, a centralized corporate DNS server); Anycast compensates for this by always taking the shortest network path. Since 2022, CloudFront has also supported EDNS Client Subnet (ECS), in which the resolver passes the user's actual subnet information to CloudFront's DNS, enabling more accurate PoP selection.
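The BGP behavior described above can be illustrated with a toy model. This is not AWS's actual selection logic; the shared IP and the hop counts below are invented for the example. It only demonstrates the core idea: when several PoPs announce the same address, traffic lands on whichever PoP is fewest network hops away from the client.

```python
# Toy model of Anycast PoP selection (illustrative, not AWS internals).
ANYCAST_IP = "192.0.2.1"  # hypothetical address announced by every PoP below

# Hypothetical hop counts from one client's ISP to each announcing PoP.
pop_hops = {
    "tokyo": 4,
    "singapore": 9,
    "london": 14,
    "virginia": 16,
}

def route_anycast(pop_hops):
    """Return the PoP with the shortest network path (fewest hops),
    which is effectively what BGP path selection converges on."""
    return min(pop_hops, key=pop_hops.get)

print(route_anycast(pop_hops))  # a Tokyo client lands on the Tokyo PoP
```

A client in a different network would have different hop counts to the same announcing PoPs, so the same IP address resolves, at the routing layer, to a different physical destination.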

Two-Tier Cache - Edge Locations and Regional Edge Caches

CloudFront's cache has a two-tier structure. The first tier is the edge locations (PoPs), closest to users. The second tier is the regional edge caches (RECs), positioned between the edge locations and origin servers. There are 13 RECs worldwide, each with far larger cache capacity than an edge location.

The request flow works as follows: a request arrives at an edge location, and if the content is cached there, it is returned immediately (a cache hit). If not, the edge location queries the REC; if the REC has the content cached, it is returned from there, and only otherwise is it fetched from the origin server.

The advantage of this two-tier structure is a significant reduction in requests to the origin. Unpopular, long-tail content is easily evicted from an individual edge location's cache, but the REC's larger capacity raises the probability that it is retained. According to data published by AWS, introducing RECs has reduced origin requests by up to 60% in some cases.
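The edge-then-REC-then-origin flow can be sketched as a pair of LRU caches with different capacities. Everything here (class names, capacities, the 20-object catalog) is illustrative, not CloudFront internals; the point is to show how the larger second tier absorbs requests that the small first tier evicts.

```python
from collections import OrderedDict

class LruCache:
    """Minimal LRU cache standing in for one cache tier."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)     # mark as recently used
            return self.store[key]
        return None

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used

def fetch(url, edge, rec, origin, stats):
    """Edge location first, then the regional edge cache, then the origin."""
    body = edge.get(url)
    if body is not None:
        stats["edge_hit"] += 1
        return body
    body = rec.get(url)
    if body is not None:
        stats["rec_hit"] += 1
    else:
        stats["origin_fetch"] += 1
        body = origin[url]       # origin fetch
        rec.put(url, body)       # populate the REC on the way back
    edge.put(url, body)          # and the edge location
    return body

# 20 long-tail objects, an edge that holds only 5, a REC that holds 50.
origin = {f"/img/{i}": f"bytes-{i}" for i in range(20)}
edge, rec = LruCache(capacity=5), LruCache(capacity=50)
stats = {"edge_hit": 0, "rec_hit": 0, "origin_fetch": 0}
for _ in range(2):               # the same 20 objects requested twice
    for url in origin:
        fetch(url, edge, rec, origin, stats)
print(stats)  # the second pass never touches the origin
```

In the second pass, every object has already been evicted from the tiny edge cache, yet all 20 requests are served by the REC: origin fetches stay at the 20 from the first pass.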

Design Factors That Determine Cache Hit Rates

CloudFront cache hit rates can range from 50% to 99% depending on design, and the single biggest factor is cache key design. By default, CloudFront uses the full URL path as the cache key, but if you configure query strings, headers, or cookies to be included, the same content ends up under different cache keys, fragmenting the cache and lowering the hit rate. For example, if query strings carry tracking parameters (utm_source, utm_medium, etc.), each parameter combination creates a separate cache entry for the same page. Using a CloudFront cache policy to whitelist exactly which query strings belong in the cache key, and excluding tracking parameters, improves hit rates dramatically.

TTL (Time to Live) settings also matter. If the TTL is too short, cached objects expire frequently and requests to the origin increase. The most effective strategy for static assets (images, CSS, JS) is to set a one-year TTL and include a content hash in the filename (app.a1b2c3.js), so that every content update automatically switches to a new URL and therefore a new cache entry.
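The query-string whitelisting described above amounts to normalizing the cache key before lookup. Here is a minimal Python sketch of that normalization; the allow-list of parameters that actually change the response is a hypothetical assumption for this example, and a real CloudFront cache policy does this for you at the edge.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical allow-list: only these params actually change the response.
ALLOWED_PARAMS = {"page", "lang"}

def cache_key(url):
    """Drop non-whitelisted query strings (and sort the rest) so that
    equivalent URLs collapse onto a single cache entry."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS)
    return parts.path + ("?" + urlencode(kept) if kept else "")

a = cache_key("/articles?page=2&utm_source=twitter&utm_medium=social")
b = cache_key("/articles?utm_source=newsletter&page=2")
print(a, a == b)  # both URLs normalize to the same key: one entry, not three
```

Sorting the surviving parameters also makes the key order-independent, so ?page=2&lang=en and ?lang=en&page=2 share an entry as well.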

Origin Shield - A Third Cache Layer to Protect the Origin

Origin Shield, introduced in 2020, adds one more cache layer between the RECs and the origin. When Origin Shield is enabled, all origin fetches from the RECs are funneled through a single Origin Shield location; if Origin Shield has the content cached, no request reaches the origin at all.

Origin Shield's greatest benefit is mitigating the "thundering herd" problem: the moment a popular object's TTL expires, requests from PoPs worldwide flood the origin simultaneously. Without Origin Shield, origin fetches can occur from all 13 RECs at once; with it, they are consolidated into a single request.

Origin Shield pricing is request-based ($0.0090 per 10,000 requests), and weighed against the reduction in origin server load and scaling costs, it is a cost-effective investment in most cases. This is especially true when the origin is a pay-per-request service such as a Lambda function URL or API Gateway, where the request reduction translates directly into cost savings. For a systematic treatment of CDN design and optimization, dedicated books on the subject are a helpful next step.
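The pricing trade-off can be checked with back-of-the-envelope arithmetic, using the $0.0090 per 10,000 requests figure quoted above (actual rates vary by Origin Shield Region). The hit rates and traffic volume below are hypothetical, and the model is simplified: it bills only the requests that miss both cache tiers and therefore pass through Origin Shield.

```python
# Rate quoted above; Origin Shield pricing actually varies by Region.
SHIELD_RATE = 0.0090 / 10_000   # dollars per request routed through the shield

def origin_shield_cost(monthly_requests, edge_hit_rate, rec_hit_rate):
    """Simplified model: only requests that miss both the edge locations
    and the RECs travel through Origin Shield and incur the charge."""
    to_shield = monthly_requests * (1 - edge_hit_rate) * (1 - rec_hit_rate)
    return to_shield * SHIELD_RATE

# Hypothetical workload: 1B requests/month, 80% edge hits,
# and 50% of the remainder served by the RECs.
cost = origin_shield_cost(1_000_000_000, 0.80, 0.50)
print(f"${cost:,.2f}/month")  # $90.00/month
```

At these assumed hit rates, shielding 100 million origin-bound requests costs about $90 a month, which is easy to compare against the origin-side scaling (or per-request Lambda/API Gateway) costs those requests would otherwise incur.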