Why AWS APIs Return XML - The Evolution from Query APIs to REST JSON

Explore the historical reasons why S3 and EC2 APIs return XML responses, the differences between Query APIs and REST APIs, the evolution of authentication from Signature V2 to V4, and the complexity that SDKs abstract away.

API Design in 2006 - When XML Was the Standard

When S3 and EC2 launched in 2006, XML was the dominant data format for web APIs. SOAP (Simple Object Access Protocol) was the mainstream approach for enterprise web services, and strict type definitions via XML Schema combined with service descriptions via WSDL (Web Services Description Language) were considered "proper" API design. Douglas Crockford had begun promoting JSON in 2001, but in 2006 it was not yet widely adopted and was often regarded as "lightweight but lacking type safety." AWS's early APIs reflect the design philosophy of this era.

The EC2 API uses a format called the "Query API": the action name and parameters are passed as query parameters of an HTTP GET request (or as a form-encoded POST body), and responses are returned in XML. For example, a request to list EC2 instances looks like https://ec2.amazonaws.com/?Action=DescribeInstances&Version=2016-11-15 (a real request must also carry authentication information). The S3 API follows a REST style, using HTTP methods (GET, PUT, DELETE) and paths to operate on resources, but its responses, such as bucket listings and error documents, are also XML.
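As a rough illustration of the Query API shape described above, the following Python sketch builds the example URL and extracts instance IDs from an XML response. The helper names (build_query_url, instance_ids) are made up for this example, and SAMPLE_RESPONSE is a hand-written, heavily trimmed stand-in for a real DescribeInstances response, not captured output. No request is actually sent, since a real call would also need to be signed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_query_url(endpoint: str, action: str, version: str, **params: str) -> str:
    """Build a Query API URL: the action and its parameters travel in the query string."""
    query = {"Action": action, "Version": version, **params}
    return f"{endpoint}/?{urlencode(query)}"


# Hand-written, simplified sample (assumption): real responses contain many more fields.
SAMPLE_RESPONSE = """<DescribeInstancesResponse xmlns="http://ec2.amazonaws.com/doc/2016-11-15/">
  <requestId>example-request-id</requestId>
  <reservationSet>
    <item>
      <instancesSet>
        <item><instanceId>i-0123456789abcdef0</instanceId></item>
        <item><instanceId>i-0fedcba9876543210</instanceId></item>
      </instancesSet>
    </item>
  </reservationSet>
</DescribeInstancesResponse>"""


def instance_ids(xml_text: str) -> list[str]:
    """Collect every <instanceId> element, ignoring the XML namespace (Python 3.8+)."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall(".//{*}instanceId")]


url = build_query_url("https://ec2.amazonaws.com", "DescribeInstances", "2016-11-15")
print(url)
print(instance_ids(SAMPLE_RESPONSE))
```

The namespace wildcard keeps the parsing code independent of the API version embedded in the xmlns attribute, which is one small example of the bookkeeping an SDK normally does for you.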

The Curse of Backward Compatibility - Why XML Cannot Be Retired

One of AWS's most important design principles is to avoid breaking a published API. The EC2 Query API has been in continuous service since 2006, nearly 20 years, and it still returns XML. Because a very large number of customer applications depend on this API, switching the response format to JSON would be a massive breaking change. AWS handles this by adopting JSON for newer services while leaving the wire formats of older services largely as they are.

Services launched from 2012 onward (DynamoDB, Lambda, API Gateway, etc.) almost universally use JSON. DynamoDB's API is an RPC-style JSON protocol rather than a resource-oriented REST API: every operation is an HTTP POST carrying a JSON body with a Content-Type of application/x-amz-json-1.0. Lambda's API is REST-style with JSON bodies. Meanwhile, early services like EC2, S3, and SNS still return XML responses. SQS, which was XML-only for most of its history, has more recently gained a JSON protocol alongside the original Query API, which illustrates the usual pattern of adding a format rather than replacing one. AWS SDKs absorb these differences internally, so developers using SDKs rarely need to care whether a service speaks XML or JSON.
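To make the contrast with the XML-based APIs concrete, here is a minimal Python sketch of what a DynamoDB request looks like on the wire. The function name build_dynamodb_request, the table name ExampleTable, and the key value are hypothetical. The sketch only assembles headers and body; an actual call would be an HTTP POST to a regional DynamoDB endpoint and would additionally need a request signature.

```python
import json


def build_dynamodb_request(operation: str, payload: dict) -> tuple[dict, bytes]:
    """Assemble the headers and JSON body for a DynamoDB operation (unsigned)."""
    headers = {
        # The content type named in the text above.
        "Content-Type": "application/x-amz-json-1.0",
        # The operation is selected by a header, not by the URL path.
        "X-Amz-Target": f"DynamoDB_20120810.{operation}",
    }
    body = json.dumps(payload).encode("utf-8")
    return headers, body


# Hypothetical table and key, for illustration only.
headers, body = build_dynamodb_request(
    "GetItem",
    {"TableName": "ExampleTable", "Key": {"pk": {"S": "user#123"}}},
)
print(headers)
print(body.decode("utf-8"))
```

Compared with the Query API, the action name moves from a query parameter into a header, and both request and response are JSON documents.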

Signature V4 - Signing Every Request

Another distinctive feature of AWS APIs is that almost every request requires a cryptographic signature. The current standard, Signature Version 4 (SigV4), builds a normalized representation of the request (HTTP method, URI, query string, headers, and a hash of the body) and signs it with HMAC-SHA256 using a key derived from the secret access key. The signature is usually sent in the Authorization header (it can also be carried in query parameters, as with presigned URLs), and AWS recomputes it on the server side to verify the request.

The SigV4 signing process consists of four steps. First, creating the Canonical Request: a normalized string made of the HTTP method, URI, query string, headers, the list of signed headers, and the hash of the payload. Second, creating the String to Sign: concatenating the algorithm name, the request timestamp, the credential scope (date, region, and service name), and the hash of the canonical request. Third, deriving the signing key: starting from the secret access key, applying HMAC-SHA256 successively with the date, the region, the service, and a fixed terminator string. Fourth, computing the signature: applying HMAC-SHA256 to the string to sign with the signing key.

Implementing this process by hand is error-prone, which is why AWS SDKs do it automatically. When calling AWS APIs without an SDK, you must either implement the signing yourself or rely on a tool that does it for you (recent versions of curl, for example, provide an --aws-sigv4 option). Debugging a hand-rolled implementation is difficult because a single-character difference in normalization produces a completely different signature.
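The four steps above can be sketched in Python using only the standard library. This is a simplified illustration, not a complete signer: it assumes a GET request with an empty body, signs only the host and x-amz-date headers, expects an already-encoded path, and ignores session tokens and edge cases in query-string sorting. The function names are invented for this example, and the credentials are the placeholder values commonly used in documentation, not real keys.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def _sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Step 3: chain HMACs over date, region, service, and a fixed terminator."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign_get_request(access_key: str, secret_key: str, host: str, path: str,
                     params: dict, region: str, service: str, amz_date: str) -> str:
    """Return an Authorization header value for a body-less GET (simplified)."""
    date = amz_date[:8]  # YYYYMMDD part of a timestamp like 20150830T123600Z

    # Step 1: canonical request.
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    canonical_headers = f"host:{host}\nx-amz-date:{amz_date}\n"
    signed_headers = "host;x-amz-date"
    payload_hash = _sha256_hex(b"")  # empty body
    canonical_request = "\n".join(
        ["GET", path, canonical_query, canonical_headers, signed_headers, payload_hash]
    )

    # Step 2: string to sign.
    scope = f"{date}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope, _sha256_hex(canonical_request.encode("utf-8"))]
    )

    # Steps 3 and 4: derive the key, then sign.
    signing_key = derive_signing_key(secret_key, date, region, service)
    signature = hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    return (f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}")


# Placeholder credentials and a fixed timestamp, for illustration only.
auth = sign_get_request(
    "AKIDEXAMPLE", "wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY",
    "ec2.amazonaws.com", "/",
    {"Action": "DescribeInstances", "Version": "2016-11-15"},
    "us-east-1", "ec2", "20150830T123600Z",
)
print(auth)
```

The request would then be sent with this value in the Authorization header together with the same Host and X-Amz-Date headers that were signed. Even in this reduced form, the number of places where normalization can go wrong explains why most developers leave signing to an SDK.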

The Complexity SDKs Abstract Away

AWS SDKs are a massive abstraction layer that hides API complexity from developers. They handle a long list of concerns internally: request signing (SigV4), XML/JSON serialization and deserialization, retry logic (with exponential backoff), error handling (distinguishing transient from persistent errors), pagination helpers (fetching large result sets across multiple requests), regional endpoint resolution, and automatic refresh of temporary credentials (for example, those obtained for an IAM role via AssumeRole). The AWS SDK for JavaScript v3 implements these steps as a middleware stack to which developers can add custom middleware; boto3 (Python) exposes comparable hooks through its event system.

The SDK's retry logic is particularly important. AWS APIs apply throttling (rate limiting), and sending too many requests in a short period returns a throttling error. The exact form depends on the service: some return HTTP 429 (Too Many Requests), while others return a 400 or 503 status with an error code such as ThrottlingException, RequestLimitExceeded, or SlowDown. SDKs automatically retry such errors with exponential backoff, where the maximum wait doubles on each attempt (for example 1 second, 2 seconds, 4 seconds...) and random jitter is typically added, so developers seldom need to implement retry logic themselves.
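The retry behavior described above can be approximated with a few lines of Python. This is a generic sketch of exponential backoff with full jitter, not the algorithm of any particular SDK: ThrottlingError is a hypothetical exception standing in for whatever throttling response a service returns, and the default attempt count and delays are arbitrary illustrative values.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling response from a service."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=20.0,
                      sleep=time.sleep):
    """Call operation(), retrying on ThrottlingError with exponential backoff and jitter.

    The upper bound on the wait doubles after each failed attempt (capped at
    max_delay); the actual wait is drawn uniformly between 0 and that bound.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            bound = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, bound))
```

The sleep parameter is injectable so the function can be exercised without real waiting. Randomizing the delay, rather than sleeping for exactly the doubled interval, keeps many throttled clients from retrying in lockstep.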

API Evolution and Future Direction

AWS API design has evolved significantly over 20 years. From the early Query API + XML, to REST + JSON, and more recently to GraphQL (AppSync) and event-driven models (EventBridge), the API paradigms themselves have diversified. What has remained consistent is the principle that once an API is published, its existing behavior does not change. Because each Query API request names the version it targets, a client written against an old EC2 API version is not affected when newer versions add features. New features are added as new API versions, and retiring an old one is rare. This backward compatibility is fundamental to AWS's reliability: systems that enterprises spent years building won't suddenly stop working because of an API change.

On the other hand, this principle also creates technical debt. Design decisions from 20 years ago (XML responses, the Query API format) are still maintained today, leading new developers to wonder "why such an old format?" The answer is "for backward compatibility," which reflects AWS's choice to put customer trust ahead of a cleaner design. To learn the history and principles of API design systematically, specialized books can be helpful.