How Far Does CloudFormation Drift Detection See - How It Tracks Manual Changes and Its Limitations
Learn how CloudFormation drift detection internally compares the current state of resources with the expected state defined in templates, the boundary between detectable and undetectable changes, and strategies for drift remediation.
What Is Drift - The Gap Between Templates and Reality
CloudFormation drift refers to a state where the expected state defined in a stack's template differs from the actual state of a resource. For example, if the template defines an EC2 instance type as t3.medium but someone changed it to t3.large via the console, drift has occurred. Drift is a fundamental challenge of Infrastructure as Code (IaC). Even when infrastructure is managed with templates, manual changes through the console or CLI cause the template and reality to diverge. The next time you update the stack, the manual changes may be overwritten, causing unintended behavior, or the template changes may conflict with the manual changes and produce errors. Drift detection, introduced in 2018, detects and visualizes this divergence.
How Drift Detection Works Internally
When you run drift detection, CloudFormation performs the following process for each resource in the stack. First, it determines the expected state: the property values defined in the template, including values supplied through stack parameters. Second, it retrieves the actual state of each resource via AWS APIs, using the read APIs appropriate to each resource type: for EC2 instances it calls DescribeInstances, for S3 buckets it calls APIs such as GetBucketPolicy and GetBucketEncryption, and so on. Third, it compares the expected state and actual state on a property-by-property basis. At the resource level, a resource with at least one differing property is reported as MODIFIED, a resource that no longer exists as DELETED, and a resource that matches as IN_SYNC. Within a MODIFIED resource, each property difference carries a type: NOT_EQUAL when the current value differs from the expected one, ADD when a value has been added to a list-type property, and REMOVE when a property defined in the template is missing from the current configuration. Drift detection is an asynchronous process: you start it, then poll for the result. For large stacks (hundreds of resources), retrieving the state of all resources can take several minutes, because the API calls for each resource are subject to throttling and are issued with limited parallelism.
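The asynchronous workflow described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes `cfn` is a boto3 CloudFormation client (for example `boto3.client("cloudformation")`), and the function names, polling interval, and row layout are illustrative choices rather than anything CloudFormation prescribes.

```python
import time


def wait_for_drift_detection(cfn, stack_name, poll_seconds=5.0, max_polls=120):
    """Start drift detection and poll until it leaves DETECTION_IN_PROGRESS.

    `cfn` is assumed to be a boto3 CloudFormation client.
    Returns the stack-level status, e.g. "DRIFTED" or "IN_SYNC".
    """
    detection_id = cfn.detect_stack_drift(StackName=stack_name)["StackDriftDetectionId"]
    for _ in range(max_polls):
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"drift detection for {stack_name} did not finish")
    if status["DetectionStatus"] == "DETECTION_FAILED":
        # Design choice of this sketch: treat a failed detection as an error.
        raise RuntimeError(status.get("DetectionStatusReason", "drift detection failed"))
    return status["StackDriftStatus"]


def list_property_differences(cfn, stack_name):
    """Collect one row per property difference on MODIFIED or DELETED resources."""
    kwargs = {
        "StackName": stack_name,
        "StackResourceDriftStatusFilters": ["MODIFIED", "DELETED"],
    }
    rows = []
    while True:
        page = cfn.describe_stack_resource_drifts(**kwargs)
        for drift in page["StackResourceDrifts"]:
            # A DELETED resource has no property differences; emit one empty row for it.
            for diff in drift.get("PropertyDifferences") or [{}]:
                rows.append({
                    "resource": drift["LogicalResourceId"],
                    "status": drift["StackResourceDriftStatus"],
                    "path": diff.get("PropertyPath"),
                    "type": diff.get("DifferenceType"),
                    "expected": diff.get("ExpectedValue"),
                    "actual": diff.get("ActualValue"),
                })
        token = page.get("NextToken")
        if not token:
            return rows
        kwargs["NextToken"] = token
```

Passing the client in as an argument, instead of creating it inside the functions, keeps the sketch testable with a stub and leaves region and credential handling to the caller.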
Detectable and Undetectable Changes
Drift detection has important limitations. First, not all resource types support drift detection. As of 2024, out of approximately 500 resource types, only about 300 support drift detection. Unsupported resource types are excluded from drift detection and reported as NOT_CHECKED. Second, only properties explicitly defined in the template are compared. Properties omitted from the template where default values apply will not be detected as drift even if manually changed. For example, if you omit the Monitoring property of an EC2 instance in the template (default: false), enabling detailed monitoring from the console will not be detected as drift. Third, resource deletion can be detected, but resources created outside the stack cannot. If someone manually launches an EC2 instance unrelated to the stack, CloudFormation has no knowledge of it. Detecting resources outside the stack requires AWS Config.
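The second limitation is easier to see in code. The following is a deliberately simplified model of the comparison, not CloudFormation's actual implementation: it treats each state as a flat dictionary, the function name is hypothetical, and it ignores the ADD case for list-type properties. The labels NOT_EQUAL and REMOVE mirror the difference types that the drift API reports for individual properties.

```python
def compare_declared_properties(expected, actual):
    """Compare only the properties that the template declares.

    `expected` holds the template-defined properties, `actual` the live ones.
    Returns a list of (property, difference_type, expected_value, actual_value).
    """
    differences = []
    for name, expected_value in expected.items():
        if name not in actual:
            differences.append((name, "REMOVE", expected_value, None))
        elif actual[name] != expected_value:
            differences.append((name, "NOT_EQUAL", expected_value, actual[name]))
    # Keys that exist only in `actual` are never visited: this is the blind spot.
    return differences


# The template declares only InstanceType; Monitoring is left to its default.
template_props = {"InstanceType": "t3.medium"}
live_props = {"InstanceType": "t3.large", "Monitoring": True}

print(compare_declared_properties(template_props, live_props))
# [('InstanceType', 'NOT_EQUAL', 't3.medium', 't3.large')]
```

Because the loop iterates over the template's properties only, the manually enabled Monitoring setting never enters the comparison. Declaring the property explicitly in the template, even with its default value, is what makes it visible to drift detection.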
Drift Remediation Strategies
When drift is detected, there are three remediation approaches. The first is reverting the manual change. If the template definition is correct and the manual change was an error, restore the resource to the template state. Note that resubmitting the stack with an unchanged template and parameters does not do this: CloudFormation compares the new template against the previous one rather than against the live resource, finds nothing to change, and rejects the update with "No updates are to be performed." Either undo the change directly on the resource, or update the drifted property through the template (for example, set it to the drifted value and then back to the intended one) so that CloudFormation actually issues the change. Also note that changes to some properties trigger resource replacement. The second is updating the template to match reality. If the manual change was legitimate (e.g., a setting changed during an emergency response), update the template to reflect the actual state. CloudFormation's import feature can also bring manually created resources into the stack. The third is tolerating the drift. Some properties (such as an Auto Scaling group's DesiredCapacity) are expected to change dynamically during operation. Drift on these properties is normal behavior and does not require remediation.
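To keep tolerated drift from drowning out drift that needs action, the results can be triaged before anyone looks at them. The sketch below assumes its input has the shape of the `StackResourceDrifts` list returned by the DescribeStackResourceDrifts API. The `TOLERATED` allowlist and the function name are hypothetical, and the `/DesiredCapacity` path assumes the slash-prefixed form in which property paths are reported.

```python
# Hypothetical allowlist of (resource type, property path) pairs whose drift is expected.
TOLERATED = {
    ("AWS::AutoScaling::AutoScalingGroup", "/DesiredCapacity"),
}


def triage_drifts(resource_drifts, tolerated=TOLERATED):
    """Split drift findings into (actionable, ignored) lists.

    Each entry is a (logical resource id, property path) tuple;
    a deleted resource is always actionable and has path None.
    """
    actionable, ignored = [], []
    for drift in resource_drifts:
        logical_id = drift["LogicalResourceId"]
        if drift["StackResourceDriftStatus"] == "DELETED":
            actionable.append((logical_id, None))
            continue
        for diff in drift.get("PropertyDifferences", []):
            entry = (logical_id, diff["PropertyPath"])
            if (drift["ResourceType"], diff["PropertyPath"]) in tolerated:
                ignored.append(entry)
            else:
                actionable.append(entry)
    return actionable, ignored
```

Keeping the allowlist explicit and small makes the decision to tolerate a given property reviewable, rather than an unwritten habit of ignoring certain alerts.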
Mechanisms to Prevent Drift
Rather than detecting drift after the fact, building mechanisms to prevent drift from occurring in the first place is more effective. CloudFormation stack policies prohibit updates to specific resources. However, stack policies only apply during stack updates and cannot prevent direct changes via the console or CLI. The most reliable approach is using IAM policies to deny resource changes from sources other than CloudFormation. For example, you can set a policy that allows the ModifyInstanceAttribute action for EC2 instances only from CloudFormation's service role, denying direct calls from users. Using AWS Config rules for continuous drift monitoring is also effective. The cloudformation-stack-drift-detection-check rule periodically runs drift detection and reports stacks with drift as non-compliant. Combined with Config remediation actions, automated drift remediation is also possible.
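The IAM approach might look like the following policy document, built here as a Python dictionary so it can be serialized to JSON. This is a sketch under stated assumptions: the role ARN and account ID are placeholders, the function name is hypothetical, and where the policy is attached (for example a service control policy or a permissions boundary) depends on how the account is organized. It denies only ec2:ModifyInstanceAttribute, so other mutating actions would need their own entries.

```python
import json


def deny_direct_instance_changes(cfn_role_arn):
    """Build a policy that denies ec2:ModifyInstanceAttribute to every
    principal except the given CloudFormation service role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyModifyInstanceAttributeOutsideCfn",
                "Effect": "Deny",
                "Action": "ec2:ModifyInstanceAttribute",
                "Resource": "*",
                # aws:PrincipalArn is a global condition key; for an assumed
                # role it is compared against the role's ARN.
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": cfn_role_arn}
                },
            }
        ],
    }


# Placeholder ARN for illustration only.
policy = deny_direct_instance_changes(
    "arn:aws:iam::123456789012:role/example-cfn-service-role"
)
print(json.dumps(policy, indent=2))
```

An explicit deny takes precedence over any allow, so this statement blocks direct calls even from users with broad EC2 permissions, while stack updates that run under the service role continue to work.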