Security Groups Are Stateful, NACLs Are Stateless - The Design Decision Behind This Difference

Learn how Security Groups achieve statefulness through connection tracking, why NACLs are stateless, how the ephemeral port pitfall arises, and how to combine both for defense in depth.

The Difference Between Stateful and Stateless

A Security Group is a stateful firewall. Responses to requests allowed by inbound rules are automatically permitted regardless of outbound rules, and responses to requests allowed by outbound rules are automatically permitted regardless of inbound rules. A NACL (Network Access Control List) is a stateless firewall. Inbound and outbound rules are evaluated independently, so a response to a request allowed by an inbound rule is still blocked unless an outbound rule explicitly allows it.

This difference stems from how each is implemented. Security Groups maintain a connection tracking table that records the state of each connection (source IP and port, destination IP and port, and protocol). When a response packet arrives, it is matched against the connection tracking table, and if it belongs to an existing connection, it is automatically allowed. NACLs do not maintain a connection tracking table and evaluate each packet independently.
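The contrast can be sketched with a toy model. The classes below are hypothetical and only mimic the behavior described above: both filters hold port-based allow rules, but only the stateful one remembers connections it has admitted. Real Security Groups and NACLs match on far more than a destination port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str = "tcp"

    def reply(self) -> "Packet":
        """Return the packet that travels in the opposite direction."""
        return Packet(self.dst_ip, self.dst_port, self.src_ip, self.src_port, self.protocol)


class StatefulFirewall:
    """Toy Security-Group-like filter: allow rules plus a connection table."""

    def __init__(self, inbound_ports: set, outbound_ports: set):
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports
        self.tracked = set()  # packets that opened an allowed connection

    def _check(self, pkt: Packet, allowed_ports: set) -> bool:
        # A reply to a tracked connection is allowed without consulting rules.
        if pkt.reply() in self.tracked:
            return True
        if pkt.dst_port in allowed_ports:
            self.tracked.add(pkt)
            return True
        return False

    def inbound(self, pkt: Packet) -> bool:
        return self._check(pkt, self.inbound_ports)

    def outbound(self, pkt: Packet) -> bool:
        return self._check(pkt, self.outbound_ports)


class StatelessFirewall:
    """Toy NACL-like filter: every packet is judged by the rules alone."""

    def __init__(self, inbound_ports: set, outbound_ports: set):
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports

    def inbound(self, pkt: Packet) -> bool:
        return pkt.dst_port in self.inbound_ports

    def outbound(self, pkt: Packet) -> bool:
        return pkt.dst_port in self.outbound_ports


request = Packet("203.0.113.10", 40000, "10.0.1.5", 443)
sg = StatefulFirewall(inbound_ports={443}, outbound_ports=set())
nacl = StatelessFirewall(inbound_ports={443}, outbound_ports=set())

sg_request_ok = sg.inbound(request)
sg_response_ok = sg.outbound(request.reply())      # allowed via the tracking table
nacl_request_ok = nacl.inbound(request)
nacl_response_ok = nacl.outbound(request.reply())  # blocked: no outbound rule matches
```

With no outbound rules at all, the stateful filter still lets the response out because the request was tracked, while the stateless filter drops it.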

Why Two Types of Firewalls Exist

While Security Groups alone might seem sufficient, NACLs exist because of the defense-in-depth principle. Security Groups operate at the instance level (more precisely, the ENI level), while NACLs operate at the subnet level, so the two provide protection at different layers. If a Security Group is misconfigured or tampered with (for example, if an attacker with IAM permissions modifies its rules), the NACL serves as a backup. Since NACLs apply to the entire subnet, they maintain protection even if individual instance Security Groups are altered.

Another reason is the need for explicit deny rules. Security Groups can only define allow rules, so they cannot block a specific IP address that an allow rule would otherwise admit. NACLs can define both allow and deny rules. Rules are evaluated in ascending order of rule number and the first match is applied, which makes it possible to block specific IP addresses by giving the deny rule a lower number than the allow rule.
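The ordered evaluation of NACL rules can be sketched as follows. This is a simplified model, assuming rules that match only on source CIDR; `NaclRule` and `evaluate` are hypothetical names, and real NACL entries also match on protocol and port range.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    cidr: str
    action: str  # "allow" or "deny"


def evaluate(rules, src_ip: str) -> str:
    """Return the action of the lowest-numbered rule matching src_ip."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if addr in ipaddress.ip_network(rule.cidr):
            return rule.action  # first match wins; later rules are ignored
    return "deny"  # nothing matched: fall through to the implicit deny


rules = [
    NaclRule(100, "0.0.0.0/0", "allow"),
    NaclRule(50, "198.51.100.7/32", "deny"),  # lower number, evaluated first
]

blocked = evaluate(rules, "198.51.100.7")
allowed = evaluate(rules, "203.0.113.9")
```

Because rule 50 is checked before rule 100, the single address is denied even though the broader allow rule would also match it.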

The Ephemeral Port Pitfall with NACLs

The most common misconfiguration caused by NACL statelessness is failing to allow ephemeral (temporary) ports. When a client connects to a server, the client's source port is an ephemeral port assigned dynamically by the operating system. Linux uses the range 32768-60999 by default, and Windows uses 49152-65535. Suppose an instance in the subnet calls an external web server: even if the NACL outbound rule allows HTTP (port 80), the response comes back to the instance's ephemeral port, and if the NACL inbound rule does not allow the ephemeral port range, the response is blocked. The same applies in reverse for a server in the subnet: its responses leave toward the client's ephemeral port, so the outbound rule must allow that range. This problem does not occur with Security Groups, because they are stateful and automatically permit responses to allowed requests. With NACLs, it is common practice to allow the ephemeral port range (1024-65535) in both inbound and outbound rules. If allowing such a wide range feels uncomfortable, use Security Groups for fine-grained control and keep NACLs for coarse subnet-level control.
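One way to catch this pitfall before deployment is to check whether the port ranges a NACL allows fully cover the ephemeral range a client will use. The helper below is a minimal sketch; `covers` is a hypothetical function, and the ranges are the defaults quoted above, which can be changed at the OS level.

```python
LINUX_EPHEMERAL = (32768, 60999)
WINDOWS_EPHEMERAL = (49152, 65535)


def covers(allowed_ranges, needed) -> bool:
    """Return True if every port in `needed` lies inside the union of allowed_ranges.

    Both arguments use inclusive (start, end) tuples.
    """
    low, high = needed
    next_uncovered = low
    for start, end in sorted(allowed_ranges):
        if start > next_uncovered:
            break  # a gap exists before this range begins
        next_uncovered = max(next_uncovered, end + 1)
        if next_uncovered > high:
            return True
    return next_uncovered > high


http_only = [(80, 80)]
wide_open = [(1024, 65535)]

http_only_ok = covers(http_only, LINUX_EPHEMERAL)
wide_open_ok = covers(wide_open, LINUX_EPHEMERAL) and covers(wide_open, WINDOWS_EPHEMERAL)
```

A rule set that allows only port 80 fails the check, while the customary 1024-65535 range passes for both operating systems.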

Security Group Connection Tracking in Detail

Security Groups do not track every connection. When a rule allows TCP or UDP traffic from all addresses (0.0.0.0/0) and a corresponding rule in the other direction allows all response traffic on all ports, the flow is untracked, because the rules themselves already permit the response. ICMP traffic is always tracked. Untracked flows avoid connection tracking overhead and do not consume tracking capacity.

The connection tracking table has a size limit. Amazon EC2 defines a maximum number of tracked connections per instance. AWS does not document this as a fixed figure such as 65,535 per ENI, so capacity should be measured rather than assumed. When the limit is reached, packets for new connections are dropped. Workloads handling large numbers of short-lived connections (such as instances behind a load balancer) need to watch for connection tracking table exhaustion. The ENA driver counter conntrack_allowance_exceeded, which can be read with ethtool and published to CloudWatch through the CloudWatch agent, counts packets dropped for this reason.

By default, idle tracked connections time out after 5 days for established TCP connections, 180 seconds for UDP streams (30 seconds for UDP flows that have seen traffic in only one direction or a single request-response exchange), and 30 seconds for ICMP. Timed-out connections are removed from the table.
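The interaction between the size limit and idle timeouts can be illustrated with a small simulation. This is a sketch only: `ConnTrackTable` is a hypothetical class, the capacity of 2 is chosen purely to make exhaustion easy to see, and the timeout values are the TCP and UDP defaults quoted above.

```python
# Idle timeouts in seconds, taken from the defaults described in the text.
IDLE_TIMEOUT_SECONDS = {"tcp": 5 * 24 * 3600, "udp": 180}


class ConnTrackTable:
    """Toy connection tracking table with a capacity and idle expiry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = {}   # flow key -> (protocol, last-seen timestamp)
        self.exceeded = 0   # counts connections refused because the table was full

    def expire(self, now: float) -> None:
        """Remove every flow that has been idle for its protocol's timeout."""
        stale = [
            flow
            for flow, (protocol, seen) in self.entries.items()
            if now - seen >= IDLE_TIMEOUT_SECONDS[protocol]
        ]
        for flow in stale:
            del self.entries[flow]

    def track(self, flow: str, protocol: str, now: float) -> bool:
        """Record a packet for `flow`; return False if a new flow cannot fit."""
        self.expire(now)
        if flow not in self.entries and len(self.entries) >= self.capacity:
            self.exceeded += 1
            return False
        self.entries[flow] = (protocol, now)
        return True


table = ConnTrackTable(capacity=2)
first = table.track("client-a", "udp", now=0)
second = table.track("client-b", "tcp", now=0)
third = table.track("client-c", "udp", now=10)         # table full: refused
third_retry = table.track("client-c", "udp", now=200)  # client-a has expired by now
```

The long TCP timeout is the reason idle TCP flows can hold slots for days, which is why connection-heavy workloads are the ones most likely to exhaust the table.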

Practical Defense-in-Depth Design Patterns

Here are practical design patterns combining Security Groups and NACLs.

For an ALB in a public subnet, allow inbound HTTP/HTTPS (0.0.0.0/0) in the Security Group and leave the NACL at its default (allow all). Since the ALB is a public-facing service, there is no need to restrict traffic at the NACL level.

For application servers in a private subnet, allow inbound traffic only from the ALB's Security Group in the Security Group, and allow inbound traffic only from the ALB subnet's CIDR in the NACL. Because the NACL is stateless, its outbound rules must also allow the responses back to the ALB subnet. The Security Group admits only ALB traffic, and the NACL provides backup defense at the subnet level.

For the database subnet, allow inbound traffic only from the application server's Security Group in the Security Group, and allow inbound traffic only from the application subnet's CIDR in the NACL.

If an attack from a specific IP address is detected, it can be blocked immediately with a NACL deny rule. A Security Group cannot express a deny at all, whereas a single NACL deny entry takes effect for every resource in the subnet, making NACLs well suited for emergency response.
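For the emergency-response case, the deny rule can be prepared as parameters for the boto3 EC2 client's create_network_acl_entry call. The sketch below only builds the parameter dictionary, so it runs without AWS credentials. The function name, the default rule number of 50, and the NACL ID are illustrative assumptions; the rule number must be lower than that of any allow rule that would otherwise match.

```python
import ipaddress


def build_emergency_deny(network_acl_id: str, attacker_ip: str, rule_number: int = 50) -> dict:
    """Build parameters for an inbound deny-all NACL entry targeting one IPv4 address."""
    addr = ipaddress.ip_address(attacker_ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    return {
        "NetworkAclId": network_acl_id,
        "RuleNumber": rule_number,   # must sort before the matching allow rule
        "Protocol": "-1",            # -1 means all protocols
        "RuleAction": "deny",
        "Egress": False,             # False selects the inbound rule set
        "CidrBlock": f"{addr}/32",
    }


# Hypothetical NACL ID and a documentation-range address, for illustration only.
params = build_emergency_deny("acl-0123456789abcdef0", "198.51.100.7")

# To apply it (requires boto3 and credentials with permission to modify the NACL):
#   import boto3
#   boto3.client("ec2").create_network_acl_entry(**params)
```

Keeping a gap of unused low rule numbers in each NACL makes it possible to insert a deny like this without renumbering existing entries.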