The Birth of Firecracker - Why the MicroVM Behind Lambda and Fargate Was Created
Explore the design philosophy behind Firecracker, the microVM monitor AWS built for Lambda and Fargate. Learn how it differs from traditional virtualization, the secret behind its 125ms boot time, and the strategic intent behind open-sourcing it.
The Problems with Lambda's Initial Architecture
When Lambda launched in 2014, it used container technology for function execution environments. Functions were isolated using Linux cgroups and namespaces. This approach started quickly and used resources efficiently, but it raised security concerns. Because containers share the host kernel, an attacker who exploits a kernel vulnerability could potentially reach another customer's functions from their own. For Lambda, which runs functions from a very large number of customers in a multi-tenant environment, this weak isolation was a fundamental risk. Traditional virtual machines (VMs) provide hardware-level isolation, but they take tens of seconds to boot and carry significant memory overhead, making them a poor fit for Lambda's fine-grained, sub-second billing model. What AWS needed was a technology that combined "VM-level security isolation" with "container-level boot speed and lightweight operation." Since nothing on the market met these requirements, AWS decided to build its own.
Firecracker's Design Philosophy - The Art of Minimalism
Firecracker, announced in 2018, is a virtual machine monitor (VMM) for microVMs, built on KVM (Kernel-based Virtual Machine). Written in Rust, it consists of approximately 50,000 lines of code. QEMU, a traditional general-purpose VMM, has well over a million lines, so Firecracker's codebase is more than an order of magnitude smaller. This small size is an intentional design decision. Firecracker was designed with the principle of "providing only the minimum functionality needed to run serverless workloads." GPU passthrough, USB device support, a graphical console, live migration, and other features needed by general-purpose VMs are all omitted. The only emulated devices are virtio-net (networking), virtio-block (block storage), a serial console, and a minimal i8042 keyboard controller. Stripping the device model down this far leaves a very small attack surface: less code means fewer places for bugs to hide and a codebase that is easier to audit.
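To make the minimal device model concrete, here is a rough Python sketch of the handful of API calls needed to describe and start one microVM. The endpoint paths and field names follow Firecracker's REST API as I understand it and should be checked against the version in use. The helper name build_microvm_requests, the file paths, and the tap device name are hypothetical placeholders. The sketch only builds the requests; it does not send them to the API socket.

```python
import json


def build_microvm_requests(kernel_path, rootfs_path, vcpus=1, mem_mib=128, tap_name=None):
    """Return the ordered (method, path, body) calls describing one microVM.

    Hypothetical helper: it only assembles request bodies, nothing is sent.
    """
    requests = [
        # CPU and memory sizing for the guest.
        ("PUT", "/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        # The guest kernel is loaded directly; there is no BIOS stage.
        ("PUT", "/boot-source", {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        }),
        # A single virtio-block device used as the root filesystem.
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }),
    ]
    if tap_name is not None:
        # Optional virtio-net device backed by a host tap interface.
        requests.append(("PUT", "/network-interfaces/eth0", {
            "iface_id": "eth0",
            "host_dev_name": tap_name,
        }))
    # Boot the guest once everything is configured.
    requests.append(("PUT", "/actions", {"action_type": "InstanceStart"}))
    return requests


if __name__ == "__main__":
    # Placeholder paths and tap name, for illustration only.
    for method, path, body in build_microvm_requests(
        "/path/to/vmlinux", "/path/to/rootfs.ext4", tap_name="tap0"
    ):
        print(method, path, json.dumps(body))
```

The whole machine is described by a kernel, one block device, an optional network interface, and a start action, which is a fair picture of how little a guest is given.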
The Secret Behind 125ms Boot Time
Firecracker microVMs can boot to user space in as little as 125ms. Multiple factors contribute to this speed. First, minimal device emulation. While QEMU emulates numerous devices including a BIOS, a PCI bus, and USB controllers, Firecracker omits all of these. The guest OS kernel only needs to recognize the minimal devices provided by Firecracker, making kernel initialization fast. Second, custom kernel usage. Lambda's execution environment uses a custom kernel with unnecessary drivers and subsystems removed, rather than a general-purpose Linux kernel. This shortens the kernel boot time itself. Third, minimized memory overhead. The Firecracker process itself uses approximately 5MB of memory, far less than QEMU's tens to hundreds of MB. This allows thousands of microVMs to run simultaneously on a single physical host. If Lambda allocates 128MB of memory to a function, Firecracker's 5MB of VMM overhead adds about 4% on top of the function's memory, whereas a 100MB QEMU process would add about 78% (or about 44% of the combined 228MB footprint). In a large-scale multi-tenant environment, this difference translates to enormous cost differences.
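The overhead percentages depend on which denominator is used, so it helps to work them out both ways. The short Python sketch below uses the figures from the text (5MB for Firecracker, 128MB for the function) and treats the 100MB QEMU figure as an illustrative assumption, not a measurement. The function name overhead_ratios is hypothetical.

```python
def overhead_ratios(vmm_mib, guest_mib):
    """Return VMM overhead as (fraction of guest memory, fraction of total footprint)."""
    relative_to_guest = vmm_mib / guest_mib
    share_of_total = vmm_mib / (vmm_mib + guest_mib)
    return relative_to_guest, share_of_total


if __name__ == "__main__":
    guest_mib = 128  # memory allocated to the function
    # The 100 MB QEMU value is an assumed example, not a measured number.
    for name, vmm_mib in (("Firecracker", 5), ("QEMU (assumed)", 100)):
        rel, share = overhead_ratios(vmm_mib, guest_mib)
        print(f"{name}: {rel:.1%} of guest memory, {share:.1%} of total footprint")
```

Relative to the function's 128MB, the overhead is about 3.9% for Firecracker and about 78.1% for the assumed QEMU figure. As a share of the total footprint, it is about 3.8% and about 43.9% respectively.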
jailer - Multi-Layered Security Defense
Firecracker's security model doesn't rely solely on microVM isolation. A component called jailer further isolates the Firecracker process itself. The jailer confines the Firecracker process in a chroot environment and limits its resource usage with cgroups, and the process's system calls are filtered with seccomp-bpf. This means that even if a microVM escape (breaking out from the guest OS into the VMM process on the host) succeeds, the attacker remains confined within these restrictions. This defense-in-depth design is intended to keep the overall system secure even if a single defense layer is breached. Firecracker permits only about 30 system calls, blocking the large majority of the 300+ system calls provided by the Linux kernel. This strict filtering sharply reduces the kernel attack surface available to a compromised VMM process. AWS designed Firecracker's security model on the premise of "never trusting the guest OS," with the goal of preventing malicious code running inside the guest OS from affecting the host OS or other microVMs.
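As a rough illustration of how the jailer is typically invoked, the Python sketch below assembles a jailer command line and the chroot directory it would be expected to use. The flag names (--id, --exec-file, --uid, --gid, --chroot-base-dir) and the directory layout are taken from the jailer documentation as I recall it and should be verified against the version you run. The helper name jailer_command and the example IDs are hypothetical. The sketch only builds the argument list; it does not execute anything.

```python
from pathlib import PurePosixPath


def jailer_command(vm_id, exec_file, uid, gid, chroot_base="/srv/jailer"):
    """Build a jailer argv and the chroot root it is expected to create.

    Hypothetical helper: flags and layout are assumptions to verify locally.
    """
    argv = [
        "jailer",
        "--id", vm_id,                     # unique microVM identifier
        "--exec-file", exec_file,          # Firecracker binary to confine
        "--uid", str(uid),                 # unprivileged user to switch to
        "--gid", str(gid),                 # unprivileged group to switch to
        "--chroot-base-dir", chroot_base,  # parent directory for per-VM chroots
    ]
    # Assumed layout: <chroot_base>/<exec file name>/<id>/root
    chroot_root = PurePosixPath(chroot_base) / PurePosixPath(exec_file).name / vm_id / "root"
    return argv, str(chroot_root)


if __name__ == "__main__":
    argv, root = jailer_command("vm-1", "/usr/bin/firecracker", 1001, 1001)
    print(" ".join(argv))
    print("expected chroot:", root)
```

Each microVM gets its own identifier, its own unprivileged user, and its own chroot directory, so a compromised Firecracker process sees only the files placed there for that one VM.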
The Strategic Intent Behind Open-Sourcing
AWS open-sourced Firecracker under the Apache 2.0 license in 2018. This decision to open-source a core AWS technology had multiple strategic intentions. First, security transparency. Firecracker is a security-critical component, and publishing the code allows external researchers to conduct security audits. In fact, multiple security improvement proposals have been submitted by external researchers since the open-sourcing. Second, ecosystem expansion. Firecracker can be used outside of AWS, and cloud platforms like Fly.io and Koyeb have adopted it. As the Firecracker ecosystem grows, more workloads run on Firecracker, ultimately making migration to AWS easier. Third, talent recruitment. Publishing cutting-edge technology like Firecracker as open source is a powerful recruiting tool for attracting talented systems programmers. Firecracker is written in Rust and has garnered significant attention from the Rust community. Firecracker's GitHub repository has earned over 20,000 stars, making it one of AWS's most successful open-source projects.