AWS Lambda processes over 10 trillion invocations per month. Each invocation requires a secure execution boundary separating customer code from the host operating system, from other customers’ code, and from AWS’s own management plane. Traditional virtual machines — QEMU/KVM with full device emulation — boot in seconds, consume hundreds of megabytes of memory, and expose thousands of lines of device emulation code as attack surface. At 10 trillion invocations per month, those numbers are disqualifying.

The solution was Firecracker: a virtual machine monitor written in Rust, purpose-built for serverless workloads, capable of booting a micro-VM in under 125 milliseconds with a memory footprint of approximately 5 MB per VM. AWS open-sourced Firecracker in November 2018, and it now underpins both Lambda and Fargate. Google, facing the same isolation problem, took a different architectural path with gVisor — an application kernel that intercepts system calls in userspace, avoiding the need for a full virtual machine while still providing strong isolation.

These technologies are not competing products. They represent fundamentally different answers to the same question: how do you isolate untrusted code at cloud scale with minimal overhead? The answer you choose determines the security properties, performance characteristics, and — critically for ephemeral infrastructure — the teardown guarantees of your compute environment.

The Isolation Spectrum

Understanding micro-VMs requires understanding the full spectrum of isolation technologies available in modern cloud infrastructure. Each point on the spectrum trades off between isolation strength, startup latency, memory overhead, and syscall compatibility.

Linux Containers (Namespaces + cgroups)

Standard containers — Docker, containerd, CRI-O — use Linux kernel namespaces (PID, network, mount, UTS, IPC, user) and cgroups to partition the host kernel’s resources. The container shares the host kernel. Every system call from container code is processed by the same kernel that manages the host.

This is efficient. There is effectively zero kernel memory overhead per container, startup is sub-second, and syscall performance is native. But the shared kernel is the single largest attack surface in cloud computing. The Linux kernel exposes over 450 system calls, and a container can invoke any of them. A vulnerability in any one of them — and the kernel accrues hundreds of CVEs per year — potentially compromises the host and every other container on it.
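
The namespace mechanics are visible from any Linux process: a container is simply a process whose `/proc/<pid>/ns` identifiers differ from the host's, all running on one shared kernel. A stdlib-only sketch (Linux-specific; it returns an empty mapping on other hosts):

```python
import os

def namespace_ids(pid="self"):
    """Return the namespace identifiers for a process, read from /proc.

    Each entry in /proc/<pid>/ns is a symlink such as 'pid:[4026531836]';
    two processes share a namespace exactly when the bracketed inode
    numbers match. Returns an empty dict where /proc/<pid>/ns is absent.
    """
    ns_dir = f"/proc/{pid}/ns"
    if not os.path.isdir(ns_dir):
        return {}
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

if __name__ == "__main__":
    # A containerized process would show different inode numbers here
    # than the host's PID 1 — but the kernel behind them is the same.
    for name, ident in namespace_ids().items():
        print(f"{name:20} -> {ident}")
```

Comparing this output against `/proc/1/ns` is the quickest way to see whether you are inside a namespace boundary; nothing in that boundary changes which kernel executes your syscalls.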

Between 2019 and 2025, the CVE database recorded 17 container escape vulnerabilities rated Critical (CVSS 9.0+). The most severe — CVE-2024-21626, a runc vulnerability — allowed a malicious container image to escape to the host filesystem via working-directory manipulation. No container-side configuration could prevent it, because the flaw was not in the kernel but in runc itself: an internal file descriptor leaked into the container, reachable through /proc/self/fd, let the container's working directory be set to a path in the host filesystem.

For multi-tenant cloud infrastructure, kernel sharing is a structural liability that no amount of seccomp profiles or AppArmor policies can eliminate. The question is what replaces it.

Hardware Virtual Machines (QEMU/KVM)

At the opposite end of the spectrum, full hardware virtualization provides the strongest isolation. Each VM runs its own kernel on emulated hardware. The hypervisor (KVM + QEMU on Linux) mediates all access to physical resources. A vulnerability in a guest kernel cannot reach the host kernel because the guest never executes on the real hardware — it executes on virtualized hardware managed by the hypervisor.

The trade-off is weight. QEMU emulates dozens of devices: IDE controllers, VGA adapters, USB buses, network interfaces, audio cards. Each emulated device is attack surface — QEMU’s VGA emulation alone has produced multiple CVEs. A full QEMU/KVM VM consumes 128-512 MB of baseline memory and takes 1-10 seconds to boot. At cloud scale, this is prohibitive for per-request isolation.

The Middle Ground: Micro-VMs and Application Kernels

Micro-VMs and application kernels occupy the territory between containers and full VMs. They provide stronger isolation than containers (separate kernel or kernel-like boundary) with lower overhead than full VMs (stripped-down device model, minimal memory footprint, sub-second boot). This is where Firecracker, gVisor, and Kata Containers operate.

Firecracker: The Micro-VM Approach

Firecracker is a Virtual Machine Monitor (VMM) built on Linux KVM. It replaces QEMU as the userspace component that configures and manages the virtual machine, while retaining KVM as the hypervisor that enforces hardware isolation.

Architecture

Firecracker’s design is defined by what it removes. Where QEMU emulates a full PC architecture with dozens of devices, Firecracker emulates exactly five: a serial console (for logging), a virtio-net network device, a virtio-block storage device, a keyboard controller (for shutdown signaling), and a minimal interrupt controller. Nothing else. No VGA. No USB. No sound. No PCI passthrough.

Each emulated device is implemented in Rust with explicit bounds checking and memory safety guarantees. The total codebase of Firecracker’s device model is approximately 50,000 lines of Rust — compared to QEMU’s roughly 1.4 million lines of C. The reduction in code surface is not incremental. It is a 96% reduction in the code that an attacker inside a guest VM can interact with.

Boot sequence. Firecracker loads a Linux kernel and an optional initrd/rootfs directly — no BIOS, no UEFI, no bootloader. The VM transitions from creation to running guest code in under 125 ms. AWS's production data reports P99 cold start times of under 150 ms for Lambda functions, and Firecracker is rarely the bottleneck; language-runtime initialization dominates the remaining time.

Memory. Each Firecracker micro-VM consumes approximately 5 MB of VMM overhead. The guest memory is additional and user-configurable (minimum 128 MB). Counting VMM overhead alone, a host with 256 GB of RAM could theoretically manage over 50,000 micro-VMs; in practice guest memory is the binding constraint, but the achievable density still far exceeds anything traditional QEMU can approach.

Jailer. Firecracker ships with a jailer process that confines the VMM itself inside a chroot with a minimal seccomp filter (allowing only 24 system calls) and dedicated cgroup. Even if an attacker escapes the guest VM and exploits a vulnerability in Firecracker’s Rust code, they land in a severely restricted environment with no access to the host filesystem, no network capabilities beyond the configured tap device, and no ability to invoke privileged system calls.
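
A sketch of how the jailer is typically invoked — the binary paths, VM ID, and uid/gid below are illustrative, and the flag spellings follow the upstream jailer CLI:

```python
def jailer_argv(vm_id, uid, gid, chroot_base="/srv/jailer",
                firecracker="/usr/bin/firecracker"):
    """Build the command line for launching Firecracker under its jailer.

    The jailer chroots the VMM into <chroot_base>/firecracker/<vm_id>/root,
    drops to the given unprivileged uid/gid, and applies its seccomp
    filter before exec'ing the firecracker binary.
    """
    return [
        "jailer",
        "--id", vm_id,
        "--exec-file", firecracker,
        "--uid", str(uid),
        "--gid", str(gid),
        "--chroot-base-dir", chroot_base,
    ]

if __name__ == "__main__":
    print(" ".join(jailer_argv("vm-7f3a", uid=1000, gid=1000)))
```

The point of the indirection is that a VMM compromise lands the attacker inside the chroot as an unprivileged user, behind the seccomp filter, rather than in the host's root filesystem.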

Security Model

Firecracker provides hardware-enforced isolation via KVM. The guest VM runs in a separate address space, managed by the CPU’s virtualization extensions (Intel VT-x / AMD-V). Guest memory is isolated at the hardware level — the guest cannot address host memory, and the host’s view of guest memory is mediated by extended page tables (EPT on Intel, NPT on AMD).

The attack surface is precisely defined: the five emulated devices, the KVM API (which is maintained by the Linux kernel community and reviewed with hypervisor-level scrutiny), and the Firecracker process itself. A formal security audit by Trail of Bits in 2023 identified zero critical vulnerabilities in the Firecracker VMM.

For zero-trust architecture, Firecracker provides the strongest per-workload isolation available without dedicated hardware. Each function invocation runs in its own VM. No kernel is shared. No memory is shared. When the VM is destroyed, the memory pages are returned to the host and zeroed.

Relevance to Ephemeral Infrastructure

Firecracker’s sub-150 ms boot and negligible memory overhead make it viable for per-request compute isolation — the foundation of ephemeral infrastructure. A request arrives, a micro-VM boots, the computation executes, the VM is destroyed, and the memory is reclaimed. No state persists between requests. No filesystem artifact remains. The VM’s memory is not swapped to disk (Firecracker disables swap by default), eliminating the risk of residual data in persistent storage.
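
The lifecycle can be stated as a contract: teardown runs no matter how the computation ends. A sketch with toy stand-ins for the actual provisioning and destruction calls:

```python
from contextlib import contextmanager

@contextmanager
def ephemeral_vm(create, destroy):
    """Per-request lifecycle sketch: nothing outlives the request.

    `create` and `destroy` are stand-ins for whatever provisions and
    tears down the micro-VM (e.g. Firecracker API calls); this wrapper
    only guarantees that destruction runs even when the work fails.
    """
    vm = create()
    try:
        yield vm
    finally:
        destroy(vm)  # memory reclaimed; host zeroes pages before reuse

# Toy stand-ins that let us observe the guarantee:
live = set()

def create():
    vm = object()
    live.add(vm)
    return vm

def destroy(vm):
    live.discard(vm)

def handle_request(payload):
    with ephemeral_vm(create, destroy) as vm:
        return f"processed {payload!r} in vm {id(vm):#x}"
```

After any call to `handle_request` — successful or not — `live` is empty again: no VM, and no state, survives the request that created it.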

This is the operational model behind Stealth Cloud’s approach to zero-persistence compute: isolation boundaries that exist for exactly the duration of a computation and leave no forensic trace upon destruction.

gVisor: The Application Kernel Approach

Google’s gVisor takes a fundamentally different approach. Instead of running a full Linux kernel inside a hardware-isolated VM, gVisor interposes a user-space kernel — called Sentry — between the application and the host kernel. The application believes it is running on a Linux kernel. It is not. It is running on a purpose-built kernel reimplementation that handles system calls without passing them to the host.

Architecture

gVisor consists of two components:

Sentry. An application kernel written in Go that implements approximately 240 of Linux's 450+ system calls. Sentry runs as a regular userspace process (no root privileges, no kernel modules). When the containerized application makes a system call, gVisor's interception layer (ptrace-based or KVM-based, depending on the configured platform) redirects it to Sentry, which processes it entirely in userspace.

Gofer. A file proxy that mediates all filesystem access. The application’s file I/O requests pass through Sentry to Gofer, which performs the actual host filesystem operations through a restricted set of system calls. The application never directly interacts with the host filesystem.

The host kernel exposure is drastically reduced. A standard container can invoke any of the host’s 450+ system calls. A gVisor-sandboxed application can only trigger the approximately 70 system calls that Sentry uses to implement its ~240 emulated calls. The attack surface reduction is roughly 85%.
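
The funneling effect can be modeled in a few lines. The mapping below is illustrative, not gVisor's actual implementation table: many guest-visible syscalls resolve to a small fixed set of host syscalls, and some never touch the host at all.

```python
# Toy model of gVisor's syscall funnel. Keys are guest-visible syscalls
# Sentry emulates; values are the host syscalls Sentry itself issues to
# implement them. (Hypothetical table, for illustration only.)
SENTRY_IMPL = {
    "open":         ["openat"],           # file ops route through Gofer
    "read":         ["read"],
    "write":        ["write"],
    "stat":         ["openat", "read"],
    "socket":       ["socketpair"],       # netstack does TCP/IP in userspace
    "getpid":       [],                   # answered from Sentry's own state
    "gettimeofday": [],                   # served from Sentry's own clock
}

def host_surface(impl):
    """Host syscalls actually reachable, given the guest-facing table."""
    return sorted({h for hosts in impl.values() for h in hosts})

guest_calls = len(SENTRY_IMPL)            # what the application can call
host_calls = len(host_surface(SENTRY_IMPL))  # what the host kernel sees
```

Scaled up to the real numbers (~240 emulated calls funneled through ~70 host calls), this is the mechanism behind the attack-surface reduction the text describes.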

Performance Characteristics

gVisor’s user-space system call interception adds latency to every syscall. Benchmarks from Google’s published research show:

  • CPU-bound workloads: near-native performance (gVisor adds no overhead to pure computation).
  • Syscall-heavy workloads: 10-40% overhead, depending on the frequency and type of system calls.
  • Filesystem-intensive workloads: 30-80% overhead due to the double hop through Sentry and Gofer.
  • Network-intensive workloads: 5-15% overhead with the netstack implementation.

These overheads are acceptable for many workloads but disqualifying for others. gVisor is used in production at Google Cloud Run and Google Cloud Functions, where the workloads are typically HTTP request handlers with moderate filesystem and network I/O.

Security Model Comparison with Firecracker

gVisor and Firecracker defend against different threat models:

gVisor protects the host kernel from malicious or buggy applications by intercepting system calls before they reach the kernel. The isolation is software-enforced. A vulnerability in Sentry (written in Go, with memory safety guarantees) could potentially be exploited to reach the host kernel, though such a vulnerability would require defeating Go’s memory safety and escaping the seccomp sandbox.

Firecracker protects the host from malicious guests by running them in hardware-isolated VMs. The isolation is hardware-enforced. A guest cannot reach the host kernel without first escaping the VM boundary (defeating Intel VT-x/AMD-V hardware isolation) and then escaping the jailer sandbox.

For confidential computing use cases where the threat model includes the infrastructure operator, Firecracker’s hardware isolation is the stronger primitive. For high-density multi-tenant platforms where startup latency and density matter more than defense against nation-state adversaries, gVisor’s lighter-weight approach is appropriate.

Kata Containers: The Hybrid Approach

Kata Containers — a project hosted by the OpenInfra Foundation, formed from the merger of Intel Clear Containers and Hyper.sh's runV — combines the OCI container interface with hardware-virtualized isolation. Each container runs inside a lightweight VM while supporting standard images, orchestration, and Kubernetes pod specs. Kata supports multiple VMMs (QEMU, Firecracker, Cloud Hypervisor), allowing organizations to adopt hardware-isolated workloads without abandoning existing toolchains. Startup and memory overhead are higher than gVisor's, but the isolation guarantees are stronger.

V8 Isolates: The Ultralight Option

At the lightest end of the isolation spectrum, V8 isolates — the technology underpinning Cloudflare Workers, Deno Deploy, and Vercel Edge Functions — provide per-request isolation without any virtualization at all. A V8 isolate is a sandboxed instance of the V8 JavaScript engine with its own heap, its own global scope, and no access to the host process’s memory or filesystem.

V8 isolates boot in under 5 ms and consume under 1 MB of memory per isolate. The startup overhead is more than an order of magnitude lower than Firecracker's. The isolation is enforced by V8's memory safety guarantees and the engine's sandboxing of JavaScript execution — there are no system calls available to the isolate, no filesystem access, and no raw network socket access.

The trade-off is capability. V8 isolates can only execute JavaScript and WebAssembly. They cannot run arbitrary Linux binaries. They have no filesystem. They have no persistent state. For the specific use case of stateless HTTP request handling — which describes a large fraction of serverless workloads — these constraints are not limitations. They are features.

Cloudflare reports running over 10 million Worker scripts across 330+ data centers globally, with a P99 cold start time under 5 ms. The density achievable with V8 isolates — tens of thousands per machine — makes them the most economically efficient isolation primitive for stateless, ephemeral workloads.

For privacy-focused infrastructure, V8 isolates offer a tight isolation boundary with almost nothing to leak. No filesystem, no raw sockets, no shared memory — and the attack surface is the V8 engine itself, arguably the most hardened JavaScript runtime in existence.

Choosing the Right Isolation Primitive

The choice between these technologies is not a matter of better or worse. It is a matter of matching the isolation primitive to the threat model, the workload characteristics, and the operational requirements.

| Primitive       | Boot Time | Memory Overhead | Isolation Type    | Syscall Surface      | Best For          |
|-----------------|-----------|-----------------|-------------------|----------------------|-------------------|
| Linux Container | < 500 ms  | ~0 MB           | Kernel namespaces | 450+ syscalls        | Trusted workloads |
| gVisor          | < 500 ms  | ~30 MB          | User-space kernel | ~70 syscalls         | Multi-tenant SaaS |
| Firecracker     | < 150 ms  | ~5 MB VMM       | Hardware VM (KVM) | 24 syscalls (jailer) | Serverless, FaaS  |
| Kata Containers | < 1 s     | ~30 MB          | Hardware VM (KVM) | Varies by VMM        | Container compat  |
| V8 Isolate      | < 5 ms    | < 1 MB          | Engine sandbox    | 0 syscalls           | Edge, stateless   |

For ephemeral, zero-persistence workloads, the gradient is clear: lighter isolation primitives enable faster creation and destruction cycles, which in turn enable shorter-lived compute environments, which in turn reduce the window during which data exists in any form.

The Teardown Problem

Creating isolated environments is a solved problem. Destroying them without residual trace is not.

When a Firecracker micro-VM is terminated, its memory pages are returned to the host kernel. Those pages contain the plaintext of everything the VM processed — user data, decrypted payloads, intermediate computation results. The host kernel may or may not zero those pages before reallocating them to a new VM. Linux’s default behavior is to zero pages on allocation (not deallocation), which means a brief window exists where the previous VM’s data remains in physical memory.

Firecracker addresses this by explicitly zeroing VM memory on teardown. Combined with cryptographic shredding — destroying the encryption keys that protect the VM’s data rather than overwriting the data itself — the teardown process produces a forensically clean state: even if memory pages are recovered before being overwritten, they contain only ciphertext whose key has been destroyed.
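
The crypto-shredding property is easy to demonstrate. The sketch below uses a one-time pad in place of AES-256-GCM so it needs only the standard library; the property shown — destroy the key, and any recovered ciphertext carries no plaintext information — is the same.

```python
import secrets

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """One-time-pad stand-in for the real cipher (illustration only)."""
    assert len(key) == len(plaintext)
    return bytes(k ^ p for k, p in zip(key, plaintext))

decrypt = encrypt  # XOR is its own inverse

data = b"decrypted session payload"
key = secrets.token_bytes(len(data))      # exists only in VM memory
ciphertext = encrypt(key, data)

# While the key exists, the data is recoverable:
assert decrypt(key, ciphertext) == data

key = None  # teardown: shred the key, not the data
# The ciphertext may linger in a reclaimed page, but without the key it
# is indistinguishable from random noise.
```

This is why key destruction can stand in for memory scrubbing: the race to zero physical pages stops mattering once the only thing those pages can yield is ciphertext with no surviving key.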

gVisor and V8 isolates face different teardown profiles. Go’s garbage collector does not zero freed memory deterministically, meaning a killed gVisor sandbox may leave heap fragments. V8 isolates similarly free but do not necessarily zero heap memory on destruction. Cloudflare has implemented isolate-level memory zeroing in their Workers runtime, but this is a platform-specific mitigation.

The teardown problem is why ephemeral infrastructure requires more than just lightweight VMs. It requires a full lifecycle discipline: encrypt data before it enters the compute environment, process in hardware-isolated memory, destroy the keys on session termination, and verify that the memory has been returned to a clean state. The isolation primitive is one component of a system that must be correct end-to-end.

The Stealth Cloud Perspective

Stealth Cloud’s architecture uses V8 isolates as the primary compute isolation primitive — specifically, Cloudflare Workers running at the edge across 330+ global locations. The choice is deliberate and reflects the threat model: for stateless, request-scoped AI chat workloads, V8 isolates provide the optimal combination of startup speed (sub-5 ms), memory efficiency (sub-1 MB), zero syscall surface, and geographic distribution.

The isolate receives an encrypted payload, decrypts it in the isolate’s memory using keys derived from the client’s AES-256-GCM session, proxies the sanitized request to the LLM provider, re-encrypts the response, and terminates. The entire lifecycle completes in the time a traditional VM would still be booting.

For workloads that require stronger isolation guarantees — processing that involves untrusted code execution or workloads where the threat model includes the infrastructure operator — Firecracker micro-VMs with confidential computing attestation represent the next layer. The architecture is designed to be isolation-agnostic: the same zero-persistence lifecycle (encrypt, process, shred) applies regardless of whether the compute boundary is a V8 isolate, a Firecracker micro-VM, or a hardware-attested TEE.

The building blocks of ephemeral compute embody a principle: computation should exist for exactly as long as needed and leave nothing behind. Firecracker, gVisor, and V8 isolates each make that principle achievable at a different point on the performance-security spectrum. The architecture that matters selects the right primitive for the right workload — and destroys it correctly every time.