AWS Lambda MicroVMs: stateful sandboxes with full lifecycle control

AWS just shipped something that reframes what “serverless” can mean. Lambda MicroVMs are not Lambda Functions with a bigger timeout. They are a fundamentally different primitive: stateful, VM-level isolated environments with an explicit lifecycle you control.

The key shift: instead of getting a recycled process for 15 minutes, you get a dedicated Firecracker microVM that lives up to 8 hours — and you decide when it starts, suspends, resumes, and terminates.

What Lambda MicroVMs actually are

Regular Lambda Functions are stateless by design. The execution environment may be reused between invocations, but nothing held in memory is guaranteed to survive once the function exits, and 15 minutes is a hard ceiling. That model is perfect for APIs and event-driven workloads. It is a poor fit for anything that needs to maintain state across interactions.

Lambda MicroVMs flip this. Each MicroVM is:

A dedicated Firecracker microVM with its own kernel, memory space and disk state
Addressable via a unique HTTPS endpoint (per VM, not shared)
Capable of running any Dockerfile-based application, not just a handler function
Suspendable and resumable — the VM’s memory and disk are snapshotted and restored

The lifecycle looks like this:

LAUNCH (from snapshot) → RUN → SUSPEND → RESUME → TERMINATE

Billing follows the lifecycle: while running you pay for compute; while suspended you pay only for snapshot storage; once terminated, the cost is zero.
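The lifecycle above can be read as a small state machine. The sketch below is only an illustration of that reading, not an AWS API: the state names, the action strings, and the assumption that a suspended VM can be terminated directly are mine, not the article's source.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


# Allowed transitions, keyed by (current state, action).
# Assumption: a suspended VM may be terminated without resuming first.
TRANSITIONS = {
    (State.RUNNING, "suspend"): State.SUSPENDED,
    (State.SUSPENDED, "resume"): State.RUNNING,
    (State.RUNNING, "terminate"): State.TERMINATED,
    (State.SUSPENDED, "terminate"): State.TERMINATED,
}

# What is billed in each state, following the billing description above.
BILLED_FOR = {
    State.RUNNING: "compute",
    State.SUSPENDED: "snapshot storage",
    State.TERMINATED: "nothing",
}


def apply(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a {state.value} MicroVM") from None


state = State.RUNNING  # a freshly launched MicroVM starts out running
for action in ("suspend", "resume", "suspend", "terminate"):
    state = apply(state, action)
    print(f"{action:>9} -> {state.value:<10} billed for: {BILLED_FOR[state]}")
```

Keeping the transitions in one table makes illegal moves, such as resuming a terminated VM, fail loudly instead of silently.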

How to build with it

The setup is more explicit than standard Lambda, which makes sense given the stateful nature.

Step 1 — build your image

Package your application as a Dockerfile. Zip it with any dependencies and upload to S3. This is your base image.
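A minimal sketch of the packaging step, assuming the bundle is a plain zip with the Dockerfile at its root (the article does not spell out the expected layout). The helper names are hypothetical. `upload_bundle` expects an object with boto3's `upload_file(Filename, Bucket, Key)` method, such as `boto3.client("s3")`; the bucket and key are whatever you choose.

```python
import zipfile
from pathlib import Path


def build_bundle(source_dir: Path, bundle_path: Path) -> list[str]:
    """Zip every file under source_dir and return the archived names.

    bundle_path should live outside source_dir, otherwise the archive
    would try to include itself.
    """
    if not (source_dir / "Dockerfile").is_file():
        raise FileNotFoundError(f"no Dockerfile in {source_dir}")
    names = []
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the source root, with "/" separators.
                name = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname=name)
                names.append(name)
    return names


def upload_bundle(s3_client, bundle_path: Path, bucket: str, key: str) -> None:
    """Upload the bundle to S3; s3_client is expected to be boto3.client("s3")."""
    s3_client.upload_file(str(bundle_path), bucket, key)
```

Passing the client in as an argument keeps the packaging logic testable without AWS credentials.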

Step 2 — create a MicroVM configuration

Lambda MicroVM reads the Dockerfile from S3, runs it, and takes a snapshot of the memory and disk state once the application is initialized. This snapshot becomes the golden image from which future instances start.

Step 3 — launch instances from the snapshot

Each run-microvm call starts an independent MicroVM from that snapshot. Startup is fast because there is no cold initialization: the VM resumes from the pre-baked state.

Step 4 — route traffic to the VM’s endpoint

Every MicroVM gets its own HTTPS URL. You route your user or session to it. Authentication is mandatory: the endpoint requires JWE (JSON Web Encryption) tokens.
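The article does not describe how the token is issued or which keys and claims the endpoint expects, so the following is only a client-side sanity check, not an authentication implementation. It relies on one fact from RFC 7516: a JWE in compact serialization has five dot-separated base64url parts, whereas a signed JWT has three. The function names are illustrative.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text whose '=' padding has been stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(data: bytes) -> str:
    """Encode bytes as unpadded base64url text."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwe_protected_header(token: str) -> dict:
    """Return the protected header of a compact-serialized JWE.

    Raises ValueError unless the token has the five parts RFC 7516
    defines: header, encrypted key, IV, ciphertext, authentication tag.
    This checks shape only; it does not decrypt or verify anything.
    """
    parts = token.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 JWE segments, got {len(parts)}")
    header = json.loads(b64url_decode(parts[0]))
    if "alg" not in header or "enc" not in header:
        raise ValueError("JWE header must name both 'alg' and 'enc'")
    return header


# A dummy token with the right shape; the payload segments are placeholders.
dummy = ".".join([
    b64url_encode(json.dumps({"alg": "dir", "enc": "A256GCM"}).encode()),
    "",  # the encrypted-key segment is empty for direct encryption ("dir")
    b64url_encode(b"iv"),
    b64url_encode(b"ciphertext"),
    b64url_encode(b"tag"),
])
print(jwe_protected_header(dummy))
```

A check like this catches the common mistake of sending a three-part signed JWT where an encrypted token is required, before the request ever leaves the client.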

Resources and constraints

Resource	Baseline	Peak / maximum
RAM	0.5–8 GB	up to 32 GB
vCPU	0.25–4	up to 16
Disk	—	up to 32 GB
Duration	—	up to 8 hours

Architecture is ARM64 only at launch. Regions at GA: us-east-1, us-east-2, us-west-2, eu-west-1, ap-northeast-1.
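These limits are easy to encode as a pre-flight check before you attempt to create anything. The sketch below uses only the numbers and regions listed above. The `MicroVMSpec` fields are hypothetical names, not an AWS schema, and the rule that peak must be at least baseline is my assumption.

```python
from dataclasses import dataclass

# Limits copied from the table and launch notes above.
BASELINE_RAM_GB = (0.5, 8)
BASELINE_VCPU = (0.25, 4)
PEAK_RAM_GB = 32
PEAK_VCPU = 16
MAX_DISK_GB = 32
LAUNCH_REGIONS = {
    "us-east-1", "us-east-2", "us-west-2", "eu-west-1", "ap-northeast-1",
}


@dataclass(frozen=True)
class MicroVMSpec:
    """Hypothetical spec; field names are illustrative only."""
    region: str
    architecture: str
    baseline_ram_gb: float
    baseline_vcpu: float
    peak_ram_gb: float
    peak_vcpu: float
    disk_gb: float


def validate(spec: MicroVMSpec) -> list[str]:
    """Return human-readable problems; an empty list means the spec fits."""
    problems = []
    if spec.architecture != "arm64":
        problems.append("architecture must be arm64 at launch")
    if spec.region not in LAUNCH_REGIONS:
        problems.append(f"region {spec.region} is not in the launch list")
    if not BASELINE_RAM_GB[0] <= spec.baseline_ram_gb <= BASELINE_RAM_GB[1]:
        problems.append("baseline RAM must be 0.5-8 GB")
    if not BASELINE_VCPU[0] <= spec.baseline_vcpu <= BASELINE_VCPU[1]:
        problems.append("baseline vCPU must be 0.25-4")
    # Assumption: peak values may not be lower than the baseline.
    if not spec.baseline_ram_gb <= spec.peak_ram_gb <= PEAK_RAM_GB:
        problems.append("peak RAM must be between baseline and 32 GB")
    if not spec.baseline_vcpu <= spec.peak_vcpu <= PEAK_VCPU:
        problems.append("peak vCPU must be between baseline and 16")
    if not 0 < spec.disk_gb <= MAX_DISK_GB:
        problems.append("disk must be greater than 0 and at most 32 GB")
    return problems


ok = MicroVMSpec("eu-west-1", "arm64", 2, 1, 16, 8, 20)
bad = MicroVMSpec("eu-central-1", "x86_64", 16, 1, 16, 8, 64)
print(validate(ok))
print(validate(bad))
```

Returning every problem at once, rather than raising on the first one, is friendlier when a spec violates several limits.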

Networking supports HTTP/1.1, HTTP/2, gRPC, WebSockets, and SSE. Internet egress is available by default; VPC egress is possible through a network connector.

What you pay for

Baseline compute while the VM is running
Peak usage above baseline (billed separately)
Snapshot read/write operations
Snapshot storage
Data transfer

The suspend model is the key cost lever. A suspended VM costs only the snapshot storage. This makes the pattern viable for long-lived interactive sessions that have significant idle time between interactions — exactly the pattern of a user working in an IDE or chatting with an AI agent.
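To see why suspension matters, it helps to put numbers on a session that is mostly idle. The rates below are placeholders invented for the arithmetic, not AWS prices. The model is deliberately simplified: it ignores peak usage and data transfer, and it assumes storage is charged only while suspended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Placeholder prices in USD. These are NOT real AWS rates."""
    compute_per_hour: float       # baseline compute while running
    storage_per_gb_hour: float    # snapshot storage while suspended
    snapshot_op: float            # one snapshot write or one snapshot read


def session_cost(
    active_h: float,
    idle_h: float,
    snapshot_gb: float,
    suspend_cycles: int,
    rates: Rates,
    suspend_when_idle: bool,
) -> float:
    """Estimate the cost of one session under the simplified model."""
    if not suspend_when_idle:
        # The VM keeps running through idle time, so all hours are compute.
        return (active_h + idle_h) * rates.compute_per_hour
    compute = active_h * rates.compute_per_hour
    storage = idle_h * snapshot_gb * rates.storage_per_gb_hour
    # Each cycle is one snapshot write (suspend) plus one read (resume).
    operations = suspend_cycles * 2 * rates.snapshot_op
    return compute + storage + operations


rates = Rates(compute_per_hour=0.10, storage_per_gb_hour=0.0005, snapshot_op=0.001)
always_on = session_cost(1, 7, 4, 10, rates, suspend_when_idle=False)
suspended = session_cost(1, 7, 4, 10, rates, suspend_when_idle=True)
print(f"always running:      ${always_on:.3f}")
print(f"suspended when idle: ${suspended:.3f}")
```

With real rates substituted in, the same function shows the break-even point: if sessions suspend and resume very frequently, snapshot operations can eat into the savings.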

The real use case: one VM per user or session

The design intent is clear from the list of stated use cases:

AI / code execution sandboxes
Claude / agent sessions
Browser IDE
Jupyter / data analytics environments
CI/CD workers
Vulnerability scanning

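The common thread: one VM per user, per session, or per agent. Not one VM shared across users. The isolation guarantee (a separate kernel, memory, and disk for each tenant) is the point. A shared container does not give you that, and a regular Lambda execution environment, although it also runs on Firecracker, is reused across invocations rather than dedicated to one user or session.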

For AI agent workloads specifically, this solves a real problem. An agent that executes code on behalf of a user needs to run in a context where: the filesystem state persists across turns, the environment cannot bleed into another user’s session, and the agent can be paused between tasks without losing its working state. MicroVMs check all three boxes.
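One way to picture the one-VM-per-session pattern is a small allocator that launches a VM on first contact, suspends it after an idle period, and resumes it on the next request. This is a sketch under assumptions: `launch`, `suspend`, and `resume` are injected callables standing in for whatever lifecycle API you actually call, and the 300-second idle threshold is arbitrary.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Session:
    vm_id: str
    state: str        # "running" or "suspended"
    last_seen: float  # time of the last request, in seconds


class SessionPool:
    """Maps each session to its own VM; VMs are never shared."""

    def __init__(self, launch, suspend, resume, idle_after_s=300.0):
        # Hypothetical hooks: launch() -> vm_id, suspend(vm_id), resume(vm_id).
        self._launch, self._suspend, self._resume = launch, suspend, resume
        self._idle_after_s = idle_after_s
        self._sessions = {}

    def vm_for(self, session_id: str, now: float) -> str:
        """Return the session's VM, launching or resuming it as needed."""
        session = self._sessions.get(session_id)
        if session is None:
            session = Session(self._launch(), "running", now)
            self._sessions[session_id] = session
        elif session.state == "suspended":
            self._resume(session.vm_id)
            session.state = "running"
        session.last_seen = now
        return session.vm_id

    def suspend_idle(self, now: float) -> list[str]:
        """Suspend running VMs idle for at least the threshold."""
        suspended = []
        for session in self._sessions.values():
            idle = now - session.last_seen
            if session.state == "running" and idle >= self._idle_after_s:
                self._suspend(session.vm_id)
                session.state = "suspended"
                suspended.append(session.vm_id)
        return suspended


# Demo with fake lifecycle hooks that only record what was called.
counter = itertools.count(1)
calls = []
pool = SessionPool(
    launch=lambda: f"vm-{next(counter)}",
    suspend=lambda vm_id: calls.append(("suspend", vm_id)),
    resume=lambda vm_id: calls.append(("resume", vm_id)),
)
alice_vm = pool.vm_for("alice", now=0)
bob_vm = pool.vm_for("bob", now=200)
idle_vms = pool.suspend_idle(now=320)     # only alice has been idle long enough
alice_again = pool.vm_for("alice", now=400)  # resumes the same VM
print(alice_vm, bob_vm, idle_vms, alice_again, calls)
```

Passing the clock in as `now` rather than reading it inside the class keeps the idle logic deterministic and easy to test.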

Lambda MicroVMs vs Lambda Functions

Aspect	Lambda Functions	Lambda MicroVMs
Execution model	Stateless · event-driven · request-response	Stateful · long-running process per session
Max duration	15 min per invocation	8 hours total (extendable via suspend)
Isolation	Execution environment reused across invocations	Dedicated Firecracker VM · own kernel, memory & disk
State & lifecycle	Memory not preserved between invocations	Memory + disk preserved · suspend / resume via API · LAUNCH (from snapshot) → RUN (own HTTPS URL) → SUSPEND (storage only) → RESUME (near-instant) → TERMINATE (no charge)
Resources	Up to 6 vCPU · 10 GB RAM · 10 GB disk	Baseline 0.25–4 vCPU · 0.5–8 GB RAM; peak up to 16 vCPU · 32 GB RAM · 32 GB disk
Architecture	x86_64 or ARM64	ARM64 only (at launch)
Networking	Shared endpoint · HTTP + streaming	Unique HTTPS URL per VM · HTTP/1.1, HTTP/2, gRPC, WebSockets, SSE · JWE auth required
Scaling	Automatic · thousands of concurrent envs	One VM per user / job · explicit launch
Pricing model	Requests + GB-seconds · free tier included	Baseline compute while running + peak usage + snapshot ops + storage; while suspended, storage billing only
Available regions	All AWS regions	us-east-1 · us-east-2 · us-west-2 · eu-west-1 · ap-northeast-1
Best for	APIs · event-driven · microservices · short jobs · webhooks	AI code sandboxes · agent sessions · browser IDE · Jupyter · CI/CD workers · multi-tenant SaaS

What this is not

Lambda MicroVMs are not a drop-in replacement for Lambda Functions. They require more operational involvement: you manage the lifecycle explicitly, you provision baseline capacity, and you design around the suspend/resume model.

They also do not scale automatically. You launch one VM per user or job. That is not a limitation — it is the model. Thousands of concurrent MicroVMs are supported, but you orchestrate the allocation, not AWS.

If your workload is stateless, event-driven, or bursty, Lambda Functions are still the right choice. MicroVMs are for the workloads that Lambda Functions were never designed to handle.

The launch is new and the pricing details are still being worked out in practice, but the primitive itself is solid. Firecracker-backed isolation with full lifecycle control, a unique endpoint per VM, and suspend/resume at the API level — that is a meaningful new building block for anyone building multi-tenant interactive applications.
