AWS just shipped something that reframes what “serverless” can mean. Lambda MicroVMs are not Lambda Functions with a bigger timeout. They are a fundamentally different primitive: stateful, VM-level isolated environments with an explicit lifecycle you control.
The key shift: instead of getting a recycled process for 15 minutes, you get a dedicated Firecracker microVM that lives up to 8 hours — and you decide when it starts, suspends, resumes, and terminates.
What Lambda MicroVMs actually are
Regular Lambda Functions are stateless by design. The runtime is reused between invocations, memory is gone when the function exits, and 15 minutes is a hard ceiling. That model is perfect for APIs and event-driven workloads. It is not good for anything that needs to maintain state across interactions.
Lambda MicroVMs flip this. Each MicroVM is:
- A dedicated Firecracker microVM with its own kernel, memory space and disk state
- Addressable via a unique HTTPS endpoint (per VM, not shared)
- Capable of running any Dockerfile-based application, not just a handler function
- Suspendable and resumable — the VM’s memory and disk are snapshotted and restored
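The lifecycle has three states: running, suspended, and terminated. You start a VM, suspend and resume it as often as needed, and terminate it when the work is done.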
Billing follows the lifecycle state: while running you pay for compute; while suspended you pay only for snapshot storage; once terminated the VM costs nothing.
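To make the lifecycle and its billing consequence concrete, here is a minimal sketch of it as a state machine. The states and billing rules come from the description above; the names `MicroVMState`, `ALLOWED`, `BILLED_FOR`, and `transition` are hypothetical and not part of any AWS SDK, and the assumption that a suspended VM can be terminated directly is mine.

```python
from enum import Enum


class MicroVMState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


# Hypothetical transition table: suspend, resume, terminate.
# ASSUMPTION: a suspended VM can be terminated without resuming first.
ALLOWED = {
    MicroVMState.RUNNING: {MicroVMState.SUSPENDED, MicroVMState.TERMINATED},
    MicroVMState.SUSPENDED: {MicroVMState.RUNNING, MicroVMState.TERMINATED},
    MicroVMState.TERMINATED: set(),
}

# What each state is billed for, as described in the article.
BILLED_FOR = {
    MicroVMState.RUNNING: "compute",
    MicroVMState.SUSPENDED: "snapshot storage",
    MicroVMState.TERMINATED: "nothing",
}


def transition(current: MicroVMState, target: MicroVMState) -> MicroVMState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

Terminated is modeled as final: once a VM is gone, nothing brings it back, which matches the idea that a new instance always starts from the snapshot.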
How to build with it
The setup is more explicit than standard Lambda, which makes sense given the stateful nature.
Step 1 — build your image
Describe your application in a Dockerfile. Zip the Dockerfile together with any dependencies and upload the archive to S3. This archive is the source for your base image.
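A rough sketch of the packaging step in Python, using only the standard library for the zip. The helper names `build_bundle` and `upload_bundle` are hypothetical, and placing the Dockerfile at the archive root is an assumption about the expected layout. The upload uses boto3's existing S3 `upload_file` call and needs boto3 plus AWS credentials.

```python
import zipfile
from pathlib import Path


def build_bundle(source_dir: str, bundle_path: str) -> list[str]:
    """Zip the Dockerfile and everything beside it; return the archived names."""
    root = Path(source_dir)
    if not (root / "Dockerfile").is_file():
        raise FileNotFoundError(f"no Dockerfile in {root}")
    out = Path(bundle_path).resolve()
    names = []
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives in source_dir.
            if path.is_file() and path.resolve() != out:
                arcname = path.relative_to(root).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names


def upload_bundle(bundle_path: str, bucket: str, key: str) -> None:
    """Upload the archive to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_bundle works without boto3

    boto3.client("s3").upload_file(bundle_path, bucket, key)
```

Returning the archived names makes it easy to check, before uploading, that the Dockerfile and dependencies actually made it into the bundle.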
Step 2 — create a MicroVM configuration
Lambda MicroVM reads the Dockerfile from S3, runs it, and takes a snapshot of the memory and disk state once the application is initialized. This snapshot becomes the golden image from which future instances start.
Step 3 — launch instances from the snapshot
Each run-microvm call starts an independent MicroVM from that snapshot. Startup is fast because there is no cold init: the VM resumes from the pre-baked state.
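The "golden image" idea is easier to see in a toy model. The sketch below is purely conceptual: `GoldenSnapshot` is a hypothetical class that uses deep copies of a dictionary to stand in for memory and disk state, and says nothing about how the real snapshot mechanism works. It only illustrates that every instance starts from the same baked state and then diverges independently.

```python
import copy


class GoldenSnapshot:
    """Hypothetical stand-in for the snapshot taken after initialization."""

    def __init__(self, initialized_state: dict):
        # Keep a private copy so later changes to the input cannot leak in.
        self._state = copy.deepcopy(initialized_state)

    def launch(self) -> dict:
        """Return an independent instance that starts from the baked state."""
        return copy.deepcopy(self._state)


snapshot = GoldenSnapshot({"packages": ["numpy"], "files": {}})
vm_a = snapshot.launch()
vm_b = snapshot.launch()

# A write inside one instance is invisible to the other and to the snapshot.
vm_a["files"]["notes.txt"] = "only in A"
```

After the last line, `vm_b` and any later `snapshot.launch()` still see an empty `files` mapping, while both instances already have the pre-installed package list.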
Step 4 — route traffic to the VM’s endpoint
Every MicroVM gets its own HTTPS URL. You route your user or session to it. Authentication is mandatory: the endpoint requires JWE tokens.
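As an illustration of per-session routing, the sketch below keeps a map from session to endpoint and builds an authenticated request with the standard library. Several things here are assumptions: the endpoint URL is a placeholder, `ENDPOINTS` and `build_request` are hypothetical names, and sending the token as an `Authorization: Bearer` header is a guess at the transport, since the text above only says that JWE tokens are required.

```python
import urllib.request


def build_request(endpoint_url: str, token: str, path: str = "/") -> urllib.request.Request:
    """Build (but do not send) a request to one MicroVM's endpoint."""
    if not endpoint_url.startswith("https://"):
        raise ValueError("expected an HTTPS endpoint")
    url = endpoint_url.rstrip("/") + "/" + path.lstrip("/")
    # ASSUMPTION: the JWE token travels as a bearer credential.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical routing table: one endpoint per session, never shared.
ENDPOINTS = {"session-42": "https://vm-42.example.invalid"}

req = build_request(ENDPOINTS["session-42"], token="<jwe-token>", path="/health")
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that because the placeholder host does not resolve.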
Resources and constraints
| Resource | Baseline | Peak |
|---|---|---|
| RAM | 0.5–8 GB | up to 32 GB |
| vCPU | 0.25–4 | up to 16 |
| Disk | — | up to 32 GB |
| Duration | — | 8 hours |
Architecture is ARM64 only at launch. Regions at GA: us-east-1, us-east-2, us-west-2, eu-west-1, ap-northeast-1.
Networking supports HTTP/1.1, HTTP/2, gRPC, WebSockets, and SSE. Internet egress is available by default; VPC egress is possible through a network connector.
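If you generate configurations programmatically, it can help to check them against these limits before calling the service. The sketch below encodes the numbers from the table above; `MicroVMSpec` and `validate` are hypothetical names, and the rule that peak values may not be lower than baseline values is my assumption rather than something the table states.

```python
from dataclasses import dataclass

# Limits copied from the resources table.
BASELINE_RAM_GB = (0.5, 8)
BASELINE_VCPU = (0.25, 4)
PEAK_RAM_GB = 32
PEAK_VCPU = 16
MAX_DISK_GB = 32
MAX_HOURS = 8


@dataclass
class MicroVMSpec:
    baseline_ram_gb: float
    baseline_vcpu: float
    peak_ram_gb: float
    peak_vcpu: float
    disk_gb: float
    max_hours: float


def validate(spec: MicroVMSpec) -> list[str]:
    """Return a list of problems; an empty list means the spec fits."""
    problems = []
    lo, hi = BASELINE_RAM_GB
    if not lo <= spec.baseline_ram_gb <= hi:
        problems.append(f"baseline RAM must be {lo}-{hi} GB")
    lo, hi = BASELINE_VCPU
    if not lo <= spec.baseline_vcpu <= hi:
        problems.append(f"baseline vCPU must be {lo}-{hi}")
    # ASSUMPTION: peak values may not be lower than baseline values.
    if not spec.baseline_ram_gb <= spec.peak_ram_gb <= PEAK_RAM_GB:
        problems.append(f"peak RAM must be between baseline and {PEAK_RAM_GB} GB")
    if not spec.baseline_vcpu <= spec.peak_vcpu <= PEAK_VCPU:
        problems.append(f"peak vCPU must be between baseline and {PEAK_VCPU}")
    if not 0 < spec.disk_gb <= MAX_DISK_GB:
        problems.append(f"disk must be at most {MAX_DISK_GB} GB")
    if not 0 < spec.max_hours <= MAX_HOURS:
        problems.append(f"duration must be at most {MAX_HOURS} hours")
    return problems
```

Collecting every problem instead of failing on the first one gives a caller the full picture in a single pass.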
What you pay for
- Baseline compute while the VM is running
- Peak usage above baseline (billed separately)
- Snapshot read/write operations
- Snapshot storage
- Data transfer
The suspend model is the key cost lever. A suspended VM costs only the snapshot storage. This makes the pattern viable for long-lived interactive sessions that have significant idle time between interactions — exactly the pattern of a user working in an IDE or chatting with an AI agent.
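A back-of-the-envelope model shows why suspend matters. The rates below are made-up placeholder units, not AWS prices, and `session_cost` is a hypothetical helper. The model deliberately ignores peak usage, snapshot operations, and data transfer to isolate the effect of idle time.

```python
def session_cost(running_hours: int, suspended_hours: int,
                 compute_rate: int, storage_rate: int) -> int:
    """Cost of one session: compute while running, storage while suspended."""
    return running_hours * compute_rate + suspended_hours * storage_rate


# PLACEHOLDER rates in arbitrary units per hour; not real pricing.
COMPUTE_RATE = 100
STORAGE_RATE = 1

# An 8-hour session in which the user is active for only 1 hour.
always_running = session_cost(8, 0, COMPUTE_RATE, STORAGE_RATE)
suspend_when_idle = session_cost(1, 7, COMPUTE_RATE, STORAGE_RATE)
```

With these placeholder rates the always-running session costs 800 units and the suspend-when-idle session costs 107. The real ratio depends entirely on actual pricing and on the snapshot operation charges this sketch leaves out.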
The real use case: one VM per user or session
The design intent is clear from the list of stated use cases:
- AI / code execution sandboxes
- Claude / agent sessions
- Browser IDE
- Jupyter / data analytics environments
- CI/CD workers
- Vulnerability scanning
The common thread: one VM per user, per session, or per agent. Not one VM shared across users. The isolation guarantee (separate kernel, memory, and disk) is the point. You cannot get that per-user guarantee from a shared container or from a regular Lambda environment that is reused between invocations.
For AI agent workloads specifically, this solves a real problem. An agent that executes code on behalf of a user needs to run in a context where: the filesystem state persists across turns, the environment cannot bleed into another user’s session, and the agent can be paused between tasks without losing its working state. MicroVMs check all three boxes.
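The one-VM-per-session pattern for agents can be sketched as a small registry that resumes a session's VM at the start of each turn and suspends it at the end. Everything here is hypothetical: `SessionVMs` is an in-memory model, and the `launch` callable stands in for whatever real call starts a MicroVM. State changes are only recorded, not sent to any service.

```python
from typing import Callable


class SessionVMs:
    """Tracks one MicroVM per session; a VM is never shared across sessions."""

    def __init__(self, launch: Callable[[str], str]):
        self._launch = launch  # hypothetical: session id -> new VM id
        self._vms: dict[str, dict] = {}

    def begin_turn(self, session_id: str) -> str:
        """Launch the session's VM on first use, otherwise mark it resumed."""
        vm = self._vms.get(session_id)
        if vm is None:
            vm = {"vm_id": self._launch(session_id), "state": "running"}
            self._vms[session_id] = vm
        else:
            vm["state"] = "running"
        return vm["vm_id"]

    def end_turn(self, session_id: str) -> None:
        """Mark the VM suspended so idle time is billed as storage only."""
        self._vms[session_id]["state"] = "suspended"

    def close(self, session_id: str) -> None:
        """Forget the session; the real system would terminate the VM here."""
        self._vms.pop(session_id)

    def state_of(self, session_id: str) -> str:
        vm = self._vms.get(session_id)
        return vm["state"] if vm else "terminated"
```

Because the registry is keyed by session, a second turn from the same user lands on the same VM with its filesystem intact, and a different user can never be handed that VM.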
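Lambda MicroVMs vs Lambda Functions
| | Lambda MicroVMs | Lambda Functions |
|---|---|---|
| State | Stateful; memory and disk survive suspend and resume | Stateless; memory is gone when the function exits |
| Maximum duration | 8 hours | 15 minutes |
| Environment | Dedicated Firecracker microVM per instance | Runtime reused between invocations |
| Packaging | Any Dockerfile-based application | Handler function |
| Scaling | You launch and allocate VMs yourself | Automatic |

MicroVM resources: peak up to 16 vCPU · 32 GB RAM · 32 GB disk
MicroVM billing: baseline compute + peak usage + snapshot ops + storage
Suspend → storage billing only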
What this is not
Lambda MicroVMs are not a drop-in replacement for Lambda Functions. They require more operational involvement: you manage the lifecycle explicitly, you provision baseline capacity, and you design around the suspend/resume model.
They also do not scale automatically. You launch one VM per user or job. That is not a limitation — it is the model. Thousands of concurrent MicroVMs are supported, but you orchestrate the allocation, not AWS.
If your workload is stateless, event-driven, or bursty, Lambda Functions are still the right choice. MicroVMs are for the workloads that Lambda Functions were never designed to handle.
The launch is new and the pricing details are still being worked out in practice, but the primitive itself is solid. Firecracker-backed isolation with full lifecycle control, a unique endpoint per VM, and suspend/resume at the API level — that is a meaningful new building block for anyone building multi-tenant interactive applications.