node-vmm: VM isolation that feels like spawning a process

By Misael Zapata. Category: Engineering Stories

2026-05-05 2026-06-29 854 words 5 minutes

How I built a Node.js SDK that boots native microVMs in seconds and resumes paused ones in milliseconds, with no QEMU and no Docker daemon.

1 The comfortable lie about containers

There is a polite fiction in modern development that everyone accepts because it is convenient: we pretend Docker containers are fast and lightweight. And they are, if your reference point is provisioning bare metal in 2005. But as my tools needed to isolate workloads more frequently — and more dynamically — I started feeling the drag of depending on a giant external engine sitting between me and what I actually wanted.

What I wanted was something that felt ergonomically like calling child_process.spawn() in Node.js, but delivered the hardware isolation of a full virtual machine. No containers sharing the host kernel behind namespaces. I wanted a real kernel, or at least a micro-kernel, running in isolation. And more than anything, I wanted pause and resume to be fast enough that an HTTP API inside the VM could respond to a client and feel like plain network latency. No “thawing” delays.

That’s where node-vmm came from. And like these things tend to go, the idea was simple right up until I ran face-first into the reality of native hypervisors.

2 Skipping QEMU the hard way

Almost every project that needs cross-platform virtualization reaches for QEMU by instinct. It’s the Swiss Army knife. But QEMU is big. Wrapping QEMU from Node would have solved the cross-platform problem in a week — and completely killed the latency and weight goals.

So instead I built it by talking directly to each operating system’s native hypervisor API, using C++ bound to Node through N-API. The result is an architecture split into three completely separate worlds that share no code at the bottom layer:

Linux: Direct ioctl calls into KVM. One file with a certain raw elegance — native/kvm/backend.cc.
Windows: The Windows Hypervisor Platform (WHP). This one was genuinely painful. WHP hands you a bare virtual CPU and says good luck assembling the motherboard, so I had to emulate APIC, timers, and UART ports from scratch.
macOS / Apple Silicon: Hypervisor.framework (HVF). After several late nights I realized the cleanest path was not pretending to be x86 but using an ARM64 machine profile (virt-based) to keep things fast and native.
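The three-way split above can be pictured as a small dispatch step. This is only a sketch: the Backend type and the pickBackend function are hypothetical names, not node-vmm's API, and rejecting Intel Macs is an assumption based on the text describing only the ARM64 path.

```typescript
// Hypothetical names for illustration; not taken from node-vmm.
type Backend = "kvm" | "whp" | "hvf";

// Maps a Node.js platform/arch pair to the hypervisor backend named in the list above.
function pickBackend(platform: string, arch: string): Backend {
  switch (platform) {
    case "linux":
      return "kvm"; // direct ioctl calls into KVM
    case "win32":
      return "whp"; // Windows Hypervisor Platform
    case "darwin":
      // Assumption: only Apple Silicon is supported, since the text
      // describes an ARM64 machine profile for Hypervisor.framework.
      if (arch === "arm64") return "hvf";
      throw new Error(`unsupported macOS architecture: ${arch}`);
    default:
      throw new Error(`unsupported platform: ${platform}`);
  }
}
```

In a Node.js program the arguments would typically be process.platform and process.arch.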

Hiding all three behind a single TypeScript interface, NativeRunConfig, was not trivial. The MMIO and interrupt layout for Virtio devices was where things got messy. On KVM the address stride between devices is a clean 0x1000. On Windows I had to pack them tighter, at 0x200, to avoid overlapping the ACPI tables. Eventually the abstraction held, and a developer importing the library never has to think about any of it.
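To make the stride difference concrete, here is a minimal sketch of how per-backend device windows could be computed. Only the two stride values come from the text. The base address, the MmioLayout type, and the helper names are placeholders invented for illustration.

```typescript
// Hypothetical layout description; only the strides are from the article.
interface MmioLayout {
  base: number;   // placeholder base address, NOT node-vmm's real value
  stride: number; // bytes reserved per Virtio MMIO device
}

interface MmioWindow {
  start: number;
  end: number; // exclusive
}

const KVM_LAYOUT: MmioLayout = { base: 0xd0000000, stride: 0x1000 };
const WHP_LAYOUT: MmioLayout = { base: 0xd0000000, stride: 0x200 };

// Address window of the device at the given slot index.
function deviceWindow(layout: MmioLayout, index: number): MmioWindow {
  const start = layout.base + index * layout.stride;
  return { start, end: start + layout.stride };
}

// True when two half-open address ranges intersect, e.g. a device
// window and a region reserved for firmware tables.
function overlaps(a: MmioWindow, b: MmioWindow): boolean {
  return a.start < b.end && b.start < a.end;
}
```

With the same base, eight devices span 0x8000 bytes at the KVM stride but only 0x1000 bytes at the tighter one, which is the kind of saving that keeps a device block clear of a neighbouring reserved region.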

3 Firing the middleman: OCI without Docker Engine

Every modern tool needs to run images. The obvious move was bridging to the local Docker socket. That broke my rule about no heavy engines.

So I wrote oci.ts, a full OCI (Open Container Initiative) registry client in TypeScript. It parses manifests, negotiates tokens, and pulls tar.gz blobs layer by layer, injecting them directly into an ext4 rootfs the VM can mount on the fly. Booting node:22-alpine by decoding the image locally and mounting it without touching dockerd changes what “instant” means in practice. On platforms where mkfs.ext4 is not installed by default, node-vmm falls back to WSL2 or Homebrew without breaking the flow.
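Two of those steps, token negotiation and manifest selection, are small enough to sketch as pure functions. This is not the code in oci.ts; the function and type names are hypothetical. It assumes the registry answers an unauthenticated request with a WWW-Authenticate: Bearer challenge, as Docker Hub does, and that the tag resolves to an image index with one entry per platform.

```typescript
interface BearerChallenge {
  realm: string;
  service?: string;
  scope?: string;
}

// Parses a header value such as:
//   Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="..."
function parseBearerChallenge(header: string): BearerChallenge {
  if (!/^Bearer\s/i.test(header)) throw new Error("not a Bearer challenge");
  const params: Record<string, string> = {};
  const re = /(\w+)="([^"]*)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(header)) !== null) params[m[1]] = m[2];
  if (!params.realm) throw new Error("challenge has no realm");
  return { realm: params.realm, service: params.service, scope: params.scope };
}

// Builds the URL to GET in order to obtain a pull token.
function tokenUrl(c: BearerChallenge): string {
  const query: string[] = [];
  if (c.service) query.push(`service=${encodeURIComponent(c.service)}`);
  if (c.scope) query.push(`scope=${encodeURIComponent(c.scope)}`);
  return query.length > 0 ? `${c.realm}?${query.join("&")}` : c.realm;
}

// One entry of an OCI image index ("manifests" array).
interface IndexEntry {
  mediaType: string;
  digest: string;
  platform?: { os: string; architecture: string };
}

// Picks the digest of the manifest matching the guest's OS and architecture.
function selectManifest(entries: IndexEntry[], os: string, architecture: string): string {
  const hit = entries.find(
    (e) => e.platform?.os === os && e.platform?.architecture === architecture,
  );
  if (!hit) throw new Error(`no manifest for ${os}/${architecture}`);
  return hit.digest;
}
```

The selected digest is then fetched as a manifest, and each layer digest it lists is downloaded as a blob and unpacked in order into the root filesystem.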

4 The aha moment: SharedArrayBuffer as physics-defying glue

Cold boot times of 1–3 seconds were fine. My actual obsession was getting paused processes to resume within network-request latency — sub-100 ms.

The traditional approach to VM pause/resume is to freeze the CPU, serialize RAM and interrupt state to disk, and restore everything on wake. That’s too slow by orders of magnitude for what I needed. The other option was message passing between the main Node.js thread and the Worker thread managing the hypervisor. The problem: Node’s message-passing bridge goes through the event loop, which introduces latency jitter and stalls.

The insight came from thinking about how modern game engines render: SharedArrayBuffer combined with Atomics.

I implemented a tiny structured buffer with slots for CONTROL_COMMAND, CONTROL_STATE, and the console — something both the TypeScript main thread and the C++ Worker thread can read atomically without expensive locks or any message serialization through V8.

When I want to pause a VM, the TS thread atomically writes a 1 into the buffer. The KVM/WHP Worker, during one of its microscopic VM-exits, checks that shared byte and simply stops vCPU execution — without tearing down the machine’s memory infrastructure. The VM is not on disk. It is still alive in the hypervisor, just asleep, burning no cycles.
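The handshake can be sketched in TypeScript alone. The slot names CONTROL_COMMAND and CONTROL_STATE and the pause value 1 come from the text. The slot indices, the other constants, and the function names are assumptions, and onVmExit stands in for the check that the real C++ worker performs natively.

```typescript
// Slot indices into the shared Int32Array (assumed layout).
const CONTROL_COMMAND = 0;
const CONTROL_STATE = 1;

// Command and state values; only "pause = 1" is stated in the article.
const CMD_NONE = 0;
const CMD_PAUSE = 1;
const CMD_RESUME = 2;
const STATE_RUNNING = 0;
const STATE_PAUSED = 1;

// One buffer shared by the main thread and the hypervisor worker.
function createControl(): Int32Array {
  return new Int32Array(new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT));
}

// Main thread: publish a command without any message serialization.
function requestPause(ctl: Int32Array): void {
  Atomics.store(ctl, CONTROL_COMMAND, CMD_PAUSE);
  Atomics.notify(ctl, CONTROL_COMMAND); // wakes a worker blocked in Atomics.wait
}

function requestResume(ctl: Int32Array): void {
  Atomics.store(ctl, CONTROL_COMMAND, CMD_RESUME);
  Atomics.notify(ctl, CONTROL_COMMAND);
}

// Worker side: called on each VM exit. Consumes any pending command,
// updates the state slot, and reports whether the vCPU should keep running.
function onVmExit(ctl: Int32Array): boolean {
  const cmd = Atomics.exchange(ctl, CONTROL_COMMAND, CMD_NONE);
  if (cmd === CMD_PAUSE) Atomics.store(ctl, CONTROL_STATE, STATE_PAUSED);
  else if (cmd === CMD_RESUME) Atomics.store(ctl, CONTROL_STATE, STATE_RUNNING);
  return Atomics.load(ctl, CONTROL_STATE) === STATE_RUNNING;
}
```

While paused, a JavaScript worker could block on Atomics.wait(ctl, CONTROL_COMMAND, CMD_NONE) instead of spinning, which returns as soon as the main thread stores a new command and calls Atomics.notify. Guest memory is never touched in either direction, which is why resuming costs so little.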

Fastify or Express servers inside the VM resume and resolve a pending GET / in 5 to 50 milliseconds. It is a modest trick compared to what a commercial hypervisor does. The ergonomic payoff for spinning up isolated environments is anything but modest.

The project is not finished. Full RAM restore from cold snapshot is still something I am chasing with dirty-page tracking, which I have already wired into the foundations of the code. But so far I have gotten exactly what I set out to build: the uncompromising isolation of a real virtual machine, hidden inside something that looks, operates, and dies as easily as one more process in my terminal.