SHEET L / 04

Quarantine Sandbox

The cloud-isolated aquarium — the blueprint thinks it owns the machine.

[Figure, Sheet L / 03: Quarantine Sandbox (the aquarium), sequence diagram. Participants: Clean Room (L3 dispatcher), Tier Selector (capability-based), Host (isolated VM), VM A (decoy set A), VM B (decoy set B), Observer (deterministic), Narrator (LLM, advisory), Audit Log (signed evidence). Flow: submission and tier hint; tier selection (wasmtime / cf / fc); instance request; VM A and VM B each provisioned as a fresh aquarium; Run 1 executes against decoy set A (stub DNS/HTTP returning NXDOMAIN, fake filesystem, environment, and clock) and yields observation bundle A; Run 2 executes against decoy set B and yields observation bundle B; the Observer compares A vs B and auto-flags divergence; events are sent for advisory narration; both VMs are torn down; signed evidence (HMAC) is written to the audit log. No state survives teardown.]

Each run gets a fresh aquarium with a different decoy set. Behavioural divergence between runs is the signal. The LLM Narrator is advisory; the Observer decides.
MODEL

The aquarium model

The quarantine sandbox — internally called the aquarium — is an isolated execution environment where the blueprint believes it has full access to a real machine. In reality, every external resource is a fake: DNS resolves to NXDOMAIN via stub resolvers, HTTP requests hit mock endpoints, the filesystem contains synthetic decoy data, and even the system clock can be manipulated.

The blueprint’s code runs exactly as it would in production, but its environment is a carefully constructed illusion. This lets us observe what the blueprint tries to do — not just what its source code says it does.
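
To make the illusion concrete, here is a minimal sketch of a fake environment that answers every lookup with a stub and records what was asked. The `Aquarium` class, its method names, and the event tuples are hypothetical and chosen for illustration; they are not the sandbox's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class Aquarium:
    """Hypothetical fake environment: every external resource is a stub."""
    decoy_files: dict[str, str]        # path -> synthetic contents
    clock_offset_s: float = 0.0        # lets the harness skew the clock
    events: list[tuple[str, str]] = field(default_factory=list)

    def resolve(self, hostname: str) -> str:
        # Every DNS lookup is recorded and answered with NXDOMAIN.
        self.events.append(("dns", hostname))
        return "NXDOMAIN"

    def read_file(self, path: str) -> str:
        # Reads are recorded; only synthetic decoy data is ever returned.
        self.events.append(("fs_read", path))
        if path not in self.decoy_files:
            raise FileNotFoundError(path)
        return self.decoy_files[path]

    def now(self, real_time_s: float) -> float:
        # The blueprint sees a manipulated clock, not the host clock.
        self.events.append(("clock", "now"))
        return real_time_s + self.clock_offset_s
```

The recorded `events` list stands in for the observation data: it captures what the blueprint attempted, independent of what its source code claims.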

TIERS

Three tiers

Three quarantine tiers are defined, selected automatically based on the blueprint’s declared capabilities:

| Tier | Technology | When used | Isolation level |
| --- | --- | --- | --- |
| T0 | Wasmtime | Pure-compute blueprints with no I/O capabilities | Memory sandbox, no syscalls |
| T1 | CF Containers + Landlock | Blueprints with filesystem or limited network access | Namespace isolation, LSM constraints |
| T2 | Firecracker microVM | Blueprints with shell, broad network, or subagent capabilities | Full VM isolation, dedicated kernel |

Higher tiers provide stronger isolation at higher cost. The tier selector picks the cheapest tier that fully contains the blueprint’s declared capabilities. If a blueprint's declared capabilities span several tiers, it runs in the highest tier that any of them requires.
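
The selection rule can be sketched as taking the maximum over the minimum tier each capability needs. The capability names below are hypothetical labels for the categories in the tier table, and sending unknown capabilities to T2 is an assumption (fail closed), not documented behaviour.

```python
from enum import IntEnum


class Tier(IntEnum):
    T0 = 0  # Wasmtime
    T1 = 1  # CF Containers + Landlock
    T2 = 2  # Firecracker microVM


# Hypothetical capability names mapped to the cheapest tier that contains them.
MIN_TIER = {
    "compute": Tier.T0,
    "filesystem": Tier.T1,
    "network_limited": Tier.T1,
    "shell": Tier.T2,
    "network_broad": Tier.T2,
    "subagent": Tier.T2,
}


def select_tier(capabilities: set[str]) -> Tier:
    # The cheapest tier that contains every capability is the maximum of the
    # per-capability minimums. Unknown capabilities are assumed to need T2.
    return max((MIN_TIER.get(c, Tier.T2) for c in capabilities), default=Tier.T0)
```

For example, `select_tier({"compute", "filesystem"})` gives `Tier.T1`, while adding `"shell"` to the set promotes the blueprint to `Tier.T2`.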

COUNTERFACTUAL RUNS

Counterfactual runs

Each blueprint is executed twice, in separate aquariums with different decoy sets. Run A sees decoy set A (fake API keys, synthetic user data, mock endpoints). Run B sees decoy set B (different fake keys, different synthetic data, different mocks). The decoy sets are decorrelated — they share no values.
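
One way to obtain decorrelated decoy sets is to generate every value from fresh randomness and verify that the two sets are disjoint before provisioning. This is a sketch under that assumption; the field names, value formats, and helper functions are hypothetical.

```python
import secrets


def make_decoy_set(label: str) -> dict[str, str]:
    # Fresh random values on every call, so two sets share no values
    # (up to a negligible collision probability).
    return {
        "API_KEY": f"decoy-{label}-{secrets.token_hex(16)}",
        "USER_EMAIL": f"user-{secrets.token_hex(4)}@example.invalid",
        "ENDPOINT": f"https://mock-{secrets.token_hex(4)}.example.invalid/v1",
    }


def decorrelated(a: dict[str, str], b: dict[str, str]) -> bool:
    # True when the two decoy sets have no value in common.
    return set(a.values()).isdisjoint(b.values())


decoys_a = make_decoy_set("a")
decoys_b = make_decoy_set("b")
if not decorrelated(decoys_a, decoys_b):
    raise RuntimeError("decoy sets overlap; regenerate before provisioning")
```

Both sets expose the same keys, so the blueprint sees the same shape of environment in each run and only the values differ.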

The Execution Observer compares the observation bundles from both runs. A well-behaved blueprint should produce functionally equivalent outputs regardless of the decoy environment. If the blueprint’s behaviour diverges — for example, it exfiltrates data from one environment but not the other, or it produces different outputs depending on the API keys it finds — it is auto-flagged for human review.
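
A comparison of this kind has to ignore differences that come only from the decoy values themselves. The sketch below assumes each observation bundle is a list of event strings (a simplification) and replaces decoy values with their key names before comparing; `normalize` and `diverges` are hypothetical helpers, not the Observer's real interface.

```python
def normalize(events: list[str], decoys: dict[str, str]) -> list[str]:
    # Replace each decoy value with its key name, so runs that differ only
    # in which fake value they saw compare as equal.
    out = []
    for event in events:
        for name, value in decoys.items():
            event = event.replace(value, f"<{name}>")
        out.append(event)
    return out


def diverges(events_a: list[str], decoys_a: dict[str, str],
             events_b: list[str], decoys_b: dict[str, str]) -> bool:
    # Divergence means the normalized traces differ in content or order.
    return normalize(events_a, decoys_a) != normalize(events_b, decoys_b)


decoys_a = {"API_KEY": "key-aaa"}
decoys_b = {"API_KEY": "key-bbb"}
run_a = ["fs_read /app/.env", "http POST /v1/score auth=key-aaa"]
same_b = ["fs_read /app/.env", "http POST /v1/score auth=key-bbb"]
evasive_b = ["fs_read /app/.env"]  # the request is skipped in environment B
```

Here `diverges(run_a, decoys_a, same_b, decoys_b)` is `False`, while substituting `evasive_b` makes it `True`. Exact sequence equality is a strict choice; a blueprint with legitimately nondeterministic event ordering would need a more tolerant comparison.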

This counterfactual design catches adversaries who tailor their behaviour to specific environments (A8 — sandbox detection) and time-bomb payloads (A6) that activate based on environmental signals.

OBSERVATION

What the observer watches

| Event type | Captured data | Flag threshold |
| --- | --- | --- |
| Syscall trace | System calls, arguments, return values | Unexpected syscalls for declared capability set |
| Network | DNS queries, HTTP requests, raw sockets | Any egress attempt (all network is mocked) |
| Filesystem | Reads, writes, permission changes | Access outside declared paths |
| Process | Spawned subprocesses, signals | Undeclared child processes |
| Clock | Time queries, sleep calls | Excessive sleeps (delay-based evasion) |
| Entropy | Random number generation | Crypto operations not matching declared capabilities |
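
Four of these thresholds (network, filesystem, process, clock) can be expressed as simple deterministic rules, sketched below. The `Event` shape, the `should_flag` function, and the 30-second sleep limit are hypothetical; the syscall and entropy rows are omitted because they depend on a capability-to-syscall mapping that this page does not specify.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Event:
    kind: str    # "network", "filesystem", "process", or "sleep"
    detail: str  # hostname, path, command name, or sleep length in seconds


def should_flag(event: Event, declared_paths: list[str],
                declared_procs: set[str], max_sleep_s: float = 30.0) -> bool:
    if event.kind == "network":
        # All network is mocked, so any egress attempt is flagged.
        return True
    if event.kind == "filesystem":
        # Collapse ".." segments first so traversal cannot hide behind a
        # declared prefix, then check containment component by component.
        path = PurePosixPath(posixpath.normpath(event.detail))
        return not any(path.is_relative_to(root) for root in declared_paths)
    if event.kind == "process":
        return event.detail not in declared_procs
    if event.kind == "sleep":
        # The limit is an assumed placeholder for "excessive".
        return float(event.detail) > max_sleep_s
    raise ValueError(f"no rule for event kind {event.kind!r}")
```

With `declared_paths=["/app/data"]`, a read of `/app/data/in.csv` passes, while `/app/data/../../etc/passwd` is flagged because it normalizes to `/etc/passwd`.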