Every CI runner starts a job with a full environment. Every variable set earlier in the workflow, every credential configured at the repo level, every secret the job was given — all of it is in the environment of every process that runs. A compromised npm package in step 5 inherits the AWS credentials configured in step 1.
Authority in CI pipelines is ambient by default. Every process can see everything unless you actively prevent it.
CellOS is built around the opposite assumption: no ambient authority. Every unit of execution declares exactly what it needs. Everything else is withheld.
The execution cell
The core primitive is the execution cell — a least-authority compute unit defined by a spec. The spec is a contract: it declares what the cell is allowed to do before it starts running.
Consider a cell whose spec declares one secret and one egress destination. That cell gets exactly one secret. It can reach one host on one port. Everything else (other secrets, arbitrary network access, the host filesystem) is not available. The contract is declared in advance, enforced at exec time, and recorded in the audit trail.
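The contract described above can be modeled roughly as follows. This is a minimal sketch in Rust rather than in the spec's own format. Only `secret_refs` mirrors a name from the text (`secretRefs`); the struct names, the `egress` field, the secret name `NPM_TOKEN`, and the registry host are hypothetical and chosen for illustration.

```rust
// Hypothetical model of a cell's authority contract. Only `secret_refs`
// mirrors a name from the text (secretRefs); everything else is illustrative.
#[derive(Debug, Clone, PartialEq)]
struct EgressRule {
    host: String,
    port: u16,
}

#[derive(Debug, Clone, Default)]
struct Authority {
    secret_refs: Vec<String>,
    egress: Vec<EgressRule>,
}

impl Authority {
    /// A secret is available only if it was declared by name.
    fn allows_secret(&self, name: &str) -> bool {
        self.secret_refs.iter().any(|s| s == name)
    }

    /// Egress is allowed only to an exactly matching host and port.
    fn allows_egress(&self, host: &str, port: u16) -> bool {
        self.egress.iter().any(|r| r.host == host && r.port == port)
    }
}

/// Builds a one-secret, one-destination contract (values are assumptions).
fn example_authority() -> Authority {
    Authority {
        secret_refs: vec!["NPM_TOKEN".to_string()],
        egress: vec![EgressRule {
            host: "registry.npmjs.org".to_string(),
            port: 443,
        }],
    }
}
```

The point of the shape is that the default answer is "no": anything not listed in the contract is denied without needing a separate deny rule.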
What ambient authority looks like in practice
Consider a standard GitHub Actions workflow in which an early step configures AWS credentials and a later step runs npm run build.
This is the secret spray problem. npm run build inherits the AWS credentials configured two steps earlier. A compromised postinstall hook finds them in the environment and has everything it needs to exfiltrate them.
With a cell spec, the supervisor calls env_clear() before execve. The only key available to npm is the one explicitly listed in secretRefs: a short-lived OIDC token scoped to the declared audience. AWS_ACCESS_KEY_ID is never in the child's environment.
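The clear-then-inject pattern can be sketched with Rust's standard `std::process::Command`, whose `env_clear()` drops every inherited variable before the child is spawned. This is not the CellOS supervisor itself. The function name `run_with_declared_env` and the variable name `NPM_TOKEN` are assumptions, and the program is passed as an absolute path because the child has no `PATH` once its environment is cleared.

```rust
use std::collections::BTreeMap;
use std::io;
use std::process::Command;

/// Runs `program` with an environment containing only the declared variables.
/// Nothing from the parent's environment is inherited. Returns the child's stdout.
fn run_with_declared_env(
    program: &str,
    args: &[&str],
    declared: &BTreeMap<String, String>,
) -> io::Result<String> {
    let output = Command::new(program)
        .args(args)
        .env_clear() // drop every variable inherited from the parent
        .envs(declared) // add back only what was explicitly declared
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling it with `/usr/bin/env` as the program and a map holding only `NPM_TOKEN` prints that single variable, even if the parent process has `AWS_ACCESS_KEY_ID` set.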
Cross-run contamination
Shared runners accumulate state. A pull request job writes a malicious file to /tmp. The next job — a release build from main — runs on the same host and sources cached tooling from /tmp. This is a real attack class with a GitHub advisory behind it.
On the hardened Linux path, the supervisor mounts a fresh empty tmpfs over the cell's working directory before execve. The previous run's workspace data is not visible. When the cell exits, the mount namespace is destroyed.
The isolation is proven with a cross-run test: run A writes a sentinel file; run B asserts it is absent. The test runs in CI.
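The shape of that sentinel test can be sketched as below. It is a simplified stand-in, not the project's actual test: a plain directory that is recreated per run takes the place of the tmpfs mount and mount namespace, so it runs without privileges. All function names here are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Creates a fresh, empty workspace at a fixed path under `base`.
/// A plain directory stands in for the per-cell tmpfs (an assumption).
fn fresh_workspace(base: &Path) -> io::Result<PathBuf> {
    let dir = base.join("workspace");
    if dir.exists() {
        fs::remove_dir_all(&dir)?;
    }
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Discards the workspace, standing in for mount-namespace teardown.
fn destroy_workspace(dir: &Path) -> io::Result<()> {
    fs::remove_dir_all(dir)
}

/// Run A writes a sentinel; run B starts at the same path and reports
/// whether it can see the sentinel or any other leftover entry.
fn sentinel_leaks_across_runs(base: &Path) -> io::Result<bool> {
    let run_a = fresh_workspace(base)?;
    fs::write(run_a.join("sentinel"), b"written by run A")?;
    destroy_workspace(&run_a)?;

    let run_b = fresh_workspace(base)?;
    let sentinel_visible = run_b.join("sentinel").exists();
    let workspace_empty = fs::read_dir(&run_b)?.next().is_none();
    destroy_workspace(&run_b)?;

    Ok(sentinel_visible || !workspace_empty)
}
```

A test then asserts that `sentinel_leaks_across_runs` returns `false`. On the hardened path the same assertion would be made against a workspace backed by a real tmpfs mount.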
Destruction that means something
Most ephemeral compute cleans up on a best-effort basis. CellOS defines what destroyed means precisely, per layer:
| Layer | Destroyed means | How it's proven |
|---|---|---|
| Process tree | SIGKILL; no orphan supervisors retaining capabilities | Supervisor exit code + lifecycle events |
| Secrets | TTL at broker; broker-side materialized secrets revoked after teardown | residue.rs: broker empty after destroy+revoke; two-cell isolation test |
| Filesystem (workspace) | Cell-private tmpfs discarded when mount namespace exits | supervisor_linux_private_workspace.rs: run B gets empty workspace |
| Network | Private net namespace and nft rules removed; cell's network identity does not persist | supervisor_linux_network_policy.rs: no host-loopback reachability from child |
| Audit | Teardown event emitted; final residue class recorded (none / documented exception) | CloudEvents lifecycle trail; JSONL sink |
Observable execution
Every cell emits structured CloudEvents over its lifecycle. The events flow to NATS JetStream or a JSONL file for SIEM ingestion.
| Event | When emitted | Key fields |
|---|---|---|
| cell.identity.v1.materialized | OIDC or secret broker resolved successfully | run_id, identity type, audience |
| cell.command.v1.started | Child process spawned | run_id, argv, working_dir |
| cell.network_policy.v1.applied | Network namespace and egress rules configured | run_id, egress_rules, netns_active |
| cell.export.v2.completed | Artifacts exported to declared sink | run_id, destination, bytes |
| cell.command.v1.completed | Child process exited | run_id, exit_code, duration_ms |
| cell.teardown.v1.completed | All resources destroyed | run_id, residue_class |
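
As an illustration of what one line in the JSONL sink might look like, here is a minimal sketch that renders a teardown event using only the standard library. The envelope uses the CloudEvents 1.0 required attributes (`specversion`, `id`, `source`, `type`). The `source` value, the placement of `run_id` and `residue_class` under `data`, and the function names are assumptions, not the project's actual schema.

```rust
use std::io::{self, Write};

/// Escapes the characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders a teardown event as one JSON object on a single line.
/// `cellos/supervisor` is a hypothetical source identifier.
fn teardown_event_line(event_id: &str, run_id: &str, residue_class: &str) -> String {
    format!(
        "{{\"specversion\":\"1.0\",\"id\":\"{}\",\"source\":\"{}\",\"type\":\"{}\",\"data\":{{\"run_id\":\"{}\",\"residue_class\":\"{}\"}}}}",
        json_escape(event_id),
        "cellos/supervisor",
        "cell.teardown.v1.completed",
        json_escape(run_id),
        json_escape(residue_class),
    )
}

/// Appends one event line to a JSONL sink (a file, a buffer, or stdout).
fn write_jsonl<W: Write>(sink: &mut W, line: &str) -> io::Result<()> {
    sink.write_all(line.as_bytes())?;
    sink.write_all(b"\n")
}
```

One object per line keeps the file appendable and lets a SIEM ingest it line by line without parsing the whole file.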
CellOS runs in one of two modes. Standard mode emits events but may not enforce network isolation at the kernel level. Hardened mode requires CELLOS_SUBPROCESS_UNSHARE to include both net and mnt, and enforces the full claim. The enforcement contract in the docs lists exactly which conditions are required for each capability.
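A check for the hardened-mode precondition might look like the sketch below. The comma-separated format of `CELLOS_SUBPROCESS_UNSHARE` is an assumption, as are both function names; the enforcement contract in the docs is the authority on the real syntax.

```rust
/// Returns true when the unshare list names both `net` and `mnt`.
/// Assumes a comma-separated value such as "net,mnt" (not confirmed by the docs).
fn hardened_requested(unshare_value: &str) -> bool {
    let mut has_net = false;
    let mut has_mnt = false;
    for item in unshare_value.split(',') {
        match item.trim() {
            "net" => has_net = true,
            "mnt" => has_mnt = true,
            _ => {} // other namespaces do not affect this check
        }
    }
    has_net && has_mnt
}

/// Reads the variable from the process environment; unset means not hardened.
fn hardened_from_env() -> bool {
    std::env::var("CELLOS_SUBPROCESS_UNSHARE")
        .map(|v| hardened_requested(&v))
        .unwrap_or(false)
}
```

Treating an unset or partial value as "not hardened" keeps the failure mode conservative: the stronger claim is only made when both namespaces are explicitly requested.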
Scored by Claude Code
This assessment was independently reviewed and scored by Claude Code against the cellos-lite codebase, test suite, docs/guarantee-matrix.md, and docs/break-attempts.md.
| Property | Score | Evidence |
|---|---|---|
| No ambient secrets in child process | 9/10 | supervisor_no_ambient_env.rs: proves host env vars not in secretRefs do not appear in child; CELLOS_SECRET_* carrier prefix also absent |
| Cross-run workspace isolation | 8/10 | supervisor_linux_private_workspace.rs: run A sentinel absent in run B; workspace_is_empty_on_start confirmed on hardened path |
| Network containment | 7/10 | Private netns + best-effort nft on hardened path; nftRulesApplied=false is possible if nft unavailable — docs are explicit about this bound |
| Destruction semantics | 8/10 | residue.rs covers host+broker empty after destroy; two-cell isolation; idempotent revoke_for_cell |
| Observability | 9/10 | Structured CloudEvents with stable schema on every lifecycle event; JetStream + JSONL sinks; correlation ID across events |
| Authority contract as first-class spec | 9/10 | spec.authority is declared before execution; validated against JSON Schema in CI; not a flag bag added after the fact |
The governance loop
CellOS is the execution layer in a closed loop:
| Layer | Role |
|---|---|
| taudit | Scans pipeline YAML, builds the authority graph, flags where privilege leaks across trust boundaries |
| tsafe | Constrains secrets to the specific steps that need them; exec injects only what was declared |
| CellOS | Enforces execution — the cell gets what the spec authorised, nothing else; teardown removes every trace |
taudit findings route to the right tool. Scope findings carry a TsafeRemediation. Isolation findings carry a CellosRemediation. The loop closes: detect, constrain, isolate, observe again.
Current state
cellos-lite enforces per-run isolation and authority scoping at the process and Linux namespace level, with auditable lifecycle and event emission. The authority bundle is a first-class contract in the spec. MicroVM-class isolation (Firecracker) is the next milestone; the semantic layer is stable now and carries forward regardless of the isolation primitive underneath it.