Secure execution library · MCP-native · Apache-2.0

Isolated. Audited. Polyglot.
Your agent's execution engine.

sandboxmcp is a modular Python library for secure, isolated code execution. Run Python, Node.js, and shell scripts in sandboxed environments with zero-egress networking, a secret vault, artifact signing, and immutable audit logs.


secure_exec.py

import asyncio

from sandboxmcp import SandboxFactory, ExecutionRequest
from sandboxmcp.mcp_server import SandboxMCPServer

sandbox = SandboxFactory.default()

async def main():
    # execute() is a coroutine, so it has to be awaited inside an async function.
    result = await sandbox.execute(ExecutionRequest(
        code="print('Hello from the sandbox!')",
        language="python",
    ))
    print(result.stdout)   # "Hello from the sandbox!"
    for art in result.artifacts:
        print(art.name, art.sha256)   # artifact hash for verification

asyncio.run(main())

# Serve the sandbox to agents over MCP.
SandboxMCPServer(sandbox).run()

Zero-egress · SHA-256 signing · MCP stdio / SSE

Features

Everything your agent needs
to execute code safely

Multi-Backend

Two isolation backends: Process (subprocess with OS rlimits / Windows Job Objects) and Docker (OCI containers).

Process · Docker

Zero-Egress

Default-deny networking. Sandboxed code cannot reach the internet unless you explicitly allowlist a host.

default deny · allowlist
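A minimal sketch of the default-deny decision, assuming the allowlist holds exact hostnames. `EgressPolicy`, `is_allowed`, and `make_policy` are hypothetical names chosen for illustration, not sandboxmcp's API, and actual enforcement would sit at the network layer rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EgressPolicy:
    """Hypothetical default-deny policy: only listed hosts are reachable."""
    allowlist: frozenset = field(default_factory=frozenset)

    def is_allowed(self, host: str) -> bool:
        # Normalise case and a trailing dot so "API.Example.com." still matches.
        return host.lower().rstrip(".") in self.allowlist


def make_policy(*hosts: str) -> EgressPolicy:
    """Build a policy from the hosts the caller explicitly allows."""
    return EgressPolicy(frozenset(h.lower().rstrip(".") for h in hosts))
```

With no hosts passed to `make_policy()`, every lookup is denied, which is the default-deny behaviour described above.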

Secret Vault

Inject secrets at runtime without exposing them in code or logs. Scoped per-execution and scrubbed from output.

scoped injection · log scrubbing
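One way to picture scoped injection and scrubbing, as a sketch only: secrets are added to the child process environment for a single run, and their values are masked in the captured output. `scrub` and `run_with_secrets` are hypothetical helpers built on the standard library, not the library's vault API.

```python
import os
import subprocess
import sys


def scrub(text: str, secrets: dict) -> str:
    """Replace every secret value found in text with a placeholder."""
    # Longest values first, so a secret that contains another is fully masked.
    for value in sorted(secrets.values(), key=len, reverse=True):
        if value:
            text = text.replace(value, "[REDACTED]")
    return text


def run_with_secrets(code: str, secrets: dict) -> str:
    """Run code with secrets visible only to this one child process."""
    env = {**os.environ, **secrets}  # the parent environment is not modified
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, timeout=30,
    )
    return scrub(proc.stdout + proc.stderr, secrets)
```

Plain string replacement only catches a secret printed verbatim; an encoded or split value would pass through, so scrubbing is a safety net rather than a guarantee.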

Polyglot Runtimes

Execute Python, Node.js, and Shell scripts in the same sandbox infrastructure.

Python · Node.js · Shell
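A rough sketch of how one entry point could dispatch to several runtimes, assuming `node` and `sh` are on the PATH. The `RUNTIMES` table and `run_snippet` are hypothetical and provide no isolation by themselves; they only show the language-to-interpreter mapping.

```python
import subprocess
import sys

# Hypothetical mapping from a language name to an interpreter command.
RUNTIMES = {
    "python": [sys.executable, "-c"],
    "node": ["node", "-e"],
    "shell": ["sh", "-c"],
}


def run_snippet(language: str, code: str, timeout: float = 10.0) -> str:
    """Run code with the interpreter registered for language; return stdout."""
    try:
        command = RUNTIMES[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None
    proc = subprocess.run(
        command + [code], capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout
```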

Artifact Signing

Every execution result is hashed with SHA-256. Chain-of-custody for every artifact your agent produces.

SHA-256 · tamper-evident
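Checking an artifact against its recorded hash could look like the sketch below, which uses only `hashlib` and `hmac` from the standard library. `sha256_hex` and `verify_artifact` are illustrative helpers, not functions from sandboxmcp.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """True if data still matches the hash recorded at execution time."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())
```

A bare hash detects changes only if the recorded hash itself is stored somewhere the artifact's writer cannot modify.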

Immutable Audit

Append-only audit log for every execution. Query by session, time range, or execution ID.

append-only · queryable
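An append-only log with the three query filters named above can be sketched in memory as follows. The `AuditLog` class and its hash chain are assumptions made for illustration; the source does not say how sandboxmcp stores or protects its records.

```python
import hashlib
import json
import time
from typing import Optional


class AuditLog:
    """Hypothetical in-memory append-only log with a SHA-256 hash chain."""

    def __init__(self) -> None:
        self._entries: list = []

    @staticmethod
    def _digest(body: dict) -> str:
        payload = {k: v for k, v in body.items() if k != "hash"}
        return hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def append(self, session: str, exec_id: str, event: str,
               ts: Optional[float] = None) -> dict:
        # Each record stores the previous record's hash, forming a chain.
        prev = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = {"session": session, "exec_id": exec_id, "event": event,
                "ts": time.time() if ts is None else ts, "prev": prev}
        body["hash"] = self._digest(body)
        self._entries.append(body)
        return dict(body)

    def query(self, session=None, exec_id=None, start=None, end=None) -> list:
        """Filter by session, execution ID, and/or an inclusive time range."""
        return [dict(e) for e in self._entries
                if (session is None or e["session"] == session)
                and (exec_id is None or e["exec_id"] == exec_id)
                and (start is None or e["ts"] >= start)
                and (end is None or e["ts"] <= end)]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev = "0" * 64
        for e in self._entries:
            if e["prev"] != prev or e["hash"] != self._digest(e):
                return False
            prev = e["hash"]
        return True
```

The class exposes no update or delete method, and `verify()` flags any record that was changed behind its back.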

Backpressure Queue

Bounded execution queue with a configurable concurrency limit, enforced by an async semaphore (max_concurrent).

bounded · semaphore
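The concurrency limit can be illustrated with `asyncio.Semaphore`. This is a sketch under the assumption that `max_concurrent` simply caps how many jobs run at once; `BoundedExecutor` and `demo` are hypothetical names, not the library's queue.

```python
import asyncio


class BoundedExecutor:
    """Hypothetical executor: at most max_concurrent jobs run at once."""

    def __init__(self, max_concurrent: int) -> None:
        self._sem = asyncio.Semaphore(max_concurrent)
        self.running = 0
        self.peak = 0  # highest concurrency observed, for inspection

    async def submit(self, job):
        # Callers wait here while max_concurrent jobs are already running.
        async with self._sem:
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job()
            finally:
                self.running -= 1


async def demo(jobs: int = 6, max_concurrent: int = 2):
    executor = BoundedExecutor(max_concurrent)  # built inside the running loop

    async def job():
        await asyncio.sleep(0.01)
        return "ok"

    results = await asyncio.gather(*(executor.submit(job) for _ in range(jobs)))
    return results, executor.peak
```

Calling `asyncio.run(demo())` runs six jobs with never more than two in flight. A semaphore alone leaves the number of waiting callers unbounded, so a fully bounded queue would also need a cap on pending submissions.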

Resource Guard

Per-execution limits on CPU time, memory, process count, and output size. Enforced at the OS level (rlimit / Job Objects).

CPU · memory · processes
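On POSIX systems, OS-level limits of this kind can be set with the standard `resource` module before the child process starts. The sketch below is an assumption about the general technique, not the library's implementation; `run_limited` and its default values are hypothetical, and it does not cover Windows Job Objects.

```python
import resource
import subprocess
import sys


def _limit(kind: int, value: int) -> None:
    # Lower the soft limit only; never ask for more than the hard limit allows.
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_limited(code: str, cpu_seconds: int = 2,
                memory_bytes: int = 1 << 30, max_output: int = 10_000):
    """Run Python code in a child process under POSIX rlimits."""
    def apply_limits() -> None:  # runs in the child, just before exec
        _limit(resource.RLIMIT_CPU, cpu_seconds)   # CPU time, in seconds
        _limit(resource.RLIMIT_AS, memory_bytes)   # address space, in bytes

    proc = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits, capture_output=True, timeout=60,
    )
    # Output is capped after capture here; a real guard would cap while reading.
    return proc.returncode, proc.stdout[:max_output]
```

A child that exceeds its CPU limit is terminated by the kernel with SIGXCPU, so `run_limited("while True: pass", cpu_seconds=1)` returns a non-zero exit code instead of hanging.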

How it works

Submit code.
Get verified results.

Queue

Execution request enters the backpressure queue.

queue.submit(request)

Vault

Secrets are injected into the execution environment.

vault.inject(exec_id, secrets)

Isolate

Backend spins up an isolated environment.

backend.create_sandbox(cfg)

Execute

Code runs with resource guards active.

sandbox.run(code, runtime)

Sign

Outputs hashed with SHA-256.

crypto.sign(result)

Audit

Full record appended to immutable log.

audit.append(record)
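The stages above can be strung together in a simplified, synchronous sketch. Everything here is hypothetical: `run_pipeline` uses a plain child process in place of a real isolation backend, a Python list in place of the audit log, and leaves out the queue stage.

```python
import hashlib
import os
import subprocess
import sys
import uuid


def run_pipeline(code: str, secrets: dict, audit_log: list) -> dict:
    """Hypothetical flow: vault, isolate, execute, sign, audit."""
    exec_id = uuid.uuid4().hex
    # Vault: secrets reach the child only through its environment.
    env = {**os.environ, **secrets}
    # Isolate + execute: a child process stands in for a real backend.
    proc = subprocess.run([sys.executable, "-c", code], env=env,
                          capture_output=True, text=True, timeout=30)
    stdout = proc.stdout
    for value in secrets.values():
        if value:
            stdout = stdout.replace(value, "[REDACTED]")
    # Sign: hash the scrubbed output so later changes can be detected.
    record = {"exec_id": exec_id, "exit_code": proc.returncode,
              "stdout": stdout,
              "sha256": hashlib.sha256(stdout.encode()).hexdigest()}
    # Audit: records are only ever appended, never edited in place.
    audit_log.append(record)
    return record
```

Scrubbing happens before hashing in this sketch, so the recorded hash covers exactly what the caller sees.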

Measured, not claimed

`sandboxmcp` vs a raw OS subprocess

Same model, same controls, with losses shown as plainly as wins. Every number can be reproduced from a script, and the raw JSON results are in the repo.

Metric	sandboxmcp	Raw OS subprocess
Host-impact escapes contained (raw subprocess)	-	1 / 5
Host-impact escapes contained (process backend)	3 / 5	-
Host-impact escapes contained (Docker backend)	5 / 5	-
Dependencies (process backend)	0	-

Real defense-in-depth, with zero dependencies for the process backend. The process backend stops egress and resource attacks but has no kernel isolation, so host file reads and writes leak (3 of 5 escapes contained). The Docker backend contains all five, verified on the host filesystem. This is not an unescapable sandbox: use the process backend for trusted code that needs guardrails, and the Docker backend to contain code you don't trust.

See the full benchmark →
