sandboxmcp

📦 23 tools

:shield: sandboxmcp: Secure, polyglot code execution for AI agents

Python 3.11+ License: AGPL-3.0 Tests

Part of the MCP AI suite: ragmcp · memorymcp · planningmcp · workspacemcp · sandboxmcp

Philosophy: “Isolation is the Law, Execution is a Privilege.”


What is sandboxmcp?

sandboxmcp is a zero-trust code execution engine that lets AI agents run Python, Node.js, and Shell code in isolated sandboxes with full security controls:

  • Polyglot execution: Python, Node.js, and Shell in isolated subprocesses or Docker containers
  • 2 backends: Process (subprocess + OS resource limits) and Docker (full container isolation)
  • Network deny-all: All network access is blocked by default; domains must be explicitly approved per-request
  • Secret vault: Inject environment variables securely; secrets are automatically redacted from all output
  • Artifact signing: Every output file is SHA-256 signed for tamper-proof verification
  • Resource guard: CPU, RAM, process count, and timeout limits enforced via OS primitives (rlimit / Windows Job Objects)
  • Async job queue: Backpressure-aware concurrency control prevents host OOM from parallel spawns
  • Immutable audit log: Every execution, package install, session, and egress event is recorded (InMemory or SQLite)
  • Code validation: Syntax checking and auto-fix before execution
  • Host access: Guarded host command execution with approval workflow
  • Web tools: Web search, page fetching, and browser rendering (lazily delegated to websearchmcp; requires pip install "mcpaisuite-sandboxmcp[browser]")
  • MCP server: 23 tools, stdio transport, compatible with Claude Desktop, Cursor, or any MCP client

Execution Flow

ExecutionRequest ↓

  1. Job Queue: backpressure
  2. Vault: inject env
  3. Network Guard: deny-all / allow
  4. Process / Docker: backend execution
  5. Artifact Signer: SHA-256 hash
  6. Secret Mask: redact values
  7. Audit Log: immutable

↓ SandboxResult

Every execution passes through the full zero-trust pipeline: Queue -> Vault -> Network -> Execute -> Sign -> Mask -> Audit.


Quick Start

3-line usage

# Note: `await` must run inside a coroutine (for example one started with asyncio.run).
from sandboxmcp import SandboxFactory, ExecutionRequest

sandbox = SandboxFactory.default()
result = await sandbox.execute(ExecutionRequest(code="print('hello sandbox')"))
print(result.stdout)  # hello sandbox

Multi-language execution

from sandboxmcp import SandboxFactory, ExecutionRequest, Language

sandbox = SandboxFactory.default()

# Python
result = await sandbox.execute(ExecutionRequest(code="print(2 + 2)", language=Language.python))

# Node.js
result = await sandbox.execute(ExecutionRequest(code="console.log(2 + 2)", language=Language.node))

# Shell
result = await sandbox.execute(ExecutionRequest(code="echo $((2 + 2))", language=Language.shell))

Docker backend

from sandboxmcp import SandboxFactory, ExecutionRequest

# Full container isolation with Docker
sandbox = SandboxFactory.create(
    default_backend="docker",
    memory_limit="256m",
    cpu_period=100000,
    cpu_quota=50000,
    network_mode="none",
)

result = await sandbox.execute(ExecutionRequest(code="print('isolated')"))

The Docker backend uses python:3.11-slim for Python, node:20-slim for Node.js, and ubuntu:22.04 for Shell. Install the optional dependency with pip install "mcpaisuite-sandboxmcp[docker]".
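
To make the resource options above concrete: cpu_quota divided by cpu_period is the fraction of one CPU the container may use, so 50000 / 100000 caps the run at half a core. The sketch below shows how such options could map onto standard docker run flags. The helper name build_docker_argv is hypothetical, and the mapping is an assumption for illustration; sandboxmcp may talk to Docker differently (for example through the Docker SDK).

```python
def build_docker_argv(image, code, memory_limit="256m", cpu_period=100000,
                      cpu_quota=50000, network_mode="none"):
    """Hypothetical helper: translate sandbox options into a `docker run` argv."""
    return [
        "docker", "run", "--rm",
        "--memory", memory_limit,          # hard RAM cap for the container
        "--cpu-period", str(cpu_period),   # scheduler period in microseconds
        "--cpu-quota", str(cpu_quota),     # CPU time allowed per period
        "--network", network_mode,         # "none" leaves only a loopback device
        image, "python", "-c", code,
    ]


argv = build_docker_argv("python:3.11-slim", "print('isolated')")
cpu_fraction = 50000 / 100000  # half of one CPU
print(" ".join(argv[:9]), "...")
```

Building the argument list without a shell keeps the user code a single argv element, so it cannot be reinterpreted as extra docker flags.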

Process backend hardening (Linux, opt-in)

The plain process backend has no kernel isolation: it stops egress and resource abuse, but a process can still read and write host files (this is why the benchmark scores it 3/5, measured on the default backend). On Linux you can close that gap without Docker:

from sandboxmcp.backends.process_rt import ProcessBackend

# Wrap each run in Landlock + user/network namespaces (Linux only)
backend = ProcessBackend(hardened=True)

When hardened=True, each run is wrapped in:

  • Landlock (LSM, kernel ≥ 5.13): restricts the process to read-only access to the system filesystem, so host file writes are denied at the kernel.
  • User + network namespaces (rootless): when network is disallowed, the run drops into a fresh empty network namespace, making egress impossible at the kernel level (stronger than the userspace block).

This is honest about its limits: on a non-Linux host or a kernel without Landlock, supported() returns False and the backend runs unhardened and says so; it never reports a containment it didn’t actually apply. For untrusted code, Docker remains the backend that contains all five host-impact escapes (5/5).
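
The fallback behaviour described above can be sketched as a capability probe followed by an honest report of the mode that was actually applied. Everything here is an assumption for illustration: the function names are hypothetical, and comparing the kernel release against 5.13 is only a rough proxy, since Landlock can be disabled at build or boot time. A real check would probe the Landlock syscall itself.

```python
import platform
import re


def kernel_at_least(release, minimum):
    """Compare a release string such as '6.1.0-18-amd64' with (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def hardening_supported(system=None, release=None):
    """Rough proxy: Linux with a kernel new enough to ship Landlock."""
    system = system or platform.system()
    release = release or platform.release()
    return system == "Linux" and kernel_at_least(release, (5, 13))


def choose_mode(requested_hardened, **probe):
    """Return the mode actually applied, never claiming more than was done."""
    if requested_hardened and hardening_supported(**probe):
        return "hardened"
    return "unhardened"


print(choose_mode(True))  # depends on the host this runs on
```

Returning the applied mode, instead of echoing the requested one, is what lets a caller see that a run was not contained.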

MCP server

from sandboxmcp import SandboxFactory
from sandboxmcp.mcp_server import SandboxMCPServer

sandbox = SandboxFactory.default()
SandboxMCPServer(sandbox).run()

Or from the command line:

sandboxmcp serve

Claude Desktop claude_desktop_config.json:

{
  "mcpServers": {
    "sandboxmcp": {
      "command": "sandboxmcp",
      "args": ["serve"]
    }
  }
}

Features

  • :shield: Zero-trust pipeline: Queue -> Vault -> Network -> Execute -> Sign -> Mask -> Audit
  • :snake: Polyglot: Python, Node.js, and Shell execution in isolated subprocesses or Docker containers
  • :lock: Network deny-all: All network blocked by default; DNS/socket/proxy-level enforcement
  • :key: Secret vault: InMemory or EnvVar vault; secrets auto-redacted from stdout/stderr
  • :pencil2: Artifact signing: SHA-256 hash on every output file; verify integrity anytime
  • :bar_chart: Resource guard: CPU, RAM, process count, timeout enforced via rlimit (Linux/macOS) or Job Objects (Windows)
  • :penguin: Process backend hardening (Linux, opt-in): ProcessBackend(hardened=True) adds Landlock (read-only system FS) + user/network namespaces for kernel-level isolation without Docker; transparently runs unhardened (and reports it) where unsupported
  • :hourglass_flowing_sand: Async job queue: Semaphore-based backpressure; configurable max concurrency
  • :scroll: Immutable audit log: InMemory or SQLite; every action recorded with timestamp and detail
  • :electric_plug: MCP server: 23 tools, stdio transport, compatible with any MCP client
  • :computer: CLI: 23 commands for execution, sessions, vault, audit, validation, web, and host access
  • :gear: Stateful sessions: Create, track, and kill long-running sandbox sessions
  • :bridge_at_night: Cross-language bridge: Pass JSON data between Python and Node.js sessions
  • :satellite: 14 event types: Subscribe to execution, session, egress, vault, and queue events
  • :factory: SandboxFactory: default() / create() / from_env() / from_yaml()
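
As an illustration of the semaphore-based backpressure mentioned in the list above, here is a minimal sketch using only asyncio. The JobQueue class and its attributes are hypothetical names, not the library's AsyncJobQueue; the point is only that submissions beyond the concurrency limit wait instead of spawning more processes.

```python
import asyncio


class JobQueue:
    """Hypothetical semaphore-gated queue: at most max_concurrent jobs run at once."""

    def __init__(self, max_concurrent=4):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.running = 0
        self.peak = 0  # highest concurrency observed, for demonstration

    async def submit(self, job):
        async with self._semaphore:  # callers wait here when at capacity
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job()
            finally:
                self.running -= 1


async def demo():
    queue = JobQueue(max_concurrent=2)

    async def job():
        await asyncio.sleep(0.01)  # stand-in for a sandboxed execution
        return "done"

    results = await asyncio.gather(*(queue.submit(job) for _ in range(6)))
    return results, queue.peak


results, peak = asyncio.run(demo())
print(len(results), "jobs finished; peak concurrency:", peak)
```

Six jobs are submitted at once, but the semaphore never lets more than two run together, which is what keeps parallel spawns from exhausting host memory.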

Installation

# Minimal: subprocess backend, no external services
pip install mcpaisuite-sandboxmcp

# With Docker backend
pip install "mcpaisuite-sandboxmcp[docker]"

# With web tools (web_search / fetch_webpage / browser_fetch)
pip install "mcpaisuite-sandboxmcp[browser]"

# With suite integrations
pip install "mcpaisuite-sandboxmcp[workspacemcp]"
pip install "mcpaisuite-sandboxmcp[planningmcp]"

# Full stack (Docker + browser + suite integrations)
pip install "mcpaisuite-sandboxmcp[all]"

Requirements: Python 3.11+


Security

Secret Vault

sandboxmcp provides two vault implementations for injecting secrets into sandbox processes without exposing them to the LLM:

InMemoryVault: Namespace-isolated in-memory secret store. Secrets are automatically masked in all output with ***REDACTED***.

EnvVault: Reads secrets from OS environment variables with a namespace prefix (SANDBOXMCP_SECRET_{NAMESPACE}_{KEY}).

# Store a secret
await sandbox.vault_add("default", "API_KEY", "sk-1234567890")

# Code can access it via env var, but the value is redacted in output
result = await sandbox.execute(ExecutionRequest(
    code="import os; print(os.environ.get('API_KEY'))"
))
# stdout: ***REDACTED***
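
The redaction step can be pictured as a plain string replacement over captured output. This is a minimal sketch under that assumption; mask_secrets is a hypothetical helper, and the library's masking may handle more cases (for example encoded forms of a secret).

```python
def mask_secrets(text, secrets, placeholder="***REDACTED***"):
    """Replace every occurrence of each secret value in text."""
    # Longest values first, so a secret that contains another is masked whole.
    for value in sorted(secrets.values(), key=len, reverse=True):
        if value:  # an empty value would otherwise match between every character
            text = text.replace(value, placeholder)
    return text


secrets = {"API_KEY": "sk-1234567890"}
masked = mask_secrets("key=sk-1234567890\n", secrets)
print(masked, end="")
```

Masking on values, not on variable names, means the secret is hidden even when the code prints it under a different label.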

Network Egress Guard

All network access is denied by default. The NetworkGuard enforces this at three levels:

  1. Proxy blocking: http_proxy / https_proxy set to 0.0.0.0:0
  2. Socket/DNS patching: Python socket.connect, Node.js dns.lookup and net.connect are overridden
  3. Shell aliasing: curl and wget aliased to blocked messages
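
The first two levels can be sketched in a few lines of Python. All names below are hypothetical, and two details are assumptions not stated above: that the upper-case proxy variables are set as well, and that subdomains of an approved domain are allowed. The connect argument stands in for the real socket call so the sketch stays side-effect free.

```python
def blocked_proxy_env(base_env):
    """Return a copy of base_env whose proxy variables point at a dead address."""
    env = dict(base_env)
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = "0.0.0.0:0"
    return env


def is_allowed(host, allowed_domains):
    """Deny by default; allow exact matches and subdomains of approved domains."""
    host = host.lower().rstrip(".")
    domains = [domain.lower() for domain in allowed_domains]
    return any(host == d or host.endswith("." + d) for d in domains)


def guarded_connect(host, allowed_domains, connect):
    """Call connect(host) only if the host passes the allowlist."""
    if not is_allowed(host, allowed_domains):
        raise PermissionError(f"egress to {host} is blocked")
    return connect(host)


env = blocked_proxy_env({"PATH": "/usr/bin"})
print(env["http_proxy"], is_allowed("api.example.com", ["api.example.com"]))
```

Checks like these run in userspace inside the sandboxed process, which is why the kernel-level options (network namespaces or Docker) are the stronger guarantee.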

To allow specific domains, pass allowed_domains in the ExecutionRequest. Human-in-the-loop approval is available via request_egress:

# Per-request allowlist
result = await sandbox.execute(ExecutionRequest(
    code="import urllib.request; print(urllib.request.urlopen('https://api.example.com').read())",
    allowed_domains=["api.example.com"],
))

# Human-in-the-loop approval (blocks until approved/denied/timeout).
# request_egress(domain, namespace) takes positional args.
from sandboxmcp.security.egress import NetworkGuard

network_guard = NetworkGuard(enable_network=True)
approved = await network_guard.request_egress("api.example.com", "default")

# An approval channel (UI/callback), running outside the awaiting coroutine, resolves the pending request:
network_guard.approve_egress("default", "api.example.com")  # -> request_egress returns True

Resource Guard

OS-level enforcement of resource limits per execution:

| Limit | Default | Hard Cap | Mechanism |
| --- | --- | --- | --- |
| RAM | 512 MB | 8 GB | RLIMIT_DATA (Linux/macOS) / Job Objects (Windows) |
| CPU time | 60 s | 3600 s | RLIMIT_CPU (Linux/macOS) / asyncio.wait_for |
| Processes | 10 | - | RLIMIT_NPROC (Linux) / Job Objects (Windows) |
| Output | 1 MB | - | Truncation after decode |

from sandboxmcp import ExecutionRequest, ResourceLimits

result = await sandbox.execute(ExecutionRequest(
    code="print('constrained')",
    resource_limits=ResourceLimits(
        max_ram_mb=256,
        max_cpu_cores=1,
        timeout_seconds=30,
        max_processes=5,
        max_output_bytes=500_000,
    ),
))
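
For readers unfamiliar with rlimit, the following sketch shows the general POSIX mechanism: limits are applied in the child process between fork and exec, so they bind the sandboxed code and not the host process. It is Linux/macOS only, the helper names are hypothetical, and it is not the library's ResourceGuard.

```python
import resource
import subprocess
import sys


def _lower_limit(which, value):
    """Set soft and hard limits to value, never above the current hard limit."""
    _, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, value))


def run_limited(code, max_ram_mb=512, cpu_seconds=60, timeout_seconds=60):
    """Run Python code in a child process with RAM and CPU-time limits."""

    def apply_limits():  # executed in the child, after fork and before exec
        _lower_limit(resource.RLIMIT_DATA, max_ram_mb * 1024 * 1024)
        _lower_limit(resource.RLIMIT_CPU, cpu_seconds)

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # wall-clock guard, separate from CPU time
    )


result = run_limited("print('constrained')")
print(result.stdout, end="")
```

RLIMIT_CPU counts CPU seconds, so a process that sleeps or blocks on I/O is only stopped by the separate wall-clock timeout.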

Artifact Signing

Every output file produced by sandbox execution is automatically hashed with SHA-256. Artifacts include the hash, size, and base64-encoded content for tamper-proof verification:

from sandboxmcp.security.crypto import ArtifactSigner

# Sign manually
artifact = ArtifactSigner.sign("report.csv", data_bytes)

# Verify integrity
assert ArtifactSigner.verify(artifact, artifact.sha256)

# Generate a manifest for multiple artifacts
manifest = ArtifactSigner.sign_result_artifacts(result.artifacts)
# {"report.csv": "a1b2c3...", "chart.png": "d4e5f6..."}
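
The underlying idea is ordinary content hashing, which can be sketched with the standard library. The SignedArtifact dataclass and the sign/verify functions below are hypothetical stand-ins that mirror the fields named above (hash, size, base64 content); they are not the library's classes.

```python
import base64
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedArtifact:
    name: str
    sha256: str
    size: int
    content_b64: str


def sign(name, data):
    """Record the SHA-256 digest, size, and base64 content of an output file."""
    return SignedArtifact(
        name=name,
        sha256=hashlib.sha256(data).hexdigest(),
        size=len(data),
        content_b64=base64.b64encode(data).decode("ascii"),
    )


def verify(artifact, expected_sha256):
    """Recompute the digest from the stored content and compare."""
    data = base64.b64decode(artifact.content_b64)
    return hashlib.sha256(data).hexdigest() == expected_sha256


artifact = sign("report.csv", b"a,b\n1,2\n")
tampered = SignedArtifact(
    artifact.name, artifact.sha256, artifact.size,
    base64.b64encode(b"a,b\n9,9\n").decode("ascii"),
)
print(verify(artifact, artifact.sha256), verify(tampered, artifact.sha256))
```

A digest only proves integrity relative to a trusted copy of the expected hash, so the manifest should be stored somewhere the sandboxed code cannot modify.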

MCP Tools

sandboxmcp exposes 23 tools via the MCP protocol (stdio transport):

Execution

| Tool | Description |
| --- | --- |
| execute_code | Run code in isolated sandbox (Python, Node.js, or Shell) |
| install_package | Install pip/npm packages safely |
| validate_code | Validate code before execution: syntax check, auto-fix imports, detect dangerous patterns |
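
To show what a pre-execution check of the kind validate_code describes might look like for Python, here is a minimal sketch built on the standard ast module. The function name and the list of flagged calls are illustrative assumptions; the library's CodeValidator may use different rules and also covers auto-fixing, which is not shown.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec", "__import__"}  # illustrative list only


def validate_python(code):
    """Return (ok, problems) without executing the code."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return False, [f"syntax error on line {exc.lineno}: {exc.msg}"]
    problems = [
        f"line {node.lineno}: call to {node.func.id}()"
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in DANGEROUS_CALLS
    ]
    return not problems, problems


print(validate_python("print(2 + 2)"))
print(validate_python("x = eval('1')"))
```

Static checks like this are a convenience for catching mistakes early; they are easy to evade and do not replace the isolation provided by the backends.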

Sessions

| Tool | Description |
| --- | --- |
| create_session | Start a stateful sandbox session for multi-step execution |
| terminate_session | Kill a sandbox session and wipe state |
| bridge_data | Pass JSON data between Python and Node.js sessions |

Configuration and Monitoring

| Tool | Description |
| --- | --- |
| list_runtimes | List available languages and backends |
| set_resource_limits | Update CPU/RAM/timeout limits |
| sandbox_stats | Live telemetry: queue, sessions, resource usage |
| inspect_state | View env vars (secrets masked) and active sessions |
| get_artifacts | Retrieve output files with SHA-256 signatures |

Security and Audit

| Tool | Description |
| --- | --- |
| request_egress | Request human approval to open a network domain |
| audit_execution | Get immutable execution audit log |

Web Tools

| Tool | Description |
| --- | --- |
| web_search | Search the web using SearXNG, Yandex, Mojeek, or DuckDuckGo rotation |
| fetch_webpage | Fetch a URL and extract clean readable content as markdown |
| browser_fetch | Fetch a URL with full JS rendering via headless Chromium (Playwright) |

Note: Web tools (web_search, fetch_webpage, browser_fetch) are provided by websearchmcp, which is lazily imported at call time. They require the optional browser extra: install it with pip install "mcpaisuite-sandboxmcp[browser]" (or [all]). Without it, these tools raise an import error while the rest of sandboxmcp works normally.

Host Access

| Tool | Description |
| --- | --- |
| host_exec | Execute a command on the host system (subject to security whitelist) |
| request_host_access | Request approval to run a command pattern on the host |
| list_host_access | List allowed, approved, and blocked host command patterns |
| host_file_read | Read a file from the host machine filesystem |
| host_file_write | Write content to a file on the host machine filesystem |
| host_file_list | List files in a directory on the host machine filesystem |
| host_file_copy | Copy a file from the host machine into the workspace |
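
A whitelist of command patterns, such as the "docker restart *" form used by the CLI, could be evaluated with glob matching. The sketch below is an assumption about how such a check might work, not the HostGuard implementation; the function name and the three outcome labels are hypothetical.

```python
from fnmatch import fnmatchcase


def check_host_command(command, allowed, blocked):
    """Return 'blocked', 'allowed', or 'needs_approval' for a host command."""
    if any(fnmatchcase(command, pattern) for pattern in blocked):
        return "blocked"  # block rules win over allow rules
    if any(fnmatchcase(command, pattern) for pattern in allowed):
        return "allowed"
    return "needs_approval"  # unknown commands go to a human


allowed = ["docker ps", "docker restart *"]
blocked = ["rm *"]
for command in ("docker ps", "docker restart web", "rm -rf /", "systemctl stop nginx"):
    print(command, "->", check_host_command(command, allowed, blocked))
```

Glob matching on the raw string is weak against shell chaining (a command such as "docker restart web; rm -rf /" also matches "docker restart *"), so a production guard would need to tokenize the command, for example with shlex, and run it without a shell.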

CLI

sandboxmcp provides 23 CLI commands with full parity to the 23 MCP tools:

# Start MCP server (stdio transport)
sandboxmcp serve
sandboxmcp serve --transport sse --port 8080
sandboxmcp serve --config sandbox_config.yaml

# Execute a script file (auto-detects language from extension)
sandboxmcp run script.py
sandboxmcp run app.js --language node --timeout 120
sandboxmcp run deploy.sh --backend docker

# View job queue status
sandboxmcp queue status

# Manage sandbox sessions
sandboxmcp session list
sandboxmcp session kill --id <session_id>

# View configuration
sandboxmcp config

# Manage secret vault
sandboxmcp vault add MY_KEY my_value --namespace prod
sandboxmcp vault list --namespace prod
sandboxmcp vault delete MY_KEY --namespace prod

# Show execution stats
sandboxmcp stats

# Verify artifact integrity
sandboxmcp verify output.csv a1b2c3d4e5f6...

# Validate code without execution
sandboxmcp validate script.py
sandboxmcp validate app.js --language node --no-auto-fix

# View audit log
sandboxmcp audit --namespace default --limit 100

# Web search and fetching
sandboxmcp web search "python asyncio tutorial"
sandboxmcp web fetch https://example.com
sandboxmcp web browser https://example.com --wait-for ".content" --screenshot

# Host system access
sandboxmcp host exec "docker ps"
sandboxmcp host read /etc/hostname
sandboxmcp host write /tmp/note.txt "hello"
sandboxmcp host list /var/log
sandboxmcp host copy /host/data.csv workspace_data.csv
sandboxmcp host access list
sandboxmcp host access request "docker restart *"

# Inspect sandbox session state
sandboxmcp inspect default --namespace prod

# Get execution artifacts
sandboxmcp artifacts <request_id>

# Bridge data into a session
sandboxmcp bridge <session_id> key '{"value": 42}'

# Set resource limits
sandboxmcp limits --ram 1024 --cpu 2 --timeout 120 --processes 20

# Approve a domain for network egress
sandboxmcp egress api.example.com --namespace default

SandboxFactory

from sandboxmcp import SandboxFactory

# Zero config: process backend, in-memory vault, no network
sandbox = SandboxFactory.default()

# Read config from environment variables
sandbox = SandboxFactory.from_env()

# Read config from a YAML file
sandbox = SandboxFactory.from_yaml("sandbox_config.yaml")

# Fully configurable
sandbox = SandboxFactory.create(
    default_backend="process",       # "process" | "docker"
    max_concurrent_jobs=4,
    enable_network=False,
    allowed_domains=["api.example.com"],
    vault="memory",                  # "memory" | "env"
    audit="sqlite",                  # "memory" | "sqlite"
    max_ram_mb=512,
    max_cpu_cores=1,
    timeout_seconds=60,
    max_processes=10,
    sqlite_path="sandboxmcp.db",
)

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| SANDBOXMCP_BACKEND | process | Execution backend (process / docker) |
| SANDBOXMCP_MAX_CONCURRENT | 4 | Max concurrent jobs |
| SANDBOXMCP_NETWORK | false | Enable network access |
| SANDBOXMCP_VAULT | memory | Vault backend (memory / env) |
| SANDBOXMCP_AUDIT | sqlite | Audit backend (memory / sqlite) |
| SANDBOXMCP_MAX_RAM_MB | 512 | Default RAM limit per execution |
| SANDBOXMCP_TIMEOUT | 60 | Default timeout in seconds |
| SANDBOXMCP_SQLITE_PATH | sandboxmcp.db | SQLite database path |
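
The table above can be read as a mapping from environment variables to SandboxFactory.create() keyword arguments. The sketch below shows one way such a loader could look; load_config is a hypothetical function, and the accepted boolean spellings are an assumption, since the source only gives false as the default.

```python
import os

DEFAULTS = {
    "SANDBOXMCP_BACKEND": "process",
    "SANDBOXMCP_MAX_CONCURRENT": "4",
    "SANDBOXMCP_NETWORK": "false",
    "SANDBOXMCP_VAULT": "memory",
    "SANDBOXMCP_AUDIT": "sqlite",
    "SANDBOXMCP_MAX_RAM_MB": "512",
    "SANDBOXMCP_TIMEOUT": "60",
    "SANDBOXMCP_SQLITE_PATH": "sandboxmcp.db",
}


def load_config(environ=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "default_backend": raw["SANDBOXMCP_BACKEND"],
        "max_concurrent_jobs": int(raw["SANDBOXMCP_MAX_CONCURRENT"]),
        # Assumed truthy spellings; anything else leaves the network disabled.
        "enable_network": raw["SANDBOXMCP_NETWORK"].strip().lower() in ("1", "true", "yes"),
        "vault": raw["SANDBOXMCP_VAULT"],
        "audit": raw["SANDBOXMCP_AUDIT"],
        "max_ram_mb": int(raw["SANDBOXMCP_MAX_RAM_MB"]),
        "timeout_seconds": int(raw["SANDBOXMCP_TIMEOUT"]),
        "sqlite_path": raw["SANDBOXMCP_SQLITE_PATH"],
    }


config = load_config({"SANDBOXMCP_BACKEND": "docker", "SANDBOXMCP_TIMEOUT": "120"})
print(config["default_backend"], config["timeout_seconds"], config["enable_network"])
```

Treating any unrecognised value of SANDBOXMCP_NETWORK as false keeps a typo from silently enabling network access.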

Events

sandboxmcp emits 14 event types through an async event bus:

| Event | Trigger |
| --- | --- |
| execution.started | Code execution begins |
| execution.completed | Code execution succeeds |
| execution.failed | Code execution fails (non-zero exit) |
| execution.timeout | Execution killed after timeout |
| session.created | New sandbox session created |
| session.killed | Session terminated |
| package.installed | pip/npm package installed |
| egress.requested | Network domain access requested |
| egress.approved | Domain access approved |
| egress.denied | Domain access denied |
| vault.accessed | Secret vault read |
| resource.exceeded | Resource limit breached |
| artifact.signed | Artifact SHA-256 signed |
| queue.backpressure | Job queue at capacity |

from sandboxmcp.events import sandbox_event_bus, SandboxEventType

# Subscribe to all events for a namespace
queue = sandbox_event_bus.subscribe(namespace="default")

# Subscribe to all events globally
queue = sandbox_event_bus.subscribe()

# Stream events asynchronously
async for event in sandbox_event_bus.stream(namespace="default"):
    print(f"{event.type}: {event.message}")
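
The subscribe semantics shown above (a namespace filter, or no filter for all events) can be illustrated with a small queue-per-subscriber bus. This is a sketch with hypothetical Event and EventBus classes, not the library's sandbox_event_bus, and it omits the stream() helper.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    type: str
    namespace: str
    message: str


class EventBus:
    """Hypothetical bus: each subscriber gets its own asyncio.Queue."""

    def __init__(self):
        self._subscribers = []  # list of (namespace or None, queue)

    def subscribe(self, namespace=None):
        queue = asyncio.Queue()
        self._subscribers.append((namespace, queue))
        return queue

    async def emit(self, event):
        for namespace, queue in self._subscribers:
            if namespace is None or namespace == event.namespace:
                await queue.put(event)


async def demo():
    bus = EventBus()
    scoped = bus.subscribe(namespace="default")
    everything = bus.subscribe()
    await bus.emit(Event("execution.started", "default", "run 1"))
    await bus.emit(Event("execution.started", "prod", "run 2"))
    return scoped.qsize(), everything.qsize()


scoped_count, global_count = asyncio.run(demo())
print(scoped_count, global_count)
```

Giving each subscriber its own queue means a slow consumer delays only itself, at the cost of unbounded memory if it never drains the queue.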

Integration with the MCP suite

sandboxmcp is designed to work alongside the other four libraries in the MCP AI suite:

| Library | Integration |
| --- | --- |
| ragmcp | Execute code that queries RAG pipelines; use sandbox to run generated data-processing scripts |
| memorymcp | Store execution results as facts; remember which code patterns succeeded or failed |
| planningmcp | Plan multi-step execution workflows; sandbox each step with resource limits |
| workspacemcp | Read/write workspace files, then execute them in sandboxmcp for safe testing |

All five libraries expose MCP servers on stdio, so they can run side-by-side in Claude Desktop or any MCP-compatible agent:

{
  "mcpServers": {
    "sandboxmcp": { "command": "sandboxmcp", "args": ["serve"] },
    "memorymcp": { "command": "memorymcp", "args": ["serve"] },
    "ragmcp": { "command": "ragmcp", "args": ["serve"] },
    "planningmcp": { "command": "planningmcp", "args": ["serve"] },
    "workspacemcp": { "command": "workspacemcp", "args": ["serve"] }
  }
}

Development / Contributing

git clone https://github.com/gashel01/sandboxmcp
cd sandboxmcp
pip install -e ".[dev]"

# Unit tests: no external services needed
pytest tests/unit/ -v                            # 398 tests

# With coverage
pytest tests/unit/ --cov=sandboxmcp --cov-report=html

Project structure:

sandboxmcp/
  core/
    models.py        - Pydantic models (ExecutionRequest, SandboxResult, Session, Artifact, AuditEntry)
    base.py          - Abstract base classes (BaseBackend, BaseVault, BaseNetworkGuard, BaseAuditLogger)
  backends/
    process_rt.py    - Process backend (subprocess + resource guard + network blocking)
    docker_rt.py     - Docker backend (container isolation + resource limits)
  runtimes/
    python_env.py    - Python runtime (pip install)
    node_env.py      - Node.js runtime (npm install)
    shell_env.py     - Shell runtime (bash -r restricted mode)
  security/
    vault.py         - InMemoryVault + EnvVault (secret injection + output masking)
    egress.py        - NetworkGuard (DNS allowlist, default deny-all, human-in-the-loop approval)
    resource.py      - ResourceGuard (rlimit on Linux/macOS, Job Objects on Windows)
    crypto.py        - ArtifactSigner (SHA-256 signing + verification)
    audit.py         - InMemoryAuditLogger + SQLiteAuditLogger (immutable execution log)
    validator.py     - CodeValidator (syntax check, auto-fix, dangerous pattern detection)
    host_guard.py    - HostGuard (host command execution with approval workflow)
  pipeline/
    manager.py       - SandboxPipeline (central orchestrator)
    queue.py         - AsyncJobQueue (semaphore-based backpressure)
  utils/
    web_extractor.py - WebExtractor (HTML to markdown content extraction)
    browser_fetch.py - Browser rendering via Playwright (headless Chromium)
  integration/       - Suite integration helpers
  events.py          - Event bus (14 event types, subscribe/emit/stream)
  mcp_server.py      - MCP server (23 tools, stdio transport)
  factory.py         - SandboxFactory (default / create / from_env / from_yaml)
  cli.py             - CLI (23 commands: serve, run, queue, session, config, vault, stats, verify, validate, audit, web {search,fetch,browser}, host {exec,read,write,list,copy,access}, inspect, artifacts, bridge, limits, egress)

Contributions are welcome. Please open an issue before submitting a large PR.


License

AGPL-3.0 (see LICENSE).

For commercial licensing (closed-source usage), contact the author.