sandboxmcp
:shield: sandboxmcp - Secure, polyglot code execution for AI agents
Part of the MCP AI suite: ragmcp · memorymcp · planningmcp · workspacemcp · sandboxmcp
Philosophy: "Isolation is the Law, Execution is a Privilege."
What is sandboxmcp?
sandboxmcp is a zero-trust code execution engine that lets AI agents run Python, Node.js, and Shell code in isolated sandboxes with full security controls:
- Polyglot execution: Python, Node.js, and Shell in isolated subprocesses or Docker containers
- 2 backends: Process (subprocess + OS resource limits) and Docker (full container isolation)
- Network deny-all: All network access is blocked by default; domains must be explicitly approved per-request
- Secret vault: Inject environment variables securely; secrets are automatically redacted from all output
- Artifact signing: Every output file is SHA-256 signed for tamper-proof verification
- Resource guard: CPU, RAM, process count, and timeout limits enforced via OS primitives (rlimit / Windows Job Objects)
- Async job queue: Backpressure-aware concurrency control prevents host OOM from parallel spawns
- Immutable audit log: Every execution, package install, session, and egress event is recorded (InMemory or SQLite)
- Code validation: Syntax checking and auto-fix before execution
- Host access: Guarded host command execution with approval workflow
- Web tools: Web search, page fetching, and browser rendering (lazily delegated to websearchmcp; requires pip install "mcpaisuite-sandboxmcp[browser]")
- MCP server: 23 tools, stdio transport, compatible with Claude Desktop, Cursor, or any MCP client
Execution Flow
ExecutionRequest -> [zero-trust pipeline] -> SandboxResult
Every execution passes through the full zero-trust pipeline: Queue -> Vault -> Network -> Execute -> Sign -> Mask -> Audit.
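The Queue stage at the head of this pipeline can be pictured as a semaphore that admits only a fixed number of jobs at once. The following is a minimal sketch under that assumption, not sandboxmcp's actual AsyncJobQueue; `BoundedJobQueue`, `submit`, and `peak` are hypothetical names used for illustration only.

```python
import asyncio


class BoundedJobQueue:
    """Hypothetical stand-in for the Queue stage: caps concurrent jobs."""

    def __init__(self, max_concurrent: int = 4) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self.running = 0  # jobs currently holding a slot
        self.peak = 0     # highest concurrency observed so far

    async def submit(self, job):
        # Callers wait here while every slot is taken; this is the backpressure.
        async with self._slots:
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job()
            finally:
                self.running -= 1


async def demo_queue() -> tuple[list[int], int]:
    queue = BoundedJobQueue(max_concurrent=2)

    async def square(n: int) -> int:
        await asyncio.sleep(0.01)  # simulate a sandboxed run
        return n * n

    jobs = (queue.submit(lambda n=n: square(n)) for n in range(5))
    results = await asyncio.gather(*jobs)
    return list(results), queue.peak
```

Five jobs are submitted at once, but no more than two ever run together; the remaining callers are suspended rather than spawning more processes, which is what keeps parallel spawns from exhausting host memory.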
Quick Start
3-line usage
from sandboxmcp import SandboxFactory, ExecutionRequest
sandbox = SandboxFactory.default()
result = await sandbox.execute(ExecutionRequest(code="print('hello sandbox')"))
print(result.stdout) # hello sandbox
Multi-language execution
from sandboxmcp import SandboxFactory, ExecutionRequest, Language
sandbox = SandboxFactory.default()
# Python
result = await sandbox.execute(ExecutionRequest(code="print(2 + 2)", language=Language.python))
# Node.js
result = await sandbox.execute(ExecutionRequest(code="console.log(2 + 2)", language=Language.node))
# Shell
result = await sandbox.execute(ExecutionRequest(code="echo $((2 + 2))", language=Language.shell))
Docker backend
from sandboxmcp import SandboxFactory, ExecutionRequest
# Full container isolation with Docker
sandbox = SandboxFactory.create(
    default_backend="docker",
    memory_limit="256m",
    cpu_period=100000,
    cpu_quota=50000,
    network_mode="none",
)
result = await sandbox.execute(ExecutionRequest(code="print('isolated')"))
The Docker backend uses python:3.11-slim for Python, node:20-slim for Node.js, and ubuntu:22.04 for Shell. Install the optional dependency with pip install "mcpaisuite-sandboxmcp[docker]".
Process backend hardening (Linux, opt-in)
The plain process backend has no kernel isolation: it stops egress and resource abuse, but a process can still read and write host files (this is why the benchmark scores it 3/5, measured on the default backend). On Linux you can close that gap without Docker:
from sandboxmcp.backends.process_rt import ProcessBackend
# Wrap each run in Landlock + user/network namespaces (Linux only)
backend = ProcessBackend(hardened=True)
When hardened=True, each run is wrapped in:
- Landlock (LSM, kernel ≥ 5.13): restricts the process to read-only access to the system filesystem, so host file writes are denied at the kernel.
- User + network namespaces (rootless): when network is disallowed, the run drops into a fresh empty network namespace, making egress impossible at the kernel level (stronger than the userspace block).
This is honest about its limits: on a non-Linux host or a kernel without Landlock, supported() returns False and the backend runs unhardened and says so; it never reports a containment it didn't actually apply. For untrusted code, Docker remains the backend that contains all five host-impact escapes (5/5).
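To make the fallback rule concrete, here is a rough sketch of a precondition check based only on the two facts stated above (Linux host, kernel 5.13 or newer). It is not how supported() is implemented in sandboxmcp; `parse_kernel_release` and `may_support_hardening` are hypothetical helpers, and passing this check is necessary but not sufficient, since Landlock can still be disabled in a given kernel build.

```python
import platform
import re

LANDLOCK_MIN_KERNEL = (5, 13)  # Landlock was merged in Linux 5.13


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.1.0-18-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return (0, 0)  # unknown format: treat as too old
    return (int(match.group(1)), int(match.group(2)))


def may_support_hardening(system: str, release: str) -> bool:
    """Necessary (not sufficient) precondition for Landlock-based hardening."""
    return system == "Linux" and parse_kernel_release(release) >= LANDLOCK_MIN_KERNEL


def describe_containment() -> str:
    """Report what would be applied on this host, never more than that."""
    if may_support_hardening(platform.system(), platform.release()):
        return "hardened: candidate (Linux kernel is new enough for Landlock)"
    return "unhardened: host does not meet the Landlock precondition"
```

The point of the design is the second branch: when the precondition fails, the status string says so instead of claiming isolation that was never applied.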
MCP server
from sandboxmcp import SandboxFactory
from sandboxmcp.mcp_server import SandboxMCPServer
sandbox = SandboxFactory.default()
SandboxMCPServer(sandbox).run()
Or from the command line:
sandboxmcp serve
Claude Desktop claude_desktop_config.json:
{
  "mcpServers": {
    "sandboxmcp": {
      "command": "sandboxmcp",
      "args": ["serve"]
    }
  }
}
Features
- :shield: Zero-trust pipeline: Queue -> Vault -> Network -> Execute -> Sign -> Mask -> Audit
- :snake: Polyglot: Python, Node.js, and Shell execution in isolated subprocesses or Docker containers
- :lock: Network deny-all: All network blocked by default; DNS/socket/proxy-level enforcement
- :key: Secret vault: InMemory or EnvVar vault; secrets auto-redacted from stdout/stderr
- :pencil2: Artifact signing: SHA-256 hash on every output file; verify integrity anytime
- :bar_chart: Resource guard: CPU, RAM, process count, timeout enforced via rlimit (Linux/macOS) or Job Objects (Windows)
- :penguin: Process backend hardening (Linux, opt-in): ProcessBackend(hardened=True) adds Landlock (read-only system FS) + user/network namespaces for kernel-level isolation without Docker; transparently runs unhardened (and reports it) where unsupported
- :hourglass_flowing_sand: Async job queue: Semaphore-based backpressure; configurable max concurrency
- :scroll: Immutable audit log: InMemory or SQLite; every action recorded with timestamp and detail
- :electric_plug: MCP server: 23 tools, stdio transport, compatible with any MCP client
- :computer: CLI: 23 commands for execution, sessions, vault, audit, validation, web, and host access
- :gear: Stateful sessions: Create, track, and kill long-running sandbox sessions
- :bridge_at_night: Cross-language bridge: Pass JSON data between Python and Node.js sessions
- :satellite: 14 event types: Subscribe to execution, session, egress, vault, and queue events
- :factory: SandboxFactory: default() / create() / from_env() / from_yaml()
Installation
# Minimal: subprocess backend, no external services
pip install mcpaisuite-sandboxmcp
# With Docker backend
pip install "mcpaisuite-sandboxmcp[docker]"
# With web tools (web_search / fetch_webpage / browser_fetch)
pip install "mcpaisuite-sandboxmcp[browser]"
# With suite integrations
pip install "mcpaisuite-sandboxmcp[workspacemcp]"
pip install "mcpaisuite-sandboxmcp[planningmcp]"
# Full stack (Docker + browser + suite integrations)
pip install "mcpaisuite-sandboxmcp[all]"
Requirements: Python 3.11+
Security
Secret Vault
sandboxmcp provides two vault implementations for injecting secrets into sandbox processes without exposing them to the LLM:
InMemoryVault: Namespace-isolated in-memory secret store. Secrets are automatically masked in all output with ***REDACTED***.
EnvVault: Reads secrets from OS environment variables with a namespace prefix (SANDBOXMCP_SECRET_{NAMESPACE}_{KEY}).
# Store a secret
await sandbox.vault_add("default", "API_KEY", "sk-1234567890")
# Code can access it via env var, but the value is redacted in output
result = await sandbox.execute(ExecutionRequest(
    code="import os; print(os.environ.get('API_KEY'))"
))
# stdout: ***REDACTED***
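The masking step can be sketched as a plain string replacement over captured output. This is an illustrative assumption about the approach, not the vault's actual code; `redact_secrets` is a hypothetical helper, and only the ***REDACTED*** marker is taken from the description above.

```python
def redact_secrets(text: str, secrets: dict[str, str], mask: str = "***REDACTED***") -> str:
    """Replace every known secret value in `text` with `mask`."""
    # Mask longer values first so a secret that contains another is fully hidden.
    for value in sorted(secrets.values(), key=len, reverse=True):
        if value:  # an empty string would match at every position
            text = text.replace(value, mask)
    return text


captured_stdout = "token is sk-1234567890\n"
safe_stdout = redact_secrets(captured_stdout, {"API_KEY": "sk-1234567890"})
```

Masking by value rather than by variable name means the secret is hidden wherever it surfaces in stdout or stderr, including inside tracebacks or URLs.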
Network Egress Guard
All network access is denied by default. The NetworkGuard enforces this at three levels:
- Proxy blocking: http_proxy/https_proxy set to 0.0.0.0:0
- Socket/DNS patching: Python socket.connect, Node.js dns.lookup and net.connect are overridden
- Shell aliasing: curl and wget aliased to blocked messages
To allow specific domains, pass allowed_domains in the ExecutionRequest. Human-in-the-loop approval is available via request_egress:
# Per-request allowlist
result = await sandbox.execute(ExecutionRequest(
    code="import urllib.request; print(urllib.request.urlopen('https://api.example.com').read())",
    allowed_domains=["api.example.com"],
))
# Human-in-the-loop approval (blocks until approved/denied/timeout).
# request_egress(domain, namespace) takes positional args.
from sandboxmcp.security.egress import NetworkGuard
network_guard = NetworkGuard(enable_network=True)
approved = await network_guard.request_egress("api.example.com", "default")
# An approval channel (UI/callback) resolves the pending request:
network_guard.approve_egress("default", "api.example.com") # -> request_egress returns True
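As an illustration of the socket-patching level, the sketch below wraps Python's `socket.create_connection` so that only allowlisted hosts get through. It is a simplified assumption of how such a userspace guard might look, not NetworkGuard's implementation; `is_allowed` and `deny_all_network` are hypothetical names, and treating subdomains of an allowed domain as allowed is also an assumption.

```python
import socket
from contextlib import contextmanager


def is_allowed(host: str, allowed_domains: set[str]) -> bool:
    """True if host equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


@contextmanager
def deny_all_network(allowed_domains=()):
    """Block socket.create_connection for every host not on the allowlist."""
    allowed = {d.lower() for d in allowed_domains}
    original = socket.create_connection

    def guarded(address, *args, **kwargs):
        host = address[0]
        if not is_allowed(host, allowed):
            # Refuse before any DNS lookup or connect attempt is made.
            raise PermissionError(f"egress to {host!r} is blocked")
        return original(address, *args, **kwargs)

    socket.create_connection = guarded
    try:
        yield
    finally:
        socket.create_connection = original  # always restore the real function
```

A patch like this only covers callers that go through the patched function, which is why a userspace block is weaker than an empty network namespace or a container with networking disabled.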
Resource Guard
OS-level enforcement of resource limits per execution:
| Limit | Default | Hard Cap | Mechanism |
|---|---|---|---|
| RAM | 512 MB | 8 GB | RLIMIT_DATA (Linux/macOS) / Job Objects (Windows) |
| CPU time | 60 s | 3600 s | RLIMIT_CPU (Linux/macOS) / asyncio.wait_for |
| Processes | 10 | none | RLIMIT_NPROC (Linux) / Job Objects (Windows) |
| Output | 1 MB | none | Truncation after decode |
from sandboxmcp import ExecutionRequest, ResourceLimits
result = await sandbox.execute(ExecutionRequest(
    code="print('constrained')",
    resource_limits=ResourceLimits(
        max_ram_mb=256,
        max_cpu_cores=1,
        timeout_seconds=30,
        max_processes=5,
        max_output_bytes=500_000,
    ),
))
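For readers unfamiliar with rlimit, the sketch below shows the general POSIX technique of applying a CPU-time limit to a child process, paired with a wall-clock timeout. It uses only the standard library and is not ResourceGuard's code; `run_limited` and its parameters are hypothetical, and the `resource` module is unavailable on Windows.

```python
import resource  # POSIX only; not available on Windows
import subprocess
import sys


def run_limited(code: str, cpu_seconds: int = 5, timeout_seconds: float = 10.0):
    """Run Python code in a child process with a CPU-time cap and a wall-clock timeout."""

    def apply_limits() -> None:
        # Runs in the child after fork() and before exec().
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        # An unprivileged process may lower the hard limit but never raise it.
        cap = cpu_seconds if hard == resource.RLIM_INFINITY else min(cpu_seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (cap, cap))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # wall-clock guard for code that sleeps instead of burning CPU
    )
```

The two limits are complementary: RLIMIT_CPU stops busy loops at the kernel level, while the timeout catches a process that is idle or blocked and so never accumulates CPU time.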
Artifact Signing
Every output file produced by sandbox execution is automatically hashed with SHA-256. Artifacts include the hash, size, and base64-encoded content, so any later modification can be detected by recomputing the hash:
from sandboxmcp.security.crypto import ArtifactSigner
# Sign manually
artifact = ArtifactSigner.sign("report.csv", data_bytes)
# Verify integrity
assert ArtifactSigner.verify(artifact, artifact.sha256)
# Generate a manifest for multiple artifacts
manifest = ArtifactSigner.sign_result_artifacts(result.artifacts)
# {"report.csv": "a1b2c3...", "chart.png": "d4e5f6..."}
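The underlying check is ordinary SHA-256 hashing. The standard-library sketch below shows the idea independently of ArtifactSigner; `sha256_hex` and `verify_digest` are hypothetical helper names, not part of sandboxmcp.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_digest(data: bytes, expected_hex: str) -> bool:
    """True if `data` still hashes to `expected_hex`."""
    # compare_digest compares in constant time rather than stopping at the first mismatch.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


recorded = sha256_hex(b"hello")
```

A bare hash detects modification only if the recorded digest is stored somewhere the modifier cannot also rewrite, such as the audit log.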
MCP Tools
sandboxmcp exposes 23 tools via the MCP protocol (stdio transport):
Execution
| Tool | Description |
|---|---|
| execute_code | Run code in isolated sandbox (Python, Node.js, or Shell) |
| install_package | Install pip/npm packages safely |
| validate_code | Validate code before execution: syntax check, auto-fix imports, detect dangerous patterns |
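To illustrate what a pre-execution check for Python might involve, here is a minimal sketch built on the standard `ast` module. It is not CodeValidator's implementation and omits auto-fixing; `validate_python` and the `DANGEROUS_CALLS` set are hypothetical and deliberately small.

```python
import ast

# Illustrative list only; a real validator would use a broader rule set.
DANGEROUS_CALLS = {"eval", "exec", "compile", "__import__"}


def validate_python(source: str) -> dict:
    """Check syntax, then flag direct calls to a few risky builtins."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return {"valid": False, "error": f"line {exc.lineno}: {exc.msg}", "warnings": []}

    warnings = []
    for node in ast.walk(tree):
        is_named_call = isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        if is_named_call and node.func.id in DANGEROUS_CALLS:
            warnings.append(f"line {node.lineno}: call to {node.func.id}()")
    return {"valid": True, "error": None, "warnings": warnings}
```

Parsing without executing means malformed code is rejected before it ever reaches a sandbox slot; the pattern check is advisory and does not replace isolation.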
Sessions
| Tool | Description |
|---|---|
| create_session | Start a stateful sandbox session for multi-step execution |
| terminate_session | Kill a sandbox session and wipe state |
| bridge_data | Pass JSON data between Python and Node.js sessions |
Configuration and Monitoring
| Tool | Description |
|---|---|
| list_runtimes | List available languages and backends |
| set_resource_limits | Update CPU/RAM/timeout limits |
| sandbox_stats | Live telemetry: queue, sessions, resource usage |
| inspect_state | View env vars (secrets masked) and active sessions |
| get_artifacts | Retrieve output files with SHA-256 signatures |
Security and Audit
| Tool | Description |
|---|---|
| request_egress | Request human approval to open a network domain |
| audit_execution | Get immutable execution audit log |
Web Tools
| Tool | Description |
|---|---|
| web_search | Search the web using SearXNG, Yandex, Mojeek, or DuckDuckGo rotation |
| fetch_webpage | Fetch a URL and extract clean readable content as markdown |
| browser_fetch | Fetch a URL with full JS rendering via headless Chromium (Playwright) |
Note: Web tools (web_search, fetch_webpage, browser_fetch) are provided by websearchmcp, which is lazily imported at call time. They require the optional browser extra; install with pip install "mcpaisuite-sandboxmcp[browser]" (or [all]). Without it, these tools raise an import error while the rest of sandboxmcp works normally.
Host Access
| Tool | Description |
|---|---|
| host_exec | Execute a command on the host system (subject to security whitelist) |
| request_host_access | Request approval to run a command pattern on the host |
| list_host_access | List allowed, approved, and blocked host command patterns |
| host_file_read | Read a file from the host machine filesystem |
| host_file_write | Write content to a file on the host machine filesystem |
| host_file_list | List files in a directory on the host machine filesystem |
| host_file_copy | Copy a file from the host machine into the workspace |
CLI
sandboxmcp provides 23 CLI commands with full parity to the 23 MCP tools:
# Start MCP server (stdio transport)
sandboxmcp serve
sandboxmcp serve --transport sse --port 8080
sandboxmcp serve --config sandbox_config.yaml
# Execute a script file (auto-detects language from extension)
sandboxmcp run script.py
sandboxmcp run app.js --language node --timeout 120
sandboxmcp run deploy.sh --backend docker
# View job queue status
sandboxmcp queue status
# Manage sandbox sessions
sandboxmcp session list
sandboxmcp session kill --id <session_id>
# View configuration
sandboxmcp config
# Manage secret vault
sandboxmcp vault add MY_KEY my_value --namespace prod
sandboxmcp vault list --namespace prod
sandboxmcp vault delete MY_KEY --namespace prod
# Show execution stats
sandboxmcp stats
# Verify artifact integrity
sandboxmcp verify output.csv a1b2c3d4e5f6...
# Validate code without execution
sandboxmcp validate script.py
sandboxmcp validate app.js --language node --no-auto-fix
# View audit log
sandboxmcp audit --namespace default --limit 100
# Web search and fetching
sandboxmcp web search "python asyncio tutorial"
sandboxmcp web fetch https://example.com
sandboxmcp web browser https://example.com --wait-for ".content" --screenshot
# Host system access
sandboxmcp host exec "docker ps"
sandboxmcp host read /etc/hostname
sandboxmcp host write /tmp/note.txt "hello"
sandboxmcp host list /var/log
sandboxmcp host copy /host/data.csv workspace_data.csv
sandboxmcp host access list
sandboxmcp host access request "docker restart *"
# Inspect sandbox session state
sandboxmcp inspect default --namespace prod
# Get execution artifacts
sandboxmcp artifacts <request_id>
# Bridge data into a session
sandboxmcp bridge <session_id> key '{"value": 42}'
# Set resource limits
sandboxmcp limits --ram 1024 --cpu 2 --timeout 120 --processes 20
# Approve a domain for network egress
sandboxmcp egress api.example.com --namespace default
SandboxFactory
from sandboxmcp import SandboxFactory
# Zero config: process backend, in-memory vault, no network
sandbox = SandboxFactory.default()
# Read config from environment variables
sandbox = SandboxFactory.from_env()
# Read config from a YAML file
sandbox = SandboxFactory.from_yaml("sandbox_config.yaml")
# Fully configurable
sandbox = SandboxFactory.create(
    default_backend="process",  # "process" | "docker"
    max_concurrent_jobs=4,
    enable_network=False,
    allowed_domains=["api.example.com"],
    vault="memory",  # "memory" | "env"
    audit="sqlite",  # "memory" | "sqlite"
    max_ram_mb=512,
    max_cpu_cores=1,
    timeout_seconds=60,
    max_processes=10,
    sqlite_path="sandboxmcp.db",
)
Environment variables
| Variable | Default | Description |
|---|---|---|
| SANDBOXMCP_BACKEND | process | Execution backend (process / docker) |
| SANDBOXMCP_MAX_CONCURRENT | 4 | Max concurrent jobs |
| SANDBOXMCP_NETWORK | false | Enable network access |
| SANDBOXMCP_VAULT | memory | Vault backend (memory / env) |
| SANDBOXMCP_AUDIT | sqlite | Audit backend (memory / sqlite) |
| SANDBOXMCP_MAX_RAM_MB | 512 | Default RAM limit per execution |
| SANDBOXMCP_TIMEOUT | 60 | Default timeout in seconds |
| SANDBOXMCP_SQLITE_PATH | sandboxmcp.db | SQLite database path |
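The table above maps directly onto a small settings object. The sketch below shows one way to read these variables with the documented defaults; it is not what SandboxFactory.from_env() does internally. `EnvConfig`, `load_config`, and the accepted truthy spellings ("1", "true", "yes", "on") are assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    backend: str
    max_concurrent: int
    network: bool
    vault: str
    audit: str
    max_ram_mb: int
    timeout: int
    sqlite_path: str


def _as_bool(raw: str) -> bool:
    # Assumed truthy spellings; anything else counts as False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env=None) -> EnvConfig:
    """Read SANDBOXMCP_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return EnvConfig(
        backend=env.get("SANDBOXMCP_BACKEND", "process"),
        max_concurrent=int(env.get("SANDBOXMCP_MAX_CONCURRENT", "4")),
        network=_as_bool(env.get("SANDBOXMCP_NETWORK", "false")),
        vault=env.get("SANDBOXMCP_VAULT", "memory"),
        audit=env.get("SANDBOXMCP_AUDIT", "sqlite"),
        max_ram_mb=int(env.get("SANDBOXMCP_MAX_RAM_MB", "512")),
        timeout=int(env.get("SANDBOXMCP_TIMEOUT", "60")),
        sqlite_path=env.get("SANDBOXMCP_SQLITE_PATH", "sandboxmcp.db"),
    )
```

Passing a plain dict as `env` makes the loader easy to test without touching the real process environment.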
Events
sandboxmcp emits 14 event types through an async event bus:
| Event | Trigger |
|---|---|
| execution.started | Code execution begins |
| execution.completed | Code execution succeeds |
| execution.failed | Code execution fails (non-zero exit) |
| execution.timeout | Execution killed after timeout |
| session.created | New sandbox session created |
| session.killed | Session terminated |
| package.installed | pip/npm package installed |
| egress.requested | Network domain access requested |
| egress.approved | Domain access approved |
| egress.denied | Domain access denied |
| vault.accessed | Secret vault read |
| resource.exceeded | Resource limit breached |
| artifact.signed | Artifact SHA-256 signed |
| queue.backpressure | Job queue at capacity |
from sandboxmcp.events import sandbox_event_bus, SandboxEventType
# Subscribe to all events for a namespace
queue = sandbox_event_bus.subscribe(namespace="default")
# Subscribe to all events globally
queue = sandbox_event_bus.subscribe()
# Stream events asynchronously
async for event in sandbox_event_bus.stream(namespace="default"):
    print(f"{event.type}: {event.message}")
Integration with the MCP suite
sandboxmcp is designed to work alongside the other four libraries in the MCP AI suite:
| Library | Integration |
|---|---|
| ragmcp | Execute code that queries RAG pipelines; use sandbox to run generated data-processing scripts |
| memorymcp | Store execution results as facts; remember which code patterns succeeded or failed |
| planningmcp | Plan multi-step execution workflows; sandbox each step with resource limits |
| workspacemcp | Read/write workspace files, then execute them in sandboxmcp for safe testing |
All five libraries expose MCP servers on stdio, so they can run side-by-side in Claude Desktop or any MCP-compatible agent:
{
  "mcpServers": {
    "sandboxmcp": { "command": "sandboxmcp", "args": ["serve"] },
    "memorymcp": { "command": "memorymcp", "args": ["serve"] },
    "ragmcp": { "command": "ragmcp", "args": ["serve"] },
    "planningmcp": { "command": "planningmcp", "args": ["serve"] },
    "workspacemcp": { "command": "workspacemcp", "args": ["serve"] }
  }
}
Development / Contributing
git clone https://github.com/gashel01/sandboxmcp
cd sandboxmcp
pip install -e ".[dev]"
# Unit tests: no external services needed
pytest tests/unit/ -v # 398 tests
# With coverage
pytest tests/unit/ --cov=sandboxmcp --cov-report=html
Project structure:
sandboxmcp/
  core/
    models.py - Pydantic models (ExecutionRequest, SandboxResult, Session, Artifact, AuditEntry)
    base.py - Abstract base classes (BaseBackend, BaseVault, BaseNetworkGuard, BaseAuditLogger)
  backends/
    process_rt.py - Process backend (subprocess + resource guard + network blocking)
    docker_rt.py - Docker backend (container isolation + resource limits)
  runtimes/
    python_env.py - Python runtime (pip install)
    node_env.py - Node.js runtime (npm install)
    shell_env.py - Shell runtime (bash -r restricted mode)
  security/
    vault.py - InMemoryVault + EnvVault (secret injection + output masking)
    egress.py - NetworkGuard (DNS allowlist, default deny-all, human-in-the-loop approval)
    resource.py - ResourceGuard (rlimit on Linux/macOS, Job Objects on Windows)
    crypto.py - ArtifactSigner (SHA-256 signing + verification)
    audit.py - InMemoryAuditLogger + SQLiteAuditLogger (immutable execution log)
    validator.py - CodeValidator (syntax check, auto-fix, dangerous pattern detection)
    host_guard.py - HostGuard (host command execution with approval workflow)
  pipeline/
    manager.py - SandboxPipeline (central orchestrator)
    queue.py - AsyncJobQueue (semaphore-based backpressure)
  utils/
    web_extractor.py - WebExtractor (HTML to markdown content extraction)
    browser_fetch.py - Browser rendering via Playwright (headless Chromium)
  integration/ - Suite integration helpers
  events.py - Event bus (14 event types, subscribe/emit/stream)
  mcp_server.py - MCP server (23 tools, stdio transport)
  factory.py - SandboxFactory (default / create / from_env / from_yaml)
  cli.py - CLI (23 commands: serve, run, queue, session, config, vault, stats, verify, validate, audit, web {search,fetch,browser}, host {exec,read,write,list,copy,access}, inspect, artifacts, bridge, limits, egress)
Contributions are welcome. Please open an issue before submitting a large PR.
License
AGPL-3.0 (see LICENSE).
For commercial licensing (closed-source usage), contact the author.