kernelmcp

⚙ 32 tools

kernelmcp

The sovereign orchestrator — connects all MCP AI suite libraries into an autonomous agent.

The brain of the MCP AI suite: ragmcp (knowledge) · memorymcp (memory) · planningmcp (reasoning) · workspacemcp (files) · sandboxmcp (execution) · schedulermcp (scheduling) · websearchmcp (search) · ltpmcp (protocols) · evalmcp (evaluation) · kernelmcp (orchestrator)

120+ native tools — and connects at runtime to any MCP server or LangChain tool (2000+ community MCP servers and 500+ LangChain tools are reachable, none bundled). 7 agent types. Multi-agent TaskForce with 5 patterns. ReAct + LTP hybrid engine. Dynamic MCP client. LangChain bridge. Self-hosted Hub connector (monitor + control your own kernels). Zero manual intervention.

Philosophy: “The orchestrator feels nothing. It reads the event, enforces the plan, delegates the execution, and reports the result.”

What is kernelmcp?

kernelmcp is the central nervous system that wires the entire MCP AI suite into a multi-agent orchestrator. It receives a goal, creates a plan, delegates work to specialized servers and agents, tracks costs, heals failures, and returns results — all through a ReAct + LTP (Lean Task Protocol) hybrid engine driven by an LLM.

Without kernelmcp, each library operates independently. With it, they become a coordinated multi-agent system:

Integration Matrix

Capability	Library	How kernelmcp uses it
Knowledge retrieval	ragmcp	Searches documents before the LLM guesses. Self-RAG for verified answers. ReAct RAG for multi-step reasoning. RAGAS eval for quality measurement.
Persistent memory	memorymcp	Full context assembly (persona + working memory + episodes + facts) at task start. Auto-stores episodes and outcomes. Consolidation + decay engine. GDPR forget_user.
Plan decomposition	planningmcp	Creates and enforces step-by-step plans with templates (deploy, migrate, audit…). Cost estimation, validation, replan on failure.
File operations	workspacemcp	Reads, writes, checkpoints files with DLP secret detection and approval gates for sensitive files.
Code execution	sandboxmcp	Runs code in isolated sandboxes with auto-heal. Vault for secrets, artifact signing, code validation.
Task scheduling	schedulermcp	Cron, interval, and watch jobs for event-driven automation
Host access	kernelmcp	host_exec, host file tools with per-tenant HostGuard whitelist
Web search	websearchmcp	SearXNG self-hosted search + Playwright browser_fetch

Governance & Security

Governance in kernelmcp is enforced at the execution layer — a code path that stops the agent regardless of what the LLM “decides” — not asked for in a prompt. Every tool call funnels through a single chokepoint where the rules are applied.

DLP secret guard — every tool result is scanned and secrets (AWS keys, tokens, private keys, connection strings) are redacted before they reach the model, so it can’t leak what it never sees. Outbound tool calls whose arguments carry a secret are blocked at dispatch. Deterministic pattern matching, not LLM judgement. Opt-in: enable_dlp=True / KERNELMCP_DLP; emits secret.redacted / secret.blocked events.
Provable plans — when running in LTP/hybrid mode, kernelmcp can statically verify the compiled plan against a policy before it executes a single step. Pass a ltpmcp.PlanPolicy as plan_policy (opt-in; default off): the kernel runs verify_plan on the plan and refuses it if it violates the policy — denied tools, a step budget, runtime escape hatches, a no-egress-after-sensitive-read data-flow rule, or secret/path arguments. An exfiltration plan is refused at 0 steps, before any tool runs. See ltpmcp → Provable plans.
Budget cap — hard token/cost ceiling per task; the loop stops the moment it’s exceeded.
Loop detection — identical repeated tool calls are blocked (TOOL_DISPATCH_BLOCKED).
Approval gates & checkpoints (via workspacemcp) — destructive file ops on sensitive paths require approval and are snapshotted first.
Egress allowlist, RBAC, and a full audit trail — every block carries a reason and a correlation id, and is replayable.

Same model, same prompt — the only difference is whether the rules are enforced or merely prompted.

A/B testing of constitutions

run_ab(...) runs a goal under two constitutions and reports success / cost / token (and optional quality) deltas plus a winner — decided on quality first (when an evalmcp judge is supplied), then success rate, then cost. Surfaces: lib run_ab, CLI kernelmcp ab, MCP ab_test, API POST /ab. (In the Hub, a current-vs-proposed A/B panel on the Security page, with an optional evalmcp judge.)

Dry-run (simulation)

Plan and decide with zero side effects: kernel.run(goal, dry_run=True) runs the loop but short-circuits every tool call, recording the calls it would have made in task.metadata["dry_run_calls"] (and which guards would have fired) without executing any. Surfaces: lib (dry_run=True), CLI (kernelmcp run "<goal>" --dry-run), and the REST API (run request with dry_run: true, returns dry_run_calls).

Multi-Agent TaskForce

The headline feature: compose multiple specialized agents into coordinated teams that collaborate on complex goals.

Creating a TaskForce

from kernelmcp import KernelFactory
from kernelmcp.agents.taskforce import TaskForce

kernel = KernelFactory.from_env()
registry = kernel._agent_registry

# Pre-built template (coding / research / writing / analysis)
tf = TaskForce.create("coding", goal="Build a REST API with auth", registry=registry)

# Or build a taskforce from explicit agents
from kernelmcp.agents.patterns import AgentConfig

tf = TaskForce(
    agents=[
        AgentConfig(type="research", role="Researcher"),
        AgentConfig(type="code", role="Implementer"),
        AgentConfig(type="file", role="Documenter"),
    ],
    goal="Research and implement caching strategy",
    pattern="sequential",
    registry=registry,
)

result = await tf.run()

5 Execution Patterns

Pattern	Description	Use case
`sequential`	Agents execute one after another, each building on the previous result	Linear workflows, pipelines
`parallel`	All agents execute concurrently, results are merged	Independent subtasks, speed
`supervisor`	A supervisor agent delegates to workers and synthesizes results	Complex multi-step goals
`debate`	Agents argue opposing positions, a judge selects the best answer	Decision-making, verification
`swarm`	Agents self-organize dynamically based on the task state	Emergent collaboration

Inter-Agent Infrastructure

Component	Description
SharedMemory	Shared key-value store accessible by all agents in the taskforce
MessageBus	Agent-to-agent messaging for coordination and status updates
SandboxScope	Per-agent isolation — each agent gets its own sandbox, files, and context
Handoff	Structured context transfer when one agent hands off to another

Pre-Built TaskForce Templates

Template	Agents	Pattern	Purpose
`coding`	code, code, file	sequential	Build software with tests and docs
`research`	research, research, memory	sequential	Research a topic with verification and summary
`writing`	research, custom, custom	sequential	Draft and edit written content from research
`analysis`	research, code, custom	sequential	Gather data, analyze it, and write a report

7 Agent Types

kernelmcp provides 7 specialized agent types, each with a focused tool set and constitution:

Agent Type	Role	Key capabilities
`code`	Software engineer	Write, edit, execute, debug code via sandboxmcp
`research`	Researcher	Search the web (via websearchmcp), browse, RAG queries, synthesize
`file`	File manager	Read, write, checkpoint, organize files via workspacemcp
`memory`	Memory curator	Store, retrieve, consolidate, forget via memorymcp
`plan`	Planner	Decompose goals, create plans, estimate costs via planningmcp
`custom`	User-defined	Any combination of tools and constitution rules
`meta`	Meta-agent	Analyze runs, improve constitutions, generate templates

from kernelmcp import KernelFactory

# Spawn a specific agent type
kernel = KernelFactory.from_env()
result = await kernel.spawn_agent("research", goal="Find best practices for API rate limiting")

MetaAgent (Self-Improving)

The MetaAgent analyzes past runs and proposes improvements to the system itself:

Failure analysis — examines failed or slow runs to identify root causes
Constitution improvements — proposes rule changes based on observed failure patterns
LTP template generation — auto-generates reusable LTP templates from successful runs
Dry-run mode — preview proposed changes before applying them

# CLI: analyze recent runs and suggest improvements
kernelmcp improve --dry-run

# Apply improvements
kernelmcp improve

from kernelmcp.agents.meta_agent import MetaAgent

meta = MetaAgent(
    llm=kernel._engine._llm,
    orchestrator=kernel.orchestrator,
    audit_logger=kernel._audit,
)

report = await meta.analyze(namespace="default", limit=200)
suggestions = await meta.suggest(report)
for s in suggestions:
    print(f"[{s.type}] {s.content} (confidence={s.confidence:.2f})")

Observability

Full visibility into agent execution with tracing, analytics, and replay.

Tracing (spans)

Tracing is automatic — there’s nothing to wire up. Every run records a hierarchical span tree (SpanCollector, in kernelmcp.core.models) onto the executed task at task.spans, so you read it straight off the task:

# Every executed Task carries a hierarchical span tree on `task.spans`.
for span in task.spans:                 # nested via span.children
    print(span.name, span.type, span.status, span.duration_ms, span.metadata)
    # name e.g. "tool.web_search" / "llm.complete"; metadata: model, tokens, cost, namespace

Each Span carries name, type, status, timing (duration_ms), input/output, error, nested children, and a metadata dict (model, tokens, cost, namespace). The same spans feed the Hub’s span-waterfall view and the OTel exporter below.

ReplayEngine

Replay and debug past runs:

from kernelmcp.observability import ReplayEngine

# Spans live on task.spans; ReplayEngine pairs the audit trail with a span lookup.
class _Spans:
    def get_trace(self, task_id): return []
replay = ReplayEngine(audit_logger=kernel._audit, tracer=_Spans())

# Register a completed task, then inspect it
replay.register_task(task)

timeline = replay.get_timeline(task_id="abc123")
state = replay.get_state_at(task_id="abc123", turn_index=5)
forked = replay.fork(task_id="abc123", from_turn=3)
diff = replay.compare(task_id_a="abc", task_id_b="def")

OTel Bridge (Optional)

Export traces to any OpenTelemetry-compatible backend (Jaeger, Zipkin, Grafana Tempo):

kernel = KernelFactory.create(otel_endpoint="http://localhost:4317")

Prometheus metrics

The kernel API and the Hub each expose a /metrics endpoint: tool-call counts + success rate + duration, tasks by status, tokens and cost. Instrumented at the single governed tool chokepoint, so every execution path is covered. Opt-in via the [metrics] extra (pip install mcpaisuite-kernelmcp[metrics]); a graceful no-op when prometheus-client isn’t installed.

Observability MCP Tools

Tool	Description
`get_trace`	Retrieve the full trace for a task
`get_analytics`	Get analytics summary for a namespace
`compare_runs`	Compare two task runs side by side

Self-Hosted Hub (monitor & control your kernels)

Embed kernelmcp in your own apps and point each one at a self-hosted Hub to monitor your kernels from one place — and optionally control them. Monitoring is telemetry push over an outbound-only connection (no inbound port on your app); control is opt-in.

from kernelmcp import KernelFactory, connect_hub

kernel = KernelFactory.from_env()

# Monitoring only (always on once connected):
await connect_hub(kernel, hub_url="http://my-hub:8007", project="prod", api_key="kmh_...")

# ...or also let the Hub send commands to this kernel (opt-in):
await connect_hub(kernel, hub_url="http://my-hub:8007", project="prod",
                  api_key="kmh_...", allow_control=True)

# Use the kernel normally — finished tasks show up in your Hub.

connect_hub(...) is fail-safe and a no-op when unconfigured (it also reads KERNELMCP_HUB_URL / KERNELMCP_HUB_KEY / KERNELMCP_HUB_PROJECT from the environment), so it is always safe to call unconditionally. With allow_control=True the Hub can send ping / stats / set_config / run / cancel commands; pass run_handler(goal) to customize how run executes. Unlike a telemetry collector, this is a control plane your own embedded kernels connect to — you self-host it and keep your data.

Persistence

SQLite-based checkpointing for task state recovery.

StateManager

from kernelmcp.persistence.state_manager import StateManager
from kernelmcp.persistence.checkpoint import SQLiteCheckpointStore

# KernelFactory.create() already wires a StateManager as kernel._state_manager.
# To build one manually:
store = SQLiteCheckpointStore(db_path="kernel_state.db")
state_mgr = StateManager(store=store, kernel_pipeline=kernel)

# Manual checkpoint -- returns a checkpoint id
checkpoint_id = await state_mgr.checkpoint(label="before migration")

# Recover from latest checkpoint (or pass a specific checkpoint_id)
restored = await state_mgr.restore()

# List recent checkpoints
checkpoints = await state_mgr.list_checkpoints(limit=20)

Persistence MCP Tools

Tool	Description
`checkpoint`	Create a checkpoint for a running task
`restore`	Restore a task from a checkpoint
`list_checkpoints`	List available checkpoints

A2A Protocol

Agent-to-Agent (A2A) protocol support for cross-agent interoperability.

Components

Component	Description
`A2AServer`	Exposes any kernel agent as an A2A endpoint (`/.well-known/agent.json` + `/a2a/tasks`)
`A2AClient`	Discover and call external A2A agents by their well-known URL
`A2ABridge`	Auto-creates MCP tools from discovered A2A agent skills

Usage

from kernelmcp.a2a import A2AServer, A2AClient
from kernelmcp.a2a.bridge import A2ABridge  # not re-exported from kernelmcp.a2a

# Expose this agent as an A2A endpoint
server = A2AServer(kernel, skills=["code_review", "testing"])
await server.start(port=8080)
# Now discoverable at http://localhost:8080/.well-known/agent.json

# Call an external A2A agent
client = A2AClient("https://other-agent.example.com")
card = await client.discover()  # fetch agent card
result = await client.send_task("review this PR", skill="code_review")

# Bridge: auto-register external A2A skills as local MCP tools
bridge = A2ABridge(client, kernel.orchestrator)
tool_names = await bridge.register_agent("https://other-agent.example.com")
# Now callable as: kernel.call_tool("a2a_code_review", {...})

Vision and audio analysis capabilities exposed as MCP tools. Audio is a real governed path: transcribe_audio / summarize_audio accept a file path, URL, or base64, and use the dedicated transcription endpoint (Whisper / gpt-4o-transcribe) for OpenAI-family models, falling back to the chat input_audio path for audio-native models (Gemini). Needs an audio-capable model.

Inline vision (the agent sees images in its loop)

Beyond the analysis tools below, the ReAct loop is inline-multimodal: when a tool returns an image (a screenshot of a page, or read_file on a .png/.jpg), the pixels are injected into the conversation as real image blocks so the same reasoning model sees them on the next turn — not a separate text description. This is gated on the configured model supporting vision. It lets an agent verify a UI it built (screenshot("http://localhost:3000")) or inspect a frame it generated, then iterate: render → screenshot/read_file → look → fix.

Components

Component	Description
`VisionAnalyzer`	Image/screenshot analysis: `analyze`, `analyze_screenshot`, `analyze_file`, `compare`
`AudioTranscriber`	Audio processing: `transcribe`, `summarize`

MCP Tools

Tool	Description
`analyze_image`	Analyze an image from URL or base64 with a prompt
`analyze_screenshot`	Capture and analyze a screenshot of a URL
`screenshot`	Capture a URL (or local app) and return the image inline so the agent sees it directly
`transcribe_audio`	Transcribe an audio file / URL / base64 (governed)
`summarize_audio`	Transcribe then summarize audio (governed)

Usage

from kernelmcp.multimodal import VisionAnalyzer, AudioTranscriber

# Vision
vision = VisionAnalyzer(kernel)
result = await vision.analyze_file("screenshot.png", question="Describe the UI layout")
diff = await vision.compare(["before.png", "after.png"], question="What changed?")
analysis = await vision.analyze_file("diagram.pdf", question="Extract the architecture")

# Audio
audio = AudioTranscriber(kernel)
transcript = await audio.transcribe("meeting.mp3")
summary = await audio.summarize("meeting.mp3", prompt="Key decisions and action items")

Enterprise

Role-based access control, cost allocation, and SLA monitoring for production deployments.

Components

Component	Description
`RBACManager`	4 default roles (`admin`, `operator`, `viewer`, `agent`) with namespace + tool permissions
`CostAllocator`	Per-namespace, per-agent, and per-tool cost tracking with budget alerts
`SLAMonitor`	SLA rules, violation alerts, and webhook notifications

Usage

from kernelmcp.enterprise.rbac import RBACManager, Permission
from kernelmcp.enterprise.cost_allocation import CostAllocator
from kernelmcp.enterprise.sla import SLAMonitor, SLARule

# RBAC -- define roles, assign users, then check permissions
rbac = RBACManager()  # or RBACManager.default_roles() for admin/operator/viewer/agent
rbac.add_role("operator", [
    Permission(role="operator", namespace="prod", tools=["run_task", "get_task_status"]),
])
rbac.assign_user("alice", "operator")
rbac.check("alice", tool="host_exec", namespace="prod")  # -> False (denied)

# Cost allocation -- track spend per namespace/agent/tool
costs = CostAllocator()
await costs.set_budget(namespace="prod", max_cost=500.0)
await costs.record(namespace="prod", cost=0.42, agent_type="code", tool="execute_code")
report = await costs.get_report(namespace="prod")  # spent, budget, within_budget

# SLA monitoring -- add rules, then check against live metrics
sla = SLAMonitor()  # or SLAMonitor.default_sla() for sensible presets
sla.add_rule(SLARule(name="P95 Latency", metric="p95_latency_ms", threshold=5000, operator="lte"))
sla.add_rule(SLARule(name="Error Rate", metric="error_rate", threshold=0.01, operator="lte"))
sla.set_webhook("https://hooks.example.com/alerts")
alerts = await sla.check({"p95_latency_ms": 6200, "error_rate": 0.03})

Scaling

BudgetScaler

Control cost and concurrency at the system level:

from kernelmcp.scaling import BudgetScaler

scaler = BudgetScaler(
    max_cost_per_hour=5.0,       # USD ceiling per hour
    max_tokens_per_hour=500000,  # Token ceiling per hour
    max_concurrent_agents=10,    # Agent concurrency limit
)
kernel = KernelFactory.create(scaler=scaler)

Docker Compose

Production-ready stack with all services:

docker compose -f deploy/docker-compose.yml up

Kubernetes

Kubernetes manifests for cloud-native deployment:

kubectl apply -f deploy/k8s/

Streaming

Real-time task execution streaming via Server-Sent Events (SSE):

from kernelmcp.streaming import TaskStream

stream = TaskStream(kernel)
async for event in stream.run("build the API"):
    print(f"[{event.type}] {event.data}")
    # task.started, turn.completed, tool.called, task.completed, ...

SSE events are compatible with any SSE client (browser EventSource, curl, etc.).

ReAct Engine

The ReAct Engine is the heart of kernelmcp. It runs a recursive loop: ask the LLM, execute tool calls, feed results back, repeat until the task is complete or the budget is exhausted.

Loop features:

Context compaction — when the conversation approaches the window size, older turns are summarized by the LLM into a single, fact-preserving summary turn (incremental, with a circuit breaker) instead of being silently truncated.
In-loop todo list — for multi-step tasks the agent maintains a checklist via write_todos (TodoWrite-style), re-injected each turn so it stays oriented; offered only for non-trivial tasks.
Inline multimodal — tool-returned images (screenshots, generated frames) are shown to the model directly (see Multi-Modal).
Parallel read-only tools — side-effect-free tool calls issued together (read/grep/glob) run concurrently; stateful calls stay serial.
Workspace awareness — a bounded top-level snapshot of the workspace is surfaced so the agent doesn’t guess whether a file exists.

User Goal

Bootstrap Contextmemorymcp + ragmcp

Route to LLMSupervisor: local / fast / cloud

⟳ ReAct Loop

Build system prompt — Constitution + context
Call LLM — with fallback
If tool_calls — execute via orchestrator, auto-heal on failure, audit + cost tracking, then loop back to step 1
If text response — task complete, store outcome in memory

Return Task Resultstatus, turns, cost

LTP Engine (Lean Task Protocol)

The LTP engine compiles natural language plans into deterministic execution graphs. Unlike ReAct (which reasons at every step), LTP compiles once and executes deterministically — faster, cheaper, and predictable for structured tasks.

Hybrid mode (default)

Hybrid mode auto-selects between ReAct and LTP based on task structure:

# Hybrid mode (default) -- kernel decides
result = await kernel.run_task("deploy the API to staging and production")

# Force LTP mode
result = await kernel.run_task("deploy the API", mode="ltp")

# Force ReAct mode
result = await kernel.run_task("debug the auth bug", mode="react")

LTP directives

Directive	Description	Example
`@PARALLEL`	Execute steps concurrently	`STEP 2a @PARALLEL: run tests`
`ON_FAIL`	Error handling per step	`ON_FAIL: RE-PLAN`
`FOREACH`	Iterate over a collection	`FOREACH env IN [staging, prod]:`
`RE-PLAN`	Dynamic replanning on failure	`ON_FAIL: RE-PLAN`
Dot notation	Access data from previous steps	`{{step1.output.url}}`
Type casting	Cast values between types	`{{step2.count \| int}}`

Compiled plan example

STEP 1: search_documents "API authentication patterns"
STEP 2a @PARALLEL: write_file api/auth.py
STEP 2b @PARALLEL: write_file tests/test_auth.py
STEP 3: execute_code "pytest tests/test_auth.py"
  ON_FAIL: RE-PLAN
STEP 4: FOREACH env IN [staging, production]:
  deploy --target {{env}} --artifact {{step2a.output.path}}

Programmatic Graph API

Build cyclic, stateful agent graphs in pure Python (from kernelmcp.graph import Graph, END): add_node(name, fn) runs an arbitrary Python callable that shares a state dict; add_edge(a, b) (a self-edge is a bounded loop with an early-exit {"__stop__": True} signal); add_conditional_edge(src, router, mapping) for branching; plus checkpoint/resume (GraphCheckpointer). It’s a thin facade over the same executor as the Hub’s visual builder, so parallel waves, governed tool/code nodes, per-node spans and map-reduce all work unchanged.

Honest scope: callable nodes are your own in-process code (like a LangGraph node) — not sandboxed and not routed through the tool chokepoint (governance still applies to any kernel tools they call); use a code node for sandboxed execution. Graphs with Python callables are library-only by nature (a function can’t cross JSON/MCP/HTTP).

Dynamic map-reduce (`map` node)

Fan a step out over a list whose size is only known at runtime — process each of N search results / files / rows in parallel — then reduce. over resolves a parent’s output (or a ${channel:NAME}) to a list; each branch sees ${item}/${index}; results merge via a named reducer (append / concat / sum / dedup / last / merge) into the node output and an optional named into channel. Each branch is a governed, traced step (its tool calls hit the same chokepoint). ${channel:NAME} is a named shared-state layer alongside edge flow, readable by any downstream node.

Governance note: the graph path has no static plan verifier (unlike compiled LTP plans), so dynamic fan-out is bounded at runtime — a per-node max_fanout (hard-ceiled at 200) plus per-call budget/quota — rather than statically proven. Use the LTP path when you need a statically verifiable plan.

In the Hub, the visual editor exposes a 🔀 Map node (palette + inspector: over / reducer / max_fanout / into, with a tool|code|agent body), and the chat-to-build architect can propose map nodes when the item count varies.

Agent-JIT cache (experimental, situational)

Many workloads repeat the same kind of task with different parameters (“sum of squares to 100”, then “…to 900”). Agent-JIT amortizes those repeats. It is experimental and off by default — a genuine win on the right workload, but not a universal speedup (measured details below).

First sighting (cold). The task runs through the normal engine. Its winning solution (the execute_code pattern) is cached, keyed by a semantic signature — the goal with numbers masked, embedded locally (fastembed) so paraphrases of the same family match.
Second sighting (shadow). The cached pattern is adapted to the new goal and run, and the cold engine runs too. Their outputs are compared deterministically (not by an LLM). A match marks the family trusted.
Trusted (warm). Later instances skip cold reasoning entirely: adapt the validated pattern and execute it. Much cheaper, and safe — trust came from exact output comparison, not a model’s judgement.
Correctness is safe; cost is not always. A warm answer is only used after deterministic output validation, and any mismatch falls back to the full engine — so it never ships an unvalidated answer. But it is not “never cheaper”: measured, a warm reuse is ~34× cheaper (~330 vs ~11k tokens) once engaged, yet engagement requires the family to reliably route through execute_code (non-deterministic for simple tasks) and the cold runs to validate the shadow. On low-repetition or non-code-routed traffic it can be net-neutral to ~+15% (an un-amortized shadow pass). That’s why it’s experimental and off by default — turn it on for known repetitive, code-heavy workloads, not as a blanket optimization.

Off by default — opt in per kernel:

kernel = KernelFactory.from_env()              # honours KERNELMCP_JIT=1
# or explicitly
kernel = KernelFactory.create(..., jit=True)

Inspect or clear the cache (optionally per namespace) from any surface:

kernel.jit_stats()   # {"enabled": True, "families": 3, "trusted": 2, "total_hits": 41, ...}
kernel.jit_clear()   # drop cached patterns; returns the count removed

kernelmcp run "sum of squares to 500" --jit
kernelmcp jit stats
kernelmcp jit clear

Also exposed as MCP tools (jit_stats, jit_clear), HTTP endpoints (GET /jit/stats, POST /jit/clear), and a toggle in the Hub’s Engine settings. Patterns persist to ~/.kernelmcp/jit_cache.json and are isolated per namespace (tenant-safe).

schedulermcp

Event-driven task scheduling with four schedule types:

Type	Description	Example
`once`	Run once at a specific time	`run_at: "2026-04-28T09:00:00"`
`cron`	Cron expression schedule	`cron_expr: "0 9 * * *"`
`interval`	Run every N seconds	`interval_seconds: 300`
`watch`	Trigger on condition change	`watch_command + watch_condition`

Watch jobs

Watch jobs monitor a command’s output and trigger when conditions are met:

# Watch for errors in the log file
await kernel.call_tool("schedule_task", {
    "name": "error_watcher",
    "schedule_type": "watch",
    "watch_command": "tail -1 /var/log/app.log",
    "watch_condition": "contains:ERROR",
    "task": "Analyze the error and suggest a fix"
})

Host System Access

Secure host commands and file operations via HostGuard whitelist:

Tool	Description
`host_exec`	Execute a whitelisted command on the host
`host_file_read`	Read a file from the host filesystem
`host_file_write`	Write content to a file on the host
`host_file_copy`	Copy a file on the host
`host_file_list`	List files in a host directory

All operations are gated by HostGuard — commands and paths must be explicitly whitelisted. Unapproved operations are rejected.

Web Search & Browser Fetch (via websearchmcp)

The kernel delegates web search and browser fetch to websearchmcp, which provides:

SearXNG — self-hosted meta search engine with multi-engine rotation (Google, DuckDuckGo, Brave, Bing). Privacy-respecting, no tracking.
browser_fetch — Playwright-based headless browser for rendering JavaScript-heavy pages before extracting content.

# The kernel routes web tools to websearchmcp automatically
content = await kernel.call_tool("browser_fetch", {
    "url": "https://example.com/dashboard",
    "wait_for": "networkidle"
})

MCP Server Modes

kernelmcp exposes three MCP server modes via the --mode flag. Each mode changes what tools are exposed and how LLM reasoning works.

agent mode (default)

The kernel has its own LLM and drives the full ReAct/LTP loop internally. The client sends a goal, the kernel plans, executes, heals, and returns the result. 32 orchestration tools are exposed (run_task, spawn_agent, get_task_status, etc.).

Best for: autonomous agents, headless deployments, programmatic usage.

kernelmcp start --mode agent

Claude Desktop config:

{
  "mcpServers": {
    "kernelmcp": {
      "command": "kernelmcp",
      "args": ["start", "--mode", "agent"]
    }
  }
}

router mode

The client LLM drives reasoning — the kernel just routes tool calls to the correct sub-server. ALL 90+ suite tools are exposed directly to the client. Zero double API calls: the client’s LLM calls tools directly without an intermediary LLM layer.

Best for: Claude Desktop, Cursor, VS Code, or any MCP client with its own LLM.

kernelmcp start --mode router

Claude Desktop config:

{
  "mcpServers": {
    "kernelmcp": {
      "command": "kernelmcp",
      "args": ["start", "--mode", "router"]
    }
  }
}

sampling mode

Same as agent mode (kernel drives the ReAct/LTP loop), but the kernel uses the client’s LLM via MCP sampling instead of its own API key. The SamplingLLMGateway routes LLM calls through the host application (VS Code, AWS Bedrock, future Claude Desktop sampling).

Best for: environments where you don’t want to manage a separate API key, or when the client already has LLM access.

kernelmcp start --mode agent --sampling

Claude Desktop config:

{
  "mcpServers": {
    "kernelmcp": {
      "command": "kernelmcp",
      "args": ["start", "--mode", "agent", "--sampling"]
    }
  }
}

Mode comparison

	agent (default)	router	sampling
Who reasons	Kernel’s LLM	Client’s LLM	Client’s LLM (via sampling)
Tools exposed	32 orchestration tools	ALL 90+ suite tools	32 orchestration tools
Double API calls	Yes (client + kernel)	No	No
Needs API key	Yes	No	No
ReAct/LTP loop	Kernel-driven	N/A (client drives)	Kernel-driven
Best for	Autonomous agents	Claude Desktop, Cursor, VS Code	No-API-key deployments

MCP Sampling

The SamplingLLMGateway enables MCP Sampling passthrough — route LLM calls through the host application (VS Code, AWS Bedrock) instead of managing API keys directly.

kernel = KernelFactory.create(llm_gateway="sampling")

Tenant Isolation

workspacemcp supports per-tenant isolation for multi-tenant deployments:

workspace:
  tenant_isolation: true
  base_path: /data/tenants

Each tenant gets isolated file storage, checkpoints, and artifacts. Cross-tenant access is blocked.

Quick Start

3-line usage

from kernelmcp import KernelFactory

kernel = KernelFactory.from_env()
result = await kernel.run("analyze the auth module and suggest improvements")
print(result.summary)

Full suite (all 7 servers wired in-process)

from kernelmcp import KernelFactory

kernel = KernelFactory.full_suite(
    llm_model="claude-sonnet-4-6",
    api_key="sk-...",
    namespace="my_project",
)

result = await kernel.run(
    goal="migrate the database schema to v2",
    budget_usd=1.0,
    mode="hybrid",
)
print(f"Summary: {result.summary}")
print(f"Steps: {len(result.steps_taken)} | Cost: ${result.cost_usd:.4f}")

# Direct tool calls (no LLM)
await kernel.call_tool("schedule_task", {
    "goal": "daily backup", "job_type": "cron", "cron": "0 2 * * *"
})

Multi-Agent TaskForce

from kernelmcp import KernelFactory
from kernelmcp.agents.taskforce import TaskForce

kernel = KernelFactory.from_env()

# Launch a pre-built taskforce (templates: coding, research, writing, analysis)
tf = TaskForce.create("coding", goal="Build a payment API", registry=kernel._agent_registry)
result = await tf.run()
print(f"Agents used: {len(result.agent_results)} | Tokens: {result.total_tokens}")

MCP Server

from kernelmcp.factory import KernelFactory
from kernelmcp.mcp_server import KernelMCPServer

kernel = KernelFactory.from_env()
KernelMCPServer(kernel).run()

Or from the command line:

# Agent mode (default) -- kernel drives ReAct/LTP, exposes 32 tools
kernelmcp start

# Router mode -- client LLM drives, ALL 90+ suite tools exposed, zero double API calls
kernelmcp start --mode router

# Sampling mode -- agent mode but uses client's LLM via MCP sampling
kernelmcp start --mode agent --sampling

Claude Desktop claude_desktop_config.json (router mode recommended):

{
  "mcpServers": {
    "kernelmcp": {
      "command": "kernelmcp",
      "args": ["start", "--mode", "router"]
    }
  }
}

See MCP Server Modes for details on all three modes.

Features

Multi-Agent & Orchestration

:people_holding_hands: Multi-Agent TaskForce — compose agent teams with 5 patterns: sequential, parallel, supervisor, debate, swarm
:busts_in_silhouette: 7 Agent Types — code, research, file, memory, plan, custom, meta — each with focused tools and constitution
:jigsaw: SharedMemory + MessageBus — inter-agent collaboration with shared state and messaging
:shield: SandboxScope — per-agent isolation for safe concurrent execution
:arrows_counterclockwise: Handoff Protocol — structured context transfer between agents
:brain: MetaAgent — self-improving: analyzes failures, proposes constitution changes, generates LTP templates

Engine

:brain: ReAct + LTP Hybrid Engine — ReAct for exploratory tasks, LTP for structured execution, hybrid mode auto-selects
:rocket: LTP Compiler — Lean Task Protocol: compile once, execute deterministically. @PARALLEL, ON_FAIL, FOREACH, RE-PLAN, dot notation, type casting
:robot: Hybrid Router — routes simple tasks to local models, complex to cloud (cost optimization)
:scroll: Constitution — hardcoded PM persona with rules across 5 suite servers constraining the LLM
:wrench: Auto-Healing — detects execution failures and injects fix-retry prompts automatically

Observability & Debugging

:mag_right: Span tracing — automatic hierarchical spans on every task (task.spans), with an optional OTel exporter
:bar_chart: Analytics — token efficiency, tool/agent performance, cost breakdown, bottleneck detection
:rewind: ReplayEngine — timeline, state_at, fork, compare for post-mortem debugging
:clipboard: Immutable Audit Trail — every LLM call and tool invocation logged with per-model cost tracking
:satellite: 26 Event Types — async event bus with subscribe/emit/stream
:ocean: TaskStream — real-time SSE streaming of task execution events

Persistence & Scaling

:floppy_disk: SQLite Checkpointing — auto-checkpoint before tasks, restore from any checkpoint
:chart_with_upwards_trend: BudgetScaler — max cost/tokens per hour, max concurrent agents
:whale: Docker Compose — production-ready deployment stack
:cloud: Kubernetes — manifests for cloud-native scaling

Resilience

:moneybag: Budget Enforcer — hard caps on tokens and cost per task and per namespace
:zap: LLM Fallback Chain — primary -> secondary -> tertiary model failover
:repeat: Retry with Backoff — exponential backoff on transient failures
:traffic_light: Rate Limiter — max tasks per minute per namespace
:electric_plug: Circuit Breaker — disables a server after N consecutive failures, auto-resets

Infrastructure

:link: Suite Orchestrator — wires all servers into a unified tool registry (120+ tools)
:gear: KernelFactory — default() / create() / from_env() / from_yaml() / full_suite()
:desktop_computer: MCP Server (3 modes) — agent (kernel-driven ReAct), router (client-driven, 90+ suite tools, zero double API calls), sampling (agent + client LLM)
:keyboard: CLI — start, taskforce, templates, new, deploy, eval, improve, cost, and more
:alarm_clock: schedulermcp — cron, interval, and watch jobs for event-driven automation
:mag: Web Search — via websearchmcp (SearXNG, DuckDuckGo, Mojeek, Brave + Playwright browser rendering)
:globe_with_meridians: Dynamic MCP tools — connect any external MCP server at runtime (stdio/SSE)
:computer: Host Access — host_exec, host_file_read/write/copy/list with HostGuard security whitelist
:handshake: MCP Sampling — SamplingLLMGateway for passthrough to VS Code / Bedrock / future Claude Desktop
:lock: Tenant Isolation — workspacemcp per-tenant file, memory, and execution isolation

Knowledge & Memory

:brain: Self-RAG — retrieve, generate, self-critique, re-retrieve for verified factual answers
:zap: ReAct RAG — multi-step iterative reasoning with multiple searches for complex questions
:bar_chart: RAGAS Eval — 5-metric evaluation (context relevancy, precision, faithfulness, answer correctness)
:bust_in_silhouette: User Profiles — personalized search ranking based on user preferences
:shield: GDPR forget_user — permanently delete all data for a namespace
:file_folder: Folder & URL Ingest — ingest entire folders or download documents from URLs
:memo: Auto-Episodes — conversation turns automatically stored as episodic memory
:gear: Full Context Assembly — persona + working memory + episodes + facts + RAG docs injected at every task

Installation

# Core kernel (no suite libraries)
pip install mcpaisuite-kernelmcp

# With specific libraries
pip install "mcpaisuite-kernelmcp[memorymcp,planningmcp]"

# Full suite (all servers + webhooks + observability)
pip install "mcpaisuite-kernelmcp[all]"
# [all] pulls in: memorymcp, planningmcp, ragmcp, workspacemcp,
#                 sandboxmcp, schedulermcp + webhooks + REST API.
# (websearchmcp, ltpmcp, evalmcp are core deps — always installed.)

# Development
pip install -e ".[dev]"

One-command setup & deploy

kernelmcp init is an interactive wizard (provider, model, key, namespace, which sub-libs) that writes a ready-to-run config.yaml. A Helm chart (helm/kernelmcp) deploys the HTTP API server (/health + Prometheus /metrics) with probes, optional bundled Redis, Ingress and HPA — helm install kernel ./helm/kernelmcp.

Requirements: Python 3.11+

Constitution

The Constitution is a hardcoded system prompt that constrains every LLM call. It defines the PM persona and rule domains spanning all 5 suite servers:

Domain	Key rules
Planning	ALWAYS create a plan before executing. Never skip steps. Fix failures before advancing.
Memory	Relevant memories are injected at task start. Store important outcomes for future reference.
Knowledge	Search ragmcp before guessing. Use document context to inform decisions.
Workspace	Route ALL code to `write_file` first. Create checkpoints before modifications.
Execution	Execute through sandboxmcp only. Debug failures: read -> fix -> re-execute. Never hardcode secrets.
General	Be concise. Ask for clarification on ambiguity. Prefer local models for simple lookups.

The Constitution can be updated at runtime via the set_constitution MCP tool or programmatically:

kernel._engine._constitution.update_rules("Your custom rules here...")

Memory context and RAG context are injected into the system prompt dynamically at each turn.

Hybrid Router

The TaskSupervisor estimates task complexity (0.0 to 1.0) and routes to the optimal model:

Complexity	Range	Model	Use case
Simple	< 0.3	`ollama/mistral` (local)	Status checks, lookups, simple queries
Medium	0.3 - 0.7	`claude-haiku-4-5` (fast)	Summaries, explanations, searches
Complex	> 0.7	`claude-sonnet-4-6` (cloud)	Code generation, architecture, multi-step tasks

Complexity scoring uses keyword analysis, task length, multi-step indicators, and code-related terms. Routing can be disabled to always use the cloud model:

kernel = KernelFactory.create(enable_routing=False)

Auto-Healing

When a tool execution fails (specifically execute_code via sandboxmcp), the engine automatically injects a system prompt instructing the LLM to:

Analyze the error from stderr
Fix the code using workspacemcp.edit_file
Re-execute using sandboxmcp.execute_code
NOT advance the plan until exit_code == 0

This creates a self-correcting loop without human intervention. Auto-healing can be disabled:

kernel = KernelFactory.create(auto_heal=False)

Resilience

kernelmcp includes three resilience layers that protect against failures and runaway costs:

Budget Enforcer

Hard caps on tokens and cost per task. When exceeded, the task is marked as failed immediately.

kernel = KernelFactory.create(
    max_tokens_per_task=50000,   # token ceiling
    max_cost_per_task=1.0,       # dollar ceiling
)

Per-namespace spend tracking is available via BudgetEnforcer.get_spent(namespace).

Rate Limiter

Limits the number of tasks per minute per namespace (default: 30/min). Tasks that exceed the limit are rejected immediately.

Circuit Breaker

Disables a server after N consecutive failures (default: 3). Automatically resets after a cooldown period (default: 60 seconds). Prevents cascading failures when a suite library is down.

LLM Fallback Chain

If the primary LLM fails, the engine falls back through a chain of models:

claude-sonnet-4-6  -->  gpt-4o  -->  ollama/mistral

Retry with Backoff

Failed tool calls are retried with exponential backoff (base delay: 1s, max delay: 30s, default retries: 1).

Audit Trail

Every LLM call and tool invocation is logged as an immutable AuditEntry with:

Task ID, action type, model name, tool name
Token count and cost
Success/failure status
Namespace and timestamp

Two backends are available:

Backend	Use case
`InMemoryAuditLogger`	Development, testing
`SQLiteAuditLogger`	Production — persistent, queryable

Per-model cost tracking

# Get cost breakdown by model
costs = await audit.cost_by_model(namespace="default")
# {"claude-sonnet-4-6": {"cost": 0.42, "tokens": 12500}, "ollama/mistral": {"cost": 0.0, "tokens": 800}}

Querying the audit trail

# Recent entries
entries = await audit.query(namespace="default", action="llm_call", limit=50)

# Total entry count
count = await audit.count(namespace="default")

MCP Tools

The full suite exposes 120+ tools across all servers. The kernel itself provides orchestration, observability, and persistence tools via the MCP protocol (stdio transport):

Orchestration Tools

Tool	Description
`run_task`	Submit a task for autonomous execution
`get_task_status`	Check task progress
`list_tasks`	List all tasks
`cancel_task`	Abort a running task
`get_turns`	Get ReAct turn history
`spawn_agent`	Spawn a specialized agent (code, research, file, memory, plan, custom, meta)
`create_taskforce`	Create a multi-agent taskforce
`kernel_stats`	Token costs, latency, model usage
`kernel_config`	View kernel configuration
`set_constitution`	Update PM rules
`trigger_webhook`	Fire a webhook event
`health`	Kernel health + server connections

Observability Tools

Tool	Description
`get_trace`	Retrieve the full trace for a task
`get_analytics`	Get analytics summary for a namespace
`compare_runs`	Compare two task runs side by side

Persistence Tools

Tool	Description
`checkpoint`	Create a checkpoint for a running task
`restore`	Restore a task from a checkpoint
`list_checkpoints`	List available checkpoints

Agent & Background Tools

Tool	Description
`list_agents`	List available sub-agent types
`run_background`	Run a task asynchronously, returns an operation ID
`get_operation`	Check the status of a background operation
`list_operations`	List all background operations

Introspection Tools

Tool	Description
`kernel_audit`	View kernel task audit log (completions, failures, costs)
`improve`	Analyze kernel performance and suggest improvements
`list_taskforce_templates`	List available TaskForce templates with agent configs
`list_taskforce_examples`	List pre-built taskforce configs (secure_coding, research_verify, enterprise)

CLI

kernelmcp start                                       # Start the kernel MCP server
kernelmcp start --mode router                         # Start in router mode
kernelmcp start --sampling                            # Use the client's LLM via MCP sampling
kernelmcp taskforce "Build REST API" --template coding   # Launch a multi-agent taskforce
kernelmcp taskforce "Analyze sales data" --pattern debate  # Custom pattern
kernelmcp templates                                   # List available taskforce templates
kernelmcp new coding "Build a REST API"               # Generate a TaskForce config file from a template
kernelmcp deploy taskforce.json                       # Deploy (run) a TaskForce from a JSON config file
kernelmcp eval --suite memory                         # Run evaluation benchmarks
kernelmcp improve --dry-run                           # Preview MetaAgent improvements
kernelmcp improve                                     # Apply MetaAgent improvements
kernelmcp cost                                        # Show token cost summary
kernelmcp run "migrate the database"                  # Execute a task interactively
kernelmcp agents                                      # List available agent types
kernelmcp status                                      # Show kernel health and connected servers
kernelmcp stop                                        # Stop the kernel daemon
kernelmcp logs --tail 50                              # View recent events
kernelmcp config                                      # View kernel configuration
kernelmcp servers                                     # List connected MCP servers and tool count

Events (26 types)

kernelmcp emits 26 event types through an async event bus. Subscribe to monitor task execution in real time.

Event	Emitted when
`task.started`	A task begins execution
`task.completed`	A task finishes successfully
`task.failed`	A task fails (budget, max turns, error)
`task.cancelled`	A task is cancelled
`turn.started`	A new ReAct turn begins
`turn.completed`	A ReAct turn finishes
`tool.called`	A tool is about to be executed
`tool.succeeded`	A tool call returns successfully
`tool.failed`	A tool call fails
`plan.enforced`	A plan step is enforced
`auto_heal.triggered`	Auto-healing activates after execution failure
`context.bootstrapped`	Memory context is loaded at task start
`context.trimmed`	Context window is trimmed to fit token limits
`llm.called`	An LLM call is made
`llm.delta`	A streaming token/delta is emitted
`llm.routed`	The supervisor selects a model
`webhook.received`	An external webhook is received
`taskforce.started`	A TaskForce begins execution
`taskforce.completed`	A TaskForce finishes successfully
`taskforce.failed`	A TaskForce fails
`agent.handoff`	One agent hands off context to another
`agent.message`	An agent-to-agent message is sent

from kernelmcp.events import kernel_event_bus

# Stream all events
async for event in kernel_event_bus.stream():
    print(f"{event.type.value}: {event.data}")

# Subscribe with a queue
queue = kernel_event_bus.subscribe()
event = await queue.get()

Factory

KernelFactory provides multiple construction methods:

from kernelmcp import KernelFactory

# Zero config -- in-memory, no libraries, just LLM + ReAct
kernel = KernelFactory.default()

# From environment variables (KERNELMCP_MODEL, ANTHROPIC_API_KEY, etc.)
kernel = KernelFactory.from_env()

# From a YAML configuration file
kernel = KernelFactory.from_yaml("kernel_config.yaml")

# Full configuration
kernel = KernelFactory.create(
    llm_model="claude-sonnet-4-6",
    local_model="ollama/mistral",
    fast_model="claude-haiku-4-5-20251001",
    api_key="sk-...",
    enable_routing=True,
    max_turns=20,
    max_tokens_per_task=50000,
    max_cost_per_task=1.0,
    auto_plan=True,
    auto_heal=True,
    auto_memory=True,
    jit=False,                     # Agent-JIT: reuse shadow-validated solution patterns
    nano=False,                    # fast path for trivial single-shot tasks
    namespace="my_project",
    memory_pipeline=memory,        # from memorymcp
    planning_pipeline=planning,    # from planningmcp
    rag_pipeline=rag,              # from ragmcp
    workspace_pipeline=workspace,  # from workspacemcp
    sandbox_pipeline=sandbox,      # from sandboxmcp
)

# Full suite -- all 7 servers auto-detected and wired
kernel = KernelFactory.full_suite(
    llm_model="claude-sonnet-4-6",
    api_key="sk-...",
    namespace="default",
)

Environment variables

Variable	Default	Description
`KERNELMCP_MODEL`	`claude-sonnet-4-6`	Primary LLM model
`KERNELMCP_LOCAL_MODEL`	`ollama/mistral`	Local model for simple tasks
`KERNELMCP_ROUTING`	`true`	Enable hybrid routing
`KERNELMCP_MAX_TURNS`	`20`	Maximum ReAct turns per task
`KERNELMCP_MAX_TOKENS`	`50000`	Token budget per task
`KERNELMCP_JIT`	`false`	Reuse shadow-validated solution patterns across repeated task families (Agent-JIT)
`KERNELMCP_NANO`	`false`	Fast path for trivial single-shot tasks
`KERNELMCP_NAMESPACE`	`default`	Default namespace
`ANTHROPIC_API_KEY`	—	Anthropic API key
`OPENAI_API_KEY`	—	OpenAI API key (fallback)

Development

git clone https://github.com/gashel01/kernelmcp
cd kernelmcp
pip install -e ".[dev]"

# Run tests
pytest tests/ -v                    # 1285+ tests

# With coverage
pytest tests/ --cov=kernelmcp --cov-report=html

Project structure

kernelmcp/
  core/
    models.py          -- Task, Turn, ToolCall, KernelConfig; AgentType (7 types: code, research, file, memory, plan, custom, meta)
    engine.py          -- ReAct engine (the autonomous execution loop)
    ltp_runner.py      -- LTP compiler + deterministic executor (@PARALLEL, ON_FAIL, FOREACH, RE-PLAN)
    constitution.py    -- PM persona and hardcoded LLM constraints
    context.py         -- Context window manager
    audit.py           -- InMemory + SQLite audit loggers
    resilience.py      -- Fallback chain, retry, rate limiter
    bootstrap.py / tool_executor.py / tool_selection.py / nudges.py / ...
  agents/
    base.py            -- Agent base + search_tools meta-tool
    meta_agent.py      -- MetaAgent (self-improving: failure analysis, constitution updates)
    patterns.py        -- 5 patterns: sequential, parallel, supervisor, debate, swarm
    taskforce.py       -- TaskForce orchestrator + templates
    shared_memory.py / message_bus.py / sandbox_scope.py / graph_executor.py / registry.py
    code_agent.py / research_agent.py / file_agent.py / memory_agent.py
  routing/
    supervisor.py      -- Hybrid router (complexity estimation + model selection)
    llm_gateway.py     -- LLM abstraction layer (litellm)
    sampling_gateway.py -- SamplingLLMGateway for MCP Sampling
  observability/
    tracer.py          -- Span-based tracing + analytics
    replay.py          -- ReplayEngine: timeline, state_at, fork, compare
    hub.py             -- connect_hub() control-plane connector
    otel.py            -- OpenTelemetry export (optional)
  integration/
    orchestrator.py    -- Suite orchestrator (wires all servers) + host tools (HostGuard via sandboxmcp)
    mcp_client.py      -- Dynamic MCP server connections
    langchain_adapter.py -- LangChain tools bridge
  enterprise/          -- RBACManager, CostAllocator, SLAMonitor
  multimodal/          -- VisionAnalyzer, AudioTranscriber
  persistence/         -- SQLite checkpointing and recovery
  a2a/                 -- A2AServer / A2AClient / A2ABridge
  mcp_client/          -- MCP client utilities
  triggers/            -- Trigger / scheduling support
  api/                 -- FastAPI app (optional)
  scaling.py           -- BudgetScaler: cost/token/concurrency limits
  streaming.py         -- TaskStream with SSE events
  pipeline.py          -- KernelPipeline, BudgetEnforcer, CircuitBreaker
  events.py            -- Event bus (26 event types)
  factory.py           -- KernelFactory
  facade.py            -- High-level facade
  mcp_server.py        -- MCP server (32 tools)
  cli.py               -- CLI commands

Ecosystem & Interoperability

kernelmcp is the only AI agent framework that natively connects to three ecosystems:

Dynamic MCP Client

Connect to any of the 2000+ MCP servers at runtime:

await kernel.orchestrator.connect_mcp_server("github", transport="stdio", command="github-mcp-server")
await kernel.orchestrator.connect_mcp_server("slack", transport="sse", url="http://localhost:8080/sse")
# Tools appear automatically — the LLM can use them immediately

LangChain Tool Bridge

Use any of LangChain’s 500+ tools without leaving kernelmcp:

from langchain_community.tools import WikipediaQueryRun
kernel.orchestrator.register_langchain_tool(WikipediaQueryRun())
# lc__wikipedia is now available to the engine

License

Apache-2.0 — see LICENSE.

kernelmcp

kernelmcp

What is kernelmcp?

Integration Matrix

Governance & Security

A/B testing of constitutions

Dry-run (simulation)

Multi-Agent TaskForce

Creating a TaskForce

5 Execution Patterns

Inter-Agent Infrastructure

Pre-Built TaskForce Templates

7 Agent Types

MetaAgent (Self-Improving)

Observability

Tracing (spans)

ReplayEngine

OTel Bridge (Optional)

Prometheus metrics

Observability MCP Tools

Self-Hosted Hub (monitor & control your kernels)

Persistence

StateManager

Persistence MCP Tools

A2A Protocol

Components

Usage

Multi-Modal

Inline vision (the agent sees images in its loop)

Components

MCP Tools

Usage

Enterprise

Components

Usage

Scaling

BudgetScaler

Docker Compose

Kubernetes

Streaming

ReAct Engine

LTP Engine (Lean Task Protocol)

Hybrid mode (default)

LTP directives

Compiled plan example

Programmatic Graph API

Dynamic map-reduce (map node)

Agent-JIT cache (experimental, situational)

schedulermcp

Watch jobs

Host System Access

Web Search & Browser Fetch (via websearchmcp)

MCP Server Modes

agent mode (default)

router mode

sampling mode

Mode comparison

MCP Sampling

Tenant Isolation

Quick Start

3-line usage

Full suite (all 7 servers wired in-process)

Multi-Agent TaskForce

MCP Server

Features

Multi-Agent & Orchestration

Engine

Observability & Debugging

Persistence & Scaling

Resilience

Infrastructure

Knowledge & Memory

Installation

One-command setup & deploy

Constitution

Hybrid Router

Auto-Healing

Resilience

Budget Enforcer

Rate Limiter

Dynamic map-reduce (`map` node)