Sovereign AI orchestrator · MCP-native · Apache-2.0

One brain. Seven servers.
Zero manual intervention.

kernelmcp is the sovereign orchestrator for AI agents. One kernel wires everything together: a ReAct + LTP hybrid engine, constitution-driven planning, hybrid compute routing, auto-healing, budget enforcement, circuit breaking, multi-agent TaskForce, and host system access. It connects to any MCP server or LangChain tool at runtime; 2000+ community MCP servers and 500+ LangChain tools are reachable, but none are bundled. Deploy it as an agent, router, or sampling server.


orchestrate.py

import asyncio

from kernelmcp import KernelFactory


async def main():
    # Full suite: all 7 servers wired in-process
    kernel = KernelFactory.full_suite(
        llm_model="claude-sonnet-4-6",
        api_key="sk-...",
    )

    # Submit a goal: the ReAct engine plans and executes
    result = await kernel.run(
        goal="Analyze Q3 revenue data and generate report",
        budget_usd=0.50,          # hard spend cap
    )

    print(result.summary)         # final report
    print(result.steps_taken)     # full ReAct trace
    print(result.cost_usd)        # actual spend


# kernel.run is awaited, so it has to run inside an event loop
asyncio.run(main())

ReAct Engine · Constitution-driven · MCP stdio / SSE · Auto-healing

Features

Everything your agent needs
to plan, execute, and self-heal

Twelve core capabilities, each replaceable. From prototype to production without rewrites.

ReAct Engine

Observe-Think-Act loop with automatic tool selection. The engine reasons about which tool to call next, observes the result, and decides the next step — no hardcoded chains.

Observe · Think · Act
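The loop described above can be sketched in a few lines. This is a minimal illustration, not kernelmcp's engine: the `think` function stands in for the LLM, and the `TOOLS` registry, `Step` record, and `react` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical tool registry: plain functions keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


@dataclass
class Step:
    thought: str
    tool: str
    observation: str


def think(goal: str, history: list[Step]) -> tuple[str, Optional[str]]:
    """Stand-in for the LLM: pick the next tool, or None to stop."""
    if not history:
        return "Count the words first.", "word_count"
    if len(history) == 1:
        return "Now normalise the casing.", "upper"
    return "Both observations collected; done.", None


def react(goal: str, max_steps: int = 5) -> list[Step]:
    history: list[Step] = []
    for _ in range(max_steps):
        thought, tool = think(goal, history)               # Think
        if tool is None:
            break
        observation = TOOLS[tool](goal)                    # Act
        history.append(Step(thought, tool, observation))   # Observe
    return history


trace = react("analyze q3 revenue")
for step in trace:
    print(f"{step.tool}: {step.observation}")
```

The point of the sketch is that the sequence of tool calls is decided inside the loop from the accumulated history, not fixed in advance.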

Constitution (PM Persona)

Define planning rules, quality gates, and output constraints in a declarative constitution. The PM persona decomposes goals into tasks, assigns priorities, and enforces acceptance criteria.

pm · engineer · analyst

3 Server Modes

agent: the kernel drives ReAct/LTP with its own LLM. router: the client's LLM drives and all 90+ suite tools are routed through the kernel, so no request triggers a second LLM API call. sampling: agent mode, but using the client's LLM via MCP sampling.

agent · router · sampling

Auto-Healing

When a tool call fails or the model returns empty or invalid output, the kernel responds in two ways: it retries transient errors with exponential backoff, and it feeds the error back to the model so the model can correct its output and try again. Code-execution failures are healed the same way, with no manual intervention.

retry · backoff · error feedback
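A rough sketch of the two healing paths, assuming a tool is an async callable that takes a prompt string. `TransientError`, `call_with_healing`, and the default delays are hypothetical and only illustrate the pattern; kernelmcp's actual retry policy may differ.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, rate limits)."""


async def call_with_healing(tool, prompt, *, max_attempts=4, base_delay=0.5,
                            sleep=asyncio.sleep):
    feedback = ""
    for attempt in range(max_attempts):
        try:
            output = await tool(prompt + feedback)
            if output.strip():
                return output
            error = "empty output"
        except TransientError as exc:
            error = f"transient error: {exc}"
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, then 2x, 4x, ...
                await sleep(base_delay * 2 ** attempt)
        # Error feedback: the next attempt sees what went wrong.
        feedback = f"\n[previous attempt failed: {error}]"
    raise RuntimeError(f"gave up after {max_attempts} attempts: {error}")


# Demo: the tool times out once, returns nothing once, then succeeds.
calls, delays = [], []


async def flaky_tool(prompt):
    calls.append(prompt)
    if len(calls) == 1:
        raise TransientError("timeout")
    if len(calls) == 2:
        return ""
    return "ok"


async def record_delay(seconds):
    delays.append(seconds)  # stand-in for asyncio.sleep so the demo is instant


result = asyncio.run(
    call_with_healing(flaky_tool, "run the report", sleep=record_delay)
)
print(result, delays)
```

Only the transient failure waits before retrying; the empty output is retried immediately with the error appended to the prompt.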

Budget Enforcement

Set a hard USD/token cap per run and per namespace. The kernel tracks token usage across all LLM calls, stops gracefully before overspend, and reports actual cost in the result.

per-run · per-namespace · hard cap
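One way to picture the hard cap is a ledger that is consulted before each LLM call. The sketch below is an assumption-laden illustration: `BudgetLedger`, `call_cost_usd`, and the per-million-token prices are hypothetical, not kernelmcp's API or any provider's real pricing.

```python
class BudgetLedger:
    """Tracks spend against a per-run cap and a shared per-namespace cap."""

    def __init__(self, run_cap_usd, namespace_cap_usd):
        self.run_cap_usd = run_cap_usd
        self.namespace_cap_usd = namespace_cap_usd
        self.run_spent_usd = 0.0
        self.namespace_spent_usd = 0.0

    def try_charge(self, cost_usd):
        """Record the cost only if both caps still hold; False means stop."""
        over_run = self.run_spent_usd + cost_usd > self.run_cap_usd
        over_ns = self.namespace_spent_usd + cost_usd > self.namespace_cap_usd
        if over_run or over_ns:
            return False
        self.run_spent_usd += cost_usd
        self.namespace_spent_usd += cost_usd
        return True


def call_cost_usd(input_tokens, output_tokens,
                  usd_per_mtok_in=3.0, usd_per_mtok_out=15.0):
    # Hypothetical prices per million tokens, for illustration only.
    return (input_tokens * usd_per_mtok_in
            + output_tokens * usd_per_mtok_out) / 1_000_000


ledger = BudgetLedger(run_cap_usd=0.03, namespace_cap_usd=1.00)
calls_made = 0
for _ in range(5):
    estimate = call_cost_usd(input_tokens=1000, output_tokens=500)
    if not ledger.try_charge(estimate):
        break  # stop before the call that would overspend
    calls_made += 1

print(calls_made, round(ledger.run_spent_usd, 4))
```

Checking the estimate before the call is what makes the stop graceful: the third call is never issued, so the run ends under the cap.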

Circuit Breaker

Tracks consecutive failures per sub-server. After several consecutive failures the circuit opens and calls to that server are short-circuited with a clear error, then it auto-resets after a cooldown.

closed · open · cooldown
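The closed, open, and cooldown states map onto a small class. This is a generic circuit-breaker sketch, not kernelmcp's implementation; the threshold of 3 and the 30-second cooldown are assumed values, since the text above only says "several" failures.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is short-circuited because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("circuit open; call short-circuited")
            # Cooldown elapsed: reset and let traffic through again.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the consecutive count
        return result


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=3, cooldown_s=30.0, clock=lambda: now[0])


def broken_server():
    raise ConnectionError("sub-server down")


for _ in range(3):
    try:
        breaker.call(broken_server)
    except ConnectionError:
        pass

try:
    breaker.call(broken_server)
    short_circuited = False
except CircuitOpenError:
    short_circuited = True

now[0] = 31.0  # cooldown has passed
recovered = breaker.call(lambda: "pong")
print(short_circuited, recovered)
```

In practice there would be one breaker per sub-server, so a failing server is isolated without affecting calls to the others.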

LTP Engine

Lean Task Protocol — compile a natural language plan once, then execute deterministically. Supports @PARALLEL, ON_FAIL, FOREACH, RE-PLAN directives, dot notation, and type casting.

compile · execute · hybrid
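To illustrate what "compile once, execute deterministically" can mean, the sketch below runs an already-compiled plan with no model in the loop. It does not use LTP syntax, which is not shown here: the `PlanStep` structure, the stage layout, and the `on_fail` field are hypothetical stand-ins for the roles that @PARALLEL and ON_FAIL play.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

Action = Callable[[dict], Awaitable[object]]


@dataclass
class PlanStep:
    name: str
    action: Action
    on_fail: Optional[Action] = None  # fallback, in the spirit of ON_FAIL


async def run_step(step: PlanStep, ctx: dict) -> None:
    try:
        ctx[step.name] = await step.action(ctx)
    except Exception:
        if step.on_fail is None:
            raise
        ctx[step.name] = await step.on_fail(ctx)


async def execute(plan: list[list[PlanStep]], ctx: dict) -> dict:
    """Each stage is a list of steps run concurrently; stages run in order."""
    for stage in plan:
        await asyncio.gather(*(run_step(step, ctx) for step in stage))
    return ctx


async def fetch_a(ctx):
    return 2


async def fetch_b(ctx):
    raise TimeoutError("source b is down")


async def fetch_b_cached(ctx):
    return 3


async def total(ctx):
    return ctx["a"] + ctx["b"]


# The "compiled" plan: plain data, reusable across runs without an LLM call.
PLAN = [
    [PlanStep("a", fetch_a), PlanStep("b", fetch_b, on_fail=fetch_b_cached)],
    [PlanStep("total", total)],
]

result = asyncio.run(execute(PLAN, {}))
print(result)
```

Because the plan is data, the same inputs always take the same path, which is the property that separates this style of execution from a step-by-step ReAct loop.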

A2A Protocol

Agent-to-Agent interoperability. Expose and consume agents over the A2A standard — interoperate with any A2A-compliant framework via a generic server, client, and bridge.

A2A server · A2A client · bridge

Agent-JIT Cache (experimental)

Amortize repeated task families. A validated solution pattern is reused after one deterministic shadow check — ~34× cheaper per reuse when it engages, and never ships an unvalidated answer (falls back on mismatch). Situational: a net win only on repetitive, code-routed workloads; can be net-neutral otherwise. Off by default.

shadow-validate · ~34× on reuse · opt-in
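One possible reading of the shadow check, sketched under stated assumptions: a candidate pattern is run alongside the full solver once, and it is only reused after its answer matches. The `JitCache` class and the `full_solver` contract (returning an answer plus a reusable pattern) are hypothetical and may not match how kernelmcp implements the cache.

```python
class JitCache:
    def __init__(self):
        self._candidates = {}  # seen once, not yet trusted
        self._validated = {}   # passed one shadow check

    def run(self, family, task, full_solver):
        if family in self._validated:
            return self._validated[family](task), "reused"

        answer, pattern = full_solver(task)  # the expensive path always ships
        candidate = self._candidates.get(family)
        if candidate is not None and candidate(task) == answer:
            # Shadow check passed: the cached pattern agrees with the real answer.
            self._validated[family] = candidate
        else:
            # First sighting or mismatch: keep the newest pattern as candidate.
            self._candidates[family] = pattern
        return answer, "solved"


solver_calls = 0


def full_solver(task):
    """Stand-in for an LLM-driven solve that also yields a reusable pattern."""
    global solver_calls
    solver_calls += 1
    return sum(task), sum


cache = JitCache()
outcomes = [
    cache.run("sum-a-list", task, full_solver)
    for task in ([1, 2, 3], [4, 5, 6], [7, 8, 9])
]
print(outcomes, solver_calls)
```

In this sketch an unvalidated pattern never produces the shipped answer: until the shadow check passes, the result always comes from the full solver.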

How it works

Five phases. Seven servers.
Fully autonomous.

Submit a goal. The kernel bootstraps, plans, executes, heals, and reports — all without manual intervention.

Bootstrap

KernelFactory wires all seven servers, loads the constitution, and initializes the event bus.

kernel = KernelFactory.default()

Plan

The PM persona decomposes your goal into ordered subtasks with acceptance criteria and budget allocation — automatically, inside the run.

result = await kernel.run(goal=goal)

Execute

The ReAct engine runs each subtask: observe context, think about the next action, act via tool call, repeat.

result = await kernel.run(goal=goal)

Heal

When a step fails, the auto-healer retries, falls back to alternative tools, or triggers a re-plan if needed.

# automatic — no code needed

Report

Returns a structured result with summary, full ReAct trace, cost breakdown, and events emitted.

print(result.summary, result.cost_usd)

Goal → Constitution → ReAct / LTP Engine → Result

memorymcp planningmcp workspacemcp sandboxmcp ragmcp schedulermcp

Code examples

See it in action

Six common configurations — pick the one that fits and copy it straight into your project.

Python — Quick Start

import asyncio
from kernelmcp import KernelFactory

async def main():
    kernel = KernelFactory.from_env()

    result = await kernel.run(
        goal="Summarize the top 3 issues in the backlog",
        budget_usd=0.25,
    )

    print(result.summary)
    print(f"Cost: ${result.cost_usd:.4f}")
    print(f"Steps: {len(result.steps_taken)}")

asyncio.run(main())

Python — Full Suite with all 7 servers

import asyncio

from kernelmcp import KernelFactory


async def main():
    # full_suite auto-detects and wires every installed suite library
    kernel = KernelFactory.full_suite(
        llm_model="claude-sonnet-4-6",
        api_key="sk-...",
        namespace="research",
    )

    result = await kernel.run(
        goal="Research competitor pricing and draft a report",
        budget_usd=1.00,
    )

    # One line per step: action, tool used, and cost
    for step in result.steps_taken:
        print(f"[{step.action}] {step.tool} — ${step.cost_usd:.4f}")


asyncio.run(main())

3 server modes: agent (kernel-driven ReAct) · router (client-driven, 90+ suite tools) · sampling (agent + client's LLM)

router mode (recommended for Claude Desktop)

JSON — claude_desktop_config.json

{
  "mcpServers": {
    "kernelmcp": {
      "command": "kernelmcp",
      "args": ["start", "--mode", "router"],
      "cwd": "/path/to/your/project"
    }
  }
}

agent mode (headless / SSE)

JSON — agent mode / SSE

{
  "mcpServers": {
    "kernelmcp": {
      "type": "sse",
      "url": "http://localhost:8007/sse"
    }
  }
}

YAML — constitution.yaml

name: pm
description: Product Manager persona — structured planning

planning:
  decomposition: hierarchical
  max_subtasks: 10
  require_acceptance_criteria: true

quality_gates:
  - name: completeness
    threshold: 0.8
  - name: accuracy
    threshold: 0.9

budget:
  default_usd: 0.50
  warn_at_percent: 80
  abort_at_percent: 100
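The thresholds in this file could be applied roughly as follows. This is a sketch of one plausible interpretation, not kernelmcp's evaluation logic: the dict mirrors the YAML above by hand, and `failed_gates` and `budget_status` are hypothetical helpers.

```python
# Hand-written mirror of the quality_gates and budget sections above.
CONSTITUTION = {
    "quality_gates": [
        {"name": "completeness", "threshold": 0.8},
        {"name": "accuracy", "threshold": 0.9},
    ],
    "budget": {"default_usd": 0.50, "warn_at_percent": 80, "abort_at_percent": 100},
}


def failed_gates(scores):
    """Return the names of gates whose score is below the threshold."""
    return [
        gate["name"]
        for gate in CONSTITUTION["quality_gates"]
        if scores.get(gate["name"], 0.0) < gate["threshold"]
    ]


def budget_status(spent_usd):
    """Classify spend as 'ok', 'warn', or 'abort' against the default budget."""
    budget = CONSTITUTION["budget"]
    percent = 100 * spent_usd / budget["default_usd"]
    if percent >= budget["abort_at_percent"]:
        return "abort"
    if percent >= budget["warn_at_percent"]:
        return "warn"
    return "ok"


print(failed_gates({"completeness": 0.85, "accuracy": 0.88}))
print(budget_status(0.30), budget_status(0.45), budget_status(0.50))
```

Under this reading, a run scoring 0.88 on accuracy fails that gate, and spend of $0.45 against the $0.50 default crosses the 80% warning line.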

Python — LTP (Lean Task Protocol) Mode

import asyncio

from kernelmcp import KernelFactory


async def main():
    kernel = KernelFactory.default()

    # LTP mode: compile a plan once, execute deterministically
    result = await kernel.run(
        goal="Build and deploy the API server",
        mode="ltp",
    )
    print(result.summary)

    # Hybrid mode (default) auto-selects ReAct or LTP
    result = await kernel.run(
        goal="Analyze logs and fix the bug",
        mode="hybrid",
    )
    print(result.summary)


asyncio.run(main())

LTP directives: @PARALLEL, ON_FAIL, FOREACH, RE-PLAN, dot notation, type casting

Measured, not claimed

kernelmcp vs LangGraph & CrewAI

Same model, same controls, losses shown as plainly as wins. Every number reproduces from a script with raw JSON in the repo.

Metric	kernelmcp	LangGraph / CrewAI
Deep-chain success (Sonnet)	100%	100% / 100%
$ per task at 100% (Sonnet)	$0.015	$0.038 / $0.046
Cost advantage (Sonnet)	2.6×–3.1× cheaper	—
Latency (Sonnet deep chain)	~9.55s	~31s

On deep agentic chains all four frameworks hit 100%, so reliability is a tie. kernelmcp Hybrid is the cheapest 100%-reliable system because LTP compiles the chain once instead of re-invoking the LLM at every step. On a near-empty single-tool task a bare agent is still cheaper: the edge comes from chain depth and prompt caching, not from raw token counts.
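The ratios can be recomputed from the rounded figures in the table. The snippet below uses only the table's numbers; because those are rounded, the lower cost ratio comes out near 2.5×, and the published 2.6× presumably reflects the unrounded costs in the raw JSON.

```python
# Figures copied from the table above (Sonnet, deep chain).
kernelmcp_cost = 0.015
langgraph_cost = 0.038
crewai_cost = 0.046
kernelmcp_latency_s = 9.55
baseline_latency_s = 31.0

cost_ratio_langgraph = round(langgraph_cost / kernelmcp_cost, 2)
cost_ratio_crewai = round(crewai_cost / kernelmcp_cost, 2)
latency_ratio = round(baseline_latency_s / kernelmcp_latency_s, 2)

print(f"vs LangGraph: {cost_ratio_langgraph}x cheaper")
print(f"vs CrewAI:    {cost_ratio_crewai}x cheaper")
print(f"latency:      {latency_ratio}x faster")
```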

See the full benchmark →
