Python memory library · MCP-native · Apache-2.0

Your agent's memory.
Persistent. Intelligent.
MCP-native.

memorymcp is a modular Python library for persistent cognitive memory. Store facts, episodes, and context across sessions. Query with semantic search. Expose as an MCP server.

Get started View on GitHub

agent_memory.py

from memorymcp import MemoryFactory

pipeline = MemoryFactory.default()

await pipeline.store_fact(
    "User prefers Python over JavaScript",
    fact_type="preference",
    importance=0.9,
)

facts = await pipeline.query_memory("coding preferences", top_k=5)
context = "\n".join(f.fact.content for f in facts)

from memorymcp.mcp_server import MemoryMCPServer
MemoryMCPServer(pipeline).run()

4-tier memory Semantic search via ragmcp

MCP stdio / SSE

Features

Everything your agent needs
to remember and learn

Modular by design — each memory tier and component is replaceable.

4-Tier Memory

HotCache (LRU RAM) -> Working -> Episodic -> Semantic. Each tier has the right TTL and storage backend.

HotCache Working Episodic

Semantic Memory

ragmcp Qdrant pgvector

Reranker & query expansion

Two optional retrieval upgrades, off by default. A cross-encoder reranker (fastembed, no torch) re-scores retrieved facts — measured recall@1 0.40 → 0.88. Adaptive query expansion rewrites weak queries and re-searches — measured recall@5 0.42 → 0.92.

enable_rerank enable_query_expansion

$0 ingestion by default

Most memory systems call an LLM to extract facts from text — a paid API call every time you write. memorymcp's default extractor is deterministic regex (no LLM) and its default embedder runs locally (fastembed, on your CPU), so ingestion costs $0. Need to parse messy prose? Switch on the optional LLM extractor — then you pay per write, like the others.

No LLM Local embeddings LLM extractor optional

MCP-Native

16 MCP tools out of the box: query_memory, store_fact, get_session_context, add_episode, assemble_context, and more.

stdio SSE

Consolidation Engine

ADD/UPDATE/OVERWRITE/SKIP logic. Contradictions are detected and archived. Full timeline preserved.

ADD UPDATE OVERWRITE

Decay & Confidence

Tri-modal decay (linear/exponential/anchored). Usage-based slowdown. Separate confidence tracking.

linear exponential adaptive

Privacy & Quotas

PII filter (email, phone, CC, SSN). GDPR Art.17 forget_user(). Per-namespace memory quotas.

GDPR Art.17 PII filter

Zero Config Start

MemoryFactory.default() works without any external service. Add Qdrant, Redis, Neo4j — all optional.

default() create() from_env()

Persistent Backends

Redis for working memory, SQLite for local persistence, Neo4j for the fact graph, and pgvector or Chroma for semantic search.

Redis SQLite Neo4j

How it works

Install once.
Remember what matters.

No boilerplate. memorymcp handles the memory plumbing so your agent can focus on reasoning.

Install

One pip install. Zero dependencies to start. Backends are optional extras.

pip install mcpaisuite-memorymcp

Store

Store facts with type, importance, and confidence. Consolidation engine handles duplicates automatically.

await pipeline.store_fact( "User prefers Python", fact_type="preference" )

Query

Retrieve relevant context by meaning. memorymcp searches all tiers and ranks by decay-adjusted score.

facts = await pipeline.query_memory( "coding preferences", top_k=5 )

Deploy

Expose as an MCP server, embed in your agent, or use as a standalone service.

MemoryMCPServer(pipeline).run()

Code examples

See it in action

Python — pydantic-ai agent with memorymcp

import asyncio
from memorymcp import MemoryFactory
from pydantic_ai import Agent

async def main():
    pipeline = MemoryFactory.default()
    agent = Agent("openai:gpt-4o")

    @agent.tool
    async def remember(content: str, fact_type: str = "general") -> str:
        await pipeline.store_fact(content, fact_type=fact_type, importance=0.8)
        return f"Stored: {content}"

    @agent.tool
    async def recall(query: str, top_k: int = 5) -> str:
        facts = await pipeline.query_memory(query, top_k=top_k)
        return "\n".join(f"[{f.fact.fact_type}] {f.fact.content}" for f in facts)

    ctx = await pipeline.assemble_context("coding preferences", session_id="sess_1", max_tokens=1000)
    result = await agent.run("What do you know about my coding preferences?", system_prompt=ctx.render("markdown"))
    print(result.data)

asyncio.run(main())

Python — MCP server

from memorymcp import MemoryFactory
from memorymcp.mcp_server import MemoryMCPServer

pipeline = MemoryFactory.default()
MemoryMCPServer(pipeline).run()

16 MCP tools: query_memory, store_fact, get_session_context, add_episode, assemble_context, set_working_memory, get_working_memory, forget_fact, memory_stats, memory_config, traverse_graph, find_contradictions, extract_facts, export_memory, import_memory, memory_analytics

stdio (recommended)

JSON

{
  "mcpServers": {
    "memorymcp": {
      "command": "memorymcp",
      "args": ["serve"],
      "cwd": "/path/to/your/project"
    }
  }
}

SSE (shared server)

JSON

{
  "mcpServers": {
    "memorymcp": {
      "type": "sse",
      "url": "http://localhost:8080/sse"
    }
  }
}

Memory tiers

A dedicated store for
each tier of memory

Tier	Name	Backend	TTL	Purpose
0	HotCache	LRU RAM	seconds	Ultra-fast access to recently used facts
1	Working	InMemory	minutes	Current task context
2	Episodic	InMemory / DB	hours	Session history and episode summaries
3	Semantic	ragmcp (Qdrant / pgvector)	permanent	Long-term knowledge — queried by meaning

Measured, not claimed

`memorymcp` vs Mem0

Same model, same controls, losses shown as plainly as wins. Every number reproduces from a script with raw JSON in the repo.

Metric	memorymcp	Mem0
recall@1	0.775	0.500
recall@5	0.900	0.925
MRR@10	0.826	0.679
extraction cost / 8 facts	$0	$0.071

Recall is close and split — memorymcp ranks the target higher (recall@1, MRR), Mem0 catches a couple more in the tail (recall@5). The clear, defensible win is cost: deterministic extraction is free where Mem0's LLM extraction is ~$0.009/fact. (With the optional cross-encoder reranker on, memorymcp leads every recall metric.)

See the full benchmark →

Your agent's memory.Persistent. Intelligent.MCP-native.

Everything your agent needsto remember and learn