Python web-search library · MCP-native · AGPL-3.0

Web search + cited answers. No API key.

Multi-engine search (SearXNG-first), cross-encoder reranking, recency/news filtering, clean extraction, and a bring-your-own-LLM answer layer. On a 6-question factual benchmark its cited answers matched a paid search API's (5 / 6 each), at $0 search cost and with no lock-in.

search.py
import asyncio

from websearchmcp import WebSearchFactory


async def main():
    pipe = WebSearchFactory.create(enable_rerank=True)

    # Ranked results (cross-encoder + domain-authority prior)
    results = await pipe.search("python 3.13 release notes", rerank=True)

    # Search + cited answer; my_llm is your own LLM callback, defined elsewhere
    res = await pipe.search_with_answer(
        "why was the feature deprecated",
        answer_fn=my_llm, fetch_content=True,
    )
    print(res.answer)   # "... [1][3]"


asyncio.run(main())
SearXNG-first · Cross-encoder rerank · BYO-LLM answer
SearXNG + 3 engines · Cross-encoder rerank · Recency / news · 4 MCP tools · AGPL-3.0 · Python 3.11+

The agent-facing surface of a paid search API, for free

Everything is opt-in and additive. Search, fetch and browser stay untouched.

Multi-engine, SearXNG-first

SearXNG (your self-hosted instance, JSON API) as Priority 1, with DuckDuckGo, Mojeek and Brave as scraping fallbacks. Per-engine circuit breaker + rate limiter.

SearXNG · DDG · Brave
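To make the priority-plus-circuit-breaker idea concrete, here is a minimal sketch of how engines might be tried in order while skipping ones that keep failing. It is not websearchmcp's implementation: `CircuitBreaker`, `search_with_fallback`, the thresholds and the engine callables are all hypothetical, and the rate limiter is left out.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a retry after `cooldown` seconds."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: let an attempt through once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()


def search_with_fallback(query, engines, breakers):
    """Try (name, callable) engines in priority order; skip any whose breaker is open."""
    for name, fn in engines:
        breaker = breakers[name]
        if not breaker.allow():
            continue
        try:
            results = fn(query)
        except Exception:
            breaker.record(ok=False)
            continue
        breaker.record(ok=True)
        if results:
            return name, results
    return None, []
```

With SearXNG listed first, a failing instance costs a couple of errors and is then skipped until its cooldown passes, so the scraping fallbacks answer in the meantime.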

Cross-encoder reranking

Optional fastembed cross-encoder ($0, no torch) plus a small domain-authority prior — reference/official sources outrank SEO pages. No-op if not installed.

fastembed · $0
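One way a domain-authority prior could combine with relevance scores is as a small additive bonus. The sketch below assumes the relevance scores have already been produced by some cross-encoder; the `AUTHORITY_PRIOR` table, its values and both function names are made up for illustration and are not taken from the library.

```python
from urllib.parse import urlparse

# Hypothetical prior: small bonuses for reference/official hosts.
AUTHORITY_PRIOR = {"docs.python.org": 0.15, "wikipedia.org": 0.10}


def authority_bonus(url: str) -> float:
    """Return the bonus for a URL whose host is a listed domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    for domain, bonus in AUTHORITY_PRIOR.items():
        if host == domain or host.endswith("." + domain):
            return bonus
    return 0.0


def rerank(results: list[dict], relevance: list[float]) -> list[dict]:
    """Sort results by relevance score plus domain bonus, highest first."""
    scored = [
        (score + authority_bonus(r["url"]), r)
        for r, score in zip(results, relevance)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]
```

Keeping the bonus small means it only breaks near-ties: a clearly more relevant page still wins over an official one that barely matches the query.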

Search + answer (BYO-LLM)

search_with_answer() returns ranked sources + a synthesized, cited answer via your own LLM callback. Matches a paid API's answer at zero search cost.

answer_fn · cited
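The exact signature `answer_fn` must have is not stated here, so the sketch below only covers the parts around it: numbering sources into a prompt your own LLM could answer from, and reading `[n]` markers back out of the reply. `build_cited_prompt`, `cited_indices` and the `title` / `url` / `text` keys are hypothetical names chosen for illustration.

```python
import re


def build_cited_prompt(question: str, sources: list[dict]) -> str:
    """Number each source so the model can cite it inline as [1], [2], ..."""
    numbered = "\n".join(
        f"[{i}] {s['title']} ({s['url']})\n{s['text']}"
        for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources inline as [n].\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}"
    )


def cited_indices(answer: str) -> list[int]:
    """Return the distinct source numbers cited in an answer, in ascending order."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
```

Checking the cited indices against the number of sources supplied is a cheap guard against an answer that cites a source it was never given.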

Recency & news

Restrict to fresh news results (recency=day/week/month) — essential for current events, which plain search buries under evergreen pages. (Needs a SearXNG backend.)

time_range · news
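For reference, a recency-restricted news query against a SearXNG instance's JSON API might be built roughly as follows. This is a sketch under assumptions: the helper name and the direct mapping from `recency` to SearXNG's `time_range` parameter are guesses at how the library does it, the instance must have the `json` output format enabled in its settings, and it is worth checking which `time_range` values your instance accepts.

```python
from urllib.parse import urlencode


def searxng_news_url(base_url: str, query: str, recency: str = "week") -> str:
    """Build a SearXNG /search URL limited to the news category and a time range."""
    if recency not in {"day", "week", "month"}:
        raise ValueError(f"unsupported recency: {recency!r}")
    params = {
        "q": query,
        "format": "json",       # needs the json format enabled on the instance
        "categories": "news",
        "time_range": recency,
    }
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"
```

The function only builds the URL; fetching it and parsing the response is left to whatever HTTP client you already use.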

Token-lean extraction

trafilatura main-content extraction + passage trimming: the LLM reads only query-relevant text, not whole pages. Cuts downstream tokens by ~35% with no answer loss.

trafilatura · passage trim
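Passage trimming can be approximated with plain term overlap. The sketch below assumes the main content has already been extracted to text and keeps only the paragraphs sharing the most words with the query; the scoring rule and the `trim_passages` name are illustrative, and the library may well score passages differently.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def trim_passages(text: str, query: str, keep: int = 2) -> str:
    """Keep up to `keep` paragraphs with the highest query-term overlap, in page order."""
    query_terms = tokenize(query)
    passages = [p.strip() for p in text.split("\n\n") if p.strip()]
    scored = [(len(query_terms & tokenize(p)), i) for i, p in enumerate(passages)]
    # Highest overlap first; ties go to the earlier paragraph.
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:keep]
    keep_idx = sorted(i for score, i in top if score > 0)
    return "\n\n".join(passages[i] for i in keep_idx)
```

Paragraphs with no overlap at all are dropped even when fewer than `keep` remain, so boilerplate never pads the context.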

JS pages via Playwright

browser_fetch renders SPA / JS-heavy pages with headless Chromium and stealth, with an HTTP/1.1 retry fallback for anti-bot sites. Optional screenshots.

Playwright · screenshots

Four surfaces

Library, CLI, MCP server (web_search, web_answer, fetch_webpage, browser_fetch) and a FastAPI REST app — the same engine everywhere.

lib · CLI · MCP · API

Search & fetch caching

In-memory TTL caches for both search results and fetched pages — repeated queries and revisited pages don't pay twice.

TTL cache
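An in-memory TTL cache of the kind described can be very small. The sketch below is a generic illustration, not the library's cache: the `TTLCache` class and its injectable `clock` parameter are assumptions made so that expiry is easy to test.

```python
import time


class TTLCache:
    """Dictionary-backed cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired: evict lazily on read.
            del self._store[key]
            return None
        return value

    def set(self, key, value) -> None:
        self._store[key] = (self.clock() + self.ttl, value)
```

Keying one instance on the query string and another on the page URL would give the two caches mentioned above; a repeated query inside the TTL then returns without touching any engine.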

websearchmcp vs Tavily

Same model, same controls, losses shown as plainly as wins. Every number reproduces from a script with raw JSON in the repo.

Metric                              | websearchmcp    | Tavily
Answer correctness (6 factual Qs)   | 5 / 6           | 5 / 6
Authority@3 (raw results, 10 Qs)    | 3 / 10          | 6 / 10
Search cost                         | $0 / no key     | paid (credits)
Latency / query                     | ~22.13s (live)  | ~1.26s (index)

We lose the index contest, structurally — you can't out-crawl a crawler. But on the actual deliverable, the cited answer, it's a tie at $0 and no API key: the LLM extracts the fact from any decent fetched page, which is exactly what Tavily does too. The price of having no index is latency.

See the full benchmark →

Give your agent the web without the API bill.
