Python web-search library · MCP-native · Apache-2.0

Web search +
cited answers.
No API key.

Multi-engine search (SearXNG-first), cross-encoder reranking, recency/news filtering, clean extraction, and a bring-your-own-LLM answer layer. It matches a paid search API's cited answer — at $0 and no lock-in.

Get started View on GitHub

search.py

from websearchmcp import WebSearchFactory

pipe = WebSearchFactory.create(enable_rerank=True)

# Ranked results (cross-encoder + domain-authority prior)
results = await pipe.search("python 3.13 release notes", rerank=True)

# Search + cited answer (bring your own LLM)
res = await pipe.search_with_answer(
    "why was the feature deprecated",
    answer_fn=my_llm, fetch_content=True,
)
print(res.answer)   # "... [1][3]"

SearXNG-first Cross-encoder rerank

BYO-LLM answer

Features

The agent-facing surface of a
paid search API — for free

Everything is opt-in and additive. Search, fetch and browser stay untouched.

Multi-engine, SearXNG-first

SearXNG (your self-hosted instance, JSON API) as Priority 1, with DuckDuckGo, Mojeek and Brave as scraping fallbacks. Per-engine circuit breaker + rate limiter.

SearXNG DDG Brave

Cross-encoder reranking

Optional fastembed cross-encoder ($0, no torch) plus a small domain-authority prior — reference/official sources outrank SEO pages. No-op if not installed.

fastembed $0

Search + answer (BYO-LLM)

search_with_answer() returns ranked sources + a synthesized, cited answer via your own LLM callback. Matches a paid API's answer at zero search cost.

answer_fn cited

Recency & news

Restrict to fresh news results (recency=day/week/month) — essential for current events, which plain search buries under evergreen pages. (Needs a SearXNG backend.)

time_range news

Token-lean extraction

trafilatura main-content extraction + passage trimming: the LLM reads only query-relevant text, not whole pages. Cuts downstream tokens by ~35% with no answer loss.

trafilatura passage trim

JS pages via Playwright

browser_fetch renders SPA / JS-heavy pages with headless Chromium and stealth, with an HTTP/1.1 retry fallback for anti-bot sites. Optional screenshots.

Playwright screenshots

Four surfaces

Library, CLI, MCP server (web_search, web_answer, fetch_webpage, browser_fetch) and a FastAPI REST app — the same engine everywhere.

lib CLI MCP API

Search & fetch caching

In-memory TTL caches for both search results and fetched pages — repeated queries and revisited pages don't pay twice.

TTL cache

Measured, not claimed

`websearchmcp` vs Tavily

Same model, same controls, losses shown as plainly as wins. Every number reproduces from a script with raw JSON in the repo.

Metric	websearchmcp	Tavily
Answer correctness (6 factual Qs)	5 / 6	5 / 6
Authority@3 (raw results, 10 Qs)	3 / 10	6 / 10
Search cost	$0 / no key	paid (credits)
Latency / query	~22.13s (live)	~1.26s (index)

We lose the index contest, structurally — you can't out-crawl a crawler. But on the actual deliverable, the cited answer, it's a tie at $0 and no API key: the LLM extracts the fact from any decent fetched page, which is exactly what Tavily does too. The price of having no index is latency.

See the full benchmark →

Web search +cited answers.No API key.

The agent-facing surface of apaid search API — for free