Context Harness: a Local-First RAG Engine in Rust with Lua Extensions and an MCP Server

2026-06-26

Parker Jones and Claude Opus 4.8 in software

#rust, #rag, #ai, #mcp, #lua, #embeddings, #sqlite, #local-first


AI tools are only useful when they can see your context — your docs, your code, your notes, the Hacker News thread you read last week. The usual answer is a cloud RAG service: ship your data to someone's vector database, pay per query, hope it's still up. I wanted the opposite — a single binary that ingests my stuff, indexes it locally, and hands it to any AI tool over a standard protocol, with no cloud dependency. So I built Context Harness (ctx).

$ ctx --help
Context Harness provides a connector-driven pipeline for ingesting documents
from multiple sources (filesystem, Git repositories, S3 buckets), chunking and
embedding them, and exposing hybrid search (keyword + semantic) via a CLI and
MCP-compatible HTTP server.

It's a RAG engine that lives on your laptop. Here's the design.

The pipeline

The flow is the standard RAG shape, but every stage is local and configurable through one ctx.toml:

connectors → sync → chunk → embed → SQLite → hybrid search → CLI / MCP

Connectors define where documents come from — filesystem globs, Git repos, S3 buckets, or custom Lua scripts (more on those below). ctx sync pulls from a connector and runs the rest of the pipeline.
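To make the connector idea concrete, here is a minimal Rust sketch of what "pull from a connector" could look like. The `Connector` trait, the `Document` fields, `StaticConnector`, and the `sync` function are all hypothetical names invented for illustration; they are not ctx's actual internals.

```rust
/// A document as a connector might emit it (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
struct Document {
    id: String,
    source: String,
    body: String,
}

/// Hypothetical connector interface: anything that can list documents.
trait Connector {
    fn name(&self) -> &str;
    fn scan(&self) -> Vec<Document>;
}

/// Toy connector backed by an in-memory list, standing in for a
/// filesystem glob, Git repo, S3 bucket, or Lua script.
struct StaticConnector {
    name: String,
    docs: Vec<(String, String)>, // (id, body)
}

impl Connector for StaticConnector {
    fn name(&self) -> &str {
        &self.name
    }

    fn scan(&self) -> Vec<Document> {
        self.docs
            .iter()
            .map(|(id, body)| Document {
                id: id.clone(),
                source: self.name.clone(),
                body: body.clone(),
            })
            .collect()
    }
}

/// Pull from every connector in order; later stages (chunk, embed, store)
/// would consume the returned documents.
fn sync(connectors: &[Box<dyn Connector>]) -> Vec<Document> {
    connectors.iter().flat_map(|c| c.scan()).collect()
}
```

The point of the shape is that everything downstream only sees documents, so a new source only has to implement the listing step.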

Chunking and embedding are configured, not hard-coded:

[chunking]
max_tokens = 700
overlap_tokens = 80

[embedding]
provider = "openai"
model = "text-embedding-3-small"
dims = 1536

The provider is pluggable: the provider key can point at OpenAI's text-embedding-3-small, or at a fully local model. My test setup runs fastembed with a quantized all-MiniLM-L6-v2 ONNX model (384-dimensional vectors, versus 1536 for text-embedding-3-small), so embeddings happen on-device with no API call at all. That choice is the whole "local-first" thesis in one config key: trade a little retrieval quality for zero cloud dependency and zero per-query cost.
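The [chunking] settings shown earlier describe a sliding window: each chunk holds at most max_tokens tokens, and consecutive chunks share overlap_tokens tokens. The sketch below shows that windowing under a loud simplification: it treats whitespace-separated words as tokens, whereas a real pipeline would count model tokens. The function names are illustrative, not ctx's.

```rust
/// Split a token sequence into overlapping windows.
/// `max_tokens` and `overlap_tokens` mirror the [chunking] keys.
fn chunk_tokens<'a>(
    tokens: &[&'a str],
    max_tokens: usize,
    overlap_tokens: usize,
) -> Vec<Vec<&'a str>> {
    assert!(max_tokens > 0, "max_tokens must be positive");
    assert!(overlap_tokens < max_tokens, "overlap must be smaller than the window");

    let step = max_tokens - overlap_tokens; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + max_tokens).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        if end == tokens.len() {
            break; // this window reached the end of the document
        }
        start += step;
    }
    chunks
}

/// Convenience wrapper: whitespace "tokenization" (an assumption of this
/// sketch), with each chunk rejoined into a string.
fn chunk_text(text: &str, max_tokens: usize, overlap_tokens: usize) -> Vec<String> {
    let tokens: Vec<&str> = text.split_whitespace().collect();
    chunk_tokens(&tokens, max_tokens, overlap_tokens)
        .into_iter()
        .map(|c| c.join(" "))
        .collect()
}
```

With max_tokens = 700 and overlap_tokens = 80 the window advances 620 tokens at a time, so a 1,000-token document yields two chunks that share 80 tokens at the seam.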

Storage is SQLite. No vector-database service to stand up, no Docker, no daemon. The index is a file you can copy, back up, or delete. For a single-user knowledge base, a managed vector DB is wildly over-provisioned; SQLite is exactly right.
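One way a single SQLite file can double as a vector store is to keep each embedding as a BLOB and score rows by brute force at query time, which is cheap at personal-knowledge-base scale. The post does not say how ctx lays out its vectors, so treat the little-endian f32 encoding, the `(id, blob)` row shape, and the function names below as assumptions made for this sketch.

```rust
/// Serialize an embedding as little-endian f32 bytes, a compact layout
/// that fits in a SQLite BLOB column.
fn vec_to_blob(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Decode a BLOB back into an embedding; returns None if the length is
/// not a multiple of 4 bytes.
fn blob_to_vec(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force top-k: score every stored (id, blob) row against the query
/// and keep the k best. Rows with undecodable blobs are skipped.
fn top_k(query: &[f32], rows: &[(i64, Vec<u8>)], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .filter_map(|(id, blob)| blob_to_vec(blob).map(|v| (*id, cosine(query, &v))))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1)); // highest similarity first
    scored.truncate(k);
    scored
}
```

A linear scan like this is O(rows × dims) per query, which is part of why the design targets one person's corpus rather than a team's.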

Hybrid search

Pure semantic search misses exact terms; pure keyword search misses paraphrase. ctx does both and blends them:

[retrieval]
final_limit = 12
hybrid_alpha = 0.6           # weight toward semantic vs keyword
candidate_k_keyword = 80     # pull 80 keyword candidates
candidate_k_vector  = 80     # and 80 vector candidates
group_by = "document"        # then dedup/group by document
doc_agg = "max"
max_chunks_per_doc = 3

It gathers up to 80 candidates each from a keyword index and a vector index, blends the scores with hybrid_alpha, groups by document (keeping at most 3 chunks per document and scoring each document by its best chunk) so one long file can't flood the results, and returns the top 12. Tuning hybrid_alpha toward 0 favors "find the exact phrase"; tuning it toward 1 favors "find the related idea."
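The blending step can be sketched in a few lines of Rust. The exact formula ctx uses is not given here, so this sketch assumes one common choice: min-max normalize each candidate list to [0, 1], then take a convex combination in which alpha weights the vector score and (1 - alpha) the keyword score. The `Candidate` struct and `hybrid_rank` function are illustrative names, and "max" aggregation is read as "a document scores as its best chunk".

```rust
use std::collections::HashMap;

/// One candidate chunk with a raw score from a single index.
#[derive(Debug, Clone)]
struct Candidate {
    doc_id: &'static str,
    chunk_id: u32,
    score: f32,
}

/// Min-max normalize scores to [0, 1] so keyword and vector scores are comparable.
fn normalize(cands: &[Candidate]) -> HashMap<(&'static str, u32), f32> {
    let min = cands.iter().map(|c| c.score).fold(f32::INFINITY, f32::min);
    let max = cands.iter().map(|c| c.score).fold(f32::NEG_INFINITY, f32::max);
    cands
        .iter()
        .map(|c| {
            let s = if max > min { (c.score - min) / (max - min) } else { 1.0 };
            ((c.doc_id, c.chunk_id), s)
        })
        .collect()
}

/// Blend per chunk, group by document, and return (doc, score, chunk ids).
fn hybrid_rank(
    keyword: &[Candidate],
    vector: &[Candidate],
    alpha: f32,
    max_chunks_per_doc: usize,
    final_limit: usize,
) -> Vec<(&'static str, f32, Vec<u32>)> {
    let kw = normalize(keyword);
    let sem = normalize(vector);

    // A chunk missing from one list contributes 0 from that side.
    let mut blended: HashMap<(&'static str, u32), f32> = HashMap::new();
    for (key, s) in &kw {
        *blended.entry(*key).or_insert(0.0) += (1.0 - alpha) * s;
    }
    for (key, s) in &sem {
        *blended.entry(*key).or_insert(0.0) += alpha * s;
    }

    // Group chunks under their document.
    let mut by_doc: HashMap<&'static str, Vec<(u32, f32)>> = HashMap::new();
    for ((doc, chunk), s) in blended {
        by_doc.entry(doc).or_default().push((chunk, s));
    }

    let mut docs: Vec<(&'static str, f32, Vec<u32>)> = by_doc
        .into_iter()
        .map(|(doc, mut chunks)| {
            chunks.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
            let doc_score = chunks[0].1; // "max" aggregation: best chunk wins
            chunks.truncate(max_chunks_per_doc);
            (doc, doc_score, chunks.into_iter().map(|(id, _)| id).collect())
        })
        .collect();

    // Best document first; ties broken by document id for determinism.
    docs.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(b.0)));
    docs.truncate(final_limit);
    docs
}
```

Grouping after blending is what keeps a long file with many mediocre chunks from crowding out a short file with one excellent chunk.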

The part that makes it an engine: the MCP server

A search CLI is handy. A search CLI that any AI tool can call as a tool is a force multiplier:

$ ctx serve mcp
# Exposes search/get over a JSON API for Cursor, Claude, and other
# MCP-compatible AI tools.

MCP (Model Context Protocol) is an open protocol that AI applications use to call external tools. By speaking it, ctx turns your local knowledge base into a tool the model can reach for mid-conversation: ask it to "search my notes for the retry-policy decision," and the model queries your SQLite index and gets grounded results. The index itself never leaves the machine; only the retrieved snippets are handed to the model. That's the feature I use every day.
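For a sense of what travels over the wire: in the MCP specification, a tool invocation is a JSON-RPC 2.0 request with the method tools/call, carrying the tool name and its arguments. The Rust sketch below builds such a request by hand. The tool name "search" comes from the help text above; the argument names "query" and "limit" are assumptions, as is whether ctx's endpoint accepts exactly this shape.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request, the envelope MCP uses for
/// tool invocations. The "query" and "limit" argument names are hypothetical.
fn tools_call_request(id: u64, tool: &str, query: &str, limit: u32) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"query":"{}","limit":{}}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(query),
        limit
    )
}
```

In practice the AI client constructs this message itself; the sketch is only meant to show how little ceremony sits between the model and the index.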

Extensibility: connectors, tools, and agents in Lua

Built-in connectors cover the common cases, but the interesting data is always somewhere weird. So ctx embeds Lua: you can script connectors (new data sources), tools (new capabilities), and agents (personas with a system prompt and a scoped toolset). Each has init/test scaffolding:

$ ctx connector init    # scaffold a new Lua connector from a template
$ ctx connector test    # run it without writing to the DB
$ ctx agent init
$ ctx agent test
$ ctx agent list

The agent system is the one I'm proudest of. An agent is a Lua script that, at resolve time, assembles its own context by querying the knowledge base, then hands the model a system prompt plus pre-loaded research and a scoped set of tools. Here's a real one — hn-writer, which writes Hacker News launch posts:

agent = {
    name = "hn-writer",
    description = "Write Hacker News posts by studying top HN content and your product docs",
    tools = { "search", "get" },          -- scoped: this agent can only search and fetch
    arguments = {
        { name = "style", description = "show_hn, launch, ask_hn, or discussion" },
        { name = "angle", description = "e.g. 'local-first', 'developer tooling'" },
        { name = "tone",  description = "technical, conversational, or minimal" },
    },
}

function agent.resolve(args, config, context)
    -- pre-load HN trends from the knowledge base
    for _, q in ipairs({ "Show HN", "Rust CLI tool", "local first", "AI context" }) do
        local results = context.search(q, { mode = "keyword", limit = 5 })
        -- ...fold the top results into the prompt as research...
    end
    -- ...also search the project's own docs, then return:
    return { system = system_prompt, tools = { "search", "get" }, messages = preloaded }
end

What I love about this pattern: the agent does its own retrieval before the model gets involved, so the model starts with both "what performs well on HN right now" (from a connector that ingests HN) and "what this product actually does" (from a filesystem connector over the docs) already in context. And there's a pleasant recursion to it — I have an agent whose job is to write the Show HN post for the tool the agent runs on. Its prompt even encodes the house style: "What HN hates: marketing speak, buzzwords, superlatives… technical substance over marketing language." Which, not coincidentally, is the ethos of this whole blog.
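The "scoped toolset" half of that pattern is easy to picture on the host side: whatever resolve returns becomes an allow-list, and any tool call outside it is refused. The Rust sketch below is a guess at that shape, not ctx's implementation; `ResolvedAgent`, `prompt`, and `authorize` are hypothetical names, and messages are simplified to plain strings.

```rust
use std::collections::HashSet;

/// What an agent's resolve step hands over: a system prompt, pre-loaded
/// research messages, and the set of tools the model may call.
struct ResolvedAgent {
    system: String,
    messages: Vec<String>,
    tools: HashSet<String>,
}

impl ResolvedAgent {
    fn new(system: &str, messages: Vec<String>, tools: &[&str]) -> Self {
        ResolvedAgent {
            system: system.to_string(),
            messages,
            tools: tools.iter().map(|t| t.to_string()).collect(),
        }
    }

    /// Initial context for the model: system prompt first, then the research.
    fn prompt(&self) -> String {
        let mut parts = vec![self.system.clone()];
        parts.extend(self.messages.iter().cloned());
        parts.join("\n")
    }

    /// Reject any tool call outside the agent's declared scope.
    fn authorize(&self, tool: &str) -> Result<(), String> {
        if self.tools.contains(tool) {
            Ok(())
        } else {
            Err(format!("tool '{}' is not in this agent's scope", tool))
        }
    }
}
```

For hn-writer, `authorize("search")` and `authorize("get")` would pass while anything else returns an error, which is the property the `tools = { "search", "get" }` line is declaring.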

Lua scripts are shareable, so ctx supports registries — git repos of community connectors, tools, and agents that sync in automatically:

[registries.community]
url = "https://github.com/parallax-labs/ctx-registry.git"
auto_update = true
readonly = true

A registry is just a versioned directory of .lua files; pointing at one makes its connectors and agents available locally. It's the same "distribute capability declaratively" idea I use for agent skills, applied to data connectors.

Static-site search, for free

One more trick. ctx export dumps the whole index to JSON for client-side search:

$ ctx export
# Exports documents and chunks to JSON for use with ctx-search.js —
# client-side search on a static site.

Which means the same engine that grounds my AI tools could also power search on this blog — index the posts, export the JSON, search it in the browser with no backend. (My terminal theme already runs a database in the browser, so this is a natural next step.)

Honest trade-offs

Context Harness is young, and local-first is a set of trade-offs, not a free lunch:

Local embeddings are private and free, but a small model like all-MiniLM-L6-v2 generally retrieves less accurately than the large hosted embedding models. provider lets you choose per use case, but no single setting gives you both.
SQLite scales to a personal knowledge base, not a team's corpus. That's the design target, not a bug — but know the ceiling.
Lua is power and rope. Scriptable connectors mean I can ingest anything; they also mean a bad script can do bad things. connector test (which never writes to the DB) exists for exactly that reason.

But the core bet has paid off: a single Rust binary, a SQLite file, optional fully-local embeddings, and a standard protocol is enough to give every AI tool I use grounded access to my own context — without renting a vector database to do it.

Context Harness is open source (AGPL-3.0) at parallax-labs/context-harness; docs and prebuilt binaries for macOS and Linux are at parallax-labs.github.io/context-harness.

— Parker Jones, parkerjones.dev