AI tools are only useful when they can see your context — your docs, your code, your notes, the Hacker News thread you read last week. The usual answer is a cloud RAG service: ship your data to someone's vector database, pay per query, hope it's still up. I wanted the opposite — a single binary that ingests my stuff, indexes it locally, and hands it to any AI tool over a standard protocol, with no cloud dependency. So I built Context Harness (ctx).
$ ctx --help
Context Harness provides a connector-driven pipeline for ingesting documents
from multiple sources (filesystem, Git repositories, S3 buckets), chunking and
embedding them, and exposing hybrid search (keyword + semantic) via a CLI and
MCP-compatible HTTP server.
It's a RAG engine that lives on your laptop. Here's the design.
The pipeline
The flow is the standard RAG shape, but every stage is local and configurable through one ctx.toml:
connectors → sync → chunk → embed → SQLite → hybrid search → CLI / MCP
Connectors define where documents come from — filesystem globs, Git repos, S3 buckets, or custom Lua scripts (more on those below). ctx sync pulls from a connector and runs the rest of the pipeline.
Chunking and embedding are configured, not hard-coded:
[chunking]
max_tokens = 700
overlap_tokens = 80
[embedding]
provider = "openai"
model = "text-embedding-3-small"
dims = 1536
The provider is pluggable — that line can point at OpenAI's text-embedding-3-small, or at a fully local model. My test setup runs fastembed with a quantized all-MiniLM-L6-v2 ONNX model, so embeddings happen on-device with no API call at all. That choice is the whole "local-first" thesis in one config key: trade a little retrieval quality for zero cloud dependency and zero per-query cost.
Storage is SQLite. No vector-database service to stand up, no Docker, no daemon. The index is a file you can copy, back up, or delete. For a single-user knowledge base, a managed vector DB is wildly over-provisioned; SQLite is exactly right.
Hybrid search
Pure semantic search misses exact terms; pure keyword search misses paraphrase. ctx does both and blends them:
[retrieval]
final_limit = 12
hybrid_alpha = 0.6 # weight toward semantic vs keyword
candidate_k_keyword = 80 # pull 80 keyword candidates
candidate_k_vector = 80 # and 80 vector candidates
group_by = "document" # then dedup/group by document
doc_agg = "max"
max_chunks_per_doc = 3
It gathers candidates from both a keyword index and a vector index, blends the scores with hybrid_alpha, groups by document so one long file can't flood the results, and returns the top 12. Tuning hybrid_alpha toward 0 or 1 lets you dial between "find the exact phrase" and "find the related idea."
The part that makes it an engine: the MCP server
A search CLI is handy. A search CLI that any AI tool can call as a tool is a force multiplier:
$ ctx serve mcp
# Exposes search/get over a JSON API for Cursor, Claude, and other
# MCP-compatible AI tools.
MCP is the protocol AI tools use to call external tools. By speaking it, ctx turns your local knowledge base into a tool the model can reach for mid-conversation: "search my notes for the retry-policy decision," and the model queries your SQLite index and gets grounded results — without your notes ever leaving the machine. That's the feature I use every day.
Extensibility: connectors, tools, and agents in Lua
Built-in connectors cover the common cases, but the interesting data is always somewhere weird. So ctx embeds Lua: you can script connectors (new data sources), tools (new capabilities), and agents (personas with a system prompt and a scoped toolset). Each has init/test scaffolding:
$ ctx connector init # scaffold a new Lua connector from a template
$ ctx connector test # run it without writing to the DB
$ ctx agent init/test/list
The agent system is the one I'm proudest of. An agent is a Lua script that, at resolve time, assembles its own context by querying the knowledge base, then hands the model a system prompt plus pre-loaded research and a scoped set of tools. Here's a real one — hn-writer, which writes Hacker News launch posts:
agent = {
name = "hn-writer",
description = "Write Hacker News posts by studying top HN content and your product docs",
tools = { "search", "get" }, -- scoped: this agent can only search and fetch
arguments = {
{ name = "style", description = "show_hn, launch, ask_hn, or discussion" },
{ name = "angle", description = "e.g. 'local-first', 'developer tooling'" },
{ name = "tone", description = "technical, conversational, or minimal" },
},
}
function agent.resolve(args, config, context)
-- pre-load HN trends from the knowledge base
for _, q in ipairs({ "Show HN", "Rust CLI tool", "local first", "AI context" }) do
local results = context.search(q, { mode = "keyword", limit = 5 })
-- ...fold the top results into the prompt as research...
end
-- ...also search the project's own docs, then return:
return { system = system_prompt, tools = { "search", "get" }, messages = preloaded }
end
What I love about this pattern: the agent does its own retrieval before the model gets involved, so the model starts with both "what performs well on HN right now" (from a connector that ingests HN) and "what this product actually does" (from a filesystem connector over the docs) already in context. And there's a pleasant recursion to it — I have an agent whose job is to write the Show HN post for the tool the agent runs on. Its prompt even encodes the house style: "What HN hates: marketing speak, buzzwords, superlatives… technical substance over marketing language." Which, not coincidentally, is the ethos of this whole blog.
Sharing extensions: registries
Lua scripts are shareable, so ctx supports registries — git repos of community connectors, tools, and agents that sync in automatically:
[registries.community]
url = "https://github.com/parallax-labs/ctx-registry.git"
auto_update = true
readonly = true
A registry is just a versioned directory of .lua files; pointing at one makes its connectors and agents available locally. It's the same "distribute capability declaratively" idea I use for agent skills, applied to data connectors.
Static-site search, for free
One more trick. ctx export dumps the whole index to JSON for client-side search:
$ ctx export
# Exports documents and chunks to JSON for use with ctx-search.js —
# client-side search on a static site.
Which means the same engine that grounds my AI tools could also power search on this blog — index the posts, export the JSON, search it in the browser with no backend. (My terminal theme already runs a database in the browser, so this is a natural next step.)
Honest trade-offs
Context Harness is young, and local-first is a set of trade-offs, not a free lunch:
- Local embeddings are private and free but lower-quality than the big cloud models.
providerlets you choose per use case, but you don't get both at once. - SQLite scales to a personal knowledge base, not a team's corpus. That's the design target, not a bug — but know the ceiling.
- Lua is power and rope. Scriptable connectors mean I can ingest anything; they also mean a bad script can do bad things.
connector test(which never writes to the DB) exists for exactly that reason.
But the core bet has paid off: a single Rust binary, a SQLite file, optional fully-local embeddings, and a standard protocol is enough to give every AI tool I use grounded access to my own context — without renting a vector database to do it.
— Parker Jones, parkerjones.dev