The plugin architecture

Read this if you want to know how extensible OMem really is — the types of plugin it has, which plugins ship under each, and where the honest boundaries of v1.0 are.

OMem was built platform-first: every stage of the ingest pipeline plugs into an interface defined from day one, each with at least one real, shipping implementation — no “TBD” stubs. The point isn’t that you must extend it; it’s that the architecture doesn’t trap you. You can add a source, swap an index, or (later) replace a parser without rebuilding the rest.

Four extension points — and the plugins under each

There are four extension-point types. Each is listed below with the concrete plugins OMem ships under it:

extension point · source

Sources decide where your work comes from.

A Source reads the items to ingest. Each kind of work has one — and adding a new origin later (Slack, Jira, …) is just one more Source, leaving the rest of OMem untouched.

  • local-files (v1.0). Any folder you point it at: OneDrive, Box, Dropbox, iCloud, Downloads.
  • mail-app (v1.0). Apple Mail’s local store; threads aggregated into one page each.
  • calendar-app (v1.0). Apple Calendar; Exchange, iCloud, and CalDAV all flow through it.
  • loop-resolver (v1.0). Microsoft Loop / Fluid meeting notes, fetched from SharePoint.
  • ics-file (v1.0). Standalone .ics calendar files.
  • outlook-classic / -web / -applescript (v1.5+). Other ways into Outlook mail and calendar.
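
To make the "one more Source" idea concrete, here is a minimal sketch in Python. The names `Item`, `Source`, `items()`, `LocalFilesSourceSketch`, `InMemorySource`, and `ingest` are all hypothetical, chosen for illustration; this page does not show OMem's actual interface, so treat this as one plausible shape rather than the real contract.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterator, List, Protocol


@dataclass(frozen=True)
class Item:
    """One unit of work to ingest (hypothetical shape)."""
    source: str     # name of the Source that produced the item
    uri: str        # where the item came from
    payload: bytes  # raw content, to be handed to a parser later


class Source(Protocol):
    """Hypothetical Source contract: a name plus an iterator of items."""
    name: str

    def items(self) -> Iterator[Item]: ...


class LocalFilesSourceSketch:
    """Illustrative stand-in for a folder-based source."""
    name = "local-files"

    def __init__(self, root: Path) -> None:
        self.root = root

    def items(self) -> Iterator[Item]:
        # Walk the folder in a stable order and yield every regular file.
        for path in sorted(self.root.rglob("*")):
            if path.is_file():
                yield Item(self.name, str(path), path.read_bytes())


class InMemorySource:
    """A new origin added later; no other class had to change."""
    name = "in-memory"

    def __init__(self, docs: Dict[str, bytes]) -> None:
        self.docs = docs

    def items(self) -> Iterator[Item]:
        for key, payload in self.docs.items():
            yield Item(self.name, f"mem://{key}", payload)


def ingest(sources: List[Source]) -> List[Item]:
    """Downstream code depends only on the Source interface."""
    return [item for src in sources for item in src.items()]
```

In this sketch, `ingest` never names a concrete class, which is the property the text describes: a new origin is a new class, and the rest stays as it was.
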
extension point · parser

A parser per format — so nothing your office uses is left out.

Each format gets a dedicated parser that turns it into clean Markdown, keeping the parts generic converters drop. This breadth is the quiet thing that sets OMem apart.

  • docx (v1.0). Word: headings, lists, tables, embedded images, plus tracked changes and reviewer comments.
  • pptx (v1.0). PowerPoint: slide-by-slide, speaker notes, embedded charts.
  • xlsx (v1.0). Excel: sheets as Markdown tables, embedded images kept.
  • pdf (v1.0). PDF: layout-aware for digital, OCR for scanned.
  • eml / msg (v1.0). Email files: headers, body, and attachments.
  • html (v1.0). Web/email HTML → clean Markdown.
  • ics (v1.0). Calendar event data.
  • image (v1.0). PNG/JPEG/HEIC/… described via OCR + a vision model.
  • plain-text / markdown (v1.0). Passed through, structure preserved.
  • parser-llm (v1.5+). Whole-file LLM parsing, for layout over reproducibility.
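
One common way to get "a parser per format" is a registry keyed by file extension. The sketch below assumes that shape; `register`, `to_markdown`, and the two toy parsers are hypothetical names, and the ICS handling is deliberately simplistic rather than a picture of what OMem's ics parser does.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical parser signature: raw bytes in, Markdown out.
Parser = Callable[[bytes], str]

_PARSERS: Dict[str, Parser] = {}


def register(*extensions: str) -> Callable[[Parser], Parser]:
    """Decorator that maps one or more file extensions to a parser."""
    def decorator(fn: Parser) -> Parser:
        for ext in extensions:
            _PARSERS[ext.lower()] = fn
        return fn
    return decorator


@register(".txt", ".md")
def parse_plain_text(data: bytes) -> str:
    # Plain text and Markdown pass through unchanged.
    return data.decode("utf-8")


@register(".ics")
def parse_ics(data: bytes) -> str:
    # Toy example: list each SUMMARY line as a Markdown bullet.
    lines = data.decode("utf-8").splitlines()
    summaries = [ln.split(":", 1)[1] for ln in lines if ln.startswith("SUMMARY:")]
    return "\n".join(f"- {s}" for s in summaries)


def to_markdown(filename: str, data: bytes) -> str:
    """Dispatch to the parser registered for the file's extension."""
    ext = Path(filename).suffix.lower()
    try:
        parser = _PARSERS[ext]
    except KeyError:
        raise ValueError(f"no parser registered for {ext!r}") from None
    return parser(data)
```

Under this design, supporting another format means registering one more function; `to_markdown` and its callers do not change.
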
extension point · index

The index is how a query finds a page — and it’s swappable.

The retrieval layer is an opinion laid over the wiki, never the source of truth. Switch it with one command; only the index rebuilds, your wiki is untouched.

  • fts5 (default). SQLite full-text search + jieba Chinese segmentation. Fast, zero setup.
  • qmd (optional). Hybrid BM25 + local vector embeddings + reranker, for semantic & cross-language search.
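
The claim that "only the index rebuilds" can be sketched as follows. Everything here is hypothetical: `Index`, `Wiki`, and `switch_index` are illustrative names, and `TokenIndex` and `SubstringIndex` are toy stand-ins, not implementations of fts5 or qmd. The point is only the boundary: the index is derived from the pages and can be thrown away.

```python
from typing import Dict, List, Protocol, Set


class Index(Protocol):
    """Hypothetical index contract: rebuild from pages, then search."""
    def rebuild(self, pages: Dict[str, str]) -> None: ...
    def search(self, query: str) -> List[str]: ...


class TokenIndex:
    """Toy keyword index: exact, case-insensitive token match."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = {}

    def rebuild(self, pages: Dict[str, str]) -> None:
        self._postings = {}
        for page_id, text in pages.items():
            for token in text.lower().split():
                self._postings.setdefault(token, set()).add(page_id)

    def search(self, query: str) -> List[str]:
        return sorted(self._postings.get(query.lower(), set()))


class SubstringIndex:
    """Toy looser index: case-insensitive substring match."""

    def __init__(self) -> None:
        self._pages: Dict[str, str] = {}

    def rebuild(self, pages: Dict[str, str]) -> None:
        self._pages = {k: v.lower() for k, v in pages.items()}

    def search(self, query: str) -> List[str]:
        q = query.lower()
        return sorted(k for k, v in self._pages.items() if q in v)


class Wiki:
    """Pages are the source of truth; the index is derived from them."""

    def __init__(self, pages: Dict[str, str], index: Index) -> None:
        self.pages = pages
        self.index = index
        self.index.rebuild(self.pages)

    def switch_index(self, new_index: Index) -> None:
        new_index.rebuild(self.pages)  # only the index is rebuilt
        self.index = new_index         # the pages are never modified
```

A short usage example: with pages `{"a": "Quarterly planning notes", "b": "Plan for launch"}`, a search for `plan` matches only `b` under `TokenIndex` and both pages after `switch_index(SubstringIndex())`, while `wiki.pages` stays identical.
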
extension point · wikistore

Where the pages live — defined as a seam, reserved for later.

Be precise: the interface exists and is clean, but v1.0 ships a single implementation with no swap mechanism. It’s an extension point on paper — three of the four are things you can act on today.

  • DiskWikiStore (reserved). Markdown files on disk + SQLite metadata. The one v1.0 implementation.
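
For readers wondering what such a seam might look like, here is a rough sketch of a store that keeps Markdown files on disk with metadata in SQLite. The `WikiStore` protocol, its `put`/`get` methods, the `meta.sqlite` filename, and the table layout are all assumptions made for illustration; the real DiskWikiStore interface is not shown on this page.

```python
import sqlite3
from pathlib import Path
from typing import Protocol


class WikiStore(Protocol):
    """Hypothetical storage seam: write a page, read it back."""
    def put(self, page_id: str, markdown: str) -> None: ...
    def get(self, page_id: str) -> str: ...


class DiskWikiStoreSketch:
    """Illustrative only: one .md file per page plus a SQLite metadata row."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        # Assumed layout: a single metadata database next to the pages.
        self.db = sqlite3.connect(str(self.root / "meta.sqlite"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pages (id TEXT PRIMARY KEY, bytes INTEGER)"
        )

    def put(self, page_id: str, markdown: str) -> None:
        # The Markdown file is the content; SQLite only records metadata.
        (self.root / f"{page_id}.md").write_text(markdown, encoding="utf-8")
        self.db.execute(
            "INSERT OR REPLACE INTO pages (id, bytes) VALUES (?, ?)",
            (page_id, len(markdown.encode("utf-8"))),
        )
        self.db.commit()

    def get(self, page_id: str) -> str:
        return (self.root / f"{page_id}.md").read_text(encoding="utf-8")

    def close(self) -> None:
        self.db.close()
```

Because callers in this sketch would depend only on `put` and `get`, a second implementation could be added later without touching them, which is what "defined as a seam" implies.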

Be precise about the count, because it’s easy to oversell: three of these are pluggable today (Source, Parser, Index), and one is architecturally defined but reserved (WikiStore). Calling it “four extension points” is true on paper; in practice, three are things you can act on now.

  • New sources without core changes. A file / mail / calendar / loop source is just an implementation of the Source interface. Adding Slack, Jira, or Linear later is a new Source — the parser, index, and wiki layers don’t move.
  • The retrieval layer is a choice, not a lock-in. v1.0 ships fts5 (SQLite FTS5 + jieba Chinese segmentation) as a genuinely good default. Want hybrid vector search? omem plugin enable qmd swaps in a BM25 + local-embedding + reranker stack. It replaces fts5 rather than merging — a clean boundary — and only the index rebuilds; your wiki is untouched.
  • The parser chain can evolve. Today it’s deterministic (no LLM). The interface leaves room for a future parser-llm to plug in for particular file kinds, for people who want layout understanding over archival reproducibility, without changing the contract the rest of the pipeline relies on.

There’s a fifth kind of extensibility that isn’t in the diagram, because it lives one level up: the omem CLI is the authoritative interface, and every agent integration is a thin wrapper over it. The OMem skill is about 30 lines of shell that call omem query; the MCP server is a similarly thin wrapper for any MCP client. The agent layer is volatile (last year’s hot agent isn’t next year’s), so OMem’s bet is that the memory layer underneath shouldn’t be bound to today’s agent. Querying from an agent covers that side.
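
A thin wrapper of this kind can be very small in any language. The Python sketch below assumes that `omem query` accepts the query text as a single positional argument and prints its answer to standard output; the page confirms the subcommand but not its argument format, so check the CLI's own help before relying on this. `build_query_command` and `query` are hypothetical helper names.

```python
import subprocess
from typing import List


def build_query_command(question: str, executable: str = "omem") -> List[str]:
    """Build the argv for a query.

    Assumption: `omem query` takes the query text as one positional argument.
    """
    return [executable, "query", question]


def query(question: str, executable: str = "omem") -> str:
    """Run the CLI and return whatever it writes to stdout."""
    result = subprocess.run(
        build_query_command(question, executable),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems, and keeping all logic in the CLI means a wrapper like this has nothing to keep in sync when agents change.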

Because the wiki is plain Markdown files on your disk and every format is open, the whole thing is portable on top of being pluggable: back it up, version-control it, hand-edit a page, move it to a different machine. The extension points keep the software open; the Markdown-on-disk output keeps your data open. Andrej Karpathy has written about LLM-readable wikis as a primitive of the AI-native stack; design principle P3 is where that idea becomes concrete.

Continue to design principles — the eight ideas that explain why the architecture is shaped this way, including why the wiki is the truth and why the parser deliberately isn’t an LLM.