
Design principles

Read this if you want the why behind everything else — the handful of decisions that explain why OMem stores Markdown instead of a database, runs in bursts instead of as a daemon, and refuses to let an LLM do the parsing.

These tenets are the load-bearing ideas: everything in the other concept pages is a consequence of one or more of them.

Two of these carry more weight than the rest, and are worth stating in prose.

Most “AI memory” tools keep your memory inside something you can’t see — a vector store, an opaque index. OMem makes the opposite bet, borrowed from Andrej Karpathy’s idea of an LLM-readable wiki: the core output is a human-readable Markdown wiki that is also the retrieval corpus. You read the same thing the agent reads.

That has a concrete payoff. The indexes are opinions layered on top — delete them all and they rebuild from the wiki. The wiki itself is generated from the immutable raw/ archive — delete the wiki and it rebuilds from raw/. Nothing is trapped: back it up, version-control it, hand-edit a page, point it at a different machine. Your work context stays yours, in a format that will still open in fifty years.
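The rebuild chain described above (raw/ to wiki, wiki to indexes) can be sketched in a few lines. This is a toy illustration, not OMem's actual code: the function names `build_wiki` and `build_index`, the `.txt` inputs, and the word-level index are all assumptions made up for the example. The only point it demonstrates is that every derived layer can be deleted and regenerated from the layer beneath it.

```python
import shutil
import tempfile
from pathlib import Path


def build_wiki(raw_dir: Path, wiki_dir: Path) -> None:
    """Regenerate every wiki page from the raw archive (raw/ is only read)."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(raw_dir.glob("*.txt")):
        body = src.read_text(encoding="utf-8")
        page = wiki_dir / f"{src.stem}.md"
        page.write_text(f"# {src.stem}\n\n{body}\n", encoding="utf-8")


def build_index(wiki_dir: Path) -> dict[str, list[str]]:
    """Derive a toy word -> pages index purely from the wiki pages."""
    index: dict[str, list[str]] = {}
    for page in sorted(wiki_dir.glob("*.md")):
        words = set(page.read_text(encoding="utf-8").lower().split())
        for word in sorted(words):
            index.setdefault(word, []).append(page.name)
    return index


def demo() -> bool:
    """Delete the derived layers, rebuild them, and compare the results."""
    with tempfile.TemporaryDirectory() as tmp:
        raw, wiki = Path(tmp) / "raw", Path(tmp) / "wiki"
        raw.mkdir()
        (raw / "kickoff.txt").write_text("budget approved", encoding="utf-8")
        (raw / "retro.txt").write_text("budget overrun", encoding="utf-8")

        build_wiki(raw, wiki)
        before = build_index(wiki)

        shutil.rmtree(wiki)      # throw away the wiki (and with it the index)
        build_wiki(raw, wiki)    # the wiki comes back from raw/
        after = build_index(wiki)  # the index comes back from the wiki
        return before == after
```

Calling `demo()` returns `True` because neither derived layer holds any state of its own. A real pipeline would presumably do far more in each step, but the dependency direction is the same.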

A fair question: frontier LLMs can read a PDF or a deck directly — so why does OMem first parse every file into a clean parsed.md, instead of just feeding the raw file to the model?

The biggest reason is cost, and it has two halves:

  • Parsing the file costs no LLM tokens. OMem extracts the file with real libraries (python-docx, pymupdf, …), not a model — turning a complex PPT/Excel/PDF into Markdown is essentially free.
  • The AI then reads Markdown, which is far cheaper than reading the raw file. Feeding a 30-slide deck or a fat spreadsheet to a model directly burns a fortune in tokens every time it’s queried; a tidy Markdown page is a fraction of that — and it gets queried again and again.
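To make the first bullet concrete, here is a minimal sketch of library-based extraction with python-docx and PyMuPDF, the two libraries named above. It is not OMem's parser: the function names, the `EXTRACTORS` table, and the "one heading per page" layout are assumptions for illustration, and it ignores tables, images, and everything else a production parser has to handle. No model is called anywhere, so no LLM tokens are spent.

```python
from pathlib import Path


def docx_to_markdown(path: Path) -> str:
    """Turn a .docx into Markdown using python-docx (no LLM involved)."""
    from docx import Document  # python-docx, imported lazily

    lines = []
    for para in Document(str(path)).paragraphs:
        text = para.text.strip()
        if not text:
            continue
        style = para.style.name if para.style is not None else ""
        if style.startswith("Heading"):
            # "Heading 2" becomes "## ..."; fall back to level 1 if unnumbered.
            last = style.split()[-1]
            depth = min(int(last), 6) if last.isdigit() else 1
            lines.append("#" * depth + " " + text)
        else:
            lines.append(text)
    return "\n\n".join(lines)


def pdf_to_markdown(path: Path) -> str:
    """Turn a text PDF into Markdown using PyMuPDF, one section per page."""
    try:
        import pymupdf  # newer PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # legacy import name for the same library

    with pymupdf.open(str(path)) as doc:
        pages = [
            f"## Page {number}\n\n{page.get_text().strip()}"
            for number, page in enumerate(doc, start=1)
        ]
    return "\n\n".join(pages)


# Hypothetical dispatch table: file suffix -> extractor function.
EXTRACTORS = {".docx": docx_to_markdown, ".pdf": pdf_to_markdown}


def parse_to_markdown(path) -> str:
    """Pick an extractor by file suffix; unknown formats are rejected."""
    path = Path(path)
    try:
        extractor = EXTRACTORS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"no extractor for {path.suffix!r}") from None
    return extractor(path)
```

The libraries are imported inside the functions so that a missing optional dependency only matters when that format is actually parsed.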

So the “just show the LLM the file” shortcut is the expensive option, not the easy one. OMem does the cheap thing: parse once, for free, into Markdown the AI can read cheaply forever after.
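The "parse once, read cheaply forever after" argument is simple arithmetic, sketched below. Every number here is a made-up placeholder chosen only to show the shape of the calculation; none of them is a measurement of OMem or of any model, and `total_tokens` is a hypothetical helper.

```python
def total_tokens(one_time_llm_tokens: int, tokens_per_query: int, queries: int) -> int:
    """LLM tokens spent on one document over its lifetime."""
    return one_time_llm_tokens + tokens_per_query * queries


# Illustrative placeholders only, NOT measured values.
RAW_TOKENS_PER_QUERY = 40_000  # assumed cost of feeding the raw deck each time
MD_TOKENS_PER_QUERY = 4_000    # assumed cost of reading the parsed Markdown
QUERIES = 50                   # assumed number of times the document is queried

# Feeding the raw file: nothing up front, full price on every query.
raw_total = total_tokens(0, RAW_TOKENS_PER_QUERY, QUERIES)

# Library parsing spends no LLM tokens up front, then each query is cheap.
md_total = total_tokens(0, MD_TOKENS_PER_QUERY, QUERIES)

print(f"raw file: {raw_total:,} tokens")   # raw file: 2,000,000 tokens
print(f"markdown: {md_total:,} tokens")    # markdown: 200,000 tokens
print(f"ratio:    {raw_total / md_total:.0f}x")  # ratio:    10x
```

Because the per-query cost is multiplied by the number of queries, the gap grows linearly with how often a document is read, which is why the repeated-query case dominates the comparison.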

And here’s the part that’s genuinely hard: doing that cheaply without losing quality. Office files hide their value in the details, and naive converters quietly drop the part of each format that matters most: a PowerPoint’s speaker notes and embedded charts, an Excel’s merged cells and in-sheet images, the EMF vector art in an old doc, a scanned bilingual page, the cid-embedded images in a mail thread. So OMem writes a purpose-built parser per format to recover exactly those edge cases:

  • PPT is split by slide, with speaker notes kept and embedded charts described by a vision model.
  • Excel is rendered per sheet, with images preserved.
  • Scans are run through OCR tuned for mixed Chinese/English.
  • EMF is converted to PNG before the vision model sees it.

Reading the images is part of parsing, not a bolt-on. It’s the least glamorous layer and the one with the most work in it, and that’s the achievement: the free, cheap path also reads your files accurately, so the AI answers accurately.
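As one example of what "split by slide, with speaker notes kept" could look like on the output side, here is a small rendering sketch. The `Slide` dataclass, its field names, and the exact Markdown layout are assumptions invented for this example, not OMem's real schema. It also assumes the slide contents and chart descriptions have already been extracted upstream.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    """Hypothetical container for what a deck parser recovered from one slide."""
    title: str
    body: list[str] = field(default_factory=list)
    notes: str = ""  # speaker notes, which naive converters often drop
    chart_descriptions: list[str] = field(default_factory=list)


def slides_to_markdown(slides: list[Slide]) -> str:
    """Render one Markdown section per slide, keeping notes and chart text."""
    sections = []
    for number, slide in enumerate(slides, start=1):
        lines = [f"## Slide {number}: {slide.title}", ""]
        lines += [f"- {item}" for item in slide.body]
        for description in slide.chart_descriptions:
            lines += ["", f"[Chart] {description}"]
        if slide.notes.strip():
            lines += ["", f"Speaker notes: {slide.notes.strip()}"]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


deck = [
    Slide(
        title="Roadmap",
        body=["Ship parser", "Ship indexes"],
        notes="Mention the hiring freeze.",
        chart_descriptions=["Bar chart of quarterly milestones."],
    ),
    Slide(title="Questions"),
]
print(slides_to_markdown(deck))
```

Keeping each slide as its own section means a later retrieval step can cite "Slide 1" rather than the whole deck, and the speaker notes travel with the slide they belong to.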

(A parser-llm plugin is reserved for v1.5, for people who’d trade this for whole-file layout understanding — an explicit choice, not the default.)

You’ve seen the why. The last piece is how an agent uses it: continue to “Querying from an agent” to see a real agent walk OMem’s layers to answer a question.