Design principles

Read this if you want the why behind everything else — the handful of decisions that explain why OMem stores Markdown instead of a database, runs in bursts instead of as a daemon, and refuses to let an LLM do the parsing.

The eight principles

These tenets are the load-bearing ideas: everything in the other concept pages is a consequence of one or more of them.

Two of these carry more weight than the rest, and are worth stating in prose.

Why the wiki is the truth

Most “AI memory” tools keep your memory inside something you can’t see — a vector store, an opaque index. OMem makes the opposite bet, borrowed from Andrej Karpathy’s idea of an LLM-readable wiki: the core output is a human-readable Markdown wiki that is also the retrieval corpus. You read the same thing the agent reads.

That has a concrete payoff. The indexes are opinions layered on top: delete them all and they rebuild from the wiki. The wiki itself is generated from the immutable raw/ archive: delete the wiki and it rebuilds from raw/. Nothing is trapped: you can back it up, version-control it, hand-edit a page, or move it to a different machine. Your work context stays yours, in a format that will still open in fifty years.
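The dependency direction can be made concrete with a small sketch. Everything in it is an assumption for illustration: `build_wiki`, `build_index`, and the one-page-per-.txt-file layout are hypothetical stand-ins, not OMem's actual code or directory format. It only shows the shape of the guarantee: each layer is a pure function of the layer beneath it, so deleting a derived layer loses nothing.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def build_wiki(raw_dir: Path, wiki_dir: Path) -> None:
    """Regenerate every wiki page from the raw archive (raw/ is only read)."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(raw_dir.glob("*.txt")):
        body = src.read_text(encoding="utf-8")
        page = wiki_dir / (src.stem + ".md")
        page.write_text(f"# {src.stem}\n\n{body}\n", encoding="utf-8")


def build_index(wiki_dir: Path) -> dict[str, list[str]]:
    """Derive a word -> page-names index purely from the wiki pages."""
    index: dict[str, set[str]] = {}
    for page in sorted(wiki_dir.glob("*.md")):
        for token in page.read_text(encoding="utf-8").lower().split():
            word = token.strip("#.,")
            if word:
                index.setdefault(word, set()).add(page.name)
    return {word: sorted(pages) for word, pages in index.items()}


def rebuild_is_lossless() -> bool:
    """Delete the wiki, rebuild it from raw/, and compare the derived index."""
    with tempfile.TemporaryDirectory() as tmp:
        raw, wiki = Path(tmp) / "raw", Path(tmp) / "wiki"
        raw.mkdir()
        (raw / "kickoff.txt").write_text("Budget approved for Q3.", encoding="utf-8")

        build_wiki(raw, wiki)
        index_before = build_index(wiki)

        shutil.rmtree(wiki)      # throw the whole wiki away
        build_wiki(raw, wiki)    # it comes back from raw/
        return build_index(wiki) == index_before


if __name__ == "__main__":
    print(rebuild_is_lossless())
```

The index here is never stored as a source of truth; it is recomputed from the pages, which is what makes it safe to delete.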

Why parse to Markdown first

A fair question: frontier LLMs can read a PDF or a deck directly — so why does OMem first parse every file into a clean parsed.md, instead of just feeding the raw file to the model?

The biggest reason is cost, and it has two halves:

Parsing the file costs no LLM tokens. OMem extracts the file's content with real libraries (python-docx, pymupdf, …), not a model, so turning a complex PPT/Excel/PDF into Markdown is essentially free.
The AI then reads Markdown, which is far cheaper than reading the raw file. Feeding a 30-slide deck or a fat spreadsheet to a model directly burns a fortune in tokens every time it’s queried; a tidy Markdown page is a fraction of that — and it gets queried again and again.
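To make the first half concrete, here is a minimal sketch of model-free extraction. It is not OMem's parser, and it deliberately does not use python-docx: to stay dependency-free it reads the .docx ZIP container directly with Python's standard library. The name `docx_to_markdown` is hypothetical, and the sketch assumes headings carry style IDs such as `Heading1` (the default in English Word templates). Tables, images, and lists are ignored.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by word/document.xml inside a .docx file.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def docx_to_markdown(path: str) -> str:
    """Turn top-level .docx paragraphs into Markdown with no model call."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    blocks = []
    # Only direct children of <w:body>; paragraphs nested in tables are skipped.
    for para in root.iterfind(".//w:body/w:p", NS):
        text = "".join(run.text or "" for run in para.iterfind(".//w:t", NS))
        if not text.strip():
            continue

        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val", "") if style is not None else ""

        # Assumption: heading paragraphs use style IDs "Heading1".."Heading9".
        if style_id.startswith("Heading") and style_id[7:].isdigit():
            level = min(int(style_id[7:]), 6)  # Markdown stops at six levels
            blocks.append("#" * level + " " + text)
        else:
            blocks.append(text)

    return "\n\n".join(blocks) + "\n"
```

The whole conversion is ZIP and XML handling, which is why this half of the cost is effectively zero in tokens.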

So the “just show the LLM the file” shortcut is the expensive option, not the easy one. OMem does the cheap thing: parse once, for free, into Markdown the AI can read cheaply forever after.
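The arithmetic behind that claim is simple enough to write down. The token counts and query count below are made-up placeholders, not measurements of OMem or of any model; substitute your own numbers. `total_tokens` is a hypothetical helper.

```python
def total_tokens(tokens_per_read: int, reads: int, one_time_tokens: int = 0) -> int:
    """LLM tokens spent on a document: one-time cost plus a cost per query."""
    return one_time_tokens + tokens_per_read * reads


# Placeholder figures, chosen only to illustrate the shape of the comparison.
RAW_FILE_TOKENS = 40_000   # hypothetical cost of showing the model the raw file
MARKDOWN_TOKENS = 4_000    # hypothetical cost of showing it the parsed page
QUERIES = 50               # how often the document is read over its lifetime

raw_total = total_tokens(RAW_FILE_TOKENS, QUERIES)
# Library-based parsing spends no LLM tokens, so the one-time cost stays 0.
markdown_total = total_tokens(MARKDOWN_TOKENS, QUERIES, one_time_tokens=0)

print(f"raw file every time: {raw_total:,} tokens")
print(f"parse once, read Markdown: {markdown_total:,} tokens")
```

Because the per-read cost is multiplied by the number of queries, the gap widens with every query, whatever the real figures turn out to be.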

And here’s the part that’s genuinely hard: doing that cheaply without losing quality. Office files hide their value in the details, and naive converters quietly drop the part of each format that matters most — a PowerPoint’s speaker notes and embedded charts, an Excel’s merged cells and in-sheet images, the EMF vector art in an old doc, a scanned bilingual page, the cid-embedded images in a mail thread. So OMem writes a purpose-built parser per format to recover exactly those edge cases: PPT split by slide with speaker notes kept and embedded charts described by a vision model; Excel rendered per sheet with images preserved; scans run through OCR tuned for mixed Chinese/English; EMF converted to PNG before the vision model sees it. Reading the images is part of parsing, not a bolt-on. It’s the least glamorous layer and the one with the most work in it — and that’s the achievement: the free, cheap path also reads your files accurately, so the AI answers accurately.
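As a rough illustration of "split by slide with speaker notes kept", the sketch below renders already-extracted slides to Markdown. It is an assumption-laden stand-in, not OMem's parser: the `Slide` dataclass, the function names, and the output layout are all hypothetical, and the chart descriptions are passed in as plain strings because the vision-model step is out of scope here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Slide:
    """One extracted slide (hypothetical shape, for illustration only)."""
    title: str
    bullets: list[str] = field(default_factory=list)
    notes: str = ""                      # speaker notes, often dropped by naive converters
    chart_descriptions: list[str] = field(default_factory=list)  # text from a vision step


def slide_to_markdown(number: int, slide: Slide) -> str:
    """Render one slide as its own Markdown section."""
    blocks = [f"## Slide {number}: {slide.title}"]
    if slide.bullets:
        blocks.append("\n".join(f"- {item}" for item in slide.bullets))
    for description in slide.chart_descriptions:
        blocks.append(f"*Chart:* {description}")
    if slide.notes:
        blocks.append(f"> Speaker notes: {slide.notes}")
    return "\n\n".join(blocks)


def deck_to_markdown(slides: list[Slide]) -> str:
    """Render a whole deck, one section per slide, in slide order."""
    sections = [slide_to_markdown(n, s) for n, s in enumerate(slides, start=1)]
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    deck = [
        Slide("Q3 plan", ["Hire two engineers"], notes="Mention the budget freeze."),
        Slide("Risks"),
    ]
    print(deck_to_markdown(deck))
```

Keeping each slide as its own section means the notes and chart text stay attached to the slide they belong to, rather than being flattened into one undifferentiated blob.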

(A parser-llm plugin is reserved for v1.5, for people who’d trade these cost savings for whole-file layout understanding. It is an explicit choice, not the default.)

What’s next

You’ve seen the why. The last piece is how an agent uses it: continue to Querying from an agent to see a real agent walk OMem’s layers to answer a question.