Kai Kang · Staff Software Engineer @ Meta · Solo App Builder
I've been thinking about second brain apps for a long time — both as a user frustrated by existing tools, and as an engineer who wants to build one.
After months of experimenting (building prototypes for personal use and for teams at work), I've landed on a framework for thinking about this problem. A second brain has exactly four layers, and most apps get the priorities wrong.
The Four Layers
1. Capture → get thoughts in, fast
2. Storage → where the data lives
3. Rendering → how it looks
4. Retrieval → how you get it back
Most note-taking apps obsess over layers 2 and 3 (storage and rendering). Notion has beautiful databases. Obsidian has gorgeous graph views. But these are the wrong layers to optimize first.
I've used both extensively, and here's where they fall short:
Notion is too polished and too bulky. It tries to be everything — wiki, database, project manager, docs — and the result is a bloated app that's slow to open and slow to capture a quick thought. I recently discovered Notion was eating 100GB+ of disk space on my machine, and I had fewer than 20 notes stored. Clearly a bug, but it tells you something about the engineering priorities when your note-taking app is heavier than most video games.
Obsidian has the opposite problem: too much customizability. It's a playground for power users and plugin enthusiasts, but that's exactly who it serves — geeks tweaking their setup instead of capturing thoughts. The plugin ecosystem is impressive but overwhelming. Somewhere in the pursuit of infinite configurability, simplicity got lost.
Both are great tools, but neither nails the thing that matters most: getting a thought out of your head and into the system with zero friction.
The two layers that actually matter are 1 and 4: capture and retrieval. If you can't get thoughts in fast enough, you won't use it. If you can't get them back when you need them, it's a graveyard.
Layer 1: Capture — Speed Is Everything
The most important design principle for capture: remove every possible source of friction.
When a thought hits you, you have about 5 seconds before the activation energy to record it exceeds the perceived value. Every tap, every loading screen, every "which folder should this go in?" decision is a chance for the thought to die.
This means making deliberate trade-offs:
- Trade off structure for speed. Don't make users pick a category, tag, or folder at capture time. Dump it in an inbox. Organize later (or let AI organize it).
- Trade off connectivity. Capture must work offline. If the app needs a network round-trip before confirming your note is saved, you've already lost.
- Trade off formatting. The capture interface should be a single text field. Not a rich text editor, not a markdown toolbar, not a template picker. Just text. Maybe voice.
- Trade off completeness. A half-captured thought is infinitely more valuable than an uncaptured one. Let people write fragments, single sentences, even just keywords.
The gold standard: you pull out your phone, the app opens to a text field, you type or speak, and you're done in under 10 seconds. Everything else is downstream.
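As a sketch of how little machinery zero-friction capture actually needs: one function that dumps raw text into a timestamped inbox file, with no categorization step. The `inbox_dir` path is a hypothetical default, not part of any real app.

```python
# Minimal capture sketch: no folders, no tags, no network — just append to an
# inbox. "~/brain/inbox" is an illustrative default, not a real product path.
from datetime import datetime
from pathlib import Path


def capture(text: str, inbox_dir: str = "~/brain/inbox") -> Path:
    """Write a raw thought to a timestamped markdown file and return its path."""
    inbox = Path(inbox_dir).expanduser()
    inbox.mkdir(parents=True, exist_ok=True)
    # Timestamp-based filenames remove the "where does this go?" decision.
    stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
    path = inbox / f"{stamp}.md"
    path.write_text(text + "\n", encoding="utf-8")
    return path
```

Everything downstream — organizing, rendering, retrieval — can happen later, after the thought is already safe on disk.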
Layer 2: Storage — Simplicity Over Cleverness
I've been experimenting with two approaches:
Option A: Plain Markdown Files + Git
~/brain/
  inbox/
    2026-04-13-random-thought.md
  projects/
    kioku/
      notes.md
      architecture.md
  references/
    llm-context-windows.md
Version controlled by git. Each note is a file. Folders are the only organizational primitive.
Pros: Portable, grep-able, works with any editor, zero vendor lock-in, diffs are meaningful, works offline forever.
Cons: No relational queries, no real-time sync across devices without extra tooling, merge conflicts if you edit from multiple places.
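The "grep-able" pro is worth making concrete: with plain files, retrieval needs nothing beyond the filesystem. A toy search over the tree, assuming only that notes are `.md` files under one root:

```python
# Plain-file search in a few lines — no index, no server, no API. This is a
# toy stand-in for grep, just to show why "grep-able" is a real advantage.
from pathlib import Path


def grep_notes(root: str, term: str) -> list[tuple[str, str]]:
    """Return (filename, matching line) pairs for a case-insensitive term."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if term.lower() in line.lower():
                hits.append((path.name, line))
    return hits
```

Any editor, any OS, no vendor in the loop.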
Option B: Database with Markdown Columns
Store markdown content in database rows. Render to files on demand. Supabase/Postgres as the backend.
Pros: Real-time sync, relational queries, easy to build search, works naturally with a mobile app.
Cons: Vendor dependency, need a server, content is trapped behind an API.
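Option B in miniature: each row holds raw markdown plus metadata, and export back to files is one query away. The post assumes Supabase/Postgres; `sqlite3` stands in here only so the sketch is self-contained.

```python
# Database-with-markdown-columns sketch. sqlite3 is an illustrative stand-in
# for the Postgres/Supabase backend described in the text.
import sqlite3


def make_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE notes (
               id INTEGER PRIMARY KEY,
               title TEXT NOT NULL,
               body_md TEXT NOT NULL,           -- raw markdown, rendered on demand
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def export_markdown(conn: sqlite3.Connection) -> dict[str, str]:
    """Dump every note as {filename: markdown} — the no-lock-in escape hatch."""
    rows = conn.execute("SELECT title, body_md FROM notes").fetchall()
    return {f"{title}.md": body for title, body in rows}
```

The export function is the part that matters for the consumer-app version: content lives in rows, but users can always walk away with files.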
Where I've Landed
For a personal tool: files + git. The simplicity is unbeatable. I can use VS Code, vim, Obsidian, or any text editor. My data is always mine.
For a team/company tool: database. You need real-time sync, permissions, and the ability to query across users' knowledge bases.
For a consumer app (which is what I'd ship): database for storage, but export to markdown at any time. Don't lock people in.
Layer 3: Rendering — It Depends on Your Audience
Here's a take that might be controversial: rendering barely matters for most second brain use cases.
For a consumer app, yes — you need it to look good. People expect polished UIs, and a beautiful reading experience helps with adoption. Obsidian got this right with their theme ecosystem.
For a team or internal tool, rendering is almost irrelevant. Engineers and knowledge workers care about two things: "can I find it?" and "is it accurate?" Nobody is admiring the typography of an internal wiki article.
So my advice: don't over-invest in rendering early. Use a standard markdown renderer and move on. Spend that time on retrieval instead.
Layer 4: Retrieval — This Is the Whole Game
This is the part I'm most excited about, and where I think the entire second brain category is about to be disrupted.
The Old Way: Search, Tags, and Folders
Traditional note apps give you three retrieval mechanisms:
- Folders — you have to remember where you put things
- Tags — you have to remember what you tagged things
- Full-text search — you have to remember the exact words you used
All three assume you remember something about the note you're looking for. But the whole point of a second brain is to remember things for you.
The RAG Era (2023–2025)
The first wave of AI-powered note apps used RAG (Retrieval-Augmented Generation): embed your notes as vectors, find semantically similar chunks, pass them to an LLM.
RAG works, but it's complex and lossy:
- You need embedding models, vector databases, chunking strategies
- Retrieval quality depends heavily on chunk size and embedding quality
- The LLM only sees fragments, not your full context
- It's another system to maintain, tune, and debug
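To make the complexity concrete, here is the RAG pipeline in miniature: chunk, embed, score, retrieve. Real systems use learned embedding models and a vector database; word-count vectors stand in here purely to show how many moving parts exist, and how chunk boundaries can slice a thought in half.

```python
# Toy RAG pipeline. Bag-of-words "embeddings" are an illustrative stand-in
# for learned embedding models; the pipeline shape is the point, not quality.
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word windows — the lossy step."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(notes: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    chunks = [c for note in notes for c in chunk(note)]
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]
```

Even this toy needs four components and two tuning knobs (`size`, `k`); a production pipeline multiplies each of them.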
The New Way: Just Dump Everything Into the Context Window
This is the idea that Andrej Karpathy recently tweeted about, and I independently arrived at the same conclusion while building prototypes.
Modern LLMs have context windows of 1M+ tokens. That's roughly 750,000 words — at around 500 words per note, about 1,500 full-length notes. For most people's personal knowledge base, that's... everything.
The approach:
- User asks a question
- Load their entire knowledge base (or a large, relevant subset) into the context window
- LLM answers with full context awareness
No embeddings. No vector databases. No chunking. No retrieval pipeline to tune.
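The whole approach reduces to string concatenation: load everything, build one prompt, send it to the model. A sketch, assuming a directory of markdown notes; the actual model call is left out since it depends on whichever API you use.

```python
# Full-context retrieval sketch: the entire "pipeline" is reading files and
# joining strings. Sending the prompt to an LLM is provider-specific and
# deliberately omitted.
from pathlib import Path


def build_prompt(notes_dir: str, question: str) -> str:
    """Concatenate every note into one prompt, followed by the question."""
    parts = ["You are answering from the user's personal notes.\n"]
    for path in sorted(Path(notes_dir).rglob("*.md")):
        parts.append(f"--- {path.name} ---\n{path.read_text(encoding='utf-8')}")
    parts.append(f"\nQuestion: {question}")
    return "\n".join(parts)
```

Compare this with the RAG pipeline: no chunking decisions, no similarity scoring, no index to rebuild when a note changes.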
Why this works better than RAG:
- The LLM sees connections between notes that a retrieval pipeline would miss
- No information is lost to chunking boundaries
- You can ask complex, cross-cutting questions ("what have I learned about X that contradicts Y?")
- Dramatically simpler to build and maintain
The obvious limitation: cost and latency. Sending 1M tokens per query isn't free, though per-token prices are dropping fast. For a subscription app charging $10–20/month, the math already works if users make a reasonable number of queries per day.
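A back-of-envelope on the cost objection. The prices below are made-up illustrations, not quotes from any provider:

```python
# Hypothetical cost arithmetic for full-context queries. $/Mtok figures are
# illustrative assumptions, not real provider pricing.
def monthly_input_cost(tokens_per_query: int, queries_per_day: int,
                       usd_per_mtok: float) -> float:
    """Input-token cost per 30-day month, in USD."""
    return tokens_per_query / 1e6 * usd_per_mtok * queries_per_day * 30

# A full 1M-token context, 10 queries/day, at an assumed $0.50/Mtok:
#   1.0 * 0.50 * 10 * 30 = $150/month — far too high.
# At an assumed $0.05/Mtok the same usage is $15/month, which fits a
# $10–20 subscription, especially with prompt caching or context pruning.
```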
What This Means for the Product
If retrieval is just "dump everything to the LLM," then the app architecture simplifies dramatically:
Capture (fast input)
→ Storage (simple files or DB)
→ Retrieval (load context + LLM query)
No embedding pipeline. No vector DB. No reindexing when notes change. The hard engineering problem disappears, and what's left is pure product design: how do you make capture effortless, and how do you present the LLM's answers in a way that's useful?
What I'm Building
I'm working on Kioku — a knowledge app built on these principles:
- Capture in under 10 seconds — text or voice, no categorization required
- Plain storage — markdown-friendly, exportable, no lock-in
- AI-first retrieval — no manual search, no folders to navigate. Ask a question, get an answer grounded in your own knowledge
- On-device AI — powered by edge LLMs like Google's Gemma 4, so your knowledge stays on your device. No cloud round-trips, no privacy concerns, instant responses
- Learning integration — the app doesn't just store knowledge, it helps you retain it through spaced repetition
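For the learning integration, the core of spaced repetition fits in a few lines. This is a deliberately minimal sketch — double the review interval on success, reset on failure — not the full SM-2 algorithm, which also tracks a per-item ease factor.

```python
# Simplest spaced-repetition schedule: interval doubling with reset. A sketch
# of the idea, not the full SM-2 algorithm (no ease factor, no grading scale).
def next_interval(days: int, remembered: bool) -> int:
    """Days until the next review of a note."""
    if not remembered:
        return 1             # forgot: see it again tomorrow
    return max(1, days * 2)  # remembered: wait twice as long
```

A note reviewed successfully at 1, 2, 4, 8, 16 days quickly drops to a review every few weeks — cheap enough to run over an entire inbox.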
If this sounds interesting, subscribe below — I'll share the build process as it happens.
The best second brain is one you actually use. Everything else is decoration.
