Coding Collie

Building a Second Brain App: What Actually Matters

Author: Kai Kang · Staff Software Engineer @ Meta · Solo App Builder

I've been thinking about second brain apps for a long time — both as a user frustrated by existing tools, and as an engineer who wants to build one.

After months of experimenting (building prototypes for personal use and for teams at work), I've landed on a framework for thinking about this problem. A second brain has exactly four layers, and most apps get the priorities wrong.

The Four Layers

1. Capture    →  get thoughts in, fast
2. Storage    →  where the data lives
3. Rendering  →  how it looks
4. Retrieval  →  how you get it back

Most note-taking apps obsess over layers 2 and 3 (storage and rendering). Notion has beautiful databases. Obsidian has gorgeous graph views. But these are the wrong layers to optimize first.

I've used both extensively, and here's where they fall short:

  • Notion is too polished and too bulky. It tries to be everything — wiki, database, project manager, docs — and the result is a bloated app that's slow to open and slow to capture a quick thought. I recently discovered Notion was eating 100GB+ of disk space on my machine, and I had fewer than 20 notes stored. Clearly a bug, but it tells you something about the engineering priorities when your note-taking app is heavier than most video games.

  • Obsidian has the opposite problem: too much customizability. It's a playground for power users and plugin enthusiasts, but that's exactly who it serves — geeks tweaking their setup instead of capturing thoughts. The plugin ecosystem is impressive but overwhelming. Somewhere in the pursuit of infinite configurability, simplicity got lost.

Both are great tools, but neither nails the thing that matters most: getting a thought out of your head and into the system with zero friction.

The two layers that actually matter are 1 and 4: capture and retrieval. If you can't get thoughts in fast enough, you won't use it. If you can't get them back when you need them, it's a graveyard.


Layer 1: Capture — Speed Is Everything

The most important design principle for capture: remove every possible source of friction.

When a thought hits you, you have about 5 seconds before the activation energy to record it exceeds the perceived value. Every tap, every loading screen, every "which folder should this go in?" decision is a chance for the thought to die.

This means making deliberate trade-offs:

  • Trade off structure for speed. Don't make users pick a category, tag, or folder at capture time. Dump it in an inbox. Organize later (or let AI organize it).
  • Trade off connectivity. Capture must work offline. If the app needs a network round-trip before confirming your note is saved, you've already lost.
  • Trade off formatting. The capture interface should be a single text field. Not a rich text editor, not a markdown toolbar, not a template picker. Just text. Maybe voice.
  • Trade off completeness. A half-captured thought is infinitely more valuable than an uncaptured one. Let people write fragments, single sentences, even just keywords.

The gold standard: you pull out your phone, the app opens to a text field, you type or speak, and you're done in under 10 seconds. Everything else is downstream.
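A minimal sketch of what zero-friction capture implies in code: one function, one text argument, a timestamped file dumped into an inbox. The `~/brain/inbox` default and the function name are my own illustration, not a real app's API.

```python
from datetime import datetime
from pathlib import Path

def capture(text: str, inbox: Path = Path.home() / "brain" / "inbox") -> Path:
    """Dump a raw thought straight into the inbox: no folder, tag, or title decisions."""
    inbox.mkdir(parents=True, exist_ok=True)
    # Microseconds in the stamp avoid collisions when two thoughts land in one second.
    stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S-%f")
    note = inbox / f"{stamp}.md"
    note.write_text(text + "\n", encoding="utf-8")
    return note
```

Everything else — categorization, linking, cleanup — happens downstream of this call, not inside it.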


Layer 2: Storage — Simplicity Over Cleverness

I've been experimenting with two approaches:

Option A: Plain Markdown Files + Git

~/brain/
  inbox/
    2026-04-13-random-thought.md
  projects/
    kioku/
      notes.md
      architecture.md
  references/
    llm-context-windows.md

Version controlled by git. Each note is a file. Folders are the only organizational primitive.

Pros: Portable, grep-able, works with any editor, zero vendor lock-in, diffs are meaningful, works offline forever.

Cons: No relational queries, no real-time sync across devices without extra tooling, merge conflicts if you edit from multiple places.
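"Grep-able" is worth making concrete: with plain files, retrieval can start life as a literal text scan, no index or server required. A toy version, assuming the folder layout above:

```python
from pathlib import Path

def grep_notes(root: Path, needle: str) -> list[Path]:
    """Case-insensitive substring scan over every markdown note under root."""
    return sorted(
        p for p in root.rglob("*.md")
        if needle.lower() in p.read_text(encoding="utf-8").lower()
    )
```

This is exactly what `grep -ril` does from the shell; the point is that the storage format makes the naive thing work on day one.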

Option B: Database with Markdown Columns

Store markdown content in database rows. Render to files on demand. Supabase/Postgres as the backend.

Pros: Real-time sync, relational queries, easy to build search, works naturally with a mobile app.

Cons: Vendor dependency, need a server, content is trapped behind an API.
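Option B can be surprisingly small. A sketch of the schema, with in-memory SQLite standing in for the Supabase/Postgres backend the post has in mind; column names are my own guess at a minimal design:

```python
import sqlite3

# In-memory SQLite stands in for a hosted Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE notes (
        id         INTEGER PRIMARY KEY,
        owner      TEXT NOT NULL,   -- per-user knowledge bases
        body       TEXT NOT NULL,   -- raw markdown: exportable as-is
        created_at TEXT DEFAULT (datetime('now'))
    )
""")
conn.execute(
    "INSERT INTO notes (owner, body) VALUES (?, ?)",
    ("kai", "# LLM context windows\nnotes go here"),
)

def export_markdown(conn: sqlite3.Connection, owner: str) -> list[str]:
    """The no-lock-in escape hatch: every row is already markdown."""
    return [body for (body,) in conn.execute(
        "SELECT body FROM notes WHERE owner = ? ORDER BY id", (owner,)
    )]
```

Because the `body` column is plain markdown, the "export at any time" promise is a ten-line query, not a migration project.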

Where I've Landed

For a personal tool: files + git. The simplicity is unbeatable. I can use VS Code, vim, Obsidian, or any text editor. My data is always mine.

For a team/company tool: database. You need real-time sync, permissions, and the ability to query across users' knowledge bases.

For a consumer app (which is what I'd ship): database for storage, but export to markdown at any time. Don't lock people in.


Layer 3: Rendering — It Depends on Your Audience

Here's a take that might be controversial: rendering barely matters for most second brain use cases.

For a consumer app, yes — you need it to look good. People expect polished UIs, and a beautiful reading experience helps with adoption. Obsidian got this right with their theme ecosystem.

For a team or internal tool, rendering is almost irrelevant. Engineers and knowledge workers care about two things: "can I find it?" and "is it accurate?" Nobody is admiring the typography of an internal wiki article.

So my advice: don't over-invest in rendering early. Use a standard markdown renderer and move on. Spend that time on retrieval instead.


Layer 4: Retrieval — This Is the Whole Game

This is the part I'm most excited about, and where I think the entire second brain category is about to be disrupted.

The Old Way: Search, Tags, and Folders

Traditional note apps give you three retrieval mechanisms:

  1. Folders — you have to remember where you put things
  2. Tags — you have to remember what you tagged things
  3. Full-text search — you have to remember the exact words you used

All three assume you remember something about the note you're looking for. But the whole point of a second brain is to remember things for you.

The RAG Era (2023–2025)

The first wave of AI-powered note apps used RAG (Retrieval-Augmented Generation): embed your notes as vectors, find semantically similar chunks, pass them to an LLM.

RAG works, but it's complex and lossy:

  • You need embedding models, vector databases, chunking strategies
  • Retrieval quality depends heavily on chunk size and embedding quality
  • The LLM only sees fragments, not your full context
  • It's another system to maintain, tune, and debug
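The "lossy" part is easiest to see in the chunking step itself. A sketch of the standard fixed-size sliding window (sizes are illustrative): any connection spanning a chunk boundary is invisible to the retriever.

```python
def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Fixed-size sliding-window chunking: the step where cross-note links get cut."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]
```

The overlap parameter exists precisely to paper over boundary loss, and tuning it is one more knob the RAG approach forces on you.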

The New Way: Just Dump Everything Into the Context Window

This is the idea that Andrej Karpathy recently tweeted about, and I independently arrived at the same conclusion while building prototypes.

Modern LLMs have context windows of 1M+ tokens. That's roughly 750,000 words — or about 1,500 full-length notes. For most people's personal knowledge base, that's... everything.

The approach:

  1. User asks a question
  2. Load their entire knowledge base (or a large, relevant subset) into the context window
  3. LLM answers with full context awareness

No embeddings. No vector databases. No chunking. No retrieval pipeline to tune.
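The three steps above fit in one function. A sketch, assuming markdown-file storage and a 1M-token model; the chars-per-token heuristic and the truncation fallback are my simplifications:

```python
from pathlib import Path

MAX_CONTEXT_TOKENS = 1_000_000  # assumed model limit
CHARS_PER_TOKEN = 4             # rough heuristic; real tokenizers vary

def build_prompt(root: Path, question: str) -> str:
    """Concatenate every note into one prompt: no embeddings, no vector DB, no chunking."""
    notes = [
        f"## {p.relative_to(root)}\n{p.read_text(encoding='utf-8')}"
        for p in sorted(root.rglob("*.md"))
    ]
    corpus = "\n\n".join(notes)
    budget = MAX_CONTEXT_TOKENS * CHARS_PER_TOKEN
    if len(corpus) > budget:   # a personal knowledge base rarely hits this
        corpus = corpus[:budget]
    return f"You are my second brain. My notes:\n\n{corpus}\n\nQuestion: {question}"
```

The returned string goes to the LLM as-is; the "retrieval pipeline" is a directory walk and a join.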

Why this works better than RAG:

  • The LLM sees connections between notes that a retrieval pipeline would miss
  • No information is lost to chunking boundaries
  • You can ask complex, cross-cutting questions ("what have I learned about X that contradicts Y?")
  • Dramatically simpler to build and maintain

The obvious limitation: cost and latency. Sending 1M tokens per query isn't free. But context window costs are dropping fast — what costs $10 today will cost $0.10 next year. And for a subscription app charging $10–20/month, the math already works if users make a reasonable number of queries per day.
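The back-of-envelope math is worth writing down. A tiny calculator; the prices plugged in below are assumptions for illustration, not quoted rates:

```python
def monthly_context_cost(tokens_per_query: int,
                         queries_per_day: int,
                         usd_per_million_tokens: float) -> float:
    """Back-of-envelope input-token spend per user per month (30-day month)."""
    return tokens_per_query / 1e6 * usd_per_million_tokens * queries_per_day * 30
```

At an assumed $0.10 per million input tokens, three full-context (1M-token) queries a day come to about $9/month — inside a $10–20 subscription. At $1.00 per million the same usage is about $90, which is why prompt caching and falling prices are load-bearing for this architecture.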

What This Means for the Product

If retrieval is just "dump everything to the LLM," then the app architecture simplifies dramatically:

Capture (fast input)
Storage (simple files or DB)
Retrieval (load context + LLM query)

No embedding pipeline. No vector DB. No reindexing when notes change. The hard engineering problem disappears, and what's left is pure product design: how do you make capture effortless, and how do you present the LLM's answers in a way that's useful?


What I'm Building

I'm working on Kioku — a knowledge app built on these principles:

  1. Capture in under 10 seconds — text or voice, no categorization required
  2. Plain storage — markdown-friendly, exportable, no lock-in
  3. AI-first retrieval — no manual search, no folders to navigate. Ask a question, get an answer grounded in your own knowledge
  4. On-device AI — powered by edge LLM models like Google's Gemma 4, so your knowledge stays on your device. No cloud round-trips, no privacy concerns, instant responses
  5. Learning integration — the app doesn't just store knowledge, it helps you retain it through spaced repetition
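The post doesn't specify a spaced-repetition algorithm; the classic choice is SuperMemo's SM-2 family, whose core scheduling rule fits in a few lines. A sketch (function name and the simplified reset-to-one-day behavior are mine):

```python
def next_interval(prev_days: float, ease: float, recalled: bool) -> float:
    """SM-2-flavored scheduling: widen the gap on success, reset on a lapse."""
    if not recalled:
        return 1.0                      # forgot: see it again tomorrow
    return max(1.0, prev_days * ease)   # remembered: stretch the interval
```

Each successful recall multiplies the interval by an ease factor (~2.5 in SM-2), so review load decays geometrically for material you actually know.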

If this sounds interesting, subscribe below — I'll share the build process as it happens.


The best second brain is one you actually use. Everything else is decoration.

Enjoyed this post? Subscribe for more.