Patterns for AI-Augmented Deep Research
A methodology essay for anyone running serious research with AI assistants. The premise: LLM chat alone is not research — it's a lossy first draft. Real research with AI is a loop of memory-first context loading, structured source gathering, batch synthesis, and disciplined capture into a knowledge base that compounds over time.
Why a workflow at all
Without structure, AI-assisted research collapses into two failure modes:
- Ephemeral chat — useful answers in a session that vanish when the context window closes. Nothing accumulates.
- Hallucinated confidence — synthesis without source citations, where the model's fluency hides the absence of evidence.
A workflow exists to defeat both: every claim has a traceable source, and every session deposits something into a persistent store that the next session can build on.
Inputs
Memory-first context loading
Before asking a model a research question, load what's already known. A well-maintained knowledge base of wiki pages, daily logs, and prior session summaries is a far cheaper source of context than re-deriving everything from web search.
The order of operations:
- Search local memory first. What have I already filed on this topic, this entity, this concept?
- Read the existing wiki entry if one exists. Reuse and extend, don't duplicate.
- Only then go to the open web for what's missing or stale.
This is not just an efficiency move. It forces the synthesis step to reconcile new information against what was previously believed — which is where most insight actually comes from.
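A minimal sketch of that order, assuming a flat `notes/` directory of markdown files; the directory name and the `web_search` stub are illustrative, not tied to any particular tool:

```python
from pathlib import Path

NOTES_DIR = Path("notes")  # assumption: flat directory of markdown notes

def web_search(topic: str) -> str:
    """Stub for the open-web fallback (tavily, Perplexity, etc.)."""
    raise NotImplementedError("only reached when local memory is empty")

def search_local_memory(topic: str) -> list[Path]:
    """Step 1: what have I already filed on this topic?"""
    return [
        page for page in sorted(NOTES_DIR.glob("*.md"))
        if topic.lower() in page.read_text(encoding="utf-8").lower()
    ]

def load_context(topic: str) -> str:
    """Memory first; only go to the web for what's missing."""
    local = search_local_memory(topic)
    if local:
        return "\n\n".join(p.read_text(encoding="utf-8") for p in local)
    return web_search(topic)
```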
Source gathering
A typical research session pulls from a layered stack:
| Layer | Examples | Use case |
|---|---|---|
| Search APIs | tavily, Perplexity, Brave, Exa | Broad initial sweep, citation-backed |
| Primary sources | Company sites, GitHub, whitepapers, SEC filings | Verifying specific claims |
| Structured data | Crunchbase, RootData, Token Terminal | Funding, metrics, quantitative facts |
| Long-form | Podcasts, talks, longreads | Founder voice, thesis depth |
| Aggregator LLMs | Routed via openrouter or similar | Cross-checking with different model biases |
The pattern: API search for breadth, primary sources for verification, structured DBs for numbers, long-form for context. No single tool covers all four.
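A sketch of the breadth-then-verify half of that pattern, assuming the tavily-python client; the query is a placeholder, and the exact response shape should be checked against tavily's current docs:

```python
import requests
from tavily import TavilyClient  # assumption: the tavily-python package

client = TavilyClient(api_key="...")  # use an env var in practice

# Breadth: one citation-backed sweep across the open web.
sweep = client.search("agent payment protocols", max_results=10)

# Verification: fetch each cited primary source and keep the raw text,
# so synthesis can quote the source rather than trust the search summary.
primary = {}
for result in sweep.get("results", []):
    resp = requests.get(result["url"], timeout=30)
    if resp.ok:
        primary[result["url"]] = resp.text
```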
Triggers worth following
Not every curiosity deserves a deep dive. Triggers that actually pay off:
- Investment-grade questions — anything you'd want to be right about with money on the line
- Recurring patterns — the third time a name appears across unrelated sources, file it
- Contrarian signals — a credible source disagrees with consensus
- Building decisions — research that gates a concrete next step
Triggers that don't: idle curiosity dressed as research, doomscrolling with extra steps, "interesting" with no follow-on action.
Synthesis
Batch over stream
LLMs are tempting for incremental Q&A — ask, get an answer, ask the next thing. This is the wrong shape for research. The high-leverage move is to batch raw material into a single long-context call and ask for synthesis across the whole pile.
A typical batch:
- 5–15 source URLs or excerpts
- Prior wiki page on the topic, if any
- An explicit synthesis prompt: compare, contrast, identify patterns, flag contradictions
claude-code-sessions are particularly suited to this because the working directory itself becomes the context — raw notes, prior pages, fetched content all live as files the model reads in one pass.
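A sketch of the batch shape, assuming sources already fetched to disk as markdown; the section headers and task wording are illustrative:

```python
from pathlib import Path

def build_batch_prompt(source_dir: Path, wiki_page: Path | None) -> str:
    """One long-context call over the whole pile, not incremental Q&A."""
    parts = []
    if wiki_page and wiki_page.exists():
        parts.append("## Prior wiki page\n" + wiki_page.read_text(encoding="utf-8"))
    for src in sorted(source_dir.glob("*.md")):
        parts.append(f"## Source: {src.name}\n" + src.read_text(encoding="utf-8"))
    parts.append(
        "## Task\nCompare and contrast the sources above, identify "
        "recurring patterns, and flag contradictions explicitly."
    )
    return "\n\n".join(parts)
```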
Cross-model cross-check
Different models have different training cutoffs, different biases, different hallucination patterns. For high-stakes claims, run the same synthesis through two models from different families. Disagreements are signal — they tell you which claims are stable and which are model-dependent.
openrouter makes this cheap; one client, many backends.
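A sketch of the cross-check, using the `openai` client pointed at openrouter's OpenAI-compatible endpoint; the model slugs in the usage comment are illustrative and should be checked against openrouter's catalog:

```python
import os
from openai import OpenAI  # openrouter exposes an OpenAI-compatible API

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def cross_check(synthesis_prompt: str, models: list[str]) -> dict[str, str]:
    """Run the same synthesis through models from different families."""
    return {
        model: client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": synthesis_prompt}],
        ).choices[0].message.content
        for model in models
    }

# Disagreements between the answers are the signal, not a failure:
# answers = cross_check(prompt, ["anthropic/claude-sonnet-4", "openai/gpt-4o"])
```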
Keep the contrarian view in frame
Single-source claims, vendor self-reports, and consensus-only narratives all underweight risk. A synthesis that doesn't surface at least one credible contrarian view is incomplete — flag it as such and keep digging.
Capture & cite
Every claim has a source
The non-negotiable rule: when a fact lands in the wiki, it carries a citation. Two formats cover almost everything:
- `local:<filename>` for material from your own notes or session logs
- `https://<url> (YYYY-MM-DD)` for web sources, with the access date; content drifts, dates anchor it
For numerically sensitive claims (funding rounds, valuations, user counts, revenue), inline [1] [2] references keyed to a Sources block at the bottom prevent drift when the page is later edited.
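A minimal sketch of checking the rule mechanically, covering just the two formats above; treating bullet lines as the fact-bearing ones is a simplification:

```python
import re

# The two citation shapes: local notes, and dated web references.
LOCAL_CITE = re.compile(r"local:\S+")
WEB_CITE = re.compile(r"https?://\S+ \(\d{4}-\d{2}-\d{2}\)")

def uncited_lines(page_text: str) -> list[str]:
    """Flag bullet lines that carry neither citation format."""
    return [
        line for line in page_text.splitlines()
        if line.lstrip().startswith("- ")
        and not (LOCAL_CITE.search(line) or WEB_CITE.search(line))
    ]
```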
Page kinds
A useful split for a research wiki:
- Module pages — analyses of a space, concept, or workflow. Multiple entities may live inside.
- Entity pages — a specific company, product, protocol, or person. One page, one referent.
- Source pages (optional) — distilled summaries of a single significant source, linked from anywhere it informs.
The module-vs-entity distinction matters: it's the difference between "agent payment protocols, the space" and "Skyfire, the company." Research often starts as a module sweep and then spawns entity pages as specific names earn their own treatment.
Densely link
Wiki-link liberally. The value of a knowledge base is not its pages — it's the graph between them. A well-linked wiki surfaces non-obvious connections (this founder also funded that protocol; this concept overlaps with that thesis) that flat notes never will.
Use a closed vocabulary of valid slugs, and degrade unknown links to plain text rather than 404s. This lets you link to future pages aspirationally without breaking the build.
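A sketch of that degrade behavior, assuming a `[[slug]]` link syntax, which is an assumption rather than a given:

```python
import re

WIKI_LINK = re.compile(r"\[\[([a-z0-9-]+)\]\]")  # assumption: [[slug]] syntax

def resolve_links(text: str, valid_slugs: set[str]) -> str:
    """Render known slugs as links; degrade unknown ones to plain text."""
    def render(match: re.Match) -> str:
        slug = match.group(1)
        if slug in valid_slugs:
            return f"[{slug}](/{slug})"
        return slug  # aspirational link, not yet a page: no 404
    return WIKI_LINK.sub(render, text)
```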
The compilation loop
Research sessions produce raw material. Compilation turns raw material into navigable knowledge. They are different jobs, and conflating them is why most personal wikis decay.
Cadence
A workable rhythm:
| Window | Activity |
|---|---|
| Per session | Raw notes into a dated daily log |
| Weekly | Review logs, promote significant findings into wiki pages |
| Monthly | Cross-link sweep, consolidate duplicates, kill dead pages |
| Quarterly | Theme retrospective — what changed, what was wrong, what's stable |
The weekly promote step is the load-bearing one. Without it, daily logs accumulate and the wiki rots.
Idempotent updates
Compilation should be safe to re-run. If a wiki page already exists for an entity, the next session merges new info in rather than starting from scratch. Frontmatter timestamps (last_compiled) make this auditable: pages older than the most recent relevant source are candidates for refresh.
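A sketch of the staleness check, assuming the python-frontmatter package and an ISO-formatted `last_compiled` value:

```python
from datetime import date
from pathlib import Path

import frontmatter  # assumption: the python-frontmatter package

def needs_refresh(page: Path, newest_source: date) -> bool:
    """Pages older than their most recent relevant source get recompiled."""
    compiled = frontmatter.load(str(page)).get("last_compiled")
    if compiled is None:
        return True  # never compiled: always a candidate
    return date.fromisoformat(str(compiled)) < newest_source
```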
The "what would I tell a smart friend" test
A wiki page passes review when you'd hand it to a smart friend asking the same question and feel it answers them honestly, gaps and uncertainties included. If the page is mostly vendor copy or model-generated platitudes, it doesn't earn its slot.
Maintenance loop
Sources of decay
Wikis decay in predictable ways:
- Stale facts — funding rounds, headcount, product names change
- Dead links — primary sources reorganize their sites
- Drifted definitions — a term means something different two years later
- Orphan pages — entities that nothing else links to anymore
The maintenance loop is reading old pages with fresh eyes and asking: still true? still relevant? still linked?
Quality signals
A healthy research wiki shows:
- Most pages have multiple inbound links
- Sources include both local notes and dated web references
- Recent `last_compiled` dates on the topics you're actively thinking about
- A small, ruthless set of slugs, not sprawl
Sprawl is the silent killer. A wiki with 400 thinly populated entity pages is worse than one with 80 strong ones, because the graph value depends on density.
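A sketch of the inbound-link signal, reusing the `[[slug]]` syntax assumed earlier; zero-count pages are the orphan candidates from the maintenance loop:

```python
import re
from collections import Counter
from pathlib import Path

WIKI_LINK = re.compile(r"\[\[([a-z0-9-]+)\]\]")  # same [[slug]] assumption

def inbound_link_counts(wiki_dir: Path) -> Counter:
    """Count inbound links per slug across the whole wiki."""
    counts = Counter({p.stem: 0 for p in wiki_dir.glob("*.md")})
    for page in wiki_dir.glob("*.md"):
        for slug in WIKI_LINK.findall(page.read_text(encoding="utf-8")):
            if slug != page.stem and slug in counts:
                counts[slug] += 1
    return counts

# Orphans: pages nothing else links to anymore.
# orphans = [s for s, n in inbound_link_counts(Path("wiki")).items() if n == 0]
```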
Tooling stack
The specific tools matter less than the shape, but a representative AI-augmented stack:
- Search: tavily for agent-friendly search APIs, Perplexity / Brave / Exa as alternates
- Synthesis: claude-code-sessions for long-context multi-file reasoning; openrouter for cross-model checks
- Storage: a flat directory of markdown files with frontmatter, version-controlled in git
- Render: a static site generator (Next.js export, Hugo, mkdocs, etc.) for browseable output
- Capture: dated daily logs as the universal inbox; promotion to module/entity pages on the weekly cadence
The directory-of-markdown-with-frontmatter format is deliberate: it's the lowest-friction shape for both human editing and LLM context loading. JSON databases and proprietary note apps both lose against `cat *.md`.
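A sketch of that property, the whole knowledge base loading into model context in one pass:

```python
from pathlib import Path

def wiki_as_context(wiki_dir: Path) -> str:
    """The `cat *.md` property: the whole wiki as one context blob."""
    return "\n\n---\n\n".join(
        f"<!-- {page.name} -->\n{page.read_text(encoding='utf-8')}"
        for page in sorted(wiki_dir.glob("*.md"))
    )
```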
Anti-patterns
Things that look like research but aren't:
- One-shot LLM Q&A with no source verification and no capture — pure dopamine, zero accumulation
- Bookmark hoarding — saving links is not reading them
- Vendor-only sources — only reading what a company says about itself
- Single-model dependency — never cross-checking against a model from a different family
- Synthesis without contradictions — a tidy story that ignores the messy parts is usually wrong
- Wiki without maintenance — write-only knowledge bases rot within months
The compounding bet
The reason to invest in this workflow is compounding. Session N+1 starts smarter than session N because session N filed something useful. After a year, the wiki is doing real work: serving as the load-bearing context for new investigations, surfacing connections you'd otherwise miss, and replacing "let me search again" with "let me read what I already concluded."
That compounding is what separates AI-augmented research from AI-as-search-replacement. The model is the engine; the wiki is the chassis.
Related
- claude-code-sessions — long-context synthesis as the primary research environment
- tavily-search-integration — search APIs designed for agent workflows
- openrouter — multi-model routing for cross-checks