Memory & context

An agent's working memory is its context window, and it is gone the moment the session ends. Without persistent memory, you re-explain your project, your stack, and your conventions every single time. This page is about writing things down once so you never have to repeat them.

Three layers of persistent context

1 · Project memory · `GEMINI.md`

A markdown file at the root of your project that the agent reads on every session. It's the README the agent follows, not the README humans read. Keep it short and useful.

Good things to put in it:

How to run the project (the one command, not the README's full setup)
How to run tests (the exact command — agents will guess wrong)
Conventions that aren't obvious from the code (naming, file layout, error handling)
"Don't change X without checking Y" — the constraints that aren't enforced by tests
The deploy flow, in 3–5 bullets

Bad things to put in it:

Anything the agent can derive from reading the code (don't re-document what's in the file structure)
A wall of architectural philosophy — keep it operational
Long examples — they bloat every session's context budget
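To make "short" measurable, you can put a rough size check on the file. The sketch below is an illustration, not part of any tool: the 4-characters-per-token ratio is a coarse heuristic rather than a real tokenizer, and the 500-token ceiling is an arbitrary number chosen for the example.

```python
from pathlib import Path

# Assumption: roughly 4 characters per token. This is a coarse heuristic,
# not a real tokenizer; actual counts vary by model.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a block of text."""
    return len(text) // CHARS_PER_TOKEN


def check_memory_budget(text: str, max_tokens: int = 500) -> tuple[int, bool]:
    """Return (estimated_tokens, within_budget) for a memory file's contents.

    max_tokens is a hypothetical ceiling; pick whatever suits your project.
    """
    tokens = estimate_tokens(text)
    return tokens, tokens <= max_tokens


if __name__ == "__main__":
    path = Path("GEMINI.md")
    if path.exists():
        tokens, ok = check_memory_budget(path.read_text(encoding="utf-8"))
        status = "within" if ok else "over"
        print(f"{path}: ~{tokens} tokens, {status} budget")
```

Run it from the project root whenever the file grows. If it reports "over", the long examples are usually the first thing to cut.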

2 · User memory · auto-recall

Many coding IDEs now have an auto-memory layer that captures facts about you: your preferred stack, your past corrections, the conventions you've established. The agent surfaces them when relevant.

This works best when you let the agent capture things naturally. If it offers to remember "you prefer pnpm over npm" — yes, save it. If it offers to remember a specific debugging fix — probably no, that's project-specific noise.

Re-evaluate auto-memories every couple of months. Stale memories pollute the context just like dead-end paths in a session.

3 · Session memory · plans & specs

The plan, spec, or brainstorm doc you write with the agent during a session is durable memory under your control. Save it as a markdown file in the repo and have the next session open by reading it, so the agent skips the re-explanation entirely. Today's decisions become tomorrow's context.

What this looks like in practice:

Before a complex task, ask the agent to write the plan first and save it as `plan.md`. The next session reads the file and picks up where you left off.
For a multi-day feature, keep a short `spec.md` alongside the code: decisions, rejected alternatives, open questions. The agent updates it as the work evolves.
For brainstorming, ask the agent to capture the conversation as bullet points before you close the tab. Three minutes of recap saves twenty minutes of re-discovering the thread later.

The pattern: think → capture → reload. The agent forgets; the markdown doesn't.
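The capture and reload steps can be sketched in a few lines. Everything here is hypothetical scaffolding: the function names, the section headings inside the plan file, and the wording of the opening prompt are assumptions for illustration. In practice the agent usually writes and reads the file itself.

```python
from datetime import date
from pathlib import Path


def capture_plan(path: Path, title: str, decisions: list[str],
                 open_questions: list[str]) -> None:
    """Capture: write the session's outcome to a markdown file."""
    lines = [f"# {title}", "", f"Last updated: {date.today().isoformat()}", ""]
    lines.append("## Decisions")
    lines += [f"- {item}" for item in decisions]
    lines += ["", "## Open questions"]
    lines += [f"- {item}" for item in open_questions]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def reload_prompt(path: Path) -> str:
    """Reload: build the first message of the next session from the saved file."""
    if not path.exists():
        return "No saved plan found. Let's write one before starting."
    plan = path.read_text(encoding="utf-8")
    return f"Read this plan and continue where we left off:\n\n{plan}"
```

A session would end with something like `capture_plan(Path("plan.md"), "Checkout rewrite", [...], [...])`, and the next one would start by sending `reload_prompt(Path("plan.md"))`.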

A starter `GEMINI.md` template

# Project: <name>

## Stack
- Language: <e.g. Node.js 20, TypeScript>
- Framework: <e.g. Express, Vite, React>
- Tests: <e.g. Vitest>
- Deploy: <e.g. Cloud Run via gcloud>

## Commands
- Dev: `pnpm dev`
- Test: `pnpm test`
- Lint/format: `pnpm lint`
- Deploy: `gcloud run deploy --source .`

## Conventions
- File naming: kebab-case
- All components are functional + hooks
- Errors throw, never return null
- Don't add dependencies without checking with me first

## Don't touch
- `vendor/` — third-party code, mirror only
- `migrations/` — append-only, never edit existing files

Pruning rules

Memory becomes noise just as fast as it becomes signal. Re-read your `GEMINI.md` every couple of weeks and ask:

Has anything become obvious from reading the code? Delete it from memory.
Was any rule broken without anyone objecting in review? Then it wasn't important, so delete it.
Is there anything you re-explain in chat that should live here instead? Add it.
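A reminder helps the review actually happen. The sketch below flags a memory file that has not been modified for a set number of days. It is only a proxy: modification time tracks edits, not reads, so a file you reviewed and left unchanged will still be flagged. The 14-day interval is an assumption standing in for "every couple of weeks".

```python
import time
from pathlib import Path
from typing import Optional

REVIEW_INTERVAL_DAYS = 14  # assumption: "every couple of weeks"
SECONDS_PER_DAY = 86400


def days_since_modified(path: Path, now: Optional[float] = None) -> float:
    """Return how many days have passed since the file was last modified."""
    current = time.time() if now is None else now
    return (current - path.stat().st_mtime) / SECONDS_PER_DAY


def needs_review(path: Path, interval_days: int = REVIEW_INTERVAL_DAYS,
                 now: Optional[float] = None) -> bool:
    """Return True when the file has gone untouched for the whole interval."""
    return days_since_modified(path, now) >= interval_days
```

Calling `needs_review(Path("GEMINI.md"))` from a pre-commit hook or a weekly script is one way to surface the reminder without having to remember it.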
