The Memory Lifecycle
At a high level, Agent Memory works in three phases: capture, consolidate, and recall. During a session, hooks fire automatically on every tool use and observations are collected. When the session ends, a consolidation pipeline compresses those raw observations into durable, searchable memories. When your next session starts, the most relevant memories are retrieved and prepended to your agent’s context window, before your first prompt.
Agent calls a tool
Your agent reads a file, runs a shell command, performs a web fetch, or executes any other tool. This is the raw material Agent Memory works with.
Agent Memory intercepts via hooks
Before and after every tool call, Agent Memory’s hook handlers fire. These hooks capture the tool name, its inputs, its output, and the surrounding context — including any errors that occurred.
Observations are stored and optionally compressed
The raw observation is stored immediately. If you have an LLM provider configured and AGENTMEMORY_AUTO_COMPRESS=true, Agent Memory also calls your LLM to compress the observation into structured facts, a narrative summary, and extracted concepts. Without an LLM, a synthetic (BM25-compatible) compression path runs instead, so search still works, just without AI-generated summaries.
Session-end consolidation pipeline runs
When your session ends, Agent Memory runs the consolidation pipeline: raw working-memory observations are summarized into episodic memories, and episodic memories are then distilled into semantic facts and procedural patterns. A knowledge graph is optionally extracted if GRAPH_EXTRACTION_ENABLED=true.
Next session starts with injected context
When your next session begins, Agent Memory performs a hybrid search against all stored memories, retrieves the most relevant ones within your token budget (default: 2,000 tokens), and injects them as context before your first prompt lands. Your agent already knows what it learned last time.
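The three phases above can be pictured with a small in-memory model. This is a minimal sketch, not Agent Memory's implementation: the `ToyMemoryStore` class, its method names, and the field names are hypothetical, consolidation is a plain string join rather than an LLM summary, and recall is a naive keyword match rather than hybrid search.

```typescript
// Hypothetical shapes for illustration only; not Agent Memory's real types.
interface ToyObservation {
  sessionId: string;
  toolName: string;
  input: string;
  output: string;
  error?: string;
}

interface ToyConsolidatedMemory {
  sessionId: string;
  summary: string;
}

class ToyMemoryStore {
  private observations: ToyObservation[] = [];
  private memories: ToyConsolidatedMemory[] = [];

  // Phase 1 (capture): called from a hook after each tool call.
  capture(obs: ToyObservation): void {
    this.observations.push(obs);
  }

  // Phase 2 (consolidate): fold one session's raw observations into one memory.
  // A real pipeline would summarize; this sketch only joins tool names and results.
  consolidate(sessionId: string): ToyConsolidatedMemory | undefined {
    const raw = this.observations.filter((o) => o.sessionId === sessionId);
    if (raw.length === 0) return undefined;
    const summary = raw
      .map((o) => `${o.toolName}: ${o.error ?? o.output}`)
      .join("; ");
    const consolidated: ToyConsolidatedMemory = { sessionId, summary };
    this.memories.push(consolidated);
    return consolidated;
  }

  // Phase 3 (recall): return memories whose summary mentions any query word.
  recall(query: string): ToyConsolidatedMemory[] {
    const words = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
    return this.memories.filter((m) =>
      words.some((w) => m.summary.toLowerCase().indexOf(w) !== -1)
    );
  }
}

// Usage: one captured tool call, consolidated at session end, recalled later.
const toyStore = new ToyMemoryStore();
toyStore.capture({
  sessionId: "s1",
  toolName: "command_run",
  input: "npm test",
  output: "2 tests failed",
});
toyStore.consolidate("s1");
const recalled = toyStore.recall("tests");
```

Keeping the three phases as separate methods mirrors the description above: capture is cheap and happens on every tool call, while the more expensive consolidation step runs once per session.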
Hook Types
Agent Memory intercepts your agent’s activity through a set of lifecycle hooks. Each hook type maps to a specific event in your agent’s session:
session_start
Fires when a new session begins. Triggers context retrieval and injects relevant memories from past sessions into the conversation.
session_end
Fires when a session completes. Triggers the full consolidation pipeline — session summary, graph extraction, slot reflection.
pre_tool_use
Fires before every tool call. Captures file access patterns and enriches context so Agent Memory knows what your agent is about to touch.
post_tool_use
Fires after every successful tool call. This is the primary capture point — tool name, input, and output are all recorded here.
post_tool_failure
Fires when a tool call fails. Captures error context, stack traces, and failure patterns so your agent learns from mistakes across sessions.
prompt_submit
Fires when you submit a prompt. Captures the user prompt (privacy-filtered) to provide conversation context for memory retrieval.
task_completed
Fires when a task completes. Triggers an end-of-session summary and the knowledge graph extraction pass.
stop
Fires when the agent stops. Works in tandem with task_completed to close the session and flush final observations.
pre_compact
Fires before the agent compacts its context window. Agent Memory captures a snapshot of the current working context so no observations are lost during compaction.
notification
Fires when the agent receives a notification. Captures notification content as an observation so background events are part of the session record.
subagent_start / subagent_stop
Fires when sub-agents are spawned or complete. Tracks sub-agent lifecycle so multi-agent workflows are captured holistically.
src/types.ts:
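A sketch of how the twelve hook names listed above could be written as a TypeScript union, with a type guard for validating incoming event names. The union is assembled from the list in this section and may not match the actual definition in the source file; `HOOK_TYPES` and `isHookType` are hypothetical names.

```typescript
// The twelve hook names described above, kept as a readonly tuple so the
// union type and the runtime check share a single source.
const HOOK_TYPES = [
  "session_start",
  "session_end",
  "pre_tool_use",
  "post_tool_use",
  "post_tool_failure",
  "prompt_submit",
  "task_completed",
  "stop",
  "pre_compact",
  "notification",
  "subagent_start",
  "subagent_stop",
] as const;

type HookType = (typeof HOOK_TYPES)[number];

// Type guard: narrows an arbitrary string to HookType when it is a known name.
function isHookType(value: string): value is HookType {
  return (HOOK_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Usage: keep only recognized hook names from a list of incoming event names.
const incomingEvents = ["post_tool_use", "PostToolUse", "stop"];
const recognizedHooks: HookType[] = [];
for (const name of incomingEvents) {
  if (isHookType(name)) recognizedHooks.push(name);
}
```

Note that the guard is case-sensitive, so a PascalCase name such as `PostToolUse` is not recognized in this sketch; an integration that emits such names would need its own mapping.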
Different agent integrations support different subsets of these hooks. Claude Code supports all 12. Codex CLI supports 6 (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PreCompact, Stop). Any agent connected via MCP gets memory tools but may not emit all hook events automatically.
What Gets Captured
Agent Memory categorizes every observation it records into a typed ObservationType. This lets you search, filter, and recall memories by the kind of activity they represent:
File operations: file_read, file_write, file_edit
Every time your agent reads, writes, or edits a file, Agent Memory records the file path, the content (where relevant), and the context around why the file was touched. Over time, this builds a per-file history you can retrieve with memory_file_history.
Shell commands: command_run
Shell commands your agent executes — build steps, test runs, installs, migrations — are captured with their output and exit codes. Patterns in which commands succeed or fail become part of your agent’s procedural memory.
Search and web operations: search, web_fetch
When your agent searches or fetches external content, the query and results are captured. This helps Agent Memory understand what your agent was researching and surface relevant findings in future sessions.
Decisions and discoveries: decision, discovery
Explicit decisions (choosing one library over another, picking an architecture pattern) and discoveries (finding a root cause, understanding a system’s behavior) are captured as high-importance observations that feed directly into semantic memory.
Errors and conversations: error, conversation
Errors your agent encounters — and the prompts and responses exchanged — are captured to build a record of what went wrong and how it was resolved.
Tasks and sub-agents: task, subagent, notification, image, other
Task lifecycle events, sub-agent operations, notifications, and image-based observations (when your agent processes screenshots or diagrams) are all tracked.
ObservationType union from src/types.ts:
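A sketch of what the union might look like, assembled from the type names in the categories above; the actual definition may differ in order or membership. The `RecordedObservation` shape and the `filterByType` helper are hypothetical, added only to show how a typed union supports filtering by kind of activity.

```typescript
// Union of the observation type names mentioned in this section.
type ObservationType =
  | "file_read"
  | "file_write"
  | "file_edit"
  | "command_run"
  | "search"
  | "web_fetch"
  | "decision"
  | "discovery"
  | "error"
  | "conversation"
  | "task"
  | "subagent"
  | "notification"
  | "image"
  | "other";

// Hypothetical record shape; the real stored observation has more fields.
interface RecordedObservation {
  type: ObservationType;
  summary: string;
}

// Return only the observations whose type is one of the requested kinds.
function filterByType(
  observations: RecordedObservation[],
  ...types: ObservationType[]
): RecordedObservation[] {
  return observations.filter((o) => types.indexOf(o.type) !== -1);
}

// Usage: pull out file activity from a mixed list.
const sampleObservations: RecordedObservation[] = [
  { type: "file_read", summary: "read src/auth.ts" },
  { type: "command_run", summary: "npm test exited with code 1" },
  { type: "file_edit", summary: "edited src/auth.ts" },
  { type: "decision", summary: "chose jose over jsonwebtoken" },
];
const fileActivity = filterByType(sampleObservations, "file_read", "file_write", "file_edit");
```

Because the type names are string literals, the compiler rejects a misspelled kind such as `"file_reed"` at the call site instead of silently returning an empty result.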
Memory Storage
All data is stored locally on your machine under ~/.agentmemory/ using SQLite. There is no cloud sync, no external database, no third-party service. Your memories stay on your hardware.
Token Injection
At session start, Agent Memory retrieves the most relevant memories from your history and prepends them to your agent’s context window. This is what makes your agent “remember”: it receives a curated summary of relevant past work before your first prompt. You control the injection behavior with two settings in ~/.agentmemory/.env:
Injected memories are capped at TOKEN_BUDGET tokens and diversified across sessions so you don’t get the same session’s memories repeated.
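One way to picture budgeted, session-diversified selection is a greedy round-robin over sessions. This is a sketch under stated assumptions, not Agent Memory's actual algorithm: the `ScoredMemory` shape, the `selectForInjection` function, and the four-characters-per-token estimate are all hypothetical.

```typescript
// Hypothetical candidate shape: a memory plus the relevance score from search.
interface ScoredMemory {
  sessionId: string;
  text: string;
  score: number; // higher means more relevant
}

// Rough token estimate (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function selectForInjection(candidates: ScoredMemory[], tokenBudget: number): ScoredMemory[] {
  // Best-scoring candidates first, without mutating the caller's array.
  const sorted = candidates.slice().sort((a, b) => b.score - a.score);

  // One queue per session; sessions are visited in order of their best score.
  const sessionOrder: string[] = [];
  const queues = new Map<string, ScoredMemory[]>();
  for (const m of sorted) {
    let queue = queues.get(m.sessionId);
    if (!queue) {
      queue = [];
      queues.set(m.sessionId, queue);
      sessionOrder.push(m.sessionId);
    }
    queue.push(m);
  }

  const selected: ScoredMemory[] = [];
  let remaining = tokenBudget;
  let progressed = true;
  // Round-robin: take at most one memory per session on each pass.
  while (progressed) {
    progressed = false;
    for (const sessionId of sessionOrder) {
      const next = queues.get(sessionId)?.shift();
      if (!next) continue;
      progressed = true; // something was dequeued, so another pass may help
      const cost = estimateTokens(next.text);
      if (cost <= remaining) {
        selected.push(next);
        remaining -= cost;
      }
      // Candidates that do not fit are skipped rather than truncated.
    }
  }
  return selected;
}

// Usage: with a tight budget, the second pick comes from a different session
// even though another memory from the first session scores higher.
const injected = selectForInjection(
  [
    { sessionId: "s1", text: "auth test fixed by mocking clock", score: 0.9 },
    { sessionId: "s1", text: "auth token expiry is 15 minutes", score: 0.8 },
    { sessionId: "s2", text: "migrations run via npm run migrate", score: 0.7 },
  ],
  17
);
```

The round-robin trades a little relevance for coverage: a strictly score-ordered pick would fill the budget from whichever session matched best, which is the repetition the diversification step is meant to avoid.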
Agent Memory works even without an LLM API key. Without one, it uses BM25 keyword search and optional local embeddings (install @xenova/transformers for free offline vector search). Consolidation into semantic and procedural memory requires an LLM, but capture, storage, and search all work out of the box with zero API keys.
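For readers unfamiliar with BM25, the following is a compact sketch of the standard scoring formula. It is a generic textbook version with the common defaults k1 = 1.2 and b = 0.75 and a naive tokenizer, not the search code Agent Memory ships; `tokenize` and `bm25Scores` are hypothetical names.

```typescript
// Naive tokenizer (assumption): lowercase, split on anything that is not a
// letter, digit, or underscore.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length > 0);
}

// Score every document against the query with Okapi BM25.
// Returns one score per document, in the same order as `docs`.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  if (n === 0) return [];
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / n;
  const terms = tokenize(query);

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of terms) {
      // Term frequency in this document.
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Document frequency: how many documents contain the term at all.
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length;
      // Smoothed inverse document frequency; always positive in this form.
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      // Length normalization: longer-than-average documents are penalized.
      const norm = tf + k1 * (1 - b + b * (doc.length / avgLen));
      score += idf * ((tf * (k1 + 1)) / norm);
    }
    return score;
  });
}

// Usage: the first document matches both query terms and ranks highest;
// the second matches neither and scores zero.
const bm25Example = bm25Scores("auth test", [
  "fixed failing auth test by mocking the clock",
  "ran database migration with npm run migrate",
  "auth token refresh fails after migration",
]);
```

BM25 needs only token counts, which is why keyword search can work without any API key; the trade-off is that it matches exact tokens, so "fails" and "failing" are unrelated terms unless stemming or embeddings are added.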