Skip to main content
These endpoints power Agent Memory’s retrieval layer — use them to search past observations with hybrid BM25 + vector + graph ranking, generate context blocks ready to inject into prompts, enrich file paths with related memories and known bugs, and traverse the knowledge graph. All retrieval results are session-diversified (at most 3 results per session) and ranked with Reciprocal Rank Fusion (RRF, k=60).

POST /agentmemory/smart-search

Runs a hybrid three-stream search across all memories and observations. This is the primary search endpoint — it combines keyword matching (BM25), semantic similarity (vector), and entity-linked knowledge graph traversal, then fuses the ranked results.
You must provide either a query string or an expandIds array. Providing both is allowed — expandIds expands the context around specific observations while query runs the full retrieval pipeline.

Request

curl -X POST http://localhost:3111/agentmemory/smart-search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how does the auth token refresh work",
    "project": "my-project",
    "limit": 10,
    "includeLessons": true
  }'
query
string
Natural language search query. Agent Memory tokenizes this for BM25, embeds it for vector similarity, and extracts named entities for graph traversal — all in a single call.
expandIds
array
Array of observation IDs (strings or { obsId, sessionId } objects) to expand context around. Used internally by the viewer and MCP tools to fetch neighbouring observations.
project
string
Scope the search to a specific project. Omit to search across all projects.
limit
number
Maximum number of results to return (default 10). Must be a positive integer.
includeLessons
boolean
When true, also searches the lessons store (crystallized learnings from past sessions) and merges them into the results.
agentId
string
Override the agent scope for this search. Useful when routing queries on behalf of a specific agent in multi-agent setups. You can also pass ?agentId=<id> as a query parameter.
sessionId
string
Bias results toward observations from a specific session.

Response

{
  "results": [
    {
      "obsId": "obs_abc",
      "sessionId": "sess_xyz",
      "title": "JWT refresh token rotation implemented",
      "type": "file_write",
      "score": 0.94,
      "timestamp": "2025-01-14T15:30:00.000Z"
    },
    {
      "obsId": "obs_def",
      "sessionId": "sess_xyz",
      "title": "Chose jose over jsonwebtoken for Edge compatibility",
      "type": "decision",
      "score": 0.87,
      "timestamp": "2025-01-10T09:00:00.000Z"
    }
  ],
  "totalCount": 2
}
results
CompactSearchResult[]
Ranked array of matching observations. Each result includes obsId, sessionId, title, type, score (combined RRF score), and timestamp.
For the full observation detail (facts, narrative, files, concepts), call GET /agentmemory/observations?sessionId=<id> and filter by obsId. The compact response keeps payload sizes small for fast recall.

POST /agentmemory/search

A lower-level search endpoint that gives you more control over output format and token budget. Returns observations in one of three formats.
curl -X POST http://localhost:3111/agentmemory/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "database performance N+1 query",
    "project": "my-project",
    "limit": 5,
    "format": "narrative",
    "token_budget": 1500
  }'
query
string
required
Search query string. Must be non-empty.
limit
number
Max results to return. Must be a positive integer.
project
string
Scope to a project.
cwd
string
Working directory — biases results toward files under this path.
format
string
Output format. One of:
  • full — complete observation objects
  • compact — titles and scores only
  • narrative — prose summary of results
token_budget
number
Maximum tokens for the response. When set, Agent Memory trims results to fit within the budget.

POST /agentmemory/context

Generates a formatted context block ready to inject directly into an agent prompt. Unlike raw search, this endpoint applies token budgeting, deduplication, and formatting so the output is immediately usable.

Request

curl -X POST http://localhost:3111/agentmemory/context \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "project": "my-project",
    "budget": 2000
  }'
sessionId
string
required
The current session ID. Agent Memory uses this to avoid including observations from the current session in the recalled context.
project
string
required
Project name to recall context for.
budget
number
Maximum token budget for the returned context block (default 2000). Must be a positive integer. Agent Memory ranks and trims memories to fit within this budget.

Response

{
  "context": "## Previous Session Context\n\n**Architecture**\nAuth uses JWT tokens with 1-hour expiry. Refresh tokens rotate on each use (7-day sliding window) via jose in `src/auth/tokens.ts`.\n\n**Patterns**\nUses Drizzle ORM for Postgres queries. Avoids N+1 by preloading relations in service layer.\n\n**Recent work**\nAdded rate limiting middleware (`src/middleware/rate-limit.ts`). Tests in `test/rate-limit.test.ts`.\n",
  "tokenCount": 1243
}
context
string
Formatted context block as a markdown string. Inject this into the agent’s first turn or system prompt.
tokenCount
number
Approximate token count of the returned context.

POST /agentmemory/enrich

Enriches one or more file paths with related memories, past edit history, observed bugs, and relevant concepts. Use this to give the agent deep context about a specific file before it reads or edits it.

Request

curl -X POST http://localhost:3111/agentmemory/enrich \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "files": ["src/auth/tokens.ts", "src/middleware/auth.ts"],
    "terms": ["JWT", "refresh"],
    "project": "my-project"
  }'
sessionId
string
required
Current session ID.
files
string[]
required
Array of file paths to enrich. Must be non-empty. All paths must be strings.
terms
string[]
Additional search terms to broaden the enrichment query beyond the file path itself.
toolName
string
The tool that triggered this enrichment (e.g. "Read", "Edit"). Used for diagnostic tracing.
project
string
Project name for scoped recall.

Response

The response includes per-file summaries, related memories, past edit history from observations, known issues, and matched concepts.

POST /agentmemory/file-context

Returns a focused context block for specific files within a session. Lighter than /enrich — returns raw observation context for the given files without the full enrichment pass.
curl -X POST http://localhost:3111/agentmemory/file-context \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "files": ["src/db/index.ts"]
  }'

POST /agentmemory/graph/query

Traverses the knowledge graph starting from a node, querying by node type, or running a text search across node names. Returns matching nodes and their edges up to the specified depth.
This endpoint requires GRAPH_EXTRACTION_ENABLED=true in your config and a configured LLM provider. Without it, the endpoint returns 503 with instructions on how to enable the feature.

Request

curl -X POST http://localhost:3111/agentmemory/graph/query \
  -H "Content-Type: application/json" \
  -d '{
    "startNodeId": "node_JwtService",
    "maxDepth": 2,
    "limit": 50
  }'
{
  "startNodeId": "node_JwtService",
  "maxDepth": 2,
  "limit": 50
}
startNodeId
string
ID of the graph node to start traversal from. Returns all nodes reachable within maxDepth hops.
nodeType
string
Filter results to a specific node type. One of: file, function, concept, error, decision, pattern, library, person, project, preference, location, organization, event.
query
string
Text query matched against node names.
maxDepth
number
Maximum traversal depth from the start node (default 2).
limit
number
Maximum number of nodes to return.
offset
number
Pagination offset.

Response

{
  "nodes": [
    {
      "id": "node_JwtService",
      "type": "function",
      "name": "JwtService",
      "properties": { "file": "src/auth/tokens.ts" },
      "createdAt": "2025-01-10T09:00:00.000Z"
    }
  ],
  "edges": [
    {
      "id": "edge_001",
      "type": "imports",
      "sourceNodeId": "node_AuthMiddleware",
      "targetNodeId": "node_JwtService",
      "weight": 0.9
    }
  ],
  "depth": 2,
  "totalNodes": 14,
  "totalEdges": 11,
  "truncated": false
}
nodes
GraphNode[]
Matching graph nodes. Each node has an id, type, name, properties, sourceObservationIds (for provenance tracing), and createdAt.
edges
GraphEdge[]
Edges connecting the returned nodes. Edge types include: uses, imports, modifies, causes, fixes, depends_on, related_to, prefers, blocked_by, rejected, avoids, optimizes_for, and more.
truncated
boolean
true when the result was capped by limit. Use offset to page through the rest.
fromSnapshot
boolean
true when the response was served from the precomputed top-degree snapshot rather than a live graph enumeration. Snapshots are rebuilt via POST /agentmemory/graph/snapshot-rebuild.

Graph Management

EndpointMethodDescription
/agentmemory/graph/statsGETReturns aggregate counts: total nodes, edges, node type breakdown.
/agentmemory/graph/buildPOSTBackfills the graph from all existing compressed observations. Pass { "batchSize": 25 } to control batch size.
/agentmemory/graph/snapshot-rebuildPOSTRebuilds the top-degree snapshot used to serve large-corpus queries without timeout.
/agentmemory/graph/resetPOSTWipes graph state without touching observations. Use when the graph is corrupt or too large to rebuild safely.
/agentmemory/graph/extractPOSTExtracts graph nodes and edges from a batch of observations. Pass { "observations": [...] }.

GET /agentmemory/config/flags

Returns the current runtime feature flags, LLM provider kind, and embedding provider. Use this to check which optional features are active before calling feature-gated endpoints.
curl http://localhost:3111/agentmemory/config/flags

Response

{
  "version": "0.9.27",
  "provider": "openai",
  "embeddingProvider": "embeddings",
  "flags": [
    {
      "key": "GRAPH_EXTRACTION_ENABLED",
      "label": "Knowledge graph extraction",
      "enabled": false,
      "default": false,
      "needsLlm": true,
      "description": "Extracts entities and relations from observations into a knowledge graph.",
      "enableHow": "Set GRAPH_EXTRACTION_ENABLED=true and provide an LLM key, then restart.",
      "docsHref": "https://github.com/rohitg00/agentmemory#knowledge-graph",
      "affects": ["Graph", "Dashboard"]
    },
    {
      "key": "CONSOLIDATION_ENABLED",
      "label": "Memory consolidation",
      "enabled": true,
      "default": false,
      "needsLlm": true,
      "description": "Periodically summarizes sessions into semantic facts + procedures.",
      "enableHow": "Set CONSOLIDATION_ENABLED=true and provide an LLM key, then restart.",
      "affects": ["Dashboard", "Memories", "Crystals"]
    },
    {
      "key": "AGENTMEMORY_AUTO_COMPRESS",
      "label": "LLM-powered observation compression",
      "enabled": false,
      "default": false,
      "needsLlm": true,
      "description": "Every observation is compressed by the LLM for richer summaries (costs tokens). OFF uses zero-LLM synthetic compression.",
      "enableHow": "Set AGENTMEMORY_AUTO_COMPRESS=true and provide an LLM key."
    },
    {
      "key": "AGENTMEMORY_INJECT_CONTEXT",
      "label": "In-conversation context injection",
      "enabled": true,
      "default": false,
      "needsLlm": false,
      "description": "Hooks write recalled context into the agent's conversation. OFF captures in the background without injecting.",
      "enableHow": "Set AGENTMEMORY_INJECT_CONTEXT=true and restart."
    }
  ]
}
version
string
The running Agent Memory version.
provider
string
Detected LLM provider kind (anthropic, openai, gemini, openrouter, minimax, noop).
embeddingProvider
string
"embeddings" if an embedding provider is configured, "none" if BM25-only mode is active.
flags
FeatureFlag[]
Array of feature flag objects. Each flag includes key, label, enabled (current state), needsLlm (whether an LLM provider is required), and enableHow (exact instruction to turn it on).

POST /agentmemory/timeline

Returns a chronological slice of observations centred on a given anchor timestamp or observation ID. Useful for replaying what happened around a specific point in time.
curl -X POST http://localhost:3111/agentmemory/timeline \
  -H "Content-Type: application/json" \
  -d '{
    "anchor": "2025-01-15T10:35:00.000Z",
    "project": "my-project",
    "before": 5,
    "after": 5
  }'
anchor
string
required
ISO timestamp or observation ID to centre the timeline around.
project
string
Scope to a project.
before
number
Number of observations to include before the anchor.
after
number
Number of observations to include after the anchor.

POST /agentmemory/patterns

Detects recurring patterns across all sessions for a project — common code idioms, repeated workflows, and frequently co-occurring concepts.
curl -X POST http://localhost:3111/agentmemory/patterns \
  -H "Content-Type: application/json" \
  -d '{ "project": "my-project" }'

GET /agentmemory/diagnostics/followup

Returns a directional signal for the follow-up query rate — the proportion of searches followed by another search within a short window. High follow-up rates may indicate recall quality issues.
curl http://localhost:3111/agentmemory/diagnostics/followup
The follow-up rate is a directional signal only — it overcounts on legitimate query refinement (when you intentionally narrow a search). Tune the window with AGENTMEMORY_FOLLOWUP_WINDOW_SECONDS.