Search and Context API — Smart Search and Enrichment

These endpoints power Agent Memory’s retrieval layer — use them to search past observations with hybrid BM25 + vector + graph ranking, generate context blocks ready to inject into prompts, enrich file paths with related memories and known bugs, and traverse the knowledge graph. All retrieval results are session-diversified (at most 3 results per session) and ranked with Reciprocal Rank Fusion (RRF, k=60).

POST /agentmemory/smart-search

Runs a hybrid three-stream search across all memories and observations. This is the primary search endpoint — it combines keyword matching (BM25), semantic similarity (vector), and entity-linked knowledge graph traversal, then fuses the ranked results.

You must provide either a query string or an expandIds array. Providing both is allowed — expandIds expands the context around specific observations while query runs the full retrieval pipeline.

Request

curl -X POST http://localhost:3111/agentmemory/smart-search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "how does the auth token refresh work",
    "project": "my-project",
    "limit": 10,
    "includeLessons": true
  }'

string

Natural language search query. Agent Memory tokenizes this for BM25, embeds it for vector similarity, and extracts named entities for graph traversal — all in a single call.

array

Array of observation IDs (strings or { obsId, sessionId } objects) to expand context around. Used internally by the viewer and MCP tools to fetch neighbouring observations.

string

Scope the search to a specific project. Omit to search across all projects.

number

Maximum number of results to return (default 10). Must be a positive integer.

boolean

When true, also searches the lessons store (crystallized learnings from past sessions) and merges them into the results.

string

Override the agent scope for this search. Useful when routing queries on behalf of a specific agent in multi-agent setups. You can also pass ?agentId=<id> as a query parameter.

string

Bias results toward observations from a specific session.

Response

{
  "results": [
    {
      "obsId": "obs_abc",
      "sessionId": "sess_xyz",
      "title": "JWT refresh token rotation implemented",
      "type": "file_write",
      "score": 0.94,
      "timestamp": "2025-01-14T15:30:00.000Z"
    },
    {
      "obsId": "obs_def",
      "sessionId": "sess_xyz",
      "title": "Chose jose over jsonwebtoken for Edge compatibility",
      "type": "decision",
      "score": 0.87,
      "timestamp": "2025-01-10T09:00:00.000Z"
    }
  ],
  "totalCount": 2
}

CompactSearchResult[]

Ranked array of matching observations. Each result includes obsId, sessionId, title, type, score (combined RRF score), and timestamp.

For the full observation detail (facts, narrative, files, concepts), call GET /agentmemory/observations?sessionId=<id> and filter by obsId. The compact response keeps payload sizes small for fast recall.

POST /agentmemory/search

A lower-level search endpoint that gives you more control over output format and token budget. Returns observations in one of three formats.

curl -X POST http://localhost:3111/agentmemory/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "database performance N+1 query",
    "project": "my-project",
    "limit": 5,
    "format": "narrative",
    "token_budget": 1500
  }'

string

required

Search query string. Must be non-empty.

number

Max results to return. Must be a positive integer.

string

Scope to a project.

string

Working directory — biases results toward files under this path.

string

Output format. One of:

full — complete observation objects
compact — titles and scores only
narrative — prose summary of results

number

Maximum tokens for the response. When set, Agent Memory trims results to fit within the budget.

POST /agentmemory/context

Generates a formatted context block ready to inject directly into an agent prompt. Unlike raw search, this endpoint applies token budgeting, deduplication, and formatting so the output is immediately usable.

Request

curl -X POST http://localhost:3111/agentmemory/context \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "project": "my-project",
    "budget": 2000
  }'

string

required

The current session ID. Agent Memory uses this to avoid including observations from the current session in the recalled context.

string

required

Project name to recall context for.

number

Maximum token budget for the returned context block (default 2000). Must be a positive integer. Agent Memory ranks and trims memories to fit within this budget.

Response

{
  "context": "## Previous Session Context\n\n**Architecture**\nAuth uses JWT tokens with 1-hour expiry. Refresh tokens rotate on each use (7-day sliding window) via jose in `src/auth/tokens.ts`.\n\n**Patterns**\nUses Drizzle ORM for Postgres queries. Avoids N+1 by preloading relations in service layer.\n\n**Recent work**\nAdded rate limiting middleware (`src/middleware/rate-limit.ts`). Tests in `test/rate-limit.test.ts`.\n",
  "tokenCount": 1243
}

string

Formatted context block as a markdown string. Inject this into the agent’s first turn or system prompt.

number

Approximate token count of the returned context.

POST /agentmemory/enrich

Enriches one or more file paths with related memories, past edit history, observed bugs, and relevant concepts. Use this to give the agent deep context about a specific file before it reads or edits it.

Request

curl -X POST http://localhost:3111/agentmemory/enrich \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "files": ["src/auth/tokens.ts", "src/middleware/auth.ts"],
    "terms": ["JWT", "refresh"],
    "project": "my-project"
  }'

string

required

Current session ID.

string[]

required

Array of file paths to enrich. Must be non-empty. All paths must be strings.

string[]

Additional search terms to broaden the enrichment query beyond the file path itself.

string

The tool that triggered this enrichment (e.g. "Read", "Edit"). Used for diagnostic tracing.

string

Project name for scoped recall.

Response

The response includes per-file summaries, related memories, past edit history from observations, known issues, and matched concepts.

POST /agentmemory/file-context

Returns a focused context block for specific files within a session. Lighter than /enrich — returns raw observation context for the given files without the full enrichment pass.

curl -X POST http://localhost:3111/agentmemory/file-context \
  -H "Content-Type: application/json" \
  -d '{
    "sessionId": "sess_abc123",
    "files": ["src/db/index.ts"]
  }'

POST /agentmemory/graph/query

Traverses the knowledge graph starting from a node, querying by node type, or running a text search across node names. Returns matching nodes and their edges up to the specified depth.

This endpoint requires GRAPH_EXTRACTION_ENABLED=true in your config and a configured LLM provider. Without it, the endpoint returns 503 with instructions on how to enable the feature.

Request

curl -X POST http://localhost:3111/agentmemory/graph/query \
  -H "Content-Type: application/json" \
  -d '{
    "startNodeId": "node_JwtService",
    "maxDepth": 2,
    "limit": 50
  }'

Start from a node
Query by type
Text search

{
  "startNodeId": "node_JwtService",
  "maxDepth": 2,
  "limit": 50
}

{
  "nodeType": "function",
  "limit": 25,
  "offset": 0
}

{
  "query": "JWT token rotation",
  "maxDepth": 1,
  "limit": 20
}

string

ID of the graph node to start traversal from. Returns all nodes reachable within maxDepth hops.

string

Filter results to a specific node type. One of: file, function, concept, error, decision, pattern, library, person, project, preference, location, organization, event.

string

Text query matched against node names.

number

Maximum traversal depth from the start node (default 2).

number

Maximum number of nodes to return.

number

Pagination offset.

Response

{
  "nodes": [
    {
      "id": "node_JwtService",
      "type": "function",
      "name": "JwtService",
      "properties": { "file": "src/auth/tokens.ts" },
      "createdAt": "2025-01-10T09:00:00.000Z"
    }
  ],
  "edges": [
    {
      "id": "edge_001",
      "type": "imports",
      "sourceNodeId": "node_AuthMiddleware",
      "targetNodeId": "node_JwtService",
      "weight": 0.9
    }
  ],
  "depth": 2,
  "totalNodes": 14,
  "totalEdges": 11,
  "truncated": false
}

GraphNode[]

Matching graph nodes. Each node has an id, type, name, properties, sourceObservationIds (for provenance tracing), and createdAt.

GraphEdge[]

Edges connecting the returned nodes. Edge types include: uses, imports, modifies, causes, fixes, depends_on, related_to, prefers, blocked_by, rejected, avoids, optimizes_for, and more.

boolean

true when the result was capped by limit. Use offset to page through the rest.

boolean

true when the response was served from the precomputed top-degree snapshot rather than a live graph enumeration. Snapshots are rebuilt via POST /agentmemory/graph/snapshot-rebuild.

Graph Management

Endpoint	Method	Description
`/agentmemory/graph/stats`	GET	Returns aggregate counts: total nodes, edges, node type breakdown.
`/agentmemory/graph/build`	POST	Backfills the graph from all existing compressed observations. Pass `{ "batchSize": 25 }` to control batch size.
`/agentmemory/graph/snapshot-rebuild`	POST	Rebuilds the top-degree snapshot used to serve large-corpus queries without timeout.
`/agentmemory/graph/reset`	POST	Wipes graph state without touching observations. Use when the graph is corrupt or too large to rebuild safely.
`/agentmemory/graph/extract`	POST	Extracts graph nodes and edges from a batch of observations. Pass `{ "observations": [...] }`.

GET /agentmemory/config/flags

Returns the current runtime feature flags, LLM provider kind, and embedding provider. Use this to check which optional features are active before calling feature-gated endpoints.

curl http://localhost:3111/agentmemory/config/flags

Response

{
  "version": "0.9.27",
  "provider": "openai",
  "embeddingProvider": "embeddings",
  "flags": [
    {
      "key": "GRAPH_EXTRACTION_ENABLED",
      "label": "Knowledge graph extraction",
      "enabled": false,
      "default": false,
      "needsLlm": true,
      "description": "Extracts entities and relations from observations into a knowledge graph.",
      "enableHow": "Set GRAPH_EXTRACTION_ENABLED=true and provide an LLM key, then restart.",
      "docsHref": "https://github.com/rohitg00/agentmemory#knowledge-graph",
      "affects": ["Graph", "Dashboard"]
    },
    {
      "key": "CONSOLIDATION_ENABLED",
      "label": "Memory consolidation",
      "enabled": true,
      "default": false,
      "needsLlm": true,
      "description": "Periodically summarizes sessions into semantic facts + procedures.",
      "enableHow": "Set CONSOLIDATION_ENABLED=true and provide an LLM key, then restart.",
      "affects": ["Dashboard", "Memories", "Crystals"]
    },
    {
      "key": "AGENTMEMORY_AUTO_COMPRESS",
      "label": "LLM-powered observation compression",
      "enabled": false,
      "default": false,
      "needsLlm": true,
      "description": "Every observation is compressed by the LLM for richer summaries (costs tokens). OFF uses zero-LLM synthetic compression.",
      "enableHow": "Set AGENTMEMORY_AUTO_COMPRESS=true and provide an LLM key."
    },
    {
      "key": "AGENTMEMORY_INJECT_CONTEXT",
      "label": "In-conversation context injection",
      "enabled": true,
      "default": false,
      "needsLlm": false,
      "description": "Hooks write recalled context into the agent's conversation. OFF captures in the background without injecting.",
      "enableHow": "Set AGENTMEMORY_INJECT_CONTEXT=true and restart."
    }
  ]
}

string

The running Agent Memory version.

string

Detected LLM provider kind (anthropic, openai, gemini, openrouter, minimax, noop).

string

"embeddings" if an embedding provider is configured, "none" if BM25-only mode is active.

FeatureFlag[]

Array of feature flag objects. Each flag includes key, label, enabled (current state), needsLlm (whether an LLM provider is required), and enableHow (exact instruction to turn it on).

POST /agentmemory/timeline

Returns a chronological slice of observations centred on a given anchor timestamp or observation ID. Useful for replaying what happened around a specific point in time.

curl -X POST http://localhost:3111/agentmemory/timeline \
  -H "Content-Type: application/json" \
  -d '{
    "anchor": "2025-01-15T10:35:00.000Z",
    "project": "my-project",
    "before": 5,
    "after": 5
  }'

string

required

ISO timestamp or observation ID to centre the timeline around.

string

Scope to a project.

number

Number of observations to include before the anchor.

number

Number of observations to include after the anchor.

POST /agentmemory/patterns

Detects recurring patterns across all sessions for a project — common code idioms, repeated workflows, and frequently co-occurring concepts.

curl -X POST http://localhost:3111/agentmemory/patterns \
  -H "Content-Type: application/json" \
  -d '{ "project": "my-project" }'

GET /agentmemory/diagnostics/followup

Returns a directional signal for the follow-up query rate — the proportion of searches followed by another search within a short window. High follow-up rates may indicate recall quality issues.

curl http://localhost:3111/agentmemory/diagnostics/followup

The follow-up rate is a directional signal only — it overcounts on legitimate query refinement (when you intentionally narrow a search). Tune the window with AGENTMEMORY_FOLLOWUP_WINDOW_SECONDS.

CLI

MCP Tools

REST API

Search and Context API — Smart Search and Enrichment

POST /agentmemory/smart-search

Request

Response

POST /agentmemory/search

POST /agentmemory/context

Request

Response

POST /agentmemory/enrich

Request

Response

POST /agentmemory/file-context

POST /agentmemory/graph/query

Request

Response

Graph Management

GET /agentmemory/config/flags

Response

POST /agentmemory/timeline

POST /agentmemory/patterns

GET /agentmemory/diagnostics/followup

​POST /agentmemory/smart-search

​Request

​Response

​POST /agentmemory/search

​POST /agentmemory/context

​Request

​Response

​POST /agentmemory/enrich

​Request

​Response

​POST /agentmemory/file-context

​POST /agentmemory/graph/query

​Request

​Response

​Graph Management

​GET /agentmemory/config/flags

​Response

​POST /agentmemory/timeline

​POST /agentmemory/patterns

​GET /agentmemory/diagnostics/followup

POST /agentmemory/smart-search

Request

Response

POST /agentmemory/search

POST /agentmemory/context

Request

Response

POST /agentmemory/enrich

Request

Response

POST /agentmemory/file-context

POST /agentmemory/graph/query

Request

Response

Graph Management

GET /agentmemory/config/flags

Response

POST /agentmemory/timeline

POST /agentmemory/patterns

GET /agentmemory/diagnostics/followup