All configuration is stored in ~/.agentmemory/.env. Create it by running agentmemory init, then uncomment the variables you want to activate. Restart agentmemory after making changes — the server reads the file at startup and does not hot-reload. Every variable is optional. Without any keys set, agentmemory runs in a safe no-LLM mode: observations are indexed via synthetic compression, hybrid BM25 search still works, but LLM-backed summarisation, consolidation, and reflection are disabled.

Minimal Config

The smallest useful configuration — an LLM provider key plus the two most impactful features:
~/.agentmemory/.env
# LLM provider (pick one)
ANTHROPIC_API_KEY=your-key-here

# Enable key features
CONSOLIDATION_ENABLED=true
AGENTMEMORY_INJECT_CONTEXT=true
Run agentmemory doctor after editing to verify the daemon sees your changes.
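To make the "uncomment to activate" rule concrete, here is a minimal Python sketch of how a .env file of this shape could be read. parse_env is a hypothetical helper, not part of agentmemory, and the real loader may treat quoting and other edge cases differently.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank and commented-out lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # A variable stays inactive until its leading '#' is removed.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


sample = """\
# LLM provider (pick one)
ANTHROPIC_API_KEY=your-key-here

# Enable key features
CONSOLIDATION_ENABLED=true
# AGENTMEMORY_INJECT_CONTEXT=true
"""

active = parse_env(sample)
print(sorted(active))  # ['ANTHROPIC_API_KEY', 'CONSOLIDATION_ENABLED']
```

In this sample AGENTMEMORY_INJECT_CONTEXT is still commented out, so it does not appear among the active variables.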

LLM Providers

agentmemory uses a single LLM provider for compression, summarisation, consolidation, and reflection. Set exactly one provider key. The detection order when multiple keys are present is: OPENAI_API_KEY → MINIMAX_API_KEY → ANTHROPIC_API_KEY → GEMINI_API_KEY → OPENROUTER_API_KEY → noop.
| Variable | Provider | Default Model |
| --- | --- | --- |
| ANTHROPIC_API_KEY | Anthropic | claude-sonnet-4-20250514 |
| OPENAI_API_KEY | OpenAI | gpt-4o-mini |
| GEMINI_API_KEY / GOOGLE_API_KEY | Google Gemini | gemini-2.5-flash |
| OPENROUTER_API_KEY | OpenRouter | anthropic/claude-sonnet-4-20250514 |
| MINIMAX_API_KEY | MiniMax | MiniMax-M2.7 |
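The documented LLM detection order can be illustrated with a short Python sketch. It is not agentmemory's actual implementation: detect_llm_provider and the lowercase provider names are hypothetical, the GOOGLE_API_KEY alias is not modelled, and treating a blank value as unset is an assumption.

```python
# Documented detection order; "noop" is the no-LLM fallback.
LLM_DETECTION_ORDER = [
    ("OPENAI_API_KEY", "openai"),
    ("MINIMAX_API_KEY", "minimax"),
    ("ANTHROPIC_API_KEY", "anthropic"),
    ("GEMINI_API_KEY", "gemini"),
    ("OPENROUTER_API_KEY", "openrouter"),
]


def detect_llm_provider(env: dict[str, str]) -> str:
    """Return the first provider whose key is set, else 'noop'."""
    for var, provider in LLM_DETECTION_ORDER:
        # Assumption: an empty or whitespace-only value counts as unset.
        if env.get(var, "").strip():
            return provider
    return "noop"


# With both keys present, OpenAI is checked before Anthropic.
print(detect_llm_provider({"ANTHROPIC_API_KEY": "a", "OPENAI_API_KEY": "b"}))  # openai
print(detect_llm_provider({}))  # noop
```

Under this order, leaving a stray OPENAI_API_KEY in the file would take precedence over an Anthropic key, which is one reason to set exactly one provider key.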
Additional LLM variables:
| Variable | Description |
| --- | --- |
| ANTHROPIC_MODEL | Override the Anthropic model (e.g. claude-opus-4-5) |
| GEMINI_MODEL | Override the Gemini model |
| OPENROUTER_MODEL | Override the OpenRouter model |
| MINIMAX_MODEL | Override the MiniMax model |
| OPENAI_MODEL | Override the OpenAI model |
| OPENAI_BASE_URL | Override the OpenAI-compatible endpoint; use this for Ollama, vLLM, LM Studio, DeepSeek, or Azure |
| ANTHROPIC_BASE_URL | Override the Anthropic-compatible endpoint for proxies or Azure AI Foundry |
| MAX_TOKENS | Cap completion tokens for LLM calls (default: 4096) |
| AGENTMEMORY_LLM_TIMEOUT_MS | Outbound LLM request timeout in milliseconds (default: 60000) |
| FALLBACK_PROVIDERS | Comma-separated list of providers to try after the primary returns an error, e.g. anthropic,gemini |
If you set OPENROUTER_MODEL to a premium model like claude-sonnet or gpt-4o, background compression can cost $5+ per day under active use. Cheaper alternatives with comparable quality for memory compression: deepseek/deepseek-v4-pro, deepseek/deepseek-chat, qwen/qwen3-coder.

Embedding Providers

Embeddings power the vector leg of agentmemory’s hybrid search. Without an embedding provider, search falls back to BM25-only mode. The detection order is: EMBEDDING_PROVIDER override → GEMINI_API_KEY → OPENAI_API_KEY → VOYAGE_API_KEY → COHERE_API_KEY → OPENROUTER_API_KEY → local (offline).
| Variable | Provider | Model |
| --- | --- | --- |
| EMBEDDING_PROVIDER=local | Local (offline, no API key needed) | all-MiniLM-L6-v2 (384-dim) |
| OPENAI_API_KEY | OpenAI | text-embedding-3-small |
| VOYAGE_API_KEY | Voyage AI (optimised for code) | voyage-code-3 |
| COHERE_API_KEY | Cohere | embed-english-v3.0 |
| GEMINI_API_KEY | Google Gemini | gemini-embedding-001 |
Additional embedding variables:
| Variable | Description |
| --- | --- |
| EMBEDDING_PROVIDER | Force a specific provider: local, openai, voyage, cohere, gemini, or openrouter |
| OPENAI_EMBEDDING_MODEL | Override the OpenAI embedding model |
| OPENAI_EMBEDDING_DIMENSIONS | Required when the model is not in the known-models table |
| OPENROUTER_EMBEDDING_MODEL | Embedding model when using OpenRouter (default: openai/text-embedding-3-small) |
EMBEDDING_PROVIDER=local runs entirely offline using the bundled all-MiniLM-L6-v2 model. It is slower on first use (model download) but requires no API key and works in air-gapped environments.
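The sketch below illustrates how the EMBEDDING_PROVIDER override interacts with key-based detection for embeddings. detect_embedding_provider is a hypothetical name, and the handling of blank values is an assumption rather than documented behaviour.

```python
# Key-based detection order for embeddings, as documented.
EMBEDDING_DETECTION_ORDER = [
    ("GEMINI_API_KEY", "gemini"),
    ("OPENAI_API_KEY", "openai"),
    ("VOYAGE_API_KEY", "voyage"),
    ("COHERE_API_KEY", "cohere"),
    ("OPENROUTER_API_KEY", "openrouter"),
]


def detect_embedding_provider(env: dict[str, str]) -> str:
    """An explicit EMBEDDING_PROVIDER wins; otherwise the first key found."""
    override = env.get("EMBEDDING_PROVIDER", "").strip().lower()
    if override:
        return override
    for var, provider in EMBEDDING_DETECTION_ORDER:
        if env.get(var, "").strip():
            return provider
    return "local"  # documented final fallback (offline model)


# The override keeps embeddings offline even though an OpenAI key is present.
print(detect_embedding_provider({"EMBEDDING_PROVIDER": "local", "OPENAI_API_KEY": "k"}))  # local
```

This is the pattern to reach for when you want an API-backed LLM but embeddings that never leave the machine.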

Feature Flags

All feature flags default to false unless noted. Enable them by setting the variable to true in ~/.agentmemory/.env.
| Variable | Default | Description |
| --- | --- | --- |
| AGENTMEMORY_AUTO_COMPRESS | false | Run LLM compression on every observation batch as it is captured. Requires a provider key. Disabled by default because synthetic compression handles most cases without burning API tokens. |
| AGENTMEMORY_INJECT_CONTEXT | false | Inject recalled memories into the agent’s conversation at session start. When disabled, hooks capture observations for background indexing but do not modify the conversation. |
| CONSOLIDATION_ENABLED | auto | Run the 4-tier consolidation pipeline (observations → memories → semantic → procedural) at session end. Defaults to true when any LLM provider key is set; set to false to disable even with a key. |
| GRAPH_EXTRACTION_ENABLED | false | Extract knowledge graph entities and relationships on every remember call. Powers the graph-traversal recall path and the graph stats shown in agentmemory status. |
| AGENTMEMORY_SLOTS | false | Enable pinned, editable memory slots that persist across sessions. |
| AGENTMEMORY_REFLECT | false | Automatically synthesize lessons from memories at the end of each session. |
| SNAPSHOT_ENABLED | false | Periodically export a git-versioned snapshot of the memory state and BM25/vector indexes to ~/.agentmemory/snapshots/. |
| CLAUDE_MEMORY_BRIDGE | false | Bi-directionally sync compressed memories into the CLAUDE.md file in your project. Requires CLAUDE_PROJECT_PATH to also be set. |
| AGENTMEMORY_TOOLS | all | Tool surface exposed to MCP clients. all enables all 53 tools; core limits to the 8 essential tools for a lighter footprint. |
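CONSOLIDATION_ENABLED is the one flag whose default depends on other variables. The sketch below is one reading of that "auto" behaviour, not the actual implementation: consolidation_enabled is a hypothetical helper, and it assumes that consolidation stays off without a provider key even if the flag is set to true, since consolidation is LLM-backed.

```python
LLM_KEY_VARS = (
    "OPENAI_API_KEY",
    "MINIMAX_API_KEY",
    "ANTHROPIC_API_KEY",
    "GEMINI_API_KEY",
    "OPENROUTER_API_KEY",
)


def consolidation_enabled(env: dict[str, str]) -> bool:
    """Effective consolidation state under the documented 'auto' default."""
    explicit = env.get("CONSOLIDATION_ENABLED", "").strip().lower()
    if explicit == "false":
        return False  # explicit opt-out wins even when a key is present
    # Assumption: without any provider key there is no LLM to consolidate with.
    return any(env.get(var, "").strip() for var in LLM_KEY_VARS)


print(consolidation_enabled({"ANTHROPIC_API_KEY": "k"}))  # True
print(consolidation_enabled({"ANTHROPIC_API_KEY": "k", "CONSOLIDATION_ENABLED": "false"}))  # False
```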
Additional behaviour variables:
| Variable | Description |
| --- | --- |
| CONSOLIDATION_DECAY_DAYS | Age in days after which non-reinforced memories decay during consolidation (default: 30) |
| GRAPH_EXTRACTION_BATCH_SIZE | Memories processed per graph-extraction batch (default: 8, tuned for the default LLM context window) |
| SNAPSHOT_DIR | Directory for periodic snapshots (default: ~/.agentmemory/snapshots) |
| SNAPSHOT_INTERVAL | Seconds between snapshots (default: 3600) |
| CLAUDE_PROJECT_PATH | Absolute path to your project, required when CLAUDE_MEMORY_BRIDGE=true |
| CLAUDE_MEMORY_LINE_BUDGET | Max lines the bridge writes into CLAUDE.md (default: 200) |
| SUMMARIZE_CHUNK_SIZE | Observations per chunk during large-session summarisation (default: 400). Primarily matters for bulk-imported JSONL sessions. |
| SUMMARIZE_CHUNK_CONCURRENCY | Parallel LLM calls during chunked summarisation (default: 6) |

Ports

| Variable | Default | Purpose |
| --- | --- | --- |
| III_REST_PORT | 3111 | REST API and MCP HTTP endpoint |
| AGENTMEMORY_VIEWER_PORT | 3113 | Real-time web viewer |
Overriding III_REST_PORT automatically shifts all derived ports, so a single variable is all you need to run a second instance on the same machine.
Additional port and URL variables:
| Variable | Description |
| --- | --- |
| AGENTMEMORY_URL | Full REST base URL (e.g. http://localhost:3111). Honored by status, doctor, and the MCP shim. |
| AGENTMEMORY_VIEWER_URL | Override the viewer URL printed by agentmemory status |
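A client script that needs to reach the REST API could resolve its base URL along the following lines. This is a sketch under assumptions: rest_base_url is a hypothetical helper, and the precedence of AGENTMEMORY_URL over III_REST_PORT is assumed, not documented.

```python
def rest_base_url(env: dict[str, str]) -> str:
    """Resolve the REST base URL from the documented variables."""
    url = env.get("AGENTMEMORY_URL", "").strip()
    if url:
        # Assumption: a full URL, when given, takes precedence over the port.
        return url.rstrip("/")
    port = int(env.get("III_REST_PORT", "3111"))  # 3111 is the documented default
    return f"http://localhost:{port}"


print(rest_base_url({}))                         # http://localhost:3111
print(rest_base_url({"III_REST_PORT": "4111"}))  # http://localhost:4111
```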

Search Tuning

Adjust the balance between keyword and semantic search, and control how much context is injected per session.
| Variable | Default | Description |
| --- | --- | --- |
| BM25_WEIGHT | 0.4 | Weight for the BM25 keyword search stream in hybrid ranking |
| VECTOR_WEIGHT | 0.6 | Weight for the vector embedding search stream in hybrid ranking |
| AGENTMEMORY_GRAPH_WEIGHT | 0.2 | Bonus weight applied to results found via knowledge graph traversal |
| TOKEN_BUDGET | 2000 | Maximum tokens injected as context per session via mem::context |
| MAX_OBS_PER_SESSION | 500 | Maximum observations captured per session before consolidation is triggered |
BM25_WEIGHT and VECTOR_WEIGHT are independent — they do not need to sum to 1.0. The graph weight is an additive bonus on top of the hybrid score, not a separate stream.
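A small worked example may help when tuning these weights. The formula below is an assumption based on the description above (a weighted sum of the two streams plus a flat bonus for graph hits), with per-stream scores assumed to be normalised to the range 0 to 1. hybrid_score is a hypothetical function; the real ranking code may differ.

```python
def hybrid_score(
    bm25: float,
    vector: float,
    via_graph: bool = False,
    bm25_weight: float = 0.4,    # BM25_WEIGHT default
    vector_weight: float = 0.6,  # VECTOR_WEIGHT default
    graph_weight: float = 0.2,   # AGENTMEMORY_GRAPH_WEIGHT default
) -> float:
    """Weighted sum of both streams plus an additive graph bonus."""
    score = bm25_weight * bm25 + vector_weight * vector
    if via_graph:
        score += graph_weight  # bonus on top, not a third stream
    return score


# The same result, with and without a graph-traversal hit.
plain = hybrid_score(bm25=0.5, vector=0.5)
boosted = hybrid_score(bm25=0.5, vector=0.5, via_graph=True)
print(round(plain, 3), round(boosted, 3))  # 0.5 0.7
```

Because the weights are independent, raising BM25_WEIGHT alone increases the influence of keyword matches without requiring a matching decrease in VECTOR_WEIGHT.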

Multi-Agent Scoping

When you run multiple agents or users against the same memory server, use these variables to namespace memories and control whether agents share or isolate their recall.
| Variable | Description |
| --- | --- |
| TEAM_ID | Team namespace; memories are scoped to this identifier when set alongside USER_ID |
| USER_ID | Individual user identity within the team |
| AGENT_ID | Agent identity for per-agent memory scoping. Trimmed to 128 characters. |
| AGENTMEMORY_AGENT_SCOPE | shared (default): tag memories with AGENT_ID but do not filter recall. isolated: tag and filter so each agent only recalls its own memories. |
With AGENTMEMORY_AGENT_SCOPE=shared, all agents see all memories. Use this when you want every agent to benefit from the team’s accumulated context.
~/.agentmemory/.env
TEAM_ID=acme-eng
USER_ID=alice
AGENTMEMORY_AGENT_SCOPE=shared
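The difference between the two scope values can be sketched as a recall filter. The Memory class and recall function are hypothetical and only illustrate the documented semantics: memories are always tagged with an agent identity, and filtering happens only in isolated mode.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    agent_id: str  # every memory is tagged, regardless of scope


def recall(memories: list[Memory], agent_id: str, scope: str = "shared") -> list[Memory]:
    """Return the memories visible to agent_id under the given scope."""
    if scope == "isolated":
        # Tag and filter: an agent only recalls what it wrote itself.
        return [m for m in memories if m.agent_id == agent_id]
    # shared: tags are kept, but recall is not filtered.
    return list(memories)


store = [Memory("use pnpm, not npm", "planner"), Memory("tests live in /spec", "coder")]
print(len(recall(store, "coder", scope="shared")))    # 2
print(len(recall(store, "coder", scope="isolated")))  # 1
```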

Security

| Variable | Description |
| --- | --- |
| AGENTMEMORY_SECRET | Bearer token required on all API and viewer requests when set. Without this, the REST endpoints are open on loopback. Set it when you expose agentmemory beyond localhost or run behind a reverse proxy. |
When AGENTMEMORY_SECRET is set, all CLI commands (status, doctor, import-jsonl, etc.) automatically attach the Authorization: Bearer <secret> header so they continue to work without any extra configuration.
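Custom scripts that call the REST API directly do not get this header for free. The sketch below shows one way to attach it with Python's standard library. build_request is a hypothetical helper, the URL is the default base address with no specific endpoint implied, and no request is actually sent.

```python
import urllib.request
from typing import Optional


def build_request(url: str, secret: Optional[str] = None) -> urllib.request.Request:
    """Build a request, adding the bearer token only when a secret is set."""
    headers = {}
    if secret:
        headers["Authorization"] = f"Bearer {secret}"
    return urllib.request.Request(url, headers=headers)


req = build_request("http://localhost:3111", secret="example-secret")
print(req.get_header("Authorization"))  # Bearer example-secret
```

In practice the secret would be read from the AGENTMEMORY_SECRET environment variable and never hard-coded in a script.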
Do not commit ~/.agentmemory/.env to version control. It may contain API keys and your bearer token.

Use agentmemory status to verify which features are active after editing your config. The status panel shows the detected LLM provider, embedding provider, and a checklist of every enabled feature flag.