Agent Memory uses an LLM for three things: compressing observations into structured memories, running the consolidation pipeline at session end, and extracting entities and relationships into the knowledge graph. None of these requires a key. Without one, Agent Memory runs in BM25-only mode, synthetic compression handles indexing, and recall still works. Adding an LLM key, however, improves the quality of long-term memory and semantic search.

Supported Providers

Anthropic

Default model: claude-sonnet-4-20250514. High quality for both compression and knowledge graph extraction.

OpenAI

Default model: gpt-4o-mini. Cost-effective for continuous background compression. Override with OPENAI_MODEL=gpt-4o.

Google Gemini

Default model: gemini-2.5-flash. Also auto-enables Gemini embeddings (gemini-embedding-001). Supports a free tier.

OpenRouter

Default model: anthropic/claude-sonnet-4-20250514. Routes to any model in the OpenRouter catalog — useful for cost optimization.

MiniMax

Default model: MiniMax-M2.7. Anthropic-compatible API. Good alternative for high-volume compression workloads.

Local / Ollama

Uses any OpenAI-compatible server. Zero API cost, fully offline. Works with Ollama, LM Studio, vLLM, and llama.cpp.

Setup for Each Provider

Add the relevant key to ~/.agentmemory/.env, then restart Agent Memory. The examples below use Anthropic:
# ~/.agentmemory/.env
ANTHROPIC_API_KEY=sk-ant-...
To pin a specific model:
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514
To route through an Anthropic-compatible proxy or Azure AI Foundry:
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://your-proxy.example.com
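Before restarting, it can help to confirm which keys the file actually defines. The following is a minimal sketch, not part of Agent Memory: the helper names `parse_env_text`, `read_env_file`, and `configured_keys` are hypothetical, and the parsing rules (full-line `#` comments, inline comments introduced by a space and `#`, optional quotes) are assumptions about a typical .env dialect rather than a description of how Agent Memory reads the file.

```python
from pathlib import Path
from typing import Dict, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: " #" starts an inline comment, as in the examples on this page.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value.strip('"').strip("'")
    return values


def read_env_file(path: Path) -> Dict[str, str]:
    """Read the file if it exists; a missing file yields an empty mapping."""
    if not path.exists():
        return {}
    return parse_env_text(path.read_text(encoding="utf-8"))


def configured_keys(values: Dict[str, str]) -> List[str]:
    """Return the names of all *_API_KEY variables that have a non-empty value."""
    return sorted(k for k, v in values.items() if k.endswith("_API_KEY") and v)


if __name__ == "__main__":
    env_path = Path.home() / ".agentmemory" / ".env"
    print(configured_keys(read_env_file(env_path)))
```

Running the script prints only the variable names, never the key values, so the output is safe to paste into a bug report.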

Embedding Providers

Embeddings are configured separately from the LLM provider. Agent Memory auto-detects the embedding provider from your available keys, or you can set EMBEDDING_PROVIDER explicitly.
| Provider | Variable | Model | Notes |
| --- | --- | --- | --- |
| Local (default) | EMBEDDING_PROVIDER=local | all-MiniLM-L6-v2 (384-dim) | Free, offline, no key required. Ships bundled via @xenova/transformers. |
| Voyage AI | VOYAGE_API_KEY=pa-... | voyage-code-3 | Recommended for code projects. Optimized for code semantics and retrieval. |
| OpenAI | OPENAI_API_KEY=sk-... | text-embedding-3-small (1536-dim) | Enabled automatically when OPENAI_API_KEY is set. Override model with OPENAI_EMBEDDING_MODEL. |
| Gemini | GEMINI_API_KEY=... | gemini-embedding-001 | Enabled automatically when GEMINI_API_KEY is set. Supports 100+ languages. |
| Cohere | COHERE_API_KEY=... | embed-english-v3.0 | General-purpose embeddings with a free trial tier. |
| OpenRouter | OPENROUTER_API_KEY=... | configurable | Set OPENROUTER_EMBEDDING_MODEL to select the model. |

Provider Auto-Detection

Agent Memory checks for API keys in a fixed priority order and activates the first one it finds. You don't need to set EMBEDDING_PROVIDER or any provider name explicitly; setting an API key is enough.

Detection order for LLM providers:
OPENAI_API_KEY → MINIMAX_API_KEY → ANTHROPIC_API_KEY → GEMINI_API_KEY → OPENROUTER_API_KEY → noop
Detection order for embedding providers:
EMBEDDING_PROVIDER (explicit) → GEMINI_API_KEY → OPENAI_API_KEY → VOYAGE_API_KEY → COHERE_API_KEY → OPENROUTER_API_KEY → local
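The two detection orders can be read as a first-match scan over environment variables. The sketch below illustrates that reading; it is not Agent Memory's actual implementation. The variable orders are copied from the lists above, while the function names and the lowercase provider labels (`"openai"`, `"voyage"`, and so on) are illustrative assumptions.

```python
import os
from typing import List, Mapping, Optional, Tuple

# (environment variable, provider label) pairs, in the order listed above.
# The labels are illustrative; only the variable order comes from the docs.
LLM_ORDER: List[Tuple[str, str]] = [
    ("OPENAI_API_KEY", "openai"),
    ("MINIMAX_API_KEY", "minimax"),
    ("ANTHROPIC_API_KEY", "anthropic"),
    ("GEMINI_API_KEY", "gemini"),
    ("OPENROUTER_API_KEY", "openrouter"),
]
EMBEDDING_ORDER: List[Tuple[str, str]] = [
    ("GEMINI_API_KEY", "gemini"),
    ("OPENAI_API_KEY", "openai"),
    ("VOYAGE_API_KEY", "voyage"),
    ("COHERE_API_KEY", "cohere"),
    ("OPENROUTER_API_KEY", "openrouter"),
]


def _first_set(order: List[Tuple[str, str]], env: Mapping[str, str], default: str) -> str:
    """Return the label of the first variable with a non-empty value."""
    for var, label in order:
        if env.get(var, "").strip():
            return label
    return default


def detect_llm_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """First matching LLM key wins; 'noop' when no key is set."""
    env = os.environ if env is None else env
    return _first_set(LLM_ORDER, env, "noop")


def detect_embedding_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """An explicit EMBEDDING_PROVIDER wins; otherwise first matching key, else 'local'."""
    env = os.environ if env is None else env
    explicit = env.get("EMBEDDING_PROVIDER", "").strip()
    if explicit:
        return explicit
    return _first_set(EMBEDDING_ORDER, env, "local")
```

Under the orders as listed, an environment that sets both ANTHROPIC_API_KEY and OPENAI_API_KEY resolves to OpenAI for the LLM, and one that sets both OPENAI_API_KEY and GEMINI_API_KEY resolves to Gemini for embeddings. If that is not the intended pairing, setting EMBEDDING_PROVIDER explicitly removes the ambiguity on the embedding side.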

Fallback Chain

If your primary LLM provider returns an error (for example, a rate limit or temporary outage), Agent Memory can automatically retry with a secondary provider:
# ~/.agentmemory/.env
ANTHROPIC_API_KEY=sk-ant-...
FALLBACK_PROVIDERS=openai,gemini
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
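As a rough mental model of this configuration, the sketch below tries the primary provider first and then each name from FALLBACK_PROVIDERS in order. It is an illustration under assumptions, not Agent Memory's code: `parse_fallbacks`, `call_with_fallback`, and the `call` callback are hypothetical names, and treating every exception as a reason to move on is a simplification.

```python
from typing import Callable, List, Optional, Sequence


def parse_fallbacks(value: str) -> List[str]:
    """Split a comma-separated FALLBACK_PROVIDERS value into ordered names."""
    return [name.strip().lower() for name in value.split(",") if name.strip()]


def call_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    call: Callable[[str], str],
) -> Optional[str]:
    """Try the primary provider, then each fallback in order.

    Returns the first successful result, or None when every provider fails
    so the caller can skip the operation and retry it later.
    """
    for name in [primary, *fallbacks]:
        try:
            return call(name)
        except Exception:
            # Simplification: any error (rate limit, outage, ...) moves to the next provider.
            continue
    return None


if __name__ == "__main__":
    def fake_call(provider: str) -> str:
        # Hypothetical stand-in for a real LLM request; only gemini succeeds here.
        if provider != "gemini":
            raise RuntimeError(f"{provider} unavailable")
        return f"compressed by {provider}"

    print(call_with_fallback("anthropic", parse_fallbacks("openai,gemini"), fake_call))
```

Returning None rather than raising keeps the caller's logic simple: a missing result means the work stays queued.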
Agent Memory tries each fallback in the order listed. If all providers fail, the operation is skipped and retried on the next session.

For the best recall quality on code-heavy projects:
# ~/.agentmemory/.env
ANTHROPIC_API_KEY=sk-ant-...      # or OPENAI_API_KEY
VOYAGE_API_KEY=pa-...              # voyage-code-3: best code embeddings
CONSOLIDATION_ENABLED=true
GRAPH_EXTRACTION_ENABLED=true
voyage-code-3 is specifically trained on code and significantly outperforms general-purpose embedding models on code retrieval tasks. Pair it with any LLM provider for consolidation and graph extraction.