Memory
By default an agent starts every call from scratch — even with a repeat caller. Memory changes that: after each call an extractor saves short, durable facts about the caller, and the next call injects the most relevant ones into the system prompt. Support bots stop re-asking for the caller’s plan, concierge agents recall preferred times, and retention flows resume where they left off.
To turn it on, see Add memory.
How it works
Extract
When a call ends, the server sends the transcript to a small LLM, which returns 0–5 short third-person facts about the caller, each with a confidence score.
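As a rough illustration of the extractor's output, the sketch below parses a reply into at most five facts. The JSON shape, the field names `text` and `confidence`, and the 0 to 1 confidence range are assumptions made for this example; the real extractor's format is not described here.

```python
import json
from dataclasses import dataclass

MAX_FACTS = 5  # the extractor returns 0-5 facts per call


@dataclass(frozen=True)
class Fact:
    text: str          # short third-person statement about the caller
    confidence: float  # assumed to lie in [0.0, 1.0]


def parse_extractor_output(raw: str) -> list[Fact]:
    """Parse a hypothetical JSON reply: [{"text": ..., "confidence": ...}, ...]."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed output is treated as "no facts"
    if not isinstance(items, list):
        return []

    facts: list[Fact] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        text = item.get("text")
        conf = item.get("confidence")
        # Skip entries with a missing or empty text.
        if not isinstance(text, str) or not text.strip():
            continue
        # Skip entries whose confidence is not a number in range.
        if isinstance(conf, bool) or not isinstance(conf, (int, float)):
            continue
        if not 0.0 <= conf <= 1.0:
            continue
        facts.append(Fact(text.strip(), float(conf)))
    return facts[:MAX_FACTS]
```

An empty list is a valid result: a call that yields nothing durable produces no facts.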
Embed and store
Each fact is embedded (OpenAI text-embedding-3-large, the same model knowledge bases use) and stored in Postgres using the pgvector extension, scoped to (agent_id, caller_identity).
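The scoping and relevance ranking can be sketched in memory, without Postgres or an embedding API. Everything here is illustrative: `StoredFact`, `top_facts`, the toy two-dimensional vectors, the choice of cosine similarity, and the default `k` are assumptions, and how the query vector is produced at call time is not covered by this page.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredFact:
    agent_id: str
    caller_identity: str
    text: str
    embedding: list[float]  # toy vector standing in for a real embedding


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_facts(
    store: list[StoredFact],
    agent_id: str,
    caller_identity: str,
    query_embedding: list[float],
    k: int = 3,
) -> list[str]:
    """Return the k most similar facts within one (agent_id, caller_identity) scope."""
    scoped = [
        f for f in store
        if f.agent_id == agent_id and f.caller_identity == caller_identity
    ]
    ranked = sorted(
        scoped,
        key=lambda f: cosine(f.embedding, query_embedding),
        reverse=True,
    )
    return [f.text for f in ranked[:k]]
```

Filtering by scope before ranking means one caller's facts can never surface in another caller's prompt, whatever the similarity score.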
Scope
Memory is keyed on agent × caller. Anonymous widget sessions — callers without a stable user_identity — are never recorded or retrieved.
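A minimal sketch of the scoping rule, assuming a hypothetical helper named `memory_key` and assuming that a missing or blank `user_identity` is what marks a session as anonymous:

```python
from typing import Optional


def memory_key(agent_id: str, user_identity: Optional[str]) -> Optional[tuple[str, str]]:
    """Return the (agent, caller) scope, or None when memory must be skipped."""
    # Assumption: anonymous sessions show up as a missing or blank identity.
    if user_identity is None or not user_identity.strip():
        return None
    return (agent_id, user_identity)
```

A caller of this helper would skip both extraction and retrieval whenever it returns `None`.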
What it keeps and drops
Keeps: preferences, identifiers, commitments, recurring needs, and constraints — anything a future call benefits from.
Drops: volatile details (mood, weather), one-off facts, and anything sensitive (health, card numbers, passwords). The extractor is tuned to emit zero facts rather than invent filler, so many calls produce none.
Memory vs. knowledge base
Both can run on the same agent; they solve different problems.
Memory facts are injected into the system prompt automatically, whereas the LLM decides when to reach for the knowledge base. Many support agents use both.
Privacy and retention
memory_retention_days sets a cap, in days, that applies both to what retrieval can see and to what the nightly cleanup job removes. The default is 90; 0 means no cap. Deleting a memory is a soft delete at first: the fact becomes unreachable from retrieval immediately, and the retention job hard-deletes it later.
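The retention rules can be sketched as two predicates. This is an illustration under stated assumptions, not the actual job: `MemoryRow`, `created_at`, `deleted_at`, and both function names are hypothetical, and age is assumed to be measured from when the fact was stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MemoryRow:
    text: str
    created_at: datetime                   # assumed timestamp column
    deleted_at: Optional[datetime] = None  # set when the row is soft-deleted


def is_retrievable(row: MemoryRow, now: datetime, retention_days: int = 90) -> bool:
    """Soft-deleted or expired rows are invisible to retrieval."""
    if row.deleted_at is not None:
        return False
    if retention_days == 0:  # 0 means no cap
        return True
    return now - row.created_at <= timedelta(days=retention_days)


def should_hard_delete(row: MemoryRow, now: datetime, retention_days: int = 90) -> bool:
    """Rows the nightly job would remove: soft-deleted ones and expired ones."""
    if row.deleted_at is not None:
        return True
    if retention_days == 0:  # no cap, so nothing expires by age
        return False
    return now - row.created_at > timedelta(days=retention_days)
```

Applying the same cutoff in both predicates keeps retrieval and cleanup consistent: a fact that is past the cap is already hidden even if the nightly job has not run yet.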