Persistent Memory for Roleplay // SillyTavern Plugin
A graph + vector memory store that makes your characters actually remember.
Relationships evolve, events persist, knowledge stays consistent, and stories build
on themselves across sessions.
Without ChronicleDB, the model only sees what fits in its context window — usually the last handful of turns. Anything older falls off the back, and with it goes every relationship, trait, and promise you set up earlier. ChronicleDB assembles a focused memory block from the full history and injects it just before the most recent message, so the model writes with fresh recency and long-term grounding.
Without ChronicleDB
Personas / Presets / Character
Eden is a stoic cartographer drawn into a war she never wanted. Marcus is a charming scholar with something to hide.
Message 07
"I told you, the Glass Library is no place for us."
Message 08
Eden's hand drifts to the hilt of the dagger her father left her. She doesn't answer.
Most Recent
"Why are you lying to me about the Archivist?"
With ChronicleDB
Personas / Presets / Character
Eden is a stoic cartographer drawn into a war she never wanted. Marcus is a charming scholar with something to hide.
Message 07
"I told you, the Glass Library is no place for us."
Message 08
Eden's hand drifts to the hilt of the dagger her father left her. She doesn't answer.
[ChronicleDB Memory Context]
## Current Scene
- Eden: the Archivist's study
- Marcus: the Archivist's study
## Character Traits
Eden — stoic, principled, suspicious of institutions
Marcus — charming, evasive, former Archivist
## Active Plot Threads
⏳ Marcus knows what happened to Eden's father
🔮 Eden suspects the Archivist ordered the raid on her village
## Relevant Past Context
Msg 23 — Marcus admitted he'd been inside the Library before
Msg 31 — Eden found her father's seal on a letter Marcus was carrying
Most Recent
"Why are you lying to me about the Archivist?"
What ChronicleDB Does [overview]
Every message you exchange gets analyzed by a lightweight LLM. Characters, events, traits,
relationships, locations, items, and plot threads are extracted into a PostgreSQL graph with
vector embeddings for semantic search. When it's time for the AI to reply, the most relevant
memories are retrieved and injected into context — so the model knows what happened,
who said what, and how characters feel about each other.
The five pipelines that make the memory block feel like it knew what you were going to ask.
01
Extraction pipeline
A cheap extraction LLM (Gemini Flash Lite or equivalent) reads each new batch of messages and returns a structured JSON payload: characters + traits, events + source quotes, relationships, knowledge updates, world state, plot threads. Traits pass through a three-layer dedup before insertion so near-synonyms collapse into one canonical row with merged aliases.
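To picture what the extraction LLM hands back, here is a sketch of one payload. The field names and the validation helper below are assumptions for illustration, not the plugin's actual schema:

```python
# Illustrative shape of one extraction payload. Field names are
# hypothetical -- ChronicleDB's real schema may differ.
sample_payload = {
    "characters": [
        {"name": "Eden", "traits": ["stoic", "principled"]},
    ],
    "events": [
        {
            "summary": "Eden finds her father's seal on a letter Marcus carries",
            "participants": ["Eden", "Marcus"],
            "source_quote": "Eden found her father's seal on a letter Marcus was carrying",
            "significance": 0.8,
        },
    ],
    "relationships": [
        {"from": "Eden", "to": "Marcus", "sentiment": -0.4,
         "description": "growing distrust"},
    ],
    "knowledge_updates": [],
    "world_state": {"Eden.location": "the Archivist's study"},
    "plot_threads": [
        {"title": "Marcus knows what happened to Eden's father",
         "status": "open"},
    ],
}

def validate_payload(payload: dict) -> bool:
    """Cheap structural gate before anything touches the database:
    all six top-level sections must be present."""
    required = {"characters", "events", "relationships",
                "knowledge_updates", "world_state", "plot_threads"}
    return required <= payload.keys()
```

Traits in the `characters` section are what the three-layer dedup filters before insertion.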
02
Retrieval pipeline
Every turn, retrieval runs six searches in parallel: dense + lexical across memory passages, dense + lexical across event source quotes, tsvector + trigram on dialogue, and dense on scene snapshots. Results are fused via Reciprocal Rank Fusion with a recency bias, padded by their ±1 message neighbors, and expanded with the story arc each event belongs to. The final memory block is rendered to a token budget.
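Reciprocal Rank Fusion itself is simple enough to sketch. An illustrative Python version follows; the constant k=60 and the recency weighting are common defaults and assumptions, not ChronicleDB's actual tuning:

```python
from collections import defaultdict

def rrf_fuse(result_lists, k=60, recency_weight=0.05, latest_msg=100):
    """Fuse several ranked result lists via Reciprocal Rank Fusion,
    then add a small recency bonus per document.

    Each result list holds (doc_id, source_msg_index) pairs, best first.
    """
    scores = defaultdict(float)
    recency = {}
    for results in result_lists:
        for rank, (doc_id, msg_index) in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)   # classic RRF term
            recency[doc_id] = msg_index
    # recency bias: memories sourced near the head of the chat get a nudge
    for doc_id, msg_index in recency.items():
        scores[doc_id] += recency_weight * (msg_index / latest_msg)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears near the top of several buckets outranks one that tops a single bucket, which is exactly why RRF suits multi-search fusion.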
03
HyDE query expansion
A raw user question often doesn't embed anywhere near its answer in vector space — the query wording differs from how the memory was written. HyDE (Hypothetical Document Embeddings) fixes that with one cheap LLM call: generate a short fake "answer" to the query, then embed that hypothetical instead of the question. The fake answer lives in the same stylistic/semantic space as the real memories, so the dense search lands closer to the actual answer. Opt-in per retrieval via { hyde: true }; adds one LLM call per turn.
user question → tiny LLM → hypothetical answer
↓
embed the hypothetical, not the question → dense search
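The whole HyDE flow fits in a few lines. A minimal sketch with injectable `llm`, `embed`, and `search` callables; these names are hypothetical, not the plugin's API:

```python
def hyde_search(question, llm, embed, search, hyde=True):
    """HyDE in miniature: embed a hypothetical answer instead of the
    question. `llm`, `embed`, and `search` are injected callables so
    the control flow is testable without a provider."""
    if hyde:
        # one cheap LLM call: draft a short plausible answer in-style
        hypothetical = llm(f"Write a brief plausible answer to: {question}")
        query_vector = embed(hypothetical)
    else:
        query_vector = embed(question)
    return search(query_vector)
```

With `hyde=False` this degrades gracefully to plain dense search over the raw question.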
04
Story arc discovery
Story arcs aren't handwritten. Every N events, the arc-builder constructs a weighted event graph (co-participation + temporal proximity + causal chains) and runs Louvain community detection three times at different resolutions — γ=0.25 surfaces super-arcs, γ=0.5 arcs, γ=1.0 episodes. Each cluster gets a cheap LLM-generated title; titles are recycled when their centerpiece event doesn't move, so most rebuilds cost near-zero LLM tokens.
event graph → Louvain × 3
γ=0.25 → super-arcs · γ=0.5 → arcs · γ=1.0 → episodes
↓
LLM names each cluster (titles recycled when stable)
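The edge-weighting step can be sketched before any community detection runs. A toy version, assuming co-participation and temporal-proximity terms only; the real builder also adds causal-chain edges, and the weight formulas here are made up:

```python
from itertools import combinations

def build_event_graph(events, window=3):
    """Weighted edges from co-participation and temporal proximity.

    `events` is a list of dicts: {"id", "index", "participants"}.
    Returns {(id_a, id_b): weight} for every positively weighted pair.
    """
    weights = {}
    for a, b in combinations(events, 2):
        w = 0.0
        # co-participation: one unit per shared character
        w += len(set(a["participants"]) & set(b["participants"]))
        # temporal proximity: decays to zero outside the window
        gap = abs(a["index"] - b["index"])
        if gap <= window:
            w += (window - gap + 1) / window
        if w > 0:
            weights[(a["id"], b["id"])] = w
    return weights
```

These weighted edges are what a Louvain pass (for instance networkx's `louvain_communities` with `resolution` set to 0.25 / 0.5 / 1.0) would then cluster into the three-level hierarchy.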
05
Per-chat scoping
Every character row carries a chat_id. New characters extracted in a chat get a chat-hash-prefixed row (chr-{hash}-{slug}) so the same name in two different stories never bleeds aliases, traits, relationships, or plot threads between them. Legacy global rows without a chat_id are still visible through the per-character chat picker — the scoping is additive, not destructive.
Global (legacy)
chr-alice
one row shared across every chat Alice appears in
Per-chat (new)
chr-a3f912-alice · chr-8c4e01-alice
one row per (chat, character) pair — no bleed
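The scoped ID scheme is easy to reproduce. A sketch, assuming a SHA-1 chat hash truncated to six hex characters and a lowercase slug; both details are assumptions, only the chr-{hash}-{slug} shape comes from the plugin:

```python
import hashlib
import re

def scoped_character_id(chat_id: str, name: str) -> str:
    """Build a chr-{hash}-{slug} row ID. The hash prefix keeps
    same-named characters in different chats on different rows."""
    chat_hash = hashlib.sha1(chat_id.encode()).hexdigest()[:6]  # assumed scheme
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"chr-{chat_hash}-{slug}"
```

Two chats each extracting an "Alice" get distinct rows, so traits and relationships never bleed between stories.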
Features [12 modules]
Character Traits
Extracts stable personality traits, skills, and backstory through a three-layer dedup pipeline: a lexicon gate that rejects transient moods before any LLM cost, a fuzzy pre-check using Postgres stemming and trigram similarity, and a contextual-embedding kNN search with an LLM verifier for the ambiguous band — so "cunning" / "shrewd" / "calculating" collapse into one canonical trait row with merged aliases.
Lexicon + kNN + verifier · Contextual embeddings
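The three layers can be sketched as a single decision function. Here `difflib` stands in for Postgres trigram similarity and the lexicon is a toy; in the real pipeline the "ambiguous" branch goes to contextual-embedding kNN plus an LLM verifier rather than being returned as-is:

```python
import difflib

TRANSIENT_MOODS = {"angry", "tired", "happy", "scared"}  # illustrative lexicon

def dedup_trait(candidate, existing, fuzzy_threshold=0.85):
    """Three-layer trait dedup sketch.

    Layer 1: lexicon gate rejects transient moods at zero LLM cost.
    Layer 2: fuzzy pre-check merges obvious near-duplicates.
    Layer 3: everything else lands in the ambiguous band, which the
    real pipeline resolves with embeddings + an LLM verifier.
    """
    c = candidate.lower().strip()
    if c in TRANSIENT_MOODS:
        return ("reject", None)
    for trait in existing:
        ratio = difflib.SequenceMatcher(None, c, trait).ratio()
        if ratio >= fuzzy_threshold:
            return ("merge", trait)
    return ("ambiguous", None)
```

Note that "shrewd" vs "cunning" shares no characters at all, which is exactly why a lexical pre-check alone isn't enough and the embedding layer exists.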
Relationships
Tracks how characters feel about each other with sentiment, intensity, and descriptions. Evolves as the story progresses.
Per-chat scoped
Events & Timeline
Every significant event is recorded with participants, causal chains, significance scores, verbatim source quotes, and in-story time markers.
World clock aware
Story Arcs
Discovers narrative structure automatically via a three-pass Louvain resolution sweep (γ=0.25 / 0.5 / 1.0) over a weighted event graph. Produces a super-arc / arc / episode hierarchy, names each with a cheap LLM pass, and auto-resolves plot threads whose closing event has already landed on-screen.
3-level hierarchy · LLM-named · Auto-resolve
Knowledge Boundaries
Tracks what each character knows and doesn't know. Prevents information leakage across character perspectives with epistemic masking.
Per-character POV
Hybrid Retrieval
Six-bucket search fusing dense vectors + lexical across memories, events, dialogue quotes, and scene snapshots. Reciprocal Rank Fusion with recency boost.
RRF fusion
World State
Key-value facts about the world with temporal validity. Tracks locations, items, ownership, and location adjacency for spatial awareness.
Bi-temporal
Dialogue Quotes
Indexes verbatim dialogue with full-text and trigram search. Answers "what did Alice say about the prophecy?" with the exact quote.
tsvector + trgm
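pg_trgm's similarity measure is simple enough to reproduce in plain Python, which demystifies the "trgm" half of that search. A simplified version; real pg_trgm also has punctuation and multi-word padding rules not modeled here:

```python
def trigrams(text: str) -> set:
    """pg_trgm-style trigrams: lowercase each word, pad with two
    leading spaces and one trailing space, take all 3-char windows."""
    out = set()
    for word in text.lower().split():
        padded = f"  {word} "
        out |= {padded[i:i + 3] for i in range(len(padded) - 2)}
    return out

def similarity(a: str, b: str) -> float:
    """Set-based similarity, the way pg_trgm's similarity() computes
    it: shared trigrams over total distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0
```

This is why the dialogue index survives typos: "prophesy" still lands near "prophecy" even though exact full-text matching would miss it.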
Auto-Ingest
Memory builds automatically as you chat — no manual steps. Configurable padding window keeps recent messages raw until they're far enough back to safely ingest.
Swipe-safe
Per-Character Memory
Chats are isolated by default: new character rows are scoped to the chat they're extracted in, so two stories featuring the same name never bleed aliases, traits, or relationships into each other. Opt-in cross-chat memory per character via the chat picker. Persistent, isolated, and read-only modes are all one click.
Per-chat scopeChat picker
Mind Map
Interactive 3D force-directed graph powered by Three.js and 3d-force-graph. Characters as glowing nodes, events and locations as orbital bodies, relationships as directional particle streams. Click any node for a detail panel; hover to dim the unrelated graph.
WebGL 3D · d3-force-3d
LLM Call Monitor
Debug panel showing every extraction and embedding request with provider, model, latency, and status. Spot rate limits and stalls instantly.
Debug surface
Supported Providers [any OpenAI-compatible]
ChronicleDB works with any LLM and embedding provider that speaks the OpenAI or Gemini API format. A cheap, fast model is all you need — extraction doesn't require frontier intelligence.
Provider | Extraction | Embeddings
Google Gemini | gemini-2.5-flash-lite | gemini-embedding-2-preview
Vertex AI (Express) | Any Gemini model | text-embedding-004 / 005
OpenAI | gpt-4o-mini | text-embedding-3-small
Ollama / LM Studio | Any local model | Any 768-dim model
OpenRouter / LiteLLM | Any routed model | Any compatible endpoint
Mistral / Voyage | — | mistral-embed / voyage-3
Embedding dimension is fixed at 768 in the schema. Pick a model that natively outputs 768 dims, or one that supports the dimensions parameter (OpenAI 3-small/3-large and Vertex 004/005 do).
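If your embedding model outputs more than 768 dims and lacks a dimensions parameter, you can approximate it client-side: for OpenAI's 3-small/3-large, the parameter is documented as truncating the vector and re-normalizing it, which is easy to do yourself:

```python
import math

def shorten_embedding(vec, dims=768):
    """Truncate an embedding to `dims` dimensions and re-normalize to
    unit length -- effectively what OpenAI's `dimensions` parameter
    does for the 3-small / 3-large models."""
    cut = vec[:dims]
    norm = math.sqrt(sum(x * x for x in cut))
    return [x / norm for x in cut]
```

Whether this preserves retrieval quality depends on the model; it works well for models trained with Matryoshka-style objectives, and is not guaranteed for arbitrary embedding models.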
Setup [5 minutes]
One command. Local or cloud, your pick.
The installer handles PostgreSQL, pgvector, symlinks, and SillyTavern's config.yaml — or skips all of that with --skip-postgres if you're pointing at a free cloud DB. The walkthrough covers both paths, plus every setting in the ChronicleDB drawer, with mock panels.
The extraction and embedding models process your chat text. If you use Gemini or OpenAI, that text goes to their API. If you run Ollama or LM Studio locally, everything stays on your machine. The PostgreSQL database is always local. ChronicleDB never phones home — it's a SillyTavern plugin that talks only to the endpoints you configure.
How much does it cost?
Gemini's free tier handles most usage. Flash Lite extraction costs ~$0.01 per 1M tokens. Gemini embeddings are free. If you use OpenAI, text-embedding-3-small is $0.02 per 1M tokens. A typical chat session costs less than a penny.
What happens when I swipe or regenerate?
The padding window (default: 5 messages) keeps recent turns out of the graph entirely. If you swipe or edit within that window, nothing needs cleanup. For older messages that were already ingested, ChronicleDB detects the swipe and surgically deletes the old extraction before re-extracting the new content.
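The padding-window check itself is a one-liner. A sketch, assuming a "more than `padding` messages behind the head" boundary; the plugin's exact off-by-one convention may differ:

```python
def safe_to_ingest(msg_index: int, latest_index: int, padding: int = 5) -> bool:
    """A message is only ingested once it sits more than `padding`
    messages behind the newest message. Swipes and edits almost always
    land inside that window, so they never require graph cleanup."""
    return latest_index - msg_index > padding
```

Raising `padding` trades a little retrieval freshness for fewer surgical re-extractions on heavily swiped chats.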
How do I import existing chats?
Open the Character memory section in settings, select a character, and click Build memory from all. ChronicleDB reads every past chat file for that character and batch-processes them through the extraction pipeline. Depending on chat length and your API speed, this can take a few minutes.
Windows support? Cloud databases? Per-chat memory?
All covered in the setup walkthrough. Short version: Windows works via WSL2 for local Postgres or natively for cloud DB; Neon and Supabase both work with the --skip-postgres installer flag; per-chat scoping is on by default for any character extracted after mid-April 2026.