Persistent Memory for Roleplay // SillyTavern Plugin
A graph + vector memory store that makes your characters actually remember.
Relationships evolve, events persist, knowledge stays consistent, and stories build
on themselves across sessions.
Without ChronicleDB, the model only sees what fits in its context window — usually the last handful of turns. Anything older falls off the back, and with it goes every relationship, trait, and promise you set up earlier. ChronicleDB assembles a focused memory block from the full history and injects it just before the most recent message, so the model writes with fresh recency and long-term grounding.
Without ChronicleDB
Personas / Presets / Character
Eden is a stoic cartographer drawn into a war she never wanted. Marcus is a charming scholar with something to hide.
Message 07
"I told you, the Glass Library is no place for us."
Message 08
Eden's hand drifts to the hilt of the dagger her father left her. She doesn't answer.
Most Recent
"Why are you lying to me about the Archivist?"
With ChronicleDB
Personas / Presets / Character
Eden is a stoic cartographer drawn into a war she never wanted. Marcus is a charming scholar with something to hide.
Message 07
"I told you, the Glass Library is no place for us."
Message 08
Eden's hand drifts to the hilt of the dagger her father left her. She doesn't answer.
[ChronicleDB Memory Context]
## Current Scene
- Eden: the Archivist's study
- Marcus: the Archivist's study
## Character Traits
Eden — stoic, principled, suspicious of institutions
Marcus — charming, evasive, former Archivist
## Active Plot Threads
⏳ Marcus knows what happened to Eden's father
🔮 Eden suspects the Archivist ordered the raid on her village
## Relevant Past Context
Msg 23 — Marcus admitted he'd been inside the Library before
Msg 31 — Eden found her father's seal on a letter Marcus was carrying
Most Recent
"Why are you lying to me about the Archivist?"
What ChronicleDB Does [overview]
Every message you exchange gets analyzed by a lightweight LLM. Characters, events, traits,
relationships, locations, items, and plot threads are extracted into a PostgreSQL graph with
vector embeddings for semantic search. When it's time for the AI to reply, the most relevant
memories are retrieved and injected into context — so the model knows what happened,
who said what, and how characters feel about each other.
The five pipelines that make the memory block feel like it knew what you were going to ask.
01
Extraction pipeline
A cheap extraction LLM (Gemini Flash Lite or equivalent) reads each new batch of messages and returns a structured JSON payload: characters + traits, events + source quotes, relationships, knowledge updates, world state, plot threads. Traits pass through a three-layer dedup before insertion so near-synonyms collapse into one canonical row with merged aliases.
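To picture what the extraction LLM hands back, here is a sketch of one payload. The field names and the validation helper below are assumptions for illustration, not the plugin's actual schema:

```python
# Illustrative shape of one extraction payload. Field names are
# hypothetical -- ChronicleDB's real schema may differ.
sample_payload = {
    "characters": [
        {"name": "Eden", "traits": ["stoic", "principled"]},
    ],
    "events": [
        {
            "summary": "Eden finds her father's seal on a letter Marcus carries",
            "participants": ["Eden", "Marcus"],
            "source_quote": "Eden found her father's seal on a letter Marcus was carrying",
            "significance": 0.8,
        },
    ],
    "relationships": [
        {"from": "Eden", "to": "Marcus", "sentiment": -0.4,
         "description": "growing distrust"},
    ],
    "knowledge_updates": [],
    "world_state": {"Eden.location": "the Archivist's study"},
    "plot_threads": [
        {"title": "Marcus knows what happened to Eden's father",
         "status": "open"},
    ],
}

def validate_payload(payload: dict) -> bool:
    """Cheap structural gate before anything touches the database:
    all six top-level sections must be present."""
    required = {"characters", "events", "relationships",
                "knowledge_updates", "world_state", "plot_threads"}
    return required <= payload.keys()
```

Traits in the `characters` section are what the three-layer dedup filters before insertion.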
02
Retrieval pipeline
Every turn, retrieval runs six searches in parallel: dense + lexical across memory passages, dense + lexical across event source quotes, tsvector + trigram on dialogue, and dense on scene snapshots. Results are fused via Reciprocal Rank Fusion with a recency bias, padded by their ±1 message neighbors, and expanded with the story arc each event belongs to. The final memory block is rendered to a token budget.
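Reciprocal Rank Fusion itself is simple enough to sketch. An illustrative Python version follows; the constant k=60 and the recency weighting are common defaults and assumptions, not ChronicleDB's actual tuning:

```python
from collections import defaultdict

def rrf_fuse(result_lists, k=60, recency_weight=0.05, latest_msg=100):
    """Fuse several ranked result lists via Reciprocal Rank Fusion,
    then add a small recency bonus per document.

    Each result list holds (doc_id, source_msg_index) pairs, best first.
    """
    scores = defaultdict(float)
    recency = {}
    for results in result_lists:
        for rank, (doc_id, msg_index) in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)   # classic RRF term
            recency[doc_id] = msg_index
    # recency bias: memories sourced near the head of the chat get a nudge
    for doc_id, msg_index in recency.items():
        scores[doc_id] += recency_weight * (msg_index / latest_msg)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears near the top of several buckets outranks one that tops a single bucket, which is exactly why RRF suits multi-search fusion.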
03
HyDE query expansion
A raw user question often doesn't embed anywhere near its answer in vector space — the query wording differs from how the memory was written. HyDE (Hypothetical Document Embeddings) fixes that with one cheap LLM call: generate a short fake "answer" to the query, then embed that hypothetical instead of the question. The fake answer lives in the same stylistic/semantic space as the real memories, so the dense search lands closer to the actual answer. Opt-in per retrieval via { hyde: true }; adds one LLM call per turn.
user question → tiny LLM → hypothetical answer
↓
embed the hypothetical, not the question → dense search
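The whole HyDE flow fits in a few lines. A minimal sketch with injectable `llm`, `embed`, and `search` callables; these names are hypothetical, not the plugin's API:

```python
def hyde_search(question, llm, embed, search, hyde=True):
    """HyDE in miniature: embed a hypothetical answer instead of the
    question. `llm`, `embed`, and `search` are injected callables so
    the control flow is testable without a provider."""
    if hyde:
        # one cheap LLM call: draft a short plausible answer in-style
        hypothetical = llm(f"Write a brief plausible answer to: {question}")
        query_vector = embed(hypothetical)
    else:
        query_vector = embed(question)
    return search(query_vector)
```

With `hyde=False` this degrades gracefully to plain dense search over the raw question.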
04
Story arc discovery
Story arcs aren't handwritten. Every N events, the arc-builder constructs a weighted event graph (co-participation + temporal proximity + causal chains) and runs Louvain community detection three times at different resolutions — γ=0.25 surfaces super-arcs, γ=0.5 arcs, γ=1.0 episodes. Each cluster gets a cheap LLM-generated title; titles are recycled when their centerpiece event doesn't move, so most rebuilds cost near-zero LLM tokens.
event graph → Louvain × 3
γ=0.25 → super-arcs · γ=0.5 → arcs · γ=1.0 → episodes
↓
LLM names each cluster (titles recycled when stable)
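The edge-weighting step can be sketched before any community detection runs. A toy version, assuming co-participation and temporal-proximity terms only; the real builder also adds causal-chain edges, and the weight formulas here are made up:

```python
from itertools import combinations

def build_event_graph(events, window=3):
    """Weighted edges from co-participation and temporal proximity.

    `events` is a list of dicts: {"id", "index", "participants"}.
    Returns {(id_a, id_b): weight} for every positively weighted pair.
    """
    weights = {}
    for a, b in combinations(events, 2):
        w = 0.0
        # co-participation: one unit per shared character
        w += len(set(a["participants"]) & set(b["participants"]))
        # temporal proximity: decays to zero outside the window
        gap = abs(a["index"] - b["index"])
        if gap <= window:
            w += (window - gap + 1) / window
        if w > 0:
            weights[(a["id"], b["id"])] = w
    return weights
```

These weighted edges are what a Louvain pass (for instance networkx's `louvain_communities` with `resolution` set to 0.25 / 0.5 / 1.0) would then cluster into the three-level hierarchy.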
05
Per-chat scoping
Every character row carries a chat_id. New characters extracted in a chat get a chat-hash-prefixed row (chr-{hash}-{slug}) so the same name in two different stories never bleeds aliases, traits, relationships, or plot threads between them. Legacy global rows without a chat_id are still visible through the per-character chat picker — the scoping is additive, not destructive.
Global (legacy)
chr-alice
one row shared across every chat Alice appears in
Per-chat (new)
chr-a3f912-alice · chr-8c4e01-alice
one row per (chat, character) pair — no bleed
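The scoped ID scheme is easy to reproduce. A sketch, assuming a SHA-1 chat hash truncated to six hex characters and a lowercase slug; both details are assumptions, only the chr-{hash}-{slug} shape comes from the plugin:

```python
import hashlib
import re

def scoped_character_id(chat_id: str, name: str) -> str:
    """Build a chr-{hash}-{slug} row ID. The hash prefix keeps
    same-named characters in different chats on different rows."""
    chat_hash = hashlib.sha1(chat_id.encode()).hexdigest()[:6]  # assumed scheme
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"chr-{chat_hash}-{slug}"
```

Two chats each extracting an "Alice" get distinct rows, so traits and relationships never bleed between stories.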
Features [12 modules]
Character Traits
Extracts stable personality traits, skills, and backstory through a three-layer dedup pipeline: a lexicon gate that rejects transient moods before any LLM cost, a fuzzy pre-check using Postgres stemming and trigram similarity, and a contextual-embedding kNN search with an LLM verifier for the ambiguous band — so "cunning" / "shrewd" / "calculating" collapse into one canonical trait row with merged aliases.
Lexicon + kNN + verifier · Contextual embeddings
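The three layers can be sketched as a single decision function. Here `difflib` stands in for Postgres trigram similarity and the lexicon is a toy; in the real pipeline the "ambiguous" branch goes to contextual-embedding kNN plus an LLM verifier rather than being returned as-is:

```python
import difflib

TRANSIENT_MOODS = {"angry", "tired", "happy", "scared"}  # illustrative lexicon

def dedup_trait(candidate, existing, fuzzy_threshold=0.85):
    """Three-layer trait dedup sketch.

    Layer 1: lexicon gate rejects transient moods at zero LLM cost.
    Layer 2: fuzzy pre-check merges obvious near-duplicates.
    Layer 3: everything else lands in the ambiguous band, which the
    real pipeline resolves with embeddings + an LLM verifier.
    """
    c = candidate.lower().strip()
    if c in TRANSIENT_MOODS:
        return ("reject", None)
    for trait in existing:
        ratio = difflib.SequenceMatcher(None, c, trait).ratio()
        if ratio >= fuzzy_threshold:
            return ("merge", trait)
    return ("ambiguous", None)
```

Note that "shrewd" vs "cunning" shares no characters at all, which is exactly why a lexical pre-check alone isn't enough and the embedding layer exists.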
Relationships
Tracks how characters feel about each other with sentiment, intensity, and descriptions. Evolves as the story progresses.
Per-chat scoped
Events & Timeline
Every significant event is recorded with participants, causal chains, significance scores, verbatim source quotes, and in-story time markers.
World clock aware
Story Arcs
Discovers narrative structure automatically via a three-pass Louvain resolution sweep (γ=0.25 / 0.5 / 1.0) over a weighted event graph. Produces a super-arc / arc / episode hierarchy, names each with a cheap LLM pass, and auto-resolves plot threads whose closing event has already landed on-screen.
3-level hierarchy · LLM-named · Auto-resolve
Knowledge Boundaries
Tracks what each character knows and doesn't know. Prevents information leakage across character perspectives with epistemic masking.
Per-character POV
Hybrid Retrieval
Six-bucket search fusing dense vectors + lexical across memories, events, dialogue quotes, and scene snapshots. Reciprocal Rank Fusion with recency boost.
RRF fusion
World State
Key-value facts about the world with temporal validity. Tracks locations, items, ownership, and location adjacency for spatial awareness.
Bi-temporal
Dialogue Quotes
Indexes verbatim dialogue with full-text and trigram search. Answers "what did Alice say about the prophecy?" with the exact quote.
tsvector + trgm
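pg_trgm's similarity measure is simple enough to reproduce in plain Python, which demystifies the "trgm" half of that search. A simplified version; real pg_trgm also has punctuation and multi-word padding rules not modeled here:

```python
def trigrams(text: str) -> set:
    """pg_trgm-style trigrams: lowercase each word, pad with two
    leading spaces and one trailing space, take all 3-char windows."""
    out = set()
    for word in text.lower().split():
        padded = f"  {word} "
        out |= {padded[i:i + 3] for i in range(len(padded) - 2)}
    return out

def similarity(a: str, b: str) -> float:
    """Set-based similarity, the way pg_trgm's similarity() computes
    it: shared trigrams over total distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0
```

This is why the dialogue index survives typos: "prophesy" still lands near "prophecy" even though exact full-text matching would miss it.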
Auto-Ingest
Memory builds automatically as you chat — no manual steps. Configurable padding window keeps recent messages raw until they're far enough back to safely ingest.
Swipe-safe
Per-Character Memory
Chats are isolated by default: new character rows are scoped to the chat they're extracted in, so two stories featuring the same name never bleed aliases, traits, or relationships into each other. Opt-in cross-chat memory per character via the chat picker. Persistent, isolated, and read-only modes are all one click.
Per-chat scopeChat picker
Mind Map
Interactive 3D force-directed graph powered by Three.js and 3d-force-graph. Characters as glowing nodes, events and locations as orbital bodies, relationships as directional particle streams. Click any node for a detail panel; hover to dim the unrelated graph.
WebGL 3D · d3-force-3d
LLM Call Monitor
Debug panel showing every extraction and embedding request with provider, model, latency, and status. Spot rate limits and stalls instantly.
Debug surface
Supported Providers [any OpenAI-compatible]
ChronicleDB works with any LLM and embedding provider that speaks the OpenAI or Gemini API format. A cheap, fast model is all you need — extraction doesn't require frontier intelligence.
Provider | Extraction | Embeddings
Google Gemini | gemini-2.5-flash-lite | gemini-embedding-2-preview
Vertex AI (Express) | Any Gemini model | text-embedding-004 / 005
OpenAI | gpt-4o-mini | text-embedding-3-small
Ollama / LM Studio | Any local model | Any 768-dim model
OpenRouter / LiteLLM | Any routed model | Any compatible endpoint
Mistral / Voyage | — | mistral-embed / voyage-3
Embedding dimension is fixed at 768 in the schema. Pick a model that natively outputs 768 dims, or one that supports the dimensions parameter (OpenAI 3-small/3-large and Vertex 004/005 do).
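If your embedding model outputs more than 768 dims and lacks a dimensions parameter, you can approximate it client-side: for OpenAI's 3-small/3-large, the parameter is documented as truncating the vector and re-normalizing it, which is easy to do yourself:

```python
import math

def shorten_embedding(vec, dims=768):
    """Truncate an embedding to `dims` dimensions and re-normalize to
    unit length -- effectively what OpenAI's `dimensions` parameter
    does for the 3-small / 3-large models."""
    cut = vec[:dims]
    norm = math.sqrt(sum(x * x for x in cut))
    return [x / norm for x in cut]
```

Whether this preserves retrieval quality depends on the model; it works well for models trained with Matryoshka-style objectives, and is not guaranteed for arbitrary embedding models.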
Setup [5 minutes]
One command. Local or cloud, your pick.
The installer handles PostgreSQL, pgvector, symlinks, and SillyTavern's config.yaml — or skips all of that with --skip-postgres if you're pointing at a free cloud DB. The walkthrough covers both paths, plus every setting in the ChronicleDB drawer, with mock panels.
The extraction and embedding models process your chat text. If you use Gemini or OpenAI, that text goes to their API. If you run Ollama or LM Studio locally, everything stays on your machine. The PostgreSQL database is always local. ChronicleDB never phones home — it's a SillyTavern plugin that talks only to the endpoints you configure.
How much does it cost?
Gemini's free tier handles most usage. Flash Lite extraction costs ~$0.01 per 1M tokens. Gemini embeddings are free. If you use OpenAI, text-embedding-3-small is $0.02 per 1M tokens. A typical chat session costs less than a penny.
What happens when I swipe or regenerate?
The padding window (default: 5 messages) keeps recent turns out of the graph entirely. If you swipe or edit within that window, nothing needs cleanup. For older messages that were already ingested, ChronicleDB detects the swipe and surgically deletes the old extraction before re-extracting the new content.
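The padding-window check itself is a one-liner. A sketch, assuming a "more than `padding` messages behind the head" boundary; the plugin's exact off-by-one convention may differ:

```python
def safe_to_ingest(msg_index: int, latest_index: int, padding: int = 5) -> bool:
    """A message is only ingested once it sits more than `padding`
    messages behind the newest message. Swipes and edits almost always
    land inside that window, so they never require graph cleanup."""
    return latest_index - msg_index > padding
```

Raising `padding` trades a little retrieval freshness for fewer surgical re-extractions on heavily swiped chats.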
How do I import existing chats?
Open the Character memory section in settings, select a character, and click Build memory from all. ChronicleDB reads every past chat file for that character and batch-processes them through the extraction pipeline. Depending on chat length and your API speed, this can take a few minutes.
Windows support? Cloud databases? Per-chat memory?
All covered in the setup walkthrough. Short version: Windows works via WSL2 for local Postgres or natively for cloud DB; Neon and Supabase both work with the --skip-postgres installer flag; per-chat scoping is on by default for any character extracted after mid-April 2026.