Behind the scenes

Ask your show a question. Get the line that answers it.

Fabula turns your script archive into connected canon: a living show bible that remembers the line, the motivation, the object, the relationship, and the consequence — then traces every answer back to the page. It comes pre-tested across 800+ episodes of continuity-supervisor nightmare fuel.

// BDI extraction · The West Wing S2 · Bartlet concealing MS

{
  "incarnation_identifier":
    "as the President Concealing a Chronic Illness",
  "emotional_state_at_event":
    "Resolute but conflicted, carrying the weight of public deception
     against personal integrity",
  "goals_at_event": [
    "Reveal the MS diagnosis on his own terms before it becomes a scandal",
    "Protect Abbey from the fallout of the concealment"
  ],
  "beliefs_at_event": [
    "The American people deserve honesty from their President",
    "Disclosure now, while he controls the narrative, is less damaging
      than disclosure later"
  ],
  "importance_to_event": "primary",
  "confidence": 0.96
}

Each EventParticipation shows a character's state in this scene. The incarnation marks how they differ from their core identity. With this, you can ask where Bartlet's beliefs and goals conflict—and the answer is just a graph traversal. We've written about why this layer matters in the architecture of dramatic irony and the doctor's doxastic web.
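That traversal can be sketched in a few lines. This is a hypothetical illustration, not Fabula's actual query layer: the field names mirror the EventParticipation JSON above, `event_id` is invented, and the keyword check stands in for whatever conflict test the real graph uses.

```python
# Hypothetical sketch: scan a character's EventParticipation records for
# events where a stated belief and a stated goal pull in opposite directions.
# The keyword heuristic is a stand-in for a real graph-side conflict test.

def find_tensions(participations, belief_kw, goal_kw):
    """Return events where a belief mentions belief_kw and a goal mentions goal_kw."""
    hits = []
    for p in participations:
        beliefs = [b for b in p.get("beliefs_at_event", []) if belief_kw in b.lower()]
        goals = [g for g in p.get("goals_at_event", []) if goal_kw in g.lower()]
        if beliefs and goals:
            hits.append({"event": p["event_id"], "beliefs": beliefs, "goals": goals})
    return hits

bartlet = [
    {
        "event_id": "s2e21_oval_office",  # invented identifier
        "beliefs_at_event": ["The American people deserve honesty from their President"],
        "goals_at_event": ["Protect Abbey from the fallout of the concealment"],
    }
]

print(find_tensions(bartlet, "honesty", "concealment"))
```

The honesty belief and the concealment goal surface together on the same event, which is exactly the kind of contradiction a dramatic-irony query wants to find.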

Rashomon for databases

A graph that tracks state, not strings.

LLMs can improv the next line, but a writer’s room needs to know what’s *actually true* in the story right now. Who knew what, when? What did they believe? You can’t trust a language model to keep track of a decade of canon. Fabula never tries. Instead, it builds a structured, auditable scaffold from the ground up.

Long-context LLMs grow unreliable as input expands: facts in the middle of the window get lost, earlier claims get contradicted, and there's no audit trail. Relying on a single model call to “remember” a hundred episodes is not just hard—it's fundamentally unstable. Fabula solves this by never asking the model to do it.

Most tools flatten a show to a single timeline of dialogue. Fabula goes deeper: every character, location, group — and every Ark of the Covenant — gets a traversable journey. Each event logs the participant’s beliefs, goals, and emotional state at that moment, mapped to the Episode → Act → Scene → Beat structure.

“What did Bartlet believe here?”
“Where was he in his journey this season?”
“What changed between seasons?”

All of these are easy to query directly from the per-entity arc.
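The season-over-season question, for instance, reduces to a diff between two per-season profiles. A minimal sketch, assuming an `AgentSeasonProfile` that carries a `dominant_beliefs` list (the field name and contents here are illustrative, not Fabula's exact schema):

```python
# Hypothetical sketch: answer "what changed between seasons?" by diffing
# two season profiles for the same canonical Agent.

def season_delta(profile_a, profile_b):
    a, b = set(profile_a["dominant_beliefs"]), set(profile_b["dominant_beliefs"])
    return {"dropped": sorted(a - b), "gained": sorted(b - a)}

s1 = {"dominant_beliefs": ["Secrecy is survivable", "Abbey will support the plan"]}
s2 = {"dominant_beliefs": ["Disclosure is inevitable", "Abbey will support the plan"]}

print(season_delta(s1, s2))
# {'dropped': ['Secrecy is survivable'], 'gained': ['Disclosure is inevitable']}
```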

// Schema · 25 node labels, 44 relationship types

{
  "entities":   ["Agent", "Location", "Object", "Organization"],
  "events":     ["Event", "PARTICIPATED_AS edge with BDI state",
                 "EventBeatLink"],
  "structure":  ["Series", "Season", "Episode", "Act",
                 "SceneBoundary", "PlotBeat"],
  "narrative":  ["Theme", "ConflictArc",
                 "NarrativeConnection (typed: CAUSAL,
                  FORESHADOWING, THEMATIC_PARALLEL, ESCALATION,
                  CALLBACK, EMOTIONAL_ECHO, ...)"],
  "synthesis":  ["AgentEpisodeProfile",
                 "AgentSeasonProfile",
                 "cross-season arc on the canonical Agent"],
  "provenance": "Every node, every edge, every BDI field carries
                 a pointer to the line of script that produced it."
}

any genre, any era

Even for the pilot that never aired.

Universality (dare we say *multiversality*?) wasn't an afterthought—it was the only way to survive our own development hell. We built it to handle the show that changes showrunners every season, the pilot that gets rewritten seven times and still doesn't get picked up, the feature film with twelve credited writers. Its ontology doesn't care about your drama; it cares about the drama *in the script*.

Whether it’s a blockbuster franchise or that trunk script you're still developing, Fabula works the same. The model can’t rely on internet memory or training data—each call gets exactly the context needed: script, graph, schema. It has to play the scene as written.

Same with our entity-resolving LLM judges—adjudication based on the facts, never guesswork. When entities have similar names, the model checks the script lines and each candidate's graph footprint, then explains its reasoning like a script supervisor. If the evidence is unclear, it keeps them separate and flags the pair for review. No assumptions, just citations. (More details: teaching a knowledge graph to learn from “no.”)

// Entity adjudication · Wolf Hall · MERGE

{
  "decision": "MERGE",
  "confidence": 0.94,
  "reasoning":
    "The source entity ('Master Cromwell') and target entity
     ('Crumb') both refer to a man addressed by Cardinal
     Wolsey's household and the King's privy chamber across
     the same scenes; both are described as Wolsey's lawyer
     turned royal fixer; their script footprints overlap on
     the same locations and interlocutors in S1E01 and S1E02.
     'Crumb' reads as a deliberate diminutive used by hostile
     courtiers, not a separate person.",
  "merged_aliases": [
    "Master Cromwell",
    "Crumb",
    "Cremuel",
    "Thomas Cromwell"
  ]
}

// every word of the reasoning derives from the script
// lines handed to the model in context; nothing comes
// from training-data knowledge of Tudor history.
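The keep-separate-unless-certain policy is simple to express. A hypothetical sketch: the field names follow the adjudication JSON above, while the 0.90 threshold and the function itself are illustrative, not Fabula's actual values.

```python
# Hypothetical sketch of the conservative merge policy: accept a MERGE only
# above a confidence threshold; otherwise keep the entities separate and
# flag the pair for human review. Threshold value is illustrative.

MERGE_THRESHOLD = 0.90

def apply_adjudication(verdict):
    if verdict["decision"] == "MERGE" and verdict["confidence"] >= MERGE_THRESHOLD:
        return {"action": "merge", "aliases": verdict["merged_aliases"]}
    return {"action": "keep_separate", "flag_for_review": True}

verdict = {"decision": "MERGE", "confidence": 0.94,
           "merged_aliases": ["Master Cromwell", "Crumb", "Cremuel", "Thomas Cromwell"]}
print(apply_adjudication(verdict)["action"])  # merge
```

A 0.94 merge clears the bar; the same verdict at 0.5 would stay split and land in the review queue.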

shoot day

The engine drives the model, not the other way around.

Even a 26-episode season is just a cron job. The command below processed Doctor Who Season 20 end-to-end and unattended — twenty-two episodes, six writers, every entity reconciled against everything in the prior nineteen seasons:

$ SERIES_ID=doctorwho SEASON_NUMBER=20 PROCESSING_MODE=full_async \
    python -m app.scripts.batch_process \
    --config config/batch_configs/doctorwho_s20.json

Started Tuesday evening. Picked up a completed graph for the entire season Wednesday lunchtime. No babysitting.

Fabula runs the pipeline. Models are called only for a precise task, bound to a strict schema, never allowed to wander. The engine handles all the heavy lifting: it builds context, manages caching, guarantees stable identifiers (no hallucinated UUIDs), and always routes calls to the right model. Failures are rerouted or logged—never allowed to halt a run. Every result is checked and saved instantly; nothing waits for a risky end-of-batch commit. Crash partway through? Previous episodes are already locked down. Fabula's deterministic harness keeps AI disciplined, reliable, and grounded.
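The commit-per-episode discipline can be sketched as a loop. This is a hypothetical illustration of the pattern described above, not Fabula's code: `process_episode` and the `store` are stand-ins for the real pipeline stage and persistence layer.

```python
# Hypothetical sketch: each episode commits the moment it finishes, and a
# failure is logged and skipped rather than halting the run. On restart,
# already-committed episodes are detected and never reprocessed.

def run_season(episodes, store, process_episode):
    for ep in episodes:
        if store.get(ep):                       # already locked down from a prior run
            continue
        try:
            store[ep] = process_episode(ep)     # commit now, not at end-of-batch
        except Exception as err:
            store[ep] = {"status": "failed", "error": str(err)}   # log, never halt

store = {}
run_season(["s20e01", "s20e02"], store, lambda ep: {"status": "done", "episode": ep})
print(sorted(store))  # ['s20e01', 's20e02']
```

A crash between the two episodes leaves the first committed; the next invocation skips it and picks up where the run died.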

casting

Best model for each job.

Each pipeline step gets the LLM best suited to its task. Structural scene work runs fast and cheap on Gemini Flash; Sonnet serves as the reliable judge, with GPT-5 Mini as backup. Each request carries a tailored context dossier with all the evidence needed for a grounded answer.

Small, focused prompts mean cheaper models can handle most work, while expensive ones step in only when truly needed. (More: post-processing as ontology maintenance.)

The same structure lets local open-weight models stand in anywhere. If scripts must stay on-premise, point a step at Llama, Mistral, or Qwen—schema and audit trail stay identical; only the inference location changes.

Model upgrades are easy: new models slot in for any task with a config change — no disruptions, no legacy baggage. Schema and audit log stay stable, so the system keeps moving forward.

scene_decomposition:
  model:    gemini-2.5-flash
  context:  ~1.8KB
  cost:     $0.0008 / call

entity_adjudication:
  model:    claude-sonnet-4
  context:  ~2.4KB
  cost:     $0.0042 / call
  fallback: gpt-5-mini

bdi_extraction:
  model:    claude-sonnet-4
  context:  ~3.1KB
  cost:     $0.0061 / call

theme_analysis:
  model:    gemini-2.5-pro
  context:  ~2.0KB
  cost:     $0.0034 / call

# offline mode — same schema, same audit trail
# any step can be pointed at a local open-weight model:
# entity_adjudication:
#   model:    llama-3.1-70b-instruct (local)
#   runtime:  on-prem GPU
#   cost:     $0.00 / call
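In code, the routing-with-fallback pattern looks something like this. A minimal sketch under stated assumptions: the model names come from the config above, while `call_step`, the `ROUTES` table, and the failure handling are illustrative, not Fabula's client.

```python
# Hypothetical sketch: route each pipeline step to its configured model,
# rerouting to the fallback on failure instead of halting the run.

ROUTES = {
    "entity_adjudication": {"model": "claude-sonnet-4", "fallback": "gpt-5-mini"},
    "scene_decomposition": {"model": "gemini-2.5-flash"},
}

def call_step(step, prompt, call):
    route = ROUTES[step]
    try:
        return call(route["model"], prompt)
    except RuntimeError:
        if "fallback" in route:
            return call(route["fallback"], prompt)   # reroute, never halt
        raise

def flaky(model, prompt):
    """Stand-in client: simulate the primary judge timing out."""
    if model == "claude-sonnet-4":
        raise RuntimeError("timeout")
    return f"{model}: ok"

print(call_step("entity_adjudication", "adjudicate", flaky))  # gpt-5-mini: ok
```

Swapping in a local open-weight model is a one-line change to the routing table; nothing downstream notices.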

wrap party

No proprietary lock-in. Open data, open standards.

The data you create should work wherever you need it. Fabula uses open standards — never proprietary formats — so your story graphs integrate cleanly into your pipelines, databases, and tools.

Plain JSON, Parquet, Cypher, and more. Standard formats that play nicely in any modern stack, whether you’re building AI apps, running analytics, or feeding a production pipeline.

Try the data yourself. Demo datasets sit on Hugging Face under brandburner — including the full Doctor Who, Star Trek: The Next Generation, and West Wing megagraphs alongside per-series exports for Wolf Hall and Happy Valley.

$ fabula export the_west_wing --season 4 --format parquet

dataset_export/the_west_wing_s4/
├── nodes.parquet      # All graph nodes
├── edges.parquet      # All relationships
├── positions.parquet  # 3D layout coordinates
├── meta.json          # Series info, entity counts
└── README.md          # HuggingFace dataset card

✓ 4,127 nodes
✓ 18,402 edges
✓ exported in 11.3s
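Consuming an export needs nothing exotic. A hypothetical sketch: the `meta.json` fields shown here are illustrative (the real file's exact keys may differ), and any Parquet-capable tool — pandas, DuckDB, Polars — reads `nodes.parquet` and `edges.parquet` directly.

```python
# Hypothetical sketch: sanity-check an export's metadata with the stdlib
# before handing the Parquet files to your analytics stack. The meta.json
# shape here is illustrative, not the exact exported schema.

import json

meta = json.loads('{"series": "the_west_wing", "season": 4, '
                  '"node_count": 4127, "edge_count": 18402}')

assert meta["edge_count"] > meta["node_count"]  # a dense graph: more edges than nodes
print(f"{meta['series']} s{meta['season']}: "
      f"{meta['node_count']} nodes, {meta['edge_count']} edges")
```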

next season

Story is our first chapter.

Fabula’s breakthrough — rendering long-context documents as intricate graph data — works beyond screenplays. The same need for auditable, computable claims now reaches legislation, science, data journalism, and academic publishing.

Built for dense relations, persistent and mutable identities, ambiguity, and withheld truths. Sometimes stranger than fiction.

end credits

If you’ve read this far, the next step is the catalog.