Storyteller Story QA

Automatically reviews tenant stories for broken CTAs, missing content, and quality issues — so editors catch problems before users do.


The Problem

What Storyteller asked us to solve — and why it's bigger than story quality.

The brief

Teams publish media-heavy content at high volume. As volume grows, it gets harder to ensure everything is consistent, professional, and trustworthy. Manual review doesn't scale.

The real issue

This isn't just a story quality problem. It's a tenant health monitoring problem. Storyteller needs visibility into what tenants are publishing — before trust damage shows up in engagement metrics.

The failure mode

A CTA says "Buy tickets" but links to /highlights. The user taps it, lands on the wrong page, bounces. The editor didn't catch it. Storyteller had no signal. Invisible trust damage.

What we built

An automated pipeline that catches these issues, logs them, and tracks patterns per tenant over time. Runs automatically from the data — not a one-off analysis.

How It Works

Six steps. Built entirely in n8n. Each step does one thing.

1. Webhook — Receives the tenant story payload via POST request
2. Split — Breaks the stories array into individual items
3. Validate — Structural checks: missing fields, empty titles, broken actions. No AI needed.
4. AI Eval — GPT-4 checks CTA coherence, title alignment, and page completeness
5. Route — Sends each result down the right path based on status
6. Store + Alert — Writes to Neon DB. Emails alerts on flagged or broken stories.
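
The payload received in step 1 might look like the following sketch. Only tenant_id, tenant_name, and the stories array are confirmed by the pipeline description; the fields inside each story (story_id, story_title, pages, action) are assumptions inferred from the validation checks described later.

```python
# Hypothetical example of the payload POSTed to the webhook trigger.
payload = {
    "tenant_id": "tenant_42",  # illustrative id
    "tenant_name": "Antarctic Football League",
    "stories": [
        {
            "story_id": "story_123",
            "story_title": "Season opener highlights",
            "pages": [
                # each page is assumed to carry an action (CTA) object
                {"title": "Intro", "action": {"label": "Buy tickets", "url": "/tickets"}},
            ],
        },
    ],
}
```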

The n8n Pipeline

What each node does and why it exists.

  • Webhook trigger — Receives the full tenant payload (tenant_id, tenant_name, stories array). No transformation at this stage.
  • Split stories — Each story flows through the pipeline independently. n8n handles concurrency natively.
  • Structural validation — Fast, free, deterministic. Checks for missing titles, empty pages, broken action objects. Catches invalid data before any AI call is made.
  • AI evaluation — One API call per story. Checks CTA-to-URL coherence, title-to-category alignment, and page completeness. Returns structured JSON with specific issues.
  • Route by status — ok stories get logged quietly. review/invalid/error stories get logged and trigger an email alert with the story ID, tenant, and issues list.
  • Database trigger — A PostgreSQL trigger fires on every insert. It extracts each issue and upserts into a tenant_health table. Analytics are decoupled from the pipeline — n8n doesn't need to know the trigger exists.
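
The structural-validation node's checks could be sketched as one plain function — a minimal interpretation, assuming the story fields shown above (story_title, pages, action), with issue names taken from the tenant_health examples later in this document:

```python
def validate_structure(story: dict) -> list[str]:
    """Deterministic pre-AI checks: cheap, fast, no API call.
    Issue names (missing_title, no_pages, missing_cta) mirror the
    issue_type values stored in tenant_health."""
    issues = []
    if not story.get("story_title"):
        issues.append("missing_title")   # title is empty or absent
    pages = story.get("pages") or []
    if not pages:
        issues.append("no_pages")        # nothing for the AI to evaluate
    for page in pages:
        if not page.get("action"):
            issues.append("missing_cta")  # page has no action/CTA defined
    return issues                        # empty list == structurally valid
```

A story that fails here is routed straight to the invalid path — the AI evaluation step never sees it.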

What We Decided Not To Build

Every "no" is a deliberate choice — and so is every deliberately simple "yes." Here's what we left out, what we kept lean, and why.

No composite quality score

A number like "72" is not actionable. "CTA mismatch on page_2" is. Specific, named issues tell an editor exactly what to fix.

No video or image analysis

The structured metadata already carries the coherence signal. Video analysis adds latency and cost at the wrong layer for this use case.

No automated moderation

QA surfaces signals. It does not replace editors. Automating moderation decisions creates liability and erodes editorial trust.

No multi-agent AI chain

Averaging the conflicting outputs of four AI reviewers degrades signal quality. One precise, well-prompted call is more reliable.

Structural validation before AI

If a story has no pages or a missing title, there's nothing for the AI to evaluate. Catching this first is free, fast, and saves API spend on broken data.

Tenant health via database trigger

The analytics layer runs independently of the pipeline. A PostgreSQL trigger counts issues per tenant automatically. No extra n8n nodes needed.
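
The counting behavior of that trigger can be modeled in a few lines of Python. This is an illustrative simulation of the per-(tenant, issue_type) upsert, not the actual PL/pgSQL — the first_seen_at/last_seen_at semantics are inferred from the tenant_health columns shown later:

```python
def apply_health_upsert(health: dict, tenant: str, issues: list[str], seen_at: str) -> None:
    """Model of the tenant_health upsert fired on each insert into
    story_qa_results: one row per (tenant, issue_type), incrementing
    occurrence_count and tracking first/last seen timestamps."""
    for issue_type in issues:
        key = (tenant, issue_type)
        if key not in health:
            health[key] = {
                "occurrence_count": 1,
                "first_seen_at": seen_at,
                "last_seen_at": seen_at,
            }
        else:
            health[key]["occurrence_count"] += 1
            health[key]["last_seen_at"] = seen_at
```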

Architecture

Two tables, one trigger. The pipeline writes results. The database handles analytics.

story_qa_results

Every evaluated story gets a row. Stores story_id, tenant_id, status, confidence, issues (as JSONB), summary, and error_message. All outcomes — clean passes, flagged issues, and pipeline errors — live in one table.
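
One such row, rendered as a Python dict and populated from the live-results example below. The exact shape of the JSONB issues objects and the tenant_id value are assumptions:

```python
# Illustrative story_qa_results row; values mirror the demo data.
row = {
    "story_id": "story_123",
    "tenant_id": "tenant_42",  # hypothetical id for Antarctic Football League
    "status": "review",
    "confidence": "high",
    "issues": [  # JSONB column, assumed to hold structured issue objects
        {"page": "page_2", "type": "cta_mismatch",
         "detail": '"Buy tickets" links to /highlights'},
    ],
    "summary": "CTA mismatch on page 2 needs correction.",
    "error_message": None,
}
```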

tenant_health

Auto-populated by a database trigger. Tracks issue_type and occurrence_count per tenant. Over time, this answers: which tenants have recurring CTA problems? Which issues are systemic vs. one-off? Which tenants are getting worse?

What We'd Build Next

With engineering support, these are the logical next steps.

  • Raw payload inbox — Store the raw event before processing. If the pipeline breaks, nothing is lost. First production hardening step.
  • Tenant health dashboard — The data is already in the tenant_health table. It needs a UI for the customer success team.
  • Pre-publish guardrails — Same pipeline logic, but runs inside the CMS before a story goes live. Editors see issues before publish, not after.
  • CMS feedback loop — Surface QA results directly to the editor who published the story. Closes the loop between ops monitoring and content creation.
  • URL validation — HTTP HEAD check on each action URL. Catches broken links that AI can't detect from metadata alone.
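
The URL-validation step in the last bullet could be as small as this sketch, using only Python's standard library. The timeout and the choice to treat any 2xx/3xx response as healthy are assumptions:

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def url_is_reachable(url: str, timeout: float = 5.0) -> bool:
    """HEAD-check an action URL. 2xx/3xx responses count as healthy;
    4xx/5xx, DNS failures, and timeouts count as broken."""
    try:
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except HTTPError as e:
        return 200 <= e.code < 400   # urlopen raises on 4xx/5xx
    except (URLError, TimeoutError):
        return False                  # DNS failure, refused connection, timeout
```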

Demo

Watch the pipeline process two stories: one clean pass, one with a CTA mismatch flagged for review.

Neon Database — Live Results

Every evaluation is persisted to PostgreSQL. These are real rows written by the pipeline.

story_qa_results

| story_id  | tenant_name               | status | confidence | issues                                             | summary                                                |
|-----------|---------------------------|--------|------------|----------------------------------------------------|--------------------------------------------------------|
| story_123 | Antarctic Football League | review | high       | page_2: cta_mismatch — "Buy tickets" → /highlights | CTA mismatch on page 2 needs correction.               |
| story_124 | Antarctic Football League | ok     | high       | —                                                  | Coherent, complete, aligned with title and categories. |

tenant_health

| tenant_name                     | issue_type    | occurrence_count | first_seen_at    | last_seen_at     |
|---------------------------------|---------------|------------------|------------------|------------------|
| Antarctic Football League       | cta_mismatch  | 6                | 2026-03-13 06:25 | 2026-03-13 08:45 |
| Test Tenant - Invalid Structure | missing_title | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |
| Test Tenant - Invalid Structure | no_pages      | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |
| Test Tenant - Invalid Structure | missing_cta   | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |

Populated automatically by a PostgreSQL trigger — the pipeline doesn't write to this table directly.

Error Flow — What Happens When Structure Fails

Before the AI ever sees a story, structural validation catches broken data. No API spend on garbage in.

Story story_broken_001 — status: invalid_structure

  • missing_title — story_title is empty or absent
  • no_pages — pages array is empty
  • missing_cta — page has no action/CTA defined

What happens next

1. Story is rejected before AI evaluation — no API call made
2. Result written to Neon with status invalid_structure
3. Email alert sent with story_id, tenant, and what failed
4. tenant_health trigger logs each issue type

Structural validation is deterministic, free, and fast. It runs before AI so broken payloads never waste an API call.
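
Put together, the route step's behavior (step 5 above) reduces to a small dispatch. A sketch, where log_result and send_alert stand in for the Neon write and email nodes; the exact status strings beyond ok and invalid_structure follow the routing notes above:

```python
def route(result: dict, log_result, send_alert) -> None:
    """Dispatch per the routing rules: every outcome is logged to
    story_qa_results; review/invalid_structure/error outcomes also
    trigger an alert carrying story_id, tenant, and the issues list."""
    log_result(result)  # ok stories get logged quietly, nothing more
    if result["status"] in {"review", "invalid_structure", "error"}:
        send_alert({
            "story_id": result["story_id"],
            "tenant": result["tenant_name"],
            "issues": result.get("issues", []),
        })
```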