Storyteller Story QA

Automatically reviews tenant stories for broken CTAs, missing content, and quality issues — so editors catch problems before users do.


The Problem

What Storyteller asked us to solve — and why it's bigger than story quality.

The brief

Teams publish media-heavy content at high volume. As volume grows, it gets harder to ensure everything is consistent, professional, and trustworthy. Manual review doesn't scale.

The real issue

This isn't just a story quality problem. It's a tenant health monitoring problem. Storyteller needs visibility into what tenants are publishing — before trust damage shows up in engagement metrics.

The failure mode

A CTA says "Buy tickets" but links to /highlights. The user taps it, lands on the wrong page, bounces. The editor didn't catch it. Storyteller had no signal. Invisible trust damage.

What we built

An automated pipeline that catches these issues, logs them, and tracks patterns per tenant over time. Runs automatically from the data — not a one-off analysis.

How It Works

Six steps. Built entirely in n8n. Each step does one thing.

1. Webhook — Receives the tenant story payload via POST request
2. Split — Breaks the stories array into individual items
3. Validate — Structural checks: missing fields, empty titles, broken actions. No AI needed.
4. AI Eval — GPT-4 checks CTA coherence, title alignment, and page completeness
5. Route — Sends each result down the right path based on status
6. Store + Alert — Writes to Neon DB. Emails alerts on flagged or broken stories.
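
The payload received in step 1 might look like the following sketch. Only tenant_id, tenant_name, and the stories array are confirmed by the pipeline description; the fields inside each story (story_id, story_title, pages, action) are assumptions inferred from the validation checks described later.

```python
# Hypothetical example of the payload POSTed to the webhook trigger.
payload = {
    "tenant_id": "tenant_42",  # illustrative id
    "tenant_name": "Antarctic Football League",
    "stories": [
        {
            "story_id": "story_123",
            "story_title": "Season opener highlights",
            "pages": [
                # each page is assumed to carry an action (CTA) object
                {"title": "Intro", "action": {"label": "Buy tickets", "url": "/tickets"}},
            ],
        },
    ],
}
```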

The n8n Pipeline

What each node does and why it exists.

  • Webhook trigger — Receives the full tenant payload (tenant_id, tenant_name, stories array). No transformation at this stage.
  • Split stories — Each story flows through the pipeline independently. n8n handles concurrency natively.
  • Structural validation — Fast, free, deterministic. Checks for missing titles, empty pages, broken action objects. Catches invalid data before any AI call is made.
  • AI evaluation — One API call per story. Checks CTA-to-URL coherence, title-to-category alignment, and page completeness. Returns structured JSON with specific issues.
  • Route by status — ok stories get logged quietly. review/invalid/error stories get logged and trigger an email alert with the story ID, tenant, and issues list.
  • Database trigger — A PostgreSQL trigger fires on every insert. It extracts each issue and upserts into a tenant_health table. Analytics are decoupled from the pipeline — n8n doesn't need to know the trigger exists.
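
The structural-validation node's checks could be sketched as one plain function — a minimal interpretation, assuming the story fields shown above (story_title, pages, action), with issue names taken from the tenant_health examples later in this document:

```python
def validate_structure(story: dict) -> list[str]:
    """Deterministic pre-AI checks: cheap, fast, no API call.
    Issue names (missing_title, no_pages, missing_cta) mirror the
    issue_type values stored in tenant_health."""
    issues = []
    if not story.get("story_title"):
        issues.append("missing_title")   # title is empty or absent
    pages = story.get("pages") or []
    if not pages:
        issues.append("no_pages")        # nothing for the AI to evaluate
    for page in pages:
        if not page.get("action"):
            issues.append("missing_cta")  # page has no action/CTA defined
    return issues                        # empty list == structurally valid
```

A story that fails here is routed straight to the invalid path — the AI evaluation step never sees it.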

What We Decided Not To Build

Every "no" is a deliberate choice — and so is every deliberately simple "yes." Here's what we left out, what we kept lean, and why.

No composite quality score

A number like "72" is not actionable. "CTA mismatch on page_2" is. Specific, named issues tell an editor exactly what to fix.

No video or image analysis

The structured metadata already carries the coherence signal. Video analysis adds latency and cost at the wrong layer for this use case.

No automated moderation

QA surfaces signals. It does not replace editors. Automating moderation decisions creates liability and erodes editorial trust.

No multi-agent AI chain

Averaging the conflicting outputs of four AI reviewers degrades signal quality. One precise, well-prompted call is more reliable.

Structural validation before AI

If a story has no pages or a missing title, there's nothing for the AI to evaluate. Catching this first is free, fast, and saves API spend on broken data.

Tenant health via database trigger

The analytics layer runs independently of the pipeline. A PostgreSQL trigger counts issues per tenant automatically. No extra n8n nodes needed.
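
The counting behavior of that trigger can be modeled in a few lines of Python. This is an illustrative simulation of the per-(tenant, issue_type) upsert, not the actual PL/pgSQL — the first_seen_at/last_seen_at semantics are inferred from the tenant_health columns shown later:

```python
def apply_health_upsert(health: dict, tenant: str, issues: list[str], seen_at: str) -> None:
    """Model of the tenant_health upsert fired on each insert into
    story_qa_results: one row per (tenant, issue_type), incrementing
    occurrence_count and tracking first/last seen timestamps."""
    for issue_type in issues:
        key = (tenant, issue_type)
        if key not in health:
            health[key] = {
                "occurrence_count": 1,
                "first_seen_at": seen_at,
                "last_seen_at": seen_at,
            }
        else:
            health[key]["occurrence_count"] += 1
            health[key]["last_seen_at"] = seen_at
```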

Architecture

Two tables, one trigger. The pipeline writes results. The database handles analytics.

story_qa_results

Every evaluated story gets a row. Stores story_id, tenant_id, status, confidence, issues (as JSONB), summary, and error_message. All outcomes — clean passes, flagged issues, and pipeline errors — live in one table.
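
One such row, rendered as a Python dict and populated from the live-results example below. The exact shape of the JSONB issues objects and the tenant_id value are assumptions:

```python
# Illustrative story_qa_results row; values mirror the demo data.
row = {
    "story_id": "story_123",
    "tenant_id": "tenant_42",  # hypothetical id for Antarctic Football League
    "status": "review",
    "confidence": "high",
    "issues": [  # JSONB column, assumed to hold structured issue objects
        {"page": "page_2", "type": "cta_mismatch",
         "detail": '"Buy tickets" links to /highlights'},
    ],
    "summary": "CTA mismatch on page 2 needs correction.",
    "error_message": None,
}
```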

tenant_health

Auto-populated by a database trigger. Tracks issue_type and occurrence_count per tenant. Over time, this answers: which tenants have recurring CTA problems? Which issues are systemic vs. one-off? Which tenants are getting worse?

What We'd Build Next

With engineering support, these are the logical next steps.

  • Raw payload inbox — Store the raw event before processing. If the pipeline breaks, nothing is lost. First production hardening step.
  • Tenant health dashboard — The data is already in the tenant_health table. It needs a UI for the customer success team.
  • Pre-publish guardrails — Same pipeline logic, but runs inside the CMS before a story goes live. Editors see issues before publish, not after.
  • CMS feedback loop — Surface QA results directly to the editor who published the story. Closes the loop between ops monitoring and content creation.
  • URL validation — HTTP HEAD check on each action URL. Catches broken links that AI can't detect from metadata alone.
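
The URL-validation step in the last bullet could be as small as this sketch, using only Python's standard library. The timeout and the choice to treat any 2xx/3xx response as healthy are assumptions:

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def url_is_reachable(url: str, timeout: float = 5.0) -> bool:
    """HEAD-check an action URL. 2xx/3xx responses count as healthy;
    4xx/5xx, DNS failures, and timeouts count as broken."""
    try:
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except HTTPError as e:
        return 200 <= e.code < 400   # urlopen raises on 4xx/5xx
    except (URLError, TimeoutError):
        return False                  # DNS failure, refused connection, timeout
```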

Demo

Watch the pipeline process two stories: one clean pass, one with a CTA mismatch flagged for review.

Neon Database — Live Results

Every evaluation is persisted to PostgreSQL. These are real rows written by the pipeline.

story_qa_results

| story_id  | tenant_name               | status | confidence | issues                                             | summary                                                |
|-----------|---------------------------|--------|------------|----------------------------------------------------|--------------------------------------------------------|
| story_123 | Antarctic Football League | review | high       | page_2: cta_mismatch — "Buy tickets" → /highlights | CTA mismatch on page 2 needs correction.               |
| story_124 | Antarctic Football League | ok     | high       | —                                                  | Coherent, complete, aligned with title and categories. |

tenant_health

| tenant_name                     | issue_type    | occurrence_count | first_seen_at    | last_seen_at     |
|---------------------------------|---------------|------------------|------------------|------------------|
| Antarctic Football League       | cta_mismatch  | 6                | 2026-03-13 06:25 | 2026-03-13 08:45 |
| Test Tenant - Invalid Structure | missing_title | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |
| Test Tenant - Invalid Structure | no_pages      | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |
| Test Tenant - Invalid Structure | missing_cta   | 1                | 2026-03-13 07:34 | 2026-03-13 07:34 |

Populated automatically by a PostgreSQL trigger — the pipeline doesn't write to this table directly.

Error Flow — What Happens When Structure Fails

Before the AI ever sees a story, structural validation catches broken data. No API spend on garbage in.

Story story_broken_001 — status: invalid_structure

  • missing_title — story_title is empty or absent
  • no_pages — pages array is empty
  • missing_cta — page has no action/CTA defined

What happens next

1. Story is rejected before AI evaluation — no API call made
2. Result written to Neon with status invalid_structure
3. Email alert sent with story_id, tenant, and what failed
4. tenant_health trigger logs each issue type

Structural validation is deterministic, free, and fast. It runs before AI so broken payloads never waste an API call.
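
Put together, the route step's behavior (step 5 above) reduces to a small dispatch. A sketch, where log_result and send_alert stand in for the Neon write and email nodes; the exact status strings beyond ok and invalid_structure follow the routing notes above:

```python
def route(result: dict, log_result, send_alert) -> None:
    """Dispatch per the routing rules: every outcome is logged to
    story_qa_results; review/invalid_structure/error outcomes also
    trigger an alert carrying story_id, tenant, and the issues list."""
    log_result(result)  # ok stories get logged quietly, nothing more
    if result["status"] in {"review", "invalid_structure", "error"}:
        send_alert({
            "story_id": result["story_id"],
            "tenant": result["tenant_name"],
            "issues": result.get("issues", []),
        })
```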