Editorial transparency

How entries are verified

Every entry published to this archive through the automated news pipeline has passed through a 7-stage process: intake, triage, extraction, verification, drafting, review, and final validation. This page documents each step and the checks applied.

Pipeline

7-stage publication process

  1. Intake

    A source article is fetched and its full text extracted. Metadata (publication date, URL, outlet) is recorded verbatim at this step — nothing is inferred yet.

  2. Triage

    An LLM decides whether the article describes a genuinely new incident or an update to an existing one. Confidence is scored 0–1. Low-confidence items are dropped before any further processing.

  3. Extract

    Factual claims are extracted from the article text as discrete, attributable statements. Each claim is typed (action, statement, allegation, figure). No synthesis happens here — only extraction.

  4. Verify

    Claims are cross-checked against independent sources. A corroboration level (unverified / partial / strong) is assigned based on how many independent outlets report the same facts. Tier 1 sources (national press, wire services) are weighted more heavily than Tier 2 sources (regional or partisan outlets).

  5. Draft

    A structured archive entry is drafted from the verified claims. The draft follows a fixed schema — title, date, description, summary, sources, tags, timeline — ensuring every entry is machine-readable and citable.

  6. Review

    A separate LLM (not the one used for drafting) runs five editorial checks: factual grounding, source attribution, neutrality of language, date accuracy, and absence of hallucinated content. All five must pass.

  7. Publish

    Final programmatic validation (required fields, valid URLs, no future dates, no duplicate slugs) runs first, followed by an optional random deep audit in which one source is re-fetched and its claims are spot-checked against the live article. Only then is the entry written to the archive.
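
The programmatic checks in stage 7 can be sketched roughly as follows. This is a minimal Python illustration, not the pipeline's actual code: the `Entry` dataclass, its field types, the `slug` field, the choice of which fields count as required, and the `validate_entry` function are all assumptions built only from the schema fields and the four checks named above.

```python
import datetime as dt
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Entry:
    # Field names follow the schema listed in stage 5; types are assumed.
    # `slug` is not in that list but is implied by the duplicate-slug check.
    slug: str
    title: str
    date: dt.date
    description: str
    summary: str
    sources: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    timeline: list[str] = field(default_factory=list)


# Assumption: these text fields plus at least one source are "required".
REQUIRED_TEXT_FIELDS = ("slug", "title", "description", "summary")


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validate_entry(entry, existing_slugs, today=None):
    """Return a list of validation errors; an empty list means the entry passes."""
    today = today or dt.date.today()
    errors = []
    for name in REQUIRED_TEXT_FIELDS:
        if not getattr(entry, name).strip():
            errors.append(f"missing required field: {name}")
    if not entry.sources:
        errors.append("missing required field: sources")
    for url in entry.sources:
        if not is_valid_url(url):
            errors.append(f"invalid URL: {url}")
    if entry.date > today:
        errors.append("date is in the future")
    if entry.slug in existing_slugs:
        errors.append(f"duplicate slug: {entry.slug}")
    return errors
```

Returning a list of errors rather than raising on the first one lets a single pass report every problem with an entry, which is convenient when a rejected item has to be inspected by hand.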

Review stage

Five editorial checks

Stage 6 uses a model that was not involved in drafting — a deliberate separation to avoid self-validation. All five checks must pass. A single failure quarantines the item for human review.
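
The all-must-pass rule can be expressed in a few lines. The sketch below is illustrative only: the check identifiers are shorthand for the five checks named on this page, and `review_outcome` is a hypothetical helper. It also assumes that a check with no recorded result counts as a failure, which this page does not state.

```python
# Shorthand identifiers for the five editorial checks (names are assumed).
CHECKS = (
    "factual_grounding",
    "source_attribution",
    "neutral_language",
    "date_accuracy",
    "no_hallucinated_content",
)


def review_outcome(results):
    """Map per-check booleans to ('pass' | 'quarantine', list of failed checks)."""
    # A missing result is treated as a failure (assumption).
    failed = [name for name in CHECKS if not results.get(name, False)]
    return ("quarantine" if failed else "pass", failed)


results = {name: True for name in CHECKS}
results["date_accuracy"] = False
print(review_outcome(results))  # ('quarantine', ['date_accuracy'])
```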

Corroboration

Source thresholds

Corroboration level is determined by independent reporting — outlets that published separately, not wire-service pickups of the same story.
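
One way to picture this rule is to collapse wire-service pickups into a single independent report before counting. The sketch below is a hedged illustration: the `Report` type, the tier weights, and the numeric thresholds are all hypothetical, since this page states only that Tier 1 sources weigh more than Tier 2 and that pickups of the same wire story are not independent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Report:
    outlet: str
    tier: int                          # 1 or 2
    wire_origin: Optional[str] = None  # wire service the copy came from, if any


# Hypothetical values: the real weights and thresholds are not given here.
TIER_WEIGHT = {1: 2, 2: 1}
PARTIAL_MIN = 2
STRONG_MIN = 4


def corroboration_level(reports):
    """Return 'unverified', 'partial', or 'strong' for a list of reports."""
    best = {}
    for r in reports:
        # Pickups of the same wire story share one key, so they count once.
        key = r.wire_origin or r.outlet
        best[key] = max(best.get(key, 0), TIER_WEIGHT[r.tier])
    score = sum(best.values())
    if score >= STRONG_MIN:
        return "strong"
    if score >= PARTIAL_MIN:
        return "partial"
    return "unverified"
```

Under these assumed numbers, three Tier 1 outlets all running the same wire copy score the same as one Tier 1 report ("partial"), while two Tier 1 outlets reporting separately reach "strong".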

Models

Why different models per stage

Triage, extraction, drafting, and review intentionally use different LLMs. This prevents a single model's systematic biases from propagating through the full pipeline undetected. The review model has no access to the drafter's chain-of-thought: it sees only the finished draft and the source text.

All models run locally via Ollama. No data leaves the machine during processing. Model names are recorded in each entry's provenance chain and are visible on the entry's detail page.
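
The separation described above could be enforced with a simple configuration check. The sketch below is an assumption-laden illustration: the model names are placeholders rather than the models actually used, and `shared_models` and `provenance_chain` are hypothetical helpers. It does not call Ollama; it only inspects a stage-to-model mapping.

```python
# Placeholder model names; the real ones appear in each entry's provenance.
STAGE_MODELS = {
    "triage": "model-a",
    "extract": "model-b",
    "draft": "model-c",
    "review": "model-d",
}


def shared_models(stage_models):
    """Return {model: [stages]} for any model assigned to more than one stage."""
    by_model = {}
    for stage, model in stage_models.items():
        by_model.setdefault(model, []).append(stage)
    return {model: stages for model, stages in by_model.items() if len(stages) > 1}


def provenance_chain(stage_models):
    """Build a per-stage record of which model ran (shape is assumed)."""
    return [{"stage": stage, "model": model} for stage, model in stage_models.items()]


# An empty result means every LLM stage uses a distinct model.
print(shared_models(STAGE_MODELS))  # {}
```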

Manual curation

Entries added directly (via CMS or git) bypass the pipeline. These show a "Manually curated" badge on the detail page instead of pipeline provenance. Manual entries are held to the same factual standards but do not carry machine-generated audit trails.

For forensic evidence of source preservation, see the chain of custody page.