Case-bundle schema (v1) — data/caseN.js

Contract between the backtest exporter (scripts/export_cases.py, which replays one held-out station-day at every decision hour through the production stage-1 path) and the demo page reports/webapp/index.html (the Examples 1–3 tabs).

Each file is a single JS statement wrapping ONE strict-JSON object:

// generated by the backtest case exporter — do not edit
window.CASES = window.CASES || {};
window.CASES["case1"] = { ...strict JSON, no comments, no trailing commas... };

The page loads data/case1.js through data/case3.js; a missing file simply leaves that tab in its “case pending” state (the <script> request 404s harmlessly).
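
On the exporter side, the wrapper described above could be produced roughly as follows. This is a minimal sketch, not the code in scripts/export_cases.py: the function name render_case_js is hypothetical, and the only behavior taken from this document is the three-line file layout and the strict-JSON requirement.

```python
import json


def render_case_js(case_id: str, bundle: dict) -> str:
    """Wrap one bundle in the JS statement layout shown above (hypothetical helper)."""
    # allow_nan=False makes json.dumps raise ValueError on NaN/Infinity,
    # which are not valid strict JSON.
    payload = json.dumps(bundle, allow_nan=False, separators=(",", ":"))
    return (
        "// generated by the backtest case exporter — do not edit\n"
        "window.CASES = window.CASES || {};\n"
        f"window.CASES[{json.dumps(case_id)}] = {payload};\n"
    )


if __name__ == "__main__":
    print(render_case_js("case1", {"schema": "case-bundle v1"}))
```

Serializing with json.dumps rather than hand-building the object literal guarantees no comments or trailing commas end up inside the payload.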

Top-level object

| field | type | meaning |
|---|---|---|
| schema | str | literal "case-bundle v1" |
| station | str | ICAO id, e.g. "KDEN" |
| name | str | display name, e.g. "Denver, CO" |
| date | str | climate date (LST), "YYYY-MM-DD" |
| tz_label | str | e.g. "LST (UTC-7)"; the page never does tz math |
| settle | obj | {"high": int, "low": int}, the final official CLI integers |
| model_version | str | git short hash of the model that produced the bundle |
| obs | list | realized observation curve over the WHOLE day (see below) |
| hours | list | one entry per decision hour, ascending (see below) |

obs entries (15-min subsampling is fine — display only):

| field | type | meaning |
|---|---|---|
| t | str | "HH:MM", local standard time |
| f | num | observed temperature, °F |
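
The top-level rules above lend themselves to a quick pre-publish check. The sketch below is illustrative only: check_top_level is a hypothetical helper, not part of the exporter, and it covers only the top-level fields, not the nested obs or hours entries.

```python
from datetime import datetime

STRING_FIELDS = ("schema", "station", "name", "date", "tz_label", "model_version")


def check_top_level(bundle: dict) -> list:
    """Return a list of problems; an empty list means the top level looks valid."""
    problems = []
    for key in STRING_FIELDS:
        if not isinstance(bundle.get(key), str):
            problems.append(f"{key}: expected a string")
    if bundle.get("schema") != "case-bundle v1":
        problems.append('schema: expected the literal "case-bundle v1"')
    try:
        datetime.strptime(str(bundle.get("date")), "%Y-%m-%d")
    except ValueError:
        problems.append("date: expected YYYY-MM-DD")
    settle = bundle.get("settle")
    if not (isinstance(settle, dict)
            and all(type(settle.get(k)) is int for k in ("high", "low"))):
        problems.append("settle: expected {high: int, low: int}")
    for key in ("obs", "hours"):
        if not isinstance(bundle.get(key), list):
            problems.append(f"{key}: expected a list")
    hours = bundle.get("hours")
    if isinstance(hours, list):
        keys = [h.get("hour") for h in hours if isinstance(h, dict)]
        if not all(type(k) is int for k in keys):
            problems.append("hours: every entry needs an integer hour")
        elif keys != sorted(keys):
            problems.append("hours: entries must be in ascending hour order")
    return problems
```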

hours[i] — one decision hour

| field | type | meaning |
|---|---|---|
| hour | int | decision hour, local standard (0–23; a negative value is a PRE-DAY decision, e.g. -6 = 18:00 LST the previous evening) |
| high | obj/null | side document (below); null = not priced at this hour |
| low | obj/null | same shape |
| warnings | list | strings: the tick’s warnings (schema v1 warnings) |
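
The negative-hour convention maps onto floor division by 24: the quotient is the day offset relative to the climate date and the remainder is the clock hour. A small sketch (the helper name decision_hour_clock is hypothetical):

```python
def decision_hour_clock(hour: int) -> tuple:
    """Split a decision hour into (day_offset, "HH:00") in local standard time.

    day_offset is 0 for the climate date itself and -1 for the previous day.
    """
    # Python's divmod floors toward negative infinity, so -6 -> (-1, 18).
    day_offset, clock_hour = divmod(hour, 24)
    return day_offset, f"{clock_hour:02d}:00"


if __name__ == "__main__":
    print(decision_hour_clock(-6))  # (-1, '18:00')
    print(decision_hour_clock(14))  # (0, '14:00')
```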

Side document — hours[i].high / .low

Mirrors realtime/schema.py SideDoc.to_dict() (schema v1) exactly, so the exporter can serialize the backtest’s per-hour SideDocs as-is:

| field | type | meaning |
|---|---|---|
| pmf_prior | obj/null | forecast-only PMF, {"<int °F>": prob} (6 dp, zeros dropped); null when no ensemble data |
| pmf | obj | obs-corrected PMF, same encoding |
| brackets | list | bracket dicts (below); [] when no market was listed |
| diagnostics | obj | at minimum the fields below; extra keys are allowed and ignored |
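
As an illustration of the PMF encoding (integer °F keys as strings, probabilities rounded to 6 decimal places, zeros dropped), here is a minimal sketch. The helper names are hypothetical, and the assumption that "zeros" means "zero after rounding" is mine; realtime/schema.py is the authority on the exact rule.

```python
def encode_pmf(pmf: dict) -> dict:
    """Encode {int_temp_f: prob} as {"<int °F>": prob} with 6 dp and zeros dropped."""
    encoded = {}
    for temp_f in sorted(pmf):
        prob = round(pmf[temp_f], 6)
        # Assumption: an entry is dropped when it rounds to exactly zero.
        if prob > 0:
            encoded[str(temp_f)] = prob
    return encoded


def decode_pmf(encoded: dict) -> dict:
    """Inverse mapping: string keys back to integer temperatures."""
    return {int(key): prob for key, prob in encoded.items()}


if __name__ == "__main__":
    print(encode_pmf({70: 0.25, 71: 0.7499996, 72: 0.0000004, 73: 0.0}))
```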

Bracket dict

| field | type | meaning |
|---|---|---|
| ticker | str | market ticker (or a reconstructed RECON-* id when historical strike tables are unavailable) |
| type | str | "less" / "between" / "greater" |
| lo | int/null | INCLUSIVE payout lower bound (null for less) |
| hi | int/null | INCLUSIVE payout upper bound (null for greater) |
| fair_value_prior | num/null | PMF mass over the payout set, prior |
| fair_value | num/null | PMF mass over the payout set, obs-corrected |
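
Because both bounds are inclusive, "PMF mass over the payout set" reduces to a sum over the integer temperatures inside the bracket. The sketch below shows that arithmetic on an encoded PMF; bracket_mass is a hypothetical name, and any rounding the real exporter applies to fair values is not reproduced here.

```python
from typing import Optional


def bracket_mass(pmf: dict, kind: str, lo: Optional[int], hi: Optional[int]) -> float:
    """Sum the PMF mass that falls inside one bracket (bounds are inclusive)."""
    total = 0.0
    for key, prob in pmf.items():
        temp_f = int(key)  # PMF keys are integer °F encoded as strings
        if kind == "less":
            inside = temp_f <= hi          # lo is null for "less"
        elif kind == "greater":
            inside = temp_f >= lo          # hi is null for "greater"
        elif kind == "between":
            inside = lo <= temp_f <= hi
        else:
            raise ValueError(f"unknown bracket type: {kind!r}")
        if inside:
            total += prob
    return total


if __name__ == "__main__":
    pmf = {"70": 0.2, "71": 0.3, "72": 0.4, "73": 0.1}
    print(bracket_mass(pmf, "between", 71, 72))
```

Applying the same function to pmf_prior would give the quantity described for fair_value_prior, with null when pmf_prior itself is null.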

Diagnostics (minimum set)

| field | type | meaning |
|---|---|---|
| p_lock | num | P(realized extreme already dominates) |
| locked | bool | lock declared |
| ess | num/null | effective sample size of the member weights |
| n_members | int | members entering the weighting |
| n_obs | int | observations seen up to the hour |
| sources_used | list | obs sources, e.g. ["synoptic_5min"] |
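
This document does not say how ess is computed. For readers unfamiliar with the term, one common definition is Kish's effective sample size, (Σw)² / Σw², sketched below purely as an illustration; the pipeline may well use a different formula, and the helper name kish_ess is hypothetical.

```python
from typing import Optional, Sequence


def kish_ess(weights: Sequence[float]) -> Optional[float]:
    """Kish effective sample size; None when there are no usable weights.

    Assumption: this is a generic textbook definition, not necessarily the
    one behind the diagnostics.ess field.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        return None  # empty or all-zero weights
    return (total * total) / total_sq


if __name__ == "__main__":
    print(kish_ess([1.0, 1.0, 1.0, 1.0]))  # equal weights: ESS equals the count
    print(kish_ess([1.0, 0.0, 0.0, 0.0]))  # one dominant member: ESS is 1
```

Under this definition ess never exceeds n_members, which makes it a convenient sanity check when reading a bundle.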

Exporter notes

  • One bundle per held-out day; the day must come from the holdout split (dataset/splits.is_held_out_day) — never a training day.
  • Run the production path (stage1_truncate.fair_value_pmf / prior_pmf + kalshi_map-shaped brackets) at each decision hour with available_at honoring the no-leak rule L1, exactly like backtest/runner.py.
  • Keep bundles < ~300 KB: subsample obs to 15 min, drop PMF entries < 1e-6 (already the schema-v1 convention), limit hours to ~12 entries.
  • data/case1.js through case3.js in this directory are REAL exports from scripts/export_cases.py (the original pre-exporter placeholder bundle has been replaced). They combine backfilled GEFS/HRRR-lag/NAM-nest trajectory archives, archived 1-min obs, and the official NWS CLI settlement, replayed through the walk-forward backtest stage-1 path. Historical strike tables are not archived, so brackets are the standard 6-bracket structure reconstructed around the corrected PMF’s median (see each bundle’s meta).
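
The size guidance in the notes above can be approximated with two small helpers. This is a sketch under assumptions: the names are hypothetical, subsampling here keeps only observations whose minute lands on the 15-minute grid (the real exporter may pick the nearest observation instead), and the size is measured on compact JSON without the JS wrapper.

```python
import json


def subsample_obs(obs: list, step_min: int = 15) -> list:
    """Keep obs entries whose "HH:MM" minute is a multiple of step_min."""
    kept = []
    for entry in obs:
        _, minute = entry["t"].split(":")
        if int(minute) % step_min == 0:
            kept.append(entry)
    return kept


def bundle_size_bytes(bundle: dict) -> int:
    """Size of the bundle as compact UTF-8 JSON (wrapper lines excluded)."""
    return len(json.dumps(bundle, separators=(",", ":")).encode("utf-8"))


if __name__ == "__main__":
    one_minute = [{"t": f"00:{m:02d}", "f": 30.0} for m in range(60)]
    thinned = subsample_obs(one_minute)
    print([e["t"] for e in thinned])
    print(bundle_size_bytes({"obs": thinned}) < 300_000)
```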