# Projections M3 Milestone 1 Ship Summary

Milestone 1 is now the plumbing/reset layer for the Mithrandir betting stack: production pipelines are schedulable, schema-defensive, health-checked, and pared back to the live surfaces we actually want to carry forward.

## Ladder Summary

| Step | Outcome | Primary artifact / report |
|---|---|---|
| Chadwick + MLB Stats API client | Daily identity/register refresh and defensive MLB API access are now standardized instead of ad hoc | `scripts/shared/fetch_chadwick_register.py`, `src/pitcher_card_engine/data/mlb_stats_api/client.py`, `data/historical/chadwick/register_*.parquet` |
| Savant extensions + Statcast diff audit | New Savant leaderboard pulls landed and the historical pitch-store gap is now measured instead of implicit | `scripts/shared/fetch_savant_leaderboards.py`, `data/historical/savant_leaderboards/*`, `data/historical/statcast_backfill_diff/*.parquet` |
| Scheduler + health check | Core production jobs now run through one registry with SQLite-backed state, row-count probes, and alert hooks | `scripts/ops/schedule.py`, `scripts/ops/health_check.py`, `scripts/ops/health_check_thresholds.yml`, `outputs/ops/scheduler_cycle_*.json` |
| Kill list execution | Legacy or misleading betting surfaces were archived, reframed, or guarded instead of quietly lingering | `_archive/clv_fanduel_legacy/`, `_archive/home_run_geometry_variants/`, `_archive/nrfi_variants/`, `_archive/manual_pipelines/` |
| Verification + ship summary | Forced scheduler cycle, SMTP dry-run behavior, active entrypoint reachability, and kill-list regression checks are now documented | `outputs/ops/scheduler_cycle_20260422T011828.json`, this document |

## Commits Shipped

| Commit | Description |
|---|---|
| `bcc8e99` | Add Chadwick register refresh + defensive MLB Stats API client |
| `1212c85` | Add Savant leaderboard pulls + Statcast historical diff audit |
| `086e92a` | Add scheduler, health checks, and SMTP alert wiring |
| `50ab835` | Soft-kill Combo Analyzer edge language and reframe as fair-price calculator |
| `8d992e3` | Audit legacy CLV dashboard and document archive stance |
| `d51be2c` | Archive surplus HR geometry variants and keep the best measured lane |
| `accf426` | Guard stale pre-2024 HR artifacts at loader time |
| `bed337b` | Consolidate NRFI variants to the baseline survivor |
| `38263ee` | Sunset unscheduled manual pipelines into archive |

## File Movement Snapshot

- New files added since `c166ee3`: `72`
- Files archived via rename into `_archive/`: `65`
- Existing files modified in-place: `10`

Archive buckets created in this milestone:

- `_archive/clv_fanduel_legacy/`
- `_archive/home_run_geometry_variants/`
- `_archive/nrfi_variants/`
- `_archive/manual_pipelines/`

## Scheduled Task Registry

The active scheduler registry now carries 11 production tasks:

| Task | Cadence |
|---|---|
| `fetch_chadwick_register` | Daily at `04:15` |
| `fetch_savant_leaderboards` | Daily at `04:25` |
| `run_season_projection_pipeline` | Daily at `04:45` |
| `update_scorecard` | Daily at `05:15` |
| `fetch_daily_odds_fanduel` | Every 2 hours at `:05` |
| `fetch_strikeout_props_sportsgameodds` | Every 2 hours at `:10` |
| `run_daily_sportsbook_ingestion` | Every 2 hours at `:20` |
| `run_home_run_daily_production` | Daily at `09:00` |
| `run_total_base_daily_production` | Daily at `09:10` |
| `run_strikeout_edge_refresh` | Daily at `09:20` |
| `run_daily_mithrandir` | Daily at `10:00` |

Cadence principle:

- Sportsbook fetch and ingestion jobs run every two hours because betting lines move faster than projection artifacts change.
- Reference pulls and season-level products run daily.
- Product-facing boards run after the upstream fetch/ingestion lane.
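
The cadence rules above can be sketched as a small registry plus a due-check. This is a minimal illustration, not the contents of `scripts/ops/schedule.py`: `Cadence`, `REGISTRY`, `latest_slot`, and `is_due` are hypothetical names, and the sketch assumes every-2-hours slots fall on even hours, which the table does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Cadence:
    """Hypothetical cadence spec: daily at HH:MM, or every N hours at :MM."""
    minute: int
    hour: Optional[int] = None         # set for daily jobs
    every_hours: Optional[int] = None  # set for interval jobs


# Three rows copied from the registry table; the dict layout is illustrative.
REGISTRY = {
    "fetch_chadwick_register": Cadence(hour=4, minute=15),
    "fetch_daily_odds_fanduel": Cadence(every_hours=2, minute=5),
    "run_daily_mithrandir": Cadence(hour=10, minute=0),
}


def latest_slot(cadence: Cadence, now: datetime) -> datetime:
    """Most recent scheduled time at or before `now`."""
    if cadence.every_hours is not None:
        # Assumption: interval slots fall on hours divisible by the interval.
        hour = now.hour - now.hour % cadence.every_hours
        slot = now.replace(hour=hour, minute=cadence.minute, second=0, microsecond=0)
        step = timedelta(hours=cadence.every_hours)
    else:
        slot = now.replace(hour=cadence.hour, minute=cadence.minute, second=0, microsecond=0)
        step = timedelta(days=1)
    return slot if slot <= now else slot - step


def is_due(cadence: Cadence, last_run: Optional[datetime], now: datetime) -> bool:
    """A task is due when its latest slot has not been run yet."""
    return last_run is None or last_run < latest_slot(cadence, now)
```

With `--force`, a scheduler built this way would simply skip the `is_due` gate and run every registered task.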

## Health Check Thresholds

Current health thresholds are intentionally simple and row-count oriented:

| Task | Max age | Minimum metric |
|---|---|---|
| `fetch_chadwick_register` | 48h | 100,000 rows |
| `fetch_savant_leaderboards` | 48h | 1,000 rows |
| `fetch_daily_odds_fanduel` | 6h | 10 rows |
| `fetch_strikeout_props_sportsgameodds` | 6h | 1 row |
| `run_daily_sportsbook_ingestion` | 6h | 10 rows |
| `run_season_projection_pipeline` | 48h | 1,000 combined hitter/pitcher rows |
| `update_scorecard` | 48h | 1 row |
| `run_home_run_daily_production` | 24h | 0 rows (zero-row output allowed) |
| `run_total_base_daily_production` | 24h | 0 rows (zero-row output allowed) |
| `run_strikeout_edge_refresh` | 24h | 0 rows (zero-row output allowed) |
| `run_daily_mithrandir` | 24h | 1 combined card/status metric |

The two known threshold misses reflect carried-over data state, not scheduler regressions:

- `fetch_strikeout_props_sportsgameodds` failed the minimum metric because the forced cycle found `0` rows in `outputs/raw/sportsbook/strikeouts/2026-04-22/lines.csv`.
- `run_daily_mithrandir` failed the minimum metric because the probe returned `0` combined cards from `outputs/2026-04-22/mithrandir_daily_status.json`.
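
To make the two misses concrete, here is a minimal sketch of a max-age plus minimum-metric check. It is not the actual `scripts/ops/health_check.py` logic: the `Threshold` dataclass, the `THRESHOLDS` dict, and `evaluate` are hypothetical, and only the numeric values come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Threshold:
    max_age: timedelta
    min_metric: int


# Values copied from the threshold table; the structure is an assumption.
THRESHOLDS = {
    "fetch_strikeout_props_sportsgameodds": Threshold(timedelta(hours=6), 1),
    "run_home_run_daily_production": Threshold(timedelta(hours=24), 0),
}


def evaluate(task: str, last_success: datetime, metric: int, now: datetime) -> list[str]:
    """Return failure reasons; an empty list means the task is healthy."""
    threshold = THRESHOLDS[task]
    failures = []
    if now - last_success > threshold.max_age:
        failures.append("stale")
    if metric < threshold.min_metric:
        failures.append("below_minimum")
    return failures
```

Under this sketch a fresh run with `0` rows fails a task whose minimum is `1` but passes a task whose minimum is `0`, which matches the pattern in the misses listed above.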

## Forced Verification Cycle

Forced end-to-end scheduler run:

- Command: `python scripts/ops/schedule.py --force`
- Cycle artifact: `outputs/ops/scheduler_cycle_20260422T011828.json`
- Logs root: `outputs/ops/scheduler_logs/`

Per-task result from the forced cycle:

| Task | Result | Note |
|---|---|---|
| `fetch_chadwick_register` | Pass | 539,420-row parquet refreshed |
| `fetch_savant_leaderboards` | Pass | 3,654 total leaderboard rows |
| `run_season_projection_pipeline` | Pass | 1,209 combined projection rows |
| `update_scorecard` | Pass | 544 live scorecard rows |
| `fetch_daily_odds_fanduel` | Pass | 230 rows |
| `fetch_strikeout_props_sportsgameodds` | Pass at task layer | health threshold miss because metric was `0` |
| `run_daily_sportsbook_ingestion` | Pass | 88 normalized coverage rows |
| `run_home_run_daily_production` | Pass | task succeeded, zero-row outcome allowed |
| `run_total_base_daily_production` | Pass | task succeeded, zero-row outcome allowed |
| `run_strikeout_edge_refresh` | Pass | task succeeded, zero-row outcome allowed |
| `run_daily_mithrandir` | Pass at task layer | health threshold miss because probe metric was `0` |

Health-check result from the same cycle:

- Health check executed automatically after task execution
- Exit code: non-zero, as expected from the two threshold misses above
- Scheduler remained stable and still wrote the cycle JSON
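
The behavior above (cycle JSON still written, non-zero exit on health misses) can be sketched as follows. The function name, the JSON field names, and the example file name are hypothetical; the real cycle artifact schema is not reproduced here.

```python
import json
import tempfile
from pathlib import Path


def finish_cycle(task_results: dict, health_failures: list, out_path: Path) -> int:
    """Write the cycle record unconditionally, then report health via the exit code."""
    record = {
        "tasks": task_results,               # hypothetical field name
        "health_failures": health_failures,  # hypothetical field name
    }
    out_path.write_text(json.dumps(record, indent=2))
    # Non-zero only after the artifact is safely on disk.
    return 1 if health_failures else 0


# Usage against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cycle_path = Path(tmp) / "scheduler_cycle_example.json"
    exit_code = finish_cycle(
        {"fetch_daily_odds_fanduel": "pass"},
        ["run_daily_mithrandir: below_minimum"],
        cycle_path,
    )
    saved = json.loads(cycle_path.read_text())
```

Writing the record before computing the exit code is the design point: a health miss should never cost the operator the cycle artifact.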

## SMTP Status

SMTP alerting is wired but intentionally dormant.

Dry-run verification completed:

- The alert code path was exercised during the forced health-failure cycle.
- With no SMTP env vars configured, the cycle JSON recorded:
  - `smtp_sent: false`
  - `smtp_message: "SMTP not configured, skipping alert."`
- No crash occurred and the scheduler completed normally.

Env vars to activate SMTP later:

- `MITHRANDIR_SMTP_HOST`
- `MITHRANDIR_SMTP_PORT`
- `MITHRANDIR_SMTP_USERNAME`
- `MITHRANDIR_SMTP_PASSWORD`
- `MITHRANDIR_ALERT_FROM`
- `MITHRANDIR_ALERT_TO`
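
A minimal sketch of the dormant-by-default gate, assuming all six variables must be set before anything is sent. `alert_or_skip` and the injected `send` callable are hypothetical, and the `"Alert sent."` message is a placeholder; only the env var names and the skip message come from this document.

```python
import os
from typing import Callable, Mapping, Optional

REQUIRED_SMTP_VARS = (
    "MITHRANDIR_SMTP_HOST",
    "MITHRANDIR_SMTP_PORT",
    "MITHRANDIR_SMTP_USERNAME",
    "MITHRANDIR_SMTP_PASSWORD",
    "MITHRANDIR_ALERT_FROM",
    "MITHRANDIR_ALERT_TO",
)


def alert_or_skip(
    body: str,
    send: Callable[[str], None],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Send an alert only when every SMTP variable is set; otherwise record a skip."""
    env = os.environ if env is None else env
    if any(not env.get(name) for name in REQUIRED_SMTP_VARS):
        return {"smtp_sent": False, "smtp_message": "SMTP not configured, skipping alert."}
    send(body)  # hypothetical sender, e.g. a thin wrapper around smtplib
    return {"smtp_sent": True, "smtp_message": "Alert sent."}  # placeholder message
```

Passing `env` and `send` explicitly keeps the dry-run path testable without touching the real environment or a mail server.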

## scripts/shared Verification Map

The retained active `scripts/shared` entrypoints now map back to the scheduler like this:

| Active entrypoint | Invoked by scheduler task | Reachability |
|---|---|---|
| `run_daily_mithrandir.py` | `run_daily_mithrandir` | Direct |
| `run_abs_daily.py` | `run_daily_mithrandir` | Transitive via `_run_abs(...)` |
| `run_fielding_daily.py` | `run_daily_mithrandir` | Transitive via `_run_fielding(...)` |
| `run_running_game_daily.py` | `run_daily_mithrandir` | Transitive via `_run_running_game(...)` |
| `run_postgame_research_evaluations.py` | `run_daily_mithrandir` | Transitive via `_run_postgame_research_evaluations(...)` |
| `run_optimization_lab.py` | `run_daily_mithrandir` | Transitive via `_run_optimization_lab(...)` |
| `project_pitcher_strikeouts.py` | `run_daily_mithrandir`, `run_strikeout_edge_refresh` | Direct subprocess dependency in both |
| `project_strikeout_props.py` | `run_daily_mithrandir`, `run_strikeout_edge_refresh` | Direct subprocess dependency in both |
| `grade_home_run_edges.py` | `run_daily_mithrandir` | Transitive via `_run_home_run_analytics(...)` |
| `run_mithrandir_health_check.py` | `run_daily_mithrandir` | Transitive via `_run_health_check(...)` |

Verification result:

- All 10 retained entrypoints are reachable from the scheduler registry, directly or transitively.
- No other retained `scripts/shared` entrypoint failed the reachability check.
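
The reachability check amounts to a graph traversal from the scheduler tasks. The sketch below is illustrative only: the `EDGES` encoding and the `task:` prefix are assumptions, and it transcribes just a few rows of the table rather than the full dependency map.

```python
from collections import deque

# Edges transcribed from a few table rows; the graph encoding is an assumption.
EDGES = {
    "task:run_daily_mithrandir": ["run_daily_mithrandir.py", "project_pitcher_strikeouts.py"],
    "task:run_strikeout_edge_refresh": ["project_pitcher_strikeouts.py", "project_strikeout_props.py"],
    "run_daily_mithrandir.py": ["run_abs_daily.py", "grade_home_run_edges.py"],
}


def reachable(roots, edges):
    """Breadth-first search: everything invoked directly or transitively from the roots."""
    seen = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in edges.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def unreachable(entrypoints, roots, edges):
    """Entrypoints that no scheduler task reaches; an empty list means the check passes."""
    return sorted(set(entrypoints) - reachable(roots, edges))
```

An entrypoint that shows up in `unreachable(...)` would be a candidate for either scheduling or archiving.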

## Kill-List Regression Check

Regression sweep result after the final cleanup pass:

- No active `src/` or `scripts/` file still imports archived HR geometry variants.
- No active `src/` or `scripts/` file still imports archived NRFI process/tree variants.
- No active `src/` or `scripts/` file still imports archived manual pipeline wrappers.
- Pre-2024 HR artifact guard remains active in `src/pitcher_card_engine/domains/betting/home_run_edges/confidence.py`.

One cleanup was required during verification:

- Several HR report scripts still imported helper functions from archived tournament runners.
- Those helpers were moved into `scripts/shared/hr_model_support.py`, and the active reports now import from that live support file instead.
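
A regression sweep of this kind can be approximated with the standard-library `ast` module. This is a sketch under assumptions, not the sweep that was actually run: `ARCHIVED_PACKAGES` and `archived_imports` are hypothetical names, and matching on dotted path segments is a simplification.

```python
import ast

# Hypothetical package names standing in for the archive buckets.
ARCHIVED_PACKAGES = ("home_run_geometry_variants", "nrfi_variants", "manual_pipelines")


def archived_imports(source: str, packages=ARCHIVED_PACKAGES) -> list[str]:
    """Return imported module names in `source` that include an archived package segment."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        hits.extend(n for n in names if any(part in packages for part in n.split(".")))
    return hits
```

Running this over every active `src/` and `scripts/` file and asserting an empty result would turn the bullet list above into a repeatable check. It only sees static `import` statements, so dynamic imports and subprocess calls would need a separate pass.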

## Kill List Outcome Snapshot

- Combo Analyzer: kept functional, but reframed to honest correlation + fair-price language rather than edge-finding language.
- Legacy CLV dashboard: no active FanDuel-closing dashboard survived audit; archive location documented for future rebuild discipline.
- HR geometry variants: only the best measured geometry lane remains active.
- Pre-2024 HR artifacts: blocked at load time rather than silently used.
- NRFI variants: baseline survivor retained, process/tree archived.
- Manual pipeline surface: reduced to scheduler-backed or scheduler-dependent entrypoints only.

## Reliability Read

What we proved in this milestone window:

- One overnight scheduler lane was already exercised in Sub-step 3.
- One full forced cycle was rerun in Sub-step 5 and completed end-to-end.
- Health-check execution, cycle JSON writing, scheduler logs, and SMTP dry-run handling all behaved correctly.

What we do not claim yet:

- We have not claimed a completed 7-day unattended soak.
- We have not activated live SMTP delivery.

Operationally, this is ready for the next unattended soak window, but verification so far covers only one overnight lane and one forced cycle.

## Carried Debts

- `883K` upstream-only Statcast pitches from 2023 through 2025 represent roughly a `42%` coverage gap versus the local pitch store. Investigate before Milestone 4, which depends on complete pitch-level data. Reference artifact: `data/historical/statcast_backfill_diff/hf_vs_local_pitch_events_2023_2025_coverage_gaps.parquet`.
- `35,731` Statcast field mismatches, dominated by retroactive `FF -> SI` pitch-type reclassifications, still need an explicit upstream-corrections merge pass rather than silent overwrite.
- Savant ABS `challengeType=batter` vs `challengeType=umpire` returned identical row counts/schema in this pull pass. Verify the data-mode assumption before Milestone 4 uses ABS features materially.
- The pre-2024 HR artifact guard currently falls back to file mtime inference because the legacy artifacts did not ship with committed `trained_on_date` metadata. Milestone 4 retraining must embed real artifact metadata so the guard becomes deterministic instead of heuristic.
- SMTP remains dormant by design. Activation still requires the six-env-var checklist above plus a deliberate go-live decision.
- `fetch_strikeout_props_sportsgameodds` and `run_daily_mithrandir` still need threshold tuning or probe refinement because zero-row days can be operationally real but currently fail the health gate.
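
The difference between the heuristic guard and the deterministic guard described above can be sketched like this. It is not the code in `confidence.py`: the function names are hypothetical, and the sketch assumes `trained_on_date` would be stored as an ISO `YYYY-MM-DD` string and that the caller passes the file's `st_mtime`.

```python
from datetime import date, datetime, timezone

CUTOFF = date(2024, 1, 1)  # "pre-2024" boundary from the text


def artifact_date(metadata: dict, mtime_epoch: float) -> tuple[date, str]:
    """Return (date, source): real metadata when present, else file mtime as a heuristic."""
    trained_on = metadata.get("trained_on_date")
    if trained_on:
        return date.fromisoformat(trained_on), "metadata"
    # Fallback: mtime can change on copy or checkout, so this is only a guess.
    return datetime.fromtimestamp(mtime_epoch, tz=timezone.utc).date(), "mtime"


def is_blocked(metadata: dict, mtime_epoch: float) -> bool:
    """Block artifacts dated before the cutoff."""
    when, _source = artifact_date(metadata, mtime_epoch)
    return when < CUTOFF
```

Returning the source alongside the date lets a loader log whenever it had to fall back to `mtime`, which is the case Milestone 4 metadata is meant to eliminate.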

## What Milestone 1 Explicitly Does Not Do

- No new betting models
- No new betting products
- No sharp/consensus CLV rebuild yet
- No upstream Statcast correction merge yet
- No paid feeds or paid APIs
- No full unattended 7-day certification yet
- No live email alerting activation yet

## Runbook

Core operator commands:

```powershell
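# Make both the src tree and the repo root importable
$env:PYTHONPATH='C:\Projects\mithrandir-metrics\src;C:\Projects\mithrandir-metrics'
# Force a full end-to-end scheduler cycle (the health check runs automatically afterwards)
python scripts/ops/schedule.py --force
# Run the health check on its own
python scripts/ops/health_check.py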
```

## Honest Read On Ship State

M3 Milestone 1 did the right kind of work: it reduced silent operational risk, tightened what counts as “live,” and made the betting stack much less dependent on memory and manual invocation. The system is not done, and the longer unattended soak plus a few threshold/probe refinements still matter, but the foundation is now disciplined enough to support later modeling work without carrying as much hidden plumbing debt.
