# SEAGER+ Investigation Ship Summary - 2026-05-19

## Commit Ladder

| Step | Commit | Summary |
| --- | --- | --- |
| Step 1 | `729af27` | Methodology audit identifying formula/denominator divergence from public SEAGER |
| Steps 2-4 | Pending commit | Alternative target diagnostic, one-shot rate-based formula fix, decision, and ship summary |

## Why This Investigation Happened

SEAGER+ v1.0 failed the original promotion gate: its correlation with next-year ISO was `r = 0.023`, far below both the pre-registered `0.300` threshold and Robert Orr's reported SEAGER signal (`r = 0.41`). The gap was too large to accept as a null result without first checking whether Mithrandir had implemented the same metric.

## Step 1: Methodology Audit Finding

The audit found that Mithrandir v1.0 did not implement the public SEAGER formula. Instead, it used a per-pitch run-value average:

```text
selection_value = pitch_quality if swing else 0
hittable_taken_cost = pitch_quality if hittable and take else 0
seager_raw = mean(selection_value - hittable_taken_cost)
```

The public SEAGER description is rate-based:

```text
Selection Tendency = good takes / non-hittable pitch opportunities
Hittable Pitches Taken = hittable takes / total takes
SEAGER = Selection Tendency - Hittable Pitches Taken
```

This formula and denominator mismatch was the leading candidate explanation for the gap. Missing called-strike probability weighting is a real but separate methodology issue, deferred to future work.
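
To make the difference between the two formulas concrete, here is a minimal sketch of both on the same pitch list. The `Pitch` fields and function names are hypothetical and are not taken from `train_seager_plus.py`. The sketch also assumes that a "good take" means taking a non-hittable pitch, which the audit text does not spell out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Pitch:
    hittable: bool        # hypothetical flag: pitch classified as hittable
    swing: bool           # True if the batter swung, False if he took
    pitch_quality: float  # hypothetical run-value proxy used by v1.0


def seager_v1_per_pitch(pitches: Iterable[Pitch]) -> Optional[float]:
    """v1.0 behaviour: mean of a per-pitch run-value difference."""
    pitches = list(pitches)
    if not pitches:
        return None
    total = 0.0
    for p in pitches:
        selection_value = p.pitch_quality if p.swing else 0.0
        hittable_taken_cost = (
            p.pitch_quality if (p.hittable and not p.swing) else 0.0
        )
        total += selection_value - hittable_taken_cost
    return total / len(pitches)


def seager_rate_based(pitches: Iterable[Pitch]) -> Optional[float]:
    """Rate-based form: Selection Tendency - Hittable Pitches Taken."""
    pitches = list(pitches)
    non_hittable = [p for p in pitches if not p.hittable]
    takes = [p for p in pitches if not p.swing]
    if not non_hittable or not takes:
        return None  # a rate is undefined without opportunities
    # Assumption: a "good take" is a take on a non-hittable pitch.
    good_takes = sum(1 for p in non_hittable if not p.swing)
    hittable_takes = sum(1 for p in takes if p.hittable)
    selection_tendency = good_takes / len(non_hittable)
    hittable_pitches_taken = hittable_takes / len(takes)
    return selection_tendency - hittable_pitches_taken
```

The two functions use different denominators (all pitches versus non-hittable opportunities and total takes), so they can disagree in both scale and sign for the same hitter.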

Detailed audit:

`docs/codex_context/seager_plus_methodology_audit_2026_05_19.md`

## Step 2: Alternative Target Diagnostic

Before changing the formula, the current v1.0 metric was checked against multiple next-year targets.

| Target | v1.0 r |
| --- | --- |
| Next-year wOBA | `0.088` |
| Next-year BB% | `0.340` |
| Next-year K% | `0.078` |
| Next-year walks/PA | `0.340` |
| Next-year ISO | `0.023` |
| Next-year chase rate | `-0.476` |

Interpretation: v1.0 was picking up plate-discipline/selectivity persistence, especially chase and walks, but it was not predicting the ISO-facing skill Orr's SEAGER is meant to capture.
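
A diagnostic of this shape can be sketched as one Pearson correlation per target, computed over the same hitter pairs. This is an illustrative stdlib-only version; `pearson_r` and `target_diagnostic` are hypothetical names and do not reflect how `validate_seager_plus.py` is actually written.

```python
import math
from typing import Dict, Optional, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Pearson correlation; None when it is undefined."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        return None
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def target_diagnostic(
    metric: Sequence[float],
    targets: Dict[str, Sequence[float]],
) -> Dict[str, Optional[float]]:
    """Correlate one current-year metric with each next-year target."""
    return {name: pearson_r(metric, values) for name, values in targets.items()}
```

Each target series must be aligned to the same hitters, in the same order, as the metric values.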

## Step 3: One-Shot Formula Fix

Applied only the audited formula correction:

- Kept the existing count/location `swing_xRV` and `take_xRV` table.
- Did not add a called-strike probability model.
- Did not tune weights.
- Removed bat-tracking quality from the primary score and retained it as a diagnostic field only.
- Recomputed SEAGER as rate-based `Selection Tendency - Hittable Pitches Taken`.

Corrected validation result:

| Gate | Corrected Result |
| --- | --- |
| Qualified 2024-to-2025 hitter pairs | `272` |
| SEAGER+ r to next-year ISO | `0.274` |
| O-Swing% r to next-year ISO | `-0.094` |
| Z-Swing% r to next-year ISO | `0.146` |
| Next-year wOBA r | `0.271` |
| Next-year BB% r | `0.648` |
| Next-year chase-rate r | `-0.692` |
| RMSE delta vs chase baseline | `-0.00027` |
| Stabilization N | Did not reach `r >= 0.707` by `250 PA` |
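
For context on the stabilization row: `r = 0.707` is approximately the square root of `0.5`, so the threshold marks the sample size at which `r^2` reaches about one half. The sketch below shows how a stabilization point could be read off a reliability curve. The `(PA, r)` pairs are a hypothetical input, and how the validation script estimates `r` at each PA level is not described in this summary.

```python
import math
from typing import Iterable, Optional, Tuple

# r = sqrt(0.5) is about 0.707, i.e. r^2 = 0.5.
STABILIZATION_R = math.sqrt(0.5)


def stabilization_point(
    curve: Iterable[Tuple[int, float]],
    threshold: float = STABILIZATION_R,
) -> Optional[int]:
    """Return the smallest PA whose reliability r meets the threshold.

    `curve` is a hypothetical list of (plate_appearances, r) pairs.
    Returns None when no sampled PA level reaches the threshold.
    """
    for pa, r in sorted(curve):
        if r >= threshold:
            return pa
    return None
```

A `None` result corresponds to the "did not reach" outcome reported in the table: the threshold was not met at any PA level that was evaluated.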

Component diagnostics:

| Component | r to next-year ISO |
| --- | --- |
| `seager_raw` | `0.274` |
| `correct_decision_rate` | `0.232` |
| `selection_tendency` | `0.085` |
| `hittable_pitches_taken` | `-0.334` |
| `bat_tracking_quality` | `0.078` |

## Decision

Original mechanical recommendation before user decision: **HUMAN-JUDGMENT-NEEDED**

User decision after review: **PROMOTE as SEAGER+ v1.0, plate discipline index**

Why the reframing is honest:

- The formula fix recovered the headline ISO signal from `0.023` to `0.274`, a large and meaningful methodology correction.
- Corrected SEAGER+ beats raw O-Swing/Z-Swing baselines on next-year ISO.
- It also strongly predicts next-year walk rate and chase-rate persistence.
- It still does not clear the original `r >= 0.300` threshold.
- It does not satisfy the original stabilization target by `200 PA`.

Promoted positioning: SEAGER+ remains the name, honoring Orr's published framework, but the public subtitle is **plate discipline index**. The original ISO-focused threshold miss is documented on the methodology page and metric landing page rather than hidden.

## What Changed In Code

- `scripts/shared/train_seager_plus.py`
  - Restored rate-based `Selection Tendency - Hittable Pitches Taken`.
  - Kept bat-tracking quality as a diagnostic, not part of the primary score.
  - Updated formula notes in artifact metadata.

- `scripts/shared/validate_seager_plus.py`
  - Added alternative target diagnostics.
  - Added human-judgment band logic for `r = 0.20-0.30`.
  - Updated PA-level stabilization to use corrected rate-based SEAGER opportunities.

- `tests/test_seager_plus_training.py`
  - Updated aggregation test to validate rate-based SEAGER behavior.
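
The human-judgment band can be illustrated with a small classifier over the headline ISO correlation. This is a sketch only: the function name, the `FAIL` label, and the inclusive/exclusive boundary handling are assumptions, and the real gate in `validate_seager_plus.py` may also weigh other criteria such as the RMSE delta and stabilization.

```python
PROMOTE_THRESHOLD = 0.300  # pre-registered ISO correlation gate
JUDGMENT_FLOOR = 0.20      # lower edge of the human-judgment band


def gate_recommendation(r_next_year_iso: float) -> str:
    """Map the ISO correlation to a mechanical recommendation.

    Assumed boundaries: r >= 0.300 promotes, 0.20 <= r < 0.300 is
    deferred to a human, and anything lower fails.
    """
    if r_next_year_iso >= PROMOTE_THRESHOLD:
        return "PROMOTE"
    if r_next_year_iso >= JUDGMENT_FLOOR:
        return "HUMAN-JUDGMENT-NEEDED"
    return "FAIL"  # hypothetical label for results below the band
```

Under these assumptions, the corrected `r = 0.274` lands in the judgment band, while the original `r = 0.023` falls below it.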

## Generated Research Output

Validation report:

`outputs/research/metric_validation_seager_plus_20260519.md`

Latest corrected research artifact:

`outputs/models/mithrandir_plus/seager_plus_v1_0_20260519T015429Z.joblib`

These are generated artifacts and are not committed.

## Carried Debts

- Build a called-strike probability model and use it to weight edge-pitch swing/take values, matching the public SEAGER methodology more closely.
- Re-run the corrected formula after adding called-strike weighting; this is the likely next step if the user wants to close the remaining gap from `0.274` to `0.300+`.
- Decide whether to keep the current scheduler hooks for SEAGER+ disabled/unused until promotion, or leave them inert with no UI surfacing.
- Phase 1 public UI wiring can include SEAGER+ as a plate discipline index, not as a primary power predictor.

## Explicit Non-Goals

- No SEAGER+ power-predictor framing in this investigation.
- No called-strike model built in this pass.
- No further formula/weight tuning beyond the one audited correction.
- No push to GitHub.