DAY 026 · Thursday, March 26, 2026

The Great Audit

29 commits, 144 files changed — deep codebase audit of Dota 2 prediction models uncovered 30+ bugs across CLV, staking, training, and grading. 8 new Claude Code slash commands. 156 new test cases.


🐉 YoshiZen Daily Build Log — Thursday, March 26, 2026

Dota 2 Prediction Models — Full Codebase Audit

The big one. Ran a systematic multi-phase audit of the entire prediction pipeline and found a lot of skeletons. 29 commits, almost all fixes and hardening.

Critical Bugs Fixed (Phase 1)

Three bugs that were actively corrupting results:

Handicap canonicalization: canonicalize() wasn't negating the handicap line when swapping teams, so every handicap bet where team_b sorted before team_a had the wrong sign. Added _MARKET_FLIP entries for ft_a/b, team_a/b, yes_a/b, no_a/b.
ElasticNet alpha inversion: train_kills_model.py was mapping LR C=0.005 → alpha=0.005 (weak regularization) instead of alpha=1/C=200 (strong). The production kills model needs a retrain.
BO5 grading: series completion logic used BO3's 2-win threshold for BO5 matches, which need 3 wins. Both _grade_kills_entry and _grade_map_winner_entry are fixed. BO2 is now handled separately.
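
A minimal sketch of the three fixes above, not the repo code. Assumptions: the Bet shape, the team names, the contents of the flip table, and the helper names alpha_from_c and series_complete are all hypothetical; the handicap line is taken to be stated from team_a's perspective; a BO2 is treated as complete once both maps are played.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical flip table: side-specific market keys swap when teams swap.
# The real _MARKET_FLIP contents are not shown in this log.
_MARKET_FLIP = {
    "team_a": "team_b", "team_b": "team_a",
    "ft_a": "ft_b", "ft_b": "ft_a",
    "yes_a": "yes_b", "yes_b": "yes_a",
    "no_a": "no_b", "no_b": "no_a",
}


@dataclass(frozen=True)
class Bet:
    team_a: str
    team_b: str
    market: str
    line: Optional[float] = None  # handicap, from team_a's perspective (assumed)


def canonicalize(bet: Bet) -> Bet:
    """Sort teams alphabetically; on a swap, flip the side and negate the line."""
    if bet.team_a <= bet.team_b:
        return bet
    return replace(
        bet,
        team_a=bet.team_b,
        team_b=bet.team_a,
        market=_MARKET_FLIP.get(bet.market, bet.market),
        line=None if bet.line is None else -bet.line,
    )


def alpha_from_c(c: float) -> float:
    """Map inverse regularization strength C to penalty strength alpha = 1/C."""
    if c <= 0:
        raise ValueError("C must be positive")
    return 1.0 / c


def series_complete(best_of: int, wins_a: int, wins_b: int) -> bool:
    """BO2 ends after two maps; BO3/BO5 end when one side has a majority."""
    if best_of == 2:
        return wins_a + wins_b == 2
    return max(wins_a, wins_b) >= best_of // 2 + 1
```

With these definitions, a bet on "Team Zeta" at -1.5 against "Team Alpha" canonicalizes to the same side listed as team_b with a line of +1.5, and a BO5 at 2-1 is still in progress.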

Training-Serving Skew (Phase 2)

Found three places where inference or evaluation saw different data than training:

Series clutch features: predict.py hardcoded comeback_rate and close_out_rate to 0.5 at inference while training used real historical data. Added a get_team_series_history() DB query (+132 lines).
Elo duration filter: compute_elo.py included sub-5-minute matches; build_features.py excluded them. One-line fix, big impact on Elo momentum features.
evaluate_model.py features: the script wasn't excluding the dwin_* and FT/luck columns that train_model.py drops. Added WIN_EXCLUDE_COLS alignment plus a deprecation warning pointing to backtest_walkforward.py.
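
One common way to keep this kind of skew from coming back is to define each filter once and import it from every script that needs it. The sketch below shows that idea under stated assumptions: the constant and function names are hypothetical, "ft_luck" is a placeholder column name (the log only says FT/luck columns are dropped), and the 5-minute cutoff is expressed in seconds.

```python
from typing import Iterable, List

# Hypothetical shared definitions, imported by training, evaluation, and serving.
MIN_DURATION_SECONDS = 5 * 60  # sub-5-minute matches are excluded everywhere

# Assumed names: only the dwin_ prefix comes from the log.
EXCLUDE_PREFIXES = ("dwin_",)
EXCLUDE_COLS = frozenset({"ft_luck"})  # placeholder for the FT/luck columns


def keep_match(duration_seconds: float) -> bool:
    """True if the match is long enough to count for Elo and for features."""
    return duration_seconds >= MIN_DURATION_SECONDS


def feature_columns(columns: Iterable[str]) -> List[str]:
    """Drop excluded columns so every script sees the same feature set."""
    return [
        c for c in columns
        if c not in EXCLUDE_COLS and not c.startswith(EXCLUDE_PREFIXES)
    ]
```

If compute_elo.py and build_features.py both call something like keep_match, and train_model.py and evaluate_model.py both call something like feature_columns, a change to either rule lands in all code paths at once.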

20+ Audit Fixes (Phases 3–6)

Across CLV, config, training, scrapers, and operations:

CLV: Sign was flipped (now fair_implied - entry_implied), closing query bounded to 48h, kills entries get sharp CLV backfill, map_winner + maps_handicap columns added to summary
Config: AEST now uses ZoneInfo("Australia/Sydney") for DST handling, daily report applies min_edge filter to best-bet selection
Training: Residual sigma fitted on holdout (not out-of-fold predictions) to match production, Optuna LR path includes penalty="elasticnet"
Scrapers: Picklebet returns None instead of guessing HOME/team_a, dedup key includes team field, OddsPAPI sleep reduced 3s→1s (retry handles 429s)
Operations: File-level locking on ledger writes, 14-day auto-expiration for pending bets, probability clamped to [0.01, 0.99], 90-day data directory cleanup, max_edge enforcement (35% cap), bankroll reconciliation check
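
A few of these rules are easy to show in code. This is a rough sketch, not the pipeline's implementation. Assumptions: implied probability is 1 / decimal odds, edge is expected value per unit stake (p × odds − 1), min_edge is a caller-supplied placeholder, and all function names are hypothetical. The sign convention, the [0.01, 0.99] clamp, the 35% cap, and the 14-day expiry come from the list above.

```python
from datetime import datetime, timedelta, timezone

MAX_EDGE = 0.35                   # 35% cap
PENDING_TTL = timedelta(days=14)  # pending bets expire after 14 days


def implied(decimal_odds: float) -> float:
    """Implied probability of a decimal price (assumed convention)."""
    return 1.0 / decimal_odds


def clv(entry_odds: float, fair_closing_prob: float) -> float:
    """fair_implied - entry_implied: positive means the entry beat the close."""
    return fair_closing_prob - implied(entry_odds)


def clamp_prob(p: float, lo: float = 0.01, hi: float = 0.99) -> float:
    """Keep model probabilities away from 0 and 1."""
    return min(hi, max(lo, p))


def edge_ok(model_prob: float, odds: float, min_edge: float,
            max_edge: float = MAX_EDGE) -> bool:
    """Accept a bet only if its edge sits between min_edge and max_edge."""
    edge = clamp_prob(model_prob) * odds - 1.0
    return min_edge <= edge <= max_edge


def is_expired(placed_at: datetime, now: datetime) -> bool:
    """True once a pending bet is older than the TTL."""
    return now - placed_at > PENDING_TTL
```

For example, a bet taken at 2.0 whose fair closing probability is 0.55 has a CLV of about +0.05, while a model probability of 0.80 at the same price gives an edge of 0.60 and is rejected by the cap.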

New Tests

test_canonicalize.py — 72 lines covering team swap + handicap negation
test_grading_bo5.py — 84 lines covering BO2/BO3/BO5 completion logic
test_series_math.py — 22 unit tests for BO2/BO3/BO5 edge cases

8 New Claude Code Slash Commands

Built a full operator toolkit as .claude/commands/ skills (893 lines total):

/health-check — DB counts, model freshness, scrape status, ledger integrity
/team-profile — Elo/Glicko, form, roster, tier, H2H for any team
/clv-report — CLV analysis by bet type, book, tier, and month
/backtest — Walk-forward backtest with golden params + manifest comparison
/train-model — Guided model training with validation workflow
/odds-compare — Cross-book odds comparison for a specific matchup
/tournament-sim — Monte Carlo tournament simulation
/grading-report — Pending bet analysis, expiration warnings, void patterns

Also renamed /help → /list to avoid conflicting with Claude Code's built-in /help.

Other

Daily git backup committed (6ea5143)
pickmy.ai tool database freshness update (cadd1ce)
Model probabilities now output from daily scan pipeline (c427375)
New ensemble models promoted: ensemble_2026-03-25 + kills_ensemble_2026-03-25

What Didn't Happen

No website feature work
No newsletter published
Kills model not retrained yet (blocked on the alpha inversion fix shipping first)

Key stat: 61,748 insertions / 10,088 deletions across 144 files in 29 commits — the single biggest code quality day in the project's history. The audit surfaced bugs that were silently corrupting handicap bets, mis-regularizing the kills model, and creating training-serving skew in at least 3 feature groups.