DAY 026

The Great Audit

29 commits, 144 files changed — deep codebase audit of Dota 2 prediction models uncovered 30+ bugs across CLV, staking, training, and grading. 8 new Claude Code slash commands. 156 new test cases.

yoshi@mac-mini — build-log-day-026

🐉 YoshiZen Daily Build Log — Thursday, March 26, 2026

Dota 2 Prediction Models — Full Codebase Audit

The big one. Ran a systematic multi-phase audit of the entire prediction pipeline and found a lot of skeletons. 29 commits, almost all fixes and hardening.

Critical Bugs Fixed (Phase 1)

Three bugs that were actively corrupting results:

  • Handicap canonicalization: canonicalize() wasn't negating the handicap line when swapping teams. Every handicap bet where team_b sorted before team_a had the wrong sign. Added _MARKET_FLIP entries for ft_a/b, team_a/b, yes_a/b, no_a/b
  • ElasticNet alpha inversion: train_kills_model.py was mapping LR C=0.005 → alpha=0.005 (weak regularization) instead of alpha=1/C=200 (strong). Production kills model needs retrain
  • BO5 grading — series completion logic used BO3's 2-win threshold for BO5 matches. Both _grade_kills_entry and _grade_map_winner_entry fixed. BO2 now handled separately
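For reference, a minimal sketch of the canonicalization and series-completion fixes, not the project's actual code. It assumes a bet is a plain dict with team_a, team_b, market, and line fields, that canonical order is alphabetical by team name, and that the handicap line is always expressed from team_a's point of view. The _MARKET_FLIP contents below are a guess at how the pairs listed above mirror each other, and series_complete() is a hypothetical helper name.

```python
# Hypothetical mirror table: each side-specific market maps to the same
# market seen from the other team. The real table may hold more entries.
_MARKET_FLIP = {
    "ft_a": "ft_b", "ft_b": "ft_a",
    "team_a": "team_b", "team_b": "team_a",
    "yes_a": "yes_b", "yes_b": "yes_a",
    "no_a": "no_b", "no_b": "no_a",
}


def canonicalize(bet: dict) -> dict:
    """Return a copy of the bet with the two teams in sorted order."""
    out = dict(bet)
    if bet["team_b"] < bet["team_a"]:
        # Swap the teams and mirror the market so it still names the same side.
        out["team_a"], out["team_b"] = bet["team_b"], bet["team_a"]
        out["market"] = _MARKET_FLIP.get(bet["market"], bet["market"])
        # The line is relative to team_a, so swapping teams flips its sign.
        if bet.get("line") is not None:
            out["line"] = -bet["line"]
    return out


def series_complete(best_of: int, wins_a: int, wins_b: int) -> bool:
    """True once the series result is final for the given format."""
    if best_of == 2:
        # BO2 has no clinching score: both maps are played, 1-1 is valid.
        return wins_a + wins_b == 2
    needed = best_of // 2 + 1  # BO3 needs 2 wins, BO5 needs 3
    return max(wins_a, wins_b) >= needed
```

With these assumptions, a 2-0 score completes a BO3 but not a BO5, which is the case the old 2-win threshold got wrong.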

Training-Serving Skew (Phase 2)

Found three places where inference used different data than training:

  • Series clutch features: predict.py hardcoded comeback_rate and close_out_rate to 0.5 at inference while training used real historical data. Added get_team_series_history() DB query (+132 lines)
  • Elo duration filter: compute_elo.py included sub-5-minute matches; build_features.py excluded them. One-line fix, big impact on Elo momentum features
  • evaluate_model.py features — wasn't excluding dwin_* and FT/luck columns that train_model.py drops. Added WIN_EXCLUDE_COLS alignment + deprecation warning pointing to backtest_walkforward.py
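One way to keep this kind of skew from creeping back is to define the filters once and import them everywhere. The sketch below is illustrative only: the 300-second cutoff is an assumed reading of "sub-5-minute", the contents of WIN_EXCLUDE_COLS and WIN_EXCLUDE_PREFIXES are placeholders (the log does not list the real columns), and feature_mismatch() is a hypothetical helper.

```python
# Shared by the Elo, feature-building, training, and evaluation scripts
# so that every stage applies the same rules.
MIN_DURATION_SECONDS = 5 * 60  # assumed cutoff for "sub-5-minute" matches

# Placeholder exclusion rules; the real lists live in the project.
WIN_EXCLUDE_PREFIXES = ("dwin_",)
WIN_EXCLUDE_COLS = frozenset({"example_ft_luck_col"})


def keep_match(duration_seconds: int) -> bool:
    """Same duration filter for Elo computation and feature building."""
    return duration_seconds >= MIN_DURATION_SECONDS


def feature_columns(columns: list) -> list:
    """Drop excluded columns, preserving the original column order."""
    return [
        c for c in columns
        if c not in WIN_EXCLUDE_COLS and not c.startswith(WIN_EXCLUDE_PREFIXES)
    ]


def feature_mismatch(train_cols: list, serve_cols: list) -> tuple:
    """Return (missing_at_serving, unexpected_at_serving), both sorted."""
    train, serve = set(train_cols), set(serve_cols)
    return sorted(train - serve), sorted(serve - train)
```

A check like feature_mismatch() can run at model load time and fail loudly if inference is about to see a different column set than training did.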

20+ Audit Fixes (Phases 3–6)

Across CLV, config, training, scrapers, and operations:

  • CLV: Sign was flipped (now fair_implied - entry_implied), closing query bounded to 48h, kills entries get sharp CLV backfill, map_winner + maps_handicap columns added to summary
  • Config: AEST now uses ZoneInfo("Australia/Sydney") for DST handling, daily report applies min_edge filter to best-bet selection
  • Training: Residual sigma fitted on holdout (not OOF) to match production, Optuna LR path includes penalty="elasticnet"
  • Scrapers: Picklebet returns None instead of guessing HOME/team_a, dedup key includes team field, OddsPAPI sleep reduced 3s→1s (retry handles 429s)
  • Operations: File-level locking on ledger writes, 14-day auto-expiration for pending bets, probability clamped to [0.01, 0.99], 90-day data directory cleanup, max_edge enforcement (35% cap), bankroll reconciliation check
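The CLV sign convention and the probability and edge guards can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: odds are decimal, fair_closing_prob is the de-vigged closing probability for the side that was bet, and edge is taken to be expected return per unit staked. The function names are hypothetical, and min_edge is left as a required argument because the log does not state its value.

```python
def clamp_probability(p: float, lo: float = 0.01, hi: float = 0.99) -> float:
    """Keep model probabilities inside [0.01, 0.99]."""
    return min(max(p, lo), hi)


def clv(entry_odds: float, fair_closing_prob: float) -> float:
    """CLV as fair_implied - entry_implied.

    Positive means the close rated the bet as more likely than the
    price paid at entry implied, i.e. the entry beat the closing line.
    """
    entry_implied = 1.0 / entry_odds
    return fair_closing_prob - entry_implied


def edge(model_prob: float, odds: float) -> float:
    """Assumed definition: expected return per unit staked."""
    return clamp_probability(model_prob) * odds - 1.0


def passes_edge_filter(model_prob: float, odds: float,
                       min_edge: float, max_edge: float = 0.35) -> bool:
    """Reject bets below min_edge and above the 35% cap."""
    return min_edge <= edge(model_prob, odds) <= max_edge
```

The upper cap is there because a very large apparent edge is more often a sign of stale odds or a data error than of a real opportunity.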

New Tests

  • test_canonicalize.py — 72 lines covering team swap + handicap negation
  • test_grading_bo5.py — 84 lines covering BO2/BO3/BO5 completion logic
  • test_series_math.py — 22 unit tests for BO2/BO3/BO5 edge cases

8 New Claude Code Slash Commands

Built a full operator toolkit as .claude/commands/ skills (893 lines total):

  • /health-check — DB counts, model freshness, scrape status, ledger integrity
  • /team-profile — Elo/Glicko, form, roster, tier, H2H for any team
  • /clv-report — CLV analysis by bet type, book, tier, and month
  • /backtest — Walk-forward backtest with golden params + manifest comparison
  • /train-model — Guided model training with validation workflow
  • /odds-compare — Cross-book odds comparison for a specific matchup
  • /tournament-sim — Monte Carlo tournament simulation
  • /grading-report — Pending bet analysis, expiration warnings, void patterns

Also renamed /help to /list to avoid conflicting with Claude Code's built-in /help.

Other

  • Daily git backup committed (6ea5143)
  • pickmy.ai tool database freshness update (cadd1ce)
  • Model probabilities now output from daily scan pipeline (c427375)
  • New ensemble models promoted: ensemble_2026-03-25 + kills_ensemble_2026-03-25

What Didn't Happen

  • No website feature work
  • No newsletter published
  • Kills model not retrained yet (blocked on the alpha inversion fix shipping first)

Key stat: 61,748 insertions / 10,088 deletions across 144 files in 29 commits — the single biggest code quality day in the project's history. The audit surfaced bugs that were silently corrupting handicap bets, mis-regularizing the kills model, and creating training-serving skew in at least 3 feature groups.