The Great Audit
29 commits, 144 files changed — deep codebase audit of Dota 2 prediction models uncovered 30+ bugs across CLV, staking, training, and grading. 8 new Claude Code slash commands. 156 new test cases.
🐉 YoshiZen Daily Build Log — Thursday, March 26, 2026
Dota 2 Prediction Models — Full Codebase Audit
The big one. Ran a systematic multi-phase audit of the entire prediction pipeline and found a lot of skeletons. 29 commits, almost all fixes and hardening.
Critical Bugs Fixed (Phase 1)
Three bugs that were actively corrupting results:
- Handicap canonicalization: canonicalize() wasn't negating the handicap line when swapping teams. Every handicap bet where team_b sorted before team_a had the wrong sign. Added _MARKET_FLIP entries for ft_a/b, team_a/b, yes_a/b, no_a/b
- ElasticNet alpha inversion: train_kills_model.py was mapping LR C=0.005 → alpha=0.005 (weak regularization) instead of alpha=1/C=200 (strong). Production kills model needs retrain
- BO5 grading: series completion logic used BO3's 2-win threshold for BO5 matches. Both _grade_kills_entry and _grade_map_winner_entry fixed. BO2 now handled separately
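The handicap and alpha bugs are both small sign or inversion errors, so a short sketch makes them concrete. This is an illustration under stated assumptions, not the project's code: the Bet dataclass, the body of canonicalize(), and the c_to_alpha() helper are hypothetical, and the handicap is assumed to be stored relative to team_a. Only the _MARKET_FLIP name, the selection pairs, and the alpha = 1/C mapping come from the notes above.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical flip table: selection labels that must swap when the teams swap.
# The pairs mirror the ones named in the log (ft, team, yes, no).
_MARKET_FLIP = {
    "ft_a": "ft_b", "ft_b": "ft_a",
    "team_a": "team_b", "team_b": "team_a",
    "yes_a": "yes_b", "yes_b": "yes_a",
    "no_a": "no_b", "no_b": "no_a",
}


@dataclass(frozen=True)
class Bet:
    """Hypothetical bet record; the handicap is assumed relative to team_a."""
    team_a: str
    team_b: str
    selection: str
    handicap: Optional[float] = None


def canonicalize(bet: Bet) -> Bet:
    """Sort the teams alphabetically, flipping selection and handicap sign on a swap."""
    if bet.team_a <= bet.team_b:
        return bet  # already in canonical order, nothing to flip
    return replace(
        bet,
        team_a=bet.team_b,
        team_b=bet.team_a,
        selection=_MARKET_FLIP.get(bet.selection, bet.selection),
        # The step the bug skipped: the line must change sign with the swap.
        handicap=None if bet.handicap is None else -bet.handicap,
    )


def c_to_alpha(c: float) -> float:
    """Convert inverse regularization strength C to alpha using alpha = 1/C."""
    if c <= 0:
        raise ValueError("C must be positive")
    return 1.0 / c
```

With these definitions, canonicalize(Bet("Zeta", "Alpha", "team_a", -1.5)) comes back with the teams swapped, selection "team_b", and handicap +1.5, while c_to_alpha(0.005) gives the strong regularization value of 200 rather than 0.005.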
Training-Serving Skew (Phase 2)
Found three places where inference used different data than training:
- Series clutch features: predict.py hardcoded comeback_rate and close_out_rate to 0.5 at inference while training used real historical data. Added get_team_series_history() DB query (+132 lines)
- Elo duration filter: compute_elo.py included sub-5-minute matches; build_features.py excluded them. One-line fix, big impact on Elo momentum features
- evaluate_model.py features: wasn't excluding dwin_* and FT/luck columns that train_model.py drops. Added WIN_EXCLUDE_COLS alignment + deprecation warning pointing to backtest_walkforward.py
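One way to keep this kind of skew from creeping back is to define each filter once and import it from every script. A minimal sketch, assuming a shared module: the function names, the duration_seconds key, and the two column names in WIN_EXCLUDE_COLS are hypothetical (the log does not list the real columns). Only the 5-minute threshold, the dwin_ prefix, and the WIN_EXCLUDE_COLS name come from the notes above.

```python
from typing import Iterable, List, Mapping

# Single source of truth for the duration cutoff (sub-5-minute matches are dropped).
MIN_DURATION_SECONDS = 5 * 60

# Hypothetical contents; stand-ins for the FT/luck columns the trainer drops.
WIN_EXCLUDE_COLS = frozenset({"ft_margin", "luck_index"})
WIN_EXCLUDE_PREFIXES = ("dwin_",)


def keep_match(match: Mapping[str, float]) -> bool:
    """Shared match filter for both the Elo computation and feature building."""
    return match["duration_seconds"] >= MIN_DURATION_SECONDS


def feature_columns(columns: Iterable[str]) -> List[str]:
    """Shared column filter for both training and evaluation."""
    return [
        col
        for col in columns
        if col not in WIN_EXCLUDE_COLS and not col.startswith(WIN_EXCLUDE_PREFIXES)
    ]
```

If compute_elo.py and build_features.py both call keep_match(), and train_model.py and evaluate_model.py both call feature_columns(), the two sides cannot drift apart without a visible change to the shared module.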
20+ Audit Fixes (Phases 3–6)
Across CLV, config, training, scrapers, and operations:
- CLV: Sign was flipped (now fair_implied - entry_implied), closing query bounded to 48h, kills entries get sharp CLV backfill, map_winner + maps_handicap columns added to summary
- Config: AEST now uses ZoneInfo("Australia/Sydney") for DST handling, daily report applies min_edge filter to best-bet selection
- Training: Residual sigma fitted on holdout (not OOF) to match production, Optuna LR path includes penalty="elasticnet"
- Scrapers: Picklebet returns None instead of guessing HOME/team_a, dedup key includes team field, OddsPAPI sleep reduced 3s→1s (retry handles 429s)
- Operations: File-level locking on ledger writes, 14-day auto-expiration for pending bets, probability clamped to [0.01, 0.99], 90-day data directory cleanup, max_edge enforcement (35% cap), bankroll reconciliation check
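The CLV sign convention and the two guard rails are easiest to check with numbers. A rough sketch under stated assumptions: all function names here are hypothetical, the closing price is de-vigged with a simple two-way proportional normalization (the log does not say which method the pipeline uses), and max_edge enforcement is assumed to reject bets above the cap. The fair_implied - entry_implied sign, the [0.01, 0.99] clamp, and the 35% cap come from the notes above.

```python
def implied_prob(decimal_odds: float) -> float:
    """Raw implied probability of a decimal price."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def devig_two_way(odds_side: float, odds_other: float) -> float:
    """Fair probability of `odds_side` via proportional normalization (assumed method)."""
    p_side, p_other = implied_prob(odds_side), implied_prob(odds_other)
    return p_side / (p_side + p_other)


def clv(entry_odds: float, close_odds_side: float, close_odds_other: float) -> float:
    """CLV as fair_implied - entry_implied; positive means the entry beat the close."""
    fair_implied = devig_two_way(close_odds_side, close_odds_other)
    entry_implied = implied_prob(entry_odds)
    return fair_implied - entry_implied


def clamp_probability(p: float, lo: float = 0.01, hi: float = 0.99) -> float:
    """Keep model probabilities away from 0 and 1."""
    return min(max(p, lo), hi)


def passes_edge_filters(edge: float, min_edge: float, max_edge: float = 0.35) -> bool:
    """Accept a bet only if its edge sits between the floor and the 35% cap."""
    return min_edge <= edge <= max_edge
```

For example, a bet taken at 2.10 that closes at 1.90 on both sides has a fair closing probability of 0.5 against an entry-implied probability of about 0.476, so clv(2.10, 1.90, 1.90) is positive, which is the sign the fix restores.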
New Tests
- test_canonicalize.py: 72 lines covering team swap + handicap negation
- test_grading_bo5.py: 84 lines covering BO2/BO3/BO5 completion logic
- test_series_math.py: 22 unit tests for BO2/BO3/BO5 edge cases
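The completion logic these tests exercise can be sketched in a few lines. This is a hypothetical reconstruction, not the grading code itself: the function names are made up, and treating BO2 as complete once both maps are played (so a 1-1 draw counts as finished) is an assumption, since the log only says BO2 is handled separately.

```python
def wins_needed(best_of: int) -> int:
    """Map wins required to clinch an odd-length series (BO3 -> 2, BO5 -> 3)."""
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("wins_needed is only defined for odd series lengths")
    return best_of // 2 + 1


def series_complete(best_of: int, wins_a: int, wins_b: int) -> bool:
    """Return True once the series result can no longer change."""
    if wins_a < 0 or wins_b < 0 or wins_a + wins_b > best_of:
        raise ValueError("invalid map score for this series length")
    if best_of == 2:
        # Assumed BO2 rule: both maps are always played, and 1-1 is a valid final score.
        return wins_a + wins_b == 2
    # Deriving the threshold from best_of avoids reusing BO3's 2 wins for a BO5.
    return max(wins_a, wins_b) >= wins_needed(best_of)
```

Under the old behaviour a BO5 at 2-0 would have been graded as finished. Here series_complete(5, 2, 0) is False and series_complete(5, 3, 1) is True.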
8 New Claude Code Slash Commands
Built a full operator toolkit as .claude/commands/ skills (893 lines total):
- /health-check: DB counts, model freshness, scrape status, ledger integrity
- /team-profile: Elo/Glicko, form, roster, tier, H2H for any team
- /clv-report: CLV analysis by bet type, book, tier, and month
- /backtest: Walk-forward backtest with golden params + manifest comparison
- /train-model: Guided model training with validation workflow
- /odds-compare: Cross-book odds comparison for a specific matchup
- /tournament-sim: Monte Carlo tournament simulation
- /grading-report: Pending bet analysis, expiration warnings, void patterns
Also renamed /help → /list to avoid conflicting with Claude Code's built-in /help.
Other
- Daily git backup committed (6ea5143)
- pickmy.ai tool database freshness update (cadd1ce)
- Model probabilities now output from daily scan pipeline (c427375)
- New ensemble models promoted: ensemble_2026-03-25 + kills_ensemble_2026-03-25
What Didn't Happen
- No website feature work
- No newsletter published
- Kills model not retrained yet (blocked on the alpha inversion fix shipping first)
Key stat: 61,748 insertions / 10,088 deletions across 144 files in 29 commits — the single biggest code quality day in the project's history. The audit surfaced bugs that were silently corrupting handicap bets, mis-regularizing the kills model, and creating training-serving skew in at least 3 feature groups.