What It Is
A “lakehouse in a box” demo: a small data pipeline that produces compact gold tables, plus a dashboard that
combines sportsbook lines with simple model projections to highlight potential +EV spots (simulation-first).
What It Is Not
Not a guaranteed winning system. It does not model injuries/lineups, travel, coaching changes, or
matchup-level context beyond team-level historical stats, and rest enters only as a small days-rest/back-to-back
adjustment. Treat outputs as a starting point for analysis.
How The Dashboard Helps
The dashboard’s main job is to put these on one screen:
1) Market: DraftKings moneyline/spread/total from The Odds API
2) Model: a projected score, pace, and win/cover/over probabilities
3) Decision support: breakeven %, EV/unit, and suggested bet sizing (paper betting)
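As a rough illustration of the decision-support numbers, the sketch below derives breakeven % and EV per unit from American odds. The function names are hypothetical and pushes are ignored; the dashboard's own calculation may differ in detail.

```python
def decimal_odds(american: int) -> float:
    """Convert American odds (e.g. -110 or +150) to decimal odds."""
    if american == 0:
        raise ValueError("American odds cannot be 0")
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def breakeven_prob(american: int) -> float:
    """Win probability at which a bet at these odds has zero expected value."""
    return 1 / decimal_odds(american)


def ev_per_unit(p_win: float, american: int) -> float:
    """Expected profit per 1 unit staked (pushes ignored in this sketch)."""
    profit_if_win = decimal_odds(american) - 1
    return p_win * profit_if_win - (1 - p_win)


if __name__ == "__main__":
    # Hypothetical example: a -110 line with a 55% win probability.
    print(f"breakeven: {breakeven_prob(-110):.4f}")
    print(f"EV/unit:   {ev_per_unit(0.55, -110):.4f}")
```

A positive EV/unit means the probability used exceeds the breakeven probability implied by the price.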
Model (NBA)
The model is intentionally simple and explainable:
Team form window: last N games (default 12), recency-weighted (newer games count more).
Pace + ratings: estimate possessions and compute off/def/net rating (points per 100 possessions) using both teams’ boxscore stats.
Projected pace: average of the two teams’ recent pace.
Projected points:
home = pace × avg(home off_rtg, away def_rtg) / 100
away = pace × avg(away off_rtg, home def_rtg) / 100
Probabilities: convert the projected margin and total into win/cover/over probabilities using the historical standard deviation (σ) of each.
Rest/B2B: apply a small adjustment to margin/total based on days rest and back-to-backs (learned from historical data).
Calibration: check the model against backtest error (MAE) and inflate σ when it would otherwise be overconfident.
Market blend: to reduce overconfidence, EV uses a blended probability:
p_used = α·p_model + (1−α)·p_market(no‑vig) (α≈0.35, smaller when confidence is low).
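The model steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the site's actual code: the exponential decay weighting, the normal distribution for outcomes, the σ value, and all function names are hypothetical choices made for illustration. Only the projected-points and blend formulas come directly from the description above.

```python
from statistics import NormalDist


def recency_weighted_mean(values, decay=0.9):
    """Weighted mean of values ordered oldest -> newest.

    Assumption: exponential decay, so the newest game has weight 1
    and each older game is multiplied by `decay` once more.
    """
    if not values:
        raise ValueError("need at least one game")
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def project_points(pace, home_off, home_def, away_off, away_def):
    """Projected points: pace x avg(own off_rtg, opponent def_rtg) / 100."""
    home = pace * (home_off + away_def) / 2 / 100
    away = pace * (away_off + home_def) / 2 / 100
    return home, away


def prob_exceeds(projected, line, sigma):
    """P(outcome > line), assuming outcome ~ Normal(projected, sigma)."""
    return 1 - NormalDist(mu=projected, sigma=sigma).cdf(line)


def no_vig(p_a_raw, p_b_raw):
    """Rescale two raw implied probabilities so they sum to 1."""
    total = p_a_raw + p_b_raw
    return p_a_raw / total, p_b_raw / total


def blend(p_model, p_market, alpha=0.35):
    """p_used = alpha * p_model + (1 - alpha) * p_market(no-vig)."""
    return alpha * p_model + (1 - alpha) * p_market


if __name__ == "__main__":
    # Hypothetical inputs: two teams' recent pace and ratings.
    pace = (100.0 + 98.0) / 2
    home, away = project_points(pace, 115, 110, 112, 113)
    p_home_model = prob_exceeds(home - away, 0.0, sigma=12.0)  # sigma is made up
    p_home_market, _ = no_vig(0.58, 0.46)
    print(home, away, blend(p_home_model, p_home_market))
```

With α = 0.35 the market carries most of the weight, so the blended probability never strays far from the no-vig price even when the model disagrees strongly.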
Bet Sizing + Logging
Suggested stakes use conservative fractional Kelly (or flat %) with caps, and are scaled down when confidence is low
or the data snapshot is stale.
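One way to picture the sizing rule is the sketch below. The Kelly multiplier, the cap, and the `confidence` scaling factor are hypothetical defaults chosen for illustration; the dashboard's actual parameters are not stated here.

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Full-Kelly fraction of bankroll: (b*p - q) / b, with b = net odds."""
    if decimal_odds <= 1:
        raise ValueError("decimal odds must be greater than 1")
    b = decimal_odds - 1
    return (b * p_win - (1 - p_win)) / b


def suggested_stake(
    p_win: float,
    decimal_odds: float,
    bankroll: float,
    kelly_multiplier: float = 0.25,  # assumption: quarter Kelly
    max_pct: float = 0.02,           # assumption: cap at 2% of bankroll
    confidence: float = 1.0,         # assumption: 0..1, lower for stale/low-confidence data
) -> float:
    """Fractional Kelly stake, scaled by confidence and capped; 0 if no edge."""
    full = kelly_fraction(p_win, decimal_odds)
    if full <= 0:
        return 0.0
    pct = min(full * kelly_multiplier * confidence, max_pct)
    return bankroll * pct


if __name__ == "__main__":
    # Hypothetical example: 55% at -110 (decimal 1 + 100/110) on a 1000-unit bankroll.
    print(suggested_stake(0.55, 1 + 100 / 110, 1000.0))
```

The cap matters most when the model claims a large edge: the stake stops growing once the scaled Kelly fraction reaches `max_pct`.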
Use the Log button in Guided Picks to store the exact odds/line/time in your browser (localStorage),
then mark outcomes (win/loss/push) to compute paper P/L and ROI.
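The paper P/L and ROI arithmetic is simple enough to show directly. The Bet Log itself lives in the browser, so the Python below is only an illustration of the math; the `LoggedBet` fields are hypothetical, and ROI is taken here as P/L divided by total amount staked (pushes included), which may not match the dashboard's exact definition.

```python
from dataclasses import dataclass


@dataclass
class LoggedBet:
    stake: float
    decimal_odds: float
    outcome: str  # "win", "loss", or "push"


def profit(bet: LoggedBet) -> float:
    """Net profit of a settled bet: winnings on a win, -stake on a loss, 0 on a push."""
    if bet.outcome == "win":
        return bet.stake * (bet.decimal_odds - 1)
    if bet.outcome == "loss":
        return -bet.stake
    if bet.outcome == "push":
        return 0.0
    raise ValueError(f"unknown outcome: {bet.outcome!r}")


def paper_summary(bets):
    """Return (total P/L, ROI) where ROI = P/L / total staked."""
    pl = sum(profit(b) for b in bets)
    staked = sum(b.stake for b in bets)
    roi = pl / staked if staked else 0.0
    return pl, roi


if __name__ == "__main__":
    log = [
        LoggedBet(10, 2.0, "win"),
        LoggedBet(10, 1.9, "loss"),
        LoggedBet(10, 1.9, "push"),
        LoggedBet(10, 1.5, "win"),
    ]
    print(paper_summary(log))
```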
Data Pipeline (Bronze → Silver → Gold)
Bronze: raw team game logs from stats.nba.com via nba_api
Silver: cleaned per-team game rows with derived opponent/home-away and rest features
Gold: compact Parquet tables used by the site:
- gold_nba_team_games (pace, possessions, off/def/net rating, rest/b2b, points)
- gold_nba_game_summary (game-level outcomes + possessions/pace + rest/b2b)
- gold_nba_team_dim (team lookup)
The pipeline also generates Quality checks, Lineage, and a simple Backtest report.
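The possessions and rating columns in gold_nba_team_games could be derived along the following lines. The exact formula used by the pipeline is not stated here; this sketch assumes the widely used box-score approximation (FGA + 0.44 × FTA − OREB + TOV), averaged over both teams, and the function and argument names are hypothetical.

```python
def team_possessions(fga: float, fta: float, oreb: float, tov: float) -> float:
    """Approximate one team's possessions from its box score (assumed formula)."""
    return fga + 0.44 * fta - oreb + tov


def game_possessions(team_box: dict, opp_box: dict) -> float:
    """Average the two teams' estimates so both sides share one possession count."""
    return 0.5 * (team_possessions(**team_box) + team_possessions(**opp_box))


def ratings(team_pts: float, opp_pts: float, poss: float):
    """Offensive, defensive, and net rating in points per 100 possessions."""
    off_rtg = 100 * team_pts / poss
    def_rtg = 100 * opp_pts / poss
    return off_rtg, def_rtg, off_rtg - def_rtg


if __name__ == "__main__":
    # Hypothetical box-score lines for one game.
    team = {"fga": 88, "fta": 25, "oreb": 10, "tov": 13}
    opp = {"fga": 86, "fta": 25, "oreb": 11, "tov": 12}
    poss = game_possessions(team, opp)
    print(poss, ratings(112, 108, poss))
```

Averaging the two teams' estimates reflects the "using both teams' boxscore stats" note in the model description: each side has nearly the same number of possessions in a game, so a shared count keeps off/def ratings consistent between the two team rows.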
Privacy + Operations
- The dashboard is a static site; the API provides JSON responses.
- Bet Log is stored locally in your browser only.
- The NBA pipeline runs daily on the server to refresh data.