About

What this is, how it works, and what it is not.

What It Is

A “lakehouse in a box” demo: a small data pipeline that produces compact gold tables, plus a dashboard that combines sportsbook lines with simple model projections to highlight potential positive expected value (+EV) spots. It is simulation-first: picks are meant for paper betting.

What It Is Not

This is not a guaranteed winning system. It does not model injuries/lineups, travel, coaching changes, or matchup-level context beyond team-level historical stats, and rest is covered only by a small days-rest/back-to-back adjustment. Treat outputs as a starting point for analysis.

How The Dashboard Helps

The dashboard’s main job is to put these on one screen:

1) Market: DraftKings moneyline/spread/total from The Odds API
2) Model: a projected score, pace, and win/cover/over probabilities
3) Decision support: breakeven %, EV/unit, and suggested bet sizing (paper betting)
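As a rough illustration of the decision-support numbers, the sketch below converts American odds to a breakeven probability and an EV per unit staked. The function names are hypothetical and pushes are ignored; the dashboard's own calculation may differ in detail.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American odds (e.g. -110, +150) to decimal odds."""
    if odds == 0:
        raise ValueError("American odds cannot be 0")
    # Positive odds: profit per 100 staked. Negative odds: stake needed to win 100.
    return 1 + (odds / 100 if odds > 0 else 100 / -odds)


def breakeven_prob(odds: int) -> float:
    """Win probability at which the bet has zero expected value."""
    return 1 / american_to_decimal(odds)


def ev_per_unit(p_win: float, odds: int) -> float:
    """Expected profit per 1 unit staked (assumption: no pushes)."""
    net_payout = american_to_decimal(odds) - 1
    return p_win * net_payout - (1 - p_win)


if __name__ == "__main__":
    print(round(breakeven_prob(-110), 4))     # about 0.5238
    print(round(ev_per_unit(0.55, -110), 4))  # about 0.05
```

A bet is +EV only when the probability used exceeds the breakeven %, so at -110 a side needs to win more than roughly 52.4% of the time.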

Model (NBA)

The model is intentionally simple and explainable:

Team form window: last N games (default 12), recency-weighted (newer games count more).
Pace + ratings: estimate possessions and compute off/def/net rating (points per 100 possessions) using both teams’ box score stats.
Projected pace: average of the two teams’ recent pace.
Projected points:
home = pace × avg(home off_rtg, away def_rtg) / 100
away = pace × avg(away off_rtg, home def_rtg) / 100
Probabilities: convert the projected margin and total into win/cover/over probabilities using the historical standard deviation (σ) of margin and total.
Rest/B2B: apply a small adjustment to margin/total based on days rest and back-to-backs (learned from historical data).
Calibration: check predictions against backtest error (mean absolute error, MAE) and inflate σ when the model looks overconfident.
Market blend: to reduce overconfidence, EV uses a blended probability:
p_used = α·p_model + (1−α)·p_market
where p_market is the no-vig market probability and α ≈ 0.35 (smaller when confidence is low).
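The steps above can be sketched end to end in a few lines. This is a minimal illustration, not the site's code: the exponential decay weights, the normal-distribution assumption, the proportional no-vig normalization, and all function names and sample numbers are assumptions made for the example.

```python
from statistics import NormalDist


def recency_weighted_mean(values, decay=0.9):
    """Weighted mean of values ordered oldest -> newest.
    Assumption: exponential decay, newest game has weight 1."""
    if not values:
        raise ValueError("need at least one game")
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def project_points(pace, home_off, home_def, away_off, away_def):
    """Projected points per the formulas above (ratings are per 100 possessions)."""
    home = pace * (home_off + away_def) / 2 / 100
    away = pace * (away_off + home_def) / 2 / 100
    return home, away


def prob_above(mean, line, sigma):
    """P(X > line), assuming X is normally distributed around the projection."""
    return 1 - NormalDist(mean, sigma).cdf(line)


def no_vig(p_side_raw, p_other_raw):
    """Remove the bookmaker margin by proportional normalization (one common method)."""
    total = p_side_raw + p_other_raw
    return p_side_raw / total, p_other_raw / total


def blend(p_model, p_market, alpha=0.35):
    """p_used = alpha * p_model + (1 - alpha) * p_market."""
    return alpha * p_model + (1 - alpha) * p_market


if __name__ == "__main__":
    home, away = project_points(100, 115, 110, 112, 113)
    print(home, away)  # 114.0 111.0
    sigma_margin = 12.0  # hypothetical historical sigma for margin
    p_home_win = prob_above(home - away, 0.0, sigma_margin)
    p_market, _ = no_vig(0.60, 0.45)  # raw implied probabilities for both sides
    print(round(blend(p_home_win, p_market), 3))
```

For a spread, the home side covers a line of s (for example -3.5) when the margin exceeds -s, so the cover probability is prob_above(margin, -s, sigma). Inflating sigma pulls every probability toward 50%, which is how the calibration step reduces overconfidence.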

Bet Sizing + Logging

Suggested stakes use conservative fractional Kelly (or flat %) with caps, and are scaled down when confidence is low or the data snapshot is stale.
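A minimal sketch of capped fractional Kelly sizing follows. The multiplier, cap, and confidence scale defaults are hypothetical values chosen for illustration, not the dashboard's settings.

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Full Kelly fraction of bankroll: (p*b - q) / b, where b is net odds."""
    b = decimal_odds - 1
    if b <= 0:
        raise ValueError("decimal odds must be greater than 1")
    return (p_win * b - (1 - p_win)) / b


def suggested_stake(bankroll, p_win, decimal_odds,
                    kelly_mult=0.25,        # hypothetical: quarter Kelly
                    cap_pct=0.02,           # hypothetical: never stake above 2% of bankroll
                    confidence_scale=1.0):  # below 1.0 for low confidence or stale data
    """Stake = bankroll * min(cap, fractional Kelly * scale); 0 when there is no edge."""
    full = max(0.0, kelly_fraction(p_win, decimal_odds))
    pct = min(full * kelly_mult * confidence_scale, cap_pct)
    return bankroll * pct


if __name__ == "__main__":
    print(suggested_stake(1000, 0.55, 2.0))                        # capped at 2% of bankroll
    print(suggested_stake(1000, 0.55, 2.0, confidence_scale=0.5))  # scaled down
    print(suggested_stake(1000, 0.45, 2.0))                        # no edge, no bet
```

Clamping negative Kelly fractions to zero means a bet with no edge gets no stake rather than a negative one.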

Use the Log button in Guided Picks to store the exact odds/line/time in your browser (localStorage), then mark outcomes (win/loss/push) to compute paper P/L and ROI.
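The paper P/L arithmetic is simple enough to show directly. The sketch below is written in Python for consistency with the other examples, although the real log lives in the browser; the LoggedBet fields are hypothetical, and ROI is taken as profit divided by total amount staked, pushes included.

```python
from dataclasses import dataclass


@dataclass
class LoggedBet:
    stake: float
    decimal_odds: float  # odds captured at log time
    outcome: str         # "win", "loss", or "push"


def profit(bet: LoggedBet) -> float:
    """Net profit for one settled paper bet."""
    if bet.outcome == "win":
        return bet.stake * (bet.decimal_odds - 1)
    if bet.outcome == "loss":
        return -bet.stake
    if bet.outcome == "push":
        return 0.0  # stake returned
    raise ValueError(f"unknown outcome: {bet.outcome!r}")


def paper_summary(bets):
    """Return (total P/L, ROI); ROI is 0.0 when nothing has been staked."""
    staked = sum(b.stake for b in bets)
    pl = sum(profit(b) for b in bets)
    return pl, (pl / staked if staked else 0.0)


if __name__ == "__main__":
    log = [
        LoggedBet(10, 2.0, "win"),
        LoggedBet(10, 1.91, "loss"),
        LoggedBet(10, 1.91, "push"),
        LoggedBet(20, 1.5, "win"),
    ]
    print(paper_summary(log))  # (10.0, 0.2)
```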

Data Pipeline (Bronze → Silver → Gold)

Bronze: raw team game logs from stats.nba.com via nba_api
Silver: cleaned per-team game rows with derived opponent/home-away and rest features
Gold: compact Parquet tables used by the site:
- gold_nba_team_games (pace, possessions, off/def/net rating, rest/b2b, points)
- gold_nba_game_summary (game-level outcomes + possessions/pace + rest/b2b)
- gold_nba_team_dim (team lookup)
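To make the derived columns concrete, here is a small sketch of how rest features and ratings could be computed from per-team game rows. The possession formula is the widely used box score approximation (FGA − OREB + TOV + 0.44·FTA), and "days rest" is taken as full days off between games; both are assumptions, since the pipeline's exact definitions are not stated here.

```python
from datetime import date


def rest_features(game_dates):
    """For one team's game dates sorted ascending, return (days_rest, is_b2b) per game.
    Assumption: days_rest = days between games minus 1, so a back-to-back is 0."""
    features = []
    prev = None
    for d in game_dates:
        if prev is None:
            features.append((None, False))  # first game: rest unknown
        else:
            days_rest = (d - prev).days - 1
            features.append((days_rest, days_rest == 0))
        prev = d
    return features


def estimate_possessions(fga, oreb, tov, fta):
    """Common box score approximation of possessions (assumed, not confirmed)."""
    return fga - oreb + tov + 0.44 * fta


def rating(points, possessions):
    """Points per 100 possessions (offensive rating for points scored,
    defensive rating for points allowed)."""
    return 100 * points / possessions


if __name__ == "__main__":
    dates = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5)]
    print(rest_features(dates))  # [(None, False), (0, True), (2, False)]
    poss = estimate_possessions(fga=88, oreb=10, tov=14, fta=25)
    print(round(rating(112, poss), 1))
```

Net rating is then offensive rating minus defensive rating, and averaging both teams' possession estimates gives a single game-level figure.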

The pipeline also generates Quality checks, Lineage, and a simple Backtest report.

Privacy + Operations

- The dashboard is a static site; the API provides JSON responses.
- Bet Log is stored locally in your browser only.
- The NBA pipeline runs daily on the server to refresh data.