Methodology
Every metric below is computed by an automated nightly job that compares our final pre-lock projections against official DraftKings scoring — no hand-picking, no retroactive edits. Distribution metrics (CRPS, PIT, SSR, coverage) currently run for MLB, MMA, and golf; the other sports are scored on point accuracy (MAE) until their distribution pipelines come online.
MAE: Mean Absolute Error
The average size of our miss, in DraftKings fantasy points, across every projected player on a slate. An MAE of 7.0 means we were off by 7 fantasy points per player on average. Lower is better, and it is the one number that is comparable across our seven sports.

CRPS: Continuous Ranked Probability Score
We do not just publish a single number per player; we publish a range of outcomes. CRPS grades that whole distribution against what actually happened: it rewards putting high probability near the real result and punishes both wrong centers and wrong widths. Lower is better. Our forecasts use the closed-form CRPS; the sample-based naive baseline uses the “fair” (finite-ensemble-corrected) version.

CRPSS: CRPS Skill Score vs naive baseline
CRPSS compares our CRPS to a deliberately dumb baseline: each player's own most recent games (for example, the last 10 MLB games), with no modeling at all. Positive means our distributions beat the naive baseline; zero means a tie; negative means the naive baseline wins. As of launch, our MLB CRPSS sits at roughly zero: our distributions are not yet decisively better than naive. We publish that number anyway, because it is the bar every model upgrade has to clear in public.

PIT: Probability Integral Transform uniformity
If our probability ranges were honest, actual results would land uniformly across them: players would beat their 90th percentile about 10% of the time, and so on. Each slate gets a chi-squared uniformity test on those PIT values; we report the share of slates where the test rejects (p < 0.05). A high rejection rate means our stated ranges do not yet match reality. Today most MLB slates fail this test. That is exactly the kind of result a trust page exists to show.

SSR: Spread-Skill Ratio
The width of our predicted ranges divided by the size of our actual errors. A ratio near 1 means our uncertainty estimates are honest. Below 1 means we are overconfident (ranges too narrow); above 1 means we are over-dispersed (ranges too wide, hedging).

Coverage: Floor–ceiling band coverage
How often a player's actual score lands inside the floor-to-ceiling band we published. The band is nominally an 80% interval, so well-calibrated projections should read close to 80% here, neither higher nor lower.

Why publish numbers that are not flattering? Because "trust our projections" is meaningless unless you can check them. The skill score (CRPSS) is computed on exactly the same players, slates, and actuals for our model and the naive baseline, so that comparison cannot be gamed. (The two raw CRPS averages cover slightly different pools: the baseline can only score players with enough history.) When our scores improve, you will see it here first, and when they do not, you will see that too.
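
For readers who want to check the CRPS arithmetic themselves, here is a minimal Python sketch of the three pieces described above: a closed-form CRPS, the "fair" sample-based CRPS, and the skill score. It rests on assumptions that are not stated on this page: the closed form shown is the one for a Normal forecast (the actual forecast distribution family is not specified here), the skill score is taken as one minus the ratio of mean CRPS values, and the function names crps_gaussian, crps_fair, and crpss are hypothetical. Treat it as an illustration under those assumptions and not as the nightly job's code.

```python
import math


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Normal(mu, sigma) forecast against outcome y.

    ASSUMPTION: a Normal forecast is used purely for illustration.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def crps_fair(samples, y):
    """Fair (finite-ensemble-corrected) CRPS of a sample forecast against y.

    The spread term divides by m * (m - 1) instead of m * m, which removes
    the penalty a small sample would otherwise pay just for being small.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("fair CRPS needs at least two samples")
    accuracy = sum(abs(x - y) for x in samples) / m
    spread = sum(abs(a - b) for a in samples for b in samples) / (2.0 * m * (m - 1))
    return accuracy - spread


def crpss(model_scores, baseline_scores):
    """Skill score on matched forecasts: 1 - mean(model) / mean(baseline).

    ASSUMPTION: ratio of means over the same players and slates.
    Positive means the model beats the baseline; zero is a tie.
    """
    if not model_scores or len(model_scores) != len(baseline_scores):
        raise ValueError("scores must be non-empty and matched one-to-one")
    model_mean = sum(model_scores) / len(model_scores)
    baseline_mean = sum(baseline_scores) / len(baseline_scores)
    return 1.0 - model_mean / baseline_mean


# Example: one player projected at 8 +/- 4 points who scored 11, against a
# naive baseline built from that player's last five (made-up) game scores.
model = crps_gaussian(mu=8.0, sigma=4.0, y=11.0)
naive = crps_fair([2.0, 5.0, 9.0, 14.0, 6.0], y=11.0)
print(round(crpss([model], [naive]), 3))
```

Scoring both sides on the same list of players is what makes the ratio meaningful, which is why crpss rejects lists of different lengths.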