A baseline-driven model for evaluating NFL draft outcomes, separating the picks who exceeded expectation from the ones who simply met it.
The NFL draft is defined by its outliers: remarkable success stories and stark failures. Tom Brady, selected 199th overall, went on to win more Super Bowls than any other player, while JaMarcus Russell, chosen first overall, quickly became one of the draft's most notable busts. Together they show how much uncertainty surrounds this critical juncture in the NFL's operation.
The hard part is telling those outcomes apart in a way that controls for where players were drafted. A seventh-round pick who turns into a starter is a different kind of win than a first-overall pick who turns into a starter. This project builds a model for that, a way to measure whether a draft pick exceeded, met, or fell short of what a typical pick at that round and position should produce.
Assessing a player's contribution is an entire field of NFL analysis. I considered multiple methods along the way, each of which shaped my reasoning and helped refine my methodology. My initial approach used fantasy football data from the past three complete NFL seasons to build a single metric for player value. I decided against this because scoring rules vary widely across platforms, and those inconsistencies made it impossible to build a consistent, robust evaluation metric.
Recognizing that player value in football is multifaceted, I looked for a more established metric analogous to baseball's Wins Above Replacement (WAR). Because of the sport's complexity, there is no generally available, standardized WAR-like metric for football. Pro Football Focus (PFF) has developed its own proprietary football WAR metric, but access to that data is heavily restricted, so I had to explore alternatives.
Eager and Chahrouri's "PFF WAR: Modeling Player Value in American Football" offers insight into positional value and provides summary statistics for PFF WAR at each position, a valuable starting point for adjusting raw fantasy football metrics. These summary statistics include the positional mean of WAR, the positional coefficient of variation of WAR, and the number of observations at each position.
| Position | n | Mean WAR | CoV of WAR | YoY Correlation |
|---|---|---|---|---|
| QB | 994 | 1.63 | 0.70 | 0.62 |
| RB/FB | 2,373 | 0.10 | 0.64 | 0.53 |
| WR | 2,864 | 0.28 | 0.84 | 0.52 |
| TE | 1,621 | 0.18 | 0.62 | 0.66 |
| T | 1,543 | 0.09 | 1.09 | 0.49 |
| G | 1,604 | 0.10 | 1.11 | 0.57 |
| C | 708 | 0.10 | 1.08 | 0.50 |
| DI | 2,559 | 0.06 | 1.34 | 0.68 |
| ED | 2,259 | 0.06 | 1.54 | 0.61 |
| LB | 2,721 | 0.11 | 0.83 | 0.51 |
| CB | 2,733 | 0.23 | 0.91 | 0.29 |
| S | 2,169 | 0.23 | 0.77 | 0.30 |
The table is a great illustration of why positional importance matters so much. Look at the gap between QB (mean WAR of 1.63) and everything else: the average QB produces nearly six times the value of the average player at the next-highest position (WR, 0.28). Then look at edge rushers: a tiny mean of 0.06, but a coefficient of variation of 1.54, meaning the spread between top performers and the average is very large relative to that mean. Any draft model worth its salt has to account for both of those dynamics, and that's what made this so hard to do without paywalled data.
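To make the mean-versus-spread point concrete, here is a small sketch that turns the table's coefficient of variation back into an absolute spread. It assumes the standard definition, CoV = standard deviation / mean; the helper name `implied_std` is made up for illustration, and only three rows of the table are copied in.

```python
# Mean WAR and coefficient of variation (CoV), copied from the table above.
positional_war = {
    "QB": (1.63, 0.70),
    "WR": (0.28, 0.84),
    "ED": (0.06, 1.54),
}


def implied_std(mean_war: float, cov: float) -> float:
    """Standard deviation implied by CoV = std / mean (assumed definition)."""
    return cov * mean_war


for position, (mean_war, cov) in positional_war.items():
    std = implied_std(mean_war, cov)
    # "Mean + 1 std" is only a rough stand-in for a strong player at the position.
    print(f"{position}: std ~ {std:.3f}, mean + 1 std ~ {mean_war + std:.3f}")
```

Under that assumption, the implied standard deviation for edge rushers is about 0.09 WAR against about 1.14 for quarterbacks. The edge-rusher spread is large relative to its own mean but small in absolute terms, which is part of why a single scaling factor is hard to build from these two numbers.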
I tried to incorporate these summary statistics into a scoring system, but balancing the mean and variance of WAR within a single scaling factor proved incredibly difficult. Every variation I attempted showed limitations and inconsistencies, exacerbated by the absence of data for positions like offensive linemen (OL) and punters (P). Given those problems, I decided to reassess my methodology.
I landed on PFF's "player grade" as the foundational metric. PFF's grading system scrutinizes every player on every play, emphasizing their "contribution to production" rather than relying on inherent traits or measurable attributes. PFF employs a grading scale ranging from -2 to +2 in 0.5 increments, tailored for each position. PFF's grades are further converted to a 0-100 scale at both the game and season levels, facilitating straightforward player comparisons.
I created a slightly modified metric, Raw Value Provided (RVP), to assess the total value of a player within a specific season. The calculation is:
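As a hedged sketch only: the components named in this write-up are a PFF season grade (0-100) and games played over total games, so the code below assumes RVP is simply their product. The function name `raw_value_provided` and its parameters are hypothetical, and the exact form of the original calculation is an assumption here.

```python
def raw_value_provided(season_grade: float, games_played: int, season_games: int) -> float:
    """Season RVP under the ASSUMED form: grade * (games played / total games).

    season_grade: PFF season grade on the 0-100 scale.
    games_played: games the player appeared in that season.
    season_games: regular-season games available (16 through 2020, 17 from 2021).
    """
    if season_games <= 0:
        raise ValueError("season_games must be positive")
    if not 0 <= games_played <= season_games:
        raise ValueError("games_played must be between 0 and season_games")
    return season_grade * games_played / season_games


# An 80.0-graded player who appears in 12 of 16 games.
print(raw_value_provided(80.0, 12, 16))
```

Passing `season_games` explicitly keeps the availability ratio comparable across the switch from a 16-game to a 17-game regular season.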
This does not control for the random nature of injuries, nor does it take into account actual usage in games played, but it was necessary to create a baseline metric to work with and continue to advance the research. Later iterations could attempt to balance both availability/playworthiness (games played) and usage (snaps played per game) into a more refined metric.
That said, "availability is the best ability" is a real cliche in football for a reason. Simply being a player who stays healthy enough to take the field at a starting level in the NFL is itself a strong indicator of the quality of a draft pick. The games played over total games component of RVP is doing meaningful work, even if it isn't the whole picture.
It's worth acknowledging that by shifting to RVP, I'm effectively abandoning positional importance as a factor in the metric itself. That's a real concession. The reason it's defensible in this context is that I'm not comparing across positions; I'm grading players against expected value within their own position and selection round. A QB drafted in the 2nd round is being measured against other 2nd-round QBs, not against running backs or guards. So the absence of positional weighting doesn't break the analysis here, but it does limit what this model can say about which positions teams should prioritize at which points in the draft. For a more rigorous treatment of that question, I'd recommend the Michael MacKelvie video linked at the end of this article.
To evaluate the players from the 2020 NFL Draft, I started by establishing a baseline, built from players drafted in 2015 through 2019. Since I only have three full years of performance for the 2020 class, only the first three seasons after the draft were considered for each pick. This is less than ideal because it ignores position-specific characteristics such as average career longevity and average "peak" year, but it is a necessary control to establish a baseline.
The total RVP over three seasons was calculated for each player in the 2015-2019 draft classes. The average 3-year total RVP was then calculated for each position-round combination.
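The baseline step can be sketched in a few lines of standard-library Python. The record layout (`position`, `round`, `season_rvps`) and the function names are hypothetical, the sample players are invented purely to exercise the code, and treating a missing season as zero RVP is an assumption rather than something stated above.

```python
from collections import defaultdict
from statistics import mean


def three_year_total(season_rvps):
    """Sum RVP over the first three seasons; absent seasons add nothing (assumption)."""
    return sum(season_rvps[:3])


def build_baseline(picks):
    """Average 3-year total RVP for each (position, round) cell."""
    cells = defaultdict(list)
    for pick in picks:
        cell = (pick["position"], pick["round"])
        cells[cell].append(three_year_total(pick["season_rvps"]))
    return {cell: mean(totals) for cell, totals in cells.items()}


# Invented records, not real draft data.
historical_picks = [
    {"position": "QB", "round": 1, "season_rvps": [60.0, 50.0, 40.0, 70.0]},
    {"position": "QB", "round": 1, "season_rvps": [30.0, 20.0, 10.0]},
    {"position": "WR", "round": 2, "season_rvps": [55.0, 45.0]},
]
print(build_baseline(historical_picks))
```

Cells with no drafted players never appear in the result, which corresponds to the NA entries in a position-by-round table.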
| Position | R1 | R2 | R3 | R4 | R5 | R6 | R7 |
|---|---|---|---|---|---|---|---|
| CB | 148.7 | 108.4 | 103.0 | 82.4 | 66.2 | 49.2 | 65.8 |
| DI | 168.1 | 149.3 | 118.0 | 113.9 | 124.8 | 94.0 | 60.9 |
| ED | 147.1 | 119.9 | 117.7 | 112.8 | 73.0 | 73.1 | 58.3 |
| FB | NA | NA | NA | 191.7 | 137.3 | 125.4 | NA |
| HB | 173.8 | 156.5 | 149.2 | 133.2 | 95.2 | 66.5 | 56.8 |
| K | NA | 37.9 | NA | NA | 149.7 | NA | 188.6 |
| LB | 133.9 | 137.0 | 105.3 | 124.3 | 70.4 | 62.0 | 47.9 |
| OL | 161.6 | 149.0 | 101.7 | 71.5 | 75.9 | 53.3 | 55.7 |
| P | NA | NA | NA | 190.0 | 175.9 | 121.7 | 179.6 |
| QB | 160.6 | 72.2 | 44.0 | 39.4 | 24.5 | 10.7 | 47.2 |
| S | 161.1 | 125.2 | 132.5 | 109.4 | 67.3 | 72.1 | 72.1 |
| TE | 162.3 | 152.9 | 134.0 | 123.5 | 108.8 | 91.7 | 62.9 |
| WR | 143.8 | 162.5 | 133.2 | 88.8 | 122.5 | 67.7 | 67.3 |
Some round-position cells simply don't have enough draft history to support reliable averages, which becomes a real constraint when evaluating low-volume positions like QB and K.
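One way to make that constraint visible is to count how many players sit behind each cell and flag the thin ones. This is a sketch, not part of the original analysis: the function name `thin_cells`, the record layout, and the threshold of 5 players are all arbitrary assumptions, and the sample records are invented.

```python
from collections import Counter


def thin_cells(picks, min_players=5):
    """Return {(position, round): count} for cells with fewer than min_players picks."""
    counts = Counter((pick["position"], pick["round"]) for pick in picks)
    return {cell: n for cell, n in counts.items() if n < min_players}


# Invented records, not real draft data: six Round 1 cornerbacks, one Round 5 kicker.
historical_picks = [{"position": "CB", "round": 1} for _ in range(6)]
historical_picks.append({"position": "K", "round": 5})

print(thin_cells(historical_picks))
```

A flagged cell could be reported with a caveat or excluded from team totals rather than treated as a reliable expectation.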
Using the aggregate averages, it was then possible to determine the difference between expected and actual three-year RVP for each member of the 2020 draft class. The resulting top draft picks displayed a good mix of generational talents in the first few rounds and value picks in later rounds.
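The comparison itself is a subtraction against the matching baseline cell. In the sketch below, the baseline values are copied from the position-round table above, while the function name `value_over_expected` and the actual totals passed to it are hypothetical.

```python
# A few baseline cells copied from the table above; NA cells are simply absent.
baseline = {
    ("QB", 1): 160.6,
    ("WR", 2): 162.5,
    ("LB", 3): 105.3,
}


def value_over_expected(actual_total, position, draft_round, baseline):
    """Actual minus expected 3-year RVP, or None when the cell has no baseline."""
    expected = baseline.get((position, draft_round))
    if expected is None:
        return None
    return actual_total - expected


# Hypothetical 3-year totals, not real player results.
print(value_over_expected(200.0, "QB", 1, baseline))  # above expectation
print(value_over_expected(90.0, "LB", 3, baseline))   # below expectation
print(value_over_expected(120.0, "K", 1, baseline))   # no baseline for this cell
```

Returning `None` for a missing cell keeps an unscorable pick from being silently counted as zero.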
Summing each pick's difference from expected value by team, my current methodology says the Titans had the worst 2020 draft, with a total difference in expected value of -306.1, while the Bengals had the best, with a difference of 285.4. This is consistent with the assessments of many leading NFL draft analysts and platforms.
The Bengals' 2020 class (Joe Burrow, Tee Higgins, Logan Wilson) is widely considered one of the strongest of the decade. The Titans' class (Isaiah Wilson chief among them) is widely considered one of the worst. The fact that an outside-in statistical model surfaces the same picture is itself a useful validation of the methodology.
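The team-level aggregation described above is a grouped sum over per-pick differences. A minimal sketch follows, with a hypothetical function name `team_totals`, placeholder team labels, and invented differences; picks with no baseline (a `None` difference) are skipped, which is an assumption about how such picks should be handled.

```python
from collections import defaultdict


def team_totals(pick_deltas):
    """Sum per-pick (team, delta) pairs by team, best draft first."""
    totals = defaultdict(float)
    for team, delta in pick_deltas:
        if delta is None:
            continue  # pick had no baseline cell to compare against
        totals[team] += delta
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder teams and invented differences, not real results.
pick_deltas = [
    ("Team A", 40.0),
    ("Team B", -25.0),
    ("Team A", 12.5),
    ("Team B", None),
    ("Team C", -60.0),
]
for team, total in team_totals(pick_deltas):
    print(f"{team}: {total:+.1f}")
```

Because this is a raw sum, a team with more picks has more opportunities to add or lose value, so a per-pick average is a reasonable companion figure.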
The model has real shortcomings worth being honest about:
RVP doesn't properly account for positional importance. A high-grade kicker and a high-grade quarterback get treated as equally valuable, which they obviously aren't. Something like PFF's WAR would be ideal, but it's gated behind a paywall. Falling back to RVP was a tradeoff for tractability over precision.
Sample size by position. Some position-round cells don't have enough historical data to support reliable baselines, which inflates noise for positions like QB and K.
Career arc isn't accounted for. A 3-year window catches early-career production but misses positions where peak performance comes later (offensive line) or where careers are short (running back). A position-weighted time window would help.
PFF grades aren't perfect. They're the best publicly grounded grade available, but they're still subjective at the margin and don't capture all positional context equally well.
Survivorship bias. Players who washed out of the league before three seasons are scored on truncated data, which can overstate or understate their actual contribution depending on when they fell off.
Two pieces of work in this space are worth checking out. Both tackle the positional value problem more rigorously than this project does:
"I created a public PFF WAR (sort of)". A Reddit user's attempt to recreate a PFF-style WAR metric outside the paywall using publicly available data. Same problem space, different approach to the positional weighting question.
Michael MacKelvie on draft value modeling. One of my favorite sports analytics creators on YouTube. He revisits the same surplus-value question with more rigor and a much cleaner dataset than I had access to. Recommended.