
Foreman, Fournette and Mixon Lead the 2017 RB Success Model

We have roughly a month until the 2017 NFL draft, when we will learn where our favorite (or not so favorite) prospects will land this coming season. While draft position and landing spot are huge factors in forecasting the success of any running back prospect, I’ve found that we can accurately predict whether a running back will be successful based largely on his production profile and athletic measurables. We know that collegiate production isn’t everything for wide receivers; it’s the only thing. For running backs, the situation is wholly different. Production matters, but size-adjusted speed is king for determining which running backs will be successful in the NFL.

You can define success many ways, but I’m choosing to use a top-12 fantasy point season (PPR) for running backs. The model’s dependent variable for early NFL success is whether or not a player had such a season within his first three years in the NFL.

We used age, production, and combine measurables to train and test the updated 2017 running back model. The model used 350 running back prospects who entered the NFL from 2000 to 2014, with the data split roughly 2-to-1 into training and testing sets. After plugging dozens of different production and combine statistics into the model and removing the least statistically significant one at a time, we were left with four (two combine, two production) that provide the most explanatory and predictive power (listed in order of statistical significance):

1. 40-yard dash
2. Weight
3. Final season rushing yards per game
4. Final season receptions per game

As you’d expect, the model favors faster, heavier prospects who had strong rushing and receiving production in their final college season. The 40-yard dash is by far the most influential statistic for predicting NFL success, followed by weight. The model’s c-statistic (the area under the ROC curve) on the test set is nearly 0.80, which is generally considered a strong score for a logistic regression model.
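If the c-statistic is unfamiliar: it is the probability that a randomly chosen hit (a back with a top-12 season) gets a higher model score than a randomly chosen miss. Here is a minimal sketch of that calculation. The function name `c_statistic` and the six toy prospects are made up for illustration; they are not the model's data or code.

```python
from itertools import product


def c_statistic(labels, scores):
    """Share of (hit, miss) pairs where the hit has the higher score.

    Tied scores count as half a concordant pair.
    """
    hits = [s for y, s in zip(labels, scores) if y == 1]
    misses = [s for y, s in zip(labels, scores) if y == 0]
    if not hits or not misses:
        raise ValueError("need at least one hit and one miss")

    concordant = 0.0
    for hit_score, miss_score in product(hits, misses):
        if hit_score > miss_score:
            concordant += 1.0
        elif hit_score == miss_score:
            concordant += 0.5
    return concordant / (len(hits) * len(misses))


# Toy data (hypothetical): 1 = had a top-12 season, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.58, 0.12, 0.40, 0.45, 0.05, 0.30]

print(round(c_statistic(labels, scores), 3))  # 7 of 9 pairs are concordant
```

A score of 0.5 means the model ranks hits and misses no better than a coin flip, and 1.0 means every hit outranks every miss.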
We are big believers in the predictive value of the three-cone drill here at RotoViz, and I found in my regression tree analysis that, when looking strictly at combine measurables, the three-cone drill is significant for slower backs. But many RB prospects choose to skip the agility drills at the combine, leaving us with a hard choice: either exclude them from the analysis or estimate the missing times. I estimated the missing times using a linear regression on a prospect’s weight and 40-yard dash, which are strong predictors of three-cone time. Even with these estimates, my analysis found that adding three-cone times to weight and 40-yard dash didn’t enhance prediction. I don’t think you should ignore agility in your running back prospect analysis, but I wouldn’t give a prospect additional credit for a fast three-cone time if he already has strong weight-adjusted speed.

To get a historical perspective on the types of prospects the model favors, here are the top-15 scores for the entire 2000-2014 data set. Remember, draft position is not one of the inputs in the model; I only added it here for reference. You can think of the “Top-12 Predict” score as the likelihood that the running back will meet the model threshold of registering at least one top-12 PPR season in his first three years as a pro.

| Player | School | Draft Year | Draft Position | Weight | Forty | RuYds/Gm | Rec/Gm | Top-12 | Top-12 Predict |
|---|---|---|---|---|---|---|---|---|---|
| Chris Johnson | East Carolina | 2008 | 24 | 191 | 4.24 | 109.5 | 2.8 | Yes | 0.58 |
| Darren McFadden | Arkansas | 2008 | 4 | 210 | 4.33 | 140.8 | 1.6 | Yes | 0.58 |
| Matt Forte | Tulane | 2008 | 44 | 218 | 4.46 | 177.2 | 2.7 | Yes | 0.58 |
| Kevin Jones | Virginia Tech | 2004 | 30 | 228 | 4.38 | 126.7 | 1.1 | Yes | 0.57 |
| Michael Turner | Northern Illinois | 2004 | 154 | 244 | 4.49 | 137.3 | 1.6 | No | 0.57 |
| J.J. Arrington | California | 2005 | 44 | 214 | 4.40 | 168.2 | 1.8 | No | 0.56 |
| DeMarco Murray | Oklahoma | 2011 | 71 | 213 | 4.41 | 86.7 | 5.1 | Yes | 0.55 |
| Latavius Murray | Central Florida | 2013 | 181 | 223 | 4.38 | 100.5 | 2.5 | Yes | 0.55 |
| LaDainian Tomlinson | Texas Christian | 2001 | 5 | 221 | 4.46 | 196.2 | 0.9 | Yes | 0.50 |
| Rashard Mendenhall | Illinois | 2008 | 23 | 225 | 4.45 | 129.3 | 2.6 | No | 0.50 |
| Reggie Bush | USC | 2006 | 2 | 203 | 4.36 | 133.8 | 2.8 | Yes | 0.50 |
| Adrian Peterson | Oklahoma | 2007 | 7 | 217 | 4.40 | 144.6 | 1.4 | Yes | 0.47 |
| Jonathan Stewart | Oregon | 2008 | 13 | 235 | 4.48 | 132.5 | 1.7 | No | 0.47 |
| Larry Johnson | Penn State | 2003 | 27 | 228 | 4.55 | 160.5 | 3.2 | Yes | 0.45 |
| Ronnie Brown | Auburn | 2005 | 2 | 230 | 4.43 | 76.1 | 2.8 | No | 0.43 |

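As a quick sanity check on the table above, we can compare how often these 15 backs actually hit with what their scores implied on average. This is only a sketch: the list is copied by hand from the Top-12 and Top-12 Predict columns, the variable names are mine, and 15 players (some of whom were in the training set) is far too small a sample to say much about calibration.

```python
# (player, had a top-12 PPR season in first three years, "Top-12 Predict" score)
# Values copied by hand from the table above.
top15 = [
    ("Chris Johnson", True, 0.58),
    ("Darren McFadden", True, 0.58),
    ("Matt Forte", True, 0.58),
    ("Kevin Jones", True, 0.57),
    ("Michael Turner", False, 0.57),
    ("J.J. Arrington", False, 0.56),
    ("DeMarco Murray", True, 0.55),
    ("Latavius Murray", True, 0.55),
    ("LaDainian Tomlinson", True, 0.50),
    ("Rashard Mendenhall", False, 0.50),
    ("Reggie Bush", True, 0.50),
    ("Adrian Peterson", True, 0.47),
    ("Jonathan Stewart", False, 0.47),
    ("Larry Johnson", True, 0.45),
    ("Ronnie Brown", False, 0.43),
]

hits = sum(1 for _, hit, _ in top15 if hit)
hit_rate = hits / len(top15)
mean_predicted = sum(score for _, _, score in top15) / len(top15)

print(f"hits: {hits} of {len(top15)}")            # hits: 10 of 15
print(f"actual hit rate: {hit_rate:.2f}")         # actual hit rate: 0.67
print(f"mean predicted:  {mean_predicted:.2f}")   # mean predicted:  0.52
```

In this group, 10 of 15 backs hit, against an average predicted likelihood of about 0.52.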
You’ll see that this draft-agnostic model was good at predicting success, even though five of the 15 above were drafted outside the first round. The model does have some misses, but even technical misses like Michael Turner, Rashard Mendenhall and Jonathan Stewart were more near-hits or late-bloomers than abject failures. Now the part we’ve all been waiting for: Let’s apply our historically accurate model to the 2017 draft class. Here are the top-10 scores.


By Kevin Cole | @Cole_Kev

Comments

  1. There is a formula, but it's not easily calculated like a linear regression. Not sure it would be much value to share.

  2. @colekev_FF Ssoooo he ran a 4.45 at his pro day, that's pretty amazing at that size

  3. I think you're right that the tree nodes are smaller and more difficult to rely on for statistical significance. The value of the trees was more to put the combine drills into a digestible format based on past results. The trees can be overfit, or closely follow past data at the expense of being predictive.

  4. Thanks for doing this. The tree model always bothered me since it created binary branches out of continuous data, and resulted in findings where a 0.01 difference in forty or agility time would create massive swings in "success" predictions.
