I’ve had a few questions recently about how to interpret the Similarity Score apps and what information can be gleaned from using them.  I figured I should get a post together to address some common questions.

The Player Name Dropdown and Game Checklist

The killer part of this projection tool is the ability to throw out games you don’t want considered as part of your similarity search.  If you want to forecast Mike Wallace’s 2013 by throwing out games he played in that Roethlisberger didn’t start, you can do that.

The Three Tabs

Season N

This tab contains the 2012 stat summary for the subject player, along with the 20 most similar seasons in our database.  The cool thing about this tab is that if you paste the 2nd table into Excel and then average up the columns, you’ll come so close to the stat summary of the subject player as to think it’s magic (this happens unless the subject player is an extreme outlier like AP 2012 or Calvin Johnson 2012, but even then it gets pretty close).  So in aggregate, the 20 similar seasons are a rough approximation of our subject player.
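If you would rather not do the paste-into-Excel step by hand, here is a minimal Python sketch of the same column averaging.  The column names and every number below are made up for illustration (they are not real player data or the app's actual column layout), and only three rows are shown instead of 20.

```python
from statistics import mean

# Hypothetical comp rows: made-up numbers, NOT real player data.
# Column names are assumptions, not the app's actual headers.
comps = [
    {"player": "Comp A", "rush_yds_pg": 95.0, "rec_yds_pg": 12.0, "td_pg": 0.70},
    {"player": "Comp B", "rush_yds_pg": 105.0, "rec_yds_pg": 18.0, "td_pg": 0.80},
    {"player": "Comp C", "rush_yds_pg": 100.0, "rec_yds_pg": 15.0, "td_pg": 0.60},
]


def average_columns(rows):
    """Average every numeric column across the rows, skipping text columns."""
    numeric_keys = [k for k, v in rows[0].items() if isinstance(v, (int, float))]
    return {k: mean(r[k] for r in rows) for k in numeric_keys}


comp_average = average_columns(comps)
for column, value in comp_average.items():
    print(f"{column}: {value:.2f}")
```

Put the result next to the subject player's stat summary and you can see how close the comps come in aggregate.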

Season N+1

This tab shows what the similar players did in the season after the one that was comparable to the subject player.  This creates a very simple, and also very lazy, projection.  It’s a lazy projection because it doesn’t rely on linear regressions, which some people might use to forecast.  Instead, we just assume that looking at similar seasons is a good way to create a forecast.  In reality I’ve forecast both ways, with regressions and with similarity scores, and I opt for similarity scores because I’ve found that they reduce standard errors and pick up effects that a linear regression might miss.
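To make the "lazy projection" idea concrete, here is a rough sketch of it: take what the comps did in Season N+1 and summarize it.  The per-game numbers are invented for illustration, and the choice to report the mean, median, low and high is my assumption about a sensible summary, not a description of what the app computes internally.

```python
from statistics import mean, median

# Hypothetical Season N+1 fantasy points per game for the comps (made-up values).
next_season_ppg = [14.2, 9.8, 17.5, 11.0, 6.5]

# The lazy projection: summarize what the comps actually did the next year.
projection = {
    "mean": mean(next_season_ppg),
    "median": median(next_season_ppg),
    "low": min(next_season_ppg),
    "high": max(next_season_ppg),
}

for label, value in projection.items():
    print(f"{label}: {value:.1f}")
```

The low and high values matter as much as the mean, because they are a reminder that the forecast is a range and not a single number.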

Plots

This is a simple visual representation of the Year Over Year change in per game fantasy scoring that happened among the comps.  Don’t be discouraged if a guy you like has some negatives in his comparable set.  This is football, which means that it’s a war of attrition and it’s just very difficult to keep up fantasy scoring year to year.  What should be discouraging though is when you find a player whose comps overwhelmingly hit a wall in Year N+1.  This happens most frequently with older running backs who don’t catch passes.  It’s also sometimes seen in smaller receivers who get a lot of touchdowns on long pass plays.
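The numbers behind a plot like this are easy to work out yourself.  The sketch below uses made-up pairs of Season N and Season N+1 per-game scoring (not real comps) and computes the change for each comp, the median change, and the share of comps that declined.

```python
from statistics import median

# Hypothetical (Season N, Season N+1) fantasy points per game; made-up values.
comps = [(18.0, 14.0), (17.5, 19.0), (16.0, 9.0), (19.0, 15.5), (17.0, 12.0)]

# Year-over-year change for each comp: negative means scoring dropped.
changes = [season_n1 - season_n for season_n, season_n1 in comps]

median_change = median(changes)
share_declined = sum(change < 0 for change in changes) / len(changes)

print(f"median change: {median_change:+.1f} points per game")
print(f"share of comps that declined: {share_declined:.0%}")
```

A few negatives are normal.  A share of decliners near 100% is the "hit a wall" pattern worth worrying about.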

Now on to some stuff that covers how you should think about the apps.

The Range of Outcomes

The RotoViz motto is “Seeing is Believing”.  Sometimes that means actual data visualization, but sometimes it just means a visual manifestation of an idea.  The visual manifestation of the idea that all players have built-in risk is found in the Similarity Score apps.  In a sense the Similarity Score apps are risk management tools.  Every fantasy player has been burned by the “Can’t Miss” draft pick who tore up the league the previous year.  The Similarity Score apps present a reminder that this happens all of the time.  Take a look at AP’s Similar Players and note all of the guys who averaged 100 yards per game one year and then dropped to 40 or 60 or 70 yards the next year.  They might not be AP, but they were pretty good too.  I drafted Jamal Lewis in 2004 after he had a season pretty close to AP’s 2012.  Lewis didn’t totally crater, but he did experience a dropoff in production.  If you look at the Plots tab, you see that Lewis’ dropoff is the median expected dropoff for the AP comparables.

Mean Reversion

What do you think the odds are that AP averages 6 YPC again next year?  I would have to get pretty good odds on a bet in order to put any money on him averaging 6 YPC again.  I don’t have to know anything about him, or the Vikings’ intended use of him, to say that.  All I have to know is that 6 YPC sits on the edge of the distribution for running back per carry averages, so it’s more likely that his YPC number next year will be closer to the mean.  Similarity scores also illustrate this idea because you can actually see the mean reversion taking place.  It’s not an abstract idea because you have 20 examples of guys that it happened to, who are reasonable approximations of the guy you’re looking at.  Sometimes people respond to this idea by saying that Player X is different.
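If the mean reversion argument still feels abstract, a toy simulation can show it.  Everything here is an assumption made for illustration: the model (observed YPC equals a stable skill level plus season-specific luck), the 4.2 average, and the spread values are invented, not estimated from real data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed toy model: observed YPC = stable skill + season-specific luck.
# The 4.2 mean and the 0.3 / 0.6 spreads are made-up illustration values.
players = []
for _ in range(2000):
    skill = random.gauss(4.2, 0.3)
    year1 = skill + random.gauss(0, 0.6)
    year2 = skill + random.gauss(0, 0.6)
    players.append((year1, year2))

# Take the 100 best Year 1 seasons and see what the same players did in Year 2.
top = sorted(players, key=lambda p: p[0], reverse=True)[:100]
top_year1 = mean(p[0] for p in top)
top_year2 = mean(p[1] for p in top)
league_mean = mean(p[0] for p in players)

print(f"top group, year 1: {top_year1:.2f} YPC")
print(f"top group, year 2: {top_year2:.2f} YPC")
print(f"everyone, year 1:  {league_mean:.2f} YPC")
```

In this toy setup the top group stays above average in Year 2 but gives back much of its Year 1 edge, which is the same pattern the comps show.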

Some Similarities Seem Apocalyptic

“This time is different” is the battle cry of people who bet against mean reversion.  Financial bubbles are built on the “this time is different” mantra.  Every time I present the results of the Similarity Score projections, without fail someone says that Player X is different.  But I think that what people don’t realize is that I’m presenting 20 observations of similar seasons.  This isn’t just “here’s one guy and here’s another guy, the results will be the same.”  In a very simple way the Similarity Score apps are giving you a range of outcomes.  It might be possible to dismiss one or two guys as not being actually similar, but it’s pretty difficult to dismiss every single one of the 20 similar players.  If you want to dismiss one or two guys because you have a good reason, that’s fine.  It’s easy to paste the tables into Excel, delete the names you don’t think are similar, and then average up the results to get a new projection.  But I would caution you that doing this invites wishful thinking and makes it harder to stay truly risk aware.
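Here is a small sketch of that delete-and-re-average step, written so you can see how much the projection moves when you throw a comp out.  The comp names and numbers are placeholders I made up, and the `excluded` set is whatever you decide to dismiss.

```python
from statistics import mean

# Hypothetical Season N+1 fantasy points per game by comp (made-up values).
comps_next_season = {"Comp A": 14.0, "Comp B": 9.0, "Comp C": 17.0, "Comp D": 6.0}

# Names you have decided are not truly similar (your judgment call).
excluded = {"Comp D"}

full_projection = mean(comps_next_season.values())
trimmed_projection = mean(
    ppg for name, ppg in comps_next_season.items() if name not in excluded
)

print(f"all comps:        {full_projection:.2f}")
print(f"after exclusions: {trimmed_projection:.2f}")
print(f"shift:            {trimmed_projection - full_projection:+.2f}")
```

Printing the shift keeps you honest.  If every comp you dismiss happens to be a bad outcome, the projection only ever moves up, and that is a sign of wishful thinking.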

A Few Bad Comps Doesn’t Mean You Can’t Draft a Guy

Value is always relative.  Don’t be discouraged if the guy you’re looking at has a few bad comps because all of the guys you might draft instead will also have a few bad comps in their similar set.  In the end you’re just trying to make the most accurate risk assessment you can and then draft your team accordingly.

I’ll try to cover this stuff further in future posts so that the apps can be as useful as possible for you.