Editor’s note: I’m very excited to have a good friend of the blog, Clay Collins, making his TLN debut today in a guest post. My fingers are crossed that we can make it more of a regular thing. Clay has been developing a player point projection model, and admittedly I’m a sucker for it, as I am for anything that gives me a range to work with rather than an absolute number. I encourage you all to check out Clay’s work and give him a follow on Twitter. Hopefully this will lead to more of his work appearing on The Leafs Nation.
Maple Leafs 2021-2022 Player’s Point Projections
by Clay Collins
DISCLAIMER: This model does NOT follow best practices for building and implementing a model, and that was sort of by design. I wanted to put together a very quick and dirty model this season so I can see my own growth through my models as I progress through my schooling and career as a data scientist. This season is a baseline for me, and as such it has some very major issues. There are plenty of much smarter and more experienced modelers, and plenty of other projections, throughout the hockey world that I implore you to explore!
The Components of My Model:
I created this model using very basic Multiple Linear Regression. I promise that sounds more complex than it is. It’s basically the old:
y = mx + b
formula, but adding more variables to make it:
y = m1x1 + m2x2 + … + b
I used the last five seasons (2016-17 through 2020-21) as my dataset. I also used four fairly basic stats, plus one additional stat per model to help out. The four raw stats are Shots on Goal (SOG), Shot Attempts (Satt), Corsi-For (CF) and Takeaways (TK). The “cheat” stats that I introduced were Goals/60 (Gper60) in the goals model and Assists/60 (Aper60) in the assists model. I will get to why I decided this when I address why there are ranges.
So here is where those ugly math formulas come into play, the actual models themselves:
Goals = 0.102*SOG - 0.0008*CF + 0.065*TK - 0.0169*Satt + 9.21*Gper60 - 3.655
Assists = -0.00011*SOG + 0.0009*CF + 0.088*TK + 0.028*Satt + 11.79*Aper60 - 8.567
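For readers who would rather see the two formulas as code, here is a minimal Python sketch that evaluates them. The coefficients are copied from the formulas above; the dictionary layout, the `predict` helper, and the example player's numbers are assumptions made purely for illustration, not the author's actual implementation or real data.

```python
# Coefficients copied from the two formulas in the post.
# The dict layout and function name are illustrative assumptions.
GOAL_COEFS = {"SOG": 0.102, "CF": -0.0008, "TK": 0.065, "Satt": -0.0169, "Gper60": 9.21}
GOAL_INTERCEPT = -3.655

ASSIST_COEFS = {"SOG": -0.00011, "CF": 0.0009, "TK": 0.088, "Satt": 0.028, "Aper60": 11.79}
ASSIST_INTERCEPT = -8.567


def predict(coefs, intercept, stats):
    """Evaluate y = m1*x1 + m2*x2 + ... + b for one player."""
    return sum(coefs[name] * stats[name] for name in coefs) + intercept


# Hypothetical inputs for a made-up player (placeholders, not real data).
example = {"SOG": 200, "CF": 1200, "TK": 40, "Satt": 350, "Gper60": 1.0, "Aper60": 1.2}

goals = predict(GOAL_COEFS, GOAL_INTERCEPT, example)
assists = predict(ASSIST_COEFS, ASSIST_INTERCEPT, example)
print(f"goals: {goals:.1f}, assists: {assists:.1f}, points: {goals + assists:.1f}")
```

Each model is just a weighted sum of the inputs plus an intercept, so swapping in a different player only means changing the `example` dictionary.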
There is a lot more you can dive into with this but only the real stats nerds would want to talk about it, and if you are a real stats nerd, please don’t put any weight in my model because it is not good.
Why Are There Ranges?
The models listed above were computed using historical data. However, we need to decide what to put in for the SOG, CF, TK, Satt, and Gper60 (or Aper60) stats for each player to spit out a real number. So, what do we put in there? I used a “per game” rate for the stats (excluding the Goals and Assists per 60) and multiplied it by the number of games I expect the players to play. These are the numbers I will be adjusting throughout the season as players miss games due to injuries. For example, I had Matthews at 82 games when I originally ran these, but with him not being ready yet, I knocked him down to 75 games.
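The scaling step described above can be sketched in a few lines of Python. The function name and the per-game rates are hypothetical placeholders; only the 82-game and 75-game figures come from the Matthews example in the text.

```python
def season_inputs(per_game, games, per60):
    """Scale per-game rates to season totals; per-60 rates pass through unscaled."""
    totals = {name: rate * games for name, rate in per_game.items()}
    totals.update(per60)
    return totals


# Hypothetical per-game rates (placeholders, not a real player's numbers).
per_game = {"SOG": 4.0, "CF": 20.0, "TK": 0.5, "Satt": 7.0}
per60 = {"Gper60": 1.5}

full_season = season_inputs(per_game, 82, per60)  # original games estimate
shortened = season_inputs(per_game, 75, per60)    # after knocking off missed games
print(full_season["SOG"], shortened["SOG"])
```

Because only the counting stats are multiplied by games played, lowering the games estimate shrinks those inputs while the per-60 rate stays the same.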
My models’ ranges aren’t a spectrum of all possibilities between the low and high. Rather, I used two different sets of inputs: Last Five Season Averages, and Last Five Season Trends. This is also why some players have a small range (like how Tavares is projected between 28 and 30 goals) and others have a larger range (such as Matthews’ 37 to 45). A player like Tavares has five-year averages that are very similar to his five-year trends, whereas (believe it or not) Matthews is still trending upwards from his averages.
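As a rough sketch, the range is just the two model outputs put in order. The function name is an assumption, and treating 37 as the averages-based output and 45 as the trend-based output for Matthews is a reading of the paragraph above (he is "trending upwards from his averages"), not something the post states outright.

```python
def projection_range(from_averages, from_trends):
    """Return (low, high) from the two model outputs, whichever order they arrive in."""
    low, high = sorted((from_averages, from_trends))
    return low, high


# Rounded Matthews goal figures quoted in the post; which input produced
# which number is an assumption.
print(projection_range(from_averages=37, from_trends=45))
```

A player whose trend points downward would simply have the trend-based number as the low end instead.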
What do I mean by the trends? This is where the other bad modeling behavior comes into play. I’ll use my model for the Coyotes that I did first to explain this one (and you will see why).
Alright #yotes friends:
Here are my goals-assists-points projections/ranges for the Coyotes roster this year.
*This is NOT using best data practices but I will run through my models and inputs in a thread below pic.twitter.com/eClgItQYYM
— Clay Collins (@Clay_C10) October 12, 2021
Let’s look at Chychrun’s Five Year Trend for Shot Attempts per Game.
I use some opinion in deciding what trend type to use. Here, my thinking was: “Chychrun has been trending up in a lot of ways, so is it reasonable to assume that his last season’s 6.34 shot attempts per game could be pumped up to the projected 6.8 shot attempts per game? Well, yes. He is only getting more confident, and the Coyotes traded away OEL. This paves the way for even more time and growth from Chychrun.”
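The post does not say which tool or trend type produced the Chychrun projection. One common choice is a straight line fitted by least squares and extended one season forward, sketched below. The function names and the five-season history are hypothetical placeholders, not Chychrun's real numbers.

```python
def linear_trend(values):
    """Fit y = slope*x + intercept by least squares to values at x = 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project_next_linear(values):
    """Extend the fitted line one step past the last observed season."""
    slope, intercept = linear_trend(values)
    return slope * len(values) + intercept


# Hypothetical per-game shot attempts over five seasons (placeholders).
history = [4.0, 4.5, 5.0, 5.5, 6.0]
next_season = project_next_linear(history)
print(round(next_season, 2))
```

The projected value then feeds the per-game input for the coming season, which is where the judgment call about whether the trend is believable comes in.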
Another example the other way is Jay Beagle’s Goals per 60.
Well… Even if it makes sense in its own weird way, Beagle cannot have negative Goals per 60. Even saying zero may be disingenuous.
For him I went with an exponential model (think radioactive half-lives):
This one seemed more likely to me for my purposes, and it fits very well.
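To show why an exponential trend avoids the negative-rate problem, here is a minimal sketch that fits y = a * exp(k * x) by running least squares on the logarithm of the values. This is one standard way to fit an exponential curve and is an assumption about the method, since the post does not describe how the fit was done. The function names and the history are hypothetical placeholders, not Beagle's real numbers.

```python
import math


def exponential_trend(values):
    """Fit y = a * exp(k * x) at x = 0, 1, 2, ... via least squares on ln(y)."""
    if any(v <= 0 for v in values):
        raise ValueError("an exponential fit needs strictly positive values")
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean_x = (n - 1) / 2
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(logs))
    k = sxy / sxx
    a = math.exp(mean_y - k * mean_x)
    return a, k


def project_next_exponential(values):
    """Extend the fitted curve one step past the last observed season."""
    a, k = exponential_trend(values)
    return a * math.exp(k * len(values))


# Hypothetical per-60 rates that halve every season (placeholders).
history = [0.8, 0.4, 0.2, 0.1, 0.05]
next_rate = project_next_exponential(history)
print(round(next_rate, 4))
```

A decaying exponential gets closer and closer to zero without ever crossing it, so the projected rate stays positive no matter how steep the decline.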
I do these for each player with their different stats and adjust the trendline to be more reasonable. This puts a lot of bias and opinion into the choice of trend used to project each input, which is good for getting more reasonable projections player by player. However, it is bad if you are trying to project across the entire NHL.
This also means I am much more confident in my Coyotes stats than I am in the Leafs numbers. I have a lot more time invested in how the team should play next season with the current roster and can make more informed decisions when deciding what model to use.
For the “My ____ Pick” columns, I picked which of the two outputs I felt was more likely based on the player. This means for some I chose the averages, and for others I chose the trends.
How accurate will this model be?
As accurate as using averages and trends can be. For some players, I feel pretty good about my definitive picks. For others, I feel like my ranges will probably be accurate. Then for players with not a lot of data or absurd trends, like Bunting, I really have no idea.
While I would tend to say that using the trend data is overall better than using averages, the last two seasons were anything but average for all of us, including NHL players. For this reason, I decided to use both and report a range running from the lower to the higher of the two outputs.
Either way, this is just my first step on my personal journey in predictive modeling, and I will be monitoring it throughout the season. I hope to come back next season with a much better model and a lot more confidence in what I am choosing.