What’s a goalie worth these days? Determining replacement level goaltending

The concept of a replacement level player is one of the most important ideas in sports analytics, but one that is unfortunately used relatively infrequently in hockey. Replacement level is meant to represent the level of talent that’s widely available to any team at little or no cost. Comparing players against replacement level is important because it essentially tells us whether a player is worth keeping or not – anyone performing below replacement level should be discarded, since it’s likely that any free-agent or waiver-wire pickup would perform better. It’s important to note that this is different from comparing against the average player – when we compare against the average we know whether a player is better or worse than most of the players eating up minutes throughout the league, but that doesn’t give us any indication of whether he’s worth keeping around.

For position players, figuring out an appropriate replacement level is extremely difficult – there’s no broad agreement on which individual statistics should be used, let alone how to determine who a replacement level player is. For goalies, however, we have a half-decent metric (Even Strength Save Percentage) that’s broadly accepted as being representative of individual talent, even if it is somewhat more variable in the short term than we’d like.

Read more ›

Posted in Uncategorized

Why is it so hard for good teams to get better? Looking at the value of a marginal goal

We’re now three days into free agency, and with most of the marquee names on the market already signed for the coming year and beyond, teams and fans alike are starting to look over their rosters, trying to figure out whether they’re sitting in a better position than they were on June 30th. For some teams the improvement is fairly obvious: after finishing 8th in the West last year, the Dallas Stars have added Jason Spezza and Ales Hemsky and look poised to take a run at the Blues and Blackhawks for Central Division supremacy. The Anaheim Ducks also made a big splash, bringing in Ryan Kesler from the Canucks while sacrificing only Nick Bonino and Luca Sbisa. Having added a big name to take some of the pressure off of Ryan Getzlaf, should we expect a similar bump to elevate the Ducks to Presidents’ Trophy winners next year?

This type of situation comes up frequently, not just in the NHL but across professional sports: a fringe team adds a few key pieces and turns into a championship contender, while a team that finished the previous year on the edge of winning it all can’t turn its added potential into dominance. Ultimately, what this comes down to is that a given player isn’t worth the same to every team: adding goals (or allowing fewer goals, depending on which way you look at it) has a different effect on a team’s record depending on how good that team was before. Adding Brad Richards to the Blackhawks isn’t the same as adding Brad Richards to the Blue Jackets would be. To put it another way, the value of a marginal goal decreases as a team’s goal differential increases (and in a non-linear manner, as we’ll see later).

To illustrate this point, I put together a quick simulation to look at how the value of an extra goal varies with team strength. I began by generating 82 games, picking a random number of goals for and against between 0 and 5 for each. I assumed that we were only looking at regulation play, and so allowed games to finish tied (while this isn’t perfect, in practice it shouldn’t make much difference). I totalled up the initial goal differential across the 82 games, and then chose one of the 82 games at random to add a goal to. If that game had initially finished in a tie, or with our simulated team down by 1 goal, the extra goal turns the tie into a win or the loss into a tie, so I added one point to the team’s overall point total. I then repeated this process 50,000,000 times and counted how many times a team got an extra point by adding an additional goal. If we take that number and divide it by the total number of times we added an extra goal, we get the marginal value of an added goal for a team with a given goal differential.
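For anyone who wants to play with the numbers, here’s a minimal Python sketch of this simulation (far fewer trials than the 50,000,000 used here, and the function name and the ±5 goal-differential bucket are my own choices):

```python
import random

def marginal_point_value(target_diff=0, trials=20000, bucket=5):
    """Estimate the standings points gained by adding one goal to a random
    game, for simulated seasons landing near a target goal differential."""
    gained = counted = 0
    for _ in range(trials):
        # An 82-game regulation-only season: goals for/against uniform on 0-5,
        # ties allowed to stand (tie = 1 point, win = 2 points).
        games = [(random.randint(0, 5), random.randint(0, 5)) for _ in range(82)]
        diff = sum(gf - ga for gf, ga in games)
        if abs(diff - target_diff) > bucket:
            continue  # only count seasons near the differential we care about
        counted += 1
        gf, ga = random.choice(games)  # add one goal to a random game
        # A tie becomes a win, or a one-goal loss becomes a tie: +1 point.
        if gf == ga or gf == ga - 1:
            gained += 1
    return gained / counted if counted else float("nan")
```

Running this for a range of target differentials reproduces the downward (and slightly asymmetric) curve described below.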

Pts. Per Marginal Goal


As we’d expect, the number of additional points a team earns for adding an extra goal decreases as our imaginary team’s goal differential increases, and the relationship isn’t linear. A team sitting on a +20 goal differential gets less benefit from adding one additional goal than a team with a -20 goal differential. From a practical perspective, this means that it’s a lot easier for a bad team to improve than it is for a good team.

What’s interesting to note, though, is that the graph doesn’t actually peak at 0 as you might expect. While a team with just as many goals for as against might expect to earn 0.36 additional points from adding one more goal, a team with a -50 goal differential will get about 0.37 additional points. On the other hand, a team with a +50 goal differential is only going to earn approximately 0.32 additional points from scoring once more over the course of a season. This might explain why teams are able to go from the bottom of the league to near the playoff cut-off so quickly, while the teams that dominate are often a lot harder to knock off their perch.

Another way to look at this is to consider how many goals you need to add to gain one extra win. While we commonly use a constant goals-per-win number when discussing player value in the analytics community (usually about 6), the data above tells us we shouldn’t expect the number of goals per win to be the same for each team. If we take the inverse of each number above and multiply it by 2, we can translate our marginal goal value into the number of goals needed to add one additional win (or 2 standings points in this case) for different team strengths.
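That conversion is a one-liner; a quick sketch, using the marginal point values quoted above as inputs (rounding means the results differ slightly from the fitted-curve numbers):

```python
def goals_per_win(pts_per_marginal_goal):
    # One win is worth 2 standings points, so invert the marginal point
    # value of a goal and scale by 2.
    return 2 / pts_per_marginal_goal

goals_per_win(0.37)  # -50 differential team: roughly 5.4 goals per win
goals_per_win(0.32)  # +50 differential team: roughly 6.3 goals per win
```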

Goals Per Win


Looking at it this way makes things slightly more obvious: a team with a -50 goal differential needs to bring on 5.3 additional goals to increase their win total by 1, while a team with a positive 50 goal differential needs to add over 6 goals to achieve the same result.

One thing to keep in mind here is that the values in both graphs above will change depending on the goal scoring environment. While in our model we assumed that each team had an equal chance of scoring and allowing anywhere between 0 and 5 goals, in reality the probabilities are different and while the results should generally look the same, the fitted curves that we generate will be different.

Nevertheless, the implications for teams shopping the free agent market should be obvious: while it might make sense for a team struggling at the edge of the playoff picture to look for one or two big pieces to propel them forward, for a team at the top it’s a lot more difficult to improve its place in the standings by spending money. This is, of course, complicated by the fact that it’s harder to find pieces to replace on a good team: even if you can go out and add a top-line forward, the player whose minutes he’ll be taking is likely to have been a useful player already. All of which is to say that it’s not all that surprising that teams often fail to strike gold and establish dominance through free agency: it’s not that they’re not getting better, it’s just that the numbers are often stacked against them from the start.

Posted in Theoretical

2014 Free Agent Preview: Forwards

It’s almost free-agent season, the time of year when most fans are salivating over all the possible superstars their team might add, while the analytics fans of the world are mostly hoping their GMs don’t screw up too badly. In celebration of this most joyous, terrifying time of year, Puck++ is proud to present our First Annual Puck++ Players for Purchase at a Premium Price Preview™.

Rather than going through every available free agent, though (since in the time it would take me to write them up, most will have ended up signing), I’m going to focus on 15 forwards and 10 or so defencemen (to be determined when/if I get around to writing part 2) who are likely to garner significant interest, or at least significant media attention.

For each player I’m going to present 3 sets of data: 1) His traditional stats from the last 3 seasons (G, A, Pts., +/-); 2) His xGF20, xGA20 and xGD20 numbers for the last 3 seasons (the details on how these numbers are calculated can be found here and here); and 3) His 5 most comparable players, using xGF20 and xGA20 for the past 3 years. The comparable players are determined by looking at which players had seasons most similar from an xGF20 and xGA20 point of view over the past 3 years. As an aside, if you want a quick cheat sheet to evaluate xGD20 numbers, here are the average numbers based on ice-time in the 2013-2014 season:

Line xGD20
1 0.09
2 0.05
3 0.00
4 -0.05
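Since the post doesn’t spell out how the comparables are computed, here’s one plausible sketch: rank players by Euclidean distance in (xGF20, xGA20) space. The distance measure and function name are my assumptions; the sample values come from the 2013-2014 tables further down this page.

```python
import math

def most_comparable(target, players, n=5):
    """Rank players by Euclidean distance to the target's (xGF20, xGA20).
    The distance measure is an assumption; the post doesn't specify one."""
    tx, ty = target
    ranked = sorted(players, key=lambda name: math.hypot(players[name][0] - tx,
                                                         players[name][1] - ty))
    return ranked[:n]

# Sample (xGF20, xGA20) values from the 2013-2014 forward table on this page
forwards = {
    "THORNTON, JOE": (1.01, 0.62),
    "BERGERON, PATRICE": (0.97, 0.61),
    "NEAL, JAMES": (1.08, 0.73),
    "TOEWS, JONATHAN": (1.03, 0.71),
}
most_comparable((1.12, 0.73), forwards, n=2)  # Crosby's numbers as the target
```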

All salary data quoted below is from CapGeek. All standard stats from HockeyDB. Blame them (or my poor transcription) for any errors.

Read more ›

Posted in Free Agency

Context Neutral Player Evaluation: Examining Defence and Calculating xGD20

A few months ago I wrote about xGF20, my attempt to isolate a player’s offensive ability from his teammates’ abilities and from the luck involved in shooting percentages over small samples. At its core, xGF20 is based on 3 reasonably repeatable individual measures:

1) A player’s rate of individual shot generation;

2) A player’s rate of altruistic shot generation (that is, the shots he generates for his teammates); and

3) A player’s individual shooting ability (regressed based on position)

Using these three metrics, xGF20 presents a player’s expected on-ice goals for per 20 minutes of ice-time, assuming his teammates shoot at a league-average rate. While the calculation is slightly more complicated than what I’ve just presented, it does allow us to make comparisons between players without worrying about who their linemates were. If you put Jason Spezza on a line between my brother and me (sorry, Ben), you’d expect his on-ice GF20 to drop dramatically through no fault of his own. Although xGF20 is very similar in theory to Relative CF20, taking a player’s own shooting ability into account allows us to differentiate between those who just throw the puck at the net without purpose (David Clarkson) and those who take more shots because they’re better at it (Alex Ovechkin).

Looking at xGF20 numbers over the past few years was encouraging not only because the results agreed with common intuition (Sidney Crosby had 5 of the top 10 seasons over the past 5 years), but also because it was highly repeatable year-over-year. When we calculate a player’s xGF20 for a given year, it has a strong correlation to his xGF20 in subsequent years, a good sign that we’re isolating an individual ability.

xGF20 did have a major fault as a player evaluation tool though: it only looked at one side of the game. To get a full view of a player’s contribution to his team we need to consider the defensive side of things as well. This is, unfortunately enough, where things get tricky. Isolating individual defensive ability is extremely hard to do, as defensive results are influenced by the efforts of every player on the ice as well as any effects a coach might have through his choice of defensive system (see here, for example, where I found that teams had a fairly strong ability to control whether opposing forwards or defencemen were taking the shots against their net).

A lot of effort in the analytics community has recently been put into finding effective ways to evaluate defencemen (who tend to, intuitively at least, shoulder most of the defensive responsibility): Tyler Dellow has looked at using CorsiRel (the difference between a team’s CF% with a player on the ice and off the ice) and found that most “famous” defencemen tend to outperform their team’s results, while Garret Hohl looked at defencemen’s effect on both their team’s and their opponents’ Corsi %.

We’re going to take a similar approach here on the defensive side of things, but rather than look at Corsi %, we’ll use CA20 WOWY. In an ideal world we’d be able to break down defensive contribution further, but given that there’s a lack of clear evidence to support the idea that on-ice save percentage is repeatable, CA20 WOWY gives us a decent way to measure a player’s shot prevention ability relative to his teammates. Our expected goals against metric simply takes a player’s CA20 WOWY and adds it to the league average CA20 (17.5) to calculate his expected CA20. We then multiply by the league average Corsi shooting percentage to find his xGA20:

xGA20 = (17.5 + CA20 - TMCA20) * lgCSh%
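As a sketch, the xGA20 calculation looks like this in Python. The 17.5 league-average CA20 is from the post; the 4.3% league-average Corsi shooting percentage is an illustrative assumption, since the post doesn’t quote the exact figure it uses:

```python
LG_CA20 = 17.5       # league-average shot attempts against per 20 (from the post)
LG_CSH_PCT = 0.043   # assumed league-average Corsi shooting %, for illustration

def xga20(ca20, tm_ca20, lg_csh_pct=LG_CSH_PCT):
    """Expected goals against per 20: the player's CA20 WOWY (his CA20 minus
    his teammates' CA20) added to the league average, scaled by the
    league-average Corsi shooting percentage."""
    return (LG_CA20 + ca20 - tm_ca20) * lg_csh_pct
```

A player whose shot suppression exactly matches his teammates’ lands at the league-average expected goals against; outperforming his teammates pushes his xGA20 below it.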

xGA20 shows significantly less repeatability than xGF20, which we would expect given that it attempts to measure something that’s far stronger at the team level than the individual level. If we look at the year-to-year correlations in the graph below, we see that as more years pass xGA20 shows stronger repeatability than unadjusted CA20 for defencemen (interestingly, for forwards raw CA20 seems to show a stronger correlation). While the repeatability isn’t what we’d hope for in an individual stat, it does give us a starting point in measuring a player’s defensive contribution.

xGA20 Year-Over-Year Correlations


One thing that’s important to note is that we see a lot less variance in xGA20 than in xGF20, since we don’t use an individualized save percentage. While this may unfairly penalize the small subset of players who do have an effect on their on-ice save percentage, it allows us to sidestep the difficult issue of trying to separate a player’s contribution to on-ice save percentage from his goalie’s.

Position xGA20 Mean Std. Dev.
F 0.76 0.057
D 0.75 0.057

Now that we’ve taken care of the defensive side of things, let’s look at pulling everything together into one aggregate metric. After all, what we ultimately care about is whether or not a player is going to help our team outscore our opponents. To measure that, all we have to look at is the difference between each player’s xGF20 and his xGA20, which we’ll call xGD20.

xGD20 = xGF20 - xGA20
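Putting the two halves together, with a rough line-quality lookup based on the 2013-2014 forward ice-time group averages reported in this post (the `line_quality` helper is my own):

```python
def xgd20(xgf20, xga20):
    # Expected on-ice goal differential per 20 minutes of ice-time
    return xgf20 - xga20

# 2013-2014 forward ice-time group averages, as reported in this post
LINE_AVERAGES = [(0.09, "1st line"), (0.05, "2nd line"), (0.00, "3rd line")]

def line_quality(value):
    """Map an xGD20 value to a rough line-quality label (my own helper)."""
    for cutoff, label in LINE_AVERAGES:
        if value >= cutoff:
            return label
    return "4th line"

line_quality(xgd20(1.12, 0.73))  # Crosby's 2013-2014 numbers: "1st line"
```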

xGD20 shows reasonably good repeatability, with a year-to-year correlation for forwards of 0.62 between subsequent seasons, dropping to 0.51 when we compare one year with 4 years in the future. For defencemen the correlations are lower, as we’d expect, ranging from 0.41 (year y to y+1) to 0.47 (year y to y+4). What’s important to note, though, is that it tends to be more repeatable than CF% WOWY for forwards, and at the same level for defencemen (to be expected, since the defensive shooting is heavily regressed and they take fewer shots themselves).

xGD20 Year-Over-Year Correlations


From a descriptive statistics point of view, the means for both forwards and defencemen are right around 0, where we’d expect them to be. We again see more variance in the forwards’ numbers, driven by the higher variance in regressed shooting percentage.

Position Mean Std. Dev.
F -0.01 0.14
D 0.01 0.10

One of the interesting things I’ve found is that xGD20 relates fairly well to even-strength ice-time. If we break up the list of forwards from 2013-2014 into 4 groups by ice-time, we see that xGD20 decreases steadily from group to group.

Group xGD20 Average
1 0.09
2 0.05
3 0.00
4 -0.05

We also see the same result, albeit to a lesser degree, if we break up the defencemen by ice-time, with first-pairing blueliners showing better xGD20 numbers than 2nd and 3rd pairing players respectively.

Group xGD20 Average
1 0.02
2 -0.02
3 -0.03

Looking at the top 10 forwards from last year we see a lot of expected names, as well as a few players who appear to have had extremely unlucky seasons. Sidney Crosby unsurprisingly leads the list, while Joe Thornton, Patrice Bergeron, Corey Perry and Jonathan Toews aren’t shocking to see at the top. In contrast, while Alexander Semin and Alex Burrows both had disappointing years, their past shooting ability and their shot numbers last year suggest that they’re due to bounce back next season.

Player Team Position xGF20 xGA20 xGD20
CROSBY, SIDNEY Pittsburgh F 1.12 0.73 0.39
THORNTON, JOE San Jose F 1.01 0.62 0.38
BERGERON, PATRICE Boston F 0.97 0.61 0.36
VORACEK, JAKUB Philadelphia F 1.01 0.65 0.35
NEAL, JAMES Pittsburgh F 1.08 0.73 0.35
PERRY, COREY Anaheim F 0.96 0.63 0.33
SEMIN, ALEXANDER Carolina F 1.04 0.72 0.32
TOEWS, JONATHAN Chicago F 1.03 0.71 0.32
MARCHAND, BRAD Boston F 0.98 0.67 0.30
BURROWS, ALEX Vancouver F 0.98 0.70 0.28

On the blueline, TJ Brodie and Mark Giordano lead the way, although a large part of that is driven by their amazing xGA20 numbers (which are in turn related to Calgary’s awful defence). Upcoming free-agents Matt Niskanen and Anton Stralman look like they could be good value pickups, with their xGD20 rankings outpacing their traditional stats. Interestingly, most of the players made this list by holding their opponents to fewer shots, rather than through their own offensive ability – all of the top 10 defencemen are well below average in xGA20, while only one of the top 10 defencemen in xGF20 made the list (Lubomir Visnovsky).

Player Team Position xGF20 xGA20 xGD20
BRODIE, TJ Calgary D 0.81 0.57 0.24
GIORDANO, MARK Calgary D 0.80 0.58 0.23
TIMONEN, KIMMO Philadelphia D 0.83 0.62 0.22
VLASIC, MARC-EDOUARD San Jose D 0.82 0.62 0.20
NISKANEN, MATT Pittsburgh D 0.84 0.64 0.20
STRALMAN, ANTON NY Rangers D 0.81 0.61 0.20
VISNOVSKY, LUBOMIR NY Islanders D 0.88 0.68 0.20
WISNIEWSKI, JAMES Columbus D 0.85 0.66 0.19
DONOVAN, MATT NY Islanders D 0.88 0.69 0.19
GARDINER, JAKE Toronto D 0.82 0.63 0.19

I’ve uploaded a full list of xGD20 for all players in 2013-2014 to a Google Spreadsheet here. If you’re looking for a primer on who your team should avoid come July 1st, the bottom of the list will certainly come in handy.

Obviously there are still improvements to be made – the list of top defencemen above highlights that players on extreme teams can post extreme results in the short term. If we look over a longer horizon, though, xGD20 does seem to provide a fairly consistent view of a player’s talent, and one that intuitively makes sense. All it attempts to measure are things that we know a player has a degree of control over – how many shots he takes, how many shots he sets up for his teammates, how good a shooter he is, and how well he prevents shots against his own net. Everything else is, generally speaking, too variable to use to evaluate players, and as such we need to take it out of a player’s results to get a true sense of his talent independent of the results in any given year.

Posted in Statistics

Adjusting Save Percentage for Team Effects

As anyone who’s looked at evaluating goalies before knows, it’s not a pleasant task. Even ignoring the inherent extreme variability in game-to-game or year-to-year statistics, a goaltender’s numbers can be heavily influenced by the 5 skaters in front of him. While for forwards or defencemen we can get around this issue by looking at WOWY or Relative numbers, we don’t have that same luxury with goalies. Even if we were to compare a goalie to how his team did when he had the night off, our analysis is complicated by the fact that most teams tend to use only a small number of goalies each season. While we may be able to say that a team posted a better save percentage with its starter than its backup, that doesn’t give us a good sense of whether the starter was great or the backup was terrible – we’re still stuck trying to isolate the team effect.

One method we can use to get an estimate of the team effect is their skill at shot prevention. Teams that are better defensively are likely to give up fewer shots, although somewhat paradoxically those shots are more likely to go in. But what if we looked at things from a different point of view: what if rather than focusing on a team’s ability to prevent shots in general we adjusted for who was taking the shots? After all, shots by defencemen tend to go in less than half as often as shots by forwards, so it stands to reason that if a team can prevent opposing forwards from putting the puck on net that their goalie would likely experience a higher save percentage because of that.

The first thing we should check before we get into this is whether the claim I just made (that teams can control whether shots against their net come from opposing forwards or defencemen) is actually true. The easiest way to verify whether something is more talent than luck is to check the split-half correlation between even and odd games (in simpler terms, we’re looking at whether the results we observe in the even games are a good predictor of the results in the odd games). There are two ways to look at this talent in our case, and fortunately for us they both give similar results. The first is to look at what percentage of a team’s even-strength shots against come from forwards vs. defencemen. In this case our split-half correlation is 0.74, which is high enough to say that we’re likely on to something. The other stat we can test is the total number of shots against per 20 minutes of even-strength ice time that a team gives up to forwards and to defencemen (fFA20 and dFA20; we’re using shot attempts here since I have the numbers handy, but the results should hold generally for shots on goal). Under this method, our correlations are 0.80 and 0.72 for forwards and defencemen respectively, again good enough that we can feel confident that this is at least a team-level skill*.

So now that we’ve established that teams can generally control where their opponents’ shots are coming from (positionally, at least) it naturally makes sense to adjust our goalie statistics (in this case 5v5 save percentage, which we’ll be using throughout the rest of this article) to take this fact into account. After all, if I post a 0.960 save percentage you’d likely be impressed, but if I told you that I only ever faced shots from the point your appreciation of my netminding skills would probably decline significantly. To do this we’ll start by looking at how many shots each goalie faced from forwards and defencemen, which we’ll call fSA and dSA respectively. After that we’ll calculate how many goals we’d expect our goalie to give up given their shots against distribution by position:

xGA = fSA * lgFSh% + dSA * lgDSh%

Where lgFSh% and lgDSh% are the league average forward and defencemen shooting percentage in each season, as given below:

Season lgDSh% lgFSh%
2008-2009 3.96% 9.18%
2009-2010 4.32% 9.15%
2010-2011 3.98% 9.03%
2011-2012 3.94% 9.19%
2012-2013 4.12% 9.24%
2013-2014 4.12% 9.12%
2008-2014 4.07% 9.15%

Once we’ve calculated our expected goals against, we can turn this into an expected save percentage by simply dividing by the total shots each goalie faced:

xSv% = 1 - (xGA / SA)

We can then look at how each goalie actually performed against their expected value to come up with a relative team-adjusted (and in actuality also season-adjusted) save percentage:

adjSv% Rel = Sv% - xSv%

Which we can turn into adjusted save percentage by adding on the league average save percentage in the appropriate season (as shown in the table below):

adjSv% = adjSv% Rel + lgSv%

Season lgSv%
2008-2009 92.05%
2009-2010 91.99%
2010-2011 92.21%
2011-2012 92.17%
2012-2013 92.09%
2013-2014 92.24%
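The whole adjustment chain can be sketched end-to-end in Python, using the 2013-2014 league averages from the tables above (the function name is mine):

```python
# 2013-2014 league averages, from the tables above
LG_FSH = 0.0912   # forward shooting %
LG_DSH = 0.0412   # defenceman shooting %
LG_SV = 0.9224    # league-average 5v5 save %

def adj_sv_pct(sv_pct, f_sa, d_sa, lg_fsh=LG_FSH, lg_dsh=LG_DSH, lg_sv=LG_SV):
    """Team-adjusted save percentage: compare a goalie's actual Sv% to what
    an average goalie would post against the same forward/defenceman shot
    mix, then re-centre on the league-average save percentage."""
    sa = f_sa + d_sa
    xga = f_sa * lg_fsh + d_sa * lg_dsh   # expected goals against
    xsv = 1 - xga / sa                    # expected save percentage
    return (sv_pct - xsv) + lg_sv         # adjSv% Rel + lgSv%
```

A goalie facing a league-typical shot mix comes out unchanged, while one who sees an unusually forward-heavy workload gets a bump.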

So now that we have our new stat defined, let’s take a look at the results. The table below has the standard 5v5 Sv% as well as the 5v5 adjSv% for each goalie who faced more than 500 5v5 shots over the past season. I’ve also included a column at the end which shows the delta between the traditional save percentage and our adjusted save percentage.

Goalie Sv% xGA xSv% adjSv% Rel adjSv% Delta
RASK, TUUKKA 94.2% 99.6 92.2% 2.0% 94.2% 0.0%
HARDING, JOSH 94.2% 41.1 92.2% 1.9% 94.2% 0.0%
KHUDOBIN, ANTON 93.6% 69.4 92.2% 1.4% 93.6% 0.1%
JOHNSON, CHAD 93.4% 41.5 92.2% 1.2% 93.5% 0.0%
PRICE, CAREY 93.4% 109.3 92.2% 1.2% 93.4% 0.0%
VARLAMOV, SEMYON 93.5% 118.7 92.4% 1.1% 93.4% -0.1%
KUEMPER, DARCY 93.4% 43.0 92.3% 1.1% 93.3% 0.0%
BOBROVSKY, SERGEI 93.3% 99.5 92.2% 1.1% 93.3% 0.0%
BISHOP, BEN 93.2% 101.5 92.3% 0.9% 93.2% 0.0%
SCRIVENS, BEN 93.2% 70.6 92.3% 0.8% 93.1% -0.1%
HOLTBY, BRADEN 93.0% 86.3 92.2% 0.8% 93.0% 0.0%
BERNIER, JONATHAN 93.0% 105.5 92.3% 0.7% 92.9% -0.1%
ENROTH, JHONAS 92.9% 51.4 92.2% 0.7% 92.9% 0.0%
QUICK, JONATHAN 92.8% 67.9 92.2% 0.6% 92.8% 0.1%
LUNDQVIST, HENRIK 92.9% 108.7 92.3% 0.5% 92.8% -0.1%
CRAWFORD, COREY 92.7% 94.8 92.2% 0.5% 92.7% 0.1%
BRYZGALOV, ILYA 92.8% 52.2 92.3% 0.5% 92.7% -0.1%
SMITH, MIKE 92.7% 108.4 92.2% 0.5% 92.7% 0.0%
MASON, STEVE 92.6% 108.0 92.1% 0.4% 92.7% 0.1%
ANDERSEN, FREDERIK 92.8% 48.5 92.4% 0.4% 92.6% -0.1%
LUONGO, ROBERTO 92.6% 96.8 92.2% 0.4% 92.6% 0.0%
LEHTONEN, KARI 92.8% 108.4 92.4% 0.4% 92.6% -0.2%
HILLER, JONAS 92.4% 80.0 92.1% 0.3% 92.5% 0.1%
HOWARD, JIMMY 92.5% 87.1 92.2% 0.3% 92.5% 0.0%
LACK, EDDIE 92.5% 63.9 92.3% 0.2% 92.5% 0.0%
MONTOYA, AL 92.5% 45.3 92.2% 0.2% 92.5% 0.0%
ELLIOTT, BRIAN 92.3% 42.2 92.1% 0.2% 92.5% 0.1%
SCHNEIDER, CORY 92.4% 64.4 92.2% 0.2% 92.4% 0.0%
ANDERSON, CRAIG 92.5% 96.3 92.3% 0.1% 92.4% -0.1%
HALAK, JAROSLAV 92.3% 80.7 92.2% 0.1% 92.3% 0.0%
MILLER, RYAN 92.0% 110.6 92.1% -0.1% 92.1% 0.2%
REIMER, JAMES 92.2% 65.2 92.4% -0.2% 92.0% -0.2%
EMERY, RAY 91.8% 42.5 92.1% -0.3% 92.0% 0.2%
RAMO, KARRI 91.7% 69.0 92.0% -0.3% 91.9% 0.2%
FLEURY, MARC-ANDRE 91.9% 103.9 92.3% -0.5% 91.8% -0.1%
HUTTON, CARTER 91.7% 63.1 92.2% -0.5% 91.8% 0.0%
NIEMI, ANTTI 91.8% 102.3 92.4% -0.6% 91.7% -0.1%
WARD, CAM 91.6% 49.3 92.2% -0.6% 91.7% 0.0%
NABOKOV, EVGENI 91.5% 67.0 92.2% -0.7% 91.5% 0.0%
LEHNER, ROBIN 91.5% 64.7 92.3% -0.8% 91.5% 0.0%
MAZANEC, MAREK 91.0% 41.5 92.1% -1.1% 91.2% 0.2%
MCELHINNEY, CURTIS 91.4% 39.3 92.5% -1.1% 91.2% -0.2%
GUSTAVSSON, JONAS 91.3% 41.8 92.4% -1.1% 91.1% -0.1%
THOMAS, TIM 91.0% 82.4 92.3% -1.3% 91.0% -0.1%
BERRA, RETO 90.5% 54.3 91.9% -1.4% 90.8% 0.3%
PAVELEC, ONDREJ 90.9% 96.7 92.3% -1.4% 90.8% -0.1%
BRODEUR, MARTIN 90.4% 57.7 92.2% -1.8% 90.5% 0.0%
DUBNYK, DEVAN 90.2% 56.5 92.2% -2.0% 90.2% 0.1%
POULIN, KEVIN 90.2% 51.5 92.2% -2.0% 90.2% 0.0%
RINNE, PEKKA 89.9% 41.2 92.1% -2.2% 90.0% 0.1%

The way to interpret the delta is this: a positive delta means that a goalie’s adjusted save percentage is greater than his observed save percentage, or in other words that his defence cost him points by allowing more shots from forwards than we’d expect. As you can see, the deltas aren’t huge, but there are goalies for whom it does seem to make a difference in our evaluation. Reto Berra, for example, seems to have played behind the worst defence in the league, costing him roughly 3 points of save percentage over the course of the season (sorry, Flames fans). Over the 672 shots he faced, that works out to roughly 2 goals, or a third of a win lost to team defence alone.

On the other hand, both Jonathan Bernier and James Reimer’s save percentages seem to have been boosted by the Leafs’ defensive efforts (yes, you read that right). While the Leafs did give up a ton of shots, a greater percentage of those came off the sticks of opposing defencemen than we’d expect given the league-wide numbers.

While most of the deltas are relatively small, the raw differences don’t quite tell the whole story, as they don’t take into account the number of shots a goalie faces. Rather than looking at raw Sv% and adjSv%, we can instead look at Goals Saved Above Average (GSAA), which attempts to measure how many extra goals a given netminder prevented compared to a league-average goalie:

GSAA = (Sv% - lgSv%) * SA
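In code this is a one-liner; a small sketch (the sample numbers are illustrative, not any particular goalie’s):

```python
def gsaa(sv_pct, lg_sv_pct, shots_against):
    # Goals prevented relative to a league-average goalie facing the same
    # shots; passing adjSv% instead of Sv% gives the adjusted version (adjGSAA).
    return (sv_pct - lg_sv_pct) * shots_against

gsaa(0.930, 0.9224, 1000)  # about 7.6 goals saved above average
```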

If we look at the delta in GSAA and adjGSAA, we see that the differences become more significant.

Goalie GSAA Adj GSAA GSAA Delta
MILLER, RYAN -3.74 -1.36 2.37
BERRA, RETO -11.88 -9.68 2.21
RAMO, KARRI -4.68 -2.95 1.73
MASON, STEVE 4.56 5.98 1.42
HILLER, JONAS 1.87 3.03 1.16
EMERY, RAY -2.51 -1.52 0.99
MAZANEC, MAREK -6.36 -5.52 0.84
ELLIOTT, BRIAN 0.41 1.19 0.78
KHUDOBIN, ANTON 11.71 12.44 0.72
CRAWFORD, COREY 5.15 5.84 0.69
RINNE, PEKKA -12.44 -11.76 0.68
HOLTBY, BRADEN 7.85 8.34 0.49
QUICK, JONATHAN 4.47 4.93 0.46
DUBNYK, DEVAN -14.93 -14.48 0.45
NABOKOV, EVGENI -6.38 -6.03 0.36
BRODEUR, MARTIN -13.61 -13.28 0.33
SCHNEIDER, CORY 1.06 1.37 0.31
ENROTH, JHONAS 4.11 4.39 0.28
WARD, CAM -3.99 -3.72 0.26
JOHNSON, CHAD 6.26 6.51 0.25
SMITH, MIKE 6.11 6.36 0.25
HUTTON, CARTER -4.10 -3.90 0.20
PRICE, CAREY 16.19 16.33 0.14
POULIN, KEVIN -13.66 -13.54 0.12
HALAK, JAROSLAV 0.58 0.69 0.11
MONTOYA, AL 1.21 1.31 0.09
BOBROVSKY, SERGEI 13.42 13.49 0.07
LUONGO, ROBERTO 4.79 4.84 0.06
HOWARD, JIMMY 3.09 3.15 0.05
RASK, TUUKKA 25.58 25.63 0.05
HARDING, JOSH 10.10 10.13 0.02
LACK, EDDIE 1.98 1.87 -0.11
KUEMPER, DARCY 6.27 6.03 -0.25
BRYZGALOV, ILYA 3.58 3.22 -0.36
LEHNER, ROBIN -5.93 -6.30 -0.37
BISHOP, BEN 13.06 12.49 -0.57
THOMAS, TIM -13.02 -13.64 -0.62
ANDERSEN, FREDERIK 3.25 2.55 -0.70
GUSTAVSSON, JONAS -5.42 -6.24 -0.82
SCRIVENS, BEN 8.50 7.62 -0.89
FLEURY, MARC-ANDRE -4.99 -6.13 -1.14
PAVELEC, ONDREJ -17.13 -18.28 -1.15
MCELHINNEY, CURTIS -4.52 -5.70 -1.19
BERNIER, JONATHAN 10.71 9.51 -1.20
ANDERSON, CRAIG 2.64 1.35 -1.29
LUNDQVIST, HENRIK 9.05 7.73 -1.31
REIMER, JAMES -0.30 -1.79 -1.48
NIEMI, ANTTI -5.85 -7.69 -1.85
VARLAMOV, SEMYON 19.91 17.70 -2.21
LEHTONEN, KARI 7.82 5.40 -2.43

Looking at the delta column, the differences become a little clearer: both Ryan Miller and Reto Berra were cost about 35-40% of a win by their defences, while the defensive prowess of Colorado and Dallas likely added more than 1/3 of a win to each of Semyon Varlamov and Kari Lehtonen’s totals.

While these are still relatively small differences for many goalies, over the course of a career they can start to add up. Since 2008, the Rangers’ defensive play has saved Henrik Lundqvist roughly 8 goals, or about 1.5 per year. If we look solely at John Tortorella’s years behind the bench, the average delta works out to almost 2 goals per year.

Expanding our sample back to 2008 also reveals larger discrepancies between GSAA and adjGSAA: both Jaro Halak (2010/11) and Craig Anderson (2009/10) were cost more than 3 goals by their teams’ defensive play. Given that most goalies’ GSAA falls between -10 and +10, differences of even 1 goal can represent a fairly large difference in a goalie’s valuation. And it’s these differences in valuation that can shed light on some of the more difficult-to-measure aspects of the game, allowing us to get a better sense of what’s within a goaltender’s control, and what we can lay at the feet of his teammates.


*Based on some initial analysis I’ve done, this seems to be primarily a team/system-level skill. If you look at it at the individual level and take into account how a player’s teammates did, the dFA20 and fFA20 metrics show little year-over-year repeatability. But all that’s a topic for another post.

Posted in Goaltending, Statistics

Round 2 Preview and Round 1 Review

The second round of the Stanley Cup playoffs begins tonight (in 37 minutes at the moment I started writing this, to be precise), and after an exciting first round it’s nice not to have to wait even one night to get back into the action. In the first round our model had a good showing, correctly predicting the winners of 5 of 8 series while missing on the Avs (incredibly high shooting percentage), the Lightning (replacement-level backup playing for a Vezina candidate) and the Blue Jackets (not really clear that either team was trying to win the series).

While some might not view 5 of 8 as a great percentage, from the point of view of the model it was actually spot on. Since our model picks not only the winners but also the probability of each team winning, we have to look at its predictions in the context of each team’s predicted win probability. After all, the model was much more confident in its pick of Anaheim (71.8%) than of Columbus (53.6%). Because our predicted win probabilities aren’t 100%, we shouldn’t get all of the series right, and we should actually be worried if we do: if we’re able to consistently pick all of the winners even though we don’t view them as 100% favourites, it means that our model is underestimating their probabilities and not giving us a good view of each team’s odds.

To illustrate this further, imagine we predicted 5 series: in 3 of them we predicted that the favourite would win 66% of the time (2 out of 3 times), while in the other two we viewed each team as having equal odds (50%). In this world we'd expect to get roughly 3 series right: 2 of the 3 66% series, and 1 of the two coin flips. We can estimate how many series we should get right by simply adding up the probabilities of the favourites (0.66 + 0.66 + 0.66 + 0.5 + 0.5 = 2.98, or roughly 3).

Looking back at our model, we can estimate how many series we should have predicted correctly by applying the same analysis. Using our round 1 odds for all of our favourites, our expected number of correct predictions is: 0.685 + 0.556 + 0.536 + 0.594 + 0.634 + 0.635 + 0.718 + 0.549 = 4.91. So when we got 5 of 8 right, we were actually right around where we'd expect to be (admittedly, given the small sample size we won't always be that close, but you get the point).
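For anyone who wants to replicate the arithmetic, the expected number of correct picks is just the sum of the favourites' win probabilities (the eight values above, taken from the round 1 odds):

```python
# Each entry is the model's win probability for the favourite in one round 1 series.
round1_favourites = [0.685, 0.556, 0.536, 0.594, 0.634, 0.635, 0.718, 0.549]

# The expected number of correct picks is the sum of these probabilities,
# since each series contributes its favourite's win probability in expectation.
expected_correct = sum(round1_favourites)
print(round(expected_correct, 2))  # 4.91
```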

All of which is to say that the model did a pretty good job in round 1, perhaps better than it would appear at first glance. With that being said, let's take a look at what we expect to happen in Round 2.

Team | 3rd Round | Stanley Cup | Win Cup
Boston Bruins | 80.0% | 64.6% | 37.9%
Montréal Canadiens | 20.0% | 10.2% | 2.7%
Pittsburgh Penguins | 54.7% | 14.5% | 4.1%
New York Rangers | 45.3% | 10.7% | 2.5%
Minnesota Wild | 17.8% | 4.5% | 1.2%
Chicago Blackhawks | 82.2% | 48.7% | 28.2%
Anaheim Ducks | 62.9% | 32.5% | 17.3%
Los Angeles Kings | 37.1% | 14.3% | 6.0%

Looking at the numbers, if you're a fan of the exciting, close series we saw in round 1 you're likely in for a bit of a disappointment. Both Boston and Chicago are viewed as overwhelming favourites by the model, with each team coming in at 80% or better to move on to the conference finals. Unsurprisingly, the Hawks and Bruins are also heavy favourites to take home the Cup at this point, with the model seeing only a 34% chance that Lord Stanley's mug won't end up in Beantown or the Windy City.
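That 34% figure falls straight out of the Win Cup column in the table above:

```python
# Cup-winning probabilities from the round 2 table above
p_boston = 0.379
p_chicago = 0.282

# Chance the Cup ends up somewhere other than Boston or Chicago
p_neither = 1 - (p_boston + p_chicago)
print(round(p_neither, 3))  # 0.339, i.e. roughly 34%
```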

The only other team with a decent shot at the Cup, at least according to our predictions, is the Anaheim Ducks, who enter their series against the comeback Kings as 63% favourites. Los Angeles obviously cares not for the rules of probability, though: they were down to 8% odds to move on, and less than 1% to win the Cup, after losing 3 straight to start their series with the Sharks. While the Ducks struggled at times versus the Stars, the model still favours their high shooting percentage game over the pure volume effort that the Kings tend to put forward.

The closest series, and the one this author is most excited for, is the Penguins-Rangers matchup set to start tomorrow in Pittsburgh. While the Pens definitely struggled at times in their series versus Columbus, and while Marc-Andre Fleury has done little to combat his reputation for choking in the clutch (not that clutch play is really a thing, but that's another story), the team led by the Cole Harbour Kid enters the series as slight favourites over the Blueshirts. Similar to the Ducks-Kings series, this looks to be another quality versus quantity matchup, with the Pens relying on Sid the Kid and Evgeni Malkin to drag their woeful bottom 6 past New York's Fenwick machine (oh, and perhaps the greatest goalie of his generation, Henrik Lundqvist).

Coming back to our little probability exercise that we started with, let’s take a look at how many series we expect the model to get right this round. Taking the expected win probabilities of the Bruins, Penguins, Ducks and Blackhawks we should get approximately 2.8 series correct, or roughly all but 1. So there you have it: you’re now prepped for Round 2, and there’s actually still time left to grab a beer before the anthem!

Tagged with: ,
Posted in Predictions

2014 Playoff Predictions

Tomorrow night marks the start of the NHL playoffs and so to continue an annual tradition (once is a tradition, right?) I’ve put together the World Famous PuckPlusPlus Playoff Prediction Preview (hurray, alliteration). In the lockout shortened season, our model managed to do fairly well, correctly predicting all of the conference finalists as well as the eventual Cup winner and runner-up so we’ve set a pretty high bar for ourselves for this year.

Our model last year was relatively straightforward, looking at each team’s Fenwick For %, Shooting Percentage, and Save Percentage to determine the probability of winning an individual game. For this season our model still uses the same variables, but I’ve slightly tweaked how they’re fed into the model: rather than looking at each team’s save and shooting percentage individually, the model now looks at the home team’s “Shooting Advantage” and “Save Advantage”, which are simply the difference between the home team and the visiting team’s shooting and save percentages respectively.  The rationale for making this change was to ensure that the shooting percentages for each team were given the same weight in determining the outcome of the game (and likewise for each team’s save percentage). I’ve also set up the model to use the save percentage of the expected starter for each team, to remove the effect of the back-up goalies on our predictions.
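As a rough sketch of how those three inputs might be combined into a game-level win probability (the actual model and its fitted coefficients aren't published here, so the logistic form and the coefficient values below are illustrative assumptions, not the blog's real numbers):

```python
import math

def home_win_probability(home, away, b0=0.05, b_fen=4.0, b_sh=8.0, b_sv=20.0):
    """Illustrative logistic model combining possession and the 'advantage'
    features described above. Coefficients are made up for the example."""
    fenwick_edge = home["fenwick_for_pct"] - 0.5        # possession above break-even
    shooting_adv = home["sh_pct"] - away["sh_pct"]      # "Shooting Advantage"
    save_adv = home["sv_pct"] - away["sv_pct"]          # "Save Advantage" (expected starters)
    z = b0 + b_fen * fenwick_edge + b_sh * shooting_adv + b_sv * save_adv
    return 1.0 / (1.0 + math.exp(-z))                   # logistic link

# Hypothetical team profiles, just to show the shape of the inputs
home = {"fenwick_for_pct": 0.54, "sh_pct": 0.085, "sv_pct": 0.925}
away = {"fenwick_for_pct": 0.50, "sh_pct": 0.078, "sv_pct": 0.918}
print(home_win_probability(home, away))  # a bit above 0.5 for the stronger home side
```

Using difference features like these guarantees that both teams' shooting (and save) percentages get exactly the same weight, which is the rationale the post gives for the change; series odds would then come from simulating the best-of-seven with per-game probabilities like this one.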

With all that explanation out of the way, let's take a look at what the model thinks is going to happen over the next few weeks. The table below shows each team's odds of advancing to a given round of the playoffs, as well as their odds of being the eventual Cup champions.

Team | 2nd Round | 3rd Round | Stanley Cup | Win Cup
Boston Bruins | 68.5% | 53.7% | 43.0% | 27.1%
Chicago Blackhawks | 63.5% | 47.4% | 29.8% | 18.8%
Anaheim Ducks | 71.8% | 47.2% | 26.9% | 15.7%
St. Louis Blues | 36.5% | 22.2% | 10.7% | 5.3%
Detroit Red Wings | 31.5% | 19.0% | 11.9% | 5.1%
Los Angeles Kings | 54.9% | 24.2% | 10.4% | 4.9%
Colorado Avalanche | 63.4% | 22.3% | 9.3% | 3.9%
Columbus Blue Jackets | 53.6% | 31.9% | 11.2% | 3.9%
Tampa Bay Lightning | 55.6% | 16.0% | 8.8% | 3.0%
San Jose Sharks | 45.1% | 17.0% | 6.6% | 2.7%
Pittsburgh Penguins | 46.4% | 26.8% | 8.5% | 2.7%
New York Rangers | 59.4% | 26.3% | 7.4% | 2.1%
Montréal Canadiens | 44.4% | 11.3% | 5.8% | 1.8%
Dallas Stars | 28.2% | 11.7% | 3.9% | 1.5%
Minnesota Wild | 36.6% | 8.1% | 2.4% | 0.7%
Philadelphia Flyers | 40.6% | 14.9% | 3.3% | 0.7%

In the Eastern Conference, and looking at the battle for the Cup in general, the Bruins appear to be the heavy favourites, at nearly even money to make the Cup Final and roughly 1 in 4 odds to win the whole thing. Boston's high odds are driven by 2 things: 1) they're a really good hockey team, backstopped by arguably the best goaltender in the league; and 2) the Eastern Conference is much weaker than the West this year. The East is so weak, in fact, that the model believes the Bruins' toughest battle before the Final will be their first round opponents, the Red Wings, who still only have a 31.5% chance of taking down the Presidents' Trophy winners. After Detroit, there's not a single team in the East that the model views as having better than a 25% chance of beating the Bruins.

On the Western Conference side, things are a bit tighter: Chicago and Anaheim are a close 2nd and 3rd in overall odds, although neither team has even a 50% chance of getting to the conference finals. The Ducks in particular are a tough team to figure out given their goaltending situation. The probabilities shown above assume that Frederik Andersen starts for the Ducks, but if we re-run the numbers with Jonas Hiller in net the Ducks' overall Cup odds decrease to 8.1%, while their odds of even getting out of the first round drop 8.5 points to 63.3%.

Speaking of teams whose odds change dramatically based on who's in net, the Tampa Bay Lightning definitely need Ben Bishop to get better fast if they want to have any hope of taking home the Cup at the end of the year. With Anders Lindback in net, the model views them as only a 55% favourite to get past the Habs in the first round, while with Bishop manning the crease their odds increase to 62.6% (admittedly, Bishop won't make much of a difference to their Cup odds, increasing them by only ~2%, as they'd still have to get by the Bruins).

The most interesting teams, in my opinion, are the Columbus Blue Jackets and Detroit Red Wings. While the Wings are heavy underdogs, if they were able to get past the Bruins in round 1 their odds would skyrocket, and they would likely be the favourites to represent the East in the Cup Final. Although the model may view a defeat of the Bruins as unlikely, it's important to note that the Wings seem to be getting healthy at the right time, and may be underestimated by our predictions.

Columbus, on the other hand, go into the first round as surprising favourites against the Pittsburgh Penguins (Marc-Andre Fleury, this one's on you). The Blue Jackets are struggling with injury issues of their own, however, and unless they can get Nathan Horton, R.J. Umberger and Nick Foligno back they could be in trouble against a Penguins team that is getting some big names back for the start of round 1. Should 'Lumbus get through the first round though, they'd likely be strong favourites to take on the Bruins in the conference finals, as they're predicted to come out ahead in matchups against either the New York Rangers or the Philadelphia Flyers.

The least likely team to move past the first round is one that many are predicting to pull off an upset: the Dallas Stars. Dallas is ultimately a heavy underdog because they're only slightly ahead of the Ducks on the puck possession side, while trailing them heavily on both the shooting percentage and goaltending fronts. Of course, as we mentioned above, the Ducks' advantage in the model is primarily due to Frederik Andersen; facing off against Jonas Hiller instead, the Ducks are viewed as having roughly the same likelihood of advancing as the Blues and Wild (one of those should offer comfort to Stars fans, while the other, not so much).

Tagged with: ,
Posted in Predictions