The Eagles’ Super Bowl victory over the Patriots was obviously the best game in franchise, or league, history. It captured so many elements that make games great: back-and-forth action, fourth-quarter comebacks, a last-second Hail Mary, and a cherry on top: the toppling of an evil empire. Postseason aside, what’s the best Eagles regular season victory? If you were to ask ten beer drinkin’, bean bag tossin’, horse-lovin’ tailgaters at the next home game, you might get twenty different answers. But there is a way to rank the best games objectively.
How? We need a lot of data.
Before we get there, it’s worth noting that finding the best games in the NFL isn’t an entirely new idea. In 2009, Brian Burke of Advanced Football Analytics (now of ESPN) used win probability to rank the league’s best games from 2000 to 2008. To do it, he derived two measures: Excitement Index (EI) and Comeback Factor (CF).
Have you ever seen a win probability graph (see below)? Imagine that the line is a string. If you pull both ends of the string taut and measure how long it is, you get a pretty good idea of how exciting the game was. The longer the string, the more exciting the game. Each play of the game moves the win probability line up or down. In a close game, each play moves the line more than it would in a blowout; if it’s a blowout, the line won’t move much. The number of plays in a game also affects the length of the line. Back-and-forth action is great if the game has 100 plays. It gets even better if the game has 150.
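One way to read the string analogy is as a simple sum: add up the absolute change in win probability from each play to the next. Here is a minimal Python sketch of that idea. The function name and the sample numbers are made up for illustration; they are not taken from Burke’s work or from any real game.

```python
def excitement_index(win_probs):
    """Sum of absolute play-to-play changes in win probability."""
    return sum(abs(after - before)
               for before, after in zip(win_probs, win_probs[1:]))


# Hypothetical win probabilities for one team, one value per play.
close_game = [0.50, 0.62, 0.45, 0.58, 0.40, 0.71, 0.35, 1.00]
blowout = [0.50, 0.70, 0.85, 0.93, 0.97, 0.99, 1.00, 1.00]

# The close game swings more on every play, so its "string" is longer.
print(excitement_index(close_game) > excitement_index(blowout))
```

Both sample games have the same number of plays and end in a win, but the close game’s total movement is almost four times that of the blowout.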
The second factor Burke used was what he called the Comeback Factor (CF). The thinking here is that the length of the line doesn’t tell the whole story. What if the game was an upset or a big comeback? To account for this, Burke used the inverse (1 divided by) of the lowest win probability the eventual winner had at any point in the game. In other words, a game won by a team whose win probability never dropped below 90% is generally less exciting than a game won by a team whose win probability fell as low as, say, 25%.
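The Comeback Factor described above comes down to one division. A short Python sketch, using the 90% and 25% figures from the example; the function name and the two sample series are hypothetical:

```python
def comeback_factor(winner_win_probs):
    """Inverse of the eventual winner's lowest win probability."""
    lowest = min(winner_win_probs)
    if lowest <= 0:
        raise ValueError("win probability must be greater than 0")
    return 1.0 / lowest


# Hypothetical win probability series for the eventual winner.
never_in_doubt = [0.90, 0.93, 0.97, 1.00]  # lowest point: 90%
big_comeback = [0.50, 0.25, 0.60, 1.00]    # lowest point: 25%

print(comeback_factor(never_in_doubt))
print(comeback_factor(big_comeback))
```

The deeper the hole the winner climbed out of, the bigger the number: the 25% game scores 4.0, while the 90% game scores only about 1.1.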
Burke’s work is tremendous, but I don’t think it quite captures everything we need to objectively find the best games. I think there needs to be a third element.
A game may have a high EI and/or CF, but what if that game was between the Cleveland Browns and New York Jets? Who cares, right? A game between the Eagles and the Packers would surely be more exciting. So we need a “good team” factor that gives weight to games between better teams. We could use win percentage, but it’s really too volatile and unreliable, especially for games early in a season. There might be another solution though.
To figure out all of this, we need data for every play of every game. Maksim Horowitz, a Carnegie Mellon University grad, created a stat package (nflscrapR) that compiles play-by-play data for every NFL game since 2009. Ron Yurko, a PhD student at Carnegie Mellon, has been developing and maintaining it (an earlier version of this post credited Ron with creating it). It has everything we need to calculate EI and CF. It doesn’t, however, really tell us which teams were “good.” For that, we can use FiveThirtyEight’s ELO ratings. Thankfully, they have all of the data for the NFL since its inception. If you’re unfamiliar with ELO, you can read more here. To simplify, imagine every team in the NFL was “born” with the same rating (in this case, it’s 1500). After each game, a team’s rating rises or falls according to the margin of victory (or defeat) and the ELO rating of the opposing team. ELO ratings are carried over from one season to the next (and pulled slightly back toward the mean), so they’re a really good proxy for defining “good teams” relative to “bad teams” at any point in time.
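To make the ELO description a little more concrete, here is a rough Python sketch of the general mechanics. It is not FiveThirtyEight’s actual formula. The expected-score line is the standard ELO equation; the K value of 20, the log-based margin multiplier, and the one-third pull toward the mean between seasons are all assumptions picked for illustration, and ties are ignored.

```python
import math

MEAN_RATING = 1500.0  # every team starts here
K = 20.0              # assumed size of a typical rating swing


def expected_score(rating, opponent_rating):
    """Standard ELO win expectation for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(winner, loser, margin):
    """Return the new (winner, loser) ratings after one game."""
    surprise = 1.0 - expected_score(winner, loser)
    multiplier = math.log(margin + 1.0)  # illustrative margin-of-victory bonus
    shift = K * multiplier * surprise
    return winner + shift, loser - shift


def carry_over(rating, fraction=1.0 / 3.0):
    """Pull a rating part of the way back toward the mean between seasons."""
    return rating + fraction * (MEAN_RATING - rating)


# An underdog winning by 7 gains more than a favorite winning by 7.
underdog_gain = update(1400.0, 1600.0, 7)[0] - 1400.0
favorite_gain = update(1600.0, 1400.0, 7)[0] - 1600.0
print(underdog_gain > favorite_gain)
```

Whatever the winner gains, the loser gives up, so the league-wide average stays put. That is what makes a rating meaningful when it is compared against 1500.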
We can blend the nflscrapR and ELO data and dump it into Tableau to see the Eagles’ (or any team’s) best regular season wins since 2009. When we do this, the Miracle at the Meadowlands II (DeSean!) floats to the top as the best non-Super Bowl Eagles win.
Each game’s ranking score is the product of the Excitement Factor (Burke’s EI), Comeback Factor, and ELO Factor. The Excitement and Comeback factors are on a scale of 0 to 1, where 1 is best. The ELO factor is on a different scale, where 1 represents a game between average teams. The higher the number above 1, the better the teams. When the number is below 1, the teams are below average.
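For the curious, here is a rough Python sketch of how the three factors could be multiplied together. The scaling choices are assumptions for illustration, not necessarily the exact method behind the rankings: raw EI and CF are each divided by the largest value in the data set to land between 0 and 1, and the ELO factor is the average of the two teams’ ratings divided by 1500. The games and numbers are made up.

```python
def scale_to_max(values):
    """Divide by the largest value so the best game gets 1 (assumed scaling)."""
    top = max(values)
    return [v / top for v in values]


def elo_factor(home_elo, away_elo, average=1500.0):
    """Average rating of the two teams relative to an average team (assumed)."""
    return ((home_elo + away_elo) / 2.0) / average


def rank_games(games):
    """Return (name, score) pairs sorted from best game to worst."""
    excitement = scale_to_max([g["ei"] for g in games])
    comeback = scale_to_max([g["cf"] for g in games])
    scored = [
        (g["name"], e * c * elo_factor(g["home_elo"], g["away_elo"]))
        for g, e, c in zip(games, excitement, comeback)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical games: A and C are equally wild, but A has better teams.
games = [
    {"name": "Game A", "ei": 6.0, "cf": 20.0, "home_elo": 1600, "away_elo": 1550},
    {"name": "Game B", "ei": 3.0, "cf": 2.0, "home_elo": 1650, "away_elo": 1600},
    {"name": "Game C", "ei": 6.0, "cf": 20.0, "home_elo": 1400, "away_elo": 1350},
]

for name, score in rank_games(games):
    print(name, round(score, 3))
```

Game B has the best teams of the three but a dull script, so it still finishes last. The ELO factor only nudges the order; excitement and comebacks do most of the work.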
Here’s the complete viz of all regular season games between 2009 and 2017 (Click here if it doesn’t load). The Miracle at the Meadowlands II ranks 10th overall. What do you think? Do the results make sense?