For those of you who are longtime members of Bleeding Green Nation, you know what this is about. Every season I analyze statistics from every NFL team with the singular goal of separating the true contenders from those that "flame out" as soon as possible. For those of you who may be new to the site (welcome!), this will be the third season where I publish my results publicly on this blog. What makes this year different, however, is that I took the time over the summer to apply some regression analysis to my system in an attempt to make it more accurate (see the posts here). As a result, I have made some major overhauls to the algebraic formula that I use to rank teams, which I describe below.
OFFENSE
For the formula, I arbitrarily divide statistics into two categories: passing efficiency and line play. In that regard, I'm ignoring the play of linebackers and running backs, but the statistics themselves correspond to a team effort. For example, if the "sacks per game" metric is low for a team, they could either have a great offensive line, a mobile quarterback that can bail out a poor one, or a strong running game where they do not need to pass as much as other teams. However, to make things easier on myself, I have filed the sacks per game metric under "offensive line" for convenience. So don't go asking about running backs or linebackers in the comments, because they are technically covered in the formula. Now, onto the indices.
Passing Index = (Passer Rating/15.83) + (Yards per Attempt/Incompletions per Game)
The passing index incorporates the top three offensive metrics from my brief study on the correlations between statistics and winning. I "normalized" the passer rating by dividing by 15.83 for two reasons: 1) most of the other metrics are much smaller than the raw quarterback rating, meaning that the impact they have on the formula would be essentially negligible, and 2) dividing by 15.83 ensures that the rating for each quarterback in my formula falls somewhere between 0 and 10, since passer rating runs from 0 up to a maximum of 158.3. The other part of the index is just math: the number goes up when yards per attempt increases or incompletions per game decreases (and most of all when both happen), since dividing by a smaller number gives you a larger one.
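For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Passing Index as written above. The function name, the constant name, and the two stat lines are made up for illustration; they are not real team numbers.

```python
# Dividing by 15.83 maps the 0-158.3 passer rating scale onto 0-10.
PASSER_RATING_SCALE = 15.83


def passing_index(passer_rating, yards_per_attempt, incompletions_per_game):
    """Passing Index = (Passer Rating/15.83) + (Yards per Attempt/Incompletions per Game)."""
    normalized_rating = passer_rating / PASSER_RATING_SCALE
    efficiency = yards_per_attempt / incompletions_per_game
    return normalized_rating + efficiency


# Hypothetical stat lines, for illustration only.
average_team = passing_index(85.0, 7.0, 13.0)
efficient_team = passing_index(100.0, 8.0, 11.0)

print(f"average team:   {average_team:.2f}")
print(f"efficient team: {efficient_team:.2f}")
```

With inputs like these, the normalized passer rating contributes several points while the yards-per-attempt ratio contributes less than one, so the rating term does most of the work in this index.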
Offensive Line Index = Rushing First Downs per Game/Sacks per Game
This one might have some controversy associated with it. As several of you pointed out during my study, a lot of the rushing statistics that were associated with winning were a case of "correlation vs. causation." That is, teams were running because they had the lead and were trying to kill the clock instead of using the run game to actually score. Maybe I'm a traditionalist, but I find it hard to believe that this is the ONLY purpose for the run game in contemporary football. So I tried to choose a statistic that I felt could not be entirely attributed to the "run the clock" reasoning, and came up with rushing first downs per game. This is the metric that actually correlated the most with winning on offense; my reasoning behind using it is that even though teams may run the clock by running the ball, they are not necessarily getting their first downs by running, especially if they are in third-and-long situations (they may run the ball and then punt, or pass to attempt the first down). Alternatively, if a team is able to consistently get itself into third-and-short (or even second-and-short) situations early in the game by passing, it may run to pick up the first. The other metric, sacks per game, is fairly self-explanatory, and the math is the same as what I explained above.
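Because this index is a pure ratio, it is worth seeing how sensitive it is to the denominator. The sketch below uses a hypothetical function name and made-up numbers. Note that the formula as written is undefined for a team that has allowed zero sacks; the sketch does not try to guess how that case should be handled and simply lets Python raise an error.

```python
def offensive_line_index(rushing_first_downs_per_game, sacks_allowed_per_game):
    """Offensive Line Index = Rushing First Downs per Game / Sacks per Game."""
    # Raises ZeroDivisionError if a team has allowed no sacks at all.
    return rushing_first_downs_per_game / sacks_allowed_per_game


# Hypothetical teams with the same rushing production but different pass protection.
leaky_line = offensive_line_index(6.0, 3.0)
solid_line = offensive_line_index(6.0, 1.5)

print(f"leaky line: {leaky_line:.2f}")
print(f"solid line: {solid_line:.2f}")
```

Halving the sacks allowed doubles the index, so small differences in a low sack total move this number more than similar differences in rushing first downs.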
DEFENSE
All statistics for the defense are numbers put up by each team's opponent, with no regard to the players involved. For instance, if a wide receiver passes the ball on a trick play and tosses a pick, it will still be counted against the passer rating (the same is actually true for the statistics used for offense as well).
Coverage Index = (Incompletions per Game/Yards per Attempt) - (Passer Rating/15.83)
This is essentially a "mirrored" version of the passing index. As incompletions go up and yards per attempt goes down, the number increases. Since the "normalized" passer rating is subtracted, a smaller rating allowed produces a larger index value. Where things get a little questionable is that the values used here are not all among the top three metrics correlated with winning (like they were on offense), but I thought it was important to keep the formulas consistent on both sides of the ball. This is something I will be monitoring as I analyze the results after the season is over.
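One thing the mirrored formula implies is that the Coverage Index tends to come out negative, because the normalized passer rating being subtracted is usually larger than the incompletions-to-yards ratio. Here is a small Python sketch; the function name and the opponent stat lines are made up for illustration.

```python
PASSER_RATING_SCALE = 15.83


def coverage_index(opp_passer_rating, opp_yards_per_attempt, opp_incompletions_per_game):
    """Coverage Index = (Incompletions per Game/Yards per Attempt) - (Passer Rating/15.83)."""
    disruption = opp_incompletions_per_game / opp_yards_per_attempt
    normalized_rating = opp_passer_rating / PASSER_RATING_SCALE
    return disruption - normalized_rating


# Hypothetical opponent numbers, for illustration only.
stingy_defense = coverage_index(75.0, 6.0, 15.0)
leaky_defense = coverage_index(95.0, 7.8, 11.0)

print(f"stingy defense: {stingy_defense:.2f}")
print(f"leaky defense:  {leaky_defense:.2f}")
```

Both values are below zero, but the better defense is less negative, so the ordering still works when the indices are added together.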
Defensive Line Index = Sacks per Game/Rushing First Downs per Game
Again, the same logic was applied here to develop this formula, and it runs into the same issue. In this case, the correlation for rushing first downs per game was much lower. It wasn't low enough to be inconsequential (like my old metric of rushing yards per attempt), but it did not have the same 95% confidence level that I used as my standard in my study. I will be taking a look at this as well, but as far as the math goes, the index will increase as sacks increase and rushing first downs decrease.
OVERALL
These are the scoring and turnover related indices that cannot really be related to any specific unit, so I tack them on at the end.
Overall = Points Scored per Game - Points Allowed per Game + Turnover Margin per Game
Before, I used a modified score differential that rewarded stingy defenses based off of the old conventional wisdom that "defense wins championships." After looking at the numbers it turned out that the direct score differential was more related to winning, so I'm going to use that this year. This will form the "foundation" of the ranking formula as the number should be larger than the other indices I described above.
To get a final number (which I call the "Rank Index"), I simply add everything together:
Rank Index = Passing Index + OL Index + Coverage Index + DL Index + Overall
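Putting the five pieces together, a minimal Python sketch of the full calculation might look like the following. The `TeamStats` class, its field names, and both teams are assumptions made up for this example; only the formulas themselves come from the post.

```python
from dataclasses import dataclass

PASSER_RATING_SCALE = 15.83


@dataclass
class TeamStats:
    name: str
    # Offense
    passer_rating: float
    yards_per_attempt: float
    incompletions_per_game: float
    rushing_first_downs_per_game: float
    sacks_allowed_per_game: float
    # Defense (numbers put up by opponents)
    opp_passer_rating: float
    opp_yards_per_attempt: float
    opp_incompletions_per_game: float
    opp_rushing_first_downs_per_game: float
    sacks_per_game: float
    # Overall
    points_scored_per_game: float
    points_allowed_per_game: float
    turnover_margin_per_game: float


def rank_index(t):
    """Rank Index = Passing Index + OL Index + Coverage Index + DL Index + Overall."""
    passing = (t.passer_rating / PASSER_RATING_SCALE
               + t.yards_per_attempt / t.incompletions_per_game)
    o_line = t.rushing_first_downs_per_game / t.sacks_allowed_per_game
    coverage = (t.opp_incompletions_per_game / t.opp_yards_per_attempt
                - t.opp_passer_rating / PASSER_RATING_SCALE)
    d_line = t.sacks_per_game / t.opp_rushing_first_downs_per_game
    overall = (t.points_scored_per_game - t.points_allowed_per_game
               + t.turnover_margin_per_game)
    return passing + o_line + coverage + d_line + overall


# Two hypothetical teams; none of these numbers are real.
teams = [
    TeamStats("Team B", passer_rating=80.0, yards_per_attempt=6.5,
              incompletions_per_game=14.0, rushing_first_downs_per_game=5.0,
              sacks_allowed_per_game=3.0, opp_passer_rating=95.0,
              opp_yards_per_attempt=7.6, opp_incompletions_per_game=12.0,
              opp_rushing_first_downs_per_game=6.5, sacks_per_game=1.8,
              points_scored_per_game=19.0, points_allowed_per_game=25.0,
              turnover_margin_per_game=-0.5),
    TeamStats("Team A", passer_rating=100.0, yards_per_attempt=8.0,
              incompletions_per_game=11.0, rushing_first_downs_per_game=7.0,
              sacks_allowed_per_game=1.5, opp_passer_rating=78.0,
              opp_yards_per_attempt=6.2, opp_incompletions_per_game=15.0,
              opp_rushing_first_downs_per_game=4.5, sacks_per_game=3.0,
              points_scored_per_game=28.0, points_allowed_per_game=18.0,
              turnover_margin_per_game=0.8),
]

# Highest Rank Index first.
ranked = sorted(teams, key=rank_index, reverse=True)
for position, team in enumerate(ranked, start=1):
    print(f"{position}. {team.name}: {rank_index(team):.2f}")
```

A team that is outscored on average can end up with a negative Rank Index, since the score differential in the Overall term can outweigh the other four indices combined.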
And that's it! I'm looking forward to applying some more advanced statistical analysis to the formula this year during my never-ending quest to separate "pretenders" from "contenders." The rankings will debut in Week 4, once teams have played enough games to give a good sample size to work with, so stay tuned!