Crunching The Numbers: Under Further Review (Part II of IV)

The next step in my quest for truth was to analytically critique the system. This involved looking over the stats I used to build my algorithm and mathematically determining their overall value to my goal of predicting how good a team will be.

I'm going to throw this out there now. These next three posts are going to be long. Like, really long. So if you're in the "too long, didn't read" camp, here's your warning. I will also say that there is a lot of really interesting stuff in the next three posts, so I will do my best to use headings and pullquotes so you guys can skim. However, I do recommend that you take the time to read it. It's good stuff.

After making observations about the effectiveness of my system, I decided to mathematically analyze the metrics and statistics used to see what the numbers have to say. This involved some basic regression analysis, which I will give some background for. If you are already familiar with regression analysis, feel free to skip over that section.


When analyzing two sets of data (let's call them sets "X" and "Y"), a common question is whether or not the sets are related in any way. The easiest way to determine this is to conduct a regression analysis, which will produce a "correlation coefficient" (commonly known as an r-value) between X and Y. The correlation coefficient measures how tightly the data cluster around the "line of best fit" and always lies between -1 and 1. Note that it is not the slope of that line. If the r-value is 1, the relationship between the data is perfectly positive, meaning that every point falls exactly on a line where Y increases as X increases. Conversely, if the r-value is -1, the relationship is perfectly negative and every point falls exactly on a line where Y decreases as X increases. If the r-value is 0, there is no linear relationship between the two sets of data. If some of that went over your head, just remember that the farther the correlation coefficient is from zero, the stronger the relationship between the two sets of data. The sign (positive or negative) just determines the type of relationship.
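For readers who like to see the arithmetic, here is a minimal sketch of how an r-value can be computed by hand. The function name `pearson_r` and the toy numbers are made up for illustration; they are not the data or code used for this series.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up toy numbers, NOT real NFL data.
x = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(x, [10, 20, 30, 40, 50])  # Y rises 10 per unit of X
perfect_negative = pearson_r(x, [5, 4, 3, 2, 1])       # Y falls as X rises
```

In the first toy set Y climbs by 10 for every step in X, yet the r-value still comes out as 1, because the coefficient measures how well the points fit a line rather than how steep that line is.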

There is one more piece to this puzzle. With regression analysis there is always the chance that the relationship is pure coincidence. For this reason there are "critical values" of the correlation coefficient that depend on how large your data set is. These critical values correspond to the confidence (expressed as a percentage) with which you can claim that the relationship is meaningful. In engineering, we like to report data with 95% confidence, which means that with our sample size (32 teams in the league), the critical value is 0.3494. This indicates that for a statistically meaningful relationship, the r-value must be either greater than 0.3494 or less than -0.3494.
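As a rough check on that cutoff, the sketch below rebuilds a critical r-value from a t-table entry using the standard relation r = t / sqrt(t^2 + df), with df = n - 2. The t value of 2.0423 (two-tailed, 95%, 30 degrees of freedom) is taken from a standard t-table and is an assumption on my part about how the 0.3494 figure was derived; the names are illustrative.

```python
import math

N_TEAMS = 32          # sample size: one data point per team
DF = N_TEAMS - 2      # degrees of freedom for a correlation test
T_CRIT_95 = 2.0423    # assumed: two-tailed 95% t critical value for 30 df (t-table)

def critical_r(t_crit, df):
    """Convert a t critical value into the matching critical correlation."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

r_crit = critical_r(T_CRIT_95, DF)
print(round(r_crit, 4))  # 0.3494
```

A larger sample would shrink this cutoff, which is why the critical value has to be tied to the number of teams.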

Okay, I know that was a lot. Got it all? Then we can move on to the analysis.


[Note: I address the "one season is too small of a sample" argument after this section.] In each of the tables below, every index or statistic you see was compared with the final regular season win total for each team to find the r-value. First, let's look at the big picture. How well did each of my "major indices" (offensive line, defensive line, quarterback, coverage, scoring, and overall) correlate to winning? Check it out in the table below:

| Index | Correlation |
| --- | --- |
| Rank Index | 0.9430 |
| Adjusted Score Differential | 0.9423 |
| Quarterback | 0.6902 |
| Offense (Overall) | 0.6196 |
| Coverage | 0.5102 |
| Defense (Overall) | 0.4783 |
| Defensive Line | 0.3387 |
| Offensive Line | 0.3280 |

Not too shabby! All of them had a fairly strong correlation to winning, although the offensive and defensive line indices just missed the cut of 95% confidence (I'll get to that in a minute). The overall rank index had an almost perfect correlation with winning, but this was highly influenced by the adjusted score differential. In fact, the other indices when combined into the algorithm only improved the coefficient by 0.0007, which is not really a significant number.
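Here is a small sketch of that comparison, using the r-values from the index table and the 0.3494 cutoff. The variable names are my own for illustration; only the numbers come from the analysis.

```python
CRITICAL_R = 0.3494  # 95% confidence cutoff for 32 teams

# r-values copied from the index table.
index_r = {
    "Rank Index": 0.9430,
    "Adjusted Score Differential": 0.9423,
    "Quarterback": 0.6902,
    "Offense (Overall)": 0.6196,
    "Coverage": 0.5102,
    "Defense (Overall)": 0.4783,
    "Defensive Line": 0.3387,
    "Offensive Line": 0.3280,
}

# An index clears the bar when its r-value is farther from zero than the cutoff.
significant = [name for name, r in index_r.items() if abs(r) > CRITICAL_R]
missed = [name for name, r in index_r.items() if abs(r) <= CRITICAL_R]

# How much the full algorithm adds on top of adjusted score differential alone.
gain = round(index_r["Rank Index"] - index_r["Adjusted Score Differential"], 4)

print(missed)  # ['Defensive Line', 'Offensive Line']
print(gain)    # 0.0007
```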

But what about the raw statistics that were used to create these indices? How did they stack up? See for yourself:

| Statistic | Correlation |
| --- | --- |
| Score Differential/Game | 0.9448 |
| Points Forced/Game | 0.7050 |
| Turnover Margin/Game | 0.6560 |
| Yards/Attempt | 0.6099 |
| Sacks Forced/Game | 0.4758 |
| Opponent Interception Percentage | 0.4554 |
| Offensive Third Down Efficiency | 0.3555 |
| Fumbles Recovered/Game | 0.2995 |
| Yards/Rush | 0.0517 |
| Yards/Rush Allowed | -0.0087 |
| Fumbles Lost/Game | -0.2127 |
| Defensive Third Down Efficiency | -0.3402 |
| Sacks Allowed/Game | -0.3989 |
| Opponent Yards/Attempt | -0.4233 |
| Interception Percentage | -0.4821 |
| Points Allowed/Game | -0.7617 |

There's a lot to take in here. I'll run through a few observations that I made. Firstly, there is in fact a slight discrepancy between the r-values for points forced and points allowed (remember, the negative sign indicates that as fewer points are allowed, the likelihood of winning increases, which makes sense). For points forced, the correlation coefficient was around 0.71, while points allowed edged it out with -0.76. I believe this 0.05 difference is enough to say that defense has a slight edge over offense when it comes to winning. Secondly, check out just how little the yards per rush statistics correlate to winning on both sides of the ball (this is why the correlations for my offensive and defensive line indices were so low). I personally was floored at this, because it seemed so counter-intuitive. If you're able to get the most bang for your buck every time you run, shouldn't you win more? Or if you can stop your opponent from running effectively, shouldn't that give you the upper hand?

Among the other statistics, there are no real surprises that I see. Statistics like sacks, turnover margin, and yards per attempt had very high correlations with winning. If there's anything to be said, it's probably that teams should place a higher premium on protecting their quarterback (if they haven't already) considering just how large the coefficient is.
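Because the sign only tells you the direction of the relationship, it can help to rank the raw statistics by how far each r-value sits from zero. The sketch below does that with the numbers from the statistics table; the variable names are illustrative, not part of my algorithm.

```python
CRITICAL_R = 0.3494  # 95% confidence cutoff for 32 teams

# r-values copied from the raw statistics table.
stat_r = {
    "Score Differential/Game": 0.9448,
    "Points Forced/Game": 0.7050,
    "Turnover Margin/Game": 0.6560,
    "Yards/Attempt": 0.6099,
    "Sacks Forced/Game": 0.4758,
    "Opponent Interception Percentage": 0.4554,
    "Offensive Third Down Efficiency": 0.3555,
    "Fumbles Recovered/Game": 0.2995,
    "Yards/Rush": 0.0517,
    "Yards/Rush Allowed": -0.0087,
    "Fumbles Lost/Game": -0.2127,
    "Defensive Third Down Efficiency": -0.3402,
    "Sacks Allowed/Game": -0.3989,
    "Opponent Yards/Attempt": -0.4233,
    "Interception Percentage": -0.4821,
    "Points Allowed/Game": -0.7617,
}

# Rank by strength of relationship, ignoring the sign.
ranked = sorted(stat_r.items(), key=lambda kv: abs(kv[1]), reverse=True)
top_three = [name for name, _ in ranked[:3]]
below_cutoff = [name for name, r in ranked if abs(r) <= CRITICAL_R]

for name, r in ranked:
    flag = "" if abs(r) > CRITICAL_R else "  (below 95% cutoff)"
    print(f"{name:<34}{r:>8.4f}{flag}")
```

Sorted this way, points allowed lands just ahead of points forced, and both yards per rush statistics fall to the very bottom of the list.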

Finally, the one truly dominating statistic was score differential. After all my efforts to create a ranking formula for the NFL, I would technically be better off just using score differential straight-up. Even my adjusted score differential, which gave preference to teams with good defenses, didn't have a better relationship. Does this mean my system is pointless? Possibly (I know I'll be seeing that question quoted in the comments for jokes). But one of my stated goals is to create a predictive system. When the score differential after Week 4 (when I began 'Crunching The Numbers') is compared against final wins, the correlation drops to 0.59. This is still a really strong relationship, but it's no longer nearly perfect (my ranking system's correlation was 0.55). In the future, devising an algorithm that will have a higher r-value after four weeks than score differential after four weeks of regular season play will be one of the goals of my system.

About The Sample Size

If there is one argument that can be made against this analysis, it is that these numbers only represent one season, and this is true. I have two comments to make regarding this. First off, in one regular season there are 256 individual games of football. That is a lot of football, and countless more individual plays. On this alone you can almost make the argument that the sample size is large enough to draw meaningful conclusions.

My second comment is in relation to the nature of the statistics themselves. They are results. They do not tell the story of how they were achieved. The NFL is always changing and evolving, but I contend that what changes is almost exclusively the methodology. In other words, successful teams are still putting up the same numbers, still achieving the same thing, just in different ways. So while the methods change, ultimately the results do not. I concede this argument doesn't hold if you go across eras - statistics are definitely different now from what they were twenty years ago - but in our case we're not talking about eras, just the recent past and near future.

The Verdict (Part II)

There was a lot of information to digest there, and at the end it may have raised more questions than it answered. On the surface, it appears the running game is quickly turning into an afterthought - a sentiment reinforced as recently as the 2014 draft, where the first running back came off the board at the lowest draft position ever. But is it actually true? Is the NFL nearing completion in its transformation into a passing league? What other statistics have a large effect on winning? Keep your eyes peeled for Part III where I tackle these questions and more.