# Crunching The Numbers: Under Further Review (Part II of IV)

The next step in my quest for truth was to analytically critique the system. This involved looking over the stats I used to build my algorithm and mathematically determining their overall value to my goal of predicting how good a team will be.

I'm going to throw this out there now. These next three posts are going to be long. Like, really long. So if you're in the "too long, didn't read" camp, here's your warning. I will also say that there is a lot of really interesting stuff in the next three posts, so I will do my best to use headings and pullquotes so you guys can skim. However, I do recommend that you take the time to read it. It's good stuff.

After making observations about the effectiveness of my system, I decided to mathematically analyze the metrics and statistics used to see what the numbers have to say. This involved some basic regression analysis, which I will give some background for. If you are already familiar with regression analysis, feel free to skip over that section.

#### Background

When analyzing two sets of data (let's call them sets "X" and "Y"), a common question is whether or not the sets are related in any way. The easiest way to determine this is to conduct a regression analysis, which will determine a "correlation coefficient" (commonly known as an r-value) between X and Y. The correlation coefficient measures how tightly the data points cluster around the "line of best fit" and always lies between -1 and 1. It is not the slope of that line, just a measure of how well the line describes the data. If the r-value is 1, the relationship is perfectly positive, meaning every point falls exactly on a line where Y rises as X increases. Conversely, if the r-value is -1, the relationship is perfectly negative, and every point falls exactly on a line where Y drops as X increases. If the r-value is 0, there is no linear relationship between the two sets of data. If some of that went over your head, just remember that the farther the correlation coefficient is from zero, the stronger the relationship between the two sets of data. The sign (positive or negative) just determines the direction of the relationship.
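For anyone who wants to see the calculation itself, here is a minimal sketch of how an r-value can be computed by hand. The function name `pearson_r` and the sample numbers are my own illustrative assumptions (they are made-up data, not real NFL stats), and this is just the textbook Pearson formula, not necessarily the exact tool used for the tables in this post.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up illustrative data, NOT real NFL numbers.
x = [1, 2, 3, 4, 5]
y_up = [3 * v + 2 for v in x]     # every point on a rising line (slope 3)
y_down = [10 - 2 * v for v in x]  # every point on a falling line (slope -2)

print(pearson_r(x, y_up))    # 1.0
print(pearson_r(x, y_down))  # -1.0
```

Notice that the rising line has a slope of 3 and the falling line a slope of -2, yet the r-values come out as exactly 1 and -1. The sketch does not guard against a set where every value is identical, since the correlation is undefined in that case.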

There is one more piece to this puzzle. With regression analysis there is always the chance that the relationship is pure coincidence. For this reason there are "critical values" of the correlation coefficient that depend on how large your data set is. These critical values correspond to the confidence (expressed as a percentage) with which you can claim that the relationship is meaningful. In engineering, we like to report data with 95% confidence, which means that with our sample size (32 teams in the league), the critical value is 0.3494. This indicates that for a relationship to be considered meaningful at that confidence level, the r-value must be either greater than 0.3494 or less than -0.3494.
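If you're curious where a number like 0.3494 comes from, one standard route is to convert a critical t-value into a critical r-value using r = t / sqrt(t² + df), where df = n - 2. The sketch below assumes a two-tailed test at the 5% level and takes the t-value for 30 degrees of freedom (roughly 2.0423) from a standard t-table; the function name `critical_r` and the constants are my own labels for illustration, and the post does not claim this is exactly how the value was looked up.

```python
from math import sqrt

def critical_r(t_crit, n):
    """Convert a critical t-value into a critical correlation coefficient.

    Uses r = t / sqrt(t^2 + df) with df = n - 2 degrees of freedom.
    """
    df = n - 2
    return t_crit / sqrt(t_crit ** 2 + df)

# Assumption: two-tailed test, 95% confidence, df = 30.
# The t-value is copied from a standard t-table, not computed here.
T_CRIT_DF30 = 2.0423
N_TEAMS = 32  # sample size from the post: 32 teams in the league

r_crit = critical_r(T_CRIT_DF30, N_TEAMS)
print(round(r_crit, 4))  # 0.3494
```

Under those assumptions the result matches the critical value quoted above to four decimal places.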

Okay, I know that was a lot. Got it all? Then we can move on to the analysis.

#### Results

[Note: I address the "one season is too small of a sample" argument after this section.] In each of the tables below, each index or statistic was compared with every team's final regular season win total to find the r-value. First, let's look at the big picture. How well did each of my "major indices" (offensive line, defensive line, quarterback, coverage, scoring, and overall) correlate to winning? Check it out in the table below:

| Index | Correlation |
| --- | --- |
| Rank Index | 0.9430 |
| Adjusted Score Differential | 0.9423 |
| Quarterback | 0.6902 |
| Offense (Overall) | 0.6196 |
| Coverage | 0.5102 |
| Defense (Overall) | 0.4783 |
| Defensive Line | 0.3387 |
| Offensive Line | 0.3280 |

Not too shabby! All of them had a fairly strong correlation to winning, although the offensive and defensive line indices just missed the cut for 95% confidence (I'll get to that in a minute). The overall rank index had an almost perfect correlation with winning, but this was highly influenced by the adjusted score differential. In fact, combining the other indices into the algorithm only improved the coefficient by 0.0007 (0.9430 versus 0.9423), which is not a meaningful improvement.

But what about the raw statistics that were used to create these indices? How did they stack up? See for yourself:

| Statistic | Correlation |
| --- | --- |
| Score Differential/Game | 0.9448 |
| Points Forced/Game | 0.7050 |
| Turnover Margin/Game | 0.6560 |
| Yards/Attempt | 0.6099 |
| Sacks Forced/Game | 0.4758 |
| Opponent Interception Percentage | 0.4554 |
| Offensive Third Down Efficiency | 0.3555 |
| Fumbles Recovered/Game | 0.2995 |
| Yards/Rush | 0.0517 |
| Yards/Rush Allowed | -0.0087 |
| Fumbles Lost/Game | -0.2127 |
| Defensive Third Down Efficiency | -0.3402 |
| Sacks Allowed/Game | -0.3989 |
| Opponent Yards/Attempt | -0.4233 |
| Interception Percentage | -0.4821 |
| Points Allowed/Game | -0.7617 |

There's a lot to take in here. I'll run through a few observations that I made. Firstly, there is in fact a slight discrepancy between the r-values for points forced and points allowed (remember, the negative sign indicates that as fewer points are allowed, the likelihood of winning increases, which makes sense). For points forced, the correlation coefficient was around 0.71, while points allowed edged it out with -0.76. I believe this 0.05 difference is enough to say that defense has a slight edge over offense when it comes to winning. Secondly, check out just how little the yards per rush statistics correlate to winning on both sides of the ball (this is why the correlations for my offensive and defensive line indices were so low). I personally was floored by this, because it seemed so counter-intuitive. If you're able to get the most bang for your buck every time you run, shouldn't you win more? Or if you can stop your opponent from running effectively, shouldn't that give you the upper hand?
YARDS PER RUSH HAS VIRTUALLY NO IMPACT ON WINNING AT ALL.

Among the other statistics, there are no real surprises that I see. Statistics like sacks, turnover margin, and yards per attempt had very high correlations with winning. If there's anything to be said, it's probably that teams should place a higher premium on protecting their quarterback (if they haven't already) considering just how large the coefficient is.
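To make it easier to see which raw statistics actually clear the 95% bar, here is a short sketch that checks each r-value from the table against the 0.3494 critical value. The correlations are copied straight from the table; the variable names (`raw_stats`, `R_CRIT`, and so on) are just my own labels for illustration.

```python
# Critical r-value at 95% confidence for 32 teams (from the Background section)
R_CRIT = 0.3494

# Correlation of each raw statistic with regular season wins (from the table)
raw_stats = {
    "Score Differential/Game": 0.9448,
    "Points Forced/Game": 0.7050,
    "Turnover Margin/Game": 0.6560,
    "Yards/Attempt": 0.6099,
    "Sacks Forced/Game": 0.4758,
    "Opponent Interception Percentage": 0.4554,
    "Offensive Third Down Efficiency": 0.3555,
    "Fumbles Recovered/Game": 0.2995,
    "Yards/Rush": 0.0517,
    "Yards/Rush Allowed": -0.0087,
    "Fumbles Lost/Game": -0.2127,
    "Defensive Third Down Efficiency": -0.3402,
    "Sacks Allowed/Game": -0.3989,
    "Opponent Yards/Attempt": -0.4233,
    "Interception Percentage": -0.4821,
    "Points Allowed/Game": -0.7617,
}

# A statistic clears the bar when its r-value is farther from zero than R_CRIT
significant = {name: r for name, r in raw_stats.items() if abs(r) > R_CRIT}
not_significant = {name: r for name, r in raw_stats.items() if abs(r) <= R_CRIT}

print(f"{len(significant)} of {len(raw_stats)} statistics clear |r| > {R_CRIT}")
print("Missed the cut:")
# List the misses from closest-to-the-bar down to weakest
for name, r in sorted(not_significant.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"  {name}: {r:+.4f}")
```

By this check, 11 of the 16 statistics clear the bar. The five that miss are defensive third down efficiency, fumbles recovered, fumbles lost, and both yards per rush numbers.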

Finally, the one truly dominating statistic was score differential. After all my efforts to create a ranking formula for the NFL, I would technically be better off just using score differential straight-up. Even my adjusted score differential, which gave preference to teams with good defenses, didn't have a better relationship. Does this mean my system is pointless? Possibly (I know I'll be seeing that question quoted in the comments for jokes). But one of my stated goals is to create a predictive system. When the score differential after Week 4 (when I began 'Crunching The Numbers') is compared against final wins, the correlation drops to 0.59. This is still a really strong relationship, but it's no longer nearly perfect (by comparison, my ranking system's correlation was 0.55). In the future, one of the goals of my system will be to devise an algorithm that has a higher r-value after four weeks of regular season play than score differential does at the same point.