In Part I of this three-part series, I presented a few ideas about what successful football teams do that makes them, well, successful. The next step is to test whether or not these ideas hold any merit. A simple way to accomplish this is to quantify the ideas and then see how they correlate to wins. I’ll provide an explanation here, but if you want to skip over that there’s a TL;DR just beneath it.
Correlation is determined using linear regression, which assigns an r-value to each set of football metrics used in the analysis. The r-value falls between -1 and 1. Its magnitude measures the strength of the relationship, with 0 indicating no correlation and 1 indicating perfect correlation, and its sign gives the direction. If the sign is positive, the relationship is positive: for example, the correlation between time spent on a treadmill and calories burned is positive because as you spend more time on the treadmill, you burn more calories. Conversely, if the sign is negative, the relationship is negative: for example, the correlation between speed and time spent traveling is negative because as your speed increases, the time spent traveling to the destination decreases.
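For anyone who wants to see the arithmetic, here is a minimal sketch of how an r-value (the Pearson correlation coefficient) can be computed. The function name `pearson_r` and the treadmill and travel numbers are made up for illustration; they are not data from this analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (these form the denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers: minutes on a treadmill vs. calories burned.
minutes = [10, 20, 30, 40, 50]
calories = [95, 210, 290, 405, 520]

# Made-up numbers: speed (mph) vs. hours needed to cover a fixed 120 miles.
speed = [30, 40, 50, 60]
hours = [120 / s for s in speed]

r_treadmill = pearson_r(minutes, calories)  # positive, close to 1
r_travel = pearson_r(speed, hours)          # negative
```

The sign of each result matches the direction described above: more treadmill time goes with more calories, and more speed goes with less travel time.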
Of course, we need to be wary of the old phrase, “correlation does not imply causation.” Just because there is a noticeable relationship between two data sets does not mean that one causes the other. A common example is the “positive” relationship between ice cream sales and drowning – they both increase during the same time period, but that’s because it’s summer. There is no causal relationship between ice cream and drowning.
To determine whether or not a relationship is significant, we need to see if the magnitude of the r-value passes a certain threshold. This is known as the “critical” value of r. For a dataset of size 32 (the number of teams in the NFL), the critical r-value is 0.34937. This is the r-value at which we have 95% confidence that the relationship is significant. We could set the threshold at other points of confidence (say, 85% or 97%), but 95% confidence is standard in the engineering field, so that’s what I’m going to use here.
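If you are curious where a number like 0.34937 comes from, one standard route is to convert a critical t-value into a critical r-value using r = t / sqrt(t² + df), where df is the sample size minus 2. The sketch below assumes a two-tailed test and takes the critical t-value for 30 degrees of freedom from a standard t-table; the function name `critical_r` is my own, and a lookup table of critical r-values would give the same figure.

```python
import math

def critical_r(t_crit, n):
    """Smallest |r| that counts as significant for n data pairs, given the
    two-tailed critical t-value for n - 2 degrees of freedom."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Assumption: 2.042272 is the two-tailed 95% critical t-value for
# 30 degrees of freedom, as listed in standard t-distribution tables.
T_CRIT_95_DF30 = 2.042272

r_crit = critical_r(T_CRIT_95_DF30, n=32)
print(round(r_crit, 5))
```

With 32 teams this lands on the threshold quoted above. A smaller sample would raise the bar and a larger one would lower it.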
TL;DR: If any of this went over your head, just remember that we are looking for r-values whose magnitude (ignoring the sign) is greater than 0.34937 in the analysis. A positive number means a positive relationship, and a negative number means a negative relationship.
With the boring (and hopefully not too confusing) statistics lesson out of the way, let’s actually look at the hypotheses I made in the first post and see how they hold up against the analysis. Statistics chosen for linear regression were taken from either www.teamrankings.com or www.sportingcharts.com. I wanted to look at how the league is “trending,” so the analysis looks at the past three seasons (2014-2016). If you need a refresher on the explanation of each hypothesis, check out the first part of this series from yesterday.
Hypothesis #1: The offense needs balance, but only the passing game needs to be effective.
So how do we quantify this statement? Testing balance is easy; we can compare the share of each team’s plays that are runs (RUSH%) against their average win percentage from 2014-2016. Testing “effectiveness” versus win percentage is a little different since there are several ways to measure the passing and running game. Ultimately, I chose yards per pass attempt (YPA) and yards per rush attempt (YPR) since these two metrics are directly analogous to each other.
R-Values from Analysis (Critical R: 0.34937)
Rushing Play Percentage: 0.216
Yards per Pass Attempt: 0.466
Yards per Rush Attempt: -0.042
A few interesting things to note here. First off, YPA was the only statistic to have a significant relationship. The number was positive, indicating that a higher average yardage per pass attempt correlates significantly with wins. On the flip side, the r-value for YPR was essentially zero (and actually slightly negative), indicating that the yards you pick up per play on the ground have virtually no relationship with winning.
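The significance check itself is simple enough to write down. This is a small sketch using the three r-values reported above and the 0.34937 threshold; the helper name `is_significant` is just for illustration. Note that it compares the magnitude of r, so a strongly negative correlation would pass too.

```python
CRITICAL_R = 0.34937  # 95% confidence threshold for 32 teams

# r-values reported for Hypothesis #1
r_values = {
    "Rushing Play Percentage": 0.216,
    "Yards per Pass Attempt": 0.466,
    "Yards per Rush Attempt": -0.042,
}

def is_significant(r, critical=CRITICAL_R):
    """A correlation is significant when its magnitude exceeds the threshold."""
    return abs(r) > critical

significant = [name for name, r in r_values.items() if is_significant(r)]
print(significant)
```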
Somewhat surprisingly, RUSH% did not reach criticality, although the relationship was positive. While I’m wary of concluding that teams can be successful with one-dimensional offenses, I do think that these results underscore just how important an efficient passing game is for a successful team. The running game can’t be ignored, but it shouldn’t be treated as an equivalent part of the offense either.
Verdict: Plausible
Hypothesis #2: Good coverage is more important than a good pass rush.
I spoke at length about why I feel this way, but ultimately I only found three metrics that I thought would accurately tell the true story. The pass rush was easy to assess – I looked at both sack percentage (SACK%) and hurries per game (HURR/G). For coverage, it was a little more challenging to select an appropriate statistic. I originally tried to use pass break-ups, but they weren’t tracked at the team level from what I could find. I settled on opponent yards per completion (Y/CMP). This seemed like a good choice to me, because removing incompletions eliminates any poor quarterback play caused by the pass rush (throwaways, passes tipped at the line, etc.). Of course, this also discounts pass break-ups, but it was the best I could do. The idea here is that a good secondary will limit big plays and have tight coverage over the middle, limiting the quarterback to screens and swing passes. As I’m sure you’re aware, constantly throwing screens and swing passes is not really a recipe for success.
R-Values from Analysis (Critical R: 0.34937)
Sack Percentage: 0.242
Hurries per Game: 0.174
Yards per Completion Allowed: -0.434
Neither pass-rushing metric surpassed the critical r-value. Even worse, HURR/G, which is often used as a reason to bail out PFF’s greatest hero, Brandon Graham, has an even lower r-value than SACK%. Both are positive, which we should expect, but we can’t say with 95% confidence that either relationship is truly significant.
On the flip side, Y/CMP had almost as strong a correlation as YPA. It’s negative, which is to be expected (fewer Y/CMP correlates to more wins). While we obviously cannot discount the importance of a pass rush entirely, I don’t think there’s any doubt that being able to cover well is more valuable.
Verdict: Confirmed
Hypothesis #3: The goal of the offense is to impose will, and the goal of the defense is to break will.
This idea is a bit more abstract than the others, so it’s harder to quantify. I ended up selecting three metrics for the offensive side of the theory. The first one was time of possession (TOP) with the idea that the longer an offense holds the ball, the more control they have over the game, and the more likely they will be to win. The other two statistics were first half points (PTS/1HLF) and second half points (PTS/2HLF), which I compared against each other. The thought here is that an offense which “imposes will” scores early and then defends a lead in the second half by grinding clock. If this is true, there should be a noticeable discrepancy between the r-value of PTS/1HLF and PTS/2HLF.
On the defensive side, I only looked at one metric: opponent yards per point (Y/PT). My feeling with this one is that a defense that “breaks will” forces the opposing offense to work harder to score. There’s a difference between gaining 300 yards of offense and scoring 10 points and gaining 400 yards of offense and scoring 30 points, if you can believe that.
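To put numbers on that comparison, here is a quick sketch of the yards-per-point arithmetic for the two example stat lines above. The function name `yards_per_point` is only illustrative.

```python
def yards_per_point(yards, points):
    """Average number of yards an offense needed for each point it scored."""
    if points == 0:
        raise ValueError("yards per point is undefined when no points are scored")
    return yards / points

# The two example stat lines from the paragraph above.
hard_earned = yards_per_point(300, 10)  # 30.0 yards per point
easy_scoring = yards_per_point(400, 30)  # roughly 13.3 yards per point
```

From the defense’s point of view, the higher figure is the better one: the opponent had to cover more than twice as much ground for every point.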
R-Values from Analysis (Critical R: 0.34937)
Time of Possession: 0.585
First Half Points: 0.717
Second Half Points: 0.672
Opponent Yards per Point: 0.744
All of these had very strong positive correlations, which shouldn’t really come as a shock. The meat of the analysis lies in the 0.045 difference between first and second half points. This nominally supports the idea that teams who score early win more often, but the discrepancy is so small we can’t really say anything conclusive about it. However, the fact that there’s a difference at all bears acknowledging, even if it’s not very significant.
The other two are pretty self-explanatory. Time of possession does indeed relate to winning (somebody tell Chip Kelly) and making your opponent work harder to score will help you win games. Really strong takes here.
Verdict: Plausible
Conclusions
None of my theories were shown to be patently false. They were at least plausible, and one was confirmed. There’s always room for interpretation, of course. Moving forward into Part III, I’ll take the metrics that I felt were the most significant and package them into a sleek, refined version of Crunching The Numbers. What exactly does that mean? Check back tomorrow to find out!
Author’s note: Have additional questions about my methods or the statistics I chose? Let me know in the comments! I can’t reply at work – web blockers be damned – but I’ll get to them as soon as I can.