In order to talk about statistics intelligently, we first need to lay down some definitions, the most important of which is the definition of data.
Data represent collections of observations. As I scan my living room, I see 3 glasses, 1 plate and 2 knives left out from last night. Additionally, there is 1 cat on the couch and 1 dog at my feet. These data tell you something about me: I'm not the neatest person in the world, but dogs and cats live together in my house and the world hasn't ended. These are certainly not the kind of data we're interested in here, but I wanted to present them to make the point that data are all around us. The statistics applied to football are just as useful when applied to linguistics or population modelling.
Generally speaking, observations can be broken into two groups: categorical data and quantitative data.
Categorical data are data that bin into groups: for instance, players drafted by the Philadelphia Eagles, or passes attempted short, medium and deep. Categorical data can be further described as nominal or ordinal.
Nominal data have nothing distinguishing them beyond the bin in which they reside. Players drafted by the Philadelphia Eagles represent nominal data. While there may be some emotional investment, nothing distinguishes this group of observations from the players drafted by the Cincinnati Bengals other than the jersey the players will be wearing on the first day of training camp.
Unlike nominal data, ordinal data have bins that fall in a meaningful order. Ordinal data are data in which order is important. The NFL play by play lists short, medium and deep pass attempts, which is a prime example of ordinal data.
A particularly important type of categorical data to football fans is binary data. Binary data are categories with only 2 bins. Passes are either complete or incomplete; a penalty is either committed on a play or it is not; and a team is either on offense or defense. The NFL play by play is full of examples of binary data. If you feel like it, check it out...
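For readers who like to see ideas in code, here is a minimal Python sketch of the nominal versus ordinal distinction. The class name PassDepth, the numeric ranks and the sample lists are my own illustration, not anything taken from the NFL play by play. Team names are plain labels that can only be compared for equality, while pass depths carry an order that makes sorting and "deepest" meaningful.

```python
from enum import IntEnum


class PassDepth(IntEnum):
    """Hypothetical ordinal bins; the numbers only encode rank, not distance."""
    SHORT = 1
    MEDIUM = 2
    DEEP = 3


# Nominal: team labels support equality checks, but no ranking is implied.
teams = ["Eagles", "Bengals", "Eagles"]
same_team = teams[0] == teams[2]

# Ordinal: the bins have an order, so sorting and max() are meaningful.
attempts = [PassDepth.DEEP, PassDepth.SHORT, PassDepth.MEDIUM, PassDepth.SHORT]
sorted_attempts = sorted(attempts)
deepest = max(attempts)

print(same_team)
print([a.name for a in sorted_attempts])
print(deepest.name)
```

Note that the ranks 1, 2 and 3 say nothing about how much deeper one bin is than another. Only the ordering carries meaning.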
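Binary data are convenient to work with because the two bins map naturally onto True and False. The sketch below uses five made-up pass attempts (not real play-by-play data) to show how counting the True values gives completions and a completion rate.

```python
# Made-up outcomes for five pass attempts: True = complete, False = incomplete.
passes_complete = [True, False, True, True, False]

# In Python, True counts as 1 and False as 0, so sum() counts completions.
completions = sum(passes_complete)
completion_rate = completions / len(passes_complete)

print(completions)
print(completion_rate)
```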
Quantitative data represent observations that can be measured. Rather than saying that the game was played on a hot day, a measurement gets made and we can say that it was 97 degrees Fahrenheit outside at kickoff. Like categorical data, quantitative data come in two kinds: continuous data and discrete data.
Discrete data are data in which only certain values are possible. They exist in discrete intervals. A field goal is worth 3 points, a touchdown 6 and an extra point 1.
Continuous data are data that can take any value within a range. Brian Westbrook carried for 5.6251329 yards on the last play. Donovan McNabb was sacked for -12.5532323232 yards. Almost all of the continuous data in the NFL play by play are recorded as discrete data, but that is something we simply have to accept.
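To make the discrete versus continuous distinction concrete, here is a small Python sketch. The point values and the two yardage figures come from the examples above; the sequence of scoring plays is invented, and rounding to the nearest whole yard is only my assumption about how a continuous gain might end up recorded as a discrete one.

```python
# Discrete: only certain point values are possible for each scoring play.
POINTS = {"touchdown": 6, "field goal": 3, "extra point": 1}

# Invented sequence of scoring plays, for illustration only.
plays = ["touchdown", "extra point", "field goal", "touchdown", "extra point"]
total_points = sum(POINTS[p] for p in plays)

# Continuous: the "true" gains from the examples above.
true_gains = [5.6251329, -12.5532323232]

# Assumption: the recorded value is the gain rounded to the nearest whole yard.
recorded = [round(g) for g in true_gains]

# What is lost by recording a continuous value as a discrete one.
errors = [g - r for g, r in zip(true_gains, recorded)]

print(total_points)
print(recorded)
print(errors)
```

Whatever recording rule is actually used, the point is the same: the recorded value can differ from the true gain by a fraction of a yard.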
Enough about data. In my next installment I'll discuss statistics in general and define descriptive and inductive statistics.
Thanks for reading.