What I like about Allison McCann's FiveThirtyEight article, "The NFL’s Uneven History Of Punishing Domestic Violence"

List of data sources

McCann manually compiled and cross-referenced records from three existing data sources:

  1. The San Diego Union-Tribune’s NFL arrests database, which covers arrests of NFL players since 2000 "that were more serious than speeding tickets" and appears to be kept up-to-date.
  2. The Wikipedia entry on List of suspensions in the National Football League
  3. A list of NFL suspensions and fines from Spotrac
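
McCann did this cross-referencing by hand, and her article does not describe a scripted method. Purely as an illustration of what cross-referencing three lists involves, here is a minimal Python sketch. Every row, field name (`player`, `year`, `games`, `charge`), and source label below is a made-up placeholder, not real data, and matching on a normalized name plus year is my own assumption about what a reasonable matching key might be.

```python
from collections import defaultdict

# Hypothetical placeholder rows: names, fields, and values are invented
# for illustration and do not come from any of the three real sources.
arrests_db = [
    {"player": "Player A", "year": 2010, "charge": "DUI"},
    {"player": "Player B", "year": 2012, "charge": "Assault"},
]
wikipedia = [
    {"player": "Player A", "year": 2010, "games": 2},
]
fines_list = [
    {"player": "player a ", "year": 2010, "games": 2},
    {"player": "Player C", "year": 2013, "games": 4},
]


def match_key(row):
    """Normalize the name so small spelling differences still match (assumed key)."""
    return (row["player"].strip().lower(), row["year"])


def cross_reference(**sources):
    """Group rows from every source under a shared (player, year) key."""
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            merged[match_key(row)][source_name] = row
    return dict(merged)


merged = cross_reference(
    arrests_db=arrests_db, wikipedia=wikipedia, fines_list=fines_list
)

# Records missing from at least one source are the ones to check by hand.
needs_review = sorted(k for k, found in merged.items() if len(found) < 3)
print(needs_review)
```

The records that appear in only one or two sources are where the manual work lies: someone still has to read the underlying news reports to decide whether they describe the same incident.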

What is the “birth story” of the data?

(How were they created in the first place?)

The San Diego Union-Tribune (U-T) compiled its database from news reports and public records, possibly by searching a service like LexisNexis for variations of "NFL" and "arrest". The Wikipedia entry is crowdsourced and cites a variety of published news sources.

McCann (and presumably the U-T) could not find an official, comprehensive list from the NFL itself, and the players' union did not respond to her. Such a list may not exist, since it would not be a particularly flattering data point for the league or the union.

Claims based on the data

Searching over the span of the NFL's 94-year existence, McCann found 263 suspensions. The most frequent reason for suspensions involved performance-enhancing drugs. These suspensions also were the most consistent in their punishment. McCann states:

There’s little question about the number of games a player will be forced to miss for using steroids because the length of the suspension is specifically outlined in league policy.

Punishments for substance abuse violations are less consistent, but there are league guidelines to work from.

But when it comes to suspensions for "personal conduct violations", McCann found the duration to be all over the place:

The NFL’s punishment of personal conduct violations has been inconsistent and on average less harsh than its punishment of drug offenses.
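
The comparison behind that claim, average suspension length and how much it varies within each category, can be sketched in a few lines of Python. The numbers below are invented so the arithmetic is easy to follow. They are not McCann's figures, and using the population standard deviation as the measure of "consistency" is my assumption, not something the article specifies.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (category, games suspended) pairs. These values are invented
# for illustration only and are NOT taken from McCann's dataset.
suspensions = [
    ("performance-enhancing drugs", 4),
    ("performance-enhancing drugs", 4),
    ("performance-enhancing drugs", 4),
    ("substance abuse", 4),
    ("substance abuse", 6),
    ("personal conduct", 1),
    ("personal conduct", 2),
    ("personal conduct", 6),
]

# Collect the suspension lengths for each category.
games_by_category = defaultdict(list)
for category, games in suspensions:
    games_by_category[category].append(games)

# Mean shows how harsh a category is on average; spread (population
# standard deviation, an assumed choice) shows how consistent it is.
summary = {
    category: {"mean": mean(games), "spread": pstdev(games)}
    for category, games in games_by_category.items()
}

for category, stats in summary.items():
    print(f"{category}: mean={stats['mean']:.1f} games, spread={stats['spread']:.2f}")
```

A spread of zero means every suspension in the category was the same length, which is what a fixed policy would produce. A large spread relative to the mean is what "inconsistent" looks like in numbers.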

The inconsistency in McCann's data is an unsurprising consequence of the low priority the league has given to personal conduct violations, a problem that has been fully exposed by its handling of the Ray Rice case.

Limitations of the data

One of McCann's data sources, the U-T database, only goes back to 2000. And even in that timeframe, the U-T says, it's impossible to know whether its list is complete, since not all incidents may have been reported "and some public records proved to be elusive." In fact, the U-T notes, the higher frequency of recent incidents in its database may merely reflect increased media coverage.