Abridged from the 2014 course lecture.
How do we use untrustworthy data?
In 2010, Baltimore Sun crime reporter Justin Fenton not only built a clear, logical refutation of something as infuriatingly opaque and politically controversial as rape statistics; he did it so convincingly that his work immediately effected a dramatic change in policy. And he started with the same flawed, public data that the police and the politicians had access to but had apparently overlooked.
In 2009, the Baltimore police reported 158 rapes. Moving past the sentiment that even one such crime is too many, the first thing to question is: how significant is 158? Is that more than expected for a city of Baltimore's size? Fewer? The easiest route of comparison is to look at that number historically, i.e. the number of rape reports, year over year:
(Chart via the Baltimore Sun's graphics department.)
It's possible that Baltimore has had exceptional success with reducing crime overall compared to the rest of the nation. It's possible there has been a significant investment in police funding, as well as an implementation of innovative law enforcement strategies (a real-life Hamsterdam?).
We can test that theory by comparing across categories. If it holds, we should see a dramatic drop in other categories of crime, such as homicide. Let's look at the FBI Uniform Crime Reporting numbers for Baltimore's homicides and rapes (officially labeled "Murder and nonnegligent manslaughter" and "Forcible rape", respectively).
Because the population fluctuates, it's more relevant to look at crime rates: in this case, number of crimes per 100,000 citizens:
| Year | Homicide rate | Rape rate |
|------|---------------|-----------|
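The per-100,000 normalization described above is a one-line calculation. Here is a minimal sketch; the population figure is an assumption for illustration only (use the census estimate for the actual year you are analyzing):

```python
def per_100k(count, population):
    """Convert a raw count of incidents to a rate per 100,000 residents."""
    return count / population * 100_000

# 158 reported rapes in 2009 (from the article); a population of roughly
# 637,000 is an assumed, illustrative figure, not a sourced census number.
rate = per_100k(158, 637_000)
print(round(rate, 1))
```

Normalizing by population is what makes year-over-year and city-to-city comparisons meaningful, since raw counts rise and fall with population alone.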
Charting that data leads to the graph below: whatever has led to the drop in rape reports has not affected the rate of homicides.