Contributed by Chuan Sun. He is attending the NYC Data Science Academy 12-week full-time Data Science Bootcamp program from July 5th to September 23rd, 2016. This post is based on his first class project, the Exploratory Data Analysis Visualization Project, due in the 2nd week of the program. You can find the original article here.
When Vito Corleone, the head of the Corleone crime family in the movie “The Godfather”, was shot on a New York street by hitmen, I was shocked.
I was shocked not just because I was so immersed in the movie, but also because one sentence kept echoing in my mind: “no one is an island”.
Uncertainty is everywhere, even for the mafia boss, not to mention millions of ordinary New Yorkers.
Safety is one of the most fundamental needs for people. As one of the most populous urban agglomerations in the world, New York City is heaven for many, but perhaps hell for a few, especially those unfortunate enough to be affected by the seven “sins”:
Each week, the NYPD publishes City Wide Crime Statistics, containing detailed weekly statistics of crime complaints for 7 felonies. For example, the report for 7/4 to 7/10 of 2016 lists 1888 total crime complaints in NYC: 6 murders, 35 rapes, 304 robberies, 444 felony assaults, 202 burglaries, 765 grand larcenies, and 132 grand larcenies of vehicles.
1888 is not a small number, although total complaints were down 5.51% compared with 2015. By simple math (1888 complaints over 7 × 24 = 168 hours), there were on average 11.24 felony incidents per hour, or roughly one every 5 to 6 minutes in NYC.
This project investigates the 7 sins, a.k.a. felonies, that occurred in NYC over the 10 years from 2006 to 2015. It focuses on answering the following simple yet important questions:
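The arithmetic behind these rates can be checked in a few lines. This is only a sketch: the counts are the ones quoted from the weekly report above, while the variable names are my own.

```python
# Weekly complaint counts quoted above for 7/4 to 7/10 of 2016.
weekly_complaints = {
    "murder": 6,
    "rape": 35,
    "robbery": 304,
    "felony assault": 444,
    "burglary": 202,
    "grand larceny": 765,
    "grand larceny of vehicles": 132,
}

total = sum(weekly_complaints.values())   # all 7 felonies combined
hours_in_week = 7 * 24                    # 168 hours
per_hour = total / hours_in_week          # average incidents per hour
minutes_between = 60 / per_hour           # average gap between incidents

print(f"total complaints: {total}")
print(f"incidents per hour: {per_hour:.2f}")
print(f"minutes between incidents: {minutes_between:.1f}")
```

The average gap comes out a little over five minutes.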
See here for R source code to generate the graphs in this post.
The NYPD 7 Major Felony Incidents dataset:
According to the NYPD Incident Level Data Footnotes:
The first point indicates that the number of actual incidents is larger than that in the dataset. Since we know nothing about which types of offenses are typically associated together in incidents of multiple offenses, we can make no assumptions. The second point affects the accuracy of incident locations. Nevertheless, at the scale of borough or city level, the inaccuracy in longitude and latitude will not have a major impact on the overall distribution of incidents.
Quick exploration using R revealed that, although the years in the dataset span from 1919 to 2015, over 95% of all incidents occurred after 2005. I thus focus mainly on the years from 2006 to 2015. This 10-year period covers 1.1 million incidents.
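The original exploration was done in R. As a rough Python sketch of the same kind of check, the snippet below computes what share of records falls inside a year window and then keeps only those records. The list `sample_years` is made-up toy data standing in for the dataset's year column, and the helper names are hypothetical.

```python
from collections import Counter

# Made-up occurrence years for illustration only; NOT the real dataset.
sample_years = [1919, 1998, 2004, 2006, 2006, 2009,
                2011, 2012, 2013, 2014, 2015, 2015]


def share_in_range(years, start, end):
    """Return the fraction of years that fall within [start, end]."""
    if not years:
        return 0.0
    in_range = sum(1 for y in years if start <= y <= end)
    return in_range / len(years)


def keep_range(years, start, end):
    """Return only the years that fall within [start, end]."""
    return [y for y in years if start <= y <= end]


share = share_in_range(sample_years, 2006, 2015)
recent = keep_range(sample_years, 2006, 2015)
per_year = Counter(recent)  # incidents per year inside the window

print(f"share in 2006-2015: {share:.0%}")
print(f"records kept: {len(recent)}")
```

On the real data, the same share calculation is what would back the "over 95%" figure before narrowing the analysis to 2006-2015.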
First let us take a look at the overall trend of 7 felonies in NYC in the last 10 years.
Grand larceny is the most frequent offense of all 7 felonies. The number of incidents is almost twice that of the second most frequent one.
Three felonies are declining: robbery, burglary, and auto theft. I cannot help but link this to the widespread use of surveillance cameras. Wrongdoers know their faces may quickly show up on NYPD screens if they take the risk.
Murder and rape have stayed at roughly the same level across the 10 years.
The number of felony assaults is on a slightly increasing trend.
To sum up, it seems safe to conclude that NYC is, overall, getting safer.
NYC’s seasons are defined as follows:
Late winter and early spring tend to have the smallest number of incidents for almost all 7 felonies, with February having a particularly low felony incidence. These can be considered as the safest seasons. This is understandable. During those months it can become very chilly, windy, and snowy. Who would want to go out in such weather?
Summer and early fall tend to have the largest number of incidents for almost all 7 felonies. Summer months in NYC are usually hot and humid, and temperatures may remain high at night. This can make certain people ornery.
Friday is the least safe day of the week. This insight is easily perceived from the histograms. On Friday, burglary, grand larceny, larceny of motor vehicle, and robbery occur more frequently than on other days. Maybe people tend to feel very relaxed on Friday after a week of work, and are therefore not as vigilant as they otherwise might be. This could give wrongdoers great opportunities to break into houses, steal property such as cars, or commit robberies on the streets.
As for the weekend, the number of incidents for burglary, grand larceny, auto theft, and robbery declines. If you think that people are at home playing with their kids, enjoying family time, watching favorite TV shows, or preparing for their next week’s work, then maybe there is less of an opportunity for wrongdoers to sneak into their homes.
On the other hand, the weekend is less safe in terms of felony assault, rape, and murder. Domestic violence, strained family relationships, and unkind words may all be related to an unhappy or disastrous weekend. So maybe family time is not equally great for everyone!
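The weekday comparison boils down to counting incidents per offense and day of week. Here is a minimal Python sketch of that grouping, not the original R code. The records in `sample_incidents` are invented for illustration, and the function name is hypothetical.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented (offense, occurrence date) pairs; NOT records from the dataset.
sample_incidents = [
    ("BURGLARY", date(2015, 1, 2)),        # Friday
    ("ROBBERY", date(2015, 1, 2)),         # Friday
    ("GRAND LARCENY", date(2015, 1, 9)),   # Friday
    ("FELONY ASSAULT", date(2015, 1, 3)),  # Saturday
    ("FELONY ASSAULT", date(2015, 1, 4)),  # Sunday
    ("BURGLARY", date(2015, 1, 5)),        # Monday
]


def count_by_weekday(incidents):
    """Count incidents for each (offense, weekday name) pair."""
    counts = Counter()
    for offense, day in incidents:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        counts[(offense, WEEKDAYS[day.weekday()])] += 1
    return counts


counts = count_by_weekday(sample_incidents)

# Collapse over offenses to get the total per weekday.
totals = Counter()
for (offense, weekday), n in counts.items():
    totals[weekday] += n

for weekday in WEEKDAYS:
    print(weekday, totals[weekday])
```

Plotting one bar per weekday for each offense from `counts` would give histograms of the kind discussed above.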
Knowing which hours are safe or unsafe for certain offenses is vital for New Yorkers, since hour is a “tangible” and controllable unit. One can choose to be at one place at a certain hour, or not.
It strikes me that, even with just simple density and histogram graphs, without any complex machine learning models, we can still distill many insights from history.
It is easy to see on a clock when each of the deadly sins peaks in terms of frequency. You can almost map the life of a felon, and only a few hours in a day are really safe, e.g., 5am is a safe time to be alive.
We should also keep an eye on where felonies occur.
From the histogram above, it can be seen that Manhattan has the largest number of grand larcenies. This is somewhat unsurprising. Perhaps it is because Wall Street and most financial companies are located there, and wrongdoers can get their hands dirty easily. Brooklyn is the runner-up, and Staten Island has the fewest. Despite Manhattan being the winner when it comes to grand larceny, Brooklyn in fact appears to be the most dangerous borough. It ranks first in the number of incidents for 6 out of 7 felonies. In contrast, Staten Island ranks last.
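The borough ranking can be expressed as "find the leading borough for each offense, then tally first places". The Python sketch below shows that logic only. The numbers in `sample_counts` are made up and cover just three offenses and three boroughs, so they say nothing about the real figures.

```python
from collections import Counter

# Made-up counts for illustration only; NOT figures from the NYPD dataset.
sample_counts = {
    "GRAND LARCENY": {"MANHATTAN": 50, "BROOKLYN": 40, "STATEN ISLAND": 5},
    "ROBBERY": {"MANHATTAN": 20, "BROOKLYN": 30, "STATEN ISLAND": 2},
    "BURGLARY": {"MANHATTAN": 15, "BROOKLYN": 25, "STATEN ISLAND": 3},
}


def leading_borough(counts_by_offense):
    """Map each offense to the borough with the highest count."""
    return {
        offense: max(by_borough, key=by_borough.get)
        for offense, by_borough in counts_by_offense.items()
    }


leaders = leading_borough(sample_counts)
first_place_tally = Counter(leaders.values())  # first places per borough

for offense, borough in leaders.items():
    print(f"{offense}: {borough}")
print(first_place_tally.most_common())
```

With toy numbers shaped like this, one borough can lead a single high-volume offense while another collects more first places overall, which is the pattern described for Manhattan and Brooklyn.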
The density map below depicts a visualization of crime in all 5 boroughs. It turns out that each borough has its own distinct pattern of hot locations.
New Yorkers may rely solely on the NYPD to solve these problems. But if each New Yorker is aware of the time/space patterns identified in this report, s/he can take proper action and things may be different.
NYC is getting safer and safer. But we should not be satisfied with this. Eradicating felonies is a long-term mission. I believe more work can be done, including but not limited to: