{"id":1136,"date":"2024-04-04T07:00:01","date_gmt":"2024-04-04T12:00:01","guid":{"rendered":"https:\/\/wollen.org\/blog\/?p=1136"},"modified":"2024-10-18T02:58:52","modified_gmt":"2024-10-18T07:58:52","slug":"violent-crime-in-the-united-states","status":"publish","type":"post","link":"https:\/\/wollen.org\/blog\/2024\/04\/violent-crime-in-the-united-states\/","title":{"rendered":"Violent crime in the United States"},"content":{"rendered":"<p>The FBI <a href=\"https:\/\/cde.ucr.cjis.gov\/LATEST\/webapp\/#\/pages\/about\" target=\"_blank\" rel=\"noopener\">Uniform Crime Reporting Program<\/a> (UCR) collects crime-related data from thousands of law enforcement agencies across the country. The FBI then compiles the data and publishes quarterly and annual reports. It isn&#8217;t a perfect snapshot of crime as participation among agencies is voluntary, but it is the best picture we have.<\/p>\n<p>For this post I&#8217;ll be digging into the most recent annual release, October 2023, which covers data through the end of 2022. You can download the spreadsheet I&#8217;m using by clicking over to the FBI&#8217;s <a href=\"https:\/\/cde.ucr.cjis.gov\/LATEST\/webapp\/#\/pages\/downloads\" target=\"_blank\" rel=\"noopener\">Crime Data Explorer<\/a> and finding the section shown below. I&#8217;ll also link the file directly at the bottom of this post.<\/p>\n<p><a href=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/03\/fbi_ucr_screenshot.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1139 size-medium\" src=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/03\/fbi_ucr_screenshot-300x152.png\" alt=\"\" width=\"300\" height=\"152\" srcset=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/03\/fbi_ucr_screenshot-300x152.png 300w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/03\/fbi_ucr_screenshot.png 759w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<hr \/>\n<h4>1. 
Prepare the data.<\/h4>\n<p>You might notice the spreadsheet isn&#8217;t as clean as the CSV files I usually work with here. The federal government doesn&#8217;t have the softest touch. But in the spirit of imperfect data I want to leave the Excel file as it is and go through the process of cleaning it for presentation. Sure, it would be easier to copy-and-paste the target data into a clean file and manually remove any weird stuff, like footnotes or trailing white space. But data is rarely in a perfect form so let&#8217;s practice meeting it where it is.<\/p>\n<p>After importing pandas and Matplotlib, set an option that will allow us to see the entire DataFrame&#8217;s width on screen. Because the spreadsheet has so many columns\u2014and several with very long names\u2014pandas will try to collapse the DataFrame whenever we print it to screen. It will replace the majority of the data with ellipses. Changing this setting will avoid truncation and make it easier to visualize our work.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\nimport matplotlib.pyplot as plt\r\nfrom matplotlib.offsetbox import OffsetImage, AnnotationBbox\r\n\r\npd.set_option(\"display.expand_frame_repr\", False)<\/pre>\n<p>Read the spreadsheet into a DataFrame and take a look.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df = pd.read_excel(\"Table_1_Crime_in_the_United_States_by_Volume_and_Rate_per_100000_Inhabitants_2003-2022.xlsx\")\r\n\r\nprint(df.head())<\/pre>\n<p>The output is shown below. In this case I&#8217;m using ellipses because it would be too much to display in a blog post. 
The important thing is to see what a mess the DataFrame is.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">                                             Table 1                    Unnamed: 2  [...]\r\n0                         Crime in the United States          NaN              NaN  [...]\r\n1  by Volume and Rate per 100,000 Inhabitants, 20...          NaN              NaN  [...]\r\n2                                               Year  Population1  Violent\\ncrime2  [...]\r\n3                                               2003    290809777          1459416  [...]\r\n4                                               2004    293655404          1428745  [...]<\/pre>\n<p>The first problem is that the column labels (<em>Year<\/em>, <em>Population1<\/em>, et al.) are located in the third row.<\/p>\n<p>To solve it we can grab that specific row using <code>iloc<\/code> and parse the strings into something more readable, then tell the DataFrame that those are its new column labels.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">column_labels = [str(item).replace(\"\\n\", \" \").replace(\"  \", \" \").strip() for item in df.iloc[2]]\r\ndf.columns = column_labels<\/pre>\n<p>The second problem is the extraneous rows at top and bottom of the spreadsheet. We&#8217;re only interested in rows that correspond to yearly data from 2003 to 2022.<\/p>\n<p>We can simply redefine <code>df<\/code> to be a subset of its rows.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df = df[3:23]<\/pre>\n<p>The last change we&#8217;ll need to make is in the <em>Year<\/em> column. There is a footnote in the 2021 row that changes its value. The tail of the column looks like this:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">     Year\r\n18   2018\r\n19   2019\r\n20   2020\r\n21  20215\r\n22   2022<\/pre>\n<p>But this is an easy fix. 
Rather than applying a function we can just redefine the column to be the numbers from 2003 to 2022.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df.loc[:, 'Year'] = range(2003, 2023)<\/pre>\n<p>At this point we could begin plotting but I have an idea to make the data a little more informative.<\/p>\n<p>I want to create two plots in the output image. On top will be the most-often cited statistic, overall violent crime rate. And on bottom it will show percent change in each subcategory (murder, assault, robbery) over the past 20 years. In absolute terms, robbery rate is much higher than murder rate, but we can peg both at 100% and see how they change relative to each other.<\/p>\n<p>Create new columns to hold this data. Divide each cell by the topmost row in its column (2003) and multiply by 100 to express the value as a percentage.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df.loc[:, 'murder_rate_change'] = df['Murder and nonnegligent manslaughter rate'] \/ df['Murder and nonnegligent manslaughter rate'].iloc[0] * 100\r\ndf.loc[:, 'assault_rate_change'] = df['Aggravated assault rate'] \/ df['Aggravated assault rate'].iloc[0] * 100\r\ndf.loc[:, 'robbery_rate_change'] = df['Robbery rate'] \/ df['Robbery rate'].iloc[0] * 100<\/pre>\n<hr \/>\n<h4>2. Plot the data.<\/h4>\n<p>This plot uses a custom Matplotlib style that I&#8217;ll link at the bottom of this post.<\/p>\n<p>Specify the shape of the subplot array in <code>plt.subplots()<\/code>\u20142 rows, 1 column. In other words we&#8217;ll have two plots stacked in a vertical orientation. When I create multiple axes on the same figure I like to name them <code>axs<\/code> rather than the usual <code>ax<\/code>. 
Each can be addressed as <code>axs[0]<\/code> or <code>axs[1]<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">plt.style.use(\"wollen_dark.mplstyle\")\r\nfig, axs = plt.subplots(2, 1, figsize=(14, 11))<\/pre>\n<p>The violent crime rate plot is below. The Matplotlib code is fairly straightforward so I won&#8217;t bog down the post with too much commentary.<\/p>\n<p>I&#8217;m overlaying an FBI seal onto the plot with low <code>alpha<\/code> to give it a watermark effect. The trickiest part is to correctly set <code>box_alignment<\/code>. (0, 0) means the lower-left corner of the image will be placed at the specified location. (0.5, 0.5) would center the image at that point.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">axs[0].plot(df['Year'], df['Violent crime rate'], color=\"#B0E441\", marker=\"o\", markersize=6)\r\n\r\nx_ticks = df['Year'].tolist()\r\naxs[0].set_xticks(x_ticks)\r\nx_range = x_ticks[-1] - x_ticks[0]\r\nx_left, x_right = x_ticks[0] - x_range * 0.02, x_ticks[-1] + x_range * 0.02\r\naxs[0].set_xlim(x_left, x_right)\r\n\r\ny_ticks = range(360, 540, 20)\r\naxs[0].set_yticks(y_ticks)\r\ny_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_range * 0.01, y_ticks[-1] + y_range * 0.01\r\naxs[0].set_ylim(y_bottom, y_top)\r\naxs[0].set_ylabel(\"Rate per 100,000\")\r\n\r\naxs[0].set_title(\"United States Violent Crime Rate  \u2022  2003\u20132022\")\r\n\r\naxs[0].text(x_right - x_range * 0.005, y_ticks[-1] - y_range * 0.01,\r\n            \"Data:  FBI Crime in the Nation, October 2023.\",\r\n            size=10, ha=\"right\", va=\"top\")\r\n\r\nab = AnnotationBbox(OffsetImage(plt.imread(\"fbi_seal.png\"), zoom=0.2, alpha=0.05),\r\n                    (x_ticks[0], y_ticks[0]), box_alignment=(0, 0), frameon=False)\r\naxs[0].add_artist(ab)<\/pre>\n<p>Next is the bottom plot, <code>axs[1]<\/code>, which will display change in violent crime subcategories over the past 20 years.<\/p>\n<p>This 
code is a little simpler because it doesn&#8217;t include an image or citation text. We can also reuse previous definitions for the x-axis.<\/p>\n<p>Remember to include a legend so readers can identify each crime subcategory. The data series trend downward so lower-left is a nice spot for a legend.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">axs[1].plot(df['Year'], df['murder_rate_change'], marker=\"o\", markersize=6, label=\"Murder and Nonnegligent Manslaughter\")\r\naxs[1].plot(df['Year'], df['assault_rate_change'], marker=\"o\", markersize=6, label=\"Aggravated Assault\")\r\naxs[1].plot(df['Year'], df['robbery_rate_change'], marker=\"o\", markersize=6, label=\"Robbery\")\r\n\r\naxs[1].set_xticks(x_ticks)\r\naxs[1].set_xlim(x_left, x_right)\r\n\r\ny_ticks = range(0, 140, 20)\r\naxs[1].set_yticks(y_ticks)\r\naxs[1].set_yticklabels([f\"{n}%\" for n in y_ticks])\r\ny_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_range * 0.01, y_ticks[-1] + y_range * 0.01\r\naxs[1].set_ylim(y_bottom, y_top)\r\n\r\naxs[1].set_title(\"Subcategories  \u2022  Change Since 2003\")\r\n\r\naxs[1].legend(loc=\"lower left\")<\/pre>\n<p>Finally, save the figure. I bump up <code>dpi<\/code> from its default 100 to aid readability.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">plt.savefig(\"fbi_violent_crime.png\", dpi=150)<\/pre>\n<hr \/>\n<h4>3. 
The output.<\/h4>\n<p><a href=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1816 size-full\" src=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1.png\" alt=\"\" width=\"2100\" height=\"1650\" srcset=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1.png 2100w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1-300x236.png 300w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1-1024x805.png 1024w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1-768x603.png 768w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1-1536x1207.png 1536w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/04\/fbi_violent_crime-1-2048x1609.png 2048w\" sizes=\"auto, (max-width: 2100px) 100vw, 2100px\" \/><\/a><\/p>\n<p>Overall the country is much safer than it was 20 years ago. We saw an uptick in violent crime during the COVID-19 pandemic, especially within the murder category, but we&#8217;ve nearly returned to the lowest levels on record.<\/p>\n<p>So far, quarterly data has pointed to <a href=\"https:\/\/www.nbcnews.com\/news\/crime-courts\/us-crime-rate-still-dropping-says-fbi-rcna144100\" target=\"_blank\" rel=\"noopener\">another large decline in 2023<\/a>. Experts say it could be the largest recorded year-over-year drop in the murder rate. 
We&#8217;ll learn more when the FBI releases full 2023 data this October.<\/p>\n<hr \/>\n<p><a href=\"https:\/\/wollen.org\/misc\/fbi_violent_crime.zip\"><strong>Download the data.<\/strong><\/a><\/p>\n<p><strong>Full code:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\nimport matplotlib.pyplot as plt\r\nfrom matplotlib.offsetbox import OffsetImage, AnnotationBbox\r\n\r\n\r\npd.set_option(\"display.expand_frame_repr\", False)\r\n\r\ndf = pd.read_excel(\"Table_1_Crime_in_the_United_States_by_Volume_and_Rate_per_100000_Inhabitants_2003-2022.xlsx\")\r\n\r\ncolumn_labels = [str(item).replace(\"\\n\", \" \").replace(\"  \", \" \").strip() for item in df.iloc[2]]\r\ndf.columns = column_labels\r\n\r\ndf = df[3:23]\r\n\r\ndf.loc[:, 'Year'] = range(2003, 2023)\r\n\r\ndf.loc[:, 'murder_rate_change'] = df['Murder and nonnegligent manslaughter rate'] \/ df['Murder and nonnegligent manslaughter rate'].iloc[0] * 100\r\ndf.loc[:, 'assault_rate_change'] = df['Aggravated assault rate'] \/ df['Aggravated assault rate'].iloc[0] * 100\r\ndf.loc[:, 'robbery_rate_change'] = df['Robbery rate'] \/ df['Robbery rate'].iloc[0] * 100\r\n\r\nprint(df.head(50))\r\n\r\nplt.style.use(\"wollen_dark.mplstyle\")\r\nfig, axs = plt.subplots(2, 1, figsize=(14, 11))\r\n\r\naxs[0].plot(df['Year'], df['Violent crime rate'], color=\"#B0E441\", marker=\"o\", markersize=6)\r\n\r\nx_ticks = df['Year'].tolist()\r\naxs[0].set_xticks(x_ticks)\r\nx_range = x_ticks[-1] - x_ticks[0]\r\nx_left, x_right = x_ticks[0] - x_range * 0.02, x_ticks[-1] + x_range * 0.02\r\naxs[0].set_xlim(x_left, x_right)\r\n\r\ny_ticks = range(360, 540, 20)\r\naxs[0].set_yticks(y_ticks)\r\ny_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_range * 0.01, y_ticks[-1] + y_range * 0.01\r\naxs[0].set_ylim(y_bottom, y_top)\r\naxs[0].set_ylabel(\"Rate per 100,000\")\r\n\r\naxs[0].set_title(\"United States Violent Crime Rate  \u2022  
2003\u20132022\")\r\n\r\naxs[0].text(x_right - x_range * 0.005, y_ticks[-1] - y_range * 0.01,\r\n            \"Data:  FBI Crime in the Nation, October 2023.\",\r\n            size=10, ha=\"right\", va=\"top\")\r\n\r\nab = AnnotationBbox(OffsetImage(plt.imread(\"fbi_seal.png\"), zoom=0.2, alpha=0.05),\r\n                    (x_ticks[0], y_ticks[0]), box_alignment=(0, 0), frameon=False)\r\naxs[0].add_artist(ab)\r\n\r\naxs[1].plot(df['Year'], df['murder_rate_change'], marker=\"o\", markersize=6, label=\"Murder and Nonnegligent Manslaughter\")\r\naxs[1].plot(df['Year'], df['assault_rate_change'], marker=\"o\", markersize=6, label=\"Aggravated Assault\")\r\naxs[1].plot(df['Year'], df['robbery_rate_change'], marker=\"o\", markersize=6, label=\"Robbery\")\r\n\r\naxs[1].set_xticks(x_ticks)\r\naxs[1].set_xlim(x_left, x_right)\r\n\r\ny_ticks = range(0, 140, 20)\r\naxs[1].set_yticks(y_ticks)\r\naxs[1].set_yticklabels([f\"{n}%\" for n in y_ticks])\r\ny_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_range * 0.01, y_ticks[-1] + y_range * 0.01\r\naxs[1].set_ylim(y_bottom, y_top)\r\n\r\naxs[1].set_title(\"Subcategories  \u2022  Change Since 2003\")\r\n\r\naxs[1].legend(loc=\"lower left\")\r\n\r\nplt.savefig(\"fbi_violent_crime.png\", dpi=150)<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The FBI Uniform Crime Reporting Program (UCR) collects crime-related data from thousands of law enforcement agencies across the country. The FBI then compiles the data and publishes quarterly and annual reports. 
It isn&#8217;t a perfect snapshot of crime as participation<\/p>\n","protected":false},"author":1,"featured_media":1158,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[238],"tags":[136,245,250,241,253,240,248,242,24,126,244,251,137,30,249,187,25,254,246,63,116,255,247,252,243],"class_list":["post-1136","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-government","tag-annotationbbox","tag-assault","tag-cops","tag-crime","tag-excel","tag-fbi","tag-federal","tag-government","tag-matplotlib","tag-mplstyle","tag-murder","tag-offsetbox","tag-offsetimage","tag-pandas","tag-police","tag-politics","tag-python","tag-read_excel","tag-robbery","tag-statistics","tag-stats","tag-subplots","tag-ucr","tag-united-states","tag-violent-crime"],"_links":{"self":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1136","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/comments?post=1136"}],"version-history":[{"count":23,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1136\/revisions"}],"predecessor-version":[{"id":1817,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1136\/revisions\/1817"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media\/1158"}],"wp:attachment":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media?parent=1136"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/categories?post=1136"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/tags?post=113
6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}