{"id":1209,"date":"2024-05-02T07:00:30","date_gmt":"2024-05-02T12:00:30","guid":{"rendered":"https:\/\/wollen.org\/blog\/?p=1209"},"modified":"2025-03-04T18:59:51","modified_gmt":"2025-03-05T00:59:51","slug":"like-emo-music-bigfoot-peaked-in-the-mid-2000s","status":"publish","type":"post","link":"https:\/\/wollen.org\/blog\/2024\/05\/like-emo-music-bigfoot-peaked-in-the-mid-2000s\/","title":{"rendered":"Like emo music, Bigfoot peaked in the mid-2000s"},"content":{"rendered":"<p>In case it wasn&#8217;t clear from the picture, I&#8217;m not actually a Bigfoot enthusiast. My original plan was to plot election data but that felt a little too heavy. I need to pace myself if I&#8217;m going to make it another six months to election day.<\/p>\n<p>Fortunately, a lot of people out there <span style=\"text-decoration: underline;\">are<\/span> Bigfoot enthusiasts and they have more data than you&#8217;d ever hope to see. I&#8217;ll be using <a href=\"https:\/\/data.world\/timothyrenner\/bfro-sightings-data\" target=\"_blank\" rel=\"noopener\">this dataset<\/a>, which contains geolocated sightings compiled by the Bigfoot Field Research Organization (BFRO). According to the <a href=\"http:\/\/bfro.net\/REF\/aboutbfr.asp\" target=\"_blank\" rel=\"noopener\">BFRO<\/a>, they were founded in 1995 and are &#8220;widely considered the most credible and respected investigative network involved in the study of [Bigfoot].&#8221;<\/p>\n<p>Good enough for me.<\/p>\n<hr \/>\n<p>The plan is to make a scatter plot of all sightings since 1960 on a U.S. map, and a histogram below the map to display sightings per year. I want to go a little further and plot each year as its own image and turn the frames into an animated GIF. This will make it clear to viewers how Bigfoot grew in popularity, peaked in the mid-2000s, and has since gone out of style.<\/p>\n<h4>1. Prepare the data.<\/h4>\n<p>Begin by reading the dataset into a pandas DataFrame. 
Drop rows where date or location data is missing.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\nimport geopandas as gpd\r\nimport matplotlib.pyplot as plt\r\n\r\ndf = pd.read_csv(\"bfro_reports_geocoded.csv\", parse_dates=[\"date\"])\r\n\r\ndf.dropna(subset=[\"date\", \"latitude\", \"longitude\"], inplace=True)<\/pre>\n<p>The goal is to group the data into annual bins so let&#8217;s create a <em>year<\/em> column. The most recent full year in the dataset is 2022 so that will be the final frame of our visualization.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df.loc[:, 'year'] = df['date'].dt.strftime(\"%Y\").astype(int)\r\n\r\ndf = df[df['year'] &lt;= 2022]<\/pre>\n<p>It&#8217;s safe to eliminate most columns from the DataFrame. Redefine <code>df<\/code> and take a closer look.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df = df[[\"year\", \"date\", \"latitude\", \"longitude\"]]\r\n\r\nprint(df.head())\r\nprint(df.shape[0])<\/pre>\n<p>The output is below. You can see we have 4,102 clean rows and latitude\/longitude appear to be in the correct geographic region. In a moment we&#8217;ll turn this DataFrame into a scatter plot.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">   year       date  latitude  longitude\r\n1  2005 2005-12-03  37.58135  -81.29745\r\n2  2005 2005-10-08  43.46540  -72.70510\r\n3  1984 1984-04-08  37.22647  -81.09017\r\n4  1996 1996-12-22  32.79430  -95.54250\r\n7  1974 1974-09-20  41.45000  -71.50000\r\n4102<\/pre>\n<p>First let&#8217;s worry about the histogram. Count the number of sightings per year and store that information in a new DataFrame.<\/p>\n<p>Normally <code>value_counts()<\/code> returns a pandas Series object, i.e. a simple list of values. 
By including <code>reset_index()<\/code> we instead create a DataFrame with two data columns (year, count) and an index. Those column names assume pandas 2.0 or newer; older versions label the columns differently.<\/p>\n<p>Sightings occurring after 2022 are already filtered but let&#8217;s also exclude rows before 1960, our start date.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df_histogram = df['year'].value_counts().reset_index()\r\ndf_histogram = df_histogram[df_histogram['year'] &gt;= 1960]<\/pre>\n<p>We&#8217;ll use the <a href=\"https:\/\/geopandas.org\/en\/stable\/\" target=\"_blank\" rel=\"noopener\">geopandas<\/a> library to plot geospatial data. You can tell from the name it&#8217;s designed to work closely with pandas.<\/p>\n<p>GeoDataFrames (<code>gdf<\/code> below) are an analog to pandas DataFrames. You can manipulate columns the same way and even call a lot of the same methods. The important difference is that GeoDataFrames have a <em>geometry<\/em> column. This column holds the shape attached to each row (here, state boundaries) and is the key to translating tabular data into a map.<\/p>\n<p>Geographic data is often distributed as a shapefile. In this case we&#8217;re using a U.S. map shapefile which I&#8217;ll link at the bottom of this post.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">gdf = gpd.read_file(\"shapefile\/cb_2018_us_state_20m.shp\", epsg=4326)<\/pre>\n<p>Now we have everything we need to create the visualization:<\/p>\n<ol>\n<li>A DataFrame with Bigfoot sighting coordinates.<\/li>\n<li>A DataFrame with yearly counts for the histogram.<\/li>\n<li>A GeoDataFrame to create a U.S. map.<\/li>\n<\/ol>\n<hr \/>\n<h4>2. Plot the data.<\/h4>\n<p>The script will use a large <code>for<\/code> loop to iterate through each year from 1960 to 2022. I&#8217;ll do my best to make things clear as I break the code into pieces but, as always, the full code can be found at the bottom of this post.<\/p>\n<p>The first priority is to make a <code>copy<\/code> of the original sightings DataFrame. 
This step keeps each year&#8217;s frame isolated from the others.<\/p>\n<p>It&#8217;s tempting to write something like <code>df2 = df[df['year'] == 1960]<\/code>. But adding a column to a filtered slice like that can trigger pandas&#8217; <code>SettingWithCopyWarning<\/code>, because pandas can&#8217;t always tell whether the slice is a view of the original or a separate copy. It&#8217;s better to be safe and create an explicit <code>copy<\/code> when working in a loop like this.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    df2 = df.copy()<\/pre>\n<p>I want to add another dimension to the visualization and show previous years&#8217; sightings with reduced <code>alpha<\/code>. In other words, dots from the previous year will be visible but they will begin to fade out. Sightings from five years earlier will be faintly visible and after six years they will be gone completely.<\/p>\n<p>Filter the DataFrame so it contains data from the current year <strong>and<\/strong> the previous five years. Create a new <em>alpha<\/em> column which we can later pass into Matplotlib&#8217;s <code>scatter()<\/code>. The lambda function subtracts 0.18 for each year of age, so sightings from the current year will have alpha=1.0 while the faintest sightings from five years ago will have alpha=0.1 (1 - 5 * 0.18).<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    [...]\r\n    \r\n    df2 = df2[(n - 5 &lt;= df2['year']) &amp; (df2['year'] &lt;= n)]\r\n    \r\n    df2.loc[:, 'alpha'] = df2['year'].apply(lambda x: 1 - (n - x) * 0.18)<\/pre>\n<p>Now we can begin the familiar work of Matplotlib. You&#8217;ll see this figure has two subplots (<code>ax0<\/code> and <code>ax1<\/code>) of unequal size. The upper plot will be a U.S. 
map and it will be four times as tall as the histogram below it.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">color_primary = \"#E24A33\"\r\ncolor_secondary = \"#202020\"\r\n\r\nfor n in range(1960, 2023):\r\n\r\n    [...]\r\n\r\n    plt.style.use(\"bigfoot.mplstyle\")\r\n    fig, (ax0, ax1) = plt.subplots(2, 1, height_ratios=[4, 1])<\/pre>\n<p>First draw the U.S. map onto <code>ax0<\/code>. Just like when using pandas, you can call <code>.plot()<\/code> on a GeoDataFrame.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    [...]\r\n\r\n    gdf.plot(ax=ax0, facecolor=\"#fdf2d9\", edgecolor=\"black\", linewidth=0.5)<\/pre>\n<p>Then plot a <code>scatter<\/code> of sightings onto the map. Pass the <em>alpha<\/em> column to the <code>alpha<\/code> parameter. Passing one alpha value per point requires Matplotlib 3.4 or newer.<\/p>\n<p>Set the x- and y-axis limits, which are in degrees of longitude and latitude respectively.<\/p>\n<p>Use <code>text<\/code> to display the current year somewhere north of the Great Lakes. Each plot will be one frame of an animated GIF so it will help to clearly label each year.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    [...]\r\n\r\n    ax0.scatter(df2['longitude'], df2['latitude'],\r\n                color=color_primary, s=90, edgecolor=\"#444444\", linewidth=0.5, alpha=df2['alpha'])\r\n\r\n    ax0.set_xlim(-125.2, -65.7)\r\n    ax0.set_ylim(24, 50)\r\n\r\n    ax0.text(-79, 47, n, size=24, ha=\"center\")<\/pre>\n<p>That completes the upper subplot. Now we need to create a histogram on <code>ax1<\/code>.<\/p>\n<p>The plan is for most of the histogram&#8217;s bars to match the scatter, <code>color_primary<\/code>, but the current year&#8217;s bar will be set apart with <code>color_secondary<\/code>. 
Create a new column to hold color data and pass it when calling <code>bar()<\/code>.<\/p>\n<p>Again set the axis limits, this time in ordinary data units: years on the x-axis and sighting counts on the y-axis.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    [...]\r\n\r\n    df_histogram.loc[:, 'color'] = df_histogram['year'].apply(lambda x: color_secondary if x == n else color_primary)\r\n\r\n    ax1.bar(df_histogram['year'], df_histogram['count'], color=df_histogram['color'])\r\n\r\n    ax1.set_xticks(range(1960, 2030, 5))\r\n    ax1.set_xlim(1958.5, 2026.5)\r\n\r\n    ax1.set_yticks(range(0, 300, 50))<\/pre>\n<p>It&#8217;s usually best to call <code>set_axis_off()<\/code> when plotting maps. It hides borders, ticks, etc. surrounding the geospatial data.<\/p>\n<p>Save each year&#8217;s image with its own unique filename (the <code>frames<\/code> folder must already exist), then close the figure so that 63 open figures don&#8217;t pile up in memory. Remember we&#8217;re going to turn these frames into an animated GIF.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">for n in range(1960, 2023):\r\n\r\n    [...]\r\n\r\n    ax0.set_axis_off()\r\n\r\n    plt.savefig(f\"frames\/{n}.png\")\r\n\r\n    # Close the figure so open figures do not accumulate across iterations.\r\n    plt.close(fig)<\/pre>\n<hr \/>\n<h4>3. The output.<\/h4>\n<p>I should mention that Matplotlib <a href=\"https:\/\/matplotlib.org\/stable\/users\/explain\/animations\/animations.html\" target=\"_blank\" rel=\"noopener\">can create animations<\/a> natively but I find them to be a little unwieldy. And anyone who wanted to see the output would need their own Python environment, etc.<\/p>\n<p>For image editing I like to use <a href=\"https:\/\/www.gimp.org\/\" target=\"_blank\" rel=\"noopener\">GIMP<\/a>. It&#8217;s essentially a free but slightly less feature-rich version of Photoshop. GIMP is a great option if you&#8217;re like me and wouldn&#8217;t know what to do with 90% of Photoshop&#8217;s features anyway. 
I&#8217;ll skip the tutorial and get to the results:<\/p>\n<p><a href=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/bigfoot_output-1.gif\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2242 size-full\" src=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/bigfoot_output-1.gif\" alt=\"\" width=\"1300\" height=\"1000\" \/><\/a><\/p>\n<p>Notice how sightings appear and then fade over time. And the histogram&#8217;s highlighted bar moves with each passing year.<\/p>\n<p>The Pacific Northwest seems to have been Bigfoot&#8217;s preferred habitat from the start. But he also spends plenty of time in East Texas, Florida, and Appalachia.<\/p>\n<p>The famous <a href=\"https:\/\/en.wikipedia.org\/wiki\/Patterson%E2%80%93Gimlin_film\" target=\"_blank\" rel=\"noopener\">Patterson-Gimlin film<\/a> was taken in 1967. I chose to start the plot at 1960 so you can clearly see the ensuing rise in Bigfoot sightings.<\/p>\n<p style=\"font-size: 13px; text-align: center;\"><a href=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/patterson_bigfoot.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1235 size-medium\" style=\"margin-bottom: 0px; padding-bottom: 0px;\" src=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/patterson_bigfoot-300x172.jpg\" alt=\"\" width=\"300\" height=\"172\" srcset=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/patterson_bigfoot-300x172.jpg 300w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/patterson_bigfoot-768x441.jpg 768w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2024\/05\/patterson_bigfoot.jpg 994w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><em>The Patterson\u2013Gimlin film (1967).<\/em><\/p>\n<p>Reports then tailed off for another 15 years. My guess is that the 90s rise was related to the early internet&#8217;s growth. 
Online communities allowed people to come together, share stories, and create a sort of echo chamber. Not to mention the constant drumbeat of paranormal investigator shows on the History Channel, etc.<\/p>\n<p>Then, like emo music, Bigfoot sightings peaked in the mid-2000s. To be fair, I can only plot the data I have. Maybe folks today are seeing Bigfoot so often that they&#8217;ve gotten tired of filing reports.<\/p>\n<hr style=\"width: 50%;\" \/>\n<div style=\"margin-left: 12%; margin-right: 12%;\">\n<p><em>\u201cAnd you suspect what?\u201d Scully asked. \u201cBigfoot maybe?\u201d<\/em><\/p>\n<p><em>\u201cNot likely,\u201d Mulder answered deadpan. \u201cThat&#8217;s a lot of flannel to choke down. Even for Bigfoot.\u201d<\/em><\/p>\n<p><em>Scully sighed. She should have known better than to joke about Bigfoot to Mulder. Bigfoot wasn&#8217;t a joke to him.<\/em><\/p>\n<p>\u2014<strong>Darkness Falls (The X Files, No. 2)<\/strong><\/p>\n<\/div>\n<hr \/>\n<p><strong><a href=\"https:\/\/data.world\/timothyrenner\/bfro-sightings-data\" target=\"_blank\" rel=\"noopener\">Download the dataset.<\/a><\/strong><\/p>\n<p><strong><a href=\"https:\/\/wollen.org\/misc\/bigfoot_2024.zip\">Download shapefile &amp; mplstyle<\/a>.<\/strong><\/p>\n<p><strong>Full code:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\nimport geopandas as gpd\r\nimport matplotlib.pyplot as plt\r\n\r\n\r\ndf = pd.read_csv(\"bfro_reports_geocoded.csv\", parse_dates=[\"date\"])\r\n\r\ndf.dropna(subset=[\"date\", \"latitude\", \"longitude\"], inplace=True)\r\n\r\ndf.loc[:, 'year'] = df['date'].dt.strftime(\"%Y\").astype(int)\r\n\r\ndf = df[df['year'] &lt;= 2022]\r\n\r\ndf = df[[\"year\", \"date\", \"latitude\", \"longitude\"]]\r\n\r\ndf_histogram = df['year'].value_counts().reset_index()\r\n\r\ndf_histogram = df_histogram[df_histogram['year'] &gt;= 1960]\r\n\r\ngdf = gpd.read_file(\"shapefile\/cb_2018_us_state_20m.shp\", 
epsg=4326)\r\n\r\ncolor_primary = \"#E24A33\"\r\ncolor_secondary = \"#202020\"\r\n\r\nfor n in range(1960, 2023):\r\n\r\n    df2 = df.copy()\r\n    \r\n    df2 = df2[(n - 5 &lt;= df2['year']) &amp; (df2['year'] &lt;= n)]\r\n    \r\n    df2.loc[:, 'alpha'] = df2['year'].apply(lambda x: 1 - (n - x) * 0.18)\r\n\r\n    plt.style.use(\"bigfoot.mplstyle\")\r\n    fig, (ax0, ax1) = plt.subplots(2, 1, height_ratios=[4, 1])\r\n\r\n    gdf.plot(ax=ax0, facecolor=\"#fdf2d9\", edgecolor=\"black\", linewidth=0.5)\r\n\r\n    ax0.scatter(df2['longitude'], df2['latitude'],\r\n                color=color_primary, s=90, edgecolor=\"#444444\", linewidth=0.5, alpha=df2['alpha'])\r\n\r\n    ax0.set_xlim(-125.2, -65.7)\r\n    ax0.set_ylim(24, 50)\r\n\r\n    ax0.text(-79, 47, n, size=24, ha=\"center\")\r\n\r\n    df_histogram.loc[:, 'color'] = df_histogram['year'].apply(lambda x: color_secondary if x == n else color_primary)\r\n\r\n    ax1.bar(df_histogram['year'], df_histogram['count'], color=df_histogram['color'])\r\n\r\n    ax1.set_xticks(range(1960, 2030, 5))\r\n    ax1.set_xlim(1958.5, 2026.5)\r\n\r\n    ax1.set_yticks(range(0, 300, 50))\r\n\r\n    ax0.set_axis_off()\r\n\r\n    plt.savefig(f\"frames\/{n}.png\")\r\n\r\n    # Close the figure so open figures do not accumulate across iterations.\r\n    plt.close(fig)\r\n<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In case it wasn&#8217;t clear from the picture, I&#8217;m not actually a Bigfoot enthusiast. My original plan was to plot election data but that felt a little too heavy. 
I need to pace myself if I&#8217;m going to make it<\/p>\n","protected":false},"author":1,"featured_media":1605,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[19,279],"tags":[272,278,275,22,122,277,21,270,38,72,73,271,26,24,126,30,276,46,142,25,273,274],"class_list":["post-1209","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-maps","category-nature","tag-bigfoot","tag-coordinates","tag-cryptid","tag-data","tag-dataset","tag-geocoded","tag-geopandas","tag-geospatial","tag-histogram","tag-latitude","tag-longitude","tag-map","tag-maps","tag-matplotlib","tag-mplstyle","tag-pandas","tag-paranormal","tag-plot","tag-pyplot","tag-python","tag-sasquatch","tag-yeti"],"_links":{"self":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1209","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/comments?post=1209"}],"version-history":[{"count":43,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1209\/revisions"}],"predecessor-version":[{"id":2243,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/1209\/revisions\/2243"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media\/1605"}],"wp:attachment":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media?parent=1209"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/categories?post=1209"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/tags?post=1209"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","te
mplated":true}]}}