{"id":2260,"date":"2025-05-01T07:00:52","date_gmt":"2025-05-01T12:00:52","guid":{"rendered":"https:\/\/wollen.org\/blog\/?p=2260"},"modified":"2025-05-31T21:04:44","modified_gmt":"2025-06-01T02:04:44","slug":"partisan-growing-pains","status":"publish","type":"post","link":"https:\/\/wollen.org\/blog\/2025\/05\/partisan-growing-pains\/","title":{"rendered":"Partisan growing pains"},"content":{"rendered":"<p>This post will dive a little deeper into American politics. I try to avoid it unless there&#8217;s something genuinely interesting to say about the data. In this case, I think there is!<\/p>\n<p>We&#8217;ll look at the last decade of county-level population growth and how it correlates with 2024 election results. Do Democrats have a growth problem?<\/p>\n<p>This analysis is interesting because it&#8217;s more forward-looking. We&#8217;ll run a regression on 2024 results and then think about how demographic trends <em>could<\/em> affect future elections. I&#8217;d like to register my thoughts before we get too far removed from 2024.<\/p>\n<hr \/>\n<h4>1. Prepare the data.<\/h4>\n<p>We have three spreadsheets\u2014one for election results and two for census data. Everything is reported at the county level. We&#8217;re calculating population change from 2013 to 2023 (the most recently available year) so there are two Excel files from the Census Bureau. Everything is linked at the bottom of this post.<\/p>\n<p>The plan is to create a column common to all three tables and then merge them into a single pandas DataFrame. From there, we&#8217;ll do a <a href=\"https:\/\/online.stat.psu.edu\/stat462\/node\/91\/\" target=\"_blank\" rel=\"noopener\">linear regression<\/a> to see how strongly population growth is correlated with vote.<\/p>\n<p>For the most part, county names are clean and conveniently match across the datasets. Election and map nerds tend to follow Census Bureau naming conventions. 
But we do have a few edge cases to deal with:<\/p>\n<ul>\n<li>The District of Columbia doesn&#8217;t have counties.<\/li>\n<li>Alaska doesn&#8217;t have counties either. They have boroughs but they report election results at the state legislative district level.<\/li>\n<li>Connecticut is complicated. They recently switched to &#8220;planning regions,&#8221; which don&#8217;t match the previously reported boundaries, so we can&#8217;t accurately measure change over time.<\/li>\n<li>Kalawao County, Hawaii exists in census files but the state doesn&#8217;t report election results for it.<\/li>\n<\/ul>\n<p>We&#8217;ll simply filter out Alaska, Connecticut, D.C., and Kalawao County and move ahead.<\/p>\n<p>We have to read two census Excel files so let&#8217;s write a function. There&#8217;s nothing complicated here so I won&#8217;t waste too many words going over it.<\/p>\n<p>First exclude a few rows at the top and bottom of the spreadsheet and rename columns. Then we can parse county names, which are formatted as <code>.[County], [State]<\/code>. Our code separates county from state and then adds a new column, <em>id<\/em>, formatted <code>[State]_[County]<\/code>. 
We&#8217;ll create this column in all three files and use it to merge the data.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\n\r\n\r\ndef get_census_df(filename, new_columns):\r\n    df = pd.read_excel(filename)\r\n\r\n    df = df[4:-6]\r\n\r\n    df.columns = new_columns\r\n\r\n    df[['county', 'state']] = df['census_name'].str.split(\", \", expand=True)\r\n\r\n    df.loc[:, 'county'] = df['county'].apply(lambda x: x[1:-7])\r\n\r\n    df.loc[:, 'id'] = df['state'] + \"_\" + df['county']\r\n\r\n    df = df[~df['state'].isin([\"Alaska\", \"Connecticut\", \"District of Columbia\"])]\r\n\r\n    df = df[df['census_name'] != \".Kalawao County, Hawaii\"]\r\n\r\n    return df\r\n\r\n\r\ndf_2013 = get_census_df(filename=\"co-est2019-annres.xlsx\",\r\n                        new_columns=['census_name', 'census', 'base_estimate',\r\n                                     'pop2010', 'pop2011', 'pop2012', 'pop2013', 'pop2014',\r\n                                     'pop2015', 'pop2016', 'pop2017', 'pop2018', 'pop2019'])<\/pre>\n<p>The Census Bureau files have yearly population estimates, despite canvassing once per decade. We only care about 2013 so we can filter out the extra columns.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df_2013 = df_2013[['id', 'pop2013']]<\/pre>\n<p><code>df_2013.head()<\/code> looks like this:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">                id   pop2013\r\n4  Alabama_Autauga   54727.0\r\n5  Alabama_Baldwin  194885.0\r\n6  Alabama_Barbour   26937.0\r\n7     Alabama_Bibb   22521.0\r\n8   Alabama_Blount   57619.0<\/pre>\n<p>Repeat the process for the 2023 census file. 
The only differences are column names.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df_2023 = get_census_df(filename=\"co-est2023-pop.xlsx\",\r\n                        new_columns=['census_name', 'base_estimate', 'pop2020', 'pop2021', 'pop2022', 'pop2023'])\r\n\r\ndf_2023 = df_2023[['id', 'pop2023']]<\/pre>\n<p>We have two DataFrames, one each for 2013 and 2023 census data. Now let&#8217;s read the 2024 election results file.<\/p>\n<p>Like before, we need to filter out Alaska, Connecticut, and D.C. We also need to parse county names so they have an identical <em>id<\/em> column (<code>[State]_[County]<\/code>). The <em>per_point_diff<\/em> column measures percentage point difference between Republican and Democratic votes. It will eventually be our dependent variable along the y-axis.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df_elx = pd.read_csv(\"2024_US_County_Level_Presidential_Results.csv\")\r\n\r\ndf_elx = df_elx[~df_elx['state_name'].isin([\"Alaska\", \"Connecticut\", \"District of Columbia\"])]\r\n\r\ndf_elx.loc[:, 'county'] = df_elx['county_name'].apply(lambda x: x[:-7])\r\n\r\ndf_elx.loc[:, 'id'] = df_elx['state_name'] + \"_\" + df_elx['county']\r\n\r\ndf_elx = df_elx[['id', 'per_point_diff']]<\/pre>\n<p><code>df_elx.head()<\/code> is shown below. A positive <em>per_point_diff<\/em> indicates a Republican win.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">                id  per_point_diff\r\n0  Alabama_Autauga        0.462753\r\n1  Alabama_Baldwin        0.581768\r\n2  Alabama_Barbour        0.147274\r\n3     Alabama_Bibb        0.644194\r\n4   Alabama_Blount        0.810173<\/pre>\n<p>At this point, we have three DataFrames, each with 3,103 rows and identical counties in the <em>id<\/em> column. It&#8217;s time to merge them into a single DataFrame.<\/p>\n<p><code>pandas.merge<\/code> won&#8217;t accept a list of DataFrames, but you can easily chain together method calls. 
Specify that the common column is <em>id<\/em> and assign the output to a new variable, <code>df<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df = df_2013.merge(df_2023, on=\"id\").merge(df_elx, on=\"id\")<\/pre>\n<p><code>df.head()<\/code> looks like this:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">                id   pop2013   pop2023  per_point_diff\r\n0  Alabama_Autauga   54727.0   60342.0        0.462753\r\n1  Alabama_Baldwin  194885.0  253507.0        0.581768\r\n2  Alabama_Barbour   26937.0   24585.0        0.147274\r\n3     Alabama_Bibb   22521.0   21868.0        0.644194\r\n4   Alabama_Blount   57619.0   59816.0        0.810173<\/pre>\n<p>We can almost do a regression now. Our independent &#8220;predictor&#8221; variable is population <em>growth<\/em> so first we need to calculate percent change. Go ahead and re-order the columns so they&#8217;re easier to read.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df.loc[:, 'pct_change'] = (df['pop2023'] - df['pop2013']) \/ df['pop2013']\r\n\r\ndf = df[['id', 'pop2013', 'pop2023', 'pct_change', 'per_point_diff']]<\/pre>\n<p>Here&#8217;s one more look at <code>df.head()<\/code>.<\/p>\n<ul>\n<li>x variable: <em>pct_change<\/em> of population<\/li>\n<li>y variable: <em>per_point_diff<\/em> of 2024 vote.<\/li>\n<\/ul>\n<p>The first row is saying (roughly) that Autauga County, Alabama grew from 55K people to 60K people, which was a 10% change. They voted for Trump by a 46-point margin. 
The third and fourth rows show counties that declined in population.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">                id   pop2013   pop2023  pct_change  per_point_diff\r\n0  Alabama_Autauga   54727.0   60342.0    0.102600        0.462753\r\n1  Alabama_Baldwin  194885.0  253507.0    0.300803        0.581768\r\n2  Alabama_Barbour   26937.0   24585.0   -0.087315        0.147274\r\n3     Alabama_Bibb   22521.0   21868.0   -0.028995        0.644194\r\n4   Alabama_Blount   57619.0   59816.0    0.038130        0.810173<\/pre>\n<p>We&#8217;re analyzing percent change of population so each county&#8217;s size, in absolute terms, doesn&#8217;t change the math. But it will improve our scatter plot to let dot size represent population. Larger dots mean larger counties. Let&#8217;s create a column for <code>marker_size<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df.loc[:, 'marker_size'] = df['pop2023'] * 0.00015<\/pre>\n<p>We could go straight to the regression (and I would start there if this wasn&#8217;t a blog post) but the effect is more interesting if we restrict our analysis to the top 10% of counties by population. Collectively, they account for two-thirds of the national vote. What happens there drives the overall trend.<\/p>\n<p>We have 3,103 rows to start. Sort by 2023 population and truncate the DataFrame to 310 rows. That will leave us with the 10% most populous counties.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">df = df.sort_values(\"pop2023\", ascending=False)[:310]<\/pre>\n<p>Now we can pass the columns to <code>scipy.stats.linregress<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">from scipy.stats import linregress\r\n\r\nslope, intercept, r_value, p_value, std_err = linregress(df['pct_change'], df['per_point_diff'])<\/pre>\n<p>We&#8217;ll want to plot the regression line on top of the scatter plot. 
<code>slope<\/code> and <code>intercept<\/code> correspond to <code>m<\/code> and <code>b<\/code> in the linear equation <code>y=mx+b<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">x_reg = [df['pct_change'].min(), df['pct_change'].max()]\r\ny_reg = [n * slope + intercept for n in x_reg]<\/pre>\n<hr \/>\n<h4>2. Plot the data.<\/h4>\n<p>I&#8217;ll use a custom Matplotlib style that will be linked at the bottom of this post. Start by creating an Axes instance for plotting (<code>ax<\/code>).<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import matplotlib.pyplot as plt\r\n\r\nplt.style.use(\"wollen_election.mplstyle\")\r\n\r\nfig, ax = plt.subplots()<\/pre>\n<p>Create a scatter plot of population growth and 2024 vote.<\/p>\n<p>I&#8217;m using a purple color to avoid blue\/red partisan association. The size of each dot is defined by the <em>marker_size<\/em> column we created above. I&#8217;m giving the dots some transparency (<code>alpha<\/code>) because there will be a lot of overlap. Set <code>zorder<\/code> to 2 because we&#8217;ll create multiple layers on the figure.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.scatter(x=df['pct_change'],\r\n           y=df['per_point_diff'],\r\n           color=\"#9885BF\",\r\n           s=df['marker_size'],\r\n           edgecolor=\"#333\",\r\n           linewidth=0.5,\r\n           alpha=0.7,\r\n           zorder=2)<\/pre>\n<p>Next is the regression line. This is a simple call to <code>ax.plot()<\/code>. #333 is a dark gray color. <code>zorder=3<\/code> puts this line directly on top of the scatter markers.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.plot(x_reg, y_reg, color=\"#333\", zorder=3)<\/pre>\n<p>I like to include coordinate axes lines whenever possible. They make it easy to see what values are positive and negative. 
<code>zorder=1<\/code> means the lines are drawn underneath the data.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.plot([-2, 2], [0, 0], color=\"#333\", linewidth=0.5, zorder=1)\r\nax.plot([0, 0], [-2, 2], color=\"#333\", linewidth=0.5, zorder=1)<\/pre>\n<p>Often I&#8217;ll use <code>numpy.arange<\/code> or <code>numpy.linspace<\/code> to get a list of evenly spaced float values. But due to <a href=\"https:\/\/www.cs.drexel.edu\/~popyack\/Courses\/CSP\/Fa17\/extras\/Rounding\/index.html\" target=\"_blank\" rel=\"noopener\">round-off errors<\/a>, they can introduce bugs when using the <code>==<\/code> operator. For example, <code>arange<\/code> might give me 0.00002 instead of 0.0, and Python will tell me the value doesn&#8217;t equal zero.<\/p>\n<p>Along the horizontal axis, I want x-ticks to be a list of evenly spaced floats from -0.2 to 0.6. I&#8217;ll use a list comprehension instead of numpy. I want label strings to be the tick values multiplied by 100, so they&#8217;re displayed as signed percentages from -20% to +60%, with a plain 0 at the origin.<\/p>\n<p>Set <code>xlim<\/code> a little beyond the outermost ticks, and keep <code>x_tick_range<\/code> in a variable because we&#8217;ll use it later to position text on the plot.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">x_ticks = [n \/ 10 for n in range(-2, 7)]\r\nax.set_xticks(x_ticks, labels=[f\"{n * 100:+.0f}%\" if n != 0 else \"0\" for n in x_ticks])\r\nx_tick_range = x_ticks[-1] - x_ticks[0]\r\nx_left, x_right = x_ticks[0] - x_tick_range * 0.03, x_ticks[-1] + x_tick_range * 0.03\r\nax.set_xlim(x_left, x_right)<\/pre>\n<p>The y-axis is similar but these ticks represent a partisan margin from D+80 to R+60. 
I&#8217;ve written a function to convert <em>per_point_diff<\/em> values to D-R margins.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">def get_partisan_ticks(ticks):\r\n    new_ticks = []\r\n\r\n    for tick in ticks:\r\n        if tick &gt; 0:\r\n            new_ticks.append(f\"R+{tick * 100:.0f}\")\r\n        elif tick &lt; 0:\r\n            new_ticks.append(f\"D+{abs(tick) * 100:.0f}\")\r\n        else:\r\n            new_ticks.append(\"TIE\")\r\n\r\n    return new_ticks\r\n\r\n\r\ny_ticks = [n \/ 10 for n in range(-8, 8, 2)]\r\nax.set_yticks(y_ticks, labels=get_partisan_ticks(y_ticks))\r\ny_tick_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_tick_range * 0.01, y_ticks[-1] + y_tick_range * 0.01\r\nax.set_ylim(y_bottom, y_top)<\/pre>\n<p>Let&#8217;s make a note in the lower-left corner that Alaska, Connecticut, and D.C. are excluded from the data. I like to locate text just inside the outermost grid lines.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.text(x=x_ticks[0] + x_tick_range * 0.004,\r\n        y=y_ticks[0] + y_tick_range * 0.002,\r\n        s=\"AK, CT, DC excluded.\",\r\n        ha=\"left\",\r\n        va=\"bottom\")<\/pre>\n<p>In the lower-right corner, cite the data sources.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.text(x=x_ticks[-1] - x_tick_range * 0.004,\r\n        y=y_ticks[0] + y_tick_range * 0.002,\r\n        s=\"Election data: github.com\/tonmcg.\\nPopulation data: US Census Bureau.\",\r\n        ha=\"right\",\r\n        va=\"bottom\")<\/pre>\n<p>Let&#8217;s include the value of R\u00b2 just above the regression line. R\u00b2 measures how tightly the data points fit the line. 
(In politics, it&#8217;s usually not a high number.)<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.text(x=x_reg[-1],\r\n        y=y_reg[-1] + y_tick_range * 0.01,\r\n        s=f\"R\u00b2 = {r_value ** 2:.2f}\",\r\n        size=11,\r\n        ha=\"right\",\r\n        va=\"bottom\")<\/pre>\n<p>Finally, set a title and save the figure. Variables are defined explicitly enough in the title that we can skip axes labels.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">ax.set_title(\"Largest 10% of Counties  \u2022  Population Change 2013 to 2023  \u2022  2024 POTUS Results\")\r\n\r\nplt.savefig(\"county_population_reg.png\", dpi=200)<\/pre>\n<hr \/>\n<h4>3. The output.<\/h4>\n<p><a href=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2802 size-full\" src=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1.png\" alt=\"\" width=\"1900\" height=\"1900\" srcset=\"https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1.png 1900w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-300x300.png 300w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-1024x1024.png 1024w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-150x150.png 150w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-768x768.png 768w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-1536x1536.png 1536w, https:\/\/wollen.org\/blog\/wp-content\/uploads\/2025\/05\/county_population_reg-1-800x800.png 800w\" sizes=\"auto, (max-width: 1900px) 100vw, 1900px\" \/><\/a><\/p>\n<p>It&#8217;s a questionable linear model! Residuals in the upper half of the range tend to form a J-shape, which is a sign of non-linearity. That&#8217;s okay. 
It tells us something about the data.<\/p>\n<p>Model aside, there is clearly <em>some<\/em> positive relationship between county population growth and partisan preference. Notice how many dots are in the upper-right compared to the lower-right. A lot of moderately large, fast-growing counties are very Republican. Remember, we&#8217;re plotting the top 10% of counties by population. Even the small dots represent large groups of people.<\/p>\n<p><strong>Of these 310 large counties, 34 grew at least 25%. Only two of the 34 voted for Harris.<\/strong> If I had to boil this post down to a single fact, that would be it.<\/p>\n<hr style=\"margin-left: 20%; margin-right: 20%;\" \/>\n<p>The pattern is consistent with Republican voters having more kids. And they are, according to <a href=\"https:\/\/ifstudies.org\/blog\/the-trump-bump-the-republican-fertility-advantage-in-2024\" target=\"_blank\" rel=\"noopener\">this analysis<\/a> of CDC fertility data. It&#8217;s a profound long-run advantage for Republicans because people <a href=\"https:\/\/www.pewresearch.org\/short-reads\/2023\/05\/10\/most-us-parents-pass-along-their-religion-and-politics-to-their-children\/\" target=\"_blank\" rel=\"noopener\">very often inherit<\/a> their political identity from parents.<\/p>\n<p>To be clear, this approach doesn&#8217;t distinguish between natural population growth and inbound migration. Some people relocate, at least in part, for <a href=\"https:\/\/www.npr.org\/2022\/02\/18\/1081295373\/the-big-sort-americans-move-to-areas-political-alignment\" target=\"_blank\" rel=\"noopener\">partisan reasons<\/a>. That makes it more difficult to draw conclusions from the relationship modeled above. If a Pennsylvania Republican moves to a fast-growing, Trump-loving Texas county, it&#8217;s probably not helpful to the party overall.<\/p>\n<p>Regardless, these trends should give Democrats pause. 
Reapportionment after the 2020 census weakened the <em>Blue Wall<\/em> strategy (Pennsylvania, Michigan, and Wisconsin). Reapportionment after the 2030 census is <a href=\"https:\/\/thearp.org\/blog\/apportionment\/2030-apportionment-forecast-2024\/\" target=\"_blank\" rel=\"noopener\">set to take more<\/a> electoral votes from blue states. And the Electoral College isn&#8217;t the only 2030 concern. Republicans will benefit in the House as well, thanks in large part to <a href=\"https:\/\/www.bushcenter.org\/publications\/america-keeps-moving-to-high-opportunity-cities-in-the-sun-belt-new-census-data-confirms\" target=\"_blank\" rel=\"noopener\">surging migration<\/a> to affordable Sun Belt metros.<\/p>\n<hr style=\"margin-left: 20%; margin-right: 20%;\" \/>\n<p>Democrats&#8217; smaller margins in population centers aren&#8217;t necessarily cause for alarm. The party is increasingly targeting marginal battleground states in both policy and ad spending. Harris lost but she <a href=\"https:\/\/nymag.com\/intelligencer\/article\/harris-campaign-battleground-states-electoral-college.html\" target=\"_blank\" rel=\"noopener\">performed better<\/a> relative to baseline in actively contested states. In other words, (1) campaigning works, and (2) Democratic votes are now more efficiently distributed.<\/p>\n<p>Trump&#8217;s gains in high-population blue states almost completely erased the Republican Electoral College advantage in 2024. The delta between national popular vote and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Tipping-point_state\" target=\"_blank\" rel=\"noopener\">tipping-point state<\/a> vote hasn&#8217;t been this small since the 1980s:<\/p>\n<ul>\n<li>2016, R+2.8<\/li>\n<li>2020, R+3.8<\/li>\n<li>2024, R+0.2<\/li>\n<\/ul>\n<hr style=\"margin-left: 20%; margin-right: 20%;\" \/>\n<p>Neither party should panic. The major party voting coalitions are always churning, recomposing, and evolving. 
Our modern presidential elections are extremely competitive by historical standards.<\/p>\n<p>Republicans have done well to build a more <a href=\"https:\/\/datawrapper.dwcdn.net\/TctkT\/full.png\" target=\"_blank\" rel=\"noopener\">multiracial coalition<\/a>. They&#8217;ve made significant progress in deep blue states that previously felt unwinnable. And the fastest-growing population centers are solidly Republican, as we showed above. But so far, these gains have been inefficient with respect to the Electoral College.<\/p>\n<p>Democrats have benefited from <a href=\"https:\/\/www.slowboring.com\/p\/the-upside-of-education-polarization\" target=\"_blank\" rel=\"noopener\">education polarization<\/a>, especially in midterm and special elections. Their voters are more likely to show up. They&#8217;ve also made gains in suburban neighborhoods, which are the sweet spot for making your vote count. But those advantages are less helpful in presidential elections where turnout is higher across the board.<\/p>\n<p>Democrats should be concerned about long-term trends but, by definition, there is plenty of time to adapt. <a href=\"https:\/\/abcnews.go.com\/538\/democrats-incumbent-parties-lost-elections-world\/story?id=115972068\" target=\"_blank\" rel=\"noopener\">Despite the headwinds<\/a> of 20% cumulative inflation during Biden&#8217;s tenure, 2024 was a competitive election and Trump fell short of winning 50% of the vote.<\/p>\n<p>Arguably, Democrats are better positioned to make gains in 2028 than Republicans. The current Republican coalition is built on activating <a href=\"https:\/\/www.npr.org\/2024\/11\/18\/nx-s1-5183063\/trump-turnout-republican-voting-access\" target=\"_blank\" rel=\"noopener\">low-propensity voters<\/a>. 
It remains to be seen if they can hold it together without Trump leading the party.<!-- HFCM by 99 Robots - Snippet # 15: endmark-python -->\n<span class=\"endmark-python\"><\/span>\n<!-- \/end HFCM by 99 Robots -->\n<\/p>\n<hr \/>\n<p><strong><a href=\"https:\/\/github.com\/tonmcg\/US_County_Level_Election_Results_08-24\" target=\"_blank\" rel=\"noopener\">Election results (github.com\/tonmcg).<\/a><\/strong><\/p>\n<p><strong><a href=\"https:\/\/www.census.gov\/data\/tables\/time-series\/demo\/popest\/2010s-counties-total.html\" target=\"_blank\" rel=\"noopener\">2013 population data (US Census Bureau).<\/a><\/strong><\/p>\n<p><strong><a href=\"https:\/\/www.census.gov\/data\/tables\/time-series\/demo\/popest\/2020s-counties-total.html\" target=\"_blank\" rel=\"noopener\">2023 population data (US Census Bureau).<\/a><\/strong><\/p>\n<p><a href=\"https:\/\/wollen.org\/misc\/county_potus_scatter_2025.zip\"><strong>Download the Matplotlib style.<\/strong><\/a><\/p>\n<p><strong>Full code:<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\">import pandas as pd\r\nfrom scipy.stats import linregress\r\nimport matplotlib.pyplot as plt\r\n\r\n\r\ndef get_census_df(filename, new_columns):\r\n    df = pd.read_excel(filename)\r\n\r\n    df = df[4:-6]\r\n\r\n    df.columns = new_columns\r\n\r\n    df[['county', 'state']] = df['census_name'].str.split(\", \", expand=True)\r\n\r\n    df.loc[:, 'county'] = df['county'].apply(lambda x: x[1:-7])\r\n\r\n    df.loc[:, 'id'] = df['state'] + \"_\" + df['county']\r\n\r\n    df = df[~df['state'].isin([\"Alaska\", \"Connecticut\", \"District of Columbia\"])]\r\n\r\n    df = df[df['census_name'] != \".Kalawao County, Hawaii\"]\r\n\r\n    return df\r\n\r\n\r\ndef get_partisan_ticks(ticks):\r\n    new_ticks = []\r\n\r\n    for tick in ticks:\r\n        if tick &gt; 0:\r\n            new_ticks.append(f\"R+{tick * 100:.0f}\")\r\n        elif tick &lt; 0:\r\n            new_ticks.append(f\"D+{abs(tick) * 
100:.0f}\")\r\n        else:\r\n            new_ticks.append(\"TIE\")\r\n\r\n    return new_ticks\r\n\r\n\r\npd.set_option(\"display.expand_frame_repr\", False)\r\n\r\ndf_2013 = get_census_df(filename=\"co-est2019-annres.xlsx\",\r\n                        new_columns=['census_name', 'census', 'base_estimate',\r\n                                     'pop2010', 'pop2011', 'pop2012', 'pop2013', 'pop2014',\r\n                                     'pop2015', 'pop2016', 'pop2017', 'pop2018', 'pop2019'])\r\n\r\ndf_2013 = df_2013[['id', 'pop2013']]\r\n\r\ndf_2023 = get_census_df(filename=\"co-est2023-pop.xlsx\",\r\n                        new_columns=['census_name', 'base_estimate', 'pop2020', 'pop2021', 'pop2022', 'pop2023'])\r\n\r\ndf_2023 = df_2023[['id', 'pop2023']]\r\n\r\ndf_elx = pd.read_csv(\"2024_US_County_Level_Presidential_Results.csv\")\r\n\r\ndf_elx = df_elx[~df_elx['state_name'].isin([\"Alaska\", \"Connecticut\", \"District of Columbia\"])]\r\n\r\ndf_elx.loc[:, 'county'] = df_elx['county_name'].apply(lambda x: x[:-7])\r\n\r\ndf_elx.loc[:, 'id'] = df_elx['state_name'] + \"_\" + df_elx['county']\r\n\r\ndf_elx = df_elx[['id', 'per_point_diff']]\r\n\r\ndf = df_2013.merge(df_2023, on=\"id\").merge(df_elx, on=\"id\")\r\n\r\ndf.loc[:, 'pct_change'] = (df['pop2023'] - df['pop2013']) \/ df['pop2013']\r\n\r\ndf = df[['id', 'pop2013', 'pop2023', 'pct_change', 'per_point_diff']]\r\n\r\ndf.loc[:, 'marker_size'] = df['pop2023'] * 0.00015\r\n\r\ndf = df.sort_values(\"pop2023\", ascending=False)[:310]\r\n\r\nslope, intercept, r_value, p_value, std_err = linregress(df['pct_change'], df['per_point_diff'])\r\n\r\nx_reg = [df['pct_change'].min(), df['pct_change'].max()]\r\ny_reg = [n * slope + intercept for n in x_reg]\r\n\r\nplt.style.use(\"wollen_election.mplstyle\")\r\n\r\nfig, ax = plt.subplots()\r\n\r\nax.scatter(x=df['pct_change'],\r\n           y=df['per_point_diff'],\r\n           color=\"#9885BF\",\r\n           s=df['marker_size'],\r\n           edgecolor=\"#333\",\r\n   
        linewidth=0.5,\r\n           alpha=0.7,\r\n           zorder=2)\r\n\r\nax.plot(x_reg, y_reg, color=\"#333\", zorder=3)\r\n\r\nax.plot([-2, 2], [0, 0], color=\"#333\", linewidth=0.5, zorder=1)\r\nax.plot([0, 0], [-2, 2], color=\"#333\", linewidth=0.5, zorder=1)\r\n\r\nx_ticks = [n \/ 10 for n in range(-2, 7)]\r\nax.set_xticks(x_ticks, labels=[f\"{n * 100:+.0f}%\" if n != 0 else \"0\" for n in x_ticks])\r\nx_tick_range = x_ticks[-1] - x_ticks[0]\r\nx_left, x_right = x_ticks[0] - x_tick_range * 0.03, x_ticks[-1] + x_tick_range * 0.03\r\nax.set_xlim(x_left, x_right)\r\n\r\ny_ticks = [n \/ 10 for n in range(-8, 8, 2)]\r\nax.set_yticks(y_ticks, labels=get_partisan_ticks(y_ticks))\r\ny_tick_range = y_ticks[-1] - y_ticks[0]\r\ny_bottom, y_top = y_ticks[0] - y_tick_range * 0.01, y_ticks[-1] + y_tick_range * 0.01\r\nax.set_ylim(y_bottom, y_top)\r\n\r\nax.text(x=x_ticks[0] + x_tick_range * 0.004,\r\n        y=y_ticks[0] + y_tick_range * 0.002,\r\n        s=\"AK, CT, DC excluded.\",\r\n        ha=\"left\",\r\n        va=\"bottom\")\r\n\r\nax.text(x=x_ticks[-1] - x_tick_range * 0.004,\r\n        y=y_ticks[0] + y_tick_range * 0.002,\r\n        s=\"Election data: github.com\/tonmcg.\\nPopulation data: US Census Bureau.\",\r\n        ha=\"right\",\r\n        va=\"bottom\")\r\n\r\nax.text(x=x_reg[-1],\r\n        y=y_reg[-1] + y_tick_range * 0.01,\r\n        s=f\"R\u00b2 = {r_value ** 2:.2f}\",\r\n        size=11,\r\n        ha=\"right\",\r\n        va=\"bottom\")\r\n\r\nax.set_title(\"Largest 10% of Counties  \u2022  Population Change 2013 to 2023  \u2022  2024 POTUS Results\")\r\n\r\nplt.savefig(\"county_population_reg.png\", dpi=200)<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post will dive a little deeper into American politics. I try to avoid it unless there&#8217;s something genuinely interesting to say about the data. In this case, I think there is! 
We&#8217;ll look at the last decade of county-level<\/p>\n","protected":false},"author":1,"featured_media":2277,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[238,469],"tags":[468,23,629,22,334,122,189,185,467,631,628,24,407,632,630,465,30,46,187,31,25,225,254,117,190,627,626,201,63,116,395,625],"class_list":["post-2260","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-government","category-stats","tag-468","tag-census","tag-curvilinear","tag-data","tag-dataframe","tag-dataset","tag-democrat","tag-election","tag-harris","tag-j-shape","tag-linear","tag-matplotlib","tag-model","tag-non-linearity","tag-nonlinear","tag-ols","tag-pandas","tag-plot","tag-politics","tag-population","tag-python","tag-read_csv","tag-read_excel","tag-regression","tag-republican","tag-residual","tag-residuals","tag-scatter","tag-statistics","tag-stats","tag-trump","tag-variance"],"_links":{"self":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/2260","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/comments?post=2260"}],"version-history":[{"count":41,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/2260\/revisions"}],"predecessor-version":[{"id":3139,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/posts\/2260\/revisions\/3139"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media\/2277"}],"wp:attachment":[{"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/media?parent=2260"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\
/categories?post=2260"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wollen.org\/blog\/wp-json\/wp\/v2\/tags?post=2260"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}