The critically endangered split-ticket voter
Split-ticket voting is when you select candidates of different political parties on the same ballot. You might vote for a Democrat for state legislature and a Republican for mayor. That’s in contrast to straight-ticket voting, which of course means picking one party straight down the ballot.
In today’s world, where all politics are national, split-ticket voting usually refers to splitting party loyalty between president and Congress. It was much more common in the 20th century, beginning in the years following WWII and peaking in the 70s and 80s.
No one agrees precisely on why ticket splitting has trended steadily downward. Most cite political polarization as being at least partially responsible. We tend to be stronger partisans and less likely to divide our loyalty. There’s also the aging out of conservative Southern Democrats who were at the center of Civil Rights era party realignment.
Whatever the underlying causes, it’s now much more difficult for a Democratic congressional candidate to win during a Republican presidential victory, and vice-versa. Let’s take a look at the data and see exactly how rare ticket splitting has become.
1. Prepare the data.
I grabbed House election results from here and scraped district-level presidential results from here. The files I used will be linked at the bottom of this post.
The first row of the House CSV looks like this:
year,district,rep_party,winner
1976,AL-01,R,JACK EDWARDS
We’re counting split tickets during presidential election years, so let’s filter out midterm results using the % operator. Since this DataFrame will be merged with a presidential results DataFrame, create an id column that will be common to both.
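import pandas as pd

df_house = pd.read_csv("house_winners_1976-2020.csv")

# Keep presidential election years only (years divisible by 4).
# .copy() makes the result an independent DataFrame, which avoids
# pandas' SettingWithCopyWarning on the assignment below.
df_house = df_house[df_house['year'] % 4 == 0].copy()

# Merge key, e.g. "1976_AL-01".
df_house.loc[:, 'id'] = df_house['year'].astype(str) + "_" + df_house['district']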
Now df_house.head() looks like this:
   year  district  rep_party                        winner          id
0  1976     AL-01          R                  JACK EDWARDS  1976_AL-01
1  1976     AL-02          R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02
2  1976     AL-03          D                  BILL NICHOLS  1976_AL-03
3  1976     AL-04          D                    TOM BEVILL  1976_AL-04
4  1976     AL-05          D               RONNIE G FLIPPO  1976_AL-05
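To see the filter and the key in isolation, here is a minimal sketch on a toy DataFrame. The rows are made up for illustration; only the column names (year, district, id) come from the real file.

```python
import pandas as pd

# Hypothetical toy rows, not taken from the real CSV.
toy = pd.DataFrame({
    "year": [1976, 1978, 1980, 2018, 2020],
    "district": ["AL-01", "AL-01", "AL-01", "AL-01", "AL-01"],
})

# Presidential elections fall in years divisible by 4,
# so the remainder of year / 4 is 0 for the rows we want.
is_presidential = toy["year"] % 4 == 0
toy_pres = toy[is_presidential].copy()

# Build the "<year>_<district>" key used later for the merge.
toy_pres["id"] = toy_pres["year"].astype(str) + "_" + toy_pres["district"]

print(toy_pres["id"].tolist())
# ['1976_AL-01', '1980_AL-01', '2020_AL-01']
```

The midterm years 1978 and 2018 leave a remainder of 2, so they drop out.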
We can follow essentially the same process for the presidential election dataset, save for filtering Washington DC. DC residents can vote for president but they don’t yet have full representation in Congress.
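df_pres = pd.read_csv("pres_winners_1976-2020.csv")

# Drop Washington DC, which has no voting House member to compare against.
df_pres = df_pres[df_pres['district'] != "DC-00"].copy()

# Same merge key as the House DataFrame.
df_pres.loc[:, 'id'] = df_pres['year'].astype(str) + "_" + df_pres['district']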
After merging on the id column…
df = df_house.merge(df_pres[['pres_party', 'id']], on="id")
… df.head() looks like this:
   year  district  rep_party                        winner          id  pres_party
0  1976     AL-01          R                  JACK EDWARDS  1976_AL-01           R
1  1976     AL-02          R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02           D
2  1976     AL-03          D                  BILL NICHOLS  1976_AL-03           D
3  1976     AL-04          D                    TOM BEVILL  1976_AL-04           D
4  1976     AL-05          D               RONNIE G FLIPPO  1976_AL-05           D
Now we have one row for each district in each election. It will be easy enough to check where the R’s and D’s diverge. You can already see that AL-02 was split in 1976.
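One caveat: merge() defaults to an inner join, so any district missing from either file disappears without a warning. A quick sanity check, sketched here on made-up rows, is to run an outer merge with pandas' validate and indicator options and look at what failed to match. The toy ids below are hypothetical and this check is not part of the original workflow.

```python
import pandas as pd

# Hypothetical toy data: AL-03 is missing from the presidential side,
# and a DC row is present only on the presidential side.
house = pd.DataFrame({
    "id": ["1976_AL-01", "1976_AL-02", "1976_AL-03"],
    "rep_party": ["R", "R", "D"],
})
pres = pd.DataFrame({
    "id": ["1976_AL-01", "1976_AL-02", "1976_DC-00"],
    "pres_party": ["R", "D", "D"],
})

# validate="one_to_one" raises MergeError if an id repeats on either side.
# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
checked = house.merge(pres, on="id", how="outer",
                      validate="one_to_one", indicator=True)

# Rows that an inner join would have dropped.
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[["id", "_merge"]])
```

If unmatched comes back empty on the real files, the inner join used in the post loses nothing.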
I think the best approach is to filter out non-split rows, i.e. where rep_party and pres_party are equal. Then create a new column containing the information from both.
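# Keep only districts where the presidential and House winners differ.
df = df[df['pres_party'] != df['rep_party']].copy()

# Presidential party first, then House party: "DR" means D president, R rep.
df.loc[:, 'vote'] = df['pres_party'] + df['rep_party']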
Take one more look at df.head() before digging in:
    year  district  rep_party                        winner          id  pres_party  vote
 1  1976     AL-02          R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02           D    DR
 9  1976     AZ-02          D                MORRIS K UDALL  1976_AZ-02           R    RD
10  1976     AZ-03          D                     BOB STUMP  1976_AZ-03           R    RD
14  1976     AR-03          R       JOHN PAUL HAMMERSCHMIDT  1976_AR-03           D    DR
17  1976     CA-02          R                 DON H CLAUSEN  1976_CA-02           D    DR
The plan is to count “DR” and “RD” separately. An R president and a D rep is different from a D president and an R rep. There are a handful of independent congressional winners and they can have their own category as well. We’ll plot the three values as a stacked bar chart. Viewers will be able to see overall trends as well as compositional changes.
We can accomplish this with a for loop. Step through each election year and create a new view of the DataFrame. The only tricky part is counting “DR” and “RD” instances using value_counts(). This method returns a Series object, but you can access any specific item as if it were a dictionary.
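x = []
y_d_r = []
y_r_d = []
y_other = []

for year in range(1976, 2024, 4):
    df2 = df[df['year'] == year]

    # Count each vote combination once; .get() returns 0 instead of
    # raising KeyError if a combination never occurs in a given year.
    counts = df2['vote'].value_counts()
    d_r_count = counts.get('DR', 0)
    r_d_count = counts.get('RD', 0)

    x.append(year)
    y_d_r.append(d_r_count)
    y_r_d.append(r_d_count)
    # Whatever is left involves an independent House winner.
    y_other.append(df2.shape[0] - d_r_count - r_d_count)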
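The same tallies can also be produced without an explicit loop. This is an alternative sketch using pd.crosstab, not the approach used in the rest of the post, and the rows below are made up for illustration ("RI" stands in for an independent House winner).

```python
import pandas as pd

# Hypothetical split-ticket rows; only the column names match the post.
toy = pd.DataFrame({
    "year": [1976, 1976, 1976, 1980, 1980],
    "vote": ["DR", "RD", "RD", "RD", "RI"],
})

# One row per year, one column per vote combination.
counts = pd.crosstab(toy["year"], toy["vote"])

# Keep just DR and RD; reindex guarantees both columns exist
# even if one combination never appears in the data.
counts = counts.reindex(columns=["DR", "RD"], fill_value=0)

# Everything else in a given year counts as "other".
counts["other"] = toy.groupby("year").size() - counts["DR"] - counts["RD"]

x = counts.index.tolist()
y_d_r = counts["DR"].tolist()
y_r_d = counts["RD"].tolist()
y_other = counts["other"].tolist()

print(x, y_d_r, y_r_d, y_other)
# [1976, 1980] [1, 0] [2, 1] [0, 1]
```

The four lists have the same meaning as the ones built by the loop, so they could be passed to the plotting code unchanged.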
2. Plot the data.
Now our data is in four lists and we can easily plot them. I’ll use a custom Matplotlib style I created to mimic The Economist’s style. They use a lot of red, which may not be ideal in a red/blue political context, but I think we can get away with it.
The trick when creating a stacked bar chart is to use the bottom parameter. Each bar begins at the previous bar’s top. Since there are three sections, we calculate a sum of the other two to locate the third.
import matplotlib.pyplot as plt

plt.style.use("wollen_economist.mplstyle")

fig, ax = plt.subplots()

# First layer sits on the x-axis.
ax.bar(x, y_r_d, width=2.3, label="R President • D House")
# Second layer starts where the first one ends.
ax.bar(x, y_d_r, bottom=y_r_d, width=2.3, label="D President • R House")
# Third layer starts at the combined height of the first two.
ax.bar(x, y_other, bottom=[sum(pair) for pair in zip(y_r_d, y_d_r)], width=2.3, label="Other Split Tickets")
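To make the bottom arithmetic concrete, here is a small sketch that computes the starting height of every layer as a running total. The helper name stacked_bottoms and the counts are hypothetical, made up for illustration. It only does the arithmetic; each result would then be passed to ax.bar() as bottom.

```python
def stacked_bottoms(layers):
    """Return, for each layer, the heights at which its bars start."""
    bottoms = []
    running = [0] * len(layers[0])  # the first layer sits on the axis
    for layer in layers:
        bottoms.append(list(running))
        # The next layer starts where this one ends.
        running = [r + h for r, h in zip(running, layer)]
    return bottoms


# Made-up counts for three election years (not the real data).
y_r_d = [10, 20, 5]
y_d_r = [3, 4, 6]
y_other = [1, 0, 2]

bottoms = stacked_bottoms([y_r_d, y_d_r, y_other])
print(bottoms)
# [[0, 0, 0], [10, 20, 5], [13, 24, 11]]
```

The third list matches what the zip-and-sum expression produces: the combined height of the first two layers.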
x-ticks and y-ticks are straightforward on this plot. As usual, let’s parameterize window limits and use them later to position text.
ax.set_xticks(x)
x_tick_span = x[-1] - x[0]
# Pad the x-axis by 5% of the tick span on each side.
x_left, x_right = x[0] - x_tick_span * 0.05, x[-1] + x_tick_span * 0.05
ax.set_xlim(x_left, x_right)

y_ticks = range(0, 225, 25)
ax.set_yticks(y_ticks)
y_tick_span = y_ticks[-1] - y_ticks[0]
# Leave a sliver of space above the top tick.
y_bottom, y_top = 0, y_ticks[-1] + y_tick_span * 0.005
ax.set_ylim(y_bottom, y_top)
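For a sense of what those limits work out to, here is the same arithmetic on its own, with no plotting. The tick definitions and the 2.3 bar width are the ones used in the post; the names x_pad, bar_half_width and fits are just for illustration.

```python
# Same tick definitions as in the post.
x = list(range(1976, 2024, 4))
y_ticks = range(0, 225, 25)

x_tick_span = x[-1] - x[0]        # 2020 - 1976 = 44 years
x_pad = x_tick_span * 0.05        # 5% of the span, about 2.2 years
x_left, x_right = x[0] - x_pad, x[-1] + x_pad

y_tick_span = y_ticks[-1] - y_ticks[0]       # 200 - 0 = 200
y_top = y_ticks[-1] + y_tick_span * 0.005    # just above the top tick

# Bars are 2.3 wide, so the outermost bars extend 1.15 past their ticks.
bar_half_width = 2.3 / 2
fits = x[0] - bar_half_width > x_left and x[-1] + bar_half_width < x_right

print(round(x_left, 1), round(x_right, 1), round(y_top, 1), fits)
# 1973.8 2022.2 201.0 True
```

So the padding leaves roughly one year of empty space beyond the first and last bars.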
The built-in legend alignment options are a little too restrictive in this case. We can use bbox_to_anchor to define a specific position: top and slightly left of center.
ax.legend(bbox_to_anchor=(0.43, 1.0), loc="upper center")
Finally, add a title and citation text. Then save the output.
ax.set_title("Split-Ticket Districts in US Presidential Elections | 1976–2020")

source_message = ('House Election Data\nMIT Election Data and Science Lab, 2017, "U.S. House 1976–2022"\n\n'
                  'Presidential Election Data\nhttps://sites.google.com/view/presidentialbycongressionaldis')

# Anchor the citation to the upper-right corner, just inside the plot window.
ax.text(x_right - x_tick_span * 0.005, y_ticks[-1] - y_tick_span * 0.007, source_message,
        size=8, ha="right", va="top")

plt.savefig("split_tickets_1976-2020.png", dpi=120)
3. The output.
The downward trend is very obvious when visualized. After seeing only 16 split-ticket districts in 2020, it’s difficult to imagine 1984 when more than 40% of districts split their vote.
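To put those two figures on the same scale, here is a quick back-of-the-envelope conversion. It assumes all 435 voting House districts are in the denominator, which may differ slightly from the rows that survived the merge. The 16 and the 40% come from the paragraph above.

```python
HOUSE_SEATS = 435  # voting seats in the US House (assumed denominator)

# 2020: 16 split-ticket districts.
split_2020 = 16
share_2020 = split_2020 / HOUSE_SEATS
print(f"{share_2020:.1%}")
# 3.7%

# 1984: "more than 40%" means more than this many districts.
threshold_1984 = round(0.40 * HOUSE_SEATS)
print(threshold_1984)
# 174
```

In other words, split-ticket districts went from more than 174 to 16, from over 40% of the House to under 4%.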
The plot also lends some credibility to the theory that realignment was a major contributor. Conservative Republicans—formerly conservative Democrats—might have been willing to stick with their local Democratic representatives, with whom they were more familiar, while voting Republican in presidential contests. The 1980s bars aren’t just taller. They’re also significantly more R-D than D-R. Ticket splitting erosion may be, in part, the result of one generation’s gradual replacement by younger straight-ticket voters. You could argue realignment began in the 1950s and 60s but took a generation or more to complete.
That’s not the whole story. It doesn’t explain what motivated D-R voters in the 90s. Candidate appeal certainly has a large effect so it’s difficult to untangle without digging deeper. I tend to believe that the modern information environment doesn’t necessarily make us better informed, but it does make us stronger partisans.
We’ll find out in a month if the straight-ticket trend will continue.
Full code:
import pandas as pd
import matplotlib.pyplot as plt


# 1. Prepare the data.
df_house = pd.read_csv("house_winners_1976-2020.csv")
# Keep presidential election years only; .copy() avoids SettingWithCopyWarning.
df_house = df_house[df_house['year'] % 4 == 0].copy()
df_house.loc[:, 'id'] = df_house['year'].astype(str) + "_" + df_house['district']

df_pres = pd.read_csv("pres_winners_1976-2020.csv")
# Drop Washington DC, which has no voting House member.
df_pres = df_pres[df_pres['district'] != "DC-00"].copy()
df_pres.loc[:, 'id'] = df_pres['year'].astype(str) + "_" + df_pres['district']

df = df_house.merge(df_pres[['pres_party', 'id']], on="id")

# Keep split districts only and label them, presidential party first.
df = df[df['pres_party'] != df['rep_party']].copy()
df.loc[:, 'vote'] = df['pres_party'] + df['rep_party']

x = []
y_d_r = []
y_r_d = []
y_other = []

for year in range(1976, 2024, 4):
    df2 = df[df['year'] == year]

    # .get() returns 0 instead of raising KeyError for a missing combination.
    counts = df2['vote'].value_counts()
    d_r_count = counts.get('DR', 0)
    r_d_count = counts.get('RD', 0)

    x.append(year)
    y_d_r.append(d_r_count)
    y_r_d.append(r_d_count)
    y_other.append(df2.shape[0] - d_r_count - r_d_count)


# 2. Plot the data.
plt.style.use("wollen_economist.mplstyle")

fig, ax = plt.subplots()

ax.bar(x, y_r_d, width=2.3, label="R President • D House")
ax.bar(x, y_d_r, bottom=y_r_d, width=2.3, label="D President • R House")
ax.bar(x, y_other, bottom=[sum(pair) for pair in zip(y_r_d, y_d_r)], width=2.3, label="Other Split Tickets")

ax.set_xticks(x)
x_tick_span = x[-1] - x[0]
x_left, x_right = x[0] - x_tick_span * 0.05, x[-1] + x_tick_span * 0.05
ax.set_xlim(x_left, x_right)

y_ticks = range(0, 225, 25)
ax.set_yticks(y_ticks)
y_tick_span = y_ticks[-1] - y_ticks[0]
y_bottom, y_top = 0, y_ticks[-1] + y_tick_span * 0.005
ax.set_ylim(y_bottom, y_top)

ax.legend(bbox_to_anchor=(0.43, 1.0), loc="upper center")

ax.set_title("Split-Ticket Districts in US Presidential Elections | 1976–2020")

source_message = ('House Election Data\nMIT Election Data and Science Lab, 2017, "U.S. House 1976–2022"\n\n'
                  'Presidential Election Data\nhttps://sites.google.com/view/presidentialbycongressionaldis')

ax.text(x_right - x_tick_span * 0.005, y_ticks[-1] - y_tick_span * 0.007, source_message,
        size=8, ha="right", va="top")

plt.savefig("split_tickets_1976-2020.png", dpi=120)