
The critically endangered split-ticket voter

Split-ticket voting is when you select candidates of different political parties on the same ballot. You might vote for a Democrat for state legislature and a Republican for mayor. That’s in contrast to straight-ticket voting, which of course means picking one party straight down the ballot.

In today’s world, where all politics are national, split-ticket voting usually refers to splitting party loyalty between president and Congress. It was much more common in the 20th century, rising in the years after World War II and peaking in the 1970s and 1980s.

No one agrees on precisely why ticket splitting has trended steadily downward. Most analysts cite political polarization as at least partially responsible: voters tend to be stronger partisans and less likely to divide their loyalty. There’s also the aging out of conservative Southern Democrats, who were at the center of the Civil Rights-era party realignment.

Whatever the underlying causes, it’s now much more difficult for a Democratic congressional candidate to win during a Republican presidential victory, and vice-versa. Let’s take a look at the data and see exactly how rare ticket splitting has become.


1. Prepare the data.

I grabbed House election results from here and scraped district-level presidential results from here. The files I used will be linked at the bottom of this post.

The header and first row of the House CSV look like this:

year,district,rep_party,winner
1976,AL-01,R,JACK EDWARDS

We’re counting split tickets during presidential election years, so let’s filter out midterm results using the modulo operator (%): presidential years are divisible by 4. Since this DataFrame will be merged with a presidential results DataFrame, create an id column that will be common to both.

import pandas as pd

df_house = pd.read_csv("house_winners_1976-2020.csv")
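# Keep presidential election years only. .copy() makes the filtered frame
# independent, which avoids a possible SettingWithCopyWarning below.
df_house = df_house[df_house['year'] % 4 == 0].copy()
# Merge key in the form "<year>_<district>", e.g. "1976_AL-01".
df_house['id'] = df_house['year'].astype(str) + "_" + df_house['district']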

Now df_house.head() looks like this:

   year district rep_party                        winner          id
0  1976    AL-01         R                  JACK EDWARDS  1976_AL-01
1  1976    AL-02         R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02
2  1976    AL-03         D                  BILL NICHOLS  1976_AL-03
3  1976    AL-04         D                    TOM BEVILL  1976_AL-04
4  1976    AL-05         D               RONNIE G FLIPPO  1976_AL-05
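
To see the modulo filter and the id construction in isolation, here’s a minimal sketch on a three-row frame. The rows are made up for illustration (they are not read from the real CSV); only the column names match the file above.

```python
import pandas as pd

# Toy rows invented for illustration; same columns as the House CSV.
toy = pd.DataFrame({
    "year": [1976, 1978, 1980],
    "district": ["AL-01", "AL-01", "AL-01"],
    "rep_party": ["R", "R", "R"],
    "winner": ["JACK EDWARDS", "JACK EDWARDS", "JACK EDWARDS"],
})

# Years divisible by 4 are presidential years, so 1978 (a midterm) is dropped.
# .copy() returns an independent frame that is safe to add columns to.
pres_years = toy[toy["year"] % 4 == 0].copy()

# Build the merge key: "<year>_<district>".
pres_years["id"] = pres_years["year"].astype(str) + "_" + pres_years["district"]

print(pres_years["id"].tolist())
```

The year column has to be cast to str first because pandas won’t concatenate an integer column with a string column directly.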

We can follow essentially the same process for the presidential election dataset, except that we also filter out Washington, DC. DC residents can vote for president but they don’t yet have full representation in Congress.

df_pres = pd.read_csv("pres_winners_1976-2020.csv")
# Drop DC, which has no voting House seat to compare against.
df_pres = df_pres[df_pres['district'] != "DC-00"].copy()
df_pres['id'] = df_pres['year'].astype(str) + "_" + df_pres['district']

After merging on the id column…

df = df_house.merge(df_pres[['pres_party', 'id']], on="id")

df.head() looks like this:

   year district rep_party                        winner          id pres_party
0  1976    AL-01         R                  JACK EDWARDS  1976_AL-01          R
1  1976    AL-02         R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02          D
2  1976    AL-03         D                  BILL NICHOLS  1976_AL-03          D
3  1976    AL-04         D                    TOM BEVILL  1976_AL-04          D
4  1976    AL-05         D               RONNIE G FLIPPO  1976_AL-05          D

Now we have one row for each district in each election. It will be easy enough to check where the R’s and D’s diverge. You can already see that AL-02 was split in 1976.
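
One thing worth checking: merge() defaults to an inner join, so any House row without a matching presidential row disappears silently. The sketch below is one way to look for those rows, using the indicator and validate parameters of merge(). The two small frames are toy data invented for illustration, with AL-03 deliberately left out of the presidential side.

```python
import pandas as pd

# Toy frames invented for illustration; the 'id' format matches the post.
house = pd.DataFrame({
    "id": ["1976_AL-01", "1976_AL-02", "1976_AL-03"],
    "rep_party": ["R", "R", "D"],
})
pres = pd.DataFrame({
    "id": ["1976_AL-01", "1976_AL-02"],
    "pres_party": ["R", "D"],
})

# how="outer" keeps unmatched rows from both sides.
# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
# validate="one_to_one" raises an error if an id is duplicated on either side.
check = house.merge(pres, on="id", how="outer", indicator=True, validate="one_to_one")

unmatched = check[check["_merge"] != "both"]
print(unmatched[["id", "_merge"]])
```

If unmatched comes back empty on the real files, the inner merge isn’t losing anything.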

I think the best approach will be to filter out non-split rows, i.e. where rep_party and pres_party are equal. Then create a new column containing the information from both.

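# Keep only districts where the presidential and House winners differ.
df = df[df['pres_party'] != df['rep_party']].copy()

# Presidential party first, then House party: "RD" = R president, D representative.
df['vote'] = df['pres_party'] + df['rep_party']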

Take one more look at df.head() before digging in:

    year district rep_party                        winner          id pres_party vote
1   1976    AL-02         R  WILLIAM L \"BILL\" DICKINSON  1976_AL-02          D   DR
9   1976    AZ-02         D                MORRIS K UDALL  1976_AZ-02          R   RD
10  1976    AZ-03         D                     BOB STUMP  1976_AZ-03          R   RD
14  1976    AR-03         R       JOHN PAUL HAMMERSCHMIDT  1976_AR-03          D   DR
17  1976    CA-02         R                 DON H CLAUSEN  1976_CA-02          D   DR

The plan is to count “DR” and “RD” separately. An R president and a D rep is different from a D president and an R rep. There are a handful of independent congressional winners and they can have their own category as well. We’ll plot the three values as a stacked bar chart. Viewers will be able to see overall trends as well as compositional changes.

We can accomplish this with a for loop. Step through each election year and filter the DataFrame down to that year’s rows. The only tricky part is counting “DR” and “RD” instances using value_counts(). This method returns a Series, which you can query by label much like a dictionary.

x = []
y_d_r = []
y_r_d = []
y_other = []

for year in range(1976, 2024, 4):
    df2 = df[df['year'] == year]

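    # Count once per year; .get() returns 0 instead of raising KeyError
    # if a vote type never occurs in that year.
    vote_counts = df2['vote'].value_counts()
    d_r_count = vote_counts.get('DR', 0)
    r_d_count = vote_counts.get('RD', 0)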

    x.append(year)
    y_d_r.append(d_r_count)
    y_r_d.append(r_d_count)
    y_other.append(df2.shape[0] - d_r_count - r_d_count)
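
As an alternative to the loop, the same four lists can be built in one pass with pd.crosstab. This is only a sketch of a different technique, not what the post’s code does. The toy frame below is invented for illustration and has just the year and vote columns.

```python
import pandas as pd

# Toy split-ticket rows invented for illustration ("RI" = R president, independent rep).
toy = pd.DataFrame({
    "year": [1976, 1976, 1976, 1980, 1980, 1984],
    "vote": ["DR", "RD", "RD", "RD", "RI", "RD"],
})

# One row per year, one column per vote type; absent combinations become 0.
counts = pd.crosstab(toy["year"], toy["vote"])

# reindex guarantees "DR" and "RD" columns exist even if one never occurs.
main = counts.reindex(columns=["DR", "RD"], fill_value=0)

x = counts.index.tolist()
y_d_r = main["DR"].tolist()
y_r_d = main["RD"].tolist()
# Everything that is neither DR nor RD lands in the "other" bucket.
y_other = (counts.sum(axis=1) - main.sum(axis=1)).tolist()

print(x, y_d_r, y_r_d, y_other)
```

One caveat: a year with no split-ticket rows at all would be missing from the crosstab index, whereas the loop would still emit zeros for it.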

2. Plot the data.

Now our data is in four lists and we can easily plot them. I’ll use a custom Matplotlib style I created to mimic The Economist’s style. They use a lot of red, which may not be ideal in a red/blue political context, but I think we can get away with it.

The trick when creating a stacked bar chart is to use the bottom parameter. Each section begins at the top of the one below it. Since there are three sections, the bottom of the third is the element-wise sum of the first two.

import matplotlib.pyplot as plt

plt.style.use("wollen_economist.mplstyle")
fig, ax = plt.subplots()

ax.bar(x, y_r_d, width=2.3, label="R President  •  D House")
ax.bar(x, y_d_r, bottom=y_r_d, width=2.3, label="D President  •  R House")
ax.bar(x, y_other, bottom=[sum(pair) for pair in zip(y_r_d, y_d_r)], width=2.3, label="Other Split Tickets")
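
The bottom calculation generalizes to any number of sections: each layer’s bottom is the running element-wise sum of the layers beneath it. Here’s a small sketch of that idea. The helper name stack_bottoms is hypothetical (it isn’t part of Matplotlib) and the numbers are made up.

```python
from itertools import accumulate


def stack_bottoms(layers):
    """Return the bottom offsets for each layer of a stacked bar chart.

    layers: list of equal-length lists of bar heights, lowest layer first.
    """
    zeros = [0] * len(layers[0])
    # Running element-wise sums: zeros, layer0, layer0 + layer1, ...
    running = accumulate(
        layers[:-1],
        lambda acc, layer: [a + b for a, b in zip(acc, layer)],
        initial=zeros,
    )
    return list(running)


# Made-up heights for two bars and three layers.
layers = [[3, 1], [2, 2], [1, 0]]
bottoms = stack_bottoms(layers)
print(bottoms)
```

Each (heights, bottom) pair from zip(layers, bottoms) could then be passed to ax.bar(x, heights, bottom=bottom). For three sections the explicit version above is arguably easier to read.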

x-ticks and y-ticks are straightforward on this plot. As usual, let’s parameterize window limits and use them later to position text.

ax.set_xticks(x)
x_tick_span = x[-1] - x[0]
x_left, x_right = x[0] - x_tick_span * 0.05, x[-1] + x_tick_span * 0.05
ax.set_xlim(x_left, x_right)

y_ticks = range(0, 225, 25)
ax.set_yticks(y_ticks)
y_tick_span = y_ticks[-1] - y_ticks[0]
y_bottom, y_top = 0, y_ticks[-1] + y_tick_span * 0.005
ax.set_ylim(y_bottom, y_top)
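
The x-axis padding follows a simple pattern: measure the span between the first and last tick, then extend the window by a fraction of that span on each side. A sketch of that arithmetic as a reusable function is below. The name padded_limits is hypothetical, and it pads both sides symmetrically, which matches the x-axis code but not the y-axis (where the bottom is pinned at 0).

```python
def padded_limits(ticks, pad_fraction):
    """Return (low, high) limits extending past the first and last tick
    by pad_fraction of the total tick span."""
    span = ticks[-1] - ticks[0]
    pad = span * pad_fraction
    return ticks[0] - pad, ticks[-1] + pad


# Same election years as the post: 1976, 1980, ..., 2020.
x = list(range(1976, 2024, 4))
x_left, x_right = padded_limits(x, 0.05)
print(x_left, x_right)
```

Expressing the padding as a fraction of the span, rather than a fixed number, means the margins keep the same proportions if more election years are added later.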

The built-in legend alignment options are a little too restrictive in this case. We can use bbox_to_anchor to define a specific position—top and slightly left of center.

ax.legend(bbox_to_anchor=(0.43, 1.0), loc="upper center")

Finally, add a title and citation text. Then save the output.

ax.set_title("Split-Ticket Districts in US Presidential Elections  |  1976–2020")

source_message = ('House Election Data\nMIT Election Data and Science Lab, 2017, "U.S. House 1976–2022"\n\n'
                  'Presidential Election Data\nhttps://sites.google.com/view/presidentialbycongressionaldis')
ax.text(x_right - x_tick_span * 0.005, y_ticks[-1] - y_tick_span * 0.007, source_message, size=8, ha="right", va="top")

plt.savefig("split_tickets_1976-2020.png", dpi=120)

3. The output.

The downward trend is obvious once visualized. After seeing only 16 split-ticket districts in 2020, it’s difficult to imagine 1984, when more than 40% of districts split their vote.
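
To put those two figures on the same scale, here’s a quick back-of-the-envelope sketch. It assumes 435 districts, the number of voting seats in the US House; the function name split_share is just for illustration.

```python
TOTAL_DISTRICTS = 435  # voting seats in the US House


def split_share(split_count, total=TOTAL_DISTRICTS):
    """Fraction of districts that split their ticket."""
    return split_count / total


share_2020 = split_share(16)

# Smallest whole number of districts that exceeds a 40% share
# (integer arithmetic avoids floating-point rounding surprises).
threshold_40 = TOTAL_DISTRICTS * 40 // 100 + 1

print(f"2020 share: {share_2020:.1%}")
print(f"Districts needed to exceed 40%: {threshold_40}")
```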

The plot also lends some credibility to the theory that realignment was a major contributor. Conservative Republicans, formerly conservative Democrats, might have been willing to stick with their local Democratic representatives, with whom they were more familiar, while voting Republican in presidential contests. The 1980s bars aren’t just taller. They’re also significantly more R-D than D-R. The erosion of ticket splitting may be, in part, the result of one generation’s gradual replacement by younger straight-ticket voters. You could argue realignment began in the 1950s and 1960s but took a generation or more to complete.

That’s not the whole story. It doesn’t explain what motivated D-R voters in the 90s. Candidate appeal certainly has a large effect so it’s difficult to untangle without digging deeper. I tend to believe that the modern information environment doesn’t necessarily make us better informed, but it does make us stronger partisans.

We’ll find out in a month if the straight-ticket trend will continue.


Download the data.

Full code:

import pandas as pd
import matplotlib.pyplot as plt


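df_house = pd.read_csv("house_winners_1976-2020.csv")
# Keep presidential election years only; .copy() avoids a possible SettingWithCopyWarning.
df_house = df_house[df_house['year'] % 4 == 0].copy()
df_house['id'] = df_house['year'].astype(str) + "_" + df_house['district']

df_pres = pd.read_csv("pres_winners_1976-2020.csv")
# Drop DC, which has no voting House seat to compare against.
df_pres = df_pres[df_pres['district'] != "DC-00"].copy()
df_pres['id'] = df_pres['year'].astype(str) + "_" + df_pres['district']

df = df_house.merge(df_pres[['pres_party', 'id']], on="id")

# Keep only districts where the presidential and House winners differ.
df = df[df['pres_party'] != df['rep_party']].copy()

# Presidential party first, then House party: "RD" = R president, D representative.
df['vote'] = df['pres_party'] + df['rep_party']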

x = []
y_d_r = []
y_r_d = []
y_other = []

for year in range(1976, 2024, 4):
    df2 = df[df['year'] == year]

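    # Count once per year; .get() returns 0 instead of raising KeyError
    # if a vote type never occurs in that year.
    vote_counts = df2['vote'].value_counts()
    d_r_count = vote_counts.get('DR', 0)
    r_d_count = vote_counts.get('RD', 0)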

    x.append(year)
    y_d_r.append(d_r_count)
    y_r_d.append(r_d_count)
    y_other.append(df2.shape[0] - d_r_count - r_d_count)

plt.style.use("wollen_economist.mplstyle")
fig, ax = plt.subplots()

ax.bar(x, y_r_d, width=2.3, label="R President  •  D House")
ax.bar(x, y_d_r, bottom=y_r_d, width=2.3, label="D President  •  R House")
ax.bar(x, y_other, bottom=[sum(pair) for pair in zip(y_r_d, y_d_r)], width=2.3, label="Other Split Tickets")

ax.set_xticks(x)
x_tick_span = x[-1] - x[0]
x_left, x_right = x[0] - x_tick_span * 0.05, x[-1] + x_tick_span * 0.05
ax.set_xlim(x_left, x_right)

y_ticks = range(0, 225, 25)
ax.set_yticks(y_ticks)
y_tick_span = y_ticks[-1] - y_ticks[0]
y_bottom, y_top = 0, y_ticks[-1] + y_tick_span * 0.005
ax.set_ylim(y_bottom, y_top)

ax.legend(bbox_to_anchor=(0.43, 1.0), loc="upper center")

ax.set_title("Split-Ticket Districts in US Presidential Elections  |  1976–2020")

source_message = ('House Election Data\nMIT Election Data and Science Lab, 2017, "U.S. House 1976–2022"\n\n'
                  'Presidential Election Data\nhttps://sites.google.com/view/presidentialbycongressionaldis')
ax.text(x_right - x_tick_span * 0.005, y_ticks[-1] - y_tick_span * 0.007, source_message, size=8, ha="right", va="top")

plt.savefig("split_tickets_1976-2020.png", dpi=120)