
How does Matt Amodio’s Jeopardy streak compare?

If you’re reading this blog, there’s a good chance you’ve heard about Matt Amodio’s incredible run on Jeopardy. After winning 38 consecutive games, he walked away with over $1.5 million. I thought I’d take a look at how his streak fits into Jeopardy history.

I’d like to visualize both total winnings and consecutive games won. I’ll use these variables to plot average daily winnings, which will check both boxes and hopefully provide some new information you haven’t seen a dozen times before.

Conveniently, the show maintains a Hall of Fame with most of the data we need. I put the data into a CSV file you can download at the bottom of this post.


1. Prepare the data.

Start by reading the dataset into a pandas DataFrame (pandas is imported as pd, as shown in the full code at the end).

df = pd.read_csv("jeopardy_top_winners.csv")

The show uses ALL CAPS in its text, so we’ll follow suit. Convert the name column with str.upper().

df["name"] = df["name"].str.upper()

Now we can generate the data we intend to plot: average daily winnings. It’s as simple as dividing total winnings by number of games won. Create a new column to hold this information.

df.loc[:, "average"] = df["total_won"] / df["games_won"]
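If you want to convince yourself the division does what you expect, here is a minimal sketch on two rows copied from the Hall of Fame data. The sample variable is only for illustration and isn’t part of the main script.

```python
import pandas as pd

# Two rows copied from the Hall of Fame data used in this post.
sample = pd.DataFrame({
    "name": ["MATT AMODIO", "KEN JENNINGS"],
    "games_won": [38, 74],
    "total_won": [1518601, 2520700],
})

# Element-wise division: each row's total divided by that row's game count.
sample["average"] = sample["total_won"] / sample["games_won"]

print(sample[["name", "average"]].round(2))
```

pandas lines the two columns up by index, so there’s no need to loop over rows.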

Before jumping into Matplotlib I want to double-check the DataFrame.

Although the primary objective is to plot average daily winnings, I’d like to order contestants by their number of games won. This should convey a second dimension of the data while also highlighting Amodio’s longevity.

Notice that I pass a list into the sort_values method. Some contestants have won an equal number of games so I’ll use average winnings as the “tiebreaker.” You could include more tiebreakers if you wanted, but that won’t be necessary with this data.

df = df.sort_values(["games_won", "average"]).reset_index(drop=True)

print(df)

The output:

                 name  games_won  total_won       average
0     JONATHAN FISHER         11     246100  22372.727273
1          ARTHUR CHU         11     297200  27018.181818
2         SETH WILSON         12     265002  22083.500000
3       AUSTIN ROGERS         12     411000  34250.000000
4        MATT JACKSON         13     411612  31662.461538
5        DAVID MADDEN         19     430400  22652.631579
6   JASON ZUFFRANIERI         19     532496  28026.105263
7       JULIA COLLINS         20     428100  21405.000000
8     JAMES HOLZHAUER         32    2462216  76944.250000
9         MATT AMODIO         38    1518601  39963.184211
10       KEN JENNINGS         74    2520700  34063.513514

The sorted values appear upside down right now, but notice the longest win streak (Ken Jennings) corresponds to the highest index (10). This will work because we’ll plot the data on a horizontal bar chart, where the highest index is drawn at the top. It would be awkward and unnecessary to read the name text sideways. Let’s make it easy and turn the bars sideways instead.
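As a quick aside on the sort itself, this small sketch isolates the tiebreaker behavior. It uses three rows from the table above (averages rounded to two decimals), two of which are tied at 19 games. The sample and ordered names are just for this illustration.

```python
import pandas as pd

# Three rows taken from the table above; two contestants are tied at 19 games.
sample = pd.DataFrame({
    "name": ["JASON ZUFFRANIERI", "JULIA COLLINS", "DAVID MADDEN"],
    "games_won": [19, 20, 19],
    "average": [28026.11, 21405.00, 22652.63],
})

# Sort by games_won first; average only decides the order within a tie.
ordered = sample.sort_values(["games_won", "average"]).reset_index(drop=True)

print(ordered["name"].tolist())
```

Julia Collins has the lowest average of the three but still lands last, because the second key is only consulted when the first one ties.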

I’ve created a Jeopardy-themed Matplotlib style for this plot. The colors and text are designed to mimic the style of the show.

I also found a font that copies clue text fairly well. We can use Matplotlib’s Path Effects to create a drop shadow and get even closer.


2. Plot the data.

Begin in the usual way by referencing an mplstyle and retrieving fig and ax objects.

plt.style.use("jeopardy.mplstyle")

fig, ax = plt.subplots()

Horizontal bar charts are a little tricky—at least for me—because it’s easy to mix up the independent and dependent variables. I can never remember what Matplotlib considers x and y or which argument should be passed first.

Confessions out of the way, the independent variable is passed first. These are the bar positions along the vertical axis (barh calls this argument y), and here they are df.index (0-10, seen above). Remember we’re plotting names so we’ll have to replace integer tick labels with text in a moment. The dependent variable, average winnings, is passed second and sets how far each bar extends horizontally.

On a regular bar chart the bar width parameter is intuitively called width. On a barh chart, somewhat less obviously, it becomes height.

ax.barh(df.index, df["average"], height=0.65)
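If the argument order still feels slippery, this throwaway sketch with made-up numbers shows which value ends up where. The positions and lengths names are mine, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

positions = [0, 1, 2]          # where each bar sits on the vertical axis
lengths = [10.0, 25.0, 40.0]   # how far each bar extends along the horizontal axis

fig, ax = plt.subplots()
bars = ax.barh(positions, lengths, height=0.65)

# Each bar is a Rectangle: get_width() is its length, get_height() its thickness.
first = bars[0]
print(first.get_width(), first.get_height())
```

So on a horizontal bar chart the data value becomes the rectangle’s width, and the height keyword controls how thick each bar is.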

I don’t want to turn this into a patheffects tutorial so I’ll just link to the documentation here. The module is capable of much more but we’ll use it to create a nice text drop shadow.

  • offset defines the shadow’s horizontal and vertical distance from the foreground text, in points.
  • shadow_rgbFace is the shadow’s color.
  • alpha is the shadow’s opacity. 1.0 means completely opaque.

drop_shadow = [path_effects.withSimplePatchShadow(offset=(1.25, -1.25),
                                                  shadow_rgbFace="black",
                                                  alpha=1.0)]
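To try the shadow on its own before wiring it into the chart, here is a small sketch that attaches it to a single piece of text. The sample string, the white text color, and the label variable are my own placeholders. The Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects

# Same settings as in the post: a fully opaque black shadow, offset down and right.
drop_shadow = [path_effects.withSimplePatchShadow(offset=(1.25, -1.25),
                                                  shadow_rgbFace="black",
                                                  alpha=1.0)]

fig, ax = plt.subplots()
# Any text artist accepts path_effects as a keyword argument.
label = ax.text(0.5, 0.5, "SAMPLE CLUE TEXT", ha="center", color="white",
                path_effects=drop_shadow)

print(label.get_path_effects())
```

Because drop_shadow is a plain list, the same object can be reused for every label on the figure.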

Since we’re going to pass a list of strings into set_yticklabels (contestant names) it’s important to first manually set those ticks. This is a non-negotiable step when customizing tick labels! If you later tweak the code or underlying data and Matplotlib changes its automatically generated ticks, your labels will end up in the wrong place.

Some of the names are too long to place on a single line so we’ll replace spaces with newline characters. Notice we pass the previously defined drop_shadow as a path_effects argument. We’ll have to pass the argument to other label methods as well.

ax.set_yticks(df.index)
ax.set_yticklabels([item.replace(" ", "\n") for item in df["name"]], path_effects=drop_shadow)
ax.set_ylim(-0.6, 10.6)
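Here is the same ticks-then-labels pattern reduced to two bars, as a rough sketch. The names and positions lists are trimmed stand-ins for the DataFrame columns, and the bar lengths are rounded.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

names = ["MATT AMODIO", "KEN JENNINGS"]
positions = [0, 1]

fig, ax = plt.subplots()
ax.barh(positions, [39963, 34064], height=0.65)

# Pin the tick locations first so each label is tied to a known bar position.
ax.set_yticks(positions)
# Swap spaces for newlines so first and last names stack on two lines.
tick_labels = ax.set_yticklabels([name.replace(" ", "\n") for name in names])

print([label.get_text() for label in tick_labels])
```

With the ticks fixed at 0 and 1, the labels stay attached to their bars no matter how the axis limits change later.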

We follow roughly the same process for the x-axis (average daily winnings). Ticks are first set manually and then a list of strings replaces those tick labels.

x_ticks = range(0, 100000, 20000)
ax.set_xticks(x_ticks)
ax.set_xticklabels([f"${n:,}" if n > 0 else "0" for n in x_ticks], path_effects=drop_shadow)
ax.set_xlim(0, x_ticks[-1] * 1.03)
ax.set_xlabel("AVERAGE DAILY WINNINGS", path_effects=drop_shadow)
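The label formatting is plain Python, so it’s easy to check away from Matplotlib. This sketch prints the strings the list comprehension produces, along with the right-hand axis limit. The labels and right_edge names are only for illustration.

```python
x_ticks = range(0, 100000, 20000)

# The ":," format spec inserts thousands separators; zero is left without a dollar sign.
labels = [f"${n:,}" if n > 0 else "0" for n in x_ticks]
print(labels)

# range excludes its stop value, so the last tick is 80,000.
# The 3% padding leaves a little room to the right of the last tick.
right_edge = x_ticks[-1] * 1.03
print(right_edge)
```

The padded limit still sits above the largest average in the data (about 76,944), so no bar gets cut off.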

For the plot’s title I’d like to have a major heading and a sub-heading below it, each with its own font size. Rather than using set_title I’ll call text twice. This is a little more work but it allows for a more customized final product. Don’t forget the drop shadow here.

middle_x = mean(ax.get_xlim())
ax.text(middle_x, 11.3, "JEOPARDY! AVERAGE DAILY WINNINGS", size=15, ha="center", path_effects=drop_shadow)
ax.text(middle_x, 10.9, "TOP 10 WIN STREAKS", size=12, ha="center", path_effects=drop_shadow)
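One thing worth spelling out: both headings sit above the top of the y-axis (10.6), in data coordinates. As far as I can tell this works because text added with ax.text isn’t clipped to the axes by default. The sketch below uses placeholder strings and hard-codes the same limits as the post so you can poke at the positions.

```python
from numpy import mean

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 82400)    # same limits the post ends up with
ax.set_ylim(-0.6, 10.6)

# Center both lines horizontally on the midpoint of the x-axis.
middle_x = mean(ax.get_xlim())
heading = ax.text(middle_x, 11.3, "MAIN HEADING", size=15, ha="center")
subheading = ax.text(middle_x, 10.9, "SUB-HEADING", size=12, ha="center")

print(middle_x, heading.get_clip_on())
```

If you change the number of bars, remember to move the two y values along with the y-limits, since they’re tied to the data scale.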

Remember we sorted the contestants by their number of games won. We should make that clear to the audience.

We can add the information as text at the tip of each horizontal bar. Use the DataFrame’s iterrows method (df.iterrows) to step through the rows much like you would use enumerate from the standard library. Note that only the top bar needs to include the literal text “GAMES WON:”. Each bar below it can simply display a number.

for i, row in df.iterrows():
    if row["games_won"] == df["games_won"].max():
        bar_label = f"GAMES WON: {row['games_won']}"
    else:
        bar_label = row["games_won"]
    ax.text(row["average"] - 1000, i, bar_label, size=10, ha="right", va="center")
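To see just the labeling logic without any plotting, here is a stripped-down sketch on three rows. It’s a slight variation on the loop above rather than a copy: the maximum is computed once up front, and every label is built as a string. The sample, longest, and bar_labels names are mine.

```python
import pandas as pd

sample = pd.DataFrame({
    "name": ["JAMES HOLZHAUER", "MATT AMODIO", "KEN JENNINGS"],
    "games_won": [32, 38, 74],
})

# Look up the longest streak once instead of on every pass through the loop.
longest = sample["games_won"].max()

bar_labels = []
for i, row in sample.iterrows():
    games = int(row["games_won"])
    # Only the longest streak gets the descriptive prefix.
    bar_labels.append(f"GAMES WON: {games}" if games == longest else str(games))

print(bar_labels)
```

Since the data is sorted by games won, the prefixed label always lands on the last row, which is the top bar of the chart.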

Finally, save the figure however you’d like. I suggest saving in a vector format such as SVG if that works for your application. If not, bump up dpi to prevent aliasing in the drop shadows. Finer details tend to be lost at Matplotlib’s default dpi of 100.

plt.savefig("jeopardy_average_win.png", dpi=200)
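For comparison, this sketch writes the same tiny figure as both SVG and PNG. The file names and temporary directory are placeholders, and the bars are dummy data.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.barh([0, 1], [3, 5], height=0.65)

out_dir = tempfile.mkdtemp()
svg_path = os.path.join(out_dir, "example.svg")
png_path = os.path.join(out_dir, "example.png")

# The output format is inferred from the file extension.
fig.savefig(svg_path)
# dpi only matters for raster formats like PNG.
fig.savefig(png_path, dpi=200)

print(os.path.getsize(svg_path) > 0, os.path.getsize(png_path) > 0)
```

The SVG stays sharp at any zoom level, while the PNG’s sharpness is fixed by the dpi you pick at save time.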

The output:

I think the style represents Jeopardy well!

As you can see, Matt Amodio averaged a higher daily total than all but James Holzhauer. However, Ken Jennings far out-earned Matt thanks to his unrivaled longevity.

In summary, Amodio holds the 2nd-longest streak as well as the 2nd-largest average win, at least among Hall of Fame members. He can’t claim to be the best in either category, but his streak represents one of the most well-rounded performances in Jeopardy! history.


Download the data.

Full code:

import pandas as pd
from numpy import mean
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects


df = pd.read_csv("jeopardy_top_winners.csv")

df["name"] = df["name"].str.upper()

df.loc[:, "average"] = df["total_won"] / df["games_won"]

df = df.sort_values(["games_won", "average"]).reset_index(drop=True)

print(df)

plt.style.use("jeopardy.mplstyle")

fig, ax = plt.subplots()

ax.barh(df.index, df["average"], height=0.65)

drop_shadow = [path_effects.withSimplePatchShadow(offset=(1.25, -1.25), shadow_rgbFace="black", alpha=1.0)]

ax.set_yticks(df.index)
ax.set_yticklabels([item.replace(" ", "\n") for item in df["name"]], path_effects=drop_shadow)
ax.set_ylim(-0.6, 10.6)

x_ticks = range(0, 100000, 20000)
ax.set_xticks(x_ticks)
ax.set_xticklabels([f"${n:,}" if n > 0 else "0" for n in x_ticks], path_effects=drop_shadow)
ax.set_xlim(0, x_ticks[-1] * 1.03)
ax.set_xlabel("AVERAGE DAILY WINNINGS", path_effects=drop_shadow)

middle_x = mean(ax.get_xlim())
ax.text(middle_x, 11.3, "JEOPARDY! AVERAGE DAILY WINNINGS", size=15, ha="center", path_effects=drop_shadow)
ax.text(middle_x, 10.9, "TOP 10 WIN STREAKS", size=12, ha="center", path_effects=drop_shadow)

for i, row in df.iterrows():
    if row["games_won"] == df["games_won"].max():
        bar_label = f"GAMES WON: {row['games_won']}"
    else:
        bar_label = row["games_won"]
    ax.text(row["average"] - 1000, i, bar_label, size=10, ha="right", va="center")

plt.savefig("jeopardy_average_win.png", dpi=200)