The World Series of Poker Main Event
I’m a week later than usual on this post because I wanted to wait for the World Series of Poker (WSOP) to finalize Main Event payouts. Registration is closed and this year’s tournament is the third-largest to date. The final player standing in this year’s 9,735-person field will receive an even $10 million. 9th place—the player who barely squeezes into the final table—will take home $1 million.
In this post I want to look at how the prize pool is distributed among final table participants. Typically as fields grow, top finishers receive a larger prize in dollar terms but a smaller share of the overall prize pool. That allows tournament directors to spread the cash around and keep everyone happy.
The $10,000 Main Event buy-in hasn’t budged in over 50 years. With inflation, that means the entry fee becomes more affordable every year. So it’s no surprise that fields have been hitting all-time highs recently.
What does that mean for the prize pool distribution? Let’s take a look at the data and find out.

1. Prepare the data.
Download this Main Event results dataset from Kaggle and read it into a pandas DataFrame. I put the dataset together a few months ago thinking I could continue updating it every year, but the WSOP has since migrated all results onto their new app, which makes scraping much more difficult. So the dataset may be frozen at 2024, but we can still tack on the 2025 final table and analyze what’s there.
df = pd.read_csv("wsop_main_event_results_1971-2024.csv")
Columns include:
- year
- place
- name
- prize
- city
- state
- country
- entries
- prizepool
- buyin
- start
- finish
Let’s append the 2025 final table to the bottom of the DataFrame. We don’t yet know who will be there but we know prize amounts, which is enough to plot the prize pool distribution.
The proper way to append is to create a new DataFrame and concatenate it (concat) with the original. Not all columns need to be present in the temporary DataFrame, so it won’t matter that we don’t know city, state, etc. Those columns will simply contain NaNs.
df2 = pd.DataFrame({"year": [2025 for _ in range(9)],
                    "place": range(1, 10),
                    "prize": [10e6, 6e6, 4e6, 3e6, 2.4e6, 1.9e6, 1.5e6, 1.25e6, 1e6],
                    "entries": [9735 for _ in range(9)],
                    "prizepool": [90.5355e6 for _ in range(9)]})
df = pd.concat([df, df2])
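To check the claim that missing columns come through as NaN, here is a minimal sketch with two toy DataFrames. The names full and partial, and the values in them, are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Toy frames, invented for illustration only.
full = pd.DataFrame({"year": [2024], "place": [1], "city": ["Somewhere"]})
partial = pd.DataFrame({"year": [2025], "place": [1]})  # no "city" column

# ignore_index=True renumbers the rows; it is optional and only used here
# to keep the toy output tidy.
combined = pd.concat([full, partial], ignore_index=True)

# The row that came from `partial` has NaN in the column it lacked.
print(combined)
print(combined["city"].isna().tolist())
```

The second print should flag only the 2025 row as missing a city, which is the behavior the real append relies on.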
We’re analyzing how payouts are distributed, i.e. how top-heavy the structure is. Divide the prize column by the overall prize pool and multiply by 100 to create a percentage column.
Of course, you don’t have to multiply by 100. You could leave the values as proportions, or multiply the y-tick labels by 100 later.
df.loc[:, 'pct_of_pool'] = df['prize'] / df['prizepool'] * 100
Filter the DataFrame to begin with 2003, the year Chris Moneymaker stole the show. His win single-handedly reinvented the WSOP so it doesn’t really make sense to compare earlier years.
Also limit rows to final table positions, 1st through 9th.
df = df[df['year'] >= 2003]
df = df[df['place'] <= 9]
df.head(15) looks like this:
     year  place              name      prize          city  state  country  entries   prizepool    buyin       start      finish  pct_of_pool
653  2003      1  Chris Moneymaker  2500000.0      Lakeland     TN       US      839   7802700.0  10000.0  2003-05-19  2003-05-23    32.040191
654  2003      2       Sammy Farha  1300000.0       Houston     TX       US      839   7802700.0  10000.0  2003-05-19  2003-05-23    16.660899
655  2003      3    Dan Harrington   650000.0  Santa Monica     CA       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     8.330450
656  2003      4      Jason Lester   440000.0      Aventura     FL       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     5.639074
657  2003      5  Tomer Benvenisti   320000.0     Las Vegas     NV       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     4.101144
658  2003      6       Amir Vahedi   250000.0             -      -       IR      839   7802700.0  10000.0  2003-05-19  2003-05-23     3.204019
659  2003      7         Young Pak   200000.0    Bainbridge     WA       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     2.563215
660  2003      8        David Grey   160000.0     Henderson     NV       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     2.050572
661  2003      9      David Singer   120000.0     Las Vegas     NV       US      839   7802700.0  10000.0  2003-05-19  2003-05-23     1.537929
716  2004      1       Greg Raymer  5000000.0       Raleigh     NC       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28    20.640346
717  2004      2    David Williams  3500000.0     Las Vegas     NV       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28    14.448242
718  2004      3        Josh Arieh  2500000.0    Alpharetta     GA       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28    10.320173
719  2004      4    Dan Harrington  1500000.0  Santa Monica     CA       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28     6.192104
720  2004      5      Glenn Hughes  1100000.0    Scottsdale     AZ       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28     4.540876
721  2004      6           Al Krux   800000.0  Feyetteville     NY       US     2576  24224400.0  10000.0  2004-05-22  2004-05-28     3.302455
You can see that Moneymaker’s $2.5 million prize was about 32% of the prize pool. The next year, Greg Raymer won a much bigger tournament but his $5 million was just 21% of the prize pool. All else equal, larger field sizes produce a flatter (less top-heavy) payout structure.
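The two percentages quoted above are easy to verify by hand. Here is a small sketch that redoes the arithmetic using the prize and prize pool figures from the head output; the variable names are my own.

```python
# Figures copied from the df.head(15) output above.
moneymaker_prize, pool_2003 = 2_500_000, 7_802_700
raymer_prize, pool_2004 = 5_000_000, 24_224_400

pct_2003 = moneymaker_prize / pool_2003 * 100
pct_2004 = raymer_prize / pool_2004 * 100

print(f"2003 winner's share: {pct_2003:.2f}%")
print(f"2004 winner's share: {pct_2004:.2f}%")
```

Raymer's prize doubled Moneymaker's, but the pool roughly tripled, so his share of it fell.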
2. Plot the data.
To visualize the distribution and how it’s changed over time, we’ll create a stack plot (example from the documentation below). A stack plot allows you to quickly see how the total has changed, e.g. how much money is reserved for final table participants, as well as how components have changed relative to each other, e.g. how steep the drop is from 1st place to 2nd and so on.
Stack plots aren’t perfect. It can be difficult to judge whether an individual component’s share has risen or fallen from year to year. One might stay constant while neighboring sections change erratically, and all the movement can play tricks on your eyes. So, as always, use with caution.
A good way to prepare data for a stack plot is to create a pivot table. A pivot table takes two categorical variables (in this case year and finishing position) and generates a grid to display the values for every combination of the two variables.
We have 23 years in our DataFrame with 9 places each, so pivot_table will create a 23-by-9 grid with 23*9=207 pct_of_pool values.
df = df.pivot_table(index="year", columns="place", values="pct_of_pool").reset_index()
It’s a little tricky but we call reset_index to avoid year becoming the DataFrame’s index. That will be important when we hand df to Matplotlib in a moment.
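If the pivot step feels abstract, here is a minimal sketch on a toy long-format DataFrame with two years and three places. The values are rounded from the table earlier in the post, and the names long_df and wide are my own.

```python
import pandas as pd

# Toy long-format data: 2 years x 3 places (values rounded from the post).
long_df = pd.DataFrame({
    "year": [2003, 2003, 2003, 2004, 2004, 2004],
    "place": [1, 2, 3, 1, 2, 3],
    "pct_of_pool": [32.0, 16.7, 8.3, 20.6, 14.4, 10.3],
})

wide = long_df.pivot_table(index="year", columns="place", values="pct_of_pool")
grid_shape = wide.shape
print(grid_shape)          # one row per year, one column per place
print(list(wide.columns))  # year is the index here, not a column

wide = wide.reset_index()
print(list(wide.columns))  # year is now an ordinary column
```

After reset_index, year sits alongside the place columns, which is what lets us select it with df['year'] later on.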
The new pivot table’s df.head() is shown below. You can safely ignore the left-most column, labeled place. It is just the row index. Finishing positions are organized in columns. For example, 1st place received 32% of the prize pool in 2003, 21% in 2004, and so on, just as we observed above. The same goes for 2nd through 9th place. If you squint you can see this starting to become a stack plot.
place  year          1          2          3          4          5          6          7          8          9
0      2003  32.040191  16.660899   8.330450   5.639074   4.101144   3.204019   2.563215   2.050572   1.537929
1      2004  20.640346  14.448242  10.320173   6.192104   4.540876   3.302455   2.786447   2.373640   1.941844
2      2005  14.199541   8.046406   4.733180   3.786544   3.313226   2.839908   2.461254   2.177263   1.893272
3      2006  14.543311   7.395878   4.997215   4.397549   3.897828   3.398106   2.898385   2.398663   1.898942
4      2007  13.799459   8.097323   5.098315   3.098975   2.099306   1.599471   1.179610   0.979676   0.879710
I’ll use a custom WSOP-themed mplstyle that will be linked at the bottom of this post.
Create Figure and Axes objects for plotting.
plt.style.use("wollen_wsop.mplstyle")
fig, ax = plt.subplots()
With Matplotlib’s stackplot, you can either pass a bunch of individual lists or pass one list along with a 2-D type, like a pandas DataFrame, and let the library handle it. If you pass a DataFrame, Matplotlib expects the data to be arranged in rows. Our pivot table is arranged in columns so we need to transpose it with .T. This attribute swaps rows and columns, so each finishing position becomes a row.
Notice that the first argument is the year column. The second argument is everything except year.
ax.stackplot(df['year'], df.drop("year", axis=1).T, alpha=0.8)
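To see that the two calling styles are interchangeable, here is a rough sketch on a toy pivot-style DataFrame (values rounded from the table above). The name toy and the Agg backend line are my own additions so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy pivot-style frame: one column per finishing position.
toy = pd.DataFrame({"year": [2003, 2004, 2005],
                    1: [32.0, 20.6, 14.2],
                    2: [16.7, 14.4, 8.0]})

fig, (ax1, ax2) = plt.subplots(1, 2)

# Option A: one sequence per layer.
layers_a = ax1.stackplot(toy["year"], toy[1], toy[2])

# Option B: a single 2-D object with one ROW per layer, hence the transpose.
layers_b = ax2.stackplot(toy["year"], toy.drop("year", axis=1).T)

# stackplot returns one polygon collection per layer.
print(len(layers_a), len(layers_b))
```

Both calls should draw the same two layers. Option B just saves you from listing nine columns by hand.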
That takes care of the substance. Now we can deal with style.
x-ticks and y-ticks can be hard-coded. I hate doing it but we already typed 2025 figures above. For the y-ticks, pass in a list of strings with percentage signs.
Yearly x-ticks will be a little cramped on the horizontal axis so let’s rotate them. The easiest way is to call plt.setp().
ax.set(xticks=range(2003, 2026), xlim=(2002.6, 2025.4),
       yticks=range(0, 90, 10), yticklabels=[f"{n}%" for n in range(0, 90, 10)], ylim=(0, 80.5))
plt.setp(ax.xaxis.get_majorticklabels(), rotation=60, ha="right", rotation_mode="anchor")
I want to label each layer of the stack plot. The loop below steps through and calculates the middle y value of each. It calls ax.text and labels the finishing position immediately to its right at x=2025.1.
total = 0
for ordinal, place in zip(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th"], range(1, 10)):
    ax.text(x=2025.1, y=total + df[place].iloc[-1] / 2, s=ordinal, size=8, va="center")
    total += df[place].iloc[-1]
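The label positions are just the midpoints between running totals. Here is a standalone sketch of that arithmetic using the 2025 prizes and prize pool typed earlier in the post. It uses itertools.accumulate rather than a manual total, which is a swapped-in but equivalent approach, and the variable names are my own.

```python
from itertools import accumulate

# 2025 final table prizes and prize pool, as typed earlier in the post.
prizes = [10e6, 6e6, 4e6, 3e6, 2.4e6, 1.9e6, 1.5e6, 1.25e6, 1e6]
prizepool = 90.5355e6

pcts = [p / prizepool * 100 for p in prizes]

# The top edge of each layer is the running total; the bottom edge is the
# previous running total (0 for the first layer).
tops = list(accumulate(pcts))
bottoms = [0.0] + tops[:-1]
midpoints = [(lo + hi) / 2 for lo, hi in zip(bottoms, tops)]

for place, mid in enumerate(midpoints, start=1):
    print(f"place {place}: label at y = {mid:.2f}")

print(f"final table share of the pool: {tops[-1]:.2f}%")
```

The last running total doubles as the height of the whole stack in 2025, i.e. the share of the prize pool reserved for the final table.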
I’ve been trying to be a little more creative with my titles. It’s popular to have a title and subtitle left-aligned above the plot. Let’s call ax.text twice, making the main title bolder and slightly larger than the subtitle.
ax.text(x=2003, y=83, s="World Series of Poker • Main Event", weight="bold", size=11)
ax.text(x=2003, y=81.2, s="Final Table payouts as a percentage of overall prize pool", size=10)
As a final touch, include the WSOP logo at the top-right corner. I find AnnotationBbox is the cleanest way to draw an image. You can easily specify zoom and alpha (transparency). box_alignment locates the image relative to xy.
ab = AnnotationBbox(OffsetImage(plt.imread("wsop_logo.png"), zoom=0.25, alpha=0.05),
                    xy=(2024.9, 78.8), box_alignment=(1, 1), frameon=False)
ax.add_artist(ab)
Save the figure with a bumped dpi.
plt.savefig("wsop_ft_stackplot.png", dpi=200)
3. The output.
In this case, I think it’s okay to use two alternating colors. If we were comparing nine different types of fruit, for example, it would be better to assign each a different color.
There are no big surprises in 2025. The field is slightly smaller and the winner is set to earn a slightly larger percentage of the prize pool. But generally speaking, the payout structure has become flatter over the last 20 years, which is what we expected to see.
Peter Eastgate’s 2008 win really stands out. Field size was about 20% off its 2006 peak. I wonder if WSOP officials sensed the Moneymaker Boom slipping away and reached to try and manufacture some excitement. Whatever the reason, great fortune for Eastgate.
4. Further reading.
That’s cool and stack plots are technically interesting, but there’s more to learn about the Main Event by looking at straightforward line graphs and adjusting for inflation. Below are a few slices of the data presented without code.
In 1972, the Main Event entry fee was about $77,000, adjusted for inflation. $10,000 is nothing to sneeze at. It’s a bigger tournament than most people will play in their lives. But it’s less nosebleed stakes than it was a couple generations ago. Chris Moneymaker paid nearly twice as much to enter in 2003.
In this plot you can see the Main Event reflect the broader economy. Entry count spun its wheels for several years following the Great Recession. Then it took off like a rocket in the late 2010s, only to be interrupted by COVID-19. The field had mostly gotten back on trend by 2023.
Here it gets interesting. There’s a lot of movement but it seems that overall prize pool, adjusted for inflation, has trended down since its mid-2000s peak. Growing field sizes and inflation are nearly a wash, but not quite. At best, prize pool has been flat.
1st place money, adjusted for inflation, has clearly trended down. Jamie Gold’s 2006 win remains just as ridiculous as it was back then: over $18 million. Stack it up against the $10 million someone will win next week. Not that I would complain.
Last and perhaps most revealing, I plotted the rake (tournament fees) collected by the WSOP. Despite overall prize pools trending down since 2006, the people running the tournament are making more money.
When the Main Event returned to normal form in 2021, Caesars expected entry count to fall. They compensated by increasing their cut from 6% to 6.75%. Call it a pandemic premium to ensure safety, fine. But when entries rocketed to an all-time high in 2023, they didn’t reverse course. They hiked it again to an even 7%. That was Caesars’ final year running the WSOP before selling to NSUS, so maybe they took a little extra on their way out the door.
Unsurprisingly, NSUS has maintained the same 7% rake. I won’t hold my breath for it to fall.
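You can back out the 7% figure from numbers already in this post. The sketch below assumes the rake is simply the gap between total buy-ins and the published prize pool; that is my simplification, and the variable names are my own.

```python
# 2025 figures from the post: entries, buy-in, and published prize pool.
entries = 9_735
buyin = 10_000
prizepool = 90_535_500

gross = entries * buyin   # total money paid in by players
rake = gross - prizepool  # assumed: everything not in the prize pool is rake
rake_pct = rake / gross * 100

print(f"gross buy-ins: ${gross:,}")
print(f"withheld:      ${rake:,}")
print(f"rake:          {rake_pct:.2f}%")
```

In other words, roughly $700 of every $10,000 entry never reaches the prize pool.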
Full code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox

pd.set_option("display.expand_frame_repr", False)

df = pd.read_csv("wsop_main_event_results_1971-2024.csv")

df2 = pd.DataFrame({"year": [2025 for _ in range(9)],
                    "place": range(1, 10),
                    "prize": [10e6, 6e6, 4e6, 3e6, 2.4e6, 1.9e6, 1.5e6, 1.25e6, 1e6],
                    "entries": [9735 for _ in range(9)],
                    "prizepool": [90.5355e6 for _ in range(9)]})
df = pd.concat([df, df2])

df.loc[:, 'pct_of_pool'] = df['prize'] / df['prizepool'] * 100

df = df[df['year'] >= 2003]
df = df[df['place'] <= 9]

df = df.pivot_table(index="year", columns="place", values="pct_of_pool").reset_index()

plt.style.use("wollen_wsop.mplstyle")
fig, ax = plt.subplots()

ax.stackplot(df['year'], df.drop("year", axis=1).T, alpha=0.8)

ax.set(xticks=range(2003, 2026), xlim=(2002.6, 2025.4),
       yticks=range(0, 90, 10), yticklabels=[f"{n}%" for n in range(0, 90, 10)], ylim=(0, 80.5))
plt.setp(ax.xaxis.get_majorticklabels(), rotation=60, ha="right", rotation_mode="anchor")

total = 0
for ordinal, place in zip(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th"], range(1, 10)):
    ax.text(x=2025.1, y=total + df[place].iloc[-1] / 2, s=ordinal, size=8, va="center")
    total += df[place].iloc[-1]

ax.text(x=2003, y=83, s="World Series of Poker • Main Event", weight="bold", size=11)
ax.text(x=2003, y=81.2, s="Final Table payouts as a percentage of overall prize pool", size=10)

ab = AnnotationBbox(OffsetImage(plt.imread("wsop_logo.png"), zoom=0.25, alpha=0.05),
                    xy=(2024.9, 78.8), box_alignment=(1, 1), frameon=False)
ax.add_artist(ab)

plt.savefig("wsop_ft_stackplot.png", dpi=200)