How much is NBA home court advantage worth?

December 5, 2024 / March 27, 2025 by Jeff Wollen

You’ll often hear people talk about home field advantage in football: the idea that the home crowd, lack of travel, and so on are worth a couple of points in the final score. It’s discussed less often in basketball, but the advantage is just as real.

Let’s look at historical NBA data and measure both the expected advantage according to oddsmakers and the empirical value from scoring data. We’ll plot each season’s average and see how home court has changed over the years.

1. Prepare the data.

I’ll use the Kaggle dataset here. It contains both scoring and betting data beginning with the 2007-2008 season. Load the CSV into a pandas DataFrame and take a look at the relevant columns.

import pandas as pd

df = pd.read_csv("nba_2008-2024.csv")

print(df[['season', 'date', 'away', 'home', 'score_away', 'score_home', 'whos_favored', 'spread']].head())

The output:

   season        date  away home  score_away  score_home whos_favored  spread
0    2008  2007-10-30   por   sa          97         106         home    13.0
1    2008  2007-10-30  utah   gs         117          96         home     1.0
2    2008  2007-10-30   hou  lal          95          93         away     5.0
3    2008  2007-10-31   phi  tor          97         106         home     6.5
4    2008  2007-10-31   wsh  ind         110         119         away     1.5

The whos_favored column is always either “home” or “away” and spread is always a positive number. Notice that season is encoded as the year the season ends, e.g. 2007-08 becomes 2008.
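Those two properties are easy to confirm before relying on them. Here is a minimal sketch that retypes the five rows printed above into a small DataFrame (the name `sample` is just for illustration) and checks both; on the real data you would run the same two checks against `df`.

```python
import pandas as pd

# The five rows from the printed output above, retyped by hand for illustration.
sample = pd.DataFrame({
    "whos_favored": ["home", "home", "away", "home", "away"],
    "spread": [13.0, 1.0, 5.0, 6.5, 1.5],
})

# Every favorite label should be one of the two expected strings.
labels_ok = sample["whos_favored"].isin(["home", "away"]).all()

# Spreads are stored unsigned, so every value should be positive.
spreads_ok = (sample["spread"] > 0).all()

print(labels_ok, spreads_ok)
```

If either check came back False on the full file, the sign logic used later for the point spread would need a closer look.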

We’re interested first in real-world home court advantage according to the final score. Create a new column to hold this data and later we’ll calculate each season’s average.

df.loc[:, 'home_score_margin'] = df['score_home'] - df['score_away']

We’re also interested in the expected advantage according to the point spread. Just like before, we want a positive value to indicate that the home team is favored and a negative value to indicate an away favorite.

df.loc[:, 'home_favored_by'] = df.apply(lambda row: row['spread'] if row['whos_favored'] == "home" else row['spread'] * -1, axis=1)
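The apply call works, but it runs the lambda once per row. As an alternative, the same signed spread can be computed in one vectorized step with `numpy.where`. This is only a sketch, using the five rows printed earlier (retyped into a hypothetical `sample` DataFrame) and assuming whos_favored never holds anything other than "home" or "away".

```python
import numpy as np
import pandas as pd

# The five rows from the printed output earlier, retyped by hand for illustration.
sample = pd.DataFrame({
    "whos_favored": ["home", "home", "away", "home", "away"],
    "spread": [13.0, 1.0, 5.0, 6.5, 1.5],
})

# Keep the spread as-is for home favorites and flip its sign for away favorites.
sample["home_favored_by"] = np.where(
    sample["whos_favored"] == "home", sample["spread"], -sample["spread"]
)

print(sample["home_favored_by"].tolist())
# [13.0, 1.0, -5.0, 6.5, -1.5]
```

On a dataset this size either version finishes quickly, so the choice is mostly a matter of taste.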

I’m certainly not a pandas expert but I’ve learned that groupby, while intimidating at first, is a powerful method to have in your arsenal. A few years ago it would have been tempting to iterate through each season, filter the DataFrame, and calculate averages individually. But that’s less efficient and (more importantly to me) a lot more work.

It helps me to think about what I’m doing in plain English. We want to group the data into separate buckets according to the season column, so that’s the argument passed to groupby. We’re interested in the two recently created columns so those are referenced as a list. agg applies a function to each bucket of data.

df2 = df.groupby("season")[['home_score_margin', 'home_favored_by']].agg("mean")

print(df2.head())

Things become clearer when you look at the new DataFrame.

        home_score_margin  home_favored_by
season                                    
2008             3.712766         3.476064
2009             3.309506         3.377567
2010             2.842226         3.323933
2011             3.194508         3.271930
2012             2.965549         3.167132

The DataFrame’s index is season and it holds yearly averages for each column. That’s everything we need. Next we can plot the data.
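As an aside, if you want to convince yourself that groupby really does the same job as the loop-and-filter approach mentioned earlier, here is a small sketch on a toy DataFrame. The four rows and the names `toy`, `by_loop`, and `by_groupby` are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Toy data: two games in each of two seasons (illustrative values only).
toy = pd.DataFrame({
    "season": [2008, 2008, 2009, 2009],
    "home_score_margin": [9, -21, 4, 2],
})

# Loop-and-filter: filter the DataFrame once per season and average by hand.
by_loop = {}
for season in toy["season"].unique():
    season_rows = toy.loc[toy["season"] == season, "home_score_margin"]
    by_loop[int(season)] = float(season_rows.mean())

# groupby: one call that builds the same buckets and averages each one.
by_groupby = toy.groupby("season")["home_score_margin"].mean()

print(by_loop)
print(by_groupby.to_dict())
```

Both print the same season-to-average mapping; groupby just gets there in one line.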

2. Plot the data.

I’ll use a custom Matplotlib style I created to emulate FiveThirtyEight. It won’t actually look like a FiveThirtyEight plot because it will use NBA-themed colors, but it provides a good blank slate that’s less off-putting than default Matplotlib.

Create an Axes instance and pass the appropriate df2 columns to scatter. Remember that x-axis data, season, is the DataFrame’s index.

import matplotlib.pyplot as plt
from numpy import arange

plt.style.use("wollen_538.mplstyle")
fig, ax = plt.subplots()

ax.scatter(df2.index, df2['home_score_margin'],
           color="#DB132E", marker="h", s=110,
           edgecolor="#555", linewidth=1.0,
           label="Score Margin")

ax.scatter(df2.index, df2['home_favored_by'],
           color="#00418D", marker="D", s=60,
           edgecolor="#555", linewidth=1.0,
           label="Favored By")

We could hard-code ticks and window limits and it would require fewer lines of code, but I generally try to avoid it. It will be easier to reuse this script in a year or two when I return with new data.

NBA seasons span two calendar years so let’s communicate that along the x-axis. That means labels take up more space so let’s also rotate them 60 degrees.

x_ticks = range(df2.index.min(), df2.index.max() + 1)
ax.set_xticks(x_ticks, labels=[f"{n - 1}-{n - 2000:02}" for n in x_ticks])
plt.setp(ax.xaxis.get_majorticklabels(), rotation=60, ha="right", rotation_mode="anchor")
x_tick_range = x_ticks[-1] - x_ticks[0]
ax.set_xlim(x_ticks[0] - x_tick_range * 0.03, x_ticks[-1] + x_tick_range * 0.02)
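The label f-string is the least obvious part of that snippet, so here it is pulled out on its own. The helper name `season_label` is hypothetical, and the formatting assumes end years between 2000 and 2099, which covers this dataset.

```python
def season_label(end_year: int) -> str:
    """Turn a season's end year into a label such as '2007-08'."""
    # Same f-string as the tick labels: the start year, a hyphen,
    # then the end year's last two digits zero-padded to width 2.
    return f"{end_year - 1}-{end_year - 2000:02}"


print([season_label(n) for n in (2008, 2010, 2024)])
# ['2007-08', '2009-10', '2023-24']
```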

Identify the bottom and top y-ticks with a pair of while loops that step in half-point increments. We can staple the two columns together with concat to make sure we consider the overall minimum and maximum values.

bottom_y_tick = 10.0
while bottom_y_tick > pd.concat([df2['home_score_margin'], df2['home_favored_by']]).min():
    bottom_y_tick -= 0.5
top_y_tick = 0.0
while top_y_tick < pd.concat([df2['home_score_margin'], df2['home_favored_by']]).max():
    top_y_tick += 0.5
y_ticks = arange(bottom_y_tick, top_y_tick + 0.5, 0.5)
ax.set_yticks(y_ticks)
y_tick_range = y_ticks[-1] - y_ticks[0]
ax.set_ylim(y_ticks[0] - y_tick_range * 0.03, y_ticks[-1] + y_tick_range * 0.005)
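The while loops assume every average falls between 0 and 10, since that is where the two counters start. A possible alternative, sketched below, rounds the minimum down and the maximum up to the nearest half point directly. The helper `tick_bounds` is a hypothetical name, and the two sample values are the smallest and largest home_score_margin averages from the table printed earlier.

```python
import math


def tick_bounds(values, step=0.5):
    """Return (bottom, top) multiples of `step` that enclose all values."""
    lowest, highest = min(values), max(values)
    # Round the minimum down and the maximum up to the nearest multiple of step.
    bottom = math.floor(lowest / step) * step
    top = math.ceil(highest / step) * step
    return bottom, top


print(tick_bounds([2.842226, 3.712766]))
# (2.5, 4.0)
```

For values inside the 0 to 10 range this should land on the same bottom and top ticks as the loops, and it keeps working if an average ever drops below zero.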

Finally, create a legend in the upper-right corner, set plot labels, and save the figure.

ax.legend(loc="upper right")

ax.set_ylabel("Points")
ax.set_title("NBA  •  Home Court Advantage")

plt.savefig("nba_hca.png", dpi=200)

3. The output.

To answer the original question, NBA home court advantage is worth about 2 to 2.5 points.

What’s interesting to me is how clearly the advantage has trended down over the past 17 years. 2019-20 and 2020-21 were affected by the COVID-19 “bubble” and reduced crowd sizes. But even if you throw out those seasons, home court has lost a full point of value.
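One way to put a number on a trend like this is a straight-line fit through the season averages. The sketch below uses `numpy.polyfit` on only the five home_score_margin averages printed earlier (2008 through 2012), so it illustrates the method rather than measuring the full 17-year trend. With the real data you would pass `df2.index` and `df2['home_score_margin']` instead.

```python
import numpy as np

# Season averages copied from the printed table (first five seasons only).
seasons = np.array([2008, 2009, 2010, 2011, 2012])
margins = np.array([3.712766, 3.309506, 2.842226, 3.194508, 2.965549])

# Fit margin = slope * years_elapsed + intercept.
# Counting years from the first season keeps the fit numerically well-behaved.
slope, intercept = np.polyfit(seasons - seasons[0], margins, 1)

print(f"Change per season: {slope:.3f} points")
```

A negative slope means the average home margin shrank over those seasons. Five data points are far too few to draw conclusions from, which is why the plot of every season is the better guide.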

I don’t think there’s any clear answer as to why this happened. Are home crowds really less rowdy than they were 20 years ago? I’m sure you could find grumpy fans who insist that people are too busy playing on their phones to be loud. I would look more toward innovation in travel methods. Teams have better optimized routines that help them arrive healthy and ready to perform. In addition, salaries have grown so players are more incentivized to take those routines seriously.

Still, home court will always have some positive value. We’ll have to circle back in a few years to see where the trend has leveled off.

Download the Matplotlib style.

Full code:

import pandas as pd
import matplotlib.pyplot as plt
from numpy import arange


df = pd.read_csv("nba_2008-2024.csv")

print(df[['season', 'date', 'away', 'home', 'score_away', 'score_home', 'whos_favored', 'spread']].head())

df.loc[:, 'home_score_margin'] = df['score_home'] - df['score_away']

df.loc[:, 'home_favored_by'] = df.apply(lambda row: row['spread'] if row['whos_favored'] == "home" else row['spread'] * -1, axis=1)

df2 = df.groupby("season")[['home_score_margin', 'home_favored_by']].agg("mean")

print(df2.head())

plt.style.use("wollen_538.mplstyle")
fig, ax = plt.subplots()

ax.scatter(df2.index, df2['home_score_margin'],
           color="#DB132E", marker="h", s=110,
           edgecolor="#555", linewidth=1.0,
           label="Score Margin")

ax.scatter(df2.index, df2['home_favored_by'],
           color="#00418D", marker="D", s=60,
           edgecolor="#555", linewidth=1.0,
           label="Favored By")

x_ticks = range(df2.index.min(), df2.index.max() + 1)
ax.set_xticks(x_ticks, labels=[f"{n - 1}-{n - 2000:02}" for n in x_ticks])
plt.setp(ax.xaxis.get_majorticklabels(), rotation=60, ha="right", rotation_mode="anchor")
x_tick_range = x_ticks[-1] - x_ticks[0]
ax.set_xlim(x_ticks[0] - x_tick_range * 0.03, x_ticks[-1] + x_tick_range * 0.02)

bottom_y_tick = 10.0
while bottom_y_tick > pd.concat([df2['home_score_margin'], df2['home_favored_by']]).min():
    bottom_y_tick -= 0.5
top_y_tick = 0.0
while top_y_tick < pd.concat([df2['home_score_margin'], df2['home_favored_by']]).max():
    top_y_tick += 0.5
y_ticks = arange(bottom_y_tick, top_y_tick + 0.5, 0.5)
ax.set_yticks(y_ticks)
y_tick_range = y_ticks[-1] - y_ticks[0]
ax.set_ylim(y_ticks[0] - y_tick_range * 0.03, y_ticks[-1] + y_tick_range * 0.005)

ax.legend(loc="upper right")

ax.set_ylabel("Points")
ax.set_title("NBA  •  Home Court Advantage")

plt.savefig("nba_hca.png", dpi=200)
