Is Christmas music becoming more popular?
I’m not one of those people who makes hating Christmas music an important part of their personality, but I have to be in the right mood, and there quickly comes a time when I’ve had enough. I had a suspicion that Christmas music has crept further into mainstream culture. Let’s see what the data can tell us.
I found a great Christmas music dataset on Kaggle that merges a Billboard Hot 100 dataset with the Wikipedia page for popular Christmas singles. We can use it to visualize Christmas songs’ chart presence over time. Are they becoming more popular? The catch is that this dataset only covers 1958 through 2017, so we’ll miss the past few years, but it should be sufficient for our purposes.
1. Prepare the data.
Start by reading the dataset with pandas.
df = pd.read_csv("christmas_billboard_data.csv")
A full list of columns is below:
url weekid week_position song performer songid instance previous_week_position peak_position weeks_on_chart year month day
Our goal is to measure the presence of Christmas songs over time. The dataset creator already did the hard work of removing non-Christmas music, so we won’t have to worry about filtering. We’ll just need to aggregate the rows that are there.
The column we’re most interested in is week_position. This is a song’s ranking (1-100). I think grouping the data by year will be most appropriate to see trends over 60 years. The holiday season comes once a year, after all, so it makes sense to use the same period.
A lower number indicates higher popularity, so we should invert week_position to get a score where bigger is better. Let’s call the new column popularity_index.
df.loc[:, "popularity_index"] = 101 - df.week_position
For example, topping the Hot 100 at #1 would be worth 100 points. Sneaking in at #100 would be worth just a point.
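To make the inversion concrete, here is a minimal sketch on a few made-up chart rows. The song names and positions are invented for illustration and are not taken from the dataset; only the week_position and popularity_index column names come from this post.

```python
import pandas as pd

# Hypothetical rows for illustration; not real chart data.
toy = pd.DataFrame({
    "song": ["Song A", "Song B", "Song C"],
    "week_position": [1, 50, 100],
})

# Invert the ranking so a better chart position earns more points.
toy["popularity_index"] = 101 - toy["week_position"]

print(toy)
```

The top spot ends up with the most points and the last spot with the fewest, which is the only property we need from the index.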
Obviously Popularity Index uses an arbitrary scale. I think we should take a cue from Google and the way they handle Google Trends data. Regardless of the absolute popularity of any search term, Google transforms the data to use a relative 0-100 scale. Every data point is plotted as a proportion of the series’ maximum value. In other words, peak popularity is always 100 and the rest of the data follows accordingly.
As an example, here’s the search term “bitcoin” over the last five years:
Notice the y-axis goes from 0 to 100. This is always the case with Google Trends plots. We can normalize Popularity Index the same way.
First use a pandas groupby to bundle together each year, then use a sum aggregate function on the popularity_index column.
df2 = df.groupby("year")["popularity_index"].sum().reset_index()
The new dataframe looks like this:
   year  popularity_index
0  1958               295
1  1959               442
2  1960              1332
3  1961              1459
4  1962              1662
But what do 295 or 442 mean? They represent the total presence of Christmas songs in a given year—but those values would seem so arbitrary on a plot.
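If the grouping step feels abstract, here is a small sketch of the same groupby-and-sum pattern on invented weekly entries. The numbers are made up, so the totals are only there to show the mechanics.

```python
import pandas as pd

# Made-up weekly entries: two chart weeks in 1958, three in 1959.
toy = pd.DataFrame({
    "year": [1958, 1958, 1959, 1959, 1959],
    "popularity_index": [60, 40, 90, 10, 50],
})

# Sum each year's points, then turn the year index back into a column.
yearly = toy.groupby("year")["popularity_index"].sum().reset_index()

print(yearly)
```

Each year collapses to a single row whose value is the sum of that year's weekly points, so a year with more chart weeks or better positions gets a bigger total.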
Now we should normalize the data using a 0-100 scale. We’ll call it a Popularity Percentile. Each value is represented as a proportion of the largest value.
df2.loc[:, "popularity_percentile"] = df2.popularity_index / df2.popularity_index.max() * 100
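Here is a rough sketch of the normalization using only the five yearly totals shown earlier. Keep in mind this is an assumption-laden toy: the maximum here is the largest of those five rows, not the true maximum across all 60 years, so the percentages will not match the final plot.

```python
import pandas as pd

# The first five yearly totals from the table above.
yearly = pd.DataFrame({
    "year": [1958, 1959, 1960, 1961, 1962],
    "popularity_index": [295, 442, 1332, 1459, 1662],
})

# Scale every value against the series maximum so the peak is exactly 100.
peak = yearly["popularity_index"].max()
yearly["popularity_percentile"] = yearly["popularity_index"] / peak * 100

print(yearly.round(1))
```

The largest year always lands on 100 and everything else is expressed as a share of it, which is the same idea Google Trends uses.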
2. Plot
I created a Christmas-themed Matplotlib style for the plot. It will be linked at the bottom of this post.
A few quick notes about the Matplotlib code:
- We can pass dataframe columns directly into plotting methods.
- Use set to cover several customizations with a single method.
- We can add small inset images so the plot will be overflowing with Christmas cheer. How you wield such power is up to you and your feelings about Christmas music. Be sure to set annotation_clip=False so Matplotlib can draw outside the axes limits.
- wollen_christmas.mplstyle uses a particular typeface I found online. I don’t want to distribute it but I’ll link it at the bottom of this post.
- It’s good to increase dpi when saving raster images, especially those with small inset images.
plt.style.use("wollen_christmas.mplstyle")

fig, ax = plt.subplots()

ax.bar(df2["year"], df2["popularity_percentile"], width=0.7)

y_ticks = range(0, 110, 10)
ax.set(xticks=range(1955, 2025, 5), xlim=(1954, 2021),
       yticks=y_ticks, ylim=(-1, 101),
       yticklabels=[f"{n}%" for n in y_ticks],
       ylabel="Annual Popularity",
       title="Christmas Song Popularity | Billboard Top 100 | 1958–2017")

inset_image_positions = [(1971, 104), (1973, 104), (2002, 104), (2004, 104)]
for pos in inset_image_positions:
    ab = AnnotationBbox(OffsetImage(plt.imread("snowflake.png"), zoom=0.03), pos, frameon=False, annotation_clip=False)
    ax.add_artist(ab)

plt.savefig("christmas_song_popularity.png", dpi=150)
The output:
To answer the original question: Yes, Christmas music has been gaining popularity in recent years—at least through 2017. It suffered through a 40-year arctic slumber but it has since regained some momentum.
My theory, for what it’s worth, is that Christmas music was genuinely more popular in the mid-20th century. There was less music in general to crowd it out. In the 21st century, significantly more music is being published. Technology makes it easier than ever for artists to get their songs onto listeners’ playlists, and audiences sort into smaller and smaller niches. Now the industry is so diverse that it requires music with an exceptionally broad appeal, like Christmas music, to gain traction on the Billboard Hot 100. Something like Mariah Carey’s infamously irritating All I Want for Christmas Is You!
Or maybe modern Americans just love Christmas music as much as they do comic book movies. It’s hard to say.
Merry Christmas to those who celebrate! Happy Holidays to all! And to all a good night.
Full Code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox

df = pd.read_csv("christmas_billboard_data.csv")

df.loc[:, "popularity_index"] = 101 - df.week_position

df2 = df.groupby("year")["popularity_index"].sum().reset_index()

df2.loc[:, "popularity_percentile"] = df2.popularity_index / df2.popularity_index.max() * 100

plt.style.use("wollen_christmas.mplstyle")

fig, ax = plt.subplots()

ax.bar(df2["year"], df2["popularity_percentile"], width=0.7)

y_ticks = range(0, 110, 10)
ax.set(xticks=range(1955, 2025, 5), xlim=(1954, 2021),
       yticks=y_ticks, ylim=(-1, 101),
       yticklabels=[f"{n}%" for n in y_ticks],
       ylabel="Annual Popularity",
       title="Christmas Song Popularity | Billboard Top 100 | 1958–2017")

inset_image_positions = [(1971, 104), (1973, 104), (2002, 104), (2004, 104)]
for pos in inset_image_positions:
    ab = AnnotationBbox(OffsetImage(plt.imread("snowflake.png"), zoom=0.03), pos, frameon=False, annotation_clip=False)
    ax.add_artist(ab)

plt.savefig("christmas_song_popularity.png", dpi=150)