Palindrome dates : setad emordnilaP
“What’s a palindrome again? Wait, I think that’s an anagram. I can’t even spell onomatopoeia.”
A palindrome is a word or phrase (or anything, really) that’s written the same forward and backward. For example:
- Racecar
- Madam
- Evil olive
- Step on no pets
Or my favorite, Aibohphobia, the fear of palindromes.
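If you’d rather let a computer settle it, here’s a minimal sketch. The helper name is_palindrome is my own, and it assumes we ignore case and anything that isn’t a letter or digit, which is what lets “Evil olive” pass.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same both ways.

    Assumption for this sketch: case, spaces, and punctuation are ignored.
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


phrases = ["Racecar", "Madam", "Evil olive", "Step on no pets", "Aibohphobia", "Anagram"]
for phrase in phrases:
    print(f"{phrase!r}: {is_palindrome(phrase)}")
```

Everything in the list passes except “Anagram”, which is fitting.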
We can apply this concept to dates as well. Take the next upcoming palindrome date: February 20, 2022. The digits 2 20 2022 read the same in reverse.
A critical observer might point out that not everyone uses a month-day-year date format like us Americans. In fact most of the world outside the United States writes day-month-year. Isn’t the Chinese calendar on 4000-something? What about leading zeroes and a 2-digit year format?
Those are fair points. I have to admit that palindrome dates are an arbitrary, ultimately meaningless concept. But so are a lot of fun things in life and that’s never stopped us from enjoying them before.
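To see just how arbitrary it is, here’s a small sketch that writes February 20, 2022 in a handful of formats and checks each one. The format names in the dictionary are labels I made up for illustration, not any official standard.

```python
from datetime import date

d = date(2022, 2, 20)

# Each entry is one hypothetical way of writing the same date as bare digits.
candidates = {
    "month-day-year, unpadded": f"{d.month}{d.day}{d.year}",
    "month-day-year, 2-digit year": f"{d.month}{d.day}{d.year % 100:02d}",
    "day-month-year, unpadded": f"{d.day}{d.month}{d.year}",
    "month-day-year, zero-padded": f"{d.month:02d}{d.day:02d}{d.year}",
    "year-month-day, zero-padded": f"{d.year}{d.month:02d}{d.day:02d}",
}

# A string is a palindrome if it equals its own reverse.
results = {name: digits == digits[::-1] for name, digits in candidates.items()}

for name, digits in candidates.items():
    print(f"{name:30} {digits:>8}  palindrome: {results[name]}")
```

Only the two unpadded month-day-year strings read the same backward. Change the convention and the “palindrome date” disappears.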
In this post I’ll identify all the palindrome dates from now until the year 3000. Then I’ll plot their frequency and see if any obvious patterns emerge.
The general approach will be:
- Create a datetime object at January 1, 2000.
- Convert it to a string.
- Check if it’s a palindrome. If it is, make a note of it.
- Increment one day and check again.
- Repeat ad infinitum, or rather until 3000 A.D.
Eventually we’ll group palindrome counts into decade-sized bins. pandas makes that part easy. On the other hand, pandas Timestamp objects only count up to the year 2262. Usually that’s okay but it won’t be enough today. Instead we’ll use the datetime class from the datetime module.
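Where does 2262 come from? Here’s a rough sketch using only the standard library. It assumes pandas stores a timestamp as a signed 64-bit count of nanoseconds since January 1, 1970, which is its default resolution as far as I know.

```python
from datetime import datetime, timedelta

# Assumption: a timestamp is a signed 64-bit count of nanoseconds since 1970-01-01.
INT64_MAX = 2**63 - 1
epoch = datetime(1970, 1, 1)

# datetime only resolves microseconds, so drop the last three digits.
latest_ns_timestamp = epoch + timedelta(microseconds=INT64_MAX // 1000)

print("64-bit nanosecond limit:", latest_ns_timestamp)
print("datetime.datetime limit:", datetime.max)
```

The first line lands in April 2262, while the plain datetime class is good through the year 9999. That’s plenty for our purposes.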
1. Hunt for palindrome dates.
Begin with the imports.
    from datetime import datetime, timedelta
    from collections import defaultdict

    import pandas as pd
    import matplotlib.pyplot as plt
The pandas documentation says it’s more efficient to take whole lists of data and create a dataframe all at once, rather than to append new rows to a dataframe over and over again. We’ll use a dictionary to keep a running tally of palindrome date occurrences and worry about a dataframe later.
We can use a defaultdict from the collections module to streamline bookkeeping. This special type of dictionary allows us to reference a key that hasn’t yet been created. Instead of raising an error when we attempt to reference a non-existent key, it creates one and gives it a default value, in this case zero.
annual_count = defaultdict(int)
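To see the difference in isolation, here’s a tiny sketch. The names plain and tally are just for illustration.

```python
from collections import defaultdict

# A regular dict raises KeyError when we increment a key that doesn't exist yet.
plain = {}
try:
    plain[2022] += 1
except KeyError:
    print("plain dict: KeyError for 2022")

# defaultdict(int) creates the missing key with int() == 0, then increments it.
tally = defaultdict(int)
tally[2022] += 1
tally[2022] += 1
print(dict(tally))
```

The plain dictionary stays empty, while tally ends up holding a count of 2 for the year 2022.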
date is a counter variable that is incremented until it reaches whatever endpoint we set. Notice how defaultdict makes things easy. We don’t have to check if a year exists in the dictionary before incrementing it.
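    date = datetime(2000, 1, 1)
    while date <= datetime(2999, 12, 31):
        # Month, day, and year with no zero padding, e.g. "2202022".
        # Built from attributes because strftime("%-m%-d%Y") is
        # platform-specific (it fails on Windows).
        date_string = f"{date.month}{date.day}{date.year}"
        if date_string == date_string[::-1]:
            annual_count[date.year] += 1
        date += timedelta(days=1)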
Now we have all the data we need. The only problem is that it’s in an annual form, which would be difficult to visualize or understand. The next step will be to turn yearly counts into decade-long counts. But first let’s turn the dictionary into a dataframe.
    # Wrap the dict views in list() so pandas receives plain sequences.
    df = pd.DataFrame({"year": list(annual_count.keys()), "palindromes": list(annual_count.values())})
pandas.cut is perfect for this task. The process is essentially like preparing a histogram. We define boundaries between each bin (every 10 years) and check where each row belongs. That information goes into a new column.
    bins = range(2000, 3010, 10)
    labels = range(2005, 3005, 10)
    df.loc[:, "decade"] = pd.cut(df["year"], bins=bins, labels=labels, right=False)
right refers to the upper boundary of each bin. Setting it to False excludes that boundary, so each bin includes its lower edge but not its upper one. If you’re familiar with interval notation it looks like this:
[left, right)
We specify labels because without them every cell in the new column would look like, for example, [2000, 2010). I prefer this cell’s value to be 2005, the middle of the interval. That way when I generate a bar plot the bars will be in the correct place.
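If the half-open bins feel abstract, the same labeling can be reproduced with plain integer arithmetic. This is only a sketch to show what the bins do, not a replacement for pandas.cut, and decade_label is a made-up helper name.

```python
def decade_label(year: int) -> int:
    """Midpoint label of the half-open bin [start, start + 10) that holds year."""
    start = (year // 10) * 10
    return start + 5


# Boundary years are the interesting cases: 2020 belongs to [2020, 2030), not [2010, 2020).
for year in (2010, 2019, 2020, 2029, 2030):
    print(year, "->", decade_label(year))
```

With right=True the bins would be (left, right] instead, and a year like 2020 would land in the 2010s bin.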
At this point df.head(10) looks like this. Notice a new decade column that defines to which decade each row belongs:
       year  palindromes decade
    0  2011            1   2015
    1  2012            1   2015
    2  2013            1   2015
    3  2014            1   2015
    4  2015            1   2015
    5  2016            1   2015
    6  2017            1   2015
    7  2018            1   2015
    8  2019            1   2015
    9  2021            1   2025
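Use groupby to finally generate the information we want: the total number of palindromes in each decade.
as_index is False so that decade doesn’t become an index. Without this argument df2 would be a Series rather than a DataFrame. observed is False so that decades with no palindromes still show up with a count of zero; newer pandas versions warn if it’s left implicit.
    df2 = df.groupby("decade", as_index=False, observed=False)["palindromes"].sum()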
df2.head(10) looks like this:
       decade  palindromes
    0    2005            0
    1    2015            9
    2    2025            9
    3    2035            8
    4    2045            0
    5    2055            0
    6    2065            0
    7    2075            0
    8    2085            0
    9    2095            0
This means there are 9 palindrome dates in the 2010’s, 9 in the 2020’s, 8 in the 2030’s, none in the 2040’s, and so on.
2. Plot the data.
It’s time to plot the data. Although I’d like to use Seaborn’s barplot(), it expects categorical data along the x-axis, which would make it difficult to customize x-ticks. Instead I’ll use Matplotlib directly and apply the built-in seaborn style to achieve roughly the same appearance.
    # The "seaborn" style was renamed "seaborn-v0_8" in Matplotlib 3.6; use whichever exists.
    style = "seaborn-v0_8" if "seaborn-v0_8" in plt.style.available else "seaborn"
    plt.style.use(style)

    fig, ax = plt.subplots(figsize=(14, 6))
    fig.subplots_adjust(left=0.042, right=0.986, top=0.944, bottom=0.096)

    # One bar per decade, centered on the decade's midpoint.
    ax.bar(df2["decade"], df2["palindromes"], width=8)
    ax.set_xticks(range(2000, 3100, 100))
    ax.set_yticks(range(0, 22, 2))
    ax.set_xlim(1975, 3025)
    ax.set_ylim(0, 20.5)

    # Matplotlib falls back to its default font if this one isn't installed.
    font = "Ubuntu Condensed"
    plt.xticks(font=font, fontsize=13)
    plt.yticks(font=font, fontsize=13)
    ax.set_xlabel("Decade", font=font, size=13, labelpad=6)
    ax.set_ylabel("Count", font=font, size=13, labelpad=6)
    ax.set_title("Palindrome Dates Per Decade | 2000-3000 A.D.", font=font, size=15)
    plt.show()
The output:
3. Discussion.
The first things I notice are:
- The 2200’s will be a golden age for palindrome dates.
- They mostly follow a regular pattern and appear in the first few decades of each century.
In total there are 306 palindrome dates during this 1000-year period. If we take a look at the number of digits in each string an interesting pattern emerges:
During the 2200s...
    6 digits: 81
    7 digits: 21
    8 digits: 3
During the other 900 years...
    6 digits: 0
    7 digits: 198
    8 digits: 3
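For anyone who wants to reproduce that breakdown, here’s one way to sketch it with only the standard library. The names length_counts and era are my own, and it builds the date string from the month, day, and year attributes with no zero padding.

```python
from collections import Counter
from datetime import date, timedelta

# Tally palindrome dates by (era, number of digits in the date string).
length_counts = Counter()
day = date(2000, 1, 1)
while day <= date(2999, 12, 31):
    digits = f"{day.month}{day.day}{day.year}"  # unpadded month-day-year
    if digits == digits[::-1]:
        era = "2200s" if 2200 <= day.year <= 2299 else "other"
        length_counts[(era, len(digits))] += 1
    day += timedelta(days=1)

for (era, n_digits), count in sorted(length_counts.items()):
    print(f"{era:6} {n_digits} digits: {count}")
print("total:", sum(length_counts.values()))
```

Because length_counts is a Counter, combinations that never occur (like 6-digit palindromes outside the 2200s) simply read as zero.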
It turns out that the 2200’s have 81 6-digit palindrome dates and the other 900 years experience none. That’s because of the double-2 at the beginning of every year, e.g. 2234. It allows for a symmetric date structure that isn’t possible otherwise.
Take for example June 7, 2276:
6 7 2 2 7 6
It’s only possible for a 6-digit date string to be palindromic if the middle 2 digits are the same. Those are the first two digits of the year, so in this millennium it only occurs during the 2200’s.
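That symmetry means the 6-digit palindromes can be built directly instead of searched for. Here’s a sketch under the same assumptions as the rest of the post (month-day-year, no leading zeros, years 2000 through 2999); the list name six_digit is mine.

```python
from datetime import date

six_digit = []
for month in range(1, 10):    # single-digit months only
    for day in range(1, 10):  # single-digit days only
        # The string is m d Y Y Y Y. Mirroring forces the year to end in
        # "<day><month>", and its first two digits must match each other.
        # Within 2000-2999 that means the year starts with "22".
        year = int(f"22{day}{month}")
        digits = f"{month}{day}{year}"
        if digits == digits[::-1]:
            six_digit.append(date(year, month, day))

print(len(six_digit))
print(date(2276, 6, 7) in six_digit)
```

Nine months times nine days gives the 81 dates counted earlier, and June 7, 2276 is among them.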
As you’d expect, the next millennium will follow a similar pattern with the 3300’s being the most active palindrome century.
So I plan to enjoy all the palindrome dates I can over the next 18 years, because after September 30, 2039 they won’t appear again until the year 2101.
Remember to appreciate that you aren’t a day-month-year person. They’re currently 2 years into an 81-year palindrome drought. Imagine the despair.
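If you want to check the drought for yourself, here’s a sketch. It assumes day-month-year dates are written the same way as the month-day-year strings in this post, with no leading zeros; allow zero padding and the answer changes. The function name dmy_palindromes is hypothetical.

```python
from datetime import date, timedelta


def dmy_palindromes(start: date, end: date):
    """Yield dates whose unpadded day-month-year digits form a palindrome."""
    day = start
    while day <= end:
        digits = f"{day.day}{day.month}{day.year}"
        if digits == digits[::-1]:
            yield day
        day += timedelta(days=1)


hits = list(dmy_palindromes(date(2000, 1, 1), date(2199, 12, 31)))

# Find the palindrome dates on either side of the start of 2021.
last_before = max(d for d in hits if d.year < 2021)
next_after = min(d for d in hits if d.year >= 2021)
gap_years = (next_after - last_before).days / 365.25

print("last:", last_before)
print("next:", next_after)
print("gap in years:", round(gap_years, 1))
```

Under those assumptions the last one was October 9, 2019 and the next is January 10, 2101.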
Full code:
    from datetime import datetime, timedelta
    from collections import defaultdict

    import pandas as pd
    import matplotlib.pyplot as plt

    # Tally palindrome dates per year.
    annual_count = defaultdict(int)
    date = datetime(2000, 1, 1)
    while date <= datetime(2999, 12, 31):
        # Unpadded month, day, and year; strftime("%-m%-d%Y") is platform-specific.
        date_string = f"{date.month}{date.day}{date.year}"
        if date_string == date_string[::-1]:
            annual_count[date.year] += 1
        date += timedelta(days=1)

    df = pd.DataFrame({"year": list(annual_count.keys()), "palindromes": list(annual_count.values())})

    # Assign each year to a half-open decade bin labeled by its midpoint.
    bins = range(2000, 3010, 10)
    labels = range(2005, 3005, 10)
    df.loc[:, "decade"] = pd.cut(df["year"], bins=bins, labels=labels, right=False)

    # observed=False keeps empty decades as zero-count rows.
    df2 = df.groupby("decade", as_index=False, observed=False)["palindromes"].sum()

    # The "seaborn" style was renamed "seaborn-v0_8" in Matplotlib 3.6; use whichever exists.
    style = "seaborn-v0_8" if "seaborn-v0_8" in plt.style.available else "seaborn"
    plt.style.use(style)

    fig, ax = plt.subplots(figsize=(14, 6))
    fig.subplots_adjust(left=0.042, right=0.986, top=0.944, bottom=0.096)
    ax.bar(df2["decade"], df2["palindromes"], width=8)
    ax.set_xticks(range(2000, 3100, 100))
    ax.set_yticks(range(0, 22, 2))
    ax.set_xlim(1975, 3025)
    ax.set_ylim(0, 20.5)

    font = "Ubuntu Condensed"
    plt.xticks(font=font, fontsize=13)
    plt.yticks(font=font, fontsize=13)
    ax.set_xlabel("Decade", font=font, size=13, labelpad=6)
    ax.set_ylabel("Count", font=font, size=13, labelpad=6)
    ax.set_title("Palindrome Dates Per Decade | 2000-3000 A.D.", font=font, size=15)
    plt.show()