Age Differences of Celebrity Couples

In one of my first posts, I wrote about the age difference in movie couples and showed that the male is, on average, slightly older than his partner. The goal of this article is to find out whether there is a similar trend among real celebrity couples. Even though the age difference in sexual relationships is a frequent subject of proper scientific studies, I thought it would still be interesting to take a closer look at stars’ dating lives.

As a data source, I scraped the website (WDW), which collects information about the dating history of celebrities. Since it is a typical gossip site that also relies on rumors, some of the data is bound to be inaccurate or incomplete. Nevertheless, this should be acceptable for our use case. The scrape yielded data on 53,820 celebrities (dataset: 2016/12/06). To keep the sample relevant, I only included relationships that started in the past 50 years and in which at least one partner was among the 5,000 most searched celebrities (WDW Rank). In addition, I removed every relationship for which the birthdate of either partner was missing, since the age difference cannot be computed without both. After applying all the criteria, the sample consisted of 6,693 relationships.
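The three inclusion criteria above can be sketched as a single filter function. This is only an illustration under stated assumptions: the `Person` and `Relationship` classes, their field names, and the decision to drop relationships with an unknown start date are hypothetical and not taken from the original scraping code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    birthdate: Optional[date]  # None when no birthdate is listed
    rank: Optional[int]        # popularity rank; None when unranked


@dataclass
class Relationship:
    a: Person
    b: Person
    start: Optional[date]      # None when the start date is unknown


def keep(rel, snapshot=date(2016, 12, 6), max_years=50, max_rank=5000):
    """Return True if the relationship satisfies all three sample criteria."""
    # Assumption: relationships without a known start date are dropped.
    if rel.start is None:
        return False
    # Criterion 1: started within the past `max_years` years.
    # (A Feb 29 snapshot date would need special handling here.)
    cutoff = date(snapshot.year - max_years, snapshot.month, snapshot.day)
    if not (cutoff <= rel.start <= snapshot):
        return False
    # Criterion 2: at least one partner is among the top `max_rank`.
    partners = (rel.a, rel.b)
    if not any(p.rank is not None and p.rank <= max_rank for p in partners):
        return False
    # Criterion 3: both birthdates are known.
    return all(p.birthdate is not None for p in partners)
```

Applied to a list of scraped relationships, `[r for r in relationships if keep(r)]` would give the filtered sample.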

While extracting the data from the HTML source was a straightforward process with the Python library Beautiful Soup, it was surprising that WDW doesn’t provide any information about gender. In cases where the sex couldn’t be derived from the occupation (e.g., actor/actress), I determined the gender from the first name and manually checked unclear cases.
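The fallback order described above (occupation first, then first name, then manual review) might look roughly like the following sketch. Both lookup tables are tiny hypothetical placeholders; the actual name-to-gender source used for the article is not reproduced here.

```python
from typing import Optional

# Hypothetical lookup tables, for illustration only.
GENDERED_OCCUPATIONS = {"actor": "male", "actress": "female"}
FIRST_NAME_GENDER = {"alexander": "male", "emma": "female"}


def infer_gender(full_name: str, occupation: Optional[str]) -> Optional[str]:
    """Return 'male', 'female', or None when a manual check is needed."""
    # Step 1: a gendered occupation title settles the question.
    occ = occupation.strip().lower() if occupation else ""
    if occ in GENDERED_OCCUPATIONS:
        return GENDERED_OCCUPATIONS[occ]
    # Step 2: fall back to the first name.
    first = full_name.split()[0].lower() if full_name.strip() else ""
    # Step 3: unknown names return None and are flagged for manual review.
    return FIRST_NAME_GENDER.get(first)
```

Returning `None` instead of guessing keeps the unclear cases visible, which matches the manual check mentioned above.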

To visualize the data, we can use a program like Gephi to plot the relationships as a network. For the following figure, I used Gephi’s ego network filter with a depth setting of 2 to illustrate to whom Alexander Skarsgård, a Swedish actor, was directly and indirectly connected.

Network
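To make the idea of an ego network with depth 2 concrete, here is a minimal breadth-first search sketch in plain Python. It assumes the filter keeps every node within the given number of hops of the ego, and it is not Gephi's implementation. The first two example edges come from the article; the "Person X" and "Person Y" entries are placeholders.

```python
from collections import deque

# The first two edges are mentioned in the article; the rest are placeholders.
EXAMPLE_EDGES = [
    ("Alexander Skarsgård", "Evan Rachel Wood"),
    ("Evan Rachel Wood", "Marilyn Manson"),
    ("Marilyn Manson", "Person X"),
    ("Person X", "Person Y"),
]


def ego_network(edges, ego, depth=2):
    """Return {node: hops from ego} for all nodes within `depth` hops."""
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Breadth-first search that stops expanding at the depth limit.
    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance
```

With the example edges, "Person X" is three hops away from the ego and therefore falls outside a depth-2 network.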

The network figure indicates, for example, that both Alexander Skarsgård and Marilyn Manson, an American singer, dated the actress Evan Rachel Wood.

Next, let’s look at the ages of the partners at the start of their (alleged) relationships. The 2D density plot shows the age of the male partner on the x-axis and the age of the female partner on the y-axis. The plot suggests that, in many cases, the man is slightly older than his female partner, and that this difference grows with the age of the male. Overall, this age disparity is similar to the age difference among married couples in Western countries (Wikipedia).

Couples - Density Plot

To visualize the age difference across age groups, I used the R package yarrr to create the pirate plot below. The plot illustrates that the age disparity in new relationships widens markedly as male celebrities get older. While male stars in their 20s have female partners who are 2.4 years older on average, the women in new relationships with 40- to 49-year-olds and over-50s are about 6.0 and 16.5 years younger, respectively.

Age difference (male age minus female age, in years) by age of the male partner

Age (male)    Mean   Median       n
20-29        -2.39       -1     420
30-39         0.51        1   3,170
40-49         5.99        7   2,130
50+          16.47       17     973
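A summary like the table above could be computed along the following lines. This is a sketch under assumptions: ages are taken as completed years at the start of the relationship, the difference is male age minus female age, and couples with a male partner under 20 are left out because the table has no such band. The input tuple layout is hypothetical.

```python
from datetime import date
from statistics import mean, median


def age_in_years(birthdate: date, on: date) -> int:
    """Completed years of age on the date `on`."""
    years = on.year - birthdate.year
    # Subtract one year if the birthday has not yet occurred in that year.
    if (on.month, on.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def age_band(age: int):
    """Map a male age to the bands used in the table (None if under 20)."""
    if age >= 50:
        return "50+"
    if age >= 40:
        return "40-49"
    if age >= 30:
        return "30-39"
    if age >= 20:
        return "20-29"
    return None  # assumption: not covered by the table


def summarize(couples):
    """couples: iterable of (male_birthdate, female_birthdate, start_date)."""
    groups = {}
    for male_bd, female_bd, start in couples:
        male_age = age_in_years(male_bd, start)
        female_age = age_in_years(female_bd, start)
        band = age_band(male_age)
        if band is not None:
            # Positive values mean the male partner is older.
            groups.setdefault(band, []).append(male_age - female_age)
    return {
        band: {"mean": mean(diffs), "median": median(diffs), "n": len(diffs)}
        for band, diffs in groups.items()
    }
```

A negative mean, as in the 20-29 band, then simply indicates that the female partners are older on average.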

Couples - Pirate Plot

A regression analysis can be applied to examine the relationship between female and male age further. To account for non-linearity automatically, I chose MARS (multivariate adaptive regression splines, via the R package earth). The result shows how the age difference changes with the increasing age of the male partner. The shaded area is the 90% prediction interval. According to the model, the new partner (female) of a 40-year-old male celebrity who starts dating is on average 9.8 years younger.
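MARS captures non-linearity by combining hinge functions of the form max(0, x - t), where t is a knot. The sketch below is a heavily simplified illustration of that idea, not the earth algorithm: it fits a single hinge term by ordinary least squares and picks the knot by grid search. All function names are hypothetical, and any data passed in is up to the reader; nothing here reproduces the article's model.

```python
def hinge(x, knot):
    """Hinge basis function max(0, x - knot)."""
    return max(0.0, x - knot)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_one_knot(xs, ys, knot):
    """Least-squares fit of y = b0 + b1*x + b2*max(0, x - knot).

    Returns (coefficients, sum of squared errors).
    """
    rows = [[1.0, x, hinge(x, knot)] for x in xs]
    # Normal equations (A^T A) beta = A^T y for the 3-column design matrix.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coef = solve(ata, aty)
    sse = sum(
        (y - sum(c * v for c, v in zip(coef, r))) ** 2 for r, y in zip(rows, ys)
    )
    return coef, sse


def fit_best_knot(xs, ys, candidates):
    """Try each candidate knot and return (best_knot, coefficients)."""
    # Only knots strictly inside the data range give a solvable system;
    # the data must also contain at least three distinct x values.
    inside = [k for k in candidates if min(xs) < k < max(xs)]
    fits = [(fit_one_knot(xs, ys, k), k) for k in inside]
    (coef, _sse), knot = min(fits, key=lambda f: f[0][1])
    return knot, coef
```

Here `xs` would hold male ages and `ys` female ages. A real MARS fit adds and prunes many such hinge terms automatically, which is the part this sketch leaves out.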

Couples - Pirate Regression

Overall, the data shows that the often criticized age difference among movie couples may not be too far from reality, or at least the reality of some celebrities. If you have questions or concerns, feel free to write me an email.