CrossFit 2015 Leaderboard

by Vitor Bernardes

Introduction to CrossFit and the Data Set

CrossFit is a popular fitness program and fitness sport created in 2000. It combines elements of aerobic exercise, calisthenics (body weight exercises), and Olympic weightlifting with the goal of improving overall fitness.

On the sport side, CrossFit has held an annual competition since 2007, open to athletes from all over the world, called the CrossFit Games. The Games have three stages of qualification: the Open, Regionals, and the Games themselves.

The Open, which gets its name from the fact that participation is open to anyone, is held over five weeks at the beginning of the competition season. Each week features a workout that athletes must complete. Athletes can complete the workout at their local box (as CrossFit gyms are called) and submit their scores online. The workouts are referenced by their year and a number corresponding to the order in which they were presented. For example, the first workout of the 2015 Open is called 15.1, the second one is called 15.2, and so forth.

The data set we are going to analyze is the 2015 Open leaderboard. It contains data from athletes from all over the world and the results they submitted for each completed workout.

Summary of the Data Set

Let’s review the data set we are working with.

## [1] 250717

As we can see, it contains observations on just over 250,000 athletes who competed in the 2015 Open.
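
The count above is consistent with a data set stored in long format, with one row per athlete per workout. The sketch below shows how such a count could be obtained; the `leaderboard` data frame and its values are hypothetical stand-ins, and only the column names `athlete_id`, `stage`, and `score` come from the actual data.

```r
# Hypothetical miniature of the leaderboard: one row per athlete per workout.
leaderboard <- data.frame(
  athlete_id = c(1690, 1690, 1998, 1998, 2206),
  stage      = c(1, 2, 1, 2, 1),
  score      = c(180, 120, 150, 90, 200)
)

# Rows count athlete-workout pairs, not athletes.
n_rows <- nrow(leaderboard)

# Distinct athletes are counted through their unique identifiers.
n_athletes <- length(unique(leaderboard$athlete_id))

print(c(rows = n_rows, athletes = n_athletes))
```

Counting unique identifiers rather than rows avoids counting the same athlete once for every workout he or she completed.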

Let’s see what data we have about each athlete.

## 'data.frame':    1504303 obs. of  28 variables:
##  $ division  : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
##  $ stage     : num  5 5 5 5 5 5 5 5 5 5 ...
##  $ athlete_id: int  1690 1998 2206 2559 2811 3008 3021 3938 4006 4156 ...
##  $ rank      : int  154 5950 768 294 1946 245 1105 2260 880 2989 ...
##  $ score     : int  366 497 404 379 437 375 415 444 408 457 ...
##  $ howlong   : Factor w/ 5 levels "Less than 6 months|",..: 4 4 NA 5 4 4 1 4 NA 4 ...
##  $ category  : Factor w/ 2 levels "Rx","Scaled": 1 1 1 1 1 1 1 1 1 1 ...
##  $ scaled    : Factor w/ 2 levels "false","true": 1 1 1 1 1 1 1 1 1 1 ...
##  $ name      : Factor w/ 236538 levels " Bill Sheehan",..: 227131 109885 166370 215890 29501 31246 235109 92131 1300 165324 ...
##  $ region    : Factor w/ 18 levels "","Africa","Asia",..: 16 18 10 8 12 10 3 18 15 1 ...
##  $ team      : Factor w/ 4531 levels "","#CF9J","#CFFPNation",..: 1 3521 4389 1645 4033 1342 1 1 3083 1 ...
##  $ affiliate : Factor w/ 9776 levels "","100 Pourcent CrossFit",..: 5453 8721 9533 4114 5441 3114 2367 2261 8003 1 ...
##  $ gender    : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 2 ...
##  $ age       : int  24 31 25 26 28 38 29 24 26 24 ...
##  $ height    : int  71 70 72 68 68 72 65 70 72 68 ...
##  $ weight    : int  198 170 197 176 195 219 165 170 185 170 ...
##  $ fran      : int  135 157 139 140 152 124 NA 756 154 147 ...
##  $ helen     : int  394 422 410 410 NA 481 NA 458 440 NA ...
##  $ grace     : int  80 116 NA 100 NA 77 NA 110 NA 132 ...
##  $ filthy50  : int  819 1099 NA NA NA 1080 NA 1271 986 NA ...
##  $ fgonebad  : int  447 NA NA NA NA 451 NA 387 NA 413 ...
##  $ run400    : int  54 60 NA NA NA 62 NA 128 NA NA ...
##  $ run5k     : int  1083 1350 NA 1140 NA 1260 NA 1800 NA NA ...
##  $ candj     : int  345 280 335 309 335 305 285 120 265 275 ...
##  $ snatch    : int  275 225 265 234 245 255 235 85 215 215 ...
##  $ deadlift  : int  550 425 NA 441 545 485 425 135 485 385 ...
##  $ backsq    : int  455 345 455 390 465 465 385 225 465 380 ...
##  $ pullups   : int  75 50 NA 45 NA 53 NA 56 65 NA ...

We can see we have several variables with data on the athletes themselves (such as name, region, age, height, and weight), some variables related to the Open workouts and results (such as stage, category, score, and rank), and also some results for benchmark workouts by each athlete (such as fran, helen, snatch, and deadlift).

We will primarily be interested in seeing which factors are related to the athletes’ results, contained in the variables score and rank.

Summary of Features

Let’s briefly examine the features we will be using in our analysis in order to identify their distribution, any outliers, and also improve our knowledge of the data we will be working with.

We can see there are some pretty extreme values for height, weight, snatch, deadlift, and pullups that are getting in the way of our understanding the data. Let’s identify those outliers and remove them in order to make our analysis more robust.
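
One possible way to handle such outliers is sketched below. It is not necessarily the approach used for this analysis: the `athletes` data frame, its values, and the plausibility bounds are all assumptions chosen for illustration.

```r
# Hypothetical profile data; the extreme heights stand in for data-entry errors.
athletes <- data.frame(
  athlete_id = 1:8,
  height = c(71, 70, 72, 68, 65, 69, 700, 3),          # inches
  weight = c(198, 170, 197, 176, 165, 185, 180, 175)   # pounds
)

# Assumed plausibility bounds (lower, upper), for illustration only.
bounds <- list(height = c(48, 96), weight = c(80, 500))

# Replace out-of-range values with NA instead of dropping the whole row,
# so the athlete's other measurements stay usable.
for (col in names(bounds)) {
  out_of_range <- !is.na(athletes[[col]]) &
    (athletes[[col]] < bounds[[col]][1] | athletes[[col]] > bounds[[col]][2])
  athletes[[col]][out_of_range] <- NA
}

print(summary(athletes$height))
```

Setting implausible values to NA keeps the rest of each athlete's record, which matters in a data set where many benchmark fields are already missing.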

Now this looks much better and provides us with a first look at the data we will be working with and its distribution.

Getting to know our athletes

Let’s get to know a little about the athletes we will be exploring further in this analysis.

Age

We see the distribution of the number of athletes by age is pretty similar for both men and women. We can also see the number of male athletes is larger in the 2015 Open.

Category

The competition is divided into two categories: Rx and Scaled. In the Rx category, athletes must complete the workouts exactly as prescribed. The Scaled category was created so the Open would be more accessible to a larger number of athletes, and has scaled-down versions of the Rx workouts.

Let’s see how the athletes are divided into both categories.

## 
##        Rx    Scaled 
## 0.8505327 0.1494673

This plot shows us that the vast majority of athletes (85%) in the 2015 Open are in the Rx category. The plot also shows the proportion of male and female athletes in both categories. While men outnumber women in the Rx category, the Scaled category includes more women than men.
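
Proportions like these can be computed with `table()` and `prop.table()`. The following is a minimal sketch on a hypothetical `athletes_demo` data frame with made-up values; only the `category` and `gender` column names come from the data set.

```r
# Hypothetical sample; the real data set has about 250,000 athletes.
athletes_demo <- data.frame(
  category = factor(c("Rx", "Rx", "Rx", "Scaled", "Rx", "Scaled", "Rx", "Rx")),
  gender   = factor(c("Male", "Female", "Male", "Female",
                      "Male", "Female", "Female", "Male"))
)

# Overall share of athletes in each category.
category_share <- prop.table(table(athletes_demo$category))

# Share of each gender within each category (each column sums to 1).
gender_by_category <- prop.table(
  table(athletes_demo$gender, athletes_demo$category),
  margin = 2
)

print(category_share)
print(gender_by_category)
```

Using `margin = 2` normalizes within each category, which is what allows the gender split of Rx and Scaled to be compared directly even though the categories differ greatly in size.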

Now let’s create a single plot where we will be able to see the distribution of age per category.

It is interesting to note that the center of the age distribution appears slightly higher for scaled athletes than for Rx athletes. That is especially noticeable for men. It seems reasonable, because older athletes might find it more difficult to complete Rx workouts.

Let’s plot the histogram and some summary statistics for the scaled category to check that observation.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   16.00   26.00   30.00   31.53   36.00   54.00
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   15.00   27.00   31.00   31.98   37.00   54.00
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   17.00   28.00   33.00   34.29   40.00   54.00
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   15.00   30.00   36.00   36.11   43.00   54.00

Indeed, we can see that while the average age for men and women in the Rx category is very close, the Scaled category has higher average ages for both genders.
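
A grouped comparison of this kind can be sketched with `tapply()`. The `ages_demo` data frame and its values below are hypothetical; only the column names mirror the data set.

```r
# Hypothetical ages for two athletes in each category/gender combination.
ages_demo <- data.frame(
  category = c("Rx", "Rx", "Rx", "Rx", "Scaled", "Scaled", "Scaled", "Scaled"),
  gender   = c("Male", "Male", "Female", "Female",
               "Male", "Male", "Female", "Female"),
  age      = c(24, 30, 26, 28, 35, 41, 33, 37)
)

# Mean age for every category/gender cell; rows are categories, columns genders.
mean_age <- tapply(ages_demo$age,
                   list(ages_demo$category, ages_demo$gender),
                   mean)

print(mean_age)
```

Passing `summary` instead of `mean` to the same call would give the full set of quartiles for each group.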

CrossFit experience

Let’s find out how long the athletes have had CrossFit experience prior to joining the Open.

Although experience data is missing for many athletes, we can see that many of those who reported it joined the 2015 Open with less than a year of CrossFit experience, which might show eagerness to participate in the event.

One aspect that might influence the choice of Scaled vs. Rx category is how long the athlete has been practicing CrossFit for. It seems reasonable that more experienced athletes might be more inclined to opt for the Rx category.

We can see the proportion of Rx athletes is larger for athletes with over 2 years of experience, and smaller for athletes with between 6 months and 2 years of experience. One interesting observation is that most athletes with less than 6 months of CrossFit experience chose the Rx category. It is certainly a curious fact. However, since we unfortunately don’t have experience data for many of our athletes, we can’t draw many conclusions from it.

Regions

Finally, let’s take a look at where our athletes come from.

This plot shows that, despite the competition being open to athletes from all over the world, its popularity is still heavily centered in North America, followed by Europe. Huge continents such as Africa and Asia still show very little participation in the Open.

Taking a look at the workouts

Now let’s take a look at some results.

As we mentioned, the workouts are referenced by their year and number, such as 15.1, 15.2 etc. We will refer to them using this pattern.

NOTE: The 2015 Open had a special workout in the first week, which we will refer to as 15.1a. In our data set, it is referred to as 1.1.

Also, it is important to mention the workouts can be one of two kinds. In the first kind, the athlete tries to achieve the highest possible number of repetitions, or reps, in the given timeframe. In the second kind, the athlete must complete the workout as fast as possible. The scores for the first kind of workout are measured in number of reps (which means the higher, the better), and for the second kind are measured in seconds (which means the lower, the better).

All but the last workout of the 2015 Open are of the first kind. Only the 15.5 workout score is measured in seconds.

Now let’s plot the distribution of scores by workout and division. Unless otherwise mentioned, we will focus on the Rx category for the analysis.

Several plots show peaks. The peaks are present on 15.1, 15.2, and 15.4, but they are particularly sharp and intriguing on workout 15.3. I should investigate further to find out what happened there.

That is a very interesting chart, but it warrants a closer look at each workout so we can better understand the story they are telling.

Workout 15.1

Workout 15.1 consisted of:

Complete as many rounds and reps as possible in 9 minutes of:
15 toes-to-bars
10 deadlifts (115 / 75 lb.)
5 snatches (115 / 75 lb.)

Every 30 reps represent a completed round of exercises. This particular workout is interesting because the first movement is a relatively easy gymnastic one, compared to the other 2 weightlifting exercises. So the peaks in this plot show how many people struggled to perform the weightlifting exercises in each round. The dips at 15, 45, 75, and so on, show that athletes rarely ended their workouts on the gymnastic movement, but rather on the heavier exercises.
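
To make the round arithmetic concrete, the sketch below decodes a 15.1 score into completed rounds and the movement the athlete had reached when time ran out. The helper `decode_15_1` is a hypothetical name introduced here for illustration, and it assumes the score is the total number of reps completed.

```r
# Hypothetical helper: split a 15.1 score into rounds and position in the round.
decode_15_1 <- function(score) {
  reps_per_round <- c(toes_to_bar = 15, deadlift = 10, snatch = 5)
  round_size <- sum(reps_per_round)      # 30 reps per full round

  full_rounds <- score %/% round_size    # completed rounds
  leftover    <- score %% round_size     # reps into the unfinished round

  # Cumulative boundaries inside a round: 15, 25, 30.
  boundaries <- cumsum(reps_per_round)

  # Movement the athlete was on (or about to start) when the clock stopped.
  next_movement <- names(reps_per_round)[findInterval(leftover, boundaries) + 1]

  data.frame(score = score, full_rounds = full_rounds, leftover = leftover,
             next_movement = next_movement, stringsAsFactors = FALSE)
}

print(decode_15_1(c(47, 15, 120)))
```

For example, a score of 47 is one full round plus 17 reps, meaning the athlete finished the 15 toes-to-bars of the second round and stopped 2 reps into the deadlifts.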

Since the last 2 exercises for each round are the deadlift and the snatch, and our data set contains benchmarks for those exercises for some athletes, let’s check if their maximum weight lifted had any relationship to their results in this workout.

Even though the data is very dispersed, we can see a positive trend between the athlete’s record deadlift and their score on this workout, both for male and female athletes.

Now let’s run the same analysis for the snatch.

The same can be said for the snatch. Though the data is also very dispersed, we can see a positive trend between the athlete’s record snatch and their score on this workout.

Workout 15.1a

Workout 15.1a consisted of:

1-rep-max clean and jerk
6-minute time cap

In other words, the athlete had 6 minutes to perform the heaviest clean and jerk he or she could manage. This is a workout where strength is critical.

Let’s see the distribution of scores for this workout.

The distribution of scores looks very similar for both divisions.

Since strength is vital for this workout, I wonder if bodyweight has any relation to the score. Let’s make a plot of result by weight and find out.

## 
##  Pearson's product-moment correlation
## 
## data:  open.15.1a.rx$weight and open.15.1a.rx$score
## t = 225.62, df = 136280, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5176000 0.5253309
## sample estimates:
##       cor 
## 0.5214762

Indeed we see a positive relationship between bodyweight and score for 15.1a.
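
For reference, a test like the one above is produced by `cor.test()`, which computes the Pearson correlation by default. The sketch below runs it on simulated data; the `demo_15_1a` data frame and the numbers used to generate it are assumptions for illustration, not values from the leaderboard.

```r
set.seed(42)  # make the simulated example reproducible

# Simulated bodyweights (lb) and scores with a built-in positive relationship.
n <- 200
demo_15_1a <- data.frame(weight = rnorm(n, mean = 180, sd = 25))
demo_15_1a$score <- 0.8 * demo_15_1a$weight + rnorm(n, mean = 60, sd = 25)

# Pearson's product-moment correlation test (the default method).
result <- cor.test(demo_15_1a$weight, demo_15_1a$score)

print(result$estimate)  # sample correlation coefficient
print(result$conf.int)  # 95 percent confidence interval
```

With a sample as large as the real one, even a weak correlation would yield a tiny p-value, so the size of the coefficient (about 0.52 above) is more informative than the significance level.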

Workout 15.2

Workout 15.2 consisted of:

Every 3 minutes for as long as possible complete:
From 0:00-3:00
2 rounds of:
10 overhead squats (95 / 65 lb.)
10 chest-to-bar pull-ups
From 3:00-6:00
2 rounds of:
12 overhead squats (95 / 65 lb.)
12 chest-to-bar pull-ups
Etc., following same pattern until you fail to complete both rounds

Let’s zoom in on the female plot and try to figure out why there is a sharp peak at around score 10.

##   10 
## 9482
##        10 
## 0.1728809

Now we can see very clearly what happened: almost 9,500 women, or 17.3% of the total number of athletes who completed this workout, were not able to complete 1 chest-to-bar pull-up after the first set of 10 overhead squats. Those who were able to perform that movement got past the first set and finished the workout at a wide range of scores. That shows how difficult that movement is to perform.
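
The count and share at a single score can be read from a frequency table. This is a minimal sketch on a hypothetical vector of scores, not the actual 15.2 results.

```r
# Hypothetical 15.2 scores for ten athletes.
scores_15_2 <- c(10, 10, 34, 10, 58, 44, 10, 92, 20, 10)

# Frequency of every distinct score; table names are the scores as strings.
score_counts <- table(scores_15_2)

# Athletes stuck at exactly 10 reps, as a count and as a share of all athletes.
n_at_10     <- score_counts[["10"]]
share_at_10 <- n_at_10 / length(scores_15_2)

print(c(count = n_at_10, share = share_at_10))
```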

Our data set also contains data about the maximum number of pull-ups each athlete has performed in a row. Let’s plot the average 15.2 score by maximum pull-ups.

This plot clearly shows a positive relationship between an athlete’s maximum number of pull-ups and the average 15.2 score. In addition, since the average female score at a given maximum number of pull-ups is higher than the male score, it reinforces how particularly challenging the chest-to-bar pull-up was for female athletes.

Workout 15.3

Workout 15.3 consisted of:

Complete as many rounds and reps as possible in 14 minutes of:
7 muscle-ups
50 wall-ball shots
100 double-unders

This is the plot we saw earlier that shows very sharp peaks. Let’s dig deeper and take a look at this workout to try to figure out what the peaks are about. We will plot the women’s division and look more closely at what happened.

This workout consisted of 3 different movements, the first of which was 7 reps of a highly demanding movement (muscle-up), followed by 50 reps and 100 reps of two much simpler movements. So this plot depicts a very clear story: athletes struggling to perform the first movement, then speeding somewhat more easily through the next two, and then struggling (again) if they are able to reach the next round of muscle-ups. Each peak corresponds to athletes whose workouts ended trying to execute that movement.

After the start of rounds two and three (second and third peaks), we can see a smaller concentration of scores, which represent athletes whose workouts ended executing the second movement in the round, the wall balls.

Workout 15.4

Workout 15.4 consisted of:

Complete as many reps as possible in 8 minutes of:
3 handstand push-ups
3 cleans
6 handstand push-ups
3 cleans
9 handstand push-ups
3 cleans
12 handstand push-ups
6 cleans
15 handstand push-ups
Etc., adding 3 reps to the handstand push-up each round, and 3 reps to the clean every 3 rounds.
Men clean 185 lb. Women clean 125 lb.

Since this workout involves handstand pushups, which are pushups where the athlete is upside down, pushing up his or her own weight, let’s investigate if lighter athletes had an edge.

That doesn’t look to be the case. We can identify, especially in the male division, a curve where both the lightest and the heaviest athletes performed more poorly. We should keep in mind that this workout also included a weightlifting movement, so it makes sense that an optimal range of bodyweights would yield better results.

How about height? I wonder how it relates to scores for this workout.

This plot shows us something interesting and somewhat unexpected. We can definitely see a trend where shorter athletes had better results on average for this workout.

Now let’s combine the two measurements and display both on one chart.

On both of the above charts we can see that lighter athletes tend to be shorter, and heavier athletes are usually taller. We can also see that the top of each chart is lighter in color (indicating shorter athletes), especially in the lighter bodyweight range.

Workout 15.5

Workout 15.5 consisted of:

27-21-15-9 reps for time of:
Row (calories)
Thrusters
Men use 95 lb. Women use 65 lb.

This looks like a pretty intensive workout. It involved completing the sequence of movements as fast as possible, so the lower the score, the better the result for the athlete. I wonder how age affected athletes’ performance.

Even though the points are widely dispersed, it looks like the older the athlete, the longer he or she will take to complete this workout. Also, male athletes appear to have achieved better scores on average than female athletes for this workout.

Let’s average the results by age to help us check this trend.

The trend looks very clear now. For 15.5, average performance decreased with age.
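
Averaging by age can be sketched with `aggregate()`. The `results_15_5` data frame below holds hypothetical times; only the `age` and `score` column names come from the data set.

```r
# Hypothetical 15.5 results; scores are times in seconds, so lower is better.
results_15_5 <- data.frame(
  age   = c(22, 22, 30, 30, 45, 45),
  score = c(480, 520, 540, 560, 650, 690)
)

# One row per age with the mean score for that age, sorted by age.
mean_by_age <- aggregate(score ~ age, data = results_15_5, FUN = mean)

print(mean_by_age)
```

Collapsing the scatter to one point per age removes most of the within-age noise, which is why the trend becomes easier to see than in the raw plot.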

To continue with this reasoning, let’s check if age is reflected on other workouts as well. Let’s plot the average result by age for the rest of the workouts.

Indeed, for most workouts, performance seems to peak at the age range of 20 to 30, then it starts to decrease from that age on. One interesting thing this chart shows is the female performance on 15.3, which doesn’t display that trend as clearly, and shows relatively high mean scores for ages above 50.

Finally, let’s create a variable that represents each athlete’s overall rank and plot that against age.

We can see here the same trend we saw on the last chart. The average overall rank by age seems to peak at 20 to 30, then it starts to decrease from that age on.
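
One plausible way to build such an overall rank variable is to add up each athlete's rank across workouts and then rank those totals, with the lowest total placing first. This is an assumption about the construction, not a description of the exact method used here; the `ranks_demo` data frame and its values are hypothetical.

```r
# Hypothetical per-workout ranks for three athletes over three workouts.
ranks_demo <- data.frame(
  athlete_id = rep(c(1690, 1998, 2206), each = 3),
  stage      = rep(c(1, 2, 3), times = 3),
  rank       = c(1, 2, 1,   # athlete 1690
                 2, 1, 3,   # athlete 1998
                 3, 3, 2)   # athlete 2206
)

# Sum the per-workout ranks for each athlete (lower total is better).
overall <- aggregate(rank ~ athlete_id, data = ranks_demo, FUN = sum)
names(overall)[2] <- "rank_points"

# Convert the totals into an overall placement; ties share the best position.
overall$overall_rank <- rank(overall$rank_points, ties.method = "min")

print(overall)
```

An athlete who skipped a workout would have fewer ranks to sum and so an unfairly low total, so a real implementation would need to restrict the calculation to athletes who completed every workout or penalize the missing ones.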

Final Plots and Summary

Plot One

This is a very simple — though impressive — chart, because it tells a story so clearly.

The first round of this workout involved executing a set of 10 overhead squats, followed by a set of 10 chest-to-bar pull-ups. This chart shows us the large number of athletes who were able to complete the first 10 overhead squats, but were unable to perform a single chest-to-bar pull-up. The number of athletes who scored 10 repetitions on this workout was almost 9,500, or 17.3% of the total number of competitors.

In other words, simply by mastering that movement, an athlete would have immediately finished ahead of almost 9,500 competitors.

Plot Two

This plot depicts the relationship between weight and scores for the female division of Open 15.4, and also displays how heights are related to those features.

Even though the scores are widely distributed across weights, averaging out their values helps to identify a curve where the best average scores were obtained by athletes weighing around 130 to 140 lbs.

Also, we can see that athletes’ weight tends to increase as height increases. In addition, we can see that, as score increases, the horizontal “levels” get slightly lighter in color, indicating a slightly better performance by shorter athletes for this workout.

Plot Three

This is a somewhat depressing chart, but the trend it shows is very clear: average CrossFit athletes’ performance decreases with age. Except for some high scores for older female athletes on 15.3, we can clearly see that performance peaks when the athlete is around 20 to 30 years old, then decreases from that age on.

Reflection

Exploring this data set enabled me to get a good sense of what relates to an athlete’s performance at the Open. And, more often than not, it showed me what is not related.

During the exploration, I hit several dead-ends, where expected relationships between variables were very faint or appeared to be non-existent. It was only after persisting and investigating multiple possibilities that some relationships came to life.

The relationships that did appear were fascinating in two ways: they either showed clear trends that marked an athlete’s performance, or they told precise stories about what happened on a given week during the Open.

Regarding future work, I would love to incorporate the results from other years and investigate how athletes’ profiles and demographics, as well as the distribution of results, changed over the years. That would enable me to check if athletes are getting better, and if the CrossFit Open is getting more competitive each year.