Lesson 1.A.2 - Describing a Categorical Variable
Key Question: Why are more elite athletes born in January - June?
Content: Relative Frequencies | Bar & Pie Charts | Describing a Categorical Variable
Alignment: CED Topics 1.3 - 1.4
Video
Course Resources
Resources for teaching our AP® Statistics curriculum.
- Lesson Flow - timing and flow of class, using our lesson materials
- Pacing Guide - pacing our units, with daily or block schedules
- CED Alignment Guide - aligning our lessons to the AP® Statistics Course and Exam Description
Teaching Resources
Resources for teaching with Skew The Script.
- Discussion Norms - our model discussion norms for the classroom
- Letter to Parents - letter to share with parents about our nonpartisan approach
- Teaching Math on Civic Topics - tips for teaching math lessons that cover civic topics
Lesson Notes
Lesson-specific insights from the creators of this lesson.
This lesson introduces students to a surprising phenomenon that we call The Astrological Law of Sports. Namely, across many sports, the Capricorns - Geminis (people born between January and June) dominate. Students utilize graphs and summary statistics for categorical data to describe this trend. Then, they’re tasked with explaining it – why does birth month matter so much to athletes’ success?
- Calculate and interpret relative frequencies
- Interpret bar charts and pie charts
- Describe trends for a categorical variable
Before proceeding: Familiarize yourself with the lesson materials linked above (e.g. handout, handout key, slides, video). Then, for additional background and teaching tips from the lesson creators, check out the sections below.
- When launching the lesson, it’s helpful to play up the fact that the “Astrological Law of Sports” is not only surprising, but it’s also astonishingly widespread across varying sports and across the globe. This helps invest students in the context of the lesson and the Discussion Question.
- The AP Exam isn’t incredibly picky about rounding. However, as a good rule of thumb, you can tell students that three significant figures is usually sufficient for proportions (e.g. 0.273 or 27.3%).
- It’s unlikely that the AP Exam will ask students to draw graphs. Instead, students will mostly be asked to interpret pre-printed graphs. Having students draw their own bar chart and pie chart once or twice is helpful for learning the concepts behind each visualization. However, drawing graphs takes up a lot of class time, so it’s best to focus student practice on interpretation.
- When showing the graphs comparing the birthdays of the Italian population with the birthdays of Italian soccer players, ask students what these graphs might look like if the y-axes used raw counts, instead of percentages. Of course, because the sample sizes are very different (there are far more Italians than there are Italian professional soccer players), the graphs would no longer be comparable. This is why percentages are called relative frequencies – they allow us to relate the information across sample sizes.
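For teachers who want to demonstrate the counts-versus-percentages point with software, the following is a minimal sketch in Python. The counts are hypothetical and invented purely for illustration (they are not the lesson's Italian data), and the function name `relative_frequencies` is our own. It shows that two groups of very different sizes become comparable once counts are converted to relative frequencies, with output rounded to three decimal places in the style of 0.273 or 27.3%.

```python
def relative_frequencies(counts):
    """Convert a mapping of category -> count into category -> proportion."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# Hypothetical counts, made up for illustration only (NOT the lesson's data).
small_group = {"Jan-Jun": 60, "Jul-Dec": 40}        # e.g. a small group of athletes
large_group = {"Jan-Jun": 5000, "Jul-Dec": 5000}    # e.g. a much larger population

small_rel = relative_frequencies(small_group)
large_rel = relative_frequencies(large_group)

# Raw counts differ by orders of magnitude, but proportions share a 0-to-1 scale.
for label, rel in [("small group", small_rel), ("large group", large_rel)]:
    for category, proportion in rel.items():
        # Show each proportion as a decimal and as a percentage.
        print(f"{label:12s} {category}: {proportion:.3f} ({proportion:.1%})")
```

Plotted as raw counts, the bars for the small group would be nearly invisible next to those for the large group. Plotted as proportions, both groups share the same axis and can be compared directly.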
First, download this lesson's Handout Key and read through its Discussion Question section. Then, check out our model discussion norms and the additional background notes below.
- At first, some students may draw a blank on the discussion question, especially those with little experience in youth sports. Expanding on the hint in the following ways may be helpful:
- “If everyone born in 2015 is in the same group, would players born in January 2015 be the oldest or youngest in the cohort?”
- “Which players might tend to display the most athleticism in a youth group: the oldest players or the youngest players? Why?”
- “Which players might tend to get the most attention and coaching: the more athletic ones or the less athletic ones? Why?”
- The phenomenon described in the Discussion Question is known in academic literature as the “relative age effect.” Numerous journal articles have been written on the relative age effect, in contexts that include education, the workplace, and psychology. Although the exact causal mechanism discussed here (older kids getting better opportunities due to initial age advantage) cannot be fully proven without an experiment, the prevalence of the relative age effect across many sports and other contexts provides substantial evidence for this explanation.
- This lesson utilizes the 12 zodiac signs of Western astrology as a playful hook to draw students into the lesson. Of course, there are a multitude of other astrological traditions that can be referenced. Ultimately, as students discover, the trends have more to do with how youth leagues group athletes, rather than anything astrological.
- The relative age effect was popularized by Malcolm Gladwell in his book Outliers (2008), where the example he discussed focused on hockey. This lesson focuses on Italian soccer because the relative age effect is strong in this context and the data are readily available from this journal article. However, the relative age effect has also been observed in handball, rugby, track and field, skiing, hockey, tennis, and the Olympics.
- Most statisticians prefer bar charts to pie charts. The human eye judges the relative heights of bars much more accurately than the relative areas of sectors. See this demonstration as an example. In addition, pie charts become very difficult to read when a variable has more than five or six categories, whereas bar charts remain fairly readable with a large number of categories.
- There are multiple ways to make visualizations of categorical data misleading. These include using pictures in place of bars for bar graphs, creating pie charts that don’t add to 100%, or using odd axis scales. Check out our Classic AP Stats 1.1 Lesson (created before the 2026 course revision) or High School Statistics Lesson 1.1 for examples. Note that misleading graphs are not covered in the current AP Statistics Course and Exam Description.
Student Supports
Lesson-specific resources to support all learners.
The following sub-questions can be used as additional scaffolding for this question from the lesson Handout: “How do the birthdays for Italian soccer players compare to birthdays among all Italians? Describe any trends you see.”
- a) What do you notice about the first half of the year in each group?
- b) What do you notice about the second half of the year in each group?
- c) Combine your responses for (a) & (b) into a few sentences to describe any trends that you see.
- d) Review and edit your response to (c) to include comparative language (e.g. higher, lower, similar).
- Vocabulary used in the lesson may include words that are unfamiliar or have several meanings. In particular, the following mathematical terms may need to be clarified or defined:
- Frequency
- Frequency table
- Relative frequency
- Percentage
- Pie chart
- Trends
- In addition, the following contextual terms may need to be clarified or defined:
- Astrological
- The terms “relative frequency” and “proportion” can be used interchangeably. Pointing this out and utilizing both terms in instruction can help students learn their equivalence.