Lesson 2.A.1 - Describing Two Categorical Variables (Version B)

Key Question: How can a player be better overall, but also worse in each category?

Content: Joint, Marginal, & Conditional Frequencies | Side-By-Side Bar Charts, Segmented Bar Charts, & Mosaic Plots

Alignment: CED Topic 2.1-2.2

Video

Student Items

Handout: pdf, doc

Mastery Check: link

Teacher Items

Handout Key: pdf, doc

Mastery Check Key: link

Slide Deck: pdf, ppt

Course Resources

Resources for teaching our AP® Statistics curriculum.

  • Lesson Flow - timing and flow of class, using our lesson materials
  • Pacing Guide - pacing our units, with daily or block schedules
  • CED Alignment Guide - aligning our lessons to the AP® Statistics Course and Exam Description

Teaching Resources

Resources for teaching with Skew The Script.

Lesson Notes

Lesson-specific insights from the creators of this lesson.

This lesson introduces students to a WNBA paradox. Las Vegas Aces basketball star A'ja Wilson had a higher overall shooting percentage than her 2022 teammate Kelsey Plum. Yet Plum had the higher shooting percentage in each shot category (2-pointers and 3-pointers). How is this possible? In this lesson, students solve the paradox using marginal distributions, conditional distributions, and data visuals.

Learning Targets
  • Calculate and interpret joint, marginal, and conditional relative frequencies
  • Interpret side-by-side bar charts, segmented bar charts, and mosaic plots
  • Describe associations between two categorical variables

Before proceeding: Familiarize yourself with the lesson materials linked above (e.g. handout, handout key, slides, video). Then, for additional background and teaching tips from the lesson creators, check out the sections below.


  • For students who are less familiar with basketball, and to set the stage for solving the paradox, describe the difference between 2-point shots (closer to the hoop) and 3-point shots (farther from the hoop). In particular, point out that 3-pointers are worth more points because they’re more difficult. Then, when students think about the paradox, they’ll be able to connect Plum’s lower overall shooting percentage to the fact that a higher share of her shots are in the more difficult category (3-pointers).
  • For motivating the lesson context, it’s helpful to emphasize that both players were MVP (Most Valuable Player) contenders in 2022. So, solving the paradox isn’t just an interesting mathematical exercise. The solution could inform how we think about which player should win the award.
  • This lesson covers a relatively high number of statistical ideas and methods. Intentionally pointing out connections between them can help students better internalize them. For example, vertically stacking the bars in a side-by-side bar chart would yield a segmented bar chart. In addition, adjusting the bar widths in a segmented bar chart (according to sample sizes) would yield a mosaic plot. So, side-by-side bar charts, segmented bar charts, and mosaic plots are highly interrelated.
  • The Discussion Question covers Simpson’s Paradox – a fascinating statistical concept that students should appreciate, but that also won’t show up on the AP Exam. If students are struggling with the paradox, you can let them know that the other concepts from this lesson (joint, marginal, and conditional relative frequencies; side-by-side bar charts, segmented bar charts, and mosaic plots) are more important for their course performance and for the AP Exam.
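
For teachers who want to check the paradox's arithmetic before class, here is a minimal Python sketch of the weighted-average idea described in the first tip above. The shot counts and the names `shots`, `category_pct`, `overall_pct`, and `weighted_average_pct` are hypothetical, invented purely for illustration. They are not the Wilson and Plum figures from the handout.

```python
# Hypothetical (made, attempted) shot counts, invented for illustration only.
# These are NOT the real player statistics used in the lesson handout.
shots = {
    "Player A": {"2-pointer": (200, 400), "3-pointer": (30, 100)},
    "Player B": {"2-pointer": (55, 100), "3-pointer": (140, 400)},
}


def category_pct(player, category):
    """Shooting percentage within one shot category (a conditional frequency)."""
    made, attempted = shots[player][category]
    return made / attempted


def overall_pct(player):
    """Overall shooting percentage: all makes divided by all attempts."""
    made = sum(m for m, _ in shots[player].values())
    attempted = sum(a for _, a in shots[player].values())
    return made / attempted


def weighted_average_pct(player):
    """Same quantity, written as a weighted average of the category percentages.

    Each category's weight is its share of the player's total attempts.
    """
    total = sum(a for _, a in shots[player].values())
    return sum((a / total) * (m / a) for m, a in shots[player].values())


for player in shots:
    print(
        player,
        f"2-pt: {category_pct(player, '2-pointer'):.2f}",
        f"3-pt: {category_pct(player, '3-pointer'):.2f}",
        f"overall: {overall_pct(player):.2f}",
    )
```

With these made-up counts, Player B leads in both categories (55% vs. 50% on 2-pointers, 35% vs. 30% on 3-pointers), yet Player A leads overall (46% vs. 39%). The reversal happens because 80% of Player A's attempts fall in the easier category, while 80% of Player B's fall in the harder one.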

First, download this lesson's Handout Key and read through its Discussion Question section. Then, check out our model discussion norms and the additional background notes below.

  • The discussion provides a good opportunity to promote deeper conceptual understanding of mosaic plots. Students can compare the “shots made” areas in the segmented bar charts and the mosaic plots. In the segmented bar charts, where every bar has the same width, the “shots made” area is greater for Plum, since her white segments are taller. In the mosaic plots, however, the “shots made” area is greater for Wilson, since the total area (height × width) of her white segments is greater. Because mosaic plots vary their bar widths by sample size, they showcase the fact that most of Wilson’s shots are in the higher accuracy category (2-pointers). This leads to a greater “shots made” area and, therefore, a greater overall shooting percentage.
  • This question covers Simpson’s Paradox – a fascinating statistical concept that students should appreciate, but that also won’t show up on the AP Exam. If students are struggling with the paradox, you can let them know that the other concepts from this lesson (joint, marginal, and conditional relative frequencies; side-by-side bar charts, segmented bar charts, and mosaic plots) are more important for their course performance and for the AP Exam.
  • The shooting data for this lesson comes from the 2021-2022 seasons. During these seasons, the dynamic duo (A'ja Wilson and Kelsey Plum) hit their prime as teammates, becoming the first two WNBA players on the same team to each score 700 points in a single season (2022). Plum was traded to the Los Angeles Sparks in 2025, so the two players are (unfortunately) no longer teammates.
  • Even though Kelsey Plum was the “better shooter,” A’ja Wilson actually won the 2022 league MVP (Most Valuable Player) award. Why? Her superb defense and floor presence made her stand out above the other candidates. She led the league in blocks that season and had more total rebounds than any other player. When evaluating a player’s value, shooting is one consideration among many. Ultimately, Plum finished in 3rd place in MVP voting.
  • The shooting percentage paradox (Simpson’s Paradox) in this lesson also arises between other pairs of players in the league, so the lesson can be modified to showcase players from teams closer to your area. Basketball Reference is a great site for looking up the relevant data for players close to you.
  • At this point in the course, we say that two categorical variables are “associated” if conditional relative frequencies differ between groups. Later in the course, we’ll use hypothesis tests to determine whether these associations are statistically significant (so large that they’re unlikely to be due to chance alone). This provides a more formal threshold for determining whether two variables are truly “associated” with one another.
  • There are multiple ways to make visualizations of categorical data misleading. These include using pictures in place of bars for bar graphs or using truncated y-axes. Check out our Classic AP Stats 1.1 Lesson (created before the 2026 course revision) or High School Statistics Lesson 1.1 for examples. Note that misleading graphs are not covered in the current AP Statistics Course and Exam Description.
  • The paradox explored in the Discussion Question is called Simpson’s Paradox. Although Simpson’s Paradox is not covered on the AP Exam, it’s a great statistical concept for students to learn more about, as time allows. For example, check out this project from Skew The Script, which allows students to explore how Simpson’s Paradox arises in criminal justice, college admissions, medicine, and elections.
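
The area comparison between segmented bar charts and mosaic plots, described in the first note above, can be sketched numerically. This is a minimal illustration under stated assumptions: the shot counts are hypothetical (not the handout's data), `made_area` is an invented helper name, and each player's chart is scaled to a total area of 1.

```python
# Hypothetical (made, attempted) shot counts, invented for illustration only.
# These are NOT the real player statistics used in the lesson handout.
shots = {
    "Player A": {"2-pointer": (200, 400), "3-pointer": (30, 100)},
    "Player B": {"2-pointer": (55, 100), "3-pointer": (140, 400)},
}


def made_area(counts, equal_widths):
    """Total 'shots made' area across one player's bars (whole chart area = 1).

    equal_widths=True  -> segmented bar chart: every bar has the same width.
    equal_widths=False -> mosaic plot: bar width is the category's share of attempts.
    """
    total_attempts = sum(attempted for _, attempted in counts.values())
    area = 0.0
    for made, attempted in counts.values():
        height = made / attempted  # make rate within the category
        if equal_widths:
            width = 1 / len(counts)
        else:
            width = attempted / total_attempts
        area += width * height
    return area


for player, counts in shots.items():
    print(
        player,
        f"segmented: {made_area(counts, equal_widths=True):.2f}",
        f"mosaic: {made_area(counts, equal_widths=False):.2f}",
    )
```

With these made-up counts, the equal-width "made" area favors Player B (0.45 vs. 0.40), while the mosaic "made" area favors Player A (0.46 vs. 0.39). The mosaic area equals the overall shooting percentage, because each tile's area (width × height) is the joint relative frequency of that shot type being attempted and made.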

Student Supports

Lesson-specific resources to support all learners.

  • To support students in finding conditional distributions, encourage them to cover up any parts of the two-way table they no longer need to consider. For example, in the practice exercise that asks students to “calculate the conditional distribution of severity of symptoms for Medicine A,” students can use a hand to cover any table cells that don’t apply to Medicine A. This makes it clearer that the new total (the denominator of each fraction) is just the total among the Medicine A cases.
  • The lesson uses vocabulary that may be unfamiliar or have several meanings. In particular, the following mathematical terms may need clarification or a definition:
    • Two-Way Table / Contingency Table
    • Joint Relative Frequency
    • Marginal Relative Frequency
    • Side-by-Side Bar Chart
    • Conditional Relative Frequency
    • Segmented Bar Chart
    • Mosaic Plot
    • Weighted Average
  • In addition, the following contextual terms may need clarification or a definition:
    • Paradox
    • 2-Pointer
    • 3-Pointer
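
The "cover up the table" strategy above, along with the joint, marginal, and conditional relative frequency terms in the vocabulary list, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the counts, the severity labels ("Mild", "Moderate", "Severe"), and the helper names are hypothetical, and the handout's actual practice table may look different.

```python
# Hypothetical two-way table of counts, invented for illustration only.
# Rows are medicines; columns are severity of symptoms.
table = {
    "Medicine A": {"Mild": 30, "Moderate": 15, "Severe": 5},
    "Medicine B": {"Mild": 20, "Moderate": 20, "Severe": 10},
}

grand_total = sum(sum(row.values()) for row in table.values())


def joint(medicine, severity):
    """Joint relative frequency: one cell divided by the grand total."""
    return table[medicine][severity] / grand_total


def marginal_medicine(medicine):
    """Marginal relative frequency: one row total divided by the grand total."""
    return sum(table[medicine].values()) / grand_total


def marginal_severity(severity):
    """Marginal relative frequency: one column total divided by the grand total."""
    return sum(row[severity] for row in table.values()) / grand_total


def conditional_severity(medicine):
    """Conditional distribution of severity, given one medicine.

    Selecting a single row plays the role of covering up the other rows:
    the denominator becomes that row's total, not the grand total.
    """
    row = table[medicine]
    row_total = sum(row.values())
    return {severity: count / row_total for severity, count in row.items()}


print("Joint (Medicine A, Mild):", joint("Medicine A", "Mild"))
print("Marginal (Medicine A):", marginal_medicine("Medicine A"))
print("Conditional, given Medicine A:", conditional_severity("Medicine A"))
print("Conditional, given Medicine B:", conditional_severity("Medicine B"))
```

In this made-up table, 60% of Medicine A cases are mild compared with 40% of Medicine B cases. Because the conditional relative frequencies differ between the two groups, the lesson's working definition would call severity and medicine "associated."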