Lesson 3.A.1 - Sampling Distribution for a Proportion

Key Question: Can Parkinson’s be detected with scent?

Content: Sampling Distribution for One Proportion | Sample Size & Precision

Alignment: CED Topics 3.1-3.2.A

Teacher Guide

Part A

Activity: link

Credit: Lesson from Doug Tyson and made available by Math Medic. See the Teacher Guide Video for more info.

Part B

Amplify Activity: link

Handout: pdf, doc

Handout Key: pdf, doc

Mastery Check: link

Mastery Check Key: link

Course Resources

Resources for teaching our AP® Statistics curriculum.

  • Lesson Flow - timing and flow of class, using our lesson materials
  • Pacing Guide - pacing our units, with daily or block schedules
  • CED Alignment Guide - aligning our lessons to the AP® Statistics Course and Exam Description

Teaching Resources

Resources for teaching with Skew The Script.

Lesson Notes

Lesson-specific insights from the creators of this lesson.


This lesson was inspired by an original activity from all-star AP Statistics instructor Doug Tyson. The lesson uses the inference trifecta approach, which differs from our typical lesson format. Before proceeding, watch the Teacher Guide Video and familiarize yourself with the lesson materials (e.g., the handout and key). Then, for additional background and teaching tips from the lesson creators, check out the sections below.

Learning Targets
  • Use simulation to approximate the sampling distribution for a proportion
  • Calculate and describe the sampling distribution for a proportion
  • Use the sampling distribution for a proportion to justify claims about a parameter
  • For daily class schedules (45-min class periods), this lesson can be completed in three days:
    • Day 1: Activity from Part A (Detecting Parkinson’s with Scent)
    • Day 2: Activity & Discussion from Part B (The “E-Nose”)
    • Day 3: Practice, Mastery Check from Part B (The “E-Nose”)
  • For block periods (90-minute class periods), the entire lesson can be completed in 1.5 class days.
  • If using the optional Lesson 3.A.0, these activities can flow together as 4 days for daily class schedules or 2 full class days for block schedules.
  • For Part A of the lesson, instructors can download the materials and view the full activity guidance at this page (from our friends at Math Medic). Note that instructors have to make a Math Medic account to download the materials. Much like Skew The Script, accounts with Math Medic are free.
  • For Part B of the lesson, students complete the Amplify activity (with the instructor facilitating), as they record notes in their handout along the way. To facilitate an Amplify activity, instructors can create an Amplify account (also free) and share a single session code with their students (who can join without accounts).
    • For facilitating the Amplify activity, we recommend using the “Sync to Me” pacing option. For more Amplify activity facilitation tips, check out our Amplify/Desmos session with expert Kevin McSorley.
    • After the notes, students discuss the Discussion Question in small groups. Then, students discuss as a full group, with the instructor facilitating. Finally, students proceed to the Practice problems and, eventually, the lesson Mastery Check.

First, download this lesson's Handout Key and read through its Discussion Question section. Then, check out our model discussion norms and the additional background notes below.

  • Emphasize that the goal of the discussion question is not to decide whether the “E-nose” did well, but to determine whether the results would be expected under random guessing. This shift from evaluating performance to evaluating probability is central to statistical reasoning.
  • A key idea that arises in the Discussion Question is the impact of sample size on the spread of a sampling distribution. The connection between sample size and spread can be illustrated two ways:
    • With the sampling distribution formula: In the formula for the standard deviation of the sampling distribution, \( \sqrt{\frac{p(1-p)}{n}} \), the sample size n is in the denominator. So, as n increases, we divide by a larger number, thereby shrinking the standard deviation. Because n sits under a square root, quadrupling the sample size cuts the standard deviation in half.
    • With the law of large numbers and intuition: If we flip a fair coin only twice, there’s a 25% chance that 100% of those flips will be Heads. If we flip the coin 1,000 times, getting 100% Heads is almost impossible. Instead, the proportion of flips that come up Heads will likely be very close to the true probability: 50%. This is the law of large numbers (from our earlier probability unit) at work, and it also applies to sampling. With a very large sample size, our estimates will be tightly clustered around the true probability or proportion value. With a smaller sample size, there’s more variation. This is why larger sample sizes produce narrower sampling distributions and, in turn, more precise estimates.
  • The Zhejiang University study suggests that Parkinson’s disease is associated with a distinct chemical “fingerprint” in skin oils, likely tied to metabolic and microbiome changes – and that this signal may be detectable before clear motor symptoms appear.
  • In preliminary studies, the model was able to reliably distinguish Parkinson’s patients from healthy controls. While this points to strong potential for early-stage, scalable screening, the approach remains in the research and validation phase, with no approved clinical devices yet. Ongoing studies are focused on confirming whether these findings hold across larger, more diverse populations before moving toward real-world use.
  • This lesson continues to use simulation, but it also introduces formulas and calculations that will form the basis of hypothesis testing and confidence intervals in upcoming lessons. Because of the higher degree of abstraction introduced in this lesson, students may start to lose their bearings. If this occurs, continue returning to the logical foundations of sampling variation and extreme results that have been established in the last few lessons, using the results of simulations as concrete examples.
  • To support students as they begin working with calculations for the sampling distribution of a proportion, reinforce that sampling distributions are composed of many statistics, rather than many data values. This is illustrated by the graphic included in the handout showing many p̂ values under the normal curve.
  • For AP Statistics, students are not expected to be able to derive the standard deviation formula for the sampling distribution of a proportion. Focus on interpretation and application rather than derivation. However, for students who are curious, consider sharing this supplemental video, which shows how the formula \( \sqrt{\frac{p(1-p)}{n}} \) comes from the standard deviation of the binomial distribution. The video is tied to the next lesson but can be shared alongside this lesson for especially curious students.
  • Point out that the formulas for the mean and standard deviation of a sampling distribution of a proportion are provided in the AP Statistics formula booklet. This means that students should focus on understanding the concepts and applications of these formulas rather than memorizing them.
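
For instructors who want a quick numerical check of the sample size and spread connection described above, here is a minimal Python sketch. It is an illustration only and not part of the lesson materials: the true proportion of 0.5, the sample sizes, the seed, and the function names `theoretical_sd` and `simulate_p_hats` are all assumptions chosen for the example. It compares the formula \( \sqrt{\frac{p(1-p)}{n}} \) with the standard deviation of many simulated p̂ values.

```python
import math
import random
import statistics


def theoretical_sd(p, n):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def simulate_p_hats(p, n, reps, rng):
    """Draw `reps` samples of size n; return the proportion of successes in each."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.5                 # ASSUMPTION: true proportion (fair coin / random guessing)

# Each sample size is 4x the previous one, so the formula SD halves each step.
for n in (10, 40, 160, 640):
    p_hats = simulate_p_hats(p, n, reps=2000, rng=rng)
    print(
        f"n={n:4d}  simulated mean={statistics.mean(p_hats):.3f}  "
        f"simulated SD={statistics.pstdev(p_hats):.4f}  "
        f"formula SD={theoretical_sd(p, n):.4f}"
    )
```

Each printed row is a sampling distribution made of 2,000 statistics (p̂ values), not 2,000 data values. The simulated mean should stay near p, while the simulated SD should track the formula and shrink as n grows.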

Student Supports

Lesson-specific resources to support all learners.

  • The language “in a world where” can provide helpful framing for interpreting sampling distributions. For example, here are several ways to use the phrase in this lesson:
    • In a world where the E-Nose is randomly guessing, what is the probability that it’d get at least 70.8% of its guesses correct by chance alone?
    • In a world where the E-Nose is randomly guessing, there is only a 2.1% probability that it’d guess 70.8% correct (or more) by chance alone.
    • The sampling distribution represents all the possible trial outcomes in a world where the E-Nose is randomly guessing.
  • This applet provides an excellent bridge between Part A and Part B of the lesson. The applet allows students to repeat the T-shirt experiment digitally, as well as simulate the experiment repeatedly to create a sampling distribution. Although a similar feature is also available in Part B’s Amplify activity, using the applet first provides a supportive connection to the T-shirt activity completed in class.
  • Vocabulary used in the lesson may include words that are unfamiliar or have several meanings. In particular, the following mathematical terms may need clarification or a definition:
    • Population proportion
    • Sample proportion
    • Sampling distribution
    • Normal curve
  • In addition, the following contextual terms may need clarification or a definition:
    • Parkinson’s disease
    • Sebum
    • Compound
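
The “in a world where” framing can also be shown as a short computation. The sketch below is hypothetical and not taken from the handout: it assumes 17 correct guesses in 24 trials (chosen only because 17/24 is about 70.8%) and a 0.5 chance of a correct guess under random guessing. The names `normal_tail` and `simulated_tail` are made up for this example. It estimates the chance of a result at least this extreme in two ways: with the normal model for p̂ and with repeated simulation of a guessing world.

```python
import math
import random
from statistics import NormalDist

N_TRIALS = 24    # ASSUMPTION: hypothetical number of trials, not from the lesson
N_CORRECT = 17   # ASSUMPTION: 17/24 is about 70.8%
P_GUESS = 0.5    # ASSUMPTION: a random guess is correct half the time


def normal_tail(p_hat, p, n):
    """P(sample proportion >= p_hat) under a normal model with mean p, SD sqrt(p(1-p)/n)."""
    sd = math.sqrt(p * (1 - p) / n)
    return 1 - NormalDist(mu=p, sigma=sd).cdf(p_hat)


def simulated_tail(n_correct, p, n, reps, rng):
    """Share of simulated guessing worlds with at least n_correct correct guesses."""
    hits = 0
    for _ in range(reps):
        correct = sum(rng.random() < p for _ in range(n))
        if correct >= n_correct:
            hits += 1
    return hits / reps


observed = N_CORRECT / N_TRIALS
print(f"observed proportion:           {observed:.3f}")
print(f"normal-model tail probability: {normal_tail(observed, P_GUESS, N_TRIALS):.3f}")
print(
    "simulated tail probability:    "
    f"{simulated_tail(N_CORRECT, P_GUESS, N_TRIALS, 20000, random.Random(0)):.3f}"
)
```

Under these assumed counts, the normal-model value lands close to the 2.1% quoted above. The simulated value will usually come out somewhat higher, because the number of correct guesses is a whole number and the normal curve only approximates that discrete distribution. Either way, the result would be unusual in a world of pure guessing.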