Lesson 2.A.3 - Conditional Probability

Key Question: How many people lie on their online dating profiles?

Content: Venn Diagrams | Mutually Exclusive Events | Conditional Probability

Alignment: CED Topic 2.5 - 2.6.A.1

Video

Student Items

Handout: pdf, doc

Mastery Check: link

Teacher Items

Handout Key: pdf, doc

Mastery Check Key: link

Slide Deck: pdf, ppt

Course Resources

Resources for teaching our AP® Statistics curriculum.

  • Lesson Flow - timing and flow of class, using our lesson materials
  • Pacing Guide - pacing our units, with daily or block schedules
  • CED Alignment Guide - aligning our lessons to the AP® Statistics Course and Exam Description

Teaching Resources

Resources for teaching with Skew The Script.

Lesson Notes

Lesson-specific insights from the creators of this lesson.

When folks take the plunge and try to find their special someone on dating apps, sometimes it goes well. Other times, connections made online don’t translate to a connection in person – especially if a carefully crafted online profile doesn’t quite stack up with reality. In this lesson, students analyze data from OkCupid. Using the principles of probability, they investigate: How many people lie on their online dating profiles?

Learning Targets
  • Describe intersections and mutually exclusive events
  • Calculate and interpret conditional probabilities

Before proceeding: Familiarize yourself with the lesson materials linked above (e.g. handout, handout key, slides, video). Then, for additional background and teaching tips from the lesson creators, check out the sections below.


  • To frame the lesson, it can be helpful to discuss the factors that add pressure to “stand out” online. In the physical world, singles have their own filters in mind for the person who could become their significant other. However, there’s always a chance of meeting someone who is not “your type” on paper but who inspires a surprising connection nonetheless. In the virtual world, the filters are more absolute. If your profile doesn’t meet someone’s filter, there’s no chance of matching with them virtually. So, there’s also no chance of meeting up and forming a surprising connection. Therefore, the pressure to lie and get past the first virtual filter may be heightened in the online world.
  • San Francisco residents tend to have higher incomes. Students may wonder if this could explain why the sample of OkCupid profiles tends to be so high-earning. Students may need to be reminded that the benchmark used to define a profile as “high-earning” in this data set is earning more than the median income among individual men living in San Francisco. So, geography alone can’t explain the high share of high-earning individuals.
  • Many students find probability problems challenging. This is not because the probability formulas are too complex to learn. Rather, it’s often because probability word problems contain a lot of information. When reading through that information, it can be difficult to choose which formula best applies to the scenario (and what to “plug in” once a formula is chosen). Students often wonder, “Where do I even start?” So, this lesson introduces a framework we’ll use throughout the unit: rather than going straight to formulas, utilize the Keys to Probability. The first Key – Draw it first – is especially useful for solving probability problems. A visual representation, in the form of a Venn diagram, tree diagram (next lesson), or two-way table (following lesson), helps students deconstruct complex word problems and lay the information out in a familiar way. When students are wondering where to start, asking them to “draw it first” gives them a useful path forward.

First, download this lesson's Handout Key and read through its Discussion Question section. Then, check out our model discussion norms and the additional background notes below.

  • Here’s a nice summary note you can share to wrap up the discussion: “Ultimately, when finding love, personal appearance matters far less than personal character. So, shoutout to all the honest Short Kings out there.”
  • The full OkCupid data set can be found here. The data set in this lesson has a relatively small sample size (n = 192) due to several factors: dating apps hadn’t fully “taken off” by 2012, the geography of the data set is limited to San Francisco, the age is limited to 36-year-olds (a less common age on dating apps, compared to younger age groups), and the data set includes only individuals who chose to make their profiles publicly viewable.
  • Because the profiles in the data set come from individuals who chose to make their profiles publicly viewable, one could argue that these individuals truly are taller and higher-earning than the general public – hence, they have the confidence to set their profiles to be fully public. However, one could also make the case the other way: people who make their profiles public have more of an incentive to lie, given the extra visibility of their profiles.
  • Conditional probability is a powerful concept that connects to ideas throughout the course. Highlighting these connections in class can be helpful. For example, in Lesson 1.A.1, Abraham Wald’s airplane problem can be re-expressed in terms of the likelihood of airplanes sustaining critical hits given or conditional on the fact that they came back. In Lesson 1.B.3, we explore how measures of job placement rates change given or conditional on the fact that students graduated or that they chose to respond to an alumni survey. In Lesson 2.A.1, we calculate conditional distributions, which are the observed proportions given or conditional on a certain trait. In future lessons, we’ll calculate the chi-squared test for independence, which aims to determine if there are statistically significant associations between different conditional distributions. The prominence of conditional probability throughout the study of statistics is why Harvard statistician Joseph Blitzstein calls conditioning the “soul of statistics.”
  • When combined with the Multiplication Rule, the formula for conditional probability can be re-expressed as Bayes’ Theorem: \( P(A|B) = \frac{P(B|A)P(A)}{P(B)} \). Bayes’ Theorem is another powerful probability concept that’s slightly outside the scope of AP Statistics. As an extension activity, students can explore this New York Times lesson from a Skew The Script author, which utilizes conditional probability to introduce students to Bayes’ Theorem, along with the theorem’s sometimes unintuitive results.
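For teachers who want to show where the formula in the last note comes from, one possible sketch of the derivation follows. It assumes \( P(B) > 0 \) and uses only the conditional probability formula and the Multiplication Rule; \( A \) and \( B \) are the same generic events used in the formula above.

\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \quad \text{(conditional probability formula)} \]

\[ P(A \cap B) = P(B|A)P(A) \quad \text{(Multiplication Rule)} \]

\[ P(A|B) = \frac{P(B|A)P(A)}{P(B)} \quad \text{(substitute the second line into the first)} \]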

Student Supports

Lesson-specific resources to support all learners.

  • Among all the Keys to Probability, the one that best supports students in getting started with probability problems is “Draw it first.” For the practice exercises that don’t already include a Venn diagram, encouraging students to draw one can help them organize both the information and their thinking.
  • To support students with finding conditional probabilities, encourage them to cover up any parts of the Venn diagram that they no longer have to consider. For example, consider the practice exercise in which students are asked to find the probability of selecting a student who takes biology, given that the student also takes economics. Since we know the selected student takes economics, students can use their hand to cover up any numbers outside of the economics circle. This makes it clearer that the new total (the denominator of their fraction) is just the number of students taking economics.
  • Vocabulary used in the context of the lesson may include words that are unfamiliar or have several meanings. In particular, the following mathematical terms may need clarification or a definition:
    • Venn diagram
    • Intersection (∩)
    • Mutually exclusive
    • Disjoint
    • Conditional
  • In addition, the following contextual terms may need clarification or a definition:
    • Singles
    • High-earner
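
The “cover up” strategy for conditional probability described above can also be checked numerically. The sketch below is illustrative only: the counts are hypothetical and are not taken from the lesson handout, and the variable names are made up for this example. It shows that restricting attention to the economics circle gives the same answer as the conditional probability formula.

```python
from fractions import Fraction

# Hypothetical Venn diagram counts (assumed for illustration,
# NOT the numbers from the lesson's practice exercise).
only_biology = 30    # takes biology but not economics
only_economics = 10  # takes economics but not biology
both = 20            # takes both courses (the intersection)
neither = 40         # takes neither course

total = only_biology + only_economics + both + neither

# "Cover up" everything outside the economics circle:
# the new total is just the students who take economics.
economics_total = only_economics + both

# P(biology | economics) using the covered-up diagram.
p_bio_given_econ = Fraction(both, economics_total)

# Same quantity from the formula P(A and B) / P(B).
p_both = Fraction(both, total)
p_econ = Fraction(economics_total, total)
assert p_bio_given_econ == p_both / p_econ

print(p_bio_given_econ)  # prints 2/3
```

Using `Fraction` keeps the answer as an exact fraction, which matches how students would write it when reading counts off a Venn diagram.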