Bringing AP Stats into the 21st Century

By Dashiell Young-Saver

AP Statistics is, by far, my favorite class to teach. In the often procedure-driven “just solve the equation” world of high school math, AP Stats stands apart in preparing students to think critically. By covering skills like evaluating misleading graphs, biased samples, and correlation vs. causation, the course fosters the quantitative reasoning needed to make sense of an increasingly complex world. For these reasons, among all high school math courses offered at a national scale, I’d argue that AP Stats is the most relevant, useful, and compelling.

My belief in AP Stats, and its unique value in high school math, also makes me concerned for its future. AP Stats was launched in 1996. In the decades since, the course standards haven’t changed much. By contrast, the world of data analysis has evolved tremendously. The datasets are larger. The software is more advanced. And the dominance of statistical inference (e.g. hypothesis tests) has been upended by the rise of statistical learning and prediction (e.g. machine learning & AI).

Across the country, organizations are releasing new high school data science courses that build modern data skills (programming, data wrangling/exploration, visualization, machine learning, etc.) into traditional statistics curricula. They're filling a void of essential data skills for college and industry that, for one reason or another, aren't often taught in high schools.

AP Stats should, and must, be a part of filling that void too. Otherwise, the course will get left behind. It’s time to face the reality: the AP Stats standards are stuck in 1996. We need to update them - stat.

It’s troublesome that the current AP Stats Course and Exam Description (CED) mentions outdated tools, like random digit tables and t-tables. However, there are deeper and more important structural issues with the standards. If we can fix these issues, it would make our course more cohesive, engaging, and aligned to modern data work and college courses.

Issue 1: The 15, yes FIFTEEN (???), inference procedures. We're in the era of big data. When you have a dataset with thousands or millions of values, nearly every hypothesis test gives a statistically significant result, even when the effect is too small to matter. Instead, big data demands skills in describing and visualizing multivariate datasets, evaluating potential biases in samples, and understanding the scope of conclusions. Don't get me wrong: we should still teach hypothesis tests. But it would be so much more powerful to focus on just a few hypothesis tests, so that students can really understand them at a conceptual level. Then, we'd have time to authentically explore the limitations of hypothesis tests, discussions of which have spawned a modern-day reckoning in many scientific disciplines. Unfortunately, with 15 inference procedures to teach in AP Stats, there isn’t time to go deep. Each hypothesis test and confidence interval is covered quickly, and students often walk away with a procedural (rather than conceptual) understanding of statistical inference.
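To make the big-data point concrete, here is a minimal simulation sketch in Python (standard library only). Everything in it is an assumption for illustration: the data are simulated, the true difference between the two groups is set to a trivially small 0.02 standard deviations, and the helper name `two_sample_z_test` is made up for this example. It runs the same large-sample test at two sample sizes.

```python
import math
import random


def two_sample_z_test(a, b):
    """Large-sample z-test for a difference in means.

    Returns (z, two-sided p-value) using the normal approximation.
    """
    n_a, n_b = len(a), len(b)
    mean_a = sum(a) / n_a
    mean_b = sum(b) / n_b
    # Sample variances (n - 1 in the denominator)
    var_a = sum((x - mean_a) ** 2 for x in a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n_b - 1)
    se = math.sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / se
    # Two-sided tail area under the standard normal curve
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


random.seed(0)
true_diff = 0.02  # hypothetical effect: 0.02 standard deviations (tiny)

for n in (100, 1_000_000):
    group_a = [random.gauss(0.0, 1.0) for _ in range(n)]
    group_b = [random.gauss(true_diff, 1.0) for _ in range(n)]
    z, p = two_sample_z_test(group_b, group_a)
    print(f"n = {n:>9,} per group: z = {z:6.2f}, p = {p:.3g}")
```

With a million values per group, the standard error shrinks to about 0.0014, so even this negligible difference sits roughly 14 standard errors from zero and the p-value is vanishingly small. With 100 per group, the same difference will usually go undetected. The test answers "is the difference exactly zero?", which is rarely the interesting question once the dataset is large.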

Issue 2: We're still using graphing calculators. Want to know what the TI-83 looked like 26 years ago (1996)? Almost identical to the same calculators sold today. The fact that Texas Instruments still gets away with selling these two-decades-old pieces of junk for $100 is insane. Instead, let's use the time saved from condensing inference to teach students how to use much more powerful, much more relevant, and much more free (!) spreadsheet software (e.g. Google Sheets). In addition, R and Python are the mainstays in academic and industry-based statistics / data science. If possible, it would be great to teach one of these languages as well. I know it sounds intimidating. But if we can host AP Exams online (2020) and move the SAT online (2024), I'm sure we can create a digital component of the AP Exam that meaningfully tests what all statisticians do: data analysis on computers.

Issue 3: Small, prepackaged datasets. AP Stats is, largely, the analysis of small univariate datasets that have already been collected, cleaned, and formatted by someone else. Those data processing skills - collecting the data, handling missing data or unusual data, and preparing data for analysis - are essential for working with data in college and in industry. Professors and employers want to see these skills! Let's use the time saved from condensing inference to really build projects into our curriculum, allowing our students to work with real and relevant datasets that interest them (and gain these skills along the way). We may even be able to incorporate projects as part of their AP score (like AP Computer Science).

Issue 4: A lot of inferring, not much predicting. Statistical/machine learning - roughly defined as the practice of making predictions with data - is one of the most in-demand skillsets in academia and industry. It’s also the basis behind the rapidly evolving field of artificial intelligence (AI). AP Stats doesn't really cover it at all, aside from a brief nod to prediction in the linear regression unit. Let's add a very basic version of these skills, in earnest, to our linear regression unit (again, using the time we save from condensing inference).
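As a rough sketch of what a "very basic version" of prediction could look like inside the linear regression unit, the Python example below fits a least-squares line on one part of a dataset and then measures how well it predicts values it has never seen. The data are simulated (an assumption for illustration, not a real dataset), and the helper names `fit_line` and `rmse` are invented for this example.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares regression line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared prediction error of the line on (xs, ys)."""
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


random.seed(1)
# Simulated data (hypothetical): y = 3 + 2x plus random noise
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [3 + 2 * x + random.gauss(0, 1.5) for x in xs]

# Hold out the last 25% of points; the model never sees them while fitting
split = int(0.75 * len(xs))
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

slope, intercept = fit_line(train_x, train_y)
train_rmse = rmse(train_x, train_y, slope, intercept)
test_rmse = rmse(test_x, test_y, slope, intercept)

print(f"fitted line: y = {intercept:.2f} + {slope:.2f}x")
print(f"typical error on training data: {train_rmse:.2f}")
print(f"typical error on held-out data: {test_rmse:.2f}")
```

The shift from inference is in the question being asked. Instead of "is the slope significantly different from zero?", students ask "how far off are my predictions on new data?", which is the core habit behind statistical/machine learning.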

While the above might seem like radical changes, they seem obvious to many college professors and industry leaders I've spoken with over the years. With each passing year, our course grows more irrelevant. Don't get me wrong: I love AP Stats. That's exactly why I want to see it evolve and stay relevant for many years to come.

Let's skew it!
