Teaching with Data for the Public Good

Me, in a previous life as a University of British Columbia stats prof

STAT 545 (Exploratory?) Data Analysis grad course
STAT 540 Statistics for High Dimensional Biology
Master of Data Science

Inside-Out Statistics | Bodwin

Ack, grading! Don’t miss it at all.
“want to hire?”, unofficial grad rubric
Misguided litigation re: confidence interval verbiage
Predict what you’ll see: Either you were right (I win!) or learn something / correct a mistake (I win!)
Pre-commit to what would convince you –> harder to move the goal posts

A well-reasoned, informal analysis is much better than a formal statistical analysis that lacks intuition.

“They might prefer imperfect solutions to ill-defined problems than perfect solutions to well-defined non-problems.” Gower discussing Cormack (1971)

Who’s Underrepresented? | Tackett

Transform/Visualize vs Model, oh yaasss
In practice vs In class
80/20 20/80
Tension between using real data and also MAKING SURE that, e.g., missing data comes up a lot. So hard to find the right data and keep it fresh. Bodwin and Tackett are working real data into courses with very different goals. The more specialized the mandate (“regression”), the harder it is to find real-world data, because of the extra constraints?
Difficulty around simply getting the data out of awkward places/formats and into students’ hands quickly.

Difficult Dialogues | Hardin

“Goldilocks level of data wrangling” Ha! So hard to get this “just right”.
“Each student can work with a different dataset” <– neat way to get this w/o impractical explosion of variety
“Ability (need!) to work with SQL” <– “Teaching-driven personal growth”
How to communicate: the kind of thing it’s tempting to shy away from because “it’s not statistics”, but learning to communicate is equipping students for the future. Similar to the attitudes re: teaching programming.

General remarks and discussion points

The more applied, real-world the course, the more it exposed gaps in what I was a pro at. Not being the expert all the time. Is this more or less risky or rewarding if you already don’t meet the prof stereotype?

Risks vs. rewards of working with, e.g., data on COVID, slave trade, policing. One person’s topical is another person’s lived experience. How do you do this with great empathy and humility? Also important to compare to a realistic baseline: people weren’t 100% happy with the existing “tired” datasets, so you can’t expect to make everyone perfectly happy with new “real world” datasets either.