Thursday, March 12, 2015

Book Review: How to Lie with Statistics


A data analyst's bible for communicating stats to non-experts. A recommended re-read as annual absolution for your statistical sins.

It is no surprise that it's a classic: the book has aged remarkably well, with the (humorous) anecdotes as pertinent today as they were 60 years ago.

The premise is quite straightforward. When presented with stats, keep in mind:
1) Tools of the trade,
2) Lies, and
3) Fallacies.
Then do a "sniff test."

Tools of the trade include bias, sample size and significance tests.

Lies are (often graphical) ways of misleading the reader (intentionally for the data scientist; plausibly unintentionally for those with less of a background): changing the scale bars, 'cleverly' chosen percentages, dishonest before/after comparisons and, my personal favorite, semi-attached figures (what the medical profession now calls 'surrogate endpoints').
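To make the 'cleverly' chosen percentage concrete, here is a minimal sketch in Python. The numbers and the `percent_change` helper are made up for illustration and are not taken from the book; the point is only that the same change looks bigger or smaller depending on which base you divide by.

```python
def percent_change(old, new, base):
    """Change from old to new, expressed as a percentage of `base`."""
    return (new - old) * 100 / base

# Hypothetical figures: a price rises from 80 to 100.
old_price, new_price = 80, 100

# Want the rise to sound big? Use the old (smaller) price as the base.
rise_on_old = percent_change(old_price, new_price, base=old_price)

# Want it to sound small? Use the new (larger) price as the base.
rise_on_new = percent_change(old_price, new_price, base=new_price)

print(f"Rise as % of old price: {rise_on_old}%")
print(f"Rise as % of new price: {rise_on_new}%")
```

Both figures describe the same 20-unit increase, which is why question 3 of the sniff test ("What's missing?") so often comes down to asking "percent of what?".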

Fallacies include the ever-present 'correlation is, of course, causation' and 'proving' the null hypothesis.

If a breezy 124 pages is too much, cut straight to the end. At a 'lengthy' (by this book's standards) 15 pages, the 10th and final chapter enumerates a 5-step 'sniff test' that can stop a good many lies in their tracks:
1) Who says so?
2) How does he know?
3) What's missing?
4) Did somebody change the subject?
5) Does it make sense (particularly for extrapolations)?

If the pointy-haired boss ever read this book, it'd make the data analyst's job -- appeasing power by bending truth -- 456.7% more challenging!

PS. Speaking truth to power will get you fired 654.3% faster than appeasement. Exercise minimally bent truth with caution. You've been warned!