In “The Book of Why”, authors Judea Pearl and Dana Mackenzie stress the importance of causal inference over simple correlation: “The rooster’s crow is highly correlated with the sunrise; yet it does not cause the sunrise.” Advancements in technology and data science have turned this “concept of causal inference” into a “science of experimentation”, which provides a sound logical and mathematical approach to making decisions.
Here is a popular story that illustrates why correlations might not be your best friend when it comes to making critical decisions. The story is called “Eat chocolate, win the Nobel prize?”. In 2012, a New York doctor, Franz Messerli, published a note in the New England Journal of Medicine that created a buzz. His note indicated that “The higher a country’s chocolate consumption, the more Nobel laureates it spawns per capita”. He plotted industry data on “chocolate intake” against “per capita Nobel laureates” for 23 countries. The p-value he reported was below 0.0001, which means the correlation was highly statistically significant (a p-value measures how unlikely the result is to be chance, not how strong the correlation is). The Swiss, the Swedes and the Danes led the pack, followed by the Americans.
As it turned out, the researchers had published the note tongue in cheek. You can read the full story here, but some of Messerli’s more amusing quotes were:
“The amount it takes, it’s actually quite stunning, you know,” Messerli chuckled. “The Swiss eat 120 bars – that is, 3-ounce bars – per year, for every man, woman and child, that’s the average.”
“As physician scientists we live and die by p-values, and here we have a p-value of a magnitude that is incredible, and unless you teach me otherwise it’s a complete nonsense correlation.”
There were other researchers who did not shy away from having fun:
“I attribute essentially all my success to the very large amount of chocolate that I consume,” said Eric Cornell, an American physicist who shared the Nobel Prize in 2001.
“Personally, I feel that milk chocolate makes you stupid,” he added. “Now dark chocolate is the way to go. It’s one thing if you want like a medicine or chemistry Nobel Prize, OK, but if you want a physics Nobel Prize it pretty much has got to be dark chocolate.”
In this entire episode, the more plausible hypothesis is that rich countries can spend more on research and hence produce more Nobel laureates. Because they are rich, they can also afford to consume more chocolate than other countries. Wealth, in other words, is a common cause of both. If only one could conduct an experiment on a massive scale, it could test this hypothesis and show that the correlation in question is not causation. Take a sample with a good mix of countries from different wealth strata. Pick premier research institutes in the sample and, at a randomly chosen subset of them, deliberately increase chocolate intake among scientists through dedicated programs for a few years. Then measure the impact. If the hypothesis holds, the extra chocolate would make no difference, and within the sample it would still be the richer countries that produce more Nobel laureates.
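The reasoning above can be sketched with a small simulation. This is only an illustration on made-up data, not Messerli's figures: the function names (`simulate`, `pearson`, `partial_corr`), the number of simulated countries and the noise levels are all assumptions chosen for the example. Wealth drives both chocolate intake and laureates, and chocolate has no effect on laureates at all.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n_countries=2000, seed=0):
    """Hypothetical world: wealth causes both chocolate intake and laureates."""
    rng = random.Random(seed)
    wealth = [rng.gauss(0, 1) for _ in range(n_countries)]
    # Assumption: both outcomes are wealth plus independent noise.
    chocolate = [w + rng.gauss(0, 0.5) for w in wealth]
    laureates = [w + rng.gauss(0, 0.5) for w in wealth]
    return wealth, chocolate, laureates


def randomized_chocolate(n_countries=2000, seed=1):
    """Experiment: chocolate intake is assigned at random, ignoring wealth."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(n_countries)]


if __name__ == "__main__":
    wealth, chocolate, laureates = simulate()
    assigned = randomized_chocolate()
    print("observed correlation:", round(pearson(chocolate, laureates), 3))
    print("controlling for wealth:", round(partial_corr(chocolate, laureates, wealth), 3))
    print("randomized chocolate:", round(pearson(assigned, laureates), 3))
```

Under these assumptions the observed correlation is strong (about 0.8 by construction), yet it largely disappears once wealth is controlled for, and it is also absent when chocolate is assigned at random. The last case is what an experiment buys you: random assignment breaks the link between the treatment and any hidden common cause.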
The scale and duration required for the chocolate experiment are probably too massive to be practical. But it is now commonplace for businesses to run experiments. Of course, experimentation is not as convenient as establishing correlations by quickly running some popular algorithms. It requires the right experimental design, followed by accurate matching and measurement. But advancements in engineering and data science have provided a rapid, automated yet simple way of executing complex experiments. Experiments, when done accurately, rarely establish a link that is meaningless or difficult to explain. As such, experimentation is the most reliable method of establishing causal inferences and making critical decisions.
About Trial Run: Trial Run is a cloud-based product that lets companies conduct business experiments on sites, markets and individuals before they implement. It helps companies scale their experimentation capability efficiently and affordably by providing insight into which decisions will work and which will not.