3/2/2021
Published in Towards Data Science.
Frequentist terms like “significance”, “p-values”, and “confidence intervals” can feel like non-sequitur panel transitions: each gives a yes/no verdict, but the panels don’t connect into the story we need. In Scott McCloud’s Understanding Comics, action-to-action transitions link panels so the reader follows a clear, objective sequence. The same idea applies to experiment analysis: we need data views that work together to tell a coherent story of risk, reward, and certainty.
The article frames two experiment goals: a conversion win (validate the hypothesis or learn something) and de-risking (check that the variation is safe to launch, where a “flat” result is acceptable). For de-risk tests, “how long should we run this?” is a hard question, and teams often keep tests running just to “collect more data”. A boolean significance verdict is not enough when the decision needs nuance.

Expected Loss (Chris Stucchio’s formulation, used by VWO) adds the missing “risk” panel: what is the conversion cost of choosing one variant over the other? Charting cumulative Expected Loss over time for each variant, alongside cumulative conversion rates (reward) and the Bayesian probability of each variant being best (certainty), turns the analysis into an action-to-action story. The rules of thumb are simple: wait for the lines to be stable over the last seven days, identify the best performer, and if the probability is borderline, check certainty and Expected Loss together before deciding.

At Trainline, this approach added clarity, reduced runtimes (especially for “flat” tests), and helped stakeholders decide with less reliance on highly specialised statistics. It also worked well in lower-traffic contexts.
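As a rough illustration of the “risk” and “certainty” panels, here is a minimal sketch of how Expected Loss and the probability of one variant beating another can be estimated with Beta posteriors and Monte Carlo sampling, in the spirit of Stucchio’s approach. The function name, the flat Beta(1, 1) priors, and the example counts are all illustrative, not taken from the article.

```python
import numpy as np

def posterior_metrics(conversions_a, visitors_a, conversions_b, visitors_b,
                      n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(B beats A) and the Expected Loss of
    choosing each variant, assuming Beta(1, 1) priors on conversion rates."""
    rng = np.random.default_rng(seed)

    # Posterior draws of each conversion rate (Beta-Binomial conjugacy).
    samples_a = rng.beta(1 + conversions_a, 1 + visitors_a - conversions_a, n_samples)
    samples_b = rng.beta(1 + conversions_b, 1 + visitors_b - conversions_b, n_samples)

    # Probability that B converts better than A (the "certainty" panel).
    prob_b_beats_a = np.mean(samples_b > samples_a)

    # Expected Loss of picking a variant: the average conversion-rate cost
    # paid if that choice turns out to be the wrong one (the "risk" panel).
    expected_loss_choose_a = np.mean(np.maximum(samples_b - samples_a, 0))
    expected_loss_choose_b = np.mean(np.maximum(samples_a - samples_b, 0))

    return prob_b_beats_a, expected_loss_choose_a, expected_loss_choose_b


# Hypothetical "flat" test after a week of traffic.
p_b, loss_a, loss_b = posterior_metrics(
    conversions_a=480, visitors_a=10_000,
    conversions_b=495, visitors_b=10_000,
)
print(f"P(B beats A): {p_b:.2%}")
print(f"Expected Loss if we keep A: {loss_a:.5f}")
print(f"Expected Loss if we launch B: {loss_b:.5f}")
```

Running something like this daily on the cumulative counts and plotting the three outputs over time gives the stable-lines view described above: when the Expected Loss of launching the variation is tiny and steady, a “flat” de-risk test can be called without waiting for a significance verdict.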