12/14/2018
Published on The Trainline Blog (Medium).
A/B testing is about comparing two groups. To do that well, we need the right people counted in the experiment and a clear view of who’s in vs out. The article separates assignment (who gets A or B when they hit the site) from counting (who we actually include in the analysis) and uses simple visuals to show why both matter.
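A minimal sketch of that split, in Python. The experiment name, bucketing scheme, and the `saw_search_results` flag are illustrative assumptions, not the article's or Trainline's actual implementation; the point is only that assignment happens on entry, while counting is a separate filter applied at analysis time.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "delay-message-test") -> str:
    """Assign A or B deterministically on entry via a hash of user + experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def counted_users(visits):
    """Counting is separate from assignment: keep only users who met the
    test criteria (here, a hypothetical per-visit flag for hitting the
    page under test)."""
    return [v for v in visits if v["saw_search_results"]]

visits = [
    {"user_id": "u1", "saw_search_results": True},
    {"user_id": "u2", "saw_search_results": False},  # assigned, but not counted
]
for v in counted_users(visits):
    print(v["user_id"], assign_variant(v["user_id"]))
```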
- Basics — Users as dots, page hits, assignment to A or B on entry. Not everyone assigned is “counted”; we only count users who meet the test criteria (e.g. hit the page we’re testing).
- Example 1: Page-level test — Count users who hit the search-results page (the red dots in the article's visuals are the counted ones). We need an even A/B split, and the test criteria captured as a metric so we can filter the analysis to visits that actually saw the test (see the first sketch after this list).
- Example 2: Page state — Count only users who did a one-way search on that page. Tighter criteria, so we need a visit-level metric for “single search” (also covered in the first sketch below).
- Example 3: New feature — Feature only appears in the variation (e.g. delay message). If we count only “saw message”, control has no one. We need an equivalent event in control (e.g. “eligible to see message”) so we can compare like with like (see the second sketch after this list).
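A sketch of the test criteria as visit-level metrics, covering Examples 1 and 2. The field names (`hit_search_results`, `search_type`) are hypothetical; the idea is that Example 1 counts any visit that hit the results page, and Example 2 tightens the same filter to one-way searches.

```python
def counted_page_level(visits):
    # Example 1: count anyone who hit the search-results page.
    return [v for v in visits if v["hit_search_results"]]

def counted_page_state(visits):
    # Example 2: tighter criteria -- only one-way searches on that page.
    return [v for v in visits if v["hit_search_results"]
            and v["search_type"] == "one_way"]

visits = [
    {"user": "u1", "variant": "A", "hit_search_results": True,  "search_type": "one_way"},
    {"user": "u2", "variant": "B", "hit_search_results": True,  "search_type": "return"},
    {"user": "u3", "variant": "A", "hit_search_results": False, "search_type": None},
]
print(len(counted_page_level(visits)))  # 2: u1 and u2 hit the page
print(len(counted_page_state(visits)))  # 1: only u1 did a one-way search
```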
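And a sketch of Example 3, the one the article builds to: the delay message only renders in the variation, so both arms log an “eligible” event at the same decision point, and we count on eligibility rather than on “saw message”. Event and field names here are hypothetical.

```python
def render_search_results(visit, variant, log):
    if visit["train_delayed"]:
        # Fire the counting event in BOTH arms, before the variant branch,
        # so control and variation are comparable on equal terms.
        log.append({"user": visit["user"], "variant": variant,
                    "event": "eligible_for_delay_message"})
        if variant == "B":
            log.append({"user": visit["user"], "variant": variant,
                        "event": "saw_delay_message"})

log = []
render_search_results({"user": "u1", "train_delayed": True}, "A", log)
render_search_results({"user": "u2", "train_delayed": True}, "B", log)
eligible = [e for e in log if e["event"] == "eligible_for_delay_message"]
print([(e["user"], e["variant"]) for e in eligible])  # both arms counted
```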
Takeaways: be explicit about when and how we count; capture the test criteria as a metric so the analysis stays clean; and design features with counting in mind so we don’t build unanalysable tests.
