The essential guide to Sample Ratio Mismatch for your A/B tests

Published in Towards Data Science.

If you can’t trust the result of an experiment, you can’t trust the decisions you make from it. Data integrity issues are common, especially with redirect tests, single-page apps, or complex setups. The Sample Ratio Mismatch (SRM) check is a simple, essential validation anyone can do: it flags when the observed split of users (or visits) between control and variation doesn’t match the expected split (e.g. 50/50).

The article gives a practical overview:

- What SRM is: observed counts skewed away from the expected split (e.g. an expected even split comes back lopsided).
- Two rules: prioritise users over visits, since users are what get assigned to experiments and visit skews can be behavioural; and check frequently from launch, treating new tests like “intensive care” for at least the first week.
- Glaring checks: obvious imbalances you can spot without a formal test.
- A sample ratio formula: control % = control / total, and likewise for the variation.
- The Chi-squared goodness-of-fit test, run in Python (scipy.stats.chisquare with observed and expected counts) or in spreadsheets (CHITEST); a sketch follows below. A p-value below 0.01 (stricter than the usual 0.05) is recommended for declaring SRM, to reduce false alarms.
- Optional: a deeper look at the Chi-squared formula (observed vs expected counts, degrees of freedom, CHISQ.DIST.RT).
- Cumulative views of sample ratios over time, which help pinpoint when an SRM started (also sketched below).

Summary: run the Chi-squared check regularly, don’t cry wolf on day one unless the imbalance is glaring, and treat this as the start of data validation.
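To make the Chi-squared check concrete, here is a minimal Python sketch using scipy.stats.chisquare (a one-way goodness-of-fit test on observed vs expected counts). The user counts are hypothetical, purely for illustration:

```python
# Minimal SRM check with SciPy's chi-squared goodness-of-fit test.
# The counts below are hypothetical, not taken from the article.
from scipy.stats import chisquare

control_users = 50_123    # observed users in control (made-up)
variation_users = 48_771  # observed users in variation (made-up)
total = control_users + variation_users

# Sample ratio formula from the article: control % = control / total
print(f"Control share: {control_users / total:.2%}")

# Expected counts under the intended 50/50 split
observed = [control_users, variation_users]
expected = [total / 2, total / 2]
stat, p_value = chisquare(observed, f_exp=expected)

# The article recommends the stricter 0.01 threshold to reduce false alarms
if p_value < 0.01:
    print(f"Possible SRM (p = {p_value:.4f}): investigate before trusting results")
else:
    print(f"No SRM detected (p = {p_value:.4f})")
```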
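And a sketch of the cumulative view, assuming you have daily assignment counts per arm. The numbers are invented, with a deliberate dip in control from day 4 so the drift is visible:

```python
# Cumulative sample-ratio view: a sustained drift away from the expected
# 0.50 share suggests roughly when the mismatch began. Data is hypothetical.
import pandas as pd

daily = pd.DataFrame({
    "day": pd.date_range("2024-01-01", periods=7),
    "control":   [1010, 998, 1003, 880, 860, 855, 870],
    "variation": [1005, 1001, 997, 1002, 995, 1008, 990],
})

# Running totals per arm, then control's cumulative share of all users
cum = daily[["control", "variation"]].cumsum()
daily["cum_control_share"] = cum["control"] / (cum["control"] + cum["variation"])

print(daily[["day", "cum_control_share"]].to_string(index=False))
```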

Read the full article on Towards Data Science →

Iqbal Ali

Fractional AI Advisor and Experimentation Lead. Available for training, development, workshops, or as a fractional team member.