Statistical inference in two-stage online controlled experiments with treatment selection and validation

by Alex Deng, Tianxi Li and Yu Guo

 

WWW 2014, April 7–11, 2014, Seoul, Korea.

Abstract

Online controlled experiments, also called A/B testing, have been established as the mantra for data-driven decision making in many web-facing companies. A/B testing supports decision making by directly comparing two variants at a time. It can be used to compare (1) two candidate treatments or (2) a candidate treatment and an established control. In practice, one typically runs an experiment with multiple treatments together with a control to make decisions for both purposes simultaneously. This approach is known to have two issues. First, having multiple treatments increases false positives due to multiple comparisons. Second, the selection process causes an upward bias in the estimated effect size of the best observed treatment. To overcome these two issues, a two-stage process is recommended: we select the best treatment in a first screening stage, and then rerun the experiment with only the selected treatment and the control in a validation stage. Traditional applications of this two-stage design often focus only on results from the second stage. In this paper, we propose a general methodology for combining the screening-stage data with the validation-stage data for more sensitive hypothesis testing and more accurate point estimation of the treatment effect. Our method is widely applicable to existing online controlled experimentation systems.
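The two issues named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method. It is a toy model in which every treatment has the same true effect (zero by default), each observed delta against control is drawn independently from a normal distribution, and the winner is the treatment with the largest observed delta. The function name `simulate` and all of its parameters are hypothetical. Treating the deltas as independent is a simplifying assumption, since treatments that share one control would in practice have correlated deltas.

```python
import random
from statistics import NormalDist, mean


def simulate(num_treatments=4, true_effect=0.0, std_error=1.0,
             runs=20000, seed=0):
    """Toy two-stage experiment; all names and defaults are illustrative."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)  # two-sided 5% threshold per test

    any_significant = 0
    screening_estimates = []
    validation_estimates = []

    for _ in range(runs):
        # Screening stage: one observed delta vs. control per treatment.
        # Assumption: deltas are independent with a known standard error.
        deltas = [rng.gauss(true_effect, std_error)
                  for _ in range(num_treatments)]

        # Issue 1: with several uncorrected tests, the chance that at
        # least one looks significant exceeds the per-test 5% level.
        if any(abs(d) / std_error > z_crit for d in deltas):
            any_significant += 1

        # Issue 2: the delta of the selected winner is the maximum of
        # several noisy draws, so it overestimates the true effect.
        screening_estimates.append(max(deltas))

        # Validation stage: rerun only the winner against control.
        # This fresh estimate is not affected by the selection.
        validation_estimates.append(rng.gauss(true_effect, std_error))

    return {
        "any_significant_rate": any_significant / runs,
        "screening_bias": mean(screening_estimates) - true_effect,
        "validation_bias": mean(validation_estimates) - true_effect,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, with four treatments and no true effect, the chance of at least one nominally significant result is 1 - 0.95^4, roughly 0.19 rather than 0.05. The winner's screening estimate is biased upward by about one standard error, while the validation estimate is centered on the true effect. Using only the validation stage avoids the bias but discards the screening data, which is the gap the combined analysis proposed in the paper is meant to address.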