Experimentation Platform

Accelerating software innovation through trustworthy experimentation

ExP Talks (reverse chronological order)
 
If downloading the talks asks you for username/password, it's trying to "edit."  Switch to another browser.

Pitfalls in Online Controlled Experiments: MIT Code 2016 invited talk slides, talk cartoon, 20-minute video + Q&A (10/15/2016)

A/B Testing Pitfalls: PPT (3/30/2016) at ConversionXL Live (3/31/2016)
 
Fun talk I gave to data scientists at Microsoft: Intuition Busters: video (2/16/2016)

 
Challenging Problems in Online Controlled Experiments: MIT Code 2015 invited talk slides and talk cartoon (10/17/2015)

Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 years: KDD 2015 Keynote (8/11/2015)
 
Lessons from Running Thousands of A/B Tests: MIT Code 2014 invited talk slides, talk cartoon (10/11/2014)

 
Online Controlled Experiments: Introduction, Insights, and Humbling Statistics: video and slides from QCon (11/11/2013)
 
ACM Recommender Systems industry keynote (9/12/2012)
 
Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained
  •  KDD Conference, Aug 2012: PPTX. See paper.
 
Tutorial: Controlled Experiments on the Web: Planning, Running, and Analyzing, Advanced Research Techniques Forum (ART 2012) (previous version at KDD 2009)

 

Online Controlled Experiments: Listening to the Customers, not to the HiPPO

 

 
Testing with Real Users, Better Software Conference, June 2010

Evidence shows that more than half of the ideas that we think will improve the user experience actually fail to do so, and some actually make it worse. Instead of guessing, why not measure what your real users like and don’t like? Controlled, online experiments (A/B tests being the simplest version) are a proven way to make data-driven decisions about what works and what doesn’t. Seth Eliot shares numerous examples of online experimentation within Microsoft to test new user interfaces with their customers. Seth shows how special frameworks, such as Microsoft’s ExP (Experimentation Platform), can also move testing into the high-value realm of testing-in-production. In addition to new features and designs, Microsoft tests the impact of new code in production. By employing online experimentation, you can control how and when new, potentially dangerous code is exposed to users. Exposure control enables you to reap the benefits of testing in production while limiting the potential negative impact on your customers and users.
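
As a rough illustration of the exposure control idea in the abstract above, here is a minimal Python sketch. It is not ExP's implementation. The function name `in_treatment`, the experiment key, and the hash-based bucketing scheme are all assumptions made for this example. The sketch hashes a user ID into one of 10,000 buckets and exposes the new code path only to users whose bucket falls below the configured percentage, so exposure can be ramped up gradually, or dropped to zero, without reassigning users.

```python
import hashlib


def in_treatment(user_id: str, experiment: str, exposure_pct: float) -> bool:
    """Return True if this user should be exposed to the new code path.

    Hypothetical helper for illustration only.
    exposure_pct is a percentage between 0 and 100.
    """
    # Hash the experiment name together with the user ID so that the same
    # user lands in independent buckets for different experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()

    # Map the first 32 bits of the hash onto buckets 0..9999
    # (each bucket represents 0.01% of users).
    bucket = int(digest[:8], 16) % 10_000

    # Users in the lowest buckets are exposed; raising exposure_pct only
    # adds users, it never removes anyone who was already exposed.
    return bucket < exposure_pct * 100


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    exposed = sum(in_treatment(u, "new-checkout", 10) for u in users)
    print(f"{exposed} of {len(users)} users exposed at 10%")
```

Because assignment depends only on the hash, a user sees a consistent experience across requests, and setting `exposure_pct` to 0 acts as an immediate kill switch for the new code.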

 
Online Experimentation at Microsoft, 2009
  • Video (46 minutes), PowerPoint, PDF
  • Greg Linden wrote: ... In the barely viewable video of the talk, the action starts at the Q&A around 1:28:00. The presenters of the two talks, Googler Sandra Cheng and Microsoft's Ronny Kohavi, aggressively debate the importance of performance when running weblabs, with others chiming in as well. Oddly, it appears to be Microsoft, not Google, arguing for faster performance.

 

 

 
 

Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce, ACM Data Mining SIG (first invited talk at the SIG)

Abstract: Electronic Commerce is now entering its second decade, with Amazon.com and eBay now in existence for ten years. With massive amounts of data, an actionable domain, and measurable ROI, multiple companies use data mining and knowledge discovery to understand their customers and improve interactions. The talk will cover important lessons and challenges using e-commerce examples across two dimensions: (i) business-level to technical, and (ii) the mining lifecycle from data collection, data warehouse construction, to discovery and deployment. The talk will include examples from real-world A/B tests and Simpson's paradox.