Data is a wonderful thing. Data is a dangerous thing. Here's a fabulous correlation that proves the point:
Next time you're shown data to prove something, pause a moment and dig a little deeper. Never take statistics at face value.
We're pretty obsessed with A/B testing and proving the ROI of our tools. Just yesterday we declined to work with a hotel group that wanted to run an A/B test comparison between us and another provider. They intended to test each provider on a couple of hotels for one month.
Whilst such a test may appear on the surface to be a useful and valuable way to compare vendors, it is statistically unsound. The most obvious reasons are:
1. Different guests. The two separate groups of hotels are different and may attract different guests who behave in different ways.
2. Insufficient data. Each group in the A/B test will receive around 100 conversion events over the month. If we assume a 20% uplift in conversion from running the Price Check widget (about average for larger hotel groups with good websites), then to have a statistically significant result that 'proves' the efficacy of Price Check (at a 95% confidence level) we actually need 410 conversion events. The data set is therefore far too small.
3. The impact of accuracy. We know that the average booker can take 30 days to make a booking. The problem with a single-month test is that it hides the impact of the low accuracy we see among other price-comparison widget providers. It is not until months 2 and 3 that the negative impact of poor accuracy shows up and the conversion uplift is reversed (if you display inaccurate pricing data).
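The sample-size claim in point 2 can be sketched with the standard two-proportion power calculation. The baseline conversion rate (2%) and statistical power (80%) below are assumptions for illustration, as the post only states the 20% uplift and 95% confidence level, so treat this as a sketch of the method rather than the exact calculation behind the 410 figure:

```python
from math import ceil, sqrt
from statistics import NormalDist

def visitors_per_group(p_base: float, rel_uplift: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed per group to detect a relative conversion uplift
    with a two-sided two-proportion z-test (normal approximation)."""
    p1 = p_base
    p2 = p_base * (1 + rel_uplift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for 95% confidence
    z_b = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Assumed 2% baseline conversion, 20% relative uplift:
n = visitors_per_group(0.02, 0.20)
conversions = ceil(n * 0.02)  # rough conversion-event count in the control group
print(n, conversions)
```

With these assumed inputs the required conversion events per group land in the hundreds, the same order of magnitude as the 410 quoted above, so roughly 100 events per month is nowhere near enough to call a winner.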
Data is a wonderful thing. Data is a dangerous thing. It is far, far too easy for statistics to imply a false story, which is why I thought I'd share a couple of my favourite data correlations from Tyler Vigen that suggest extraordinary stories but are in reality totally coincidental.
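To see how easily two unrelated series can look connected, here is a minimal sketch with entirely made-up numbers: like most of Vigen's examples, both series simply drift upward over the same years, which is enough to produce a near-perfect correlation with no causal link at all.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two invented, causally unrelated series that both trend upward year on year.
cheese_consumption = [29.8 + 0.25 * i + 0.1 * (-1) ** i for i in range(10)]
engineering_phds = [480 + 15 * i + (i % 3) for i in range(10)]

r = pearson_r(cheese_consumption, engineering_phds)
print(round(r, 3))  # very close to 1.0 despite no causal relationship
```

Any shared trend, here just the passage of time, will do this, which is exactly why a correlation alone proves nothing.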