Sometimes it is difficult to pinpoint exactly what is making your company’s growth plateau or shrink. In fact, some potential culprits are hard even to perceive because they seem so fundamentally beneficial. Nonetheless, it is important to ask hard questions – and, in so doing, put different aspects of your company under a microscope – if you want to grow. (For example, have you truly adopted high-performance hosting so that your infrastructure is furthering UX?)
In that spirit, here’s a question: Is it possible that split testing (or A/B testing) could be hurting your e-commerce revenue? Clearly, the concept behind split testing is a sound one: by showing different versions of a page to random segments of your audience, you should be able to determine which one is preferable based on how well each version turns site visitors into users or customers. This method has even somewhat controversially been used by major newspapers to split-test headlines, driving more traffic to news stories to keep the outfits prominent in the digital era.
A/B testing seems to be a smart way to better understand how your prospects and users make decisions; so how could it hurt your revenue? Online growth specialist Sherice Jacob notes that the trusted, somewhat standardized practice often does not deliver the results that business owners and executives expect. Jacob points out that this form of digital analysis, somewhat bizarrely, “could be the very issue that’s causing even the best-planned campaign to fall on its face.”
In a way, though, it’s not bizarre. Thoughtful business decisions often have unexpected results. (Anything can be done well or poorly – such as your choice of host, which will determine whether your infrastructure is secure. Failure to look for SSAE-16 auditing is an example of a mistake made when picking a web host.) What mistakes can be made when split testing? How and why does it fail? Let’s take a look.
- How many tails do you have?
- The magic of split testing: Is it all an illusion?
- Getting granular – 6 key questions for core hypotheses
- SEO hit #1 – failure to set canonicals
- SEO hit #2 – failure to delete the losing option
- Results from your e-commerce hosting
How many tails do you have?
Analytics company SumAll put two copies of a page that were identical – with no differences whatsoever – into one of the most well-known split-testing tools, Optimizely. Option A beat option B by almost 20%. Optimizely fixed that particular issue; nonetheless, it does reveal how misleading the output from these experiments can be. Imagine, after all, if those pages had differed in just one minor detail. You would then confidently assume that A was the better choice, and feel backed up by the software’s numbers.
An issue such as this can arise with A/B testing, fundamentally, because of the design of the algorithms built into the software. These designs are categorized as one-tailed and two-tailed. A one-tailed test simply looks for an effect in a single direction – a black-and-white answer. With just one tail, your weakness is statistical blind spots, says Jacob. Two-tailed testing looks at these e-commerce outcomes from two different angles.
The distinction made by the UCLA Institute for Digital Research and Education helps to clarify:
- One-tailed – Testing that is based on determining whether there is a relationship from a single direction “and completely disregarding the possibility of a relationship in the other direction.”
- Two-tailed – No matter which direction you use to address the relationship, “you are testing for the possibility of the relationship in both directions.”
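To make the distinction concrete, here is a minimal sketch of the same experiment evaluated both ways, using a standard two-proportion z-test built only from the Python standard library. The traffic and conversion numbers are made up for illustration and are not tied to Optimizely or any other tool.

```python
# Illustrative sketch: one-tailed vs. two-tailed evaluation of the same
# A/B result. All visitor and conversion counts below are hypothetical.
from statistics import NormalDist

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)        # pooled rate under H0
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    return (p_b - p_a) / se

z = two_proportion_z(conv_a=200, n_a=4000, conv_b=235, n_b=4000)
tail = 1 - NormalDist().cdf(abs(z))  # probability in one tail

p_one_tailed = tail        # only asks: "is B better than A?"
p_two_tailed = 2 * tail    # asks: "do A and B differ, in either direction?"

print(f"z = {z:.3f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

With these particular numbers the one-tailed p-value clears the conventional 0.05 bar while the two-tailed one does not – the same data, declared a "winner" or "inconclusive" purely by the shape of the test.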
The magic of split testing: Is it all an illusion?
In 2014, conversion optimization firm Qubit published a white paper by Martin Goodson with the shocking title, “Most Winning A/B Test Results are Illusory.” In the report, Goodson presents evidence that shows that poorly performed split testing is actually more likely to lead to false conclusions than true ones – and, well, um, bad information should not be integrated into e-commerce strategy.
The crux of Goodson’s argument comes down to the concept of statistical power – which can be understood by imagining a project in which you want to find out the height difference between men and women. Measuring only one member of each sex would not give you a very broad data set. Measuring a larger population of men and women will mean that the average heights stabilize and the actual difference is better revealed. As your sample size grows, you gain greater statistical power.
To get back to the notion of split testing, let’s say that you have two variants of the site you want to assess. Group A sees the site with a special offer. Group B sees the site without it. You simply want to calculate the difference in response based on the presence of the offer. The difference between the two results should be considered in light of the statistical power (amount of traffic).
What is the significance of statistical power? Knowing the sample size (volume of traffic) you need up front will ensure that you don’t stop the testing before you have collected enough data. Stopping early makes it easy to see false positives that lead you in the wrong direction.
Goodson says to think of a scenario in which two months of traffic would give you enough statistical power for reliable results. A company wants the answer right away, so it tests for just two weeks. What is the impact? “Almost two-thirds of winning tests will be completely bogus,” he says. “Don’t be surprised if revenues stay flat or even go down after implementing a few tests like these.”
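You can estimate that required sample size before the test starts. Below is a rough sketch using the standard formula for a two-proportion comparison; the baseline rate, expected lift, and significance/power levels are illustrative assumptions, not figures from Goodson’s paper.

```python
# Hedged sketch: approximate visitors needed PER VARIANT before a
# two-tailed A/B test has adequate statistical power. The 5% baseline
# and 6% target conversion rates below are hypothetical examples.
from math import ceil
from statistics import NormalDist

def visitors_per_variant(p_base, p_variant, alpha=0.05, power=0.8):
    """Required sample size in each group for a two-proportion z-test."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # two-tailed significance level
    z_beta = nd.inv_cdf(power)            # desired statistical power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    n = (z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2
    return ceil(n)

# Detecting a lift from a 5% to a 6% conversion rate:
n = visitors_per_variant(0.05, 0.06)
print(n)  # thousands of visitors per variant, not a two-week trickle
```

Note how quickly the requirement falls as the expected lift grows: small, subtle improvements are exactly the ones that demand the most traffic and the most patience.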
Getting granular – 6 key questions for core hypotheses
You want results that are meaningful from these tests. Otherwise, why bother? Think in terms of possible sources of confusion or frustration for visitors, either at the level of the hook or within the funnel, advises Qualaroo CEO Sean Ellis. Get this information directly from users via surveys or other comments.
Based on those bits and pieces, come up with a few hypotheses – your hunches about what you can do that might improve the conversion rate or give you better business intelligence. You can see whether or not those hypotheses are correct using the A/B tests, via an organized testing plan. A testing plan will make it much easier to strategize and consistently collect more valuable information.
These 6 questions can guide you as you develop your testing plan, says Ellis:
- What is confusing customers?
- What is my hypothesis?
- Will the test I use influence response?
- Can the test be improved in any way?
- Is the test reasonable based on my current knowledge?
- What amount of time is necessary for this test to be helpful?
That short list of questions can help you become more sophisticated with your A/B testing to avoid false positives and use the method in its full glory.
SEO hit #1 – failure to set canonicals
Split testing can also hurt your SEO. You need to set a canonical URL for each test, because two nearly identical versions of the same page will otherwise confuse search engines about which one to index.
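A simple way to stay on top of this is to audit your variant pages for the canonical tag before a test goes live. Here is a small sketch using only Python’s standard-library HTML parser; the markup and URL are hypothetical examples, not a real site.

```python
# Audit sketch: scan a variant page's HTML for a rel="canonical" link so
# you can confirm both A/B versions point search engines at one URL.
# The example markup and URL below are hypothetical.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "canonical":
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL, or None if the page is missing one."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

variant_b = '<head><link rel="canonical" href="https://example.com/offer"></head>'
print(find_canonical(variant_b))
```

Running this against every live variant (and flagging any page where it returns None) turns the canonical check from something you remember occasionally into something you verify every time.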
SEO hit #2 – failure to delete the losing option
Another issue for SEO (and in turn for your revenue, if not your conversion) that can be caused by A/B testing is when you do not delete the page that loses in the comparison. That’s particularly important if you’ve been testing the choices for a while – since that generally means that the search engines will have indexed it.
“Deleting it does not delete it from search results,” notes Tom Ewer via Elegant Themes, “so it’s quite possible that a user could find the page in a search, click it, and receive a 404 error.”
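One common remedy – my suggestion here, not something Ewer prescribes in the quote above – is to map the retired variant’s URL to the winning page with a permanent (301) redirect rather than letting it 404. A minimal sketch of that lookup, with hypothetical paths:

```python
# Hedged sketch: a redirect map a site could consult before returning a
# 404, so a search hit on the retired losing variant lands on the
# winning page instead of an error. All paths are hypothetical.
RETIRED_VARIANTS = {
    "/offer-b": "/offer",          # losing variant -> winning page
    "/old-landing": "/landing",
}

def resolve(path):
    """Return (status, location) for an incoming request path."""
    if path in RETIRED_VARIANTS:
        return 301, RETIRED_VARIANTS[path]   # permanent redirect for SEO
    return 200, path                          # serve the page normally

print(resolve("/offer-b"))   # the retired variant redirects
print(resolve("/offer"))     # the winner serves as usual
```

The 301 tells search engines the move is permanent, so the stale index entry is eventually replaced rather than left pointing at a dead page.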
Results from your e-commerce hosting
Just as you want to see impressive results (and not a downturn) from your split-testing, you want your hosting to be working in your favor as well – and for online sales, security and performance are fundamental. At Total Server Solutions, compliance with SSAE 16 is your assurance that we provide the best environmental and security controls for data & equipment residing in our facilities. See our high-performance plans.