How To Build A Strong A/B Testing Plan
Establishing an A/B testing plan can be one of the most effective ways of optimising your website or app. This is because it allows you to test your ideas with real visitors through a controlled experiment. However, don’t forget to conduct an audit of your user experience first to fix any bugs and usability issues as otherwise these can undermine any experiments you run.
What are A/B and MV Tests?
A/B testing software allows you to randomly direct visitors coming to your site to two or more page designs. Your existing page, such as your homepage, is the ‘control’ (also known as the default). This allows you to compare the performance of your new design with that of your existing page using pre-determined success metrics (e.g. sales or leads). Any statistically significant difference in the performance of the default and the new design(s) can therefore be put down to the change in design rather than other extraneous factors (e.g. the day of the week or change in source of traffic).
Multivariate testing (MVT) is where multiple changes are made simultaneously to a design (each combination of changes is called a recipe). MVTs allow you to measure the impact of changing a number of elements at once to discover the best performing combination of elements. A/B tests are normally used afterwards to validate whether the combination of elements that the MVT identified as optimal actually delivers the expected uplift. This is because individual elements when combined may conflict in some way and so reduce the overall impact of the changes made.
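As a rough illustration of how a testing tool might split traffic, the sketch below assigns each visitor to a variant by hashing a visitor ID. The function name `assign_variant`, the experiment name and the visitor IDs are all hypothetical, and real testing tools may well use a different mechanism.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically map a visitor to one variant (hypothetical helper)."""
    # Hash the experiment name together with the visitor ID so that the same
    # visitor always sees the same variant within a given experiment.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).hexdigest()
    # Treat the hash as a large integer and take it modulo the variant count.
    return variants[int(digest, 16) % len(variants)]

variants = ["control", "new_design"]
counts = {name: 0 for name in variants}
for i in range(10_000):
    counts[assign_variant(f"visitor-{i}", "homepage-test", variants)] += 1

print(counts)  # the split should be roughly even
```

Hashing rather than picking at random on every visit means a returning visitor keeps seeing the same design, which keeps the comparison between the control and the new design clean.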
So, how do you create a strong A/B and MV testing plan?
1. Align Business and Website Goals:
Before establishing an A/B testing plan you must define your business objectives in a way that is clearly understood and is measurable. An e-commerce site for example might set the business objective as to increase the total value of sales by generating more online purchases.
You should then define the website goals to align with the business objectives. These are largely strategies to achieve the business objectives and some will help you define your key performance indicators (KPIs).
Taking the e-commerce site as an example we might set the following website goals:
- Improve engagement – Add dynamic content for most popular merchandise on homepage.
- Improve add to basket rates – Add product images to pages that allow users to zoom in more to examine merchandise.
- Reduce cart abandonment – Add mechanisms for remarketing to customers abandoning site.
- Increase average basket value – Offer free shipping on orders over £50.
Once you have defined your goals you will have a much better idea of what you should be tracking. Your goals also help you identify your key performance indicators. Metrics become KPIs when they relate directly to your business objectives.
In our e-commerce example where the business objective is to increase the total value of sales the average order value could be a KPI. Monitoring your KPIs on a regular basis will allow you to evaluate the success of your strategies and what progress you are making towards your objective.
You are now ready to set targets for each KPI. Without a target for each KPI you have no benchmark to measure your relative performance. If you hire a conversion rate optimisation consultant to help you improve your conversion rate, they should insist you set targets to measure their performance as well, so that you know they are adding value to your organisation.
2. Segment Your Performance:
There is no such thing as an average user and so don’t use site averages in your analysis or testing as you won’t see what’s really happening. Always segment because averages hide the changes within segments and are skewed by the weight given to larger segments.
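To see how a site average can hide what is happening within segments, consider the sketch below. The visitor and conversion figures are made up purely for illustration. Both device segments improve from one month to the next, yet the overall conversion rate falls because the lower-converting mobile segment has grown.

```python
# Illustrative (made-up) figures: (visitors, conversions) per device segment.
data = {
    "month_1": {"desktop": (5_000, 250), "mobile": (5_000, 50)},
    "month_2": {"desktop": (5_000, 300), "mobile": (15_000, 180)},
}

def rate(visitors: int, conversions: int) -> float:
    """Conversion rate for a single segment."""
    return conversions / visitors

def overall_rate(segments: dict) -> float:
    """Site-wide conversion rate, weighted by the size of each segment."""
    visitors = sum(v for v, _ in segments.values())
    conversions = sum(c for _, c in segments.values())
    return conversions / visitors

for period, segments in data.items():
    by_segment = {name: f"{rate(v, c):.1%}" for name, (v, c) in segments.items()}
    print(period, by_segment, f"overall={overall_rate(segments):.1%}")
```

Reporting only the overall figure here would suggest performance has declined, when every segment has actually improved.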
You can identify relevant segments by exploring dimensions in your web analytics tool. Dimensions are visitor attributes, such as source of traffic (e.g. channels or country), type of user (new or returning) and engagement (number of seconds on site). Dimensions also include more technical information, such as device, browser and screen resolution, which can be useful for identifying usability and rendering issues. The rows from one or more dimensions make up a segment.
Never report web analytics data without using at least one dimension to prevent misleading generalisations being made about user data. Three common strategies for segmenting user data are:
1. Source of Traffic:
User behaviour and conversion rates will often vary according to where visitors have come from. Analyse data by source of traffic to find out how conversion rates and other important metrics vary according to organic traffic, PPC, direct, social media and email marketing activity. This will help you identify the best sources of traffic for your website and whether you are investing in the right channels to get a good return on investment. You can also see what visitors from different sources come to your site to buy or read.
2. Behaviour:
For any website it can be useful to analyse users by the frequency of visits, engagement (e.g. time on site) and types of users (new and returning). They may have different needs and interests that are reflected in what they buy. Does the purchasing behaviour of more frequent visitors vary by region or time of day?
3. Outcomes:
For many websites the 80:20 Pareto principle can apply. 20% of your customers can generate 80% of your profit. Analyse visitors by basket value, the products they buy, how they pay for goods etc. This kind of analysis can help you identify your most profitable segments and what products they tend to buy. This can be useful for personalisation and a consideration for A/B testing.
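A quick way to check whether something like the 80:20 pattern holds for your own customers is to rank them by value and see what share the top 20% contribute. The sketch below uses made-up spend figures and a hypothetical helper, `top_share`. It uses spend as a stand-in for profit, which you would swap for margin data if you have it.

```python
def top_share(values: list[float], top_fraction: float = 0.2) -> float:
    """Share of the total contributed by the highest-value customers."""
    ranked = sorted(values, reverse=True)
    # Always include at least one customer in the top group.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)

# Made-up annual spend per customer, for illustration only.
customer_spend = [1200, 950, 80, 60, 55, 50, 45, 40, 30, 20]

share = top_share(customer_spend)
print(f"Top 20% of customers generate {share:.0%} of total spend")
```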
Organisations which use segmentation are more successful at optimisation. It forces them to break down silos and look beyond their internal organisational structure.
3. The 8 Step Process For Reviewing Site Performance:
You should now be ready to review your digital experience to identify opportunities for building an A/B testing plan. I use this eight step process to generate ideas for conversion rate optimisation. You may already undertake some of the steps as part of your digital marketing activities. It’s important to consider all the steps in the process to ensure you have a systematic and comprehensive approach to optimisation. It also helps prevent decisions being based upon a single source of insight as this can be risky and misleading.
I always begin by checking the setup of web analytics. Often they haven’t been fully implemented from the beginning to the end of the user journey or correctly configured. This includes ensuring internal traffic has been filtered out and you block spam and bot traffic from Google Analytics.
The technical analysis is about giving the site a visual check, ensuring it is fast to load and looking for errors and SEO problems. It’s surprising how often a visual check of a site can identify problems with rendering or errors.
The next step is a heuristic evaluation, an expert-led analysis of the digital experience to identify areas of interest. This can also include some customers or prospects to make it more client-centric. The ten usability heuristics defined by Jakob Nielsen are a useful starting point. A heuristic evaluation should generate many good ideas for including in your A/B testing plan. Make sure you look for evidence to support each item in your list.
Now you can look at your web analytics to see how your conversion rate varies by dimensions such as browser, screen size, traffic source and landing page. You should also identify high traffic pages, funnel drop-off pages and exit pages. Always look at mobile users separately from desktop users. If you have many tablet users then also look at them separately. Never run reports with all devices grouped together as this is likely to be highly misleading. You will miss items that could be included in your A/B testing plan.
User Experience and Usability
User experience analytics, such as Hotjar, are great for getting insights into the actual user experience. You can also use Hotjar for getting qualitative feedback via on-site polls and surveys which you can email to customers and prospects. Qualitative feedback is great for understanding the why rather than measuring what users do. Don’t rely on the first answer people give you as this is often what people think you want to hear rather than what the real issues might be. Check out my blog on how to use online voice of the customer tools to boost conversions.
Usability testing is often the last thing that website owners conduct when reviewing their site. It should probably be the first thing you do. One interview at the beginning of a redesign is better than one hundred interviews once the design is complete. Check out my blog, Why does usability testing improve conversions, for more details and a summary of remote usability testing tools. Usability testing will help to answer the why questions that your web analytics throw up and support hypotheses to add to your A/B testing plan.
4. Prioritising Ideas For A/B Testing:
Once you have completed your evaluation you will have a long list of ideas that you may want to add to your A/B testing plan. However, this is only one of the five different buckets you will need to allocate your ideas into.
- Create hypothesis
The A/B testing plan bucket should only be for ideas where there is a clear benefit of conducting a conversion rate optimisation experiment. It probably has a strong hypothesis, but it’s not a no-brainer and so not ready to implement. It may also affect a business critical process (e.g. check-out or registration) where testing allows you to effectively manage the risk of an idea damaging your conversion rate. All ideas that don’t meet these criteria should go into one of the other buckets. This is because the idea is likely to be so simple and obvious that it should be implemented or needs more development/investigation. Only put ideas with strong evidence to support them in your A/B testing plan.
4.1 Prioritising A/B Test Bucket Ideas:
There are a number of frameworks for the prioritisation of A/B testing ideas. Here we will use the PIE (Potential, Importance, Ease) model developed by WiderFunnel. This is a simple, but effective method for prioritising ideas. Prioritisation is necessary because you can only run a limited number of experiments at any one time on your site or app. You will want to ensure your own scarce resources are allocated to tests most likely to give you the biggest return on investment (ROI).
Potential should be based primarily on data (e.g. bounce rate, top exit pages, funnel drop-off rates and conversion rates). You should also consider how big an improvement you can make to the page, taking into account how long ago the page was designed, whether any A/B tests have been run on the page to improve its performance and whether any conversion killers are present.
For importance, prioritise high traffic pages ahead of other pages. They give you the greatest opportunity to deliver a large ROI and they also minimise the time it takes to complete an experiment. Also consider top entry pages, landing pages with expensive traffic (i.e. pay-per-click visitors) and business critical pages (e.g. check-out and registration). Template tests can also deliver big returns. They allow you to test changes across a whole category of pages and aggregate traffic from many different pages.
Ease matters too. Some ideas are so easy and simple to test that you can quickly create and execute them when you have a suitable gap in your schedule. Other ideas can be very political, taking a lot of time and effort to get them approved. Other tests are just very complex, such as alternative user journeys. They take time to develop and set up.
For this reason how easy it is to get a test approved and built is an important consideration when building an A/B testing roadmap. You don’t want to have long gaps in your testing schedule due to having to wait for complex and difficult tests to be ready. Instead it is better to use the time for easier tests that still have good potential to make an impact.
The kind of tests that are most likely to be less easy to conduct include:
- Site-wide elements such as navigation, calls-to-action and banners.
- Site templates
- Dynamic content
- Pages controlled by CMS
- Alternative user journeys
- Where multiple stakeholders need to approve the new experience
Once your team fully understand the prioritisation process, arrange a session to go through each idea and rate them according to the three criteria. Let stakeholders know they can attend as well. This will demonstrate that you are open to their input and are transparent about the process you are using.
I normally use a 5 point scale where 5 is the highest score (i.e. has most potential, importance or ease) and 1 is the lowest score. Once you have scored the ideas for each of the three elements of PIE you can then sort them in order of priority by using the total score. This will give you a matrix such as the one below which you can share with your stakeholders and other interested parties. It is worth asking for feedback to ensure the priorities still reflect the business priorities and it may also help avoid some political arguments later on in the process.
The aim is to begin with high-value, low-cost ideas as these should come out at the top of the prioritisation process. In the middle of the list should be the more difficult tests that will deliver high-value uplifts. At the bottom of the matrix should be low-value, high-cost test ideas that you may decide you won’t test unless you get evidence that challenges their value.
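The scoring and sorting described above can be sketched in a few lines. The idea names and scores here are invented for illustration, and the `Idea` class is a hypothetical helper rather than part of any tool. Ideas are ranked by total score, as described above.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    potential: int   # each criterion is scored 1 (lowest) to 5 (highest)
    importance: int
    ease: int

    @property
    def total(self) -> int:
        """Total PIE score used to rank the idea."""
        return self.potential + self.importance + self.ease

# Invented ideas and scores, for illustration only.
ideas = [
    Idea("Footer link colour", potential=1, importance=2, ease=5),
    Idea("Alternative check-out journey", potential=5, importance=5, ease=1),
    Idea("Free shipping threshold banner", potential=4, importance=5, ease=4),
]

# Highest total score first.
ranked = sorted(ideas, key=lambda idea: idea.total, reverse=True)
for idea in ranked:
    print(f"{idea.total:>2}  P={idea.potential} I={idea.importance} E={idea.ease}  {idea.name}")
```

In this made-up example the valuable but difficult check-out test lands in the middle of the list, behind an idea that is both valuable and easy to run.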
5. Develop ideas into an A/B testing plan:
To ensure your A/B testing plan delivers significant value and supports your growth strategy you will need to consider the following factors.
5.1 Create Strong Hypothesis:
So that everyone involved with each test has a clear understanding of what and why you are testing it is important to develop a strong hypothesis for each test. This is a change and effect statement to help resolve the problem you have identified with the user experience. It should be based upon qualitative (e.g. customer feedback) and/or quantitative data (e.g. click heat maps) to avoid making decisions based purely on gut instinct. The hypothesis also needs to be measurable and include a rationale for why the proposed change will influence the target metric in a positive way.
An example of a hypothesis:
Problem: Less than half a percent of new visitors to the homepage sign-up and create an account.
Hypothesis: New visitors landing on the homepage are distracted because there are banners and animated elements that are designed for existing customers and so this reduces the relevance of the content for prospects.
Idea: We redirect new visitors to an acquisition focused landing page with content designed purely for non-customers and have a prominent single call-to-action.
5.2 Consider your testing strategy:
Conventional A/B testing suggests that you change one item at a time to understand what influence each change has on the success metric. This provides clarity on what impact changes have, but unless you are lucky enough to have a huge amount of traffic and the ability to run multiple tests simultaneously this will slow the optimisation process down.
In addition, making one change at a time may also prevent innovative new designs emerging that require multiple changes. This can result in a site being unable to get out of a local maximum as small increments rarely lead to huge uplifts in conversion.
Check out the different kinds of experiments below that you can include in your A/B testing plan. For more details check out my post How to optimise your site using A/B testing.
5.3 Set an appropriate level of statistical confidence:
Before you begin any A/B test it is important to calculate how much data you are likely to need to reach a statistically significant result with your experiment. The objective of an A/B test is to collect sufficient data (i.e. sample) that the result is representative of what we would see in practice. The larger the sample size, the more representative and more reliable the test becomes.
You will need to set a level of confidence for your test (e.g. normally 95% or 99%) before estimating your sample size. This should be based upon the level of confidence needed given the level of risk and the potential benefits/costs to the organisation of making a wrong decision (i.e. rejecting the hypothesis when the idea would have actually improved conversion). Don’t automatically set a high level of confidence when the level of risk is low and the benefits are potentially high.
Here is a great sample size calculator from evanmiller.org. This will allow you to estimate how long you will need to run your A/B test to build up a sufficient sample size. Always set this before you begin a test, because a test that is stopped too early will not have collected enough data to give a reliable result.
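If you want a rough estimate in code, the sketch below uses a standard normal-approximation formula for comparing two conversion rates. It is not the method used by any particular calculator, so its output may differ slightly from theirs. The 80% power default, the function names and the traffic figures are my assumptions for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline: float, relative_uplift: float,
                            confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect the given uplift."""
    p1 = baseline
    p2 = baseline * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

def days_to_run(n_per_variant: int, variants: int, daily_visitors: int) -> int:
    """Days needed if all daily visitors are split evenly across the variants."""
    return math.ceil(n_per_variant * variants / daily_visitors)

# Hypothetical inputs: 5% baseline conversion, hoping to detect a 20% relative uplift.
n = sample_size_per_variant(baseline=0.05, relative_uplift=0.20)
days = days_to_run(n, variants=2, daily_visitors=2_000)
print(f"{n} visitors per variant, roughly {days} days at 2,000 visitors a day")
```

Note how quickly the required sample grows as the uplift you want to detect shrinks. This is one reason high traffic pages are easier to test.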
Before ending a test you also need to consider the margin of error. This is the upper and lower bounds of the targeted metric of the two variants (i.e. A and B experiences). It is usually expressed in terms of +/- x%. This is related to the natural variation you get in results from taking a sample rather than a census. Again the size of the margin of error is determined by how large the sample is and so it will decline as the sample becomes more representative of all visits. This normally needs to fall below 5% before you can treat the result as reliable.
For a more detailed explanation of sampling and statistical confidence for A/B testing see this excellent blog from KissMetrics. If you are using a testing engine that uses Bayesian statistics then you have a different set of decisions to make.
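As a minimal sketch of how the margin of error shrinks with sample size, the code below uses the simple normal approximation for a conversion rate. Your testing tool may calculate its intervals differently. The function name and the figures are illustrative assumptions, and the result is expressed in absolute terms (a fraction of visitors, so 0.01 means one percentage point).

```python
import math
from statistics import NormalDist

def margin_of_error(conversions: int, visitors: int, confidence: float = 0.95) -> float:
    """Half-width of the confidence interval for a conversion rate."""
    p = conversions / visitors
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / visitors)

# The same 5% conversion rate measured on a small and a large sample.
for conversions, visitors in [(50, 1_000), (500, 10_000)]:
    moe = margin_of_error(conversions, visitors)
    print(f"{visitors:>6} visitors: {conversions / visitors:.1%} +/- {moe:.2%}")
```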
5.4 Success Metrics:
It’s important not to rely on a single metric when testing. You might increase average basket size, but if people are buying lower value items you could actually see a drop in revenue and profits. Always measure what drives your business forward and other KPIs that inform the progress you are making towards your business goals. Revenue and margin are crucial for most organisations and so make sure you always measure these metrics to check that an increase in conversion doesn’t negatively influence these success metrics.
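The sketch below shows why it helps to report several metrics side by side. The figures are invented for illustration and `summarise` is a hypothetical helper. Here the variant wins on conversion rate but loses on revenue per visitor because the average order value has dropped.

```python
def summarise(visitors: int, orders: int, revenue: float) -> dict:
    """Core e-commerce metrics for one test variant."""
    return {
        "conversion_rate": orders / visitors,
        "average_order_value": revenue / orders,
        "revenue_per_visitor": revenue / visitors,
    }

# Invented results, for illustration only.
control = summarise(visitors=10_000, orders=300, revenue=18_000.0)
variant = summarise(visitors=10_000, orders=330, revenue=17_160.0)

for metric in control:
    change = variant[metric] / control[metric] - 1
    print(f"{metric:<20} control={control[metric]:.3f} "
          f"variant={variant[metric]:.3f} change={change:+.1%}")
```

Declaring the variant a winner on conversion rate alone would, in this made-up case, reduce revenue.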
6. Share Your A/B Testing Plan:
To build a culture of experimentation it’s important to share your A/B testing plan and the results you achieve. You can’t do everything through a centralised team and so seek to embed your team within the agile development process within your organisation. But also don’t forget to engage people across the organisation. Building a culture of experimentation is similar to building a community, as you need to encourage communication and collaboration on a continuous basis.
7. Start Again:
Building an A/B testing plan is a never ending process as you seek to continuously improve your success metrics and respond to an ever changing environment. Whether a test improves, reduces or makes no difference to your success metrics it is important to learn from each test. Don’t see any test as a failure as it is telling you something useful about your users.
Indeed, you may want to return to some tests that negatively impacted or didn’t move success metrics. It may have been the context of the test or the implementation of the design that resulted in the idea not improving the conversion rate. Your business and competitor environment is in a constant state of flux and so you need to repeat this process on a regular basis to ensure your priorities are still aligned with the business and identify new problems to fix.
There is no more scientific way of learning what works on your site or app than an A/B or MVT experiment. Begin by aligning your business goals with those of your website. This means that you should be able to set a clear goal for every page of your website. If a page doesn’t have a clear objective, consider whether you really need it.
Web analytics are crucial to understanding the performance of your site. However, never look at data without segmenting it by either source of traffic, behaviour or outcomes. The most useful insights and KPIs will be where individual dimensions overlap.
Use a systematic framework such as the eight steps outlined to review your site or app. It’s never wise to rely on a single piece of information and so always look for other information that supports or contradicts an insight that you are thinking of using. Confirmation bias is our enemy here and so don’t ignore data that undermines your hypothesis.
Allocate the ideas generated into different buckets depending upon how clear the benefits and costs of making the change are to you. You can then prioritise those in the A/B testing bucket according to their potential, importance and ease with which the test can be executed.
Before building a test it is essential to develop a strong hypothesis to ensure you learn something from each experiment. This should include your rationale for why the change will improve your success metrics. It will help the designer understand what you are trying to achieve.
One of the most common errors made with A/B testing is ending an experiment before full statistical confidence is achieved. Always ensure you calculate the required sample size and estimate how long the test will need to run to achieve the required level of statistical confidence. If the test is not conclusive by the end of the allotted time, consider stopping it, depending upon the opportunity cost of continuing with it.
Finally, once you have completed a test, review what you have learnt. Be careful not to get stuck in a local maximum by insisting on only making incremental changes with each test.