Is Survivorship Bias Destroying Your Optimisation Strategy?
Survivorship Bias & Website Optimisation:
During World War II, researchers at the US Center for Naval Analyses were given a difficult problem to solve. They were asked to reinforce US bombers to reduce aircraft losses over enemy territory. They decided to conduct a review of the damage inflicted on US bombers that returned from combat missions.
What we see is all there is:
Naturally they recommended that armour should be added to those areas that showed the most damage on the planes they assessed. But the statistician Abraham Wald pointed out that the study was suffering from survivorship bias. The researchers had only considered aircraft that survived their missions. The bombers not included in the analysis had been shot down.
Wald argued that the holes in the returning aircraft represented areas where a plane could take damage and still return home safely. Using this insight he recommended that they reinforce the essential working parts of the plane, such as the engines, where returning planes largely showed no damage. Damage to these components was the most likely to prevent a plane from returning home safely.
What is survivorship bias?
Survivorship bias is one of the most common logical errors that optimisers make. It plays on our desire to deconstruct success and cherry-pick data that confirms our existing beliefs (see confirmation bias). People are prone to treating survivors as if they were representative of the whole group, when they should be compared with the overall average.
By only examining successful outcomes we tend to become over-optimistic and may set unrealistic expectations about what optimisation can deliver in the short-term. We have a tendency to ignore the tests that have failed to deliver an uplift and only focus on our successes. As a result we overestimate the importance of skill and underestimate the role of luck in the process.
To manage expectations appropriately consider:
- Huge uplifts from tests don’t happen very often.
- Testing the low-hanging fruit will not give you a competitive advantage.
- A majority of tests don’t achieve an uplift. However, negative or neutral tests still provide valuable insights. Don’t ignore them.
- Conversion rate optimisation is a long-term strategy and not a tactical sprint.
- Tests that work for one site may not work on another. Each site is unique and has its own customer base.
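The effect of only looking at successful tests can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real testing programme: every simulated test has a true uplift of zero, the observed uplift is pure random noise, and the function name, noise level and win threshold are all hypothetical values chosen for illustration.

```python
import random
from statistics import mean


def simulate_reported_uplifts(n_tests=1000, noise_sd=0.05, win_threshold=0.05, seed=1):
    """Simulate tests with NO true effect and compare all results with the 'winners'.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Every test has a true uplift of zero; what we observe is noise only.
    observed = [rng.gauss(0.0, noise_sd) for _ in range(n_tests)]
    # "Survivors": the tests that would be reported as wins.
    winners = [u for u in observed if u > win_threshold]
    avg_winners = mean(winners) if winners else 0.0
    return mean(observed), avg_winners, len(winners)


avg_all, avg_winners, n_winners = simulate_reported_uplifts()
print(f"Average uplift across all tests: {avg_all:+.3f}")
print(f"Average uplift across {n_winners} 'winning' tests: {avg_winners:+.3f}")
```

Averaging only the winners suggests a healthy uplift even though no test had any real effect, which is why failed and neutral tests need to stay in the picture.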
Survivorship bias can also lead to misleading conclusions and the false belief that successful members of a group (e.g. VIP customers) share innate characteristics, when in fact they are the product of a process they have completed. For example, very few, if any, customers are born as VIPs. Optimisers need to be careful to avoid the following traps resulting from survivorship bias:
Understand visitor types:
Visitors are influenced by the process they complete online. Be careful about including returning visitors or existing customers in your A/B tests.
Returning visitors are problematic for two reasons. They may have already been exposed to the default design, and they are not representative because most visitors don’t return to a site. Returning visitors are survivors because they didn’t abandon your site and decide never to come back due to negative aspects of the user experience. They weren’t put off by your value proposition, the auto-slider, the long registration form or other aspects of your site that may have caused some new visitors to bounce. They are also likely to have higher levels of intent than most new visitors.
Existing users are potentially even more biased as they have managed to jump through all the hoops and navigate around all the barriers that many other users may have fallen at. They have also worked out how to use your site and are getting sufficient value to want to continue using it. This means they are likely to respond very differently to changes in content than a new visitor would.
This does not mean you cannot conduct A/B tests with returning visitors or existing customers. You can if the objective is appropriate and you don’t assume the test result will apply to other visitor types. Just be careful about what you read into the results.
Examine user personas:
Similarly each user persona may have different intent levels due to the source of traffic or other factors influencing behaviour. For instance, be careful with including Direct traffic in your A/B tests. You have to question why visitors would type your URL directly into their browser if they are really new to the site. Perhaps some of these visitors have cleared their cookies and so are in fact returning visitors?
Why do uplifts sometimes decay?
Survivorship bias can also result in management questioning the sustainability of uplifts. When you first launch a tactical change to your website, such as a promotional pop-up, it is something new that none of your visitors will have seen before.
This may result in a significant uplift in your overall conversion rate for both new and returning visitors. However, a proportion of the visitors seeing the prompt for the first time will sign up. These users will no longer be part of your target audience as they have created an account.
As a consequence this will automatically reduce your overall conversion rate over time. Those who are going to be influenced by the pop-up sign up and those who are not don’t. Furthermore, as visitors come back to the site after experiencing the new pop-up, the group of non-customers who have not seen this particular pop-up before will shrink until it consists only of new visitors. As returning visitors become acclimatised to the promotional pop-up its effectiveness is likely to decline among this type of visitor.
This can make it appear the uplift was not sustainable. However, if you analyse new visitor conversion you are likely to see that the uplift has largely been maintained. But even here there may be a notable decay in the uplift over time. A proportion of returning visitors regularly clear their cookies and so are tracked as new visitors by your web analytics.
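The decay described above can be sketched with a simple deterministic model. This is only an illustration under loudly stated assumptions: the function name and every number (audience sizes, the first-exposure sign-up rate `p_first`, the repeat-exposure rate `p_repeat`, and the share of previously exposed visitors who return each week) are hypothetical, and cookie clearing is ignored.

```python
def simulate_popup_decay(weeks=8, existing=10_000, new_per_week=2_000,
                         p_first=0.10, p_repeat=0.02, return_rate=0.5):
    """Return a list of (overall_rate, first_exposure_rate) per week.

    All parameter values are illustrative assumptions, not measured data.
    """
    # In week 1 nobody has seen the pop-up: existing non-customers plus new visitors.
    unseen = existing + new_per_week
    seen = 0.0  # non-customers who have already seen the pop-up
    results = []
    for _ in range(weeks):
        seen_visits = seen * return_rate          # exposed visitors who come back this week
        first_signups = unseen * p_first          # sign-ups on first exposure
        repeat_signups = seen_visits * p_repeat   # sign-ups among acclimatised visitors
        overall = (first_signups + repeat_signups) / (unseen + seen_visits)
        first_rate = first_signups / unseen
        results.append((overall, first_rate))
        # Sign-ups leave the target audience; everyone else has now seen the pop-up.
        seen = seen - repeat_signups + (unseen - first_signups)
        unseen = new_per_week                     # from week 2, only new visitors are unexposed
    return results


for week, (overall, first_rate) in enumerate(simulate_popup_decay(), start=1):
    print(f"Week {week}: overall {overall:.1%}, first exposure {first_rate:.1%}")
```

In this sketch the first-exposure rate stays flat by construction while the overall rate falls week after week, simply because the mix of visitors shifts towards people who have already seen the pop-up and not signed up.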
This needs to be explained to stakeholders to manage their expectations for the overall conversion rate. If it is not understood, the decline is sometimes used to challenge the sustainability of uplifts from conversion rate optimisation. To respond to this phenomenon it is worth revisiting changes on a regular basis to review conversion rates and to test new variants if necessary.
Frequency of email and push notification campaigns:
A common question that digital marketers have is what the optimum frequency of email and push notification campaigns is. Often people assess this by analysing existing user engagement. However, relying on existing users is a heavily biased approach because these customers have self-selected on the basis that they are happy with your current frequency of engagement. Those who are not happy with the level of contact will have already unsubscribed.
Instead you should test email and push notification contact frequency using an unbiased list of new users who have recently signed up and have never been included in CRM campaigns. This only works if the sample size of this clean list is large enough.
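One way to build such a test is to filter out anyone who has already received a campaign and then randomly split the remaining users across the contact frequencies being tested. The following is a minimal sketch: the record fields `id` and `campaigns_received`, the function name and the frequency labels are hypothetical and would need to be mapped to whatever your CRM actually stores.

```python
import random


def assign_frequency_groups(users, frequencies, seed=42):
    """Randomly split users who have never received a campaign across frequency groups.

    `users` is a list of dicts with hypothetical keys "id" and "campaigns_received".
    """
    # Keep only the clean list: users who have not received any campaign so far.
    clean = [u["id"] for u in users if u["campaigns_received"] == 0]
    rng = random.Random(seed)
    rng.shuffle(clean)  # randomise order so group membership is not tied to sign-up order
    groups = {freq: [] for freq in frequencies}
    # Deal users out in turn so group sizes differ by at most one.
    for i, user_id in enumerate(clean):
        groups[frequencies[i % len(frequencies)]].append(user_id)
    return groups


# Illustrative data: every fourth user has already been contacted and is excluded.
example_users = [{"id": i, "campaigns_received": 0 if i % 4 else 3} for i in range(100)]
example_groups = assign_frequency_groups(example_users, ["weekly", "fortnightly", "monthly"])
for freq, ids in example_groups.items():
    print(freq, len(ids))
```

Random assignment matters here because it stops any existing preference for a contact frequency from deciding which group a user ends up in.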
Be cautious about rolling out changes that generate uplifts for pre-qualified visitors. Just because a landing page produces an uplift from a highly engaged email list you cannot assume it will help convert unqualified traffic.
Different types of CTAs:
Why is it that some graphic designers think all calls to action (CTAs) should look identical? The reason aircraft cockpits have different types, sizes and colours of switches and buttons is to clearly differentiate between their different uses. A newsletter sign-up CTA is very different from an add to basket button or a buy CTA. The nature of the user’s decision needs to be reflected in the design of the CTA and so it is dangerous to prescribe in your brand guidelines that all CTAs look the same.
Instead, you should optimise a page for the specific CTA that is required for the stage in the user journey. As a user proceeds through the conversion journey their intent and needs change. This should be reflected in the design of the CTA. Just because a CTA works on a landing page does not mean it will be optimal for a product page or check-out.
Law of small numbers:
Be careful not to rely on small sample sizes when analysing web analytics or test results. The law of small numbers means that we have a tendency to underestimate the impact of small sample sizes on outcomes. Essentially we often only get certain results because of the unreliability of small numbers. When so few survivors are left, extreme results become much more likely.
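A short simulation makes the point concrete. This is a sketch under assumptions: the true conversion rate of 5%, the sample sizes and the function name are all made up for illustration, and each visitor is treated as an independent coin flip.

```python
import random


def observed_rates(true_rate, visitors, trials, seed=0):
    """Simulate `trials` samples of `visitors` each and return the observed conversion rates."""
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        # Each visitor converts independently with probability `true_rate`.
        conversions = sum(1 for _ in range(visitors) if rng.random() < true_rate)
        rates.append(conversions / visitors)
    return rates


small = observed_rates(0.05, visitors=50, trials=200)
large = observed_rates(0.05, visitors=5000, trials=200)
print(f"50 visitors per sample:   {min(small):.1%} to {max(small):.1%}")
print(f"5000 visitors per sample: {min(large):.1%} to {max(large):.1%}")
```

Although the underlying conversion rate is identical in both cases, the small samples swing across a far wider range, so an apparently dramatic result from a small segment may be nothing more than noise.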
Take care with multivariate tests:
Avoid having too many recipes (i.e. variables being changed) in your MVTs as otherwise you will end up with small sample sizes. It may be better to concentrate on testing one area at a time with a well-designed A/B test. Often a slower optimisation process staying within your traffic capabilities is more reliable than trying to overdo multivariate testing.
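The arithmetic behind this is simple. The sketch below assumes a full-factorial MVT in which every combination of options is a separate recipe and traffic is split evenly between them; the function name and the traffic figures are hypothetical.

```python
import math


def visitors_per_recipe(total_visitors, options_per_element):
    """Return (number_of_recipes, visitors_per_recipe) for a full-factorial test.

    `options_per_element` lists how many versions of each page element are tested,
    e.g. [2, 3, 2] for two headlines, three images and two button designs.
    """
    # Every combination of options is one recipe.
    recipes = math.prod(options_per_element)
    return recipes, total_visitors // recipes


# Assumed example: 20,000 visitors available for the test period.
print(visitors_per_recipe(20_000, [2]))        # simple A/B test
print(visitors_per_recipe(20_000, [2, 3, 2]))  # MVT with three elements
```

With the same traffic, the A/B test gets 10,000 visitors per variant while the three-element MVT is left with fewer than 1,700 per recipe, which is where the small sample problem comes from.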
Don’t take users literally:
Qualitative research and other forms of customer feedback (e.g. usability testing) can provide useful insights for understanding user needs and for developing hypotheses. However, most users don’t reply to surveys, whether online or offline, so those who do respond are themselves a self-selected group. Further, neuroscience research indicates that a majority of our decisions are made by our non-conscious brain. This means that we are not fully aware of why we make many of the small decisions when navigating a website. Always make decisions based upon users’ actions and not what they say. If you can, validate your hypothesis with A/B testing, as this is the only way to find out how visitors react to a change in real life.
Survivorship bias is a common logical error that we can all suffer from at times. Teams involved in conversion optimisation may be prone to survivorship bias if they lack a good understanding of statistics. Training in this area of optimisation will reduce the likelihood that they fall into the trap of neglecting users who don’t survive a process.