One of the main frustrations I have with Google Analytics is how to keep my data clean of spam and bot traffic. Web analytics is critical to conversion rate optimisation. You need reliable data to measure the performance of your digital marketing activity. The last thing you need is data in Google Analytics that is not accurate.
However, Incapsula estimate that up to 62% of website traffic is made up of automated bots. Spammers are constantly adapting their methods to avoid common strategies of dealing with the fake traffic they generate. It can be difficult to keep on top of the problem they create.
Don’t worry though, there are some proven methods to rid your Google Analytics data of spam and bots.
Types of Spam:
There are two types of spam in Google Analytics: ghosts and crawlers.
Ghosts don’t even access your site, but they make up a majority of fake traffic tracked by Google Analytics. This is important to understand as it explains why such traffic won’t be tracked by Google’s Search Console and server-side solutions like WordPress plugins won’t prevent ghosts.
Spammers use the Measurement Protocol which allows developers to send data directly to Google Analytics’ servers. By using a randomly generated Google Analytics tracking code ID the spammers leave a “visit” with fake data without even knowing which site they are hitting.
Unlike ghost spam, crawlers do access your site and they ignore rules like those in robots.txt files that are designed to prevent them from reading your site. This means when they leave your site they create a record that mimics a real visitor.
Crawlers are more difficult to identify as they target known sites rather than random GA IDs. However, new crawlers are less common and so if you notice a suspicious looking referral in your analytics, checking it out on Google is relatively easy.
What does this tell us?
Don’t try to handle spam individually because it’s time consuming and so inefficient.
Server-side solutions (e.g. WordPress plugins or .htaccess) won’t prevent ghost spam as this type of spam never touches your site.
Don’t get concerned about spam being detrimental to your SEO as Google Analytics is not used for search rankings.
So, what can you do to stop ghost visits and crawlers making your Google Analytics data unreliable and undermining your performance monitoring? Well, below I outline how you can use filters and segments to stop spammers in their tracks.
However, before we set up any filters or segments it is essential that you have at least three views set for your Google Analytics account. When you set up filters in GA you will permanently block traffic from your account and so it’s important to use a “Test” view to initially check your new filters are working correctly before implementing on your main view (e.g. “All traffic with filters”).
You should also have an “Unfiltered” view so that you have sight of all traffic, whether internal or fake, so that you can monitor the total impact of all the filters you use.
Your “test view” should be identical to your “All traffic with filters” view apart from when you are testing a new change to the views, such as a filter or some other setting. To create a test view go to your main view (e.g. All traffic with filters), click on “Admin” and then click on “View Settings” in the far right column. Within “View Settings” click “Copy View” and then give the new view a name and click “Copy view” to complete the process.
Before creating any new filters or segments to prevent spam from hitting your Google Analytics account make sure you use GA’s own bot filtering setting. Go to the “Admin” area and in the third column from the left you will see “View Settings”. Simply click on the checkbox for “Bot Filtering – Exclude all hits from known bots and spiders”. This will remove up to 80% of bots and spiders and it’s updated regularly as Google becomes aware of new bots.
Dealing with Hostname Ghost Traffic:
If like many sites you have traffic coming from a hostname that you don’t recognise you may have a ghost visitor problem. To check if this is the case go to “Audience” – “Technology” – “Network” and click on “Hostname” as shown below.
This can be caused by one of the following issues:
A spammer maliciously using your property ID to send fake traffic data to your Google Analytics account.
A test server sending data to the same Google Analytics property.
To prevent such data inflating your traffic numbers you will need to set up two filters. Firstly a filter to set the value of the hostname to a custom variable and secondly an include filter for your real hostname to block incorrect hostnames.
Hostname Identifier Field:
Go to Google Analytics “Admin” and in the far left column you will see “Filters”.
Click on “Create new” and give it a name such as “Hostname ID Field” or something that tells all users what it is.
Select the filter type “Custom”.
Select the option “Advanced”
Field A – Select “Hostname” and enter “(.*)” without the speech marks.
Field B – Leave blank
Field “Output To – Constructor” select “Custom Field 1” and enter “$A1” as the value.
“Field A Required” and “Override Output” should both be checked and the other two boxes should be left unchecked.
Include Valid Hostname Filter:
Set up another new filter and name it “Include Valid Hostname” or something similar.
Select filter type “Custom”.
Select filter option “Include”.
In “Filter field” drop down menu select “Custom Field 1”
In “Filter Pattern” use the regex of the hostname or hostnames used in your GA profile. To escape any metacharacters you will need to place a backslash “\” before a full stop “.” or a hyphen “-”. For example our website address would look like this “(www\.)?conversion\-uplift\.co\.uk”. Use a vertical bar “|” to separate each individual hostname that you want to include. By placing brackets around “www\.” and a question mark after the closing bracket, GA will accept our address with or without the “www” prefix. Our full expression is:
Now you can save the filter. Over the next five to seven days compare your hostname data from your test view with your normal filtered view to check that the filters are working as expected.
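Before relying on the two filters it can help to try the patterns against a handful of hostnames. The snippet below is a minimal Python sketch of the logic, not GA’s own implementation. It assumes the filter pattern may match anywhere in the field (hence `re.search`), writes the optional prefix as `(www\.)?`, and adds a second hostname, `shop.example.com`, which is purely hypothetical and only there to show the vertical bar. The function name `apply_hostname_filters` is also made up for illustration.

```python
import re
from typing import Optional

# Filter 1: "(.*)" in Field A captures the whole hostname.
HOSTNAME_CAPTURE = re.compile(r"(.*)")

# Filter 2: include pattern. shop.example.com is a hypothetical second
# hostname, added only to show how "|" separates alternatives.
VALID_HOSTNAMES = re.compile(r"(www\.)?conversion\-uplift\.co\.uk|shop\.example\.com")


def apply_hostname_filters(hit: dict) -> Optional[dict]:
    """Return the hit with custom_field_1 set, or None if the hit is dropped."""
    match = HOSTNAME_CAPTURE.search(hit.get("hostname", ""))
    # "$A1" in the constructor stands for the first capture group of Field A.
    custom_field_1 = match.group(1) if match else ""
    if VALID_HOSTNAMES.search(custom_field_1) is None:
        return None  # an include filter discards anything that does not match
    return {**hit, "custom_field_1": custom_field_1}


for hostname in ["www.conversion-uplift.co.uk", "conversion-uplift.co.uk", "(not set)"]:
    kept = apply_hostname_filters({"hostname": hostname}) is not None
    print(hostname, "kept" if kept else "dropped")
```

Any hostname you expect to see in your reports but which comes out as dropped here is a sign that the pattern needs another alternative before you apply the filter to your main view.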
Language spam often appears in your language report as messages that spammers send to get your attention. Once Google Analytics records language spam it can’t be permanently removed from your reports and so it requires a two pronged approach to prevent language spam from inflating your traffic numbers.
Firstly you can block language spam coming into your reports by setting up a filter. This is a permanent change though and so it should be tried out initially on your test view. Secondly you can apply an advanced segment to your reports to remove language spam from your historical data.
If you have just a few websites you can use the manual method outlined below. If you manage many different sites though you may want to consider an automated solution such as this anti-spam filter tool. Such tools can block referrer spam, language spam, events spam, etc, from hundreds or even thousands of Google Analytics views.
Block Language Spam Coming Into Your Reports:
To create any new filters in Google Analytics you will need “Edit” access at the account level.
This is a simple filter that will block any visitors where the language dimension contains 12 or more characters. Most legitimate language settings will contain 5 or 6 characters and sometimes 8 or 9 characters. This means that it should only block language spam.
There are also symbols which are not valid for the language dimension, but which are used to create a domain name. The filter will exclude such symbols as well. The expression that we use looks like this:
To create the filter go to your Admin area and then select “Filters” in the third column from the left and click on “Add Filter”. Give the filter a suitable name, select filter type as “Custom” and “Exclude”. You will then need to select “Language Settings” from a drop down menu and paste the filter expression into the “Filter Pattern” input box.
You can then use the “Verify Filter” option to see how it would have affected your data for the last few days. Sometimes GA can’t verify the filter because the number of cases is too low to register a significant change. However, if this does occur you should still use your test view to see if the filter does prevent language spam, even if the numbers are relatively low.
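The two rules described above (12 or more characters, or symbols that belong in a domain name instead of a language code) can be tried out offline before touching the test view. The pattern in this Python sketch is an assumption built from those two rules and is not the expression used in this article. The choice of `.` and `/` as the “domain name” symbols and the name `is_language_spam` are illustrative.

```python
import re

# Assumed pattern: 12 or more characters, or a "." or "/" anywhere in the value.
LANGUAGE_SPAM = re.compile(r".{12,}|[./]")


def is_language_spam(language: str) -> bool:
    """True if the language value would be caught by the exclude pattern."""
    return LANGUAGE_SPAM.search(language) is not None


# Hypothetical values of the kind that show up in the language report.
for value in ["en-gb", "zh-hans-cn", "x.example", "this language setting is a message"]:
    print(repr(value), "spam" if is_language_spam(value) else "ok")
```

Running your own list of language values through a check like this is a quick way to confirm that no legitimate setting is long enough to be excluded.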
Exclude Historical Language Spam Data:
Filters are unable to block out visits that have already hit Google Analytics and so to clean up your historical data you will need to create a custom segment. Click on “+ Add Segment”. Go to the “Language” dimension, select “does not match regex” and paste the expression into the adjacent input box. You should then save the segment and use it to remove language spam from your reports.
Another common type of spam can often be seen by looking at the referrer report in Google Analytics. You can find this report by selecting “Acquisition” and “Referrals”. Now sort the table by descending bounce rate so that you bring up all the referrers with a 100% bounce rate to the top of the page.
You can then use the “Advanced Filter” to only show those referrers with a minimum threshold of sessions. We used 10 here, but you may want to use a much higher threshold depending upon how much traffic your site attracts. You can now browse through the table and decide which sites you want to add to a referral exclusion list.
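If you export the referrals report, the same shortlist can be produced outside the GA interface. The following is a minimal Python sketch under stated assumptions: the `ReferrerRow` fields, the function name and the sample rows are all hypothetical, and the threshold of 10 sessions simply mirrors the figure used above.

```python
from typing import List, NamedTuple


class ReferrerRow(NamedTuple):
    source: str
    sessions: int
    bounce_rate: float  # 0.0 to 1.0


def shortlist_referrers(rows: List[ReferrerRow], min_sessions: int = 10) -> List[ReferrerRow]:
    """Keep referrers at or above the session threshold, highest bounce rate first."""
    kept = [row for row in rows if row.sessions >= min_sessions]
    # Ties on bounce rate are broken by session count, largest first.
    return sorted(kept, key=lambda row: (-row.bounce_rate, -row.sessions))


# Hypothetical export of the referrals report.
rows = [
    ReferrerRow("spam-a.example", 40, 1.0),
    ReferrerRow("blog.example", 120, 0.35),
    ReferrerRow("spam-b.example", 3, 1.0),
    ReferrerRow("partner.example", 15, 0.6),
]

for row in shortlist_referrers(rows):
    print(f"{row.source}: {row.sessions} sessions, {row.bounce_rate:.0%} bounce rate")
```

The output is only a list of candidates to review. A high bounce rate alone does not prove a referrer is spam, so each one still needs checking by hand.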
Check out any suspicious referrers using Google to ensure they are not genuine blog sites, affiliates etc that are sending quality traffic to your site. You can then build a potential referral exclusion list and create a new filter for spam referrers. Go to “Admin” and in the third column from the left select “Filter”. Don’t select “Filter” in the first column as this would set up an account wide filter.
Now select “Add Filter” and enter a suitable name such as “Bad referrers” and select “Custom” and “Exclude”. Now select “Campaign Source” from the drop down for your “Filter Field” and enter the domains you want to exclude in the box. Here is an example of the type of expression you need to enter: atatech\.org|captcha\.gecirtnotification\.com
If you are happy with the filter set up you can now click on the “Save” button. This filter will permanently change your GA data and so check how it affects your data by first creating it in your test view.
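Typing the escaped expression by hand is error prone once the exclusion list grows. As a rough sketch, the Python helper below (`build_exclusion_pattern` is a made-up name) builds the expression from a plain list of domains, using the standard library’s `re.escape` to add the backslashes.

```python
import re
from typing import Iterable


def build_exclusion_pattern(domains: Iterable[str]) -> str:
    """Join domains into one expression, escaping "." so it matches literally."""
    return "|".join(re.escape(domain.strip().lower()) for domain in domains)


pattern = build_exclusion_pattern(["atatech.org", "captcha.gecirtnotification.com"])
print(pattern)

# Quick check that a listed domain is caught and a look-alike is not.
print(bool(re.search(pattern, "atatech.org")))
print(bool(re.search(pattern, "atatechxorg")))
```

The printed expression can be pasted into the “Filter Pattern” box of the “Bad referrers” filter in your test view.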
Ghost spam and crawlers if left unchecked can undermine the reliability of your web analytics. Make sure you check the box to allow GA to block known bots. However, this will never protect you from all bots and crawlers. This means you will also need to create appropriate filters to stop spam hitting your web analytics and use segments to deal with historical spam.
It’s also important that you check the referrals report on a regular basis to see if any new suspicious sites are sending low quality traffic to your website. If you follow the procedures outlined above your Google Analytics views are likely to be largely free of spam and you will be able to use GA confident that your data is not overly inflated by ghosts or crawlers.
If you want help with configuring Google Analytics and analysing your data why not contact a conversion rate optimisation consultant who can complete the process for you and make sure you are getting value from web analytics?
Web analytics tools allow you to track exactly where visitors go on your site, how long they spend on each page and how they interact with your site or app. This allows you to understand more about your potential customers and to measure, analyse and report on your traffic. Web analytics tools answer four key questions:
Who visits your website – in terms of number of visitors and their characteristics?
Where do your visitors come from – the source of traffic?
What do visitors do when they get to your site – which pages do they visit?
Where do they go afterwards – if you have links to other sites (e.g. you are an affiliate)?
This is useful to know so that you can begin to measure the effectiveness of your marketing campaigns and the performance of your website. Unless you measure something you won’t know if you are getting better or not and what changes to make to improve performance and revenues. This means using web analytics tools to set benchmarks and start monitoring changes over time.
Web analytics tools allow you to measure:
How many visitors land on your site every day?
Your audience and their demographic profile – the gender mix, their age, what are their interests?
What geographic location do your visitors come from, such as the city or country?
What proportion of your visitors are new or returning visitors?
Audience behaviour – engagement levels and frequency of returning to your site?
What browsers are they using? Important to know so that you ensure you support old browsers if lots of your visitors are still using them.
Technology – what devices are your visitors using and their screen resolution? Again very useful because you want to optimize your site according to the devices & screen sizes of your visitors.
Landing and exit pages – what are they?
Which is your most popular content – which pages do they visit most? Critical for prioritising effort and A/B testing.
Which channels drive most visitors to your site – organic, direct, referral, paid, social?
Which campaigns generate most visitors to your site?
Referrals – Which domains are generating most visitors for your site?
Keywords used by visitors to find your site.
You can compare website traffic against your key competitors on metrics such as bounce rate, time on site and source of traffic by using a website audience comparison tool. These tools use information collected by ISPs, panels and other sources to track competitor website traffic and demographics.
Using Web Analytics Tools To Set Goals:
This is all interesting information, but what really matters is whether you are achieving your business goals. Web analytics tools allow you to set up your organisational goals to measure performance over time and identify reasons why you may not be achieving them.
One of the main tasks of conversion rate optimisation is to align each individual webpage with the relevant business objective. So for instance if you have an e-commerce site you will want to set up goals that lead towards a sale, such as view a product page, add to basket, enter checkout and finally complete a sale.
For a blog you will be more interested in engagement metrics, such as time spent on site and number of pages viewed. Once you have defined your key metrics you can set up automated reports to monitor your conversion rates and begin to investigate any changes that occur.
Next you need to better understand your visitor behaviour to identify user journeys and whether you can improve goal achievement through making changes to your site. You should monitor bounce rates and page load speed times to ensure any changes you make to your website don’t put visitors off browsing your site.
One of the most useful benefits of web analytics is the ability to look at the visitors’ path to purchase so that you can identify the drop-off rates at each step in the journey. You will be able to see if any particular stage is more problematic than the others so that you can consider what changes might help reduce this leak in the conversion funnel.
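The drop-off rate at each step is simply the share of visitors who reach one step but not the next. The Python sketch below works this through for the e-commerce steps mentioned earlier. The visitor counts are hypothetical and the function name `drop_off_rates` is illustrative.

```python
from typing import List, Tuple


def drop_off_rates(funnel: List[Tuple[str, int]]) -> List[Tuple[str, str, float]]:
    """Return (from_step, to_step, share of visitors lost) for each pair of steps."""
    rates = []
    for (step, visitors), (next_step, next_visitors) in zip(funnel, funnel[1:]):
        # Guard against an empty step so we never divide by zero.
        lost = (visitors - next_visitors) / visitors if visitors else 0.0
        rates.append((step, next_step, lost))
    return rates


# Hypothetical visitor counts for each step in the path to purchase.
funnel = [
    ("view product page", 1000),
    ("add to basket", 400),
    ("enter checkout", 200),
    ("complete sale", 150),
]

for step, next_step, lost in drop_off_rates(funnel):
    print(f"{step} -> {next_step}: {lost:.0%} drop-off")
```

With these made-up numbers the biggest leak is between viewing a product page and adding to basket, so that is the step you would investigate first.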
You should then begin to investigate whether your conversion rate at each step in the funnel varies across some of the metrics we have just listed. This might highlight that your website is not that user friendly for visitors on small screens or that your site doesn’t render properly in certain browsers. You can then use one of many cross-browser testing tools to view what might be causing the problem.
If your overall conversion rate is significantly lower in Germany than the UK and there is no obvious reason why this is the case you might want to review your copy as German tends to use more characters than English and direct translations can sometimes fail to allow for local cultural differences. A/B tests have shown that cultural differences can influence how visitors respond to a user interface and so ideally web optimisation needs to allow for cultural preferences in design and behaviour.
Source of Traffic:
Web analytics tools can tell you where your traffic is coming from and which channels are converting better than others. If you are paying for traffic this helps you to understand if you are getting a reasonable return on investment. Again, investigate why you see differences in your conversion rate to try and understand if it relates to your website or the nature of the traffic for each channel.
Find Broken Stuff With Web Analytics Tools:
If you see a sudden drop-off in conversion or decline in traffic from a reliable source this may indicate something is broken on your or a referrer’s site. Use your analytics to flag up when and where on your site there may be a problem with your site so that you can prevent it going unnoticed and costing your organisation significant sums in lost business.
With most subscription web analytics you can set up automated reports that will be emailed to you on a daily basis to help you monitor your key metrics. This will save you having to login every day and allow you to monitor site performance even when you are out of the office.
Web Analytics Tools – Recommendation:
I’ve used all the most popular web analytics tools on the market, from IBM Coremetrics and Adobe Analytics to Google Analytics. The clear winner for me is the free version of Google Analytics because it’s by far the most intuitive solution, it’s fast with very little delay in reporting and it integrates so easily with other tools. The support in terms of documentation is second to none and there is a wealth of advice on the web as so many professional optimisers use it.
It is difficult to beat Google Analytics if you are on a limited budget. If you want a more sophisticated product then Google 360 is worth considering as this has all the benefits of the free version with the advantages of a paid solution.
18 Website Analytics Tools Compared:
Below you will find the 18 most popular web analytics tools, some of which are free, and so there is no excuse not to start measuring your visitors and their behaviour.
Previously Omniture/SiteCatalyst. Adobe Marketing Cloud is one of the most popular web analytics tools on the market. An enterprise solution for e-commerce sites that you can fully integrate with Adobe’s Test & Target A/B, multivariate testing and personalisation platform.
A comprehensive suite of features, including mobile, ad-hoc analysis, and the ability for real-time and rule-based decision-making tools to target key customer segments.
This positions itself as a behavioural analytics solution as its focus is on tracking events rather than simply visitors. Amplitude offers real-time monitoring of user behaviour and unlimited individual user timelines. Pathfinder, their user flow analysis, allows you to better understand how visitors navigate through your site or app by visualising the aggregate paths that they take.
The behavioural cohorting feature allows you to define a group of users based upon the actions they have or have not taken on your site. You can then apply cohorts throughout your analysis to understand how different behaviours impact specific KPIs such as retention and revenues. The Microscope feature allows you to click on any point in a chart and create a cohort of everyone who did or did not take a certain action to investigate what is driving their behaviour.
Amplitude offers a free plan for sites or apps with up to 10 million monthly events. For organisations with up to 100 million monthly events the Business Plan costs just $995 per month.
This offers a suite of analytical and testing tools for tracking and optimising editorial content and advertising spend. It focuses on helping organisations understand what content captures and holds audience attention and monetize inventory on the page.
Real-time web analytics tool with an extensive range of features including data at an individual visitor level, on-site analytics, heatmaps, up-time monitoring, a flexible API, Twitter analytics, Google search rankings, video analytics and big screen mobile mode. Free for single websites.
An enterprise level web personalisation and analytics platform which is popular with e-commerce websites. Used by many of Germany’s top 100 online retailers. This solution combines high-end technology with an intuitive user interface.
Econda’s Cross Sell combines a recommendation engine with an online sales tool and re-marketing suite. Product recommendations are context-sensitive and all entry pages can be tailored for your visitors.
Formisimo is a state of the art form analytics platform to identify how users interact with your forms and checkout fields. The tool tracks every input field so that you can identify which fields users don’t complete, plus when they do and don’t use auto-complete.
Your form or checkout is unlikely to work as well on all browsers, devices or certain languages. Advanced filters allow you to view all reports for a segment of your visitors to identify and remedy cross-browser or other performance issues. A highly recommended tool by many conversion experts.
Gauges is positioned as a low-cost real-time web analytics tool for small to medium sized organisations. It was designed to be a website analytics API and the dashboard that you see on the Gauges front-end is a web client that consumes the API. 7 day Free trial available.
The default option for web analytics tools for many organisations. It’s free and being the most popular web analytics tool there is a constant stream of posts on how to get the best out of Google Analytics. What I love about Google Analytics is that the user interface is intuitive and because it’s from Google it integrates really easily with other Google solutions such as the SEO tool Google Search Console, their A/B testing tool Google Optimize and AdSense.
The tool is also generally fast to generate reports, no waiting around for data to be processed or sent to you via email. It’s a great tool to begin getting into web analytics.
The free version of Google Analytics offers most features that smaller organisations need and implementation is simple and quick. It uses sampled data when you have over one million unique dimension combinations in standard reports, or more than 500,000 for special queries, such as in custom reports.
The new enterprise suite of six applications which aims to directly challenge Adobe’s Marketing Cloud. It combines Google Analytics Premium (now called Google Analytics 360) and Attribution 360 (previously Adometry) which it acquired in 2014. You will also get access to an enterprise version of Google Tag Manager.
It gives you access to Audience Center 360, a data management platform that integrates with Google’s own tools (including DoubleClick) and will take data from third-party tools. Of particular interest is the addition of Data Studio 360 which offers advanced data visualization and analysis solutions. This is powered by BigQuery – Google’s data analytics platform. This provides a native report building option for Google Analytics 360 with all the features of Google Docs (e.g sharing & multi-user editing).
Finally, Optimize 360 is a brand new A/B testing and personalisation tool which includes a visual editor interface to bring it in line with other leading A/B testing solutions. Optimize is the free version which allows you to run up to 3 A/B tests at any one time.
A real-time web analytics tool that gives you insights and access down to the individual user level. With a modern and intuitive user interface, GoSquared offers business and enterprise solutions, together with a Free version for the small entrepreneur.
A unique real-time web analytics tool that doesn’t require any code to be implemented to setup event tracking. The Event Visualizer allows you to define analytics events by performing the action yourself and so anyone in your organisation can set up a conversion funnel or retention report in seconds. You can also search for an individual user to see every action they’ve done or find users based upon a specific behaviour.
Heap Analytics offers an integrated graphics solution to plot changes in key metrics over time. This allows you to adjust the range, granularity and visualisation as you require. The solution integrates easily with most popular data analysis tools and you can run your own SQL queries or export data to tools such as Tableau.
A free plan is available for up to 5,000 sessions a month or up to 50,000 sessions per month if you add their badge to your website. A 14 day free trial is available for the Custom Plan.
An enterprise web analytics tool incorporating near real-time web analytics, data monitoring and comparative benchmarking. The click-stream reports are very powerful and allow you to see how visitors navigate around your site.
You can expand the IBM Digital Analytics solution to include multiple sites, offline customer behaviour, ad relevancy, impression attribution and social media channels.
A highly recommended and powerful web analytics tool and digital marketing optimisation platform. The solution has three key advantages over traditional web analytics tools. It allows for flexible custom data with an easy to use API, it focuses on individual users and it tracks user behaviour on a multi-session basis.
By tracking user behaviour on a multi-session basis, and by aliasing anonymous cookie data with identifying information (e.g. email address), Kissmetrics doubles as a customer database. You can collect detailed purchase information and analyse how it correlates with your behavioural analytics data. This provides a comprehensive view of an individual’s interactions with your site over time. This makes it one of the most distinctive web analytics tools on the market.
The product is highly thought of for identifying holes in the conversion funnel. It allows you to build ad-hoc queries to drill down on very specific segments.
A comprehensive real-time web and conversion analytics tool. The free tool will show you how your Google Analytics stats match up with the industry standard. It offers a range of plans, including an enterprise solution. 30-day free trial available and a 60-day money back guarantee.
This is technically one of the most advanced web analytics tools on the market. It is superior to Google Analytics for behavioural tracking and is great for content-focused websites. It is, however, less relevant for e-commerce sites.
You can create easy funnels on Mixpanel and visualise them in the user interface. It also allows you to segment users based upon source of traffic or other characteristics (e.g. city) and how they interact with your site. The Explore feature enables you to create profiles for individual users which can be very useful for assisting Customer Services in supporting existing users.
Due to the complexity of the solution it requires a dedicated analyst who can manage it on a daily basis to fully understand the tool and ensure it is set up correctly to measure all your key metrics. To fully integrate API tracking within the solution also needs a fair amount of technical knowledge. It also requires frequent integration with your website if it has to measure specific events or you regularly update or release new features.
A self-hosted, open-source Free web analytics platform. Matomo is a comprehensive web analytics tool but unlike many packages, there is no limit to the amount of data you can store for free. It also has a mobile app. Because it is held on your own server you own the data and can integrate easily with your own internal systems.
A fully integrated and powerful enterprise web analytics tool that includes analytics, segmentation, testing, targeting and re-marketing. An excellent tool for tracking user segments, purchase funnels, scenarios, drop-off and bounce rates. The solution integrates with 3rd party data including app stores, Twitter, Facebook and YouTube. Webtrends analytics offers:
Unlimited data collection
Multi-channel measurement across social, mobile, web and SharePoint
Real-time tracking of customer activity across multiple channels including web, apps and emails. It provides a comprehensive profile for every user, customisable segmentation, funnels, retention and automated driven actions. A Free version is available.
Web analytics tools are critical to get visibility of what content your visitors are engaging with and to better understand visitor behaviour when they land on your site. For start-ups get yourself Google Analytics as this is a free and very comprehensive solution that will meet most needs. Other solutions often provide free trial periods and so if you are looking for more advanced web analytics tools there are plenty to choose from without having to commit to a major investment.