Avoid Content Duplication Caused by Google Analytics

Link tagging is a great way to track the effectiveness of specific online campaigns. Google Analytics implements link tagging by appending query parameters to URLs. There is nothing wrong up to that point. The problem arises when you combine link tagging with search engine optimization. Search engines follow links and assign link popularity to pages according to many criteria, including the arrival URL (destination URL). When you tag your links with UTM parameters, each tagged link is a different URL that serves the same content, so you automatically create a duplicate of the untagged URL unless you take the necessary precautions.
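To make the duplication concrete, here is a minimal Python sketch. The example.com addresses and the strip_utm helper are hypothetical and only serve as illustration. It shows that a tagged link and its untagged original are different URLs as plain strings, and that removing the utm_ parameters brings them back to the same address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_utm(url: str) -> str:
    """Return the URL with every utm_* query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs, used only for illustration.
tagged = "https://example.com/page?utm_source=newsletter&utm_medium=email&id=7"
untagged = "https://example.com/page?id=7"

print(tagged == untagged)             # the two strings differ
print(strip_utm(tagged) == untagged)  # same page once the tags are removed
```

A crawler that compares URLs literally sees two separate pages here, which is why the tagged link can end up competing with the original.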

My first thought was that Google, since it controls both the ranking bots and Google Analytics, would keep UTM parameters in URLs from being indexed. That is not the case. If you search Google for “allinurl:utm_id” you will quickly see that UTM duplication does occur.

So how do you stop the duplication and present a single, unique copy of your content to Google? You need to tell the spiders not to pick up the pages with UTM variables.

There are two common ways of blocking content from spiders. The first is the robots meta tag, which in this case would keep the entire page out of the index, because the tag sits in the page itself and is served for the tagged and untagged URLs alike. The second method is the robots.txt file.

The following directive in the robots.txt file avoids duplicating your content through link tagging. It relies on the * wildcard, which Google's crawler supports but which some other crawlers may ignore:
User-agent: *
Disallow: /*utm*

And that’s it. Spiders might still list the bare URL if they find links to it, but they will not crawl its content, so the duplicate copy never gets indexed.
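Before deploying the rule it can help to check which paths it actually covers. The sketch below is a simplified matcher, not a crawler's real implementation. It assumes the common wildcard conventions (* matches any run of characters, a trailing $ anchors the end of the path), and the rule_to_regex and is_blocked helpers and the sample paths are hypothetical.

```python
import re


def rule_to_regex(rule: str):
    """Translate a robots.txt path rule using * and $ into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rule: str) -> bool:
    """True if the path (including its query string) matches the rule."""
    # re.match anchors at the start, the way rules apply from the path start.
    return rule_to_regex(disallow_rule).match(path) is not None


RULE = "/*utm*"
# Hypothetical paths, used only for illustration.
for path in ["/page?utm_source=newsletter", "/page", "/blog/utmost-care"]:
    print(path, is_blocked(path, RULE))
```

Under these assumptions the rule blocks the tagged URL and leaves /page alone, but it also catches any path that merely contains the letters "utm", such as /blog/utmost-care. If your site has such paths, a narrower pattern like /*utm_ may be worth testing the same way.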

Now that you have the knowledge, you have the power. Happy link tagging for Google Analytics!

