Handling the Canonical Tag properly is often difficult for webmasters. This guide covers common mistakes and benefits in the use of Canonical Tags.
Most SEOs should be familiar with the term Canonical Tag. Strictly speaking it is a link element with the attribute rel=canonical, not a meta tag. It was introduced jointly by Google, Microsoft and Yahoo in 2009 and lets webmasters indicate the canonical content or canonical URL when duplicates are technically unavoidable.
When several documents are similar or identical, the rel=canonical link hints at the preferred version for search engines.
A canonical URL can be indicated in two ways:
HTML Header
HTTP Header
The canonical is most often implemented in the HTML header, because this is easy and takes little effort. Performance and the ability to tag binary data (e.g. PDFs) with a canonical speak in favour of the HTTP header. Learn more about the implementation in our OnPageWiki.
Settle for one of the two options. Implementing both makes no sense and can cause further problems with maintenance and configuration.
If it is impossible to implement a canonical, a well-kept sitemap is recommended, meaning one that lists only indexable content and URLs that return 200 OK. This also helps search engines to select and distinguish between different URLs with similar content. The sitemap only serves as a small additional signal for search engines and does not replace the Canonical Tag. Important: don't forget to submit the sitemap in the Webmaster Tools!
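As a minimal sketch of the two options, the following Python helpers build the link element for the HTML header and the equivalent HTTP response header. The function names and the https:// scheme on the example shop URL are assumptions made for illustration.

```python
from html import escape


def canonical_link_element(url: str) -> str:
    """Build the <link> element that goes into the <head> of an HTML page."""
    # escape() protects characters such as & and " inside the attribute value.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def canonical_link_header(url: str) -> tuple[str, str]:
    """Build the (name, value) pair of the equivalent HTTP response header."""
    return ("Link", f'<{url}>; rel="canonical"')


# Hypothetical example URL; the scheme is an assumption.
url = "https://exampleshop.com/apple/iphone-6-64gb-grey"
print(canonical_link_element(url))
print(canonical_link_header(url))
```

Whichever variant is chosen, a page should emit only one of the two, in line with the advice above.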
But back to our actual topic: Canonical Do’s and Don’ts. Let’s begin with the Do’s.
Every URL should carry a self-referential canonical, i.e. a Canonical Tag that points to the URL's own standard version. Google does not officially demand this from webmasters, but it helps to prevent problems from arising in the first place.
Example: A campaign URL that is linked from many places refers crawlers to its canonical version before they even consider indexing the campaign variant.
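The campaign example can be sketched as follows, assuming that campaign URLs differ from the standard URL only by tracking parameters. The `utm_` prefix and the function name are assumptions; a real site would list its own parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: campaign parameters are recognizable by these prefixes.
TRACKING_PREFIXES = ("utm_",)


def self_canonical(url: str) -> str:
    """Return the URL without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


campaign_url = (
    "https://exampleshop.com/apple/iphone-6-64gb-grey"
    "?utm_source=newsletter&utm_campaign=spring"
)
print(self_canonical(campaign_url))
```

The returned value is what the page would place in its Canonical Tag, so the standard URL references itself and every campaign variant references the standard URL.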
Products in online shops are usually attributed to various categories. This often simplifies the creation of category landing pages, in order to provide searchers with different entry points to the entire scope of products.
Let’s assume we have an “iPhone 6 64 GB grey”.
It can be attributed to the following categories:
Apple [Brand]
Smartphone [category]
LTE Smartphone [sub-category]
Several well-known shop systems already work with one main category, so that internal links to the product page always use one designated URL path. In our case, this could be:
exampleshop.com/apple/iphone-6-64gb-grey
The general URL pattern would then be:
exampleshop.com/[brand]/[product]
Frequently, /[brand]/ can be replaced with /[category]/ or /[sub-category]/ by simple URL manipulation, which makes a new URL callable: a 1:1 duplicate.
A canonical pointing to exampleshop.com/[brand]/[product] helps you to canonicalize such URLs. From a technical point of view, however, the cleaner solution is a setup in which these duplicate URLs cannot be called at all: then no canonicalization is required, which saves precious crawling resources.
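A minimal sketch of this canonicalization, assuming the shop can look up the main (brand) path segment for every product slug. The lookup table and function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical lookup: product slug -> main path segment (here the brand).
MAIN_SEGMENT = {"iphone-6-64gb-grey": "apple"}


def product_canonical(url: str) -> str:
    """Map any category variant of a product URL to /[brand]/[product]."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    origin = f"{parts.scheme}://{parts.netloc}"
    if not segments or segments[-1] not in MAIN_SEGMENT:
        # Unknown product: fall back to a self-referential canonical.
        return origin + parts.path
    slug = segments[-1]
    return f"{origin}/{MAIN_SEGMENT[slug]}/{slug}"


for variant in (
    "https://exampleshop.com/smartphone/iphone-6-64gb-grey",
    "https://exampleshop.com/lte-smartphone/iphone-6-64gb-grey",
):
    print(variant, "->", product_canonical(variant))
```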
Especially in offline-driven industries, texts written for offline use are often also published on HTML pages. On top of that, the offline marketing PDF is put online as well, and there we go: duplicate content.
However, a webmaster very rarely wants to rank with a PDF document: users have no navigation, tracking doesn't work, there is no CTA and so on. That's why it is important to transfer the incoming link power of the PDF document to the HTML document.
Sending the canonical in the HTTP header when the PDF document is delivered tells search engines that the HTML document should be indexed as the canonical version. If the hint is followed, this also keeps the PDF document out of the index.
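As a rough sketch of the PDF case, the following WSGI application attaches the canonical to the PDF response as an HTTP header. The PDF path, the HTML target URL and the placeholder body are hypothetical; a real server would read the file from disk or set the header in the web server configuration.

```python
# Hypothetical mapping: PDF path -> HTML document that should be indexed.
PDF_CANONICALS = {
    "/downloads/brochure.pdf": "https://exampleshop.com/brochure",
}


def pdf_app(environ, start_response):
    """Minimal WSGI app that serves PDFs with a canonical Link header."""
    path = environ.get("PATH_INFO", "")
    canonical = PDF_CANONICALS.get(path)
    if canonical is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    headers = [
        ("Content-Type", "application/pdf"),
        # The canonical travels in the response header, not in the PDF itself.
        ("Link", f'<{canonical}>; rel="canonical"'),
    ]
    start_response("200 OK", headers)
    return [b"%PDF-placeholder"]  # placeholder instead of real file content
```

The app can be tried locally with `wsgiref.simple_server.make_server("", 8000, pdf_app)` from the standard library.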
To avoid major ranking losses after a URL migration, proper redirect management is required. Every URL should be migrated to its respective new URL via 301 redirect.
The Canonical Tags are often forgotten at this point, which gives away a lot of potential.
As part of a URL migration, all Canonical Tags should be examined and their merging into the new URL system should be planned. Consequently, every URL that referred to another HTML document via canonical before the migration should redirect via 301 to the new, merged content.
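The planning step can be sketched as a small function that combines the old canonical relations with the migration plan. Both dictionaries and all paths below are hypothetical inputs, for example taken from a crawl export.

```python
def build_redirects(
    migration: dict[str, str], old_canonicals: dict[str, str]
) -> dict[str, str]:
    """Return a map old URL -> new URL for the 301 redirects.

    migration:      old URL -> planned new URL
    old_canonicals: old URL -> old URL it referenced via Canonical Tag
    """
    redirects = {}
    for old_url in sorted(set(migration) | set(old_canonicals)):
        # Follow the old canonical first, then look up the new address.
        canonical_old = old_canonicals.get(old_url, old_url)
        new_url = migration.get(canonical_old) or migration.get(old_url)
        if new_url:
            redirects[old_url] = new_url
    return redirects


migration = {"/apple/iphone-6-64gb-grey": "/smartphones/apple-iphone-6-64gb-grey"}
old_canonicals = {"/smartphone/iphone-6-64gb-grey": "/apple/iphone-6-64gb-grey"}
print(build_redirects(migration, old_canonicals))
```

In this sketch the former duplicate redirects straight to the new, merged URL instead of hopping through the old canonical target.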
www or non-www, trailing slash or no trailing slash: this matter has been discussed over and over again. Above all, the important thing is a consistent standard!
A consistent standard is just as important when placing canonicals. A "self" canonical on the standard URL should never point to an alternative version: if that version in turn redirects or points back to the standard URL, the signals form a loop.
Search engines admittedly often forgive such errors, but the errors can also cause all canonical hints to be ignored entirely. This is why you should always look out for them when checking crawler data.
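A check of this kind could look like the sketch below when run against crawler data. The chosen standard (non-www host, no trailing slash) and all names are assumptions; the point is only that every canonical should already be in standard form.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed site standard: non-www host, no trailing slash.
PREFERRED_HOST = "exampleshop.com"
TRAILING_SLASH = False


def to_standard(url: str) -> str:
    """Rewrite a URL of this site into the assumed standard form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "www." + PREFERRED_HOST:
        host = PREFERRED_HOST
    path = parts.path or "/"
    if path != "/":
        path = path.rstrip("/") + ("/" if TRAILING_SLASH else "")
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def canonical_is_consistent(canonical: str) -> bool:
    """A canonical is consistent if it is already in standard form."""
    return canonical == to_standard(canonical)


print(canonical_is_consistent("https://exampleshop.com/apple"))
print(canonical_is_consistent("https://www.exampleshop.com/apple/"))
```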
Especially in shop systems, paginated URLs differ only slightly. This is why webmasters often assume that a Canonical Tag pointing to the first page belongs here, in order to provide search engines with just one URL. But this is a huge mistake!
Canonicals on paginated pages should only point to another URL if a "view all" page exists, which is then the target of the canonicals.
In all other cases, the Canonical Tag overrides the internal linking structure. Products on page 3, for example, would no longer receive any link power, because it would be credited to the first page through the Canonical Tag. Ranking with product URLs becomes significantly harder if the products only count when they rotate onto the first paginated page, as they then show an entirely different internal linking (recognizable by the OPR).
To make pagination easier for search engines to understand and to increase the relevance of the first page, the link attributes rel="next" and rel="prev" have been introduced.
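The pagination advice above can be sketched as a helper that gives every paginated page a canonical to itself plus rel="prev" and rel="next" links. The `?page=` parameter and the function name are assumptions about how the shop builds its paging URLs.

```python
def pagination_links(base_url: str, page: int, last_page: int) -> list[str]:
    """Return the <link> elements for one page of a paginated series."""

    def page_url(number: int) -> str:
        # Assumption: page 1 is the plain category URL, later pages use ?page=N.
        return base_url if number == 1 else f"{base_url}?page={number}"

    # Each page references itself, never the first page.
    links = [f'<link rel="canonical" href="{page_url(page)}">']
    if page > 1:
        links.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < last_page:
        links.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return links


for element in pagination_links("https://exampleshop.com/smartphone", 3, 5):
    print(element)
```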
In the August 2012 Webmaster Central office hours Hangout session, John Mueller advised against combining the Canonical Tag with a noindex. In some cases, the noindex information can be passed on to the canonical URL as well.
“You mentioned the noindex… Generally speaking, I would avoid the situation where you’re using a rel=canonical together with the noindex, because it can happen that we take the noindex and also apply it to the canonical [URL], because technically the rel=canonical is meant to be used on URLs that are equivalent and if we see the noindex on one URL and we think “Oh, this is equivalent to the other URL” then we might apply the noindex to the other URL as well. So, really try to stick to things like redirects from one version to the other or the rel=canonical, for example.”
If you add canonicals as part of a clean-up, make sure the URL is not also tagged with a noindex. Signals for Google should be as structured and clear as possible:
“…you should avoid ever sending mixed signals. If you want something specific done, you should make it as clear as possible. If you’re saying that these URLs are equivalent, but noindex one and try to have the other one indexed, then that will be unclear — always be as clear as possible.”
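Such mixed signals can be detected automatically. The sketch below uses Python's standard `html.parser` to flag pages that carry a robots noindex and a canonical pointing to a different URL. The class and function names are hypothetical, and flagging only canonicals that point elsewhere is a design choice of this sketch.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical URL and the robots noindex flag of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            self.canonical = attributes.get("href")
        elif tag == "meta" and attributes.get("name", "").lower() == "robots":
            directives = [d.strip() for d in attributes.get("content", "").lower().split(",")]
            self.noindex = self.noindex or "noindex" in directives


def has_mixed_signals(html: str, page_url: str) -> bool:
    """True if the page is noindexed and canonicalizes to another URL."""
    signals = HeadSignals()
    signals.feed(html)
    return signals.noindex and signals.canonical is not None and signals.canonical != page_url


page = (
    '<head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://exampleshop.com/apple/iphone-6-64gb-grey"></head>'
)
print(has_mixed_signals(page, "https://exampleshop.com/smartphone/iphone-6-64gb-grey"))
```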
Domains should always be migrated via 301 redirect on URL level and never via Canonical Tag. A relaunch should be carried out so that regular visitors can easily find the new site, which is why they should be redirected there as well.
When search engines doubt that a canonical makes sense, excessive use of the Canonical Tag can lead to the hint being ignored. As a consequence, both URLs or domains are indexed and neither of them ranks as planned.
As is the case with many things in SEO, structure and cleanliness come first when using canonicals. Because the Canonical Tag is only a hint for search engines, they can also disregard it entirely. This is why you should always watch out for errors in the use of canonicals, so that further problems can be prevented at an early stage.
Note: The original article, "Canonical Guide Do's and Don'ts", was written in German by Martin Tauber.
Published on Jun 1, 2015 by Andreas Bruckschloegl