The Canonical Tag is a declaration in the source code of a web page. It points to a standard resource, the canonical URL, for pages with the same or almost identical content. If a canonical URL is marked correctly, search engines use only this source for indexing. Search engines rate duplicate content negatively because it offers no added value to the user. A duplicate content checker can be used to detect duplicate content.
The Canonical Tag is applied when content is used repeatedly or when a definite URL is technically impossible:
It makes sense to include a Canonical Tag on every subpage so that each page references itself. This corrects or prevents unexpected errors and incorrect links.
In general, there are two ways of indicating a canonical URL. In both cases, Google recommends absolute URLs – meaning the entire web address.
<link rel="canonical" href="http://www.example.com/examplepage.htm"/>
The <link> element with the rel="canonical" attribute is placed in the <head> element of the source code and complements the document's metadata. It refers to the standard page, but it only has an effect on pages that show identical content and are not treated as the original resource.
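To check which canonical URL a page actually declares, the markup can be parsed and the <link> element read back. The following is a minimal sketch using only Python's standard library; the names CanonicalFinder and find_canonical are illustrative assumptions, and for brevity the sketch does not verify that the element sits inside <head>.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter; keep the first canonical that is found.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel is a space-separated list of tokens, compared case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Called on a page containing the element shown above, find_canonical returns the href value; on a page without a Canonical Tag it returns None.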
Let’s assume there are the following two websites:
The first one is our standard resource. The second one is a session URL, as commonly used by online shops to store user-related data such as the items in the shopping cart. The Canonical Tag is integrated into the head element of the second page and contains a reference to the standard resource, which is the first page. This tells Google and other search engines which page should be preferred and included in the index.
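The session case can be sketched in code: the session parameter is removed from the requested URL, and the result is written into the Canonical Tag. This is only an illustration under stated assumptions. The parameter name sessionid and the helper names canonical_url and canonical_link_tag are hypothetical and must be adapted to the shop system in use.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter name; real shop systems use their own.
SESSION_PARAMS = {"sessionid"}


def canonical_url(url: str) -> str:
    """Drop session parameters and the fragment, keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element for the head of the session page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}"/>'
```

Because the function returns the full address including scheme and host, the generated tag uses an absolute URL.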
Link: <http://www.example.com/examplepage.pdf>; rel="canonical"
This is not an annotation inside the document but part of the HTTP response: when a client (e.g. a browser or search engine) sends a request, the server's response states which URL is the canonical one. The server may need to be reconfigured to send this header.
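On the client side, the canonical URL can be read from the value of the Link header. The sketch below is a simplified parser, not a complete implementation of the header syntax: it assumes that parameter values contain no commas, and the function name canonical_from_link_header is an illustrative assumption.

```python
import re
from typing import Optional

# One link-value: "<target>" followed by its parameters up to the next comma.
_LINK_VALUE = re.compile(r"<([^>]*)>([^,]*)")


def canonical_from_link_header(header_value: str) -> Optional[str]:
    """Return the target of the first link-value whose rel contains "canonical"."""
    for match in _LINK_VALUE.finditer(header_value):
        target, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() != "rel":
                continue
            # rel may hold several space-separated relation types.
            rel_tokens = value.strip().strip('"').lower().split()
            if "canonical" in rel_tokens:
                return target
    return None
```

Passing the header value from the example above yields http://www.example.com/examplepage.pdf; a header without a canonical relation yields None.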
Let’s now assume there are these two websites:
The second page should be the standard resource. As it is a PDF file, which has no HTML head element, the canonical reference needs to be sent in the HTTP header. It refers to itself and tells Google that the PDF document serves as the standard for indexing.
With the help of the Canonical Tag, website operators can tell search engines which of the pages with identical content should be treated as the standard resource. A properly used Canonical Tag is the first step towards getting duplicate content under control. In this way, webmasters influence the link popularity of pages with identical content and at the same time concentrate their reputation on one canonical URL.
Canonical Tags and Noindex: With the noindex tag, webmasters can tell Google that a URL should not be indexed. If a Canonical Tag refers to such a page, Google receives conflicting signals: the page is selected as canonical, yet it must not be indexed. Webmasters should therefore decide between noindex and canonical for a given page.
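Such a conflict can be detected automatically once the canonical target and the noindex status of each page are known. The following sketch assumes this data has already been collected, for example by a crawler; the Page record and the function find_noindex_conflicts are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    url: str
    canonical: Optional[str] = None  # URL named in the page's Canonical Tag
    noindex: bool = False            # True if the page carries a noindex tag


def find_noindex_conflicts(pages: List[Page]) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where a canonical points to a noindex page."""
    by_url = {page.url: page for page in pages}
    conflicts = []
    for page in pages:
        # Targets outside the collected data cannot be judged and are skipped.
        target = by_url.get(page.canonical) if page.canonical else None
        if target is not None and target.noindex:
            conflicts.append((page.url, target.url))
    return conflicts
```

Each reported pair names a page whose Canonical Tag points to a URL that is excluded from the index, so one of the two signals has to be removed.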
At the same time, it is a very powerful tool: if applied incorrectly, pages can be ignored completely by Google. First and foremost, the webmaster should check whether the content really is identical or almost identical, because Canonical Tags only make sense in that case.
Frequent errors are:
The Google Search Console allows webmasters to specify how Google should handle the URL parameters of a website. This can cause Googlebot to ignore certain URLs of a site.