Noindex


Use of “noindex” in a page's meta tags tells a search engine robot that the visited page should not be indexed. With “noindex”, webmasters can actively influence which URLs are indexed and which are not. The so-called noindex tag can be expanded with the value "follow" or "nofollow".

Implementation

The “noindex” meta tag is integrated in the source code of a website or subpage in the <head> area and added to the other meta data.

The complete tag looks like this:

<meta name="robots" content="noindex" />


The content of the page is thus not indexed by the robot and will not be accessible through the SERPs.

In order to check whether the meta tag is read and followed, every webmaster can use a so-called web page analyzer. If the tag is correctly integrated, the search results should be negative, because the search engine robot has been forbidden from indexing the page.
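Such a check can also be scripted. The sketch below, which assumes the page's HTML has already been fetched, uses Python's standard-library HTML parser to report whether a robots meta tag contains the noindex directive:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" ...> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            content = attributes.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]

def is_noindex(html):
    """Return True if the page's robots meta tag forbids indexing."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

# Illustrative page markup:
page = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'
print(is_noindex(page))  # -> True
```

This only inspects the meta tag in the markup; a real analyzer would additionally fetch the page and check HTTP headers.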


Application areas

With "noindex", it can be conveyed to search engines that a website should be excluded from the indexing. The use of the tag can be useful for example in:

  • internal search results pages
  • duplicate category pages
  • copyright-protected content
  • paginated pages

noindex vs. disallow

In many cases, webmasters are unaware of the difference between the “disallow” command in the robots.txt file and the “noindex” meta tag. Generally, it is not advisable to use both methods simultaneously. This is because the “disallow” command in the robots.txt file stops the bot from going through the page at all. As a result, the crawler never sees the “noindex” meta tag, and the page may subsequently end up in the index anyway. It would therefore be wrong for a webmaster to assume that combining both methods safely ensures that a page is neither crawled nor indexed.

The “noindex” meta tag is only there to prevent search engines from indexing a page. If the entire page should not be crawled, using robots.txt is recommended.
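Unlike the meta tag, a disallow rule lives in the robots.txt file in the root directory of the domain, not in the page markup. A minimal example (the /login/ path is purely illustrative) that blocks all crawlers from a directory could look like this:

```text
User-agent: *
Disallow: /login/
```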

disallow

  • contents should not be crawled at all
  • for sensitive content such as login pages
  • for massive data volumes such as specific image databases
  • indexing of the pages is still possible (e.g., via external links)


noindex

  • contents should be crawled but not indexed
  • for internal search results pages
  • not included in the indexing

Special case “noindex,follow”

Anyone who wants a bot to exclude a subpage of a domain from indexing but still follow its links can combine the “noindex” meta tag with the value “follow”. In practice, this looks as illustrated below:

<meta name="robots" content="noindex,follow" />

For instance, this option can be used for a category that has several pages. The bot follows the links on the respective subpages, but only indexes the first category page. [1]

There is also the possibility of combining the noindex tag with the pagination attributes rel="prev" and rel="next".
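Assuming a hypothetical category spread across several pages, the second page of such a series could, for example, carry the following tags in its <head> (the URLs are illustrative):

```html
<meta name="robots" content="noindex,follow" />
<link rel="prev" href="https://www.example.com/category?page=1" />
<link rel="next" href="https://www.example.com/category?page=3" />
```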

It is important that the noindex tag is not combined with a canonical tag. Such a combination would send the search engine contradictory signals: the canonical tag states that two pages are identical and that this page is the original, while the noindex tag says that this very page should not be indexed.[2]
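The contradictory combination to avoid would look like this (the canonical URL is purely illustrative):

```html
<meta name="robots" content="noindex" />
<link rel="canonical" href="https://www.example.com/original-page" />
```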

noindex,nofollow

If a crawler such as the Googlebot should neither include a website in the index nor follow the links on it, the tag noindex,nofollow is implemented in the <head> area.

<meta name="robots" content="noindex,nofollow" />

In practice, this combination is seldom used, because it effectively shuts the crawler out of the website.

Importance for SEO

Regarding search engine optimization, the “noindex” meta tag provides an elegant way of avoiding duplicate content. Particularly with regard to the fact that Google and other search engines can penalize pages with duplicate content, influencing the indexing of pages is very important. Adding “follow” in the tag leaves the option to still follow all links on the non-indexed page.

Many content management systems (CMS) automatically create a variety of archive pages that can quickly end up in the index. In extreme cases, such a “flooding” of the index can be regarded as spamming. “noindex” can be used to avoid such risks.

“noindex” can also be useful during the relaunch of a website or when launching a new version of a page. All those involved in the project can test the functionality of the new page “live” without the risk of some of its areas being indexed by a search engine. It is important that the noindex tag is taken out of the source code after the website goes live. Only then can the Googlebot or Bingbot index the site, and only indexable URLs can rank.

References

  1. Block search indexing with meta tags support.google.com. Accessed on 12/12/2014
  2. Google: Do Not No Index Pages With Rel Canonical Tags seroundtable.com. Accessed on 01/31/2017
