An XML sitemap is a listing, in XML format, of all the subpages of a website. It accompanies a website much like a table of contents accompanies an article.
It helps web crawlers orient themselves more quickly, refers them to suitable subpages using links, and differentiates text elements from additional content such as images, videos, or podcasts. With the help of metadata, the Googlebot receives information about how often new content is uploaded to the website and when it was last changed.
XML sitemaps list the pages of a website in a structured form, so that every page can be reached from a single file. To be crawled by search engines, they require a machine-readable format: the XML sitemap protocol. Alternatively, a sitemap can be stored as an RSS feed or a text file.
<?xml version="1.0" encoding="UTF-8"?>
Figure 1: Exemplary construction of an XML sitemap.
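As a rough illustration, a minimal sitemap following the XML sitemap protocol could look like the sketch below. The domain www.example.com, the dates, and all values are placeholders chosen for illustration; they are not taken from the original figure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: www.example.com and all values are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- loc: full URL of the page (the only required child of url) -->
    <loc>https://www.example.com/</loc>
    <!-- lastmod: date of the last change, in W3C Datetime format -->
    <lastmod>2017-03-07</lastmod>
    <!-- changefreq: a hint about how often the page is expected to change -->
    <changefreq>daily</changefreq>
    <!-- priority: relative weight from 0.0 to 1.0 compared with other pages of this site -->
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/blog/</loc>
    <lastmod>2017-03-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Only loc is required for each entry. The lastmod and changefreq elements carry the metadata about changes described above, and search engines treat them as hints rather than commands.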
The most obvious difference between an XML sitemap and an HTML sitemap is that the XML sitemap is "invisible" to the visitor, because it is a separate file in your site structure. You create the XML sitemap for Google and other search engines, which can crawl websites with an XML file more easily. This can also be good for your SEO measures, because the sitemap tells Google in a structured fashion what is new in your web presence.
Figure 2: Complex XML sitemap from Google (www.google.de/sitemap.xml)
The HTML sitemap, on the other hand, is visible to the reader in the browser, usually in the lower area of the page. It is not obligatory and only makes sense when the site offers so much content that the main navigation cannot present all topics. It is practical for your visitors, because they can find what they're looking for more quickly. The HTML sitemap thus allows for better orientation.
Such a "table of contents" is interesting, for example, for companies, bloggers, and online magazine operators who want to offer the reader additional value. You attach a banner with information about the company, its services, or partners on the homepage and on each subpage. Instead of a sitemap, bloggers can display their archive. In this way, you help your visitors read your blog thematically or chronologically.
Figure 3: HTML sitemap of Asos
There are many ways to create the XML sitemap. You can create it either manually or with the help of tools – Google suggests several third-party providers for this. Creating it with a tool is easy: you simply enter your URL and download the finished file. The file is called "sitemap.xml" by default, but you can also rename it.
If other files with the ending gz, html, or txt are created during sitemap generation, you can ignore them. Note: the next steps can be somewhat tricky if you don't know your directory structure well. You should upload the XML sitemap to the proper place: the root, or main directory, of your website. Make sure that you don't land in a subdirectory; the sitemap belongs on the same level as folders such as metadata, wp-admin, wp-content, and wp-includes, not inside them.
Figure 4: Example of a directory structure
Once the file can be called up under its URL (the address of your root directory followed by the file name), you can submit it to Google.
Alternatively, you can reference the sitemap anywhere in your robots.txt file by adding a line that consists of "Sitemap:" followed by the full URL of the sitemap file.
Finally, log into the Google Search Console and choose the corresponding domain.
Figure 5: Choose property in the GSC
In the navigation, click on "crawling" and then on "sitemaps." With a click on "insert sitemap," you will see your URL and a blank field in which you can enter the file name.
Figure 6: Enter sitemap in the GSC
In this way, you have added a subpage to your site that is visible to the reader only if they directly enter the entire URL.
Tip: If your uncompressed sitemap is larger than 50 MB or contains more than 50,000 URLs, divide it into several smaller sitemaps and add a sitemap index file that lists them.
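The sitemap index file mentioned in the tip can be sketched as follows. The file names sitemap-pages.xml and sitemap-blog.xml and the domain are hypothetical; any naming scheme works as long as each loc points to a valid sitemap.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: file names and domain are placeholders -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <!-- loc: full URL of one of the smaller sitemap files -->
    <loc>https://www.example.com/sitemap-pages.xml</loc>
    <!-- lastmod: when that sitemap file itself was last changed (optional) -->
    <lastmod>2017-03-07</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-blog.xml</loc>
    <lastmod>2017-03-01</lastmod>
  </sitemap>
</sitemapindex>
```

With such an index in place, you would typically submit only the index file; the search engine then discovers the individual sitemaps through it.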
It’s a given that you should publish new content on your website as often as possible to gain an "online following," loyal readers, and potential new customers. It is also clear that web crawlers regularly search the web for current information. But why does Google, with its intelligent algorithms, need updates to the sitemap?
Put simply: as the early bird, you gain a competitive advantage. New content is indexed and found more quickly. In this way, online magazines can increase their chance of exclusive media coverage and, ideally, obtain a coveted Google box. All sitemap types, whether for text, images, or videos, should be updated after content is changed.
This can be done quickly with small sites: generate XML manually on the corresponding days and upload the files. Operators of larger websites and bloggers with daily change frequencies can have their sitemap updated automatically - through plugins, for example. Warning: updating too frequently can slow down site speed.
There are expanded versions of the sitemap for additional media types. The most important are image, video, and news sitemaps. For a purely text-based website, you can keep the XML and HTML sitemap. However, if you want to incorporate images, offer how-to videos to your readers, or appear in Google News, you should submit a matching "table of contents." Google itself seems enthused by the new opportunities to present better search results - with the help of the site operator.
This is especially true for video XML sitemaps:
(Statement from Google in the Search Console on the topic of video XML sitemaps)
An image XML sitemap combines image content with additional attributes, for example: all the information related to the image (image:image). Practical for discoverability and ranking in Google Images, it provides the opportunity to assign locations to the images (geo_location). "Searchers" will then be forwarded directly to your website via the image.
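A minimal sketch of an image entry, assuming a hypothetical travel page on www.example.com. The image: namespace is the one Google published for image sitemaps; the URLs, caption, and location are placeholders. Which optional image tags Google actually reads has changed over time, so check the current documentation before relying on caption or geo_location.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: URLs, caption, and location are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <!-- loc: the page on which the image is embedded -->
    <loc>https://www.example.com/travel/cologne.html</loc>
    <image:image>
      <!-- image:loc: URL of the image file itself (required) -->
      <image:loc>https://www.example.com/images/cologne-cathedral.jpg</image:loc>
      <!-- image:caption: short description of the image (optional) -->
      <image:caption>Cologne Cathedral at night</image:caption>
      <!-- image:geo_location: where the image was taken (optional) -->
      <image:geo_location>Cologne, Germany</image:geo_location>
    </image:image>
  </url>
</urlset>
```

A page with several images simply repeats the image:image element inside the same url entry.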
For videos, there is the video XML sitemap. Search engines can better narrow down videos thematically and suggest them to the appropriate target group. You have an advantage in Google video search when you also provide rich snippets through schema.org markup. You can provide all kinds of information in the tags: from the creator, title, and content of the video to its location and length, and whether or not the content is "family-friendly."
Figure 7: Example of a video XML sitemap
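As a rough sketch, a video entry could look like the following. The page, thumbnail, and media URLs, the title, and the uploader name are all hypothetical; the tag names come from Google's video sitemap extension.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: all URLs, texts, and values are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <!-- loc: the page on which the video is embedded -->
    <loc>https://www.example.com/videos/sitemap-howto.html</loc>
    <video:video>
      <!-- Required: thumbnail, title, description, and the media file or player URL -->
      <video:thumbnail_loc>https://www.example.com/thumbs/sitemap-howto.jpg</video:thumbnail_loc>
      <video:title>How to create an XML sitemap</video:title>
      <video:description>A short walkthrough of creating and submitting an XML sitemap.</video:description>
      <video:content_loc>https://www.example.com/media/sitemap-howto.mp4</video:content_loc>
      <!-- Optional: length in seconds -->
      <video:duration>240</video:duration>
      <!-- Optional: "yes" or "no", depending on whether the video is suitable for all audiences -->
      <video:family_friendly>yes</video:family_friendly>
      <!-- Optional: name of the person or organization that uploaded the video -->
      <video:uploader>Example Company</video:uploader>
    </video:video>
  </url>
</urlset>
```

The optional tags shown here correspond to the length, "family-friendly" flag, and creator information mentioned in the text; many more optional tags exist in the extension.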
A separate news XML sitemap is particularly exciting for online newspapers and company blogs, because the Googlebot requires news-specific tags in order to evaluate the relevance of your content and the trust in your author. Most important are the name of the medium (publication), the title of the article (title), and the publication date (publication_date). This information can positively affect your ranking.
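The three tags named above fit together roughly as in this sketch. The publication name "Example Magazine", the article URL, and the title are hypothetical placeholders; the news: namespace is the one Google defines for news sitemaps.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: publication, URL, and title are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <!-- loc: URL of the news article -->
    <loc>https://www.example.com/news/sitemap-launch.html</loc>
    <news:news>
      <!-- publication: name of the medium and the language of the article -->
      <news:publication>
        <news:name>Example Magazine</news:name>
        <news:language>en</news:language>
      </news:publication>
      <!-- publication_date: when the article was first published, in W3C Datetime format -->
      <news:publication_date>2017-03-07</news:publication_date>
      <!-- title: headline of the article -->
      <news:title>Example Magazine launches a news sitemap</news:title>
    </news:news>
  </url>
</urlset>
```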
In the "WebCrawling" area of the Search Console, you will find the "sitemap report." It is a very practical service from Google, because you can check whether your sitemap contains errors that would prevent the crawler from indexing your pages. These errors often involve the status code, for example when URL endings were renamed and the 301 redirects are missing. The sitemap error report shows why a page could not be accessed.
Figure 8: Check the sitemap in the GSC
404 status codes (broken links) are a frequent problem. When attempting to access the page, "site not found" will appear – and, in the worst case, the construction man will appear. A list of pages not found and other indexing problems will appear in the section entitled "Crawling Errors."
You can also use Ryte Website Success to check whether the pages of your website are listed in the sitemap.xml or whether there are problems with your file. Click on the report "sitemap". Here, you will obtain an exact overview of all the pages that are listed in your sitemap.xml.
Figure 9: Check sitemap with Ryte
Sitemaps do not guarantee that all content on your website will be crawled and indexed, because the process is controlled by algorithms. But by providing a logically constructed XML sitemap, you can make indexing easier for the search engine bots. This is especially important for videos, which are becoming more and more important for search engine optimization and for which you can submit a dedicated video XML sitemap.
When you make comprehensive information about your web offerings available, you as a webmaster can influence the crawling of your site. It is worthwhile to take a look at technical SEO and give your ranking a hand. An XML sitemap is not an absolute necessity, but it can’t hurt and it only takes a little bit of effort.
Published on 03/07/2017 by Eva Wagner.
Eva is an experienced content marketer. Until May 2018 she was a member of the online marketing team at Ryte. Using her creativity and knowledge of current topics, she was responsible for the German Ryte Magazine and the Ryte Wiki. She also organized Ryte’s presence at major trade fairs such as the dmexco in Cologne.