AJAX Crawling Scheme
The AJAX crawling scheme is a method by which Google and other search engines crawl websites that serve dynamically generated content. Google had used this procedure since 2009. On October 15, 2015, however, Google announced that the crawling scheme was no longer recommended and deemed it obsolete (deprecated). Instead, progressive enhancement and the capabilities of HTML5 (the History API) are meant to be used to ensure that such content remains accessible to crawlers.
The connection between client and server is not interrupted. Rather, the user initiates the creation of dynamic content by clicking an object on the site. This click executes a script that sits between the HTTP communication of client and server and loads the selected content. The AJAX engine detects the script call (an asynchronous request) and sends an XMLHttpRequest to the server or a database to retrieve the content. The selected items are then loaded dynamically into the website by the script, or executed.
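The client-side flow described above can be sketched as a small handler. This is only an illustration, not code from the scheme itself; the `/content` endpoint and the helper names are assumptions made for this sketch:

```javascript
// Minimal sketch of the client-side flow (illustrative names only).
// The user's click changes the hash fragment; the script reads the
// fragment and asynchronously requests the matching content.

// Parse a hash fragment such as "#!key=value" into an attribute/value pair.
function parseFragment(hash) {
  if (!hash.startsWith("#!")) return null; // not an AJAX URL
  const [key, value] = hash.slice(2).split("=");
  return { key, value };
}

// Build the asynchronous request URL for the parsed fragment
// ("/content" is a hypothetical endpoint for this sketch).
function contentUrl(hash) {
  const pair = parseFragment(hash);
  return pair ? `/content?${pair.key}=${encodeURIComponent(pair.value)}` : null;
}

console.log(contentUrl("#!page=products")); // "/content?page=products"
```

In a browser, the AJAX engine would fetch the returned URL with `XMLHttpRequest` and insert the response into the page, so the visible connection between client and server is never interrupted.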
How it works
The AJAX crawling scheme ensures that dynamically generated content can be read by crawlers, bots, or spiders. Since these programs, which constantly analyze the global Internet, cannot interpret dynamically generated web content or scripts, the scheme stores an HTML snapshot of the current content on the server. Content with HTML markup is then readable even for text-based crawlers, because it effectively exists in two versions: the dynamic page and the static snapshot. Several steps are necessary to prepare a site for the crawling scheme:
- The first step is to indicate on the website that the AJAX crawling scheme is supported. A site with dynamically generated content might have the following URL:
http://www.my-domain.de/ajax.html#!key=value
The pound sign (#) marks the point where the hash fragment begins, in other words, anything that would be generated to handle a dynamic query (usually attribute/value pairs or IDs). It is immediately followed by the exclamation point (!). This type of URL is referred to as an AJAX URL, and the combination #! is often called a hashbang.
This notation informs the crawler that the website supports the AJAX crawling scheme.
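A crawler's check for this notation amounts to looking for the #! token in the URL. The function below is a sketch with an illustrative name, not part of any real crawler:

```javascript
// Sketch: how a crawler could recognize that a URL opts into the
// AJAX crawling scheme (the function name is illustrative).
function supportsAjaxCrawling(url) {
  // The "#!" token (hashbang) signals scheme support.
  return url.includes("#!");
}

console.log(supportsAjaxCrawling("http://www.my-domain.de/ajax.html#!key=value")); // true
console.log(supportsAjaxCrawling("http://www.my-domain.de/ajax.html#section"));    // false
```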
- In the second step, a different URL format is created for each dynamically generated URL, because the server must receive a URL it can resolve to the correct HTML snapshot. The crawler therefore transmits the URL
http://www.my-domain.de/ajax.html#!key=value
in a different format:
http://www.my-domain.de/ajax.html?_escaped_fragment_=key=value
Only in this way does the server know that the crawler is requesting the content for the URL
http://www.my-domain.de/ajax.html#!key=value
and that it must return an HTML snapshot. If the crawler requested the original URL format, the hash fragment would never reach the server and no crawlable content would be returned.
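The URL rewriting in this step can be sketched as two small functions, one for each direction. This is a simplified illustration with assumed function names; for brevity it skips the percent-encoding that the real scheme applies to special characters in the fragment:

```javascript
// Sketch of the URL rewriting described above: the crawler replaces the
// "#!" fragment with the "_escaped_fragment_" query parameter, and the
// server maps it back to decide which HTML snapshot to return.

// Crawler side: "...#!key=value" -> "...?_escaped_fragment_=key=value"
function toEscapedFragment(url) {
  const i = url.indexOf("#!");
  if (i === -1) return url; // not an AJAX URL, leave untouched
  const base = url.slice(0, i);
  const fragment = url.slice(i + 2);
  const sep = base.includes("?") ? "&" : "?";
  return `${base}${sep}_escaped_fragment_=${fragment}`;
}

// Server side: recover the original AJAX URL from the crawler's request.
function fromEscapedFragment(url) {
  const marker = "_escaped_fragment_=";
  const i = url.indexOf(marker);
  if (i === -1) return url;
  const fragment = url.slice(i + marker.length);
  const base = url.slice(0, i - 1); // drop the "?" or "&" before the marker
  return `${base}#!${fragment}`;
}

console.log(toEscapedFragment("http://www.my-domain.de/ajax.html#!key=value"));
// "http://www.my-domain.de/ajax.html?_escaped_fragment_=key=value"
```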
- The search engine indexes the HTML snapshot for each URL, not the dynamic content it cannot read. In the SERPs, however, the AJAX URLs are displayed, i.e. the URLs with a hash fragment such as #!key=value.
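On the server, the steps above come down to one dispatch decision: requests carrying the `_escaped_fragment_` parameter get the pre-rendered snapshot, everything else gets the dynamic page. The sketch below assumes an in-memory snapshot store purely for illustration:

```javascript
// Sketch of the server-side decision (illustrative names, in-memory store).
// Requests with "_escaped_fragment_" receive the pre-rendered HTML snapshot;
// ordinary requests receive the dynamic page.
const snapshots = {
  "key=value": "<html><body><h1>Static snapshot of key=value</h1></body></html>",
};

function handleRequest(url) {
  const marker = "_escaped_fragment_=";
  const i = url.indexOf(marker);
  if (i !== -1) {
    const fragment = url.slice(i + marker.length);
    return snapshots[fragment] || "<html><body>snapshot missing</body></html>";
  }
  // Dynamic page: content is built in the browser by a script.
  return '<html><body><script src="app.js"></script></body></html>';
}

console.log(handleRequest("http://www.my-domain.de/ajax.html?_escaped_fragment_=key=value"));
```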
Importance for search engine optimization
Although Google has deprecated the scheme, many webmasters still use AJAX-based applications. The key points of the most frequently asked questions are summarized here:
- Older websites that use the AJAX crawling scheme will continue to be indexed by Google. As a rule, however, the crawler uses the URL format with the #! hash fragments.
- Websites that no longer use the AJAX crawling scheme will not be treated as cloaking (deceptive content switching). Nevertheless, the URL format with "_escaped_fragment_" should be avoided when implementing new web projects or relaunches.