A soft 404 error occurs when a user requests a non-existent or incorrect page and the server incorrectly responds with a “200 OK” or “302 Found” HTTP status code instead of “404 Not Found”. The user may see an error message on the screen, but the status code exchanged between the client (browser) and the web server says something different. In other words, the content that is displayed does not match the server’s HTTP response.
This can affect the crawling and indexing of the site. It can also frustrate users, because what the server returns does not match what they requested. Soft 404 errors are also called false 404 errors.
Typically, a server sends a “404 Not Found” status code whenever a requested page does not exist, is no longer present on the server, or the URL is malformed. This often happens when a resource has been moved to a different location on the server and the internal links that point to it have not been updated. A 404 also results when external links point to a resource that no longer exists; such links are called dead links. The server likewise sends a 404 if the user has altered a correct URL.
The soft 404 differs from these cases in that the server responds with an HTTP status code such as 200 or 302 even though the page no longer exists and a 404 would be the correct response. For example, the server treats an incorrect or invalid URL as a valid address and redirects to the start page.
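The distinction can be expressed as a small decision rule. The following is a minimal sketch, not a definitive check: the function name and the returned labels are hypothetical, and it assumes the caller already knows that the requested URL should not exist.

```python
from typing import Optional
from urllib.parse import urlsplit


def classify_missing_page_response(status: int, location: Optional[str] = None) -> str:
    """Classify the response a server gave for a URL that is known not to exist.

    The labels returned here are illustrative, not standard terminology.
    """
    if status in (404, 410):
        return "correct"       # the server reports the missing page honestly
    if status == 200:
        return "soft 404"      # "OK" for a page that does not exist
    if status in (301, 302, 303, 307, 308):
        # A redirect to the start page hides the fact that the page is gone.
        path = urlsplit(location or "").path
        if path in ("", "/"):
            return "soft 404"
        return "redirect"      # may be legitimate if the content was moved
    return "other"
```

For example, `classify_missing_page_response(302, "https://example.com/")` yields `"soft 404"`, while a plain 404 yields `"correct"`.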
Problems arise because the server sends status codes not only to browsers but also to search engine crawlers. A crawler works through the links available to it one by one. If it reaches a page that no longer exists but still returns a 200 or 302 status code, it treats the resource as a regular page with content. Because a crawler spends only a limited amount of time on each website, it may then fail to visit and crawl other pages on the same domain. As a result, it crawls resources that offer no meaningful content, and users are presented with content they did not request.
Crawling errors are listed under the Diagnostics menu item in the Google Search Console (previously called Google Webmaster Tools). If soft 404 errors are noted there, the following steps can be taken.
It is very important that the server returns the correct HTTP status code in response to every request. If it does not, a website could be removed from the index.
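What a correct response looks like on the server side can be sketched with Python's standard `http.server` module. This is an illustration under stated assumptions, not a production setup: the `PAGES` table, the error page text, and the names `respond`, `SiteHandler`, and `make_server` are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Hypothetical site content: URL path -> HTML body.
PAGES = {"/": "<h1>Start page</h1>", "/products": "<h1>Products</h1>"}

NOT_FOUND_BODY = "<h1>Page not found</h1><p><a href='/'>Back to the start page</a></p>"


def respond(request_target: str):
    """Return (status, html) for a request target such as '/products?x=1'."""
    path = urlsplit(request_target).path
    body = PAGES.get(path)
    if body is None:
        # Unknown URL: a real 404 plus a helpful error page,
        # not a 200 and not a redirect to the start page.
        return 404, NOT_FOUND_BODY
    return 200, body


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = respond(self.path)
        data = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port: int = 8000) -> HTTPServer:
    """Create (but do not start) a local server that uses SiteHandler."""
    return HTTPServer(("127.0.0.1", port), SiteHandler)
```

Calling `make_server().serve_forever()` starts the sketch locally. The point of the design is that the error page and the status code are decided in one place, so the two cannot drift apart.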
Add-ons such as Firebug and the Fetch as Google tool are also very useful. They allow webmasters to check whether the HTTP communication between the client and the server works as intended and how Googlebot reads the page. Both display the HTTP status codes. Comparing these with the page shown in the browser reveals any necessary changes. In Bing Webmaster Tools, this check can be performed in the Index Explorer under the menu item 404 error.
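The same status codes can also be read with a few lines of script. The sketch below uses Python's standard `http.client`, which does not follow redirects, so a 302 stays visible as a 302. The function name `fetch_status` and the default `User-Agent` string are assumptions made for this example; the script is not a substitute for the tools named above, because it does not show how a search engine renders the page.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, user_agent: str = "status-check/0.1", timeout: float = 10.0):
    """Return (status code, Location header or None) for url.

    Redirects are deliberately not followed, so the first response is reported.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Requesting an address that certainly does not exist on the site, such as a long random path, and inspecting the returned status is a quick way to see whether the server answers with 404 or with something else.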
Typically, a 404 status code should be issued only for bad URLs or non-existent pages. We also recommend personalized 404 error pages, which offer users an alternative to the requested content and so keep them on the site. A good 404 error page:
Various error codes may occur when a website is redesigned, content is migrated, or seasonal promotions are run. Large-scale projects in particular can produce thousands of errors. Products that are no longer available, and even pages with very little content (thin content), can also be reported as soft 404 errors. The impact can be considerable: search engines may remove the affected pages from the index, which can reduce sales on commercial websites. Users may become frustrated, and the cost of resolving the problems grows with the number of errors.
Checking regularly for error messages is recommended. Errors cannot be avoided completely, but they can be kept to a small number. Once errors have been corrected, this should be reported to the respective search engine in its Webmaster Tools so that the crawler re-reads the pages with the corrected status code as soon as possible. Soft 404 errors can have an indirect impact on traffic if Google de-indexes or downgrades the affected pages. This can happen when the ratio of soft 404 errors to indexed pages is exceptionally high, because the crawler’s time budget is then largely used up by soft 404 pages.
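The ratio mentioned above is simple to track over time. The helper below is a minimal sketch with a hypothetical name; the text gives no figure for what counts as "exceptionally high", so no threshold is built in.

```python
def soft_404_ratio(soft_404_count: int, indexed_pages: int) -> float:
    """Share of soft 404 errors relative to the number of indexed pages."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    if soft_404_count < 0:
        raise ValueError("soft_404_count must not be negative")
    return soft_404_count / indexed_pages
```

With 150 reported soft 404 errors and 1,000 indexed pages, `soft_404_ratio(150, 1000)` gives 0.15, that is, 15 percent.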