This article will show you how you can quickly identify and deal with 404 error pages to optimize your website both for search engines and your users.
“404 – page not found” – who hasn’t seen this statement when browsing on the internet? You’re searching for specific information, and this message suddenly appears.
404 error pages can have negative effects on your website in many ways: Besides frustrating users, they also send negative signals to search engine bots. It’s therefore important to quickly identify and deal with 404 error pages.
A 404 error occurs when a URL is no longer accessible. A URL becomes inaccessible when the website operator removes it, but this is often unintentional and can happen without the operator noticing. For example, content management systems usually generate a URL automatically based on the title of the page, and some of them change the URL automatically when the website operator changes that title. Unless the original URL is redirected, this will result in a 404 error.
404 errors can also be caused by website relaunches. If you implement new URL structures in addition to changing the website content and appearance, you need to take extra care. For example, 404 errors can occur if you only redirect the most important pages in order to save resources. You also have to consider existing redirects, and bear in mind that external sources might link to one of your URLs: if you knowingly change a URL, you need to set up a redirect.
404 error pages can have a negative effect on your website in terms of both usability and search engines. A high number of 404 error pages gives users a poor experience, and they are more likely to leave your website and go to a competitor's. Websites with a high number of 404 pages also require more crawl resources, and they raise the risk of search engines not being able to access important content through the link structure.
In addition, valuable link juice is lost when external links point to pages that return a 404 error.
Ryte’s module Website Success can help you to easily identify 404 errors that have been found on your website by the Ryte bot. Go to the Website Success module, select “Indexability” → “Status Codes”, and click on 4xx status codes.
Figure 1: Identify 404 errors with Ryte
If you want to get to the root of the problem, you also need to analyze which pages link to these inaccessible URLs. These can be identified by clicking on “Links” → “Overview” → “List of all links”, and then setting up the following filters:
Click on “Add new filter”, select “is local file (source)”, and set the option to “Local file”.
Figure 2: Add filter to view local files
This filter lists all links that point to internal pages. To limit the results to just the inaccessible pages, i.e. where a 404 error will occur, click on “Add new filter”, then “Status Code (source)”, and select “is” “404”.
Figure 3: Filter to show all 404 pages
Once you have successfully created and applied both filters, you will see a list of all internal 404 errors and the pages that link to them.
Tip: If you have linked Ryte with Google Analytics, the Ryte bot also analyzes all the URLs from Analytics. This raises your chances of identifying all 404 error pages and gives you a clear overview of the number of visitors who have visited the various URLs in the last 30 days. This means that you can prioritize which 404 errors to correct first based on their traffic.
The Google Search Console (formerly called Google Webmaster Tools) provides you with a lot of useful information about your domain. By clicking on “Crawl Errors” → “Not found” you open a list of URLs that could not be found during the crawl. Clicking on a specific URL offers more information about the linked page.
Figure 4: Identify 404 pages in the Google Search Console
Tip: Taking a look at soft 404 pages often pays off. Soft 404 pages are faulty or non-existent URLs that still return a “200 OK” or “302 Found” status code.
The Google Search Console lists all 404 pages that have been detected on your website, both now and in the past. When analyzing the 404 pages, you should firstly check the date of the page, and see if it still exists.
Firstly, you need to configure your server correctly. You can do this by adding the following code in the .htaccess file:
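A minimal sketch of such a line, assuming an Apache server and a custom error page stored at the hypothetical path /404.html:

```apache
# Assumption: the custom error page is stored at /404.html (hypothetical path).
# A local path lets the server deliver the page together with the 404 status code.
ErrorDocument 404 /404.html
```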
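You shouldn’t use the domain name in this .htaccess line, so use a local path rather than a full URL. When a full URL is given, the server answers with a redirect instead of the 404 status code, and search engines will often interpret this as a soft 404 error.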
Once you have successfully analyzed the 404 pages, you should then decide on how you want to proceed with the respective pages.
If you want to offer your users similar information, you should redirect the non-existent URLs to thematically similar pages, so that you can still provide them with relevant information. This also means that valuable link juice will be passed on through the internal linking.
However, link power is only passed on when you use permanent 301 redirects – a temporary 302 redirect does not pass on the link juice.
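On an Apache server, a permanent redirect could be set up in the .htaccess file roughly as follows. This is only a sketch using the Redirect directive from mod_alias; the paths and the domain www.example.com are placeholders and not taken from a real website.

```apache
# Hypothetical URLs: replace them with the removed page and its thematically similar successor.
# "301" marks the redirect as permanent; "302" would make it temporary.
Redirect 301 /old-page/ https://www.example.com/new-page/
```

Note that Redirect applies to every request whose path begins with /old-page/. If only one exact URL should be redirected, RedirectMatch with an anchored regular expression may be the better fit.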
Although you may be tempted to implement a redirect so as to not lose out on link power, in some cases, a 404 error should be left as it is. This is particularly the case when you want to remove content permanently, and there are no other pages with similar content. These pages should be set to provide a “404 not found” page, so that users are informed that the content no longer exists.
When setting up a 404 page, there are several basic elements that you should provide. These include a notification that the original page is no longer available, and a possibility for the user to navigate further to another page of your website. This will prevent your users from leaving your website. Ideally, there should be an option for users to navigate to pages with similar content, to facilitate their search for information. If there is no similar content, a search function could help the user find what they’re looking for. Have a look at this article for more information about setting up a 404 page.
Figure 5: Error page without navigation or aid for the user
Technically, it is important for the page to return a 404 or 410 status code correctly so as to avoid soft 404 errors.
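If content has been removed permanently, an Apache server can be told to answer with a 410 status code instead of 404. A minimal sketch, again assuming mod_alias is available and using a hypothetical path:

```apache
# Hypothetical path of a page that has been removed for good.
# "gone" makes the server return a 410 status code; no target URL is given.
Redirect gone /discontinued-page/
```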
Figure 8: Example of a well implemented 404 page with additional information for the user
You should keep your 404 error pages to a minimum due to the negative effects they can have on usability and on search engines. If you have URLs with content that is similar to that of the error page, you should always use a permanent 301 redirect to a thematically relevant page.
However, if a 404 error cannot be avoided, the page should return a 404 or 410 status code, and should include navigation options or a search function so that your website can still provide users with relevant information.
Published on 08/17/2016 by Stephan Walcher.
Stephan Walcher is an SEO specialist who has been active in the online marketing field since 2007. He has worked as an in-house SEO specialist for MSN and Bing, as head of SEO consulting at Catbird Seat online marketing agency, as senior SEO manager at 1&1 Mail & Media GmbH, and later as Head of Product Management at Ryte. In January 2017, he joined One Advertising AG as Team-Leader Travel SEO.