Don’t panic – if your online shop is having issues with technical SEO then our resident website whisperer Izzi Smith can help.
You surely know by now that online success requires good rankings in organic search results. Which means that technical SEO for ecommerce is an essential part of driving your revenue – and our 2022 guide is here to help.
I’ll cover the main issues I’ve encountered when doing technical SEO for hundreds of ecommerce platforms like yours. While each website is different, you’d be surprised at how common these issues are, even for some of the biggest players.
We’ll look at:
Ecommerce SEO security
How crawling and indexing work
6 common technical SEO problems
Ready to rock and roll? Let’s do it!
Safety first: your website should be served using SSL encryption, also known as the HTTPS protocol. If it isn’t, then I advise you to go and sort this out – right now!
SSL protects your visitors’ data against malicious activities. For an online shop where private information (like credit card details) is entrusted to you, this is especially important.
Should your pages not be encrypted, users will be rightfully warned with a “Not Secure” label in their browser. This is a huge conversion killer!
Check out our guide on how to set up SSL encryption to get started.
If your site uses HTTPS, don’t skip this section just yet! As well as keeping the certificate up to date, you also need to make sure that visitors don’t accidentally land on non-secure pages, which can happen when there’s no permanent redirect (301) in place from HTTP to HTTPS.
To take one fictitious example: maybe some years ago my website was not SSL encrypted. Back then, I published a fantastic article about how to make vegan gummy bears, and linked to the related product page within the text using the non-secure URL.
When I migrated to HTTPS, the article was automatically moved with it, but the links I had added manually were not updated. This means that every time someone visits the article, clicks on the link and goes to buy the product, they land on the HTTP-only version of the website.
Adding a permanent redirect rule will avoid this!
Visit one of your website’s pages and in the URL bar change HTTPS to HTTP.
Does the browser now forcefully redirect you back to the HTTPS version? If you stay on the HTTP site, this means the redirect rule is not in place, and you have to set it up.
Run a crawl to detect if HTTP-only pages are being discovered. This is the most reliable way of ensuring that your set-up is fully working for all pages across your site.
Set up a redirect rule that checks if a page starts with http:// and then redirects it to the same URL but with https:// in place.
This can be done via a .htaccess file, a WordPress plugin, or in your CDN settings – see the sketch after this list.
Avoid linking to old http:// URLs
Ensure that all assets (like images) are correctly served over HTTPS, or else you will risk being flagged with a mixed-content warning
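If your shop happens to run on an Apache server and you manage redirects via .htaccess, the rule could look something like the minimal sketch below. This is only an illustration – the exact syntax depends on your server or CDN, so check your provider’s documentation before applying it.

```apache
# Minimal sketch: force HTTPS with a permanent (301) redirect
# (assumes an Apache server with mod_rewrite enabled)
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```

After adding a rule like this, repeat the HTTP-to-HTTPS check from above to confirm the redirect fires on every page.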
Search engine bots crawl websites like a spider hopping from page to page, using links as their path. They download and analyze the information and store it within their index, so that it can be ranked for relevant queries.
The URL structure helps them understand your site architecture. That’s a huge oversimplification of the process, but for now – that’s all you need to know :)
When analyzing a page’s content, search engine bots first retrieve some essential facts about the website.
Some of these facts are found within the robots.txt, which is a plain text document that dictates how the crawler should behave on the website, including:
Which pages and directories to not crawl (using disallow:)
Which pages within those blocked directories should still be crawled (using allow:)
And any sitemaps or sitemap indexes you have (we’ll get to those soon)
It looks something like this (the paths and domain below are purely illustrative):
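```
# Illustrative robots.txt for an online shop – adapt the paths to your own URL structure
User-agent: *
Disallow: /cart/
Disallow: /account/
Allow: /account/help/

Sitemap: https://www.example-shop.com/sitemap.xml
```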
The robots.txt should be handled carefully. Should you block important pages from being crawled and analyzed, they will not be properly surfaced in search results. You should only block pages that make no sense for organic visitors (like logged-in pages, cart URLs, and so on).
Robots directives, on the other hand, are found within individual pages and tell Google whether it should “index” the page or not, and/or whether it should “follow” the links on that page.
By “noindexing” a page you are telling Google that it should not be surfaced in search results, and therefore be more difficult for people to find.
By “nofollowing” links on a page, you tell Google not to pass along any linking power to the pages that URL links to. Every time you link to a page, you are giving it a bit of endorsement. Nofollowing removes that endorsement.
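In practice, these directives usually live in a meta tag in the page’s head. A minimal example might look like the snippet below – which combination you actually need depends on the page in question:

```html
<!-- Illustrative meta robots tag: keep this page out of the index,
     but still let crawlers follow the links on it -->
<meta name="robots" content="noindex, follow">
```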
These are fundamentals to grasp in technical SEO for any website. Blocking search engines from crawling your site is basically like covering up a physical shop’s windows and taking down the sign. Sure, some people may come across it by chance, but you’re hiding from many potential customers.
I know, tinkering with robots.txt and robots directives sounds terrifying! And for sure, there are horror stories out there about websites disappearing from search. But robots directives can and should be used for good: to prioritize Googlebot’s journey through our website.
You should only block search engines from crawling URLs that make no sense for organic visitors to see, such as your customers’ account pages. Consider these the back office of your shop. They’re important, but search engines and search engine visitors don’t need to find them.
A great example of a situation where it makes sense to block search engines from crawling comes from the popular online marketplace Zalando. It tells Googlebot not to crawl URLs whose paths contain the cart view and account pages:
Tip: Don’t block crawlers from viewing “noindex” tags!
Search bots need to be able to visit and crawl your pages to register whether they are set to “noindex” or not. If you have set a page to “noindex” but also disallowed it within the robots.txt, they aren’t allowed to read the page – and they’ll miss the “noindex” with it. This is something you should always avoid.
Tom Tailor previously disallowed search engines from crawling URLs with “?sort=new” within them, but then added a “noindex” to these pages to stop them from showing up in search. Googlebot was able to locate the URLs, but couldn’t crawl the information within them. This then leads to a rather ugly search snippet being shown!
In order for Google to read whether a page is set to “noindex”, the page must be crawlable. Therefore, you should not block these pages from Google, especially if they’re being linked to. If a “noindex” tag is not crawlable by Google, you will get results like this: Google has indexed the page due to incoming references or links to it, but it has no further information about it!
If you need more information on proper robots.txt and robots directives, Google Search Central has tons of great documentation for you, such as this guide.
A sitemap is a catalog of all your indexable pages. It usually exists as an XML file (XML is a structured data format that machines, such as search engine crawlers, can easily parse).
Most Content Management Systems (where you build your website) create an XML sitemap by default, which is really handy.
Your sitemap should only contain pages you wish to be indexable for search engines, so there’s no point spending time adding in URLs for blocked or non-indexable pages!
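For reference, a bare-bones XML sitemap containing a single indexable URL looks roughly like this – the domain, path and date are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per indexable page -->
  <url>
    <loc>https://www.example-shop.com/products/vegan-gummy-bears</loc>
    <lastmod>2022-01-15</lastmod>
  </url>
</urlset>
```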
You can create XML sitemaps for different sections of your website if you have a large amount of content. If you’re using WordPress, we recommend using a tool like Yoast, as they easily create one for you.
Website crawling technology (like ours at Ryte) can analyze your website and sitemap, and return any cases where either you’ve included a non-indexable URL in the sitemap, or you’ve left out an important URL from your sitemap. To start a crawl, you can start a free trial with Ryte.
Listing all your XML sitemaps within the robots.txt is best practice, like so (the URLs here are placeholders):
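```
# Illustrative sitemap references – list every sitemap or sitemap index you have
Sitemap: https://www.example-shop.com/sitemap-products.xml
Sitemap: https://www.example-shop.com/sitemap-categories.xml
```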
Every website is different and has its own technical challenges to overcome. Believe me – I’ve audited and given recommendations for hundreds of sites, yet every week I still find something new.
That being said, for ecommerce websites there are a handful of common problems that we come across frequently. These are caused by the conventions of having branching navigations, content-heavy category pages, unstable inventories, and plenty more quirks!
We won’t cover things like page speed however, because load time is a common issue for websites of all types, and we’ve already covered that in detail in our guide to page speed.
Likewise, on-page SEO is covered in the SEO Starter Kit that we produced together with Hubspot. Elsewhere on the Ryte Magazine, we’ve also got useful resources on schema markup and structured data in this ultimate guide.
In this subsection, I’ll run through those common issues, which are:
Duplicate or thin content
Faceted Navigation
Internal Search Result problems
Out of Stock pages
Empty category pages
Broken content
Let’s dig into them a little deeper!
A site filled with duplicate content can be detrimental to your website’s organic success, as Google views this as spam and may even algorithmically demote websites where the problem is widespread. A key goal of Google is to serve unique, valuable pages to searchers, so duplicate content goes against that!
Therefore it’s best to nip this problem in the bud, using the Ryte platform. Under Quality Assurance, you can use the “Similar Pages” report to uncover duplicate content issues:
In the following points, I’ll be running through the three main causes of duplicate or thin content, which are: faceted navigation, internal search results, and out-of-stock pages.
A huge cause of duplicate content problems for online shops is faceted navigation (aka faceted search), which is covered in far more detail in our magazine article here.
Category pages usually have filtering and sorting functionality that allows users to narrow down results by price, colour, size, and so on. Applying one of these filters usually appends a new string to the URL, like ?color=red or ?size=12, and often results in a URL that exists in its own right.
Give it a try! Head to your favorite category page and click some filters. Now check if the URL has changed.
Imagine all the filter and sorting combinations your online shop has, and all the URLs this can generate! In many cases, no priority is given to the order in which the filters were applied (see the example below). There could be thousands or even millions of these faceted navigation pages with different URLs but no real unique content. Scary!
To deal with this issue accordingly, you should apply a rule to make sure that all faceted navigation pages have a canonical tag pointing back to the primary category page, are set to “noindex”, and are made more difficult for search engines to crawl.
(Credit: Sam Underwood)
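As an illustration, the head of a filtered URL such as /shoes?color=red (the path and domain here are made up) could combine both signals along these lines:

```html
<!-- Illustrative handling of a faceted URL like https://www.example-shop.com/shoes?color=red -->
<!-- Point the canonical at the primary category page... -->
<link rel="canonical" href="https://www.example-shop.com/shoes">
<!-- ...and keep the filtered variant out of the index -->
<meta name="robots" content="noindex, follow">
```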
Note: It can sometimes make sense to have these facet pages indexable, but only when there are significant amounts of search volume and results. Always analyze which filter combinations deserve their own page, and kill the rest!
Something else that could cause index bloat, or create pages considered duplicates, is your internal search engine. Every time a user carries out a search to find a product they’re looking for, a URL will be generated like shop.com/search?a-nice-product.
It makes no sense at all to have these pages indexed, and if you receive many searches on your domain, you could be spewing out hundreds or thousands of thin or irrelevant pages that Google doesn’t want to waste its sweet time on.
The good news is: there’s a quick and easy solution to avoid this problem and it’s very similar to how you’ve just fixed the faceted search problems, so you can handle this fix in one process.
Set up a rule so that all internal search result pages are automatically “noindex”. (Additionally, for a stronger user experience you can 301 redirect search users to a category page if there is a direct match).
Once Google knows not to index these pages (they will slowly disappear from the index over time), block search engines from crawling them by setting up a disallow rule in your robots.txt. Remember that? Jump back.
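For example, if your internal search results happen to live under a /search path (your own URL pattern may well differ), the disallow rule could look like this:

```
# Illustrative rule – block crawling of internal search result pages
User-agent: *
Disallow: /search
```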
This is, naturally, a problem that’s unique to online shops! Out-of-stock pages occur when your products are no longer available (duh)! That’s a normal situation, but dealing with them appropriately can make or break SEO performance – especially when you have many of them.
The ideal way to handle OoS (out-of-stock) pages depends on whether the item will return or not. Should the item be coming back in stock soon, you can simply keep the page live and indexable. For your conversion optimization, it’s also good practice to add a notification field so users can ask to be alerted when it returns.
If the product is discontinued, you should make it clear to Search Engines and visitors that it will never be in stock again. Set the page as “noindex” and remove all incoming internal links or references (like mentions within your sitemap) to the page. Over time, Google will remove this page from the index so that organic visitors can’t find it anymore.
Note: Product schema lets you mark up the page with details regarding the product’s status and whether it’s in stock, out of stock, seasonal, and more. This feeds some explicit information to search engines and can be displayed in your rich results.
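A stripped-down JSON-LD sketch of such markup might look like the example below – the product name, price and currency are placeholders, and a real implementation would include more properties:

```html
<!-- Illustrative Product schema – all values are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Vegan Gummy Bears",
  "offers": {
    "@type": "Offer",
    "price": "4.99",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/OutOfStock"
  }
}
</script>
```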
Sometimes your CMS or (overzealous) content team members might create categories without considering the full implications of doing so.
Having really niche category pages can be helpful for targeting long tail queries (i.e. lower search volume, but more specific keywords), but what if there are no products available in that category anymore? You now have an empty page that will confuse incoming visitors and search engine crawlers.
This issue is best resolved by tackling the root cause. Working with your CMS and ecommerce teams, you should be the voice of authority with a strict say on what constitutes a category. For a category to be created, it should be unique and contain at least 3-5 different products at all times.
Removing existing empty category pages from the index can be done by adding a noindex tag and removing all internal links to the page; search engines should then eventually stop visiting them.
Quick tip: If you want to locate your empty pages (or any pages that match a specific rule in mind) you can create Custom Snippets in the Ryte tool. These are customized rules you give our crawler, and the results are returned in their own report and enrich existing ones.
Broken pages lead to frustrated visits, especially when the page has been linked to prominently.
Let’s take the below instance as an example. Lush has added text to the page of a discontinued product, encouraging visitors to seek out a similar alternative.
However, when clicking on this link, the user is taken to a broken 404 (not found) page instead. Let’s face it – that’s pretty infuriating!
Being unable to complete these goals makes for a frustrating visit, which can lead to some negative thoughts about your brand and website. Google will also reduce crawling resources for a website with a large proportion of broken pages, so it’s generally a good idea to clean them up and prevent them from happening.
A broken page fix depends on what is causing the issue. If the page no longer exists, you should simply stop internally linking to it. If the page was deleted by accident, just bring it back to life using your page’s back-up.
Should the link have been entered incorrectly, you need to address that in the page’s HTML. I’ve tried to cover all of those fixes in the flow chart below:
Quick tip: check out my webinar recording on how to solve common technical pitfalls for online shops where I jump into these issues in more detail!
As you’ve hopefully realized by now, ecommerce platforms have their own unique challenges, and require some specialized SEO attention.
But the good news is that as these challenges are so common, there are established methods for dealing with them. So: think safety first, ensure your crawling and indexing are set up correctly, and check that none of the most common problems are happening. Ryte can help.
I hope you found this guide helpful, and would love to hear any feedback, or unique challenges you have with technical SEO for ecommerce!
And don’t forget to check out our related guides to ecommerce content strategy, keyword research for ecommerce and internationalizing your online shop.
Published on Feb 3, 2022 by Izzi Smith