Many an ecommerce site migration has been scuppered by technical SEO issues. Don’t let it happen to you! From blocking crawlers in staging, to optimizing your faceted navigation and more, we share 6 issues to consider.
Going through an ecommerce website migration is a challenge. You only get one shot at launching the new version of your site, and if you don’t optimize it from the standpoint of technical SEO, you might lose all your rankings and traffic overnight.
Migrating is difficult, and even large, well-funded ecommerce stores regularly fail at it. The main reason is the lack of technical SEO quality assurance.
Technically, nothing stands in the way of properly preparing your site for the move. When you use a staging environment to orchestrate the migration, you have the opportunity to test how search engines will look at your new site.
But it takes some experience to know exactly what to look for.
This article will tell you about the main areas you need to look at in your staging environment to ensure the success of your new site as it goes live.
And remember to check out Ryte’s complete guide to ecommerce SEO.
The best practice when migrating your website is first to build a staging environment that lets you develop and test the new version of your site before it goes live. But you must remember that unless you prevent search engines from accessing your staging site, they will crawl and index it. Bad news.
As a quick test, let’s find out how many staging sites are indexed in Google. Here is a search operator string I used:
site:uat.*.com OR site:*uat.*.com OR site:stage.*.com OR site:staging.*.com OR site:test.*.com OR site:testing.*.com
Look at the number below the search bar ‒ all these pages are indexed and need serious cleaning. And I only used some of the most popular staging subdomain identifiers – you could find more examples.
Here’s why getting your staging environment indexed is bad:
It opens up security vulnerabilities.
It may trigger duplicate content issues as Google has access to your old site, your staging environment, and your new site once it goes live.
If your competitors find your staging environment and can access it, they will know your next move.
How do you prevent this from happening? Protect your staging environment at the server level using HTTP authentication.
If you protect your staging site by asking for login credentials, search engine crawlers won’t access it, and nothing will get indexed. It’s a simple fix that not only works well for SEO but also protects your sensitive data from falling into the wrong hands.
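If you want to double-check the lock from the outside, a short script can confirm that the staging host actually demands credentials. Below is a minimal sketch in Python using the requests library; the staging URL and the credentials are placeholders you’d replace with your own.

```python
import requests

# Hypothetical staging URL - replace with your own host.
STAGING_URL = "https://staging.example-shop.com/"

# Without credentials, the server should answer 401 Unauthorized, never 200.
anonymous = requests.get(STAGING_URL, timeout=10)
print("No credentials:", anonymous.status_code)  # expect 401

# With valid HTTP Basic credentials, the page should load normally.
authenticated = requests.get(
    STAGING_URL, auth=("staging-user", "staging-password"), timeout=10
)
print("With credentials:", authenticated.status_code)  # expect 200
```

If the first request comes back with a 200, your staging environment is open to crawlers and needs to be locked down before you go any further.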
2. Make sure crawlers can render your crucial content
These days, it’s common for ecommerce websites to migrate to modern JavaScript frameworks. While these frameworks offer fantastic features, you need to make sure Google can successfully render your new JavaScript-powered website and see all of your content before you go to production.
On top of that, while Google is good at processing JavaScript, I can’t say the same for all search engines – some of them don’t process JavaScript at all. This means that if content on your site only loads after JavaScript executes, those search engines won’t index it, and you won’t rank in their search results.
In the past, I came across an ecommerce site dealing with office and stationery supplies. Unfortunately, the website was built so that Google couldn’t see the actual content of its category pages. Why? Because it was injected using misconfigured JavaScript.
The screenshot above represents the rendered version of the website’s category page. Googlebot didn’t see anything else except for the View All button. The vital content didn’t get rendered, resulting in a blank page. As you can probably guess, this page wasn’t ranking for anything useful.
In the case of this website, we can assume that:
Crawlers may not have differentiated category pages as they were all similar – or similarly empty.
Canonical directives may have been ignored since all category pages were identical.
The authority within the site may not have been appropriately distributed if the links to products weren’t discoverable.
What’s the lesson here? Before you push your site live, you must ensure crawlers can render your crucial content. I know this may sound geeky, but here’s a step-by-step guide you can follow:
First, compare the rendered HTML with the raw HTML of the page in Chrome DevTools. Comparing the two versions helps you spot content and link elements that should be in the rendered HTML but aren’t.
To access Chrome DevTools, right-click on any element and select “Inspect”. Or, press Command+Shift+C or Command+Option+C (Mac), or Control+Shift+C (Windows).
Next, change the user agent to Googlebot, which you do in the Network Conditions tab.
You have two options, either:
Use the DevTools Run command option (press Control+Shift+P on Windows or Command+Shift+P on Mac), or
Use the Customize And Control DevTools (shown below).
Uncheck the “Use browser default” box. Then, select your preferred user agent and refresh the page but don’t click anywhere on the page while testing.
Remember to choose a link or a fragment of your critical content based on which you want to test the rendering of your page. Then, try to spot any differences between both versions of your site.
The “Elements” panel shows your rendered HTML, also known as the DOM (the initial HTML plus any changes JavaScript made). Now check whether the links and content that should exist are visible in this code (without clicking on the page).
Now, go to the “Network” panel, refresh the page and click on the HTML document. Navigate to the “Response” code section and search for your content/link (this is the initial HTML).
You can also easily compare the rendered HTML and raw HTML with a Chrome extension – View Rendered Source.
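If you’d rather script this comparison than repeat it by hand for every template, the sketch below shows one way to do it. It assumes Python with the requests library for the raw HTML and Playwright’s headless Chromium for the rendered HTML; the URL and the content fragment are placeholders you’d swap for your own.

```python
import requests
from playwright.sync_api import sync_playwright

# Placeholders - use your own category page URL and a fragment of critical content.
URL = "https://www.example-shop.com/office-chairs/"
FRAGMENT = "Ergonomic office chairs"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Raw HTML: what the server returns before any JavaScript runs.
raw_html = requests.get(URL, headers={"User-Agent": GOOGLEBOT_UA}, timeout=10).text

# Rendered HTML: the DOM after a headless browser has executed the JavaScript.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(user_agent=GOOGLEBOT_UA)
    page.goto(URL, wait_until="networkidle")
    rendered_html = page.content()
    browser.close()

print("Fragment in raw HTML:     ", FRAGMENT in raw_html)
print("Fragment in rendered HTML:", FRAGMENT in rendered_html)
```

If the fragment shows up only in the rendered HTML, your content depends entirely on JavaScript; if it shows up in neither, crawlers can’t see it at all.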
Website elements that are loaded onto the page only after a user clicks or scrolls aren’t something crawlers can usually access. Googlebot doesn’t click or scroll.
Ecommerce sites often display technical specifications, product features, and additional information behind tabbed sections. It’s essential that Google can pick this content up, but sometimes it’s built to require a click action to load.
Do you have tabbed content on your website? If yes, check if it can be rendered.
Copy a text fragment from your tabbed content. Use the “Network Conditions” tab in Chrome DevTools and select the Googlebot user agent. Then refresh the page and ensure you don’t click on any elements while testing. Now, check if the fragment you chose is in the “Elements” tab.
If you can’t find it, that means that Googlebot won’t find it either. Make changes to your site so that this content is available in the source code without having to render JavaScript. It will allow crawlers to access important content quickly and avoid rendering issues.
If your new website is misconfigured, different user agents (like Googlebot vs. a regular user) might receive different values in the HTML elements that are crucial from an SEO perspective.
It’s often the case that the initial HTML file contains one set of values, and these values are replaced once JavaScript is executed. This confuses search engine bots because they don’t know which version of these elements they should consider.
The elements you need to pay particular attention to are:
Canonical tags
These HTML elements are essential for SEO, and you should ensure crawlers get the version you want them to see, both with and without rendering JavaScript.
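One way to catch such a mismatch before launch is to extract the canonical tag from both the raw and the rendered HTML and compare them. The minimal sketch below uses BeautifulSoup (an assumption on my part); in practice you would feed it the raw_html and rendered_html captured in the earlier snippet rather than the inline example strings.

```python
from typing import Optional
from bs4 import BeautifulSoup

def canonical_of(html: str) -> Optional[str]:
    """Return the href of the first rel=canonical link in an HTML document."""
    tag = BeautifulSoup(html, "html.parser").find("link", rel="canonical")
    return tag.get("href") if tag else None

# Inline stand-ins: the raw HTML declares one canonical, the rendered DOM another.
raw_html = '<head><link rel="canonical" href="https://shop.example/category/"></head>'
rendered_html = '<head><link rel="canonical" href="https://shop.example/category/?page=1"></head>'

if canonical_of(raw_html) != canonical_of(rendered_html):
    print("Canonical mismatch - crawlers may pick either version.")
```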
Faceted navigation makes it easier for users to find the products they’re looking for. But from an SEO perspective, it’s hazardous if you leave it unoptimized.
The filters within your faceted navigation can generate multiple copies of the same category page. If Google can access these copies without any restrictions, it will attempt to index all of them, leaving you with duplicate content issues.
Botify analyzed an ecommerce website with fewer than 200,000 product pages. Because of unoptimized faceted navigation, this website had over 500 million pages that Googlebot could access.
How to address this issue? Determine which facet filters should be crawlable and indexable.
Every ecommerce site should develop an indexing strategy for facet filter-generated pages. Some of those pages can drive meaningful organic traffic, while others are duplicates, and nobody will ever look for them, so they should never be indexed.
The two questions you should answer for each faceted category page are:
Does this page answer meaningful search demand?
Do you have a sufficient number of products on this page to justify getting it indexed?
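One way to make these two questions operational is to encode them as a simple rule: a facet page only earns indexing if it targets a filter people actually search for and lists enough products. The Python sketch below is purely illustrative; the facet whitelist, the product threshold, and the example pages are made-up values you would replace with your own keyword and catalog data.

```python
# Illustrative only: decide the meta robots value for a facet filter page.
INDEXABLE_FACETS = {"colour", "brand", "material"}  # facets with real search demand
MIN_PRODUCTS = 5                                    # thin pages are not worth indexing

def robots_directive(facets: dict, product_count: int) -> str:
    """Return the meta robots value for a faceted category page."""
    meaningful = len(facets) == 1 and set(facets) <= INDEXABLE_FACETS
    enough_products = product_count >= MIN_PRODUCTS
    return "index, follow" if meaningful and enough_products else "noindex, follow"

print(robots_directive({"colour": "black"}, 42))                # index, follow
print(robots_directive({"colour": "black", "size": "xl"}, 3))   # noindex, follow
```

However you implement it, the point is that the decision is made deliberately for each facet combination rather than left to whatever URLs the navigation happens to generate.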
Also, follow Google’s best practices for faceted navigation.
Finally, remember that every ecommerce site should develop an indexing strategy for facet filter-generated pages.
Don’t forget to check if all your redirects are implemented correctly before your new site goes live.
Although implementing redirects isn’t difficult, any mistake can be costly:
If you redirect your users to content that doesn’t meet their needs, you’ll damage the user experience and discourage them from using your site.
If you mistakenly create redirect chains or redirect loops, you’ll waste the crawlers’ time, sending a negative signal about the technical quality of your site.
Before your site goes live:
Map out your redirects, outlining which pages will be gone after the site migration,
If your staging site sits on a separate domain or subdomain, adjust the URLs in your redirect map accordingly so you can test them there,
Review whether all of your 301 redirects lead to a page with a 200 status code (a scripted check is sketched after this list),
Avoid redirect chains by redirecting each old URL straight to its final destination, using absolute URLs,
Ensure that user intent matches the purpose of the redirected pages.
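A short script can run these checks against your redirect map before the switch is flipped. The sketch below is a minimal example: it assumes the map lives in a Python dict and uses the requests library, where response.history exposes every hop so you can flag chains as well as broken targets. The URLs are placeholders.

```python
import requests

# Hypothetical redirect map: old URL -> expected final URL on the new site.
REDIRECT_MAP = {
    "https://staging.example-shop.com/old-category/": "https://staging.example-shop.com/new-category/",
    "https://staging.example-shop.com/old-product/": "https://staging.example-shop.com/new-product/",
}

for old_url, expected_url in REDIRECT_MAP.items():
    response = requests.get(old_url, allow_redirects=True, timeout=10)
    hops = response.history  # every intermediate redirect response

    if not hops or hops[0].status_code != 301:
        print(f"{old_url}: first hop should be a 301")
    if len(hops) > 1:
        print(f"{old_url}: redirect chain with {len(hops)} hops - point it straight at the target")
    if response.status_code != 200:
        print(f"{old_url}: final page returned {response.status_code}, not 200")
    elif response.url != expected_url:
        print(f"{old_url}: landed on {response.url}, expected {expected_url}")
```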
Migrating a website is complex, and any error can be very costly. But that doesn’t mean you can’t successfully migrate your ecommerce website. All you need is a sound plan, created well in advance and thoroughly executed.
I hope this article helped you learn about the areas you should take care of on your staging site. Get them fixed before you launch your new site to avoid any trouble.
Published on Mar 4, 2022 by Jasmita D'Souza