When building the new function, we took user feedback into account to ensure everyone could use it easily.
We have created pre-defined filters for the different components of a URL: protocol, domain, path, and query string. All you need to do is choose a component from the dropdown list and enter the protocol, domain, URL path, or query string that you want to include or exclude.
The following filter options are available:
Doesn’t start with
Doesn’t end with
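To make the behavior of these component filters concrete, here is a minimal Python sketch. It is purely illustrative and not Ryte's implementation: the function name `matches` and the option strings are our own, and only the two negated options listed above are modeled.

```python
from urllib.parse import urlsplit

def matches(url: str, component: str, option: str, value: str) -> bool:
    """Return True if the URL passes the given filter (illustrative only)."""
    parts = urlsplit(url)
    # Map the dropdown's URL components to the parsed pieces.
    text = {
        "protocol": parts.scheme,
        "domain": parts.netloc,
        "path": parts.path,
        "query string": parts.query,
    }[component]
    if option == "doesn't start with":
        return not text.startswith(value)
    if option == "doesn't end with":
        return not text.endswith(value)
    raise ValueError(f"unknown filter option: {option}")

# A page URL passes a "path doesn't end with .pdf" filter:
print(matches("https://en.ryte.com/wiki/SEO", "path", "doesn't end with", ".pdf"))
```

Each filter looks at exactly one URL component, so a filter on the path never inspects the domain or query string.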
When you’ve set up your desired filters, click "Save".
We recommend setting up the whitelist/blacklist function when you set up a new project so that you can track the progress of your optimizations more easily. Suddenly changing the number of URLs crawled would skew your data.
For example, excluding assets from your website crawl will affect your performance score in the Web Vitals Report: if a page is crawled without its assets, it loads faster and therefore appears to perform better than it actually does.
This feature helps you tailor your website analysis to your needs. It also saves crawl budget, as you can exclude URLs that are not relevant to your work.
Here are some specific cases where this feature is particularly useful:
Analyze specific subdomains
In this example, both the English and German subdomains of the Ryte website are analyzed, using the filters "Domain is en.ryte.com" and "Domain is de.ryte.com".
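Sketched in Python, two "Domain is" rules combine as a whitelist: a URL is kept if its domain matches either value. The helper `keep` and the sample URLs below are our own illustration, not part of the tool.

```python
from urllib.parse import urlsplit

# The two "Domain is ..." whitelist rules, combined with OR.
ALLOWED_DOMAINS = {"en.ryte.com", "de.ryte.com"}

def keep(url: str) -> bool:
    """Keep a URL only if its domain is one of the whitelisted subdomains."""
    return urlsplit(url).netloc in ALLOWED_DOMAINS

urls = [
    "https://en.ryte.com/wiki/SEO",
    "https://de.ryte.com/magazine/",
    "https://fr.ryte.com/",
]
# Only the en. and de. subdomains survive the filter.
print([u for u in urls if keep(u)])
```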
Analyze a subdomain, but focus on or exclude a directory
In this example, we analyze the subdomain en.ryte.com and the subdirectory "wiki":
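In code-sketch form, this combines a domain rule with a path rule using AND. Note that a "Path starts with" option is assumed here for illustration; the helper below is ours, not Ryte's.

```python
from urllib.parse import urlsplit

def keep(url: str) -> bool:
    """Keep URLs on en.ryte.com whose path is inside the /wiki directory."""
    p = urlsplit(url)
    # "Domain is en.ryte.com" AND "Path starts with /wiki"
    return p.netloc == "en.ryte.com" and p.path.startswith("/wiki")

print(keep("https://en.ryte.com/wiki/SEO"))       # inside the wiki directory
print(keep("https://en.ryte.com/magazine/post"))  # same subdomain, other directory
```

To exclude the directory instead of focusing on it, you would negate only the path condition.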
Ignore specific file types such as .jpg, .pdf, and .gif
In this example, we exclude URLs with the file types .jpg, .pdf, and .gif:
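The file-type exclusion amounts to a "path doesn't end with" rule for each extension. The following sketch (our own illustration, not the product's code) matches case-insensitively so that `logo.JPG` is excluded along with `logo.jpg`:

```python
from urllib.parse import urlsplit

# The three "doesn't end with" exclusion rules from the example.
EXCLUDED_EXTENSIONS = (".jpg", ".pdf", ".gif")

def keep(url: str) -> bool:
    """Drop any URL whose path ends with one of the excluded file types."""
    # str.endswith accepts a tuple of suffixes, checking all three at once.
    return not urlsplit(url).path.lower().endswith(EXCLUDED_EXTENSIONS)

print(keep("https://en.ryte.com/wiki/SEO"))          # kept
print(keep("https://en.ryte.com/images/logo.jpg"))   # excluded
```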
Published on Apr 15, 2020 by Olivia Willson