At the end of November 2020, Google published a new version of the “Crawl Stats” report, which replaces the old Search Console report.
This new tool undoubtedly gives us much more information about how Google’s bots crawl our website, along with other, more technical details that can help us optimize our crawl budget.
It’s like a small SEO audit showing the state of health of our site.
The bigger your web project, the more benefits you will get from this tool, as you will be able to draw more conclusions and have much more detailed information.
Throughout this article, we’ll look at each of the new features included in this new report, also known as “Crawl Stats”.
We leave the link to the report for you so you can jump right in and review the sections as you read our article:
You can also access it from your Search Console account, under Settings > Crawl stats > Open report.
What will I find in this article?
First impressions of the new report
This is the new graph presented in this section. By default it shows the crawl requests sent to our website during the indicated period, usually the last 3 months.
The other two metrics available are:
- Total download size: the size in bytes of the main resources downloaded during each crawl: HTML, images, CSS, etc.
- Average response time: the average time it takes to retrieve all the content of a page.
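As a rough illustration of these two metrics, here is a minimal Python sketch that times page fetches and sums downloaded bytes. The helper names are ours, and this is of course not how Google measures the metrics internally:

```python
import time
import urllib.request
from statistics import mean

def fetch_stats(url):
    """Fetch a URL and return (elapsed_seconds, bytes_downloaded)."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    return time.monotonic() - start, len(body)

def summarize(samples):
    """Average response time and total download size for (seconds, bytes) pairs."""
    return mean(t for t, _ in samples), sum(s for _, s in samples)
```

For example, `summarize([(1.0, 100), (3.0, 300)])` returns an average of 2.0 seconds and 400 bytes in total.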
If there’s an update in Google Search, you’ll also see it shown in the graph on the date that update occurred:
From this section you can see, at a glance, which of your primary domains or subdomains had server issues and went down at any given time.
If you want to see the data for a particular domain, clicking on it is enough to filter the crawl information for that domain only; otherwise the data is shown aggregated for all of them.
You can see the current state of your domains thanks to the icons shown next to them. If you’ve had a server outage or another issue that kept Google from crawling, the icon will appear in red.
Crawl request breakdowns
With this new data we can draw better conclusions and spot some of the issues that most affect our crawl budget, so we can optimize it.
Requests by response
In this grouping we can see, as percentages, all the response codes returned by the URLs of our website.
The entries shown are just examples of the problems Google found on our server.
For now there is no option to export the full list, so we’ll have to settle for this sample. Even so, it serves as a guide to the cases involved; as we fix them, any further issues will appear in the list.
These codes cover different responses, which we can group into:
OK (200): Pages that are currently working; these are no problem, as they’re the URLs we have active on our website.
Permanently moved (301): Pages with a redirect currently in place to other pages. They’re not a problem unless misused. This type of response is very common in SEO migrations; otherwise we should try to minimize the number of redirects so that Google doesn’t have to deal with so many hops.
If you have an ecommerce site, you’re surely wondering how to handle redirects for products that are no longer in stock. This article will tell you how to handle it.
Typically, the highest percentage in this section should belong to the grouping of pages returning a 200 code.
Not found (404): These are the kinds of pages we should avoid having on our website, and this report lets us correct many of these errors.
If you navigate into this section, you’ll see the list of 404s along with the date each URL was found. Review this list regularly to reduce your errors until the highest percentage of requests by response carries a 200 code.
These three response code groupings are usually the most common; however, other errors may also appear, such as: DNS not responding, robots.txt file not available, temporarily moved (302), and even server errors (500).
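If you want a similar breakdown for your own list of URLs without waiting for Google to surface it, you can crawl them yourself and group the status codes. A minimal sketch, where the labels are ours and chosen to mirror the report’s groupings:

```python
from collections import Counter

# Labels mirroring the report's groupings; anything else falls into "Other".
LABELS = {
    200: "OK (200)",
    301: "Moved permanently (301)",
    302: "Moved temporarily (302)",
    404: "Not found (404)",
    500: "Server error (500)",
}

def group_by_response(status_codes):
    """Return the percentage of requests per response-code grouping."""
    counts = Counter(LABELS.get(code, f"Other ({code})") for code in status_codes)
    total = len(status_codes)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

For example, `group_by_response([200, 200, 200, 301, 404])` reports 60.0% OK (200), 20.0% moved permanently (301) and 20.0% not found (404).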
Requests by File Type
As the name suggests, Google also differentiates the resources on our website based on the type of file.
Most of that percentage should be in HTML files, since those are the pages on our website and what we want Google to crawl.
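If you have a URL list of your own (from a sitemap or your logs), a quick way to approximate this grouping is to guess each URL’s MIME type from its extension; extensionless URLs are usually HTML pages. A minimal sketch:

```python
import mimetypes
from collections import Counter

def group_by_file_type(urls):
    """Count URLs per MIME type guessed from the file extension.

    Extensionless URLs are assumed to be HTML pages.
    """
    counts = Counter()
    for url in urls:
        mime, _ = mimetypes.guess_type(url)
        counts[mime or "text/html"] += 1
    return counts
```

For example, `group_by_file_type(["/product", "/css/main.css", "/img/logo.png"])` counts one HTML page, one CSS file and one PNG image.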
Requests by purpose
In this section, we can only see two options:
Refresh: the Google robot recrawls pages that already exist and have been updated, whether due to content changes or other modifications.
Discovery: Google has discovered new pages on our website.
There isn’t an ideal percentage for either of them, but what interests us is that Google keeps crawling updated content on the site, with the goal of increasing our crawl budget. The more new content you create, the better.
Requests by type of Google robot
Which robot crawls our website the most?
From this section we can find out how often the different Google robots visit us.
The highest percentage should go to the Smartphone crawler, due to mobile-first indexing. Depending on the kind of web project you have, either the Desktop Googlebot or the Image robot will follow. You can also find AdsBot, used for Google Ads.
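If you do want to cross-check this against your own server logs, a simplified sketch that tallies log lines by user-agent substring follows. The signatures are common Google crawler patterns we assume appear in the log; a production check should also verify that the requesting IP really belongs to Google:

```python
from collections import Counter

# Checked in order: more specific signatures first (assumed UA substrings).
BOT_SIGNATURES = [
    ("Googlebot-Image", "Googlebot Image"),
    ("AdsBot-Google", "AdsBot"),
    ("Googlebot", "Googlebot (smartphone or desktop)"),
]

def count_google_bots(log_lines):
    """Count access-log lines per Google crawler type."""
    counts = Counter()
    for line in log_lines:
        for signature, label in BOT_SIGNATURES:
            if signature in line:
                counts[label] += 1
                break  # attribute each line to one crawler only
    return counts
```

Feeding it raw access-log lines returns a counter such as `{"Googlebot (smartphone or desktop)": 1, "Googlebot Image": 1}`.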
Conclusions on the new Crawl Stats tool
Without a doubt, this new tool gives us an excellent view of data we can take advantage of, such as resolving 404 errors, detecting 301 redirects, and seeing which type of robot visits our website the most... all without needing to do a log analysis.
As mentioned before, Google doesn’t give us all the data, just a sample of it, so that we can fix these errors little by little. Even so, we believe this tool will let us correct many problems that we previously couldn’t see without at least some technical knowledge.