Google Says Most Common Reason For Blocking Googlebot Is Firewall or CDN Issues
Gary Illyes from Google posted a new PSA on LinkedIn saying that the most common reason a site unexpectedly blocks Googlebot from crawling is a misconfigured firewall or CDN.
Gary wrote, “check what traffic your firewalls and CDN are blocking.” He added, “By far the most common issue in my inbox is related to firewalls or CDNs blocking googlebot traffic. If I reach out to the blocking site, in the vast majority of the cases the blockage is unintended.”
So what can you do? Gary said, “I’ve said this before, but want to emphasize it again: make a habit of checking your block rules. We publish our IP ranges so it should be very easy to run an automation that checks the block rules against the googlebot subnets.”
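Here is a rough idea of what that kind of automation could look like in Python. This is only a sketch, not Google's tooling: it assumes the published ranges come as JSON with a "prefixes" list whose entries carry an "ipv4Prefix" or "ipv6Prefix" key, and that you can export your firewall or CDN block rules as a list of IPs or CIDR strings. The function names and the sample data (reserved documentation ranges, not real Googlebot IPs) are made up for illustration.

```python
import ipaddress
import json

# Placeholder data using reserved documentation ranges, NOT real Googlebot IPs.
# The JSON shape (a "prefixes" list with ipv4Prefix/ipv6Prefix keys) is an assumption.
SAMPLE_RANGES_JSON = """
{"prefixes": [{"ipv4Prefix": "192.0.2.0/27"}, {"ipv6Prefix": "2001:db8::/64"}]}
"""


def parse_googlebot_prefixes(ranges_json):
    """Turn the published JSON text into a list of ip_network objects."""
    data = json.loads(ranges_json)
    networks = []
    for entry in data.get("prefixes", []):
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if prefix:
            networks.append(ipaddress.ip_network(prefix))
    return networks


def find_blocked_googlebot_ranges(block_rules, googlebot_networks):
    """Return (block_rule, googlebot_subnet) pairs whose address space overlaps."""
    conflicts = []
    for rule in block_rules:
        # strict=False accepts single IPs and CIDRs with host bits set.
        rule_net = ipaddress.ip_network(rule, strict=False)
        for bot_net in googlebot_networks:
            same_family = rule_net.version == bot_net.version
            if same_family and rule_net.overlaps(bot_net):
                conflicts.append((rule, str(bot_net)))
    return conflicts


if __name__ == "__main__":
    # Hypothetical export of block rules from a firewall or CDN.
    block_rules = ["192.0.2.0/24", "203.0.113.7", "2001:db8:1::/48"]
    googlebot = parse_googlebot_prefixes(SAMPLE_RANGES_JSON)
    for rule, subnet in find_blocked_googlebot_ranges(block_rules, googlebot):
        print(f"Block rule {rule} overlaps Googlebot subnet {subnet}")
```

In practice you would swap the sample JSON for the contents of the file Google publishes and run the check on a schedule, so a new block rule that catches Googlebot is flagged before it hurts crawling.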
Gary linked to this help document for more details.
In short, test whether your site is accessible to Googlebot. One method is the URL Inspection tool in Google Search Console. Also, confirm with your CDN or firewall provider that they are allowing Googlebot, and ask them to prove it.
Forum discussion on LinkedIn.