Google’s John Mueller recently explained that HTTP status codes are the first thing Google checks when crawling content.
This topic came up during the Google Webmaster Central Hangout on October 18. Here is the question that was submitted:
“Wondering if Google checks status codes before anything else, like before rendering content?”
In response, Mueller confirmed that’s correct. Google does check the status codes before rendering or indexing content.
Specifically, Google will check for a ‘200’ status code before it renders or indexes a page. A 200 status code indicates to Google that it’s crawling a valid page and there might be content worth indexing on it.
On the other hand, if Google encounters a 400 or 500 error, or a redirect, then it would not proceed with rendering the content for indexing.
Mueller specifically points out that Google does not render 404 pages for indexing. So if you’re designing a fancy 404 page for your site, keep in mind that only human visitors will end up seeing it.
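To make the rule concrete, here is a minimal Python sketch of the decision Mueller describes. It is only an illustration of his statement, not Google's actual code; the function names `would_render` and `describe` are hypothetical.

```python
from http import HTTPStatus


def would_render(status_code: int) -> bool:
    """Return True only for a 200 response, following Mueller's description."""
    return status_code == HTTPStatus.OK


def describe(status_code: int) -> str:
    """Summarize what the article says happens for a given status code."""
    if status_code == HTTPStatus.OK:
        return "render and consider for indexing"
    if 300 <= status_code < 400:
        return "redirect: not rendered"
    if 400 <= status_code < 600:
        return "error: not rendered"
    # Anything else falls outside what Mueller addressed in his answer.
    return "not covered by Mueller's answer"


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(code, "->", describe(code))
```

Under this reading, a custom 404 page that correctly returns a 404 status falls into the "error: not rendered" branch, no matter how polished its HTML is.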
By default, Google does not render anything unless the URL returns a 200 status code. Hear Mueller’s full response in the video below, starting at the 26:38 mark:
“Yes we do check the status codes before we index the content, or render the content. In particular, if it’s a status code 200 then that’s like a sign that there’s something here that we might be able to index. If it’s a 400 or 500 error, or a redirect, then obviously those are things we wouldn’t render.
So if you have a really nice 404 page, then that’s not something that we would see for indexing. Similarly, if your page returns 404 by default … well if it returns 404 then we just won’t render anything there anyway. So it should return 200.”
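If you want to confirm which status code one of your own URLs returns, a short script using Python's standard library is one way to do it. This is a sketch under assumptions: the helper name `fetch_status` and the example URL are hypothetical, and the request is a plain GET that does not follow redirects, so a 301 or 302 is reported as-is. It does not reproduce how Googlebot itself fetches pages.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send one GET request and return the raw HTTP status code.

    http.client does not follow redirects, so 3xx codes are returned
    directly instead of the status of the redirect target.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical URL; replace it with a page from your own site.
    print(fetch_status("https://example.com/"))
```

A page you expect to be indexed should report 200 here. A deliberately missing URL should report 404, which is also a quick way to check that a custom error page is not being served with a 200 status by mistake.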