I saw this blog post on Hacker News, and it was notable enough that I’ve been thinking about it for the past week. I disagree with its major points for technical reasons, but I agree that you should approach SEO as if it were true.
But first, I want to make a distinction here. When Google hits a website and looks at its content for possible inclusion in its search index, we call that “spidering”. That’s not a word plucked out of nowhere – we call web crawlers searching for content “spiders”, and there’s a long technical history behind that.
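The crawling step is the one site owners have some say over, via robots.txt. Here is a minimal sketch of how a well-behaved spider consults robots.txt before fetching a URL, using Python’s standard-library parser; the rules and the “ExampleBot”/example.com names are hypothetical, not any real site’s policy. Note that robots.txt only asks crawlers not to fetch pages – a disallowed URL can still end up in an index if other pages link to it.

```python
# Hypothetical robots.txt rules parsed with the stdlib urllib.robotparser.
# A compliant crawler checks can_fetch() before spidering each URL.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
])

print(rp.can_fetch("ExampleBot", "https://example.com/admin/login"))  # False
print(rp.can_fetch("ExampleBot", "https://example.com/blog/post"))    # True
```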
In my experience, Google spiders basically everything – even places you might wish Google didn’t find, such as admin pages. And frankly this makes sense – spidering your website doesn’t only tell Google about your website; it also gives Google information for ranking other web pages. For example, Google sees the sites you link out to, which feeds into the PageRank calculations that determine how those pages rank. A second example: by spidering all the web pages it can find, Google can detect scraped/duplicate content and possibly flag the offending domain (not necessarily your domain!) for SEO penalties.
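To see why outbound links from your site matter to other pages’ rankings, here is a toy PageRank power iteration over a hypothetical four-page link graph (the `*.example` page names and the graph are made up for illustration; the 0.85 damping factor is the classic value from the original algorithm, not anything Google has published about its current system):

```python
# Toy PageRank: each page's rank is repeatedly redistributed along its
# outbound links, damped so every page keeps a small baseline share.
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: everyone links to c.example, so it ranks highest,
# while d.example (no inbound links) keeps only the damped baseline.
graph = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(graph)
```

The point for SEO: who `a.example` chooses to link to changes the scores of `b.example` and `c.example`, which is exactly the information Google picks up by spidering `a.example` itself.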
So if there is an incentive to spider everything, you can see where I disagree with the blog post:
I think it’s very unreasonable to say “Google is no longer trying to index the entire web.” There are huge incentives for Google to spider and at least know about the entire web, even if it doesn’t actually show every page it knows about in its search results.
First off, most people don’t go past the first page of search results anymore. For a majority of searches, Google’s AI summary or the first few results (ads or not) will contain the answer. 60% of searches don’t even result in a click to an outside web page. So even if Google knows about additional websites that might match the search, is it worth the computing power to resolve the rankings much below the 20th result slot?
There’s a human analog here: people do not want to hear additional details. They want you to get to the point as fast as possible. Here’s a Miss Manners article on “Is there any polite way to encourage someone who is recounting an anecdote to you to come to the point a little faster?” I find it reasonable to assume Google search is simply getting to the point: not showing sites whose relevant information is already available on the higher-ranked competing pages.
So in short, I disagree with this blog article on a technical basis. I don’t think it’s so easy to say that because a web page isn’t showing up in a Google search, Google didn’t see it, doesn’t care about it, or left it out of the index.
On the other hand, I think the blog’s deeper point is true. We’ve reached the point in the Internet where there are lots of good competing information sources. If you want to launch a competitor, you need a value proposition and a niche: a place where you can get started. For example, suppose you have a Pizza Hut, Papa Johns, (insert your favorite pizza place here) in your town. Your townspeople are generally happy with the pizza available, and there’s no obvious need for another pizza place. If you want to launch a new pizza restaurant, you can’t just say, “We sell pizza.” You have to have a value proposition different from Pizza Hut/Papa Johns/etc.: maybe your pizza is meatier, cheesier, or has a better crust than the competitors’.
The same goes for content: if you want to launch a new website, you need a value proposition different from what your competitors are offering if you want a spot in the Google search rankings. Especially if you’re a smaller blog, you need to develop a following as an expert in some niche in order to compete with bigger, better-funded competitors.