Leaky Pipes: Search engine optimization (SEO) techniques try to improve the quality and, more frequently, the quantity of keywords to "game" Google's algorithms and climb to the top spots of search engine results pages (SERPs). Nobody except Google knows exactly how those algorithms work, but a recent leak could shed light on some of the internet's most guarded secrets.
Google's automated bots recently committed some confidential documentation to GitHub, describing how to use the company's Content Warehouse API. The commit appears to have been a mistake, and Google later tried to undo the leak, but the effort was unsuccessful. The cat is now out of the bag, and SEO experts are examining the leaked documentation to try to uncover what the Content Warehouse API does.
Erfan Azimi, CEO of SEO company EA Digital Eagle, was the first to spot Google's documentation. He later disclosed it to other SEO specialists. The Content Warehouse API appears to be a tool intended for internal use by Google personnel.
The errant commit reveals some previously unknown details about how Google's search engine works and the many thousands of attributes used by the Content Warehouse API. According to the documentation, Google Search classifies web content using more than 14,000 different attributes. However, there are no details about how much "weight" each attribute carries in search indexing.
The leaked documents also contradict some of Google's previous statements about search, such as its claim that click-centric user signals are not considered in content indexing. Google said subdomains are considered separately in rankings, but the Content Warehouse documentation doesn't support that assertion. Other contradictions involve the use of a sandbox for newer websites, the assignment of an "authority score" that gives a site a higher position in SERPs, and more.
Google also appears to use some questionable measurements in its site rankings. For example, one of Content Warehouse's modules uses Chrome views as a website quality metric. So, sites with more visits from Chrome users will rank higher, all other factors being equal.
Many SEO professionals and analysts, who work in one of the most controversial industries related to internet search, will likely study Google's Content Warehouse documentation over the next several weeks. So far, Mountain View has provided no official statement about the potentially devastating leak. However, it likely has engineers working overtime to mitigate the consequences.