Insights into the Google Document Leak

Uncover the secrets behind Google's ranking algorithm with insights from leaked internal documents.

The Leak: A Brief Overview

On March 13, 2024, an automated bot named yoshi-code-bot published thousands of documents from Google’s internal Content API Warehouse on GitHub. The leak was shared with Rand Fishkin, co-founder of SparkToro, and Michael King, CEO of iPullRank, and it describes many of the elements Google may use to rank content. The implications are significant: the documents could reshape our understanding of Google’s search algorithm and of SEO strategy.

The leaked documents shed light on Google's ranking algorithm, revealing a complex system of 2,596 modules and 14,014 attributes. Although the specific importance of each feature is undisclosed, their mere presence offers valuable insights into how Google prioritises content for search rankings.

Key Findings from the Leaked Documents

Link Importance

The leaked documents affirm the vital importance of link diversity and relevance in Google's ranking criteria. PageRank, a long-standing pillar of Google's algorithm, continues to play a crucial role, especially for a website's homepage.
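To make the PageRank idea concrete, here is a minimal sketch of the textbook power-iteration formulation. It is not taken from the leaked documents and says nothing about how Google computes PageRank today; the `pagerank` function, the damping factor of 0.85 and the four-page `site` graph are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    This is the classic published formulation, not Google's
    production system (an assumption made for illustration).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: every page links back to the homepage.
site = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(site)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the homepage collects the most rank because every other page links to it, while the page nobody links to ends up with the least.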

Click Metrics

According to the leaked documents, successful clicks play a pivotal role in rankings. Google measures clicks using several metrics, including badClicks, goodClicks, lastLongestClicks, and unsquashedClicks. The documents also cover content length: longer documents may be truncated, while shorter content is scored on originality, particularly Your Money or Your Life (YMYL) content such as health and news.
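The click attributes named above can be pictured as a simple record per page. The sketch below is purely illustrative: the leak lists the attribute names but not how they are defined or combined, so the `ClickSignals` class, the `good_click_share` ratio and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClickSignals:
    """Hypothetical container for click counters.

    Field names mirror attributes named in the leak; what each
    one measures, and how Google weighs it, is not disclosed.
    """
    good_clicks: float
    bad_clicks: float
    last_longest_clicks: float
    unsquashed_clicks: float

    def good_click_share(self) -> float:
        """Assumed metric: good clicks as a share of good plus bad clicks."""
        judged = self.good_clicks + self.bad_clicks
        return self.good_clicks / judged if judged else 0.0


# Invented numbers for two pages competing for the same query.
page_a = ClickSignals(good_clicks=80, bad_clicks=20,
                      last_longest_clicks=35, unsquashed_clicks=120)
page_b = ClickSignals(good_clicks=30, bad_clicks=70,
                      last_longest_clicks=5, unsquashed_clicks=110)

print(page_a.good_click_share(), page_b.good_click_share())
```

A ratio like this is only one way such counters might be compared; the practical point is that pages which satisfy searchers tend to accumulate more good clicks than bad ones.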

Freshness and Core Topics

Keeping content fresh is crucial, as Google assesses dates in bylines, URLs, and on-page material. The leaked documents also unveil Google's use of vectorization to ascertain if a page aligns with the website's core themes.
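One way to picture this topic check is to compare a page's vector with a vector for the site as a whole. The sketch below is an assumption-laden illustration rather than Google's method: word counts stand in for learned embeddings, and the `embed`, `cosine` and `site_centroid` helpers and the sample texts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def site_centroid(pages):
    """Sum the page vectors; cosine ignores scale, so no averaging is needed."""
    total = Counter()
    for text in pages:
        total.update(embed(text))
    return total


# Hypothetical site whose existing pages are all about SEO.
site_pages = [
    "seo audit checklist for technical seo",
    "link building guide for seo campaigns",
    "keyword research tips for seo content",
]
centroid = site_centroid(site_pages)

on_topic = cosine(embed("technical seo guide to link audit"), centroid)
off_topic = cosine(embed("best chocolate cake recipe"), centroid)
print(f"on-topic: {on_topic:.2f}, off-topic: {off_topic:.2f}")
```

The off-topic page shares no terms with the rest of the site, so its similarity is zero, while the on-topic page scores noticeably higher. A real system would use dense learned embeddings, but the comparison works the same way.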

Brand and Entities

Brand recognition is paramount, as highlighted in the leaked documents. Building a notable, popular brand can significantly enhance organic search rankings. Additionally, Google stores author information and uses entity recognition to associate content with its authors.

Site Authority and Chrome Data

Although Google has publicly denied it in recent years, the leaked documents suggest that site authority is still a factor in Google’s ranking algorithm. Furthermore, Google uses data from its Chrome browser for ranking purposes, as evidenced by a module called ChromeInTotal.

Key Takeaways

  • The leak offers new insights into the factors influencing Google's ranking system.
  • Link diversity, user experience, and brand visibility are essential for achieving SEO success.
  • Freshness and author authority continue to hold significant weight in search rankings.
  • Continuous adaptation and optimisation are key in navigating the ever-changing landscape of SEO.


The leaked Google documents give a deeper insight into the elements that influence search rankings. For digital strategists and SEO experts, they offer a guide for refining strategies and staying ahead in the constantly evolving field of search engine optimisation.


Paul Baguley

Paul was included as an SEO contributor in a book called Digital Marketers Sound Off. He was one of 101 accomplished digital marketing specialists sharing their favourite tools, tips, tactics, and predictions for the future.

