PageRank algorithm

A quote by James Surowiecki on Google, search engines, internet searching, and the PageRank algorithm

Google started in 1998, at a time when Yahoo! seemed to have a stranglehold on the search business – and if Yahoo! stumbled, then AltaVista or Lycos looked certain to be the last man standing. But within a couple of years, Google had become the default search engine for anyone who used the internet regularly, simply because it was able to do a better job of finding the right page quickly. And the way it does that – and does it while surveying three billion Web pages – is built on the wisdom of crowds.

Google keeps the details of its technology to itself, but the core of the Google system is the PageRank algorithm, which was first defined by the company’s founders, Sergey Brin and Lawrence Page, in a now-legendary 1998 paper called “The Anatomy of a Large-Scale Hypertextual Web Search Engine.” PageRank is an algorithm – a calculating method – that attempts to let all the Web pages on the Internet decide which pages are most relevant to a particular search. Here’s how Google puts it:

PageRank capitalizes on the uniquely democratic characteristic of the web by using its vast link structure as an organizational tool. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. Google assesses a page’s importance by the votes it receives. But Google looks at more than sheer volume of votes, or links; it also analyses the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.”

In that 0.12 seconds, what Google is doing is asking the entire Web to decide which page contains the most useful information, and the page that gets the most votes goes first on the list. And that page, or the one immediately beneath it, more often than not is in fact the one with the most useful information.

Now, Google is a republic, not a perfect democracy. As the description says, the more people that have linked to a page, the more influence that page has on the final decision. The final vote is a “weighted average” – just as a stock price or an NFL point spread is – rather than a simple average like the ox-weighers’ estimate. Nonetheless, the big sites that have more influence over the crowd’s final verdict have that influence only because of all the votes that smaller sites have given them. If smaller sites were giving the wrong sites too much influence, Google’s search results would not be accurate. In the end, the crowd still rules. To be smart at the top, the system has to be smart all the way through.

James Surowiecki

Source: The Wisdom of Crowds, Pages: 16..17

Contributed by: HeyOK
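
The "weighted vote" idea described in the quote can be sketched in a few lines of Python. This is only an illustration, not Google's actual implementation, whose details the quote notes are kept private. The function name `pagerank`, the four-page example graph, the normalized form of the update, and the handling of pages with no outgoing links are all assumptions made for this sketch; the damping factor of 0.85 is the value suggested in the Brin and Page paper.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate page importance from a link graph by repeated voting.

    links maps each page to the list of pages it links to.
    (Hypothetical helper written for illustration only.)
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current importance among the pages
                # it links to, so votes from important pages count more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its vote evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: A, C, and D all link to B.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["B"],
    "D": ["B", "C"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, B ends up ranked first because it receives the most links, and C comes second largely because it is linked from the highly ranked B. Pages A and D, which nobody links to, keep only the baseline share.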
