Tuesday 4 October 2011

How Google Works!!

Hello guys... this is my first blog post ever, so please don't be too harsh while judging this article.
I welcome both POSITIVE and NEGATIVE feedback. Thanks.

Google today is an integral part of our lives, but have you ever wondered how it actually works?
Very few of us really know. So today's post aims to answer that QUERY, which a few of you may have GOOGLED: how does GOOGLE work?

A background:
In the past 2 years, Google has doubled its workforce, upgraded its search engine to speed up results, and now answers more queries than Microsoft and Yahoo combined!
 

ORIGIN:
Some 14 years ago, Larry Page, a Ph.D. student at Stanford, was searching for a name under which to register his search engine. He settled on the word "Googol", meaning the number one followed by 100 zeroes.
However, thanks to a typo by his friend Sean Anderson, a new phenomenon called Google was born!



1. QUERY BOX
It all starts with somebody typing in a request for information, be it about the nearest branch of your preferred bank or some SMS you wish to send to your special one :)
2. DOMAIN NAME-SERVERS
“Hello, this is your operator . . . ”
The software for Google’s domain-name servers runs on computers in leased or company-owned data centers all over the world, including one in the old Port Authority headquarters in Manhattan. Their sole purpose is to shepherd searches into one of Google’s clusters as efficiently as possible, taking into account which clusters are nearest to the searcher and which are least busy at that instant. 
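To make the "nearest and least busy" idea concrete, here is a minimal sketch in Python. It is only my illustration, not Google's actual routing logic: the Cluster class, the distance and load numbers, and the max_load cutoff are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    distance_km: float  # hypothetical distance from the searcher
    load: float         # hypothetical share of capacity in use, 0.0 to 1.0


def pick_cluster(clusters, max_load=0.9):
    """Pick the nearest cluster that is not overloaded.

    If every cluster is above max_load, fall back to the least busy one.
    """
    available = [c for c in clusters if c.load < max_load]
    if available:
        return min(available, key=lambda c: (c.distance_km, c.load))
    return min(clusters, key=lambda c: c.load)


clusters = [
    Cluster("east", 50, 0.95),      # closest, but too busy right now
    Cluster("central", 1200, 0.40),
    Cluster("west", 4000, 0.10),
]
print(pick_cluster(clusters).name)  # central
```

The closest cluster is skipped because it is overloaded, so the query goes to the next nearest one.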
3. THE CLUSTER
The request continues into one of at least 200 clusters, which sit in Google-owned data centers worldwide.
4. GOOGLE WEB SERVER
This program splits a query among hundreds or thousands of machines so that they can all work on it at the same time. It's the difference between doing your homework assignment alone and asking the entire class to each take a different question and compiling the answers at the end!
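The "whole class does the homework" idea is often called scatter-gather. Below is a rough sketch of it, assuming three tiny in-memory shards and a thread pool standing in for real machines. The SHARDS data and the function names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each "machine" holds a slice of the documents.
SHARDS = [
    {1: "hedge funds explained", 2: "garden hedge trimming"},
    {3: "how hedge funds work", 4: "bank branch locator"},
    {5: "index funds versus hedge funds"},
]


def search_shard(shard, query):
    """Return ids of documents in one shard that contain every query word."""
    words = query.lower().split()
    return [doc_id for doc_id, text in shard.items()
            if all(word in text.split() for word in words)]


def scatter_gather(query):
    """Send the query to every shard at once, then merge the partial answers."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, query), SHARDS))
    return sorted(doc_id for part in partials for doc_id in part)


print(scatter_gather("hedge funds"))  # [1, 3, 5]
```

Each shard only looks at its own slice, and the final step just stitches the partial lists together.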
5. INDEX SERVER
Everything Google knows is stored in a massive database. But rather than waiting for one computer to sift through those gigabytes of data, Google has hundreds of computers scan its "card catalog" at the same time to find every relevant entry. Popular searches are cached (held in memory) for a few hours rather than run all over again. Yup, that means everything! :P
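A "card catalog" like this is usually called an inverted index: a map from each word to the documents that contain it. Here is a small sketch with a time-limited cache on top. The documents, the three-hour cache lifetime, and every name in the code are assumptions for illustration only.

```python
import time
from collections import defaultdict

# Hypothetical mini "web" of three documents.
DOCS = {1: "cheap hedge funds", 2: "hedge trimming tips", 3: "funds for students"}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)
CACHE = {}            # query -> (time stored, results)
CACHE_TTL = 3 * 3600  # "a few hours"; the exact value is an assumption


def search(query, now=None):
    """Answer from the cache if the entry is fresh, else consult the index."""
    now = time.time() if now is None else now
    hit = CACHE.get(query)
    if hit is not None and now - hit[0] < CACHE_TTL:
        return hit[1]
    postings = [INDEX.get(word, set()) for word in query.lower().split()]
    results = sorted(set.intersection(*postings)) if postings else []
    CACHE[query] = (now, results)
    return results


print(search("hedge funds"))  # [1]
```

The second time the same query arrives within the cache lifetime, the index is not touched at all.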
6. DOCUMENT SERVER
After the index server compiles its results, the document server pulls all the relevant documents, meaning the links and snippets of text, from its massive database. How does Google search the Web so quickly? It doesn't. It keeps three copies of all the information from the internet that it has indexed in its own document servers, and all those data have already been prepped and sorted.


7. SPELLING SERVER
Google doesn't read words; it looks for patterns of characters, be they in English or Sanskrit. If it sees your requested pattern a thousand times but finds a million hits for a similar pattern that's off by one character, it connects the dots and politely suggests what you probably meant, even while it provides you the results, if any, for your fat-fingered query for "hwedge funds" or even for an abbreviation of a word.
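The "off by one character" trick can be sketched like this. It is a toy version under my own assumptions: the HITS table, the ratio threshold of 100, and the function names are invented, and the real system is certainly more sophisticated.

```python
import string

# Hypothetical hit counts for two character patterns.
HITS = {"hedge funds": 1_000_000, "hwedge funds": 1_000}


def one_edit_away(text, alphabet=string.ascii_lowercase + " "):
    """All strings that differ from text by one deleted, replaced or added character."""
    splits = [(text[:i], text[i:]) for i in range(len(text) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | replaces | inserts) - {text}


def suggest(query, ratio=100):
    """Suggest a near-identical pattern if it is far more common than the query."""
    own_hits = HITS.get(query, 0)
    candidates = [c for c in one_edit_away(query) if c in HITS]
    if not candidates:
        return None
    best = max(candidates, key=HITS.get)
    return best if HITS[best] >= ratio * max(own_hits, 1) else None


print(suggest("hwedge funds"))  # hedge funds
print(suggest("hedge funds"))   # None
```

Notice that the code never needs to know what a "hedge fund" is. It only compares how often two nearly identical patterns show up.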


8. THE MONEY MACHINE:GOOGLE AD SERVERS
Each query is simultaneously run through an ad database, and matches are fed to the Web server so that they're placed on the results page. The ad team is in a race with the search team. Google aims to deliver all search results as quickly as possible; if the ad results take longer to pull up than the search results, they don't make it onto the results page, and Google does not make any money on that particular search.
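Here is one way to picture that race in code: the page waits for search results but gives the ads only a short deadline. This is just a sketch under my own assumptions. The 0.2 second deadline, the fake delays, and the function names are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def fetch_results(query):
    """Stand-in for the search lookup."""
    return [f"result for {query}"]


def fetch_ads(query, delay):
    """Stand-in for the ad lookup; delay simulates how slow it is."""
    time.sleep(delay)
    return [f"ad for {query}"]


def build_page(query, ad_delay, deadline=0.2):
    """Run both lookups at once; drop the ads if they miss the deadline."""
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        results_future = pool.submit(fetch_results, query)
        ads_future = pool.submit(fetch_ads, query, ad_delay)
        results = results_future.result()
        try:
            ads = ads_future.result(timeout=deadline)
        except TimeoutError:
            ads = []  # ads lost the race, so the page ships without them
    finally:
        pool.shutdown(wait=False)  # do not hold the page for a slow ad lookup
    return {"results": results, "ads": ads}


print(build_page("bank", ad_delay=0.0))  # page with one ad
print(build_page("bank", ad_delay=0.6))  # page with no ads
```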
9. PAGE BUILDER
The Google Web server collects the results of the thousands of operations it runs for a query, organizes all the data, and draws Google’s cunningly simple results page on your browser window, all in less time than it took to read this sentence.
10. THE END OF THE USER STORY: RESULTS
Often in 0.25 seconds or less.
CLUSTER CONTROL
Google's genius lies in its networking software, which helps thousands of cheap computers in a cluster act like one huge hard drive. Those inexpensive computers allow Google to replace parts without stopping the whole show: if a computer drops dead, there are at least two others ready to take its place while an engineer swaps out the busted machine.
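The "two others ready to take its place" idea boils down to keeping every piece of data on three machines and reading from whichever copy is still alive. A minimal sketch follows. The Machine and ReplicatedStore classes and the way replicas are chosen are my own invention for illustration, not a description of Google's software.

```python
class Machine:
    """One cheap computer in the cluster (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.blocks = {}


class ReplicatedStore:
    """Keeps each value on several machines so one failure loses nothing."""

    def __init__(self, machines, copies=3):
        if copies > len(machines):
            raise ValueError("need at least as many machines as copies")
        self.machines = machines
        self.copies = copies

    def _replicas(self, key):
        # Deterministic placement: start at a machine derived from the key.
        start = sum(key.encode()) % len(self.machines)
        return [self.machines[(start + i) % len(self.machines)]
                for i in range(self.copies)]

    def put(self, key, value):
        for machine in self._replicas(key):
            machine.blocks[key] = value

    def get(self, key):
        for machine in self._replicas(key):
            if machine.alive and key in machine.blocks:
                return machine.blocks[key]
        raise KeyError(key)


machines = [Machine(f"m{i}") for i in range(5)]
store = ReplicatedStore(machines)
store.put("page", "hello")
store._replicas("page")[0].alive = False  # one computer drops dead
print(store.get("page"))  # hello
```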
IT'S ALL ABOUT POWER
Just about the only thing limiting Google's performance is how much electricity the company can buy. One of its newest data centers (code name: Project 02) is near the Columbia River in The Dalles, Oregon, which has access to 1.8 gigawatts of cheap hydroelectric power; not coincidentally, this is where major internet hookups from Asia connect to U.S. networks. The byte factory has two computing centers, each the size of a football field.
THE MEMORY BANK
Based on the few numbers Google releases, experts guess that at least 20 petabytes of data are stored on its servers. But Googlers are famous for understatement; Wired says Google may have 200 petabytes of capacity. So how much is that? If your iPod were just 1 petabyte (one million gigabytes), you'd have about 200 million songs to shuffle. And if you started downloading a petabyte over your high-speed internet connection, your great-great-great-great-grandchild might still be around when the last few bytes get transferred, in 2514.
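If you want to check the petabyte numbers yourself, the arithmetic is short. The song count comes from the paragraph above; the connection speed of 0.5 megabits per second is purely my assumption, picked to show what kind of speed gives a download lasting about five centuries.

```python
PETABYTE = 10**15            # bytes, i.e. one million gigabytes
SONGS = 200_000_000          # songs on the imaginary 1-petabyte iPod

bytes_per_song = PETABYTE / SONGS
print(bytes_per_song / 10**6)  # 5.0 (megabytes per song)

ASSUMED_SPEED_BPS = 500_000  # assumed connection speed in bits per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600

download_years = PETABYTE * 8 / ASSUMED_SPEED_BPS / SECONDS_PER_YEAR
print(round(download_years))   # 507
```

So 200 million songs per petabyte works out to 5 MB per song, and at the assumed speed the download would run for roughly 500 years.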
PAGE RANKINGS
Google decides how reliable a site is—and thus how important the site’s content will be when Google forms a list of search results—by considering more than 200 factors as it analyzes content. But the secret sauce is Google’s patented formula for following and scoring every link on a page to learn how different sites connect, which means a site is deemed reliable based largely on the quality of the sites that link to it. 
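The link-scoring idea is known publicly as PageRank. Below is a simplified textbook version, not Google's production formula, and it ignores the 200-odd other factors entirely. The four-page LINKS graph is made up, and the damping value of 0.85 is the commonly quoted default, used here as an assumption.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages so that links from high-scoring pages count for more.

    links maps every page to the list of pages it links to; every target
    must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical mini web: a, b and d all link to c; c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))  # c
```

Page a ends up well ahead of b and d even though each has the same single outgoing link, because a is the one page that the top-ranked page c links to.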
GOOGLEBOTS
Google deploys programs called spiders to build its copies of the internet. On popular sites, Googlebots may follow every link several times an hour. As they scour the pages, the spiders save every bit of text or code. The raw data are pulled back into the cluster, run through the mill, and scheduled to incrementally replace the older data already on the index and doc servers, ensuring that results are fresh, never frozen.
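A spider is basically a loop that fetches a page, saves its text, and queues up every link it has not seen yet. Here is a toy sketch that crawls a tiny pretend web held in a dictionary instead of the real internet. The WEB data and the crawl function are hypothetical.

```python
from collections import deque

# Hypothetical mini web: url -> (page text, links found on the page).
WEB = {
    "site/home": ("welcome", ["site/news", "site/about"]),
    "site/news": ("latest headlines", ["site/home", "site/story"]),
    "site/about": ("about us", []),
    "site/story": ("full story", ["site/home"]),
}


def crawl(start, fetch=WEB.get):
    """Follow links breadth-first from start, saving each page's text once."""
    seen = {start}
    queue = deque([start])
    saved = {}
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:  # dead link
            continue
        text, links = page
        saved[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return saved


print(sorted(crawl("site/home")))
# ['site/about', 'site/home', 'site/news', 'site/story']
```

Running the same crawl again later and swapping the new text in for the old is, in spirit, how the copies stay fresh.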
So that's how the world's most popular search engine actually works... Next time you type in a query, just think about where it went, and before you can finish thinking you will have the result in front of you. It's not something that happens only between your internet provider and your PC or laptop!!! IT'S A GLOBAL PHENOMENON!!

2 comments:

Rahul Malhotra said...

Awesome post! :)

TechnoStriker said...

Buddy, You forgot to mention about Google Search Operators :D
