Google index now contains 3 billion documents
PANDIA SEARCH WORLD NEWS ARCHIVE

Google breaks the 3 billion documents barrier

Google launches a new and bigger search engine index and adds fresh news headlines to results.

Fast recently launched several new features at its AllTheWeb search site, including fresh news results, and we predicted that this would lead to a new and intensified search engine arms race.

It did, and unlike most wars, this is a struggle most of us will benefit from.

In an interview with Pandia, Fast executive vice president and general manager Rob Rubin revealed that the company will launch a new and much bigger search engine index early next year, containing some 2 billion documents.

Google obviously took notice, because today it launched a search engine database containing no fewer than 3 billion documents, "further extending," according to Google, "the company's position as the world's largest and most comprehensive search engine."

However, careful readers should note that we are no longer counting only webpages here.

Of the 3 billion documents, 2 billion are webpages (25 percent of which are in languages other than English), 700 million are Usenet posts, and 330 million are images. Out of the 2 billion webpages, some 0.5 billion are not fully indexed, i.e. they are only registered as links from other webpages.
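The breakdown above can be checked with a little arithmetic. The sketch below uses only the rounded figures quoted in this article; the variable names are ours, and the results are approximations, not official Google numbers.

```python
# Rounded figures as quoted in the article (not official, exact counts).
web_pages = 2_000_000_000
usenet_posts = 700_000_000
images = 330_000_000
partially_indexed = 500_000_000  # webpages known only as links from other pages

# Everything Google counts toward its "3 billion documents".
total_documents = web_pages + usenet_posts + images

# Webpages whose text is actually indexed.
fully_indexed_pages = web_pages - partially_indexed

# 25 percent of the webpages are non-English.
non_english_pages = web_pages * 25 // 100

print(f"Total documents:     {total_documents:,}")
print(f"Fully indexed pages: {fully_indexed_pages:,}")
print(f"Non-English pages:   {non_english_pages:,}")
```

On these figures, roughly 1.5 billion webpages are fully indexed, which is the number to hold up against Fast's planned 2 billion.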

Seen in this light, Fast's objective of indexing the text of 2 billion documents remains impressive.

Unlike the Fast search index, however, Google Web Search also offers users the ability to search for numerous non-HTML files such as PDF, Microsoft Office and Corel documents.

We won't be surprised if Fast makes a similar move soon -- the company certainly has the technology needed to do so -- but at the moment Google is the only regular search engine to offer Adobe Acrobat PDF files in its search results. That is important, not least because a lot of government agencies and research institutions distribute information using this format.

"This announcement is an important step in Google's ongoing effort to provide search services that are fast, easy to use, and that help people find the information they need," says Larry Page, Google's co-founder and president of Products.

"To search our collection of 3 billion documents by hand, it would take 5,707 years, searching twenty-four hours per day, at one minute per document. With Google, it takes less than a second."
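The figure in the quote is easy to verify. The sketch below assumes a 365-day year with no leap days, which is our assumption and not something Google states; the constant names are ours as well.

```python
# Figures from the quote: 3 billion documents at one minute per document,
# searching twenty-four hours per day.
DOCUMENTS = 3_000_000_000
MINUTES_PER_DOCUMENT = 1

# Assumption: a 365-day year, ignoring leap days.
MINUTES_PER_YEAR = 60 * 24 * 365

total_minutes = DOCUMENTS * MINUTES_PER_DOCUMENT

# Whole years needed to read every document by hand.
years = total_minutes // MINUTES_PER_YEAR

print(f"{years:,} years")
```

Under that assumption the result matches the 5,707 years quoted by Google.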

Does the inclusion of 3 billion documents improve the quality of search results?

Probably not. There is a psychological element that should not be underestimated. The search engines need to show users and investors that they are innovative and in front. Numbers like these are more impressive than long texts about new and improved algorithms.

On the other hand, Google has already shown that it is able to produce relevant results, and the inclusion of more documents will make it easier to find more obscure pieces of information.

Google has also followed up on Fast's inclusion of fresh news headlines in search results. Like Fast, Google now searches top news websites throughout the world in order to find news stories relevant to users' queries.

News results are presented to users in a "News" section at the top of search results pages. In addition, Google refreshes millions of web pages every day to ensure that Google users have access to the most current information.

In today's debate on search engines and Internet searching, many focus on the commercialization of search engine results and complain about the lack of relevant hits. The fact that the search engines and directories fueling portals like Go, NBCi and Excite have been -- or will be -- replaced by metasearch engines dominated by paid results may seem to support such a position.

The fact remains, though, that search engines like Google and Fast continue to innovate, and they deliver more high-quality results now than ever. That bodes well for the new year!

Google
Google press release


This news message is part of the Pandia Search World News Archive. The links in this article will not be updated.


Pandia is a registered service mark of P&S Koch, Oslo, Norway. All other company and product names are the trademarks or registered trademarks of their respective holders. © P&S Koch 1998-2008. Comments or questions? Go to our contact page.