Google Blog Search is now the most popular blog search engine

As Google overtakes Technorati as the most popular blog search engine, Pandia ponders the pros and cons of blog searching.

Hitwise reports that Google Blog Search has, for the first time, passed Technorati to become the most popular blog search engine on the web.

Google Blog Search began catching up to Technorati in October, when Google placed a link to this search engine on the Google News home page. This caused a 168% surge in market share for Google Blog Search over a two-week period, Hitwise reports.

Just before Christmas, Google also added Blog Search to the "more" pop-up menu on the Google home page, making it possible to reach the blog search engine directly from google.com.

The need for blog search engines

The blogosphere, or the Internet blog community, has become a very useful source of up-to-date information. Some of the best bloggers now compete with newspaper, radio and TV journalists in bringing out the latest news and analysis.

Moreover, blogs bring fresh content that has not yet been indexed by the regular search engines. Although the regular Google search engine will reindex popular websites as often as daily, others will be left alone for weeks, even months.

The news search engines, on the other hand, only spider a few influential blogs on a daily basis. This means that you have to use a blog search engine to get a broader reach.

The spam problem

Unfortunately, there are no perfect blog search engines out there. The amount of spam is mind-blowing, making it hard to pick out the relevant results.

To give you one example: Search for “Pandia” using Google’s blog search and sort by date (yeah, yeah, we know this is a vanity search). The first page of results contains (at the time of writing) a large number of listings referring to Pandia’s people search service. If you click on any of them, you are taken to another people search gateway called people-search.com.

However, the links are not listed as people-search.com by Google. No sir! They are listed as http://fiqatodufuke.blogspot.com/index.html and a large number of equally meaningless blogger identities.

This means that the people behind people-search.com (or someone close to them) have set up a large number of free Google Blogger blogs, filled the posts with keyword-rich text, and then put some kind of redirect script into the Blogger code to lead visitors to the people-search.com site.

(It should be noted that the reason so many of these pages pop up for this search is that “pandia” is a relatively rare search query. Regular searches do not lead to the same amount of spam in Google Blog Search.)

Scraper sites

Such listings are only the tip of the iceberg. Most blog search engines produce a lot of scraper site results, i.e. web pages from blogs consisting only of content fetched from other sites. They often mix text from various RSS feeds, generating a large number of blogs and blog posts. Some of them will ultimately hit on a combination of keywords that brings in visitors.

The pages themselves are meaningless, of course, which tempts the unhappy visitor to click on one of the Google AdSense pay-per-click ads located on the page.

It is this use of fake Blogger blogs and AdSense ad units that has led us and others to say that Google is, indirectly, one of the greatest generators of search engine spam.
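Since scraper pages are stitched together from text published elsewhere, one well-known way to flag them is to measure how much of a post's wording also appears in known source texts. The following is only a minimal sketch of that idea, not a description of how Google or any other blog search engine works. The function names, the three-word shingle size and the sample texts are all assumptions made for illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(post, sources, size=3):
    """Fraction of the post's shingles that also occur in any source text.

    A value near 1.0 suggests the post is mostly assembled from the sources;
    a value near 0.0 suggests original wording. Hypothetical helper.
    """
    post_shingles = shingles(post, size)
    if not post_shingles:
        return 0.0  # post too short to judge
    known = set()
    for source in sources:
        known |= shingles(source, size)
    return len(post_shingles & known) / len(post_shingles)


# Illustrative texts only; not real feed content.
feeds = [
    "google blog search passes technorati in market share",
    "bloggers now compete with newspapers and radio",
]
mixed_post = "google blog search passes technorati bloggers now compete with newspapers"
original_post = "pandia ponders the pros and cons of blog searching"

print(copied_fraction(mixed_post, feeds))
print(copied_fraction(original_post, feeds))
```

A real system would need a far larger collection of source texts and some tolerance for legitimate quoting, so a score like this could only ever be one signal among many.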

It is harder to rank blog search results

It seems to us that it is harder for the search engines to find and eliminate such spam in blog search results than in regular results, as it is harder to identify pages with authority, quality and relevance when everything is new.

Google, for instance, gives a strong boost to older pages with a lot of relevant inbound links in its regular search engine. The whole point of searching blogs, however, is to get fresh, up-to-date content, in many cases only a few hours old. This means that Google cannot wait for links from other blogs to identify high-quality posts.

On the other hand, it can give a boost to pages from well-established and popular blogs. There are also other signs that can be used to identify spammers. The Blogger identity zuwidokehoze should, for instance, be a dead giveaway.

One thing is clear: all the existing blog search engines have a hard task ahead developing search engine algorithms that sort out and exclude spam. And this is something they have to do if they are to be tools used outside the circle of bloggers and tech-savvy insiders.
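To illustrate how a name like zuwidokehoze could be caught automatically, here is a rough sketch of one possible signal. Both that name and fiqatodufuke are long and alternate strictly between consonants and vowels, which is typical of machine-generated identifiers. This is only an assumption-laden heuristic of our own, not a method any search engine is known to use. The function names and the minimum length of 10 are hypothetical, and real names can occasionally match the pattern.

```python
from typing import Optional
from urllib.parse import urlparse

VOWELS = set("aeiou")


def blogspot_subdomain(url: str) -> Optional[str]:
    """Return the blog name from a blogspot.com URL, or None for other hosts."""
    host = urlparse(url).hostname or ""
    suffix = ".blogspot.com"
    if host.endswith(suffix):
        return host[:-len(suffix)]
    return None


def looks_generated(name: str, min_length: int = 10) -> bool:
    """True if name is long, purely alphabetic and strictly alternates
    between consonants and vowels (a hypothetical spam signal)."""
    letters = name.lower()
    if len(letters) < min_length or not letters.isalpha():
        return False
    # Each adjacent pair must contain exactly one vowel.
    return all((a in VOWELS) != (b in VOWELS)
               for a, b in zip(letters, letters[1:]))


name = blogspot_subdomain("http://fiqatodufuke.blogspot.com/index.html")
print(name, looks_generated(name))
print("zuwidokehoze", looks_generated("zuwidokehoze"))
print("pandia", looks_generated("pandia"))
```

A check this crude would only make sense combined with other evidence, such as redirects to an unrelated site or a sudden burst of near-identical blogs.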

If they do succeed in this, however, regular search engine companies like Google and Ask should be able to include fresh blog posts in their regular search engine results.

For a close to complete list of blog search engines, see our Pandia Powersearch page.

Of the large search engine companies, only Google and Ask have their own blog search engines. Yahoo! does, however, include quite a few blog search results in its news search engine.

All search engines will include blog pages in their regular search results after a while, but only as often as they index other web pages.

See also: Google’s blog search engine takes over top slot (MarketWatch)
