On Inktomi's anti-spam policies
PANDIA SEARCH WORLD NEWS ARCHIVE

The ambiguous Inktomi anti-spam policy

September 14 2001, updated the same day.

Pandia takes a look at the Inktomi search engine's fight against spam, and its ambiguous attitude towards cloaking. Is the relationship between Inktomi and search engine optimization companies like MediaDNA influencing search results?

Lovely spam

As our regular readers will know, the search engines are increasingly fighting the existence of "spam". The word spam, which has been adopted from the wonderful Monty Python sketch "Spam, spam, spam, spam, lovely spam, lovely spam", signifies attempts by webmasters to trick the search engines into giving a webpage an "undeserved" high ranking in search results.

What amounts to cheating is a matter of debate. Among the traditional techniques rooted out by the search engines are repeating relevant search query terms a large number of times, and hiding keyword-rich text by using font colors close to -- or identical with -- the background color of the webpage.
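
To make the two traditional tricks concrete, here is a minimal Python sketch of how they could be measured. It is only an illustration: the function names and the threshold of 30 are our own assumptions, and nothing here describes how Inktomi or any other search engine actually detects spam.

```python
import re
from collections import Counter


def keyword_density(text: str, term: str) -> float:
    """Fraction of the words in `text` that equal `term` (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[term.lower()] / len(words)


def hex_to_rgb(color: str) -> tuple:
    """Convert a '#rrggbb' color string to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def looks_hidden(font_color: str, background_color: str, threshold: int = 30) -> bool:
    """True if every color channel differs by at most `threshold`.

    The threshold is an arbitrary value chosen for this illustration.
    """
    fg = hex_to_rgb(font_color)
    bg = hex_to_rgb(background_color)
    return all(abs(f - b) <= threshold for f, b in zip(fg, bg))
```

A page where one term makes up a large share of all words, or where text is set in a color almost identical to the background, would be a candidate for closer inspection under this sketch.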

Cloaking

A more sophisticated technique is using a server program that keeps track of the IP numbers of the various search engine spiders (i.e. the search engine programs that travel the Web indexing pages).

This so-called "cloaking" software feeds the search engine spider pages that are different from the ones presented to regular human visitors. By doing this you can give the search engines pages that are full of keyword-rich text, and that lack all the tables, frames, plug-ins and layout restrictions that often weaken a regular webpage's position in the results.
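
The core of the cloaking idea described above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the IP addresses come from a range reserved for documentation, and the names `KNOWN_SPIDER_IPS` and `choose_page` are hypothetical, not taken from any actual cloaking product.

```python
# Hypothetical spider addresses (reserved documentation range, not real spiders).
KNOWN_SPIDER_IPS = {"192.0.2.10", "192.0.2.11"}

# The stripped-down, keyword-rich page intended for the spider.
SPIDER_PAGE = "<html><body>Plain, keyword-rich text for the spider.</body></html>"

# The regular page, with tables, frames and so on, intended for human visitors.
VISITOR_PAGE = "<html><body><table><tr><td>Regular page</td></tr></table></body></html>"


def choose_page(client_ip: str) -> str:
    """Return the spider page for known spider IPs, otherwise the visitor page."""
    if client_ip in KNOWN_SPIDER_IPS:
        return SPIDER_PAGE
    return VISITOR_PAGE
```

The whole technique rests on the lookup table of spider addresses: a spider that arrives from an address not on the list is served the same page as everyone else.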

Some webmasters use this technique to lure searchers to pages that are of no relevance to their need (especially p.o.r.n. sites); others claim that the technique is legitimate as long as you serve the human searchers pages that are relevant to the topic they are looking for.

Inktomi and spam

The Inktomi search engine -- which feeds data to search sites like HotBot, AOL and MSN -- defines spam as "an inappropriate use of Inktomi's search engine involving any effort to deceive the search engine into returning a result that is unrelated to the query or whose position has been artificially inflated in the result set."

Inktomi says that "Some examples, but not all, of the more common techniques spammers employ to inappropriately use the search engine include:

  1. Embedding deceptive text in the body of web documents.
  2. Creating metadata that does not accurately describe the content of web documents.
  3. Fabricating URLs that redirect to other URLs for no legitimate purpose.
  4. Web documents with intentionally misleading links
  5. Cloaking/doorway pages that feed Inktomi crawlers content that is not reflective of the actual page
  6. Creating inbound links for the sole purpose of boosting the popularity score of the URL
  7. The misuse of third party affiliate or referral programs "
(See Inktomi spam removal guidelines)

Inktomi on cloaking

Note that Inktomi only forbids cloaking that is not reflective of the actual page. Here it differs from Google and AltaVista, which forbid cloaking in any form, possibly because the spider is in no position to determine the content of the "real", cloaked, page.

A lot of search engine optimization (SEO) companies will avoid cloaking because of the risk involved. If the search engine discovers the trick (and there are ways to do so) the site may get banned. Moreover, the search engine may ban the sites of all the clients of the search engine optimization company. One can imagine what that will do to an SEO business!

Some of the larger SEO companies, however, use cloaking and doorway pages (i.e. pages targeting the search algorithms of the individual search engines) actively, mainly because it lets them deliver results fast and effectively. Presumably they trust their own competence in this field, and find the risk worth taking. Whether their clients truly understand what they are doing is another matter.

MediaDNA

One company that uses cloaking is MediaDNA in La Jolla, USA. It never uses a tainted word like cloaking, of course. On its website it says that...

"The service uses proprietary technology to convert divergent formats into XML for easy interface with search engines such as Inktomi. Furthermore, eLuminator's proprietary automated pre-processing capability delivers a 'content cover page' that resembles the client's home page, yet displays specific user-desired content and links, as well as tailored information that is indexable by any search engine."

We will never understand why corporations insist on writing incomprehensible press releases -- one would think communication was the key to success -- but in any case, this text is as close to admitting the use of cloaking as you get. In another press release the technique is called "spider food".

MediaDNA and Inktomi

What makes MediaDNA different from many other SEO companies is that they have a deal with Inktomi.

At the time of the deal Andy Feit, vice president of marketing for Inktomi Search Solutions, said that "Providing users with access to high-value content is a major initiative as we continue to build on our leadership position in the search market. Working with MediaDNA, Inktomi will be able to provide the end user with better search results while helping publishers of valuable online content reach a wider market."

According to a press release issued last September, MediaDNA will "provide Inktomi's leading Internet portal and destination site customers with immediate access to millions of pages of high-value content that would otherwise be unsearchable. As a result, Inktomi™ Search Engine users will obtain comprehensive, highly accurate search results including 'deep Web' content."

What it failed to mention was that the owners of this "high-value content" will have to pay MediaDNA to produce cloaked pages for them, and -- even more important -- that Inktomi gets a piece of the pie, or should we rather say: the pay.

What we have here is a search engine company that gets paid for allowing someone to virtually "spam" its databases. It is not illegal, and Inktomi sure needs the money, but the long-term effect on its credibility as a deliverer of "untainted" search results may be devastating.

If -- and this is a big if -- Inktomi is punishing other search engine optimizers for using cloaking, and at the same time is helping MediaDNA to do so by giving it insight into the Inktomi ranking algorithm, we are very close to what many would call unfair competition.

On the other hand one could argue that Inktomi is doing nothing that is against its own anti-spam rules. After all, its rules do not prohibit the use of cloaking as such, as long as the information given to Slurp, its spider, is reflective of the actual page. We would expect the pages produced by the MediaDNA services to be relevant.

One could also argue that the MediaDNA service does deliver added value to Inktomi searchers. One of the main purposes of the service is, after all, to make PDF files and script-generated content -- i.e. content that is not normally indexed by the Inktomi spider -- available to the general public. MarketResearch.com, a company that aggregates more than 40,000 market research publications from the world's leading industry research firms, has used MediaDNA's eLuminator service for exactly this purpose.

The Black List

Unfortunately, more and more search engine optimizers feel that there is more to it than this. Some of them suspect that Inktomi is punishing other SEO companies in order to strengthen the position of its partners, MediaDNA included.

In the Search Engine Quarterly Magazine, a newsletter published by Search Engine World, the author Brett Tabke reports that a little over a year ago, cold calls from other optimizers started to increase in volume.

"They were notable because they were often from panicked optimizers who'd been blocked out of some Inktomi search engine," he writes. "Not just a few pages or domains, but entire client lists line-by-line."

A few days ago he found that the Inktomi "black list" of sites and spammers to be excluded from the index was openly available on the Net (the pages have since been removed). Tabke reasons that the only explanation for the black list being available on the Net is that others besides the Inktomi staff were allowed to add spammers to the list.

The list showed that entire C-blocks (i.e. series of IP website addresses) of many search engine optimization firms were banned. A huge number of people and sites were listed, according to Tabke: "We have yet to figure out just how many urls the database represents. Our best guess is 10k domains, 100+ C-blocks, that represent somewhere in excess of 1 million urls just for openers."
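
To illustrate what banning an entire C-block means in practice, here is a short Python sketch. It assumes that a C-block corresponds to a /24 network, i.e. 256 consecutive IP addresses sharing the first three numbers. The addresses are reserved documentation ranges, and the names `BANNED_C_BLOCKS`, `is_banned` and `addresses_covered` are our own; nothing here is taken from the Inktomi database itself.

```python
import ipaddress

# Hypothetical banned C-blocks (reserved documentation ranges, not real data).
BANNED_C_BLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_banned(ip: str) -> bool:
    """True if `ip` falls inside any of the banned C-blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in block for block in BANNED_C_BLOCKS)


def addresses_covered(blocks) -> int:
    """Total number of IP addresses covered by the given blocks."""
    return sum(block.num_addresses for block in blocks)
```

Under this assumption, every site hosted on any of the 256 addresses in a banned block is excluded, whether or not it has anything to do with the firm that triggered the ban. The two blocks above already cover 512 addresses.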

The database comments show, according to Tabke, a pattern of systematically tracking optimizers and firms by using the Whois database of domain owners, IP addresses and DNS servers (i.e. the servers that resolve normal domain names like pandia.com to IP number addresses), discussion forums, affiliate IDs, email addresses, link maps, similar robots.txt files and keyword searches.

The reason for using all these techniques is to identify spammers and map all the sites produced by them. In this way Inktomi will be able to ban all sites produced or optimized by that spammer, regardless of what server they are hosted on. Tabke gives one good example: If only 10 other sites link heavily to a known problem site, then those sites must be problems too, and may be banned.
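
The link-based reasoning in Tabke's example can be sketched as follows. This is a hypothetical illustration in Python: the data structure, the function name `flag_linked_sites` and the cut-off of five links for "linking heavily" are all our own assumptions, not details from the Inktomi database.

```python
def flag_linked_sites(links: dict, known_problem_sites: set, min_links: int = 5) -> set:
    """Return the sites that link heavily to known problem sites.

    `links` maps each site to a list of the domains it links to, with one
    entry per link. `min_links` is an arbitrary cut-off for "heavily".
    """
    flagged = set()
    for site, targets in links.items():
        if site in known_problem_sites:
            continue  # already known, nothing new to flag
        hits = sum(1 for target in targets if target in known_problem_sites)
        if hits >= min_links:
            flagged.add(site)
    return flagged


# Made-up example data using reserved example domains.
example_links = {
    "a.example": ["spam.example"] * 6 + ["news.example"],
    "b.example": ["spam.example", "news.example"],
    "spam.example": ["a.example"],
}
flagged_sites = flag_linked_sites(example_links, {"spam.example"})
```

In this made-up data only a.example is flagged, since b.example links to the problem site just once. The weakness of the approach is also visible: a site is judged solely by whom it links to, not by its own content.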

If one reads between the lines of the newsletter, one gets the impression that Inktomi systematically bans search engine optimization firms that compete with MediaDNA. Tabke presents only circumstantial evidence, however, and no definite proof that this is so. For all we know, the Inktomi black list includes nothing but true spammers. Let's face it, a large proportion of spammers are search engine optimization companies trying to bend the rules.

Moreover, as Danny Sullivan of Search Engine Watch has pointed out, MediaDNA is not the only company to feed XML directly into Inktomi. It was merely the first. Anyone who uses their self-serve program can also feed cloaked pages.

Furthermore, the fact that Inktomi has a black list is more than reasonable, and speaks volumes about the wasteful war between search engines and spammers.

But if Tabke is right in his assumption that Inktomi is banning non-spamming search engine optimization companies in general, Inktomi has gone beyond the boundaries of what is ethically acceptable behaviour.

It should be noted, though, that a search in the Inktomi database reveals as many as 129,121 listings for the keyword phrase "search engine optimization", and 105,109 for the term "search engine positioning", and MediaDNA is not on the first page of any of these results. Inktomi has far from banned all SEO companies, and we doubt that all the SEO companies listed take part in the Inktomi program.

In any case, the research done by Tabke gives a fascinating glance into the world of Inktomi anti-spam policies. This is clearly another example of how the search engines' need for revenue makes it harder for them to uphold the image of being producers of "neutral" and "objective" search results.

Search Engine Quarterly Magazine
An example of an Inktomi spammer's list
Inktomi Spam Database Left Open To Public (Search Engine Watch)
MediaDNA press release on Inktomi deal
MarketResearch.com Selects MediaDNA's eLuminator Service to Drive Online Sales of Market Research Reports (Yahoo!)
Inktomi Index Connect
Index Connect Partners
SearchDay: To cloak or not to cloak.

This news message is part of the Pandia Search World News Archive. The links in this article will not be updated.

Pandia is a registered service mark of P&S Koch, Oslo, Norway. All other company and product names are the trademarks or registered trademarks of their respective holders. © P&S Koch 1998-2009. Comments or questions? Go to our contact page.