Pandia
Post Newsletter No. 11 2001 Part 2
SITESEEING
On searching the Usenet
After the death of the RemarQ search engine in the summer of 2000, Deja.com
was the only search engine capable of searching so-called newsgroup messages.
In February this year Google went ahead and bought the Deja.com database,
and has since then tried to establish a historical index of discussion
messages as complete as possible. Google Groups was released from beta
this month with 700 million postings from a period of 20 years in more
than 35,000 topical categories.
The Usenet is a huge collection of discussion groups outside the Internet
proper, and it used to be very popular some years ago. Unfortunately many
of the various newsgroups became a target for companies drowning the forums
in spam, and much of the Internet debate has moved over to Web-based discussion
boards and email-based discussion lists. That being said, there are still
many lively newsgroup communities out there, and you can find a lot of
useful information searching them.
Most Internet service providers will fetch messages from many of these
so-called "newsgroups", allowing Net users to access the discussions
from their own servers. As the Usenet is not part of the Internet, Web
surfers traditionally used dedicated newsreader software to take part
in the discussions. There are newsreaders included in Explorer and Netscape.
Gripe.com -- the new Usenet search service
There used to be one alternative to the newsreaders. Deja.com not only let you
search in its Usenet archive of messages, you could also take part in
current discussions on its website, i.e. without using a special newsreader.
Google Groups has a similar feature.
In comes Gripe.com, a radically new approach to the use of the Usenet.
Like Google Groups it is a site that can be used for searching newsgroups
as well as taking part in discussions. Unlike Google, however, it is trying
to liberate the user from the newsgroup structure itself (divided into
forums like alt.internet.search, sci.electronics and alt.selfpity). As
an alternative Gripe will search for and gather messages from a lot of
newsgroups on the fly and tailor a "custom forum" targeting
your particular interest.
The messages are sorted according to a special rating system with a scale
from "excellent" to "very poor", giving you a chance
to sort out all the junk that exists on the Usenet. The message you select
is then presented in an "article viewer" that allows you to
read the message, post a reply on the newsgroup forum, add a new message,
send an email to the author and forward the message to a friend.
In the article viewer you can also assign a rating to any message. In
this way you help Gripe determine the future sorting order of that particular
message.
When you post a message, Gripe will suggest other newsgroups that might
benefit from a copy of your message. A watchlist keeps track of the articles
you have contributed and informs you about replies.
In order to use Gripe as a Web-based newsreader, you will have to register
as a user. That gives you an added benefit, though, as your Gripe user
ID, and not your regular email address, will be presented to other discussion
participants. Given that spam mailers often fetch email addresses from
newsgroup messages, this will probably be a very useful feature indeed.
Although Gripe is presently in a preliminary beta version, it looks like
a very promising alternative to the regular newsreader. The fact that
the service generates "custom forums" is very useful when you
are looking for a wide array of forums discussing your specific topic.
On the other hand, it could be useful to search for and isolate each
individual newsgroup as well. Some debaters actually like to follow
the discussions in "their" newsgroup, developing a community
spirit as it were. That is harder, but not impossible, the Gripe way.
Pandia is a site devoted to search engines, and it is obviously the Gripe
search engine that interests us the most. Unfortunately Gripe cannot compete
with Google Groups in this area, and that's not because Google presents
a much larger database of messages.
The problem is that Gripe insists that we should select one of nine categories
before entering a search query. These are "arts", "autos",
"electronics", "fashion", "health", "home",
"outdoors", "parenting" and "travel". Why
"parenting" and "fashion" are selected and not --
for instance -- "natural science" or "the Internet"
is hard to understand, and we often find it difficult to select the most
appropriate category. We still haven't found the right category for "Internet
searching" and "search engines".
Larry Marso, Managing Director of High Regard Inc, the company behind
Gripe, argues that the design of the category scheme was based on both
business and technology considerations. Among the business considerations
were:
"We utilize sophisticated concept mapping analysis to improve
the accuracy of search results, identify content contributed by users
with subject matter expertise, and quality-weight search results,"
Marso says. "In our research, we have found that the nine categories
on the Gripe.com home page represent important clusters of coherent discussion
in the universe of the Usenet, with scope and depth we have modeled and mapped.
We believe that the trend toward combination of every subject imaginable
into a single, unified search engine has gone too far and compromised
the accuracy of search results."
He adds that when a user submits a search, Gripe offers "specific"
search results in the left column, and "related concept" results in the right column.
Right column results are usually similar to what you might find in a deep
category hierarchy system, as if you had taken the time to drill down
an exhaustive list of subcategories. "Our category approach also
optimizes the value added by the High Regard expert rating system to Gripe.com,"
he adds. "Dramatically improving the quality of content is our primary
objective in providing this service to the user community."
There is no support for advanced Boolean searching. Marso actually views
advanced Boolean syntax as unnecessary, or even a potential step backwards
for the Gripe.com search engine. "Gripe.com is not a simple Boolean
'OR' search engine," he says. "User search terms are analyzed
to consider word order, proximity, usage and conceptual content. Included
in search results are articles/discussions with conceptual overlap, even
if there is no word overlap whatsoever. Also, quality articles are scored
higher in processing search requests."
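To make the contrast Marso draws a little more concrete, here is a minimal Python sketch of the difference between a plain Boolean 'OR' match and a score that looks at word order and proximity. This is not Gripe's algorithm, which has not been published. The function names and the scoring formula are our own assumptions, and the sketch does not attempt the "conceptual overlap" matching Marso describes.

```python
def boolean_or_match(query, text):
    """Plain Boolean OR: true if any query term occurs in the text."""
    words = set(text.lower().split())
    return any(term in words for term in query.lower().split())


def proximity_score(query, text):
    """Toy score (an assumption, not Gripe's formula).

    Requires every query term to be present, rewards terms that sit
    close together, and doubles the score when the terms appear in
    the same order as in the query.
    """
    terms = query.lower().split()
    words = text.lower().split()
    if not terms:
        return 0.0
    positions = []
    for term in terms:
        if term not in words:
            return 0.0
        positions.append(words.index(term))  # first occurrence only
    span = max(positions) - min(positions) + 1
    score = len(terms) / span
    if positions == sorted(positions):
        score *= 2
    return score


query = "search engine"
close = "a search engine for newsgroups"
scattered = "the engine room crew began a long search"

for text in (close, scattered):
    print(boolean_or_match(query, text), round(proximity_score(query, text), 2))
```

Both messages pass the Boolean 'OR' test, but the toy score puts the first one (2.0) well ahead of the second (roughly 0.29), because the terms are adjacent and in query order.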
Actually, we continue to believe that Boolean searching represents a
very useful alternative for professional searchers, researchers and librarians
included. That being said, the Gripe way of generating search results
seems very accurate -- at least when you have found the right category.
Moreover, Gripe is in its beta testing phase. Given that Gripe learns from
the behavior of the users, a larger number of users should make the results
even better. Not only will there be more "voters" rating messages,
Gripe will also discover more expert participants in various fields and
give their ratings more weight.
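The idea of giving expert ratings more weight can be illustrated with a short Python sketch. Gripe has not published how it does this, so everything here is an assumption on our part: the numeric scale, the intermediate labels between "very poor" and "excellent", the weights and the function names are all hypothetical.

```python
# Hypothetical numeric scale. Only the two end points, "excellent" and
# "very poor", are named by Gripe; the labels in between are assumed.
SCALE = {"very poor": 1, "poor": 2, "fair": 3, "good": 4, "excellent": 5}


def weighted_rating(votes):
    """Weighted average of (label, weight) votes, or None if there are none.

    A higher weight stands for a rater the system regards as more expert.
    """
    total_weight = sum(weight for _, weight in votes)
    if total_weight == 0:
        return None
    return sum(SCALE[label] * weight for label, weight in votes) / total_weight


def sort_messages(messages):
    """Return message ids, best weighted rating first; unrated messages last."""
    def key(item):
        score = weighted_rating(item[1])
        return score if score is not None else 0.0
    return [msg_id for msg_id, _ in sorted(messages.items(), key=key, reverse=True)]


messages = {
    "a": [("excellent", 3.0), ("very poor", 1.0)],  # the expert liked it
    "b": [("excellent", 1.0), ("very poor", 1.0)],  # two ordinary votes
    "c": [],                                        # nobody has voted yet
}
print(sort_messages(messages))
```

Messages "a" and "b" received the same two ratings, but in "a" the favorable vote carries three times the weight, which lifts its average from 3.0 to 4.0 and moves it to the top of the list.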
Why would anyone develop a search engine for the Usenet, when search
engines for the Web are having such a hard time? Well, High Regard is
clearly planning to sell paid results on the result pages (which may partly
explain why the Gripe categories are focusing on consumers and not academics).
Moreover, the company is using Gripe as a showcase for its company intranet
search technologies.
Gripe.com http://www.gripe.com/
High Regard http://www.highregard.com/