The future of search may be personalized, but what about your privacy?

Pandia takes a look at how the search engines may use your web surfing habits and other data to fine-tune your search results.

Let’s admit it: Even though the search engine companies do a lot to improve their offerings, there have not been any radical innovations in the search engine field for a long time.

The basic presentation of results remains for all practical purposes the same (one long list of links), and the search algorithm that produces the results relies on a mix of on-page and inbound link analysis.

That means that the selection and ordering of sites is based on the content of the page (whether the page contains words relevant to the query) and the number and quality of links pointing to the relevant page.
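To make the idea concrete, here is a toy sketch of such a mix. It is not any search engine's actual algorithm: the formula, the sample pages and the function names are all assumptions made for illustration, and it only counts links instead of judging their quality.

```python
import math

def on_page_score(query, text):
    """Fraction of the query words that appear in the page text."""
    words = query.lower().split()
    page_words = set(text.lower().split())
    if not words:
        return 0.0
    return sum(1 for w in words if w in page_words) / len(words)

def link_score(inbound_links):
    """Dampened link count, so a huge number of links does not dominate."""
    return 1.0 + math.log1p(inbound_links)

def rank(query, pages):
    """Order pages by content relevance multiplied by link strength.

    Multiplying (an assumption of this sketch) means a page with no
    matching words scores zero, however many links point to it.
    """
    def score(page):
        return on_page_score(query, page["text"]) * link_score(page["inbound_links"])
    return [p["url"] for p in sorted(pages, key=score, reverse=True)]

# Made-up sample data
PAGES = [
    {"url": "a.example", "text": "jaguar cars for sale", "inbound_links": 5},
    {"url": "b.example", "text": "jaguar big cat facts", "inbound_links": 50},
    {"url": "c.example", "text": "gardening tips", "inbound_links": 500},
]

print(rank("jaguar cars", PAGES))
print(rank("jaguar", PAGES))
```

In this sketch the heavily linked gardening page never outranks a page that actually mentions the query, while for the one-word query "jaguar" the page with more inbound links comes first.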

For several years now the search industry has discussed the possibility of making use of personal search habits for producing better search results. What are the results?

Google personalized search

If you, for instance, make use of Google’s various services (mail, newsreader, docs, bookmarks, search history, search result clicks, personalized home pages), Google should be able to use that information to develop a personal profile of your interests.

Using that profile, the company may — for instance — be able to determine whether you are looking for big cats or cars when searching for “jaguar”, or whether you are more likely to look for shopping sites rather than informational sites.
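A minimal sketch of how a profile could tilt the results for an ambiguous query follows. The profile format, the topic labels, the scores and the boost value are hypothetical; Google has not published how its profiles work.

```python
def personalize(results, profile, boost=0.5):
    """Re-order results, favoring topics the profile shows interest in.

    profile maps a topic name to an interest level between 0 and 1.
    Topics missing from the profile get no boost.
    """
    def adjusted(result):
        interest = profile.get(result["topic"], 0.0)
        return result["score"] * (1.0 + boost * interest)
    return [r["url"] for r in sorted(results, key=adjusted, reverse=True)]

# Made-up, non-personalized results for the query "jaguar"
RESULTS = [
    {"url": "jaguar-cars.example", "topic": "cars", "score": 1.0},
    {"url": "big-cats.example", "topic": "animals", "score": 0.9},
]

animal_lover = {"animals": 0.8, "cars": 0.1}
car_fan = {"cars": 0.9}

print(personalize(RESULTS, animal_lover))
print(personalize(RESULTS, car_fan))
print(personalize(RESULTS, {}))
```

With an empty profile the original order is kept, which matches the idea that personalization only adjusts the traditional ranking and does not replace it.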

Google has been experimenting with this for a while.

When you’re signed in to your Google Account, Google will try to get you more relevant, useful search results, recommendations and other personalized features.

Google says: “For example, if you use Google Bookmarks or Google Web History, you’ll get more targeted search results and recommendations for videos or gadgets.”

You no longer have to sign up for personalized results at Google. You will get them automatically. All that is required is that you are logged into your Google account.

However, so far the personalized results are not significantly different from the non-personalized ones. The improvement of search engine results is negligible.

The fact that Google is now testing a search engine result voting system may indicate that the info gathered from search history etc. is not enough to help Google determine what your likes and dislikes are.

Using the computer platform as input

In a comment to the printed version of the Norwegian Computerworld, Technology Director Mikael Svenson of the Norwegian enterprise search company Intellisearch argues that search tools will take the applications in use into consideration when determining what a search query is about.

To give one example: If you are searching on a Mac, a search for “Leopard” is more likely to mean Apple’s operating system than if you are searching on a PC.
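One way a search tool could pick up the platform is from the User-Agent string the browser sends with each request. The sketch below assumes that approach; the article does not say how Intellisearch or anyone else does it, and the probability table is invented for illustration.

```python
def guess_platform(user_agent):
    """Very rough platform guess from a browser User-Agent string."""
    ua = user_agent.lower()
    if "macintosh" in ua or "mac os x" in ua:
        return "mac"
    if "windows" in ua:
        return "windows"
    return "unknown"

# Hypothetical numbers: how likely each reading of a query is per platform
SENSE_PRIORS = {
    "leopard": {
        "mac": {"operating system": 0.7, "animal": 0.3},
        "default": {"operating system": 0.2, "animal": 0.8},
    },
}

def likely_sense(query, user_agent):
    """Return the most likely reading of an ambiguous query, or None."""
    senses = SENSE_PRIORS.get(query.lower())
    if senses is None:
        return None
    priors = senses.get(guess_platform(user_agent), senses["default"])
    return max(priors, key=priors.get)

mac_ua = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4; en-us)"
pc_ua = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

print(likely_sense("Leopard", mac_ua))
print(likely_sense("Leopard", pc_ua))
```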

Semantic search

Harald Botnevik, the leader of Yahoo’s research unit in Norway, says:

“We are facing many challenges. The amount of information available online is increasing rapidly, at the same time as traffic and the number of users is going up rapidly. It is therefore important to find the documents that are most important for the searcher. This is one of the aspects we are trying to improve.”

This is why Yahoo! is working on so-called semantic search.

Yahoo! is trying to find the intention of the searcher, based on contextual information like search history and geographical location.

Yahoo! is planning to introduce semantic search this fall. For the searcher, however, this aspect will not be directly visible.

There is one huge problem with semantic search, however. It is based on the premise that web page producers are rational beings that are actually willing to tag their pages with relevant information.

That might work within the scientific world, but does not necessarily work in other areas.

There are two reasons for this: (1) Webmasters and web site editors do not bother to add the necessary tags and (2) they use them to spam the search engines.

Search engine spammers soon found out that adding irrelevant keywords to the traditional keyword metatags might help web pages in search engine results.

In the end the search engines had to ignore these tags when determining search engine rank. There is no reason to believe that unscrupulous search engine marketers will behave more ethically in the future.

That is why we believe it will be more fruitful to focus on the surrounding data produced by the searcher rather than the web page producer. The searcher has no incentive whatsoever to spam the search engines.

Privacy problems

Apart from the fact that the search engines are clearly finding it hard to strike the right balance between personal information and the traditional search engine algorithm, they are facing another obstacle: the concern for privacy.

The fact that Google is developing a personal profile of your interests has already given rise to concern.

Google’s personal search is connected to your Google account (if you have one).

Somewhere on Google’s servers there is an entry with your Google email address on it, and in that entry there is information on your emails, search history, the web feeds you are subscribing to, the links you are clicking on etc.

The Google email address is connected to the personal information you may (or may not) have given them. Google may also use the IP address registered when you access Google to determine your geographical location (but not your exact address).

So, unless you avoid setting up a Google account, or set up a fake one and lie about your personal information, Google should in principle be able to find out a lot about you and your personal habits.

Google says the company is keeping these data for 18 months.

We are not too worried about this, personally.

First of all, we doubt very much that there are real flesh and blood persons sitting in California scrutinizing our personal web search history. All this information handling is done by very impersonal computers.

Secondly, it would be a PR disaster of immense proportions if it should come out that Google personnel are studying personal data. And such a thing will come out, eventually, if it happens. It is just too good a story to be kept in the dark for a long time.

If anything worries us in this context, it would be that American federal and military authorities demand and get access to such data. That might happen.

Still, the main problem is not that Google may look into your private life. It is the fact that the public may fear that Google gets access to their private life.

Microsoft made a lot out of the fact that the latest incarnation of the Internet Explorer browser has an InPrivate function that in principle stops the transmission of private data (it doesn’t, really!). Firefox, Opera and Safari have had such features for a long time.

If users start worrying about their privacy, more and more of them will opt out of providing the search engines with the data needed to generate personalized search results. That will mean the end of personalized search.

Google may let users comment on, rearrange search results (Computerworld)
Personalized Search Primer – And Google’s Approach (Read/Write Web)
Powerset as Antigen: Can Google Resist Microsoft’s New Threat (Beyond Search)
8 Ways To Optimize For Personalized Search (Search Engine Journal)
Q&A With Google Personalization Gurus Sep Kamvar and Marissa Mayer (Search Engine Land)
Yahoo Geeks Out Over Semantic Search (BBC)
