Search engine development
PANDIA

Resource Index

Scientific papers on search engine development

Be warned! These articles are written by scientists studying information retrieval, and they demand patience and perseverance.

The Google Pagerank Algorithm and How It Works. Ian Rogers gives a detailed analysis of the PageRank algorithm.

The Anatomy of a Large-Scale Hypertextual Web Search Engine. A somewhat technical but classic paper explaining the logic behind the Google search engine, written by Google's founders, Sergey Brin and Lawrence Page. A must-read if you would like to learn more about link popularity and search engine positioning.

The Term Vector Database: fast access to indexing terms for Web pages, by Raymie Stata, Krishna Bharat (Google) and Farzin Maghoul. A study of a term vector database, used in attempts to determine the theme or main topic of a site. See also: Sougata Mukherjea: A System for Collecting and Analyzing Topic-Specific Web Information and Andrei Broder et al.: Graph structure in the web.

Luc Goffinet, Monique Noirhomme-Fraiture: Automatic Hypertext Link Generation based on Similarity Measures between Documents. This paper deals with the problem of automatically generating cross-reference links when converting text to hypertext. A statistical approach is introduced, based on techniques commonly used in Information Retrieval.

Neel Sundaresan and Jeonghee Yi: Mining the Web for Relations. On identifying how pieces of information are related as they are presented on the Web.

Davood Rafiei and Alberto O. Mendelzon: What is this Page Known for? Computing Web Page Reputations. On how the textual content of the Web, enriched with the hyperlink structure surrounding it, can be a useful source of information for querying and searching.

Taher H. Haveliwala: "Topic-Sensitive PageRank" (2002). Proposes a topic-biased extension of the PageRank system for calculating search result ranking.

Jon M. Kleinberg: "Authoritative Sources in a Hyperlinked Environment" (1998, PDF file). Historical paper on interlinking and the use of hubs and authorities in search engine algorithms.

R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval. The summary and the chapter on user interfaces and visualization are available for free.
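
Several of the papers listed above revolve around PageRank and link popularity. As a rough illustration of the idea, here is a minimal sketch of the iterative computation in Python. It is not taken from any of the papers: the function name pagerank, the toy_web link graph, the damping value of 0.85 and the fixed iteration count are all assumptions made for this example, and the ranks are normalized to sum to 1, which is one common variant of the formula.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    All names and defaults here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n for page in pages
        }
        # Every page passes an equal share of its rank to each page it links to.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: D links out but nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)

for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph page C collects links from three pages and ends up with the highest score, while page D, which nothing links to, keeps only the baseline share. Real search engines combine such link scores with many other signals.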

LSI: Latent Semantic Indexing

Latent Semantic Indexing is an information retrieval method that takes advantage of some of the implicit higher-order associations of words with text objects. Google is believed to use this method, at least in part.

Clara Yu, John Cuadrado, Maciej Ceglowski, J. Scott Payne: Patterns in Unstructured Data, Discovery, Aggregation, and Visualization

Sen Yoshida et al.: Constructing and Examining Personalized Cooccurrence-based Thesauri on Web Pages

Peter W. Foltz: Using Latent Semantic Indexing for Information Filtering

Chaomei Chen: From Latent Semantics to Spatial Hypertext



Pandia is a registered service mark of P&S Koch, Oslo, Norway. All other company and product names are the trademarks or registered trademarks of their respective holders. © P&S Koch 1998-2008. Comments or questions? Go to our contact page.