A new search engine has surfaced on the Web, and Web-page authors may be dismayed to find that their own descriptions of a site have no relevance in this engine's ranking system. Rather than indexing the text of a site as traditional engines do, the Hyperlink Search Engine leverages the third-party hyperlinks that point to a site. Thus, those middlemen become the arbiters of information relevant to the rankings.
Because traditional engines return results from generic text indexes of Web pages, site authors can not only manipulate the context of their information, but can often dictate the number of search-engine returns by spamming their own site with particular keywords. IDD Information Services, which developed the Hyperlink Search Engine, believes that a more objective ranking results when the anchor text of hyperlinks is used as the criterion for rankings.
"We've analyzed our rankings a lot and have found that it both searches and ranks output as good as engines that have some human intervention in them, like Yahoo," said Doran Howitt, IDD's director of business development. "We find that our rankings capture the same kind of human wisdom because it was human beings that created the hyperlinks in the first place."
The engine has recently debuted in a demo version that can process about 10 searches per second. The core technology, called Hyperlink Vector Voting, crawls the Web for hyperlinks and indexes them in the same way other engines index text. A benefit of this, said Howitt, is that the engine "stores only a very compact set of data about the Web and is naturally very efficient."
Yanhong Li, one of IDD's scientists, developed the engine's algorithm out of his own desire for a more effective way to search the Web. Li said the original concept for the engine came from a convention in the academic publishing community: When a published article is particularly good, he said, a multitude of other academic articles will reference it, thereby bestowing a ranking and value on it. The same is true on the Web, Li argues, and a de facto voting and ratings system is the result.
"On the Web, the hyperlink serves as a reference to other sites, and it's even better than paper media because it has a text description of information. This description is an evaluation of that site by another user, so not only does it measure quality, but it gives different descriptions for the same concept. This is very hard for traditional engines to do," said Li.
One result of the engine's somewhat finicky ratings system is that some sites, namely those that no other page links to, will not be listed at all. This will naturally create some holes in the engine's coverage, but, by the same token, it will weed out sites that Web users deem unworthy.
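The link-as-vote idea described above can be illustrated with a minimal sketch. This is not IDD's Hyperlink Vector Voting, whose details are not public in this article. It assumes the simplest possible scheme, in which each linking page casts one vote per query term found in its anchor text. The function names, the sample links, and the example domains are all hypothetical.

```python
from collections import defaultdict

# Hypothetical crawl output: (linking page, linked-to site, anchor text).
LINKS = [
    ("a.example/page", "cars.example", "used cars"),
    ("b.example/page", "cars.example", "car dealer"),
    ("c.example/page", "autos.example", "used cars and trucks"),
    ("d.example/page", "cars.example", "cheap used cars"),
]

def build_anchor_index(links):
    """Index anchor-text terms, not page text: term -> target -> linking pages."""
    index = defaultdict(lambda: defaultdict(set))
    for source, target, anchor in links:
        for term in anchor.lower().split():
            index[term][target].add(source)
    return index

def search(index, query):
    """Rank targets by votes: one vote per linking page per matching query term."""
    votes = defaultdict(int)
    for term in query.lower().split():
        # .get avoids creating empty entries for unknown terms.
        for target, sources in index.get(term, {}).items():
            votes[target] += len(sources)
    # Highest vote count first; ties broken alphabetically for stable output.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))

index = build_anchor_index(LINKS)
print(search(index, "used cars"))  # [('cars.example', 4), ('autos.example', 2)]
```

In this sketch a site that no page links to never enters the index, so it can never be returned, which mirrors the coverage gap noted above. Only the anchor text and the link endpoints are stored, never the text of the pages themselves.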
"Seventy-five to 80 percent of the things people want to find are popular things, which other people have built links to," said Howitt.
Microsoft and Infoseek have expressed interest in the technology, Li said, and IDD's goal is to license the engine to other companies for integration into more comprehensive sites.