RE: [aseek-devel] Ranking problems

From: Gregory Kozlovsky (no email)
Date: Thu Feb 07 2002 - 12:46:01 EST


>From: Savencu Catalin [mailto:]
>Sent: Thursday, 7 February 2002 11:18
>To:
>Subject: [aseek-devel] Ranking problems

>ASPSeek is great, but it has a major problem: page ranking.
>I indexed about 15,000 web sites in the .ro TLD, about 3,000,000 web pages,
>but when a search is performed ASPSeek returns irrelevant results
>(many of them at the top).
>I searched the ASPSeek website for an explanation of how it calculates
>page ranks but I couldn't find anything. Can anyone point me to some
>documentation about this? Is anyone working to improve page ranking?

Computing page ranking requires finding an eigenvector of a very large
matrix. This is a hard numerical problem, and ASPSeek does not solve
it correctly, not even approximately. ASPSeek performs several
hundred so-called Jacobi iterations using single-precision
floating-point numbers.
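
For readers who have not seen this kind of iteration, here is a minimal
sketch in Python of a Jacobi-style sweep for a link-based rank. It is
not ASPSeek's code: the function name, the damping factor of 0.85, the
even spreading of rank from pages without outgoing links, and the toy
link graph are all assumptions made for illustration.

```python
def pagerank_jacobi(links, damping=0.85, iterations=200):
    """Jacobi-style fixed-point iteration for a toy page rank.

    links maps each page to the list of pages it links to.
    All parameter values here are illustrative assumptions.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every new value is computed only from the previous sweep,
        # which is what makes this a Jacobi-style update.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            # Assumption: a page with no outgoing links spreads its rank evenly.
            targets = links[src] or pages
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank_jacobi(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

On a graph this small the sweep settles quickly in double precision; the
sketch says nothing about how it behaves on millions of pages.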

First, the condition number of the matrix is so large, even for
a moderate number of documents, that a single-precision solution has
no meaning whatsoever. Second, it is well known to numerical
analysts that it is not practical to solve such a problem
with Jacobi iterations.
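
As a rough illustration of how coarse single precision is at this
scale, the Python sketch below adds up 3,000,000 equal rank shares
once in double precision and once with the running total rounded to
single precision after every addition. It shows only the rounding
granularity of the number format, not the conditioning of the matrix,
and the helper names are made up for this example.

```python
import struct


def to_float32(x):
    """Round a double-precision value to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(value, count, single_precision):
    """Add `value` to a running total `count` times.

    With single_precision=True the total is rounded to float32 after
    every addition, imitating a C `float` accumulator.
    """
    total = 0.0
    for _ in range(count):
        total += value
        if single_precision:
            total = to_float32(total)
    return total


n_pages = 3_000_000          # about the size of the index mentioned above
share = 1.0 / n_pages        # an equal share of rank for every page

double_total = accumulate(share, n_pages, single_precision=False)
single_total = accumulate(share, n_pages, single_precision=True)

print("double precision total:", double_total)
print("single precision total:", single_total)

# An increment smaller than half the float32 spacing near 1.0 is lost entirely.
print(to_float32(1.0 + 1e-8) == 1.0)
```

The double-precision total stays very close to 1, while the
single-precision total drifts away from it, because each share is
rounded to a multiple of the spacing between neighbouring float32
values near the current total.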

Those who are mathematically inclined can consult the classic book
by Richard Varga, Matrix Iterative Analysis, Prentice-Hall Inc.,
Englewood Cliffs, NJ, 1962. I may try to implement a correct solution,
but it will take some time before I can start on it.

        Gregory Kozlovsky

Project Manager for Information Systems             Tel: +41 (01) 632 63 70
International Relations and Security Network (ISN)  Fax: +41 (01) 632 14 13
Center for Security Studies and Conflict Research   Email:
Swiss Federal Institute of Technology (ETH)         http://www.isn.ch
Leonhardshalde 21, ETH-Zentrum / LEH
CH-8092 Zürich, Switzerland







