From: Karen Barnes (no email)
Date: Tue Oct 29 2002 - 22:27:52 EST
Hello Matt,
I understand that you are part of the development team, but I can tell you
from experience that this is exactly what happened to me. Here is the actual
configuration I had (by mistake), which caused the indexer to go into an
infinite loop.
The very first time I ran the indexer I ONLY indexed our site which
consisted of roughly 250 pages. Here's the actual configuration:
Period 1m
server http://www.mysite.com/
server http://www.mysite.com/page2.html
and so on.
Keep in mind that this was a FRESH install with NO other URLs in the
database at all. I did NOT have the indexer follow links, as I wanted to
restrict the index to specific pages. When I ran index I could see the same
pages being indexed over and over for several minutes. Then I stopped the
indexer (./index -E) and found that I had set the Period to 1m (one minute).
I changed this to:
Period 14d
deleted the site using index just to make sure:
./index -C "http://www.mysite.com%"
and then ran index again. Within a couple of seconds the indexer stopped,
everything was indexed, and it all worked as expected.
So the design may say it can't loop, but I can assure you from experience
that this is how it reacted when I did it this way.
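To make the two positions concrete, here is a toy model in Python. It is only a sketch and not the indexer's actual code: the function name simulate_run, the one-second fetch time, and the cap on fetches are all assumptions made for illustration. It models a queuer that picks any URL whose next-index time has passed, with an optional once-per-run guard like the one Matt describes.

```python
def simulate_run(urls, period_s, fetch_s, max_fetches, once_per_run=False):
    """Toy model of one index run; returns the URLs fetched, in order.

    Hypothetical model only: the real indexer's queuing logic may differ.
    """
    now = 0.0
    next_due = {u: 0.0 for u in urls}   # fresh database: everything is due
    done_this_run = set()
    fetched = []
    while len(fetched) < max_fetches:   # cap so a looping run still terminates
        # Candidates: URLs whose next-index time has passed (i.e. expired).
        due = [u for u in urls
               if next_due[u] <= now
               and not (once_per_run and u in done_this_run)]
        if not due:
            break                       # nothing expired: the run ends
        url = min(due, key=lambda u: next_due[u])  # oldest expiry first
        now += fetch_s                  # pretend the fetch took fetch_s seconds
        next_due[url] = now + period_s  # schedule the next reindex
        done_this_run.add(url)
        fetched.append(url)
    return fetched


urls = [f"http://www.mysite.com/page{i}.html" for i in range(250)]
print(len(simulate_run(urls, 60, 1.0, 1000)))             # 1000: hits the cap
print(len(simulate_run(urls, 14 * 24 * 3600, 1.0, 1000))) # 250: run finishes
print(len(simulate_run(urls, 60, 1.0, 1000, once_per_run=True)))  # 250
```

Under these assumptions, a Period shorter than the time needed to fetch all 250 pages keeps the queue non-empty without the guard, while either a longer Period or the once-per-run guard lets the run finish after 250 fetches.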
BTW - You asked me to post the printout of "ulimit -a". Have you had a
chance to look at this? Hoping I can stop the "can't connect to host" errors
I'm experiencing.
Regards,
Karen
>On Tue, 29 Oct 2002 at 16:53:54 -0700, Karen Barnes wrote:
>
> > using the "Period" command? For example; if you have this set like the
> > following:
> >
> > Period 14d
> >
> > then you have set a reindex every 14 days and if you run the indexer for
> > 14 days non stop the process is going to start all over again and never
> > finish. When I run an initial crawl I set this to a very large number
> > like
>
>This is not correct. The URL queuer queues URLs in incrementally increasing
>time slices. It can never loop in time during a single index run and an URL
>will not be indexed more than once per run (even if expired).
>
>
>Matt.