Re: [aseek-users] how to encourage index program to read new web pages

From: mark david mcCreary (no email)
Date: Fri Jan 04 2002 - 19:11:23 EST


>Diego

Thank you for the tip.

I did try the -a option.

It seemed to reindex all of the pages, even though I only wanted the
new additions indexed.

If I give it the URL of the specific last message number, it will
index that message and also the previous unindexed messages, because
each msg web page points to the previous page via a URL, and ASPseek
follows the links.
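
A minimal sketch of how the newest message URL could be found
automatically, so it does not have to be typed in by hand. It assumes
the archive directory is readable on the local disk and that the
highest number in msgxxxxx.html is always the latest message. The
function name and the directory argument are hypothetical and not part
of ASPseek.

```python
import re
from pathlib import Path

# Filename pattern from the archive: msgxxxxx.html, xxxxx being a number.
MSG_RE = re.compile(r"^msg(\d+)\.html$")

def newest_message_url(archive_dir, base_url):
    """Return the URL of the highest-numbered msgNNNNN.html, or None.

    archive_dir is a local directory holding the archive (an assumption),
    base_url is the matching web address of that directory.
    """
    best = None  # (number, filename) of the newest message seen so far
    for entry in Path(archive_dir).iterdir():
        m = MSG_RE.match(entry.name)
        if m:
            num = int(m.group(1))
            if best is None or num > best[0]:
                best = (num, entry.name)
    if best is None:
        return None
    return base_url.rstrip("/") + "/" + best[1]
```

The returned URL could then be used as the start URL before the cron
job runs index. How best to hand it to ASPseek is left open here.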

However, if I just give it a directory path in the Server statement,
it never figures out that new messages have been added. It may be
that it reads the index.html file, finds that it is up to date,
and stops.
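
To rule out the Allow/Disallow lines quoted below as the cause, here is
a rough sketch that applies them to a few URLs. It assumes the rules
are checked in order, that the first matching rule wins, and that each
pattern is an unanchored regular expression. That is my reading of the
config, not confirmed ASPseek behaviour. The function name and
threads.html are made up for illustration.

```python
import re

# Rules in the order they appear in the quoted aspseek.conf.
RULES = [
    ("Allow", r"msg.*\.html$"),
    ("Allow", r"\/$"),
    ("Disallow", r".*"),
]

def is_allowed(url, rules=RULES):
    """Return True if the first rule matching url is an Allow rule."""
    for action, pattern in rules:
        if re.search(pattern, url):
            return action == "Allow"
    # Assumption: a URL that matches no rule at all is allowed.
    return True

base = "http://www.internet-tools.com/listone/"
for url in (base, base + "msg00042.html", base + "threads.html"):
    print(url, is_allowed(url))
```

Under these assumptions the directory URL and the msg pages pass the
filter, and only other pages are rejected, so the filter alone would
not explain why new messages are missed.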

I must be missing something basic about how to get a site indexed
and keep it current.

mark

>mark,
>
>I do not have my print out at hand with all the
>commands but, if you want to reindex your database and
>index recently added docs, you may want to try
>
>"./index -a" indexes recently added urls and
>reindexes all other urls in the aspseek.conf file.
>
>Cheers,
>
>Diego
>
>
>--- mark david mcCreary wrote:
>> I have a web archive that constantly has new web pages added to it.
>> Each web page that I want indexed has this filename pattern -
>> msgxxxxx.html (where xxxxx is a unique number).
>>
>> My first attempt at having new pages indexed was to run a crontab
>> job calling index. The aspseek.conf file has an include line of
>> server statements
>> include /home/mhonarc/aspseek_server_start_url
>>
>> Which contains lines like this
>>
>> AuthBasic listone:
>> Server http://www.internet-tools.com/listone/
>>
>> The aspseek.conf file also has these lines
>>
>> Allow msg.*\.html$ \/$
>> Disallow .*
>>
>> This works fine the first time I run the index.
>>
>> The next time the index program is called, no new URLs are found to
>> process, despite new messages being added to the web site.
>>
>> Does anybody have any suggestions on how to get around this bug, or
>> another way to index recently added web pages?
>>
>> Thanks
>>
>> mark
>>
>>
>>
>>
>>
>
>







