[aseek-users] how to encourage index program to read new web pages

From: mark david mcCreary (no email)
Date: Fri Jan 04 2002 - 10:47:14 EST


I have a web archive that constantly has new web pages added to it.
Each web page that I want indexed has this filename pattern -
msgxxxxx.html (where xxxxx is a unique number).

My first attempt at getting new pages indexed was to run a crontab
job that calls index. The aspseek.conf file pulls in the Server
statements with this include line:
include /home/mhonarc/aspseek_server_start_url

The included file contains lines like this:

AuthBasic listone:
Server http://www.internet-tools.com/listone/

The aspseek.conf file also has these lines:

Allow msg.*\.html$ \/$
Disallow .*
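
As a rough illustration of how these two rules are expected to filter
URLs, here is a minimal Python sketch. It is not ASPseek code. It
assumes each pattern is an unanchored regular expression, that rules
are tried in file order, and that the first matching rule decides. The
decide helper and the sample URLs are made up for the example.

```python
import re

# Rules in the order they appear in aspseek.conf.
# Assumption: the first rule with a matching pattern wins.
RULES = [
    ("Allow", [r"msg.*\.html$", r"\/$"]),
    ("Disallow", [r".*"]),
]


def decide(url):
    """Return "Allow" or "Disallow" for a URL under the assumed semantics."""
    for action, patterns in RULES:
        if any(re.search(pattern, url) for pattern in patterns):
            return action
    # Assumption: a URL that matches no rule is allowed.
    return "Allow"


if __name__ == "__main__":
    # Hypothetical URLs under the Server line from the included file.
    samples = [
        "http://www.internet-tools.com/listone/",
        "http://www.internet-tools.com/listone/msg00042.html",
        "http://www.internet-tools.com/listone/maillist.html",
    ]
    for url in samples:
        print(decide(url), url)
```

Under these assumptions, only message pages and URLs ending in a slash
pass the filter; any other page is rejected by the catch-all Disallow.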

This works fine the first time I run the index.

The next time the index program is called, no new URLs are found to
process, despite new messages having been added to the web site.

Does anybody have any suggestions on how to get around this bug, or
know of another way to index recently added web pages?

Thanks

mark







