Re: [aseek-users] How to limit index to specified URL with path ?

From: Luc Santeramo (no email)
Date: Wed Mar 26 2003 - 04:06:49 EST


Hi,

Just a follow-up to my problem:

If I add these parameters:

MaxHops 9
MaxDocsPerServer 3500
Server http://www.revues.org/ahrf/publicat/resumes.html

without any Disallow or DisallowNoMatch...

aspseek says:
(12 2 2 0 0 0 1 10) Adding URL: http://www.revues.org/ahrf/publicat/320.html
(12 2 2 0 1 0 0 10) Adding URL: http://www.revues.org/ahrf/publicat/319.html
(12 2 2 0 1 0 0 10) Adding URL: http://www.revues.org/ahrf/publicat/318.html
(12 2 2 0 1 0 0 10) Adding URL: http://www.revues.org/ahrf/publicat/317.html
(12 2 2 0 1 0 0 10) Adding URL: http://www.revues.org/ahrf/publicat/316.html
Up to here, it is OK,

but here, something goes wrong:
(12 2 2 0 0 0 0 10) Adding URL: http://www.revues.org/ahrf/index.html
No "Server" command for URL http://www.revues.org/ahrf/index.html - deleted.

So it doesn't behave as the documentation says ( ....Note that
if URL contains path, the whole site will be indexed nevertheless, so to
limit indexing to some subdirectory of site use Disallow parameter
described below ....)

According to that note, aspseek should index
http://www.revues.org/ahrf/index.html as well (even if I don't want it to).
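
To make the observation concrete, here is a minimal Python sketch that tests
one possible reading of the log: that the Server line is being treated as a
directory prefix. This is only a hypothesis checked against the URLs logged
above, not a description of how aspseek actually matches URLs; the function
names directory_prefix and under_server are made up for illustration.

```python
# Hypothetical check, NOT aspseek's real matching code: is the log above
# consistent with the Server URL acting as a directory-prefix filter?
import posixpath
from urllib.parse import urlsplit

SERVER = "http://www.revues.org/ahrf/publicat/resumes.html"


def directory_prefix(url):
    """Return scheme://host/directory/ for a URL (the assumed matching rule)."""
    parts = urlsplit(url)
    directory = posixpath.dirname(parts.path)
    if not directory.endswith("/"):
        directory += "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def under_server(url, server=SERVER):
    """True if url lies under the directory of the Server URL."""
    return url.startswith(directory_prefix(server))


# URLs from the log above; the value records whether aspseek reported
# the URL as deleted ("No "Server" command for URL ... - deleted.").
reported_deleted = {
    "http://www.revues.org/ahrf/publicat/320.html": False,
    "http://www.revues.org/ahrf/publicat/316.html": False,
    "http://www.revues.org/ahrf/index.html": True,
}

for url, deleted in reported_deleted.items():
    print(url, "| deleted in log:", deleted,
          "| under Server directory:", under_server(url))
```

In this sketch every URL that was not reported as deleted lies under
/ahrf/publicat/, and the deleted one does not, so the log is at least
consistent with prefix matching, which is the opposite of what the quoted
documentation note says.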

Sometimes it does, sometimes it doesn't.

I don't understand... I'm lost

I would be grateful if someone could help me.

Thanks.

Luc Santeramo
http://www.in-extenso.org
http://www.revues.org

At 17:09 25/03/2003 +0100, Luc Santeramo wrote:
>Hi,
>
>I'd like to index http://perso.club-internet.fr/erra
>but I don't want to index all the web sites under http://perso.club-internet.fr
>
>Here is my conf file:
>
>MaxHops 9
>MaxDocsPerServer 3500
>Server http://perso.club-internet.fr/erra
>
>The problem is that this conf will index the whole perso.club-internet.fr web site.
>
>Extract from the doc:
>Note that if URL contains path, the whole site will be indexed
>nevertheless, so to limit indexing to
>some subdirectory of site use Disallow parameter described below.
>
>So I have to add something like
>DisallowNoMatch ://perso.club-internet.fr/erra.*
>
>But having to specify a
>DisallowNoMatch ://thewebsite/thepath.* for each Server command is not the
>easiest solution... And worse, I suppose this DisallowNoMatch will also stop
>the indexing of other websites like http://www.myotherwebsite.com?
>
>Does anybody have a solution for that?
>
>Thanks
>
>Luc
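
The concern in the quoted message, that one DisallowNoMatch pattern would
also shut out every other site, can be illustrated with a small Python
sketch. It assumes that DisallowNoMatch rejects any URL in which the regular
expression is not found anywhere; that is an assumption about aspseek, and
Python's re module is only a stand-in for whatever regex engine aspseek
uses. The combined pattern with alternation is likewise a hypothetical
workaround, not something taken from the aspseek documentation.

```python
# Sketch of the ASSUMED DisallowNoMatch semantics: a URL is rejected when
# the pattern is not found anywhere in it. Not aspseek's actual code.
import re

# Pattern from the quoted message, used as written there.
SINGLE = re.compile(r"://perso.club-internet.fr/erra.*")

# Hypothetical combined pattern: one alternative per site/path to keep.
COMBINED = re.compile(
    r"://(perso\.club-internet\.fr/erra|www\.myotherwebsite\.com/)"
)


def rejected(url, pattern):
    """True if a DisallowNoMatch-style rule with this pattern drops the URL."""
    return pattern.search(url) is None


urls = [
    "http://perso.club-internet.fr/erra/index.html",
    "http://perso.club-internet.fr/someoneelse/index.html",
    "http://www.myotherwebsite.com/page.html",
]

for url in urls:
    print(url,
          "| single pattern rejects:", rejected(url, SINGLE),
          "| combined pattern rejects:", rejected(url, COMBINED))
```

Under that assumption the single pattern does reject
http://www.myotherwebsite.com, as feared, while a pattern listing every
wanted prefix keeps both sites and still drops the other
perso.club-internet.fr pages.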







