From: Ing. Ernesto Rapetti (no email)
Date: Wed Jan 15 2003 - 20:00:29 EST
To test the "ul" feature you have to write the address in URL-encoded
format (e.g. ul=http%3A%2F%2Fwww.jhuccp.org%2Fpopreporter%2F%25) or use an
HTML form with an <INPUT TYPE=HIDDEN NAME=ul
VALUE="http://www.jhuccp.org/popreporter/%"> on it.
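If you would rather build the encoded query string by script than by hand, here is a minimal Python sketch. It uses only the standard library; the parameter names and values are the ones from the query quoted below, and the helper name build_search_url is made up for this example.

```python
from urllib.parse import urlencode

# Base address of the search CGI, taken from the quoted message.
SEARCH_CGI = "http://www.jhuccp.org/cgi-bin/s.cgi"


def build_search_url(query, subset_mask):
    """Return a search URL whose "ul" value is URL-encoded.

    build_search_url is a hypothetical helper, not part of ASPseek.
    urlencode() percent-encodes ':', '/' and '%' in each value.
    """
    params = [
        ("q", query),
        ("cs", ""),
        ("ps", "20"),
        ("o", "0"),
        ("ul", subset_mask),
    ]
    return SEARCH_CGI + "?" + urlencode(params)


url = build_search_url("advocacy", "http://www.jhuccp.org/popreporter/%")
```

The resulting "ul" parameter is ul=http%3A%2F%2Fwww.jhuccp.org%2Fpopreporter%2F%25, i.e. the trailing "%" wildcard of the mask is sent as %25.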
In fact I tried your site, and it works correctly when you use a form
with the input field "ul" set.
BEWARE:
Be warned that the "ul" feature is broken in one respect: if you search
again from the results page, it returns results outside the subset. The
problem is that the "$A" variable that you use in the template does not
contain the "ul" field information, and there is no alternative variable
you can use.
There is a workaround, though: you can make a separate template for
every subset, each of which assigns a fixed value to the "ul" field.
I hope you have only a few static subsets.
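If there are more than a couple of subsets, the per-subset templates could be generated from one base template instead of being copied by hand. The following is only a sketch under stated assumptions: the marker @UL_MASK@ is a placeholder invented for this example (it is not an ASPseek template variable), the function name is hypothetical, and the base text is reduced to the single hidden field that matters here.

```python
import html

# Hypothetical marker placed in the base template where the fixed
# "ul" value should go. It is NOT an ASPseek variable.
PLACEHOLDER = "@UL_MASK@"


def render_subset_templates(base_text, subsets):
    """Return {subset name: template text} with the mask filled in.

    The mask is HTML-escaped so it is safe inside a quoted attribute.
    """
    return {
        name: base_text.replace(PLACEHOLDER, html.escape(mask, quote=True))
        for name, mask in subsets.items()
    }


# Stand-in for a real template: only the hidden "ul" field is shown.
base = '<INPUT TYPE=HIDDEN NAME=ul VALUE="@UL_MASK@">\n'

# Subset name -> mask; the mask is the one from the subsets table below.
templates = render_subset_templates(
    base, {"popreporter": "http://www.jhuccp.org/popreporter/%"}
)
```

Each generated text would then be saved as its own template file, and the search form for that subset pointed at it.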
I have reported this on bugzilla but Kir closed it down ... :-(
Ernesto.
On Wed, 15 Jan 2003, KEVIN ZEMBOWER wrote:
> I'm still not able to limit the found documents to ones in a particular directory. Could anyone please take a look at this and let me know if I'm making a dumb mistake? Or, if anyone's got this working, could you paste in your examples?
>
> URL of search page which returns more than 400 documents containing 'advocacy':
> http://www.jhuccp.org/cgi-bin/s.cgi?q=advocacy&cs=&ps=20&o=0
>
> URL of search page which should limit returns to pages containing 'advocacy' in the '/popreporter/' directory only. However, this returns the same number of documents as the first query, including many that are not in the /popreporter/ directory:
> http://www.jhuccp.org/cgi-bin/s.cgi?q=advocacy&cs=&ps=20&o=0&ul=http://www.jhuccp.org/popreporter/%
>
> Entry in subsets:
> mysql> select * from subsets;
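> +-----------+-------------------------------------+
> | subset_id | mask                                |
> +-----------+-------------------------------------+
> |         1 | http://www.jhuccp.org/popreporter/% |
> +-----------+-------------------------------------+
> 1 row in set (0.01 sec)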
>
> Results of generating subsets and spaces:
> aspseek@www:~$ sbin/index -B
> Loading configuration from /usr/local/aspseek/etc/db.conf
> Loading configuration from /usr/local/aspseek/etc/ucharset.conf
> Loading configuration from /usr/local/aspseek/etc/stopwords.conf
> Loading configuration from /usr/local/aspseek/etc/aspseek.conf
> Generating subset http://www.jhuccp.org/popreporter/% ... done (97 URLs)
> index process finished.
>
> Entry in var/dlog.log after running query:
> Subset http://www.jhuccp.org/ not found
>
> Entry in var/aspseek12/logs.txt, after last reindex:
> aspseek@www:~$ tail /usr/local/aspseek/var/aspseek12/logs.txt
> Sec Count Ch Ch1 Ch2 New Size HQ Hr hits HR lost W hit W miss W ins
> 100.033 607 42 580 77 1 47365422 30162 28155 2007 203609 33367 2
> 100.022 925 41 862 90 0 61748911 46290 45258 1032 328313 27801 6
> 100.023 1513 27 1461 112 0 34825946 79586 78797 789 617709 30690 2
> New indexing session started at: 1042650902
> Got next 5018 URLs for: 0.104 seconds. Queued docs: 5018.Time 0-1042650902.
> New indexing session started at: 1042667810
> aspseek@www:~$
>
> File aspseek.conf:
> aspseek@www:~$ grep -v '^[[:space:]]*$' etc/aspseek.conf | grep -v "^#"
> Include db.conf
> Include ucharset.conf
> Include stopwords.conf
> Converter application/pdf text/html /usr/local/bin/pdftohtml -i -noframes -stdout $in > $out
> Converter application/msword text/plain /usr/local/bin/antiword $in > $out
> DeleteNoServer no
> Server http://www.jhuccp.org/
> DeltaBufferSize 64
> Disallow /cgi-bin/ \.cgi /nph
> Disallow \.tif$ \.au$ \.mov$ \.jpe$ \.cur$ \.qt$
> Disallow \.b$ \.sh$ \.md5$ \.rpm$
> Disallow \.arj$ \.tar$ \.zip$ \.tgz$ \.gz$
> Disallow \.lha$ \.lzh$ \.tar\.Z$ \.rar$ \.zoo$
> Disallow \.gif$ \.jpg$ \.jpeg$ \.bmp$ \.tiff$ \.xpm$ \.xbm$
> Disallow \.vdo$ \.mpeg$ \.mpe$ \.mpg$ \.avi$ \.movie$
> Disallow \.mid$ \.mp3$ \.rm$ \.ram$ \.wav$ \.aiff$ \.ra$
> Disallow \.vrml$ \.wrl$ \.png$
> Disallow \.exe$ \.cab$ \.dll$ \.bin$ \.class$
> Disallow \.tex$ \.texi$ \.xls$ \.texinfo$
> Disallow \.rtf$ \.cdf$ \.ps$
> Disallow \.ai$ \.eps$ \.ppt$ \.hqx$
> Disallow \.cpt$ \.bms$ \.oda$ \.tcl$
> Disallow \.o$ \.a$ \.la$ \.so$ \.so\.[0-9]$
> Disallow \.pat$ \.pm$ \.m4$ \.am$
> Disallow \?D=A$ \?D=A$ \?D=D$ \?M=A$ \?M=D$ \?N=A$ \?N=D$ \?S=A$ \?S=D$
> Disallow [^:]//
> Disallow mmc/.*\.php
> Disallow PHPTEST
> aspseek@www:~$
>
> Thanks for taking the time to look at this and for your thoughts and suggestions.
>
> -Kevin Zembower
>
> -----
> E. Kevin Zembower
> Unix Administrator
> Johns Hopkins University/Center for Communications Programs
> 111 Market Place, Suite 310
> Baltimore, MD 21202
> 410-659-6139
>