RE: [aseek-users] Extremely slow search if many urls are returned

From: Max Lytvyn (no email)
Date: Tue Nov 26 2002 - 03:34:16 EST


This is what vmstat showed while running a "world war" query (10.5 secs):

 r  b  w     avm    fre   flt  re  pi  po   fr   sr  ad0  md0   in      sy   cs  us  sy  id
 2  1  0  463808  32788   370   1   1   1  356  203    0    0  343    7635  151  28   9  63
 3  1  0  463808  32556    15   0   0   0    2    0    4    0  366  126600   85  74  26   0
 2  1  0  463808  32012    90   0   0   0    0    0    8    0  385  133143  112  74  26   0
 2  2  0  465212  30376   453   0   4   0   89    0   26    0  362  134872  166  76  24   0
 2  2  0  465212  29584    47   0   1   0    0    0   23    0  363  124918  129  72  28   0
 2  3  0  466800  27872   350   1   2   0   88    0   34    0  402  101322  268  80  20   0
 2  3  0  475064  27248   611   0   2   0  389    0   33    0  415  147859  550  67  33   0
 2  2  0  470296  27168  1246   0   0   0  404    0   32    0  430  234989  431  53  47   0
 1  3  0  477584  26328   608   0   0   0  287    0   11    0  506  253012  234  45  55   0
 1  2  0  476180  27132    65   0   0   0  264    0   24    0  448  206939  358  42  56   2
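
To make the numbers above easier to read, here is a minimal Python sketch that averages the CPU columns of those ten samples. The column name `sy_cpu` is my own label for the second `sy` column (system CPU time), chosen only to keep it apart from the `sy` syscall counter; the script itself is an illustration, not something that was run on the server.

```python
# Column names follow the vmstat header; "sy_cpu" is a made-up label for the
# second "sy" column (system CPU %) so it does not clash with syscalls ("sy").
COLUMNS = "r b w avm fre flt re pi po fr sr ad0 md0 in sy cs us sy_cpu id".split()

SAMPLES = """\
2 1 0 463808 32788 370 1 1 1 356 203 0 0 343 7635 151 28 9 63
3 1 0 463808 32556 15 0 0 0 2 0 4 0 366 126600 85 74 26 0
2 1 0 463808 32012 90 0 0 0 0 0 8 0 385 133143 112 74 26 0
2 2 0 465212 30376 453 0 4 0 89 0 26 0 362 134872 166 76 24 0
2 2 0 465212 29584 47 0 1 0 0 0 23 0 363 124918 129 72 28 0
2 3 0 466800 27872 350 1 2 0 88 0 34 0 402 101322 268 80 20 0
2 3 0 475064 27248 611 0 2 0 389 0 33 0 415 147859 550 67 33 0
2 2 0 470296 27168 1246 0 0 0 404 0 32 0 430 234989 431 53 47 0
1 3 0 477584 26328 608 0 0 0 287 0 11 0 506 253012 234 45 55 0
1 2 0 476180 27132 65 0 0 0 264 0 24 0 448 206939 358 42 56 2
"""


def parse(text):
    """Turn whitespace-separated vmstat rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        values = [int(v) for v in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def mean(rows, key):
    """Average one column over the given rows."""
    return sum(row[key] for row in rows) / len(rows)


rows = parse(SAMPLES)
# The first vmstat row reports averages since boot, so only the remaining
# rows describe the interval in which the query was running.
busy = rows[1:]

for key in ("us", "sy_cpu", "id", "sy", "po"):
    print(f"{key:>6}: {mean(busy, key):.1f}")
```

For these samples the nine interval rows average roughly 65% user and 35% system time, with almost no idle time and no page-outs. That looks like a CPU-bound query with a very high syscall rate rather than one waiting on disk or swap, though ten seconds of data is a thin basis for a firm conclusion.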

Best
Max

-----Original Message-----
From:
[mailto:] On Behalf Of Kir Kolyshkin
Sent: Monday, November 25, 2002 10:42 PM
To:
Subject: Re: [aseek-users] Extremely slow search if many urls are returned

Oops, sorry, probably I drank too much coffee yesterday.
'vmstat 1 1' is not enough; something like 'vmstat 1 10' is needed.

Max Lytvyn wrote:
> Here is the 'vmstat 1 1' output:
>
>    procs         memory                        page     disks            faults         cpu
>  r  b  w     avm    fre   flt  re  pi  po   fr   sr  ad0  md0   in      sy   cs  us  sy  id
>  1  7  0  415740 232820   335   1   1   1  323  195    0    0  341    2120  153  26   9  65
>
> It was taken while executing a one-word ('word') query that returned 38k
> urls (the query took 6 seconds). Some apache requests might have been
> processed in parallel.
>
> Best
> Max
>
>
>
>
> -----Original Message-----
> From:
> [mailto:] On Behalf Of Kir Kolyshkin
> Sent: Monday, November 25, 2002 2:02 PM
> To:
> Subject: Re: [aseek-users] Extremely slow search if many urls are returned
>
> Can you send output of 'vmstat 1 1' made during execution of query?
>
> Max Lytvyn wrote:
>
>>Thanks for reply.
>>
>>MySQL buffer is set to 128MB.
>
>
> For the second server, it makes sense to set it to 256MB.
>
>
>>The system:
>>Hardware
>>Server 1: AMD Athlon 1.4GHz, 512MB memory, 60GB IDE; aspseek is the only
>>large application running.
>>Server 2: Dual AMD Athlon PR2200, 1GB RAM, 210GB RAID; some other software
>>is running in parallel, but most memory is free.
>>
>>Software
>>Server 1: FreeBSD 4.5, MySQL 3.23.40, aspseek from cvs
>>Server 2: FreeBSD 5.0-CURRENT, MySQL 4.01, aspseek 1.2.10 (the latest
>>release)
>>
>>The problem is the same on both machines.
>>BTW, the entire processor load is created by the searchd process.
>>MnogoSearch developers told me that the problem is slow sorting of
>>search results, because all documents have almost the same relevancy.
>>Can that be the case for ASPseek?
>>
>>Best
>>Max
>>
>>
>>-----Original Message-----
>>From:
>>[mailto:] On Behalf Of Kir Kolyshkin
>>Sent: Monday, November 25, 2002 1:10 PM
>>To:
>>Subject: Re: [aseek-users] Extremely slow search if many urls are returned
>>
>>Have you increased MySQL's key_buffer_size, as described in the FAQ? If not,
>>this is definitely a bottleneck in your case.
>>
>>Also, please describe your hardware if you want your question to be
>>answered ;)
>>
>>Max Lytvyn wrote:
>>
>>
>>>I have a big problem with search speed - queries that contain common
>>>words are very slow.
>>>I have an index of 200,000 documents on one server; all files are plain
>>>html with just title and body text, 3-40kb in size.
>>>One-word queries that return fewer than 1000 results are
>>>very fast - about 0.1 sec or even faster. But if any word of a query
>>>matches many urls (e.g. 'word' matches 58000 urls), the search takes
>>>over 5 seconds (up to 25 secs if several common words are used
>>>together).
>>>I had the same problem with MnogoSearch, but in that case query time
>>>was exponentially dependent on the number of results returned, and queries
>>>with more keywords (and thus fewer urls returned) were faster. In
>>>aspseek the situation is the same, but it looks like the search time is
>>>exponentially dependent on the sum of the urls that all keywords of the
>>>query return, not on their intersection.
>>>
>>>Please HELP!!! I'm desperate - the server load reaches 87% - it is
>>>critical.
>>>
>>>Best
>>>Max
>>>
>>>
>>>
>>
>>
>>
>
>
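
On the observation in the original post that search time seems to follow the sum of the urls matched by each keyword rather than their intersection: the sketch below shows one way that can happen. It is a generic linear merge of two sorted url-id lists, not ASPseek's actual code, and the two lists (`word_a`, `word_b`) are made-up data sized to resemble a 200,000-document index.

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists of url ids with a linear merge.

    Returns the matching ids and the number of comparison steps taken,
    so the work done can be compared with the size of the result.
    """
    i = j = steps = 0
    matches = []
    while i < len(a) and j < len(b):
        steps += 1
        if a[i] == b[j]:
            matches.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches, steps


# Hypothetical posting lists: every 3rd and every 5th url id out of 200,000.
word_a = list(range(0, 200000, 3))
word_b = list(range(0, 200000, 5))

matches, steps = intersect_sorted(word_a, word_b)
print("urls for word_a:", len(word_a))
print("urls for word_b:", len(word_b))
print("urls in both:   ", len(matches))
print("merge steps:    ", steps)
```

In this sketch the number of merge steps stays close to the combined length of the two lists even though the intersection is much smaller, so adding a second common word makes the query slower, not faster. Whether searchd behaves this way is an assumption; a profile of searchd during a slow query would be needed to confirm it.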







