[aseek-devel] Special handling of umlauts

From: Jens Thoms Toerring (no email)
Date: Wed Sep 03 2003 - 11:19:09 EDT


   I have a question (not a feature request) concerning the handling
of umlauts: At our site we have quite a number of English-speaking
visitors, and many of them either aren't aware of the existence of
umlauts or simply don't have a keyboard that allows them to enter
them (without jumping through hoops). Also, many of the pages here
are in English, and many people with names containing umlauts spell
them sometimes with and sometimes without the umlaut (for example, I
often use "Toerring" instead of the "Törring" (you probably need
iso-8859-1 to see it correctly) that's written in my passport).

  Thus I would like to include some feature that translates a query
in a way that takes these ambiguities into account. For example,
if someone is searching for my name with "q=toerring" I would like
the request to be translated automatically to "q=toerring+OR+törring",
and of course also the other way round. And as a third option, a
request like "q=torring" should be translated to "q=torring+OR+törring".
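
  To make the three translations above concrete, here is a minimal
sketch in Python. It is not ASPseek code, and the names word_variants
and expand_term are made up for illustration. It assumes a single
lower-case search term and that "OR" is the operator the search
front end accepts, as in the examples above.

```python
import re

# German umlauts and their two-letter transliterations.
UMLAUT_TO_DIGRAPH = {"ä": "ae", "ö": "oe", "ü": "ue"}
DIGRAPH_TO_UMLAUT = {v: k for k, v in UMLAUT_TO_DIGRAPH.items()}
BARE_TO_UMLAUT = {"a": "ä", "o": "ö", "u": "ü"}


def word_variants(word):
    """Return the word plus its umlaut spelling variants, without duplicates."""
    candidates = [
        word,
        # "toerring" -> "törring"
        re.sub(r"ae|oe|ue", lambda m: DIGRAPH_TO_UMLAUT[m.group()], word),
        # "törring" -> "toerring"
        re.sub(r"[äöü]", lambda m: UMLAUT_TO_DIGRAPH[m.group()], word),
        # "torring" -> "törring"; a vowel already followed by "e" is left alone
        re.sub(r"[aou](?!e)", lambda m: BARE_TO_UMLAUT[m.group()], word),
    ]
    variants = []
    for candidate in candidates:
        if candidate not in variants:
            variants.append(candidate)
    return variants


def expand_term(term):
    """Join all spelling variants of a single search term with OR."""
    return " OR ".join(word_variants(term.lower()))


print(expand_term("toerring"))  # toerring OR törring
print(expand_term("törring"))   # törring OR toerring
print(expand_term("torring"))   # torring OR törring
```

  Each rule is applied to the whole word at once, so a term with
several plain vowels yields only one "all vowels replaced" variant,
and ordinary words containing "ue" or "oe" pick up a spurious variant.
That is probably harmless in an OR query, but it is a limitation of
this sketch.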

  I am not asking for anyone to implement this for me; my question is
first whether you think this is possible, and if the answer is yes,
where best to implement it. As far as I can see it should be possible
both in s.cgi and in searchd, but due to my very limited experience
with ASPseek I'm looking for some guidance. Another question would
be whether such a feature could also be useful for other languages
and whether it would therefore make sense to find a more general
solution, e.g. having an additional configuration file with possible
character replacements etc., instead of just a hack for German only.
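
  One way such a configuration file could look is sketched below in
Python. The file format (one character per line, followed by its
alternative spellings) and the names SAMPLE_RULES, parse_rules and
spelling_variants are purely hypothetical; nothing like this exists
in ASPseek as far as this message is concerned.

```python
# Hypothetical replacement table: a character, then its alternative spellings.
SAMPLE_RULES = """\
# character   alternative spellings
ö   oe o
ä   ae a
ü   ue u
ß   ss
é   e
"""


def parse_rules(text):
    """Parse the table into a dict mapping a character to its alternatives."""
    rules = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        char, *alternatives = line.split()
        rules[char] = alternatives
    return rules


def spelling_variants(word, rules):
    """Return the word plus every variant produced by a single rule."""
    found = [word]
    for char, alternatives in rules.items():
        for alt in alternatives:
            # Try the rule in both directions: alternative -> character
            # and character -> alternative.
            for candidate in (word.replace(alt, char), word.replace(char, alt)):
                if candidate not in found:
                    found.append(candidate)
    return found


rules = parse_rules(SAMPLE_RULES)
print(spelling_variants("toerring", rules))
print(spelling_variants("strasse", rules))
```

  This version deliberately over-generates: for "toerring" it yields
"törring" but also "töerring", because the single-letter rule does not
know about the two-letter one. A real implementation would need some
way to rank or restrict rules, which is exactly the kind of design
question a general solution would have to settle.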

                                    Regards, Jens

 Freie Universitaet Berlin     Jens Thoms Toerring
 Webteam                       Tel: 0049 30 838 56055
 Garystrasse 39                Fax: 0049 30 838 53738
 14195 Berlin                  e-mail: 
