From: Jorey Bump (no email)
Date: Wed Apr 30 2008 - 09:19:54 EDT
Paweł Leśniak wrote, at 04/30/2008 09:05 AM:
> Erwan David wrote:
>> It may work, but be aware that language is difficult to detect, and
>> you can receive mail written in English but using the Big5 charset (or
>> any charset containing ASCII).
>> Even my mail will be sent in UTF-8 (because I quoted your name and
>> did not configure my client to use ISO-8859-2).
> OK. With UTF-8 it's hard to detect what language the message is
> written in (though SpamAssassin can still do that by checking which
> portion of the UTF charset you're using). But you never write a
> message in ISO-8859-2 or KOI8-R unless you are from a country that
> needs these encodings. I know it's not a *real* solution, but in my
> opinion it's still way better than rejecting mail by IP/domain.
I routinely get mail in foreign character sets from colleagues who have
received such mail and are CC'ing me in the reply. Some mail clients are
configured to use the character set of the original sender. So, you can
still get burned by this technique.
In any case, blocking by character set doesn't catch enough spam to be
worthwhile, especially if you have to deal with even a single false
positive. Adding scores for character sets may be useful, but outright
rejection is a bad idea, in my experience.
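
To illustrate the difference between scoring and rejecting, here is a
minimal Python sketch that only adds points for the charsets a message
declares. It is not how SpamAssassin or any particular filter does it.
The CHARSET_SCORES table, its values, and the charset_score() helper
are assumptions made up for this example.

```python
from email import message_from_string
from email.message import Message

# Hypothetical score table: points added per declared charset.
# The values are illustrative, not taken from any real filter.
CHARSET_SCORES = {"big5": 1.5, "koi8-r": 1.5, "iso-8859-2": 0.5}


def charset_score(msg: Message) -> float:
    """Sum the scores of each distinct charset declared in the message parts."""
    seen = set()
    for part in msg.walk():
        # get_content_charset() returns the lowercased charset parameter
        # of the part's Content-Type header, or None if there is none.
        charset = part.get_content_charset()
        if charset:
            seen.add(charset)
    return sum(CHARSET_SCORES.get(c, 0.0) for c in seen)


# An English reply whose client kept the original sender's charset.
raw = (
    "From: colleague@example.com\n"
    "Subject: Re: meeting\n"
    'Content-Type: text/plain; charset="Big5"\n'
    "\n"
    "See you at ten.\n"
)
score = charset_score(message_from_string(raw))
print(score)
```

The result is meant to be one input among many to an overall spam
score, so a legitimate English reply that happens to declare Big5 gains
a few points instead of being bounced.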