* [gentoo-user] SpamAssassin is letting everything through
@ 2005-08-25 21:53 daniel
  2005-08-25 22:41 ` Jonathan Wright
  2005-08-27 14:47 ` Jarry
  0 siblings, 2 replies; 10+ messages in thread
From: daniel @ 2005-08-25 21:53 UTC
  To: list: gentoo-user

I've been beating my head against my keyboard all day trying to figure out how 
to get SpamAssassin working on our server, and so far I've not had a lot of 
success.

 - spamd is running
 - it's handling mail and adding headers
 - it seems to be running all the rules

But nearly all spam, even really obvious spam, is getting through.  I went to 
the SpamAssassin website to figure out what was going wrong and did a few 
tests.  The GTUBE test comes back positive, so that was good, but manually 
testing known spam is coming out with very low scores given the input.

It looks like the rules should be scoring higher (see below) but they're not.  
Even with the attached email, we're only getting a score of 1.7, even though 
it hits all those different rules.

Am I missing something here?  I've trained it with sa-learn on a whole slew of 
ham and spam and it continues to let through nearly all the spam coming in.

Any help or suggestions would be greatly appreciated.
Thanks


Example Spam
------------------------------------------------------------------------------
Return-path: <Palmiro@joyceessex.com>
Envelope-to: MYUSERNAME@MYDOMAIN.com
Delivery-date: Sat, 26 Mar 2005 06:20:57 -0800
Received: from 21.red-83-46-28.pooles.rima-tde.net ([83.46.28.21] 
helo=joyceessex.com)
        by mail.MYDOMAIN.com with smtp (Exim 4.50 (FreeBSD))
        id 1DFC9k-000DiX-9o
        for MYUSERNAME@MYDOMAIN.com; Sat, 26 Mar 2005 06:20:57 -0800
From: "Thelonius Parent" <Palmiro@joyceessex.com>
To: "James Tripp" <MYUSERNAME@MYDOMAIN.com>
Subject: Pharaamcy: 45-91
Date: Sat, 26 Mar 2005 08:19:30 -0500
MIME-Version: 1.0
Content-Type: multipart/alternative;
  boundary="----=_NextPart_000_004A_01C5313B.42456F83"
X-Priority: 3
X-MSMail-Priority: Normal
X-Unsent: 1
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
Spam-Warn: host is listed in
Status: RO
X-Status: RC
X-KMail-EncryptionState:
X-KMail-SignatureState:
X-KMail-MDN-Sent:
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on
        kenyan.nodes.net.ad-flow.com
X-Spam-Level: *
X-Spam-Status: No, score=1.7 required=5.0 tests=FORGED_RCVD_HELO,
        HTML_FONT_BIG,HTML_MESSAGE,HTML_TAG_EXIST_TBODY,PRIORITY_NO_NAME
        autolearn=no version=3.0.2

This is a multi-part message in MIME format.

------=_NextPart_000_004A_01C5313B.42456F83
Content-Type: text/plain;
        charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hello,

to open.  Cahusac stood before him.  The Breton's face was grave.
landing parties.

de Rivarol bade him be admitted, and there entered now into his

the sound heart of a boy, and in that heart much love for Peter
misrule, damme!  He leaves Port Royal unguarded save by a ramshac
Governor-General.  I perceive your object, and I believe ye're
Wolverstone's he did not know.  But he saw quite clearly now that
to hear him, for he had not troubled to raise his voice.  I hope


Have a nice day.
------=_NextPart_000_004A_01C5313B.42456F83
Content-Type: text/html;
        charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Dus-ascii">
<META content=3D"MSHTML 6.00.2800.1106" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV><FONT face=3DArial>Hello,&nbsp;</FONT><A=20
href=3D"http://www.dtkc.jm.com.dodteddrugsf.com/"><FONT =
face=3DArial>MediccationsByMail=20
SHOP</FONT></A><FONT face=3DArial>&nbsp;Welcomes You.</FONT></DIV>
<DIV><FONT face=3DArial size=3D2></FONT>&nbsp;</DIV>
<DIV><FONT face=3DArial>Do youu need to spend less on your meddications?=
</FONT></DIV>
<DIV><FONT face=3DArial size=3D4>You could save up tto 80% with us!=
</FONT></DIV>
<DIV>
<TABLE cellSpacing=3D0 cellPadding=3D0 border=3D0>
  <TBODY>
  <TR vAlign=3Dbottom>
    <TD rowSpan=3D2><FONT face=3DArial size=3D4>VI</FONT></TD>
    <TD><FONT face=3DArial size=3D4></FONT></TD>
    <TD rowSpan=3D2><FONT face=3DArial size=3D4>IN&nbsp;Vl</FONT></TD>
    <TD><FONT face=3DArial size=3D4></FONT></TD>
    <TD rowSpan=3D2><FONT face=3DArial size=3D4>RA&nbsp;VA</FONT></TD>
    <TD><FONT face=3DArial size=3D4></FONT></TD>
    <TD rowSpan=3D2><FONT face=3DArial size=3D4>UM</FONT></TD>
    <TD><FONT face=3DArial size=3D4></FONT></TD>
    <TD rowSpan=3D2><FONT face=3DArial size=3D4>AL</FONT></TD>
    <TD><FONT face=3DArial size=3D4></FONT></TD>
  <TR>
    <TD><FONT face=3DArial size=3D4>COD</FONT></TD>
    <TD><FONT face=3DArial size=3D4>AG</FONT></TD>
    <TD><FONT face=3DArial size=3D4>Ll</FONT></TD>
    <TD><FONT face=3DArial size=3D4>&nbsp;CI</FONT></TD>
    <TD><FONT face=3DArial=20
      =
size=3D4>IS&nbsp;and&nbsp;many&nbsp;other&nbsp;in&nbsp;our&nbsp;ST0RE.</F=
ONT></TD></TR></TBODY></TABLE></DIV>
<DIV><FONT face=3DArial size=3D2></FONT>&nbsp;</DIV>
<DIV><FONT face=3DArial>Have a nice day.</FONT></DIV>
<DIV><FONT face=3DArial>Try us andd you will not be disappointed.=
</FONT></DIV></BODY></HTML>

------=_NextPart_000_004A_01C5313B.42456F83--

Spam detection software, running on the system "kenyan.nodes.net.ad-flow.com", 
has identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Hello, to open. Cahusac stood before him. The Breton's
  face was grave. landing parties. de Rivarol bade him be admitted, and
  there entered now into his [...]

Content analysis details:   (1.7 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.2 HTML_FONT_BIG          BODY: HTML tag for a big font size
 0.2 HTML_TAG_EXIST_TBODY   BODY: HTML has "tbody" tag
 1.2 PRIORITY_NO_NAME       Message has priority, but no X-Mailer/User-Agent
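
The points in the table above add up to exactly the reported total, so
the arithmetic is consistent: the rules that fired simply carry little
weight, and no BAYES_* rule appears among the hits despite the sa-learn
training. A minimal Python sketch of that check, with the table pasted
in as a string (the parse_report helper is illustrative only and is not
part of SpamAssassin):

```python
import re

# The "Content analysis details" rows, copied from the report above.
REPORT = """\
 0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.2 HTML_FONT_BIG          BODY: HTML tag for a big font size
 0.2 HTML_TAG_EXIST_TBODY   BODY: HTML has "tbody" tag
 1.2 PRIORITY_NO_NAME       Message has priority, but no X-Mailer/User-Agent
"""

# One report row: points, rule name, free-text description.
ROW = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z0-9_]+)\s+(.*)$")


def parse_report(text):
    """Return a list of (points, rule_name) pairs from a report table."""
    hits = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


hits = parse_report(REPORT)
# Round to one decimal to avoid floating-point noise in the sum.
total = round(sum(points for points, _ in hits), 1)
has_bayes = any(name.startswith("BAYES_") for _, name in hits)

print(f"total={total} required=5.0 bayes_rule_hit={has_bayes}")
```

If the Bayes database were being consulted, one would normally expect
some BAYES_* rule in the list, so its absence may be worth checking
before adjusting any scores.
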
-- 
gentoo-user@gentoo.org mailing list




* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-25 21:53 [gentoo-user] SpamAssassin is letting everything through daniel
@ 2005-08-25 22:41 ` Jonathan Wright
  2005-08-25 23:05   ` Mike Williams
  2005-08-27 14:47 ` Jarry
  1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2005-08-25 22:41 UTC
  To: gentoo-user

daniel wrote:
> I've been beating my head against my keyboard all day trying to figure out how 
> to get SpamAssassin working on our server and so far I've not had a lot of 
> success.

One of the things I found that helped very quickly was the DCC system
(Distributed Checksum Clearinghouse). They store the checksums of all
known spam mails. Once I enabled that, the amount of spam slipping
through went down massively. I very rarely get any spam in my Inbox any
more (although the occasional genuine mail does get flagged, usually
mass-mail like that from eBay, but a whitelist solves that one).

-- 
 Jonathan Wright                           ~ mail at djnauk.co.uk
                                           ~ www.djnauk.co.uk
--
 2.6.12-gentoo-r6-djnauk-b7 Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz
 up 14:34,  4 users,  load average: 1.43, 0.66, 0.40
--
 "There is nothing wrong with going to bed with someone of your  own
 sex. People should be very free with sex, they should draw the line
 at goats."

                                                        ~ Elton John

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-25 22:41 ` Jonathan Wright
@ 2005-08-25 23:05   ` Mike Williams
  2005-08-25 23:37     ` Jonathan Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Williams @ 2005-08-25 23:05 UTC
  To: gentoo-user

On Thursday 25 August 2005 23:41, Jonathan Wright wrote:
> One of the things I found that helped very quickly was the DCC system
> (Distributed Checksum Clearinghouse). They store the checksum of all known
> spam mails. Once I enabled that, my level of false spam went down
> massively. I very rarely get any spam in my Inbox any more (although I
> do get the occasional genuine mail there, usually mass-mail like from
> eBay, but a white-list solves that one).

Any chance you could enlighten the rest of us to how you enabled DCC?
Does spamassassin detect the client tools presence automagically, or is it a 
compile time thing, or a config change, etc?

Ta

-- 
Mike Williams

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-25 23:05   ` Mike Williams
@ 2005-08-25 23:37     ` Jonathan Wright
  2005-08-26  2:09       ` W.Kenworthy
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2005-08-25 23:37 UTC
  To: gentoo-user

Mike Williams wrote:
> Any chance you could enlighten the rest of us to how you enabled DCC?
> Does spamassassin detect the client tools presence automagically, or is it a 
> compile time thing, or a config change, etc?

Simply emerge dcc:

* mail-filter/dcc
     Available versions:  ~1.2.28 1.2.28-r1 ~1.2.74
     Installed:           no
     Homepage:            http://www.rhyolite.com/anti-spam/dcc/
     Description:         Distributed Checksum Clearinghouse

By default, SpamAssassin is set to use all of these filters (DCC, Razor2
and Pyzor). To specifically enable or disable DCC, add 'use_dcc 1' (or
'use_dcc 0') to your user_prefs file.
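
As a rough illustration of that edit, here is a small Python sketch that
sets a single option in the text of a user_prefs file. The set_pref
helper and the sample contents are made up for the example; it works on
a string, so reading and writing the real file (commonly
~/.spamassassin/user_prefs) is left to the caller.

```python
def set_pref(prefs_text, key, value):
    """Return prefs_text with `key value` set exactly once.

    An existing uncommented line for `key` is replaced in place;
    otherwise the setting is appended at the end.
    """
    new_line = f"{key} {value}"
    out = []
    replaced = False
    for line in prefs_text.splitlines():
        parts = line.split()
        if parts and parts[0] == key:
            if not replaced:
                out.append(new_line)
                replaced = True
            # Any further duplicate lines for the same key are dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical file contents, for illustration only.
example = "# user_prefs\nrequired_score 5.0\nuse_dcc 0\n"
print(set_pref(example, "use_dcc", 1), end="")
```

Comment lines such as '# use_dcc 0' are left alone, since the first
token there is '#' rather than the option name.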

-- 
 Jonathan Wright                           ~ mail at djnauk.co.uk
                                           ~ www.djnauk.co.uk
--
 2.6.12-gentoo-r6-djnauk-b7 Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz
 up 15:34,  4 users,  load average: 0.10, 0.13, 0.23
--
 "It's hard enough to be taken seriously in  the  struggle  for  gay
 rights without having a bunch  of  straight  girls  running  around
 kissing  each  other  to  get  the  attention  of  the   boys   and
 videocameras."

                                            ~ M. Robin D'Antan, 2002

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-25 23:37     ` Jonathan Wright
@ 2005-08-26  2:09       ` W.Kenworthy
  2005-08-26 10:18         ` Jonathan Wright
  0 siblings, 1 reply; 10+ messages in thread
From: W.Kenworthy @ 2005-08-26  2:09 UTC
  To: gentoo-user

This works very well, though it can be set up too zealously (as warned
about in the SpamAssassin section), so the trap directories need
checking every few days.

http://www.gentoo.org/doc/en/mailfilter-guide.xml

BillK

On Fri, 2005-08-26 at 00:37 +0100, Jonathan Wright wrote:
> Mike Williams wrote:
> > Any chance you could enlighten the rest of us to how you enabled DCC?



* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-26  2:09       ` W.Kenworthy
@ 2005-08-26 10:18         ` Jonathan Wright
  0 siblings, 0 replies; 10+ messages in thread
From: Jonathan Wright @ 2005-08-26 10:18 UTC
  To: gentoo-user

W.Kenworthy wrote:
> This works very well, though it can be set too zealous (as warned about
> in the spamassassin section) so needs checking of the trap directories
> every few days.

I've just got it configured with the standard settings, and while it
can pull in the occasional valid e-mail, they've always been of the mass
mailing sort (usually from eBay, Amazon, etc.). In terms of
personally addressed e-mails, that's never been a problem.

-- 
 Jonathan Wright                           ~ mail at djnauk.co.uk
                                           ~ www.djnauk.co.uk
--
 2.6.12-gentoo-r6-djnauk-b7 Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz
 up 17:06,  2 users,  load average: 0.14, 0.41, 0.56
--
 "There's this illusion that homosexuals have sex and  heterosexuals
 fall in love. That's  completely  untrue.  Everybody  wants  to  be
 loved."

                                                        ~ Boy George

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-25 21:53 [gentoo-user] SpamAssassin is letting everything through daniel
  2005-08-25 22:41 ` Jonathan Wright
@ 2005-08-27 14:47 ` Jarry
  2005-08-27 17:15   ` Jonathan Wright
  2005-08-27 17:49   ` Willie Wong
  1 sibling, 2 replies; 10+ messages in thread
From: Jarry @ 2005-08-27 14:47 UTC
  To: gentoo-user

daniel wrote:

> I've trained it with sa-learn on a whole slew of ham and spam and it
> continues to let through nearly all the spam coming in.

BTW, in which format do emails have to be in order to use them as
examples for spamassassin-learning? mbox (all mails in 1 file), or
maildir (every mail is a separate file)? Or are both acceptable?

What about MUA mail-folders? Can I use them? I have a lot of mails in
mozilla/thunderbird mail-folder format, which is (I think) very close to
mbox (every mailfolder is one file, e.g. one file for all Sent mails,
etc), but I'm not sure if spamassassin would understand it as many mails
in one file, separated by newline and "From - <date>" line...

Jarry

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-27 14:47 ` Jarry
@ 2005-08-27 17:15   ` Jonathan Wright
  2005-08-28 12:12     ` William Kenworthy
  2005-08-27 17:49   ` Willie Wong
  1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2005-08-27 17:15 UTC
  To: gentoo-user

Jarry wrote:
>> I've trained it with sa-learn on a whole slew of ham and spam and it
>> continues to let through nearly all the spam coming in.
> 
> BTW, in which form do all emails have to be in order to use them as
> examples for spamassassin-learning? mbox (all mails in 1 file), or
> maildir (every mail is a separate file)? Or both are acceptable?

Either. By default, sa-learn will assume the path given is a Maildir
folder, unless it is passed the --mbox option.
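
To make the two cases concrete, a small Python sketch that only builds
the sa-learn argument list for each format. The sa_learn_command helper
and the example paths are hypothetical; --spam, --ham and --mbox are the
sa-learn options being illustrated.

```python
def sa_learn_command(path, kind, mbox=False):
    """Build an sa-learn argument list for training on `path`.

    kind: "spam" or "ham".
    mbox: True when `path` is a single mbox file holding many messages.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", "--" + kind]
    if mbox:
        cmd.append("--mbox")
    cmd.append(path)
    return cmd


# Hypothetical paths: a Maildir-style folder and an mbox file.
print(" ".join(sa_learn_command("Maildir/.Spam/cur", "spam")))
print(" ".join(sa_learn_command("mail/Junk", "spam", mbox=True)))
```

Nothing is executed here; the lists can be handed to subprocess.run
once the paths have been checked.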

> What about MUA mail-folders? Can I use them? I have a lot of mails in
> mozilla/thunderbird mail-folder format, which is (I think) very close to
> mbox (every mailfolder is one file, e.g. one file for all Sent mails,
> etc), but I'm not sure if spamassassin would understand it as many mails
> in one file, separated by newline and "From - <date>" line...

So long as it's in the mbox style, SpamAssassin will work with it. I do
remember seeing some references to the Thunderbird mail system, so I
think it should be OK.
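
One way to sanity-check a mail-folder file before feeding it to
sa-learn is to see whether a generic mbox parser can split it into
messages. The sketch below uses Python's standard mailbox module on a
made-up two-message sample with "From - <date>" separator lines. It
only shows that the stdlib parser accepts that layout; sa-learn has its
own parser, so treat this as a plausibility check rather than proof.

```python
import mailbox
import os
import tempfile


def count_mbox_messages(path):
    """Count the messages the stdlib mbox parser finds in `path`."""
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


# Made-up sample in the "From - <date>" style described above.
SAMPLE = (
    "From - Sat Aug 27 16:47:57 2005\n"
    "From: a@example.org\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From - Sat Aug 27 16:48:10 2005\n"
    "From: b@example.org\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "Inbox")
    with open(sample_path, "w", newline="\n") as handle:
        handle.write(SAMPLE)
    count = count_mbox_messages(sample_path)

print(count)
```

Running count_mbox_messages on a copy of a real folder file and
comparing the result with the message count the mail client shows
gives a quick hint as to whether the file is plain mbox.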

-- 
 Jonathan Wright                           ~ mail at djnauk.co.uk
                                           ~ www.djnauk.co.uk
--
 2.6.12-gentoo-r9-djnauk-b1 Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz
 up  2:58,  2 users,  load average: 3.26, 1.86, 1.06
--
 "Trust a nitwit society like this one to think that there are  only
 two categories - fag and straight."

                                                        ~ Gore Vidal

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-27 14:47 ` Jarry
  2005-08-27 17:15   ` Jonathan Wright
@ 2005-08-27 17:49   ` Willie Wong
  1 sibling, 0 replies; 10+ messages in thread
From: Willie Wong @ 2005-08-27 17:49 UTC
  To: gentoo-user

On Sat, Aug 27, 2005 at 04:47:57PM +0200, Jarry wrote:
> daniel wrote:
> 
> >I've trained it with sa-learn on a whole slew of ham and spam and it
> >continues to let through nearly all the spam coming in.
> 
> BTW, in which form do all emails have to be in order to use them as
> examples for spamassassin-learning? mbox (all mails in 1 file), or
> maildir (every mail is a separate file)? Or both are acceptable?

I know mbox works since I've been using it. I don't know whether
maildir works, but it is likely it does. 

> 
> What about MUA mail-folders? Can I use them? I have a lot of mails in
> mozilla/thunderbird mail-folder format, which is (I think) very close to
> mbox (every mailfolder is one file, e.g. one file for all Sent mails,
> etc), but I'm not sure if spamassassin would understand it as many mails
> in one file, separated by newline and "From - <date>" line...

Isn't that what mbox is? A blank line and From - <date>? If the actual
storage format is like that, I don't see why SpamAssassin wouldn't work. 

W
-- 
"Do you all understand? At this point I'll settle for quiet acquiescence."
~DeathMech, S. Sondhi. P-town PHY 205
Sortir en Pantoufles: up 15 days, 20:51

* Re: [gentoo-user] SpamAssassin is letting everything through
  2005-08-27 17:15   ` Jonathan Wright
@ 2005-08-28 12:12     ` William Kenworthy
  0 siblings, 0 replies; 10+ messages in thread
From: William Kenworthy @ 2005-08-28 12:12 UTC
  To: gentoo-user

The mail gateway user guide uses a separate email account (spamtrap) for
spamassassin's use.  I just copy/move spam and ham to the spam or ham
folders of this account - a cron job does an sa-learn on these folders.
In Evolution it's a no-brainer to drag and drop between accounts.
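
A rough idea of what such a cron-driven script could look like, as a
Python sketch. The folder names "spam" and "ham", the account path and
the helper names are assumptions made for the example, not taken from
the guide; only the sa-learn --spam / --ham options are real.

```python
import os
import subprocess


def learn_commands(account_root):
    """Return sa-learn argument lists for the spam and ham folders.

    Folders that do not exist are skipped silently.
    """
    commands = []
    for kind in ("spam", "ham"):
        folder = os.path.join(account_root, kind)
        if os.path.isdir(folder):
            commands.append(["sa-learn", "--" + kind, folder])
    return commands


def run_learning(account_root, runner=subprocess.run):
    """Run each command; `runner` can be swapped out for a dry run."""
    for cmd in learn_commands(account_root):
        runner(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical location of the spamtrap account's folders.
    # This only prints the commands; call run_learning() to execute them.
    for cmd in learn_commands("/home/spamtrap/Maildir"):
        print(" ".join(cmd))
```

A crontab entry would then call the script at whatever interval suits
the volume of mail being sorted into those folders.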

BillK


On Sat, 2005-08-27 at 18:15 +0100, Jonathan Wright wrote:
> Jarry wrote:
> >> I've trained it with sa-learn on a whole slew of ham and spam and it
> >> continues to let through nearly all the spam coming in.
> > 
> > BTW, in which form do all emails have to be in order to use them as
> > examples for spamassassin-learning? mbox (all mails in 1 file), or
> > maildir (every mail is a separate file)? Or both are acceptable?
> 
...


Thread overview: 10+ messages
2005-08-25 21:53 [gentoo-user] SpamAssassin is letting everything through daniel
2005-08-25 22:41 ` Jonathan Wright
2005-08-25 23:05   ` Mike Williams
2005-08-25 23:37     ` Jonathan Wright
2005-08-26  2:09       ` W.Kenworthy
2005-08-26 10:18         ` Jonathan Wright
2005-08-27 14:47 ` Jarry
2005-08-27 17:15   ` Jonathan Wright
2005-08-28 12:12     ` William Kenworthy
2005-08-27 17:49   ` Willie Wong
