[AMaViS-user] sa-learn vs spamassassin tests

Michael Scheidell, 09-05-2006

sa-learn -L --spam and spamassassin -L -r learn the same spam differently.
SA version 3.13; using the db or sql database doesn't seem to matter,
and neither does running --sync or not.

It also doesn't matter whether I run sa-learn --spam or spamassassin -r
first. Either way, the two commands give different results:


running spamassassin -Lr against a clean db with my test email gives me
130 tokens.
running sa-learn -L --spam against a clean db, same test email gives me
146 tokens.
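
To see which tokens differ, rather than only how many, the two token lists could be compared directly. This is a rough sketch, not something from the original test: it assumes each database state was saved to a file with sa-learn --dump data, and that the token is the last whitespace-separated field on each line. The function name token_diff is made up for illustration.

```shell
# List tokens that appear in only one of two saved `sa-learn --dump data` files.
# Assumption: the token is the last field on every line of the dump.
token_diff() {
    a=$(mktemp)
    b=$(mktemp)
    # Reduce each dump to a sorted, de-duplicated list of tokens.
    awk '{ print $NF }' "$1" | LC_ALL=C sort -u > "$a"
    awk '{ print $NF }' "$2" | LC_ALL=C sort -u > "$b"
    echo "only in $1:"
    LC_ALL=C comm -23 "$a" "$b"
    echo "only in $2:"
    LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

Usage would be something like saving one dump after spamassassin -Lr on a clean db, another after sa-learn -L --spam on a clean db, and then running token_diff on the two files. SA 3.x appears to store tokens in hashed form, so the output is likely to be hashes rather than readable words.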

Test:

# clean out old sa db:
rm -rf /var/db/spamassassin

#create new one:
mkdir /var/db/spamassassin
chown vscan:vscan /var/db/spamassassin

#test it:
su - vscan -c "sa-learn --sync && sa-learn --dump magic"

0.000 0 3 0 non-token data: bayes db version
0.000 0 0 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 0 0 non-token data: ntokens
0.000 0 0 0 non-token data: oldest atime
0.000 0 0 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

# run new email through it (let's not mess with dcc, razor, spamcop for
# this test)
su - vscan -c "spamassassin -rL < /tmp/spam.eml"
1 message(s) examined.
su - vscan -c "sa-learn --sync && sa-learn --dump magic"

0.000 0 3 0 non-token data: bayes db version
0.000 0 1 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 130 0 non-token data: ntokens
0.000 0 1157160113 0 non-token data: oldest atime
0.000 0 1157160113 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
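
When repeating this test, it may be easier to pull out just the token count than to read the whole listing each time. A minimal sketch, assuming the column layout seen in the dumps in this post (the value sits in the third field); the function name ntokens_from_magic is made up for illustration.

```shell
# Print the ntokens value from `sa-learn --dump magic` output read on stdin.
# Assumption: the count is the third whitespace-separated field, as in the
# listings in this post.
ntokens_from_magic() {
    awk '/non-token data: ntokens/ { print $3 }'
}

# Example using two lines copied from the dump above:
printf '%s\n' \
    '0.000 0 1 0 non-token data: nspam' \
    '0.000 0 130 0 non-token data: ntokens' | ntokens_from_magic
```

With the two sample lines this prints 130. Against a live database it could be used as: su - vscan -c "sa-learn --dump magic" | ntokens_from_magic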

with sync: (no difference)
su vscan -c "sa-learn --sync && sa-learn --dump magic"
0.000 0 3 0 non-token data: bayes db version
0.000 0 1 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 130 0 non-token data: ntokens
0.000 0 1157160113 0 non-token data: oldest atime
0.000 0 1157160113 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

Now try sa-learn:
su - vscan -c "sa-learn -L --spam < /tmp/spam.eml"
Learned tokens from 1 message(s) (1 message(s) examined)
su - vscan -c "sa-learn --sync && sa-learn --dump magic"

Yep, it does something different enough (nspam goes to 2, ntokens to 227):

0.000 0 3 0 non-token data: bayes db version
0.000 0 2 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 227 0 non-token data: ntokens
0.000 0 1157160113 0 non-token data: oldest atime
0.000 0 1157464500 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

#control: let's run sa-learn first:
rm -rf /var/db/spamassassin
sme-500# mkdir -p spamassassin
sme-500# chown vscan:vscan spamassassin
sme-500# su - vscan -c "sa-learn --sync && sa-learn --dump magic"
0.000 0 3 0 non-token data: bayes db version
0.000 0 0 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 0 0 non-token data: ntokens
0.000 0 0 0 non-token data: oldest atime
0.000 0 0 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

su - vscan -c "sa-learn -L --spam < /tmp/spam.eml"
su - vscan -c "sa-learn --sync && sa-learn --dump magic"
0.000 0 3 0 non-token data: bayes db version
0.000 0 1 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 146 0 non-token data: ntokens
0.000 0 1157464809 0 non-token data: oldest atime
0.000 0 1157464809 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

(remember, if we ran spamassassin -r first, we only got 130 tokens)

su - vscan -c "spamassassin -Lr < /tmp/spam.eml"
1 message(s) examined.
sme-500# su - vscan -c "sa-learn --sync && sa-learn --dump magic"
0.000 0 3 0 non-token data: bayes db version
0.000 0 2 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 227 0 non-token data: ntokens
0.000 0 1157160113 0 non-token data: oldest atime
0.000 0 1157464809 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

Same 227, so it doesn't matter whether we run spamassassin -Lr or
sa-learn -L --spam first, but do we need to do both?
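
A quick check on the counts, assuming ntokens counts distinct tokens and nothing was expired between runs: if the two commands produced entirely different token sets, the combined database would hold 130 + 146 = 276 tokens; if one set contained the other, it would hold 146. The dumps show 227 after both, which would mean the two sets overlap only partially. The variable names below are just for illustration.

```shell
# Token counts taken from the dumps in this post.
r_tokens=130        # spamassassin -Lr on a clean db
learn_tokens=146    # sa-learn -L --spam on a clean db
combined=227        # after running both, in either order

# Inclusion-exclusion: shared = |A| + |B| - |A or B|
shared=$(( r_tokens + learn_tokens - combined ))
echo "shared=$shared r_only=$(( r_tokens - shared )) learn_only=$(( learn_tokens - shared ))"
```

This prints shared=49 r_only=81 learn_only=97, i.e. under that assumption each command learns a sizeable set of tokens the other does not.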


--
Michael Scheidell, CTO
SECNAP Network Security / www.secnap.com
scheidell@secnap.net / 1+561-999-5000, x 1131

