Poor hit rate with Spamassassin/Amavis



Pinky
19.09.08, 15:46
Hello,
I am running Postfix with Amavis (+clamav, +spamassassin) on Debian here.
Unfortunately I still receive about 5-10 spam mails per day that slip through the filter. Oddly enough, some of these mails resemble each other from time to time, and although I have enabled sa-learn, they seem to be tagged too low (X-Priority 3).

On the whole the filter works quite well, since often more than 6000 mails per day are "rejected". My goal now is to configure the system so that at most 2-3 spam mails get through the filter.

I would therefore ask you to look through my configuration files and point out errors and suggestions for improvement.

Here is a list of the configuration files:

/etc/amavis/conf.d/20-debian_defaults



$QUARANTINEDIR = "$MYHOME/virusmails";

$log_recip_templ = undef; # disable by-recipient level-0 log entries
$DO_SYSLOG = 1; # log via syslogd (preferred)
$syslog_ident = 'amavis'; # syslog ident tag, prepended to all messages
$syslog_facility = 'mail';
$syslog_priority = 'debug'; # switch to info to drop debug output, etc

$enable_db = 1; # enable use of BerkeleyDB/libdb (SNMP and nanny)
$enable_global_cache = 1; # enable use of libdb-based cache if $enable_db=1

$inet_socket_port = 10024; # default listening socket

$sa_spam_subject_tag = '***SPAM*** ';
$sa_tag_level_deflt = 2.0; # add spam info headers if at, or above that level
$sa_tag2_level_deflt = 6.31; # add 'spam detected' headers at that level
$sa_kill_level_deflt = 6.31; # triggers spam evasive actions
$sa_dsn_cutoff_level = 10; # spam level beyond which a DSN is not sent

$sa_mail_body_size_limit = 200*1024; # don't waste time on SA if mail is larger
$sa_local_tests_only = 0; # only tests which do not require internet access?

# Quota limits to avoid bombs (like 42.zip)

$MAXLEVELS = 14;
$MAXFILES = 1500;
$MIN_EXPANSION_QUOTA = 100*1024; # bytes
$MAX_EXPANSION_QUOTA = 300*1024*1024; # bytes

# You should:
# Use D_DISCARD to discard data (viruses)
# Use D_BOUNCE to generate local bounces by amavisd-new
# Use D_REJECT to generate local or remote bounces by the calling MTA
# Use D_PASS to deliver the message
#
# Whatever you do, *NEVER* use D_REJECT if you have other MTAs *forwarding*
# mail to your account. Use D_BOUNCE instead, otherwise you are delegating
# the bounce work to your friendly forwarders, which might not like it at all.
#
# On dual-MTA setups, one can often D_REJECT, as this just makes your own
# MTA generate the bounce message. Test it first.
#
# Bouncing viruses is stupid, always discard them after you are sure the AV
# is working correctly. Bouncing real SPAM is also useless, if you cannot
# D_REJECT it (and don't D_REJECT mail coming from your forwarders!).

$final_virus_destiny = D_DISCARD; # (data not lost, see virus quarantine)
$final_banned_destiny = D_BOUNCE; # D_REJECT when front-end MTA
$final_spam_destiny = D_BOUNCE;
$final_bad_header_destiny = D_PASS; # False-positive prone (for spam)

#$virus_admin = "michi\@$mydomain"; # due to D_DISCARD default

# Leave empty (undef) to add no header
$X_HEADER_LINE = "Debian $myproduct_name at $mydomain";

# REMAINING IMPORTANT VARIABLES ARE LISTED HERE BECAUSE OF LONGER ASSIGNMENTS

#
# DO NOT SEND VIRUS NOTIFICATIONS TO OUTSIDE OF YOUR DOMAIN. EVER.
#
# These days, almost all viruses fake the envelope sender and mail headers.
# Therefore, "virus notifications" became nothing but undesired, aggravating
# SPAM. This holds true even inside one's domain. We disable them all by
# default, except for the EICAR test pattern.
#

@viruses_that_fake_sender_maps = (new_RE(
[qr'\bEICAR\b'i => 0], # av test pattern name
[qr/.*/ => 1], # true for everything else
));

@keep_decoded_original_maps = (new_RE(
# qr'^MAIL$', # retain full original message for virus checking (can be slow)
qr'^MAIL-UNDECIPHERABLE$', # recheck full mail if it contains undecipherables
qr'^(ASCII(?! cpio)|text|uuencoded|xxencoded|binhex)'i,
# qr'^Zip archive data', # don't trust Archive::Zip
));


# for $banned_namepath_re, a new-style of banned table, see amavisd.conf-sample

$banned_filename_re = new_RE(
# qr'^UNDECIPHERABLE$', # is or contains any undecipherable components

# block certain double extensions anywhere in the base name
qr'\.[^./]*\.(exe|vbs|pif|scr|bat|cmd|com|cpl|dll)\.?$'i,

qr'\{[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}\}?'i, # Windows Class ID CLSID, strict

qr'^application/x-msdownload$'i, # block these MIME types
qr'^application/x-msdos-program$'i,
qr'^application/hta$'i,

# qr'^application/x-msmetafile$'i, # Windows Metafile MIME type
# qr'^\.wmf$', # Windows Metafile file(1) type

# qr'^message/partial$'i, qr'^message/external-body$'i, # rfc2046 MIME types

# [ qr'^\.(Z|gz|bz2)$' => 0 ], # allow any in Unix-compressed
# [ qr'^\.(rpm|cpio|tar)$' => 0 ], # allow any in Unix-type archives
# [ qr'^\.(zip|rar|arc|arj|zoo)$'=> 0 ], # allow any within such archives

qr'.\.(exe|vbs|pif|scr|bat|cmd|com|cpl)$'i, # banned extension - basic
# qr'.\.(ade|adp|app|bas|bat|chm|cmd|com|cpl|crt|emf|exe|fxp|grp|hlp|hta|
# inf|ins|isp|js|jse|lnk|mda|mdb|mde|mdw|mdt|mdz|msc|msi|msp|mst|
# ops|pcd|pif|prg|reg|scr|sct|shb|shs|vb|vbe|vbs|
# wmf|wsc|wsf|wsh)$'ix, # banned ext - long

# qr'.\.(mim|b64|bhx|hqx|xxe|uu|uue)$'i, # banned extension - WinZip vulnerab.

qr'^\.(exe-ms)$', # banned file(1) types
# qr'^\.(exe|lha|tnef|cab|dll)$', # banned file(1) types
);
# See http://support.microsoft.com/default.aspx?scid=kb;EN-US;q262631
# and http://www.cknow.com/vtutor/vtextensions.htm


# ENVELOPE SENDER SOFT-WHITELISTING / SOFT-BLACKLISTING

@score_sender_maps = ({ # a by-recipient hash lookup table,
# results from all matching recipient tables are summed

# ## per-recipient personal tables (NOTE: positive: black, negative: white)
# 'user1@example.com' => [{'bla-mobile.press@example.com' => 10.0}],
# 'user3@example.com' => [{'.ebay.com' => -3.0}],
# 'user4@example.com' => [{'cleargreen@cleargreen.com' => -7.0,
# '.cleargreen.com' => -5.0}],

## site-wide opinions about senders (the '.' matches any recipient)
'.' => [ # the _first_ matching sender determines the score boost

new_RE( # regexp-type lookup table, just happens to be all soft-blacklist
[qr'^(bulkmail|offers|cheapbenefits|earnmoney|foryou)@'i => 5.0],
[qr'^(greatcasino|investments|lose_weight_today|market\.alert)@'i=> 5.0],
[qr'^(money2you|MyGreenCard|new\.tld\.registry|opt-out|opt-in)@'i=> 5.0],
[qr'^(optin|saveonlsmoking2002k|specialoffer|specialoffers)@'i => 5.0],
[qr'^(stockalert|stopsnoring|wantsome|workathome|yesitsfree)@'i => 5.0],
[qr'^(your_friend|greatoffers)@'i => 5.0],
[qr'^(inkjetplanet|marketopt|MakeMoney)\d*@'i => 5.0],
),

# read_hash("/var/amavis/sender_scores_sitewide"),

{ # a hash-type lookup table (associative array)
'nobody@cert.org' => -3.0,
'cert-advisory@us-cert.gov' => -3.0,
'owner-alert@iss.net' => -3.0,
'slashdot@slashdot.org' => -3.0,
'securityfocus.com' => -3.0,
'ntbugtraq@listserv.ntbugtraq.com' => -3.0,
'security-alerts@linuxsecurity.com' => -3.0,
'mailman-announce-admin@python.org' => -3.0,
'amavis-user-admin@lists.sourceforge.net'=> -3.0,
'amavis-user-bounces@lists.sourceforge.net' => -3.0,
'spamassassin.apache.org' => -3.0,
'notification-return@lists.sophos.com' => -3.0,
'owner-postfix-users@postfix.org' => -3.0,
'owner-postfix-announce@postfix.org' => -3.0,
'owner-sendmail-announce@lists.sendmail.org' => -3.0,
'sendmail-announce-request@lists.sendmail.org' => -3.0,
'donotreply@sendmail.org' => -3.0,
'ca+envelope@sendmail.org' => -3.0,
'noreply@freshmeat.net' => -3.0,
'owner-technews@postel.acm.org' => -3.0,
'ietf-123-owner@loki.ietf.org' => -3.0,
'cvs-commits-list-admin@gnome.org' => -3.0,
'rt-users-admin@lists.fsck.com' => -3.0,
'clp-request@comp.nus.edu.sg' => -3.0,
'surveys-errors@lists.nua.ie' => -3.0,
'emailnews@genomeweb.com' => -5.0,
'yahoo-dev-null@yahoo-inc.com' => -3.0,
'returns.groups.yahoo.com' => -3.0,
'clusternews@linuxnetworx.com' => -3.0,
lc('lvs-users-admin@LinuxVirtualServer.org') => -3.0,
lc('owner-textbreakingnews@CNNIMAIL12.CNN.COM') => -5.0,

# soft-blacklisting (positive score)
'sender@example.net' => 3.0,
'.example.net' => 1.0,

},
], # end of site-wide tables
});

1; # insure a defined return




/etc/amavis/conf.d/5-content_filter_mode


use strict;

# You can modify this file to re-enable SPAM checking through spamassassin
# and to re-enable antivirus checking.

#
# Default antivirus checking mode
# Uncomment the two lines below to enable it back
#

@bypass_virus_checks_maps = (
\%bypass_virus_checks, \@bypass_virus_checks_acl, \$bypass_virus_checks_re);


#
# Default SPAM checking mode
# Uncomment the two lines below to enable it back
#

@bypass_spam_checks_maps = (
\%bypass_spam_checks, \@bypass_spam_checks_acl, \$bypass_spam_checks_re);

1; # insure a defined return



/etc/spamassassin/local.cf



# Add *****SPAM***** to the Subject header of spam e-mails
#
rewrite_header Subject *****SPAM*****

# Set the threshold at which a message is considered spam (default: 5.0)
#
required_score 5.0


# Use Bayesian classifier (default: 1)
#
use_bayes 1
use_bayes_rules 1

# Bayesian classifier auto-learning (default: 1)
#
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 1
bayes_auto_learn_threshold_spam 14.00
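
A side note on the thresholds, offered as an assumption about this setup rather than a diagnosis: when SpamAssassin runs inside amavisd-new, the $sa_tag_level_deflt, $sa_tag2_level_deflt and $sa_kill_level_deflt values from 20-debian_defaults decide what happens to a message, and required_score in local.cf does not act as the cut-off. With the levels posted above, a mail scoring 5.5 would therefore be delivered with info headers only. The sketch below just restates that comparison; the helper name amavis_action and its output labels are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: map a spam score to the amavisd-new level it reaches.
# Defaults are the values from the posted 20-debian_defaults.
# Usage: amavis_action SCORE [TAG_LEVEL] [TAG2_LEVEL] [KILL_LEVEL]
amavis_action() {
    awk -v score="$1" -v tag="${2:-2.0}" -v tag2="${3:-6.31}" -v kill="${4:-6.31}" '
        BEGIN {
            # Adding 0 forces a numeric comparison instead of a string one.
            score += 0; tag += 0; tag2 += 0; kill += 0
            if (score >= kill)      print "spam-action"
            else if (score >= tag2) print "spam-headers"
            else if (score >= tag)  print "info-headers"
            else                    print "no-headers"
        }'
}
```

For example, `amavis_action 5.5` prints info-headers, while `amavis_action 6.31` prints spam-action.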



my sa-learn script



#!/bin/bash -e

# Bayes database of the amavis user; DBPATH is the path prefix of its files
SADIR=/var/lib/amavis/.spamassassin
DBPATH="$SADIR/bayes"
SPAMFOLDERS="/home/michi/.mails/.Junk/cur"
HAMFOLDERS="/home/michi/.mails/.Trash/cur"

# The folder lists stay unquoted in the loops on purpose, so that several
# space-separated folders can be given
for spamfolder in $SPAMFOLDERS; do
    echo "Learning spam from $spamfolder"
    nice sa-learn --spam --showdots --dbpath "$DBPATH" "$spamfolder"
done

for hamfolder in $HAMFOLDERS; do
    echo "Learning ham from $hamfolder"
    nice sa-learn --ham --showdots --dbpath "$DBPATH" "$hamfolder"
done


sa-learn --dump magic --dbpath /var/lib/amavis/.spamassassin


0.000 0 3 0 non-token data: bayes db version
0.000 0 100572 0 non-token data: nspam
0.000 0 5426 0 non-token data: nham
0.000 0 149808 0 non-token data: ntokens
0.000 0 1221107308 0 non-token data: oldest atime
0.000 0 1221831661 0 non-token data: newest atime
0.000 0 1221830281 0 non-token data: last journal sync atime
0.000 0 1221798324 0 non-token data: last expiry atime
0.000 0 691200 0 non-token data: last expire atime delta
0.000 0 5493 0 non-token data: last expire reduction count

I hope I have attached all the important configs.

Thanks
Michi

cane
19.09.08, 16:30
I think it is a very good rate if only 5-10 out of 6000 spams get through.

You could possibly also use greylisting, which takes a lot of load off the box.

Regards
cane

Pinky
19.09.08, 16:40
Thanks for the tip!
Let's see what effect greylisting has.

Pinky
22.09.08, 22:11
Unfortunately greylisting does not deliver the desired results either.
Are there any further extensions for Postfix and Amavis?

Thanks
Michi

unlimitopen
22.09.08, 22:56
Good day everyone,

I have a really silly question. Above it says that about 6000 mails per day are rejected. Are these mails really sent back to the sender, or do they go to /dev/null?
Did you work out the configuration shown above through "learning by doing" or from manuals? If it was from technical books, could you name them here?

Many thanks!

cane
23.09.08, 01:54
"Unfortunately greylisting does not deliver the desired results either.
Are there any further extensions for Postfix and Amavis?"


Pyzor, Razor and DCC as "distributed checksum" projects. Note that you have to make sure the firewall does not block their communication.

Otherwise, the RBL from iX.

You can also find a lot of great information here:
http://wiki.apache.org/spamassassin/

If nothing helps, you will have to post the mails that get through, or better still, start an analysis with other interested people on the amavis or spamassassin mailing list.
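
Before posting, it can help to see which score each leaked mail actually received. A minimal sketch, assuming the mails carry an X-Spam-Status header written by amavisd-new (which, per the comment in the config above, is only added at or above $sa_tag_level_deflt) and assuming the score appears in it as score= or as hits=. The function name spam_score is hypothetical; the folder is the Junk folder from the sa-learn script:

```shell
#!/bin/bash
# Hypothetical helper: print the score from a mail's X-Spam-Status header.
# Accepts "score=" as well as "hits="; prints nothing if the header is missing.
spam_score() {
    sed -n -E 's/^X-Spam-Status:.*(score|hits)=(-?[0-9]+(\.[0-9]+)?).*/\2/p' "$1" | head -n 1
}

# List the mails in the Junk folder, lowest score first.
for mail in /home/michi/.mails/.Junk/cur/*; do
    if [ -f "$mail" ]; then
        printf '%s\t%s\n' "$(spam_score "$mail")" "$mail"
    fi
done | sort -n
```

Mails that show no score at all either stayed below the tag level or were never scanned, which is worth knowing before looking at individual rules.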

But as I said, your rate is really okay :)

Regards
cane

Pinky
23.09.08, 14:21
Is my "sa-learn" script correct? Although I move the spam mails into the Junk folder and run sa-learn over it, new mails built on the same pattern seem to trick the filter as well.

cane
23.09.08, 15:11
For reasonable Bayes filtering you should, or rather must, have at least several hundred, better a thousand, SPAM AND HAM messages learned.
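
For comparison, SpamAssassin's defaults (bayes_min_spam_num and bayes_min_ham_num, both 200) keep Bayes scoring switched off until that many messages of each kind have been learned. The dump posted above reports nspam 100572 and nham 5426, so both minimums are met. A minimal sketch for pulling those two counters out of sa-learn --dump magic output; the function name bayes_counts is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# print the number of learned spam and ham messages.
bayes_counts() {
    awk '
        $NF == "nspam" { spam = $3 }   # third column holds the counter value
        $NF == "nham"  { ham  = $3 }
        END { printf "nspam=%d nham=%d\n", spam, ham }
    '
}

# Demo with the two relevant lines from the dump posted above.
bayes_counts <<'EOF'
0.000 0 100572 0 non-token data: nspam
0.000 0 5426 0 non-token data: nham
EOF
```

Against the live database the call would be `sa-learn --dump magic --dbpath /var/lib/amavis/.spamassassin | bayes_counts`.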

Regards
cane

Roger Wilco
23.09.08, 21:31
"Above it says that about 6000 mails per day are rejected. Are these mails really sent back to the sender, or do they go to /dev/null?"
Neither. Normally the e-mails are simply not accepted. Only really broken setups accept the e-mails and then bounce them, or send them into nirvana without notifying the sender or recipient.
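
Whether mail was refused during the SMTP dialogue can be read from the Postfix log, where such refusals appear as NOQUEUE: reject: lines followed by the reply code. A minimal sketch that counts permanent (5xx) and temporary (4xx, as greylisting produces) rejections; the function name reject_summary is hypothetical, and the log path in the usage line below is an assumed Debian default:

```shell
#!/bin/bash
# Hypothetical helper: read a Postfix mail log on stdin and count
# SMTP-time rejections by reply class.
reject_summary() {
    awk '
        /NOQUEUE: reject:/ {
            # The reply code is the first 4xx/5xx field after "reject:".
            seen = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "reject:") { seen = 1; continue }
                if (seen && $i ~ /^[45][0-9][0-9]$/) {
                    if (substr($i, 1, 1) == "5") perm++; else temp++
                    break
                }
            }
        }
        END { printf "5xx=%d 4xx=%d\n", perm, temp }
    '
}
```

Usage would be `reject_summary < /var/log/mail.log`. Assuming the usual after-queue setup, spam that amavisd-new handles after acceptance, for example through $final_spam_destiny = D_BOUNCE in the config posted above, does not show up in this count.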