Dear polite people in this forum,
I've recently migrated to a new mail server. Because the hardware "age gap" was too large, simply upgrading from Debian Squeeze to Jessie during the move would have been difficult (and possibly would not have solved my problem either). So I just installed a clean Jessie and moved the user accounts, old e-mail etc. by hand. Well, at least I now know more about the internals.
The one thing that I seem to be struggling with is the bayesian database operated by Spamassassin - as enslaved by amavisd-new. (Heh, I already got a taste of that when I wanted the SPAM score headers included in every e-mail message: $sa_tag_level_deflt lives in an amavis config file.)
I do have
use_bayes 1
bayes_path /var/lib/spamassassin/.spamassassin/bayes
in spamassassin/local.cf. I find it curious that the path ends with "bayes", yet this last component is not an actual directory; it seems to be a mere prefix for the _toks and _seen files.
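To see that prefix behaviour for yourself, here is a minimal sketch that lists whatever files sit behind a given bayes_path value. The function name list_bayes_files is made up for illustration; the path in the usage comment is the one from local.cf above.

```shell
#!/bin/sh
# List the database files that share a given bayes_path prefix.
# $1 is the bayes_path value (a filename prefix, not a directory).
# list_bayes_files is a hypothetical helper name, not part of SpamAssassin.
list_bayes_files() {
    prefix="$1"
    for f in "$prefix"_*; do
        # If nothing matches, the glob stays literal, so test for existence.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example:
# list_bayes_files /var/lib/spamassassin/.spamassassin/bayes
```

On my setup this should print the bayes_toks and bayes_seen files; an empty result would suggest the prefix points somewhere the database does not live.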
If I try "spamassassin -D --lint 2>&1 | less" I can see some praise:
Jul 9 11:21:15.091 [5076] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_toks
Jul 9 11:21:15.091 [5076] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_seen
Possibly depending on the directory where I run it, I have once seen BAYES_20 in that listing as well.
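The debug output is long, so a small filter helps to spot the Bayes lines. This is only a sketch: bayes_debug_lines is a name I made up, and the patterns are based on the two log lines quoted above plus the BAYES_nn rule naming.

```shell
#!/bin/sh
# Keep only the Bayes-related lines from SpamAssassin debug output on stdin:
# "dbg: bayes:" messages plus any line naming a BAYES_nn rule.
# bayes_debug_lines is a hypothetical helper name.
bayes_debug_lines() {
    grep -E 'dbg: bayes:|BAYES_[0-9]+'
}

# Example:
# spamassassin -D --lint 2>&1 | bayes_debug_lines
```

As far as I know, -D also accepts a comma-separated list of debug areas (for example "-D bayes"), which narrows the output at the source.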
Also sa-learn-cyrus seems to be updating the database just fine, and sa-sync doesn't complain either.
I've actually migrated the bayes DB files from the old server, using
sa-learn --backup
sa-learn --restore=...
and I had to adjust some permissions afterwards, if memory serves... sa-learn-cyrus.conf contains the user and group under which it should run, which should match the ownership of the database.
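The ownership check can be sketched like this. Assumptions: GNU stat (for the -c option), and a made-up helper name check_bayes_owner; the user name to pass in is whatever sa-learn-cyrus.conf specifies on your system.

```shell
#!/bin/sh
# Report database files behind a bayes_path prefix that are NOT owned
# by the expected user. Returns 0 if all match, 1 otherwise.
# Assumes GNU stat; check_bayes_owner is a hypothetical helper name.
check_bayes_owner() {
    prefix="$1"
    want="$2"
    status=0
    for f in "$prefix"_*; do
        [ -e "$f" ] || continue
        owner=$(stat -c '%U' "$f")
        if [ "$owner" != "$want" ]; then
            printf '%s is owned by %s, expected %s\n' "$f" "$owner" "$want"
            status=1
        fi
    done
    return "$status"
}

# Example (SOME_USER is a placeholder for the user from sa-learn-cyrus.conf):
# check_bayes_owner /var/lib/spamassassin/.spamassassin/bayes SOME_USER
```

Note that the user amavisd-new runs as also has to be able to read the files, so the same check is worth repeating with that user's group memberships in mind.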
Now for the curious bit:
I cannot see any traces of the bayes filter actually working on the e-mail passing through. Amavis does work, I can see its actions in /var/log/amavis.log, sometimes it catches a SPAM based on its other heuristic rules. But I haven't managed to catch a BAYES score in the received e-mails (which now do contain the expected X-Spam-Status header) nor in the highly positive stuff quarantined in /var/virusmails.
In other words, if I run "grep -ri bayes *" in /var/log/ and /var/virusmails/, I get exactly nothing :-(
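One thing worth ruling out before trusting that grep: if the quarantined files are gzip-compressed (I believe amavis can store them that way, with names ending in .gz), a plain grep will not see inside them. A minimal sketch that handles both cases, assuming GNU gzip (whose -f option passes uncompressed input through unchanged); bayes_hits is a made-up name:

```shell
#!/bin/sh
# For each file given, print the first BAYES_nn rule name found in it,
# transparently decompressing gzip files. Assumes GNU gzip and a grep
# that supports -o. bayes_hits is a hypothetical helper name.
bayes_hits() {
    for f in "$@"; do
        hit=$(gzip -cdf < "$f" 2>/dev/null | grep -o 'BAYES_[0-9][0-9]*' | head -n 1)
        printf '%s: %s\n' "$f" "${hit:-no BAYES rule found}"
    done
}

# Example:
# bayes_hits /var/virusmails/*
```

If every file still reports no BAYES rule, the filter really is not contributing to the score, rather than merely being hidden from grep.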
Is it possible that the bayesian filter is working and I just don't know?
If the bayes filter does not actually work (in spamassassin under amavisd-new), what other places should I look for relevant config?
If it may be working just fine, is there a way to increase its verbosity? To have its score always printed in some log, or preferably, included in the X-Spam-Status header?
Also, is there a way for me to map the Bayes score to a Spamassassin score increment? I mean - to see and maybe configure how Amavis or SA adds the bayesian contribution...
I was also wondering if I'm missing something in the system, such as a package not installed. But "aptitude search bayes" only returns "spambayes", which is some python-based project, competing with the bayesian filter that's part of spamassassin...
Any ideas are welcome :-)
Frank
It seems I've found an answer. In /etc/spamassassin/local.cf, you need:
It was the third line that was missing in my case.
BTW, while looking for the problem, I've managed to insert a "logging probe" in the Spamassassin Perl module guts, which printed a Perl call stack backtrace for me:
In /usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm :
Taken almost verbatim from Thariama's posting here.
Obviously I needed some debugging toggle, to display the debug messages:
In /etc/amavis/conf.d/50-user :
Actually it's just the last two lines that are key to debugging in Amavis and SpamAssassin. The lines above are just FYI.
==== EDIT another hour later : ====
...but wait, there's more, seems like that wasn't the end-game yet :-)
Just after I sent the previous optimistic message, I got a cold shower: the BAYES scores were gone again. So I went back to some serious level of debug, tried removing some config related to auto-expiry that I was playing with at the same time, but even as I got the config back to where it used to work, the Bayes was simply gone. Same symptoms.
While I was fumbling sadly through the debug log, I noticed another promising warning:
_WARN: plugin: eval failed: Insecure dependency in sprintf while running with -T switch at /usr/share/perl5/Mail/SpamAssassin/Logger.pm line 241.
Now what the hell is the -T switch... "man perl" does not list it right there (wish I knew the right chapter).
The spamassassin source code wasn't much help either.
But after a bit of Googling, after I narrowed down the query, I got this:
http://search.cpan.org/~bdfoy/PerlPowerTools-1.012/bin/printf
And several other pointers to an "Insecure dependency in eval while running setuid" error.
Same thing? Probably. The -T switch is for "taint mode".
https://perldoc.perl.org/perlsec.html#Taint-mode
And it's a security measure, so that your casual "evals with a printf inside" are not easily hijacked for "code injection".
Now where the hell does that -T switch come into play?
SpamAssassin is running as a module of Amavis.
I already knew that Amavis was really a Perl script.
The Perl interpreter probably gets called using the #! shell specification on the first line in /usr/sbin/amavisd-new .
You betcha.
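Checking that first line can be sketched as follows. It is a simple pattern match on the interpreter line, not a full parser, and shebang_has_taint is a made-up name.

```shell
#!/bin/sh
# Print the interpreter line of a script and report (via exit status)
# whether it invokes perl with a switch group containing T.
# shebang_has_taint is a hypothetical helper name.
shebang_has_taint() {
    first=$(head -n 1 "$1")
    printf '%s\n' "$first"
    case "$first" in
        '#!'*perl*' -'*T*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# shebang_has_taint /usr/sbin/amavisd-new && echo "taint mode is on"
```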
From there, the workaround is simple.
But ... OOPS! I probably shouldn't tell anyone :->
Still... I don't understand why it suddenly worked for a while, and then just as suddenly stopped. Where's the hidden state? I did restart Amavis after each change in the config files, meaning I restarted the Perl interpreter all over each time...
"This is some spooky $#|t we got here, sarge..."
(to paraphrase Henry Rollins in the Lost Highway)
Frank