
Bayesian Algorithm


Bayesian filtering is one of the more common features that anti-spam solutions use today to identify spam. It requires manual intervention, because a user has to train the filter on what spam looks like. The filter is based on Bayes' theorem, which uses probabilities to weigh a message.
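As a rough illustration of how Bayes' theorem weighs a single word, the probability that a message is spam given that it contains a word w can be written as below. The labels "spam" and "ham" (legitimate mail) and the numbers in the worked line are assumptions chosen for illustration, not figures from any product.

```latex
P(\text{spam} \mid w) =
  \frac{P(w \mid \text{spam})\, P(\text{spam})}
       {P(w \mid \text{spam})\, P(\text{spam}) + P(w \mid \text{ham})\, P(\text{ham})}

% Illustrative numbers only: w appears in 40% of spam and 1% of ham,
% and half of all mail is spam.
P(\text{spam} \mid w) = \frac{0.40 \times 0.5}{0.40 \times 0.5 + 0.01 \times 0.5}
                      = \frac{0.200}{0.205} \approx 0.976
```

A real filter combines the evidence from many words in a message rather than relying on one.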

By comparing large sets of legitimate email with large sets of spam, a Bayesian spam filter can look for combinations of words that are statistically likely to occur in spam messages and for words that are statistically likely to occur in legitimate messages. From these it determines the probability that an email is spam or legitimate.
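The comparison described above can be sketched as a toy word-based classifier. This is a minimal sketch, not the algorithm of any particular product: the class name `BayesianFilter`, the labels `"spam"` and `"ham"`, the add-one smoothing, and the tokenizer are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BayesianFilter:
    """Toy word-based Bayesian spam filter (hypothetical, for illustration)."""

    def __init__(self):
        # Number of training messages of each kind that contained each word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        # Number of training messages of each kind.
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message that a user marked as 'spam' or 'ham'."""
        self.message_counts[label] += 1
        # Count each word once per message so long messages do not dominate.
        self.word_counts[label].update(set(tokenize(text)))

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that the message is spam."""
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        if n_spam + n_ham == 0:
            return 0.5  # no training data yet, so no opinion either way
        # Start from the smoothed prior odds of spam versus ham.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for word in set(tokenize(text)):
            # Add-one smoothing keeps unseen words from producing zero.
            p_word_spam = (self.word_counts["spam"][word] + 1) / (n_spam + 2)
            p_word_ham = (self.word_counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_word_spam / p_word_ham)
        # Clamp before exponentiating to avoid floating-point overflow.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Training it on a handful of hand-labelled messages and then calling `spam_probability` on a new message gives a score that can be compared against a threshold. Working in log-odds rather than multiplying raw probabilities is a design choice that avoids numeric underflow on long messages.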

Once the Bayesian filter has scored a sufficient number of emails, the next task is to keep an eye on the email logs to see how well it has learnt to classify messages. Fine-tuning may be required if the filter is scoring incorrectly. When the system is scoring emails correctly, it is best to leave it alone, because over-scoring can do more harm than good. When the filter starts to produce more false positives and/or false negatives than usual, manually score a subset of emails to fine-tune it again. If the filter still performs poorly, it may be necessary to reset it and start over. Occasionally an email scored by a user is ambiguous (it looks both legitimate and like spam at the same time), and this can confuse the system.
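One way to decide when the filter is producing "more false positives or false negatives than usual" is to compute both rates from a manually reviewed sample of the logs and compare them with a baseline. The sketch below assumes the reviewed sample is available as pairs of booleans; the function names, the baseline values, and the `tolerance` parameter are hypothetical and not taken from any product.

```python
def error_rates(results):
    """Compute (false_positive_rate, false_negative_rate).

    results: iterable of (flagged_as_spam, actually_spam) boolean pairs,
    for example taken from a manually reviewed sample of the email logs.
    """
    false_pos = false_neg = ham_total = spam_total = 0
    for flagged, actual in results:
        if actual:
            spam_total += 1
            if not flagged:
                false_neg += 1  # spam that was delivered
        else:
            ham_total += 1
            if flagged:
                false_pos += 1  # legitimate email that was blocked
    fp_rate = false_pos / ham_total if ham_total else 0.0
    fn_rate = false_neg / spam_total if spam_total else 0.0
    return fp_rate, fn_rate


def needs_retraining(results, baseline_fp, baseline_fn, tolerance=0.02):
    """Return True if either error rate exceeds its baseline by more than tolerance."""
    fp_rate, fn_rate = error_rates(results)
    return fp_rate > baseline_fp + tolerance or fn_rate > baseline_fn + tolerance
```

If `needs_retraining` returns `False`, the advice above applies: leave the filter alone rather than scoring more emails.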

Alongside Bayesian filtering, vendors also produce anti-spam signatures for known spam messages. Signatures block known spam; when a spam email is unknown, Bayesian filtering comes into play and can help catch it. Anti-spam solutions also apply other filtering strategies. As with all security, defense in depth is required for a more complete security posture. Products such as MIMEsweeper and Barracuda anti-spam filters have layers of defense mechanisms. The chances are that one of these many layers will catch even the most well-written and sophisticated spam messages.
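The layering described above, with a signature check first and Bayesian scoring as a fallback for unknown spam, might look roughly like the sketch below. It is an assumption-laden illustration: real products use their own signature formats, whereas here a signature is simply a SHA-256 hash of the normalized body, and `classify`, `bayes_score`, and `threshold` are hypothetical names.

```python
import hashlib


def signature(body):
    """Hash of the lowercased, whitespace-normalized body (a simplistic stand-in
    for a vendor-supplied spam signature)."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(body, known_spam_signatures, bayes_score, threshold=0.9):
    """Return (verdict, layer), where layer names the check that decided.

    bayes_score is any callable that maps a message body to a spam
    probability between 0 and 1.
    """
    # Layer 1: exact match against signatures of known spam.
    if signature(body) in known_spam_signatures:
        return "spam", "signature"
    # Layer 2: Bayesian score for messages no signature recognizes.
    if bayes_score(body) >= threshold:
        return "spam", "bayesian"
    return "deliver", "none"
```

Further layers, such as the other filtering strategies a product offers, would slot in as additional checks before the final `deliver` verdict.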

Further Reading

Wikipedia's guide to Bayesian Probability