Bayesian Algorithm


Bayesian spam filtering is one of the more common techniques that spam filters use today to identify spam. It requires manual intervention: a user trains the filter by marking which messages are spam and which are legitimate. The filter applies Bayes' theorem, using probabilities to weigh each message.
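As a rough illustration of how Bayes' theorem weighs a single word, the probability that a message is spam given that it contains a word w can be written as follows. The symbols S (spam), H (legitimate mail, or "ham") and w are notation assumed here for illustration, and the numbers in the worked line are made up, not measurements.

```latex
P(S \mid w) = \frac{P(w \mid S)\,P(S)}{P(w \mid S)\,P(S) + P(w \mid H)\,P(H)}

% Illustrative numbers only: the word appears in 40% of spam and 1% of
% legitimate mail, and spam and legitimate mail are equally likely.
P(S \mid w) = \frac{0.40 \times 0.5}{0.40 \times 0.5 + 0.01 \times 0.5}
            = \frac{0.200}{0.205} \approx 0.976
```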

By comparing large sets of legitimate email with large sets of spam, a Bayesian spam filter learns which combinations of words are statistically likely to occur in spam and which are statistically likely to occur in legitimate messages. It then uses those statistics to estimate the probability that a new email is spam or legitimate.
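The idea of learning word statistics from labelled mail can be sketched in a few lines of Python. This is a minimal naive Bayes toy under stated assumptions, not how any particular product works: the `BayesFilter` class, the `tokenize` helper, the "spam"/"ham" labels and the add-one smoothing are all choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BayesFilter:
    """Toy naive Bayes spam scorer trained on hand-labelled messages."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message the user has marked as 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0 to 1) that text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: smoothed fraction of training messages with this label.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            log_score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing so an unseen word cannot zero the score.
                count = self.word_counts[label][word] + 1
                log_score += math.log(count / (total_words + len(vocab) + 1))
            log_scores[label] = log_score
        # Turn the two log scores back into a normalised probability.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)
```

After calling `train` on a handful of messages of each kind, `spam_probability` returns a value closer to 1 for text that shares words with the spam examples and closer to 0 for text that resembles the legitimate ones. A real filter would need far more training mail than a toy like this.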

After the Bayesian filter has scored a sufficient number of emails, a user needs to keep an eye on the email logs to see how well it has learnt to classify messages. Fine-tuning may be required if the filter is scoring incorrectly. Once the system is scoring emails correctly, it is best to leave it alone, because further manual scoring can do more harm than good. When the filter starts to produce more false positives or false negatives than usual, manually score a subset of emails and fine-tune the system again. If the filter is still performing poorly, it may be necessary to reset it and start training from scratch. Occasionally the emails a user scores are ambiguous (an email looks both legitimate and spammy at the same time), and these can confuse the system.
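One way to decide when retraining is due is to measure the false positive and false negative rates on a sample of mail a user has reviewed by hand. The sketch below assumes the logs can be reduced to pairs of (filter said spam, user said spam); the function names and the threshold values are hypothetical and would need tuning for a real site.

```python
def error_rates(reviewed):
    """Return (false_positive_rate, false_negative_rate).

    reviewed is a list of (filter_said_spam, user_said_spam) boolean pairs.
    """
    false_pos = sum(1 for predicted, actual in reviewed if predicted and not actual)
    false_neg = sum(1 for predicted, actual in reviewed if not predicted and actual)
    legit_total = sum(1 for _, actual in reviewed if not actual)
    spam_total = sum(1 for _, actual in reviewed if actual)
    # Guard against dividing by zero when a sample has no mail of one kind.
    fp_rate = false_pos / legit_total if legit_total else 0.0
    fn_rate = false_neg / spam_total if spam_total else 0.0
    return fp_rate, fn_rate


def needs_retraining(reviewed, max_fp_rate=0.01, max_fn_rate=0.05):
    """Flag the filter for manual rescoring when either rate is too high.

    The default limits are illustrative assumptions, not recommendations.
    """
    fp_rate, fn_rate = error_rates(reviewed)
    return fp_rate > max_fp_rate or fn_rate > max_fn_rate
```

The limit on false positives is set tighter here than the limit on false negatives on the assumption that blocking a legitimate email is more costly than letting one spam message through.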

Alongside Bayesian filtering, vendors also produce spam signatures for known spam emails. Signatures block spam that has already been identified, while a previously unseen spam email may still be caught by the Bayesian filter or by other filtering strategies. As with all security, defense in depth is needed for a more complete security policy. Products such as MIMEsweeper and Barracuda spam filters have several layers of defense, so the chances are that one of those layers will catch even rare spam messages.
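The layering described above can be sketched as a signature check followed by a statistical fallback. This is only an illustration: real vendor signatures are proprietary and usually more tolerant of small changes than the exact SHA-256 hash assumed here, and the `signature` and `classify` functions, the `fallback_score` callable and the 0.9 threshold are all hypothetical.

```python
import hashlib


def signature(body):
    """Hash a message body after normalising case and whitespace."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def classify(body, known_signatures, fallback_score, threshold=0.9):
    """Run the layers in order and report which one caught the message.

    known_signatures: set of hashes for spam that has already been identified.
    fallback_score: callable returning a spam probability between 0 and 1,
                    for example a Bayesian scorer.
    """
    # Layer 1: exact match against known spam.
    if signature(body) in known_signatures:
        return "spam (signature)"
    # Layer 2: statistical score for mail no signature recognises.
    if fallback_score(body) >= threshold:
        return "spam (bayesian)"
    return "deliver"
```

Checking signatures first keeps the cheap, certain test ahead of the probabilistic one, so the statistical layer only has to judge mail that no signature recognises.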

Further Reading

Wikipedia's guide to Bayesian Probability