A spam filter looks at the body of an email and uses a set of rules to decide whether it is spam. For example, a rule might say that any email containing the word "Viagra" is spam.
How is it different from a spam blocker?
A spam blocker looks at the header (e.g., To:, From:, IP address) and, based on a list of known spam header attributes, decides whether the email is spam. For example, if the email comes from a known spam address, it is spam.
You can see that spam filters and spam blockers do the same job in different ways: the filter judges the body, and the blocker judges the header.
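To make the distinction concrete, here is a minimal Python sketch of the two approaches side by side. The word list, the blocked address, and the function names are all made up for illustration; real products use far larger rule sets and blocklists.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical rule set and blocklist, for illustration only.
SPAM_WORDS = {"viagra"}
BLOCKED_SENDERS = {"offers@spam.example"}

def filter_says_spam(msg):
    """Spam filter: look for rule-matching words in the body."""
    body = msg.get_payload().lower()  # assumes a simple, non-multipart message
    return any(word in body for word in SPAM_WORDS)

def blocker_says_spam(msg):
    """Spam blocker: compare a header attribute against a known-spam list."""
    sender = parseaddr(msg["From"] or "")[1].lower()
    return sender in BLOCKED_SENDERS

raw = (
    "From: Deals <offers@spam.example>\n"
    "To: you@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Buy Viagra today"
)
msg = message_from_string(raw)
```

Calling `filter_says_spam(msg)` reads only the body, while `blocker_says_spam(msg)` reads only the From header; the same email can be caught by either one.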
Why you need a software solution
Not too long ago, it was pretty simple: you could look at an email and tell it was spam. So you'd just press the Delete key and move on.
But a couple of things happened:
- The volume of email increased.
- The percentage of that email that was spam increased.
The solution was straightforward: use automated software to scan the emails and calculate a score for each one. The higher the score, the more likely the email is to be spam. If the score is high enough, the software moves the email to the "Spam" or "Junk" folder.
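The scoring idea can be sketched in a few lines of Python. The keyword weights and the threshold below are invented for illustration and are not taken from any real product.

```python
import re

# Hypothetical keyword weights and threshold, chosen only for illustration.
KEYWORD_WEIGHTS = {"viagra": 5.0, "winner": 2.0, "free": 1.5}
SPAM_THRESHOLD = 5.0

def spam_score(body):
    """Add up the weight of every listed keyword found in the body."""
    words = re.findall(r"[a-z0-9']+", body.lower())
    return sum(KEYWORD_WEIGHTS.get(word, 0.0) for word in words)

def choose_folder(body):
    """File the email under Junk when its score reaches the threshold."""
    return "Junk" if spam_score(body) >= SPAM_THRESHOLD else "Inbox"
```

With these made-up weights, `choose_folder("You are a winner, claim your free Viagra")` lands in Junk, while `choose_folder("Lunch at noon?")` stays in the Inbox.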
The spammers responded by doing one or more of the following:
- Disguising the words that pushed the score too high
- Adding a flood of neutral (or even gibberish) words to throw off the score
- … and more.
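A small Python sketch shows why the first two tricks work against a simple keyword scorer. It assumes, purely for illustration, a scorer that measures what fraction of an email's words are on a keyword list; the list and the sample emails are made up.

```python
import re

SPAM_WORDS = {"viagra", "winner"}  # hypothetical keyword list

def spam_fraction(body):
    """Fraction of the words in the body that appear on the keyword list."""
    words = re.findall(r"[a-z0-9]+", body.lower())
    if not words:
        return 0.0
    return sum(word in SPAM_WORDS for word in words) / len(words)

plain = "winner buy viagra"
disguised = "w1nner buy v1agra"  # keywords misspelled on purpose
padded = plain + " meeting lunch report garden weather"  # neutral filler added
```

The disguised email contains no listed keyword at all, and the padded email dilutes its keywords with filler, so both get a lower `spam_fraction` than the plain version.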
Suddenly, sending out lists of spam keywords became impractical. Something had to be done, and a better system was developed.
Bayesian filtering: The indispensable tool
A Bayesian spam filter also calculates a score, but it gets smarter the more you use it.
Here's how it works:
Imagine that you open your email inbox. You see some email that you suspect is spam, but you're not sure. So you open it and, sure enough, it is spam.
Now also imagine that you have anti-spam software that uses a Bayesian filter.
Here's what you do next:
For the first piece of spam, you click a button. The anti-spam software "reads" the email and remembers certain characteristics. Then it deletes the email.
Then you open the next suspicious email. It's different from the first one, but it is still spam, so you click the button again. The anti-spam software examines this piece of spam and adds its characteristics to the list. The next time an email with those characteristics comes in, it goes straight to the Junk folder.
Do this for every email that turns out to be spam. The more you do it, the more detailed your list of spam characteristics gets. Pretty soon the list is detailed enough that the anti-spam software can predict, on its own, when an email is spam. It can then move that spam to the Junk folder all by itself. You do not have to be involved.
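The training loop described above can be sketched in Python as a small naive Bayes classifier with add-one smoothing. This is one common way to build such a filter, not necessarily what any particular product does. It also makes one assumption that goes beyond the description above: emails you leave in your inbox are counted as legitimate ("ham"), so the filter has something to compare the spam against. The class and method names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split a message into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())

class BayesianFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Remember the words of one message marked 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Estimate the probability that a message is spam (0.0 to 1.0)."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior: how often this label has been seen (add-one smoothed).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; cap to avoid overflow.
        diff = min(log_score["ham"] - log_score["spam"], 700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Each click of the "spam" button corresponds to a call such as `spam_filter.train(email_text, "spam")`. Once enough messages have been seen, the software could move an email to the Junk folder whenever `spam_probability` exceeds a cutoff such as 0.9 (the cutoff is an arbitrary choice here).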
I have simplified the process to make it easy to understand. In practice, the anti-spam filter can look at a lot of different characteristics:
- Words in the body of the message
- HTML code (like colors)
- Pairs of words
- Where the words appear in relation to other words
- … and more.
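As a rough illustration of the first three characteristics on that list, the Python sketch below pulls single words, adjacent word pairs, and HTML color values out of a message body. The function name and the feature labels ("word:", "pair:", "color:") are hypothetical, and the regular expressions are deliberately crude; a real filter would use a proper HTML parser.

```python
import re

def extract_features(html_body):
    """Return a list of simple features a filter could learn from."""
    lowered = html_body.lower()
    features = []
    # HTML color values, e.g. color="red" or style="color: #ff0000".
    color_pattern = r'color\s*[:=]\s*["\']?\s*(#[0-9a-f]{3,6}|[a-z]+)'
    for color in re.findall(color_pattern, lowered):
        features.append("color:" + color)
    # Strip the tags, then collect the words of the visible text.
    text = re.sub(r"<[^>]+>", " ", lowered)
    words = re.findall(r"[a-z0-9']+", text)
    features.extend("word:" + word for word in words)
    # Pairs of adjacent words capture a little of the word order.
    features.extend("pair:" + a + " " + b for a, b in zip(words, words[1:]))
    return features
```

Features like these can be counted in place of plain words, so the filter learns, for example, that a particular word pair or text color shows up mostly in spam.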