SpamBully is not filtering my email properly and I am getting too many false positives or false negatives. How do I retrain my filter?

The first step is to see why a message was or wasn't moved to the Spam folder.  There is a button in SpamBully called Email Details.  Email Details will explain why a message was categorized the way it was.  This will help you determine what adjustments need to be made to the filter to fix this problem.

(NOTE: Email Details must be selected BEFORE hitting the Spam or Not Spam buttons to understand why a message was miscategorized)

Training the filter is a way to improve SpamBully's filter performance.  SpamBully comes with a pretrained filter that has processed approximately 4,000 spam messages.  For some users this filter may not match their own email habits and may be overly restrictive.  If you are receiving too many false positives (term explained at bottom), you may need to untrain the filter and retrain it on your own set of emails.  If you are receiving too many false negatives (term explained at bottom), you will probably want to make SpamBully learn more of your spam messages.

NOTE: If you are satisfied with SpamBully's filtering ability, it is not a good idea to retrain the filter.  This is a procedure primarily meant for more advanced computer users.
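To judge which problem you actually have, it can help to count errors over a sample of messages you have checked by hand. The sketch below is only an illustration and is not part of SpamBully: the function name error_rates and the (is_spam, was_filtered) pair format are assumptions made for this example.

```python
def error_rates(messages):
    """Return (false_positive_rate, false_negative_rate).

    `messages` is an iterable of (is_spam, was_filtered) pairs that you
    labeled by hand. This layout is a hypothetical one for illustration.
    """
    false_positives = false_negatives = good_total = spam_total = 0
    for is_spam, was_filtered in messages:
        if is_spam:
            spam_total += 1
            if not was_filtered:
                false_negatives += 1  # spam that reached the inbox
        else:
            good_total += 1
            if was_filtered:
                false_positives += 1  # good email that was blocked
    fp_rate = false_positives / good_total if good_total else 0.0
    fn_rate = false_negatives / spam_total if spam_total else 0.0
    return fp_rate, fn_rate


sample = [
    (True, True),    # spam, filtered correctly
    (True, False),   # spam, missed (false negative)
    (False, False),  # good, delivered correctly
    (False, True),   # good, blocked (false positive)
    (False, False),
    (False, False),
]
fp_rate, fn_rate = error_rates(sample)
print(f"false positive rate: {fp_rate:.2f}, false negative rate: {fn_rate:.2f}")
```

A high false positive rate points toward Method 3, while a high false negative rate points toward Method 2.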

Method 1. Manually adding your spams and good emails to your existing filter.  This method can be used at any time to improve the filtering ability of the Bayesian filter.

1. Before beginning, you should have at least one folder with only spam messages and another folder with only good email messages.

2. Click the "Train Filter" button and select your good and spam folders.  If you need more help with the Train Filter option, see the additional links suggested at the bottom of this page.

3. Congratulations... you have now customized your filter to better recognize your spam messages.  False negatives should be reduced somewhat, as should false positives.  If you still have problems, please try Method 3, which untrains the filter.


Method 2. Now, let's retrain the filter to improve it if you are getting too many false negatives (too much spam in your inbox).

1. Before beginning, you should have at least one folder with only spam messages.

2. Click the "Train Filter" button and add each of your spam folders.  If you need more help with the Train Filter option, see the additional links suggested at the bottom of this page.  NOTE: Only train on folders that contain nothing but spam, because every message in the folder is learned as spam.

3. Congratulations... you have now customized your filter to better recognize your spam messages.  False negatives should be reduced.  If you still have problems, please try Method 3, which untrains the filter.


Method 3. Finally, let's train the filter to limit the number of false positives you are getting (intermediate to advanced users).

1. Before beginning, you should have at least one folder with only spam messages and another folder with only good email messages.  The spam folder should contain at least a thousand or so spam messages.  Using a small number of spam messages increases the number of false negatives (more spam in your inbox).  Ideally, you should train the filter using the same ratio of spam to good emails that you currently receive.  (For example, if you receive 70% spam and 30% good emails and your corpus (term explained at bottom) consists of 10,000 emails, then ideally 7,000 would be spam and 3,000 would be good emails.)

2. Click the "Train Filter" button and make sure the "empty the current bayesian filter" option is checked.  This removes all training from the Bayesian filter.  In this state, the filter has no training and no emails will be blocked.  All emails will be considered "good."  Then add each of your spam folders and each of your good folders.  If you need more help with the Train Filter option, see the additional links suggested at the bottom of this page.  NOTE: Keep spam and good emails in separate folders, and train only on folders that contain one kind or the other, so the training will be accurate.

3. Congratulations... you have now retrained and customized your filter using only your own emails.  False positives should be reduced, and false negatives should also be reduced provided you used enough emails to train the filter.
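The corpus ratio recommended in step 1 of Method 3 is simple arithmetic, and it can be sketched as below. This is only an illustration: the function name corpus_split and its parameters are assumptions made for this example, not anything SpamBully provides. The 1,000-message check reflects the "thousand or so" minimum suggested above.

```python
def corpus_split(total_emails, spam_percent):
    """Return (spam_count, good_count) for a training corpus.

    `spam_percent` is the share of spam you normally receive, as a whole
    number from 0 to 100. Both names are hypothetical.
    """
    if not 0 <= spam_percent <= 100:
        raise ValueError("spam_percent must be between 0 and 100")
    spam_count = total_emails * spam_percent // 100
    good_count = total_emails - spam_count
    return spam_count, good_count


spam_count, good_count = corpus_split(10_000, 70)
print(spam_count, good_count)  # 7000 3000

# The article suggests at least a thousand or so spam messages.
if spam_count < 1000:
    print("Warning: fewer than about 1,000 spam messages; expect more false negatives.")
```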

 

False Positive - Good email message that has been blocked by a spam filter.  These are considered much worse than a false negative.

False Negative - Spam message that has passed through the filter.

Bayesian Filter - Type of spam filter that looks at the probabilities of words and HTML tags that appear in an email message.  If the message is a spam message, it will increase the rank of the words in its dictionary that appear in that email.  If it is a good message, it will decrease the rank of those words in its dictionary.  By doing this over thousands of messages, certain words and patterns emerge that distinguish your good emails from your spam emails.  In general, this works much better than standard message rules because the Bayesian filter can pick up on many underlying parameters that a user will miss.  It is also able to adapt much more quickly to new types of spam emails without a user having to spend time writing a new message rule every time a new spam comes in.

Corpus - Your library of emails used to train SpamBully. 
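To make the Bayesian Filter entry more concrete, here is a toy word-based filter. It is a minimal sketch of the general naive Bayes technique under stated assumptions, not SpamBully's actual implementation: the class name ToyBayesFilter, the tokenizer, and the add-one smoothing are all choices made for this illustration. An empty filter scores every message as good, matching the behavior described in Method 3.

```python
import math
import re
from collections import Counter


class ToyBayesFilter:
    """Toy naive Bayes spam scorer (illustrative only)."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam messages containing it
        self.good_words = Counter()  # word -> number of good messages containing it
        self.spam_msgs = 0
        self.good_msgs = 0

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, is_spam):
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.good_words.update(words)
            self.good_msgs += 1

    def spam_probability(self, text):
        if self.spam_msgs + self.good_msgs == 0:
            return 0.0  # empty filter: nothing is blocked
        # Smoothed prior odds of spam versus good.
        log_odds = math.log((self.spam_msgs + 1) / (self.good_msgs + 1))
        for word in self.tokenize(text):
            # Add-one smoothing keeps unseen words from producing log(0).
            p_spam = (self.spam_words[word] + 1) / (self.spam_msgs + 2)
            p_good = (self.good_words[word] + 1) / (self.good_msgs + 2)
            log_odds += math.log(p_spam / p_good)
        # Convert log-odds to a probability without overflowing exp().
        if log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-log_odds))
        odds = math.exp(log_odds)
        return odds / (1.0 + odds)


toy = ToyBayesFilter()
toy.train("cheap pills buy now", is_spam=True)
toy.train("buy cheap watches now", is_spam=True)
toy.train("meeting agenda for monday", is_spam=False)
toy.train("lunch on monday?", is_spam=False)

print(toy.spam_probability("buy cheap pills"))  # well above 0.5
print(toy.spam_probability("monday meeting"))   # well below 0.5
```

Words seen mostly in spam push the score up and words seen mostly in good mail push it down, which is why training on your own corpus changes what gets blocked.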


This article may be found at:
http://www.spambully.com/sb3help/index.php?page=index_v2&id=63&c=6


