Naive Bayes

Naive Bayes is a technique for building classifiers that distinguish one group from another; a classic example is identifying spam emails. It uses Bayes’ theorem to do the job, hence the name.

What is the probability that the email I received is spam, given it has the words ‘money’ and ‘buy’?

Let’s build a spam detector from previous data. I have 100 emails, of which 75 are normal, and 25 are spam. 8 of the 75 normal emails contain the word ‘buy’, whereas 15 spam emails have the word. On the other hand, ‘money’ is present in 5 normal emails and 20 spam emails.

The probability that the email is normal, given it contains the words ‘buy’ and ‘money’, is proportional to the probability of seeing ‘buy’ and ‘money’ in a normal message times the probability of having a normal message. As you may have noticed, this is just Bayes’ theorem.

P(N|B&M) ∝ P(B&M|N) × P(N)

Naive Bayes assumes the words occur independently of each other within a class (the “naive” part), so P(B&M|N) = P(B|N) × P(M|N) = (8/75) × (5/75), and P(N) is 75/100.

Extending the same logic, the probability that the email is spam given B&M is:

P(S|B&M) ∝ P(B&M|S) × P(S)

P(B&M|S) is (15/25) × (20/25) and P(S) is 25/100.

P(B&M|N) × P(N) = 0.0053; P(B&M|S) × P(S) = 0.12
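The arithmetic above can be checked with a short script. The counts are the ones from the example; the variable names are mine:

```python
# Counts from the example: 100 emails, 75 normal, 25 spam.
n_normal, n_spam, n_total = 75, 25, 100
buy_in_normal, buy_in_spam = 8, 15
money_in_normal, money_in_spam = 5, 20

# Naive Bayes treats the words as independent within each class,
# so the joint likelihood is a product of per-word probabilities.
score_normal = (buy_in_normal / n_normal) * (money_in_normal / n_normal) * (n_normal / n_total)
score_spam = (buy_in_spam / n_spam) * (money_in_spam / n_spam) * (n_spam / n_total)

print(round(score_normal, 4))  # 0.0053
print(round(score_spam, 4))    # 0.12
```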

Since 0.12 > 0.0053, the email is more likely spam. The answer to the original question is obtained by normalizing with Bayes’ theorem:

P(S|B&M) = P(B&M|S) × P(S) / [P(B&M|S) × P(S) + P(B&M|N) × P(N)] = 0.12/(0.12 + 0.0053) ≈ 96%
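The normalization step can be sketched in a few lines. This recomputes the two unnormalized scores from the example’s counts and divides:

```python
# Unnormalized scores from the example's counts.
score_normal = (8/75) * (5/75) * (75/100)
score_spam = (15/25) * (20/25) * (25/100)

# Bayes' theorem: the spam score divided by the sum of both scores
# gives the posterior probability that the email is spam.
p_spam = score_spam / (score_spam + score_normal)
print(f"{p_spam:.0%}")  # 96%
```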