Understanding Naive Bayes Algorithm in Machine Learning with Step-by-Step Example and Mathematics

The Naive Bayes algorithm is a cornerstone of machine learning, widely used for classification problems. It is simple, efficient, and interpretable, making it a go-to choice for applications such as spam detection and sentiment analysis. This blog post provides a step-by-step explanation of the Naive Bayes algorithm, its mathematical foundations, and a practical example using sample data.


What is the Naive Bayes Algorithm?

Naive Bayes is a probabilistic classification algorithm based on Bayes' Theorem, with the assumption that features are conditionally independent given the class label. Despite this "naive" assumption, it works exceptionally well for many real-world problems.


Applications of Naive Bayes

  • Spam Detection: Classifying emails as spam or not spam.
  • Sentiment Analysis: Analyzing user sentiment in product reviews.
  • Medical Diagnosis: Predicting diseases based on symptoms.
  • Text Classification: Categorizing news articles or documents.

Mathematics Behind Naive Bayes

Bayes’ Theorem

The algorithm is built on Bayes' Theorem:

P(C|X) = \frac{P(X|C) \cdot P(C)}{P(X)}

Where:

  • P(C|X): Posterior probability of class C given feature vector X.
  • P(X|C): Likelihood of observing X given class C.
  • P(C): Prior probability of class C.
  • P(X): Evidence (overall probability of X).
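
To make the formula concrete, here is a minimal Python sketch that plugs numbers into Bayes' Theorem for a spam-filtering scenario. All the probabilities below are hypothetical values chosen purely for illustration, not measurements from any dataset:

# Hypothetical example: probability an email is spam given it contains the word "offer".
# The three input probabilities are illustrative assumptions.
p_spam = 0.3               # prior P(C): fraction of emails that are spam
p_word_given_spam = 0.6    # likelihood P(X|C): "offer" appears in spam emails
p_word = 0.25              # evidence P(X): "offer" appears in any email

# Bayes' Theorem: P(C|X) = P(X|C) * P(C) / P(X)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(spam | 'offer') = {p_spam_given_word:.2f}")  # 0.72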

Naive Assumption

Naive Bayes assumes that features are conditionally independent:

P(X|C) = P(x_1|C) \cdot P(x_2|C) \cdot \ldots \cdot P(x_n|C)

Thus, the posterior probability becomes:

P(C|X) \propto P(C) \cdot \prod_{i=1}^{n} P(x_i|C)
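
As a quick sketch of how this product is evaluated in practice, the function below multiplies a class prior by each per-feature likelihood. The input probabilities are placeholders for illustration, not values from any dataset:

def unnormalized_posterior(prior, likelihoods):
    """Compute P(C) * product of P(x_i | C) for one class."""
    score = prior
    for p in likelihoods:
        score *= p
    return score

# Placeholder likelihoods for three features given some class C
print(unnormalized_posterior(0.5, [0.2, 0.7, 0.4]))  # 0.5 * 0.2 * 0.7 * 0.4 = 0.028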

Step-by-Step Naive Bayes with Example

Step 1: Define the Dataset

Let’s consider a dataset to classify whether a day is suitable for playing tennis based on weather conditions:

Outlook   Temperature  Humidity  Wind    Play Tennis
Sunny     Hot          High      Weak    No
Sunny     Hot          High      Strong  No
Overcast  Hot          High      Weak    Yes
Rain      Mild         High      Weak    Yes
Rain      Cool         Normal    Weak    Yes
Rain      Cool         Normal    Strong  No
Overcast  Cool         Normal    Strong  Yes
Sunny     Mild         High      Weak    No
Sunny     Cool         Normal    Weak    Yes
Rain      Mild         Normal    Weak    Yes
Sunny     Mild         Normal    Strong  Yes
Overcast  Mild         High      Strong  Yes
Overcast  Hot          Normal    Weak    Yes
Rain      Mild         High      Strong  No

Step 2: Calculate Prior Probabilities

The prior probability for each class is calculated as:

P(\text{Yes}) = \frac{\text{Number of 'Yes'}}{\text{Total Instances}} = \frac{9}{14}, \qquad P(\text{No}) = \frac{\text{Number of 'No'}}{\text{Total Instances}} = \frac{5}{14}
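
A quick way to check these priors in code is to count the labels from the table above. This is a minimal sketch that uses only the Play Tennis column:

from collections import Counter

# Play Tennis column from the table above (14 rows)
labels = ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes',
          'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No']

counts = Counter(labels)
total = len(labels)
print(f"P(Yes) = {counts['Yes']}/{total} = {counts['Yes']/total:.3f}")  # 9/14 ≈ 0.643
print(f"P(No)  = {counts['No']}/{total} = {counts['No']/total:.3f}")    # 5/14 ≈ 0.357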

Step 3: Compute Likelihood

For a given feature value and class, calculate the likelihood. For instance, let's compute P(\text{Outlook} = \text{Sunny} \mid \text{Play Tennis} = \text{Yes}):

P(\text{Sunny} \mid \text{Yes}) = \frac{\text{Count of 'Sunny' when 'Yes'}}{\text{Total 'Yes'}} = \frac{2}{9}

Similarly, calculate for all feature values and classes.
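
The same counting can be done in code. This sketch computes P(Sunny | Yes) from the Outlook and Play Tennis columns of the table above; the other likelihoods follow the same pattern:

# Outlook and Play Tennis columns from the table above
outlook = ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain', 'Rain', 'Overcast',
           'Sunny', 'Sunny', 'Rain', 'Sunny', 'Overcast', 'Overcast', 'Rain']
labels  = ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes',
           'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No']

# Count rows where Outlook is Sunny and the label is Yes
sunny_and_yes = sum(1 for o, l in zip(outlook, labels) if o == 'Sunny' and l == 'Yes')
total_yes = labels.count('Yes')
print(f"P(Sunny | Yes) = {sunny_and_yes}/{total_yes}")  # 2/9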


Step 4: Make Predictions

Let's predict the class for X = (\text{Sunny}, \text{Cool}, \text{High}, \text{Strong}).

Likelihood for Yes

P(X|\text{Yes}) = P(\text{Sunny}|\text{Yes}) \cdot P(\text{Cool}|\text{Yes}) \cdot P(\text{High}|\text{Yes}) \cdot P(\text{Strong}|\text{Yes})

Substitute the probabilities:

P(X|\text{Yes}) = \frac{2}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \approx 0.0082

Likelihood for No

P(X|\text{No}) = P(\text{Sunny}|\text{No}) \cdot P(\text{Cool}|\text{No}) \cdot P(\text{High}|\text{No}) \cdot P(\text{Strong}|\text{No})

Substitute the probabilities:

P(X|\text{No}) = \frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} = 0.0576

Posterior Probabilities

Combine with priors:

P(\text{Yes}|X) \propto P(X|\text{Yes}) \cdot P(\text{Yes}) \approx 0.0082 \cdot \frac{9}{14} \approx 0.0053
P(\text{No}|X) \propto P(X|\text{No}) \cdot P(\text{No}) \approx 0.0576 \cdot \frac{5}{14} \approx 0.0206

Choose the class with the higher posterior. Since 0.0206 > 0.0053, the predicted class for X is No: the day is not suitable for playing tennis.
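
Putting the numbers from the steps above together, a short script can verify the result. The fractions are exactly the priors and likelihoods derived from the table:

from fractions import Fraction as F

# Unnormalized posterior scores: product of per-feature likelihoods times the class prior
score_yes = F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9) * F(9, 14)
score_no  = F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5) * F(5, 14)

print(f"Score for Yes: {float(score_yes):.4f}")  # ≈ 0.0053
print(f"Score for No:  {float(score_no):.4f}")   # ≈ 0.0206
print("Prediction:", "Yes" if score_yes > score_no else "No")  # No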


Python Implementation

from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder
import numpy as np

# Define the dataset
X = np.array([
    ['Sunny', 'Hot', 'High', 'Weak'],
    ['Sunny', 'Hot', 'High', 'Strong'],
    ['Overcast', 'Hot', 'High', 'Weak'],
    ['Rain', 'Mild', 'High', 'Weak'],
    ['Rain', 'Cool', 'Normal', 'Weak'],
    ['Rain', 'Cool', 'Normal', 'Strong'],
    ['Overcast', 'Cool', 'Normal', 'Strong'],
    ['Sunny', 'Mild', 'High', 'Weak'],
    ['Sunny', 'Cool', 'Normal', 'Weak'],
    ['Rain', 'Mild', 'Normal', 'Weak'],
    ['Sunny', 'Mild', 'Normal', 'Strong'],
    ['Overcast', 'Mild', 'High', 'Strong'],
    ['Overcast', 'Hot', 'Normal', 'Weak'],
    ['Rain', 'Mild', 'High', 'Strong']
])
y = np.array(['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes',
              'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'])

# Convert categorical features to integer codes
encoder = OrdinalEncoder()
X_encoded = encoder.fit_transform(X)

# Train the Naive Bayes model
model = CategoricalNB()
model.fit(X_encoded, y)

# Predict for a new instance
sample = encoder.transform([['Sunny', 'Cool', 'High', 'Strong']])
prediction = model.predict(sample)
print(f"Prediction: {prediction[0]}")
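
Note that scikit-learn's CategoricalNB applies Laplace smoothing by default (alpha=1.0), so its estimated probabilities differ slightly from the unsmoothed hand calculation above; for this sample, the predicted class is still No.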

Conclusion

The Naive Bayes algorithm is an effective and intuitive method for classification problems. By understanding the steps and mathematics behind it, you can apply it confidently to real-world problems. The simplicity of the algorithm, combined with its effectiveness, makes it a must-have tool in your machine learning toolkit.
