Naïve Bayes Classifier
Naïve Bayes is a family of simple yet powerful probabilistic classifiers that apply Bayes’ Theorem with a strong (naïve) assumption that features are independent of each other given the class. It is especially effective for text classification tasks such as spam detection and sentiment analysis.
What You'll Learn
- What is Naïve Bayes and how it works
- Types of Naïve Bayes classifiers
- Real-world applications
- Example using Python (with sklearn)
What is Naïve Bayes?
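Naïve Bayes classifiers use the principles of Bayes' Theorem:
P(class | features) = P(features | class) × P(class) / P(features)
In simple terms, it calculates the probability of a class given the input features. The “naïve” assumption is that all features are independent of each other given the class, so P(features | class) factors into a product of per-feature probabilities, which simplifies computation.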
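To make the calculation concrete, here is a minimal hand-rolled sketch of that product of probabilities. The priors and per-word likelihoods are made-up numbers chosen only for illustration; they do not come from any dataset, and the `posterior` helper is hypothetical.

```python
# Hypothetical probabilities, chosen only for illustration.
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": {"free": 0.30, "now": 0.20},
    "ham": {"free": 0.02, "now": 0.10},
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for label in prior:
        score = prior[label]
        for word in words:
            # Independent features: the per-word likelihoods simply multiply.
            score *= likelihood[label][word]
        scores[label] = score
    evidence = sum(scores.values())  # P(words), the normalizing constant
    return {label: score / evidence for label, score in scores.items()}

result = posterior(["free", "now"])
print(result)
```

With these assumed numbers, the spam score is 0.4 × 0.30 × 0.20 = 0.024 and the ham score is 0.6 × 0.02 × 0.10 = 0.0012, so after normalizing, spam receives roughly 95% of the probability mass.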
Types of Naïve Bayes
- Gaussian Naïve Bayes – For continuous features, assumed to follow a normal distribution within each class
- Multinomial Naïve Bayes – For discrete counts like word occurrences
- Bernoulli Naïve Bayes – For binary features (yes/no, 0/1)
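As a rough illustration of how the variant follows the feature type, the sketch below fits each of the three scikit-learn classifiers on a tiny array of the matching kind. All of the arrays are invented toy data, and default hyperparameters are assumed.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])  # toy labels shared by all three examples

# Continuous measurements -> GaussianNB
X_continuous = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.2], [4.1, 3.9]])
gaussian_pred = GaussianNB().fit(X_continuous, y).predict([[4.0, 4.0]])

# Non-negative counts (e.g. word occurrences) -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])
multinomial_pred = MultinomialNB().fit(X_counts, y).predict([[0, 2, 1]])

# Binary presence/absence flags -> BernoulliNB
X_binary = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]])
bernoulli_pred = BernoulliNB().fit(X_binary, y).predict([[0, 1, 1]])

print(gaussian_pred, multinomial_pred, bernoulli_pred)
```

Each query point was picked to resemble the class 1 rows, so all three models should assign it to class 1.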
Example: Naïve Bayes for Text Classification (Spam Detection)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
# Sample data
texts = ["Win money now", "Hello friend", "Buy cheap meds", "See you tomorrow", "Free entry now"]
labels = [1, 0, 1, 0, 1] # 1 = spam, 0 = not spam
# Convert text to features
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
y = labels
# Split data (with five samples, test_size=0.2 holds out a single message)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train model
model = MultinomialNB()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
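Output:
Accuracy: 0.0
The test set here contains a single message (20% of five samples), so accuracy can only be 0.0 or 1.0. The score says little about the model; a meaningful evaluation needs far more labelled data.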
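Another way to sanity-check the model on such a tiny dataset is to fit it on all five messages and look at its predictions for new text. The sketch below assumes the same sample data as above and wraps the vectorizer and classifier in a scikit-learn pipeline; the two new messages are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Same sample data as in the example above
texts = ["Win money now", "Hello friend", "Buy cheap meds", "See you tomorrow", "Free entry now"]
labels = [1, 0, 1, 0, 1]  # 1 = spam, 0 = not spam

# The pipeline vectorizes raw text and then applies the classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Hypothetical unseen messages
new_messages = ["Free money now", "See you friend"]
predictions = model.predict(new_messages).tolist()
probabilities = model.predict_proba(new_messages)  # columns follow model.classes_

for message, label, proba in zip(new_messages, predictions, probabilities):
    print(f"{message!r} -> {label} (P(spam) = {proba[1]:.2f})")
```

The first message reuses words seen only in spam and the second reuses words seen only in non-spam, so the model should label them 1 and 0 respectively. Using a pipeline also ensures that new text is transformed with the vocabulary learned during training.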
When to Use Naïve Bayes
- Spam filtering
- Sentiment analysis
- Document categorization
- Medical diagnosis based on symptoms
- Recommendation systems (e.g., movie genres)
Advantages
- Simple and fast
- Works well with high-dimensional data
- Can be trained on relatively small datasets
- Handles both binary and multiclass classification
Limitations
- The assumption of feature independence is rarely true in real-world data
- Performance can degrade when features are highly correlated, because correlated evidence is counted more than once
- Cannot capture complex relationships between features
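The effect of correlated features can be seen in a small experiment. The sketch below uses an invented one-feature dataset with balanced classes and then duplicates the feature, which yields two perfectly correlated columns. Because Gaussian Naïve Bayes treats the copy as independent evidence, the predicted probability is pushed further from 0.5 even though no new information was added.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical one-feature dataset with balanced classes
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Same data with the feature copied: two perfectly correlated columns
X_dup = np.hstack([X, X])

query = np.array([[3.8]])
query_dup = np.hstack([query, query])

# Probability of class 1 with the original feature and with the duplicate
p_single = GaussianNB().fit(X, y).predict_proba(query)[0, 1]
p_dup = GaussianNB().fit(X_dup, y).predict_proba(query_dup)[0, 1]

print(f"P(class 1) with one feature:        {p_single:.3f}")
print(f"P(class 1) with duplicated feature: {p_dup:.3f}")
```

The predicted class stays the same here, but the duplicated model is more confident than the evidence warrants. This is one reason Naïve Bayes probabilities are often poorly calibrated on real data.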
Summary
Naïve Bayes is a foundational machine learning algorithm that combines simplicity with effectiveness. While its independence assumption may not hold in all scenarios, it often performs surprisingly well, especially in text classification.