Naïve Bayes is a family of simple yet powerful probabilistic classifiers based on applying Bayes' Theorem with a strong (naïve) assumption of conditional independence between features. It's especially effective for text classification tasks such as spam detection and sentiment analysis.
Naïve Bayes classifiers are built on Bayes' Theorem:

P(class | features) = P(features | class) × P(class) / P(features)

In simple terms, the classifier estimates the probability of a class given the input features. The "naïve" assumption is that the features are conditionally independent of each other given the class, which lets P(features | class) be computed as a simple product of per-feature probabilities.
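To make that product concrete, here is a minimal sketch that scores the message "win money" against a spam and a not-spam class. The priors and per-word likelihoods are made-up numbers chosen purely for illustration, not values from any real dataset:

# Hypothetical priors and per-word likelihoods, for illustration only
priors = {"spam": 0.5, "ham": 0.5}
likelihoods = {
    "spam": {"win": 0.30, "money": 0.25},
    "ham":  {"win": 0.02, "money": 0.05},
}

message = ["win", "money"]

# Naïve Bayes: P(class | words) is proportional to P(class) * product of P(word | class)
scores = {}
for cls in priors:
    score = priors[cls]
    for word in message:
        score *= likelihoods[cls][word]
    scores[cls] = score

print(scores)                        # unnormalized posterior scores per class
print(max(scores, key=scores.get))   # highest-scoring class: "spam"

MultinomialNB applies the same idea to word counts, working in log space and using Laplace smoothing so unseen words don't zero out the product. The scikit-learn example below puts this to work on a tiny spam-detection dataset.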
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
# Sample data
texts = ["Win money now", "Hello friend", "Buy cheap meds", "See you tomorrow", "Free entry now"]
labels = [1, 0, 1, 0, 1] # 1 = spam, 0 = not spam
# Convert text to features
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
y = labels
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train model
model = MultinomialNB()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))Output-
Accuracy: 0.0

With only five examples, the 20% test split holds out a single message, so this accuracy figure is not meaningful; a real spam filter would be trained and evaluated on a much larger corpus.

Naïve Bayes is a foundational machine learning algorithm that combines simplicity with effectiveness. While its independence assumption may not hold in practice, it often performs surprisingly well, especially in text classification.
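As a quick follow-up, the fitted vectorizer and model can be reused to classify a new message. The message text here is an invented example, not part of the article's dataset:

# Transform a new message with the already-fitted vectorizer, then predict
new_texts = ["win a free prize now"]
new_X = vectorizer.transform(new_texts)
print(model.predict(new_X))        # likely [1] -> flagged as spam, given the training messages
print(model.predict_proba(new_X))  # class probabilities, in the order [not spam, spam]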