Logistic Regression for Classification
Logistic Regression is a supervised machine learning algorithm used for classification problems. It estimates the probability that a given input point belongs to a certain class and is widely used for binary classification tasks.
What is Logistic Regression?
Unlike linear regression, which predicts continuous values, logistic regression predicts probabilities. It uses a logistic (sigmoid) function to map predicted values to a range between 0 and 1. If the output probability is above a threshold (usually 0.5), the input is classified into one class; otherwise, it's classified into the other.
Sigmoid Function
The sigmoid function outputs values between 0 and 1, which can be interpreted as probabilities.
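The formula itself is not shown above. The standard logistic (sigmoid) function is σ(z) = 1 / (1 + e^(-z)), where z is the linear combination of the input features and the learned weights. A minimal NumPy sketch follows; the function name `sigmoid` and the sample inputs are illustrative choices, not part of the original tutorial.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function: 1 / (1 + exp(-z))."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: strongly negative, near zero, strongly positive
z_values = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])
probs = sigmoid(z_values)

# Every output lies strictly between 0 and 1 and grows with z
print(probs)
```

An input of z = 0 maps to exactly 0.5, which is why 0.5 is the natural default threshold between the two classes.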
Applications of Logistic Regression
Application | Description |
---|---|
Email Spam Detection | Classify emails as spam or not spam |
Medical Diagnosis | Predict if a patient has a disease based on symptoms |
Credit Risk | Determine whether a customer will default on a loan |
Customer Churn | Predict whether a customer will cancel a service |
Marketing | Predict whether a customer will buy a product |
Python Implementation: Binary Classification
We’ll classify whether a student passes (1) or fails (0) based on hours studied.
# Import libraries
from sklearn.linear_model import LogisticRegression
import numpy as np
import matplotlib.pyplot as plt
# Sample data
X = np.array([[1], [2], [3], [4], [5], [6]]) # Hours studied
y = np.array([0, 0, 0, 1, 1, 1]) # 0 = Fail, 1 = Pass
# Create and train the model
model = LogisticRegression()
model.fit(X, y)
# Predictions and probability
X_test = np.array([[3.5]])
predicted_class = model.predict(X_test)
probability = model.predict_proba(X_test)
print("Predicted Class:", predicted_class[0])
print("Probability of Passing:", probability[0][1])
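Output:
Predicted Class: 1
Probability of Passing: 0.5000015650516633
Because 3.5 hours lies midway between the failing and passing examples, the predicted probability sits almost exactly at the 0.5 threshold.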
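To connect the fitted model back to the sigmoid function, the sketch below rebuilds the probabilities from the learned coefficient and intercept and compares them with `predict_proba`. It also applies a stricter cutoff than the default. The test points in `X_new` and the threshold of 0.7 are hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data as the example above
X = np.array([[1], [2], [3], [4], [5], [6]])  # Hours studied
y = np.array([0, 0, 0, 1, 1, 1])              # 0 = Fail, 1 = Pass

model = LogisticRegression()
model.fit(X, y)

# Hypothetical new inputs (hours studied)
X_new = np.array([[2.0], [3.5], [5.0]])

# Linear score z = w * x + b, built from the fitted parameters
z = X_new @ model.coef_.T + model.intercept_
manual_prob = (1.0 / (1.0 + np.exp(-z))).ravel()

# Probability of class 1 as reported by scikit-learn
sklearn_prob = model.predict_proba(X_new)[:, 1]
print("Manual matches predict_proba:", np.allclose(manual_prob, sklearn_prob))

# Apply a custom threshold instead of the default 0.5 (0.7 is an assumption)
threshold = 0.7
strict_pred = (sklearn_prob >= threshold).astype(int)
print("Predictions at threshold 0.7:", strict_pred)
```

Raising the threshold trades recall for precision: fewer inputs are labeled as passing, but those that are carry a higher predicted probability.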
Visualizing the Sigmoid Curve
# Plot sigmoid curve for visualization
X_range = np.linspace(0, 7, 100).reshape(-1, 1)
y_prob = model.predict_proba(X_range)[:, 1]
plt.plot(X_range, y_prob, color='green')
plt.title("Logistic Regression - Probability Curve")
plt.xlabel("Hours Studied")
plt.ylabel("Probability of Passing")
plt.grid(True)
plt.show()
Evaluation Metrics for Classification
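Here TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.
- Accuracy: (TP + TN) / Total, the fraction of all predictions that are correct
- Precision: TP / (TP + FP), the fraction of predicted positives that are actually positive
- Recall: TP / (TP + FN), the fraction of actual positives that the model finds
- F1 Score: 2 * (Precision * Recall) / (Precision + Recall), the harmonic mean of precision and recall
- Confusion Matrix: a table of the TN, FP, FN, and TP counts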
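# Import the metric functions
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
# Predict on the training data (for illustration only; use a held-out test set in practice)
y_pred = model.predict(X)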
print("Accuracy:", accuracy_score(y, y_pred))
print("Precision:", precision_score(y, y_pred))
print("Recall:", recall_score(y, y_pred))
print("F1 Score:", f1_score(y, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y, y_pred))
Output:
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1 Score: 1.0
Confusion Matrix:
[[3 0]
[0 3]]
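The perfect scores above come from evaluating on the same six points the model was trained on. To see how the formulas in this section behave when a model makes mistakes, the sketch below computes each metric by hand from the confusion matrix and checks it against scikit-learn. The label arrays `y_true` and `y_hat` are hypothetical and are not produced by the model above.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# Hypothetical labels, chosen so that some predictions are wrong
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 0])

# For binary labels {0, 1}, ravel() yields TN, FP, FN, TP in that order
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * (precision * recall) / (precision + recall)

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Accuracy:", accuracy, "| sklearn:", accuracy_score(y_true, y_hat))
print("Precision:", precision, "| sklearn:", precision_score(y_true, y_hat))
print("Recall:", recall, "| sklearn:", recall_score(y_true, y_hat))
print("F1 Score:", f1, "| sklearn:", f1_score(y_true, y_hat))
```

With these labels there is one false positive and two false negatives, so precision (4/5) and recall (4/6) differ, which a single accuracy figure (7/10) would hide.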
Tips for Beginners
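- Logistic regression learns a linear decision boundary, so it works best when the classes can be separated reasonably well by a line (or hyperplane) in feature space.
- Scale input features, especially when using regularization. Note that scikit-learn's LogisticRegression applies L2 regularization by default.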
- Try visualizing data before choosing logistic regression.
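One common way to follow the scaling tip is to put the scaler and the classifier in a single scikit-learn pipeline, so the same scaling is applied at training and prediction time. The sketch below assumes a hypothetical second feature (a previous score out of 1000) to show two inputs on very different scales; the data values are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: [hours studied, previous score out of 1000]
X = np.array([[1, 420], [2, 380], [3, 510],
              [4, 640], [5, 700], [6, 820]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])  # 0 = Fail, 1 = Pass

# StandardScaler rescales each feature to zero mean and unit variance
# before the (L2-regularized by default) logistic regression sees it
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, y)

# Hypothetical new students
X_new = np.array([[2.5, 400.0], [5.5, 750.0]])
pred = clf.predict(X_new)
print("Predicted classes:", pred)
print("Pass probabilities:", clf.predict_proba(X_new)[:, 1])
```

Without scaling, the regularization penalty would treat a coefficient on the 0 to 1000 feature very differently from one on the 1 to 6 feature, simply because of their units.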
Tips for Professionals
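- Use regularization (L1 or L2) to prevent overfitting (`penalty='l1'` or `'l2'`). In scikit-learn, `penalty='l1'` requires a solver that supports it, such as `'liblinear'` or `'saga'`.
- For multiclass classification, use `multi_class='multinomial'` with `solver='lbfgs'`. Since scikit-learn 1.5 the `multi_class` parameter is deprecated, and the multinomial formulation is used by default for multiclass targets with `lbfgs`.
- Logistic regression can be used as a baseline model for classification tasks due to its interpretability and speed.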
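The regularization and multiclass tips can be sketched together on scikit-learn's bundled Iris dataset (three classes, four features). The dataset choice, the `C` values, and the iteration limits are assumptions made for illustration. The `multi_class` argument is left out because recent scikit-learn versions deprecate it and already use the multinomial formulation with these solvers.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes

# L2 penalty (the default) with the lbfgs solver
l2_clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l2", C=1.0, solver="lbfgs", max_iter=1000),
)
l2_clf.fit(X, y)

# L1 penalty needs a solver that supports it; saga also handles multiclass.
# A smaller C means stronger regularization.
l1_clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", C=0.1, solver="saga",
                       max_iter=5000, random_state=0),
)
l1_clf.fit(X, y)

# One probability per class; each row sums to 1
class_probs = l2_clf.predict_proba(X[:5])
print("Class probabilities shape:", class_probs.shape)

# L1 tends to drive some coefficients exactly to zero (feature selection)
l1_coef = l1_clf.named_steps["logisticregression"].coef_
n_zero = int(np.sum(l1_coef == 0))
print("L2 training accuracy:", l2_clf.score(X, y))
print("Zeroed L1 coefficients:", n_zero, "of", l1_coef.size)
```

The training accuracy printed here is only a sanity check; comparing the two penalties properly would need a held-out split or cross-validation.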
Summary
- Logistic regression is a classification algorithm, not a regression one.
- It uses the sigmoid function to estimate class probabilities.
- Useful for many real-world binary classification problems and scalable to multiclass tasks.