Linear Regression and Its Applications
Linear Regression is one of the most fundamental algorithms in machine learning. It models the relationship between a dependent variable and one or more independent variables with a linear function, which is a straight line when there is a single input. This tutorial explores how it works, its assumptions, applications, and a practical implementation in Python.
What is Linear Regression?
Linear Regression is a supervised learning algorithm used for predicting continuous values. It assumes a linear relationship between input variables (features) and the output variable (target).
Simple Linear Regression involves one independent variable.
Multiple Linear Regression uses two or more independent variables.
Mathematical Representation
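For simple linear regression:
$$y = b_0 + b_1 x + \varepsilon$$
where $b_0$ is the intercept, $b_1$ is the slope, and $\varepsilon$ is the error term (the part of $y$ the line does not explain).
For multiple linear regression with $n$ independent variables:
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + \varepsilon$$
The coefficients are usually estimated by ordinary least squares, which picks the values that minimize the sum of squared residuals.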
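To make the fitting step concrete, here is a minimal sketch of the closed-form least-squares estimates for the simple (one-variable) case, written with NumPy only. It uses the same study-hours data as the implementation section further down. The helper name `fit_simple_ols` is made up for this illustration and is not part of any library.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals for y ≈ b0 + b1*x.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    b0 = y_mean - b1 * x_mean
    return b0, b1

hours = [1, 2, 3, 4, 5, 6]
scores = [50, 55, 65, 70, 75, 85]
b0, b1 = fit_simple_ols(hours, scores)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")  # b0 = 42.6667, b1 = 6.8571
```

These values agree with the intercept and slope that scikit-learn reports for the same data, which is expected because `LinearRegression` also solves an ordinary least-squares problem.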
Assumptions of Linear Regression
- Linearity – Relationship between input and output is linear.
- Independence – Observations are independent.
- Homoscedasticity – Constant variance of residuals.
- Normality – Residuals are normally distributed.
- No multicollinearity – In multiple regression, independent variables should not be highly correlated.
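Some of these assumptions can be inspected informally through the residuals. The sketch below is one possible way to do that with NumPy, again on the study-hours data. The particular checks (correlation of absolute residuals with fitted values, lag-1 autocorrelation) are rough heuristics chosen for this illustration, not formal tests, and six observations are far too few to conclude anything.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([50, 55, 65, 70, 75, 85], dtype=float)

# Fit y ≈ b0 + b1*x by least squares (np.polyfit returns the highest degree first)
b1, b0 = np.polyfit(x, y, deg=1)
fitted = b0 + b1 * x
residuals = y - fitted

# With an intercept in the model, least-squares residuals average to (numerically) zero
print("Mean residual:", float(residuals.mean()))

# Rough homoscedasticity check: does the residual magnitude trend with the fitted value?
spread_trend = np.corrcoef(fitted, np.abs(residuals))[0, 1]
print("Correlation of |residual| with fitted value:", round(float(spread_trend), 3))

# Rough independence check: lag-1 autocorrelation of residuals in observation order
lag1 = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]
print("Lag-1 residual autocorrelation:", round(float(lag1), 3))
```

On real datasets, a residuals-versus-fitted plot and a Q-Q plot of the residuals are the usual visual companions to numbers like these.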
Applications of Linear Regression
Application | Description |
---|---|
Predicting Sales | Estimating future sales based on marketing spend or seasonal factors |
Real Estate Pricing | Predicting house prices based on area, number of rooms, location, etc. |
Healthcare | Predicting patient metrics like blood pressure based on age, weight, lifestyle factors |
Finance | Forecasting stock prices or risk based on financial indicators |
Engineering | Modeling relationships between process variables and outputs |
Python Implementation: Simple Linear Regression
```python
# Import required libraries
from sklearn.linear_model import LinearRegression
import numpy as np
import matplotlib.pyplot as plt

# Sample data: study hours vs. exam score
X = np.array([[1], [2], [3], [4], [5], [6]])  # Hours studied (2-D: one column per feature)
y = np.array([50, 55, 65, 70, 75, 85])        # Exam score

# Create and fit a Linear Regression model
model = LinearRegression()
model.fit(X, y)

# Predict on the training inputs
y_pred = model.predict(X)

# Coefficients
print("Intercept (b0):", model.intercept_)
print("Slope (b1):", model.coef_[0])

# Plot the data points and the fitted regression line
plt.scatter(X, y, color='blue', label='Actual Scores')
plt.plot(X, y_pred, color='red', label='Regression Line')
plt.xlabel("Hours Studied")
plt.ylabel("Exam Score")
plt.title("Linear Regression Example")
plt.legend()
plt.show()
```
Output:
```
Intercept (b0): 42.66666666666667
Slope (b1): 6.857142857142858
```
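Once fitted, the model can also score inputs it has not seen. As a small sketch, the snippet below refits the same data and predicts the exam score for 7 hours of study. The value 7 is an arbitrary choice for illustration, and it lies just outside the observed range of 1 to 6 hours, so the result is an extrapolation and should be read with caution.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5], [6]])  # Hours studied
y = np.array([50, 55, 65, 70, 75, 85])        # Exam score
model = LinearRegression().fit(X, y)

# predict expects a 2-D array: one row per sample, one column per feature
new_hours = np.array([[7]])
predicted_score = model.predict(new_hours)[0]
print(f"Predicted score for 7 hours: {predicted_score:.2f}")  # 90.67
```

The prediction is simply `b0 + b1 * 7` using the intercept and slope printed above.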
Model Evaluation Metrics
To evaluate the model, use metrics such as:
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R² Score (Coefficient of Determination)
```python
from sklearn.metrics import mean_squared_error, r2_score

# Compare the predictions with the actual scores (here, on the training data)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print("MSE:", mse)
print("R² Score:", r2)
```
Output:
```
MSE: 1.746031746031747
R² Score: 0.9874285714285714
```
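The list above also names MAE and RMSE, which the snippet does not compute. A minimal sketch for the same data follows. Taking the square root of the MSE with NumPy is one straightforward way to get RMSE. Both metrics are in the units of the target (exam points), which often makes them easier to interpret than MSE.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

X = np.array([[1], [2], [3], [4], [5], [6]])  # Hours studied
y = np.array([50, 55, 65, 70, 75, 85])        # Exam score
y_pred = LinearRegression().fit(X, y).predict(X)

mae = mean_absolute_error(y, y_pred)          # average absolute miss
rmse = np.sqrt(mean_squared_error(y, y_pred)) # square root of the MSE

print(f"MAE:  {mae:.4f}")   # 1.1429
print(f"RMSE: {rmse:.4f}")  # 1.3214
```

All of these numbers are measured on the data the model was trained on, so they describe fit rather than generalization. Scoring a held-out test set gives a more honest estimate.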
Tips for Beginners
- Always visualize your data to check if a linear model makes sense.
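- Standardize or normalize features in multiple regression when you want to compare coefficient magnitudes or plan to add regularization. Plain least-squares predictions do not change with feature scaling.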
- Use evaluation metrics to determine how well your model fits the data.
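The point about scaling can be seen in a short sketch of multiple regression. The data below is synthetic and entirely made up for illustration (the feature names `area_sqft` and `rooms`, the coefficients, and the noise level are all assumptions). It fits the same model with and without `StandardScaler` and compares the results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200

# Synthetic, made-up features on very different scales
area_sqft = rng.uniform(500, 3000, size=n)
rooms = rng.integers(1, 6, size=n).astype(float)
X = np.column_stack([area_sqft, rooms])

# Made-up linear ground truth plus Gaussian noise
y = 50_000 + 120 * area_sqft + 8_000 * rooms + rng.normal(0, 5_000, size=n)

raw = LinearRegression().fit(X, y)
scaled = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# Raw coefficients are per original unit; scaled ones are per standard deviation
print("Raw coefficients:   ", raw.coef_)
print("Scaled coefficients:", scaled.named_steps["linearregression"].coef_)

# Ordinary least squares gives the same fitted values either way
same = np.allclose(raw.predict(X), scaled.predict(X))
print("Same predictions:", same)
```

Scaling changes how the coefficients read, not how well plain least squares fits. It starts to affect the fit itself once a penalty such as Ridge or Lasso is added, because the penalty treats all coefficients on a common scale.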
Tips for Professionals
- Regularize linear regression using Ridge or Lasso if overfitting occurs.
- Use statsmodels for statistical insight and p-values.
- Check for multicollinearity using Variance Inflation Factor (VIF) when working with multiple features.
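As a sketch of the multicollinearity check, the VIF of each feature can be read off the diagonal of the inverse of the feature correlation matrix, which is equivalent to computing 1 / (1 - R²) from regressing that feature on the others with an intercept. The helper `variance_inflation_factors` and the synthetic columns `x1`, `x2`, `x3` are assumptions made up for this example. statsmodels also ships a `variance_inflation_factor` function in `statsmodels.stats.outliers_influence` if you prefer a library routine.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF per column of X, taken from the diagonal of the inverse correlation matrix.

    Hypothetical helper for illustration only.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(42)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly a copy of x1, so strongly collinear
x3 = rng.normal(size=n)                  # generated independently of x1 and x2
X = np.column_stack([x1, x2, x3])

vifs = variance_inflation_factors(X)
for name, vif in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A VIF near 1 means a feature is largely unrelated to the others. Values above roughly 5 to 10 are a common rule-of-thumb signal that coefficients for those features will be unstable, which is one situation where Ridge or Lasso can help.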
Summary
- Linear Regression predicts continuous values using a linear approach.
- It's simple, interpretable, and often used as a baseline model.
- Ideal for scenarios where relationships between variables are approximately linear.