Association Rule Learning (Apriori, FP-Growth)
Association Rule Learning is an unsupervised learning method used to discover interesting relationships, patterns, or correlations among a set of items in large datasets. It is commonly used in market basket analysis to identify products frequently bought together.
Two popular algorithms for association rule mining are:
- Apriori
- FP-Growth (Frequent Pattern Growth)
These algorithms help generate frequent itemsets and association rules from transaction data.
Key Concepts
Term | Description |
---|---|
Itemset | A group of one or more items |
Support | Fraction of transactions in the dataset that contain the itemset |
Confidence | Estimated probability that B is purchased given that A is purchased: support of A and B together, divided by support of A |
Lift | Confidence of the rule A → B divided by the support of B; values above 1 mean A and B occur together more often than expected by chance |
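The three metrics in the table can be computed directly from a list of transactions. The following is a minimal sketch in plain Python; the five baskets and the function names `support`, `confidence`, and `lift` are assumptions made for illustration and are not part of any library.

```python
# Five hypothetical baskets, chosen only for illustration.
transactions = [
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "bread"},
    {"milk", "bread", "butter", "jam"},
    {"bread", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and B) / support(A)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of A -> B divided by the baseline support of B."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


print(support({"butter", "milk"}, transactions))       # 0.4 (2 of 5 baskets)
print(confidence({"butter"}, {"milk"}, transactions))  # roughly 0.67
print(lift({"butter"}, {"milk"}, transactions))        # roughly 1.11
```

A lift slightly above 1 for butter and milk suggests only a weak positive association in this toy data.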
Apriori Algorithm
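Apriori uses a bottom-up, level-wise approach: it generates candidate itemsets of increasing size and prunes those that fall below the minimum support. The pruning relies on the fact that every superset of an infrequent itemset is also infrequent. The algorithm is simple, but it can be slow on large datasets because it rescans the data to count support at each level.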
How Apriori Works:
- Identify all frequent itemsets (items appearing above a support threshold)
- Generate association rules from these itemsets
- Prune rules based on confidence and lift thresholds
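The steps above can be sketched in plain Python as follows. This is a simplified illustration, not an optimized implementation and not how any particular library does it; the function names `apriori_frequent_itemsets` and `generate_rules` and the sample baskets are assumptions made for this example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count support for the surviving candidates (one data scan per level).
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence, lift) for qualifying rules."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, consequent, conf, conf / frequent[consequent]))
    return rules


# Hypothetical baskets used only to exercise the sketch.
baskets = [
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "bread"],
    ["milk", "bread", "butter", "jam"],
    ["bread", "jam"],
]
frequent = apriori_frequent_itemsets(baskets, min_support=0.6)
rules = generate_rules(frequent, min_confidence=0.7)
for antecedent, consequent, conf, lift in rules:
    print(set(antecedent), "->", set(consequent), conf, lift)
```

Rule generation can look up `frequent[antecedent]` safely because every subset of a frequent itemset is itself frequent, so it was recorded at an earlier level.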
FP-Growth Algorithm
FP-Growth avoids candidate generation and instead compresses the transactions into a compact FP-tree (a prefix tree of frequent items) from which frequent itemsets are mined. It is typically faster and more memory-efficient than Apriori, especially for large datasets.
How FP-Growth Works:
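- Scan the data once to count how often each item occurs and discard items below the minimum support
- Scan the data a second time to construct the FP-Tree, inserting each transaction's frequent items sorted by descending frequency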
- Mine frequent patterns from the tree recursively
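The following plain-Python sketch illustrates these steps under simplifying assumptions: it rebuilds a small conditional tree for each recursion and omits the optimizations a production implementation would use. The names `FPNode`, `build_fp_tree`, and `fp_growth` are hypothetical, items are assumed to be strings, and support is expressed as an absolute count rather than a fraction.

```python
from collections import defaultdict


class FPNode:
    """One node of the FP-tree: an item, its count, and links up and down."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(weighted_transactions, min_count):
    """Build an FP-tree from (items, count) pairs; return (root, header table)."""
    # Pass 1: count each item's occurrences.
    counts = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    # Pass 2: insert each transaction, keeping only frequent items,
    # sorted by descending count so common prefixes share nodes.
    root = FPNode(None, None)
    header = defaultdict(list)  # item -> all tree nodes labelled with that item
    for items, weight in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
            node.count += weight
    return root, header


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Return {frozenset: count} of frequent itemsets, mined recursively."""
    _, header = build_fp_tree(weighted_transactions, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each node for `item`.
        conditional = []
        for n in nodes:
            path, parent = [], n.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        result.update(fp_growth(conditional, min_count, itemset))
    return result


# Hypothetical baskets; min_count=3 corresponds to a support of 0.6 here.
baskets = [
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "bread"],
    ["milk", "bread", "butter", "jam"],
    ["bread", "jam"],
]
patterns = fp_growth([(b, 1) for b in baskets], min_count=3)
for itemset, count in patterns.items():
    print(set(itemset), count)
```

No candidate itemsets are generated and tested against the raw data: after the two initial scans, all counting happens on the tree and its conditional pattern bases.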
Python Example: Market Basket Analysis
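We'll use the mlxtend library, which provides both Apriori and FP-Growth. The sample code uses apriori; the fpgrowth function in the same mlxtend.frequent_patterns module accepts the same one-hot encoded DataFrame.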
Install Dependencies
pip install mlxtend
Sample Code
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Sample transaction data: each inner list is one customer's basket
dataset = [
    ['milk', 'bread', 'butter'],
    ['bread', 'butter'],
    ['milk', 'bread'],
    ['milk', 'bread', 'butter', 'jam'],
    ['bread', 'jam']
]

# Convert the transactions to a one-hot encoded (boolean) DataFrame
te = TransactionEncoder()
te_data = te.fit_transform(dataset)
df = pd.DataFrame(te_data, columns=te.columns_)

# Apply Apriori: keep itemsets found in at least 60% of transactions
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)

# Generate rules with a confidence of at least 0.7
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)

print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])
Output
Frequent Itemsets:
   support         itemsets
0      1.0          (bread)
1      0.6         (butter)
2      0.6           (milk)
3      0.6  (butter, bread)
4      0.6    (milk, bread)

Association Rules:
  antecedents consequents  support  confidence  lift
0    (butter)     (bread)      0.6         1.0   1.0
1      (milk)     (bread)      0.6         1.0   1.0
Output Explanation
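- Frequent Itemsets: Shows the item combinations that meet the minimum support of 0.6, that is, those appearing in at least 3 of the 5 transactions.
- Association Rules: Displays rules such as {butter} → {bread} with their support, confidence, and lift. A confidence of 1.0 means every basket that contains butter also contains bread. A lift of 1.0 means the rule does no better than chance here, because bread appears in every transaction.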
Applications of Association Rule Learning
- Market Basket Analysis
- Recommendation Systems
- Website Navigation Patterns
- Fraud Detection
- Medical Diagnosis (symptom-disease patterns)
Advantages
- Easy to implement and understand
- Generates interpretable rules
- Works well for retail, e-commerce, and log data
Limitations
- Apriori is computationally expensive for large datasets
- Can generate a very large number of rules unless support, confidence, or lift thresholds prune them
- Captures co-occurrence only: rules ignore the order of events and do not imply causation
Tips for Beginners
- Use min_support and min_confidence thresholds to filter meaningful rules (in mlxtend, the confidence threshold is passed as min_threshold with metric="confidence")
- Visualize itemset frequencies using bar charts or heatmaps
- Start with small datasets to understand algorithm behavior
Tips for Professionals
- Use FP-Growth over Apriori for scalability
- Apply rule pruning techniques to reduce noisy or redundant rules
- Combine with user segmentation for personalized recommendations
- Use Lift to identify truly interesting rules beyond random co-occurrence
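As a rough illustration of lift-based filtering and redundancy pruning, the sketch below keeps rules whose lift exceeds a threshold and drops any rule for which a simpler rule (a strict subset of its antecedent, same consequent) already reaches at least the same confidence. The `prune_rules` function, the dictionary layout, and the rule values are all hypothetical and chosen only for this example; other redundancy criteria are equally valid.

```python
def prune_rules(rules, min_lift=1.0):
    """Keep rules with lift above min_lift that no simpler rule makes redundant."""
    interesting = [r for r in rules if r["lift"] > min_lift]
    kept = []
    for rule in interesting:
        redundant = any(
            other is not rule
            and other["consequent"] == rule["consequent"]
            and other["antecedent"] < rule["antecedent"]  # strict subset
            and other["confidence"] >= rule["confidence"]
            for other in interesting
        )
        if not redundant:
            kept.append(rule)
    return kept


# Hypothetical rules with made-up metric values, for illustration only.
rules = [
    {"antecedent": frozenset({"butter"}), "consequent": frozenset({"bread"}),
     "confidence": 1.0, "lift": 1.0},
    {"antecedent": frozenset({"diapers"}), "consequent": frozenset({"wipes"}),
     "confidence": 0.8, "lift": 2.5},
    {"antecedent": frozenset({"diapers", "milk"}), "consequent": frozenset({"wipes"}),
     "confidence": 0.75, "lift": 2.3},
]

kept = prune_rules(rules, min_lift=1.0)
for rule in kept:
    print(set(rule["antecedent"]), "->", set(rule["consequent"]), rule["lift"])
```

Here the first rule is dropped because a lift of 1.0 indicates no association beyond chance, and the third is dropped because adding milk to the antecedent does not improve on the simpler diapers rule.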
Summary
Association Rule Learning is a powerful method for discovering item relationships in transaction data. Apriori is simple and suitable for smaller datasets, while FP-Growth is more efficient for larger volumes. Both help uncover valuable insights in retail, finance, and web usage patterns.