Association Rule Learning (Apriori, FP-Growth)

Association Rule Learning is an unsupervised learning method used to discover interesting relationships, patterns, or correlations among a set of items in large datasets. It is commonly used in market basket analysis to identify products frequently bought together.

Two popular algorithms for association rule mining are:

  • Apriori
  • FP-Growth (Frequent Pattern Growth)

These algorithms help generate frequent itemsets and association rules from transaction data.


Key Concepts

Term          Description
Itemset       A group of one or more items
Support       How frequently the itemset appears in the dataset
Confidence    The probability that item B is purchased when item A is purchased
Lift          How much more likely B is purchased with A, compared to chance alone
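
To make these metrics concrete, here is a minimal sketch that computes support, confidence, and lift by hand for the rule {milk} → {bread}; the transactions are the same toy basket data used in the worked example later in this section.

# Minimal sketch: computing support, confidence, and lift by hand
# for the rule {milk} -> {bread} over a toy transaction list.
transactions = [
    {'milk', 'bread', 'butter'},
    {'bread', 'butter'},
    {'milk', 'bread'},
    {'milk', 'bread', 'butter', 'jam'},
    {'bread', 'jam'},
]
n = len(transactions)

def support(itemset):
    # Fraction of transactions that contain every item in the itemset
    return sum(itemset <= t for t in transactions) / n

support_a = support({'milk'})            # 3/5 = 0.6
support_b = support({'bread'})           # 5/5 = 1.0
support_ab = support({'milk', 'bread'})  # 3/5 = 0.6

confidence = support_ab / support_a      # P(bread | milk) = 1.0
lift = confidence / support_b            # 1.0 -> no better than chance

print(f"support={support_ab:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")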

Apriori Algorithm

Apriori uses a bottom-up approach by generating candidate itemsets and pruning them based on minimum support. It is iterative and simple but can be slow on large datasets.

How Apriori Works:

  1. Identify all frequent itemsets (itemsets whose support meets the minimum threshold), growing them one item at a time
  2. Generate association rules from these itemsets
  3. Prune rules based on confidence and lift thresholds
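
To illustrate the generate-and-prune loop in step 1, here is a simplified, unoptimized sketch in plain Python (rule generation, steps 2 and 3, is left to the library example below). It assumes transactions are Python sets and is meant for illustration rather than production use.

# Simplified Apriori sketch: grow frequent itemsets level by level,
# pruning candidates below min_support. Illustrative only; use a
# library such as mlxtend for real workloads.
def apriori_sketch(transactions, min_support=0.6):
    n = len(transactions)

    def support(s):
        return sum(s <= t for t in transactions) / n

    # Level 1: frequent single items
    frequent = {frozenset([i]) for t in transactions for i in t}
    frequent = {s for s in frequent if support(s) >= min_support}

    all_frequent = {}
    k = 1
    while frequent:
        all_frequent.update({s: support(s) for s in frequent})
        # Join frequent k-itemsets to form (k+1)-item candidates
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k + 1}
        # Prune candidates that fall below the support threshold
        frequent = {c for c in candidates if support(c) >= min_support}
        k += 1
    return all_frequent

transactions = [
    {'milk', 'bread', 'butter'},
    {'bread', 'butter'},
    {'milk', 'bread'},
    {'milk', 'bread', 'butter', 'jam'},
    {'bread', 'jam'},
]
for itemset, sup in apriori_sketch(transactions).items():
    print(set(itemset), round(sup, 2))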

FP-Growth Algorithm

FP-Growth avoids explicit candidate generation and instead compresses the transactions into a compact FP-Tree, from which frequent itemsets are mined. It is usually much faster than Apriori, especially on large datasets.

How FP-Growth Works:

  1. Construct an FP-Tree in two scans of the data (one to count item frequencies, one to build the tree)
  2. Mine frequent patterns from the tree recursively using conditional FP-Trees
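
A full FP-Tree implementation is beyond this section, but the short sketch below (illustrative only, not mlxtend's internals) shows the preprocessing that makes the tree compact: count item frequencies, drop infrequent items, and reorder each transaction by descending frequency so that transactions share common prefixes when inserted into the tree.

from collections import Counter

# FP-Growth preprocessing sketch: count item frequencies, drop
# infrequent items, and sort each transaction by descending count
# so transactions share prefixes in the FP-Tree. Illustrative only.
transactions = [
    ['milk', 'bread', 'butter'],
    ['bread', 'butter'],
    ['milk', 'bread'],
    ['milk', 'bread', 'butter', 'jam'],
    ['bread', 'jam'],
]
min_count = 3  # equivalent to min_support = 0.6 over 5 transactions

counts = Counter(item for t in transactions for item in t)
frequent = {item for item, c in counts.items() if c >= min_count}

ordered = [
    sorted((i for i in t if i in frequent), key=lambda i: -counts[i])
    for t in transactions
]
print(ordered)  # e.g. [['bread', 'milk', 'butter'], ['bread', 'butter'], ...]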

Python Example: Market Basket Analysis

We'll use the mlxtend library for Apriori and FP-Growth.

Install Dependencies

pip install mlxtend

Sample Code

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Sample transaction data
dataset = [
    ['milk', 'bread', 'butter'],
    ['bread', 'butter'],
    ['milk', 'bread'],
    ['milk', 'bread', 'butter', 'jam'],
    ['bread', 'jam']
]

# Convert to a one-hot encoded DataFrame
te = TransactionEncoder()
te_data = te.fit_transform(dataset)
df = pd.DataFrame(te_data, columns=te.columns_)

# Apply Apriori
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)

# Generate rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)

print("Frequent Itemsets:")
print(frequent_itemsets)

print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])

Output:

Frequent Itemsets:
   support         itemsets
0      1.0          (bread)
1      0.6         (butter)
2      0.6           (milk)
3      0.6  (butter, bread)
4      0.6    (milk, bread)

Association Rules:
  antecedents consequents  support  confidence  lift
0    (butter)     (bread)      0.6         1.0   1.0
1      (milk)     (bread)      0.6         1.0   1.0

Output Explanation

  • Frequent Itemsets: Shows item combinations that meet the minimum support.
  • Association Rules: Displays rules like {butter} → {bread} with metrics such as support, confidence, and lift.
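
mlxtend also provides an FP-Growth implementation with the same interface as apriori, so it can be used as a drop-in replacement in the example above. At the same min_support it should return the same frequent itemsets; the speed difference only becomes noticeable on larger datasets.

from mlxtend.frequent_patterns import fpgrowth

# Drop-in replacement for the apriori call above: same input
# DataFrame, same min_support, same output columns.
frequent_itemsets_fp = fpgrowth(df, min_support=0.6, use_colnames=True)
print(frequent_itemsets_fp)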

Applications of Association Rule Learning

  • Market Basket Analysis
  • Recommendation Systems
  • Website Navigation Patterns
  • Fraud Detection
  • Medical Diagnosis (symptom-disease patterns)

Advantages

  • Easy to implement and understand
  • Generates interpretable rules
  • Works well for retail, e-commerce, and log data

Limitations

  • Apriori is computationally expensive for large datasets
  • Can generate an overwhelming number of rules unless support and confidence thresholds are used to prune them
  • Doesn’t consider temporal or causal relationships

Tips for Beginners

  • Use min_support and min_confidence thresholds to filter meaningful rules
  • Visualize itemset frequencies using bar charts or heatmaps (see the chart sketch after this list)
  • Start with small datasets to understand algorithm behavior
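
For example, a quick bar chart of the supports from the earlier example could look like the sketch below (it assumes matplotlib is installed and frequent_itemsets is still in scope):

import matplotlib.pyplot as plt

# Bar chart of itemset supports from the earlier example.
# Assumes matplotlib is installed and frequent_itemsets is in scope.
labels = frequent_itemsets['itemsets'].apply(lambda s: ', '.join(sorted(s)))
plt.bar(labels, frequent_itemsets['support'])
plt.ylabel('Support')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()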

Tips for Professionals

  • Use FP-Growth over Apriori for scalability
  • Apply rule pruning techniques to reduce noisy or redundant rules (see the filtering sketch after this list)
  • Combine with user segmentation for personalized recommendations
  • Use Lift to identify truly interesting rules beyond random co-occurrence
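
As one example of pruning, the snippet below filters the rules DataFrame from the earlier example, keeping only rules with lift above 1 and high confidence; the thresholds are illustrative and should be tuned per dataset.

# Keep only rules that beat random co-occurrence (lift > 1) and are
# reasonably reliable (confidence >= 0.8); thresholds are illustrative.
pruned = rules[(rules['lift'] > 1.0) & (rules['confidence'] >= 0.8)]
pruned = pruned.sort_values('lift', ascending=False)
print(pruned[['antecedents', 'consequents', 'support', 'confidence', 'lift']])

On the toy data above, both rules have a lift of exactly 1.0, so they would be filtered out, which is the point: they occur together no more often than chance would predict.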

Summary

Association Rule Learning is a powerful method for discovering item relationships in transaction data. Apriori is simple and suitable for smaller datasets, while FP-Growth is more efficient for larger volumes. Both help uncover valuable insights in retail, finance, and web usage patterns.