Self-Organizing Maps (SOM)
Self-Organizing Maps (SOM) are a type of artificial neural network introduced by Teuvo Kohonen. SOMs are mainly used for dimensionality reduction and data visualization, especially for high-dimensional data.
Unlike supervised learning models, SOMs learn patterns without labels, organizing input data into a 2D map where similar inputs land on nearby neurons. This makes them well suited to clustering, pattern recognition, and visualization.
How SOM Works
Architecture
- Input layer: accepts n-dimensional feature vectors
- Map layer: typically a 2D grid of neurons (nodes), each with an associated weight vector of the same dimension as the input
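As a rough illustration of this architecture, the map layer can be stored as a single array holding one weight vector per node. The sketch below uses NumPy only; the 7x7 grid size, the 4 input features, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 4        # dimensionality of each input vector (assumed)
rows, cols = 7, 7     # size of the 2D map (assumed)

# Map layer: one weight vector per node, same dimension as the input.
weights = rng.random((rows, cols, n_features))

# Input layer: a single n-dimensional feature vector.
x = rng.random(n_features)

# Euclidean distance from the input to every node's weight vector.
distances = np.linalg.norm(weights - x, axis=-1)   # shape: (rows, cols)

# Grid position of the node whose weights are closest to the input.
closest = np.unravel_index(np.argmin(distances), distances.shape)
print(weights.shape, distances.shape, closest)
```

Keeping the weights in a `(rows, cols, n_features)` array means the grid position of a node is simply its first two indices.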
Training Process
- Initialize the weight vectors randomly
- For each input vector:
  - Find the Best Matching Unit (BMU): the neuron whose weight vector is closest to the input, typically by Euclidean distance
  - Update the BMU and its neighbors on the grid so their weights move toward the input
- Shrink the learning rate and the neighborhood radius as training progresses; over time, the map self-organizes to reflect the data distribution
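The training steps above can be sketched from scratch in NumPy. This is a minimal illustration, not how MiniSom is implemented: the function name `train_som`, the linear decay schedules, and the Gaussian neighborhood are assumptions, and other choices are equally valid.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM online; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    weights = rng.random((rows, cols, n_features))   # random initialization

    # Grid coordinates of every node, used to measure distance on the map.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )

    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # assumed linear decay
        sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinks but stays positive

        x = data[rng.integers(len(data))]       # pick one input at random

        # Best Matching Unit: node whose weights are closest to the input.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)

        # Gaussian neighborhood centered on the BMU, measured on the grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))

        # Move the BMU and its neighbors toward the input.
        weights += lr * h[..., None] * (x - weights)
    return weights

# Toy data: two tight blobs in 3 dimensions (synthetic, for illustration only).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.2, 0.02, (50, 3)), rng.normal(0.8, 0.02, (50, 3))])
som_weights = train_som(data, rows=4, cols=4, n_iter=400)
print(som_weights.shape)
```

Early in training the wide neighborhood drags large parts of the map toward each input; later, the small radius and learning rate only fine-tune individual nodes.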
Key Features of SOM
Feature | Description |
---|---|
Topology Preservation | Similar data points map to nearby neurons |
Dimensionality Reduction | High-dimensional data projected onto a 2D space |
Unsupervised Learning | No labeled data required |
Clustering & Visualization | Helps identify natural clusters and structure in data |
Python Example Using MiniSom
We’ll use the MiniSom package for a basic SOM implementation.
Install Library
pip install minisom
Sample Code
from minisom import MiniSom
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
import numpy as np
# Load and normalize data
iris = load_iris()
X = iris.data
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
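# Initialize SOM: 7x7 grid, input_len = 4 (features); fixed seed for reproducible maps
som = MiniSom(x=7, y=7, input_len=4, sigma=1.0, learning_rate=0.5, random_seed=42)
som.random_weights_init(X_scaled)  # start the weights from randomly chosen samples
som.train_random(X_scaled, 100)    # 100 updates, each on a randomly picked sample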
# Visualize SOM distance map (U-Matrix)
plt.figure(figsize=(7, 7))
plt.pcolor(som.distance_map().T, cmap='coolwarm') # distance map as heatmap
plt.colorbar()
plt.title("SOM - Distance Map (U-Matrix)")
plt.show()
Output Explanation
- With the 'coolwarm' colormap, red (warm) cells indicate neurons that are far from their neighbors (boundaries between clusters)
- Blue (cool) cells indicate neurons that are close to their neighbors (cluster interiors)
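To make the distance map less of a black box, here is a simplified U-Matrix computed directly from a weight grid in NumPy. It is a sketch under stated assumptions: the function name `u_matrix` is hypothetical, only the up/down/left/right neighbors are used, and the result is scaled so the largest value is 1. MiniSom's `distance_map()` may differ in these details.

```python
import numpy as np

def u_matrix(weights):
    """Mean distance from each node's weights to its up/down/left/right neighbors."""
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))

    # Distances between vertically adjacent nodes.
    d_v = np.linalg.norm(weights[1:] - weights[:-1], axis=-1)
    total[1:] += d_v
    count[1:] += 1
    total[:-1] += d_v
    count[:-1] += 1

    # Distances between horizontally adjacent nodes.
    d_h = np.linalg.norm(weights[:, 1:] - weights[:, :-1], axis=-1)
    total[:, 1:] += d_h
    count[:, 1:] += 1
    total[:, :-1] += d_h
    count[:, :-1] += 1

    u = total / count
    return u / u.max()   # assumes at least two nodes with different weights

# Toy map: left half of the grid holds one prototype, right half another.
toy = np.zeros((4, 6, 2))
toy[:, 3:, :] = 1.0
u = u_matrix(toy)
print(np.round(u, 2))
```

In the toy map, only the two columns on either side of the split get non-zero values, which is exactly the "ridge" that marks a cluster boundary in the heatmap.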
Applications of SOM
- Customer segmentation
- Anomaly detection
- Document or text clustering
- Gene expression pattern analysis
- Image compression
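As one example of these applications, anomaly detection can be sketched by scoring each sample with its distance to the nearest node and flagging unusually large scores. Everything below is illustrative: the tiny 2x2 codebook stands in for trained SOM weights, the helper name `bmu_distance` is hypothetical, and the 99th-percentile threshold is an arbitrary assumption.

```python
import numpy as np

def bmu_distance(weights, data):
    """Distance from each sample to its closest node (its BMU)."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return d.min(axis=1)

# Hypothetical codebook standing in for the weights of a trained 2x2 SOM.
weights = np.array([[[0.1, 0.1], [0.1, 0.9]],
                    [[0.9, 0.1], [0.9, 0.9]]])

# Synthetic "normal" data: small noise around the four prototypes.
rng = np.random.default_rng(0)
prototypes = weights.reshape(-1, 2)
normal = prototypes[rng.integers(4, size=200)] + rng.normal(0, 0.03, (200, 2))

# Threshold chosen from the scores of normal data (assumed: 99th percentile).
threshold = np.percentile(bmu_distance(weights, normal), 99)

new_points = np.array([[0.12, 0.88],   # close to a prototype
                       [0.50, 0.50]])  # far from every prototype
flags = bmu_distance(weights, new_points) > threshold
print(flags)
```

Samples that no node represents well end up with a large distance to their BMU, which is what the threshold picks up.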
Advantages
- Intuitive data visualization
- Preserves data topology
- Works well with unlabeled, high-dimensional data
- Identifies patterns/clusters without supervision
Limitations
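- Results depend on careful tuning of map size, neighborhood radius (sigma), learning rate, and number of iterations
- Training is relatively slow for large datasets
- A SOM does not output explicit cluster labels; clusters have to be read off the map, so interpretation can be less precise than with traditional clustering models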
Tips for Beginners
- Always normalize data before training a SOM
- Use the U-Matrix to visually interpret cluster boundaries
- Start with small grid sizes and scale as needed
Tips for Professionals
- Use SOM as a preprocessing step for classification or anomaly detection
- Combine with other clustering techniques like DBSCAN for hybrid modeling
- Customize color-mapped heatmaps (for example, one heatmap per input feature) to see which features drive each region of the map
- Evaluate map quality using quantization and topographic errors
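The two quality measures in the last tip can be sketched in NumPy as follows. MiniSom exposes `quantization_error(data)` and `topographic_error(data)` for this purpose; the function below is an independent illustration, and its name `som_quality` and its definition of "adjacent" (at most one step apart in each grid direction) are assumptions.

```python
import numpy as np

def som_quality(weights, data):
    """Return (quantization error, topographic error) for a rectangular map."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(-1, n_features)            # node index = r * cols + c

    # Distance from every sample to every node.
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]

    # Quantization error: mean distance between a sample and its BMU.
    q_error = d[np.arange(len(data)), best].mean()

    # Topographic error: share of samples whose best and second-best
    # nodes are not next to each other on the grid.
    br, bc = np.divmod(best, cols)
    sr, sc = np.divmod(second, cols)
    not_adjacent = (np.abs(br - sr) > 1) | (np.abs(bc - sc) > 1)
    t_error = not_adjacent.mean()
    return q_error, t_error

# Toy 1x4 maps over 1D data: one ordered, one with scrambled nodes.
data = np.array([[0.1], [2.9]])
ordered = np.array([[[0.0], [1.0], [2.0], [3.0]]])
scrambled = np.array([[[0.0], [3.0], [1.0], [2.0]]])
print(som_quality(ordered, data))
print(som_quality(scrambled, data))
```

Both toy maps fit the data equally well, so they share the same quantization error; only the topographic error reveals that the scrambled map has lost the neighborhood structure.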
Summary
Self-Organizing Maps (SOM) are a useful tool for unsupervised learning and for visualizing complex data. They help you discover hidden structures and relationships, especially in high-dimensional datasets, which makes them a good fit for exploratory data analysis and clustering tasks.