Data Visualization with Matplotlib and Seaborn

Introduction

Data visualization is a crucial step in data analysis: it reveals patterns, trends, and relationships that are hard to see in raw tables. Matplotlib and Seaborn are two widely used Python libraries for this purpose.

In this tutorial, we will cover:

  1. Introduction to Matplotlib and Seaborn
  2. Creating Basic Plots with Matplotlib
  3. Customizing Plots in Matplotlib
  4. Seaborn for Statistical Data Visualization
  5. Advanced Plots with Seaborn
  6. Combining Matplotlib and Seaborn

1. Introduction to Matplotlib and Seaborn

  • Matplotlib is a low-level library that provides complete control over plots.
  • Seaborn is built on top of Matplotlib and is optimized for statistical data visualization.

Install the libraries if not already installed:

pip install matplotlib seaborn

Import required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
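The layering described above can be checked directly. The following is a minimal sketch, assuming a small made-up DataFrame (the `toy` name and its values are hypothetical): an axes-level Seaborn function returns an ordinary Matplotlib Axes, so Matplotlib methods work on it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.axes import Axes

# Hypothetical toy data, made up for illustration
toy = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 3, 5]})

# An axes-level Seaborn function draws the plot and returns the Axes it used
ax = sns.scatterplot(data=toy, x="x", y="y")
returns_mpl_axes = isinstance(ax, Axes)

# Because it is a plain Matplotlib Axes, the usual Matplotlib methods apply
ax.set_title("Drawn by Seaborn, titled by Matplotlib")
title = ax.get_title()

plt.close(ax.figure)
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of closing the figure.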

2. Creating Basic Plots with Matplotlib

Line Plot

# Sample data
x = np.linspace(0, 10, 100)
y = np.sin(x)

# Create a line plot
plt.plot(x, y, label='Sine Wave', color='blue', linestyle='dashed')

# Adding labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Plot Example')

plt.legend()
plt.show()

Bar Chart

# Sample data
categories = ['A', 'B', 'C', 'D']
values = [10, 25, 15, 30]

# Create a bar plot
plt.bar(categories, values, color='green')

# Adding labels
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart Example')

plt.show()

Histogram

# Generate random data
data = np.random.randn(1000)

# Create a histogram
plt.hist(data, bins=30, color='purple', edgecolor='black')

plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram Example')

plt.show()
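To see what the bins argument does, it can help to inspect what plt.hist returns. This is a minimal sketch, assuming a seeded random generator (the seed value is arbitrary and only makes the run repeatable): 30 bins produce 30 counts and 31 bin edges, and the counts add up to the number of samples.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator; the seed is an arbitrary choice for reproducibility
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# plt.hist returns the per-bin counts, the bin edges, and the drawn bars
counts, edges, patches = plt.hist(data, bins=30, color="purple", edgecolor="black")

n_bins = len(counts)       # one count per bin
n_edges = len(edges)       # always one more edge than bins
total = int(counts.sum())  # every sample falls into exactly one bin

plt.close()
```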

3. Customizing Plots in Matplotlib

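Matplotlib lets you control figure size, line color and width, markers, font sizes, and grid lines. The example below reuses x and y from the line plot in Section 2.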

plt.figure(figsize=(8, 5))
plt.plot(x, y, color='red', linewidth=2, marker='o', markersize=5)

plt.xlabel('X-axis', fontsize=12)
plt.ylabel('Y-axis', fontsize=12)
plt.title('Customized Line Plot', fontsize=14)

plt.grid(True)
plt.show()
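The same customizations can also be written with Matplotlib's object-oriented interface, where you hold explicit Figure and Axes objects instead of relying on pyplot's current figure. The following is a sketch of that style, not a replacement for the example above; the axis limits and the dotted grid are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# plt.subplots returns the Figure and its Axes explicitly
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, np.sin(x), color="red", linewidth=2, label="sin(x)")

# Axes methods mirror the pyplot functions (xlabel -> set_xlabel, and so on)
ax.set_xlabel("X-axis", fontsize=12)
ax.set_ylabel("Y-axis", fontsize=12)
ax.set_title("Customized Line Plot", fontsize=14)

# Illustrative extras: fixed axis limits, a dotted grid, and a legend
ax.set_xlim(0, 10)
ax.set_ylim(-1.5, 1.5)
ax.grid(True, linestyle=":")
ax.legend(loc="upper right")
fig.tight_layout()

x_limits = tuple(ax.get_xlim())
fig_size = tuple(fig.get_size_inches())
plot_title = ax.get_title()
plt.close(fig)
```

This style tends to pay off once a figure has more than one subplot, because every call names the Axes it acts on.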

4. Seaborn for Statistical Data Visualization

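Seaborn works directly with pandas DataFrames: you pass column names, and it handles grouping, colors, and legends for you. The example below uses the "tips" sample dataset, which load_dataset downloads on first use, so it needs an internet connection.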

# Load sample dataset
df = sns.load_dataset("tips")

# Scatter plot using Seaborn
sns.scatterplot(x='total_bill', y='tip', data=df, hue='sex', style='smoker')

plt.title("Scatter Plot with Seaborn")
plt.show()
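To see what the hue argument saves you, here is a rough sketch of the same grouping done by hand in plain Matplotlib. It assumes a small made-up DataFrame (`toy` is a hypothetical stand-in for a few rows of "tips") so that it runs without downloading anything.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for a few rows of the "tips" dataset
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 15.0, 30.0],
    "tip": [1.5, 3.0, 2.0, 5.0],
    "sex": ["Female", "Male", "Female", "Male"],
})

fig, ax = plt.subplots()

# One scatter call per group; Matplotlib cycles the color automatically
for sex, group in toy.groupby("sex"):
    ax.scatter(group["total_bill"], group["tip"], label=sex)

# Axis labels and the legend have to be added manually
ax.set_xlabel("total_bill")
ax.set_ylabel("tip")
legend = ax.legend(title="sex")
legend_labels = [text.get_text() for text in legend.get_texts()]

plt.close(fig)
```

Seaborn's hue='sex' does the grouping, coloring, labeling, and legend in a single call.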

5. Advanced Plots with Seaborn

Box Plot (To analyze distributions and detect outliers)

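# Note: seaborn 0.13+ warns when palette is passed without hue;
# on those versions, add hue='day' and legend=False for the same effect.
sns.boxplot(x='day', y='total_bill', data=df, palette="coolwarm")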
plt.title("Box Plot Example")
plt.show()
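The points a box plot draws as outliers follow a simple rule: by default, anything more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below works through that rule with NumPy on a handful of made-up bill amounts (the `values` array is hypothetical).

```python
import numpy as np

# Hypothetical bill amounts; 50.0 is deliberately far from the rest
values = np.array([10.0, 12.0, 13.0, 14.0, 15.0, 16.0, 18.0, 50.0])

# First and third quartiles, and the interquartile range between them
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Default whisker limits: 1.5 * IQR beyond each quartile
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Anything outside the limits would be drawn as an individual outlier point
outliers = values[(values < lower) | (values > upper)]
print(outliers)
```

Here the quartiles are 12.75 and 16.5, so the limits are 7.125 and 22.125, and only 50.0 falls outside them.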

Heatmap (For correlation analysis)

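# Correlation is only defined for numeric columns. pandas 2.0+ no longer
# drops non-numeric columns automatically, so exclude them explicitly.
correlation_matrix = df.corr(numeric_only=True)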

sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", linewidths=0.5)
plt.title("Heatmap Example")
plt.show()
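A correlation heatmap only makes sense for numeric columns. As a sketch of one way to handle mixed data, the example below selects the numeric columns with select_dtypes before computing correlations, and pins the color scale to the full range from -1 to 1 so that colors are comparable across plots. The `toy` DataFrame is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data mixing numeric and text columns
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.0, 2.5, 3.0, 5.0],
    "size": [1, 2, 2, 4],
    "day": ["Thur", "Fri", "Sat", "Sun"],
})

# Keep only numeric columns, then compute pairwise Pearson correlations
numeric = toy.select_dtypes(include="number")
corr = numeric.corr()

# vmin/vmax fix the color scale to the full correlation range
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, linewidths=0.5)
ax.set_title("Correlation of numeric columns")

plt.close(ax.figure)
```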

6. Combining Matplotlib and Seaborn

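Because Seaborn draws on Matplotlib axes, you can use Matplotlib to fine-tune plots created with Seaborn: any pyplot call made after a Seaborn function applies to the same plot.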

plt.figure(figsize=(8, 5))
sns.histplot(df['total_bill'], kde=True, bins=20, color='blue')

plt.xlabel('Total Bill')
plt.ylabel('Count')
plt.title('Histogram with KDE using Seaborn')

plt.grid(True)
plt.show()

Conclusion

  • Matplotlib is great for detailed custom visualizations.
  • Seaborn simplifies statistical data visualization.
  • Both libraries can be used together for powerful visual storytelling.