
Data Visualization

Phase 3: Data Visualization (Month 4)

Goal: Tell Stories with Data

Why?

  • 80% of DS interviews ask: "Walk me through your plot"
  • 1 chart > 1000 rows
  • Strong storytelling can add $10K+ to your salary

| Week | Focus | Hours |
|------|-------|-------|
| 1 | Python Plotting (Matplotlib/Seaborn) | 35 |
| 2 | EDA + Storytelling | 35 |
| 3 | Tableau Public Mastery | 35 |
| 4 | Capstone: Executive Dashboard | 30 |

Week 1: Python Plotting – Matplotlib & Seaborn

Core Libraries

pip install matplotlib seaborn plotly

Essential Plot Types

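| Plot | Use | Code |
|------|-----|------|
| Line | Trends | sns.lineplot(x=x, y=y) |
| Bar | Compare categories | sns.barplot(x=x, y=y) |
| Histogram | Distribution | sns.histplot(data) |
| Box | Outliers, quartiles | sns.boxplot(x=x, y=y) |
| Scatter | Correlation | sns.scatterplot(x=x, y=y) |
| Heatmap | Correlation matrix | sns.heatmap(corr) |

Note: recent seaborn versions require x and y as keyword arguments.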

Pro Code Template

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load data
df = pd.read_csv("titanic.csv")

# Style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=df, x="Pclass", y="Survived", hue="Sex", ax=ax, errorbar=None)

# Labels
ax.set_title("Survival Rate by Class & Gender", fontsize=16, fontweight='bold')
ax.set_xlabel("Passenger Class", fontsize=12)
ax.set_ylabel("Survival Rate", fontsize=12)
ax.legend(title="Gender")

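# Annotate each bar with its survival rate (bar_label needs Matplotlib 3.4+).
# ax.containers holds only the drawn bars; ax.patches can also include
# zero-height legend patches in recent seaborn versions.
for container in ax.containers:
    ax.bar_label(container, labels=[f'{v:.1%}' for v in container.datavalues], fontsize=10)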

plt.tight_layout()
plt.savefig("survival_by_class_gender.png", dpi=300)
plt.show()



Week 2: EDA + Storytelling Framework

5-Second Rule: Can a busy executive understand the chart in 5 seconds?

Storytelling Framework (McKinsey Style)

graph TD
    A[Context] --> B[Insight]
    B --> C[Action]

| Step | Example |
|------|---------|
| Context | "The Titanic carried about 2,224 people (passengers and crew)" |
| Insight | "Women in 1st class: 97% survived" |
| Action | "Prioritize women & children in evacuation" |
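
One way to apply Context → Insight → Action to an actual chart is to give each step a fixed place on the figure. The sketch below is only one possible layout: `ChartStory` and `apply_story` are hypothetical names, and the text positions are assumptions that may need tuning for your own figures.

```python
from dataclasses import dataclass

import matplotlib.pyplot as plt


@dataclass(frozen=True)
class ChartStory:
    """Hypothetical container for the three storytelling steps."""
    context: str
    insight: str
    action: str


def apply_story(ax, story: ChartStory) -> None:
    """Put the insight in the title, the context under it, the action in a footer."""
    # The insight is the headline: it is what the reader should remember
    ax.set_title(story.insight, loc="left", fontsize=14, fontweight="bold", pad=24)
    # The context sits just above the axes, below the title
    ax.text(0, 1.02, story.context, transform=ax.transAxes, fontsize=10, color="gray")
    # The action goes at the bottom of the figure as the takeaway
    ax.figure.text(0.01, 0.01, f"Action: {story.action}", fontsize=10, style="italic")


story = ChartStory(
    context="The Titanic carried about 2,224 people",
    insight="Women in 1st class: 97% survived",
    action="Prioritize women & children in evacuation",
)

fig, ax = plt.subplots(figsize=(8, 5))
# Draw your chart on `ax` here, then add the story text
apply_story(ax, story)
```

Putting the insight in the title, instead of a description such as "Survival by Class", is what lets the chart pass the 5-second rule.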

EDA Checklist

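df.describe()                      # summary statistics for numeric columns
df.isnull().sum()                  # missing values per column
# numeric_only=True skips text columns such as Name and Sex, which otherwise raise an error in pandas 2.x
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")
sns.pairplot(df, hue="Survived")   # pairwise relationships, colored by outcome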
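
The non-plotting part of the checklist can be wrapped in one reusable function. This is a minimal sketch, not a complete EDA routine: `eda_summary` is a hypothetical helper, and the four-row `sample` frame is invented to mimic the Titanic column names.

```python
import pandas as pd


def eda_summary(df: pd.DataFrame) -> dict:
    """Collect the basic EDA checks into one dictionary."""
    numeric = df.select_dtypes(include="number")  # correlation only makes sense for numbers
    return {
        "shape": df.shape,
        "missing": df.isnull().sum().sort_values(ascending=False),
        "describe": numeric.describe(),
        "corr": numeric.corr(),
    }


# Tiny invented sample with Titanic-style columns
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 1],
    "Age": [22.0, 38.0, None, 35.0],
    "Sex": ["male", "female", "female", "male"],
})

summary = eda_summary(sample)
print(summary["missing"])
```

Sorting the missing-value counts puts the columns that need cleaning at the top of the output.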

Project: Titanic Survival Story

3 plots + 1 insight per plot → eda_titanic.ipynb


Week 3: Tableau Public – Drag, Drop, Wow

Install: Tableau Public (Free)

Core Skills

| Skill | How |
|-------|-----|
| Connect | CSV, Google Sheets |
| Calculated Field | IF [Pclass] = 1 THEN "Rich" ELSE "Poor" END |
| Parameters | Dynamic filters |
| Dashboard | 3+ sheets + actions |
| Story | Sequence of insights |
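
If you want to check a calculated field's logic before building it in Tableau, you can prototype it in pandas first. The sketch below mirrors the `IF [Pclass] = 1` example from the table; the `Wealth` column name and the four-row frame are invented for illustration.

```python
import numpy as np
import pandas as pd

# Tiny invented sample with a Titanic-style Pclass column
df = pd.DataFrame({"Pclass": [1, 3, 2, 1]})

# Tableau: IF [Pclass] = 1 THEN "Rich" ELSE "Poor" END
df["Wealth"] = np.where(df["Pclass"] == 1, "Rich", "Poor")
print(df)
```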

Build 3 Dashboards

| # | Dashboard | Dataset |
|---|-----------|---------|
| 1 | Sales Performance | Sample Superstore |
| 2 | Customer Segmentation | RFM Analysis |
| 3 | Funnel Analysis | E-commerce funnel |
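
The Customer Segmentation dashboard needs an RFM (recency, frequency, monetary) table as its input. As a hedged sketch, it could be prepared in pandas and exported to CSV for Tableau. The column names (`order_id`, `customer_id`, `order_date`, `amount`), the `rfm_table` helper, and the sample orders are all assumptions made for illustration.

```python
import pandas as pd


def rfm_table(orders: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Compute recency (days), frequency (orders) and monetary (total spend) per customer."""
    grouped = orders.groupby("customer_id").agg(
        last_order=("order_date", "max"),
        frequency=("order_id", "nunique"),
        monetary=("amount", "sum"),
    )
    # Recency: days between the reference date and the customer's latest order
    grouped["recency_days"] = (as_of - grouped["last_order"]).dt.days
    return grouped[["recency_days", "frequency", "monetary"]].reset_index()


# Invented sample orders
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": ["a", "a", "b", "c"],
    "order_date": pd.to_datetime(["2024-01-05", "2024-03-01", "2024-02-10", "2023-12-20"]),
    "amount": [50.0, 30.0, 120.0, 15.0],
})

rfm = rfm_table(orders, as_of=pd.Timestamp("2024-03-31"))
print(rfm)
# rfm.to_csv("rfm.csv", index=False) would produce a file Tableau Public can connect to
```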

Publish: public.tableau.com → Share link


Week 4: Capstone – Executive Dashboard

Project: "Global Happiness Report 2023"

Dataset: World Happiness Report

Deliverables (GitHub: yourname/data-viz-capstone)

data-viz-capstone/
├── data/
│   └── happiness.csv
├── python/
│   ├── eda_happiness.ipynb
│   └── plots/
│       ├── happiness_vs_gdp.png
│       └── top10_happiest.png
├── tableau/
│   ├── Happiness_Dashboard.twb
│   └── Happiness_Dashboard.png
├── streamlit/
│   └── app.py
└── README.md

1. Python: Key Insights

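# Assumes df is the happiness DataFrame already loaded in the notebook
# Top 10 happiest countries
top10 = df.nlargest(10, 'Happiness Score')
# hue + legend=False gives one color per bar without the palette deprecation warning (seaborn 0.13+)
sns.barplot(data=top10, x='Happiness Score', y='Country', hue='Country',
            palette='viridis', legend=False)
plt.title("Top 10 Happiest Countries (2023)")
plt.xlabel("Happiness Score")
# The plots/ directory must already exist
plt.savefig("plots/top10_happiest.png", dpi=300, bbox_inches='tight')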
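
The deliverables also list `happiness_vs_gdp.png`. One possible way to produce it is a scatter plot with a fitted trend line, sketched below. The column names follow the ones used elsewhere in this section; the `demo` data is synthetic and `plot_happiness_vs_gdp` is a hypothetical helper, so swap in the real DataFrame and check its column names before using it.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def plot_happiness_vs_gdp(df, out_path=None):
    """Scatter GDP against happiness with a fitted linear trend line."""
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.regplot(data=df, x="GDP per capita", y="Happiness Score",
                scatter_kws={"alpha": 0.6}, ax=ax)
    ax.set_title("Happiness vs GDP per Capita")
    if out_path is not None:
        # The target directory must already exist
        fig.savefig(out_path, dpi=300, bbox_inches="tight")
    return fig, ax


# Synthetic stand-in data, invented for illustration only
rng = np.random.default_rng(42)
gdp = rng.uniform(0.2, 1.8, size=50)
demo = pd.DataFrame({
    "GDP per capita": gdp,
    "Happiness Score": 3.0 + 2.0 * gdp + rng.normal(0, 0.5, size=50),
})

fig, ax = plot_happiness_vs_gdp(demo)
```

With the real data, calling `plot_happiness_vs_gdp(df, "plots/happiness_vs_gdp.png")` would write the file named in the project tree.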

2. Tableau: Interactive Dashboard

Sheets:

  1. Map (Happiness by Country)
  2. Scatter (GDP vs Happiness)
  3. Bar (Top/Bottom 10)
  4. Trend (Happiness over years)

Actions:

  • Filter: Region
  • Highlight: Click country

Publish: tableau.com/your-viz


3. Streamlit: Live App (Bonus)

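# streamlit/app.py
from pathlib import Path

import pandas as pd
import plotly.express as px
import streamlit as st

st.title("World Happiness Dashboard")

# Resolve the CSV relative to this file so the app works from any working directory
DATA_PATH = Path(__file__).resolve().parent.parent / "data" / "happiness.csv"
df = pd.read_csv(DATA_PATH)

# Let the user pick a region, then keep only that region's rows
region = st.selectbox("Select Region", df['Region'].unique())
filtered = df[df['Region'] == region]

fig = px.scatter(filtered, x="GDP per capita", y="Happiness Score",
                 size="Population", color="Country", hover_name="Country",
                 title=f"Happiness vs GDP in {region}")
st.plotly_chart(fig)

Run from the repository root:

streamlit run streamlit/app.py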
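
The region filter is the only real logic in the app, so it can be worth pulling into a plain function that you can check without starting Streamlit. A minimal sketch, assuming a `Region` column as in the app above; `filter_by_region` and the three-row `sample` frame are hypothetical.

```python
import pandas as pd


def filter_by_region(df: pd.DataFrame, region: str) -> pd.DataFrame:
    """Return only the rows for one region (hypothetical helper)."""
    return df[df["Region"] == region].reset_index(drop=True)


# Tiny invented frame to check the helper without starting Streamlit
sample = pd.DataFrame({
    "Country": ["Finland", "Denmark", "Japan"],
    "Region": ["Western Europe", "Western Europe", "East Asia"],
})

europe = filter_by_region(sample, "Western Europe")
print(europe["Country"].tolist())  # ['Finland', 'Denmark']
```

In `app.py`, the line that builds `filtered` could then call this helper instead of indexing the DataFrame inline.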

README.md (Portfolio Gold)

# World Happiness Dashboard

**Live**: [streamlit.app/happiness](https://yourname-happiness.streamlit.app)  
**Tableau**: [public.tableau.com](https://public.tableau.com/views/WorldHappiness2023/Dashboard)  
**Python EDA**: [notebook](python/eda_happiness.ipynb)

## Key Insights
| Insight | Action |
|-------|--------|
| GDP explains 75% of happiness | Invest in economy |
| Social support > Freedom | Build community programs |
| Nordic countries dominate top 10 | Study their policies |

## Tech
- Python: Matplotlib, Seaborn, Plotly
- Tableau Public: Interactive dashboard
- Streamlit: Live web app
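
An insight such as "GDP explains 75% of happiness" should be backed by a number computed from your own data before it goes into the README. One common reading of "explains" is the R² of a simple linear fit, which equals the squared Pearson correlation. The sketch below assumes that reading; `variance_explained` is a hypothetical helper and the four data points are invented.

```python
import pandas as pd


def variance_explained(df: pd.DataFrame, x: str, y: str) -> float:
    """R^2 of a simple linear fit of y on x (the squared Pearson correlation)."""
    r = df[x].corr(df[y])
    return r ** 2


# Invented numbers, chosen so the result is easy to verify by hand (r = 0.8)
toy = pd.DataFrame({
    "GDP per capita": [1.0, 2.0, 3.0, 4.0],
    "Happiness Score": [1.0, 3.0, 2.0, 4.0],
})

share = variance_explained(toy, "GDP per capita", "Happiness Score")
print(round(share, 2))  # 0.64
```

Report the figure you actually measure, and remember that a high R² describes association, not causation.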

Interview-Ready Plots

| Question | Your Plot |
|----------|-----------|
| "Show correlation" | sns.heatmap(corr, annot=True) |
| "Outliers?" | sns.boxplot() |
| "Trend over time?" | sns.lineplot() |
| "Compare groups?" | sns.catplot() |

Assessment: Can You Build This?

| Task | Yes/No |
|------|--------|
| Python: 5-plot EDA | |
| Tableau: Interactive dashboard | |
| Streamlit: Live filter | |
| 3 insights with actions | |
| Published + shared | |

All Yes → You’re visualization-ready!


Free Resources Summary

| Tool | Link |
|------|------|
| Python Graph Gallery | python-graph-gallery.com |
| Seaborn Examples | seaborn.pydata.org/examples |
| Tableau Public | public.tableau.com |
| Sample Superstore | tableau.com/sample-data |
| Streamlit Docs | docs.streamlit.io |

Pro Tips

  1. Never use default colors: sns.set_palette("colorblind")
  2. Annotate everything: %, n=, p<0.01
  3. Export high-res: dpi=300
  4. Tell a story: Context → Insight → Action
  5. Add to resume:

    "Built interactive Tableau dashboard with 10K+ views"

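Tips 1 to 3 can be combined in a few lines. The sketch below is one way to do it, not a required pattern: `save_annotated_counts` is a hypothetical helper, the six-row frame is invented, and the output filename is arbitrary.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def save_annotated_counts(df: pd.DataFrame, column: str, out_path: str) -> str:
    """Bar chart of category counts: colorblind palette, n= labels, 300 dpi export."""
    counts = df[column].value_counts().sort_index()
    colors = sns.color_palette("colorblind", n_colors=len(counts))  # tip 1: no default colors
    fig, ax = plt.subplots(figsize=(8, 5))
    bars = ax.bar([str(v) for v in counts.index], counts.values, color=colors)
    ax.bar_label(bars, labels=[f"n={v}" for v in counts.values])    # tip 2: annotate
    ax.set_xlabel(column)
    ax.set_ylabel("Count")
    fig.tight_layout()
    fig.savefig(out_path, dpi=300)                                  # tip 3: high-res export
    plt.close(fig)
    return out_path


# Tiny invented sample
demo = pd.DataFrame({"Pclass": [1, 3, 3, 2, 3, 1]})
path = save_annotated_counts(demo, "Pclass", "pclass_counts.png")
```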

Next: Phase 4 – Machine Learning Core

You can show data → now predict it.


Start Now:

  1. Download World Happiness Report
  2. Open Jupyter:
import pandas as pd
import seaborn as sns

df = pd.read_csv("happiness.csv")
sns.scatterplot(data=df, x="GDP per capita", y="Happiness Score", hue="Region")
  3. Save plot → Push to GitHub

Tag me when you publish your Tableau viz!
You now communicate like a senior analyst.