Plotting: matplotlib, seaborn, plotly — picking yours

Visualization in Python is fragmented. There’s no single canonical library the way pandas is canonical for DataFrames or scikit-learn is canonical for classical ML. There are three libraries that most people end up using, each built for a different audience, and the productive thing isn’t to pick one and pretend the others don’t exist — it’s to know which one to reach for given who’s going to look at the chart.

In this lesson: matplotlib, seaborn, and plotly. What each one is good at, why their defaults look the way they do, and the same chart written three times so you can see the trade-offs side by side.

matplotlib: the foundation

matplotlib has been around since 2003. John Hunter wrote it as a port of MATLAB’s plotting interface so neuroscientists could keep their muscle memory while moving to Python. That history is visible in basically every API choice the library makes — the global plt.plot, plt.title, plt.show interface is MATLAB’s. The defaults are early-2000s scientific publication: white background, blue line, sans-serif axis labels, a slightly chunky aesthetic that hasn’t aged especially well.

But matplotlib is the foundation. Seaborn is built on it. pandas’s df.plot() calls it. Most other Python plotting libraries either render through matplotlib or follow its conventions. Knowing matplotlib at the level of “I can read the code and tweak it” is non-negotiable, even if you mostly use higher-level libraries.

The pattern to use, every time, is the object-oriented interface:

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, y, label="sin(x)", color="steelblue", linewidth=2)
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A sine wave")
ax.legend()
ax.grid(alpha=0.3)
fig.tight_layout()
fig.savefig("sine.png", dpi=150)
plt.show()

fig is the whole figure (the canvas). ax is the axes (the actual plotting region inside it). Nearly every customization happens through a method on ax; figure-wide operations such as tight_layout and savefig live on fig. This is the API you should always use. The MATLAB-style global interface (plt.plot(x, y); plt.title("...")) works, but it relies on hidden “current figure” state, breaks in subtle ways inside notebooks and pipelines, and is awkward to compose into multi-panel figures. Use fig, ax = plt.subplots().
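One practical payoff of passing ax around explicitly is that plotting code can live in ordinary functions. The sketch below assumes a helper named plot_wave, which is a made-up name for illustration and not part of matplotlib; it simply draws on whatever Axes it is handed.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_wave(ax, x, func, label):
    """Hypothetical helper: draw func(x) on the Axes it is given."""
    ax.plot(x, func(x), label=label)
    ax.set_xlabel("x")
    ax.legend()
    return ax


x = np.linspace(0, 10, 100)

# The same helper fills any Axes, so it works for one panel or several.
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
plot_wave(left, x, np.sin, "sin(x)")
plot_wave(right, x, np.cos, "cos(x)")
fig.tight_layout()
```

Because the function never touches global pyplot state, it behaves the same in a notebook, a script, or a batch job.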

For multiple panels, ask for a grid up front:

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
axes[0, 0].plot(x, np.sin(x))
axes[0, 1].plot(x, np.cos(x))
axes[1, 0].plot(x, np.tan(x))
axes[1, 1].plot(x, x ** 2)
fig.tight_layout()

When to reach for matplotlib directly: when you need pixel-level control. Annotations at specific coordinates. Multi-panel figures with shared axes. Custom legends. Anything destined for a PDF report or a paper. Matplotlib 3.x, which is what you’ll be on in 2026, has decent defaults if you set the style: plt.style.use("seaborn-v0_8-whitegrid") is a common first line that immediately makes things look less 2003. (The seaborn-v0_8- prefix is the naming used since matplotlib 3.6; older releases called the same style seaborn-whitegrid.)
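As a rough illustration of two of those cases, here is a minimal sketch of a two-panel figure with a shared x-axis and one annotation placed at data coordinates. The label text and the position of the annotation are arbitrary choices for the demo.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)

# sharex=True links the x-axis of both panels, so limits and zoom stay in sync.
fig, (top, bottom) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)
top.plot(x, np.sin(x), color="steelblue")
bottom.plot(x, np.cos(x), color="darkorange")
bottom.set_xlabel("x")

# Annotate the first peak of sin(x) at its data coordinates (pi/2, 1).
top.annotate(
    "first peak",
    xy=(np.pi / 2, 1.0),              # the point being annotated
    xytext=(3.0, 0.6),                # where the label text sits
    arrowprops=dict(arrowstyle="->"),
)
fig.tight_layout()
```

Changing the x-limits on either panel now changes them on both, which is the main reason to request shared axes up front rather than aligning panels by hand.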

seaborn: matplotlib for statistical graphics

Seaborn is a thin layer on top of matplotlib written by Michael Waskom. It does two things: it ships sensible defaults so charts look modern out of the box, and it has higher-level functions for the kinds of plots you actually want when you’re doing data analysis.

The mental model is: seaborn knows about DataFrames. You hand it a DataFrame and tell it which columns to map onto x, y, color, and shape. It does the aggregation and the layout. Underneath, it’s calling matplotlib, which means everything you learned about fig, ax still applies: axes-level functions such as sns.scatterplot return the matplotlib Axes they drew on, and figure-level functions return a grid object that exposes its axes, so you can always drop down and customize.
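A minimal sketch of that hand-off, using a small made-up DataFrame rather than a real dataset: seaborn draws onto an Axes you created, and everything after the seaborn call is plain matplotlib.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small made-up DataFrame (illustrative values only).
df = pd.DataFrame({
    "bill": [10.5, 22.0, 15.3, 30.1, 18.7, 25.4],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.0],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
})

fig, ax = plt.subplots(figsize=(6, 4))

# Axes-level seaborn functions accept ax= and return the same matplotlib Axes.
returned = sns.scatterplot(data=df, x="bill", y="tip", hue="time", ax=ax)

# From here on it is ordinary matplotlib.
ax.set_title("Tip vs bill")
ax.set_xlabel("Total bill")
ax.axhline(df["tip"].mean(), linestyle="--", color="gray")
fig.tight_layout()
```

Passing ax= is optional; without it, an axes-level function draws on the current Axes and still returns it.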

The plots you’ll use 90% of the time:

import seaborn as sns
import pandas as pd

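# Load the tips example dataset (downloaded from seaborn's online data repository on first use, then cached)
tips = sns.load_dataset("tips")

# Each call below is a separate example; run it in its own cell or on a fresh figure
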
# Scatter with color encoding
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time", style="smoker")

# Boxplot grouped by category
sns.boxplot(data=tips, x="day", y="total_bill", hue="sex")

# Heatmap (great for correlation matrices)
corr = tips.select_dtypes("number").corr()
sns.heatmap(corr, annot=True, cmap="coolwarm", center=0)

# Pairplot — every numeric column against every other
sns.pairplot(tips, hue="time")

The two parameters that show up everywhere are hue= (color encoding — pass any column, categorical or continuous) and style= (marker shape, only for categorical). Together they let you encode three or four variables at once on a 2-D plot without writing any aggregation code yourself.

Seaborn’s relplot, catplot, and displot (the “figure-level” functions) are the version that handles faceting — splitting a plot into a grid by another variable:

sns.relplot(
    data=tips,
    x="total_bill", y="tip",
    hue="day", col="time", row="sex",
    kind="scatter",
)

That’s four panels (two values of time across, two values of sex down), color-encoded by day, no manual loop. When you’re exploring a dataset and want a fast read of “how does X change across these groups,” figure-level seaborn is one of the fastest tools in Python.
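Figure-level functions return a FacetGrid rather than an Axes, which changes how you customize the result. The sketch below uses a small made-up DataFrame (so it runs without downloading anything) to show the shape of the grid and the two levels of customization.

```python
import pandas as pd
import seaborn as sns

# Made-up data with two values in each faceting column (illustrative only).
df = pd.DataFrame({
    "bill": [10.0, 20.0, 15.0, 30.0, 12.0, 25.0, 18.0, 22.0],
    "tip": [1.0, 3.0, 2.0, 5.0, 1.5, 4.0, 2.5, 3.5],
    "time": ["Lunch", "Dinner"] * 4,
    "sex": ["Male", "Male", "Female", "Female"] * 2,
})

# relplot returns a FacetGrid, not an Axes.
g = sns.relplot(data=df, x="bill", y="tip", col="time", row="sex", kind="scatter")

# One row per "sex" value, one column per "time" value.
print(g.axes.shape)

# Grid-wide customization goes through the FacetGrid...
g.set_axis_labels("Total bill", "Tip")

# ...while individual panels are ordinary matplotlib Axes.
g.axes[0, 0].set_title("top-left panel")
```

The number of panels is the product of the distinct values in the row and col columns, so faceting on high-cardinality columns gets unwieldy quickly.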

When to reach for seaborn: any analytical work where the audience is you or a colleague. EDA notebooks. Stats reports. Anything where you want it to look reasonable without negotiating with matplotlib about font sizes.

plotly: interactive plots that work on the web

matplotlib and seaborn produce static images — PNG, PDF, SVG. Plotly produces interactive charts. You hover over a point and a tooltip shows the value. You drag a region to zoom. You toggle traces on and off by clicking the legend. The output is HTML+JavaScript and renders inline in Jupyter, in VS Code’s notebook view, in a Streamlit or Dash app, or in any web page.

Plotly’s Python library has two layers. The high-level one, plotly.express, is what you’ll use for 80% of charts:

import plotly.express as px

fig = px.scatter(
    tips,
    x="total_bill", y="tip",
    color="time", symbol="smoker",
    size="size", hover_data=["day"],
    title="Tips vs total bill",
)
fig.show()

That’s the plotly equivalent of the seaborn scatterplot from earlier — same data, similar API — but the output is interactive. Hover over a point and you see all the columns in hover_data. The color= parameter takes a column directly (categorical or continuous) and plotly picks the colormap. size= maps a numeric column to marker size.

The other plot types in plotly.express: px.line, px.bar, px.box, px.histogram, px.violin, px.imshow and px.density_heatmap (heatmaps; there is no px.heatmap), px.choropleth (geographic), px.scatter_3d. For most analytical use cases you don’t need anything below this layer.

The low-level layer is plotly.graph_objects, where you build figures by composing traces:

import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=tips["total_bill"], y=tips["tip"],
    mode="markers",
    marker=dict(color="steelblue", size=8),
    name="tips",
))
fig.update_layout(title="Tips", xaxis_title="Total bill", yaxis_title="Tip")
fig.show()

You drop down to graph_objects when you need a chart shape that express doesn’t have, or when you’re composing multiple kinds of traces in one figure (a candlestick with overlaid moving averages, say), or when you’re embedding plotly inside a Dash callback that updates pieces of the figure.

When to reach for plotly: anything interactive. Dashboards (Streamlit and Dash both render plotly natively). Reports that live on the web rather than in a PDF. Notebooks where you want the reader to be able to explore. Time series with thousands of points where zooming is the actual feature. Plotly’s Python library is on version 6.x in 2026 (6.0 shipped in early 2025), mature and stable; the plotly.express API hasn’t churned much in years.

The same chart, three times

So you can see the difference, here’s a scatter of total_bill vs tip, color-coded by time of day, written three ways:

# matplotlib: explicit, low-level
fig, ax = plt.subplots(figsize=(8, 5))
for time, group in tips.groupby("time", observed=True):  # observed=True: "time" is categorical
    ax.scatter(group["total_bill"], group["tip"], label=time, alpha=0.7)
ax.set_xlabel("Total bill")
ax.set_ylabel("Tip")
ax.legend(title="Time")
ax.set_title("Tips vs bill")

# seaborn: one line, looks right by default
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")

# plotly: one line, plus interactivity
px.scatter(tips, x="total_bill", y="tip", color="time", title="Tips vs bill").show()

All three render the same data. The matplotlib version gives you total control and forces you to write the legend logic yourself. The seaborn version is the shortest path to a publishable static chart. The plotly version is the shortest path to a chart you can put on a webpage.

Saving figures, DPI, and a few practical notes

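For matplotlib and seaborn (which is matplotlib underneath), the same savefig works; with seaborn, get the Figure from whatever the call returned (ax.figure for axes-level functions, g.figure for figure-level ones):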

fig.savefig("output.png", dpi=300, bbox_inches="tight")
fig.savefig("output.pdf", bbox_inches="tight")  # vector, scales perfectly
fig.savefig("output.svg", bbox_inches="tight")

dpi=300 is the right default for anything going into a print report or a slide deck. bbox_inches="tight" trims the whitespace around the figure, which the default doesn’t do. For PDF and SVG, dpi matters only for elements that get rasterized; lines, markers, and text are stored as vectors and are resolution-independent.
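With seaborn, what you call savefig on depends on which kind of function produced the plot. A minimal sketch, assuming a made-up DataFrame and a temporary directory as the output location:

```python
import os
import tempfile

import pandas as pd
import seaborn as sns

# Made-up data (illustrative only).
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [2, 4, 3, 5],
    "group": ["a", "b", "a", "b"],
})
out_dir = tempfile.mkdtemp()  # assumption: a scratch directory for the demo

# Axes-level function: returns an Axes, so save through its parent Figure.
ax = sns.scatterplot(data=df, x="x", y="y", hue="group")
ax.figure.savefig(os.path.join(out_dir, "scatter.png"), dpi=300, bbox_inches="tight")

# Figure-level function: returns a FacetGrid, which has its own savefig.
g = sns.relplot(data=df, x="x", y="y", col="group")
g.savefig(os.path.join(out_dir, "facets.pdf"))
```

In real code you would write to a path of your choosing instead of a temporary directory.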

For plotly: fig.write_html("output.html") for the interactive version, or fig.write_image("output.png", scale=2) for static (this requires the kaleido package; install with uv add kaleido).

Figure size in matplotlib is in inches (figsize=(8, 5) is 8 inches wide, 5 tall). For a Jupyter notebook, 8x5 or 10x6 reads well. For a Streamlit dashboard, plotly’s default sizing usually just works.

A few things that bite people

Three pitfalls worth flagging because they cost everyone an hour at some point.

plt.show() behaves differently in scripts and notebooks. In a Jupyter notebook with %matplotlib inline (the default), figures render automatically when a cell ends, so plt.show() is optional. In a .py script run from the terminal, plt.show() is what blocks and opens the window. In a non-interactive context (a CI job, a server) you don’t want a window at all: set the backend to Agg before importing pyplot with matplotlib.use("Agg"), then just call fig.savefig to write to disk.
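For the headless case, a script might look roughly like this. The output path is a temporary file chosen for the demo; the ordering of the first three statements is the part that matters.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # select the non-interactive backend before importing pyplot

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, np.sin(x))
ax.set_title("Rendered without a display")

out_path = os.path.join(tempfile.mkdtemp(), "sine.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)  # release the figure; long-running jobs otherwise accumulate them
```

Setting the MPLBACKEND=Agg environment variable achieves the same thing without touching the code, which can be handier in CI configuration.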

Datetime axes need explicit formatting in matplotlib. When you plot a pandas Series with a DatetimeIndex, matplotlib does the right thing about 80% of the time. The other 20% you get overlapping date labels that need fig.autofmt_xdate() or a custom mdates.DateFormatter. Seaborn and plotly tend to handle this more gracefully out of the box.
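When the default labels do collide, one possible fix is sketched below, using a made-up daily series. The monthly tick spacing and the "%b %Y" label format are arbitrary choices for the demo.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up daily series (illustrative only).
dates = pd.date_range("2025-01-01", periods=120, freq="D")
values = np.cumsum(np.random.default_rng(0).normal(size=len(dates)))

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(dates, values)

# One tick per month, labelled like "Jan 2025".
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

# Rotate and right-align the date labels so they do not overlap.
fig.autofmt_xdate()
```

If you would rather not pick a format string, mdates.ConciseDateFormatter takes a locator and produces compact labels on its own.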

Plotly’s fig.show() behaves differently depending on the environment. In a notebook it renders inline. In a plain Python REPL it tries to open a browser tab. Inside Streamlit you don’t call fig.show() at all; you call st.plotly_chart(fig). Inside Dash, the figure goes into a dcc.Graph(figure=fig). Knowing the rendering target before you write the code saves a confusing debugging session.

Picking one

The decision tree, end to end:

Static chart for a paper, slide, or PDF report? Matplotlib (or seaborn for a faster start, then drop to matplotlib for tweaks).
Quick analytical chart in a notebook? Seaborn. One line, looks right, you move on.
Interactive chart for a web app or shareable HTML? Plotly Express.
Dashboard? Plotly inside Streamlit or Dash.

You’ll end up with all three installed in any serious data project. There’s no shame in mixing — a notebook that builds an analysis with seaborn and finishes with one polished plotly chart for the executive summary is a perfectly normal artifact in 2026.

A note on what I deliberately left out: Altair (declarative, Vega-Lite-based, lovely for statistical graphics but smaller community), Bokeh (older interactive library that plotly has mostly displaced for new projects), and HoloViews (high-level wrapper that renders through Bokeh, matplotlib, or plotly). They’re all fine choices, and Altair in particular has a devoted following, but if you’re starting fresh in 2026 the matplotlib/seaborn/plotly trio covers more than 95% of what you’ll need and matches what your colleagues already know.

The next lesson — SciPy — is the last piece of the numerical foundation before we move into Module 9 and start applying this stack to real ML and analytics work.

Reference: matplotlib documentation, seaborn documentation, Plotly Python documentation, retrieved 2026-05-01.