Python, from the ground up Lesson 46 / 60

The Python features I learned too late

Match statements, the walrus operator, f-string debugging, dataclasses, and other Python features that would have saved me hours if I'd known about them sooner.

Every language has a graveyard of features you discover years after you needed them. You read a colleague’s PR, see a one-liner where you’d have written six lines, and quietly mutter, “wait, that’s been there this whole time?” Python has a particularly long list, partly because the language keeps shipping quality-of-life improvements every release and partly because most of us learned Python from a book or course that froze its idioms at 3.6.

This lesson is the dump file from my own embarrassment. None of these are exotic. All of them are in the standard library or built-in. Most of them save you between two and twenty lines per occurrence, and they compound across a codebase.

A note on AI assistants before we start. Tools like Copilot and Cursor are excellent at suggesting these features once they appear in your code; the autocomplete will happily extend a match block or a singledispatch decorator. What they almost never do is volunteer them when you’re writing fresh code. The training set is dominated by older idioms, so the default suggestion is if/elif, os.path, and a hand-rolled __init__. You have to know these features exist and reach for them yourself. The AI catches up; it doesn’t lead.

The practical move: when you’re stuck on something verbose, ask the assistant directly, “is there a stdlib feature for this?” Phrased that way, it’ll point you at ChainMap or singledispatch or whatever fits. It’s a different mode from autocomplete — you’re consulting an oracle, not letting it drive. The list below exists precisely so you have a stock of features to ask about.

match revisited (PEP 634)

We touched match back in lesson 3, but it deserves a second look now that you’ve spent forty-three lessons writing real code. The moment to reach for match is when you find yourself unpacking a dict or tuple inside an if/elif chain:

def handle(event: dict) -> None:
    match event:
        case {"type": "click", "x": int(x), "y": int(y)}:
            click(x, y)
        case {"type": "key", "key": str(k), "modifiers": [*mods]}:
            keypress(k, mods)
        case {"type": "resize", "size": (w, h)}:
            resize(w, h)
        case _:
            raise ValueError(f"unknown event: {event}")

You’re not just branching; you’re destructuring and validating shape in the same step. The int(x) syntax is a class pattern: it matches and binds only if the value is an int. Mapping patterns ignore extra keys, and (w, h) and [*mods] are sequence patterns that accept any list or tuple of the right length. I have refactored honest-to-god 80-line dispatchers into 15 lines of match (3.10+), and the bug count went down because the patterns force you to enumerate cases.

The walrus, used sparingly

Lesson 3 introduced :=. Where it earns its keep in real code is reading streams and filtering comprehensions:

while chunk := stream.read(4096):
    process(chunk)

hits = [m for line in log if (m := PATTERN.search(line))]

Don’t sprinkle it everywhere. The rule I use: walrus when it eliminates a duplicated call, otherwise plain assignment.
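
A small before-and-after for that rule, as a sketch. PATTERN and the log lines are invented for the example.

```python
import re

# Hypothetical pattern and data, just for illustration.
PATTERN = re.compile(r"ERROR (\d+)")
log = ["ok", "ERROR 404 not found", "ERROR 500 boom", "done"]

# Without the walrus: search() runs twice for every matching line.
codes_twice = [
    PATTERN.search(line).group(1) for line in log if PATTERN.search(line)
]

# With the walrus: one search() per line, and the match object is reused.
codes_once = [m.group(1) for line in log if (m := PATTERN.search(line))]
```

Both lists are identical; the second does half the regex work on matching lines.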

f"{x=}" for debugging

Once you internalize this, your print statements get half as long:

total = 1742
ratio = 0.318
print(f"{total=} {ratio=:.2%}")
# total=1742 ratio=31.80%

The = after the expression (3.8+) prints both the source text and the value, using repr() unless you supply a format specifier. Format specifiers still work. I delete more of these than I write, but during a hunt they are gold.
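
A few variations worth knowing, sketched with made-up variable names:

```python
name = "Ada"
items = [3, 1, 2]

a = f"{name=}"          # repr() is the default, so the quotes show up
b = f"{name=!s}"        # !s switches to str()
c = f"{len(items)=}"    # any expression works, not just a variable name
d = f"{max(items) = }"  # whitespace around = is preserved in the output
```

These produce "name='Ada'", "name=Ada", "len(items)=3", and "max(items) = 3" respectively.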

@dataclass(frozen=True, slots=True)

We met dataclasses in lesson 9. The two flags I underused for years:

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class Point:
    x: float
    y: float

frozen=True makes the instance immutable; together with the default eq=True it also generates __hash__, so instances are safe to use as dict keys or set members as long as their fields are hashable. slots=True (3.10+) skips the per-instance __dict__, which cuts per-instance memory substantially and makes attribute access slightly faster. For value objects you create in volume (coordinates, prices, tokens) this combination is the right default.

pathlib.Path instead of os.path

Covered in lesson 7, but most working Python code on the internet still uses os.path.join and string concatenation. Path makes it readable:

from pathlib import Path

config = Path.home() / ".config" / "myapp" / "config.toml"
if config.exists():
    text = config.read_text(encoding="utf-8")

for log in Path("logs").glob("*.log"):
    print(log.stem, log.stat().st_size)

/ joins, .read_text() opens-reads-closes, .glob matches a pattern in one directory (use .rglob to recurse into subdirectories). Stop reaching for os.path. The only reason to use it now is interoperating with a library that demands strings, in which case str(path) ends the discussion.
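
One detail that bit me: glob("*.log") only looks in the directory itself, while rglob recurses. A minimal sketch using a throwaway temporary directory (the file names are invented):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.log").write_text("top\n", encoding="utf-8")
    (root / "archive").mkdir()
    (root / "archive" / "old.log").write_text("nested\n", encoding="utf-8")

    # glob: only entries directly inside root.
    top_level = sorted(p.name for p in root.glob("*.log"))

    # rglob: root and every subdirectory below it.
    recursive = sorted(p.name for p in root.rglob("*.log"))
```

top_level contains only app.log; recursive also picks up archive/old.log.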

functools.singledispatch

Polymorphism without inheritance. You write a generic function, then register implementations per type:

from functools import singledispatch
from datetime import datetime

@singledispatch
def serialize(value) -> str:
    raise TypeError(f"no serializer for {type(value).__name__}")

@serialize.register
def _(value: int) -> str:
    return str(value)

@serialize.register
def _(value: datetime) -> str:
    return value.isoformat()

@serialize.register
def _(value: list) -> str:
    return "[" + ", ".join(serialize(v) for v in value) + "]"

The dispatch is by the runtime type of the first argument. This is what you actually want most of the time you reach for isinstance chains, and the type hints on the registered functions double as documentation.
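
Two behaviors are worth seeing once: unregistered types fall back to the generic function, and subclasses dispatch to the closest registered base. A self-contained sketch, with a hypothetical label function instead of the serializer above:

```python
from functools import singledispatch


@singledispatch
def label(value) -> str:
    # Generic fallback for any type without a registered implementation.
    return f"object: {value!r}"


@label.register
def _(value: int) -> str:
    return f"int: {value}"


@label.register
def _(value: list) -> str:
    return f"list of {len(value)}"


from_int = label(7)
from_bool = label(True)   # bool subclasses int, so the int version runs
from_list = label([1, 2])
from_str = label("hi")    # nothing registered for str, so the fallback runs
```

The bool case is the one to remember: if you need booleans handled differently from ints, register bool explicitly.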

contextlib.suppress

The cleanest catch-and-ignore in Python:

from contextlib import suppress
from pathlib import Path

with suppress(FileNotFoundError):
    Path("cache.json").unlink()

That replaces a four-line try/except/pass. Use it only when you genuinely want silence; never use it as a swallow-everything except Exception.
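
One thing to keep in mind is where execution resumes: after the whole with block, not at the next line inside it. A tiny sketch, using a deliberately missing dict key instead of a file:

```python
from contextlib import suppress

steps = []

with suppress(KeyError):
    steps.append("before")
    {}["missing"]            # raises KeyError, which suppress swallows
    steps.append("after")    # never runs: the block was abandoned

steps.append("outside")

# Only the listed exception types are suppressed; anything else propagates.
try:
    with suppress(KeyError):
        int("not a number")
except ValueError:
    propagated = True
else:
    propagated = False
```

So keep the block to the single statement you expect to fail.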

enumerate(start=1)

Tiny one. Most of the time when you enumerate, you want zero-indexed positions for arrays. Some of the time — printing, line numbers, user-facing rank — you want one-indexed:

for rank, name in enumerate(leaderboard, start=1):
    print(f"{rank}. {name}")

You will never write i + 1 in a print statement again.

dict.get and dict.setdefault

dict.get(key, default) is well-known. dict.setdefault is the one people miss:

groups: dict[str, list[str]] = {}
for word in words:
    groups.setdefault(word[0], []).append(word)

That’s a one-pass groupby without defaultdict. If you do reach for defaultdict, fine; but setdefault keeps the dict a plain dict, so a later lookup of a missing key raises KeyError instead of silently inserting an empty list.
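
One practical difference between the two approaches shows up when you read a key that isn't there. A sketch with made-up words:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana"]

plain: dict[str, list[str]] = {}
for word in words:
    plain.setdefault(word[0], []).append(word)

auto: defaultdict[str, list[str]] = defaultdict(list)
for word in words:
    auto[word[0]].append(word)

same_groups = plain == auto  # both hold the same groups at this point

# Reading a missing key: the plain dict raises...
try:
    plain["z"]
except KeyError:
    plain_raised = True
else:
    plain_raised = False

# ...while the defaultdict quietly creates and stores an empty list.
auto["z"]
auto_grew = "z" in auto
```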

collections.ChainMap

Layered configuration without writing the layering logic:

from collections import ChainMap

defaults = {"timeout": 30, "retries": 3, "host": "localhost"}
env = {"host": "prod.example.com"}
cli = {"timeout": 5}

config = ChainMap(cli, env, defaults)
config["timeout"]  # 5  (from cli)
config["host"]     # 'prod.example.com'  (from env)
config["retries"]  # 3  (from defaults)

Lookup walks left-to-right, returning the first hit. Writes go to the leftmost mapping. This is the pattern every config library reinvents poorly; the standard library has it for free.
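
The write behavior is easy to verify. In this sketch the values are invented, and new_child is the ChainMap method for pushing a temporary layer on the front:

```python
from collections import ChainMap

defaults = {"timeout": 30, "retries": 3}
cli = {"timeout": 5}
config = ChainMap(cli, defaults)

# Writes land in the leftmost mapping; defaults stays untouched.
config["retries"] = 10

# new_child returns a new ChainMap with an extra layer in front,
# leaving the original chain as it was.
override = config.new_child({"timeout": 1})

# dict() flattens the layers into one plain dict, first hit wins.
flat = dict(config)
```

After this runs, cli holds the new retries value, defaults still says 3, and only override sees timeout 1.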

breakpoint() (3.7+)

If you still write import pdb; pdb.set_trace(), stop. The built-in breakpoint() does the same thing by default, and it respects the PYTHONBREAKPOINT environment variable, so you can swap in ipdb or pudb (for example PYTHONBREAKPOINT=ipdb.set_trace) or disable breakpoints entirely in production by setting it to 0. Drop it in your code and run.
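
Under the hood, breakpoint() just calls sys.breakpointhook, and the default hook is what reads PYTHONBREAKPOINT. The sketch below swaps in a recording hook so it runs without opening a debugger; recording_hook and risky_division are hypothetical names for the demo, not something you'd ship.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    # Stand-in for a debugger: just note that the breakpoint was reached.
    calls.append("breakpoint reached")


def risky_division(a, b):
    if b == 0:
        breakpoint()  # delegates to whatever sys.breakpointhook currently is
        return None
    return a / b


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    result = risky_division(1, 0)
finally:
    # Always restore the real hook so later breakpoint() calls behave normally.
    sys.breakpointhook = original_hook
```

In day-to-day use you would leave the hook alone and control behavior from the environment variable instead.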

Structural unpacking

This is the underused star of modern Python:

first, *middle, last = [1, 2, 3, 4, 5]
# first=1, middle=[2, 3, 4], last=5

head, *tail = "hello"
# head='h', tail=['e', 'l', 'l', 'o']

Combine it with match and you have something close to a functional pattern-matching language. The same star collects extra positional arguments in a function signature:

def head_count(first, *rest):
    return first, len(rest)

* and / in function signatures

You can force arguments to be keyword-only or positional-only:

def write_record(record, /, *, encoding="utf-8", overwrite=False):
    ...

Everything before / is positional-only (3.8+); everything after * is keyword-only. The first protects you from API consumers who pass record=... and lock you into the parameter name. The second forces encoding= at the call site, which prevents the classic bug of swapping two boolean flags. Adopt this on any function that takes more than one argument and a whole class of bugs turns into an immediate TypeError at the call instead of silently wrong behavior.
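
Here is what the misuse looks like at runtime. In this sketch write_record just returns its arguments so the effect is visible; the real body is whatever your function does.

```python
def write_record(record, /, *, encoding="utf-8", overwrite=False):
    # Placeholder body: echo the arguments back for demonstration.
    return record, encoding, overwrite


good = write_record("row", overwrite=True)

errors = []
bad_calls = (
    lambda: write_record("row", "latin-1"),  # encoding passed positionally
    lambda: write_record(record="row"),      # record passed by keyword
)
for call in bad_calls:
    try:
        call()
    except TypeError as exc:
        errors.append(type(exc).__name__)
```

Both bad calls fail loudly with TypeError the moment they execute.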

A few more worth a paragraph each

itertools.pairwise (3.10+) gives you consecutive overlapping pairs from any iterable, which is exactly what you want when you’re computing differences between adjacent elements:

from itertools import pairwise

for prev, cur in pairwise([1, 4, 9, 16, 25]):
    print(cur - prev)  # 3, 5, 7, 9

Before this existed I wrote zip(seq, seq[1:]) for fifteen years.

functools.cache (3.9+, a thin wrapper around lru_cache(maxsize=None)) memoizes a function’s return value:

from functools import cache

@cache
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

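You don’t need a decorator factory like @lru_cache(maxsize=128) unless you actually need eviction. @cache is the friendlier default. Just remember that it never evicts, so the cache grows with every distinct set of arguments, and those arguments must be hashable.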
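
The decorated function also carries a couple of introspection helpers, cache_info() and cache_clear(), which are handy when you want to confirm the cache is doing anything. A sketch:

```python
from functools import cache


@cache
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)

# One miss per distinct n from 0 through 30; everything else was a cache hit.
info = fib.cache_info()

# Drop everything, e.g. between tests or when memory matters.
fib.cache_clear()
after_clear = fib.cache_info()
```

maxsize shows up as None in the info tuple, which is the "never evicts" part.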

str.removeprefix and str.removesuffix (3.9+) replace a depressing amount of if s.startswith(p): s = s[len(p):] code:

"https://example.com".removeprefix("https://")  # 'example.com'
"image.png".removesuffix(".png")                # 'image'

zoneinfo (3.9+) made pytz obsolete for nearly everyone:

from datetime import datetime
from zoneinfo import ZoneInfo

now = datetime.now(ZoneInfo("Europe/Rome"))

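Standard library, IANA database, and on most systems no external dependency; where the OS ships no time zone database (notably Windows), install the first-party tzdata package from PyPI. Most older code still imports pytz. You don’t need to.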

The pattern

Every feature here trades a few keystrokes of typing for a meaningful reduction in what could go wrong. Frozen slots dataclasses can’t be mutated by accident. Keyword-only arguments can’t be swapped. match patterns can’t silently pass through an unexpected shape. pathlib doesn’t care about your OS’s path separator. removeprefix doesn’t off-by-one your slicing.

There’s a temptation, when you discover a list like this, to refactor a whole codebase in one weekend. Don’t. The cost-benefit isn’t there, and you’ll annoy reviewers. Instead, adopt one or two new features per PR, in the natural course of work. Within a quarter your default style has shifted, and the older code that’s still using os.path.join and Optional[int] either gets touched and modernized incidentally, or gets left alone because it isn’t on a hot path. Both outcomes are fine.

The deeper lesson is the one I take from each cycle of “wait, that’s been there the whole time?”: Python keeps shipping. Read the release notes. Skim the “What’s New in Python 3.X” page when a release lands. It’s a 20-minute investment that prevents 20 hours of writing things by hand five years later.

Pick two from this list you haven’t used yet. Use them in your next PR. The reaction the first time a colleague reads it will be exactly the one I had: “wait, that’s been there the whole time?”
