Capstone: what you know now, where to go next

You made it. Sixty lessons, ten modules, a few hundred code blocks, and what should now be a working mental map of how modern Python is actually used to build things.

This last lesson does three things. It looks back at what we covered, in plain English, so you can see the shape of the curriculum from above. It looks forward at the resources I recommend for taking each piece deeper. And it ends with the only piece of advice that actually matters at this stage, which is the one about going and building something.

What the course covered

Module 1 — Modern Python you should write in 2026

Type hints, dataclasses, structural pattern matching, walrus operators, modern string formatting, the standard library bits that don’t get enough press. The core message of the module: the Python you may have learned in 2018 isn’t quite the Python that ships well in 2026. The language has tightened. Type-checked code is now the default in serious projects, not a hipster preference. You should write it that way too.

Module 2 — The standard library

pathlib, collections, itertools, functools, dataclasses, enum, datetime, re, subprocess, logging, argparse, the new tomllib, the underrated statistics module. Most production scripts are 80% standard library and 20% third-party. Knowing what’s already in the box keeps your dependency list short and your code maintainable.

Module 3 — Packaging + project structure

uv, pyproject.toml, src layouts, version pinning, lockfiles, building wheels, publishing to PyPI, the modern alternative to the old setup.py / requirements.txt mess. Packaging used to be the worst part of Python. With uv and modern PEPs it’s now genuinely fine. A Python project in 2026 has one config file and reproducible installs. Don’t accept less.

Module 4 — Testing, quality, AI workflow

pytest, ruff, mypy, fixtures, parametrization, mocking, property-based testing with Hypothesis, the AI-pair-programming workflow that changes how you write code day to day. The piece most underrated by self-taught engineers: a clean test suite is the asset that lets you change everything else without fear. The piece most underrated by senior engineers: an AI tool you’ve integrated thoughtfully into your workflow shifts your output by something like 1.5-3x on green-field work.

Modules 5-6 — Pandas mastery

Data structures, indexing, groupby, joins, time series, reshaping, the SettingWithCopyWarning, the new copy-on-write semantics, performance tuning, when to drop to NumPy, when to drop to Polars or DuckDB instead. Pandas is still the default tool for in-memory tabular work. By the end of these two modules you should be reading and writing it without consulting a cheat sheet.

Module 7 — Data engineering

ETL design, ingestion patterns, working with REST APIs, async programming with asyncio and httpx, orchestration with Prefect/Airflow, idempotency, retries, schemas. The shift from “I can analyze data” to “I can move data reliably between systems” is the shift from analyst to engineer. This module is where that happens.
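
Idempotency and retries are easiest to remember as a pair, so here is a minimal sketch of the two working together, using only the standard library. The `retry` helper, the in-memory `store`, and the `flaky_load` step are all hypothetical stand-ins. A real pipeline would more likely lean on its orchestrator's retry settings and a database upsert.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")


# A stand-in destination keyed by record id, so writing twice is harmless.
store: dict[int, str] = {}
calls = 0


def flaky_load() -> int:
    """Hypothetical load step that fails once, after the write has landed."""
    global calls
    calls += 1
    store[1] = "alice"  # upsert: same key, same value, no duplicate row
    if calls == 1:
        raise ConnectionError("simulated network drop after the write")
    return len(store)


# Pass a no-op sleep so the example runs instantly.
rows = retry(flaky_load, sleep=lambda _: None)
print(rows, calls, store)
```

The first attempt writes and then fails, which is exactly the case that makes naive retries dangerous. Because the write is keyed by id, the second attempt overwrites the same row instead of adding a duplicate.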

Module 8 — Numerical Python

NumPy proper, vectorization, broadcasting, linear algebra, plotting with Matplotlib, scientific computing with SciPy, working in Jupyter without your notebook turning into a mess. The substrate that everything else in scientific Python — Pandas, scikit-learn, PyTorch — sits on top of. You don’t need to be a numerical analyst, but you need to know what axis=0 does without thinking.

Module 9 — Machine learning

scikit-learn, pipelines, feature engineering, tree models, linear models, hyperparameter tuning with Optuna, evaluation, SHAP for interpretation, deployment with FastAPI. End-to-end classical ML, the kind that pays the bills. This is the module that turns “I know how to fit a random forest” into “I can ship a model behind an HTTP endpoint.”

Module 10 — Deep learning, in the 2026 shape

PyTorch fundamentals, training loops, transfer learning with Hugging Face, LoRA fine-tuning, the AI-vs-ML decision, RAG. The shortest module of the course because the field moves so fast that lesson-grade detail goes stale within months. The point of the module wasn’t to make you a deep-learning specialist; it was to give you a clear-eyed map of when the current AI tools are the right answer and when they aren’t.

Where you are now

If you’ve actually worked through the lessons rather than skimmed them, you can:

Set up a Python project from scratch in five minutes — uv init, type hints, tests, lint, CI.
Take a CSV, a database table, or an API and turn it into a clean, well-typed, well-tested data pipeline.
Pick the right tool for an analysis: Pandas for medium data, Polars or DuckDB for bigger, NumPy when you need raw vectorization.
Build a classical ML model end-to-end, evaluate it honestly, and deploy it as a service.
Decide between calling a hosted LLM, fine-tuning an open one with LoRA, and training a classical model — and explain the decision to a non-technical stakeholder.
Write code that’s readable to your future self and to the people who’ll inherit your work.

That’s a working data engineer or ML engineer’s day-to-day toolkit. If you can do all of the above, you’re employable in this field, and not at the bottom rung. The thing that turns it from “employable” to “valuable” is the next year of practice on real problems, not another course.

Where to go next

The honest map of the field, as I’d recommend it to a friend.

Python internals

If you want to understand why Python behaves the way it does (why threads are weird, why dictionaries got faster and more compact in 3.6, what the GIL actually does), start with Brett Cannon’s online talks and articles (he’s a CPython core dev who explains the language clearly) and Anthony Shaw’s “CPython Internals” book. The Python “What’s New In…” documentation pages, read every release, are also remarkably good: they’re written by the people who wrote the changes.

Data engineering

The single most important book in the field is Martin Kleppmann’s “Designing Data-Intensive Applications.” It’s database-agnostic, language-agnostic, and the closest thing the discipline has to required reading. After that: “Fundamentals of Data Engineering” by Joe Reis and Matt Housley for the modern stack overview, and reading the dbt documentation cover-to-cover for a working sense of how analytical pipelines are built today. Pair these books with the SQL Server course on this site if you want a deep production-database angle, or the PySpark course if you want to do this at terabyte scale.

Machine learning

Three resources, in order: Aurélien Géron’s “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” is the best single book for working ML practitioners; the third edition is current. Sebastian Raschka’s “Machine Learning with PyTorch and Scikit-Learn” covers similar ground with a stronger PyTorch slant. The fastai course (free, online) is opinionated, accessible, and produces practitioners who ship faster than most academic graduates.

Deep learning

If you want to actually understand transformers and modern neural networks at the level where you can debug them: Andrej Karpathy’s “Neural Networks: Zero to Hero” YouTube series is the best resource on the planet. He builds a GPT from scratch, in PyTorch, in a way you can follow line by line. Hugging Face’s NLP course is the practical companion — it teaches the library you’ll actually use, with the conventions that match the rest of the ecosystem.

Software engineering generally

The single highest-leverage habit is to read the source of popular Python projects. Pick a library you use and respect — httpx, click, rich, FastAPI, pydantic, pytest — and just read it. Not the tutorial; the source. You’ll learn more about how production Python is structured in two weekends of reading well-written libraries than in a year of generic tutorials. Most well-known Python libraries are remarkably well-documented at the code level, and most are 5,000-30,000 lines — readable in a weekend if you focus.

For broader software engineering culture: John Ousterhout’s “A Philosophy of Software Design” is short and worth re-reading every year. It disagrees with “Clean Code” on most things, and Ousterhout is right about most of them. “The Pragmatic Programmer” is a classic for a reason.

Career skills, since you asked

The data-engineer / ML-engineer skill stack of 2026, in rough order of how often it appears in job descriptions:

Python, well, including types, tests, packaging.
SQL, well, on at least one engine — Postgres, SQL Server, BigQuery, Snowflake, anything serious.
A cloud platform at the level of “I can stand up a queue, a container, a database, and a scheduled job in it.” AWS, GCP, or Azure — pick one and learn it deeply.
Pandas + at least one of Polars / DuckDB / PySpark, depending on data scale.
Containers and CI: Docker, GitHub Actions, the basic shape of Kubernetes (you don’t have to love it).
Orchestration: Airflow, Prefect, or dbt for analytical pipelines. The names rotate but the shape doesn’t.
Modern AI/ML: scikit-learn, PyTorch, the Hugging Face stack, the major LLM SDKs (OpenAI, Anthropic).
Communication. The people who actually move up are the ones who can explain technical decisions to non-technical stakeholders. This is a skill you can practice deliberately. It is not a personality trait.

What hiring managers actually look for at the senior level isn’t more tools. It’s judgment: knowing when to use what, knowing when not to use the new thing, knowing how to read a problem and pick a stack that matches it. That judgment grows from shipping things and watching them succeed or fail. It does not grow from courses, including this one. The course was the prerequisite. The practice is the work.

The cross-pollination angle

If you’ve finished this course, the obvious next step on this site is one of the companion tracks. The SQL Server course drills the production-database side that this course only touches lightly — backup, recovery, indexing, the operational reality of a database under load. The PySpark course scales the data-engineering toolbox up to terabyte-class workloads. Both share assumptions and conventions with this one. Together, the three give you the working stack of a 2026 data platform engineer.

You don’t need all three. You probably need pieces of two of them. Pick based on what your work actually involves.

A note on the AI thing

I want to leave you with something specific about the moment you’re learning Python in.

The 2024-2026 wave of AI tools shifted what Python work feels like. You can prompt your way to a working FastAPI app in an afternoon. You can have an LLM write a regex you’d have spent twenty minutes on. You can paste a stack trace and get an explanation that your colleague might have given you in 2018. This is real. It’s not hype. The leverage is real and you should use it.

It’s also not a substitute for understanding the code that comes out. The engineers who get the most out of these tools are the ones who could write the code themselves and use the AI to skip the boring parts. The ones who can’t tell when the AI’s wrong eventually ship something embarrassing, or worse, something dangerous.

This course was designed on the assumption that you’d use AI tools alongside it. Lesson 23 was explicit about it. None of what you’ve learned is invalidated by the AI tools; the AI tools are how you operationalize what you’ve learned at greater speed. If you finish this course feeling like you don’t need the foundations because the AI will write the code for you, you missed the point. If you finish it feeling like the foundations are the thing that lets you direct the AI well, you got it.

Go build something

Here’s the thing nobody tells you about engineering courses: at the end of one, you’re not yet a working engineer. You’re a person who’s read about engineering. The transition to being one happens the first time you ship something real, take responsibility for it, watch it break in production, fix it, watch it break again differently, fix that too, and slowly build the catalog of “things that go wrong” that experienced engineers carry around in their heads. That catalog is most of what experience actually is.

You can’t get it from a course. You can only get it from building things people use.

So pick something — a side project, a tool you actually want for yourself, a problem at your day job that’s been bothering you, a contribution to an open-source library you use, a technical blog where you write up things you’ve figured out. Anything that ships. The size doesn’t matter. The shipping does. Do it once. Then do it again. Then do it again, this time paying attention to the friction you hit and writing down what you’d do differently next time.

Ten projects from now you’ll be a different engineer. Twenty projects from now, you’ll have opinions, and they’ll be informed ones. Fifty projects from now, you’ll be the person other engineers come to for advice, because you’ll have actually accumulated experience instead of just hours.

That’s the path. There isn’t a shortcut. The course got you to the trailhead.

Thanks for reading

Sixty lessons is a lot of words. Thank you for sticking through them. If parts of this course made your work easier, or made you change how you think about a piece of the field, that’s the highest compliment I can hope for. If you found errors, have suggestions, or want to argue about whether pathlib is overrated — say hi. I read every email.

Now close the browser and go ship something.

— Narcis

References: Brett Cannon’s blog (https://snarky.ca/), Anthony Shaw “CPython Internals” (Real Python, 2021), Martin Kleppmann “Designing Data-Intensive Applications” (O’Reilly, 2017), Joe Reis & Matt Housley “Fundamentals of Data Engineering” (O’Reilly, 2022), Aurélien Géron “Hands-On Machine Learning” 3rd ed. (O’Reilly, 2022), Sebastian Raschka “Machine Learning with PyTorch and Scikit-Learn” (Packt, 2022), fastai course (https://course.fast.ai/), Andrej Karpathy “Neural Networks: Zero to Hero” YouTube series, Hugging Face NLP course (https://huggingface.co/learn/nlp-course), John Ousterhout “A Philosophy of Software Design” 2nd ed. (Yaknyam Press, 2021). Retrieved 2026-05-01.