Python, from the ground up Lesson 16 / 60

Project layout: src/ vs flat, where tests go, where scripts live

The layout decisions that affect imports, testing, and packaging — and the convention that's won in 2026.

You can write Python that runs without ever thinking about where files go. A single script.py works fine. But the moment you want to test it, package it, share it, or just stop tripping over yourself, layout starts to matter.

There are conventions for all of this, and by 2026 most of the dust has settled. This lesson is the map.

Two layouts you’ll see

Open a hundred Python repos and you’ll see two patterns.

Flat layout:

weather-cli/
├── pyproject.toml
├── README.md
├── weather_cli/
│   ├── __init__.py
│   ├── api.py
│   └── cli.py
└── tests/
    ├── conftest.py
    └── test_api.py

The package directory sits at the project root, sibling to tests/ and pyproject.toml. Simple. Visible at a glance.

src/ layout:

weather-cli/
├── pyproject.toml
├── README.md
├── src/
│   └── weather_cli/
│       ├── __init__.py
│       ├── api.py
│       └── cli.py
└── tests/
    ├── conftest.py
    └── test_api.py

The package lives one level deeper, inside a src/ directory. Slightly more typing. Different in one important way.

Why src/ won

The flat layout has a quiet trap: when you run python from the project root, the current directory is on sys.path. So import weather_cli works whether or not the package is installed. That sounds convenient, until you ship a wheel that’s missing a file. Your local tests pass. CI passes (it imports from the working tree too). You upload to PyPI, a user installs it, and — surprise — ImportError, because that file you forgot to add to the package never made it into the wheel.

The src/ layout removes the working directory from the import equation. src/ isn’t a package (no __init__.py directly inside it), and weather_cli sits one level too deep to be found through the project root on sys.path. To import weather_cli, you have to install the package first, usually with uv sync or pip install -e ., which runs your build backend against your packaging config. If the package isn’t declared correctly there, you find out immediately, on your own machine, instead of in a bug report. (An editable install still points back at the files in src/, so installing the built wheel in CI is the stricter check.)
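
The difference can be checked directly. The sketch below is a minimal demonstration, not part of the lesson’s project: it builds two throwaway projects in a temporary directory (the package name demo_layout_pkg is made up for illustration) and asks a fresh interpreter, started from each project root, whether the import succeeds with nothing installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_layout_pkg"  # hypothetical name, chosen to avoid clashing with anything installed


def make_project(root: Path, use_src: bool) -> Path:
    """Create an empty package in either flat or src/ layout and return the project root."""
    parent = root / "src" if use_src else root
    package_dir = parent / PACKAGE
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    return root


def importable_from_root(project_root: Path) -> bool:
    """Start a fresh interpreter in project_root and try the import, with nothing installed."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {PACKAGE}"],
        cwd=project_root,
        capture_output=True,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    flat_ok = importable_from_root(make_project(Path(tmp) / "flat", use_src=False))
    src_ok = importable_from_root(make_project(Path(tmp) / "with_src", use_src=True))

print(f"flat layout importable without install: {flat_ok}")
print(f"src/ layout importable without install: {src_ok}")
```

With default interpreter settings the flat project imports and the src/ project does not, which is exactly the safety net described above. Settings that keep the working directory off sys.path (such as PYTHONSAFEPATH on Python 3.11+) would change the first result.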

That’s the whole reason. It catches “I forgot to ship this file” bugs before they leave your laptop. By 2026 this is the Python Packaging Authority’s recommended default, and what uv init --package produces.

There are still places flat is fine: scripts that aren’t packages, tiny projects you’ll never publish, notebooks. But for anything you’d run tests against and call a “library,” go src/.

The anatomy of a real project

Here’s the directory tree for a believable mid-sized Python project — think a small data pipeline or CLI tool:

weather-cli/
├── .github/
│   └── workflows/
│       └── ci.yml
├── .gitignore
├── .python-version
├── README.md
├── pyproject.toml
├── uv.lock
├── src/
│   └── weather_cli/
│       ├── __init__.py
│       ├── __main__.py
│       ├── api.py
│       ├── cli.py
│       ├── config.py
│       └── py.typed
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── test_api.py
│   ├── test_cli.py
│   └── fixtures/
│       └── sample_response.json
├── scripts/
│   └── regenerate_fixtures.py
├── notebooks/
│   └── exploration.ipynb
├── data/
│   └── .gitkeep
└── docs/
    └── usage.md

Let’s walk through each piece.

What __init__.py is for

Inside src/weather_cli/, the file __init__.py does three jobs:

1. It marks the directory as a package. This is the historical reason. Modern Python also supports “namespace packages” (directories without __init__.py), but for a regular package you want the file there.

2. It curates the public API. Whatever you put in __init__.py is what users see when they import weather_cli. A clean pattern:

# src/weather_cli/__init__.py
"""Tiny weather CLI."""

from weather_cli.api import fetch_forecast, Forecast
from weather_cli.cli import main

__all__ = ["fetch_forecast", "Forecast", "main"]
__version__ = "0.3.1"

Now users write from weather_cli import fetch_forecast instead of from weather_cli.api import fetch_forecast. The internal module structure is yours to refactor; the public surface is stable.

3. It exposes a version constant. Either hard-coded as above, or read dynamically:

from importlib.metadata import version
__version__ = version("weather-cli")

The second form has the advantage of single-sourcing the version — it pulls from your installed package metadata, which came from pyproject.toml.

What __main__.py does

If you create src/weather_cli/__main__.py, you can run the package as a module:

python -m weather_cli --city Rome

Useful when you want python -m invocation alongside (or instead of) a console script. A typical __main__.py:

# src/weather_cli/__main__.py
from weather_cli.cli import main

if __name__ == "__main__":
    raise SystemExit(main())

A console script in pyproject.toml points at the same main function, referenced from the module that defines it:

[project.scripts]
weather = "weather_cli.cli:main"

so weather --city Rome and python -m weather_cli --city Rome do the exact same thing. Belt and braces.
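
For both invocations to behave identically, main has to work when called with no arguments and should return an exit status. The lesson doesn’t show cli.py, so the following is only a sketch of one possible shape, assuming argparse and a single --city option; the printed message stands in for the real forecast lookup.

```python
# src/weather_cli/cli.py (hypothetical contents)
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse arguments and return a process exit code (0 means success)."""
    parser = argparse.ArgumentParser(prog="weather", description="Tiny weather CLI.")
    parser.add_argument("--city", required=True, help="City to fetch the forecast for")
    args = parser.parse_args(argv)  # argv=None makes argparse read sys.argv[1:]

    # Placeholder for the real work, e.g. calling fetch_forecast(args.city).
    print(f"Forecast requested for {args.city}")
    return 0
```

Accepting argv as a parameter keeps main testable: a test can call main(["--city", "Rome"]) without touching sys.argv.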

The py.typed marker

If your package has type hints and you want downstream users’ type checkers to actually see them, add an empty file called py.typed next to __init__.py. This is PEP 561. Without it, mypy treats your installed package as untyped even if every function has annotations, and other checkers may fall back to inferring what they can from the source. It’s one of those tiny wins that takes ten minutes to discover the first time.

Tests directory

tests/ mirrors the package. If you have src/weather_cli/api.py, you have tests/test_api.py. This isn’t enforced anywhere — it’s just much easier to find things.

conftest.py at the top of tests/ holds shared fixtures. pytest auto-discovers it; you don’t import it. A typical one:

# tests/conftest.py
import json
from pathlib import Path
import pytest

FIXTURES = Path(__file__).parent / "fixtures"

@pytest.fixture
def sample_response():
    return json.loads((FIXTURES / "sample_response.json").read_text())

Should tests/ itself be a package (with its own __init__.py)? Old wisdom said yes; modern wisdom says it depends. With src/ layout and pytest’s “rootdir” logic, you usually don’t need it (the tree above includes one, but it’s optional). Add one only if you have name collisions between test files (two test_utils.py in different subfolders): under pytest’s default import mode, the folders then need __init__.py files so the two modules get distinct names.

Where scripts/ go (and why not in the package)

scripts/ is for one-off automation that uses the package but isn’t part of its public API. Things like:

  • regenerating test fixtures
  • bulk data migrations you ran once
  • benchmarks

A typical one:

# scripts/regenerate_fixtures.py
"""Run from project root: uv run python scripts/regenerate_fixtures.py"""
import json

from weather_cli.api import fetch_forecast

# Assumes fetch_forecast returns JSON-serialisable data (a dict, say).
forecast = fetch_forecast("Rome")
print(json.dumps(forecast, indent=2))

These are not installed with the package. They’re not in [project.scripts]. They’re just there so you and your collaborators have one obvious place to look for “that thing I wrote once.” Putting them inside the package would ship them to every user who installs it, which is rude.
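
If a script grows beyond a few lines, it helps to resolve paths from the script’s own location and to guard the entry point, so it behaves the same from any working directory. A sketch under assumptions: the fetch callable is passed in so the example runs without the real weather_cli package, the helper names are illustrative, and the fixture path matches the tree shown earlier.

```python
import json
from pathlib import Path
from typing import Any, Callable


def fixture_path(script_file: str) -> Path:
    """Locate tests/fixtures/sample_response.json relative to scripts/<this file>."""
    project_root = Path(script_file).resolve().parent.parent
    return project_root / "tests" / "fixtures" / "sample_response.json"


def regenerate(fetch: Callable[[str], Any], city: str, target: Path) -> None:
    """Fetch fresh data and overwrite the fixture file with pretty-printed JSON."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(fetch(city), indent=2) + "\n")


# In the real script the guarded entry point would look like this (commented out
# here because it needs the installed package and network access):
# if __name__ == "__main__":
#     from weather_cli.api import fetch_forecast
#     regenerate(fetch_forecast, "Rome", fixture_path(__file__))
```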

Notebooks and data

notebooks/ is where Jupyter notebooks go. Tests don’t run on them, type checkers ignore them, and your pyproject.toml should exclude them from any other tooling. Under [tool.ruff], extend-exclude = ["notebooks"] is a common line; unlike exclude, it adds to Ruff’s default exclusions rather than replacing them.

data/ is for files your code reads or writes locally. If they’re large or generated, gitignore the contents and commit a .gitkeep so the directory exists. If they’re small and canonical (test fixtures, sample CSVs), commit them — but think about whether they belong inside tests/fixtures/ instead.

Don’t put data inside the package directory unless it’s package data — files that need to ship with the wheel because your code reads them at runtime. For that, use importlib.resources:

from importlib.resources import files
config_text = (files("weather_cli") / "default_config.toml").read_text()

and tell your build backend to include it ([tool.hatch.build] or equivalent).
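
To see the mechanism without building a wheel, the sketch below fabricates a tiny package on disk (demo_pkg_with_data is a made-up name, as are the file contents), puts its parent directory on sys.path, and reads the bundled file through files(). The same call works once a package is installed, because the lookup goes through the import system rather than a path relative to the working directory.

```python
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

PACKAGE = "demo_pkg_with_data"  # hypothetical package name

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / PACKAGE
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "default_config.toml").write_text('units = "metric"\n')

    sys.path.insert(0, tmp)  # stand-in for "the package is installed"
    try:
        # files() imports the package and returns a traversable object rooted at it.
        config_text = (files(PACKAGE) / "default_config.toml").read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)

print(config_text)
```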

Imports inside the package

Always use absolute imports within your own package:

# good
from weather_cli.api import Forecast

# avoid (relative imports — work, but obscure where things come from)
from .api import Forecast

Both are valid Python. Absolute imports are easier to grep, easier to move (you don’t have to count dots when you reorganise), and clearer when reading code in isolation.

Conditional and lazy imports

Most of the time, import everything at the top of the file. But two patterns are worth knowing:

Conditional imports for optional dependencies:

try:
    import polars as pl
    HAS_POLARS = True
except ImportError:
    HAS_POLARS = False

def to_dataframe(rows):
    if not HAS_POLARS:
        raise ImportError("Install with `pip install weather-cli[polars]` to use this.")
    return pl.DataFrame(rows)
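
A variation on the same idea checks availability without paying for the import itself, which is useful when you only need to know whether an extra is present. This sketch relies on importlib.util.find_spec; the helper name is illustrative.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return find_spec(module_name) is not None


HAS_POLARS = is_installed("polars")
print(f"polars available: {HAS_POLARS}")
```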

Lazy imports when import time matters, CLIs especially. If your package imports pandas at the top, every CLI invocation pays that import cost (often a noticeable fraction of a second) even when you just want --help. Move heavy imports inside the functions that need them:

def export_to_excel(rows, path):
    import openpyxl  # only imported when this function runs
    ...
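
Before moving imports around, it’s worth measuring whether they actually hurt. The interpreter’s -X importtime option gives a full per-module breakdown; the sketch below is a cruder, portable check that times a single import in a fresh process, so already-cached modules don’t distort the result. The module names are examples only.

```python
import subprocess
import sys
import time


def import_seconds(module_name: str) -> float:
    """Wall-clock time for a fresh interpreter to start and import `module_name`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


baseline = import_seconds("sys")    # roughly interpreter startup alone
with_json = import_seconds("json")  # swap in the heavy dependency you suspect
print(f"startup: {baseline:.3f}s, startup + json: {with_json:.3f}s")
```

The numbers vary between runs and machines, so compare several measurements rather than trusting one.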

For libraries published to PyPI you can also expose lazy attributes via PEP 562’s module-level __getattr__, but that’s overkill until you’re sure you need it.

A noxfile.py or Makefile?

Nice-to-have. Both let you write nox -s tests or make lint so you don’t memorise the long commands. In a uv-based project you can also just declare the tools as dev dependencies in pyproject.toml and let uv run do the work: uv run pytest, uv run ruff check ., uv run mypy src/. That’s enough for most projects.

A pattern I like: keep a top-level Makefile with three or four targets that wrap the actual commands, so newcomers can just type make test and not worry about the toolchain.

.PHONY: test lint fmt typecheck
test:
	uv run pytest
lint:
	uv run ruff check .
fmt:
	uv run ruff format .
typecheck:
	uv run mypy src

Trivial, but it documents the project’s expected commands in one place and removes the “what was that command again?” friction. Windows users without make can use just for the same effect.

A data-engineering flavoured example

Layouts shift slightly by domain. For a data-engineering project — the kind of thing that pulls from a warehouse, transforms with polars or duckdb, and emits Parquet — you’ll often see a slightly richer tree:

sales-pipeline/
├── pyproject.toml
├── uv.lock
├── src/
│   └── sales_pipeline/
│       ├── __init__.py
│       ├── extract.py
│       ├── transform.py
│       ├── load.py
│       └── sql/
│           ├── __init__.py
│           └── queries/
│               ├── orders.sql
│               └── customers.sql
├── tests/
│   ├── conftest.py
│   ├── test_transform.py
│   └── fixtures/
│       └── orders_sample.parquet
├── dbt/                       # if you also have a dbt project
├── airflow/                   # DAG definitions
├── notebooks/
└── data/
    ├── raw/                   # gitignored, populated locally
    └── processed/             # gitignored

The point isn’t that every project needs a dbt/ or airflow/ directory — it’s that domain-specific stuff lives at the top level, next to (not inside) your Python package. Keep src/sales_pipeline/ small and focused; let the orchestration, scratch work, and raw data sprawl elsewhere.

Don’t fight the conventions

The temptation when starting a project is to invent something clever — a deeply nested package hierarchy, custom directory names, novel test layouts. Resist. Every Python developer who looks at your repo (including you, six months from now) has muscle memory for src/<package>/, tests/, and pyproject.toml. Putting things where they’re expected costs nothing and saves a thousand small frictions.

The summary

  • src/ layout is the default in 2026. It catches packaging bugs early.
  • __init__.py curates your public API; __main__.py enables python -m.
  • Tests mirror the package structure. conftest.py holds shared fixtures.
  • Scripts, notebooks, and data live in their own top-level directories, not inside the package (package data read at runtime is the one exception).
  • Absolute imports inside your own package. Lazy imports only when startup time hurts.
  • Add py.typed if your package has type hints and you want users to benefit.

Combined with the previous lessons — pyproject.toml as the config, uv as the manager — you now have the full modern picture for organising and shipping a Python project. The next module moves on to standard-library deep cuts: argparse, logging, subprocess, and the bits of the stdlib worth knowing cold.
