logging: structured logs, levels, and the patterns that work in production

print() is fine. For a script you’re running once on your laptop, it’s perfect. The problem is that the moment a piece of code becomes anything else — a service, a library, a scheduled job, anything someone else operates — print() starts costing you.

There’s no level (is this an error or a status update?), no timestamp (when did this happen?), no module name (which file produced this?), no way to silence one noisy module without silencing everything, and no way for the operator to redirect to a file, a syslog, or a cloud log aggregator without editing your source code.

Python’s logging module solves all of those problems. It also has a reputation for being unpleasant, which is mostly because the official tutorials show the deepest part of the API first. The simple version is genuinely simple. This lesson is the simple version, then the production version, then the small set of pitfalls that are worth knowing about.

The simple version

Two lines at the top of every module:

import logging
logger: logging.Logger = logging.getLogger(__name__)

Then anywhere in that module:

logger.debug("starting calculation with input=%s", payload)
logger.info("processed %d records", len(records))
logger.warning("rate limit at %d%% — backing off", pct)
logger.error("failed to fetch %s: %s", url, exc)
logger.critical("database connection lost")

That’s it. Five severity levels, in increasing order: DEBUG, INFO, WARNING, ERROR, CRITICAL. The default level is WARNING, which means debug and info calls are silently dropped until you turn the level up.
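
To make the threshold concrete, here is a minimal sketch. The logger name levels_demo is made up for the demonstration, and the "unconfigured" behaviour assumes nothing else in the process has touched the logging setup yet.

```python
import logging

# A hypothetical logger name used only for this demonstration.
demo_logger = logging.getLogger("levels_demo")

# Levels are plain integers; a higher number means more severe.
level_values = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}
print(level_values)
# {'DEBUG': 10, 'INFO': 20, 'WARNING': 30, 'ERROR': 40, 'CRITICAL': 50}

# In an unconfigured interpreter the logger inherits the root logger's
# WARNING level, so INFO is filtered out and WARNING gets through.
print(demo_logger.isEnabledFor(logging.INFO))
print(demo_logger.isEnabledFor(logging.WARNING))

# Turning the level up (numerically down) lets the quieter calls through.
demo_logger.setLevel(logging.DEBUG)
print(demo_logger.isEnabledFor(logging.DEBUG))
```

A call is processed when its level is at or above the logger’s effective level, which is why lowering the threshold to DEBUG enables everything.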

For a script’s entry point, one call configures everything:

import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)-8s %(name)s: %(message)s",
)

basicConfig sets up a handler that writes to stderr with the format you specify. Run your script and you get timestamped, levelled, module-tagged output for free.

getLogger(name): the one rule

The single most important habit: always use logging.getLogger(__name__) at the top of every module, never logging.info(...) directly.

The reason is the logger hierarchy. Logger names are dotted paths, and they form a tree based on those dots. A module called myapp.payments.stripe produces a logger named myapp.payments.stripe, which is a child of myapp.payments, which is a child of myapp, which is a child of the root logger.

Why does this matter? Because configuration cascades down the tree. If you set the root logger to INFO, every module inherits INFO. But you can override per-subtree:

logging.getLogger("myapp").setLevel(logging.INFO)         # most of the app
logging.getLogger("myapp.payments").setLevel(logging.DEBUG)  # this subsystem only
logging.getLogger("urllib3").setLevel(logging.WARNING)    # quiet down a noisy dep

That’s the entire reason logging exists as a hierarchy and not just a collection of print-with-levels. You can ratchet verbosity for one part of the system while keeping the rest quiet, without touching the call sites.

If you call logging.info(...) directly, you’re using the root logger and you’ve thrown away that ability. Use getLogger(__name__).
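
The inheritance described above can be observed directly with getEffectiveLevel(). This is a small sketch; the demoapp package names are hypothetical and stand in for whatever your modules are called.

```python
import logging

# Hypothetical package names, used only to illustrate the tree.
app = logging.getLogger("demoapp")
payments = logging.getLogger("demoapp.payments")
stripe = logging.getLogger("demoapp.payments.stripe")
orders = logging.getLogger("demoapp.orders")

app.setLevel(logging.INFO)        # most of the app
payments.setLevel(logging.DEBUG)  # this subsystem only

# Loggers with no level of their own walk up the tree until they
# find an ancestor that has one.
for lg in (app, payments, stripe, orders):
    print(lg.name, logging.getLevelName(lg.getEffectiveLevel()))
# demoapp INFO
# demoapp.payments DEBUG
# demoapp.payments.stripe DEBUG
# demoapp.orders INFO
```

Neither stripe nor orders was configured explicitly; each picked up the level of its nearest configured ancestor.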

Format strings: what to put in them

basicConfig’s format argument is a %-style template with named fields. The useful ones:

%(asctime)s — formatted timestamp, with %(msecs)d for milliseconds.
%(levelname)s — DEBUG, INFO, etc. The -8s width gives you padded alignment.
%(name)s — the logger name, which is the module path if you used __name__.
%(message)s — the message itself, after argument substitution.
%(module)s, %(funcName)s, %(lineno)d — call site info, useful in development.
%(process)d, %(thread)d — useful when you have more than one of either.

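A reasonable production format (the default asctime already ends in milliseconds, such as 12:00:00,123, so appending %(msecs)03d would print them twice unless you also pass a custom datefmt):

"%(asctime)s %(levelname)-8s %(name)s [%(process)d]: %(message)s"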

A reasonable development format:

"%(asctime)s %(levelname)-8s %(name)s:%(lineno)d %(message)s"

Add lineno and funcName when you want to grep for “where did this come from.”
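
One way to try out a format string without setting up handlers is to format a hand-built record. This is only a sketch: the record’s name, file, and line number are invented, and the datefmt value is one assumed choice among many.

```python
import logging

# Build a record by hand so the example does not depend on handler setup.
# The name, path, line number and message are made up for illustration.
record = logging.LogRecord(
    name="myapp.payments",
    level=logging.WARNING,
    pathname="payments.py",
    lineno=42,
    msg="rate limit at %d%%",
    args=(80,),
    exc_info=None,
)

dev = logging.Formatter("%(levelname)-8s %(name)s:%(lineno)d %(message)s")
print(dev.format(record))
# WARNING  myapp.payments:42 rate limit at 80%

# With an explicit datefmt, asctime no longer carries milliseconds,
# so %(msecs)03d can be appended without duplicating them.
prod = logging.Formatter(
    "%(asctime)s.%(msecs)03d %(levelname)-8s %(name)s: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S",
)
print(prod.format(record))
```

The second line of output starts with a timestamp taken from the moment the record was created, so it differs on every run.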

dictConfig: the production version

For anything beyond a single-script basicConfig, the right tool is logging.config.dictConfig. It takes a declarative dict (or YAML/JSON loaded into a dict) describing handlers, formatters, and per-logger levels.

import logging.config

CONFIG: dict = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "%(asctime)s %(levelname)-8s %(name)s: %(message)s",
        },
        "json": {
            "()": "pythonjsonlogger.jsonlogger.JsonFormatter",
            "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "standard",
        },
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "level": "DEBUG",
            "formatter": "json",
            "filename": "/var/log/myapp/app.log",
            "maxBytes": 50_000_000,
            "backupCount": 5,
        },
    },
    "loggers": {
        "myapp": {
            "level": "DEBUG",
            "handlers": ["console", "file"],
            "propagate": False,
        },
        "myapp.payments": {
            "level": "INFO",  # quieter than the rest
        },
        "urllib3": {
            "level": "WARNING",
        },
    },
    "root": {
        "level": "WARNING",
        "handlers": ["console"],
    },
}

logging.config.dictConfig(CONFIG)

Things to note:

disable_existing_loggers: False is almost always what you want. The default True disables every logger that already exists when dictConfig is called, unless the config names it, which is a foot-gun.
Each handler has its own level. The console can be INFO while the file gets DEBUG.
The RotatingFileHandler rolls files at a size threshold so logs can’t fill the disk.
propagate: False on a logger stops it from also writing to ancestor handlers. Useful when you’ve attached a handler at both the package and root levels and don’t want duplicate lines.

In real apps this config usually lives in a YAML file loaded at startup, so ops can change levels without a code change (a restart or config reload is still needed for the new values to take effect).
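
Loading the config from a file is a thin wrapper around dictConfig. YAML needs a third-party parser, so this sketch uses JSON to stay within the standard library. The inline text stands in for a file’s contents, and the demo_service logger names and the configure_logging helper are assumptions made for the example.

```python
import json
import logging
import logging.config

# Stand-in for the contents of a config file; in a real app this text
# would be read from disk. Logger names here are hypothetical.
CONFIG_TEXT = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "standard": {"format": "%(asctime)s %(levelname)-8s %(name)s: %(message)s"}
  },
  "handlers": {
    "console": {"class": "logging.StreamHandler", "formatter": "standard"}
  },
  "loggers": {
    "demo_service": {"level": "DEBUG"},
    "demo_service.payments": {"level": "INFO"}
  },
  "root": {"level": "WARNING", "handlers": ["console"]}
}
"""


def configure_logging(text: str) -> None:
    """Parse a JSON document and hand the resulting dict to dictConfig."""
    logging.config.dictConfig(json.loads(text))


configure_logging(CONFIG_TEXT)
logging.getLogger("demo_service.payments").info("configured from file-like text")
```

Swapping json.loads for a YAML loader is the only change needed if the file is YAML; the dict passed to dictConfig has the same shape either way.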

Lazy formatting: the one big mistake

Compare these two:

logger.info("processing user %s with %d items", user_id, len(items))      # correct
logger.info(f"processing user {user_id} with {len(items)} items")          # wrong

They produce the same output. They are not equivalent.

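The first one passes the format string and the arguments separately. The logger checks the level first, and if INFO is disabled, the message is never built: no % substitution, no str() or repr() calls on the arguments. The arguments themselves are still evaluated, because Python evaluates them before calling info, so len(items) runs either way. What is lazy is the formatting.

The second one builds the f-string before info is even called. The format work happens unconditionally, level filtering or not. For a "processing user X" log line that’s fine. For an object with an expensive __str__ or __repr__, such as f"state: {state}" where rendering state means serializing it, you’ve just paid for serialization on every call, even when DEBUG is off. If computing the argument is itself the expensive part, as in dump_full_state(), placeholders don’t help; guard the call with logger.isEnabledFor(logging.DEBUG).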

Habit: use %-style placeholders in logger.* calls. Use f-strings everywhere else. pylint flags the f-string form as logging-fstring-interpolation; ruff has a G004 rule for it too.

The exception worth knowing: extra={...} is fine to build with any string formatting because its values aren’t expanded into the message string — they’re attached as attributes for structured handlers.
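
The difference is easy to measure by counting how often an object is turned into a string. This is a sketch: the Expensive class is hypothetical and stands in for anything whose string form is costly to build.

```python
import logging


class Expensive:
    """Hypothetical object whose string form is costly to build."""

    def __init__(self) -> None:
        self.str_calls = 0

    def __str__(self) -> str:
        self.str_calls += 1
        return "<big state dump>"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)             # DEBUG is disabled for this logger
logger.addHandler(logging.NullHandler())  # keep the demo silent
logger.propagate = False

state = Expensive()

logger.debug("state: %s", state)   # placeholder: __str__ is never called
placeholder_calls = state.str_calls

logger.debug(f"state: {state}")    # f-string: __str__ runs before debug() does
fstring_calls = state.str_calls

# When computing the argument itself is the expensive part, guard the call.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", state)

print(placeholder_calls, fstring_calls, state.str_calls)  # 0 1 1
```

The placeholder call cost nothing beyond the level check, the f-string call paid for one string conversion that was then thrown away, and the guarded call never ran.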

Exceptions: log the traceback

try:
    risky()
except Exception:
    logger.exception("risky() failed")

logger.exception(msg) is logger.error(msg) plus the current traceback attached. Only call it from inside an except block. For non-error paths where you want to attach an exception object you already have, use the exc_info keyword:

logger.warning("retry %d failed", attempt, exc_info=exc)

Don’t repr(exc) into your message manually. The logger’s traceback handling is better than what you’d assemble with string formatting.
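
Both forms can be checked by capturing the output in memory. This sketch assumes a throwaway logger named exc_demo and a risky() function that always fails; neither comes from a real codebase.

```python
import io
import logging

# Capture output in memory so the result can be inspected (demo-only setup).
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("exc_demo")
logger.setLevel(logging.WARNING)
logger.addHandler(handler)
logger.propagate = False


def risky() -> None:
    """Hypothetical operation that always fails."""
    raise ValueError("bad input")


saved = None
try:
    risky()
except Exception as exc:
    logger.exception("risky() failed")  # ERROR level, traceback appended
    saved = exc                         # keep a reference for later

# Outside the except block, pass the exception object explicitly.
logger.warning("retry %d failed", 1, exc_info=saved)

print(buffer.getvalue())
```

Each of the two records is followed by a full "Traceback (most recent call last):" block ending in ValueError: bad input, with no manual string assembly.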

Structured logging: extra and beyond

logger.info(
    "user_action",
    extra={"user_id": user_id, "action": "checkout", "cart_total": 49.99},
)

The extra keyword adds fields to the log record without putting them in the message. A JSON formatter (like python-json-logger) emits them as structured fields. A plain text formatter ignores them unless you reference them by name in the format string.
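
A short sketch of how extra fields land on the record, using an in-memory stream so the output can be inspected. The logger name and field names are illustrative only.

```python
import io
import logging

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
# A plain-text formatter can reference extra fields by name.
handler.setFormatter(
    logging.Formatter("%(levelname)s %(message)s user=%(user_id)s")
)

logger = logging.getLogger("extra_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("user_action", extra={"user_id": 17, "action": "checkout"})
print(buffer.getvalue(), end="")  # INFO user_action user=17

# Keys that collide with built-in record attributes are rejected.
try:
    logger.info("user_action", extra={"message": "oops"})
except KeyError as err:
    print("rejected:", err)
```

Two caveats follow from this: once a format string names an extra field, every record passing through that formatter has to carry it or formatting fails, and extra keys must not reuse names such as message or name that the record already defines.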

For more serious structured logging, two third-party options dominate in 2026:

structlog — explicit, composable, processor-pipeline based. Industrial. Pairs cleanly with logging so you can pick it up without rewriting existing log calls.
loguru — opinionated, batteries-included, pretty default output. A single import and you have colored, structured logs. Good for scripts and small services.

Both are reasonable choices. structlog is what you’ll see in larger codebases that care about log shape. loguru is what you’ll see in repositories where someone wanted nice logs in five minutes.

Cloud log aggregators: the integration story

In production you almost never read logs from disk. You ship them to one of:

CloudWatch Logs (AWS) — pick up stdout/stderr from container runtimes; structured logs work best as JSON.
Cloud Logging / Stackdriver (GCP) — same shape; the Python client library exposes a handler that adds GCP-specific fields.
Datadog, Splunk, Honeycomb, Grafana Loki — same general pattern, different SDKs.

The portable strategy: log JSON to stdout, let the runtime ship it, let the aggregator parse it. That’s why the json formatter in the dictConfig above is more useful in production than the human-readable one, once it is attached to a StreamHandler pointed at sys.stdout instead of a file handler.

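# Minimal JSON output with stdlib only
import json
import logging

# Attribute names every LogRecord has; anything else came from extra={...}.
# "message" and "asctime" are set later by other formatters, so list them too.
_RESERVED: frozenset = frozenset(
    logging.LogRecord("", 0, "", 0, "", None, None).__dict__
) | {"message", "asctime"}

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload: dict = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc"] = self.formatException(record.exc_info)
        # Copy over fields passed via extra={...}
        for k, v in record.__dict__.items():
            if k not in _RESERVED:
                payload[k] = v
        # default=str keeps non-JSON-serializable extras from raising
        return json.dumps(payload, default=str)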

For most projects, install python-json-logger instead and avoid maintaining the above.

A real-world example: library + application

The split most people get wrong: a library should never configure logging. It should only call logging.getLogger(__name__) and emit. The application that imports the library is the one that decides where logs go and at what level.

A library file:

# mylib/payments.py
import logging

logger: logging.Logger = logging.getLogger(__name__)

def charge(card: str, amount: float) -> str:
    logger.debug("charging card ending %s for %.2f", card[-4:], amount)
    try:
        token: str = _stripe_call(card, amount)
    except Exception:
        logger.exception("stripe call failed for amount=%.2f", amount)
        raise
    logger.info("charged %.2f, token=%s", amount, token)
    return token

An application entry point:

# app/main.py
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "std": {"format": "%(asctime)s %(levelname)-8s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "std"},
    },
    "loggers": {
        "mylib": {"level": "INFO"},          # library: info and above
        "mylib.payments": {"level": "DEBUG"}, # except this submodule
    },
    "root": {"level": "WARNING", "handlers": ["console"]},
})

from mylib.payments import charge
charge("4242 4242 4242 4242", 49.99)

Because the library uses __name__, the application can reach in and turn one submodule’s verbosity up without any cooperation from the library author. That’s the payoff.

Common mistakes worth avoiding

Logging at the wrong level. error for things that aren’t errors creates pager fatigue. info for things that should be debug creates noise that buries actual signal. Calibrate: ERROR is “a human will eventually need to look at this,” WARNING is “potentially concerning, may be normal,” INFO is “milestone events the operator wants to see,” DEBUG is “everything else, off by default.”
Logging inside hot loops. Even at suppressed levels there’s overhead. Move log calls out of inner loops, or guard with if logger.isEnabledFor(logging.DEBUG): for the genuinely expensive ones.
Logging secrets. Tokens, passwords, full request bodies. Once they’re in the log pipeline they’re hard to recall. Redact at the source.
Calling logging.basicConfig() more than once. It’s a no-op once the root logger has handlers, which in practice means after the first call (unless you pass force=True, available since 3.8). For complex setups, use dictConfig instead.
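
For the secrets item, one possible shape of "redact at the source" is a logging.Filter attached to the handler. This is a sketch under stated assumptions: the regular expression and the key names token and password are hypothetical, and a real deployment would need patterns matched to its own data.

```python
import io
import logging
import re

# Hypothetical pattern: matches "token=<value>" or "password=<value>".
SECRET_RE = re.compile(r"\b(token|password)=\S+")


class RedactSecrets(logging.Filter):
    """Rewrite the final message so secret values never reach the output."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merge the %-args into the message
        record.msg = SECRET_RE.sub(r"\1=[REDACTED]", message)
        record.args = None             # args are already merged in
        return True                    # keep the record, just cleaned up


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.addFilter(RedactSecrets())

logger = logging.getLogger("redact_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("login ok user=%s token=%s", "ada", "s3cr3t")
print(buffer.getvalue(), end="")  # login ok user=ada token=[REDACTED]
```

A filter on a handler only protects that handler, and it does not touch values passed through extra, so it works best as a safety net behind the habit of not logging secrets in the first place.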

When to choose what

For a one-off script: basicConfig and getLogger(__name__).

For a small service: dictConfig from a YAML file at startup, JSON formatter to stdout, let the platform handle file rotation.

For anything bigger: same as above, plus structlog for structured fields, plus a cloud handler if you’re not getting stdout shipped automatically.

The thing to internalize: logging is the channel between your future self (or your future colleague, or the on-call engineer at 3am) and the system as it ran. The cost of a thoughtful log line is one line of code. The cost of not having one when you need it is debugging from a stack trace alone. Spend the line.

References: logging — Logging facility for Python, logging.config, logging.handlers, Logging HOWTO, structlog, loguru. Retrieved 2026-05-01.