The first architecture: a single-server web app + database

The previous lesson was about trade-offs. This one is about the architecture you should pick when you have not yet earned any of the harder trade-offs. Most systems should start here. A surprising number of systems should never leave.

The shape: one virtual machine, one Postgres database, one application process, one deploy pipeline. Optionally a static frontend on a CDN, optionally a managed load balancer in front, optionally a small cache when you genuinely need one. That’s the entire diagram. You can fit it on the back of a napkin. You can run a profitable business on it. People do, every day.

This lesson is about why this is a good starting point, what it can and cannot do, what it costs, what the actual pieces are, and what you should resist adding to it for as long as humanly possible.

What “minimal viable architecture” looks like

The minimal architecture has four moving parts:

A web framework, in whatever language your team is fastest in. Python (Django, FastAPI, Flask), Node (Express, Fastify, NestJS), Go (net/http or chi or Gin), Ruby (Rails), Elixir (Phoenix), .NET (ASP.NET Core), Java (Spring Boot, or Quarkus if you care about cold-start time). They are all fine. The choice barely matters at this stage. Pick the one whose ecosystem you can navigate at 2 a.m.
A relational database. Postgres is the safe pick. MySQL/MariaDB is the safe second pick. SQLite is genuinely viable for many workloads up to several thousand users; do not laugh. Avoid anything trendier; you have not earned the operational complexity.
A reverse proxy or load balancer in front. Either nginx on the same VM, or a managed load balancer (AWS ALB, Cloudflare, Hetzner Load Balancer, DigitalOcean Load Balancer). Terminates TLS, hands off HTTP to your app.
A way to deploy. This can be as fancy as you want or as crude as git pull && systemctl restart app. Both extremes have shipped real businesses.

That is the entire stack. Everything else (cache, queue, separate frontend, CDN, search, analytics warehouse) is something you bolt on when you can name a specific problem it solves for you, not because the architecture diagrams in conference talks have those boxes.

Here is the C4 container diagram for this:

flowchart LR
    user[End user browser] -->|HTTPS| lb[Load balancer or nginx]
    lb -->|HTTP| app[Web app process]
    app -->|TCP 5432| db[(Postgres database)]
    cdn[CDN] -.->|static assets| user

That is it. One user-facing entry point, one app process, one database, optionally a CDN for static files. The whole thing fits in your head, you can debug any part of it from a laptop, and you can deploy a new version in under five minutes.

What this can actually do

The instinct, especially in 2026 after a decade of microservices marketing, is to think this shape is a toy. It is not.

A reasonably-tuned web app on a single 4-vCPU 8 GB VM, talking to a Postgres on the same machine or on a managed neighbour, can comfortably handle:

Hundreds of requests per second with no special engineering effort. Postgres handles point queries in single-digit milliseconds; your framework’s overhead is tens of milliseconds at worst; the network in front of you is the bottleneck more often than the app is.
Thousands of requests per second with some care. That care looks like: connection pooling (PgBouncer or your framework’s built-in pool, sized correctly), reasonable indexes on the columns your queries filter by, avoiding N+1 queries by using JOIN or eager loading instead of one query per row, and a cache for the genuinely-hot paths.
Tens of millions of rows in your largest tables before you start needing to think hard about partitioning. Postgres is happy at this scale. The B-tree indexes are happy. Sequential scans on the rare badly-written query are still acceptable.
Tens of thousands of concurrent users if your application is well-behaved and most of your traffic is reads. WebSocket connections fan out per-process; if you need a million concurrent sockets you have a different problem, but very few apps actually do.
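The N+1 pattern mentioned in the list above is easiest to see by counting queries. The following is a minimal sketch, not a benchmark: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs anywhere, and the authors and posts tables are hypothetical.

```python
import sqlite3

def make_db():
    """Build a tiny in-memory database: 3 authors, 2 posts each (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES authors(id),
            title TEXT NOT NULL
        );
    """)
    for author_id, name in enumerate(["ada", "grace", "linus"], start=1):
        conn.execute("INSERT INTO authors (id, name) VALUES (?, ?)", (author_id, name))
        for n in (1, 2):
            conn.execute(
                "INSERT INTO posts (author_id, title) VALUES (?, ?)",
                (author_id, f"{name} post {n}"),
            )
    return conn

def titles_n_plus_one(conn):
    """One query for the posts, then one more query per post: the N+1 pattern."""
    queries = 1
    rows = []
    posts = conn.execute("SELECT title, author_id FROM posts ORDER BY id").fetchall()
    for title, author_id in posts:
        name = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()[0]
        queries += 1
        rows.append((title, name))
    return rows, queries

def titles_joined(conn):
    """The same result from a single JOIN."""
    rows = conn.execute("""
        SELECT posts.title, authors.name
        FROM posts JOIN authors ON authors.id = posts.author_id
        ORDER BY posts.id
    """).fetchall()
    return rows, 1
```

Both functions return the same rows; the first issues one query per post on top of the initial one, the second issues exactly one. Against a real database every extra query is also a network round trip, which is why ORMs offer eager loading.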

A small startup processing 10,000 orders a day, a SaaS with 5,000 paying customers, a content site with 500,000 monthly readers, a B2B tool with 200 enterprise users hammering it during business hours: all of these run comfortably on this shape, with room to spare. The companies that need more than this are real, but they are a smaller fraction of the industry than the conference circuit suggests.

The unsentimental truth is that the database does almost all of the work. The application process is mostly glue: it accepts an HTTP request, validates the input, runs some queries, formats the output. If your app is slow, it is almost always because a query is slow, not because the framework is slow. Knowing this saves you from the mistake of throwing more app servers at a problem that is actually a missing index.
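To make the “missing index” point concrete, here is a small sketch that asks the database for its query plan before and after adding an index. It uses Python's built-in sqlite3 so it is runnable without a server; in Postgres the equivalent tool is EXPLAIN (or EXPLAIN ANALYZE). The orders table and the index name are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (via EXPLAIN QUERY PLAN)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
    [(i % 500, 1000 + i) for i in range(10_000)],
)

sql = "SELECT id, total_cents FROM orders WHERE customer_id = ?"

# Without an index the only option is to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

# With the index the planner can search it instead of scanning.
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions and differs again in Postgres, but the shape of the exercise is the same: read the plan, add the index on the filtered column, read the plan again.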

What this costs

Real numbers, in EUR, as of 2026, for “I have a side project / a small SaaS that needs to run somewhere”:

Hetzner Cloud CPX21: 4 vCPU, 8 GB RAM, 80 GB SSD, 20 TB traffic. Roughly EUR 13/month. Datacentre in Germany or Finland, latency excellent for European users.
DigitalOcean Basic Droplet, 4 vCPU, 8 GB RAM, similar shape. Roughly EUR 45/month. Multiple global regions, slightly nicer dashboard.
AWS EC2 t4g.large (2 vCPU ARM, 8 GB) or t3.medium (2 vCPU, 4 GB). Roughly EUR 25 to 40/month plus data transfer, plus snapshots, plus the bill nobody can predict.
Managed Postgres, if you want to skip running it yourself: Hetzner doesn’t offer one yet; DigitalOcean managed Postgres starts around EUR 15/month for a small instance; AWS RDS for a db.t4g.small is around EUR 30/month, and a real production-grade one with backups and a replica is closer to EUR 80.
Cloudflare in front for DNS, TLS, caching, DDoS protection: free for most small workloads. Pro plan is USD 25/month if you want WAF, analytics, image optimisation. Most small projects are fine on free.
Backups to S3-compatible object storage (Backblaze B2 or Hetzner Object Storage): a few EUR/month for the volumes a small SaaS produces.
Monitoring via the free tier of Grafana Cloud, BetterStack, or self-hosted Prometheus on the same VM: free, or single-digit EUR for a managed alternative.

Total monthly bill for a working production deployment: somewhere between EUR 30 and EUR 100, depending on whether you self-host the database and how much you value not running it yourself. That is less than your AWS exploration tier accidentally bills you when you forget to shut down a NAT gateway. A real, scaled, paying-customers business can run on this for a long time.

The unit economics are extremely friendly to small teams. If you have 100 paying users at EUR 20/month, you have EUR 2,000/month in revenue and EUR 50/month in infrastructure. Almost any other architectural shape costs more, both in compute and in engineering time. The cheapest way to operate is to not have many things to operate.
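The arithmetic in that example, written out as a tiny sketch. The function name is illustrative, and the inputs are the figures from the paragraph above.

```python
def infra_share(paying_users, price_per_user_eur, infra_cost_eur):
    """Return (monthly revenue, infrastructure cost as a fraction of revenue)."""
    revenue = paying_users * price_per_user_eur
    return revenue, infra_cost_eur / revenue

revenue, share = infra_share(paying_users=100, price_per_user_eur=20, infra_cost_eur=50)
print(f"revenue: EUR {revenue}/month, infrastructure: {share:.1%} of revenue")
# prints: revenue: EUR 2000/month, infrastructure: 2.5% of revenue
```

Even at the top of the EUR 30 to 100 range quoted earlier, infrastructure stays at 5% of revenue for this example.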

Why this is the right starting point

Three reasons, all of them about the human running the system, not about the machine.

Simple to reason about. When something breaks, there are four places it could be: the load balancer, the app, the database, or the connection between them. SSH into the VM, look at the logs, look at top, look at pg_stat_activity. The mental model is small enough to hold in working memory while you debug. Compare this to a 12-service microservices deployment where the request might be in any of seven async queues and you cannot tell which.

Fast to deploy. A new version of the code on this stack is git pull, run migrations, restart the process. Five minutes from “I fixed it on my laptop” to “it’s live in production.” This compounds. A team that ships ten times a day learns far faster than a team that ships once a week, and at this stage of a project, learning velocity is the entire game.

Fast to debug. Stack traces are local. Logs are in one place. Time on the wall clock matches time in the database matches time in the application. There are no distributed traces to reconstruct, no clock skew between services, no message queues hiding the order of operations. When you reproduce a bug, you reproduce it the same way every time.

Solo founders and small teams ship years of product on this shape. GitHub and Shopify both grew for years as Rails monoliths on a relational database. Basecamp still runs that way, by choice. Pieter Levels’ Nomad List runs on a single VM with a single SQLite database, has for years, and prints money. Plausible Analytics, before they grew, was a single Phoenix app and a Postgres. The pattern is so common it is almost invisible.

The pieces in detail

Let’s walk through the actual moving parts of a real version of this stack, the kind you would put up tomorrow morning.

The web framework

Pick the one your team is fastest in. There is no architectural reason to pick a specific one at this scale. The framework is going to spend its time doing the same thing in every language: accept request, validate, query DB, render response. Whatever framework you pick, learn its conventions deeply rather than fighting them.
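The four steps named above (accept request, validate, query DB, render response) fit in a few lines even without a framework. This is a sketch using only Python's standard library (WSGI plus sqlite3 as a stand-in database); the users table, the id parameter, and the respond helper are hypothetical, and a real project would use its framework's routing and ORM instead.

```python
import json
import sqlite3
from urllib.parse import parse_qs

# Stand-in database with one hypothetical table and one row.
DB = sqlite3.connect(":memory:", check_same_thread=False)
DB.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
DB.execute("INSERT INTO users (id, email) VALUES (1, 'ada@example.com')")

def respond(start_response, status, payload):
    """Serialise a dict as JSON and send it with the given status line."""
    body = json.dumps(payload).encode("utf-8")
    headers = [("Content-Type", "application/json"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]

def app(environ, start_response):
    """WSGI app: GET /?id=N returns the matching user as JSON."""
    # 1. Accept the request.
    params = parse_qs(environ.get("QUERY_STRING", ""))

    # 2. Validate the input.
    try:
        user_id = int(params.get("id", [""])[0])
    except ValueError:
        return respond(start_response, "400 Bad Request", {"error": "id must be an integer"})

    # 3. Query the database (parameterised, never string-formatted).
    row = DB.execute("SELECT id, email FROM users WHERE id = ?", (user_id,)).fetchone()
    if row is None:
        return respond(start_response, "404 Not Found", {"error": "no such user"})

    # 4. Render the response.
    return respond(start_response, "200 OK", {"id": row[0], "email": row[1]})
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever() will serve it on port 8000. Every framework in the list above is a more comfortable version of these same four steps.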

Two opinionated picks if you have no preference and want something boring and reliable: Django if your team is Python-flavoured, Rails if your team is Ruby-flavoured. Both are roughly 20 years old, both are still actively developed, both have an obscene number of “this thing is already solved” libraries and Stack Overflow answers, and both are designed for exactly this single-app-and-database shape. Phoenix (Elixir) is a less-common pick but excellent if your team has some experience with it; it scales much further than Django or Rails on a single machine because the BEAM VM is genuinely better at concurrency than CPython or MRI.

Postgres

Use Postgres. Don’t agonise over it.

Postgres in 2026 is, conservatively, the most generally-useful database ever shipped. It does relational tables, JSON documents (with indexes!), full-text search, geospatial queries (PostGIS), time-series reasonably (with TimescaleDB or partitioning), pub/sub (with LISTEN/NOTIFY), and even job queues (with SELECT ... FOR UPDATE SKIP LOCKED). For most small-to-medium SaaS workloads, you will never outgrow it.

Run it as a managed service if you can afford the small premium. Self-host it on the same VM as the app if you cannot. Either way, enable automated backups before you have your first user. We will do a whole lesson on backups in module 9; for now: nightly pg_dump to S3, with retention.
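“Nightly pg_dump, with retention” has two parts worth getting right: the dump command and the rule for deleting old dumps. The sketch below covers both as pure functions. The file naming scheme and the function names are assumptions, the 30-day window is only a default, and the upload to object storage is deliberately left out because it depends on your provider's tooling.

```python
from datetime import date, timedelta

def pg_dump_command(dbname, out_path):
    """argv for a compressed custom-format dump, restorable with pg_restore."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", dbname]

def backup_name(dbname, day):
    """Hypothetical naming scheme: <dbname>-YYYY-MM-DD.dump"""
    return f"{dbname}-{day.isoformat()}.dump"

def expired_backups(names, dbname, today, keep_days=30):
    """Return the backup file names that are older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    prefix, suffix = f"{dbname}-", ".dump"
    expired = []
    for name in names:
        if not (name.startswith(prefix) and name.endswith(suffix)):
            continue  # not one of ours: never delete unknown files
        try:
            day = date.fromisoformat(name[len(prefix):-len(suffix)])
        except ValueError:
            continue  # name does not contain a valid date: leave it alone
        if day < cutoff:
            expired.append(name)
    return expired
```

A nightly cron job would run the command with subprocess.run(pg_dump_command(...), check=True), upload the file, then delete whatever expired_backups returns. Restoring from one of these dumps at least once before you need to is the other half of having backups.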

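Connection pooling: Django and Rails handle this in-process, which is fine until you need more than a couple of app processes. Once you have multiple processes (or, eventually, multiple machines), put PgBouncer in transaction-pooling mode in front of Postgres. One caveat: in transaction mode a client does not keep the same server connection between transactions, so session-level features such as LISTEN/NOTIFY need a direct connection to Postgres. Otherwise it is the most boring possible piece of software and has been working without complaint for well over fifteen years.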
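Sizing an in-process pool is mostly multiplication: every app process can open up to its own pool size, and the total has to stay under the server's max_connections (100 in a stock Postgres configuration). A rough sketch of that check follows; the function names and the reserve of 5 connections for migrations, backups and an admin shell are assumptions.

```python
def total_connections(app_processes, pool_size_per_process, reserved=5):
    """Worst-case connection count: every pool full, plus a reserve for admin work."""
    return app_processes * pool_size_per_process + reserved

def fits(app_processes, pool_size_per_process, max_connections=100, reserved=5):
    """True if the worst case stays within the server's connection limit."""
    return total_connections(app_processes, pool_size_per_process, reserved) <= max_connections

print(fits(app_processes=4, pool_size_per_process=10))   # 45 connections: fits
print(fits(app_processes=12, pool_size_per_process=10))  # 125 connections: does not fit
```

When the second case applies, the usual answers are smaller pools per process or a shared pooler in front of the database, rather than raising max_connections indefinitely.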

The job queue

You will need to do work asynchronously: send emails, generate reports, retry external API calls, process uploaded images. The instinct is to add Redis or RabbitMQ. Resist for as long as you can, because Postgres is also a perfectly good job queue at this scale, using SELECT ... FOR UPDATE SKIP LOCKED to dequeue jobs without contention.

Libraries that make this trivial: pg-boss (Node), solid_queue (Rails 8 default, by the way), dramatiq with a third-party Postgres broker (Python), River (Go), Oban (Elixir), procrastinate (Python). They give you the job queue without giving you another piece of infrastructure to operate. That is a meaningful win at small team size.
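For a sense of what those libraries do underneath, here is a minimal sketch of the dequeue step. It is not how any particular library implements it. The jobs table and its status, payload and started_at columns are hypothetical, and conn is assumed to be a DB-API connection to Postgres (for example from psycopg).

```python
# Claim the oldest queued job. SKIP LOCKED makes concurrent workers skip rows
# that another worker has already locked, so they never wait on each other.
CLAIM_JOB_SQL = """
UPDATE jobs
SET status = 'running', started_at = now()
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'queued'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""

def claim_next_job(conn):
    """Claim one queued job and return (id, payload), or None if the queue is empty.

    `conn` is assumed to be a DB-API connection to Postgres.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_JOB_SQL)
        row = cur.fetchone()  # None when nothing is queued
        conn.commit()         # the job is now visibly 'running' to other workers
        return row
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

A worker loop calls claim_next_job, runs the job, then marks it done or failed in a second statement. What the libraries add on top is the unglamorous part: retries, scheduling, and recovering jobs whose worker died while they were marked as running.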

If you genuinely need millisecond-latency real-time work (chat, presence, live game state), then yes, Redis. But that is a specific requirement, not a default.

The reverse proxy and TLS

Two viable choices.

Option A: nginx on the VM. Terminate TLS with Let’s Encrypt via certbot, proxy to the app on localhost:8000, serve static files directly out of /var/www. Three config files, each about 20 lines, all of which you copy from a tutorial the first time and never touch again. Free, fast, fine.

Option B: managed load balancer. Cloudflare in front for free, or a Hetzner Load Balancer for a few EUR a month (an AWS ALB costs noticeably more). Handles TLS, optional WAF, optional DDoS mitigation. Slightly nicer to operate; you don’t have to renew certs yourself.

Either is fine. Most projects start with option A and move to option B when they want a second VM behind it.

The deploy pipeline

There are two acceptable shapes at this scale.

Docker Compose on the VM. Your docker-compose.yml defines the app, Postgres, optionally Redis. You SSH in, git pull, docker compose up -d --build. Done. Backups are a cron job that runs pg_dumpall and pushes the result to S3.

Bare systemd on the VM. Your app runs as a systemd unit. Postgres is the distro-packaged service. Deploys are: SSH in, git pull, bundle install or pip install or npm install, migrate, systemctl restart yourapp. This is what people did before Docker and it still works. It is faster, has fewer moving parts, and the mental model is one unit smaller than the Docker version.

GitHub Actions for the CI: on push to main, run tests, and if green, SSH into the VM and run the deploy script. Five lines of YAML. The whole CI/CD setup is something you can build in an hour and never touch again until it stops fitting, which for most projects is never.
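The deploy script itself can be very small. Below is one possible sketch of the bare-systemd variant as a Python script that runs on the VM. The pip and manage.py commands assume a Django-style project and the service name is a placeholder, so substitute whatever your stack uses.

```python
import subprocess

def deploy_steps(service="yourapp"):
    """The ordered commands for one deploy (hypothetical Django-style project)."""
    return [
        ["git", "pull", "--ff-only"],
        ["pip", "install", "-r", "requirements.txt"],
        ["python", "manage.py", "migrate"],
        ["sudo", "systemctl", "restart", service],
    ]

def deploy(steps, run=subprocess.run, cwd=None):
    """Run each step in order and stop at the first failure.

    With check=True, subprocess.run raises CalledProcessError on a non-zero
    exit code, so a failed migration never reaches the restart step.
    """
    for argv in steps:
        print("+", " ".join(argv))
        run(argv, check=True, cwd=cwd)
```

Calling deploy(deploy_steps(), cwd="/srv/yourapp") from the CI job's SSH session is the whole pipeline; the path is again a placeholder. The run parameter exists so the sequence can be exercised in a test without touching a real server.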

Either shape is good. The wrong shape is “we are going to run Kubernetes for our 50-MAU side project,” and somebody will do it anyway, and they will spend more time on YAML than on code, and they will write a blog post about how it scales beautifully, and you should not listen to them.

A richer diagram

When the project starts to feel real (paying customers, the founder no longer the only on-call), the picture grows monitoring, backups, and an explicit deploy story. Here is what that next-step shape looks like:

Diagram: “small SaaS architecture v1”, with the following parts.

Centre: one VM with three boxes inside it, labelled “nginx”, “app process (gunicorn / puma / etc.)”, and “Postgres”.
Left: end users connecting via HTTPS through a Cloudflare CDN box.
Right: a separate “S3-compatible object storage” cylinder, with arrows from the VM for nightly pg_dump backups (label: “nightly, encrypted, 30-day retention”) and for uploaded user files (label: “uploads”).
Below the VM: a “Prometheus + Grafana” cluster (on the same VM or a separate small VM), with arrows up to the VM showing the scraping of metrics.
Below that: a small “Alertmanager / BetterStack” box receiving alerts and routing them to Slack/email.
Bottom right: the deployment pipeline as three boxes, left to right: “GitHub repo” -> “GitHub Actions (run tests)” -> “SSH deploy to VM (systemd restart)”.

Colours: VM in green, storage in orange, observability in blue, deploy in grey. “Production” is grouped in a light-blue box, “deployment” in a light-grey box, and “external services” sit outside both. There are no more than 12 named components in total, and the diagram still fits on a single screen.

That diagram represents a genuinely production-grade small SaaS. Plenty of companies in the EUR 1-10M revenue range run on something not very different. It does not have Kubernetes. It does not have a service mesh. It does not have separate microservices. It has the things it actually needs: TLS, backups, monitoring, alerting, and a way to get code from GitHub to production reliably.

What NOT to add yet

This list is the most underrated part of this lesson. The market wants to sell you all of these. Most of them do not belong in a system that has not yet earned them.

Kubernetes. Until you genuinely need to run multiple services across multiple machines with auto-scaling, running Kubernetes means paying for capability you do not use, and the operational tax is real. A small team running Kubernetes spends maybe a third of its engineering time on Kubernetes. That is a third of a team you do not have.

Microservices. A second service is something you split off when a clear, durable boundary has emerged in the codebase, and when the cost of the split (separate deploys, separate databases, async communication, distributed tracing) is justified by the benefit (independent scaling, team boundaries, language choice). At three engineers, that benefit is almost never there. Stay monolithic. The split-it-up exercise is module 7 territory and we will do it properly.

A separate cache layer (Redis as a cache). Postgres has a buffer pool. Most “I need a cache” conversations at small scale are actually “I have a missing index” conversations. Add the index first, measure, and only add Redis if you can name the specific query that is hot enough to warrant a separate layer.

A separate queue (Redis, RabbitMQ, Kafka). As above: Postgres-as-queue is excellent up to thousands of jobs per minute. Adopt a real queue when you have a real reason, not before.

A search engine (Elasticsearch, OpenSearch, Meilisearch). Postgres full-text search is genuinely good and handles maybe 80% of the cases people reach for Elastic for. Start there; reach for the dedicated engine only when relevance scoring or multi-tenant search at scale forces it.

Read replicas. Until your single primary is sweating, a read replica is operational complexity for no benefit. Add it when pg_stat_activity tells you reads are crowding out writes, not before.

A separate analytics database (Snowflake, BigQuery, ClickHouse). Until the analytics queries are actively interfering with the production workload, run them on the primary; when they do start to interfere, a Postgres read replica is the next step, well before a warehouse. The day you have a 4 TB warehouse and a data team, fine. That day is later than you think.

Service mesh, sidecars, distributed tracing fabric. All useful in their place. Their place is far in the future for this shape.

The principle behind all of these is the same: defer until the load justifies it. Architecture is the management of complexity, and the cheapest complexity is the one you didn’t add yet. Adding a piece of infrastructure is a one-way door in many teams; once it is in production, removing it is a project.

The teams I have most admired, working on systems with real users and real revenue, have all run something that looked very much like this lesson’s diagram for far longer than the conference talks suggest is possible. They scaled vertically, they tuned queries, they added indexes, they kept the operational surface tiny, and they spent the saved engineering time on the product. That is the boring, profitable shape.

The next module starts the journey from this single-machine architecture to the systems that genuinely outgrow it. Lesson 7 introduces the C4 model in depth so we have a shared vocabulary for drawing these systems as they get more complex. Lesson 8 is the first scaling step: when one machine stops being enough, what comes next, and how to do it without losing the simplicity that made the first version work.

For now, if you take one thing from this lesson: the smallest architecture that solves your problem is almost always the right one. You can always add complexity later. You can almost never take it out.