Data & System Architecture, from the ground up Lesson 24 / 80

Polyglot persistence: when to mix

When your application benefits from multiple databases, when one is enough, and the operational cost of running four data stores instead of one.

This is the final lesson of Module 3. Over the last eight lessons we have walked through the major categories of data store: relational, document, key-value, columnar, graph, time-series, vector. Each category solves a class of query that the others handle clumsily. Each category has a healthy ecosystem of products. And each one, taken on its own, is tempting to add to your stack the moment you encounter a workload it would handle well.

This lesson is the answer to “should I add another database?” and the answer is, surprisingly often, “no, but read on.” The term for the design pattern of running multiple specialised data stores in one application is polyglot persistence, and it has spent the last fifteen years swinging between fashionable and regretted. Where it sits in 2026, and how to decide whether it is right for you, is the topic.

Where the term came from

Martin Fowler coined “polyglot persistence” in 2011 in a short bliki post (https://martinfowler.com/bliki/PolyglotPersistence.html). The argument was straightforward: relational databases had been the universal default for thirty years, but the rise of specialised stores meant that an application could pick the right tool for each subset of its data. Use the relational database for transactional data, a key-value store for sessions, a search engine for full-text queries, a graph database for relationships, a column store for analytics. Each part of the system uses the storage that fits, and the whole composes into something more performant than any single store could be.

The diagram in Fowler’s post became iconic: an e-commerce app with seven different boxes, each labelled with a different database, all connected to the same application layer. It looked elegant. It looked like what 2011 wanted the future to be.

The years that followed taught us that the diagram was missing the most important box: the team. Every additional store is another thing your team has to learn, monitor, back up, upgrade, and debug at three in the morning, and those costs do not add linearly. They compound.

The promise: a perfect tool for each job

The case for polyglot persistence is the case I have been making implicitly throughout Module 3. The query patterns are different. A graph traversal is awkward in SQL. Full-text search is slow without an inverted index. Time-series aggregations want a column store. Sessions want a low-latency in-memory key-value lookup. Vector search wants HNSW. If you build all of these on top of a single relational database, you are doing things the database is not optimised for, and you eventually hit a performance ceiling that the right tool would have lifted easily.

In a perfect world, you would have one database per query pattern, each tuned to its job, each scaling independently. The application would be a thin layer that routes each request to the appropriate store. New requirements would mean a new database. The architecture would compose neatly.

That world exists, in the slide deck. In production, it has costs that the slide deck does not show.

The reality: every store doubles your operational surface

Each database you add is an order-of-magnitude commitment, not a percentage one. You acquire, with the new store: a backup and restore strategy that someone has actually tested; a monitoring story with metrics, thresholds, dashboards, and pager rules; an upgrade plan, because major versions break things and the database you adopted at version 5 will be at version 9 in three years; an on-call training burden, because when the database misbehaves at 3am somebody has to know it well enough to diagnose it; a disaster-recovery story with measured RTO and RPO; and a synchronisation problem, because data that lives in two places drifts.

The rough heuristic I use is: each new persistence system roughly doubles the on-call surface for the team that owns it. Two databases is twice the work of one. Three is four times. The exponent is not precise but the direction is. A small team with three databases is doing more operational work than the same team with one, and that work is taken from the time they would otherwise spend shipping features.

The 2026 default position: start with one Postgres

What has changed since Fowler wrote his post is how much a single modern relational database can do. Postgres in 2026 is not the Postgres of 2011. The same instance that serves your transactional workload can also handle, competently:

  • Full-text search, via tsvector, tsquery, and GIN indexes. Performance is good into the millions of documents, and the search is transactional with the underlying data, which Elasticsearch cannot offer.
  • Document storage, via jsonb. The operators (->, ->>, @>, ?) and GIN indexes on JSON paths give you a document database inside Postgres, with the relational schema for the parts you want structured and the JSON for the parts that vary.
  • Vector search, via the pgvector extension covered in the previous lesson: HNSW indexes, cosine and L2 and dot-product distances, hybrid queries that combine vector similarity with relational filters.
  • Time-series, via TimescaleDB or native partitioned hypertables. Hypertables, compression, continuous aggregates, retention policies. Competitive with the dedicated stores from lesson 22 for most workloads.
  • Graph traversal, via the Apache AGE extension, which adds Cypher-style queries on top of Postgres tables. For simpler cases, recursive CTEs (WITH RECURSIVE) get you most of the way.
  • Geographic queries, via PostGIS: the gold-standard open-source GIS database, running as a Postgres extension.

That list means that, in 2026, the question “should I add a search engine, a vector database, a time-series database, and a graph database” has the same answer for many teams: not yet. Run the workloads on Postgres first. When you outgrow Postgres for one of them specifically, extract that one. The rest stay where they are. This is the same lesson as Module 1’s “start simpler” theme, applied to data infrastructure: the defaults of 2026 are much more capable than the defaults of 2011, and the architectural prescription has to update accordingly.

Patterns where polyglot is unavoidable

There are workloads where Postgres genuinely is not enough and the second database earns its keep.

Search at scale. Postgres tsvector is good up to a point. Above tens of millions of documents, with complex relevance ranking, faceted navigation, autocomplete, multi-language analyzers, and millisecond p99 latency, Elasticsearch (or OpenSearch) pulls ahead. If your application is centrally about search, Elasticsearch is correct. If search is a side feature, Postgres is correct.

Caching at scale. Redis is the right answer for hot keys, session storage, leaderboards, pub/sub, rate limiting: anywhere you need single-digit-millisecond access to small data with TTLs and you do not want every read hitting the primary. Redis serves an order of magnitude more requests per second than Postgres on the same hardware and gives you the right primitives. Adding Redis as the second store is the most common and most justified polyglot move.

Analytics at scale. Running heavy aggregation queries against your transactional Postgres makes the OLTP workload suffer. Ship the data into a column store (BigQuery, Snowflake, ClickHouse, DuckDB for smaller scales) and run analytics there. The transactional database returns to its job. Lesson 19 covered this case in full.

Event sourcing. When the log of events is itself the source of truth, Kafka (or a similar log) is the system of record. The relational database becomes a derived view, rebuilt from the log as needed. Module 4, lesson 42 picks this up.

In each case, the second database does something Postgres genuinely cannot do well, and the operational cost is paid willingly because the alternative is worse.

The keep-them-in-sync problem

Once data lives in two places, you own a synchronisation problem. The user changes their email in the primary; the cache, the search index, and the analytics warehouse all have the old value. Each needs to be updated, each update can fail, each system has its own consistency window. Module 4, lesson 46 covers the solution patterns (change data capture and the outbox pattern) in depth. The two shapes worth naming now:

  • Dual writes: when the application writes to the primary, it also writes to the cache, the search index, and the analytics queue. Easy to implement, easy to get wrong, and almost always ends with the secondaries drifting. Avoid except for the simplest caches with short TTLs.
  • Change data capture (CDC): the primary emits a stream of every change (Postgres logical replication, Debezium, or similar) and downstream consumers read it and update their stores. The primary remains the source of truth, the secondaries are derived, and the synchronisation is idempotent (lesson 16).

If you are adopting a second store and the answer to “how does it stay in sync” is “we will dual-write,” budget time for the rewrite. You will end up at CDC.

A worked example: an e-commerce app with three stores

Concrete scenario. An e-commerce platform with three stores picked for the right reasons:

  • Postgres as the primary system of record: orders, products, users, inventory, payments. ACID on the things that need it.
  • Redis as the cache: session storage, hot product pages, rate-limit counters, pub/sub for real-time updates between processes.
  • Elasticsearch as the catalog search index: the full product catalog with multi-language analyzers, faceted navigation, autocomplete, typo tolerance.
flowchart LR
    User[User request] --> App[Application]
    App -->|writes| PG[(Postgres - source of truth)]
    App -->|reads hot keys| R[(Redis - cache)]
    App -->|search queries| ES[(Elasticsearch - search)]
    PG -->|CDC stream| ES
    PG -.->|invalidate| R

Writes go to Postgres. A CDC stream feeds Elasticsearch, which stays current within seconds. Redis is populated lazily on read, with invalidation on Postgres write. Reads of transactional data go to Postgres, hot pages go through Redis, search queries go to Elasticsearch.

The pros: each store does what it is good at. Catalog search is fast and rich, session lookups are sub-millisecond, transactional writes are ACID, none of the stores is being abused. The cons: three databases, three on-call rotations, three backup strategies, three monitoring dashboards. When search results look stale the team has to decide whether the bug is in Postgres, in the CDC pipeline, or in Elasticsearch. When Redis goes down the team has to know whether the fallback to Postgres can absorb the load. The CDC pipeline is itself a system to operate.

Three failure modes worth naming. Cache stampede: a popular item’s cache entry expires, a thousand concurrent requests miss, all hit Postgres, Postgres falls over. Mitigate with single-flight. CDC lag: the stream to Elasticsearch falls behind, search shows deleted products or hides new ones. Monitor and alert on the lag. Cache-write split-brain: the Postgres write succeeds but the cache invalidation fails, the cache serves stale data until TTL. Short TTLs and tolerate the staleness.

This is a reasonable polyglot architecture. Each store earns its place. The application is meaningfully better than on Postgres alone, but only because the workload genuinely requires the specialisation and the team has invested in the operational maturity to run three systems.

The closing recommendation

The instinct to add another database almost always feels productive. The new store will solve the specific problem in front of you today. The benefits are concrete. The costs are abstract and do not show up until later, when you are running four backup systems and three on-call rotations and your team is doing more operational work than feature work.

Be skeptical of “we should add database X.” The default answer is “have we exhausted Postgres first?” Look at jsonb, at pgvector, at TimescaleDB, at PostGIS, at AGE. Most of the time, one of them does the job well enough that the second database is unnecessary. When the answer is genuinely “no, Postgres cannot do this at our scale,” then add the second store, with eyes open. Pay the cost willingly because the workload requires it, not because the diagram looked elegant.

The teams I have seen succeed with polyglot persistence are the ones that arrived there reluctantly. Each store was added because a measured pain pushed them to it. The teams I have seen struggle are the ones that started with the elegant diagram and then spent two years operating it.

What Module 3 has been about

Eight lessons ago we started Module 3 with the relational model and the question of what makes it the default. We walked through the alternatives: documents, key-value, columnar, graph, time-series, vector. Each solves a problem the relational model handles awkwardly, and each costs something in return. The module’s arc has been about giving you the vocabulary to evaluate any database choice, including the one you already have.

Three things to remember. First, “the right database” is a function of query patterns and operational constraints, not fashion. Second, Postgres in 2026 does much more than its 2011 reputation suggests, and the default of “use Postgres until you have measured that you cannot” is correct more often than the diagrams suggest. Third, every additional database is a real cost: polyglot persistence is a tool to be used reluctantly and deliberately, not a goal in itself.

Module 3 is done. Module 4 turns to the question that has been deferred all module: how do you scale any of these database choices horizontally, when the single-server ceiling is the bottleneck. Replication, partitioning, sharding, the consistency models from Module 2 applied to real storage layouts. The patterns are the same regardless of which database you picked, and they are the next foundation.

Citations and further reading

  • Martin Fowler, “PolyglotPersistence”, 2011, https://martinfowler.com/bliki/PolyglotPersistence.html (retrieved 2026-05-01). The original post that named the pattern.
  • Pramod Sadalage and Martin Fowler, “NoSQL Distilled” (Addison-Wesley, 2012). The book-length treatment of the polyglot argument, written when it was the optimistic case.
  • TimescaleDB documentation, https://docs.timescale.com/ (retrieved 2026-05-01). The reference for the time-series-on-Postgres path.
  • Apache AGE documentation, https://age.apache.org/age-manual/master/index.html (retrieved 2026-05-01). The Cypher-on-Postgres extension.
  • PostGIS documentation, https://postgis.net/documentation/ (retrieved 2026-05-01). The spatial extension that has been the gold standard for two decades.
  • pgvector, https://github.com/pgvector/pgvector (retrieved 2026-05-01). The vector-search extension covered in lesson 23.
  • Postgres documentation, “Full Text Search”, https://www.postgresql.org/docs/current/textsearch.html (retrieved 2026-05-01). The reference for the tsvector/tsquery machinery.
Search