Idempotency, exactly-once, at-least-once, at-most-once

This is the last lesson of Module 2. It is also, in my opinion, the most useful one to internalise, because the idea at its centre is the architectural tool that, more than any other, makes distributed systems behave. The idea is idempotency. The detour we have to take to get there starts with a vocabulary problem: the words “at-least-once,” “at-most-once,” and “exactly-once” are used loosely in the industry, often interchangeably, almost always wrong.

When a vendor’s slide deck says “exactly-once delivery,” what they usually mean is something narrower than the words suggest. When an engineer says “we are using at-least-once,” they may or may not have thought through what duplicates do to their consumer. When a textbook says “exactly-once is impossible,” it is correct in a specific sense and misleading in another. This lesson is the careful reading of those words.

The four flavours of delivery

Three of these are about what the messaging layer promises. The fourth is about what the application layer does with whatever the messaging layer hands it.

At-most-once. The sender ships the message and forgets about it. If it gets lost in transit, dropped at the broker, or the receiver was offline, the message is gone forever. There are no retries. Implementation is trivial: it is just sending. This is the cheapest semantics. It is appropriate for telemetry that is allowed to have gaps (UDP-style metrics, log lines, sampled traces), for fan-out broadcasts where the loss of one consumer’s update is not catastrophic, for anything where the next message will overwrite this one anyway. It is wrong for anything that involves money, state changes the user can see, or audit trails.

At-least-once. The sender ships the message, waits for an acknowledgement from the broker, and retries until it gets one. The broker, similarly, holds the message until it has been acknowledged by the consumer. If a network hiccup causes an ack to be lost in transit even though the message was processed, the sender (or broker) does not know that, retries, and the consumer sees the message a second time. Duplicates are not the exception, they are the steady state. Any reliable messaging system that is not specifically engineered for exactly-once is at-least-once. Kafka, RabbitMQ, SQS, NATS JetStream, every cloud queue you have probably used: at-least-once by default.

Exactly-once. The dream. Each message is delivered to the consumer exactly one time. No drops. No duplicates. As an end-to-end guarantee across an arbitrary network, with arbitrary failures, this is impossible. We will see why in a moment. As a property within a closed system with cooperative components, it is achievable, and several products advertise it.

Idempotent processing. The fourth flavour is the architectural answer to the messaging-layer trilemma. Instead of asking the messaging layer to deliver each message exactly once, you accept at-least-once delivery and design the consumer so that processing the same message a second time has no additional effect. The application layer absorbs duplicates. The system, end to end, behaves exactly-once even though no individual layer guarantees it.

Why exactly-once at the messaging layer is hard

The shortest correct argument for why exactly-once is impossible across an unreliable network is the Two Generals Problem. Two generals on opposite hills want to coordinate an attack at dawn. The only way to communicate is by messengers who must cross the valley between them, and the valley is contested. Any messenger may be captured. The first general sends a message: “attack at dawn.” She does not know if it arrived. The second general, if he received it, sends back an ack. He does not know if the ack arrived. The first general, if she received the ack, can send an ack of the ack. And so on. There is no finite number of acks at which both generals can be certain the other knows the plan.

The proof is a one-line induction: any finite protocol can be cut at the last message, and the sender of that last message cannot know whether it arrived. Translated into messaging terms: a sender cannot know, with certainty, whether its message reached the receiver, unless the network is reliable. The network is not reliable. Therefore, the sender either retries (risking duplicates) or does not (risking loss). Pick one.

Exactly-once delivery would require the sender, the broker, and the receiver to agree, with certainty, on whether each message was processed. That requires consensus across the three of them on every message. Consensus across an unreliable network is possible (Paxos, Raft) but expensive, and even then, only within the cooperating cluster. The moment your “consumer” is something the messaging system does not control, like a third-party payment gateway, exactly-once across the boundary is back to being impossible.

What “exactly-once” usually means in marketing copy

Vendors do ship exactly-once features, and they are not lying, but the scope is narrower than the phrase suggests. The most-cited example is Kafka exactly-once semantics, which Confluent has written about extensively.

What Kafka actually delivers is exactly-once within the Kafka ecosystem. Specifically, a producer can write a batch of messages, a consumer can advance its offset, and these two operations can be performed atomically inside a Kafka transaction. The result is that, as long as your consumer’s only output is to write more messages back to Kafka topics, and as long as your consumer commits its offsets through the same transactional API, no message is processed twice and no message is lost.

The two important caveats are in the previous sentence. First, “within Kafka.” If your consumer’s output is a write to Postgres, or a call to Stripe, or a row in S3, the Kafka transaction does not extend to those sinks. The consumer might process a message, write to Postgres, fail before committing the offset, and on restart re-process the message and re-write to Postgres. The Postgres write happens twice. Kafka’s exactly-once does not save you. Second, “consumer commits its offsets through the same transactional API.” The default consumer commit is not transactional, and most consumer code in the wild does not use the transactional version. The exactly-once guarantee requires a deliberate, code-level opt-in.

Kafka Streams does extend this further, because Kafka Streams is itself a Kafka-only consumer-producer pipeline, and the transactional commit covers the whole topology. Inside a Kafka Streams application, exactly-once is real and useful. The moment your topology has a sink that is not Kafka, you are back to needing idempotency at the sink.

The same shape applies to other “exactly-once” claims. Apache Flink’s exactly-once is exactly-once across Flink’s checkpointed state. Google Pub/Sub’s exactly-once is exactly-once within a single subscriber session. Each of these is meaningful inside a boundary. None of them eliminates the need for idempotent processing at the application’s external sinks.

Idempotency: the architectural answer

Idempotency means an operation can be applied many times with the same effect as applying it once. Setting a variable to 5 is idempotent. Incrementing a variable is not. Sending an email saying “your password has been reset” is not, although the user-visible damage of duplicates is small. Charging a credit card 100 euros is emphatically not, and the user-visible damage of a duplicate is a refund, an angry email, and possibly a chargeback.

The job of the idempotent consumer is to take an at-least-once stream and process each message, deduplicating by some identity, so that the externally observable effect is exactly-once. There are four common patterns for doing this, and they often combine.

Idempotency keys are the canonical pattern for HTTP APIs. The client generates a unique identifier per logical request and sends it in a header (Stripe uses Idempotency-Key, the convention is widely copied). The server, before processing, checks whether it has seen this key before. If yes, it returns the same response it returned the first time, without re-processing. If no, it processes the request, stores the key together with the response, and returns. The semantics are: the same key produces the same effect and the same response, no matter how many times it is sent.

The client is responsible for choosing keys that are unique per logical request. A common shape is to generate a UUID v4 at the moment the user clicks “pay” and reuse it across all retries of that payment until either the request finally succeeds or the user explicitly starts a new attempt. A common bug is to generate a fresh key on every retry, which defeats the point.

Upserts with natural keys are the database-side equivalent. Instead of “insert this row” the consumer issues “insert this row, but if a row with the same (order_id, line_id) already exists, do nothing.” Postgres calls this INSERT ... ON CONFLICT DO NOTHING. SQL Server calls it MERGE. MySQL calls it INSERT ... ON DUPLICATE KEY UPDATE. The key insight is that the table has a unique constraint that captures the logical identity of the operation, and the database enforces idempotency for you.

Operations that are inherently idempotent are the easiest case. Setting a state to a specific value (X = 5) is idempotent regardless of how many times you do it. Setting “this booking is now in state ‘paid’” is idempotent. Incrementing a counter is not, but in many cases you can rephrase the increment as a set (“this booking now shows the cumulative-paid amount of 100 euros”) and recover idempotency. When you have a choice in how to model the operation, prefer the idempotent shape.

Idempotent transforms in stream processing are the version of this for analytics pipelines. A windowed aggregate (count of events per minute, sum of amounts per hour) re-run on the same input produces the same output. As long as the pipeline is deterministic, reprocessing the same input yields the same result, and overwriting yesterday’s hourly totals with today’s recomputation is harmless. Stream-processing engines lean on this hard. The combination of “deterministic transform” plus “overwriting sink” is functionally idempotent end to end.

A worked example: an idempotent payment endpoint

The naive design double-charges on retry. Imagine an HTTP endpoint:

POST /payments
Body: { amount: 100, card: "..." }

The client posts. The server charges the card via the gateway, returns 200 OK with the new payment record. The client’s response gets lost in transit because of a flaky mobile connection. The client retries. The server charges the card again. Two charges. One angry user.

The idempotent design adds an Idempotency-Key header. The client generates the key at the moment the user clicks “pay” and reuses it for retries. The server, on receiving the request, looks up the key in a small store (Redis, or a dedicated table). If the key is new, it locks the key, processes the payment, stores (key, response), and returns. If the key already has a stored response, the server returns the stored response without re-charging. If the key is locked but has no response yet (a previous request is still in flight), the server either waits or returns a “still processing” response that the client can poll.

sequenceDiagram
    participant Client
    participant Payment as Payment Service
    participant Store as (key, response) store
    participant Gateway as Card Gateway
    Client->>Payment: POST /payments [Key: K]
    Payment->>Store: get K
    Store-->>Payment: not found
    Payment->>Store: lock K
    Payment->>Gateway: charge card 100
    Gateway-->>Payment: txn 8821, success
    Payment->>Store: write (K, response)
    Payment-->>Client: 200 OK { txn: 8821 }
    Note over Client,Payment: response lost in transit
    Client->>Payment: POST /payments [Key: K] (retry)
    Payment->>Store: get K
    Store-->>Payment: response { txn: 8821 }
    Payment-->>Client: 200 OK { txn: 8821 }

Notice three things about this design.

First, the deduplication store has to be durable. If it is in memory and the server restarts, the next retry will look like a new request and you have lost the property you were paying for. Redis with persistence is acceptable. A Postgres table is better. An in-memory cache is wrong.

Second, the key has to be stored before the side effect, not after. If you charge the card and then write the key, a crash between the two leaves you with a charge that has no idempotency record, and the next retry will charge again. The right pattern is: lock the key, perform the side effect, write the response, release the lock. Locking can be a row-level pessimistic lock or an INSERT with a unique constraint that fails on duplicates.

Third, the key store has a retention policy. You cannot keep idempotency keys forever. A common choice is twenty-four hours, which covers any reasonable client-retry window. The store grows at the rate of unique requests per day and stays bounded.

Putting it all together

The architectural prescription that falls out of this lesson is short. Use the messaging layer for at-least-once delivery, because at-least-once is the only thing the layer can honestly promise across an unreliable network. Make every consumer idempotent, by one of the patterns above. Treat exactly-once claims with skepticism: identify the boundary inside which the claim holds, identify the sinks outside that boundary, and make those sinks idempotent on their own.

The cost of this design is a small per-message overhead: a lookup in the deduplication store, the storage cost of the keys, and the discipline of generating unique keys at the right layer. The benefit is that the system behaves correctly under retries, broker failures, network partitions, and the kind of failures you cannot anticipate, because the application layer no longer cares whether a given message arrived once or five times.

What Module 2 has been about

Six lessons ago we started Module 2 with the eight fallacies of distributed computing. The arc of the module has been: every property you took for granted in a single-process, single-machine world becomes a question the moment you stop being on one machine. The network is not reliable. Latency is not zero. The clocks do not agree. CAP forces you to choose between consistency and availability when the partition arrives, and the partition will arrive. Strong consistency is expensive when you can have it, and often you cannot have it. Distributed transactions cost more than you expect, and 2PC’s coordinator-failure mode means you should reach for Sagas instead. And, today, exactly-once delivery is a fiction at the messaging layer that becomes a fact at the application layer through idempotency.

If you came into Module 2 hoping for the techniques that make distributed systems easier, the news is mixed. The techniques exist, and we have covered them, but the headline is that distributed systems do not become easier; they become tractable. The way they become tractable is that you internalise a small number of patterns (idempotent operations, sagas, eventual consistency, careful boundaries between local and global state) and apply them everywhere, by default, until they are no longer techniques and are just the way you write code that talks to other machines.

The rest of the course builds on this foundation. Module 3 gets concrete about the data layer: replicas, pooling, sharding, partitioning. Module 4 gets concrete about queues and workers. Module 5 returns to the service-decomposition question with the distributed-systems vocabulary now in hand. None of those modules makes sense without the foundations of Module 2. With them, the rest of the course is a long sequence of “given these constraints, here is the pattern that solves it.” That is what architecture is. Welcome to Module 3.

Citations and further reading

Stripe API documentation, “Idempotent requests”, https://stripe.com/docs/api/idempotent_requests (retrieved 2026-05-01). The canonical write-up of the idempotency-key pattern as a public API contract.
Confluent, “Exactly-Once Semantics Are Possible: Here’s How Kafka Does It”, https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/ (retrieved 2026-05-01). The original Kafka exactly-once announcement, with careful scoping of what the guarantee covers.
Jim Gray and Andreas Reuter, “Transaction Processing: Concepts and Techniques” (Morgan Kaufmann, 1992). The reference text on transactions, idempotency, and the design of reliable systems. Old but still the best treatment of the underlying ideas.
Tyler Treat, “You Cannot Have Exactly-Once Delivery”, https://bravenewgeek.com/you-cannot-have-exactly-once-delivery/ (retrieved 2026-05-01). A short, sharp essay that walks through the Two Generals Problem in messaging terms.