A few years ago I almost shipped a streaming pipeline that the business didn’t actually need. It was going to take a quarter, cost a small fortune in Kafka and Flink infrastructure, and require two new on-call rotations. I had the diagrams. I had the buy-in. I was a week away from kicking off the first sprint.
Then a colleague said, “can you write up a one-pager so I can sanity-check it?”. I said sure, easy, I’ve got the whole thing in my head. Three hours later I had a half-finished document with a section called “options considered” and an embarrassing realisation: the batch alternative was about ninety percent as good for ten percent of the cost. The streaming pipeline died on a Notion page that afternoon, and I have been a believer in writing things down ever since.
That document is what most engineering orgs call an RFC, sometimes a design doc, sometimes a tech spec. The name does not really matter. What matters is the act of writing one before you commit to a decision. An RFC is, genuinely, the cheapest way to find out you are wrong.
The structure that actually works
There are a hundred templates floating around the internet. Most of them are fine. The one I keep coming back to has six sections, in this order, because the order itself is doing work.
Context. What is the situation today? What is broken or missing or about to break? Two or three paragraphs, no jargon you have not already defined, and pretend the reader has been on holiday for a year. If you cannot describe the problem clearly, you do not understand it well enough to solve it.
Goal. One sentence. What does the world look like after this is done? “Reduce p95 dashboard load time from 12 seconds to under 3 seconds for the top 20 dashboards” is a goal. “Make things faster” is a wish.
Non-goals. This is the section everyone skips and everyone regrets skipping. List the things you are explicitly not solving. “We are not redesigning the semantic layer.” “We are not addressing the auth flow.” This protects you from scope creep in the comments and saves the reviewer thirty minutes of “but what about?” thinking.
Options considered. At least two, ideally three. For each one, describe how it would work in enough detail that you could start implementing tomorrow. Cost, risk, time, what changes for users, what changes for on-call. The first option is usually the obvious one. The second is the cheap one. The third is the wildcard you almost did not include but is sometimes the right answer.
Recommendation. Pick one and say why. Do not hedge. If you cannot pick, the document is not done.
Trade-offs and open questions. What are you giving up by picking this option? What are you not sure about? Where do you need help? This section signals confidence calibration: a reviewer who sees “open questions: none” knows the author has either thought of everything or thought of nothing, and it is usually the latter.
That is it. Six sections. The shortest RFC I ever wrote was about four hundred words and saved a week of work. The longest was nine pages and saved a quarter.
Writing it down is the value
Here is the thing nobody tells you when they hand you a template: the document is half the point. The other half is what happens in your head while you are writing it.
When you have an idea in your head, it is a fog. You can move through the fog, you can feel its shape, you can have a strong opinion about it. The moment you try to put it on a page, the fog has to commit to specific words, and the contradictions become visible. “We will use Postgres” sounds great until you write the next sentence and realise you have not decided whether it is the same Postgres as the OLTP one or a separate replica or a logical follower, and each of those has different operational implications, and now you have to actually think about it.
I have killed at least four bad ideas this way. Not because anyone reviewed the doc. Because the act of writing forced me to confront the parts I had been waving my hands at. The streaming-vs-batch story above is one of them. Another was a “let’s just add a column” change that turned into a multi-table migration the moment I started writing down what “just” meant.
If your RFC is not making you uncomfortable in at least one section, you are probably not writing the hard parts.
One-pager or ten-pager?
The size of an RFC should match the blast radius of the decision, not the size of the work.
A one-pager is right when:
- The change is reversible. You can roll it back in an afternoon.
- It affects one team, one service, one pipeline.
- The cost of being wrong is hours, not weeks.
A new dbt model that aggregates an existing table? One-pager. A small Airflow DAG to backfill historical data? One-pager. Sometimes half a page is enough, just to get the rationale out of Slack and into a place where it can be linked later.
A ten-pager is right when:
- The change is hard to reverse. Once you cut over, you live with it.
- It affects multiple teams or rewrites a contract between services.
- It touches data residency, security, billing, or compliance.
- The cost of being wrong is months.
Picking a new orchestrator. Splitting the warehouse into multiple databases. Introducing a streaming layer where you used to batch. Anything with the word “platform” in it. These deserve real documents with real diagrams and real numbers, because the decisions inside them are going to outlive at least three of the people in the room.
The trap is writing a ten-pager for a one-pager problem (you waste a week and bury the actual decision under formatting) or a one-pager for a ten-pager problem (you ship something irreversible and the document does not survive its first contact with reality). Aim for “as long as it needs to be, and a paragraph shorter”.
Inviting feedback without designing by committee
The reason RFCs get a bad reputation is that some teams turn them into consensus engines. You publish the doc, twenty people add comments, every comment becomes a meeting, every meeting adds a section, and three weeks later you have a thirty-page document and a team that has shipped nothing. This is not RFC culture. This is RFC theatre.
A few rules that have served me well.
Name a decision-maker. The author or a tech lead. The doc is not a vote. The decision-maker reads the comments, weighs them, and decides. Disagreement gets noted, not necessarily accommodated.
Set a deadline. “Comments by Friday end of day, decision Monday morning.” Without a deadline, comments arrive forever. With one, the people who care show up, the people who do not stay quiet, and you can move.
Distinguish blocking from non-blocking comments. A blocking comment is “this option does not work because of X”. A non-blocking comment is “have you considered Y?”. The first must be addressed before merging. The second goes into the open questions section and gets resolved if and when it matters.
Read the comments, do not chase them. Multiple smart people will pull you in opposite directions. That is not a bug in the process; it is the process working. Your job is to weigh the arguments, not to make everyone happy.
I usually share an RFC in a single Slack channel with a one-line summary and a link, set the deadline in the message, and ping no more than three or four people directly. The right people self-select.
When to skip the RFC
Not every decision needs a document. The ones that do not:
- True one-day work where the cost of being wrong is one day.
- Urgent fixes during an incident. Write the postmortem instead.
- Decisions where the answer is genuinely obvious to anyone who looks (rare; if you are not sure, write the doc).
- Personal experiments and spikes. The RFC, if any, comes after the spike, not before.
The signal that you should have written one is when you find yourself in the third meeting about a change you thought was simple, having the same conversation with different people. That is the moment to stop, write a page, and link it in every future conversation.
A note on tools. I have used Notion, Google Docs, GitHub Discussions, and the actual Rust RFC repo over the years. They all work. The two things that matter are: comments tied to specific lines, and a stable URL you can link forever. If you have to choose, I lean Notion or Google Docs for collaboration speed, and a markdown file in the repo for anything that is going to outlive the team.
An RFC is a cheap way to find out you are wrong before you commit to being wrong expensively. The structure does not matter as much as the discipline of writing one at all. Pick a template, six sections is plenty, scale the length to the blast radius of the decision, set a deadline, name a decision-maker, and ship. You will be surprised how often the act of writing the document is the entire value, and how often the document never even needs to be read.