Real case: how Netflix runs daily batch on petabytes

Module 5 ends the way Module 4 ended: with a single company, a single decade, and public engineering posts that make the abstractions concrete. The Discord case in lesson 32 was one workload running on three databases over ten years. The Netflix case is structurally different. Netflix’s batch platform is not one workload; it is a platform that thousands of jobs and dozens of teams run on, every day, on hundreds of petabytes of data. The interesting part is not the data model. It is the platform itself: the orchestrator, the table format, and the cost-optimization layers that make daily batch at this scale economically possible.

Three pieces of the architecture are public enough to walk through. Maestro is the orchestrator. Apache Iceberg, which Netflix invented and open-sourced, is the table format. And a layered set of techniques (spot instances, autoscaling, caching, compaction) keeps the bill manageable. The lessons matter more than the specific tools.

The scale

Numbers from public posts and conference talks, as of 2024 to 2025. Netflix’s data platform stores hundreds of petabytes on Amazon S3, in Iceberg tables, covering viewing telemetry, content metadata, recommendations training data, A/B test outcomes, finance data, ad-tier telemetry, and operational logs of the streaming infrastructure.

On top of that storage, Netflix runs many thousands of batch jobs per day in Spark, Trino, Flink, and SQL workflows. Roughly three thousand people inside the company interact with the platform regularly. The platform is multi-tenant, with very different workloads and sensitivities to latency, cost, and reliability.

The annual data-platform spend is in the hundreds of millions of dollars. Individual cost-reduction projects have saved tens of millions each. At this scale, a one percent efficiency improvement is a real headcount of engineering time. Cost optimisation is not a side project; it is a discipline.

Maestro: the orchestrator

Every batch platform needs an orchestrator. Airflow, Prefect, Dagster, Argo, Temporal, and a dozen smaller players cover the open-source space. For most teams the right answer is to pick one off the shelf. Netflix did not.

Before Maestro, Netflix ran a homegrown system called Meson on top of Genie, and before that a series of internal tools. By the late 2010s, Meson was straining: the workflow definition model was hard to compose at scale, the scheduler had performance issues at thousands of concurrent runs, and the multi-tenancy story had grown organically rather than by design.

Maestro, announced and open-sourced in 2024 in the Netflix Tech Blog post “Maestro: Netflix’s Workflow Orchestrator” (https://netflixtechblog.com/maestro-netflixs-workflow-orchestrator-ee13a06f9c78, retrieved 2026-05-01), is the rewrite. Three properties stand out.

Workflow composition. Maestro models workflows as DAGs of steps, with first-class composition of workflows out of other workflows. When many teams share a platform, reusable building blocks need to compose without copy-paste and be versioned as their authors evolve them. Off-the-shelf orchestrators handle this with varying grace; at Netflix’s scale the gracefulness is load-bearing.

Multi-tenancy and isolation. A single Maestro instance runs workflows for many teams with different priorities. It is built to schedule fairly, isolate noisy neighbours, and apply per-team quotas without a separate deployment per team. The kind of property that is invisible until you need it and then becomes the entire reason the system exists.

Scheduler scale. Hundreds of thousands of workflow runs per day on a stateless scheduling tier with a durable state store and an event-driven design. Airflow at this scale needs careful tuning and meta-scheduling; Maestro is engineered for it from the start.

The honest reading: Netflix built Maestro because off-the-shelf options did not fit their specific scale, multi-tenancy, and culture. A defensible reason that does not generalise. Most teams should pick Airflow or Prefect or Dagster, configure them well, and put their engineering effort into the data. The Maestro story is here to illustrate the calculation, not the conclusion. Build-versus-buy at the platform layer is a real decision, and the threshold at which build wins is much higher than most teams think.

Iceberg: the table format

Apache Iceberg started at Netflix. Ryan Blue and Daniel Weeks, then Netflix engineers, designed it in 2017 to solve a problem that everyone running Hive at petabyte scale had: the Hive metastore approach to table metadata was breaking. The 2018 Strata talk and subsequent Netflix Tech Blog posts describe the failure modes in detail. The official documentation at https://iceberg.apache.org/ (retrieved 2026-05-01) covers the origin and the format.

The headline problems with Hive at scale, in Netflix’s experience:

No atomic writes. A Hive table is “the set of files in this S3 prefix that match this partition pattern”. A writer adding files and a reader listing files race against each other. At small scale this is invisible; at petabyte scale the race condition produces partially-written reads, dropped rows, and queries that return different answers depending on when they fired. Hive’s fix was conventions and discipline; Netflix needed correctness.

No working schema evolution. Renaming a column or changing a type in a Hive table was a manual exercise that often broke older readers, especially when the file format had its own opinions (Parquet column names, Avro schemas). Netflix had thousands of tables and hundreds of writers; the manual approach did not scale.

Slow metadata operations. Listing the partitions of a large Hive table required listing S3, which at Netflix’s scale was slow and expensive. Query planning that needed partition pruning was bottlenecked on the listing.

Iceberg makes the table itself a structured object: immutable data files, plus a metadata tree (manifest lists, manifests, snapshots) that records exactly which files belong to the table at each point in time. Writes produce new snapshots atomically. Schema evolution is recorded in metadata, so old and new readers see consistent views. Query planning reads metadata, not S3 listings, and is fast regardless of table size.

Netflix’s adoption was internal first (migrating petabytes of Hive tables over several years) and then external (donating the project to the Apache Software Foundation in 2018, picked up by Snowflake, Databricks, AWS, and Google).

This story matters for Module 5 as the cleanest example of “format choices matter at scale”. Hive worked for Netflix for years; the pain that drove the Iceberg investment was operational and visible only at petabyte scale. Once Netflix solved it, the industry inherited the solution. Netflix has been explicit, in talks, that broad adoption is a strategic win, not a competitive loss: more tools, more bug fixes, more interoperability, more engineers who already know the format.

The cost-optimization layers

Cost optimisation is the third pillar of the platform, alongside the orchestrator and the table format. The published posts cover several techniques that recur at smaller scales too.

Spot instances and autoscaling. Netflix runs the bulk of its Spark workloads on AWS Spot capacity, which is roughly 60 to 90 percent cheaper than on-demand at the cost of being interruptible. Spark is well-suited to spot because tasks are small and re-runnable and the scheduler tolerates losing executors mid-run. The autoscaling layer adds and removes capacity dynamically based on the queue of pending jobs, so clusters do not sit idle overnight and do not run hot during the morning peak.

Iceberg compaction. Streaming writes and frequent batch updates produce many small files in Iceberg tables, which hurt query performance (more file opens, more metadata, less efficient column scans). Compaction periodically rewrites small files into larger ones. At Netflix’s scale this is itself a significant batch workload, run by the platform team on behalf of all tenants. The cost of compaction is real; the cost of not compacting is larger.

Query result caching. Iceberg’s snapshot-based versioning makes “has the data changed” a cheap question (compare the snapshot ID), so cache invalidation is precise rather than time-based. A dashboard that runs the same query every five minutes can hit the cache for hours if the underlying data is rebuilt nightly.

Smart routing across engines. Spark for heavy ETL, Trino for interactive analytics, Flink for streaming. Cost characteristics differ, and routing a query to the right engine is itself an optimisation. Iceberg as a unified table layer lets the engine choice be made per-query.

The numbers are publicly fuzzy (Netflix does not publish its AWS bill), but conference talks include statements like “this initiative saved tens of millions per year” for individual projects. The order of magnitude is what matters: at petabyte scale, a one percent efficiency gain on a hundred-million-dollar spend pays for the engineers who built the optimisation many times over.

The architecture, in one diagram

flowchart LR
    subgraph Storage[Storage layer]
      S3[(S3)]
      ICE[Iceberg metadata]
      S3 --- ICE
    end
    subgraph Compute[Compute engines]
      SP[Spark]
      TR[Trino]
      FL[Flink]
    end
    subgraph Orch[Orchestration]
      M[Maestro]
    end
    subgraph Cost[Cost layers]
      SPOT[Spot autoscaling]
      COMP[Compaction]
      CACHE[Result caching]
    end
    subgraph Consumers[Consumers]
      BI[BI and dashboards]
      ML[ML training]
      BIZ[Business and finance]
    end
    ICE --> SP
    ICE --> TR
    ICE --> FL
    M --> SP
    M --> TR
    M --> FL
    SPOT --> SP
    COMP --> ICE
    CACHE --> TR
    SP --> Consumers
    TR --> Consumers
    FL --> Consumers

Diagram to create: a more visually polished version of the Mermaid diagram above, with the storage layer at the bottom, the compute engines in the middle, orchestration on the left feeding into compute, the cost-optimization layers as horizontal bands cutting across compute and storage, and the consumers on the right. The point of the diagram is to show that the platform is layered: storage is the floor, compute is the middle, orchestration drives compute, cost layers cut horizontally, consumers sit on top.

Iceberg-on-S3 is the floor that everything reads and writes. Spark, Trino, and Flink are the engines. Maestro schedules them. The cost layers cut horizontally. Consumers (BI, ML, business reporting) live on top.

What the journey teaches

Five lessons, in roughly the order Netflix encountered them.

Build your own orchestrator only when standard ones genuinely do not fit. Maestro exists because Airflow and Prefect, at Netflix’s scale and culture, were not the right fit. That calculation is real, and it is wrong for almost every other team. The threshold for “build your own platform tool” is higher than most engineering teams estimate, and the cost of getting it wrong (years of platform debt, hiring difficulty, a tool nobody outside the company knows) is large. Pick the orchestrator off the shelf, configure it well, and put the engineering effort into the data.

Format choices matter at scale, in ways invisible at small scale. Hive worked for Netflix for years. The pain that drove Iceberg was operational and only visible at petabyte scale. If you run a hundred-table warehouse on Snowflake or BigQuery, the format question is answered for you. If you run an open lakehouse with thousands of tables, the format question is the most consequential architectural choice on the platform, and Iceberg (or Delta or Hudi) is what the answer looks like in 2026. Lesson 37 covered the landscape.

Cost optimisation is its own discipline. At petabyte scale, even small percentages of waste are millions of dollars per year. It needs dedicated engineers, dedicated tooling (cost dashboards per workload, per table, per team), and a culture that treats efficiency as a first-class engineering value rather than a thing the finance team chases. Module 9 covers the patterns in detail. The Netflix story is proof the techniques (spot, autoscaling, caching, compaction) are applicable at much smaller scales than Netflix’s.

Open-source contributions strengthen the platform. Iceberg adoption by Snowflake, Databricks, AWS, and Google is the best thing that has happened to Netflix’s data platform in the last five years. More tools, more bug fixes, more engineers who know the format when Netflix hires them. The principle generalises: open-sourcing platform components is often the cheapest way to make them durable.

The platform is the product. For data engineers at Netflix, the platform is the thing they build, deliberately, with the same care you would put into a customer-facing product. The data engineers and scientists are the customers, and the platform team’s job is to make their work fast, correct, cheap, and pleasant. This framing survives the scale gap. Even a five-person data team has internal customers; treating the platform like a product changes the work.

What this means for systems that are not Netflix

If you run a Snowflake or BigQuery warehouse with a few terabytes and a handful of dbt models, the relevant lessons are the meta-lessons, not “use Iceberg”.

Pick an orchestrator off the shelf and revisit the choice only when it is genuinely not keeping up. Most teams never reach that point. Pick a table format you can live with for five years; if your warehouse manages it, the choice is made. Build a cost dashboard early: the habits you build at one terabyte are the habits you need at one petabyte. Treat your data platform as a product with internal customers, the cheapest of the lessons to adopt and the one with the largest leverage. The scale gap does not change the principle; it only changes the size of the team applying it.

Module 5 closes here

Module 5 walked through the modern batch data stack: ETL versus ELT (lesson 33), Hadoop and MapReduce (lesson 34), Spark and the engines that replaced MapReduce (lesson 35), data lake and warehouse history (lesson 36), the lakehouse and open table formats (lesson 37), idempotent batch jobs (lesson 38), backfilling and replay (lesson 39), and now the Netflix case study that exercises all of these at petabyte scale.

The patterns transfer; the scale does not. A team running a hundred-gigabyte warehouse and a team running a hundred-petabyte lakehouse use the same vocabulary, the same diagrams, the same playbooks. The difference is in the constants, not the shape.

Module 6 starts on streaming. The data is no longer at rest. The job no longer finishes. The trade-offs are different, but the Module 5 foundation is what stream processing reads and writes.

Citations and further reading

Netflix Tech Blog, “Maestro: Netflix’s Workflow Orchestrator”, 2024, https://netflixtechblog.com/maestro-netflixs-workflow-orchestrator-ee13a06f9c78 (retrieved 2026-05-01). The open-sourcing announcement and the architectural overview of Maestro.
Netflix Tech Blog, posts on Apache Iceberg and the Netflix data platform, indexed at https://netflixtechblog.com/ under the “Iceberg” and “data platform” tags (retrieved 2026-05-01). The series covers the origin of Iceberg, the migration from Hive, and the platform components built on top.
Apache Iceberg documentation, https://iceberg.apache.org/ (retrieved 2026-05-01). The canonical reference for the format, the metadata model, and the operational guidance on compaction and snapshot expiration.
Ryan Blue, “Iceberg: a fast table format for S3”, Strata Data Conference, 2018. The talk that introduced Iceberg to the broader community, with the failure modes of Hive at Netflix’s scale described in detail.
“Designing Data-Intensive Applications” (Martin Kleppmann, O’Reilly, 2017), chapters 10 and 11. The standard reference for batch and stream processing, against which the Netflix architecture is a concrete instantiation.