What it actually takes to stream a Cricket World Cup final to 780 million viewers

On March 8, 2026, JioHotstar recorded 82.1 crore (821 million) peak concurrent viewers during the T20 World Cup final between India and New Zealand.

More people were watching a single live stream than the entire population of Europe. Before the first ball was even bowled, 2.1 crore viewers were already connected just to watch Ricky Martin perform at the opening ceremony.

I keep coming back to that number. 821 million simultaneous streams. Not page views. Not total viewers across the day. Simultaneous.

And the stream didn't crash.

How?

The traffic pattern that breaks everything

Most applications grow traffic gradually. You launch a feature, run some ads, traffic ticks up over weeks. You have time to notice problems and fix them.

Live cricket doesn't work that way.

Viewership during the T20 World Cup final escalated like this:

  • 2.1 crore at the pre-match entertainment
  • 6.5 crore at the first ball
  • 43.9 crore by the end of India's innings
  • 44.3 crore during the innings break
  • 74.5 crore when the last New Zealand wicket fell (19th over, India win by 96 runs)
  • 82.1 crore during the trophy ceremony

That kind of traffic ramp would melt most systems.

The 2023 World Cup final between India and Australia peaked at 59 million concurrent viewers on Hotstar. The India vs Pakistan match in the same tournament hit 35 million. The 2024 T20 World Cup final drew 53 million. Each of those was, at the time, a world record for live streaming.

The March 2026 final shattered all of them, coming in at roughly 14x the 2023 final's peak.

The engineering challenge isn't just "handle a lot of traffic." It's "handle 10x more traffic than you've ever seen, arriving in a window of minutes, with zero tolerance for failure because the whole country is watching."

How JioHotstar's infrastructure actually works

The platform runs on AWS, with Akamai as the primary CDN partner and CloudFront as a secondary option. The backend architecture has evolved significantly over the years, and the numbers behind it are worth spelling out.

The raw infrastructure numbers:

  • 16 TB of RAM and 8,000 CPU cores on the cloud setup
  • Peak data transfer speeds of 32 Gbps
  • Can scale to more than 4,000 Kubernetes worker nodes during peak matches
  • 500+ AWS c4.4xlarge and c4.8xlarge instances for load testing alone, running at 75% utilization
  • c4.4xlarge instances carry ~30 GB of RAM each; c4.8xlarge, ~60 GB

Before the 2023 World Cup, Hotstar was running on two large, self-managed Kubernetes clusters built with kOps. Those clusters hosted more than 800 microservices handling everything from video playback and personalization to chat, analytics, and ad insertion.

Every microservice had its own AWS Application Load Balancer routing traffic through NodePort services to pods. The request path looked something like:

Client → CDN → ALB → NodePort → kube-proxy → Pod

A lot of hops.

As the platform grew, the team ran into real limits with this setup. Adding a new microservice meant provisioning another load balancer. Troubleshooting a latency issue meant tracing through multiple layers. Costs scaled linearly with service count, not traffic.

The engineering team eventually migrated to Amazon EKS and adopted a service mesh to reduce those routing layers. But what really matters is the scaling strategy.

Predictive scaling, not reactive scaling

This surprised me when I dug into it. JioHotstar doesn't primarily rely on auto-scaling in the traditional sense.

AWS Auto Scaling Groups have a limitation when you need to add hundreds of nodes in seconds because instances get added in batches. That step-based approach introduces too much latency when millions of users show up in a two-minute window.

Instead, the team uses what they call "ladder-based scaling": pre-defined infrastructure tiers per million concurrent users.

Here's how it works:

  • Before a match starts, they pre-warm the infrastructure to a baseline, pre-provisioning a minimum number of EC2 instances so the system is already hot when viewers arrive
  • They then scale up along the ladder as concurrency climbs
  • The platform maintains a concurrency buffer of ~10 million concurrent users
  • If demand exceeds that buffer, new containers spin up and the application starts in "a few seconds" (according to the team's public talks)
  • Their scaling application handles graceful scale-down when demand drops, so they're not paying for idle capacity after the match ends
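The ladder can be caricatured in a few lines. This is a hypothetical sketch, not JioHotstar's actual configuration: the tier thresholds and node counts are invented for illustration, but the shape — a pre-defined lookup plus a concurrency buffer, instead of reactive step scaling — matches the description above.

```python
# Hypothetical sketch of ladder-based scaling: capacity comes from a
# pre-defined tier table keyed on concurrency, not from reactive step
# scaling. Thresholds and node counts are invented for illustration.

LADDER = [
    # (concurrency ceiling, worker nodes provisioned at this rung)
    (5_000_000, 400),
    (20_000_000, 1_200),
    (50_000_000, 2_500),
    (100_000_000, 4_000),
]

BUFFER = 10_000_000  # headroom for ~10 million extra concurrent users


def target_nodes(current_concurrency: int) -> int:
    """Return the node count of the smallest rung that covers current
    concurrency plus the safety buffer."""
    needed = current_concurrency + BUFFER
    for ceiling, nodes in LADDER:
        if needed <= ceiling:
            return nodes
    return LADDER[-1][1]  # already at the top rung
```

Note the effect: `target_nodes(12_000_000)` lands on the 50-million rung, so capacity arrives in whole rungs rather than trickling in node by node — which is the point when millions of users show up in a two-minute window.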

They've also built AI models that predict traffic based on match importance, teams playing, and historical patterns. An India vs New Zealand final gets a different pre-warm profile than a league match between mid-table teams.

That prediction work happens days before the event, not minutes.
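To make "different pre-warm profile" concrete, here is a deliberately crude stand-in for that prediction step. The real system is an ML model trained on historical patterns; every factor below is an invented assumption, shown only to illustrate the mapping from match features to a pre-warm baseline.

```python
# Toy stand-in for a traffic-prediction model: the pre-warm baseline is
# scaled up by match importance. All factors are invented assumptions.

def prewarm_baseline(is_final: bool, india_playing: bool,
                     historical_peak: int) -> int:
    """Crude pre-warm sizing from match features."""
    factor = 1.0
    if india_playing:
        factor *= 2.0  # India fixtures dwarf neutral matches
    if is_final:
        factor *= 1.5  # knockouts outdraw league games
    return int(historical_peak * factor)
```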

The microservices that matter

Lots of companies say "we run microservices." The specifics matter more than the label.

The 800+ microservices on JioHotstar's platform handle distinct responsibilities:

  • Video playback (the most critical)
  • Recommendation engines
  • Analytics pipelines
  • Ad delivery systems
  • Chat features
  • Scorecard updates
  • Authentication services

All run independently.

During the 2023 World Cup prep, the engineering team analyzed how frequently different features refreshed during live matches. Scorecard updates and "watch more" suggestions didn't need to update every second. By slightly reducing the refresh rate for non-critical features, they cut total network traffic without any visible impact on the viewing experience.

This is a useful principle that applies well beyond cricket streaming. When you're under pressure, figure out which services actually need real-time updates and which ones can tolerate a slightly stale cache. The savings add up fast.
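A minimal sketch of that idea, with feature names, base intervals, and multipliers all invented for illustration:

```python
# Per-feature refresh intervals, stretched for non-critical features
# under load. Names and numbers are illustrative assumptions only.

NORMAL_INTERVALS = {
    "video_heartbeat": 1,   # seconds; critical path, never slowed
    "scorecard": 2,
    "watch_more": 10,
    "recommendations": 30,
}

LOAD_MULTIPLIER = {"scorecard": 3, "watch_more": 6, "recommendations": 10}


def refresh_interval(feature: str, under_load: bool) -> int:
    """Slow down shed-able features when the platform is under load;
    leave critical features untouched."""
    base = NORMAL_INTERVALS[feature]
    return base * LOAD_MULTIPLIER.get(feature, 1) if under_load else base
```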

The "jettison strategy"

The team also built what they call a "jettison strategy": essentially a panic mode where non-essential features get disabled entirely to preserve resources for the core video stream.

If the system is on the edge of capacity, things like personalized recommendations and social features get turned off so that the video keeps playing.

The philosophy is simple: nobody cares about content suggestions when the last over is being bowled.
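One plausible shape for a jettison mode is an ordered shed list applied tier by tier as utilization climbs. The thresholds and feature names below are assumptions, not JioHotstar's actual configuration; the one faithful detail is that video playback is deliberately never on the list.

```python
# Hedged sketch of a jettison ladder: features are shed tier by tier as
# capacity utilization climbs. Thresholds and names are assumptions.

SHED_ORDER = [
    (0.80, {"recommendations"}),
    (0.90, {"recommendations", "chat", "watch_more"}),
    (0.95, {"recommendations", "chat", "watch_more",
            "scorecard_animations"}),
]


def disabled_features(utilization: float) -> set:
    """Return the features to disable at a given utilization (0..1).
    Video playback is never a candidate for shedding."""
    shed = set()
    for threshold, features in SHED_ORDER:
        if utilization >= threshold:
            shed = features
    return shed
```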

CDNs and why they matter more than you think

Over 82% of global internet traffic is now video, according to Cisco's projections. When you're streaming to hundreds of millions of people simultaneously, you simply cannot serve all that traffic from a central origin. The math doesn't work.

JioHotstar uses a multi-CDN strategy:

  • Akamai as the primary provider (4,100+ edge locations, handles 15-30% of all web traffic globally)
  • CloudFront as secondary

Instead of every viewer's device connecting back to JioHotstar's origin servers in AWS, the video content gets distributed to edge nodes around the world (and heavily within India).

A viewer in Chennai connects to an edge server in Chennai. A viewer in Delhi connects to one in Delhi. The origin servers handle the initial stream encoding and distribution to edge nodes, but the actual delivery to viewers happens from servers that are geographically close.

How multi-CDN traffic steering works

  • The platform steers traffic based on real-time round-trip time and availability metrics
  • If Akamai's performance degrades in a region, traffic shifts to CloudFront for that region
  • A 2024 Forrester study found that multi-CDN strategies reduce rebuffering incidents by ~34% compared to single-CDN setups
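The steering logic above can be sketched in a few lines. The provider names come from the article, but every metric, the 0.99 availability cutoff, and the 1.2x stickiness factor are invented for illustration:

```python
# Simplified sketch of per-region CDN steering on RTT and availability.
# Thresholds and the stickiness factor are illustrative assumptions.

def pick_cdn(metrics: dict) -> str:
    """metrics: provider -> {"rtt_ms": float, "availability": float}.
    Prefer the primary (Akamai) unless it is degraded or clearly slower;
    otherwise pick the healthy provider with the lowest round-trip time."""
    healthy = {name: m for name, m in metrics.items()
               if m["availability"] >= 0.99}
    if not healthy:
        return "akamai"  # fail static to the primary
    best_rtt = min(m["rtt_ms"] for m in healthy.values())
    if "akamai" in healthy and healthy["akamai"]["rtt_ms"] <= 1.2 * best_rtt:
        return "akamai"  # stickiness: don't flap over small RTT deltas
    return min(healthy, key=lambda n: healthy[n]["rtt_ms"])
```

The stickiness margin matters in practice: without it, a one-millisecond RTT difference would bounce a region between providers and churn edge caches.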

Edge-level security

  • Short-TTL tokens enforce entitlements at the CDN edge, blocking unauthorized streams before they hit the origin
  • Cache keys encode content ID, rendition, codec, audio track, subtitle track, and device class
  • This allows targeted cache invalidation without blowing away the entire cache
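Both mechanisms fit in a short sketch. The key material, field layout, and 30-second TTL below are assumptions for illustration; the pattern — an HMAC-signed token the edge can verify without calling origin, plus a cache key that encodes every dimension affecting the bytes served — is the one described above.

```python
# Minimal sketch of edge-level security: a short-TTL HMAC token checked
# at the CDN edge, and a per-variant cache key. Details are assumptions.
import hashlib
import hmac
import time

SECRET = b"shared-with-edge"  # in practice rotated, and per-CDN


def mint_token(content_id: str, device: str, ttl_s: int = 30) -> str:
    expiry = int(time.time()) + ttl_s
    payload = f"{content_id}:{device}:{expiry}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def verify_token(token: str) -> bool:
    """Edge-side check: valid signature and not yet expired."""
    payload, _, sig = token.rpartition(":")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(sig, expected)
            and time.time() < int(payload.rsplit(":", 1)[1]))


def cache_key(content_id: str, rendition: str, codec: str,
              audio: str, subtitle: str, device_class: str) -> str:
    """One key per variant, so a single bad rendition can be purged
    without flushing every other variant of the stream."""
    return "/".join([content_id, rendition, codec, audio, subtitle,
                     device_class])
```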

Observability at this scale is its own problem

You can't fix what you can't see. Everyone says this. But at the scale of 800+ microservices across thousands of nodes, "seeing" what's happening is hard in a non-obvious way.

The real-time data pipeline:

  • Apache Kafka acts as the nervous system, handling millions of events per second (user clicks, playback events, error logs, system metrics)
  • Apache Flink processes those streams into near-real-time features, counters, and alerts
  • The platform processes terabytes of telemetry data in real time

The observability stack covers three layers:

  • Metrics — CPU utilization, request latency, pod health (Prometheus)
  • Logs — application events and errors (Loki)
  • Distributed traces — following a request as it moves across services (OpenTelemetry, Grafana)
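To make the Kafka-to-Flink step concrete, here is a toy version of the kind of aggregation Flink performs in this pipeline: a tumbling one-second window counting playback errors per region. The event shape is an assumption for illustration.

```python
# Toy stand-in for a Flink tumbling-window aggregation: count playback
# errors per (second, region). Event shape is an illustrative assumption.
from collections import defaultdict


def tumbling_error_counts(events):
    """events: iterable of {"ts": epoch seconds, "region": str, "type": str}.
    Returns {(window_second, region): error_count}."""
    counts = defaultdict(int)
    for e in events:
        if e["type"] == "playback_error":
            counts[(int(e["ts"]), e["region"])] += 1
    return dict(counts)
```

In production this runs continuously over the stream rather than a list, but the windowing idea is the same: collapse millions of raw events per second into a handful of counters an alert can fire on.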

Why live sports observability is different

If your e-commerce site has a latency spike at 2 AM, you can investigate in the morning.

If your live cricket stream has a latency spike during the final over, you have seconds to detect it and maybe a minute to fix it before it becomes a national conversation.

The incident response playbooks for live events are pre-written, specific, and rehearsed. Teams run through scenarios before every major match.

Chaos engineering and breaking things on purpose

JioHotstar and other large streaming platforms use chaos engineering: deliberately injecting failures into the system to see how it responds.

  • Shut down a server
  • Introduce network latency
  • Kill a microservice
  • See what breaks

The idea came from Netflix (they famously built "Chaos Monkey" to randomly terminate instances in production), and it's now standard practice at scale.

JioHotstar's testing approach:

  • Runs load tests simulating 50 million concurrent users before major tournaments
  • Uses Gatling and Flood.io to simulate realistic traffic patterns
  • Simulates the burst pattern you'd see when a wicket falls and millions of people simultaneously refresh their streams

The testing models account for different user types:

  • First-time visitors going through signup
  • Returning users logging in
  • People switching between features
  • The sudden spike when something dramatic happens on the field
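The user-type mix above is the part most load tests skip. As a sketch, in the spirit of those scenarios (journey names and weights are assumptions, not JioHotstar's actual Gatling or Flood.io plan):

```python
# Sketch of a weighted behavior mix for a load test. Journey names and
# weights are illustrative assumptions.
import random

JOURNEYS = {
    "signup_first_visit": 0.10,
    "returning_login": 0.55,
    "feature_switching": 0.30,
    "wicket_spike_refresh": 0.05,
}


def sample_journeys(n: int, seed: int = 42) -> list:
    """Draw n simulated-user journeys with the configured weights."""
    rng = random.Random(seed)
    names = list(JOURNEYS)
    return rng.choices(names, weights=[JOURNEYS[k] for k in names], k=n)
```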

If the system survives all of that in testing, the team has reasonable confidence it will hold during the actual event.

Reasonable confidence. Not certainty. Nobody has certainty at this scale, and anyone who says otherwise is selling something.

What the numbers actually look like, in sequence

Here's how concurrent viewership records on the platform have progressed:

Year   Event                                  Peak concurrent viewers
2019   World Cup (India vs New Zealand)       25.3 million
2023   IPL final (on JioCinema)               32 million
2023   World Cup (India vs Pakistan)          35 million
2023   World Cup semi-final (India vs NZ)     53 million
2023   World Cup final (India vs Australia)   59 million
2025   Champions Trophy final                 ~61 million
2026   T20 WC semi-final (India vs England)   65 million
2026   T20 WC final (India vs New Zealand)    821 million (82.1 crore)

The jump from 65 million to 821 million in the 2026 final is hard to fully explain.

Part of it is that JioHotstar (post-merger of JioCinema and Disney+ Hotstar) now has a combined user base and free mobile streaming. Part of it is that India was playing a World Cup final at home in Ahmedabad's Narendra Modi Stadium. And part of it is that the engineering team had been building toward this moment for years, incrementally learning from each previous peak.

I'd also guess that the measurement methodology changed somewhat post-merger, though I haven't been able to confirm that.

What this means if you're not streaming cricket

Most companies will never need to handle 821 million concurrent users. But the bones of JioHotstar's architecture translate to any system that has to deal with unpredictable load.

  • Pre-warm your infrastructure based on predicted demand. Don't wait for auto-scaling to react. The few minutes of lag can be the difference between a smooth experience and a total outage.
  • Build the ability to shed non-critical features under load, and test that shedding mechanism before you actually need it.
  • Use multiple CDN providers so a single provider isn't a single point of failure.
  • Invest in observability before the crisis that makes you wish you had it.
  • Run chaos experiments regularly, not just before big events, because systems degrade gradually and the failure modes you find today may not be the same ones that bite you next quarter.

And maybe most importantly, scale testing should simulate realistic user behavior, not just raw request volume. A million users all hitting the same endpoint is a different problem than a million users doing different things at different times.

The engineering that keeps a cricket stream running is invisible by design. When it works, nobody talks about it. When it fails, it's on the front page. That's the gig.
