The OTT sector’s increasingly urgent search for ways to radically reduce end-to-end live streaming latency has unleashed an outpouring of ground-breaking innovations along with a great deal of confusion over what goals need to be met, how to meet them and how long it will take to get there with some assurance of profitability.
The good news is that, in light of the latest developments, whether a service provider is looking for incremental or vast gains in latency reduction, the wait times have been slashed. Now the challenge is to make sure aspirations aren’t limited by lack of knowledge about what can be done.
The state of confusion over what’s doable with reasonable expectations of net-positive bottom-line outcomes was on abundant display at the recent International Broadcasting Convention in Amsterdam. Sorting through the avalanche of claims and counterclaims calls for a breadth of knowledge about developments that’s hard to sustain in today’s decision-making crunch.
At IBC there were more vendors touting highly scalable Web Real-Time Communications (WebRTC) streaming solutions at sub-500ms latencies than typically seen at trade shows, including ANT Media, Ceeblue, Dolby Millicast, nanocosmos, Red5 and Wowza. (See our report on the full lineup of WebRTC platforms suited for mass audience streaming.) At the same time, industry executives spent a substantial amount of time in conference sessions and private meetings exploring a wide range of initiatives aimed at reaching new latency benchmarks through solutions that, unlike WebRTC platforms, can operate alongside the dominant Hypertext Transfer Protocol (HTTP) streaming infrastructure.
In the latter cases, the generally accepted rule of thumb is that the solutions with the shortest waits for commercial availability fall well short of the WebRTC performance marks. But over time, the story goes, there’s at least one HTTP-oriented roadmap, known as Media over QUIC (MoQ), that promises to leverage the existing infrastructure with a software overlay that achieves real-time streaming without the forklift upgrades required by WebRTC.
As its name implies, MoQ relies on the IETF QUIC standard, originally developed by Google as Quick UDP Internet Connections. But while QUIC has become the transport underpinning HTTP/3, and with it the third version of HTTP-based streaming, MoQ uses QUIC in ways that are incompatible with HTTP streaming.
The consensus narrative surrounding WebRTC and MoQ as the near-term and future vehicles for real-time streaming was jolted when a purportedly ready-to-deploy solution came out of nowhere offering an HTTP-based shortcut to streaming multi-directionally to mass audiences at sub-500ms latencies. The possibility was brought to light unannounced during IBC by Ceeblue, a small Dutch company known, in part, for its role as a WebRTC platform supplier.
Beating Broadcast Latency
Setting aside the Ceeblue development, which we explore at length elsewhere, perceptions of costs and just how much latency reduction is really necessary are what generally divides the WebRTC and HTTP camps. For example, Comcast technology fellow Alex Giladi made clear during an IBC session that when it comes to overcoming the lag time between broadcast and streamed sports reception, he sees no need for WebRTC. “The difference between what you can get out of adaptive streaming and out of WebRTC is large but not large enough to justify wholesale change,” Giladi said.
The wholesale change Giladi referred to stems from the fact that WebRTC is a peer-to-peer communications protocol stack based on the Real-time Transport Protocol (RTP) used in internet voice and video communications. Streaming platforms built on WebRTC require distributed deployments of intelligently orchestrated cloud resources operating outside the HTTP domain to reach massive scales.
The 1.8-second latency Comcast achieved over the dominant HTTP Live Streaming (HLS) and MPEG DASH streaming modes, working with BT and a handful of vendors in an IBC-endorsed “Accelerator” initiative, met the group’s goal of showing it’s possible to outperform broadcast latencies while streaming sports in 4K to mass audiences. As Giladi described it, the accomplishment, anticipating in part what the MPEG DASH development community is attempting with forthcoming 6th edition updates to the standard, was achieved through reductions in latency contributions from encoding, packaging, buffering and other points in the distribution chain.
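Giladi’s description amounts to a latency budget exercise: shave contributions at each stage of the chain until the sum undercuts broadcast. A minimal sketch of the arithmetic, using illustrative stage values that are assumptions rather than Comcast’s published figures, shows how the pieces can add up to the 1.8-second mark:

```python
# Hypothetical latency budget for a low-latency HLS/DASH chain.
# Stage values are illustrative assumptions, not Comcast's published figures.
BROADCAST_LATENCY_S = 5.0  # typical broadcast glass-to-glass delay (assumed)

pipeline_s = {
    "capture_and_encode": 0.6,
    "packaging": 0.2,       # short CMAF chunks instead of whole segments
    "cdn_propagation": 0.3,
    "player_buffer": 0.7,   # far smaller than classic HLS's multi-segment buffer
}

total_s = sum(pipeline_s.values())
print(f"End-to-end streaming latency: {total_s:.1f}s")    # 1.8s
print(f"Beats broadcast: {total_s < BROADCAST_LATENCY_S}")
```

The point of the exercise is that no single stage dominates; cutting the total requires trimming every stage at once.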
But the cost-benefit calculations are quite different for service providers who are looking for what WebRTC platforms are delivering. These are the latencies required to support a wealth of next-generation applications infused with video-enabled interactions among participants, such as watch parties, game shows, live sales presentations, online casino gambling and microbetting, multiplayer game playing, distributed esports competitions, photorealistic immersive XR social experiences and much else.
Media over QUIC vs. WebRTC
In a video interview captured at IBC, Akamai’s cloud technology chief architect Will Law, who’s deeply involved in the IETF’s MoQ initiative, said, “If you need streaming at those latencies now, WebRTC meets your needs today.” But he advised those who don’t see an immediate need for real-time connectivity to bide their time, with the understanding that the MoQ standard will one day enable real-time multidirectional streaming without the drawbacks he cited in connection with WebRTC.
Like WebRTC, MoQ is what Law describes as a “publish-subscribe” protocol that avoids the request-response method used with HTTP streaming, where a one-hour video streamed over HLS or DASH entails thousands of requests for short sections of video. “The beauty of a publish-subscribe system is that, as a subscriber I ask you, can you send me the 720P feed of a certain codec, and you say, yes, and you just send it to me as a constant stream of data,” Law explained.
“Publish-subscribe is a good fit for live distribution,” he added, “which includes real-time conversations like web conferences, but also includes live sports, betting, gambling, auctions – all those lower latency applications.” But, unlike WebRTC, “MoQ will be a layer that a CDN like Akamai and other CDNs can build support for, much like we offer an HTTP CDN today.”
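The contrast Law draws can be sketched in a few lines of Python; the class and method names here are purely illustrative, not drawn from any real MoQ or HTTP library:

```python
# Conceptual sketch of request-response vs. publish-subscribe delivery.
# All names are illustrative, not from any real streaming library.

class HttpOrigin:
    """Request-response: the player must ask for every short segment."""
    def __init__(self, segments):
        self.segments = segments

    def get(self, index):              # one round trip per segment request
        return self.segments[index]

class PubSubOrigin:
    """Publish-subscribe: subscribe once, then data is pushed continuously."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback, track="720p"):
        self.subscribers.append((track, callback))

    def publish(self, track, frame):
        for t, cb in self.subscribers:
            if t == track:
                cb(frame)

# HTTP-style: thousands of requests accumulate over an hour-long stream.
http = HttpOrigin(["seg0", "seg1", "seg2"])
pulled = [http.get(i) for i in range(3)]

# Pub-sub style: one subscription, frames arrive without further requests.
received = []
origin = PubSubOrigin()
origin.subscribe(received.append, track="720p")
for frame in ["f0", "f1", "f2"]:
    origin.publish("720p", frame)
print(received)  # → ['f0', 'f1', 'f2']
```

The asymmetry is the point: in the first model the request count scales with stream duration, while in the second a single subscription stands in for all of them.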
WebRTC, for all its advantages as a protocol supported by the major browsers, encompasses a “complex bundle of black boxes” that restricts service providers’ operational flexibility, Law said. By that he meant that the modes of transport, security and compression “are all baked in” as part of the highly integrated protocol stack.
For example, the only video codecs currently supported by WebRTC are H.264, VP8 and VP9, although the specifications stipulating what browsers must do to qualify as supporting WebRTC require support only for H.264 and VP8. VP9 can be used with WebRTC in the limited number of browsers, including Google Chrome, that support both.
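In practice the codec question reduces to an intersection between what a sender offers and what WebRTC stacks can carry. The following sketch hard-codes the lists named above; the offered set is a made-up example:

```python
# Illustrative check of an offered codec list against the WebRTC support
# picture described in the article; the offered set is a made-up example.

MANDATORY = {"H264", "VP8"}   # required of every conforming WebRTC browser
OPTIONAL = {"VP9"}            # usable only where both endpoints support it

def negotiable(offered):
    """Codecs from this offer that current WebRTC stacks can carry."""
    return sorted(set(offered) & (MANDATORY | OPTIONAL))

print(negotiable(["H264", "VP9", "AV1"]))  # AV1 dropped: not in today's lineup
```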
As to Law’s reference to baked-in security, this has to do with the fact that the WebRTC standard includes support for a DRM-like encryption system known as the Secure Real-time Transport Protocol (SRTP). Setting aside any technical deficiencies SRTP might have compared to leading DRM systems, some of which have been mitigated with tweaks from WebRTC platform suppliers, a service provider’s reliance on SRTP doesn’t satisfy commonly implemented content licensing policies that require use of DRM, which can be a serious liability.
However, various providers of WebRTC streaming platforms have integrated extensions that work around some of these restrictions. For example, at least two, Red5 and Phenix Real Time Solutions, have integrated their platforms with DRM supplier castLabs to enable WebRTC workflows embellished with built-in multi-DRM protection that covers all the commonly prescribed bases, including Google’s Widevine, Microsoft’s PlayReady and Apple’s FairPlay.
And, in the case of compression, there’s significant progress toward bringing H.265 and even AV1 support into the WebRTC platform codec lineups. One supplier executive, speaking on background, said his company expects to announce its support for H.265, also known as High Efficiency Video Coding (HEVC), by year’s end.
Tunable Latency
Law also noted that “the WebRTC transport protocol stands by itself,” meaning, among other things, that content streamed over WebRTC can’t be cached. In contrast, caching support provided by MoQ affords service providers the flexibility to choose between delivering a stream at ultralow latency or adding a little latency to maximize quality. By allowing a small amount, maybe a second’s worth, maybe more, of video to be cached at the edge, MoQ users will be able to create buffers that prevent fluctuations in internet operations from blocking transmissions, Law said.
This is an added protection targeting disruptions that aren’t overcome by the basic way QUIC works to mitigate the routine packet dropping that occurs with streaming across internet router hops. QUIC, with reliance on the User Datagram Protocol (UDP) transport mode, avoids the high levels of buffering required to accommodate dropped-packet retransmissions employed with the dominant mode of streaming over the Transmission Control Protocol (TCP).
Instead, with QUIC “I have the ability to set up many parallel streams within each stream,” Law said. “With these streams next to each other, if the head-of-line of one of them is blocked, all the other streams continue to flow. So as long as I can parallelize the transmissions of the data I’m sending, I can have very efficient transmissions. It doesn’t all queue up behind the one that gets lost.”
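Law’s head-of-line point can be illustrated with a toy model, an assumption-laden simplification rather than real QUIC code: a single ordered stream stalls on any loss, while independent parallel streams keep flowing around it:

```python
# Toy model of head-of-line blocking (an illustration, not real QUIC code):
# a loss blocks everything behind it within one ordered stream, but has no
# effect on packets queued in other, independent streams.

def deliverable(queues, lost):
    """Return the packets each stream can deliver now, given a set of losses."""
    out = {}
    for stream, packets in queues.items():
        delivered = []
        for p in packets:
            if p in lost:       # head-of-line loss blocks the rest of THIS stream
                break
            delivered.append(p)
        out[stream] = delivered
    return out

# One TCP-like ordered stream: a single loss blocks all later packets.
single = {"tcp": ["a1", "b1", "a2", "b2"]}
print(deliverable(single, lost={"b1"}))    # {'tcp': ['a1']}

# Two independent QUIC streams: the loss stalls only stream 'b'.
parallel = {"a": ["a1", "a2"], "b": ["b1", "b2"]}
print(deliverable(parallel, lost={"b1"}))  # {'a': ['a1', 'a2'], 'b': []}
```

In the parallel case only the stream carrying the lost packet waits for retransmission, which is the efficiency Law describes.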
But there are perturbations in internet transmissions where the disruption is no longer about a dropped packet on a sub-stream moving over a particular router hop. In such cases, there’s a diminution in quality that can’t be avoided by QUIC mechanisms alone. This is where the MoQ caching mechanism comes into play.
“With real time, you’re going to get perturbations in throughput,” Law said. “That’s a fact of life with the internet, especially over mobile connections, less so over fiber, but they occur. And when your latency is very low, you have no buffer in your player to protect the player from these perturbations. So you’ll see that as either a stutter or a rebuffering. It’ll be a very brief one, but it will still be there.”
If real-time delivery is the top priority, people will have to live with these occasional glitches. But if top-of-the-line quality is the priority, then the alternative provided by MoQ caching might be the better option. With MoQ, “we can make available different latency regimes on the same stream,” Law said.
In other words, with MoQ, caching is a tunable option that can be adjusted to network conditions. The distributor “can choose to play behind live, which is very attractive,” Law said. “This allows apps like real-time conversations to run in sync.” The caching option can be turned on when needed, leaving uninterrupted flows to transmit in real time.
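The tunable-latency idea reduces to choosing how far behind the live edge each viewer plays; the regime names and buffer depths in this sketch are assumptions for illustration, not values from the MoQ specifications:

```python
# Sketch of "different latency regimes on the same stream": the same live
# edge can be played at different offsets. Regime names and buffer depths
# below are illustrative assumptions, not values from any specification.

REGIMES_S = {
    "real-time": 0.0,   # play at the live edge, accept occasional stutter
    "low":       1.0,   # ~1s of edge caching smooths throughput perturbations
    "quality":   3.0,   # deeper buffer, broadcast-like robustness
}

def playback_position(live_edge_s, regime):
    """Return the stream timestamp to play for the chosen latency regime."""
    return live_edge_s - REGIMES_S[regime]

live_edge = 120.0  # seconds into the event
for regime in REGIMES_S:
    print(f"{regime:>9}: play at t={playback_position(live_edge, regime):.1f}s")
```

Because every regime draws on the same cached stream, a distributor can move a session between them without re-encoding or restarting delivery.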
Critically, MoQ, like WebRTC, will be supported in the major browsers. Law said work underway in the Web Transport group of the Worldwide Web Consortium (W3C) will produce JavaScript APIs that let browsers use the QUIC-based WebTransport layer, over which the Media over QUIC Transport (MoQT) protocol under development at the IETF will run.
There’s a lot to be done before the MoQ vision can become a commercial reality. “By 2025 or 2026 at the latest, IETF will finish MoQT,” Law said. “And then we need specifications defining how media works over MoQT.” Widescale adaptation of CDN operations will then need to follow, along with work on aspects related to advertising and DRM, he added.
Meanwhile, WebRTC practitioners are racing ahead with enhancements aimed at strengthening the case for jumping into real-time interactive streaming sooner rather than later. Stay tuned for an update on some of the latest head-turning advances on that front.