Even as providers of WebRTC-based streaming platforms respond to surging demand for scalable, reliable real-time video connectivity, that demand is accelerating progress on alternative solutions that could eventually achieve equivalent results.
One of the first commercially applied breakthroughs in that direction came to light at the NAB Show in early April courtesy of Eluvio, which has been making waves with several major M&E providers now streaming content over its Web3-based Content Fabric platform. Eluvio CEO and co-founder Michelle Munson shared clips of recent European Professional Club Rugby (EPCR) competitions, which she said Eluvio customer EPCR-TV had streamed live over HLS and DASH feeds using the new Bangkok release of the Content Fabric to achieve sub-second and even sub-half-second latencies at transcontinental distances.
This was one of several revelations we encountered at the Las Vegas Convention Center underscoring the likelihood that, as use cases for real-time interactive streaming (RTIS) take hold, the means of supporting them will continue evolving toward outcomes that seemed out of reach not so long ago. As reported elsewhere, streaming platform providers who rely on the Web Real-Time Communications (WebRTC) protocol revealed and, in some cases, demonstrated advances showing that RTIS over WebRTC is ready for prime time.
That’s still not the case with the alternatives to WebRTC. As things stand now, the only solution in that vein that’s on course to achieve the performance implied in our use of the RTIS terminology is the Internet Engineering Task Force’s Media over QUIC (MoQ) initiative, which we found to be making significant progress, albeit with a long way to go before consensus is reached on a new IETF standard.
Otherwise, while the other alternatives to WebRTC are getting closer to supporting a real-time experience within the bounds of human perception, on the order of 250 milliseconds between live action and streamed reception, there remains a significant gap that makes them inappropriate for many use cases. Moreover, as yet at least, none of these alternatives other than MoQ has been designed to accommodate the interactive aspect that’s vital to so many use cases, which requires that any number of recipients of a massively scalable real-time live streaming experience be able to generate video output that reaches any number of other users targeted by those transmissions in real time.
Significant Progress on Media over QUIC
Descriptions of the MoQ technology are provided on the UltraMedia Pipeline website in this article and this video, both of which feature one of the leading voices for MoQ, Will Law, the chief architect for edge technology at Akamai. We met again at NAB, where Law updated us on the intensifying interest surrounding the MoQ work groups pushing the complex framework to completion.
“We’re seeing a high level of activity in contributions to the standard,” Law said. “We’re very bullish on its potential.”
And he shared recent results from a test that showed just how close MoQ is to real-world implementation. The platform registered end-to-end latency of 500ms on a live-streamed motorcycle race captured in Zurich, produced in Los Angeles, and received by viewers back in Zurich. Law said 300ms of that latency was attributable to anti-jitter buffering, which had nothing to do with the time consumed by the transport itself, putting the result well within the range of any WebRTC performance we’ve seen registered at those distances.
The challenge now is to resolve all the contributed technical ideas, which number over 1,600, into a final set of drafts comprising the major MoQ components that can be circulated for approval as a standard. This cat-herding phase can drag things out when there’s so much interest in the details, but backers are still hoping to arrive at a sufficiently stable version the industry can work with by late 2026, even if formal approval takes longer.
But if what’s been attained so far is any indication, what comes next, rather than a prolonged battle between MoQ and WebRTC proponents, is likely to be a full-on transition to ubiquitous adoption of real-time multidirectional streaming over MoQ. Ultimately, the issue isn’t about protocols, noted Chris Allen, CEO of Red5, a leading provider of RTIS support over WebRTC. It’s about how to achieve the best results with whatever protocol or protocols can be used with minimum hassles and costs to get the job done.
“We’re paying close attention to MoQ and applaud what’s been accomplished,” Allen said. “Our strength rests with the architecture we’ve developed to support whatever use cases customers have for real-time streaming. We’ll adapt if MoQ turns out to be the better choice.”
Looking for a Sweet Spot Short of RTIS
The need for RTIS capabilities encompasses a vast range of applications in the M&E domain and beyond, impacting just about every type of online engagement involving live video. Participation in social media, watching and betting on sports and esports, how people interact with ads and online retailers, virtualization at high levels of verisimilitude in myriad uses of extended reality technologies, collaboration in design and engineering, orchestration of live multi-camera surveillance and emergency response operations, remote involvement in work and education – the list is endless.
But that doesn’t mean there isn’t an immediate opportunity for providers who can move the needle beyond current latency limitations with new platforms that operate in the space between RTIS and conventional streaming. As revealed by Eluvio and two other companies, Ceeblue and Dolby, who are touting ultra-low latency but less-than-ultimate solutions, it’s now possible to live-stream sports, esports and other content unidirectionally at sub-second latencies to mass audiences using the existing Hypertext Transfer Protocol (HTTP) infrastructure.
This represents a significant contrast with WebRTC, which requires instantiation of new infrastructure. The desire to get as close as possible to real-time transmission free of such disruptions is the motivating factor behind all these options, including MoQ, even though they all require use of new software, which in some cases must be implemented across servers and end devices at vast scales.
Providers of new HTTP-compatible solutions that are ready for commercial deployment say they’ve gone to these lengths because they see growing demand from one-way use cases that require better latency than conventional streaming can deliver, even if they don’t need the full capabilities of RTIS. For example, as reported in the previously cited UltraMedia Pipeline article, demand for support in the space between RTIS and traditional streaming is why Dolby has made the High Efficiency Streaming Protocol (HESP) it inherited with its acquisition of HESP developer THEO last year an option for instant activation on its OptiView multi-streaming platform.
HESP
The fully managed OptiView service implements streaming sessions on whatever streaming infrastructure Dolby uses to support the chosen streaming mode, including the CDN used for WebRTC that it acquired with its purchase of the Millicast RTIS platform in 2022 as well as the CDNs it uses for HESP and conventional streaming. Prior to the THEO buyout, HESP was heavily promoted by the HESP Alliance, which was founded in 2020 by THEO and Synamedia, who, with other alliance members, continuously refined the software stack as a standardized offering that any vendor or service provider willing to pay the usage royalties could deploy under their own brands.
While latencies in the 700-900ms range were claimed by the HESP Alliance, which has been silent since the Dolby acquisition but has retained its website, Dolby pegs latency at 1-2 seconds, which is what users of the technology have reported. But clearly, these users see value in such capabilities.
One is Luxembourg-based global cloud CDN operator GCore, which tells customers video can be streamed via HESP on its infrastructure with delays “that don’t exceed 2 seconds.” Sky Racing, an Australian provider of live-streamed horse and greyhound races with online betting support, says it is benefitting from enabling last-minute pre-race online wagering through use of HESP to reduce latency to “about one second.”
These are impressive improvements on the multi-second latencies common to other HTTP streaming modes, including Low-Latency HLS (LL-HLS) and streaming over HTTP/3, which, despite being QUIC-based, has no connection to MoQ. The HESP special sauce involves use of an Initialization Stream generating a string of Initialization Packets. These packets convey all the information essential to executing the start of a streaming segment, including decoder and DRM configurations, a full-picture I-frame, and a reference to where the next frame can be found at the end of each segment in the primary or “Continuation” stream.
There can be any number of Initialization Packets in the Initialization Stream, up to one for every frame in the video, which enables instant shifts to lower bitrates dictated by network conditions and allows users to switch to and start viewing a live stream instantly at any point in a webcast. By allowing instant startup of a new segment at any I-frame, the long buffering times incurred when packets are dropped in a previous segment can be avoided, thereby reducing overall streaming latency.
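To make the mechanism concrete, here is a minimal sketch in TypeScript of the kind of data an Initialization Packet carries and how a player might pick one when joining a live stream. The field and function names are illustrative assumptions based on the description above, not structures defined by the HESP specification.

```typescript
// Hypothetical sketch of the data an HESP-style Initialization Packet carries,
// based on the description above. Field and function names are illustrative,
// not taken from the HESP specification.
interface InitializationPacket {
  decoderConfig: Uint8Array;   // codec initialization data
  drmConfig?: Uint8Array;      // DRM/license setup data, if the stream is protected
  iframe: Uint8Array;          // a full-picture I-frame to begin decoding from
  nextFrameOffset: number;     // where the next frame sits in the Continuation stream
  presentationTimeMs: number;  // media timestamp of the I-frame
}

// Joining a live stream: pick the most recent Initialization Packet at or
// before the live edge, then continue playback from the Continuation stream.
function selectStartPacket(
  packets: InitializationPacket[],
  liveEdgeMs: number
): InitializationPacket | undefined {
  return packets
    .filter((p) => p.presentationTimeMs <= liveEdgeMs)
    .sort((a, b) => b.presentationTimeMs - a.presentationTimeMs)[0];
}
```

Because an Initialization Packet can exist for every frame, the same selection logic would apply when the player shifts to a lower bitrate mid-stream.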
WebRTS
Offering an alternative to HESP in this space is Ceeblue’s Web Real-Time Streaming (WebRTS) platform, which the company announced at IBC last year and subsequently solidified with a second version that is gaining traction commercially. Contrary to Ceeblue’s initial characterization of WebRTS as supporting latencies matching those of its WebRTC platform, officials we met with at NAB made clear that WebRTS is a one-way streaming framework built from the ground up to work with the HTTP streaming infrastructure at latencies in the 700+ms range. That can be cut to about 550ms when CDNs are upgraded with software that reduces the brief pre-session fragment caching delays typically used with live content to the shorter durations required by WebRTS flows.
The side-by-side 300ms latency performances of Ceeblue’s WebRTC and WebRTS platforms we observed at IBC, and reported on based on input from company officials at the time, were actually a function of the close proximity of the streaming source at Ceeblue’s headquarters in Spijkenisse, South Holland, to the IBC show floor at the RAI in Amsterdam, according to Ceeblue COO Lawton Cheney. Acknowledging that the messaging about WebRTS was not well-honed owing to a last-minute decision to announce the new development at IBC, Cheney also noted that, while WebRTS has the potential to operate multidirectionally, making that possible “is low on our roadmap right now.”
“WebRTC is better for the lowest latency and interactive streaming applications,” he said. “WebRTS scales better than WebRTC.”
The scaling comparison in terms of how many users can be reached by a live streaming session is debatable, depending on the capabilities of a given WebRTC platform. But, in terms of how easy it is to scale, the advantage goes to any HTTP-based platform.
Like HESP, WebRTS carves out a new space that backers believe will satisfy many streaming service providers’ ultra-low latency requirements at lower costs and with less disruption than they’d incur using WebRTC. Cheney and Ceeblue vice president of engineering Jonas Blötz described multiple use cases in sports betting, ecommerce and other areas where Ceeblue is finding a strong response for what WebRTS has to offer.
“We hit a sweet spot people have been talking about,” Blötz said. “We’re deploying it, and it works as expected.”
WebRTS, like HESP, relies on the signaling methods used with HTTP in communications between servers and clients, including the “pull” messages that clients, guided by downloaded manifest files, send to retrieve sequences of video frames. And both support a diminished role for the Transmission Control Protocol (TCP), which causes buffering delays to accommodate the resending of dropped packets.
But there are significant differences between the two, having to do in part with the fact that WebRTS “predicts” where the I-frames marking the start of successive groups of pictures (GOPs) will be by reading the packet header information about the incoming video flow that’s held by the HTTP server. This allows the WebRTS software layer to convey the precise segment lengths and I-frame locations to the client player, which affects the speed of playback in two important ways: the player is able to begin playback as soon as the next I-frame arrives, and it’s made aware of the sequence of secondary frames that can be exempted from TCP intervention.
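As a rough illustration of that server-side bookkeeping, the sketch below scans hypothetical packet headers for keyframe markers and announces each GOP’s I-frame location and skippable frame count to the player. The PacketHeader and GopAnnouncement shapes are assumptions made for illustration; Ceeblue has not published the actual interfaces.

```typescript
// Illustrative sketch: scan the headers of an incoming video flow for keyframe
// markers so each GOP's I-frame location and secondary-frame count can be
// announced to the client player ahead of time. All names here are assumptions.
interface PacketHeader {
  timestampMs: number;  // presentation time carried in the packet header
  isKeyFrame: boolean;  // set by the encoder/packager for I-frames
  byteOffset: number;   // position of the frame within the live flow
}

interface GopAnnouncement {
  iframeOffset: number;    // where playback can begin
  startMs: number;         // timestamp of the GOP's I-frame
  secondaryFrames: number; // P/B frames the player may treat as skippable
}

function announceGops(
  headers: Iterable<PacketHeader>,
  announce: (gop: GopAnnouncement) => void
): void {
  let current: GopAnnouncement | undefined;
  for (const header of headers) {
    if (header.isKeyFrame) {
      if (current) announce(current); // close out the previous GOP
      current = {
        iframeOffset: header.byteOffset,
        startMs: header.timestampMs,
        secondaryFrames: 0,
      };
    } else if (current) {
      current.secondaryFrames += 1;
    }
  }
  if (current) announce(current);
}
```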
TCP and the attendant buffering delay can be orchestrated to come into play only where the packets in question, typically I-frames, are most important to a given GOP, Blötz explained. Ceeblue, which had depicted this process as “partially reliable” streaming, now refers to this approach to controlling TCP as “adaptive frame skipping.” While the process has an impact on raw quality metrics, there’s no impact on perceived quality as measured in conventional ways, he said.
WebRTS employs transcoding that supports adaptive bitrate (ABR) switching, so that when congestion is causing unacceptable rates of packet loss, the transmission can shift to a lower bitrate, thereby minimizing the need for packet recovery and buffering via TCP. And because users can configure how adaptive frame skipping is applied, they can choose between prioritizing continuity with greater support from TCP and minimizing latency by skipping dropped secondary frames.
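A minimal sketch of how such a policy might look, assuming a simple two-mode configuration, follows. The option names, loss threshold, and ABR-ladder handling are hypothetical, not Ceeblue’s actual configuration surface.

```typescript
// Hypothetical adaptive-frame-skipping policy of the kind described above.
// Option names, the loss threshold, and the ladder handling are illustrative.
type SkipPolicy = "prioritize-continuity" | "prioritize-latency";

interface DroppedFrame {
  isKeyFrame: boolean; // I-frames anchor the GOP and are always worth recovering
  bitrateKbps: number; // bitrate of the rendition the frame belongs to
}

interface Decision {
  retransmit: boolean;   // recover the frame via reliable (TCP-style) delivery
  switchDownTo?: number; // optionally shift to a lower ABR rendition (kbps)
}

function handleDroppedFrame(
  frame: DroppedFrame,
  lossRate: number,         // observed packet-loss ratio, 0..1
  policy: SkipPolicy,
  renditionsKbps: number[]  // available ABR ladder, sorted ascending
): Decision {
  // Key frames are always recovered: losing one stalls the whole GOP.
  if (frame.isKeyFrame) return { retransmit: true };

  // Under heavy loss, step down the ABR ladder rather than keep retransmitting.
  if (lossRate > 0.05) {
    const lower = renditionsKbps.filter((r) => r < frame.bitrateKbps).pop();
    return { retransmit: policy === "prioritize-continuity", switchDownTo: lower };
  }

  // Light loss: continuity mode recovers secondary frames, latency mode skips them.
  return { retransmit: policy === "prioritize-continuity" };
}
```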
This “content-aware transfer” support is the secret sauce enabling near-WebRTC levels of latency reduction, Blötz said. How the platform enlists the client player in the process without requiring a separate client plug-in is another aspect of that secret sauce, he added.
This is accomplished by using APIs that allow WebRTS to convey critical player information to browsers through Media Source Extensions, with the exception of Safari, which is reached through the Managed Media Source extension. “You can use our SDK with any player out there to deliver streams over HLS or DASH,” Blötz said, noting the SDK APIs are used natively by the browsers while some additional messaging is conveyed via JavaScript.
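For readers unfamiliar with that browser-side plumbing, the sketch below shows, under stated assumptions, what feeding fetched media segments to a video element through Media Source Extensions can look like, with Safari’s Managed Media Source as the fallback. It illustrates the standard browser APIs named above, not Ceeblue’s SDK; the codec string and the fetch-based pull loop are assumptions.

```typescript
// Minimal sketch: play a low-latency stream via Media Source Extensions,
// falling back to Safari's ManagedMediaSource. Not Ceeblue's SDK.
const MSE: typeof MediaSource | undefined =
  window.MediaSource ?? (window as any).ManagedMediaSource;

async function playLowLatencyStream(video: HTMLVideoElement, url: string) {
  if (!MSE) throw new Error("No MediaSource implementation available");

  const mediaSource = new MSE();
  video.disableRemotePlayback = true; // Safari requires this with ManagedMediaSource
  video.src = URL.createObjectURL(mediaSource);

  await new Promise<void>((resolve) =>
    mediaSource.addEventListener("sourceopen", () => resolve(), { once: true })
  );

  // fMP4 segments; the codec string depends on how the stream was encoded.
  const buffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');

  // Pull media over HTTP and append each chunk as soon as it arrives.
  const response = await fetch(url);
  const reader = response.body!.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    await new Promise<void>((resolve) => {
      buffer.addEventListener("updateend", () => resolve(), { once: true });
      buffer.appendBuffer(value);
    });
  }
}
```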
Complex as it might sound, the WebRTS framework is easy to implement, Blötz added. “We’re building on the shoulders of other protocols and adding some sauce to it,” he said. “It’s almost too simple considering what it can do.”
Ceeblue is open to supporting standardization of WebRTS. “We’ll just put it out there for industry comments and see what people want to do,” Blötz said.
Eluvio’s Content Fabric
As for the latency performance reported by Eluvio’s Michelle Munson, she said live EPCR-TV rugby game feeds streamed directly to consumers in Chicago and San Jose from Europe registered 400ms and 425ms latencies, respectively. The rugby distributor’s use of the Content Fabric “to empower their ability to reach fans globally is live and ongoing,” she noted.
While this was arguably a major accomplishment, Eluvio didn’t go out of its way to tout those results. In press materials released with the Bangkok rollout at NAB, the company differentiated between what it described as the ultra-low sub-500ms latency achieved when the Content Fabric delivers live-streamed EPCR-TV content via Secure Reliable Transport (SRT) to affiliated distributors and the “less than 5-second glass-to-glass” latencies attained with direct-to-consumer (D2C) HLS and DASH streams.
Munson mentioned the 425ms and 400ms D2C rates registered in the U.S. as an aside while describing the many benefits of the new Content Fabric release during her press presentation at NAB. Her emphasis was on the fact that everything comprising what EPCR-TV delivered across 106 countries through its B2B and B2C connections was accomplished with a single implementation of the Content Fabric.
The Eluvio platform runs on the open internet, resulting in 10x savings compared to the costs incurred with CDNs and media clouds, Munson said. She described benefits such as a single transcoding instance serving all feeds to affiliates and end users; application of DRM with windowing and rights policy enforcement tuned to each region as dictated by contracts with affiliates; use of AI for frame-accurate tagging and metadata descriptions; monetization through server-side ad insertion (SSAI) and subscription models; and automated creation of highlights and a wide range of general and personalized feature enhancements.
When it comes to how far Eluvio goes in winning adoption of this proprietary departure from the multi-tiered norms of today’s video streaming operations, the new latency benefits are a shiny tail wagging a very big dog. They definitely put Eluvio on the board when it comes to weighing approaches to cutting latency, but, so far at least, the company hasn’t shown it’s ready to address the full requirements of RTIS.