REAL TIME INTERACTIVE STREAMING

Multiple WebRTC Platform Advances Signal Big Gains in Real-Time Streaming

By Fred Dawson

Rose Power, senior product marketing manager, Dolby
Demand across multiple market sectors for real-time video streaming solutions sparked an outpouring of innovation at the recent NAB Show, making the biggest case yet seen for freeing the world from sole reliance on the prevailing HLS and DASH technologies.

In meetings across the Las Vegas Convention Center we discovered a surprisingly large amount of investment in new ideas. They included new approaches to real-time streaming, new ways to streamline activation of supporting infrastructure on an as-needed basis, and applications of AI that enable verbal and visual communications between humans and virtualized counterparts.

Most innovations related to streaming video at sub-half-second end-to-end latencies revolved around advancements in suppliers’ use of the Web Real-Time Communication (WebRTC) protocol. But there was also significant progress toward solutions that avoid reliance on WebRTC infrastructure, which is incompatible with the prevailing Hypertext Transfer Protocol (HTTP) streaming environment.

That all these landmark developments went largely unnoticed by industry observers, including the trade press, attests in part to how fast real-time interactive streaming (RTIS) has taken hold against a long-term backdrop of uncertainty about performance, costs and market needs. But it’s also worth noting that slow revenue growth, compounded in some cases by a penny-wise, pound-foolish mentality, has inhibited the spending essential to getting the word out in a diminished trade publishing environment that feeds off press releases.

Many RTIS platform suppliers we encountered were either grabbing meetings outside the exhibit and conference spaces or showing off their wares in other vendors’ booths. Just a few, including Dolby, Ant Media and Wowza, had their own booth spaces.

RTIS in Action in Many Places

Notably, among those in the free-range camp, RTIS provider Red5 was hard to miss, with demos running in highly trafficked spaces occupied by Amazon’s AWS; Zixi, the internet transport provider that kicked off the disruption of the satellite-based TV contribution market 15 years ago; and Nomad Media, a fast-rising provider of AI-supported next-gen media asset management technology.

AWS was a particularly prominent center of attention, spurred in part by its sponsorship, along with Nvidia, of an F1 simulator racing competition outside the exhibit hall where show attendees vied for bragging rights and awards throughout the four-day event. Working with AWS, Red5 demonstrated how its platform could be used to deliver camera feeds covering the competition in real time, with support for multi-stream viewing access by booth visitors inside the hall.

With this and other demonstrations elsewhere, Red5 CEO Chris Allen said he witnessed high levels of market interest across use cases ranging from consumer experiences to back-end connectivity for real-time collaboration in live productions to applications enabling real-time traffic, emergency and battlefield monitoring over wide reaches of territory. “We met with people looking for solutions in all these areas, including many you wouldn’t expect to find at an NAB event,” Allen said.

On the M&E front, use of the Red5 Experience Delivery Network (XDN) platform in live production collaboration is fueling adoption of real-time streaming on the distribution side and vice versa, Allen added. For example, he noted that a major sports league that has activated XDN connectivity for backend operations is now looking at extending RTIS to its multimillion-viewer base. “One feeds the other,” he said.

The same kind of self-perpetuating expansion of real-time streaming usage is occurring outside the M&E realm. During the NAB Show, Red5 learned that the California Department of Transportation (Caltrans), which uses Red5’s TrueTime MultiView Surveillance platform in its own traffic and emergency monitoring operations, is preparing to enable shared usage of the real-time monitoring capabilities across law enforcement and other government agencies statewide.

Push-Button Access to All Streaming Options

Among RTIS platform providers with their own exhibit spaces, one particularly arresting development on quiet display at the Dolby booth could turn out to be a commonly replicated template for reducing the costs and hassles of accommodating variations in streaming requirements. For the first time anywhere, Dolby demonstrated that its OptiView platform can implement streaming at whatever level of latency and directional flexibility a given live distribution scenario requires, all through point-and-click commands on a management console.

These capabilities stem from Dolby’s acquisitions of WebRTC platform provider Millicast in 2022 and, two years later, THEO Technologies, inventor of the High Efficiency Streaming Protocol (HESP), which has been promoted as a royalty-bearing standard through the HESP Alliance. “Dolby OptiView is a fully managed service that supports streaming over WebRTC at sub-half-second latencies, HESP at 1-2 seconds or conventional HLS platforms at higher latencies,” said Rose Power, senior product marketing manager at Dolby.

In all cases streams are managed through the OptiView Player (formerly THEOPlayer) software, which uses APIs that allow it to work with whatever browser or client software is used by the chosen transport protocol. At the click of a dashboard command the managed service assigns streaming operations to the appropriate CDN infrastructure it has contracted to use, including an Oracle Cloud Infrastructure network optimized for WebRTC and HTTP-based CDNs supporting HESP and HLS. Power said OptiView users can activate Dolby’s Server-Guided Ad Insertion (SGAI) technology to enable targeted full- or partial-screen ad placements in line with performance parameters of the chosen streaming technology.
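Dolby has not published the dispatch logic behind those dashboard commands, but the latency tiers Power describes (WebRTC below roughly half a second, HESP at 1-2 seconds, HLS above that) can be sketched as a simple selection function. Everything below, including the function and type names, is illustrative rather than any part of the actual OptiView API.

```typescript
// Illustrative only: maps a target glass-to-glass latency (in seconds) to the
// three transport tiers described for Dolby OptiView. These names are
// hypothetical stand-ins, not OptiView identifiers.
type Transport = "webrtc" | "hesp" | "hls";

function selectTransport(targetLatencySeconds: number): Transport {
  if (targetLatencySeconds < 0.5) return "webrtc"; // real-time interactive tier
  if (targetLatencySeconds <= 2) return "hesp";    // low-latency HTTP tier
  return "hls";                                    // conventional HTTP streaming
}

console.log(selectTransport(0.3)); // "webrtc"
```

In a real deployment the same decision would presumably also weigh directional flexibility and which contracted CDN is available, not latency alone.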

It remains to be seen what OptiView contributes to Dolby’s ability to drive usage of the Millicast WebRTC and THEO HESP platforms in the wake of a post-acquisition period of confusion about strategies and commitment to performance. But the OptiView concept is one its competitors might find to be well worth emulating.

Paying for RTIS Infrastructure on an As-Needed Basis

Another big step in a new but different direction was taken by Ant Media, which has long leveraged WebRTC in DIY projects with entities streaming to user bases that topped out at around 100,000 and were often far smaller. Now, with demand for more massively scalable turnkey approaches to RTIS emanating from big market players, Ant is moving to what CEO and co-founder Ahmet Oguz Mermerkaya calls an “auto-managed” global streaming platform.

The term refers to what Mermerkaya said is a unique, cost-lowering version of managed RTIS service that activates AWS cloud resource usage through Amazon’s Cloud Development Kit (CDK). CDK makes it possible for Ant to define resource allocations in code and direct them on an as-needed basis through the AWS CloudFormation service. “This is something new,” he said.

“We have control of the backend components,” he explained. “When customers aren’t using the resources, they’ll be released for other uses automatically. When the customer needs them, they’ll be scaled automatically to the usage requirement.”
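Ant has not published its CDK templates, but the scale-to-zero behavior Mermerkaya describes, releasing resources when idle and scaling them to demand, can be sketched as a capacity calculation. The function and the per-instance figure here are hypothetical stand-ins; in practice the CDK would translate such logic into CloudFormation-managed AWS resources rather than an in-process computation.

```typescript
// Hypothetical sketch of scale-to-zero capacity planning. The default of 1,000
// viewers per media-server instance is an assumed figure for illustration,
// not an Ant Media specification.
function requiredInstances(
  concurrentViewers: number,
  viewersPerInstance: number = 1000,
): number {
  if (concurrentViewers <= 0) return 0; // idle: release all resources
  return Math.ceil(concurrentViewers / viewersPerInstance); // scale to demand
}

console.log(requiredInstances(0));    // 0
console.log(requiredInstances(2500)); // 3
```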

Ant calculates this will result in up to a 3X reduction in costs normally incurred with managed WebRTC services. Moreover, the CDK tie-in to AWS resource utilization means Ant customers can set things up for smooth transition to conventional streaming.

It’s up to Ant’s customer to arrange with AWS any resource uses that aren’t part of the WebRTC service, Mermerkaya said. But when such arrangements are made to enable support for streaming, say, through HLS connections, Ant customers can use the Ant CDK connection to bridge to that streaming mode whenever necessary.

AI-Assisted Socialization

Still another groundbreaking move in new RTIS directions was shared by Brad Altfest, managing director for M&E at Agora, which has built a global business providing developers support for bringing interactive video streaming and other social elements into their services and applications. Lately, Altfest said, Agora has made AI a much bigger part of the developer toolkit in order to bring event-synched stats, conversational responsiveness and even chatting robots into the socialized video environment supported by its global Software-Defined Real-Time Network.

“As an example, we’ve demonstrated what can be done with AI in a live-streamed sports watch party,” Altfest said. “Our AI engine allows us to follow the conversation and surface relevant data points automatically. In the case of fans who have become influencers with large audiences, this gives them the kinds of tools broadcasters use for adding graphics and clips to their commentary.”

Conversing chatbots, avatars and toys tied to user engagements through interactive video connections on the Agora WebRTC platform are now supported by the Agora Conversational AI Engine, which launched just ahead of the NAB Show. Designed to support creation of interactive voice experiences with any AI model, the engine ensures ultra-low latency responses and superior voice processing, allowing for effortless creation of immersive voice AI experiences, Altfest said.
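Agora’s actual SDK surface isn’t documented here, but the model-agnostic design Altfest describes implies a pipeline in which the language model is a swappable component sitting between speech-to-text and text-to-speech stages. The sketch below is purely hypothetical; none of these names come from Agora’s documentation.

```typescript
// Hypothetical shape of a model-agnostic conversational turn: any function
// from transcript to reply can be plugged in as the "AI model."
type LanguageModel = (transcript: string) => string;

function conversationalTurn(userTranscript: string, model: LanguageModel): string {
  // In a real engine, speech-to-text would precede this hop and text-to-speech
  // would follow it, all carried over the real-time network; only the
  // swappable model step is shown here.
  return model(userTranscript);
}

const echoModel: LanguageModel = (t) => `You said: ${t}`;
console.log(conversationalTurn("hello", echoModel)); // "You said: hello"
```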

Nearly simultaneously with the Conversational AI Engine release, Agora said it was expanding into AI-powered robotics and interactive toys with an AI device development kit anchored in chipsets produced by Beken. This will allow IoT toys and other devices interacting over the Agora platform with end users to engage in more natural, personalized conversations whether for sheer fun or in apps devoted to things like adaptive learning, health care and emotional support, the company said.

Attendees at the recent Mobile World Congress in Barcelona were given a preview of the capabilities in a demonstration involving toymaker Robopoet’s latest AI companion robot Fuzzoo, an AI-powered emotional companion that listens to users and even senses their moods as it formulates verbal responses in real time, according to Robopoet co-founder and CTO Yuna Pan. “With real-time voice processing, emotional AI, and advanced speech capabilities, Agora makes seamless human-machine interaction possible and ensures exceptional performance and reliability,” Pan said in a prepared statement.

The Agora strategy is to enable developers to apply whatever AI resources they choose to create new social experiences that can run on its global real-time streaming network with the aid of the AI conversational engine and the AI device framework built into the Beken chipsets. “Our value at its core is real-time connectivity,” Altfest said. “We’re in very early days. Everyone is experimenting with applications as people like us learn how to use AI with real-time streaming.”

These were just some of the suppliers touting RTIS solutions anchored by the WebRTC protocol. And, as mentioned earlier, there were significant signs of progress toward ultra-low-latency streaming and even the full capabilities of RTIS on platforms designed to work with the prevailing HTTP infrastructure. We’re reporting on these developments in a separate article.