This article introduces a new real-time video transmission solution based on AI smart routing technology. Leveraging the public internet as its underlying infrastructure, this solution supports broadcast-grade, high-bitrate, real-time video transmission with 99.9999% reliability. Through a case study of its real-world commercial application during the Paris Olympics 2024, this article explores the principles behind this technology and some of its technical implementations.
1.1 Requirements and Delivery Outcomes
During the Paris Olympics 2024, four television stations in Hong Kong commissioned Caton Technology to transmit 16 HD feeds from the International Broadcast Centre (IBC) in Paris back to Hong Kong for localised production and broadcasting.
Among these 16 feeds, 14 were provided to Caton as baseband signals, requiring Caton to handle the encoding at a bitrate of at least 15 Mbps per feed. The remaining 2 feeds were pre-encoded IP signals delivered at 22 Mbps each via the SRT protocol. In terms of network conditions, the only available connectivity at the IBC for this project was broadband internet from the French telecom provider Orange, and all receiving endpoints in Hong Kong also used internet connections. The reliability requirement was a stringent SLA of 99.9999%, allowing only one second of imperfection over a 17-day period.
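To put the SLA figure in perspective, the short calculation below converts an availability percentage into a time budget. It is only an arithmetic sketch; the function name is illustrative and not part of any Caton tooling.

```python
def downtime_budget_seconds(availability_percent: float, days: float) -> float:
    """Seconds of permitted imperfection for a given SLA over a period."""
    total_seconds = days * 24 * 60 * 60
    return total_seconds * (1 - availability_percent / 100)


# Six nines over the 17 days of the event.
budget = downtime_budget_seconds(99.9999, 17)
print(f"{budget:.2f} s")
```

For 17 days the budget works out to roughly 1.47 seconds, the same order as the one second quoted above.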
Over the 17-day period, peak concurrent bandwidth exceeded 300 Mbps. The receiving end continuously analysed the received TS streams and recorded zero continuity count (CC) errors. Thus, using only public internet infrastructure, Caton achieved the stringent 99.9999% SLA target.
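The continuity count check mentioned above can be sketched as follows. In an MPEG-2 transport stream, each 188-byte packet carries a 4-bit continuity counter that increments per PID for packets with payload, so a gap reveals lost packets. This is a simplified illustration, not the analyser used in the project: it tolerates any repeated counter value as a duplicate and ignores the discontinuity indicator. The helper `make_packet` is hypothetical and exists only to build test data.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF


def count_cc_errors(ts_data: bytes) -> int:
    """Count continuity counter gaps per PID in a buffer of TS packets."""
    last_cc = {}
    errors = 0
    for offset in range(0, len(ts_data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts_data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # simplification: skip packets that are out of sync
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        has_payload = bool(packet[3] & 0x10)
        cc = packet[3] & 0x0F
        if pid == NULL_PID or not has_payload:
            continue  # the counter only advances on packets with payload
        previous = last_cc.get(pid)
        if previous is not None:
            expected = (previous + 1) & 0x0F
            # Simplification: a repeated value is treated as a legal duplicate.
            if cc != expected and cc != previous:
                errors += 1
        last_cc[pid] = cc
    return errors


def make_packet(pid: int, cc: int) -> bytes:
    """Hypothetical helper: a payload-only TS packet filled with zeros."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - 4)


clean = b"".join(make_packet(0x100, i % 16) for i in range(40))
lossy = b"".join(make_packet(0x100, i % 16) for i in range(40) if i != 5)
print(count_cc_errors(clean), count_cc_errors(lossy))
```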
For the television stations, costs were reduced by 80% compared to traditional leased lines or satellite solutions. In contrast to the "best-effort" model of other public internet transmission solutions, Caton offered users peace of mind and guaranteed delivery outcomes.
1.2 Overview of the Technical Solution
Figure 1 illustrates the transmission topology for the Olympics. At the sending end, 14 Caton Prime HEVC encoders were used to push the live streams to the platform. On the receiving end, all television stations employed relay servers, referred to as CRS, to receive the programs from the platform and convert them into UDP multicast for delivery to the stations' production systems.
Figure 1: Topology of the Paris Olympics Transmission Solution
The core of the solution is the transmission platform called Caton Media XStream, which utilises AI smart routing technology to achieve a reliability level of 99.9999%.
The diagram is deliberately simplified; the routing mechanisms behind it are described in the sections that follow.
1.3 Achieving Extreme Reliability - First Principles Thinking
In the broadcasting industry, the public internet is widely recognised as unreliable. While there are many scenarios where the use of public networks for program transmission has been accepted, there remains hesitancy to use them in truly significant situations requiring high reliability or bitrates. So, is it really impossible to achieve both reliability and cost-effectiveness? Caton rethought this issue from the ground up.
The reliability of a system increases substantially with the addition of parallel paths. This means that while a single internet link may be unreliable, having multiple parallel links can create a highly reliable transmission network.
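The effect of parallel paths can be illustrated with the standard formula for parallel systems, 1 - (1 - p)^n. The sketch below assumes that link failures are statistically independent, which is an idealisation: links that share infrastructure fail together and deliver less benefit than the formula suggests. The per-link figure of 99% is an assumed value for illustration only.

```python
def parallel_availability(link_availability: float, links: int) -> float:
    """Availability of `links` independent parallel links, any one of which suffices."""
    return 1 - (1 - link_availability) ** links


for count in (1, 2, 3):
    print(count, parallel_availability(0.99, count))
```

Under these assumptions, three links that are each available 99% of the time combine to six nines.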
The question then becomes: how can we manage and leverage a large number of parallel, uncontrollable internet lines? Caton's solution uses artificial intelligence and machine learning technologies to make the uncontrollable internet more manageable. Caton also analysed the costs and found that, in most parts of the world, even with significant redundancy in internet lines, the solution remains more cost-effective than dedicated leased lines.
After three years of research and development, Caton launched the "AI Smart Routing-Based Real-Time Transmission Platform - Caton Media XStream" in mid-2022. This platform achieves six nines (99.9999%) of reliability. For comparison, the SLA for a single leased line is typically around 99.95%, and satellite connections, which are susceptible to weather conditions, achieve an annual reliability of only about 99.7%. Each additional nine represents a tenfold reduction in permitted downtime. Remarkably, while adding this extra resiliency, Media XStream retains the inherent cost-effectiveness and flexibility of the internet.
1.4 Real-Time Path Planning Example
Figure 2 illustrates how this platform mobilises a large number of parallel internet lines. This diagram represents the actual transmission path of a specific stream on a particular day during the Paris Olympics. As shown, it utilised 7 nodes and dozens of standby lines. The green lines indicate the current active transmission path, which may switch to any alternative routes at any time as the AI determines.
2.1 Centralised Intelligent Routing
At the very start of each transmission, the system goes through an automatic path planning process.
First, preprocessing is conducted. In this phase, all node resources that Caton possesses in the area, along with the links between them, are input into an algorithm for screening. We refer to this as the original full mesh map, as illustrated in Figure 3.
Next, a reduction process is applied to exclude faulty links and nodes. Here, a faulty node or link is not simply defined by its current availability; rather, it is evaluated based on the stability of the link and node over a certain period. If they fail to meet the QoS requirements of the stream, they are also excluded.
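The reduction step can be sketched as a filter over measurement history rather than over the current state. The thresholds, the choice of packet loss as the only metric, and the node names below are assumptions for illustration; the actual screening criteria are not described in this article.

```python
from statistics import mean


def meets_qos(loss_history, max_mean_loss, max_peak_loss):
    """A link qualifies only if it was stable over the whole observation window."""
    return mean(loss_history) <= max_mean_loss and max(loss_history) <= max_peak_loss


def reduce_mesh(link_history, max_mean_loss=0.01, max_peak_loss=0.05):
    """Keep links whose recorded loss history satisfies the stream's QoS limits."""
    return {
        link
        for link, history in link_history.items()
        if history and meets_qos(history, max_mean_loss, max_peak_loss)
    }


link_history = {
    ("Paris-1", "Frankfurt"): [0.001, 0.002, 0.001, 0.000],
    ("Paris-1", "London"): [0.200, 0.150, 0.000, 0.000],  # healthy now, unstable earlier
    ("Paris-2", "Frankfurt"): [],  # no measurements yet, so not trusted
}
usable_links = reduce_mesh(link_history)
print(usable_links)
```

Note that the second link is excluded even though its most recent samples show no loss, which reflects the stability-based definition of a faulty link.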
Next, various detected metrics, such as packet loss rate, RTT, and jitter, are normalised and quantified based on a mathematical model, providing the weight and cost of each node and link. Finally, by considering each node's computational parameters and bandwidth, the core network is optimised for load balancing, improving bandwidth and resource utilisation.
At this stage, as shown in Figure 3, the system excludes the blue and red paths as they fail the service quality metrics.
Figure 3: Preprocessing
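The normalisation of measured metrics into a single link cost might look like the sketch below. The article does not publish the mathematical model, so the weights, the "worst acceptable" ceilings, and the linear weighted sum are all assumptions chosen for illustration.

```python
def normalise(value: float, worst: float) -> float:
    """Scale a metric to the range 0..1, where 1 means at or beyond the ceiling."""
    return min(max(value / worst, 0.0), 1.0)


def link_cost(loss_rate, rtt_ms, jitter_ms,
              weights=(0.5, 0.3, 0.2),
              worst=(0.05, 400.0, 50.0)):
    """Weighted sum of normalised loss, RTT and jitter (hypothetical model)."""
    metrics = (loss_rate, rtt_ms, jitter_ms)
    return sum(w * normalise(m, ceiling)
               for w, m, ceiling in zip(weights, metrics, worst))


good = link_cost(loss_rate=0.001, rtt_ms=180.0, jitter_ms=3.0)
poor = link_cost(loss_rate=0.030, rtt_ms=320.0, jitter_ms=25.0)
print(good < poor)
```

A lower cost marks a more attractive link, so the resulting values can be used directly as edge weights in the path computation.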
The remaining nodes and links then enter the formal algorithm, known as the Multi-Source Directed Acyclic Graph (DAG) algorithm, as shown in Figure 4. This algorithm has several key principles and practices.
The first principle is the disjoint path principle, which states that there must be at least two routes from the source to the destination that share no intermediate network nodes. This principle encompasses both explicit disjoint paths and absolute disjoint paths.
Explicit disjoint paths are paths that do not intersect at any node visible within our network, and they can be calculated using mathematical methods. Absolute disjoint paths, on the other hand, do not intersect at any node, including those in the underlying carrier network. Achieving this is quite challenging, as it requires substantial data about the actual transmission paths of the network.
For example, two Points of Presence (PoPs) entering Hong Kong from Singapore may appear to be disjoint; however, if both paths use the same carrier's entrance to reach Hong Kong, they are not truly disjoint. Identifying links that enter through different carriers is essential to ensure absolute disjoint paths.
The second point involves the concept of the minimum cut from graph theory. In a directed graph, the minimum cut between a source and a target is the smallest number of links whose removal prevents any communication from the source to the target. Our path planning needs to ensure that the minimum cut is greater than or equal to 2, meaning that disconnecting any single link will not interrupt the data flow at any time.
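A minimum cut requirement of this kind can be verified with a textbook max-flow computation: when every link has unit capacity, the maximum flow equals the minimum cut, which in turn equals the number of paths that share no link. The sketch below uses breadth-first augmenting paths (the Edmonds-Karp approach). It is a generic graph routine, not Caton's planner, and the node names are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_links(edges, source, target):
    """Minimum number of links whose removal disconnects source from target."""
    if source == target:
        raise ValueError("source and target must differ")
    capacity = defaultdict(int)
    neighbours = defaultdict(set)
    for u, v in edges:
        capacity[(u, v)] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)  # allows traversal of residual (reverse) edges
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        flow += 1


diamond = [("S", "A"), ("S", "B"), ("A", "T"), ("B", "T")]
bottleneck = [("S", "A"), ("S", "B"), ("A", "C"), ("B", "C"), ("C", "T")]
print(min_cut_links(diamond, "S", "T"), min_cut_links(bottleneck, "S", "T"))
```

The second topology fails the requirement because every route depends on the single link from C to T.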
The third point is to optimise based on classical shortest-path algorithms while meeting all conditions. This involves implementing a shortest path algorithm for a directed acyclic graph (DAG) with multiple sources and multiple targets and assigning priority to each sub-path.
Finally, the green path shown in Figure 4 represents the resulting path planning diagram, which is distributed to all nodes in the core network to guide the actual transmission of the stream.
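The shortest-path step on a DAG with several sources and targets can be sketched with a topological ordering followed by a single relaxation pass. This is the classical algorithm only; it assumes non-negative link costs, omits the sub-path priority assignment and the disjointness constraints described above, and uses hypothetical node names and costs.

```python
from collections import defaultdict, deque


def dag_shortest_path(edges, sources, targets):
    """edges maps (u, v) to a cost. Returns (cost, path) of the cheapest route."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set(sources) | set(targets)
    for (u, v), cost in edges.items():
        successors[u].append((v, cost))
        indegree[v] += 1
        nodes.update((u, v))
    # Kahn's algorithm produces a topological order and detects cycles.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    # Every source starts at distance zero, as if fed by one virtual source.
    dist = {n: float("inf") for n in nodes}
    for s in sources:
        dist[s] = 0.0
    previous = {}
    for u in order:
        for v, cost in successors[u]:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                previous[v] = u
    best = min(targets, key=lambda t: dist[t])
    if dist[best] == float("inf"):
        return None
    path = [best]
    while path[-1] in previous:
        path.append(previous[path[-1]])
    return dist[best], path[::-1]


edges = {
    ("P1", "A"): 2.0, ("P2", "B"): 1.0, ("B", "A"): 0.5,
    ("A", "HK1"): 2.0, ("B", "HK2"): 5.0,
}
result = dag_shortest_path(edges, sources=["P1", "P2"], targets=["HK1", "HK2"])
print(result)
```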
2.2 Distributed Intelligent Routing
Having discussed the generation of the path planning diagram, let's now explore how each node utilises this path diagram for packet forwarding.
It's important to clarify a common misconception. Typically, in broadcasting systems, a link switch occurs only when the original link has failed, following primary/backup switching logic. However, this is not the case in this solution. Each node continuously monitors the network quality of all potential sub-paths indicated in the path diagram to select the currently optimal sub-path for traffic forwarding.
Therefore, a switch occurs solely because a better route has been identified than the current one. Additionally, all switches occur at the granularity of IP packets, making them very quick and having no impact on the integrity of the received content.
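The difference from primary/backup switching can be shown with a minimal sketch in which the forwarding decision is taken per packet from the latest quality scores. The score dictionaries and sub-path labels are hypothetical, and a production system would likely add damping to avoid oscillating between near-equal paths, which is omitted here.

```python
def pick_sub_path(scores):
    """Return the sub-path with the lowest current cost score."""
    return min(scores, key=scores.get)


def forward(packets, score_snapshots):
    """Pair each packet with the best sub-path at the moment it is sent."""
    return [(packet, pick_sub_path(scores))
            for packet, scores in zip(packets, score_snapshots)]


snapshots = [
    {"A": 0.20, "B": 0.30},
    {"A": 0.25, "B": 0.10},  # B becomes better, although A has not failed
    {"A": 0.05, "B": 0.10},
]
routes = forward(["p1", "p2", "p3"], snapshots)
print(routes)
```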
In the central calculation of the path diagram, historical network quality data is used; however, network quality is constantly changing. Therefore, real-time network quality probing serves not only the decision-making needs of the nodes but also feeds data back to the centre for big data analysis.
In addition to monitoring network link quality, we have also incorporated link prediction models in each node to predict the change trends of individual links. Furthermore, when a link failure occurs, the system attempts to predict the potential failure points for that link, determining whether the issue lies with the source network exit, the intermediate link, or the destination network entrance. Based on the predictions, the system decides whether to switch networks, change the next hop destination, or switch to another network element within the same data centre.
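The article does not disclose the prediction models used in the nodes. As a stand-in, the sketch below fits a least-squares line through recent samples of one metric and extrapolates it one step ahead, which is enough to show how a worsening trend can be flagged before a link actually fails. The sample values and the threshold are assumptions.

```python
def predict_next(samples):
    """Extrapolate the least-squares trend line one step past the last sample."""
    n = len(samples)
    if n < 2:
        return samples[-1]
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    return mean_y + slope * (n - mean_x)


rtt_ms = [180.0, 184.0, 191.0, 199.0, 210.0]  # steadily degrading link
predicted = predict_next(rtt_ms)
should_move_traffic = predicted > 215.0  # hypothetical RTT limit for the stream
print(round(predicted, 1), should_move_traffic)
```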
2.3 Source Routing Algorithm
The system selects 2 to 3 edge nodes for each sender at the source to ensure coverage. If the sender uses dual ISPs, then a maximum of 6 sub-paths (2x3) can be utilised. These sub-paths serve two purposes:
1) In cases of insufficient bandwidth on a single link, the bandwidth aggregation feature is automatically enabled to ensure that all user traffic can successfully connect to the core network without packet loss.
2) When network fluctuations or failures occur, multiple sub-paths enable seamless, user-transparent adaptive switching.
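The sub-path count and the aggregation idea can be sketched as follows. The proportional split by available capacity is an assumed policy for illustration, since the article does not describe how traffic is actually divided; the ISP names, node names, and capacities are hypothetical.

```python
from itertools import product


def enumerate_sub_paths(isps, edge_nodes):
    """Every (ISP, edge node) combination is one candidate sub-path."""
    return list(product(isps, edge_nodes))


def split_bitrate(stream_mbps, capacity_mbps):
    """Divide a stream across sub-paths in proportion to available capacity."""
    total = sum(capacity_mbps.values())
    if total < stream_mbps:
        raise ValueError("combined capacity is below the stream bitrate")
    return {path: stream_mbps * capacity / total
            for path, capacity in capacity_mbps.items()}


sub_paths = enumerate_sub_paths(["ISP-A", "ISP-B"], ["edge-1", "edge-2", "edge-3"])
capacity = {path: 10.0 for path in sub_paths}  # no single sub-path can carry 22 Mbps
shares = split_bitrate(22.0, capacity)
print(len(sub_paths), shares)
```

With two ISPs and three edge nodes this yields the six sub-paths mentioned above, and the combined capacity carries a stream that no single sub-path could.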
2.4 Receiving End Routing Algorithm
At the receiving end, similar to the sending end, 2 to 3 nodes will cover a single receiver. For the Olympic live broadcast, all users on the Hong Kong side utilised dual ISPs for receiving, providing a total of four sub-paths to choose from, which significantly contributed to reliability.
Here is a scenario explaining the benefit of multiple nodes:
When the data arrives at the edge node and is ready to be forwarded to the receiver, the downstream link suddenly fails.
The edge node automatically triggers an aggregation mechanism, forwarding the data packets to other edge nodes through their existing connections with the receiver. This allows for relay transmission, ensuring that the receiver can promptly and accurately receive all the data. All edge nodes work collaboratively to achieve this effect.
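The relay behaviour in this scenario can be sketched as a simple fallback rule. This is an illustration of the idea only: the real mechanism operates on live packet flows and may use several peers at once, whereas the sketch picks the first peer with a working downlink. The node names are hypothetical.

```python
def delivery_route(entry_node, downlink_up):
    """Return the edge nodes a packet traverses on its way to the receiver."""
    if downlink_up.get(entry_node, False):
        return [entry_node]
    # The entry node's own downlink is down, so relay through a peer edge node.
    for peer, is_up in downlink_up.items():
        if peer != entry_node and is_up:
            return [entry_node, peer]
    raise ConnectionError("no edge node has a working downlink to the receiver")


downlinks = {"edge-1": False, "edge-2": True, "edge-3": True}
print(delivery_route("edge-1", downlinks))
print(delivery_route("edge-2", downlinks))
```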
Through the successful transmission of the Paris Olympics for the Hong Kong Broadcasters, Caton Media XStream (CMXS) has demonstrated and proven the following features:
In addition to sporting events, Caton Media XStream has already delivered reliable content in various scenarios, including global television channel transmissions and live broadcasts for big cinema screens. Media XStream aims to address the bottlenecks and challenges in international television program transmission while balancing reliability, flexibility, and cost-effectiveness.
About Caton Technology
Caton Technology is a global leader in next-generation IP transport, revolutionising media distribution with unmatched innovation and customer service. We empower broadcasters and media companies to deliver exceptional Live video over IP. Leveraging our innovative cloud platform and AI technology, Caton Media XStream ensures zero-error transmission and optimal performance, surpassing traditional delivery methods with superior Service Level Agreements at competitive costs. At Caton, quality, performance, and value coexist, enabling our customers to experience the best of all worlds. Discover more about our cutting-edge solutions at www.catontechnology.com.