Streaming techniques for media
Media streaming involves one or more techniques to deliver video over the internet. A key factor in the user experience is the bandwidth available on the viewer's internet connection. Insufficient bandwidth, or sudden network congestion, can cause stuttering and delays while the player buffers. Viewers report this as the biggest negative impact on their perception of the quality of a video service. For the system architect, the available network bandwidth is the key constraint on the overall architecture, spanning both provider networks and mobile or wireless connections.
One solution to this issue is to reduce or vary the video bitrate, but this has its own challenges: the bigger the display, the higher the resolution required to provide the same user experience, and the higher the resolution, the higher the bitrate required to sustain it. This is why streaming technologies provide a set of video streams and the ability to switch between them dynamically.
All the streaming technologies described in this whitepaper rely on adaptive bitrate (ABR) streaming. ABR is where the same video content is provided at multiple bitrates. This allows the best possible video quality to be delivered to the viewer under varying conditions.
The ABR experience is superior to delivering a video at a single bitrate because the video stream can be continually switched dependent on available network bandwidth, allowing playback at different bitrates within an ABR ladder. This ability to adapt helps avoid buffering or interruption in playback that can happen when a client's network throughput is insufficient for a particular bitrate.
Most of the streaming techniques deliver a representation of a video stream in a manifest file and the media itself is streamed in a series of segments. This allows a player to begin video playback as soon as possible after receiving data. Streaming techniques for the internet differ from those for closed networks because there is no guaranteed bitrate available at the client, and they must be able to cope with changes in the available bandwidth between the CDN and player.
Most of the streaming techniques on the internet use HTTP to distribute content. However, there are many differences in how and when these technologies might be used, based on viewer devices and network characteristics. There is often a need to use one or more of these streaming techniques to deliver to a variety of devices, including smart TVs, set-top boxes (STBs), personal computers (PCs), and handheld devices, such as tablets and smartphones.
There are two typical workflows for streaming:
- Video on demand (VOD) – In this workflow, the media assets are stored as encoded video files and played on request. The media assets are stored in multiple formats and bitrates for playout, or packaged at the time of request, using just-in-time (JIT) packaging, for different client devices.
- Live streaming – This workflow describes source content for linear TV or assembled channels and live events, such as sports, concerts, news, or any other live broadcast. This content is packaged in real time, in multiple formats for different client devices.
Video on demand (VOD) and live streaming can both be delivered using one or more different streaming technologies. The major advantage of streaming for VOD applications is that a client does not have to download the whole file before playback begins.
Common standards of streaming technologies
- HTTP is used for video and signaling to provide compatibility with networks, firewalls, caches, and end clients.
- Video is delivered as a set of concurrent streams with different renditions.
- Each video stream is divided into a series of segment files, each containing a few seconds of video.
- The list of available streams is obtained by downloading an index or manifest file.
- The client selects a stream, downloads the first few segments dependent on the player's buffer size, and starts playing. The client continually downloads the next segment while the current one is playing, giving the appearance of a continuous video stream.
- The client chooses which video stream to use for the next segment, allowing it to adapt to immediate network conditions on a segment-by-segment basis.
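The last two points amount to a per-segment rate-selection loop. The following is a minimal sketch in Python; the bitrate ladder, the rendition labels, and the safety margin are illustrative assumptions, not part of any standard or player implementation:

```python
# Sketch of a player's per-segment ABR decision. The ladder below is an
# example ABR ladder (highest bitrate first); real ladders and
# bandwidth-estimation logic vary by player.

LADDER = [
    ("1080p", 6_000_000),  # (rendition label, required bits per second)
    ("720p", 3_000_000),
    ("480p", 1_500_000),
    ("240p", 400_000),
]

def pick_rendition(measured_bps: float, safety: float = 0.8) -> str:
    """Choose the highest rendition whose bitrate fits within a safety
    margin of the throughput measured while downloading prior segments."""
    for label, required_bps in LADDER:
        if required_bps <= measured_bps * safety:
            return label
    return LADDER[-1][0]  # fall back to the lowest rung of the ladder
```

Before each segment request, the player re-measures throughput and calls a function like this, which is what lets playback step down the ladder during congestion instead of stalling.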
HTTP Live Streaming
HTTP Live Streaming (HLS) is a streaming protocol developed by Apple. Video is delivered as segments using either the MPEG-2 Transport Stream (.ts files) or fragmented MPEG-4 (fMP4 files with a .mp4 extension) encoding standards, typically packaged at 2 to 10 seconds long. The list of streams is provided in a manifest file with an .m3u8 extension. HLS supports the H.264 (AVC) and H.265 (HEVC) video codecs.
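For illustration, an HLS master playlist describing a three-rendition ladder might look like the following; the bitrates, resolutions, and URIs are hypothetical:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=854x480,CODECS="avc1.64001e,mp4a.40.2"
480p/index.m3u8
```

The player downloads this master playlist first, then fetches the media playlist for whichever rendition it selects; each media playlist in turn lists the individual segment files.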
The Low-Latency extension to HLS (LL-HLS) reduces end-to-end latency by publishing partial segments as they are produced, so the player can begin downloading media before the full segment is complete.
Dynamic Adaptive Streaming over HTTP
Dynamic Adaptive Streaming over HTTP (DASH or MPEG-DASH) is an international standard, defined in ISO/IEC 23009-1, for adaptive streaming over HTTP. The list of available streams is described in an XML manifest called the Media Presentation Description (MPD). Unlike HLS, DASH is codec-agnostic.
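DASH describes the available streams in an XML manifest, the Media Presentation Description (MPD). A minimal, hypothetical on-demand MPD with a single 720p representation might be sketched as follows; all durations, bitrates, and file names are illustrative:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     minBufferTime="PT2S" mediaPresentationDuration="PT120S"
     profiles="urn:mpeg:dash:profile:isoff-on-demand:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <Representation id="720p" bandwidth="3000000"
                      width="1280" height="720" codecs="avc1.64001f">
        <BaseURL>720p.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```

A real MPD would list one Representation per rung of the ABR ladder inside the AdaptationSet, giving the player the same per-segment choice of bitrates that an HLS master playlist provides.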
The low-latency variant of MPEG-DASH, referred to as Low-Latency DASH (LL-DASH), makes use of HTTP Chunked Transfer Encoding to enable the player to start receiving a segment in chunks even before the full segment has been produced.
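Chunked Transfer Encoding frames an HTTP/1.1 response as a series of length-prefixed chunks, which is what lets an origin push each piece of a segment to the player as soon as it is encoded. A minimal sketch of the wire framing, not tied to any particular server or packager:

```python
# Sketch of HTTP/1.1 chunked transfer framing. Each chunk is sent as
# "<hex length>\r\n<payload>\r\n", and the response is terminated by a
# zero-length chunk. For LL-DASH this means partial segment data can be
# on the wire before the full segment file exists.

def encode_chunk(payload: bytes) -> bytes:
    """Frame one chunk of segment data for a chunked HTTP response."""
    return f"{len(payload):X}\r\n".encode("ascii") + payload + b"\r\n"

def end_of_stream() -> bytes:
    """Terminating zero-length chunk that closes the response."""
    return b"0\r\n\r\n"
```

A packager would call `encode_chunk` once per encoded chunk of media and send `end_of_stream()` only when the segment is finished, while the player parses the same framing to start decoding immediately.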
Common Media Application Format
The Common Media Application Format (CMAF) is somewhat broader in scope than HLS or DASH. It is also the result of an international standardization effort. CMAF specifies a set of tracks that allow clients to identify and select media objects.
A key difference between CMAF and HLS or DASH is that segments are further subdivided into fragments. For example, a segment of eight seconds would be divided into four fragments of two seconds each. With CMAF, a client can be playing a fragment while still downloading other fragments from the same segment. This ability allows for a much lower latency, compared to HLS or DASH, because it is not necessary to download an entire segment before starting to play it.
CMAF is defined by ISO standard ISO/IEC 23000-19.
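The latency benefit in the example above can be quantified: with whole-segment delivery the player must wait for the entire segment before playback can begin, while with fragments it can start after the first fragment arrives. A minimal sketch of this arithmetic, ignoring encode, network, and buffering delays:

```python
# Startup-wait comparison for the example in the text: an 8-second
# segment subdivided into four 2-second CMAF fragments. Figures are
# illustrative; real end-to-end latency includes encoding, network
# transfer, and player buffering, all ignored here.

SEGMENT_S = 8.0
FRAGMENTS_PER_SEGMENT = 4
FRAGMENT_S = SEGMENT_S / FRAGMENTS_PER_SEGMENT  # 2.0 seconds

# Whole-segment delivery: wait for the entire segment to be available.
segment_wait_s = SEGMENT_S
# Fragmented delivery: playback can start after the first fragment.
fragment_wait_s = FRAGMENT_S

latency_saving_s = segment_wait_s - fragment_wait_s
```

Under these assumptions the fragmented form shaves 6 seconds off the minimum startup wait for this segment size, which is the mechanism behind CMAF's lower latency relative to segment-at-a-time HLS or DASH delivery.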
Microsoft Smooth Streaming
Microsoft Smooth Streaming (MSS) was first launched in 2008 by Microsoft and remains a proprietary technology defined in the Microsoft Smooth Streaming (MS-SSTR) specification.