Streaming techniques for media

Media streaming involves one or more techniques to deliver video over the internet. A key factor in user experience is the available bandwidth on the viewer's internet connection. Insufficient bandwidth, or sudden network congestion, can cause stuttering and delays while the player buffers. Viewers report this as the biggest negative impact on their perception of the quality of a video service. For the system architect, the available network bandwidth is the key constraint on the overall architecture, including both provider networks and mobile or wireless connections.

One solution to this issue is to reduce or vary the video bitrate, but even this has challenges—the bigger the display, the higher the resolution required to provide the same user experience. Also, the higher the resolution, the higher the bitrate required to sustain it. This is why streaming technologies provide a set of video streams and the ability to switch between them dynamically.

All the streaming technologies described in this whitepaper rely on adaptive bitrate (ABR) streaming. With ABR, the same video content is encoded and made available at multiple bitrates. This allows the best possible video quality to be delivered to the viewer under varying conditions.

The ABR experience is superior to delivering a video at a single bitrate because the video stream can be switched continually depending on available network bandwidth, allowing playback at different bitrates within an ABR ladder. This ability to adapt helps avoid the buffering or interruption in playback that can happen when a client's network throughput is insufficient for a particular bitrate.
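
As a minimal illustration, the following Python sketch shows how a player might pick the highest rendition that fits the currently measured throughput. The ABR ladder values and the throughput estimate are hypothetical; real players also weigh buffer level, display size, and codec support.

# Minimal sketch of ABR rendition selection; ladder values are hypothetical.
ABR_LADDER = [              # (name, video bitrate in bits per second), highest first
    ("1080p", 6_000_000),
    ("720p", 3_000_000),
    ("480p", 1_500_000),
    ("360p", 800_000),
]

def select_rendition(measured_throughput_bps, safety_factor=0.8):
    """Return the highest-bitrate rendition that fits within the measured throughput.

    The safety factor leaves headroom so that short throughput dips do not
    immediately cause rebuffering.
    """
    budget = measured_throughput_bps * safety_factor
    for name, bitrate in ABR_LADDER:
        if bitrate <= budget:
            return name, bitrate
    return ABR_LADDER[-1]    # fall back to the lowest rendition

# Example: a connection currently delivering about 4 Mb/s selects the 720p rendition.
print(select_rendition(4_000_000))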

Most of the streaming techniques deliver a representation of a video stream in a manifest file and the media itself is streamed in a series of segments. This allows a player to begin video playback as soon as possible after receiving data. Streaming techniques for the internet differ from those for closed networks because there is no guaranteed bitrate available at the client, and they must be able to cope with changes in the available bandwidth between the CDN and player.

Most of the streaming techniques on the internet use HTTP to distribute content. However, there are many differences in how and when these technologies might be used, based on viewer devices and network characteristics. There is often a need to use one or more of these streaming techniques to deliver to a variety of different devices, including smart TVs, set-top boxes (STBs), personal computers (PCs), and handheld devices, such as tablets and smartphones.

There are two typical workflows for streaming:

  • Video on demand (VOD) – In this workflow, the media assets are stored as encoded video files and played on request. The media assets are either stored in multiple formats and bitrates ready for playout, or packaged at the time of request using just-in-time (JIT) packaging, for different client devices.

  • Live streaming – This workflow describes source content for linear TV or assembled channels and live events, such as sports, concerts, news, or any other live broadcast event. This content is packaged in real time, in multiple formats for different client devices.

Video on demand (VOD) and live streaming can both be delivered using one or more different streaming technologies. The major advantage of streaming for VOD applications is that a client does not have to download the whole file before playback begins.

Common standards of streaming technologies

  • HTTP is used for video and signaling to provide compatibility with networks, firewalls, caches and end clients.

  • Video is delivered as a set of concurrent streams, or renditions, encoded at different bitrates.

  • Each video stream is divided into a series of segment files, each containing a few seconds of video.

  • The list of available streams is obtained by downloading an index or manifest file.

  • The client selects a stream, downloads the first few segments (depending on the player's buffer size), and starts playing. The client continually downloads the next segment while the current one is playing, giving the appearance of a continuous video stream.

  • The client chooses which video stream to use for the next segment, allowing it to adapt to immediate network conditions on a segment-by-segment basis, as illustrated in the sketch after this list.
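
A minimal Python sketch of this loop follows. It assumes a hypothetical in-memory manifest and example URLs and is not tied to any particular streaming format; a real player reads this information from an HLS or DASH manifest and overlaps downloading with playback.

import time
import urllib.request

# Hypothetical manifest: bitrate of each rendition mapped to its segment URLs.
MANIFEST = {
    3_000_000: ["https://cdn.example.com/720p/seg1.ts", "https://cdn.example.com/720p/seg2.ts"],
    800_000:   ["https://cdn.example.com/360p/seg1.ts", "https://cdn.example.com/360p/seg2.ts"],
}

def download(url):
    """Download one segment and return (bytes, measured throughput in bits per second)."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        data = response.read()
    elapsed = max(time.monotonic() - start, 1e-6)
    return data, len(data) * 8 / elapsed

def play(segment):
    """Placeholder for handing a downloaded segment to the decoder."""

def stream():
    throughput = 1_000_000                  # initial estimate before any measurement
    bitrates = sorted(MANIFEST)             # available bitrates, ascending
    segment_count = len(next(iter(MANIFEST.values())))

    for index in range(segment_count):
        # Pick the highest rendition the current throughput estimate can sustain.
        chosen = max((b for b in bitrates if b <= throughput * 0.8), default=bitrates[0])
        segment, throughput = download(MANIFEST[chosen][index])
        play(segment)                       # playback continues while the next segment downloads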

HTTP Live Streaming

HTTP Live Streaming (HLS) was developed by Apple and is natively supported on all iOS devices. Segments are packaged using either MPEG-2 transport stream (.ts files) or fragmented MPEG-4 (fMP4 files with an .mp4 extension) and are typically 2 to 10 seconds long. The list of streams is provided in a manifest file with an .m3u8 extension. HLS supports the H.264 (AVC) and H.265 (HEVC) video codecs.
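
For illustration, the following sketch shows a simplified multivariant (master) playlist embedded as a Python string, together with a minimal parser that lists the available renditions. The bitrates, resolutions, and file names are hypothetical.

# Simplified HLS multivariant playlist (hypothetical values) and a minimal parser.
SAMPLE_M3U8 = """\
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.64001e,mp4a.40.2"
360p.m3u8
"""

def list_renditions(playlist_text):
    """Pair each #EXT-X-STREAM-INF line with the media playlist URI on the following line."""
    lines = [line.strip() for line in playlist_text.splitlines() if line.strip()]
    return [(attrs, uri) for attrs, uri in zip(lines, lines[1:])
            if attrs.startswith("#EXT-X-STREAM-INF:")]

for attrs, uri in list_renditions(SAMPLE_M3U8):
    print(uri, "->", attrs)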

The Low-Latency HLS (LL-HLS) extension to the specification allows media segments to be delivered in parts, referred to as partial segments. A partial segment can be as short as 200 milliseconds and can be published much earlier than the complete segment.

Dynamic Adaptive Streaming over HTTP

Dynamic Adaptive Streaming over HTTP (DASH or MPEG-DASH) was collaboratively developed as an international standard for adaptive streaming technology. The DASH specification is designed to be extensible and support both current and future video codecs. Implementations have focused on the most commonly used codecs including H.264 and H.265.
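
As a comparable illustration, the sketch below shows a much-simplified DASH manifest (MPD) embedded as a Python string, with a minimal parser that lists its representations. The IDs and bitrates are hypothetical, and a real MPD also carries segment timing and addressing information.

import xml.etree.ElementTree as ET

# Much-simplified MPEG-DASH manifest (hypothetical values).
SAMPLE_MPD = """\
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="1080p" bandwidth="6000000" width="1920" height="1080" codecs="avc1.640028"/>
      <Representation id="720p" bandwidth="3000000" width="1280" height="720" codecs="avc1.64001f"/>
      <Representation id="360p" bandwidth="800000" width="640" height="360" codecs="avc1.64001e"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

namespace = {"dash": "urn:mpeg:dash:schema:mpd:2011"}
root = ET.fromstring(SAMPLE_MPD)
for representation in root.findall(".//dash:Representation", namespace):
    print(representation.get("id"), representation.get("bandwidth"), "bps", representation.get("codecs"))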

The low latency variant of MPEG-DASH, referred to as Low Latency DASH (LL-DASH), makes use of HTTP Chunked Transfer Encoding to enable the player to start receiving the segment in chunks even before the full segment is completed.
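
A minimal Python sketch of consuming a segment progressively follows, assuming a hypothetical segment URL. The point is that each chunk can be passed to the decoder as soon as it arrives, rather than after the whole segment has been downloaded.

import urllib.request

def read_segment_in_chunks(url, chunk_size=16_384):
    """Yield a segment's bytes progressively instead of waiting for the complete file."""
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk          # hand each chunk to the decoder as soon as it arrives

# Hypothetical usage:
# for chunk in read_segment_in_chunks("https://cdn.example.com/live/segment_latest.m4s"):
#     decoder.feed(chunk)       # decoder is a placeholder for the player's media pipeline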

Common Media Application Format

The Common Media Application Format (CMAF) is somewhat broader in scope than HLS or DASH. It is also the result of an international standardization effort. CMAF specifies a set of tracks that allow clients to identify and select media objects.

A key difference between CMAF and HLS or DASH is that segments are further subdivided into fragments. For example, a segment of eight seconds would be divided into four fragments of two seconds each. With CMAF, a client can be playing a fragment while still downloading other fragments from the same segment. This ability allows for a much lower latency, compared to HLS or DASH, because it is not necessary to download an entire segment before starting to play it.
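
As a rough back-of-the-envelope illustration, using the eight-second segment from the example above and ignoring encoding and network overhead, the minimum amount of media a client must receive before playback can begin differs by segment versus fragment duration:

# Rough illustration only: minimum media a client must receive before playback can start.
segment_duration = 8.0     # seconds per segment (example above)
fragment_duration = 2.0    # seconds per fragment (four fragments per segment)

print("Whole-segment delivery: wait for", segment_duration, "seconds of media")
print("Fragmented (CMAF) delivery: wait for", fragment_duration, "seconds of media")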

CMAF is defined by ISO Standard ISO/IEC 23000-19 and its application for HLS is defined in Apple documentation. The application of CMAF for achieving low latency is described in more depth in the AWS Media Blog post, Lower latency with AWS Elemental MediaStore chunked object transfer.

Microsoft Smooth Streaming

Microsoft Smooth Streaming (MSS) was first launched by Microsoft in 2008 and remains a proprietary technology, defined in the Microsoft Smooth Streaming (MS-SSTR) specification. MSS uses an index file to list the available streams and is still widely used on legacy platforms.