This whitepaper is for historical reference only. Some content might be outdated and some links might not be available.
Fundamental components of RTC architecture
In the telecommunications industry, RTC commonly refers to live media sessions between two endpoints with minimum latency. These sessions could be related to:
- A voice session between two parties (such as a telephone system, mobile, or Voice over IP (VoIP))
- Instant messaging (such as chatting and Internet Relay Chat (IRC))
- A live video session (such as video conferencing and telepresence)
Each of the preceding solutions has some components in common (such as components that provide authentication, authorization and access control, transcoding, buffering and relay, and so on) and some components unique to the type of media transmitted (such as broadcast service, messaging server and queues, and so on). This section focuses on defining a voice- and video-based RTC system and all of the related components, as illustrated in the following figure.

Essential architectural components for RTC
Softswitch/PBX
A softswitch or PBX is the brain of a voice telephone system. It provides the intelligence for establishing, maintaining, and routing voice calls within or outside the enterprise by using different components. All subscribers of the enterprise must register with the softswitch to make or receive calls. An important function of the softswitch is to keep track of each subscriber and how to reach them by using the other components within the voice network.
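The registration and tracking role described above can be sketched as a small location table. This is only an illustration, not how any particular softswitch is implemented: the `Registrar` class, its method names, the one-hour default lifetime, and the example addresses are all assumptions made for this sketch.

```python
import time
from dataclasses import dataclass


@dataclass
class Binding:
    contact: str       # where the subscriber can currently be reached
    expires_at: float  # clock value after which the binding is stale


class Registrar:
    """Toy location table (hypothetical): subscriber address -> current contact."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._bindings = {}

    def register(self, subscriber, contact, ttl_seconds=3600):
        # A subscriber must register before it can make or receive calls.
        self._bindings[subscriber] = Binding(contact, self._clock() + ttl_seconds)

    def lookup(self, subscriber):
        # Return the current contact, or None if unknown or expired.
        binding = self._bindings.get(subscriber)
        if binding is None or binding.expires_at <= self._clock():
            self._bindings.pop(subscriber, None)
            return None
        return binding.contact

    def route(self, caller, callee):
        # Only registered subscribers may place calls in this sketch.
        if self.lookup(caller) is None:
            raise PermissionError(f"{caller} is not registered")
        return self.lookup(callee)
```

A call from one registered subscriber to another resolves to the callee's last registered contact. Once the lifetime passes without a refresh, `lookup` returns `None` and the call cannot be routed inside the network.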
Session border controller (SBC)
A session border controller (SBC) sits at the edge of a voice network and keeps track of all incoming and outgoing traffic (both control and data planes). One of the key responsibilities of an SBC is to protect the voice system from malicious use. The SBC can be used to interconnect with Session Initiation Protocol (SIP) trunks for external connectivity. Some SBCs also provide transcoding capabilities for converting between CODECs.
PSTN connectivity
Voice over IP (VoIP) solutions use Public Switched Telephone Network (PSTN) gateways and SIP trunks to connect with legacy PSTN networks.
PSTN gateway
The PSTN gateway converts the signaling between SIP and SS7, and the media between Real-time Transport Protocol (RTP) and time division multiplexing (TDM), using CODEC transcoding. PSTN gateways always sit at the edge, close to the PSTN network.
SIP trunk
With a SIP trunk, the enterprise does not terminate its calls on a TDM (SS7-based) network. Instead, the flows between the enterprise and the telco remain over IP. Most SIP trunks are established by using SBCs. The enterprise must agree to the security rules predefined by the telco, such as allowing only a certain range of IP addresses, ports, and so on.
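As a rough illustration of such predefined rules, the following sketch checks whether a packet's source address and destination port fall within an agreed allowlist. The `TRUNK_RULES` values and the `is_allowed` helper are hypothetical. The address block is a documentation-only range, and the port ranges are placeholders rather than values from any real telco agreement.

```python
import ipaddress

# Hypothetical trunk rules, for illustration only.
TRUNK_RULES = {
    "networks": [ipaddress.ip_network("203.0.113.0/28")],  # telco signaling/media hosts
    "signaling_ports": {5060, 5061},                       # SIP and SIP over TLS
    "media_ports": range(10000, 20000),                    # assumed RTP port range
}


def is_allowed(source_ip, dest_port, rules=TRUNK_RULES):
    """Return True if the packet matches the agreed address and port rules."""
    addr = ipaddress.ip_address(source_ip)
    if not any(addr in net for net in rules["networks"]):
        return False
    return dest_port in rules["signaling_ports"] or dest_port in rules["media_ports"]
```

In practice these rules are typically enforced on the SBC itself or in network-level controls such as security groups. The function above only shows the shape of the check.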
Media gateway (transcoder)
Users communicate in real time using audio and/or video, as well as optional data and other information. To communicate, the two devices must agree on a mutually understood codec for each media track so that they can successfully exchange and present the shared media. All WebRTC-compatible browsers must support Opus and G.711 for audio, and VP8 and H.264 for video.
A typical voice solution outside the WebRTC ecosystem allows various types of CODECs. Some of the common CODECs are G.711 µ-law for North America, G.711 A-law, G.729, and G.722. When two devices that are using two different CODECs communicate with each other, the media gateway translates the CODEC flow between the devices. In other words, a media gateway processes media, and ensures that the end devices are able to communicate with each other.
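The decision of whether a media gateway has to transcode can be sketched as a simple codec negotiation. The `plan_media_path` function below is a hypothetical helper, not part of any real signaling stack. It assumes each endpoint lists its codecs in preference order, using SDP-style names such as `PCMU` (G.711 µ-law), `PCMA` (G.711 A-law), `G729`, `G722`, and `opus`.

```python
def plan_media_path(offered, supported):
    """Decide whether two endpoints can exchange media directly.

    offered:   codecs of the calling device, most preferred first
    supported: codecs of the called device, most preferred first
    Returns ("direct", codec) when a common codec exists, otherwise
    ("transcode", (caller_codec, callee_codec)) for the media gateway.
    """
    if not offered or not supported:
        raise ValueError("each endpoint must list at least one codec")

    supported_lower = {codec.lower() for codec in supported}
    for codec in offered:
        if codec.lower() in supported_lower:
            # Both sides understand this codec; no transcoding is needed.
            return ("direct", codec)

    # No codec in common: the gateway translates between each side's
    # most preferred codec.
    return ("transcode", (offered[0], supported[0]))
```

For example, a WebRTC client offering `["opus", "PCMU"]` to a desk phone that supports `["G729", "PCMU"]` can talk directly over `PCMU`. A client that offers only `opus` to a `G729`-only phone needs the media gateway in the path.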
Push notifications in WebRTC
WebRTC implementations are very common on mobile devices. Unlike a web browser, a mobile device can't keep a WebSocket connection open for a long time. Therefore, it must rely on push notifications from the WebRTC server for all incoming requests, such as calls and messages.
Amazon Simple Notification Service (Amazon SNS) can be used to deliver these push notifications.

Amazon SNS for push notifications
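A minimal sketch of how a WebRTC server might ask Amazon SNS to wake a mobile device for an incoming call follows. It assumes the device is already registered as an SNS platform endpoint, and that `sns_client` is a boto3 SNS client created elsewhere (for example with `boto3.client("sns")`). The payload fields `type`, `caller`, and `call_id` are hypothetical and would be defined by the mobile application.

```python
import json


def build_call_notification(caller, call_id):
    """Build an SNS JSON-structured message for an incoming call.

    With MessageStructure="json", the top-level object needs a "default"
    entry, and each platform entry (here "GCM") is itself a JSON string.
    """
    data = {"type": "incoming_call", "caller": caller, "call_id": call_id}
    return json.dumps({
        "default": f"Incoming call from {caller}",
        "GCM": json.dumps({"data": data}),
    })


def notify_device(sns_client, endpoint_arn, caller, call_id):
    """Publish the notification to one device's platform endpoint."""
    return sns_client.publish(
        TargetArn=endpoint_arn,
        Message=build_call_notification(caller, call_id),
        MessageStructure="json",
    )
```

On receipt, the application would typically reconnect to the WebRTC server and continue call setup over its normal signaling channel.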
WebRTC and WebRTC gateway
Web real-time communication (WebRTC) allows you to establish a call from a web browser, or to request resources from the backend server, by using an API. The technology is designed with the cloud in mind and therefore provides various APIs that can be used to establish a call. Because not all voice solutions (including SIP) support these APIs, a WebRTC gateway is required to translate API calls into SIP messages and vice versa.
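The signaling half of that translation can be sketched as a function that turns an API-style call request into a SIP INVITE. This is a simplified illustration only: the request shape (`from` and `to` keys), the `api_call_to_invite` name, and the gateway contact address are assumptions. A real gateway also handles responses, authentication, and media interworking.

```python
import uuid


def api_call_to_invite(request, gateway_host, sdp):
    """Translate a hypothetical API call request into a SIP INVITE message.

    request:      dict with "from" and "to" SIP URIs (assumed API shape)
    gateway_host: host name the gateway uses in Via, Call-ID, and Contact
    sdp:          session description to carry as the message body
    """
    call_id = f"{uuid.uuid4()}@{gateway_host}"
    branch = "z9hG4bK" + uuid.uuid4().hex  # prefix required by RFC 3261
    tag = uuid.uuid4().hex[:8]

    lines = [
        f"INVITE {request['to']} SIP/2.0",
        f"Via: SIP/2.0/TCP {gateway_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{request['from']}>;tag={tag}",
        f"To: <{request['to']}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:gateway@{gateway_host}>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # Headers end with an empty line, followed by the SDP body.
    return "\r\n".join(lines) + "\r\n\r\n" + sdp
```

The reverse direction, mapping SIP responses such as `180 Ringing` or `200 OK` back to API events, follows the same pattern.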
The following figure shows a design pattern for a highly available WebRTC architecture.
The incoming traffic from WebRTC clients is balanced by an Application Load Balancer.

A basic topology of an RTC system for voice
Another design pattern for SIP and RTP traffic is to use pairs of SBCs on Amazon EC2 in active-passive mode across Availability Zones, as shown in the following figure. In cases where the Domain Name System (DNS) cannot be used for failover, an Elastic IP address can be moved dynamically between instances upon failure.

RTC architecture using Amazon EC2 in a virtual private cloud (VPC)
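The Elastic IP failover in the active-passive pattern can be sketched as follows. The code assumes `ec2_client` is a boto3 EC2 client created elsewhere (for example with `boto3.client("ec2")`). The `is_healthy` callback is a hypothetical health probe, and the function names and identifiers are placeholders.

```python
def fail_over(ec2_client, allocation_id, standby_instance_id):
    """Move the Elastic IP address to the standby SBC instance."""
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=standby_instance_id,
        AllowReassociation=True,  # take the address over from the failed instance
    )
    return response["AssociationId"]


def monitor_once(ec2_client, is_healthy, allocation_id, active_id, standby_id):
    """Run one health check; return the instance that should hold the address.

    is_healthy is a hypothetical callable, for example a SIP OPTIONS probe
    against the active SBC.
    """
    if is_healthy(active_id):
        return active_id
    fail_over(ec2_client, allocation_id, standby_id)
    return standby_id
```

A monitoring process would call `monitor_once` on a short interval. Because the public address itself moves, SIP peers keep sending to the same IP address and do not depend on DNS updates.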