WebRTC: Media, Signaling, ICE, and Why WebRTC Is So Powerful

Conrad

WebRTC is often introduced as “real-time communication in the browser.” That description is accurate — and wildly incomplete.

Behind a simple RTCPeerConnection lies a dense stack of protocols, heuristics, NAT traversal tricks, congestion control algorithms, codec negotiation rules, and security constraints that rival most telecom systems built over decades.

This article peels back the layers of WebRTC from an engineering perspective, focusing on the parts that matter when you’re embedding voice and video into real products.


1. WebRTC Is Not a Protocol — It’s a System

WebRTC is best understood as a collection of standards, not a single protocol:

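| Layer     | Technology                                   |
|-----------|----------------------------------------------|
| Signaling | Undefined (WebSocket, SIP, HTTP, MQTT, etc.) |
| Transport | ICE, STUN, TURN                              |
| Security  | DTLS, SRTP                                   |
| Media     | RTP, RTCP                                    |
| Control   | SDP                                          |
| APIs      | RTCPeerConnection, MediaStream, getUserMedia |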

This separation is intentional — and powerful — but it means you own the architecture.


2. Signaling: The Part WebRTC Refuses to Define

WebRTC deliberately avoids defining signaling. That’s why it works equally well with:

  • SIP (via gateways)
  • WebSockets
  • REST-based offer/answer exchanges
  • Message queues
  • Serverless architectures

Minimal Signaling Example (WebSocket)

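    // Assumes `config` (an RTCConfiguration) and `socket` (an open WebSocket) already exist,
    // and that this runs inside an async function or an ES module.
    const pc = new RTCPeerConnection(config);

    // Send each local ICE candidate to the remote peer as it is gathered
    pc.onicecandidate = event => {
      if (event.candidate) {
        socket.send(JSON.stringify({ type: "ice", candidate: event.candidate }));
      }
    };

    // Create the offer, apply it locally, then send it to the remote peer
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    socket.send(JSON.stringify({ type: "offer", sdp: offer.sdp }));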

Key insight:
WebRTC signaling is state synchronization, not messaging.
A lost or reordered offer, answer, or candidate leaves the two peers with inconsistent state, and the call fails to connect.
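
To make the "state synchronization" point concrete, here is a minimal sketch of an offer/answer state tracker. The state names mirror RTCPeerConnection's signalingState values, but the event names, the `TRANSITIONS` table, and both helper functions are hypothetical, and provisional answers and rollback are left out on purpose.

```javascript
// Simplified offer/answer transitions. Event names are illustrative only.
const TRANSITIONS = {
  "stable": {
    "local-offer": "have-local-offer",
    "remote-offer": "have-remote-offer",
  },
  "have-local-offer": { "remote-answer": "stable" },
  "have-remote-offer": { "local-answer": "stable" },
};

// Return the next state, or throw if the event makes no sense in this state.
function nextSignalingState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  if (!next) {
    throw new Error(`Unexpected ${event} in state ${state}`);
  }
  return next;
}

// Replay a sequence of signaling events from the initial "stable" state.
function replaySignaling(events, initial = "stable") {
  return events.reduce((state, event) => nextSignalingState(state, event), initial);
}
```

Replaying `["local-offer", "remote-answer"]` ends back in "stable", while an answer that arrives without its offer throws. A signaling layer can use a check like this to detect dropped or reordered messages early instead of discovering them as a dead call.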


3. ICE: How WebRTC Finds a Path Through the Internet

ICE (Interactive Connectivity Establishment) is WebRTC’s NAT traversal engine.

Candidate Types

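| Type  | Description                                                     |
|-------|-----------------------------------------------------------------|
| host  | Address of a local network interface                            |
| srflx | Server-reflexive: the public address observed by a STUN server  |
| relay | Address allocated on a TURN server that relays the media        |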

A typical ICE candidate looks like this:

    candidate:842163049 1 udp 2122260223 192.168.1.12 56143 typ host
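
The fields in a candidate line are positional: foundation, component, transport, priority, address, port, then `typ` and the candidate type. The sketch below splits a line into those fields. `parseCandidate` is a hypothetical helper, not a browser API, and the `typePreference` value assumes the candidate's priority was computed with the formula recommended in RFC 8445, where the type preference fills the top 8 bits.

```javascript
// Parse the leading fields of an ICE candidate line into an object.
function parseCandidate(line) {
  const text = line.trim().replace(/^a=/, "").replace(/^candidate:/, "");
  const parts = text.split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    throw new Error("Not a valid ICE candidate line");
  }
  const [foundation, component, protocol, priority, address, port, , type] = parts;
  return {
    foundation,
    component: Number(component), // 1 = RTP, 2 = RTCP
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // host, srflx, prflx, or relay
    // Assumes the RFC 8445 recommended priority formula (type preference in the top 8 bits).
    typePreference: Math.floor(Number(priority) / 2 ** 24),
  };
}

// Example with a made-up server-reflexive candidate (203.0.113.0/24 is a documentation range).
console.log(parseCandidate(
  "candidate:1 1 udp 1686052607 203.0.113.7 56143 typ srflx raddr 192.168.1.12 rport 56143"
));
```

Logging parsed candidates on both sides of a call is a quick way to see which candidate types were actually gathered.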

ICE Flow (Simplified)

  1. Gather candidates
  2. Exchange candidates via signaling
  3. Test connectivity pairs
  4. Select best viable path
  5. Nominate the selected pair and send media over it
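
Steps 3 and 4 rely on an ordering of candidate pairs. RFC 8445 defines a pair priority from the two candidate priorities, where G belongs to the controlling agent and D to the controlled agent. The sketch below applies that formula and sorts pairs by it. The function names and the pair object shape are hypothetical, and it assumes the local agent is the controlling one.

```javascript
// Pair priority per RFC 8445: 2^32 * MIN(G, D) + 2 * MAX(G, D) + (G > D ? 1 : 0).
// BigInt is used because the result does not fit safely in a JS number.
function pairPriority(controllingPriority, controlledPriority) {
  const g = BigInt(controllingPriority);
  const d = BigInt(controlledPriority);
  const min = g < d ? g : d;
  const max = g > d ? g : d;
  return (1n << 32n) * min + 2n * max + (g > d ? 1n : 0n);
}

// Sort candidate pairs from highest to lowest priority.
// Assumes the local agent is controlling, so local.priority plays the role of G.
function sortCandidatePairs(pairs) {
  return [...pairs].sort((a, b) => {
    const pa = pairPriority(a.local.priority, a.remote.priority);
    const pb = pairPriority(b.local.priority, b.remote.priority);
    return pa === pb ? 0 : pa > pb ? -1 : 1;
  });
}
```

Because the minimum of the two priorities dominates, a pair is only ranked as highly as its weaker candidate, which is why relay pairs tend to be checked last.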

TURN Is Not Optional

If your product must:

  • Work on corporate networks
  • Support mobile users
  • Handle symmetric NATs

Then TURN is mandatory, not a fallback.

If you don’t budget for TURN, your call success rate will tell that story for you.
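
As a rough illustration, an RTCConfiguration that budgets for TURN might look like the sketch below. The hostnames and credentials are placeholders, not real servers. The `turns:` entry on port 443 over TCP is there for networks that block UDP entirely.

```javascript
// Hypothetical servers and credentials. Replace with your own deployment.
const rtcConfig = {
  iceServers: [
    { urls: "stun:stun.example.org:3478" },
    {
      urls: [
        "turn:turn.example.org:3478?transport=udp",
        "turn:turn.example.org:3478?transport=tcp",
        "turns:turn.example.org:443?transport=tcp",
      ],
      username: "demo-user",
      credential: "demo-secret",
    },
  ],
  // "all" lets ICE try direct paths first; "relay" forces TURN,
  // which is handy for testing that the relay path really works.
  iceTransportPolicy: "all",
};
```

Passing this object to `new RTCPeerConnection(rtcConfig)` is enough for the browser to gather relay candidates alongside host and srflx ones.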


4. Media: RTP, Codecs, and Packet Reality

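Once ICE succeeds and the DTLS handshake completes, WebRTC sends media as SRTP, normally over UDP (TCP or TLS transport is possible when relaying through TURN).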

Common Audio Codecs

  • Opus (the usual default; mandatory to implement)
  • G.711 (interop with SIP and the PSTN; also mandatory to implement)
  • G.722

Common Video Codecs

  • VP8 (baseline)
  • VP9
  • H.264 (interop, hardware acceleration)

Codec negotiation happens via SDP:

    m=audio 9 UDP/TLS/RTP/SAVPF 111
    a=rtpmap:111 opus/48000/2
    a=fmtp:111 minptime=10;useinbandfec=1

Important:
The order of codecs matters. Payload types on the m= line are listed in order of preference, and endpoints generally pick the first option both sides support.
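
To show what "order" means at the SDP level, the sketch below moves the payload types of one codec to the front of an m= line. `preferCodec` is a hypothetical helper that only looks at the first matching media section. In a browser, RTCRtpTransceiver.setCodecPreferences() is the supported way to do this, so treat string rewriting like this as an illustration, or as something a gateway might do on the server side.

```javascript
// Move all payload types for `codecName` to the front of the first m=<kind> line.
function preferCodec(sdp, kind, codecName) {
  const eol = sdp.includes("\r\n") ? "\r\n" : "\n";
  const lines = sdp.split(/\r?\n/);
  const mIndex = lines.findIndex(l => l.startsWith(`m=${kind} `));
  if (mIndex === -1) return sdp;

  // Collect payload types whose rtpmap entry names the codec, within this media section only.
  const wanted = [];
  for (let i = mIndex + 1; i < lines.length && !lines[i].startsWith("m="); i++) {
    const match = /^a=rtpmap:(\d+) ([^/]+)\//.exec(lines[i]);
    if (match && match[2].toLowerCase() === codecName.toLowerCase()) {
      wanted.push(match[1]);
    }
  }
  if (wanted.length === 0) return sdp;

  // m=<kind> <port> <proto> <payload types...>
  const parts = lines[mIndex].split(" ");
  const header = parts.slice(0, 3);
  const payloads = parts.slice(3);
  const first = payloads.filter(pt => wanted.includes(pt));
  const rest = payloads.filter(pt => !wanted.includes(pt));
  lines[mIndex] = [...header, ...first, ...rest].join(" ");
  return lines.join(eol);
}
```

Given an audio section listing `0 8 111` with 111 mapped to Opus, `preferCodec(sdp, "audio", "opus")` rewrites the line to list `111 0 8`.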


5. Congestion Control: Why WebRTC Sounds “Good” on Bad Networks

WebRTC continuously adapts bitrate using:

  • Packet loss
  • RTT
  • Jitter
  • Receiver reports (RTCP)

You can observe this via getStats():

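    const stats = await pc.getStats();
    stats.forEach(report => {
      if (report.type === "outbound-rtp") {
        // Cumulative counters: derive a bitrate from the bytesSent delta between two polls
        console.log({ packetsSent: report.packetsSent, bytesSent: report.bytesSent });
      } else if (report.type === "remote-inbound-rtp") {
        // Round-trip time is measured from the remote peer's RTCP reports, in seconds
        console.log({ roundTripTime: report.roundTripTime });
      }
    });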
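
Note that `bytesSent` is a cumulative counter, not a rate, so an actual bitrate has to be derived from two snapshots. A minimal sketch, assuming each snapshot keeps the `bytesSent` and `timestamp` (milliseconds) fields of an outbound-rtp report; `bitrateBps` is a hypothetical helper name:

```javascript
// Compute the average send bitrate in bits per second between two outbound-rtp snapshots.
function bitrateBps(previous, current) {
  const elapsedMs = current.timestamp - previous.timestamp;
  if (elapsedMs <= 0) return 0; // same or out-of-order snapshots
  const deltaBytes = current.bytesSent - previous.bytesSent;
  return (deltaBytes * 8 * 1000) / elapsedMs;
}
```

Polling getStats() once per second and feeding consecutive reports through this function gives a simple live bitrate graph.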

This feedback loop is why WebRTC often outperforms legacy VoIP stacks on mobile networks.
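
To give a feel for that feedback loop, here is a deliberately simplified loss-based rate controller. The 2% and 10% thresholds and the update rules follow the loss-based controller described in the Google Congestion Control draft, but real implementations combine this with a delay-based estimator, so treat the numbers and the `nextTargetBitrate` helper as illustrative rather than as what any browser actually ships.

```javascript
// One step of a simplified loss-based controller.
// lossFraction is the fraction of packets lost (0..1) from the latest receiver report.
function nextTargetBitrate(currentBps, lossFraction, { minBps = 30000, maxBps = 2500000 } = {}) {
  let next = currentBps;
  if (lossFraction > 0.10) {
    // Heavy loss: back off in proportion to the loss
    next = currentBps * (1 - 0.5 * lossFraction);
  } else if (lossFraction < 0.02) {
    // Negligible loss: probe upward by 5%
    next = currentBps * 1.05;
  }
  // Between 2% and 10% loss the rate is held steady
  return Math.min(maxBps, Math.max(minBps, Math.round(next)));
}
```

Run repeatedly against incoming RTCP reports, this backs off quickly under loss and climbs slowly when the path is clean.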


6. Audio Is Harder Than Video (Yes, Really)

WebRTC audio includes:

  • Echo cancellation
  • Automatic gain control
  • Noise suppression
  • Jitter buffering
  • Packet loss concealment

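Echo cancellation, gain control, and noise suppression run on the captured signal before encoding; jitter buffering and packet loss concealment run on the receive side, around decoding.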
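
Jitter buffering depends on an estimate of how much packet arrival times vary. RTP receivers commonly use the interarrival jitter estimator from RFC 3550, which smooths the change in transit time with a gain of 1/16. The sketch below applies it to a list of packets. The function names and the packet shape (`rtpTime` and `arrival` in the same time unit) are assumptions for illustration.

```javascript
// One step of the RFC 3550 estimator: J = J + (|D| - J) / 16,
// where D is the change in transit time between consecutive packets.
function updateJitter(jitter, previousTransit, transit) {
  const d = Math.abs(transit - previousTransit);
  return jitter + (d - jitter) / 16;
}

// Run the estimator over packets of the form { rtpTime, arrival }.
function estimateJitter(packets) {
  let jitter = 0;
  for (let i = 1; i < packets.length; i++) {
    const previousTransit = packets[i - 1].arrival - packets[i - 1].rtpTime;
    const transit = packets[i].arrival - packets[i].rtpTime;
    jitter = updateJitter(jitter, previousTransit, transit);
  }
  return jitter;
}
```

A stream whose packets all take the same time to arrive yields a jitter of 0. The more the estimate grows, the deeper a jitter buffer has to be to play audio without gaps.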

    const stream = await navigator.mediaDevices.getUserMedia({
      audio: {
        echoCancellation: true,
        noiseSuppression: true,
        autoGainControl: true
      },
      video: false
    });

Disabling these for “raw audio” almost always makes things worse.


7. WebRTC + SIP: Bridging Two Worlds

WebRTC does not replace SIP — it complements it.

A typical architecture:

    Browser (WebRTC)
          ↕
    WebRTC Gateway (DTLS-SRTP ↔ RTP)
          ↕
    SIP Proxy / PBX

Challenges:

  • SDP normalization
  • Codec transcoding
  • NAT alignment
  • Media anchoring
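
The transcoding decision in particular comes down to whether the two legs share a codec. A minimal sketch of that check, assuming each side's codecs are given as name lists in preference order; `planCodecs` and its return shape are hypothetical and not taken from any particular gateway:

```javascript
// Pick codecs for the browser leg and the SIP leg, and report whether transcoding is needed.
function planCodecs(browserCodecs, sipCodecs) {
  const sipNames = new Set(sipCodecs.map(c => c.toLowerCase()));
  // First browser-preferred codec that the SIP side also supports
  const shared = browserCodecs.find(c => sipNames.has(c.toLowerCase()));
  if (shared) {
    return { browserLeg: shared, sipLeg: shared, transcode: false };
  }
  // No overlap: each leg uses its own top preference and the gateway transcodes
  return { browserLeg: browserCodecs[0], sipLeg: sipCodecs[0], transcode: true };
}
```

With `["opus", "PCMU"]` on the browser side and `["PCMU", "PCMA"]` on the SIP side, both legs settle on PCMU and no transcoding is needed. An Opus-only browser leg against a G.729-only trunk forces the gateway to transcode.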

This is where embedded web phones shine: abstracting complexity while keeping flexibility.


8. Security: Why WebRTC Is Always Encrypted

WebRTC forces encryption:

  • DTLS for key exchange
  • SRTP for media
  • No plaintext fallback

This is why WebRTC cannot interoperate with legacy plaintext RTP endpoints without a gateway.
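
A gateway can usually tell which kind of endpoint it is talking to from the SDP alone. The sketch below classifies an offer by its keying attributes and transport profile. `classifyMediaSecurity` is a hypothetical helper, it only inspects the first audio or video section, and the labels it returns are made up for this example.

```javascript
// Classify how the media in an SDP is secured.
function classifyMediaSecurity(sdp) {
  const lines = sdp.split(/\r?\n/);
  const mLine = lines.find(l => l.startsWith("m=audio ") || l.startsWith("m=video "));
  if (!mLine) return "no-media";
  const profile = mLine.split(" ")[2];

  // A certificate fingerprint means keys are negotiated over DTLS (what WebRTC requires)
  if (lines.some(l => l.startsWith("a=fingerprint:"))) return "dtls-srtp";
  // a=crypto carries SRTP keys inside the SDP itself (SDES), common on SIP trunks
  if (lines.some(l => l.startsWith("a=crypto:"))) return "sdes-srtp";
  // Plain RTP profiles carry no encryption at all
  if (profile === "RTP/AVP" || profile === "RTP/AVPF") return "plain-rtp";
  return "unknown";
}
```

Anything other than "dtls-srtp" on the far side means the gateway has to terminate the browser's encryption and re-originate the media.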

Security is not optional — it’s architectural.


9. Debugging WebRTC (The Right Way)

Browser Tools

  • chrome://webrtc-internals
  • Firefox WebRTC stats
  • getStats() logging

Common Failure Modes

  • Missing TURN
  • Broken ICE candidate exchange
  • Codec mismatch
  • SDP munging errors
  • Firewall UDP blocking

If your calls “sometimes work,” it’s almost always ICE.
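
One way to check the ICE theory is to ask getStats() which path a call actually ended up on. The sketch below walks a stats report from the transport entry to the selected candidate pair and its two candidates. `describeSelectedPath` is a hypothetical helper. It accepts anything Map-like, which an RTCStatsReport is, and it falls back to the nominated, succeeded pair when no transport entry names one.

```javascript
// Summarize the selected ICE path from a getStats() report.
function describeSelectedPath(stats) {
  let transport;
  stats.forEach(report => {
    if (report.type === "transport" && report.selectedCandidatePairId) {
      transport = report;
    }
  });

  const pair = transport
    ? stats.get(transport.selectedCandidatePairId)
    : [...stats.values()].find(
        r => r.type === "candidate-pair" && r.nominated && r.state === "succeeded"
      );
  if (!pair) return { connected: false };

  const local = stats.get(pair.localCandidateId);
  const remote = stats.get(pair.remoteCandidateId);
  return {
    connected: true,
    localType: local?.candidateType,
    remoteType: remote?.candidateType,
    usesRelay: local?.candidateType === "relay" || remote?.candidateType === "relay",
  };
}
```

Calling `describeSelectedPath(await pc.getStats())` on working and failing calls shows quickly whether the failures cluster on networks where only a relay path could have worked.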


10. Why WebRTC Is Still the Future

Despite being complex, WebRTC remains unmatched for:

  • Zero-install real-time comms
  • Browser-native media
  • Mobile-friendly networking
  • Secure-by-default design
  • Extensibility

The challenge isn’t WebRTC — it’s engineering discipline around it.


At web-phone.org, we focus on turning these primitives into composable, embeddable building blocks — without hiding the power that makes WebRTC worth using in the first place.

If this kind of deep dive helps you build better real-time systems, you’re in the right place.
