The Anatomy of a WebSocket

The WebSocket protocol is fairly simple, but understanding how it works is essential to understanding how to secure (and exploit) it. The protocol consists of two parts: a handshake and the data transfer.

The Handshake

The handshake is intended to be compatible with HTTP-based server-side software and intermediaries; therefore, the handshake is done via an HTTP Upgrade request.

The client initiates the handshake with the server:

		GET /chat HTTP/1.1
		Host: server.example.com
		Upgrade: websocket
		Connection: Upgrade
		Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
		Origin: http://example.com
		Sec-WebSocket-Protocol: chat, superchat
		Sec-WebSocket-Version: 13

And the server responds:

		HTTP/1.1 101 Switching Protocols
		Upgrade: websocket
		Connection: Upgrade
		Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
		Sec-WebSocket-Protocol: chat

In addition to the typical HTTP headers, this interaction includes some headers unique to the WebSocket handshake: Sec-WebSocket-Protocol, Sec-WebSocket-Version, Sec-WebSocket-Key, and Sec-WebSocket-Accept. The client sends the Sec-WebSocket-Protocol header to tell the server which sub-protocols it is willing to use. The server selects at most one of the offered sub-protocols and, if it selects one, echoes it in the response to tell the client which was chosen. The Sec-WebSocket-Version header designates which version of the WebSocket protocol the client wishes to use. The Sec-WebSocket-Key and Sec-WebSocket-Accept headers are less self-explanatory. The Sec-WebSocket-Key header contains a base64-encoded nonce. The client sends this to the server, and the server responds with a hash derived from the key in the Sec-WebSocket-Accept header: per RFC 6455, the key is concatenated with a fixed GUID, hashed with SHA-1, and base64-encoded. This lets the server prove to the client that it received the client’s handshake request, and it prevents the client from accepting a cached handshake response as though it had just come from the server.

If the Sec-WebSocket-Accept value matches the expected value and the server responds with a 101 HTTP status code, then the WebSocket connection will be established and frames can be sent between the server and the client.
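The check on the Sec-WebSocket-Accept value can be reproduced in a few lines. The sketch below follows the derivation given in RFC 6455: the server appends a fixed GUID to the client's key, hashes the result with SHA-1, and base64-encodes the digest. The function names are illustrative and are not taken from any particular WebSocket library.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    material = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    digest = hashlib.sha1(material).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_succeeded(status_code: int, sent_key: str, accept_header: str) -> bool:
    """Client-side check: a 101 status and the expected Accept value."""
    return status_code == 101 and accept_header == compute_accept(sent_key)


if __name__ == "__main__":
    # Key taken from the example handshake request above.
    print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

Run against the key in the example request, this produces s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, the value the server returns in the example response.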

The WebSocket Security Model

The security model for WebSockets, as defined in RFC 6455, Section 1.6, states that “the WebSocket Protocol uses the origin model used by web browsers to restrict which web pages can contact a WebSocket server when the WebSocket Protocol is used from a web page.” Ensuring that WebSocket connections can only be established from trusted origins defends against cross-origin attacks such as Cross-Site Request Forgery (CSRF). Although the standard defines this model, the protocol itself does not enforce a check on the Origin header in the client’s handshake request. Ensuring that the handshake comes from a trusted origin is left to the developers implementing WebSockets in their application. A weak or missing check against the Origin header in the client’s handshake leaves the WebSocket open to abuse by an attacker. Before looking at how a weak check can be used to attack WebSockets, we must first understand how CSRF attacks work.

Cross-Site Request Forgery (CSRF)

CSRF attacks trick users into submitting authenticated requests to an application. They allow an attacker to inherit the identity and privileges of the victim and to make requests on the victim’s behalf. One way such an attack can work is by exploiting the properties of cookies. Cookies are stored in the user’s browser and are scoped to a specific domain. When the browser makes a request to a domain that a cookie is scoped to, it automatically attaches that cookie to the request. If an application identifies users through unique identifiers stored in cookies, a CSRF attack can abuse this mechanism to forge requests to the target site from the victim’s browser. The cookies associated with the target site are attached to the forged request, and when the server receives it, the server reads the session cookie and handles the request with the privileges of the user that the cookie belongs to.
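The cookie behavior described above can be illustrated with a toy model. This is a deliberately simplified sketch, not a description of any real browser: the ToyBrowser class and the target.example and malicious-site.example hosts are hypothetical, and the model ignores cookie attributes such as SameSite, Secure, Path, and expiry.

```python
from dataclasses import dataclass, field
from typing import Dict
from urllib.parse import urlsplit


@dataclass
class ToyBrowser:
    """Toy cookie jar keyed by host; ignores SameSite, Secure, Path, expiry."""

    cookies: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def set_cookie(self, host: str, name: str, value: str) -> None:
        self.cookies.setdefault(host, {})[name] = value

    def build_request(self, url: str, initiating_site: str) -> Dict[str, str]:
        """Build request headers; cookie selection depends only on the target host."""
        host = urlsplit(url).hostname or ""
        headers = {"Host": host, "Origin": initiating_site}
        jar = self.cookies.get(host, {})
        if jar:
            headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in jar.items())
        return headers


browser = ToyBrowser()
# The victim logged in to the target site earlier.
browser.set_cookie("target.example", "session_id", "victim_uuid")
# A page on the attacker's site now triggers a request to the target.
forged = browser.build_request(
    "https://target.example/transfer", "https://malicious-site.example"
)
print(forged)
```

In this model the forged request carries the victim's session cookie even though the attacker's page initiated it, which is the property a CSRF attack relies on.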

Cross-Site WebSocket Hijacking

The concept behind CSRF attacks can be extended to attacks on WebSockets. In many cases, applications need to associate WebSocket connections with specific users so that they can handle the data frames sent over each connection accordingly. One example would be a messaging application that sends and receives messages to and from users over WebSockets. The application server needs to associate active WebSocket connections with users in order to handle messages received over a connection properly and to forward messages from other connections to the correct place. In this example, the user initiates the handshake by sending the handshake request to the server along with their session identifier. The server can then read the session identifier, open a new WebSocket connection, and associate the new connection with the user that the session identifier belongs to. WebSocket authentication is typically done only once, during the handshake, and the connection is then associated with that user for its lifetime. If the application uses cookies to manage user sessions, an attacker may be able to forge the handshake request using a CSRF attack and control the messages sent and received over the WebSocket connection. This is known as Cross-Site WebSocket Hijacking (CSWSH), and it is probably the most prominent attack against WebSocket connections. Let’s look at what an attack against our messaging application might look like.

For the attack to be meaningful, the attacker’s malicious website needs to be able to open a valid WebSocket connection as the victim and to send and read messages over that connection. A simple web page can achieve just that: a script on the page opens a WebSocket connection to the messaging application and attaches handlers for sending and receiving messages.

When the victim visits the attacker’s website hosting such a page, the page will initiate the handshake and the victim’s browser will append the victim’s session cookies to the request.

		GET /chat HTTP/1.1
		Host: example-messaging-app.com
		Upgrade: websocket
		Connection: Upgrade
		Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
		Origin: http://malicious-site.com
		Sec-WebSocket-Protocol: chat
		Sec-WebSocket-Version: 13
		Cookie: session_id=victim_uuid

Since our messaging application is vulnerable, it does not check whether the Origin header in the client’s handshake request identifies a trusted source. It will accept the request and open a new WebSocket connection associated with the victim’s session_id cookie from the handshake. Now that the malicious site has an active WebSocket connection posing as the victim, it can send messages to the application using JavaScript. The server will see that a message has been received over the connection associated with the victim and post the message as the victim. Additionally, the attacker’s site can watch the WebSocket connection and, whenever a new message arrives from the server, forward it to another attacker-controlled server that logs it for the attacker to read. The ability to read messages over the compromised WebSocket connection is where CSWSH attacks differ from CSRF attacks. Typically, with a CSRF attack the attacker can only make state-changing requests and cannot view the application’s responses.
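The server-side flaw can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of any real application: the SESSIONS store, the vulnerable_accept function, and the header dictionary are hypothetical stand-ins for whatever framework handles the handshake.

```python
from typing import Dict, Optional

# Hypothetical session store: session identifier -> user name.
SESSIONS: Dict[str, str] = {"victim_uuid": "victim"}


def parse_cookies(cookie_header: str) -> Dict[str, str]:
    """Split a Cookie header such as 'a=1; b=2' into a dictionary."""
    pairs = (item.split("=", 1) for item in cookie_header.split(";") if "=" in item)
    return {name.strip(): value.strip() for name, value in pairs}


def vulnerable_accept(headers: Dict[str, str]) -> Optional[str]:
    """Return the user the new connection is bound to, or None if rejected.

    Flaw: the decision is based on the session cookie alone, and the
    Origin header is never inspected.
    """
    session_id = parse_cookies(headers.get("Cookie", "")).get("session_id")
    if session_id is None:
        return None
    return SESSIONS.get(session_id)


# Headers from the forged handshake request shown above.
forged_handshake = {
    "Host": "example-messaging-app.com",
    "Origin": "http://malicious-site.com",
    "Cookie": "session_id=victim_uuid",
}
bound_user = vulnerable_accept(forged_handshake)
print(bound_user)
```

Because the handler never looks at Origin, the connection opened by the attacker's page is bound to the victim's account exactly as a legitimate one would be.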

Defending Against CSWSH Attacks

To protect against CSWSH and CSRF attacks alike, a few things can be done. The most common defense against these kinds of attacks is to implement a CSRF token. A CSRF token is an additional piece of authentication data that is unique to each request made by a user. This token is added to each request (either in an HTTP header or in the body of the request) and validated by the server before the rest of the request is evaluated. Since the token is unique to each request made by the user, an attacker cannot insert the correct token into a forged payload, so the server rejects the forged request. Additionally, applications can check each request’s Origin header to ensure that the request comes from a trusted source. When the CSRF payload attempting to initiate the WebSocket handshake is sent to the server, the Origin header is set to the attacker’s malicious website. Proper validation of the Origin header will identify the request as coming from an untrusted source and discard it. Another way to defend against this type of attack is to use a session management system that does not send user session identifiers through cookies. Session identifiers can instead be sent in HTTP headers, which the browser does not attach automatically and which the attacker’s page cannot read, preventing the attacker from authenticating to the target application as the victim.
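Two of these defenses, the Origin check and the CSRF token, can be combined into a single handshake check. The sketch below makes several assumptions: the trusted origin https://example-messaging-app.com, the in-memory token store, and all function names are hypothetical, and how the token reaches the server is left to the application.

```python
import hmac
import secrets
from typing import Dict, Optional

# Assumption: the only origin allowed to open WebSocket connections.
TRUSTED_ORIGINS = {"https://example-messaging-app.com"}

# Hypothetical in-memory store: session identifier -> outstanding token.
_pending_tokens: Dict[str, str] = {}


def issue_csrf_token(session_id: str) -> str:
    """Create a fresh single-use token and remember it for the session."""
    token = secrets.token_urlsafe(32)
    _pending_tokens[session_id] = token
    return token


def origin_is_trusted(origin: Optional[str]) -> bool:
    """Exact match against the allowlist; substring checks are easy to bypass."""
    return origin is not None and origin in TRUSTED_ORIGINS


def csrf_token_is_valid(session_id: str, presented: str) -> bool:
    """Consume the stored token and compare it in constant time."""
    expected = _pending_tokens.pop(session_id, None)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), presented.encode())


def allow_handshake(headers: Dict[str, str], session_id: str, csrf_token: str) -> bool:
    """Accept the handshake only if both the Origin and the token check out."""
    return origin_is_trusted(headers.get("Origin")) and csrf_token_is_valid(
        session_id, csrf_token
    )
```

With these checks in place, the forged handshake from http://malicious-site.com fails the Origin test, and a request from a trusted origin that lacks the current token is rejected as well. Matching the Origin value exactly, rather than by prefix or substring, avoids accepting look-alike hosts.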