*Edit: This is just my copy of the article I wrote for developerFusion on WebSockets back in May 2012.*
Lately, there has been much talk about the WebSockets API, and how it could change the way web applications are developed. In this article, we will take a look at
By the end of this article, I hope you’ll appreciate just how useful WebSockets are and know how to get started rolling your own WebSockets applications.
So why do we need WebSockets? What problem are we trying to solve by using them? The answer is easy. We need a better way for web applications running on a client browser to communicate in real time with their servers. Currently, there are two common methods of providing this.
Both use the HTTP protocol to send messages to the server. Every packet of information sent over this protocol is wrapped in a lot of header information, which describes things like where the packet is heading, where it came from, the user agent and so on. All of this adds a lot of overhead when communicating in real time. Neither of these methods is ‘bi-directional full duplex’, where both client and server can send and receive each other’s messages at the same time, as in a telephone system where the people at both ends can talk and hear simultaneously. These are the reasons current techniques are not good enough for fast, scalable real-time communication on the web. We need a better solution, and that is what WebSockets give us.
WebSockets are a new way for clients to communicate with servers and vice versa, without the overhead of the HTTP protocol. They use their own protocol, which is defined by the IETF. The latest version is RFC 6455. Previous versions of the protocol proved to have some security issues, so while they were implemented in a few browsers like Opera, they were not enabled by default. The newest version of the protocol seems to have improved on these issues, and browsers are working on supporting it now.
Apart from the protocol, WebSockets also have an API which web applications can use to open and close connections and to send and receive messages. This is called the WebSockets API and is defined in a W3C specification.
With WebSockets you can have full duplex bi-directional communication between the server and the client with less overhead than traditional HTTP based methods. This promises faster, more scalable and more robust high performance real time applications on the web. In fact, according to some analysis by the Kaazing Corporation, it could reduce the size of HTTP header traffic by 500:1 to 1000:1 and reduce network latency by 3:1. That translates to some serious performance improvements, especially for applications requiring fast real-time updates.
Before the client and the server start sending and receiving messages, they need to establish a connection. This is done through a ‘handshake’, where the client sends out a request to connect and the server, if it is willing, sends back a response accepting the connection. The protocol specification makes it clear that one of the design decisions behind this protocol was to ensure that both HTTP-based clients and WebSocket-based ones can operate on the same port. This is why the handshake has the client and server ‘upgrade’ from an HTTP-based protocol to a WebSocket-based protocol.
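The protocol spec has an example of such a handshake. The initiating handshake from the client is an HTTP GET request that asks to upgrade the connection (Upgrade: websocket, Connection: Upgrade) and carries a Sec-WebSocket-Key header, and the responding handshake from the server is a 101 Switching Protocols response that carries a Sec-WebSocket-Accept header.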
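For reference, here is the example handshake from RFC 6455 (section 1.2) written out as JavaScript arrays of header lines. The `toWireFormat` helper is a hypothetical name used only for this sketch; it shows how the lines would be laid out on the wire.

```javascript
// Client's opening handshake, as given in the RFC 6455 example.
const clientHandshake = [
  "GET /chat HTTP/1.1",
  "Host: server.example.com",
  "Upgrade: websocket",
  "Connection: Upgrade",
  "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==",
  "Origin: http://example.com",
  "Sec-WebSocket-Protocol: chat, superchat",
  "Sec-WebSocket-Version: 13",
];

// Server's response, as given in the same RFC example.
const serverHandshake = [
  "HTTP/1.1 101 Switching Protocols",
  "Upgrade: websocket",
  "Connection: Upgrade",
  "Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=",
  "Sec-WebSocket-Protocol: chat",
];

// Hypothetical helper: each header line ends with CRLF, and an empty
// line marks the end of the header block.
function toWireFormat(lines) {
  return lines.join("\r\n") + "\r\n\r\n";
}
```

Note that both messages are ordinary HTTP, which is what lets WebSocket and HTTP traffic share a port.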
Here the client sends a base64-encoded key in the Sec-WebSocket-Key header. To form a response, the server takes this key, appends the magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 to it, and calculates the SHA-1 hash of the resulting string. It then encodes that hash value in base64, and the result becomes the Sec-WebSocket-Accept header in the server’s response.
In the above example,
An important thing to note is the Origin header. The client-side handshake from a browser will always include this header, and it is then up to the server whether it wants to accept clients from different origins or not.
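The key derivation and the origin check described above can be sketched on the server side as follows. This assumes a Node.js server (the article does not prescribe one); `computeAcceptValue` and `isOriginAllowed` are hypothetical helper names, and the allow-list policy is just one possible way to handle origins.

```javascript
const { createHash } = require("crypto");

// The fixed "magic string" (GUID) defined by RFC 6455.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key:
// append the GUID, take the SHA-1 hash, and base64-encode the digest.
function computeAcceptValue(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Hypothetical policy: accept only origins that appear in an allow-list.
function isOriginAllowed(origin, allowedOrigins) {
  return allowedOrigins.includes(origin);
}

// With the sample key from the RFC 6455 example handshake:
console.log(computeAcceptValue("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```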
The first thing all developers should do when working with the WebSockets API is to detect whether or not the client browser supports them. If so, we can work our magic with them. If not, we’ll have to fall back to another method of client-server communication, such as long-polling mentioned above.
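A minimal sketch of that detection step. The function names are my own, and `globalScope` is a parameter (rather than a hard-coded `window`) only so the snippet can also run outside a browser.

```javascript
// Returns true when the given global object exposes a WebSocket constructor.
// In a browser you would call supportsWebSockets(window).
function supportsWebSockets(globalScope) {
  return "WebSocket" in globalScope;
}

// Hypothetical helper: decide which transport the application should use.
function chooseTransport(globalScope) {
  return supportsWebSockets(globalScope) ? "websocket" : "long-polling";
}
```

In a page, `chooseTransport(window)` would then tell the rest of the application which communication method to set up.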
Assuming that WebSockets are supported by the browser, the first task will be to connect to a WebSocket server by calling the WebSocket constructor with the server’s URL.
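Here is a small sketch of opening a connection. The constructor is passed in as `WebSocketImpl` purely so the snippet can be exercised outside a browser; in a page you would pass the built-in `WebSocket`. The URL in the usage comment is a placeholder.

```javascript
// Open a connection to the given ws:// or wss:// URL.
function openConnection(WebSocketImpl, url) {
  return new WebSocketImpl(url);
}

// Browser usage (placeholder URL):
// const connection = openConnection(WebSocket, "ws://example.com/chat");
```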
You could also use wss://, which is the secure socket variant to ws:// in the same way https is to http.
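One way to pick between the two schemes is to mirror the protocol the page itself was loaded over. This is only a sketch; `socketUrlFor` is a hypothetical helper, and the host and path in the usage comment are placeholders.

```javascript
// Build a WebSocket URL whose scheme matches the page's:
// wss:// for pages served over https, ws:// otherwise.
function socketUrlFor(pageProtocol, host, path) {
  const scheme = pageProtocol === "https:" ? "wss:" : "ws:";
  return scheme + "//" + host + path;
}

// Browser usage:
// const url = socketUrlFor(window.location.protocol, "example.com", "/chat");
```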
You could also specify sub-protocols of your own by passing them as a second argument to the constructor.
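A sketch of requesting sub-protocols. The protocol names in the usage comment are made up for illustration, and `WebSocketImpl` is again injected so the snippet runs outside a browser.

```javascript
// Open a connection and offer one or more sub-protocols to the server.
// `protocols` may be a single string or an array of strings.
function openWithProtocols(WebSocketImpl, url, protocols) {
  return new WebSocketImpl(url, protocols);
}

// Browser usage (placeholder URL and protocol names):
// const connection = openWithProtocols(WebSocket, "ws://example.com/chat", ["chat", "superchat"]);
// Once the connection is open, connection.protocol holds the one the server picked.
```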
If your connection is accepted and established by the server, then an onopen event is fired on the client’s side. You can handle it by assigning a function to the connection’s onopen property.
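A minimal sketch of an onopen handler. `logWhenOpen` and its `log` callback are hypothetical; in a page you might simply pass `console.log`.

```javascript
// Attach an onopen handler that reports when the connection is ready.
function logWhenOpen(connection, log) {
  connection.onopen = function (event) {
    log("Connection open");
  };
}

// Browser usage:
// logWhenOpen(connection, console.log);
```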
If the connection is refused by the server, or is closed for some other reason, then the onclose event is fired.
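A sketch of an onclose handler. The close event carries `wasClean` and `code` properties in the WebSockets API; the helper names and the message wording are my own.

```javascript
// Turn a close event into a human-readable description.
function describeClose(event) {
  const kind = event.wasClean ? "cleanly" : "unexpectedly";
  return "Connection closed " + kind + " (code " + event.code + ")";
}

// Attach an onclose handler that reports how the connection ended.
function logWhenClosed(connection, log) {
  connection.onclose = function (event) {
    log(describeClose(event));
  };
}
```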
You can also explicitly close the connection yourself by calling the close() method.
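A small sketch of closing a connection from the client. Checking `readyState` first is my own addition, not something the API requires; the value 1 corresponds to the OPEN state (`WebSocket.OPEN` in browsers).

```javascript
// Close the connection only if it is currently open.
// Returns true when close() was actually called.
function closeIfOpen(connection) {
  const OPEN = 1; // same value as WebSocket.OPEN
  if (connection.readyState === OPEN) {
    connection.close();
    return true;
  }
  return false;
}
```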
In case of any errors, you can handle them using the onerror event.
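A sketch of an onerror handler that simply counts errors. `trackErrors` is a hypothetical helper; what you do in response to an error (retry, fall back, notify the user) depends on your application.

```javascript
// Attach an onerror handler and return an object whose errorCount
// is incremented each time the connection reports an error.
function trackErrors(connection) {
  const state = { errorCount: 0 };
  connection.onerror = function (event) {
    state.errorCount += 1;
  };
  return state;
}
```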
Once we’ve successfully opened a connection to the server, we need to send messages to and receive messages from the server. Sending messages is very straightforward. We use the .send() method on our connection object.
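A minimal sketch of sending a text message. The `readyState` guard is my own addition: calling send() before the connection has finished opening throws an error, so the helper skips the call when the socket is not open.

```javascript
// Send a text message if the connection is open.
// Returns true when the message was handed to send().
function sendText(connection, text) {
  const OPEN = 1; // same value as WebSocket.OPEN
  if (connection.readyState !== OPEN) {
    return false;
  }
  connection.send(text);
  return true;
}

// Browser usage:
// sendText(connection, "Hello, server!");
```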
Should the client receive a message from the server, it raises the onmessage event for you to handle.
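A sketch of an onmessage handler. The payload arrives in the event's `data` property; `collectMessages` is a hypothetical helper that just stores each payload in an array.

```javascript
// Attach an onmessage handler that records every payload received.
function collectMessages(connection) {
  const received = [];
  connection.onmessage = function (event) {
    received.push(event.data);
  };
  return received;
}
```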
If you want to send JSON objects to the server rather than a simple message, they should be serialized to a string first, for example with JSON.stringify().
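A sketch of sending and receiving JSON. The helper names and the shape of the example message are made up for illustration.

```javascript
// Serialize a value to a JSON string and send it.
function sendJson(connection, value) {
  connection.send(JSON.stringify(value));
}

// Parse the JSON payload of an incoming message event.
function parseJsonMessage(event) {
  return JSON.parse(event.data);
}

// Browser usage (hypothetical message shape):
// sendJson(connection, { type: "chat", text: "hi" });
```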
N.B. The WebSockets specification states that messages can be sent as binary data, using either Blob or ArrayBuffer objects, as well as strings. However, not all browsers currently (as of May 2012) support this.
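Where binary messages are supported, sending them might look like the sketch below. `sendBytes` is a hypothetical helper; setting `binaryType` only affects how incoming binary messages are delivered (as ArrayBuffer rather than Blob) and is included here for completeness.

```javascript
// Send an array of byte values as a binary message.
// Returns the number of bytes handed to send().
function sendBytes(connection, bytes) {
  // Ask for incoming binary messages to arrive as ArrayBuffer objects.
  connection.binaryType = "arraybuffer";
  const payload = new Uint8Array(bytes);
  connection.send(payload.buffer);
  return payload.byteLength;
}
```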
Most web servers revolve solely around the HTTP protocol. As WebSockets use their own protocol, you may need to install additional libraries and add-ons to support ws:// or the wss:// protocols in addition to http:// and https://.
The latest version of the WebSocket Protocol (RFC 6455) is currently supported by only a couple of the major browsers (Chrome and Opera). While we wait for the other browsers to catch up, however, there are several ways to roll out cross-browser WebSocket-based applications right now.
Another way to go would be cloud-hosted API services like Pusher or BeaconPush. Instead of rolling out your own WebSocket server, you could use one of these services to run a WebSocket server for you, and interact with it on the client side through the API they provide. Generally they provide a Flash fallback (which simulates WebSockets) in case the browser does not support WebSockets.
WebSockets provide a really simple way to do fast, robust and very efficient communication between the client and the server, removing some of the problems we face with the HTTP protocol. This technology is especially suited to applications where a large amount of data is generated rapidly and needs to be communicated quickly. One very good use is HTML5-based multiplayer online games (especially ones that require quick response times, like first-person shooters). Other possible uses on the web include real-time breaking-news updates, fast-updating streams on social media, sports scores and online chat applications.