Table of Contents

  1. Introduction
  2. Background
  3. WebSocket In Essence
  4. Experimental Demos
  5. Browser/Server Support
  6. WebSocket JavaScript API
  7. Develop WebSocket In Action - Team Poker Demo
  8. Open Issues
  9. Conclusion
  10. References & Resources 

Introduction

HTML5 WebSocket defines a bi-directional, full-duplex communication channel that operates through a single TCP socket over the Web. It provides an efficient, low-latency and low-cost connection between web client and server, and on top of it developers can build scalable real-time web applications. The section below is the official definition of WebSocket, copied from the IETF WebSocket protocol page:

The WebSocket protocol enables two-way communication between a user agent running untrusted code running in a controlled environment to a remote host that has opted-in to communications from that code.  The security model used for this is the Origin-based security model commonly used by Web browsers.  The protocol consists of an initial handshake followed by basic message framing, layered over TCP.  The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers that does not rely on opening multiple HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long polling).

This article goes through the basic concept of WebSocket and the problems it sets out to solve, explains the protocol in essence, looks at some experimental demos, develops a simple WebSocket application in action (Team Poker), and describes the current open issues of WebSocket. I sincerely hope it is systematic and easy to understand, going from the surface to the depths, so that readers not only learn what WebSocket is at a high level but also understand it in depth! Any thoughts, suggestions or criticism you may have after reading this article will help me improve in the future; I would appreciate it if you could leave a comment.

Background

In traditional web applications, to achieve some real-time interaction with the server, developers had to employ tricky techniques such as Ajax polling and Comet (a.k.a. Ajax push, Full Duplex Ajax, HTTP Streaming, etc.). Those technologies either periodically fire HTTP requests to the server or hold an HTTP connection open for a long time, which "contain lots of additional, unnecessary header data and introduce latency" and resulted in "an outrageously high price tag". websocket.org explained the problems exhaustively and compared the performance of Ajax polling and WebSocket in detail. They built two simple web pages: one periodically communicated with the server using traditional HTTP and the other used HTML5 WebSocket. In their test each HTTP request/response header was approximately 871 bytes, while the WebSocket overhead was much smaller: 2 bytes per message once the connection was established. As the number of clients grows, the result will be:

Traditional HTTP Request 

  • Use case A: 1,000 clients polling every second: Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)

  • Use case B: 10,000 clients polling every second: Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)

  • Use case C: 100,000 clients polling every 1 second: Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)

HTML5 WebSocket

  • Use case A: 1,000 clients receive 1 message per second: Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)

  • Use case B: 10,000 clients receive 1 message per second: Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)

  • Use case C: 100,000 clients receive 1 message per second: Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
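
The arithmetic behind both lists is easy to reproduce. The sketch below is only an illustration: the function name throughput is hypothetical, and it assumes a "megabit" of 1,048,576 bits (2^20), which is the divisor that appears to match the rounded Mbps figures quoted from websocket.org.

```javascript
// Bits per "megabit" as the quoted figures appear to use it (2^20).
// This divisor is an assumption inferred from the rounded numbers.
const BITS_PER_MEGABIT = 1024 * 1024;

// Per-second network load when every client exchanges one message per second.
function throughput(bytesPerMessage, clients) {
    const bytesPerSecond = bytesPerMessage * clients;
    const bitsPerSecond = bytesPerSecond * 8;
    return {
        bytesPerSecond: bytesPerSecond,
        bitsPerSecond: bitsPerSecond,
        mbps: bitsPerSecond / BITS_PER_MEGABIT
    };
}

[1000, 10000, 100000].forEach(function (clients) {
    const polling = throughput(871, clients); // 871 bytes of HTTP headers
    const webSocket = throughput(2, clients); // 2 bytes of WebSocket framing
    console.log(clients + ' clients: polling ' + polling.mbps.toFixed(1) +
        ' Mbps, WebSocket ' + webSocket.mbps.toFixed(3) + ' Mbps');
});
```

Whatever the number of clients, the ratio between the two stays at 871:2, which is why the gap grows so visibly in absolute terms.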

Finally a more readable diagram: 

poll-ws-compare.gif

 "HTML5 Web Sockets can provide a 500:1 or — depending on the size of the HTTP headers — even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency".  --WebSocket.org

WebSocket In Essence

The motivation for creating WebSocket is to replace polling and long polling (Comet), and to give HTML5 web applications the ability to communicate in real time. A browser-based web application can open a WebSocket connection through the JavaScript API and then exchange data with the server over a single TCP connection.

This is achieved by a new protocol, the WebSocket Protocol, which is essentially an independent TCP-based protocol. To establish a WebSocket connection, the client/browser sends an HTTP request with an "Upgrade: WebSocket" header, which indicates a protocol upgrade request. The HTTP server interprets the handshake key(s) and returns a handshake response (the detailed handshake mechanism is described below). Afterwards the connection is established (figuratively speaking, the 'sockets' have been plugged in at both the client and server ends): both sides can send and receive data simultaneously with no more redundant header information, and the connection is not closed until one side sends a close signal. That is why WebSocket is bidirectional and full duplex.

Now it is time to delve into the protocol. Let's start with draft-hixie-thewebsocketprotocol-76, the version currently supported by browsers (Chrome 6+, Firefox 4+, Opera 11) and many WebSocket servers (please refer to the Browser/Server Support section below for details). A typical WebSocket request/response example is shown below:

Request
GET /demo HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: example.com
Origin: http://example.com
Sec-WebSocket-Key1: 4 @1 46546xW%0l 1 5
Sec-WebSocket-Key2: 12998 5 Y3 1 .P00

^n:ds[4U 

Response
HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/demo
Sec-WebSocket-Protocol: sample

8jKS'y:G*Co,Wxa- 

The entire process can be described as follows: the client raises a "special" HTTP request which asks to "Upgrade" the connection protocol to "WebSocket", on domain "example.com" with path "/demo". The request carries three "handshake" fields: Sec-WebSocket-Key1, Sec-WebSocket-Key2 and the 8 bytes ("^n:ds[4U") that follow the header fields. These are random tokens which the WebSocket server uses to construct a 16-byte security hash at the end of its handshake, proving that it has read the client's handshake.
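
How the 16-byte hash is built is not spelled out in the text above. As described in draft-hixie-thewebsocketprotocol-76, each key header is reduced to a 32-bit number (its digits concatenated, divided by the number of spaces it contains), the two numbers are written big-endian in front of the 8 trailing bytes, and the MD5 digest of those 16 bytes becomes the response body. The following is a minimal Node.js sketch of that procedure; the function names and the sample keys are hypothetical and were made up for illustration, they are not the keys from the example above.

```javascript
const crypto = require('crypto');

// Reduce one Sec-WebSocket-Key1/Key2 header value to its 32-bit number:
// concatenate the digits, then divide by the number of spaces in the value.
function keyNumber(key) {
    const digits = key.replace(/[^0-9]/g, '');
    const spaces = key.split(' ').length - 1;
    if (digits === '' || spaces === 0) {
        throw new Error('Malformed handshake key');
    }
    const number = parseInt(digits, 10) / spaces;
    if (!Number.isInteger(number)) {
        throw new Error('Key digits are not a multiple of the space count');
    }
    return number;
}

// Build the 16-byte response body: MD5 over both key numbers
// (4 bytes each, big-endian) followed by the 8 bytes sent after the headers.
function challengeResponse(key1, key2, key3) {
    if (key3.length !== 8) {
        throw new Error('The trailing token must be exactly 8 bytes');
    }
    const challenge = Buffer.alloc(16);
    challenge.writeUInt32BE(keyNumber(key1), 0);
    challenge.writeUInt32BE(keyNumber(key2), 4);
    key3.copy(challenge, 8);
    return crypto.createHash('md5').update(challenge).digest();
}

// Made-up sample values, purely for illustration.
const sampleResponse = challengeResponse('18x 6]8', '1P2 4 6', Buffer.from('Tm[K T2u', 'latin1'));
console.log(sampleResponse.length + ' bytes: ' + sampleResponse.toString('hex'));
```

Because the spaces inside a key take part in the calculation, a server has to work on the header value exactly as received; collapsing or trimming whitespace changes the result.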

The WebSocket protocol is NOT finalized and is still being improved and standardized by the IETF Hypertext Bidirectional (HyBi) Working Group. At the time I wrote this article, the latest version was "draft-ietf-hybi-thewebsocketprotocol-08", last updated by Ian Fette on June 7, 2011. Compared to version 76, both the request and response headers have changed, and so has the handshake mechanism: Sec-WebSocket-Key1 and Sec-WebSocket-Key2 are replaced with a single Sec-WebSocket-Key.

WebSocket request/response in the latest draft-ietf-hybi-thewebsocketprotocol-08:

Request
GET /demo HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: V2ViU29ja2V0IHJvY2tzIQ==
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 8

Response
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: VAuGgaNDB/reVQpGfDF8KXeZx5o=
Sec-WebSocket-Protocol: chat

The Sec-WebSocket-Key is a base64-encoded, randomly generated 16-byte value; in the case above it encodes "WebSocket rocks!". The server reads the key, concatenates it with the magic GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to get "V2ViU29ja2V0IHJvY2tzIQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11", computes the SHA-1 hash of that string ("540b8681a34307fade550a467c317c297799c79a" in hex), and finally base64-encodes the hash and returns the value in the "Sec-WebSocket-Accept" header.

I've written C# code below to demonstrate how to compute the Sec-WebSocket-Accept conforming to draft-ietf-hybi-thewebsocketprotocol-08:  

 // Requires: using System; using System.Security.Cryptography; using System.Text;
 public static String ComputeWebSocketHandshakeSecurityHash08(String secWebSocketKey)
 {
     const String MagicKEY = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

     // 1. Combine the request Sec-WebSocket-Key with the magic key.
     String ret = secWebSocketKey + MagicKEY;

     // 2. Compute the SHA-1 hash of the combined string.
     using (SHA1 sha = SHA1.Create())
     {
         byte[] sha1Hash = sha.ComputeHash(Encoding.UTF8.GetBytes(ret));

         // 3. Base64-encode the hash.
         return Convert.ToBase64String(sha1Hash);
     }
 }

Unit test code:

String secWebSocketKey = Convert.ToBase64String(Encoding.UTF8.GetBytes("WebSocket rocks!"));
Console.WriteLine("Sec-WebSocket-Key: {0}", secWebSocketKey);

String secWebSocketAccept = ComputeWebSocketHandshakeSecurityHash08(secWebSocketKey);
Console.WriteLine("Sec-WebSocket-Accept: " + secWebSocketAccept); 

Running the code above prints:

Sec-WebSocket-Key: V2ViU29ja2V0IHJvY2tzIQ==
Sec-WebSocket-Accept: VAuGgaNDB/reVQpGfDF8KXeZx5o=

Please note that draft-ietf-hybi-thewebsocketprotocol-08 is NOT supported by any common web browser, since it was only published on June 7, 2011. However, it is worth learning so that we know what is happening with WebSocket now and where it is going in the future.
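
Since the demo server later in this article runs on Node.js, the same Sec-WebSocket-Accept computation can also be sketched in JavaScript. This is an illustrative equivalent of the C# method rather than code from the Team Poker project; the function name computeAccept is hypothetical, and the sample key used here is the one from the protocol specification's own handshake example.

```javascript
const crypto = require('crypto');

// The fixed GUID that the protocol appends to every client key.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Concatenate key and GUID, hash with SHA-1, then base64-encode the digest.
function computeAccept(secWebSocketKey) {
    return crypto.createHash('sha1')
        .update(secWebSocketKey + WEBSOCKET_GUID)
        .digest('base64');
}

// Key from the specification's example handshake;
// the specification lists s3pPLMBiTxaQ9kYGzzhZRbK+xOo= as the matching accept value.
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
```

Note that the key is hashed as the base64 text it arrived as; it is never decoded back to its 16 raw bytes.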

Experimental Demos

So far there are already many experimental WebSocket demos built on either draft-hixie-thewebsocketprotocol-75 or draft-hixie-thewebsocketprotocol-76.

http://rumpetroll.com/ 
Play as a tadpole on a canvas and chat with other tadpoles; tadpole locations and chat messages are seen by everyone in real time.

http://html5labs.interoperabilitybridges.com/prototypes/websockets/websockets/info 
WebSocket Demos in Microsoft HTML5 Labs

http://html5demos.com/web-socket
A very simple "Chat room" demo using WebSocket.

http://kaazing.me/
Refreshes stock, weather, news and tweets in real time, powered by Kaazing's real-time message delivery network solution.

Mr. Doob's Multiuser Sketchpad
In this multi-user <canvas> drawing application, Web Sockets are used to pass along the coordinates of the lines that other users draw back to each client as they happen.

And so on. You will also see a simple demo I developed below.

Browser/Server Support

WebSocket is not only designed for browser/server communication; client applications can also use it. However, I expect the browser to remain the major platform for the WebSocket protocol, taking into account the emerging handsets and the coming cloud. At the time I wrote this article, WebSocket protocol version draft-hixie-thewebsocketprotocol-76 was supported by Safari 5+, Google Chrome 6+, Mozilla Firefox 4+ (disabled by default) and Opera 11 (disabled by default). IE 9/10 does not support it, but on unsupported browsers we can use a Flash shim/fallback by adopting web-socket-js.
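
Because support varies this much between browsers, a page usually checks for the native API before relying on it. The following is a minimal sketch of such a check; the helper name getNativeWebSocket is hypothetical, and what to do when it returns null (for example loading a Flash-based shim) is left to the application.

```javascript
// Return the WebSocket constructor exposed by the given global object,
// or null when the environment has no native support.
function getNativeWebSocket(globalObject) {
    if (globalObject && typeof globalObject.WebSocket !== 'undefined') {
        return globalObject.WebSocket;
    }
    return null;
}

// In a browser, pass `window`; a null result means a fallback is needed.
const browserGlobal = (typeof window !== 'undefined') ? window : {};
if (getNativeWebSocket(browserGlobal) === null) {
    console.log('No native WebSocket support, a fallback is required.');
} else {
    console.log('Native WebSocket support detected.');
}
```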

The awesome "Can I use" site tracks support for new HTML5 features across all popular browsers; the screenshot below shows WebSocket support:

WebSocketBrowsersupport.png

There are also a number of WebSocket servers available:

http://socket.io - Developed by Guillermo Rauch; provides seamless support for a variety of transports (WebSocket, WebSocket over Flash, XHR polling, JSONP polling, etc.) intended for real-time communication.

node.ws.js - A simple WebSocket server (supports both draft-hixie-thewebsocketprotocol-75 and draft-hixie-thewebsocketprotocol-76) based on node.websocket.js.

web-socket-js - HTML5 Web Socket implementation powered by Flash

http://nugget.codeplex.com - A web socket server implemented in C#.

jWebSocket.org - The Open Source Java WebSocket Server

phpwebsocket - PHP version of WebSocket server.

WebSocket JavaScript API

W3C defined WebSocket interface as below:

[Constructor(in DOMString url, in optional DOMString protocols)]
[Constructor(in DOMString url, in optional DOMString[] protocols)]
interface WebSocket {
readonly attribute DOMString url;

// ready state
const unsigned short CONNECTING = 0;
const unsigned short OPEN = 1;
const unsigned short CLOSING = 2;
const unsigned short CLOSED = 3;
readonly attribute unsigned short readyState;
readonly attribute unsigned long bufferedAmount;

// networking
attribute Function onopen;
attribute Function onmessage;
attribute Function onerror;
attribute Function onclose;
readonly attribute DOMString protocol;
void send(in DOMString data);
void close();
};
WebSocket implements EventTarget; 

The url attribute is the WebSocket server URL, whose scheme is usually "ws" (for unencrypted plain-text WebSocket) or "wss" (for secure WebSocket). The send method sends data to the server once connected, and close sends the close signal. Besides these there are four important events: onopen, onmessage, onerror and onclose. I borrowed a nice picture from nettuts:

events.jpg

  • onopen: Fired when a socket has opened, i.e. after the TCP three-way handshake and the WebSocket handshake.
  • onmessage: Fired when a message has been received from the WebSocket server.
  • onerror: Fired when an error has occurred.
  • onclose: Fired when a socket has been closed.

The JavaScript code below sets up a WebSocket connection and receives data:

var wsUrl = 'ws://localhost:8888/DummyPath';
var websocket = new WebSocket(wsUrl);
websocket.onopen = function (evt) { onOpen(evt) };
websocket.onclose = function (evt) { onClose(evt) };
websocket.onmessage = function (evt) { onMessage(evt) };
websocket.onerror = function (evt) { onError(evt) };

function onOpen(evt) {
    console.log("Connected to WebSocket server.");

    websocket.send("HTML5 WebSocket rocks!");
}
function onClose(evt) { console.log("Disconnected"); }
function onMessage(evt) {
    console.log('Retrieved data from server: ' + evt.data);

    // Update UI...
}
// The error event carries no data payload, so just report that it happened.
function onError(evt) { console.log('Error occurred.'); }
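
The example above sends its first message from inside onOpen for a reason: calling send() while the socket is still connecting raises an error. Code that sends from elsewhere can guard on readyState first. A minimal sketch follows; the helper name sendIfOpen is hypothetical, and a plain stand-in object plays the role of the socket so the snippet runs outside a browser.

```javascript
// readyState value for an open connection, per the interface definition above.
const READY_STATE_OPEN = 1;

// Send only when the socket is open; report whether the message went out.
function sendIfOpen(socket, message) {
    if (socket.readyState !== READY_STATE_OPEN) {
        return false;
    }
    socket.send(message);
    return true;
}

// Stand-in object for demonstration; in a browser this would be a real WebSocket.
const demoSocket = {
    readyState: 0,
    sent: [],
    send: function (message) { this.sent.push(message); }
};
console.log(sendIfOpen(demoSocket, 'too early')); // false: still CONNECTING
demoSocket.readyState = READY_STATE_OPEN;
console.log(sendIfOpen(demoSocket, 'hello'));     // true: handed to send()
```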

Develop WebSocket In Action - Team Poker Demo 

Estimating user story effort with Planning Poker cards is well known and widely used in Agile/Scrum development. The Program Manager/Scrum Master prepares user stories beforehand, holds a meeting with stakeholders and has them play a card representing their estimate for each story. The higher the card value, the harder the story is to implement; the lower the value, the easier. "0" indicates "no effort" or "has been done", and "?" indicates "mission impossible" or "unclear requirement".

Actually there is a website, http://pokerplanning.com, which does exactly the work described above. My co-workers and I used it several times; however, we found it getting slower and slower as more team members joined the game or after several rounds of voting. We even experienced the worst case: no one could vote anymore because everyone's voting page got stuck. I strongly suspected the major reason was its Ajax polling strategy, used to ensure everyone gets real-time voting status. Tracking its network activity suggested I was right.

Ajax polling in http://pokerplanning.com:
AJaxPolling.png

I believe HTML5 WebSocket will solve the problem! So I developed a simple demo (I named it Team Poker), which currently has only the limited, basic functionality described below:

  1. A user can log in to the poker room after entering his/her nickname.
  2. Everyone gets notified when a user plays a card.
  3. Everyone gets notified when a new player joins.
  4. A newly joined player can see the current participants and voting status.

The login screenshot for requirement #1:

TeamPoker-Login.png

Status update screenshot showing new participant(s) and newly played card(s), for requirements #2 and #3:

TeamPoker-Vote.png

Newly joined participant(s) can see the current game status, requirement #4:

TeamPoker-VoteFinished.png

Please note that the Team Poker demo concentrates on demonstrating the power of WebSocket. In the real world, a player should not see the cards played by other players; the demo also lacks features such as a moderator customizing user stories and storing game status on the server side, and there is no fancy UI/animation. However, I've shared all the source code at the beginning of this article; in addition, I've uploaded the source code to GitHub: https://github.com/WayneYe/TeamPoker. I hope some people will make it better and production-ready. Will you fork it with me, dear reader? :)

OK, now it is coding time. Since all clients need to be notified about other clients' changes (a new player joining or a new card played) and, in addition, a newly joined player needs to know the current status, I defined two communication contracts:

  • ClientMessage represents a message sent from a client. It contains a Type property, whose value comes from the MessageType enumeration (NewParticipaint or NewVoteInfo), and a Data property that stores the payload.
  • ServerStatus stores the current players as well as the current voting status, a hashtable of [{PlayerName}, {VoteValue}]; it is broadcast to all clients whenever a new client message is received.

var TeamPoker = TeamPoker || function () { };

TeamPoker.VoteInfo = function (playerName, voteValue) {
    this.PlayerName = playerName;
    this.VoteValue = voteValue;
}
TeamPoker.MessageType = {
    NewParticipaint: 'NewParticipaint',
    NewVoteInfo: 'NewVoteInfo'
};
TeamPoker.ClientMessage = function (type, data) {
    this.Type = type;
    this.Data = data;
};
TeamPoker.ServerStatus = function () {
    this.Players = [];
    this.VoteStatus = [];
};
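
To make the contracts concrete, here is a sketch of what a vote looks like on the wire once it has been serialized with JSON.stringify. The constructors repeat the shapes defined above as standalone functions so the snippet runs on its own; the player name and vote value are made-up sample data.

```javascript
// Standalone copies of the contract shapes defined above.
function VoteInfo(playerName, voteValue) {
    this.PlayerName = playerName;
    this.VoteValue = voteValue;
}
function ClientMessage(type, data) {
    this.Type = type;
    this.Data = data;
}

// A client announcing a vote (sample data).
const voteMessage = new ClientMessage('NewVoteInfo', new VoteInfo('Alice', 5));
const wireText = JSON.stringify(voteMessage);
console.log(wireText);
// {"Type":"NewVoteInfo","Data":{"PlayerName":"Alice","VoteValue":5}}

// The receiving side gets plain objects back, not VoteInfo instances.
const parsedMessage = JSON.parse(wireText);
console.log(parsedMessage.Type + ' from ' + parsedMessage.Data.PlayerName);
```

The Type field is what lets the receiver decide how to interpret Data, since a join message carries a plain nickname string while a vote carries an object.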

On the client side, a WebSocket connection is established after the user clicks the "Login" button, and the nickname is sent to the WebSocket server running on Node.js. The core client code is shown below:

 TeamPoker.connectToWsServer = function () {
     // Init Web Socket connect
     var WSURI = "ws://192.168.1.6:8888";
     TeamPoker.WsClient = new WebSocket(WSURI);
 
     TeamPoker.WsClient.onopen = function (evt) {
         console.log('Successfully connected to WebSocket server.');
         TeamPoker.joinGame();
     };
     TeamPoker.WsClient.onclose = function (evt) {
         console.log('Connection closed.');
     };
     TeamPoker.WsClient.onmessage = function (evt) {
         console.log('Received msg from server: ' + evt.data);
         TeamPoker.updateGameStatus(evt.data);
     };
     TeamPoker.WsClient.onerror = function (evt) {
         // The error event carries no data payload.
         console.log('An error occurred.');
     };
 };
 
 TeamPoker.joinGame = function () {
     var joinGameMsg = new TeamPoker.ClientMessage(TeamPoker.MessageType.NewParticipaint, TeamPoker.CurrentPlayerName);
 
     TeamPoker.WsClient.send(JSON.stringify(joinGameMsg));
 }
 TeamPoker.updateGameStatus = function (data) {
     // Format/fill the data from server side to HTML
 } 
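
The body of updateGameStatus is left as a stub above. One possible way to fill it is sketched here: a pure helper that turns the server's JSON into one display line per player, which the real function could then write into the page. The helper name formatGameStatus, the output wording and the sample data are all hypothetical; only the Players and VoteStatus fields come from the ServerStatus contract.

```javascript
// Turn the server's JSON status into one display line per player.
function formatGameStatus(data) {
    const status = JSON.parse(data);

    // Index votes by player name; a later vote replaces an earlier one.
    const votes = new Map();
    (status.VoteStatus || []).forEach(function (vote) {
        votes.set(vote.PlayerName, vote.VoteValue);
    });

    return (status.Players || []).map(function (player) {
        return player + ': ' + (votes.has(player) ? votes.get(player) : 'not voted yet');
    });
}

// Sample status as the server would broadcast it (made-up data).
const sampleStatus = JSON.stringify({
    Players: ['Alice', 'Bob'],
    VoteStatus: [{ PlayerName: 'Alice', VoteValue: 8 }]
});
console.log(formatGameStatus(sampleStatus).join('\n'));
```

Keeping the formatting separate from the DOM update makes the logic easy to exercise without a browser.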

On the server side, one important task is to keep track of all active client WebSocket connections so that the server can "broadcast" messages to every client, and to remove closed connections to avoid sending messages to "dead" clients. Other than that the logic is very simple: validate the message type sent from the client, update the players/vote status repository and then broadcast to all clients:

 /*
 WebSocket server based on
 https://github.com/ncr/node.ws.js
 Written By Wayne Ye 6/4/2011
 http://wayneye.com
 */
 
 var sys = require("sys"),
     ws = require("./ws");
 
 var clients = [], players = [], voteStatus = [];
 
 ws.createServer(function (websocket) {
     websocket.addListener("connect", function (resource) {
         // emitted after handshake
         sys.debug("Client connected on path: " + resource);
 
         // # Add to our list of clients
         clients.push(websocket); 
     }).addListener("data", function (data) {
         var clientMsg = JSON.parse(data);
 
         switch (clientMsg.Type) {
             case MessageType.NewParticipaint:
                 var newPlayer = clientMsg.Data;
                 sys.debug('New Participaint: ' + newPlayer);
                 players.push(newPlayer);
 
                 break;
             case MessageType.NewVoteInfo:
                 var newVoteInfo = clientMsg.Data;
                 sys.debug('New VoteInfo: ' + newVoteInfo.PlayerName + ' voted ' + newVoteInfo.VoteValue);
                 voteStatus.push(newVoteInfo);
                 break;
             default:
                 break;
         }
 
         // Notify all clients, including the one that just sent data
         var serverStatus = new ServerStatus();
         serverStatus.Players = players;
         serverStatus.VoteStatus = voteStatus;
 
         var srvMsgData = JSON.stringify(serverStatus);
 
         sys.debug('Broadcast server status to all clients: ' + srvMsgData);
         for (var i = 0; i < clients.length; i++)
             clients[i].write(srvMsgData);
     }).addListener("close", function () {
         // emitted when server or client closes connection
 
         for (var i = 0; i < clients.length; i++) {
             // # Remove from our connections list so we don't send
             // # to a dead socket
             if (clients[i] == websocket) {
                 sys.debug("close with client: " + websocket);
                 clients.splice(i, 1); // delete count of 1: remove only this client
                 break;
             }
         }
     });
 }).listen(8888);
 
 var MessageType = {
     NewParticipaint: 'NewParticipaint',
     NewVoteInfo: 'NewVoteInfo'
 };
 function ClientMessage(type, data) {
     this.Type = type;
     this.Data = data;
 };
 function VoteInfo(playerName, voteValue) {
     this.PlayerName = playerName;
     this.VoteValue = voteValue;
 }
 function ServerStatus() {
     this.Players = [];
     this.VoteStatus = [];
 };
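
The connection bookkeeping is the part of the server that is easiest to get wrong, so it is worth isolating. The sketch below shows the same add/remove/broadcast idea as a small standalone registry; the name ClientRegistry is hypothetical, and any object with a write(data) method can stand in for a socket. Note that Array.prototype.splice needs an explicit delete count of 1 here: called with only a start index, it removes every element from that index to the end of the array.

```javascript
// Minimal registry of live connections.
function ClientRegistry() {
    this.clients = [];
}
ClientRegistry.prototype.add = function (socket) {
    this.clients.push(socket);
};
ClientRegistry.prototype.remove = function (socket) {
    const index = this.clients.indexOf(socket);
    if (index !== -1) {
        this.clients.splice(index, 1); // remove exactly one entry
    }
};
ClientRegistry.prototype.broadcast = function (data) {
    this.clients.forEach(function (socket) {
        socket.write(data);
    });
};

// Stand-in sockets that simply record what they were sent.
function makeFakeSocket() {
    return { received: [], write: function (data) { this.received.push(data); } };
}

const registry = new ClientRegistry();
const first = makeFakeSocket(), second = makeFakeSocket(), third = makeFakeSocket();
[first, second, third].forEach(function (socket) { registry.add(socket); });

registry.remove(second);        // the second client disconnects
registry.broadcast('status');   // only the remaining clients are written to
console.log(first.received.length, second.received.length, third.received.length);
```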

The complete source code can be found on GitHub: https://github.com/WayneYe/TeamPoker.

After going through the code, let's see what happens underneath. The screenshot below was captured while I was developing the Team Poker WebSocket demo and records the entire WebSocket communication. In this picture, 192.168.1.2 is the host of the TeamPoker page, which fires the WebSocket request, and 192.168.1.6 is the Node.js-based WebSocket server, running on Ubuntu 11.04 and listening on port 8888.

All packets behind WebSocket connection:
WebSocketProtocol.png

WebSocket request & response headers:
WebSocketStream.png

So, see the power of WebSocket?

  1. Data transfer is done within the lifecycle of one TCP connection.
  2. No extra headers after the handshake. You might notice that the "length" column shows each packet's size: it is less than 100 bytes on average in my case and depends only on the size of the data actually transferred.

With Ajax polling or Comet, HTTP requests/responses carrying header information cannot achieve the same level of performance as WebSocket: both create new HTTP (TCP) connections to transfer data, and each exchange is considerably larger than a WebSocket message, especially when cookies are stored in the headers or when there are long headers such as "User-Agent", "If-Modified-Since", "If-Match", "X-Powered-By", etc.

One thing that deserves mention is the TCP keep-alive signals: we should consider closing the WebSocket connection as soon as we no longer need it, otherwise bandwidth will be wasted.

Open Issues

Adam Barth and his co-workers found a security vulnerability in WebSocket. He pointed out that many proxies do not recognize the HTTP "Upgrade" mechanism and treat WebSocket packets after the handshake as subsequent HTTP packets; as a result, attackers can poison the proxy's HTTP cache (you can refer to their exhaustive description). They suggest using a CONNECT-based handshake, since most proxies appear to understand the semantics of CONNECT requests better than the semantics of the Upgrade mechanism; after simulating a CONNECT-based handshake they found no way to poison the proxy's HTTP cache.

Because of this security issue, Firefox 4.0 and Opera 11 disabled WebSocket by default; we can enable it in about:config.

Conclusion

WebSocket is a revolutionary feature in HTML5. It defines a full-duplex communication channel that operates through a single socket over the Web, and real-time data transfer has never been so easy and efficient, with relatively low bandwidth and server cost compared to Ajax polling or Comet. It is not yet standardized and has the security issue mentioned in the section above, so at this time it is not recommended for enterprise solutions or data-sensitive applications; still, developers should learn it and watch it. The only thing that never changes is change: the WebSocket protocol draft version numbers change fast, as you might have noticed while reading this article. I hope it becomes normative and standardized soon!

Are you plugged? If you are, happy WebSocketing! 

References & Resources

HTML5 Web Sockets: A Quantum Leap in Scalability for the Web
http://websocket.org/quantum.html

Wikipedia: Web Socket
http://en.wikipedia.org/wiki/WebSockets 

The Web Socket Protocol
http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-07

The WebSocket API
http://www.w3.org/TR/websockets/

Web Applications 1.0 - Web sockets
http://www.whatwg.org/specs/web-apps/current-work/complete/network.html#network

Introducing Web Sockets
http://dev.opera.com/articles/view/introducing-web-sockets/

WebSockets - MDC Docs
https://developer.mozilla.org/en/WebSockets

Stackoverflow - What are good resources for learning HTML 5 WebSockets?
http://stackoverflow.com/questions/4262543/what-are-good-resources-for-learning-html-5-websockets

HTML Labs - WebSocket
http://html5labs.interoperabilitybridges.com/prototypes/websockets/websockets/info

Start using HTML5 WebSocket today
http://net.tutsplus.com/tutorials/javascript-ajax/start-using-html5-websockets-today/

HTML 5 Web Sockets vs. Comet and Ajax 
http://www.infoq.com/news/2008/12/websockets-vs-comet-ajax

Internet Socket
http://en.wikipedia.org/wiki/Internet_socket

Real-time web test – does html5 websockets work for you?
http://websocketstest.com/ 

Initially posted on Wayne's Geek Life: http://wayneye.com/Blog/HTML5-Web-Socket-In-Essence 
