|By Peter Lubbers, Frank Greco||
|March 11, 2010 10:15 AM EST||
Lately there has been a lot of buzz around HTML5 Web Sockets, which defines a full-duplex communication channel that operates through a single socket over the Web. HTML5 Web Sockets is not just another incremental enhancement to conventional HTTP communications; it represents a colossal advance, especially for real-time, event-driven web applications.
HTML5 Web Sockets provides such a dramatic improvement from the old, convoluted "hacks" that are used to simulate a full-duplex connection in a browser that it prompted Google's Ian Hickson - the HTML5 specification lead - to say:
"Reducing kilobytes of data to 2 bytes...and reducing latency from 150ms to 50ms is far more than marginal. In fact, these two factors alone are enough to make Web Sockets seriously interesting to Google."
Let's look at how HTML5 Web Sockets can offer such an incredibly dramatic reduction of unnecessary network traffic and latency by comparing it to conventional solutions.
Polling, Long-Polling, and Streaming - Headache 2.0
Normally when a browser visits a web page, an HTTP request is sent to the web server that hosts that page. The web server acknowledges this request and sends back the response. In many cases - for example, for stock prices, news reports, ticket sales, traffic patterns, medical device readings, and so on - the response could be stale by the time the browser renders the page. If you want to get the most up-to-date "real-time" information, you can constantly refresh that page manually, but that's obviously not a great solution.
With polling, the browser sends HTTP requests at regular intervals and immediately receives a response. This technique was the first attempt for the browser to deliver real-time information. Obviously, this is a good solution if the exact interval of message delivery is known, because you can synchronize the client request to occur only when information is available on the server. However, real-time data is often not that predictable, making unnecessary requests inevitable and, as a result, many connections are opened and closed needlessly in low-message-rate situations.
With long-polling, the browser sends a request to the server and the server keeps the request open for a set period. If a notification is received within that period, a response containing the message is sent to the client. If a notification is not received within the set time period, the server sends a response to terminate the open request. It is important to understand, however, that when you have a high message volume, long-polling does not provide any substantial performance improvements over traditional polling. In fact, it could be worse, because the long-polling might spin out of control into an unthrottled, continuous loop of immediate polls.
With streaming, the browser sends a complete request, but the server sends and maintains an open response that is continuously updated and kept open indefinitely (or for a set period of time). The response is then updated whenever a message is ready to be sent, but the server never signals to complete the response, thus keeping the connection open to deliver future messages. However, since streaming is still encapsulated in HTTP, intervening firewalls and proxy servers may choose to buffer the response, increasing the latency of the message delivery. Therefore, many streaming Comet solutions fall back to long-polling in case a buffering proxy server is detected. Alternatively, TLS (SSL) connections can be used to shield the response from being buffered, but in that case the setup and tear down of each connection taxes the available server resources more heavily.
Ultimately, all of these methods for providing real-time data involve HTTP request and response headers, which contain lots of additional, unnecessary header data and introduce latency. On top of that, full-duplex connectivity requires more than just the downstream connection from server to client. In an effort to simulate full-duplex communication over half-duplex HTTP, many of today's solutions use two connections: one for the downstream and one for the upstream. The maintenance and coordination of these two connections introduces significant overhead in terms of resource consumption and adds lots of complexity. Simply put, HTTP wasn't designed for real-time, full-duplex communication as you can see in Figure 1, which shows the complexities associated with building a Comet web application that displays real-time data from a back-end data source using a publish/subscribe model over half-duplex HTTP.
Figure 1: The complexity of Comet applications
It gets even worse when you try to scale out those Comet solutions to the masses. Simulating bi-directional browser communication over HTTP is error-prone and complex and all that complexity does not scale. Even though your end users might be enjoying something that looks like a real-time web application, this "real-time" experience has an outrageously high price tag. It's a price that you will pay in additional latency, unnecessary network traffic, and a drag on CPU performance.
HTML5 Web Sockets to the Rescue!
Defined in the Communications section of the HTML5 specification, HTML5 Web Sockets represent the next evolution of web communications - a full-duplex, bidirectional communications channel that operates through a single socket over the Web. HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications. In addition, since it provides a socket that is native to the browser, it eliminates many of the problems Comet solutions are prone to. Web Sockets remove the overhead and dramatically reduce complexity.
To establish a WebSocket connection, the client and server upgrade from the HTTP protocol to the WebSocket protocol during their initial handshake, as shown in the following example:
Example 1: The WebSocket handshake (browser request and server response)
GET /text HTTP/1.1\r\n
HTTP/1.1 101 WebSocket Protocol Handshake\r\n
Once established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent full-duplex, in either direction at the same time. The data is minimally framed with just two bytes. In the case of text frames, each frame starts with a 0x00 byte, ends with a 0xFF byte, and contains UTF-8 data in between. WebSocket text frames use a terminator, while binary frames use a length prefix.
The Showdown: Comet vs HTML5 Web Sockets
How dramatic is that reduction in unnecessary network traffic and latency? Let's compare a polling application with a WebSocket application.
For the polling example, I created a simple web application in which a web page requests real-time stock data from a RabbitMQ message broker using a traditional publish/subscribe model. It does this by polling a Java servlet that is hosted on a web server. The RabbitMQ message broker receives data from a fictitious stock price feed with continuously updating prices. The web page connects and subscribes to a specific stock channel (a topic on the message broker) and uses an XMLHttpRequest to poll for updates once per second. When updates are received, some calculations are performed and the stock data is shown in a table as shown in Figure 2.
Note: The back-end stock feed actually produces a lot of stock price updates per second, so using polling at one-second intervals is actually more prudent than using a Comet long-polling solution, which would result in a series of continuous polls. Polling effectively throttles the incoming updates here.
It all looks great, but a look under the hood reveals there are some serious issues with this application. For example, in Mozilla Firefox with Firebug (a Firefox add-on that allows you to debug web pages and monitor the time it takes to load pages and execute scripts), you can see that GET requests hammer the server at one-second intervals. Turning on Live HTTP Headers (another Firefox add-on that shows live HTTP header traffic) reveals the shocking amount of header overhead that is associated with each request. The following two examples show the HTTP header data for just a single request and response:
Example 2: HTTP request header
GET /PollingStock//PollingStock HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:126.96.36.199) Gecko/20091102 Firefox/3.5.5
Cookie: showInheritedConstant=false; showInheritedProtectedConstant=false; showInheritedProperty=false; showInheritedProtectedProperty=false; showInheritedMethod=false; showInheritedProtectedMethod=false; showInheritedEvent=false; showInheritedStyle=false; showInheritedEffect=false
Example 3: HTTP response header
HTTP/1.x 200 OK
Server: Sun Java System Application Server 9.1_02
Date: Sat, 07 Nov 2009 00:32:46 GMT
Just for fun, I counted all the characters. The total HTTP request and response header information overhead contains 871 bytes and that doesn't even include any data. Of course, this is just an example and you can have less than 871 bytes of header data, but I have also seen cases where the header data exceeded 2,000 bytes. In this example application, the data for a typical stock topic message is only about 20 characters long. As you can see, it's effectively drowned out by the excessive header information, which wasn't even required in the first place.
What happens when you deploy this application to a large number of users? Let's take a look at the network throughput for just the HTTP request and response header data associated with this polling application in three different use cases.
- Use case A: 1,000 clients polling every second:
Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)
- Use case B: 10,000 clients polling every second:
Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)
- Use case C: 100,000 clients polling every 1 second:
Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)
That's an enormous amount of unnecessary network throughput. If only we could just get the essential data over the wire. Well, guess what? You can with HTML5 Web Sockets. I rebuilt the application to use HTML5 Web Sockets, adding an event handler to the web page to asynchronously listen for stock update messages from the message broker (check out the many how-tos and tutorials on www.tech.kaazing.com/documentation for more information on how to build a WebSocket application). Each of these messages is a WebSocket frame that has just two bytes of overhead (instead of 871). Take a look at how that affects the network throughput overhead in our three use cases.
- Use case A: 1,000 clients receive 1 message per second:
Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)
- Use case B: 10,000 clients receive 1 message per second:
Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)
- Use case C: 100,000 clients receive 1 message per second:
Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
As you can see in Figure 3, HTML5 Web Sockets provide a dramatic reduction of unnecessary network traffic compared to the polling solution.
Figure 3: Comparison of the unnecessary network throughput overhead between the polling and the WebSocket applications
What about the reduction in latency? Take a look at Figure 4. In the top half, you can see the latency of the half-duplex polling solution. If we assume, for this example, that it takes 50 milliseconds for a message to travel from the server to the browser, then the polling application introduces a lot of extra latency, because a new request has to be sent to the server when the response is complete. This new request takes another 50ms and during this time the server cannot send any messages to the browser, resulting in additional server memory consumption.
In the bottom half of the figure, you see the reduction in latency provided by the WebSocket solution. Once the connection is upgraded to WebSocket, messages can flow from the server to the browser the moment they arrive. It still takes 50 ms for messages to travel from the server to the browser, but the WebSocket connection remains open so there is no need to send another request to the server.
Figure 4: Latency comparison between the polling and WebSocket applications
HTML5 Web Sockets and the Kaazing WebSocket Gateway
Today, only Google's Chrome browser supports HTML5 Web Sockets natively, but other browsers will soon follow. To work around that limitation, however, Kaazing WebSocket Gateway provides complete WebSocket emulation for all the older browsers (I.E. 5.5+, Firefox 1.5+, Safari 3.0+, and Opera 9.5+), so you can start using the HTML5 WebSocket APIs today.
WebSocket is great, but what you can do once you have a full-duplex socket connection available in your browser is even greater. To leverage the full power of HTML5 Web Sockets, Kaazing provides a ByteSocket library for binary communication and higher-level libraries for protocols like Stomp, AMQP, XMPP, IRC and more, built on top of WebSocket.
For example, if you use a high-level library for protocols such as Stomp or AMQP (supplied with the Kaazing Gateway), you can talk directly to back-end message brokers like the RabbitMQ broker described in the example. By having a direct connection to services, there is no need for additional application server logic to translate those bi-directional, full-duplex TCP backend protocols to uni-directional, half-duplex HTTP connections; the browser can simply understand these protocols natively.
Figure 5: Kaazing WebSocket Gateway extends TCP-based messaging to the browser with ultra high performance
HTML5 Web Sockets provides an enormous step forward in the scalability of the real-time web. As you have seen in this article, HTML5 Web Sockets can provide a 500:1 or - depending on the size of the HTTP headers - even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency. That is not just an incremental improvement; that is a revolutionary jump - a quantum leap.
Kaazing WebSocket Gateway makes HTML5 WebSocket code work in all the browsers today, while providing additional protocol libraries that allow you to harness the full power of the full-duplex socket connection that HTML5 Web Sockets provides and communicate directly to back-end services. For more information about Kaazing WebSocket Gateway, visit www.kaazing.com and the Kaazing technology network at tech.kaazing.com.
- HTML5 Web Sockets Specification: http://dev.w3.org/html5/websockets/
- WebSockets.org: http://www.websockets.org
- Kaazing corporation: http://www.kaazing.com
- Download the Kaazing WebSocket Gateway: http://www.kaazing.com/download
- Skills Matter HTML5 Communication Training Courses: http://skillsmatter.com/expert-profile/java-jee/peter-lubbers
- Book: Pro HTML5 Programming: Powerful APIs for Richer Internet Application Development (Link)
|PeterLubbers 03/15/10 03:55:00 PM EDT|
An additional problem with streaming is that you can get unexpected results if there are intermediaries such as buffering proxy servers in the way. Streaming solutions therefore often fall back to long-polling if they detect such intermediaries (if they are smart enough to detect them at all). With WebSocket and WebSocket secure you have a better chance of traversing proxy servers and firewalls.
|tbayes 03/11/10 10:11:00 AM EST|
Hi Peter and Frank,
Thank you for this clear and well-written article, which does much to effectively explain the benefits of the WebSocket protocol.
If I may point out one minor correction: In the WebSocket section of Example 3, Use case B and C should show overhead rates of 0.153Mbps and 1.526Mbps respectively, as opposed to 0.153Kbps and 1.526Kbps.
Perhaps a more natural comparison would be between Web Sockets and Comet-streaming (via, for example, the "Forever IFrame" technique), as I feel it a little unsporting to compare Web Sockets directly to polling or long-polling! At around 20 bytes per message, the overhead associated with clients receiving messages via Comet-style streaming is still more than WebSocket's 2 bytes per message, but is still one to two orders of magnitude less than the overhead associated with polling techniques. Similarly, latency whether receiving messages using Comet-streaming or WebSocket is the same. Of course, clients sending messages will gain the advantages you describe only with Web Sockets, as all other approaches do indeed require new HTTP requests.
I completely agree, however, that WebSocket is a definite improvement over the Comet IFrame-approaches, which have an air of string and glue about them.
I hope that we will see WebSocket support in all production mainstream browsers very soon.
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
Jul. 1, 2015 07:15 PM EDT Reads: 2,520
Containers have changed the mind of IT in DevOps. They enable developers to work with dev, test, stage and production environments identically. Containers provide the right abstraction for microservices and many cloud platforms have integrated them into deployment pipelines. DevOps and Containers together help companies to achieve their business goals faster and more effectively. In his session at DevOps Summit, Ruslan Synytsky, CEO and Co-founder of Jelastic, reviewed the current landscape of...
Jul. 1, 2015 05:00 PM EDT Reads: 2,226
Conferences agendas. Event navigation. Specific tasks, like buying a house or getting a car loan. If you've installed an app for any of these things you've installed what's known as a "disposable mobile app" or DMA. Apps designed for a single use-case and with the expectation they'll be "thrown away" like brochures. Deleted until needed again. These apps are necessarily small, agile and highly volatile. Sometimes existing only for a short time - say to support an event like an election, the Wor...
Jul. 1, 2015 05:00 PM EDT Reads: 1,650
The cloud has transformed how we think about software quality. Instead of preventing failures, we must focus on automatic recovery from failure. In other words, resilience trumps traditional quality measures. Continuous delivery models further squeeze traditional notions of quality. Remember the venerable project management Iron Triangle? Among time, scope, and cost, you can only fix two or quality will suffer. Only in today's DevOps world, continuous testing, integration, and deployment upend...
Jul. 1, 2015 03:00 PM EDT Reads: 1,949
Discussions about cloud computing are evolving into discussions about enterprise IT in general. As enterprises increasingly migrate toward their own unique clouds, new issues such as the use of containers and microservices emerge to keep things interesting. In this Power Panel at 16th Cloud Expo, moderated by Conference Chair Roger Strukhoff, panelists addressed the state of cloud computing today, and what enterprise IT professionals need to know about how the latest topics and trends affect t...
Jul. 1, 2015 02:30 PM EDT Reads: 1,198
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at @ThingsExpo, James Kirkland, Red Hat's Chief Arch...
Jul. 1, 2015 02:21 PM EDT Reads: 659
Summer is finally here and it’s time for a DevOps summer vacation. From San Francisco to New York City, our top summer conferences list is going to continuously deliver you to the summer destinations of your dreams. These DevOps parties are hitting all the hottest summer trends with Microservices, Agile, Continuous Delivery, DevSecOps, and even Continuous Testing. Move over Kanye. These are the top 5 Summer DevOps Conferences of 2015.
Jul. 1, 2015 01:00 PM EDT Reads: 1,001
Sharding has become a popular means of achieving scalability in application architectures in which read/write data separation is not only possible, but desirable to achieve new heights of concurrency. The premise is that by splitting up read and write duties, it is possible to get better overall performance at the cost of a slight delay in consistency. That is, it takes a bit of time to replicate changes initiated by a "write" to the read-only master database. It's eventually consistent, and it'...
Jul. 1, 2015 12:00 PM EDT Reads: 1,710
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 11:45 AM EDT Reads: 1,034
Containers are changing the security landscape for software development and deployment. As with any security solutions, security approaches that work for developers, operations personnel and security professionals is a requirement. In his session at DevOps Summit, Kevin Gilpin, CTO and Co-Founder of Conjur, will discuss various security considerations for container-based infrastructure and related DevOps workflows.
Jul. 1, 2015 10:30 AM EDT Reads: 863
Cloud Migration Management (CMM) refers to the best practices for planning and managing migration of IT systems from a legacy platform to a Cloud Provider through a combination professional services consulting and software tools. A Cloud migration project can be a relatively simple exercise, where applications are migrated ‘as is’, to gain benefits such as elastic capacity and utility pricing, but without making any changes to the application architecture, software development methods or busine...
Jul. 1, 2015 10:00 AM EDT Reads: 1,924
Data center models are changing. A variety of technical trends and business demands are forcing that change, most of them centered on the explosive growth of applications. That means, in turn, that the requirements for application delivery are changing. Certainly application delivery needs to be agile, not waterfall. It needs to deliver services in hours, not weeks or months. It needs to be more cost efficient. And more than anything else, it needs to be really, dc infra axisreally, super focus...
Jul. 1, 2015 10:00 AM EDT Reads: 1,981
The most often asked question post-DevOps introduction is: “How do I get started?” There’s plenty of information on why DevOps is valid and important, but many managers still struggle with simple basics for how to initiate a DevOps program in their business. They struggle with issues related to current organizational inertia, the lack of experience on Continuous Integration/Delivery, understanding where DevOps will affect revenue and budget, etc. In their session at DevOps Summit, JP Morgenthal...
Jul. 1, 2015 09:32 AM EDT Reads: 650
Overgrown applications have given way to modular applications, driven by the need to break larger problems into smaller problems. Similarly large monolithic development processes have been forced to be broken into smaller agile development cycles. Looking at trends in software development, microservices architectures meet the same demands. Additional benefits of microservices architectures are compartmentalization and a limited impact of service failure versus a complete software malfunction. ...
Jul. 1, 2015 09:00 AM EDT Reads: 890
Many people recognize DevOps as an enormous benefit – faster application deployment, automated toolchains, support of more granular updates, better cooperation across groups. However, less appreciated is the journey enterprise IT groups need to make to achieve this outcome. The plain fact is that established IT processes reflect a very different set of goals: stability, infrequent change, hands-on administration, and alignment with ITIL. So how does an enterprise IT organization implement change...
Jun. 29, 2015 12:45 PM EDT Reads: 2,819
While DevOps most critically and famously fosters collaboration, communication, and integration through cultural change, culture is more of an output than an input. In order to actively drive cultural evolution, organizations must make substantial organizational and process changes, and adopt new technologies, to encourage a DevOps culture. Moderated by Andi Mann, panelists discussed how to balance these three pillars of DevOps, where to focus attention (and resources), where organizations migh...
Jun. 28, 2015 05:00 PM EDT Reads: 2,050
At DevOps Summit NY there’s been a whole lot of talk about not just DevOps, but containers, IoT, and microservices. Sessions focused not just on the cultural shift needed to grow at scale with a DevOps approach, but also made sure to include the network ”plumbing” needed to ensure success as applications decompose into the microservice architectures enabling rapid growth and support for the Internet of (Every)Things.
Jun. 28, 2015 01:00 PM EDT Reads: 1,920
Mashape is bringing real-time analytics to microservices with the release of Mashape Analytics. First built internally to analyze the performance of more than 13,000 APIs served by the mashape.com marketplace, this new tool provides developers with robust visibility into their APIs and how they function within microservices. A purpose-built, open analytics platform designed specifically for APIs and microservices architectures, Mashape Analytics also lets developers and DevOps teams understand w...
Jun. 27, 2015 11:00 AM EDT Reads: 1,948
Buzzword alert: Microservices and IoT at a DevOps conference? What could possibly go wrong? In this Power Panel at DevOps Summit, moderated by Jason Bloomberg, the leading expert on architecting agility for the enterprise and president of Intellyx, panelists peeled away the buzz and discuss the important architectural principles behind implementing IoT solutions for the enterprise. As remote IoT devices and sensors become increasingly intelligent, they become part of our distributed cloud envir...
Jun. 26, 2015 12:00 PM EDT Reads: 2,257
Sumo Logic has announced comprehensive analytics capabilities for organizations embracing DevOps practices, microservices architectures and containers to build applications. As application architectures evolve toward microservices, containers continue to gain traction for providing the ideal environment to build, deploy and operate these applications across distributed systems. The volume and complexity of data generated by these environments make monitoring and troubleshooting an enormous chall...
Jun. 26, 2015 12:00 PM EDT Reads: 1,562