Click here to close now.


Microservices Expo Authors: Janakiram MSV, Jason Bloomberg, Carmen Gonzalez, Elizabeth White, Victoria Livschitz

Related Topics: @CloudExpo, Microservices Expo, Microsoft Cloud, Containers Expo Blog, IoT User Interface, @BigDataExpo, SDN Journal

@CloudExpo: Blog Feed Post

Is Cloud Built to Fail or Built to Scale?

The difference matters. A lot.

There's been a growing focus on scalability as the Internet of Things has continued its rapid growth. Perhaps due in part to large online failures during periodic or individual events, perhaps due in part to simple growth, the reason is less important than the reality that scalability is a critical technological driver for a variety of new technologies - cloud and SDN being the most often referenced.

But while we've been focusing on scalability we may have been overlooking the related and no less important availability factor. These two "itys" are related, as scalability is one way to achieve availability when dealing with growth, rapid or otherwise. But availability also means being sensitive to failure.

Cloud, in general, is designed for scalability. It is specifically architected to provide elasticity - which is scalability both in and out. Cloud is designed to enable resource growth and contraction to match demand.

In this way, cloud addresses one aspect of availability: capacity. But it does not always address the other aspect - failure. Cloud is built to scale, not necessarily fail.

The Many Faces of Availability

Availability and scale are both achieved primarily through the same mechanisms in both data centers and cloud environments (and SDN network fabrics, for that matter): load balancing. At the network layer we've seen techniques like link aggregation (trunking, teaming, bundling) to both manage both scale and fail. Multiple network links are bound together, usually using an established network protocol, and traffic is distributed via load balancing across those links.

Similarly, the same techniques are used at the application layers (layers 4-7) to provide the same measure of scalability and resilience to failure at the server and application layer. Multiple resources are bound together, usually using a concept known as a virtual server (or virtual IP address) in a load balancing service and then requests are distributed via load balancing across those resources.

In this way, scalability is achieved. As demand grows, resources are transparently added to increase capacity. Similarly, as demand contracts, resources can be transparently decommissioned to decrease capacity. Voila! Elasticity.

But failure, the other and lesser mentioned aspect impacting availability, is not so easily managed.

The Impact of Failure

In the case of the network, the failure of a single link in an aggregated bundle (or trunk) is handled by simply ignoring the failed link. All traffic is distributed across the remaining links, every packet is pushed, and availability is maintained. Except when it isn't because of oversubscription or congestion that results in excessive latencies that cause delays ultimately resulting in poor application performance. While in the purest sense of the word "available" the application is still accessible, most businesses today consider unresponsive or poorly performing applications to be "unavailable", especially when those applications are revenue generating.

At the application layers, failure is even more detrimental to availability. Oversubscription of an application due to failure of resources often results in true downtime; errors or timeouts that prevent the end-user from accessing the application at all. Worse, users that were active may suddenly find they have "lost" their connection as well as any work they may have been doing before the resource was lost.

Load balancing architectures compensate, of course, by directing those users to other application instances. Cloud environments imbued with auto-scaling capabilities may be able to redress the failure by provisioning a new instance to take its place and thus maintain the proper levels of capacity. But that does not mitigate the loss of productivity and access experienced due to the original failure. It addresses scalability, not availability.

That's because when a failure occurs in most cloud environments, all active sessions to the failed application instance are simply discarded. The users must start anew. The cloud infrastructure fabric will certainly redirect them to a new instance and start a new session (and in this way it will "handle" failure), but this is disruptive to the user; it is noticeable. And noticeable degradations of performance or availability are a no-no for most business stakeholders.

Beware the Long Term

It's not necessarily the immediate reaction that should be of concern, but the long term impact. Everyone cites the data presented by Microsoft, Google, and Shopzilla at Velocity 2009 with respect to the impact of seconds of delay on revenue (spoiler: it's not good) but they tend to ignore the long term impact - the behavioral impact - of such delays and disruption on the end user [emphasis mine]:

Their data showed that slow sites get fewer search queries per user, less revenue per visitor, fewer clicks, fewer searches, and lower search engine rankings. They found that in some cases even after site performance was improved users continued to interact as if it was slow. Bad experiences have a lasting influence on customer behavior.

-- More on how web performance impacts revenue…

Did you catch that? Bad experiences (of which disruption is certainly one, I don't think we need to argue about that, do we?) have a lasting influence on customer behavior.

It's important, therefore, to understand the limitations of the environment in which you are deploying an application - particularly one that is customer-facing. Understanding that cloud is built to scale - not fail - is a key piece of knowledge you need to make decisions regarding which applications and workloads are fit for migration to the cloud, and which may be a fit if they are re-architected to address such failure themselves.

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

@MicroservicesExpo Stories
In a report titled “Forecast Analysis: Enterprise Application Software, Worldwide, 2Q15 Update,” Gartner analysts highlighted the increasing trend of application modernization among enterprises. According to a recent survey, 45% of respondents stated that modernization of installed on-premises core enterprise applications is one of the top five priorities. Gartner also predicted that by 2020, 75% of
Despite all the talk about public cloud services and DevOps, you would think the move to cloud for enterprises is clear and simple. But in a survey of almost 1,600 IT decision makers across the USA and Europe, the state of the cloud in enterprise today is still fraught with considerable frustration. The business case for apps in the real world cloud is hybrid, bimodal, multi-platform, and difficult. Download this report commissioned by NTT Communications to see the insightful findings – registra...
DevOps Summit at Cloud Expo 2014 Silicon Valley was a terrific event for us. The Qubell booth was crowded on all three days. We ran demos every 30 minutes with folks lining up to get a seat and usually standing around. It was great to meet and talk to over 500 people! My keynote was well received and so was Stan's joint presentation with RingCentral on Devops for BigData. I also participated in two Power Panels – ‘Women in Technology’ and ‘Why DevOps Is Even More Important than You Think,’ both ...
Docker is hot. However, as Docker container use spreads into more mature production pipelines, there can be issues about control of Docker images to ensure they are production-ready. Is a promotion-based model appropriate to control and track the flow of Docker images from development to production? In his session at DevOps Summit, Fred Simon, Co-founder and Chief Architect of JFrog, will demonstrate how to implement a promotion model for Docker images using a binary repository, and then show h...
As the world moves towards more DevOps and microservices, application deployment to the cloud ought to become a lot simpler. The microservices architecture, which is the basis of many new age distributed systems such as OpenStack, NetFlix and so on, is at the heart of Cloud Foundry - a complete developer-oriented Platform as a Service (PaaS) that is IaaS agnostic and supports vCloud, OpenStack and AWS. In his session at 17th Cloud Expo, Raghavan "Rags" Srinivas, an Architect/Developer Evangeli...
DevOps has often been described in terms of CAMS: Culture, Automation, Measuring, Sharing. While we’ve seen a lot of focus on the “A” and even on the “M”, there are very few examples of why the “C" is equally important in the DevOps equation. In her session at @DevOps Summit, Lori MacVittie, of F5 Networks, will explore HTTP/1 and HTTP/2 along with Microservices to illustrate why a collaborative culture between Dev, Ops, and the Network is critical to ensuring success.
Our guest on the podcast this week is Jason Bloomberg, President at Intellyx. When we build services we want them to be lightweight, stateless and scalable while doing one thing really well. In today's cloud world, we're revisiting what to takes to make a good service in the first place. Listen in to learn why following "the book" doesn't necessarily mean that you're solving key business problems.
Application availability is not just the measure of “being up”. Many apps can claim that status. Technically they are running and responding to requests, but at a rate which users would certainly interpret as being down. That’s because excessive load times can (and will be) interpreted as “not available.” That’s why it’s important to view ensuring application availability as requiring attention to all its composite parts: scalability, performance, and security.
In their session at DevOps Summit, Asaf Yigal, co-founder and the VP of Product at, and Tomer Levy, co-founder and CEO of, will explore the entire process that they have undergone – through research, benchmarking, implementation, optimization, and customer success – in developing a processing engine that can handle petabytes of data. They will also discuss the requirements of such an engine in terms of scalability, resilience, security, and availability along with how the archi...
Overgrown applications have given way to modular applications, driven by the need to break larger problems into smaller problems. Similarly large monolithic development processes have been forced to be broken into smaller agile development cycles. Looking at trends in software development, microservices architectures meet the same demands. Additional benefits of microservices architectures are compartmentalization and a limited impact of service failure versus a complete software malfunction....
The last decade was about virtual machines, but the next one is about containers. Containers enable a service to run on any host at any time. Traditional tools are starting to show cracks because they were not designed for this level of application portability. Now is the time to look at new ways to deploy and manage applications at scale. In his session at @DevOpsSummit, Brian “Redbeard” Harrington, a principal architect at CoreOS, will examine how CoreOS helps teams run in production. Attende...
For it to be SOA – let alone SOA done right – we need to pin down just what "SOA done wrong" might be. First-generation SOA with Web Services and ESBs, perhaps? But then there's second-generation, REST-based SOA. More lightweight and cloud-friendly, but many REST-based SOA practices predate the microservices wave. Today, microservices and containers go hand in hand – only the details of "container-oriented architecture" are largely on the drawing board – and are not likely to look much like S...
Any Ops team trying to support a company in today’s cloud-connected world knows that a new way of thinking is required – one just as dramatic than the shift from Ops to DevOps. The diversity of modern operations requires teams to focus their impact on breadth vs. depth. In his session at DevOps Summit, Adam Serediuk, Director of Operations at xMatters, Inc., will discuss the strategic requirements of evolving from Ops to DevOps, and why modern Operations has begun leveraging the “NoOps” approa...
With containerization using Docker, the orchestration of containers using Kubernetes, the self-service model for provisioning your projects and applications and the workflows we built in OpenShift is the best in class Platform as a Service that enables introducing DevOps into your organization with ease. In his session at DevOps Summit, Veer Muchandi, PaaS evangelist with RedHat, will provide a deep dive overview of OpenShift v3 and demonstrate how it helps with DevOps.
All we need to do is have our teams self-organize, and behold! Emergent design and/or architecture springs up out of the nothingness! If only it were that easy, right? I follow in the footsteps of so many people who have long wondered at the meanings of such simple words, as though they were dogma from on high. Emerge? Self-organizing? Profound, to be sure. But what do we really make of this sentence?
Containers are changing the security landscape for software development and deployment. As with any security solutions, security approaches that work for developers, operations personnel and security professionals is a requirement. In his session at @DevOpsSummit, Kevin Gilpin, CTO and Co-Founder of Conjur, will discuss various security considerations for container-based infrastructure and related DevOps workflows.
Last month, my partners in crime – Carmen DeArdo from Nationwide, Lee Reid, my colleague from IBM and I wrote a 3-part series of blog posts on We titled our posts the Simple Math, Calculus and Art of DevOps. I would venture to say these are must-reads for any organization adopting DevOps. We examined all three ascpects – the Cultural, Automation and Process improvement side of DevOps. One of the key underlying themes of the three posts was the need for Cultural change – things like t...
There once was a time when testers operated on their own, in isolation. They’d huddle as a group around the harsh glow of dozens of CRT monitors, clicking through GUIs and recording results. Anxiously, they’d wait for the developers in the other room to fix the bugs they found, yet they’d frequently leave the office disappointed as issues were filed away as non-critical. These teams would rarely interact, save for those scarce moments when a coder would wander in needing to reproduce a particula...
It is with great pleasure that I am able to announce that Jesse Proudman, Blue Box CTO, has been appointed to the position of IBM Distinguished Engineer. Jesse is the first employee at Blue Box to receive this honor, and I’m quite confident there will be more to follow given the amazing talent at Blue Box with whom I have had the pleasure to collaborate. I’d like to provide an overview of what it means to become an IBM Distinguished Engineer.
The cloud has reached mainstream IT. Those 18.7 million data centers out there (server closets to corporate data centers to colocation deployments) are moving to the cloud. In his session at 17th Cloud Expo, Achim Weiss, CEO & co-founder of ProfitBricks, will share how two companies – one in the U.S. and one in Germany – are achieving their goals with cloud infrastructure. More than a case study, he will share the details of how they prioritized their cloud computing infrastructure deployments ...