By Jason Bloomberg
June 3, 2011 09:45 AM EDT
Your CIO is all fired up about moving your legacy inventory management app to the Cloud. Lower capital costs! Dynamic provisioning! Outsourced infrastructure! So you get out your shoehorn, provision some storage and virtual machine instances, and forklift the whole mess into the stratosphere. (OK, there's more to it than that, but bear with me.)
Everything seems to work at first. But then the real test comes: the holiday season, when you do most of your online business. You breathe a sigh of relief as your Cloud provider seamlessly scales up to meet the spikes in demand. But then your boss calls, irate. It turns out customers are swamping the call center with complaints of failed transactions.
You frantically dive into the log files and diagnostic reports to see what the problem is. Apparently, the database has not been keeping an accurate count of your inventory, which is pretty much what an inventory management system is all about. You check the SQL, and you can't find the problem. Now you're really beginning to sweat.
You dig deeper, and you find the database is frequently in an inconsistent state. When the app processes orders, it decrements the product count. When the count for a product drops to zero, it's supposed to show customers that you've run out. But sometimes, the count is off. Not always, and not for every product. And the problem only seems to occur in the afternoons, when you normally experience your heaviest transaction volume.
The Problem: Consistency in the Cloud
The problem is that while it may appear that your database is running in a single storage partition, in reality the Cloud provider is provisioning multiple physical partitions as needed to provide elastic capacity. But when you look at the fine print in your contract with the Cloud provider, you realize they offer eventual consistency, not immediate consistency. In other words, your data may be inconsistent for short periods of time, especially when your app is experiencing peak load. It may only be a matter of seconds for the issue to resolve, but in the meantime, customers are placing orders for products that aren't available. You're charging their credit cards and all they get for their money is an error page.
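The failure described above can be reproduced in a few lines. The following is a minimal sketch, not a model of any particular Cloud provider: the `Replica` and `EventuallyConsistentStore` classes, the `place_order` function, and the `"widget"` product are all hypothetical names invented for illustration. It assumes two physical partitions that each hold a copy of the stock count and exchange updates only when `replicate()` runs.

```python
class Replica:
    """One physical partition holding its own copy of the stock counts."""

    def __init__(self, stock):
        self.stock = dict(stock)


class EventuallyConsistentStore:
    """Hypothetical store: a write lands on one replica and propagates later."""

    def __init__(self, stock, replica_count=2):
        self.replicas = [Replica(stock) for _ in range(replica_count)]
        self.pending = []  # (source replica index, sku, delta) not yet replicated

    def read(self, replica_index, sku):
        return self.replicas[replica_index].stock[sku]

    def decrement(self, replica_index, sku):
        self.replicas[replica_index].stock[sku] -= 1
        self.pending.append((replica_index, sku, -1))

    def replicate(self):
        """Apply every pending change to the replicas that have not seen it."""
        for source, sku, delta in self.pending:
            for index, replica in enumerate(self.replicas):
                if index != source:
                    replica.stock[sku] += delta
        self.pending.clear()


def place_order(store, replica_index, sku):
    """Legacy logic: trust whatever count the local replica reports."""
    if store.read(replica_index, sku) > 0:
        store.decrement(replica_index, sku)
        return True
    return False


store = EventuallyConsistentStore({"widget": 1})
first = place_order(store, 0, "widget")   # replica 0 sees one unit and sells it
second = place_order(store, 1, "widget")  # replica 1 has not heard yet, sells it again
store.replicate()                         # the replicas converge a moment later
final_counts = [store.read(i, "widget") for i in range(2)]
print(first, second, final_counts)
```

Both orders succeed even though only one unit exists, and once the replicas converge they agree on a count of -1. The data did become consistent eventually, just not before the second customer was charged.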
From the perspective of the Cloud provider, however, nothing is broken. Eventual consistency is inherent to the nature of Cloud computing, a consequence of what is known as the CAP Theorem: no distributed computing system can guarantee (immediate) consistency, availability, and partition tolerance at the same time. You can get any two of these, but not all three at once.
Of these three characteristics, partition tolerance is the least familiar. In essence, a distributed system is partition tolerant when it will continue working even in the case of a partial network failure. In other words, bits and pieces of the system can fail or otherwise stop communicating with the other bits and pieces, and the overall system will continue to function.
With on-premise distributed computing, we're not particularly interested in partition tolerance: transactional environments run in a single partition. If we want ACID transactionality (atomic, consistent, isolated, and durable transactions), then we should stick with a partition intolerant approach like a two-phase commit infrastructure. In essence, ACID implies that a transaction runs in a single partition.
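To see why a two-phase commit is partition intolerant, consider the rough sketch below. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch leaves out the logging, timeouts, and recovery that a real implementation needs. It only illustrates the core rule: every participant must vote yes in the first phase, so a single unreachable participant blocks the whole transaction.

```python
class Participant:
    """Hypothetical resource manager taking part in a transaction."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.staged = None
        self.committed = []

    def prepare(self, change):
        """Phase 1: vote yes only if this participant can be reached."""
        if not self.reachable:
            return False
        self.staged = change
        return True

    def commit(self):
        """Phase 2, success path: make the staged change permanent."""
        self.committed.append(self.staged)
        self.staged = None

    def rollback(self):
        """Phase 2, failure path: discard the staged change."""
        self.staged = None


def two_phase_commit(participants, change):
    votes = [p.prepare(change) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("orders"), Participant("inventory")]
healthy_result = two_phase_commit(healthy, "decrement widget")

partitioned = [Participant("orders"), Participant("inventory", reachable=False)]
partitioned_result = two_phase_commit(partitioned, "decrement widget")
print(healthy_result, partitioned_result)
```

With every participant reachable the change commits everywhere. With one participant cut off, nothing commits anywhere: the data stays consistent, but the transaction is unavailable for as long as the partition lasts.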
But in the Cloud, we require partition tolerance, because the Cloud provider is willing to allow that each physical instance cannot necessarily communicate with every other physical instance at all times, and each physical instance may go down unpredictably. And if your underlying physical instances aren't communicating or working properly, then you have either an availability or a consistency issue. But since the Cloud is architected for high availability, consistency will necessarily suffer.
The Solution: Rethink Your Priorities
The kneejerk reaction might be that since consistency is nonnegotiable, we need to force the Cloud providers to give up partition tolerance. But in reality, that's entirely the wrong way to think about the problem. Instead, we must rethink our priorities.
As any data specialist will tell you, there are always performance vs. flexibility tradeoffs in the world of data. Every generation of technology suffers from this tradeoff, and the Cloud is no different. What is different about the Cloud is that we want virtualization-based elasticity, which requires partition tolerance.
If we want ACID transactionality then we should stick with an on-premise partition intolerant approach. But in the Cloud, ACID is the wrong priority. We need a different way of thinking about consistency and reliability. Instead of ACID, we need BASE (catchy, eh?).
BASE stands for Basic Availability (supports partial failures without leading to a total system failure), Soft-state (any change in state must be maintained through periodic refreshment), and Eventual consistency (once updates stop arriving, all copies of the data will converge to the same value after some period of time). BASE has been around for several years and actually predates the notion of Cloud computing; in fact, it underlies the telco world's notion of "best effort" reliability that applies to the mobile phone infrastructure. But today, understanding the principles of BASE is essential to understanding how to architect applications for the Cloud.
Thinking in a BASE Way
Let's put the BASE principles in simple terms.
Basic availability: stuff happens. We're using commodity hardware in the Cloud. We're expecting and planning for failure. But hey, we've got it covered.
Soft state: the squeaky wheel gets the grease. If you don't keep telling me where you are or what you're doing, I'll assume you're not there anymore or you're done doing whatever it is you were doing. So if any part of the infrastructure crashes and reboots, it can bootstrap itself without any worries about it being in the wrong state.
Eventual consistency: It's OK to use stale data some of the time. It'll all come clean eventually. Accountants have followed this principle since Babylonian times. It's called "closing the books."
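Of the three principles above, soft state is perhaps the easiest to show in code. The sketch below is one possible reading, assuming a hypothetical `SoftStateRegistry` in which each node must keep sending heartbeats; the class name, the 30-second time-to-live, and the node names are all invented for illustration. Timestamps are passed in explicitly so that the example is deterministic.

```python
class SoftStateRegistry:
    """Hypothetical registry: an entry survives only while it is refreshed."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}  # node name -> time of most recent heartbeat

    def heartbeat(self, node, now):
        """A node announces that it is still there."""
        self.last_seen[node] = now

    def alive(self, now):
        """Forget every node that has been silent for longer than the TTL."""
        self.last_seen = {
            node: seen
            for node, seen in self.last_seen.items()
            if now - seen <= self.ttl
        }
        return sorted(self.last_seen)


registry = SoftStateRegistry(ttl_seconds=30)
registry.heartbeat("node-a", now=0)
registry.heartbeat("node-b", now=0)
registry.heartbeat("node-a", now=25)   # node-a keeps squeaking
after_silence = registry.alive(now=40) # node-b has been silent for 40 seconds

registry.heartbeat("node-b", now=45)   # node-b reboots and simply re-announces itself
after_reboot = registry.alive(now=50)
print(after_silence, after_reboot)
```

Nothing has to clean up after the crashed node, and nothing has to restore its old state when it returns. The registry forgets it once the heartbeats stop and picks it up again when they resume.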
So, how would you address your inventory app following BASE best effort principles? First, assume that any product quantity is approximate. If the quantity isn't near zero you don't have much of a problem. If it is near zero, set the proper expectation with the customer. Don't charge their credit card in a synchronous fashion. Instead, let them know that their purchase has probably completed successfully. Once the dust settles, let them know if they got the item or not.
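The approach just described might look something like the following sketch. It assumes a hypothetical `Order` record with three states and a `reconcile` step that runs once the stock count has settled; none of these names come from a real library, and where the comments mention charging the card or notifying the customer, a real system would call its own payment and messaging services.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer: str
    sku: str
    status: str = "pending"  # pending -> confirmed | rejected


def accept_order(pending_orders, customer, sku):
    """Take the order right away, but do not charge the card yet."""
    order = Order(customer, sku)
    pending_orders.append(order)
    return order  # the customer is told the purchase has probably completed


def reconcile(pending_orders, settled_stock):
    """Once the dust settles, decide each pending order against the settled count."""
    for order in pending_orders:
        if order.status != "pending":
            continue
        if settled_stock.get(order.sku, 0) > 0:
            settled_stock[order.sku] -= 1
            order.status = "confirmed"  # charge the card and confirm here
        else:
            order.status = "rejected"   # apologize; no money has changed hands


orders = []
accept_order(orders, "alice", "widget")
accept_order(orders, "bob", "widget")

settled = {"widget": 1}  # the count after the replicas have converged
reconcile(orders, settled)
statuses = [(order.customer, order.status) for order in orders]
print(statuses)
```

Both customers get an immediate response, only one ends up with the last widget, and nobody is charged for an item they will never receive. The design choice is to move the point of commitment from the moment of the click to the moment the data is known to be consistent.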
Of course, this inventory example is an oversimplification, and every situation is different. The bottom line is that you can't expect the same kind of transactionality in the Cloud as you could in a partition intolerant on-premise environment. If you erroneously assume that you can move your app to the Cloud without reworking how it handles transactionality, then you are in for an unpleasant surprise. On the other hand, rearchitecting your app for the Cloud will improve it overall.
The ZapThink Take
Intermittently stale data? Unpredictable counts? States that expire? Your computer science profs must be rolling around in their graves. That's no way to write a computer program! Data are data, counts are counts, and states are states! How could anything work properly if we get all loosey-goosey about such basics?
Welcome to the twenty-first century, folks. Bank account balances, search engine results, instant messaging buddy lists: if you think about it, all of these everyday elements of our wired lives follow BASE principles in one way or another.
And now we have Cloud computing, where we're bundling together several different modern distributed computing trends into one neat package. But if we mistake the Cloud for being nothing more than a collection of existing trends then we're likely to fall into the "horseless carriage" trap, where we fail to recognize what's special about the Cloud.
The Cloud is much more than a virtual server in the sky. You can't simply migrate an existing app into the Cloud and expect it to work properly, let alone take advantage of the power of the Cloud. Instead, application migration and application modernization necessarily go hand in hand, and architecting your app for the Cloud is more important than ever.