|By Greg Schulz||
|January 9, 2013 08:00 AM EST||
This is the first of a two-part industry trends and perspectives series looking at how to learn from cloud outages (read part II here).
In case you missed it, there were some public cloud outages during the recent Christmas 2012-holiday season. One incident involved Microsoft Xbox (view the Microsoft Azure status dashboard here) users were impacted, and the other was another Amazon Web Services (AWS) incident. Microsoft and AWS are not alone, most if not all cloud services have had some type of incident and have gone on to improve from those outages. Google has had issues with different applications and services including some in December 2012 along with a Gmail incident that received covered back in 2011.
For those interested, here is a link to the AWS status dashboard and a link to the AWS December 24 2012 incident postmortem. In the case of the recent AWS incident which affected users such as Netflix, the incident (read the AWS postmortem and Netflix postmortem) was tied to a human error. This is not to say AWS has more outages or incidents vs. others including Microsoft, it just seems that we hear more about AWS when things happen compared to others. That could be due to AWS size and arguably market leading status, diversity of services and scale at which some of their clients are using them.
Btw, if you were not aware, Microsoft Azure is more than just about supporting SQLserver, Exchange, SharePoint or Office, it is also an IaaS layer for running virtual machines such as Hyper-V, as well as a storage target for storing data. You can use Microsoft Azure storage services as a target for backing up or archiving or as general storage, similar to using AWS S3 or Rackspace Cloud files or other services. Some backup and archiving AaaS and SaaS providers including Evault partner with Microsoft Azure as a storage repository target.
When reading some of the coverage of these recent cloud incidents, I am not sure if I am more amazed by some of the marketing cloud washing, or the cloud bashing and uniformed reporting or lack of research and insight. Then again, if someone repeats a myth often enough for others to hear and repeat, as it gets amplified, the myth may assume status of reality. After all, you may know the expression that if it is on the internet then it must be true?
Images licensed for use by StorageIO via Atomazul / Shutterstock.com
Have AWS and public cloud services become a lightning rod for when things go wrong?
Here is some coverage of various cloud incidents:
- Huffington post coverage of February 2011 Google Gmail incident
- Microsoft Azure coverage by Allthingsd.com
- Neowin.net covering Microsoft Xbox incident
- Google's Gmail blog coverage of Gmail outage
- Forbes article Amazon AWS Takes Down Netflix on Christmas Eve
- Over at Performance Critical Apps they assert the AWS incident was Netflix fault
- From The Virtualization Practice: Amazon Ruining Public Cloud Computing?
- Here is Netflix architect Adrian Cockcroft discussing the recent incident
- From StorageIOblog Amazon Web Services (AWS) and the Netflix Fix?
- From CRN, here are some cloud service availability status via Nasuni
The above are a small sampling of different stories, articles, columns, blogs, perspectives about cloud services outages or other incidents. Assuming the services are available, you can Google or Bing many others along with reading postmortems to gain insight into what happened, the cause, effect and how to prevent in the future.
Do these recent incidents show a trend of increased cloud outages? Alternatively, do they say that the cloud services are being used more and on a larger basis, thus the impacts become more known?
Perhaps it is a mix of the above, and like when a magnetic storage tape gets lost or stolen, it makes for good news or copy, something to write about. Granted there are fewer tapes actually lost than in the past, and far fewer vs. lost or stolen laptops and other devices with data on them. There are probably other reasons such as the lightning rod effect given how much industry hype around clouds that when something does happen, the cynics or foes come out in force, sometimes with FUD.
Similar to traditional hardware or software based product vendors, some service providers have even tried to convince me that they have never had an incident, lost or corrupted or compromised any data, yeah, right. Candidly, I put more credibility and confidence in a vendor or solution provider who tells me that they have had incidents and taken steps to prevent them from recurring. Granted those steps might be made public while others might be under NDA, at least they are learning and implementing improvements.
As part of gaining insights, here are some links to AWS, Google, Microsoft Azure and other service status dashboards where you can view current and past situations.
- AWS service status dashboard
- Bluehost server status dashboard
- Google App status dashboard
- HP cloud service status console (requires login)
- Microsoft Azure service status dashboard
- Microsoft Xbox service status dashboard
- Rackspace service status dashboards
What is your take on IT clouds? Click here to cast your vote and see what others are thinking about clouds.
Ok, nuff said for now (check out part II here )
Disclosure: I am a customer of AWS for EC2, EBS, S3 and Glacier as well as a customer of Bluehost for hosting and Rackspace for backups. Other than Amazon being a seller of my books (and my blog via Kindle) along with running ads on my sites and being an Amazon Associates member (Google also has ads), none of those mentioned are or have been StorageIO clients.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO All Rights Reserved
Skeuomorphism usually means retaining existing design cues in something new that doesn’t actually need them. However, the concept of skeuomorphism can be thought of as relating more broadly to applying existing patterns to new technologies that, in fact, cry out for new approaches. In his session at DevOps Summit, Gordon Haff, Senior Cloud Strategy Marketing and Evangelism Manager at Red Hat, discussed why containers should be paired with new architectural practices such as microservices rathe...
Sep. 1, 2015 01:00 AM EDT Reads: 403
SYS-CON Events announced today that G2G3 will exhibit at SYS-CON's @DevOpsSummit Silicon Valley, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Based on a collective appreciation for user experience, design, and technology, G2G3 is uniquely qualified and motivated to redefine how organizations and people engage in an increasingly digital world.
Aug. 31, 2015 11:00 PM EDT Reads: 498
Any Ops team trying to support a company in today’s cloud-connected world knows that a new way of thinking is required – one just as dramatic than the shift from Ops to DevOps. The diversity of modern operations requires teams to focus their impact on breadth vs. depth. In his session at DevOps Summit, Adam Serediuk, Director of Operations at xMatters, Inc., will discuss the strategic requirements of evolving from Ops to DevOps, and why modern Operations has begun leveraging the “NoOps” approa...
Aug. 31, 2015 10:30 PM EDT Reads: 396
Puppet Labs has announced the next major update to its flagship product: Puppet Enterprise 2015.2. This release includes new features providing DevOps teams with clarity, simplicity and additional management capabilities, including an all-new user interface, an interactive graph for visualizing infrastructure code, a new unified agent and broader infrastructure support.
Aug. 31, 2015 06:30 PM EDT Reads: 521
Early in my DevOps Journey, I was introduced to a book of great significance circulating within the Web Operations industry titled The Phoenix Project. (You can read our review of Gene’s book, if interested.) Written as a novel and loosely based on many of the same principles explored in The Goal, this book has been read and referenced by many who have adopted DevOps into their continuous improvement and software delivery processes around the world. As I began planning my travel schedule last...
Aug. 31, 2015 06:00 PM EDT Reads: 541
Several years ago, I was a developer in a travel reservation aggregator. Our mission was to pull flight and hotel data from a bunch of cryptic reservation platforms, and provide it to other companies via an API library - for a fee. That was before companies like Expedia standardized such things. We started with simple methods like getFlightLeg() or addPassengerName(), each performing a small, well-understood function. But our customers wanted bigger, more encompassing services that would "do ...
Aug. 31, 2015 04:00 PM EDT Reads: 264
SYS-CON Events announced today that DataClear Inc. will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. The DataClear ‘BlackBox’ is the only solution that moves your PC, browsing and data out of the United States and away from prying (and spying) eyes. Its solution automatically builds you a clean, on-demand, virus free, new virtual cloud based PC outside of the United States, and wipes it clean...
Aug. 31, 2015 01:45 PM EDT Reads: 420
Docker containerization is increasingly being used in production environments. How can these environments best be monitored? Monitoring Docker containers as if they are lightweight virtual machines (i.e., monitoring the host from within the container), with all the common metrics that can be captured from an operating system, is an insufficient approach. Docker containers can’t be treated as lightweight virtual machines; they must be treated as what they are: isolated processes running on hosts....
Aug. 31, 2015 01:00 PM EDT Reads: 155
Culture is the most important ingredient of DevOps. The challenge for most organizations is defining and communicating a vision of beneficial DevOps culture for their organizations, and then facilitating the changes needed to achieve that. Often this comes down to an ability to provide true leadership. As a CIO, are your direct reports IT managers or are they IT leaders? The hard truth is that many IT managers have risen through the ranks based on their technical skills, not their leadership ab...
Aug. 31, 2015 12:30 PM EDT Reads: 358
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
Aug. 31, 2015 11:30 AM EDT Reads: 896
Introducing Containers & Microservices Bootcamp at @CloudExpo Silicon Valley | #Containers #Microservices
SYS-CON Events announced today the Containers & Microservices Bootcamp, being held November 3-4, 2015, in conjunction with 17th Cloud Expo, @ThingsExpo, and @DevOpsSummit at the Santa Clara Convention Center in Santa Clara, CA. This is your chance to get started with the latest technology in the industry. Combined with real-world scenarios and use cases, the Containers and Microservices Bootcamp, led by Janakiram MSV, a Microsoft Regional Director, will include presentations as well as hands-on...
Aug. 31, 2015 10:45 AM EDT Reads: 345
The pricing of tools or licenses for log aggregation can have a significant effect on organizational culture and the collaboration between Dev and Ops teams. Modern tools for log aggregation (of which Logentries is one example) can be hugely enabling for DevOps approaches to building and operating business-critical software systems. However, the pricing of an aggregated logging solution can affect the adoption of modern logging techniques, as well as organizational capabilities and cross-team ...
Aug. 31, 2015 10:30 AM EDT Reads: 395
DevOps has traditionally played important roles in development and IT operations, but the practice is quickly becoming core to other business functions such as customer success, business intelligence, and marketing analytics. Modern marketers today are driven by data and rely on many different analytics tools. They need DevOps engineers in general and server log data specifically to do their jobs well. Here’s why: Server log files contain the only data that is completely full and accurate in th...
Aug. 31, 2015 10:15 AM EDT Reads: 396
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advance...
Aug. 31, 2015 10:15 AM EDT Reads: 305
In today's digital world, change is the one constant. Disruptive innovations like cloud, mobility, social media, and the Internet of Things have reshaped the market and set new standards in customer expectations. To remain competitive, businesses must tap the potential of emerging technologies and markets through the rapid release of new products and services. However, the rigid and siloed structures of traditional IT platforms and processes are slowing them down – resulting in lengthy delivery ...
Aug. 31, 2015 09:45 AM EDT Reads: 598
In his session at 17th Cloud Expo, Ernest Mueller, Product Manager at Idera, will explain the best practices and lessons learned for tracking and optimizing costs while delivering a cloud-hosted service. He will describe a DevOps approach where the applications and systems work together to track usage, model costs in a granular fashion, and make smart decisions at runtime to minimize costs. The trickier parts covered include triggering off the right metrics; balancing resilience and redundancy ...
Aug. 31, 2015 08:00 AM EDT Reads: 233
Whether you like it or not, DevOps is on track for a remarkable alliance with security. The SEC didn’t approve the merger. And your boss hasn’t heard anything about it. Yet, this unruly triumvirate will soon dominate and deliver DevSecOps faster, cheaper, better, and on an unprecedented scale. In his session at DevOps Summit, Frank Bunger, VP of Customer Success at ScriptRock, will discuss how this cathartic moment will propel the DevOps movement from such stuff as dreams are made on to a prac...
Aug. 31, 2015 04:00 AM EDT Reads: 231
It’s been proven time and time again that in tech, diversity drives greater innovation, better team productivity and greater profits and market share. So what can we do in our DevOps teams to embrace diversity and help transform the culture of development and operations into a true “DevOps” team? In her session at DevOps Summit, Stefana Muller, Director, Product Management – Continuous Delivery at CA Technologies, answered that question citing examples, showing how to create opportunities for ...
Aug. 31, 2015 03:00 AM EDT Reads: 485
What does “big enough” mean? It’s sometimes useful to argue by reductio ad absurdum. Hello, world doesn’t need to be broken down into smaller services. At the other extreme, building a monolithic enterprise resource planning (ERP) system is just asking for trouble: it’s too big, and it needs to be decomposed.
Aug. 29, 2015 10:00 AM EDT Reads: 352
The Microservices architectural pattern promises increased DevOps agility and can help enable continuous delivery of software. This session is for developers who are transforming existing applications to cloud-native applications, or creating new microservices style applications. In his session at DevOps Summit, Jim Bugwadia, CEO of Nirmata, will introduce best practices, patterns, challenges, and solutions for the development and operations of microservices style applications. He will discuss ...
Aug. 27, 2015 02:15 PM EDT Reads: 520