By Greg Schulz
August 1, 2013 11:15 AM EDT
In case you missed it, the State of Oregon recently had a data center computer problem (ok, a storage and application outage) that resulted in unemployment benefits not being provided. Tony Knotzer over at Network Computing did a story, "Oregon Storage Debacle Highlights Need To Plan For Failure," and asked me for some perspectives, which you can read in that story.
The reason I bring this incident up is not to join in the feeding frenzy that usually occurs when something like this happens, but instead to touch on what should be common. What is lacking at times (or more needed) is common sense when it comes to designing and managing flexible, scalable data infrastructures.
“Fundamental IT 101 is that all technology will fail, despite what the vendors tell you,” Schulz said. And the most likely time technology will fail, he notes, is when people are involved -- doing configurations, making changes or updates, or performing upgrades. - Via Network Computing
Note that while any technology can fail or has failed at some point, how it fails, along with fault containment via design best practices and vendor resolution, is important.
Good vendors learn and correct things so that they don't happen again, and work with customers on best practices to isolate and contain faults before they expand into disasters. Thus when a sales or marketing person tries to tell me that they have never had a failure, I wonder if they are a: making something up, b: not actually shipping to a customer in production, c: not aware of other deployments, d: toeing the company line, e: too good to be true or f: all the above.
On the other hand, when a vendor tells me how they have resiliency in their product as well as processes, best practices and can even tell me (public or under NDA) how they have addressed issues, then they have my attention.
A common challenge today is cost cutting along with focus on the newest technology from servers to storage, networking to cloud, virtualization and software defined among other buzzword bingo themes and trends.
What also gets overlooked, as mentioned above, is common sense.
Perhaps if somebody could package and launch a good public relations campaign profiling common sense such as Software Defined Common Sense (SDCS) that might help?
On the other hand, similar to public service announcements (PSA) that may seem like common sense to some, there is a reason they are being done. That is to pass on the information to others who may not know about it and thus lack what is perceived as common sense.
Let's get back to the State of Oregon's computer systems issues and the blame game.
You know the blame game? That is when something happens, or does not happen as you want it to, and you simply find somebody else to blame, or pivot and point a finger elsewhere.
While perhaps good for CYA, the blame game usually does not help to prevent something from happening again, or in the first place.
Hence in my comments about the state of Oregon computer storage system problems, I took the tone of what is common these days of no fault, shared responsibility and blame.
In other words, it does not matter who did or did not do what first; both sides could have prevented it.
For some this might sound like the rule that it does not matter who misbehaved in the sandbox or play room: everybody gets a time out.
This is not to say that one side or the other has to assume or take on more blame or responsibility than the other, rather there is a shared responsibility to look out for each other.
Just like when you drive a car, the education focus is on defensive, safe driving: watching out for what the other person might do or not do (e.g. not using turn signals, or being too busy to look in a mirror while talking or texting and driving, among other things). The goal is to prevent accidents by watching out for those who are not taking responsibility for themselves, not to mention learning from others' mishaps.
Working together vs. the blame game
Different views of customer vs. vendor
Having been a customer as well as a vendor in the past, not surprisingly I have some different views on this.
Sure the customer or client is always right, however sometimes there needs to be unpleasant conversations to help the customer help themselves, or keep themselves out of trouble.
Likewise a vendor may also take the blame when something does go wrong, even if it was not at all their fault, just to stay in good graces with the customer or get that next deal.
Sometimes a vendor deserves to get beat up when something goes wrong, or at least to tell their story, including if needed behind closed doors or under NDA. Likewise, to have a meaningful relationship or partnership with the vendor, supplier or VAR, there needs to be trust and confidence, which means not everything gets put out for media or blog venues to feed on.
Sure there is explaining what happened without spin, however there is also learning from mistakes to prevent them from happening again, which should be common sense. If part of that sharing of blame and responsibility needs to happen out of the public eye, that's fine, as long as enough information about what happened is conveyed to clarify concerns and create confidence.
With vendor lockin, when I was a customer some taught that it's the vendor's fault (or for CYA, blame them); as a vendor, the thinking was enforced that the customer is always right and it's the competition who causes lockin.
As an analyst doing advisory consulting, my thinking, not surprisingly, is that of shared responsibility.
This means only you can allow vendor lockin, not to mention decide if lockin is bad or not.
Likewise only you can prevent data loss in cloud, virtual or traditional environments which also includes loss of access.
Granted, somebody higher up the organization structure may override you, however ask yourself whether you did what was needed.
Likewise, if a vendor is going to be doing some maintenance work in the middle of the week, there is a risk of something happening, even if they have told or sold you on no single point of failure (NSPOF) or non-disruptive upgrades.
Anytime there is a person involved, regardless of whether it is with hardware, cables, software, firmware, configurations or physical environments, something can happen. If the vendor drops the ball or a cable or card or something else and causes an outage or downtime, it is their responsibility to discuss those issues. However it is also the customer's responsibility to discuss why they let the vendor do something during that time without taking adequate precautions. Likewise if the storage system was a single point of failure for an important system, then there is the responsibility to discuss the cost cutting concerns of others and have them justify why a redundant solution is not needed (that's CYA 101 btw).
Some other common sense tips
For some these might be familiar, and if so, are they being done? For others, perhaps they are new or revolutionary.
In the race to jump to a new technology or vendor, what are the unknowns? For example you may know what the issues or flaws are in an existing system, solution, product, service or vendor, however what about the new one? Will you be the production beta customer and if so, how can you mitigate any risk?
Ask vendors tough, yet fair questions that are relevant to your needs and requirements, including how they handle updates, upgrades and other tasks. Don't be afraid to go under NDA if needed to get a better view of where they are, where they have been and where they are going, to avoid surprises.
If this is not common IT sense, then take the responsibility to learn.
On the other hand, if this is common sense, take the responsibility to share and help others learn what it is that you know.
Also understand your availability needs and wants as well as balance those with costs along with risks. If something can go wrong it will if people are involved, thus design for resiliency including maintenance to offset applicable threat risks. Remember in the data center not everything is the same.
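One rough way to put numbers on the availability vs. cost conversation is the standard series/parallel availability arithmetic. The following is a minimal sketch, not a sizing tool: the function names are made up for illustration, the availability figures are hypothetical, and it assumes component failures are independent, which ignores the people-related and shared failure modes discussed above.

```python
HOURS_PER_YEAR = 24 * 365


def series_availability(*components):
    """Every component must be up, so the availabilities multiply."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


def parallel_availability(*components):
    """At least one redundant component must be up (assumes independent failures)."""
    all_down = 1.0
    for availability in components:
        all_down *= (1.0 - availability)
    return 1.0 - all_down


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    server = 0.9995
    storage = 0.999

    single = series_availability(server, storage)
    redundant = series_availability(server, parallel_availability(storage, storage))

    print(f"single storage system: {annual_downtime_hours(single):.2f} hours/year down")
    print(f"redundant storage:     {annual_downtime_hours(redundant):.2f} hours/year down")
```

Comparing the two downtime figures against the cost of the second storage system is one way to have those who want to cut costs justify why a redundant solution is not needed.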
Here is my point.
There is enough blame as well as accolades to go around, however take some shared responsibility and use it wisely.
Likewise in the race to cut cost, watch out for causing problems that compromise your information systems or services.
Look into removing complexity and costs without compromise which has long-term benefits vs. simply cutting costs.
Here are some related links and perspectives:
Don't Let Clouds Scare You Be Prepared
Cloud conversation, Thanks Gartner for saying what has been said
Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)
Make Your Company Ready for the Cloud
What do you do when your service provider drops the ball
People, Not Tech, Prevent IT Convergence
Pulling Together a Converged Team
Speaking of lockin, does software eliminate or move the location of vendor lock-in?
Ok, nuff said for now, what say you?
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO All Rights Reserved