|By Greg Schulz||
|August 1, 2013 11:15 AM EDT||
In case you missed it, recently the State of Oregon had a data center computer problem (ok, storage and application outage) that resulted in unemployment benefits not being provided. Tony Knotzer over at Network Computing did a story Oregon Storage Debacle Highlights Need To Plan For Failure and asked me for some perspectives that you can read here.
The reason I bring this incident up is not to join in the feeding frenzy that usually occurs when something like this happens, instead, to touch on what should be common. What is lacking at times (or more needed) is common sense when it comes to designing and managing flexible scalable data infrastructures.
|“Fundamental IT 101 is that all technology will fail, despite what the vendors tell you,” Schulz said. And the most likely time technology will fail, he notes, is when people are involved -- doing configurations, making changes or updates, or performing upgrades. - Via Network Computing|
Note that while any technology can or has fail at some point, how it fails along with fault containment via design best practices and vendor resolution are important.
Good vendors learn and correct things so that they don't happen again as well as work with customers on best practices to isolate and contain faults from expanding into disasters. Thus when a sales or marketing person tries to tell me that they have never had a failure I wonder if a: they are making something up, b: have not actually shipped to a customer in production, c: not aware of other deployments, d: towing the company line, e: too good to be true or f: all the above.
On the other hand, when a vendor tells me how they have resiliency in their product as well as processes, best practices and can even tell me (public or under NDA) how they have addressed issues, then they have my attention.
A common challenge today is cost cutting along with focus on the newest technology from servers to storage, networking to cloud, virtualization and software defined among other buzzword bingo themes and trends.
What also gets overlooked as mentioned above is common sense.
Perhaps if somebody could package and launch a good public relations campaign profiling common sense such as Software Defined Common Sense (SDCS) that might help?
On the other hand, similar to public service announcements (PSA) that may seem like common sense to some, there is a reason they are being done. That is to pass on the information to others who may not know about it thus lack what is perceived as common sense.
Lets get back to the state of Oregon's computer systems issues and the blame game.
You know the blame game? That is when something happens or does not happen as you want it to simply find somebody else to blame or pivot and point a finger elsewhere.
While perhaps good for CYA, the blame games usually does not help to prevent something happening again, or in the first place.
Hence in my comments about the state of Oregon computer storage system problems, I took the tone of what is common these days of no fault, shared responsibility and blame.
In other words does not matter who did what first or did not do, both sides could have prevented it.
For some this might resonate of it does not matter who misbehaved in the sandbox or play room, everybody gets a time out.
This is not to say that one side or the other has to assume or take on more blame or responsibility than the other, rather there is a shared responsibility to look out for each other.
Just like when you drive a car, the education focus is on defensive safe driving to watch out for what the other person might do or not do (e.g. use turn signals or too busy to look in a mirror while talking or texting and driving among other things). The goal is to prevent accidents by watching out for those who are not taking responsibilities for themselves, not to mention learning from others mishaps.
Working together vs. the blame game
Different views of customer vs. vendor
Having been a customer, as well as a vendor in the past not surprisingly I have some different views on this.
Sure the customer or client is always right, however sometimes there needs to be unpleasant conversations to help the customer help themselves, or keep themselves out of trouble.
Likewise a vendor may also take the blame when something does go wrong, even if it was entirely not their own fault just to stay in good graces with the customer or get that next deal.
Sometimes a vendor deserves to get beat up when something goes wrong, or at a least tell their story including if needed behind closed doors or under NDA. Likewise to have a meaningful relationship or partnership with the vendor, supplier or VAR, there needs to be trust and confidence which means not everything gets put out for media or blog venues to feed on.
Sure there is explaining what happened without spin, however there is also learning from mistakes to prevent them from happening which should be common sense. If part of that sharing of blame and responsibility requires being not in public that's fine, as well as enough information of what happened is conveyed to clarify concerns and create confidence.
With vendor lockin, when I was a customer some taught that it's the vendors fault (or for CYA, blame them), as a vendor the thinking was enforced that the customer is always right and its the competition who causes lockin.
As an analyst advisory consulting, my thinking not surprisingly is that of shared responsibility.
This means only you can allow vendor lockin, not to mention decide if lockin is bad or not.
Likewise only you can prevent data loss in cloud, virtual or traditional environments which also includes loss of access.
Granted somebody higher up the organization structure may over-ride you, however ask yourself if you did what was needed?
Likewise if a vendor is going to be doing some maintenance work in the middle of the week and there is a risk of something happening, even if they have told or sold you there is no single point of failure (NSPOF), or non disruptive upgrades.
Anytime there is a person involved regardless of if hardware, cables, software, firmware, configurations or physical environments something can happen. If the vendor drops the ball or a cable or card or something else and causes an outage or downtime, it is their responsibility to discuss those issues. However it is also the customers responsibility to discuss why they let the vendor do something during that time without taking adequate precautions. Likewise if the storage system was a single point of failure for an important system, then there is the responsibility to discuss the cost cutting concerns of others and have them justify why a redundant solution is not needed (that's CYA 101 btw ).
Some other common sense tips
For some these might be familiar and if so, are they being done, and for others, perhaps they are new or revolutionary.
In the race to jump to a new technology or vendor, what are the unknowns? For example you may know what the issues or flaws are in an existing systems, solution, product, service or vendor, however what about the new one? Will you be the production beta customer and if so, how can you mitigate any risk?
Ask vendors tough, yet fair questions that are relevant to your needs and requirements including how they handle updates, upgrades and other tasks. Don't be afraid to go under NDA if needed to get a better view of where they are at, have been and going to avoid surprises.
If this is not common IT sense, then take the responsibility to learn.
On the other hand, if this is common sense, take the responsibility to share and help others learn what it is that you know.
Also understand your availability needs and wants as well as balance those with costs along with risks. If something can go wrong it will if people are involved, thus design for resiliency including maintenance to offset applicable threat risks. Remember in the data center not everything is the same.
Here is my point.
There is enough blame as well as accolades to go around, however take some shared responsibility and use it wisely.
Likewise in the race to cut cost, watch out for causing problems that compromise your information systems or services.
Look into removing complexity and costs without compromise which has long-term benefits vs. simply cutting costs.
Here are some related links and perspectives:
Don't Let Clouds Scare You Be Prepared
Cloud conversation, Thanks Gartner for saying what has been said
Cloud conversations: Gaining cloud confidence from insights into AWS outages (Part II)
Make Your Company Ready for the Cloud
What do you do when your service provider drops the ball
People, Not Tech, Prevent IT Convergence
Pulling Together a Converged Team
Speaking of lockin, does software eliminate or move the location of vendor lock-in?
Ok, nuff said for now, what say you?
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO All Rights Reserved
Adding public cloud resources to an existing application can be a daunting process. The tools that you currently use to manage the software and hardware outside the cloud aren’t always the best tools to efficiently grow into the cloud. All of the major configuration management tools have cloud orchestration plugins that can be leveraged, but there are also cloud-native tools that can dramatically improve the efficiency of managing your application lifecycle. In his session at 18th Cloud Expo, ...
Jul. 23, 2016 08:30 AM EDT Reads: 629
SYS-CON Events announced today that Venafi, the Immune System for the Internet™ and the leading provider of Next Generation Trust Protection, will exhibit at @DevOpsSummit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Venafi is the Immune System for the Internet™ that protects the foundation of all cybersecurity – cryptographic keys and digital certificates – so they can’t be misused by bad guys in attacks...
Jul. 23, 2016 07:45 AM EDT Reads: 1,038
Ovum, a leading technology analyst firm, has published an in-depth report, Ovum Decision Matrix: Selecting a DevOps Release Management Solution, 2016–17. The report focuses on the automation aspects of DevOps, Release Management and compares solutions from the leading vendors.
Jul. 23, 2016 06:00 AM EDT Reads: 1,565
SYS-CON Events has announced today that Roger Strukhoff has been named conference chair of Cloud Expo and @ThingsExpo 2016 Silicon Valley. The 19th Cloud Expo and 6th @ThingsExpo will take place on November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. "The Internet of Things brings trillions of dollars of opportunity to developers and enterprise IT, no matter how you measure it," stated Roger Strukhoff. "More importantly, it leverages the power of devices and the Interne...
Jul. 23, 2016 04:30 AM EDT Reads: 1,929
This is a no-hype, pragmatic post about why I think you should consider architecting your next project the way SOA and/or microservices suggest. No matter if it’s a greenfield approach or if you’re in dire need of refactoring. Please note: considering still keeps open the option of not taking that approach. After reading this, you will have a better idea about whether building multiple small components instead of a single, large component makes sense for your project. This post assumes that you...
Jul. 23, 2016 03:15 AM EDT Reads: 3,783
Sharding has become a popular means of achieving scalability in application architectures in which read/write data separation is not only possible, but desirable to achieve new heights of concurrency. The premise is that by splitting up read and write duties, it is possible to get better overall performance at the cost of a slight delay in consistency. That is, it takes a bit of time to replicate changes initiated by a "write" to the read-only master database. It's eventually consistent, and it'...
Jul. 23, 2016 12:45 AM EDT Reads: 1,997
Before becoming a developer, I was in the high school band. I played several brass instruments - including French horn and cornet - as well as keyboards in the jazz stage band. A musician and a nerd, what can I say? I even dabbled in writing music for the band. Okay, mostly I wrote arrangements of pop music, so the band could keep the crowd entertained during Friday night football games. What struck me then was that, to write parts for all the instruments - brass, woodwind, percussion, even k...
Jul. 23, 2016 12:15 AM EDT Reads: 1,878
SYS-CON Events announced today that Isomorphic Software will exhibit at DevOps Summit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Isomorphic Software provides the SmartClient HTML5/AJAX platform, the most advanced technology for building rich, cutting-edge enterprise web applications for desktop and mobile. SmartClient combines the productivity and performance of traditional desktop software with the simp...
Jul. 22, 2016 11:45 PM EDT Reads: 681
SYS-CON Events announced today that LeaseWeb USA, a cloud Infrastructure-as-a-Service (IaaS) provider, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. LeaseWeb is one of the world's largest hosting brands. The company helps customers define, develop and deploy IT infrastructure tailored to their exact business needs, by combining various kinds cloud solutions.
Jul. 22, 2016 11:15 PM EDT Reads: 978
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
Jul. 22, 2016 10:30 PM EDT Reads: 2,068
When people aren’t talking about VMs and containers, they’re talking about serverless architecture. Serverless is about no maintenance. It means you are not worried about low-level infrastructural and operational details. An event-driven serverless platform is a great use case for IoT. In his session at @ThingsExpo, Animesh Singh, an STSM and Lead for IBM Cloud Platform and Infrastructure, will detail how to build a distributed serverless, polyglot, microservices framework using open source tec...
Jul. 22, 2016 10:00 PM EDT Reads: 2,187
The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportuni...
Jul. 22, 2016 09:15 PM EDT Reads: 2,385
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform and how we integrate our thinking to solve complicated problems. In his session at 19th Cloud Expo, Craig Sproule, CEO of Metavine, will demonstrate how to move beyond today's coding paradigm ...
Jul. 22, 2016 08:45 PM EDT Reads: 2,080
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
Jul. 22, 2016 07:00 PM EDT Reads: 2,389
There's a lot of things we do to improve the performance of web and mobile applications. We use caching. We use compression. We offload security (SSL and TLS) to a proxy with greater compute capacity. We apply image optimization and minification to content. We do all that because performance is king. Failure to perform can be, for many businesses, equivalent to an outage with increased abandonment rates and angry customers taking to the Internet to express their extreme displeasure.
Jul. 22, 2016 04:30 PM EDT Reads: 1,265
In his session at @DevOpsSummit at 19th Cloud Expo, Yoseph Reuveni, Director of Software Engineering at Jet.com, will discuss Jet.com's journey into containerizing Microsoft-based technologies like C# and F# into Docker. He will talk about lessons learned and challenges faced, the Mono framework tryout and how they deployed everything into Azure cloud. Yoseph Reuveni is a technology leader with unique experience developing and running high throughput (over 1M tps) distributed systems with extre...
Jul. 22, 2016 04:15 PM EDT Reads: 1,941
"We provide DevOps solutions. We also partner with some key players in the DevOps space and we use the technology that we partner with to engineer custom solutions for different organizations," stated Himanshu Chhetri, CTO of Addteq, in this SYS-CON.tv interview at DevOps at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 22, 2016 08:15 AM EDT Reads: 1,500
Keeping pace with advancements in software delivery processes and tooling is taxing even for the most proficient organizations. Point tools, platforms, open source and the increasing adoption of private and public cloud services requires strong engineering rigor – all in the face of developer demands to use the tools of choice. As Agile has settled in as a mainstream practice, now DevOps has emerged as the next wave to improve software delivery speed and output. To make DevOps work, organization...
Jul. 22, 2016 04:45 AM EDT Reads: 2,060
Let's just nip the conflation of these terms in the bud, shall we?
"MIcro" is big these days. Both microservices and microsegmentation are having and will continue to have an impact on data center architecture, but not necessarily for the same reasons. There's a growing trend in which folks - particularly those with a network background - conflate the two and use them to mean the same thing.
They are not.
One is about the application. The other, the network. T...
Jul. 22, 2016 04:00 AM EDT Reads: 3,242
Right off the bat, Newman advises that we should "think of microservices as a specific approach for SOA in the same way that XP or Scrum are specific approaches for Agile Software development". These analogies are very interesting because my expectation was that microservices is a pattern. So I might infer that microservices is a set of process techniques as opposed to an architectural approach. Yet in the book, Newman clearly includes some elements of concept model and architecture as well as p...
Jul. 22, 2016 03:45 AM EDT Reads: 9,277