Containers Expo Blog: Article

Visible Ops' Success Leads to a Vblock

Why the key to a successful Visible Ops framework is a Vblock

Love it or hate it, ITIL and Change Management will always be an integral part of any IT setup, with regulations such as Basel II, FISMA, SOX (Sarbanes-Oxley) and HIPAA constantly breathing down the neck and conscience of organization leaders. Having once had a "purple badge" wearing ITIL guru for a manager, I was always fascinated by how he'd advocate the framework as the solution to all our IT problems. While he'd harp on about defining repeatable and verifiable IT processes, it always ended up being theoretical as opposed to practical, a point often underlined by his own IT competency: "Err, Archie, how do I save this Word document and what on Earth is that SAN thing you keep going on about?"

That distinction between theory and practice was never more apparent than in the almost pointless CAB (Change Advisory Board) meetings that took place on a weekly basis. While the Change Management processes themselves were painfully bureaucratic and often a diversion from doing actual operational work, the CAB meetings were a surreal experience. With barely anyone in attendance, the CAB would ask for a justification for each change and respond with "approved" or "rejected", when it was clear that they had little or no grasp of the technical explanation or implications given to them.

Then there was the Security/Risk Compliance chap who'd lock himself in his room, glued to his Tripwire dashboard, carefully spying on any unapproved changes. Such was his fascination with Tripwire that he too barely attended the CAB meetings, indirectly emphasizing both his lack of trust in the Change Management system and its lack of relevance. So imagine his amazement when I introduced him to a new product we had implemented for our WINTEL environment called VMware and its feature VMotion. The fact that I had been seamlessly migrating VMs across physical servers without raising a change, and without him being able to pick it up on Tripwire, sent him into a perplexed frenzy. Somewhat amused by his constant head shaking, I decided to disclose that I had also been seamlessly migrating LUNs across different RAID Groups with HDS' Cruise Control to get more spindles working, upon which, like Batman, he rushed back to his cave to check whether "Big Brother" Tripwire had picked it up. Was I really supposed to raise a change for every VMotion or LUN migration?

Several years later, after moving from being a customer to a technical consultant, my impression of the effectiveness of the CAB failed to improve. Midweek and late in the day, in a customer's data center with their SAN Architect, I pointed out that they had cabled up the wrong ports in their SAN switches and that this would require a change to be raised. "No need for that," replied the SAN architect, "I'm one of the CAB members." He then, to my shock and in true Del Boy fashion, duly proceeded to pull out and swap the FC cables to his production hosts with a big grin on his face. Several minutes later his phone rang, to which he replied, "It's okay, I've resolved it. There was a power failure on some servers." Then with a cheeky grin, a swing of the head and a wink of an eye, he turned to me and said, "There you go, sorted, lubbly jubbly!"

While my initial skepticism about ITIL's practicality was centered on my personal experiences, it was only reinforced by the number of long white bearded external auditors who would supposedly check whether proper controls existed within the many firefighting and cowboy organizational procedures I witnessed. Like a classroom of kids hearing the teacher coming up the corridor and scurrying to their desks to present a fabricated impression of discipline and order, our compliance folk never ceased to astound me with their last-minute changes and running around to ensure we successfully passed our audits. Despite having more daily Priority 1s than the canteen was serving decent hot meals, we still inexplicably passed every audit with flying colours, which in turn emboldened the rogue "under the radar" operational practices that served to keep the lights on.

So with such a tarnished experience of ITIL, it was with great curiosity and interest that I looked closer at the movement and initiative of ITPI's Visible Ops. While still mapping its ideas to ITIL terminology, Visible Ops places its emphasis on increasing service levels, decreasing costs and increasing security and auditability. In simplest terms, Visible Ops is a fast-track, jumpstart exercise towards an efficient operating model that replicates the researched processes of high-performing organizations in just four steps.

To summarize, the first of these four steps is what is termed Phase 1 or "Stabilize the Patient". With the understanding that almost 80% of outages are self-inflicted, any change outside of scheduled maintenance windows is quickly frozen. It then becomes mandatory for problem managers to have any change-related information at hand so that when that 80% of "unplanned work" arises, a full understanding of the root cause is quickly established. This phase starts with the systems and business processes that are responsible for the greatest amount of firefighting, with the aim that once they are resolved they free up work cycles to initiate a more secure and measured route for change.

Phase 2, which is termed "Catch & Release" and "Find Fragile Artifacts", is related to the infrastructure itself with the understanding that it cannot be repeatedly replicated. With an emphasis on gaining an accurate inventory of assets, configurations and services, the objective is to identify the "artifacts" with the lowest change success rates, highest MTTR and highest business downtime costs. By capturing all these assets, what they're running, the services that depend upon them and those responsible for them, an organization ends up in a far more secure position prior to a Priority 1 firefighting session.
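To make the idea of "finding fragile artifacts" a little more concrete, here is a minimal sketch of how an inventory might be ranked. Everything in it is an assumption for illustration only: the `Artifact` fields, the sample data and in particular the `fragility` score (failure rate multiplied by MTTR and hourly downtime cost) are my own simplification and are not prescribed by Visible Ops.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """One inventoried asset or service (hypothetical structure)."""
    name: str
    changes_attempted: int
    changes_succeeded: int
    mttr_hours: float              # mean time to repair, in hours
    downtime_cost_per_hour: float  # business cost of an hour of downtime

    @property
    def change_success_rate(self) -> float:
        # An artifact that has never been changed is treated as fully successful.
        if self.changes_attempted == 0:
            return 1.0
        return self.changes_succeeded / self.changes_attempted

    @property
    def fragility(self) -> float:
        # Illustrative score: rough expected cost of a failed change.
        failure_rate = 1.0 - self.change_success_rate
        return failure_rate * self.mttr_hours * self.downtime_cost_per_hour


def rank_fragile(artifacts):
    """Return artifacts ordered from most to least fragile."""
    return sorted(artifacts, key=lambda a: a.fragility, reverse=True)


# Made-up sample inventory.
inventory = [
    Artifact("intranet-web", 50, 48, 1.0, 200.0),
    Artifact("billing-db", 20, 12, 6.0, 5000.0),
    Artifact("san-fabric-a", 10, 7, 4.0, 8000.0),
]

for artifact in rank_fragile(inventory):
    print(artifact.name, round(artifact.fragility, 2))
```

The artifacts at the top of such a list are the ones to capture and protect first, since they combine a poor change success rate with a long repair time and expensive downtime.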

Phase 3 or "Establish Repeatable Build Library" is focused on implementing an effective release management process. Using the previous phases as a stepping stone, this phase documents repeatable builds of the most critical assets and services, making it more cost effective to rebuild them than to repair them. In a process that leads to the efficient mass production of standardized builds, senior IT operations staff can move from a reactive to a proactive release management delivery model. This is achieved by operating early in the IT operations lifecycle, consistently working on software and integration releases prior to their deployment into production environments. At the same time, a reduction in unique production configurations is pushed for, increasing the lifespan of configurations prior to their replacement or change, which in turn leads to an improvement in manageability and a reduction in complexity. Eventually the output of these repeatable builds is a set of "golden" images that have been tried, tested, planned and approved prior to production. Therefore when new applications, patches and upgrades are released for integration, these golden builds or images merely need updating.

The fourth and last phase, entitled "Enable Continuous Improvement", is pretty self-explanatory in that it deals with building a closed loop between the release, control and resolution processes. With the previous three phases complete, the focus shifts to metrics for the three key process areas (release, controls and resolution), specifically those that can facilitate quick decision making and provide accurate indicators of the work and its success in relation to the operational process. Drawing on ITIL's resolution process metrics of Mean Time Between Failures (MTBF) and Mean Time to Repair (MTTR), this phase looks at Release by measuring how efficiently and effectively infrastructure is provisioned. Controls are measured by how effectively the change decisions that are made keep production infrastructure available, predictable and secure, while Resolution is quantified by how effectively issues are identified and resolved.
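For readers who have not had to calculate these resolution metrics by hand, the following is a minimal sketch using the common definitions: MTTR as the average outage duration, MTBF as total uptime divided by the number of failures, and availability as MTBF / (MTBF + MTTR). The incident timestamps and the 30-day reporting window are invented for illustration.

```python
from datetime import datetime, timedelta


def total_downtime(incidents):
    """Sum the duration of every (start, end) outage."""
    return sum((end - start for start, end in incidents), timedelta())


def mttr(incidents):
    """Mean Time to Repair: average outage duration."""
    return total_downtime(incidents) / len(incidents)


def mtbf(incidents, period_start, period_end):
    """Mean Time Between Failures: uptime divided by number of failures."""
    uptime = (period_end - period_start) - total_downtime(incidents)
    return uptime / len(incidents)


def availability(incidents, period_start, period_end):
    """Fraction of the period the service was up: MTBF / (MTBF + MTTR)."""
    between = mtbf(incidents, period_start, period_end)
    repair = mttr(incidents)
    return between / (between + repair)


# Invented reporting window and outage log (start, end).
period_start = datetime(2024, 1, 1)
period_end = period_start + timedelta(days=30)
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 12, 0)),
    (datetime(2024, 1, 14, 9, 0), datetime(2024, 1, 14, 10, 0)),
    (datetime(2024, 1, 25, 22, 0), datetime(2024, 1, 26, 1, 0)),
]

print("MTTR:", mttr(incidents))
print("MTBF:", mtbf(incidents, period_start, period_end))
print("Availability:", availability(incidents, period_start, period_end))
```

Tracking these three figures per service over successive reporting periods is one simple way to see whether the closed loop is actually improving anything.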

So while these four concise and particular phases look great on paper, what really differentiates them from potentially just being another theoretical process that fails to be delivered comprehensively in practical reality? If the manner in which IT is procured, designed, configured, validated and implemented remains the same, there is little if any chance for Visible Ops to succeed any further than the Purple Badge lovers of ITIL. But what if the approach to IT, and more specifically its infrastructure, was to change from the traditional "buy your own, bolt it together and pray that it works" method to a more sustainable and predictable model? What if the approach to infrastructure was one of a green fields approach or seamless migration to a pretested, pre-validated, pre-integrated, prebuilt and preconfigured product, i.e. a true Converged Infrastructure? What impact could that possibly have on the success of Visible Ops and the aforementioned four phases?

If we look at phase 1 and "stabilizing the patient", this can be immediately achieved with a Vblock, where an organisation no longer has to spend time investigating and worrying about the risk and impact of change. By having a standardized, product-based approach as opposed to a bunch of components bundled together, thousands of hours of QA testing and analysis work can be performed by VCE for each new patch, firmware upgrade or update on a like-for-like product to the one owned by the customer. With this acting as the premise of a semi-annual release certification matrix that updates all of the components of the Converged Infrastructure as a comprehensive whole, risks typically associated with the change process are eliminated. Furthermore, as changes are dictated by this pre-tested and pre-validated process and need to adhere to this release certification matrix to remain within support, it helps eradicate any rogue changes as well as inform problem managers comprehensively of the necessary changes and updates. Ultimately phase 1's objective of stabilization is immediately achieved via the risk mitigation that comes with implementing a pre-engineered, pre-defined and pre-tested upgrade path.
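As a rough illustration of what adhering to a release certification matrix means in practice, here is a small sketch that compares the versions installed on a system against a certified release and reports any drift. The component names, version strings and the `find_drift` helper are all hypothetical; this is not VCE's actual matrix format or tooling.

```python
# Hypothetical certified release: component name -> required version.
CERTIFIED_RELEASE = {
    "compute-firmware": "2.1.3",
    "fabric-switch-os": "6.2.9",
    "storage-array-code": "5.32.0",
    "hypervisor": "5.5.2",
}


def find_drift(installed, certified):
    """Return {component: (installed_version, certified_version)} for every
    component that is missing or does not match the certified release."""
    drift = {}
    for component, required in certified.items():
        actual = installed.get(component)  # None if the component is absent
        if actual != required:
            drift[component] = (actual, required)
    return drift


# Made-up snapshot of what is actually running.
installed = {
    "compute-firmware": "2.1.3",
    "fabric-switch-os": "6.2.7",   # someone skipped an upgrade
    "storage-array-code": "5.32.0",
    # "hypervisor" is not reported at all
}

drift = find_drift(installed, CERTIFIED_RELEASE)
for component, (actual, required) in drift.items():
    print(f"{component}: installed={actual}, certified={required}")
```

An empty result would mean the system matches the certified release; anything else is exactly the kind of rogue or missed change a problem manager would want to know about before the next Priority 1.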

The challenge of phase 2, which in essence equates to an eventual full inventory of the infrastructure, is a painful process at the best of times, especially as new kit from various vendors is constantly being purchased and bolted on to existing kit. Moving to a Vblock simplifies this challenge as it's a single product and hence a single SKU at procurement. Akin to purchasing an Apple MacBook that is made up of many components, e.g. a hard drive, processor, CD-ROM etc., the Converged Infrastructure's components are formulated as a whole to provide the customer with a product. The parts of the product and all of their details are known to the manufacturer, i.e. VCE, and can easily be transferred to the customer as a single bill of materials with serial numbers etc., thus ensuring an up-to-date and accurate inventory and consequently a simplified asset management process. When patches, upgrades and additions of new parts and components are required, they are automatically added to the inventory list of the single product, keeping asset management up to date.

The Release Management requirement of Phase 3 offers a challenge that is not only fraught with risk but also takes up a significant amount of staff and management time to ensure that technology and infrastructure remain up to date. This entails the rigmarole of downloading, testing and resolving interoperability issues of component patches and releases, and relies heavily on information sharing between silos as well as the success of regression tests. The unique approach of a Vblock meets this challenge immediately by making pre-tested, validated software and firmware upgrades available to the end user, enabling them to locate releases that are applicable to their Converged Infrastructure system. With regard to the rebuild as opposed to repair approach stipulated in phase 3, because a Vblock can be deployed and up and running in only 30 days, having a like-for-like standardized infrastructure for new and upcoming projects is far easier than with the usual build-it-yourself infrastructure model. On a more granular level, by having a management and orchestration stack with a self-service portal, golden image VMs can be immediately deployed with a billing and chargeback model as well as integration with a CMDB. The result is a quick and successful attainment of phase 3 of the Visible Ops model via a unified release and configuration management methodology that is highly predictable and enhances availability by reducing interoperability issues.

Measuring the success of metrics such as MTTR and MTBF as detailed in Phase 4 is ultimately linked to the success of the monitoring and support model that's in place for your infrastructure. With a product-based approach to infrastructure, the support model will also be better equipped to ensure continuous improvement. Having an escalation response process that is based on a product, regardless of whether resolving a problem requires consultation with multiple experts or component teams, ultimately means a seamless and single point of contact for all issues. This end-to-end accountability for an infrastructure's support, maintenance and warranty makes the tracking of issue resolution and availability much simpler to measure and monitor. Furthermore, with open APIs that enable integration with comprehensive monitoring and management software platforms, the Converged Infrastructure can be monitored for utilization, performance and capacity management as well as potential issues that can be flagged proactively to support.

As IT operational efficiency becomes more of an imperative for businesses across the globe, the theoretical practices that have failed to deliver are either being assessed, questioned or in some cases continued with. What is often being overlooked is that one of the key and inherent problems is the traditional approach to building and managing IT infrastructure. Even a radical and well researched approach and framework such as Visible Ops will eventually suffer, and at worst fail to succeed, if the IT infrastructure that the framework is based on was built by the same mode of thinking that created the problems. Fundamentally, whether the Visible Ops model is a serious consideration for your environment or not, by adopting the framework with a Vblock, the ability to stabilize, standardize and optimise your IT infrastructure and its delivery of services to the business becomes a lot more practical and consequently a lot less theoretical.

More Stories By Archie Hendryx

SAN, NAS, Back Up / Recovery & Virtualisation Specialist.
