Microservices Expo: Article

BCP Lessons Learned and New Ideas for IT Infrastructure Continuity

Learn How to Justify the Creation of Disaster Recovery Facilities

Businesses in the southeastern United States have been hit hard by hurricanes in recent years, and 2008 was no exception. As a project manager and CBCP for over 1,600 disaster recovery deployments, I can share real examples of how entire data centers were failed over to the DR operations center in preparation for hurricanes, while others, due to poor planning, did not have the same success. Those that succeeded were efficient in organizing the RTO of their communication servers, which helped them prioritize recovery efforts, and they used creative testing procedures that did not disrupt normal business activity. The first priority of a BCP is to ensure the safety of employees, but being able to communicate with those who are needed is also an important step in successfully executing a BCP. Because of this preparedness, many businesses I have heard from were able to proactively allow their employees to evacuate while still providing them remote access for business operations from almost anywhere. I will review a few examples of architecture, solutions, and best practices for exercising controls in those events, and discuss how future technology may be used to better justify the creation of disaster recovery facilities.

10 Professional Practices for BCP
There are ten professional practices for business continuity planning, all equally important; if followed appropriately, they will allow you to create a solid foundation to build upon. For the purpose of this article I will summarize the professional practices, but for more information visit the Disaster Recovery Institute International (www.drii.org). DRII is an excellent resource for BCP and is a consortium of business continuity professionals dedicated to setting industry standards and sharing knowledge around the practice of business continuity management.

The first step in building a BCP is Program Initiation and Management. This step is designed to establish executive approval, support, and justification for a resiliency program. Start by building a dedicated team that is committed to supporting the BCP initiative, selecting team members who can effectively manage the roles and responsibilities for their portion of the plan. Cost justification is often a hurdle in establishing the need for disaster recovery facilities, so one tip is to use your current assets, such as other offices or co-location facilities. You can also work with the IT department to tie the IT management budget into the BCP, so that you are providing not just continuity in the event of a disaster, but also high availability for day-to-day operational maintenance.

The next couple of steps are important in determining the risk (risk evaluation) your organization faces from natural or environmental disasters, and then determining the business impact (via a business impact analysis, or BIA) should one of those events occur. This will inform the business continuity strategy you design and implement to meet your defined recovery point objective (RPO) and recovery time objective (RTO). Once those objectives and controls are defined, you will need to integrate emergency response and operations in order to define the process by which a disaster is declared and what prompts the initiation of the BCP.
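To make the RPO/RTO objectives concrete, here is a minimal sketch, with entirely hypothetical system names and tiers, of how defined objectives can drive both the recovery order (tightest RTO first) and a check of whether a given replication interval satisfies each system's RPO:

```python
from datetime import timedelta

# Hypothetical tiers: each system's RTO/RPO targets drive both the
# replication schedule and the order in which systems are recovered.
systems = [
    {"name": "email-gateway", "rto": timedelta(hours=1),  "rpo": timedelta(minutes=15)},
    {"name": "patient-db",    "rto": timedelta(hours=2),  "rpo": timedelta(minutes=5)},
    {"name": "file-archive",  "rto": timedelta(hours=24), "rpo": timedelta(hours=12)},
]

def recovery_order(systems):
    """Recover the systems with the tightest RTO first."""
    return sorted(systems, key=lambda s: s["rto"])

def meets_rpo(system, replication_interval):
    """A replication interval longer than the RPO risks losing more
    data than the objective allows."""
    return replication_interval <= system["rpo"]

# With replication every 10 minutes, patient-db (5-minute RPO) fails
# the check and needs a tighter schedule.
for s in recovery_order(systems):
    print(s["name"], meets_rpo(s, timedelta(minutes=10)))
```

The point of the sketch is that the BIA output is not paperwork: it is data that can directly parameterize the recovery runbook.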

These previous steps are what allow you to design and implement a comprehensive strategy that meets your company's objectives. I have seen companies try to shortcut these steps and skip straight to implementing a solution, only to find that their infrastructure doesn't have enough power, bandwidth, resources, or executive approval to support the controls implemented. So the lesson learned is: don't take shortcuts and jump into something you have never done before. Following the previous steps will allow you to proceed and will likely prevent challenges during the deployment and execution of your plan.

The next three steps include designing and implementing the BCP, generating awareness and training your organization on what to do in the event of a disaster, and then exercising those plans regularly. Exercising the BCP is typically tied to your change control process, which means the plan should be reviewed any time there is a change within the organization that may affect it. (That can be anything from a software update on a business-critical server to a BCP member leaving the company.) Depending on the situation, exercises could take place as frequently as once a month, or at the very least two to three times per year, so that there is consistent awareness of the plan and procedures.

The last two practices, crisis communication and coordination with external agencies, are really the culmination of the previous practices and will ultimately determine the success or failure of your plan. In the event of a disaster, communication is critical to coordinating with emergency responders and your own business continuity team to make sure evacuations and safety procedures are implemented effectively.

When Planning and Exercising is Done Right
Planning is your best friend when it comes to rolling out controls for a business continuity solution. From executive buy-in through budget, infrastructure, process, procedures, testing, and ultimately execution, you can't plan enough, and when it's done right, deployments go smoothly. However, there is more than one way to go about this. As the saying goes, "eat the elephant one bite at a time." Breaking your overall rollout plan into smaller projects will help you manage the details as well as prioritize the order of the overall deployment. Here are some quotes from companies that did it right and were glad they had after Hurricane Ike made landfall:

  • “All is OK and thanks. Our files were mirrored to our Austin facility with no loss of data or applications. Winds tore a 30'x30' hole in the building roof. The water damage was bad. The computer servers were spared but a lot of workstations were soaked. Houston operations were running in Austin just before the hurricane hit and the transfer was seamless.”
  • “Thanks, our company is doing just fine. With our replicated data to one of our other locations, we were up and seeing patients once the patients could get to us. We appreciate your concern, and your overall support of our organization. On behalf of our organization, we want to say thank you!”
  • “Yes, we did make it out alive; we activated our business contingency plan and relocated to Dallas. Luckily our solution allowed us to fail over and business continued.”

Exercising the business continuity plan on a regular basis helped these companies not only be prepared but also be assured that they were ready for anything. And with the adoption of new technologies for IT infrastructure, those plans are even easier to exercise while minimizing impact to production operations. In previous years, testing a business continuity plan for the data center usually required shutting down the entire production facility and running through the restoration process. With the adoption of real-time replication software, co-location facilities, and virtualization, testing can be accomplished with minimal impact to the production environment. If you have a dedicated disaster recovery facility with hot standby servers, you can simply segment the networks from each other and bring the site online. However, you have to be very careful to make sure the two sites aren't talking to each other via domains or Active Directory services.

How Dynamic Infrastructure Is Being Used to Facilitate BCP Exercises

Dynamic Infrastructure is defined by some as ‘the ability to rapidly move and provision workloads with security and inherent protection.’ It may be a new idea to you, but it is being adopted within the IT community with great success. Dynamic Infrastructure not only simplifies disaster recovery procedures for data center managers, but also lets you use those same controls for day-to-day operations to keep your business available all the time, not just during disasters. With virtualization technologies saving costs on hardware, power, and cooling, data center management budgets can be combined with the BCP budget to maximize infrastructure availability. These technologies also assist BCP exercises by simulating recovery servers and sites without bringing down production servers. Some solutions, such as VMware® Site Recovery Manager, have this feature but also have inherent limitations. For instance, in the event of a real disaster the virtual solution doesn’t have any failback capability: once that process has been started, there is no turning back without a complete restoration, which could take days depending on the number of systems and/or volume of data to be restored. Dynamic Infrastructure provides the functionality others are missing, including rapid failback for smaller, “little d” disasters, which are more likely to impact a business-critical system.
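The failback distinction can be sketched in a few lines. This is a hypothetical model, not any vendor's actual API: the key idea is that failover which reverses the replication direction makes returning to production a controlled switch rather than a multi-day restoration:

```python
# Hypothetical sketch contrasting one-way failover with failover that
# tracks replication direction so that failback stays cheap.
class ReplicatedService:
    def __init__(self, name):
        self.name = name
        self.active_site = "production"
        # Continuous replication normally flows production -> DR.
        self.replication_direction = ("production", "dr")

    def failover(self):
        """Promote the DR copy and reverse replication, so changes made
        at the DR site flow back while production is being repaired."""
        self.active_site = "dr"
        self.replication_direction = ("dr", "production")

    def failback(self):
        """Because changes were replicated back, returning to production
        is a controlled switch, not a full restoration."""
        assert self.replication_direction == ("dr", "production")
        self.active_site = "production"
        self.replication_direction = ("production", "dr")

svc = ReplicatedService("order-entry")   # hypothetical system name
svc.failover()   # a "little d" disaster moves one critical system to DR
svc.failback()   # rapid failback once production is healthy again
print(svc.active_site)  # production
```

A solution that never reverses replication has no cheap path back, which is exactly the restoration problem described above.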

The Next Generation of BCP
With future technology delivering Dynamic Infrastructure, cloud computing, and mobile communication devices, learning how they can protect IT infrastructure for business continuity planning has never been more important. Many management services now offer remote or mobile access for initiating some of these data center management functions. Imagine if you could initiate a failover of a server from your iPhone or BlackBerry®. The reality is that it isn't very far off. It's possible that many business-critical services could be run via cloud computing so that services are available anywhere they are needed, even if there is a disaster at the production facility.

However, this raises the question: who is protecting the cloud, and what is their business continuity plan?

More Stories By Brace Rennels

Brace Rennels is a passionate and experienced interactive marketing professional who thrives on building high-energy marketing teams to drive global web strategies, SEO, social media, and online PR. Recognized as an early adopter of technology, he applies new techniques to creative marketing that drive brand awareness, lead generation, and revenue. As a Senior Manager of Global Website Strategies, his responsibilities included developing and launching global social media, SEO, and web marketing initiatives and strategy. He is recognized for applying innovative solutions to unique problems and managing business relationships to accomplish enterprise objectives. An accomplished writer, blogger, and author, he has covered marketing, social media, and technical subjects such as industry trends, cloud computing, virtualization, website marketing, disaster recovery, and business continuity for publications including CIO.com, Enterprise Storage Journal, TechNewsWorld, Sys-Con, eWeek, and Peer to Peer Magazine. Follow more of Brace's writing on his blog: http://bracerennels.com
