Microservices Expo Authors: Liz McMillan, Pat Romanski, Carmen Gonzalez, Elizabeth White, Jason Bloomberg

Related Topics: Microservices Expo, Java IoT, @CloudExpo, Cloud Security

Microservices Expo: Blog Feed Post

Managing Risk in IT

There’s an old adage that says, if it ain’t broke, don’t fix it

The IT debacle at RBS has highlighted the dependency large financial organisations (and other companies) have on their IT infrastructure.  From what has leaked out into the press, the RBS issue relates to a piece of software called CA-7, used for mainframe batch job scheduling. When I first started in IT in 1987, CA-7 (and it’s sister product CA-1, used for tape management) were already legacy technology.  From memory, I believe CA acquired the products from another company; both had archaic configuration processes and poor documentation.  However they did work and were reasonably reliable.

If it Ain’t Broke…
There’s an old adage that says, if it ain’t broke, don’t fix it; meaning if the software works, why change it.  Any change inherently introduces risk; make no changes and you don’t introduce unnecessary risk.  However, IT infrastructure doesn’t run forever.  Change is necessary to accomodate new features & functionality and cope with growth.  Eventually vendors stop supporting certain versions of software and hardware as they entice and force you to upgrade and purchase new products.

The hardware risk profile is pretty well understood by most organisations.  As servers and storage for instance, get older then the cost of support increases as parts become more difficult to obtain (and more expensive).  There’s a tipping point where maintenance costs outweigh upgrade and new purchase and so justification can be made to replace old hardware.  There’s also a number of other factors involved for hardware, including space, power & cooling costs, all of which help create a reasonably mature TCO model which can be used as part of a technology refresh.

The Software Risk Profile
However, I’m not sure we can say the same for software upgrades.  Working out the risk profile for software is more complex.  Firstly, software has no equivalent of hardware parts replacement; software components don’t wear out.  Bugs do get discovered in code, however these usually get fixed with service packs and patches.

Going back to CA-7, this software originally ran in mainframe environments supporting perhaps hundreds or a few thousand batch jobs in an overnight schedule.  In an organisation like RBS, the software may be supporting tens if not hundreds of thousands of complex batch interactions.  These may have dependencies on platforms other than the mainframe, which make things even more complex.

It’s easy to see that too much risk had been concentrated into a single piece of infrastructure software, if a failed upgrade could result in such disastrous consequences.  When software becomes so complex, it’s likely that upgrades get deferred and deferred until the upgrade becomes critical.  Then a failed upgrade has massive consequences.

The risk of failure in this instance was clearly not understood.  The upgrade took place midweek to a system that seemed to cover the update of accounts to every customer in three banks.  With such a high risk profile, this change should have been scheduled for a quiet period such as a bank holiday.  The change and subsequent backout should have been covered by senior staff – The Register article implies junior staff were involved.

Finally, questions have to asked as to how a junior member of staff could delete the entire input queue updating millions of customer records, then requiring “manual” input.  This statement makes no sense or demonstrates huge flaws in RBS’ batch structure.

The Architect’s View
Software and application upgrades are complex and in large organisations that complexity can be one risk too many.  The desire to centralise to reduce costs shouldn’t be done at the expense of introducing excessive risk.  RBS (and probably many other financial organisations) need to reflect on their system designs and look to mitigate these kinds of scenarios.  From my own experience I know we could see another one of these incidents happen at any time.

Read the original blog entry...

Microservices Articles
Modern software design has fundamentally changed how we manage applications, causing many to turn to containers as the new virtual machine for resource management. As container adoption grows beyond stateless applications to stateful workloads, the need for persistent storage is foundational - something customers routinely cite as a top pain point. In his session at @DevOpsSummit at 21st Cloud Expo, Bill Borsari, Head of Systems Engineering at Datera, explored how organizations can reap the bene...
"NetApp's vision is how we help organizations manage data - delivering the right data in the right place, in the right time, to the people who need it, and doing it agnostic to what the platform is," explained Josh Atwell, Developer Advocate for NetApp, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
The Jevons Paradox suggests that when technological advances increase efficiency of a resource, it results in an overall increase in consumption. Writing on the increased use of coal as a result of technological improvements, 19th-century economist William Stanley Jevons found that these improvements led to the development of new ways to utilize coal. In his session at 19th Cloud Expo, Mark Thiele, Chief Strategy Officer for Apcera, compared the Jevons Paradox to modern-day enterprise IT, examin...
In his session at 20th Cloud Expo, Mike Johnston, an infrastructure engineer at Supergiant.io, discussed how to use Kubernetes to set up a SaaS infrastructure for your business. Mike Johnston is an infrastructure engineer at Supergiant.io with over 12 years of experience designing, deploying, and maintaining server and workstation infrastructure at all scales. He has experience with brick and mortar data centers as well as cloud providers like Digital Ocean, Amazon Web Services, and Rackspace. H...
Skeuomorphism usually means retaining existing design cues in something new that doesn’t actually need them. However, the concept of skeuomorphism can be thought of as relating more broadly to applying existing patterns to new technologies that, in fact, cry out for new approaches. In his session at DevOps Summit, Gordon Haff, Senior Cloud Strategy Marketing and Evangelism Manager at Red Hat, will discuss why containers should be paired with new architectural practices such as microservices ra...
In his session at 20th Cloud Expo, Scott Davis, CTO of Embotics, discussed how automation can provide the dynamic management required to cost-effectively deliver microservices and container solutions at scale. He also discussed how flexible automation is the key to effectively bridging and seamlessly coordinating both IT and developer needs for component orchestration across disparate clouds – an increasingly important requirement at today’s multi-cloud enterprise.
The Software Defined Data Center (SDDC), which enables organizations to seamlessly run in a hybrid cloud model (public + private cloud), is here to stay. IDC estimates that the software-defined networking market will be valued at $3.7 billion by 2016. Security is a key component and benefit of the SDDC, and offers an opportunity to build security 'from the ground up' and weave it into the environment from day one. In his session at 16th Cloud Expo, Reuven Harrison, CTO and Co-Founder of Tufin, ...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In their Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, and Mark Lav...
Many organizations are now looking to DevOps maturity models to gauge their DevOps adoption and compare their maturity to their peers. However, as enterprise organizations rush to adopt DevOps, moving past experimentation to embrace it at scale, they are in danger of falling into the trap that they have fallen into time and time again. Unfortunately, we've seen this movie before, and we know how it ends: badly.
TCP (Transmission Control Protocol) is a common and reliable transmission protocol on the Internet. TCP was introduced in the 70s by Stanford University for US Defense to establish connectivity between distributed systems to maintain a backup of defense information. At the time, TCP was introduced to communicate amongst a selected set of devices for a smaller dataset over shorter distances. As the Internet evolved, however, the number of applications and users, and the types of data accessed and...