Microservices Expo: Article

Establishing Enterprise Monitoring Baselines

Our technology dependent lives

Enterprise monitoring gets a great deal more air time these days than it ever did in the past. Perhaps it's because our technology-dependent lives have become so reliant on the availability of systems and infrastructural services. Have things improved? How would you know?

In reality, monitoring systems are nothing particularly new. Consider the pressure valve on a steam boiler. At the most rudimentary level, the purpose of the valve is to release pressure, and you decide whether or not to release pressure by observing the gauge that indicates the boiler's pressure. Steam engines have been around for hundreds of years, and the gauges that monitor them probably almost as long. The thermostat on your home heating system, the temperature gauge on your car's engine, the battery life monitor on your phone: they're all monitors. Perhaps not "enterprisey", but you get the idea.

Most of us will no doubt have seen one or more of the many Hollywood blockbusters that feature some drama involving gauges and monitors. The recent Fukushima Daiichi nuclear disaster was of particular importance globally because of the seriousness of the events and because, as the disaster unfolded following the tsunami, the news was relayed across international networks. More important here is that without some sort of monitoring, it would have been impossible to comment on the significance of the reactor temperatures and other factors until the fires and explosions had already occurred. Testing the air, water and milk quality and the general radioactivity of the community also represented some level of monitoring. But that monitoring was only really meaningful when measured against some sort of yardstick - a baseline.

So I think we can accept that monitoring is a ubiquitous worldwide phenomenon, and one embraced not only by mankind but also by plants and trees. Autumn and spring, after all, are a function of the length of the days, and plants and trees react accordingly by shedding old foliage or generating new shoots and leaves. So nature, it seems, has a baseline too.

Why bother with a baseline?
To be effective, however, any monitoring activity needs a baseline. Determining baselines is key to effective monitoring. In its most basic form, a performance baseline is simply a set of metrics that defines the normal working state of whatever it is that you are monitoring. Engineers typically compare current readings against the performance baseline to trap changes in state that could indicate a problem.
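
As a rough illustration of comparing readings against a baseline, the sketch below summarises a set of "normal" samples as a mean and standard deviation and flags any reading that falls too far outside them. The sample values, the function names and the three-standard-deviation tolerance are assumptions made for the example, not recommendations from the article.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarise normal behaviour as a mean and a standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}

def deviates(baseline, value, tolerance=3.0):
    """Return True when value lies more than `tolerance` standard
    deviations away from the baseline mean."""
    return abs(value - baseline["mean"]) > tolerance * baseline["stdev"]

# Hypothetical response times (seconds) collected during normal operation.
normal_response_times = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0]
baseline = build_baseline(normal_response_times)

print(deviates(baseline, 2.1))  # a reading close to normal
print(deviates(baseline, 4.5))  # a reading well outside normal
```

The tolerance is the part most worth tuning: too tight and normal variation raises alarms, too loose and real changes in state go unnoticed.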

Setting an appropriate baseline also provides early indicators that usage, consumption or even throughput demands are pushing against available capacity, giving support and planning resources the opportunity to plan for upgrades. Aligning performance baselines with existing SLAs (Service Level Agreements) can help the organization stay within capacity parameters and identify problem areas that are falling out of compliance.

The challenge is in determining what constitutes a relevant and appropriate baseline. As you can imagine, for many things there is no absolute answer with respect to baselines. Even Mother Nature sometimes gets it wrong: trees start sprouting leaves at about the right time in the season, and then an unexpected cold snap nips those shoots in the bud with a frost and effectively stunts or stalls plant growth for the season.

Establishing a baseline is also key to the effective implementation of anything new. If, for example, your plan is to replace your organization's paper forms processing with an electronic forms solution with workflow, based on a technology like that provided by Winshuttle, you need to understand some basic metrics about what you are trying to do and what your expectations should be around general performance and operational function.

There are no standard baselines
Unfortunately, there are no generalized standards for baseline monitoring that you can simply overlay on your organization. Just as every custom-built boiler has its own baseline, every range of boilers differs from every other range, and every automotive engine has a different optimal performance baseline, so too every organization has a baseline that is unique.

There are industry standards that can help, like COBIT and ITIL, and some of these also make monitoring tool recommendations, but many of them involve heavy lifting in terms of the highly integrated solutions and infrastructure that a given organization needs to have in place.

A different but effective approach is to determine your minimum expectations in terms of effectiveness. For example: we will have no more than two orders waiting to be processed at any point in time; we will have no more than three process exceptions per 100 orders; we will not have order lines canceled due to lack of product availability.
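
Minimum expectations like these translate fairly directly into threshold checks. The following is a minimal sketch, assuming the figures are already available from somewhere; the `OrderStats` fields and the `check_expectations` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class OrderStats:
    """Hypothetical snapshot of order processing figures."""
    waiting_orders: int
    process_exceptions: int
    orders_processed: int
    lines_cancelled_no_stock: int

def check_expectations(stats):
    """Return a list describing each minimum expectation that was breached."""
    breaches = []
    if stats.waiting_orders > 2:
        breaches.append("more than two orders waiting to be processed")
    # Compare as integers to avoid rounding: exceptions/orders > 3/100.
    if stats.process_exceptions * 100 > 3 * stats.orders_processed:
        breaches.append("more than three process exceptions per 100 orders")
    if stats.lines_cancelled_no_stock > 0:
        breaches.append("order lines canceled due to lack of product availability")
    return breaches

print(check_expectations(OrderStats(1, 3, 100, 0)))
print(check_expectations(OrderStats(3, 8, 200, 1)))
```

Returning the list of breaches, rather than a single pass or fail flag, keeps it clear which expectation needs attention.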

The choice to build infrastructure that pushes and pulls data from your ERP system, whether from Microsoft Excel or an InfoPath form, has been made on the fundamental assumption that it will improve on the existing approaches by some measure. What are those improvements?

Data processing may improve in quality, speed or process rigor, and all of these can be measured. As part of the capital investment process there is usually a requirement for some sort of project justification, and this can be a great starting point for your baseline, since it usually includes some sort of yield or return on investment metrics. Part of your baseline activity is also to assess how long the current approaches take to achieve their objective, or to fail.

Taking an inventory of all the things you believe are important is therefore your starting point.

Priority and measurability
The next step is determining a priority for those items: which are the most important, and which can reasonably be measured. A baseline that states "our users will be happier" may seem an odd one, but it is reasonable if it is reworked as a response to a periodic survey with a measurable success criterion such as: 95% of all new users surveyed agree that they prefer the new form. While this is not necessarily an enterprise-monitorable response, it is something that you could build into your process at the close-out of the form, with a window that asks whether the form process was easy or hard and whether the user would be likely to use it again in the future. Storing every response in a database can then become part of your monitoring metrics.
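
A minimal sketch of that idea follows, using an in-memory SQLite database as a stand-in for whatever database actually sits behind the form. The table name, column name and helper functions are hypothetical; only the 95% target comes from the text above.

```python
import sqlite3

def record_response(conn, prefers_new_form):
    """Store one close-out survey answer (True = prefers the new form)."""
    conn.execute(
        "INSERT INTO survey_responses (prefers_new_form) VALUES (?)",
        (int(prefers_new_form),),
    )

def meets_target(conn, target_percent=95):
    """Return True when at least target_percent of responses agree."""
    total, agreed = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(prefers_new_form), 0) FROM survey_responses"
    ).fetchone()
    if total == 0:
        return False  # no responses yet, so the criterion cannot be confirmed
    return agreed * 100 >= target_percent * total

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey_responses ("
    "id INTEGER PRIMARY KEY, prefers_new_form INTEGER NOT NULL)"
)

# Hypothetical sample: 19 of 20 users prefer the new form.
for answer in [True] * 19 + [False]:
    record_response(conn, answer)

print(meets_target(conn))
```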

At the end of many Skype VoIP calls, for example, a call quality poll is presented to help assess the quality of the encoding algorithms and application performance.

The last factor to consider is how long you should baseline for. There is no categorical answer, but it is important to remember that if continuous improvement is your objective, a protracted baseline gives you the best data. Usually this means a high number of samples, with enough diversity to include the outliers that would skew the picture if everything were looked at all-inclusively. The important thing to remember about the baseline, though, is that its characteristics and parameters are likely to change over time. The starting baseline may move after the new system or approach has been adopted, and in fact the new approach itself may become the baseline for future enhancements and improvements.
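
One way to see why outliers matter is to summarise the same baseline samples in more than one way. The sketch below is only an illustration under assumed data: the completion times are invented, and choosing the median and 95th percentile alongside the mean is a suggestion of this example rather than something the article prescribes.

```python
from statistics import mean, median, quantiles

def summarise_baseline(samples):
    """Summarise baseline samples with outlier-sensitive and
    outlier-resistant statistics side by side."""
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    cut_points = quantiles(samples, n=20, method="inclusive")
    return {
        "count": len(samples),
        "mean": mean(samples),
        "median": median(samples),
        "p95": cut_points[-1],
    }

# Hypothetical form completion times in seconds, with one outlier (240).
completion_seconds = [30, 32, 31, 29, 33, 30, 31, 240, 32, 30]
summary = summarise_baseline(completion_seconds)
print(summary)
```

Here the single 240-second sample pulls the mean well above the median, which is the kind of skew a longer, more diverse baseline period helps to put in context.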

When it comes to forms design, some measures of form and workflow performance to consider are the following:

  • Form generation time: how long does the form take to render on launch? With paper, it's how long it takes you to find the form.
  • Form completion time: how long, on average, does it take to complete the form? This assumes that the person completing the form has all the information they need to hand.
  • Form routing time: how long does it take to close out the process and pass control to the next person in the chain?
  • Notification time: how long does it take for the submitter to be informed that their form is en route, and how long does it take for the next person in the chain to be notified? Failures or protracted delays here may speak to a number of factors, but you should define the expectation that you have for these.
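
If each stage of a form's life is timestamped, the four measures above can be derived by simple subtraction. This is a sketch under that assumption; the event names (`requested`, `rendered`, `submitted`, `routed`, `notified`) are hypothetical and would need to be mapped to whatever your form and workflow system actually logs.

```python
from datetime import datetime

def form_timings(events):
    """Derive the four timing measures, in seconds, from a dict that maps
    hypothetical event names to datetime values."""
    def seconds_between(start, end):
        return (events[end] - events[start]).total_seconds()

    return {
        "generation_s": seconds_between("requested", "rendered"),
        "completion_s": seconds_between("rendered", "submitted"),
        "routing_s": seconds_between("submitted", "routed"),
        "notification_s": seconds_between("routed", "notified"),
    }

# Hypothetical timestamps for a single form.
example_events = {
    "requested": datetime(2024, 1, 15, 9, 0, 0),
    "rendered": datetime(2024, 1, 15, 9, 0, 2),
    "submitted": datetime(2024, 1, 15, 9, 4, 2),
    "routed": datetime(2024, 1, 15, 9, 4, 5),
    "notified": datetime(2024, 1, 15, 9, 4, 35),
}

print(form_timings(example_events))
```

Collected over many forms, these per-form figures become the samples from which a timing baseline can be built.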

Softer metrics such as those cited earlier, for example non-conformance or the number of forms rejected due to data quality, are a little harder to put system monitors around, but you should try to monitor them anyway. With so many form and workflow activities now being stored in openly accessible relational databases like SQL Server, there are a great many more ways to evaluate the data than ever before.
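
When that data lives in a relational database, a softer metric such as rejections by reason is often just a grouped query. The sketch below uses an in-memory SQLite database purely as a stand-in for a production database such as SQL Server; the `form_submissions` table, its columns and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE form_submissions ("
    "id INTEGER PRIMARY KEY, status TEXT NOT NULL, rejection_reason TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO form_submissions (status, rejection_reason) VALUES (?, ?)",
    [
        ("approved", None),
        ("approved", None),
        ("approved", None),
        ("rejected", "data quality"),
        ("rejected", "data quality"),
        ("rejected", "non-conformance"),
    ],
)

def rejections_by_reason(conn):
    """Count rejected forms per rejection reason, most frequent first."""
    rows = conn.execute(
        "SELECT rejection_reason, COUNT(*) FROM form_submissions "
        "WHERE status = 'rejected' "
        "GROUP BY rejection_reason "
        "ORDER BY COUNT(*) DESC, rejection_reason"
    ).fetchall()
    return dict(rows)

print(rejections_by_reason(conn))
```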

If you have some interesting baseline variables you'd like to share, please do let me know. I would love to hear about them.

More Stories By Clinton Jones

Clinton Jones is a Product Manager at Winshuttle. He is experienced in international technology and business process with a focus on integrated business technologies. Clinton also serves as a technical consultant on technology and quality management as it relates to data and process management and governance. Before coming to Winshuttle, Clinton served as a Technical Quality Manager at SAP. Twitter @winshuttle
