Welcome!

Microservices Expo Authors: Liz McMillan, Pat Romanski, Elizabeth White, Mehdi Daoudi, Yeshim Deniz

Related Topics: Microservices Expo

Microservices Expo: Article

The DNA of APM: Event to Incident Flow

Where APM and ITIL Come Together

This article is the corollary to “The Anatomy of APM” which outlines four foundational elements of a successful APM strategy:  Top Down Monitoring, Bottom Up Monitoring, Reporting & Analytics, and ITIL / ITSM Processes.  Here I provide a deeper context on how the event-to-incident flow is structured.

It is the correlation of events and the amalgamation of metrics that bring value to the business by way of dashboards and trending reports, and it’s in the way the business interprets the accuracy of those metrics that determines the success of the implementation.  If an event occurs and no one sees it, believes it, or takes action on it, APM’s value can be severely diminished and you run the risk of owning “shelfware.”

Overall, as events are detected and consumed by the system, it is the automation that is the lifeblood of an APM solution, ensuring that the pulse of the incident flow is a steady one.  The goal is to show a conceptual view of how events flow through the environment and eventually become incidents.  At a high level, the Trouble Ticket Interface (TTI) will correlate the events into alerts, and alerts into incidents which then become tickets, enabling the Operations team to begin working toward resolution.

The event flow moves from the outside in, and then from the center to the right.

Here’s how it works:

  • The outside blue circles represent the monitoring toolsets that collect information directly from the infrastructure and the critical applications.
  • The inner green (teal) circles represent the toolsets the Enterprise Systems Management (ESM) team manages, and is where most of the critical application thresholds are set.
  • The dark brown circles are logical connection points depicting how the events are collected as they flow through the system: Once the events hit this connection point they go to three output queues.
  • The Red circles on the right are the Incident Output queues for each event after it has been tracked and correlated.

The transformation between event-to-incident is the critical junction where APM and ITIL come together to provide tangible value back to the business.  If you only take one thing away from this picture, it would be the importance of managing the strategic intent of the output queues, because this is the key for managing action, going red to green, and trending.

Conclusion
It is not necessarily the number of features or technical stamina of each monitoring tool to process large volumes of data that will make an APM implementation successful, it’s the choices you make in putting them together to manage the event-to-incident flow that determines your success.  Timeliness and accuracy in this area will help you gain credibility and confidence with each of the constituents and business partners you support.

More Stories By Larry Dragich

Larry Dragich is actively involved with industry leaders, sharing knowledge of Application Performance Management (APM) technologies, from best practices and technical workflows, to resource allocation and approaches for implementation. He has been working in the APM space since 2006 where he built the Enterprise Systems Management team which is now the focal point for IT performance monitoring and capacity planning activities.

Microservices Articles
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
Containers and Kubernetes allow for code portability across on-premise VMs, bare metal, or multiple cloud provider environments. Yet, despite this portability promise, developers may include configuration and application definitions that constrain or even eliminate application portability. In this session we'll describe best practices for "configuration as code" in a Kubernetes environment. We will demonstrate how a properly constructed containerized app can be deployed to both Amazon and Azure ...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In their Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, and Mark Lav...
The now mainstream platform changes stemming from the first Internet boom brought many changes but didn’t really change the basic relationship between servers and the applications running on them. In fact, that was sort of the point. In his session at 18th Cloud Expo, Gordon Haff, senior cloud strategy marketing and evangelism manager at Red Hat, will discuss how today’s workloads require a new model and a new platform for development and execution. The platform must handle a wide range of rec...
The Internet of Things is clearly many things: data collection and analytics, wearables, Smart Grids and Smart Cities, the Industrial Internet, and more. Cool platforms like Arduino, Raspberry Pi, Intel's Galileo and Edison, and a diverse world of sensors are making the IoT a great toy box for developers in all these areas. In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists discussed what things are the most important, which will have the most profound e...
If your cloud deployment is on AWS with predictable workloads, Reserved Instances (RIs) can provide your business substantial savings compared to pay-as-you-go, on-demand services alone. Continuous monitoring of cloud usage and active management of Elastic Compute Cloud (EC2), Relational Database Service (RDS) and ElastiCache through RIs will optimize performance. Learn how you can purchase and apply the right Reserved Instances for optimum utilization and increased ROI.
TCP (Transmission Control Protocol) is a common and reliable transmission protocol on the Internet. TCP was introduced in the 70s by Stanford University for US Defense to establish connectivity between distributed systems to maintain a backup of defense information. At the time, TCP was introduced to communicate amongst a selected set of devices for a smaller dataset over shorter distances. As the Internet evolved, however, the number of applications and users, and the types of data accessed and...
Consumer-driven contracts are an essential part of a mature microservice testing portfolio enabling independent service deployments. In this presentation we'll provide an overview of the tools, patterns and pain points we've seen when implementing contract testing in large development organizations.
In his session at 19th Cloud Expo, Claude Remillard, Principal Program Manager in Developer Division at Microsoft, contrasted how his team used config as code and immutable patterns for continuous delivery of microservices and apps to the cloud. He showed how the immutable patterns helps developers do away with most of the complexity of config as code-enabling scenarios such as rollback, zero downtime upgrades with far greater simplicity. He also demoed building immutable pipelines in the cloud ...
You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time. In his session at 19th Cloud Expo, Mark Allen, General Manager of...