Click here to close now.

Welcome!

@MicroservicesE Blog Authors: Elizabeth White, Pat Romanski, Cloud Best Practices Network, Lori MacVittie, Liz McMillan

Related Topics: Java IoT, Industrial IoT, @MicroservicesE Blog, IoT User Interface, Agile Computing, @CloudExpo Blog

Java IoT: Article

Evolving an APM Strategy for the 21st Century

How the approach to APM has evolved to adapt to a complex ecosystem

I started in the web performance industry - well before Application Performance Management (APM) existed - during a time when external, single page measurement ruled the land. In an ecosystem where no other solutions existed, it was the top of the data chain to support the rapidly evolving world of web applications. This was an effective approach to APM, as most online applications were self-contained and, compared to the modern era, relatively simple in their design.

A state-of-the-art web application, circa 2000

Soon, a new solution rose to the top of the ecosystem - the synthetic, multi-step business process, played back either in a browser or a browser simulator. By evolving beyond the single-page measurement, this more complex data collection methodology was able to provide a view into the most critical business processes, delivering repeatable baseline and benchmark data that could be used by operations and business teams to track the health of the applications and identify when and where issues occurred.

These multi-step processes ruled the ecosystem for nearly a decade, evolving to include finer detail, deeper analytics, wider browser selection, and greater geographic coverage. But, like anything at the apex of an ecosystem, even this approach began to show that it couldn't answer every question.

In the modern online application environment, companies are delivering data to multiple browsers and mobile devices while creating increasingly sophisticated applications. These applications are developed using a combination of in-house code, commercial and open source packages and servers, and outside services to extend the application beyond what the in-house team specializes in.

This growth and complexity means that the traditional, stand-alone tools are no longer complex and "smart" enough to help customers actually solve the problems they face in their online applications. This means that a new approach, the next step in APM evolution, was needed to displace the current technologies at the top of the pyramid.

A state-of-the-art online application, circa 2013

This ecosystem, with multiple, sometimes competing, data streams makes it extremely difficult to answer the seemingly simple question of What is happening?, and sometimes nearly impossible to answer the important question of And why does it matter to us?

Let's walk through a performance issue and show how the approach to APM has evolved to adapt to the complex ecosystem, and why we find that it requires a sophisticated, integrated approach to allow the flood of data to turn into a concentrated stream of actionable information.

Starting with synthetic data, we already have two unique perspectives that provide a broader scope of data than the traditional datacenter-only approach. By combining Backbone (traditional datacenter synthetic monitoring) with data from the Last Mile (data collected from end-user competitors running the same scripts that are run from the Backbone), the clear differences in performance appear, giving companies an idea that the datacenter-only approach needs to be extended by collecting data from a source that is much closer to the customers that use the monitored application.

Outside-In Data Capture Perspectives used to provide the user experience data for online applications

Using a real-world scenario, let's follow the diagnostic process of a detected issue from the initial synthetic errors to the deepest level of impact, and see how a new, integrated APM solution can help resolve issues in an effective, efficient, and actionable way.

Starting with a three-hour snapshot of synthetic data, it's apparent that there is an issue almost halfway through this period, affecting primarily the Backbone measurements.

Examination of Individual Synthetic Measurements to identify outliers and errors

The clear cluster of errors (red squares in the scatter plot) around 17:30 is seen to be affecting Backbone only by filtering out the blue Last Mile measurements. After this filtering, zooming in allows us to quickly see that these errors are focused on the Backbone measurement perspective.

Filtered Scatter Plot Data Showing the Backbone Perspective, Focusing on the Errors

Examining the data shows that they are all script playback issues related to a missing element on the step, preventing the next action in the script from being executed.

A waterfall chart showing that the script execution failed due to an expected page element not appearing

But there are two questions that need to be answered: Why? And Does this matter? What's interesting is that as good as the synthetic tool is, this is as far as it can go. This forces teams to investigate the issue further and replicate it using other tools, wasting precious time.

But an evolved APM strategy doesn't stop here. By isolating the time period and error, the modern, integrated toolset can now ask and answer both those questions, and extend the information to: Who else was affected?

In the above instance, we know that the issue occurred from Pennsylvania. By using a user-experience monitoring (UEM) tool that captures data from all incoming visitors, we can filter the data to examine just the synthetic test visit.

Already, we have extended the data provided by the synthetic measurement. By drilling down further, it immediately becomes clear what the issue was.

Click on "Chart" takes over 60 seconds of Server Time

And then, the final step, what was happening on the server-side? Well, it's clear that one layer of the application was causing the issue and eventually the server timed out.

Issue is in the Crystaldecision API - Something to pass to the developers and QA team!

So, the element that was needed to make the script move forward wasn't there because the process that was generating the element timed out. When the agent decided to attempt the action, the missing element caused the script to fail.

This integrated approach has identified that the Click on ‘Chart' action is one of potential concern and we can now go back and look at all instances of this action in the past 24 hours to see if there are visits that encountered a similar issue. It's clear that this is a serious issue that needs to be investigated. The following screenshot shows all click-on chart actions that experienced this problem including those from REAL users that were also impacted by this problem.

A list of all visitors - Synthetic and Real - affected by the click on "Chart" issue in a 24-hour period, indicating a high priority issue

From an error on a Synthetic chart, we have quickly been able to move down to an issue that has been repeated multiple times over the past 24 hours, affecting not only synthetic users but also real users. Exporting all of this data and sending it to the QA and development teams will allow them to focus their efforts on the critical area.

This integrated approach has shown what has been proven in ecosystems all throughout the world, whether they are in nature or in applications: a tightly integrated group that seamlessly works together is far more effective than an individual. With many eyes, perspectives, and complementary areas of expertise, the team approach has provided far more data to solve the problem than any one of the perspectives could have on its own.

More Stories By Stephen Pierzchala

With more than a decade in the web performance industry, Stephen Pierzchala has advised many organizations, from Fortune 500 to startups, in how to improve the performance of their web applications by helping them develop and evolve the unique speed, conversion, and customer experience metrics necessary to effectively measure, manage, and evolve online web and mobile applications that improve performance and increase revenue. Working on projects for top companies in the online retail, financial services, content delivery, ad-delivery, and enterprise software industries, he has developed new approaches to web performance data analysis. Stephen has led web performance methodology, CDN Assessment, SaaS load testing, technical troubleshooting, and performance assessments, demonstrating the value of the web performance. He noted for his technical analyses and knowledge of Web performance from the outside-in.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@MicroservicesExpo Stories
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
Cloud Migration Management (CMM) refers to the best practices for planning and managing migration of IT systems from a legacy platform to a Cloud Provider through a combination professional services consulting and software tools. A Cloud migration project can be a relatively simple exercise, where applications are migrated ‘as is’, to gain benefits such as elastic capacity and utility pricing, but without making any changes to the application architecture, software development methods or busine...
Discussions about cloud computing are evolving into discussions about enterprise IT in general. As enterprises increasingly migrate toward their own unique clouds, new issues such as the use of containers and microservices emerge to keep things interesting. In this Power Panel at 16th Cloud Expo, moderated by Conference Chair Roger Strukhoff, panelists addressed the state of cloud computing today, and what enterprise IT professionals need to know about how the latest topics and trends affect t...
Data center models are changing. A variety of technical trends and business demands are forcing that change, most of them centered on the explosive growth of applications. That means, in turn, that the requirements for application delivery are changing. Certainly application delivery needs to be agile, not waterfall. It needs to deliver services in hours, not weeks or months. It needs to be more cost efficient. And more than anything else, it needs to be really, dc infra axisreally, super focus...
Sharding has become a popular means of achieving scalability in application architectures in which read/write data separation is not only possible, but desirable to achieve new heights of concurrency. The premise is that by splitting up read and write duties, it is possible to get better overall performance at the cost of a slight delay in consistency. That is, it takes a bit of time to replicate changes initiated by a "write" to the read-only master database. It's eventually consistent, and it'...
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
Many people recognize DevOps as an enormous benefit – faster application deployment, automated toolchains, support of more granular updates, better cooperation across groups. However, less appreciated is the journey enterprise IT groups need to make to achieve this outcome. The plain fact is that established IT processes reflect a very different set of goals: stability, infrequent change, hands-on administration, and alignment with ITIL. So how does an enterprise IT organization implement change...
Conferences agendas. Event navigation. Specific tasks, like buying a house or getting a car loan. If you've installed an app for any of these things you've installed what's known as a "disposable mobile app" or DMA. Apps designed for a single use-case and with the expectation they'll be "thrown away" like brochures. Deleted until needed again. These apps are necessarily small, agile and highly volatile. Sometimes existing only for a short time - say to support an event like an election, the Wor...
Containers have changed the mind of IT in DevOps. They enable developers to work with dev, test, stage and production environments identically. Containers provide the right abstraction for microservices and many cloud platforms have integrated them into deployment pipelines. DevOps and Containers together help companies to achieve their business goals faster and more effectively. In his session at DevOps Summit, Ruslan Synytsky, CEO and Co-founder of Jelastic, reviewed the current landscape of...
The cloud has transformed how we think about software quality. Instead of preventing failures, we must focus on automatic recovery from failure. In other words, resilience trumps traditional quality measures. Continuous delivery models further squeeze traditional notions of quality. Remember the venerable project management Iron Triangle? Among time, scope, and cost, you can only fix two or quality will suffer. Only in today's DevOps world, continuous testing, integration, and deployment upend...
While DevOps most critically and famously fosters collaboration, communication, and integration through cultural change, culture is more of an output than an input. In order to actively drive cultural evolution, organizations must make substantial organizational and process changes, and adopt new technologies, to encourage a DevOps culture. Moderated by Andi Mann, panelists discussed how to balance these three pillars of DevOps, where to focus attention (and resources), where organizations migh...
At DevOps Summit NY there’s been a whole lot of talk about not just DevOps, but containers, IoT, and microservices. Sessions focused not just on the cultural shift needed to grow at scale with a DevOps approach, but also made sure to include the network ”plumbing” needed to ensure success as applications decompose into the microservice architectures enabling rapid growth and support for the Internet of (Every)Things.
Mashape is bringing real-time analytics to microservices with the release of Mashape Analytics. First built internally to analyze the performance of more than 13,000 APIs served by the mashape.com marketplace, this new tool provides developers with robust visibility into their APIs and how they function within microservices. A purpose-built, open analytics platform designed specifically for APIs and microservices architectures, Mashape Analytics also lets developers and DevOps teams understand w...
Buzzword alert: Microservices and IoT at a DevOps conference? What could possibly go wrong? In this Power Panel at DevOps Summit, moderated by Jason Bloomberg, the leading expert on architecting agility for the enterprise and president of Intellyx, panelists peeled away the buzz and discuss the important architectural principles behind implementing IoT solutions for the enterprise. As remote IoT devices and sensors become increasingly intelligent, they become part of our distributed cloud envir...
Sumo Logic has announced comprehensive analytics capabilities for organizations embracing DevOps practices, microservices architectures and containers to build applications. As application architectures evolve toward microservices, containers continue to gain traction for providing the ideal environment to build, deploy and operate these applications across distributed systems. The volume and complexity of data generated by these environments make monitoring and troubleshooting an enormous chall...
Containers and Docker are all the rage these days. In fact, containers — with Docker as the leading container implementation — have changed how we deploy systems, especially those comprised of microservices. Despite all the buzz, however, Docker and other containers are still relatively new and not yet mainstream. That being said, even early Docker adopters need a good monitoring tool, so last month we added Docker monitoring to SPM. We built it on top of spm-agent – the extensible framework f...
There's a lot of things we do to improve the performance of web and mobile applications. We use caching. We use compression. We offload security (SSL and TLS) to a proxy with greater compute capacity. We apply image optimization and minification to content. We do all that because performance is king. Failure to perform can be, for many businesses, equivalent to an outage with increased abandonment rates and angry customers taking to the Internet to express their extreme displeasure.
There's a lot of things we do to improve the performance of web and mobile applications. We use caching. We use compression. We offload security (SSL and TLS) to a proxy with greater compute capacity. We apply image optimization and minification to content. We do all that because performance is king. Failure to perform can be, for many businesses, equivalent to an outage with increased abandonment rates and angry customers taking to the Internet to express their extreme displeasure.
SYS-CON Events announced today that the "Second Containers & Microservices Conference" will take place November 3-5, 2015, at the Santa Clara Convention Center, Santa Clara, CA, and the “Third Containers & Microservices Conference” will take place June 7-9, 2016, at Javits Center in New York City. Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities.
The causality question behind Conway’s Law is less about how changing software organizations can lead to better software, but rather how companies can best leverage changing technology in order to transform their organizations. Hints at how to answer this question surprisingly come from the world of devops – surprising because the focus of devops is ostensibly on building and deploying better software more quickly. Be that as it may, there’s no question that technology change is a primary fac...