Microservices Expo: Article

Fact Finders: Sorting Out the Truth in Real User Monitoring

Go Real with the right expectations

On my recent visits to Velocity, WebPerfDay and Apps World in London, Real User Monitoring (RUM) was the hot topic. That got me thinking about the differences between vendors. They all promise the same thing at widely varying prices - from free to a couple of thousand US dollars. What I found out is that there IS a big difference and - depending on what you want to do with RUM - you want to make sure you understand the capabilities and limitations of the available solutions.

The false claim of 100% Coverage
What all vendors claim to do is capture data from 100% of your users. When looking closer you see that many of these solutions - especially the "Freemiums" - rely on the W3C Navigation Timings. So my question is: How can I cover ALL users with W3C timings when these timings are NOT AVAILABLE in all browsers?

W3C timings are only available in newer browsers. So what about IE6, IE7, IE8, the whole Safari browser family, and older Firefox and Chrome versions? Looking at current statistics, they sum up to 35% of the overall market share. Claims by vendors that rely on these timings to capture every user's experience are simply not accurate.

The performance impact of monitoring
After finding that out I asked myself: "Are there any more deficiencies to be found?"

I first thought about the collection mechanism, which reminded me of the challenges all Web Analytics tools have. Data collection relies on the browser's onUnload event. The RUM tools have to collect data until the last second of the page's lifecycle and then send it off. Most SaaS vendors use an image GET request to send the data to their collection instances. Modern browsers such as Chrome optimize this case ("Why should a browser download an image if the page is about to die?") and either do not execute the request at all or do not wait for the response once the data has been sent. So again, I am losing data from my real end users. The workaround some vendors have put in place is a timeout in the onUnload event. I've seen timeouts of up to 500ms, which delay the next page that gets loaded. We want to improve user experience and performance, but these tools force the user to wait longer to move to the next page.

So we are losing all the old browsers and additionally the modern ones that do not execute the data collection requests. We are now far away from 100% coverage.

Do the math
Another argument you always hear is that the RUM solution allows you to find out more about the end user environment's impact on page performance. The geographical region of the end user, the browsers, the OS or device can result in slow page performance. But does this really work?

Let's do some simple math and figure out what this means for a site with 1 000 000 visits a day:

  • 1 000 000 overall visits/day
  • 1 000 000 - 35% of visits with no W3C timing support in the browser
  • 650 000 - 20% of visits that send the data incompletely or not at all
  • 520 000 captured visits per day

Figure 1: Only 52% of visitors are captured by most RUM vendors due to limitations of browsers
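
The funnel above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper name `apply_loss` is made up for illustration, and the 35% and 20% loss rates are the figures used in this article, not measurements for any particular site.

```python
def apply_loss(visits: int, loss_percent: int) -> int:
    """Return the visits that remain after losing loss_percent of them."""
    return visits * (100 - loss_percent) // 100


total_visits = 1_000_000

# Step 1: drop visits from browsers without W3C timing support (35% here).
with_w3c_timings = apply_loss(total_visits, 35)

# Step 2: drop visits whose data arrives incomplete or not at all (20% here).
captured = apply_loss(with_w3c_timings, 20)

coverage_percent = captured * 100 // total_visits

print(with_w3c_timings)   # 650000
print(captured)           # 520000
print(coverage_percent)   # 52
```

Integer arithmetic is used on purpose so the intermediate values match the list exactly. Plugging in your own browser share and loss rate shows how far your coverage is from 100%.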

So we have reduced our base from 1 000 000 to 520 000. Let's start breaking it down into different groupings:

  • 520 000 broken down by 100 countries
  • 520 000/100 = 5 200 visits/country/day
  • 5 200 visits per country broken down by 20 browser versions
  • 5 200/20 = 260 visits/country/browser version/day

Let's break the 260 visits down further by 10 operating systems:

  • 260/10 = 26 visits/country/browser version/operating system/day

We want to have data on an hourly basis:

  • 26/24 ≈ 1 visit/country/browser version/operating system/hour

**1 000 000 visits per day ≈ 1 visit/country/browser version/operating system/hour! We have done no sampling, we have only country-level data, and we are looking at visits, not page views!**

To clarify: this calculation assumes that visits are evenly distributed over all countries. It does not take into account that most solutions sample at a rate of 1-20%, and it looks at visits with multiple page views instead of unique URIs. That makes it a best-case scenario. In reality it can be even worse.
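
To see how quickly the numbers shrink, the breakdown can be written as one small function. This is a sketch under the same assumptions as the calculation above (an even spread over 100 countries, 20 browser versions and 10 operating systems). The function name and the `sampling_rate` parameter are illustrative, added here to show the effect of the 1-20% sampling mentioned above.

```python
def visits_per_segment_per_hour(
    captured_per_day: int,
    countries: int,
    browser_versions: int,
    operating_systems: int,
    sampling_rate: float = 1.0,
) -> float:
    """Average visits per (country, browser version, OS) segment per hour,
    assuming visits are spread evenly over all segments and hours."""
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in the range (0, 1]")
    segments = countries * browser_versions * operating_systems
    return captured_per_day * sampling_rate / segments / 24


# No sampling: the roughly 1 visit per segment per hour derived above.
unsampled = visits_per_segment_per_hour(520_000, 100, 20, 10)
print(round(unsampled, 2))  # 1.08

# With sampling at the upper and lower end of the 1-20% range.
for rate in (0.20, 0.01):
    per_hour = visits_per_segment_per_hour(520_000, 100, 20, 10, rate)
    print(f"{rate:.0%} sampling: {per_hour:.3f} visits/segment/hour")
```

Even at 20% sampling, a segment sees on average well under one visit per hour, which is far too little to say anything about that segment's performance.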

So then, why is Real User Monitoring so popular?...
...because it helps you to improve your users' experience! How can that work when we know that we might not capture data from all our end users? You only have to change your expectations of what you want to achieve with Real User Monitoring.

What you should expect from your RUM solution is:

  • Support for all browsers - not only the new browsers
  • A reliable data sending mechanism
  • W3C timings support
  • Functional Health information like errors from JavaScript and HTTP - not only timings
  • AJAX/XHR-requests timing - not only timings for page loads
  • The click path of a whole visit - not only separate page views
  • Support for desktop browsers, mobile browsers and mobile native applications in combined view
  • Landing and Exit page analysis

If your selected solution provides all these features, you can go a step further: instead of only monitoring your users, you can do real User Experience Management (UEM). A few short examples show what that allows you to do.

Example 1: JavaScript Errors - Which one to fix first?
If your RUM/UEM solution provides you with JavaScript errors, you can start fixing problems right away. It should be able to show you which messages appear how often in which browser, as shown in Figure 2.

Figure 2: Detailed JavaScript error messages are captured for every visit and are easily accessible, grouped by browser, OS or geo-location
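
The grouping described in Example 1 (which message appears how often in which browser) can be sketched in a few lines of Python. The records and field names below are hypothetical sample data, not the export format of any particular RUM product.

```python
from collections import Counter

# Hypothetical captured error records; a real tool would supply these.
errors = [
    {"message": "TypeError: 'undefined' is null or not an object", "browser": "IE8"},
    {"message": "ReferenceError: jQuery is not defined", "browser": "Firefox 3.6"},
    {"message": "TypeError: 'undefined' is null or not an object", "browser": "IE8"},
    {"message": "TypeError: 'undefined' is null or not an object", "browser": "IE7"},
    {"message": "TypeError: 'undefined' is null or not an object", "browser": "IE8"},
]


def rank_errors(records):
    """Count occurrences per (message, browser) pair, most frequent first."""
    counts = Counter((r["message"], r["browser"]) for r in records)
    return counts.most_common()


for (message, browser), count in rank_errors(errors):
    print(f"{count:>3}  {browser:<12} {message}")
```

The pair at the top of the list is the natural candidate to fix first. Weighting the counts by the number of affected visits or by revenue would be an equally reasonable ranking.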

Example 2: Why are my customers leaving my web site?
With UEM you are able to see not only that your customers are leaving your web site, but also whether they had technical issues (see Figure 3).

Figure 3: Looking at Exit Pages and correlating them with Failure Rate, Performance and User Experience allows us to quickly identify why visitors leave the website on these pages

Example 3: What did my customer do on the application before he called our support center?
Having every visit and all actions available makes it easy for the support center employees to look up the visit information as part of the triage process (see Figure 4).

Figure 4: Seeing all actions the visitor really executed on the website helps speed up the complaint process as all facts are available

Example 4: Correlating Performance to Business
Analyzing the performance of every single visit and all actions not only allows us to pinpoint problems on individual pages, certain browsers or geographical regions. It also allows us to correlate problems in the application with business results. Knowing how much revenue is lost due to degraded performance gives application owners better arguments when discussing investments in the infrastructure or additional R&D resources. The dashboard shown in Figure 5 correlates Response Time with the number of Visitors by Continent and the generated Orders. Problems in the infrastructure that lead to performance problems of the application can then easily be correlated to lost revenue:

Figure 5: Correlating Business Values such as number of Orders with Page Performance and Infrastructure Health opens a new line of communication between Business and Application Owners
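
A dashboard like this correlates the metrics visually. One simple way to put a number on the relationship is a Pearson correlation coefficient between response time and orders over the same time intervals. The sketch below assumes hourly values, and the sample figures are invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Invented hourly samples: average response time in seconds and orders placed.
response_time_s = [1.2, 1.4, 2.9, 3.5, 1.3]
orders = [120, 115, 70, 55, 118]

r = pearson(response_time_s, orders)
print(f"correlation between response time and orders: {r:.2f}")
```

A value close to -1 means orders drop as response time rises. It indicates a relationship worth investigating, not proof that slow pages caused the lost orders.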

W3C timings give us great insight, but they are only available in newer browsers. Be aware of what your RUM solution vendor promises you and do not forget the simple math. Set your expectations right and look for solutions that support visits and health indicators like HTTP errors and JavaScript errors. Go Real with the right expectations.

More Stories By Klaus Enzenhofer

Klaus Enzenhofer has several years of experience and expertise in the field of Web Performance Optimization and User Experience Management. He works as Technical Strategist in the Center of Excellence Team at dynaTrace Software. In this role he influences the development of the dynaTrace Application Performance Management Solution and the Web Performance Optimization Tool dynaTrace AJAX Edition. He mainly gathered his experience in web and performance by developing and running large-scale web portals at Tiscover GmbH.
