Microservices Expo: Article

Fact Finders: Sorting Out the Truth in Real User Monitoring

Go Real with the right expectations

On my recent visits to Velocity, WebPerfDay and Apps World in London, Real User Monitoring (RUM) was the hot topic. That got me thinking about the differences between vendors. They all promise the same thing at widely varying prices - from free to a couple of thousand US dollars. What I found out is that there IS a big difference and, depending on what you want to do with RUM, you need to understand the capabilities and limitations of the available solutions.

The false claim of 100% Coverage
What all vendors claim to do is capture data from 100% of your users. When looking closer you see that many of these solutions - especially the "Freemiums" - rely on the W3C Navigation Timings. So my question is: How can I cover ALL users with W3C timings when these timings are NOT AVAILABLE in all browsers?

W3C timings are only available in new browsers. So what about IE6, IE7, IE8, the whole Safari browser family, and older Firefox and Chrome versions? Looking at current statistics, they add up to 35% of the overall market share (http://www.w3counter.com/globalstats.php). Vendors that rely on these timings and still claim to capture every user's experience are simply not being accurate.

The performance impact of monitoring
After finding that out I asked myself: "Are there any more deficiencies to be found?"

I first thought about the collection mechanism, which reminded me of the challenges all Web Analytics tools have. Data collection relies on the browser's onUnload event: RUM tools have to collect data until the last second of the page's lifecycle and then send it off. Most SaaS vendors use an image GET request to send the data to their collection instances. Modern browsers such as Chrome optimize this case ("Why should a browser download an image if the page is about to die?") and either do not execute the request at all or do not wait for the response once the data has been sent. So again, I am losing data from my real end users. The workaround some vendors put in place is a timeout in the onUnload event handler. I have seen timeouts of up to 500 ms, which delay the next page that gets loaded. We want to improve user experience and performance, but these tools force the user to wait longer to move to the next page.

So we are losing all the old browsers and, in addition, the modern ones that do not execute the data collection requests. We are now far away from 100% coverage.

Do the math
Another argument you always hear is that the RUM solution allows you to find out more about the end user environment's impact on page performance. The geographical region of the end user, the browsers, the OS or device can result in slow page performance. But does this really work?

Let's do some simple math and figure out what this means for a site with 1 000 000 visits a day:

  • 1 000 000 overall visits/day
  • 1 000 000 - 35% (visits with no W3C timing support in the browser) = 650 000
  • 650 000 - 20% (visits where the data is not sent at all or arrives incomplete) = 520 000
  • 520 000 captured visits per day

Figure 1: Only 52% of visitors are captured by most RUM vendors due to limitations of browsers
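
The coverage arithmetic above can be reproduced in a few lines. This is only a sketch of the calculation, using the article's 35% and 20% estimates; the function and constant names are made up for illustration.

```python
# Coverage funnel from the text. The two percentages are the article's
# estimates, not measured values; all names here are illustrative.
TOTAL_VISITS = 1_000_000
NO_W3C_TIMING_PERCENT = 35   # visits from browsers without W3C timings
LOST_DATA_PERCENT = 20       # visits whose data is not sent or is incomplete


def captured_visits(total, no_timing_percent, lost_percent):
    """Apply both losses in sequence using integer arithmetic."""
    with_timing = total * (100 - no_timing_percent) // 100
    return with_timing * (100 - lost_percent) // 100


captured = captured_visits(TOTAL_VISITS, NO_W3C_TIMING_PERCENT, LOST_DATA_PERCENT)
coverage_percent = captured * 100 / TOTAL_VISITS

print(f"captured visits/day: {captured}")
print(f"coverage: {coverage_percent:.0f}%")
```

Changing either percentage shows how quickly the captured base shrinks before any segmentation is applied.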

So we have reduced our base from 1 000 000 to 520 000. Let's start with the breakdown into the different groupings:

  • 520 000 broken down by 100 countries
  • 520 000/100 = 5 200 visits/country/day
  • 5 200 visits per country broken down by 20 browser versions
  • 5 200/20 = 260 visits/country/browser version/day

Let's break the 260 visits down further by 10 operating systems:

  • 260/10 = 26 visits/country/browser version/operating system/day

We want to have data on an hourly basis:

  • 26/24 ~ 1 visit/country/browser version/operating system/hour

**1 000 000 visits per day ~ 1 visit/country/browser version/operating system/hour! We have done no sampling, we have only country-level data, and we are looking at visits, not page views!**
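
The breakdown can also be written as a small script, which makes it easy to try other segment counts. It is a sketch that assumes, as the text does, an even split across every dimension; the segment counts are the article's round numbers and the names are illustrative.

```python
from fractions import Fraction

CAPTURED_VISITS_PER_DAY = 520_000

# Successive dimensions and their sizes, taken from the text.
# An even distribution across each dimension is assumed.
SEGMENTS = [
    ("country", 100),
    ("browser version", 20),
    ("operating system", 10),
    ("hour", 24),
]


def breakdown(visits, segments):
    """Return (dimension, visits per bucket) after each successive even split."""
    steps = []
    remaining = Fraction(visits)
    for name, count in segments:
        remaining /= count
        steps.append((name, remaining))
    return steps


steps = breakdown(CAPTURED_VISITS_PER_DAY, SEGMENTS)
for name, value in steps:
    print(f"after splitting by {name}: {float(value):.2f} visits per bucket")
```

The last step comes out at 13/12, a little over one visit per bucket per hour, which is the "~ 1 visit" figure quoted above.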

To clarify: this calculation assumes that visits are evenly distributed across all countries. It does not take into account that most solutions sample at a rate of 1-20%, and it counts visits with multiple page views rather than unique URIs. That makes it a best-case scenario; in reality it can be even worse.
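
To get a feel for the sampling effect, the 26 visits per country, browser version and operating system per day can be scaled by the two ends of the 1-20% range. This is a rough sketch; it assumes sampling is applied uniformly, and the names are illustrative.

```python
from fractions import Fraction

# 26 visits/country/browser version/operating system/day, from the calculation above.
VISITS_PER_SEGMENT_PER_DAY = 26
# Both ends of the 1-20% sampling range mentioned in the text.
SAMPLING_RATES_PERCENT = [1, 20]


def sampled_visits(visits, rate_percent):
    """Visits that remain when only rate_percent of them are recorded."""
    return Fraction(visits * rate_percent, 100)


sampled = {
    rate: sampled_visits(VISITS_PER_SEGMENT_PER_DAY, rate)
    for rate in SAMPLING_RATES_PERCENT
}
for rate, visits in sampled.items():
    print(f"{rate}% sampling: {float(visits):.2f} visits/segment/day")
```

Even at 20% sampling only about five visits per segment per day remain, and at 1% it is roughly one visit every four days.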

So then, why is Real User Monitoring so popular?...
...because it helps you to improve your users' experience! How can that work when we know that we might not capture data from all our end users? You only have to change your expectations of what you want to achieve with Real User Monitoring.

What you should expect from your RUM solution is:

  • Support for all browsers - not only the new browsers
  • A reliable data sending mechanism
  • W3C timings support
  • Functional Health information like errors from JavaScript and HTTP - not only timings
  • AJAX/XHR-requests timing - not only timings for page loads
  • The click path of a whole visit - not only separate page views
  • Support for desktop browsers, mobile browsers and mobile native applications in combined view
  • Landing and Exit page analysis

If your selected solution provides all these features, you can go a step further and not only monitor your users but do real User Experience Management (UEM). A few short examples show what that allows you to do.

Example 1: JavaScript Errors - Which one to fix first?
If your RUM/UEM solution provides you with JavaScript errors, you can start fixing problems right away. It should be able to show you which messages appear how often in which browser, as shown in Figure 2.

Figure 2: Detailed JavaScript error messages are captured for every visit and are easily accessible, grouped by browser, OS or geo-location
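
The "which one to fix first" question comes down to counting errors per message and browser and sorting by frequency. The sketch below shows the idea only; the records are made-up placeholders, not data from any real tool, and a RUM/UEM product would supply them through its own export or API.

```python
from collections import Counter

# Hypothetical (message, browser) records, invented for illustration.
ERRORS = [
    ("TypeError: x is undefined", "IE8"),
    ("TypeError: x is undefined", "IE8"),
    ("TypeError: x is undefined", "Firefox"),
    ("ReferenceError: foo is not defined", "Chrome"),
]


def rank_errors(records):
    """Count occurrences per (message, browser) pair, most frequent first."""
    return Counter(records).most_common()


ranking = rank_errors(ERRORS)
for (message, browser), count in ranking:
    print(f"{count:>3}  {browser:<8} {message}")
```

The same grouping works for OS or geo-location by swapping the second field of each record.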

Example 2: Why are my customers leaving my web site?
With UEM you can see not only that your customers are leaving your web site, but also whether they had technical issues before they left (see Figure 3).

Figure 3: Looking at Exit Pages and correlating it with Failure Rate, Performance and User Experience allows us to quickly identify why visitors leave the website on these pages

Example 3: What did my customer do on the application before he called our support center?
Having every visit and all actions available makes it easy for the support center employees to look up the visit information as part of the triage process (see Figure 4).

Figure 4: Seeing all actions the visitor really executed on the website helps speed up the complaint process as all facts are available

Example 4: Correlating Performance to Business
Analyzing the performance of every single visit and all actions not only allows us to pinpoint problems on individual pages, in certain browsers or in geographical regions. It also allows us to correlate problems in the application with business results. Knowing how much revenue is lost due to degraded performance gives application owners better arguments when discussing investments in infrastructure or additional R&D resources. The dashboard shown in Figure 5 correlates Response Time with the number of Visitors by Continent and the generated Orders. Infrastructure problems that lead to performance problems in the application can then easily be correlated with lost revenue:

Figure 5: Correlating Business Values such as number of Orders with Page Performance and Infrastructure Health opens a new line of communication between Business and Application Owners

Conclusion
W3C timings give us great insight, but they are only available in new browsers. Be aware of what your RUM solution vendor promises you and do not forget the simple math. Set your expectations right and look for solutions that support visits and health indicators like HTTP errors and JavaScript errors. Go Real with the right expectations.

More Stories By Klaus Enzenhofer

Klaus Enzenhofer has several years of experience and expertise in the field of Web Performance Optimization and User Experience Management. He works as Technical Strategist in the Center of Excellence Team at dynaTrace Software. In this role he influences the development of the dynaTrace Application Performance Management Solution and the Web Performance Optimization Tool dynaTrace AJAX Edition. He mainly gathered his experience in web and performance by developing and running large-scale web portals at Tiscover GmbH.
