Click here to close now.


Microservices Expo Authors: Pat Romanski, SmartBear Blog, Yeshim Deniz, Mike Kavis, Liz McMillan

Related Topics: Microservices Expo, Java IoT, Containers Expo Blog, IoT User Interface, @CloudExpo, Apache

Microservices Expo: Article

How to Triple Throughput and Improve Application Performance …

…through end-to-end testing

Thanks to the great guys who help our customers with their application performance problems we can share some of their stories in this article. We hope you - responsible for application performance in your own organization - can leverage these findings in order to prevent these common problem patterns we see out there in the real world.

I want to highlight some typical problems in web applications that can easily be identified through load testing and can lead to significant improvements in throughput and performance. In this case a 94% faster transaction performance was achieved and throughput could be tripled. It was all possible by fixing deployment problems on the Web Server. Here is story on how they did it!

Challenge: Is End User Response Time Unacceptable or Not? If So - Where Is the Problem?
Load tests are great. They tell you whether your application can handle the simulated load by staying within the acceptable response times for the tested transactions. When just looking at the average response time as measured on the web servers it will be hard to tell:

  • Do we have a performance problem at all?
  • How can we improve the performance?

Figure 1 shows a typical graph you get from a load testing tool or by analyzing your web server logs. The test that was executed simulated constant load after a short warm-up period. The results show that Average Transaction Response Time increased slightly over time with one outlier up to 3 seconds. The throughput of the system (Transaction Count) on the other side went slightly down. This can be expected when response time goes up. The question is - is this a problem? Is an average of 1.5s bad User Experience?

Figure 1: Declining Transaction Performance on both web servers also leads to less throughput

Do Not Trust Average Values: Focused analysis is required to identify problems!
One lesson that all of our customers have learned is that you do not want to analyze your performance by looking at the average execution time of all of your simulated transactions. This would give a wrong picture as certain transactions will always be fast because they are optimized where others are slow because there really is a problem. If you look at all of them at once - and then just at averages - it is very likely that you never find that you actually have a problem as it will hide behind the statistically calculated values.

Therefore you need to focus your analysis on individual transaction types that you test. Figure 2 shows a performance breakdown of the individual tested transactions. Figure 1 shows that certain transactions have a significant increase in response time where others only have a slight increase. On average the application is not performing too badly - but it is these individual transactions under load that are the real problem for the end users. Even worse if these are the transactions that are critical to your application:

Figure 2: Different transaction types perform differently. Looking at overall averages would not reveal these problems

The breakdown by tested transaction shows us that there are at least two transactions that showed spikes of up to 21s to execute. One of them is the Login transaction that is very critical to the application. Now it's time to focus our next analysis step on these transactions in order to get rid of the "statistical noise" of the other transactions that actually ran fine.

Look at the End-to-End View: It shows you where your problems are
The next step in the problem analysis is to look beyond the measured response time on the web server. Analyzing the full end-to-end view reveals which component in the infrastructure contributes the most to the overall performance. This allows you to attack the problem where it happens without trying to improve components that may actually work really well. Figure 3 shows the Transaction Flow Visualization of each individual request that was generated during the load test for the one transaction type we are focused on. Instead of just showing response as perceived by the end user (or virtual simulated user) it shows which component along the transaction execution contributed how much to the response time. It is easy to spot that this problem is not related to the 4 Java Application Server but can be found on the two load balanced Web Servers where 87% of the time is spent:

Figure 3: Analyzing the flow of the tested transaction reveals the component we need to focus our performance analysis on

Typical Problem Patterns on the Web Server
I recently wrote about the typical deployment problems that happen when moving an application from test to production: In the case of this blog it was a combination of misconfigured Web Server Settings (Max Connections and Misconfigured Modules). Other problems we typically see are oversized web pages leading to too much load on the web server to deliver that content.

Improvement: 3x Throughput and 94% Performance Gain
After fixing the problem the customer can now run about up to 30,000 transactions per Web Server instead of 10,000. The average response time also went down from ~1.19s to ~68ms. Not only is this great for the end-user experience but it also means that the existing hardware can be much better leveraged and supports many more users than originally anticipated. Figure 4 shows the final charts and transaction flow visualization of a test that was re-ran after all problems identified could be addressed:

Figure 4: Much Higher and Constant Throughput and Performance after fixing the identified performance problems

There Is More: Browser, CNDs, Network, Web Servers, Application Servers, Databases...
Obviously problems cannot always just be found in one component. Typically when you address one problem the problem shifts to the next, e.g., too many database calls executed per transaction, too heavy JavaScript libraries in the browser or cross-application impact in your infrastructure. Here are some links with additional reading material with more stories from the real world:

If you have your own stories that you want to share feel free to contact us.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.

@MicroservicesExpo Stories
JFrog has announced a powerful technology for managing software packages from development into production. JFrog Artifactory 4 represents disruptive innovation in its groundbreaking ability to help development and DevOps teams deliver increasingly complex solutions on ever-shorter deadlines across multiple platforms JFrog Artifactory 4 establishes a new category – the Universal Artifact Repository – that reflects JFrog's unique commitment to enable faster software releases through the first pla...
Saviynt Inc. has announced the availability of the next release of Saviynt for AWS. The comprehensive security and compliance solution provides a Command-and-Control center to gain visibility into risks in AWS, enforce real-time protection of critical workloads as well as data and automate access life-cycle governance. The solution enables AWS customers to meet their compliance mandates such as ITAR, SOX, PCI, etc. by including an extensive risk and controls library to detect known threats and b...
Ten years ago, there may have been only a single application that talked directly to the database and spit out HTML; customer service, sales - most of the organizations I work with have been moving toward a design philosophy more like unix, where each application consists of a series of small tools stitched together. In web example above, that likely means a login service combines with webpages that call other services - like enter and update record. That allows the customer service team to writ...
Several years ago, I was a developer in a travel reservation aggregator. Our mission was to pull flight and hotel data from a bunch of cryptic reservation platforms, and provide it to other companies via an API library - for a fee. That was before companies like Expedia standardized such things. We started with simple methods like getFlightLeg() or addPassengerName(), each performing a small, well-understood function. But our customers wanted bigger, more encompassing services that would "do ...
Clearly the way forward is to move to cloud be it bare metal, VMs or containers. One aspect of the current public clouds that is slowing this cloud migration is cloud lock-in. Every cloud vendor is trying to make it very difficult to move out once a customer has chosen their cloud. In his session at 17th Cloud Expo, Naveen Nimmu, CEO of Clouber, Inc., will advocate that making the inter-cloud migration as simple as changing airlines would help the entire industry to quickly adopt the cloud wit...
The APN DevOps Competency highlights APN Partners who demonstrate deep capabilities delivering continuous integration, continuous delivery, and configuration management. They help customers transform their business to be more efficient and agile by leveraging the AWS platform and DevOps principles.
Our guest on the podcast this week is Jason Bloomberg, President at Intellyx. When we build services we want them to be lightweight, stateless and scalable while doing one thing really well. In today's cloud world, we're revisiting what to takes to make a good service in the first place. Listen in to learn why following "the book" doesn't necessarily mean that you're solving key business problems.
Apps and devices shouldn't stop working when there's limited or no network connectivity. Learn how to bring data stored in a cloud database to the edge of the network (and back again) whenever an Internet connection is available. In his session at 17th Cloud Expo, Bradley Holt, Developer Advocate at IBM Cloud Data Services, will demonstrate techniques for replicating cloud databases with devices in order to build offline-first mobile or Internet of Things (IoT) apps that can provide a better, ...
Culture is the most important ingredient of DevOps. The challenge for most organizations is defining and communicating a vision of beneficial DevOps culture for their organizations, and then facilitating the changes needed to achieve that. Often this comes down to an ability to provide true leadership. As a CIO, are your direct reports IT managers or are they IT leaders? The hard truth is that many IT managers have risen through the ranks based on their technical skills, not their leadership ab...
Despite all the talk about public cloud services and DevOps, you would think the move to cloud for enterprises is clear and simple. But in a survey of almost 1,600 IT decision makers across the USA and Europe, the state of the cloud in enterprise today is still fraught with considerable frustration. The business case for apps in the real world cloud is hybrid, bimodal, multi-platform, and difficult. Download this report commissioned by NTT Communications to see the insightful findings – registra...
Application availability is not just the measure of “being up”. Many apps can claim that status. Technically they are running and responding to requests, but at a rate which users would certainly interpret as being down. That’s because excessive load times can (and will be) interpreted as “not available.” That’s why it’s important to view ensuring application availability as requiring attention to all its composite parts: scalability, performance, and security.
“All our customers are looking at the cloud ecosystem as an important part of their overall product strategy. Some see it evolve as a multi-cloud / hybrid cloud strategy, while others are embracing all forms of cloud offerings like PaaS, IaaS and SaaS in their solutions,” noted Suhas Joshi, Vice President – Technology, at Harbinger Group, in this exclusive Q&A with Cloud Expo Conference Chair Roger Strukhoff.
As we increasingly rely on technology to improve the quality and efficiency of our personal and professional lives, software has become the key business differentiator. Organizations must release software faster, as well as ensure the safety, security, and reliability of their applications. The option to make trade-offs between time and quality no longer exists—software teams must deliver quality and speed. To meet these expectations, businesses have shifted from more traditional approaches of d...
As the world moves towards more DevOps and microservices, application deployment to the cloud ought to become a lot simpler. The microservices architecture, which is the basis of many new age distributed systems such as OpenStack, NetFlix and so on, is at the heart of Cloud Foundry - a complete developer-oriented Platform as a Service (PaaS) that is IaaS agnostic and supports vCloud, OpenStack and AWS. In his session at 17th Cloud Expo, Raghavan "Rags" Srinivas, an Architect/Developer Evangeli...
All we need to do is have our teams self-organize, and behold! Emergent design and/or architecture springs up out of the nothingness! If only it were that easy, right? I follow in the footsteps of so many people who have long wondered at the meanings of such simple words, as though they were dogma from on high. Emerge? Self-organizing? Profound, to be sure. But what do we really make of this sentence?
Last month, my partners in crime – Carmen DeArdo from Nationwide, Lee Reid, my colleague from IBM and I wrote a 3-part series of blog posts on We titled our posts the Simple Math, Calculus and Art of DevOps. I would venture to say these are must-reads for any organization adopting DevOps. We examined all three ascpects – the Cultural, Automation and Process improvement side of DevOps. One of the key underlying themes of the three posts was the need for Cultural change – things like t...
There once was a time when testers operated on their own, in isolation. They’d huddle as a group around the harsh glow of dozens of CRT monitors, clicking through GUIs and recording results. Anxiously, they’d wait for the developers in the other room to fix the bugs they found, yet they’d frequently leave the office disappointed as issues were filed away as non-critical. These teams would rarely interact, save for those scarce moments when a coder would wander in needing to reproduce a particula...
In today's digital world, change is the one constant. Disruptive innovations like cloud, mobility, social media, and the Internet of Things have reshaped the market and set new standards in customer expectations. To remain competitive, businesses must tap the potential of emerging technologies and markets through the rapid release of new products and services. However, the rigid and siloed structures of traditional IT platforms and processes are slowing them down – resulting in lengthy delivery ...
In a report titled “Forecast Analysis: Enterprise Application Software, Worldwide, 2Q15 Update,” Gartner analysts highlighted the increasing trend of application modernization among enterprises. According to a recent survey, 45% of respondents stated that modernization of installed on-premises core enterprise applications is one of the top five priorities. Gartner also predicted that by 2020, 75% of
It is with great pleasure that I am able to announce that Jesse Proudman, Blue Box CTO, has been appointed to the position of IBM Distinguished Engineer. Jesse is the first employee at Blue Box to receive this honor, and I’m quite confident there will be more to follow given the amazing talent at Blue Box with whom I have had the pleasure to collaborate. I’d like to provide an overview of what it means to become an IBM Distinguished Engineer.