Microservices Expo: Article

How to Triple Throughput and Improve Application Performance …

…through end-to-end testing

Thanks to the great guys who help our customers with their application performance problems, we can share some of their stories in this article. We hope that you, as someone responsible for application performance in your own organization, can use these findings to prevent the common problem patterns we see out there in the real world.

I want to highlight some typical problems in web applications that can easily be identified through load testing, and whose fixes can lead to significant improvements in throughput and performance. In this case, average transaction response time improved by 94% and throughput tripled. It was all possible by fixing deployment problems on the Web Server. Here is the story of how they did it!

Challenge: Is End User Response Time Unacceptable or Not? If So - Where Is the Problem?
Load tests are great. They tell you whether your application can handle the simulated load while staying within acceptable response times for the tested transactions. When you look only at the average response time as measured on the web servers, however, it is hard to tell:

  • Do we have a performance problem at all?
  • How can we improve the performance?

Figure 1 shows a typical graph you get from a load testing tool or by analyzing your web server logs. The test simulated a constant load after a short warm-up period. The results show that Average Transaction Response Time increased slightly over time, with one outlier of up to 3 seconds. The throughput of the system (Transaction Count), on the other hand, went down slightly. This can be expected when response time goes up. The question is - is this a problem? Is an average of 1.5s a bad user experience?

Figure 1: Declining Transaction Performance on both web servers also leads to less throughput

Do Not Trust Average Values: Focused analysis is required to identify problems!
One lesson that all of our customers have learned is that you do not want to analyze your performance by looking at the average execution time of all of your simulated transactions. This gives a wrong picture: certain transactions will always be fast because they are optimized, while others are slow because there really is a problem. If you look at all of them at once - and then just at averages - it is very likely that you will never find the problem, as it hides behind the statistically calculated values.

Therefore you need to focus your analysis on the individual transaction types that you test. Figure 2 shows a performance breakdown of the individual tested transactions. It shows that certain transactions have a significant increase in response time while others have only a slight increase. On average the application is not performing too badly - but it is these individual transactions under load that are the real problem for the end users. It is even worse if these are the transactions that are critical to your application:

Figure 2: Different transaction types perform differently. Looking at overall averages would not reveal these problems

The breakdown by tested transaction shows us that at least two transactions had response time spikes of up to 21s. One of them is the Login transaction, which is very critical to the application. Now it's time to focus our next analysis step on these transactions in order to get rid of the "statistical noise" of the other transactions that actually ran fine.
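To make the point about averages concrete, here is a minimal sketch of a per-transaction breakdown. The sample data, the transaction names other than Login, the 3-second limit, and the function names are all assumptions made up for illustration; they are not taken from the customer's test.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical load-test samples: (transaction name, response time in seconds).
samples = (
    [("Home", 0.25)] * 20
    + [("Search", 0.5)] * 20
    + [("Login", 0.5)] * 3
    + [("Login", 21.0)]  # a single spike, as in the 21s outliers described above
)

def overall_average(samples):
    """Average response time across all transactions, regardless of type."""
    return mean(seconds for _, seconds in samples)

def breakdown(samples):
    """Group samples by transaction name and report count, average and max."""
    by_name = defaultdict(list)
    for name, seconds in samples:
        by_name[name].append(seconds)
    return {
        name: {"count": len(times), "avg": mean(times), "max": max(times)}
        for name, times in by_name.items()
    }

def slow_transactions(samples, max_allowed):
    """Names of transactions whose worst response time exceeds max_allowed."""
    stats = breakdown(samples)
    return sorted(name for name, s in stats.items() if s["max"] > max_allowed)

if __name__ == "__main__":
    print(f"overall average: {overall_average(samples):.2f}s")
    for name, s in breakdown(samples).items():
        print(f"{name:8s} avg={s['avg']:.2f}s max={s['max']:.2f}s n={s['count']}")
    print("needs a closer look:", slow_transactions(samples, max_allowed=3.0))
```

With this made-up data the overall average stays below one second, while the per-transaction view singles out Login. In practice a percentile per transaction would be a reasonable addition to the maximum.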

Look at the End-to-End View: It shows you where your problems are
The next step in the problem analysis is to look beyond the measured response time on the web server. Analyzing the full end-to-end view reveals which component in the infrastructure contributes the most to the overall response time. This allows you to attack the problem where it happens without trying to improve components that may actually work really well. Figure 3 shows the Transaction Flow Visualization of each individual request that was generated during the load test for the one transaction type we are focused on. Instead of just showing the response time as perceived by the end user (or simulated virtual user), it shows how much each component along the transaction execution contributed to the response time. It is easy to spot that this problem is not related to the 4 Java Application Servers but can be found on the two load-balanced Web Servers, where 87% of the time is spent:

Figure 3: Analyzing the flow of the tested transaction reveals the component we need to focus our performance analysis on
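The idea behind this view can be sketched in a few lines, assuming you have per-request timings for each tier. The tier names, the millisecond values, and the function names below are hypothetical and chosen only so that the web tier ends up with the 87% share mentioned above; a real tracing tool would supply this data.

```python
# Hypothetical time (in ms) spent in each tier for individual requests
# of the one transaction type under analysis.
requests = [
    {"web_server": 870, "app_server": 100, "database": 30},
    {"web_server": 1740, "app_server": 180, "database": 80},
]

def tier_contribution(requests):
    """Return each tier's share of the total response time, in percent."""
    totals = {}
    for request in requests:
        for tier, ms in request.items():
            totals[tier] = totals.get(tier, 0) + ms
    grand_total = sum(totals.values())
    return {tier: 100.0 * ms / grand_total for tier, ms in totals.items()}

def biggest_contributor(requests):
    """Name of the tier where most of the response time is spent."""
    shares = tier_contribution(requests)
    return max(shares, key=shares.get)

if __name__ == "__main__":
    for tier, share in tier_contribution(requests).items():
        print(f"{tier:12s} {share:5.1f}%")
    print("focus on:", biggest_contributor(requests))
```

Summing over all requests of one transaction type, rather than averaging across every transaction, keeps the analysis focused on the slow transaction identified earlier.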

Typical Problem Patterns on the Web Server
I recently wrote about the typical deployment problems that happen when moving an application from test to production. In the case described in this blog, the cause was a combination of misconfigured Web Server settings (Max Connections and misconfigured modules). Other problems we typically see are oversized web pages, which put too much load on the web server that has to deliver the content.

Improvement: 3x Throughput and 94% Performance Gain
After fixing the problem the customer can now run up to about 30,000 transactions per Web Server instead of 10,000. The average response time also went down from ~1.19s to ~68ms. Not only is this great for the end-user experience, but it also means that the existing hardware can be leveraged much better and supports many more users than originally anticipated. Figure 4 shows the final charts and transaction flow visualization of a test that was rerun after all identified problems had been addressed:

Figure 4: Much Higher and Constant Throughput and Performance after fixing the identified performance problems
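The headline numbers follow directly from the before and after values quoted above. As a quick check, using only the approximate figures from the text (the function names are made up for this sketch):

```python
def percent_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return 100.0 * (before - after) / before

def throughput_factor(before, after):
    """How many times higher the throughput is after the fix."""
    return after / before

# Approximate values from the article: ~1.19s -> ~68ms, 10,000 -> 30,000.
response_gain = percent_reduction(1.19, 0.068)  # both in seconds
factor = throughput_factor(10_000, 30_000)      # transactions per Web Server

print(f"{response_gain:.0f}% lower response time, {factor:.0f}x throughput")
# prints "94% lower response time, 3x throughput"
```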

There Is More: Browser, CDNs, Network, Web Servers, Application Servers, Databases...
Obviously, problems are not always confined to one component. Typically, when you address one problem, the bottleneck shifts to the next component, e.g., too many database calls executed per transaction, overly heavy JavaScript libraries in the browser, or cross-application impact in your infrastructure. Here are some links with additional reading material with more stories from the real world:

If you have your own stories that you want to share feel free to contact us.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
