Developers Think Functionality

But less about scalability

Two weeks ago I co-hosted a webinar with one of our users, Bill Mar, Director of Engineering Services at SmithMicro Software. SmithMicro provides the backbone of our digital life by connecting different digital devices. Bill works in the Wireless Business unit on voice-related services such as VoiceSMS and Visual Voicemail – services we have all become used to since we started running around with smartphones such as the iPhone or BlackBerry.

Bill talked about how SmithMicro had to move towards Proactive Performance Management as the company and the user base started to grow. In his presentation he made an interesting but bold statement: Developers Think Functionality – But Less About Scalability.

I was a developer for many years (and still am, as dynaTrace lets me do a little coding on certain features), so I had to think about this statement. At first I didn’t know whether I should agree with him from the perspective of my current role at dynaTrace or be offended as a developer who just likes to code new features. In the end I agreed with him – especially after listening to all he had to say about his day-to-day challenges as Director of Engineering Services.

In the webinar Bill gave some great insight into what they did in order to become more proactive with performance management. He shared recommendations and their Best Practices that have worked for his team. He really told some great stories and had some great analogies. The bold statement I mentioned in the beginning is just a teaser :-)

Problems came with growing business success

Business success is a great thing, and it is what every company is designed to achieve. More active users mean more money spent on the products or services you sell. If you provide Software as a Service, as SmithMicro does, and you start with a rather small user base, you don’t necessarily run into software-related issues right away. SmithMicro started noticing usage peaks during the year, for example during the holiday season or at New Year’s, when people send their best wishes to friends and family using their digital services. With their growing success, however, more volume-related issues bubbled up to the surface. It was rather easy to find the initial load-related problems by digesting log files and looking at exception stack traces. Even though this process took a certain amount of time, it was still fast enough to react to problems reported by a rather small user base.

Problems happen faster if you drive faster

When driving 100 miles an hour you have much less time to react in order to avoid a fatal crash than when driving at 10 miles an hour. The same is true for online business. If you have 100 transactions an hour, you may lose the business of a hundred users if it takes you an hour to fix a problem. If you have 100 transactions per second (TPS), you will lose a whole lot of money in that same hour. Bill faced this problem as they reached 100 TPS. Looking at log files and analyzing exception stack traces was no longer fast enough to react to problems and avoid losing business. There is a two-pronged approach to this problem:
a) don’t allow code with potential scalability issues to end up in production, and
b) bring tools into production that allow Operations to react more proactively (an early alerting system) and that equip developers with all the information they need without having to analyze log files.
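
To make the comparison concrete, here is a minimal sketch of the arithmetic behind it. The function name is made up for illustration, and it assumes every transaction arriving during the outage is affected, which is a simplification.

```python
def transactions_at_risk(rate_per_second: float, outage_seconds: float) -> int:
    """Number of transactions that arrive while a problem stays unfixed."""
    return round(rate_per_second * outage_seconds)


ONE_HOUR = 3600  # seconds

# 100 transactions per hour versus 100 transactions per second,
# both with one hour needed to fix the problem.
low_volume = transactions_at_risk(100 / ONE_HOUR, ONE_HOUR)
high_volume = transactions_at_risk(100, ONE_HOUR)

print(low_volume, high_volume)
```

The same one-hour fix time touches 3,600 times as many transactions at the higher rate, which is why reading log files stops being fast enough.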

Developers need to understand their code and the real use case scenarios

Bill mentioned several interesting things on that topic and started with another great analogy: the plan used to build a house is not the same as the plan of the house as it was actually built. To have a clear understanding of what is really going on in the application, it is important to have plans of “the real” architecture. It is hard and not always practical to maintain blueprints or class diagrams because software is very dynamic, and changes often happen because they have to happen, with nobody thinking about updating the documentation. A Best Practice therefore is that developers and architects need to understand the current architecture as it is, not as they think it should be.

SmithMicro uses dynaTrace Sequence Diagrams from Real-Life Transactions instead of manually maintained UML Diagrams

On the topic of scalability, Bill talked about focusing early on things like memory allocation, performance and scalability of critical components. Coming back to his initial bold statement about developers focusing on functionality, he made it clear that functionally ready doesn’t necessarily mean production ready. With some longer-running local tests that exercise real use-case scenarios, developers can easily identify problems like excessive memory consumption or poorly performing code using simple load generators and profiling tools. Scalability is a key requirement, and understanding the real use cases used to verify scalability is another Best Practice for proactive performance management.
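
As a rough illustration of what such a local test could look like, the sketch below drives a function from several threads and reports timings and peak memory using only the Python standard library. It is not what SmithMicro or dynaTrace use: `handle_request` is a hypothetical stand-in for the code under test, and the request count, worker count and payload size are arbitrary assumptions.

```python
import statistics
import time
import tracemalloc
from concurrent.futures import ThreadPoolExecutor


def handle_request(payload_size: int) -> int:
    """Hypothetical stand-in for the code under test."""
    data = [i * 2 for i in range(payload_size)]
    return sum(data)


def timed_call(payload_size: int) -> float:
    """Run one request and return its duration in seconds."""
    start = time.perf_counter()
    handle_request(payload_size)
    return time.perf_counter() - start


def run_load(requests: int, workers: int, payload_size: int) -> dict:
    """Fire `requests` calls from `workers` threads and summarize the run."""
    tracemalloc.start()  # track Python memory allocations during the run
    with ThreadPoolExecutor(max_workers=workers) as pool:
        durations = list(pool.map(timed_call, [payload_size] * requests))
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "requests": len(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
        "peak_bytes": peak_bytes,
    }


result = run_load(requests=200, workers=8, payload_size=10_000)
print(result)
```

Running this for longer and with payloads modeled on real use cases is what turns it from a smoke test into the kind of scalability check described above.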

SmithMicro looking at individual PurePaths captured under load to identify scalability issues and performance bottlenecks

Operations needs early indicators and an understanding of how the applications work

Not all problems can be avoided by being proactive in development. Another Best Practice from SmithMicro is therefore to give Operations everything they need to identify problems early. It also helps them understand what to do when problems are on the horizon, without having to call in engineering every time a dashboard indicates an issue.

Operations therefore needs early indicators such as trend changes in transaction response times, memory consumption, garbage collection activity, and the number and execution time of database queries. To capture this information the right set of tools needs to be brought in: tools that are very lightweight to avoid unnecessary overhead but that provide enough information for both Operations and developers to analyze the problems that occur. Traditional monitoring tools that only watch individual silos of the application stack (e.g. web server, app server, network, database) only help to identify problematic regions. For Operations to understand a problem and for developers to identify its root cause, it is important to have End-to-End transactional tracing with the ability to view the data at a high level as well as in depth.
A high-level view provides Operations with enough data to identify performance trends and hotspots in their application infrastructure.

High-Level Operations Memory Dashboard used to identify trends in Memory Allocations, Usage and Garbage Collection Activity
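
The idea of an early indicator based on a trend change can be sketched in a few lines. This is an illustrative assumption about how such a check might work, not how the dashboards above are implemented: the window sizes, the 1.5x threshold and the sample response times are all made up.

```python
from statistics import mean


def trend_alert(samples, baseline_size=5, recent_size=3, threshold=1.5):
    """Return True when the recent average exceeds the baseline average by `threshold` times."""
    if len(samples) < baseline_size + recent_size:
        return False  # not enough data to compare two windows
    baseline = mean(samples[:baseline_size])
    recent = mean(samples[-recent_size:])
    return recent > baseline * threshold


# Response times in milliseconds (made-up values).
steady = [120, 118, 125, 122, 119, 121, 124, 120]
degrading = [120, 118, 125, 122, 119, 160, 210, 260]

print(trend_alert(steady), trend_alert(degrading))
```

A check like this fires while response times are still climbing, before users start reporting errors. The same comparison applies to memory consumption or garbage collection activity.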

The In-Depth view on the same collected data provides developers with enough method and component-level data for problem analysis without having to digest log files and stack traces:

Low Level Database Dashboard shows Database Activity as well as individual SQL Statements and their Bind Variables

Developers tend to be curious and often try things that they shouldn’t: The goal for Bill is that Operations can do a better job of being proactive and does not need to call in developers every time a dashboard shows RED. With such early indicators and a better understanding of the application and its dependencies on all involved components, Operations can solve many of the production problems on their own. The problem they often ran into was that developers were rather “relaxed” when troubleshooting problems in production, often causing more problems than the ones they were working on.
As Bill said: “If you don’t know it’s gonna work – you shouldn’t try it”. To prevent this situation it is important for SmithMicro to extract all the information developers need from the production system, so that developers can understand what is going on without needing to “mess with the real world” (I am still not offended by those comments :-) )

Where is SmithMicro heading?

The overall goal for Bill and his team is to become more proactive when it comes to performance management. They want to enable Operations to become more self-sufficient by extending their knowledge about application internals and giving them early indicators of problems they can react to. They also want to make it easier for developers to understand what is really going on in their application, especially by spreading that knowledge across cross-functional teams.

Bill’s recommendations

At the end, Bill gave his recommendations to the rest of us.

  • Understand your use-case scenarios
    • What are your 5-15 main use case scenarios?
    • Model these use case scenarios and monitor them
    • By doing this you become proactive.
  • Developers
    • Understand how the application works and
    • Understand the real life requirements that come from operations
  • Operations
    • Understand the run-time behaviour of the application
    • Look at trending and early indicators
    • Have actionable data for developers
  • By following such a process you become more proactive, and ensure your Application is Ready for Production
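
The first recommendation, modeling the main use-case scenarios and monitoring them, could be sketched roughly as follows. The scenario names are loosely based on the services mentioned earlier, but the names, budgets and timings are hypothetical and only serve to show the shape of the idea.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Scenario:
    """One main use case with a response-time budget (values are illustrative)."""
    name: str
    budget_ms: float
    timings_ms: list = field(default_factory=list)

    def record(self, duration_ms: float) -> None:
        """Store one measured execution time for this scenario."""
        self.timings_ms.append(duration_ms)

    def over_budget(self) -> bool:
        """True when the average recorded time exceeds the budget."""
        return bool(self.timings_ms) and mean(self.timings_ms) > self.budget_ms


# Hypothetical scenarios and budgets.
scenarios = {
    "send_voice_sms": Scenario("send_voice_sms", budget_ms=200),
    "fetch_voicemail": Scenario("fetch_voicemail", budget_ms=500),
}

# Made-up measurements, as they might come from a test run or from production.
for duration in (150, 180, 170):
    scenarios["send_voice_sms"].record(duration)
for duration in (450, 700, 650):
    scenarios["fetch_voicemail"].record(duration)

needs_attention = [s.name for s in scenarios.values() if s.over_budget()]
print(needs_attention)
```

Keeping the list to the 5-15 scenarios that matter most gives developers and Operations a shared, short set of numbers to watch.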

Further Information

I really hope this summary blog of the webinar made you want to hear more about it and actually listen to the recorded webinar. Follow this link and listen to what Bill and I had to say about Proactive Performance Management. There is also some other stuff that you might be interested in, like The Practical Guide to Performance Management in Development (How we at dynaTrace do it internally), Best Practices from Zappos on Performance Management and Alois’s Blogs in his Performance Almanac.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
