Microservices Expo: Article

Genesis of a Genetic Algorithm

Understanding GAs from business to implementation

Have you ever wondered how a good idea is transformed into business value? Have you ever wondered how someone takes an abstract idea and produces business value from apparent nothingness? Have you ever asked what you can do to leverage the assets at your disposal for the greater good? If so, then sit tight, because you are about to take a journey through the step-by-step actions that FireScope took when it transformed its corporate-wide historical metrics from a passive data asset into the next level of business intelligence.

With unique insight, FireScope is able to identify potentially hidden relationships among your IT assets and reveal cause-and-effect metrics that you might not have even known existed. This wizardry is accomplished using a distinctive genetic algorithm solution, yet it is as simple to execute as a few mouse clicks. In this article we will introduce the business idea that triggered subsequent investigations, which led to initial analysis and ultimately to the completion of FireScope's genetic algorithm implementation. This journey will cover several topics, such as the business proposition, data normalization, and genetic modeling. At the end of the journey you'll recognize that the combination of metric collection and data comparison delivered by FireScope Inc. is unmatched in the IT industry.

But do take caution, because the latter portion of this article assumes that you already possess a basic understanding of what a genetic algorithm is, as well as some of the concepts that are used in modeling one. Fear not if you are not yet there, as you can still garner significant benefit from this article without understanding the nuts and bolts of it. Let the journey begin.

FireScope from 30,000 feet
As background, a functioning FireScope deployment has the ability to gather metrics from all forms of existing IT assets, normalize the gathered metrics, provide historical analysis of the metrics, and most importantly provide service views for worldwide operations which are unparalleled in the IT industry.

FireScope collects a vast array of corporate-wide metrics. Some examples are CPU utilization, disk storage, host temperature, interface traffic, and memory utilization. Other examples include database metrics, JMX metrics, NetApp metrics, VMware metrics, web server response time, and so many more that we've neither the time nor the space to mention them all. It is also important to understand that every metric gathered by FireScope is collected on its own schedule. Even two or more metrics that have the same collection interval are not collected at exactly the same instant. So with this universe of data, the natural question became: what can we do to leverage this asset and deliver the next level of business intelligence?

Revealing hidden secrets
If you've been around the IT world long enough, then you've likely experienced a time when an update was made to a web server which invoked new or existing services from an application server, which in turn caused deadlocks on your database server. Unfortunately, the deadlocks did not occur in test because they were load related, and as a result they weren't uncovered until your public-facing application was placed under heavy load on Cyber Monday. Don't worry, you didn't need those sales anyway!

Now to be fair, anyone who consistently monitors their IT infrastructure can identify that their web server is not performing as expected. Furthermore, if you know all of the relationships between your web servers, application servers, and database servers, you can even set up static alerts that draw your attention to the notion that one layer of your business is impacting another layer. But where the story gets really interesting is if you fall into one or more of the following categories:

  1. The alerting values that you set are either too high or too low.
  2. You have not taken the time to set up alerts for related IT assets.
  3. You are not aware of the relationships between your IT assets.
  4. You do not properly monitor your IT assets.

I hope you can see that this is a very complex world that we live in. To detect a catastrophic corporate shutdown you need tools that can search out the fact that independent metrics collected from independent servers are impacting one another. This is exactly the problem that FireScope sought to solve when it postulated the use of a genetic algorithm to provide optimal search heuristics and uncover hidden relationships in the very same metrics that it had already collected from your IT assets. But not so fast: in order to accomplish this goal we need the ability to compare disparate metrics.

Not all time is created equal
As noted above, FireScope collects a universe of metrics from a universe of IT assets, and each is collected on its own schedule. As an example, consider CPU utilization taken from two different hosts, both being polled at 30-second intervals. Since each is polled on its own schedule, the collection times do not line up, so we couldn't fairly say that the two metrics had similar signatures. Yet another problem arises when you consider comparing two metrics that have different collection intervals, such as CPU utilization collected every 30 seconds on one asset vs. CPU utilization collected every 5 minutes on another asset. How can these be easily compared?

FireScope needed the ability to easily normalize the time domain from all collected metrics in order to fairly and accurately compare metrics collected on independent schedules. As it turns out, FireScope already trends all metrics that have numeric representations. You can think of trending as averaging over time. But for the purposes of FireScope's genetic algorithm, the trending operation also contributed an effective normalization in the time domain for all metrics having a numeric representation. So problem one is solved, because all numeric metrics are trended every hour starting on the hour.
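The hourly trending described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not FireScope's implementation: the function name `hourly_trend`, the use of epoch-second timestamps, and the sample readings are all invented for the example.

```python
from collections import defaultdict

def hourly_trend(samples):
    """Average (epoch_seconds, value) samples into hourly buckets.

    Returns a dict mapping the epoch second at the start of each hour
    to the mean of the samples collected during that hour.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        hour_start = timestamp - (timestamp % 3600)  # floor to the top of the hour
        buckets[hour_start].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

# Hypothetical readings: two hosts polled on different, unaligned schedules.
host_a = [(0, 10.0), (30, 20.0), (3600, 40.0)]     # 30-second polling
host_b = [(17, 50.0), (317, 70.0), (3917, 90.0)]   # 5-minute polling, offset by 17 s

trend_a = hourly_trend(host_a)
trend_b = hourly_trend(host_b)
```

After trending, both hosts share the same hourly time grid, so their values can be compared slot by slot even though the raw samples never coincided.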

Not all units have the same value
Let's trade one for one. I'll give you a nickel for every dollar you give me. Sound fair? Yeah, I didn't think you would go for that either, but that thought does bring to light the next challenge: comparing metrics that have differing units. The problem becomes even more challenging when you consider that in some instances the units might not even be from the same domain! Consider the chart below, which details just a few metric/unit pairings from your IT assets:



Metric                      Unit
Interface Traffic
CPU Utilization             Percent
Host Temperature            Degrees F or C
Web Server Response Time


How can "Degrees F" be compared to "Percent CPU Utilization"? FireScope needed the ability to compare differing metrics collected across your IT infrastructure each having potentially different metric/unit representations.

Well, what if we compared the relative rise and fall of collected values instead of comparing the values themselves? In short, FireScope calculates the tangent, or rate of change, between hourly trends for all collected metrics as a pre-processing step for its genetic algorithm. As you'll see later, this series of data is used to form a "gene", or genetic sequence, which is then used to compare one metric against another to determine how closely the two signatures rise or fall within the same time frame.
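A rough sketch of this pre-processing step follows. The article does not say how values are scaled before the rate of change is taken, so the min-max rescaling in `rescale` is an assumption added here so that metrics in different units produce comparable angles. The function names and sample data are hypothetical.

```python
import math

def rescale(values):
    """Min-max rescale to the range 0..1 (an assumption, not from the article)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # a flat series has no rise or fall
    return [(v - lo) / span for v in values]

def slope_angles(trend):
    """Angle in radians of the line joining each pair of consecutive hourly trends."""
    scaled = rescale(trend)
    # The time step between hourly trends is treated as 1 unit.
    return [math.atan2(later - earlier, 1.0) for earlier, later in zip(scaled, scaled[1:])]

# Hypothetical hourly trends in different units that share the same shape.
cpu_percent = [20.0, 40.0, 80.0, 60.0]
temp_degrees_f = [100.0, 110.0, 130.0, 120.0]

cpu_angles = slope_angles(cpu_percent)
temp_angles = slope_angles(temp_degrees_f)
```

Because only the rise and fall survive, the CPU series and the temperature series yield the same sequence of angles despite their unrelated units.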

Action and reaction
If you're in IT, you will most likely care if a rise in web traffic caused a delayed rise in CPU activity on another system which in turn caused a general slowdown in your business response times.

The third level of value that FireScope delivers with its genetic algorithm solution is the ability to search out cause-and-effect metrics from within your IT assets. By applying a sliding window comparison of a selected metric against the searched metrics, FireScope can search out the possibility that deviations in one metric appear before or after a target metric anomaly. In doing so, the notion that one metric caused another metric to deviate is displayed graphically by a simple time series display. Furthermore, this cause-and-effect rendering can appear in either direction.

  1. The target metric may have deviated because it is impacted by some other metric
  2. The target metric deviation may have impacted some other metric in your system
  3. Or both of the above are true and your system has multiple cause-and-effect metrics

But once again you may not have even known that these metrics have cause-and-effect relationships or that the relationships exhibit delayed response signatures.

Starting the analysis
Let's assume that you fall into category 3 of the list under "Revealing hidden secrets" above, which implies that you are actively monitoring your IT assets but might not be aware of all of the relationships between them. Let's also assume that you are experiencing a slowdown in one of your important business operations, but other than the apparent slowdown you can't quite explain why this portion of your business is slow. You observe that your CPU load is higher than normal on one system, as displayed in a historical graph of the system CPU load. You start FireScope's analytics and select this same metric. You provide the analysis start time by selecting the data area just prior to the point in time where the CPU load started to rise. Next you select the analysis end time by marking the data area after the CPU returned to normal, or you select now because the CPU load is still high. As a last step you trigger the genetic algorithm analysis, asking FireScope to search out other metrics that exhibit a similar response during the same timeframe as the selected metric. The result of the analysis is the top 5 metrics that most closely match the signature of the selected metric. The value of the analysis is that the resulting metrics may very well include metrics that caused the spike in CPU load being evaluated, or metrics that are themselves being impacted by the selected metric.

Components of a genetic algorithm
A genetic algorithm is an optimized search solution that attempts to mimic the process of natural evolution. Natural evolution uses "genes", "chromosomes", "genetic-mutation", "genetic-crossover" and multiple generations to produce improvements in nature. Of course sometimes natural evolution produces defects or mutations, but just as in nature, these apparent defects can turn out to be extremely valuable assets in the evolutionary process. Genetic algorithms attempt to simulate natural evolution concepts in software by expressing an optimized search solution using these very same natural evolution concepts.

The figure below illustrates FireScope's use of several genetic algorithm constructs and provides a brief synopsis of the genetic algorithm process.

GA building blocks, the "gene"
Let's use a bottom-up approach and talk about genes first. As was mentioned above, FireScope compares the relative change over time of disparate metric values. This comparison is accomplished by first digitizing the comparison values into a representative alphabet where each character in the alphabet represents the slope, or tangent, between two sequential trend values. Since time is always increasing in this domain, the only values that are relevant are angles that fall between -π/2 (-90 degrees) and π/2 (90 degrees). FireScope divides this range into 90 distinct buckets, each representing one letter of a 90-character alphabet. The digitization process allows for optimized comparison and also filters out small changes that are not significant enough to impact the comparison process. The series of alphabetic values representing digitized values from one metric is encoded into a gene, starting from the earliest time slot under consideration and ending with the latest.
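The digitization step can be illustrated as follows. The 90 buckets over the range -π/2 to π/2 come from the description above; the choice of the printable ASCII characters '!' through 'z' as the alphabet, and the names `angle_to_letter` and `encode_gene`, are assumptions made for this sketch.

```python
import math

BUCKETS = 90
# Hypothetical alphabet: 90 printable ASCII characters, '!' (33) through 'z' (122).
ALPHABET = [chr(33 + i) for i in range(BUCKETS)]

def angle_to_letter(angle):
    """Map an angle in [-pi/2, pi/2] to one of 90 letters (2 degrees per bucket)."""
    fraction = (angle + math.pi / 2) / math.pi      # position within the range, 0..1
    index = int(fraction * BUCKETS)
    index = max(0, min(index, BUCKETS - 1))         # clamp the two end points
    return ALPHABET[index]

def encode_gene(angles):
    """Encode a time-ordered series of angles as a gene string, earliest slot first."""
    return "".join(angle_to_letter(a) for a in angles)

# A small wobble lands in the same bucket as a perfectly flat trend.
flat = angle_to_letter(0.0)
wobble = angle_to_letter(0.01)
gene = encode_gene([0.0, math.radians(30), math.radians(-30)])
```

Each bucket spans 2 degrees, which is how the encoding filters out changes too small to matter: any two angles inside the same bucket become the same letter.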

GA building blocks, the "chromosome"
Moving up in our bottom-up approach is the formation of multiple genes into a chromosome. FireScope's goal is to identify the five metrics that are most closely related to a pre-selected target metric, in a time range that is slightly wider than the selected metric time range. To that end, FireScope creates a chromosome from five genes, each of which is the digitized trend data of one other metric. The resulting chromosome represents one possible solution from among millions of possible combinations. As the genetic algorithm progresses, it searches among these millions of candidate chromosomes for the one whose five genes best match the trend pattern of the target metric.
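To get a feel for the size of the search space, the number of distinct five-metric chromosomes can be counted directly. The figure of 100 metrics below is purely hypothetical, and the count assumes that the order of genes within a chromosome does not matter.

```python
import math

metric_count = 100          # hypothetical number of numeric metrics being trended
genes_per_chromosome = 5    # from the article: five genes per chromosome

# Number of ways to choose 5 distinct metrics out of 100, ignoring order.
possible_chromosomes = math.comb(metric_count, genes_per_chromosome)
print(possible_chromosomes)  # 75287520
```

Even this modest metric count yields more than 75 million candidate solutions, which is why an exhaustive comparison is unattractive and a guided search is used instead.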

GA building blocks, population/fitness/scoring
The first selection of genes to form the initial chromosome population is purely a random selection from all existing numeric trend data. FireScope creates an initial population of several hundred chromosomes (possible solutions) and then applies a fitness algorithm to this population. The fitness algorithm assigns a numeric value which represents an assessment of how closely each chromosome in the population matches the target metric. Chromosomes that score the highest are chosen for mating to produce the next generation. The desired intent is that improved next generation chromosomes are the natural result of combining chromosomes from the best parents of the prior generation.
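The article does not give FireScope's fitness formula, so the sketch below substitutes a simple stand-in: the similarity of two genes is based on how far apart their letters are, and a chromosome's fitness is the sum of its five gene similarities. All names, the population size of 300, and the randomly generated metric data are assumptions for illustration.

```python
import random

GENES_PER_CHROMOSOME = 5
MAX_LETTER_DISTANCE = 89  # distance between the first and last of 90 letters

def gene_similarity(gene, target):
    """1.0 for identical equal-length genes, falling to 0.0 at maximum distance."""
    distance = sum(abs(ord(a) - ord(b)) for a, b in zip(gene, target))
    return 1.0 - distance / (MAX_LETTER_DISTANCE * len(target))

def fitness(chromosome, genes, target):
    """Score a chromosome (a tuple of metric ids) against the target gene."""
    return sum(gene_similarity(genes[metric], target) for metric in chromosome)

def initial_population(metric_ids, size, rng):
    """Purely random chromosomes, each holding five distinct metric ids."""
    return [tuple(rng.sample(metric_ids, GENES_PER_CHROMOSOME)) for _ in range(size)]

# Hypothetical data: 40 metrics with random 6-letter genes over '!'..'z'.
rng = random.Random(42)
target = "NPZNAN"
genes = {
    f"metric_{i:02d}": "".join(chr(33 + rng.randrange(90)) for _ in range(len(target)))
    for i in range(40)
}

population = initial_population(sorted(genes), 300, rng)
ranked = sorted(population, key=lambda c: fitness(c, genes, target), reverse=True)
parents = ranked[: len(ranked) // 2]  # the highest scorers are kept for mating
```

Keeping the top half as parents is one common selection scheme; the article only states that the highest-scoring chromosomes are chosen for mating.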

GA building blocks, mating
This process is sometimes referred to as genetic crossover, and is the process of selecting some genes (metrics) from each of two different parents to produce a new child chromosome. The new child chromosome consists of 5 genes drawn, via crossover or mating, from two high-scoring parents of the prior generation. This chromosome, as with all others, represents one possible solution from among millions of possible solutions that might be the 5 metrics that most closely resemble the signature of the pre-selected target metric.
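One way to realize this mating step is a single-point crossover, sketched below. The article does not specify which crossover scheme FireScope uses, so the cut-point approach, the rule that a child never holds the same metric twice, and the metric names are all assumptions.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Build a child from the leading genes of parent_a and the rest from parent_b.

    Parents are tuples of distinct metric ids. Duplicate metrics are skipped,
    and any remaining slots are topped up from parent_a.
    """
    size = len(parent_a)
    cut = rng.randint(1, size - 1)        # at least one gene from each side
    child = list(parent_a[:cut])
    for metric in list(parent_b) + list(parent_a[cut:]):
        if len(child) == size:
            break
        if metric not in child:
            child.append(metric)
    return tuple(child)

# Hypothetical parents that share two metrics.
rng = random.Random(3)
mother = ("cpu_web1", "mem_web1", "net_web1", "cpu_app1", "db_locks")
father = ("cpu_app1", "db_locks", "disk_db1", "temp_db1", "net_app1")
child = crossover(mother, father, rng)
```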

GA building blocks, "mutations"
After mating, a small proportion of the new generation of chromosomes is chosen to be randomly mutated. FireScope uses the mutation process to inject previously unexplored genes (metrics) into the search process. If a chromosome is selected for mutation, a random gene is replaced by a gene from a metric that has not yet been explored. This mutation process can have the effect of improving the overall score of the selected chromosome, or degrading it. However, the randomness of this process has been shown to improve GA search capabilities, just as mutation in nature sometimes provides improvement through natural genetic evolution.
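A minimal sketch of the mutation step, assuming the set of not-yet-explored metrics is tracked in a plain Python set and that the mutation rate is a simple probability per chromosome. The names `mutate` and `mutate_generation` and the 5% rate are hypothetical.

```python
import random

def mutate(chromosome, unexplored, rng):
    """Replace one random gene with a metric that has not been explored yet.

    The chosen metric is removed from `unexplored` because it is now in play.
    """
    if not unexplored:
        return chromosome                 # nothing new left to inject
    new_metric = rng.choice(sorted(unexplored))
    unexplored.discard(new_metric)
    mutated = list(chromosome)
    mutated[rng.randrange(len(mutated))] = new_metric
    return tuple(mutated)

def mutate_generation(population, unexplored, rate, rng):
    """Mutate roughly `rate` of the chromosomes and leave the rest untouched."""
    return [
        mutate(chromosome, unexplored, rng) if rng.random() < rate else chromosome
        for chromosome in population
    ]

rng = random.Random(1)
original = ("m1", "m2", "m3", "m4", "m5")
unexplored = {"m8", "m9"}
mutant = mutate(original, unexplored, rng)
next_generation = mutate_generation([original] * 20, {"m10", "m11"}, 0.05, rng)
```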

Cause-and-effect
In comparing other metrics against a target metric, FireScope evaluates a longer timeframe than the time range the user selected for the target metric. The trend values of each compared metric are examined both before and after the target metric's time range. By expanding the search window and sliding the target metric over the searched metric, FireScope can detect whether a searched metric may have caused the target metric to deviate from its normal signature, or whether the target metric caused other metrics to deviate from theirs. This determination is made by detecting the similarity of the target metric to the searched metric at each offset. This approach has nothing to do with genetic algorithms, but is simply a higher-level value that is extracted from FireScope's genetic algorithm implementation.
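The sliding comparison can be sketched as follows, assuming genes are strings over the letters '!' through 'z' and that similarity is measured by letter distance. Both are assumptions, as are the function name `best_lag` and the sample genes. A negative lag means the matching pattern appeared in the searched metric before the target window (a possible cause), while a positive lag means it appeared afterwards (a possible effect).

```python
def best_lag(target_gene, searched_gene, lead):
    """Slide target_gene across a longer searched_gene and return (lag, distance).

    `lead` is how many time slots the searched gene extends before the start
    of the target window. Lag 0 means the patterns line up in time.
    """
    window = len(target_gene)
    best = None
    for start in range(len(searched_gene) - window + 1):
        segment = searched_gene[start:start + window]
        distance = sum(abs(ord(a) - ord(b)) for a, b in zip(segment, target_gene))
        if best is None or distance < best[1]:
            best = (start - lead, distance)
    return best

# Hypothetical genes: 'N' is level, 'Z' a sharp rise, 'A' a sharp fall.
target = "NZA"
possible_cause = "NZANNNN"    # same pattern, two slots before the target window
possible_effect = "NNNNNZA"   # same pattern, two slots after the target window

cause_lag = best_lag(target, possible_cause, lead=2)    # (-2, 0)
effect_lag = best_lag(target, possible_effect, lead=2)  # (2, 0)
```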

GA building blocks, "completion"
After several thousand generations have been evaluated and the rate of improvement in scoring has slowed to an acceptable level, the genetic algorithm completes and the top-scoring chromosome from the last generation is selected as the best solution. This chromosome contains 5 genes that represent the top-scoring metrics which most closely match the target metric. Each gene (metric) from the top-scoring chromosome can be displayed back to the user for further investigation, and each is displayed on the same graph as the selected target metric.
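Putting the pieces together, the overall loop and its stopping rule might look like the sketch below. It is not FireScope's code. The letter-distance fitness, the pooled crossover in `breed`, the mutation that draws from any metric not already in the chromosome, the carrying forward of the best chromosome, and every parameter value are assumptions chosen to keep the example short.

```python
import random

GENES = 5  # genes (metrics) per chromosome, per the article

def similarity(gene, target):
    """1.0 when two equal-length genes are identical, 0.0 at maximum distance."""
    distance = sum(abs(ord(a) - ord(b)) for a, b in zip(gene, target))
    return 1.0 - distance / (89 * len(target))

def fitness(chromosome, genes, target):
    return sum(similarity(genes[m], target) for m in chromosome)

def breed(parent_a, parent_b, rng):
    """Child draws five distinct metrics from the combined genes of both parents."""
    pool = list(dict.fromkeys(parent_a + parent_b))  # unique, order preserved
    return tuple(rng.sample(pool, GENES))

def mutate(chromosome, metric_ids, rng):
    """Swap one random gene for a metric the chromosome does not already hold."""
    candidates = [m for m in metric_ids if m not in chromosome]
    if not candidates:
        return chromosome
    mutated = list(chromosome)
    mutated[rng.randrange(GENES)] = rng.choice(candidates)
    return tuple(mutated)

def evolve(genes, target, rng, pop_size=60, max_generations=500,
           patience=25, mutation_rate=0.1):
    """Run until the best score has not improved for `patience` generations."""
    metric_ids = sorted(genes)
    population = [tuple(rng.sample(metric_ids, GENES)) for _ in range(pop_size)]
    best, best_score, stalled = None, float("-inf"), 0
    for _ in range(max_generations):
        ranked = sorted(population, key=lambda c: fitness(c, genes, target), reverse=True)
        top_score = fitness(ranked[0], genes, target)
        if top_score > best_score:
            best, best_score, stalled = ranked[0], top_score, 0
        else:
            stalled += 1
            if stalled >= patience:
                break                     # improvement has slowed enough to stop
        parents = ranked[: pop_size // 2]
        population = [best]               # carry the best chromosome forward
        while len(population) < pop_size:
            child = breed(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < mutation_rate:
                child = mutate(child, metric_ids, rng)
            population.append(child)
    return best, best_score

# Hypothetical data: 30 metrics with random genes over the letters '!'..'z'.
rng = random.Random(7)
target = "NPZNA"
genes = {
    f"metric_{i:02d}": "".join(chr(33 + rng.randrange(90)) for _ in range(len(target)))
    for i in range(30)
}
best, score = evolve(genes, target, rng)
```

Because the best chromosome is carried into every new generation, the top scorer of the final generation is also the best solution seen during the run, which matches the completion rule described above.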

If this is your first exposure to genetic algorithm techniques, it can be overwhelming to try to understand all of this Genetic Algorithm terminology. Concepts such as genes, chromosomes, mutation, and fitness algorithms are difficult to conceptualize. While FireScope uses genetic algorithm techniques, it is important to understand that this approach is nothing more than achieving optimal search times to deliver the business value of revealing possibly unknown relationships between disparate metrics collected throughout your IT assets.

It is also instructive to realize that the real ingenuity in this approach is not in the application of the genetic algorithm, but rather in the normalization techniques that were applied to deliver the ability to compare disparate metrics. While the application of the genetic algorithm does provide optimized search results, these results could not have been achieved if it weren't for the initial work of normalizing the time domain, normalizing the value domain, and implementing the sliding window analysis that delivers the ability to uncover delayed cause-and-effect metrics hidden within your IT infrastructure.

FireScope's genetic algorithm, coupled with your inquisitiveness, forms a near-superhuman capability that exists nowhere else in the IT industry. As with all supernatural powers, you must use them wisely!


More Stories By Pete Whitney

Pete Whitney is a Solutions Architect for Cloudera. His primary role at Cloudera is guiding and assisting Cloudera's clients through successful adoption of Cloudera's Enterprise Data Hub and surrounding technologies.

Previously Pete served as VP of Cloud Development for FireScope Inc. In the advertising industry Pete designed and delivered DG Fastchannel’s internet-based advertising distribution architecture. Pete also excelled in other areas including design enhancements in robotic machine vision systems for FSI International Inc. These enhancements included mathematical changes for improved accuracy, improved speed, and automated calibration. He also designed a narrow spectrum light source, and a narrow spectrum band pass camera filter for controlled machine vision imaging.

Pete graduated Cum Laude from the University of Texas at Dallas, and holds a BS in Computer Science. Pete can be contacted via Email at [email protected]
