The Human Body and Data Center Automation - Part 2

Disclaimer: I am an IT guy, and my knowledge of the human body is limited to my daughter's high school biology textbook and information obtained from search engines. So excuse me if any of the information below is not represented accurately!

The human body is the most complex machine ever created. With a complex network of interconnected organs, trillions of cells and the most advanced processor, the human body is the most automated system on this planet. In this series, we draw comparisons between the working of the human body and that of a data center. We draw parallels between human body automation and data center automation, and explain the different levels of automation we need to drive in our data centers. The series is divided into four parts, each covering one of the body's main functions and drawing parallels on automation. This is the second article in the human body series; please click here for the first article.

The nervous system
The nervous system is a complex collection of nerves and specialized cells known as neurons that transmit signals between different parts of the body. It is through the nervous system that we communicate with the outside world and, at the same time, many mechanisms inside our body are controlled. The nervous system takes in information through our senses, processes the information and triggers reactions, such as making your muscles move or causing you to feel pain. The closest data center counterpart to the nervous system is the network. Much like the network connects everything together in the data center, the nervous system is essentially the body's network.

The nervous system has two components: the central nervous system and the peripheral nervous system. The central nervous system is made up of the brain and the spinal cord. The peripheral nervous system consists of sensory neurons, ganglia (clusters of neurons) and nerves that connect to one another and to the central nervous system. Imagine these as the core network (central) and the data center network (peripheral). However, what is so fascinating about our nervous system is the way it works. Let's take a deep dive inside our body and learn how we can make our networks more efficient.

Image Source: Livescience.com

The nervous system has two main subdivisions: the somatic, or voluntary, component; and the autonomic, or involuntary, component. The autonomic nervous system regulates certain body processes, such as blood pressure and the rate of breathing, that work without conscious effort. It is constantly active, regulating things such as breathing, heartbeat and metabolic processes. It does this by receiving signals from the brain and passing them on to the body. It can also send signals in the other direction - from the body to the brain - providing your brain with information about how full your bladder is or how quickly your heart is beating.

Now, which system in our data center comes closest to the autonomic nervous system? It's our monitoring system. The function of the monitoring system in a data center is to monitor the health of the various components (hardware and software) and alert us when thresholds are breached or an error has occurred. In most modern data centers today, some tool does this job: alerts and error logs are collected at all layers, and once an error occurs or a particular KPI crosses a certain threshold, an event is generated and humans are notified to take action. However, where the human body beats any modern monitoring system is in its ability to take autonomic action based on the situation. Imagine you are running on a treadmill. As your heart rate goes up, the brain is not just sending out alerts telling you that your heart rate is rising; it is also taking appropriate actions to ensure the body continues to function. The first action is to break down glycogen, a stored form of glucose, to give you an extra dose of energy.

The second action is to draw more blood toward the muscles under stress and away from functions that are not needed at that moment, like digestion (unless you are eating while exercising). Since the muscles need more oxygen, the body signals your lungs to take in more air, and your breathing rate goes up. As the body burns more glucose and heats up, your brain signals your sweat glands to release moisture to keep the body cool and maintain its internal temperature. All of these actions happen without you telling your body what to do to keep you healthy. Only if thresholds are crossed beyond a certain point and your body cannot fix itself does it signal you to take action, such as resting or slowing down. This is exactly how our monitoring systems should work. However, what happens in most enterprises is a sorry state of affairs.
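To make the analogy concrete, here is a minimal sketch, in Python, of a monitoring loop that behaves the way the body does on the treadmill: it remediates automatically first and pages a human only when self-healing is not enough. The helper functions (get_cpu_load, scale_out, page_oncall) and the thresholds are hypothetical placeholders, not the API of any particular monitoring tool.

import time

WARN_THRESHOLD = 0.75   # try to self-heal above this load
CRIT_THRESHOLD = 0.95   # involve a human above this load

def get_cpu_load() -> float:
    """Hypothetical: return current CPU utilization (0.0 to 1.0)."""
    raise NotImplementedError

def scale_out() -> bool:
    """Hypothetical: add capacity; return True if the action succeeded."""
    raise NotImplementedError

def page_oncall(message: str) -> None:
    """Hypothetical: notify a human operator."""
    raise NotImplementedError

def control_loop(poll_seconds: int = 30) -> None:
    while True:
        load = get_cpu_load()
        if load >= CRIT_THRESHOLD:
            # Like the body telling you to slow down: self-healing is no
            # longer enough, so escalate to a human.
            page_oncall(f"CPU load {load:.0%} despite auto-remediation; manual action needed")
        elif load >= WARN_THRESHOLD:
            # Like breaking down glycogen or redirecting blood flow:
            # act automatically, without waking anyone up.
            if not scale_out():
                page_oncall(f"Auto scale-out failed at CPU load {load:.0%}")
        time.sleep(poll_seconds)

The key design choice mirrors the body: the alert is not the end product; it is the trigger for an automatic response, and humans only see the cases the system could not handle on its own.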

Let's consider a very common issue most enterprises face: a performance problem. Say the mission-critical business application running on your server is experiencing performance issues and users are complaining. In a typical organization, an application user does first-level analysis and, based on that analysis, opens an incident ticket with the command center. Everyone from the systems engineer to the storage engineer, the network engineer and specialized performance engineers is paged to figure out what's happening. Hours are spent detecting where in the fabric the contention is that is causing the performance issue. Once the issue is found, another few hours are spent finalizing the action plan, and finally the fix is put in place. Sounds familiar?

Now imagine we learn from the human body and design our data center so that the system automatically detects that something is going wrong in the fabric and finds out where the issue is. Once the issue is detected, it identifies the appropriate fix and implements it. If the system detects that the performance issue is caused by a CPU constraint on one of the VMs, it should either scale up the CPU capacity on that VM or automatically scale the application horizontally by adding another VM or container. If the issue is detected at the network level, the system should be able to move the entire VLAN to another healthy leaf switch. If the issue happens at the DC level, the system should automatically fail over all impacted applications to another DC. While some modern cloud-native applications work in a similar fashion, the same level of maturity is not seen in traditional applications.
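As an illustration of what that could look like, here is a minimal sketch of a layer-aware remediation dispatcher. The Diagnosis structure, the layer names and the remediation functions are hypothetical; in a real environment they would be backed by your hypervisor, fabric and disaster-recovery tooling rather than print statements.

from dataclasses import dataclass

@dataclass
class Diagnosis:
    layer: str        # "vm", "network" or "datacenter"
    target: str       # VM name, VLAN id or DC name, depending on the layer

def scale_vm_or_add_instance(vm: str) -> None:
    print(f"Scaling up CPU on {vm}, or adding another VM/container behind the load balancer")

def move_vlan_to_healthy_leaf(vlan: str) -> None:
    print(f"Moving VLAN {vlan} to a healthy leaf switch")

def fail_over_datacenter(dc: str) -> None:
    print(f"Failing over impacted applications from {dc} to the standby DC")

REMEDIATIONS = {
    "vm": scale_vm_or_add_instance,
    "network": move_vlan_to_healthy_leaf,
    "datacenter": fail_over_datacenter,
}

def remediate(diag: Diagnosis) -> None:
    action = REMEDIATIONS.get(diag.layer)
    if action is None:
        raise ValueError(f"No automated fix known for layer {diag.layer!r}; paging a human")
    action(diag.target)

# Example: the system decided the bottleneck is CPU on a hypothetical VM named app-vm-07.
remediate(Diagnosis(layer="vm", target="app-vm-07"))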

The somatic system consists of nerves that connect the brain and spinal cord with muscles and sensory receptors in the skin. The voluntary (somatic) nervous system controls all the things that we are aware of and can consciously influence, such as moving our arms, legs and other parts of the body. The nerves (like the network cables in our DC) start at the brain and spinal cord and branch out to every part of the body. Neurons (intelligent code) send signals to other cells through thin fibers called axons, which cause chemicals known as neurotransmitters to be released at junctions called synapses. The synapse passes the signal on to the next cell, and the entire communication takes a fraction of a second. Such is the speed of transmission in the human body that the fastest router in the world cannot come close to it.

Let's take an example. Imagine someone tapping you lightly on the shoulder; your immediate reaction is to turn around and see who is doing it. The sensory neurons in your shoulder transmit the signal to your brain via the nerves at such a fast pace that you react immediately. Now imagine it taking a few seconds to a minute for your body to react to that tap. The way our body reacts to the various senses (touch, smell, taste, etc.), and the fact that we don't have to manage every action consciously, shows how advanced the body's automation system is. The body's sensory systems are like the sensors in our data center: their role is to collect data and send it on for further processing. While we have a lot of maturity in collecting data, what we lack is the ability to analyze the data fast enough to take appropriate action. This is where our brain checkmates even the fastest of computers, including IBM Watson. Our brain is a big data system (think Hadoop), the intelligence of IBM Watson and the fastest supercomputer in the world, all combined into one. Let's look at our brain.

Image Source: diseasespictures.com

Brain - Our brain is the intelligence of our body. It controls all our actions and acts as both CPU and memory for the body; without a brain you are almost a walking zombie with no control over your actions. Inside the data center, the CPU and memory in our servers, combined with the software that runs on top of them, act like the brain. However, we are still far from matching the amount of computing capacity our brain has, and more importantly its learning and intuitive capabilities.

The fastest supercomputer in the world at the time of writing is China's Sunway TaihuLight, with a maximum processing speed of 93 petaFLOPS. A petaFLOP is a quadrillion (one thousand trillion) floating-point calculations per second. Still, this does not come close to the processing speed of the human brain, which has been postulated to operate at around 1 exaFLOP, equivalent to a billion billion calculations per second. While the hardware, in this case the brain's neural structure, can be compared with the chipset in our computers, it's the software that makes the difference.
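To put that in perspective: 1 exaFLOP is 1,000 petaFLOPS, so this (admittedly rough) estimate would put the brain at about 1,000 / 93, or roughly 10 to 11 times, the peak speed of TaihuLight.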

Our brain controls the nervous system, the muscular system and other vital parts of our body. It also has tremendous learning capability. When a child is born, the brain is almost empty but is in learning mode. It quickly learns how to interact with the outside world and starts to read data from the sense organs. This is how we learn to react to taste, touch and sound. As we grow, we learn how to talk, write and communicate. We learn how to walk, run and jump! In the software world, we call this AI - artificial intelligence. Humans have been trying for decades to develop brain-like self-learning capabilities, and while self-driving cars are now pushing the envelope, we are still far from truly matching our brain's power.

Now, the brain cannot work in isolation. It gets all the data it needs from our sensory organs - eyes, nose, tongue. We interact with the world with the help of these organs: the eyes give us visual data, the nose gives us smell-related data, while the tongue allows us to taste. Did you know that the tongue alone has thousands of taste buds, each packed with sensory cells, which allow us to distinguish tastes from sour to sweet and from hot to cold? Imagine the state of all the restaurants if we did not have these sensors. The nose, on the other hand, has sensors that not only detect various smells but also act as a line of self-defense.

In our data centers, we need similar capabilities. Inside the compute layer, we have sensors that tell us a filesystem is getting full; inside the router, we can tell if packet drops are happening. We call these alerts, and on any given day millions of them are generated by all the systems running in an enterprise. The difference is what we do with those alerts. If every alert depends on a human taking manual action, it is as good as your tongue telling you that the coffee you are drinking is scalding hot and then waiting for you to decide whether to stop drinking. In reality, your tongue sends an alert to your brain, your brain processes the information, decides it's too dangerous and immediately takes action. You may still drink it and do yourself harm, but the body acts immediately to try to prevent that harm.

Similarly, with the alerts coming from our systems, we need to build systems that can take immediate action (self-heal) rather than waiting for human intervention every time. If a filesystem is full, take immediate action to detect and fix what is filling it up. If the security intrusion detection system has detected malicious emails, block them immediately. If a network port is dropping packets, isolate the port and move traffic to an alternate port. The more autonomic actions we can take, the better we will be at managing our data centers.
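A minimal sketch of this idea, assuming hypothetical alert names and playbook bodies rather than any specific monitoring product's API, could look like this:

from typing import Callable, Dict

def clean_filesystem(alert: dict) -> None:
    print(f"Rotating logs and purging temp files on {alert['host']}:{alert['mount']}")

def block_malicious_sender(alert: dict) -> None:
    print(f"Blocking sender {alert['sender']} at the mail gateway")

def isolate_flapping_port(alert: dict) -> None:
    print(f"Shutting port {alert['port']} on {alert['switch']} and shifting traffic to the alternate uplink")

# Each alert type maps to a small "playbook" that runs before anyone is paged.
PLAYBOOKS: Dict[str, Callable[[dict], None]] = {
    "filesystem_full": clean_filesystem,
    "malicious_email": block_malicious_sender,
    "packet_drops": isolate_flapping_port,
}

def handle_alert(alert: dict) -> None:
    playbook = PLAYBOOKS.get(alert["type"])
    if playbook:
        playbook(alert)          # self-heal first
    else:
        print(f"No playbook for {alert['type']}; escalating to the on-call engineer")

# Example alert, in the shape such a system might emit
handle_alert({"type": "filesystem_full", "host": "db-01", "mount": "/var/log"})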

To summarize, here's what we need in our data center, L-C-C-A (a minimal sketch of how these four pieces fit together follows the list):

  • Lightning-fast network of intelligent sensors across the data center stack
  • Central monitoring system that can monitor alerts and error logs at all layers, from the application down to the server
  • Correlation engine that can correlate the various alerts and error logs and pinpoint where in the data center the issue is
  • Artificial intelligence (AI)-capable run-book automation engine that can trigger autonomous action (self-heal) based on the issue identified and implement the fix
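Here is a minimal sketch of how these four pieces could hang together. Every class and rule below is a hypothetical placeholder for the telemetry, correlation and orchestration tooling you actually run; the correlation heuristic in particular is deliberately naive.

from collections import defaultdict
from typing import List

class SensorEvent:
    """One raw alert/error event from a sensor somewhere in the stack."""
    def __init__(self, layer: str, source: str, message: str):
        self.layer, self.source, self.message = layer, source, message

class CorrelationEngine:
    """Group raw events and guess which layer is the likely root cause."""
    def pinpoint(self, events: List[SensorEvent]) -> str:
        counts = defaultdict(int)
        for e in events:
            counts[e.layer] += 1
        # Naive heuristic for the sketch: the noisiest layer is the suspect.
        return max(counts, key=counts.get)

class RunbookEngine:
    """Map the suspected layer to an automated fix (self-heal)."""
    RUNBOOKS = {
        "vm": "scale up CPU or add an instance",
        "network": "move the VLAN to a healthy leaf switch",
        "datacenter": "fail over applications to the standby DC",
    }
    def execute(self, layer: str) -> None:
        action = self.RUNBOOKS.get(layer, "page the on-call engineer")
        print(f"Suspected layer: {layer} -> {action}")

# Example run with made-up events
events = [
    SensorEvent("vm", "app-vm-07", "CPU at 98% for 10 minutes"),
    SensorEvent("vm", "app-vm-07", "run queue length rising"),
    SensorEvent("network", "leaf-3", "minor CRC errors"),
]
RunbookEngine().execute(CorrelationEngine().pinpoint(events))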

In our next article on the human body and data center automation, we will focus on the circulatory system, which is responsible for the flow of blood, oxygen and nutrients in our body, and we will learn what our data centers can take from it. Until next time.

More Stories By Ashish Nanjiani

Ashish Nanjiani is a Senior IT Manager within Cisco IT, managing Cisco's worldwide IT data centers as an operations manager. With 20 years of IT experience, he is an expert in data center operations and automation. He has spoken at many inter-company events on data center automation and helps IT professionals digitize their IT operations. He is also an entrepreneur and has been successfully running a website business for 10+ years.

Ashish holds a Bachelor of Science degree in Electrical and Electronics and a Master's in Business Administration. He is a certified PMP and Scrum Master. He is married and has two lovely daughters. He enjoys playing with technology during his free time. [email protected]
