Welcome!

Microservices Expo Authors: Elizabeth White, Gregor Petri, Ian Khan, Yeshim Deniz, AppDynamics Blog

Related Topics: Microservices Expo, Java IoT, @CloudExpo

Microservices Expo: Article

Can ‘Big Data’ Prevent Major Service Outages?

New analytics are coming

If Amazon, Bank of America and Microsoft can’t contain service outages before they become colossal PR problems, the rest of us mere mortals have much to fear. It's a safe bet that some time in the next 60 days, another major consumer Internet service will make the headlines for melting down.

Given the potential losses, one might think that the IT organizations at these high profile companies would be bullet proof. But they’re not. Despite countless millions in management investments, the state of the art is still unexpected outages with recovery times that seem to be a minimum of 4 hours and maximum of 2 or 3 days. The evidence suggests that the same holds true for most of the Global 2000 enterprises and similarly sized service providers and government organizations. Multiple escalated incidents a week with one or two showstoppers a year, keep our IT experts tied up for an average of 4 hours per incident.

The key issue that “prevents preventing” outages is at the design core of the network and application monitoring systems in use today. Many are based on technology that was developed when a company’s network could still be visualized on a couple of PowerPoint slides. Applications had a one to one relationship with servers, networks were largely point to point, users could be grouped by the router that served their office. The life of an IT manager was so much simpler.

These old monitoring systems are based on the idea that IT experts would define the performance thresholds, rules and exceptions necessary to identify unacceptable behavior. But today, the typical enterprise application infrastructure is so complex, that you would have to gather dozens of IT managers to even begin to map it all out. Many have reached a level of complexity that defies an IT organizations to fully understand. The result, unforeseen, outages that often take days to resolve.

These monitoring systems are still great for generating the data required to understand the systems behavior – just ask the operations center that receives tens of thousand of alerts a day. But the real challenge lies in making sense of the alerts, and taking the right action before the train wreck occurs.

‘Big Data Analytics’ to the rescue
There’s a lot of chatter about the promise of Big Data in retail, healthcare and manufacturing. But in the realm of IT operations and application performance, not so much. But perhaps we can apply the lessons learned in ‘Big Data’ to solve our operational crises.

Unlocking the promise of Big Data requires two elements – aggregating and managing the data for fast access, and super powerful analytics to uncover the information locked within. The IT operations environments have been collecting and managing the data for decades. A typical large enterprise generates millions of data points an hour of monitoring metrics, log files and events. What’s missing from these environments is the analytics.

Luckily a new generation of machine-learning analytics has arisen that is up to the task. These systems can process information already collected by these monitoring systems and by “self-learning’ their behavior – actually detect problems and identify their root cause as they develop.

If Big Data principles are applied to our complex IT applications and infrastructures, the service outage that made yesterday’s Wall Street Journal will tomorrow be a blip on a sys admin’s screen that is solved with a single mouse click.

More Stories By Kevin Conklin

Kevin Conklin is a 20-year network management veteran and Vice President of Prelert, which is leveraging recent developments in Artificial Intelligence to provide transformative IT management software solutions. Prelert is founded and managed by a team that has developed considerable expertise in the field through companies like Micromuse and Riversoft, Securify, Axentis, Concord, Smarts, Securify and Njini.

@MicroservicesExpo Stories
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 19th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devices - comp...
About a year ago we tuned into “the need for speed” and how a concept like "serverless computing” was increasingly catering to this. We are now a year further and the term “serverless” is taking on unexpected proportions. With some even seeing it as the successor to cloud in general or at least as a successor to the clouds’ poorer cousin in terms of revenue, hype and adoption: PaaS. The question we need to ask is whether this constitutes an example of Hype Hopping: to effortlessly pivot to the ...
SYS-CON Events announced today the Enterprise IoT Bootcamp, being held November 1-2, 2016, in conjunction with 19th Cloud Expo | @ThingsExpo at the Santa Clara Convention Center in Santa Clara, CA. Combined with real-world scenarios and use cases, the Enterprise IoT Bootcamp is not just based on presentations but with hands-on demos and detailed walkthroughs. We will introduce you to a variety of real world use cases prototyped using Arduino, Raspberry Pi, BeagleBone, Spark, and Intel Edison. Y...
Just over a week ago I received a long and loud sustained applause for a presentation I delivered at this year’s Cloud Expo in Santa Clara. I was extremely pleased with the turnout and had some very good conversations with many of the attendees. Over the next few days I had many more meaningful conversations and was not only happy with the results but also learned a few new things. Here is everything I learned in those three days distilled into three short points.
SYS-CON Events announced today that Sheng Liang to Keynote at SYS-CON's 19th Cloud Expo, which will take place on November 1-3, 2016 at the Santa Clara Convention Center in Santa Clara, California.
24Notion is full-service global creative digital marketing, technology and lifestyle agency that combines strategic ideas with customized tactical execution. With a broad understand of the art of traditional marketing, new media, communications and social influence, 24Notion uniquely understands how to connect your brand strategy with the right consumer. 24Notion ranked #12 on Corporate Social Responsibility - Book of List.
With the rise of Docker, Kubernetes, and other container technologies, the growth of microservices has skyrocketed among dev teams looking to innovate on a faster release cycle. This has enabled teams to finally realize their DevOps goals to ship and iterate quickly in a continuous delivery model. Why containers are growing in popularity is no surprise — they’re extremely easy to spin up or down, but come with an unforeseen issue. However, without the right foresight, DevOps and IT teams may lo...
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lea...
Much of the value of DevOps comes from a (renewed) focus on measurement, sharing, and continuous feedback loops. In increasingly complex DevOps workflows and environments, and especially in larger, regulated, or more crystallized organizations, these core concepts become even more critical. In his session at @DevOpsSummit at 18th Cloud Expo, Andi Mann, Chief Technology Advocate at Splunk, showed how, by focusing on 'metrics that matter,' you can provide objective, transparent, and meaningful f...
Digitization is driving a fundamental change in society that is transforming the way businesses work with their customers, their supply chains and their people. Digital transformation leverages DevOps best practices, such as Agile Parallel Development, Continuous Delivery and Agile Operations to capitalize on opportunities and create competitive differentiation in the application economy. However, information security has been notably absent from the DevOps movement. Speed doesn’t have to negat...
With online viewership and sales growing rapidly, enterprises are interested in understanding how they analyze performance to positively impact business metrics. Deeper insight into the user experience is needed to understand why conversions are dropping and/or bounce rates are increasing or, preferably, to understand what has been helping these metrics improve. The digital performance management industry has evolved as application performance management companies have broadened their scope beyo...
While DevOps promises a better and tighter integration among an organization’s development and operation teams and transforms an application life cycle into a continual deployment, Chef and Azure together provides a speedy, cost-effective and highly scalable vehicle for realizing the business values of this transformation. In his session at @DevOpsSummit at 19th Cloud Expo, Yung Chou, a Technology Evangelist at Microsoft, will present a unique opportunity to witness how Chef and Azure work tog...
Your business relies on your applications and your employees to stay in business. Whether you develop apps or manage business critical apps that help fuel your business, what happens when users experience sluggish performance? You and all technical teams across the organization – application, network, operations, among others, as well as, those outside the organization, like ISPs and third-party providers – are called in to solve the problem.
Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some form of XaaS - software, platform, and infrastructure as a service.
As applications are promoted from the development environment to the CI or the QA environment and then into the production environment, it is very common for the configuration settings to be changed as the code is promoted. For example, the settings for the database connection pools are typically lower in development environment than the QA/Load Testing environment. The primary reason for the existence of the configuration setting differences is to enhance application performance. However, occas...
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
If you’re responsible for an application that depends on the data or functionality of various IoT endpoints – either sensors or devices – your brand reputation depends on the security, reliability, and compliance of its many integrated parts. If your application fails to deliver the expected business results, your customers and partners won't care if that failure stems from the code you developed or from a component that you integrated. What can you do to ensure that the endpoints work as expect...
Apache Hadoop is a key technology for gaining business insights from your Big Data, but the penetration into enterprises is shockingly low. In fact, Apache Hadoop and Big Data proponents recognize that this technology has not yet achieved its game-changing business potential. In his session at 19th Cloud Expo, John Mertic, director of program management for ODPi at The Linux Foundation, will explain why this is, how we can work together as an open data community to increase adoption, and the i...
SYS-CON Events announced today that Tintri Inc., a leading producer of VM-aware storage (VAS) for virtualization and cloud environments, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Tintri VM-aware storage is the simplest for virtualized applications and cloud. Organizations including GE, Toyota, United Healthcare, NASA and 6 of the Fortune 15 have said “No to LUNs.” With Tintri they mana...