Welcome!

Microservices Expo Authors: Elizabeth White, Pat Romanski, Stackify Blog, Liz McMillan, Yeshim Deniz

Related Topics: Containers Expo Blog, Java IoT, Microservices Expo, Microsoft Cloud, @CloudExpo, @BigDataExpo, SDN Journal

Containers Expo Blog: Blog Feed Post

Bare Metal Blog. When Things Go Wrong.

It’s about hardware

If you’re like most people, when you notice something really odd about your body, the first thought to enter your brain is not “I need to call the doctor.” Of course let me clarify, if you look down after a fall, and think “My arm really shouldn’t bend like that”, then yeah, you call a doctor right away. But if you didn’t fall and go “there’s this tingle in my right arm”, the first thing you do is attempt to analyze. Then, if you want to know all of the really worst things in the world it might be, you research online at a site like WebMD. If, after reasonable thought and possible research, you cannot place a reason for your problem, you might go see your doctor. Or you might try some aspirin, depending upon the problem, and your level of discomfort.

If you decide to go see a doctor, you don’t want him to take an XRay, glance at a blurry image, and pronounce that you have two days to live. You’re going to want him to do a thorough job of examining anything more than “you slept on it wrong, it’ll be better tomorrow”. Because you want to – and you want your doctor to – work from an informed position.

But do you give your hardware the same opportunity?

Just like your body shows odd symptoms or even has a system failure, so too for the hardware in your datacenter. While it would be wonderful if a device could last forever, see my Mean Time Between Failures post in the Bare Metal Series for more reasonable expectations.

<Disclaimer> As always, I am an F5 employee, and I know F5 gear better than other vendors. From here on, I will talk primarily about what is available to diagnose problems on an F5 device. Vendor support for these tools varies, check with your vendor to find out if/how you can achieve the same ends. </Disclaimer>

It’s about hardware.
Somewhere on the system, there is going to be a hardware diagnostics tool. In the case of F5 gear, it is called End User Diagnostics (EUD), and it provides you with a solid battery of self-diagnostics that can be used to see if the hardware is functioning well. Here is the main menu from the tool:

Notice that it can test the RAM, the LCD display, SFPs, the indicator lights on the chassis, overall system, and system sensors (like temperature). But it can go beyond these tests, checking the internal path packets traverse through the hardware, the hardware-used memory (PVA), SSL processing (since SSL is offloaded to specialized hardware), FIPS processing, compression, disk drives, file systems on the drives… It’s a pretty solid picture of what might be wrong with the system. While we hope you never need it, reality is that hardware wears down, gets dirty power, or on occasion fails in spite of burn-in. So for those times, you have the tool available.

Notice that EUD doesn’t have an option to quit without rebooting, and that’s not the only caveat. While I could give you the other details, like you need to disconnect network cables while running EUD, I’ll just point out that http://ask.f5.com has a lot of information about the tool if you have an account, and F5 offers training in using it also. Again, we do our best to make sure you’ll never need it but know it does happen, so want you prepared. It is strongly recommended that you download the latest version and read the release notes also.

But it’s software too!
No complex piece of computing machinery runs on straight hardware anymore. Whether you recognize it as software or not, all computer systems – including all ADCs – use software to accomplish some goals. In F5 gear, a fair share of processing is either shared hardware/software or straight software, and as you might imagine, the software can have issues from configuration on that can cause odd behavior.

For that bit of the puzzle, F5 has long had the tool for the job… QKView* runs on the machine and collects a ton of data. The results of QKView can be sent to technical support upon request, but also can be uploaded to a user diagnostics site. More on that in a moment. qkview runs across the system, picking up important (but non security-related) information and puts it all together in a tarball. “What good is that?” a bunch of you must be asking. And that’s the great part, since normally that would be a valid question. The logs, configs, error dumps are all available to you on the device, so what use is making them less available in a tarball? That’s where the next part comes in…

I cannot stress strongly enough, if you are an F5 customer considering using qkview, please go to ask.f5.com and download the latest version. Improvements in performance, what data is gathered, even organization of data inside the tarball are happening pretty regularly, and using the newest version will help insure that you have the most relevant data in the most efficient form.

But it’s really complex software…
F5 gear is a marriage of blazingly fast, bullet proof hardware with highly optimized software. To create a system that is not only that complex, but adds in features like the ability to store multiple versions of the software and boot the one of choice at any given time, and pluggable software modules that do a variety of application delivery and application security functions for you, well, that takes a lot of software. Never fear, all of our software has rigorous QC applied, just like our hardware does, but there’s a lot of it interacting, and I have never met the device whose designers knew before hand the array of uses that customers will find to put it to. Every network is different, every application architecture is different, and thus the usage of every single ADC deployed is different. Well, not every single one, since most customers use clustering sooner or later, but more than half of them, for certain.

That is why QkView output is a tar file There’s a bunch of information about how all the various software and hardware parts are communicating in those files, what’s gone wrong, how the device is configured… Just a ton of information. In fact, with versioning differences (if software changed, often what it reports changes), it was difficult to offer up a cohesive application on the BIG-IP to analyze these files.

Enter iHealth, a free (registration required to keep it to people with legitimate uses) qkview analyzer.  There are a large variety of reasons that F5 chose to go to a centralized online analyzer over a standalone tool. I’ll hit on a couple of them for you, they’re the ones I think you’ll care the most about.

1. The online tool offers manipulable graphical output. In short, you can navigate data organized in a natural way, look at what’s important to you, and get back to fixing problems faster. Generated charts are also great tools for management presentation to point out problem areas or talk up how much traffic the device is handling.

2.  The online tool can utilize the information from thousands of deployed devices to show you where you’ve made common configuration errors or point out potential future problems. It’s like chatting with thousands of your widespread peers about qkview output and getting free advice.

3.  The heuristics database that checks configurations and offers advice/tells you how to resolve issues is always up to date. You don’t have to update it before checking a qkview file.

But as always, a picture is worth a thousand words, so I’ll offer you a couple thousand words’ worth.

When the qkview file is uploaded and analyzed, you get the iHealth summary page:

This serves as a starting point to explore in more detail, and offers totals for how many devices have been defined, what add-on modules are licensed, version information, etc.

Next let’s take a look at the diagnostics section, the one that will interest most users (some users utilize iHealth to performance tune their network, and for those customers, diagnostics is far less used):

Notice how it' has issues divided up by severity? And it offers links to how to fix them. Useful when there’s trouble and you’re in a hurry.

It builds this handy list from information stored in the online app – information that can be updated as needed. That means the app is more responsive to your needs than an on-device tool might be.

In the end, it’s about serving traffic. Reliably.

All of these tools – End User Diagnostics, qkview, and iHealth are out to help with one thing… Helping you (and F5 tech support when necessary) figure out what’s really wrong and fix it, and helping you proactively fix things that might be wrong for the future. And all of that is to simply support the need to keep applications on-line and performing well. While they are not much use if your ADC is a doorstop, they’re invaluable if the ADC is a cornerstone of your datacenter, and cut hours, in some cases days off of troubleshooting and repair timelines.

And remember, all are free tools for you to use, just one part of the overall quality plan at F5.

Read the original blog entry...

More Stories By Don MacVittie

Don MacVittie is founder of Ingrained Technology, A technical advocacy and software development consultancy. He has experience in application development, architecture, infrastructure, technical writing,DevOps, and IT management. MacVittie holds a B.S. in Computer Science from Northern Michigan University, and an M.S. in Computer Science from Nova Southeastern University.

@MicroservicesExpo Stories
Cloud promises the agility required by today’s digital businesses. As organizations adopt cloud based infrastructures and services, their IT resources become increasingly dynamic and hybrid in nature. Managing these require modern IT operations and tools. In his session at 20th Cloud Expo, Raj Sundaram, Senior Principal Product Manager at CA Technologies, will discuss how to modernize your IT operations in order to proactively manage your hybrid cloud and IT environments. He will be sharing bes...
Regardless of what business you’re in, it’s increasingly a software-driven business. Consumers’ rising expectations for connected digital and physical experiences are driving what some are calling the "Customer Experience Challenge.” In his session at @DevOpsSummit at 20th Cloud Expo, Marco Morales, Director of Global Solutions at CollabNet, will discuss how organizations are increasingly adopting a discipline of Value Stream Mapping to ensure that the software they are producing is poised to o...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore t...
Most DevOps journeys involve several phases of maturity. Research shows that the inflection point where organizations begin to see maximum value is when they implement tight integration deploying their code to their infrastructure. Success at this level is the last barrier to at-will deployment. Storage, for instance, is more capable than where we read and write data. In his session at @DevOpsSummit at 20th Cloud Expo, Josh Atwell, a Developer Advocate for NetApp, will discuss the role and value...
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value S...
SYS-CON Events announced today that Peak 10, Inc., a national IT infrastructure and cloud services provider, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Peak 10 provides reliable, tailored data center and network services, cloud and managed services. Its solutions are designed to scale and adapt to customers’ changing business needs, enabling them to lower costs, improve performance and focus intern...
This talk centers around how to automate best practices in a multi-/hybrid-cloud world based on our work with customers like GE, Discovery Communications and Fannie Mae. Today’s enterprises are reaping the benefits of cloud computing, but also discovering many risks and challenges. In the age of DevOps and the decentralization of IT, it’s easy to over-provision resources, forget that instances are running, or unintentionally expose vulnerabilities.
It has never been a better time to be a developer! Thanks to cloud computing, deploying our applications is much easier than it used to be. How we deploy our apps continues to evolve thanks to cloud hosting, Platform-as-a-Service (PaaS), and now Function-as-a-Service. FaaS is the concept of serverless computing via serverless architectures. Software developers can leverage this to deploy an individual "function", action, or piece of business logic. They are expected to start within milliseconds...
You know you need the cloud, but you’re hesitant to simply dump everything at Amazon since you know that not all workloads are suitable for cloud. You know that you want the kind of ease of use and scalability that you get with public cloud, but your applications are architected in a way that makes the public cloud a non-starter. You’re looking at private cloud solutions based on hyperconverged infrastructure, but you’re concerned with the limits inherent in those technologies.
DevOps at Cloud Expo – being held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real r...
There are two main reasons for infrastructure automation. First, system administrators, IT professionals and DevOps engineers need to automate as many routine tasks as possible. That’s why we build tools at Stackify to help developers automate processes like application performance management, error monitoring, and log management; automation means you have more time for mission-critical tasks. Second, automation makes the management of complex, diverse environments possible and allows rapid scal...
One of the biggest challenges with adopting a DevOps mentality is: new applications are easily adapted to cloud-native, microservice-based, or containerized architectures - they can be built for them - but old applications need complex refactoring. On the other hand, these new technologies can require relearning or adapting new, oftentimes more complex, methodologies and tools to be ready for production. In his general session at @DevOpsSummit at 20th Cloud Expo, Chris Brown, Solutions Marketi...
We all know that end users experience the internet primarily with mobile devices. From an app development perspective, we know that successfully responding to the needs of mobile customers depends on rapid DevOps – failing fast, in short, until the right solution evolves in your customers' relationship to your business. Whether you’re decomposing an SOA monolith, or developing a new application cloud natively, it’s not a question of using microservices - not doing so will be a path to eventual ...
SYS-CON Events announced today that Linux Academy, the foremost online Linux and cloud training platform and community, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Linux Academy was founded on the belief that providing high-quality, in-depth training should be available at an affordable price. Industry leaders in quality training, provided services, and student certification passes, its goal is to c...
SYS-CON Events announced today that Fusion, a leading provider of cloud services, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Fusion, a leading provider of integrated cloud solutions to small, medium and large businesses, is the industry’s single source for the cloud. Fusion’s advanced, proprietary cloud service platform enables the integration of leading edge solutions in the cloud, including cloud...
SYS-CON Events announced today that HTBase will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. HTBase (Gartner 2016 Cool Vendor) delivers a Composable IT infrastructure solution architected for agility and increased efficiency. It turns compute, storage, and fabric into fluid pools of resources that are easily composed and re-composed to meet each application’s needs. With HTBase, companies can quickly prov...
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long developm...
With 10 simultaneous tracks, keynotes, general sessions and targeted breakout classes, Cloud Expo and @ThingsExpo are two of the most important technology events of the year. Since its launch over eight years ago, Cloud Expo and @ThingsExpo have presented a rock star faculty as well as showcased hundreds of sponsors and exhibitors! In this blog post, I provide 7 tips on how, as part of our world-class faculty, you can deliver one of the most popular sessions at our events. But before reading the...
The purpose of this article is draw attention to key SaaS services that are commonly overlooked during contact signing that are essential to ensuring they meet the expectations and requirements of the organization and provide guidance and recommendations for process and controls necessary for achieving quality SaaS contractual agreements.
SYS-CON Events announced today that OpsGenie will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2012, OpsGenie is an alerting and on-call management solution for dev and ops teams. OpsGenie provides the tools needed to design actionable alerts, manage on-call schedules and escalations, and ensure that the right people are notified at the right time, using multiple notification methods.