Welcome!

Microservices Expo Authors: Yeshim Deniz, AppDynamics Blog, Pat Romanski, Elizabeth White, Liz McMillan

Related Topics: Containers Expo Blog, Microservices Expo, @CloudExpo

Containers Expo Blog: Article

Building a Cloud Factory

A process empowering companies to more efficiently migrate workloads to the cloud

Few areas of human endeavor can match the pace of change in IT. Even by IT standards, the change being driven by cloud computing sometimes seems surprising. To refer to a virtual environment that has only recently been deployed as "legacy," as some organizations are now doing, underscores the fact that the only thing constant in the data center is change. To deal with change of this magnitude, which can involve transforming the workload hosting model of an entire organization, some industrial-strength thinking is required.

In order to tackle this challenge, it's important to properly frame the cloud transformation problem. Many associate cloud with agility, flexibility, cost transparency and other end-user-oriented benefits. But many of these attributes are primarily associated with new infrastructure requests, and specifically, the use of self-service portals to "spin up" infrastructure to host new applications or host transient processing demands. When it comes to migrating hundreds or thousands of existing workloads into cloud infrastructure, agility is not a benefit that is typically experienced. In fact the opposite is often the case: because clouds require a higher degree of standardization (i.e., a finite catalog of sizes and software options), migrating existing physical and virtual servers into cloud models can actually be quite difficult. In other words, the very features that make clouds agile for new workload deployments can actually make them less agile from a transformation perspective.

This is where the notion of a factory comes in. In industrial processes, factories are the epitome of scalability, repeatability and productivity. Although they may take some effort to "tool up," once they are up and running they can handle a higher flow of activity, efficiently processing inputs to provide consistent output. This notion is also key to large-scale transformation. By applying a common approach that has been properly engineered to give repeatable results, organizations can greatly reduce the time and effort required to migrate to cloud infrastructure.

Within this concept, it is important to expand on what is meant by "properly engineered." Many organizations tackle these kinds of problems from a grassroots perspective, using spreadsheets and smart people to determine action. The problem with this approach is it rarely evolves to the point where it can generate truly accurate answers, mainly because the problem is too complex. Migrating workloads into clouds requires processing volumes of historical data, analyzing configuration information on the servers and applications being migrated, modeling target instance sizes and software stacks, enforcing corporate and regulatory requirements, honoring SLA and data protection rules, etc. Spreadsheets are not well suited to this, in much the same way that they are ill suited for use as corporate accounting platforms. Even if they can be coaxed into giving a decent answer for simple environments, they will not generate the reports needed to satisfy stakeholders, management, engineering, operations, etc., all of whom need significant detail surrounding the decisions being made in order to ensure benefits are achieved and risk is minimized.

Buried in the list of migration analysis requirements is a key concept linking them all together. This is the notion of policy, which represents the ground rules on how workloads should be hosted, where they should and should not go, how much resources they should be allocated, etc. Without properly modeled policies, hosting decisions are left to the practitioner performing the migration, and it can be hit-or-miss whether they do the right thing (or even follow the same policy twice in a row). Planning and managing cloud infrastructure without proper policies is like trying to fill out a tax return without instructions - there are just too many variables to get it right.

With all of these concepts in mind, the exact nature of the cloud factory becomes clearer. It divides the problem into a series of logical steps that combine data, target models and cloud planning and management policies in order to automate the process of deciding exactly where things go and how big to make them. These steps that make up the factory are:

  1. Candidate Qualification: This process determines whether a given set of workloads are suitable to be hosted in a given cloud environment. This is both qualitative and quantitative in nature and designed to separate true candidates from the workloads that are better suited to go elsewhere (more on this later in step 6). Examples of quantitative criteria include maximum I/O rates, context switching limitations, maximum CPU and memory sizes, etc. Qualitative criteria include data sensitivity, SLA requirements, backup strategy and other considerations. By applying a policy capturing all of these factors, a rapid and accurate assessment can be made.
  2. Sizing: This takes the qualified candidates and determines what cloud instances are best suited to host them given their historical levels and patterns of utilization. This again is subject to policy, which governs how much history is considered, target utilization levels, etc. The result is a detailed specification of the instance sizes needed and the projected utilization levels in the "to be" environment. Note the use of benchmarks is critical in this step, as the translation of CPU utilization from the current environment to the cloud depends on the relative speeds of the CPU employed in each.
  3. Load Balancing: Also a sizing step, this is focused on the load balancers and clusters being migrated. Because cloud environments offer different sizing options, and can even offer more advanced "elasticity" features, it is not always desirable to do a straight one-to-one translation of these servers into cloud capacity. For example, an 8-way IIS cluster might translate onto 12 smalls, 6 mediums and 3 large instances. Of these options, the one that meets the policy criteria (e.g., size for yearly peak activity, allow for N+1 resiliency) at the lowest cost will be the winner. This result is combined with the general sizing results from the previous step to provide a complete sizing plan.
  4. Software Stack Mapping: This step considers the OS and software configurations of the source servers and maps them onto the "closest" configuration available in the cloud. Because cloud catalogs only offer a finite set of software options, this is effectively a standardization analysis. For Infrastructure-as-a-Service (IaaS), this step is typically limited to the OS-level configuration and matches the OS attributes of the existing servers and VMs to the operating systems that are on offer in the cloud (which is typically a much shorter list). For Platform-as-a-Service this step also includes scrutiny of the actual software inventory and applications installed. The result may say "server X looks the most like an IIS v6 server, but differs from the standard image in the following ways..." This not only provides the optimal stack to deploy, but also generates a remediation list that is critical for reducing risk during implementation.
  5. Placement: Once the final specification is arrived at (through sizing, balancing and software mapping), the next step for internal cloud environments is determining exactly where the workloads should be placed in the infrastructure actually hosting the cloud environment. Because most clouds are based on virtual environments, the key is to fit the new VMs into the environment in a way that optimally leverages server resources. This step looks somewhat similar to placement of workloads in virtual environments (which tends to resemble placing Tetris blocks in available server capacity), but the policy regarding overcommit has a large influence on the resulting placements. If the policy is to strictly reserve the capacity for each cloud instance, then the environment will be very safe but relatively inefficient, as the workload density will be quite low (think of playing Tetris with the blocks wrapped in bubbles). If the policy is to fully overcommit resources, then the end customer may have a higher risk of contention if they place unanticipated demands on the environment, but the higher density that results can result in significantly lower costs (think Tetris blocks packed tightly together, requiring far less capacity).
  6. Exception Handling: Going back to step 1, there are typically components of an application or business service that may not be suitable for hosting in the cloud. For these systems, it is necessary to evaluate other hosting options in order to determine what to do with them. Because there is often an order of precedence with respect to the hosting options, this step involves the systematic qualification of the rejected workloads against an ordered set of hosting strategies. These strategies can include using cloud instances with customized allocations, using dedicated cloud servers, hosting in a virtual environment, using dedicated blades, using dedicated rack mount servers or leaving the workloads alone (a last resort). By passing the rejected candidates through this gauntlet of options, each will arrive at a viable outcome.

The result of applying these steps is a methodical, exhaustive and rapid process for planning cloud migrations. By taking a data-centric, policy-driven approach, fewer mistakes are made, less rework is required, and application owners and other stakeholders will have much higher confidence they will arrive on the other end unscathed. This transparency, combined with the detailed specifications and implementation details that emerge, can rapidly accelerate cloud initiatives. This not only reduces time-to-value, but also enables IT organizations to keep up with the pace of technology innovation, which shows no sign of letting up.

More Stories By Andrew Hillier

Andrew Hillier is CTO and co-founder of CiRBA, Inc., a data center intelligence analytics software provider that determines optimal workload placements and resource allocations required to safely maximize the efficiency of Cloud, virtual and physical infrastructure. Reach Andrew at [email protected]

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@MicroservicesExpo Stories
24Notion is full-service global creative digital marketing, technology and lifestyle agency that combines strategic ideas with customized tactical execution. With a broad understand of the art of traditional marketing, new media, communications and social influence, 24Notion uniquely understands how to connect your brand strategy with the right consumer. 24Notion ranked #12 on Corporate Social Responsibility - Book of List.
With the rise of Docker, Kubernetes, and other container technologies, the growth of microservices has skyrocketed among dev teams looking to innovate on a faster release cycle. This has enabled teams to finally realize their DevOps goals to ship and iterate quickly in a continuous delivery model. Why containers are growing in popularity is no surprise — they’re extremely easy to spin up or down, but come with an unforeseen issue. However, without the right foresight, DevOps and IT teams may lo...
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lea...
SYS-CON Events announced today that Sheng Liang to Keynote at SYS-CON's 19th Cloud Expo, which will take place on November 1-3, 2016 at the Santa Clara Convention Center in Santa Clara, California.
Much of the value of DevOps comes from a (renewed) focus on measurement, sharing, and continuous feedback loops. In increasingly complex DevOps workflows and environments, and especially in larger, regulated, or more crystallized organizations, these core concepts become even more critical. In his session at @DevOpsSummit at 18th Cloud Expo, Andi Mann, Chief Technology Advocate at Splunk, showed how, by focusing on 'metrics that matter,' you can provide objective, transparent, and meaningful f...
Just over a week ago I received a long and loud sustained applause for a presentation I delivered at this year’s Cloud Expo in Santa Clara. I was extremely pleased with the turnout and had some very good conversations with many of the attendees. Over the next few days I had many more meaningful conversations and was not only happy with the results but also learned a few new things. Here is everything I learned in those three days distilled into three short points.
With online viewership and sales growing rapidly, enterprises are interested in understanding how they analyze performance to positively impact business metrics. Deeper insight into the user experience is needed to understand why conversions are dropping and/or bounce rates are increasing or, preferably, to understand what has been helping these metrics improve. The digital performance management industry has evolved as application performance management companies have broadened their scope beyo...
Digitization is driving a fundamental change in society that is transforming the way businesses work with their customers, their supply chains and their people. Digital transformation leverages DevOps best practices, such as Agile Parallel Development, Continuous Delivery and Agile Operations to capitalize on opportunities and create competitive differentiation in the application economy. However, information security has been notably absent from the DevOps movement. Speed doesn’t have to negat...
While DevOps promises a better and tighter integration among an organization’s development and operation teams and transforms an application life cycle into a continual deployment, Chef and Azure together provides a speedy, cost-effective and highly scalable vehicle for realizing the business values of this transformation. In his session at @DevOpsSummit at 19th Cloud Expo, Yung Chou, a Technology Evangelist at Microsoft, will present a unique opportunity to witness how Chef and Azure work tog...
Your business relies on your applications and your employees to stay in business. Whether you develop apps or manage business critical apps that help fuel your business, what happens when users experience sluggish performance? You and all technical teams across the organization – application, network, operations, among others, as well as, those outside the organization, like ISPs and third-party providers – are called in to solve the problem.
Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some form of XaaS - software, platform, and infrastructure as a service.
As applications are promoted from the development environment to the CI or the QA environment and then into the production environment, it is very common for the configuration settings to be changed as the code is promoted. For example, the settings for the database connection pools are typically lower in development environment than the QA/Load Testing environment. The primary reason for the existence of the configuration setting differences is to enhance application performance. However, occas...
Internet of @ThingsExpo, taking place November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with the 19th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world and ThingsExpo Silicon Valley Call for Papers is now open.
If you’re responsible for an application that depends on the data or functionality of various IoT endpoints – either sensors or devices – your brand reputation depends on the security, reliability, and compliance of its many integrated parts. If your application fails to deliver the expected business results, your customers and partners won't care if that failure stems from the code you developed or from a component that you integrated. What can you do to ensure that the endpoints work as expect...
Apache Hadoop is a key technology for gaining business insights from your Big Data, but the penetration into enterprises is shockingly low. In fact, Apache Hadoop and Big Data proponents recognize that this technology has not yet achieved its game-changing business potential. In his session at 19th Cloud Expo, John Mertic, director of program management for ODPi at The Linux Foundation, will explain why this is, how we can work together as an open data community to increase adoption, and the i...
SYS-CON Events announced today that Tintri Inc., a leading producer of VM-aware storage (VAS) for virtualization and cloud environments, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Tintri VM-aware storage is the simplest for virtualized applications and cloud. Organizations including GE, Toyota, United Healthcare, NASA and 6 of the Fortune 15 have said “No to LUNs.” With Tintri they mana...
SYS-CON Events announced today that Numerex Corp, a leading provider of managed enterprise solutions enabling the Internet of Things (IoT), will exhibit at the 19th International Cloud Expo | @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Numerex Corp. (NASDAQ:NMRX) is a leading provider of managed enterprise solutions enabling the Internet of Things (IoT). The Company's solutions produce new revenue streams or create operating...
In his general session at 18th Cloud Expo, Lee Atchison, Principal Cloud Architect and Advocate at New Relic, discussed cloud as a ‘better data center’ and how it adds new capacity (faster) and improves application availability (redundancy). The cloud is a ‘Dynamic Tool for Dynamic Apps’ and resource allocation is an integral part of your application architecture, so use only the resources you need and allocate /de-allocate resources on the fly.
Cloud Expo 2016 New York at the Javits Center New York was characterized by increased attendance and a new focus on operations. These were both encouraging signs for all involved in Cloud Computing and all that it touches. As Conference Chair, I work with the Cloud Expo team to structure three keynotes, numerous general sessions, and more than 150 breakout sessions along 10 tracks. Our job is to balance the state of enterprise IT today with the trends that will be commonplace tomorrow. Mobile...