|By Jyoti Bansal||
|February 18, 2017 02:00 PM EST||
How Java's Built-In Garbage Collection Will Make Your Life Better (Most of the Time)
By Kirk Pepperdine
“No provision need be made for the user to program the return of registers to the free-storage list.”
This line (along with the dozen or so that followed it) is buried in the middle of John McCarthy’s landmark paper, “Recursive Functions of Symbolic Expressions and Their Computation by Machine,” published in 1960. It is the first known description of automated memory management.
In specifying how to manage memory in Lisp, McCarthy was able to exclude explicit memory management. Thus, McCarthy relieved developers of the tedium of manual memory management. What makes this story truly amazing is that these few words inspired others to incorporate some form of automated memory management—otherwise known as garbage collection (GC)—into more than three quarters of the more widely used languages and runtimes developed since then. This list includes the two most popular platforms, Java’s Virtual Machine (JVM) and .NET’s Common Language Runtime (CLR), as well as the up and coming Go Lang by Google. GC exists not just on big iron but on mobile platforms such as Android’s Dalvik, Android Runtime, and Apple’s Swift. You can even find GC running in your web browser as well as on hardware devices such as SSDs. Let’s explore some of the reasons why the industry prefers automated over manual memory management.
Automatic Memory Management’s Humble Beginnings
So, how did McCarthy devise automated memory management? First, the Lisp engine decomposed Lisp expressions into sub-expressions, and each S-expression was stored in a single word node in a linked list. The nodes were allocated from a free list, but they didn’t have to be returned to the free list until it was empty.
Once the free list was empty, the runtime traced through the linked list and marked all reachable nodes. Next, it scanned through the buffer containing all nodes, and returned unmarked nodes to the free list. With the free-list refilled, the application would continue on.
Today, this is known as a single-space, in-place, tracing garbage collection. The implementation was quite rudimentary: it only had to deal with an acyclic-directed graph where all nodes were exactly the same size. Only a single thread ran, and that thread either executed application code or the garbage collector. In contrast, today’s collectors in the JVM must cope with a directed graph with cycles and nodes that are not uniformly sized. The JVM is multi-threaded, running on multi-core CPUs, possibly multi-socketed motherboards. Consequently, today’s implementations are far more complex—to the point GC experts struggle to predict performance in any given situation.
Slow Going: Garbage Collection Pause Time
When the Lisp garbage collector ran, the application stalled. In the initial versions of Lisp it was common for the collector to take 30 to 40 percent of the CPU cycles. On 1960s hardware this could cause the application stall, in what is known as a stop-the-world pause, for several minutes. The benefit was that allocation had barely any impact on application throughput (the amount of useful work done). This implementation highlighted the constant battle between pause time and impact on application throughput that persists to this day.
In general, the better the pause time characteristic of the collector, the more impact it has on application throughput. The current implementations in Java all come with pause time/overhead costs. The parallel collections come with long pause times and low overheads, while the mostly concurrent collectors have shorter pause times and consume more computing resources (both memory and CPU).
The goal of any GC implementer is to maximize the minimum amount of processor time that mutator threads are guaranteed to receive, a concept known as minimum mutator utilization (MMU). Even so, current GC overheads can run well under 5 percent, versus the 15 to 20 percent overhead you will experience in a typical C++ application.
So why you don’t feel this overhead like you do in a Java application? Because the overhead is evenly spread throughout the C/C++ run time, it is perceptibly invisible to the end users. In fact the biggest complaint about managed memory is that it pauses your application at unpredictable times for an unpredictable amount of time.
Garbage Collection Advancements
Sun Java’s initial garbage collector did nothing to improve the image of garbage collection. Its single-threaded, single-spaced implementation stalled applications for long periods of time and created a significant drag on allocation rates. It wasn’t until Java 2, when a generational memory pool scheme—along with parallel, mostly concurrent and incremental collectors—was introduced. While these collectors offered improved pause time characteristics, pause times continue to be problematic. Moreover, these implementations are so complex that it’s unlikely most developers have the experience necessary to tune them. To further complicate the picture, IBM, Azul, and RedHat have one or more of their own garbage collectors—each with their own histories, advantages and quirks. In addition, a number of companies including SAP, Twitter, Google, Alibaba, and others have their own internal JVM teams with modified versions of the Garbage collectors.
Costs and Benefits of Modern-Day Garbage Collection
Over time, an addition of alternate and more complex allocation paths led to huge improvements in the allocation overhead picture. For example, a fast-path allocation in the JVM is now approximately 30 times faster than a typical allocation in C/C++. The complication: Only data that can pass an escape analysis test is eligible for fast-path allocation. (Fortunately the vast majority of our data passes this test and benefits from this alternate allocation path.)
Another advantage is in the reduced costs and simplified cost models that come with evacuating collectors. In this scheme, the collector copies live data to another memory pool. Thus, there is no cost to recover short-lived data. This isn’t an invitation to allocate ad nauseam, because there is a cost for each allocation and high allocation rates trigger more frequent GC activity and accumulate extra copy costs. While evacuating collectors helps make GC more efficient and predictable, there are still significant resource costs.
That leads us to memory. Memory management demands that you retain at least five times more memory than manual memory management needs. There are times the developer knows for certain that data should be freed. In those cases, it is cheaper to explicitly free rather than have a collector reason through the decision. It was these costs that originally caused Apple to choose manual memory management for Objective-C. In Swift, Apple chose to use reference counting. They added annotations for weak and owned references to help the collector cope with circular references.
There are other intangible or difficult-to-measure costs that can be attributed to design decisions in the runtime. For example, the loss of control over memory layouts can result in application performance being dominated by L2 cache misses and cache line densities. The performance hit in these cases can easily exceed a factor of 10:1. One of the challenges for future implementers is to allow for better control of memory layouts.
Looking back at how poorly GC performed when first introduced into Lisp and the long and often frustrating road to its current state, it’s hard to imagine why anyone building a runtime would want to use managed memory. But consider that if you manually manage memory, you need access to the underlying reference system—and that means the language needs added syntax to manipulate memory pointers.
Languages that rely on managed memory consistently lack the syntax needed to manage pointers because of the memory consistency guarantee. That guarantee states that all pointers will point where they should without dangling (null) pointers waiting to blow up the runtime, if you should happen to step on them. The runtime can’t make this guarantee if developers are allowed to directly create and manipulate pointers. As an added bonus, removing them from the language removes indirection, one of the more difficult concepts for developers to master. Quite often bugs are a result of a developer engaged in the mental gymnastics required to juggle a multitude of competing concerns and getting it wrong. If this mix contains reasoning through application logic, along with manual memory management and different memory access modes, bugs likely appear in the code. In fact, bugs in systems that rely on manual memory management are among the most serious and largest source of security holes in our systems today.
To prevent these types of bugs the developer always has to ask, “Do I still have a viable reference to this data that prevents me from freeing it?” Often the answer to this question is, “I don’t know.” If a reference to that data was passed to another component in the system, it’s almost impossible to know if memory can safely be freed. As we all know too well, pointer bugs will lead to data corruption or, in the best case, a SIGSEGV.
Removing pointers from the picture tends to yield a code that is more readable and easier to reason through and maintain. GC knows when it can reclaim memory. This attribute allows projects to safely consume third-party components, something that rarely happens in languages with manual memory management.
At its best, memory management can be described as a tedious bookkeeping task. If memory management can be crossed off the to-do list, then developers tend to be more productive and produce far fewer bugs. We have also seen that GC is not a panacea as it comes with its own set of problems. But thankfully the march toward better implementations continues.
Go Lang’s new collector uses a combination of reference counting and tracing to reduce overheads and minimize pause times. Azul claims to have solved the GC pause problem by driving pause times down dramatically. Oracle and IBM keep working on collectors that they claim are better suited for very large heaps that contain significant amounts of data. RedHat has entered the fray with Shenandoah, a collector that aims to completely eliminate pause times from the run time. Meanwhile, Twitter and Google continue to improve the existing collectors so they continue to be competitive to the newer collectors.
Share “How Java’s Built-In Garbage Collection Will Make Your Life Better (Most of the Time)” On Your Site
After more than five years of DevOps, definitions are evolving, boundaries are expanding, ‘unicorns’ are no longer rare, enterprises are on board, and pundits are moving on. Can we now look at an evolution of DevOps? Should we? Is the foundation of DevOps ‘done’, or is there still too much left to do? What is mature, and what is still missing? What does the next 5 years of DevOps look like? In this Power Panel at DevOps Summit, moderated by DevOps Summit Conference Chair Andi Mann, panelists l...
Mar. 28, 2017 05:00 AM EDT Reads: 9,878
The rise of containers and microservices has skyrocketed the rate at which new applications are moved into production environments today. While developers have been deploying containers to speed up the development processes for some time, there still remain challenges with running microservices efficiently. Most existing IT monitoring tools don’t actually maintain visibility into the containers that make up microservices. As those container applications move into production, some IT operations t...
Mar. 28, 2017 01:15 AM EDT Reads: 3,047
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
Mar. 28, 2017 01:00 AM EDT Reads: 2,340
The IT industry is undergoing a significant evolution to keep up with cloud application demand. We see this happening as a mindset shift, from traditional IT teams to more well-rounded, cloud-focused job roles. The IT industry has become so cloud-minded that Gartner predicts that by 2020, this cloud shift will impact more than $1 trillion of global IT spending. This shift, however, has left some IT professionals feeling a little anxious about what lies ahead. The good news is that cloud computin...
Mar. 27, 2017 04:30 PM EDT Reads: 1,418
By now, every company in the world is on the lookout for the digital disruption that will threaten their existence. In study after study, executives believe that technology has either already disrupted their industry, is in the process of disrupting it or will disrupt it in the near future. As a result, every organization is taking steps to prepare for or mitigate unforeseen disruptions. Yet in almost every industry, the disruption trend continues unabated.
Mar. 27, 2017 04:15 PM EDT Reads: 507
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore t...
Mar. 27, 2017 03:30 PM EDT Reads: 2,974
Lots of cloud technology predictions and analysis are still dealing with future spending and planning, but there are plenty of real-world cloud use cases and implementations happening now. One approach, taken by stalwart GE, is to use SaaS applications for non-differentiated uses. For them, that means moving functions like HR, finance, taxes and scheduling to SaaS, while spending their software development time and resources on the core apps that make GE better, such as inventory, planning and s...
Mar. 27, 2017 03:00 PM EDT Reads: 464
As Enterprise business moves from Monoliths to Microservices, adoption and successful implementations of Microservices become more evident. The goal of Microservices is to improve software delivery speed and increase system safety as scale increases. Documenting hurdles and problems for the use of Microservices will help consultants, architects and specialists to avoid repeating the same mistakes and learn how and when to use (or not use) Microservices at the enterprise level. The circumstance w...
Mar. 27, 2017 03:00 PM EDT Reads: 4,316
Everyone wants to use containers, but monitoring containers is hard. New ephemeral architecture introduces new challenges in how monitoring tools need to monitor and visualize containers, so your team can make sense of everything. In his session at @DevOpsSummit, David Gildeh, co-founder and CEO of Outlyer, will go through the challenges and show there is light at the end of the tunnel if you use the right tools and understand what you need to be monitoring to successfully use containers in your...
Mar. 27, 2017 01:15 PM EDT Reads: 1,750
What if you could build a web application that could support true web-scale traffic without having to ever provision or manage a single server? Sounds magical, and it is! In his session at 20th Cloud Expo, Chris Munns, Senior Developer Advocate for Serverless Applications at Amazon Web Services, will show how to build a serverless website that scales automatically using services like AWS Lambda, Amazon API Gateway, and Amazon S3. We will review several frameworks that can help you build serverle...
Mar. 27, 2017 01:00 PM EDT Reads: 2,111
The Software Defined Data Center (SDDC), which enables organizations to seamlessly run in a hybrid cloud model (public + private cloud), is here to stay. IDC estimates that the software-defined networking market will be valued at $3.7 billion by 2016. Security is a key component and benefit of the SDDC, and offers an opportunity to build security 'from the ground up' and weave it into the environment from day one. In his session at 16th Cloud Expo, Reuven Harrison, CTO and Co-Founder of Tufin, ...
Mar. 27, 2017 11:30 AM EDT Reads: 6,805
SYS-CON Events announced today that HTBase will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. HTBase (Gartner 2016 Cool Vendor) delivers a Composable IT infrastructure solution architected for agility and increased efficiency. It turns compute, storage, and fabric into fluid pools of resources that are easily composed and re-composed to meet each application’s needs. With HTBase, companies can quickly prov...
Mar. 27, 2017 10:30 AM EDT Reads: 3,036
Building custom add-ons does not need to be limited to the ideas you see on a marketplace. In his session at 20th Cloud Expo, Sukhbir Dhillon, CEO and founder of Addteq, will go over some adventures they faced in developing integrations using Atlassian SDK and other technologies/platforms and how it has enabled development teams to experiment with newer paradigms like Serverless and newer features of Atlassian SDKs. In this presentation, you will be taken on a journey of Add-On and Integration ...
Mar. 27, 2017 08:15 AM EDT Reads: 3,198
Culture is the most important ingredient of DevOps. The challenge for most organizations is defining and communicating a vision of beneficial DevOps culture for their organizations, and then facilitating the changes needed to achieve that. Often this comes down to an ability to provide true leadership. As a CIO, are your direct reports IT managers or are they IT leaders? The hard truth is that many IT managers have risen through the ranks based on their technical skills, not their leadership abi...
Mar. 27, 2017 05:00 AM EDT Reads: 11,180
The essence of cloud computing is that all consumable IT resources are delivered as services. In his session at 15th Cloud Expo, Yung Chou, Technology Evangelist at Microsoft, demonstrated the concepts and implementations of two important cloud computing deliveries: Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). He discussed from business and technical viewpoints what exactly they are, why we care, how they are different and in what ways, and the strategies for IT to transi...
Mar. 27, 2017 05:00 AM EDT Reads: 6,304
Without a clear strategy for cost control and an architecture designed with cloud services in mind, costs and operational performance can quickly get out of control. To avoid multiple architectural redesigns requires extensive thought and planning. Boundary (now part of BMC) launched a new public-facing multi-tenant high resolution monitoring service on Amazon AWS two years ago, facing challenges and learning best practices in the early days of the new service.
Mar. 27, 2017 03:45 AM EDT Reads: 3,099
All organizations that did not originate this moment have a pre-existing culture as well as legacy technology and processes that can be more or less amenable to DevOps implementation. That organizational culture is influenced by the personalities and management styles of Executive Management, the wider culture in which the organization is situated, and the personalities of key team members at all levels of the organization. This culture and entrenched interests usually throw a wrench in the work...
Mar. 27, 2017 03:00 AM EDT Reads: 3,116
As software becomes more and more complex, we, as software developers, have been splitting up our code into smaller and smaller components. This is also true for the environment in which we run our code: going from bare metal, to VMs to the modern-day Cloud Native world of containers, schedulers and micro services. While we have figured out how to run containerized applications in the cloud using schedulers, we've yet to come up with a good solution to bridge the gap between getting your contain...
Mar. 26, 2017 09:45 PM EDT Reads: 7,782
As organizations realize the scope of the Internet of Things, gaining key insights from Big Data, through the use of advanced analytics, becomes crucial. However, IoT also creates the need for petabyte scale storage of data from millions of devices. A new type of Storage is required which seamlessly integrates robust data analytics with massive scale. These storage systems will act as “smart systems” provide in-place analytics that speed discovery and enable businesses to quickly derive meaningf...
Mar. 26, 2017 07:45 PM EDT Reads: 9,732
DevOps has often been described in terms of CAMS: Culture, Automation, Measuring, Sharing. While we’ve seen a lot of focus on the “A” and even on the “M”, there are very few examples of why the “C" is equally important in the DevOps equation. In her session at @DevOps Summit, Lori MacVittie, of F5 Networks, explored HTTP/1 and HTTP/2 along with Microservices to illustrate why a collaborative culture between Dev, Ops, and the Network is critical to ensuring success.
Mar. 26, 2017 03:00 PM EDT Reads: 10,722