Keeping Memory Leaks and Stalled Threads in Check

Q. What are some of the performance problems you've seen associated with the GC Heap?
There are several different categories of memory-related problems that I've seen in the field. The most common of these is the memory leak. A Java memory leak is the result of objects remaining referenced after an application has completely finished using them. This tends to happen when an object that has a long lifespan within your application holds references to other objects with short lifespans.
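The pattern described above can be sketched in a few lines. This is an illustration only: the class RequestAuditLog, its methods, and the idea of an audit log are hypothetical names chosen for the example, not code from any real application. A static (long-lived) list keeps every request payload reachable, while the bounded variant lets old payloads be collected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; names are assumptions made for illustration.
final class RequestAuditLog {
    // Long-lived: a static field stays reachable for as long as the class is loaded.
    private static final List<byte[]> HISTORY = new ArrayList<>();

    // Leaky: every short-lived payload remains referenced by HISTORY indefinitely.
    static void recordLeaky(byte[] requestPayload) {
        HISTORY.add(requestPayload);
    }

    // Bounded: the oldest entries are dropped so their payloads can be collected.
    static void recordBounded(byte[] requestPayload, int maxEntries) {
        HISTORY.add(requestPayload);
        while (HISTORY.size() > maxEntries) {
            HISTORY.remove(0);
        }
    }

    static int retainedCount() {
        return HISTORY.size();
    }

    static void clear() {
        HISTORY.clear();
    }
}
```

After 100 calls to recordLeaky the list holds 100 payloads; after 100 calls to recordBounded with a limit of 10 it holds only the 10 most recent.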

Memory leaks manifest as steady increases in the baseline of the JVM's GC Heap Bytes in Use, where the baseline is defined as the bytes in use after a full garbage collect. Full garbage collects look different on different JVMs. The easiest way to know whether a full garbage collect has taken place is to turn on verbose garbage collection statistics (the flag varies by JVM: -verbose:gc on Sun JVMs, -Xverbose:gc on JRockit). You can also record the GC Heap Bytes in Use and graph them over a long time span - the garbage collection pattern jumps out very clearly with such analysis.
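As a rough sketch of the baseline idea, the following assumes you have already recorded the heap bytes in use after each full garbage collect (for example, from the verbose GC log). The class HeapBaseline and its method names are hypothetical; the only real API used is the standard MemoryMXBean, which reports the current heap usage.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper; the method names are assumptions made for illustration.
final class HeapBaseline {
    // Current heap bytes in use, as reported by the standard MemoryMXBean.
    static long heapBytesInUse() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Average growth, in bytes, between consecutive post-full-GC readings.
    static double averageGrowthPerCollection(long[] postGcBytes) {
        if (postGcBytes.length < 2) {
            return 0.0;
        }
        long total = postGcBytes[postGcBytes.length - 1] - postGcBytes[0];
        return (double) total / (postGcBytes.length - 1);
    }

    // True when there are at least two readings and each is higher than the last.
    static boolean risesSteadily(long[] postGcBytes) {
        for (int i = 1; i < postGcBytes.length; i++) {
            if (postGcBytes[i] <= postGcBytes[i - 1]) {
                return false;
            }
        }
        return postGcBytes.length >= 2;
    }
}
```

A baseline that rises on every reading over a long run is the signature described above; a baseline that wobbles around a fixed level is not.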

Besides memory leaks, many server-side applications suffer from trying to keep too much session state. In some cases, reducing the session timeout can significantly reduce the memory load on the application server. Frequently, though, the size and complexity of the result sets returned by queries was never planned for, and the application has to be re-architected to ask for results in smaller pieces.
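One way to picture "results in smaller pieces" is simple paging. In this minimal sketch the in-memory List stands in for a back-end query, which is an assumption made to keep the example self-contained; ResultPager and page are hypothetical names. In a real application the paging would normally be pushed down into the query itself so that the full result set is never held in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the List stands in for a back-end result set.
final class ResultPager {
    // Returns the requested page (zero-based), or an empty list past the end.
    static <T> List<T> page(List<T> source, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so a large page index cannot overflow.
        long start = (long) pageIndex * pageSize;
        int from = (int) Math.min(start, source.size());
        int to = (int) Math.min(start + pageSize, source.size());
        // Copy the slice so the caller holds only one page, not a view of everything.
        return new ArrayList<>(source.subList(from, to));
    }
}
```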

Many applications burn CPU cycles on temporary objects. Object instantiation is one of the more expensive operations in Java. If a single transaction requires more than 10,000 temporary object instances (e.g., String or Hashtable$Entry), performance takes a double hit: once on the creation and again on the increased frequency of garbage collection required to reclaim this memory.
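String concatenation in a loop is a familiar source of temporary objects. The sketch below uses a hypothetical CsvJoiner class to contrast two ways of building the same result; it is meant only to show where the temporaries come from, not to measure their cost.

```java
// Hypothetical example class; names are assumptions made for illustration.
final class CsvJoiner {
    // Creates at least one new temporary String on every loop iteration.
    static String joinWasteful(int[] values) {
        String out = "";
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                out += ",";
            }
            out += values[i];
        }
        return out;
    }

    // Appends into a single StringBuilder and creates one String at the end.
    static String joinFrugal(int[] values) {
        StringBuilder out = new StringBuilder(values.length * 4);
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(values[i]);
        }
        return out.toString();
    }
}
```

Both methods return the same text; the second simply leaves far less garbage behind for the collector.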

Finding the ideal GC Heap settings is a matter of deciding the right balance point between high-frequency, quick garbage collects and low-frequency, slow garbage collects. If your application has a service-level agreement requiring all transactions to be returned in under 10 seconds, then you may not be able to delay your full garbage collects more than a few minutes - longer delays mean more memory to collect and longer stalls of the JVM. Many environments have asynchronous garbage collectors available to address just this problem, and they can be worth investigating if you have strict service-level agreements for your application.

Q. What do I do if I suspect a stalled thread in my JVM?
Your JVM may have a stalled thread for any number of reasons: deadlock, livelock, lengthy timeouts on requests to a back-end system, and so on. Each of these problems has a different signature in a live application, but the first step for diagnosis is the same in all cases: produce a thread dump.

If you're on a Unix platform, thread dumping a JVM is usually as simple as sending SIGQUIT to the JVM process. This can be done with the "kill -3 PID" command, where PID is the process ID of the JVM. Be sure that you send the signal to the JVM that you suspect has the hung thread. Many servers have multiple JVMs running, so you should carefully grep the results of your ps command or check for the PID in the WebLogic console before you issue the SIGQUIT.

If your application server is alive in a console window, you can thread dump it by typing Ctrl-Break on Windows, and Ctrl-\ on Unix. A server that runs as an NT service can temporarily be forced to run in a console window, allowing you to thread dump it, by checking the "Allow Service to Interact with Desktop" box in the properties for the service. Be sure to uncheck this later so that the service does not continue to pop up a window.

If you still can't obtain a thread dump when your server is hung, try adding another pool of execute threads to your WLS instance. In some cases, the JVM cannot thread dump when there are no available threads in the WLS; having a secondary queue that is unused by incoming requests will ensure availability.
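For readers who have not looked at one before, a thread dump is essentially a list of threads, each with its state and stack trace. The following minimal sketch produces a similar listing from inside a running JVM using the standard Thread.getAllStackTraces() call. The class name InProcessThreadDump is hypothetical, the output format only approximates what a SIGQUIT dump prints, and code like this can only run if the JVM still has a thread free to execute it.

```java
import java.util.Map;

// Hypothetical helper; output format is an approximation of a JVM thread dump.
final class InProcessThreadDump {
    // Builds a text listing of every live thread with its state and stack frames.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```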

Once you have a thread dump, you should look for threads that are involved with incoming requests. They usually stand out because they have much deeper stack traces than threads that are just waiting on sockets for incoming requests. If you are unsure whether or not a particular thread is doing something normal, try searching for it on a Web search engine or on the WebLogic performance bulletin board - if it is a normal waiting condition you will probably see lots of other thread dumps on the Internet with waiting threads that look just like yours.

With luck, you will have weeded your thread dump down to a handful of threads that could be the cause of your problem. Some late-edition Sun JVMs have built-in deadlock detection. If you suspect you have a deadlock, you could try reproducing the problem in one of these JVMs to see if it flags the problem. A "livelock" is usually a thread caught in an infinite loop.
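On JVMs that provide the java.lang.management API (Java 6 and later for the call used here), the built-in deadlock detection can also be queried programmatically. This is a minimal sketch, assuming the JVM supports ThreadMXBean.findDeadlockedThreads(); on JVMs without that support the call throws UnsupportedOperationException. The class DeadlockCheck and its method name are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper wrapping the standard ThreadMXBean deadlock query.
final class DeadlockCheck {
    // Returns the names of threads the JVM reports as deadlocked (empty if none).
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have exited since the query
                names.add(info.getThreadName());
            }
        }
        return names;
    }
}
```

An empty result does not rule out a livelock or a thread stuck on a slow back-end call; it only means the JVM found no cycle of threads waiting on each other's locks.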

By taking several thread dumps in a row, you can see which of your candidate threads stay in more or less the same place each time. This eliminates any normal requests that merely happened to be caught mid-execution in a single dump.
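The comparison across dumps can be sketched as follows. Each snapshot here is reduced to a map from thread name to its top stack frame, which is a simplifying assumption: real dumps carry full stacks, and thread names are not guaranteed to be unique. StuckThreadFinder and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; a snapshot is simplified to thread name -> top stack frame.
final class StuckThreadFinder {
    // Captures the top frame of every live thread that currently has a stack.
    static Map<String, String> snapshot() {
        Map<String, String> topFrames = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] frames = entry.getValue();
            if (frames.length > 0) {
                topFrames.put(entry.getKey().getName(), frames[0].toString());
            }
        }
        return topFrames;
    }

    // Names of threads whose top frame is identical in every snapshot.
    static Set<String> unchangedAcross(List<Map<String, String>> snapshots) {
        Set<String> unchanged = new TreeSet<>();
        if (snapshots.isEmpty()) {
            return unchanged;
        }
        for (Map.Entry<String, String> first : snapshots.get(0).entrySet()) {
            boolean same = true;
            for (Map<String, String> other : snapshots) {
                if (!first.getValue().equals(other.get(first.getKey()))) {
                    same = false;
                    break;
                }
            }
            if (same) {
                unchanged.add(first.getKey());
            }
        }
        return unchanged;
    }
}
```

Idle threads waiting on sockets will also show up as unchanged, so this filter is most useful after the normal waiting threads have already been set aside.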

At this point, if you still haven't found your problem thread, or been able to take a thread dump at all, you should consider working with a performance tool designed for handling such threading problems.

*  *  *

As always, I invite you to send an e-mail to asklew@syscon.com if you have any performance-related questions about JVMs, Java applications, WebLogic Server, or connections to back-end systems.

More Stories By Lewis Cirne

Lew Cirne is the founder of New Relic, the first provider of on-demand (SaaS) application management tools for cloud or datacenter applications. A seasoned entrepreneur, technologist, and enterprise software pioneer, he has been focused on application performance management for more than ten years. Cirne holds seven patents related to application performance technology. Most recently he was an Entrepreneur in Residence at Benchmark Capital. He founded and was first CEO of Wily Technology and earlier held senior engineering positions at Apple and Hummingbird Communications.
