
Hunting an Oracle JDBC Memory Leak Crashing an 80-JVM WebSphere Cluster

Are you prematurely restarting JVMs to keep them from running out of memory? Or have you received the recommendation from your application experts to just “increase the Java Heap Space?”

The following memory chart shows the memory consumption of 10 JVMs (4.1GB heap each) per host on an 8-machine cluster. The JVMs kept running out of memory, crashing with Out of Memory exceptions and sometimes even taking the whole host down with them:

Starting in May, all JVMs showed the same memory pattern of consuming the full 4.1GB heap until they crashed. The similar pattern across all JVMs sometimes even brought the whole host down.

The typical first response from application teams in situations like this is to add more memory to the hosts. However, since that only treats the symptom, the team this story comes from decided to take a more proactive approach and look closer at the actual root cause of the excessive memory consumption.

Step #1: Analyzing the Out-of-Memory Dumps

Whenever one of the team’s JVMs crashes, its APM solution automatically captures a full heap memory dump. This makes it very convenient to analyze the root cause of the excessive memory usage post mortem. In this scenario the hotspot was easy to identify. The following screenshot shows that the root cause is 22.7k T4CStatement objects. They consume about 2.6GB on the heap, and the reason they are not cleared by the Garbage Collector is that they are still referenced by global static variables:

The main problem is the growing number of T4CStatement objects: they consume 2.6GB and can’t be cleared by the GC because they are still referenced.
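If you do not have an APM solution that captures a heap dump on every crash, the JVM itself can produce a comparable artifact for post-mortem analysis. The following is only a sketch under assumptions: it targets a HotSpot/OpenJDK-based JVM on Java 7 or later (the MXBean and the flags shown in the comments are not available on the IBM JVMs that commonly ship with WebSphere, which use their own dump options), and the file paths are made up.

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class HeapDumpOnDemand {

        // Trigger a heap dump programmatically, e.g. from an operations servlet or
        // a JMX client, so a post-mortem analysis like the one above is possible
        // even without an APM solution capturing dumps automatically.
        public static void dumpHeap(String filePath) throws Exception {
            HotSpotDiagnosticMXBean diagnostics = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);

            // true = include only live (still reachable) objects in the dump
            diagnostics.dumpHeap(filePath, true);
        }

        public static void main(String[] args) throws Exception {
            // Equivalent JVM flags if you prefer an automatic dump on a crash:
            //   -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps
            dumpHeap("/tmp/websphere-jvm.hprof");
        }
    }

The resulting .hprof file can then be opened in any heap analyzer to look for dominating object types, just like the T4CStatement hotspot shown above.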

Step #2: Who is allocating these objects and why are they not cleaned up?

Taking a closer look at who allocates these T4CStatement objects shows that they are allocated whenever a SQL statement is executed, through the allocateStatement method called by createStatement. The problem is that the T4CStatement object is allocated but never freed when the associated connection gets put back into the connection pool:

A T4CStatement object gets created for every new statement that gets executed. The problem is that these objects are never released once they are no longer needed.

Not releasing them keeps them on the heap until all the memory is consumed. The following screenshot shows that most of these objects have been on the heap for hours or even days:

Looking at the Heap Dump also shows how long these objects have been on the heap – verifying that the GC really never cleans them.
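To make that pattern easier to picture, here is a deliberately simplified sketch of the situation from the application’s point of view. This is not the actual Oracle driver code; the class name and the static registry below are invented purely to illustrate why a statement that is still referenced from a global variable can never be reclaimed by the GC.

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    // Simplified illustration of the leak pattern described above -- NOT the actual
    // Oracle driver code. A hypothetical static collection stands in for the global
    // reference that kept the T4CStatement objects reachable.
    public class LeakyStatementFactory {

        // Anything added to this static list stays strongly reachable forever,
        // so the GC can never reclaim it.
        private static final List<Statement> GLOBAL_STATEMENT_REGISTRY = new ArrayList<>();

        public static Statement createStatement(Connection pooledConnection) throws Exception {
            Statement statement = pooledConnection.createStatement();
            GLOBAL_STATEMENT_REGISTRY.add(statement);   // reference is added here ...
            return statement;                           // ... but never removed again
        }

        public static void runQuery(Connection pooledConnection, String sql) throws Exception {
            Statement statement = createStatement(pooledConnection);
            try (ResultSet rs = statement.executeQuery(sql)) {
                while (rs.next()) { /* process rows */ }
            }
            // The connection goes back into the pool here, but the statement is still
            // referenced by GLOBAL_STATEMENT_REGISTRY -- exactly the situation visible
            // in the heap dump: thousands of statements, hours or days old, that the
            // GC cannot clean up.
        }
    }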

Step #3: Working with the Vendor to Fix the Problem

Oracle, the vendor of the JDBC driver, dug into its implementation and found the place where these statement objects were kept in a global variable so that they could not be cleaned up by the GC. With a quick turnaround the vendor provided a fix so that T4CStatement objects are no longer referenced for longer than the transactions need them. That solved all the crashes and of course reduced average memory consumption. It also leads to more efficient use of the JVMs in the cluster, allowing more load to be processed on the same environment without fear of running into memory issues.
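Independent of the vendor patch, it is good defensive practice for application code not to rely on the driver or the connection pool to clean statements up on its behalf. Below is a minimal sketch of that pattern using try-with-resources (Java 7 or later); the DAO class, table and column names are hypothetical and not taken from the application described above.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.sql.DataSource;

    public class CustomerDao {

        private final DataSource dataSource;

        public CustomerDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // try-with-resources closes the ResultSet, the PreparedStatement and the
        // Connection (returning it to the pool) in reverse order, even if the query
        // throws, so the application never depends on the driver or the pool to
        // clean statements up for it.
        public String findCustomerName(long customerId) throws Exception {
            String sql = "SELECT name FROM customer WHERE id = ?";
            try (Connection connection = dataSource.getConnection();
                 PreparedStatement statement = connection.prepareStatement(sql)) {
                statement.setLong(1, customerId);
                try (ResultSet rs = statement.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }
    }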

Want to learn more about Enterprise Memory Management?

I can also recommend checking out our free online book chapter on Memory Management. Or read up on some of the other memory management & memory leak related blog posts such as DevOps way to solving JVM Memory Issues, Fixing Memory Leaks in Java Production Applications or Top Java Memory Problems.

About The Author
Andreas Grabner
Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

Comments

  1. Can you tell us which version of the Oracle JDBC drivers have this problem and when it was fixed?

    • Here is the answer from one of the engineers who worked on this issue:
      We found that there is a particular file that seems to be the culprit of this issue. The ojdbc6.jar file needs to be updated to version 11.2.0.1 or later. I was able to get that message across to the vendor, but instead of only updating that jar file they created a patch that did multiple things. This of course meant that I could not definitively say that updating that one file solved this problem rather than a combination of files. But, as part of the patch, they did update the jar file to 11.2.0.1. The fix was finally implemented in August 2013.

      I hope this helps. (A small sketch for checking which driver version a running system reports follows after the comments.)

  2. Robert Panzer says:

    We had a similar problem and resolved it by ensuring that Statement.close() is called together with Connection.close().
    At least in our case that was often missing.

  3. We encountered the same problem on a JBoss server; it was actually a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1008763
    so if you have queries like:
    select * from table where id in (?,?)
    select * from table where id in (?,?,?)
    select * from table where id in (?,?,?,?)
    then the prepared statement cache size is exceeded, which leads to a memory leak (a short sketch of this pattern, and one way to avoid it, follows after the comments).

  4. David Lopes says:

    Very nice post. We had a similar situation with IBM MQ jars. It is interesting to see how APM tools can sometimes pinpoint issues even in external components from big vendors, after their customers have spent a lot of time figuring out whether there was something wrong with their own application.

  5. kgvsprasad says:

    Thanks for the post, very helpful. We are facing a similar problem. Where can I get the fix?
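As a practical follow-up to the first and last comments: the fix ultimately means running a new enough driver, and the JDBC metadata API can tell you at runtime which driver level your application actually loads. A small sketch, assuming the DataSource is provided by your container:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import javax.sql.DataSource;

    public class DriverVersionCheck {

        // Prints the driver name and version the JDBC driver reports about itself,
        // which is a quick way to verify whether the deployed ojdbc6.jar is at the
        // 11.2.0.1 level (or later) mentioned in the first comment.
        public static void printDriverVersion(DataSource dataSource) throws Exception {
            try (Connection connection = dataSource.getConnection()) {
                DatabaseMetaData metaData = connection.getMetaData();
                System.out.println("Driver name:    " + metaData.getDriverName());
                System.out.println("Driver version: " + metaData.getDriverVersion());
            }
        }
    }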
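And regarding comment #3 above: the sketch below illustrates why queries with a varying number of IN-clause placeholders each become a separate prepared statement, together with one common mitigation. The fixed list size of 16 and the “never matches” bind value of -1 are assumptions for illustration only, not part of the original bug report or its fix.

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    // Illustration of the pattern from comment #3: every different number of IN-clause
    // placeholders produces a different SQL text and therefore a separate entry in the
    // prepared statement cache.
    public class InClauseExample {

        // "... IN (?,?)", "... IN (?,?,?)", "... IN (?,?,?,?)" are all distinct
        // statements from the cache's point of view, so an unbounded variety of list
        // sizes can make the cache (and the heap) grow without limit.
        public static String buildInClause(int numberOfIds) {
            StringBuilder sql = new StringBuilder("SELECT * FROM some_table WHERE id IN (");
            for (int i = 0; i < numberOfIds; i++) {
                sql.append(i == 0 ? "?" : ",?");
            }
            return sql.append(")").toString();
        }

        // One common mitigation (an assumption, not the fix from the article): always
        // build the statement with a fixed number of placeholders and bind the unused
        // positions to a value that can never match, so only one statement text ever
        // reaches the cache.
        public static PreparedStatement prepareFixedSize(Connection connection, long[] ids) throws Exception {
            final int FIXED_SIZE = 16;  // hypothetical upper bound for the id list
            PreparedStatement statement = connection.prepareStatement(buildInClause(FIXED_SIZE));
            for (int i = 0; i < FIXED_SIZE; i++) {
                statement.setLong(i + 1, i < ids.length ? ids[i] : -1L);  // -1 assumed to never match an id
            }
            return statement;
        }
    }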
