About the Author

Michael Kopp is a Technical Product Manager at Dynatrace. Reach him at @mikopp

The reason I don’t monitor connection pool usage

I have been working with performance-sensitive applications for a long time now. As can be expected, most of them have to use the database at one point or another, so you inevitably end up with a connection pool. Now, to make sure that your application is not suffering from waiting on connections, you monitor the pool usage. But is that really helping? To be honest, not really…

How an application uses the connection pool

Most applications these days use connection pools implicitly. They get a connection, execute some statements and close it. The close call does not destroy the connection but puts it back into the pool. The goal is to minimize the so-called busy time. Under the hood, most application servers refrain from putting a connection back into the pool until the transaction has been committed. For this reason it is good practice to get the database connection as late as possible during a transaction. Again the goal is to minimize usage time, so that many application threads can share a limited number of connections.
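In code, this pattern typically looks like the following sketch (the DAO class and table names are hypothetical; any pool that hides behind the standard `javax.sql.DataSource` interface behaves this way):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class OrderDao {
    private final DataSource pool; // the pool is hidden behind the DataSource interface

    public OrderDao(DataSource pool) { this.pool = pool; }

    public int countOrders(long customerId) throws SQLException {
        // Acquire the connection as late as possible and release it as
        // early as possible to keep the busy time short.
        try (Connection con = pool.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT COUNT(*) FROM orders WHERE customer_id = ?")) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        } // close() does not destroy the connection; it returns it to the pool
    }
}
```

The try-with-resources block guarantees that the connection goes back into the pool even if the query throws, which is exactly what keeps the busy time per transaction short.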

All connection pools have a usage measure to determine whether enough connections are available, or in other words to see if a lack of connections has a negative effect. However, as a connection is occupied only for very short amounts of time, often fractions of a second, we would need to check the usage equally often to have a statistically significant chance of seeing the pool being maxed out under normal conditions.

Connection Pool usage if polled every 10 seconds

In reality this is not done, as checking the pool too often (say, several times a second) would create a lot of monitoring overhead. Most solutions check every couple of seconds, and as a result we only see pool usage reaching 100% if the pool is constantly maxed out. If we were to track the usage on a continuous basis, the result would look different:

Pool usage as seen if min/max and average are tracked continuously instead of polled

This means that by the time we see 100% pool usage with regular monitoring solutions we would already have a significant negative performance impact, or would we?
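The sampling effect can be sketched with a toy simulation (all numbers are made up purely for illustration): a pool of 10 connections is fully used in short ~50 ms bursts about once a second, yet a poll every 10 seconds never happens to land on one of those bursts:

```java
import java.util.Random;

public class PollingDemo {

    /** Simulates one minute of pool usage, millisecond by millisecond.
     *  Returns {maxSeenByPolling, maxSeenContinuously}. */
    static int[] simulate() {
        int poolSize = 10;
        Random rnd = new Random(42);
        int polledMax = 0, continuousMax = 0;
        for (int ms = 0; ms < 60_000; ms++) {
            // usage spikes to the maximum for ~50 ms once a second,
            // otherwise only a few connections are busy
            int busy = (ms % 1000 < 50) ? poolSize : rnd.nextInt(4);
            continuousMax = Math.max(continuousMax, busy);
            if (ms % 10_000 == 7_333) { // one poll every 10 seconds
                polledMax = Math.max(polledMax, busy);
            }
        }
        return new int[]{polledMax, continuousMax};
    }

    public static void main(String[] args) {
        int[] r = simulate();
        System.out.println("max usage seen by 10s polling:  " + r[0] + " of 10");
        System.out.println("max usage tracked continuously: " + r[1] + " of 10");
    }
}
```

The polled metric reports a mostly idle pool while the continuously tracked maximum hits 100% every second; this is the gap between the two charts above.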

What does 100% pool usage really mean?

Actually, it does not mean much. It means that all connections in the pool are in use, but not that any transactions are suffering performance problems because of it. In a continuous-load scenario we could easily tune our setup to run at 100% pool usage all the time without a single transaction suffering; it would be perfect.

However, many use cases do not have a steady, continuous load pattern, and there we would notice performance degradation long before the pool is permanently maxed out. Pool usage alone does not tell us anything; acquisition time does!

This shows the pool usage and the min/max acquisition time, which is non-zero even though the pool is never maxed out

Most application servers and connection pools have a wait or acquisition metric that is far more interesting than pool usage. Acquisition time represents the time a transaction has to wait for a connection from the pool, and it is therefore real, actionable information. If it increases, we know for a fact that we do not always have enough connections in the pool (or that the connection pool itself is badly written). This measure can show significant wait time long before the average pool usage is anywhere close to 100%.

But there is still a slight problem: the measure is an aggregated average across the whole pool, or more specifically across all transactions. While it allows us to understand whether there are enough connections overall, it does not enable us to identify which business transactions are impacted and by how much.

Measuring Acquisition Time properly

Acquisition time is simply the time it takes for the getConnection call to return. We can easily measure that inside our transaction, and if we do, we can account for it on a per-business-transaction basis and not just as an aggregate of the whole pool. This means we can determine exactly how much time each transaction type spends waiting. After all, I might not care about waiting 10 ms in a transaction that has an average response time of one second, but the same wait would be unacceptable in a transaction type with a 100 ms response time.

The getConnection call as measured in a single transaction. It is 10 ms although the pool average is 0.5 ms

We could even determine which transaction types are concurrently fighting over the limited connections and understand outliers, meaning the occasional case where a transaction waits a relatively long time for a connection, which would otherwise be hidden by the averaging effect.
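A minimal sketch of this idea: wrap the DataSource and time every getConnection call, attributed to a business-transaction name supplied by the caller. The wrapper class and its method names are made up for illustration; APM tools achieve the same via bytecode instrumentation instead of an explicit wrapper.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import javax.sql.DataSource;

public class TimedDataSource {
    private final DataSource delegate;
    private final ConcurrentHashMap<String, LongAdder> waitNanos = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, LongAdder> calls = new ConcurrentHashMap<>();

    public TimedDataSource(DataSource delegate) { this.delegate = delegate; }

    /** Acquires a connection and records how long the caller had to wait,
     *  attributed to the given business transaction. */
    public Connection getConnection(String businessTransaction) throws SQLException {
        long start = System.nanoTime();
        try {
            return delegate.getConnection();
        } finally {
            waitNanos.computeIfAbsent(businessTransaction, k -> new LongAdder())
                     .add(System.nanoTime() - start);
            calls.computeIfAbsent(businessTransaction, k -> new LongAdder())
                 .increment();
        }
    }

    /** Average acquisition time in milliseconds for one transaction type. */
    public double avgAcquisitionMillis(String businessTransaction) {
        LongAdder total = waitNanos.get(businessTransaction);
        LongAdder n = calls.get(businessTransaction);
        if (total == null || n == null || n.sum() == 0) return 0.0;
        return total.sum() / 1_000_000.0 / n.sum();
    }
}
```

With per-transaction keys like "checkout" or "search" it becomes obvious which transaction types pay for a too-small pool, instead of one pool-wide average hiding the outliers.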

Configuring the optimal Pool Size

Knowing upfront how big to make a pool is not always easy. In reality, most people simply set it to an arbitrary number that they assume is big enough. In some high-volume cases it might not be possible to avoid wait time at all times, but we can understand and optimize it.

There is a very easy and practical way to do this: simply monitor the connection acquisition time during peak load hours, ideally on a per-business-transaction basis as described above. Pay special attention to how much it contributes to the response time, and make sure you exclude from your analysis those transactions that do not wait at all; they would just skew your calculation.

If the average response-time contribution for your specific business transaction is very low (say, below 1%), then you can reasonably say that your connection pool is big enough. It is important to note that I am not talking about an absolute value in milliseconds but about contribution time! If that contribution is too high (e.g. 5% or more), you will want to increase your connection pool until you reach an acceptable value. The resulting pool usage might be very low on average or close to 100%; it does not really matter!
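As a rough arithmetic sketch of that rule (the class name is made up, and the 1% and 5% thresholds are this article's rules of thumb, not hard limits):

```java
public class PoolSizingCheck {

    /** Share of the response time spent waiting for a connection, in percent. */
    public static double contributionPercent(double avgAcquisitionMs, double avgResponseMs) {
        if (avgResponseMs <= 0) return 0.0;
        return avgAcquisitionMs / avgResponseMs * 100.0;
    }

    /** Verdict based on the thresholds suggested above. */
    public static String verdict(double avgAcquisitionMs, double avgResponseMs) {
        double pct = contributionPercent(avgAcquisitionMs, avgResponseMs);
        if (pct < 1.0) return "pool is big enough";
        if (pct >= 5.0) return "increase the pool size";
        return "borderline, keep watching";
    }
}
```

Note how the same absolute wait flips the verdict: a 10 ms wait contributes 1% to a 1-second transaction but 10% to a 100 ms one.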

The usefulness of a pool usage measure depends on how frequently we poll it. The more often we poll, the more overhead we add, and in the end it is still only a guess. Impact measures like acquisition time are far more useful and actionable. They allow us to tune the connection pool to the point where it adds no, or at least acceptable, overhead compared to response time. Like all impact measures, it is best not to use the overall average but to understand it in terms of its contribution to end-user response time.



  1. Kiran Kumar says:

    Why bother measuring acquisition time and pool sizes? Is there any benefit to using fewer connections?
    After detailed analysis as suggested, I come up with a figure: say 10 users can get acceptable performance with 10 connections. What am I losing if I set it up to use 50 connections?

    For example, when I increase the heap size from 1024 to 4096 I know I have to procure 3 GB of RAM. In the case of database connections, what is the additional cost (consequences) of 40 idle connections?

  2. In a simple use case with just 10 parallel users you are of course right, just set it up to 40 or 50. However:


    – bigger setups will not be able to just quadruple the number of available connections, as that would exhaust the resources of their database server or lead to unacceptable memory usage
    – each connection has its own statement pool and other memory-consuming items attached to it, so you would need much more memory for the same performance
    – as I tried to show in the blog, the pool usage can be way below the maximum whenever you look at the figure (or look only at the average) and you can still have a noticeable performance impact that you will have a hard time figuring out
    – not every connection pool is made up of JDBC connections, which are relatively cheap; think about thread pools, JMS connection pools, pools of mainframe socket connections…

    Lastly, what I try to advocate here is in case you have a performance issue, don’t trust the pool usage to make sure your problem is not in the connection pool area.

    PS, the cost of going from 1GB to 4GB is bigger GC pauses 😉

  3. I think keeping the pool size higher (within safe limits) can also act as a measure of resiliency. In real prod scenarios it is always possible that some system/interface cranks up, causing threads to be released later than usual. In such scenarios a safely high pool size may actually help the system survive until the firefighting is done.

  4. Kiran Kumar says:

    @Michael Kopp, thanks for the nice article and for the quick clarification. I have two challenges:
    1) To demonstrate that I have optimum pool size
    2) (Contrast to above) When performance issue is observed, to demonstrate that pool size is not the cause.

    Your article has solved both challenges.

  5. @Vivek Agreed, your argument is correct. It also requires that you have sufficient monitoring in place to
    a) identify what a safe high value is, and
    b) identify whether it was the root cause of a problem, the reason for the firefight, or had nothing to do with it.

    @Kumar Glad to be of service

  6. Yves Martin says:

    Because most Hibernate-based applications now use a large set of different queries, the prepared statement cache is difficult to size too. As most (all?) JDBC drivers manage a statement cache per connection, it is really easy to exhaust application server memory with (statement cache size * connection pool size) items.
    It is not rare for a prepared statement to weigh 5 MB in memory.

    As a result, the “optimal connection pool size” allows you to increase your prepared statement cache!

    What I really expect for the future is the ability for database servers – which hold the prepared statement’s execution plan, often thanks to a cursor – to share cached statements in the JDBC driver across all connections in the pool.

    In my opinion, that would be a great improvement.

  7. @Martin, thanks for bringing that up. I tried to hint at that, but you put it in much better terms.

    I agree with your desire that the statement pool should be shared. In fact, the statement pool is shared on the server side of things; the JDBC side suffers from bad design, as the PreparedStatement objects often carry a lot of state!

  8. I use Tivoli/PMI metrics on WebSphere to monitor and size connection pools effectively. For sudden surges in prod, always tune your pool sizes (JDBC/JMS/resource adapters/web container, IoC layer etc.) for 2x or 3x the anticipated load; it’s better to be safe than sorry. That said, oversizing max connections is generally a bad idea, since the JVM has to maintain those connections and resources are unnecessarily utilized.

  9. Hi,

    we receive the alerts below frequently but we didn’t see any issues in the logs. Do we need to check the BPPM or BMM monitoring tool for this? What setting would be optimal? Let me know if you need any further information.

    HALY0111 SOKBPPM : 20141207 14:48:08 usnyc AppServer BMM_WAS MAJOR High average wait time or an available connection from the pool: Host=hostname, Cell=CellName, Node=NodeName, Server=AppServer, JDBCConnectionPool=connectionPoolModule, JDBCProvider=DB2 Universal JDBC Driver Provider (XA), JDBCDataSource=jdbc/TuhtiDataSource, WaitTime.Mean=176967.274556


  10. Mark Ernst says:

    Normally, I wouldn’t comment, but I wouldn’t rely on anything here. There can be a benefit to lower connection pool sizes; in fact I recommend 0 for a minimum unless you have test queries set up and cleanup of stale/bad connections. As far as a single SQL statement taking up 5 MB, that must be some statement to take up ~5 million characters. Never seen anything like that, and I doubt anyone has. Most statements are a few thousand characters at most. So if you increase the statement cache size from, say, 10 to 100, it costs you about 3 MB per JVM/instance in total (based on 3 KB per statement). As far as caching at the DB goes, they do cache, but your app server still has to “prepare” the statement, thus the app-side statement cache. Wait time versus connections in use can be two very different things, caused by different reasons. Make sure you understand why you would do things each way and choose, or use them both.

  11. Hi! Could you recommend a Java pool monitor?

  12. Tony Basile says:

    Michael, I would appreciate your opinion on what Mark Ernst wrote above. Specifically, the part that reads:
    There can be a benefit to lower connection pool sizes, in fact I recommend 0 for a minimum unless you have test queries setup and cleanup of stale/bad connections. Even IBM recommends setting minimum connections to 0 to deal with StaleConnectionExceptions.

  13. As with everything, it depends; in this case on the implementation of the connection pool. Having the connection pool minimum at 0 will indeed avoid certain problems, like memory overhead and stale connections (which might also consume resources at the DB server).
    However, the downside is that requests that need to open a real connection will be slower. In a high-volume burst scenario that might be a problem; in many others it will pose no problem.
    If you can deal with the added memory and the connection pool supports it, I would go for a different option though. Many pools can define both a minimum and a max idle/max lifetime per connection. Basically this means that a connection that has not been used for a while will be closed, while a minimum number of connections is still kept around; in other words, the pool cycles connections regularly. This avoids memory leak and stale-connection problems while still giving good performance.

  14. Mark Ernst says:

    True, it will be slower to ramp up if it is cruising at 0 connections. We typically get a slow ramp up, and then it goes up and down with volume, peaks and then travels back down. We have most of our problems when connections to the DB are severed or dropped. Those connections are invalid or stale. Usually this happens during early morning change windows for other systems like DB’s. This won’t fit everyone, which is why I said “understand why you would do things…”

    In WebSphere, if you have the minimum set to anything higher than 0, it will not clean out all stale connections as you expect. It will only reap them down to your minimum, never cleaning those out completely until a failure, and a “reap entire pool” or “reap failing connections only” is performed. So sure you may have a minimum 3 or 10 or 50 connections set and available, some/none of which are valid, then get reaped, then get recreated. Max idle time (or unused timeout) will not fix this completely as it only reaps down to minimum. Your performance while waiting, and reaping, and recreating will be poor.

    And when I said don’t rely on anything here, I really meant in the comments field. Personal anecdotes, or “it works for me so it must be perfect”, do not make a best practice. 99 times out of 100 you might be able to speed across icy roads; it is that one time that kills you.

    This is from my own practical experience over many years. You can set it how you want, just understand why and the limitations.

    IBM also recommends “0” minimum in several different articles, here is a snippet from one: “Any value above 0 means that WebSphere Application Server is required to keep that many connections in the pool even after the connections have reached the Unused Timeout.” — That is in regards to a firewall closing the connection, but the same rules apply to everything.

    Good Luck!

  15. nitesh shukla says:

    I want to know why the DB connection pool usage goes high and shows a lot of spikes on particular servers.
