Tuesday, July 17, 2012

Connection Pooling

A cache of database connections maintained in the database's memory so that the connections can be reused when the database receives future requests for data.
Connection pools are used to enhance the performance of executing commands on a database. Opening and maintaining a database connection for each user, especially requests made to a dynamic database-driven Web application, is costly and wastes resources. In connection pooling, after a connection is created, it is placed in the pool and it is used over again so that a new connection does not have to be established. If all the connections are being used, a new connection is made and is added to the pool. Connection pooling also cuts down on the amount of time a user must wait to establish a connection to the database. 

Connection Pooling - what is it and why do we need it?

It's a technique to allow multiple clinets to make use of a cached set of shared and reusable connection objects providing access to a database. Connection Pooling feature is supported only on J2SDK 1.4 and later releases.

Opening/Closing database connections is an expensive process and hence connection pools improve the performance of execution of commands on a database for which we maintain connection objects in the pool. It facilitates reuse of the same connection object to serve a number of client requests. Every time a client request is received, the pool is searched for an available connection object and it's highly likely that it gets a free connection object. Otherwise, either the incoming requests are queued or a new connection object is created and added to the pool (depending upon how many connections are already there in the pool and how many the particular implementation and configuration can support). As soon as a request finishes using a connection object, the object is given back to the pool from where it's assigned to one of the queued requests (based on what scheduling algorithm the particular connection pool implementation follows for serving queued requests). Since most of the requests are served using existing connection objects only so the connection pooling approach brings down the average time required for the users to wait for establishing the connection to the database.

How is it used?

It's normally used in a web-based enterprise application where the application server handles the responsibilities of creating connection objects, adding them to the pool, assigning them to the incoming requests, taking the used connection objects back, returning them back to the pool, etc. When a dynamic web page of the web-based application explicitily creates a connection (using JDBC 2.0 pooling manager interfaces and calling getConnection() method on a PooledConnection object ... I'll discuss both the JDBC 1.0 and JDBC 2.0 approaches in a separate article) to the database and closes it after use then the application server internally gives a connection object from the pool itself on the execution of the statement which tries to create a connection (in this case it's called logical connection) and on execution of the statement which tries to close the connection, the application server simply returns the connection back to pool. Remember you can still use JDBC 1.0 / JDBC 2.0 APIs to obtain physical connections. This of course is used very rarely - probably in the cases where connection to that particular database is needed once in a while and maintaining a pool of connection is not really needed.

How many connections the Pool can handle? Who creates/releases them?

These days it's pretty configurable - the maximum connections, the minimum connections, the maximum number of idle connections, etc. these all parameters can be configured by the server administrator. On start up the server creates a fixed number (the configured minimum) of connection objects and adds them to the pool. Once all of these connection objects are exhausted by serving those many clinet requests then any extra request causes a new connection object to be created, added to the pool, and then to be assigned to server that extra request. This continues till the number of connection objects doesn't reach the configured maximum number of connection objects in the pool. The server keep on checking the number of idle connection objects as well and if it finds that there are more number of idle connection objects than the configured value of that parameter then the server simply closes the extra number of idle connections, which are subsequently garbage collected.

Traditional Connection Pooling vs Managed Connection Pooling

Connection Pooling is an open concept and its certainly not limited to the connection pooling we normally notice in the enterprise application i.e., the one managed by the Application Servers. Any application can use this concept and can manage it the way it wants. Connection Pooling simply means creating, managing, and maintaining connection objects in advance. A traditional application can do it manually, but as we can easily observe that as the scalability and reach of the application grows, it becomes more and more difficult to manage connections without having a defined and robust connection pooling mechanism. Otherwise it'll be extremely difficult to ensure the maintainability and availability of the connections and in turn the application.

No comments:

Post a Comment