Here is a list of the top 10 database performance issues and their most probable solutions.
Too many calls to the DB - There may be multiple trips to a single DB from various middleware components, and any of the following scenarios can occur:
1. More data is requested than necessary, primarily for faster rendering (but in the end slowing down overall performance)
2. Multiple applications requesting the same data
3. Multiple queries being executed that in the end return the same result
This kind of problem generally arises when there is too much object orientation. The key is to strike a balance between how many objects to create and what to put in each object. Object-oriented programming may be good for maintainability, but it can certainly degrade performance if objects are not handled correctly.
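One common fix for repeated identical trips to the DB is a small read-through cache in front of the query. The sketch below (table and names are hypothetical, and `functools.lru_cache` is just one way to do it; a real cache would need an invalidation strategy to avoid stale reads) shows three components asking for the same row while only one query reaches the database:

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

calls = {"count": 0}

@lru_cache(maxsize=128)
def get_user_name(user_id):
    # Each cache miss is one round trip to the database.
    calls["count"] += 1
    row = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0] if row else None

# Three components ask for the same row, but only one query is executed.
names = [get_user_name(1), get_user_name(1), get_user_name(1)]
print(names, calls["count"])  # ['alice', 'alice', 'alice'] 1
```

The trade-off is exactly the balance the text describes: caching cuts DB round trips, at the cost of having to decide when cached data is too old to serve.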
Too much synchronization – Most developers tend to over-synchronize: large pieces of code are locked when much smaller critical sections would do. This is generally fine under low load, but under high load the performance of the application will definitely take a beating. How do you determine whether the application has synchronization issues? The easiest way (though not 100% foolproof) is to chart CPU time against execution time.
CPU time – the time the executed code spends on the CPU
Execution time – the total time the method takes to execute, including CPU, I/O, time spent waiting to enter a synchronized block, etc.
Generally, the gap between the two times gives the waiting time. If the troublesome method makes neither an I/O call nor an external call, then a synchronization issue is most probably causing the slowness.
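The over-synchronization problem can be sketched with a pair of hypothetical workers: one holds the lock for the whole slow operation, the other does the slow work first and locks only for the brief shared-state update. Under concurrent load the first serializes every thread behind the lock:

```python
import threading
import time

def expensive_transform(item):
    time.sleep(0.01)  # stand-in for slow CPU or I/O work
    return item * 2

def over_synchronized(item, results, lock):
    # Anti-pattern: the slow work runs while holding the lock,
    # so all threads queue up behind it.
    with lock:
        results.append(expensive_transform(item))

def narrowly_synchronized(item, results, lock):
    # Better: slow work outside the lock; hold it only for the
    # short shared-state update.
    value = expensive_transform(item)
    with lock:
        results.append(value)

def run(worker, n=10):
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(i, results, lock))
               for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results

slow, _ = run(over_synchronized)     # ~n * 0.01s: fully serialized
fast, _ = run(narrowly_synchronized) # sleeps overlap across threads
print(f"over-synchronized: {slow:.3f}s, narrow: {fast:.3f}s")
```

The gap between the two timings is exactly the "waiting time" the CPU-time-vs-execution-time chart would reveal.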
Joining too many tables – The worst kind of SQL issues crop up when too many tables are joined and data has to be extracted from the result. Sometimes it is just unfortunate that so many tables must necessarily be joined to pull out the required data.
There are two ways to attack this problem
1) Is it possible to denormalize a few tables so that each holds more data and fewer joins are needed?
2) Is it possible to create a summary table with most of the information that will be updated periodically?
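The second option can be sketched with SQLite (table and column names are hypothetical): a scheduled job periodically rebuilds a small summary table from an aggregate query, and readers then hit the summary instead of re-joining and re-aggregating per request:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO orders (customer_id, amount) VALUES (1, 10.0), (1, 15.0), (2, 7.5);

    -- Summary table, refreshed periodically instead of computed per request.
    CREATE TABLE order_totals (customer_id INTEGER PRIMARY KEY, total REAL);
""")

def refresh_summary(conn):
    # The periodic rebuild: one aggregate query feeds the summary table.
    conn.execute("DELETE FROM order_totals")
    conn.execute("""
        INSERT INTO order_totals (customer_id, total)
        SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id
    """)

refresh_summary(conn)
# Readers now query the small summary table instead of aggregating.
row = conn.execute("SELECT total FROM order_totals WHERE customer_id = 1").fetchone()
print(row)  # (25.0,)
```

The cost is that the summary is only as fresh as its last rebuild, which is why the text says "updated periodically".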
Returning a large result set - Generally no user will go through thousands of records in a result set. Most users will limit themselves to the first few hundred (or the first 3-4 pages). By returning all the results, the developer is not only slowing the database but also choking the network. Breaking the result set into batches (on the database side) will generally solve this issue (though it is not always possible).
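Batching on the database side usually means pagination. A minimal sketch with SQLite (hypothetical `logs` table): the database returns one page at a time via `LIMIT`/`OFFSET` instead of shipping the whole result set:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, msg TEXT)")
conn.executemany("INSERT INTO logs (msg) VALUES (?)",
                 [(f"event {i}",) for i in range(1000)])

def fetch_page(conn, page, page_size=100):
    # Let the database return one batch instead of the whole result set.
    return conn.execute(
        "SELECT id, msg FROM logs ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()

first = fetch_page(conn, 0)
print(len(first), first[0])  # 100 (1, 'event 0')
```

One caveat: large `OFFSET` values still make the database skip over all preceding rows, so for deep paging a keyset approach (`WHERE id > last_seen_id`) tends to scale better.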
Joining tables in the middleware – SQL is a fantastic language for data manipulation and retrieval. There is simply no need to move data to a middle tier and join tables there. Joining data in the middle tier generally causes:
1. Unnecessary load on the network, which has to transport data back and forth
2. Increased memory requirements on the application server to handle the extra data
3. A drop in server performance, as the app tier is mostly tied up processing large result sets
The best way to approach this problem is to use inner and outer joins right in the database itself. That way, the full power of SQL and the database engine is used to process the query.
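The contrast can be sketched with SQLite (hypothetical `customers`/`orders` schema): the anti-pattern pulls both tables into application code and joins them there, while the fix is a single `INNER JOIN` executed inside the database, so only the joined result crosses the wire:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 2, 9.0), (12, 1, 3.0);
""")

# Anti-pattern: pull both tables over the wire and join in application code.
customers = dict(conn.execute("SELECT id, name FROM customers"))
orders = conn.execute("SELECT customer_id, amount FROM orders").fetchall()
in_app = [(customers[cid], amt) for cid, amt in orders]

# Better: one INNER JOIN executed inside the database.
in_db = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(in_db)  # [('alice', 5.0), ('bob', 9.0), ('alice', 3.0)]
```

Both paths produce the same rows, but the in-database join avoids transporting two full tables and building an in-memory lookup on the app server.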
Ad hoc queries – Just because SQL gives you the ability to create and use ad hoc queries, there is no point in abusing them. In quite a few cases, ad hoc queries create more mess than the advantage they bring. The best option is to use stored procedures. This is not always possible; sometimes ad hoc queries are necessary, and then there is no choice but to use them. But whenever possible, stored procedures are recommended. Their main advantages are:
1. Pre-compiled and ready to execute
2. Optimized by the DB
3. The stored procedure resides on the DB server, i.e. no network transmission of a large SQL request
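SQLite has no stored procedures, but a related part of the ad hoc problem can still be sketched with it: queries built by string concatenation give the DB a fresh statement to parse each time (and invite SQL injection), whereas a parameterized statement is a single reusable shape with values bound separately. This is an illustration, not a substitute for real stored procedures; table and names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 'alice')")

owner = "alice"

# Ad hoc: a new SQL string per call; the DB must parse and plan each
# variant, and concatenating user input invites SQL injection.
ad_hoc = conn.execute(
    "SELECT id FROM accounts WHERE owner = '" + owner + "'").fetchone()

# Parameterized: one reusable statement, values bound separately.
parameterized = conn.execute(
    "SELECT id FROM accounts WHERE owner = ?", (owner,)).fetchone()

print(ad_hoc == parameterized)  # True
```

A stored procedure takes this one step further: the whole statement lives on the server, so only the call and its parameters cross the network.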
Lack of indices – The data is not large, yet the DB seems to take an abnormally long time to retrieve the results. The most probable cause is a missing or misconfigured index. At first sight this might seem trivial, but as the data grows it plays a significant role: there can be a significant performance hit if the indices are not configured properly.
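The effect of a missing index is easy to see with SQLite's `EXPLAIN QUERY PLAN` (table and index names below are hypothetical; the exact plan wording varies by SQLite version): the same query goes from a full table scan to an index search once the index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO events (user_id, payload) VALUES (?, ?)",
                 [(i % 100, "x") for i in range(10_000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the plan description in column 3.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM events WHERE user_id = 42"
before = plan(query)
print(before)  # typically a full "SCAN" of events

conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
after = plan(query)
print(after)   # typically a "SEARCH ... USING INDEX idx_events_user"
```

The same technique (reading the plan, not guessing) is how a misconfigured index is usually caught.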
Fill factor – Another thing to consider along with indexing is the fill factor. MSDN describes fill factor as a percentage that indicates how much the Database Engine should fill each index page during index creation or rebuild. The fill-factor setting applies only when the index is created or rebuilt. Why is this so important? If the fill factor is too high, then when a new record is inserted into a full index page, the DB will more often than not split the page (page splitting). This is very resource intensive and causes fragmentation. On the other hand, a very low fill factor means that a lot of space is reserved for the index alone. The easiest way to choose is to look at the type of queries that come to the DB: if the workload is mostly SELECT queries, it is best to leave the default fill factor. On the other hand, if there are lots of INSERT, UPDATE and DELETE operations, a fill factor other than 0 or 100 can be good for performance, provided the new data is evenly distributed throughout the table.
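In SQL Server (the product the MSDN description refers to), the fill factor is set per index at creation or rebuild time. A hedged sketch, with a hypothetical index and table name:

```sql
-- Hypothetical SQL Server example: rebuild an index leaving 20% free
-- space on each leaf page, to absorb new rows without page splits.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80);
```

Because the setting only applies at creation/rebuild, its benefit decays as pages fill up again, which is why it is usually paired with periodic index maintenance.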
My query was fine last week but it is slow this week – We get to see a lot of this: the load test ran fine last week, but this week the search page is taking a long time. What is wrong with the database? The main issue could be that the execution plan (the way the query gets executed on the DB) has changed. The easiest way to check is to capture the current explain plan and the explain plan from the previous week, compare them, and look for the differences.
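Capturing plans as text makes that week-over-week comparison mechanical. A minimal sketch with SQLite (names hypothetical; here an index created between runs stands in for whatever schema or statistics drift changed the plan):

```python
import sqlite3

def capture_plan(conn, sql):
    # Save the plan as text so this week's plan can be diffed against last week's.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
query = "SELECT * FROM t WHERE v = 1"

baseline = capture_plan(conn, query)   # e.g. saved during last week's load test

conn.execute("CREATE INDEX idx_t_v ON t(v)")  # schema drift between the runs
current = capture_plan(conn, query)

if baseline != current:
    print("plan changed:", baseline, "->", current)
```

Storing the baseline plan alongside load-test results turns "the query got slow" from a mystery into a diff.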
High CPU and Memory Utilization on the DB – There is a high
CPU and high Memory utilization on the database server. There could be a
multitude of possible reasons for this.
1. See if there are full table scans happening (soln: create index and update stats)
2. See if there is too much context switching (soln: increase the memory)
3. Look for memory leaks (in terms of tables not being freed even after their usage is complete) (soln: recode!)
There can be many more reasons, but these are the most common ones.
Low CPU and memory utilization yet poor performance – This is another case (though not as frequent). The CPU and memory are optimally used, yet performance is still slow. This usually comes down to one of two reasons:
1. Bad network – the database server is waiting for a socket read or write
2. Bad disk management – the database server is waiting for a disk controller to become free
As always, these are only the most common database performance issues that might come up in a performance test. There are many more of them out there...