SQL Server - Performance Counters
SQLServer:Access Methods (Forwarded Records/sec): Rows with varchar columns can expand when a varchar value is updated with a longer string. When the updated row no longer fits in the existing page, the row migrates to a new page and access to the row must traverse a forwarding pointer. This only happens on heaps (tables without clustered indexes). Evaluate a clustered index for heap tables. In cases where a clustered index cannot be kept, drop the non-clustered indexes, build a clustered index to reorganize the pages and rows, drop the clustered index, then recreate the non-clustered indexes.
Preferred Value: < 10 per 100 Batch Requests/Sec
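To find heaps suffering from forwarded records, a minimal sketch using sys.dm_db_index_physical_stats (the DETAILED scan mode is needed to populate forwarded_record_count, and scanning a whole database this way is itself I/O-heavy, so run it off-hours):

-- Heaps in the current database that contain forwarded records.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.forwarded_record_count,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
WHERE ps.index_type_desc = 'HEAP'
  AND ps.forwarded_record_count > 0
ORDER BY ps.forwarded_record_count DESC;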
SQLServer:Access Methods (Full Scans/sec): This counter monitors the number of full scans on base tables or indexes. Values greater than 1 or 2 indicate that table or index page scans are occurring. If we also see high CPU, we need to investigate this counter; if the full scans are on small tables, we can ignore it. A few of the main causes of high Full Scans/sec are:
• Missing indexes
• Too many rows requested
Queries with missing indexes, or that request too many rows, will show a large number of logical reads and increased CPU time; the missing-index sketch below can help surface candidates.
Preferred Value: (Index Searches/sec)/(Full Scans/sec) > 1000
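A minimal sketch using the missing-index DMVs to surface candidate indexes behind high Full Scans/sec; treat the output as hints to validate, not finished index designs:

-- Top missing-index suggestions by estimated benefit.
SELECT TOP (10)
       d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_total_user_cost * s.avg_user_impact * s.user_seeks AS est_benefit
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
  ON g.index_group_handle = s.group_handle
ORDER BY est_benefit DESC;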
SQLServer:Access Methods (Index Searches/sec): Number of index searches. Index searches are used to start range scans and single index record fetches, and to reposition within an index. Index searches are preferable to index and table scans. For OLTP applications, optimize for more index searches and fewer scans (preferably 1 full scan for every 1000 index searches). Index and table scans are expensive I/O operations.
Preferred Value: (Index Searches/sec)/(Full Scans/sec) > 1000
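Both counters are cumulative since instance startup in sys.dm_os_performance_counters, so computing the per-second ratio takes two samples. A minimal sketch (the 10-second sample window is an arbitrary choice):

DECLARE @c TABLE (name sysname, v1 bigint, v2 bigint);

-- First sample of the cumulative counters.
INSERT @c (name, v1)
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name IN ('Index Searches/sec', 'Full Scans/sec');

WAITFOR DELAY '00:00:10';  -- sample window

-- Second sample.
UPDATE c SET v2 = p.cntr_value
FROM @c AS c
JOIN sys.dm_os_performance_counters AS p
  ON p.counter_name = c.name
WHERE p.object_name LIKE '%Access Methods%';

-- Deltas divided by the window length give the per-second rates.
SELECT name, (v2 - v1) / 10.0 AS per_second FROM @c;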
SQLServer:Access Methods (Page Splits/sec): Number of page splits per second that occur as the result of overflowing index pages. An interesting counter that can lead us back to table/index design. This value needs to be as low as possible. If you find that the number of page splits is high, consider increasing the fill factor of your indexes. An increased fill factor helps reduce page splits because there is more room in data pages before they fill up and a page split has to occur.
Note that this counter also includes new page allocations, so a high value does not necessarily indicate a problem. The other place to confirm page splits that actually move data or index rows is index fragmentation, since such splits leave the affected indexes fragmented.
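A minimal sketch of raising the fill factor; the table name, index name, and the 80 percent figure are hypothetical, so pick a value that matches your insert/update pattern:

-- Rebuild one index, leaving 20% free space on each leaf page.
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80);

-- Confirm fragmentation afterwards.
SELECT avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'LIMITED');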
SQLServer:Buffer Manager (Buffer cache hit ratio): This counter indicates how often SQL Server goes to the buffer, not the hard disk, to get data. The higher this ratio, the less often SQL Server has to go to the hard disk to fetch data, and overall performance is boosted. Unlike many of the other counters available for monitoring SQL Server, this counter averages the Buffer Cache Hit Ratio from the time the last instance of SQL Server was restarted. In other words, this counter is not a real-time measurement, but an average over all the days since SQL Server was last restarted. In OLTP applications, this ratio should exceed 90-95%. If it doesn't, then you need to add more RAM to your server to increase performance. In OLAP applications, the ratio could be much lower because of the nature of how OLAP works. In any case, more RAM should increase the performance of SQL Server OLAP activity.
Preferred Value: > 90%
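The ratio can also be read directly from the DMV; the raw counter has to be divided by its base counter. A minimal sketch:

-- Buffer cache hit ratio as a percentage.
SELECT 100.0 * a.cntr_value / b.cntr_value AS buffer_cache_hit_ratio
FROM sys.dm_os_performance_counters AS a
JOIN sys.dm_os_performance_counters AS b
  ON a.object_name = b.object_name
WHERE a.object_name LIKE '%Buffer Manager%'
  AND a.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base';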
SQLServer:Buffer Manager (Free list stalls/sec): Free list stalls/sec is the frequency with which requests for available database pages are suspended because no buffers are available. Free list stall rates of 3 or 4 per second indicate too little SQL memory available.
Preferred Value: < 2
SQLServer:Buffer Manager (Free pages): Total number of pages on all free lists.
Preferred Value: > 640
SQLServer:Buffer Manager (Lazy Writes/sec): This counter tracks how many times a second the Lazy Writer process moves dirty pages from the buffer to disk in order to free up buffer space. Generally speaking, this should not be a high value, say no more than 20 per second or so. Ideally, it should be close to zero. If it is zero, your SQL Server's buffer cache is plenty big and SQL Server doesn't have to free up dirty pages itself, instead waiting for this to occur during regular checkpoints. If this value is high, then a need for more memory is indicated.
Preferred Value: < 20
SQLServer:Buffer Manager (Page Life Expectancy): This performance monitor counter tells you, on average, how long data pages are staying in the buffer. If this value gets below 300 seconds, this is a potential indication that your SQL Server could use more memory in order to boost performance.
Preferred Value: > 300
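A minimal sketch for reading the current value; unlike the rate counters, this one is a point-in-time number and needs no sampling window:

SELECT cntr_value AS page_life_expectancy_seconds
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name = 'Page life expectancy';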
SQLServer:Buffer Manager (Page lookups/sec): Number of requests to find a page in the buffer pool. When the ratio of page lookups to batch requests is much greater than 100, it is an indication that while query plans are looking up data in the buffer pool, those plans are inefficient. Identify the queries with the highest logical I/O and tune them; a sketch follows below.
Preferred Value: (Page lookups/sec) / (Batch Requests/sec) < 100
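A minimal sketch listing the top cached statements by logical reads, using sys.dm_exec_query_stats:

-- Statements with the most logical I/O since they were cached.
SELECT TOP (10)
       qs.total_logical_reads,
       qs.execution_count,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset END
                   - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;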
SQLServer:Buffer Manager (Page reads/sec): Number of physical database page reads issued. 80-90 per second is normal; anything above that indicates an indexing or memory constraint.
Preferred Value: < 90
SQLServer:Buffer Manager (Page writes/sec): Number of physical database page writes issued. 80-90 per second is normal; anything more and we need to check the Lazy Writes/sec and Checkpoint Pages/sec counters. If those counters are also relatively high, then it's a memory constraint.
Preferred Value: < 90
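A minimal sketch pulling the related Buffer Manager counters in one pass; these are cumulative values, so sample twice for per-second rates as shown earlier:

SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Page reads/sec', 'Page writes/sec',
                       'Lazy writes/sec', 'Checkpoint pages/sec');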
SQLServer:General Statistics (Logins/sec): > 2 per second indicates that the application is not correctly using connection pooling.
Preferred Value: < 2
SQLServer:General Statistics (Logouts/sec): > 2 per second indicates that the application is not correctly using connection pooling.
Preferred Value: < 2
SQLServer:General Statistics (User Connections): The number of users currently connected to the SQL Server.
Note: It is recommended to review this counter along with Batch Requests/Sec. A surge in user connections may result in a surge of Batch Requests/Sec, so if there is a disparity (one going up while the other stays flat or goes down), that may be a cause for concern. With a blocking problem, for example, you might see user connections, lock waits, and lock wait time all increase while batch requests/sec decreases.
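To see who those connections belong to and spot pooling problems, a minimal sketch grouping current sessions by application; the names returned depend on your connection strings:

SELECT program_name, host_name, login_name, COUNT(*) AS sessions
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY program_name, host_name, login_name
ORDER BY sessions DESC;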
SQL Server:Latches (Latch Waits/sec): This is the number of latch requests that could not be granted immediately. In other words, it is the number of latches, in a one-second period, that had to wait.
Preferred Value: (Total Latch Wait Time) / (Latch Waits/Sec) < 10
SQL Server:Latches (Total Latch Wait Time (ms)): This is the total latch wait time (in milliseconds) for latch requests in the last second.
Preferred Value: (Total Latch Wait Time) / (Latch Waits/Sec) < 10
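The average wait per latch can be computed from the two counters. A minimal sketch; note that in sys.dm_os_performance_counters both values are cumulative since startup, so this yields a lifetime average rather than a last-second figure:

SELECT 1.0 * w.cntr_value / NULLIF(n.cntr_value, 0) AS avg_latch_wait_ms
FROM sys.dm_os_performance_counters AS w
JOIN sys.dm_os_performance_counters AS n
  ON w.object_name = n.object_name
WHERE w.object_name LIKE '%Latches%'
  AND w.counter_name = 'Total Latch Wait Time (ms)'
  AND n.counter_name = 'Latch Waits/sec';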
SQL Server:Locks (Lock Wait Time (ms)): Total wait time (in milliseconds) for locks in the last second.
Note: For Lock Wait Time it is recommended to look beyond the average value. Look for any peaks that are close to (or exceed) a wait of 60 seconds. Although this counter counts the total milliseconds SQL Server waited on locks during the last second, the counter actually records at the end of the locking event, so a peak most probably represents one large locking event. If those events exceed 60 seconds, there may be extended blocking, which could be an issue. In such cases, thoroughly analyze the blocking script output. Some applications are written to time out after 60 seconds, and that is not an acceptable response for those applications.
SQL Server:Locks (Lock Waits/sec): This counter reports how many times users waited to acquire a lock over the past second. Note that while a session is actually waiting on a lock, this is not reflected in the counter; it gets incremented only when the session "wakes up" after waiting on the lock. If this value is nonzero, it is an indication that there is at least some level of blocking occurring. If you combine this with the Lock Wait Time counter, you can get some idea of how long the blocking lasted. A zero value for this counter can definitively rule out blocking as a potential cause; a nonzero value will require looking at other information to determine whether it is significant. A sketch for spotting current blocking follows below.
Preferred Value: 0
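A minimal sketch listing currently blocked requests and who is blocking them:

-- Requests that are waiting on another session right now.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text AS blocked_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;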
SQL Server:Locks (Number of Deadlocks/sec): The number of lock requests that resulted in a deadlock.
Preferred Value: < 1
SQLServer:Memory Manager (Total Server Memory (KB)): Total Server Memory is the current amount of memory that SQL Server is using. If this counter is still growing, the server has not yet reached its steady state; it is still trying to populate the cache and get pages loaded into memory. Performance will likely be somewhat slower during this time, since more disk I/O is required at this stage. This behavior is normal. Eventually, Total Server Memory should approximate Target Server Memory.
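A minimal sketch comparing the two Memory Manager counters side by side:

SELECT counter_name, cntr_value AS kb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN ('Total Server Memory (KB)', 'Target Server Memory (KB)');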
SQLServer:SQL Statistics (Batch Requests/Sec): This counter measures the number of batch requests that SQL Server receives per second, and generally follows in step with how busy your server's CPUs are. Generally speaking, over 1000 batch requests per second indicates a very busy SQL Server, and could mean that if you are not already experiencing a CPU bottleneck, you may very well be soon. Of course, this is a relative number, and the bigger your hardware, the more batch requests per second SQL Server can handle. From a network bottleneck perspective, a typical 100 Mbps network card is only able to handle about 3000 batch requests per second. If you have a server that is this busy, you may need two or more network cards, or a 1 Gbps network card.
Note: Sometimes low batch requests/sec can be misleading. If there were a SQL statements/sec counter, it would be a more accurate measure of the amount of SQL Server activity. For example, an application may call only a few stored procedures, yet each stored procedure does a lot of work. In that case, we will see a low number for batch requests/sec, but each stored procedure (one batch) will execute many SQL statements that drive CPU and other resources. As a result, many counter thresholds based on the number of batch requests/sec will seem to identify issues, because the batch requests on such a server are unusually low for the level of activity on the server.
We cannot conclude that a SQL Server is not active simply by looking at batch requests/sec alone. Rather, you have to do more investigation before deciding there is no load on the server. If the average number of batch requests/sec is below 5, and other counters (such as SQL Server processor utilization) confirm the absence of significant activity, then there is not enough load to make any recommendations or identify issues regarding scalability.
SQLServer:SQL Statistics (SQL Compilations/sec): The number of times per second that SQL Server compilations occur. This value needs to be as low as possible. If you see a high value, such as over 100, it is an indication that a lot of ad hoc queries are running, which can drive CPU usage; the solution is to rewrite these ad hoc queries as stored procedures or to use sp_executesql.
Preferred Value: < 10% of the number of Batch Requests/Sec
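A minimal sketch of parameterizing an ad hoc query with sp_executesql so the compiled plan can be reused; the table, column, and parameter names here are hypothetical:

-- Parameterized form: one plan is compiled and reused across values.
EXEC sp_executesql
     N'SELECT * FROM dbo.Orders WHERE CustomerId = @CustomerId',
     N'@CustomerId int',
     @CustomerId = 42;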
SQLServer:SQL Statistics (SQL Re-Compilations/sec): This needs to be as close to zero as possible. A recompile can cause deadlocks, and it takes compile locks that are not compatible with any lock type.
Preferred Value: < 10% of the number of SQL Compilations/sec