29 December 2013

cpu utilization is 100 / High

We experience regular slowdowns on our Microsoft SQL database. After analyzing the memory usage we would like to continue the root cause investigation by examining CPU bottlenecks.

Here is my general recommendations to recognize CPU related bottlenecks in SQL Server. 
Recommendation:
There are many reasons for CPU related performance problems on SQL Server. The main reason for CPU bottlenecks is insufficient hardware resources. CPU utilization can usually be reduced by configuration changes and query tuning so think before you rush out to buy faster or more processors.

You can use Performance Monitor to check the load on your CPU. Look for the Processor:% Processor Time counter: if it regularly exceeds 80% of the processor time per CPU then most probably you're facing a CPU related bottleneck.

Type 'perfmon' run and click enter, now performance window will open you have to create counter log and monitor below counters.

These are many counters, I usually prefer below counters to find any performance related issue in my production box. Right click on "Counter log" and select "New Log Setting" it will ask for name give any name, and click ok, 

After clicking OK, now new pop up box will open, here 3 tabs you can see, in general tab click 'Add Counter', again one more pop up box will open like below.
Here you have to select 'performance object' first then highlight the 'counter name' and click 'add'. Do the same for following counters.

    Memory – Available MBytes
    Paging File – % Usage
    Physical Disk – Avg. Disk sec/Read
    Physical Disk – Avg. Disk sec/Write
    Physical Disk – Disk Reads/sec
    Physical Disk – Disk Writes/sec
    Processor – % Processor Time
    SQLServer: Buffer Manager – Buffer cache hit ratio
    SQLServer: Buffer Manager – Page life expectancy
    SQLServer: General Statistics – User Connections
    SQLServer: Memory Manager – Memory Grants Pending
    SQLServer: SQL Statistics – Batch Requests/sec
    SQLServer: SQL Statistics – Compilations/sec
    SQLServer: SQL Statistics – Recompilations/sec
    System – Processor Queue Length

After adding above counters click close, now you came back to previous window, now go 'Log File' tab, in this tab select 'log type' is "Text File (Tab Delimited)" and click 'configure' and select the file location, where you want to save the output.

And 3rd Tab is for Scheduling to do run this in a any particular time to capture the data, if suppose if any specific time you are facing CPU bottleneck or something else you can automate this.

After running one or two days or hours of these counters, and stop the counters, format the excel sheet for easy understanding like below, put the formulas for Average, median, min, max, Std Deviation. and you can easily figure it out what was the maximum value and it average value for each individual counters.


Below sheet are just recommendations and not exact recommendations because each scenario and each server configuration is different from other. You must collect a base-line of data to compare the above values.


Object
Counter
You Want
                      Description
Memory
Available Mbytes
> 100MB
Unused physical memory (not page file).
Paging File
% Usage
< 70%
Amount of Page File in use, which indicates the server is substituting disk space for memory.
Physical Disk
Avg. Disk sec/Read
< 8ms
A key measure of disk latency representing the average time, in milliseconds, of each read to disk where > 20 is poor, <20 is good/fair, <12 is better, <8 is best
Physical Disk
Avg. Disk sec/Write
< 8ms (non cached)
< 1ms (cached)
A key measure of disk latency representing the average time, in milliseconds, of each write to disk, where non-cached writes ( > 20 poor, <20 fair, <12 better, <8 best) differ significantly from cached writes (> 4 poor, <4 fair, <2 better, <1 best ). For OLTP databases, the lower this number the better, especially for disks holding the transaction log.
Physical Disk
Disk Reads/sec


Physical Disk
Disk Writes/sec


Processor
% Processor Time
< 80%
Percentage of elapsed time the processor spends executing non-idle threads.
SQLServer: Buffer Manager
Buffer cache hit ratio
>90%
The percentage of pages that were found in the memory. Higher the value the better. Preferred around the 90% mark. This included the availability of the procedure and the data cache in the system.
SQLServer: Buffer Manager
Page life expectancy
> 300
Tells, on average, how many seconds SQL Server expects a data page to stay in cache. The target on an OLTP system should be at least 300 (5 min). When under 300, this may indicate poor index design (leading to increased disk I/O and less effective use of memory) or, simply, a potential shortage of memory.
SQLServer: General Statistics
User Connections
See Description
The number of users currently connected to the SQL Server. This counter should roughly track with “Batch Requests/Sec”. They should generally rise and fall together. For example, blocking problems could be revealed by rising user connections, lock waits and lock wait time coupled with declining batch requests/sec.
SQLServer: Memory Manager
Memory Grants Pending
See Description
Total number of processes waiting for a workspace memory grant.
SQLServer: SQL Statistics
Compilations/sec
< 10% of the number of Batch Requests/Sec
Number of times that Transact-SQL compilations occurred, per second (including recompiles). The lower this value is the better. High values often indicate excessive adhoc querying and should be as low as possible. If excessive adhoc querying is happening, try rewriting the queries as procedures or invoke the queries using sp_executeSQL. When rewriting isn’t possible, consider using a plan guide or setting the database to parameterization forced mode.
SQLServer: SQL Statistics
Recompilations/sec
< 10% of the number of SQL Compilations/sec
Number of times, per second, that Transact-SQL objects attempted to be executed but had to be recompiled before completion. This number should be at or near zero, since recompiles can cause deadlocks and exclusive compile locks. This counter’s value should follow in proportion to “Batch Requests/sec” and “SQL Compilations/sec”. This needs to be nil in your system as much as possible.
System
Processor Queue Length
< 4 per CPU
Number of threads waiting for CPU cycles, where < 12 per CPU is good/fair, < 8 is better, < 4 is best.

No comments:

Post a Comment