SQL Experts: cpu utilization is 100 / High

We experience regular slowdowns on our Microsoft SQL database. After analyzing the memory usage we would like to continue the root cause investigation by examining CPU bottlenecks.

Here is my general recommendations to recognize CPU related bottlenecks in SQL Server.

Recommendation:

There are many reasons for CPU related performance problems on SQL Server. The main reason for CPU bottlenecks is insufficient hardware resources. CPU utilization can usually be reduced by configuration changes and query tuning so think before you rush out to buy faster or more processors.

You can use Performance Monitor to check the load on your CPU. Look for the Processor:% Processor Time counter: if it regularly exceeds 80% of the processor time per CPU then most probably you're facing a CPU related bottleneck.

Type 'perfmon' run and click enter, now performance window will open you have to create counter log and monitor below counters.

These are many counters, I usually prefer below counters to find any performance related issue in my production box. Right click on "Counter log" and select "New Log Setting" it will ask for name give any name, and click ok,

After clicking OK, now new pop up box will open, here 3 tabs you can see, in general tab click 'Add Counter', again one more pop up box will open like below.

Here you have to select 'performance object' first then highlight the 'counter name' and click 'add'. Do the same for following counters.

Memory – Available MBytes

Paging File – % Usage

Physical Disk – Avg. Disk sec/Read

Physical Disk – Avg. Disk sec/Write

Physical Disk – Disk Reads/sec

Physical Disk – Disk Writes/sec

Processor – % Processor Time

SQLServer: Buffer Manager – Buffer cache hit ratio

SQLServer: Buffer Manager – Page life expectancy

SQLServer: General Statistics – User Connections

SQLServer: Memory Manager – Memory Grants Pending

SQLServer: SQL Statistics – Batch Requests/sec

SQLServer: SQL Statistics – Compilations/sec

SQLServer: SQL Statistics – Recompilations/sec

System – Processor Queue Length

After adding above counters click close, now you came back to previous window, now go 'Log File' tab, in this tab select 'log type' is "Text File (Tab Delimited)" and click 'configure' and select the file location, where you want to save the output.

And 3rd Tab is for Scheduling to do run this in a any particular time to capture the data, if suppose if any specific time you are facing CPU bottleneck or something else you can automate this.

After running one or two days or hours of these counters, and stop the counters, format the excel sheet for easy understanding like below, put the formulas for Average, median, min, max, Std Deviation. and you can easily figure it out what was the maximum value and it average value for each individual counters.

Below sheet are just recommendations and not exact recommendations because each scenario and each server configuration is different from other. You must collect a base-line of data to compare the above values.

Object	Counter	You Want	Description
Memory	Available Mbytes	> 100MB	Unused physical memory (not page file).
Paging File	% Usage	< 70%	Amount of Page File in use, which indicates the server is substituting disk space for memory.
Physical Disk	Avg. Disk sec/Read	< 8ms	A key measure of disk latency representing the average time, in milliseconds, of each read to disk where > 20 is poor, <20 is good/fair, <12 is better, <8 is best
Physical Disk	Avg. Disk sec/Write	< 8ms (non cached) < 1ms (cached)	A key measure of disk latency representing the average time, in milliseconds, of each write to disk, where non-cached writes ( > 20 poor, <20 fair, <12 better, <8 best) differ significantly from cached writes (> 4 poor, <4 fair, <2 better, <1 best ). For OLTP databases, the lower this number the better, especially for disks holding the transaction log.
Physical Disk	Disk Reads/sec
Physical Disk	Disk Writes/sec
Processor	% Processor Time	< 80%	Percentage of elapsed time the processor spends executing non-idle threads.
SQLServer: Buffer Manager	Buffer cache hit ratio	>90%	The percentage of pages that were found in the memory. Higher the value the better. Preferred around the 90% mark. This included the availability of the procedure and the data cache in the system.
SQLServer: Buffer Manager	Page life expectancy	> 300	Tells, on average, how many seconds SQL Server expects a data page to stay in cache. The target on an OLTP system should be at least 300 (5 min). When under 300, this may indicate poor index design (leading to increased disk I/O and less effective use of memory) or, simply, a potential shortage of memory.
SQLServer: General Statistics	User Connections	See Description	The number of users currently connected to the SQL Server. This counter should roughly track with “Batch Requests/Sec”. They should generally rise and fall together. For example, blocking problems could be revealed by rising user connections, lock waits and lock wait time coupled with declining batch requests/sec.
SQLServer: Memory Manager	Memory Grants Pending	See Description	Total number of processes waiting for a workspace memory grant.
SQLServer: SQL Statistics	Compilations/sec	< 10% of the number of Batch Requests/Sec	Number of times that Transact-SQL compilations occurred, per second (including recompiles). The lower this value is the better. High values often indicate excessive adhoc querying and should be as low as possible. If excessive adhoc querying is happening, try rewriting the queries as procedures or invoke the queries using sp_executeSQL. When rewriting isn’t possible, consider using a plan guide or setting the database to parameterization forced mode.
SQLServer: SQL Statistics	Recompilations/sec	< 10% of the number of SQL Compilations/sec	Number of times, per second, that Transact-SQL objects attempted to be executed but had to be recompiled before completion. This number should be at or near zero, since recompiles can cause deadlocks and exclusive compile locks. This counter’s value should follow in proportion to “Batch Requests/sec” and “SQL Compilations/sec”. This needs to be nil in your system as much as possible.
System	Processor Queue Length	< 4 per CPU	Number of threads waiting for CPU cycles, where < 12 per CPU is good/fair, < 8 is better, < 4 is best.

SQL Experts

29 December 2013

cpu utilization is 100 / High

No comments:

Post a Comment