Sql Server Faq

May 29, 2010

HIGH CPU on Sql Server running Biztalk instances

 

Recently we encountered a case where in a sql server which was running and hosting Biztalk instances was showing 100% CPU.

 

We started troubleshooting from the SQL end to identify who is major consumer of CPU and following were our observations and findings

 

·         We used the perfmon counter Process  %Processor time and confirmed that sqlservr.exe was the major consumer of the CPU.

·         Since the recommendation for Biztalk server is to have MAXDOP 1 we ruled out parallelism as a cause of HIGH CPU.

·         We checked the perfmon counter Sql Server: SQL Statistics , Batch Requests/sec , Sql Compilations/sec and SQL Recompilations/sec but we could not see any major compilations or recompilations/sec so even that was ruled as a cause of the High CPU

·         Finally we started looking at the queries which are major consumers of CPU. We used the following queries and queried the DMvs sys.dm_exec_requests and sys.dm_exec_query_stats to find the queries consuming high CPU

 

SELECT st.text, r.cpu_time  FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
ORDER BY cpu_time DESC

SELECT TOP 5 total_worker_time/execution_count AS [Avg CPU Time],

    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,

        ((CASE qs.statement_end_offset

          WHEN -1 THEN DATALENGTH(st.text)

         ELSE qs.statement_end_offset

         END - qs.statement_start_offset)/2) + 1) AS statement_text

FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY total_worker_time/execution_count DESC;

 

 

·         We found that following two stored procedures were one of the major consumers of CPU resources while all the other queries on the servers were getting blocked and waiting on SOS_SCHEDULER_YIELD

 

MSGBOXPERFCOUNTERS_GETSPOOLSIZE                            11.71%

[DBO].[BTS_DEQUEUEMESSAGES_BIZTALKSERVERISOLATEDHOST]     10.59%

 

 

·         We consulted our Biztalk team and found that the first stored proc is used for performance data collection which is fired every 5 secs while second stored procedure is used to dequeue messages coming from Biztalk

 

·         We further drilled the stored procedure [DBO].[BTS_DEQUEUEMESSAGES_BIZTALKSERVERISOLATEDHOST]  and found it was using a cursor to query table

 [BizTalkServerIsolatedHostQ] which was causing blocking on the server.

 

DECLARE btscurse CURSOR FAST_FORWARD FOR  

SELECT TOP 20 w.uidWorkID, w.uidInstanceID, w.uidMessageID, w.snPartRetrieval,

w.fOrderedDelivery, w.snPriority, w.dtLastTouched, w.nID, i.uidProcessID,

i.uidActivationID, i.nState   FROM [BizTalkServerIsolatedHostQ] AS w

WITH (UPDLOCK ROWLOCK READPAST)   INNER LOOP JOIN Instances AS i

WITH (UPDLOCK ROWLOCK READPAST) ON ( w.uidInstanceID = i.uidInstanceID AND uidAppOwnerID = N‘85459549-B70C-4525-B1E6-72D3C0125E52′) AND ( i.uidProcessID IS NULL OR i.uidProcessID = @uidProcessID ) 

 

 

·         We queried the table BizTalkServerIsolatedHostQ and found that there were 581683 messages which were undelivered which were unusual. SO it indicates some of the Orphaned records might be cached in the table which are zombie

 

·         On further research we found that we might hitting a bug mentioned in the KB articles KB 944426 & 936536 & 941690.

 

 

In order to resolve the issue  we used the following Action Plan

 

·         In order to minimize the SQL CPU usage due to the stored procedure MSGBOXPERFCOUNTERS_GETSPOOLSIZE , we set the following registry key to 30sec   

 

HKLM\Software\Microsoft\BizTalk Server\3.0\MsgBox Performance\ CacheRefreshInterval to 30 seconds

 

IMPORTANT : This resolution contains information about modifying the registry. Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows Registry

 

·         We terminated the orphan cache and RFR and applied the Hotfix  KB 944426 & 936536 & 941690 to avoid the future occurrence of the issue.

 

 

 

After the above changes the CPU on the sql server was resumed to normal.

Hope this help!!  :)

 

Reference

=========

 

http://msdn.microsoft.com/en-us/library/cc296892(BTS.10).aspx

http://msdn.microsoft.com/en-us/library/cc296811(v=BTS.10).aspx

 

 

Parikshit Savjani
Premier Field Engineer, Microsoft

 

Share/Save/Bookmark

October 13, 2008

Sql server performing slow after changing the Cost Threshold for parallelism

We have observed a few cases now where sql server has started performing slow after changing the Cost Threshold Of Parallelism server option of the sql server.

Ironically in such cases we observe that sql server is performing slow even when there is no high CPU utilization and no high I/O operations on the server. Increasing the Cost Threshold Of Parallelism decreasing your CPU utilization as it reduces the Parallelism in the execution plan of some of the queries which are fired on the database and in turn slows their response time and hence the performance of sql server deprecates.

As defined in the BOL,

Cost threshold for parallelism option is used to specify the threshold at which Microsoft SQL Server creates and runs parallel plans for queries. SQL Server creates and runs a parallel plan for a query only when the estimated cost to run a serial plan for the same query is higher than the value set in cost threshold for parallelism. The cost refers to an estimated elapsed time in seconds required to run the serial plan on a specific hardware configuration. Only set cost threshold for parallelism on symmetric multiprocessors.
The Default Value of the Cost Threshold Of Parallelism is 5 which means when Optimizer creates and estimates the cost of the query if the cost exceeds 5, optimizer considers this query as a candidate for the parallel execution plan and hence it selects the Parallel plan for this query. However the Degree of Parallelism depends on the Number of Processors in the System and the MAX DOP setting of the sql server. Based on these values optimizer selects a parallel execution plan which consumes higher CPU but with much better response time of the queries and hence the performance of sql server increases.

Cost Threshold of parallelism can also be used to reduce the parallelism in the sql server if the parallel queries consume excessive CPU so we can increase the Cost Degree of Parallelism and change the behavior of sql server to use serial execution plans.

Usually it is always a good option to go with default value for the Cost Threshold Of Parallelism and avoid manipulating the behavior of optimizer unless it is tested on your environment with the same workload as production.

Share/Save/Bookmark

Powered by WordPress