Wednesday, October 19, 2011

Using sys.dm_exec_requests to find a blocking chain

A query that calls sys.dm_exec_requests and pulls the statement text out of the request is a handy script for any DBA's toolbox.  A simple version of the query performs a CROSS APPLY to sys.dm_exec_sql_text and extracts the request's T-SQL statement-in-progress using the Unicode character offsets.

This is simple enough and, to an extent, does a good job as a queryable replacement for sp_who2 and DBCC INPUTBUFFER.
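For reference, a minimal sketch of that "simple version" might look like the following. The column choices here are illustrative; the offset arithmetic is the standard pattern for carving the in-flight statement out of the batch text.

```sql
-- Sketch: current requests plus their statement-in-progress,
-- extracted from the batch text via the Unicode character offsets.
-- Offsets are in bytes of nvarchar text, hence the division by 2.
SELECT r.session_id,
       r.status,
       r.blocking_session_id,
       r.wait_type,
       statement_text = SUBSTRING(est.[text],
             r.statement_start_offset/2 + 1,
             CASE WHEN r.statement_end_offset = -1  -- -1 means "to end of batch"
                  THEN LEN(CONVERT(nvarchar(max), est.[text]))
                  ELSE (r.statement_end_offset - r.statement_start_offset)/2 + 1
             END)
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) est
WHERE r.session_id <> @@SPID;  -- ignore this query itself
```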

In an environment that suffers from frequent spells of query blocking, identifying which session is leading the way is the first step in finding the query or queries to troubleshoot.

Whether the end result of this query is a simple KILL command, and/or a walk down the hall to investigate the "business case", is a situational decision.

I wanted a blocking chain - a list of all the other spids that a session is blocking.

Here's a query that dumps the output of sys.dm_exec_requests into a temporary table, then uses FOR XML PATH to conveniently transform the list of blocked sessions into a comma-delimited list in the row of the blocker.  The sort order is the length of this comma-delimited list, putting the cause of a pileup right at the top.

Doing this in a temp table was necessary because blocking_session_id is a highly volatile column.  As a result, the output of this query is several milliseconds behind "live".  After a few executions, the spid hanging out at the top is easy enough to spot.

Here's an example of the nightmare kind of scenario that prompted this script into existence:




 declare @showallspids bit 
 select @showallspids =1

 create table #ExecRequests  (
  id int IDENTITY(1,1) PRIMARY KEY
 , session_id smallint not null 
 , request_id int null
 , request_start_time datetime null
 , login_time datetime not null
 , login_name nvarchar(256) null
 , client_interface_name nvarchar(64)
 , session_status nvarchar(60) null
 , request_status nvarchar(60) null
 , command nvarchar(32) null
 , sql_handle varbinary(64) null
 , statement_start_offset int null
 , statement_end_offset int null
 , plan_handle varbinary (64) null
 , database_id smallint null
 , user_id int null
 , blocking_session_id smallint null
 , wait_type nvarchar (120) null
 , wait_time_s int null
 , wait_resource nvarchar(120) null
 , last_wait_type nvarchar(120) null
 , cpu_time_s int null
 , tot_time_s int null
 , reads bigint null
 , writes bigint null
 , logical_reads bigint null
 , [host_name] nvarchar(256) null
 , [program_name] nvarchar(256) null
 , blocking_these varchar(1000) NULL
 , percent_complete int null
 , session_transaction_isolation_level varchar(20) null
 , request_transaction_isolation_level varchar(20) null
 )

 insert into #ExecRequests (session_id,request_id, request_start_time, login_time, login_name, client_interface_name, session_status, request_status, command,sql_handle,statement_start_offset,statement_end_offset,plan_handle,database_id,user_id,blocking_session_id,wait_type,last_wait_type,wait_time_s,wait_resource,cpu_time_s,tot_time_s,reads,writes,logical_reads,[host_name], [program_name] , session_transaction_isolation_level , request_transaction_isolation_level )
       select s.session_id,request_id, r.start_time, s.login_time, s.login_name, s.client_interface_name, s.status, r.status,command,sql_handle,statement_start_offset,statement_end_offset,plan_handle,r.database_id,user_id,blocking_session_id,wait_type,r.last_wait_type, r.wait_time/1000.,r.wait_resource ,r.cpu_time/1000.,r.total_elapsed_time/1000.,r.reads,r.writes,r.logical_reads,s.[host_name], s.[program_name], s.transaction_isolation_level, r.transaction_isolation_level
 from sys.dm_exec_sessions s 
 left outer join sys.dm_exec_requests r on r.session_id = s.session_id
 where 1=1
 and s.session_id >= 50 --retrieve only user spids
 and s.session_id <> @@SPID --ignore myself
 and  (@showallspids = 1 or r.session_id is not null) 

 update #ExecRequests 
 set blocking_these = LEFT((select isnull(convert(varchar(5), er.session_id),'') + ', ' 
       from #ExecRequests er
       where er.blocking_session_id = isnull(#ExecRequests.session_id ,0)
       and er.blocking_session_id <> 0
       FOR XML PATH('') 
       ),1000)
 select * from 
 (
  select  
   timestamp = getdate()
  , r.session_id , r.host_name , r.program_name
  , r.session_status
  , r.request_status
  , r.blocking_these
  , blocked_by =  r.blocking_session_id
  , r.wait_type 
  , r.wait_resource
  , r.last_wait_type
  , DBName = db_name(r.database_id)
  , r.command
  , login_time
  , login_name
  , client_interface_name
  , request_start_time
  , r.tot_time_s, r.wait_time_s, r.cpu_time_s, r.reads, r.writes, r.logical_reads
  --, [fulltext] = est.[text]
  , offsettext = CASE WHEN r.statement_start_offset = 0 and r.statement_end_offset= 0 THEN NULL
         ELSE SUBSTRING (  est.[text]
             , r.statement_start_offset/2 + 1, 
              CASE WHEN r.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), est.[text])) 
               ELSE r.statement_end_offset/2 - r.statement_start_offset/2 + 1
              END )
       END
  , r.statement_start_offset, r.statement_end_offset
  , cacheobjtype = LEFT (p.cacheobjtype + ' (' + p.objtype + ')', 35)
  , QueryPlan  = qp.query_plan 
  , request_transaction_isolation_level = case request_transaction_isolation_level 
    when 0 then 'Unspecified'
    when 1 then 'ReadUncommitted'
    when 2 then 'ReadCommitted'
    when 3 then 'Repeatable'
    when 4 then 'Serializable'
    when 5 then 'Snapshot' end 
  , session_transaction_isolation_level = case session_transaction_isolation_level 
    when 0 then 'Unspecified'
    when 1 then 'ReadUncommitted'
    when 2 then 'ReadCommitted'
    when 3 then 'Repeatable'
    when 4 then 'Serializable'
    when 5 then 'Snapshot' end 
  , p.plan_handle
  , stat.execution_count, stat.total_worker_time, stat.last_worker_time, stat.total_elapsed_time, stat.last_elapsed_time, stat.total_physical_reads, stat.total_logical_writes, stat.total_logical_reads
  --next two lines are SQL 2012 only!
  --, stat.total_rows, stat.last_rows
  from #ExecRequests r
  LEFT OUTER JOIN sys.dm_exec_cached_plans p ON p.plan_handle = r.plan_handle 
  OUTER APPLY sys.dm_exec_query_plan (r.plan_handle) qp
  OUTER APPLY sys.dm_exec_sql_text (r.sql_handle) est
  LEFT OUTER JOIN sys.dm_exec_query_stats stat on stat.plan_handle = r.plan_handle
  and r.statement_start_offset = stat.statement_start_offset  
  and r.statement_end_offset = stat.statement_end_offset
  
 
 ) a
 order by LEN(blocking_these) desc, blocking_these desc, blocked_by desc, session_id asc
 go
 drop table #ExecRequests  

Updated 20140813: latest tweaks and changes
Updated 20140929: latest tweaks and changes

Saturday, October 15, 2011

Slides and Sample code: SQL Server Admin Best Practices with DMV's from the Houston TechFest 2011

Here's my official blog page for my presentation on SQL Server Admin Best Practices with DMV's from the Houston TechFest 2011 on October 15.  Thanks for attending my session as part of a great event put on by Michael and the entire Houston TechFest team.  The University of Houston was a great host for the event!

Thanks also to all the sponsors and attendees.  In the speaker room I heard some discussion of HTF's decision to have back-to-back keynotes in the morning, not starting the actual sessions until 11AM and then breaking for lunch at noon.  I have to agree with what I heard - the vendors paid the money to put this free event on, why not give them the freshest audience?  As a veteran of three SQL Saturdays in Baton Rouge, I have to admit it was an uncommon but inventive scheduling decision.  If it increases vendor satisfaction and retention for next year's event, nobody should mind, especially the attendees.

Download slides and sample code here.

You can give me a rating on SpeakerRate here.

Monday, October 10, 2011

Resource Governor clamps down on Greedy Reports

One of the things I run into every now and then is this provincial idea that reports can and should be read straight out of the transactional database.  From an architectural standpoint this is of course flawed at best, and at worst disastrous to the performance of both the read/write transactional system and the read-intensive reporting system.

At scale, architectural decisions to place heavy-duty reports - even reports that run nightly, monthly, or annually - on the transactional system can be paralyzing to an enterprise.

Using various tools like replication, ETL via SSIS, ETL via SSIS with CDC, database mirroring, or snapshots to provide an off-site, contention-free reporting environment is an elementary architectural decision.  Simply put - don't mix your read/writes with your heavy reads.

But what if you're stuck with a host of hard-hitting reports running against your application database?

Long-term, look into providing the business a viable reporting alternative - a data warehouse with slowly changing dimensions for historical trending, Analysis Services cubes, self-service BI via PowerPivot...

Short-term, take some aggressive steps like using the SQL Server Resource Governor... within reason.  Introduced with SQL Server 2008, this tool can let you wield tyrannical power over connections.  Instead of using this as a punitive hammer on your offending reports, or as a wide-sweeping penalizing stroke, use the Resource Governor as an insulator for your key apps.  Take the approach of protecting the application's read/writes from the reports' massive reads, and the resulting resource stranglehold.

Take this configuration for example.  This is a step-by-step on how to set up Resource Governor.  Note a large number of commented-out example configuration options.



use master
go
--http://msdn.microsoft.com/en-us/library/bb933944(v=SQL.100).aspx
CREATE WORKLOAD GROUP GovGroupReports;
go
CREATE FUNCTION dbo.fnResourceGovernorClassifier() RETURNS sysname 
WITH SCHEMABINDING
AS
BEGIN
      -- Note that any request that does not get classified goes into the 'default' group.
      -- This function is run in the context of the new connection.
      --http://msdn.microsoft.com/en-us/library/bb933865.aspx
     
      DECLARE @grp_name sysname
     
      IF (
                  SUSER_SNAME() IN ('ReportUser')
            AND IS_MEMBER ('db_datareader') = 1
            --AND HOST_NAME() IN ('ReportingServer')
            --AND SUSER_SNAME() IN ('ReportUser')
            --AND IS_SRVROLEMEMBER('bulkadmin') = 1
            --AND IS_MEMBER ('DOMAIN\READONLYGROUP') = 1
            --AND APP_NAME() LIKE 'Report Server%'
      )
      BEGIN
            SET @grp_name = 'GovGroupReports';
      END

      RETURN @grp_name

END;
GO
-- Register the classifier function with Resource Governor
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION= dbo.fnResourceGovernorClassifier);
--ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION= null);
GO
CREATE RESOURCE POOL GovPoolMaxMemory
WITH (--http://msdn.microsoft.com/en-us/library/bb934024.aspx
            MAX_MEMORY_PERCENT = 75
      --,   MIN_CPU_PERCENT = 0
      --,   MAX_CPU_PERCENT = 75
      --,   MIN_MEMORY_PERCENT = 0
);
GO
ALTER WORKLOAD GROUP GovGroupReports
WITH (--http://msdn.microsoft.com/en-us/library/bb895361.aspx
            MAX_DOP = 1
      --,   IMPORTANCE = { LOW | MEDIUM | HIGH }
      --,   REQUEST_MAX_MEMORY_GRANT_PERCENT = 25
      --,   REQUEST_MAX_CPU_TIME_SEC = 600
      --,   REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 600
      --,   GROUP_MAX_REQUESTS = 50

)
USING GovPoolMaxMemory
GO

-- Start Resource Governor
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO

--Just in case
--ALTER RESOURCE GOVERNOR DISABLE

What we've accomplished with this Resource Governor implementation is to cap connections we've identified by the ReportUser login.  We could identify these connections in a number of different ways.

Remember that connections we don't catch will always fall through the function into the default bucket, and will be ignored by the Resource Governor.  Only groups we name will be directly affected.

The Resource Pool GovPoolMaxMemory we've created is pretty self explanatory - to simplify, it will cap usage of all connections in the pool to 75% of the SQL Server's available memory.  This is probably still very generous - we may want to make sure that other connections (the transactional application) will have more than 25% of all available memory.  This strategy does prevent reports from hogging ALL of the memory, and remember that is our strategy - to protect, not to punish.  

Similarly, the workload group we've created will restrict queries to one degree of parallelism.  That is a topic for another blog post; I recommend reading up on it.  In many cases, this can reduce CXPACKET wait types, further protecting the server from being crushed under the load of the reports.
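One quick way to gauge whether parallelism waits are part of the problem is to check the aggregate wait statistics before and after the change.  This is just a sketch - the numbers are cumulative since the last instance restart (or since the stats were cleared), so compare deltas over time rather than raw totals.

```sql
-- Sketch: how much of the instance's wait time is parallelism exchange waits?
-- Cumulative since last restart or DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR).
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET';
```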

After the governor is configured, use the query below to verify that new connections are dropping into the new bucket:  

SELECT
            es.session_id
      ,     es.login_time
      ,     es.login_name
      ,     [Group Name]            = wg.name
      ,     [Host Name]             = es.host_name
      ,     Application             = es.program_name
      ,     es.client_interface_name
      ,     es.memory_usage
      ,     es.total_elapsed_time
      ,     Pool_Max_Memory_kb      = rp.max_memory_kb
      ,     Pool_Used_Memory_kb     = rp.used_memory_kb
--SELECT *
FROM
      sys.dm_resource_governor_workload_groups wg
INNER JOIN
      sys.dm_exec_sessions es
ON es.group_id = wg.group_id
LEFT OUTER JOIN
      sys.dm_resource_governor_resource_pools rp
on rp.pool_id = wg.pool_id
WHERE
      wg.group_id > 2 --look only at connections being governed


"Serve the public trust, protect the innocent, uphold the law." -RoboCop

Thursday, October 06, 2011

Houston TechFest UG Leadership Summit

At the Houston TechFest 2011 Conference at the University of Houston I will be hosting a User Group Leadership Summit at NOON as part of the "Advanced" track.

I invite anyone currently involved with a local .NET, IT Pro, SQL Server, SharePoint, or any other user group, in any region, to attend.

We will have an open forum for discussion with voted topics; I will merely introduce myself and proctor, nothing more.

This will be my second year going to the Houston TechFest, and last year (though I spoke in the last slot - some time in the early evening, in a hot, dim, sleepy room) was a blast.  I was very impressed with the crowd, the speakers, and the vendors, and I am looking forward to seeing them all again.  As part of the committee that runs SQL Saturday Baton Rouge each year (#17, #28, #64 and again in 2012), I took some pointers from it.  HTF definitely has a wider diversity of technologies and topics, but nothing a SQL Saturday can't hope to be.

I will be making the road trip to Houston with more than a dozen of my coworkers for the event, and I will be joined in speaker shirts by three of my coworkers: Ryan Richard, Mike Huguet and Kerry Huguet.  I will be presenting on "SQL Server Best Practices with DMVs" at 2:30pm in the SQL Server track.