Wednesday, September 04, 2013

Replicated Stored Procedure Execution

One feature that many of my clients are unaware of is that the batched execution of a stored procedure can be replicated under certain circumstances. This can result in significant improvement in performance for stored procedures on replicated databases.

This MSDN article does a good job of breaking down the basics, but here's a demonstrable example you can use.

The rules about batching up the stored procedure executions are fairly severe, and require the SERIALIZABLE isolation level for the stored procedure's transaction. The goal is to see a single 100-row transaction, not 100 single-row transactions. The latter, inefficient behavior is the default behavior that we want to avoid. We'll prove it later.

Step one: Execute the below script block in an existing replicated database. (Not production, obviously.) Or, set up a new simple transactional replication between two databases, then run the below script block.
--Create testing table
CREATE TABLE [dbo].[TestArticle](
 [id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
 [text1] [nvarchar](100) NULL,
 [text2] [nvarchar](100) NULL,
 [text3] [nvarchar](150) NULL,
 [text4] [nvarchar](150) NULL,
 [text5] [nvarchar](200) NULL,
 [int1] [int] NULL,
 [int2] [int] NULL,
 [int3] [int] NULL,
 [int4] [int] NULL,
 [int5] [int] NULL,
 CONSTRAINT [PK_TestArticle] PRIMARY KEY CLUSTERED ( [id] ASC )
)
--Populate with random data
declare @x int
select @x = 1
WHILE @x < 100
BEGIN
 --Insert filler data
 INSERT INTO dbo.TestArticle
 (text1,text2,text3,text4,text5,int1,int2,int3,int4,int5)
 SELECT
     replicate(char(round(rand()*100,0)),round(rand()*100,0))
 , replicate(char(round(rand()*200,0)),round(rand()*100,0))
 , replicate(char(round(rand()*300,0)),round(rand()*100,0))
 , replicate(char(round(rand()*400,0)),round(rand()*100,0))
 , replicate(char(round(rand()*500,0)),round(rand()*100,0))
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 select @x = @x + 1
END
go
CREATE procedure [dbo].[addheaprows] 
as
 --Insert filler data
 INSERT INTO dbo.testarticle
 (text1,text2,text3,text4,text5,int1,int2,int3,int4,int5)
 SELECT top 100
     replicate(char(round(rand()*100,0)),round(rand()*100,0))
 , replicate(char(round(rand()*200,0)),round(rand()*100,0))
 , replicate(char(round(rand()*300,0)),round(rand()*100,0))
 , replicate(char(round(rand()*400,0)),round(rand()*100,0))
 , replicate(char(round(rand()*500,0)),round(rand()*100,0))
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 , round(rand()*10000,0)
 FROM dbo.testarticle
GO

Step two: Add both the table and the stored procedure to a new publication as articles. The stored procedure must be added as an article with Replicate option = 'Execution in a serialized transaction of the SP'.

Step three: Set up a default "TSQL" trace on the Subscriber instance.

Step four: Execute the below on the Publisher instance so that you can see the replicated stored procedure execution in action.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN testrepl
exec dbo.addrows
COMMIT TRAN testrepl

--In the trace, you should see one trace event for SQL:BatchStarting, textdata =

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; EXEC "dbo"."addrows"  ;SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
 
Step five: Now let's look at what happens normally without any modifications. Start up a default "TSQL" trace on the Subscriber instance and execute the below on the Publisher instance.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN testrepl
exec dbo.addrows
COMMIT TRAN testrepl


--In the trace, you will see 100 trace events for RPC:Starting, textdata = 


exec [sp_MSins_dbotestarticle] '8C39EB22-71BF-488A-8066-00C4AFABA497', .... 

exec [sp_MSins_dbotestarticle] .... 

exec [sp_MSins_dbotestarticle] .... 

exec [sp_MSins_dbotestarticle] .... 

....
This is the default behavior, and it's a bit shocking to see for the first time.

Remember our rules about the SERIALIZABLE isolation level, which means this isn't necessarily appropriate for all environments. This change could lead to an increase in locking during the execution of your procedure, so educate yourself on heightened isolation levels and why they are important.

No comments: