pointers, solutions and scripts for the SQL DBA
not intended to replace msdn, common sense or oxford commas


Capture File IO

Here is a DMV query you can use to examine file IO. Run and store this before every restart.

The num reads and num bytes read columns reset when the sql service resets.

The sample_ms resets when the actual box resets.

select sf.fileid, sf.name, sf.filename

, days_server_up = f.sample_ms/1000./60./60./24.

, f.num_of_reads, gb_read = f.num_of_bytes_read/1024./1024./1024., hours_stall_read = f.io_stall_read_ms/1000./60./60.

, f.num_of_writes, gb_written = f.num_of_bytes_written/1024./1024./1024., hours_stall_write = f.io_stall_write_ms/1000./60./60.

, f.io_stall, size_gb = f.size_on_disk_bytes/1024./1024./1024.

from sys.dm_io_virtual_file_stats (db_id(),null) f

inner join sysfiles sf on f.file_id = sf.fileid


Mini-Lecture: Why I hate "comma joins"

A colleague of mine who is not a DBA asked about comma joins in an email. I replied with the below:

Sent: Monday, October 19, 2009 3:32 PM
Subject: RE: SQL question

Happy to help. Thanks for the hyperbole. :)

There is no functional difference, however, the first method – using a “comma join” is undesirable for a few reasons.

1) Using “comma joins” is old code and only supported at the pleasure of Microsoft. Its considered bad syntax. Its considered lazy. JOINS are the standard for TSQL. It may not be supported in future versions of SQL server.

2) It mixes your JOIN logic with your SELECTION logic. Imagine you added WHERE EMPID = ‘10’ to the sample query. Both functionally and in business logic, the EMPID = ‘10’ and the A.EMPLID = C.EMPLID are two very different statements. Functionally, it will probably work (not 100%), yet it is far more difficult to understand and maintain.

3) It is really difficult to read. Imagine a much larger query that has a bunch of commas in the FROM and then a tangle of statements in the WHERE Clause. Using JOINs and ONs are far easier to read for the user that comes behind you.

4) Screw up your WHERE clause and you’ll end up with a Cartesian product. Row count explosion.

Attached is a lab to demonstrate the various types of joins from a functionality standpoint.

Sent: Monday, October 19, 2009 3:23 PM
Subject: SQL question

Hey man, I have a SQL question for you.

What is the difference between this:


And this:


And the benefit of one over the other?

Hope all is well with the “Grand SQL Wizard”!!!

(Yeah, I'm not sure either if that last bit was sarcasm or hyperbole either. It certainly wasn't genuine or factual. :) )

Here's that handy .sql script I keep around for demonstrating basic JOIN principles and syntax. Note that all of the syntax in this script is valid, but a lot of it is unacceptable and improper - that is sort of the point. This lab usually goes along with a short, stern lecture :) (see above) on why proper JOIN syntax is preferable.

tkey int

tkey int


SELECT * FROM @T1 T1 INNER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 LEFT OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 RIGHT OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 FULL OUTER JOIN @T2 T2 ON T1.tkey = T2.tkey
SELECT * FROM @T1 T1 , @T2 T2
SELECT * FROM @T1 T1 , @T2 T2 WHERE T1.tkey = T2.tkey

"Programmers can be lazy." -Larry Wall


Try, try again with strange SQL 2008 error on Win7/Server 2k8

"Invoke or BeginInvoke cannot be called on a control until the window handle has been created."

Got this error when trying to install SQL 2008 Service Pack 1 on Windows Server 2008.

Went on a wild search of trying to find yet another solution to a SQL 2k8 install problem, including downloading a CU, but then I found this.
Generally, if you just rerun it won't hit the issue again.
It worked.


Trying again. Its not just for bicycle riding any more!

"The definition of insanity is doing the same thing over and over again and expecting different results." - NOT BY Einstein, Ben Franklin, Mark Twain, or anybody with a brain.

2008 install problem with Framework 3.5

Problem installing SQL 2008 on a new virtual server. Gave an error almost immediately after running setup.exe, didn't even get to the links screen.

"Microsoft .NET Framework 3.5 installation has failed.

SQL Server 2008 Setup requires .NET Framework 3.5 to be installed."

The solution I found deep in a thread in MSDN. The point of this blog is to make things like this easier to find on the web, so here goes.

  1. Copy your media to a folder on the server. You need to update a file in there.
  2. Download a fresh copy of dotNetFx35setup.exe from Microsoft
  3. Copy the new file and overwrite the old in this location in the media: ..\x64\redist\DotNetFrameworks\


Introduction to DMV's in SQL 2005 and 2008 from SQL Saturday!

Download my presentation on Introduction to Dynamic Management Views here

I most recently gave this presentation at SQL Saturday! in Baton Rouge on August 1

Introduction to SSIS presentation from SQL Saturday!

Download my presentation on Introduction to SSIS here

I most recently gave this presentation at the SQL Saturday in Baton Rouge on August 1


Forward Engineering from Visio 2007 without Enterprise Architect edition

To me, one of the most boggling, counter-intuitive and short-sighted decisions was to remove forward-engineering of databases from the base editions Microsoft Visio 2007.

One of the best third party add-ons to Visio therefore, is the Orthogonal Toolbox, a XML-based export utility that, with the right XSLT file, can be used to forward engineer your Visio database diagram.

It isn't perfect, but here's how:
  1. Close Visio, download and install the Orthogonal Toolbox: http://www.orthogonalsoftware.com/toolboxaddons.html
  2. Download this guy's XSLT file: http://www.dougboude.com/blog/1/2008/11/SQL-Forward-Engineering-with-Visio-2003-Professional.cfm
  3. Open your visio diagram, click on the new toolbox button.
  4. Select the XSLT file in the first box, select a target in the second.
  5. Done. Spits out a nice SQL file.
I had a problem with this tool however - including the attributes (the columns) caused the tool to cause a windows error. I was using Visio 2007 and Vista. Curiously enough, copying my entire diagram (control-A) and pasting it into a new Visio diagram solved that problem.

Also, I learned today that VS 2008 can import a SQL file, like the one generated here, into a database solution. Check constraints and defaults don't get put in, but its a big timesaver for PK's, FK's and tables.

http://www.orthogonalsoftware.com/toolboxaddons.html (dead)
http://richard.gluga.com/2009/03/no-erd-to-sql-code-generation-in-visio.html (dead)

UPDATE: see comments for more information on generating DDL from a Visio doc

UPDATE: in Vista and Windows 7, try running Visio in XP SP2 compatibility mode.  This cleared up an error in Visio 2007: "Requested Registry Access Is Not Allowed"

UPDATE: If you want the orthogonal software download, try this link here.  Otherwise, try this: http://forwardengineer.codeplex.com/

"Avarice, the spur of industry" - David Hume


How many threads is too much for SQL?

Here's a redacted version of a Q&A I had with one of the devs (who is not a DBA) at my consulting company. I thought it was worth sharing.

I got a question on multithreading in SQL 2005. Do you know much about it? Our client would like to know how many threads can be open before performance starts to suffer.

And my response to my colleague:

That's a really hard question to answer since SQL doesn't operate in application "threads". SQL server, with only one processor (and therefore one Scheduler) can handle many many threads from an application. More processors will enable it to more quickly respond to multiple threads, and SQL 2005 detects processor count automatically and scales. "Open threads" is not really something to measure, as a multithreaded application really doesn't correlate directly to SQL Server. Sure, you can see the threads connections open and close, but to measure performance, the count of threads is really not useful.

If they are worried about CPU capacity, there are a number of perfmon counters and internal dynamic management views (DMV's) to look at to see if CPU is a bottleneck.

In Perfmon, you want to be looking at

\System\Processor Queue Length (shouldn't exceed 2x # of CPU's)
\Processor(_Total)\% Processor Time (self-explanatory, lower number the better)
\Process(sqlservr)\% Processor Time (self-explanatory, shouldn't differ from the above if no other major apps are running on the server)

In DMV's, look for

sys.dm_os_wait_stats (You don't want to see a lot of CPU-related wait types)

Here is a good link from MS: http://technet.microsoft.com/en-us/magazine/2007.10.sqlcpu.aspx

There's obviously much more to look at with SQL server, but not knowing my audience I'm not sure how much detail you want. Sorry for the unclear answer but there really is no hard number of application threads to stay under. It depends a lot upon the other subsystems of the server as well (RAM and physical IO). Sounds like they want to do performance tuning. Sounds like an opportunity.

"You always admire what you really don't understand." -Blaise Pascal


correct the name of a SQL server in SQL 2005

Renamed a virtual server, but your SQL server instance still reflects the old name?

verify what SQL Server thinks it is by

select @@servername

rename the server with:

sp_dropserver [sqlserveroldname<\instancename>]
sp_addserver [sqlservernewname<\instancename>], local

restart the SQL Server Service

verify the new name by select @@servername

Note: this isn't a huge deal in SQL 2005 as it was in previous versions. You'll notice, for example, you can still access your SQL server even if it thinks it is in the wrong name, and the SSMS object explorer looks accurate. But for remote logins and I would assume linked server connections, having those names out of sync could be fun.

MSDN: http://msdn.microsoft.com/en-us/library/ms143799.aspx

EDIT: Note that you cannot rename the instance name of a named instance, only the server name.


'Microsoft.AnalysisServices.DsvTableBinding' to type 'Microsoft.AnalysisServices.QueryBinding'

One of the things this blog intends to do is help speed up the troubleshooting process by documenting strange errors like this.

Here's an interesting error message that came up today which has surprisingly little web coverage.

"Unable to cast object of type 'Microsoft.AnalysisServices.DsvTableBinding' to type 'Microsoft.AnalysisServices.QueryBinding'.

It is caused by trying to apply a querystring to an existing partition that is not query-driven in OMA. In this case, an SSIS 2005 vb.net app that builds partitions (by the hundreds for me) into a cube. Like many things, its one of those spend a while then slap your forehead problems that makes a ton of sense. Obviously, I can't set the partition query of a partition that is table-bound.

I do plan to post my ssis app once I am done with it. But until then, here's the blog post that inspired it.

"No question is so difficult to answer as that to which the answer is obvious." -George Bernard Shaw


handy feedback for SSIS 2005 Script Components

As you know, getting feedback from SSIS is spotty at times. Debugging is also very limited in SSIS Script Components (where's immed!?). This came in handy today.
Dts.Events.FireProgress("informational string", 0, 0, 0, Nothing, False)
Lots of different ways to get info out of the "progress" window. Lots of feedback possibilities in the Dts.Events collection.

And don't forget that in SSIS 2008, you can write C# as well as VB.net.

"Feedback is the breakfast of champions." -Ken Blanchard


Fancy Missing Indexes

UPDATED 5/28/2013, see notes below.

You've seen fancy ketchup. Here's fancy missing indexes.

This script is database specific. It does not use the server-wide DMV's alone, it joins to INFORMATION_SCHEMA.TABLES system tables to generate index create scripts. This is a very fast way to enter an environment, and take a peek at the indexing situation, something I do regularly as a DBA consultant.

, create_index_statement = 'CREATE NONCLUSTERED INDEX IDX_NC_' + replace(t.name, ' ' ,'')
 + replace(replace(replace(ISNULL(replace(mid.equality_columns, ' ' ,''),'') , '],[' ,'_'),'[','_'),']','') 
 + replace(replace(replace(ISNULL(replace(mid.inequality_columns, ' ' ,''),''), '],[' ,'_'),'[','_'),']','') 
 + ' ON ' + statement 
 + ' (' + ISNULL (mid.equality_columns,'') 
    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END 
    + ISNULL (mid.inequality_columns, '')
 + ')' 
 + ISNULL (' INCLUDE (' + mid.included_columns + ')', '')  COLLATE SQL_Latin1_General_CP1_CI_AS
, unique_compiles, migs.user_seeks, migs.user_scans, last_user_seek, migs.avg_total_user_cost
, avg_user_impact, mid.equality_columns,  mid.inequality_columns, mid.included_columns
, quartile
--select *
FROM sys.dm_db_missing_index_groups mig
INNER JOIN --get only the top 20% of suggestions based on cost.
(select *, quartile = NTILE(5) OVER (ORDER BY avg_total_user_cost asc) from sys.dm_db_missing_index_group_stats) migs 
ON migs.group_handle = mig.index_group_handle
--and migs.quartile = 1
INNER JOIN sys.dm_db_missing_index_details mid 
ON mig.index_handle = mid.index_handle
inner join sys.tables t 
on t.object_id = mid.object_id
inner join sys.schemas s
on s.schema_id = t.schema_id

and mid.database_id = db_id() -- current database only
--and  (datediff(week, last_user_seek, getdate())) < 6
--AND  migs.unique_compiles > 10
--and  migs.quartile >= 3
--and  migs.user_seeks > 10
--and  migs.avg_user_impact > 75
--and  t.name like '%salesorder%'

--order by avg_user_impact * avg_total_user_cost desc 
order by create_index_statement
Pardon blogspot's inane raw code posting ability.
  • The improv calc up top in the alternate ORDER BY is simply to evaluate the relative use of a potential index against others that show up in the report. Its not terribly scientific but weighs usage and value.
  • The WHERE clause items can be used depending on the situation. If your server has been operating for a while and you'd like to see what indexing needs recent changes have been generating, use the date filter. Have a lot of ad hocs? Filter on unique_compiles > 1. The third WHERE clause is the same as the improv value above, helps me filter out things that are too small to care about. Lastly, I filter out anything that won't have at least a 50% impact. Oftentimes, I'll set this last value much higher.
  • Joins to INFORMATION_SCHEMA.TABLES system tables to generate a proper index create statement. It follows a naming convention I prefer, IDX_NC_ for nonclustered indexes, followed by the table name and the key fields. For covering indexes, I'll remove the key fields from the name and just put "covering". Your naming convention for indexes is probably different.
  • Before creating any indexes from the missing index DMV's, do a little research on them and keep in mind that they reset with the server. It maybe useful to retain these over time, spanning multiple resets.
  • Oftentimes, you'll see many of the rows in this result overlap. The INCLUDES might be in a different order, or you'll see one suggested index that has all but one other field that another suggestion has. Intelligently combine these together, keeping in mind that order DOES matter in the key fields.
  • Lastly, if you've examined existing indexing on the table, use the provided creation script in the column create_index_statement .
  • This script, aside from a few tweaks here and there, is hardly an original creation of this blog. As part of your research on missing index DMV's, don't miss: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx
Other linkage:

Updated 5/28/2013 to reflect some changes, including getting rid of the INFORMATION_SCHEMA tables in favor of system tables, update the formatting, 

"As I leave you I want you to know... just think how much you'll be missing." -Richard Nixon


Don't stop that mouse from moving



Method 2: Move Your Mouse Pointer
If you move your mouse pointer continuously while the data is being returned to Microsoft Excel, the query may not fail. Do not stop moving the mouse until all the data has been returned to Microsoft Excel.

NOTE: Depending on your query, it may take several minutes to return the results of your query to the worksheet.

"Words represent your intellect. The sound, gesture and movement represent your feelings." -Patricia Fripp


Defrag Complications

Former colleague of mine called me up on a Saturday morning. His manager, who spends most of his day staring at a unnamed, 3rd party, leading SQL monitoring software, was really giving him hell about a table in one of their performance databases that was at 90% fragmentation.

Rebuilding the indexes didn't seem to help. So I asked him how large the table was, and it was nine pages.

I feel sorry for my buddy, who now has to try and explain this to his boss. His report of high-fragmentation tables is just not going to be cleared unless he can filter out the tiny tables.

"Against stupidity the gods themselves contend in vain." -Friedrich von Schiller


clear out DTA indexes and statistics

Here's a helpful script I keep around in my portable HD. When the Database Tuning Advisor fails or is aborted for whatever reason, it can leave behind its hypothetical indexes and stats. This script will remove these active indexes and stats from your database. Uses the is_hypothetical flag to find indexes. Uses the _dta prefix to find stats. This assumes that when you implement DTA-recommended stats, you change their name (always a best practice). If you have implemented DTA-recommneded stats and not changed their name, this script will remove those too.

WHILE EXISTS (SELECT * FROM sys.indexes WHERE is_hypothetical = 1)
DECLARE @sql varchar(max)
SELECT @sql = 'drop index ' + indexname + ' on [' + tablename + ']'
SELECT TOP 1 indexname = i.name, tablename = o.name
FROM sys.indexes i
INNER JOIN sys.objects o
ON i.object_id = o.object_id
WHERE i.is_hypothetical = 1 and o.type = 'u'
) x
EXEC (@sql)

WHILE EXISTS (SELECT * FROM sys.stats i WHERE OBJECTPROPERTY(i.[object_id],'IsUserTable') = 1 AND i.[name] LIKE '_dta%' and user_created = 0)
DECLARE @sql varchar(max)
SELECT @sql = 'drop statistics [' + object_name(i.[object_id]) + '].['+ i.[name] + ']'
FROM sys.stats i
OBJECTPROPERTY(i.[object_id],'IsUserTable') = 1 AND
i.[name] LIKE '_dta%' and user_created = 0
EXEC (@sql)

See also:
I originally posted this on tsqlscripts.com (drink!)

"The great tragedy of science - the slaying of a beautiful hypothesis by an ugly fact." -T.H. Buxley

blog intent

Pardon the title, which treads the thin ice of being too wannabe-serious. My first dozen or so attempts at a sql blog title failed for not being unique enough or not available on blogspot.

The intent of this blog is not personal. As a consultant database administrator in the South, I'm regularly encountering things that I think other DBA's should have quicker access to. Follow this blog if you like, but its intended purpose is to be that one resource on the web that specifically addresses some DBA's problem or question.

If you've found this blog via a web search engine and it helped you out, then this small effort on my part has served its purpose. Writing this initial post is more for my own edification and benefit, a mission statement of sorts.

"There is no restraining men's tongues or pens when charged with a little vanity." - George Washington