Friday, May 03, 2013

Hunt Down Tables Without Clustered Indexes

This one's a gimme, but an underrated utility script when I enter a new environment and look for low-hanging fruit for easy improvements.

You'd be surprised how many vendors ship applications without a single index, much less properly aligned clustered indexes.

Heaps are bad.  You can't defragment a heap by performing index maintenance, despite old myths about the create/drop a clustered index trick.

Similarly, the myth of a heap performing better on pure inserts vs a table with a clustered index is an academic one.  Heaps are impossible to order, which means selecting from a heap is always a scan.

If your table will be constantly written to and rarely read, put a clustered index on it.  The writes will be sequential, resulting in fewer page breaks and less fragmentation, which mean more efficient storage and maintenance.

EDIT: Page "breaks" (I meant splits) aren't possible on heaps, but forwarding pointers, which create a maze of bad performance in heaps with nonclustered indexes. Thanks to commenters for pointing this out!

And when you have to look in that table for a stray logging record, it won't be a full table scan. If your table will be constantly written to and never read, why is it in your database?  Such a table doesn't exist.

This script is an easy one to use for finding heaps in a database, along with the row count and size of jumbled data that is haunting your database.  It is in my DBA toolbox folder alongside many other scripts.

--Doesn't work on 2000 databases or databases in 2000 compatability mode.  Need to change the db_id() syntax if so.

 [Database Name] = db_name()
, [Table Name] = + '.' +
, p.row_count
, SizeMb= (p.reserved_page_count*8.)/1024.
 sys.objects o
inner join  sys.schemas s on o.schema_id = s.schema_id
inner join  sys.dm_db_partition_stats p  on p.object_id = o.object_id 
inner join  sys.indexes si on si.object_id = o.object_ID
WHERE si.type_desc = 'Heap'
and  is_ms_shipped = 0
order by SizeMb desc

As an aside, but a different blog post altogether - a good clustered key is narrow, unique, unchanging and sequential - which is why IDENTITY integer columns are the perfect clustered key.  Compound clustered keys can be a big waste of space.