Thursday, April 21, 2016

Columnstore Indexes: Finally Awesome in SQL 2016

Columnstore indexes are amazing for optimizing the performance of queries over large row counts that would otherwise show up as a scan operation in an execution plan. When they were introduced in SQL Server 2012, however, they rendered the table read-only, and had to be disabled or dropped, and then rebuilt, in order to insert/update/delete rows.

If you tried to update a table with a nonclustered columnstore index in SQL 2012 or SQL 2014, you got this error:

Msg 35330, Level 15, State 1, Line 2
UPDATE statement failed because data cannot be updated in a table that has a nonclustered columnstore index. Consider disabling the columnstore index before issuing the UPDATE statement, and then rebuilding the columnstore index after UPDATE has completed.
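The workaround that the error message suggests can be sketched in T-SQL roughly as follows. This is only an illustration: the table dbo.SalesFact, the index IDX_CS_SalesFact, and the columns are hypothetical names, not objects from this post. Note that the rebuild re-creates the whole columnstore index, so it can take a while on a large table.

```sql
-- Hypothetical names throughout: dbo.SalesFact, IDX_CS_SalesFact,
-- Quantity, and SaleDate are placeholders for your own objects.

-- 1. Disable the nonclustered columnstore index so the table is writable again.
ALTER INDEX IDX_CS_SalesFact ON dbo.SalesFact DISABLE;

-- 2. Run the data modifications that failed with Msg 35330.
UPDATE dbo.SalesFact
SET Quantity = Quantity + 1
WHERE SaleDate = '20160421';

-- 3. Rebuild the index so large queries can use it again.
ALTER INDEX IDX_CS_SalesFact ON dbo.SalesFact REBUILD;
```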

In SQL 2014, you can now make a clustered columnstore index, but it has only narrow, specific uses because in SQL 2014 a table with a clustered columnstore index can't also have traditional (B-tree) nonclustered indexes, a primary key, foreign keys, etc.

All of the limitations above go away in SQL 2016, which will release on June 1, 2016. Release candidates are available for download but aren’t for production use.

SQL 2012 - Nonclustered columnstore indexes can be created, but aren’t updateable.
SQL 2014 - Nonclustered columnstore indexes can be created, but still aren’t updateable. Clustered columnstore indexes can now be created, and are updateable.
SQL 2016 - Nonclustered and clustered columnstore indexes can be created, and are updateable. Plus, you can mix clustered columnstore with traditional B-tree indexes. (Lots of other features too!)
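
As a rough sketch of what the SQL 2016 line means in practice, the script below puts a clustered columnstore index and a traditional B-tree index on the same table and then modifies rows without dropping anything. The table, index, and column names are made up for illustration, and the script assumes SQL 2016 or later.

```sql
-- Hypothetical table and index names, for illustration only (SQL 2016 or later).
CREATE TABLE dbo.SalesFact_Demo (
    SaleID   INT   NOT NULL,
    SaleDate DATE  NOT NULL,
    Quantity INT   NOT NULL,
    Amount   MONEY NOT NULL
);

-- The clustered columnstore index becomes the table's storage.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_SalesFact_Demo
    ON dbo.SalesFact_Demo;

-- New in SQL 2016: a traditional B-tree index on the same table.
CREATE NONCLUSTERED INDEX IX_SalesFact_Demo_SaleID
    ON dbo.SalesFact_Demo (SaleID);

-- Row operations work without dropping or disabling either index.
INSERT INTO dbo.SalesFact_Demo (SaleID, SaleDate, Quantity, Amount)
VALUES (1, '20160601', 3, 29.97);

UPDATE dbo.SalesFact_Demo
SET Quantity = 4
WHERE SaleID = 1;
```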

More about these features as Columnstore indexes evolve here:

Want to benefit from columnstore indexes, but not using SQL 2016 yet?

You have three options.

1) Disable (or drop) and rebuild nonclustered columnstore indexes whenever you update the table. Not a big deal for nightly-build data warehouses, but painful for frequently-updating tables. Note that the ONLINE option you may be used to when creating traditional indexes isn't available for columnstore indexes.
DROP INDEX [IDX_CS_Table] ON [dbo].[Table];
<do updates/inserts>
CREATE NONCLUSTERED COLUMNSTORE INDEX [IDX_CS_Table] ON [dbo].[Table] ( Column1, Column2... );
(When you move to SQL 2016, these drop/create steps won’t be necessary any more.)

2) SQL 2014 only. Drop all existing clustered/nonclustered indexes and foreign keys on the table, then change the table to have a single clustered columnstore index. This requires performance testing for smaller-scope queries, but should provide the big performance gains you’re looking for, at the cost of dropping the primary key and related foreign keys. This is suitable only for narrow use in reporting database designs, not in normalized environments.
<drop all existing indexes on [dbo].[Table], including the primary key and linked foreign keys>
3) SQL 2014 only. Create a copy of the table that implements 2) above, so you’ll have a table with traditional indexes plus an exact copy of the table with only a clustered columnstore index. You would then need to point big report queries at dbo.Table_CS and other queries (that use your traditional nonclustered indexes) at dbo.Table. You would also need to modify all insert/update/delete operations in the ETL on dbo.Table in order to replicate any row operations into dbo.Table_CS. This is painful, I understand, but should provide the big performance gains you're looking for. (It would be totally unnecessary to design two tables and redirect some reports in this way if you intend to move to SQL 2016 in the near term, so this step is not recommended.)
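
A minimal sketch of option 3 on SQL 2014 might look like the script below. It assumes dbo.Table_CS does not exist yet and that every column in dbo.Table has a data type that a clustered columnstore index supports in SQL 2014. The index name CCI_Table_CS and the sample values are hypothetical; Column1 and Column2 stand in for your real columns.

```sql
-- Build the copy. SELECT ... INTO creates dbo.Table_CS as a heap
-- with the same columns as the source table (no indexes or keys).
SELECT *
INTO dbo.Table_CS
FROM dbo.[Table];

-- Convert the heap into a clustered columnstore table.
-- CCI_Table_CS is a hypothetical index name.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Table_CS
    ON dbo.Table_CS;

-- From here on, every write in the ETL has to be mirrored by hand.
-- Column1, Column2, and the values are placeholders.
INSERT INTO dbo.[Table] (Column1, Column2) VALUES (1, 2);
INSERT INTO dbo.Table_CS (Column1, Column2) VALUES (1, 2);
```

The same CREATE CLUSTERED COLUMNSTORE INDEX statement is the final step of option 2, run against dbo.Table itself once its existing indexes and keys have been dropped.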

That's it!

Need proof that columnstore indexes are awesome and will improve the performance of your large queries? Get used to loving columnstore scans
