Monday, September 14, 2020

Prepping the local SSD D: in an Azure VM for SQL Server TempDB

One of the few good, safe uses for the local SSD volume of an Azure VM is for SQL Server TempDB data and log files. There's a gotcha. Let's discuss. 

Note: this blog post will be updated as I get clarification, specifically to solution 3b below. See comments.

Typically this is the D: volume of an Azure VM, though not always in the case of migrated VMs.

Moving your SQL Server instance's TempDB files to the D: volume is recommended for performance, as long as the TempDB files fit it the D: that has been allocated, based on your VM size. 

When the D: is lost due to deallocation, as expected, the subfolder you created for the TempDB files (if applicable) and the NTFS permissions granting SQL Server permission to the folder are no longer present. SQL Server will be unable to create the TempDB files in the subfolder and will not start. Even if you put the TempDB data and log files in the root of D:, after deallocation, that's still not a solution, as the NTFS permissions to the root of D: won't exist. In either case, SQL Server will be unable to create the TempDB files and will not start.

In the Windows Application Event log of the Azure VM, you'll see something like:
Event ID: 17204
FCB::Open failed: Could not open file D:\tempdb.mdf for file number 1.  OS error: 2(The system cannot find the file specified.).
Event ID: 5123
CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file 'D:\tempdb.mdf'.
On Azure Windows VMs, the local SSD temporary disk is D:, I'll assume it is for you as well. On Azure Linux VMs, the local SSD temporary disk is /dev/sdb by default. I mention these solutions for Windows OS below.


There are three solutions to this problem in Windows. To test any of these solutions, don't just restart your VM, you have to deallocate it (stop it in the Azure Portal, then restart it). That's the way to be sure the contents of D: has been lost. However, data on the temporary disk may be lost during Azure maintenance, which is mostly out of your control, so it is important to solve this issue when using the D: for TempDB.

Note that simply shutting down the Azure VM from inside operating system does not deallocate the VM, it only stops the VM. During a normal, successful reboot of the VM OS, data on the D: will not be lost. A reboot initiated from inside an RDP session, for example, shouldn't trigger the contents of D: to be lost. You must stop the VM from the Azure Portal to deallocate the VM. This is the difference between a VM's status of "Stopped" or "Stopped (deallocated)". 

1. Windows Scheduled Task Calling PowerShell

The first solution it appears is the one Microsoft documented years ago here. A simple PowerShell script running in Windows scheduled task, triggered on Startup. It also requires setting the SQL Server service to "Automatic (Delayed Startup)". 

The script creates the subfolder on D: for the TempDB, then grants permissions to the SQL Server service account, before SQL Server tries to start.

This type of script and delayed startup is a yet-unsolved necessary complication, but this solution works fine.

2. Add SQL service account to local Administrators group

The second solution is to place the TempDB files on the root of D: but then make the SQL Server service account a local Windows administrator, so that it has permissions to write to the root of D:.  This violates most security best practices (though it is more common than you'd think). 

Key here is placing TempDB in the root of D:. Even as a member of the local Administrators group on the server, SQL Server will not create subfolders upon startup, and a subfolder on D: won't exist after deallocation. 

I'm not a fan but it's a solution, and this is certainly the simplest solution, both to implement and maintain.

3a. Azure SQL Server IaaS Agent Extension, installed with VM  

A third possible solution involves the Azure SQL VM Resource Provider. It places the SQL Server IaaS Agent Extension that connects the SQL instance to the Azure portal for all sorts of easy integrations, bringing a lot of manageability to the Azure Portal. It is generally highly recommended. 

The SQL Server IaaS Agent Extension should handle preparing D: for the TempDB, solving startup issues for an Azure VM with SQL Server. This definitely works for VM images with SQL Server from the Azure marketplace, which by default configure TempDB on the local SSD D: anyway. I have tested this out with the newest "SQL Server 2019 on Windows Server 2019" image available in the Azure Marketplace. It works. For example:

"If you create the VM from Azure portal or Azure quickstart templates and place Temp DB on the Local Disk then you do not need any further action." Turns out, Solution 3 here is very similar to Solution 1. When installed from the marketplace with the SQL Server IaaS Agent installed, the Azure VM has a Windows Scheduled task triggered At Startup to call "C:\Program Files\Microsoft SQL Server IaaS Agent\Bin\SqlIaaSExtension.SqlServerStarter.exe".  

I speculate that this app runs on startup to prep the D: for TempDB, handling the permissions on D:\TempDb subfolder for the SQL Server service account, which is not a member of the local admins group by default. In this case, unlike Solution #1 above, the SQL Server service is not configured to "Automatic (Delayed Startup)", it is still configured to "Automatic", as usual.

Again, to test this out, don't just restart your VM, you have to deallocate it (stop it in the Azure Portal, start it). That's the way to be sure the D: has been deallocated and restarted.

I have split this solution into two parts because 3a only appears to work if SQL and the SQL VM Resource Provider are pre-installed via the image gallery. 

3b. Azure SQL Server IaaS Agent Extension installed manually

But what about for an existing VM, not created with the SQL Server IaaS Agent Extension pre-installed from the Azure VM image gallery? This is currently unclear.

We can install the SQL Server IaaS Agent Extension ourselves by registering the VM with the SQL VM resource provider. Once the RP is associated with your Azure subscription, adding the VM is easy. Install the SQL Server IaaS Agent Extension in lightweight management mode, without restarting the VM from the Azure CLI:
az sql vm create --name vm-iaasag-postsql --resource-group vm-iaasag-postsql_group --location eastus2 --license-type PAYG
Options for PowerShell are available in the same link.

However, you'll notice the SQL Server IaaS Agent Extension in lightweight management mode doesn't actually create the executable found in "C:\Program Files\Microsoft SQL Server IaaS Agent\Bin\". For that to be present, get up to "Full management mode". You can upgrade from lightweight management mode, or go straight to full management mode, but it costs a SQL Server service restart either way for Full management mode.

Currently though, if I create a new Windows VM without SQL Server, install a SQL instance, install the SQL Server IaaS Agent Extension in full management mode, and even schedule "C:\Program Files\Microsoft SQL Server IaaS Agent\Bin\SqlIaaSExtension.SqlServerStarter.exe" in a Scheduled Task triggered on startup (just like in a VM from the image gallery has), it doesn't appear to prep the local SSD D:\ drive for TempDB upon startup after deallocation. SQL Server service still fails to start because it cannot create TempDB. 

So, the SQL Server IaaS Agent Extension appears to prep the local SSD D:\ drive for the TempDB files, but only if SQL and the SQL VM RP are pre-installed via the image gallery. I'm still actively working on a solution to this and reaching out to Microsoft. I will update this blog post at a later date.


Edward Pochinski said...

We created a directory structure, you just need a bit more PowerShell to re create directory structure and assign the perms by calling a PS script from the task scheduler job.
Ed Pochinski

w said...

Hi Edward-
Yep, that's what solution #1 does and what Microsoft published as a solution for this in 2016.


Mels de Zeeuw said...
This comment has been removed by the author.
Mels de Zeeuw said...

I have a solution for 3b.
Create a .reg file and fill it with the following content:
Windows Registry Editor Version 5.00


Import it into the registry and create the a scheduled task like described in 3a it will work. I just tested this.

Don't forget to create a scheduled task that runs at system startup and executes the following command: "C:\Program Files\Microsoft SQL Server IaaS Agent\Bin\SqlIaaSExtension.SqlServerStarter.exe"