Thursday, October 31, 2019

Availability Groups: What to do without MultiSubNetFailover = True

I received an email from a client who is having issues with third-party applications connecting to their three-subnet SQL Server Availability Group. After an exchange with Microsoft Support, they discovered that the applications weren't specifying MultiSubNetFailover = True in their connection strings. As a result, because RegisterAllProvidersIP = 1 in the cluster, connections were randomly experiencing high latency upon connecting, as client-side DNS queries over time had a 66% chance of returning the wrong IP from the listener.

They set RegisterAllProvidersIP = 0, but before you take that as advice keep reading. That fixed the connectivity latency problem for now. I'm not sure whether the application connection strings can't or won't be changed to include MultiSubNetFailover = True. This decision was made either because of vendor limitations and/or because of vendor reliance on old data providers. Here was my guidance regarding Microsoft's recommendation to specify RegisterAllProvidersIP = 0 and lower the HostrecordTTL to 120 (seconds):

I guess this is good quick fix guidance but the problem really is on the application side. Assuming those connection strings or providers can’t change, sure. But you lose the ability to failover fast and you will have increased load on DNS servers.
With RegisterAllProvidersIP disabled on the client access point (the cluster network), only one IP address is made available for the listener: the IP of the Listener in the primary replica's subnet. When RegisterAllProvidersIP is enabled, all site IPs for the listener are simultaneously listed. Connection strings using MultiSubnetFailover = True will try all IPs simultaneously and use the one that responds, providing for the fastest possible transition after an availability group failover. That's desirable.
With RegisterAllProvidersIP = 1, caveat remains that any connection strings that aren't using or can't use MultiSubnetFailover will have connection problems. So RegisterAllProvidersIP should be enabled only when MultiSubnetFailover=True can be used in all application connection strings.

With RegisterAllProvidersIP=0, failovers may be problematic. HostrecordTTL = 120 is 2 minutes. Availability Group failovers will result in up to a 2 minute outage, after which time the client OS will query DNS for new information. With RegisterAllProvidersIP =1, failovers can be more or less instantaneous, or at least not hampered by DNS. 

Also by lowering the HostrecordTTL, traffic to and constant load on your DNS server(s) will increase tenfold, as the default is 20 minutes.  What MS has recommended is a short-term solution but it’s not scalable obviously. It’s not getting to the real source of problem – the connection strings. 

It would be much better if we got the third-party applications to change their connection strings to use MultiSubnetFailover = True, and then you could enable RegisterAllProvidersIP =1. This has been around since 2012, it should be available if they are using any modern data provider. If the third party app is still using OLEDB, there is a new data provider for MSOLEDBSQL released in 2018 that works for all previous MS OLE DB implementations. The new ODBC Driver 17 supports Multisubnetfailover, if they’re using ODBC. If the vendor apps are still using something old like SQLNCLI10 (from 2008), or something non-Microsoft, check for newer versions. More info here: https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/always-on-client-connectivity-sql-server?view=sql-server-ver15

Wednesday, October 23, 2019

Idera GeekSync: SQL Security Principals and Permissions 101

Thanks to Idera and all 239 of you who joined the Idera GeekSync webinar today for my topic on SQL Security Principals and Permissions 101.

Fun fact: Principals and Principles have totally different meanings, but both apply. Principals - in that each login, role, group, user, etc., in SQL Server is a principal - as in an important person, or, in one of the definitions, "a person for whom another acts as an agent or representative." But also, the presentation is about Principles - as in good guidance, or "a fundamental truth or proposition that serves as the foundation for a system of belief or behavior or for a chain of reasoning."

We discussed both! Not only a 101-level intro to authentication, logins and users, but also guidance on migrating and dealing with logins and users, SQL Injection, vulnerability and threat detection, data provider updates, permissions for non-sysadmins, separate accounts for sysadmins, and the nature of permissions in/around SQL Server stored procs, views, and functions.

Got good feedback from Idera on the comments, hope it was a useful and entertaining (albeit fast-paced because of time) review of SQL Server Security and Permissions for you.

PowerPoint available for download here: https://github.com/williamadba/Public-Presentations/tree/master/Idera%20GeekSync

Lab scripts for the sproc/view/udf permissions demo:
https://github.com/SparkhoundSQL/sql-server-toolbox/blob/master/lab%20-%20security%20p1.sql
https://github.com/SparkhoundSQL/sql-server-toolbox/blob/master/lab%20-%20security%20p2.sql

[UPDATE] Idera has posted the recording here: https://www.idera.com/resourcecentral/webcasts/geeksync/sql-security-principals-and-permissions


Monday, October 21, 2019

SSIS Scaleout: Cannot open certificate store on the machine

When attempting to connect a worker node to the current Master node of a SQL Server Integration Services (SSIS) Scaleout, for on-prem SQL Servers, if you receive the error:

Validation Error(s):
Cannot open certificate store on the machine

First off, launch SSMS on the SQL Server instance you intend to be the master. The steps to troubleshoot start with the two steps here:
https://docs.microsoft.com/en-us/sql/integration-services/scale-out/troubleshooting-scale-out#cannot-open-certificate-store

"1. Run Scale Out Manager as administrator. If you open Scale Out Manager with SSMS, you have to run SSMS as administrator.
2. Start the Remote Registry service on the computer if it is not running."
3. Finally, if you are trying to add a named instance, there is one more gotcha. The Scaleout Worker is the server name not the SQL Instance name. Do not provide the full named instance name, for example, "SQLSERVER-1\INSTANCENAME", only provide the server name, in this case, "SQLSERVER-1".