Sunday, June 30, 2019

Thanks for Joining us at SQLSaturday Pensacola!

Thanks for joining us at #SQLSatPensacola!

The 8:30am Careers in Technology panel featured lots of talk about Twitter, so I might as well mention that the panelists were (in the audience's order, left to right): @SqlrUs @SQLEspresso @mviron @svenaelterman @william_a_dba @sqllocks @SQLServerDBA318 @IrishSQL. It was the first of four back-to-back sessions for the Assafs!

Then Christine gave a presentation on Giving Feedback: How to Effectively Communicate to your Employees. I presented on SQL Admin Best Practices with DMVs in the auditorium before lunch. Here are the slide deck and info for my presentation; the slide deck has also been uploaded to the SQLSat Pensacola schedule page. The lunch WIT session with Rie, Monica, and Christine was also full of wisdom and truths.

It was great to see #SQLFamily again this weekend, hope to see you next time!

Next up is SQLSaturday Birmingham, then Baton Rouge on August 17!

Monday, June 24, 2019

SQLSaturday Pensacola 2019 Coming Soon, See You There

Pensacola, one of my favorite (and my first) SQLSaturday events, is this weekend, June 29. Check out their lineup and info! After that I'll be at SQLSat Birmingham, then my hometown SQLSaturday Baton Rouge on Aug 17. Hopefully everyone in the region can make it out to all these great SQLSaturday events.

At SQLSaturday Pensacola, both my wife and I will be presenting, then catching up on reading and relaxation. I'll be presenting in the morning on SQL Admin Best Practices with DMVs, and she'll be presenting (a much more popular topic) for current and future leaders, Giving Feedback: How to Effectively Communicate to your Employees. I'll also be part of the Careers in IT - Pensacola 2019 panel in the first timeslot.

We'll definitely both be joining Rie Irish for Women in Tech: Thriving Not Just Surviving as well.

SQLSat Pensacola is also sporting a pair of precons on the Friday before, including Zero To Dashboard by Angela Henry and Managing and Architecting Azure Data Platform by fellow good looking bald guy John Morehouse.

Looking forward to #SQLFamily again this weekend, hope to see you there!

Wednesday, June 05, 2019

Actual Emails: Will MSOLEDB work for connecting to an Availability Group?

Scenario:

We need to configure an existing legacy application from an external vendor to talk to our new SQL Server 2017 Availability Group, which spans multiple subnets. In the end, that last bit is the key. The old data provider MSOLEDB will work for connecting to single-subnet Availability Group listeners, but won't work consistently when connecting to a multi-subnet Availability Group's listener. The key is the ability to specify MultiSubnetFailover=True in the connection parameters.

Client's software vendor:
The connection used is the Microsoft OLE DB Provider for SQL Server that is supplied by Microsoft to create the Data Link Properties. When configuring our Data Link, we use the "Microsoft OLE DB Provider for SQL Server". The connection string is formatted: 
Provider=SQLOLEDB.1;Password="whatever";Persist Security Info=True;User ID=username;Initial Catalog=Test;Data Source=ServerName 
Answer:
Good info, but we do need to make a change here. SQLOLEDB is the provider from back in the SQL 2000 era, and I do not recommend its use for new development. It was replaced by the SQL Server Native Client (SNAC), which has since been replaced by MSOLEDBSQL (linked below). It should be easy and transparent to upgrade the provider from SQLOLEDB with no negative impact.
Here's why we need to upgrade the data provider to talk to our SQL Availability Group. The SQL Server listener for a multi-subnet Availability Group actually has two IPs. When you run this at a command line:
nslookup SQLListenerNameWhatever
you get back an IP in each subnet the Availability Group spans (in our case, two).
When a connection string uses MultiSubnetFailover=True and connects to the Availability Group listener name (not the IP or either SQL Server instance name), BOTH IPs are tried simultaneously and immediately, and the driver talks only to the IP that replies: the primary replica. After a failover, the other IP begins to reply immediately, so there is no delay in reconnectivity when a failover occurs.
Without MultiSubnetFailover=True, your application will (essentially randomly) pick ONE of the two IPs for the listener and try it. There is no way to "rig" one IP to be first consistently over time. If it picks the primary replica, everything works! If it picks the IP for the current secondary replica, your application's connection timeout will have to expire before it tries the next IP. This is why I'm bringing this up: without MultiSubnetFailover=True, the application will time out upon SQL login.
This hasn't been an issue with your other clients if they aren't using a multi-subnet Availability Group. If they have an Availability Group all inside one subnet, then the listener only has one IP in DNS, and MultiSubnetFailover=True isn't required.
You should be fine to install the MSOLEDBSQL provider released in 2018 and use that in your data link. Obviously it should be tested for due diligence, but it should work. At the very least, you could instead try the SQL Server Native Client 11 (SQLNCLI11), released with SQL Server 2012; note, though, that MultiSubnetFailover is supported by its ODBC driver but not its OLE DB provider, so for an OLE DB data link, MSOLEDBSQL is the right choice.
Let me know if you have any more questions.
Connection string information for SQL Server Availability Groups:
Note: you also need to make sure your Windows cluster has RegisterAllProvidersIP enabled for a multi-subnet Availability Group!
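To illustrate, the vendor's data link connection string, upgraded to the modern provider and made multi-subnet aware, might look something like this (note the OLE DB provider string takes Yes/No for this keyword, while .NET connection strings use True):
Provider=MSOLEDBSQL;Password="whatever";Persist Security Info=True;User ID=username;Initial Catalog=Test;Data Source=SQLListenerNameWhatever;MultiSubnetFailover=Yes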


Saturday, June 01, 2019

Thanks for joining us at SQLSaturday Dallas!

Thanks for joining hundreds of SQL pros from around the country today at UT Dallas!

Here are my session downloads from my presentation today at 1:15 in room 1.102; there's a handy .zip file in there with all the contents for easy reference. They have also been uploaded to the SQLSaturday Dallas website and are downloadable from the schedule page.

Why a 101-level presentation? One piece of feedback we have consistently received over the past 10 years of SQLSaturday Baton Rouge is that there's not enough entry-level content. This is why I try to deliver entry-ramp-style presentations for jobseekers, job switchers, and students at SQLSaturdays. And it's not something "beneath" me: these foundational, fundamental principles are important for new careers, and, at the edges, I still learn things that round out my knowledge and further my own career.


Thursday, May 30, 2019

See you at SQLSaturday Dallas!

Both my wife and I will be speaking this weekend at SQLSaturday Dallas, back-to-back in room 1.102, and both on topics perfect for entry level IT pros, job seekers, and students. We're honored to be on a SQLSaturday schedule filled with some of the biggest names in the industry, so I hope to see you there! Looking forward to seeing so many #SQLFamily there again.

I'll be presenting SQL Security Principals and Permissions 101, a ground-level introduction to SQL Server authentication, security principals, and even some security principles. We'll go through code demos of how stored procedures assist the DBA by abstracting away the permissions necessary for execution, and how you can design a database security model using a role-based, least-permissions pattern. I think everyone, from DBAs to developers, can learn something from this session about how better to secure their databases.
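
As a teaser, here's a minimal sketch of that pattern (all object and role names here are hypothetical): thanks to ownership chaining, users can execute a stored procedure that reads a table they have no direct SELECT rights on.

-- Assumes a dbo.Orders table already exists
CREATE ROLE SalesReaders;
GO
CREATE PROCEDURE dbo.GetRecentOrders
AS
SELECT TOP (100) OrderID, OrderDate
FROM dbo.Orders
ORDER BY OrderDate DESC;
GO
-- Grant only EXECUTE on the procedure; no permissions at all on dbo.Orders itself
GRANT EXECUTE ON dbo.GetRecentOrders TO SalesReaders;
-- Because the procedure and the table share an owner (dbo), SQL Server does not
-- re-check SELECT permission on the table when the procedure runs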

See you at UT Dallas this Saturday, June 1!

Wednesday, May 29, 2019

Untrusted FK's demo + ADS notebooks

Back in 2012 at SQLSaturday Pensacola, I did a lunch-time lightning talk on indexing, which was supposed to have a big finale on how FKs can silently help with performance, and it didn't work: the demo didn't show the performance boost. That was because, thanks to poor scripting by SSMS, the FKs were enabled but not trusted. This blog post serves as a long-belated knowledge share of a problem I didn't figure out live, as well as a successful recreation of that big finale.

Fast forward to a couple weeks ago at the May 2019 Baton Rouge SQL Server User Group, where local fellow SQL pro John Wells gave a presentation to a mixed audience of devs and DBAs on performance tips, one of which was to check up on untrusted FKs. The thought occurred to me to create some labs demonstrating not only how untrusted FKs get accidentally created by well-intentioned devs and DBAs, but also how to fix them, and how they can be a silent performance drag. So with all respect to John, Chris Bell (whom John mentioned in his presentation), and the many others who have spoken and written well on this topic already, below is a lab demonstrating the issue, the resolution, and the performance impact.
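
As a quick preview of the core issue before you dive into the lab (a minimal sketch; the table and constraint names here are hypothetical):

-- Find foreign keys that are enabled but not trusted by the optimizer
SELECT name, OBJECT_NAME(parent_object_id) AS TableName
FROM sys.foreign_keys
WHERE is_not_trusted = 1 AND is_disabled = 0;
-- How it usually happens: disable, then re-enable without validating existing rows
ALTER TABLE dbo.OrderLines NOCHECK CONSTRAINT FK_OrderLines_Orders;
ALTER TABLE dbo.OrderLines CHECK CONSTRAINT FK_OrderLines_Orders; -- enabled, but still untrusted!
-- The fix: WITH CHECK revalidates existing rows so the optimizer can trust the FK again
ALTER TABLE dbo.OrderLines WITH CHECK CHECK CONSTRAINT FK_OrderLines_Orders;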

To further one more goal, I've built the lab into both a T-SQL script and a new Azure Data Studio SQL notebook. Either offers a step-by-step introduction and demonstration. New to SQL Notebooks? No worries: they were released generally in March 2019 as a new feature of Azure Data Studio (which can be launched from SSMS 18+), which itself went GA in September 2018.

A SQL Notebook is another way to combine text instructions with step-by-step, modular, executable SQL statements in a single file. Notebooks are a popular form of mixed media, combining text, proofs, queries, and their live interactive output. They are already quite popular tools for data scientists to package methodology, analysis, and live output. SQL Notebooks include kernels for querying with SQL, PySpark, Spark (in multiple languages), and Python, and, now in preview for SQL Server 2019 CTP 3.0, for working with big data clusters. SQL Notebooks could also be used by DBAs and developers as rich, well-documented runbooks for complex deployment or disaster recovery exercises.

SQL Notebooks are easy to learn and use; consider one for your next detailed query breakdown or summary data analysis. Maybe you'll learn two things at once!

Github links to the SparkhoundSQL toolbox for the Untrusted FKs lab: .sql .ipynb

PS Check it out, Github even parses the notebook JSON correctly to display a (non-interactive) view of the notebook outside of ADS with the saved resultsets. Neat!

UPDATE: Refined my terminology a little bit to favor "SQL Notebooks" not "ADS Notebooks"

Monday, April 29, 2019

Four Data Integration Design Questions to Ask

We get requests to move data between custom systems regularly, even within systems. I was advising a client on something fairly simple (a collection of tables out of one vendor system to feed another) and I thought I'd make a blog post out of the array of questions that always come up.

Regardless of the strategy for data movement, whether it be:
  • SQL Server Integration Services (SSIS) locally or in Azure Integration Runtime (IR)
  • Stored procedures
  • SQL replication
  • Secondary readable Availability Groups
  • Azure Data Factory 2.0 (not 1.0, oh goodness, never 1.0)
  • Transactional movement featuring message queues or APIs
  • Any streaming solution
  • ETL or ELT
  • Any other kind of transformation I'm forgetting to mention
The following questions should be asked before designing a data movement plan.

(There are no universally correct answers to these questions, of course, but you must be able to determine the answers from the business case.)

1. What is the latency requirement for the changes from the data source(s) to be copied to the destination?
Common answers: Instantly, no longer than 5 min, or 30 min, or nightly.

2. How many rows are expected to change in the source(s) in a given time period? 
Common answers: anywhere from a few rows per month to all or most of the rows in a table every day.

3. What types of data changes are performed in the source(s)? 
Is the source data inserted, updated, and/or deleted? 

4. Do we have a reliable way to identify "the delta"? 
How do we know which rows have changed, including hard-deleted rows (vs soft-deleted rows)?

Let's dive more into the last question, because this is where the design meets the implementation method. There's a reason we always design tables with an IDENTITY column and some basic auditing fields.
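
For instance, a bare-bones version of that kind of table design might look like this (a sketch; column names are just illustrative):

CREATE TABLE dbo.Orders (
    OrderID      int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerID   int NOT NULL,
    StatusCode   tinyint NOT NULL,
    CreatedDate  datetime2(0) NOT NULL CONSTRAINT DF_Orders_Created DEFAULT SYSUTCDATETIME(),
    ModifiedDate datetime2(0) NOT NULL CONSTRAINT DF_Orders_Modified DEFAULT SYSUTCDATETIME()
);
-- ModifiedDate only helps identify the delta if every writer keeps it current,
-- which is exactly the concern addressed next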

First off, a truncate/insert strategy is not scalable. I have redesigned more of these strategies than I can remember, often because of database developer myopia. A truncate/reinsert strategy, even a bulk insert strategy, will inevitably outgrow the time boundary identified in Question 1. Don't waste your time and resources on such a strategy; you need to identify a way to find out what changed in the data source now.

But what if we don't or can't trust the application to always modify a "ChangeDate"? Such a field is certainly the easiest way to know if a row has changed, but what if the original table wasn't designed with one? We should consider whether we can alter the data source(s) with useful, built-in SQL Server features like Change Tracking (CT), Change Data Capture (CDC), or a more recently introduced feature called Temporal Tables. The latter can provide a reliable, built-in modified date and row history, transparent to applications. All of these strategies are well documented and have easy-to-use labs available.
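
As a taste of the first option, enabling Change Tracking takes only a couple of statements (a minimal sketch; database and table names are hypothetical):

-- Enable Change Tracking at the database level, then per table
ALTER DATABASE SalesDB SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);
ALTER TABLE dbo.Orders ENABLE CHANGE_TRACKING;
-- A consumer then pulls the delta (inserts, updates, and hard deletes) since its last sync
DECLARE @last_sync_version bigint = 0; -- persisted by the consumer between runs
SELECT ct.OrderID, ct.SYS_CHANGE_OPERATION -- I, U, or D
FROM CHANGETABLE(CHANGES dbo.Orders, @last_sync_version) AS ct;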

Each of these solutions is very useful and recommended in its use case, and much preferred over a trigger-based system which will add complexity and overhead to transactions. A "pull" of recent changes is much preferred for most scenarios over a "push" of each change inside the transaction.

Caveats remain, however (and this came up with a recent client): future updates/patches for the databases must account for implementations of CT, CDC, or Temporal Tables. The same caveats apply to replication (useful in spots) and database triggers. Don't enable these SQL features without consulting with and advising the maintaining developers on the potential impact and the need for testing.

One more crucial factor often overlooked as part of Question 4 is intermediate transactions, especially in the case of less-than-instant data movement. If a row changes from status 1, to status 2, to status 3, can we just send over the row state with status 3? Or must we apply an insert for status 1, an update for status 2, and then another update for status 3 to the destination? This could be a major problem if the destination has an indirect dependency on evaluating the status changes; for example, to calculate the durations between statuses.

I once designed a data warehouse for tracking the performance of auditors, and we were analyzing the workflow for the bottlenecks in a 20-step process. Each of the 20 steps and its corresponding row state and timestamp were the whole point of the analysis. This demanded some sort of row-versioning in the data source. Not all change detection strategies work for this, however. Change Tracking, for example, would not suffice, because it captures only the net change for each row, not every intermediate version. Know your solutions!
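
When you do need every intermediate row state, a system-versioned temporal table is one built-in answer (a minimal sketch; names are hypothetical):

CREATE TABLE dbo.AuditSteps (
    AuditID    int NOT NULL CONSTRAINT PK_AuditSteps PRIMARY KEY,
    StepNumber tinyint NOT NULL,
    ValidFrom  datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.AuditSteps_History));
-- Every version of every row, including all intermediate states:
SELECT AuditID, StepNumber, ValidFrom, ValidTo
FROM dbo.AuditSteps FOR SYSTEM_TIME ALL
ORDER BY AuditID, ValidFrom;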

You shouldn't move forward with any data movement design before answering these questions.

Are there any other common questions you'd ask before deciding on a plan for a project like this?

Monday, April 15, 2019

Actual Emails: How do I learn SQL Server despite limited SQL duties at work?


Got this email from a client in the southern US asking how to up their game in SQL Server, frustrated by a lack of hands-on opportunities to administer SQL in their current job duties. I also felt it necessary to discuss whether or not cert exams were appropriate, leaving that up to them, and then to share my preferred training methods.


To: William 

Subject: SQL Certification Questions

I am trying to gain a deeper knowledge to go and sit for the certification exams. What would be your best suggestion to immerse myself in SQL and learn the skills I need to sit for each exam? I have tried to just create tasks to force myself to learn and practice SQL query skills, and that works, but it has its limits. I learned a lot from that, but I learned so much more with some direction and a course structure. I have taken a couple of online SQL training courses and everything seems simple and logical, but I start feeling less confident when I look at sample questions. They seem to go into deeper detail than what I have seen in the training classes and deeper than what I see day to day. What would your suggestion be for the best method of gaining the skills needed to effectively manage my SQL environment?



From: William

Hi! Honored that you reached out, hope this email helps. I actually give a presentation on this topic, based on my experience as a writer for the last three generations of SQL certification exams for Microsoft. The exam writers are instructed to test experience by asking questions in the frame of tasks, testing whether or not the exam taker has "done this job" before. The exam writers are told to test for someone with 3-5 years of experience at a minimum, by testing things you can't learn from reference docs alone (and especially on brand-new features of the latest version of SQL). So that's who I advise to take the tests: people with 3-5+ years of experience.
So my opinion here may be different from others', and especially from some managers', but I don't feel it's appropriate or productive to ask inexperienced resources to pursue exams. What's more likely than a passing score is a person becoming disillusioned, frustrated, or disengaged from career progress; or trying to cheat (with brain dumps); or quitting and/or changing career path.
Given that, gauge for yourself whether you think an exam makes sense for you at this stage. Regardless, as for training resources: again, experience is the best teacher here, but I understand the frustration about not being exposed to much variety, as far as SQL development/administration goes. This blog post of mine has links to many resources; I'd point out specifically the "Stairways" and the MVA. If you don't already have a copy of my book, the Sparkhound office near your area can totally arrange that. Joining your local SQL user group is good free training, as are all the virtual user groups that PASS provides for free. I highly recommend joining PASS, it's free. There's a SQLSaturday conference in Atlanta next weekend, again more free training; later this year there are SQLSaturdays in Memphis, Baton Rouge, and Pensacola; and there are SQLSaturdays that usually happen annually in Birmingham and Columbus, GA/Phenix City, AL (though I don't see either in the upcoming events list on sqlsaturday.com right now…). The wife and I like to make little weekend trips out of those Saturday conferences, and before our kid went off to college, we'd bring them along too for fun.
As for trying to get hands-on experience, I would give you the same advice I often give my team: "lab it out." Use your local workstation or laptop: install SQL Server Developer edition and keep a local instance to play with at all times. Sign up for an Azure account and use your free credits to run an Azure VM with SQL Developer edition, or, if you have an MSDN account through work or otherwise, you get monthly credits for Azure spend. If Azure isn't an option, use a home PC or server; the hardware doesn't have to be production-quality to facilitate learning basic concepts and testing admin actions you shouldn't try on production. I have learned and developed a lot of my toolbox "lab" scripts by just playing and breaking and fixing and dropping and recreating on my SQL sandboxes… all outside of production. So if you don't have an admin sandbox, get one; just setting it up, breaking it, and fixing it will be a learning experience.
I can't speak personally to any of the paid training classes, other than that the SQLskills folks literally wrote big parts of SQL Server and their training is considered top-notch. The PASS Summit conference in Seattle is the biggest/best SQL Server conference, with two days of all-day deep-dive trainings followed by three long days of sessions, if you have that kind of training money.
Let me know if you have any questions, and best of luck!

Thursday, April 04, 2019

Activate Conference 2019: Databases 101 for the Aspiring App Dev Session Info and Resources

A special hello if you're visiting this blog post during or after my workshop on Thursday afternoon, April 4, at the Louisiana Tech Park as part of the Activate Conference 2019!


Here are all the links, downloads, and more you need:

Databases 101 for the Aspiring App Dev
Workshop Student Track
1:00 PM

Presentation Downloads:


Slide deck: Download link  (.pptx files) 

Sample TSQL 101 script: Download link (.sql file)


Links to get setup with MS SQL Server:

This is the link to download SQL for FREE:
The tools to dev in the Microsoft ecosystem are all free:
Download the WideWorldImporters sample database:
Toolbox on Github (.sql files)
  • Look for toolbox.zip for an easy download, or 
  • Click on each file then download, or 
  • Click on each file, click “raw” on each file to copy/paste

Tuesday, April 02, 2019

Activate Conference 2019: Three Days of Tech Learning

Thursday, April 4, I'll be giving a workshop at the super-slick-branded Activate Conference 2019 in Baton Rouge at the Louisiana Technology Park, a conference filled with people and topics much cooler than me and my "Databases 101 for the Aspiring App-Dev" 1.5-hour workshop.

Activate is an event that aims to enrich our local tech community by establishing and furthering the careers of individuals in the web and technology field.

3-Day registration for the whole conference, which includes a day of workshops, a hackathon, and a Saturday conference, is free for students. VIP Admission is also available which includes a t-shirt, swag bag and kickoff dinner.

On Saturday, Sparkhound will be hosting a table with our third annual Lego deconstruction competition, with a prize for the fastest times!

See you there, and best of luck to the organizers: Isral, Quinton, Lynsey, and all their many helpers and volunteers!

Tuesday, March 05, 2019

Backup of an Azure SQL DB downloaded outside of Azure

From a dev colleague: “Asking for a client... do you know how to get a backup of an Azure SQL DB downloaded outside of Azure?”

Short answer - You can’t…

Medium answer - … and you don't really need to for anything related to DR or HA. Backups are handled automagically in Azure, and Azure SQL DB geo-replication can be set up in a few button clicks. Instead of thinking of Azure SQL DB as a traditional SQL database on a server, think of it as a platform, and consider spinning up a stand-alone copy of the production Azure SQL DB in a cheaper tier to serve as pre-production. It'd be a better, truer dev/test/QA environment as a result.
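
For instance, creating that stand-alone copy is a one-liner, run in the master database of the target logical server (a sketch; server and database names are hypothetical):

-- Run in the master database of the Azure SQL logical server
CREATE DATABASE ProdDB_Dev AS COPY OF myserver.ProdDB;
-- Then dial the copy down to a cheaper tier, for example:
ALTER DATABASE ProdDB_Dev MODIFY (SERVICE_OBJECTIVE = 'S0');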

Long answers - Okay, so you really want to get the Azure SQL DB to an on-prem SQL Server instance. Sigh.

  1. You can manually script it out into CREATE and INSERT T-SQL statements (schema and data) with various tools, including a compare with SSDT, or an officially supported, free, cross-platform scripter that is Python-based. Keep in mind this might be a huge file and a lengthy manual process with a large database.
  2. You could also use the "Export" button in the Azure portal for the Azure SQL DB to create a BACPAC in an Azure storage account. The file is locked with a username/password which you specify at the time of export. It will take a few minutes, even for a small database. Once it's done, you can download the .bacpac file from the blob location with Azure Storage Explorer or something similar, and then import the .bacpac. Keep in mind that this might also be a huge file and a lengthy manual process with a large database.
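
For that final import step, the SqlPackage command-line tool is one option for bringing the .bacpac into an on-prem instance (a sketch; the file path and names are hypothetical):

SqlPackage.exe /Action:Import /SourceFile:"C:\temp\ProdDB.bacpac" /TargetServerName:localhost /TargetDatabaseName:ProdDB_Restored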