
Monday, October 21, 2019

SSIS Scaleout: Cannot open certificate store on the machine

When attempting to connect a worker node to the current master node of a SQL Server Integration Services (SSIS) Scale Out, on on-premises SQL Servers, you may receive this error:

Validation Error(s):
Cannot open certificate store on the machine

First off, launch SSMS on the SQL Server instance you intend to be the master. Troubleshooting starts with the two steps documented here:
https://docs.microsoft.com/en-us/sql/integration-services/scale-out/troubleshooting-scale-out#cannot-open-certificate-store

"1. Run Scale Out Manager as administrator. If you open Scale Out Manager with SSMS, you have to run SSMS as administrator.
2. Start the Remote Registry service on the computer if it is not running."
3. Finally, if you are trying to add a named instance, there is one more gotcha: the Scale Out Worker name is the server name, not the SQL instance name. Do not provide the full named-instance name, for example "SQLSERVER-1\INSTANCENAME"; provide only the server name, in this case "SQLSERVER-1".
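
Once the worker connects, you can confirm from the master side that it registered by querying the SSISDB catalog. This is a minimal sketch, assuming SQL Server 2017 or later with Scale Out configured:

-- Run on the Scale Out master instance
USE SSISDB;
GO
-- Lists the worker agents known to the master, including machine name and enabled state
SELECT * FROM [catalog].[worker_agents];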





Monday, April 29, 2019

Four Data Integration Design Questions to Ask

We get requests to move data between custom systems regularly, even within systems. I was advising a client on something fairly simple (a collection of tables out of one vendor system to feed another), and I thought I'd make a blog post out of the array of questions that always come up.

Regardless of the strategy for data movement, whether it be:
  • SQL Server Integration Services (SSIS) locally or in Azure Integration Runtime (IR)
  • Stored procedures
  • SQL replication
  • Secondary readable Availability Groups
  • Azure Data Factory 2.0 (not 1.0, oh goodness, never 1.0)
  • Transactional movement featuring message queues or APIs
  • Any streaming solution
  • ETL or ELT
  • Any other kind of transformation I'm forgetting to mention
The following questions should be asked before designing a data movement plan.

(There are no universally correct answers to these questions, of course, but you must be able to determine the answers from the business case.)

1. What is the latency requirement for the changes from the data source(s) to be copied to the destination?
Common answers: Instantly, no longer than 5 minutes, 30 minutes, or nightly.

2. How many rows are expected to change in the source(s) in a given time period? 
Common answers: Anywhere from a few rows per month to all or most of the rows in a table every day.

3. What types of data changes are performed in the source(s)? 
Is the source data inserted, updated, and/or deleted? 

4. Do we have a reliable way to identify "the delta"? 
How do we know which rows have changed, including hard deleted rows (vs soft deleted rows)?

Let's dive more into the last question, because this is where the design meets the implementation method. There's a reason we always design tables with an IDENTITY column and some basic auditing fields.
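
For illustration, here is a minimal sketch of what I mean, using a hypothetical dbo.SalesOrder table (the names and data types are placeholders):

-- A hypothetical source table with an IDENTITY key and basic auditing fields
CREATE TABLE dbo.SalesOrder (
    SalesOrderID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerID   int NOT NULL,
    OrderTotal   decimal(19,4) NOT NULL,
    CreateDate   datetimeoffset(0) NOT NULL CONSTRAINT DF_SalesOrder_Create DEFAULT SYSDATETIMEOFFSET(),
    ChangeDate   datetimeoffset(0) NOT NULL CONSTRAINT DF_SalesOrder_Change DEFAULT SYSDATETIMEOFFSET()
);
-- The application (or, if you must, a trigger) is responsible for updating ChangeDate on every UPDATE.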

First off, a truncate/insert strategy is not scalable. I have redesigned more of these strategies than I can remember, often because of database developer myopia. A truncate/reinsert strategy, even a bulk insert strategy, will inevitably outgrow the time boundary identified in Question 1. Don't waste your time and resources on such a strategy; you need to identify a way to find out what has changed in the data source.

But what if we don't or can't trust the application to always update a "ChangeDate" column? A modified-date column is certainly the easiest way to know whether a row has changed, but what if the original table wasn't designed with such a field? We should consider whether we can alter the data source(s) with useful, built-in SQL Server features like Change Tracking (CT), Change Data Capture (CDC), or the more recently introduced Temporal Tables. The latter can provide a reliable, built-in modified date and row history, transparent to applications. All of these strategies are well documented and have easy-to-use labs available.
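
For example, enabling Change Tracking on the hypothetical table above is a small one-time setup (the database name and retention values are placeholders); CDC and Temporal Tables have similarly small one-time setups:

-- Enable Change Tracking at the database level, then per table
ALTER DATABASE SourceDB
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.SalesOrder
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF);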

Each of these solutions is useful and recommended in its use case, and all are much preferred over a trigger-based system, which adds complexity and overhead to transactions. For most scenarios, a "pull" of recent changes is preferable to a "push" of each change inside the transaction.
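
To make the "pull" concrete, here is a minimal sketch of pulling the delta with Change Tracking, continuing the hypothetical dbo.SalesOrder table from above (the version bookkeeping is simplified):

-- The ETL process persists the version it last synced to, and passes it in here
DECLARE @last_sync_version bigint = 0;

SELECT ct.SalesOrderID,
       ct.SYS_CHANGE_OPERATION,   -- I, U, or D, so hard deletes are detected
       so.CustomerID,
       so.OrderTotal
FROM CHANGETABLE(CHANGES dbo.SalesOrder, @last_sync_version) AS ct
LEFT JOIN dbo.SalesOrder AS so    -- LEFT JOIN because deleted rows no longer exist in the source
       ON so.SalesOrderID = ct.SalesOrderID;

-- Persist this value as @last_sync_version for the next pull
SELECT CHANGE_TRACKING_CURRENT_VERSION();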

Caveats remain, however (and this came up with a recent client): future updates and patches to the databases must account for implementations of CT, CDC, or Temporal Tables. The same caveats apply to replication (useful in spots) and database triggers. Don't enable these SQL features without consulting with and advising the maintaining developers on the potential impact and the need for testing.

One more crucial factor, often overlooked as part of Question 4, is intermediate transactions, especially in the case of less-than-instant data movement. If a row changes from status 1, to status 2, to status 3, can we just send over the row state with status 3? Or must we apply an insert for status 1, an update for status 2, and then another update for status 3 to the destination? This could be a major problem if the destination has an indirect dependency on evaluating the status changes, for example, to calculate the durations between statuses.

I once designed a data warehouse for tracking the performance of auditors, and we were analyzing the workflow for the bottlenecks in a 20-step process. Each of the 20 steps and its corresponding row state and time stamp were the whole point of the analysis. This demanded some sort of row versioning in the data source. Not all change detection strategies work for this, however. Change Tracking, for example, returns only the net change for each row, so it would not suffice. Know your solutions!
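
Change Data Capture, by contrast, keeps every intermediate change, which is what this kind of workflow analysis needs. A rough sketch, using a hypothetical dbo.AuditStep table with placeholder column names (CDC also requires the SQL Server Agent to run its capture job):

-- One-time setup: enable CDC on the database, then on the table
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'AuditStep',
    @role_name     = NULL;

-- Pull every change, including each intermediate status, within an LSN range
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn(N'dbo_AuditStep');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

SELECT __$start_lsn,
       __$operation,   -- 1 = delete, 2 = insert, 4 = update (after image); 3 appears only with 'all update old'
       AuditStepID,
       StatusCode,
       StatusDate
FROM cdc.fn_cdc_get_all_changes_dbo_AuditStep(@from_lsn, @to_lsn, N'all')
ORDER BY __$start_lsn;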

You shouldn't move forward with any data movement design before answering these questions.

Are there any other common questions you'd ask before deciding on a plan for a project like this?

Thursday, June 11, 2015

Error Upon First Processing Your SSAS Database With A Service Account

This blog post is to document the easy fix to a problem that has become a rite of passage for the good folks (including myself) on the data warehouse development team I've been working with recently.

Here's the error text you've probably received the first time you try to process your Analysis Services database using a job or, in this case, an SSIS package calling an Analysis Services Execute DDL Task:
Executing ExecutePackageTask: D:\whatever.dtsx
Error: 0xC1060000 at Analysis Services Processing Task, Analysis Services Execute DDL Task: OLE DB or ODBC error: Login failed for user 'NT SERVICE\MSOLAP$SQL2K14'.; 28000; Cannot open database "DataWhatever" requested by the login. The login failed.; 42000.
Error: 0xC1120064 at Analysis Services Processing Task, Analysis Services Execute DDL Task: A connection could not be made to the data source with the DataSourceID of '4c67b909-250d-4b14-95cc-451d1c95bc9f', Name of 'SqlServer .sql2k14bi DataWhatever'.
Error: 0xC11F0056 at Analysis Services Processing Task, Analysis Services Execute DDL Task: An error occurred while processing the partition 'DimDate_168cd22b-ae72-4730-a054-4fd0fa6acc2f' in table 'DimDate_168cd22b-ae72-4730-a054-4fd0fa6acc2f'.
Error: 0xC11C0006 at Analysis Services Processing Task, Analysis Services Execute DDL Task: The current operation was cancelled because another operation in the transaction failed.
Task failed: Analysis Services Processing Task
Warning: 0x80019002 at Main: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.  The Execution method succeeded, but the number of errors raised (4) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
The solution is quite simple.

The SSAS data cube might have been developed and processed just fine with your personal user account, but now that you've executed the processing with a service account for the first time, you need to grant that account permissions. The SSAS service account does not have rights to access your data source, in this case, your SQL Server 2014 instance.

Grant the SSAS service account read-only permissions on the source database, for example by creating a login and adding it to the db_datareader role of the database, or by granting SELECT permissions on the tables or schema needed to process the SSAS cube.
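
As a minimal sketch, using the service account and database names from the error text above (substitute your own):

-- Run on the source SQL Server instance
USE [master];
CREATE LOGIN [NT SERVICE\MSOLAP$SQL2K14] FROM WINDOWS;
GO
USE [DataWhatever];
-- Map the login to a database user and grant read-only access
CREATE USER [NT SERVICE\MSOLAP$SQL2K14] FOR LOGIN [NT SERVICE\MSOLAP$SQL2K14];
ALTER ROLE [db_datareader] ADD MEMBER [NT SERVICE\MSOLAP$SQL2K14];
GO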

Friday, November 01, 2013

PowerShell: Delete Unneeded Rows/Columns In Excel

I had a project to automate the import of a large number of Excel files via SSIS into SQL Server 2012. Many of the files had different headers, but they all had a column header row that preceded the data.

On some of the files, the header row started on row 2, 3, 4, or 5, and on some others, the first two or three columns were entirely blank. What a pain, right?

One saving advantage that I identified in this proprietary set of Excel files: each of the datasets begins with a column headed "Identifier", so that is the value I want to end up in cell [A1]. That was definitely key to making this work and to easily identifying where my header row began.

I automated cleaning up these spreadsheets with the following PowerShell script:
Param([string]$wbname)

Try
{
    $err1 = 0;
    #testing only #$wbname = "E:\foo.xls"

    # Open the workbook via the Excel COM object, suppressing any prompts
    $xl = New-Object -comobject Excel.Application
    $xl.DisplayAlerts = $False

    $wb = $xl.Workbooks.open($wbname)
    $sheet = $wb.Sheets.Item(1)

    # Delete leading blank columns: while the first three rows of column A are all empty, delete column A
    while( ($sheet.Cells.Item(1,1).Text -eq "") -and ($sheet.Cells.Item(2,1).Text -eq "") -and ($sheet.Cells.Item(3,1).Text -eq ""))
    {
        [void]$sheet.Cells.Item(1,1).EntireColumn.Delete()
    }

    # Delete leading junk rows: while neither A1 nor B1 contains the "Identifier" header, delete row 1
    while( ($sheet.Cells.Item(1,1).Value() -NotLike "Identifier") -and ($sheet.Cells.Item(1,2).Value() -NotLike "Identifier"))
    {
        [void]$sheet.Cells.Item(1,1).EntireRow.Delete()
    }

    # Cleanup: save and close the workbook, quit Excel, and release the COM object
    $wb.Close($true)
    $xl.Quit()
    [System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)
    Remove-Variable xl
}
Catch
{
    $err1 = 1;
}
Finally
{
    if($err1 -eq 1)
    {
        #Write-Output "99";
        #Write-Output $_.Exception.Message;

        # Log the error location and the offending file name, then exit with a failure code for SSIS
        $stream = [System.IO.StreamWriter] "e:\error.txt"
        $stream.WriteLine($Error[0].InvocationInfo.PositionMessage)
        $stream.WriteLine($wbname)
        $stream.close()

        [Environment]::Exit(99);
    }
    Else
    {
        #Write-Output "0";
        [Environment]::Exit(0);
    }
}
The two while loops are where the customization lives for my particular set of problematic Excel spreadsheets; the general structure of the code is what you can adapt for your own files.

The last four lines of the Try block are cleanup. While the last (Remove-Variable) might not be very important, the second-to-last line (ReleaseComObject) is pretty important to make sure the stuck COM object goes away. Explained much better here: http://technet.microsoft.com/en-us/library/ff730962.aspx

In order to put this in sequence in my SSIS package, I used the Execute Process Task. Screenshots below:

Unfortunately, the client server I was working on here only had PowerShell 1.0 installed, but from what I understand, this should also apply to PowerShell 2.0 and 3.0. I'm open to anyone who has any insight there.

[Screenshots: Execute Process Task configuration in the SSIS package]
The argument passed to the PowerShell file includes the full path to the Excel file. (I removed the paths from this actual client implementation.)

I'm open to feedback on this; I wonder if anyone else has encountered a similar issue. Obviously, the best solution would have been to modify the process that generates these maddeningly inconsistent Excel files.

Wednesday, January 26, 2011

"The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020."

I know this doesn't make a lot of sense so far, but from what I can tell, the fast load mode of the OLE DB Destination can fail if an unconnected ADO NET destination in the same Data Flow also fails.

Got this error in SSIS with an OLE DB Destination:
"The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020."
This error only popped up when an unrelated transformation in the same data flow had an error.  I stopped receiving this error when I changed from "Table or View - fast load" to "Table or View".

That's the short of it.  Here's more:

This is a puzzling error, apparently caused by an error in a transformation with an ADO NET Destination that runs concurrently and in parallel - that is, unconnected - to my OLE DB Destination.

I don't really have the time to figure out all the different permutations that allow this error to keep coming up, but it's safe to say it's not really an error on its own... it seems to be caused by other errors. The important thing to note is that an error in one transformation was able to cause an unrelated transformation to fail.  This can be very misleading when trying to debug a screen full of red blocks.

The error that came up was a run-of-the-mill, aggravating data type error caused by SSIS being picky:
[SLRP ADO [4902]] Error: An exception has occurred during data insertion, the message returned from the provider is: The given value of type String from the data source cannot be converted to type nvarchar of the specified target column.
Will keep an eye out for more stuff like this in SSIS, open to feedback.

Friday, August 06, 2010

Fun with SQL 2005 Service Pack 3

At a remote client site last night, I ran into a series of problems with SQL 2005 RTM and updating it to SP3.

Error Number : 1612
Error Description : Unable to install Windows Installer MSP file

Errors like that appeared for the SQL, AS, NS, RS, DTS, and Tools components. Furthermore, the SP3 install was asking me for the original install media locations of the .msi files before failing each component during the Service Pack installation.

I tried a few things unsuccessfully: setting explicit admin permissions on numerous folders (which has worked in the past), using a different copy of the SQL 2005 CD1 media, trying SP2 instead (same problems), uninstalling the antivirus software, and googling like a madman. (Er... I mean Binging.)

Eventually, my mad searching led me to this MSDN Blog Entry, which initially seemed very daunting. And I was correct: it was very time-consuming. But it worked.

(Note that if you've found this blog entry because you're experiencing some of the errors within, the above Blog Entry link Parts 1 and 2 are what you are looking for.)

This error in the service pack log files was the most telling:

Warning: Local cached package 'C:\WINDOWS\Installer\a1b67256.msi' is missing.

After 45 minutes of searching through logs for eight-digit random strings and copying files from the unpacked SP3 .exe and the original media into C:\Windows\Installer, the SQL, AS, and RS components were all updated by the service pack correctly.

But, the NS, DTS and Tools components gave me an entirely different error.

Error 2902
Operation ixfAssemblyCopy called out of sequence.

I re-copied the .msi and .msp files, thinking that perhaps I had done something wrong. That didn't work. I reinstalled, noticing that NS and DTS had been installed as available on first run. I fully installed all features this time around for NS, DTS, and Tools, but again, SP3 reported the above ixfAssemblyCopy error.

To get SP3 to take Tools, I had to uninstall and reinstall that component using CD2:\Setup\SqlRun_Tools.msi. That finally allowed Service Pack 3 to update successfully.

What a night!

Some talking points:

  • Interesting how I was told five years ago that when Microsoft made SSIS, they didn't re-use a single line of code from DTS. That is hard to believe, since the installers for SSIS are still called SqlRun_DTS.msi.

  • There was some interesting stuff in the logs that threw me off. Stuff like

    MSI (s) (74:74) [20:12:05:278]: Note: 1: 2203 2: i:\35a1934bc4ebd525729353c57bb8\HotFixSQL\Files\sqlrun_sql.msp 3: -2147287037
    MSI (s) (74:74) [20:12:05:278]: SOURCEMGMT: Source is invalid due to missing/inaccessible package.

    That's really odd, since there actually is an I:\ drive mapped on this server, and it doesn't contain any folders like that. I assume that long-string-named folder is temporary and that its use of I: would be virtual. 

    Could the Service Pack actually have been thrown off by the fact that an I: drive existed? I couldn't unmap it (it wasn't my server), but I can't help thinking that was fishy. Has anyone else encountered service pack issues when an I: drive exists?

  • I still don't know why SP3 couldn't copy the .msi and .msp files on its own. It seems simple enough; what's the explanation? It couldn't be permissions, as I gave the installing user explicit admin rights to every folder I could. Talk about aggravating.

Friday, July 31, 2009

Introduction to SSIS presentation from SQL Saturday!

Download my presentation on Introduction to SSIS here

I most recently gave this presentation at SQL Saturday in Baton Rouge on August 1.

Tuesday, June 02, 2009

'Microsoft.AnalysisServices.DsvTableBinding' to type 'Microsoft.AnalysisServices.QueryBinding'

One of the things this blog intends to do is help speed up the troubleshooting process by documenting strange errors like this.

Here's an interesting error message that came up today which has surprisingly little web coverage.

"Unable to cast object of type 'Microsoft.AnalysisServices.DsvTableBinding' to type 'Microsoft.AnalysisServices.QueryBinding'.

It is caused by trying to apply a query string to an existing partition that is not query-bound in AMO - in this case, from an SSIS 2005 VB.NET app that builds partitions (by the hundreds, for me) into a cube. Like many things, it's one of those spend-a-while-then-slap-your-forehead problems that makes a ton of sense in hindsight. Obviously, I can't set the partition query of a partition that is table-bound.

I do plan to post my SSIS app once I am done with it. But until then, here's the blog post that inspired it.

"No question is so difficult to answer as that to which the answer is obvious." -George Bernard Shaw

Monday, June 01, 2009

handy feedback for SSIS 2005 Script Components

As you know, getting feedback from SSIS is spotty at times. Debugging is also very limited in SSIS Script Components (where's the Immediate window!?). This came in handy today:
Dts.Events.FireProgress("informational string", 0, 0, 0, Nothing, False)
There are lots of different ways to get info out of the Progress window, and lots of feedback possibilities in the Dts.Events collection.

And don't forget that in SSIS 2008, you can write C# as well as VB.net.

"Feedback is the breakfast of champions." -Ken Blanchard