SSIS – table variables in the Data flow source
Let’s say you were doing a fact table load and wanted to do a dimension lookup.
-Hah, easy you say, in SQL Server 2008 we have cached lookups, so I can join Staging.BusinessKey to Dimension.BusinessKey to get Dimension.Dimensionkey (surrogate key).
What happens when you want to apply something like a banding range, often used for Aged Debt (‘Not Due’, ‘0-90days’, ’90-120 days’…etc). In this situation you would need a “Staging.Value BETWEEN Dimension.BandMin and Dimension.BandMax” type join.
-Hah, you say, the Lookup task allows for advanced joins so I can do this.
Great, untill you have to load 50million rows and you realise the ETL is going to take longer than building Sagrada Familia.
So one fairly reasonable approach may be to write some t-sql using a table variable looking something like:
You could even create the VALUES section of the INSERT statement dynamically by querying the Dimension table and constructing the Sourcesql as an expression.
All very well and good until you run it in SSIS and it returns 0 rows. Strange as all the rows came back in SSMS.
The reason for this is the query is returning two data streams and SSIS can only address the first one.
This problem can be resolved by issuing a SET NOCOUNT ON at the beginning of the query which “Stops the message that shows the count of the number of rows affected by a Transact-SQL statement or stored procedure from being returned as part of the result set.”