Saturday, September 27, 2008

Data extrapolation on the fly

In some cases we need to generate the data based on a starting date, to an end data (today-getdate()).

This process is called data extrapolation based on limited information (starting date). Here I am trying to achieve the same without storing data in temp./staging tables.

First I am creating a temp. table wih 2 rows and now I need to generate rest of the rows starting from given date to today's date.

Next step is to make use of CTE to generate sequence of rows and finally making use of CROSS JOIN to join the sequence with real data and get the desired output.


CREATE TABLE #TEMP (
MYNAME VARCHAR(10),
STARTDATE DATETIME)
GO

INSERT INTO #TEMP VALUES ('Mohit', '2008-09-15')
INSERT INTO #TEMP VALUES ('Nayyar', '2008-09-22')
GO

WITH FIRST2(SEQ)
AS (SELECT 1
UNION ALL
SELECT 1),

FIRST4(SEQ)
AS (SELECT 1
FROM FIRST2 X
CROSS JOIN FIRST2 Y),

FIRST16(SEQ)
AS (SELECT 1
FROM FIRST4 X
CROSS JOIN FIRST4 Y),

FIRST256(SEQ)
AS (SELECT 1
FROM FIRST16 X
CROSS JOIN FIRST16 Y),

FIRST65536(SEQ)
AS (SELECT 1
FROM FIRST256 X
CROSS JOIN FIRST256 Y),

SEQROWS
AS (SELECT ROW_NUMBER()
OVER(ORDER BY SEQ) AS SEQNUMBER
FROM FIRST65536 WITH (NOEXPAND))

SELECT MYNAME,
STARTDATE,
DATEADD(DAY,N.SEQNUMBER - 1,STARTDATE) AS EXTRAPOLATION
FROM #TEMP
CROSS JOIN SEQROWS N
WHERE N.SEQNUMBER <= DATEDIFF(DAY,STARTDATE,GETDATE())
ORDER BY MYNAME,
STARTDATE,
3
GO

DROP TABLE #TEMP
GO


Thanks
Mohit Nayyar

Thursday, July 17, 2008

Things You Didn't Know About Temp Tables and Table Variables

I recently found really useful information on temp. table and table variable usage. I am sure this will be useful to SQL Community as well.

http://www.sqlservercentral.com/articles/63472/


- Mohit

Thursday, May 29, 2008

SSIS approach to handle Inferred Members

While loading data in Fact tables we usually see a scenario where the fact data is available but there is no corresponding business key in the related dimension.

In this case we choose multiple options to resolve the issue.
  1. Ignore that fact

  2. Insert the associated business key in dimension table and return the newly generated surrogate key from dimension table. And now store the data in Fact table with the surrogate key.

The second approach relates to a term called “Inferred members”. All the other attributes of that dimension will also get updated in next run of dimension load (usually nightly load).

In SSIS there are multiple options available to implement the second case.

First approach is to do lookup on the dimension table and for all the rows that are now matching, insert the business key in Dimension table and then do the lookup again to get the surrogate key.

Second approach is to make use of Lookup and Script component. Lookup component will ignore rows with no matching business key in dimension table. Then script component will process only those rows where it didn’t find the associated surrogate key and finally insert the same in dimension table and return the associated surrogate key through stored procedure output parameter.
This script component approach is more efficient because its using the existing lookup component only once and then doing all the processing in script component.

But the additional benefit comes if we make use of .Net Generic.SortedDictionary class to store the cache information regarding the newly generated key. Read more about this here…
http://msdn.microsoft.com/en-us/library/f7fta44c(VS.80).aspx


- Mohit