Showing posts with label Business Intelligence. Show all posts
Showing posts with label Business Intelligence. Show all posts

Saturday, September 27, 2008

Data extrapolation on the fly

In some cases we need to generate the data based on a starting date, to an end data (today-getdate()).

This process is called data extrapolation based on limited information (starting date). Here I am trying to achieve the same without storing data in temp./staging tables.

First I am creating a temp. table wih 2 rows and now I need to generate rest of the rows starting from given date to today's date.

Next step is to make use of CTE to generate sequence of rows and finally making use of CROSS JOIN to join the sequence with real data and get the desired output.


CREATE TABLE #TEMP (
MYNAME VARCHAR(10),
STARTDATE DATETIME)
GO

INSERT INTO #TEMP VALUES ('Mohit', '2008-09-15')
INSERT INTO #TEMP VALUES ('Nayyar', '2008-09-22')
GO

WITH FIRST2(SEQ)
AS (SELECT 1
UNION ALL
SELECT 1),

FIRST4(SEQ)
AS (SELECT 1
FROM FIRST2 X
CROSS JOIN FIRST2 Y),

FIRST16(SEQ)
AS (SELECT 1
FROM FIRST4 X
CROSS JOIN FIRST4 Y),

FIRST256(SEQ)
AS (SELECT 1
FROM FIRST16 X
CROSS JOIN FIRST16 Y),

FIRST65536(SEQ)
AS (SELECT 1
FROM FIRST256 X
CROSS JOIN FIRST256 Y),

SEQROWS
AS (SELECT ROW_NUMBER()
OVER(ORDER BY SEQ) AS SEQNUMBER
FROM FIRST65536 WITH (NOEXPAND))

SELECT MYNAME,
STARTDATE,
DATEADD(DAY,N.SEQNUMBER - 1,STARTDATE) AS EXTRAPOLATION
FROM #TEMP
CROSS JOIN SEQROWS N
WHERE N.SEQNUMBER <= DATEDIFF(DAY,STARTDATE,GETDATE())
ORDER BY MYNAME,
STARTDATE,
3
GO

DROP TABLE #TEMP
GO


Thanks
Mohit Nayyar

Thursday, May 29, 2008

SSIS approach to handle Inferred Members

While loading data in Fact tables we usually see a scenario where the fact data is available but there is no corresponding business key in the related dimension.

In this case we choose multiple options to resolve the issue.
  1. Ignore that fact

  2. Insert the associated business key in dimension table and return the newly generated surrogate key from dimension table. And now store the data in Fact table with the surrogate key.

The second approach relates to a term called “Inferred members”. All the other attributes of that dimension will also get updated in next run of dimension load (usually nightly load).

In SSIS there are multiple options available to implement the second case.

First approach is to do lookup on the dimension table and for all the rows that are now matching, insert the business key in Dimension table and then do the lookup again to get the surrogate key.

Second approach is to make use of Lookup and Script component. Lookup component will ignore rows with no matching business key in dimension table. Then script component will process only those rows where it didn’t find the associated surrogate key and finally insert the same in dimension table and return the associated surrogate key through stored procedure output parameter.
This script component approach is more efficient because its using the existing lookup component only once and then doing all the processing in script component.

But the additional benefit comes if we make use of .Net Generic.SortedDictionary class to store the cache information regarding the newly generated key. Read more about this here…
http://msdn.microsoft.com/en-us/library/f7fta44c(VS.80).aspx


- Mohit

Monday, August 27, 2007

SSIS - deadlock was detected while trying to lock variables

Recently I faced one MAJOR issue in my ETL packages because of new patches deployed on Microsoft Windows 2003 Server.

"A deadlock was detected while trying to lock variables "variable names (comma separated)" for read/write access. A lock cannot be acquired after 16 attempts. The locks timed out."

OR

"The script threw an exception: A deadlock was detected while trying to lock variables "variable names (comma separated)" for read access and variables "variable names (comma separated)" for read/write access. A lock cannot be acquired after 16 attempts. The locks timed out."


It's more to do with Script component/Task used in the package and making use of "ReadOnlyVariables" and "ReadWriteVariables" properties for declaring variables that will be used in the script.

Technically if we declare variables in these two properties then there is NO need to lock these variables in the script, and this is exactly what we were doing in the past prior to these patches....

Windows 2003 Post-SP2 Hotfix - MS07-31/935840 W2K3 Server
Windows 2003 Post-SP2 Hotfix - MS07-34/929123 W2K3 Server
Windows 2003 Post-SP2 Hotfix - MS07-039/926122 W2K3 Server
Windows 2003 Post-SP2 Hotfix - KB924054 W2K3 Server
2.0 IE Update W2K3 Server

Now I don't know which one of these patches is causing the real problem but yes it's one of them for sure.

We can see these error in the SSIS logging files...

Now the quick fix to these errors are NOT using "ReadOnlyVariables" and "ReadWriteVariables" properties for declaring variables that will be used in the script. Instead a better way is to declare the same in script itself and then locking and unlocking the same using DTS object model.


Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

Public Sub Main()

Dim myVar As Variables

Dts.VariableDispenser.LockForWrite("User::Var1")
Dts.VariableDispenser.LockForWrite("User::Var2")

Dts.VariableDispenser.GetVariables(myVar)

myVar("User::Var1").Value = "SomeValue"
myVar("User::Var1").Value = "SomeValue"

myVar.Unlock()

Dts.TaskResult = Dts.Results.Success

End Sub

End Class



Let me know if you know the exact source of this problem.

- Mohit Nayyar

Tuesday, July 3, 2007

BI and Reporting enhancements in Microsoft SQL Server 2008 "Katmai"

Change Data Capture
Change Data Capture (CDC) is a really interesting feature in Katmai using which we can focus on specific business requirement to capture changed data. That means this component captures all changes asynchronously and then exposes the changes through T-SQL based on specific requirements.

The provision to capture schema-changes is also very useful in this component.



Merge SQL Statement
Truly fantastic and really useful........

So how many times you have used "IF EXISTS" to make sure that you insert new row otherwise update the existing row.....yes I am talking about UPSERT...

This MERGE statement takes care of this scenario and check for existing rows to make updates and inserts new rows in the table.

MERGE INTO FactTable F
USING TransactionTable T ON T.OrderID = F.OrderID
WHEN MATCHED THEN UPDATE
SET
F.Quantity = T.Quantity
WHEN NOT MATCHED THEN INSERT (OrderID, Quantity)
VALUES (T.OrderID, T.Quantity)




C# support for SSIS scripting components



Doesn't require IIS for SSRS
Yeah, thats the good part, we don't need IIS anymore to load SSRS reports with SQL Server 2008. This is the best for a developer like me.......



Grouping Sets
So how many times you have asked by business users to give specific grouping totals, people in sales only ask for product level totals, management looks for employee-product totals for bonus, finance needs total on 10 different things........

Well now with this new feature we can define multiple grouping option within the same query....that means instead of writing multiple group by queries and then combining the result with UNION ALL, we can have just one query with multiple grouping sets



Tablix – This release of SQL Server introduces a new report component called tablix, technically it’s a combination of existing table and matrix in SSRS. We just need to change few basic properties and this will do the job, so no more patchwork required to achieve stacked columns, pivot and more than 1 subtotals.



RDL Object Model – Good news for developers who wants to write/extend reports programmatically.

Dundas RS controls – As we know that Microsoft recently acquired Dundas chart, so this release of Reporting Services will contain the full power of dundas, an absolutely must have in reporting chart world.



Soft Artisans Office Writer technology – Again, This acquisition will allow users to embed their reports in Microsoft Word and Excel documents. Talk to you manager, he will love this.



Export to Microsoft Word




SSIS Lookup "TxLookup" in SQL Server 2008
Microsoft is really working on this component, and now we can lookup into any data source, including other SSIS packages, great news. Along with this TxLookup also works on "pre-charge" query, currently we have cache-miss queries in SQL Server 2005 Integration Services.



SharePoint user interface to manage SSRS reports



Office based report authoring
Now we can used report authoring tool, which allow us to author our reports using excel or word based tools. That means we can put tables, formulas, and also apply formatting too.




FYI - Microsoft entry into MDM world
Microsoft have also purchased Stratature recently, which provides Master Data Management solutions. MDM solution is really useful and must have for organizations having multiple businesses. Read more on MDM at MSDN The What, Why, and How of Master Data Management