Friday, June 15, 2007

Agile software development - eScrum by Microsoft

Here we have more great news: Microsoft has just released a new tool called eScrum to keep track of all Scrum-related activities in agile software development.

eScrum is a web-based project management tool for Scrum, built on Microsoft Visual Studio Team Foundation Server.

PS: Until now, Scrum fans have used an add-in from Conchango (a UK-based consultancy) to manage Scrum development, which works really well with Microsoft Team System.

Get it for FREE here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=55A4BDE6-10A7-4C41-9938-F388C1ED15E9&displaylang=en

Thursday, June 14, 2007

T-SQL, Merge Join or Lookup to load Dimension data

In a recent discussion with a friend, we talked about loading data into a dimension table. Being SQL guys, we have at least two options available in Microsoft SQL Server 2005.

So let me first explain the objective behind this.

We need to load only new data into the target dimension table. So we need to check whether each incoming row already exists at the destination and, if it does not, insert it, matching on the business keys (e.g. CustomerCode, ProductCode).

T-SQL Solution

LEFT / RIGHT JOIN
Join the source table to the target table on the business keys and use a LEFT or RIGHT join to keep only the new rows.

-- New rows: source rows with no matching business key in the target
SELECT s.CustomerCode, s.CustomerName
FROM SourceTable s
LEFT JOIN TargetTable t ON (s.CustomerCode = t.CustomerCode)
WHERE t.CustomerCode IS NULL

The above solution works fine if the source and destination databases are on the same server; if they are on different servers, you can use a linked server to achieve the same thing.
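
For example, if the source sits on another server registered as a linked server, the same anti-join can be written with four-part names. This is only a sketch; the server, database and schema names below are purely illustrative:

-- SourceServer is a linked server; adjust the names to your environment
SELECT s.CustomerCode, s.CustomerName
FROM [SourceServer].[SourceDB].[dbo].[SourceTable] AS s
LEFT JOIN [dbo].[TargetTable] AS t ON (s.CustomerCode = t.CustomerCode)
WHERE t.CustomerCode IS NULL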

SSIS Solution

Now, the standard way to do a LEFT / RIGHT join in SSIS is the "Merge Join" transformation. It does precisely the same thing, and it works well even if the source database is in Oracle and the target database is in Microsoft SQL Server.

But as we know, Merge Join expects sorted inputs, so whether you sort the data at the source system or use the Sort transformation in SSIS, it can prove to be a costly affair.
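
One cheaper option, when the business key is indexed, is to let the database engine do the sorting in the source query and then tell SSIS the output is already sorted (set IsSorted = True and SortKeyPosition on the source output in the advanced editor). A minimal sketch of such a source query, reusing the sample tables from above:

-- Sorted at the source so no Sort transformation is needed in the package
SELECT CustomerCode, CustomerName
FROM SourceTable
ORDER BY CustomerCode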

Hmm... then what else can I do? Here I present the Lookup transformation in SSIS.

Well, generally this component is used while loading fact tables, when you want to look up the surrogate keys from a dimension table and load them into the fact table.

But we can also "Configure Error Output" for this component. What does that mean? Let me explain.

Technically, we map a common column between the two tables in this component and fetch other columns based on that matching value. So what happens if the data is not available in both of these tables? Let me explain.

Let's say we have 10 rows in the source and only 5 of them exist in the target. By default this component will raise an error because it is not able to find the other 5 rows in the target table. Hmm... trust me, this is really good for us.

Now I can "Configure Error Output" for this component, which will redirect the failing rows, and these redirected rows are exactly the ones missing from the target. So we have achieved the goal of loading only new rows into the target table.
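
Logically, the redirected rows are the same anti-join we wrote in T-SQL at the start. If you ever want to cross-check the package output, a NOT EXISTS version of the same test (using the sample tables from above) looks like this:

-- Source rows the Lookup would redirect, i.e. rows not yet in the target
SELECT s.CustomerCode, s.CustomerName
FROM SourceTable s
WHERE NOT EXISTS (SELECT 1 FROM TargetTable t WHERE t.CustomerCode = s.CustomerCode)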

I hope this is helpful for you as well.

Wednesday, June 13, 2007

LINQ & LINQ to Entities in SQL Server 2008 "Katmai"

I am very excited about this new way to access data, and I recently learned a bit about LINQ and what SQL Server 2008 "Katmai" has in store with respect to the LINQ to Entities model.

Let's have a look at Language Integrated Query (LINQ).

Today data can live in different data domains: an XML document, a database, a text file, an Excel spreadsheet. Various domain-specific access models are used to reach these sources; for example, XQuery is used to fetch XML data, SQL is used to query an RDBMS, and custom code is written to parse text files.

Now LINQ tries to resolve this fragmentation by offering a uniform way to access and manage data. We can also see LINQ as a methodology that simplifies and unifies the implementation of any kind of data access.

LINQ consists of standard query operators that work with data irrespective of the data source. So in a way we are on the path to using the same code against any data source, be it Microsoft SQL Server, Oracle, Access or Sybase. This makes a developer's life much easier; there is no need to spend hours learning a different access method for every project.

So, precisely, Microsoft is extending the power of the .NET languages, specifically C# and Visual Basic .NET, with this functionality, named LINQ. The implementation reuses the developer's existing knowledge of object-oriented programming to work with databases, XML, in-memory objects, and a few other data domains.

“Microsoft's original motivation behind LINQ was to address the impedance mismatch between programming languages and databases.”

LINQ to SQL (previously known as DLINQ) translates LINQ expressions into SQL Server queries. People have already started creating LINQ providers to query WMI, LDAP and SharePoint data sources.

LINQ to Entities

LINQ to Entities is supported by the latest release of the ADO.NET Entity Framework. The primary benefit of this framework is that you can run LINQ queries in C# against a conceptual data model instead of against tables, so LINQ to Entities can be seen as a superset of DLINQ. The Entity Framework adds a layer over the database schema and exposes it as entities (a conceptual model) that can be queried from a .NET language.

That means writing a query against an entity is far easier and more manageable than writing complex SQL queries.

var caEmployees = from employee in Employees where employee.State == "CA" select employee;

The statement above is really easy to understand, and I am not really bothered about the database schema. All I know is that there are "Employees" (here I am assuming the Employee entity exposes a State property) and I want to fetch the ones based in California.
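
Under the covers the provider still sends T-SQL to SQL Server. For the query above, the generated statement would look roughly like the sketch below; the table and column names are hypothetical and depend entirely on how the conceptual model is mapped to the store, and the real generated SQL is aliased and parameterised:

SELECT e.EmployeeID, e.EmployeeName, e.State
FROM dbo.Employee AS e
WHERE e.State = 'CA'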

I guess this makes a lot of sense, and I really appreciate this work by Microsoft.

Tuesday, June 5, 2007

MSF or SCRUM in Agile Software Development

Recently I had a long discussion about using MSF or Scrum for agile software development, and I found lots of different opinions on this in the community.

So first of all I would like to explain what agile software development is all about.

Manifesto for Agile Software Development

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

So what's the difference in achieving these using Scrum or MSF? Let's have a look.

SCRUM
Scrum provides general guidelines for software development where the key objective is to deliver a set of prioritized, shippable features within a fixed time box (typically a 30-day sprint), with a primary focus on team interaction and on adapting to fast-changing business requirements. So Scrum is focused on

1. Team interaction
2. Prioritizing the customer requirement
3. Generating working software in monthly cycles (sprint)


Microsoft Solutions Framework (MSF)
MSF was introduced in 1994 as a combination of best practices and guidelines used within Microsoft product teams, Microsoft's consulting practice and various other internal Microsoft business divisions. At the core of MSF are nine foundational principles:

1. Foster open communications
2. Work toward a shared vision
3. Empower team members
4. Establish clear accountability and shared responsibility
5. Deliver incremental value
6. Stay agile, expect and adapt to change
7. Invest in quality
8. Learn from all experiences
9. Partner with customers


We can find similarities between these MSF principles and the Scrum objectives, so what's the difference? Well, from my discussions, the techies who follow Scrum religiously point out these issues with MSF:

1. MSF defines work for the team, so it’s missing the self-organizing factor.
2. Incremental units produced using MSF are non-shippable.
3. It's more process-oriented, which in turn slows down the development process.
4. It doesn't support cross-functional behavior within the team.

On the other hand, people familiar with MSF advocate for it with these points:

1. It has a well-defined process
2. It works well for big teams and large projects
3. The onsite-offshore model is better suited to MSF
4. With MSF, people not only share responsibility but also a common vision, achievable through a defined process

So finally, which one is better? Well, being a consultant, I will answer "it depends". Scrum is fast and can give results really early, but I have seen a few quality issues, and it doesn't seem to work as well for large teams or large projects. Sometimes producing independently shippable features is also hard, so a lot depends on the project.

MSF, on the other hand, works best in an offshore delivery model, where a large team shares a common vision and achieves the end result in a well-defined manner while still developing in an agile way.

Monday, June 4, 2007

Your Data, Any Place, Any Time - SQL Server 2008 "Katmai"

I guess now is the time to start blogging about the CTP (Community Technology Preview) release of SQL Server 2008, code name "Katmai". So let's see what we get in this latest release of SQL Server from Microsoft.

Microsoft is calling its new release "Your data, any place, any time". Hmm... sounds good to me.

The major feature I see in this release is the capability to store data in the file system alongside the existing relational and XML data. It provides the facility to store large binary data in the file system while it remains part of the database, with transactional consistency. So it looks like an effective storage model that keeps all the database benefits.
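
I have not played with the CTP bits yet, but the idea is that a large binary column is marked with a FILESTREAM-style attribute so the bytes live in the file system while staying under the database's transactional control. Here is a rough sketch of what such a table could look like; the table and column names are invented, the syntax may well differ in the final release, and the database needs a filegroup set aside for this kind of storage:

-- Sketch only: a documents table whose binary payload is kept in the file system
CREATE TABLE dbo.ProductDocument
(
    DocumentId  UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
    FileName    NVARCHAR(260)    NOT NULL,
    Content     VARBINARY(MAX)   FILESTREAM NULL  -- stored in the file system, not in the data file
)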

Another interesting feature is working with entities instead of tables. The new ADO.NET Entity Framework enables developers to work with business entities like "employee" or "student" instead of working directly against tables.

LINQ (Language Integrated Query), another new feature, is very useful because it lets you query SQL Server from a .NET programming language such as C# or Visual Basic instead of writing SQL strings against the database.

Encrypting the whole database is another enhancement in this release. That means instead of encrypting a table or a column, we can encrypt the entire database, its log and its data files without much change to the application logic.
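
I have only read about this so far, but whole-database encryption appears to be driven by a database encryption key protected by a server certificate. A minimal sketch of what enabling it could look like; the certificate and database names are made up, and the exact syntax may change before release:

-- Sketch only: encrypt an entire database with a key protected by a server certificate
USE SalesDB;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeServerCert;

ALTER DATABASE SalesDB SET ENCRYPTION ON;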

Along the same lines, encryption with third-party keys is also beneficial for regulatory compliance.

Automatic recovery of corrupted data pages from the mirror server is another important enhancement in this release, and the best part of this feature is that it is transparent to the end user.

Resource Governor, as the name suggests, appears for the first time in SQL Server 2008. It is very useful for defining priorities and resource limits per workload, giving consistent and predictable performance.
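
Again, this is based on what I have read rather than tried: the shape of it is a resource pool that carries the CPU and memory limits, a workload group bound to that pool, and a classifier function that routes incoming sessions to a group. A rough sketch, with the pool, group, login and limit values invented for illustration and the syntax subject to change before release:

-- Sketch only: cap a reporting workload at 20% CPU
CREATE RESOURCE POOL ReportPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP ReportGroup USING ReportPool;
GO

-- Classifier function (created in master) decides which group a new session joins
CREATE FUNCTION dbo.fnClassifier() RETURNS SYSNAME WITH SCHEMABINDING
AS
BEGIN
    RETURN (CASE WHEN SUSER_SNAME() = N'ReportUser' THEN N'ReportGroup' ELSE N'default' END);
END;
GO

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;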

Go to http://www.microsoft.com/sql/prodinfo/futureversion/default.mspx to know more about SQL Server 2008.