Thursday, March 22, 2012

Best practices to design datawarehouse for a Fraud Detection/Transaction Monitoring Application

Hi All,

This thread relates to following thread of Architecure forum.

http://forums.microsoft.com/msdn/showpost.aspx?postid=1908857&siteid=1&sb=0&d=1&at=7&ft=11&tf=0&pageid=2

Because after reaching at decision that application would require datawarehousing the members suggested that it would be nice if it is posted on datawarehouse forum.

I would appreciate if could go through the mentioned thread that would state application's business requirement.

Now i am continuing the from the thread where we were discussing about SCD.

In our Application customer address is defined in 6 fields (Street,city,State,Country,Pincode,ContactNo) so if we are to maintain 3 type of addresses like Office,Communication,Residence then we have to add 18 fields in our Customer Master dimension table in our dataware house.

Suppose after some time customer office address is changed so what should be done in Customer Master Table at time of synchronization, Should ETL process insert a new record into dimension table with updated address in office address fields or it should update the existing record fields without adding new one.

If for Changing dimensions we are adding a new record then we have to maintain a flag field where status of record would be kept like 'current record' or 'old record' , this would help us to identify the main record.Every time whenever a new customer record is added old record status is changed to 'Old record'.

Should we add new record for changing fields like address,designation,employer,annual income etc or update the existing record only.Because in our analysis these all fields would play major because after each profile updation a risk category is assigned to customer.So we can not afford to overwrite any field information because at any time old information can be asked by management/regulator.

If every time for each field updation a new record is updated then customer master datawarehouse table would become 5 times if each customer record is updated five times during the year.

Suggest us the efficient approach to handle changing dimensions.

Thanks & Regards,

Sameer Gautam

Hi,

You don't have to create a record each time a change happens, just depends if you want to capture the history for that field or not. For example you may want to capture history for change of address, so you would add a new record, but you may not want it for surname, so just update all records for that customer with the new surname.

SCD are a pain to say the least when data warehousing, the principle is nice but coding it is sometimes complicated especially if you try and use snowflakes. And space soon becomes an issue, if you data changes regually. And you are right that if you make a change to a customer 5 times in a year then you will get 5 new records. Although you may code it so it only captures the end of year snapshot, rather than every change.

I would have a look through some of kimballs articles:

http://kimballgroup.com/html/articlesArchitecture/articlesAdvancedDim.html

Good luck

Matt

sql

No comments:

Post a Comment