I am often asked, usually by programmers: what is Data Warehousing, and how do I learn it? I explain to them that we use all the same tools they do, but differently. That's when I coined the term Data Sense. It describes the essence of Data Warehousing and separates Data Warehousing from the rest of programming. Every aspect of IT, from hardware and software infrastructure to design, development and QA, is done with massive data flows in mind and with a need for data precision, accuracy and meaning.

Tuesday, August 19, 2008

Where the mind is free......

Sometimes an intractable problem seems so only because it is misclassified. When you take a step back, look at the big picture and think, it is no longer an issue.
A case in point is a data problem that some hedge fund companies mentioned: how do you define a slowly changing dimension to handle the case where there are multiple identification schemes for the same security, and each transaction uses one or more of these schemes? These schemes have non-trivial many-to-many relationships with each other. This is very much a back-office problem.
But for market analysis, does it matter? Would storing the identifiers as a junk dimension be OK? Can we run association algorithms on it to group records with different sets of security identifiers into one record, since at some level they represent the same security? In other words, would the information contained in the grouped security identifiers be enough for analysis? Hmmm, once we think about it, it becomes a simple junk dimension + data mining problem.
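To make the grouping idea concrete, here is a minimal sketch in Python. It is not a full association algorithm. It swaps in a simpler technique, connected components via union-find, and treats two records as the same security whenever they share at least one identifier. The function name group_security_records, the scheme names (CUSIP, ISIN, SEDOL) and the identifier values are all assumptions made up for illustration.

```python
from collections import defaultdict


def group_security_records(records):
    """Group records that share at least one (scheme, value) identifier.

    Each record is a dict such as {"CUSIP": "A1", "ISIN": "I1"}.
    Returns a sorted list of groups, each a list of record indexes.
    Illustrative only: the rule "a shared identifier means the same
    security" is an assumption, not a validated business rule.
    """
    parent = {}

    def find(x):
        # Locate the representative of x, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link all identifiers that appear together on one record.
    for rec in records:
        ids = sorted(rec.items())
        if not ids:
            continue
        find(ids[0])
        for other in ids[1:]:
            union(ids[0], other)

    # Records whose identifiers share a representative form one group.
    groups = defaultdict(list)
    for index, rec in enumerate(records):
        ids = sorted(rec.items())
        if ids:
            groups[find(ids[0])].append(index)
    return sorted(groups.values())


# Hypothetical transactions, each carrying one or more identifier schemes.
sample_records = [
    {"CUSIP": "A1"},
    {"CUSIP": "A1", "ISIN": "I1"},
    {"ISIN": "I1", "SEDOL": "S1"},
    {"SEDOL": "S9"},
]
print(group_security_records(sample_records))
```

Each resulting group could then map to one row of the junk dimension. Note that a single wrong identifier on one record would merge two unrelated securities, which is a reason to prefer a more tolerant data mining approach over this strict rule on real data.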
Where the mind is free........