I am often asked usually by programmers - What is Data Warehousing & how do I learn it? I explain to them we use all the same tools that you do but differently. That’s when I coined the term Data Sense. It describes the essence of Data Warehousing and separates Data Warehousing from rest of Programming. Every aspect of IT from Hardware / Software infrastructure to Design, Development and QA is done with massive data flows and need for data precession accuracy and meaning.

Friday, August 15, 2008

The Hedge Fund P&L Problem

I have talked to a few hedge funds over last couple of months and this is a common problem that they face when implementing Microsoft BI solutions without a deep understanding of dimensional modeling. They are looking for MDX experts to help them dig out of the hole, not recognizing that it’s a design issue.
Here is the simplified version of the problem. A Hedge fund owns m(n) quantities of n number of securities. Every day they can sell or buy a few more – both the quantities of security held (i.e. 100 of IBM), or number of various securities (IBM, MSFT, HLTH…) may change. The values of the securities themselves may change (lets simplify by assuming mark to market). They get a feed of daily transactions that were closed. They need to make daily (or periodic) Profit and Loss estimate. This could influence market communications and security transaction decisions. I assume accounting may make estimates of those. But you maybe looking to Build a SSAS cube in addition to study P&L by various dimensions to improve your business, or to build an executive dashboard.
Please read the last two blogs to understand the underpinnings of Dimensional modeling. I’ll build on those to explain how we model the above problem. First let me explain how it was modeled wrong in several places I am aware of.
Take the incoming purchases / sales as transactions / facts. Identify dimensions (made mistakes here but that’s beside the point). Build and SSAS cube out of those transactions. Now start writing MDX programs that deduce P&L based on those transactions. Its not that they didn’t understand the problem. They understand how to calculate P&L based on the data in hand. But they are using Cubes as filesystem and Writing MDX programs on it. If you want to do that use Java. Why struggle with MDX. It’s not that the business is complex. It’s just the design is wrong. The measures you are storing at base level of cube are wrong.
What does your car speedometer show you? It’s the speed of the car at that instant. Not acceleration not velocity – just the speed. Imagine of it showed acceleration and you constantly have to calculate speed in the back of your mind by adding (or subtracting) from initial speed of 0.
So what should be right measures in Hedge Fund P&L Business-Dimensional Space. That depends on the understanding of the problem. Don’t confuse sales transactions with Business events in Business-Dimensional Space. What we are trying to analyze in P&L is the Holdings – not the sales. Sales are just changes in the holdings i.e. a derived measure / calculation in the cube. Moving measures to reflect Holdings, not sales will solve the complexity issues with the P&L Cubes. You no longer have to traverse several cells in sales space to deduce P&L. P&L is simple calculation from differences in measures at cell A and B. This will not increase the size of the cube. The size of the cube is dictated by cardinality of the dimensions. It’s just the ETL that needs to reflect the logic of deducing holdings from initial holdings and sales.

And since I am often asked “There is no dimension that can fix the above problem”. A particular facet may be fixed but hack is a hack and the problem shows up elsewhere – in needing complex MDX or unknown data accuracy.