The spaghetti we call Microsoft BI stack

We have many many many BI productsLassard

The Microsoft BI stack is a reference to Microsoft products which make up the “Business Intelligence” and “Business Analytics” offering.

From cubes to tabular models and from standard reports to ad hoc reports. In Between lies a spectrum of reporting and data modeling alternatives. For instance, you can choose Excel for ad hoc reporting tool but you could also use it for standard reporting (something you can share with the rest of the organization).

Having options is a good thing but taking in account the latest developments around Excel (think Data explorer, PowerPivot and PowerView) things have become somewhat …. well overwhelming.

The growth in functionality overlap between these different products have made it confusing and .. well .. downright inefficient.

During the keynote Amir Netz gave on the first BA conference back in April 2013 he explained how chaotic things have become and announced a new goal, simplicity…

However, shortly after this invigorating speech he announced yet another Excel add in “Geoflow”.

Which, I’d like to add, on its own is quit nifty. But looks remarkably like something you should create in PowerView.

I will have to give this some more thought but my humble suggestion would be to start with Geoflow and Data Explorer. I think Data Explorer belongs in PowerPivot (although I like the name Data Explorer better .. come on PowerPivot !? It sounds like a special Power Ranger). Geoflow should be part of PowerView.

So much for my “ranting’s”. Note that I like the Microsoft BI stack, but I see room for improvement :).

The next challenge in information worker empowerment

Teradata CTO Stephen Brobst was interviewed regarding the topic “There is money in unstructured data”. The article mentioned Teradata teaming up with Attensity. Attensity has the technology to extract “feelings” from text, interesting. Stephen Brobst also stated that he sees an increasing desire in advanced analytics: creating and answering analytical questions.

I agree with Stephen on that last point. More and more BI solutions provide the user with a semantic layer which they can use to make reports without being exposed to the technical nuts and bolts. Client tools like Microsoft Excel and Business Objects Designer encourage the business user to explore their data / information on their terms. It was only a few years ago when business users complained being dependant on IT for reporting. With that hurdle now overcome its on to the next one, which is, yes you guessed right, advanced analytics.

Unfortunately Microsoft does not have an answer to this “wish” yet. I have registered a connect entry here. Yes there is PowerPivot, but lets be honest, PowerPivot is for Excel guru’s (aka programmers in disguise).

Being Microsoft orientatated does not make me blind for other BI solutions, especially when I see them working for customers. Both SiSense and SAP Busines Objects give the business the tooling to perform advanced analytics. The major drawback I find with these solutions is information lock – in; these solutions dont integrate well with commonly used office applications, which is Microsoft Office.

Real data mining in Excel, the next step

A client, for whom I have implemeted a data warehouse solution, did not only use this solution for making high level reports in Excel but also uses it as a source to answer several analytical questions (data mining). 

A, for privacy reasons, generalized scenario is: All clients which have responded to our “fill in a random name” campain. The goal being to see if we can spot some charicteristics which may define our target audience for our next campaign. This query would generally result in about 400.000 clients. We would want this result to be in Excel so we could import it in SSPS.

Now, as some of you might know, Microsoft provides a data mining addin for Excel 2007 / 2010. My understanding of data ming is dealing with large data volumes in order to detect realistic patterns. So I thought “data mining addin” plus “Excel” equals Excel can handle huge amounts of rows fetched from a cube. Unfortunately this vision did not translate back into Excel. Trying to get to that level of detail, 400.00 clients, in Excel using a pivottable is near to impossible. Ok, I thought to myself, we will use PowerPivot. Using PowerPivot did enable me to import 400.000 clients (rows in powerpivot) but I was still unable to get this level of detail in Excel pivottable.

So there I was with a client, hungry for more data and more detail but now way of delivering it to him. The workaround? copy paste from powerpivot in SSPS. I know … the horror right?

PS: During my search for alternative solutions I came across this interesting addin for Excel “Predixion“. A data mining adding for Excel in the cloud.

Shifting technologies in the Microsoft BI world

Well after much thought I have decided to actually start my own blog and what better subject to start on then the SQL Pass Summit 2010. Microsoft presented their roadmap for Business Intelligence and lo and behold Microsoft introduced two new BI projects:

  • Project Crescent
  • BISM

Seeing as the whole BI community allready explained what these two projects are I am not going to do so again :). What I do find interesting is how a large part of the BI community (especially SSAS) reacted on this. In short; they were upset. But why? Well as the roadmap states BISM is the new C# and UDM models in Analysis Services the mature C++.

My reaction: The basis for a being a BI consultant still lies in understanding what information is and how to make this easy to get to. Wether this is through SSAS or Powerpivot or BISM.

Just the other day I adviced a client to use Powerpivot (finance department) versus the SSAS solution I delivered for the Marketing department earlier. Furthermore; I see sandboxing being a viable project starting point with BISM. Sandboxing is something that allready being used in SharePoint 2010 projects.

But are we getting something better? mmm we are getting another technology. Which will prob. perform better. But when I look at the ‘problems’ my clients have with Microsoft BI solutions it is not the technology, it is the lack of end user tooling for the analytic user.

An analytic user being the marketeer that wants to use Excel to retrieve all relations which have only donated money on project x (not any other project). Translated to SQL this would be a NOT IN (in SSAS I would have to write a named set everytime the SET would change).

Well, I see my challenge in my work still lies in the same area: Defining information together with the customer.