Where we bust a myth, avoid a pitfall and embrace a vision …
There is a lot of literature about how to design a data warehouse. There are endless discussions on the internet about the design principles to be applied. There are fierce battles over which ETL philosophy is intrinsically better.
There is nearly no debate on how the information should flow within the company.
There is nearly no discussion on how the control process should be supported by this flow.
There is nearly no discussion about business intelligence platforms.
Or rather, we do see such discussions in vendors’ brochures but, among professionals, they are often dismissed as marketing speak or as a minor aspect, something to think about only when you are tired of doing the real, meaty stuff.
But let’s start from the beginning …
It’s a no-brainer!
At the beginning of Business Intelligence, in the 90s, you could sometimes hear discussions like this: “What’s really important is the data warehouse, how you collect and harmonize your data. The tool used to consume the data is not really important. You see all those DSSs built without a DW to support them? They are all being decommissioned. Data is what matters. And cubes. Lots of fat, meaty but shiny OLAP cubes, because users want to drill down into the data”.
Apparently this attitude somehow managed to survive while the entire BI landscape was evolving and shifting. After all, all that users ask for is reports, or the ability to drill/slice/dice/filter if they are analysts. After all, when we collect user requirements we are told about reports made so and so. The height of weirdness is datasets to be consumed mainly in Excel. And we all work with Agile, right? Where everything you do is a user request. When it comes to giving all this stuff to the user, there is not much difference among the tools out there. One is graphically excellent, another produces very good printed reports, another works with cubes, etc., but at the end of the day it is always the same stuff, isn’t it? And some of them cost so much!
Yes, in the last two or three years those freaky blokes called data scientists have come up, trying to do all sorts of alchemy on their laptops with tools like R or SPSS; but they just ask for flat files, nothing really new from our point of view.
The Naturally Evolving Architecture
Bill Inmon coined this term in 1992 in his seminal work “Building the Data Warehouse” (Did you read it, did you? Because if you didn’t, stop reading me and read him!), to illustrate what was going to happen if reporting was generated in the way that looked obvious at the time: by extractor programs. The final result was a maze of uncoordinated and inherently irreconcilable information living and prospering in every corner of a data-centric organization (Yes, data-centric organizations are not a thing of the 2010s, they have always existed, but we will talk about this somewhere else). The lack of reliability, the inherent complexity and a truckload of other issues called for a new paradigm. That new paradigm was the Data Warehouse.
Now, pretty much no one still relies on the NEA and everyone recognizes the necessity of a DW. Apparently, though, no one has realized that the Naturally Evolving Architecture is still with us, alive and kicking; it has just shifted a bit further down the line.
To understand this point, let us step into the user’s shoes for a minute and sit on the sideline, watching the entire process from the outside. Let’s imagine we are not very technical but know enough to have an informed global view.
You see the source system, then some magic happens (well, actually the ETL process) and the DW is built every morning, then on top of it sits an OLAP cube. Then you get some data (mainly from the cube), manipulate and enrich it (usually in Excel), and distribute it to some other people by e-mail or on a shared drive, etc. As a user, I see my job as being the data extractor, enricher and distributor to the real consumers, those who actually identify courses of action based on that information.
This looks normal, doesn’t it? Is it normal to do this day after day, week after week, month after month?
THIS IS BY NO MEANS THE ONLY POSSIBLE PROCESS.
THIS IS BY NO MEANS THE ONLY POSSIBLE ARCHITECTURE.
This is just letting things go the way they naturally go when everyone does what she naturally thinks of doing. It isn’t really that different from the time when data were collected and organized on paper.
The Comedy of Misunderstanding
BI software has evolved deeply in the last two decades (yes, I am that old), from desktop tools through the web revolution and on to mobile platforms; visualizations became more appealing and powerful, with more and more components, under the push of some young visionary companies who aimed at a more personal BI. The pendulum swung from large BI suites to leaner clients. Now the focus is on consuming big data in an insightful way, with little or no overhead for the user.
This is the story that is usually told when an otherwise hectic industry stops for a moment and looks back at its past. And it is true, but it ignores the single most important feature that characterizes a BI product: its infrastructure.
The traditional large BI suites all rely on a foundation that provides a range of essential services, like user definition and security, scheduling, a safe repository, data distribution, messaging and social features. Most critically, they feature a layer of shared data access services that provides a common vision throughout the organization. This layer may include federation or other forms of integration, while advanced clients leverage it for lightweight integration at user level.
So, the effective deployment of one of these suites in a complex organization may profoundly change the process described above.
A LARGE BI SUITE MAY DEFINE THE INFORMATION FLOW PROCESS WITHIN AN ORGANIZATION.
Compared with the model described above, the differences may be deep.
- For example, we may have user-controlled, automated mass distribution of reports.
- Reports that remain interactive and contain data may be served to the analyst users, potentially greatly alleviating any query performance issue.
- Alerting systems may reduce or eliminate the need for entire report families.
- Dashboards reduce the need to supply executive information.
- Data integration may be performed at client level, with users able to engineer it directly, which reduces the DW maintenance overhead.
- Tools that can produce data-based presentations help analysts in their “storytelling” duty.
- Google-style queries put a lot of information at the casual user’s fingertips.
- Data federation may shortcut substantial chunks of the integration process.
- Semantic layers may tap directly into source systems to provide real-time BI.
- Reports may be turned into data sources for other reports and analyses with no IT involvement.
- Social features may be used to foster healthy discussions on data.
I could go on, but I think that anyone who has worked with one of the large BI suites, and most of the smaller ones, can recognize the pattern. Each one of the points above is a distinct advantage. Everything that makes the process less time- and resource-consuming adds to the bottom line. Everything that turns data into information more quickly, but still consistently, adds to the ability to respond to a changing environment.
If you have 20 people consuming information, you can do pretty much anything you want and it will not make any particular difference.
If you have 1000 people consuming information and you save 15 minutes, every day, for each one of them, how many resources are you freeing for other uses? Well, you do the math.
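For those who would rather not do the math by hand, here is a minimal sketch of the arithmetic. The 1000 people and the 15 minutes come from the paragraph above; the 8-hour working day used to convert hours into full-time equivalents is my own assumption, and the function names are purely illustrative.

```python
# Figures from the text: 1000 information consumers, 15 minutes saved per person per day.
# ASSUMPTION (not in the text): a working day lasts 8 hours.

def daily_hours_saved(people: int, minutes_per_person: float) -> float:
    """Total hours freed per day across all information consumers."""
    return people * minutes_per_person / 60


def full_time_equivalents(hours_saved: float, hours_per_workday: float = 8.0) -> float:
    """Express the saved hours as a number of full-time people (assumed workday length)."""
    return hours_saved / hours_per_workday


hours = daily_hours_saved(people=1000, minutes_per_person=15)
print(hours)                         # 250.0 hours per day
print(full_time_equivalents(hours))  # 31.25 people, under the 8-hour assumption
```

With 20 consumers the same calculation gives 5 hours a day, which is why at that scale the choice hardly matters.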
Down the Feature Drain
However, the world is not a perfect place.
Many of the BI professionals who have worked with a BI suite will agree that many of these features remain unused or underused. This happens despite the potential described above, and the reasons for it are hiding in plain sight.
BI vendor marketing bears part of the responsibility, since it is often unable to take a stance. In the effort to talk to everyone, it ends up talking to no one. The message is always confused, as it oscillates between “you can do exactly what you want” and “look at this comprehensive panorama made of 137 perfectly integrated different software modules that will revolutionize your company”. The real potential behind it is often drowned in the glare of shiny charts.
The bulk of the responsibility, though, stays with us, the BI professionals. It is much easier to focus on the basics and forget the possibilities. Maybe because creating and maintaining a DW is such a taxing effort, we find it easy to just translate user requirements into a bunch of reports, de facto replicating the NEA on the BI platform. This condition of satisfaction is also easier to write into a contract, so it appears a sensible way to go; I have been guilty of this myself, sometimes.
We are also obsessed with user requirements: every methodology starts with collecting them, and the ability to translate them into technical requirements is considered an important piece of know-how. Unfortunately, the users do not know what they want and do not have a clue about what they really need. In BI it is intrinsically complex enough to identify the “what”, that is, the data that may answer a business question. Specs are never 100% correct the first time; this is a given.
The issue becomes nightmarish when we let the users design the “how”. The average business user will ask for a better version of what she already has; she will fiddle with the tools already at her command to find a solution and she will ask you to help with that. This will happen because it is not the users’ job to know what is available in the BI software market and, crucially, how it can be used to improve the way information is managed within the organization.
Users, when asked what may be of help, will cover just the segment of the NEA under their responsibility and will miss the bigger picture. They will likely ask for some little features or improvements, sometimes doomed to be utterly irrelevant to the overall BI strategy. Paradoxically, what emerges from a survey of uninformed users is just ways to improve the NEA, making it harder to eradicate.
Henry Ford is often quoted as saying: “If I'd let the clients design my cars, I would have ended up with faster horses”. It is our duty, as BI professionals, to show the users that an entire panoply of vehicles is available out there and to harness their informed contribution to identify the best software and the best process.
Hmmm ... I know, I know, this is going to be controversial, as it sounds like an advert for the big vendors. Well, I think this is the truth according to my experience; I am happy to be proved wrong. Up to you!