The One Document you Must Absolutely Have, NO EXCUSES

Documenting the Business Intelligence project activity is important but it is also time consuming. This is the reason why what has to be documented should be chosen carefully. This is neither a praise of the lack of documentation, so common with Agile, nor a hymn to the overburden imposed by an excessive documentation like it happens with other methodologies when they are not applied properly. Balance, in what is written and what is not, is obviously the logical approach.

However, in my view, there’s a document that every project MUST have because it crucially contains the answers to the vast majority of all the questions that are going to be asked by the users and the stakeholders: the Business Model Document.

The Business Model Document (BMD) maps every object accessible by the users with its sources. It also describes all the transformations or calculations applied to generate the object.

This document is necessary for two reasons:

  • describes the fundamentals of the job being done, so it is a testament to the analysis, the thinking and effort poured into a project. A BI project is basically as good as its model, so it is worth having a clear view of what the model is.  Besides, it is also a cool deliverable.
  • At some point, the users need to be told about the fields or the objects that they have at their disposal, and they will keep asking information during their learning curve. The worst thing you may need to do is digging into your code every time you have to answer a question because you can't remember that off the top of your head.

 The Business Model Document must include:

  • A general, introductory, description of the system, shortly describing databases, tools and listing the data-marts.
  • A list of all measures available to the users and their context
  • A list of all the dimensions, with their attributes and their context, not forgetting degenerated and junk dimensions
  • A description of the update frequency and policy, and of all the parameters governing the system.

In detail, for every measure it must include:

  • A short description in business terms, stating what the measure is for, who requested it, who approved it and other related measures.
    Example: Sell_Through. It is the ratio between the sales and the stock level for a product. Usually expressed in %, a 100% sell through means that all the stock has been sold, a 0% means that no item has been sold. It can be calculated on quantity, cost, sales or profit. See also Sell in, Sell out, quantity, COGS, Net Sales, Profit.
    Included in the original implementation and approved by John Doe (Merchandising Coordinator) in Jan 20X4
  • The measure source in term of actual fields or other measures participating in definition, either they come from a db, a file, an API or whatever else.

    Example:  When calculated in quantity it derives from measure quantity, which is plainly from table MyERPOrdersDetails, field nNumQuantity01 , 

    When calculated from cost it uses the COGS as loaded periodically from file fPrdCosts.csv, field Cost

    When calculated from Net Sales it uses the same definition as the Net Sales measure 

    When calculated  in Profit ...

  • The calculations required to get the final number. This may be a formula or the verbal description of an algorithm but every variable in it must be unambiguously identified.
    Example: when calculated in quantity, derives from measure quantity (see above) divided by the product stock level as in table MyERPStockSnapshot, field qty, excluding StockTypes samples, QC and Demo
    When calculated from cost  ...
  • How it aggregates in reporting and the context in which it can be used
    Example: Sell_Through aggregates in sum and is not available in financial, treasury and HR contexts.

Similarly, for every dimension it must include

  • A business description, describing in plain language what the dimension is supposed to model (customers, time, products, warehouses, subsidiaries, vendors etc.)
  • A description of key fields (natural and, eventually, surrogate keys) and attributes, and their source. Generally has no business value to explain the algorithm applied to build a dimension (and for a big dimension it may be very complex) but knowing the source of fields is essential.
  • A careful explanation of the hierarchies and the grouping available for the dimension
    Example: DimSalesman are associated to DimCustomers, every customer has one. Salesmen are a group of customers and they are further aggregated into areas. This hierarchy derives from table MyERPSalesForce, fields sUserCode, sSalesMName1,sSalesMName2.
  •  A careful explanation of all the attributes, identifying their origin.

A dedicated section of the BMD must also state the updates policy, ETL jobs execution times (considering the fact that users may not be all in the same time-zone), database sources and latency, must be documented thoroughly.
Since it is likely that your system has some tables containing functioning parameters, it is paramount for all the administrator to have a clear view of how to use them. Their description on the document should address specific and realistic scenarios.
Example: Parameter nUpdatedTimespan controls how many days the system looks back to load sales orders. A value of 365 means that all the orders younger than 365 days will be deleted and reloaded to account for changes. A value of 0 means that all the orders will be deleted and a full reload will take place. A value lower than 0 means that a hash algorithm will be used to identify the orders to be reloaded.

It i also useful to have at the beginning or at the end of the document, the system bus matrix to visually show what is available. If you can, print it out and hang it on the wall.

No use to say, the BMD must be accurately maintained every time a change or an addition are made. Cutting short on the technical details guarantees that it will not be an undue burden for the team but a useful tool to support the job.

In a mature system, this document may grow up to hundreds of pages long, one little addition at a time. In this case a wiki or a document management system are, maybe, the best way to organize and consume the BMD.

Though, my suggestion is to print a copy of it, from time to time, to give physical evidence of the job done, for those who actually did it and for all the stakeholders.