Derived or Second Level Attributes

Originally posted on March 14 2011



There’s an entire class of non trivial entities that are often overlooked in BI which do not even have a name. Often when you browse the dimensions they’re mixed together with various other attributes; they’re hardly a dimension of their own.

They are … well, they lack a collective name, to my best knowledge. I’m used to call them “second level attributes” but it’s not a very descriptive name.

 A second level attribute of a dimension is an attribute that depends on the facts.

The date of the first order, or the date when a customer was acquired, depends from the facts, and it’s usually extrapolated by them and stored in a field. Less trivial, the number of the orders placed by a customer in the current year. It might change every day and it should be recalculated every day. Much less trivial, the cluster to which a customer belongs, depending on some data mining wizardry. Another classic example is determining if a customer is a bad payer or not; the day before you are throwing all your lawyers at him, the day after the payments everything is fine.

They’re not limited to customers. I have implemented some fancy second level attributes like the orders that, prepared on an assembly line already working at the 70% capacity, choke it. Or the products which belong to a class whose sales profile match the demographics of the intended audience (I haven’t fully understood it even now, I confess.)

 Finding the value of the attribute is often possible with current BI tools, but that’s not the required result. They must be used as attributes like the others, but they must be updated depending on the underlying facts. The classic sales manager needs to know the total sales of the customers that bought 10 yellow widgets and at least 2 blue widgets but no more than 3 green widgets and compare them to area total sales, because those customers are to be offered a special discount for 5 gray widgets. He needs to estimate how much the special discount will affect her sales.

Calculating them might be complex but is almost always possible in every technology currently in use.  I always used some sort of sql script and I never faced insurmountable hurdles.

 There’s an aspect, nonetheless, that the old BI does not address but the #newBI should consider. The requirement changes continuously and I’ve never seen an application or a system that let the user define the rule to get the attribute value independently from the technical side. This kind of insight is increasingly common because is much closer to the way the human brain classifies a class of business entities. 

I’m trying to build this feature into Viney@rd, of course, but that’s not easy. The difficulty is translating the verbal expression of the rules in some sort of code that could be executed.

 I’d be more than pleased to discover that there’s already an application that does this well, or that there’s another, more flexible way, to address the issue. Please, let me know!




2282 views and 1 response

  • Mar 16 2011, 1:33 AM

    Augusto Albeghi responded:

    I add a comment myself to point out that Alec Sharp (!/alecsharp) of Clariteq Consulting ( points out that they're named "Derived" attributes in ER models. Actually, now that I'm reminded, I remember this point and it's a good definition.