Business Intelligence

  Smile Iberia, O... / Libros blancos / Business Intell... / Multidimensional analysis or olap  



Multidimensional analysis is one of the most commonly used methods of analysis in business intelligence.

Let‟s try to outline the main principles.

We have basic information sets, generally in large volumes, each line of information represents an event characterised by:

  • A unique identifier
  • Elements that describe the information
  • Values which carry information regarding quantities

Take the following standard example, which we have simplified for our purposes:

Take the basic entity the invoice line item. It is often a company‟s key entity the most conclusive information it holds in regards to the sales process.

The invoice line item is part of an invoice and relates to the sale of a product, in a given quantity, at a given price, to a given customer.

It carries various different attributes: a product reference, a quantity, a unit price, a total price.

We can access other information via the invoice that the line item belongs to: the client reference, date (year, month, day) etc.

We also have access to other information on the customer themselves: country, region, type of customer, sector of activity, etc. The customer may even have an assigned sales agent.

We will leave it here for this example, but we could go a lot further and collect a great deal of information all
stemming from this one little invoice line item.

 

 

The first step is to gather all of the information required for our analysis. Here, for example, the information
required is the:

  • Date (year, month, day)
  • Product
  • Product category
  • Customer
  • Customer country
  • Sales agent
  • Quantity
  • Turnover

The first information constitutes the axes for potential analyses, the last two constitute values.

In multidimensional analysis, the initial relational model is not the most pertinent or the most reliable. We generally
prefer to denormalize the database, i.e. to work on a single table that all useful information has been gathered on.

In our example we would obtain the following table:

There is clearly a great deal of redundancy in this table, but in the end it is easier to manage redundant but simple information, and to avoid problems regarding consistency and integrity which are an area for information systems.

This next step involves creating the first level of aggregation i.e. uniting certain line items.

In our case for example, we can assume that the data will not be used at product reference level, but only by segment. In this case, we will regroup all of the identical lines for the key (date, segment, family, customer, country, sales agent), and add the quantity and turnover values.

The last step is that of “multidimensional analysis” in the true sense of the term, which consists of selecting the axes of analysis.
Among these axes, we can distinguish between:

  • Discrete or discontinued value axes i.e. that contain a limited number of values, a postal code or CSP segment for example.
  • Continuous value axes, typically a date or a price. We can lead them to a discrete number of values by defining ranges: a price range, an age range etc.

We can also distinguish between:

  • Cumulable values, an amount or a number of items for example.
  • Noncumulable values, age or date for example.

Cumulable values are those which it is wise to combine, i.e. those we can calculate the sum of, (or the average, or
some other mathematical function of), for a sub-total of lines, for each theme for example.

Multidimensional analysis consists in:

  • Defining the axes of analysis that will be used, and the order in which they will be used. E.g.: by region, then by year, then by sales agent, then by product range. We do not always use all possible axes.
  • Define the value(s) to be studied, and the function to be applied to these values.

Each hierarchy of axes of analysis correspond to a question that we ask it.

For example analysis by year, by sales agent aims at representing the development of turnover, and comparing sales agents for each preceding year. Analysis by sales agent, by year aims to compare the sales agents first.

DescargaBusiness Intellig... (PDF - 3.31MB)
Copyright Cometa 2010 Guide open source| Understanding open source| Business Intelligence| Gestión documental open source| e-commerce Open Source| Los CMS Open Source| Créditos - Aviso Legal