SQL Book

An Introduction to Data Warehouses and Data Warehousing

Part 1 - Aims of the Data Warehouse

The terms Data Warehouse and Data Warehousing are used frequently today but can cover a wide range of concepts and processes. This series of articles aims to give an introduction to the various aspects of the world of Data Warehousing. In Part 1 of this series we identify why Data Warehouses are now in high demand and what their aims are.

Today companies record a large amount of data for different reasons such as sales, supplies, payroll, marketing, research etc. Often when a company has data collected for different purposes the data is collected and recorded in different systems such as: online shopping cart, call centre applications, spreadsheets, Oracle / Access / SQL Server databases etc.

The diversity of the data sources makes it hard for these different areas of data to be analysed together. The modern business requires its data to be analysed and summarised in order for it to observe patterns of behaviour that will help the company identify problem areas and areas that they are doing well in.

It is this identification of trends in their business operations then helps them make decisions on how to develop the business to increase benefits / productivity / efficiency etc.

This enabling of the summarising and analysis of data is the aim (end product) of Data Warehousing.

The end users of the data warehouse are company managers who can view its summary reports / graphs / charts etc and use them to quickly make decisions on how to carry the business forward.

Although it is the end result, front end analytics is not all that data warehousing is. In order to achieve the end aim a number or processes must be undertaken in order to achieve accurate and useful analysis.

These processes can be summarised in the following steps:

  1. Identifying analytical requirements of the company
  2. Identifying appropriate data sources
  3. Data Cleansing (extraction from original data sources, transformation of this data into formats that follow company business rules)
  4. Loading the cleansed data into the Data Warehouse storage
  5. Front end Analytics (reporting, charting, data-mining, OLAP)

In the remaining articles in this series we will look at these different stages of Data Warehousing and with simple real world examples see how they are ongoing processes that are constantly evolving in order to satisfy the companies business needs.

In Part 2 we look at what considerations should be given to the business requirements of the Data Warehouse.