So you've made the decision to design a data warehouse for your enterprise. You'd love to
do some online shopping and buy an end-to-end product that will pop out of the
box and hit the ground running. But guess what? It doesn't exist. So where do
you go from here? Here are some guidelines from POD on the issues you
will need to be aware of when setting up a data warehouse.
The Basics:
First of all, what is a data warehouse and where can it prove to be useful
within my organisation's infrastructure? A data warehouse can be thought of as a
repository of an enterprise's data that can be used for information analysis,
user trends, management reporting and decision-making. Data is extracted from
the production systems where it originates and placed into a data warehouse
which will provide access to the data regardless of client platform, software or
data formats.
According to the Palo Alto Management Group, the top reasons for investing in
a data warehouse are to:
- improve decision or management processes
- improve customer service
- keep ahead of the competition
- help execute corporate strategy initiatives
- reduce operational costs
- retain key customers
- identify new customers
- identify business and trading trends
The Challenge:
This all sounds great, but it's not simple. The integration of all the tools
required to put together a successful data warehouse is a major challenge. One
of the considerations to keep in mind is how well the tools you are considering
purchasing integrate with your existing environment as well as any other tools
you may want to use. Integration occurs at many levels; a breakdown at any of
them could result in budget overruns and implementation delays.
Metadata
One of the most challenging tasks is integrating the metadata. Metadata details the
location and description of everything in the actual data warehouse, much like
a map. Synchronising the metadata between different vendor products,
different functions and different metadata stores can be a real challenge. There
are currently no standards for metadata, which further complicates the task. It
is hoped that industry leaders such as Microsoft and Oracle will give a lead.
Data Transformation
Transformation tools extract data from the production sources, clean it and load
it into the warehouse. You need to make sure that the product you choose offers
metadata bridges to all your possible sources of metadata.
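The extract, clean and load steps described above can be sketched as follows. This is a minimal illustration only, assuming two in-memory SQLite databases stand in for the production source and the warehouse; the table names (customers, dim_customer), the columns and the cleaning rules are hypothetical and not taken from any particular product.

```python
import sqlite3


def extract(source):
    """Read raw rows from the (hypothetical) production table."""
    return source.execute("SELECT id, name, country FROM customers").fetchall()


def clean(rows):
    """Drop rows with no usable name; tidy whitespace and letter case."""
    cleaned = []
    for row_id, name, country in rows:
        if name is None or not name.strip():
            continue  # a row without a name is treated as unusable here
        tidy_name = name.strip().title()
        tidy_country = (country or "unknown").strip().upper()
        cleaned.append((row_id, tidy_name, tidy_country))
    return cleaned


def load(warehouse, rows):
    """Insert the cleaned rows into the (hypothetical) warehouse table."""
    warehouse.executemany(
        "INSERT INTO dim_customer (id, name, country) VALUES (?, ?, ?)", rows
    )
    warehouse.commit()


def run_etl(source, warehouse):
    """Run one extract, clean and load cycle; return the number of rows loaded."""
    rows = clean(extract(source))
    load(warehouse, rows)
    return len(rows)


# Sample data, invented purely for illustration.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "  alice smith ", "uk"), (2, None, "fr"), (3, "BOB JONES", None)],
)

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)

loaded = run_etl(source, warehouse)
```

In a real project each of these three functions would typically be supplied or generated by the transformation tool; the point of the sketch is only that the cleaning rules sit between the source and the warehouse.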
OLTP
Integrating the extract, cleanse and load cycle with your online transactional
processing systems is another area of concern. You will need to make some
decisions at this point. You could develop a continuous data-feed model from
production to warehouse, resulting in a real-time system if time is of the
essence, or you could opt for a batch-feed model, where data is fed to the
warehouse, say, once a day, overnight. This is more appropriate for less
time-critical applications and of course places less overhead on your
production systems.
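As a rough sketch of the batch-feed model, the function below copies only the rows changed since the previous run and remembers how far it got. The row layout, the updated_at field and the function name are assumptions made for illustration. Running the same function every few seconds rather than once a night would approximate a continuous feed, at the price of reading from production far more often.

```python
from datetime import datetime


def batch_feed(production_rows, warehouse_rows, last_loaded_at):
    """Copy rows changed since the previous run into the warehouse.

    Returns the new high-water mark: the latest change time loaded so far.
    """
    new_rows = [row for row in production_rows if row["updated_at"] > last_loaded_at]
    warehouse_rows.extend(new_rows)
    return max((row["updated_at"] for row in new_rows), default=last_loaded_at)


# Sample data, invented purely for illustration.
production_rows = [
    {"order_id": 1, "updated_at": datetime(2024, 1, 1, 9, 30)},
    {"order_id": 2, "updated_at": datetime(2024, 1, 1, 16, 45)},
]
warehouse_rows = []
high_water_mark = datetime.min

# First nightly run: both existing orders are loaded.
high_water_mark = batch_feed(production_rows, warehouse_rows, high_water_mark)

# A new order arrives during the next day.
production_rows.append({"order_id": 3, "updated_at": datetime(2024, 1, 2, 11, 0)})

# Second nightly run: only the new order is loaded.
high_water_mark = batch_feed(production_rows, warehouse_rows, high_water_mark)
```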
You will also need to decide on whether to opt for a push or a pull model of
data feed. The 'push' model has the production systems 'push' their data from
production to the warehouse, while the 'pull' model lets the data warehouse
software go get the data from the production systems. A drawback to 'push' is
that it incurs a system overhead on the database engine used for OLTP.
Combinations of both types of model are possible and are raised as options
during the design process.
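The difference between the two models can be illustrated with a small sketch. The class and method names here are hypothetical. Note where the work happens: in the push model the production system does extra work on every write, while in the pull model the warehouse decides when to fetch.

```python
class ProductionSystem:
    """Stand-in for an OLTP system (hypothetical, for illustration only)."""

    def __init__(self):
        self.records = []
        self.subscribers = []  # warehouses that production pushes to

    def write(self, record):
        self.records.append(record)
        # Push model: production itself forwards every new record,
        # which adds overhead to each transaction.
        for warehouse in self.subscribers:
            warehouse.receive(record)

    def read_from(self, offset):
        """Return every record written at or after the given position."""
        return self.records[offset:]


class Warehouse:
    """Stand-in for the data warehouse side of the feed."""

    def __init__(self):
        self.records = []
        self.offset = 0  # how many production records have been pulled so far

    def receive(self, record):
        """Push model: accept a record sent by production."""
        self.records.append(record)

    def pull(self, production):
        """Pull model: fetch whatever is new, at a time the warehouse chooses."""
        new_records = production.read_from(self.offset)
        self.records.extend(new_records)
        self.offset += len(new_records)
        return len(new_records)


production = ProductionSystem()
pushed_warehouse = Warehouse()
pulling_warehouse = Warehouse()
production.subscribers.append(pushed_warehouse)

for order in ("order-1", "order-2", "order-3"):
    production.write(order)

count_before_pull = len(pulling_warehouse.records)  # nothing has arrived yet
pulled = pulling_warehouse.pull(production)         # the warehouse goes and gets it
```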
Information Catalogue
One of the most important tools is the information catalogue. It is vital that
the catalogue presents information to users in an easy-to-understand and
easy-to-use format. The catalogue should be able to integrate both technical and
business data into a single user resource for easier maintenance and
troubleshooting.
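One way to picture a catalogue that holds technical and business information side by side is sketched below. The field names, the sample entries and the search function are assumptions made for illustration; a commercial catalogue tool will have its own structure.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    # Technical details: where the data lives and where it came from.
    table: str
    column: str
    source_system: str
    # Business details: what the data means to the people using it.
    business_name: str
    description: str


def search(catalogue, term):
    """Find entries whose business name, description or column mentions the term."""
    term = term.lower()
    return [
        entry
        for entry in catalogue
        if term in entry.business_name.lower()
        or term in entry.description.lower()
        or term in entry.column.lower()
    ]


# Sample entries, invented purely for illustration.
catalogue = [
    CatalogueEntry(
        table="fact_sales",
        column="net_amt",
        source_system="order_entry",
        business_name="Net revenue",
        description="Sale value after discounts, in pounds sterling",
    ),
    CatalogueEntry(
        table="dim_customer",
        column="cust_id",
        source_system="crm",
        business_name="Customer number",
        description="Unique identifier assigned to each customer",
    ),
]

revenue_matches = search(catalogue, "revenue")
```

A business user searching for "revenue" is led to the net_amt column of fact_sales without needing to know either name in advance, and support staff can trace the same entry back to its source system.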
The Solution:
By now you should be realising that integration is no easy task. With the
average data warehouse system costing £1.1 million, you may want to
consider outside consultants who have experience in designing such systems.
POD works on the premise that no single organisation has the set of
end-to-end products required to build a successful data warehouse. So we have
developed expertise in providing a single point of responsibility for the
success of our customers' endeavours.
Failed data warehouse projects greatly outnumber the
successes. Reliance on technology is not enough when it comes to realising the
true value of the data accumulated within organisations. The right people, the
right methodologies and the appropriate experience also need to be
brought together to produce a successful data warehouse project.
Data warehouses reach an organisation at all levels, and the personnel who
design and build the warehouse must be capable of working across the
organisation as well. This is where the industry and product experience of POD,
coupled with a business-focussed approach, helps to guarantee success in setting
up your own data warehouse.