Data Warehousing: Supporting Business Intelligence

by Jonathan G. Geiger, Cutter Consortium

Business intelligence (BI) is the set of processes and data structures used to understand a company’s business environment and support strategic analysis and decision making. This article describes the business value that BI capabilities provide, the architecture needed to support the environment, and a sound approach for building and managing it.

Business Value

BI has become popular because companies recognize that it provides bottom-line benefits. One of the interesting facets of BI is that, in itself, it doesn’t do anything; it is merely a store of information that is applied by people and systems to derive benefits. This sometimes makes it difficult to justify the sizable investment in the architecture that a sustainable BI environment requires. Companies need to recognize, however, that without such an investment they will not be in a position to leverage one of their most important assets: information.

The most significant generic benefit of the BI environment is the collection, in the data warehouse, of a single, consolidated, enterprise view of the data. Although there are technical efficiency benefits, the major beneficiaries of the single store of information are the business users. With this consolidated information as a base, strategic analysts can get to the data they need much more easily, use the same figures as the basis of their analysis, and have a common understanding of business terms. A data warehouse built to support customer relationship management, for example, will enable companies to know how many customers they have and the profitability of each customer. Armed with this information, a company can make sound business decisions affecting both customer segments and individual customers.
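
The customer relationship management example can be made concrete with a minimal sketch. It assumes the warehouse already holds consolidated rows of customer, revenue, and cost; the row layout, the sample values, and the function name `customer_profitability` are illustrative assumptions, not part of any particular product.

```python
from collections import defaultdict

# Hypothetical consolidated warehouse rows: (customer_id, revenue, cost).
# In practice these would come from the data warehouse, not a literal list.
transactions = [
    ("C001", 1200.0, 800.0),
    ("C002", 300.0, 350.0),
    ("C001", 500.0, 200.0),
    ("C003", 900.0, 400.0),
]

def customer_profitability(rows):
    """Sum (revenue - cost) per customer across all consolidated rows."""
    profit = defaultdict(float)
    for customer_id, revenue, cost in rows:
        profit[customer_id] += revenue - cost
    return dict(profit)

profit_by_customer = customer_profitability(transactions)
customer_count = len(profit_by_customer)  # how many distinct customers we have
```

Because every analyst starts from the same consolidated rows, the customer count and the per-customer profit figures agree no matter who computes them.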

Architecture

The corporate information factory is representative of a conceptual architecture needed to support BI [1]. This architecture creates two distinct stores of information, each with a different set of objectives and a different design. The data warehouse stores the enterprise’s consolidated BI information. It includes historical information to facilitate trend analysis and is updated through a controlled set of processes, never through individual transactions. The objective of the data warehouse is to serve as a collection and dissemination point for the data. It collects data from wherever that data may exist through a data acquisition process, and it sends data to the data marts through a data delivery process. The term “relational” is often used to describe the design of the data warehouse.

Data marts are smaller data stores that are populated with data from the warehouse and built to answer a specific set of business questions or support a specific business function. The marts often contain summarized data, and their objective is to provide the business users with a store of information that can be easily and quickly accessed and traversed. The terms “dimensional” or “star schema” are often used to describe the most popular type of data mart.
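
As an illustration of the “star schema” design, the following sketch builds a tiny dimensional data mart in an in-memory SQLite database using Python’s standard `sqlite3` module. The table names (`dim_customer`, `dim_date`, `fact_sales`), the columns, and the sample rows are assumptions chosen for the example; a real mart would be populated from the warehouse by the data delivery process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One central fact table surrounded by dimension tables: the "star".
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT,
    segment      TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    revenue      REAL
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Acme", "Enterprise"), (2, "Bolt", "SMB"), (3, "Core", "SMB")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240101, 2024, 1), (20240201, 2024, 2)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 20240101, 1000.0), (2, 20240101, 250.0),
     (3, 20240201, 150.0), (1, 20240201, 500.0)],
)

# A typical business question: revenue by customer segment.
revenue_by_segment = conn.execute("""
    SELECT c.segment, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.segment
    ORDER BY c.segment
""").fetchall()
```

The query joins the fact table to a single dimension and aggregates, which is the access pattern that makes this kind of mart easy to traverse for business users.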

Some of the benefits of this architecture include flexibility, durability, scalability, maintainability, and reusability. These benefits are only available with the investment in the infrastructure, and companies need to resist the temptation to demand a business deliverable from the architecture itself; the architecture provides the foundation. When an office building is built, much of the cost goes into the foundation, plumbing, wiring, etc. The visible deliverables are the individual offices, but they would not be possible without the investment in the infrastructure. Furthermore, if the infrastructure is built with the future in mind, we can rearrange offices and move internal walls without needing to reconstruct the building.

Within the BI architecture, segregation of the data warehouse and the data marts protects the business users from changes to the warehouse. The components of the architecture facilitate growth (scalability), as each component can be addressed independently. The data warehouse, combined with the data acquisition processes that feed it, is designed to accommodate additional sources of data (flexibility) without changing the architecture (durability). Including conforming dimensions (components that can be used in multiple data marts) and summaries accommodates reusability. With a central store of information, as the business community’s needs change, the data is often readily available, needing only to be loaded into a new data mart to meet the new requirements (maintainability).
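
The reusability point can be sketched as one conforming dimension shared by two data marts. Everything here is a hypothetical illustration: the `customer_dim` dictionary, the two fact lists, and the helper `summarize_by_segment` are assumptions, not a prescribed design.

```python
# One conforming customer dimension, defined once and reused by every mart.
customer_dim = {
    1: {"name": "Acme", "segment": "Enterprise"},
    2: {"name": "Bolt", "segment": "SMB"},
}

# Hypothetical facts for two different business functions.
sales_facts = [(1, 1000.0), (2, 250.0), (1, 500.0)]  # (customer_key, revenue)
support_facts = [(1, 3), (2, 7), (2, 1)]             # (customer_key, tickets)

def summarize_by_segment(facts, dimension):
    """Total a measure per customer segment using the shared dimension."""
    totals = {}
    for customer_key, measure in facts:
        segment = dimension[customer_key]["segment"]
        totals[segment] = totals.get(segment, 0) + measure
    return totals

sales_mart = summarize_by_segment(sales_facts, customer_dim)
support_mart = summarize_by_segment(support_facts, customer_dim)
```

Because both marts use the same definition of “segment,” their results can be compared directly, and a new mart can reuse the dimension without redefining it.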

The data acquisition process is a complex set of activities that collects data from various sources and loads it into the data warehouse. This process often consumes the majority of the development effort. The most complicated part of this process involves making business decisions concerning the source of data to be used, the quality expectations to be met, and how data from multiple sources will be integrated. The data delivery process filters, formats, and delivers data from the data warehouse to the data marts. The data warehouse itself is maintained by an enterprise data management function to ensure that performance, reliability, and quality expectations are met.
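
The two processes can be sketched in a few lines. This example assumes two hypothetical source extracts (`billing` and `crm`), a source-priority rule in which the first non-empty value wins, and a quality rule that rejects records without a name; each of these stands in for one of the business decisions described above and would differ in a real project.

```python
# Hypothetical source extracts keyed by customer id.
billing = {
    "C001": {"name": "Acme Corp", "email": None},
    "C002": {"name": "Bolt Ltd", "email": "ap@bolt.example"},
}
crm = {
    "C001": {"name": "ACME", "email": "sales@acme.example"},
    "C003": {"name": "", "email": "x@core.example"},
}

def acquire(sources):
    """Data acquisition: integrate sources in priority order.

    For each field, the first non-empty value found wins.
    Records with no usable name fail the quality rule and are rejected.
    """
    warehouse, rejected = {}, []
    customer_ids = sorted(set().union(*(s.keys() for s in sources)))
    for cid in customer_ids:
        record = {"customer_id": cid}
        for field in ("name", "email"):
            record[field] = next(
                (s[cid][field] for s in sources if cid in s and s[cid].get(field)),
                None,
            )
        if record["name"]:
            warehouse[cid] = record
        else:
            rejected.append(cid)
    return warehouse, rejected

def deliver(warehouse, predicate, fields):
    """Data delivery: filter warehouse records and keep only the mart's fields."""
    return [
        {f: record[f] for f in fields}
        for record in warehouse.values()
        if predicate(record)
    ]

warehouse, rejected = acquire([billing, crm])  # billing takes priority over crm
contact_mart = deliver(
    warehouse,
    predicate=lambda r: r["email"] is not None,
    fields=("customer_id", "email"),
)
```

Keeping the integration and quality rules inside `acquire` and the filtering inside `deliver` mirrors the separation in the architecture: a new mart needs only a new `deliver` call, not a change to how data enters the warehouse.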

The architecture includes other important components. Meta data is information about the data in the warehouse and the data marts and how it is used. The decision support interface is a set of end-user access tools used to obtain and navigate through the data. The enterprise portal is the interface through which the users get to the data warehouse and data marts. Just as a person in an office doesn’t think about the building’s foundation, plumbing, and wiring, this portal should be designed to meld the BI capabilities with business processes so that the architecture itself disappears into the background.

Methodology

Building a sustainable BI environment requires a program orientation to ensure that the investment in the architecture and infrastructure is made. With this investment, companies can realize the benefits previously cited and be in a position to add new capabilities very quickly. The methodology itself is iterative, with each project typically scheduled for completion within three to six months.

Each BI effort is a project within the program. It begins with planning and initiation, in which the scope is defined, the expectations are set, and the project plan is developed. The next two phases, “getting data in” and “getting information out,” may be executed in tandem. Within getting data in, the data warehouse is designed and the data acquisition process is built. Within getting information out, the data marts are designed, the data delivery processes are built, and the end-user access facilities are built. The last phase of each project is deployment, in which the BI capability is moved into a production environment. Once the warehouse is built, it must be maintained and managed to ensure that growth is controlled, performance expectations are met, and the business continues to derive value from it.

Summary

Successful BI initiatives require a program orientation, a sustainable and flexible architecture, and an iterative development methodology. The program orientation ensures that all of the individual efforts are coordinated and that the work performed in one project is leveraged in subsequent projects. The architecture needs to separate the data warehouse, which serves as the collection point for data residing in the operational systems, from the data marts, which serve as the primary access point for the business community. The methodology consists of program management activities, as well as compatible “getting data in” and “getting information out” phases in the individual projects.