Data warehousing systems: How theywork and what they can do for your company
By now you probably have heard the ITbuzzwords “data warehouse” – and wondered what a data warehouse is and why asewn products company would be interested in it. As the term suggests, a data warehousecontains a large amount of data. It is a way of organizing business information that canprovide better management visibility and more insight than the traditional informationsystems used to support day-to-day operations.
A data warehouse won’t place the orders,pay the suppliers, control the shipments or generate the invoices. What it will do isprovide a clear, high-level picture of these processes, and offer a solid base forperforming business analysis. An in-depth look at the boundaries and configurations ofdata warehouses will provide a better understanding of how your company can use such toolsto harvest information and improve operations.
Which Systems Do What
The applications that typically support a company’s operations are transaction-oriented,and their purpose is to make operations work smoothly and efficiently and keep track ofbusiness transaction details as they happen. They usually include a database managed by adatabase management system, or a large set of data files managed by the applicationitself. The technology for managing these systems has improved dramatically over the past10 years, with a focus on improving the efficiency of data acquisition and structuring. Asa result, today’s systems offer faster performance, higher integrity at the transactionlevel, less contention among users and bigger data volumes.
While these improvements have advanced thefunctionality of transaction systems, they still are not equipped to provide a high-levelview of data, which is required for setting strategic direction, setting goals andevaluating operations effectiveness. Generally speaking, operational applicationstypically are designed from the point of view of a single business process, and they aredeveloped or purchased at different times – usually with little thought of combining thesystems’ data into an integrated enterprise-level picture. It is not surprising, then,that the reports generated by transaction systems are incomplete, inconsistent with eachother, and not very useful for managers working above the operations level.
A data warehouse, on the other hand,includes a database that is designed for accessing data in a form that is useful forbusiness analysis and management decision support. As a result, data warehouse design isapproached quite differently from transaction database design. The favored approach useswhat is called a “star schema,” with fact tables and conformed dimension tables,rather than the traditional normalized, entity-relationship design that has beensuccessful for transaction systems.
This data structure is advantageous becausebusiness people easily understand it, and they can create queries to get needed data.Additionally, the query response time is reasonable, even when processing huge volumes ofdata.
Over the past several years, database software vendors have broadened their product lines,and most of the major players now offer excellent capability for decision support as wellas transaction processing. Moreover, rapid growth in data volume, number of users andcomplexity of queries has become characteristic of data warehouse systems. As a result,”scalability” and parallel processing features are particularly important. Thismeans that you won’t have to scrap your system and start over as your requirementsincrease.
Although the data warehouse uses the samecore data that is collected by transaction systems, the data is selectively extracted fromoperational source systems. It is “cleaned” of errors and inconsistencies;transformed to fit the data warehouse platforms; and integrated and accumulated to supportbusiness analysis.
This data acquisition process is the hardpart of data warehousing. It’s complex, expensive, time consuming and not very glamorous.The degree of difficulty depends on how many different data sources are involved, howaccessible they are and the quality level of the data. The task can be accomplished bywriting custom transformation programs, but is better managed with “extract,transformation and loading” (ETL) or “data staging” tools. These softwaretools are now in their second generation, and are available from a number of highlyqualified and experienced vendors on today’s full range of computing platforms. Althoughexpensive, they are usually a better long-term investment than the custom programmingapproach.
Once data is cleaned and properlystructured, the business user community can access the data warehouse with “businessintelligence” tools. At this level, a data warehouse can provide:
- a “managed reporting” environment with regular distribution of predefined reports to managers who want reliable and consistent information without fancy analysis;
- a “business analysis” environment with ad hoc queries and drill-downs so analysts can study the data, notice the trends, make the comparisons and try to answer the “why” questions;
- a sophisticated “statistical analysis” environment for serious modeling and research; and
- a “data mining” environment for discovering patterns that are not readily apparent or even suspected.
The large number of business intelligencetools currently available continues to grow and change rapidly. Yet no single tool does itall, so it is common to utilize several different ones to meet the full range ofrequirements in a complex business. This is the interesting part of data warehousing.
For example, a data warehouse can provideanalytical reports on order characteristics for order management, where data from theorder entry system and the inventory management system are consistent and correctlycombined. The same calculations and business rules are used for all reports, so you getthe same answer regardless of who asks the question. Users can directly compare salesacross different product lines or across different retail outlets. They also will have acomplete and accurate historical record so that they can confidently compare sales fromseason to season, or with the same season in a previous year.
No Quick Fix
In spite of what the vendors may say, there is no one quick and easy way that datawarehousing works. Getting the design right requires knowledge of the business as well asthe technology. And since no single vendor makes all the components, building a datawarehouse always requires a fair amount of system integration.
Remember that a data warehouse is along-term, strategic commitment. A large part of the value comes from the consistenthistorical perspective of your operations. Don’t think of it as a one-time project,because experience suggests that usage will grow and evolve over time. Implementationshortcuts and quick fixes won’t serve your company well in the long run. Take the time tomap out the process and put the correct resources in place before you begin.
Chuck Williams is a manager for KurtSalmon Associates (KSA). KSA is a leading management consulting firm specializing inretailing, consumer products and health care. KSA works with customers and suppliersthroughout the supply chain, from raw materials suppliers to the ultimate consumer. In themanufacturing sector, KSA works with both soft goods and hard goods suppliers. In theretailing sector, KSA consults to mass merchants, grocery wholesalers and retailers,direct marketers, DIY (Do It Yourselfers) stores and department and specialty stores.