At Decision Technology, we often hear about the struggles and overhead of creating and maintaining a data warehouse, particularly for small and midsize enterprises. That’s why we found the article below so relevant as well as entertaining. In the world of data warehousing, many people assume that bigger and more complicated is better. The article points out that this can be an erroneous, and costly, conclusion.

With DecisionCentric software, organizations can derive business intelligence from virtually any combination of databases, without data warehouse software and for a much lower total cost of ownership. By employing Data Federation to integrate your data, we can eliminate or postpone the need for data warehouse software or ETL processes. Our approach also eliminates the long start-up and deployment times associated with "Big BI" software.

If you’re considering data warehouses, data marts or ETL software, ask yourself, “Do I really need this?” And then consider Decision Technology, where “Smarter Business Intelligence” provides more enterprise power.

Reality IT: Who Needs a Data Warehouse?

Column published on DMReview.com, September 1, 2005

—by Gabriel Fuchs

Our data warehouse is so big, it only comes into work when it feels like it.

At my job, we have realized that we need a data warehouse. It was actually quite easy to make this decision - everyone else has one, so we should obviously have one as well. In order to be better than everyone else, however, we have not stopped at a data warehouse. We decided to go for the whole kit: an operational data store (ODS) that feeds an enterprise data warehouse that feeds a load of domain-specific data marts. Now that we have a killer infrastructure, we just have to figure out what we want out of all the stuff.

At the risk of being sacrilegious, I would nevertheless like to ask the following: How many organizations actually need the data warehouse architecture that they have implemented? Yes, reports and analyses may be quicker to produce, but given the limited number of power users, IT is still largely responsible for providing the bulk of reports. Consequently, many organizations are very much where they were 20 years ago. It may not be SQL programming anymore, and the Web has certainly improved the distribution of reports and analyses, but IT is still very much in demand when new reports and analyses are needed. And if you want more data in a report or for preparing analyses, IT is definitely involved in extracting this data from the legacy systems.

In many cases, when there is a need to put new data in a data warehouse (or an ODS or a data mart), what happens? After all, some necessary data does tend to be forgotten and left out during the initial definition and modeling of a data warehouse. Well, the user goes to IT with a request, and IT answers with when it can be done. In reality, it will often take some time to update a data warehouse. If the data warehouse is fed through an ODS, more work is needed. If there are data marts involved, these need to be fed as well. There is a lot of feeding going on in such a situation. I am not going to tell you what happens when the source systems that feed the ODS change. Let's just say that it may take some time, resources and yet more feeding to get whatever you want out of the whole hodgepodge.

The point here is to ask oneself whether a "complete" data warehouse architecture, i.e., ODS, enterprise data warehouse and specific data marts, is actually more cost-efficient than a smaller solution, e.g., a single data warehouse. As an example, let's use the customer dimension - something that is likely to exist in several data marts, at least if the organization is interested in its customers. Modify it in one data mart and it will have to be modified elsewhere (or you can forget having one version of the truth). Modify it directly in the enterprise data warehouse and its repercussions and validity in the affected data marts still need to be verified. Domain-specific data marts tend to depend on each other in one way or another, so the more data marts there are, the more complex the maintenance risks becoming.

A data warehouse is nice, but it is always likely to increase the workload for the IT department. If the resulting benefits do not outweigh the extra expenses, the data warehouse has failed. Check the number of users that actually benefit from the whole thing and ask yourself if these users are now so much more efficient that this covers the extra expenses.
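The check described above can be put into a few lines of arithmetic. What follows is only a rough sketch: the class name, the field names and every figure are made-up assumptions for illustration, not numbers from this column or from any real project.

```python
from dataclasses import dataclass


@dataclass
class WarehouseCase:
    """Hypothetical inputs for a back-of-the-envelope check (all assumed)."""
    annual_extra_cost: float     # extra yearly IT expense caused by the warehouse
    benefiting_users: int        # users who actually work faster because of it
    hours_saved_per_user: float  # hours each of those users saves per year
    hourly_rate: float           # assumed cost of one user hour

    def annual_benefit(self) -> float:
        # Value of the time saved by the users who actually benefit.
        return self.benefiting_users * self.hours_saved_per_user * self.hourly_rate

    def net_benefit(self) -> float:
        # Positive means the benefits cover the extra expenses.
        return self.annual_benefit() - self.annual_extra_cost

    def break_even_users(self) -> float:
        # Number of benefiting users needed to cover the extra expenses.
        return self.annual_extra_cost / (self.hours_saved_per_user * self.hourly_rate)


# Illustrative figures only.
case = WarehouseCase(annual_extra_cost=300_000, benefiting_users=25,
                     hours_saved_per_user=100, hourly_rate=80)
print(case.annual_benefit())
print(case.net_benefit())
print(case.break_even_users())
```

With these invented figures, 25 users save time worth 200,000 a year against 300,000 of extra expenses, so the warehouse would need 37.5 benefiting users just to break even.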

As cool as a complex data warehouse architecture may be, it is not the number of resulting data marts that will be the indicator of success. More data marts will not necessarily mean higher returns on investment. More is not necessarily more. Instead, more can often be less. More is cooler, though.

At my job, we are back to square one. Whenever we need data that is not in our big and, therefore, cool data warehouse system, we have to wait for this data to be integrated. And wait. Often, it would have been quicker and more efficient to get the data directly from the source systems, but we cannot do that. What would the use of our data warehouse be if we were allowed to circumvent it? Our data warehouse system has locked us all into a situation where, whenever the requested data is not in it, the data has to be put there - no matter how much time it may take. The result is that the whole system often makes us less efficient than we were before we were endowed with this rich and cool architecture.

So, do not get into a situation where your data warehouse has become so big that you are all a part of it, and it is all a part of you.

And if your data warehouse is so big that Stephen Hawking has a theory about it, then you might really need to ask yourself questions about its efficiency.

Gabriel Fuchs is an independent consultant and analyst within the field of business intelligence. He is also the author of numerous industry articles as well as a regular speaker at conferences. His column, "Reality IT," reflects a variety of different people's experiences, together with some common sense. He can be contacted at sgfuchs@bluewin.ch.

Reprinted with Permission from Gabriel Fuchs