Data Warehousing, Part III: Why?

By nate

(This was originally posted to the NTEN Data Warehousing group list, by Nate Wilbert. Nate has kindly allowed us to publish his work here.)

In the first two posts I've discussed how data warehousing is similar to a large puzzle with many pieces, and how that puzzle is different from typical database "puzzles" created within an organization.

I think that perhaps I should back up and provide some reasons for even thinking about a data warehouse.

1. One version of the truth

The data warehouse will be your central data repository and as much as possible all reporting should be done there. If this is the case, when people from your organization go to meetings they shouldn't be carrying contradictory reports in with them. Here's a typical scenario I've heard before creation of the data warehouse.

Manager #1: "My numbers for last month show we're on target to meet our goals for the year."

Manager #2: "That's funny. My numbers for last month indicate we're behind!"

Manager #1: "Hmmm. Where did you get those numbers from?"

Manager #2: "The ABC and XYZ databases. Isn't that where you get yours?"

Manager #1: "No. I go to another database altogether."

Creating the data warehouse will give people one place to go for their numbers and help reduce or prevent confusion around how numbers are gathered.

2. Improved performance in your transactional systems

If an organization doesn't have a data warehouse, they will probably have to report off the transactional system (e.g. Blackbaud RE, Mpower, Siebel, etc.). Your transactional system typically stores data related to a particular facet of your operations (e.g. donor data, financial data, research, etc.). If you are running reports straight against those systems you are using the same resources that must be used to load the system.

For example, if I run a report against the Blackbaud Raiser's Edge fundraising systems that may slow down or even prevent in some extreme cases anyone from loading data into the system.

3. Reduced wait time for reporting

Without a data warehouse, people may have to visit multiple data sources to create the reports they need. Going to all of those sources of data takes time. Wouldn't it just be easier to have one place to go to for your reporting needs? I've worked with people who had previously taken days to put a report together that they could now get in minutes.

Moreover, as I mentioned in the previous post, the way data is stored in a data warehouse is much different than the way your transactional systems store data. That is, we may store data in more than one place or even summarize the data, all in an effort to make it as fast as possible to pull the data out for a report.

4. Improved data quality and processes

Every single project I've been involved in has hit data quality issues at one point or another. And not just on our side! :-) Usually when we start pulling data in from various sources we find that the data isn't as clean as it could be.

For example, perhaps there's an analyst out there who doesn't follow the process for updating their weekly TPS report. When we start pulling their data over we notice right away that certain fields are never populated. As a result we can address the issue with them and improve the quality of the data.

It's not a top ten list, but hopefully this will give you some useful points for discussion when the topic of reporting comes up in your organization.

Have a great week.

Nate Wilbert