Data Warehousing, Part V: Small and Agile

By nate

(This was originally posted to the NTEN Data Warehousing group list, by Nate Wilbert. Nate has kindly allowed us to publish his work here.)

I received an email today about an NTEN webinar with the headline "Bust Those Silos: Reporting Across Multiple Systems." This was interesting to me because I thought, great! Someone is going to talk about data warehousing! Well, not quite I think.

Here was a quote: "Is it possible to pull key data across multiple systems without building and maintaining a complex and sophisticated data warehouse? There are a number of approaches and we'll take a look at some and create dialogue around the different options."

I'm not sure which approaches he will cover. Hopefully it will be helpful, but I did quickly want to say that there's this idea that building a data warehouse is some huge undertaking. In the past, that's probably been more true.

But speaking from recent personal experience, and considering all the industry emails I get about "agile data warehousing", I'd say that using a team of as little as 3 people it's possible for you to get a data mart operational within 6-9 weeks. ... It doesn't have to be the world's best data mart. We did this taking a hybrid agile approach, that is, we took some agile principles and applied them to our traditional waterfall data warehouse development processes. We dropped more documentation in favor or code. We focused on smaller, more specific "features" rather than trying to get an entire subject area reporting system in place. I spoke with one company recently who said they actually do 2 week sprints...I'm not exactly sure what their deliverables are, but apparently that's working for them!

For example, we worked with the HR group at our client recently. They have A LOT of reports that they run each month. Headcount reporting, open positions, movement within the organization, and others. We took the movement area (a movement would be new hire, termination, transfer to or from another company / department) and built out a single fact table, very specific in nature with a movement as the grain of the table, and then built out the dimensions attached to that fact table. That meant we brought over employee data, company data, job data as well as a few others. It was a fast paced project, but we completed in about 8 weeks and delivered new tables for this data mart in the warehouse, including the extract, transform, and load processes (ETL) as stored in SQL Server Integration Services, about 10 reports from SQL Server Reporting Services, and ad-hoc analysis capabilities using SQL Server Analysis Services. It was all accessible through Microsoft Office SharePoint Server...oh yeah, and we threw in a few dashboard pages at the end as well.

Whether it's 2 weeks or 8 weeks, I think you can use a small team...heck, maybe even just one person if that's all you have, to deliver very focused "features" as part of a hybrid agile data warehousing approach. If nothing else, just know that there are alternatives to a traditional waterfall project management approach around data warehousing. The rewards for implementing a data warehouse will be worth it!