Exploration Warehouses in the Cloud: Substance or Hot Air?
by Curt Hall, Senior Consultant, Cutter Consortium
I think that many end-user organizations are going to have serious reservations about deploying their data warehouses permanently to the cloud, at least initially. This is quite understandable, given that most companies tend to view their data -- especially customer data -- as a strategic asset. In fact, according to my research, the number-one reason that organizations do not use on-demand BI and data warehousing is that they deem data analysis simply too important a strategic function to outsource (see "Corporate Adoption of On-demand BI and Data Warehousing: Trends and Directions," Business Intelligence Executive Update, Vol. 7, No. 18). Moreover, let's face it: the concept of deploying large data warehouses in the cloud is still relatively new and unproved. That said, one application for which I do see end-user organizations trying out cloud-based data warehouses is quickly launching exploration warehouses that analysts can use to run very complex queries on large data sets.
To make sure we're all on the same page, the exploration warehouse offers just what its name implies: a data warehouse designed to give users -- typically analysts and other power users -- an environment that supports truly ad hoc queries and "anything goes" exploratory data analysis. Such users need quick and ready access to a lot of current data on which they can issue the complex, long-running queries associated with data mining and other advanced analytic applications. Additionally, the purpose of the exploration warehouse is to make its users as independent as possible of IT for development and support. Finally, the exploration warehouse is typically kept separate from an organization's central or enterprise data warehouse in order to avoid overtaxing and bogging down everyday BI and reporting applications.
Considering the above criteria, cloud-based exploration warehouses and marts seem to make a lot of sense. Because cloud-based services are licensed on a pay-as-you-go model, and because the provider frequently offers implementation expertise as well, companies can get a dedicated, high-performance exploration warehouse up and running fairly painlessly without incurring the up-front data center costs and delays typically associated with traditional data warehouse development and operation. For the most part, the traditional hardware and software purchase and licensing costs get thrown out the window with cloud computing. Instead, because the computing resources are owned and operated by the cloud provider in its own data center, the end-user organization can "rent" just the hardware and software needed, for the amount of time needed. For an exploration warehouse, this could range from two to six months or so, for example. After this period, the end-user organization can "pull the plug" on the project. Or, should data volumes grow or further complex analyses prove necessary, the end-user organization can rent the additional computing resources needed to meet its analytic processing requirements.
Finally, the very nature of the exploration warehouse tends to make it well suited for the cloud. Because it is used mainly for exploratory analysis, an interruption of service would certainly be an inconvenience. However, it is unlikely to cause the kind of substantial disruption to business operations that is likely to arise should your production data warehouse (and production reporting systems, etc.) become unavailable for any length of time.
I'd like to get your comments about the use of cloud-based data warehouses and on-demand BI in general. Do you have reservations about deploying your organization's data warehouse to the cloud? And what do you think of the concept of utilizing cloud-based exploration warehouses? Is this a viable alternative to on-premises data warehouses? If not, what do you think is required before cloud-based warehouses are more widely accepted? As always, your comments will be held strictly confidential. E-mail me at chall@cutter.com or call +1 510 848 7417.
Sincerely,
Curt Hall, Senior Consultant
Business Intelligence Practice
E-mail: chall@cutter.com
