Recently at a client, the data warehouse administrator was asked to define a sandbox environment in the production data warehouse for analysts and developers working on a small project. The idea behind this sandbox was to give the team a working area for collaboration and intermediate storage of results while working with the data in a purely ad hoc capacity. It was recognized immediately that this could be the start of something bigger within the organization, something the incumbent business intelligence tools could not currently provide. The response had to be formulated quickly to avoid stifling the creativity of the analysts or, worse, the progress of the project. Care had to be taken as well: if managed incorrectly, the sandbox could get out of hand and become a waste of system resources and a drain on human resources that were already spread thin. The business unit in question was looking to move beyond the confines of the current business intelligence environment and push the edges.
This was a group of analysts who wanted to get their hands dirty and weren’t afraid to fail. They wanted to mash together data in ways the business intelligence tools, with their controlled ad hoc environments, had never allowed. This was data mining for the next set of KPIs that would shape the way the business moves forward.
The concept of agile analytics is not new; eBay presented on and blogged about it in 2008. The idea at this client was simple. Housing the sandbox environments in the existing enterprise data warehouse system would all but eliminate the duplication of data. Groups interested in sharing data between their sandbox environments would be strongly discouraged from doing so until the data had been properly integrated into the production environment. The sandbox environments would also be given a short life expectancy at their inception to prevent prototypes from becoming production and data from ending up in a wasteland. This all sounded great on paper.
In the midst of a development architecture overview, a brief conversation among a few enterprise architects uncovered the potential Screw-Me Scenario that could bring the concept of agile analytics to an untimely demise: “The users of the data warehouse are not permitted to write ad hoc queries outside of a controlled business intelligence tool. They might write a bad query.” Thanks for the warning; we’ll be sure to refine our pitch to the enterprise architects to defuse this scenario before it turns ugly.
In Oliver Ratzesberger’s presentation on eBay’s Analytics as a Service, he acknowledges that the metrics we already know are cheap and the unknown metrics are expensive. But the known metrics are not pushing the edges. Known metrics are found in the middle of the box. Agile analytics is about pushing the edges of how your enterprise data warehouse is used, to improve its response to the needs of the business. It is about the evolution of the user community from one that plays in controlled ad hoc environments to one that is encouraged to experiment with new ideas and not to fear failing along the way. Agile analytics is about encouraging your users to reach out for the edges and P U S H. Only once the edges are stretched can the middle of the box be redefined.
photo by edenpictures via Flickr (Creative Commons License)
