Ralph Kimball defined data warehousing as “that part of relational technology that deals with getting the data out” (1995). As the primary audience for extracted data is, as yet, human beings, data warehousing may be seen as a user-centric, or bottom-up, paradigm for data stores. Coronel et al. (2011) state that data marts are “more manageable data sets that are targeted to meet the special needs of small groups within the organization”. On the surface, then, data marts are also a bottom-up method for designing and accessing data stores. However, their more manageable scale lends itself to faster top-down design than an entire warehouse allows, as a whole mart can be more easily visualized. Kimball later wrote (2011) that, in his view, data warehousing also “includes the extract-transform-load (ETL) and business intelligence (BI) functions.” To deliver this level of functionality in a timely manner, it may well be necessary for resource-bound organizations to begin a data warehousing initiative by implementing data marts. This could lead to data redundancy or inconsistency between marts, which can be addressed by carefully evaluating and enforcing dimensional conformance (Kumar, 2007, p. 2). Inmon, however, says that “the starting point for the design and development of the data warehouse model is the data model” (2000). The same could be said of the design and development of data marts. That such a model, applied across the entire warehouse, would preclude inconsistencies seems obvious. As neither position is obviously invalid, it is likely that an approach which balances them will be successful.
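To make the idea of dimensional conformance concrete, the following is a minimal sketch of one way inconsistencies between two marts might be detected. It assumes, purely for illustration, that each mart holds its own copy of a customer dimension keyed by a shared business key. The names `conformance_issues`, `sales_customer` and `support_customer` are hypothetical and do not come from Kumar or Kimball; a real conformance review would also cover attribute names, hierarchies and grain.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory dimension: business key -> attribute name -> value.
Dimension = Dict[str, Dict[str, str]]


def conformance_issues(dim_a: Dimension, dim_b: Dimension) -> List[Tuple[str, str, str, str]]:
    """Return (key, attribute, value_a, value_b) wherever two marts
    disagree about an attribute they both hold for the same business key."""
    issues = []
    for key in sorted(dim_a.keys() & dim_b.keys()):
        row_a, row_b = dim_a[key], dim_b[key]
        for attr in sorted(row_a.keys() & row_b.keys()):
            if row_a[attr] != row_b[attr]:
                issues.append((key, attr, row_a[attr], row_b[attr]))
    return issues


# Illustrative data only: two marts that each keep a customer dimension.
sales_customer = {
    "C001": {"name": "Acme Ltd", "region": "North"},
    "C002": {"name": "Bolt plc", "region": "South"},
}
support_customer = {
    "C001": {"name": "Acme Ltd", "region": "North-East"},
    "C003": {"name": "Crate Inc", "region": "West"},
}

issues = conformance_issues(sales_customer, support_customer)
print(issues)
```

In this sketch only customer C001 appears in both marts, and the two copies disagree on its region, which is the kind of inconsistency a conformed dimension is meant to rule out.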
Nishith (2006) observes that “there are far too many conflicting and confusing definitions of Data Mart and Data Warehouse” and that the “debate” between the Kimball and Inmon schools of thought “only adds to the confusion.” Inmon worries that “the data warehouse environment is one where many requirements CANNOT be discerned until the data warehouse environment is built and the data in the warehouse is available for analysis” (Ibid.). While this concern is legitimate, it certainly does not preclude asking users what data they know they will need from the warehouse, or what they might want. It is also difficult to imagine how one begins a data model without communicating with the people who will populate and use it. It is therefore not clear how this approach fundamentally conflicts with some bottom-up procedures.
Another paradigm for building data warehouses is the data vault. Linstedt (2002) defines it as “a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business.” He espouses it as a hybrid approach. His commitment to balancing the organizational advantages of top-down design with the speed of implementation of bottom-up design gives the concept some appeal. However, the level of detail he provides presupposes considerable literacy in the design and implementation of data warehouses (Ibid.), and so successful application of his paradigm may often require previous experience in both bottom-up and top-down methodologies.
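As a rough illustration of what a “historical tracking and uniquely linked set of normalized tables” might look like, the sketch below uses the three table types usually associated with the data vault: hubs (business keys), links (relationships between hubs) and satellites (dated descriptive attributes). It is a heavily simplified in-memory model under assumed names (`Hub`, `Link`, `Satellite`, `attributes_as_of`), not Linstedt's specification, which also covers details such as surrogate keys and record sources.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Hub:
    """A unique business key, e.g. a customer or order number."""
    business_key: str


@dataclass(frozen=True)
class Link:
    """A relationship between two hubs."""
    left: Hub
    right: Hub


@dataclass(frozen=True)
class Satellite:
    """Descriptive attributes of a hub as loaded on a given date."""
    parent: Hub
    load_date: date
    attributes: Tuple[Tuple[str, str], ...]  # (name, value) pairs


def attributes_as_of(history: List[Satellite], hub: Hub, as_of: date) -> Optional[Dict[str, str]]:
    """Return the most recent attributes loaded for hub on or before as_of."""
    rows = [s for s in history if s.parent == hub and s.load_date <= as_of]
    if not rows:
        return None
    return dict(max(rows, key=lambda s: s.load_date).attributes)


# Illustrative data only.
customer = Hub("C001")
order = Hub("O-1001")
placed = Link(customer, order)

# History is kept by adding satellite rows, never by overwriting them.
history = [
    Satellite(customer, date(2011, 1, 1), (("region", "North"),)),
    Satellite(customer, date(2011, 6, 1), (("region", "South"),)),
]

earlier = attributes_as_of(history, customer, date(2011, 3, 1))
latest = attributes_as_of(history, customer, date(2011, 12, 31))
print(earlier, latest)
```

Because a change of region is recorded as a new satellite row rather than an update, the same query can answer both what the customer looks like now and what it looked like at an earlier date.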
A data warehouse is a data store that attempts to capture all pertinent information about an organization and deliver it to users. Data marts are subject-constrained data stores with a similar purpose. Whether it is better to break a warehouse into marts or to build it out of them may be a fascinating argument, but creating a data store that users find effective is probably a better use of resources.
Inmon, W. (2000) Building the Data Warehouse: Getting Started [Online]. Available from: http://inmoncif.com/inmoncif-old/www/library/whiteprs/ttbuild.pdf (Accessed: 26 August, 2011)
Kimball, R. (1995) 'The Database Market Splits', DBMS, September 1995, in The Kimball Group Reader: Relentlessly Practical Tools for Data Warehousing and Business Intelligence (2010) [Online]. Available from: http://site.ebrary.com.ezproxy.liv.ac.uk/lib/liverpool/docDetail.action?docID=10369792 (Accessed: 26 August, 2011)
Kimball, R. (2011) The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics [Online]. Available from: http://www.informatica.com/downloads/1597_EDW_Big_Data_Analytics_Kimball.pdf?elq=e4589c7c3d6c49b2b28f0662cca6214c (Accessed: 26 August, 2011)
Kumar, N. (2007) Inmon vs. Kimball [Online]. Available from: http://people.rit.edu/nxk2100/images/Inmon%20versus%20Kimball.pdf (Accessed: 26 August, 2011)
Linstedt, D. (2002) Data Vault Series 1 – Data Vault Overview [Online]. Available from: http://www.tdan.com/view-articles/5054/ (Accessed: 26 August, 2011)
Nishith (2006) Data Mart versus Data Warehouse - The Great Debate [Online]. Available from: http://opensourceanalytics.com/2006/03/14/data-mart-vs-data-warehouse-the-great-debate/ (Accessed: 26 August, 2011)