MapReduce: within, outside, or side by side with parallel DBMSs?
This paper discusses approaches to using MapReduce technology together with analytical DBMSs. It considers three approaches: implementing MapReduce within the kernel of a parallel DBMS, using MapReduce as the communication infrastructure of a new parallel DBMS, and using MapReduce in symbiosis with a parallel DBMS. As examples of the first approach, we consider features of two massively parallel DBMSs: Greenplum Database from Greenplum and nCluster from Aster Data Systems. The second approach is used in the HadoopDB project of Yale and Brown universities. Finally, the third approach is being developed by Vertica.
Proceedings of the Institute for System Programming, vol. 19, 2010, pp. 35-70.
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).