Proceedings of ISP RAS


MapReduce: within, outside, or side by side with parallel DBMSs?

Sergey D. Kuznetsov.

Abstract

This paper discusses approaches to using MapReduce technology together with analytical DBMSs. Three approaches are considered: implementing MapReduce within the kernel of a parallel DBMS, using MapReduce as the communication infrastructure of a new parallel DBMS, and using MapReduce in symbiosis with a parallel DBMS. As examples of the first approach, we consider features of the massively-parallel DBMSs Greenplum Database and nCluster, from Greenplum and Aster Data Systems respectively. The second approach is used in the HadoopDB project of Yale and Brown universities. The third approach is being developed by Vertica.

Keywords

massively-parallel analytical DBMSs, MapReduce, user-defined function parallelization, communication infrastructure.

Edition

Proceedings of the Institute for System Programming, vol. 19, 2010, pp. 35-70.

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

Full text of the paper in PDF (in Russian).