Proceedings of ISP RAS


Some problems on graph databases

R.I. Guralnik (SPBU, St. Petersburg, Russia)

Abstract

Graph databases appear to be the most popular and relevant kind of non-relational database. Their popularity stems from how easily they can be applied to problems in which the data contain large numbers of relations, such as protein-protein interaction. With the spread of fast internet connections, graph databases found another interesting application in the representation of social networks. Moreover, graph edges are stored directly, which lowers the computational cost of graph traversal. Such systems proved natural and in demand in the era of the Internet and social networks. The largest and most significant class of graph database problems is data mining. It includes association rule learning, data classification and categorization, clustering, regression analysis, and others. This review considers the data mining problems on graph databases that appear most often in the modern literature. Their popularity is reflected in the large number of publications on these problems at major conferences in recent years. The problems examined are influence maximization, motif mining, pattern matching, and SimRank. For each type of problem, we analyze different papers and describe the basic algorithms proposed 10-15 years ago. We also consider state-of-the-art solutions as well as some important intermediate versions. The review consists of 6 sections: besides the introduction and conclusion, each section is dedicated to one type of graph database problem.

Keywords

graph databases; motif mining; influence maximization; pattern matching; SimRank

Edition

Proceedings of the Institute for System Programming, vol. 28, issue 4, 2016, pp. 193-216.

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2016-28(4)-12

Full text of the paper in PDF (in Russian)