Learning and Scaling Directed Networks via Graph Embedding
Reliable evaluation of network mining tools implies significance and scalability testing. This is usually achieved by picking several
graphs of various size from diffierent domains. However, graph propertiesand thus evaluation results could be dramatically diddierent from one domain to another. Hence the necessity of aggregating results over a multitude of graphs within each domain.
The paper introduces an approach to automatically learn features of a directed graph from any domain and generate similar graphs while scaling input graph size with a real-valued factor. Generating multiple graphs with similar size allows signifficance testing, while scaling graph size makes scalability evaluation possible. The proposed method relies on embedding an input graph into low-dimensional space, thus encoding graph features in a set of node vectors. Edge weights and node communities could be imitated as well in optional steps.
We demonstrate that embedding-based approach ensures variability of synthetic graphs while keeping degree and subgraphs distributions close to the original graphs. Therefore, the method could make signifficance and scalability testing of network algorithms more reliable without the need to collect additional data. We also show that embedding-based approach preserves various features in generated graphs which can't be achieved by other generators imitating a given graph.
The paper has won Best Stdent Paper Award at ECML-PKDD 2017.
Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.