On a model of social network with user communities for distributed generation of random social graphs.
In social community detection, accuracy is commonly evaluated on graphs with a known reference community structure. This paper introduces a method for generating large random social graphs with a realistic structure of user groups. The proposed model reproduces several recently discovered properties of social community structure: dense community overlaps, superlinear growth of the number of edges inside a community with its size, and a power-law distribution of user-community memberships. The method is distributable by design and showed near-linear scalability on the Amazon EC2 cloud with an Apache Spark implementation. The generated graphs possess the properties of real social networks and can be used to evaluate the quality of community detection algorithms on social graphs of more than 10^9 users.
Full text of the paper in PDF (in Russian)
Machine Learning and Data Analysis. 2014. V. 1, № 8. Pp. 1027 - 1047.
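
The three structural properties named in the abstract can be illustrated with a small single-machine sketch. This is not the model from the paper: it is a generic community-affiliation generator written under assumed parameters. The names `generate_graph`, `sample_power_law`, `alpha`, `max_memberships`, and `edge_exponent` are hypothetical. Each user joins a power-law-distributed number of communities, and inside a community of size s every pair is linked with probability s^(edge_exponent - 2), so the expected number of internal edges grows roughly as s^edge_exponent, which is superlinear for 1 < edge_exponent < 2. Users who share several communities get several chances to be linked, which makes overlaps denser than the rest of the graph.

```python
import random
from itertools import combinations


def sample_power_law(rng, alpha, x_min, x_max):
    """Draw an integer in [x_min, x_max] with P(x) proportional to x**(-alpha)."""
    values = list(range(x_min, x_max + 1))
    weights = [x ** (-alpha) for x in values]
    return rng.choices(values, weights=weights, k=1)[0]


def generate_graph(n_users, n_communities, alpha=2.5, max_memberships=10,
                   edge_exponent=1.5, seed=0):
    """Illustrative generator (assumed parameters, not the paper's model).

    Returns (memberships, edges): memberships[u] lists the communities of
    user u, and edges is a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    communities = [[] for _ in range(n_communities)]
    memberships = []

    # Step 1: power-law distribution of user-community memberships.
    upper = min(max_memberships, n_communities)
    for user in range(n_users):
        k = sample_power_law(rng, alpha, 1, upper)
        chosen = rng.sample(range(n_communities), k)
        memberships.append(chosen)
        for c in chosen:
            communities[c].append(user)

    # Step 2: edges inside each community. With pair probability
    # p = size**(edge_exponent - 2), the expected edge count is about
    # size**edge_exponent / 2, i.e. superlinear in the community size.
    edges = set()
    for members in communities:
        size = len(members)
        if size < 2:
            continue
        p = min(1.0, size ** (edge_exponent - 2))
        # Members were appended in increasing user order, so u < v here.
        for u, v in combinations(members, 2):
            if rng.random() < p:
                edges.add((u, v))
    return memberships, edges


if __name__ == "__main__":
    memberships, edges = generate_graph(n_users=1000, n_communities=50, seed=42)
    print(len(memberships), "users,", len(edges), "edges")
```

Because each community draws its edges independently of the others, the second step could in principle be partitioned by community across workers, which is the kind of structure a distributed implementation might exploit. The sketch itself runs on one machine and enumerates all pairs, so it is only suitable for small graphs.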