Social network analysis: methods and applications.
Social data analysis is rapidly gaining popularity worldwide, driven by the emergence of online social networking services in the 1990s. This growth is also tied to the phenomenon of personal data socialization: biographical facts, correspondence, diaries, photos, videos, audio recordings, travel notes, etc. have become publicly available. Social networks are therefore a unique source of data about real people's personal lives and interests. This offers an unprecedented opportunity to address research and business objectives, as well as to create auxiliary services for social network users. The paper describes the basic components of the ISPRAS technology stack for social network data analysis. Particular attention is given to the tasks, methods, and applications of network (social connections between users) and textual (user messages and profiles) data analysis: demographic attribute detection, event detection in message corpora, user identity resolution, community detection, and influence measurement. Distributed implementations of certain methods using Apache Spark are also described. Collecting social data is associated with a number of well-known issues, including privacy, lack of structure, access restrictions, and data size. Therefore, means of acquiring input data are also considered: collecting real data through the web interfaces of social services (Facebook, Twitter, Hunch) and generating random social graphs that include profile attributes, social ties, community memberships, and textual messages for each user. For each of the developed tools, we describe its functionality, use cases, basic steps of the underlying algorithms, and experimental results.
Proceedings of the Institute for System Programming, vol. 26, issue 1, 2014, pp. 439-456.
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).