Proceedings of ISP RAS


A method of automatically estimating user age using social connections

A. Gomzin (ISP RAS, Moscow, Russia, MSU, Moscow, Russia)
S. Kuznetsov (ISP RAS, Moscow, Russia, MSU, Moscow, Russia, MIPT, Dolgoprudny, Russia)

Abstract

This paper addresses the problem of detecting the age of social network users. Social networks let users fill in profiles that may include their age. Because profiles are often incomplete, the task of inferring unknown attributes arises. Both explicit and predicted values are used in recommender and marketing systems. Predicted values can also be used to determine the demographic profiles of online communities and to infer the target audience of online marketing campaigns. This paper proposes a method for predicting missing age values. The method uses two kinds of information available from the social network: the ages that users state explicitly and the social graph. The nodes of the graph represent users and communities. A community is a special page that unites users by interest. Friendship relations between users and users’ subscriptions to communities are represented as edges of the social graph. The method is based on label propagation in the friendship and subscription graphs. Users’ ages are represented by labels that are propagated through the graph. The algorithm proceeds as follows: initialize user labels from explicit profiles; build a vector model that contains the distributions of neighbours’ ages grouped by user age; compute weights of users and communities, propagate labels to communities; build a vector model that takes the calculated weights into account; propagate labels to users who have not specified their age in their profile. The paper describes the algorithm and presents experimental results showing that friendship relations are more useful than communities for age prediction in the social network.
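
The first and last steps of the scheme above (initializing labels, building the vector model, propagating labels to users without a stated age) can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the friendship graph, skips the weighting and community steps, and performs a single propagation pass. The function names, the toy data, and the use of cosine similarity to compare age distributions are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter, defaultdict
from math import sqrt


def neighbour_age_histogram(user, friends, ages):
    """Count the explicitly known ages among a user's friends."""
    return Counter(ages[f] for f in friends.get(user, ()) if f in ages)


def build_age_model(friends, ages):
    """Vector model: for each explicit age, the summed age histogram
    of the friends of all users who state that age."""
    model = defaultdict(Counter)
    for user, age in ages.items():
        model[age].update(neighbour_age_histogram(user, friends, ages))
    return model


def cosine(a, b):
    """Cosine similarity of two sparse histograms (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def predict_age(user, friends, ages, model):
    """Pick the age whose model histogram best matches the user's friends.
    Returns None when no friend has a known age; ties go to the lowest age."""
    hist = neighbour_age_histogram(user, friends, ages)
    if not hist:
        return None
    return max(sorted(model), key=lambda age: cosine(hist, model[age]))


# Hypothetical toy graph: two friend groups, users "x" and "y" have no stated age.
friends = {
    "a": ["b", "c", "x"], "b": ["a", "c", "x"], "c": ["a", "b", "x"],
    "d": ["e", "f", "y"], "e": ["d", "f", "y"], "f": ["d", "e", "y"],
    "x": ["a", "b", "c"], "y": ["d", "e", "f"],
}
ages = {"a": 20, "b": 20, "c": 21, "d": 40, "e": 41, "f": 40}

model = build_age_model(friends, ages)
predicted = {u: predict_age(u, friends, ages, model) for u in ("x", "y")}
```

In this toy graph each user without a stated age is assigned an age drawn from their own friend group. A fuller version would repeat the pass, feeding predicted labels back into the graph, and would add the subscription graph with the user and community weights mentioned above.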

Keywords

social media, demographic attributes, vector model, social graph, label propagation

Edition

Proceedings of the Institute for System Programming, vol. 28, issue 6, 2016, pp. 171-184.

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2016-28(6)-12

Full text of the paper in pdf (in Russian)