Friendship prediction and homophily in social media
- 1 May 2012
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on the Web
- Vol. 6 (2), 1-33
- https://doi.org/10.1145/2180861.2180866
Abstract
Social media have attracted considerable attention because their open-ended nature allows users to create lightweight semantic scaffolding to organize and share content. To date, the interplay of the social and topical components of social media has been only partially explored. Here, we study the presence of homophily in three systems that combine tagging social media with online social networks. We find a substantial level of topical similarity among users who are close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local similarity between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar interests are more likely to be friends, and therefore topical similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on several datasets, confirming that social networks constructed from topical similarity capture actual friendship accurately. When combined with topological features, topical similarity achieves a link prediction accuracy of about 92%.
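As a rough illustration of the two ideas the abstract describes, the sketch below scores topical similarity between users from their tags and builds a shuffled baseline that keeps each user's annotation count. It is not the paper's method: the cosine measure, the shuffling scheme, the function names `cosine_similarity` and `shuffled_null`, and the toy data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two users' tag-count vectors (assumed measure)."""
    a, b = Counter(tags_a), Counter(tags_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    # Users with no tags have no direction in tag space; treat similarity as 0.
    return dot / norm if norm else 0.0


def shuffled_null(annotations, seed=0):
    """Hypothetical null model: redistribute all tags at random across users
    while keeping the number of annotations per user unchanged."""
    rng = random.Random(seed)
    pool = [t for tags in annotations.values() for t in tags]
    rng.shuffle(pool)
    shuffled, start = {}, 0
    for user, tags in annotations.items():
        shuffled[user] = pool[start:start + len(tags)]
        start += len(tags)
    return shuffled


# Toy data (invented): each user maps to the list of tags they applied.
annotations = {
    "alice": ["jazz", "jazz", "piano", "vinyl"],
    "bob": ["jazz", "piano", "piano"],
    "carol": ["hiking", "maps", "gps"],
}

observed = cosine_similarity(annotations["alice"], annotations["bob"])
null = shuffled_null(annotations)
baseline = cosine_similarity(null["alice"], null["bob"])
print(f"observed={observed:.3f} baseline={baseline:.3f}")
```

Comparing the observed score for a pair of users against the same score under the shuffled baseline gives a sense of how much similarity is left once activity levels alone are accounted for. A link predictor in this spirit would rank candidate pairs by such a score, possibly alongside topological features.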
Funding Information
- Division of Information and Intelligent Systems (IIS-0811994)