Multiresolution community detection for mega-scale networks by information-based replica correlations

  • 5 December 2008
Abstract
We use a Potts model community detection algorithm to accurately and quantitatively evaluate the hierarchical or multiresolution structure of a graph. We calculate the correlations among multiple copies ("replicas") of the same graph over a range of resolutions; multiresolution structures then manifest themselves as strong correlations between the individual replica solutions. The average Normalized Mutual Information, the Variation of Information, and other measures in principle give a quantitative estimate of the "best" resolutions and indicate the relative strength of the structures in the graph. Because the method is based on information comparisons, it can in principle be used with any community detection model that can examine multiple resolutions. As a local measure, our Potts model avoids the "resolution limit" that affects other popular models. With this model, our community detection algorithm has an accuracy that ranks among the best of currently available methods. Using it, we can examine graphs with over 40 million nodes and more than one billion edges. We further report that the multiresolution variant of our algorithm can accurately solve systems of at least 200,000 nodes and 10 million edges on a single processor. For typical cases, we find a super-linear scaling, $O(L^{1.3})$, where $L$ is the number of edges in the system.
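
To illustrate the replica comparison described in the abstract, the following is a minimal sketch of how the average Normalized Mutual Information (NMI) and Variation of Information (VI) could be computed over all pairs of replica solutions at one resolution. It is not the authors' implementation. It assumes each replica is given as a list of community labels indexed by node, uses the common normalization $2I(A,B)/(H(A)+H(B))$ for NMI, and all function names (`entropy`, `mutual_information`, `nmi`, `vi`, `replica_averages`) are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(labels):
    """Shannon entropy H(A) of a partition given as one label per node."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information I(A, B) between two partitions of the same nodes."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))  # number of nodes shared by each community pair
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """Normalized mutual information, assumed here as 2 I / (H(A) + H(B))."""
    h = entropy(a) + entropy(b)
    # Two single-community partitions carry no information; treat as identical.
    return 1.0 if h == 0 else 2.0 * mutual_information(a, b) / h


def vi(a, b):
    """Variation of information, H(A) + H(B) - 2 I(A, B)."""
    return entropy(a) + entropy(b) - 2.0 * mutual_information(a, b)


def replica_averages(replicas):
    """Average NMI and VI over all distinct pairs of replica solutions."""
    pairs = list(combinations(replicas, 2))
    avg_nmi = sum(nmi(a, b) for a, b in pairs) / len(pairs)
    avg_vi = sum(vi(a, b) for a, b in pairs) / len(pairs)
    return avg_nmi, avg_vi


if __name__ == "__main__":
    # Three hypothetical replica solutions for a 6-node graph. The first two
    # agree up to a relabeling of communities; the third differs on one node.
    replicas = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1, 1],
    ]
    print(replica_averages(replicas))
```

Under these assumptions, a resolution at which the replicas agree strongly would show an average NMI near 1 and an average VI near 0; sweeping the resolution parameter and recording both averages gives the kind of curve from which well-defined resolutions could be read off.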
