Clustering via hypergraph modularity
Autoři:
Bogumił Kamiński aff001; Valérie Poulin aff002; Paweł Prałat aff003; Przemysław Szufel aff001; François Théberge aff002
Působiště autorů:
SGH Warsaw School of Economics, Warsaw, Poland
aff001; The Tutte Institute for Mathematics and Computing, Ottawa, ON, Canada
aff002; Department of Mathematics, Ryerson University, Toronto, ON, Canada
aff003
Vyšlo v časopise:
PLoS ONE 14(11)
Kategorie:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0224307
Souhrn
Despite the fact that many important problems (including clustering) can be described using hypergraphs, theoretical foundations as well as practical algorithms using hypergraphs are not well developed yet. In this paper, we propose a hypergraph modularity function that generalizes its well established and widely used graph counterpart measure of how clustered a network is. In order to define it properly, we generalize the Chung-Lu model for graphs to hypergraphs. We then provide the theoretical foundations to search for an optimal solution with respect to our hypergraph modularity function. A simple heuristic algorithm is described and applied to a few illustrative examples. We show that using a strict version of our proposed modularity function often leads to a solution where a smaller number of hyperedges get cut as compared to optimizing modularity of 2-section graph of a hypergraph.
Klíčová slova:
Algorithms – Clustering algorithms – Community structure – Computer and information sciences – Finance – Source code – Telecommunications – Transportation
Zdroje
1. Fortunato S. Community detection in graphs, Physics Reports. 2010; 486: 75–174. doi: 10.1016/j.physrep.2009.11.002
2. Girvan M, Newman MEJ. Community structure in social and biological networks. Proceedings of the National Academy of Sciences. 2002; 99: 7821–7826. doi: 10.1073/pnas.122653799
3. Zhang Z, Liu C. A hypergraph model of social tagging networks. Journal of Statistical Mechanics: Theory and Experiment, vol. 2010, no 10, 2010. doi: 10.1088/1742-5468/2010/10/P10005
4. Shepherd MA, Watters CR., Cai Y. Transient hypergraphs for citation networks. Information Processing & Management, 26(3), 395–412, 2010. doi: 10.1016/0306-4573(90)90099-N
5. Gallo G, Longo G, Pallottino Stefano, Nguyen S. Directed hypergraphs and applications. Discrete applied mathematics, vol. 42, no 2-3, pp 177–201, 1993. doi: 10.1016/0166-218X(93)90045-P
6. Samaga R, Klamt S. Modeling approaches for qualitative and semi-quantitative analysis of cellular signaling networks. Cell communication and signaling, vol. 11 no 1, 2013. doi: 10.1186/1478-811X-11-43
7. Sarkar S, Sivarajan KN. Hypergraph models for cellular mobile communication systems. IEEE Transactions on Vehicular Technology, 47(2), 460–471, 1998. doi: 10.1109/25.669084
8. Newman MEJ, Girvan M. Finding and evaluating community structure in networks. Phys. Rev. E. 2004; 69: 026–113. doi: 10.1103/PhysRevE.69.066133
9. Chung F, Lu L, Connected components in random graphs with given expected degree sequence. Annals of Combinatorics. 2002; 6: 125–145. doi: 10.1007/PL00012580
10. Zhou D, Huang J, Scholkopf B. Learning with hypergraphs: clustering, classification and embedding. Advances in neural information processing systems. 2006; 19:1601–1608.
11. Shi J, Malik J. Normalized cuts and image segmentation. Pattern Analysis and Machine Intelligence. 2000; 22: 888–905. doi: 10.1109/34.868688
12. Rodriguez JA. On the Laplacian spectrum and walk-regular hypergraphs. Linear and Multi-linear Algebra. 2003; 51: 285–297. doi: 10.1080/0308108031000084374
13. Zien J, Schlag M, Chan P. Multilevel spectral hypergraph partitioning with arbitrary vertex sizes. IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems, 1999; 18: 1389–1399. doi: 10.1109/43.784130
14. Agarwal S, Lim J, Zelnik-Manor L, Perona P, Kriegman D, Belongie S. Beyond pairwise clustering. IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2005; 2: 838–845.
15. Agarwal S, Branson K, Belongie S. Higher Order Learning with Graphs, Proc. Int’l Conf. Machine Learning. 2006; 148: 17–24.
16. Voloshim VI, Introduction to Graph and Hypergraph Theory. Nova Kroshka Books; 2013.
17. Karypis G, Kumar V, Multilevel K-Way Hypergraph Partitioning. VLSI Design. 2000; 11: 285–300. doi: 10.1155/2000/19436
18. Chien I, Chung-Yi L, I-Hsiang W. Community detection in hypergraphs: Optimal statistical limit and efficient algorithms. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics. 2018: 84: 871–879.
19. GitHub gist repository, https://gist.github.com/pszufe/02666497d2c138d1b2de5b7f67784d2b.
20. Chung FRK, Lu L. Complex Graphs and Networks. American Mathematical Society; 2006.
21. Seshadhri C, Kolda TG, Pinar A. Community structure and scale-free collections of Erdös-Rényi graphs. Physical Review E. 2012; 85: 056109. doi: 10.1103/PhysRevE.85.056109
22. Kolda TG, Pinar A, Plantenga T, Seshadhri C. A scalable generative graph model with community structure. SIAM Journal on Scientific Computing. 2014; 36: C424–C452. doi: 10.1137/130914218
23. Winlaw M, DeSterck H, Sanders G. An In-Depth Analysis of the Chung-Lu Model. Lawrence Livermore Technical Report LLNL-TR-678729. 2015.
24. Newman M. Networks: An Introduction. Oxford University Press; 2010.
25. Fortunato S, Barthelemy M. Resolution limit in community detection. Proc. Natl. Acad. Sci. USA. 2007: 104: 36–41. doi: 10.1073/pnas.0605965104 17190818
26. Clauset A, Newman MEJ, Moore C. Finding community structure in very large networks. Phys. Rev. E. 2004; 70: 066111. doi: 10.1103/PhysRevE.70.066111
27. Lancichinetti A, Fortunato S. Limits of modularity maximization in community detection. Phys. Rev. E. 2011; 84: 066122. doi: 10.1103/PhysRevE.84.066122
28. Newman MEJ. Fast algorithm for detecting community structure in networks. Phys. Rev. E. 2004; 69: 066133. doi: 10.1103/PhysRevE.69.066133
29. McDiarmid C, Skerman F. Modularity in random regular graphs and lattices. Electronic Notes in Discrete Mathematics. 2013; 43: 431–437. doi: 10.1016/j.endm.2013.07.063
30. McDiarmid C, Skerman F. Modularity of tree-like and random regular graphs. Oxford Journal of Complex Networks. 2018; 6: 596–619. doi: 10.1093/comnet/cnx046
31. Ostroumova Prokhorenkova L, Prałat P, and Raigorodskii A. Modularity of complex networks models. Internet Mathematics. 2017. doi: 10.24166/im.12.2017
32. McDiarmid C, Skerman F. Modularity of Erdos-Renyi random graphs; 2018. Preprint. Available from: https://arxiv.org/abs/1808.02243. Cited 30 April 2019.
33. Leordeanu M, Sminchisescu C. Efficient Hypergraph Clustering. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics. 2012; 22: 676–684.
34. Blondel V.D., Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment. 2008; 10: P10008. doi: 10.1088/1742-5468/2008/10/P10008
35. Poulin V, Théberge F. Comparing Graph Clusterings: Set partition measures vs. Graph-aware measures. 2018. Preprint. Available from: https://arxiv.org/abs/1806.11494. Cited 30 April 2019.
36. Newman MEJ. Community detection in networks: Modularity optimization and maximum likelihood are equivalent. Phys. Rev. E. 2016; 94: 052315.
37. Lu X, Szymanski BK. Asymptotic resolution bounds of generalized modularity and statistically significant community detection. 2019. Preprint. Available from: https://arxiv.org/abs/1902.04243. Cited 30 April 2019.
Článek vyšel v časopise
PLOS One
2019 Číslo 11
- Jak a kdy u celiakie začíná reakce na lepek? Možnou odpověď poodkryla čerstvá kanadská studie
- Pomůže v budoucnu s triáží na pohotovostech umělá inteligence?
- Spermie, vajíčka a mozky – „jednohubky“ z výzkumu 2024/38
- Metamizol jako analgetikum první volby: kdy, pro koho, jak a proč?
- Infekce se v Americe po příjezdu Kolumba šířily nesrovnatelně déle, než se traduje
Nejčtenější v tomto čísle
- A daily diary study on maladaptive daydreaming, mind wandering, and sleep disturbances: Examining within-person and between-persons relations
- A 3’ UTR SNP rs885863, a cis-eQTL for the circadian gene VIPR2 and lincRNA 689, is associated with opioid addiction
- A substitution mutation in a conserved domain of mammalian acetate-dependent acetyl CoA synthetase 2 results in destabilized protein and impaired HIF-2 signaling
- Molecular validation of clinical Pantoea isolates identified by MALDI-TOF
Zvyšte si kvalifikaci online z pohodlí domova
Všechny kurzy