Improving graphs of cycles approach to structural similarity of molecules
Autoři:
Stefi Nouleho Ilemo aff001; Dominique Barth aff001; Olivier David aff002; Franck Quessette aff001; Marc-Antoine Weisser aff003; Dimitri Watel aff004
Působiště autorů:
DAVID, Department of Computer Science, University of Versailles Saint Quentin, Versailles, France
aff001; ILV, Department of Chemistry, University of Versailles, Versailles, France
aff002; LRI, CentraleSupelec, Paris-Saclay University, Evry, France
aff003; ENSIIE, Evry, France
aff004; SAMOVAR, Telecom SudParis, Evry, France
aff005
Vyšlo v časopise:
PLoS ONE 14(12)
Kategorie:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0226680
Souhrn
This paper focuses on determining the structural similarity of two molecules, i.e., the similarity of the interconnection of all the elementary cycles in the corresponding molecular graphs. In this paper, we propose and analyze an algorithmic approach based on the resolution of the Maximum Common Edge Subgraph (MCES) problem with graphs representing the interaction of cycles molecules. Using the ChEBI database, we compare the effectiveness of this approach in terms of structural similarity and computation time with two calculations of similarity of molecular graphs, one based on the MCES, the other on the use of different fingerprints (Daylight, ECFP4, ECFP6, FCFP4, FCFP6) to measure Tanimoto coefficient. We also analyze the obtained structural similarity results for a selected subset of molecules.
Klíčová slova:
Algorithms – Amphotericin – Daylight – Graph theory – Graphs – Molecular structure – Quinine – Molecular computing
Zdroje
1. Reaxys. Accessed Nov, 2017.
2. Chemical entities of biological interest (chebi). Accessed April 1, 2015.
3. Birchall K. and Gillet V. J. Reduced Graphs and Their Applications in Chemoinformatics, pages 197–212. Humana Press, 2011.
4. C. Berge. Théorie des graphes et ses applications. Dunod, 1963.
5. Gasteiger J. Handbook of Chemoinformatics: From Data to Knowledge (Representation of Molecular Structures). Wiley, 1 edition, 2003.
6. Raymond J. W., Gardiner E. J., and W. Peter. Rascal: Calculation of graph similarity using maximum common edge subgraphs. The Computer Journal, 45(6), 2002. doi: 10.1093/comjnl/45.6.631
7. Zager L. A. and Verghese G. C. Graph similarity scoring and matching. Applied Mathematics Letters, 45(21):86–94, 2008. doi: 10.1016/j.aml.2007.01.006
8. Johnson M. A. and Maggiora G. M. Concepts and applications of molecular similarity. The American Chemical Society, 1988.
9. Eckert H. and Bajorath J. Molecular similarity analysis in virtual screening: foundations, limitations and novel approaches. Drug Discovery Today, 12(5):225–233, 2007. doi: 10.1016/j.drudis.2007.01.011 17331887
10. Cereto-Massagué A., Ojeda M. J., Valls C., Mulero M., Garcia-Vallvé S., and Pujadas G. Molecular fingerprint similarity search in virtual screening. Methods, 71:58–63, 2015. doi: 10.1016/j.ymeth.2014.08.005 25132639
11. B. Andreas, Jeremy J.L., S. Josef, Sai S.C.K., G. Meir, and D. John W. How similar are similarity searching methods? a principal component analysis of molecular descriptor space. J. Chem. Inf. Model., 49(1), 2009.
12. d. G. Kurt and C. F. Molecular graph augmentation with rings and functional groups. J. Chem. Inf. Model., 50(9):1660–1668, 2010. doi: 10.1021/ci9005035
13. G. Benoit, B. L., and V. D. Graph kernels in chemoinformatics. Quantitative Graph Theory Mathematical Foundations and Applications, CRC Press, pages 425–470, 2015.
14. Flower D. On the properties of bit string-based measures of chemical similarity. Journal of Chemical Information and Computer Sciences, 38:379–386, 05 1998. doi: 10.1021/ci970437z
15. F. Abu-khzam, N. Samatova, M. A. Rizk, and M. Langston. The maximum common subgraph problem: Faster solutions via vertex cover. IEEE/ACS International Conference on Computer Systems and Applications, pages 367–373, 2007.
16. A. Tatsuya and N. Hiroshi. Comparison and enumeration of chemical graphs. Comput Struct Biotechnol J., 5, 2013.
17. N. Michel and B. Horst. Bridging the Gap Between Graph Edit Distance and Kernel Machines. World Scientific Publishing Co., Inc., 2007.
18. S. Roger, M. John, N. O. Boyle, G. Andrew J., S. Stefan, and G. Darren V.S. Chemical similarity based on graph edit distance:efficient implementation and the challenges of evaluation. 7th Joint Sheffield Conference on Chemoinformatics, 2015.
19. Rogers D. J. and Tanimoto T. T. A computer program for classifying plants. Science, 132(3434):1115–1118, 1960. doi: 10.1126/science.132.3434.1115 17790723
20. G. Benoit, B. L., and V. D. Relevant cycle hypergraph representation for molecules. 9th IAPR-TC-15 Graph-Based Representations in Pattern Recognition, page 111, 2013.
21. Horváth, Tamás and Gärtner, Thomas and Wrobel, Stefan. Cyclic pattern kernels for predictive graph mining. KDD-2004—Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 158–167, 2004.
22. A. N. Flachsenberg Florian and Rarey M. Ringdecomposerlib: An open-source implementation of unique ring families and other cycle bases. J. Chem. Inf. Model., 2(57):122–126, 2017.
23. Vismara P. and Laurenco C. An abstract representation for molecular graphs. Discrete Mathematics and Theoretical Computer Science, 51:343–366, 2000.
24. L. Michael F. and H. John D. The sheffield generic structures project—a retrospective review. J. Chem. Inf. Comput. Sci., 36(5):930–936, 1996. doi: 10.1021/ci950173l
25. G. Valerie J., D. Geoffrey M., H. John D., L. Michael F., and D. Winfried. Computer storage and retrieval of generic chemical structures in patents. 13. reduced graph generation. J. Chem. Inf. Comput. Sci., 31(2):260–270, 1991. doi: 10.1021/ci00002a011
26. Glem Robert C., Bender Andreas, Hasselgren Catrin, Carlsson Lars, Boyer Scott and Smith James. Circular fingerprints: Flexible molecular descriptors with applications from physical chemistry to ADME. IDrugs: the investigational drugs journal, 9:199–204, 2006. 16523386
27. Horton J.D. A polynomial-time algorithm fo find the shortest cycle basis of a graph. SIAM Journal on Computing, 16(2):358–366, 1987. doi: 10.1137/0216026
28. Vismara P. Union of all minimum cycle bases of a graph. Electr. J. Comb., 4(1):73–87, 1997.
29. SCIP http://scip.zib.de/ Accessed Nov, 2017.
30. Santra AK., and Christy Josephine. Genetic Algorithm and Confusion Matrix for Document Clustering International Journal of Computer Science Issues, 2012.
Článek vyšel v časopise
PLOS One
2019 Číslo 12
- S diagnostikou Parkinsonovy nemoci může nově pomoci AI nástroj pro hodnocení mrkacího reflexu
- Je libo čepici místo mozkového implantátu?
- Pomůže v budoucnu s triáží na pohotovostech umělá inteligence?
- AI může chirurgům poskytnout cenná data i zpětnou vazbu v reálném čase
- Nová metoda odlišení nádorové tkáně může zpřesnit resekci glioblastomů
Nejčtenější v tomto čísle
- Methylsulfonylmethane increases osteogenesis and regulates the mineralization of the matrix by transglutaminase 2 in SHED cells
- Oregano powder reduces Streptococcus and increases SCFA concentration in a mixed bacterial culture assay
- The characteristic of patulous eustachian tube patients diagnosed by the JOS diagnostic criteria
- Parametric CAD modeling for open source scientific hardware: Comparing OpenSCAD and FreeCAD Python scripts
Zvyšte si kvalifikaci online z pohodlí domova
Všechny kurzy