
Applying Deep Neural Networks and Random Forests to Predict the Pathogenicity of Single Nucleotide Variants in Hereditary Cancer-associated Genes

Da-Bin Lee, Seonhwa Kim, Moonjong Kang, Changbum Hong, Kyu-Baek Hwang

http://doi.org/10.5626/JOK.2023.50.9.746

The recent proliferation of genetic testing has made it possible to explore an individual's genetic variants and to use pathogenicity information to diagnose and prevent genetic diseases. However, the number of identified variants with pathogenicity information is quite small. Machine learning methods for predicting variant pathogenicity have been proposed to address this problem. In this study, we apply deep neural networks to variant pathogenicity prediction and compare them with random forests and logistic regression, which have been widely used in previous studies. The experimental data consisted of 1,068 single-nucleotide variants in genes associated with hereditary cancers. Experiments on 100 random datasets generated for hyperparameter selection showed that random forests performed best in terms of area under the precision-recall curve. On 15 holdout gene datasets, deep neural networks performed best on average, but the difference in performance from random forests, the second-best model, was not significant. Logistic regression performed statistically significantly worse than either of the other two models. In conclusion, we found that deep neural networks and random forests were generally better than logistic regression at predicting the pathogenicity of single-nucleotide variants associated with hereditary cancer.
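The evaluation described in the abstract relies on two ideas: the area under the precision-recall curve, and datasets in which a gene is held out. The sketch below illustrates both in plain Python and is not the authors' code. It approximates the area by average precision (a step-wise sum), and it assumes that each holdout dataset withholds all variants of a single gene, which the abstract does not state explicitly. The function names, the `gene` field, and the placeholder gene names are hypothetical.

```python
from collections import defaultdict


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for pathogenic, 0 for benign.
    scores: higher means more likely pathogenic.
    Ties in score are kept in input order (an arbitrary choice for this sketch).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank variants from most to least confidently pathogenic.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


def holdout_by_gene(variants):
    """Yield (gene, train, test), holding out every variant of one gene.

    Assumption: one holdout dataset per gene; the paper may differ.
    """
    by_gene = defaultdict(list)
    for variant in variants:
        by_gene[variant["gene"]].append(variant)
    for gene in sorted(by_gene):
        train = [v for v in variants if v["gene"] != gene]
        yield gene, train, by_gene[gene]


# Toy data with placeholder gene names, for illustration only.
toy_variants = [
    {"gene": "GENE_A", "label": 1},
    {"gene": "GENE_A", "label": 0},
    {"gene": "GENE_B", "label": 1},
]
toy_splits = list(holdout_by_gene(toy_variants))
toy_ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(round(toy_ap, 3))  # 0.833
```

Holding out whole genes keeps variants of the same gene from appearing in both training and test data, so the score reflects how well a model generalizes to genes it has not seen.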


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
