Search : [ author: Inuk Jung ] (4)

A Time-Course Multi-Clustering Method for Single-Cell Trajectory Inference

Jaeyeon Jang, Inuk Jung

http://doi.org/10.5626/JOK.2022.49.10.838

From time-series single-cell transcriptome data, gene expression information can be obtained to observe when significant changes in cell differentiation occur, while accounting for important biological phenomena related to the experimental conditions. Owing to the recent surge of time-series single-cell transcriptome data, studies on various dynamic changes in cells, such as the cell cycle and cell differentiation, have been actively conducted. In particular, time-series analysis of cell differentiation at the single-cell level is more advantageous for biological interpretation than analysis at a single time point, because changes can be observed along the time axis. In this paper, we propose a multi-clustering method that infers cell trajectories by considering time information at the gene level of time-series single-cell transcriptome data. Applying this method to gene expression data on human neuron cell differentiation produced results similar to the biological findings reported in a previous study.

A Network Topology Scaling Method for Improving Network Comparison Using Colon Cancer Transcriptome Data

Eonyong Han, Inuk Jung

http://doi.org/10.5626/JOK.2022.49.8.646

Various methods based on gene expression information have been proposed for disease analysis. In cancer transcriptome data analysis, methods that discover hidden characteristics at the pathway level are useful for interpreting results. In this study, we compared and analyzed pathway-level gene correlation networks built from gene co-expression data. When the two networks being compared differ in size, the difference in the amount of information biases the comparison toward the larger network. To resolve this bias, we adjusted the networks of patients from different backgrounds so that each network is constructed from the same amount of information. Using the normalized networks, we performed a comparative analysis of important gene groups based on the characteristics of biological networks. We normalized 202 pathway networks using data from four colon cancer subtypes and identified 5 pathways with subtype-specific results.
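The idea of comparing networks only after giving them the same amount of information can be illustrated with a minimal sketch. It is not the authors' scaling method: it assumes that "the same amount of information" means the same number of edges, and it keeps the k strongest absolute Pearson correlations in each group. The function names `top_k_edges` and `jaccard` and the toy expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def top_k_edges(expr, k):
    """Return the k gene pairs with the strongest |correlation|.

    expr maps a gene name to its expression values across samples.
    Fixing k for every group is the assumed size normalization.
    """
    scored = [
        (abs(pearson(expr[g], expr[h])), g, h)
        for g, h in combinations(sorted(expr), 2)
    ]
    # Strongest first; ties are broken by gene name for reproducibility.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return {(g, h) for _, g, h in scored[:k]}


def jaccard(edges_a, edges_b):
    """Edge-set overlap between two equally sized networks."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


# Toy data for one pathway in two hypothetical patient groups.
group_a = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 3, 2, 1], "G4": [1, 3, 2, 4]}
group_b = {"G1": [1, 2, 3, 4], "G2": [1, 3, 2, 4], "G3": [2, 4, 6, 8], "G4": [4, 1, 3, 2]}

net_a = top_k_edges(group_a, k=3)
net_b = top_k_edges(group_b, k=3)
print(sorted(net_a), sorted(net_b), jaccard(net_a, net_b))
```

Because both networks contain exactly k edges, the overlap score is not inflated by one group simply having more samples or more detectable correlations.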

COVID-19 Virus Whole-genome Embedding Strategy through Density-based Clustering and Deep Learning Model

Minwoo Pak, Sangseon Lee, Inyoung Sung, Yunyol Shin, Inuk Jung, Sun Kim

http://doi.org/10.5626/JOK.2022.49.4.261

The rapid spread of COVID-19 throughout the world has made the causative virus, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), one of the major targets of research in fields such as genetics and vaccinology. In particular, studies of its phylogeny and subtype properties are especially important because of the variety of subtypes and their high variability. However, most computational approaches to studying the viral genome are based on the frequencies of single-nucleotide polymorphisms (SNPs), since the large size of the genomic sequence makes it almost impossible to encode the information of the whole genome at once. In this study, we introduce an alternative embedding strategy that extracts information from the SARS-CoV-2 whole genome using the density-based clustering algorithm MUTCLUST and deep learning. We first reduced the size of the genome by using MUTCLUST to identify densely mutated clusters as important regions. We then learned subtype-specific embedding vectors from the extracted clusters using a sequence convolutional deep learning model. We found that the learned embeddings contained information that could be used to discriminate known subtypes and reconstruct phylogenetic trees.
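The first step, reducing a genome to its densely mutated regions, can be sketched in a few lines. This is not MUTCLUST, whose details are not given in the abstract: it is a simplified gap-based grouping used as a stand-in for density-based clustering in one dimension. The function name `dense_regions`, the parameters `eps` and `min_pts`, and the example positions are all hypothetical.

```python
def dense_regions(positions, eps=50, min_pts=3):
    """Group mutation positions into dense regions along the genome.

    Consecutive positions at most `eps` bases apart are chained into one
    group; groups with fewer than `min_pts` distinct positions are dropped.
    Returns a list of (start, end) tuples in genome order.
    """
    regions, current = [], []
    for pos in sorted(set(positions)):
        # A gap larger than eps closes the group collected so far.
        if current and pos - current[-1] > eps:
            if len(current) >= min_pts:
                regions.append((current[0], current[-1]))
            current = []
        current.append(pos)
    # Close the final group.
    if len(current) >= min_pts:
        regions.append((current[0], current[-1]))
    return regions


# Made-up mutation positions, not real SARS-CoV-2 data.
mutations = [100, 120, 130, 900, 5000, 5010, 5040, 5060]
print(dense_regions(mutations, eps=50, min_pts=3))
```

Only the sequence inside the returned regions would then be passed to a downstream model, which is how a step of this kind shrinks the input compared with encoding the whole genome.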

A Multi-Omics Data Integration Method and Parametric Analysis on Large-Scale Colon Cancer Data

Inuk Jung

http://doi.org/10.5626/JOK.2020.47.8.779

Research is being conducted to reveal the mechanisms of diseases and organisms through the analysis of genomic data, including gene expression information. At the genomic level, the principles governing living organisms and diseases are very complex, because many genes are involved and the regulatory relationships among them are sophisticated. Additionally, various omics layers participate in the regulation of gene expression. The volume of genome data generated each year has recently been increasing rapidly because of the falling cost of next-generation sequencing, and various new technologies for measuring multi-modal omics from a single sample are in active development. In this study, we conducted a parametric analysis of colon cancer multi-omics data to observe the effect of the number of samples and the number of omics objects on the classification of its four subtypes. Three multi-omics analysis methods were compared: two well-known multi-omics integration methods and our in-house method. As a result, we found that at least 100 samples and fewer than 5,000 omics objects were required to achieve satisfactory subtype classification performance.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr