Digital Library [Search Result]
Comparative Analysis of Accuracy and Stability of Software Reliability Estimation Models based on Recurrent Neural Networks
Taehyoun Kim, Duksan Ryu, Jongmoon Baik
http://doi.org/10.5626/JOK.2023.50.8.688
Existing studies on software reliability estimation based on recurrent neural networks have trained a single model under fixed conditions and evaluated only its accuracy. However, because of the inherent randomness of artificial neural networks, recurrent neural networks can produce different trained models even under identical conditions, which can lead to inaccurate software reliability estimates. This paper therefore compares and analyzes which recurrent neural networks estimate software reliability most stably and accurately. We estimated software reliability on eight real projects using three representative recurrent neural networks and compared the performance of these models in terms of accuracy and stability. As a result, Long Short-Term Memory showed the most stable and accurate software reliability estimation. Based on these results, a more accurate and stable software reliability estimation model can be selected.
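As a rough illustration of the issue studied above, the sketch below fits a small LSTM to a cumulative failure-count series and repeats training across random seeds to expose the run-to-run variance that motivates the stability comparison. The data, window length, and model sizes are made-up assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Illustrative cumulative failure counts over testing intervals (made-up data).
series = torch.tensor([2., 5., 9., 12., 16., 21., 24., 26., 29., 31., 32., 34.])
W = 3  # sliding-window length (assumption)

# Windows of W past counts predict the next count.
X = torch.stack([series[i:i + W] for i in range(len(series) - W)]).unsqueeze(-1)
y = series[W:].unsqueeze(-1)

def train_once(seed):
    torch.manual_seed(seed)  # the randomness that varies "identical" runs
    lstm = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
    head = nn.Linear(16, 1)
    opt = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=0.01)
    for _ in range(300):
        out, _ = lstm(X)                # (N, W, 16)
        loss = nn.functional.mse_loss(head(out[:, -1]), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Stability: spread of final error across repeated runs with different seeds.
errs = [train_once(s) for s in range(5)]
print(f"mean MSE {sum(errs)/len(errs):.3f}, spread {max(errs)-min(errs):.3f}")
```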
Mini-Batching with Similar-Length Sentences to Quickly Train NMT Models
Daniela N. Rim, Richard Kimera, Heeyoul Choi
http://doi.org/10.5626/JOK.2023.50.7.614
The Transformer model has revolutionized Natural Language Processing tasks such as Neural Machine Translation. Many efforts have been made to study the Transformer architecture to increase its efficiency and accuracy. One potential area for improvement is to address the computation of empty (padding) tokens, which the Transformer computes only to discard later, an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs by length before batching, so that each mini-batch contains sentences of similar length and wasted computation is minimized. Since too much sorting could violate the independent and identically distributed (i.i.d.) data assumption, we sort the data only partially. In experiments, we apply the proposed method to English-Korean and English-Luganda machine translation and show gains in computation time while maintaining translation quality. Our method is architecture-independent, so it can be easily integrated into any training process with variable-length data.
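The pooled-bucketing sketch below illustrates one common way to realize this partial sorting: shuffle the corpus, sort by source length only within local pools, batch, then shuffle the batches. The pool size and the length measure are assumptions; the paper's exact algorithm may differ.

```python
import random

def partially_sorted_batches(pairs, batch_size, pool_factor=50):
    """Sort within shuffled pools so each batch holds similar-length pairs
    without fully ordering the corpus (stays closer to i.i.d.)."""
    random.shuffle(pairs)
    pool = batch_size * pool_factor
    batches = []
    for i in range(0, len(pairs), pool):
        chunk = sorted(pairs[i:i + pool], key=lambda p: len(p[0].split()))
        batches += [chunk[j:j + batch_size] for j in range(0, len(chunk), batch_size)]
    random.shuffle(batches)  # restore randomness at the batch level
    return batches

# Toy usage with (source, target) pairs of varying length.
data = [("hi", "x"), ("how are you", "y"), ("see you at the station tomorrow", "z")] * 10
for batch in partially_sorted_batches(data, batch_size=4):
    print([len(s.split()) for s, _ in batch])  # lengths cluster, so padding shrinks
```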
Multi-Document Summarization Using Semantic Similarity and Information Quantity of Sentences
Yeon-Soo Lim, Sunggoo Kwon, Bong-Min Kim, Seong-Bae Park
http://doi.org/10.5626/JOK.2023.50.7.561
Document summarization has recently emerged as an important task in natural language processing because of the need to deliver concise information. However, a suitable multi-document summarization dataset is difficult to obtain. In this paper, rather than training on a multi-document summarization dataset, we propose to use a single-document summarization dataset: a multi-document summarization model that generates a summary of each input document with a single-document summarization model and then post-processes these summaries. The proposed model consists of three modules: a summary module, a similarity module, and an information module. When multiple documents are entered into the proposed model, the summary module generates a summary of every document. The similarity module clusters similar summaries by measuring semantic similarity. The information module selects the most informative summary from each cluster and collects the selected summaries into the final multi-document summary. Experimental results show that the proposed model outperforms the baseline models and can generate a high-quality multi-document summary. In addition, each module individually shows meaningful performance.
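A minimal sketch of the three-module pipeline, with a first-sentence extractor standing in for the single-document summarizer, TF-IDF plus agglomerative clustering standing in for the similarity module, and TF-IDF mass as a crude proxy for information quantity; all three stand-ins are assumptions, not the paper's components.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

def summarize(doc):              # stand-in for the single-document summary module
    return doc.split(". ")[0]

docs = [
    "The team released a new translation model. It runs twice as fast.",
    "A new translation model was released by the team this week.",
    "Stock markets fell sharply today. Investors remain cautious.",
]
summaries = [summarize(d) for d in docs]

# Similarity module: cluster semantically similar summaries.
vec = TfidfVectorizer().fit_transform(summaries)
labels = AgglomerativeClustering(n_clusters=2).fit_predict(vec.toarray())

# Information module: keep the most informative summary per cluster.
final = []
for c in set(labels):
    members = [i for i, l in enumerate(labels) if l == c]
    best = max(members, key=lambda i: vec[i].sum())  # crude information proxy
    final.append(summaries[best])
print(" ".join(final))
```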
The Dataset and a Pretrained Language Model for Sentence Classification in Korean Science and Technology Abstracts
Hongbi Ahn, Soyoung Park, Yuchul Jung
http://doi.org/10.5626/JOK.2023.50.6.468
Classifying each sentence according to its role or function is a critical task, particularly for science and technology papers, whose abstracts contain various types of research-related content. Proper content curation requires appropriate semantic tags for each sentence, which is challenging because of the complexity and diversity of the material. For instance, in biomedical abstracts written in English (such as PubMed), the sentences typically follow a consistent semantic sequence, such as background-purpose-method-result-conclusion. In Korean paper abstracts, however, the sentences appear in different orders depending on the author. To address this, we constructed a dataset (PubKorSci-1k) that tags each sentence according to its role in abstracts from the science and technology domains written in Korean. Additionally, we propose a learning technique for sentence classification based on this dataset.
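A minimal fine-tuning sketch for the sentence-role classification task, assuming a generic Korean pretrained encoder (klue/roberta-base is an illustrative choice, not necessarily the paper's) and toy stand-ins for PubKorSci-1k sentences.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["background", "purpose", "method", "result", "conclusion"]
tok = AutoTokenizer.from_pretrained("klue/roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "klue/roberta-base", num_labels=len(LABELS))

# Toy sentences standing in for PubKorSci-1k (the real dataset is not loaded here).
sents = ["본 연구의 목적은 문장 분류 기법을 제안하는 것이다.",
         "제안 기법을 1천 개의 초록 문장으로 평가하였다."]
golds = torch.tensor([1, 3])  # purpose, result

enc = tok(sents, padding=True, truncation=True, return_tensors="pt")
opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few steps just to show the loop
    loss = model(**enc, labels=golds).loss
    opt.zero_grad(); loss.backward(); opt.step()
```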
Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP
http://doi.org/10.5626/JOK.2023.50.5.401
In recent years, research on document-level machine translation, which understands the context of an entire document and produces natural translations, has been actively conducted. Like sentence-level models, a document-level machine translation model requires a large amount of training data, but building a large document-level parallel corpus is very difficult. In this paper, we therefore propose a data augmentation technique that is effective for document-level machine translation and mitigates the shortage of document-level parallel corpora. In experiments, applying the proposed augmentation, which uses a clustering algorithm and NSP (next sentence prediction) on a context-free sentence-level parallel corpus, improved document-level translation performance by 3.0 S-BLEU and 2.7 D-BLEU over the unaugmented baseline.
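The sketch below shows how a pretrained NSP head can score sentence adjacency so that sentences from a sentence-level corpus can be chained into pseudo-documents; the greedy chaining and the multilingual BERT checkpoint are illustrative assumptions, not the paper's exact procedure.

```python
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tok = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
nsp = BertForNextSentencePrediction.from_pretrained("bert-base-multilingual-cased")

def nsp_score(a, b):
    """P(b directly follows a) under BERT's next-sentence head."""
    enc = tok(a, b, return_tensors="pt")
    with torch.no_grad():
        logits = nsp(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 0].item()  # index 0 = "is next"

# Greedy assembly within one cluster of topically related sentences
# (the clustering step that forms the cluster is omitted here).
cluster = ["He opened the door.", "The room was dark.", "She lit a candle."]
doc = [cluster.pop(0)]
while cluster:
    nxt = max(cluster, key=lambda s: nsp_score(doc[-1], s))
    cluster.remove(nxt)
    doc.append(nxt)
print(doc)
```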
Type-specific Multi-Head Shared-Encoder Model for Commonsense Machine Reading Comprehension
http://doi.org/10.5626/JOK.2023.50.5.376
Machine reading comprehension (MRC) is a task for evaluating whether a machine understands natural language by having it solve various problems over a given context. To demonstrate natural language understanding, a machine must make commonsense inferences based on full comprehension of the context. To help a model acquire such abilities, we propose a multi-task learning scheme and a model for commonsense MRC. The contributions of this study are as follows: 1) a method of task-specific dataset configuration is proposed; 2) a type-specific multi-head shared-encoder model with a multi-task learning scheme including batch sampling and loss scaling is developed; and 3) evaluated on the CosmosQA dataset (commonsense MRC), the method improves accuracy by 2.38% over the fine-tuned baseline.
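A structural sketch of a shared encoder with type-specific heads and per-task loss scaling; the GRU encoder, the two task names, and the weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedEncoderMultiHead(nn.Module):
    """One shared encoder, one classification head per question type."""
    def __init__(self, hidden=128, n_classes=4):
        super().__init__()
        self.encoder = nn.GRU(64, hidden, batch_first=True)
        self.heads = nn.ModuleDict({
            "commonsense": nn.Linear(hidden, n_classes),
            "literal": nn.Linear(hidden, n_classes),
        })

    def forward(self, x, task):
        h, _ = self.encoder(x)
        return self.heads[task](h[:, -1])   # classify from the last state

model = SharedEncoderMultiHead()
scale = {"commonsense": 1.0, "literal": 0.5}  # per-task loss scaling (assumed)
x = torch.randn(8, 10, 64)                    # one sampled batch per task
y = torch.randint(0, 4, (8,))
loss = sum(scale[t] * nn.functional.cross_entropy(model(x, t), y)
           for t in ("commonsense", "literal"))
loss.backward()
```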
Prediction of Antibiotic Resistance to Ciprofloxacin in Patients with Upper Urinary Tract Infection through Exploratory Data Analysis and Machine Learning
http://doi.org/10.5626/JOK.2023.50.3.263
Emergency medicine physicians follow an empirical treatment strategy, selecting antibiotics before an antibiotic resistance profile is clinically confirmed for a patient with a urinary tract infection. Empirical treatment is challenging given the concern about increasing antibiotic resistance of urinary tract pathogens in the community. In this single-institution retrospective study, we propose a method for predicting antibiotic resistance with machine learning for patients diagnosed with upper urinary tract infection in the emergency department. First, we selected significant predictors using statistical tests and the game-theory-based SHAP (SHapley Additive exPlanations), respectively. Next, we compared the performance of four classifiers and proposed an algorithm that assists decision-making in empirical treatment by adjusting the prediction probability threshold. As a result, the SVM classifier using the predictors selected through SHAP (65% of the total) showed the highest AUROC (0.775) among all experimental conditions. By adjusting the prediction probability threshold of the SVM, we achieved a specificity 3.9 times higher than empirical treatment while preserving the 98% sensitivity of the physicians' empirical treatment.
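The sketch below reproduces only the threshold-adjustment idea on synthetic data: train an SVM with probability outputs, then take the highest threshold that keeps sensitivity at or above 98% and read off the specificity it buys. The SHAP-based predictor selection is omitted, and nothing here reflects the study's cohort.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the clinical features (the real cohort is not public).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

clf = SVC(probability=True).fit(Xtr, ytr)  # probability=True enables predict_proba
proba = clf.predict_proba(Xte)[:, 1]

best = None
for t in np.linspace(0.0, 1.0, 101):
    pred = (proba >= t).astype(int)
    tp = ((pred == 1) & (yte == 1)).sum(); fn = ((pred == 0) & (yte == 1)).sum()
    tn = ((pred == 0) & (yte == 0)).sum(); fp = ((pred == 1) & (yte == 0)).sum()
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    if sens >= 0.98:                       # keep the last (highest) such threshold
        best = (t, sens, spec)
print("threshold, sensitivity, specificity:", best)
```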
Performance Improvement of a Korean Open Domain Q&A System by Applying the Trainable Re-ranking and Response Filtering Model
Hyeonho Shin, Myunghoon Lee, Hong-Woo Chun, Jae-Min Lee, Sung-Pil Choi
http://doi.org/10.5626/JOK.2023.50.3.273
As deep learning is applied to natural language processing, research is under way on open-domain Q&A, which finds answers to user questions without the target paragraph being prepared in advance. However, existing studies are limited in semantic matching when they rely on keyword-based information retrieval, and although deep learning-based retrieval research is in progress, few studies in Korea have applied it empirically to real systems. In this paper, a two-step enhancement method is proposed to improve the performance of a Korean open-domain Q&A system. The method sequentially applies a machine learning-based re-ranking model and a response filtering model to a baseline system that combines a search engine with an MRC model. The baseline system's initial performance was an F1 score of 74.43 and an EM score of 60.79; with the proposed method, performance improved to an F1 score of 82.5 and an EM score of 68.82.
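A structural sketch of the two-step post-processing; every component here (search engine, MRC model, re-ranker, threshold) is a stub standing in for the trained models described in the paper.

```python
def search(question):                  # stand-in for the keyword search engine
    return ["passage about topic A", "passage about topic B"]

def mrc(question, passage):            # stand-in MRC model -> (answer, confidence)
    return passage.split()[-1], 0.7

def rerank_score(question, passage):   # stand-in for the trained re-ranking model
    return len(set(question.split()) & set(passage.split()))

def answer(question, threshold=0.4):
    # Step 1: re-rank the retrieved passages before reading.
    passages = sorted(search(question),
                      key=lambda p: rerank_score(question, p), reverse=True)
    ans, conf = mrc(question, passages[0])
    # Step 2: response filtering -- return nothing over a low-confidence answer.
    return ans if conf >= threshold else None

print(answer("what passage is about topic A"))
```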
Graph Neural Networks with Prototype Nodes for Few-shot Image Classification
http://doi.org/10.5626/JOK.2023.50.2.127
The remarkable performance of deep learning models rests on large amounts of training data. However, in many domains such data are difficult to obtain, and substantial resources must be invested in data collection and refinement. To overcome this limitation, few-shot learning, which enables learning from only a small number of examples, is being actively studied. In particular, among meta-learning methodologies, metric-based learning, which exploits similarity between data points, has the advantage of not requiring fine-tuning for a new task, and recent studies using graph neural networks have shown good results. A few-shot classification model based on a graph neural network can explicitly model data characteristics and inter-data relationships by constructing a task graph whose nodes are the examples of a given support set and query set. The EGNN (Edge-Labeling Graph Neural Network) model expresses similarity between examples as edge labels and models intra-class and inter-class similarity more clearly. In this paper, we propose adding a prototype node representing each class to the few-shot task graph, modeling data-data and class-data similarity at the same time. The proposed model provides a generalized prototype node created from the task data and class configuration, and it can make two different few-shot image classification predictions: from the prototype-query edge label, or from the Euclidean distance between prototype and query nodes. Compared with the EGNN model and other meta-learning-based few-shot classifiers on 5-way 5-shot classification on the mini-ImageNet dataset, the proposed model showed a significant performance improvement.
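The fragment below sketches only the prototype-distance prediction path: one prototype node per class initialized from the mean of its support embeddings, with the query classified by Euclidean distance. The full EGNN edge-label propagation is omitted, and random embeddings stand in for a CNN backbone.

```python
import torch

n_way, k_shot, dim = 5, 5, 64                # 5-way 5-shot episode
support = torch.randn(n_way, k_shot, dim)    # stand-in support-set embeddings
query = torch.randn(1, dim)                  # stand-in query embedding

# Prototype node per class, initialized from its support embeddings.
prototypes = support.mean(dim=1)             # (n_way, dim)

# Classify the query by Euclidean distance to each prototype node.
dists = torch.cdist(query, prototypes).squeeze(0)   # (n_way,)
probs = torch.softmax(-dists, dim=0)                # closer prototype -> higher score
print("predicted class:", dists.argmin().item(), probs)
```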
Style Transfer for Chat Language using Unsupervised Machine Translation
Youngjun Jung, Changki Lee, Jeongin Hwang, Hyungjong Noh
http://doi.org/10.5626/JOK.2023.50.1.19
Style transfer is the task of generating text in a target style while maintaining the content of text written in a source style. In general, style transfer assumes that content is invariant and style is variable. In the case of chat language, however, existing style transfer models do not train well. In this paper, we propose a method of transferring chat language into written language using a style transfer model with unsupervised machine translation. This study shows that the transferred results can be used to construct a word-transfer dictionary between the styles. Additionally, it shows that the transferred results can be improved by filtering the transferred pairs so that only well-transferred results are used, and by training the style transfer model on the filtered results with supervised learning.
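A toy sketch of the filtering step: keep only pseudo-parallel (chat, written) pairs whose round-trip reconstruction stays close to the original, then use the kept pairs for supervised training. The dictionary-based stub transfer functions and the token-overlap similarity are placeholders for the paper's models and metric.

```python
CHAT2WRIT = {"r": "are", "u": "you", "c": "see", "tmr": "tomorrow"}

def chat_to_written(s):   # stand-in for the unsupervised transfer model
    return " ".join(CHAT2WRIT.get(w, w) for w in s.split())

def written_to_chat(s):   # stand-in for the reverse-direction model
    inv = {v: k for k, v in CHAT2WRIT.items()}
    return " ".join(inv.get(w, w) for w in s.split())

def similarity(a, b):     # token overlap; BLEU/chrF would be used in practice
    A, B = set(a.split()), set(b.split())
    return len(A & B) / max(len(A | B), 1)

chat_corpus = ["how r u", "c u tmr"]
pairs = []
for chat in chat_corpus:
    written = chat_to_written(chat)
    if similarity(written_to_chat(written), chat) >= 0.5:  # keep well-transferred pairs
        pairs.append((chat, written))
print(pairs)  # filtered pairs then train the supervised style-transfer model
```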
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr