Digital Library [ Search Result ]
Adversarial Training with Contrastive Learning in NLP
Daniela N. Rim, DongNyeong Heo, Heeyoul Choi
http://doi.org/10.5626/JOK.2025.52.1.52
Adversarial training has been extensively studied in natural language processing (NLP) settings to make models robust, so that semantically similar inputs yield similar outcomes. However, since language has no objective measure of semantic similarity, previous works use an external pre-trained NLP model to ensure this similarity, introducing an extra training stage with huge memory consumption. This work proposes adversarial training with contrastive learning (ATCL) to train a language processing model adversarially using the benefits of contrastive learning. The core idea is to make linear perturbations in the embedding space of the input via the fast gradient method (FGM) and train the model to keep the original and perturbed representations close via contrastive learning. We apply ATCL to language modeling and neural machine translation tasks, showing improvements in quantitative scores (perplexity and BLEU). Furthermore, ATCL achieves good qualitative results at the semantic level for both tasks without using a pre-trained model, as shown through simulation.
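Below is a minimal PyTorch sketch of the two ingredients named in the abstract: an FGM perturbation in the embedding space and an InfoNCE-style contrastive loss that pulls the original and perturbed representations together. Function names, the normalization, and the hyper-parameters are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def fgm_perturb(embeddings, task_loss, epsilon=1.0):
    # Fast gradient method: shift each embedding along its normalized gradient direction.
    grad, = torch.autograd.grad(task_loss, embeddings, retain_graph=True)
    return embeddings + epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)

def contrastive_loss(h_orig, h_pert, temperature=0.1):
    # InfoNCE-style objective: each original representation should be most similar
    # to its own perturbed counterpart among all examples in the batch.
    z1 = F.normalize(h_orig, dim=-1)              # (batch, dim)
    z2 = F.normalize(h_pert, dim=-1)
    logits = z1 @ z2.t() / temperature            # pairwise similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

In training, such a contrastive term would typically be added to the task loss with a weighting coefficient; the exact combination used in the paper is not reproduced here.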
Mini-Batching with Similar-Length Sentences to Quickly Train NMT Models
Daniela N. Rim, Richard Kimera, Heeyoul Choi
http://doi.org/10.5626/JOK.2023.50.7.614
The Transformer model has revolutionized natural language processing tasks such as neural machine translation. Many efforts have been made to study the Transformer architecture to increase its efficiency and accuracy. One potential area for improvement is the computation of empty (padding) tokens that the Transformer computes only to discard later, an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs by length before batching and builds mini-batches from similar-length sentences, which minimizes wasted computation. Since sorting could violate the independent and identically distributed (i.i.d.) data assumption, we sort the data only partially. In experiments, we apply the proposed method to English-Korean and English-Luganda machine translation and show gains in computation time while maintaining performance. Our method is independent of the architecture, so it can easily be integrated into any training process with variable-length data.
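The following is an illustrative sketch of the partial sorting described above: sentence pairs are shuffled, sorted by length only within local buckets, grouped into mini-batches of similar length, and the batch order is shuffled again. The bucket size and the choice of sorting by source length are assumptions for illustration, not the paper's exact algorithm.

import random

def partially_sorted_batches(pairs, batch_size, bucket_factor=100):
    # pairs: list of (src_tokens, tgt_tokens) tuples.
    pairs = pairs[:]
    random.shuffle(pairs)                           # keep the data roughly i.i.d.
    batches = []
    bucket_size = bucket_factor * batch_size
    for i in range(0, len(pairs), bucket_size):
        bucket = sorted(pairs[i:i + bucket_size], key=lambda p: len(p[0]))
        batches += [bucket[j:j + batch_size] for j in range(0, len(bucket), batch_size)]
    random.shuffle(batches)                         # avoid a global length ordering
    return batches

Because batches are formed before padding, each mini-batch contains sentences of similar length and little computation is spent on padding tokens.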
Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP
http://doi.org/10.5626/JOK.2023.50.5.401
In recent years, research on document-level machine translation has been actively conducted to understand the context of an entire document and produce natural translations. As with sentence-level machine translation models, training a document-level machine translation model requires a large amount of data, but building a large document-level parallel corpus is very difficult. Therefore, in this paper, we propose a data augmentation technique that is effective for document-level machine translation and alleviates the shortage of document-level parallel corpora. In experiments, applying the proposed augmentation, based on a clustering algorithm and next sentence prediction (NSP), to a sentence-level parallel corpus without context improved document-level translation performance by 3.0 S-BLEU and 2.7 D-BLEU over the model without augmentation.
Style Transfer for Chat Language using Unsupervised Machine Translation
Youngjun Jung, Changki Lee, Jeongin Hwang, Hyungjong Noh
http://doi.org/10.5626/JOK.2023.50.1.19
Style transfer is the task of generating text in a target style while preserving the content of text written in a source style. In general, it is assumed that the content is invariant and the style is what varies when the style of a text is transferred. However, chat language is not handled well by existing style transfer models. In this paper, we propose a method to transfer chat language into written language using a style transfer model based on unsupervised machine translation. This study shows that the transferred results can be used to build a word-transfer dictionary between styles, which in turn can be used for style transfer. It also shows that the results can be improved by filtering the transferred pairs so that only well-transferred pairs are kept, and by training the style transfer model in a supervised manner on the filtered results.
Korean-English Neural Machine Translation Using Korean Alphabet Characteristics and Honorific Expressions
Jeonghui Kim, Jaemu Heo, Joowhan Kim, Heeyoul Choi
http://doi.org/10.5626/JOK.2022.49.11.1017
Recently, deep learning has improved the performance of machine translation, but in most cases it does not reflect the characteristics of the languages involved. In particular, Korean has unique word and expression features, which can cause mistranslation. For example, in Google Translate from Korean to English, mistranslations occur when a Korean noun ends with a postposition (josa) written as a single consonant. Also, in English-Korean translation, honorific and casual expressions are mixed in the translated results. This is because the alphabetic characteristics and honorifics of Korean are not reflected. In this paper, to address these problems, we propose training a model with sub-words composed of letter units (jamo) and unifying honorific and casual expressions in the corpus. The experimental results confirm that the proposed method resolves the problems mentioned above and yields a similar or slightly higher BLEU score compared to the existing method and corpus.
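As an illustration of the letter-unit idea, the snippet below decomposes precomposed Hangul syllables into their jamo using standard Unicode arithmetic; the resulting letter sequence could then be fed to a sub-word tokenizer. This is generic preprocessing shown for clarity, not the paper's code.

CHOSEONG = [chr(c) for c in range(0x1100, 0x1100 + 19)]          # initial consonants
JUNGSEONG = [chr(c) for c in range(0x1161, 0x1161 + 21)]         # medial vowels
JONGSEONG = [''] + [chr(c) for c in range(0x11A8, 0x11A8 + 27)]  # optional final consonants

def to_jamo(text):
    out = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:                     # precomposed Hangul syllable block
            cho, rest = divmod(code, 21 * 28)
            jung, jong = divmod(rest, 28)
            out += [CHOSEONG[cho], JUNGSEONG[jung]] + ([JONGSEONG[jong]] if jong else [])
        else:
            out.append(ch)                        # keep non-Hangul characters unchanged
    return out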
Building a Parallel Corpus and Training Translation Models Between Luganda and English
Richard Kimera, Daniela N. Rim, Heeyoul Choi
http://doi.org/10.5626/JOK.2022.49.11.1009
Neural machine translation (NMT), which has recently achieved great success, needs large datasets, so it is largely premised on high-resource languages. This continually disadvantages low-resource languages such as Luganda, for which high-quality parallel corpora are lacking; even Google Translate does not serve Luganda at the time of this writing. In this paper, we build a parallel corpus of 41,070 sentence pairs for Luganda and English, based on three different open-source corpora. Then, we train NMT models with a hyper-parameter search on the dataset. Experiments gave a BLEU score of 21.28 from Luganda to English and 17.47 from English to Luganda. Translation examples show the high quality of the translations. We believe our model is the first Luganda-English NMT model. The bilingual dataset we built will be made available to the public.
Grammar Accuracy Evaluation (GAE): Quantifiable Qualitative Evaluation of Machine Translation Models
Dojun Park, Youngjin Jang, Harksoo Kim
http://doi.org/10.5626/JOK.2022.49.7.514
Natural language generation (NLG) refers to expressing the computation results of a system in human language. Since the quality of sentences generated by an NLG model cannot be fully represented by quantitative evaluation alone, they are also evaluated qualitatively by humans, who score the meaning or grammar of a sentence according to subjective criteria. Nevertheless, existing evaluation methods suffer from large score deviations that depend on the evaluators' criteria. In this paper, we propose Grammar Accuracy Evaluation (GAE), which provides specific evaluation criteria. Analyzing machine translation quality with BLEU and GAE confirmed that the BLEU score does not represent the absolute performance of machine translation models, and that GAE compensates for the shortcomings of BLEU by flexibly evaluating alternative synonyms and changes in sentence structure.
Kor-Eng NMT using Symbolization of Proper Nouns
Myungjin Kim, Junyeong Nam, Heeseok Jung, Heeyoul Choi
http://doi.org/10.5626/JOK.2021.48.10.1084
Neural machine translation has made progress, but sentences containing proper nouns, such as names, new words, and words used only within a specific group, are often translated inaccurately. To handle such cases, this paper uses a Korean-English proper noun dictionary and a symbolization method on top of the recently proposed Transformer translation model. In the proposed method, some of the words in the training sentences are symbolized using the proper noun dictionary, and the translation model is trained on sentences that include the symbolized words. When translating a new sentence, translation is completed by symbolization, translation, and desymbolization. The proposed method was compared with a model without symbolization, and in some cases the improvement was quantitatively confirmed with the BLEU score. Several translation examples are also presented along with commercial service results.
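A rough sketch of the symbolize-translate-desymbolize flow is shown below. The placeholder token format, the dictionary structure, and the translate() call are hypothetical, introduced only to illustrate the pipeline described in the abstract.

def symbolize(sentence, noun_dict):
    # Replace known source-side proper nouns with placeholder symbols and remember
    # which target-side noun each symbol should become after translation.
    mapping = {}
    for i, (src_noun, tgt_noun) in enumerate(noun_dict.items()):
        if src_noun in sentence:
            symbol = f"<PN{i}>"
            sentence = sentence.replace(src_noun, symbol)
            mapping[symbol] = tgt_noun
    return sentence, mapping

def desymbolize(translation, mapping):
    # Restore the target-side proper nouns in place of the placeholder symbols.
    for symbol, tgt_noun in mapping.items():
        translation = translation.replace(symbol, tgt_noun)
    return translation

# Hypothetical usage, assuming a trained translate() function:
#   src, mapping = symbolize(src, noun_dict)
#   out = desymbolize(translate(src), mapping)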
English-Korean Neural Machine Translation using MASS with Relative Position Representation
Youngjun Jung, Cheoneum Park, Changki Lee, Junseok Kim
http://doi.org/10.5626/JOK.2020.47.11.1038
Neural machine translation has mainly been studied with sequence-to-sequence models trained by supervised learning. However, since supervised learning performs poorly when data are insufficient, transfer learning, i.e., fine-tuning a model pre-trained on a large amount of monolingual data such as BERT or MASS, has recently become the main approach in natural language processing. In this paper, MASS, a pre-training method for language generation, was applied to English-Korean machine translation. Experiments showed that the English-Korean machine translation model using MASS outperformed the existing models, and performance improved further when the relative position representation method was applied to MASS.
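For reference, the sketch below shows one common way to add relative position representations to scaled dot-product attention (a Shaw et al.-style content-position term), which is the kind of modification the abstract describes applying to MASS. The clipping distance and shapes are assumptions for illustration, not the paper's configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosAttention(nn.Module):
    def __init__(self, dim, max_dist=16):
        super().__init__()
        self.max_dist = max_dist
        self.rel_key = nn.Embedding(2 * max_dist + 1, dim)  # one vector per clipped offset

    def forward(self, q, k, v):                              # q, k, v: (batch, len, dim)
        length, dim = q.size(1), q.size(2)
        pos = torch.arange(length, device=q.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist) + self.max_dist
        rel_k = self.rel_key(rel)                            # (len, len, dim)
        scores = q @ k.transpose(1, 2)                       # content-content term
        scores = scores + torch.einsum('bld,lmd->blm', q, rel_k)  # content-position term
        attn = F.softmax(scores / dim ** 0.5, dim=-1)
        return attn @ v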
Prediction of Compound-Protein Interactions Using Deep Learning
http://doi.org/10.5626/JOK.2019.46.10.1054
Characterizing the interactions between compounds and proteins is an important process for drug development and discovery. Structural data of proteins and compounds are used to identify their interactions, but such structural data are not always available, and the speed and accuracy of predictions made this way are limited by the large number of calculations involved. In this paper, compound-protein interactions were predicted using a Sequence-To-Sequence Auto-Encoder (S2SAE), which combines a sequence-to-sequence algorithm used in machine translation with an auto-encoder for effective compression of the input vector. Compared to the existing method, the proposed method uses fewer features of the protein-compound complex and shows higher predictive accuracy.
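The following is a rough sketch of the S2SAE idea as described: sequence auto-encoders compress compound and protein sequences into fixed-size vectors, and a classifier predicts interaction from the concatenated codes. Layer sizes and architecture details are assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    # GRU encoder-decoder trained to reconstruct its input; the final encoder
    # hidden state serves as the compressed code used downstream.
    def __init__(self, vocab_size, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def encode(self, tokens):                     # tokens: (batch, length)
        _, h = self.encoder(self.embed(tokens))
        return h[-1]                              # (batch, hidden)

    def reconstruct(self, tokens):
        # Teacher-forced reconstruction of the input sequence from the code.
        _, h = self.encoder(self.embed(tokens))
        dec_out, _ = self.decoder(self.embed(tokens), h)
        return self.out(dec_out)                  # (batch, length, vocab_size)

class InteractionClassifier(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, compound_code, protein_code):
        # Probability that the compound and protein interact.
        return torch.sigmoid(self.mlp(torch.cat([compound_code, protein_code], dim=-1)))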