Digital Library [Search Result]
Korean Dependency Parsing Using Sequence Labeling
http://doi.org/10.5626/JOK.2024.51.12.1053
Dependency parsing is a crucial step in language analysis: it identifies the relationships between the words in a sentence. Recently, models based on pre-trained transformers have shown impressive performance across a wide range of natural language processing research, and they have also been applied to dependency parsing. Traditional approaches to dependency parsing with pre-trained models generally consist of two main stages: 1) merging the token-level embeddings generated by the pre-trained model into word-level embeddings; and 2) analyzing dependency relations by comparing or classifying the merged embeddings. However, due to the large number of parameters and the additional layers required for embedding construction, comparison, and classification, these models can be inefficient in terms of time and memory usage. This paper proposes a dependency parsing technique based on sequence labeling that improves the efficiency of training and inference by defining dependency parsing units and simplifying the model layers. The proposed model eliminates the word-level embedding merging step by using special tokens to define parsing units, and it effectively reduces the number of parameters by simplifying the model layers. As a result, training and inference time are significantly shortened while the model maintains meaningful dependency parsing performance.
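As a rough illustration of the general idea, the Python sketch below encodes each word's head as a relative-offset tag so that parsing reduces to per-word label prediction. The offset encoding, the toy labels, and both function names are illustrative assumptions; the abstract does not specify the paper's exact unit definition or special-token scheme.

# A minimal sketch of casting dependency parsing as sequence labeling,
# assuming a relative-offset encoding (not the paper's exact scheme).

def encode_arcs_as_labels(heads, deprels):
    """Encode each word's head as a 'relative offset:label' tag.

    heads[i] is the 0-based index of word i's head (-1 for root).
    """
    labels = []
    for i, (head, rel) in enumerate(zip(heads, deprels)):
        offset = "ROOT" if head == -1 else str(head - i)
        labels.append(f"{offset}:{rel}")
    return labels

def decode_labels_to_arcs(labels):
    """Invert the encoding back to head indices and relation labels."""
    heads, deprels = [], []
    for i, tag in enumerate(labels):
        offset, rel = tag.split(":", 1)
        heads.append(-1 if offset == "ROOT" else i + int(offset))
        deprels.append(rel)
    return heads, deprels

# Toy example: a three-word sentence with the verb as root.
heads = [2, 2, -1]
deprels = ["NP_SBJ", "NP_OBJ", "VP"]
tags = encode_arcs_as_labels(heads, deprels)
print(tags)  # ['2:NP_SBJ', '1:NP_OBJ', 'ROOT:VP']
assert decode_labels_to_arcs(tags) == (heads, deprels)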
Prompt Tuning For Korean Aspect-Based Sentiment Analysis
Bong-Su Kim, Seung-Ho Choi, Si-hyun Park, Jun-Ho Wang, Ji-Yoon Kim, Hyun-Kyu Jeon, Jung-Hoon Jang
http://doi.org/10.5626/JOK.2024.51.12.1043
Aspect-based sentiment analysis examines how the emotions in a text relate to specific aspects, such as product characteristics or service features. This paper presents a comprehensive methodology for applying prompt tuning techniques to multi-task token labeling problems using aspect-based sentiment analysis data. The methodology includes a pipeline for identifying emotion expression domains, which generalizes the token labeling problem into a sequence labeling problem. It also covers selecting templates to classify the separated sequences by aspect and emotion, and expanding the label words to align with the dataset's characteristics, thus optimizing the model's performance. Finally, the paper provides several experimental results and analyses for the aspect-based sentiment analysis task in a few-shot setting. The constructed data and baseline model are available on AIHUB (www.aihub.or.kr).
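To make the label-word expansion concrete, here is a minimal sketch of a verbalizer: each sentiment class maps to several label words, and a class score aggregates the language model's probabilities for those words at the masked position. The template text, the word lists, and the function name are toy assumptions, not the paper's actual resources.

# A minimal verbalizer sketch for prompt-tuned classification.
# The toy probabilities below stand in for a masked-LM forward pass.

TEMPLATE = "{sequence} 이 측면에 대한 감정은 [MASK]이다."

LABEL_WORDS = {
    "positive": ["좋다", "만족", "훌륭"],
    "negative": ["나쁘다", "불만", "실망"],
    "neutral":  ["보통", "무난"],
}

def classify(mask_token_probs):
    """mask_token_probs: dict mapping a candidate label word to the
    LM's probability of that word filling the [MASK] position."""
    scores = {
        label: sum(mask_token_probs.get(w, 0.0) for w in words)
        for label, words in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get), scores

probs = {"좋다": 0.31, "만족": 0.22, "나쁘다": 0.05, "보통": 0.10}
label, scores = classify(probs)
print(label, scores)  # positive, with per-class aggregated scores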
Constructing a Korean Knowledge Graph Using Zero Anaphora Resolution and Dependency Parsing
Chaewon Lee, Kangbae Lee, Sungyeol Yu
http://doi.org/10.5626/JOK.2024.51.8.736
This study introduces a novel approach to creating a Korean knowledge graph by employing zero anaphora resolution, dependency parsing, and knowledge base extraction using ChatGPT. To overcome the limitations of conventional language models in handling the grammatical and morphological characteristics of Korean, this research incorporates prompt engineering techniques that combine zero anaphora resolution and dependency parsing. The main focus is the 'Ko-Triple Extraction' method, which restores omitted information in sentences and analyzes their dependency structures to extract more sophisticated and accurate triples. The results demonstrate that this method greatly enhances the efficiency and accuracy of Korean text processing, and the validity of the triples was confirmed with precision metrics. This study serves as fundamental research in the field of Korean text processing and suggests potential applications in various industries. Future research aims to apply this methodology to different industrial sectors and to generate valuable business insights by expanding and connecting the knowledge graph. This approach is expected to make an important contribution not only to the advancement of natural language processing technologies but also to the effective use of Korean in the field of artificial intelligence.
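A minimal sketch of what such a two-step prompting pipeline could look like is given below: zero anaphora restoration followed by triple extraction. Here call_llm is a hypothetical stand-in for an actual ChatGPT client, and the prompt wording is an assumption rather than the paper's 'Ko-Triple Extraction' prompts.

# A sketch of a restore-then-extract prompting pipeline (assumed design).

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; plug in a real ChatGPT client here.
    raise NotImplementedError

RESTORE_PROMPT = (
    "Restore any omitted subject or object in the following Korean "
    "sentence and rewrite it as a complete sentence.\nSentence: {sentence}"
)

TRIPLE_PROMPT = (
    "Considering the dependency structure of the following sentence, "
    "extract (subject, relation, object) triples, one per line.\n"
    "Sentence: {sentence}"
)

def ko_triple_extract(sentence: str) -> list:
    restored = call_llm(RESTORE_PROMPT.format(sentence=sentence))
    raw = call_llm(TRIPLE_PROMPT.format(sentence=restored))
    triples = []
    for line in raw.splitlines():
        parts = [p.strip(" ()") for p in line.split(",")]
        if len(parts) == 3:  # keep only well-formed triples
            triples.append(tuple(parts))
    return triples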
Korean Dependency Parsing using Subtree Linking based on Machine Reading Comprehension
Jinwoo Min, Seung-Hoon Na, Jong-Hoon Shin, Young-Kil Kim, Kangil Kim
http://doi.org/10.5626/JOK.2022.49.8.617
In Korean dependency parsing, biaffine attention models have shown state-of-the-art performance. They first obtain head-level and modifier-level representations by applying two multi-layer perceptrons (MLPs) to the encoded contextualized word representations, perform attention with the modifier-level representation as the query and the head-level one as the key, and take the resulting attention score as the probability of a dependency arc between the two words. However, given two target words (i.e., a candidate head and modifier), biaffine attention methods are limited to their word-level representations and are not aware of the explicit boundaries of their phrases or subtrees. Without semantically and syntactically enriched phrase-level and subtree-level representations, biaffine attention methods may therefore be ineffective when determining a dependency arc is complicated, such as identifying a dependency between far-distant words, which often requires subtree-level or phrase-level information surrounding the target words. To address this drawback, this paper presents a dependency parsing framework based on machine reading comprehension (MRC) that explicitly utilizes subtree-level information by mapping a given child subtree and its parent subtree to a question and an answer, respectively. Experimental results on standard Korean dependency parsing datasets show that the MRC-based dependency parser outperforms the biaffine attention model. In particular, the results show that the improvements are especially strong on long sentences compared to short ones.
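For reference, the biaffine arc-scoring baseline the abstract describes can be sketched in a few lines of numpy: two projections map encoder outputs into head- and modifier-level spaces, and a bilinear form scores every candidate (head, modifier) pair. The dimensions and single-layer projections are illustrative simplifications of the full model.

# A numpy sketch of biaffine arc scoring (illustrative dimensions).

import numpy as np

rng = np.random.default_rng(0)
n_words, d_enc, d_arc = 5, 16, 8

H = rng.normal(size=(n_words, d_enc))     # contextual encoder outputs

W_head = rng.normal(size=(d_enc, d_arc))  # head "MLP" (one layer here)
W_mod = rng.normal(size=(d_enc, d_arc))   # modifier "MLP"
U = rng.normal(size=(d_arc, d_arc))       # bilinear term
b = rng.normal(size=(d_arc,))             # linear bias on head side

head_repr = np.tanh(H @ W_head)           # (n_words, d_arc)
mod_repr = np.tanh(H @ W_mod)             # (n_words, d_arc)

# scores[m, h] = score of word h being the head of word m
scores = mod_repr @ U @ head_repr.T + head_repr @ b

pred_heads = scores.argmax(axis=1)        # greedy head per modifier
print(pred_heads)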
Korean Dependency Parsing using Token-Level Contextual Representation in Pre-trained Language Model
http://doi.org/10.5626/JOK.2021.48.1.27
Dependency parsing is the problem of disambiguating sentence structure by recognizing the dependencies and labels between the words in a sentence. In contrast to previous studies that applied additional RNNs on top of a pre-trained language model, this paper proposes a dependency parsing method that uses fine-tuning alone to fully exploit the self-attention mechanism of the pre-trained language model, together with a technique for using relative distance parameters and SEP tokens. In evaluations on the Sejong parsing corpus converted according to the TTA standard guidelines, the KorBERT_base model achieved 95.73% UAS and 93.39% LAS, while the KorBERT_large model achieved 96.31% UAS and 94.17% LAS. This is an improvement of about 3% over previous studies that did not use a pre-trained language model. On the word-morpheme mixed transformation corpus of a previous study, the KorBERT_base model achieved 94.19% UAS and the KorBERT_large model 94.76% UAS.
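One plausible reading of the SEP-token technique (an assumption about the mechanics, not the paper's exact recipe) is to mark eojeol boundaries with separator tokens so that the separator positions can stand in for word-level representations without a separate merging step, as sketched below; the function name is hypothetical.

# A sketch of marking eojeol boundaries with [SEP]-style tokens
# (assumed mechanics; the paper's exact input format may differ).

def mark_word_boundaries(subword_tokens_per_word, sep="[SEP]"):
    """Flatten per-word subword lists, inserting a separator after each
    word and recording each word's boundary position."""
    tokens, boundary_idx = ["[CLS]"], []
    for subwords in subword_tokens_per_word:
        tokens.extend(subwords)
        boundary_idx.append(len(tokens))  # index of this word's [SEP]
        tokens.append(sep)
    return tokens, boundary_idx

words = [["나", "##는"], ["사과", "##를"], ["먹", "##었", "##다"]]
tokens, idx = mark_word_boundaries(words)
print(tokens)
print(idx)  # positions whose hidden states can represent each eojeol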
Development of an Information Extraction System Using the Dependency Analysis
Hyeyoung Kim, Hangyeol Sun, Youngwook Kim
http://doi.org/10.5626/JOK.2020.47.3.266
In this paper, we propose an information extraction system that automatically extracts the user-intended key syntax by analyzing the dependency parse of a sentence. Previous Open Information Extraction (OIE) studies structure information by extracting two related arguments around a verb from massive data using unsupervised methods. However, such systems may fail to extract key syntax from a sentence without a verb or a sentence with many arguments. To solve this problem, our system first splits a sentence into segments of an appropriate length to enhance the accuracy of analysis and then derives the dependency relations between words using a dependency parser. We defined four extraction rules over the most basic sentence structures and built a system that extracts meaningful chunks according to these predefined rules. Because the approach is rule-based, users can freely add or modify extraction rules and derive key syntax from any type of document. In experiments on Wikipedia data, the system achieved 33% higher accuracy than DepOE, another OIE system that applies a dependency parser. These results show that the proposed system enables easy analysis of written text and will be useful for analyzing various texts in the future.
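As an illustration of rule-based extraction over dependency arcs, the sketch below applies a single subject-predicate-object rule; the paper defines its own four rules, which are not reproduced here, and the function name and relation labels are hypothetical.

# A sketch of one extraction rule over dependency arcs (illustrative;
# not one of the paper's four rules).

def extract_spo(tokens, heads, deprels):
    """tokens[i] depends on tokens[heads[i]] with relation deprels[i];
    heads uses -1 for the root."""
    triples = []
    for v, _ in enumerate(tokens):
        subj = [i for i, h in enumerate(heads)
                if h == v and deprels[i].endswith("SBJ")]
        obj = [i for i, h in enumerate(heads)
               if h == v and deprels[i].endswith("OBJ")]
        for s in subj:
            for o in obj:
                triples.append((tokens[s], tokens[v], tokens[o]))
    return triples

tokens = ["고양이가", "생선을", "먹었다"]
heads = [2, 2, -1]
deprels = ["NP_SBJ", "NP_OBJ", "VP"]
print(extract_spo(tokens, heads, deprels))
# [('고양이가', '먹었다', '생선을')]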
An Automatic Method of Generating a Large-Scale Train Set for Bi-LSTM based Sentiment Analysis
http://doi.org/10.5626/JOK.2019.46.8.800
Sentiment analysis using deep learning requires a large-scale train set labeled with sentiment. However, manual sentiment labeling is constrained by time and cost, and it is not easy to collect enough suitable data for sentiment analysis. To solve these problems, this work uses an existing sentiment lexicon to assign a sentiment score to each sentence and, when a sentiment transformation element is present, resets the score through dependency parsing and morphological analysis, thereby automatically generating a large-scale train set labeled with sentiment; the top-k data with the highest sentiment scores are then extracted. Sentiment transformation elements include sentiment reversal, sentiment activation, and sentiment deactivation. Our experimental results show that a large-scale train set can be generated in much less time than with manual labeling and that deep learning performance improves as the amount of training data increases. The accuracy of the model using only the sentiment lexicon was 80.17%, while the accuracy of the proposed model, which incorporates natural language processing technology, was 89.17%, an overall improvement of 9 percentage points.
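The sketch below illustrates lexicon-based scoring with one transformation element (sentiment reversal); the lexicon entries and the reversal cues are toy assumptions, and the paper additionally handles sentiment activation and deactivation.

# A sketch of lexicon scoring with a reversal transformation element.

LEXICON = {"좋다": 1.0, "훌륭하다": 1.5, "나쁘다": -1.0}
REVERSAL_CUES = {"않다", "없다"}  # negation-style morphemes (toy set)

def score_sentence(morphemes):
    score = 0.0
    for i, m in enumerate(morphemes):
        if m in LEXICON:
            s = LEXICON[m]
            # If a reversal cue follows the sentiment word, flip its score.
            if i + 1 < len(morphemes) and morphemes[i + 1] in REVERSAL_CUES:
                s = -s
            score += s
    return score

print(score_sentence(["영화", "가", "좋다"]))          # 1.0
print(score_sentence(["영화", "가", "좋다", "않다"]))  # -1.0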
Korean Dependency Parsing using the Self-Attention Head Recognition Model
http://doi.org/10.5626/JOK.2019.46.1.22
Dependency parsing is the problem of resolving the structural ambiguities of natural language sentences. Recently, various deep learning techniques have been applied to it and have shown high performance. In this paper, we decompose deep learning based dependency parsing into three stages. The first stage builds a representation for each word (eojeol), the unit of dependency parsing. The second stage reflects context by incorporating information about the surrounding words into each word's representation. The last stage recognizes the head word and the dependency label. For the word representation we propose a max-pooling method widely used in CNN models, and for the contextual representation we apply the Minimal-RNN Unit, which has lower computational complexity than the LSTM and GRU. Finally, we propose a Self-Attention Head Recognition Model that includes a relative distance embedding between words for head word recognition and applies multi-task learning to recognize the dependency labels simultaneously. For evaluation, the SEJONG phrase-structure parsing corpus was transformed according to the TTA Standard Dependency Guideline. The proposed model achieved a parsing accuracy of 93.38% UAS and 90.42% LAS.
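Two of the named steps can be sketched as follows: element-wise max-pooling of morpheme vectors into an eojeol vector, and a self-attention head score augmented with a relative-distance term. All sizes, and the way the distance term enters the score, are illustrative assumptions rather than the paper's exact formulation.

# A numpy sketch of max-pooled eojeol vectors and distance-augmented
# self-attention head scoring (illustrative assumptions throughout).

import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8

# 1) Eojeol representation: element-wise max over its morpheme vectors.
morphemes = rng.normal(size=(3, d))  # e.g. 먹/VV, 었/EP, 다/EF
eojeol = morphemes.max(axis=0)       # one d-dimensional eojeol vector

# 2) Head recognition: attention score between eojeols i and j plus a
#    learned bias for their relative distance (clipped to ±2 here).
X = rng.normal(size=(n, d))          # eojeol vectors for a sentence
dist_emb = rng.normal(size=(5,))     # one bias per clipped distance

scores = X @ X.T / np.sqrt(d)
for i in range(n):
    for j in range(n):
        scores[i, j] += dist_emb[np.clip(j - i, -2, 2) + 2]

pred_heads = scores.argmax(axis=1)   # predicted head per eojeol
print(pred_heads)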
Korean Dependency Parsing using Pointer Networks
http://doi.org/10.5626/JOK.2017.44.8.822
In this paper, we propose a Korean dependency parsing model using pointer networks with multi-task learning. Multi-task learning improves performance by learning two or more problems at the same time. Based on this method, we perform dependency parsing with pointer networks, obtaining the dependency arcs and the dependency labels of the words simultaneously. We define five morpheme-level input criteria for the multi-task pointer networks over the words being parsed, and we apply a fine-tuning method to further improve performance. Our experimental results show that the proposed model achieves 91.79% UAS and 89.48% LAS, outperforming conventional Korean dependency parsers.
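The pointer-network idea can be sketched as follows: at each decoding step, an attention distribution over the input positions directly points at the head word. The multi-task label prediction and the paper's five input criteria are omitted, and all sizes are illustrative.

# A numpy sketch of pointer-style head selection (illustrative sizes).

import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8

enc = rng.normal(size=(n, d))   # encoder states, one per word
W_q = rng.normal(size=(d, d))   # decoder query projection

for t in range(n):              # one pointer step per word
    query = enc[t] @ W_q        # decoder state stand-in
    attn = enc @ query          # alignment score with every position
    attn = np.exp(attn - attn.max())
    attn /= attn.sum()          # softmax over input positions
    # A real parser would mask self-attachment; omitted in this sketch.
    head = int(attn.argmax())   # the pointed-at head index
    print(f"word {t} -> head {head} (p={attn[head]:.2f})")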