Digital Library [Search Result]
Korean Semantic Role Labeling with BERT
Jangseong Bae, Changki Lee, Soojong Lim, Hyunki Kim
http://doi.org/10.5626/JOK.2020.47.11.1021
Semantic role labeling is a natural language processing task that identifies relationships such as "who, what, how, and why" within a sentence. Semantic role labeling studies mainly use machine learning algorithms and the end-to-end method that excludes feature information. Recently, a language model called BERT (Bidirectional Encoder Representations from Transformers) has emerged and outperformed the previous state-of-the-art models in natural language processing. The performance of end-to-end semantic role labeling is mainly influenced by the structure of the machine learning model and the pre-trained language model. Thus, in this paper, we apply BERT to Korean semantic role labeling to improve its performance. As a result, the Korean semantic role labeling model using BERT achieves 85.77%, outperforming the existing Korean semantic role labeling models.
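The end-to-end approach described in this abstract can be sketched as token classification over contextual embeddings: a linear head on top of BERT outputs assigns a BIO-style role label to each token. The label set, weights, and embeddings below are illustrative stand-ins (in the paper, the embeddings would come from a pre-trained BERT encoder), not the authors' actual model.

```python
import numpy as np

# Illustrative BIO-style role labels; the real inventory follows
# Korean PropBank conventions.
LABELS = ["O", "B-ARG0", "I-ARG0", "B-ARG1", "I-ARG1", "B-PRED"]

def label_tokens(embeddings, W, b):
    """Apply a linear classification head and pick the best label per token."""
    logits = embeddings @ W + b          # shape: (seq_len, num_labels)
    return [LABELS[i] for i in logits.argmax(axis=-1)]

# Random stand-ins for BERT contextual embeddings and head weights.
rng = np.random.default_rng(0)
seq_len, hidden = 5, 8
embeddings = rng.standard_normal((seq_len, hidden))
W = rng.standard_normal((hidden, len(LABELS)))
b = np.zeros(len(LABELS))

print(label_tokens(embeddings, W, b))
```

In a trained system the head weights `W`, `b` would be learned jointly with (or on top of) the fine-tuned encoder; this sketch only shows the shape of the inference step.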
Korean Semantic Role Labeling Using Semantic Frames and Synonym Clusters
Soojong Lim, Joon-Ho Lim, Chung-Hee Lee, Hyun-Ki Kim
Semantic information and features are very important for Semantic Role Labeling (SRL), yet many machine-learning-based SRL systems mainly adopt lexical and syntactic features. Previous SRL research based on semantic information is scarce because the use of semantic information is very restricted. We propose an SRL system that adopts semantic information such as named entities, word sense disambiguation, sense-based filtering of adjunct roles, synonym clusters, frame extension based on a synonym dictionary, joint rules over syntactic-semantic information, and modified verb-specific numbered roles. According to our experiments, the proposed method outperforms lexical-syntactic approaches by about 3.77 (Korean PropBank) to 8.05 (Exobrain Corpus) F1 points.
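One of the techniques named above, frame extension via synonym clusters, can be sketched simply: a predicate missing from the frame dictionary inherits the frame of a synonymous predicate in the same cluster. The cluster and frame entries below are made-up examples, not the paper's actual resources.

```python
# Illustrative synonym cluster: 구입하다 ("purchase") and 사다 ("buy").
SYNONYM_CLUSTERS = [{"구입하다", "사다"}]

# Illustrative frame dictionary entry (roles follow numbered-role style).
FRAMES = {"사다": ["ARG0", "ARG1"]}

def frame_for(predicate):
    """Look up a predicate's frame, falling back to its synonym cluster."""
    if predicate in FRAMES:
        return FRAMES[predicate]
    for cluster in SYNONYM_CLUSTERS:
        if predicate in cluster:
            for member in cluster:
                if member in FRAMES:
                    return FRAMES[member]   # inherit a cluster member's frame
    return None

print(frame_for("구입하다"))
```

The real system combines this with word sense disambiguation so that only the correct sense's cluster is consulted; that step is omitted here.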
Syllable-based Korean POS Tagging Based on Combining a Pre-analyzed Dictionary with Machine Learning
Chung-Hee Lee, Joon-Ho Lim, Soojong Lim, Hyun-Ki Kim
This study is directed toward the design of a hybrid algorithm for syllable-based Korean POS tagging. Previous syllable-based works on Korean POS tagging have relied on a sequence labeling method and mostly used only machine learning. We present a new algorithm that integrates a machine learning method with a pre-analyzed dictionary. We used the Sejong tagged corpus for training and evaluation. While the machine learning engine achieved an eojeol precision of 0.964, the proposed hybrid engine achieved 0.990. In a Quiz-domain test, the machine learning engine and the proposed hybrid engine achieved 0.961 and 0.972, respectively. This result indicates that our method is effective for Korean POS tagging.
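The hybrid idea described in this abstract can be sketched as a two-stage lookup: consult the pre-analyzed dictionary first, and back off to the statistical tagger on a miss. The dictionary entry and the stand-in tagger below are hypothetical, not taken from the Sejong corpus or the authors' engine.

```python
# Hypothetical pre-analyzed dictionary: eojeol -> full morphological analysis.
PRE_ANALYZED = {
    "하늘이": "하늘/NNG+이/JKS",
}

def statistical_tag(eojeol):
    """Stand-in for the syllable-based sequence-labeling model."""
    return eojeol + "/UNK"

def hybrid_tag(eojeol):
    """Dictionary hit takes priority; otherwise back off to the model."""
    return PRE_ANALYZED.get(eojeol) or statistical_tag(eojeol)

print(hybrid_tag("하늘이"))
```

The design choice is that high-confidence dictionary analyses override the model only where they exist, which is how the hybrid engine can raise precision (0.964 to 0.990) without retraining the learner.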
Korean Semantic Role Labeling Using Domain Adaptation Technique
Soojong Lim, Yongjin Bae, Hyunki Kim, Dongyul Ra
Developing a high-performance Semantic Role Labeling (SRL) system for a domain requires large, manually annotated training data in that domain. However, SRL training data of sufficient size is available only for a few domains. The performance of Korean SRL degrades by almost 15% or more when a system is applied directly to another domain with relatively little training data. This paper proposes two techniques to minimize the performance degradation in domain transfer. First, we propose a domain adaptation algorithm for Korean SRL based on the prior model, one of the domain adaptation paradigms. Second, we propose using simplified features related to morphological and syntactic tags when the target-domain data is small, to suppress the data sparseness problem. We experimentally compared other domain adaptation techniques to ours, using news as the source domain and Wikipedia as the target domain. The highest performance was achieved when our two techniques were applied together. Our system's F1 score of 64.3% is 2.4~3.1% higher than those of other methods.
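The prior-model paradigm mentioned above can be sketched as follows: instead of regularizing weights toward zero, fit the target-domain data while penalizing deviation from the source-domain weights. The least-squares objective and the `adapt` function below are a simplified illustration under that assumption, not the paper's actual learner.

```python
import numpy as np

def adapt(w_source, X, y, lam=1.0, lr=0.1, steps=200):
    """Fit target data with a Gaussian prior centered at the source weights.

    Minimizes  (1/n)*0.5*||Xw - y||^2 + (lam/2)*||w - w_source||^2
    by gradient descent, so large lam keeps w close to w_source.
    """
    w = w_source.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + lam * (w - w_source)
        w -= lr * grad
    return w

w_source = np.array([1.0, -1.0])     # weights trained on the source domain
X = np.eye(2)                        # toy target-domain features
y = np.array([3.0, 2.0])             # toy target-domain responses

print(adapt(w_source, X, y, lam=1.0))
```

With `lam=0` the prior is ignored and the fit matches the target data alone; raising `lam` pulls the solution back toward the source model, which is what makes the approach useful when target data is scarce.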
Korean Semantic Role Labeling Using Structured SVM
Changki Lee, Soojong Lim, Hyunki Kim
Semantic role labeling (SRL) systems determine the semantic role labels of the arguments of predicates in natural language text. An SRL system usually needs to perform four tasks in sequence: Predicate Identification (PI), Predicate Classification (PC), Argument Identification (AI), and Argument Classification (AC). In this paper, we use the Korean Propbank to develop our Korean semantic role labeling system. We describe our Korean semantic role labeling system that uses sequence labeling with structured Support Vector Machine (SVM). The results of our experiments on the Korean Propbank dataset reveal that our method obtains a 97.13% F1 score on Predicate Identification and Classification (PIC), and a 76.96% F1 score on Argument Identification and Classification (AIC).
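The inference step of sequence labeling with a structured SVM is an argmax over whole label sequences, typically computed with Viterbi decoding over per-token emission scores and label-transition scores. The sketch below shows only that decoding step with toy scores; the learned scoring functions of the actual system are not reproduced here.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions:   (T, L) score of label l at position t
    transitions: (L, L) score of moving from label i to label j
    """
    T, L = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        # total[i, j] = best score ending in label i, then stepping to j.
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Trace back the best path from the final position.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

emissions = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy 2-token, 2-label scores
transitions = np.zeros((2, 2))
print(viterbi(emissions, transitions))
```

Structured SVM training adjusts the emission and transition parameters so that this decoder prefers the gold sequence over all others by a margin; decoding itself is the same at train and test time.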
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr