Named Entity Tagged Corpus Augmentation Using Automatic Editing
http://doi.org/10.5626/JOK.2023.50.1.11
A corpus is an essential resource for machine learning and deep learning in natural language processing. For Korean, well-refined named entity corpora are scarce compared with those available in countries with more advanced research, such as the United States, Japan, and China. Most projects for building a named entity corpus proceed manually or semi-automatically and therefore require considerable cost and effort. In this paper, we propose a novel method for automatically augmenting a small named entity corpus. The proposed method augments the corpus through automatic editing operations such as substitution, insertion, and deletion. We use probabilistic sampling rather than simple editing to make the augmented corpus natural and diverse. Experiments show that the augmented corpus improves the performance of Korean named entity recognition, which suggests that the proposed method is useful in practice.
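The editing operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO tag format, the toy English sentences, the operation weights, and every function name (`entity_spans`, `build_lexicon`, `build_fillers`, `augment`, and so on) are assumptions made for illustration. Substitution samples a replacement entity of the same label in proportion to its corpus frequency, while insertion and deletion touch only non-entity tokens so that the entity annotations stay valid.

```python
import random
from collections import Counter

def entity_spans(sentence):
    """Return (start, end, label) for each BIO-tagged entity; end is exclusive."""
    spans, start, label = [], None, None
    for i, (_, tag) in enumerate(sentence):
        if tag.startswith("I-") and start is not None and tag[2:] == label:
            continue  # the current entity continues
        if start is not None:
            spans.append((start, i, label))  # close the open entity
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    if start is not None:
        spans.append((start, len(sentence), label))
    return spans

def build_lexicon(corpus):
    """Count entity surface forms per label (hypothetical sampling source)."""
    lexicon = {}
    for sentence in corpus:
        for start, end, label in entity_spans(sentence):
            surface = tuple(tok for tok, _ in sentence[start:end])
            lexicon.setdefault(label, Counter())[surface] += 1
    return lexicon

def build_fillers(corpus):
    """Count non-entity tokens, used as candidates for insertion."""
    return Counter(tok for s in corpus for tok, tag in s if tag == "O")

def weighted_pick(counts, rng):
    """Sample one key with probability proportional to its count."""
    items = list(counts.keys())
    return rng.choices(items, weights=[counts[i] for i in items], k=1)[0]

def substitute(sentence, lexicon, rng):
    """Replace one entity with a frequency-weighted sample of the same label."""
    spans = entity_spans(sentence)
    if not spans:
        return list(sentence)
    start, end, label = rng.choice(spans)
    counts = lexicon.get(label)
    if not counts:
        return list(sentence)
    new_tokens = weighted_pick(counts, rng)
    new_span = [(tok, ("B-" if i == 0 else "I-") + label)
                for i, tok in enumerate(new_tokens)]
    return sentence[:start] + new_span + sentence[end:]

def insert(sentence, fillers, rng):
    """Insert a sampled non-entity token at a position outside every entity."""
    if not fillers:
        return list(sentence)
    inside = {p for s, e, _ in entity_spans(sentence) for p in range(s + 1, e)}
    positions = [p for p in range(len(sentence) + 1) if p not in inside]
    p = rng.choice(positions)
    return sentence[:p] + [(weighted_pick(fillers, rng), "O")] + sentence[p:]

def delete(sentence, rng):
    """Delete one randomly chosen non-entity token, if any exists."""
    candidates = [i for i, (_, tag) in enumerate(sentence) if tag == "O"]
    if not candidates:
        return list(sentence)
    i = rng.choice(candidates)
    return sentence[:i] + sentence[i + 1:]

def augment(sentence, lexicon, fillers, rng, op_weights=(0.6, 0.2, 0.2)):
    """Apply one editing operation chosen by (assumed) operation weights."""
    op = rng.choices(("substitute", "insert", "delete"), weights=op_weights, k=1)[0]
    if op == "substitute":
        return substitute(sentence, lexicon, rng)
    if op == "insert":
        return insert(sentence, fillers, rng)
    return delete(sentence, rng)

# Toy corpus, invented for this example only.
corpus = [
    [("Kim", "B-PER"), ("visited", "O"), ("Seoul", "B-LOC"), ("yesterday", "O")],
    [("Lee", "B-PER"), ("Min", "I-PER"), ("left", "O"), ("Busan", "B-LOC")],
]
rng = random.Random(0)
lexicon, fillers = build_lexicon(corpus), build_fillers(corpus)
for _ in range(3):
    print(augment(corpus[0], lexicon, fillers, rng))
```

Sampling in proportion to corpus frequency is only one possible reading of "probabilistic sampling"; a language-model score or a smoothed distribution could be swapped in at `weighted_pick` without changing the rest of the sketch.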
Assignment Semantic Category of a Word using Word Embedding and Synonyms
http://doi.org/10.5626/JOK.2017.44.9.946
Semantic role decision determines the semantic relationship between a predicate and its arguments in natural language processing (NLP) tasks. Making semantic role decisions requires both semantic role information and semantic category information. The Sejong Electronic Dictionary contains frame information that is used to determine semantic roles. In this paper, we propose a method to extend the Sejong Electronic Dictionary using word embedding and synonyms. The same experiment is performed using existing word-embedding vectors and retrofitting vectors. For words that do not appear in the Sejong Electronic Dictionary, the system performance with word-embedding vectors is 32.19% for semantic category assignment and 51.14% for extended semantic category assignment. With retrofitting vectors, the performance for such words is 33.33% for semantic category assignment and 53.88% for extended semantic category assignment. We also show that assigning semantic categories to new words that have none helps extend the semantic category vocabulary of the Sejong Electronic Dictionary.
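As a rough illustration of how a semantic category might be assigned to a word missing from a dictionary, the sketch below tries synonyms first and falls back to a vote among the nearest neighbors in embedding space. This is an assumed procedure, not the method evaluated in the paper: the function `assign_category`, the parameter `k`, the category labels, the two-dimensional vectors, and the synonym table are all hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def assign_category(word, categories, embeddings, synonyms, k=3):
    """Guess a semantic category for a word (hypothetical procedure)."""
    if word in categories:
        return categories[word]
    # Step 1: vote among synonyms that already have a category.
    votes = Counter(categories[s] for s in synonyms.get(word, ()) if s in categories)
    if votes:
        return votes.most_common(1)[0][0]
    # Step 2: vote among the k most similar categorized words.
    if word not in embeddings:
        return None
    scored = sorted(
        ((cosine(embeddings[word], embeddings[w]), w)
         for w in categories if w in embeddings),
        reverse=True,
    )
    votes = Counter(categories[w] for _, w in scored[:k])
    return votes.most_common(1)[0][0] if votes else None

# Toy data, invented for this example only.
categories = {"dog": "animal", "cat": "animal", "car": "vehicle", "bus": "vehicle"}
embeddings = {
    "dog": [1.0, 0.1], "cat": [0.9, 0.2],
    "car": [0.1, 1.0], "bus": [0.2, 0.9],
    "puppy": [0.95, 0.15],
}
synonyms = {"lorry": ["truck", "car"]}

print(assign_category("puppy", categories, embeddings, synonyms))  # animal
print(assign_category("lorry", categories, embeddings, synonyms))  # vehicle
```

Retrofitted vectors, which the abstract compares against plain word embeddings, would simply replace the values in `embeddings`; the lookup logic stays the same.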
Korean Semantic Role Labeling Using Semantic Frames and Synonym Clusters
Soojong Lim, Joon-Ho Lim, Chung-Hee Lee, Hyun-Ki Kim
Semantic information and features are very important for Semantic Role Labeling (SRL), yet many machine-learning-based SRL systems mainly adopt lexical and syntactic features. Few previous SRL studies are based on semantic information because its use is very restricted. We propose an SRL system that adopts semantic information such as named entities, word sense disambiguation, sense-based filtering of adjunct roles, synonym clusters, frame extension based on a synonym dictionary, joint rules of syntactic-semantic information, and modified verb-specific numbered roles. According to our experiments, the proposed method outperforms lexical-syntactic approaches by about 3.77 (Korean Propbank) to 8.05 (Exobrain Corpus) points in F1 score.
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal