Digital Library[ Search Result ]
Korean Morphological Analyzer for Neologism and Spacing Error based on Sequence-to-Sequence
Byeongseo Choe, Ig-hoon Lee, Sang-goo Lee
http://doi.org/10.5626/JOK.2020.47.1.70
To analyze text data from Korean internet communities, a morphological analyzer must remain accurate on sentences with spacing errors and must adequately restore the original form of out-of-vocabulary (OOV) input. However, existing Korean morphological analyzers often rely on dictionaries and complicated preprocessing for this restoration. In this paper, we propose a Korean morphological analyzer based on the sequence-to-sequence model that effectively handles both the spacing problem and the OOV problem. The model uses syllable bigrams and graphemes as additional input features, does not use a dictionary, and minimizes rule-based preprocessing. In experiments on the Sejong corpus, the proposed model performed better than other morphological analyzers that do not use a dictionary. It also performed better on a dataset with spaces removed and on a sample dataset collected from the Internet.
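As a rough illustration of the two extra input features named in the abstract, the sketch below derives graphemes (jamo) and syllable bigrams from a Korean string. It relies only on the standard Unicode layout of precomposed Hangul syllables; the function names `to_graphemes` and `syllable_bigrams` are hypothetical, and how the paper actually encodes these features is not stated in the abstract.

```python
# Assumption: graphemes are conjoining jamo obtained by Unicode arithmetic;
# the paper's exact feature encoding is not given in the abstract.
CHOSEONG = [chr(0x1100 + i) for i in range(19)]          # initial consonants
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]         # vowels
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # finals (index 0 = none)

HANGUL_BASE = 0xAC00   # first precomposed syllable, "가"
HANGUL_COUNT = 11172   # 19 * 21 * 28 precomposed syllables


def to_graphemes(syllable: str) -> list[str]:
    """Split one Hangul syllable into jamo; other characters pass through."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < HANGUL_COUNT:
        return [syllable]
    cho, rest = divmod(offset, 21 * 28)
    jung, jong = divmod(rest, 28)
    parts = [CHOSEONG[cho], JUNGSEONG[jung]]
    if jong:
        parts.append(JONGSEONG[jong])
    return parts


def syllable_bigrams(text: str) -> list[str]:
    """Return every pair of adjacent characters (syllables) in the text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]
```

Because both features are computed per character, they are still available when an input sentence has missing or misplaced spaces.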
A Persistent Log Buffer Technique using Non-volatile Memory for In-Memory Key-Value Databases
Doyoung Kim, Won Gi Choi, Hanseung Sung, Jihwan Lee, Sanghyun Park
http://doi.org/10.5626/JOK.2018.45.11.1193
Redis, an in-memory key-value database, is widely used in services that require real-time data processing and storage. Because main memory is volatile, Redis loses data if the system terminates abnormally. To prevent this, Redis stores logs on disk and restores data from them after such a termination. The AOF (append-only file) recovery mechanism, which appends requested commands to disk in log form, operates with either the “everysec” policy, which writes logs every second, or the “always” policy, which writes a log every time a command is requested. The “everysec” policy does not degrade Redis performance, but commands that have not yet been written can be lost if the system terminates abnormally within that one-second interval. Conversely, the “always” policy does not cause data loss, but it requires a disk operation for every command, which degrades performance. We propose a system model that constructs the AOF buffer in non-volatile memory and stores in it the logs that have not yet been synchronized to disk under the “everysec” policy. The proposed model prevents data loss and performs approximately 100 times better than the “always” policy.
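The trade-off between the two policies can be sketched with a toy model. This is not Redis code: the class `ToyAppendOnlyLog` and the helper `lost_after_crash` are hypothetical, time is passed in explicitly, and "disk" is just a Python list. The sketch only shows which commands would be lost when a crash happens before the next sync.

```python
class ToyAppendOnlyLog:
    """Toy model of an append-only log with two sync policies (not Redis itself)."""

    def __init__(self, policy: str):
        if policy not in ("always", "everysec"):
            raise ValueError("policy must be 'always' or 'everysec'")
        self.policy = policy
        self.pending = []     # commands held in a volatile buffer
        self.durable = []     # commands already written to "disk"
        self.last_sync = 0.0  # time of the last sync, in seconds

    def append(self, command: str, now: float) -> None:
        self.pending.append(command)
        # "always": sync on every command; "everysec": sync at most once per second.
        if self.policy == "always" or now - self.last_sync >= 1.0:
            self.sync(now)

    def sync(self, now: float) -> None:
        self.durable.extend(self.pending)
        self.pending.clear()
        self.last_sync = now

    def crash(self) -> list[str]:
        """Simulate abnormal termination; return the commands that were lost."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


def lost_after_crash(policy: str, timestamps: list[float]) -> list[str]:
    """Append one command per timestamp, then crash and report the losses."""
    log = ToyAppendOnlyLog(policy)
    for i, t in enumerate(timestamps):
        log.append(f"SET k{i} v{i}", t)
    return log.crash()
```

In this model, keeping `pending` in non-volatile memory would make `crash()` lose nothing without syncing on every command, which is the idea the abstract describes.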
Secure Format-Preserving Encryption for Message Recovery Attack
Sooyong Jeong, Dowon Hong, Changho Seo
http://doi.org/10.5626/JOK.2017.44.8.860
Recently, due to the personal information security act, the encryption of personal information has attracted attention. However, if a conventional encryption scheme is used directly, the database schema must be changed because such schemes do not preserve the format of the data, which can incur a large cost. Format-Preserving Encryption (FPE) has therefore emerged as an important technique that ensures data confidentiality while leaving the database schema unchanged. Accordingly, the National Institute of Standards and Technology (NIST) recently published FF1 and FF3 as standards for FPE, although security problems against message recovery attacks have been found in both. In this paper, we study and analyze the FF1 and FF3 standards, as well as the message recovery attacks on these schemes. We also study an FPE scheme that is secure against message recovery attacks and verify its efficiency by implementing the standardized FF1 and FF3.
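To make "format-preserving" concrete, here is a minimal toy sketch: a Feistel network over decimal strings whose output has the same length and alphabet as its input. It is neither FF1 nor FF3, it is not the scheme studied in the paper, and it must not be used for real data. The function names and the choice of HMAC-SHA256 as the round function are assumptions made for illustration only.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_index: int, half: str) -> int:
    """Keyed pseudorandom integer derived from one half of the state."""
    message = bytes([round_index]) + half.encode("ascii")
    return int.from_bytes(hmac.new(key, message, hashlib.sha256).digest(), "big")


def _check(digits: str, rounds: int) -> None:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("need at least two ASCII decimal digits")
    if rounds <= 0 or rounds % 2:
        raise ValueError("rounds must be a positive even number")


def toy_fpe_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    _check(digits, rounds)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(rounds):
        m = u if i % 2 == 0 else v  # width of the half being replaced
        c = (int(a) + _round_value(key, i, b)) % 10 ** m
        a, b = b, str(c).zfill(m)   # zero-pad so the length never changes
    return a + b


def toy_fpe_decrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    _check(digits, rounds)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(rounds)):  # undo the rounds in reverse order
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b
```

A 13-digit identifier encrypts to another 13-digit string, so a column typed as fixed-length digits needs no schema change.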
Recovering Network Joining State for Normal/Abnormal Termination of Battlefield Management System
http://doi.org/10.5626/JOK.2017.44.8.749
A weapon system based on voice calls can suffer delay, error, or damage to messages during the exchange of information. Furthermore, because each weapon system has its own message format, data distribution is limited. Therefore, the Korea Variable Message Format (KVMF) has been developed in this study to use a standard-sized data format that guarantees transmission quality and minimizes the transmission amount. The ground tactical data link system shares tactical information quickly and accurately by incorporating a field management system that uses KVMF standard messages into the mobile weapon system. In this study, we examine whether a mission can be resumed immediately by recovering the network joining state after a normal or abnormal termination of the battlefield management system.
Techniques to Guarantee Real-Time Fault Recovery in Spark Streaming Based Cloud System
Jungho Kim, Daedong Park, Sangwook Kim, Yongshik Moon, Seongsoo Hong
In a real-time cloud environment, the data analysis framework plays a pivotal role. Among existing frameworks, Spark Streaming meets most real-time requirements. However, it does not meet the requirement for second-scale real-time fault recovery. Spark Streaming's fault recovery time increases in proportion to the length of the transformation history, called the lineage, because it recovers the last state data from the cumulative lineage recorded during normal operation. As a result, the fault recovery time is not bounded. In addition, a second-scale fault recovery time is impossible to achieve because reading the initial state data from fault-tolerant storage takes tens of seconds. In this paper, we propose two techniques to solve these problems and apply them to Spark Streaming 1.6.2. Experimental results show that the fault recovery time is bounded and that the average fault recovery time is reduced by up to 41.57%.
Recovery of Software Module-View using Dependency and Author Entropy of Modules
Jung-Min Kim, Chan-Gun Lee, Ki-Seong Lee
In this study, we propose a novel software clustering technique that recovers the software module-view by using the dependency and author entropy of modules. The proposed method first clusters modules based on structural and logical dependencies, and then migrates selected modules within the clustered result by using the author entropy of each module. To evaluate the proposed method, we applied it to open-source projects whose ground-truth decompositions are well known and calculated the MoJoFM values of the recovery results. A comparison with the MoJoFM values of previously studied techniques demonstrated the effectiveness of the proposed method.
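The abstract does not define author entropy. One plausible reading, assumed here, is the Shannon entropy of the distribution of changes to a module across its authors. The function name `author_entropy` and the input format (one author name per commit touching the module) are hypothetical.

```python
import math
from collections import Counter


def author_entropy(commit_authors: list[str]) -> float:
    """Shannon entropy (in bits) of who committed to a module.

    Assumption: each list element is the author of one commit that touched
    the module. 0.0 means a single author; higher values mean the work is
    spread more evenly across more authors.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        share = count / total
        entropy -= share * math.log2(share)
    return entropy
```

Under this reading, a module maintained by a single person scores 0 bits, while one edited equally by four people scores 2 bits, which gives a numeric signal for deciding which modules to migrate between clusters.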
Error Correction in Korean Morpheme Recovery using Deep Learning
Korean morphological analysis is a difficult process. Because Korean is an agglutinative language, morpheme recovery is one of the most important steps in morphological analysis. Methods using heuristic rules and pre-analyzed partial words have been examined for this step, but their performance is limited because they do not use contextual information. In this study, we built a Korean morpheme recovery system using deep learning, which uses word embeddings to exploit contextual information. In ‘들/VV’ and ‘듣/VV’ morpheme recovery, the system achieved 97.97% accuracy, outperforming an SVM (Support Vector Machine), which achieved 96.22% accuracy.
Data Consistency-Control Scheme Using a Rollback-Recovery Mechanism for Storage Class Memory
Hyun Ku Lee, Junghoon Kim, Dong Hyun Kang, Young Ik Eom
Storage Class Memory (SCM) has been considered a next-generation storage device because it can be used as both memory and storage. However, recently proposed file systems for SCM have significant data consistency problems, such as insufficient data consistency or excessive data consistency-control overhead. This paper proposes a novel data consistency-control scheme that uses a rollback-recovery scheme instead of the Write Ahead Logging (WAL) scheme and changes the write mode for log data depending on the ratio of modified data in a block. The proposed scheme reduces the log data size and the synchronization cost for data consistency. To evaluate the proposed scheme, we implemented it on a Linux 3.10.2-based system and measured its performance. The experimental results show that our scheme improves write throughput by 9 times on average compared with the legacy data consistency-control scheme.
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal