A MapReduce-based Prior Probability Optimization Algorithm for Topic Extraction
http://doi.org/10.5626/JOK.2018.45.5.478
Various topic extraction algorithms have been used to obtain meaningful information from large collections of text documents. Because these algorithms are based on a Bayesian probability model, the prior probabilities α and β must be supplied as inputs. Until now, users running topic extraction models have had to either rely on default prior probability values or choose them subjectively. In this study, we propose a MapReduce-based prior probability optimization algorithm that determines the prior probability values systematically while also improving performance and accuracy on large-scale input data. Unlike the previous single-threaded algorithm, the proposed MapReduce-based algorithm quickly determines prior probability values suited to the input data. The topic extraction algorithm is then executed with the chosen values and extracts topics with high accuracy. Our experimental results show that the proposed method outperforms the previous method in terms of topic coherence and performance.
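
The abstract does not spell out the optimization procedure, so the following is only a minimal sketch of one way a map/reduce-style search over candidate priors could look, not the authors' algorithm. It assumes a plain grid of candidate (α, β) pairs, and every name in it (`score_priors`, `map_phase`, `reduce_phase`, `optimize_priors`) is hypothetical. In particular, `score_priors` is a toy placeholder for "train the topic model with these priors and measure topic coherence"; it is not a real coherence measure.

```python
from functools import reduce
from itertools import product


def score_priors(alpha, beta):
    """Hypothetical stand-in for training a topic model with (alpha, beta)
    and returning its topic coherence. This toy surface simply peaks at
    alpha=0.1, beta=0.01 so the example is runnable without a corpus."""
    return -((alpha - 0.1) ** 2) - ((beta - 0.01) ** 2)


def map_phase(candidate):
    """Map step: score one (alpha, beta) candidate independently."""
    alpha, beta = candidate
    return (score_priors(alpha, beta), alpha, beta)


def reduce_phase(best, current):
    """Reduce step: keep whichever scored candidate has the higher score."""
    return current if current[0] > best[0] else best


def optimize_priors(alphas, betas):
    """Score every (alpha, beta) pair, then reduce to the best-scoring one."""
    candidates = product(alphas, betas)
    scored = map(map_phase, candidates)  # each candidate is independent
    _, alpha, beta = reduce(reduce_phase, scored)
    return alpha, beta


best_alpha, best_beta = optimize_priors([0.01, 0.1, 1.0], [0.001, 0.01, 0.1])
```

Because each candidate is scored independently, the map step is the part that a MapReduce framework could distribute across workers; the built-in `map` used here runs sequentially and only illustrates the structure.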

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal