A MapReduce-based Prior Probability Optimization Algorithm for Topic Extraction 


Vol. 45, No. 5, pp. 478-488, May 2018
DOI: 10.5626/JOK.2018.45.5.478



  Abstract

Various topic extraction algorithms have been used to obtain meaningful information from large collections of text documents. Because these algorithms are built on a Bayesian probability model, the prior probabilities α and β must be supplied as inputs. Until now, users running topic extraction models have had to either rely on default prior probability values or choose them subjectively. In this study, we propose a MapReduce-based prior probability optimization algorithm that determines the prior probability values systematically while also improving performance and accuracy on large-scale input data. Unlike the previous single-threaded algorithm, the proposed MapReduce-based algorithm quickly determines prior probability values that suit the input data. Running the topic extraction algorithm with the chosen prior probability values then yields topics with high accuracy. Our experimental results show that the proposed method outperforms the previous method in terms of both topic coherence and performance.
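The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of the general idea: evaluate candidate (α, β) pairs independently in a map phase and keep the best-scoring pair in a reduce phase. Every name here (`toy_score`, `map_phase`, `reduce_phase`, `optimize_priors`) and the candidate grid are assumptions made for illustration, not the authors' method. In particular, `toy_score` is a placeholder for a real quality measure such as the topic coherence of a model trained with the given priors.

```python
from functools import reduce
from itertools import product


def toy_score(alpha, beta):
    # Placeholder score (assumption): stands in for training a topic model
    # with priors (alpha, beta) and measuring its topic coherence.
    # It peaks at alpha = 0.5, beta = 0.1 purely for demonstration.
    return -((alpha - 0.5) ** 2) - ((beta - 0.1) ** 2)


def map_phase(candidate, score_fn):
    # Map step: score one (alpha, beta) candidate independently of the others.
    # All candidates share one key so a single reducer compares them.
    alpha, beta = candidate
    return ("best", (score_fn(alpha, beta), alpha, beta))


def reduce_phase(left, right):
    # Reduce step: keep whichever candidate has the higher score.
    return left if left[0] >= right[0] else right


def optimize_priors(alphas, betas, score_fn):
    # Enumerate the candidate grid, score each pair, then fold to the best.
    # In a real MapReduce job the map calls would run on separate workers.
    mapped = [map_phase(c, score_fn) for c in product(alphas, betas)]
    values = [value for _key, value in mapped]
    score, alpha, beta = reduce(reduce_phase, values)
    return alpha, beta, score


best_alpha, best_beta, best_score = optimize_priors(
    [0.01, 0.1, 0.5, 1.0],  # hypothetical alpha candidates
    [0.01, 0.1, 0.5],       # hypothetical beta candidates
    toy_score,
)
```

Because each candidate is scored without reference to any other, the map calls can be distributed across workers, which is one plausible reason such a search would outpace a single-threaded one on large inputs.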




  Cite this article

[IEEE Style]

S. Oh and B. On, "A MapReduce-based Prior Probability Optimization Algorithm for Topic Extraction," Journal of KIISE, JOK, vol. 45, no. 5, pp. 478-488, 2018. DOI: 10.5626/JOK.2018.45.5.478.


[ACM Style]

SeonYeong Oh and Byung-Won On. 2018. A MapReduce-based Prior Probability Optimization Algorithm for Topic Extraction. Journal of KIISE, JOK, 45, 5, (2018), 478-488. DOI: 10.5626/JOK.2018.45.5.478.


[KCI Style]

오선영, 온병원, "주제 추출을 위한 맵리듀스 기반의 사전확률 최적화 알고리즘," 한국정보과학회 논문지, 제45권, 제5호, 478~488쪽, 2018. DOI: 10.5626/JOK.2018.45.5.478.









Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
