Feature Extraction to Detect Hoax Articles 


Vol. 43, No. 11, pp. 1210-1215, Nov. 2016



  Abstract

Readership of online newspapers has grown with the proliferation of smart devices. However, fierce competition among Internet newspaper companies has led to a large increase in the number of hoax articles. A hoax article is one whose title does not reflect the content of the main story and therefore misleads readers about what the article contains. We observe that hoax articles share certain characteristics, such as unnecessary celebrity quotations, a mismatch between the title and the content, and incomplete sentences. Based on these characteristics, we extract and validate features for identifying hoax articles. We build a large-scale training dataset by analyzing keywords in reader replies to articles, and thereby extract five effective features. We evaluate a support vector machine classifier on the extracted features and observe 92% accuracy on our validation set. We also present a selective bigram model that measures the consistency between the title and the content and that can be used more generally to analyze short texts.
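To make the idea of title-content consistency concrete, the following is a minimal sketch of a bigram overlap score. It is not the selective bigram model of the paper: the abstract does not state how bigrams are selected, so this sketch simply uses every word bigram in the title. The function names `tokenize`, `bigrams`, and `title_body_consistency` are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplification)."""
    return re.findall(r"\w+", text.lower())


def bigrams(tokens):
    """Return the set of adjacent token pairs."""
    return set(zip(tokens, tokens[1:]))


def title_body_consistency(title, body):
    """Fraction of title bigrams that also occur in the body (0.0 to 1.0).

    Assumption: all title bigrams are used. A selective variant would
    filter `title_bigrams` here by some criterion not given in the abstract.
    """
    title_bigrams = bigrams(tokenize(title))
    if not title_bigrams:
        return 0.0  # titles with fewer than two tokens carry no bigram signal
    body_bigrams = bigrams(tokenize(body))
    return len(title_bigrams & body_bigrams) / len(title_bigrams)


if __name__ == "__main__":
    print(title_body_consistency(
        "Mayor opens new bridge",
        "The mayor opens new bridge downtown today.",
    ))
```

Under these assumptions, a low score suggests that the title shares little phrasing with the body. Such a score could serve as one input feature to a classifier, alongside the other characteristics the abstract mentions.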




  Cite this article

[IEEE Style]

S. Heo and K. Sohn, "Feature Extraction to Detect Hoax Articles," Journal of KIISE, JOK, vol. 43, no. 11, pp. 1210-1215, 2016.


[ACM Style]

Seong-Wan Heo and Kyung-Ah Sohn. 2016. Feature Extraction to Detect Hoax Articles. Journal of KIISE, JOK, 43, 11, (2016), 1210-1215.


[KCI Style]

Seong-Wan Heo, Kyung-Ah Sohn, "Feature Extraction to Detect Hoax Articles" (낚시성 인터넷 신문기사 검출을 위한 특징 추출), Journal of KIISE, Vol. 43, No. 11, pp. 1210~1215, 2016.






Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
