Search : [ keyword: Twitter ] (6)

A Twitter News-Classification Scheme Using Semantic Enrichment of Word Features

Seonmi Ji, Jihoon Moon, Hyeonwoo Kim, Eenjun Hwang

http://doi.org/10.5626/JOK.2018.45.10.1045

As Twitter has grown popular as a news platform, many news articles are posted there, and information and opinions about them spread very quickly. However, because an enormous amount of Twitter news is posted simultaneously, users have difficulty selectively browsing news related to their interests. Many studies have addressed the classification of Twitter news using machine learning and deep learning. In general, conventional machine learning schemes suffer from data sparsity and semantic gap problems, while deep learning schemes require a large amount of data. To solve these problems, we propose a Twitter news-classification scheme that uses semantic enrichment of word features. Specifically, we first extract the features of Twitter news data using the Vector Space Model. Second, we enrich those features using DBpedia Spotlight. Finally, we construct a topic-classification model based on various machine learning techniques and demonstrate experimentally that the proposed model is more effective than traditional methods.
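
The first two steps of the abstract above (Vector Space Model features, then semantic enrichment) can be illustrated with a minimal sketch. It is not the authors' implementation: TF-IDF is assumed as the term weighting, and the dictionary `concept_map` is a hypothetical stand-in for the annotations an entity linker such as DBpedia Spotlight would return. All function names and the sample tweets are made up for illustration.

```python
import math
from collections import Counter

PUNCT = ".,!?:;\"'()"

def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]

def enrich(tokens, concept_map):
    """Append one concept token per word found in concept_map.

    concept_map is a hypothetical stand-in for an entity annotator.
    """
    return tokens + [concept_map[t] for t in tokens if t in concept_map]

def tfidf_vectors(docs):
    """Turn token lists into sparse TF-IDF vectors (dict: term -> weight)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Made-up tweets: the first two are about the same topic but share no words.
tweets = [
    "Messi scores twice in the final",
    "Ronaldo misses a penalty",
    "Parliament passes the budget",
]
concept_map = {"messi": "concept:SoccerPlayer", "ronaldo": "concept:SoccerPlayer"}

plain = tfidf_vectors([tokenize(t) for t in tweets])
enriched = tfidf_vectors([enrich(tokenize(t), concept_map) for t in tweets])

plain_sim = cosine(plain[0], plain[1])
enriched_sim = cosine(enriched[0], enriched[1])
```

In this toy setting the two sports tweets have no terms in common, so their plain similarity is zero, while the shared concept token gives the enriched vectors a positive similarity. This is the sparsity and semantic-gap effect that enrichment is meant to reduce.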

A Model for Nowcasting Commodity Price based on Social Media Data

Jaewoo Kim, Meeyoung Cha, Jong Gun Lee

http://doi.org/10.5626/JOK.2017.44.12.1258

Capturing real-time daily information on food prices is invaluable for helping policymakers and development organizations address food security problems and improve public welfare. This study analyses the possible use of large-scale online data, available due to growing Internet connectivity in developing countries, to provide updates on the food security landscape. We conduct a case study of Indonesia to develop a time-series prediction model that nowcasts daily prices for four food commodities that are essential in the region: beef, chicken, onion and chilli. By using Twitter price quotes, we demonstrate the capability of social data to function as an affordable and efficient proxy for traditional offline price statistics.

Feature Expansion based on LDA Word Distribution for Performance Improvement of Informal Document Classification

Hokyung Lee, Seon Yang, Youngjoong Ko

http://doi.org/

Data such as Twitter posts, Facebook posts, and customer reviews belong to the informal document group, whereas newspapers, which go through a grammar-correction step, belong to the formal document group. Finding consistent rules or patterns is more difficult in informal documents than in formal documents. Hence, additional approaches are needed to improve informal document analysis. In this study, we classified Twitter data, a representative type of informal document, into ten categories. To improve performance, we revised and expanded features based on the LDA (Latent Dirichlet Allocation) word distribution. Using the LDA top-ranked words, the other words were separated or bundled, and the feature set was thus expanded repeatedly. Finally, we conducted document classification with the expanded features. Experimental results indicated that the proposed method improved the micro-averaged F1-score by 7.11 percentage points compared with the results before the feature expansion step.
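
The metric reported above, the micro-averaged F1-score, pools true positives, false positives, and false negatives over all categories before computing precision and recall. The sketch below shows that computation and a gain expressed in percentage points. The labels and predictions are invented toy data, not the paper's results, and `micro_f1` is an illustrative helper name.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all labels, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, guess in zip(y_true, y_pred):
            if guess == label and truth == label:
                tp += 1
            elif guess == label:
                fp += 1  # predicted this label, but the truth differs
            elif truth == label:
                fn += 1  # missed an instance of this label
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data (hypothetical): gold labels and two sets of predictions.
truth = ["sports", "sports", "politics", "tech"]
baseline_pred = ["sports", "politics", "sports", "tech"]
expanded_pred = ["sports", "politics", "politics", "tech"]

before = micro_f1(truth, baseline_pred)
after = micro_f1(truth, expanded_pred)

# A difference between two percentages is stated in percentage points.
gain_in_points = (after - before) * 100
```

When every document receives exactly one label, as in this toy data, the micro-averaged F1-score equals plain accuracy.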

Spatiotemporal Data Visualization using Gravity Model

Seokyeon Kim, Hanbyul Yeon, Yun Jang

http://doi.org/

Visual analysis of spatiotemporal data has produced a variety of techniques for exploring the data with time information and discovering patterns in it. Overall trend flow patterns help users analyze geo-referenced temporal events. However, it is difficult to extract and visualize overall trend flow patterns from data that has no trajectory information for movements. To visualize overall trend flow patterns, in this paper we estimate continuous distributions of discrete events over time using kernel density estimation (KDE), and we extract vector fields from the continuous distributions using the gravity model. We then apply our technique to Twitter data to validate it.
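
The two building blocks named above, KDE and a gravity model, can be sketched roughly as follows. The abstract does not say how masses and distances are defined, so this sketch makes its own assumptions: a Gaussian kernel on a small integer grid, each cell's density used as its mass, and an inverse-square pull between cells. The function names `kde_grid` and `gravity_field` and the sample event are hypothetical.

```python
import math

def kde_grid(points, width, height, bandwidth=1.0):
    """Gaussian KDE of 2D event locations, evaluated at integer grid cells."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    return [
        [
            norm * sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
                for px, py in points
            )
            for x in range(width)
        ]
        for y in range(height)
    ]

def gravity_field(mass):
    """Vector field where each cell is pulled toward all other cells.

    Assumption: the pull of a cell is its mass divided by squared distance.
    """
    height, width = len(mass), len(mass[0])
    field = [[(0.0, 0.0)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            fx = fy = 0.0
            for y2 in range(height):
                for x2 in range(width):
                    dx, dy = x2 - x, y2 - y
                    dist_sq = dx * dx + dy * dy
                    if dist_sq == 0:
                        continue  # a cell exerts no pull on itself
                    dist = math.sqrt(dist_sq)
                    pull = mass[y2][x2] / dist_sq
                    fx += pull * dx / dist
                    fy += pull * dy / dist
            field[y][x] = (fx, fy)
    return field

# One made-up event at (x=5, y=2) on a 7x5 grid.
density = kde_grid([(5.0, 2.0)], width=7, height=5)
field = gravity_field(density)
fx, fy = field[2][1]  # vector at cell (x=1, y=2)
```

With these assumptions, the vector at a cell to the left of the event points in the positive x direction, toward the density peak. Computing such a field for the densities of successive time windows is one plausible way to obtain flow-like patterns without trajectory data.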

An Evaluation Method for Contents Importance Based on Twitter Characteristics

Euijong Lee, Jeong-Dong Kim, Doo-Kwon Baik

http://doi.org/

Twitter is a social network service that generates about 140 million pieces of content a day. This content contains a wide variety of information, and many researchers study it in various fields. In this research, we propose a method for evaluating the importance of content based on the characteristics of Twitter. We found that the number of followers indicates a user's popularity and that the number of retweets indicates the popularity of content. We evaluate the proposed method on real Twitter data to demonstrate its effectiveness. We also found that information providers on Twitter are public users who represent a company or a specific group.

Spammer Detection using Features based on User Relationships in Twitter

Chansik Lee, Juntae Kim

http://doi.org/

Twitter is one of the most famous social network services (SNS) in the world. Twitter spammer accounts, which are easily created with e-mail authentication, deliver harmful content to Twitter users. This paper presents a spammer detection method that uses features based on the relationships between users on Twitter. The relationship-based features include the friend relationship, which represents user preferences, and the type relationship, which represents similarity between users. We compared the performance of the proposed method with that of a conventional spammer detection method on datasets with spammer ratios from 3% to 30%. The experimental results show that the proposed method outperformed the conventional method with both Naive Bayesian classification and decision tree learning.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr