Search : [ author: 이재길 ] (5)

Latent Representation Learning for Autoencoder-based Top-K Recommender System

Dongmin Park, Junhyeok Kang, Jae-Gil Lee

http://doi.org/10.5626/JOK.2020.47.2.207

As the number of products on the Internet grows exponentially, it becomes more difficult for customers to choose the product they want. Many researchers have been working to develop recommender systems that satisfy the potential demand of the customer and increase the profit of the seller. Recently, collaborative filtering (CF) methods based on an autoencoder have shown high performance. However, little attention has been paid to improving recommendation performance by changing the distribution of the latent representations. In this paper, we propose the Dense Latent Representation learning method (DenseLR), which is combined with an autoencoder-based collaborative filtering method to further improve product recommendation performance. The key idea of DenseLR is to tighten collaborative filtering effects in the latent space by effectively densifying the latent representations of user (or item) rating vectors. In performance comparison experiments on three real-world datasets, DenseLR showed the highest recommendation performance on all datasets. Furthermore, DenseLR can be flexibly combined with a wide range of autoencoder-based CF models, and we empirically validated improvements in the F1@k score ranging from 4.6% to 23.7%.
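
The F1@k score reported above is commonly computed per user as the harmonic mean of precision@k and recall@k. The abstract does not give the paper's exact definition, so the following is a minimal sketch under that common convention; the function name `f1_at_k` and the toy item IDs are hypothetical.

```python
def f1_at_k(recommended, relevant, k):
    """F1@k for one user (assumed convention: harmonic mean of
    precision@k and recall@k; not taken from the paper itself)."""
    top_k = recommended[:k]          # the k highest-ranked items
    relevant = set(relevant)         # items the user actually liked
    hits = sum(1 for item in top_k if item in relevant)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Toy example: 2 of the top 4 recommendations are relevant,
# and the user has 3 relevant items in total.
score = f1_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=4)
```

A dataset-level score would then typically be the mean of this value over all test users.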

A Greedy Rule Allocation Algorithm for Efficient Distributed Complex Event Processing

Yooju Shin, Jae-Gil Lee

http://doi.org/10.5626/JOK.2019.46.12.1222

Complex event processing (CEP) is event processing over multiple stream sources to infer events that indicate complicated circumstances. As stream data grows larger, CEP engines have been parallelized to benefit from distributed computing. However, if stream data and CEP rules are allocated without considering the resulting computational cost on each engine, distributed CEP can replicate stream data redundantly and increase latency. In this paper, we propose an efficient rule allocation algorithm to prevent such situations. The algorithm assigns priorities to the event rules and allocates rules in priority order, placing each rule on the engine that minimizes the increase in the value of the proposed cost function. We demonstrate the superiority of our algorithm in two tests. In the optimization verification test, our algorithm achieves results closest to the optimal results among the compared algorithms. In the performance test, our algorithm shows lower latency and a lower data replication ratio in a distributed CEP system using a real-world dataset and event rules.
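
The greedy scheme described in this abstract (order rules by priority, then place each on the engine with the smallest cost increase) can be sketched as follows. The abstract specifies neither the priority rule nor the cost function, so both are assumptions here: `example_cost` charges a hypothetical weight per distinct stream an engine must receive plus the square of its total compute cost, and rules that read more streams are placed first.

```python
def example_cost(assigned, rules, stream_weight=5.0):
    """Hypothetical per-engine cost: replication term + squared compute load."""
    streams = set()
    compute = 0.0
    for name in assigned:
        rule_streams, rule_compute = rules[name]
        streams |= rule_streams
        compute += rule_compute
    return stream_weight * len(streams) + compute ** 2


def greedy_allocate(rules, num_engines, cost):
    """rules maps rule name -> (set of input streams, compute cost)."""
    engines = [[] for _ in range(num_engines)]
    # Assumed priority: more input streams first, then higher compute cost.
    order = sorted(rules, key=lambda r: (len(rules[r][0]), rules[r][1]),
                   reverse=True)
    for name in order:
        # Pick the engine whose cost grows least when this rule is added.
        best = min(
            range(num_engines),
            key=lambda e: cost(engines[e] + [name], rules)
            - cost(engines[e], rules),
        )
        engines[best].append(name)
    return engines


rules = {
    "r1": ({"A", "B"}, 2.0),
    "r2": ({"A", "B"}, 2.0),
    "r3": ({"C"}, 1.0),
    "r4": ({"C"}, 1.0),
}
allocation = greedy_allocate(rules, num_engines=2, cost=example_cost)
```

With this toy cost, rules that share input streams end up on the same engine, so no stream has to be replicated to both engines.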

Load Balancing for Distributed Processing of Real-time Spatial Big Data Stream

Susik Yoon, Jae-Gil Lee

http://doi.org/10.5626/JOK.2017.44.11.1209

A variety of sensors are widely used these days, and it has become much easier to acquire spatial big data streams from various sources. Since spatial data streams have inherently skewed and dynamically changing distributions, the system must distribute the load among workers effectively. Previous solutions to this load imbalance problem are not directly applicable to processing spatial data. In this research, we propose Adaptive Spatial Key Grouping (ASKG). The main idea of ASKG is to use the previous distribution of the data streams to adaptively suggest a new grouping scheme that evenly distributes the future load among workers. We evaluate the validity of the proposed algorithm in various environments by conducting experiments with real datasets while varying the number of workers, input rate, and processing overhead. Compared to two alternative algorithms, ASKG improves system performance in terms of load imbalance, throughput, and latency.
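
The general idea of deriving a grouping scheme from the previously observed distribution can be illustrated with a simple stand-in. This is not the ASKG algorithm, whose details the abstract does not give: the sketch assumes a uniform grid as the spatial key and uses a greedy heaviest-cell-first assignment to the currently least-loaded worker. The names `cell_of` and `regroup` are hypothetical.

```python
import heapq
from collections import Counter


def cell_of(x, y, cell_size):
    """Map a point to a grid cell, used here as the spatial key (assumption)."""
    return (int(x // cell_size), int(y // cell_size))


def regroup(points, num_workers, cell_size=1.0):
    """Build a cell -> worker mapping from the previous window's points."""
    counts = Counter(cell_of(x, y, cell_size) for x, y in points)
    heap = [(0, w) for w in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    grouping = {}
    # Heaviest cells first; each goes to the least-loaded worker so far.
    for cell, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        grouping[cell] = worker
        heapq.heappush(heap, (load + n, worker))
    return grouping


# Skewed toy window: 4, 3, 2, and 1 points in four different cells.
window = [(0.1, 0.1)] * 4 + [(1.5, 0.2)] * 3 + [(0.3, 1.7)] * 2 + [(1.2, 1.9)]
grouping = regroup(window, num_workers=2)
```

In a streaming setting, such a mapping would be recomputed periodically so that it tracks the changing distribution.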

A Comparative Analysis of Recursive Query Algorithm Implementations based on High Performance Distributed In-Memory Big Data Processing Platforms

Minseo Kang, Jaesung Kim, Jaegil Lee

http://doi.org/

Recursive query algorithms are used in many social network services, e.g., for reachability queries in social networks. Recently, the size of social network data has increased as social network services have evolved. As a result, it is almost impossible to run a recursive query algorithm on a single machine. In this paper, we implement recursive queries on two popular in-memory distributed platforms, Spark and Twister, to solve this problem. We evaluate the performance of the two implementations using 50 machines on Amazon EC2 and two real-world data sets, LiveJournal and ClueWeb. The results show that the recursive query algorithm performs better on Spark for the LiveJournal input data set, which has a relatively high average degree but fewer vertices. However, the recursive query on Twister outperforms Spark for the ClueWeb input data set, which has a relatively low average degree but many vertices.
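
For readers unfamiliar with reachability as a recursive query, the following single-machine sketch shows the iterate-until-fixpoint structure that a distributed implementation would parallelize. It does not reproduce the paper's Spark or Twister code; the function name `reachable` and the toy graph are hypothetical.

```python
def reachable(edges, source):
    """Return every vertex reachable from source over directed edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)

    reached = {source}
    frontier = {source}
    # Each round expands only the newly discovered vertices
    # (the frontier) and stops when no new vertex appears.
    while frontier:
        candidates = set()
        for u in frontier:
            candidates |= adj.get(u, set())
        frontier = candidates - reached
        reached |= frontier
    return reached


edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
from_one = reachable(edges, 1)
```

Each loop iteration corresponds to one recursion step of the query, so the number of rounds is bounded by the longest shortest path from the source.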

A Fast and Scalable Image Retrieval Algorithm by Leveraging Distributed Image Feature Extraction on MapReduce

Hwan-Jun Song, Jin-Woo Lee, Jae-Gil Lee

http://doi.org/

With mobile devices showing marked improvement in performance in the age of the Internet of Things (IoT), there is demand for rapid processing of large amounts of multimedia big data. However, because research on image search has focused mainly on increasing accuracy despite environmental changes, progress on fast processing of high-resolution multimedia data queries has been slow and inefficient. Hence, we propose a new distributed image search algorithm that ensures both high accuracy and rapid response by using distributed image feature extraction based on MapReduce, and that solves the problem of memory scalability with BIRCH indexing. In addition, we conducted experiments on the accuracy, processing time, and scalability of this algorithm to confirm its performance.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
