PARPA: A Parallel Framework Simultaneously Using Heterogeneous Architecture for High Performance Computing
Hyojae Cho, Taehyun Han, Hyeonmyeong Lee, Heeseung Jo
http://doi.org/10.5626/JOK.2019.46.9.876
With the substantial performance improvements achieved in GPUs, they have come to be commonly used not only in computer graphics but also in high performance computing. Simply using a CPU and a GPU concurrently is not difficult. However, distributing work and adjusting the computing ratio between these heterogeneous processors are challenging issues. In this paper, we propose a novel framework named PARPA, which automatically distributes tasks to a CPU and a GPU and processes them on both. PARPA can maximize computation performance by using a CPU and a GPU simultaneously. The load balancing between them can be performed dynamically based on their usage and features. The evaluation results indicate that PARPA shows 3.48 times better performance.
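The idea of adjusting the computing ratio dynamically can be illustrated with a minimal sketch. It is not PARPA's actual algorithm, which the abstract does not describe; the function names `split` and `rebalance`, the throughput-based rule, and the clamping bounds are all assumptions made for illustration.

```python
def split(items, gpu_share):
    """Split a batch into a GPU part and a CPU part by the current share."""
    n_gpu = round(len(items) * gpu_share)
    return items[:n_gpu], items[n_gpu:]


def rebalance(gpu_share, cpu_time, gpu_time, total_items,
              low=0.05, high=0.95):
    """Return a new GPU share so both processors would finish together.

    Assumes both processors did some work in the previous round
    (0 < gpu_share < 1) and that the measured times are positive.
    """
    cpu_items = total_items * (1.0 - gpu_share)
    gpu_items = total_items * gpu_share
    # Throughput (items per second) observed in the previous round.
    cpu_rate = cpu_items / cpu_time
    gpu_rate = gpu_items / gpu_time
    new_share = gpu_rate / (cpu_rate + gpu_rate)
    # Clamp so neither processor is ever starved of work entirely
    # (hypothetical bounds, chosen only for this sketch).
    return min(high, max(low, new_share))


# Previous round: an even split, where the GPU took 1 s and the CPU took 4 s.
share = rebalance(0.5, cpu_time=4.0, gpu_time=1.0, total_items=1000)
gpu_part, cpu_part = split(list(range(10)), share)
```

With these measurements the GPU is four times faster per item, so the next round assigns it four fifths of the batch.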
Measuring Semantic Orientation of Words using Temporal Difference Learning
http://doi.org/10.5626/JOK.2018.45.12.1287
Temporal-difference (TD) learning is a core algorithm of reinforcement learning, which employs Markov process models. In TD methods, rewards are always discounted by a discount factor, and states receive these discounted values as their rewards. In this paper, we estimate the semantic orientation of words in texts using TD-based methods and examine the effectiveness of the proposed methods by comparing them to existing feature selection methods (an indirect approach) and Bayes probabilities (a direct approach). TD-based estimation would be useful for social opinion mining tasks, since TD learning is inherently an online method. To show that our approach scales to very large data, the estimation method is also evaluated using asynchronous parallel processing.
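The discounted update can be sketched with the standard TD(0) rule, V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)). The mapping below is an assumption made for illustration and not necessarily the authors' formulation: each word is treated as a state, each document is one episode, and the document's polarity label (+1 or -1) is the reward given at the final word. The function name `td0_word_values` and the toy documents are hypothetical.

```python
from collections import defaultdict


def td0_word_values(docs, alpha=0.1, gamma=0.9, epochs=50):
    """Estimate a value per word with tabular TD(0).

    docs: list of (words, label) pairs, where label is +1.0 or -1.0.
    """
    values = defaultdict(float)
    for _ in range(epochs):
        for words, label in docs:
            for i, word in enumerate(words):
                last = i == len(words) - 1
                # Assumed reward scheme: only the final word is rewarded.
                reward = label if last else 0.0
                # The terminal state after the last word has value 0.
                next_value = 0.0 if last else values[words[i + 1]]
                td_error = reward + gamma * next_value - values[word]
                values[word] += alpha * td_error
    return dict(values)


toy_docs = [
    (["plot", "great"], 1.0),
    (["plot", "awful"], -1.0),
]
word_values = td0_word_values(toy_docs)
```

In this toy run "great" ends up with a clearly positive value, "awful" with a clearly negative one, and "plot", which appears in both documents, stays close to zero. Because each update touches only one word at a time, the rule can be applied as documents arrive, which is the online property the abstract refers to.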
Effective Parallel LiDAR Triangulated Irregular Network Construction Method Using Convex Boundary Triangle
Permata Nur Rizki, Sangyoon Oh
http://doi.org/10.5626/JOK.2018.45.8.761
A triangulated irregular network (TIN) model has been adopted in numerous digital mapping schemes to represent terrain surfaces. With the TIN model, we can produce a more flexible resolution and a more detailed surface than with a grid-based model. However, TIN processing is computationally intensive, and it requires an efficient approach in order to process massive Light Detection and Ranging (LiDAR) datasets. In this article, we present our parallelization method for LiDAR TIN construction using the MapReduce paradigm. We introduce a triangulation approach with a convex boundary triangle to reduce the number of vertices to visit, and thereby the overhead from data dependencies, in the parallel execution of the TIN construction. First, we divide a planar area vertically based on the information from the convex boundary region and allocate the initial LiDAR point cloud to parallel workers. Then, we apply our justification rules in each parallel process to prevent violations of the Delaunay property in the boundary triangles. Lastly, the constructed triangles from each of the workers are merged based on 〈key,value〉 intermediate metadata properties. To evaluate the effectiveness of our proposed method, we used Apache Spark. The empirical results of the experiment show that our method outperforms the conventional method by processing 16.2% fewer vertices.
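Two ingredients of the pipeline above can be sketched in plain Python: splitting the point cloud into vertical strips keyed by worker, and the standard empty-circumcircle test that defines the Delaunay property. This is only an illustration; the paper's convex boundary triangle and justification rules are not reproduced, no Spark API is used, and the function names `partition_vertical` and `in_circumcircle` are hypothetical.

```python
def partition_vertical(points, workers):
    """Sort points by x and cut them into equal-count vertical strips.

    Returns (key, points) pairs, mimicking MapReduce-style records.
    """
    ordered = sorted(points)              # tuples sort by x, then y
    size = -(-len(ordered) // workers)    # ceiling division
    return [(k, ordered[k * size:(k + 1) * size]) for k in range(workers)]


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc.

    Assumes a, b, c are given in counter-clockwise order. A triangle
    violates the Delaunay property if any other vertex passes this test.
    """
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    return det > 0


cloud = [(4, 1), (0, 0), (2, 3), (1, 1), (5, 2), (3, 0)]
strips = partition_vertical(cloud, workers=3)
violates = in_circumcircle((0, 0), (1, 0), (0, 1), (0.25, 0.25))
```

Triangles that lie entirely inside one strip can be checked by a single worker; triangles near strip borders are the ones whose circumcircles may contain points owned by a neighbouring worker, which is where the data dependencies mentioned in the abstract arise.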
Tile Partitioning-based HEVC Parallel Decoding Optimization for Asymmetric Multicore Processor
Yeongil Ryu, Hyun-Joon Roh, Eun-Seok Ryu
There is an emerging need for parallel UHD video processing, and the use of computing systems with asymmetric processors such as ARM big.LITTLE is actively increasing. Thus, a new parallel UHD video processing method that is optimized for asymmetric multicore systems is needed. This paper proposes a novel HEVC tile partitioning method for parallel processing based on an analysis of the computational power of asymmetric multicores. The proposed method analyzes (1) the computing power of the asymmetric multicores and (2) a regression model of computational complexity per video resolution. Finally, the method (3) determines the optimal HEVC tile resolution for each core and partitions and allocates the tiles to suitable cores. The proposed method minimizes the gap in decoding time between the fastest CPU core and the slowest CPU core. Experimental results with the official 4K UHD test sequences show an average 20% improvement in decoding speedup on the ARM asymmetric multicore system.
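The allocation step can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes decoding cost grows linearly with tile width, whereas the paper fits a regression model, and it assumes the frame width is a multiple of the CTU size. The function name `allocate_tile_columns` and the relative core speeds are hypothetical.

```python
def allocate_tile_columns(frame_width, core_speeds, ctu=64):
    """Give each core a tile width proportional to its relative speed.

    Widths are rounded to whole CTU columns, since HEVC tile boundaries
    are aligned to CTUs. Returns one width in pixels per core.
    """
    total_ctus = frame_width // ctu
    total_speed = sum(core_speeds)
    ideal = [total_ctus * s / total_speed for s in core_speeds]
    cols = [int(x) for x in ideal]
    # Hand out CTU columns lost to rounding, largest remainder first.
    leftover = total_ctus - sum(cols)
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - cols[i], reverse=True)
    for i in order[:leftover]:
        cols[i] += 1
    return [c * ctu for c in cols]


# Two fast cores and two slow cores, with assumed relative speeds 2:2:1:1.
widths = allocate_tile_columns(3840, [2, 2, 1, 1])
```

Here the fast cores each receive a tile twice as wide as the slow cores, so under the linear-cost assumption all four cores would finish decoding at about the same time.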
A Design of a Distributed Computing Problem Solving Environment for Dietary Data Analysis
Jieun Choi, Younsun Ahn, Yoonhee Kim
Recently, wellness has become an issue related to improvements in personal health and quality of life. Data that are accumulated daily, such as meal and exercise records, in addition to body measurement information such as body weight, BMI, and blood pressure, have been used to analyze the personal health of an individual. As a result, it has become possible to prevent potential disease and to analyze dietary or exercise patterns. In the field of food and nutrition, dietary data are analyzed to evaluate the health status of an individual. However, it is very difficult to process the large amount of dietary data. An analysis of dietary data includes four steps, and each step contains a series of iterative tasks that are executed over a long time. This paper proposes a problem solving environment that automates dietary data analysis, and the proposed framework increases the speed with which an experiment can be conducted.
Parallel Range Query Processing with R-tree on Multi-GPUs
Hongsu Ryu, Mincheol Kim, Wonik Choi
Ever since the R-tree was proposed to index multi-dimensional data, many efforts have been made to improve its query performance. One common approach is to parallelize query processing using multi-core architectures. To this end, a GPU-based R-tree has recently been proposed. However, even though a GPU-based R-tree can improve query performance, it is limited in its ability to handle large volumes of data because GPUs have limited physical memory. To address this problem, we propose MGR-tree (Multi-GPU R-tree), which can manage large volumes of data by distributing nodes across multiple GPUs. Our experiments show that MGR-tree is up to 9.1 times faster than a sequential search on a GPU and up to 1.6 times faster than a conventional GPU-based R-tree.
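The idea of spreading index entries over several devices and merging per-device results can be sketched on the CPU. This is not the MGR-tree itself: it uses a flat list of bounding rectangles instead of a tree, simulates each GPU as a Python list, and assumes a round-robin distribution, since the abstract does not state how nodes are divided. The names `intersects`, `distribute`, and `range_query` are hypothetical.

```python
def intersects(a, b):
    """Axis-aligned rectangles as (xmin, ymin, xmax, ymax); touching counts."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def distribute(rects, num_devices):
    """Assign rectangles to devices round-robin (assumed policy)."""
    return [rects[d::num_devices] for d in range(num_devices)]


def range_query(partitions, query):
    """Scan every partition and merge the rectangles that hit the query.

    On real hardware each partition would be scanned by its own GPU
    in parallel; here the loop runs sequentially.
    """
    hits = []
    for part in partitions:
        hits.extend(r for r in part if intersects(r, query))
    return hits


rects = [(0, 0, 1, 1), (2, 2, 3, 3), (5, 5, 6, 6), (0.5, 0.5, 2.5, 2.5)]
partitions = distribute(rects, num_devices=2)
result = range_query(partitions, query=(0, 0, 1.5, 1.5))
```

Because each device holds only a fraction of the entries, the total data volume can exceed the memory of any single device, which is the limitation the abstract sets out to address.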
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr