TY  - JOUR
T1  - Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce
AU  - Jang, Miyoung
AU  - Chang, Jae Woo
JO  - Journal of KIISE, JOK
PY  - 2015
DA  - 2015/1/14
DO  - 
KW  - distributed-data processing algorithm
KW  - MapReduce
KW  - k-NN join query-processing algorithm
KW  - grid index
AB  - MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm finds, for each point in one dataset, its k nearest neighbors in another dataset. Such algorithms are considered important in big data analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing big data. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the data neighboring a query cell and sends them to each MapReduce task, which reduces the data transmission and computation overhead. Our performance analysis shows that our algorithm outperforms the existing scheme by up to seven-fold in terms of query-processing time, while also achieving high query-result accuracy.