TY - JOUR
T1 - Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce
AU - Jang, Miyoung
AU - Chang, Jae Woo
JO - Journal of KIISE, JOK
PY - 2015
DA - 2015/1/14
DO -
KW - distributed-data processing algorithm
KW - MapReduce
KW - k-NN join query-processing algorithm
KW - grid index
AB - MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm finds, for each point of one dataset, its k nearest neighbors in another dataset. This join is considered important in big data analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing big data. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the data neighboring a query cell and sends them to each MapReduce task, which reduces the data-transmission and computation overhead. Our performance analysis shows that our algorithm outperforms the existing scheme by up to sevenfold in terms of query-processing time, while also achieving a high level of query-result accuracy.
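
The grid-based idea in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' implementation: the abstract does not specify the grid layout, the neighborhood size, or the MapReduce job structure, so the 2-D points, the fixed `cell_size`, the 3x3 cell neighborhood, and all function names (`cell_of`, `map_to_cells`, `reduce_cell`, `grid_knn_join`) are assumptions made for illustration. The "map" step assigns points to grid cells, and the "reduce" step for each query cell receives only the points from that cell and its adjacent cells.

```python
import heapq
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Return the (column, row) grid cell containing a 2-D point."""
    return (math.floor(point[0] / cell_size), math.floor(point[1] / cell_size))


def map_to_cells(points, cell_size):
    """'Map' step (hypothetical): group points by the grid cell they fall in."""
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p, cell_size)].append(p)
    return cells


def reduce_cell(cell, queries, s_cells, k):
    """'Reduce' step (hypothetical): join one query cell against nearby data.

    Only points of S in the query cell and its 8 adjacent cells are
    considered (an assumed neighborhood), so the result is approximate
    and may hold fewer than k neighbors when those cells are sparse.
    """
    cx, cy = cell
    candidates = [
        s
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for s in s_cells.get((cx + dx, cy + dy), [])
    ]
    return {
        q: heapq.nsmallest(k, candidates, key=lambda s: math.dist(q, s))
        for q in queries
    }


def grid_knn_join(r_points, s_points, k, cell_size):
    """For each point in R, find up to k nearby points in S via the grid."""
    r_cells = map_to_cells(r_points, cell_size)
    s_cells = map_to_cells(s_points, cell_size)
    result = {}
    for cell, queries in r_cells.items():
        result.update(reduce_cell(cell, queries, s_cells, k))
    return result
```

Points are passed as tuples, for example `grid_knn_join([(0.5, 0.5)], s_points, k=2, cell_size=1.0)`. Restricting each task to neighboring cells is what keeps the transferred data small in this sketch; the trade-off is that a true nearest neighbor lying outside the 3x3 neighborhood is missed, which is one plausible reading of why the abstract reports accuracy alongside speed.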