TY - JOUR
T1 - Distributed Processing of Deep Learning Inference Models for Data Stream Classification
AU - Moon, Hyojong
AU - Son, Siwoon
AU - Moon, Yang-Sae
JO - Journal of KIISE, JOK
PY - 2021
DA - 2021/1/14
DO - 10.5626/JOK.2021.48.10.1154
KW - data stream
KW - deep learning inference
KW - stacking
KW - distributed processing
KW - Apache Storm
AB - The increasing generation of data streams has led to the growing use of deep learning. To classify data streams with deep learning, the model must be executed in real time through serving. Unfortunately, the serving model incurs long latency due to gRPC or HTTP communication, and if the serving model uses a stacking ensemble method with high complexity, the latency becomes even longer. To solve this long-latency problem, we proposed distributed processing solutions for data stream classification using Apache Storm. First, we proposed a real-time distributed inference method based on Apache Storm to reduce the long latency of the existing serving method. The present study's experimental results showed that the proposed distributed inference method reduces latency by up to 11 times compared to the existing serving method. Second, to reduce the long latency of the stacking-based inference model for detecting malicious URLs, we proposed four distributed processing techniques for classifying URL streams in real time: Independent Stacking, Sequential Stacking, Semi-Sequential Stacking, and Stepwise-Independent Stacking. Our experimental results showed that Stepwise-Independent Stacking, which combines the characteristics of independent execution and sequential processing, is the best technique, classifying URL streams with the shortest latency.
ER -