Search : [ keyword: Convolutional Neural Network ] (25)

The Method Using Reduced Classification Models for Distributed Processing of CNN Models in Multiple Edge Devices

Junyoung Kim, Jongho Jeon, Minkwan Kee, Gi-Ho Park

http://doi.org/10.5626/JOK.2020.47.8.787

Demand for edge computing, which processes data at the edge of the network where it is collected, has been increasing because of problems such as the network load caused by transferring large amounts of data to a cloud server. However, it is difficult for edge devices to run the deep learning applications used in cloud servers because most devices at the edge of the network have limited performance. To overcome these problems, this paper proposes a distributed processing method in which multiple edge devices jointly perform inference using reduced classification models. Each reduced classification model has compressed weights and performs inference for only part of the total set of classification labels. The experimental results confirmed that the accuracy of the proposed distributed processing method is similar to that of the original model, even though the reduced classification models have far fewer parameters than the original model.
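
A minimal sketch of how partial predictions from reduced models might be merged across devices. The merging rule below (take the label with the highest confidence over all devices) is an assumption for illustration; the paper's actual combination scheme may differ.

```python
# Sketch: each edge device runs a reduced model that covers only a subset of
# the global labels; the final label is taken from the highest confidence
# across devices (assumed merging rule, not necessarily the paper's).
import numpy as np

def merge_partial_predictions(partial_results):
    """partial_results: list of (label_subset, scores) pairs, one per device.
    label_subset: global label ids handled by that device.
    scores: np.ndarray of confidences aligned with label_subset."""
    best_label, best_score = None, -np.inf
    for labels, scores in partial_results:
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_label, best_score = labels[i], float(scores[i])
    return best_label, best_score

# Example: two devices, each covering half of a 4-class problem.
device_a = ([0, 1], np.array([0.20, 0.55]))
device_b = ([2, 3], np.array([0.90, 0.05]))
print(merge_partial_predictions([device_a, device_b]))  # -> (2, 0.9)
```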

The Cut Transition Detection Model Using the SSD Method

Sungmin Park, Ui Nyoung Yoon, Geun-Sik Jo

http://doi.org/10.5626/JOK.2020.47.7.655

Shot boundary detection is constantly being studied as an essential technique for analyzing video content. In this paper, we propose an end-to-end learning model using the SSD (Single Shot Multibox Detector) method to resolve the shortcomings of existing research and to identify the exact location of cut transitions. We applied the Multi-Scale Feature Map and Default Box concepts of the SSD to predict multiple cut transitions, and combined the model with Image Concatenation, an image comparison method, to reinforce the feature information of cut transitions. Compared with recent research, the proposed model achieved 88.7% and 98.0% accuracy on the re-labeled ClipShots and TRECVID 2007 datasets, respectively, and detected ranges closer to the correct answer than the existing deep learning model.
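
A minimal sketch of the image-concatenation idea, assuming that adjacent frames are stacked along the channel axis before being fed to the SSD-style detector; the frame size and stacking details are illustrative assumptions.

```python
# Sketch: concatenate adjacent frames along the channel axis so that the
# appearance change around a cut transition becomes part of the input features.
import numpy as np

def concat_adjacent_frames(frames):
    """frames: list of H x W x 3 arrays -> list of H x W x 6 arrays,
    one per adjacent frame pair."""
    return [np.concatenate([frames[i], frames[i + 1]], axis=-1)
            for i in range(len(frames) - 1)]

frames = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(5)]
pairs = concat_adjacent_frames(frames)
print(len(pairs), pairs[0].shape)  # 4 (64, 64, 6)
```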

A New Light-Weight and Efficient Convolutional Neural Network Using Fast Discrete Cosine Transform

Joonhyun Jeong, Sung-Ho Bae

http://doi.org/10.5626/JOK.2020.47.3.276

Recently proposed light-weight neural networks maintain high accuracy to some degree with a small number of weight parameters and low computational cost. Nevertheless, existing convolutional neural networks commonly have many weight parameters in the pointwise convolution (1x1 convolution), which also induces a high computational cost. In this paper, we propose a new pointwise convolution operation based on the one-dimensional Fast Discrete Cosine Transform (FDCT), which dramatically reduces the number of learnable weight parameters and speeds up computation. We propose light-weight convolutional neural networks in two specific aspects: 1) application of the DCT at the block-structure level and 2) application of the DCT at the hierarchy level of CNN models. Experimental results show that the proposed method achieved classification accuracy similar to that of the MobileNet v1 model, reducing the number of learnable weight parameters by 79.1% and the number of FLOPs by 48.3% while increasing top-1 accuracy by 0.8%.
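
A minimal sketch of the core idea of replacing a learnable 1x1 (pointwise) convolution with a fixed 1-D DCT across the channel dimension. This is a simplified illustration: the paper's block- and hierarchy-level designs and its fast DCT implementation are not reproduced here.

```python
# Sketch: mix channels with a fixed orthonormal DCT-II basis instead of
# learned 1x1 convolution weights.
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] *= 1.0 / np.sqrt(2)
    return m * np.sqrt(2.0 / n)

def dct_pointwise(x):
    """x: feature map of shape (C, H, W). Applies the DCT across channels."""
    c, h, w = x.shape
    return (dct_matrix(c) @ x.reshape(c, -1)).reshape(c, h, w)

x = np.random.randn(8, 4, 4).astype(np.float32)
print(dct_pointwise(x).shape)  # (8, 4, 4), no learnable channel-mixing weights
```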

A Visual Analytics Technique for Analyzing the Cause and Influence of Traffic Congestion

Mingyu Pi, Hanbyul Yeon, Hyesook Son, Yun Jang

http://doi.org/10.5626/JOK.2020.47.2.195

In this paper, we present a technique to analyze the causes of traffic congestion based on traffic flow theory. We extracted vehicle flows from traffic data such as GPS trajectories and vehicle detector data, and identified changes in vehicle flow using entropy from information theory. We then extracted cumulative vehicle count curves (N-curves) that quantify the vehicle flows in the congested area. According to traffic flow theory, distinct N-curve patterns can be observed depending on the congestion type. We built a convolutional neural network classifier that classifies N-curves into four congestion patterns. Analyzing the cause and influence of congestion is difficult and requires considerable experience and knowledge, so we present a visual analytics system that can efficiently perform this series of analysis steps. Through case studies, we evaluated how our system analyzes the causes of traffic congestion.
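
A minimal sketch of how a cumulative vehicle count curve (N-curve) is obtained from per-interval detector counts; the input format is an assumption, and the paper's entropy-based flow-change detection and CNN classifier are not shown.

```python
# Sketch: the N-curve is simply the running total of vehicles observed over time.
import numpy as np

def n_curve(counts_per_interval):
    """counts_per_interval: vehicles observed in each time interval.
    Returns the cumulative count N(t) used to characterize congestion."""
    return np.cumsum(np.asarray(counts_per_interval, dtype=float))

counts = [12, 15, 9, 4, 3, 8, 14]   # hypothetical detector readings
print(n_curve(counts))              # [12. 27. 36. 40. 43. 51. 65.]
```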

Malware Variants Detection based on Dhash

Hongbi Kim, Hyunseok Shin, Junho Hwang, Taejin Lee

http://doi.org/10.5626/JOK.2019.46.11.1207

Malicious code is becoming more intelligent due to the popularization of malware generation tools and obfuscation techniques, and existing detection techniques fail to catch all of it. Given that many newly emerging malicious codes are variants of existing ones and therefore have binary data similar to that of the original, this paper presents a Dhash-based malware detection technique that converts the binary data of a file into an image and compares image hashes, along with a 10-gram algorithm that reduces the long analysis time caused by the full pairwise comparison of the Dhash algorithm. A comparison with ssdeep, a technique known to perform well in variant malware detection, shows that the Dhash-based approach can detect variants that ssdeep misses, and detection-speed experiments demonstrate the advantage of the proposed algorithm over the existing Dhash algorithm. Future work will continue to develop a variety of malware analysis technologies linked to other LSH-based detection techniques.
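
A minimal sketch of a difference hash (Dhash) computed over a file's bytes rendered as a fixed-width grayscale "image". The byte-to-image layout and block averaging below are assumptions for illustration, and the paper's 10-gram speed-up is not shown.

```python
# Sketch: render bytes as a 2-D array, downsample it, and hash the horizontal
# brightness differences; similar binaries yield hashes with small Hamming distance.
import numpy as np

def bytes_to_image(data, width=64):
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = len(buf) // width
    return buf[:rows * width].reshape(rows, width)

def dhash(img, hash_size=8):
    """Block-average to hash_size x (hash_size + 1), then compare horizontally
    adjacent cells to form a 64-bit hash."""
    h, w = img.shape
    ys = np.linspace(0, h, hash_size + 1, dtype=int)
    xs = np.linspace(0, w, hash_size + 2, dtype=int)
    small = np.array([[img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
                       for j in range(hash_size + 1)]
                      for i in range(hash_size)])
    bits = (small[:, 1:] > small[:, :-1]).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def hamming(a, b):
    return bin(a ^ b).count("1")  # small distance -> likely a variant

sample = bytes(np.random.randint(0, 256, 4096, dtype=np.uint8))
print(hex(dhash(bytes_to_image(sample))))
```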

Semi-automatic Expansion for a Chatting Corpus Based on a K-means Clustering Method And Similarity Measure

Jaehyun An, Youngjoong Ko

http://doi.org/10.5626/JOK.2019.46.5.440

In this paper, we propose a semi-automatic expansion method that expands a chatting corpus using a large amount of utterance data from movie subtitles and drama scripts. To expand the chatting corpus, the proposed system uses a previously constructed chatting corpus and a similarity measure. If the similarity calculated between an utterance in the previously constructed chatting corpus and an input utterance is greater than a threshold set in the experiments, the input utterance is selected as a new chatting utterance, i.e., as a correct chatting pair. We used morpheme-unit word embeddings and a convolutional neural network to efficiently calculate the similarity between utterance embeddings. To improve the speed of the semi-automatic expansion process, we reduced the amount of computation by clustering the chatting corpus with the K-means algorithm. Experimental results showed that the precision, recall, and F1 score of the proposed system were 61.28%, 53.19%, and 56.94%, respectively, which are 5.16%p, 6.09%p, and 5.73%p higher than those of the baseline system. The proposed system was also about a hundred times faster than the baseline.
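
A minimal sketch of the clustering-based speed-up, assuming utterances are already embedded as vectors: a candidate is compared only against the corpus utterances in its nearest K-means cluster and accepted if the best cosine similarity exceeds a threshold. The cluster count and threshold below are illustrative, not the paper's settings.

```python
# Sketch: restrict the similarity search to one K-means cluster instead of
# comparing a candidate utterance against the whole corpus.
import numpy as np
from sklearn.cluster import KMeans

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

corpus = np.random.randn(1000, 128)            # embeddings of the existing chat corpus
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(corpus)

def is_chatting_utterance(vec, threshold=0.8):
    cluster = km.predict(vec[None, :])[0]
    members = corpus[km.labels_ == cluster]     # search only this cluster
    best = max(cosine(vec, m) for m in members)
    return best >= threshold

print(is_chatting_utterance(np.random.randn(128)))
```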

Elastic Multiple Parametric Exponential Linear Units for Convolutional Neural Networks

Daeho Kim, Jaeil Kim

http://doi.org/10.5626/JOK.2019.46.5.469

The activation function plays a major role in determining the depth and non-linearity of neural networks. Since the introduction of Rectified Linear Units (ReLU) for deep neural networks, many variants have been proposed. For example, Exponential Linear Units (ELU) lead to faster learning by pushing the mean of the activations closer to zero, and Elastic Rectified Linear Units (EReLU) change the slope randomly for better model generalization. In this paper, we propose Elastic Multiple Parametric Exponential Linear Units (EMPELU) as a generalized form of ELU and EReLU. EMPELU randomly changes the slope of the positive part of the function within a moderate range during training, while the negative part can take the form of various activation functions through parameter learning. EMPELU improved the accuracy and generalization performance of convolutional neural networks on the object classification task (CIFAR-10/100) more than well-known activation functions.
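
A sketch of the EMPELU idea in PyTorch. The functional form, the parameterization of the negative part, and the random-slope range below are assumptions based on how EReLU and ELU behave, not the paper's exact definition.

```python
# Sketch: EReLU-style random slope on the positive side during training,
# ELU-style parametric exponential on the negative side (assumed form).
import torch
import torch.nn as nn

class EMPELU(nn.Module):
    def __init__(self, alpha=1.0, beta=1.0, epsilon=0.1):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))  # learnable negative-side scale
        self.beta = nn.Parameter(torch.tensor(beta))    # learnable negative-side slope
        self.epsilon = epsilon                          # width of the random positive slope

    def forward(self, x):
        if self.training:
            # random slope around 1 for the positive part (EReLU-like)
            k = torch.empty_like(x).uniform_(1 - self.epsilon, 1 + self.epsilon)
        else:
            k = 1.0
        pos = k * torch.clamp(x, min=0.0)
        # parametric exponential for the negative part (ELU-like)
        neg = self.alpha * (torch.exp(torch.clamp(x, max=0.0) * self.beta) - 1.0)
        return pos + neg

act = EMPELU()
print(act(torch.randn(4)))
```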

A CNN-based Column Prediction Model for Generating SQL Queries using Natural Language

Yoonki Jeong, Dongmin Kim, Jongwuk Lee

http://doi.org/10.5626/JOK.2019.46.2.202

To retrieve massive data using a relational database management system (RDBMS), it is important to understand table schemas and SQL grammar. To address this issue, many recent studies have attempted to generate an SQL query from a natural language question. However, existing works mostly struggle to predict the columns in the WHERE clause, and their accuracy drops greatly when multiple columns must be predicted. In this paper, we propose a convolutional neural network model with a column attention mechanism that effectively extracts a latent representation of the input question, which helps the model predict columns. Experiments show that our model outperforms the accuracy of the existing model (SQLNet) by 6%.
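
A minimal sketch of column attention in a generic form: question token representations are weighted by their relevance to a column name embedding, producing a column-conditioned summary of the question. The paper's CNN encoder and exact scoring layers are not shown.

```python
# Sketch: attend over question tokens with respect to one column embedding.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def column_attention(question_tokens, column_embedding):
    """question_tokens: (T, d) token representations of the NL question.
    column_embedding: (d,) representation of one column name.
    Returns a column-conditioned summary vector of the question."""
    scores = softmax(question_tokens @ column_embedding)  # (T,)
    return scores @ question_tokens                       # (d,)

q = np.random.randn(12, 64)    # encoded question tokens
col = np.random.randn(64)      # encoded column name
print(column_attention(q, col).shape)  # (64,)
```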

Effect Scene Detection using Multimodal Deep Learning Models

Jeongseon Lim, Mikyung Han, Hyunjin Yoon

http://doi.org/10.5626/JOK.2018.45.12.1250

A conventional movie can be converted into a 4D movie by identifying its effect scenes. To automate this process, we propose in this paper a multimodal deep learning model that detects effect scenes using both the visual and audio features of a movie. We classified effect/non-effect scenes using an audio-based Convolutional Recurrent Neural Network (CRNN) model and a video-based Long Short-Term Memory (LSTM) and Multilayer Perceptron (MLP) model, and also implemented feature-level fusion. In addition, based on our observation that effects typically occur during non-dialog scenes, we further detected non-dialog scenes using an audio-based Convolutional Neural Network (CNN) model. The prediction scores of the audio-visual effect scene classifier and the audio-based non-dialog classifier were then combined. Finally, we detected sequences of effect scenes across the entire movie using the prediction scores of the input windows. Experiments using real-world 4D movies demonstrate that the proposed multimodal deep learning model outperforms unimodal models in terms of effect scene detection accuracy.
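
A minimal sketch of the late score combination, assuming per-window scores in [0, 1] from the two classifiers; the weighting and thresholded decision rule are illustrative assumptions, not the paper's exact values.

```python
# Sketch: combine effect-scene and non-dialog scores per window, then keep
# windows whose combined score passes a threshold.
import numpy as np

def combine_scores(effect_scores, non_dialog_scores, w=0.5):
    """Weighted average of the audio-visual effect classifier scores and the
    audio-based non-dialog classifier scores."""
    return w * np.asarray(effect_scores) + (1 - w) * np.asarray(non_dialog_scores)

def detect_effect_windows(combined, threshold=0.5):
    return [i for i, s in enumerate(combined) if s >= threshold]

effect = [0.9, 0.2, 0.7, 0.1]
non_dialog = [0.8, 0.3, 0.9, 0.2]
print(detect_effect_windows(combine_scores(effect, non_dialog)))  # [0, 2]
```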

Object Recognition in Low Resolution Images using a Convolutional Neural Network and an Image Enhancement Network

Injae Choi, Jeongin Seo, Hyeyoung Park

http://doi.org/10.5626/JOK.2018.45.8.831

Recently, the development of deep learning technologies such as convolutional neural networks has greatly improved the performance of object recognition in images. However, object recognition still faces many challenges due to the large variations in images and the diversity of object categories to be recognized. In particular, studies on object recognition in low-resolution images are still at an early stage and have not shown satisfactory performance. In this paper, we propose an image enhancement neural network to improve object recognition performance on low-resolution images. We also use the enhanced images to train a convolutional neural network-based object recognition model, so that recognition remains robust to resolution changes. To verify the efficiency of the proposed method, we conducted experiments on object recognition in a low-resolution environment using the CIFAR-10 and CIFAR-100 databases. We confirmed that the proposed method greatly improves recognition performance on low-resolution images while maintaining stable performance on images at the original resolution.
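
A sketch of the two-stage pipeline with hypothetical small networks: a low-resolution image is first passed through an enhancement network, and the enhanced output is fed to a CNN recognizer. The architectures below are placeholders, not the paper's models.

```python
# Sketch: enhance first, then recognize; the recognizer is trained on the
# enhanced images so it stays robust to resolution changes.
import torch
import torch.nn as nn

enhancer = nn.Sequential(              # maps a low-resolution image to an enhanced one
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
recognizer = nn.Sequential(            # CNN classifier operating on enhanced images
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)

low_res = torch.randn(8, 3, 16, 16)    # batch of low-resolution inputs
logits = recognizer(enhancer(low_res))
print(logits.shape)                    # torch.Size([8, 10])
```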

