Vol. 41, No. 9,
Sep. 2014
Para-virtualized Library for Bare-metal Network Performance in Virtualized Environment
Dongwoo Lee, Youngjoong Cho, Young Ik Eom
Virtualization is no longer an emerging research area, and its applications are now easy to find around us. Nevertheless, I/O workloads are still rarely deployed in virtual environments because they suffer from unacceptable performance degradation due to virtualization latency. Many previous papers identified that virtual I/O overhead is mainly caused by exits and a redundant I/O stack, and proposed several techniques to reduce them. However, these techniques still have limitations. In this paper, we introduce a novel I/O virtualization framework that improves I/O performance by exploiting the multicore architecture. We applied our framework to the virtual network; it improves TCP throughput by up to 169% and decreases UDP latency by up to 38% on a network with a 10 Gbps NIC.
Parallel Rabin Fingerprinting on GPGPU for Efficient Data Deduplication
Jeonghyeon Ma, Sejin Park, Chanik Park
Rabin fingerprinting, which is used for chunking, requires the largest amount of computation time in data deduplication. In this paper, we therefore propose parallel Rabin fingerprinting on GPGPU for efficient data deduplication. To parallelize Rabin fingerprinting efficiently, we consider four issues. First, when dividing the input data stream into data sections, we handle the data located near the boundaries between data sections so that the Rabin fingerprint can be calculated continuously. Second, we exploit the characteristics of Rabin fingerprinting for efficient operation. Third, we consider the chunk boundaries, which can change relative to sequential Rabin fingerprinting when parallel Rabin fingerprinting is adopted. Finally, we optimize GPGPU memory access. Parallel Rabin fingerprinting on GPGPU performs 16 times better than sequential Rabin fingerprinting on CPU and 5.3 times better than parallel Rabin fingerprinting on CPU. This throughput improvement in Rabin fingerprinting can improve the overall performance of data deduplication.
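The first issue above (data near section boundaries) can be illustrated with a small sketch. It is not the authors' GPGPU implementation: it swaps in a simple polynomial rolling hash modulo a prime in place of true Rabin fingerprinting over GF(2), and the names `rolling_cut_points`, `sectioned_cut_points`, and all parameter values are assumptions made for illustration. The point it shows is that if each section also reads the last `window - 1` bytes of the previous section, the cut points found section by section match the sequential ones, so the sections could be processed independently.

```python
def rolling_cut_points(data, start=0, window=16, mask=0x3F,
                       base=257, mod=(1 << 31) - 1):
    """Return stream offsets where the hash of the preceding `window` bytes
    matches the cut pattern. `start` is the offset of data[0] in the stream."""
    cuts = []
    if len(data) < window:
        return cuts
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window
    h = 0
    for b in data[:window]:
        h = (h * base + b) % mod
    for i in range(window, len(data) + 1):
        if (h & mask) == mask:  # low bits all set: declare a chunk boundary
            cuts.append(start + i)
        if i < len(data):
            # Slide the window: drop data[i - window], append data[i].
            h = ((h - data[i - window] * top) * base + data[i]) % mod
    return cuts


def sectioned_cut_points(data, section=1024, window=16):
    """Find the same cut points section by section. Each section is extended
    backwards by window - 1 bytes so windows straddling a boundary are seen."""
    cuts = []
    for s in range(0, len(data), section):
        lo = max(0, s - (window - 1))
        cuts.extend(rolling_cut_points(data[lo:s + section], start=lo,
                                       window=window))
    return cuts
```

In this sketch the sections are scanned in a plain loop; in a parallel setting each iteration of that loop would be an independent unit of work.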
A Malicious Traffic Detection Method Using X-means Clustering
Myoungji Han, Jihyuk Lim, Junyong Choi, Hyunjoon Kim, Jungjoo Seo, Cheol Yu, Sung-Ryul Kim, Kunsoo Park
Malicious traffic, such as DDoS attacks and botnet communications, refers to traffic that is generated for the purpose of disturbing internet networks or harming certain networks, servers, or hosts. As malicious traffic has been constantly evolving in terms of both quality and quantity, much research has been devoted to fighting it. In this paper, we propose an effective malicious traffic detection method that exploits the X-means clustering algorithm. We also suggest how to analyze the statistical characteristics of malicious traffic and how to define the metrics used for clustering. Finally, we verify the effectiveness of our method through experiments on two released traffic data sets.
Establishing of Requirement and Design Development Process for Assuring Quality of Automotive Semiconductor
With customers' growing demand for high technology and tightening regulations on automotive fuel efficiency and safety, the application of E/E systems has been expanding steadily in the automotive industry. Thus, demand for the core elements of E/E systems, namely microcontrollers, analog ICs, and ASICs, has been growing. However, a development process for automotive semiconductors has not been clearly established domestically. This research aims to present a guide and an example for constructing a requirement and design development process for semiconductors based on ISO/TS 16949, which specifies requirements for quality management systems; CMMI, which has been proven in various areas; and ISO 26262, a widely used methodology for functional safety. The result of this research is expected to serve as guidance for constructing semiconductor development processes.
Taking a Jump Motion Picture Automatically by using Accelerometer of Smart Phones
This paper proposes algorithms that detect a jump motion and automatically take a picture when the jump reaches its peak. Based on the algorithms, we build a jump-shot system using accelerometer-equipped smart phones. Since the jump motion may vary depending on one's physical condition, gender, and age, it is critical to identify common features that are independent of such differences. The detection algorithm also needs to work in real time because of the short duration of a jump. We propose two different algorithms that take these requirements into account and implement the system as a smart phone application. Through a series of experiments, we show that the system successfully detects the jump motion and takes a picture when the jump reaches its peak.
Syllable-based Probabilistic Models for Korean Morphological Analysis
This paper proposes three probabilistic models for syllable-based Korean morphological analysis and presents their performance. Probabilities for the models are acquired from a POS-tagged corpus. The results of 10-fold cross-validation experiments show that a 98.3% answer inclusion rate is achieved when the models are trained with the Sejong POS-tagged corpus of 10 million eojeols. In our models, POS tags are assigned to each syllable before spelling recovery and morpheme generation, which enables more efficient morphological analysis than previous probabilistic models, in which spelling recovery is performed at the first stage. This efficiency speeds up morphological analysis. Experiments show that morphological analysis is performed at a rate of 147K eojeols per second, which is almost 174 times faster than the previous probabilistic models for Korean morphology.
Semantic Dependency Link Topic Model for Biomedical Acronym Disambiguation
Seonho Kim, Juntae Yoon, Jungyun Seo
Many important terms in biomedical text are expressed as abbreviations or acronyms. We propose a new semantic link topic model, based on the concepts of topic and dependency link, to disambiguate biomedical abbreviations and to cluster long-form variants of abbreviations that refer to the same senses. This model is a generative model inspired by the latent Dirichlet allocation (LDA) topic model, in which each document is viewed as a mixture of topics, with each topic characterized by a distribution over words. Thus, the words of a document are generated from the hidden topic structure of the document, and the topic structure is inferred from the observable word sequences of document collections. In this study, we allow two distinct modes of word generation in order to incorporate semantic dependencies between words, particularly between expansions (long forms) of abbreviations and the words that co-occur with them in a sentence. Besides topic information, the semantic dependency between words is defined as a link, and a new random parameter for link presence is assigned to each word. As a result, the most probable expansions of the abbreviations in a given abstract are determined by the word-topic distribution, document-topic distribution, and word-link distribution estimated from the document collection through the semantic dependency link topic model. The abstracts retrieved from the MEDLINE Entrez interface by queries relating to 22 abbreviations and their 186 expansions were used as the data set. The link topic model correctly predicted the expansions of abbreviations with an accuracy of 98.30%.
Face Recognition Based on Facial Landmark Feature Descriptor in Unconstrained Environments
Daeok Kim, Jongkwang Hong, Hyeran Byun
This paper proposes a scalable face recognition method for unconstrained face databases and presents a simple experimental result. Existing face recognition research has usually focused on improving the recognition rate in a constrained environment where illumination, face alignment, facial expression, and background are controlled, so it cannot be applied to unconstrained face databases. The proposed system is a face feature extraction algorithm for unconstrained face recognition. First, we extract the areas that represent the important features (landmarks) of the face, such as the eyes, nose, and mouth. Each landmark is represented by a high-dimensional LBP (Local Binary Pattern) histogram feature vector. The multi-scale LBP histogram vector corresponding to a single landmark becomes a low-dimensional face feature vector through a feature reduction process consisting of PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis). We use the rank acquisition method and the precision at k (p@k) performance verification method to verify the face recognition performance of the low-dimensional face feature produced by the proposed algorithm. To generate the experimental results, we used the FERET, LFW, and PubFig83 databases. The face recognition system using the proposed algorithm showed better classification performance than the existing methods.
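Two of the building blocks named above, the LBP histogram of a landmark patch and the precision at k measure, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the basic 8-neighbor LBP at a single scale, omits the PCA and LDA reduction steps, and the function names `lbp_histogram` and `precision_at_k` are assumptions.

```python
def lbp_histogram(patch):
    """256-bin histogram of basic 8-neighbor LBP codes.
    `patch` is a 2D list of gray levels; border pixels are skipped."""
    # Neighbors visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(patch) - 1):
        for c in range(1, len(patch[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbor at least as bright as the center sets its bit.
                if patch[r + dr][c + dc] >= patch[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    return hist


def precision_at_k(ranked_labels, query_label, k):
    """Fraction of the top-k retrieved identities that match the query."""
    top = ranked_labels[:k]
    return sum(1 for label in top if label == query_label) / k
```

A multi-scale variant would compute such histograms at several neighborhood radii and concatenate them before feature reduction.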
Distributed Table Join for Scalable RDFS Reasoning on Cloud Computing Environment
Wan-Gon Lee, Je-Min Kim, Young-Tack Park
A knowledge service system needs to infer new knowledge from given knowledge in order to provide effective service. Most knowledge service systems express their knowledge in terms of an ontology. The volume of real-world knowledge is becoming massive, so effective techniques for massive ontology data are drawing attention. This paper presents methods for RDFS-level inference over massive ontology data in a cloud computing environment and evaluates their capability. The RDFS inference proposed in this paper covers two methods: one applies MapReduce based on RDFS meta tables, and the other uses only cloud computing memory, without MapReduce, in a distributed file computing environment. Accordingly, this paper explains the inference system structure of each technique, the meta table setup according to the RDFS inference rules, and the algorithm of the inference strategy. To evaluate the proposed methods, we perform experiments with the LUBM data set, a standard benchmark for evaluating ontology inference and search speed. For LUBM6000, the RDFS inference technique based on meta tables required 13.75 minutes (inferring 1,042 triples per second) to complete the total inference, whereas the method using cloud computing memory needed 7.24 minutes (inferring 1,979 triples per second), making it about 1.9 times faster.
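As a rough illustration of what RDFS inference by table join means, the sketch below applies two standard RDFS entailment rules, rdfs9 (type inheritance through `rdfs:subClassOf`) and rdfs11 (transitivity of `rdfs:subClassOf`), by joining triples against an in-memory subclass table until no new triple appears. It runs on a single machine and does not reproduce the paper's meta table layout, MapReduce jobs, or distributed memory scheme; the function name `rdfs_closure` and the sample class names are assumptions.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rdfs9 and rdfs11 until a fixed point is reached."""
    triples = set(triples)
    while True:
        # Build the join table: class -> set of direct superclasses.
        supers = {}
        for s, p, o in triples:
            if p == SUBCLASS:
                supers.setdefault(s, set()).add(o)
        inferred = set()
        for s, p, o in triples:
            if p in (TYPE, SUBCLASS):
                # Join on the object: (s p C) and (C subClassOf D) give (s p D).
                for d in supers.get(o, ()):
                    inferred.add((s, p, d))
        if inferred <= triples:
            return triples
        triples |= inferred
```

For example, from `GradStudent subClassOf Student`, `Student subClassOf Person`, and `alice type GradStudent`, the closure also contains `alice type Person`.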
RDFS Rule based Parallel Reasoning Scheme for Large-Scale Streaming Sensor Data
Recently, large-scale streaming sensor data have emerged due to the explosive spread of smart phones, the diffusion of IoT and cloud computing technology, and the widespread adoption of IoT devices. In addition, research on combining such data with semantic web technology is being actively pursued, driven by increasing requirements for creating new value from data through data sharing and mash-ups in large-scale environments. However, the scale and streaming nature of the data pose major challenges for inference, which creates new knowledge. For this reason, we propose an RDFS rule based parallel reasoning scheme that processes large-scale streaming sensor data with semantic web technology. In the proposed scheme, we run each job of the Rete network algorithm, an existing rule inference algorithm, in parallel and share data using HBase, a Hadoop database, as common storage. We implement our system and evaluate its performance using AWS data from the weather center as large-scale streaming sensor data.
A Labeling Methods for Keyword Search over Large XML Documents
As XML documents become larger and more complex, a keyword-based search method that does not require structural information is needed to search them. To use this method, all keywords expressed as nodes in the XML document must be labeled for indexing, and the structural information must also be well represented. However, existing labeling methods either index only very simple information about XML documents or represent structural information in a way that copes poorly with growth in document size. As XML documents grow, this causes either poor keyword search performance or an exponential increase in space usage. In this paper, we present the Repetitive Prime Labeling Scheme (RPLS) to address the problems of existing labeling methods for keyword-based search of large XML documents. This method is based on the existing prime number labeling method and allows a parent's prime number to be reused at a lower level, so that the number of prime numbers generated can be reduced. We then present experimental results comparing our method with the existing methods.
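For context, the baseline that RPLS builds on can be sketched briefly. In a basic prime number labeling, every node receives its own prime, a node's label is its parent's label multiplied by that prime, and an ancestor test reduces to a divisibility check. The sketch below shows only this baseline, not the repetition rule of RPLS, which the abstract does not specify; the function names and the sample tree are assumptions.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def label_tree(tree, root):
    """`tree` maps a node to its list of children. Each node gets a fresh
    prime; its label is that prime times the parent's label."""
    gen = primes()
    labels = {root: next(gen)}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            labels[child] = labels[node] * next(gen)
            stack.append(child)
    return labels


def is_ancestor(labels, a, b):
    """a is a proper ancestor of b iff a's label divides b's label."""
    return a != b and labels[b] % labels[a] == 0
```

Because every node consumes a new prime, labels grow quickly with document size, which is the cost that reusing primes is meant to reduce.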
Detection of Privacy Information Leakage for Android Applications by Analyzing API Inter-Dependency and the Shortest Distance
In general, benign apps transmit private information to external parties in order to provide services to users, just as malicious apps do. In other words, the behavior of benign apps is similar to that of malicious apps, so a benign app can easily be manipulated for malicious purposes. Therefore, benign apps as well as malicious apps should notify users of the possibility of privacy information leakage before installation to prevent potential malicious behavior. In this paper, we propose a method to detect leakage of privacy information in Android apps by analyzing API inter-dependency and the shortest distance. We also present LeakDroid, which detects leakage of privacy information on Android using this method. Unlike dynamic approaches, LeakDroid analyzes Android apps on the market site. To verify the privacy information leakage detection of LeakDroid, we experimented with 250 well-known malicious apps and 1,700 benign apps collected from third-party Android markets. Our evaluation shows that LeakDroid reached a detection rate of 96.4% on the malicious apps and detected 68 true privacy information leakages in the 1,700 benign apps.
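The abstract does not describe how LeakDroid computes the shortest distance, so the following is only a generic sketch of the idea: treat API inter-dependency as a directed graph and use breadth-first search to measure how many dependency steps separate an API that reads private data from an API that sends data out. The graph, the node names, and the function name `shortest_distance` are all hypothetical.

```python
from collections import deque


def shortest_distance(graph, source, sink):
    """Number of edges on a shortest path from source to sink in a directed
    graph given as {node: [successors]}, or None if sink is unreachable."""
    if source == sink:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == sink:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical dependency graph: each edge means "output flows into".
api_graph = {
    "read_device_id": ["format_string"],
    "format_string": ["encrypt", "write_log"],
    "encrypt": ["send_http"],
}
```

With this sample graph, `shortest_distance(api_graph, "read_device_id", "send_http")` follows the path through `format_string` and `encrypt`.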
Security Enhanced Authentication Protocol in LTE With Preserving User Location Privacy
Changhee Hahn, Hyunsoo Kwon, Junbeom Hur
The number of subscribers to 4th generation mobile systems has been increasing rapidly. Along with that, preserving subscribers’ privacy has become a hot issue. Preventing users’ locations from being revealed publicly is more important than ever. In this paper, we first show that a privacy-related problem exists in the user authentication procedure of 4th generation mobile systems, especially LTE. Then, we suggest an attack model that allows an adversary to trace a user, i.e., the adversary can determine whether the user is in the adversary’s observation area. Such collection of subscribers’ locations by an unauthorized third party may cause severe privacy problems. To keep users’ privacy intact, we propose a modified authentication protocol for LTE. Our scheme has low computational overhead and strong secrecy, so that both security and efficiency are achieved. Finally, we prove that our scheme is secure by using the automatic verification tool ProVerif.
Multi-Level Sequence Alignment : An Adaptive Control Method Between Speed and Accuracy for Document Comparison
Jong-kyu Seo, Haesung Tak, Hwan-Gue Cho
Fingerprinting and sequence alignment are well-known approaches to document similarity comparison. A fingerprinting method is simple and fast, but it cannot locate the specific regions that are similar. A string alignment method identifies regions of similarity by arranging the sequences of a string. It has the advantage of finding specific similar regions, but the disadvantage of taking more computing time. Multi-Level Alignment (MLA) is a new method designed to take advantage of both methods. MLA divides the input documents into uniform-length blocks, extracts fingerprints from each block, and calculates the similarity of block pairs by comparing the fingerprints. A similarity table is created in this process. Finally, sequence alignment is used to identify the longest similar regions in the similarity table. MLA allows users to change the block size to control the balance between the fingerprint algorithm and sequence alignment. Because a document is divided into several blocks, similar regions are also fragmented across two or more blocks. To solve this fragmentation problem, we propose a united block method. Experimentally, we show that computing document similarity with united blocks is more accurate than the original MLA method, with a minor loss in time.
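The MLA pipeline described above (blocks, per-block fingerprints, a similarity table, then alignment over the table) can be sketched under several assumptions that the abstract does not confirm: fingerprints are taken to be sets of character k-grams, block similarity is Jaccard similarity, and the alignment step is a Smith-Waterman style local alignment over blocks. The united block method is not covered, and all names and parameter values are illustrative.

```python
def blocks(text, size):
    """Split a document into uniform-length blocks (the last may be shorter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def fingerprint(block, k=4):
    """Stand-in fingerprint: the set of character k-grams in the block."""
    return {block[i:i + k] for i in range(max(1, len(block) - k + 1))}


def similarity_table(a_blocks, b_blocks):
    """Jaccard similarity of fingerprints for every pair of blocks."""
    fa = [fingerprint(b) for b in a_blocks]
    fb = [fingerprint(b) for b in b_blocks]
    return [[len(x & y) / len(x | y) for y in fb] for x in fa]


def best_local_alignment(table, threshold=0.5, gap=0.5):
    """Local alignment over blocks. Returns (score, (row, col)) for the cell
    where the best-scoring run of similar blocks ends (1-based indices).
    Assumes both documents are non-empty."""
    rows, cols = len(table), len(table[0])
    score = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = (0.0, (0, 0))
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Similar block pairs add to the score, dissimilar ones subtract.
            match = table[i - 1][j - 1] - threshold
            score[i][j] = max(0.0,
                              score[i - 1][j - 1] + match,
                              score[i - 1][j] - gap,
                              score[i][j - 1] - gap)
            best = max(best, (score[i][j], (i, j)))
    return best
```

A smaller block size makes the table larger and shifts work toward the alignment step, while a larger block size relies more on fingerprint comparison, which is the trade-off the abstract describes.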
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr