Journal of KIISE

Search : [ keyword: Static Analysis ] (11)

A Study on Reduction of False Alarms in Weapon System Software Static Test Using Natural Language Processing Model

Insub Lee, Hyoju Nam, Namhoon Jung, Kyutae Cho, Sungkyu Noh

http://doi.org/10.5626/JOK.2024.51.3.244

Securing software stability has become increasingly important as military systems have been upgraded. To this end, the Defense Acquisition Program Administration conducts reliability tests for weapon system software using static analysis tools. However, many false alarms occur during the test process, wasting time and resources. This paper aims to achieve a high true positive/false positive classification rate by creating a dataset from the log of a static analysis tool and training a language model on it. Additionally, data processing methods appropriate for the static analysis features of weapon system software were investigated and analyzed. The analysis found that the CodeBERT model, pretrained on C/C++ and natural language and tuned with Optuna, a hyperparameter tuning tool, achieved an F1 score 4-5% higher than the existing SoTA model. If the model presented in this research is employed in software static testing, a significant number of false positives can be identified.
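For readers unfamiliar with the metric named in the abstract above, the F1 score can be computed from alarm-classification counts as in the following minimal sketch. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall.

    tp: alarms correctly classified as true defects
    fp: false alarms wrongly classified as true defects
    fn: true defects wrongly classified as false alarms
    """
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, for illustration only.
example = f1_score(tp=80, fp=20, fn=20)
```

With these illustrative counts, precision and recall are both 0.8, so the F1 score is also 0.8.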

Type-Checking-based Refinement for the Analysis of Uncaught Exceptions in Digital Forensic Software

Seowoo Lee, Dongwon Lee, Sehoon Kim

http://doi.org/10.5626/JOK.2023.50.12.1071

This paper designs an uncaught exception detection scheme for digital forensic software written in Python, aiming to enhance the reliability of the forensic process. Building on the legacy set-constraint-based analysis method, the proposed scheme identifies potential uncaught exceptions in the target forensic software. Next, with the help of Pyright, a Python-specific static type checker, the scheme eliminates meaningless alarms inevitably created during the analysis process, such as key errors in list types or out-of-range index errors in dictionary types. In addition, we remove duplicated detections based on a dependency tree that traces the inclusion relationship between each component or point of a given module. The experimental results, obtained by applying our static analyzer to nine benchmarks of digital forensic software, demonstrate that the proposed scheme successfully finds 10 locations of three exception patterns, including dictionary key errors, out-of-range index errors, and division by zero errors, which could not be located before. Furthermore, the analysis achieves an average of 84% and a maximum of 89% reduction in false alarms for each benchmark.
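The type-based refinement step described in the abstract above can be pictured with the following minimal sketch. It assumes the analysis emits alarm records and that a type checker has already supplied the static type of each subscripted value. The `Alarm` record, its field names, and the `FEASIBLE` table are hypothetical illustrations, not the authors' data structures, and no Pyright API is called here.

```python
from dataclasses import dataclass

# Hypothetical alarm record; field names are illustrative assumptions.
@dataclass(frozen=True)
class Alarm:
    location: str       # e.g. "a.py:10"
    exception: str      # exception the analysis believes may escape
    receiver_type: str  # static type of the subscripted value

# Exceptions a subscript operation on each built-in type can raise.
FEASIBLE = {
    "list": {"IndexError", "TypeError"},
    "dict": {"KeyError", "TypeError"},
}

def refine(alarms):
    """Drop alarms whose exception cannot occur for the receiver's type."""
    kept = []
    for alarm in alarms:
        feasible = FEASIBLE.get(alarm.receiver_type)
        # Unknown types are kept: without type information
        # the alarm cannot be ruled out.
        if feasible is None or alarm.exception in feasible:
            kept.append(alarm)
    return kept

alarms = [
    Alarm("a.py:10", "KeyError", "list"),    # infeasible for a list
    Alarm("a.py:11", "IndexError", "dict"),  # infeasible for a dict
    Alarm("a.py:12", "KeyError", "dict"),    # feasible
    Alarm("a.py:13", "IndexError", "Any"),   # unknown type, kept
]
refined = refine(alarms)
```

Keeping alarms of unknown type is a deliberately conservative choice in this sketch: it only removes alarms that the type information proves impossible.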

Malware Variants Detection based on Dhash

Hongbi Kim, Hyunseok Shin, Junho Hwang, Taejin Lee

http://doi.org/10.5626/JOK.2019.46.11.1207

Malicious code is becoming more intelligent due to the popularization of malware generation tools and obfuscation techniques, but existing malware detection techniques detect it only incompletely. Considering that many newly emerging malicious codes are variants of existing ones, and that their binary data are similar to those of the originals, a Dhash-based malware detection technique is presented here that classifies images based on the binary data in a file, along with a 10-gram algorithm that reduces the long analysis time caused by the full comparison in the Dhash algorithm. A comparison with ssdeep, a leading technique in variant malware detection, shows that the Dhash algorithm can detect areas that ssdeep does not, and a detection-speed comparison between the existing Dhash algorithm and the algorithm proposed in this paper demonstrates the superiority of the latter. Future work will continue to develop a variety of malware analysis technologies that are linked to other LSH-based detection techniques.
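As a rough illustration of the difference-hash idea mentioned in the abstract above, the following sketch computes a 64-bit dHash over raw bytes. The abstract does not say how the file's binary data is mapped to an image, so the mapping here (averaging the bytes into a 9x8 grid of cells) is an assumption, and the function names are hypothetical. The 10-gram speed-up is not shown.

```python
def dhash_bytes(data: bytes, hash_size: int = 8) -> int:
    """Difference hash over raw bytes (assumed byte-to-grid mapping)."""
    rows, cols = hash_size, hash_size + 1
    cells = rows * cols
    n = len(data)
    # Downsample: each grid cell holds the mean of one slice of the data.
    means = []
    for i in range(cells):
        chunk = data[i * n // cells:(i + 1) * n // cells]
        means.append(sum(chunk) / len(chunk) if chunk else 0.0)
    # One bit per horizontally adjacent pair: 1 if left cell is brighter.
    bits = 0
    for r in range(rows):
        for c in range(hash_size):
            left = means[r * cols + c]
            right = means[r * cols + c + 1]
            bits = (bits << 1) | int(left > right)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

original = bytes(range(256)) * 4
variant = bytearray(original)
variant[500] ^= 0xFF  # change a single byte
h1 = dhash_bytes(original)
h2 = dhash_bytes(bytes(variant))
distance = hamming(h1, h2)
```

Because a single changed byte alters only one cell mean, and each cell takes part in at most two comparisons, the two hashes differ in at most two bits. This locality is what makes the hash useful for spotting near-duplicate binaries.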

V-gram: Malware Detection Using Opcode Basic Blocks and Deep Learning

Seongmin Jeong, Hyeonseok Kim, Youngjae Kim, Myungkeun Yoon

http://doi.org/10.5626/JOK.2019.46.7.599

With the rapid increase in the number of malware samples, automatic detection based on machine learning has become more important. Since the opcode sequence extracted from a malicious executable file is a useful feature for malware detection, it is widely used as input data for machine learning through byte-based n-gram processing techniques. This study proposes V-gram, a new data preprocessing technique for deep learning, which improves on existing n-gram methods in terms of processing speed and storage space. V-gram can prevent unnecessary generation of meaningless input data from opcode sequences. Experiments on more than 64,000 normal and malicious code files verified that V-gram is superior to the conventional n-gram method in terms of processing speed, storage space, and detection accuracy.
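To make the contrast in the abstract above concrete, the following sketch compares conventional sliding-window n-grams with a segmentation by basic blocks, as the paper's title suggests. The abstract does not specify how V-gram builds its units, so the block-splitting rule and the `BLOCK_END` opcode set are assumptions made only for illustration.

```python
def ngrams(opcodes, n):
    """Conventional fixed-length sliding-window n-grams."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

# Assumption: a basic block ends at one of these control-transfer opcodes.
BLOCK_END = {"jmp", "je", "jne", "ret"}

def basic_blocks(opcodes):
    """Split an opcode sequence into variable-length basic blocks."""
    blocks, current = [], []
    for op in opcodes:
        current.append(op)
        if op in BLOCK_END:
            blocks.append(tuple(current))
            current = []
    if current:  # trailing opcodes without a terminator
        blocks.append(tuple(current))
    return blocks

# Hypothetical opcode trace.
trace = ["push", "mov", "cmp", "je", "mov", "add", "jmp", "pop", "ret"]
grams = ngrams(trace, 3)
blocks = basic_blocks(trace)
```

For this nine-opcode trace, the sliding window yields seven overlapping 3-grams, while block-based splitting yields three non-overlapping units. This hints at why a block-based representation could need less storage.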

A Study on Two-dimensional Array-based Technology to Identify Obfuscated Malware

Seonbin Hwang, Hogyeong Kim, Junho Hwang, Taejin Lee

http://doi.org/10.5626/JOK.2018.45.8.769

More than 1.6 million types of malware emerge on average per day, and most cyber attacks are carried out using malware. Moreover, malware obfuscation techniques are becoming more intelligent, using packing or encryption to prevent reverse engineering analysis. Static analysis is limited when the file under analysis is obfuscated, so a countermeasure is needed. In this paper, we propose an approach based on String, Symbol, and Entropy as a way to identify malware even when it is obfuscated. Two-dimensional arrays were applied for fixed feature-set processing as well as non-fixed feature-set processing, and 15,000 malware/benign samples were tested using a Deep Neural Network. This study is expected to complement various malicious code detection methods in the future and to be useful in the analysis of obfuscated malware variants.
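Of the three feature families named in the abstract above, entropy is the easiest to illustrate. The abstract does not state how entropy is computed, so the following sketch assumes the common byte-level Shannon entropy. The function name is hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (range 0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

low = shannon_entropy(b"\x00" * 100)      # one repeated byte value
high = shannon_entropy(bytes(range(256)))  # every byte value exactly once
```

Packed or encrypted sections tend toward the upper end of this range, which is why entropy is often used as a signal that a file is obfuscated.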

Static Analysis of Large Scale Software Repositories Using WALA and Boa

Gyunghee Park, Sukyoung Ryu

http://doi.org/10.5626/JOK.2017.44.10.1081

Program analysis of large-scale open-source software repositories is significant in that it allows us to examine the changes and improvements of the software in the repositories, and it yields more reliable results because it draws on a large number of programs. In this paper, we introduce a new static analysis framework, WALABOA, which enables scalable static analysis of large-scale software repositories. In addition, we show new findings from applying WALABOA, together with a module that compares the results of a static analysis and a dynamic analysis, to the evaluation of the field-based analysis, one of the JavaScript static analysis techniques used in WALA.

An Effective Technique for Detecting Vulnerabilities in Android Device Drivers

Youngki Chung, Seong-je Cho

http://doi.org/

Android- and Linux-based embedded systems require device drivers, which are structured and built in kernel functions. However, device driver software (firmware) provided by various third parties is usually not checked against security requirements but is simply included in the final products, that is, Android-based smartphones. In addition, static analysis, which is generally used to detect vulnerabilities, may incur extra cost in detecting critical security issues such as privilege escalation because of its large proportion of false positive results. In this paper, we propose and evaluate an effective technique to detect vulnerabilities in Android device drivers using both static and dynamic analyses.

Tunable Static Analysis Framework for JavaScript Applications

Yoonseok Ko, Sukyoung Ryu

http://doi.org/

In this paper, we present a novel approach to analyzing large-scale JavaScript applications statically by tuning the analysis scalability, possibly sacrificing soundness. For a given sound static baseline analysis of JavaScript programs, our framework allows users to define a sound approximation of selected executions that they wish to analyze, and it derives a tuned static analysis that can analyze the selected executions practically. The selected executions serve as parameters of the framework by taking a trade-off between the scalability and the soundness of the derived analyses. We formally describe our framework in the abstract interpretation setting and present two instances of the framework.

A Study on Quality Assurance of Embedded Software Source Codes for Weapon Systems by Improving the Reliability Test Process

Kyeong Yong Kwon, Joon Seok Joo, Tae Sik Kim, Jin Woo Oh, Ji Hyun Baek

http://doi.org/

In the defense field, weapon systems are growing in importance, as is the weight of embedded software development for weapon systems as an advanced technology. With the development of network-centric warfare, it has become important to secure the reliability and quality of embedded software in modern weapon systems in battlefield situations. Also, embedded software problems arising in the development phase are carried over to the production stage, giving rise to enormous losses at the national level. Furthermore, development companies have not systematically established software reliability testing. This study suggests approaches to establishing a quality verification system for embedded software, based on an analysis of various source code reliability test verification cases.

A Study on Selecting Key Opcodes for Malware Classification and Its Usefulness

Jeong Been Park, Kyung Soo Han, Tae Gune Kim, Eul Gyu Im

http://doi.org/

Recently, the number of new malware samples and malware variants has dramatically increased. As a result, the time needed to analyze malware and the effort required of malware analysts have also increased. Malware classification helps analysts reduce the overhead of malware analysis, and it is also useful in studying the malware’s genealogy. In this paper, we propose a set of key opcodes to classify malware. In our experiments, we selected the top 10 opcodes as key opcodes, and the key opcodes decreased the training time of a supervised learning algorithm by 91% while preserving classification accuracy.
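The idea of reducing features to a small set of key opcodes, as described in the abstract above, can be sketched as follows. The abstract does not state the ranking criterion used to pick the top opcodes, so this sketch assumes ranking by overall frequency. The function names and sample traces are hypothetical.

```python
from collections import Counter

def select_key_opcodes(samples, k):
    """Rank opcodes by total frequency (assumed criterion), keep the top k."""
    totals = Counter()
    for opcodes in samples:
        totals.update(opcodes)
    return [op for op, _ in totals.most_common(k)]

def to_features(opcodes, key_opcodes):
    """Represent one sample as counts over the key opcodes only."""
    counts = Counter(opcodes)
    return [counts[op] for op in key_opcodes]

# Hypothetical opcode traces, one list per sample.
samples = [
    ["mov", "push", "mov", "call"],
    ["mov", "add", "mov", "push"],
    ["xor", "mov", "push", "ret"],
]
keys = select_key_opcodes(samples, k=2)
features = [to_features(s, keys) for s in samples]
```

Shrinking each sample to a short fixed-length vector is one plausible reason a classifier's training time could drop, though the paper's actual feature construction may differ.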

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr
