A Study on Reduction of False Alarms in Weapon System Software Static Test Using Natural Language Processing Model
Insub Lee, Hyoju Nam, Namhoon Jung, Kyutae Cho, Sungkyu Noh
http://doi.org/10.5626/JOK.2024.51.3.244
Securing software stability has become increasingly important as military systems have been upgraded. To this end, the Defense Acquisition Program Administration conducts reliability tests on weapon system software using static analysis tools. However, many false alarms occur during testing, wasting time and resources. This paper aims to achieve a high true positive/false positive classification rate by building a dataset from the logs of a static analysis tool and training a language model on it. Data processing methods suited to the static analysis characteristics of weapon system software were also investigated and analyzed. The analysis found that the CodeBERT model, pretrained on C/C++ and natural language and tuned with Optuna, a hyperparameter tuning tool, scored 4-5% higher on F1 than the existing SoTA model. If the model presented in this research is employed in software static testing, a significant number of false positives can be identified.
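For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score is computed for a binary true-alarm/false-alarm classifier. The label encoding (1 for a true defect, 0 for a false alarm) and the sample labels are illustrative assumptions, not data or code from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions: precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = true defect, 0 = false alarm (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)  # 3 TP, 1 FP, 1 FN
```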
Improving False Positive Rate of Extended Learned Bloom Filters Using Grid Search
http://doi.org/10.5626/JOK.2022.49.1.78
A Bloom filter is a data structure that represents a set and reports whether a given item is in it. In exchange for using less space, it can return false positives. The learned Bloom filter is a variation of the Bloom filter that uses a machine learning model as a pre-processing step to improve the false positive rate. It stores some of the data in the machine learning model and the remaining data in an auxiliary filter. The auxiliary filter can be implemented with a Bloom filter alone, but in this paper we combine a Bloom filter with a learned hash function and call the result an extended learned Bloom filter. The learned hash function uses the output value of the machine learning model as a hash function. We propose a method that improves the false positive rate of the extended learned Bloom filter through grid search. The method searches for the extended learned Bloom filter with the lowest false positive rate by increasing the hyperparameter that represents the ratio of the learned hash function. We show experimentally that the extended learned Bloom filter selected through grid search can improve the false positive rate by 20% compared to the learned Bloom filter in an experiment that stores more than 100,000 items. We also show that false negatives can occur in the learned hash function when the neural network model uses 32-bit floating-point numbers; switching to 64-bit floating-point numbers resolves this. Finally, we show that in an experiment querying 10,000 items, the structure of the neural network model can be adjusted to save 20KB of space while producing an extended learned Bloom filter with the same false positive rate. However, the 20KB saving comes at the cost of a 2% increase in query time.
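The basic structure described in this abstract can be sketched as follows: a standard Bloom filter, and a learned Bloom filter that places a scoring model in front of an auxiliary Bloom filter. This is a minimal illustration under stated assumptions, not the paper's implementation. The class names, the prefix-based `toy_score` stand-in for a trained model, and all sizes are hypothetical, and the extended variant with a learned hash function and the grid search over its ratio are not reproduced here.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter with num_bits bits and num_hashes probes."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: position_i = (h1 + i * h2) mod num_bits.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the stride odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class LearnedBloomFilter:
    """Model first; keys the model scores below the threshold go to a backup filter."""

    def __init__(self, keys, score, threshold, num_bits, num_hashes):
        self.score = score
        self.threshold = threshold
        self.backup = BloomFilter(num_bits, num_hashes)
        for key in keys:
            if score(key) < threshold:
                self.backup.add(key)  # leftover data stored in the auxiliary filter

    def __contains__(self, item):
        return self.score(item) >= self.threshold or item in self.backup


def toy_score(item):
    # Hypothetical stand-in for a trained model: "recognizes" one key pattern.
    return 1.0 if item.startswith("user") else 0.0


keys = [f"user{i}" for i in range(1000)] + [f"guest{i}" for i in range(100)]
lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5, num_bits=2048, num_hashes=4)
```

Because every key either passes the model threshold or is inserted into the backup filter, this construction never produces a false negative as long as the model returns the same score at insertion and query time. A non-key such as "user5000" is still accepted, which illustrates a false positive introduced by the model itself.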

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr