Digital Library [Search Result]
Beyond Traditional Search: SIMD-Optimized Correction for Learned Index
Yeojin Oh, Nakyeong Kim, Jongmoo Choi, Seehwan Yoo
http://doi.org/10.5626/JOK.2025.52.5.363
To address the limitations of traditional indexing techniques, this study examines the search performance of machine learning-based Learned Indexes, focusing on the read-only RMI and the modifiable ALEX. We propose a SIMD-based optimization technique to minimize the overhead incurred during the correction phase, which accounts for over 80% of the total search time. Learned Indexes operate in two phases: prediction and correction. In our experiments with RMI, we found that when the error range is large, the SIMD Branchless Binary Search, which quickly narrows down the search range, outperforms other methods. In contrast, when the error range is small, the model prediction-based SIMD Linear Search demonstrates superior performance. For ALEX, which maintains a relatively constant error range, the straightforward SIMD Linear Search proved to be the most efficient compared to more complex search techniques. These results underscore the importance of choosing the right search algorithm based on the dataset's error range, index size, and density to achieve optimal performance.
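To make the correction phase concrete, below is a minimal NumPy sketch of both search variants, with vectorized comparison and arithmetic index updates standing in for real SIMD intrinsics; the linear model, key array, and error bounds are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def predict_position(key, slope, intercept):
    """Hypothetical linear model from the prediction phase of a learned index."""
    return int(slope * key + intercept)

def correct_linear(keys, key, pred, err):
    """Correction by linear scan over [pred - err, pred + err]; the vectorized
    comparison stands in for SIMD lanes (e.g., packed compares in C)."""
    lo = max(pred - err, 0)
    hi = min(pred + err + 1, len(keys))
    hits = np.nonzero(keys[lo:hi] == key)[0]
    return lo + int(hits[0]) if hits.size else -1

def correct_branchless_binary(keys, key, pred, err):
    """Correction by branchless binary search over the same window: the probe
    index is advanced arithmetically instead of via data-dependent branches."""
    base = max(pred - err, 0)
    n = min(pred + err + 1, len(keys)) - base
    while n > 1:
        half = n // 2
        base += half * int(keys[base + half - 1] < key)  # branchless step
        n -= half
    return base if keys[base] == key else -1

# Usage on a sorted key array with a rough linear model and a small error bound
keys = np.arange(0, 1000, 2)                 # even keys 0..998
pred = predict_position(500, 0.5, 0.0)       # model predicts index ~250
print(correct_linear(keys, 500, pred, err=8))              # -> 250
print(correct_branchless_binary(keys, 500, pred, err=64))  # -> 250
```

The trade-off described in the abstract shows up directly here: the linear scan touches only the small error window, while the branchless binary search halves a large window in a handful of predictable steps.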
Political Bias in Large Language Models and its Implications on Downstream Tasks
Jeong yeon Seo, Sukmin Cho, Jong C. Park
http://doi.org/10.5626/JOK.2025.52.1.18
This paper contains examples of political-leaning bias that may be offensive. As the performance of Large Language Models (LLMs) improves, direct interaction with users becomes possible, raising ethical issues. In this study, we design two experiments to explore the diverse spectrum of political stances that an LLM exhibits and how these stances affect downstream tasks. We first define the inherent political stances of the LLM as the baseline and compare results from three different inputs (jailbreak, political persona, and jailbreak persona). The experiments show that the political stances of the LLM changed the most under the jailbreak attack, while smaller changes were observed with the other two inputs. Moreover, an experiment involving downstream tasks demonstrated that the distribution of altered inherent political stances can affect the outcome of these tasks. These results suggest that the model generates responses that align more closely with its inherent stance than with the user's intention to personalize responses. We conclude that the intrinsic political bias of the model and its judgments can be explicitly communicated to users.
Multidimensional Subset-based Systems for Bias Elimination Within Binary Classification Datasets
KyeongSu Byun, Goo Kim, Joonho Kwon
http://doi.org/10.5626/JOK.2023.50.5.383
As artificial intelligence technology develops, fairness issues related to artificial intelligence are drawing attention. Many studies have addressed this issue, but most have focused on developing models and training methods; research on removing the bias present in the training data, which is a fundamental cause, is still insufficient. Therefore, in this paper, we design and implement a system that divides the bias existing within the data into label bias and subgroup bias and removes it to generate datasets with improved fairness. The proposed system consists of two steps: (1) subset generation and (2) bias removal. First, the subset generator divides the existing data into subsets formed by combinations of attribute values in the dataset. Each subset is then divided into dominant and weak groups based on fairness indicator values obtained by validating the existing datasets against the validation datasets. Next, the bias remover reduces the bias in each subset by repeatedly extracting and verifying samples from the dominant group so as to reduce its difference from the weak group. Afterwards, the debiased subsets are merged and a fair dataset is returned. The fairness indicators used for verification are the F1 score and equalized odds. Comprehensive experiments with real-world Census Income, COMPAS, and bank marketing data as verification data demonstrate that the proposed system outperforms the existing technique, yielding a better fairness improvement rate and higher accuracy with most machine learning algorithms.
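As an illustration of the equalized-odds indicator used for verification, here is a small NumPy sketch that measures the TPR/FPR gap between two subgroups; the arrays, threshold, and binary group encoding are hypothetical, not the paper's code.

```python
import numpy as np

def equalized_odds_gap(y_true, y_pred, group):
    """Equalized odds asks that TPR and FPR match across subgroups;
    this returns the larger absolute gap between groups 0 and 1."""
    gaps = []
    for positive_label in (1, 0):   # label 1 gives the TPR gap, label 0 the FPR gap
        rates = []
        for g in (0, 1):
            mask = (group == g) & (y_true == positive_label)
            rates.append(np.mean(y_pred[mask] == 1) if mask.any() else 0.0)
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)   # 0.0 means equalized odds is satisfied

# Illustrative arrays: true labels, predictions, and a binary protected attribute
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(equalized_odds_gap(y_true, y_pred, group))   # -> 0.5 on this toy data
```

In the proposed system, a gap like this (together with the F1 score) is what distinguishes a dominant subgroup from a weak one within each subset.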
Fair Feature Distillation Using Teacher Models of Larger Architecture
http://doi.org/10.5626/JOK.2021.48.11.1176
Achieving algorithmic fairness is becoming increasingly essential for various vision applications. Although a state-of-the-art fairness method, dubbed MMD-based Fair feature Distillation (MFD), significantly improved accuracy and fairness via feature distillation based on Maximum Mean Discrepancy (MMD) compared to previous works, MFD could only be applied when the teacher model has the same architecture as the student model. In this paper, building on MFD, we propose a systematic approach that mitigates unfair biases via feature distillation from a teacher model of larger architecture, dubbed MMD-based Fair feature Distillation with a Regressor (MFD-R). Through extensive experiments, we show that MFD-R benefits from the use of the larger teacher compared to MFD as well as other baseline methods.
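A rough PyTorch sketch of the idea is given below: a regressor projects student features into the larger teacher feature space before an MMD loss is applied. The dimensions, RBF kernel bandwidth, and feature tensors are assumptions for illustration, and the group-conditional form of the MMD term used in MFD is omitted here.

```python
import torch
import torch.nn as nn

def rbf_mmd(x, y, sigma=1.0):
    """Squared MMD with an RBF kernel between two feature batches."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

class FeatureRegressor(nn.Module):
    """Maps student features into the (larger) teacher feature space so the
    MMD-based distillation loss can be computed across different architectures."""
    def __init__(self, student_dim, teacher_dim):
        super().__init__()
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_feat):
        return self.proj(student_feat)

# Illustrative distillation step: dimensions and features are made up
regressor = FeatureRegressor(student_dim=128, teacher_dim=512)
student_feat = torch.randn(32, 128)           # from the student backbone
teacher_feat = torch.randn(32, 512).detach()  # from the frozen, larger teacher
loss = rbf_mmd(regressor(student_feat), teacher_feat)
loss.backward()
```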
Deep Learning Model based on Autoencoder for Reducing Algorithmic Bias of Gender
http://doi.org/10.5626/JOK.2019.46.8.721
Algorithmic bias is discrimination reflected in a model by bias in the data or by the combination of characteristics of the model and the data. In recent years, it has been shown that such bias is not only present but also amplified in deep learning models, so bias elimination remains an open problem. In this paper, we analyze algorithmic bias with respect to gender in terms of the bias-variance dilemma and identify its cause. To solve this problem, we propose a deep autoencoder-based latent space matching model. The experimental results show that algorithmic bias in deep learning is caused by differences in the latent space for each protected attribute in the feature-extraction part of the model. The proposed model achieves low bias by reducing these differences, mapping data with different gender characteristics to the same latent space. We employ Equality of Odds and Equality of Opportunity as quantitative measures and show that the proposed model is less biased than the previous model; the ROC curve shows a decrease in the deviation of predicted values between the genders.
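The following PyTorch sketch illustrates one way such latent-space matching can be set up: an autoencoder reconstruction loss plus a penalty on the distance between the mean latent codes of the two gender groups. The architecture, sizes, and the specific matching term are illustrative assumptions rather than the paper's exact model.

```python
import torch
import torch.nn as nn

class FairAutoencoder(nn.Module):
    """Autoencoder whose training loss adds a term that pulls the latent
    codes of the two gender groups toward the same region of latent space."""
    def __init__(self, in_dim=64, latent_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def training_loss(model, x, gender, match_weight=1.0):
    z, recon = model(x)
    recon_loss = nn.functional.mse_loss(recon, x)
    # Latent matching term: distance between the mean codes of the two groups
    # (assumes the batch contains samples from both groups)
    z0, z1 = z[gender == 0], z[gender == 1]
    match_loss = (z0.mean(dim=0) - z1.mean(dim=0)).pow(2).sum()
    return recon_loss + match_weight * match_loss

# Illustrative batch: features and a binary gender attribute
model = FairAutoencoder()
x = torch.randn(32, 64)
gender = torch.randint(0, 2, (32,))
training_loss(model, x, gender).backward()
```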
Speech-Act Analysis System Based on Dialogue Level RNN-CNN Effective on the Exposure Bias Problem
http://doi.org/10.5626/JOK.2018.45.9.911
A speech-act is the intention of the speaker in an utterance, and speech-act analysis classifies the speech-act of a given utterance. Recently, much research based on machine learning over corpora has been done. This study makes two points. First, the utterances in a dialogue are consecutive and organically related to each other, and the speech-act of the current utterance is strongly influenced by the immediately preceding utterance. Second, previous research did not address the exposure bias problem that arises when a speech-act analysis model uses the speech-act result of the previous utterance. In this paper, we propose a dialogue-level RNN-CNN speech-act analysis model and experiment with the exposure bias problem. The RNN-CNN model achieves 86.87% accuracy under the oracle condition and 86.27% under the greedy condition.
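A minimal PyTorch sketch of a dialogue-level RNN-CNN classifier is shown below: a CNN encodes each utterance and a GRU propagates dialogue context before per-utterance speech-act classification. The sizes and vocabulary are made up, and feeding the previous utterance's predicted speech-act back as input, which is where exposure bias arises, is left out for brevity.

```python
import torch
import torch.nn as nn

class DialogueRNNCNN(nn.Module):
    """Illustrative dialogue-level model: a 1D-CNN encodes each utterance,
    a GRU runs over the utterance sequence, and a linear layer predicts
    the speech-act of every utterance."""
    def __init__(self, vocab=5000, emb=64, hidden=128, n_acts=10):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.cnn = nn.Conv1d(emb, hidden, kernel_size=3, padding=1)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, n_acts)

    def forward(self, dialogue):                       # (n_utts, utt_len) token ids
        e = self.emb(dialogue).transpose(1, 2)         # (n_utts, emb, utt_len)
        u = torch.relu(self.cnn(e)).max(dim=2).values  # one vector per utterance
        h, _ = self.rnn(u.unsqueeze(0))                # dialogue-level context
        return self.cls(h.squeeze(0))                  # (n_utts, n_acts)

model = DialogueRNNCNN()
dialogue = torch.randint(0, 5000, (4, 12))   # 4 utterances of 12 tokens each
logits = model(dialogue)
print(logits.argmax(dim=1))                  # greedy speech-act per utterance
```

The oracle/greedy split in the abstract corresponds to whether gold or predicted labels of previous utterances are available at inference time; the gap between the two numbers is a direct measure of exposure bias.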
Bias-Based Predictor to Improve the Recommendation Performance of the Rating Frequency Weight-based Baseline Predictor
Collaborative filtering is limited by the cost required to perform recommendation, in terms of both time and space complexity. The Rating Frequency Weight-based Baseline Predictor (RFWBP), which approximates the precision of existing methods, is one way to reduce this cost. However, two issues need to be considered regarding the RFWBP: 1) it does not reduce the error because it does not learn for the recommendation, and 2) it recommends all items because there is no condition for building an appropriate recommendation list when the RFWBP alone is used for efficiency. In this paper, the Bias-Based Predictor (BBP) is proposed to solve these problems. The BBP reduces the error range and determines several cases for building an appropriate recommendation list, forming a recommendation list for each case.
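For context, the sketch below shows the classic bias-style baseline predictor (global mean plus user and item biases) that this family of methods builds on; the rating-frequency weighting of the RFWBP and the BBP's case-based list construction are not reproduced here, and the toy data is hypothetical.

```python
import numpy as np

def bias_baseline(ratings):
    """Classic bias baseline: r_hat(u, i) = mu + b_u + b_i.
    'ratings' is a dict {(user, item): rating}; a real RFWBP/BBP adds
    rating-frequency weights and case-based list construction on top."""
    mu = np.mean(list(ratings.values()))
    users = {u for u, _ in ratings}
    items = {i for _, i in ratings}
    b_u = {u: np.mean([r - mu for (uu, _), r in ratings.items() if uu == u])
           for u in users}
    b_i = {i: np.mean([r - mu - b_u[u] for (u, ii), r in ratings.items() if ii == i])
           for i in items}
    return lambda u, i: mu + b_u.get(u, 0.0) + b_i.get(i, 0.0)

# Illustrative toy ratings
ratings = {("alice", "m1"): 5, ("alice", "m2"): 3, ("bob", "m1"): 4, ("bob", "m3"): 2}
predict = bias_baseline(ratings)
print(round(predict("alice", "m3"), 2))   # -> 3.0 on this toy data
```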