Efficient Large Language Model Based Passage Re-Ranking Using Single Token Representations
Jeongwoo Na, Jun Kwon, Eunseong Choi, Jongwuk Lee
http://doi.org/10.5626/JOK.2025.52.5.395
In information retrieval systems, document re-ranking reorders a set of candidate documents according to their estimated relevance to a given query. Many studies have applied the extensive natural language understanding capabilities of large language models (LLMs) to document re-ranking and have reported groundbreaking performance. However, these studies focus solely on improving re-ranking accuracy, and efficiency suffers because of excessively long input sequences and the need for repeated inference. To address these limitations, we propose ListT5++, a novel model that represents the relevance between a query and a passage using a single token embedding and significantly improves the efficiency of LLM-based re-ranking through a single-step decoding strategy that minimizes the decoding process. Experimental results show that ListT5++ maintains accuracy comparable to existing methods while reducing inference latency by a factor of 29.4 relative to the baseline. Moreover, our approach is robust in that it is insensitive to the initial ordering of candidate documents, which makes it highly practical for real-time retrieval environments.
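As a toy illustration of the general idea only (not the ListT5++ architecture, whose details are not given in this abstract), the sketch below assumes each query-passage pair has already been encoded into one relevance vector. All candidates are scored in a single pass and then sorted, so the result does not depend on the order in which candidates arrive. The names `rerank`, `relevance_embeddings`, and `scoring_vector` are hypothetical, and the dot-product scorer is an assumption made for the example.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rerank(
    passage_ids: List[str],
    relevance_embeddings: Dict[str, Sequence[float]],
    scoring_vector: Sequence[float],
) -> List[str]:
    """Score every passage once from its single vector, then sort.

    Hypothetical scorer: one dot product per passage. Ties are broken by
    passage id so the output is independent of the input order.
    """
    scores = {
        pid: dot(relevance_embeddings[pid], scoring_vector)
        for pid in passage_ids
    }
    return sorted(passage_ids, key=lambda pid: (-scores[pid], pid))


# Made-up vectors, for illustration only.
embeddings = {"p1": [0.1, 0.9], "p2": [0.8, 0.2], "p3": [0.5, 0.5]}
weights = [1.0, 0.0]

ranking_a = rerank(["p1", "p2", "p3"], embeddings, weights)
ranking_b = rerank(["p3", "p1", "p2"], embeddings, weights)
print(ranking_a)  # ['p2', 'p3', 'p1']
print(ranking_a == ranking_b)  # True
```

Because each passage is scored independently and exactly once, the cost grows linearly with the number of candidates, in contrast to approaches that decode a full ranking token by token.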
Improving the Lifetime of NAND Flash-based Storages by Min-hash Assisted Delta Compression Engine
Hyoukjun Kwon, Dohyun Kim, Jisung Park, Jihong Kim
In this paper, we propose the Min-hash Assisted Delta-compression Engine (MADE) to improve the lifetime of NAND flash-based storage at the device level. MADE reduces the write traffic to NAND flash through a novel delta compression scheme. We optimize delta compression performance by introducing min-hash based locality-sensitive hashing (LSH) and combining it efficiently with our delta compression method. We also develop a delta encoding technique that is functionally equivalent to deduplication and lossless compression. Our experimental results show that MADE reduces the amount of data written to NAND flash by up to 90%, which is 12% better on average than a simple combination of deduplication and lossless compression schemes.
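The general technique named above can be sketched as follows: a min-hash signature estimates how similar a new block is to previously stored blocks, and the most similar one serves as the reference for delta encoding. This is a minimal sketch under stated assumptions, not MADE's implementation. The shingle size, number of hash functions, similarity threshold, and the `difflib`-based delta format are all choices made for the example, and every function name is hypothetical.

```python
import hashlib
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def minhash_signature(data: bytes, num_hashes: int = 16, k: int = 4) -> Tuple[int, ...]:
    """Min-hash signature over the k-byte shingles of a block."""
    shingles = {data[i:i + k] for i in range(len(data) - k + 1)} or {data}
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "little")  # one salted hash per signature slot
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "little")
            for s in shingles
        ))
    return tuple(signature)


def estimated_similarity(sig_a: Tuple[int, ...], sig_b: Tuple[int, ...]) -> float:
    """Fraction of matching slots, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def pick_reference(new_block: bytes, stored: List[bytes],
                   threshold: float = 0.5) -> Optional[int]:
    """Index of the most similar stored block, or None if none is close enough."""
    new_sig = minhash_signature(new_block)
    best_index, best_score = None, threshold
    for index, block in enumerate(stored):
        score = estimated_similarity(new_sig, minhash_signature(block))
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


def make_delta(reference: bytes, new_block: bytes) -> list:
    """Encode new_block as copy ranges from reference plus literal inserts."""
    delta = []
    matcher = SequenceMatcher(None, reference, new_block, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", new_block[j1:j2]))
        # "delete": bytes of the reference that are simply skipped
    return delta


def apply_delta(reference: bytes, delta: list) -> bytes:
    """Rebuild the new block from the reference and its delta."""
    parts = []
    for op in delta:
        parts.append(reference[op[1]:op[2]] if op[0] == "copy" else op[1])
    return b"".join(parts)


base = bytes(range(256))
modified = base[:100] + b"\xff" + base[101:]   # one byte changed
stored_blocks = [b"\xff" * 256, base]          # unrelated block, near-duplicate

ref_index = pick_reference(modified, stored_blocks)
delta = make_delta(stored_blocks[ref_index], modified)
restored = apply_delta(stored_blocks[ref_index], delta)
```

In a real device the signatures of stored blocks would be computed once and indexed (for example in LSH buckets) rather than recomputed for every lookup as this sketch does.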
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr