TY  - JOUR
T1  - Improving Retrieval Models through Reinforcement Learning with Feedback
AU  - Seo, Min-Taek
AU  - Lim, Joon-Ho
AU  - Kim, Tae-Hyeong
AU  - Ryu, Hwi-Jung
AU  - Chang, Du-Seong
AU  - Na, Seung-Hoon
JO  - Journal of KIISE, JOK
PY  - 2024
DA  - 2024/1/14
DO  - 10.5626/JOK.2024.51.10.900
KW  - Language model
KW  - Information retrieval
KW  - Reinforcement learning
KW  - Question answering
AB  - Open-domain question answering involves retrieving clues through search to solve problems. In such tasks, it is crucial that the retrieval model provides appropriate clues, as this directly impacts final performance. Moreover, information retrieval is an important function frequently used in everyday life. Recognizing the significance of these challenges, this paper aims to improve the performance of retrieval models. Just as the recent trend adjusts the outputs of decoder models using Reinforcement Learning from Human Feedback (RLHF), this study seeks to enhance retrieval models through reinforcement learning. Specifically, we defined two rewards: the loss of the answer model and the similarity between the retrieved documents and the correct document. Based on these, we applied reinforcement learning to adjust the probability score of the top-ranked document in the retrieval model's document probability distribution. Through this approach, we confirmed the generality of the reinforcement learning method and its potential for further performance improvements.
ER  -