Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning

Junhyuk Kim; Junoh Kim; Kyungeun Cho

Vol. 52, No. 4, pp. 293-303, Apr. 2025

10.5626/JOK.2025.52.4.293

Keywords: deep RL, policy-based RL algorithm, simulated-annealing, dynamic dense reward

Abstract

Reinforcement learning is studied and applied in fields such as autonomous driving, robotics, and gaming. Its goal is to find the optimal policy by which an agent interacts with its environment. Depending on the environment and the specific problem, either a policy-based or a value-based algorithm is chosen. Policy-based algorithms learn effectively in continuous, high-dimensional action spaces, but they are sensitive to learning-rate parameters and have more difficulty converging to an optimized policy in complex environments. To address these issues, this paper proposes an action selection technique and a dynamic dense reward design based on a simulated annealing algorithm. The proposed method is applied to two different environments, and the experimental results show that policy-based reinforcement learning algorithms using this method outperform the standard reinforcement learning algorithms.
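
The abstract does not give the details of the method, so the following is only a minimal sketch of how simulated annealing could drive both action selection and a dense reward weight. Everything in it is an assumption made for illustration, not the authors' implementation: the geometric cooling schedule, the Metropolis-style acceptance rule, the `score` function, and the idea of scaling a shaping term by the current temperature are all hypothetical.

```python
import math
import random


def temperature(step, t0=1.0, decay=0.99, t_min=1e-3):
    """Geometric cooling schedule (a hypothetical choice, not from the paper)."""
    return max(t_min, t0 * decay ** step)


def sa_select_action(policy_action, candidate_action, score, temp, rng=random):
    """Metropolis-style choice between the policy's action and an exploratory candidate.

    `score` is a hypothetical callable that estimates how good an action is
    (higher is better), for example a critic's value estimate.
    """
    delta = score(candidate_action) - score(policy_action)
    if delta >= 0:
        # The candidate looks at least as good, so take it.
        return candidate_action
    # A worse candidate is accepted with probability exp(delta / temp),
    # which shrinks toward zero as the temperature cools.
    if rng.random() < math.exp(delta / temp):
        return candidate_action
    return policy_action


def dynamic_dense_reward(sparse_reward, shaping_term, temp, t0=1.0):
    """Add a dense shaping term whose weight (temp / t0) decays with the temperature."""
    return sparse_reward + (temp / t0) * shaping_term
```

Under these assumptions, a training loop would call `temperature(step)` once per step, pass the result to `sa_select_action` when choosing between the policy's action and a randomly sampled one, and pass it to `dynamic_dense_reward` when computing the reward. Exploration and reward shaping are then both strong early in training and fade as the temperature falls.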

Cite this article

[IEEE Style]

J. Kim, J. Kim, K. Cho, "Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning," Journal of KIISE, JOK, vol. 52, no. 4, pp. 293-303, 2025. DOI: 10.5626/JOK.2025.52.4.293.

[ACM Style]

Junhyuk Kim, Junoh Kim, and Kyungeun Cho. 2025. Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning. Journal of KIISE, JOK, 52, 4, (2025), 293-303. DOI: 10.5626/JOK.2025.52.4.293.

[KCI Style]

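Junhyuk Kim, Junoh Kim, Kyungeun Cho, "Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning," Journal of KIISE, Vol. 52, No. 4, pp. 293-303, 2025. DOI: 10.5626/JOK.2025.52.4.293.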

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal
