Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning 


Vol. 52, No. 4, pp. 293-303, Apr. 2025
10.5626/JOK.2025.52.4.293



  Abstract

Reinforcement learning is studied and applied in fields such as autonomous driving, robotics, and gaming. Its goal is to find an optimal policy for an agent interacting with its environment. Depending on the environment and the problem, either a policy-based or a value-based algorithm is used. Policy-based algorithms learn effectively in continuous, high-dimensional action spaces, but they are sensitive to the learning rate, and converging to an optimal policy becomes harder in complex environments. To address these issues, this paper proposes an action selection technique and a dynamic dense reward design based on a simulated annealing algorithm. The proposed method is applied to two different environments, and experimental results show that policy-based reinforcement learning algorithms using it outperform the standard reinforcement learning algorithms.
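The abstract does not say how simulated annealing enters the action selection step, so the following is only a minimal sketch of the generic idea, not the authors' method. It assumes a Metropolis acceptance rule with a geometric cooling schedule: an exploratory candidate action replaces the policy's proposed action with a probability that shrinks as the temperature drops. All names (`temperature`, `accept_probability`, `select_action`, `score_fn`) and all default parameter values are hypothetical.

```python
import math
import random


def temperature(step, t0=1.0, decay=0.99, t_min=1e-3):
    """Geometric cooling: T_k = max(t_min, t0 * decay**k). Values are assumptions."""
    return max(t_min, t0 * decay ** step)


def accept_probability(current_score, candidate_score, temp):
    """Metropolis criterion for maximization.

    A candidate that scores at least as well is always accepted; a worse one
    is accepted with probability exp((candidate - current) / temp).
    """
    if candidate_score >= current_score:
        return 1.0
    return math.exp((candidate_score - current_score) / temp)


def select_action(policy_action, candidate_action, score_fn, step, rng=random):
    """Choose between the policy's action and an exploratory candidate.

    score_fn is a hypothetical estimate of how good an action is (for
    example, a critic's value estimate); the source does not define it.
    """
    temp = temperature(step)
    p = accept_probability(score_fn(policy_action), score_fn(candidate_action), temp)
    return candidate_action if rng.random() < p else policy_action
```

Early in training the temperature is high, so worse-scoring candidates are accepted often and the agent explores; as the temperature cools, selection falls back to the policy's own action unless the candidate scores at least as well.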




  Cite this article

[IEEE Style]

J. Kim, J. Kim, K. Cho, "Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning," Journal of KIISE, JOK, vol. 52, no. 4, pp. 293-303, 2025. DOI: 10.5626/JOK.2025.52.4.293.


[ACM Style]

Junhyuk Kim, Junoh Kim, and Kyungeun Cho. 2025. Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning. Journal of KIISE, JOK, 52, 4, (2025), 293-303. DOI: 10.5626/JOK.2025.52.4.293.


[KCI Style]

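Junhyuk Kim, Junoh Kim, Kyungeun Cho, "Research on Action Selection Techniques and Dynamic Dense Reward Application for Efficient Exploration in Policy-Based Reinforcement Learning," Journal of KIISE, vol. 52, no. 4, pp. 293~303, 2025. DOI: 10.5626/JOK.2025.52.4.293.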









Journal of KIISE

  • ISSN: 2383-630X (Print)
  • ISSN: 2383-6296 (Electronic)
  • KCI Accredited Journal
