Digital Library: Search Results
A Real-time Scheduling Framework for Multi-threaded ROS 2 Applications
Seryun Kang, Jinseop Jeong, Kanghee Kim
http://doi.org/10.5626/JOK.2025.52.1.1
Real-time performance of robot applications operating in the physical world is crucial. In ROS (Robot Operating System) 2, robot applications consist of dozens or even hundreds of tasks. If the end-to-end delay from sensing to control increases, the resulting motion may be delayed, potentially leading to physical accidents. Consequently, many studies have been conducted to analyze and reduce delays in robot applications. This paper proposes a real-time scheduling framework that allows the application of the probabilistic latency analysis method, originally designed for process graphs, to thread graphs. The proposed framework groups callback functions with the same period into a single group based on a global schedule table and creates a thread graph by assigning a dedicated thread to each group. Each thread is then fixed to a CPU core as determined by the table and is scheduled using FIFO. This paper applies the proposed framework to the localization pipeline of Autoware and confirms that probabilistic latency analysis is feasible within this framework.
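As an illustration of the grouping step described above, here is a minimal Python sketch, assuming a hypothetical schedule table that maps each callback period to a CPU core; the callback names, table layout, and thread handling are illustrative, not the framework's actual API.

import threading
from collections import defaultdict

# Hypothetical global schedule table: period (ms) -> CPU core.
SCHEDULE_TABLE = {10: 0, 20: 1, 100: 2}

# Callbacks registered by the application: (name, period in ms).
callbacks = [("lidar_cb", 10), ("imu_cb", 10), ("ekf_cb", 20), ("map_cb", 100)]

# Group callbacks with the same period, as the framework does.
groups = defaultdict(list)
for name, period in callbacks:
    groups[period].append(name)

def run_group(period, names, core):
    # A real implementation would pin this thread to `core` (e.g. with
    # os.sched_setaffinity on Linux) and request SCHED_FIFO scheduling;
    # here we only report the assignment.
    print(f"period={period}ms group {names} -> core {core}, FIFO")

threads = [threading.Thread(target=run_group, args=(p, g, SCHEDULE_TABLE[p]))
           for p, g in groups.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()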
Efficient Dynamic Graph Processing Based on GPU Accelerated Scheduling and Operation Reduction
Sangho Song, Jihyeon Choi, Donghyeon Cha, Hyeonbyeong Lee, Dojin Choi, Jongtae Lim, Kyoungsoo Bok, Jaesoo Yoo
http://doi.org/10.5626/JOK.2024.51.12.1125
Recent research has focused on utilizing GPUs to process large-scale dynamic graphs. However, processing dynamic graphs often leads to redundant data transmission and processing. This paper proposes an efficient scheme for processing large-scale dynamic graphs in memory-constrained GPU environments. The proposed scheme consists of a dynamic scheduling method and an operation reduction method. The dynamic scheduling method partitions the dynamic graph and maximizes GPU processing power by scheduling partitions based on active and potentially active vertices. Snapshots are also utilized to leverage the time-varying characteristics of the graph. The operation reduction method minimizes GPU computation and memory transfer costs by using snapshots to detect redundant edge and vertex updates in dynamic graphs. By avoiding redundant operations on the same edges or vertices, this method improves performance. In various performance evaluations, the proposed scheme showed 280% and 108% performance improvements on average over a static graph processing scheme and a dynamic graph processing scheme, respectively.
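The operation reduction idea can be sketched in a few lines of Python: between two snapshots, an edge that is inserted and then deleted (or vice versa) has no net effect and never needs to reach the GPU. The update encoding below is an assumption for illustration, not the paper's data structure.

def reduce_operations(updates):
    # updates: list of ('ins' | 'del', edge) in arrival order within one
    # snapshot window; only the net effect per edge is forwarded to the GPU.
    net = {}
    for op, edge in updates:
        if edge in net and net[edge] != op:
            del net[edge]      # an insertion and a deletion cancel out
        else:
            net[edge] = op     # keep the latest net effect for this edge
    return [(op, e) for e, op in net.items()]

updates = [("ins", (1, 2)), ("del", (1, 2)), ("ins", (2, 3)), ("ins", (2, 3))]
print(reduce_operations(updates))   # only the net insertion of (2, 3) survives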
Overcoming a Zone Reclaiming Overhead with Partial-Zone Reclaiming
Inho Song, Wonjin Lee, Jaedong Lee, Seehwan Yoo, Jongmoo Choi
http://doi.org/10.5626/JOK.2024.51.2.115
Solid State Drives (SSDs) suffer unpredictable I/O latency and space amplification due to the traditional block interface. The Zoned Namespace (ZNS) interface, which is more flash-friendly, replaces the block interface, bringing reliable I/O latency and increasing both the capacity and lifespan of SSDs. However, the benefits of the zone interface are not free. A ZNS SSD delegates responsibility for garbage collection and data placement to the host, which requires host-level garbage collection called "zone reclaiming". At the same time, a ZNS SSD exposes large zones to the host to exploit device parallelism. A larger number of blocks per zone yields higher parallelism, but the overhead of the zone reclaiming process grows with zone size. As a result, the host can expect neither predictable latency nor optimal performance because of this background process. This paper tackles the overhead of the zone reclaiming process by introducing a "Partial Zone Reclaiming" method. Partial zone reclaiming pauses the ongoing reclaiming process to handle in-flight host requests. In our experiments, partial zone reclaiming not only improved host request latency by up to 8% on average, but also reduced zone reclaiming time by up to 41%.
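A minimal sketch of the partial zone reclaiming idea, assuming a simple model in which reclaiming copies valid blocks out of a victim zone in small chunks and yields to pending host requests between chunks; the chunk size and queue model are illustrative, not the paper's implementation.

from collections import deque

CHUNK = 4  # blocks relocated per step before checking for host I/O (illustrative)

def partial_zone_reclaim(valid_blocks, host_queue):
    # Copy valid blocks out of a victim zone, pausing between chunks so
    # in-flight host requests are served first.
    blocks = deque(valid_blocks)
    copied = []
    while blocks:
        while host_queue:                        # yield to pending host I/O
            print("serving host request:", host_queue.popleft())
        for _ in range(min(CHUNK, len(blocks))):
            copied.append(blocks.popleft())      # relocate one valid block
    return copied

host_q = deque(["read A", "write B"])
print(partial_zone_reclaim(list(range(10)), host_q))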
Performance Analysis of Instruction Priority Functions using a List Scheduling Simulator
Changhoon Chung, Soo-Mook Moon
http://doi.org/10.5626/JOK.2023.50.12.1048
Instruction scheduling is an important compiler optimization technique for reducing the execution time of a program through parallel processing. However, existing scheduling techniques show limited performance because they rely on heuristics. This study examines the effect of instruction priority functions on list scheduling through simulation. The results show that using a priority function based on the overall structure of the dependency graph can reduce schedule length by up to 4% compared to using a priority function based on the original instruction order. Furthermore, the results suggest which input features should be used when implementing a reinforcement learning-based scheduling model.
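The comparison can be reproduced in miniature: the sketch below, under the assumption of a unit-latency single-issue machine, runs list scheduling with a structure-based priority (critical-path height in the dependency DAG) of the kind the study found beneficial; the tiny DAG is illustrative.

import heapq
from functools import lru_cache

# Dependency DAG: instruction -> successors that must wait for it.
deps = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}

@lru_cache(maxsize=None)
def height(node):
    # Critical-path height: a priority based on the graph's overall structure.
    return 1 + max((height(s) for s in deps[node]), default=0)

def list_schedule(priority):
    indeg = {n: 0 for n in deps}
    for n in deps:
        for s in deps[n]:
            indeg[s] += 1
    ready = [(-priority(n), n) for n in deps if indeg[n] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, n = heapq.heappop(ready)   # issue the highest-priority ready op
        order.append(n)
        for s in deps[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                heapq.heappush(ready, (-priority(s), s))
    return order

print(list_schedule(height))   # e.g. ['a', 'b', 'c', 'd']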
Risk Scheduling-based Optimistic Exploration for Distributional Reinforcement Learning
Jihwan Oh, Joonkee Kim, Se-Young Yun
http://doi.org/10.5626/JOK.2023.50.2.172
Distributional reinforcement learning demonstrates state-of-the-art performance in continuous and discrete control systems and exposes the variance and risk of the return distribution, which can be used to explore the action space. However, while numerous exploration methods in distributional RL employ the variance of the return distribution for an action, exploration methods that employ the risk property are hard to find. This paper presents risk scheduling approaches that explore risk levels and encourage optimistic behaviors from a risk perspective in distributional reinforcement learning. Through comprehensive experiments in a multi-agent setting, we demonstrate that risk scheduling enhances the performance (win rate) of the DMIX, DDN, and DIQL algorithms, which integrate distributional reinforcement learning into multi-agent systems.
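A toy sketch of what scheduling the risk level might look like: the fraction of upper quantiles used to value an action is annealed from optimistic (upper half only) toward risk-neutral (all quantiles). The linear schedule, quantile counts, and values below are assumptions for illustration, not the paper's algorithm.

def risk_schedule(step, total, start=0.5, end=1.0):
    # Fraction of the return distribution's quantiles used for action values:
    # 0.5 = only the upper half (optimistic), 1.0 = all (risk-neutral).
    return start + (end - start) * min(step / total, 1.0)

def action_value(quantiles, frac):
    # Average the top `frac` fraction of the quantile estimates.
    qs = sorted(quantiles)
    k = max(1, int(len(qs) * frac))
    return sum(qs[-k:]) / k

# Toy example: quantile estimates of each action's return.
q_table = {"left": [0.1, 0.2, 0.9, 1.5], "right": [0.4, 0.5, 0.6, 0.7]}
for step in (0, 500, 1000):
    frac = risk_schedule(step, 1000)
    best = max(q_table, key=lambda a: action_value(q_table[a], frac))
    print(step, round(frac, 2), best)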
Design and Implementation of Time-Triggered Architecture for Multicore Automotive Systems
http://doi.org/10.5626/JOK.2022.49.12.1043
Recently, automotive electrical/electronic (E/E) architectures have adopted the multicore AUTOSAR platform to guarantee the safety and performance of automotive systems. However, delays in inter-core communication response time, caused by spinning on spinlocks, degrade multicore performance. This paper presents the design of a Time-Triggered Architecture (TTA) to optimize multicore systems. We present a TTA design methodology that includes a task allocation algorithm using DQN reinforcement learning for inter-core load balancing, a harmonic period setting algorithm, and a task offset and deadline setting algorithm. We then propose a timing violation detection method based on data versioning for the AUTOSAR platform. For verification, we applied the TTA algorithms to a Fuel Cell Controller (FCU) task model. Our simulations showed that the load balancing rate improved by 94% compared to the existing controller and that its scalability covered at least 78% of the optimal value. They also showed that mutual exclusion was enforced and confirmed that each algorithm was applied correctly.
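One of the listed steps, harmonic period setting, can be sketched simply: task periods are snapped so that each divides the larger ones, which keeps the time-triggered schedule table compact. The snapping rule below (power-of-two multiples of the smallest period) is an illustrative assumption, not the paper's algorithm.

def harmonize(periods):
    # Snap each period down to a power-of-two multiple of the smallest
    # period, so every period divides all larger ones (a harmonic set).
    base = min(periods)
    out = []
    for p in sorted(periods):
        m = base
        while m * 2 <= p:
            m *= 2
        out.append(m)
    return out

print(harmonize([5, 12, 23, 40]))  # -> [5, 10, 20, 40]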
ILP-based Schedule Synthesis of Time-Sensitive Networking
Jin Hyun Kim, Hyonyoung Choi, Kyong Hoon Kim, Insup Lee, Se-Hoon Kim
http://doi.org/10.5626/JOK.2021.48.6.595
IEEE 802.1Qbv Time-Sensitive Networking (TSN), the latest real-time Ethernet standard, is designed to guarantee the temporal accuracy of streams. TSN is an Ethernet-based network system that is being actively developed for factory automation and automotive network systems. TSN controls the flow of data streams based on schedules generated statically off-line to satisfy end-to-end delay or jitter requirements. However, the generation of TSN schedules is an NP-hard problem; because of this, constraint-solving techniques such as SMT (Satisfiability Modulo Theories) and ILP (Integer Linear Programming) have mainly been proposed as solutions. This paper presents a new approach that uses a greedy, incremental heuristic algorithm together with ILP to decrease the complexity of computing TSN schedules and improve schedule generation performance. Finally, we compare our proposed method with an existing SMT solver approach to show the performance of our approach.
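The greedy, incremental flavor of the approach can be illustrated with a toy single-link scheduler that admits streams one at a time, placing each at the earliest offset whose transmission windows collide with nothing already scheduled; the real method hands the hard subproblems to an ILP solver, which this sketch omits, and the stream parameters are illustrative.

def overlaps(a, b):
    (s1, e1), (s2, e2) = a, b
    return s1 < e2 and s2 < e1

def incremental_schedule(streams, hyperperiod):
    # streams: list of (name, period, transmit_time), all in the same time
    # unit. Each stream's frames repeat every `period` within `hyperperiod`.
    busy = []       # occupied (start, end) windows on the shared link
    schedule = {}
    for name, period, tx in streams:
        for offset in range(period - tx + 1):
            windows = [(offset + k * period, offset + k * period + tx)
                       for k in range(hyperperiod // period)]
            if all(not overlaps(w, b) for w in windows for b in busy):
                busy.extend(windows)
                schedule[name] = offset
                break
        else:
            raise ValueError(f"no feasible offset for {name}")
    return schedule

print(incremental_schedule([("s1", 10, 2), ("s2", 20, 3), ("s3", 20, 4)], 40))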
Performance Analysis of Concurrent Multitasking for Efficient Resource Utilization of GPUs
Sejin Kim, Qichen Chen, HeonYoung Yeom, Yoonhee Kim
http://doi.org/10.5626/JOK.2021.48.6.604
As Graphics Processing Units (GPUs) are widely utilized to accelerate compute-intensive applications, their use has expanded, especially in data centers and clouds. However, the existing resource sharing methods within a GPU are limited: they cannot efficiently handle concurrent execution requests from multiple cloud users while effectively utilizing the available system resources. In addition, it is challenging to partition resources within a GPU effectively without understanding application execution patterns. This paper proposes an execution pattern-based application classification method and analyzes runtime characteristics, explaining why an application's performance saturates at a certain point regardless of the allocated resources. In addition, we analyze the multitasking performance of co-allocated applications using smCompactor, a thread block-based scheduling framework, and identify near-best co-allocated application sets that effectively utilize the available system resources. Our results show a performance improvement of approximately 28% compared to NVIDIA MPS.
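A minimal sketch of the classification-then-co-allocation idea, assuming hypothetical per-application profiles of compute and memory-bandwidth utilization; the metric names, application names, and pairing rule are illustrative, not smCompactor's interface.

# Hypothetical profile metrics per application (utilization fractions).
profiles = {
    "matmul":  {"compute": 0.9, "membw": 0.3},
    "stencil": {"compute": 0.4, "membw": 0.8},
    "reduce":  {"compute": 0.3, "membw": 0.7},
    "conv":    {"compute": 0.8, "membw": 0.4},
}

def classify(p):
    return "compute-bound" if p["compute"] >= p["membw"] else "memory-bound"

# Pair one compute-bound with one memory-bound application so their dominant
# resource demands do not collide, mirroring the co-allocation idea above.
compute = [a for a, p in profiles.items() if classify(p) == "compute-bound"]
memory  = [a for a, p in profiles.items() if classify(p) == "memory-bound"]
print(list(zip(compute, memory)))  # e.g. [('matmul', 'stencil'), ('conv', 'reduce')]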
Deadline Task Scheduling for Mitigating the CPU Performance Interference in Android Systems
Jeongwoong Lee, Taehyung Lee, Young Ik Eom
http://doi.org/10.5626/JOK.2020.47.1.11
In the Android Linux kernel, most tasks are expected to run fairly, so the processing of time-sensitive applications can be delayed. In particular, since users may experience inconvenience when delays occur in media data processing or in biometrics such as fingerprint recognition, tasks requiring completion within a given time should be treated as deadline tasks. However, using the deadline scheduler in current Android systems causes two problems. First, as deadline tasks enter the system and are executed, CPU energy consumption can increase. Second, the high priority of deadline tasks can degrade the performance of normal tasks. To mitigate these problems, this paper proposes a method of scheduling deadline tasks on Android systems that reduces the performance impact on normal tasks while trying to minimize energy consumption. Our evaluation on a CPU benchmark shows that the proposed method improves CPU performance by about 10% compared with the conventional deadline scheduler without increasing power consumption, by effectively utilizing CPU frequency.
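The energy side of the trade-off can be illustrated with a toy model: pick the lowest CPU frequency at which the deadline task set remains schedulable, so deadlines are met without running faster (and hotter) than necessary. The task parameters, frequency list, and the EDF density test below are assumptions for illustration, not the paper's scheduler.

# Hypothetical deadline tasks: (CPU cycles needed, deadline in ms).
tasks = [(2_000_000, 4), (1_000_000, 5)]
FREQS_MHZ = [600, 900, 1200, 1800]  # available CPU frequencies (illustrative)

def lowest_feasible_freq(tasks, freqs):
    # Pick the lowest frequency at which total density sum(C_i / D_i) <= 1,
    # a sufficient single-core EDF test, so deadlines hold at low energy.
    for f in sorted(freqs):
        cycles_per_ms = f * 1_000          # f MHz = f * 1000 cycles per ms
        density = sum(c / cycles_per_ms / d for c, d in tasks)
        if density <= 1.0:
            return f
    return max(freqs)

print(lowest_feasible_freq(tasks, FREQS_MHZ))  # -> 900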
Deep Reinforcement Learning based Multipath Packet Scheduling
Minwoo Joo, Wonwoo Jang, Wonjun Lee
http://doi.org/10.5626/JOK.2019.46.7.714
Packet scheduling in multipath environments determines how data traffic is distributed over multiple network paths and is considered one of the significant factors affecting multipath transport performance. However, existing packet scheduling algorithms rely on particular metrics, which leads to limited performance under dynamic network conditions. In this paper, we propose a deep reinforcement learning (DRL) based packet scheduler that can adapt to dynamic network changes. We design a DRL model that automatically captures the network state and the effects of scheduling decisions. The proposed packet scheduler is implemented on a multipath extension of the Quick UDP Internet Connections (QUIC) network stack and evaluated through network emulation to verify its potential for autonomous networking.
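As a toy stand-in for the DRL model, the sketch below learns per-state path values from observed delays with an epsilon-greedy tabular update; the state encoding, reward, and delay numbers are illustrative assumptions, whereas the actual scheduler uses a deep network on a multipath QUIC stack.

import random

paths = ["wifi", "lte"]
Q = {}  # state-action values; a tabular stand-in for the DRL policy network

def choose_path(state, eps=0.1):
    # Epsilon-greedy over per-path values learned from observed rewards.
    if random.random() < eps:
        return random.choice(paths)
    return max(paths, key=lambda p: Q.get((state, p), 0.0))

def update(state, path, reward, alpha=0.5):
    q = Q.get((state, path), 0.0)
    Q[(state, path)] = q + alpha * (reward - q)

# Toy loop: the state flags whether the Wi-Fi path is congested; the reward
# is the negative delivery delay, so faster paths are reinforced.
random.seed(0)
for step in range(200):
    state = "wifi_light" if step % 3 else "wifi_heavy"
    p = choose_path(state)
    delay = {"wifi": 10, "lte": 30}[p] + (25 if (p, state) == ("wifi", "wifi_heavy") else 0)
    update(state, p, -delay)

print({k: round(v, 1) for k, v in Q.items()})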