Search: author "Hyun-Wook Jin" (6 results)

Lightweight Vertical Autoscaling Method Using Taylor Series for Serverless Computing

Hyeon-Jun Jang, Hyun-Wook Jin

http://doi.org/10.5626/JOK.2025.52.3.181

Serverless computing has become essential in modern IT infrastructure: autoscaling reduces the server-management burden, enabling developers to concentrate on service development. However, as serverless environments now handle multiple requests per instance, the limitations of horizontal autoscaling have become more apparent. This underscores the growing need for vertical autoscaling, which dynamically adjusts the resource allocation of each instance. Traditional vertical autoscaling methods, designed for long-running cloud applications, are not well suited to serverless environments that require rapid response and short execution times. This paper introduces a lightweight vertical autoscaling method that employs a Taylor series to enhance both resource efficiency and performance. Experiments with FunctionBench demonstrate that the proposed method reduces resource reservations and wasted resource slack compared with the Vertical Pod Autoscaler (VPA) and Tiny Autoscaler, while also improving average and 99th-percentile tail latency. Specifically, compared with VPA, resource reservations and slack decreased by 18.6% and 45%, respectively, while average and tail latency improved by 31.5% and 53.8%. The proposed method also exhibited the lowest overhead, confirming its effectiveness as a lightweight autoscaling solution.
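The abstract does not reproduce the authors' algorithm, but the core idea, predicting near-future resource usage from recent samples via a truncated Taylor expansion, can be sketched as follows. This is a hypothetical illustration: the sampling interval, second-order truncation, and safety margin are assumptions, not the paper's parameters.

```python
def predict_usage(samples, dt=1.0, margin=1.1):
    """Predict the next resource usage with a 2nd-order Taylor expansion.

    samples: the last three usage measurements [u(t-2dt), u(t-dt), u(t)],
    taken at a fixed interval dt. Derivatives are approximated with
    backward finite differences; `margin` adds safety headroom.
    """
    u0, u1, u2 = samples              # oldest -> newest
    d1 = (u2 - u1) / dt               # first derivative u'(t)
    d2 = (u2 - 2 * u1 + u0) / dt**2   # second derivative u''(t)
    # u(t + dt) ~ u(t) + u'(t)*dt + u''(t)*dt^2 / 2
    predicted = u2 + d1 * dt + d2 * dt**2 / 2
    return max(predicted, 0.0) * margin

# e.g. steadily rising CPU usage: 100, 120, 150 millicores
print(predict_usage([100.0, 120.0, 150.0]))
```

Because only three samples and a handful of arithmetic operations are needed per decision, such a predictor stays lightweight compared with history-based estimators like VPA's.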

MPI Progress Engine for Energy Efficient Intra-Node Point-to-Point Communication

Keon-Woo Kim, Hyun-Wook Jin

http://doi.org/10.5626/JOK.2021.48.10.1069

The communication completion layer of the Message Passing Interface (MPI) library, called the progress engine, detects changes in communication state, such as message arrivals, by polling. Although polling provides low communication latency, it results in low energy efficiency because the progress engine occupies CPU resources while polling. This loss of energy efficiency has become more severe as skew has increased with the advent of exascale systems. In this paper, we propose a progress engine that uses both polling and signals, applying the Eager protocol to small messages and the Rendezvous protocol to large messages, to perform energy-efficient intra-node communication. Measurements with the OSU microbenchmarks and the NAS benchmarks show that the proposed signal-based progress engine improves energy efficiency as skew increases and reduces application execution time when CPU resources are shared among multiple processes.
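The trade-off the paper addresses can be illustrated with a toy completion wait. This is purely illustrative, not the internals of any MPI library: busy-polling detects completion with minimal latency but occupies a CPU for the whole wait, whereas a signal-style wait (modeled here with `threading.Event`) lets the waiting process sleep until the peer notifies it.

```python
import threading

def wait_polling(done):
    """Busy-poll: low latency, but burns CPU cycles for the entire wait."""
    spins = 0
    while not done.is_set():
        spins += 1   # a real progress engine would test message state here
    return spins

def wait_signal(done):
    """Block until signaled: the waiter yields the CPU (energy-efficient)."""
    done.wait()
    return True

done = threading.Event()
threading.Timer(0.05, done.set).start()  # peer "completes" after ~50 ms
print(wait_signal(done))                 # prints True once signaled
```

A hybrid engine along the lines the abstract describes would poll briefly for latency-sensitive small (Eager) messages and fall back to blocking on a signal for long (Rendezvous) transfers, where the wasted polling cycles dominate.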

Dynamic Core Affinity for Manycore Partitioning

Chan-Gyu Lee, Joong-Yeon Cho, Hyun-Wook Jin

http://doi.org/10.5626/JOK.2020.47.12.1111

As the number of cores in NUMA systems increases, contemporary operating systems do not scale well because of increased cache misses, cache-coherence activity, and synchronization. To resolve this problem, several studies have suggested controlling the core affinity of system calls and event handlers so that they run on a specific set of cores. However, these core-partitioning approaches statically decide the number of cores available for controlling core affinity, without considering the characteristics of the applications and the system architecture. In this paper, we propose a dynamic core-affinity scheme for core partitioning and compare it with a static core-partitioning mechanism.
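The contrast with a static split can be pictured with a minimal sketch. The proportional policy below is a hypothetical stand-in for the paper's scheme; the point is only that the kernel/application core split is recomputed from observed load rather than fixed at boot.

```python
def partition_cores(n_cores, io_load, app_load, min_io=1):
    """Split n_cores into an I/O-handling set and an application set
    proportionally to observed load (a hypothetical dynamic policy)."""
    total = io_load + app_load
    n_io = max(min_io, round(n_cores * io_load / total)) if total else min_io
    n_io = min(n_io, n_cores - 1)      # always leave at least one app core
    io_set = set(range(n_io))          # e.g. pinned via sched_setaffinity
    app_set = set(range(n_io, n_cores))
    return io_set, app_set

# 16 cores, I/O is 25% of the observed load -> 4 I/O cores, 12 app cores
print(partition_cores(16, io_load=1.0, app_load=3.0))
```

Rerunning such a policy periodically lets the partition track phase changes in the workload, which a static split cannot.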

Implementation of Intra-Partition Communication in Layered ARINC 653 for Drone Flight-Control Program

Joo-Kwang Park, Jooho Kim, Hyun-Chul Jo, Hyun-Wook Jin

http://doi.org/10.5626/JOK.2017.44.7.649

As the types and purposes of drones diversify and the number of additional functions grows, the role of the corresponding software has increased. Through partitioning and efficient handling of SWaP (size, weight, and power) constraints, ARINC 653 provides reliable software reuse and consolidation for avionic systems. Beyond large-scale aircraft, ARINC 653 can be applied just as effectively to drones, i.e., small unmanned aerial vehicles. In this paper, to exploit ARINC 653 for a drone flight-control program, we implement an intra-partition communication system by extending the layered ARINC 653 and apply it to a real drone system. The experimental results show that the overhead of intra-partition communication is low, while the resources assigned to the drone flight-control program are guaranteed through partitioning.
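For readers unfamiliar with ARINC 653, its intra-partition services include bounded message buffers shared by processes within one partition. The sketch below is a rough behavioral model of such a buffer, loosely named after the APEX SEND_BUFFER/RECEIVE_BUFFER services; it is not the paper's implementation, and a real APEX service would additionally support blocking with timeouts.

```python
from collections import deque

class Buffer:
    """Toy model of an ARINC 653 intra-partition buffer: a bounded FIFO
    of messages shared by processes inside a single partition."""
    def __init__(self, max_messages):
        self.max_messages = max_messages
        self.queue = deque()

    def send(self, message):
        """SEND_BUFFER-like: enqueue, or fail when the buffer is full."""
        if len(self.queue) >= self.max_messages:
            return False
        self.queue.append(message)
        return True

    def receive(self):
        """RECEIVE_BUFFER-like: dequeue in FIFO order, or None when empty."""
        return self.queue.popleft() if self.queue else None

buf = Buffer(max_messages=2)
buf.send(b"attitude")
buf.send(b"throttle")
print(buf.receive())   # prints b'attitude' (FIFO order)
```

Bounding the queue is what lets the partition's memory and timing budget stay guaranteed even when a producer outpaces the flight-control consumer.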

Using the On-Package Memory of Manycore Processor for Improving Performance of MPI Intra-Node Communication

Joong-Yeon Cho, Hyun-Wook Jin, Dukyun Nam

http://doi.org/

Emerging next-generation manycore processors for high-performance computing are equipped with high-bandwidth on-package memory alongside the traditional host memory. The Multi-Channel DRAM (MCDRAM), for example, is the on-package memory of the Intel Xeon Phi Knights Landing (KNL) processor and theoretically provides four times the bandwidth of conventional DDR4 memory. In this paper, we suggest a mechanism that exploits MCDRAM to improve the performance of MPI intra-node communication. The experimental results show that MPI intra-node communication performance can be improved by up to 272% compared with the case where DDR4 is used. Moreover, we analyze not only the performance impact of different MCDRAM-utilization mechanisms but also that of core affinity for processes.
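The placement decision at the heart of such a mechanism can only be sketched here. On KNL, steering allocations to MCDRAM is typically done through the NUMA interface (e.g. numactl) or Intel's memkind library; the toy policy below merely captures the decision of where an MPI shared-memory communication buffer should live, with the 16 GiB capacity being KNL's and everything else an assumption.

```python
MCDRAM_CAPACITY = 16 * 1024**3   # 16 GiB of on-package memory on KNL

def choose_memory(buffer_size, mcdram_used):
    """Place an MPI intra-node communication buffer in MCDRAM when it
    fits in the remaining on-package capacity, else fall back to DDR4.
    (Illustrative policy only; real placement uses memkind/numactl.)"""
    if mcdram_used + buffer_size <= MCDRAM_CAPACITY:
        return "MCDRAM"
    return "DDR4"

# a 64 MiB shared-memory segment easily fits in an empty MCDRAM
print(choose_memory(buffer_size=64 * 1024**2, mcdram_used=0))
```

Since MCDRAM capacity is small relative to DDR4, reserving it for the bandwidth-critical communication buffers, rather than for all application data, is what makes the reported speedups plausible.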

Dynamic Core Affinity for High-Performance I/O Devices Supporting Multiple Queues

Joong-Yeon Cho, Junyong Uhm, Hyun-Wook Jin, Sungin Jung

http://doi.org/

Several studies have reported the impact of core affinity on the network I/O performance of multi-core systems. As network bandwidth increases significantly, it becomes more important to determine an effective core affinity. Although a framework for dynamic core affinity that considers both network and disk I/O has been suggested, it does not properly support the multiple queues provided by high-speed I/O devices. In this paper, we extend the existing dynamic core-affinity framework to efficiently support the multiple queues of high-speed I/O devices, such as 40 Gigabit Ethernet and NVM Express. Our experimental results show that the extended framework can improve HDFS file-upload throughput by up to 32% and provides improved scalability with the number of cores. In addition, we analyze the impact of the policy for assigning multiple I/O queues across cores.
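The multi-queue assignment the abstract mentions can be pictured with a simple mapping. Round-robin is just one hypothetical policy; the paper analyzes several assignment policies, none of which are reproduced here.

```python
def assign_queues(n_queues, cores):
    """Map each hardware I/O queue (e.g. an NVMe submission queue or a
    NIC RX queue) to a core, round-robin over the given core list."""
    return {q: cores[q % len(cores)] for q in range(n_queues)}

# 8 device queues spread over 4 cores dedicated to I/O handling
print(assign_queues(8, cores=[0, 1, 2, 3]))
```

In practice the chosen core would also serve the queue's interrupts and completion handling, so spreading queues across NUMA-local cores is what determines how well throughput scales with core count.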






Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr