Search : [ author: Hwansoo Han ] (9)

Memory-Aware Eager Co-Scheduling for Multi-Tenant GPU Environments

Jeongjae Kim, Yunchae Choi, Hwansoo Han

http://doi.org/10.5626/JOK.2024.51.3.210

In a multi-tenant GPU environment, multiple applications are co-located on a single GPU to maximize utilization and throughput. However, co-location can lead to out-of-memory errors. Previous research addressed this problem by scheduling only tasks whose combined memory usage does not exceed the total GPU memory capacity. Our research introduces two novel methods that allow additional tasks to be co-located on a GPU while effectively preventing out-of-memory errors. The first immediately deallocates unused memory within tasks, freeing GPU memory early and enabling more tasks to execute concurrently. The second over-subscribes Unified Memory, so tasks can be scheduled even when their memory usage exceeds the total GPU memory capacity. Our proposed schemes reduce the execution time of multiple tasks compared to previous scheduling approaches, with performance improvements of 7.3% and 1.9%, respectively, over prior research.
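The admission logic behind eager co-scheduling can be sketched as follows — a minimal Python simulation with hypothetical task names and sizes, not the paper's implementation. A task is admitted only while the reserved memory fits the GPU; because each task's unused memory is deallocated right after setup, reservations shrink and more tasks fit than a peak-memory-only policy would allow.

```python
GPU_CAPACITY_MB = 16_000  # hypothetical GPU memory capacity

class Task:
    def __init__(self, name, peak_mb, early_release_mb):
        self.name = name
        self.peak_mb = peak_mb
        # Memory the task can free immediately after setup ("eager deallocation")
        self.early_release_mb = early_release_mb

def schedule(tasks, capacity=GPU_CAPACITY_MB):
    """Greedy admission: co-locate tasks while reserved memory fits the GPU."""
    reserved, admitted = 0, []
    for t in tasks:
        if reserved + t.peak_mb <= capacity:
            admitted.append(t.name)
            # After admission, unused memory is deallocated immediately,
            # so only the working set stays reserved.
            reserved += t.peak_mb - t.early_release_mb
    return admitted, reserved

tasks = [Task("train-A", 9_000, 4_000), Task("infer-B", 6_000, 1_000),
         Task("infer-C", 5_000, 2_000)]
admitted, reserved = schedule(tasks)
print(admitted, reserved)  # ['train-A', 'infer-B', 'infer-C'] 13000
```

Without the early releases, the third task's 5,000 MB peak would push the total past the 16,000 MB capacity and it would be rejected; eager deallocation is what lets all three co-locate.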

Compiler-directive based Heterogeneous Computing for Scala

Jungjae Woo, Seongsoo Park, Sungin Hong, Hwansoo Han

http://doi.org/10.5626/JOK.2023.50.3.197

With the advent of the big data era, heterogeneous computing is employed to process large amounts of data. Since Apache Spark, a representative big data analysis framework, is built with the Scala programming language, programs written in Scala must be rewritten in CUDA, OpenCL, or other languages to enjoy the benefits of GPU computing. TornadoVM automatically converts Java programs into OpenCL programs using compiler annotations defined in the Java specification. Scala compiles to the same executable bytecode as Java, but current Scala compilers lack the annotation support indispensable for TornadoVM’s OpenCL translation. In this work, we extend the annotation capabilities of Scala compilers to enable OpenCL translation on TornadoVM. Furthermore, we experimentally confirmed that the converted Scala-OpenCL code runs as fast as the converted Java-OpenCL code. With our extension, we expect Scala programs to easily use GPU acceleration in the Apache Spark framework.
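The core mechanism here is annotation-driven offload: source-level marks tell the runtime which code is eligible for OpenCL translation. As a language-neutral analogy only (Python has no TornadoVM backend; all names below are illustrative), a decorator can register functions as offload candidates the way TornadoVM's Java annotations do:

```python
# Registry of functions marked as candidates for accelerator offload.
OFFLOAD_REGISTRY = {}

def parallel(func):
    """Mark a function as a candidate for accelerator offload (analogy to a
    compiler annotation; a real runtime would translate it to a kernel)."""
    OFFLOAD_REGISTRY[func.__name__] = func
    return func

@parallel
def vector_add(a, b):
    # A data-parallel loop: each element is independent, which is what makes
    # such a function translatable to an OpenCL kernel in principle.
    return [x + y for x, y in zip(a, b)]

print(vector_add([1, 2, 3], [4, 5, 6]))   # [5, 7, 9]
print("vector_add" in OFFLOAD_REGISTRY)   # True
```

The paper's contribution is making such marks expressible in Scala's compiler so that TornadoVM's existing translation machinery can see them.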

Data Transfer Optimized High-level FPGA Host Programming Interface

Jongwoo Kim, Seongsoo Park, Hwansoo Han

http://doi.org/10.5626/JOK.2021.48.8.859

Along with general-purpose CPUs, hardware accelerators have been widely adopted to execute various workloads efficiently. Recently, FPGAs have emerged in the area of software-level development, as high-level languages such as C/C++ now support FPGA programming. OpenCL supports most heterogeneous processors in high-level programming, but different optimization techniques are required depending on the unique architectural features of each accelerator. In particular, developing FPGA kernel programs requires more knowledge of hardware architecture than other heterogeneous processors do. Because of this characteristic, optimizations must be coordinated with the host program as well. In this paper, we propose SimFL, a high-level programming interface for developing host programs that use FPGAs as accelerators. To evaluate our optimization, we implemented FPGA host programs with SimFL and verified a performance improvement of up to 44.7% from multi-threaded copying within SimFL.
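The optimization credited with the speedup is multi-threaded host-side copying. A minimal sketch of that idea (function names and chunking are illustrative, not SimFL's actual API): the source buffer is split into chunks, and each chunk is copied into the destination buffer concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_copy(src: bytes, n_threads: int = 4) -> bytearray:
    """Copy src into a new buffer using n_threads concurrent chunk copies."""
    dst = bytearray(len(src))
    view = memoryview(dst)
    chunk = (len(src) + n_threads - 1) // n_threads  # ceil division

    def copy_chunk(i):
        start = i * chunk
        # Slices clip at the end identically on both sides, so lengths match.
        view[start:start + chunk] = src[start:start + chunk]

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(copy_chunk, range(n_threads)))
    return dst

data = bytes(range(256)) * 1024          # 256 KiB test buffer
assert parallel_copy(data) == data
```

For real FPGA transfers, each chunk copy would target a pinned staging buffer or a separate DMA queue; the parallel-chunk structure is the part this sketch illustrates.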

Predicting the Cache Performance Benefits for In-memory Data Analytics Frameworks

Minseop Jeong, Hwansoo Han

http://doi.org/10.5626/JOK.2021.48.5.479

In-memory data analytics frameworks cache intermediate results for performance. For effective caching, the actual performance benefits of cached data should be taken into consideration. Since existing frameworks only measure execution times at the distributed task level, they are limited in predicting cache performance benefits accurately. In this paper, we propose an operator-level time measurement method, which combines the existing task-level execution time measurement with our cost prediction model based on input data sizes. Based on the proposed model and the execution flow of the application, we propose a method for predicting the performance benefits of data caching. Our proposed model provides opportunities for cache optimization guided by predicted performance benefits. Our cost model for operators showed a prediction error rate of 7.3% on average when measured with 10x input data. The difference between predicted and actual performance was limited to within 24%.
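A cost model of this kind can be sketched as follows — a hedged illustration, not the paper's actual model: each operator's time is fit as a linear function of input size from a few profiled runs, and the fit predicts the recomputation time that caching the operator's output would save at a larger input.

```python
def fit_linear(sizes, times):
    """Least-squares fit: time ≈ a * size + b."""
    n = len(sizes)
    mean_s = sum(sizes) / n
    mean_t = sum(times) / n
    a = sum((s - mean_s) * (t - mean_t) for s, t in zip(sizes, times)) \
        / sum((s - mean_s) ** 2 for s in sizes)
    b = mean_t - a * mean_s
    return a, b

def predict(model, size):
    a, b = model
    return a * size + b

# Profiled runs of a hypothetical "filter" operator (size in MB, time in s).
model = fit_linear([10, 20, 40], [1.1, 2.0, 4.2])
# Predicted cost at 10x the largest profiled input; if this operator's
# output is cached, roughly this much recomputation is saved per reuse.
print(round(predict(model, 400), 1))  # 41.7
```

Summing such per-operator savings along the application's execution flow gives the whole-plan caching benefit the abstract describes.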

Software Similarity Detection Using Highly Credible Dynamic API Sequences

Seongsoo Park, Hwansoo Han

http://doi.org/

Software birthmarks, which are unique characteristics of software, are used to detect software plagiarism or software similarity. Generally, software birthmarks are divided into static and dynamic birthmarks, each with evident pros and cons depending on the extraction method. In this paper, we propose a method for extracting API sequence birthmarks using dynamic analysis and detecting similarity between executable codes. Dynamic birthmarks based on API sequences extract API functions during program execution. The extracted sequences often include all API functions called from the start to the end of the program. In contrast, our dynamic birthmark scheme extracts only the API functions called directly from the executable code. It then uses a sequence alignment algorithm to calculate the similarity metric effectively. We evaluate the birthmark with several open source software programs to verify its reliability and credibility. Our dynamic birthmark scheme based on the extracted API sequences can be utilized in similarity tests of executable codes.
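An alignment-based similarity over API call sequences can be illustrated as follows. This uses the longest common subsequence as a simple stand-in (the abstract names a sequence alignment algorithm but not which one, so this is not necessarily the paper's choice); API names are hypothetical examples.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def similarity(seq_a, seq_b):
    """Normalized score in [0, 1]: 1.0 means identical call sequences."""
    if not seq_a and not seq_b:
        return 1.0
    return 2 * lcs_length(seq_a, seq_b) / (len(seq_a) + len(seq_b))

original = ["CreateFileW", "ReadFile", "WriteFile", "CloseHandle"]
suspect  = ["CreateFileW", "ReadFile", "Sleep", "WriteFile", "CloseHandle"]
print(round(similarity(original, suspect), 2))  # 0.89
```

Filtering the traces to only directly called API functions, as the abstract proposes, keeps these sequences short enough for such quadratic alignment to be practical.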

In-Memory File System Backed by Cloud Storage Services as Permanent Storages

Kyungjun Lee, Jiwon Kim, Sungtae Ryu, Hwansoo Han

http://doi.org/

As network technology advances, a growing number of devices are connected through the Internet. Recently, cloud storage services have gained popularity, as they are convenient to access anytime and anywhere. Among cloud storage services, object storage is representative, owing to its low cost, high availability, and high durability. One limitation of object storage services is that data on the cloud can be accessed only through HTTP-based RESTful APIs. In our work, we resolve this limitation with an in-memory file system that provides a POSIX interface to file system users and communicates with cloud object storages through RESTful APIs. In particular, our flush mechanism is compatible with existing file systems, as it is based on the swap mechanism of the Linux kernel. Our in-memory file system backed by cloud storage reduces performance overheads and outperforms S3QL by 57% in write operations. It also shows performance comparable to tmpfs in read operations.
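The flush path the abstract describes can be sketched as follows — a toy model with illustrative names, not the paper's code. A stand-in object store plays the role of the RESTful cloud backend, and only dirty in-memory pages are written out, mirroring how a swap-like mechanism writes only modified pages.

```python
PAGE_SIZE = 4096  # illustrative page granularity

class FakeObjectStore:
    """Stands in for HTTP PUT/GET against a cloud object storage."""
    def __init__(self):
        self.objects = {}
    def put(self, key, data):
        self.objects[key] = bytes(data)
    def get(self, key):
        return self.objects[key]

class InMemoryFile:
    def __init__(self, name, store):
        self.name, self.store = name, store
        self.pages, self.dirty = {}, set()
    def write(self, page_no, data):
        self.pages[page_no] = data.ljust(PAGE_SIZE, b"\0")
        self.dirty.add(page_no)  # mark for write-out
    def flush(self):
        # Only dirty pages hit the network; clean pages stay local.
        for page_no in sorted(self.dirty):
            self.store.put(f"{self.name}/{page_no}", self.pages[page_no])
        self.dirty.clear()

store = FakeObjectStore()
f = InMemoryFile("report.txt", store)
f.write(0, b"hello cloud")
f.flush()
print(store.get("report.txt/0")[:11])  # b'hello cloud'
```

Reads that hit memory never touch the store at all, which is the source of the tmpfs-like read performance the abstract reports.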

Performance Analysis of Cloud-Backed File Systems with Various Object Sizes

Jiwon Kim, Kyungjun Lee, Sungtae Ryu, Hwansoo Han

http://doi.org/

Recent cloud infrastructures provide competitive performance and operating costs for many Internet services through a pay-per-use model. Object storages in particular stand out, as they offer virtually unlimited capacity and allow users to access stored files anytime and anywhere. Several lines of research are based on cloud-backed file systems, which support the traditional POSIX interface rather than RESTful APIs over HTTP. However, these existing file systems handle all files with uniformly sized backing objects. Consequently, accesses to cloud object storages are likely to be inefficient. In our research, files are profiled according to their characteristics, and appropriate backing unit sizes are determined. We experimentally verify that differentiated backing unit sizes for the object storage improve the performance of cloud-backed file systems. In comparative experiments with S3QL, our prototype cloud-backed file system is faster by 18.6% on average.
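A back-of-the-envelope model shows why a single backing-object size cannot suit every file (all numbers here are hypothetical, for illustration only): the object size fixes both how many requests a read issues and how many bytes it transfers.

```python
import math

def cost(access_size, object_size):
    """(requests issued, bytes transferred) to read access_size bytes
    from a file stored as fixed-size backing objects."""
    n = math.ceil(access_size / object_size)  # objects touched
    return n, n * object_size                 # whole objects are fetched

print(cost(4096, 4096))      # (1, 4096)      small read, small objects: ideal
print(cost(1 << 20, 4096))   # (256, 1048576) large read, small objects: many requests
print(cost(4096, 1 << 20))   # (1, 1048576)   small read, large objects: wasted transfer
```

Small objects inflate the request count for large sequential reads, while large objects waste bandwidth on small random reads; profiling each file's access pattern and picking its backing unit size accordingly, as the abstract proposes, avoids both extremes.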

Mapping Cache for High-Performance Memory Mapped File I/O in Memory File Systems

Jiwon Kim, Jungsik Choi, Hwansoo Han

http://doi.org/

The desire for faster data access and the growth of next-generation memories, such as non-volatile memories, have driven research on memory file systems. Memory-mapped file I/O, which has less overhead than read-write I/O, is recommended for high-performance memory file systems. Memory-mapped file I/O, however, introduces page table overhead, which becomes one of the major costs in overall file I/O performance. We find that the same overhead recurs unnecessarily, because a file's page table is destroyed whenever the file is closed and rebuilt when it is reopened. To remove this duplicated overhead, we propose the mapping cache, a technique that does not delete a file's page table when its mapping is released, but preserves it for reuse. We demonstrate that the mapping cache improves traditional file I/O performance by 2.8x and web server performance by 12%.
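The effect of the mapping cache can be modeled in a few lines — a conceptual simulation with hypothetical names, not the kernel implementation: with the cache, the expensive page-table setup is paid once per file instead of once per open.

```python
class MmapFileSystem:
    def __init__(self, use_mapping_cache):
        self.use_mapping_cache = use_mapping_cache
        self.cache = {}           # file name -> preserved page table
        self.tables_built = 0     # counts the expensive page-table setups

    def open_and_map(self, name):
        if name in self.cache:
            return self.cache.pop(name)     # reuse the preserved page table
        self.tables_built += 1              # expensive: build page table
        return {"file": name, "entries": "..."}

    def close_and_unmap(self, name, table):
        if self.use_mapping_cache:
            self.cache[name] = table        # keep instead of destroying
        # otherwise the page table is simply discarded

for cached in (False, True):
    fs = MmapFileSystem(use_mapping_cache=cached)
    for _ in range(5):                      # open/close the same file 5 times
        t = fs.open_and_map("data.bin")
        fs.close_and_unmap("data.bin", t)
    print(cached, fs.tables_built)          # False 5, then True 1
```

A real implementation must also invalidate preserved tables when the file changes or memory pressure demands reclamation; the sketch shows only the reuse path.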

Detecting Software Similarity Using API Sequences on Static Major Paths

Seongsoo Park, Hwansoo Han

http://doi.org/

Software birthmarks are used to detect software plagiarism. For binaries, however, only a few birthmarks have been developed. In this paper, we propose a static approach that generates API sequences along major paths, which are analyzed from the control flow graphs of binaries. Since our API sequences are extracted along the most plausible paths of the binary code, they can represent the actual API sequences produced by binary executions, but in a more concise form. Our similarity measure uses the Smith-Waterman algorithm, one of the popular sequence alignment algorithms for DNA sequence analysis. We evaluate our static path-based API sequences with multiple versions of five applications. Our experiments indicate that the proposed method provides a quite reliable similarity birthmark for binaries.
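The Smith-Waterman algorithm the abstract names is a standard local-alignment dynamic program; below is a textbook implementation applied to API name sequences. The scoring parameters and the example sequences are illustrative defaults, not the paper's tuned values.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences (Smith-Waterman)."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # Local alignment: scores never drop below zero.
            h[i][j] = max(0, diag, h[i-1][j] + gap, h[i][j-1] + gap)
            best = max(best, h[i][j])
    return best

path_v1 = ["socket", "connect", "send", "recv", "close"]
path_v2 = ["socket", "connect", "setsockopt", "send", "recv", "close"]
score = smith_waterman(path_v1, path_v2)
# Normalizing by the self-alignment score of one sequence yields a
# similarity in [0, 1]; the inserted call costs one gap penalty.
print(score / smith_waterman(path_v1, path_v1))  # 0.9
```

Local alignment tolerates insertions such as the extra `setsockopt` call above, which is why it suits comparing API sequences across program versions.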


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr