Integrated Host-SSD Mapping Table Cache Management Techniques for Improving Performance of a Mobile Storage Device
Yoona Kim, Inhyuk Choi, Sungjin Lee, Jihong Kim
http://doi.org/10.5626/JOK.2023.50.11.924
As storage capacity grows, so does the on-device memory required to manage address mapping translation in a NAND flash-based storage device. The on-device memory capacity of Universal Flash Storage (UFS), a mobile storage device, does not grow accordingly because of hardware and cost constraints, which makes it challenging to manage the larger address translation table. To resolve this problem, Host Performance Booster (HPB), which borrows host-side DRAM to load portions of the address translation table, was introduced. In this paper, we demonstrate that an HPB-enabled system does not work in an integrated manner with the device-side SRAM and therefore wastes the given memory resources. We propose integrated mapping table management techniques that consider the distinctive features of each cache layer. By adopting these techniques, we aim to minimize wasted cache resources, reduce storage latency, and prevent unnecessary degradation of the storage lifetime. Evaluation results show that, compared to the baseline scheme, the cache hit ratio improves by 5%, wasted memory resources are reduced by 95%, and the number of device-side garbage collections is reduced by 43%.
MQSim-E: Design and Implementation of an NVMe SSD Simulator for Enterprise SSDs
Duwon Hong, Dusol Lee, Jihong Kim
http://doi.org/10.5626/JOK.2022.49.4.271
In the study of storage systems such as SSDs, a simulator that accurately mimics the operation of the software and hardware inside the system plays an important role. In this paper, we show that MQSim, which is widely used in research on NVMe SSDs, is inappropriate for the development of enterprise SSDs, and we propose MQSim-E, a simulator that supports the optimized techniques adopted in enterprise SSDs. MQSim-E fully utilizes the parallelism of flash memory and minimizes the performance overhead of garbage collection. Compared to the existing simulator (MQSim), it improves IOPS, an important design goal for enterprise SSDs, by up to 210% and reduces tail latency by up to 16,000%, thereby more accurately reflecting the characteristics of commercial enterprise SSDs.
New Flash Commands for Building Flash Storage Systems with Plausible Deniability
Geonhee Cho, Myungsuk Kim, Jihong Kim
http://doi.org/10.5626/JOK.2022.49.2.120
Traditional encryption cannot defend against coercive attackers who compel the user to hand over decryption keys, because it cannot hide the existence of the ciphertext. To solve this problem, there have been studies on deniable storage solutions that apply plausible deniability, a characteristic that allows the user to deny the existence of sensitive data, to a storage device. The hidden volume mechanism is used in various deniable storage solutions because of its relatively low performance overhead compared to other mechanisms, and it has recently evolved to defend against multiple-snapshot attacks. However, the existing hidden volume mechanism fundamentally requires a pool of dummy random data to hide the ciphertext. The presence of this dummy random data in the storage device exposes the use of plausible deniability, which can reveal the intention to hide data. This study proposes a flash chip-level access control command set that simultaneously supports data sanitization and plausible deniability. Using this command set, we propose a hidden volume-based deniable storage solution that supports plausible deniability without dummy random data.
Sequentiality-Aware Hash-based FTL
Jaemin Shin, Ilbo Jeong, Li Xiaochang, Jihong Kim
http://doi.org/10.5626/JOK.2020.47.8.717
As the capacity of an SSD significantly increases, the SSD needs a larger DRAM for managing SSD-internal information. Since the cost of DRAM is an important factor in deciding the overall SSD price, it is important to reduce the DRAM cost without performance degradation. In this paper, we propose a novel hash-based FTL mapping technique that meets this goal. Unlike an existing hash-based scheme, our technique introduces a virtual block scheme that exploits the sequentiality of logical addresses, which effectively reduces the garbage collection overhead. Experimental results show that SEQhFTL can reduce this overhead as much as PFTL while maintaining, on average, only 39% of the metadata that PFTL uses.
Improving Performance of Flash Storage Using Restricted Copyback
Duwon Hong, Seulgi Shin, Jihong Kim
http://doi.org/10.5626/JOK.2019.46.8.726
In modern flash-based SSDs, the performance overhead of internal data migrations is dominated by the data transfer time, not by the flash program time as in older SSDs. To mitigate the performance impact of data migrations, we propose rcopyback, a restricted version of copyback. Rcopyback works like the original copyback except that only n consecutive copybacks are allowed. By limiting the number of successive copybacks, rcopyback guarantees that data can be migrated internally without any reliability problem. To take full advantage of rcopyback, we developed an rcopyback-aware FTL, rcFTL, which intelligently decides whether rcopyback should be used by exploiting varying host workloads. Our evaluation results show that rcFTL can improve the overall I/O throughput by 54% on average over an existing FTL that does not use copybacks.
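The rule that only n consecutive copybacks are allowed can be illustrated with a small per-page counter. This is a minimal sketch and not the authors' rcFTL: the names `Page`, `migrate`, and `max_copybacks` are hypothetical, and the sketch assumes that once the limit is reached the page is migrated through the ordinary off-chip path, which resets the counter.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Page:
    data: bytes
    copyback_count: int = 0  # consecutive copybacks since the last off-chip copy


def migrate(page: Page, max_copybacks: int) -> Tuple[Page, str]:
    """Migrate a page and report which path was used.

    Assumption (not from the abstract): when the limit n is reached, the page
    takes the regular off-chip path and its counter is reset to zero.
    """
    if page.copyback_count < max_copybacks:
        # In-chip copy: avoids the data transfer, so count it against the limit.
        return Page(page.data, page.copyback_count + 1), "copyback"
    # Limit reached: fall back to the off-chip path and restart the count.
    return Page(page.data, 0), "off-chip"


if __name__ == "__main__":
    page = Page(b"payload")
    for _ in range(4):
        page, path = migrate(page, max_copybacks=2)
        print(path, page.copyback_count)
```

With `max_copybacks=2`, every third migration of the same data takes the off-chip path. How rcFTL chooses n and how it adapts to host workloads is not described in the abstract and is not modeled here.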
qtar: Design and Implementation of an Optimized tar Command with FTL-level Remapping
Jeongseok Ryoo, Sangwook Shane Hahn, Jihong Kim
http://doi.org/10.5626/JOK.2018.45.1.9
Tar is a Linux command that combines several files into a single file. Combining multiple small files into a large file increases the compression efficiency and data transfer speed. However, tar has a problem: the smaller the target files, the lower its performance. In this paper, we show that this performance degradation occurs when tar reads the data from the target files, and we propose qtar (quick tar) to solve this problem via flash-level remapping. When the size of an I/O request is less than 1 MB, the I/O performance decreases in proportion to the decrease in request size. Since tar reads the data of files one by one, a smaller file size results in lower performance. Therefore, qtar implements a remapping technique to read data from the target files at the maximum I/O size regardless of the size of each file. Our evaluations show that qtar reduces the execution time by up to 3.4 times compared to tar.
A Cross Layer Optimization Technique for Improving Performance of MLC NAND Flash-Based Storages
Jisung Park, Sungjin Lee, Jihong Kim
http://doi.org/10.5626/JOK.2017.44.11.1130
The multi-leveling technique, which stores multiple bits in a single memory cell, has significantly improved the density of NAND flash memory along with process shrinking. However, because of the side effects of the multi-leveling technique, the average write performance of MLC NAND flash memory is degraded by more than a factor of two relative to that of SLC NAND flash memory. In this paper, we introduce existing cross-layer optimization techniques proposed to improve the performance of MLC NAND flash-based storages, and propose a new integration technique that overcomes the limitations of existing techniques by exploiting their complementarity. By fully exploiting the performance asymmetry in MLC NAND flash devices at the flash translation layer, the proposed technique can handle many write requests with the performance of SLC NAND flash devices, thus significantly improving the performance of NAND flash-based storages. Experimental results show that the proposed technique improves performance by 39% on average over the individual techniques.
Garbage Collection Synchronization Technique for Improving Tail Latency of Cloud Databases
Seungwook Han, Sangwook Shane Hahn, Jihong Kim
http://doi.org/10.5626/JOK.2017.44.8.767
In a distributed system environment, such as a cloud database, the tail latency needs to be kept short to ensure uniform quality of service. In this paper, through experiments on a Cassandra database, we show that long tail latency is caused by a lack of memory space, because the database cannot receive any requests until free space is reclaimed by writing the buffered data to the storage device. We observed that, since the performance of the storage device determines the time required to write the buffered data, the performance degradation of a Solid State Drive (SSD) due to garbage collection results in a longer tail latency. We propose a garbage collection synchronization technique, called SyncGC, that performs garbage collection in the Java virtual machine and garbage collection in the SSD concurrently, thus hiding garbage collection overheads in the SSD. Our evaluations on real SSDs show that SyncGC reduces the 99.9th- and 99.99th-percentile tail latency by 31% and 36%, respectively.
AIOPro: A Fully-Integrated Storage I/O Profiler for Android Smartphones
Sangwook Shane Hahn, Inhyuk Yee, Donguk Ryu, Jihong Kim
Application response time is critical to end-user response time on Android smartphones. Given the plentiful resources of recent smartphones, storage I/O response time has become a major factor in application response time. However, existing storage I/O trace tools for Android and Linux provide limited information for only a specific I/O layer, which makes it difficult to combine I/O information from different I/O layers and limits their usefulness to application developers and researchers. In this paper, we propose a novel storage I/O trace tool for Android, called AIOPro (Android I/O profiler). It traces storage I/O across the application, Android platform, system call, virtual file system, native file system, page cache, block layer, SCSI layer, and device driver. It then combines the storage I/O information from these I/O layers by linking them with file information and physical addresses. Our evaluations of real smartphone usage scenarios and benchmarks show that AIOPro can track storage I/O information from all I/O layers without any data loss, with less than 0.1% system overhead.
Improving the Lifetime of NAND Flash-based Storages by Min-hash Assisted Delta Compression Engine
Hyoukjun Kwon, Dohyun Kim, Jisung Park, Jihong Kim
In this paper, we propose the Min-hash Assisted Delta-compression Engine (MADE) to improve the lifetime of NAND flash-based storages at the device level. MADE effectively reduces the write traffic to NAND flash through the use of a novel delta compression scheme. The delta compression performance was optimized by introducing min-hash based LSH (Locality Sensitive Hash) and efficiently combining it with our delta compression method. We also developed a delta encoding technique that has functionality equivalent to deduplication and lossless compression. The results of our experiment show that MADE reduces the amount of data written to NAND flash by up to 90%, which is better than a simple combination of deduplication and lossless compression schemes by 12% on average.
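As background for the min-hash based LSH mentioned above, the following is a minimal sketch of generic min-hash signatures, which let two data blocks be compared for similarity without a byte-by-byte scan. It is not the MADE engine itself: the shingle size, the number of hash functions, the use of BLAKE2b, and the names `minhash_signature` and `estimated_similarity` are all assumptions made for illustration.

```python
import hashlib
from typing import List, Set


def shingles(data: bytes, k: int = 4) -> Set[bytes]:
    """Split data into overlapping k-byte shingles (k is an assumed parameter)."""
    return {data[i:i + k] for i in range(max(1, len(data) - k + 1))}


def minhash_signature(data: bytes, num_hashes: int = 16, k: int = 4) -> List[int]:
    """For each seeded hash function, keep the minimum hash over all shingles."""
    parts = shingles(data, k)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "little")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "little")
            for s in parts
        ))
    return signature


def estimated_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    old = b"The quick brown fox jumps over the lazy dog" * 4
    new = old.replace(b"lazy", b"busy")
    print(estimated_similarity(minhash_signature(old), minhash_signature(new)))
```

In a delta compression setting, a block whose signature closely matches that of a stored block would be a candidate reference for delta encoding. How MADE selects references and encodes deltas is not specified in the abstract.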
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal