Search : [ keyword: Non-volatile memory ] (12)

Design of Durable Node Replication for Persistent Memory Data Structures on NUMA Architectures

Junghan Kim, Young Ik Eom

http://doi.org/10.5626/JOK.2022.49.1.8

Recent advances in persistent memory (PM) and NUMA technologies provide high performance and large storage capacity to applications such as big data and machine learning. Such PM environments on multi-node systems require changes to the data structures used in each layer of the software stack. In research on PM data structures, however, it is difficult to ensure both a high level of concurrency and non-volatility, which are important characteristics of NUMA and PM, respectively. In this paper, we propose NRPM, which extends node replication, a representative NUMA algorithm. NRPM outperforms a hash algorithm by up to 5x by improving concurrency on a multi-node PM server using shared-log and flat-combining methods. We confirmed the validity of NRPM through various performance analyses that consider the characteristics of NUMA-PM.
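The shared-log and flat-combining idea mentioned in this abstract can be illustrated with a minimal, single-threaded Python sketch. It is not the authors' implementation: the class names `SharedLog` and `Replica` are hypothetical, real threads and NUMA placement are not modeled, and the persistence steps (flushes, fences) that make the log durable are omitted.

```python
# Minimal sketch of node replication over a shared operation log.
# Assumptions: single-threaded simulation, hypothetical names, no persistence.

class SharedLog:
    """Append-only operation log shared by all replicas."""

    def __init__(self):
        self.entries = []  # in a PM design this log would live in persistent memory

    def append_batch(self, ops):
        self.entries.extend(ops)
        return len(self.entries)  # new tail position


class Replica:
    """One copy of a key-value structure, intended to be local to one NUMA node."""

    def __init__(self, log):
        self.log = log
        self.data = {}
        self.applied = 0   # index of the next log entry this replica must apply
        self.pending = []  # operations published by threads on this node

    def publish(self, key, value):
        # A local thread announces its update instead of touching shared state.
        self.pending.append((key, value))

    def combine(self):
        # Flat combining: one combiner appends the whole local batch at once.
        batch, self.pending = self.pending, []
        self.log.append_batch(batch)
        self.sync()

    def sync(self):
        # Replay log entries this replica has not applied yet.
        for key, value in self.log.entries[self.applied:]:
            self.data[key] = value
        self.applied = len(self.log.entries)

    def get(self, key):
        self.sync()  # catch up with the log before reading
        return self.data.get(key)
```

In this sketch a reader on one node sees updates combined on another node only after replaying the log, which is why `get` calls `sync` first.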

NAFS : Stackable Filesystem for Maximizing Local Access in a NUMA System

Seungjun Ha, Hobin Woo, Euiseong Seo, Beomseok Nam

http://doi.org/10.5626/JOK.2021.48.6.612

Intel Optane DC Persistent Memory has read/write latencies comparable to DRAM but ensures data persistence, as block devices such as SSDs do. However, whereas legacy block devices are attached via PCIe or SATA, Optane DC PM modules are installed in the DIMM slots of NUMA nodes. Therefore, Optane DC PMs are known to suffer from NUMA effects, and the performance of a multithreaded application depends on its NUMA locality. In this paper, we propose a novel stackable file system, NUMA-Aware Filesystem (NAFS). NAFS divides a file into segment units so that I/Os can be performed on the local NUMA node where each application thread runs. To enable this feature, NAFS duplicates the file metadata across all NUMA nodes if the number of remote I/Os exceeds a certain threshold. Our performance study shows that NAFS significantly reduces the number of accesses to remote NUMA nodes, improving the performance of multithreaded applications.

LFA-SkipList: Optimizing SkipList by Reducing Access in a NUMA-Aware System

Sunghwan Ahn, Yujin Jang, Seungjun Ha, Beomseok Nam

http://doi.org/10.5626/JOK.2021.48.1.1

Intel's Optane DC Persistent Memory is a non-volatile memory that is faster than storage devices and stores data persistently. However, in a NUMA system, accessing the remote memory of another CPU socket has a longer latency than local NUMA access. Therefore, performance degrades when a SkipList is configured across multiple non-volatile memories. In this paper, we propose LFA-SkipList to solve this problem. LFA-SkipList adds a local pointer and uses it to access the local node first and then the remote node, thereby reducing unnecessary remote node accesses and improving performance. The study found that LFA-SkipList has a much shorter search time than the legacy SkipList.

Performance Analysis of DRAM Cache by Comparing Intel Optane DC Persistent Memory Operating Modes

Yaebin Moon, Deok-Jae Oh, Jung Ho Ahn

http://doi.org/10.5626/JOK.2020.47.10.893

Non-Volatile Memory (NVM) technology is a promising alternative to DRAM technology, especially with respect to the challenge of scaling. Recently, Intel released Optane DC Persistent Memory (DCPMM), an NVM product. The latest Intel server supports two operating modes to exploit DCPMM: 1) Memory mode uses DCPMM as main memory and DRAM as its cache, and 2) App Direct mode uses DCPMM and DRAM as independent main memory regions, which requires software modification for efficient utilization. In this paper, we compare the performance of these two operating modes. In Memory mode, if the working set size of an application is smaller than the DRAM cache size or the application has good data locality, the performance reduction caused by accessing the relatively slow DCPMM can be mostly amortized. However, as the working set size grows beyond the DRAM size, performance decreases because more accesses are served by DCPMM and incur an additional DRAM cache miss penalty (~70 ns). Therefore, the DRAM cache has a performance limitation due to the DRAM cache miss penalty, and App Direct mode may well perform better in an environment where the working set is large and data locality is limited.
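The trade-off described in this abstract can be made concrete with a simple average-latency model. This is a rough sketch under stated assumptions, not the paper's methodology: it assumes a hit costs the DRAM latency and a miss costs the DCPMM latency plus the miss penalty. Only the ~70 ns penalty comes from the abstract; the function names and the DRAM and DCPMM latencies passed in are hypothetical inputs that the caller must supply.

```python
def memory_mode_latency_ns(hit_ratio, dram_ns, dcpmm_ns, miss_penalty_ns=70.0):
    """Average access latency in Memory mode under a simple two-level model.

    hit_ratio       fraction of accesses served by the DRAM cache (0..1)
    dram_ns         assumed DRAM access latency (caller-supplied)
    dcpmm_ns        assumed DCPMM access latency (caller-supplied)
    miss_penalty_ns extra cost of a DRAM cache miss (~70 ns per the abstract)
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_cost = dcpmm_ns + miss_penalty_ns
    return hit_ratio * dram_ns + (1.0 - hit_ratio) * miss_cost


def break_even_hit_ratio(dram_ns, dcpmm_ns, miss_penalty_ns=70.0):
    """Hit ratio below which Memory mode is slower than accessing DCPMM directly.

    Derived by solving h*dram + (1-h)*(dcpmm + penalty) = dcpmm for h.
    """
    return miss_penalty_ns / (miss_penalty_ns + dcpmm_ns - dram_ns)
```

Under this model, once the hit ratio falls below the break-even value, every additional miss makes the DRAM cache a net cost, which matches the abstract's conclusion for large working sets with limited locality.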

PSL-DB: Non-Volatile Memory-optimized LSM-Tree with Skip List

Chanyeol Park, Dongui Kim, Beomseok Nam

http://doi.org/10.5626/JOK.2020.47.7.635

With the release of Intel's Optane DC Persistent Memory, non-volatile memory, which offers higher capacity than DRAM and higher performance than SSDs and HDDs, is in the spotlight as the next generation of storage devices. In this paper, we propose the Persistent Skip List DataBase (PSL-DB), a key-value store optimized for Optane DCPM in App Direct mode. PSL-DB uses a byte-addressable skip list that significantly reduces I/O traffic by avoiding redundant writes. PSL-DB also does not sacrifice write performance for read performance, because it does not throttle writes with artificial governors. In our experiments using Intel Optane DC Persistent Memory, PSL-DB shows significantly higher query processing throughput than legacy LevelDB storing its SSTables in Optane DC PM.

Distributed Storage System for Reducing Write Amplification on Non-Volatile Memory

Junghan Kim, Young Ik Eom

http://doi.org/10.5626/JOK.2020.47.2.129

Recently, research on non-volatile memory, such as 3D XPoint, in distributed storage systems has received considerable interest from both academia and industry. However, to utilize these state-of-the-art non-volatile memory devices effectively in distributed storage systems, the traditional architectures of HDD/SSD-based storage systems need to be improved. This is because current distributed storage system structures use a dedicated space for journaling to make up for slow storage performance. Considering that the performance characteristics of non-volatile memory are similar to those of DRAM, these structures are not only inefficient in terms of overall performance but also cause write amplification. In this paper, we propose an architecture that mitigates the effects of write amplification in non-volatile memory-based distributed storage systems. To evaluate the proposed architecture and scheme, we conducted diverse experiments in a Ceph storage system environment. Through these experiments, we confirmed that the DAXNJ structure proposed in this paper decreases write amplification by 61% during 1M object write operations and increases overall system performance by 15%.

An NVM-based Efficient Write-Reduction Scheme for Block Device Driver Performance Improvement

Junghan Kim, Young Ik Eom

http://doi.org/10.5626/JOK.2019.46.10.981

Recently, non-volatile memory (NVRAM) has attracted substantial attention as a next-generation storage device because it offers higher read/write performance than flash-based storage as well as higher cost-effectiveness than DRAM. One way to use NVRAM as a storage device is to modify the existing file system layer or block device layer. Using an NVRAM block device driver is advantageous in terms of overall system compatibility, as it does not require any modification of the existing storage stack. However, given the byte-level addressing of the NVRAM device, block writes are not effective in terms of durability or performance. In this paper, we propose a block device driver that optimizes the existing block write operations while taking the existing functionalities of the file system into account. The proposed block write reduction scheme performs partial block writes by classifying blocks according to the structure of the file system and according to the amount of data modified in the block, which is determined using an XOR operation. Several experiments are performed to validate the performance of the proposed block device driver under various workloads, and the results show that, compared to conventional block write operations, the amount of writes is reduced by up to 90%.
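The XOR-based detection of modified data can be sketched as follows. This is an illustrative user-space model, not the proposed driver: the function names are hypothetical, a `bytearray` stands in for the byte-addressable NVRAM device, and the classification of blocks by file system structure is left out.

```python
def modified_ranges(old: bytes, new: bytes):
    """Return (offset, length) runs where the two block images differ.

    A byte is considered modified when old_byte XOR new_byte is non-zero.
    """
    if len(old) != len(new):
        raise ValueError("block images must have the same size")
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(old, new)):
        if a ^ b:
            if start is None:
                start = i  # a run of modified bytes begins here
        elif start is not None:
            ranges.append((start, i - start))
            start = None
    if start is not None:
        ranges.append((start, len(old) - start))
    return ranges


def partial_write(device: bytearray, block_offset: int, new: bytes) -> int:
    """Write only the modified bytes of a block; return the number of bytes written."""
    old = bytes(device[block_offset:block_offset + len(new)])
    written = 0
    for off, length in modified_ranges(old, new):
        begin = block_offset + off
        device[begin:begin + length] = new[off:off + length]
        written += length
    return written
```

For a block in which only a few bytes change, `partial_write` touches just those bytes, whereas a conventional block write would rewrite the whole block.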

Design and Implementation of a Log-structured Buffer Based on Non-volatile Memory

Yongseok Son

http://doi.org/10.5626/JOK.2018.45.11.1117

Next-generation non-volatile memory (NVM) technologies, such as PCM and STT-MRAM, provide low latency, high bandwidth, non-volatility, and high capacity. Such NVMs are widely used and studied in the fields of computer systems and databases for high-performance computing. For example, recent studies have used NVM for file system journaling buffers and database logging and have carried out many optimizations accordingly. As a complement to existing work, this paper focuses on atomic page updates in applications. For example, in a data management application such as a database system, atomicity is ensured by performing a redundant write operation with a temporary buffer in order to update multiple pages atomically. However, this redundant write operation can reduce performance. Therefore, in this paper, we introduce a log-structured buffer manager (LSBM) to improve performance while ensuring consistency. LSBM updates pages to NVM by logging and provides buffering. In addition, if there are duplicated pages in the buffer, the old version of the page is removed so that only the latest page is reflected, which minimizes the I/O and write amount. Experimental results show that LSBM improves application performance and reduces the total write amount.
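The logging and duplicate-removal behavior described here can be sketched in a few lines of Python. This is a simplified illustration, not the LSBM implementation: the class name `LogStructuredBuffer` and its methods are hypothetical, a Python list stands in for the NVM log, and crash consistency is not modeled.

```python
class LogStructuredBuffer:
    """Append-only page buffer that keeps only the newest version of each page."""

    def __init__(self):
        self.log = []     # append-only entries (page_id, data); stands in for NVM
        self.latest = {}  # page_id -> index of the newest entry in self.log

    def update(self, page_id, data):
        old = self.latest.get(page_id)
        if old is not None:
            self.log[old] = None  # invalidate the stale version of this page
        self.log.append((page_id, data))
        self.latest[page_id] = len(self.log) - 1

    def read(self, page_id):
        idx = self.latest.get(page_id)
        return None if idx is None else self.log[idx][1]

    def flush(self, storage: dict) -> int:
        """Write the surviving (latest) pages to storage; return pages written."""
        written = 0
        for entry in self.log:
            if entry is not None:
                page_id, data = entry
                storage[page_id] = data
                written += 1
        self.log.clear()
        self.latest.clear()
        return written
```

Updating the same page several times before a flush results in a single write to storage, which is the source of the reduced write amount in this sketch.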

Garbage Collection Technique for Non-volatile Memory by Using Tree Data Structure

Dokeun Lee, Youjip Won

http://doi.org/

Most traditional garbage collectors use language-level metadata, which is designed for finding pointer types. However, because it is difficult to use this metadata in non-volatile memory allocation platforms, a new garbage collection technique is essential for non-volatile memory utilization. In this paper, we design new metadata, called the "Allocation Tree", for managing information about non-volatile memory allocations. This metadata consists of a tree data structure for fast information lookup, in which each node holds an allocation address and object ID pair in key-value form. The garbage collector starts collecting when non-volatile memory space is insufficient, and it compares user data with the allocation tree to detect garbage. We implement this algorithm in "HEAPO", a persistent-heap-based non-volatile memory allocation platform, for demonstration.
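The comparison between user data and the allocation metadata can be sketched as follows. This is only an illustration under assumptions: the class and function names are hypothetical, a sorted list searched with `bisect` stands in for the tree, and the set of addresses still referenced by user data is assumed to be given, since the abstract does not say how it is gathered.

```python
import bisect


class AllocationTree:
    """Ordered map from allocation address to object ID (sorted list as a tree stand-in)."""

    def __init__(self):
        self._addrs = []  # allocation addresses, kept sorted for fast lookup
        self._ids = {}    # address -> object ID

    def insert(self, addr, obj_id):
        if addr in self._ids:
            raise ValueError("address already allocated")
        bisect.insort(self._addrs, addr)
        self._ids[addr] = obj_id

    def remove(self, addr):
        i = bisect.bisect_left(self._addrs, addr)
        if i == len(self._addrs) or self._addrs[i] != addr:
            raise KeyError(addr)
        del self._addrs[i]
        del self._ids[addr]

    def items(self):
        return [(a, self._ids[a]) for a in self._addrs]


def collect_garbage(tree, live_addrs):
    """Free every allocation that user data no longer references.

    live_addrs is assumed to be the set of addresses found in user data.
    Returns the freed (address, object ID) pairs.
    """
    live = set(live_addrs)
    freed = [(a, oid) for a, oid in tree.items() if a not in live]
    for addr, _ in freed:
        tree.remove(addr)
    return freed
```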

Flash Operation Group Scheduling for Supporting QoS of SSD I/O Request Streams

Eungyu Lee, Sun Won, Joonwoo Lee, Kanghee Kim, Eyeehyun Nam

http://doi.org/

As SSDs are increasingly used as high-performance storage or caches, more attention is being paid to providing SSDs with Quality-of-Service for the I/O request streams of various applications in server systems. Since most SSDs use the AHCI controller interface on a SATA bus, it is not possible to provide a differentiated service by distinguishing each I/O stream from the others within the SSD. However, now that a new SSD interface, the NVMe controller interface on a PCI Express bus, has been proposed, it is possible to recognize each I/O stream and schedule I/O requests within the SSD for differentiated services. This paper proposes Flash Operation Group Scheduling within NVMe-based flash storage devices and demonstrates through QEMU-based simulation that a proportional bandwidth share can be achieved for each I/O stream.
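To show what a proportional bandwidth share means in practice, the sketch below uses stride scheduling, a classical proportional-share technique. It is a swapped-in illustration and not the Flash Operation Group Scheduling algorithm, whose details the abstract does not give; the function name, stream names, and weights are hypothetical.

```python
import heapq
from fractions import Fraction


def proportional_schedule(weights, n_slots):
    """Dispatch n_slots requests among streams in proportion to integer weights.

    Stride scheduling: each stream advances by 1/weight per dispatch, and the
    stream with the smallest accumulated value is served next (ties by name).
    """
    heap = [(Fraction(0), name, Fraction(1, w)) for name, w in weights.items()]
    heapq.heapify(heap)
    order = []
    for _ in range(n_slots):
        position, name, stride = heapq.heappop(heap)
        order.append(name)
        heapq.heappush(heap, (position + stride, name, stride))
    return order
```

With weights of 3 and 1, the first stream receives three of every four dispatch slots over time, which corresponds to a 3:1 bandwidth share when requests are of equal size.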


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
