Flashield: a Key-value Cache that Minimizes Writes to Flash

128 0 0.0 ( 0 )

Download Cite

Added by Assaf Eisenman

Publication date 2017

fields Informatics Engineering

and research's language is English

Authors Assaf Eisenman - Asaf Cidon - Evgenya Pergament

Operating Systems

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

As its price per bit drops, SSD is increasingly becoming the default storage medium for cloud application databases. However, it has not become the preferred storage medium for key-value caches, even though SSD offers more than 10x lower price per bit and sufficient performance compared to DRAM. This is because key-value caches need to frequently insert, update and evict small objects. This causes excessive writes and erasures on flash storage, since flash only supports writes and erasures of large chunks of data. These excessive writes and erasures significantly shorten the lifetime of flash, rendering it impractical to use for key-value caches. We present Flashield, a hybrid key-value cache that uses DRAM as a filter to minimize writes to SSD. Flashield performs light-weight machine learning profiling to predict which objects are likely to be read frequently before getting updated; these objects, which are prime candidates to be stored on SSD, are written to SSD in large chunks sequentially. In order to efficiently utilize the caches available memory, we design a novel in-memory index for the variable-sized objects stored on flash that requires only 4 bytes per object in DRAM. We describe Flashields design and implementation and, we evaluate it on a real-world cache trace. Compared to state-of-the-art systems that suffer a write amplification of 2.5x or more, Flashield maintains a median write amplification of 0.5x without any loss of hit rate or throughput.

rate research

Memshare: a Dynamic Multi-tenant Memory Key-value Cache

73 - Asaf Cidon , Daniel Rushton , Stephen M. Rumble 2016

Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across multiple applications sharing the cache. This causes under utilization of memory which negatively impacts cache hit rates. We present Memshare, a novel web memory cache that dynamically manages memory across applications. Memshare provides a resource sharing model that guarantees private memory to different applications while dynamically allocating the remaining shared memory to optimize overall hit rate. Todays high cost of DRAM storage and the availability of high performance CPU and memory bandwidth, make web caches memory capacity bound. Memshares log-structured design allows it to provide significantly higher hit rates and dynamically partition memory among applications at the expense of increased CPU and memory bandwidth consumption. In addition, Memshare allows applications to use their own eviction policy for their objects, independent of other applications. We implemented Memshare and ran it on a week-long trace from a commercial memcached provider. We demonstrate that Memshare increases the combined hit rate of the applications in the trace by an 6.1% (from 84.7% hit rate to 90.8% hit rate) and reduces the total number of misses by 39.7% without affecting system throughput or latency. Even for single-tenant applications, Memshare increases the average hit rate of the current state-of-the-art memory cache by an additional 2.7% on our real-world trace.

Operating Systems

Multi-version Indexing in Flash-based Key-Value Stores

256 - Pulkit A. Misra , Jeffrey S. Chase , Johannes Gehrke 2019

Maintaining multip

Databases Operating Systems

Faster than Flash: An In-Depth Study of System Challenges for Emerging Ultra-Low Latency SSDs

65 - Sungjoon Koh , Junhyeok Jang , Changrim Lee 2019

Emerging storage systems with new flash exhibit ultra-low latency (ULL) that can address performance disparities between DRAM and conventional solid state drives (SSDs) in the memory hierarchy. Considering the advanced low-latency characteristics, different types of I/O completion methods (polling/hybrid) and storage stack architecture (SPDK) are proposed. While these new techniques are expected to take costly software interventions off the critical path in ULL-applied systems, unfortunately no study exists to quantitatively analyze system-level characteristics and challenges of combining such newly-introduced techniques with real ULL SSDs. In this work, we comprehensively perform empirical evaluations with 800GB ULL SSD prototypes and characterize ULL behaviors by considering a wide range of I/O path parameters, such as different queues and access patterns. We then analyze the efficiencies and challenges of the polled-mode and hybrid polling I/O completion methods (added into Linux kernels 4.4 and 4.10, respectively) and compare them with the efficiencies of a conventional interrupt-based I/O path. In addition, we revisit the common expectations of SPDK by examining all the system resources and parameters. Finally, we demonstrate the challenges of ULL SSDs in a real SPDK-enabled server-client system. Based on the performance behaviors that this study uncovers, we also discuss several system implications, which are required to take a full advantage of ULL SSD in the future.

Operating Systems

Learning Key-Value Store Design

135 - Stratos Idreos , Niv Dayan , Wilson Qin 2019

We introduce the concept of design continuums for the data layout of key-value stores. A design continuum unifies major distinct data structure designs under the same model. The critical insight and potential long-term impact is that such unifying models 1) render what we consider up to now as fundamentally different data structures to be seen as views of the very same overall design space, and 2) allow seeing new data structure designs with performance properties that are not feasible by existing designs. The core intuition behind the construction of design continuums is that all data structures arise from the very same set of fundamental design principles, i.e., a small set of data layout design concepts out of which we can synthesize any design that exists in the literature as well as new ones. We show how to construct, evaluate, and expand, design continuums and we also present the first continuum that unifies major data structure designs, i.e., B+tree, B-epsilon-tree, LSM-tree, and LSH-table. The practical benefit of a design continuum is that it creates a fast inference engine for the design of data structures. For example, we can predict near instantly how a specific design change in the underlying storage of a data system would affect performance, or reversely what would be the optimal data structure (from a given set of designs) given workload characteristics and a memory budget. In turn, these properties allow us to envision a new class of self-designing key-value stores with a substantially improved ability to adapt to workload and hardware changes by transitioning between drastically different data structure designs to assume a diverse set of performance properties at will.

Databases Machine Learning

Authenticated Key-Value Stores with Hardware Enclaves

87 - Yuzhe Tang , Ju Chen , Kai Li 2019

Authenticated data storage on an untrusted platform is an important computing paradigm for cloud applications ranging from big-data outsourcing, to cryptocurrency and certificate transparency log. These modern applications increasingly feature update-intensive workloads, whereas existing authenticated data structures (ADSs) designed with in-place updates are inefficient to handle such workloads. In this paper, we address this issue and propose a novel authenticated log-structured merge tree (eLSM) based key-value store by leveraging Intel SGX enclaves. We present a system design that runs the code of eLSM store inside enclave. To circumvent the limited enclave memory (128 MB with the latest Intel CPUs), we propose to place the memory buffer of the eLSM store outside the enclave and protect the buffer using a new authenticated data structure by digesting individual LSM-tree levels. We design protocols to support query authentication in data integrity, completeness (under range queries), and freshness. The proof in our protocol is made small by including only the Merkle proofs at selective levels. We implement eLSM on top of Google LevelDB and Facebook RocksDB with minimal code change and performance interference. We evaluate the performance of eLSM under the YCSB workload benchmark and show a performance advantage of up to 4.5X speedup.

Cryptography and Security Databases Distributed Parallel and Cluster Computing