Y. Zhang, G. Soundararajan, M. W. Storer, L. N. Bairavasundaram, S. Subbiah, A. C. Arpaci-Dusseau and R. H. Arpaci-Dusseau
Bonfire is a mechanism for accelerating cache warmup for large caches so that application service levels can be met significantly sooner than would be possible with on-demand warmup.
Large caches in storage servers have become essential for meeting the service levels required by applications. Today, these caches often need to be warmed with data, in scenarios such as the dynamic creation of cache space and server restarts that clear cache contents. When large storage caches are warmed at the rate of application I/O, warmup can take hours or even days, affecting both application performance and server load over a long period.
We have created Bonfire, a mechanism for accelerating cache warmup. Bonfire monitors storage server workloads, logs important warmup data, and efficiently preloads storage-level caches with warmup data. Bonfire is based on our detailed analysis of block-level data-center traces, which provides insights into heuristics for warmup as well as the potential for efficient mechanisms. We show through both simulation and trace replay that Bonfire significantly reduces both warmup time and backend server load compared to a cache that is warmed up on demand.
In Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST’13)
San Jose, February 2013.
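The monitor/log/preload loop described in the abstract can be illustrated with a small sketch. All names here, and the LRU-based choice of warmup set, are illustrative assumptions, not Bonfire's actual implementation:

```python
from collections import OrderedDict

class WarmupLogger:
    """Records recently accessed block IDs so a cold cache can be
    preloaded in bulk, rather than warmed one on-demand miss at a time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()  # block_id -> None, kept in LRU order

    def observe(self, block_id):
        # Called for every I/O the storage server sees.
        self.recent.pop(block_id, None)
        self.recent[block_id] = None
        if len(self.recent) > self.capacity:
            self.recent.popitem(last=False)  # drop the least-recent entry

    def warmup_set(self):
        # Hottest-first list of blocks to bulk-load after a restart.
        return list(reversed(self.recent))


def preload(cache, logger, read_block):
    # Fetch the logged warmup set into the new cache in one pass,
    # instead of waiting for applications to fault blocks in on demand.
    for block_id in logger.warmup_set():
        cache[block_id] = read_block(block_id)
```

On a restart, `preload` replaces a long tail of on-demand misses with a single bulk pass over the logged warmup set, which is what lets service levels be met sooner.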
Efficient Interfaces to Flash Storage
The objective of this research is to determine whether there are new interfaces to stand-alone flash devices that improve their performance, cost, predictability, or reliability. We hope to demonstrate that relatively simple changes to the standard storage interface, which do not significantly affect the cost of a device, can have a dramatic effect on these metrics. Our designs are guided by pragmatism, in that we seek changes to the interface that require little intelligence on the device. In addition, we seek interfaces that could be standardized across manufacturers, similar to existing storage interfaces, to enable a clean separation of software and hardware suppliers.
Trade-offs in Flash (SSD) Based Storage Caching vs. Tiering
There is substantial interest in both industry and academia in how best to integrate flash-based storage into existing disk-based storage systems, given their complementary cost, performance, and power characteristics. There are two primary camps, or schools of thought, about flash storage integration.
1) The caching camp argues for managing flash-based storage as a large caching layer in the storage hierarchy to capture the working set of data stored in the disk layer.
2) The tiering (or multi-tiering) camp argues for managing flash-based storage in a manner equivalent to disk drive based storage, in other words, as a primary data store, in conjunction with one or more classes of disk drives (managed as separate tiers).
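The core architectural difference between the two camps can be made concrete with a toy sketch (hypothetical classes, not any vendor's design): under caching, flash holds copies and the disk remains the authoritative store; under tiering, each block lives on exactly one tier and promotion moves it rather than copying it.

```python
# Caching: flash holds *copies*; disk remains the authoritative store.
class FlashCache:
    def __init__(self, disk, capacity):
        self.disk, self.capacity, self.copies = disk, capacity, {}

    def read(self, block_id):
        if block_id not in self.copies:                 # miss: fill from disk
            if len(self.copies) >= self.capacity:
                self.copies.pop(next(iter(self.copies)))  # naive eviction
            self.copies[block_id] = self.disk[block_id]
        return self.copies[block_id]


# Tiering: each block lives on exactly *one* tier; the system migrates
# hot blocks to flash and demotes cold ones, changing the block's home.
class TieredStore:
    def __init__(self, flash, disk):
        self.flash, self.disk = flash, disk

    def read(self, block_id):
        tier = self.flash if block_id in self.flash else self.disk
        return tier[block_id]

    def promote(self, block_id):
        if block_id in self.disk:                       # move, not copy
            self.flash[block_id] = self.disk.pop(block_id)
```

The copy-versus-move distinction drives many of the trade-offs debated below: a cache can be lost without losing data, while a tier holds the only copy and must be managed as primary storage.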
Both camps are well represented in industry solutions, with almost every storage vendor today incorporating flash storage into its product portfolio using very distinct approaches. The plethora of solutions in both classes (caching and tiering) has led to debates about the superiority of each solution class for enterprise workloads.
The goal of this proposed project is to bring some clarity to this high-spirited and ongoing debate in the storage research and industry communities. Comprehensively characterizing each solution class is important for addressing and comparing the sometimes strikingly different implementations, even within the same class of solutions. While the high-level assumptions under which one solution is better than the other may be obvious, what is less clear is the scope of each solution's applicability to real-world enterprise workloads. A complementary and equally important analysis therefore involves workload characterization to determine the specific parameters under which caching and/or tiering would be most effective. Thus, we intend to focus on formalizing the rationale and concepts that underlie caching and tiering, categorizing solutions within the same class based on architectural assumptions (local and shared SSD deployments), analyzing the impact of tunable parameters in each solution class, characterizing workloads to determine their suitability for either or both classes of solutions, and deriving rules of thumb for choosing the superior storage solution given a workload description. When answering these questions, it is also important to carefully define the metrics used to compare the various solutions.
A Study of Network Storage Benefits using FLASH Hardware with Indexing Workloads
The value proposition of network storage has long been increasing bandwidth and IOPS for partitionable workloads by aggregating disks. When network latency, and even bandwidth, start to lag behind faster and larger local persistent stores such as Phase Change Memory or FLASH, the value proposition of network storage will have to change, as blocking reads will be serviced more quickly by a local client-side IOPS tier than by a network storage device. One of the most significant random-read IOPS workloads will be out-of-RAM index lookups and insertions. There is considerable interest in this area due to search, rapid file attribute indexing, deduplication, provenance, revision control, client-side cloud caching, and more. Therefore, the number of indexes and the percentage of IOPS they consume is only going to increase. This project plans to conduct a comprehensive study of the performance of the most effective indexing technologies under mixed workloads across a multi-tier cache hierarchy, including a client-side FLASH tier, a LAN network storage tier, and a remote backup or cloud tier. The project will explore several popular index technologies directly above a FLASH SSD and will provide an instrumentation framework suitable for capturing and analyzing IOPS-intensive workloads across a multi-tier storage architecture.
High-Performance Flash-based Indexes for Content-Based Systems
Recent years have seen the widespread use of large-scale content-based systems, such as wide-area accelerators, caches, data de-duplication systems, content-based routers, and multimedia similarity detectors. These systems crucially depend on the use of indexes for rapidly locating content of interest. Since the underlying base of content that these systems compute over is large (typically several terabytes to petabytes), the indexes are also large, ranging from a few tens to hundreds of GB. Existing approaches for maintaining such indexes, such as DRAM- or disk-based techniques, are either too expensive or too slow. In this research, we consider a suite of index designs that leverage flash-based storage (e.g., SSDs) to offer high performance at low cost, and wide-ranging flexibility to meet the demands of a variety of content-based systems. Our goal is to design schemes that are 1-2 orders of magnitude better than the competition in terms of index operations per second per dollar.
We build on our prior work designing fast flash-based indexes for wide-area acceleration, in which we showed that workloads where reads and writes are both frequent can be supported by augmenting flash with a small amount of DRAM and adopting a combination of buffering and lazy updates to deliver high performance at low cost. We also showed how to support simple eviction under these workloads.
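The buffering-and-lazy-update idea can be sketched with a toy model, where plain dictionaries stand in for DRAM and flash; the names are ours for illustration, not the actual system's:

```python
class BufferedFlashIndex:
    """Toy sketch of DRAM-buffered, lazily updated flash indexing:
    writes accumulate in a small DRAM buffer and are flushed to flash
    in batches, so flash sees few large writes instead of many small
    random ones; reads check DRAM first, then flash."""

    def __init__(self, buffer_limit):
        self.buffer_limit = buffer_limit
        self.dram = {}    # small, fast write buffer
        self.flash = {}   # stands in for the on-flash index structure

    def put(self, key, value):
        self.dram[key] = value
        if len(self.dram) >= self.buffer_limit:
            self._flush()

    def get(self, key):
        if key in self.dram:            # recent, not-yet-flushed writes
            return self.dram[key]
        return self.flash.get(key)      # otherwise one flash lookup

    def _flush(self):
        # Lazy update: one large batched write to flash.
        self.flash.update(self.dram)
        self.dram.clear()
```

A real design must also organize the on-flash portion for few random writes (e.g., log-structured layouts), but the DRAM-buffer-plus-batched-flush split is the essence of the trade-off.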
The proposed research expands our prior work in significant new directions and to several new content-based systems. In our research plan, we first consider how to design the appropriate data structures to handle different mixtures of reads and writes at the index. We then consider how to handle updates, deletions and streaming-style evictions from the index in a flexible and efficient fashion under different workloads. We consider how to modify the designs to accommodate hierarchical keys, as opposed to flat hash-based keys. We also consider data structures to support approximate nearest-neighbor matches over the keys, as opposed to exact matches. In all cases, we consider a two-tier set-up where the index spans a small amount of DRAM and a much larger amount of flash-based storage.
HaRD Storage Systems
How will flash impact the next generation of parallel and distributed storage systems? One view is that the primary location for flash in future systems is on the client side and not in the servers, which will remain disk-based. With flash on the clients and disks on the servers, the responsibilities and roles of storage are dramatically altered. First, flash can decouple workloads from both network and server-side disk performance limits by serving as a large read cache and write buffer. Second, because data may persist in client-side flash storage, redundancy must exist not only across server disks but also across client-side flash; we call this arrangement hierarchical redundancy. Finally, as performance solutions migrate to the client, the storage server can apply more aggressive space-saving techniques.
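The decoupled write path and hierarchical redundancy described above might look roughly like this. This is a deliberately simplified sketch under our own assumptions (buffered writes mirrored to one peer's flash, then destaged), not HaRD's actual design:

```python
class FlashClient:
    """Hypothetical client write/read path under hierarchical redundancy:
    a write is buffered in local client flash and mirrored to a peer
    client's flash, so it is redundant *before* it is ever destaged to
    the disk-based servers; client flash doubles as a read cache."""

    def __init__(self, peer_flash, server_disk):
        self.local_flash = {}
        self.peer_flash = peer_flash      # another client's flash
        self.server_disk = server_disk    # disk-based server store

    def write(self, block_id, data):
        self.local_flash[block_id] = data   # fast local persistence
        self.peer_flash[block_id] = data    # redundancy across clients

    def destage(self):
        # In the background: push buffered writes to the servers, then
        # drop the redundant client-side copies.
        self.server_disk.update(self.local_flash)
        for block_id in list(self.local_flash):
            self.local_flash.pop(block_id)
            self.peer_flash.pop(block_id, None)

    def read(self, block_id):
        # Client flash serves as a large read cache in front of the
        # servers, decoupling reads from network and disk limits.
        if block_id not in self.local_flash:
            self.local_flash[block_id] = self.server_disk[block_id]
        return self.local_flash[block_id]
```

Because writes are absorbed and made redundant at the clients, the servers are off the critical path and free to apply the more aggressive space-saving techniques mentioned above.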
This project will investigate a hybrid flash/disk architecture called the Hierarchically Redundant Decoupled Storage System (HaRD) that enables massive performance improvements as well as capacity savings within large-scale storage systems. HaRD promises to change the way we build future storage clusters.