Improving Profile-Based Optimization
Our research improves the understanding of profile-based optimization. We will build tools that extract profile information and use them in studies of how source code and trace data interact and affect optimizations. We will classify program workloads in order to develop benchmarking workloads that yield better overall performance, and to identify workloads that have similar profiles and similar performance improvements. The goal is to open up the black box of profile-based optimization so that it can be used more effectively.
This research is an extension of research begun during an independent study in the Spring Semester of 2014. Our previous results, although unfinished, indicate that the performance improvements to the optimized program are highly dependent on the benchmarking workload and vary significantly.
Yanpei Chen, Kiran Srinivasan, Garth Goodson, and Randy Katz.
In this paper, we provide future storage system insights by using a new methodology that leverages an objective, multi-dimensional statistical technique to extract data access patterns from network storage system traces.
Enterprise storage systems are facing enormous challenges due to increasing growth and heterogeneity of the data stored. Designing future storage systems requires comprehensive insights that existing trace analysis methods are ill-equipped to supply. In this paper, we seek to provide such insights by using a new methodology that leverages an objective, multi-dimensional statistical technique to extract data access patterns from network storage system traces. We apply our method on two large-scale real-world production network storage system traces to obtain comprehensive access patterns and design insights at user, application, file, and directory levels. We derive simple, easily implementable, threshold-based design optimizations that enable efficient data placement and capacity optimization strategies for servers, consolidation policies for clients, and improved caching performance for both.
In Proceedings of the ACM Symposium on Operating Systems Principles 2011 (SOSP’11)
- The author’s version of the paper is attached to this posting. Please observe the following copyright:
© ACM, 2011. This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in the Proceedings of the ACM Symposium on Operating Systems Principles 2011 (SOSP ’11) https://doi.acm.org/10.1145/2043556.2043562
Workload Aware Database Storage
This research seeks to provide sufficient hints from the database storage engine to the underlying storage system so that the storage system can manage data and perform I/O more efficiently when executing queries. With OLTP workloads, the idea is to automatically partition the data based on continuous monitoring of which tuples are accessed (read or updated) together. Similarly, for OLAP workloads, the DB storage engine will monitor which columns of data are accessed together and provide hints to the storage system so that it can collocate, distribute, and/or partition the data as necessary.
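As a rough illustration of the OLTP idea, the sketch below counts how often pairs of tuples are accessed in the same transaction and groups tuples whose co-access count reaches a threshold. The function names, the transaction format (a list of tuple IDs per transaction), and the `min_count` threshold are all assumptions made for this example; the proposal does not specify a particular algorithm.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(transactions):
    """Count how often each pair of tuple IDs appears in the same transaction."""
    counts = Counter()
    for accessed in transactions:
        for pair in combinations(sorted(set(accessed)), 2):
            counts[pair] += 1
    return counts


def partition_hints(transactions, min_count=2):
    """Group tuple IDs that are co-accessed at least min_count times.

    Hypothetical helper: uses a simple union-find to merge tuples into
    candidate partitions that could be passed to the storage system as hints.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Register every tuple so that singletons also get a partition.
    for accessed in transactions:
        for t in accessed:
            find(t)

    # Merge the groups of tuples that are frequently accessed together.
    for (a, b), n in co_access_counts(transactions).items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for t in list(parent):
        groups.setdefault(find(t), set()).add(t)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    observed = [[1, 2], [1, 2, 3], [3, 4], [3, 4], [5]]
    print(partition_hints(observed, min_count=2))
```

The same counting approach would apply to the OLAP case by replacing tuple IDs with column names.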
Increasing the Intelligence of Cloud-storage Gateways
Cloud-storage gateways represent an interesting intermediate point between current storage architectures and future cloud-only systems. A gateway appears to clients as if it were a typical network file or block server, speaking protocols such as NFS, CIFS, or iSCSI, and thus enabling ready deployment within existing infrastructures; however, on the backend, the gateway is connected to a cloud storage service such as Amazon’s S3 or Google Storage, thus adding new functionality (automated off-site backup, cross-site sharing) and giving rise to new opportunities for data access and management.
In this proposal, we describe our proposed research to increase the intelligence of gateways so as to improve performance, increase reliability, and lower costs. Our initial focus will be on block-storage servers; however, much of our research will be applicable to both block-level and file-level gateways. To improve block-level gateways, we plan to develop and evaluate a series of block-inference techniques; by peering into the block stream and interpreting the contents of blocks, we will achieve tremendous improvements in gateway performance, reliability, and cost.
Reducing Memory and I/O Interference for Virtualized Systems and Cloud Computing
Increasingly, I/O and memory contention limit the performance of applications running on virtualized environments and the cloud. The problem is particularly acute because virtualized systems share memory and I/O resources. Processing resources can be divided by core and shared by context switching incrementally with low overhead. In contrast, workloads sharing memory and I/O interfere with each other: interleaving I/O requests destroys sequential I/O and disk head locality, memory sharing reduces cache-hit rates, and processor sharing flushes high-level caches. This project will develop mechanisms to reduce memory and I/O interference in these environments.
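The effect of interleaving on sequential I/O can be seen in a toy model. The sketch below measures the fraction of requests that directly follow the previous block, first for a single sequential stream and then for two such streams interleaved request by request. The helper names and the strict round-robin interleaving are assumptions chosen for illustration, not a model of any particular hypervisor or scheduler.

```python
def sequential_fraction(blocks):
    """Fraction of requests whose block number immediately follows the previous one."""
    if len(blocks) < 2:
        return 1.0
    sequential = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur == prev + 1)
    return sequential / (len(blocks) - 1)


def interleave(*streams):
    """Round-robin merge of request streams (truncates to the shortest stream)."""
    merged = []
    for group in zip(*streams):
        merged.extend(group)
    return merged


if __name__ == "__main__":
    vm_a = list(range(0, 8))        # one workload reading blocks 0..7 in order
    vm_b = list(range(1000, 1008))  # another reading blocks 1000..1007 in order
    print(sequential_fraction(vm_a))
    print(sequential_fraction(interleave(vm_a, vm_b)))
```

Each stream is fully sequential on its own, but the merged stream seen by the shared device contains no back-to-back sequential requests.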
Efficient Integrity Checking of Outsourced Storage
Authenticating keyword searches on outsourced documents appears to be an especially difficult type of problem in the design of authenticated data structures. The main challenge stems from the fact that the set of items returned in response to a search is typically much smaller than the set of items inspected during the execution of the search algorithm. Thus, solutions that authenticate the steps and intermediate results of the search algorithm would yield integrity proofs that are much larger than the query result. We are interested in developing efficient authentication methods where the proof size is comparable to (allowing a small overhead) or smaller than the search result size. Also, we seek solutions that take into account the parallelism afforded by information retrieval computations in the cloud.
Simulation Analysis of Server-side Flash Cache
This project proposes to investigate whether it is feasible and beneficial to use server-side flash as a write-through cache to improve system performance, or whether it is only practical to implement such caches on a write-back basis. A preliminary study comparing a variety of cache write-back policies suggests that a write-through cache is feasible and is certainly simpler in a number of respects. With this project, a more thorough and comprehensive answer to this and related questions will be sought.
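To make the comparison concrete, here is a minimal trace-driven sketch of the kind of simulation such a study might use. It replays a list of `(op, block)` records through an LRU cache and counts backend reads and writes under each policy. The `simulate` function, the trace format, the LRU eviction policy, and the final flush of dirty blocks are all assumptions for this example and are not taken from the project's actual simulator.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy):
    """Replay (op, block) records through an LRU cache; return backend I/O counts.

    policy is "write-through" (every write also goes to the backend) or
    "write-back" (writes stay dirty in the cache until evicted or flushed).
    """
    cache = OrderedDict()  # block -> dirty flag, ordered from LRU to MRU
    stats = {"backend_reads": 0, "backend_writes": 0}

    def insert(block, dirty):
        if block in cache:
            # Re-inserting moves the block to the MRU end and keeps it dirty
            # if it was already dirty.
            dirty = dirty or cache.pop(block)
        elif len(cache) >= capacity:
            _, was_dirty = cache.popitem(last=False)  # evict the LRU block
            if was_dirty:
                stats["backend_writes"] += 1
        cache[block] = dirty

    for op, block in trace:
        if op == "r":
            if block in cache:
                cache.move_to_end(block)  # cache hit
            else:
                stats["backend_reads"] += 1
                insert(block, False)
        elif policy == "write-through":
            stats["backend_writes"] += 1
            insert(block, False)
        else:
            insert(block, True)

    # Flush remaining dirty blocks so both policies end in a consistent state.
    stats["backend_writes"] += sum(1 for dirty in cache.values() if dirty)
    return stats


if __name__ == "__main__":
    trace = [("w", 1), ("w", 1), ("w", 1), ("r", 1), ("w", 2), ("r", 3)]
    for policy in ("write-through", "write-back"):
        print(policy, simulate(trace, capacity=2, policy=policy))
```

In this toy trace the write-back policy absorbs repeated writes to the same block, while write-through sends every write to the backend but never holds dirty data in the cache, which is the simplicity trade-off the project sets out to quantify.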
High-Performance Flash-based Indexes for Content-Based Systems
Recent years have seen the widespread use of large-scale content-based systems, such as wide-area accelerators, caches, data de-duplication systems, content-based routers, and multimedia similarity detectors. These systems crucially depend on the use of indexes for rapidly locating content of interest. Since the underlying base of content that these systems compute over is large (typically several terabytes to petabytes), the indexes are also large, ranging from tens to hundreds of GB. Existing approaches for maintaining such indexes, such as using DRAM or disk-based techniques, are either too expensive or too slow. In this research, we consider a suite of index designs that leverage flash-based storage (e.g., SSDs) to offer high performance at low cost, and wide-ranging flexibility to meet the demands of a variety of content-based systems. Our goal is to design schemes that are 1-2 orders of magnitude better than the competition in terms of index operations per second per dollar.
We build on our prior work on designing fast flash-based indexes for wide-area acceleration. In prior work, we showed that workloads where reads and writes are both frequent can be supported by augmenting flash with a small amount of DRAM and adopting a combination of buffering and lazy updates to support high performance at low cost. We also showed how to support simple eviction under these workloads.
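The combination of buffering and lazy updates can be sketched in a few lines. In the toy class below, writes land in a small in-memory buffer (standing in for DRAM) and are pushed to the larger store (standing in for flash) only in batches, while reads check the buffer first. The class name, the dictionary-based stores, and the `buffer_limit` parameter are hypothetical; the prior work's actual data structures are not described here.

```python
class BufferedIndex:
    """Toy two-tier index: a small write buffer in front of a larger store."""

    def __init__(self, buffer_limit=4):
        self.buffer = {}         # stands in for the small DRAM tier
        self.flash = {}          # stands in for the large flash tier
        self.buffer_limit = buffer_limit
        self.flash_batches = 0   # number of batched writes issued to "flash"

    def put(self, key, value):
        """Buffer the update; flush lazily once the buffer is full."""
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        """Apply all buffered updates to the flash tier as one batch."""
        if self.buffer:
            self.flash.update(self.buffer)
            self.buffer.clear()
            self.flash_batches += 1

    def get(self, key):
        """Return the newest value: the buffer takes precedence over flash."""
        if key in self.buffer:
            return self.buffer[key]
        return self.flash.get(key)


if __name__ == "__main__":
    index = BufferedIndex(buffer_limit=3)
    for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
        index.put(key, value)
    print(index.get("a"), index.get("d"), index.flash_batches)
```

Batching turns many small random writes into fewer, larger writes to flash, and repeated updates to the same key are absorbed in the buffer before they reach flash.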
The proposed research expands our prior work in significant new directions and to several new content-based systems. In our research plan, we first consider how to design the appropriate data structures to handle different mixtures of reads and writes at the index. We then consider how to handle updates, deletions and streaming-style evictions from the index in a flexible and efficient fashion under different workloads. We consider how to modify the designs to accommodate hierarchical keys, as opposed to flat hash-based keys. We also consider data structures to support approximate nearest-neighbor matches over the keys, as opposed to exact matches. In all cases, we consider a two-tier set-up where the index spans a small amount of DRAM and a much larger amount of flash-based storage.
Feedback Control for Elastic Cloud Storage
Within cloud-based infrastructures, many applications can share a set of storage resources, and each application has its own service level objective that should be satisfied within this environment. As workloads change and applications are started, stopped, or moved, the load placed on the storage system changes. The storage system needs to automatically respond to these load changes by adjusting where data is stored and how it is serviced in order to continue to efficiently meet each application’s SLO.
This project focuses on performance control for storage-intensive workloads in such a cloud environment. Taking a control-theoretic approach, sensors placed throughout the system can monitor the current performance of various subsystems, and actuators can be used to tune the storage system. One goal of the project is to create control solutions and policies that are modular and non-intrusive, minimizing their assumptions about the system’s internal structure and behavior, including other resource management functions.
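A minimal sketch of such a control loop is shown below. A sensor callback reports the current latency, and the controller scales a single resource allocation in proportion to the relative error against the SLO target. The function names, the multiplicative update rule, the gain, and the inverse-proportional latency model used in the example are all assumptions made for illustration; the controller treats the storage system as a black box, in the spirit of the non-intrusive design goal.

```python
def control_loop(target_latency_ms, measure, allocation,
                 gain=0.5, steps=20, lo=1.0, hi=100.0):
    """Adjust a resource allocation until measured latency meets the target.

    measure(allocation) is the sensor: it returns the observed latency in ms.
    The actuator is the allocation value itself, clamped to [lo, hi].
    """
    history = []
    for _ in range(steps):
        latency = measure(allocation)
        # Relative error: positive when the workload is slower than its SLO.
        error = (latency - target_latency_ms) / target_latency_ms
        allocation = min(hi, max(lo, allocation * (1 + gain * error)))
        history.append((latency, allocation))
    return allocation, history


def toy_storage_latency(allocation):
    """Hypothetical plant model: latency falls as the allocation grows."""
    return 200.0 / allocation


if __name__ == "__main__":
    final, history = control_loop(10.0, toy_storage_latency, allocation=5.0)
    print(round(final, 3), round(toy_storage_latency(final), 3))
```

With this toy model the allocation settles near 20 units, where the modeled latency equals the 10 ms target. A real deployment would need to handle sensor noise, actuator delay, and interactions between multiple controllers.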
HaRD Storage Systems
How will flash impact the next generation of parallel and distributed storage systems? One view is that the primary location for flash in future systems is on the client side and not in the servers, which will remain disk-based. With flash on the clients and disks on the servers, the responsibilities and roles of storage are dramatically altered. First, flash can decouple workloads from both network and server-side disk performance limits by serving as a large read cache and write buffer. Second, because data may persist in client-side flash storage, redundancy must exist not only across server disks but also across client-side flash. We call this arrangement hierarchical redundancy.
Finally, as performance solutions migrate more to the client, the storage server can apply more aggressive space-saving techniques.
This project will investigate a hybrid flash/disk architecture called Hierarchically Redundant Decoupled Storage System (HaRD) that enables massive performance improvements as well as capacity savings within large-scale storage systems. HaRD promises to change the way we build future storage clusters.