On the Accuracy and Scalability of Intensive I/O Workload Replay

Alireza Haghdoost and Weiping He, University of Minnesota; Jerry Fredin, NetApp; David H.C. Du, University of Minnesota

We introduce a replay tool that can be used to replay captured I/O workloads for performance evaluation of high-performance storage systems. We study several sources in the stock operating system that introduce uncertainty when replaying a workload. Based on remedies for these findings, we design and develop a new replay tool called hfplayer that can more accurately replay intensive block I/O workloads in a similar unscaled environment. However, to replay a given workload trace in a scaled environment, the dependency between I/O requests becomes crucial. Therefore, we propose a heuristic way of speculating I/O dependencies in a block I/O trace. Using the generated dependency graph, hfplayer is capable of replaying the I/O workload in a scaled environment. We evaluate hfplayer with a wide range of workloads using several accuracy metrics and find that it produces better accuracy when compared with two existing replay tools.


High Performance Metadata Integrity Protection in the WAFL Copy-on-Write File System

Harendra Kumar; Yuvraj Patel, University of Wisconsin—Madison; Ram Kesavan and Sumith Makam, NetApp

We introduce a low-cost incremental checksum technique that protects metadata blocks against in-memory scribbles, and a lightweight digest-based transaction auditing mechanism that enforces file system consistency invariants. Compared with previous work, our techniques reduce performance overhead by an order of magnitude. They also help distinguish scribbles from logic bugs. We also present a mechanism to pinpoint the cause of scribbles on production systems. Our techniques have been productized in the NetApp® WAFL® (Write Anywhere File Layout) file system with negligible performance overhead, greatly reducing corruption-related incidents over the past five years, based on millions of runtime hours.
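As a rough illustration of what "incremental" means for a checksum, the sketch below keeps a simple additive checksum per block and adjusts it on every legitimate write instead of rescanning the whole block. It is not the WAFL scheme, whose details are not given here; the additive checksum, the `Block` class, and all function names are assumptions made for this example only.

```python
MOD = 1 << 32  # keep the checksum in 32 bits (an arbitrary choice for this sketch)


def full_checksum(data: bytes) -> int:
    """Recompute the checksum from scratch: O(block size)."""
    return sum(data) % MOD


def incremental_update(checksum: int, old: bytes, new: bytes) -> int:
    """Adjust an existing checksum for a region whose bytes changed old -> new.

    Cost is proportional to the modified region, not the whole block.
    """
    return (checksum - sum(old) + sum(new)) % MOD


class Block:
    """Hypothetical in-memory metadata block with a maintained checksum."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.checksum = full_checksum(self.data)

    def write(self, offset: int, payload: bytes) -> None:
        """Legitimate update path: changes the data and the checksum together."""
        end = offset + len(payload)
        if offset < 0 or end > len(self.data):
            raise ValueError("write outside block bounds")
        old = bytes(self.data[offset:end])
        self.checksum = incremental_update(self.checksum, old, payload)
        self.data[offset:end] = payload

    def verify(self) -> bool:
        """Full recomputation, e.g. before the block is persisted."""
        return full_checksum(self.data) == self.checksum


if __name__ == "__main__":
    blk = Block(4096)
    blk.write(10, b"inode-table-entry")
    print("after legitimate write:", blk.verify())
    blk.data[100] ^= 0xFF  # simulated scribble that bypasses write()
    print("after scribble:", blk.verify())
```

A stray memory write that bypasses `write()` leaves the stored checksum stale, so a later `verify()` can flag it. A plain byte sum is weak (it misses reordered bytes, for example); a production design would presumably use a stronger function.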



Joint NFF: Brown University & University of Maryland

Roberto Tamassia (Brown) & Charalampos “Babis” Papamanthou (Univ. of Maryland) are collaborating on research entitled “Secure Deduplication and Compression for Big Data.”

Hybrid clouds are increasingly being deployed and enable seamless data movement between public and private environments. However, when data is stored in a public cloud, significant challenges arise in making encryption techniques work together with search, deduplication, and compression. The proposed project has two main components: (1) deduplication of encrypted data and (2) searching compressed and encrypted data. Concerning deduplication, they propose a new technique for deduplicating encrypted data that is based on locality-sensitive hashing and tolerates small changes in the underlying plaintext data without greatly increasing the space required. Concerning search, they propose using new compression techniques inspired by the database community to perform searches on encrypted data much more efficiently than existing systems. They expect their research to improve the state of the art in both theory and systems, and even to lead to new approaches and algorithms for performing more efficient queries on deduplicated and compressed unencrypted data.
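To give a feel for why locality-sensitive hashing tolerates small changes, here is a minimal MinHash sketch over plaintext byte shingles. It is only an illustration of the general idea, not the proposed technique: it ignores encryption entirely, and the shingle length, number of hash functions, and all function names are assumptions chosen for this example.

```python
import hashlib


def shingles(data: bytes, k: int = 8) -> set:
    """All overlapping k-byte substrings of the input (the whole input if shorter than k)."""
    if len(data) <= k:
        return {data}
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def minhash_signature(data: bytes, num_hashes: int = 64, k: int = 8) -> list:
    """MinHash signature: for each salted hash function, keep the smallest shingle hash."""
    sh = shingles(data, k)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(
            min(hashlib.blake2b(s, digest_size=8, salt=salt).digest() for s in sh)
        )
    return signature


def estimated_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of matching positions, which estimates the Jaccard similarity of the shingle sets."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    base = b"".join(f"record {i}: value {i * i}\n".encode() for i in range(200))
    edited = base.replace(b"record 100:", b"record 1OO:")  # small local change
    unrelated = b"".join(f"entry-{i}#{i * 7}\n".encode() for i in range(200))

    sig = minhash_signature(base)
    print("base vs edited:   ", estimated_similarity(sig, minhash_signature(edited)))
    print("base vs unrelated:", estimated_similarity(sig, minhash_signature(unrelated)))
```

Unlike an exact content hash, which changes completely after a one-byte edit, the signatures of near-duplicate inputs agree in most positions, so near-duplicates can be grouped and stored compactly.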


Joint NFF: UNC, Stony Brook & Rutgers – November 2016

Donald Porter (UNC), Michael Bender and Rob Johnson (Stony Brook) & Martin Farach-Colton (Rutgers) have teamed up on collaborative research entitled “Maintaining locality in a space-constrained file system.”

It is increasingly common for file systems to run out of space, especially on mobile phones, tablets, and laptops. Under space pressure, common file system data placement heuristics abandon any attempt to preserve locality, and do little, if anything, to recover data locality once space pressure is relieved, such as by deleting data or offloading data to the cloud.

