Tag Archives: osr

Designing a Fast File System Crawler with Incremental Differencing

Tim Bisson, Yuvraj Patel, and Shankar Pasupathy.

In this paper, we discuss the challenges in building a file system crawler and then present the design of two file system crawlers.

Search engines for storage systems rely on crawlers to gather the list of files that need to be indexed. The recency of an index is determined by the speed at which this list can be gathered. While there has been a substantial amount of literature on building efficient web crawlers, there is very little literature on file system crawlers. In this paper we discuss the challenges in building a file system crawler. We then present the design of two file system crawlers: the first uses the standard POSIX file system API but carefully controls the amount of memory and CPU that it uses. The second leverages modifications to the file system’s internals, and a new API called SnapDiff, to detect modified files rapidly. For both crawlers we describe the incremental differencing design: the method used to produce a list of changes between a previous crawl and the current point in time.
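As a rough illustration of incremental differencing with the standard POSIX API, the sketch below walks a tree, records per-file metadata, and diffs two such snapshots to produce a change list. It is only a minimal sketch of the general idea, not the paper's crawlers, which additionally bound memory and CPU usage or use the SnapDiff API; the function names and the choice of size/mtime as the change signal are illustrative assumptions.

```python
# Minimal sketch of POSIX-based incremental differencing between two crawls.
# Illustration only; the paper's crawlers additionally bound memory and CPU
# usage, and the second crawler uses the SnapDiff API instead of a tree walk.
import os

def crawl(root):
    """Walk the tree and record (size, mtime) for every file path."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            snapshot[path] = (st.st_size, st.st_mtime)
    return snapshot

def diff(previous, current):
    """Return paths added, removed, or modified since the previous crawl."""
    added = [p for p in current if p not in previous]
    removed = [p for p in previous if p not in current]
    modified = [p for p in current
                if p in previous and current[p] != previous[p]]
    return added, removed, modified

# Usage: feed only the change list to the indexer instead of re-indexing
# everything, e.g.
#   prev = crawl("/export/home"); ...; curr = crawl("/export/home")
#   added, removed, modified = diff(prev, curr)
```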

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 11-19

Resources

FS_crawler_Bisson.pdf

Hybrid Aggregates: Combining SSDs and HDDs in a single storage pool

John D. Strunk.

This paper describes the implementation of the Hybrid Aggregates prototype and the policies for automatic data placement and movement that have been evaluated, and presents some performance results from the prototype system.

Relative to traditional hard disk drives (HDDs), solid state drives (SSDs) provide a very large number of I/Os per second, but they have limited capacity. From a cost-effectiveness perspective, SSDs provide significantly better random I/O throughput per dollar than a typical disk, but the capacity provided per dollar spent on SSDs limits them to the most demanding of datasets.

Traditionally, Data ONTAP® storage aggregates have been provisioned using a single type of disk. This restriction limits the cost-effectiveness of the storage pool to that of the underlying disks. The Hybrid Aggregates project within the Advanced Technology Group (ATG) explored the potential to combine multiple disk types within a single aggregate. One of the primary goals of the project was to determine whether a hybrid aggregate, composed of SSDs (for their cost-effective performance) and Serial-ATA (SATA) disks (for their cost-effective capacity), could simultaneously provide better cost/performance and cost/throughput ratios than an all Fibre-Channel (FC) solution.

The project has taken a two-pronged approach to building a prototype system capable of supporting hybrid aggregates. The first part of the project investigated the changes necessary for Data ONTAP RAID and WAFL® layers to support a hybrid aggregate. This included propagating disk-type information to WAFL, modifying WAFL to support the allocation of blocks from a particular storage class (i.e., disk type), and repurposing the existing write-after-read and segment-cleaning infrastructure to support the movement of data between storage classes. The second part of the project examined potential policies for allocating and moving data between storage classes within a hybrid aggregate. Through proper policies, it is possible to automatically segregate the data within the aggregate such that the SSD-backed portion of the aggregate absorbs a large fraction of the I/O requests, leaving the SATA disks to contribute capacity for colder data.
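As a purely illustrative sketch of the kind of placement policy the second part of the project examined, the code below counts per-block reads and assigns the hottest blocks to the SSD class, leaving the rest on SATA. The class names, capacity threshold, and access-counting scheme are assumptions made for illustration only and do not reflect the Data ONTAP/WAFL internals described in the paper.

```python
# Toy sketch of an automatic data-placement policy for a hybrid aggregate:
# frequently read blocks are assigned to the SSD class, cold blocks to SATA.
# Class names, thresholds, and the access-counting scheme are illustrative
# assumptions, not the Data ONTAP/WAFL implementation from the paper.
from collections import Counter

SSD_CAPACITY_BLOCKS = 1000          # hypothetical SSD tier size
read_counts = Counter()             # block number -> reads in current epoch

def record_read(block_no):
    read_counts[block_no] += 1

def placement_plan():
    """Pick the hottest blocks (up to SSD capacity) for the SSD class;
    everything else stays on (or moves back to) SATA."""
    hot = {b for b, _ in read_counts.most_common(SSD_CAPACITY_BLOCKS)}
    def storage_class(block_no):
        return "ssd" if block_no in hot else "sata"
    return storage_class

# A background task could walk allocated blocks each epoch and schedule moves
# (e.g., via segment cleaning) for blocks whose assigned class has changed.
```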

This paper describes the implementation of the Hybrid Aggregates prototype and the policies for automatic data placement and movement that have been evaluated. It also presents some performance results from the prototype system.

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 50-56

Resources

Hybrid_aggregates_Strunk.pdf

RAID Triple Parity

Atul Goel and Peter Corbett.

This paper describes RAID triple parity (RTP) and a symmetric variant of the algorithm where parity computation is identical to triple reconstruction.

RAID triple parity (RTP) is a new algorithm for protecting against three-disk failures. It is an extension of the double-failure-correcting Row-Diagonal Parity code. For any number of data disks, RTP uses only three parity disks. This is optimal with respect to the amount of redundant information required and accessed. RTP uses XOR operations and stores all data un-encoded. The algorithm’s parity computation complexity is provably optimal. The decoding complexity is also much lower than that of existing comparable codes. This paper also describes a symmetric variant of the algorithm where parity computation is identical to triple reconstruction.
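The full RTP construction (row, diagonal, and anti-diagonal parity over a carefully sized stripe) is beyond a short sketch, but the XOR arithmetic it builds on can be shown with plain single row parity: the parity block is the XOR of the data blocks, and any one lost block is the XOR of the survivors. The toy stripe below illustrates only this simpler, single-failure case, not the RTP algorithm itself.

```python
# Simple XOR row parity over a toy three-disk stripe: NOT the RTP algorithm
# (RTP adds diagonal and anti-diagonal parity to tolerate three failures),
# just an illustration of the XOR arithmetic it is built on.
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"\x11\x22\x33", b"\x44\x55\x66", b"\x77\x88\x99"]  # 3 data "disks"
row_parity = xor_blocks(data)

# Lose one data disk: XOR of the survivors and the parity recovers it.
lost_index = 1
survivors = [d for i, d in enumerate(data) if i != lost_index]
recovered = xor_blocks(survivors + [row_parity])
assert recovered == data[lost_index]
```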

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 41-49

Resources

RTP_Goel.pdf

Model Building for Dynamic Multi-tenant Provider Environments

J. Basak, K. Wadhwani, K. Voruganti, S. Narayanamurthy, V. Mathur, and S. Nandi.

We present a machine-learning-based black-box modeling algorithm, M-LISP, that can predict system behavior in untrained regions for emerging multi-tenant and dynamic data center environments.

Increasingly, storage vendors are finding it difficult to leverage existing white-box and black-box modeling techniques to build robust system models that can predict system behavior in the emerging dynamic and multi-tenant data centers. White-box models are becoming brittle because the model builders are not able to keep up with the innovations in the storage system stack, and black-box models are becoming brittle because it is increasingly difficult to train the model a priori for the dynamic and multi-tenant data center environment. Thus, there is a need for innovation in the area of system model building.

In this paper we present a machine-learning-based black-box modeling algorithm called M-LISP that can predict system behavior in untrained regions for these emerging multi-tenant and dynamic data center environments. We have implemented and analyzed M-LISP in real environments, and the initial results look very promising. We also provide a survey of some common machine learning algorithms and how they fare with respect to satisfying the modeling needs of the new data center environments.
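M-LISP itself is not specified here, so the sketch below stands in with a generic black-box performance model: it predicts latency from a few workload features by interpolating over previously observed samples. The feature set, sample values, and nearest-neighbor approach are illustrative assumptions, not the paper's algorithm, whose distinguishing goal is sensible prediction in untrained regions.

```python
# Generic black-box performance model: predict latency from workload features
# (IOPS, read fraction, working-set size) by nearest-neighbor interpolation
# over observed samples. An illustrative stand-in, NOT the M-LISP algorithm.
import math

# Hypothetical training samples:
# (iops, read_fraction, working_set_gb) -> latency_ms
samples = [
    ((1000, 0.9, 10), 1.2),
    ((5000, 0.7, 50), 3.5),
    ((9000, 0.5, 200), 8.9),
]

def predict_latency(features, k=2):
    """Average the latencies of the k nearest observed workloads."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(samples, key=lambda s: dist(s[0], features))[:k]
    return sum(lat for _, lat in nearest) / len(nearest)

print(predict_latency((4000, 0.8, 40)))  # interpolated latency estimate
```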

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 20-31

Resources

Model_building_Basak.pdf

Space Savings and Design Considerations in Variable Length Deduplication

Giridhar Appaji Nag Yasa and P.C. Nagesh.

In this paper, we describe and evaluate a hybrid of variable-length and block-based deduplication that is hierarchical in nature.

The explosion of data growth and the duplication of data in enterprises have led to the deployment of a variety of deduplication technologies. However, not all deduplication technologies serve the needs of every workload. Most prior research in deduplication concentrates on fixed-block-size (or variable block size at a fixed block boundary) deduplication, which provides sub-optimal space efficiency in workloads where the duplicate data is not block aligned. Workloads also differ in the nature of operations and their priorities, thereby affecting the choice of the right flavor of deduplication. Object workloads, for instance, hold multiple versions of archived documents that have a high degree of duplicate data. They are also write-once, read-many in nature, follow a whole-object GET, PUT, and DELETE model, and would be better served by a deduplication strategy that handles non-block-aligned changes to data.

In this paper, we describe and evaluate a hybrid of variable-length and block-based deduplication that is hierarchical in nature. We are motivated by the following insights from real-world data: (a) object workload applications do not modify data in place, so new versions of objects are written again as a whole; (b) a significant amount of data is shared among different versions of the same object, but the changes are usually not block aligned. While the second insight is the basis for the variable-length technique, both insights motivate our hierarchical deduplication strategy.

We show through experiments with production datasets from enterprise environments that this approach provides up to twice the space savings of fixed-block deduplication.
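For readers unfamiliar with the variable-length approach, the sketch below shows generic content-defined chunking with a rolling hash: chunk boundaries follow the data rather than fixed block offsets, so a non-block-aligned insertion only perturbs nearby chunks. It is a minimal illustration of that general idea, not the hierarchical hybrid scheme evaluated in the paper; the window size, boundary mask, and hash are arbitrary choices.

```python
# Minimal content-defined chunking with a polynomial rolling hash. Boundaries
# depend on the content itself, so an unaligned insertion does not shift every
# later boundary. Illustrates the general variable-length idea only; NOT the
# hierarchical hybrid scheme described in the paper.
import hashlib

WINDOW = 16            # rolling-hash window, in bytes
MASK = (1 << 11) - 1   # boundary when the low 11 hash bits are zero (~2 KiB chunks)
PRIME = 31
POW_W = PRIME ** WINDOW

def chunks(data: bytes):
    h, start = 0, 0
    for i, byte in enumerate(data):
        h = h * PRIME + byte
        if i >= WINDOW:
            h -= data[i - WINDOW] * POW_W        # slide the window forward
        if (h & MASK) == 0 and i + 1 - start >= WINDOW:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]

def fingerprints(data: bytes):
    """A deduplicating store keeps each distinct chunk once, keyed by its hash."""
    return [hashlib.sha256(c).hexdigest() for c in chunks(data)]
```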

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 57-64

Resources

Variable_length-dedup_Giridhar.pdf

Glitz: Cross-Vendor Federated File Systems

Daniel Ellard, Craig Everhart, and Theresa Raj.

This paper gives a history of file system federation efforts and a detailed tour of Glitz and its benefits for vendors.

We propose Glitz, a system to integrate multiple file server federation regimes. NFS version 4 is a significant advance over prior versions of NFS, in particular because it specifies how NFS clients can navigate a large, multi-server namespace whose constituent parts may be replicated or moved while in use, under the direction of the NFS servers. This capability is essentially the same as that of previous distributed file systems such as AFS [7]. Sophisticated as this NFS capability is, it does not address the larger problem of building a usable system atop this basic capability. Multiple single-architecture solutions have been proposed, but each of these is based on an architecture for server federation that does not easily admit other members [3, 16, 18]. Glitz allows those multiple-server federations to interoperate and collaborate in a vendor-independent fashion. We give a history of file system federation efforts as well as a detailed tour of Glitz and its benefits for vendors.

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 4-10

Resources

Glitz_Ellard.pdf

Systems Research and Innovation in Data ONTAP

Scott Dawkins, Kaladhar Voruganti, and John D. Strunk.

This paper introduces the December 2012 issue of OSR, which highlights some of the research and innovation that have helped us stay at the forefront of technological changes.

Over the last 20 years, there have been many changes in the data storage industry. NetApp® products have kept pace and pushed the boundary in various areas. Staying at the forefront requires attentiveness to emerging technology trends and a disciplined approach to analyzing them. By understanding the trends and how they affect our customers, we can focus our efforts on delivering the best products possible. In this issue of OSR, we highlight some of the research and innovation that have helped us stay at the forefront of these technological changes.

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 1-3

Resources

DataONTAP_innovation_Dawkins2.pdf

Responding Rapidly to Service Level Violations Using Virtual Appliances

Lakshmi N. Bairavasundaram, Gokul Soundararajan, Vipul Mathur, Kaladhar Voruganti, and Kiran Srinivasan.

In this position paper, we propose dynamic instantiation of virtual machines with storage functionality as a mechanism to meet storage SLOs efficiently.

One of the key goals in the data center today is providing storage services with service-level objectives (SLOs) for performance metrics such as latency and throughput. Meeting such SLOs is challenging due to the dynamism observed in these environments. In this position paper, we propose dynamic instantiation of virtual appliances, that is, virtual machines with storage functionality, as a mechanism to meet storage SLOs efficiently.

In order for dynamic instantiation to be realistic for rapidly changing environments, it should be automated. Therefore, an important goal of this paper is to show that such automation is feasible. We do so through a caching case study. Specifically, we build the automation framework for dynamically instantiating virtual caching appliances. This framework identifies sets of interfering workloads that can benefit from caching, determines the cache-size requirements of workloads, non-disruptively migrates the application to use the cache, and warms the cache to quickly return to acceptable service levels. We show through an experiment that this approach addresses SLO violations while using resources efficiently.
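As a hedged sketch of what such automation might look like at the highest level, the loop below checks measured latency against an SLO and, on a violation, sizes, instantiates, and warms a caching appliance. Every helper it calls is a hypothetical placeholder; the paper's framework additionally analyzes workload interference, determines cache sizes, and performs the non-disruptive migration those placeholders stand in for.

```python
# Toy control loop for responding to a latency SLO violation by instantiating
# a virtual caching appliance. The helper callables are hypothetical
# placeholders standing in for the automation framework described in the paper.
import time

SLO_LATENCY_MS = 5.0

def control_loop(measure_latency, estimate_working_set_gb,
                 instantiate_cache_vm, migrate_workload, warm_cache,
                 poll_interval_s=60):
    while True:
        latency = measure_latency()
        if latency > SLO_LATENCY_MS:
            # Size the cache roughly to the workload's working set.
            cache_gb = estimate_working_set_gb()
            vm = instantiate_cache_vm(cache_size_gb=cache_gb)
            migrate_workload(vm)   # non-disruptive cut-over to the cache
            warm_cache(vm)         # prefetch hot data to restore service levels
        time.sleep(poll_interval_s)
```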

In ACM SIGOPS Operating Systems Review, Vol. 46, No. 3, December 2012, pp. 32-40

Resources

SLO_violation_Bairavasundaram.pdf