CoARC: Co-operative, Aggressive Recovery and Caching for Failures in Erasure Coded Hadoop

Pradeep Subedi, Ping Huang, and Tong Liu (Virginia Commonwealth University); Joseph Moore and Stan Skelton (NetApp, Inc.); Xubin He (Virginia Commonwealth University)

2016 International Conference on Parallel Processing (ICPP 2016)
Philadelphia, PA, USA

Cloud file systems like Hadoop have become the norm for handling big data because of their easy scaling and distributed storage layout. However, these systems are susceptible to failures, and data needs to be recovered when a failure is detected. During temporary failures, MapReduce jobs or file system clients satisfy read requests by performing degraded reads. We argue that the lack of sharing of recovered data during degraded reads, together with the recovery of only the requested data block, places a heavy strain on the system’s network resources and increases job execution time. To this end, we propose CoARC (Co-operative, Aggressive Recovery and Caching), a new data-recovery mechanism for unavailable data during degraded reads in distributed file systems. The main idea is to recover not only the data block that was requested but also the other temporarily unavailable blocks in the same stripe, and to cache them on a separate data node. We also propose an LRF (Least Recently Failed) cache replacement algorithm for such recovery caches. Finally, we show that CoARC significantly reduces network usage and job runtime in erasure-coded Hadoop.
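
The abstract only names the LRF policy, so here is a minimal Python sketch of the idea, with hypothetical class and method names that are not from the paper: the recovery cache evicts the block whose failure was detected longest ago, on the assumption that long-failed blocks are the most likely to have already been permanently re-replicated.

```python
import time

class LRFRecoveryCache:
    """Hypothetical sketch of a Least Recently Failed recovery cache.

    Blocks recovered during degraded reads are cached along with the
    timestamp at which their failure was detected; when the cache is
    full, the block whose failure is oldest is evicted first.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}     # block_id -> recovered block data
        self.failed_at = {}  # block_id -> failure-detection timestamp

    def insert(self, block_id, data, failure_time=None):
        if block_id not in self.blocks and len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[block_id] = data
        self.failed_at[block_id] = failure_time or time.monotonic()

    def get(self, block_id):
        # A hit spares the client a degraded read over the network.
        return self.blocks.get(block_id)

    def _evict(self):
        # Evict the least recently failed block (oldest failure timestamp).
        victim = min(self.failed_at, key=self.failed_at.get)
        del self.blocks[victim]
        del self.failed_at[victim]
```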

Think Global, Act Local: A Buffer Cache Design for Global Ordering and Parallel Processing in the WAFL File System

Peter Denz, Matthew Curtis-Maury, and Vinay Devadas (NetApp, Inc.)

2016 International Conference on Parallel Processing (ICPP 2016)
Philadelphia, PA, USA

Given the enormous disparity in access speeds between main memory and storage media, modern storage servers must leverage highly effective buffer cache policies to meet demanding performance requirements. At the same time, these page replacement policies need to scale efficiently with ever-increasing core counts and memory sizes, which necessitates parallel buffer cache management. However, the requirements of effectiveness and scalability are at odds, because centralized processing does not scale with more processors, while parallel policies are a challenge to implement with maximum effectiveness. We have overcome this difficulty in the NetApp® Data ONTAP® WAFL® file system by using a sophisticated technique that simultaneously allows global buffer prioritization and parallel management operations. In addition, we have extended the buffer cache to provide soft isolation of different workloads’ buffer cache usage, which is akin to buffer cache quality of service (QoS). This paper presents the design and implementation of these significant extensions to the buffer cache of a high-performance commercial file system.
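
The paper does not spell out its data structures here, but a small Python sketch can illustrate what soft isolation under a single global ordering might look like: every buffer lives in one global recency ordering, and buffers belonging to a workload that exceeds its soft quota are made to look older to that ordering, so they are evicted preferentially without imposing a hard cap. The class, the quota units, and the penalty formula are illustrative assumptions, not WAFL's implementation.

```python
import itertools

class SoftIsolatedCache:
    """Illustrative sketch: one global eviction ordering combined with
    per-workload soft quotas (quotas expressed as buffer counts)."""

    def __init__(self, capacity, quotas):
        self.capacity = capacity
        self.quotas = quotas            # workload -> soft quota (buffers)
        self.usage = {w: 0 for w in quotas}
        self.clock = itertools.count()  # global access ordering
        self.buffers = {}               # key -> (workload, last_access)

    def access(self, key, workload):
        if key not in self.buffers and len(self.buffers) >= self.capacity:
            self._evict()
        if key not in self.buffers:
            self.usage[workload] += 1
        self.buffers[key] = (workload, next(self.clock))

    def _evict(self):
        def rank(item):
            _, (workload, last_access) = item
            over = self.usage[workload] / max(1, self.quotas[workload])
            # Buffers of over-quota workloads are aged faster, so they
            # lose out in the global ordering first. Isolation stays
            # soft: with no competition, a workload may exceed its quota.
            return last_access / max(1.0, over)

        victim, (workload, _) = min(self.buffers.items(), key=rank)
        del self.buffers[victim]
        self.usage[workload] -= 1
```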

StackMap: Low-Latency Networking with the OS Stack and Dedicated NICs

Kenichi Yasukata (Keio University); Michio Honda, Douglas Santry, and Lars Eggert (NetApp, Inc.)

2016 USENIX Annual Technical Conference
Denver, CO

StackMap leverages the best aspects of kernel-bypass networking in a new low-latency OS network service based on the full-featured kernel TCP implementation. It dedicates network interfaces to applications and offers an extended version of the netmap API as a zero-copy, low-overhead data path, while retaining the socket API for the control path. For small-message, transactional workloads, StackMap outperforms baseline Linux by 4 to 78% in latency and 42 to 133% in throughput. It also achieves performance comparable to Seastar, a highly optimized user-level TCP/IP stack that runs on top of DPDK.
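
StackMap itself works at the kernel/netmap level, but the split it describes can be miniaturized in Python as an analogy: the connection is set up through the ordinary socket control path, while payloads move through a preallocated buffer using recv_into and memoryview slices, so no per-message allocation or copy happens in user space. This sketch is only an analogy to the design, not StackMap's API.

```python
import socket

BUF_SIZE = 2048

def serve(host="127.0.0.1", port=9000):
    buf = bytearray(BUF_SIZE)  # preallocated buffer, reused for every message
    view = memoryview(buf)
    with socket.socket() as srv:
        # Control path: ordinary socket API for setup and teardown.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        conn, _ = srv.accept()
        with conn:
            # Data path: receive into the preallocated buffer and echo
            # back a slice of it, with no per-message copy in user space.
            while True:
                n = conn.recv_into(view)
                if n == 0:
                    break
                conn.sendall(view[:n])

if __name__ == "__main__":
    serve()
```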

The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments

Mingzhe Hao (University of Chicago); Gokul Soundararajan and Deepak Kenchammana-Hosekote (NetApp, Inc.); Andrew A. Chien and Haryadi S. Gunawi (University of Chicago)

14th USENIX Conference on File and Storage Technologies (FAST ’16)
Santa Clara, CA

We study storage performance in over 450,000 disks and 4,000 SSDs over 87 days, for an overall total of 857 million (disk) and 7 million (SSD) drive hours. We find that storage performance instability is not uncommon: 0.2% of the time, a disk is more than 2x slower than its peer drives in the same RAID group (0.6% for SSDs). As a consequence, disk- and SSD-based RAID groups experience at least one slow drive (i.e., a storage tail) 1.5% and 2.2% of the time, respectively. To understand the root causes, we correlate slowdowns with other metrics (workload I/O rate and size, drive events, age, and model). Overall, we find that slowdowns are caused primarily by the internal characteristics and idiosyncrasies of modern disk and SSD drives. We observe that storage tails can adversely impact RAID performance, motivating the design of tail-tolerant RAID. To the best of our knowledge, this work is the most extensive documentation of storage performance instability in the field.
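
As a sketch of the peer comparison behind these numbers, assuming (as the abstract describes) that a drive's slowdown is its latency relative to its peers in the same RAID group over the same window, the following Python flags a storage tail:

```python
from statistics import median

def slowdowns(group_latencies):
    """Each drive's latency divided by the median latency of its
    RAID-group peers in the same measurement window."""
    med = median(group_latencies)
    return [lat / med for lat in group_latencies]

def has_storage_tail(group_latencies, threshold=2.0):
    """A RAID group exhibits a storage tail in this window if at least
    one drive is more than `threshold`x slower than the group median."""
    return any(s > threshold for s in slowdowns(group_latencies))

# One laggard (12.4 ms vs. a 5.1 ms median) makes the group tail-bound.
print(has_storage_tail([5.1, 4.9, 5.0, 12.4, 5.2]))  # True
```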

The Private Lives of Disk Drives

Rajesh Sundaram

NetApp builds resiliency into its storage systems at every level to ensure that critical data is always protected, including technologies such as SnapMirror®, SnapVault®, and SnapRestore® that protect you from events ranging from sitewide disasters to user and application errors. NetApp also offers a unique degree of resiliency against problems that occur within disk drives themselves. This paper describes five of the most troublesome disk problems and the resiliency technologies that NetApp Engineering has developed to protect against them.
