Peter Macko, Xiongzi Ge, James Kelley, David Slik, Keith A. Smith and Maxim G. Smith, NetApp, Inc.; John Haskins Jr., Qualcomm
33rd International Conference on Massive Storage Systems and Technology (MSST 2017)
May 15 – May 19, 2017, Santa Clara, CA
Shingled magnetic recording (SMR) increases the capacity of magnetic hard drives, but it requires that each zone of a disk be written sequentially and erased in bulk. This makes SMR a good fit for workloads dominated by large data objects with limited churn. To explore this possibility, we have developed SMORE, an object storage system designed to reliably and efficiently store large, seldom-changing data objects on an array of host-managed or host-aware SMR disks.
SMORE uses a log-structured approach to accommodate the constraint that all writes to an SMR drive must be sequential within large shingled zones. It stripes data across zones on separate disks, using erasure coding to protect against drive failure. A separate garbage collection thread reclaims space by migrating live data out of the emptiest zones so that they can be trimmed and reused. An index stored on flash and backed up to the SMR drives maps object identifiers to on-disk locations. SMORE interleaves log records with object data within SMR zones to enable index recovery after a system crash (or failure of the flash device) without any additional logging mechanism.
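The space-reclamation step described above (pick the emptiest zone, migrate its live data, then reset the zone) can be illustrated with a minimal sketch. This is not SMORE's implementation: the `Zone` class, the `collect_emptiest` function, and the plain dictionary used as an index are hypothetical names introduced here, and the sketch ignores striping, erasure coding, and the interleaved log records.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical in-memory model of one append-only SMR zone."""
    zone_id: int
    capacity: int                               # bytes the zone can hold
    live: dict = field(default_factory=dict)    # object id -> live bytes
    written: int = 0                            # write pointer, in bytes

    def append(self, obj_id, size):
        # Writes are only ever appended at the write pointer.
        if self.written + size > self.capacity:
            raise ValueError("zone full")
        self.live[obj_id] = size
        self.written += size

    def live_bytes(self):
        return sum(self.live.values())

    def reset(self):
        # Models erasing the whole zone in bulk so it can be reused.
        self.live.clear()
        self.written = 0


def collect_emptiest(zones, open_zone, index):
    """Greedy cleaning pass (an assumption, not the paper's exact policy).

    Chooses the written zone with the fewest live bytes, appends its live
    objects to `open_zone`, updates `index` (object id -> zone id), and
    resets the victim. Returns (victim zone id, bytes moved), or None if
    there is nothing to clean.
    """
    candidates = [z for z in zones if z is not open_zone and z.written > 0]
    if not candidates:
        return None
    victim = min(candidates, key=Zone.live_bytes)
    moved = 0
    for obj_id, size in list(victim.live.items()):
        open_zone.append(obj_id, size)
        index[obj_id] = open_zone.zone_id
        moved += size
    victim.reset()
    return victim.zone_id, moved
```

In this model, deleting an object only removes its entry from `live` and from the index; the bytes stay on disk until the zone is cleaned, which is why the emptiest zones are the cheapest ones to reclaim.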
SMORE achieves full disk bandwidth when ingesting data with a variety of object sizes and when reading large objects. Read performance declines for smaller object sizes where inter-object seek time dominates. With a worst-case pattern of random deletions, SMORE has a write amplification (not counting RAID parity) of less than 2.0 at 80% occupancy. By taking an index snapshot every two hours, SMORE recovers from crashes in less than a minute. More frequent snapshots allow faster recovery.
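To give the write amplification figure some intuition, the sketch below uses the usual definition (total bytes written divided by bytes written on behalf of the user) together with a textbook simplification for log-structured cleaning: if every cleaned zone is a fraction u live, cleaning it copies u of a zone and frees 1 - u of a zone for new data, giving an amplification of 1 / (1 - u). Both function names are hypothetical, and the steady-state model is an assumption for illustration, not the way the paper measures its result.

```python
def write_amplification(user_bytes, relocated_bytes):
    """Total bytes written divided by bytes the user asked to write."""
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return (user_bytes + relocated_bytes) / user_bytes


def steady_state_wa(victim_live_fraction):
    """Simplified model: every cleaned zone has the same live fraction u.

    Cleaning one zone relocates u of a zone and makes room for 1 - u of a
    zone of new user data, so the ratio of total to user writes is
    1 / (1 - u).
    """
    u = victim_live_fraction
    if not 0 <= u < 1:
        raise ValueError("live fraction must be in [0, 1)")
    return 1.0 / (1.0 - u)
```

Under this model, an amplification below 2.0 corresponds to cleaned zones that are, on average, less than half live, even though the array as a whole is 80% occupied.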
- The definitive version of the paper can be found at: http://storageconference.us/2017/Papers/ColdDataObjectStore.pdf.
- A longer version of the paper can be found at: https://arxiv.org/abs/1705.09701.