
Neural Trees: Using Neural Networks as an Alternative to Binary Comparison in Classical Search Trees

Douglas Santry, NetApp

12th USENIX Workshop on Hot Topics in Storage and File Systems
July 2020
Online virtual conference

Binary comparison, the basis of the venerable B Tree, is perhaps the most successful operator for indexing data on secondary storage. We introduce a different technique, called Neural Trees, that is based on neural networks. Neural Trees increase the fan-out per byte of a search tree by up to 40% compared to B Trees. Increasing fan-out reduces memory demands and leads to increased cacheability while decreasing height and media accesses. A Neural Tree also permits search path layout policies that divorce a key’s value from its physical location in a data structure. This is an advantage over the total ordering required by binary comparison, which totally determines the physical location of keys in a tree. Previous attempts to apply machine learning to indices are based on learning the data directly, which renders insertion too expensive to be supported. The Neural Tree is a hybrid scheme using a tree of small neural networks to learn search paths instead of the data directly. Neural Trees can efficiently handle a general read/write workload. We evaluate Neural Trees with weeks of traces from production storage and SPC1 workloads to demonstrate their viability.
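To make the contrast in the abstract concrete, here is a minimal sketch of the two ways an interior node can pick a child. It is not the paper's implementation: the abstract does not describe the network architecture or how it is trained, so the `TinyRouter` class, its layer sizes, the 32-bit key assumption, and the random (untrained) weights are all hypothetical. The only point is the shape of the lookup: a B Tree node derives the child from key order, while a neural node derives it from a learned function of the key.

```python
import bisect
import math
import random


def btree_route(separators, key):
    """Classical routing: binary comparison against sorted separator keys."""
    return bisect.bisect_right(separators, key)


class TinyRouter:
    """Hypothetical stand-in for a per-node network (one hidden layer).

    The weights are random here; a real node would be trained so that
    each key is routed to the child that actually holds it.
    """

    def __init__(self, fan_out, hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(hidden)]
                   for _ in range(fan_out)]

    def route(self, key):
        x = key / 2**32  # assumption: keys are unsigned 32-bit integers
        h = [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]
        scores = [sum(wi * hi for wi, hi in zip(row, h)) for row in self.w2]
        # The child index is the argmax of the output scores, so it need
        # not follow the numeric order of the keys.
        return max(range(len(scores)), key=scores.__getitem__)


child_by_comparison = btree_route([10, 20, 30], 25)  # 2: between 20 and 30
child_by_model = TinyRouter(fan_out=4).route(25)     # some index in 0..3
```

Because the model, not the key order, chooses the child, where a key lives is a matter of layout policy. The sketch omits training, insertion, and handling of mispredictions entirely.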


PracExtractor: Extracting Configuration Good Practices from Manuals to Detect Server Misconfigurations

Chengcheng Xiang and Haochen Huang, University of California, San Diego;
Andrew Yoo, University of Illinois at Urbana-Champaign; Yuanyuan Zhou,
University of California, San Diego; Shankar Pasupathy, NetApp

2020 USENIX Annual Technical Conference
July 15–17, 2020
Online virtual conference

Configuration has become increasingly complex and error-prone in today’s server software. To mitigate this problem, software vendors provide user manuals to guide system admins on configuring their systems. Usually, manuals describe not only the meaning of configuration parameters but also good practice recommendations on how to configure certain parameters. Unfortunately, manuals are usually long and time-consuming for humans to read and understand. Therefore, system admins often do not refer to manuals but rely on their own guesswork or unreliable sources when setting up systems, which can lead to configuration errors and system failures.

To understand the characteristics of configuration recommendations in user manuals, this paper first collected and studied 261 recommendations from the manuals of six large open-source systems. Our study shows that 60% of the studied recommendations describe specific and checkable specifications instead of merely general guidance. Moreover, almost all (97%) of such specifications have not been checked in the systems’ source code, and 61% of them are not equivalent to the default settings. This implies that additional checking is needed to ensure the recommendations are correctly applied.

Based on our study, we build a tool called PracExtractor, which employs Natural Language Processing (NLP) techniques to automatically extract configuration recommendations from software manuals, converts them into specifications, and then uses the generated specifications to detect violations in system admins’ configuration settings. We evaluate PracExtractor with twelve widely deployed software systems, including one large commercial system from a public company. In total, PracExtractor automatically extracts 338 recommendations and generates 173 specifications with reasonable accuracy. With these generated specifications, PracExtractor detects 1423 good practice violations in open-source Docker images. To date, we have reported 325 violations, and 47 of them have been confirmed as real configuration issues by admins from different organizations.
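The final step of that pipeline, checking a configuration against generated specifications, can be illustrated with a small sketch. This is not PracExtractor's code or its specification format, which the abstract does not describe. The `Spec` record, the operator table, the parameter names, and the example manual sentences are all hypothetical, and the NLP extraction step is skipped: the specifications are written by hand.

```python
import operator
from dataclasses import dataclass

# Hypothetical set of comparison operators a specification may use.
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


@dataclass(frozen=True)
class Spec:
    param: str      # configuration parameter the recommendation is about
    op: str         # key into OPS
    value: object   # recommended bound or setting
    source: str     # manual sentence the spec was derived from


def find_violations(config, specs):
    """Return the specs that the given configuration does not satisfy."""
    violations = []
    for spec in specs:
        if spec.param not in config:
            continue  # unset parameter: this sketch does not check defaults
        if not OPS[spec.op](config[spec.param], spec.value):
            violations.append(spec)
    return violations


# Hand-written, made-up specifications and settings for illustration only.
specs = [
    Spec("max_connections", "<=", 1000,
         "It is recommended to keep max_connections at or below 1000."),
    Spec("ssl", "==", "on",
         "We recommend enabling ssl in production."),
]
config = {"max_connections": 5000, "ssl": "on"}

for v in find_violations(config, specs):
    print(f"{v.param}={config[v.param]!r} violates: {v.source}")
```

Keeping the originating sentence on each specification lets a violation report point the admin back to the manual text. Handling of defaults, units, and relations between parameters is left out of the sketch.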


Re-designing Enterprise Storage Systems for FLASH Memory

Jiri Schindler

This presentation includes examples of Flash in the enterprise, the future of ESS architectures with Flash, and pertinent research challenges.

Invited talk at the International Workshop on Operating System Support for Next Generation Large Scale NVRAM (NVRAMOS ’09), organized by KIISE, October 19–21, 2009, Jeju, Korea.


  • A copy of the presentation is attached to this posting.