Neural Trees: Using Neural Networks as an Alternative to Binary Comparison in Classical Search Trees

Douglas Santry, NetApp

12th USENIX Workshop on Hot Topics in Storage and File Systems
July 2020
Online virtual conference

Binary comparison, the basis of the venerable B Tree, is perhaps the most successful operator for indexing data on secondary storage. We introduce a different technique, called Neural Trees, that is based on neural networks. Neural Trees increase the fan-out per byte of a search tree by up to 40% compared to B Trees. Increasing fan-out reduces memory demands and leads to increased cacheability while decreasing height and media accesses. A Neural Tree also permits search path layout policies that divorce a key’s value from its physical location in a data structure. This is an advantage over the total ordering required by binary comparison, which totally determines the physical location of keys in a tree. Previous attempts to apply machine learning to indices are based on learning the data directly, which renders insertion too expensive to be supported. The Neural Tree is a hybrid scheme using a tree of small neural networks to learn search paths instead of the data directly. Neural Trees can efficiently handle a general read/write workload. We evaluate Neural Trees with weeks of traces from production storage and SPC1 workloads to demonstrate their viability.

Resources