Vinay Devadas and Matthew Curtis-Maury, NetApp
Given continually increasing core counts, multiprocessor software scaling becomes critical. Applications that operate on hierarchical data are especially difficult to parallelize efficiently. In such applications, correct execution relies on all threads coordinating their accesses within the hierarchy. At the same time, high-performance execution requires that this coordination happen efficiently while maximizing parallelism.