NetApp Faculty Fellowships (NFF) encourage leading-edge research in storage and data management and foster relationships between academic researchers and engineers. Please see below for a current list of ATG’s NFFs.

Jian Huang, University of Illinois at Urbana-Champaign

Hardware-Assisted Secure Flash-Based Storage

Modern storage systems have been developed for decades with the security-critical foundation provided by the operating system (OS). However, they are still vulnerable to malware attacks and software defects. Adversaries can obtain OS kernel privileges or leverage software vulnerabilities to bypass, terminate, or destroy current malware detection and defense systems. For instance, encryption ransomware accounts for more than half of all malware attacks today, but current software-based defense systems often leave victims unable to refuse ransom demands. Therefore, it is natural to utilize hardware techniques that have been proven effective in defending against malware attacks.

Eamonn Keogh, UC Riverside – August 2018

Time Series Snippets: A New Analytics Primitive with applications to IoT Edge Computing

While most of today’s always-connected tech devices take advantage of cloud computing, many Internet of Things (IoT) developers increasingly understand the benefits of doing more analytics on the devices themselves, a philosophy known as edge computing. By performing analytic tasks directly on the sensor, edge computing can drastically reduce the bandwidth, cloud processing, and cloud storage needed.

John Paparrizos, University of Chicago – August 2018

Accelerating Internet of Things Data Analytics through Scalable Time-Series Representation Learning

Kernel methods, a class of machine learning algorithms for pattern recognition, have shown a great deal of promise in the analysis of complex, real-world data. However, kernel methods remain largely unexplored in the analysis of time-varying measurements (i.e., time series), which are becoming increasingly prevalent across scientific disciplines, industrial settings, and Internet of Things (IoT) applications. Until now, research in time-series analysis has focused on designing methods for three components, namely, (i) representation methods; (ii) comparison functions; and (iii) indexing mechanisms. Unfortunately, these components have typically been investigated and developed independently, resulting in methods that are incompatible with each other. The lack of a unified approach has hindered progress towards scalable analytics over massive time-series collections.

Stratos Idreos, Harvard – January 2018

Faster, Cheaper, and Predictable NoSQL Storage and Analytics

NoSQL key-value stores are at the heart of numerous modern data-driven applications. Prof. Stratos Idreos’s research shows that all existing designs are sub-optimal. His group is working towards a completely new data system design, CrimsonDB, which 1) is 10x faster than existing systems and gets faster for bigger data, 2) requires fewer hardware resources (memory) to produce the same or better results, and 3) adapts automatically to new hardware and workloads, requiring no human-in-the-loop to tune the system.

Puru Kulkarni & Umesh Bellur (IIT-Bombay) – July 2017

Provisioning and managing SSD/NVM-based disk caching in derivative clouds

Flash-based non-volatile storage devices (SSDs) have been widely used for disk caching in data-center infrastructure setups. With virtualization-enabled hosting, SSD cache partitioning and provisioning is an important management question for resource controllers seeking to ensure disk I/O performance guarantees. While this problem has received considerable attention in the server-side cache management domain, the following gaps remain and we aim to explore them:

Peter Bailis (Stanford) – July 2017

Prioritizing Attention in High-Volume, High-Dimensional Data Streams

The rise of Big Data infrastructure has supercharged the collection of high volume, highly heterogeneous data. However, increasingly, this data is too big for any cost-effective manual inspection, and, in practice, much of this “fast data” is only accessed in exceptional cases (e.g., to debug a failure). As a result, important behaviors often go unnoticed, leading to inefficiency, wasted resources, and limited visibility into complex application deployments.

Joint NFF: Brown University & University of Maryland

Roberto Tamassia   

Joint NFF: UNC, Stony Brook & Rutgers – November 2016

Maintaining locality in a space-constrained file system

Donald Porter (UNC), Michael Bender and Rob Johnson (Stony Brook), and Martin Farach-Colton (Rutgers) have teamed up on a collaborative research project entitled “Maintaining locality in a space-constrained file system.”

Peter Desnoyers, Northeastern – November 2016

Improving File System Performance over SMR drives

SMR drives may be incorporated into the storage stack as drive-managed devices, via host-resident block translation layers, or through the use of SMR-specific log-structured file systems (LFSs), at an engineering cost ranging from modest (drive-managed) to very large (LFS). The first generation of drive-managed SMR devices has shown significant performance deficiencies when compared to conventional drives, but at this point little is known about how well SMR can perform with better translation algorithms or tuned file systems.

Steven Swanson, UCSD – November 2016

The Speed of Non-Volatile Main Memory and Capacity of Disk: A Fast, Strongly-Consistent File System for Heterogeneous Storage Stacks

Professor Steven Swanson’s group has built NOVA, a log-structured file system for hybrid volatile/non-volatile main memories. In this proposal, they would like to extend NOVA to provide tiering and/or caching capabilities that will allow it to combine the high performance of NVMMs with the cost-effective capacity of SSDs and hard drives. They plan to explore multiple approaches to allocating valuable NVMM to maximize performance, and to evaluate these approaches on a range of critical storage applications. They are also considering examining how the techniques they build into NOVA can allow other storage systems (e.g., “all-flash arrays,” object stores, and large file servers) to leverage NVMM as well.