Tag Archives: configuration

PracExtractor: Extracting Configuration Good Practices from Manuals to Detect Server Misconfigurations

Chengcheng Xiang and Haochen Huang, University of California, San Diego;
Andrew Yoo, University of Illinois at Urbana-Champaign; Yuanyuan Zhou,
University of California, San Diego; Shankar Pasupathy, NetApp

2020 USENIX Annual Technical Conference.
July 15–17, 2020
Online virtual conference

Configuration has become increasingly complex and error-prone in today’s server software. To mitigate this problem, software vendors provide user manuals to guide system admins in configuring their systems. Manuals usually describe not only the meaning of configuration parameters but also good practice recommendations on how to set certain parameters. Unfortunately, manuals also tend to be very long, which makes them time-consuming to read and understand. As a result, system admins often do not refer to manuals but rely on their own guesswork or on unreliable sources when setting up systems, which can lead to configuration errors and system failures.

To understand the characteristics of configuration recommendations in user manuals, this paper first collected and studied 261 recommendations from the manuals of six large open-source systems. Our study shows that 60% of the studied recommendations describe specific and checkable specifications instead of merely general guidance. Moreover, almost all (97%) of such specifications have not been checked in the systems’ source code, and 61% of them are not equivalent to the default settings. This implies that additional checking is needed to ensure the recommendations are correctly applied.

Based on our characteristic study, we build a tool called PracExtractor, which employs Natural Language Processing (NLP) techniques to automatically extract configuration recommendations from software manuals, converts them into specifications, and then uses the generated specifications to detect violations in system admins’ configuration settings. We evaluate PracExtractor with twelve widely deployed software systems, including one large commercial system from a public company. In total, PracExtractor automatically extracts 338 recommendations and generates 173 specifications with reasonable accuracy. With these generated specifications, PracExtractor detects 1423 good practice violations in open-source Docker images. To date, we have reported 325 violations, and 47 of them have been confirmed as real configuration issues by admins from different organizations.
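
The last stage described above, checking a configuration against generated specifications, can be illustrated with a minimal sketch. This is not PracExtractor's implementation: the Spec format, the find_violations helper, the parameter names, and the example sentences are all hypothetical, and the NLP extraction step is assumed to have already produced the specifications.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical specification format; the paper's actual representation may differ.
@dataclass(frozen=True)
class Spec:
    param: str    # configuration parameter the recommendation refers to
    op: str       # comparison the recommended setting must satisfy
    value: float  # recommended bound or value
    source: str   # manual sentence the specification was derived from

# Map comparison symbols to the corresponding standard-library functions.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}

def find_violations(config: Dict[str, float], specs: List[Spec]) -> List[Spec]:
    """Return every specification that the given configuration violates."""
    violations = []
    for spec in specs:
        if spec.param not in config:
            # Parameter is not set explicitly; this sketch does not resolve defaults.
            continue
        if not OPS[spec.op](config[spec.param], spec.value):
            violations.append(spec)
    return violations

# Invented parameters and sentences, for illustration only.
specs = [
    Spec("worker_threads", "<=", 64,
         "It is recommended to keep worker_threads at or below 64."),
    Spec("cache_size_mb", ">=", 512,
         "We recommend setting cache_size_mb to at least 512."),
]
config = {"worker_threads": 256, "cache_size_mb": 1024}

violations = find_violations(config, specs)
for v in violations:
    print(f"{v.param}={config[v.param]} violates '{v.op} {v.value}': {v.source}")
```

Keeping the source sentence alongside each specification is a design choice of this sketch: it lets a reported violation point back to the manual text that motivated it.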


PopCon: Mining Popular Software Configurations from Community

Rukma Talwadker, Deepti Aggarwal; NetApp Inc

2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
September 16–20, 2019

Software system configuration problems are fairly prevalent and continue to impair the reliability of the underlying system software. Configurations also play an important role in establishing the quality of the software. With every configuration “knob”, we delegate a responsibility to the user, and we may also make the software vulnerable to failures, poor performance, and other operational issues. Efforts to facilitate a healthy configuration can be summarized in the following steps: 1) gain knowledge about what defines a configuration; 2) operationalize a mechanism to mine popular or recommended configuration defaults; and 3) leverage insights to improve software quality or to speed up troubleshooting and fixing in the case of a software failure. Using PopCon, a tool that we built, we target all three aspects in a closed-loop fashion, focusing on NetApp's storage system software, ONTAP data management software. We learn popular configurations from the deployed community, evaluate active configurations, and deliver actionable information through this tool. Our findings have been encouraging. We can report that about 99% of our ONTAP software user community gravitates towards popular configuration values. Though about 20% of the configuration parameters initially need custom or user input, we have found that over a period of a few months, systems adopt these popular values. Also, there is a high correlation between the number of outstanding deviations from the popular values and the number of active support cases on these systems. Further, we have learned that for about 40% of the systems with support cases, deviations disappear at about the time of case closure. Finally, the PopCon capabilities presented here are simple to implement and operationalize in any software system.
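
The idea of mining popular values and counting deviations from them can be sketched in a few lines. This is only an illustration under simple assumptions, not PopCon's actual method: "popular" is taken here to mean the most frequent value of a parameter across the fleet, and the popular_values and deviations helpers, the parameter names, and the sample data are all invented.

```python
from collections import Counter
from typing import Dict, List

def popular_values(fleet: List[Dict[str, str]]) -> Dict[str, str]:
    """For each parameter, return the value set on the largest number of systems."""
    counts: Dict[str, Counter] = {}
    for system in fleet:
        for param, value in system.items():
            counts.setdefault(param, Counter())[value] += 1
    return {param: c.most_common(1)[0][0] for param, c in counts.items()}

def deviations(system: Dict[str, str], popular: Dict[str, str]) -> List[str]:
    """List the parameters on which one system differs from the popular value."""
    return sorted(p for p, v in system.items()
                  if p in popular and v != popular[p])

# Invented fleet data, for illustration only.
fleet = [
    {"compression": "on", "reserve_pct": "5"},
    {"compression": "on", "reserve_pct": "5"},
    {"compression": "off", "reserve_pct": "5"},
]

popular = popular_values(fleet)
for index, system in enumerate(fleet):
    print(index, deviations(system, popular))
```

The per-system deviation count produced this way is the kind of quantity that the abstract correlates with the number of active support cases.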

