CoLUA: Automatically Predicting Configuration Bug Reports and Extracting Configuration Options
Wei Wen, Tingting Yu, and Jane Huffman Hayes
Configuration bugs are among the dominant causes of software failures.
Software organizations often use bug-tracking systems to manage bug reports
collected from developers and users. To understand and reproduce a configuration
bug, developers must first know whether the reported bug is related to configuration
issues; this is often hard to discern because bug reports lack easy-to-spot
terminology. In addition, to locate and fix a
configuration bug, a developer needs to know which configuration options are associated
with the bug. To address these two problems, we introduce CoLUA, a two-step automated
approach that combines natural language processing, information retrieval,
and machine learning. In the first step, CoLUA selects features from the textual
information in the bug reports, and uses various machine learning techniques to
build classification models; developers can use these models to label a bug
report as either a configuration bug report or a non-configuration bug report.
In the second step, CoLUA identifies which configuration options are involved
in the labeled configuration bug reports. We evaluate CoLUA on 900 bug reports
from three large open source software systems. The results show that CoLUA predicts
configuration bug reports with high accuracy and that it effectively identifies
the configuration options that are the root causes of the reported bugs.
The data and programs can be downloaded at this link
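
The two steps summarized in the abstract can be illustrated with a minimal sketch. It is not CoLUA's implementation: the abstract does not say which features or learners are used, so a bag-of-words naive Bayes classifier stands in for step one, and exact token overlap with option names stands in for the retrieval in step two. The training reports, labels, option names, and the helpers `tokenize`, `NaiveBayes`, and `rank_options` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (assumed learner)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training reports
        self.vocab = set()

    def fit(self, reports, labels):
        for text, label in zip(reports, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def rank_options(report, options):
    """Rank options by the fraction of their name tokens found in the report."""
    report_tokens = set(tokenize(report))
    scored = []
    for opt in options:
        opt_tokens = set(tokenize(opt))
        if not opt_tokens:
            continue
        overlap = len(opt_tokens & report_tokens) / len(opt_tokens)
        if overlap > 0:
            scored.append((overlap, opt))
    # Highest overlap first; ties broken alphabetically.
    return [opt for _, opt in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical labeled bug reports (not taken from the paper's data set).
train = [
    ("Server crashes after setting max_connections option in config file", "config"),
    ("Changing the cache size setting in httpd.conf has no effect", "config"),
    ("Wrong default value used when timeout option is enabled", "config"),
    ("Null pointer exception when parsing malformed XML input", "non-config"),
    ("Memory leak in image rendering loop", "non-config"),
    ("Button label overlaps icon on the login page", "non-config"),
]
clf = NaiveBayes().fit([text for text, _ in train], [label for _, label in train])

# Hypothetical option names and a hypothetical new report.
options = ["KeepAliveTimeout", "MaxKeepAliveRequests", "Timeout", "max_connections"]
new_report = "Crash when the KeepAliveTimeout option is set to zero in httpd.conf"

label = clf.predict(new_report)                                   # step 1: classify
involved = rank_options(new_report, options) if label == "config" else []  # step 2
print(label, involved)
```

Exact token matching misses options that a report describes only in paraphrase, which is one reason an approach like the one described combines natural language processing and information retrieval rather than relying on string matching alone.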