Hitchhikers need free vehicles! A call for open-source statistical analyses in software engineering

post

position statement

software tool

Let’s enable the replication of statistical analyses!

Author

Gregory M. Kapfhammer

Published

2016

Introduction

As evidenced by Arcuri and Briand’s paper “A Hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering”, the field of search-based software engineering (SBSE) relies on statistical methods to support the empirical comparison of different techniques. Yet the source code for these statistical analyses is often bespoke and frequently unavailable, which means that other researchers can neither replicate the analyses nor learn from the project.

Vehicles

As a means for improving the maturity of the data analysis methods used in the SBSE field, I think that it would be useful if there were shared repositories of well-documented statistical analysis code and replication data. That is, the SBSE community would advance if its “hitchhikers” had access to “free vehicles” in the form of GitHub repositories containing the data sets and statistical analysis code used for published papers.
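To make the idea of a “free vehicle” concrete, here is a minimal sketch of the kind of small, documented analysis function that such a repository might contain. It computes the Vargha-Delaney effect size, one of the measures discussed in Arcuri and Briand’s guide. The function name and the sample scores are hypothetical and used only for illustration; this is not code from any of the papers mentioned in this post.

```python
from itertools import product


def vargha_delaney_a12(first, second):
    """Estimate the probability that a value from `first` exceeds one from `second`.

    Ties count as half a win. A result of 0.5 suggests no difference
    between the two samples.
    """
    if not first or not second:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    # Compare every observation in the first sample with every one in the second.
    for x, y in product(first, second):
        if x > y:
            wins += 1.0
        elif x == y:
            wins += 0.5
    return wins / (len(first) * len(second))


# Hypothetical scores from repeated runs of two randomized techniques.
technique_a = [0.91, 0.88, 0.93, 0.90, 0.89]
technique_b = [0.85, 0.88, 0.84, 0.87, 0.86]

a12 = vargha_delaney_a12(technique_a, technique_b)
print(f"A12 = {a12:.2f}")
```

Publishing a function like this alongside the raw data from the experiments would let another researcher rerun the comparison and check the reported effect size instead of reimplementing it from a description in the paper.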

Resources

To learn more about the benefits of using shared repositories of statistical code in SBSE, you can read the suggestions in (Kapfhammer, McMinn, and Wright 2016) for improving the study of data arising from experiments with randomized algorithms. If you would like to examine the source code of that paper, then you can visit its GitHub repository at gkapfham/sbst2016-paper. Do you have ideas about how the SBSE community should create, share, and apply statistical software? If so, then please contact me with your thoughts!

Further Details

Interested in learning more about this topic? Since this blog post was written, my colleagues, students, and I have published (McMinn, Kapfhammer, and Wright 2016) and released a replication package for it as well. If you are interested in replicating the analyses in that paper, then I encourage you to visit gkapfham/vmutation-replicate on GitHub. You can learn more about the other replication packages that we’ve published by checking the software page.

References

Kapfhammer, Gregory M., Phil McMinn, and Chris J. Wright. 2016. “Hitchhikers Need Free Vehicles! Shared Repositories for Statistical Analysis in SBST.” In Proceedings of the 9th International Workshop on Search-Based Software Testing.

McMinn, Phil, Gregory M. Kapfhammer, and Chris J. Wright. 2016. “Virtual Mutation Analysis of Relational Database Schemas.” In Proceedings of the 11th International Workshop on Automation of Software Test.