13th Annual (2016) “Humies” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation

-------------------------------------------------------------------------------
1. Paper title:

“Multi-objective Software Effort Estimation”

-------------------------------------------------------------------------------
2. Authors' contact details:

(1) Federica Sarro
    Address: CREST, Department of Computer Science, University College London, Malet Place, London, WC1E 6BT, UK
    E-mail: f.sarro@ucl.ac.uk
    Phone number: +44 (0)207679037289

(2) Alessio Petrozziello
    Address: School of Computing, Buckingham Building, University of Portsmouth, Portsmouth, PO1 3HE, UK
    E-mail: alessio.petrozziello@port.ac.uk
    Phone number: +44 (0)2392846784

(3) Mark Harman
    Address: CREST, Department of Computer Science, University College London, Malet Place, London, WC1E 6BT, UK
    E-mail: mark.harman@ucl.ac.uk
    Phone number: +44 (0)2076791305

-------------------------------------------------------------------------------
3. Corresponding author:

Federica Sarro

-------------------------------------------------------------------------------
4. Paper abstract:

We introduce a bi-objective effort estimation algorithm that combines Confidence Interval Analysis and assessment of Mean Absolute Error. We evaluate our proposed algorithm against three different alternative formulations, baseline comparators, and current state-of-the-art effort estimators applied to five real-world datasets from the PROMISE repository, involving 724 different software projects in total. The results reveal that our algorithm outperforms the baseline, the state-of-the-art, and all three alternative formulations, statistically significantly (p < 0.001) and with large effect size (Â12 ≥ 0.9) over all five datasets.
We also provide evidence that our algorithm creates a new state-of-the-art, which lies within currently claimed industrial human-expert-based thresholds, thereby demonstrating that our findings have actionable conclusions for practicing software engineers.

-------------------------------------------------------------------------------
5. Criteria satisfied by the work:

(B) The result is better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal.
(D) The result is publishable in its own right as a new scientific result, independent of the fact that the result was mechanically created.
(E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.
(F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered.
(G) The result solves a problem of indisputable difficulty in its field.

-------------------------------------------------------------------------------
6. Why the result satisfies the criteria that the contestant claims:

Estimating the effort needed to produce a new software project is an indisputably challenging task for software practitioners. Although several formal/algorithmic approaches have been proposed in the literature to build accurate estimation models, human-expert estimation remains the dominant strategy for estimating software development effort in practice, mainly because there is no evidence that formal estimation models lead to more accurate estimates. However, expert judgment is not immune to imprecision: a survey of current industry practices reports that human-expert predictions of project effort lie between 30% and 40% of the true value [1].
In our work, we propose a novel multi-objective evolutionary approach that seeks to evolve robust and accurate estimation models by simultaneously maximising the estimation accuracy and minimising the uncertainty associated with the estimation model itself. We name this approach the Confidence Guided Effort Estimator (CoGEE). The primary novelty of our approach lies in its use of confidence intervals to guide a multi-objective evolutionary algorithm.

To assess the effectiveness of our proposal, we carried out a rigorous empirical study involving 724 real-world projects widely used in previous work on effort estimation. The results show that our new bi-objective effort estimation algorithm, CoGEE, significantly outperforms (often with large effect size) state-of-the-art techniques, as well as alternative single- and multi-objective approaches proposed in previous studies.

We also provide empirical evidence that our proposal achieves human-competitive results, while the same cannot be said for the state-of-the-art techniques. Specifically, we compare both the estimation error produced by CoGEE (and by the current state-of-the-art) and the budget overruns that would accrue from using them against two claimed thresholds for industrial best estimation practice (the evidence for these thresholds comes from a survey of current industry practices [1]). The results show that the median error of CoGEE comfortably lies between the claimed human-based industrial thresholds for all the projects considered in our study, thereby providing human-competitive results. By contrast, the state-of-the-art techniques (i.e., Case-Based Reasoning, Linear Regression, and Classification Tree) yield median expected overrun values that very often lie outside the currently claimed human-based industrial upper bound.
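The bi-objective idea described above can be sketched in code. The sketch below is only an illustration of the general technique, not the published CoGEE implementation: the power-law model form (effort = a * size^b), the toy dataset, the mutation-only acceptance scheme, and every parameter value are assumptions made for the example.

```python
import math
import random
import statistics

# Illustrative bi-objective search for an effort estimation model.
# Objective 1: Mean Absolute Error (MAE) of the model's estimates.
# Objective 2: width of a 95% confidence interval on the absolute errors
#              (a proxy for the uncertainty of the model itself).
# Both objectives are minimised simultaneously via Pareto dominance.

def objectives(individual, projects):
    """Return (MAE, 95% CI half-width of the absolute errors)."""
    a, b = individual
    errors = [abs(effort - a * size ** b) for size, effort in projects]
    mae = statistics.mean(errors)
    # Normal-approximation 95% confidence interval half-width.
    ci = 1.96 * statistics.stdev(errors) / math.sqrt(len(errors))
    return mae, ci

def dominates(p, q):
    """Pareto dominance: p is no worse in both objectives, better in one."""
    return all(x <= y for x, y in zip(p, q)) and any(x < y for x, y in zip(p, q))

def evolve(projects, pop_size=40, generations=100, seed=0):
    """Evolve (a, b) pairs; return the non-dominated front found."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.1, 5.0), rng.uniform(0.5, 1.5)) for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = []
        for a, b in pop:
            child = (a + rng.gauss(0, 0.1), b + rng.gauss(0, 0.05))
            # Accept the child unless its parent Pareto-dominates it.
            if dominates(objectives((a, b), projects), objectives(child, projects)):
                next_pop.append((a, b))
            else:
                next_pop.append(child)
        pop = next_pop
    scored = [(ind, objectives(ind, projects)) for ind in pop]
    return [ind for ind, s in scored
            if not any(dominates(t, s) for _, t in scored)]
```

A real multi-objective GA would use crossover and a front-ranking selection scheme (e.g. NSGA-II-style non-dominated sorting) rather than the per-lineage acceptance used here; the sketch keeps only the essential element: two objectives, accuracy and estimate uncertainty, compared by Pareto dominance.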
The above results are very encouraging, suggesting that CoGEE moves the median expected state-of-the-art performance within at least one, and sometimes both, human-based industrial thresholds.

[1] K. Moløkken and M. Jørgensen. A review of surveys on software effort estimation. In Procs. of ISESE’03, pages 223–230, 2003.

-------------------------------------------------------------------------------
7. Full citation of the paper:

Federica Sarro, Alessio Petrozziello, and Mark Harman. 2016. Multi-objective software effort estimation. In Proceedings of the 38th International Conference on Software Engineering (ICSE'16). ACM, New York, NY, USA, 619-630. DOI=http://dx.doi.org/10.1145/2884781.2884830

-------------------------------------------------------------------------------
8. Prize money:

Any prize money is to be divided among Federica Sarro (80%) and Alessio Petrozziello (20%).

-------------------------------------------------------------------------------
9. "Best" statement:

Although software development effort estimation is a widely researched field within Software Engineering, established more than 30 years ago, algorithmic predictive methods have barely been adopted in practice, mainly because of their inability to beat human experts. Moreover, the literature contains more than 15 years of work on the use of evolutionary approaches for software effort estimation. However, all previous work sought to produce point estimates; only two previous studies concerned multi-objective formulations of effort estimation, and they aimed to analyse the trade-off among different accuracy measures for the single overall objective of producing the most accurate point estimate (thereby ignoring the uncertainty of the estimate, which is the novelty of our work).
Our work is the first to provide evidence that multi-objective evolutionary approaches can be used to successfully build robust effort estimation models that outperform the state-of-the-art and provide effort estimates on a par with those of human experts. Our results also provide evidence that it is the multi-objective nature of our approach (the first ever to incorporate confidence interval analysis to guide the evolution of robust prediction models) that yields these breakthrough results. It is also worth mentioning that we carried out a rigorous and thorough empirical study to evaluate and compare the techniques investigated. This strengthens the validity of our work, which has been published in the top software engineering conference, ICSE 2016.

In summary, we find evidence to support the claim that our proposed estimation algorithm moves the state-of-the-art that can be expected from automated estimators within the bounds of current claims for human-expert industrial best practice. This may prove to be an important finding, because it provides evidence that our new multi-objective approach can advance the claimed industrial state-of-best-practice as well as the known scientific state-of-the-art.

-------------------------------------------------------------------------------
10. General type of genetic or evolutionary computation used:

Multi-Objective Genetic Algorithm (MOGA), Genetic Algorithm (GA)