Using Chronological Splitting to Compare Cross- and Single-company Effort Models: Further Investigation

Lokan, C. and Mendes, E.

    Numerous studies have used historical datasets to build and validate models for estimating software development effort. Very few used a chronological split (where projects' end dates are used so that training sets only contain projects that were completed before the start date of each project in the validation set), and only one compared chronological splitting to random splitting. The aim of this study is therefore to investigate further and compare the use of chronological and random splitting. We do so in the context of comparing cross-company and single-company models for effort estimation. We used 450 single-company projects and 741 cross-company projects from the ISBSG Release 10 repository, and estimates were obtained using manual stepwise regression. We found that, with these data, the use of chronological splitting and the choice of splitting date did not affect prediction accuracy. We were not able to obtain a converging set of findings when comparing cross- to single-company predictions, given that different accuracy measures presented contradictory results.
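    The chronological split described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the `Project` record, the `chronological_training_set` helper, and the four sample projects are hypothetical and chosen only to show the rule that a training set may contain only projects completed before the validation project started.

```python
from datetime import date
from typing import NamedTuple


class Project(NamedTuple):
    # Hypothetical record layout; field names are assumptions for illustration.
    name: str
    start: date
    end: date
    effort_hours: float


def chronological_training_set(projects, target):
    """Return the projects that were completed before `target` started."""
    return [p for p in projects if p.end < target.start]


# Invented sample data, not taken from the ISBSG repository.
projects = [
    Project("A", date(2001, 1, 10), date(2001, 6, 30), 1200.0),
    Project("B", date(2001, 5, 1), date(2002, 2, 15), 3400.0),
    Project("C", date(2002, 1, 5), date(2002, 9, 1), 800.0),
    Project("D", date(2002, 10, 1), date(2003, 3, 20), 2100.0),
]

# Training set available when estimating project D.
target = projects[3]
training = chronological_training_set(projects, target)
training_names = [p.name for p in training]
print(training_names)
```

    In this sketch, project B is excluded from the training set for project C because B was still in progress when C started. A random split would ignore that ordering and could train on projects that finished after the one being estimated.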
Cite as: Lokan, C. and Mendes, E. (2009). Using Chronological Splitting to Compare Cross- and Single-company Effort Models: Further Investigation. In Proc. Thirty-Second Australasian Computer Science Conference (ACSC 2009), Wellington, New Zealand. CRPIT, 91. Mans, B., Ed. ACS. 35-42.