Predictive Model of Insolvency Risk for Australian Corporations

Baxter, R., Gawler, M. and Ang, R.

    This paper describes the development of a predictive model of corporate insolvency risk in Australia. The model-building methodology is empirical, with out-of-sample test sets drawn from future years. The regression method is logistic regression, applied after pre-processing that quantises interval (numeric) attributes. We show that logistic regression matches the performance of ensemble methods, such as random forests and AdaBoost, provided that pre-processing and variable selection are performed. A distinctive feature of the insolvency risk model described in this paper is its breadth: because we use income tax return data, we are able to risk score one million companies across all industries, all corporation types (public and private) and all sizes, as measured either by assets or by number of employees. This is an application paper that applies standard credit scoring methodology to a new data source. The contribution is to demonstrate that insolvency risk can be estimated using income tax return data. The corporate insolvency prediction model is still in development, so we welcome suggestions for improving the current methodology.
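The pipeline summarised in the abstract (quantise a numeric attribute, fit a logistic regression on the resulting bin indicators, then evaluate on a later year) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the bins are chosen, so equal-frequency (quantile) bins are assumed here; the data are synthetic; and every name (`quantile_edges`, `one_hot`, `fit_logistic`, `make_year`, `score`) is hypothetical.

```python
import math
import random

def quantile_edges(values, n_bins):
    """Cut points for equal-frequency bins (an assumed quantisation scheme)."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]

def one_hot(x, edges):
    """Indicator vector marking which bin the numeric value x falls into."""
    vec = [0.0] * (len(edges) + 1)
    vec[sum(1 for e in edges if x >= e)] = 1.0
    return vec

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(row, w, b):
    """Predicted probability of the positive class for one feature row."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, row)))

def fit_logistic(rows, labels, epochs=200, lr=0.5):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n = len(rows)
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        errs = [predict(r, w, b) - y for r, y in zip(rows, labels)]
        w = [wj - lr * sum(e * r[j] for e, r in zip(errs, rows)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b

def make_year(rng, n):
    """Synthetic 'year' of data: one numeric attribute and a 0/1 outcome.
    The outcome rates (0.05 and 0.4) are invented for illustration only."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if rng.random() < (0.4 if x > 0.5 else 0.05) else 0 for x in xs]
    return xs, ys

def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
train_x, train_y = make_year(rng, 2000)   # stands in for the modelling year
test_x, test_y = make_year(rng, 2000)     # stands in for a future year

# Bin edges are learned on the training year only, then reused on the test year.
edges = quantile_edges(train_x, n_bins=4)
w, b = fit_logistic([one_hot(x, edges) for x in train_x], train_y)

def score(x):
    """Risk score for a single attribute value under the fitted model."""
    return predict(one_hot(x, edges), w, b)

test_auc = auc([score(x) for x in test_x], test_y)
print("out-of-sample AUC:", round(test_auc, 3))
```

Quantising before fitting lets a linear model represent a non-linear, even non-monotone, relationship between an attribute and the outcome, which is one plausible reason a pre-processed logistic regression can keep pace with tree ensembles.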
Cite as: Baxter, R., Gawler, M. and Ang, R. (2007). Predictive Model of Insolvency Risk for Australian Corporations. In Proc. Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, Australia. CRPIT, 70. Christen, P., Kennedy, P. J., Li, J., Kolyshkina, I. and Williams, G. J., Eds. ACS. 21-28.