Mapping Semantic Knowledge for Unsupervised Text Categorisation

Tao, X., Li, Y., Zhang, J. and Yong, J.

    Text categorisation is challenging because documents have a complex structure and contain heterogeneous, changing topics. The performance of text categorisation relies on the quality of samples, the effectiveness of document features, and the topic coverage of categories, and it depends on the strategy employed: supervised or unsupervised, single-labelled or multi-labelled. To address these reliability issues, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise the categorisation result. The conceptual framework of the approach consists of three modules: pattern mining for feature extraction, feature-subject mapping for categorisation, and concept generalisation for optimised categorisation. The approach has been evaluated by comparing it with typical text categorisation methods, based on ground truth encoded by human experts, with promising results.
Cite as: Tao, X., Li, Y., Zhang, J. and Yong, J. (2013). Mapping Semantic Knowledge for Unsupervised Text Categorisation. In Proc. Database Technologies 2013 (ADC 2013), Adelaide, Australia. CRPIT, 137. Wang, H. and Zhang, R. Eds., ACS. 51-60.