Subsequence time series (STS) clustering is a popular
technique that has been widely used in the data mining
and wider communities. It was recently concluded that the
technique is meaningless, based on the findings that it produces
(a) clustering outcomes for distinct time series that are not
distinguishable from one another, and (b) cluster centroids
that are smoothed. More recent work has since shown that
(a) could be solved by introducing a lag in the subsequence
vector construction process; however, we show in this paper
that such an approach does not solve (b). Motivating the
terminology that a clustering method which overcomes (a)
is meaningful, while one which overcomes (a) and (b) is
useful, we propose an approach that produces useful time
series clustering. The approach is based on restricting the
clustering space to extend only over the region visited by
the time series in the subsequence vector space. We test
the approach on a set of 12 diverse real-world and synthetic
data sets and find that (a) one can distinguish between the
clusterings of these time series, and (b) the centroids
produced in each case retain the character of the underlying
series from which they came.
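
As an illustration of the subsequence vector construction mentioned above, the following is a minimal sketch only, not the paper's implementation. The function name `subsequence_vectors` and the parameters `width` and `lag` are hypothetical names chosen for this example. With `lag=1` it produces ordinary sliding-window subsequences; with a larger lag, each vector samples the series at spaced-out time steps.

```python
def subsequence_vectors(series, width, lag=1):
    """Build subsequence vectors (x[t], x[t+lag], ..., x[t+(width-1)*lag]).

    Hypothetical helper for illustration: lag=1 gives plain sliding
    windows, lag>1 gives the lagged construction.
    """
    span = (width - 1) * lag  # distance from first to last sampled index
    return [
        [series[t + k * lag] for k in range(width)]
        for t in range(len(series) - span)
    ]


series = [0, 1, 2, 3, 4, 5, 6, 7]

# Plain sliding-window subsequences of width 3
plain = subsequence_vectors(series, width=3)

# Lagged subsequences: same width, samples taken 2 steps apart
lagged = subsequence_vectors(series, width=3, lag=2)
```

The resulting vectors could then be passed to any clustering routine; how the clustering space is restricted to the region visited by the series is specific to the paper and is not reproduced here.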
Cite as: Chen, J. (2007). Useful Clustering Outcomes from Meaningful Time Series Clustering. In Proc. Sixth Australasian Data Mining Conference (AusDM 2007), Gold Coast, Australia. CRPIT, 70. Christen, P., Kennedy, P. J., Li, J., Kolyshkina, I. and Williams, G. J., Eds. ACS. 101-109.