Distance measures for effective clustering of ARIMA time-series
- 14 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 273-280
- https://doi.org/10.1109/icdm.2001.989529
Abstract
Much environmental and socioeconomic time-series data can be adequately modeled using autoregressive integrated moving average (ARIMA) models. We call such time series "ARIMA time series". We propose the use of the linear predictive coding (LPC) cepstrum for clustering ARIMA time series, by using the Euclidean distance between the LPC cepstra of two time series as their dissimilarity measure. We demonstrate that LPC cepstral coefficients have the desired features for accurate clustering and efficient indexing of ARIMA time series. For example, just a few LPC cepstral coefficients are sufficient to discriminate between time series that are modeled by different ARIMA models. In fact, this approach requires fewer coefficients than traditional approaches, such as DFT (discrete Fourier transform) and DWT (discrete wavelet transform). The proposed distance measure can be used for measuring the similarity between different ARIMA models as well. We cluster ARIMA time series using the "partition around medoids" method with various similarity measures. We present experimental results demonstrating that, using the proposed measure, we achieve significantly better clusterings of ARIMA time series data as compared to clusterings obtained by using other traditional similarity measures, such as DFT, DWT, and PCA (principal component analysis). Experiments were performed on both simulated and real data.
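The dissimilarity measure described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so the code below uses a commonly cited recursion that derives cepstral coefficients from the coefficients of an already fitted autoregressive model. It assumes the sign convention x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t, and the function names `lpc_cepstrum` and `cepstral_distance` are hypothetical, chosen for illustration only. Fitting the model itself (including any differencing for the integrated part) is outside the sketch.

```python
import math


def lpc_cepstrum(ar_coeffs, n_coeffs):
    """Return the first n_coeffs LPC cepstral coefficients of an AR(p) model.

    Assumed convention (not stated in the abstract):
        x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t
    Recursion used:
        c_1 = a_1
        c_n = a_n + sum_{m=1}^{n-1} (1 - m/n) a_m c_{n-m}   for 1 < n <= p
        c_n =       sum_{m=1}^{p}   (1 - m/n) a_m c_{n-m}   for n > p
    """
    p = len(ar_coeffs)
    c = []
    for n in range(1, n_coeffs + 1):
        # The direct term a_n only exists while n <= p.
        value = ar_coeffs[n - 1] if n <= p else 0.0
        # Sum over m = 1 .. min(n - 1, p); c[n - m - 1] holds c_{n-m}.
        for m in range(1, min(n - 1, p) + 1):
            value += (1.0 - m / n) * ar_coeffs[m - 1] * c[n - m - 1]
        c.append(value)
    return c


def cepstral_distance(ar_x, ar_y, n_coeffs=5):
    """Euclidean distance between the LPC cepstra of two fitted AR models."""
    cx = lpc_cepstrum(ar_x, n_coeffs)
    cy = lpc_cepstrum(ar_y, n_coeffs)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(cx, cy)))


if __name__ == "__main__":
    # Two hypothetical AR(2) models; the coefficients are made up for illustration.
    print(cepstral_distance([0.6, -0.2], [0.1, 0.3]))
```

For an AR(1) model with coefficient a, the recursion reduces to c_n = a^n / n, which gives a quick sanity check. The default of five coefficients is an arbitrary choice for this sketch, in the spirit of the abstract's remark that a few coefficients suffice. A matrix of such pairwise distances could then be passed to any medoid-based clustering routine.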