Similarity preserving representation learning for time series analysis

Abstract: A considerable number of clustering algorithms take instance-feature matrices as their inputs. As such, they cannot directly analyze time series data due to its temporal nature, usually unequal lengths, and complex properties. This is a great pity since many of these algorithms are effective, robust, efficient, and easy to use. In this paper, we bridge this gap by proposing an efficient representation learning framework that is able to convert a set of time series with varying lengths to an instance-feature matrix. In particular, we guarantee that the pairwise similarities between time series are well preserved after the transformation, so the learned feature representation is particularly suitable for the time series clustering task. Given a set of n time series, we first construct an n×n partially-observed similarity matrix by randomly sampling O(n log n) pairs of time series and computing their pairwise similarities. We then propose an efficient algorithm that solves a non-convex and NP-hard problem to learn new features based on the partially-observed similarity matrix. By conducting extensive empirical studies, we show that the proposed framework is more effective, efficient, and flexible than other state-of-the-art time series clustering methods.
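The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which similarity measure or which solver is used, so everything specific here is an assumption made for illustration: the similarity is taken to be 1 / (1 + DTW distance), the features are fit by plain gradient descent on the squared error over the sampled entries, and the function names (`dtw_distance`, `sample_similarities`, `learn_features`, `observed_loss`) and parameters (`c`, `dim`, `lr`, `steps`) are hypothetical. It is not the paper's algorithm or its released software.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def sample_similarities(series, rng, c=2.0):
    """Observe roughly c * n * log(n) random pairs plus the diagonal.

    Assumption: similarity = 1 / (1 + DTW distance), so self-similarity is 1.
    Enumerating all pairs with triu_indices is fine for a small sketch but
    would itself cost O(n^2) memory on a large collection.
    """
    n = len(series)
    all_i, all_j = np.triu_indices(n, k=1)
    num_pairs = min(len(all_i), int(np.ceil(c * n * np.log(n))))
    chosen = rng.choice(len(all_i), size=num_pairs, replace=False)
    rows, cols = all_i[chosen], all_j[chosen]
    sims = np.array(
        [1.0 / (1.0 + dtw_distance(series[i], series[j])) for i, j in zip(rows, cols)]
    )
    diag = np.arange(n)
    return (
        np.concatenate([rows, diag]),
        np.concatenate([cols, diag]),
        np.concatenate([sims, np.ones(n)]),
    )


def observed_loss(X, rows, cols, sims):
    """Half the squared error between <x_i, x_j> and the observed similarities."""
    resid = np.sum(X[rows] * X[cols], axis=1) - sims
    return 0.5 * float(np.sum(resid ** 2))


def learn_features(n, rows, cols, sims, dim=5, lr=0.01, steps=3000, seed=0):
    """Fit X (n x dim) so that X @ X.T matches the observed entries.

    Assumption: plain gradient descent stands in for the paper's solver.
    """
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((n, dim))
    for _ in range(steps):
        resid = np.sum(X[rows] * X[cols], axis=1) - sims
        grad = np.zeros_like(X)
        # Each observed pair (i, j) contributes to the gradient of both rows.
        np.add.at(grad, rows, resid[:, None] * X[cols])
        np.add.at(grad, cols, resid[:, None] * X[rows])
        X -= lr * grad
    return X


# Toy data: eight series of unequal length (sine waves and linear ramps).
rng = np.random.default_rng(0)
series = [np.sin(np.linspace(0, 6, 20 + k)) for k in range(4)]
series += [np.linspace(0, 1, 22 + k) for k in range(4)]

rows, cols, sims = sample_similarities(series, rng)
X = learn_features(len(series), rows, cols, sims)
print(X.shape, observed_loss(X, rows, cols, sims))
```

Each row of `X` is a fixed-length feature vector for one time series, so the matrix can be handed to any clustering routine that expects an instance-feature input.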

Download: pdf

Citation

Similarity preserving representation learning for time series analysis
Q. Lei, J. Yi, R. Vaculin, L. Wu, I. Dhillon.
In International Joint Conference on Artificial Intelligence (IJCAI), pp. 2845-2851, August 2019.

Bibtex:
@inproceedings{lei2019similarity,
  author = "Qi Lei and Jinfeng Yi and Roman Vaculin and Lingfei Wu and Inderjit S. Dhillon",
  title = "Similarity preserving representation learning for time series analysis",
  booktitle = "International Joint Conference on Artificial Intelligence (IJCAI)",
  pages = "2845--2851",
  year = "2019",
  month = "aug",
  abstract = "A considerable number of clustering algorithms take instance-feature matrices as their inputs. As such, they cannot directly analyze time series data due to its temporal nature, usually unequal lengths, and complex properties. This is a great pity since many of these algorithms are effective, robust, efficient, and easy to use. In this paper, we bridge this gap by proposing an efficient representation learning framework that is able to convert a set of time series with varying lengths to an instance-feature matrix. In particular, we guarantee that the pairwise similarities between time series are well preserved after the transformation, so the learned feature representation is particularly suitable for the time series clustering task. Given a set of n time series, we first construct an n×n partially-observed similarity matrix by randomly sampling O(n log n) pairs of time series and computing their pairwise similarities. We then propose an efficient algorithm that solves a non-convex and NP-hard problem to learn new features based on the partially-observed similarity matrix. By conducting extensive empirical studies, we show that the proposed framework is more effective, efficient, and flexible than other state-of-the-art time series clustering methods."
}

Center for Big Data Analytics
