ABSTRACT
Fitting sensors to humans and physical structures is becoming increasingly common. These developments provide many opportunities for ubiquitous computing, as well as challenges for analyzing the resulting sensor data. One of these challenges is underappreciated: modeling multivariate time series with mixed sampling rates. Although several application papers on sensor systems mention this problem, it remains almost unexplored; it is often hidden in a preprocessing step or solved manually as a one-pass procedure (feature extraction/construction). This leaves an opportunity to formalize the problem and to develop methods that address mixed sampling rates automatically.
We approach the problem of dealing with multiple sampling rates from an aggregation perspective. We propose Accordion, a new embedded method that constructs and selects aggregate features iteratively, in a memory-conscious fashion. Our algorithms apply to both classification and regression problems. We describe three experiments on real-world time series datasets, with satisfactory results.
Index Terms
- Mining multivariate time series with mixed sampling rates