Assessing the applicability of changepoint analysis to analyse short-term growth
DOI:
https://doi.org/10.52905/hbph2023.1.62
Keywords:
changepoint analysis, changepoint detection, performance evaluation, mini growth spurt, short-term growth
Abstract
Background: Assessing short-term growth in humans remains difficult. Detecting small variations and increments, such as mini growth spurts, requires high-precision instruments or frequent measurements. Daily measurements, however, demand considerable effort from both anthropologists and subjects. New approaches are therefore needed that reduce fluctuations and reveal underlying patterns.
Objectives: Changepoints are abrupt variations in the properties of time series data. In the context of growth, such a variation could be a shift in mean height. By adjusting the variance and using different growth models, we assessed the ability of changepoint analysis to analyse short-term growth and detect mini growth spurts.
Sample and Methods: We performed Bayesian changepoint analysis on simulated growth data using the bcp package in R. Simulated growth patterns included stasis, linear growth, catch-up growth, and mini growth spurts. Specificity and a normalised variant of the Matthews correlation coefficient (MCC) were used to assess the algorithm’s performance. Welch’s t-test was used to compare means.
Results: Initial results show that changepoint analysis can detect mini growth spurts. However, this ability depends strongly on measurement error. Data preparation, such as ranking and rotating the time series data, yielded negligible improvements. Missing data was an issue and may affect the prediction quality of the classification metrics.
Conclusion: Changepoint analysis is a promising tool for analysing short-term growth. However, further optimisation and analyses of real growth data are needed before broader generalisations can be made.
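The two performance measures named in the abstract can be illustrated with a short sketch. It is written in Python for illustration only and is not the authors' R workflow. The abstract does not state which normalisation of the MCC was used; the rescaling (MCC + 1) / 2, which maps the coefficient from [-1, 1] to [0, 1], is an assumption here. The label vectors are hypothetical: 1 marks a measurement flagged as a changepoint, 0 marks no changepoint.

```python
import math


def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for binary labels (1 = changepoint)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    return tp, fp, tn, fn


def specificity(tp, fp, tn, fn):
    """Share of true non-changepoints that were correctly left unflagged."""
    negatives = tn + fp
    return tn / negatives if negatives else float("nan")


def normalised_mcc(tp, fp, tn, fn):
    """MCC rescaled to [0, 1] via (MCC + 1) / 2 (assumed normalisation)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: treat MCC as 0 when a row or column of the confusion matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return (mcc + 1) / 2


# Hypothetical example: two true changepoints, one detected, one false alarm.
truth = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]

counts = confusion_counts(truth, predicted)
print(specificity(*counts), normalised_mcc(*counts))
```

Because changepoints are rare relative to ordinary measurements, the classes are strongly imbalanced. Specificity alone can look high even when detection is poor, which is one reason to report it alongside an MCC-based measure.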
License
Copyright (c) 2023 Nikolaos Gasparatos, Michael Hermanussen, Christiane Scheffler
This work is licensed under a Creative Commons Attribution 4.0 International License.