Reconstructing missing and anomalous data collected from high-frequency in-situ sensors in fresh waters


Claire Kermorvant, Benoit Liquet, Guy Litt, Jeremy B Jones, Kerrie Mengersen, Erin E Peterson, Rob J Hyndman, Catherine Leigh


3 December 2021

Publication details

International Journal of Environmental Research and Public Health, 18(23), 12803




In-situ sensors that collect high-frequency data are used increasingly to monitor aquatic environments. These sensors are prone to technical errors, resulting in unrecorded observations and/or anomalous values that are subsequently removed and create gaps in time series data. We present a framework based on generalized additive and auto-regressive models to recover these missing data. To mimic sporadically missing (i) single observations and (ii) periods of contiguous observations, we randomly removed (i) point data and (ii) day- and week-long sequences of data from a two-year time series of nitrate-concentration data collected from Arikaree River, USA, where synoptically collected water temperature, turbidity, conductance, elevation and dissolved oxygen data were available. In 72% of cases with missing point data, predicted values were within the sensor-precision interval of the original value, although predictive ability declined when sequences of missing data occurred. Precision also depended on the availability of other water-quality covariates. When covariates were available, even a sudden, event-based peak in nitrate concentration was reconstructed well. By providing a promising method for accurate prediction of missing data, this framework will increase the utility of, and confidence in, summary statistics and statistical trends, thereby assisting the effective monitoring and management of fresh waters and other at-risk ecosystems.
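The evaluation design described above (remove known values, predict them, then check whether each prediction falls within the sensor-precision interval of the original) can be illustrated with a minimal sketch. It is not the authors' implementation: linear interpolation stands in for the generalized additive and auto-regressive models, and all function names, along with any series and precision value passed in, are hypothetical and chosen only for illustration.

```python
import random


def mask_points(n, n_missing, rng):
    """Indices of sporadically missing single observations (hypothetical helper)."""
    return sorted(rng.sample(range(n), n_missing))


def mask_sequence(n, length, rng):
    """Indices of one contiguous missing block, e.g. a day or a week of readings.

    The block is kept away from both ends of the series so that an observed
    value exists on each side of it.
    """
    start = rng.randrange(1, n - length)
    return list(range(start, start + length))


def interpolate(values, missing):
    """Stand-in predictor: linear interpolation between the nearest observed neighbours.

    Assumes at least one observed value exists on one side of every gap.
    """
    missing_set = set(missing)
    n = len(values)
    filled = list(values)
    for i in missing:
        lo, hi = i - 1, i + 1
        while lo in missing_set:
            lo -= 1
        while hi in missing_set:
            hi += 1
        if lo < 0:          # gap touches the start: carry the next value back
            filled[i] = values[hi]
        elif hi >= n:       # gap touches the end: carry the last value forward
            filled[i] = values[lo]
        else:
            weight = (i - lo) / (hi - lo)
            filled[i] = values[lo] + weight * (values[hi] - values[lo])
    return filled


def fraction_within(values, predicted, missing, precision):
    """Share of removed observations predicted to within +/- precision of the original."""
    hits = sum(abs(predicted[i] - values[i]) <= precision for i in missing)
    return hits / len(missing)
```

To use the sketch, one would pass a complete stretch of the series to `mask_points` or `mask_sequence`, reconstruct the removed values with `interpolate` (or any other predictor with the same interface), and report `fraction_within` for the chosen precision. Swapping in a covariate-aware model is where the approach described in the abstract would differ most from this stand-in, since interpolation cannot recover an event-based peak that falls entirely inside a gap.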