Dynamic Statistical Learning in Massive Datastreams

Jingshen Wang, Lilun Du*, Changliang Zou, Zhenke Wu

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

1 Citation (Scopus)

Abstract

Technological advances have necessitated statistical methodologies for analyzing large-scale datastreams comprising multiple indefinitely long time series. This manuscript proposes a dynamic tracking and screening (DTS) framework for online learning and model updating. Utilizing the sequential nature of datastreams, a robust estimation approach is developed under a linear varying coefficient model framework. This approach accommodates unequally spaced design points and updates coefficient estimates without storing historical data. A data-driven choice of an optimal smoothing parameter is proposed, alongside a new multiple testing procedure for the streaming environment. Statistical guarantees of the procedure are provided, along with simulation studies on its finite-sample performance. The methods are demonstrated through a mobile health example estimating when subjects’ sleep and physical activities unusually influence their mood.
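The idea of updating coefficient estimates at unequally spaced time points without storing historical data can be illustrated with a minimal sketch. This is not the DTS estimator from the article; it is a generic exponentially weighted least squares fit (a one-sided exponential kernel) for an assumed model y = b0(t) + b1(t) x, and the names `OnlineVaryingCoefficient`, `bandwidth`, and `ridge` are hypothetical choices made for illustration.

```python
import math


class OnlineVaryingCoefficient:
    """Illustrative sketch: exponentially weighted least squares for
    y = b0(t) + b1(t) * x, updated one observation at a time.

    Assumption: a one-sided exponential kernel with the given bandwidth,
    not the kernel or robust loss used in the article.
    """

    def __init__(self, bandwidth, ridge=1e-8):
        self.h = bandwidth      # smoothing parameter (time scale of forgetting)
        self.ridge = ridge      # small stabilizer so the 2x2 system is solvable
        self.t_prev = None
        # Sufficient statistics: weighted sums of z z^T and z y, with z = (1, x).
        self.s00 = self.s01 = self.s11 = 0.0
        self.r0 = self.r1 = 0.0

    def update(self, t, x, y):
        """Fold in one observation (t, x, y); past data are never revisited."""
        if self.t_prev is not None:
            # Down-weight old information by the elapsed time, which handles
            # unequally spaced time points.
            decay = math.exp(-(t - self.t_prev) / self.h)
            self.s00 *= decay
            self.s01 *= decay
            self.s11 *= decay
            self.r0 *= decay
            self.r1 *= decay
        self.s00 += 1.0
        self.s01 += x
        self.s11 += x * x
        self.r0 += y
        self.r1 += x * y
        self.t_prev = t
        return self.coefficients()

    def coefficients(self):
        """Solve the 2x2 weighted normal equations in closed form."""
        a = self.s00 + self.ridge
        d = self.s11 + self.ridge
        b = self.s01
        det = a * d - b * b
        b0 = (d * self.r0 - b * self.r1) / det
        b1 = (a * self.r1 - b * self.r0) / det
        return b0, b1


# Usage: noise-free data from y = 2 + 3x observed at irregular times.
model = OnlineVaryingCoefficient(bandwidth=1.0)
times = [0.0, 0.3, 1.0, 1.1, 2.5]
xs = [0.0, 1.0, 2.0, -1.0, 0.5]
for t, x in zip(times, xs):
    estimate = model.update(t, x, 2.0 + 3.0 * x)
```

Only five running sums and the last time stamp are kept, so memory use does not grow with the length of the stream. A larger `bandwidth` forgets more slowly and gives smoother but less responsive coefficient paths.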

Original language: English
Journal: Statistica Sinica
Publication status: Published - 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2024 Institute of Statistical Science. All rights reserved.

Keywords

  • Consistency
  • Kernel smoothing
  • Multiple testing
  • Varying coefficient
