Abstract
U-statistics play central roles in many statistical learning tools but face a persistent scalability problem. Despite extensive research on accelerating computation by U-statistic reduction, existing results have focused almost exclusively on power analysis. Little work addresses risk control accuracy, which requires distinct and much more challenging techniques. In this article, we establish the first statistical inference procedure with provably higher-order accurate risk control for incomplete U-statistics. The sharpness of our new result enables us to reveal, for the first time in the literature, how risk control accuracy also trades off with speed, complementing the well-known variance-speed tradeoff. Our general framework converts the challenging, case-by-case analysis for many different designs into a principled and routine computation. We conducted comprehensive numerical studies, and the results validate our theory’s sharpness. Our method also demonstrates effectiveness in real-world data applications. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
| Original language | English |
|---|---|
| Journal | Journal of the American Statistical Association |
| Publication status | Published - Mar 2025 |
Bibliographical note
Publisher Copyright: © 2025 The Author(s). Published with license by Taylor & Francis Group, LLC.
Keywords
- Edgeworth expansion
- Fast computation
- Nonparametrics
- Statistical learning
Fingerprint
Dive into the research topics of 'U-Statistic Reduction: Higher-Order Accurate Risk Control and Statistical-Computational Trade-Off'. Together they form a unique fingerprint.