Abstract
Street trees are vital to urban livability, providing ecological and social benefits. Establishing a detailed, accurate, and dynamically updated street tree inventory has become essential for optimizing these multifunctional assets within space-constrained urban environments. Because traditional field surveys are time-consuming and labor-intensive, automated surveys using Mobile Mapping Systems (MMS) offer a more efficient solution. However, existing MMS-acquired tree datasets are constrained by small-scale scenes, limited annotations, or a single modality, which restricts their utility for comprehensive analysis. To address these limitations, we introduce WHU-STree, a cross-city, richly annotated, and multi-modal urban street tree dataset. Collected across two distinct cities, WHU-STree integrates synchronized point clouds and high-resolution images, encompassing 21,007 annotated tree instances covering 50 species, together with 2 morphological parameters. Leveraging these characteristics, WHU-STree concurrently supports over 10 tasks related to street tree inventory. We benchmark representative baselines for two key tasks, tree species classification and individual tree segmentation, based on 18 major species and an “Others” category. Extensive experiments demonstrate that while multi-modal fusion yields improvements over uni-modal baselines, it still lags behind strong 3D-only methods, indicating that effective fusion remains a challenging open problem requiring further research. We further identify key challenges and outline potential future work for fully exploiting WHU-STree, encompassing multi-modal fusion, multi-task collaboration, cross-domain generalization, spatial pattern learning, and multi-modal large language models for street tree asset management. The WHU-STree dataset is accessible at: https://github.com/WHU-USI3DV/WHU-STree.
| Original language | English |
|---|---|
| Pages (from-to) | 519-542 |
| Number of pages | 24 |
| Journal | ISPRS Journal of Photogrammetry and Remote Sensing |
| Volume | 233 |
| Early online date | 6 Feb 2026 |
| Publication status | Published - Mar 2026 |
Bibliographical note
Publisher Copyright: © 2026
Keywords
- Deep learning
- Individual tree segmentation
- Mobile mapping system
- Multi-modal
- Tree inventory
- Tree species classification
Title
WHU-STree: A multi-modal benchmark dataset for street tree inventory