Abstract
Robot navigation in real-world, open-world environments has traditionally relied on explicit geometric models such as LiDAR SLAM, visual odometry, and high-definition maps to provide interpretable poses and scene structure. However, under dynamic scenes, geometric degeneracy, outdated maps, and cross-domain shifts, these geometric models often deteriorate and struggle with long-tail corner cases and semantics-driven tasks. To address these challenges, this work develops three complementary themes that structure the chapters.

(1) On the explicit geometry side, it introduces a multi-LiDAR odometry and online calibration framework (M-LOAM), a tightly coupled LiDAR–IMU localization system with prior maps (RLIL), and a tensor-voting-based HD curb extraction pipeline, achieving centimeter- to decimeter-level localization across campus, parking, and urban road scenarios while producing lightweight, interpretable maps and navigational elements.

(2) On the data and evaluation side, it curates and extends the FusionPortable and FusionPortableV2 datasets to cover multiple sensors, platforms, and environments, together with unified benchmarking protocols for odometry, global localization, and mapping that enable reproducible cross-platform and cross-modality comparisons.

(3) On the implicit modeling side, it contributes DiffLoc, a diffusion-based cross-view localization framework, along with a satellite–monocular semantic/depth alignment pipeline and CLIP-style language grounding for lightweight supervision, bringing generative and semantic priors into geometric pipelines to enhance cross-view alignment and metric localization under severe viewpoint and appearance variations.
Unified evaluations show that LiDAR-based and tightly coupled pipelines provide strong interpretability and robustness, yet their accuracy is highly sensitive to map quality and temporal synchronization, and maintaining such maps is costly; learned perception, in contrast, suffers from substantial domain gaps without explicit adaptation. Overall, this work provides a systematic analysis of explicit geometry and implicit priors for open-world navigation, clarifying their respective strengths and limitations and offering guidance on when to rely on geometric methods and when to incorporate learned priors.
| Date of Award | 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Shaojie SHEN (Supervisor) |