Abstract
In driving scenarios, videos recorded in rainy weather conditions are often distorted by rain streaks and raindrops, posing a significant challenge in recovering the obscured background details. The inherent temporal redundancy in videos offers stability advantages for rain removal. Traditional video deraining techniques primarily depend on optical flow estimation and kernel-based methods, which are constrained by a limited receptive field. Although transformer architectures can capture long-term dependencies, they introduce substantial computational complexity. Recently, the Receptance Weighted Key Value Model (RWKV), characterized by its linear computational complexity, has emerged as an effective tool for efficient long-term temporal modeling, which is essential for the removal of rain streaks and raindrops in video sequences. To optimize RWKV for video deraining, we introduce a wavelet transform shift mechanism that enhances low-frequency features by targeting distinct frequency bands. Additionally, we present a tubelet embedding mechanism for RWKVs, augmenting the model’s capacity to capture high-frequency details by integrating the spatiotemporal context of input frames. Extensive experiments demonstrate that our approach achieves superior performance over state-of-the-art methods.
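As a rough illustration of the tubelet idea mentioned in the abstract, the sketch below splits a short clip into non-overlapping spatiotemporal blocks so that each token carries pixels from several consecutive frames. It shows generic tubelet extraction as commonly used in video models, not the authors' implementation. The function name `extract_tubelets`, the single-channel nested-list clip layout, and the 2x2x2 tubelet size are all assumptions made for this example. The learned projection that would normally map each flattened tubelet to an embedding is omitted.

```python
from typing import List

# Assumed layout for this sketch: clip[T][H][W], one channel, plain floats.
Video = List[List[List[float]]]


def extract_tubelets(video: Video, t: int, h: int, w: int) -> List[List[float]]:
    """Split a [T][H][W] clip into non-overlapping t x h x w tubelets.

    Each tubelet is flattened into a vector of length t*h*w, so a single
    token mixes spatial and temporal context.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    if T % t or H % h or W % w:
        raise ValueError("clip dimensions must be divisible by the tubelet size")
    tokens = []
    for t0 in range(0, T, t):          # step through time in blocks of t frames
        for y0 in range(0, H, h):      # step through rows in blocks of h pixels
            for x0 in range(0, W, w):  # step through columns in blocks of w pixels
                tokens.append([
                    video[t0 + dt][y0 + dy][x0 + dx]
                    for dt in range(t)
                    for dy in range(h)
                    for dx in range(w)
                ])
    return tokens


# Toy clip: 4 frames of 4x4 pixels; every pixel stores its frame index.
clip = [[[float(f)] * 4 for _ in range(4)] for f in range(4)]
tokens = extract_tubelets(clip, t=2, h=2, w=2)
print(len(tokens), len(tokens[0]))
```

With the toy clip above, the first token contains four pixels from frame 0 followed by four pixels from frame 1, which is the sense in which a tubelet integrates context across input frames.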
| Original language | English |
|---|---|
| Pages (from-to) | 6645-6656 |
| Number of pages | 12 |
| Journal | Visual Computer |
| Volume | 41 |
| Issue number | 9 |
| Publication status | Published - Jul 2025 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
Keywords
- Hierarchy fusion
- Receptance weighted key value model
- Video deraining