Despite remarkable progress in object detection, detector accuracy and
efficiency still need improvement under challenging underwater
conditions, such as low image quality and limited computational resources. To
address this, we propose an Ultra-Light Real-Time Underwater Object Detection
framework, You Sense Only Once Beneath (YSOOB). Specifically, we utilize a
Multi-Spectrum Wavelet Encoder (MSWE) to perform frequency-domain encoding on
the input image, minimizing the semantic loss caused by underwater optical
color distortion. Furthermore, we revisit the unique characteristics of
even-sized and transposed convolutions, allowing the model to dynamically
select and enhance key information during the resampling process, thereby
improving its generalization ability. Finally, we eliminate model redundancy
through simple yet effective channel compression and a reconstructed large
kernel convolution (RLKC), making the model lightweight. The result is YSOOB,
a high-performance underwater object detector with only 1.2 million
parameters. Extensive experimental results demonstrate that, with the fewest
parameters, YSOOB achieves mAP50 of 83.1% and 82.9% on the URPC2020 and DUO
datasets, respectively, comparable to the current SOTA detectors. The inference
speed reaches 781.3 FPS and 57.8 FPS on the T4 GPU (TensorRT FP16) and the edge
computing device Jetson Xavier NX (TensorRT FP16), surpassing YOLOv12-N by
28.1% and 22.5%, respectively.
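
To illustrate what frequency-domain encoding of an input image can look like, the following is a minimal sketch of a single-level 2D Haar wavelet decomposition. It is not the MSWE itself: the abstract does not specify which wavelet MSWE uses or how its subbands are arranged, so the Haar basis and the function names `haar_dwt2` and `wavelet_encode` are assumptions made purely for illustration.

```python
def haar_dwt2(channel):
    """Single-level 2D Haar transform of one channel (a list of equal-length rows).

    Assumption: an orthonormal Haar basis, chosen only for illustration.
    Returns four half-resolution subbands.
    """
    height, width = len(channel), len(channel[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")

    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, height, 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for j in range(0, width, 2):
            a, b = channel[i][j], channel[i][j + 1]          # top pair of the 2x2 block
            c, d = channel[i + 1][j], channel[i + 1][j + 1]  # bottom pair of the 2x2 block
            row_a.append((a + b + c + d) / 2)  # scaled local sum (low frequency)
            row_c.append((a - b + c - d) / 2)  # change across columns
            row_r.append((a + b - c - d) / 2)  # change across rows
            row_d.append((a - b - c + d) / 2)  # diagonal change
        approx.append(row_a)
        col_diff.append(row_c)
        row_diff.append(row_r)
        diag.append(row_d)
    return approx, col_diff, row_diff, diag


def wavelet_encode(channels):
    """Decompose each colour channel and return all subbands as one flat list.

    C input channels of size H x W become 4*C subbands of size H/2 x W/2.
    """
    subbands = []
    for channel in channels:
        subbands.extend(haar_dwt2(channel))
    return subbands
```

With this scaling the transform is orthonormal, so the sum of squared values is preserved and no information is lost: spatial resolution is halved while the number of channels is quadrupled, which is one reason wavelet-style encoders are attractive as lightweight input stages.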
Download PDF:
2504.15694v1