Multi-label Video Classification for Underwater Ship Inspection
Published in OCEANS 2023 Conference & Exposition – Limerick, Ireland, 2023
This paper presents MViST (Multi-label Vision Spatiotemporal Transformer), a novel framework for multi-label video classification tailored to underwater ship inspection. The approach combines multi-attention-based transformer and vision transformer (ViT) architectures to assign multiple concurrent labels to underwater video footage, addressing challenges such as low visibility, turbulence, and complex backgrounds.
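To illustrate what "multiple concurrent labels" means in practice, here is a minimal sketch of multi-label prediction for a video clip. It is not the MViST model. The label names, the `classify_clip` helper, the 0.5 threshold, and the mean pooling over frames (a simple stand-in for temporal attention) are all assumptions made for this example. The key point is that each label gets its own sigmoid, so several labels can be active at once, whereas a softmax would force the labels to compete.

```python
import math

# Hypothetical inspection labels, chosen only for illustration.
LABELS = ["anode", "marine_growth", "paint_peel"]


def sigmoid(x):
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_clip(frame_logits, threshold=0.5):
    """Return every label whose clip-level probability reaches the threshold.

    frame_logits: one list of per-label logits for each frame in the clip.
    Mean pooling over frames is an assumed placeholder for temporal modelling.
    """
    num_frames = len(frame_logits)
    mean_logits = [
        sum(frame[i] for frame in frame_logits) / num_frames
        for i in range(len(LABELS))
    ]
    # One sigmoid per label: labels are scored independently of each other.
    probs = [sigmoid(z) for z in mean_logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Two frames of made-up logits; two labels end up active at the same time.
clip = [[2.0, -1.0, 0.5], [1.0, -3.0, 1.5]]
predicted = classify_clip(clip)
print(predicted)  # ['anode', 'paint_peel']
```

In a real pipeline the per-frame logits would come from a trained video model, and the threshold would typically be tuned per label on validation data.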
Authors: M. A. Azad, A. Mohammed, M. Waszak, B. Elvesæter, M. Ludvigsen
Venue: OCEANS 2023 – Limerick, Ireland
Note: This is the conference paper version; the full master’s thesis at NTNU contains extended experiments and results.
Recommended citation: M. A. Azad, A. Mohammed, M. Waszak, B. Elvesæter, and M. Ludvigsen. (2023). "Multi-label Video Classification for Underwater Ship Inspection." In OCEANS 2023 – Limerick, Limerick, Ireland, pp. 1–10. IEEE. https://doi.org/10.1109/OCEANSLimerick52467.2023.10244578
