TY - GEN
T1 - MDC-Net: Multimodal Detection and Captioning Network for Steel Surface Defects
AU - Chazhoor, Anthony Ashwin Peter
AU - Hu, Shanfeng
AU - Gao, Bin
AU - Woo, Wai Lok
PY - 2024/5/8
Y1 - 2024/5/8
N2 - In the highly competitive steel sector, product quality, particularly in terms of surface integrity, is critical. Surface defect detection (SDD) is essential to maintaining high production standards, as it directly impacts product quality and manufacturing efficiency. Traditional SDD approaches, which rely primarily on manual inspection or conventional computer vision techniques, suffer from difficulties including reduced accuracy and potential health risks to inspectors. This research describes a solution that uses a transformer-based sequence generation model to improve defect detection during the manufacture of hot-rolled steel sheets and to generate captions describing each defect and its spatial location. This method, which treats object detection as a sequence generation problem, allows for a more sophisticated understanding of image content and a complete, contextually rich investigation of surface defects while also providing captions. While this method can potentially improve detection accuracy, its main strength lies in its scalability and its adaptability to various industrial applications. Furthermore, this technique could be extended to visual question-answering applications, opening up opportunities for interactive and intelligent image analysis.
AB - In the highly competitive steel sector, product quality, particularly in terms of surface integrity, is critical. Surface defect detection (SDD) is essential to maintaining high production standards, as it directly impacts product quality and manufacturing efficiency. Traditional SDD approaches, which rely primarily on manual inspection or conventional computer vision techniques, suffer from difficulties including reduced accuracy and potential health risks to inspectors. This research describes a solution that uses a transformer-based sequence generation model to improve defect detection during the manufacture of hot-rolled steel sheets and to generate captions describing each defect and its spatial location. This method, which treats object detection as a sequence generation problem, allows for a more sophisticated understanding of image content and a complete, contextually rich investigation of surface defects while also providing captions. While this method can potentially improve detection accuracy, its main strength lies in its scalability and its adaptability to various industrial applications. Furthermore, this technique could be extended to visual question-answering applications, opening up opportunities for interactive and intelligent image analysis.
UR - https://link.springer.com/book/9783031590566
U2 - 10.1007/978-3-031-59057-3_20
DO - 10.1007/978-3-031-59057-3_20
M3 - Conference contribution
SN - 9783031590566
T3 - Communications in Computer and Information Science
SP - 316
EP - 333
BT - Robotics, Computer Vision and Intelligent Systems
A2 - Filipe, Joaquim
A2 - Röning, Juha
PB - Springer
CY - Cham, Switzerland
T2 - Robotics, Computer Vision and Intelligent Systems 2024
Y2 - 25 February 2024 through 27 February 2024
ER -