Steel, a critical material in the construction, automobile, and railroad manufacturing industries, often presents defects that can lead to equipment failure, significant safety risks, and costly downtime. This research evaluates the performance of state-of-the-art object detection models in detecting defects on steel surfaces and addresses the challenges of limited defect data and lengthy model training times. Five existing models (Faster R-CNN, Deformable DETR, Double-Head R-CNN, RetinaNet, and a deformable convolutional network) were benchmarked on the Northeastern University (NEU) steel dataset. The selection covers a broad spectrum of methodologies, including two-stage detectors, single-stage detectors, transformers, and a model incorporating deformable convolutions. Under fivefold cross-validation, the deformable convolutional network achieved the highest accuracy on the NEU dataset, 77.28%. The other models reached accuracies in the 70–75% range. Certain models exhibited particular strengths in detecting specific defects, indicating potential areas for future research and model improvement. The findings provide a foundation for future research in steel defect detection and have implications for practical applications. Automating defect detection could improve quality control in the steel industry, leading to safer and more reliable steel products and protecting workers by removing the need for human inspection in hazardous environments.