A new methodology to automatically detect cracks in existing RC bridges

Vincenzo Mario Di Mucci1,*,†, Angelo Cardellicchio2,†, Sergio Ruggieri1,†, Andrea Nettis1,†, Vito Renò2,† and Giuseppina Uva1,†

1 DICATECH Department, Polytechnic University of Bari, Via Orabona 4, Bari, Italy
2 STIIMA Institute, National Research Council of Italy, Via Amendola 122D/O, Bari, Italy

Abstract
The paper presents a novel approach to detect cracks in existing reinforced concrete (RC) bridges, using computer vision (CV) techniques as smart sensors to identify existing damage from photos. The method involves training specialized convolutional neural networks (CNNs) to identify cracks in RC components, with a focus on automated detection. The process begins with the definition of a detailed dataset of crack images labeled by domain experts. Subsequently, CNNs designed for crack detection are trained and assessed. The effectiveness of the method is initially evaluated through visual comparisons; more specific evaluations based on defined metrics are planned upon completion of development. This methodology aims to drive digital progress and artificial intelligence applications in advanced visual inspections, ultimately safeguarding the existing bridge stock.

Keywords
Existing bridges, Conservation, Visual inspections, Crack detection, Structural health management, Computer vision, Artificial intelligence

1. Introduction
In recent years, bridge collapses [1] have highlighted the importance of the safety of existing infrastructures, especially historic ones. This concerns not only ancient masonry bridges, but also reinforced concrete (RC) bridges, which are crucial for their function and cultural value. Events such as earthquakes have shown the vulnerability of these structures, making careful monitoring necessary to avoid economic losses and protect the built heritage [2].
VIPERC2024: 3rd International Conference on Visual Pattern Extraction and Recognition for Cultural Heritage Understanding, 1 September 2024
* Corresponding author.
† These authors contributed equally.
Email: v.dimucci1@phd.poliba.it (V. M. Di Mucci); angelo.cardellicchio@stiima.cnr.it (A. Cardellicchio); sergio.ruggieri@poliba.it (S. Ruggieri); a.nettis@poliba.it (A. Nettis); vito.reno@stiima.cnr.it (V. Renò); giuseppina.uva@poliba.it (G. Uva)
ORCID: 0009-0002-6239-2743 (V. M. Di Mucci); 0000-0003-3313-4817 (A. Cardellicchio); 0000-0001-5119-8967 (S. Ruggieri); 0000-0001-8133-6830 (A. Nettis); 0000-0003-1830-4961 (V. Renò); 0000-0001-6408-167X (G. Uva)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

The focus has been on developing systematic and non-invasive methods for monitoring and maintaining these critical infrastructures [3], [4]. Existing RC bridges, which are often more than 50 years old, suffer from several problems, including concrete deterioration and steel corrosion [5]. These issues underline the urgency of assessing the state of conservation of existing bridges as a fundamental step for their efficient management. Two critical aspects emerge:
1. Limitations of economic and temporal resources.
2. Huge number of structures to be assessed.

To address this problem, the Italian Ministry of Infrastructure and Transport (MIT) has released the new Guidelines for the management of bridge safety [6]. The decree provides a multi-level approach aimed at defining risk-based priority lists, so that accurate assessments and interventions, and the available resources, are directed first to the most critical bridges. Level 1 of the Guidelines consists of visual inspection activities on bridges, necessary to identify the current state of conservation and the presence of any degradation phenomena.
Traditional methods rely on trained inspectors, who identify defects and define their intensity and extension using numerical coefficients. This operation requires significant human and economic resources from infrastructure managers. Furthermore, traditional visual inspection methods are time-consuming, laborious and highly dependent on the inspectors' experience, which can lead to inconsistent assessments [7]. Visual inspection of a bridge requires access to all of its parts, such as the piers and supports, which is not always possible, as shown in [8]. In addition, these inspections often require restricting traffic on the bridge, which affects its serviceability. For this reason, research is underway to find innovative solutions that automate inspections, reducing time and costs and improving the safety of inspectors.

Cracks are among the most alarming defects [9] and have specific geometric characteristics, such as width, length and orientation (e.g., longitudinal or diagonal) [10]. With the aim of improving the current practice in crack detection, this paper explores the possibility of automatically detecting cracks on bridge surfaces through advanced computer vision (CV) technologies, leveraging machine learning (ML) and deep learning (DL) algorithms for defect detection. The paper proposes a processing pipeline for automatic crack detection in existing RC bridges. The system uses a pixel-based method to generate several patches from a limited number of images showing cracks on RC bridge surfaces. These patches are then used to train a convolutional neural network (CNN) to identify the presence of cracks in the images.
The paper is organized as follows: Section 2 reviews the state-of-the-art ML and DL techniques for civil engineering; Section 3 presents the proposed framework, detailing the steps of the process; Section 4 discusses the preliminary findings; finally, Section 5 provides the concluding remarks and anticipates future developments.

2. State-of-the-art on crack detection
ML has been applied in various fields of civil and structural engineering [11], including earthquake engineering [12], structural property identification and structural health monitoring [13]. CV, intended here as the application of DL to image analysis, has shown promising results in assessing the state of conservation of structures. One interesting application in this field is VULMA [14], a tool able to derive a simplified vulnerability index using images of existing buildings. The tool uses Google Street View to automatically collect images, which are then labeled with respect to 13 different geometrical parameters. Subsequently, by training a cascade of CNNs with transfer learning and fine-tuning techniques, the tool extracts a simplified vulnerability index for each analyzed image.

Several studies have likewise proposed CV applications for bridge analysis and the detection of structural defects such as cracks. In bridge damage detection, CNNs have mostly been used to automatically identify defects through pixel-based analysis, with a focus on crack detection and damage assessment. For example, Zhang et al. [15] presented CrackNet, a CNN that achieved an accuracy score of 88.86% on a 3D dataset containing 2,000 images of cracks on asphalt surfaces. Similarly, Yang et al. [16] developed a fully convolutional network for crack segmentation, achieving an accuracy of 97.96% on a custom dataset. Further progress was made in crack identification in concrete structures. Qiao et al. [17] proposed a method using an improved U-Net CNN, which outperformed standard U-Net models by 11.7% in terms of average accuracy. Inam et al. [18] used the U-Net model for crack segmentation, measuring attributes such as width, length, and area.

Other approaches include the YOLO algorithm, as proposed by Yu [19], to identify cracks in images. The model was trained and tested on a large dataset of manually labeled crack images, and the K-Means method was used to determine the optimal size of the regions of interest, resulting in an average accuracy of 84.37%. More recent developments in crack detection integrated a Bottleneck Transformer into an improved version of the YOLOv5 network, as proposed by Yu and Zhou [20]. This approach has been shown to accurately capture elongated features such as cracks, achieving a higher accuracy score than the original version of YOLOv5. Similarly, the use of semantic segmentation algorithms such as DeepLabv3+, as presented by Fu et al. [21], has shown improved accuracy in crack segmentation, revealing finer details and improving the overall effectiveness of the system.

A final contribution on the use of CV for automatic defect identification in RC bridges was presented by Cardellicchio et al. [22], who used CNNs and different DL techniques to classify various common defects in bridges and interpreted the results through AI explainability techniques, such as Class Activation Maps (CAMs). Although the initial performances were not promising, new evaluation metrics were proposed, which proved to be effective in a real case study.

3. CNN-based crack detection framework
The objective of a crack detection problem is to determine whether a specific pixel in an image of an RC element is part of a crack. To solve this problem, a new framework is proposed to detect cracks using CNNs.
The method analyzes small portions of images to determine the probability that the central pixel of each portion belongs to a crack. This method represents a first step towards the automated generation of large amounts of ground truth data that can be used to train pixel-based classifier models. The goal is to simplify the training process and significantly increase the number of images available to train such models. Figure 1 reports the flowchart illustrating the proposed framework.

Figure 1: Proposed framework.

3.1. Data preparation
The first step of the framework is to create the dataset of annotated cracks used to train the algorithm. This phase includes three main steps: proper image selection, manual annotation, and extraction of the ground truth mask for each image (see Figure 2). The three steps are described below:
1. Image selection: the first step consists of selecting high-resolution images where cracks are clearly visible. However, including images with occlusions can also be beneficial, as they represent real-world conditions and enhance the performance of the ML model. Vegetation, shadows, reflections or elements that look like cracks (such as grout run-off) are some of the occlusions that make the dataset heterogeneous. This variety improves the generalization and robustness of the proposed CNN.
2. Manual annotation: the second step consists of manually annotating the cracks. To ensure accurate, high-quality labels, reduce biases and improve the generalization of the model, images need to be annotated by hand by domain experts. Cracks are traced with the “Polyline” command of the Computer Vision Annotation Tool (CVAT) [23], and the annotations are then exported in the Dataset Management Framework (Datumaro) format. The exported file includes the image metadata (file name, dimensions) and the annotations, which specify the object type (class) and the array of point coordinates (x, y) of the polylines for a precise segmentation of the cracks.
3. Ground truth mask extraction: the third step consists of generating the ground truth image with the annotated cracks. The pixels where the polyline (crack) is present are assigned the value 255, corresponding to white, while all the other pixels are set to 0, corresponding to black (absence of cracks). This yields a black image with white cracks, providing a clear definition of the classes.

At this point, the dataset for training the CNN is complete and ready to be processed.

Figure 2: Data preparation workflow.

3.2. Dataset Preprocessing
To train the CNN, a preprocessing step is performed, during which small patches of the original image are extracted. This is done by applying a sliding window that runs over the image, capturing square parts of a fixed size (defined as “F_w × F_h”). Each patch is automatically labeled as "positive" if its center is associated with a crack, and "negative" otherwise. It is worth noting that splitting images into patches leads to an unbalanced dataset because most of the pixels do not belong to cracks: the number of patches containing cracks (positive patches) is much smaller than the number of patches without cracks (negative patches). To address this imbalance, the negative patches are downsampled by randomly selecting a number of negative patches equal to the number of positive patches. Finally, the use of patches allows the application of data augmentation techniques, such as rotations and translations. This process increases data diversity and makes the model more robust to variations in the input data.

3.3. CNN model
This study proposes a CNN architecture similar to the one proposed by Cardellicchio et al. in [24] for plant root segmentation.
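Before the network is detailed, the patch extraction, center-pixel labeling and negative downsampling of Section 3.2 can be illustrated with a minimal NumPy sketch. The function name `extract_balanced_patches`, the square `patch_size`, the `stride` and the `seed` are hypothetical choices introduced here for illustration only; the paper does not report these values, and this is not the authors' implementation.

```python
import numpy as np


def extract_balanced_patches(image, mask, patch_size=32, stride=8, seed=0):
    """Slide a square window over `image` and label each patch by its center pixel.

    image: H x W x 3 uint8 array; mask: H x W uint8 array (255 = crack, 0 = background).
    patch_size, stride and seed are illustrative assumptions, not values from the paper.
    """
    half = patch_size // 2
    height, width = mask.shape
    positives, negatives = [], []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = image[top:top + patch_size, left:left + patch_size]
            # The label depends only on the mask value at the center of the window.
            is_crack = mask[top + half, left + half] == 255
            (positives if is_crack else negatives).append(patch)

    # Balance the classes by randomly downsampling the negative patches.
    rng = np.random.default_rng(seed)
    keep = min(len(positives), len(negatives))
    chosen = rng.choice(len(negatives), size=keep, replace=False)
    negatives = [negatives[i] for i in chosen]

    patches = np.array(positives + negatives)
    labels = np.array([1] * len(positives) + [0] * len(negatives))
    return patches, labels
```

Called on an image and its ground truth mask, the function returns an array of patches and a matching array of labels (1 for positive, 0 for negative) that could then be augmented and passed to the training stage.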
The network model is a simple but efficient architecture with three stacked convolutional layers, each followed by max-pooling and ReLU activation operations. In this architecture, the RGB image is processed through three different convolutional layers, each applying filters to explore and capture visual patterns in the image. In addition, the kernel size gradually decreases, which means that the filters become smaller as one proceeds through the convolutional layers. After the third convolutional layer, a max-pooling layer is applied to reduce the spatial size of the data while retaining the most significant features extracted by the previous layers. These features are then passed to a fully connected layer, where each neuron is connected to all neurons in the previous layer, facilitating the integration of the extracted information. Finally, the results are transferred to the decision layer, which makes the final decision on the class of the object in the image (in this case, the presence of cracks).

4. Preliminary Results
The proposed framework aims to predict the presence of cracks on concrete surfaces. To this end, software was developed in Python [25] using the OpenCV [26], NumPy [27], Scikit-learn [28] and PyTorch [29] libraries. A dataset of photos of existing RC bridges was used, with 450 annotated images specifically used for the training phase. Images of bridges are particularly well-suited for this procedure because, compared to other RC structures, bridges have exposed structural surfaces where defects, such as cracks, are directly visible. The neural network was subjected only to preliminary tests, in order to qualitatively evaluate its performance. In particular, the functionality of the method was verified by visually comparing the original image, which contains the crack, with the automatic segmentation generated by the model.
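As a rough illustration of the architecture described in Section 3.3, the following PyTorch sketch stacks three convolutional blocks with shrinking kernels, followed by a fully connected layer and a decision layer. The class name `PatchCrackCNN`, the channel counts, the kernel sizes (7, 5, 3), the 32 × 32 patch size and the ReLU-then-pooling order are all assumptions made for this example, since the paper does not report them; it should not be read as the authors' network.

```python
import torch
from torch import nn


class PatchCrackCNN(nn.Module):
    """Three conv blocks with shrinking kernels, then a fully connected and a decision layer."""

    def __init__(self, patch_size=32):
        super().__init__()
        self.features = nn.Sequential(
            # Kernel sizes 7 -> 5 -> 3 are assumed; the text only says the filters get smaller.
            nn.Conv2d(3, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Three 2x2 poolings halve the side three times (patch_size must be divisible by 8).
        side = patch_size // 8
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * side * side, 64), nn.ReLU(),  # fully connected layer
            nn.Linear(64, 1),  # decision layer: one logit for the "crack" class
        )

    def forward(self, patches):
        # patches: float tensor of shape (N, 3, patch_size, patch_size)
        return self.classifier(self.features(patches)).squeeze(1)


model = PatchCrackCNN(patch_size=32)
logits = model(torch.rand(4, 3, 32, 32))  # four random RGB patches as a smoke test
probabilities = torch.sigmoid(logits)  # probability that each central pixel is a crack
```

Applying such a model to the patch centered on every pixel of an image would give a per-pixel probability map, which can be thresholded into a segmentation mask of the kind used for the visual comparison.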
As shown in Figure 3, the results indicate that the trained network can follow the path of the crack during the segmentation process.

Figure 3: Comparison between the original images (a) and the masks containing the cracks segmented by the trained CNN.

This result is significant because it demonstrates the model's ability to identify and delineate cracks, which is crucial for applications where accurate detection of structural defects is required. The good visual match between the real crack and the automatic segmentation suggests that the training procedure has been properly configured and that the model has the potential to improve with additional data and further optimizations. These preliminary tests provide a promising basis for the future development of the network, indicating that this could be the right track to achieve a robust and reliable system for automatic crack segmentation.

5. Conclusions and further works
This paper proposes a CV-based methodology to automatically detect cracks in existing RC bridges. Three main steps of the proposed framework were identified:
a) Definition of the dataset of RC bridge surface images with the annotated cracks.
b) Extraction of small patches from the images in the training dataset.
c) Implementation of a CNN model with three stacked layers for automatic identification of cracks.

A CNN is then trained to identify the presence of cracks in the images. Thus, for each photo provided as input, the proposed framework is able to determine the presence or absence of cracks. This approach is particularly practical in contexts with few labeled data, as it allows the generation of numerous patches from a limited number of images, supporting the identification of complex cracks with a reduced computational effort. The evaluation of the method has been based on preliminary visual comparisons.
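The paper does not specify which metrics the planned quantitative evaluation will adopt. Purely as an illustrative assumption, pixel-level precision, recall, F1 score and intersection over union (IoU) between a predicted mask and the ground truth mask could be computed as in the sketch below; the function name `pixel_metrics` is hypothetical.

```python
import numpy as np


def pixel_metrics(predicted, truth):
    """Precision, recall, F1 and IoU for two binary masks (non-zero = crack)."""
    predicted = np.asarray(predicted) > 0
    truth = np.asarray(truth) > 0
    tp = int(np.sum(predicted & truth))   # crack pixels correctly detected
    fp = int(np.sum(predicted & ~truth))  # background pixels marked as crack
    fn = int(np.sum(~predicted & truth))  # crack pixels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}
```

Because crack pixels are a small fraction of each image, metrics of this kind are usually more informative than plain pixel accuracy, which is dominated by the background class.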
Once the development is complete, a rigorous evaluation should be carried out using specific evaluation metrics and quantitatively comparing this method with other existing approaches. This should enable quantification of the model's effectiveness and verification of its capability to accurately and reliably detect cracks.

In conclusion, this work proposes a promising preliminary framework for automatic crack detection in reinforced concrete bridges, paving the way for automated and intelligent inspection systems for the health assessment of civil infrastructures. The methodology aims to enhance digital progress and to use artificial intelligence for advanced visual inspections, which are key to the development of automated inspection systems for defect identification. This approach ultimately contributes to the preservation of the existing bridge stock.

References
[1] Calvi, Gian Michele, et al. "Once upon a time in Italy: The tale of the Morandi Bridge." Structural Engineering International 29.2 (2019): 198-217.
[2] Borzi, Barbara, et al. "Seismic vulnerability of the Italian roadway bridge stock." Earthquake Spectra 31.4 (2015): 2137-2161.
[3] Assaad, Rayan, and Islam H. El-Adaway. "Bridge infrastructure asset management system: Comparative computational machine learning approach for evaluating and predicting deck deterioration conditions." Journal of Infrastructure Systems 26.3 (2020): 04020032.
[4] Bień, Jan, and Marek Salamak. "The management of bridge structures–challenges and possibilities." Archives of Civil Engineering (2022): 5-35.
[5] Miluccio, Giacomo, et al. "Traffic-load fragility models for prestressed concrete girder decks of existing Italian highway bridges." Engineering Structures 249 (2021): 113367.
[6] MIT, C. S. D. L. (2020). Linee Guida per la classificazione e gestione del rischio, la valutazione della sicurezza ed il monitoraggio dei ponti esistenti. Technical report, Consiglio Superiore dei Lavori Pubblici.
[7] Abdallah, Abdelrahman M., Rebecca A. Atadero, and Mehmet E. Ozbek. "A comprehensive uncertainty-based framework for inspection planning of highway bridges." Infrastructures 6.2 (2021): 27.
[8] Nettis, Andrea, Mirko Saponaro, and Massimo Nanna. "RPAS-based framework for simplified seismic risk assessment of Italian RC-bridges." Buildings 10.9 (2020): 150.
[9] Soga, Kenʼichi, Ivan Vaníček, and A. Gens. Micro-Measurement and Monitoring System for Ageing Underground Infrastructures. Czech Technical University in Prague, 2011.
[10] Golewski, Grzegorz Ludwik. "The phenomenon of cracking in cement concretes and reinforced concrete structures: the mechanism of cracks formation, causes of their initiation, types and places of occurrence, and methods of detection—a review." Buildings 13.3 (2023): 765.
[11] Sun, H., Burton, H. V. and Huang, H. "Machine learning applications for building structural design and performance assessment: State-of-the-art review." Journal of Building Engineering 33 (2021): 101816.
[12] Xie, Y., et al. "The promise of implementing machine learning in earthquake engineering: A state-of-the-art review." Earthquake Spectra 36.4 (2020): 1769-1801.
[13] Flah, M., et al. "Machine learning algorithms in civil structural health monitoring: A systematic review." Archives of Computational Methods in Engineering 28.4 (2021): 2621-2643.
[14] Ruggieri, Sergio, et al. "Machine-learning based vulnerability analysis of existing buildings." Automation in Construction 132 (2021): 103936.
[15] Zhang, Allen, et al. "Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network." Computer-Aided Civil and Infrastructure Engineering 32.10 (2017): 805-819.
[16] Yang, Xincong, et al. "Automatic pixel-level crack detection and measurement using fully convolutional network." Computer-Aided Civil and Infrastructure Engineering 33.12 (2018): 1090-1109.
[17] Qiao, Wenting, et al. "A crack identification method for concrete structures using improved U-Net convolutional neural networks." Mathematical Problems in Engineering 2021 (2021): 1-16.
[18] Inam, Hina, et al. "Smart and automated infrastructure management: A deep learning approach for crack detection in bridge images." Sustainability 15.3 (2023): 1866.
[19] Yu, Zhen. "YOLO V5s-based deep learning approach for concrete cracks detection." SHS Web of Conferences. Vol. 144. EDP Sciences, 2022.
[20] Yu, Gui, and Xinglin Zhou. "An improved YOLOv5 crack detection method combined with a bottleneck transformer." Mathematics 11.10 (2023): 2377.
[21] Fu, Huixuan, et al. "Bridge crack semantic segmentation based on improved Deeplabv3+." Journal of Marine Science and Engineering 9.6 (2021): 671.
[22] Cardellicchio, Angelo, et al. "Physical interpretation of machine learning-based recognition of defects for the risk management of existing bridge heritage." Engineering Failure Analysis 149 (2023): 107237.
[23] CVAT. "CVAT: Computer Vision Annotation Tool." GitHub, 2023, https://github.com/cvat-ai/cvat.
[24] Cardellicchio, A., et al. "Patch-based probabilistic identification of plant roots using convolutional neural networks." Pattern Recognition Letters (2024).
[25] Python Software Foundation. Python Language Reference, version 3.8. Available at http://www.python.org.
[26] Bradski, Gary. "The OpenCV library." Dr. Dobb's Journal: Software Tools for the Professional Programmer 25.11 (2000): 120-123.
[27] Harris, Charles R., et al. "Array programming with NumPy." Nature 585.7825 (2020): 357-362.
[28] Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of Machine Learning Research 12 (2011): 2825-2830.
[29] Paszke, Adam, et al. "PyTorch: An imperative style, high-performance deep learning library." Advances in Neural Information Processing Systems 32 (2019).