Multispectral Deep Learning Models for Wildfire Detection

Smitha Haridasan, Ajita Rattani*, Zelalem Demissie and Atri Dutta
Wichita State University, Wichita, KS, USA

Abstract
Aided by wind, all it takes is one ember and a few minutes to start a wildfire. Wildfires are growing in frequency and size due to climate change, and wildfires and their consequences are among the major environmental concerns. Every year, millions of hectares of forest are destroyed around the world, causing mass destruction and human casualties. Early detection of wildfire is therefore a critical component of mitigating this threat. Many computer-vision-based techniques have been proposed for the early detection of forest fire using video surveillance, predicting and detecting fires in various spectra, namely RGB, HSV, and YCbCr. The aim of this paper is to propose multi-spectral deep learning models that combine information from different spectra at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models obtain improvements of about 1.9% and 14.88% on the test and challenge sets, respectively, over models based on a single spectrum, even in challenging environments.

Keywords
deep learning, forest fire detection, natural hazard detection, multi-spectral learning

International Workshop on Data-driven Resilience Research 2022, July 6, 2022, Leipzig, Germany
*Corresponding author.
sxharidasan@shockers.wichita.edu (S. Haridasan); ajita.rattani@wichita.edu (A. Rattani); zelalem.demissie@wichita.edu (Z. Demissie); atri.dutta@wichita.edu (A. Dutta)
ORCID: 0000-0002-1541-8202 (A. Rattani); 0000-0001-9917-0234 (Z. Demissie); 0000-0003-2191-0305 (A. Dutta)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Wildfires, or conflagrations, are large, destructive, unexpected fires, often caused by human activity or natural phenomena, that spread quickly over woodland and bush such as forests and prairies and become uncontrollable. As many as 90 percent of wildfires in the United States are caused by humans, for example through campfires left unattended, the burning of debris, downed power lines, negligently discarded cigarettes, and intentional acts of arson. Surface fires, ground fires, crown fires, and spot fires are different types of hazardous fires, each differing in the way the fire spreads. The year 2021 witnessed numerous wildfires in the United States. Wildfires in Colorado, Portland, California, Arizona, Texas, Alabama, Kentucky, and Florida burned millions of acres of forest, endangering human lives and wildlife and affecting the economy and the environment. As of December 2021, statistics from the US Fire Administration [1] indicate that 7,819,070 acres of forest were destroyed by 58,288 fires in 2021, resulting in 128 firefighter fatalities. Rather than detecting and reporting massive fires, it is important to devise techniques that detect smoke and fire at an early stage; early detection and instant reporting of such fire incidents are essential to mitigate the damage caused by wildfire. Traditional fire detection systems include thermal and optical smoke and heat detectors.
Buildings and vehicles are often equipped with thermal and optical sensors for smoke detection. However, these sensor-based smoke detectors are usually installed indoors, have to be in close proximity to the fire or smoke in order to detect it, and do not provide the location of the fire or smoke incident. To overcome some of these disadvantages of sensor-based systems for wildfire detection and monitoring, recent research and development has focused on computer-vision-based surveillance systems for the early detection of smoke and fire. In existing studies, image features such as color, histograms, motion, and flicker, combined with machine learning algorithms, have been used to detect and locate fire and smoke. The RGB, YCbCr, and HSV color spectra provide different imaging characteristics under different conditions. For instance, Premal et al. [2] used the YCbCr spectrum to detect fire and the RGB spectrum to detect smoke by exploiting the grayish-to-black regions in the image. Chmelar et al. [3] used an HSV-based Gaussian mixture model to detect fire. Melendez et al. [4] used multi-spectral imaging in the infrared spectrum to measure instantaneous radiated power and total radiated energy. Foggia et al. [5] used a multi-classification system combining color, shape, and motion to detect fire. Fire and smoke have unreliable and unpredictable color and shape characteristics; therefore, fire detection methods based on color, area, and shape alone are often not reliable.
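To make the color-rule family concrete, the sketch below implements a minimal YCbCr fire-pixel test in the spirit of the rules used by Premal et al. [2]. The specific conditions (Y > Cb, Cr > Cb, and comparisons against per-channel means) follow the common formulation of such rules in the literature and are assumptions for illustration, not the exact values of the cited work.

```python
import cv2
import numpy as np

def fire_pixel_mask_ycbcr(rgb_image: np.ndarray) -> np.ndarray:
    """Rule-based fire-pixel mask in YCbCr space (illustrative sketch only).

    Assumed rules: fire pixels tend to satisfy Y > Cb and Cr > Cb, with
    Y and Cr above, and Cb below, their respective image means.
    """
    # OpenCV's conversion yields channels in Y, Cr, Cb order.
    ycrcb = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2YCrCb).astype(np.float32)
    y, cr, cb = cv2.split(ycrcb)

    mask = (
        (y > cb) & (cr > cb) &
        (y > y.mean()) & (cb < cb.mean()) & (cr > cr.mean())
    )
    return mask.astype(np.uint8) * 255  # 255 marks a candidate fire pixel

# Hypothetical usage: flag the image if more than ~1% of pixels fire the rules.
# bgr = cv2.imread("frame.jpg")                       # placeholder file name
# rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
# is_fire = fire_pixel_mask_ycbcr(rgb).mean() > 2.55  # 1% of 255
```

As the surrounding text notes, rule sets of this kind are brittle under fire-like colors (sunsets, lights), which motivates the learned models studied in this paper.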
Recent advances in deep learning have achieved hallmark accuracy rates on various computer vision problems [6, 7]. On this front, deep-learning-based object recognition and localization schemes have been utilized for wildfire detection and localization in the existing literature. Frizzi et al. [8] and Yin et al. [9] used deep convolutional neural networks to identify fire in videos. Muhammad et al. [10] applied transfer learning to pretrained VGG-16 and ResNet-50 based convolutional neural network (CNN) models for fire detection. With the increasing use of unmanned aerial vehicles for surveillance, there is a pressing need for lightweight models that can be deployed in resource-constrained environments. In this context, a handful of studies have used lightweight models for forest fire detection. For example, Khan et al. [11] added an expansion layer to MobileNetv2 and eliminated the fully connected layers to make it computationally inexpensive. Yang et al. [12] applied transfer learning on MobileNet for forest fire classification. Abid et al. [13] and Geetha et al. [14] conducted comprehensive surveys of machine learning algorithms for forest fire prediction and detection. Barmpoutis et al. [15] presented a comprehensive review of optical remote sensing techniques used in early fire warning systems and an extensive survey of flame and smoke detection algorithms. However, the state of the art is still not mature, and there is a need for advanced deep learning models for the accurate detection of forest fire.

The aim of this manuscript is to develop advanced multi-spectral deep learning models that combine information from various spectra, namely RGB, YCbCr, and HSV, for accurate fire detection by leveraging features from all three spectra. Experimental results of the multi-spectral models on publicly available datasets show an increase in accuracy of about 1.9% and 14.88% on the test and challenge sets, respectively, over models based on a single spectrum. Fig. 1 shows the schema of the multi-spectral deep learning model, which combines information from the various spectra by concatenating the fully connected layers of all the networks, followed by further dense layers and the final output layer.

Figure 1: Multi-spectral deep learning model that combines information from various spectra for fire detection.

In summary, the main contributions of this work are as follows:

• Multi-spectral deep learning models that combine information obtained from the intermediate layers of various spectra, i.e., RGB, YCbCr, and HSV, are used for fire detection. To this aim, multi-spectral versions of lightweight models such as MobileNetv2 [16], NASNetMobile [17], DenseNet121 [18], Xception [19], and InceptionV3 [20] were developed and compared with multi-spectral versions of heavyweight models such as InceptionResNetv2 [21] and ResNet152v2 [22].
• Comparative evaluation against unimodal deep learning models that use a single spectrum for fire detection.
• Assembly of a large composite heterogeneous dataset from benchmark fire detection datasets, namely FD-Dataset [23], Firesense [24], DeepQuest [25], and FurgFire [26], for deep learning model training and evaluation.
• Comparative evaluation of unimodal and multi-spectral deep learning models on aerial forest images with pile burns.

The rest of this paper is structured as follows: Section 2 describes the dataset and the experimental protocol used in this study. Section 3 presents the results and discussion of the individual and multi-spectral models on heterogeneous fire-accident and aerial forest images. Conclusions and future work are discussed in Section 4.

2. Dataset and Experimental Protocol

2.1. Dataset

To evaluate the performance of the proposed approach for detecting fire in images, we assembled a heterogeneous dataset (Table 1) consisting of RGB images extracted from publicly available datasets, namely FD-Dataset [23], Firesense [24], DeepQuest [25], and FurgFire [26]. The fire images in the heterogeneous dataset show various indoor and outdoor fire and smoke incidents, including wildfire images captured using drones (Fig. 2). Apart from the training, validation, and test sets, the heterogeneous dataset contains a challenge set of no-fire images that are engulfed in smoke or contain fire-like objects such as dust, fog, traffic lights, sunset, and sunrise, as shown in Fig. 3. This challenge set aims to test the effectiveness of the models under challenging conditions.

Figure 2: Sample wildfire images from the heterogeneous dataset. (a) Smoke and wildfire; (b) Daytime wildfire; (c) Wildfire at night.

Figure 3: Sample challenge-set no-fire images. (a) Foggy evening; (b) Night traffic; (c) Sunset.

Table 1 shows the composition of the heterogeneous dataset: 37,217 fire and 37,118 no-fire images in the training set, 5,000 fire and 4,996 no-fire images in the validation set, 7,344 fire and 7,296 no-fire images in the test set, and 250 fire and 248 no-fire images in the challenge set.

Table 1: Characteristics of the heterogeneous dataset used in this study.

|                | Fire Images | No-Fire Images |
|----------------|-------------|----------------|
| Training set   | 37,217      | 37,118         |
| Validation set | 5,000       | 4,996          |
| Test set       | 7,344       | 7,296          |
| Challenge set  | 250         | 248            |

All the images are in RGB format and are converted to HSV and YCbCr [27]. The RGB, HSV, and YCbCr images are preprocessed and resized as required by the respective CNN models used in the experiments.
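As a concrete illustration of this preprocessing step, the sketch below converts an RGB image into its HSV and YCbCr views with OpenCV and resizes each view for a CNN backbone. The 224x224 target size, the [0, 1] scaling, and the file name are illustrative assumptions; in practice each backbone defines its own input resolution and preprocessing.

```python
import cv2
import numpy as np

def prepare_spectra(rgb_image: np.ndarray, size: tuple = (224, 224)):
    """Produce the three spectral views of one image, resized for a CNN.

    `size` is a placeholder; each backbone expects its own input
    resolution (e.g., 224x224 for MobileNetV2, 299x299 for Xception).
    """
    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)      # 8-bit hue is in [0, 179]
    ycbcr = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2YCrCb)  # OpenCV uses Y, Cr, Cb order

    # Resize all three views and scale pixel values to roughly [0, 1];
    # the scaling convention is an assumption, not the paper's stated choice.
    views = [cv2.resize(v, size).astype(np.float32) / 255.0
             for v in (rgb_image, hsv, ycbcr)]
    return views  # [rgb, hsv, ycbcr], each of shape (H, W, 3)

# Hypothetical usage:
# bgr = cv2.imread("fire_frame.jpg")                # placeholder file name
# rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
# rgb_v, hsv_v, ycbcr_v = prepare_spectra(rgb)
```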
Data augmentation techniques such as rotation, shear, zoom, and horizontal flip were applied to all images in the training set to avoid model over-fitting and to improve generalization ability.

2.2. Experimental Protocol

2.2.1. Individual Models for the RGB, HSV and YCbCr Spectra

In this experiment, lightweight and heavyweight CNN models were trained on the RGB, HSV, and YCbCr spectra separately. The models used in this study are MobileNetv2, NASNetMobile, DenseNet121, Xception, InceptionV3, InceptionResNetv2, and ResNet152v2. Pretrained versions of these models, with weights trained on ImageNet, were fine-tuned on the training set of the heterogeneous dataset. Dense layers of sizes 2048-1024-256-64 were used, followed by a final output layer that classifies images into fire and no-fire.

2.2.2. Multispectral Models

The multi-spectral versions of these models take images from the RGB, HSV, and YCbCr spectra as input. The images are convolved with a separate stack of convolutional layers for each spectrum, and the features from the three spectra are then concatenated horizontally to obtain a combined feature vector. The convolutional layers are followed by dense layers of sizes 2048-1024-256-64 and a final output layer of two neurons for the final classification. A learning rate of 3e-4 was used for all the experiments. The models used ReLU activations and were trained with the Adam optimizer. The batch size was set to 64 for training. All the experiments were conducted on Intel Xeon processors with two NVIDIA Quadro RTX 4000 GPUs.
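A minimal sketch of this three-branch fusion architecture is given below in tf.keras, using ImageNet-pretrained MobileNetV2 backbones as one concrete choice among the models studied. The global-average-pooled backbone features stand in for the paper's per-spectrum convolutional stacks, and the cross-entropy loss is an assumption; the 3e-4 learning rate, Adam optimizer, 2048-1024-256-64 dense stack, two-neuron output, and batch size of 64 follow the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_branch(name: str) -> Model:
    """One per-spectrum branch: an ImageNet-pretrained backbone (assumed choice)."""
    backbone = tf.keras.applications.MobileNetV2(
        include_top=False, weights="imagenet",
        input_shape=(224, 224, 3), pooling="avg")
    backbone._name = f"{name}_backbone"  # avoid name clashes between the three copies
    return backbone

def build_multispectral_model() -> Model:
    inputs = {s: layers.Input((224, 224, 3), name=f"{s}_input")
              for s in ("rgb", "hsv", "ycbcr")}

    # Each spectrum passes through its own stack of convolutional layers.
    feats = [build_branch(s)(x) for s, x in inputs.items()]

    # Features from the three spectra are concatenated horizontally.
    x = layers.Concatenate()(feats)
    for units in (2048, 1024, 256, 64):             # dense stack from the paper
        x = layers.Dense(units, activation="relu")(x)
    out = layers.Dense(2, activation="softmax")(x)  # fire vs. no-fire

    model = Model(list(inputs.values()), out)
    model.compile(optimizer=tf.keras.optimizers.Adam(3e-4),  # lr from the paper
                  loss="categorical_crossentropy",           # assumed loss
                  metrics=["accuracy"])
    return model

# Hypothetical usage with one-hot labels:
# model = build_multispectral_model()
# model.fit([rgb_batch, hsv_batch, ycbcr_batch], labels, batch_size=64)
```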
3. Results and Discussion

3.1. Experiment 1: Performance Evaluation of CNN Models for the RGB, HSV and YCbCr Spectra

Table 2 shows the performance, in terms of accuracy, of MobileNetv2, NASNetMobile, DenseNet121, Xception, InceptionV3, InceptionResNetv2, and ResNet152v2 evaluated on the test and challenge sets of the heterogeneous dataset. In Table 2, the best results are highlighted in bold and the second-best results in italics.

Table 2: Test and challenge accuracy (%) of the CNN models for RGB, HSV and YCbCr, separately, when evaluated on the test and challenge sets of the heterogeneous dataset.

| Model             | Test RGB  | Test HSV  | Test YCbCr | Challenge RGB | Challenge HSV | Challenge YCbCr |
|-------------------|-----------|-----------|------------|---------------|---------------|-----------------|
| MobileNetv2       | 89.75     | 80.70     | 77.78      | 69.48         | 52.01         | 50.60           |
| NASNetMobile      | 86.14     | 73.63     | 80.81      | *77.31*       | 50.00         | 56.63           |
| DenseNet121       | 93.82     | 77.20     | 79.89      | 73.89         | 54.21         | 50.00           |
| Xception          | 94.30     | **83.10** | *84.94*    | 67.27         | 56.83         | **61.84**       |
| InceptionV3       | **94.90** | *82.70*   | **87.87**  | 74.69         | *65.86*       | 53.41           |
| InceptionResNetv2 | *94.68*   | 82.40     | 83.71      | **81.92**     | 64.66         | *58.43*         |
| ResNet152v2       | 93.35     | 81.50     | 82.79      | 68.27         | **66.47**     | 50.00           |

InceptionV3 obtained the highest test accuracy of 94.90% and 87.87% for the RGB and YCbCr spectra, respectively, and Xception obtained the highest test accuracy of 83.10% for the HSV spectrum. Comparing the models across spectra on the test set, the RGB spectrum obtained the best accuracy for fire detection, followed by HSV and YCbCr. On the challenge set, RGB again obtained the best accuracy, followed by HSV and YCbCr. This shows that features from the RGB spectrum are the most effective for fire detection compared to HSV and YCbCr. On the challenge set, InceptionResNetv2, ResNet152v2, and Xception obtained the highest accuracies of 81.92%, 66.47%, and 61.84% in the RGB, HSV, and YCbCr spectra, respectively. The performance of all the models deteriorated on the challenge set due to the flame-like objects and smoke-engulfed images, which led to more false positives and false negatives.

3.2. Experiment 2: Two-Spectrum Fusion Models for Forest Fire Detection

Table 3 shows the test and challenge accuracy for the fusion of two spectra, RGB+HSV and RGB+YCbCr.

Table 3: Test and challenge accuracy (%) obtained from two-spectrum fusion.

| Model             | Test RGB+HSV | Test RGB+YCbCr | Challenge RGB+HSV | Challenge RGB+YCbCr |
|-------------------|--------------|----------------|-------------------|---------------------|
| MobileNetv2       | 85.67        | 78.95          | 68.70             | 64.39               |
| NASNetMobile      | 81.19        | 78.20          | 79.94             | 76.45               |
| DenseNet121       | 80.68        | 84.90          | 76.31             | 79.89               |
| Xception          | 85.39        | 82.34          | 68.98             | 73.25               |
| InceptionV3       | 87.47        | 83.01          | 75.42             | 67.58               |
| InceptionResNetv2 | 82.42        | 83.14          | 84.53             | 82.46               |
| ResNet152v2       | 90.05        | 87.28          | 74.28             | 71.48               |

When fusing RGB and HSV, ResNet152v2 and InceptionV3 gave the best and second-best test accuracies of 90.05% and 87.47%, respectively. On average, the fusion of RGB and HSV gave a test accuracy of 84.70%, which is 5.33% higher than the average test-set performance of HSV in Experiment 1. When fusing RGB and YCbCr, ResNet152v2 and DenseNet121 gave the best and second-best test accuracies of 87.28% and 84.90%, respectively. On average, the fusion of RGB and YCbCr gave a test accuracy of 82.55%, which again is a significant improvement over the performance of YCbCr in Experiment 1.

Table 3 also shows the challenge accuracy obtained from the fusion of two spectra. When combining RGB and HSV, InceptionResNetv2 and NASNetMobile gave the best and second-best challenge accuracies of 84.53% and 79.94%, respectively. An average challenge accuracy of 75.45% was obtained when fusing the RGB and HSV spectra, which is 2.90% and 22.36% higher than the averages obtained with the RGB and HSV spectra, respectively, on the challenge set in Experiment 1. When fusing RGB and YCbCr, InceptionResNetv2 and DenseNet121 gave the best and second-best challenge accuracies of 82.46% and 79.89%, respectively. An average challenge accuracy of 73.64% was obtained when fusing RGB and YCbCr, which is 25.91% higher than the performance of YCbCr on the challenge set in Experiment 1.

3.3. Experiment 3: Multi-Spectral Models for Forest Fire Detection

Table 4 shows the performance of the multi-spectral models on the test and challenge sets. The best results are highlighted in bold and the second-best results in italics.

Table 4: Test and challenge accuracy (%) of all the multi-spectral models for fire detection.

| Model             | Test set accuracy | Challenge set accuracy |
|-------------------|-------------------|------------------------|
| MobileNetv2       | 91.67             | 80.32                  |
| NASNetMobile      | 87.75             | *83.13*                |
| DenseNet121       | 95.94             | 65.23                  |
| Xception          | **96.80**         | 77.11                  |
| InceptionV3       | 96.01             | 81.12                  |
| InceptionResNetv2 | 95.99             | **85.94**              |
| ResNet152v2       | *96.38*           | 73.29                  |

Maximum test and challenge accuracies of 96.80% and 85.94%, respectively, were obtained when combining features from multiple spectra. Multi-spectrum fusion gave an average test accuracy of 93.28%, which is 14.04% and 11.51% higher than the averages obtained with HSV and YCbCr in Experiment 1. An average challenge accuracy of 78.02% was obtained with multi-spectral fusion, which is 6.10%, 24.92%, and 30.26% higher than the average challenge accuracies obtained with the RGB, HSV, and YCbCr spectra in Experiment 1, respectively. The highest and second-highest challenge accuracies were obtained using InceptionResNetv2 and NASNetMobile. Overall, the multi-spectral models increased the test and challenge set accuracies by 1.9% and 4.02%, respectively, compared to the results of the individual models.
3.4. Experiment 4: Individual and Multi-Spectral Models on Aerial Wildfire Images

This experiment uses the aerial-imagery FLAME (Fire Luminosity Airborne-based Machine learning Evaluation) dataset [28], which consists of images (Table 5) of pile burns in Northern Arizona captured using drones. The dataset contains 25,018 fire and 14,357 no-fire images in the training set, and 5,137 fire and 3,480 no-fire images in the test set. Sample fire and no-fire images are shown in Fig. 4. The fire images include fire and smoke from pile burns, while the no-fire images include clouds and fog, making the classification a challenging task.

Figure 4: Sample aerial images from [28]. (a) Fire image in the train set; (b) Fire image in the test set; (c) Clouds in an aerial image.

Table 5: Aerial image dataset statistics.

|           | Fire   | No Fire |
|-----------|--------|---------|
| Train set | 25,018 | 14,357  |
| Test set  | 5,137  | 3,480   |

Table 6 shows the test accuracy obtained with the individual RGB, HSV, and YCbCr spectra and with three-spectrum fusion for the models MobileNetv2, NASNetMobile, DenseNet121, Xception, InceptionV3, InceptionResNetv2, and ResNet152v2, evaluated on the test set of aerial images (Table 5).

Table 6: Test accuracy (%) obtained from individual spectra and multi-spectrum fusion for fire image classification.

| Model             | RGB   | YCbCr | HSV   | Multi-spectral fusion |
|-------------------|-------|-------|-------|-----------------------|
| MobileNetv2       | 65.11 | 64.49 | 66.62 | 69.85                 |
| NASNetMobile      | 66.20 | 68.04 | 65.74 | 70.80                 |
| DenseNet121       | 67.81 | 61.11 | 69.84 | 70.35                 |
| ResNet152v2       | 64.81 | 60.01 | 61.05 | 74.02                 |
| Xception          | 61.25 | 68.85 | 61.28 | 75.53                 |
| InceptionV3       | 60.47 | 60.88 | 60.07 | 74.55                 |
| InceptionResNetv2 | 60.42 | 69.26 | 61.11 | 79.29                 |

Fusion of the spectra consistently gave better results than any individual spectrum. In summary, the experimental results demonstrate the effectiveness of the multi-spectral models for fire image classification.

4. Conclusions

Considering wildfire frequencies and the casualties involved, automated computer-vision-based systems for early fire detection are an important topic of research. However, automated fire detection using computer vision is a challenging task due to the non-uniform shape and color of fire and smoke and the presence of motion. In this study, we investigated multi-spectral deep learning models that combine complementary information from various spectra for performance enhancement. Experimental results on a large-scale heterogeneous dataset and an aerial forest dataset show that the multi-spectral models outperform models trained on a single spectrum. As part of future work, an end-to-end model will be developed for smoke and fire detection and localization.

5. Acknowledgement

We acknowledge support from the Wichita State University President's Convergent Science Initiative for conducting the research described in this paper.

References

[1] NIFC, National Interagency Fire Center, 2021. URL: https://www.nifc.gov/fire-information/nfn.
[2] C. E. Premal, S. Vinsley, Image processing based forest fire detection using YCbCr colour model, in: 2014 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2014], IEEE, 2014, pp. 1229–1237.
[3] P. Chmelar, A. Benkrid, Efficiency of HSV over RGB Gaussian mixture model for fire detection, in: 2014 24th International Conference Radioelektronika, IEEE, 2014, pp. 1–4.
[4] J. Meléndez, J. M. Aranda, A. J. de Castro, F. López, Measurement of forest fire parameters with multi-spectral imaging in the medium infrared, Quantitative InfraRed Thermography Journal 3 (2006) 183–199.
[5] P. Foggia, A. Saggese, M. Vento, Real-time fire detection for video-surveillance applications using a combination of experts based on color, shape, and motion, IEEE Transactions on Circuits and Systems for Video Technology 25 (2015) 1545–1556.
[6] A. V. Nadimpalli, A. Rattani, GBDF: Gender balanced deepfake dataset towards fair deepfake detection, 2022. URL: https://arxiv.org/abs/2207.10246. doi:10.48550/ARXIV.2207.10246.
[7] S. Ramachandran, A. Rattani, Deep generative views to mitigate gender classification bias across gender-race groups, 2022. URL: https://arxiv.org/abs/2208.08382. doi:10.48550/ARXIV.2208.08382.
[8] S. Frizzi, R. Kaabi, M. Bouchouicha, J.-M. Ginoux, E. Moreau, F. Fnaiech, Convolutional neural network for video fire and smoke detection, in: IECON 2016 – 42nd Annual Conference of the IEEE Industrial Electronics Society, IEEE, 2016, pp. 877–882.
[9] Z. Yin, B. Wan, F. Yuan, X. Xia, J. Shi, A deep normalization and convolutional neural network for image smoke detection, IEEE Access 5 (2017) 18429–18438.
[10] K. Muhammad, J. Ahmad, I. Mehmood, S. Rho, S. W. Baik, Convolutional neural networks based fire detection in surveillance videos, IEEE Access 6 (2018) 18174–18183.
[11] K. Muhammad, S. Khan, M. Elhoseny, S. H. Ahmed, S. W. Baik, Efficient fire detection for uncertain surveillance environment, IEEE Transactions on Industrial Informatics 15 (2019) 3113–3122.
[12] H. Yang, H. Jang, T. Kim, B. Lee, Non-temporal lightweight fire detection network for intelligent surveillance systems, IEEE Access 7 (2019) 169257–169266.
[13] F. Abid, A survey of machine learning algorithms based forest fires prediction and detection systems, Fire Technology 57 (2021) 559–590.
[14] S. Geetha, C. Abhishek, C. Akshayanat, Machine vision based fire detection techniques: A survey, Fire Technology 57 (2021) 591–623.
[15] P. Barmpoutis, P. Papaioannou, K. Dimitropoulos, N. Grammalidis, A review on early forest fire detection systems using optical remote sensing, Sensors 20 (2020) 6442.
[16] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, MobileNetV2: Inverted residuals and linear bottlenecks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520.
[17] B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le, Learning transferable architectures for scalable image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8697–8710.
[18] G. Huang, Z. Liu, L. Van Der Maaten, K. Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4700–4708.
[19] F. Chollet, Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1251–1258.
[20] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the Inception architecture for computer vision, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2818–2826.
[21] C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, in: Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[22] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[23] S. Li, Q. Yan, P. Liu, An efficient fire detection method based on multiscale feature extraction, implicit deep supervision and channel attention mechanism, IEEE Transactions on Image Processing 29 (2020) 8467–8475.
[24] N. Grammalidis, K. Dimitropoulos, E. Cetin, FIRESENSE database of videos for flame and smoke detection, IEEE Transactions on Circuits and Systems for Video Technology 25 (2017) 339–351.
[25] DeepQuest, Fire-Smoke-Dataset, 2021. URL: https://github.com/DeepQuestAI/Fire-Smoke-Dataset.
[26] C. R. Steffens, R. N. Rodrigues, S. S. da Costa Botelho, An unconstrained dataset for non-stationary video based fire detection, in: 2015 12th Latin American Robotics Symposium and 2015 3rd Brazilian Symposium on Robotics (LARS-SBR), IEEE, 2015, pp. 25–30.
[27] N. A. Ibraheem, M. M. Hasan, R. Z. Khan, P. K. Mishra, Understanding color models: a review, ARPN Journal of Science and Technology 2 (2012) 265–275.
[28] A. Shamsoshoara, F. Afghah, A. Razi, L. Zheng, P. Fulé, E. Blasch, The FLAME dataset: Aerial imagery pile burn detection using drones (UAVs), 2020. URL: https://dx.doi.org/10.21227/qad6-r683. doi:10.21227/qad6-r683.