Application of Naive Bayes and Decision Tree in the Prediction of Power Transformers Faults based on DGA Yassine Mahamdi1, Ahmed Boubakeur2, Abdelouahab Mekhaldi3andYoucef Benmahamed4 1,2,3,4 Ecole Nationale Polytechnique (ENP),B.P 182 EL-HARRACH, Algiers, 16200, Algeria Abstract Power transformers are the basic elements of the power grid, the state of which is directly related to the reliability of the electrical system. Many techniques were used to prevent power transformers failures, but the Dissolved Gas Analysis(DGA) remains the most effective one. Based on the DGA technique, we describe in this paper the use of two of the most effective machine learning algorithms: Naive Bayes (NB) and Decision Tree (DT) to identify power transformers faults. In our investigation, we developed 9 different input vectors from widely known DGA techniques. We used 481 samples and considered 6 types of faults. The implementation of the proposed methods has achieved an effectiveness of 86.25% in power transformers faults diagnosis. Keywords 1 DGA, Decision Tree, Naive Bayes, Input vectors, faults diagnosis, Accuracy rate. 1. Introduction Dissolved Gas Analysis is the most common and effective method for detecting transformer faults. It can immediately predict internal transformer failures, which generally avoids huge economic losses. A transformer in service is exposed to two types of stresses: electrical and thermal [1]. Due to these stresses, the transformer oil and paper decompose, releasing a set of gases that reduce their dielectric strength. The nature and quantity of each dissolved gas produced in transformer oil can indicate the internal condition of the transformer. The most common gases produced by the decomposition of oil are: ethane (C2H6), ethylene (C2H4),acetylene (C2H2), methane (CH4) and hydrogen (H2)[2], these differ mainly in the intensity of the energy which is dissipated by the fault [1], [3]. In addition tocarbon dioxide (CO2) and carbon monoxide (CO) that are formed as a result of the decomposition of paper[4], while, the nitrogen (N2) and the Oxygen (O2) are the non-fault gases. There are many approaches developed for the analysis of dissolved gases in transformer oil and interpret their meaning including IEC Ratio, DORNENBURG Ratio, Rogers Ratio, Duval Triangle and Pentagon, and,Key Gas method. However, these techniques have certain limitations such as the existence of non-decision areas and erroneous results [5]. To overcome this situation, several artificial intelligence techniques have been used to improve the diagnostic accuracy of power transformers, such as fuzzy logic inference systems [6], artificial neural networks [7], hybrid grey wolf optimization [4], support vector machines and K-nearest neighbors [8-9], and have impressive performance [10- 12]. In this paper, we examine the use of the Naive Bayes and the Decision Tree algorithms in faults identification. The originality comes from the introduction of several input vectors formed using International Conference on Emerging Technologies: AI, IoT, and CPS for Science & Technology Applications, September 06–07, 2021, NITTTR Chandigarh, India EMAIL: yassine.mahamdi@g.enp.edu.dz (A. 1); ahmed.boubakeur@g.enp.edu.dz (A. 2); abdelouahab.mekhaldi@g.enp.edu.dz (A. 3); youcef.benmahamed@g.enp.edu.dz (A. 4); ORCID: 0000-0002-3777-7994 (A. 1); 0000-0001-9984-266X (A. 2); 0000-0001-6194-1363 (A. 3); 0000-0001-7179-3448 (A. 4); ©2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) widely known DGA techniques in order to identify the most suitable input data which gives the best performance of each algorithm and achieves the best prediction of fault in power transformers. This article is arranged as follows, in the second section, we describe the collection of DGA data then the construction of the proposed input space followed by a brief presentation of the two classification algorithms used; Decision Tree (DT) and Naïve Bayes (NB). The results of implementing the two algorithms using our proposed input vectors are discussed in the third section where, the best input vector for each technique has been identified. Finally, the conclusions from this work were summarized and potential future work was mentioned. 2. Methodology 2.1. Data collection The construction of our proposed input space needs gas concentration values. For this purpose, samples of transformer oil are taken periodically to check the gasesformed[12].Generally, mixtures of all gases are present in an oil sample, where the relative amount of each, could be an indicator of the existing faults, such as, partial discharges (PD), thermal faults > 700 °C (T3), thermal faults of 300 °C to 700 °C (T2), thermal faults < 300 °C (T1), high energy discharges (D2) and low energy discharges (D1)[4]. In this work, a database of 481 samples has been used in training and testing the proposed methods. This database has been extracted from the literature [13].The distribution of the training and the testing samples according to their fault type is shown in Table 1. Table 1 Samples distribution Samples for Fault types Abbreviations Samples for testing training Partial Discharge PD 32 16 Thermal Faults > 700 °C T3 57 28 Thermal Faults of 300 °C to 700 °C T2 32 16 Thermal Faults < 300 °C T1 63 32 High Energy Discharges D2 84 42 Low Energy Discharges D1 53 26 TOTAL 321 160 2.2. Proposed Input vectors: The following attributes have been considered in the construction of our proposed input vectors: 1. Using the concentration of the usual five key gases in ppm: X=[ H 2 , CH 4 , C2 H 2 , C2 H 4 , C2 H 6 ] (1) 2. Using the ratios between key gases (The IEC Ratios): CH 4 C2 H 2 C2 H 4 X=[ , , ] (2) H2 C2 H 4 C2 H 6 3. Using the relative percentages of gases: 𝑋𝑋 = [%𝐶𝐶2 𝐻𝐻6 , %𝐶𝐶2 𝐻𝐻4 , %𝐶𝐶2 𝐻𝐻2 , %𝐶𝐶𝐻𝐻4 , %𝐻𝐻2 ] (3) 4. Using ROGER's four-ratio: 𝐶𝐶 𝐻𝐻 𝐶𝐶 𝐻𝐻 𝐶𝐶 𝐻𝐻 𝐶𝐶𝐻𝐻4 𝑋𝑋 = [ 2 6 , 2 4 , 2 2 , ] (4) 𝐶𝐶𝐻𝐻4 𝐶𝐶2 𝐻𝐻6 𝐶𝐶2 𝐻𝐻4 𝐻𝐻2 5. Using DORNENBURG's four-ratios: CH 4 C2 H 2 C2 H 4 C 2 H 2 X=[ , , , ] (5) H2 C2 H 4 C2 H 6 CH 4 6. Using Duval’s triangle coordinates: 𝑋𝑋 = [𝐶𝐶𝑎𝑎 , 𝐶𝐶𝑏𝑏 ] (6) Where 1 ∑𝑘𝑘−1 𝑖𝑖=0 (𝑎𝑎 𝑖𝑖 +𝑎𝑎 𝑖𝑖+1 )(𝑎𝑎 𝑖𝑖 𝑏𝑏 𝑖𝑖+1 −𝑎𝑎 𝑖𝑖+1 𝑏𝑏 𝑖𝑖 ) 𝐶𝐶𝑎𝑎 = ∑𝑘𝑘−1 (7) 3 𝑖𝑖=0 (𝑎𝑎 𝑖𝑖 𝑏𝑏 𝑖𝑖+1 −𝑎𝑎 𝑖𝑖+1 𝑏𝑏 𝑖𝑖 ) And 1 ∑𝑘𝑘−1 (𝑏𝑏 𝑖𝑖 +𝑏𝑏 𝑖𝑖+1 )(𝑏𝑏 𝑖𝑖 𝑎𝑎 𝑖𝑖+1 −𝑏𝑏 𝑖𝑖+1 𝑎𝑎 𝑖𝑖 ) 𝐶𝐶𝑏𝑏 = 𝑖𝑖=0 ∑𝑘𝑘−1 (8) 3 𝑖𝑖=0 (𝑏𝑏 𝑖𝑖 𝑎𝑎 𝑖𝑖+1 −𝑏𝑏 𝑖𝑖+1 𝑎𝑎 𝑖𝑖 ) The ai are calculated by the equations: 𝜋𝜋 𝑎𝑎0 = %𝐶𝐶𝐻𝐻4 cos � � 2 𝜋𝜋 𝑎𝑎1 = %𝐶𝐶2 𝐻𝐻4 cos � + 𝜑𝜑� (9) 2 𝜋𝜋 𝑎𝑎2 = %𝐶𝐶2 𝐻𝐻2 cos � + 2𝜑𝜑� 2 And the bi could be obtained by replacing ‘’cos’’ with ‘’sin’’ in the last equations with α = 2π/3 7. Using Duval’s pentagon coordinates: 𝑋𝑋 = [𝐶𝐶𝑎𝑎 , 𝐶𝐶𝑏𝑏 ] (10) Where 1 ∑𝑘𝑘−1 (𝑎𝑎 𝑖𝑖 +𝑎𝑎 𝑖𝑖+1 )(𝑎𝑎 𝑖𝑖 𝑏𝑏 𝑖𝑖+1 −𝑎𝑎 𝑖𝑖+1 𝑏𝑏 𝑖𝑖 ) 𝐶𝐶𝑎𝑎 = 𝑖𝑖=0 ∑𝑘𝑘−1 (11) 6 𝑖𝑖=0 (𝑎𝑎 𝑖𝑖 𝑏𝑏 𝑖𝑖+1 −𝑎𝑎 𝑖𝑖+1 𝑏𝑏 𝑖𝑖 ) And 1 ∑𝑘𝑘−1 (𝑏𝑏 𝑖𝑖 +𝑏𝑏 𝑖𝑖+1 )(𝑏𝑏 𝑖𝑖 𝑎𝑎 𝑖𝑖+1 −𝑏𝑏 𝑖𝑖+1 𝑎𝑎 𝑖𝑖 ) 𝐶𝐶𝑏𝑏 = 𝑖𝑖=0 ∑𝑘𝑘−1 (12) 6 𝑖𝑖=0 (𝑏𝑏 𝑖𝑖 𝑎𝑎 𝑖𝑖+1 −𝑏𝑏 𝑖𝑖+1 𝑎𝑎 𝑖𝑖 ) The ai are calculated using the following equations: 𝜋𝜋 𝑎𝑎0 = %𝐻𝐻2 cos � � 2 𝜋𝜋 𝑎𝑎1 = %𝐶𝐶2 𝐻𝐻6 cos � + 𝜑𝜑� 2 𝜋𝜋 𝑎𝑎2 = %𝐶𝐶𝐻𝐻4 cos � + 2𝜑𝜑� (13) 2 𝜋𝜋 𝑎𝑎3 = %𝐶𝐶2 𝐻𝐻4 cos � + 3𝜑𝜑� 2 𝜋𝜋 𝑎𝑎4 = 𝐶𝐶2 𝐻𝐻4 cos � + 4𝜑𝜑� 2 Also, the bi could be obtained by replacing ‘’cos’’ with ‘’sin’’ in the last equations with α = 2π/5. 8. In this case, a combination of two of the previously mentioned input vectorshas been done, Roger's and DORNENBURG's ratios: CH 4 C2 H 2 C2 H 4 C2 H 2 C2 H 6 X=[ , , , , ] (14) H2 C2 H 4 C2 H 6 CH 4 CH 4 9. To further improve fault recognition by expanding the proposed input space , another combination was made in the case of this input vector, Duval’s triangle-pentagon coordinate’scombination: 𝑋𝑋 = [𝐶𝐶𝑎𝑎1 , 𝐶𝐶𝑏𝑏1 , 𝐶𝐶𝑎𝑎2 , 𝐶𝐶𝑏𝑏2 ] (15) Where {Ca1, Cb1} are calculated using the triangle method, while {Ca2, Cb2} are calculated according to the pentagon one. 2.3. AI techniques: 2.3.1. Naive Bayes The NAIVE BAYES algorithm is a simple probabilistic classifier that uses Bayes theorem,which is given by the following equation [14]: 𝑃𝑃(𝑦𝑦 |𝑥𝑥)×𝑃𝑃(x) 𝑃𝑃(𝑥𝑥|𝑦𝑦) = (16) 𝑃𝑃(𝑦𝑦) Where 𝑃𝑃(𝑥𝑥|𝑦𝑦)refers to the subsequent possibility of the hypothesis x conditioned by some evidence y and𝑃𝑃(x) is the prior probability of x. 2.3.2. Decision tree The decision tree algorithm is a non-parametric supervised machine learning’s classifier used to split data into a set of branches. The construction of the tree is conducted from top to bottom in a recursive divide-and-conquer manner. The Decision Tree classifier training is based on finding the best split at each node as long as the full data set is not analyzed [15]. The said principle leads to the idea of partitioning the feature space until the interrupt criterion is satisfied in each list, or until all points in a given leaf belong to one class. Figure 1 illustrates the basic structure of a decision tree. Figure 1: Decision Tree general structure Among other classification algorithms, Decision Tree have the following advantages: • Good performance with large data sets • Requires little data preparation • Easy to display graphically • Easy to understand and interpret Construction of decision tree: In order to select the best variable to split, the Decision Tree uses the information gain. The equation for calculating information gain is as follows: 𝑇𝑇 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺(𝑇𝑇, 𝐴𝐴) = 𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸(𝑇𝑇) − ∑𝑛𝑛𝑖𝑖=1 𝑖𝑖 𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝑦𝑦(𝑇𝑇𝑖𝑖 ) (17) 𝑇𝑇 Where 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺(𝑇𝑇, 𝐴𝐴) is the information gain of set T (training data) on an attribute A and 𝑇𝑇𝑖𝑖 is a subgroup of T for which: A has value i. The Entropy of node T is defined as: 𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸𝐸(𝑇𝑇) = − ∑𝑛𝑛𝑖𝑖=1 𝑝𝑝(𝑖𝑖) log 𝑝𝑝(𝑖𝑖) (18) Where 𝑝𝑝(𝑖𝑖) is the proportion of T belonging to a class i. 3. Results and discussion: To evaluate the performance of Naïve Bayes and Decision tree algorithms using our proposed input vectors according to six types of transformer faults, a set of 481 samples has been used to train and test the two methods; 67% of the dataset were used for the training and 33% for the testing, using the MATLAB software. Table 2 shows the results of the implementation of the two classifiers using the proposed input vectors. Table 2 Faults diagnostic results in percent using the Naïve Bayes and the Decision Tree algorithms with all the proposed input vectors Input vector 1 2 3 4 5 6 7 8 9 Naïve Bayes 25.62 81.87 13.75 11.25 28.75 58.25 42.50 28.75 86.25 Decision tree 75.62 80.62 83.12 83.75 77.50 45 78.75 76.25 78.75 From Table 2, it is easy to see that the highest prediction accuracy is obtained using vector 9 (combined Duval’s pentagon and triangle) with the Naïve Bayes algorithm (86.25%). Whereas, in the case of the Decision Tree, the input vector 4 (Roger's four-ratio method) gives the highest prediction accuracy, up to 83.75%. In order to deepen the study, the performance of each algorithm with its appropriate input vector was evaluated based on the accuracy of each fault type diagnosis (Figure 2). Figure 2:Histogram of accuracy rate From Figure 2, it is clear that the performance of each algorithm differs depending on the type of fault. For example, in the case of the partial discharges (PD), the Naïve Bayes has the best performance, while, in the case of medium thermal fault (T2), the Decision Tree has the superiority in such fault recognition. Overall, the Naïve Bayes algorithm remains the one with the greatest precision. 4. Conclusion: The Naïve Bayes and the Decision Tree classification algorithms were used to identify power transformer faults. A dataset of 481 samples was employed and 9 different input vectors were considered. The Naive Bayes algorithm achieved a diagnostic accuracy of 86.25% when using the 9th input vector (Duval’s triangle-pentagon coordinates combination), compared to 83.75% in the case of the Decision Tree using the 4th input vector (ROGER's four-ratio). These diagnostic results show an improvement in the identification of transformer faults over other traditional DGA methods. Significant differences in diagnostic accuracy were obtained when using the same classification algorithm with different input vectors, this investigation shows the appropriate input vector for the diagnosis of power transformers using the Naive Bayes and the Decision Tree algorithms. In a future work, we will extend the proposed input space using other input vectors with an improved machine learning algorithm. 5. References [1] ‘’Mineral Oil-Filled Electrical Equipment in Service - Guidance on the Interpretation of Dissolved and Free Gases Analysis’’, IEC Standard IEC 60599, IEC, Geneva, Switzerland, Edition 2.1, May 2007. [2] F. Jakob and J. J. Dukarm, “Thermodynamic estimation of transformer fault severity”, IEEE Trans. on Power Delivery, vol. 30, no. 4, pp. 1941–1948, 2015. [3] M. Duval and A. dePabla, “Interpretation of gas-in-oil analysis using new IEC publication 60599 and IEC TC 10 databases”, IEEE Electrical Insulation Magazine, vol. 17, no. 2, pp. 31–41, Mar. 2001. [4] A. Hoballah, D. A. Mansour and I. B. M. Taha, “Hybrid grey wolf optimizer for transformer fault diagnosis using dissolved gases considering uncertainty in measurements”, IEEE Access, vol. 8, pp. 139176–139187, 2020. [5] S. S. M. Ghoneim and I. B. M. Taha, ‘‘A new approach of DGA interpretation technique for transformer fault diagnosis’’, Int. J. Electr. Power Energy Syst., vol. 81, pp. 265–274, Oct. 2016. [6] Islam, S.M.; Wu, T.; Ledwich, G. ‘’A novel fuzzy logic approach to transformer fault diagnosis’’. IEEE Trans. Dielectr. Electr. Insul. 2000, 7, 177–186. [7] S. Souahlia, K. Bacha and A. Chaari, “MLP neural network-based decision for power transformers fault diagnosis using an improved combination of Rogers and Doernenburg ratios DGA”, Int. Journal of Electrical Power & Energy Systems, vol. 43, no. 1, pp. 1346–1353, [8] Benmahamed, Y.; Teguar, M.; Boubakeur, A. ‘’Application of SVM and KNN to Duval Pentagon 1 Transformer Oil Diagnosis’’. IEEE Trans. Dielect. Electr. Inst. 2017, 24, 3443–3451. [9] Kherif, O.; Benmahamed, Y.; Teguar, M.; Boubakeur, A and Ghoneim, S. S. M. "AccuracyImprovement of Power Transformer Faults Diagnostic Using KNN Classifier With Decision TreePrinciple," in IEEE Access, vol. 9, pp. 81693-81701, 2021. [10] Yang, M.-T.; Hu, L.-S. ‘’Intelligent fault types diagnostic system for dissolved gas analysis of oil-immersedpower transformer’’. IEEE Trans. Dielectr. Electr. Insul. 2013, 20, 2317–2324. [11] J. I. Aizpurua, V. M. Catterson, B. G. Stewart, S. D. J. McArthur, B. Lambert, B. Ampofo, G.Pereira, and J. G. Cross, ‘‘Power transformer dissolved gas analysis through Bayesian networksand hypothesis testing’’, IEEE Trans. Dielectrics Electr. Insul., vol. 25, no. 2, pp. 494– 506, Apr.2018. [12] Mirowski, P.; LeCun, Y. ‘’Statistical Machine Learning and Dissolved Gas Analysis: A Review’’. IEEE Trans. Power Deliv. 2012, 27, 1791–1799. [13] Taha, I.B.M.; Hoballah, A.; Ghoneim, S.S.M. ‘’Optimal ratio limits of Roger’s four-ratios and IEC 60599 code methods using particle swarm optimization fuzzy-logic approach’’. IEEE Trans. Dielect. Electr. Inst. 2020, 27, 222–230. [14] Dimitoglou, G., Adams, J. A., & Jim, C. M. (2012). ‘’Comparison of the C4. 5 and a Naïve Bayes classifier for the prediction of lung cancer survivability’’. arXiv preprint arXiv:1206.1121 [15] John Ross Quinlan, "C4.5: Programs for Machine Learning", Morgan Kaufmann Publishers, 1993