Custom semantic segmentation neural network architecture in spirochaete detection application

Michał Wieczorek1,*,†, Natalia Wojtas2,†

1 Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44100 Gliwice, Poland
2 Faculty of Veterinary Medicine, University of Life Sciences in Lublin, Stanisława Leszczyńskiego 7, 20400 Lublin, Poland

Abstract
The kingdom of bacteria is a very diverse group of organisms characterized by high phenotypic variability, a feature that is often used in clinical diagnosis. Spirochaetes are microbes with a characteristic spiral shape and flagella located within the periplasmic space. There is a high demand for a rapid and sensitive method for their detection, as many of them pose a high pathogenic risk. Current methods rely on a combination of clinical examination results, serologic tests and cultivation, supplemented by Polymerase Chain Reaction (PCR) if needed. Unfortunately, this combination can be very time consuming and expensive. This research presents a novel semantic segmentation neural network architecture designed to quickly create a classification mask that provides information about the position, shape and possible affiliation of the detected elements. The evaluation method is based on light microscope imagery and was created to overcome the above-mentioned problems. The abstract classes used are erythrocytes, spirochaetes and background. The resulting mask can later be mapped to a human-readable, color-coded form and shown next to the original image. Such an approach allows for semi-automatic recognition of unwanted objects while still leaving the final verdict to the specialist. The developed solution achieved high recognition accuracy while keeping the computing power requirements at a minimum.
The proposed solution can help reduce misclassification rates by providing additional data for the doctor and can speed up the entire process through an early diagnosis made by a neural network.

Keywords
Spirochaete, detection, mask, semantic segmentation, neural network

IVUS 2022: 27th International Conference on Information Technology
* Corresponding author.
† These authors contributed equally.
michal_wieczorek@hotmail.com (M. Wieczorek); natjia@wp.pl (N. Wojtas)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

The spirochetes are a phylum of mostly free-living, anaerobic, motile bacteria. These prokaryotes are large, long spirals. Their shape is slender, helically coiled, spiral or corkscrew-like [1]. These Gram-negative bacteria have a distinctive double membrane. Their lengths vary between 3 and 500 µm, with diameters of 0.09 to 3 µm [2]. Beneath the outer membrane they possess flagella, whose number is highly variable, from 2 in Spirochaeta to more than 300 in Cristispira [3]. The feature that distinguishes them from other phyla is the location of the flagella (axial filaments): on each pole of the bacterium, within the periplasmic space [4]. In everyday practice, veterinary doctors often encounter these bacteria, as many of them cause highly dangerous diseases. Examples include leptospirosis, Lyme borreliosis, treponematoses and Brachyspira species, which cause swine dysentery. Many of them are zoonotic agents, like Leptospira interrogans, which produces flu-like symptoms and renal and hepatic damage, and poses a serious risk to both wild and domestic animals, especially dogs and their human owners [5].

The most commonly used methods for detecting these pathogens are serologic and polymerase chain reaction (PCR) tests. Unfortunately, they are often quite expensive and not available directly in the clinic. In more complicated cases, when there is time and the owner can afford it, bacterial cultivation can be performed. A routine clinical examination often includes blood sampling for haematology and biochemistry evaluation, which gives an opportunity to prepare blood smears quickly and easily. This may allow the doctor to evaluate the blood microscopically and classify the pathogen visually; however, this method is currently not used as a standard. One of the reasons may be the huge variability among the microbes and the often very small optical differences between them. Specific staining is also needed to understand what type of bacteria the doctor is dealing with, which adds the cost of specific chemicals. An example is the spirochaetes, which under the microscope resemble wiggly hairs and may easily be mistaken for trypanosomes, some protists and other bacteria of a similar shape. Obtaining a direct and quick result may also be affected by human mistakes resulting from tiredness, inaccuracy, lack of time or lack of a special interest in this field of medicine. This is why in our paper we propose a quick and simple method of evaluating the presence of spirochaete bacteria that may speed up the diagnosis, give doctors a valuable clue, and save both money and the health of the patient. This can be especially valuable for busy first-contact clinics as a primary method of evaluating the presence of spirochaete microorganisms, whose presence can be suspected after a basic physical examination, before more detailed and expensive tests are performed.

2. Proposed deep learning solution

Microbe detection can be approached using various techniques; however, some of the most accurate methods involve visual classification. Normally this process is performed by a human specialist and involves manually checking hundreds of objects within previously selected frames per patient. The process is extremely slow and requires the full focus of the doctor for the whole time in order to avoid misclassification and oversight.
Because such an examination is highly repetitive and offers few additional stimuli, it is very difficult for any human being to maintain peak detection performance during the whole process, especially considering external factors like tiredness, lack of time, very small visual differences between microbes or lack of special interest in the field.
The above-mentioned problems lead to a high average detection error rate.

More common deep learning techniques rely on rectangle masking, such as in [6] and [7]; however, for this task the output needs to be more precise. As a partial solution to this issue, a custom semantic segmentation neural network architecture has been created in this research. It provides the doctor with additional data, in the form of a mask with an initial classification of elements, next to the original image for easy and fast verification. Such an approach can considerably reduce the error rate by providing an additional diagnosis and pointing out suspicious elements, and can speed up the entire diagnosis process by reducing the time needed to analyse the image.

Additionally, by choosing a segmentation architecture over the classical rectangle masking one, the classification is made per pixel, which improves clarity and accuracy on images with a higher number of overlapping objects.

The final architecture is based on the U-Net shape, and the final parameters were selected empirically. The final shape is presented in Fig. 3. The input layer has a shape of 256x256 and is reduced by max-pooling by a factor of 2 after each block. Within the convolutional blocks, the signal passes through several branches with different layer counts and different combinations of batch normalization layers. The final block output is combined using concatenate and add layers for better signal fidelity. A sample scheme is presented in Fig. 4. The training has been optimized using the NAdam algorithm with a learning rate of 0.0078, and the selected loss function was Categorical Cross-entropy with custom class weights computed before training to balance the classes.

Figure 3: Deep learning model scheme

Figure 4: Sample block model scheme

2.1. NAdam algorithm

To improve the model's final accuracy and the training time, the NAdam training algorithm has been used. It can be described as follows:

$$z_s = \gamma_1 z_{s-1} + (1 - \gamma_1) g_s, \tag{1}$$

$$k_s = \gamma_2 k_{s-1} + (1 - \gamma_2) g_s^2, \tag{2}$$

where the $\gamma$ parameters are constant hyper-parameters and $g_s$ is the current gradient value of the error function. The values $z_s$ and $k_s$ are then used to compute the corrected values $\hat{z}_s$ and $\hat{k}_s$ according to the equations below:

$$\hat{z}_s = (1 - \gamma_1) g_s + \gamma_1^{s+1} z_s, \tag{3}$$

$$\hat{k}_s = \frac{k_s}{1 - \gamma_2^{s}}. \tag{4}$$

Finally, using the previously calculated variables, the weight update can be defined as:

$$w_s = w_{s-1} - LR \, \frac{\hat{z}_s}{\sqrt{\hat{k}_s} + \epsilon}, \tag{5}$$

where $\epsilon$ is a small constant value and $LR$ is the learning rate.

Algorithm 1 NAdam training process
Generate random weights and set the step counter $s = 1$
while global error value $\varepsilon$ > error_value do
    Shuffle the training dataset
    for each batch inside the training dataset do
        Compute the gradient vector $g_s$ on the batch
        Update vector $z_s$, Eq. (1)
        Update vector $k_s$, Eq. (2)
        Rescale vector $\hat{z}_s$, Eq. (3)
        Rescale vector $\hat{k}_s$, Eq. (4)
        Update the weights $w_s$, Eq. (5)
        $s = s + 1$
    end for
    Calculate the global error $\varepsilon$
end while

3. Training dataset

Several factors had to be taken into consideration while searching for the dataset:

• The data have to contain microscopy imagery of both microbes and healthy cells,
• The dataset has to be free for academic use,
• The images need to have appropriate masks.

During the research phase, the most suitable dataset, containing masked images of spirochaete microorganisms mixed with red blood cells, was the “Bacteria detection with darkfield microscopy” dataset, gathered and annotated as part of a bachelor thesis at Heilbronn University, Germany. The dataset contains 366 darkfield microscopy images with manually created masks labelling 3 abstract classes: background, spirochaete and erythrocytes.

3.1. Data augmentation

Although the data are of high quality and each image contains many microbes and red blood cells, the number of training examples is too small to train a highly accurate model without data augmentation. After several trials, including a variety of simple image transforms as well as state-of-the-art methods based on Generative Adversarial Networks (GAN), such as Principal Component Resampling presented in [8], the best combination in this case consisted of horizontal and vertical flips, random image rotation and random zooming.

4. Model's performance

4.1. Used hardware

During this research all computations were made on a PC with the specification below:

• CPU: Ryzen Threadripper 2950X 16c/32t,
• RAM: 128GB,
• GPU: NVidia RTX 3090 24GB.

4.2. Performance

One of the main goals of this research was not only to create an accurate model but also to reduce its memory and power requirements to the bare minimum. Such an approach is crucial, as it helps to spread the use of similar models in real-world applications. Although computer hardware is becoming more powerful each year, very few people, especially in smaller clinics, are able to upgrade it fast enough, and even when they do, the costs need to be small, so most of the time there is only a mediocre CPU with a small amount of RAM and an integrated GPU. Very often these computers are also laptops.

With that in mind, some compromises have been made, mainly on the training length side. After the final reduction the model consists of 7,921,534 parameters and weighs around 94 MB. The evaluation times are below 0.1 second per image on the GPU and around 0.87 second per image on the CPU.
The training plots are presented in Fig. 1.

Figure 1: Training plots: (a) Accuracy Plot, (b) Loss Plot

5. Results visualization

The network originally outputs the data in the form of a two-dimensional matrix with a sparse representation of classes using integer values. Such data are well suited to being stored and analyzed by a computer; however, they are of no use to a non-technical user and require further processing to create an informative image. That is why, in order to make the output readable, the matrix has been expanded by 3 additional color channels, and the integer values from the [0, 2] range have been mapped to the red, green and blue channels. Based on basic human psychology, blue has been chosen for the background, green for harmless blood cells and red for dangerous microbes. The mask prepared in this way is presented next to the original image for fast and easy validation by the user.

Other methods of visualization have been considered, such as merging the original image and the mask into one image; however, the level of clarity was highly reduced and the validation became much more difficult, as some of the original data were obscured.

Sample final results produced by the network can be seen in Fig. 2.
These examples show that, although in some cases the network struggles to find the exact shape of the microbe or the red blood cell, it still performs well in most cases, including the more extreme ones where the image quality is low, there is a high number of overlapping elements, the contrast is very low or the microbe is small relative to the whole image.

Figure 2: Results visualization

6. Conclusion

This paper presents a novel solution for fast spirochaete detection using a custom Semantic Segmentation Neural Network. The output is presented in a clear and easy to understand way, which also allows for quick validation if needed. Although the training has been performed on a powerful GPU, the evaluation can also be done on the CPU of a budget-level computer, making it accessible to almost everyone. The presented solution is able to shorten the detection time by providing additional data alongside the original image, helping the human performing the evaluation to spot potential bacteria. This could not only allow for testing more animals in the same amount of time but also drastically reduce the costs of such an operation.

7. Future possibilities

In the future there are many paths of improvement, both in terms of functionality and accuracy. One of them is expanding the current dataset with new images captured under more diverse conditions, such as noisier backgrounds, different bacteria shapes, lower contrast between elements, etc. This approach could lead to higher accuracy on validation data, as the network would learn a wider context, so the feature extraction should work correctly in more cases than with the current model. Another way of improving the network would be not only to add more images to the training set but also to expand the number of abstract classes by providing examples of other microbes. This would give the model a better understanding of the world and could also reduce the misclassification rate. To better fit the potential new dataset, some changes in the Deep Learning architecture could be necessary to further improve the accuracy.

References

[1] E. Jawetz, J. Melnick, E. Adelberg, G. Brooks, J. Butel, L. Ornston, Spirochetes and other spiral microorganisms, Medical microbiology, 18th ed. Appleton and Lange, Norwalk, Conn (1989) 267–271.
[2] L. Margulis, J. B. Ashen, M. Sole, R. Guerrero, Composite, large spirochetes from microbial mats: spirochete structure review, Proceedings of the National Academy of Sciences 90 (1993) 6966–6970.
[3] K. H. Hougen, A. Birch-Andersen, Electron microscopy of endoflagella and microtubules in Treponema reiter, Acta Pathologica Microbiologica Scandinavica Section B Microbiology and Immunology 79 (1971) 37–50.
[4] M. Madigan, J. Martinko, P. Dunlap, D. Clark, Brock biology of microorganisms 12th edn. microbiol. 2008; 11: 65-73, 2019.
[5] D. C. Alexander, P. N. Levett, C. Y. Turenne, Molecular taxonomy, in: Molecular Medical Microbiology, Elsevier, 2015, pp. 369–379.
[6] M. Wieczorek, J. Siłka, M. Woźniak, S. Garg, M. Hassan, Lightweight CNN model for human face detection in risk situations, IEEE Transactions on Industrial Informatics (2021).
[7] M. Woźniak, J. Siłka, M. Wieczorek, Deep learning based crowd counting model for drone assisted systems, in: Proceedings of the 4th ACM MobiCom Workshop on Drone Assisted Wireless Communications for 5G and Beyond, 2021, pp. 31–36.
[8] O. O. Abayomi-Alli, R. Damaševičius, M. Wieczorek, M. Woźniak, Data augmentation using principal component resampling for image recognition by deep learning, in: International Conference on Artificial Intelligence and Soft Computing, Springer, 2020, pp. 39–48.
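
As a supplementary illustration of Section 2.1, the update described by Eqs. (1)-(5) can be sketched in plain Python. This is only a minimal sketch and not the training code used in the paper: the function names `nadam_step` and `minimise_quadratic` are hypothetical, the values of gamma1, gamma2 and eps are assumed common defaults (only the learning rate 0.0078 comes from the text), and the denominator of Eq. (5) is assumed to be the square root of the corrected value from Eq. (4) plus epsilon.

```python
import math


def nadam_step(w, g, z, k, s, lr=0.0078, gamma1=0.9, gamma2=0.999, eps=1e-8):
    """Apply Eqs. (1)-(5) once to flat lists of weights; s counts from 1.

    gamma1, gamma2 and eps are assumed defaults, not values from the paper.
    """
    new_w, new_z, new_k = [], [], []
    for w_i, g_i, z_i, k_i in zip(w, g, z, k):
        z_i = gamma1 * z_i + (1.0 - gamma1) * g_i                # Eq. (1)
        k_i = gamma2 * k_i + (1.0 - gamma2) * g_i * g_i          # Eq. (2)
        z_hat = (1.0 - gamma1) * g_i + gamma1 ** (s + 1) * z_i   # Eq. (3)
        k_hat = k_i / (1.0 - gamma2 ** s)                        # Eq. (4)
        # Eq. (5), assuming the denominator sqrt(k_hat) + eps
        new_w.append(w_i - lr * z_hat / (math.sqrt(k_hat) + eps))
        new_z.append(z_i)
        new_k.append(k_i)
    return new_w, new_z, new_k


def minimise_quadratic(steps=100, lr=0.05):
    """Toy run on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)."""
    w, z, k = [0.0], [0.0], [0.0]
    for s in range(1, steps + 1):
        g = [2.0 * (w[0] - 3.0)]
        w, z, k = nadam_step(w, g, z, k, s, lr=lr)
    return w[0]
```

Calling `minimise_quadratic()` moves the single weight from 0 towards the minimum at 3; in a real training loop the gradient would come from the network's loss on each batch, as in Algorithm 1.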
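
Section 2 mentions Categorical Cross-entropy with class weights computed before training, but does not state how the weights were obtained. One possible scheme, shown here purely as an assumption, is inverse-frequency weighting over the pixels of the training masks, so that the rare spirochaete class contributes more to the loss than the dominant background. The helper names `class_weights` and `weighted_cce` are hypothetical.

```python
import math
from collections import Counter


def class_weights(masks, num_classes=3):
    """Inverse-frequency weights: total_pixels / (num_classes * class_pixels).

    This weighting rule is an assumed heuristic, not the paper's formula.
    """
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(row)
    total = sum(counts.values())
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cce(probs, true_class, weights, eps=1e-7):
    """Weighted categorical cross-entropy for a single pixel."""
    p = min(max(probs[true_class], eps), 1.0)  # clip to avoid log(0)
    return -weights[true_class] * math.log(p)
```

With a toy mask in which background (0) dominates, the rarest class receives the largest weight, so errors on it are penalised more strongly.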
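
The colour mapping described in Section 5 can be sketched as a simple lookup from class index to RGB triple. The assignment of integers to classes below (0 for background, 1 for erythrocytes, 2 for spirochaetes) is an assumption, since the paper only states that values from the [0, 2] range are used; `PALETTE`, `colorize` and `side_by_side` are hypothetical names.

```python
# Assumed class indices: 0 = background, 1 = erythrocyte, 2 = spirochaete.
PALETTE = {
    0: (0, 0, 255),   # background -> blue
    1: (0, 255, 0),   # harmless blood cells -> green
    2: (255, 0, 0),   # dangerous microbes -> red
}


def colorize(mask):
    """Expand a 2D integer class mask into rows of RGB triples."""
    return [[PALETTE[c] for c in row] for row in mask]


def side_by_side(image, colored_mask):
    """Place the coloured mask to the right of the original RGB image."""
    if len(image) != len(colored_mask):
        raise ValueError("image and mask must have the same height")
    return [img_row + mask_row for img_row, mask_row in zip(image, colored_mask)]
```

Keeping the original image untouched and showing the mask beside it matches the choice made in Section 5, where merging both into one image was found to reduce clarity.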
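
For segmentation, the geometric augmentations listed in Section 3.1 have to be applied identically to an image and its mask, otherwise the labels no longer line up with the pixels. The sketch below illustrates this for flips and rotations only; it is an assumption-laden simplification in which rotation is restricted to multiples of 90 degrees and random zooming is omitted, and the function names are hypothetical.

```python
import random


def hflip(grid):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Reverse the row order (vertical flip)."""
    return grid[::-1]


def rot90(grid):
    """Rotate the grid by 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Draw one random transform chain and apply it to both image and mask."""
    ops = []
    if rng.random() < 0.5:
        ops.append(hflip)
    if rng.random() < 0.5:
        ops.append(vflip)
    ops.extend([rot90] * rng.randrange(4))  # 0 to 3 quarter turns
    for op in ops:
        image, mask = op(image), op(mask)
    return image, mask
```

A call such as `augment_pair(image, mask, random.Random(0))` returns a transformed pair in which every mask pixel still labels the same image pixel.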