Modern Automated Microscopy Systems in Oncology

Oleh Berezsky1[0000-0001-9931-4154], Oleh Pitsun1[0000-0003-0280-8786], Natalia Batryn1[0000-0002-7152-4130], Tamara Datsko1[0000-0001-9283-2629], Kateryna Berezska1[0000-0002-9632-4004], Lesia Dubchak1[0000-0003-3743-2432]

1 Ternopil National Economic University, 46001, Ukraine
ob@tneu.edu.ua, o.pitsun@tneu.edu.ua, nbatryn@gmail.com, datskotv@gmail.com, km.berezska@gmail.com, dlo@tneu.edu.ua

Abstract. The authors investigate modern automated microscopy systems used in oncology and analyze recent image processing methods at different levels of computer vision. A class of biomedical images, cytological images of breast precancerous conditions, is described. The literature review has shown that biomedical image processing remains a topical problem. An adaptive method of biomedical image pre-processing and an automated segmentation method have been developed. A method of automated selection of segmentation algorithms based on the Gromov-Fréchet metric, the Gromov-Hausdorff metric, and a knowledge base has been developed. A convolutional neural network method for histological and cytological image classification, based on a combination of convolutional and sub-sampling layers and their input parameters, has been developed. Computer experiments have been carried out at each level of computer vision in automated microscopy systems. The generalized structure and components of an intelligent automated microscopy system for histological and cytological image processing have been developed.

Keywords: automated microscopy system, biomedical image, breast precancerous condition, histological image, image pre-processing, segmentation, convolutional neural networks, intelligent system, telemedicine.

1 Introduction

The problem of improving diagnosis and choosing the best treatment tactics for breast tumors is still one of the most relevant in modern surgical practice [1].
There is a close correlation between the overall prevalence of tumors in the mammary gland and mortality from breast cancer (BC) [2]. Breast cancer is one of the most widespread tumor processes in women and takes first place in the structure of morbidity and mortality from malignant neoplasms (25% of all cases of cancer) in Ukraine, as well as in most countries of the world.

Various benign tumors of the mammary gland usually do not pose a threat of malignancy. However, specialists often face significant difficulties in the differential diagnosis of benign and malignant tumors of the mammary gland, both at the stage of the patient's first contact with health care facilities and while qualified and specialized medical care is being provided. For many years, the generally accepted view has been that the only real way to reduce morbidity and mortality from breast cancer is to improve the quality of early diagnosis [3]. Cytological and histological examination, along with clinical examination, ultrasound diagnosis, and mammography, are widely used in the diagnosis of breast disease.

Histological study is a method of detailed examination of human tissue in order to detect a pathological process in it. Cytological study makes it possible to identify pathological changes in cells at the early stages of development, since its main objects are small cellular structures, such as the nucleus, cytoplasm, and mitochondria, as well as the nuclear-cytoplasmic ratio, which is a very important indicator. A cytological study produces cytological images, and a histological study produces histological images.

Different methods of differential diagnosis of benign neoplasms and breast cancer are used today. Morphometric methods based on the number of parenchymal cells, fibroblasts, and fibrocytes localized periarteriolarly are the most popular [4].
However, in spite of the diversity of methodological approaches to the diagnosis and prediction of breast cancer development, cancer often develops further after surgical intervention and treatment for verified benign processes. Thus, the search for new effective schemes for preventing adverse clinical manifestations in patients with benign breast pathology remains relevant.

2 Literature review

Histological and cytological images usually have low quality [5]. Considerable noise, blurriness, and dark and light areas are the main problems in image processing. Image filtering is used to reduce Gaussian and impulse noise. In [6], different image pre-processing techniques are compared according to their ability to reduce noise and segment the image. The defining feature of frequency filters is the suppression of an a priori specified frequency band. There are high-frequency filters (HFF) and low-frequency filters (LFF) [7]. After low-frequency filtering, the resulting image has fewer sharp details because high frequencies are suppressed.

In cytological research, the analysis of the shape of the investigated micro-object is of great importance [8]. In particular, a change in the shape of a micro-object can signal its transition from one state to another (dystrophy, necrobiosis, necrosis), which, in turn, shows the course of the disease. In nature, the shape of a micro-object is determined by the cell walls and the state of the internal structural elements of the cell. In a cytological image, the shape is determined by the micro-object edges.

The main algorithms for micro-object edge detection are the snake algorithm, the Canny algorithm, filtering algorithms based on the Sobel, Laplacian, and Prewitt operators, etc. [9]. Their defining feature is that they highlight the sharp brightness differences found in the border region. As a result, a set of unconnected areas appears.
To obtain continuous contours, additional processing is required.

The following algorithms for contour detection are known: threshold segmentation, clustering, watershed algorithms, block segmentation, etc. These algorithms group pixels into homogeneous regions on the basis of a certain homogeneity criterion, so their result is a set of homogeneous areas. To obtain a description of the object contour, it is then necessary to apply backward contour tracing [10].

Today, artificial neural networks are used as image classifiers. In particular, convolutional neural networks (CNNs) are widely known. A CNN can detect common image features, form more complicated features from them, and recognize them. The principle of a CNN is to alternate convolutional and sub-sampling (for example, max-pooling) layers ahead of the output. The authors of [11] investigated the use of CNNs for large-scale image recognition. The focus was on the development of CNNs with a small filter window size (3x3). The analysis of modern image processing systems, with their advantages and disadvantages for image classification at the early stage of breast cancer detection, was presented in [12]. The emphasis was on the use of CNNs and deep learning technology. The article [13] presented the structure of CNN models used for researching and detecting breast cancer. The principles of applying CNNs for medical purposes were described in [14, 15].

3 Modern automated microscopy system analysis

Automated microscopy systems (AMSs) are widely used in biomedical image analysis. The automated microscopy system BioImageXD [16] covers both object-oriented and voxel-based approaches. The software can be used both for simple visualization of multichannel temporal stacks and for complex 3D processing. The author of [17] described a method of computer processing of multispectral images of preparations used in medical biology research.
The authors of [18] used the MetaMorph 7.1 hardware-software system in breast cancer research. The author of [19] analyzed the OncoDoc system as a way to improve the quality of research for breast cancer diagnosis. OncoDoc has a database (DB), which distinguishes it from other systems. In [20], the author examined and analyzed an interactive software environment for visualization, correction, and analysis of 3D morphology.

In [21], the authors presented their analysis of studies in the field of immunohistochemistry of breast cancer using the ImageJ software, based on the RGB model of histological images. In [22], the structure, advantages, and disadvantages of the automated information system MECOS-CH were shown. The most popular AMSs include MECOS-C2, TissueFAXS, AnalySISFive, BioVision, VideoTesTMorpho 5.2, BioImageXD, Ariol, ImageJ, Motic Images Advanced 3.2, DiaMorph, Motic Video TesT Morfo 5.2, and Cell D. The main AMS evaluation criteria are the operating modes of the image segmentation algorithms, the calculation of numerical characteristics of micro-objects, and the ability to work with external programs. Another important criterion is the ability to output information, such as a report, in the form of charts and histograms.

4 Problem statement

The literature review has shown that biomedical image processing remains a topical problem. Therefore, the purpose of this work is to develop new methods and tools for image processing at different levels of computer vision.

5 Biomedical image classes

Biomedical images are images obtained by using visual imaging devices in medicine and biology [23]. By visualizing various processes occurring in living organisms, it is possible to study the functioning mechanisms of cells, tissues, and organs in humans and animals.
During the studies, the following precancerous pathologies of the mammary gland were investigated: nonproliferative mastopathy, cystic nonproliferative mastopathy, fibrotic nonproliferative mastopathy, proliferative mastopathy, and fibroadenoma.

As a result of the research, the main morphological (cytological) signs of nonproliferative mastopathy were identified. They include the following features:
1. Flattened apocrine epithelium.
2. Papillary structures formation.
3. Presence of secretory activity in cells.
4. Rounded hyperchromatic nuclei, located centrally.
5. A small number of hyperchromic monomorphic cells.
6. Cells are situated in layers.
7. Many phagocytes and histiocytes are in the background.
8. Presence of secretion in the space around the cells.

As a result of the analysis of morphological signs, the following rules for diagnosis were formulated:

a) cystic nonproliferative mastopathy:
IF cells are located in layers AND there are cubic and prismatic elements, papillary and rounded complexes AND the background contains many phagocytes and histiocytes AND the cells are rounded with centrally located nuclei AND there are cells with apocrine secretion that have two zones (basal and apical), THEN cystic nonproliferative mastopathy (80%);

b) fibrotic nonproliferative mastopathy:
IF there is a small number of hyperchromic monomorphic cells AND a narrow rim of intensely colored cytoplasm AND rounded hyperchromic nuclei, THEN fibrotic nonproliferative mastopathy (70%).

Based on experimental studies, the main morphological (cytological) signs of proliferative mastopathy were selected. They include the following features:
1. Formation of cellular complexes (acini).
2. Formation of papillary complexes with dense placement of cells in multiple layers.
3. Large cell sizes.
4. Large sizes of nuclei with intensely expressed chromatin.

The next task was to isolate these qualitative diagnostic features in cytological preparations.
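The two IF-AND rules for nonproliferative mastopathy given above can be sketched as a small rule table. This is only an illustration under stated assumptions: the feature labels are hypothetical identifiers introduced here, the split of rule (a) into separate conditions is one reading of the text, and a rule is assumed to fire only when all of its conditions are observed. The numbers 0.80 and 0.70 are the percentages quoted with the rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    diagnosis: str
    required: frozenset   # hypothetical feature labels, all must be present
    confidence: float     # percentage quoted in the text, as a fraction


RULES = [
    Rule(
        "cystic nonproliferative mastopathy",
        frozenset({
            "cells_in_layers",
            "cubic_prismatic_papillary_rounded_complexes",
            "many_phagocytes_histiocytes_in_background",
            "rounded_cells_central_nuclei",
            "apocrine_secretion_two_zones",
        }),
        0.80,
    ),
    Rule(
        "fibrotic nonproliferative mastopathy",
        frozenset({
            "few_hyperchromic_monomorphic_cells",
            "narrow_rim_intense_cytoplasm",
            "rounded_hyperchromic_nuclei",
        }),
        0.70,
    ),
]


def diagnose(observed_features):
    """Return (diagnosis, confidence) for every rule whose conditions all hold."""
    observed = set(observed_features)
    return [(r.diagnosis, r.confidence) for r in RULES if r.required <= observed]


if __name__ == "__main__":
    findings = {
        "few_hyperchromic_monomorphic_cells",
        "narrow_rim_intense_cytoplasm",
        "rounded_hyperchromic_nuclei",
    }
    print(diagnose(findings))
```

Keeping each rule as a set of required features makes it easy to add further rules of the same IF-AND form without changing the matching function.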
As a result of the analysis of morphological features, a rule for the diagnosis of proliferative mastopathy has been formulated:
IF there are cellular complexes (acini) AND papillary complexes with dense placement of cells in multiple layers AND large cell sizes AND large nuclei with intensely expressed chromatin, THEN epithelial proliferative mastopathy (95%).

Based on experimental studies, the main morphological (cytological) signs of fibroadenoma were selected. They include the following features:
1. Papillary structures formation.
2. Flattened apocrine epithelium.
3. Cells are increased in size.
4. Intensely expressed nuclei.
5. Narrow rim of intensely colored cytoplasm.
6. Rounded hyperchromatic nuclei.
7. Fibroblasts.

As a result of the analysis of morphological features, a rule for the diagnosis of fibroadenoma has been formulated:
IF there are papillary structures and flattened apocrine epithelium AND intensive expression of the nucleus AND a narrow rim of intensely colored cytoplasm AND rounded hyperchromic nuclei, THEN fibroadenoma (80%).

6 Biomedical image pre-processing method

Let Im be an input image. We represent this image in matrix form (1):

$$Im = \begin{bmatrix} \alpha_{0,0} & \cdots & \alpha_{0,N-1} \\ \vdots & \ddots & \vdots \\ \alpha_{M-1,0} & \cdots & \alpha_{M-1,N-1} \end{bmatrix} \tag{1}$$

where $\alpha_{i,j}$ is an element of the image.

The method of adaptive image processing consists of the following steps [24]:

1. Evaluation of image noisiness. The median filter is represented in the form of the transformation:

$$Im^{I} = M(Im) \tag{2}$$

The expression for two-dimensional median filtration can be represented as follows:

$$Im^{I}_{i,j} = \operatorname{med}\left[\, Im_{i+s,\,j+t} : (s,t) \in W \,\right]; \quad (i,j) \in Z^{2} \tag{3}$$

where $Im^{I}_{i,j}$ is an image matrix element after the filtration; $W$ is the filter aperture (window) of size m × n, with elements indexed by $(s,t)$; $Im_{i,j}$ is an input image matrix element. A filter of size 5x5 is selected for the filtration.

2. The next step is to quantify the image noisiness.
To calculate the Peak Signal-to-Noise Ratio (PSNR) [25], it is necessary to find the mean square error (MSE) between two images:

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ Im^{I}(i,j) - Im(i,j) \right]^{2},$$

where $Im^{I}$ and $Im$ refer to the filtered and original images, respectively, both of size m × n. The PSNR value is defined as:

$$PSNR = 10 \log_{10}\left( \frac{MAX_{I}^{2}}{MSE} \right),$$

where $MAX_{I}$ is the maximum value that an image pixel can take.

3. Filtering parameters adjustment. As a result of experimental studies with cytological and histological images, the following filtering parameters were selected:

$$\begin{cases} mw = 5 \times 5,\ gw = 3 \times 3; & PSNR \le 24\ \text{dB} \\ mw = 3 \times 3,\ gw = 3 \times 3; & PSNR > 24\ \text{dB} \end{cases}$$

where mw is the size of the median filter window, and gw is the size of the Gaussian filter window.

4. Image filtration. To reduce the level of additive noise, we apply the Gaussian filter. The transformation is represented as follows:

$$Im^{II} = gw \times Im^{I}$$

The expression for the Gaussian filter convolution operation for the pixel with the coordinates x, y is the following:

$$G(x,y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}},$$

where σ is the standard deviation of the Gaussian, which sets the effective radius of the convolution window. Median filtering is represented as follows:

$$Im^{III} = mw * Im^{II},$$

where $Im^{II}$ is an input image, mw is a filter window, and $Im^{III}$ is the image after filtering.

5. Histogram equalization. Histogram equalization is represented by the transformation H:

$$Im^{IV} = H(Im^{III}),$$

where $Im^{IV}$ is an image with a new value of the histogram, and $Im^{III}$ is an input image.

6. Image brightness adjustment. On the basis of the defined parameter α, we carry out the following image transformation:

$$Im^{V} = \alpha * Im^{IV}$$

To quantify the similarity between the image processed by known algorithms and the image processed by the expert, the structural similarity (SSIM) criterion was used.
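Steps 1 to 3 of the pre-processing method above (median estimate, MSE, PSNR, and the 24 dB rule for the window sizes) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes 8-bit grayscale images stored as lists of rows (so the maximum pixel value is 255), and the border handling by edge replication and the helper names `median_filter`, `mse`, `psnr`, and `select_windows` are assumptions introduced here.

```python
import math
from statistics import median


def median_filter(img, k):
    """k x k median filter; borders are handled by edge replication (assumption)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                img[min(max(i + s, 0), h - 1)][min(max(j + t, 0), w - 1)]
                for s in range(-r, r + 1)
                for t in range(-r, r + 1)
            ]
            out[i][j] = median(window)
    return out


def mse(a, b):
    """Mean square error between two images of equal size."""
    h, w = len(a), len(a[0])
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(h) for j in range(w)) / (h * w)


def psnr(filtered, original, max_i=255):
    """PSNR in dB; infinite when the two images are identical."""
    err = mse(filtered, original)
    return math.inf if err == 0 else 10 * math.log10(max_i ** 2 / err)


def select_windows(psnr_db):
    """Return (mw, gw) side lengths according to the 24 dB rule in the text."""
    return (5, 3) if psnr_db <= 24 else (3, 3)


if __name__ == "__main__":
    # Flat 5 x 5 test image with a single impulse-noise pixel.
    noisy = [[100] * 5 for _ in range(5)]
    noisy[2][2] = 255
    estimate = median_filter(noisy, 5)          # step 1: 5 x 5 median estimate
    mw, gw = select_windows(psnr(estimate, noisy))  # steps 2 and 3
    print(mw, gw)
```

A production pipeline would more likely rely on an image processing library for the filters; the pure-Python loops are used here only to keep the sketch self-contained.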
The SSIM of two images A and B of the same size N × N is calculated by the formula:

$$SSIM(A,B) = \frac{(2\mu_{A}\mu_{B} + c_{1})(2\sigma_{AB} + c_{2})}{(\mu_{A}^{2} + \mu_{B}^{2} + c_{1})(\sigma_{A}^{2} + \sigma_{B}^{2} + c_{2})},$$

where $\mu_{A}$, $\mu_{B}$ are the average values of A and B; $\sigma_{A}^{2}$, $\sigma_{B}^{2}$ are their variances; $\sigma_{AB}$ is the covariance of A and B; $c_{1} = (k_{1}L)^{2}$; $c_{2} = (k_{2}L)^{2}$; L is the dynamic range of pixel values; $k_{1} = 0.01$ and $k_{2} = 0.03$ are constants.

The comparative characteristics of the algorithms for automated image quality improvement by the SSIM criterion are shown in Table 1.

Table 1. Comparative characteristics of the algorithms for automated image quality improvement by the SSIM criterion

Image No.   Segmentation without pre-processing   HE      CLAHE   MSR     Developed algorithm
1           0.576                                 0.652   0.678   0.709   0.778
2           0.601                                 0.676   0.679   0.658   0.704
3           0.709                                 0.78    0.82    0.859   0.976
4           0.421                                 0.454   0.5     0.523   0.523
5           0.586                                 0.602   0.607   0.7     0.631

7 The method of automatic selection of metric-based segmentation algorithms

The process of image segmentation is time-consuming, and it is not always possible to perform it in automatic mode. In computer experiments, modern segmentation algorithms and their combinations were tested, and the limits of the algorithm parameters were selected. The method of choosing an algorithm and its segmentation parameters is the following [26]:

1. Determining the input parameters of the image (brightness level, average values of the red, green, and blue channels).
2. Image segmentation. At this stage, the following methods are used: threshold segmentation, the watershed method, and the k-means method. For threshold segmentation, lower-threshold values from 35 to 175 with a step of 3 are used.
3. Segmentation evaluation. To evaluate the similarity between images, the Gromov-Hausdorff metric, the Gromov-Fréchet metric [27, 28], and the FRAG parameter are used.

The graphical representation of the sequence of testing stages of histological and cytological image segmentation algorithms is shown in Figure 1.

Fig. 1.
Testing Stages of Image Segmentation Algorithms

Table 2. Segmentation results (columns: Input image; Expert processing; Automatic segmentation (ImageJ); Developed module)

The results of the automatic selection of histological and cytological image segmentation parameters are presented in Table 2.

8 Image classification method

Convolutional neural networks (CNNs) were used for the classification of histological and cytological images [29]. The structure of the module for image classification is shown in Figure 2. The comparative analysis of cytological and histological image classification methods is shown in Figure 3.

Fig. 2. The structure of the module for image classification using CNN

Fig. 3. The comparative analysis of cytological and histological image classification methods

9 Intelligent AMS with elements of telemedicine

Based on the developed image processing algorithms, "HIAMS" has been designed for the diagnosis of breast cancer and precancerous conditions based on the analysis of cytological and histological images. The system modules allow processing information about patients and processing images at different levels of computer vision. The key difference between the HIAMS intelligent system [30] and existing analogues is the availability of an adaptive graphical interface for different types of users and the allocation of system access rights.

HIAMS structure. The generalized structure of the developed AMS is presented in Fig. 4. The main groups of system users are the treating physician, diagnostic doctor, expert, assistant, and administrator. They communicate using a remote database and a remote FTP server.

Currently, in medicine, scientists devote considerable attention to the design of databases for information systems that facilitate the work of physicians. The structure of such relational databases mostly makes it easy to generate reports and statistical data on patients and their diagnoses.
Most of the existing automated microscopy systems for image analysis either do not have databases or have databases with limited functionality.

Fig. 4. Generalized structure of «HIAMS»

A treating physician is responsible for the registration of patients. Every doctor has access only to his or her own patients. The results of image processing (pre-processing, segmentation, etc.) are stored in the database. Each patient and each test has a unique identifier. The quantitative characteristics of microscopic objects in the image make it possible to classify them using known methods and algorithms, for example, the support vector machine.

A diagnostic doctor makes a preliminary diagnosis based on image analysis. The doctor can form a set of quantitative features of microscopic objects and describe the qualitative characteristics of the image.

An expert can review the results of the research and make a diagnosis based on his or her own knowledge.

Conclusions

1. Analysis of the existing methods, algorithms, and recent tools for image processing at different levels of computer vision has been conducted.
2. Informative features of breast precancerous conditions have been described.
3. The method of automated selection of segmentation algorithms based on the Gromov-Fréchet metric, the Gromov-Hausdorff metric, and a knowledge base has been developed, which makes it possible to select segmentation algorithms and their parameters automatically.
4. The method of adaptive image processing based on filtering algorithms and histogram equalization rules has been developed; it improves the quality of histological and cytological images by 16%.
5. A neural network method for histological and cytological image classification based on a combination of convolutional and sub-sampling layers and their input parameters has been developed, which increased the classification accuracy in comparison with the existing classifiers (SVM, k-means) by an average of 20%.
6.
The generalized structure and components of the intelligent automated microscopy system for histological and cytological image processing have been developed. Unlike the existing analogues, it is a multi-user software system with an adaptive graphical interface, a remote database, and a knowledge base of algorithms for image pre-processing, segmentation, and classification.

Areas for further research

At present, the most precise method in the world practice of diagnosing pathological processes in oncology is the immunohistochemical method. Therefore, the first and most urgent area for further research is the formation of a database of immunohistochemical images of breast dysplastic and cancerous conditions.

The second direction is the expansion of the database of histological and cytological images with new types of breast dysplastic and cancerous conditions and the search for new informative features for diagnosis. This direction requires the involvement of expert cytologists and histologists.

The third direction is the intelligent analysis of the quantitative and qualitative characteristics stored in the database in order to identify regularities. This will make it possible to obtain new diagnostic rules, test them, and discover new features.

References

1. Torre, L. A., et al.: Global cancer statistics. Cancer Journal for Clinicians – 2012.
2. Zajdela, A.: The value of aspiration cytology in the diagnosis of breast cancer: Experience at the Fondation Curie. Cancer Cytopathology (1994)
3. Liu, X.-F.: A Clinical Study on the Resection of Breast Fibroadenoma Using Two Types of Incision. Scandinavian Journal of Surgery, 100(3), 147–152 (2011)
4. Ohnstad, H. O., et al.: Prognostic value of PAM50 and risk of recurrence score in patients with early-stage breast cancer with long-term follow-up. American Journal of Clinical Pathology, Vol. 19, No. 1, 120 p. (2017)
5. Berezsky, O., Pitsun, O.: Automated processing of cytological and histological images.
In: 2016 XII International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH), 51–53 (2016)
6. Adatrao, S., et al.: An analysis of different image preprocessing techniques for determining the centroids of circular marks using hough transform. In: 2nd International Conference on Frontiers of Signal Processing (ICFSP), 15–17 Oct. 2016, pp. 110–115 (2016)
7. Gonzalez, R., Woods, R.: Digital Image Processing. Technosphere. 1104 p. (2012)
8. Baldock, R., Graham, J.: Image Processing and Analysis: A Practical Approach. Oxford University Press. 300 p. (2000)
9. Pratt, W. K.: Digital Image Processing: PIKS Scientific Inside. NY, USA: John Wiley & Sons, Inc., 782 p. (2007)
10. Kazlouski, A., Sadykhov, R.: Plain objects detection in image based on a contour tracing algorithm in a binary image. In: 2014 IEEE International Symposium on Innovations in Intelligent Systems and Applications (INISTA) Proceedings, 23–25 June 2014, pp. 242–248 (2014)
11. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. In: Proceedings of ICLR, May 7–9, 2015, 1–14 (2015)
12. Abdelhafiz, D., et al.: Survey on deep convolutional neural networks in mammography. In: 2017 IEEE 7th International Conference on Computational Advances in Bio and Medical Sciences, Orlando, FL, USA, 1–17 (2017)
13. Zhang, X., et al.: Whole mammogram image classification with convolutional neural networks. In: IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA, Nov. 13–16, 2017, 700–704 (2017)
14. Qing, Li, et al.: Medical image classification with convolutional neural network. In: 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV), Singapore, 10–12 Dec. 2014, 34–42 (2014)
15. Khoshdeli, M.: Feature-based Representation Improves Color Decomposition and Nuclear Detection Using Convolutional Neural Network.
IEEE Transactions on Biomedical Engineering (2017)
16. Kankaanpää, P., et al.: BioImageXD: an open, general-purpose and high-throughput image-processing platform. Nature Methods, Vol. 9(7), 155–171 (2012)
17. Malov, A. M.: Computer processing of biomedical multichannel images using visualization of measures of similarity with the standard. Izvestiya VUzov. Instrument making, No. 52 (8), 74–79 (2009)
18. Yokoyama, Y., et al.: Loss of histone H4K20 trimethylation predicts poor prognosis in breast cancer and is associated with invasive activity. Breast Cancer Research, Vol. 16 (3), 66 (2014)
19. Seroussi, B.: Using OncoDoc as a computer-based eligibility screening system to improve accrual onto breast cancer clinical trials. Artificial Intelligence in Medicine, Vol. 29, Issues 1–2, 153–167 (2002)
20. Dercksen, V.: The Filament Editor: An Interactive Software Environment for Visualization, Proof-Editing and Analysis of 3D Neuron Morphology. Neuroinformatics, Vol. 12(2), 325–339 (2014)
21. Vrekoussis, T., et al.: Image Analysis of Breast Cancer Immunohistochemistry-stained Sections Using ImageJ: An RGB-based Model. Anticancer Research, Vol. 29(12), 4995–4998 (2009)
22. Medovy, V. S.: Informational automated microscopy systems for the analysis of biomaterials. Doctor and information technology, No. 6, 32–37 (2004)
23. Berezsky, O. M., et al.: Methods, algorithms and software tools for the processing of biomedical images. Ternopil: Economic Thought, TNEU, 330 p. (2017)
24. Pitsun, O. Y.: Adaptive method for processing histological and cytological images. The Bulletin of the National University "Lviv Polytechnic". Computer Science and Information Technology, No. 864, 111–119 (2014)
25. Wang, Z.: Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, Vol. 13, No. 4, 600–612 (2004)
26. Berezsky, O. M.: Adaptive method of image segmentation based on metrics.
Scientific Bulletin of NLTU of Ukraine: a collection of scientific and technical works. Lviv: RVB NLTU of Ukraine, No. 28 (3), 110–123 (2018)
27. Berezsky, O.: Fréchet distance between weighted rooted trees. Matematychni Studii, Vol. 48, No. 2, 165–170 (2017)
28. Berezsky, O. M., et al.: Development of metrics and methods of quantitative estimation of biomedical images segmentation. East-European Journal of Advanced Technologies, Vol. 6, No. 4, 4–11 (2017)
29. Berezsky, O., et al.: Computer diagnostic tools based on biomedical image analysis. In: Proceedings of the 14th International Conference "The Experience of Designing and Application of CAD Systems in Microelectronics" (CADSM), Polyana–Svalyava, 388–391 (2017)
30. Certificate of registration of copyright for work №75360. Computer program "Intelligent system of diagnosis of precancerous conditions of the musculoskeletal system based on the analysis of histological and cytological images" "HIAMS" / O.M. Berezsky, O.Y. Pitsun, G.M Melnik, P.B. Lyashinsky, P.B., Lyashinsky. Date of registration 14.12.2017.