<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Parallel Optimization of Dimensionality Reduction Methods for Disease Prediction: PCA and LDA with Dask-ML</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Lesia</forename><surname>Mochurad</surname></persName>
							<email>lesia.i.mochurad@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 Bandera street</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Liliana</forename><surname>Mirchuk</surname></persName>
							<email>liliana.mirchuk.shi.2022@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 Bandera street</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Anastasiia</forename><surname>Veretilnyk</surname></persName>
							<email>anastasiia.veretilnyk.shi.2022@lpnu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>12 Bandera street</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<address>
									<postCode>2024</postCode>
									<settlement>Cambridge</settlement>
									<region>MA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Parallel Optimization of Dimensionality Reduction Methods for Disease Prediction: PCA and LDA with Dask-ML</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">C8246D9B8DA0D2981F7AA165CC902B94</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Medical data processing</term>
					<term>machine learning</term>
					<term>ResNet-50</term>
					<term>parallel computing</term>
					<term>cholangiocarcinoma diagnosis</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In modern medicine, improving cancer prognostication remains an urgent problem, in particular increasing accuracy and reducing the time needed to obtain a result. In this paper, we propose to reduce the dimensionality of data at the preprocessing stage using principal component analysis (PCA) and linear discriminant analysis (LDA), and we compare the performance and efficiency of the two methods. These methods can be time-consuming, which is critical when solving a forecasting problem. To overcome this, we parallelise PCA and LDA using Dask-ML. The ResNet-50 model was used to diagnose the disease. The proposed approach achieved an accuracy of 85.2%, which is about 3 percentage points higher than the results reported in previous studies. The results indicate that data preprocessing and dimensionality reduction can help avoid ill-posed problems and improve prediction accuracy. In addition, we were able to significantly reduce preprocessing time by parallelising the PCA method. In future research, we plan to further improve medical data processing methods, including exploring other approaches to dimensionality reduction and integrating the latest machine learning algorithms to improve prediction accuracy.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Timely diagnostics in medicine is extremely important as it allows detecting diseases at early stages, which increases the effectiveness of treatment, reduces its cost and prevents the development of complications. Early diagnostics also improves the quality of life of patients, reduces the burden on the healthcare system, and contributes to the prevention and control of infectious diseases <ref type="bibr" target="#b0">[1]</ref>. The development of technologies, such as artificial intelligence, increases the accuracy and speed of diagnostics, making it a key element of modern medicine <ref type="bibr" target="#b1">[2]</ref>, <ref type="bibr" target="#b2">[3]</ref>.</p><p>It has been reported <ref type="bibr" target="#b3">[4]</ref> that patients with cancer had an overall average time to diagnosis of 156.2 (164.9) days, and 15.4% of patients waited longer than 180 days before receiving a diagnosis. Computer diagnostics based on deep learning using images of pathological tissues are often used in cancer diagnosis. However, despite the availability of databases for cancer detection, we still do not have an accurate method for predicting the disease. There are significant difficulties in histological examinations, which are very important for the diagnosis and treatment of diseases. They consist in detecting cancer in tissue images, where scientists often face inverse problems that are incorrectly posed and require special attention to solve <ref type="bibr" target="#b4">[5]</ref>. In addition, huge amounts of medical data are generated every day, and their analysis is complicated by factors such as noise, missing data, and high dimensionality. For example, the diagnosis of malignant tumours requires the use of various sources of information <ref type="bibr" target="#b5">[6]</ref>. 
To improve the work with medical data and overcome the difficulties encountered in their analysis, preprocessing is used <ref type="bibr" target="#b6">[7]</ref>. Our analysis of the scientific sources has shown that parallelisation of preprocessing can provide better results compared to a sequential process <ref type="bibr" target="#b7">[8]</ref>.</p><p>The relevance of the research conducted in this paper is that there is a lack of efficiency in timely cancer diagnosis, which can lead to severe consequences for the health and life of patients, insufficient accuracy of the results <ref type="bibr" target="#b8">[9]</ref> and few solutions for processing multidimensional databases consisting of a large number of images in medicine. In this study, we consider various parallel computing methods and technologies to reduce the multidimensionality of the database at the preprocessing stage, which will allow for better accuracy and not significantly increase the overall solution time.</p><p>Challenges in analysing and predicting patient diagnoses include issues such as incomplete or inaccurate data, poorly formulated models, incorrect assumptions, high data dimensionality, and lack of standards or methodological guidance. These factors can make it difficult to make accurate predictions, lead to distorted results, and degrade the quality of diagnosis. Correcting these problems requires correct data preprocessing methods, adequate mathematical models and clear standards to ensure the accuracy and reliability of predictions.</p><p>Preprocessing, as a way of solving ill-posed problems, helps to avoid complexities, in particular the 'curse of dimensionality', by reducing the multidimensionality of the database to smaller dimensions that still retain important information. 
This includes methods such as linear discriminant analysis (LDA) and principal component analysis (PCA), which are used to improve data quality in machine learning, in particular to increase classification and regression accuracy <ref type="bibr" target="#b9">[10]</ref>. An appropriate choice of preprocessing methods can significantly improve prediction, which underlines the importance of this stage in data processing.</p><p>In <ref type="bibr" target="#b10">[11]</ref>, the authors propose a multidimensional choledochal database, which we used to test the effectiveness of the proposed approach. This database contains both microscopic hyperspectral images and colour RGB images in the same field of view for deep learning studies. All the images in this database have been evaluated and labelled by experienced pathologists, making them suitable for training neural networks. The database is very useful for researchers developing new multidimensional deep learning algorithms for pathological diagnosis, as it contains morphology, spectrum, and information about biochemical changes of the samples. Few three-dimensional databases for research have been published on the Internet. To date, the presented multidimensional choledochal database is the first publicly available database of choledochal pathology that contains both microscopic hyperspectral and colour RGB images with annotations of choledochal sections.</p><p>The authors of <ref type="bibr" target="#b11">[12]</ref> propose a method for early detection of cholangiocarcinoma from hyperspectral images of microscopic tissue using the ResNet-50 model, which achieves an accuracy of 82.4%. In contrast to that work, we additionally reduce the dimensionality of the database using two methods: PCA and LDA. 
In addition, we parallelised these methods using modern parallel computing technologies Dask-ML and MPI, which allowed us to significantly improve data processing efficiency and diagnostic accuracy.</p><p>In <ref type="bibr" target="#b12">[13]</ref>, the authors use hyperspectral imaging (HSI), which offers a promising way to improve liver cancer diagnosis due to its ability to capture detailed continuous spectral and spatial information that is beyond the visible range of the human eye. Classification of cholangiocarcinoma using HSI is challenging due to its high dimensionality. To solve this problem, this article presents a network called MedisawHSI. As a result, they managed to achieve an accuracy of 93.35%. As we can see, the authors managed to achieve better accuracy compared to <ref type="bibr" target="#b11">[12]</ref>. In our opinion, the accuracy of the results has improved because they used the division of the hyperspectral image into smaller overlapping regions, which are then classified individually based on their spectral characteristics.</p><p>In our study, we propose a method to enhance the accuracy and efficiency of cancer prognostication by reducing data dimensionality through PCA and LDA, parallelized using Dask-ML technology. This approach not only improves diagnostic accuracy but also significantly reduces preprocessing time. Similar to our efforts to address the challenges of time-consuming processes in medical data analysis, recent advancements in the Internet of Medical Things (IoMT) have also focused on improving the security and efficiency of data handling. For instance, a Timestamp-based Secret Key Generation (T-SKG) scheme has been developed for resource-constrained IoMT devices to ensure secure data transmission without direct key sharing, thereby addressing vulnerabilities in traditional key sharing mechanisms <ref type="bibr" target="#b13">[14]</ref>. 
This parallel development in secure data processing complements our efforts to enhance the reliability and speed of medical diagnostics.</p><p>The purpose of our article is to compare the efficiency of reducing the multidimensionality of the database using PCA and LDA methods applied at the preprocessing stage and parallelised using Dask-ML and MPI technologies to reduce processing time and improve the accuracy of data analysis in medicine, in particular, in the diagnosis of cholangiocarcinoma.</p><p>The main contributions of the paper are as follows:</p><p>• An improved approach to reducing data dimensionality using PCA and LDA methods is proposed, contributing to the accuracy of disease prediction. • For the first time, Dask-ML technologies are used to parallelize PCA and LDA methods, significantly reducing data processing time. • A comparative analysis of the performance and efficiency of PCA and LDA methods in reducing data dimensionality to enhance forecasting results is conducted.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methods and materials</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Overview of the algorithm of the proposed approach</head><p>The proposed approach consists of two parts and is schematically presented in Figure <ref type="figure" target="#fig_0">1</ref>:</p><p>1. First, we apply data preprocessing to reduce the dimensionality of our multidimensional database while preserving its meaningful characteristics. We consider two dimensionality reduction methods: PCA and LDA. To determine the effectiveness of each method, we parallelise both and analyse the results. For parallelisation we use the Dask-ML library <ref type="bibr" target="#b14">[15]</ref>, a toolkit that provides parallelised implementations of machine learning algorithms. Specifically, for PCA we use the Dask-ML PCA implementation, which represents data as a Dask Array and automatically distributes computations across multiple processors or computers. 2. Next, we work with the processed data, namely the smaller database obtained after applying PCA or LDA. The ResNet-50 network is used for classification in order to evaluate how reducing the dimensionality of the database, as a way of dealing with an ill-posed problem, affects cancer diagnosis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Overview of sequential PCA and LDA methods and their comparative characteristics</head><p>PCA is a statistical procedure that uses an orthogonal transformation to convert a group of correlated variables into a group of uncorrelated variables. Instead of discarding weak predictors, PCA generates new predictors that are uncorrelated. But in general, PCA works better if the data set contains independent but uncorrelated predictors, and another problem is the choice of the number of principal components <ref type="bibr" target="#b15">[16]</ref>. The main goal of LDA is to project a dataset with a large number of features into a smaller space with good class separability, which reduces computational costs <ref type="bibr" target="#b16">[17]</ref>. Dimensionality reduction techniques often require intensive computation and do not easily scale to large datasets. Recent advances in high-performance measurements using physical objects such as sensors, as well as the results of complex numerical simulations, generate data of extremely high dimensionality. It is becoming increasingly difficult to process such data sequentially. Our review of the literature showed that these methods have been parallelised on distributed-memory machines with MPI. The reported results show that their structure provides very good scalability for large problem sizes across the entire range of tested processor configurations <ref type="bibr" target="#b17">[18]</ref>, <ref type="bibr" target="#b18">[19]</ref>.</p><p>When applying the PCA method, we first convert the data into feature vectors (we represent one image as one feature vector containing the pixels of the image). We then standardise the data so that all features have the same weight.</p><p>An important step in understanding the relationships between attributes is the covariance matrix we build for the standardised data. 
A covariance matrix is a square matrix that contains the covariances between all pairs of variables in the data set. Covariance measures how much two variables change together. Figure <ref type="figure" target="#fig_1">2</ref> shows a part of the covariance matrix of size 100 × 100. In this case, we obtained:</p><p>1. The diagonal elements are equal to 1, indicating that each variable is perfectly correlated with itself. For standardised data, these values are always 1. We can see that most of the matrix elements have positive covariance, which indicates that the variables in our dataset are likely to be correlated with each other. Now we can calculate the eigenvectors and eigenvalues. We sort the eigenvectors in descending order of their eigenvalues and keep the leading ones as the principal components. In our case, we kept two components because we were reducing the database to two dimensions. Each dot in Figure <ref type="figure">3</ref> corresponds to one image. The colours of the dots differ depending on the folder to which they belong: 'L', 'N' or 'P'.</p><p>This figure gives a visual impression of how the images are distributed in three-dimensional space in terms of their height, width, and colour representation in RGB channels.</p></div>
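<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sequential PCA steps described above (flattening each image into a feature vector, standardising, building the covariance matrix, and keeping the two leading eigenvectors) can be sketched as follows. This is a minimal illustration that assumes NumPy and small synthetic images; the function name pca_two_components and the toy data are hypothetical and are not the code used in our experiments.</p>
```python
import numpy as np


def pca_two_components(images):
    """Project flattened images onto their first two principal components."""
    # One image becomes one feature vector (row) of pixel values.
    X = np.asarray(images, dtype=float).reshape(len(images), -1)

    # Standardise so every feature has zero mean and unit variance;
    # constant features keep a divisor of 1 to avoid division by zero.
    std = X.std(axis=0, ddof=1)
    std[std == 0.0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Covariance matrix of the standardised data (features x features).
    cov = np.cov(Z, rowvar=False)

    # eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:2]

    # Keep the two leading eigenvectors and project the data onto them.
    components = eigenvectors[:, order]
    return Z @ components, eigenvalues[order]


rng = np.random.default_rng(0)
toy_images = rng.random((12, 4, 4, 3))  # 12 synthetic 4x4 RGB "images"
projected, top_eigenvalues = pca_two_components(toy_images)
```
<p>Here projected holds one two-dimensional point per image, which is the kind of representation that a 2D scatter plot of the reduced data would display.</p></div>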
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 3: 3D model</head><p>As mentioned before, the multidimensional choledoch database consists of images of three types. That is why we did not reduce the dimensionality of the entire database at once, but separately for the L, N and P images. We saved the reduced images in .h5 format. Figure <ref type="figure" target="#fig_3">4</ref> shows the original image. An overview of the key characteristics and differences between PCA and LDA is provided in Table <ref type="table" target="#tab_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Formal description of the parallel PCA and LDA algorithm using Dask-ML</head><p>The next stage of our research involves parallelizing PCA and LDA using the Dask-ML library <ref type="bibr" target="#b19">[20]</ref>, which allows us to scale computations to multiple processors or computers in a cluster. This is especially useful when working with large databases.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Stages of the parallel PCA algorithm using Dask-ML: 1. Calculation of the covariance matrix</head><p>• The input data is centered by subtracting the average value of each feature, which is represented by formula (1): $X_{\mathrm{centered}} = X - \operatorname{mean}(X)$; (1)</p><p>• The covariance matrix is calculated from the centered data, see formula (<ref type="formula" target="#formula_0">2</ref>)</p><formula xml:id="formula_0">\Sigma = \frac{1}{n-1} X_{\mathrm{centered}}^{T} X_{\mathrm{centered}}.<label>(2)</label></formula></div>
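<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate how formulas (1) and (2) can be evaluated block by block, the sketch below splits the rows of X into chunks, computes partial sums for each chunk concurrently, and combines them into the covariance matrix. It assumes NumPy and uses a standard-library thread pool as a stand-in for the Dask scheduler; the helper names chunk_statistics and parallel_covariance are hypothetical, and this is not the Dask-ML implementation itself.</p>
```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunk_statistics(chunk):
    """Return row count, column sums and Gram matrix of one row block."""
    return chunk.shape[0], chunk.sum(axis=0), chunk.T @ chunk


def parallel_covariance(X, n_chunks=4):
    """Covariance of X (samples x features) assembled from row blocks."""
    chunks = np.array_split(X, n_chunks, axis=0)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        stats = list(pool.map(chunk_statistics, chunks))

    n = sum(s[0] for s in stats)
    mean = sum(s[1] for s in stats) / n
    gram = sum(s[2] for s in stats)

    # X_c^T X_c = X^T X - n * mean mean^T, so the centered product of
    # formula (2) follows from the per-block sums without a second pass.
    return (gram - n * np.outer(mean, mean)) / (n - 1)


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
sigma = parallel_covariance(X)
```
<p>The single-pass identity keeps the blocks independent of each other, at the price of some numerical sensitivity when the feature means are large compared with the spread of the data.</p></div>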
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Calculation of Householder coefficients</head><p>• Based on the resulting covariance matrix, we calculate the Householder coefficient required to update the matrices Q and R, where Q is the orthogonal matrix from the QR decomposition and R is the upper triangular matrix from the QR decomposition; • We calculate the norm of the vector v for its normalization, which is represented by formula (<ref type="formula" target="#formula_1">3</ref>)</p><formula xml:id="formula_1">\|v\| = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2};<label>(3)</label></formula><p>• Create a normalized vector u using formula (<ref type="formula" target="#formula_2">4</ref>)</p><formula xml:id="formula_2">u = \frac{v}{\|v\|};<label>(4)</label></formula><p>• Calculate the coefficient β used to create the Householder reflection</p><formula xml:id="formula_3">\beta = \frac{2}{u^{T} u}.<label>(5)</label></formula><p>3. Update matrices Q and R • Divide the calculations of Q and R into parallel parts;</p><p>• Each column i of the matrix A is processed separately: o Calculate a part of the matrix Q (Qi); o Calculate a part of the matrix R (Ri); • Combine the parts along axis 1 (horizontally) to obtain the full matrices Q and R.</p></div>
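<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quantities in formulas (3)-(5) and the update of Q and R can be sketched with a plain sequential Householder QR. The sketch assumes NumPy; the function name householder_qr is hypothetical, the choice of the vector v follows the usual textbook construction (an assumption, since v is not defined in the text), and the column-wise parallel split described in step 3 is intentionally left out.</p>
```python
import numpy as np


def householder_qr(A):
    """QR decomposition of a square or tall matrix via Householder reflections."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()
    for i in range(min(m - 1, n)):
        x = R[i:, i]
        # v maps x onto a multiple of the first unit vector; the sign choice
        # avoids cancellation when x[0] is close to the norm of x.
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        norm_v = np.sqrt(np.sum(v ** 2))      # formula (3)
        if norm_v == 0.0:
            continue                          # nothing to eliminate in this column
        u = v / norm_v                        # formula (4)
        beta = 2.0 / (u @ u)                  # formula (5); equals 2 for unit u
        # Apply H = I - beta * u u^T to the trailing rows of R and columns of Q.
        R[i:, :] -= beta * np.outer(u, u @ R[i:, :])
        Q[:, i:] -= beta * np.outer(Q[:, i:] @ u, u)
    return Q, R


rng = np.random.default_rng(2)
M = rng.normal(size=(6, 4))
cov = np.cov(M, rowvar=False)  # a small symmetric test matrix
Q, R = householder_qr(cov)
```
<p>Because u is already normalized, β is always 2 in this sketch; the general expression is kept only to mirror formula (5).</p></div>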
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Calculating eigenvalues and eigenvectors</head><p>• Eigenvalues are calculated from the diagonal elements of the matrix R: $\lambda = \operatorname{diag}(R)^{2}$;</p><p>• Eigenvectors are calculated by solving a system of linear equations using the Gaussian method for each eigenvalue: $A e_i = \lambda_i e_i$. • This process is parallelized to speed up the calculations:</p><p>o Divide the array of eigenvalues into N subparts, where N is the number of threads.</p><p>o Calculate the eigenvectors for all subparts concurrently.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conversion of eigenvectors to the basis A</head><p>• Eigenvectors obtained from the matrix R are converted to the basis A by multiplying by the matrix Q, as shown in (<ref type="formula" target="#formula_5">6</ref>)</p><formula xml:id="formula_5">e_{A} = Q e_{R}.<label>(6)</label></formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Stages of the parallel LDA algorithm using Dask-ML:</head><p>1. Use Dask-ML to calculate the feature means and center the data in parallel; 2. Use Dask-ML to compute the mean vector of each class in parallel; 3. Calculate the scatter matrix of each class in parallel; 4. Use Dask-ML to calculate the between-class scatter matrix in parallel; 5. Sum the per-class scatter matrices in parallel to obtain the within-class scatter matrix; 6. Use Dask-ML to calculate eigenvectors and eigenvalues in parallel.</p></div>
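<div xmlns="http://www.tei-c.org/ns/1.0"><p>The LDA stages listed above can be illustrated with a compact sequential sketch that builds the within-class and between-class scatter matrices and solves the resulting eigenproblem. It assumes NumPy and synthetic data with three classes labelled L, N and P; the function name lda_projection is hypothetical, and the per-class loop marks the place where the work could be distributed rather than showing the Dask-ML code.</p>
```python
import numpy as np


def lda_projection(X, y, n_components):
    """Fisher LDA directions from within- and between-class scatter matrices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))
    S_b = np.zeros((n_features, n_features))

    # Each class contributes independently, which is what makes
    # steps 2-5 natural candidates for parallel execution.
    for label in np.unique(y):
        X_k = X[y == label]
        mean_k = X_k.mean(axis=0)
        centered = X_k - mean_k
        S_w += centered.T @ centered
        diff = (mean_k - overall_mean).reshape(-1, 1)
        S_b += X_k.shape[0] * (diff @ diff.T)

    # Solve the eigenproblem of pinv(S_w) S_b and keep the leading directions.
    eigenvalues, eigenvectors = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigenvalues.real)[::-1][:n_components]
    W = eigenvectors[:, order].real
    return X @ W, S_w, S_b


rng = np.random.default_rng(3)
X = np.vstack([rng.normal(loc=c, size=(30, 4)) for c in (0.0, 3.0, 6.0)])
y = np.repeat(["L", "N", "P"], 30)
Z, S_w, S_b = lda_projection(X, y, n_components=2)
```
<p>With three classes at most two directions carry class information, which matches the (k-1) bound for LDA given in Table 1.</p></div>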
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Using the ResNet-50 architecture</head><p>Our analysis of the literature showed that several disease prediction methods are effective for the selected dataset, namely ResNet-50, InceptionResNetV2, Random Forest, etc. We decided to focus on the ResNet-50 method for the following reasons: − The authors of the article <ref type="bibr" target="#b11">[12]</ref> used ResNet-50 in their research and achieved good results.</p><p>This confirms the effectiveness of this method for our purposes. − According to our analysis, the InceptionResNetV2 network was used by the authors of <ref type="bibr" target="#b8">[9]</ref>, but it was designed to work with multidimensional databases. Since our goal is to reduce the dimensionality of the database using parallel PCA and LDA algorithms, the InceptionResNetV2 method does not meet our needs. − ResNet-50 is known to be an effective method for image classification and has shown successful results with different types of data. This makes it suitable for our task of predicting diseases from medical images.</p><p>Thus, when choosing ResNet-50, we took into account not only its use in earlier studies, but also its suitability for our specific goals and limitations.</p><p>The authors of <ref type="bibr" target="#b11">[12]</ref> emphasized the importance of data preparation, which included normalization and cropping of the original images, to achieve good results. We decided to follow these steps by reducing the size of all images, which confirms the adaptability of the method to our case.</p><p>To achieve the best performance, we split the dataset into non-overlapping training (6800 images) and test (210 images) sets.</p><p>To determine the accuracy of the ResNet-50 neural network, we counted the test images for which the network gave correct answers and those for which it did not. 
We considered four cases to calculate the results:</p><p>• True Positive (TP) is a case where a person had cancer and the neural network gave the result that the person had cancer. • True Negative (TN) is a case where a person did not have cancer and the neural network correctly determined that the person did not have cancer. • False Negative (FN) is a case where a person had cancer, but the neural network reported that they did not. • False Positive (FP) is a case where a person did not have cancer, but the neural network reported that they did.</p><p>To evaluate the forecasting efficiency, we chose the following metrics: Recall (the ratio of correctly identified positive cases to all actually positive cases), Precision (the ratio of correctly identified positive cases to all cases that the model identified as positive), and Accuracy (the ratio of correct predictions to the total number of observations). The formal representation of these metrics is given by formulas (<ref type="formula" target="#formula_6">7</ref>)-(9).</p><formula xml:id="formula_6">\mathrm{Recall} = \frac{TP}{TP + FN},<label>(7)</label></formula><formula xml:id="formula_7">\mathrm{Precision} = \frac{TP}{TP + FP},<label>(8)</label></formula><p>$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$. (9)</p></div>
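<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a worked example, the three metrics can be computed directly from the four counts, using the definitions that match the verbal descriptions above (Recall relative to all actually positive cases, Precision relative to all cases predicted as positive). The function name classification_metrics is a hypothetical helper; the counts are those listed in Table 4.</p>
```python
def classification_metrics(tp, tn, fp, fn):
    """Recall, precision and accuracy from confusion-matrix counts."""
    recall = tp / (tp + fn)          # share of actual positives that were found
    precision = tp / (tp + fp)       # share of predicted positives that were right
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return recall, precision, accuracy


# Counts as listed in Table 4 of this paper.
recall, precision, accuracy = classification_metrics(tp=107, tn=72, fp=28, fn=3)
print(f"recall={recall:.3f} precision={precision:.3f} accuracy={accuracy:.3f}")
```
<p>With these counts the accuracy is 179/210, i.e. about 0.852, which agrees with the 85.2% reported for the proposed approach.</p></div>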
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results of numerical experiments</head><p>Before presenting the results of the study, we describe the computers on which the calculations were performed. Computer models: Apple M1 Pro, Asus VivoBook. Operating systems: Windows 10, macOS Sonoma 14.5.</p><p>Available disk space: 200 GB. RAM: 16 GB. Number of cores: 8. Network connection speed: 100 Mbit/s. Software tools used to conduct our research: CLion, Kaggle editor, Google Colab. In our work, we chose the choledoch dataset <ref type="bibr" target="#b20">[21]</ref>, which consists of three files:</p><p>• L: samples with parts of cancerous areas; • N: complete cancerous areas; • P: no cancerous areas. Each file type includes three directories: • annotation: contains the coordinates of the points where there are cancerous areas;</p><p>• hyper: general description of the image, including the number of columns, rows, channels, etc.; • rgb: contains all multidimensional images. This dataset is multidimensional: the DB dimensionality is (1728, 2304, 3), where the first number indicates the height of the images, the second the width, and the third the number of dimensions.</p><p>Accordingly, separately for each type of image, we calculated the execution time of the sequential algorithms when reducing the multidimensional database using the PCA and LDA methods, as well as the total execution time. The results are presented in Table <ref type="table" target="#tab_1">2</ref>. The execution time of the program that implements the proposed parallel PCA and LDA methods, depending on the number of cores, is shown in Table <ref type="table" target="#tab_2">3</ref>. 
As Tables <ref type="table" target="#tab_2">2 and 3</ref> show, the proposed division into threads and subtasks reduced the execution time of the parallel algorithm, which is important at the preprocessing stage in order to reduce the dimensionality of data without significant time costs.</p><p>Next, it is important to analyze the overall diagnostic results. Table <ref type="table" target="#tab_3">4</ref> shows how many test images the neural network classified correctly and how many it did not. The accuracy of our proposed method reached 85.2%, which is about 3 percentage points better than in <ref type="bibr" target="#b11">[12]</ref>. This leads to the conclusion that data preprocessing and dimensionality reduction can help avoid ill-posed problems and improve the accuracy of the results. At the same time, we also managed to reduce the significant time costs at the preprocessing stage by parallelizing the PCA method using parallel computing technology.</p></div>
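<div xmlns="http://www.tei-c.org/ns/1.0"><p>Speedup and efficiency can be derived from the execution times in Table 3 with the usual definitions S_p = T_1 / T_p and E_p = S_p / p. The sketch below takes the one-core run of the parallel program as the baseline T_1, which is an assumption; a different baseline, for example the sequential times in Table 2, gives different values. The function name speedup_and_efficiency is hypothetical.</p>
```python
def speedup_and_efficiency(times_by_cores):
    """Map each core count p to (S_p, E_p) with S_p = T_1 / T_p and E_p = S_p / p."""
    t1 = times_by_cores[1]
    return {p: (t1 / t, t1 / t / p) for p, t in sorted(times_by_cores.items())}


# Execution times in minutes, copied from Table 3.
parallel_pca = {1: 5.012, 2: 3.343, 4: 2.112, 8: 2.005}
parallel_lda = {1: 5.451, 2: 3.020, 4: 2.343, 8: 2.151}

for name, times in (("PCA", parallel_pca), ("LDA", parallel_lda)):
    for cores, (s, e) in speedup_and_efficiency(times).items():
        print(f"{name} cores={cores} speedup={s:.2f} efficiency={e:.2f}")
```
<p>Reporting both quantities per core count makes it easier to see where adding cores stops paying off for each method.</p></div>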
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions</head><p>Summarizing the results of this study, it should be emphasized that LDA is slower than PCA. We believe that this is due to the following factors: first, the calculation of the covariance matrix, as LDA needs to calculate the covariance matrix for each class in the dataset, which can be a computationally expensive operation, especially for large datasets with many classes. PCA, on the other hand, only needs to compute one covariance matrix for the entire dataset. Second, solving the eigenvalue problem, as LDA needs to solve the eigenvalue problem for the generalized eigenvalue matrix, which can be a computationally intensive task, especially for large matrices. PCA, on the other hand, requires solving the eigenvalue problem for a standardized covariance matrix, which is usually easier. Third, the number of eigenvalues: LDA typically requires only k eigenvalues to be calculated, where k is the desired projection dimension, while PCA requires all eigenvalues of the covariance matrix to be calculated. Fourth, sensitivity to noise: LDA can be more sensitive to noise in the data than PCA because LDA uses class label information, which can be sensitive to noise, while PCA does not use this information and therefore may be less sensitive to noise in the data. In general, LDA can be slower than PCA due to the more complex computations it requires. When applying the parallel PCA algorithm, we obtained a maximum speedup of about 3.5 times and an efficiency of 0.81. The dimensionality was reduced from three to two, which significantly improved the performance of our diagnostic system.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Diagram of the algorithm of the proposed approach</figDesc><graphic coords="4,199.53,63.55,203.40,300.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>2 .</head><label>2</label><figDesc>Off-diagonal elements: we can see different colours reflecting the level of covariance between different variables. Lighter colours (closer to yellow) indicate high positive covariance (variables that change in the same direction), while darker colours (closer to black) indicate low or negative covariance (variables that change in opposite directions). 3. The scale on the right shows how the colours of our matrix are interpreted: values closer to 1 indicate high positive covariance, and values closer to -0.2 indicate negative covariance.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Part of the covariance matrix for standardised data</figDesc><graphic coords="5,182.13,118.73,238.80,173.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Original image</figDesc><graphic coords="6,208.23,570.70,186.60,136.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :Figure 6 :</head><label>56</label><figDesc>Figure 5: 2D model</figDesc><graphic coords="6,138.63,345.91,325.80,195.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 Comparative characteristics of PCA and LDA methods for data dimensionality reduction</head><label>1</label><figDesc></figDesc><table><row><cell>Criterion</cell><cell>PCA</cell><cell>LDA</cell></row><row><cell>Purpose</cell><cell>Dimensionality reduction by</cell><cell>Reducing dimensionality by</cell></row><row><cell></cell><cell>maximizing the total variance</cell><cell>maximizing the difference</cell></row><row><cell></cell><cell>of the data</cell><cell>between classes</cell></row><row><cell>Methodology</cell><cell>Eigenvectors and eigenvalues</cell><cell>Eigenvectors and eigenvalues</cell></row><row><cell></cell><cell>of the covariance matrix</cell><cell>of the scatter matrix between</cell></row><row><cell></cell><cell></cell><cell>and within classes</cell></row><row><cell>Orientation</cell><cell>Independent of class</cell><cell>Class oriented</cell></row><row><cell>Application</cell><cell>Visualization, noise reduction,</cell><cell>Classification, improving the</cell></row><row><cell></cell><cell>data preprocessing</cell><cell>separation between classes</cell></row><row><cell>Data types</cell><cell>Any data</cell><cell>Class labeled data</cell></row><row><cell>Advantages</cell><cell>-Keeps the maximum amount</cell><cell>-Maximizes the separation</cell></row><row><cell></cell><cell>of variation;</cell><cell>between classes;</cell></row><row><cell></cell><cell>-Useful for data visualization;</cell><cell>-Effective for classification</cell></row><row><cell></cell><cell>-Independent of class</cell><cell>tasks;</cell></row><row><cell></cell><cell></cell><cell>-Can improve classification</cell></row><row><cell></cell><cell></cell><cell>results;</cell></row><row><cell>Disadvantages</cell><cell>-The loss of interpretation;</cell><cell>-Assumption of data normality;</cell></row><row><cell></cell><cell>-Not always suitable for</cell><cell>-Loss of efficiency with a large</cell></row><row><cell></cell><cell>classification tasks</cell><cell>number of classes or unequal</cell></row><row><cell></cell><cell></cell><cell>covariance matrices;</cell></row><row><cell>Dimensionality reduction</cell><cell>To the number of principal</cell><cell>To (k-1), where k is the number</cell></row><row><cell></cell><cell>components that retain most</cell><cell>of classes</cell></row><row><cell></cell><cell>of the variance</cell><cell></cell></row><row><cell>Data requirements</cell><cell>Does not require class labels</cell><cell>Requires class labels and</cell></row><row><cell></cell><cell></cell><cell>assumes a normal distribution</cell></row><row><cell></cell><cell></cell><cell>of data with equal covariance</cell></row><row><cell></cell><cell></cell><cell>matrices for each class</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 Execution time of the sequential PCA and LDA methods by image type, min</head><label>2</label><figDesc></figDesc><table><row><cell>Method</cell><cell>L</cell><cell>N</cell><cell>P</cell><cell>Total time</cell></row><row><cell>Sequential PCA</cell><cell>7.01</cell><cell>6.43</cell><cell>6.57</cell><cell>20</cell></row><row><cell>Sequential LDA</cell><cell>7.10</cell><cell>7.23</cell><cell>7.02</cell><cell>21.53</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 Execution time of parallel PCA and LDA methods depending on the number of cores, min</head><label>3</label><figDesc></figDesc><table><row><cell>Number of cores</cell><cell>1</cell><cell>2</cell><cell>4</cell><cell>8</cell></row><row><cell>Parallel PCA</cell><cell>5.012</cell><cell>3.343</cell><cell>2.112</cell><cell>2.005</cell></row><row><cell>Parallel LDA</cell><cell>5.451</cell><cell>3.020</cell><cell>2.343</cell><cell>2.151</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 ResNet-50 test results</head><label>4</label><figDesc>Next, we calculated the prediction accuracy indicators based on the proposed preprocessing stage and the use of the ResNet-50 network (see Table 5).</figDesc><table><row><cell>Result</cell><cell>Positive</cell><cell>Negative</cell></row><row><cell>Positive</cell><cell>107 (TP)</cell><cell>3 (FN)</cell></row><row><cell>Negative</cell><cell>28 (FP)</cell><cell>72 (TN)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5 Indicators of forecasting accuracy</head><label>5</label><figDesc></figDesc><table><row><cell>Result</cell><cell>Recall</cell><cell>Precision</cell><cell>Accuracy</cell></row><row><cell>Positive</cell><cell>0.793</cell><cell>0.972</cell><cell>0.852</cell></row><row><cell>Negative</cell><cell>0.722</cell><cell>0.965</cell><cell>-</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>The authors express their gratitude to the Armed Forces of Ukraine for providing the security necessary to perform this work. This work has been made possible only through the resilience and courage of the Ukrainian Army.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">On Intelligent Multiagent Approach to Viral Hepatitis B Epidemic Processes Simulation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Chumachenko</surname></persName>
		</author>
		<idno type="DOI">10.1109/DSMP.2018.8478602</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Second International Conference on Data Stream Mining &amp; Processing (DSMP)</title>
				<meeting><address><addrLine>Lviv</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018-08">Aug. 2018</date>
			<biblScope unit="page" from="415" to="419" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A Parallel Algorithm for the Detection of Eye Disease</title>
		<author>
			<persName><forename type="first">L</forename><surname>Mochurad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Panto</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-24475-9_10</idno>
	</analytic>
	<monogr>
		<title level="m">Advances in Intelligent Systems, Computer Science and Digital Economics IV</title>
		<title level="s">Lecture Notes on Data Engineering and Communications Technologies</title>
		<editor>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>He</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer Nature Switzerland</publisher>
			<pubPlace>Cham</pubPlace>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">158</biblScope>
			<biblScope unit="page" from="111" to="125" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A Hybrid Deep Learning-Based Approach for Brain Tumor Classification</title>
		<author>
			<persName><forename type="first">A</forename><surname>Raza</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics11071146</idno>
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page">1146</biblScope>
			<date type="published" when="2022-04">Apr. 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Time duration and health care resource use during cancer diagnoses in the United States: A large claims database analysis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gitlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>McGarvey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shivaprakash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Cong</surname></persName>
		</author>
		<idno type="DOI">10.18553/jmcp.2023.29.6.659</idno>
	</analytic>
	<monogr>
		<title level="j">J. Manag. Care Spec. Pharm</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="659" to="670" />
			<date type="published" when="2023-06">Jun. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">An Overview of Challenges in Medical Image Processing</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">O A</forename><surname>Deheyab</surname></persName>
		</author>
		<idno type="DOI">10.1145/3584202.3584278</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 6th International Conference on Future Networks &amp; Distributed Systems</title>
				<meeting>the 6th International Conference on Future Networks &amp; Distributed Systems<address><addrLine>Tashkent, Uzbekistan</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2022-12">Dec. 2022</date>
			<biblScope unit="page" from="511" to="516" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Advances in Data Preprocessing for Biomedical Data Fusion: An Overview of the Methods, Challenges, and Prospects</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.inffus.2021.07.001</idno>
	</analytic>
	<monogr>
		<title level="j">Inf. Fusion</title>
		<imprint>
			<biblScope unit="volume">76</biblScope>
			<biblScope unit="page" from="376" to="421" />
			<date type="published" when="2021-12">Dec. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Application of Data Preprocessing in Medical Research</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">V</forename><surname>Bozhenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Tatarnikova</surname></persName>
		</author>
		<idno type="DOI">10.1109/WECONF57201.2023.10148004</idno>
	</analytic>
	<monogr>
		<title level="m">Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)</title>
				<meeting><address><addrLine>St. Petersburg</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2023-05">May 2023</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Parallel Algorithms for Interpolation with Bezier Curves and B-Splines for Medical Data Recovery</title>
		<author>
			<persName><forename type="first">L</forename><surname>Mochurad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mochurad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">6th International Conference on Informatics and Data-Driven Medicine</title>
				<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3609</biblScope>
			<biblScope unit="page" from="189" to="197" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">InceptionRestNetV2 Transfer Learning Approach for Cholangiocarcinoma Diagnosis utilizing Multidimensional Choledochal Database</title>
		<author>
			<persName><forename type="first">Janarththanan</forename><surname>Jeyagopal</surname></persName>
		</author>
		<idno type="DOI">10.13140/RG.2.2.21993.10083</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>Unpublished</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Review of Dimension Reduction Methods</title>
		<author>
			<persName><forename type="first">S</forename><surname>Nanga</surname></persName>
		</author>
		<idno type="DOI">10.4236/jdaip.2021.93013</idno>
	</analytic>
	<monogr>
		<title level="j">J. Data Anal. Inf. Process</title>
		<imprint>
			<biblScope unit="volume">09</biblScope>
			<biblScope unit="issue">03</biblScope>
			<biblScope unit="page" from="189" to="231" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A Multidimensional Choledoch Database and Benchmarks for Cholangiocarcinoma Diagnosis</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chu</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2947470</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="149414" to="149421" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">ResNet-50 based Method for Cholangiocarcinoma Identification from Microscopic Hyperspectral Pathology Images</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1088/1742-6596/1880/1/012019</idno>
	</analytic>
	<monogr>
		<title level="j">J. Phys. Conf. Ser</title>
		<imprint>
			<biblScope unit="volume">1880</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">012019</biblScope>
			<date type="published" when="2021-04">Apr. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Cholangiocarcinoma Classification using MedisawHSI: A Breakthrough in Medical Imaging</title>
		<author>
			<persName><forename type="first">H</forename><surname>Namburu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">N</forename><surname>Munipalli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Vanga</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pasam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sikhakolli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chinnadurai</surname></persName>
		</author>
		<idno type="DOI">10.1109/ic-ETITE58242.2024.10493579</idno>
	</analytic>
	<monogr>
		<title level="m">Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE)</title>
				<meeting><address><addrLine>Vellore, India</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2024-02">Feb. 2024</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A secure data transmission framework for IoT enabled healthcare</title>
		<author>
			<persName><forename type="first">S</forename><surname>Saif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Biswas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Haq</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kovtun</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.heliyon.2024.e36269</idno>
	</analytic>
	<monogr>
		<title level="j">Heliyon</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">16</biblScope>
			<date type="published" when="2024-08">Aug. 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence</title>
		<author>
			<persName><forename type="first">S</forename><surname>Raschka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Patterson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Nolet</surname></persName>
		</author>
		<idno type="DOI">10.3390/info11040193</idno>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">193</biblScope>
			<date type="published" when="2020-04">Apr. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Impact of Preprocessing Methods on Healthcare Predictions</title>
		<author>
			<persName><forename type="first">P</forename><surname>Misra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Yadav</surname></persName>
		</author>
		<idno type="DOI">10.2139/ssrn.3349586</idno>
	</analytic>
	<monogr>
		<title level="j">SSRN Electron. J</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Analysis of Dimensionality Reduction Techniques on Big Data</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">T</forename><surname>Reddy</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2020.2980942</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="54776" to="54788" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Parallel Framework for Dimensionality Reduction of Large-Scale Datasets</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Samudrala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Aluru</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ganapathysubramanian</surname></persName>
		</author>
		<idno type="DOI">10.1155/2015/180214</idno>
	</analytic>
	<monogr>
		<title level="j">Sci. Program</title>
		<imprint>
			<biblScope unit="volume">2015</biblScope>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Development of Programmable Home Security Using GSM System for Early Prevention</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A J</forename><surname>Alsayaydeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Aziz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">I A</forename><surname>Rahman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ARPN Journal of Engineering and Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="88" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence</title>
		<author>
			<persName><forename type="first">S</forename><surname>Raschka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Patterson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Nolet</surname></persName>
		</author>
		<idno type="DOI">10.3390/info11040193</idno>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">193</biblScope>
			<date type="published" when="2020-04">Apr. 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Microscopic Hyperspectral Choledoch Dataset</title>
		<ptr target="https://www.kaggle.com/datasets/ethelzq/multidimensional-choledoch-database" />
		<imprint>
			<date type="published" when="2024-07-20">Jul. 20, 2024</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
