<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Assigning different activation functions in artificial neural networks with the goal of achieving higher prediction accuracy *</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Gytis</forename><surname>Baravykas</surname></persName>
							<email>gytis.baravykas@ktu.lt</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Informatics</orgName>
								<orgName type="institution">Kaunas University of Technology</orgName>
								<address>
									<addrLine>Studentu 50</addrLine>
									<postCode>51368</postCode>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Justas</forename><surname>Kardoka</surname></persName>
							<email>justas.kardoka@ktu.lt</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Informatics</orgName>
								<orgName type="institution">Kaunas University of Technology</orgName>
								<address>
									<addrLine>Studentu 50</addrLine>
									<postCode>51368</postCode>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Domas</forename><surname>Grigaliunas</surname></persName>
							<email>domas.grigaliunas@ktu.lt</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Informatics</orgName>
								<orgName type="institution">Kaunas University of Technology</orgName>
								<address>
									<addrLine>Studentu 50</addrLine>
									<postCode>51368</postCode>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Darius</forename><surname>Naujokaitis</surname></persName>
							<email>darius.naujokaitis@ktu.lt</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Informatics</orgName>
								<orgName type="institution">Kaunas University of Technology</orgName>
								<address>
									<addrLine>Studentu 50</addrLine>
									<postCode>51368</postCode>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="laboratory">Smart Grids and Renewable Energy Laboratory</orgName>
								<orgName type="institution">Lithuanian Energy Institute</orgName>
								<address>
									<postCode>44403</postCode>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="department">IVUS2024: Information Society</orgName>
								<orgName type="institution">University Studies</orgName>
								<address>
									<addrLine>2024, May 17</addrLine>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Assigning different activation functions in artificial neural networks with the goal of achieving higher prediction accuracy *</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">5C647556633742BB3870FEC7DF4E10CB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:29+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Activation functions</term>
					<term>artificial neural networks</term>
					<term>machine learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The research paper explores the concept of using multiple activation functions in artificial neural networks and investigates their impact on model performance. The experiments conducted on various models such as AlexNet, ResNet50, TuNet, and SimpleNN reveal insights into the effectiveness of different activation function combinations. The results indicate that using multiple activation functions can lead to modest improvements in model performance, particularly in image segmentation tasks where modifications to the UNet architecture show significant enhancements. However, for time series regression/forecasting tasks, the experiments demonstrate that using multiple activation functions does not significantly improve prediction accuracy. Therefore, the paper concludes that while there are some benefits to using multiple activation functions in certain scenarios, the choice of activation function should be based on the specific task and dataset.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Artificial neural networks (ANNs) are becoming increasingly relevant. Although the idea of ANNs spans multiple decades, various ANN architectures are still being actively developed to this day. Among the most important components of ANNs are activation functions. They are often used to introduce non-linearity and, in turn, allow ANNs to capture intricate features in the data. Although different activation functions have been developed and studied, there is no existing body of work that considers the choice of activation functions for solar power generation forecasting. In this paper, we propose a new approach for improving the results of ANN predictions by changing the activation functions in the ANN. We have chosen to test our approach on a range of different machine learning tasks, with the goal of introducing a new, alternative hyperparameter that works across different ANN architectures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Literature review</head><p>Activation functions in an ANN are used to introduce non-linear relations to the data, so that the network can better fit the data and improve the accuracy on a given task. They are a very common part of ANNs and are often omitted from neural network structure diagrams. Many mathematical functions have been introduced to achieve non-linearity, such as ReLU, Tanh, Sigmoid and others, each tailored to specific tasks. In this paper we entertain the idea of using not one activation function per layer or network, but multiple, assigning a different one to each neuron.</p><p>The importance of activation functions is discussed in many recent works, and is reflected in their widespread usage in ANN architectures. Dubey has published a comprehensive overview of the most common activation functions, along with their characteristics and a performance comparison between them <ref type="bibr" target="#b0">[1]</ref>. They have found that different activation functions are better suited for certain machine learning tasks, and that in certain cases, alternative choices must be considered. Although there are some common choices, new activation functions are constantly being developed <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6]</ref>. Yu has created a modified activation function based on ReLU, with the goal of increasing the accuracy of classification tasks <ref type="bibr" target="#b1">[2]</ref>. Wang developed an activation function as a better alternative to other commonly used activation functions <ref type="bibr" target="#b2">[3]</ref>. The developed activation function, Smish, performed better than other common activation functions in classification tasks on open datasets. 
Wuraola has developed a family of activation functions intended for use in embedded systems <ref type="bibr" target="#b3">[4]</ref>. The proposed activation functions were shown to be computationally faster, and their use resulted in higher accuracy than other common activation functions in recurrent neural networks and logistic regression models. Kaytan has introduced a new non-monotonic activation function capable of achieving higher results than activation functions like Swish, Mish and others for image classification tasks <ref type="bibr" target="#b4">[5]</ref>. Chai developed a new model based on LSTM capable of achieving higher accuracy for short-term PV generation forecasts <ref type="bibr" target="#b5">[6]</ref>. The model uses a newly proposed activation function that helps solve the gradient disappearance problem and ensures high accuracy of the prediction results for short-term PV generation forecasting. There are also works in which the activation functions of the default implementation of model architectures are switched with other, alternative activation functions. Anami performed experiments comparing prediction results after switching the default activation function with other common activation functions <ref type="bibr" target="#b6">[7]</ref>. Wang has performed experiments in which they tried alternative activation functions in the VGG16, ResNet50 and LeNet architectures, achieving superior results <ref type="bibr" target="#b7">[8]</ref>. Essai Ali has tried to modify an LSTM by changing its Tanh functions to different activation functions <ref type="bibr" target="#b8">[9]</ref>. The author achieved his aim of increasing the classification accuracy from 86% to 88% on the Weather Reports dataset, and from 93% to 97% on the Japanese Vowels dataset. Consider the concept displayed in Figure <ref type="figure" target="#fig_0">1</ref>. 
In this example we have an input layer, a hidden layer of 2 neurons and one output layer. Each neuron has a different function applied to it. The calculations for such a network are as follows:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Activation functions</head><formula xml:id="formula_0">ℎ_𝑖 = ∑_{𝑗=1}^{𝑛} 𝑤_{𝑖𝑗} ⋅ 𝑥_𝑗 + 𝑏_𝑖 (1)
𝑧_1 = 𝑟𝑒𝑙𝑢(ℎ_1) (2)
𝑧_2 = 𝑡𝑎𝑛ℎ(ℎ_2) (3)
𝑜_1 = 𝑧_1 𝑤_𝑟𝑒𝑙𝑢 + 𝑧_2 𝑤_𝑡𝑎𝑛ℎ (4)</formula><p>where ℎ denotes the hidden layer pre-activations, 𝑤 the weights, 𝑥 the inputs, 𝑏 the biases, 𝑧 the activation function results and 𝑜 the outputs. In a convolutional neural network, activations play a similar role, but because there are no actual neurons in a convolutional layer, a different application is required. For the convolution layer, 2 approaches were introduced. For linear layers it is also possible to have a complete list of activation functions assigned; this idea is experimented with later in this paper. The number of combinations of such a list can be calculated as follows. In this case, 2 activation functions (ReLU, Tanh) raised to the power of 4 neurons equals 16 variations:</p><formula xml:id="formula_1">𝑣 = 𝑒^𝑛<label>(5)</label></formula><p>where 𝑣 is the number of variations, 𝑒 the number of elected activation functions and 𝑛 the number of neurons. It must also be noted that various activation functions can be used; the choice is not limited to the most commonly used activation functions such as ReLU, Tanh, Sigmoid, etc. The range of activation functions tested in this work is detailed in the experiments section.</p></div>
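As a concrete illustration of equations (1)-(4), the per-neuron scheme can be sketched in a few lines of NumPy. The weight values below are hypothetical toy numbers, not taken from the paper's experiments:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical toy weights for the 2-neuron hidden layer of Figure 1.
W = np.array([[0.5, -0.2],   # weights into hidden neuron 1
              [0.3,  0.8]])  # weights into hidden neuron 2
b = np.array([0.1, -0.1])
w_out = np.array([0.7, 0.4])  # output weights w_relu, w_tanh

def forward(x):
    h = W @ x + b                         # Eq. (1)
    z1 = relu(h[0])                       # Eq. (2): ReLU on neuron 1
    z2 = np.tanh(h[1])                    # Eq. (3): Tanh on neuron 2
    return z1 * w_out[0] + z2 * w_out[1]  # Eq. (4)

o = forward(np.array([1.0, 2.0]))
# Eq. (5): with 2 elected activations over 4 neurons, v = 2**4 = 16 variations
```

Note that each hidden neuron applies its own non-linearity before the weighted sum in the output, which is exactly what makes the assignment of functions a per-neuron choice rather than a per-layer one.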
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Models</head><p>A vast selection of CNN models has been proposed for image classification; many of them have complex implementations and long training times. The models chosen for this paper are of low to mid-range complexity, in order to test the theory. Starting with SimpleNN, a simple neural network with one hidden layer of N neurons. TuNet is a CNN with 2 convolutions, 2 pooling layers and 3 linear layers <ref type="bibr" target="#b9">[10]</ref>. AlexNet is a convolutional neural network (CNN) architecture that consists of five convolutional layers, three fully connected layers, and two pooling layers <ref type="bibr" target="#b9">[10]</ref>. The convolutional layers extract features from the input images, while the pooling layers reduce the dimensionality of the feature maps. The fully connected layers learn a mapping from the extracted features to the output classes. Some of the key innovations introduced by AlexNet include the use of rectified linear unit (ReLU) activation functions, dropout regularization, and data augmentation techniques.</p><p>ResNet50 derives its name from its depth, incorporating 50 layers <ref type="bibr" target="#b10">[11]</ref>. Notably, ResNet50 addresses the challenge of training deep networks by introducing residual connections that enable the direct flow of information across layers. This innovation mitigates the vanishing gradient problem, allowing for the successful training of extremely deep networks.</p><p>The architecture comprises building blocks known as residual blocks, each containing skip connections that bypass one or more layers. These skip connections facilitate the smooth propagation of gradients during backpropagation, enhancing the model's ability to capture intricate features. 
Additionally, ResNet50 employs batch normalization to accelerate training convergence and improve generalization performance.</p><p>UNet was used for the image segmentation tasks <ref type="bibr" target="#b11">[12]</ref>. It is a popular model with several modifications over the years <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref>. The model improved on the results of previous image segmentation models through its architecture, which consists of a contracting path used for capturing context and a symmetric expanding path that enables precise localization <ref type="bibr" target="#b11">[12]</ref>. The resulting architecture consists of 23 convolutional layers and utilizes the ReLU activation function. The model also heavily utilizes image augmentation, which enables it to achieve high accuracy without relying on many training images.</p></div>
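The skip connection at the heart of ResNet50's residual blocks can be illustrated with a minimal sketch; the 2x2 matrix below is a hypothetical stand-in for a block's convolutions, not the actual architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Toy stand-in for the learned transformation F(x) inside a residual block.
W = np.array([[0.1, 0.0],
              [0.0, 0.1]])

def residual_block(x):
    # relu(F(x) + x): the input bypasses the transformation via the skip
    # path, so gradients can flow directly through the identity term.
    return relu(W @ x + x)

y = residual_block(np.array([1.0, 2.0]))
```

Because the identity term contributes a constant gradient of 1, stacking many such blocks avoids the vanishing gradient problem described above.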
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Datasets</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.1.">Images</head><p>Several image datasets are popular for testing the performance of CNN models. CIFAR-100 is a dataset containing 60 000 32x32 color images with 100 classes (600 images per class). It is a subset of the Tiny Images dataset and is commonly used for fine-grained image classification <ref type="bibr" target="#b15">[16]</ref>. The dataset contains a wide variety of images of objects, animals, and textures. The images are labeled with both fine-grained and coarse labels. The fine-grained labels correspond to the specific object or scene in the image, while the coarse labels correspond to the superclass of the object or scene.</p><p>The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011 <ref type="bibr" target="#b16">[17]</ref>. The dataset includes 43 classes of traffic signs and more than 50,000 images.</p><p>The Cityscapes dataset is a popular image segmentation dataset that consists of 25 000 images captured from a moving vehicle <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref>. The images were taken in different cities in Germany under different weather conditions. The dataset consists of 50 different classes. Each dataset item consists of a horizontally joined image, in which the left half is the original photograph, while the right half is the semantically segmented version of the image.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.2.">Tabular</head><p>Two tabular datasets were incorporated in this paper: breast cancer and iris flower classification. The breast cancer dataset's features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass <ref type="bibr" target="#b17">[18]</ref>. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/.</p><p>The iris flowers dataset is one of the earliest datasets used in the literature on classification methods and is widely used in statistics and machine learning <ref type="bibr" target="#b18">[19]</ref>. The dataset contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are not linearly separable from each other. When performing the experiments, Obaid's work was used as a benchmark for the comparison of results <ref type="bibr" target="#b19">[20]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.3.">Timeseries</head><p>Time series data for Amazon stocks, with stock price, closing price and other attributes, was used <ref type="bibr" target="#b20">[21]</ref>. Additionally, a custom photovoltaic (PV) panel generation dataset was used. The data consists of about a year of meteorological and PV generation data. The PV generation data was retrieved from a PV station in Kaunas, Lithuania, while the publicly available meteorological data was retrieved from Oikolab and from the Lithuanian Hydrometeorological Service. We also attempted to include METAR data on cloud conditions at different altitudes, but utilizing this data did not improve the results, so it was left out of the dataset. Based on the observed linear relationships between different meteorological features and PV generation, certain meteorological features were chosen for the experiments (see Figure <ref type="figure" target="#fig_3">4</ref>). A strong linear relationship was observed between PV generation and both air temperature and surface solar radiation. It was noted that using other meteorological data improved the results as well, although these features did not seem to have a linear relationship with the PV generation data. In total, the dataset consists of the following 11 features (see Table <ref type="table" target="#tab_0">1</ref>). As can be seen from the table, a wide range of different meteorological variables was used.</p></div>
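The lagged-input construction used in the time series experiments can be sketched as follows; the synthetic series and the helper name `make_lagged` are illustrative, not the paper's code:

```python
import numpy as np

def make_lagged(series, lags=12):
    # Each sample holds the previous `lags` values; the target is the
    # next value in the series (the paper uses 12 lag values for PV data).
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    return X, y

series = np.arange(20, dtype=float)   # synthetic stand-in for PV readings
X, y = make_lagged(series, lags=12)

# Standardize features so the ranges of values are the same for all
# features, as described for the training data.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```

The same windowing applies to the Amazon stock experiments, only with 7 lag values instead of 12.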
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Environment</head><p>The Google Colab environment with a single NVIDIA Tesla T4 GPU was used for the AlexNet and ResNet50 experiments on CIFAR-100. For the GTSRB, UNet and LSTM experiments, the models were trained on a setup with two Tesla T4 GPUs. Amazon stock close predictions were performed on a Kaggle-provided CPU.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments and results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Image classification</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">CIFAR-100 with AlexNet</head><p>Inspired by Sharma's work <ref type="bibr" target="#b21">[22]</ref>, we chose AlexNet as the primary target. The main reason for choosing this architecture was that it has linear layers alongside convolution blocks. We began the experiments with the OriginalAlexNet implementation as a baseline with Tanh. Next, we experimented with changing only the linear layers: changing one layer, then changing both. Instead of applying a single activation function, we applied 2 or 3 in cyclic order. The best results were achieved with the Tanh and Softmax combination of functions, a 1.14% improvement in testing accuracy compared to the ReLU baseline; however, the Tanh baseline was still superior.</p><p>Later, we expanded the experimentation by modifying the convolutional (CNN) layers. Here, the implementation consisted of changing activation functions per channel. This showed marginally better results than the OriginalAlexNet with ReLU, a 0.36% improvement.</p><p>For the experiments, the hyperparameters were the following: learning rate - 0.0001, batch size - 256 and number of epochs - 40. </p></div>
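The "cyclic order" assignment of activation functions over a layer's neurons can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; ReLU and Tanh are used here since a per-scalar Softmax is degenerate:

```python
import numpy as np
from itertools import cycle

def relu(x):
    return np.maximum(0.0, x)

def cyclic_activations(h, funcs):
    # Neuron 0 gets the first function, neuron 1 the second, neuron 2
    # the first again, and so on around the cycle.
    it = cycle(funcs)
    return np.array([next(it)(v) for v in h])

h = np.array([-1.0, -1.0, 2.0, 2.0])   # toy pre-activations
z = cyclic_activations(h, [relu, np.tanh])
```

With 3 functions in the cycle instead of 2, the assignment simply wraps around every third neuron, which is how the "2 or 3 in cyclic order" variants differ.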
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">CIFAR-100 with ResNet50</head><p>We have also investigated residual network blocks, using the ResNet50 architecture (see Table <ref type="table" target="#tab_2">3</ref>). The hyperparameters used for the experiment were: learning rate - 0.0001, batch size - 256 and number of epochs - 12. As can be seen from the results, only a combination of three functions (Tanh, Softmax and ReLU) managed to outperform the baseline model with ReLU, by a 0.66% margin. Other combinations performed below the baseline.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3">GTSRB with TuNet</head><p>Images for classification are pre-processed in the same manner and with the same training parameters as in the previous experiments; the fixed image size is 32 by 32 pixels. The training parameters for TuNet are as follows: optimizer - Adam, learning rate - 0.001, loss function - cross entropy and batch size - 32. As can be seen in Table <ref type="table" target="#tab_3">4</ref>, the results of the TuNet baseline are generally worse than those of the modified architecture. In the table, several different models can be seen:</p><p>• TuNet - the baseline model.</p><p>• TuNetOnlyNN - a model where convolution has one activation function and the linear layers have a specific activation function for each neuron.</p><p>• TuNetPerNeuronAndChannel - a model where convolution layers have a specific activation function for each channel and a specific activation for each neuron in the linear layer.</p><p>We can see a very slight improvement when different activations are applied to only the linear layer.</p></div>
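The per-channel variant used by the TuNetPerNeuronAndChannel models can be illustrated with a small sketch; the feature map and the function list below are toy values, not the model's actual tensors:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def per_channel_activations(fmap, funcs):
    # fmap has shape (channels, H, W); instead of one activation for the
    # whole convolution output, each channel gets its own function,
    # cycling through the list along the channel axis.
    return np.stack([funcs[c % len(funcs)](fmap[c])
                     for c in range(fmap.shape[0])])

fmap = np.full((2, 3, 3), -1.0)   # toy 2-channel feature map
out = per_channel_activations(fmap, [relu, np.tanh])
```

Here channel 0 is clipped to zero by ReLU while channel 1 keeps a negative Tanh response, which is the kind of per-channel diversity the modified convolution layers introduce.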
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Cityscapes with UNet</head><p>For the image segmentation task, the popular Cityscapes dataset was chosen alongside the UNet model. The following parameters were the same for all the experiments using UNet: Adam optimizer with a learning rate of 0.001, mean-squared error as the loss function, a batch size of 4 and 20 epochs of training.</p><p>As can be seen from the results of the experiments, a significant Dice metric increase of about 10% was achieved by various activation function combinations (see Table <ref type="table" target="#tab_4">5</ref>). Using almost any combination of activation functions can result in better prediction results in the case of UNet. It is also observed that even changing the activation in the baseline model from ReLU to Tanh improved the results by a significant amount.</p></div>
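For binary masks, the Dice metric used to score the segmentations is 2|A∩B| / (|A| + |B|); a minimal sketch, with a small epsilon (an implementation convenience, not from the paper) to avoid division by zero on empty masks:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    # Dice coefficient on binary masks: 1.0 is a perfect overlap,
    # 0.0 means no overlap at all.
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
# intersection = 1, |a| = 2, |b| = 1, so dice(a, b) is about 2/3
```
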
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Time series regression/forecasting</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1">Simple NN on Amazon stock prediction</head><p>Experiments were performed on the Amazon stock time series data to predict the closing price for the next day. An architecture named SimpleNN was used. It is a neural network with 1 input cell, 14 hidden layer cells and 1 output. The following parameters were used in the experiment: optimizer - Adam, learning rate - 0.001, loss function - mean-squared error, batch size - 16, lag values - 7 and number of training epochs - 5.</p><p>The experiment compares the same model and architecture, the only difference being activations per neuron versus one activation for the whole network (see Table <ref type="table" target="#tab_5">6</ref>). As can be seen from the results, there is an increase in accuracy in certain cases, and it can also be observed that finding the best possible set of activation functions yielded the best results out of the experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2">Custom PV dataset with LSTM</head><p>Experiments were performed using a time series dataset for forecasting PV generation. An LSTM model was used, as it is often utilized for solving PV generation forecast tasks <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26,</ref><ref type="bibr" target="#b26">27]</ref>. For performing the forecasts, the output of the previous step is used as the input of the following training step. The following parameters were used for the experiments: Adam optimizer with a learning rate of 0.001, mean-squared error as the error metric, a batch size of 8, 12 lag values for the PV data, and 20 training epochs.</p><p>The parameters were chosen based on trials performed with different sets of parameters. The batch size refers to the number of predictions retrieved from the model output, and the lag values refer to the number of previous predictions used as input for the next prediction. Based on tests using different lag values, a value of 12 was found to be one of the best values for this parameter, although this parameter did not seem to have much impact on the accuracy of the predictions. Regarding data transformations, the training data was standardized so that the ranges of values would be the same for all features. As can be seen from Table <ref type="table" target="#tab_6">7</ref>, there is no significant improvement based on the testing RMSLE. Although many experiments yielded similar results to the baseline, not a single experiment yielded better results than the baseline. 
It can also be observed that an increase in the number of different activation functions used does not improve the forecast results either.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. A simple neural network with different activation functions per neuron</figDesc><graphic coords="2,184.30,500.10,229.60,98.65" type="bitmap" /></figure>
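The RMSLE used to evaluate the PV forecasts is the root mean squared error computed on log-transformed values, sqrt(mean((log(1+pred) - log(1+actual))^2)); a minimal sketch:

```python
import numpy as np

def rmsle(pred, actual):
    # log1p keeps the metric defined for zero values, which matters for
    # PV generation series that drop to zero at night.
    return np.sqrt(np.mean((np.log1p(pred) - np.log1p(actual)) ** 2))

pred = np.array([2.0, 5.0, 9.0])
actual = np.array([2.0, 5.0, 9.0])
# identical series give an error of exactly 0
```

Because of the logarithm, RMSLE penalizes relative rather than absolute deviations, so large daytime generation values do not dominate the score.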
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2.1.</head><label>21</label><figDesc>Figure 2.1. Different activation functions per channel, 2.2. A different activation function for each matrix column. In regular CNN architectures there is often only one activation function in a convolution layer. As displayed in the diagram in Figure 2.1., a different activation function can be applied to each channel after the convolution layer. The second diagram, Figure 2.2., refers to another idea: applying multiple activation functions, one for each matrix column. In this case of a 3x3 matrix, there are 3 columns in each channel. Every slice has a specific activation applied to it.</figDesc><graphic coords="3,139.40,133.85,309.70,129.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3.</head><label>3</label><figDesc>Figure 3. One activation for convolution layers and different activation functions in the linear layer. Some CNN architectures have a linear neuron layer, which typically has only one activation function. The idea displayed in Figure 3 is to leave one activation in the convolution layers and only have multiple activation functions in the linear neuron layers, specifically an activation function for each neuron. As displayed in the diagram, boxes (1-4) can each have a specific function assigned, creating a spectrum of variations: (1-tanh, 2-relu, 3-sigmoid, 4-softmax), (1-relu, 2-tanh, 3-sigmoid, 4-relu) and so on.</figDesc><graphic coords="3,161.00,380.35,261.40,126.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 .</head><label>4</label><figDesc>Figure 4. Scatter plots between PV generation data and surface solar radiation and air temperature.</figDesc><graphic coords="5,149.40,310.00,310.25,113.85" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Features used in the dataset, their data providers and measurement units</figDesc><table><row><cell>Feature name</cell><cell>Data provider</cell><cell>Measurement units</cell></row><row><cell>Generated power</cell><cell>-</cell><cell>kW</cell></row><row><cell>Air temperature</cell><cell>LHS</cell><cell>°C</cell></row><row><cell>Sea level pressure</cell><cell>LHS</cell><cell>hPa</cell></row><row><cell>Relative humidity</cell><cell>LHS</cell><cell>%</cell></row><row><cell>Wind speed</cell><cell>LHS</cell><cell>m/s</cell></row><row><cell>Wind gust speed</cell><cell>LHS</cell><cell>m/s</cell></row><row><cell>Is wind from north (true / false)</cell><cell>LHS</cell><cell>-</cell></row><row><cell>Is wind from south (true / false)</cell><cell>LHS</cell><cell>-</cell></row><row><cell>Is wind from west (true / false)</cell><cell>LHS</cell><cell>-</cell></row><row><cell>Surface solar radiation</cell><cell>Oikolab</cell><cell>W/m²</cell></row><row><cell>Total cloud cover</cell><cell>Oikolab</cell><cell>%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Results from the AlexNet experiments.</figDesc><table><row><cell>Training</cell><cell>Activations</cell><cell>Training time, min</cell><cell>Training accuracy</cell><cell>Validation accuracy</cell><cell>Testing accuracy</cell></row><row><cell>OriginalAlexNet</cell><cell>ReLU</cell><cell>34.75</cell><cell>81.209</cell><cell>36.64</cell><cell>36.95</cell></row><row><cell>OriginalAlexNetb</cell><cell>Tanh</cell><cell>26.11</cell><cell>84.216</cell><cell>43.060</cell><cell>43.18</cell></row><row><cell>AlexNetCustomLinear2a</cell><cell>Tanh, Softmax</cell><cell>35.03</cell><cell>81.473</cell><cell>37.84</cell><cell>37.21</cell></row><row><cell>AlexNetCustomLinear2b</cell><cell>Tanh, Softmax</cell><cell>36.77</cell><cell>82.46</cell><cell>36.68</cell><cell>38.36</cell></row><row><cell>AlexNetCustomLinear2r</cell><cell>random list</cell><cell>36.11</cell><cell>82.316</cell><cell>37.2</cell><cell>37.41</cell></row><row><cell>AlexNetCustomCNNa</cell><cell>Tanh, Softmax</cell><cell>35.76</cell><cell>82.427</cell><cell>37.32</cell><cell>37.31</cell></row><row><cell>AlexNetCustomCNNb</cell><cell>Tanh, Softmax</cell><cell>35.73</cell><cell>81.502</cell><cell>36.62</cell><cell>37.31</cell></row><row><cell>AlexNetCustomCNNr</cell><cell>random list</cell><cell>35.26</cell><cell>80.767</cell><cell>38.16</cell><cell>37.17</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Results from the ResNet50 experiments.</figDesc><table><row><cell>Training</cell><cell>Activations</cell><cell>Training time, min</cell><cell>Training accuracy</cell><cell>Validation accuracy</cell><cell>Testing accuracy</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4.</head><label>4</label><figDesc>Results from TuNet experiments.</figDesc><table><row><cell>Model</cell><cell>Activations</cell><cell>Epoch</cell><cell>Training time (1 epoch), ms</cell><cell>Training accuracy</cell><cell>Validation accuracy</cell></row><row><cell>TuNet (baseline)</cell><cell>Tanh</cell><cell>8</cell><cell>7007.23</cell><cell>0.9973</cell><cell>0.9834</cell></row><row><cell>TuNet</cell><cell>ReLU</cell><cell>10</cell><cell>7066.44</cell><cell>0.9721</cell><cell>0.9599</cell></row><row><cell>TuNetOnlyNN(Tanh)</cell><cell>ReLU, Tanh</cell><cell>10</cell><cell>16265.21</cell><cell>0.9990</cell><cell>0.9863</cell></row><row><cell>TuNetOnlyNN(Tanh)</cell><cell>Tanh, Softplus</cell><cell>9</cell><cell>18699.11</cell><cell>0.9961</cell><cell>0.9837</cell></row><row><cell>TuNetOnlyNN(Tanh)</cell><cell>ReLU, Tanh, Softplus</cell><cell>10</cell><cell>18615.31</cell><cell>0.9943</cell><cell>0.9851</cell></row><row><cell>TuNetOnlyNN(Tanh)</cell><cell>ReLU, Tanh, ELU</cell><cell>10</cell><cell>16559.02</cell><cell>0.9945</cell><cell>0.9849</cell></row><row><cell>TuNetPerNeuronAndChannel</cell><cell>ReLU, Tanh</cell><cell>8</cell><cell>18736.37</cell><cell>0.9945</cell><cell>0.9800</cell></row><row><cell>TuNetPerNeuronAndChannel</cell><cell>Tanh, Sigmoid</cell><cell>10</cell><cell>17864.02</cell><cell>0.9939</cell><cell>0.9809</cell></row><row><cell>TuNetPerNeuronAndChannel</cell><cell>Tanh, Softplus</cell><cell>10</cell><cell>21888.24</cell><cell>0.9929</cell><cell>0.9813</cell></row><row><cell>TuNetPerNeuronAndChannel</cell><cell>ReLU, Tanh, ELU</cell><cell>9</cell><cell>19664.91</cell><cell>0.9931</cell><cell>0.9836</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5.</head><label>5</label><figDesc>Results from UNet experiments.</figDesc><table><row><cell>Model</cell><cell>Activations</cell><cell>Epoch</cell><cell>Training time, ms</cell><cell>Train. dice</cell><cell>Valid. dice</cell></row><row><cell>UNet</cell><cell>ReLU</cell><cell>10</cell><cell>1378448.12</cell><cell>0.4700</cell><cell>0.4062</cell></row><row><cell>UNet</cell><cell>Tanh</cell><cell>10</cell><cell>1380602.75</cell><cell>0.4680</cell><cell>0.4334</cell></row><row><cell>UNetPerNeuron</cell><cell>ReLU, Tanh</cell><cell>10</cell><cell>4429268.50</cell><cell>0.4747</cell><cell>0.4293</cell></row><row><cell>UNetPerNeuron</cell><cell>Tanh, ReLU</cell><cell>10</cell><cell>4430903.50</cell><cell>0.4656</cell><cell>0.4884</cell></row><row><cell>UNetPerNeuron</cell><cell>Tanh, Softmax</cell><cell>10</cell><cell>4487534.50</cell><cell>0.3716</cell><cell>0.3389</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>ReLU, Tanh</cell><cell>10</cell><cell>4487183.00</cell><cell>0.4714</cell><cell>0.5013</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>ReLU, Softmax</cell><cell>10</cell><cell>4600614.00</cell><cell>0.3733</cell><cell>0.4442</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>Tanh, Softmax</cell><cell>10</cell><cell>4539303.00</cell><cell>0.3696</cell><cell>0.4242</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>Tanh, Softplus</cell><cell>10</cell><cell>4526773.00</cell><cell>0.4697</cell><cell>0.4453</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>Tanh, Softplus</cell><cell>8</cell><cell>3621696.25</cell><cell>0.4685</cell><cell>0.4958</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>Tanh, ReLU, Softplus</cell><cell>10</cell><cell>4516755.50</cell><cell>0.4709</cell><cell>0.4468</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>Tanh, ReLU, Softplus</cell><cell>9</cell><cell>4065430.75</cell><cell>0.4700</cell><cell>0.5081</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>ReLU, Tanh, ELU</cell><cell>10</cell><cell>4525098.50</cell><cell>0.4696</cell><cell>0.4339</cell></row><row><cell>UNetPerNeuronAndChannel</cell><cell>ReLU, Tanh, ELU</cell><cell>7</cell><cell>3169012.25</cell><cell>0.4646</cell><cell>0.4654</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 6.</head><label>6</label><figDesc>Testing results of SimpleNN and PerNeuron models. Additionally, all possible combinations of different activation function sets have been tested (see model PerNeuronList).</figDesc><table><row><cell>Model</cell><cell>Activations</cell><cell>MAE</cell><cell>RMSE</cell><cell>RMSLE</cell></row><row><cell>SimpleNN</cell><cell>ReLU (baseline)</cell><cell>2.8582</cell><cell>3.7894</cell><cell>0.0312</cell></row><row><cell>SimpleNN</cell><cell>Tanh</cell><cell>2.8583</cell><cell>3.9185</cell><cell>0.0316</cell></row><row><cell>PerNeuron</cell><cell>Tanh, ReLU</cell><cell>3.0003</cell><cell>4.0790</cell><cell>0.0332</cell></row><row><cell>PerNeuron</cell><cell>ReLU, Tanh</cell><cell>3.0899</cell><cell>4.1825</cell><cell>0.0343</cell></row><row><cell>PerNeuron</cell><cell>ReLU, ReLU, Sigmoid</cell><cell>2.7314</cell><cell>3.6951</cell><cell>0.0301</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 7.</head><label>7</label><figDesc>Results from LSTM experiments.</figDesc><table><row><cell>Model</cell><cell>Activations</cell><cell>Epochs</cell><cell>Training MAE</cell><cell>Test MAE</cell><cell>Test RMSE</cell><cell>Test RMSLE</cell><cell>Time (ms)</cell></row><row><cell>LSTM</cell><cell>Default (Tanh, Sigmoid)</cell><cell>20</cell><cell>0.0563</cell><cell>0.0757</cell><cell>0.1262</cell><cell>0.070</cell><cell>197461.00</cell></row><row><cell>LSTM</cell><cell>Tanh, Softmax</cell><cell>20</cell><cell>0.0565</cell><cell>0.0867</cell><cell>0.1412</cell><cell>0.0806</cell><cell>3275714.00</cell></row><row><cell>LSTM</cell><cell>ELU, Sigmoid</cell><cell>20</cell><cell>0.2056</cell><cell>0.2113</cell><cell>0.2882</cell><cell>0.1734</cell><cell>3259991.50</cell></row><row><cell>LSTM</cell><cell>Sigmoid, ELU</cell><cell>20</cell><cell>0.1792</cell><cell>0.1863</cell><cell>0.2516</cell><cell>0.1727</cell><cell>3271329.50</cell></row><row><cell>LSTM</cell><cell>Sigmoid, Tanh</cell><cell>20</cell><cell>0.0533</cell><cell>0.0857</cell><cell>0.1420</cell><cell>0.0783</cell><cell>3096815.50</cell></row><row><cell>LSTM</cell><cell>Sigmoid, Tanh</cell><cell>8</cell><cell>0.0693</cell><cell>0.0782</cell><cell>0.1305</cell><cell>0.0721</cell><cell>1248338.12</cell></row><row><cell>LSTM</cell><cell>Sigmoid, Softmax</cell><cell>20</cell><cell>0.0740</cell><cell>0.0798</cell><cell>0.1317</cell><cell>0.0741</cell><cell>3708114.00</cell></row><row><cell>LSTM</cell><cell>ELU, Sigmoid, Tanh</cell><cell>20</cell><cell>0.1748</cell><cell>0.1823</cell><cell>0.2469</cell><cell>0.1615</cell><cell>2843427.75</cell></row><row><cell>LSTM</cell><cell>ELU, Tanh, Sigmoid</cell><cell>20</cell><cell>0.1817</cell><cell>0.2553</cell><cell>0.1796</cell><cell>0.1542</cell><cell>2818499.50</cell></row><row><cell>LSTM</cell><cell>Softmax, Sigmoid, Tanh</cell><cell>20</cell><cell>0.0606</cell><cell>0.0791</cell><cell>0.1315</cell><cell>0.0726</cell><cell>3155361.75</cell></row><row><cell>LSTM</cell><cell>Softmax, Tanh, Sigmoid</cell><cell>20</cell><cell>0.0604</cell><cell>0.0814</cell><cell>0.1331</cell><cell>0.0756</cell><cell>3171810.00</cell></row></table></figure>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Tabular</head><p>Tabular data is still widely used in machine learning tasks. In this paper we chose two datasets to experiment with: Iris flowers and Breast cancer classification. Both experiments used the following training parameters: SGD optimizer, learning rate 0.01, cross-entropy loss, and 200 training epochs.</p><p>The results in Table <ref type="table">8</ref>, which compare a single activation function against multiple activations on the Iris flowers classification task, show no improvement over the best-suited single activation function. However, an activation function set selected from a large number of combinations achieved better accuracy than any single activation function (see Table <ref type="table">10</ref>). It should also be noted that these results surpass those of the SVM described in Obaid's work. As the results show, there is a significant accuracy increase for the PerNeuron models, with the largest gain occurring when the best activation function list is found from all possible combinations.</p></div>
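<div xmlns="http://www.tei-c.org/ns/1.0"><p>The combination search described above can be sketched in a few lines. This is our own minimal illustration, not the authors' code: the names relu, forward, and ACTIVATIONS are ours, and a real search would train and score the network under each assignment rather than merely enumerating them.</p><p>
```python
# Illustrative sketch only (not the authors' code): enumerating every
# assignment of activation functions to the neurons of one small dense
# layer, in the spirit of the PerNeuronList search over all combinations.
import itertools
import math

def relu(x):
    return x if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

ACTIVATIONS = {"relu": relu, "tanh": math.tanh, "sigmoid": sigmoid}

def forward(weights, inputs, assignment):
    """One dense layer where neuron i uses the activation named in assignment[i]."""
    outputs = []
    for neuron_weights, name in zip(weights, assignment):
        pre_activation = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(ACTIVATIONS[name](pre_activation))
    return outputs

# Every assignment of 3 candidate activations to a 3-neuron layer: 3**3 = 27.
assignments = list(itertools.product(ACTIVATIONS, repeat=3))
print(len(assignments))  # 27

# Example: neuron 0 uses ReLU, neuron 1 uses tanh.
print(forward([[1.0, 0.0], [0.0, 1.0]], [0.5, -2.0], ("relu", "tanh")))
```
Because the number of assignments grows exponentially with the number of neurons, such an exhaustive search is practical only for small layers, such as those used in the tabular experiments.</p></div>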
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and discussion</head><p>This paper explores the concept of using multiple activation functions in artificial neural networks. It discusses the role of activation functions in introducing the non-linearity needed to improve task accuracy, and investigates different approaches to incorporating multiple activation functions, including assigning a different function to each neuron or channel.</p><p>The experiments used models such as AlexNet, ResNet50, TuNet, and SimpleNN. In the AlexNet experiment, different activation function combinations were tested in both the linear layers and the convolutional (CNN) layers. The results showed that OriginalAlexNet with the Tanh activation function yielded the best overall performance. In the ResNet50 experiments, one combination performed marginally better than any of the single-function baselines. The TuNet and SimpleNN experiments evaluated these architectures on their respective datasets. Overall, the experiments provided insights into the impact of activation function combinations on model performance, with modest improvements observed compared to using a single activation function. The datasets used in the experiments were CIFAR-100, GTSRB, Breast Cancer Wisconsin (Diagnostic), Iris flowers, and Amazon stock prices. In image segmentation tasks, modifying the UNet architecture with different activation function combinations led to significant improvements in the Dice metric; even changing the activation function in the baseline model from ReLU to Tanh improved results. For time series regression/forecasting tasks, the experiments show that using multiple activation functions does not significantly improve prediction accuracy. The paper also hints at the idea of a learnable list of activation functions, in which each neuron would adapt its activation to the specific data it receives; this idea requires further analysis.</p><p>Overall, the paper concludes that while using multiple activation functions can be beneficial in certain scenarios, the improvements are not substantial compared to using a single activation function. The choice of activation function should be based on the specific task, the dataset, and its features.</p></div>			</div>
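<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-neuron-and-channel assignment discussed in the conclusions can be illustrated with a small sketch. This is our own hedged illustration, not the paper's implementation: the class name PerChannelActivation and the modulo routing rule (channel c uses activation c mod k) are assumptions we introduce for clarity.</p><p>
```python
# Hedged sketch (names are illustrative, not the paper's implementation):
# a "per channel" activation wrapper that routes channel c of a feature
# map through activations[c % len(activations)].
import math

def relu(x):
    return x if x > 0 else 0.0

class PerChannelActivation:
    def __init__(self, activations):
        self.activations = activations

    def __call__(self, feature_map):
        # feature_map: a list of channels, each channel a flat list of values.
        return [
            [self.activations[c % len(self.activations)](v) for v in channel]
            for c, channel in enumerate(feature_map)
        ]

act = PerChannelActivation([relu, math.tanh])
out = act([[-1.0, 2.0], [-1.0, 2.0]])
print(out[0])  # channel 0 through ReLU: [0.0, 2.0]
print(out[1])  # channel 1 through tanh
```
In a framework such as PyTorch the same routing could be wrapped in a module applied after each convolution; applying a different function per channel is what drives the longer training times reported for the PerNeuronAndChannel variants in Tables 4 and 5.</p></div>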
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Activation functions in deep learning: A comprehensive survey and benchmark</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Dubey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Chaudhuri</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.neucom.2022.06.111</idno>
		<ptr target="https://doi.org/10.1016/j.neucom.2022.06.111" />
	</analytic>
	<monogr>
		<title level="j">Neurocomputing</title>
		<imprint>
			<biblScope unit="volume">503</biblScope>
			<biblScope unit="page" from="92" to="108" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">RMAF: Relu-Memristor-Like Activation Function for Deep Learning</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Adu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tashi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Anokye</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Ayidzoe</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2020.2987829</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2020.2987829" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="72727" to="72741" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Smish: A Novel Activation Function for Deep Learning Methods</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics11040540</idno>
		<ptr target="https://doi.org/10.3390/electronics11040540" />
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">540</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Efficient activation functions for embedded inference engines</title>
		<author>
			<persName><forename type="first">A</forename><surname>Wuraola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Nguang</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.neucom.2021.02.030</idno>
		<ptr target="https://doi.org/10.1016/j.neucom.2021.02.030" />
	</analytic>
	<monogr>
		<title level="j">Neurocomputing</title>
		<imprint>
			<biblScope unit="volume">442</biblScope>
			<biblScope unit="page" from="73" to="88" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Gish: a novel activation function for image classification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kaytan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">İ</forename><forename type="middle">B</forename><surname>Aydilek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Yeroğlu</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00521-023-09035-5</idno>
		<ptr target="https://doi.org/10.1007/s00521-023-09035-5" />
	</analytic>
	<monogr>
		<title level="j">Neural Comput &amp; Applic</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="24259" to="24281" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">PV Power Prediction Based on LSTM With Adaptive Hyperparameter Adjustment</title>
		<author>
			<persName><forename type="first">M</forename><surname>Chai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2019.2936597</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2019.2936597" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="115473" to="115486" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Influence of Different Activation Functions on Deep Learning Models in Indoor Scene Images Classification</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">S</forename><surname>Anami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">V</forename><surname>Sagarnal</surname></persName>
		</author>
		<idno type="DOI">10.1134/S1054661821040039</idno>
		<ptr target="https://doi.org/10.1134/S1054661821040039" />
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition and Image Analysis</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="78" to="88" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The Role of Activation Function in CNN</title>
		<author>
			<persName><forename type="first">W</forename><surname>Hao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yizhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yaqin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zhili</surname></persName>
		</author>
		<idno type="DOI">10.1109/ITCA52113.2020.00096</idno>
		<ptr target="https://doi.org/10.1109/ITCA52113.2020.00096" />
	</analytic>
	<monogr>
		<title level="m">2020 2nd International Conference on Information Technology and Computer Application (ITCA)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="429" to="432" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Developing Novel Activation Functions Based Deep Learning LSTM for Classification</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Essai Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">B</forename><surname>Abdel-Raman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Badry</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2022.3205774</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2022.3205774" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="97259" to="97275" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">ImageNet Classification with Deep Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hinton</surname></persName>
		</author>
		<idno type="DOI">10.1145/3065386</idno>
		<ptr target="https://doi.org/10.1145/3065386" />
	</analytic>
	<monogr>
		<title level="j">Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Deep Residual Learning for Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1512.03385</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1512.03385" />
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">U-Net: Convolutional Networks for Biomedical Image Segmentation</title>
		<author>
			<persName><forename type="first">O</forename><surname>Ronneberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fischer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Brox</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1505.04597</idno>
		<ptr target="https://doi.org/10.48550/arXiv.1505.04597" />
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A novel UNet segmentation method based on deep learning for preferential flow in soil</title>
		<author>
			<persName><forename type="first">H</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhao</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.still.2023.105792</idno>
		<ptr target="https://doi.org/10.1016/j.still.2023.105792" />
	</analytic>
	<monogr>
		<title level="j">Soil &amp; Tillage Research</title>
		<imprint>
			<biblScope unit="volume">233</biblScope>
			<biblScope unit="page">105792</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">GCW-UNet segmentation of cardiac magnetic resonance images for evaluation of left atrial enlargement</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">K</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">N</forename><surname>Ghista</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.cmpb.2022.106915</idno>
		<ptr target="https://doi.org/10.1016/j.cmpb.2022.106915" />
	</analytic>
	<monogr>
		<title level="j">Computer Methods and Programs in Biomedicine</title>
		<imprint>
			<biblScope unit="volume">221</biblScope>
			<biblScope unit="page" from="106915" to="106915" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">KUB-UNet: Segmentation of Organs of Urinary System from a KUB X-ray Image</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Thakkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Chavan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">S</forename><surname>Dhaka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">K</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Vocaturo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zumpano</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.cmpb.2022.107031</idno>
		<ptr target="https://doi.org/10.1016/j.cmpb.2022.107031" />
	</analytic>
	<monogr>
		<title level="j">Computer Methods and Programs in Biomedicine</title>
		<imprint>
			<biblScope unit="volume">224</biblScope>
			<biblScope unit="page" from="107031" to="107031" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Learning Multiple Layers of Features from Tiny Images</title>
		<author>
			<persName><forename type="first">A</forename><surname>Krizhevsky</surname></persName>
		</author>
		<ptr target="https://www.semanticscholar.org/paper/Learning-Multiple-Layers-of-Features-from-Tiny-Krizhevsky/5d90f06bb70a0a3dced62413346235c02b1aa086" />
		<imprint>
			<date type="published" when="2009-01-17">2009. January 17, 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Stallkamp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schlipsing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Salmen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Igel</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.neunet.2012.02.016</idno>
		<ptr target="https://doi.org/10.1016/j.neunet.2012.02.016" />
	</analytic>
	<monogr>
		<title level="j">Neural Networks</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="323" to="332" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Nuclear feature extraction for breast tumor diagnosis</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">N</forename><surname>Street</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">H</forename><surname>Wolberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">L</forename><surname>Mangasarian</surname></persName>
		</author>
		<idno type="DOI">10.1117/12.148698</idno>
		<ptr target="https://doi.org/10.1117/12.148698" />
		<editor>R.S. Acharya, D.B. Goldgof</editor>
		<imprint>
			<date type="published" when="1993">1993</date>
			<biblScope unit="page" from="861" to="870" />
			<pubPlace>San Jose, CA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">The Iris Data Set: In Search of the Source of Virginica</title>
		<author>
			<persName><forename type="first">A</forename><surname>Unwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kleinman</surname></persName>
		</author>
		<idno type="DOI">10.1111/1740-9713.01589</idno>
		<ptr target="https://doi.org/10.1111/1740-9713.01589" />
	</analytic>
	<monogr>
		<title level="j">Significance</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="26" to="29" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Evaluating the Performance of Machine Learning Techniques in the Classification of Wisconsin Breast Cancer</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">Ibrahim</forename><surname>Obaid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mohammed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Ghani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mostafa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Al-Dhief</surname></persName>
		</author>
		<idno type="DOI">10.14419/ijet.v7i4.36.23737</idno>
		<ptr target="https://doi.org/10.14419/ijet.v7i4.36.23737" />
	</analytic>
	<monogr>
		<title level="j">International Journal of Engineering and Technology</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="160" to="166" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<ptr target="https://finance.yahoo.com/quote/AMZN/history/" />
		<title level="m">Amazon.com, Inc. (AMZN) Stock Historical Prices &amp; Data - Yahoo Finance</title>
				<imprint>
			<publisher>Amazon</publisher>
			<date type="published" when="2024-01-17">2024. January 17, 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">An Analysis Of Convolutional Neural Networks For Image Classification</title>
		<author>
			<persName><forename type="first">N</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mishra</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.procs.2018.05.198</idno>
		<ptr target="https://doi.org/10.1016/j.procs.2018.05.198" />
	</analytic>
	<monogr>
		<title level="j">Procedia Computer Science</title>
		<imprint>
			<biblScope unit="volume">132</biblScope>
			<biblScope unit="page" from="377" to="384" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model</title>
		<author>
			<persName><forename type="first">T</forename><surname>Limouni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Yaagoubi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bouziane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Guissi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">H</forename><surname>Baali</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.renene.2023.01.118</idno>
		<ptr target="https://doi.org/10.1016/j.renene.2023.01.118" />
	</analytic>
	<monogr>
		<title level="j">Renewable Energy</title>
		<imprint>
			<biblScope unit="volume">205</biblScope>
			<biblScope unit="page" from="1010" to="1024" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Accurate solar PV power prediction interval method based on frequency-domain decomposition and LSTM model</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.energy.2022.125592</idno>
		<ptr target="https://doi.org/10.1016/j.energy.2022.125592" />
	</analytic>
	<monogr>
		<title level="j">Energy</title>
		<imprint>
			<biblScope unit="volume">262</biblScope>
			<biblScope unit="page">125592</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Time series forecasting for hourly photovoltaic power using conditional generative adversarial network and Bi-LSTM</title>
		<author>
			<persName><forename type="first">X</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.energy.2022.123403</idno>
		<ptr target="https://doi.org/10.1016/j.energy.2022.123403" />
	</analytic>
	<monogr>
		<title level="j">Energy</title>
		<imprint>
			<biblScope unit="volume">246</biblScope>
			<biblScope unit="page">123403</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Short-Term Prediction of PV Power Based on Combined Modal Decomposition and NARX-LSTM-LightGBM</title>
		<author>
			<persName><forename type="first">H</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.3390/su15108266</idno>
		<ptr target="https://doi.org/10.3390/su15108266" />
	</analytic>
	<monogr>
		<title level="j">Sustainability</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page">8266</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Forecasting of PV plant output using hybrid wavelet-based LSTM-DNN structure model</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ospina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Newaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">O</forename><surname>Faruque</surname></persName>
		</author>
		<idno type="DOI">10.1049/iet-rpg.2018.5779</idno>
		<ptr target="https://doi.org/10.1049/iet-rpg.2018.5779" />
	</analytic>
	<monogr>
		<title level="j">IET Renewable Power Generation</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="1087" to="1095" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
