<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ProfIT AI</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Using the Ellipsoid Method and Linear Regression with L1-Regularization for Medical Data Investigation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Petro Stetsyuk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viktor Stovba</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Senko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Illya Chaikovsky</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>V.M. Glushkov Institute of Cybernetics of the NASU</institution>
          ,
          <addr-line>Academician Glushkov Avenue, 40, Kyiv, 03187</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>4</volume>
      <fpage>25</fpage>
      <lpage>27</lpage>
      <abstract>
        <p>The problem of finding the parameters of a linear regression model with l1-regularization and the least moduli criterion powered to p, 1 ≤ p ≤ 2, is considered. To solve the problem, Shor's ellipsoid method is used, which is implemented as the emlmpr algorithm. A series of three computational experiments is conducted, which demonstrates the solving time of the emlmpr algorithm and the robustness of the least moduli criterion if p is close to 1. The third experiment considers the situation when the model contains linearly dependent features and shows the effect of l1-regularization on the quality of the solutions obtained.</p>
      </abstract>
      <kwd-group>
        <kwd>linear regression</kwd>
        <kwd>least moduli criterion</kwd>
        <kwd>l1-regularization</kwd>
        <kwd>non-smooth optimization problem</kwd>
        <kwd>Shor's ellipsoid method</kwd>
        <kwd>dependent factors</kwd>
        <kwd>data prediction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Regression models are an extremely prevalent tool for effective prediction both in machine learning
and in artificial intelligence in general. The application of linear regression models for building effective
forecasting models, which describe linear relationships between factors, has been studied in such fields as
statistics, medicine, economics, ecology, identification of parameters of complex systems, etc. This type of
model has proven to be flexible in construction and to provide a clear interpretation of the relationships
between the dependent variable and the model factors, sometimes even outperforming more complex
nonlinear models [1].</p>
      <p>When working with regression models, it is rather important to choose a correct criterion for
estimating the model parameters. The most well-known and common variants are the criteria based on
least squares and on least moduli. The effectiveness of the first variant is confirmed by theoretical
studies [2] and numerous statistical experiments. Nevertheless, one of the most significant
disadvantages of the least squares criterion is that large errors are amplified when they are squared,
which makes the model extremely sensitive to anomalous observations (outliers). An important
condition for using this criterion is the standard normal distribution of model errors, which is not
always fulfilled in practice. A well-known and effective alternative is the criterion based on least
moduli, which is robust to outliers [3, 4] and assumes a Laplacian distribution of model errors.</p>
      <p>Another important aspect of working with linear regression models is the presence of dependencies
between two or more factors of a model, which negatively affect the quality of the obtained
parameter estimates. Usually, such dependencies are detected at the stage of data preprocessing and
model building by selecting an optimal set of model factors that best describes the relationship between
the dependent variable and the factors. However, in practice, situations often occur when a certain group
of factors collectively affects the dependent variable. As a result, both the least squares and the least
moduli criteria determine the parameters of the model incorrectly, often significantly overestimating or
underestimating them. Therefore, it is expedient to develop methods and criteria that make it possible to
detect such dependencies between factors and drive their coefficients close to zero. One of the most
famous so-called shrinkage methods in machine learning [1] is the regularization approach, which
permits balancing the model and reducing the effect of dependent factors on the quality of parameter
determination.</p>
      <p>The article is dedicated to applying Shor's ellipsoid method for finding the parameters of a
linear regression model with l1-regularization and the least moduli criterion powered to p with
1 ≤ p ≤ 2. This criterion includes the least moduli (p = 1) and the least squares (p = 2) criteria as
special cases, and allows any intermediate value of the parameter p. Some results of applying the
ellipsoid method to this type of problem are given in [5].</p>
    </sec>
    <sec id="sec-2">
      <title>2. Finding linear regression model parameters using the least moduli criterion powered to p</title>
      <p>Let us consider a classical linear regression problem: find n unknown parameters x_1, …, x_n given m
known observations (a_i, y_i), a_i = (a_i1, a_i2, …, a_in) ∈ R^n, y_i ∈ R, i = 1, …, m, which are related as
follows:</p>
      <p>"
# = 5</p>
      <p>#%% + #,  = 1444,444,
.(̅) =
⎧ 5
⎪
⎪
'
#&amp;!</p>
      <p>"
 O5
%&amp;!
#% %̅ − #P C5</p>
      <p>#% %̅ − #C
"
%&amp;!
) /!</p>
      <p>⎫
#!,⎪</p>
      <p>where a_ij are known coefficients and ξ_i are unknown random variables, which have (approximately)
the same distribution functions, m &gt; n. The equation (1) can be rewritten in matrix form
y = Ax + ξ,   (2)
where y = (y_1, …, y_m)^T ∈ R^m and ξ = (ξ_1, …, ξ_m)^T ∈ R^m are m-dimensional vectors, A is an m × n
matrix, and x = (x_1, …, x_n)^T ∈ R^n is an n-dimensional vector that is to be estimated.</p>
      <p>The least moduli method powered to p, which corresponds to finding the unknown vector x_p^*
according to the least moduli criterion powered to p (1 ≤ p ≤ 2), is a mathematical programming
problem:
f_p^* = f_p(x_p^*) = min_{x ∈ R^n} { f_p(x) = Σ_{i=1}^{m} | Σ_{j=1}^{n} a_ij x_j − y_i |^p },   (3)
where |·| is the absolute value of a number. The function f_p(x) is non-smooth if p = 1 and smooth
if p &gt; 1.</p>
      <p>The problem (3) is a problem of unconditional minimization of the convex function f_p(x), a
subgradient of which at the point x̄ is calculated using the following formula:
g_f(x̄) = p Σ_{i=1}^{m} sign( Σ_{j=1}^{n} a_ij x̄_j − y_i ) | Σ_{j=1}^{n} a_ij x̄_j − y_i |^{p−1} a_i.   (4)
For p = 1 we obtain the problem
f_1^* = min_{x ∈ R^n} { f_1(x) = Σ_{i=1}^{m} | Σ_{j=1}^{n} a_ij x_j − y_i | },   (5)
which is a problem of unconditional minimization of the convex piecewise-linear function f_1(x) and
corresponds to the least moduli method, which has proven to be robust to anomalous observations or
outliers [3, 6]. Finding the vector x^* that is best according to the least moduli criterion, where x^* is a
solution of the problem (5), can be formulated as the following LP-problem: find
f_1^* = min_{x ∈ R^n, u ∈ R^m} Σ_{i=1}^{m} u_i
subject to
y_i − Σ_{j=1}^{n} a_ij x_j ≤ u_i,   −y_i + Σ_{j=1}^{n} a_ij x_j ≤ u_i,   i = 1, …, m.   (6)</p>
      <p>For solving the LP-problem (6) one can use appropriate standard linear programming tools. At the
same time, as we find the vector x^*, we also find the optimal values of the vector u^* = (u_1^*, …, u_m^*)^T,
the elements of which define estimates for the independent random variables ξ_i, i = 1, …, m.</p>
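      <p>For illustration, here is a minimal Octave sketch (not part of the original paper's code) of solving the LP-problem (6) with Octave's built-in glpk solver; the variables are stacked as z = [x; u], and the sizes and data are assumed for the example.</p>
      <p># Sketch: least moduli estimation via the LP-problem (6) using glpk
m = 20; n = 3;                        # illustrative sizes (assumed)
A = rand(m,n); xtrue = [1; -2; 3]; y = A*xtrue;
c  = [zeros(n,1); ones(m,1)];         # minimize sum of u_i
Ac = [ A, -eye(m);                    #  A*x - u &lt;= y
      -A, -eye(m)];                   # -A*x - u &lt;= -y
b  = [y; -y];
lb = [-Inf(n,1); zeros(m,1)];         # x free, u &gt;= 0
ctype = repmat("U", 1, 2*m);          # all rows are "&lt;=" constraints
vartype = repmat("C", 1, n+m);        # continuous variables
[z, f1] = glpk(c, Ac, b, lb, [], ctype, vartype, 1);
x1 = z(1:n)                           # least moduli estimate of x</p>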
      <p>If p = 2, the problem (3) can be written as the following mathematical programming problem:
f_2^* = min_{x ∈ R^n} { f_2(x) = Σ_{i=1}^{m} ( y_i − Σ_{j=1}^{n} a_ij x_j )^2 }.   (7)
The problem (7) is a problem of unconditional minimization of the convex quadratic function f_2(x),
which corresponds to the least squares method. Linear independence of the columns of the matrix A
provides the existence of the analytical solution x^* = (A^T A)^{−1} A^T y of the problem (7). Otherwise,
if the columns of the matrix A are linearly dependent or n &gt; m, it is impossible to obtain an analytical
solution. In that case one can use methods for balancing the model, in particular, regularization.</p>
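      <p>As a quick illustration (a sketch with assumed data, not taken from the paper), the analytical solution of the problem (7) can be computed in Octave via the normal equations when the columns of A are linearly independent:</p>
      <p># Sketch: analytical least squares solution of the problem (7)
m = 300; n = 30;
A = 10*rand(m,n); xtrue = rand(n,1); y = A*xtrue;
x2 = (A'*A) \ (A'*y);   # equals (A^T A)^(-1) A^T y
norm(x2 - xtrue)        # ~1e-12 for consistent data
# if the columns of A are dependent or n &gt; m, A'*A is singular
# and one has to regularize, e.g. as in the problem (8) below</p>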
      <p>Let us consider the problem (3) with l1-regularization:
f_R^* = f_R(x_R^*) = min_{x ∈ R^n} { f_R(x) = Σ_{i=1}^{m} | Σ_{j=1}^{n} a_ij x_j − y_i |^p + λ Σ_{j=1}^{n} |x_j| }.   (8)
The problem (8) is a problem of unconditional minimization of the convex function f_R(x), which is
piecewise-linear when p = 1. Here λ is a regularization parameter, and if λ = 0 the function f_R(x)
coincides with the function f_p(x). To calculate a subgradient of the function f_R(x) at the point x̄ one
can use the following formula:</p>
      <p>g_R(x̄) = g_f(x̄) + λ sign(x̄),   (9)
where g_f(x̄) is calculated using the expression (4).</p>
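      <p>The formula (9) is easy to check numerically. The following Octave sketch (our own illustration, with assumed data) compares the subgradient (9) with central finite differences at a point where f_R is differentiable (p &gt; 1, nonzero residuals and coordinates):</p>
      <p># Sketch: numeric check of the subgradient formula (9)
p = 1.5; lambda = 0.1; m = 40; n = 5;
A = rand(m,n); y = rand(m,1); x = rand(n,1) + 0.5;
fR = @(x) sum(abs(A*x - y).^p) + lambda*sum(abs(x));
t = A*x - y;
g = p*A'*(sign(t).*abs(t).^(p-1)) + lambda*sign(x);  # formula (9)
gfd = zeros(n,1); h = 1e-6;
for j = 1:n
  e = zeros(n,1); e(j) = h;
  gfd(j) = (fR(x+e) - fR(x-e)) / (2*h);  # central difference
endfor
norm(g - gfd)   # should be of order 1e-8 or smaller</p>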
      <p>For solving the problem (8) Shor's ellipsoid method [7, 8, 9] can be used, which is implemented
as the emshor program [10]. We will apply it to the problem of minimizing the function f_R(x),
provided that its minimum point x_R^* is localized in an n-dimensional ball of radius r_0 centered
at the point x_0 ∈ R^n, i.e. ‖x_0 − x_R^*‖ ≤ r_0. The algorithm to be used is called emlmpr; its
description is given below.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The emlmpr algorithm and its Octave implementation</title>
      <p>The input parameter of the algorithm is the accuracy ε &gt; 0 with which f_R^* = f_R(x_R^*) is to be found.</p>
      <p>Initialization. Consider an n × n matrix B and set B_0 := I_n, where I_n is the n × n identity matrix.
We go to the first iteration with the values x_0, r_0 and B_0. Let the values x_k ∈ R^n, B_k, r_k be found at
iteration k. Passing to iteration k + 1 consists of the following sequence of actions.</p>
      <p>Step 1. Calculate f_R(x_k) and the subgradient g_R(x_k) at the point x_k using formula (9). If
r_k ‖B_k^T g_R(x_k)‖ ≤ ε, then "Stop: x^* = x_k and f_R^* = f_R(x_k)". Otherwise, go to Step 2.</p>
      <p>Step 2. Set ξ_k := B_k^T g_R(x_k) / ‖B_k^T g_R(x_k)‖.</p>
      <p>Step 3. Calculate the next point x_{k+1} := x_k − h_k B_k ξ_k, where h_k = r_k / (n + 1).</p>
      <p>Step 4. Calculate
B_{k+1} := B_k + ( sqrt((n − 1)/(n + 1)) − 1 ) (B_k ξ_k) ξ_k^T
and r_{k+1} := r_k · n / sqrt(n² − 1).</p>
      <p>Step 5. Go to iteration k + 1 with the values x_{k+1}, B_{k+1}, r_{k+1}.</p>
      <p>Theorem. The sequence of points {x_k}, k = 0, 1, 2, …, k^*, satisfies the following inequalities:</p>
      <p>‖B_k^{−1}(x_k − x_R^*)‖ ≤ r_k,   k = 0, 1, 2, …, k^*.</p>
      <p>On each iteration k &gt; 0 the volume reduction factor of the ellipsoid
E_k = { x ∈ R^n : ‖B_k^{−1}(x_k − x)‖ ≤ r_k }, which localizes the point x_R^*, is constant and equal to
q = vol(E_k) / vol(E_{k−1}) = sqrt( (n − 1)/(n + 1) ) · ( n / sqrt(n² − 1) )^n &lt; exp( −1 / (2(n + 1)) ) &lt; 1.</p>
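      <p>The factor q and the bound from the theorem can be checked directly; the following short Octave computation (our own check) reproduces them for n = 30:</p>
      <p># Sketch: volume reduction factor for n = 30
n = 30;
q = sqrt((n-1)/(n+1)) * (n/sqrt(n^2-1))^n   # vol(E_k)/vol(E_{k-1}), approx. 0.9835
exp(-1/(2*(n+1)))                           # theorem's upper bound, approx. 0.9840</p>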
      <p>The theorem implies that the algorithm for finding x_R^* can be successfully run on modern
computers if n = 10 ÷ 30 and m = 100 ÷ 1000. Indeed, to decrease the volume of the ellipsoid
localizing the point x_R^* by 10 times, it is necessary to perform k iterations, where
k = ln 10 / ln(1/q) ≈ (2 ln 10)(n + 1) ≈ 4.6(n + 1). It means that in order to improve the deviation of
the found record value of the function f_R(x) from its optimal value f_R^* by 10 times, it is necessary
to perform 4.6(n + 1)² iterations of the algorithm for finding x_R^*.</p>
      <p>If n = 30 and ε = 10^{−10} × f_R(x_0), then the maximal number of iterations of the algorithm is equal
to 10 · 4.6(n + 1)² = 46 × 961 = 44206. Therefore, even a straightforward matrix-vector
implementation of the calculation of the function f_R(x) value and its subgradient according to
formula (9) provides fast algorithm work on modern computers.</p>
      <p>The algorithm emlmpr for finding an approximation to the point x_R^* is implemented in the Octave language. Its code is given below.
# Input parameters: #com01
# A(m,n) – observation matrix; #com02
# y(m,1) – vector of tags (output vector); #com03
# p – power for least moduli criterion, 1&lt;=p&lt;=2; #com04
# lambda – regularization rate; #com05
# x0(n,1) – starting point; #com06
# r0 – radius of the ball centered at x0 that localizes x_p^*; #com07
# epsf, maxitn – stop parameters: #com08
# epsf – precision to stop by the value of the function fp, #com09
# maxitn – maximal number of iterations; #com10
# intp – print information for every intp iteration. #com11
# Output parameters: #com12
# xp(n,1) – approximation to x_p^*; #com13
# fp – the value of the function f_R at the point xp; #com14
# itn – the number of iterations; #com15
# ist – exit code: 1 – epsf, 4 – maxitn. #com16
function [xp,fp,itn,ist] = emlmpr(A,y,p,lambda,x0,r0,epsf,maxitn,intp) #row01</p>
      <p>n = columns(A); xp = x0; B = eye(n); r = r0; #row02
dn = double(n); beta = sqrt((dn-1.d0)/(dn+1.d0)); #row03
for (itn = 0:maxitn) #row04
temp = A*xp-y; fp = sum(abs(temp).^p) + lambda*sum(abs(xp)); #row05
if((mod(itn,intp)==0)&amp;&amp;(intp&lt;=maxitn)) #row06</p>
      <p>printf(" itn %4d fp %14.6e\n",itn,fp); #row07
endif #row08
g1 = p*A'*(sign(temp).*(abs(temp)).^(p-1)) + lambda*sign(xp);#row09
g = B'*g1; dg = norm(g); #row10
if(r*dg &lt; epsf) ist = 1; return; endif #row11
xi = (1.d0/dg)*g; dx = B * xi; #row12
hs = r/(dn+1.d0); xp -= hs * dx; #row13
B += (beta - 1) * B * xi * xi'; #row14
r = r/sqrt(1.d0-1.d0/dn)/sqrt(1.d0+1.d0/dn); #row15
endfor #row16
ist = 4; #row17
endfunction #row18
</p>
      <p>The core of the emlmpr program is the for loop (rows 4–16). First, the value of the function f_R (row 5)
and its normalized subgradient at the point x_k (row 10) are calculated. If the stop condition is satisfied
(row 11), the algorithm stops its work. The stop in the emlmpr algorithm occurs when the condition
r_k ‖B_k^T g_R(x_k)‖ ≤ ε is fulfilled, which is equivalent to the condition f_R(x_k) − f_R^* ≤ ε. Otherwise,
the next point x_{k+1} is calculated (row 13), and the space transformation matrix B_{k+1} (row 14) and
the radius r_{k+1} (row 15) are recalculated.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Computational experiments without regularization</title>
      <p>To demonstrate the effectiveness of the emlmpr algorithm we present the results of three
computational experiments conducted for solving the problem (8). For the first and the second
experiments the parameters are n = 30 and m = 10 × n = 300. The purpose of the first experiment is to
estimate the time of solving the problem (8) for the specified parameters. The purpose of the second
experiment is to demonstrate the robustness of the least moduli method, and therefore of solutions of
the problem (8) without regularization (λ = 0), if p is close to one. The third experiment is dedicated
to finding the parameters of a linear regression model using real medical data for further prediction
of psychological indicators.</p>
      <p>All the calculations are performed on a computer with an Intel Core i7-10750H processor (2.6 GHz)
and 16 Gb RAM under Windows 10/64 using GNU Octave, version 6.3.0. For the first two experiments
the regularization parameter λ is chosen equal to zero.</p>
      <p>Test example 1. For the first experiment the input data for the problem (8) are the matrix A and the
vector y, which are generated randomly with a standard uniform distribution according to the following
formulas: A = 10*rand(m,n), y = A*xstar, xstar = round(10*rand(n,1) + 0.5). The starting
point is chosen according to the rule x0 = round(5*rand(n,1)), and the radius of the ball, in which
the point x_p^* = xstar is located, is chosen according to the rule r0 = 5*norm(x0 - xstar), i.e.
r_0 = 5‖x_0 − xstar‖. The first experiment is implemented by the following Octave code.</p>
      <p># Test 1: emlmpr running time for n = 30 and m = 300
n = 30, m = 10*n,
rand("seed", 2024);
A = 10*rand(m,n);
xstar = round(10.0*rand(n,1) + 0.5); y = A*xstar;
x0 = round(5.0*rand(n,1)); r0 = 5*norm(x0 - xstar),
maxitn = 50000, intp = 10000, lambda = 0.0,
# running the emlmpr algorithm for p = 1.0; 1.25; 1.5; 1.75; 2.0
printf("\n Test 1: emlmpr running time for n = 30 and m = 300 \n");
epsf0 = 1.e-6; ntest = 5; table = [];
for (i = 1:ntest)
p = 1.d0 + (i - 1.d0)/(ntest - 1.d0),
epsf = epsf0**(p); time0 = time();
[xp,fp,itn,ist] = emlmpr(A,y,p,lambda,x0,r0,epsf,maxitn,intp);
time1 = time() - time0,
dx = norm(xp - xstar);
table = [table; p epsf time1 itn ist fp dx];
itn, fp,
endfor
n, m,
printf("    p   epsf   time    itn ist      fp         dx \n");
for (i = 1:ntest)
printf(" %4.1f %6.1e %4.2f %6d %2d %10.5e %10.1e\n", table(i, 1:7))
endfor</p>
      <p>The results of the emlmpr program for the first experiment — the time required to solve the
problem (8) with accuracy ε, the number of iterations itn of the method, the minimum value of the
function f_p found, and the norm of deviation dx of the found approximation from the known
minimum point xstar — are given in Table 1. Here ε is chosen as follows: if p = 1, the value
ε = 10^{−6}; if p &gt; 1, we choose ε = (10^{−6})^p. Table 1 presents the results of solving the
problem (8) with n = 30, m = 300 and λ = 0.</p>
      <p>It is easy to see from Table 1 that, to obtain a solution with accuracies 10^{−6} ÷ 10^{−12} for different p,
the emlmpr algorithm requires approximately 40 000 iterations and no more than 7 seconds.
The least deviation dx equals 2.3e–11 and is obtained for p = 1.</p>
      <p>Test example 2. The purpose of the second experiment is to demonstrate the robustness of the least
moduli method, which means that the same robustness will characterize solutions of the problem (8)
if p is close to one. Here, the matrix A, the starting point x_0 and the ball radius r_0 are chosen to be
the same as in the first test; the vector y is adjusted so that its odd components remain the same as in
the first test, and its even components are multiplied by the value
q = (1.0 + 1.0*sign(0.5 - rand)). Thus, the even components of the vector y can be
considered anomalous (incorrect) results of observations.</p>
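      <p>A possible Octave sketch of this adjustment (our reconstruction; only the multiplier rule is quoted from the experiment) is:</p>
      <p># Sketch: forming the perturbed vector y for Test 2
rand("seed", 2024);
A = 10*rand(300,30); xstar = round(10*rand(30,1) + 0.5);
y = A*xstar;
for i = 2:2:rows(y)                  # even components only
  q = 1.0 + 1.0*sign(0.5 - rand);    # q is 0 or 2 with equal probability
  y(i) = q*y(i);                     # anomalous observation
endfor</p>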
      <p>Calculation results for n = 30 and m = 300 are given in Table 2. Here, ist is the exit code of the
emlmpr program and dx is the norm of deviation of the found approximation from the point xstar.
The 5th column contains the values of the function f_p at the found point x_p, and the 7th column
contains the p-th root of the 5th column. For all values of the parameter p the code ist = 1, which
indicates successful completion of the program.</p>
      <p>The results of Table 2 show that the function value f_p grows as the parameter p increases: from
1.34e+05 if p = 1 to 1.09e+08 if p = 2. The deviation dx of the solution found from the minimum point
with p = 1 is significantly smaller than if p &gt; 1, which confirms the robustness of the least moduli
method corresponding to the p = 1 case. It is important to emphasize that this situation is typical
for all values of the parameter p close enough to 1. The time used for finding solutions for each of
the parameter p values does not exceed 4 seconds.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Computational experiments with regularization</title>
      <p>To show the effectiveness of the emlmpr algorithm applied to real data, we consider the problem of
predicting psychological indicators of a patient's condition based on cardiological data obtained
using the complex described in [11]. There were 90 patients studied, with more than 200 features
including cardiological and basic ones (like age and ordinal number). Wishing to exclude the choice
of a categorical feature recoding method from the analysis, we omit categorical features as well as
ordinal ones. In practice, using ordinal features instead of numerical ones could increase the quality
of linear modelling, see [12]; however, we need to simplify the experiment in order to study only the
use of the ellipsoid method, while the ability of the medical complex [11] to create binning good
enough for linear modelling is out of the scope of the current research. So we take just the 175
numerical features that we have. Then we apply a feature selection procedure to test the ellipsoid
method on a dataset that is optimal at least in some sense.</p>
      <p>We want to select the features that best describe the relationship between the medical and
psychological data using the R² metric [1]. Since the goal of studying the medical data includes feature
interpretability, we take these data as is. In other words, we do not apply transformations like PCA
and similar ones to obtain linearly independent features. Undoubtedly, it is possible to get some
interpretation even after such transformations, but our approach is to take the features as is. Taking into
account that internal metrics for feature importance in the case of a linear regression model work best
when the features are either linearly independent or at least have normal distributions, we cannot rely
on internal linear regression metrics, so we use the "wrapper" approach for model feature
selection [13]. For the quality metric, we use 5-fold cross-validation [1]. Since the initial dataset holds
missing values, we use simple imputation via the median strategy using only the training subsample,
to avoid distortion due to a whole-set median calculation; a sketch of this step is given below.
Moreover, in our situation the initial number of features, which is 200, is greater than the number of
observations, which is 90, so we start from the first feature and increase the number of features until
the quality metric R² stops growing. Also, we consider non-transformed features to decrease the
number of experiments to perform and the variability of the whole scheme. Selection of the optimal
transformation is an additional task, which is out of the scope of the current paper. In general, the
feature selection procedure is described in Figure 1.</p>
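      <p>A minimal Octave sketch of the imputation step (our own illustration; it assumes missing values are encoded as NaN and is applied separately inside each of the 5 cross-validation folds) is:</p>
      <p># Sketch: median imputation using the training subsample only
function [Xtr, Xte] = impute_median(Xtr, Xte)
  for j = 1:columns(Xtr)
    med = median(Xtr(!isnan(Xtr(:,j)), j));  # training-fold median only
    Xtr(isnan(Xtr(:,j)), j) = med;
    Xte(isnan(Xte(:,j)), j) = med;           # reuse it for the test fold
  endfor
endfunction</p>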
      <p>The calculations for feature selection are made in Python 3 [14] using Google Colab with the
Sequential Feature Selection and Linear Regression classes with the embedded R² metric taken from
the Scikit-learn library [15]. We also used the Pandas library [16] for keeping feature names during
the calculations.</p>
      <p>The observation matrix A consists of the values of the following 16 numerical features for 90 patients:
(1) observation number; (2) amplitude (wd. II); (3) amplitude (wd. III); (4) amplitude (wd. III);
(5) amplitude (wd. AvL); (6) amplitudes ratio (wd. II); (7) amplitude (wd. AvF); (8) LFn;
(9) amplitude (wd. AvF); (10) ECG phase ratio index; (11) state of regulation reserves;
(12) withdrawal code AvR_init; (13) comprehensive assessment of occurrence of significant
cardiovascular events_init; (14) functional condition according to Baevsky; (15) withdrawal code
I_univ; (16) HFn; (17) target: Beck anxiety scale. The last feature is the target and is to be predicted.</p>
      <p>To determine the parameters of the linear regression model and for further prediction, the emlmpr
algorithm is used with the parameters p = 1 and p = 2, where the first case corresponds to the least
moduli method and the second to the least squares method. The results of the emlmpr program are
given in Table 3. It contains the problem solving time (line 3), the number of iterations (line 4), the
value of the function at the point x_R^* (line 5) and the solution of the problem x_R^* (line 6) for
four accuracies and two values of the parameter p.</p>
      <p>Table 4 shows that to solve the problem with p = 2 for ε = 10^{−6} and ε = 10^{−20} the emlmpr
program requires approximately 8 thousand iterations. If we use p = 1, for the same accuracies
11700 iterations are required, and their number increases to 29719 iterations when using
ε = 10^{−30}. The f_R value for fixed p remains unchanged.</p>
      <p>As can be seen from Table 4, the emlmpr program successfully finds the linear regression model
coefficients when using λ = 0 (see Table 5). However, some of the coefficients are considerably larger
than the others (bold values in Table 5), which can indicate the presence of a dependency between the
corresponding features in the observation matrix. To reduce their effect on the quality of coefficient
restoration we apply l1-regularization, which allows setting the model parameters corresponding to
dependent columns to zero. In practice, it is difficult to obtain exactly zero values of the corresponding
parameters, so we have to settle for values close to zero with a certain accuracy.</p>
      <p>Table 6 contains the coefficients of the linear regression model found by the emlmpr program with
p = 1.0; 2.0, different accuracies ε and the regularization rate λ = 0.1. The values corresponding to the
large coefficients from Table 5, as well as any changes in coefficient digits, are highlighted in bold. It is
easy to see that now these coefficients are rather close to zero with sufficient accuracy: 10^{−2} for
feature 7 with any values of p and ε, 10^{−7} for feature 14 with p = 1 and ε = 10^{−6}, and even
10^{−28} for feature 16 with p = 2 and ε = 10^{−30}. The rest of the coefficients remained almost
unchanged except for several digits. It is also worth noting that increasing the regularization rate
decreases the coefficient values of dependent features even more. This gives an instrument to
adjust the impact of regularization and obtain coefficients at dependent features close enough to
zero, thus improving the quality of the solutions obtained.</p>
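      <p>In terms of the emlmpr program, the comparison amounts to running it twice on the same data with λ = 0 and λ = 0.1 (a usage sketch with assumed inputs A, y, x0, r0, not the original experiment code):</p>
      <p># Sketch: effect of l1-regularization on the coefficients
[x_plain, f_plain] = emlmpr(A, y, 1.0, 0.0, x0, r0, 1.e-6, 50000, 10000);
[x_reg,   f_reg  ] = emlmpr(A, y, 1.0, 0.1, x0, r0, 1.e-6, 50000, 10000);
printf("feature   lambda=0      lambda=0.1\n");
for j = 1:length(x_plain)
  printf("%7d  %12.4e  %12.4e\n", j, x_plain(j), x_reg(j));
endfor</p>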
      <p>The prediction results obtained using the model with parameters calculated by the emlmpr
algorithm show that using the least moduli method (p = 1) we obtain many more zero values (which
means that the solution is found with the required accuracy) than in the case of the least squares
method (p = 2). Thus, using p = 1 is more appropriate than p = 2.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>The paper investigates the problem of finding the parameters of a linear regression model with the
least moduli criterion powered to p, 1 ≤ p ≤ 2, and l1-regularization. The problem is formulated as a
problem of unconditional minimization of a convex piecewise-linear function. For solving this problem,
Shor's ellipsoid method is used, which is implemented as the emlmpr program in the Octave
programming language.</p>
      <p>A series of three computational experiments with the emlmpr program is considered. The results of
the first experiment show that the problem of finding the parameters of a linear regression model with
n = 30 and m = 300 can be solved within 7 seconds on a modern laptop of average performance. The
second experiment shows that the least moduli criterion is robust if p is close to one, and thus solutions
of the problem are robust as well. The third experiment is dedicated to using l1-regularization to
decrease the effect of linearly dependent features, which the model can include, on the quality of the
solutions. The results of the experiment, where real cardiological data are used for prediction of
psychological indicators of the patient's condition, show that the emlmpr algorithm can successfully
compute the linear regression model parameters with p = 1 and p = 2 within 3 seconds, and set the
coefficients at dependent features to zero with sufficient accuracy using the l1-regularization approach.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The paper is supported by the National Research Foundation of Ukraine (grants № 2021.01/0136 and
№ 2023.04/0094), Volkswagen Foundation grant № 97775, the project of research works of young
scientists № 07-02/03-2023, the NASU grant for research laboratories/groups of young scientists
№ 02/01-2024, and the DTT TS KNU NASU project № 0124U002162.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] G. James, D. Witten, T. Hastie, R. Tibshirani, J. Taylor, An Introduction to Statistical Learning: with Applications in Python, Springer Texts in Statistics, Springer Cham, New York, NY, 2023. doi:10.1007/978-3-031-38747-0.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] M. Deisenroth, A. Faisal, C. Soon Ong, Mathematics for Machine Learning, 1st Edition, Cambridge University Press, Cambridge, 2020.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] P.J. Huber, E.M. Ronchetti, Robust Statistics, 2nd Edition, John Wiley &amp; Sons, 2011.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] F.H. Clarke, Optimization and Nonsmooth Analysis, SIAM, 1990.</mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] P. Stetsyuk, M. Budnyk, I. Sen'ko, V. Stovba, I. Chaikovsky, Using the Ellipsoid Method to Study Relationships in Medical Data, Cybernetics and Computer Technologies (2023) 23-43. doi:10.34229/2707-451X.23.3.3.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] J. Fan, P. Hall, On curve estimation by minimizing mean absolute deviation and its implications, The Annals of Statistics (1994) 867-885.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] N.Z. Shor, Cutting-off Method with Space Dilation for Solving Convex Programming Problems, Cybernetics (1977) 94-95.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] N.Z. Shor, Nondifferentiable Optimization and Polynomial Problems, Kluwer, Amsterdam, 1998.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] N.Z. Shor, Minimization Methods for Non-Differentiable Functions, Springer-Verlag, Berlin, 1985.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] P. Stetsyuk, A. Fischer, O. Khomyak, The Generalized Ellipsoid Method and Its Implementation, Communications in Computer and Information Science (2020) 355-370. doi:10.1007/978-3-030-38603-0_26.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] I. Chaikovsky, M. Primin, A. Kazmirchuk, Development and implementation into medical practice new information technologies and metrics for analysis of small changes in electromagnetic field of human heart, Visnyk of the National Academy of Sciences of Ukraine (2021) 33-43. doi:10.15407/visn2021.02.033.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] R. Persson, Weight of evidence transformation in credit scoring models: How does it affect the discriminatory power? Master's thesis, Lund University, Lund, Sweden, 2021. https://lup.lub.lu.se/luur/download?func=downloadFile&amp;recordOId=9066332&amp;fileOId=9067075</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] J. Li, K. Cheng, S. Wang, F. Morstatter, R.P. Trevino, J. Tang, H. Liu, Feature Selection: A Data Perspective, ACM Computing Surveys (2017) 1-45. doi:10.1145/3136625.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] G. Van Rossum, F.L. Drake, Python 3 Reference Manual, CreateSpace, Scotts Valley, CA, 2009.</mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] F. Pedregosa et al., Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12.85 (2011) 2825-2830. URL: http://jmlr.org/papers/v12/pedregosa11a.html.</mixed-citation>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] W. McKinney, Data structures for statistical computing in python, in: Proceedings of the 9th Python in Science Conference, Austin, 28 June - 3 July 2010, 56-61. doi:10.25080/Majora-92bf1922-00a.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>