<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Merging Method to Discretizing and Grouping the Input Factors of ANOVA Model while Research of Time Dynamic of the Students Intelligence Quotient</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Anastasiia</forename><surname>Timofeeva</surname></persName>
							<email>a.timofeeva@corp.nstu.ru</email>
							<affiliation key="aff0">
								<orgName type="institution">Novosibirsk State Technical University</orgName>
								<address>
									<addrLine>20, Karla Marksa ave</addrLine>
									<postCode>630073</postCode>
									<settlement>Novosibirsk</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tatiana</forename><surname>Avdeenko</surname></persName>
							<email>avdeenko@corp.nstu.ru</email>
							<affiliation key="aff0">
								<orgName type="institution">Novosibirsk State Technical University</orgName>
								<address>
									<addrLine>20, Karla Marksa ave</addrLine>
									<postCode>630073</postCode>
									<settlement>Novosibirsk</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Olga</forename><surname>Razumnikova</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Novosibirsk State Technical University</orgName>
								<address>
									<addrLine>20, Karla Marksa ave</addrLine>
									<postCode>630073</postCode>
									<settlement>Novosibirsk</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A Merging Method to Discretizing and Grouping the Input Factors of ANOVA Model while Research of Time Dynamic of the Students Intelligence Quotient</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">596762AD25E158E5DF5A9E49CF21CFFC</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:39+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>intelligence</term>
					<term>Flynn effect</term>
					<term>analysis of variance</term>
					<term>discretization</term>
					<term>grouping</term>
					<term>interaction effect</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In the present work we use a multivariate ANOVA model to study the influence of independent factors such as year, faculty, and gender on the general intelligence (IQ) of students, using a sample collected in 1991-2013 at the Novosibirsk State Technical University. A peculiarity of models of this type is that the response is a quantitative variable, whereas the input features must be qualitative. Therefore, first, quantitative features must be converted into categorical ones (discretization), and second, when the input qualitative features have a large number of levels, these levels must be grouped. If the variables are strongly correlated, both tasks should be solved simultaneously, while ensuring the optimal quality of the model according to a chosen criterion. Existing methods for feature type conversion are limited to one of the tasks (discretization or grouping) and often do not take into account the relationships between the features. Therefore, an original approach is proposed that solves both tasks and allows the results to be interpreted.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The intelligence quotient (IQ) is associated with both the quality and the length of people's lives. For example, a study carried out in Scotland and presented in <ref type="bibr" target="#b0">[1]</ref> showed that the probability of surviving to the age of 76 depends significantly on the IQ level measured at the age of 11. The study was based on IQ measurements taken in 1932 of 2792 Scottish children born in 1921, 79.9% (2230) of whom were subsequently traced. One possible explanation for these findings is that intelligence enhances people's health care by helping them to acquire problem-solving skills that are useful for preventing chronic diseases and accidental injuries, and for adhering to complex treatment schemes.</p><p>There are other reasons for the influence of IQ on the quality of life and, as a consequence, on its duration. For instance, the article <ref type="bibr" target="#b1">[2]</ref>, based on a survey of 6870 participants living in England in 2007, found a positive correlation between the level of verbal IQ and the feeling of happiness. People with lower IQ were found to be less happy than people with higher IQ.</p><p>On the other hand, recent studies show that high intelligence is associated with increased anxiety and stress, and can also cause chronic depression <ref type="bibr" target="#b2">[3]</ref>. It is also noted that gifted people are more likely than others to suffer from asthma and allergies <ref type="bibr" target="#b3">[4]</ref>, and are also susceptible to autoimmune diseases <ref type="bibr" target="#b4">[5]</ref>.</p><p>All of the above indicates the relevance of research based on the accumulation and multivariate statistical analysis of intelligence indicators and their relationship with various time and demographic factors. 
A special place in these studies is occupied by the gradual increase of IQ during the 20th century, known as the "Flynn effect". The effect was observed in different countries and for different categories of test subjects <ref type="bibr" target="#b5">[6]</ref>. For example, <ref type="bibr" target="#b6">[7]</ref> concluded that representative samples of Americans performed better on IQ tests with each year from 1932 to 1978, with an overall increase in average IQ of 13.8 points over 46 years. However, since the end of the 20th century, the reverse temporal dynamics of IQ (the anti-Flynn effect) has been observed, the reasons for which remain unclear <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10]</ref>.</p><p>For 23 years, from 1991 to 2013, the intelligence of first-year students at the Novosibirsk State Technical University was tested using the Amthauer method. The sample consisted of 3,677 students of both sexes from various departments of the university in the natural sciences, engineering, and the humanities. Analysis of these data makes it possible to establish the influence of factors such as gender and faculty on the IQ of students, as well as to study the temporal dynamics of the IQ of students at a Russian university.</p><p>For the research, a multivariate analysis of variance was chosen, in which the response (dependent variable) is the final IQ of students, measured on a ratio scale. The categorical independent features, measured on a nominal scale, are student gender, faculty, and survey year. The aim of the study is to identify the influence of the independent factors on the dependent variable, a quantitative indicator of IQ. 
It is important to assess not only the impact of each factor separately, but also their interactions.</p><p>In long-term studies of intelligence, it is not always possible to develop an experimental design that yields optimal estimates of the effects in the ANOVA model, since it is difficult to ensure that a similar sample of individuals is surveyed every year. As a result, the analyzed sample is characterized by an uneven distribution of students across faculties and survey years, i.e. in one year students from one subset of faculties were surveyed, and in the next year students from another subset. To construct an acceptable analysis of variance model under these conditions, the present paper develops, investigates, and applies a method of agglomerative discretization and grouping of input features.</p><p>The article has the following structure. Section 2 provides an overview of the existing discretization methods and motivates the development of a new method. Section 3 presents the quality criteria investigated in the article for constructing an optimal discretization. In Section 4, we describe the ANOVA model used. Section 5 describes the developed discretization algorithm. Section 6 contains the results of applying the proposed approach, and Section 7 contains their interpretation for the multifactor task of studying the IQ of students. Section 8 concludes the work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Overview of discretization methods</head><p>A good overview of the current state of research on discretization methods is presented in <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>. If a quantitative attribute is transformed into a qualitative one so as to ensure the best agreement with the response, the discretization is called supervised. This task can be solved using top-down (divisive) or bottom-up (agglomerative) techniques. In the first case the range is gradually split into intervals, and in the second the intervals are merged. At each step of such algorithms, an evaluation function is calculated that characterizes the quality of the division into intervals. The stopping criterion, which determines when further splitting (merging) no longer makes sense, is also important.</p><p>For example, the efficient recursive partitioning algorithm MDLP <ref type="bibr" target="#b12">[13]</ref> evaluates quality by entropy-based information gain, and its stopping criterion is derived from the minimum description length principle. The chi-square statistic is popular in agglomerative merging; algorithms such as ChiMerge <ref type="bibr" target="#b13">[14]</ref> and Chi2 <ref type="bibr" target="#b14">[15]</ref> are built on it. Both approaches are designed for classification tasks, that is, they assume that the response is categorical. Therefore, applying them to transform a set of input variables when constructing ANOVA models requires discretizing the response, which can lead to the loss of significant information.</p><p>Another group of discretization methods, the so-called wrapper methods, focuses on the quality of the estimated model. Thus, these methods simultaneously solve the learning problem. 
The existing algorithms are built for classifiers, ranging from simple ones such as a majority class voting classifier <ref type="bibr" target="#b15">[16]</ref> to more general ones such as Naive Bayes <ref type="bibr" target="#b16">[17]</ref>.</p><p>Compared to discretization, the grouping problem has not been studied as deeply in the literature. A fairly complete overview of grouping methods is presented in <ref type="bibr" target="#b17">[18]</ref>. Many commercial data mining packages suggest excluding variables that have too many categories. This approach, however, is not acceptable when the research interest lies in assessing the effects of exactly such variables. Effective grouping methods yield fewer, more informative categories. This can be done by the Sequential Forward Selection method <ref type="bibr" target="#b18">[19]</ref>, a greedy algorithm that initializes a group with the best category and then iteratively adds new categories to this first group. Decision tree algorithms often solve the grouping problem with a greedy heuristic based on bottom-up categorization. The CHAID algorithm <ref type="bibr" target="#b19">[20]</ref> uses this greedy approach with a criterion close to the ChiMerge criterion <ref type="bibr" target="#b13">[14]</ref>. In <ref type="bibr" target="#b17">[18]</ref>, a new grouping method, MODL, based on the Bayesian approach was proposed, alongside the MODL discretization method <ref type="bibr" target="#b20">[21]</ref>. It searches for the most likely grouping model for the given dataset. Optimization is done using a greedy bottom-up algorithm.</p><p>Thus, most of the existing supervised discretization algorithms are designed to solve classification problems, that is, for a categorical response. They are mainly aimed at improving the quality of predicting the response (quality of classification) <ref type="bibr" target="#b21">[22]</ref>, <ref type="bibr" target="#b22">[23]</ref>. 
Moreover, they are usually univariate. In this regard, it seems relevant to develop an algorithm for the optimal categorization of input features, taking into account their interrelationships, to build a model of analysis of variance. Here categorization includes two tasks: discretization of quantitative variables and grouping of nominal features. Due to the specifics of the practical task, the construction of response predictions is secondary, therefore, the use of criteria such as cross-validation in order to assess the quality of the model and avoid overfitting is limited. The main task was to obtain and interpret estimates of the effects of influencing factors. As a result, we had to resort to goodness-of-fit criteria.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Goodness of fit criteria</head><p>Most often, the quality of a regression model is judged by the coefficient of determination, calculated as</p><formula xml:id="formula_0">R^2 = 1 - \frac{ESS}{TSS},</formula><p>where ESS is the residual sum of squares of the model and TSS is the total sum of squares. However, this indicator has an obvious drawback. As the complexity of the model increases (new variables are included), the response can be described better, which decreases ESS and increases $R^2$. At the same time, the number of degrees of freedom decreases, which is in no way taken into account when calculating the coefficient of determination.</p><p>To check the significance of the model, the F-statistic is used, calculated as</p><formula xml:id="formula_1">F = \frac{R^2}{1 - R^2} \cdot \frac{N - m}{m - 1},</formula><p>where N is the number of observations and m is the number of estimated parameters. It takes degrees of freedom into account, so an increase in model complexity must be offset by a sufficient decrease in the residual sum of squares. The Akaike information criterion is often used in the problem of feature selection, for example, in the stepwise regression procedure. It provides a trade-off between goodness of fit and complexity of the model (number of parameters). The Akaike criterion is calculated as</p><formula>AIC = 2m + N \log ESS.</formula><p>It should be borne in mind that with a very large number of categories, building good groupings is difficult because of the risk of overfitting the model. In the extreme case, to avoid overfitting, efficient grouping methods can combine all values into one group, thereby excluding the variable from consideration. To prevent such a situation, the stopping criterion must include a condition on the minimum number of categories (for example, two).</p></div>
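<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three criteria of Section 3 can be illustrated with a minimal Python sketch. It assumes that the fitted values of the model are already available; the function name goodness_of_fit and its arguments are hypothetical and are not taken from the original study.</p>
<p><![CDATA[
```python
import math


def goodness_of_fit(y, fitted, m):
    """Return (R2, F, AIC) for observed values y, fitted values, and
    m estimated parameters. Assumes m > 1 and a non-zero residual sum."""
    n = len(y)
    mean_y = sum(y) / n
    # ESS: residual sum of squares; TSS: total sum of squares.
    ess = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ess / tss
    # F-statistic expressed through R2, N observations, m parameters.
    f_stat = (r2 / (1.0 - r2)) * ((n - m) / (m - 1))
    # Akaike criterion in the form AIC = 2m + N log(ESS).
    aic = 2 * m + n * math.log(ess)
    return r2, f_stat, aic
```
]]></p>
<p>For a one-factor model with two groups, the fitted values are the group means and m = 2. Merging levels lowers m and can only increase ESS, which is why the determination coefficient alone never favours a merge, whereas the F-statistic and AIC may.</p></div>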
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">ANOVA model</head><p>For the research, the following analysis of variance model was formulated:</p><formula xml:id="formula_2">y_{ktji} = \mu + \alpha_k + \beta_t + \gamma_j + (\alpha\beta)_{kt} + (\alpha\gamma)_{kj} + (\beta\gamma)_{tj} + (\alpha\beta\gamma)_{ktj} + \varepsilon_{ktji},<label>(1)</label></formula><p>where $y_{ktji}$ is the $i$-th observed IQ value for a student of the $k$-th sex of the $j$-th faculty in the year $t$, $\mu$ is the general mean, $\alpha_k$ is the effect of the $k$-th sex ($k = 1$ for male, $k = 2$ for female), $\beta_t$ is the effect of the $t$-th year, $\gamma_j$ is the effect of the $j$-th faculty, $j = 1, \ldots, 11$, $(\alpha\beta)_{kt}$ is the interaction effect of the $k$-th sex and the $t$-th year, $(\alpha\gamma)_{kj}$ is the interaction effect of the $k$-th sex and the $j$-th faculty, $(\beta\gamma)_{tj}$ is the interaction effect of the $t$-th year and the $j$-th faculty, $(\alpha\beta\gamma)_{ktj}$ is the interaction effect of the $k$-th sex, the $t$-th year and the $j$-th faculty, and $\varepsilon_{ktji}$ is a random error.</p><p>It is impossible to estimate all the effects in model <ref type="formula" target="#formula_2">(1)</ref>. The usual remedy is a reduction of the model, in which effects are estimated as pairwise comparisons with some baseline level; for example,</p><formula xml:id="formula_3">\alpha_2 - \alpha_1</formula><p>is the influence of female versus male. The first levels of the factors are taken as the baseline levels.</p><p>The distribution of the studied students is uneven over the years (see Table <ref type="table" target="#tab_0">1</ref>). There is a close relationship between the variables Faculty and Year. The chi-square statistic is 8092.9, which indicates a significant association at the 0.1% significance level. Nevertheless, it should be borne in mind that the original contingency table has a very large dimension (220 degrees of freedom) and, as a consequence, contains cells with a small number of observations, which negatively affects the correctness of the chi-square test. For confirmation, the correlation ratio was calculated, showing the influence of the faculty on the year. It is 0.192 (the F-statistic is equal to 86.9), which also indicates a significant relationship at the 0.1% significance level. 
Consequently, it is impossible to assess all the effects of faculty and year interactions in order to separate the effect of student specialization from the time trend. Therefore, it is necessary to discretize the Faculty and Year variables in such a way as to ensure the optimal quality of estimation of the model, which includes interaction effects.</p></div>
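<div xmlns="http://www.tei-c.org/ns/1.0"><p>The chi-square statistic for a Faculty by Year contingency table, as discussed in Section 4, can be computed along the lines of the following minimal Python sketch. The function name chi_square is a hypothetical helper, and the sketch assumes that every row and column of the table has a non-zero total; it does not reproduce the study data.</p>
<p><![CDATA[
```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of counts.
    Assumes every row total and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof
```
]]></p>
<p>A table with 11 faculties and 23 survey years has (11 - 1)(23 - 1) = 220 degrees of freedom, which agrees with the value quoted in Section 4. With so many cells, many expected counts are small, which is the reason given there for treating the test with caution.</p></div>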
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">The developed algorithm</head><p>The algorithm is developed for the case when one quantitative variable must be discretized and one categorical variable must be grouped, and the two variables are highly correlated. It can be extended to more than two variables, but with a large number of variables and levels the curse of dimensionality arises.</p><p>The pseudocode of the algorithm for the optimal categorization of input features, taking into account their interrelationships for constructing an analysis of variance model, is shown in Figure <ref type="figure" target="#fig_1">1</ref>. The input is the raw data, including the response values, a quantitative factor $x_1$ with $T_0$ levels, and a qualitative factor $x_2$ with $K_0$ levels. The thresholds were selected simultaneously for the two variables by agglomerative merging. The initial model was built with all available factor levels. Then, at each step, one boundary between levels was removed. For the categorical variable, all possible pairs of factor levels were considered; for the quantitative variable, only adjacent values. In addition, the option of not combining levels was considered; it was assigned the index 0 for the variable whose levels were not combined. This covers the case in which the optimal solution is to combine levels in only one of the variables. The levels were combined for the option that achieved the best value of the quality index. 
The procedure was then repeated as long as an improvement was obtained.</p><formula xml:id="formula_4">Q \leftarrow \mathrm{quality}(x_1, x_2)
repeat
  for k = 0 to (T_0 - 1) do
    if k &gt; 0 &amp; T_0 &gt; 2 then x_1' \leftarrow \mathrm{merge}(x_1, k, k + 1)</formula><formula xml:id="formula_5">    else x_1' \leftarrow x_1
    if k &gt; 0 then Q_1(0, 0, k) \leftarrow \mathrm{quality}(x_1', x_2)
    for i = 1 to (K_0 - 1) do
      for j = i + 1 to K_0 do
        if K_0 &gt; 2 then x_2' \leftarrow \mathrm{merge}(x_2, i, j)
        else x_2' \leftarrow x_2
        Q_1(i, j, k) \leftarrow \mathrm{quality}(x_1', x_2')
      end for
    end for
  end for
  Q' \leftarrow \mathrm{opt}_{i,j,k} Q_1
  if Q' = Q \mid Q' \prec Q then break
  Q := Q'
  (i^*, j^*, k^*) \leftarrow \arg \mathrm{opt}_{i,j,k} Q_1
  if k^* &gt; 0 then x_1 := \mathrm{merge}(x_1, k^*, k^* + 1), T_0 := T_0 - 1
  if i^* &gt; 0 then x_2 := \mathrm{merge}(x_2, i^*, j^*), K_0 := K_0 - 1</formula><p>The function $\mathrm{quality}(x_1, x_2)$ returns an indicator of the quality of fit of an ANOVA model of the form (1) (determination coefficient, F-statistic, or AIC) for the given input data.</p><p>The function $\mathrm{merge}(x, i, j)$ combines the levels $i, j$ of a variable $x$ so that the number of levels is reduced by one. If the input variable has $K$ levels, the function returns the transformed variable with</p><formula xml:id="formula_6">(K - 1) \text{ levels numbered from } 1 \text{ to } (K - 1).</formula><p>Since the optimization of the goodness-of-fit criteria can go in different directions (maximization for the determination coefficient and the F-statistic, minimization for AIC), we denote the optimal value by opt. The condition</p><formula xml:id="formula_7">Q' = Q \mid Q' \prec Q</formula><p>means that $Q'$ is no better than $Q$.</p></div>
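<div xmlns="http://www.tei-c.org/ns/1.0"><p>The merging procedure described in Section 5 can be sketched in Python as follows. This is an illustrative simplification rather than the authors' implementation: the quality function is assumed to be the AIC of a cell-means model with a full interaction of the two factors (the number of parameters is taken to be the number of non-empty cells), levels are coded as integers starting from 1, and the names merge, aic_cell_means and greedy_merge are hypothetical.</p>
<p><![CDATA[
```python
import math
from collections import defaultdict


def merge(levels, i, j):
    """Combine levels i and j and renumber the result to 1..K-1."""
    lo, hi = min(i, j), max(i, j)
    merged = [lo if v == hi else v for v in levels]
    return [v - 1 if v > hi else v for v in merged]


def aic_cell_means(y, x1, x2):
    """AIC = 2m + N log(ESS) for a cell-means model (assumption:
    m is the number of non-empty cells; requires ESS > 0)."""
    cells = defaultdict(list)
    for yi, a, b in zip(y, x1, x2):
        cells[(a, b)].append(yi)
    ess = sum(sum((v - sum(c) / len(c)) ** 2 for v in c)
              for c in cells.values())
    return 2 * len(cells) + len(y) * math.log(ess)


def greedy_merge(y, x1, x2):
    """Greedy bottom-up merging that minimizes AIC. x1 is ordered
    (adjacent merges only); x2 is nominal (any pair of levels)."""
    best = aic_cell_means(y, x1, x2)
    while True:
        t0, k0 = max(x1), max(x2)
        # First option in each list is "do not merge this variable".
        x1_opts = [x1]
        if t0 > 2:
            x1_opts += [merge(x1, k, k + 1) for k in range(1, t0)]
        x2_opts = [x2]
        if k0 > 2:
            x2_opts += [merge(x2, i, j) for i in range(1, k0)
                        for j in range(i + 1, k0 + 1)]
        candidates = [(aic_cell_means(y, a, b), a, b)
                      for a in x1_opts for b in x2_opts
                      if not (a is x1 and b is x2)]
        if not candidates:
            return x1, x2, best
        score, a, b = min(candidates, key=lambda c: c[0])
        if score >= best:  # no candidate is better: stop
            return x1, x2, best
        x1, x2, best = a, b, score
```
]]></p>
<p>Each pass evaluates merging in the first variable only, in the second variable only, and in both at once, which corresponds to the index 0 convention for the variable whose levels are left unchanged. The checks t0 > 2 and k0 > 2 keep at least two categories per variable, in line with the stopping condition discussed in Section 3.</p></div>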
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Results</head><p>Choosing the determination coefficient as the evaluation criterion gave no results: the original partition provided the minimum residual sum of squares. As expected, any merging of intervals led to a decrease in the determination coefficient.</p><p>The use of the F-statistic, on the contrary, improved the value of the evaluation function at each step. Thus, the algorithm stopped only when the intervals could no longer be combined, that is, when two categories were left for each feature. AMCSF formed a separate group of faculties, and 1991 a separate group of years. The estimates of this model indicate one significant effect: at AMCSF, IQ is 6.3 points higher than at the other faculties (significant at the 1% level).</p><p>The use of the Akaike information criterion produced more interesting results. The algorithm distinguished three groups of faculties. Table 1 clearly shows that there are years in which some faculties were not covered by the study. This problem was partially solved by discretizing the variable Year. Table <ref type="table" target="#tab_1">2</ref> shows the shares of students of the three groups of faculties studied in each range of years. For the first group, no periods remain in which its faculties were not covered by the study. Nevertheless, there is a gap for the second group of faculties in 2009, and for the third in 2008 and 2010-2013. Therefore, it was not possible to estimate the corresponding effects. After discretizing the variables, a model was estimated describing the dependence of IQ on gender, faculty, and year and on their interactions. It turned out that gender has an insignificant effect on the level of intelligence. 
Therefore, the gender factor was eliminated and the model was re-estimated.</p><p>Table <ref type="table" target="#tab_2">3</ref> summarizes the F-statistics, the critical F-values for the 1% significance level, and the p-values. Almost all the effects of the variable Year turned out to be significant at the 5% or 10% level. For the base year, the effect of the faculties of the second group compared to the first is -5.3 and is significant at the 1% level. The effect of the faculties of the third group compared to the first for the base year is estimated as 1.2 and is significant at the 10% level. Most of the interactions between the year and the faculty were significant. The general average is estimated at 112.3.</p></div>
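<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Interpreting the Results</head><p>From the point of view of specialization, the distinguished groups of faculties can be characterized as follows. The first group comprises technical and economic faculties, the second humanitarian and applied faculties, and the third physics and mathematics faculties. The latter group is characterized, on average, by the highest level of intelligence, although since 2006 its IQ has dropped and become comparable to that of students in other faculties. During this period, however, very few students with a physical and mathematical specialization were observed (see Table <ref type="table" target="#tab_0">1</ref>): only PEF in 2006 (22 students) and 2009 (14 students). Therefore, the decline in IQ may be due to the non-representativeness of the sample.</p><p>In the 2000s, IQ indicators were unstable among students of technical and economic specialization. The growth in 2006-2007 can be explained by the fact that in 2007 only the NSF, which was characterized by higher IQ indices, was observed from this group.</p><p>For students of humanitarian and applied specialties, intelligence indicators generally increased from 2000 to 2005, and then declined sharply in 2006-2008. In 2009, the faculties of this group were not studied, so the interaction effect could not be estimated, and the IQ forecast is based only on the main effects. This explains the sharp increase in the IQ forecast in 2009, which cannot be considered reasonable.</p></div>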
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Conclusion</head><p>In this work, an analysis of variance model was constructed to study the influence of input factors on the IQ of students. To build a high-quality model that takes into account the specifics of the collected data, a new agglomerative method for discretizing and grouping input features was developed and tested, and the estimation results were interpreted. In practice, this interpretation can be used in the construction of individual educational trajectories, which is one of the key problems of the modern digital educational environment <ref type="bibr" target="#b23">[24]</ref>.</p><p>Future work involves improving the developed algorithm in terms of finding the optimal solution, as well as developing alternative models for the study of students' IQ and comparing their results.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>$\gamma_j$ is the effect of the $j$-th faculty, $j = 1, \ldots, 11$, $(\alpha\beta)_{kt}$ is the interaction effect of the $k$-th sex and the $t$-th year, $(\alpha\gamma)_{kj}$ is the interaction effect of the $k$-th sex and the $j$-th faculty, $(\beta\gamma)_{tj}$ is the interaction effect of the $t$-th year and the $j$-th faculty, $(\alpha\beta\gamma)_{ktj}$ is the interaction effect of the $k$-th sex, the $t$-th year and the $j$-th faculty, and $\varepsilon_{ktji}$ is a random error.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Pseudocode of the developed algorithm. Input: raw data including response values, quantitative factor $x_1$ with $T_0$ levels, and qualitative factor $x_2$ with $K_0$ levels.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure</head><label></label><figDesc>Figure shows the predicted IQ values by year and depending on the group of faculties.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Model estimation results</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>): only PEF in 2006 (22 students) and 2009 (14 students). Therefore, the decline in IQ may be due to the non-representativeness of the sample. In the 2000s, there was instability of IQ indicators among students of technical and economic specialization. The growth period 2006-2007 can be explained by the fact that in 2007 only the NSF was observed from this group, which was characterized by higher IQ indices. For students of humanitarian and applied specialties from 2000 to 2005 in general, there was an increase in intelligence indicators, and then a sharp decline began in 2006-2008. In 2009, the faculties</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>The ratio of faculties and survey years in the sample</figDesc><table><row><cell>Faculty</cell><cell>Abbreviation</cell><cell>Survey years</cell></row><row><cell>automation and computer engineering</cell><cell>ACEF</cell><cell>1994, 1995, 1997-1999, 2001, 2002, 2004</cell></row><row><cell>mechanical engineering and technologies</cell><cell>MTF</cell><cell>1993, 1995-2001, 2013</cell></row><row><cell>radio engineering and electronics</cell><cell>REEF</cell><cell>1992-2002, 2006, 2008-2010</cell></row><row><cell>business</cell><cell>FB</cell><cell>1997-1999, 2001-2005</cell></row><row><cell>humanity education</cell><cell>HEF</cell><cell>1994, 1998-2004, 2006-2008, 2010-2012</cell></row><row><cell>aircraft engineering</cell><cell>AEF</cell><cell>1993, 1994, 1997, 2000, 2002-2006</cell></row><row><cell>mechatronics and automation</cell><cell>MAF</cell><cell>1994, 1995, 1997, 1998</cell></row><row><cell>applied mathematics and computer science</cell><cell>AMCSF</cell><cell>1995-2004</cell></row><row><cell>physical engineering</cell><cell>PEF</cell><cell>1991, 1994-1996, 1998-2000, 2003, 2004, 2006, 2009</cell></row><row><cell>power engineering</cell><cell>PEF</cell><cell>1994, 1995, 1997-1999, 2001, 2002, 2004</cell></row><row><cell>natural sciences</cell><cell>NSF</cell><cell>2007</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Shares of students of faculty groups in the total number of students studied in a given range of years</figDesc><table><row><cell>Survey years</cell><cell>1st group</cell><cell>2nd group</cell><cell>3rd group</cell></row><row><cell>1991-1996</cell><cell>0.697</cell><cell>0.073</cell><cell>0.230</cell></row><row><cell>1997</cell><cell>0.556</cell><cell>0.148</cell><cell>0.296</cell></row><row><cell>1998-1999</cell><cell>0.661</cell><cell>0.195</cell><cell>0.144</cell></row><row><cell>2000</cell><cell>0.203</cell><cell>0.228</cell><cell>0.568</cell></row><row><cell>2001</cell><cell>0.296</cell><cell>0.245</cell><cell>0.460</cell></row><row><cell>2002</cell><cell>0.579</cell><cell>0.274</cell><cell>0.147</cell></row><row><cell>2003-2005</cell><cell>0.306</cell><cell>0.320</cell><cell>0.373</cell></row><row><cell>2006-2007</cell><cell>0.394</cell><cell>0.518</cell><cell>0.088</cell></row><row><cell>2008</cell><cell>0.600</cell><cell>0.400</cell><cell>0</cell></row><row><cell>2009</cell><cell>0.759</cell><cell>0</cell><cell>0.241</cell></row><row><cell>2010-2013</cell><cell>0.556</cell><cell>0.444</cell><cell>0</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>The significance of factors</figDesc><table><row><cell>Factor</cell><cell>Degrees of freedom</cell><cell>F-statistics</cell><cell>Critical F-value</cell><cell>p-value</cell></row><row><cell>Year</cell><cell>10</cell><cell>14.88</cell><cell>0.55</cell><cell>&lt;2·10⁻¹⁶</cell></row><row><cell>Faculty</cell><cell>2</cell><cell>244.41</cell><cell>0.63</cell><cell>&lt;2·10⁻¹⁶</cell></row><row><cell>Faculty:Year</cell><cell>17</cell><cell>11.38</cell><cell>0.53</cell><cell>&lt;2·10⁻¹⁶</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Acknowledgements</head><p>The research is supported by the Ministry of Science and Higher Education of the Russian Federation (project No. FSUN-2020-0009).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Longitudinal Cohort Study of Childhood IQ and Survival up to Age 76</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">J</forename><surname>Whalley</surname></persName>
		</author>
		<idno type="DOI">10.1136/bmj.322.7290.819</idno>
	</analytic>
	<monogr>
		<title level="j">BMJ</title>
		<imprint>
			<biblScope unit="volume">322</biblScope>
			<biblScope unit="page" from="819" to="819" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Relationship between Happiness and Intelligent Quotient: the Contribution of Socio-Economic and Clinical Factors</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ali</surname></persName>
		</author>
		<idno type="DOI">10.1017/s0033291712002139</idno>
	</analytic>
	<monogr>
		<title level="j">Psychological Medicine</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="1303" to="1312" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Intelligence and Emotional Disorders: Is the Worrying and Ruminating Mind a More Intelligent Mind?</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Penney</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.paid.2014.10.005</idno>
	</analytic>
	<monogr>
		<title level="j">Personality and Individual Differences</title>
		<imprint>
			<biblScope unit="volume">74</biblScope>
			<biblScope unit="page" from="90" to="93" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Some Common Allergic Emergencies</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Hildreth</surname></persName>
		</author>
		<idno type="DOI">10.1016/s0025-7125(16)33127-3</idno>
	</analytic>
	<monogr>
		<title level="j">Medical Clinics of North America</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="1313" to="1324" />
			<date type="published" when="1966">1966</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Intellectually Gifted Students Also Suffer from Immune Disorders</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">P</forename><surname>Benbow</surname></persName>
		</author>
		<idno type="DOI">10.1017/s0140525x00001059</idno>
	</analytic>
	<monogr>
		<title level="j">Behavioral and Brain Sciences</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">442</biblScope>
			<date type="published" when="1985">1985</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Searching for Justice: The Discovery of IQ Gains over Time</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Flynn</surname></persName>
		</author>
		<idno type="DOI">10.1037/0003-066X.54.1.5</idno>
	</analytic>
	<monogr>
		<title level="j">American Psychologist</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="5" to="20" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">The Mean IQ of Americans: Massive Gains 1932 to 1978</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Flynn</surname></persName>
		</author>
		<idno type="DOI">10.1037/0033-2909.95.1.29</idno>
	</analytic>
	<monogr>
		<title level="j">Psychological Bulletin</title>
		<imprint>
			<biblScope unit="volume">95</biblScope>
			<biblScope unit="page" from="29" to="51" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Flynn effect and its reversal are both environmentally caused</title>
		<author>
			<persName><forename type="first">B</forename><surname>Bratsberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Rogeberg</surname></persName>
		</author>
		<idno type="DOI">10.1073/pnas.1718793115</idno>
	</analytic>
	<monogr>
		<title level="j">PNAS</title>
		<imprint>
			<biblScope unit="volume">115</biblScope>
			<biblScope unit="page" from="6674" to="6678" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The negative Flynn effect: A systematic literature review</title>
		<author>
			<persName><forename type="first">E</forename><surname>Dutton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Van Der Linden</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lynn</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.intell.2016.10.002</idno>
	</analytic>
	<monogr>
		<title level="j">Intelligence</title>
		<imprint>
			<biblScope unit="volume">59</biblScope>
			<biblScope unit="page" from="163" to="169" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">IQ decline and Piaget: Does the rot start at the top?</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Flynn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shayer</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.intell.2017.11.010</idno>
	</analytic>
	<monogr>
		<title level="j">Intelligence</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="page" from="112" to="121" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Discretization techniques: A recent survey</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kotsiantis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kanellopoulos</surname></persName>
		</author>
		<idno>doi:10.1.1.109.3084</idno>
	</analytic>
	<monogr>
		<title level="j">GESTS International Transactions on Computer Science and Engineering</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="47" to="58" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Luengo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Sáez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Herrera</surname></persName>
		</author>
		<idno type="DOI">10.1109/TKDE.2012.35</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="734" to="750" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning</title>
		<author>
			<persName><forename type="first">U</forename><surname>Fayyad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Irani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th Int&apos;l Joint Conf. Artificial Intelligence (IJCAI)</title>
				<meeting>the 13th Int&apos;l Joint Conf. Artificial Intelligence (IJCAI)</meeting>
		<imprint>
			<date type="published" when="1993">1993</date>
			<biblScope unit="page" from="1022" to="1029" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">ChiMerge: Discretization of Numeric Attributes</title>
		<author>
			<persName><forename type="first">R</forename><surname>Kerber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Nat&apos;l Conf. Artificial Intelligence Am. Assoc. for Artificial Intelligence (AAAI)</title>
				<meeting>the Nat&apos;l Conf. Artificial Intelligence Am. Assoc. for Artificial Intelligence (AAAI)</meeting>
		<imprint>
			<date type="published" when="1992">1992</date>
			<biblScope unit="page" from="123" to="128" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Feature Selection via Discretization</title>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Setiono</surname></persName>
		</author>
		<idno type="DOI">10.1109/69.617056</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Knowledge and Data Eng</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="642" to="645" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">BRACE: A Paradigm for the Discretization of Continuously Valued Data</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ventura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">R</forename><surname>Martinez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Seventh Ann. Florida AI Research Symp. (FLAIRS)</title>
				<meeting>the Seventh Ann. Florida AI Research Symp. (FLAIRS)</meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
			<biblScope unit="page" from="117" to="121" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">An Iterative Improvement Approach for the Discretization of Numeric Attributes in Bayesian Classifiers</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Pazzani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Int&apos;l Conf. Knowledge Discovery and Data Mining (KDD)</title>
				<meeting>the First Int&apos;l Conf. Knowledge Discovery and Data Mining (KDD)</meeting>
		<imprint>
			<date type="published" when="1995">1995</date>
			<biblScope unit="page" from="228" to="233" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Grouping method for categorical attributes having very large number of values</title>
		<author>
			<persName><forename type="first">M</forename><surname>Boullé</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Workshop on Machine Learning and Data Mining in Pattern Recognition</title>
				<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="228" to="242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A knowledge-elicitation tool for sophisticated users</title>
		<author>
			<persName><forename type="first">B</forename><surname>Cestnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kononenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bratko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Progress in Machine Learning</title>
				<meeting><address><addrLine>Sigma Press, Wilmslow, England</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">An exploratory technique for investigating large quantities of categorical data</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">V</forename><surname>Kass</surname></persName>
		</author>
		<idno type="DOI">10.2307/2986296</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of the Royal Statistical Society: Series C (Applied Statistics)</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="119" to="127" />
			<date type="published" when="1980">1980</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">MODL: a Bayes optimal discretization method for continuous attributes</title>
		<author>
			<persName><forename type="first">M</forename><surname>Boullé</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10994-006-8364-x</idno>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">65</biblScope>
			<biblScope unit="page" from="131" to="165" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Supervised discretization for optimal prediction</title>
		<author>
			<persName><forename type="first">W</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.procs.2014.05.383</idno>
	</analytic>
	<monogr>
		<title level="j">Procedia Computer Science</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="75" to="80" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Improving classification performance with discretization on biomedical datasets</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Lustgarten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Gopalakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Grover</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Visweswaran</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AMIA annual symposium proceedings</title>
				<imprint>
			<publisher>American Medical Informatics Association</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="445" to="449" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Development and Research of Algorithms for the Formation the Individual Educational Trajectories of Students in the Digital Educational Platform</title>
		<author>
			<persName><forename type="first">D</forename><surname>Parfenov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Zaporozhko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lapina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sora</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2494</biblScope>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
