<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evolutionary Approach to Multimodal Clustering on Formal Contexts</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mikhail</forename><surname>Bogatyrev</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tula State University</orgName>
								<address>
									<addrLine>92 Lenin ave</addrLine>
									<settlement>Tula</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sergey</forename><surname>Dvoenko</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tula State University</orgName>
								<address>
									<addrLine>92 Lenin ave</addrLine>
									<settlement>Tula</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dmitry</forename><surname>Orlov</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tula State University</orgName>
								<address>
									<addrLine>92 Lenin ave</addrLine>
									<settlement>Tula</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tatyana</forename><surname>Shestaka</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Tula State University</orgName>
								<address>
									<addrLine>92 Lenin ave</addrLine>
									<settlement>Tula</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Evolutionary Approach to Multimodal Clustering on Formal Contexts</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F79F919E2FC417D5607A40C6F379DD7D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:42+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Evolutionary Computation</term>
					<term>Formal Context</term>
					<term>Multimodal Clustering</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>An evolutionary approach to multimodal clustering on multidimensional formal contexts is proposed. Its main advantage is that it allows one to find a well-grounded number of clusters corresponding to the global, or close to the global, extremum of the function that characterizes the quality of the solution of the clustering problem. This approach is effective when the proximity measure for the data being clustered is not Euclidean. A formal context is the data model in Formal Concept Analysis, the area of data analysis where mathematically rigorous methods from lattice theory are applied to discover relationships in heterogeneous data. Data heterogeneity can be taken into account effectively in cluster analysis by using multimodal clustering methods. The paper contains the main definitions from Formal Concept Analysis and a description of the principle of evolutionary computation and of the evolutionary approach to multimodal clustering. An experimental study of the proposed approach is performed on the task of phenotyping myocardial infarction.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>One of the features of the data used in data-intensive processing is heterogeneity. Accordingly, methods of intensive data processing must take this heterogeneity into account, which makes it possible to preserve the knowledge present in data sets obtained from domains of different nature.</p><p>Many works, for example <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref>, are devoted to clustering heterogeneous data and interpreting clusters. Data heterogeneity can be taken into account effectively in cluster analysis by using multimodal clustering methods. Classical cluster analysis is based on dividing a single set of objects into disjoint subsets, the clusters. Despite the wide variety of proximity measures used there, the problem of interpreting the obtained clusters remains pressing. Cluster interpretation is always performed in the context of the applied proximity measure, and for heterogeneous data from different domains an interpretation based on a single proximity measure will obviously not be correct.</p><p>Multimodal clustering involves not one but several sets of objects to be clustered simultaneously. Such sets can be formed by heterogeneous data. A multimodal cluster is a subset in the form of combinations of objects from different sets. The very fact that certain objects are combined in a multimodal cluster can carry important information and serve as the basis for cluster interpretation.</p><p>The simplest variant of multimodal clustering is biclustering, which was first proposed in <ref type="bibr" target="#b3">[4]</ref>. Biclustering algorithms work on data in the form of matrices whose rows contain instances of objects and whose columns contain their attributes (features). 
Biclustering methods have been applied in various fields of data analysis, especially in bioinformatics, in the task of studying gene expression <ref type="bibr" target="#b4">[5]</ref>. There, it is precisely biclusters that make it possible to objectively assess the mutual influence of genes on biological processes.</p><p>Biclustering of data in an object-attribute representation is also used in Formal Concept Analysis (FCA) <ref type="bibr" target="#b5">[6]</ref>. FCA is a paradigm of conceptual modeling which studies how objects can be hierarchically grouped together according to their common attributes. Such a grouping of objects is in fact a biclustering of them. The output of standard FCA algorithms is a concept lattice, which contains hierarchically linked formal concepts that are biclusters <ref type="bibr" target="#b7">[7]</ref>. FCA methods for constructing concept lattices differ from the biclustering methods used, for example, in the analysis of gene expression, and FCA methods expand the possibilities of interpreting clusters. The hierarchy of concepts in the lattice reflects a hierarchy in the data that is not obvious in the original view. The use of concept lattices also makes it possible to build association rules and functional dependencies on clustered data.</p><p>Triclustering is a generalization of biclustering <ref type="bibr" target="#b8">[8]</ref>. However, the appearance of the third set in the data representation fundamentally changes the situation, and triclustering algorithms cannot be obtained by simply scaling biclustering ones. An overview of triclustering algorithms can be found in <ref type="bibr" target="#b9">[9]</ref>. In <ref type="bibr" target="#b10">[10]</ref>, triclustering algorithms implemented using FCA are analyzed.</p><p>The generalization of the approach based on the construction of concept lattices to the n-dimensional case of multimodal clustering is presented in <ref type="bibr" target="#b11">[11]</ref>. 
The best-known algorithm for n-dimensional multimodal clustering is the Data-Peeler algorithm <ref type="bibr" target="#b12">[12]</ref>, which is often used as a benchmark for comparison with other algorithms. The novelty of this work lies in the development of an approach to multimodal data clustering based on evolutionary computation. The fact that this approach is applied to a data model in the form of formal contexts allows us to take the heterogeneity of data into account naturally.</p><p>Evolutionary computation has been applied in clustering <ref type="bibr" target="#b13">[13,</ref><ref type="bibr" target="#b15">15]</ref>. Its main advantage is that evolutionary algorithms allow one to find a reasonable number of clusters corresponding to the global, or close to the global, extremum of the function that characterizes the quality of the solution of the clustering problem. This approach is also effective when there is no Euclidean proximity measure for the clustered data, which is relevant for heterogeneous data.</p><p>The rest of the paper is organized as follows. Section 2 gives a brief description of multimodal clustering on formal contexts. It contains the main definitions from FCA and an explanation of why triclustering is not simply scaled-up biclustering. Section 3 describes the main contribution of this paper, an evolutionary algorithm for multimodal clustering on formal contexts. Section 4 presents the results of an experimental study of the proposed approach, illustrated on the task of phenotyping myocardial infarction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Multimodal Clustering on Formal Contexts</head><p>We consider multimodal clustering on a formal context, which is the data model in FCA. That is why we briefly review the main notions of FCA <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b10">10,</ref><ref type="bibr" target="#b12">12]</ref>. Classical FCA deals with two basic notions: formal context and concept lattice. A formal context is a triple K = (G, M, I), where G is a set of objects, M is a set of their attributes, and I ⊆ G × M is a binary relation which records which attributes belong to which objects. A formal context may be represented by a {0, 1}-matrix K = {k i,j } in which ones mark the correspondence between objects g i ∈ G and attributes m j ∈ M . The concepts in a formal context are determined in the following way. If for subsets of objects A ⊆ G and attributes B ⊆ M there exist mappings, realized as prime operators A ′ : A → B and B ′ : B → A, with the following completeness properties</p><formula xml:id="formula_0">A' := \{ m \in M \mid \langle g, m \rangle \in I \text{ for all } g \in A \}, \quad B' := \{ g \in G \mid \langle g, m \rangle \in I \text{ for all } m \in B \}<label>(1)</label></formula><p>then a pair (A, B) such that A ′ = B, B ′ = A is called a formal concept. The composition of the mappings gives the following closure properties of A and B: A ′′ = A, B ′′ = B. The sets A and B are called the extent and the intent of the formal concept (A, B), respectively. 
In other words, a formal concept is a pair (A, B) of subsets of objects and attributes which are connected so that every object in A has every attribute in B; for every object in G that is not in A, there is an attribute in B that the object does not have; and for every attribute in M that is not in B, there is an object in A that does not have that attribute.</p><p>If for formal concepts (A 1 , B 1 ) and (A 2 , B 2 ) we have A 1 ⊆ A 2 and B 2 ⊆ B 1 , then</p><formula xml:id="formula_1">(A_1, B_1) \le (A_2, B_2)</formula><p>and the formal concept (A 1 , B 1 ) is less general than (A 2 , B 2 ). This order is represented by the concept lattice. A lattice is a partially ordered set in which every two elements have a unique supremum (also called a least upper bound or join) and a unique infimum (also called a greatest lower bound or meet).</p><p>In the concept lattice, formal concepts are hierarchically grouped together according to their common attributes. Such a grouping of objects is in fact a biclustering of them, i.e. clustering on two sets simultaneously, the set of objects and the set of attributes.</p><p>Example 1. Consider an example of a formal context and the concept lattice built on it. In Fig. <ref type="figure" target="#fig_0">1 a</ref>), a fragment of the data contained in the myocardial infarction database is shown in the form of a two-dimensional binary context. The context has 7 objects and 4 binary attributes. The objects are patient IDs, and the meaning of the attributes is as follows:</p><p>• gb is the presence of essential hypertension in a patient; • ant_im is the presence of an anterior myocardial infarction (left ventricular); • im_pg_p is the presence of a right ventricular myocardial infarction; • let_is is the lethal outcome of a patient.</p><p>The concept lattice built on the considered formal context is shown in Fig. <ref type="figure" target="#fig_0">1 b</ref>). 
It has seven formal concepts, including the top and bottom ones, which are the unit and zero elements of an abstract lattice, respectively. Information about patients is contained in five formal concepts, shown as filled circles. So-called reduced labeling <ref type="bibr" target="#b21">[21]</ref> is used in order to represent information about objects and attributes of the formal context succinctly. If the label of an attribute A is attached to some concept, this means that the attribute occurs in the objects of all concepts reachable by descending paths from this concept to the zero concept (bottom element) of the lattice. If the label of an object O is attached to some concept, this means that the object O lies in all concepts reachable by ascending paths in the lattice graph from this concept to the unit concept (top element) of the lattice. If the drawing of a node contains a blue filled upper semicircle, there is an attribute attached to this concept. If the drawing of a node contains a black filled lower semicircle, there is an object attached to this concept.</p><p>In this example, all the patients have an anterior myocardial infarction. In accordance with reduced labeling, we can see in the lattice that the ant_im attribute is present in all five filled concepts. Among the patients, those with IDs 1500, 1503 and 1655 have the lethal outcome. On the left of the lattice there is a separate path whose nodes are the following formal concepts:</p><p>({1500, 1503, 1655}, {ant_im, let_is}), ({1655}, {ant_im, let_is, gb}). The first concept is more general than the second, since it is located above it. The first concept reflects two facts: the presence of a myocardial infarction and a lethal outcome for all three patients. The second concept reflects the peculiarity of patient 1655: he is the only one of the three who has essential hypertension.</p><p>Thus, the formal context and the concept lattice are a visual tool for representing knowledge.</p></div>
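<div xmlns="http://www.tei-c.org/ns/1.0"><p>The prime operators (1) and the notion of a formal concept can be illustrated by a minimal Python sketch. It is only an illustration under stated assumptions: the toy context below is hypothetical (it is not the context of Fig. 1), the function names are ours, and the brute-force enumeration over all subsets of objects is practical only for very small contexts.</p>
<p><code lang="python"><![CDATA[
```python
from itertools import combinations

# Hypothetical toy context (NOT the data of Fig. 1): object -> set of attributes.
CONTEXT = {
    "p1": {"ant_im", "let_is"},
    "p2": {"ant_im", "let_is", "gb"},
    "p3": {"ant_im"},
    "p4": {"ant_im", "im_pg_p"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def prime_objects(objs):
    """A': the attributes shared by every object in objs."""
    result = set(ATTRIBUTES)
    for g in objs:
        result = result.intersection(CONTEXT[g])
    return result


def prime_attributes(attrs):
    """B': the objects that have every attribute in attrs."""
    return {g for g, has in CONTEXT.items() if set(attrs).issubset(has)}


def formal_concepts():
    """Enumerate all pairs (A, B) with A' = B and B' = A by closing every object subset."""
    found = set()
    objects = sorted(CONTEXT)
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            intent = prime_objects(subset)      # B = A'
            extent = prime_attributes(intent)   # A'' (closed extent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```
]]></code></p>
<p>On this toy context the enumeration yields five formal concepts, among them ({p1, p2}, {ant_im, let_is}) and the more specific ({p2}, {ant_im, let_is, gb}), which mirrors the kind of path discussed in Example 1.</p></div>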
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Multidimensional Formal Contexts and Multimodal Clustering</head><p>Multidimensional, or polyadic, formal contexts arise as a generalization of ordinary formal contexts. A multidimensional, n-ary formal context is defined by a relation R ⊆ D 1 × D 2 × . . . × D n on data domains D 1 , D 2 , . . . , D n . The context is an (n+1)-tuple:</p><formula xml:id="formula_2">K = \langle K_1, K_2, \ldots, K_n, R \rangle,<label>(2)</label></formula><p>where K i ⊆ D i . Every n-ary context gives rise to k-ary contexts, whose number is given by the Stirling number of the second kind <ref type="bibr" target="#b12">[12]</ref> </p><formula xml:id="formula_3">S(n, k) = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} (k - i)^{n}.</formula><p>As shown in <ref type="bibr" target="#b11">[11]</ref>, a multidimensional n-ary context also contains formal concepts, which also form a lattice. Clustering on multidimensional formal contexts is called multimodal clustering <ref type="bibr" target="#b10">[10]</ref>.</p><p>In multimodal clustering, for any dimension of the formal context, the purpose of its processing is to find n-sets H = &lt; X 1 , X 2 , . . . , X n &gt; which have the closure property <ref type="bibr" target="#b12">[12]</ref>:</p><formula xml:id="formula_4">\forall u = (x_1, x_2, \ldots, x_n) \in X_1 \times X_2 \times \cdots \times X_n, \quad u \in R,<label>(3)</label></formula><formula xml:id="formula_5">\forall j = 1, 2, \ldots, n, \ \forall x_j \in D_j \setminus X_j: \ \langle X_1, \ldots, X_j \cup \{x_j\}, \ldots, X_n \rangle \text{ does not satisfy (3).}</formula><p>The sets H = &lt; X 1 , X 2 , . . . , X n &gt; constitute multimodal clusters.</p><p>In Formal Concept Analysis, biclustering algorithms have been developed in sufficient detail <ref type="bibr" target="#b16">[16]</ref>. 
The same cannot be said about triclustering algorithms, and especially about multimodal clustering algorithms.</p><p>The simplest multidimensional formal context is a triadic context (tricontext) of the form T = (G, M, B, I), where B is a set specifying the conditions under which attributes belong to objects and I ⊆ G × M × B is a ternary relation. Accordingly, ternary concepts (triconcepts) are defined as triples of the form:</p><formula xml:id="formula_6">(C_1, C_2, C_3), \quad C_1 \subseteq G, \ C_2 \subseteq M, \ C_3 \subseteq B,<label>(4)</label></formula><p>with corresponding closure conditions for prime operators. The prime operators take a somewhat different form here:</p><formula xml:id="formula_7">m' = \{ (g, b) \mid (g, m, b) \in I \}, \quad g' = \{ (m, b) \mid (g, m, b) \in I \}, \quad b' = \{ (g, m) \mid (g, m, b) \in I \}<label>(5)</label></formula><p>as well as the corresponding double prime operators:</p><formula xml:id="formula_8">m'' = \{ \tilde{m} \mid (g, b) \in m', \ (g, \tilde{m}, b) \in I \}, \quad g'' = \{ \tilde{g} \mid (m, b) \in g', \ (\tilde{g}, m, b) \in I \}, \quad b'' = \{ \tilde{b} \mid (g, m) \in b', \ (g, m, \tilde{b}) \in I \}<label>(6)</label></formula><p>If the formal context T is represented by a three-dimensional tensor, then a triconcept is a three-dimensional rectangular box full of crosses.</p><p>Although there are several recognized algorithms for constructing three-dimensional formal concepts, for example the Data-Peeler algorithm <ref type="bibr" target="#b12">[12]</ref>, the problem of constructing three-dimensional clusters of a given density, that is, clusters that are not dense enough to be formal concepts, is of practical interest.</p><p>If C = (X, Y, Z), X ⊆ K 1 , Y ⊆ K 2 , Z ⊆ K 3 is a cluster, then its density is defined as</p><formula xml:id="formula_9">d(C) = \frac{|I \cap (X \times Y \times Z)|}{v(C)}<label>(7)</label></formula><p>where the cluster volume is</p><formula xml:id="formula_10">v(C) = |X| \times |Y| \times |Z|<label>(8)</label></formula><p>Example 2. A three-dimensional formal context can be built on the data from the infarction database presented in Example 1. 
The use of drugs on certain days of hospitalization may be the third dimension of the context. The standard maximum treatment time for myocardial infarction is 21 days (at least in Russia), which defines the scale of the third dimension. Fig. <ref type="figure">2</ref> shows a regular three-dimensional cluster built on the context from Example 1 with additional attributes. It was built with our evolutionary modeling framework <ref type="bibr" target="#b20">[20]</ref> but may be discovered by other tools. The second subset of the cluster contains the attributes lat_im, inf_im, post_im, which detail variants of myocardial infarction. The attribute tikl_s_n means the use of the drug Ticlid during therapy. It has values on the third dimension. In Fig. <ref type="figure">2</ref> it can be seen that Ticlid was used immediately (day 0) and on the 21st day of therapy.</p><p>The cluster in Fig. <ref type="figure">2</ref> is not dense. This means that not all combinations of elements from the three subsets of the cluster occur in the relation. The "informativeness" of multidimensional clusters depends on their density, but not absolutely.</p><p>Fig. <ref type="figure">2</ref>: Three-dimensional cluster.</p><p>In this example, we are not sure that Ticlid was applied to these three patients only on these days, but the fact contained in this cluster may be of interest to cardiologists. "Absolutely reliable" facts are contained in absolutely dense clusters, which are formal concepts. However, special facts that fall outside the general regularities can also be found in loose clusters. As for this example, this subset of patients may not form absolutely dense clusters, and the facts should be searched for in low-density clusters through additional analysis.</p><p>Evolutionary methods make it possible to simulate the clustering process in such a way that the density distribution across clusters can be investigated, giving a detailed picture of the clustering.</p></div>
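<div xmlns="http://www.tei-c.org/ns/1.0"><p>The quantities of this section are easy to check numerically. The following Python sketch evaluates the Stirling number S(n, k) by its explicit sum and computes the volume (8) and density (7) of a three-dimensional cluster. The ternary relation used here is a hypothetical toy example, not data from the myocardial infarction database, and all function names are our own.</p>
<p><code lang="python"><![CDATA[
```python
from itertools import product
from math import comb, factorial


def stirling2(n, k):
    """Stirling number of the second kind S(n, k), computed by the explicit sum."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)  # the sum is always divisible by k!


def volume(X, Y, Z):
    """Cluster volume v(C) = |X| * |Y| * |Z|, as in (8)."""
    return len(X) * len(Y) * len(Z)


def density(relation, X, Y, Z):
    """Cluster density d(C) = |I n (X x Y x Z)| / v(C), as in (7)."""
    vol = volume(X, Y, Z)
    if vol == 0:
        return 0.0  # convention assumed here for empty clusters
    hits = sum(1 for triple in product(X, Y, Z) if triple in relation)
    return hits / vol


# Hypothetical ternary relation I of (patient, attribute, day) triples.
I = {("p1", "tikl_s_n", 0), ("p1", "tikl_s_n", 21), ("p2", "tikl_s_n", 0)}

loose = density(I, ["p1", "p2"], ["tikl_s_n"], [0, 21])  # 3 of 4 cells are filled
dense = density(I, ["p1"], ["tikl_s_n"], [0, 21])        # every cell is filled
```
]]></code></p>
<p>In this toy relation the first cluster has density 0.75 and is therefore a loose cluster, while the second has density 1 and is absolutely dense in the sense used above.</p></div>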
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Evolutionary Approach to Multimodal Clustering</head><p>The evolutionary approach to multimodal clustering is based on evolutionary computation <ref type="bibr" target="#b15">[15]</ref>. Evolutionary computation is a term referring to several methods of global optimization, united by the fact that they all use the concept of the evolution of a set of solutions to an optimization problem, leading to solutions corresponding to the extreme value of some function that sets the optimization quality criterion. Evolutionary computation is effective when working with multimodal functions. If such a function has a global extremum, the evolutionary algorithm finds solutions whose quality function values are sufficiently close to that global extremum.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Principle of Evolutionary Computation</head><p>Let X be a set of solutions of a problem. Every solution x ∈ X can be characterized by a quality measure called the fitness function f(x). This measure, in general, is a mapping f : X → Y where Y ⊆ R is a subset of the set of real numbers. This generality is the basis for the many variations of fitness functions, which, for example, characterize the configuration of clusters.</p><p>Let the solutions of a problem depend on a set of parameters P . Most problems solved by Evolutionary Computation can be formulated as the following optimization problem: it is required to find optimal values of the parameters p * which deliver the maximum fitness value y * ∈ Y, so that the following holds:</p><formula xml:id="formula_11">p^{*} = \arg\max_{p \in P} f(x)<label>(9)</label></formula><p>The evolutionary approach to solving this problem consists of the following.</p><p>Building an encoding scheme. An encoding scheme is a mapping φ : P → S where the set S contains objects which encode parameters from P. Genetic algorithms <ref type="bibr" target="#b13">[13]</ref>, which are widely used in Evolutionary Computation, often use binary encoding, so that every value of p ∈ P is represented as a binary string. An encoding scheme is not necessarily binary (as it is not binary in Nature): every string position contains a symbol (gene) from the encoding alphabet, and there are variants of alphabets applied in encoding schemata <ref type="bibr" target="#b14">[14]</ref>. However, there must exist an inverse mapping φ −1 : S → P , so that for every s ∈ S there exists a p ∈ P .</p><p>Evolutionary algorithm. For a given encoding scheme, the following algorithm solves problem <ref type="formula" target="#formula_11">(9)</ref>.</p><p>A. Randomly generate an initial set (population) S 0 of objects from S. B. 
Start the evolution of the population by applying a set of operators A to the population S 0 and then iteratively, so that for every S k+1 = A(S k ) there exists at least one pair s k , s k+1 with</p><formula xml:id="formula_12">f[\varphi^{-1}(s_{k+1})] \ge f[\varphi^{-1}(s_{k})],<label>(10)</label></formula><p>where s k ∈ S k and s k+1 ∈ S k+1 . C. Finish the evolution of the population in accordance with the stopping criterion. Most often, the stopping criterion is that the fitness function values remain unchanged over several steps of evolution. If the set of operators A consists of the genetic operators of selection, mutation and recombination (crossover), then the evolutionary algorithm is called a genetic algorithm <ref type="bibr" target="#b14">[14]</ref>.</p><p>Selection works so that condition (10) is supported by the following "biological" principle: good parents produce good offspring (which is not always true in Nature). Thus chromosomes with higher fitness have more opportunity to be selected than those with lower fitness, and a good solution always survives into the next generation.</p><p>Crossover is the genetic operator that mixes two chromosomes together to form a new offspring. It does so by exchanging fragments of the chromosomes' code, cut at one or several randomly selected points.</p><p>Mutation involves modification of gene values by randomly selecting a new value from the alphabet at a random point in the string of genes.</p><p>Once implemented, the algorithm (A. -C.) provides a fairly accurate solution of problem <ref type="formula" target="#formula_11">(9)</ref>.</p><p>Fairly accurate means that the evolutionary algorithm stops in a neighbourhood of the global extremum of the fitness function f. The size of this neighbourhood depends on the fitness function and on the parameters of the genetic operators. When the evolutionary algorithm converges too fast, it may stop at a local extremum. 
This feature is traditionally considered a drawback of the algorithm, but it may be useful for clustering, since a local extremum of the quality measure may be "semantically better" than the global extremum: it may correspond to a cluster containing an interesting fact. In our experiments we have observed exactly such situations.</p></div>
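<div xmlns="http://www.tei-c.org/ns/1.0"><p>Steps A to C can be sketched as a small genetic algorithm in Python. This is a minimal illustration and not the implementation used in our experiments: the binary encoding, truncation selection with elitism, one-point crossover, the default parameter values and the function name evolve are all assumptions made for the example, and any fitness function defined on a list of bits can be passed in.</p>
<p><code lang="python"><![CDATA[
```python
import random


def evolve(fitness, length, pop_size=20, p_mut=0.05, patience=10, max_gen=200, seed=0):
    """Minimal genetic algorithm over binary strings; returns (best, best-fitness history)."""
    rng = random.Random(seed)
    # Step A: random initial population S_0 of binary chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    stale = 0
    for _ in range(max_gen):
        pop.sort(key=fitness, reverse=True)
        best = fitness(pop[0])
        # Step C: stop when the best fitness is unchanged for `patience` generations.
        if history and best == history[-1]:
            stale += 1
        else:
            stale = 0
        history.append(best)
        if stale >= patience:
            break
        # Step B: selection, crossover and mutation produce S_{k+1} from S_k.
        parents = pop[: pop_size // 2]      # truncation selection (an assumption)
        children = [pop[0][:]]              # elitism: the best chromosome survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each gene with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    # Toy fitness: the number of ones in the chromosome.
    best, history = evolve(sum, length=16)
    print(best, history[-1])
```
]]></code></p>
<p>Because the best chromosome is always copied into the next generation, the recorded best fitness never decreases, which is one simple way to satisfy condition (10). Raising p_mut increases diversity, while lowering patience makes a stop at a local extremum more likely.</p></div>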
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Evolutionary Computation in Multimodal Clustering</head><p>Here we outline our solution for applying evolutionary computation to clustering in multidimensional contexts.</p><p>There are two crucial parameters of a clustering problem: the measure of similarity of the objects being clustered (proximity measure) and the number of clusters, namely whether or not it is given before clustering. Evolutionary algorithms have an advantage over traditional clustering methods when:</p><p>1. the measure of similarity of the objects being clustered is not a traditional one (Euclidean norm) <ref type="bibr" target="#b22">[22]</ref>; 2. the number of clusters is not given; and 3. the number of clusters is large.</p><p>Consider this in more detail.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>1. Many clustering algorithms use proximity measures of objects based on calculating the distances between them. Evolutionary and genetic algorithms work on the black box principle. The black box inputs are the values of the parameters P (see the previous section), and at the outputs we get the corresponding solutions, for which we calculate the values of the fitness function.</p><p>The dependence between input and output may be very complex and may not be expressible analytically, unlike traditional proximity measures. In our experiments, there is no analytical dependence between the cluster configurations specified by chromosomes and, for example, cluster densities. Therefore, the fitness function used in this case does not rely on a traditional proximity measure.</p><p>2. In evolutionary and genetic algorithms for clustering <ref type="bibr" target="#b15">[15]</ref>, the number of clusters obtained depends on the chromosome encoding. After analyzing the existing variants of chromosome encoding, we settled on two of them. The first variant is our chained integer encoding scheme <ref type="bibr" target="#b20">[20]</ref>. The second is a binary scheme organized according to the principle "one chromosome - one cluster". In this scheme, a chromosome gene corresponds to an object number. In this encoding, the maximal number of clusters is equal to the number of chromosomes. The number of clusters is not set initially. As a result of the evolution of n chromosomes, only k distinct chromosomes are expected to remain in the population.</p><p>3. Models in the form of formal contexts give rise to a large number of formal concepts and an even greater number of clusters. For n-dimensional contexts, the upper bound on the number of clusters is estimated as 2^|K1| × . . . × 2^|Kn| <ref type="bibr" target="#b11">[11]</ref>. 
In the study of gene expression with evolutionary algorithms, the number of genes in experiments may exceed tens of thousands, and the chromosomes which represent clusters may be extremely long. Nevertheless, the computational problem of processing very long chromosomes (usually binary) has now been solved <ref type="bibr" target="#b9">[9,</ref><ref type="bibr" target="#b18">18]</ref>. We performed evolutionary clustering with a genetic algorithm that implements the evolutionary algorithm (A. -C.) with various parameters.</p><p>Chromosome encoding. After analyzing the existing variants of chromosome encoding <ref type="bibr" target="#b15">[15]</ref>, we settled on two of them. The first variant is our chained integer encoding scheme <ref type="bibr" target="#b20">[20]</ref>. The second is a binary scheme organized according to the principle "one chromosome - one cluster". A chromosome has one, two or three sections, depending on the variant of encoding (see Section 3.1) and the dimension of the context. Chromosomes for three-dimensional contexts have the sections "patients", "attributes" and "days". Within a section, the number of a gene is the number of a patient, of an attribute from the context, or of a day, according to the order of objects in the corresponding subsets of the formal tricontext. Different chromosomes form different clusters. As a result of the evolution of many such chromosomes, only k distinct chromosomes out of the n members of the population are expected to remain.</p><p>Fitness function. As in FCA, we control the cluster density <ref type="formula" target="#formula_9">(7)</ref>, its volume (8) and a special kind of interestingness. There is a trade-off between the density and the volume of triclusters <ref type="bibr" target="#b10">[10,</ref><ref type="bibr" target="#b17">17]</ref>. Depending on the data, density and volume may be conflicting characteristics of clusters. 
The data that we use are sparse, and if we collect enough ones in a cluster, it will at the same time be voluminous. Therefore, we do not use the volume of clusters in the fitness function but only their density. Nevertheless, we calculate cluster volumes during evolution.</p><p>For the binary encoding scheme, the fitness function has the form:</p><formula xml:id="formula_13">f(d) = \frac{1}{N} \sum_{i=1}^{N} \alpha_i \, d(C_i),<label>(11)</label></formula><p>where α i is a user-defined coefficient, which in general depends on the cluster density, and N is the number of chromosomes in the population, which is equal to the maximal number of clusters.</p><p>For the chained integer encoding scheme, the fitness function is the following:</p><formula xml:id="formula_14">f(d) = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{K_j} \sum_{i=1}^{K_j} \alpha_i \, d(C_i),<label>(12)</label></formula><p>where K j is the number of clusters in the j-th chromosome.</p><p>Interestingness of clusters. In a genetic algorithm, the overall fitness of the population hides the features of individual chromosomes. But if selection keeps the chromosomes with maximum fitness, then there is a chance that they will lead evolution to good solutions. The patient ID values found in clusters, and the other attributes corresponding to them from the "treatment" and "treatment outcomes" domains, are evaluated for the presence of information that can be treated as facts. The formal criteria for selecting such "interesting" clusters are the following.</p><p>A single cluster. The presence of a single cluster at the end of evolution means that the algorithm most likely stopped at the global extremum of the fitness function <ref type="bibr" target="#b14">[14,</ref><ref type="bibr" target="#b22">22]</ref>. That cluster may be interesting, and it is probably dense. But, as mentioned above, the diversity of clustering results is important, and the presence of a single cluster is a reason to change the parameters of the algorithm in order to obtain several extrema of the fitness function.</p><p>Dense clusters. 
The densest of the obtained clusters are very important, since the information they contain can be treated as facts with the most confidence.</p><p>Clusters of the maximum volume are interesting if they are dense enough. A large number of objects from different domains in a cluster is a sign that some pattern occurs in the data.</p><p>Clusters with given values of density and volume are interesting when the density or volume values are specified in relation to some other characteristics of the clustering task.</p></div>
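<div xmlns="http://www.tei-c.org/ns/1.0"><p>The fitness functions (11) and (12) can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments. It assumes that the density of a cluster is the share of its cells that belong to the ternary relation of the tricontext, that a binary chromosome is a flat list of 0/1 genes split into sections of known sizes, and that the coefficient α is given as a function of density that defaults to 1. The names decode, density, fitness_binary, fitness_chained and alpha, as well as the toy tricontext, are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
from itertools import product


def decode(chromosome, sizes):
    """Split a binary chromosome into one list of selected indices per section."""
    sections, start = [], 0
    for size in sizes:
        bits = chromosome[start:start + size]
        sections.append([i for i, bit in enumerate(bits) if bit])
        start += size
    return sections


def density(cluster, relation):
    """Assumed density: share of the cluster's cells that belong to the relation."""
    volume = 1
    for section in cluster:
        volume *= len(section)
    if volume == 0:
        return 0.0  # an empty section gives an empty cluster
    hits = sum(1 for cell in product(*cluster) if cell in relation)
    return hits / volume


def fitness_binary(population, sizes, relation, alpha=lambda d: 1.0):
    """Eq. (11): mean of alpha * d(C_i) over the N one-cluster chromosomes."""
    values = [density(decode(ch, sizes), relation) for ch in population]
    return sum(alpha(d) * d for d in values) / len(values)


def fitness_chained(population, relation, alpha=lambda d: 1.0):
    """Eq. (12): average over the K_j clusters of each chromosome, then over chromosomes."""
    per_chromosome = []
    for clusters in population:  # assumes every chromosome holds at least one cluster
        values = [density(c, relation) for c in clusters]
        per_chromosome.append(sum(alpha(d) * d for d in values) / len(values))
    return sum(per_chromosome) / len(per_chromosome)


# Toy tricontext (hypothetical): 3 patients, 2 attributes, 2 days.
relation = {(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0), (2, 1, 1)}
sizes = (3, 2, 2)
population = [
    [1, 1, 0, 1, 1, 1, 0],  # patients {0, 1}, attributes {0, 1}, day {0}
    [1, 0, 1, 0, 1, 0, 1],  # patients {0, 2}, attribute {1}, day {1}
]
print(fitness_binary(population, sizes, relation))
```
]]></code></p>
<p>In this toy example the first chromosome decodes to an absolutely dense cluster and the second to a cluster of density 0.5, so the population fitness under (11) with α = 1 is 0.75. Passing a different alpha function changes how strongly dense clusters are rewarded without changing the rest of the sketch.</p></div>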
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experiments</head><p>Experimental studies of the proposed approach were carried out to test the performance of the genetic algorithm under various parameters and to check whether clustering results can be interpreted as facts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Data Set</head><p>We use the Myocardial Infarction Complications Data Set <ref type="bibr" target="#b19">[19]</ref> for experiments. It contains information about 1700 patients with myocardial infarction. The data set has 1700 objects and 124 attributes collected in a multivalued formal context. The attributes describe the patients (ID only), their anamnesis, their treatment methods, and complications after treatment. An attribute is either binary or takes a natural number as its value.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Clustering Task</head><p>Using the myocardial infarction data, we solved the task of phenotyping this disease. Phenotyping refers to determining the form of the disease from the clinical profile. A clinical profile is a cluster that can include various data describing the disease itself, the methods of its treatment, and the conditions of patients. Therefore, we were interested in various triples of attribute sets from the domains "patient", "treatment", "treatment results".</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Investigated Properties of the Algorithm</head><p>In the experiments, we investigated, first of all, the effectiveness of various options for implementing the evolutionary clustering algorithm, as well as its performance.</p><p>Diversity in clusters. To ensure diversity in clusters, we used large values of the mutation probability. The graph in Fig. <ref type="figure" target="#fig_1">3</ref> shows that once the mutation probability reaches a certain threshold, the number of clusters increases sharply and then remains unchanged as the mutation probability grows further. In this case, the algorithm found 30 local extrema of the fitness function.</p><p>The combination of cluster density and volume in a single fitness function masks certain relationships between attributes in clustering results. Therefore, our principal conclusion here is that this clustering task requires two optimization criteria in the genetic algorithm.</p><p>Comparison with Data-Peeler. We were also interested in absolutely dense clusters, that is, formal concepts. As expected, there were few such clusters, which follows from the sparsity of the myocardial infarction data. To compare our results with another well-known algorithm, we selected Data-Peeler <ref type="bibr" target="#b12">[12]</ref> and modernized its code [23] by adding a graphical user interface. The comparison is shown in Table <ref type="table" target="#tab_2">2</ref>. The results in the last row of Table <ref type="table" target="#tab_2">2</ref> can be explained by the high sparsity of data in this formal context. Accordingly, the Data-Peeler algorithm built a large number of small concepts.</p></div>
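<div xmlns="http://www.tei-c.org/ns/1.0"><p>One common way to work with two optimization criteria at once is Pareto dominance. The sketch below only illustrates that idea and is not the method used in the experiments reported here. It assumes that each cluster is summarized by a (density, volume) pair and that both values are to be maximized. The function names dominates and pareto_front and the sample pairs are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
def dominates(a, b):
    """True if a is no worse than b in both criteria and strictly better in one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(clusters):
    """Keep the (density, volume) pairs that no other pair dominates."""
    return [c for c in clusters if not any(dominates(other, c) for other in clusters)]


# Hypothetical (density, volume) pairs, not results from the experiments.
clusters = [(1.0, 4), (0.8, 12), (0.5, 30), (0.7, 10), (0.5, 8)]
front = pareto_front(clusters)
print(front)
```
]]></code></p>
<p>With these sample values, the pairs (0.7, 10) and (0.5, 8) are dominated by (0.8, 12) and drop out, while the small dense cluster and the large sparse cluster both stay on the front. This is the kind of trade-off that a single scalar fitness value cannot show.</p></div>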
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Facts Extracted with Clustering</head><p>We were interested in special clusters. First of all, these are clusters with large groups of patients characterized by certain combinations of attributes from the domains "patient", "treatment", "treatment results". Several such groups were obtained.</p><p>1. We have found that a lethal outcome of myocardial infarction is characteristic of elderly patients over 60 years of age. This fact is consistent with the known data of cardiology. 2. In more detail, cases of heart attack in the anamnesis correlate with a fatal outcome, which also looks natural.</p><p>For both of these groups of patients, we found absolutely dense clusters built on tensors with age and anamnesis attributes.</p><p>Unexpected result. We have found one unexpected result. In the myocardial infarction data, there are stable (not changing under different parameters of the genetic algorithm) and rather dense clusters in which a subgroup of patients with a lethal outcome did not receive certain drugs. At the same time, patients with a non-lethal outcome did receive these drugs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>This paper proposes an approach to multimodal clustering on multidimensional formal contexts using evolutionary computation. The approach has been shown to be effective in experiments on clustering three-dimensional formal contexts based on data of patients with myocardial infarction.</p><p>The presented experimental results reflect the initial stage of research in this area. In the future, we plan to do the following.</p><p>1. Evaluate the informativeness of the obtained clusters not manually, but using a user interface focused on doctors. 2. Apply multi-objective evolutionary clustering with appropriate algorithms, since the experiments have confirmed that the criteria of cluster density and volume contradict each other. 3. Move to formal contexts of dimension greater than three. Separate groups of parameters can be represented as dimensions. Their combinations obtained in clusters will then reflect the relationships in heterogeneous data in more detail.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 :</head><label>1</label><figDesc>Fig. 1: Example of a formal context and the concept lattice.</figDesc><graphic coords="4,190.10,341.99,236.40,194.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 3 :</head><label>3</label><figDesc>Fig. 3: Effect of the mutation probability value on the number of clusters. Scaling algorithm performance. The algorithm processes very long three-section chromosomes of about 2000 genes fairly quickly. To investigate the performance of the algorithm, we constructed seven formal contexts from the whole data set; their numbers of objects and attributes are shown in Table 1, where ECG is electrocardiogram.</figDesc><graphic coords="12,178.98,232.95,257.95,133.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 4</head><label>4</label><figDesc>Fig. 4 shows the clustering execution time for each of the seven contexts. The graph shows that the cluster construction time depends quasi-linearly on the amount of data.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 4, Fig. 5:</head><label>4, 5</label><figDesc>Fig. 4: Clustering execution time for several formal contexts. Conflicting criteria. As we expected, the criteria of maximizing the volume and the density of clusters contradict each other. The graphs in Fig. 5 show that the volume (graph a) and the density (graph b) of clusters change synchronously during evolution.</figDesc><graphic coords="13,134.77,327.22,183.19,119.53" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1:</head><label>1</label><figDesc>Number of objects and attributes of formal contexts.</figDesc><table><row><cell>Context</cell><cell>Objects</cell><cell>Attributes</cell></row><row><cell>Anamnesis</cell><cell>1700</cell><cell>33</cell></row><row><cell>Therapy</cell><cell>1700</cell><cell>24</cell></row><row><cell>Analyzes</cell><cell>1700</cell><cell>19</cell></row><row><cell>Infarct</cell><cell>1700</cell><cell>6</cell></row><row><cell>ECG</cell><cell>1700</cell><cell>27</cell></row><row><cell>Therapy results</cell><cell>1700</cell><cell>14</cell></row><row><cell>Full data</cell><cell>1700</cell><cell>123</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2:</head><label>2</label><figDesc>Clustering results compared with Data-Peeler.</figDesc><table><row><cell>Formal context</cell><cell>Number of clusters</cell><cell>Number of dense clusters</cell><cell>Number of Data-Peeler concepts</cell></row><row><cell>Anamnesis</cell><cell>30</cell><cell>14</cell><cell>449639</cell></row><row><cell>Therapy</cell><cell>30</cell><cell>19</cell><cell>28599</cell></row><row><cell>Analyzes</cell><cell>30</cell><cell>17</cell><cell>162</cell></row><row><cell>Infarct</cell><cell>30</cell><cell>30</cell><cell>65</cell></row><row><cell>ECG</cell><cell>30</cell><cell>10</cell><cell>689011</cell></row><row><cell>Therapy results</cell><cell>30</cell><cell>12</cell><cell>7798</cell></row><row><cell>Full data</cell><cell>30</cell><cell>4</cell><cell>12564890</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgments. The reported study was funded by the Russian Foundation for Basic Research (RFBR), research projects № 19-07-01178 and № 20-07-00055, and by RFBR and Tula Region, research project № 19-47-710007.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Data Clustering: Algorithms and Applications</title>
		<editor>Charu Aggarwal, Chandan Reddy</editor>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>CRC Press</publisher>
			<pubPlace>London</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Pattern Recognition</title>
		<author>
			<persName><forename type="first">S</forename><surname>Theodoridis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Koutroumbas</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>Academic Press</publisher>
		</imprint>
	</monogr>
	<note>3rd edn</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Permutable k-means for the Bi-partial Criterion</title>
		<author>
			<persName><forename type="first">Sergey</forename><forename type="middle">D</forename><surname>Dvoenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><forename type="middle">W</forename><surname>Owsinski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Informatica</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="253" to="262" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Direct clustering of a data matrix</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Hartigan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American statistical association</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="issue">337</biblScope>
			<biblScope unit="page" from="123" to="129" />
			<date type="published" when="1972">1972</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Biclustering algorithms for biological data analysis: a survey</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">C</forename><surname>Madeira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Oliveira</surname></persName>
		</author>
		<idno type="DOI">10.1109/TCBB.2004.2</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE/ACM Trans. Comput. Biol. Bioinform.</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="24" to="45" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">Bernhard</forename><surname>Ganter</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Formal Concept Analysis: Foundations and Applications</title>
		<idno type="DOI">10.1007/978-3-540-31881-1</idno>
	</analytic>
	<monogr>
		<title level="s">Lecture Notes in Artificial Intelligence</title>
		<editor>Stumme, Gerd</editor>
		<editor>Wille, Rudolf</editor>
		<imprint>
			<biblScope unit="volume">3626</biblScope>
			<date type="published" when="2005">2005</date>
			<publisher>Springer-Verlag</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Biclustering Numerical Data in Formal Concept Analysis</title>
		<author>
			<persName><forename type="first">Mehdi</forename><surname>Kaytoue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sergei</forename><surname>Kuznetsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amedeo</forename><surname>Napoli</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-642-20514-9_12</idno>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="135" to="150" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Can triconcepts become triclusters?</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">I</forename><surname>Ignatov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Kuznetsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">E</forename><surname>Zhukov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Poelmans</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of General Systems</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">6</biblScope>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Triclustering Algorithms for Three-Dimensional Data Analysis: A Comprehensive Survey</title>
		<author>
			<persName><forename type="first">R</forename><surname>Henriques</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Madeira</surname></persName>
		</author>
		<idno type="DOI">10.1145/3195833</idno>
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv.</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1" to="43" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">From Triadic FCA to Triclustering: Experimental Comparison of Some Triclustering Algorithms</title>
		<author>
			<persName><forename type="first">Dmitry</forename><forename type="middle">V</forename><surname>Gnatyshak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dmitry</forename><forename type="middle">I</forename><surname>Ignatov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sergei</forename><forename type="middle">O</forename><surname>Kuznetsov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Tenth International Conference on Concept Lattices and Their Applications (CLA&apos;2013)</title>
				<meeting>the Tenth International Conference on Concept Lattices and Their Applications (CLA&apos;2013)<address><addrLine>La Rochelle</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="249" to="260" />
		</imprint>
		<respStmt>
			<orgName>Laboratory L3i, University of La Rochelle</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Polyadic concept analysis</title>
		<author>
			<persName><forename type="first">G</forename><surname>Voutsadakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Order</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="295" to="304" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Closed Patterns Meet N-ary Relations</title>
		<author>
			<persName><forename type="first">L</forename><surname>Cerf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Besson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Robardet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Boulicaut</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Knowl. Discov. Data</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">36</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Clustering with Genetic Algorithms</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M</forename><surname>Cole</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1998">1998</date>
			<pubPlace>Australia</pubPlace>
		</imprint>
		<respStmt>
			<orgName>University of Western Australia</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">MSc Thesis</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Genetic Algorithms in Search Optimization and Machine Learning</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Goldberg</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1989">1989</date>
			<publisher>Addison-Wesley</publisher>
			<pubPlace>Reading, MA, USA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A Survey of Evolutionary Algorithms for Clustering</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hruschka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Campello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Freitas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>de Carvalho</surname></persName>
		</author>
		<idno type="DOI">10.1109/TSMCC.2008.2007252</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="133" to="155" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Comparing Performance of Algorithms for Generating Concept Lattices</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">O</forename><surname>Kuznetsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Obiedkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Experimental and Theoretical Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="189" to="216" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Triadic Formal Concept Analysis and triclustering: searching for optimal patterns</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">I</forename><surname>Ignatov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">V</forename><surname>Gnatyshak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sergei</forename><forename type="middle">O</forename><surname>Kuznetsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Boris</forename><forename type="middle">G</forename><surname>Mirkin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Machine Learning</title>
				<imprint>
			<date type="published" when="2015-04">April, 2015</date>
			<biblScope unit="page" from="1" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">An evolutionary clustering algorithm for gene expression microarray data analysis</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chiu</surname></persName>
		</author>
		<idno type="DOI">10.1109/TEVC.2005.859371</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Evolutionary Computation</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="296" to="314" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<ptr target="http://archive.ics.uci.edu/ml/machine-learning-databases/00579/" />
		<title level="m">Myocardial infarction complications Data Set</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Framework for Evolutionary Modelling in Text Mining</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Y</forename><surname>Bogatyrev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P</forename><surname>Terekhov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the SENSE&apos;09 - Conceptual Structures for Extracting Natural Language Semantics, Workshop at 17th International Conference on Conceptual Structures (ICCS&apos;09)</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="26" to="37" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">System of data analysis &quot;Concept Explorer&quot;</title>
		<author>
			<persName><forename type="first">Serhiy</forename><forename type="middle">A</forename><surname>Yevtushenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th National Conference on Artificial Intelligence KII-2000</title>
				<meeting>the 7th National Conference on Artificial Intelligence KII-2000<address><addrLine>Moscow</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="127" to="134" />
		</imprint>
	</monogr>
	<note>In Russian</note>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Evolutionary Optimization Algorithms: Biologically-Inspired and Population-Based Approaches to Computer Intelligence</title>
		<author>
			<persName><forename type="first">Dan</forename><surname>Simon</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>John Wiley &amp; Sons</publisher>
			<pubPlace>New Jersey</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
