<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Reducing the Size of Concept Lattices: The JBOS Approach</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Sérgio</forename><forename type="middle">M</forename><surname>Dias</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">Federal University of Minas Gerais (UFMG)</orgName>
								<address>
									<addrLine>Av. Antônio Carlos 6627 - ICEx - 4010 - Pampulha</addrLine>
									<postCode>31.270-010</postCode>
									<settlement>Belo Horizonte, Minas Gerais</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Newton</forename><forename type="middle">J</forename><surname>Vieira</surname></persName>
							<email>nvieira@dcc.ufmg.br</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">Federal University of Minas Gerais (UFMG)</orgName>
								<address>
									<addrLine>Av. Antônio Carlos 6627 - ICEx - 4010 - Pampulha</addrLine>
									<postCode>31.270-010</postCode>
									<settlement>Belo Horizonte, Minas Gerais</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Reducing the Size of Concept Lattices: The JBOS Approach</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">A12FEDACE537D6D5967B16ED6B7F920F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>Formal Concept Analysis, Concept Lattices</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Formal concept analysis (FCA) is used nowadays in a large number of applications in several different areas. Nevertheless, in some applications the size of a concept lattice may be prohibitively large, thus creating a need for new approaches and algorithms to reduce concept lattices to a manageable size. This paper presents a new method for reducing a concept lattice's complexity and introduces a new methodology for evaluating reduction methods. The main idea of the proposed approach, junction based on objects similarity (JBOS), is the substitution of groups of "similar" objects by prototypical ones. As a consequence, the resulting lattice is simplified in its size and structure. The user controls the degree of simplification by means of the specification of parameters. Two measures are proposed for the evaluation of the final lattice with respect to the original one, the first related to the consistency of the reduced lattice and the other to the descriptive losses incurred by the reduction. The JBOS approach was applied to two data sets from the UCI machine learning repository. The results show that it is possible to reduce the size of a concept lattice in a controlled way. In particular, it makes possible the use of FCA in a data set (formal context) for which that use was previously impossible.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Formal concept analysis (FCA) is a technique of applied mathematics based on the mathematization of the notion of concept, consisting of intension and extension, and on the organization of concepts into a conceptual hierarchy <ref type="bibr" target="#b7">[8]</ref>. Its formalization originated in 1982 with the work presented by Wille <ref type="bibr" target="#b17">[18]</ref>, who proposed to consider each lattice element as a formal concept and the lattice itself as representing a conceptual hierarchy. Currently there is a growing interest in applications of FCA in information retrieval <ref type="bibr" target="#b11">[12]</ref>, data mining <ref type="bibr" target="#b15">[16]</ref>, neural networks <ref type="bibr" target="#b18">[19]</ref>, social networks <ref type="bibr" target="#b10">[11]</ref>, etc.</p><p>The great potential of FCA comes from the organization of knowledge into a conceptual hierarchy. In fact, the main applications make use of this hierarchy, representing it by means of a line diagram, a nested line diagram, a tree diagram, etc. However, generating all formal concepts and classifying them hierarchically exhibits, in the worst case, exponential behavior <ref type="bibr" target="#b13">[14]</ref>. Although this behavior is rarely found in practical cases <ref type="bibr" target="#b9">[10]</ref>, the computational cost can be prohibitive for many applications.</p><p>In general, techniques for reducing the complexity of concept lattices can be applied either to the formal context or to the concept lattice itself. A simple way to reduce the size of the concept lattice would be, for example, the creation of a cutting threshold above which only the formal concepts considered "relevant" would be selected and preserved (of course, in this case it would be necessary to develop some criterion of relevance). 
A simple kind of reduction applied to the formal context would be the junction of rows and/or columns as suggested by Ganter and Wille <ref type="bibr" target="#b7">[8]</ref>: if for two objects g and h, g ′ = h ′ , then g and h can be replaced by one single object; dually, if for two attributes m and n, m ′ = n ′ , then m and n can be replaced by one single attribute. Although it reduces the formal context, this method does not reduce the size of the concept lattice.</p><p>Another interesting possibility is to apply techniques of the KDD process (knowledge discovery in databases) <ref type="bibr" target="#b14">[15]</ref> to reduce the size of the formal context. Gajdos et al. <ref type="bibr" target="#b6">[7]</ref> and Karen et al. <ref type="bibr" target="#b3">[4]</ref> propose the use of SVD (singular value decomposition), a technique capable of reducing the dimensionality of large data sets. Another very interesting technique for reducing the concept lattice based on the formal context was presented by Belohlávek et al. <ref type="bibr" target="#b1">[2]</ref>. The authors propose to use the knowledge of the user to create restrictions on attributes of the formal context. These restrictions, called AD-formulas (attribute-dependency formulas), are used during the generation of formal concepts and concept lattice construction to retain only the formal concepts conforming to them.</p><p>Among the main techniques applied directly to the concept lattice is that of Arévalo et al. <ref type="bibr" target="#b0">[1]</ref>, which proposes the creation of a Galois subhierarchy. In this case, the only formal concepts selected are those in C g ∪ C m , where</p><formula xml:id="formula_0">C g = {(g ′′ , g ′ ) | g ∈ G} and C m = {(m ′′ , m ′ ) | m ∈ M }.</formula><p>Another option to reduce the complexity of the concept lattice is to carry the notion of frequent itemsets from the KDD process over to formal concepts <ref type="bibr" target="#b14">[15]</ref>. 
A formal concept (X, Y ) is considered frequent if, and only if, sup(Y, G) ≥ minSup, where sup(Y, G) (the support) is the number of objects in G which contain all the attributes in Y and minSup is a minimal support provided as a parameter by the user. Another very interesting technique for reducing complexity is based on the concept of stability proposed by Kuznetsov <ref type="bibr" target="#b12">[13]</ref>. Unlike other methods of reduction, the stability method seeks to create an index for concepts, indicating how much the intent of a concept depends on its set of objects. Using this index, one can set a threshold below which all formal concepts are removed. Recently, Jay et al. <ref type="bibr" target="#b10">[11]</ref> applied the concepts of stability and iceberg lattices in social network analysis.</p><p>Despite the range of techniques to reduce the complexity of concept lattices, no methodologies were found for the analysis of these techniques. Obviously, any proposal should be accompanied by a measure of how the resulting concept lattice relates to the original lattice. Thus, this paper presents a methodology based on two metrics. The first aims to evaluate the degree of consistency of the reduced lattice with respect to the original data. The second aims to assess the loss of the ability to discriminate between the features of objects.</p><p>We propose a new method for reducing the complexity of the concept lattice, called junction based on objects similarity (JBOS), which seeks to replace groups of similar objects by representative objects. The method was applied to two databases of the UCI machine learning repository, one which leads to a concept lattice of moderate size and the other which leads to a very large concept lattice. The results were evaluated through the proposed methodology, which showed, for each of the databases, the most promising levels of reduction based on the values for the parameters of the JBOS method. 
In particular, the JBOS method was able to reduce the complexity of the large concept lattice, making its analysis feasible in the reduced form.</p><p>This paper is organized in five sections. In the next section a short review of formal concept analysis is presented, mainly in order to fix the terminology. In the third section, the method for reducing the complexity of concept lattices is presented and the methodology for the evaluation of reduction methods is proposed. In the fourth, a case study is presented. Finally, the contributions and conclusions of this work are presented.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Formal Concept Analysis -A short review</head><p>This very short review is mainly based on the notions and terminology exposed in <ref type="bibr" target="#b7">[8]</ref>. Formal concept analysis is a field of mathematics which was born in the early eighties <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b7">8]</ref>. Its main characteristic is knowledge representation by means of specific diagrams called line diagrams (based on Hasse diagrams), which are what mathematicians call diagrams of (concept) lattices. In FCA, formal contexts are primordial elements. A formal context is a triplet (G, M, I), where G is a set whose elements are called objects, M is a set whose members are called attributes and I ⊆ G × M is a relation termed the incidence relation. If (g, m) ∈ I, one says that "the object g has the attribute m". A formal context is usually represented by a cross table where the objects are row headers, the attributes are column headers and there is a cross in row g and column m if and only if (g, m) ∈ I.</p><p>Given a set of objects A ⊆ G from a formal context (G, M, I), it could be asked which attributes from M are common to all those objects. Similarly, it could be asked, for a set B ⊆ M , which objects have all the attributes from B. These questions are answered by the derivation operators, defined as:</p><formula xml:id="formula_1">A ′ = {m ∈ M | ∀g ∈ A : (g, m) ∈ I} and B ′ = {g ∈ G | ∀m ∈ B : (g, m) ∈ I}.</formula><p>A formal concept is a pair (A, B) ∈ P(G) × P(M ) such that A ′ = B and B ′ = A, where A is called the extent and B the intent of the concept. The set of formal concepts is ordered by the partial order ≤ such that for any two formal concepts (A 1 , B 1 ) and (A 2 , B 2 ), (A 1 , B 1 ) ≤ (A 2 , B 2 ) if and only if A 1 ⊆ A 2 (and, in this case, B 2 ⊆ B 1 ). The set of concepts ordered by ≤ constitutes a complete lattice <ref type="bibr" target="#b4">[5]</ref> termed the concept lattice. 
The concept lattice obtained from a formal context (G, M, I) is denoted by B(G, M, I).</p><p>The first part of the basic theorem on concept lattices says that a concept lattice B(G, M, I) is a complete lattice in which for any arbitrary set C ⊆ B(G, M, I) the infimum and supremum are given by ⋀C = (⋂X, (⋃Y) ′′ ) and ⋁C = ((⋃X) ′′ , ⋂Y), where</p><formula xml:id="formula_2">X = {A | (A, B) ∈ C} and Y = {B | (A, B) ∈ C} [8].</formula></div>
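The derivation operators and the formal-concept condition just reviewed can be illustrated with a minimal Python sketch on a toy context; the function names and the example data are ours, not from the paper.

```python
# Sketch of the derivation operators for a formal context (G, M, I),
# with objects/attributes as strings and I as a set of (g, m) pairs.
# All names here are illustrative.

def derive_objects(A, I, M):
    """A' = attributes shared by every object in A."""
    return {m for m in M if all((g, m) in I for g in A)}

def derive_attributes(B, I, G):
    """B' = objects having every attribute in B."""
    return {g for g in G if all((g, m) in I for m in B)}

# Toy context: 3 objects, 3 attributes.
G = {"g1", "g2", "g3"}
M = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"),
     ("g2", "m1"), ("g2", "m2"),
     ("g3", "m3")}

A = {"g1", "g2"}
B = derive_objects(A, I, M)          # the common attributes of A
# (A, B) is a formal concept iff A' = B and B' = A:
assert derive_attributes(B, I, G) == A
```

Here ({g1, g2}, {m1, m2}) is a formal concept of the toy context, since each side is exactly the derivation of the other.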
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Complexity reduction</head><p>In the development of the method of complexity reduction proposed in this work, it was assumed that the following requirements should be met:</p><p>1. avoid creating new objects, attributes or incidences; 2. seek to preserve the original conceptual hierarchy as much as possible; 3. provide high fidelity with respect to the original knowledge, in other words, the knowledge presented in the reduced concept lattice must be faithful (consistent) to the original; and 4. incur low descriptive losses, in other words, seek to preserve the power of discrimination afforded by the range of attributes of the original formal context.</p><p>In order to measure the degree of satisfaction of the last two requirements, a methodology for evaluation was created.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">A methodology for evaluating methods of complexity reduction</head><p>In order to evaluate techniques for reducing the size of concept lattices, one must somehow compare the information in the obtained formal context and/or concept lattice with the information found in the original formal context and/or the respective concept lattice.</p><p>The methodology proposed here is based on two types of comparisons. In the first one, the obtained concept lattice, named the reduced concept lattice, is compared with the original formal context, but in an indirect way: a set of rules is produced from the reduced concept lattice and the "capacity" of this set to account for all the original objects is assessed. In the second one, the obtained formal context, named the reduced formal context, is evaluated with respect to the original formal context, seeking to assess the loss in the capacity to discriminate the characteristics of objects.</p><p>The first evaluation measure requires the production of implication rules from the reduced lattice. A valid implication rule for a certain formal context is an expression P → Q, where P and Q are sets of attributes of the formal context and P ′ ⊆ Q ′ , expressing the fact that if an object has the attributes in P , then it also has those in Q <ref type="bibr" target="#b8">[9]</ref>. The set of all valid rules for a formal context usually has a high degree of redundancy and its extraction is impractical. On the other hand, the extraction of a minimum non-redundant set may also be impractical <ref type="bibr" target="#b2">[3]</ref>. Thus, it is proposed to generate a set of rules possibly not minimal, but with little redundancy, for example, a set of rules reduced to the left <ref type="bibr" target="#b16">[17]</ref>. 
A rule P → Q is said to be reduced to the left in a set of rules Z if for any proper subset S of P the set</p><formula xml:id="formula_3">(Z − {P → Q}) ∪ {S → Q} is not equivalent to Z 1 .</formula><p>To measure the degree of equivalence between the original formal context and the reduced formal context, two metrics are proposed: fidelity and descriptive loss. Let G o be the set of objects from the original formal context, let G r be the set of objects from the reduced formal context and let us assume that k rules are extracted from the reduced concept lattice, rules which will be denoted by r 1 , r 2 , . . . , r k .</p><p>The fidelity, F , aims to measure the percentage of successful applications of rules to the original objects. If there is a rule r i = P → Q for which P ⊆ g ′ and Q ⊄ g ′ for an object g ∈ G o , then one says that the rule r i has failed on g. The fidelity is given by the following equation<ref type="foot" target="#foot_1">2</ref> :</p><formula xml:id="formula_4">F = ( ∑_{i=1}^{k} (1 − Nf_i / |G o |) ) / k × 100 (1)</formula><p>where Nf_i is the total number of failures of the rule r i (i.e., the number of objects in G o for which the rule fails). Thus, F is the percentage of non-failures among the</p><formula xml:id="formula_5">|G o | × k applications of the k rules to the |G o | objects.</formula><p>In order to avoid multiple contributions to F from objects which have exactly the same attributes, such objects must be previously reduced to only one object. The descriptive loss, DL, aims to measure the loss of ability to characterize the set of objects G o due to the loss of attributes in the reduction. For its definition, an onto function ν : G o → G r will be used that maps each object g of G o to an object ν(g) in G r resulting from a possible simplification of g due to the reduction process. It is supposed that the object ν(g) will have as attributes a subset of the attributes of g, i.e. ν(g) ′ ⊆ g ′ . 
In particular, if the object g is "eliminated" in the reduction process, ν(g) ′ = ∅. Note that it may happen that ν(g) = ν(h) for two different objects g and h, and this could be interpreted as saying that g and h ended up with the same attributes after the reduction process, becoming indistinguishable. In particular, all eliminated objects become indistinguishable. The descriptive loss is given by:</p><formula xml:id="formula_6">DL = ( 1 − ( ∑_{g∈G o } |ν(g) ′ | / |g ′ |) / |G o | ) × 100<label>(2)</label></formula><p>where |ν(g) ′ | is the number of attributes of the reduced object ν(g) and |g ′ | is the number of attributes of the original object g. DL, as defined above, can report descriptive losses even when attributes considered "redundant" in the actual application are eliminated. Therefore, such attributes should be removed before applying the reduction method.</p></div>
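The two metrics, fidelity (Eq. 1) and descriptive loss (Eq. 2), can be sketched in Python as follows. This is an illustration under our own assumptions: objects are represented by their attribute sets g′, rules by (P, Q) pairs of attribute sets, and ν by a dictionary; none of these names come from the paper's implementation.

```python
# Hedged sketch of the evaluation metrics. `objects` maps each original
# object g to its attribute set g'; `rules` is a list of (P, Q) pairs.

def fidelity(rules, objects):
    """F (Eq. 1): percentage of non-failing rule applications over G_o.
    A rule (P, Q) fails on g when P is contained in g' but Q is not."""
    n = len(objects)
    total = 0.0
    for P, Q in rules:
        failures = sum(1 for attrs in objects.values()
                       if P <= attrs and not Q <= attrs)
        total += 1 - failures / n
    return total / len(rules) * 100

def descriptive_loss(objects, nu):
    """DL (Eq. 2): average fraction of attributes lost under the map
    nu, which sends each g to the attribute set of its reduced image."""
    ratios = [len(nu[g]) / len(attrs) for g, attrs in objects.items()]
    return (1 - sum(ratios) / len(objects)) * 100

objects = {"g1": {"m1", "m2"}, "g2": {"m1", "m3"}}
rules = [({"m1"}, {"m2"})]             # fails on g2 (has m1 but not m2)
nu = {"g1": {"m1"}, "g2": {"m1", "m3"}}  # g1 loses attribute m2
```

With this toy data, the single rule succeeds on one of the two objects and g1 keeps half of its attributes, so F = 50 and DL = 25.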
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Junction Based on Objects Similarity -JBOS</head><p>The JBOS method is applied to the formal context. It is based on the junction of "similar" objects, namely the replacement of such objects by a single representative of them. Naturally, the key notion here is that of similarity. For its definition, a weight is assigned to each attribute of the formal context, which seeks to represent the attribute's relevance, and the similarity is determined from the weights of the attributes. As the structure of the concept lattice is derived from the arrangements of attributes of objects, it is expected that the junction of similar objects will preserve to a certain degree (which will depend on the specific assigned weights) a portion of the original structure while providing a certain degree of simplification.</p><p>In the rest of this section, let (G, M, I) be a formal context from which a reduced formal context must be produced.</p><p>It is assumed that to each attribute m ∈ M is assigned a weight w m such that 0 ≤ w m ≤ 1. Such a weight can be thought of as measuring the significance of the attribute from 0 (no relevance) to 1 (absolute relevance). The weight should be defined by the user based on the semantic load of the attribute, i.e. based on its importance to the model represented by the formal context. 
The similarity between two objects g, h ∈ G will also be expressed by a real number between 0 (completely dissimilar) and 1 (completely indistinguishable), defined as:</p><formula xml:id="formula_7">sim(g, h) = ( ∑_{m∈M} µ(g, h, m) ) / ( ∑_{m∈M} w m ), where µ(g, h, m) = w m if (g, m) ∈ I ↔ (h, m) ∈ I, and 0 otherwise.<label>(3)</label></formula><p>That is, the similarity between two objects g and h is given by the normalized sum of the weights of the attributes on which the two objects agree (both have them or both do not have them).</p><p>From the similarity matrix it is possible to build an algorithm that makes clusters of similar objects, so that two objects can be in the same group only if they have a certain degree of "similarity". Accordingly, the use of a similarity index ǫ and of a maximum number of similar elements α, both defined by the user, is suggested. Two objects g, h ∈ G will be considered similar by the algorithm, and therefore may stay in the same group, if and only if sim(g, h) ≥ ǫ. The number α is just one more cutting option for the user: it specifies that each group can contain at most α elements. Algorithm 1 shows how the groups of similar objects are determined. The auxiliary function GreaterSimilarity(sim, g, G) returns an object of G among those most similar to the object g; if there is more than one, it chooses any one. The function ChooseObjectIn(G) selects any object from the set G. The set of groups of similar objects is denoted by γ in the algorithm. Notice that even though similarity<ref type="foot" target="#foot_2">3</ref> is a tolerance relation, the algorithm computes "maximal" sets H such that g is similar to h for all g, h ∈ H.</p></div>
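The similarity measure of Eq. (3) can be sketched in Python as follows, representing each object by its attribute set and the weights as a dictionary; the names are ours, for illustration only.

```python
# Sketch of Eq. (3): the similarity of g and h is the sum of the
# weights of the attributes on which they agree (both have it, or
# both lack it), normalized by the sum of all weights.

def similarity(g_attrs, h_attrs, weights):
    agree = sum(w for m, w in weights.items()
                if (m in g_attrs) == (m in h_attrs))
    return agree / sum(weights.values())

w = {"m1": 1.0, "m2": 0.5, "m3": 0.5}
s = similarity({"m1", "m2"}, {"m1", "m3"}, w)
```

In this example the objects agree only on m1 (they disagree on m2 and m3), so s = 1.0 / 2.0 = 0.5.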
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Algorithm 1 Function Group Objects</head><p>Input: a set of objects G, a similarity matrix sim, a similarity index ǫ and a maximum number of similar elements α. Output: the set of groups of similar objects γ.</p><formula xml:id="formula_8">1: γ = ∅
2: while G ≠ ∅ do
3:   g = ChooseObjectIn(G)
4:   H = {g}
5:   G = G −</formula><p>The main merit of the JBOS method just described is to apply a consistent reduction based on the user's knowledge. If the relevance of the attributes is not known, the user must discover it, for example, by correlation analysis, principal components analysis, etc. By knowing the relative importance of the attributes, the user can set weights that mirror that importance. This knowledge and the definition of the parameters ǫ and α are then used to reduce the number of objects, which in real applications is often higher than the number of attributes. Consequently, the number of formal concepts and the cost of performing the derivations are also reduced. Therefore, applications once inviable can eventually become viable.</p></div>
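The extracted listing of Algorithm 1 is incomplete, so the following Python sketch is a hedged reconstruction from the surrounding text only: each group H is grown greedily with objects that are similar (sim ≥ ǫ) to every member already in H, up to α members. The tie-breaking choice (maximizing the minimum similarity to H) is our assumption, playing the role the text assigns to GreaterSimilarity.

```python
# Hedged reconstruction of the grouping step of JBOS. `sim` is a
# nested dict sim[g][h]; `epsilon` and `alpha` are the user cutoffs.
# This is one possible reading, not the paper's exact algorithm.

def group_objects(G, sim, epsilon, alpha):
    gamma = []                       # the set of groups γ
    remaining = set(G)
    while remaining:
        g = remaining.pop()          # plays the role of ChooseObjectIn
        H = {g}
        while len(H) < alpha:
            # candidates similar to *every* object already in H,
            # so that H stays a "maximal" mutually similar set
            cands = [h for h in remaining
                     if all(sim[x][h] >= epsilon for x in H)]
            if not cands:
                break
            # assumed tie-break: greatest worst-case similarity to H
            best = max(cands, key=lambda h: min(sim[x][h] for x in H))
            H.add(best)
            remaining.remove(best)
        gamma.append(H)
    return gamma

sim = {"a": {"a": 1.0, "b": 0.9, "c": 0.2},
       "b": {"a": 0.9, "b": 1.0, "c": 0.2},
       "c": {"a": 0.2, "b": 0.2, "c": 1.0}}
gamma = group_objects({"a", "b", "c"}, sim, epsilon=0.8, alpha=10)
```

With this toy matrix, a and b (similarity 0.9 ≥ ǫ) end up in one group and c in a singleton group, regardless of which object is chosen first.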
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Case Study</head><p>The JBOS method was analyzed using the proposed methodology on two classic databases from the UCI machine learning repository: the Wine Data Set and the Water Treatment Plant Data Set (available at: http://www.ics.uci.edu/~mlearn.). The data cleaning process and the modelling of the formal contexts, including the choice of cut points for the discretization of the real-valued parameters, are described in <ref type="bibr" target="#b5">[6]</ref>. It is noteworthy that the number of formal concepts for the Water base, about 9 million, is intractable for the usual existing implementations. The JBOS method and the calculations required by the methodology for evaluating complexity reduction techniques have been implemented using the framework EF-Concept Analysis (Educational Framework for Concept Analysis) (available at: http://www.dcc.ufmg.br/~mariano) <ref type="bibr" target="#b5">[6]</ref>. This framework consists of a set of interfaces, data structures and algorithms that ensure agile development of FCA algorithms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Applying the JBOS method</head><p>As discussed in Section 3.2, the JBOS method requires a set of weights representing the significance of each attribute. In order to determine this set of weights, a deep study of the database or the assistance of a domain expert is necessary. That study is beyond the scope of this work, which seeks only to demonstrate that from a set of weights it is possible to reduce the formal context in order to obtain a concept lattice consistent with the original. Although the JBOS method is not validated here in terms of specific applications, which must be done in a future work, this choice avoids biased results or results too specific to particular databases. Thus, a set of weights for each of the databases, Wine and Water, is assigned based simply on the frequency of the attributes: a higher frequency corresponds to a higher weight (obviously, this criterion may not be appropriate depending on the application that will follow the reduction process).</p><p>The method also needs values for the parameters ǫ and α. The first was set to range from 1 to 0 for each of the bases. The second was set at 20% of the number of objects, which amounts to 26 and 63 for the Wine and Water bases, respectively.</p><p>The results for the Wine database can be seen in the graphs of Figures <ref type="figure" target="#fig_2">1  and 2</ref>. It should be emphasized that all values are percentages of the respective elements of the original formal context. Initially, note in Figure <ref type="figure" target="#fig_1">1</ref> that for ǫ = 0.92 the number of objects was reduced to 77% of the number of objects of the original formal context and the number of formal concepts to approximately 70% of the number of formal concepts of the original lattice. 
Note also that for the same value of ǫ the fidelity was approximately 99% and the descriptive loss about 3.8%, as shown in the graph in Figure <ref type="figure" target="#fig_2">2</ref>. Thus, setting ǫ = 0.92 can be an interesting option, as it implies a significant reduction of the concept lattice combined with a high fidelity and a low descriptive loss. Indeed, even for ǫ as low as 0.85 the numbers are relatively good. In particular, for ǫ = 0.85 the number of formal concepts drops to 18.46%, the fidelity is 97% and the descriptive loss is approximately 18.8%. Note that despite the large reduction of the number of formal concepts, the measured quality indices may be deemed acceptable.  Note also in the graph of Figure <ref type="figure" target="#fig_1">1</ref> that the reduction of the number of objects |G| is greater than the reduction of the density |I|. One explanation for this is that the objects of each set H ∈ γ have numbers of attributes relatively close to each other, which induces a smaller change in the density relative to the number of objects. Note also that for ǫ ≤ 0.15 there were neither further reductions in the parameters under analysis nor changes in the levels of fidelity and descriptive loss, since the number of formal concepts was already reduced as much as the given value of α allows.</p><p>The results for the Water database can be seen in the graphs of Figures <ref type="figure" target="#fig_4">3  and 4</ref>. First, note that using the framework EF-Concept Analysis, on account of the huge number of formal concepts in this database, it was not possible to generate the formal concepts for values ǫ &gt; 0.86; consequently, it was not possible to get the number of rules for the original context. Thus, the graph in Figure <ref type="figure" target="#fig_3">3</ref> does not show the percentages of rules for such values of ǫ. 
Note, however, that the calculation of fidelity after a reduction that leads to a tractable formal context is possible, since it does not depend on obtaining the formal concepts or the original concept lattice.</p><p>Figure <ref type="figure" target="#fig_3">3</ref> shows that for ǫ = 0.86 the number of objects obtained was about 31% and the number of formal concepts about 2.14% (which amounts to about 191,800 formal concepts). For this same value of ǫ, Figure <ref type="figure" target="#fig_4">4</ref> shows a fidelity of about 95% and a descriptive loss of about 21%. Note that even with a great reduction of formal concepts, the fidelity was still high, demonstrating a high degree of consistency of the JBOS method, and the descriptive loss was not high. Also note that for values ǫ &lt; 0.42 there is no reduction at all.  The results show that the quality of the reduction achieved by the JBOS method, and even its viability, depend on a proper choice of values for ǫ after the weights of the attributes are defined. Although the results do not show it, as they do not consider particular uses of the reduced formal contexts, the definition of weights that really reflect the relevance of each attribute is very important, because it is from these weights that the similarity calculations, which are the basis for the application of the method, are made. The parameter α, on the other hand, is not very important. It can, in general, be ignored or receive a high value, as was done above; decreasing values for α can be considered if experiments with values for the weights of the attributes and for ǫ are insufficient to produce acceptable reductions.</p><p>Regarding the measures of reduction quality, a high fidelity and a low descriptive loss give evidence of good quality. The calculations of both are made from the original formal context, without the need of obtaining its formal concepts. 
After obtaining a reduced formal context through the JBOS method, it is always possible to calculate the descriptive loss incurred. The calculation of the fidelity is possible whenever the generation of rules from the resulting formal context is feasible, which is the case whenever its formal concepts can be generated. This was the case, for example, for the Water database with ǫ ≤ 0.86. Another particularly important feature of these indices is that they are, as far as the authors know, the first application-independent proposal for measuring the quality of methods of complexity reduction.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions</head><p>The JBOS method obtains a reduced formal context based on a set of weights measuring the significance of the attributes. The requirements presented in Section 3, desirable for any method of complexity reduction, are met by JBOS. Regarding the first one, creating objects, attributes or incidences, besides going against the very purpose of reduction, would alter the model described by the formal context and hence by the concept lattice; the JBOS method fully meets this requirement. The level of satisfaction of the last three requirements depends very much on a judicious choice of weights for the attributes and also on the choice of an appropriate similarity threshold ǫ. In any case, such a degree of judiciousness can be adjusted by means of experiments using different weight assignments and values for ǫ and α.</p><p>A methodology for the evaluation of techniques for reducing the concept lattice's complexity was also presented, consisting of two metrics that seek to quantify precisely the degree of satisfaction of the last two requirements referred to above. Because these two requirements are very dependent on the second one, preservation of the original conceptual hierarchy, such metrics also indirectly quantify this second requirement. Methodologies for evaluating methods of complexity reduction are rarely found in the literature, and those which are found are designed for specific reduction methods or applications. The methodology introduced in this paper is independent of reduction method and application. It is based on two metrics: an index of fidelity and a descriptive loss. The first, which measures the degree of consistency of the information present in the obtained formal context with respect to the original, is calculated from a set of rules extracted from the concept lattice. 
The second, which measures the loss of capacity to represent the details found in the model described by the original formal context, is calculated directly on the obtained formal context.</p><p>The results of evaluating the JBOS method on the two databases, Wine and Water, show that it is possible to reduce the complexity of the concept lattice while preserving consistency and keeping the descriptive loss at a good level, both measured by the methodology introduced. In particular, for the Water database, for which it was not possible to generate the formal concepts from the original context, the concept lattice could be built with very significant reductions while maintaining a relatively high fidelity and a modest descriptive loss. The results showed that the method is able to reduce the complexity of the concept lattice (the number of formal concepts) with a good level of control over the consistency and the amount of knowledge detail produced. Moreover, by adjusting the weights and parameters, one can strive to meet the desired quality levels.</p><p>As future work we intend to compare the JBOS method with other approaches and to analyze the impact of the discretization process (applied to build the formal context) on the similarity and quality measures. The suitability of JBOS for reducing the structural complexity of concept lattices in some real applications will also be tested.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Let (G_o, M_o, I_o) be the original formal context. Let γ be the set obtained by Algorithm 1 from G_o and specific values for ε and α. In the reduced formal context, each set H ∈ γ will be considered as an object, and that object will have the attributes of M_o which are common to all objects in H. That is, the attributes of H will be those in ⋂{g′ | g ∈ H}, where the derivation operator is applied with respect to I_o. 
Therefore, by the JBOS method the reduced formal context (G, M, I) is such that: G = γ; M = ⋃{⋂{g′ | g ∈ H} | H ∈ γ}; and (H, m) ∈ I if, and only if, m ∈ ⋂{g′ | g ∈ H}.</figDesc></figure>
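The construction just described can be sketched directly. The snippet below assumes the original context is given as a mapping from each object g to its derivation g′ (its attribute set under I_o), and γ as a list of object groups produced by Algorithm 1; the function and variable names are illustrative, not from the paper.

```python
# Sketch of the reduced-context construction: each group H in gamma
# becomes one object whose attributes are those common to all objects
# in H, i.e. the intersection of g' over g in H.

def reduce_context(derivation, gamma):
    """Build the reduced formal context (G, M, I) from the groups in gamma.

    derivation: dict mapping each original object g to its attribute set g'.
    gamma: iterable of groups of original objects.
    """
    G = [frozenset(H) for H in gamma]
    I = {}
    for H in G:
        # (H, m) in I  iff  m belongs to every g' with g in H.
        I[H] = frozenset(set.intersection(*(set(derivation[g]) for g in H)))
    # M is the union of the groups' common-attribute sets.
    M = set().union(*I.values()) if I else set()
    return G, M, I
```

Note that attributes shared by no whole group simply disappear from M, which is one source of the descriptive loss measured in Section 4.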
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Objects, density, formal concepts and rules obtained for the Wine formal context.</figDesc><graphic coords="9,134.77,305.45,154.80,93.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Fidelity and descriptive loss obtained for the Wine formal context.</figDesc><graphic coords="9,311.13,316.41,154.80,93.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Objects, density, formal concepts and rules obtained for the Water formal context.</figDesc><graphic coords="10,134.77,230.24,154.80,93.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Fidelity and descriptive loss obtained for the Water formal context.</figDesc><graphic coords="10,311.13,241.19,154.80,93.07" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 .</head><label>1</label><figDesc>Characteristics of the databases used in the evaluation of the JBOS method.</figDesc><table><row><cell></cell><cell cols="2">Wine Water</cell></row><row><cell>Number of objects</cell><cell>132</cell><cell>316</cell></row><row><cell>Number of attributes</cell><cell>29</cell><cell>67</cell></row><row><cell>Density</cell><cell cols="2">48.3% 52.2%</cell></row><row><cell cols="3">Number of formal concepts 29,032 8,940,648</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Two sets of rules are said to be equivalent if they have the same set of consequences.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">|C| denotes the number of elements of the set C.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">Two objects g and h are similar to each other iff sim(g, h) ≥ ε.</note>
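One plausible reading of the weighted similarity behind footnote 3, sketched under stated assumptions: the exact formula is not given in this excerpt, so the code below assumes sim(g, h) rewards the attributes on which two objects agree (both possess or both lack the attribute), with each attribute weighted by its assigned significance.

```python
# Hedged sketch in the spirit of JBOS's weighted similarity; the exact
# agreement formula is an assumption, not the paper's definition.

def sim(g_attrs, h_attrs, weights):
    """Weighted agreement between two objects' attribute sets."""
    total = sum(weights.values())
    agree = sum(w for m, w in weights.items()
                if (m in g_attrs) == (m in h_attrs))  # both have or both lack m
    return agree / total if total else 1.0

def similar(g_attrs, h_attrs, weights, epsilon):
    """Footnote 3: g and h are similar iff sim(g, h) >= epsilon."""
    return sim(g_attrs, h_attrs, weights) >= epsilon
```

Raising the threshold ε makes similarity harder to reach, yielding smaller groups and hence a less aggressive reduction, consistent with the role ε plays in the experiments.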
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Performances of galois sub-hierarchy-building algorithms</title>
		<author>
			<persName><forename type="first">Gabriela</forename><surname>Arévalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Anne</forename><surname>Berry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marianne</forename><surname>Huchard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Guillaume</forename><surname>Perrot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alain</forename><surname>Sigayret</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICFCA&apos;07: Proceedings of the 5th international conference on Formal concept analysis</title>
				<meeting><address><addrLine>Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="166" to="180" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Formal concept analysis with background knowledge: attribute priorities</title>
		<author>
			<persName><forename type="first">Radim</forename><surname>Belohlavek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vilem</forename><surname>Vychodil</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Trans. Sys. Man Cyber Part C</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="399" to="409" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Concept Data Analysis: Theory and Applications</title>
		<author>
			<persName><forename type="first">C</forename><surname>Carpineto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Romano</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2004">2004</date>
			<publisher>John Wiley &amp; Sons</publisher>
			<pubPlace>England</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Complexity reduction in lattice-based information retrieval</title>
		<author>
			<persName><forename type="first">Karen</forename><forename type="middle">S K</forename><surname>Cheung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Douglas</forename><surname>Vogel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Retrieval</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="285" to="299" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Introduction to lattices and order</title>
		<author>
			<persName><forename type="first">B</forename><surname>Davey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Priestley</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1990">1990</date>
			<publisher>Cambridge University Press</publisher>
			<pubPlace>Cambridge, England</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">Sérgio</forename><forename type="middle">M</forename><surname>Dias</surname></persName>
		</author>
		<title level="m">Algoritmos para geração de reticulados conceituais (in portuguese)</title>
				<meeting><address><addrLine>Belo Horizonte, Brazil</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
		<respStmt>
			<orgName>Federal University of Minas Gerais (UFMG)</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Concept lattice generation by singular value decomposition</title>
		<author>
			<persName><forename type="first">Petr</forename><surname>Gajdos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pavel</forename><surname>Moravec</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Václav</forename><surname>Snásel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CLA</title>
		<imprint>
			<biblScope unit="page" from="102" to="110" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Formal Concept Analysis: Mathematical Foundations</title>
		<author>
			<persName><forename type="first">Bernhard</forename><surname>Ganter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rudolf</forename><surname>Wille</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Germany</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An incremental concept formation approach for learning from databases</title>
		<author>
			<persName><forename type="first">Robert</forename><surname>Godin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rokia</forename><surname>Missaoui</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Theor. Comput. Sci</title>
		<imprint>
			<biblScope unit="volume">133</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="387" to="419" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Lattice model of browsable data spaces</title>
		<author>
			<persName><forename type="first">Robert</forename><surname>Godin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eugene</forename><surname>Saunders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><surname>Gecsei</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Sci</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="89" to="116" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Analysis of social communities with iceberg and stability-based concept lattices</title>
		<author>
			<persName><forename type="first">Nicolas</forename><surname>Jay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">François</forename><surname>Kohler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amedeo</forename><surname>Napoli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICFCA</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="258" to="272" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">FooCA -Web Information Retrieval with Formal Concept Analysis</title>
		<author>
			<persName><forename type="first">Bjoern</forename><surname>Koester</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>Verlag Allgemeine Wissenschaft</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Stability as an estimate of the degree of substantiation of hypotheses derived on the basis of operational similarity</title>
		<author>
			<persName><forename type="first">Sergei</forename><surname>Kuznetsov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">N.T.I</title>
		<imprint>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="21" to="29" />
			<date type="published" when="1990">1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">On computing the size of a lattice and related decision problems</title>
		<author>
			<persName><forename type="first">Sergei</forename><surname>Kuznetsov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Order</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="313" to="321" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Computing iceberg concept lattices with titanic</title>
		<author>
			<persName><forename type="first">G</forename><surname>Stumme</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Taouil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bastide</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Pasquier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Lakhal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data and Knowledge Eng</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">34</biblScope>
			<biblScope unit="page" from="189" to="222" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Formal concept analysis for knowledge discovery and data mining: The new challenges</title>
		<author>
			<persName><forename type="first">Petko</forename><surname>Valtchev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rokia</forename><surname>Missaoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><surname>Godin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Concept Lattices</title>
				<meeting><address><addrLine>Berlin / Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">2961</biblScope>
			<biblScope unit="page" from="3901" to="3901" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Um estudo de algoritmos para a extração de regras baseados em análise formal de conceitos</title>
		<author>
			<persName><forename type="first">Renato</forename><surname>Vimieiro</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<pubPlace>Belo Horizonte, Brazil</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Federal University of Minas Gerais (UFMG)</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s thesis</note>
	<note>in portuguese</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Restructuring lattice theory: an approach based on hierarchies of concepts</title>
		<author>
			<persName><forename type="first">R</forename><surname>Wille</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ordered Sets</title>
				<editor>
			<persName><forename type="first">Ivan</forename><surname>Rival</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="1982">1982</date>
			<biblScope unit="page" from="445" to="470" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Qualitative behavior rules for the cold rolling process extracted from trained ann via the fcann method</title>
		<author>
			<persName><forename type="first">Luis</forename><forename type="middle">E</forename><surname>Zárate</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sérgio</forename><forename type="middle">M</forename><surname>Dias</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Eng. Appl. Artif. Intell</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">4-5</biblScope>
			<biblScope unit="page" from="718" to="731" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
