<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Ontology-Driven Method for Ranking Unexpected Rules</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mohamed</forename><forename type="middle">Said</forename><surname>Hamani</surname></persName>
							<email>saidhamani@hotmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Mohamed Boudiaf-M&apos;sila University</orgName>
								<address>
									<country key="DZ">Algeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ramdane</forename><surname>Maamri</surname></persName>
							<email>rmaamri@yahoo.fr</email>
							<affiliation key="aff1">
								<orgName type="institution">Mentouri-Constantine University</orgName>
								<address>
									<country key="DZ">Algeria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Ontology-Driven Method for Ranking Unexpected Rules</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">37AA406104DC269EBDA3256B42B36348</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T00:18+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>data mining</term>
					<term>ontology</term>
					<term>unexpectedness</term>
					<term>association rules</term>
					<term>domain knowledge</term>
					<term>subjective measures</term>
					<term>semantic distance</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many rule discovery algorithms have the disadvantage of discovering too many patterns, some of which are obvious, useless, or of little interest to the user. In this paper we propose a new approach for ranking patterns according to their unexpectedness, using a semantic distance computed from prior background knowledge represented by a domain ontology organized as a DAG (Directed Acyclic Graph) hierarchy.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Knowledge discovery in databases (data mining) has been defined in <ref type="bibr" target="#b5">[6]</ref> as the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns from data. Association rule algorithms <ref type="bibr" target="#b0">[1]</ref> are rule-discovery methods that discover patterns in the form of IF-THEN rules. It has been observed that most data mining algorithms generate a large number of rules that are valid but obvious or of little interest to the user <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b21">22,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b12">13]</ref>. Such a large number of rules makes it difficult for the user to identify those that are of interest. To address this issue, most approaches to knowledge discovery use objective measures of interestingness, such as confidence and support <ref type="bibr" target="#b0">[1]</ref>, to evaluate the discovered rules. These objective measures capture the statistical strength of a pattern. The interestingness of a rule, however, is essentially subjective <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b10">11]</ref>. Subjective measures of interestingness, such as unexpectedness <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b30">31,</ref><ref type="bibr" target="#b3">4]</ref>, assume that the interestingness of a pattern depends on the decision-maker and not solely on the statistical strength of the pattern. Although objective measures are useful, they are insufficient to determine the interestingness of rules. 
One way to approach this problem is to focus on discovering unexpected patterns <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b19">20]</ref>, where the unexpectedness of discovered patterns is usually defined relative to a system of prior expectations. In this paper we define a degree of unexpectedness based on the semantic distance between the terms of the rule vocabulary, relative to prior knowledge represented by an ontology. An ontology represents knowledge through the relationships between concepts. It is organized as a DAG (Directed Acyclic Graph) hierarchy. We propose a new approach for ranking the most interesting rules according to conceptual distance (the distance between the antecedent and the consequent of the rule) relative to the hierarchy. Highly related concepts are grouped together in the hierarchy. The farther apart two concepts are, the less related they are. The less related the concepts that make up a rule are, the more surprising, and therefore the more interesting, the rule is. With such a ranking, a user need only check a few rules at the top of the list to extract the most pertinent ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Method Presentation</head><p>Data mining is the process of discovering patterns in data. Data mining methods have the drawback of generating a very large number of rules that are not of interest to the user. The use of objective measures of interestingness, such as confidence and support, is a step toward identifying interesting rules. Objective measures of interestingness are data driven; they measure the statistical strength of the rule and do not exploit the domain knowledge and intuition of the decision maker. In addition to objective measures, our approach exploits domain knowledge represented by an ontology organized as a DAG hierarchy. The nodes of the hierarchy represent the rule vocabulary. For a rule like (x AND y→z), x, y and z are nodes in the hierarchy. The semantic distance between the antecedent (x AND y) and the consequent (z) of the rule is a measure of interestingness. The greater the distance, the more unexpected, and therefore the more interesting, the rule is. Based on this measure, a ranking algorithm helps to select the rules of interest to the user.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Semantic distance</head><p>Two main categories of algorithms for computing the semantic distance between terms organized in a hierarchical structure have been proposed in the literature <ref type="bibr" target="#b8">[9]</ref>: distance-based approaches and information content-based approaches. The general idea behind the distance-based algorithms <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b31">32]</ref> is to find the shortest path between two terms, measured in number of edges. Information content-based approaches <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b23">24]</ref> are inspired by the perception that pairs of words which share many common contexts are semantically related. We use distance-based approaches in this paper. In an IS-A semantic network, the simplest way to determine the distance between two elemental concept nodes, A and B, is the shortest path that links A and B, i.e. the minimum number of edges that separate A and B, or the sum of the weights of the arcs along the shortest path between A and B <ref type="bibr" target="#b23">[24]</ref>.</p><p>In the hierarchy of Figure <ref type="figure" target="#fig_0">1</ref>, with every edge weight equal to 1, the edge distances between nodes of the graph are: Dist(Apple, Kiwi) = 2; Dist(Carrots, Pepper) = 2; Dist(Apple, Meat) = 4; Dist(Fruit, Red Meat) = 4.</p></div>
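<div xmlns="http://www.tei-c.org/ns/1.0"><p>The edge-count distance described above can be sketched with a breadth-first search. This is only a minimal illustration, not the authors' implementation: the child-to-parent map is an assumed reading of Figure 1 reconstructed from the node names used in the text, "Poultry" is a hypothetical name for the parent of Turkey and Chicken, and the helper names build_graph and dist are introduced here for illustration.</p>
<p>
```python
from collections import deque

# Assumed shape of the Figure 1 hierarchy (child: parent). "Poultry" is a
# hypothetical name for the parent of Turkey and Chicken.
PARENT = {
    "vegetable-dishes": "Food", "Meat": "Food",
    "Fruit": "vegetable-dishes", "Vegetables": "vegetable-dishes",
    "Red Meat": "Meat", "Poultry": "Meat",
    "Apple": "Fruit", "Kiwi": "Fruit",
    "Carrots": "Vegetables", "Pepper": "Vegetables",
    "Beef": "Red Meat", "Mutton": "Red Meat",
    "Turkey": "Poultry", "Chicken": "Poultry",
}


def build_graph(parent):
    """Turn the child-to-parent map into an undirected adjacency map."""
    graph = {}
    for child, par in parent.items():
        graph.setdefault(child, set()).add(par)
        graph.setdefault(par, set()).add(child)
    return graph


def dist(graph, start, goal):
    """Number of edges on the shortest path (breadth-first search, weight 1)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, steps + 1))
    return None  # no path between the two nodes


GRAPH = build_graph(PARENT)
print(dist(GRAPH, "Apple", "Kiwi"))  # 2
```
</p>
<p>Under the assumed hierarchy, dist(GRAPH, "Apple", "Meat") and dist(GRAPH, "Fruit", "Red Meat") both evaluate to 4, matching the values listed in the text. A weighted variant would replace the breadth-first search with a shortest-path algorithm that sums arc weights.</p></div>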
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Ontology</head><p>Prior knowledge of a domain or a process can help data mining to select the appropriate information (preprocessing), to reduce the hypothesis space (processing), and to represent results in a more comprehensible way and improve the process (post-processing) <ref type="bibr" target="#b4">[5]</ref>. An ontology expresses domain knowledge, including the semantic links between domain individuals, described as inter-concept relations or roles <ref type="bibr" target="#b6">[7]</ref>. </p><formula xml:id="formula_0">H(X, Y) = \max(h(X, Y), h(Y, X)) \quad \text{where} \quad h(X, Y) = \max_{X_i \in X} \min_{Y_j \in Y} \lVert X_i - Y_j \rVert</formula><p>The function h(X,Y) is called the directed Hausdorff 'distance' from X to Y (this function is not symmetric and thus is not a true distance). It identifies the point X_i ∈ X that is farthest from any point of Y, and measures the distance from X_i to its nearest neighbor in Y. The Hausdorff distance, H(X,Y), measures the degree of mismatch between two sets, as it reflects the distance of the point of X that is farthest from any point of Y and vice versa <ref type="bibr" target="#b7">[8]</ref>. This expression measures the semantic distance between the groups of concepts X_1 ∧ ... ∧ X_k and Y_1 ∧ ... ∧ Y_m, which contain k atomic concepts X_i and m atomic concepts Y_j respectively.</p></div>
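<div xmlns="http://www.tei-c.org/ns/1.0"><p>The directed and symmetric Hausdorff distances defined above can be sketched as follows. This is an illustrative sketch under stated assumptions: the pairwise leaf distances are the ones listed in Table 1, encoded compactly through a hypothetical GROUP lookup, and the function names directed_hausdorff, hausdorff and leaf_dist are introduced here, not taken from the paper.</p>
<p>
```python
def directed_hausdorff(xs, ys, dist):
    """h(X, Y): largest distance from a point of X to its nearest point in Y."""
    return max(min(dist(x, y) for y in ys) for x in xs)


def hausdorff(xs, ys, dist):
    """H(X, Y) = max(h(X, Y), h(Y, X)); symmetric by construction."""
    return max(directed_hausdorff(xs, ys, dist),
               directed_hausdorff(ys, xs, dist))


# Leaf distances from Table 1, stored by group for brevity (hypothetical
# encoding): 2 within a group, 4 across sibling groups, 6 across the two
# top-level branches.
GROUP = {
    "Apple": (0, 0), "Kiwi": (0, 0), "Carrots": (0, 1), "Pepper": (0, 1),
    "Beef": (1, 0), "Mutton": (1, 0), "Turkey": (1, 1), "Chicken": (1, 1),
}


def leaf_dist(a, b):
    """Edge distance between two leaf concepts, as tabulated in Table 1."""
    if a == b:
        return 0
    (top_a, sub_a), (top_b, sub_b) = GROUP[a], GROUP[b]
    if top_a != top_b:
        return 6
    return 2 if sub_a == sub_b else 4


print(hausdorff({"Pepper", "Carrots"}, {"Turkey", "Chicken"}, leaf_dist))  # 6
```
</p>
<p>The asymmetry of h shows up on a small case: directed_hausdorff({"Apple"}, {"Apple", "Beef"}, leaf_dist) is 0, while the reverse direction is 6, which is why the symmetric form takes the maximum of the two.</p></div>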
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Rules ranking</head><p>In this section we introduce an algorithm to rank rules according to their degree of unexpectedness based on background knowledge. The rules we consider are of the form "body → head", where "body" and "head" are conjunctions of concepts in the vocabulary of the ontology. We assume that other techniques have already carried out the pattern discovery task and eliminated the patterns that do not satisfy objective criteria. With such a ranking, a user need only check a few patterns at the top of the list to confirm rule pertinence. </p><formula xml:id="formula_1">i ∈ [1,k]; j ∈ [1,m]
Body = X_1 ∧ ... ∧ X_k
Head = Y_1 ∧ ... ∧ Y_m
For i=1 to ND begin
  For j=1 to ND begin
    Distance(X_i, X_j) = shortest path between X_i and X_j;
  End
End
For i=1 to N begin
  DU[i] = Distance(X_1 ∧ ... ∧ X_k, Y_1 ∧ ... ∧ Y_m) / (2D)
End</formula><p>Sort the rules by descending degree of unexpectedness DU.</p></div>
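<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ranking steps above can be sketched in a few lines. This is a hedged sketch rather than the authors' code: the child-to-parent map is an assumed reading of Figure 1 ("Poultry" is a hypothetical node name), the pairwise distance is computed through the lowest common ancestor instead of a precomputed ND by ND table, and the names ancestors, dist, group_distance and rank_rules are introduced here for illustration.</p>
<p>
```python
# Assumed child-to-parent map for the Figure 1 hierarchy; "Poultry" is a
# hypothetical name for the parent of Turkey and Chicken.
PARENT = {
    "vegetable-dishes": "Food", "Meat": "Food",
    "Fruit": "vegetable-dishes", "Vegetables": "vegetable-dishes",
    "Red Meat": "Meat", "Poultry": "Meat",
    "Apple": "Fruit", "Kiwi": "Fruit",
    "Carrots": "Vegetables", "Pepper": "Vegetables",
    "Beef": "Red Meat", "Mutton": "Red Meat",
    "Turkey": "Poultry", "Chicken": "Poultry",
}
MAX_DEPTH = 3  # D in the text


def ancestors(node):
    """Path from a node up to the root, the node itself included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def dist(a, b):
    """Edge count between two concepts through their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(n for n in up_a if n in up_b)
    return up_a.index(common) + up_b.index(common)


def group_distance(xs, ys):
    """Hausdorff distance between two conjunctions of concepts."""
    def h(us, vs):
        return max(min(dist(u, v) for v in vs) for u in us)
    return max(h(xs, ys), h(ys, xs))


def rank_rules(rules):
    """Sort (body, head) rules by descending DU = Distance / (2 * D)."""
    scored = [(group_distance(body, head) / (2 * MAX_DEPTH), body, head)
              for body, head in rules]
    return sorted(scored, key=lambda item: item[0], reverse=True)


RULES = [
    (("Apple",), ("Kiwi",)),
    (("Apple",), ("Carrots",)),
    (("Pepper", "Carrots"), ("Turkey", "Chicken")),
]
for du, body, head in rank_rules(RULES):
    print(round(du, 2), body, head)
```
</p>
<p>With the three rules of the example in Section 3, this sketch places the Pepper, Carrots rule first and the Apple, Kiwi rule last. Computing distances on demand keeps the sketch short; the precomputed table of the algorithm is preferable when many rules share the same concepts.</p></div>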
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Example</head><p>In this section we present results from applying our method to the hierarchy of Figure <ref type="figure" target="#fig_0">1</ref> with a set of association rules R = {Apple → Kiwi; Apple → Carrots; Pepper, Carrots → Turkey, Chicken} resulting from a data mining process.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Nodes distance Computation</head><p>The number of graph nodes in Figure <ref type="figure" target="#fig_0">1</ref> is ND=16 and the depth of the graph is D=3. The semantic distances (the minimum number of edges separating two nodes) between the graph nodes of Figure <ref type="figure" target="#fig_0">1</ref> are presented in the following table <ref type="table" target="#tab_0">(Table 1</ref>), where every cell gives the distance between the node of its row and the node of its column.</p><p>Only the leaves of the hierarchy are presented in Table <ref type="table" target="#tab_0">1</ref>, because all the rules in R are expressed using leaf concepts of the hierarchy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Degree of unexpectedness computation</head><p>The maximum depth of the hierarchy in Figure <ref type="figure" target="#fig_0">1</ref> is D=3. For a given rule X→Y, where X = X_1 ∧ ... ∧ X_k and Y = Y_1 ∧ ... ∧ Y_m, the distance between the antecedent and the consequent is:</p><formula xml:id="formula_2">\mathrm{Distance}(X, Y) = \mathrm{Distance}(X_1 \wedge \dots \wedge X_k, Y_1 \wedge \dots \wedge Y_m) = \max(h(X, Y), h(Y, X))</formula><p>The degree of unexpectedness for a given rule X→Y is calculated using our expression DU(X→Y) = Distance(X,Y)/(2D), and the resulting computation is presented in (Table <ref type="table" target="#tab_1">2</ref>). Sorted by descending degree of unexpectedness, the order of the rules is (c), (b), (a). From a decision-system point of view, rule (c) belongs to a higher level (Food) than rule (b), which belongs to the level (vegetable-dishes). Rule (a) belongs to a lower level (Fruit). The higher we move up in the hierarchy, the more important the decision and the broader the vision of the decision maker, and therefore the more interesting the discovered rule. Rule (c) results from crossing the domains (vegetable-dishes, Meat), which are farther apart than the domains (Vegetables, Fruit) of rule (b). Rule (a) concerns the domain (Fruit) only and is therefore the least interesting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experiments</head><p>The experiments were performed on a census income database with 48,842 records <ref type="bibr" target="#b2">[3]</ref>, using an implementation of our algorithm. To generate the association rules, we used an implementation of the Apriori algorithm <ref type="bibr" target="#b1">[2]</ref> with a minimum support value equal to 0.2 and a confidence value equal to 0.2. The generated rule set contains 2225 rules. In order to perform the experiments, we created a taxonomy of 81 weighted concepts based on the data set under study, as shown in (Table <ref type="table">3</ref>).</p><p>We conducted two tests. The first used a weight equal to one for all concepts, with results presented in (Figure <ref type="figure" target="#fig_3">2</ref>). The second used different weights at the atomic concept level (see Table <ref type="table">3</ref> for weights), with results presented in (Figure <ref type="figure">3</ref>). (Figure <ref type="figure" target="#fig_3">2</ref>) and (Figure <ref type="figure">3</ref>) show the first two lines extracted for each distance value in each test. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 3. Second Test Ranking Results</head><p>Looking at the results, we notice:</p><p>1. The best results are those with the highest weight (Figure <ref type="figure">3</ref> with the 'Bachelors' concept). 2. The best results from both tests are cross-level concepts (with a higher subsumer like 'Personal', 'Education', 'Work' or 'census-income') and not those within the same concept level. 3. The lowest results from both tests (last 2 lines) are within the same concept level, like 'Personal'.</p><p>Table <ref type="table">3</ref>. Experiment Taxonomy</p><p>Our approach is based on the hierarchy in (Table <ref type="table">3</ref>), which guides the resulting rules. The maximum hierarchy depth is 3, the same as the minimum depth, so the hierarchy distributes the load equally among its branches. The first test was conducted with a weight equal to 1 for all concepts; in this case all concepts have the same degree of interest to the user. The rule-ranking algorithm picks the rules with the higher subsumer concept. The common subsumer for rules (1), (2) and (3) of (Figure <ref type="figure" target="#fig_3">2</ref>) is the top concept 'census-income', whereas the common subsumer for rules (4) and (5) is the concept 'Work'. Rule (1) concerns 'sex' and 'occupation', while rules (2) and (3) are about education and occupation. The last 2 rules, (4) and (5), express the relation between 'occupation' and 'salary-class'. We believe a rule like (1), (2) or (3) is more interesting, because it relates 'Education' and 'Personal' information and involves a higher-level (strategic) decision maker than a rule concerning 'occupation' and 'salary', which may concern payroll, for instance. 
The second test was conducted with the weight of the 'bachelors' concept equal to 7 (among other concept settings, see Table <ref type="table">3</ref>). In this case the user puts more emphasis on this concept by setting its weight to a high value, or because it really is that important in the domain of study. The rule-ranking algorithm picks the rules with the higher weight. The common subsumer for rules (1) and (2) of (Figure <ref type="figure">3</ref>) is the concept 'census-income', but in this case with the 'Bachelors' concept as a member of the rule. Here the user is focusing the study on people with a 'bachelors' education and their relation to 'Personal' information or 'Work'. The common subsumer for the last 2 rules of (Figure <ref type="figure">3</ref>) is the concept 'Personal'. These rules express the relation between the 'sex', 'age' and 'marital-status' concepts. Even though interestingness is subjective (what is interesting to one person may not be of the same degree of interest to another), we believe that the higher we move up in the hierarchy, the more important the decision and the broader and more strategic the vision of the decision maker; therefore the discovered rule is more interesting. Our approach follows this vision.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Related Works</head><p>Unexpectedness of patterns has been studied in <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b19">20]</ref> and defined in comparison with user beliefs. A rule is considered interesting if it affects the user's levels of conviction. Unexpectedness is defined in probabilistic terms in <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30]</ref>, while in <ref type="bibr" target="#b12">[13]</ref> it is defined as a distance based on a syntactic comparison between a rule and a conviction. Similarity and distance are defined syntactically, based on the structure of the rules and convictions. A rule and a conviction are distant if their consequents are similar but their antecedents are distant, or vice versa. In <ref type="bibr" target="#b20">[21]</ref> the focus is on discovering minimal unexpected patterns directly, rather than using post-processing approaches, such as filtering, to determine the minimal unexpected patterns from the set of all discovered patterns. In <ref type="bibr" target="#b17">[18]</ref> unexpectedness is defined from the point of view of a logical contradiction between a rule and a conviction: a pattern that contradicts prior knowledge is unexpected. It is based on the contradiction between the consequent of the rule and the consequent of the belief. Given a rule A→B and a belief X→Y, if B AND Y is false while A AND X is true for a broad group of data, the rule is unexpected. In <ref type="bibr" target="#b14">[15]</ref>, the subjective interestingness (unexpectedness) of a discovered pattern is characterized by asking the user to specify a set of patterns according to his/her previous knowledge or intuitive feelings. 
This specified set of patterns is then used by a fuzzy matching algorithm to match and rank the discovered patterns. The work in <ref type="bibr" target="#b25">[26,</ref><ref type="bibr" target="#b26">27,</ref><ref type="bibr" target="#b27">28]</ref> takes a different approach to the discovery of interesting patterns, by eliminating non-interesting association rules. Rather than having users define their entire knowledge of a domain, they are asked to identify several non-interesting rules generated by the Apriori algorithm. The authors of <ref type="bibr" target="#b24">[25]</ref> use a genetic algorithm to dynamically maintain and search populations of rule sets for the most interesting rules, rather than acting as a post-processor. The rules identified by the genetic algorithm compared favorably with the rules selected by the domain expert <ref type="bibr" target="#b16">[17]</ref>. Most research on unexpectedness makes a syntactic or semantic comparison between a rule and a belief. Our definition of unexpectedness is based on the structure of the background knowledge (hierarchy) underlying the terms (vocabulary) of the rule. We take a different approach from the preceding work, which is a filtering process based on beliefs expressed as rules that the user has to enter. We propose a ranking process in which the knowledge is expressed not as rules but as a hierarchy of concepts (an ontology). Ontologies enable knowledge sharing. Sharing vastly increases the potential for knowledge reuse and therefore allows our approach to obtain knowledge for free, simply by using domain ontologies that are already available, such as "ONTODerm" for dermatology, "BIO-ONT" for biomedicine, and "ASFA, OneFish, FIGIS, AGROVOC" for food.</p><p>In this paper we proposed a new approach to estimate the degree of unexpectedness of a rule with respect to an ontology and to rank patterns according to their unexpectedness, defined on the basis of ontological distance. 
The proposed ranking algorithm uses an ontology to calculate the distance between the antecedent and the consequent of each rule, on which the ranking is based. The greater the conceptual distance, the higher the degree of interest of the rule. This work is a contribution to the post-analysis stage, helping the user identify the most interesting patterns.</p><p>In the future, we plan to incorporate a semantic distance threshold into the algorithm for computing frequent items, and to exploit ontology relations other than "IS-A". We are also validating our approach on fuzzy ontologies to take into account vague and imprecise information.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. hierarchy example</figDesc><graphic coords="2,70.87,353.06,470.28,176.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Algorithm</head><label></label><figDesc>Input: Ontology, Set of rules. Output: Ordered set of rules. R: Set of rules, R = {Ri / Ri = body → head} where i ∈ [1,N]. ND: Number of nodes. N: Number of rules. D: Maximum depth of the hierarchy. DU: Array of size N representing the degree of unexpectedness. Xi, Yj: Atomic concepts.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>For</head><label></label><figDesc>the set of rules R = {(a), (b), (c)} where: (a) Apple → Kiwi; (b) Apple → Carrots; (c) Pepper, Carrots → Turkey, Chicken. The detailed distance computation for rules (a), (b), (c) is: (a) Dist(Apple, Kiwi)=2; (b) Dist(Apple, Carrots)=4; (c) Dist(Pepper∧Carrots, Turkey∧Chicken) = max(h(Pepper∧Carrots, Turkey∧Chicken), h(Turkey∧Chicken, Pepper∧Carrots)), with h(Pepper∧Carrots, Turkey∧Chicken)=6 and h(Turkey∧Chicken, Pepper∧Carrots)=6, so Dist(Pepper∧Carrots, Turkey∧Chicken)=6.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. First Test Ranking Results</figDesc><graphic coords="5,70.87,162.97,470.29,83.61" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1.</head><label>1</label><figDesc>Graph nodes distance</figDesc><table><row><cell>Nodes</cell><cell>Apple</cell><cell>Kiwi</cell><cell>Carrots</cell><cell>Pepper</cell><cell>Beef</cell><cell>Mutton</cell><cell>Turkey</cell><cell>Chicken</cell></row><row><cell>Apple</cell><cell>0</cell><cell>2</cell><cell>4</cell><cell>4</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell></row><row><cell>Kiwi</cell><cell>2</cell><cell>0</cell><cell>4</cell><cell>4</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell></row><row><cell>Carrots</cell><cell>4</cell><cell>4</cell><cell>0</cell><cell>2</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell></row><row><cell>Pepper</cell><cell>4</cell><cell>4</cell><cell>2</cell><cell>0</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell></row><row><cell>Beef</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>0</cell><cell>2</cell><cell>4</cell><cell>4</cell></row><row><cell>Mutton</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>2</cell><cell>0</cell><cell>4</cell><cell>4</cell></row><row><cell>Turkey</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>4</cell><cell>4</cell><cell>0</cell><cell>2</cell></row><row><cell>Chicken</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>6</cell><cell>4</cell><cell>4</cell><cell>2</cell><cell>0</cell></row></table><note>is D=3. For a given rule X→Y where X=X 1 ∧ . . . ∧ X k and Y=Y 1 ∧ . . . ∧ Y k</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2.</head><label>2</label><figDesc>Rules degree of unexpectedness</figDesc><table><row><cell>Label</cell><cell>Rule</cell><cell>Distance</cell><cell>Degree of unexpectedness</cell></row><row><cell>(a)</cell><cell>Apple → Kiwi</cell><cell>2</cell><cell>2/6=0.33</cell></row><row><cell>(b)</cell><cell>Apple → Carrots</cell><cell>4</cell><cell>4/6=0.67</cell></row><row><cell>(c)</cell><cell>Pepper, Carrots → Turkey, Chicken</cell><cell>6</cell><cell>6/6=1.00</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Database mining: A performance perspective</title>
		<author>
			<persName><forename type="first">R</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Imielinski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Swami</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="914" to="925" />
			<date type="published" when="1993-12">December 1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">Christian</forename><surname>Borgelt</surname></persName>
		</author>
		<ptr target="http://www.borgelt.net/software.html" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="ftp://ftp.ics.uci.edu/pub/machine-learning-databases/census-income/" />
		<title level="m">census income</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Book review: &apos;fuzzy set theory and its applications</title>
		<author>
			<persName><forename type="first">Didier</forename><surname>Dubois</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Diffusion scientifique</title>
				<editor>
			<persName><forename type="first">H</forename><forename type="middle">J</forename><surname>Zimmermann</surname></persName>
		</editor>
		<imprint>
			<publisher>Kluwer Academic Publ. Dordrecht</publisher>
			<date type="published" when="1991">1991</date>
			<biblScope unit="volume">48</biblScope>
			<biblScope unit="page" from="169" to="170" />
		</imprint>
	</monogr>
	<note>2nd edition</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A new algorithm for mining fuzzy association rules in the large databases based on ontology</title>
		<author>
			<persName><forename type="first">Zahra</forename><surname>Farzanyar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohammadreza</forename><surname>Kangavari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sattar</forename><surname>Hashemi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDM Workshops</title>
				<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="65" to="69" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">From data mining to knowledge discovery: An overview</title>
		<author>
			<persName><forename type="first">Usama</forename><forename type="middle">M</forename><surname>Fayyad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gregory</forename><surname>Piatetsky-Shapiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Padhraic</forename><surname>Smyth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Knowledge Discovery and Data Mining</title>
				<imprint>
			<date type="published" when="1996">1996</date>
			<biblScope unit="page" from="1" to="34" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Towards principles for the design of ontologies used for knowledge sharing</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">R</forename><surname>Gruber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Formal Ontology in Conceptual Analysis and Knowledge Representation</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Poli</surname></persName>
		</editor>
		<meeting><address><addrLine>Deventer, The Netherlands</addrLine></address></meeting>
		<imprint>
			<publisher>Kluwer Academic Publishers</publisher>
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Comparing images using the hausdorff distance</title>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">P</forename><surname>Huttenlocher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gregory</forename><forename type="middle">A</forename><surname>Klanderman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">William</forename><forename type="middle">J</forename><surname>Rucklidge</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="850" to="863" />
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Semantic similarity based on corpus statistics and lexical taxonomy</title>
		<author>
			<persName><forename type="first">Jay</forename><forename type="middle">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">W</forename><surname>Conrath</surname></persName>
		</author>
		<idno>CoRR, cmp-lg/9709008</idno>
		<imprint>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
	<note>informal publication</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Semantic similarity based on corpus statistics and lexical taxonomy</title>
		<author>
			<persName><forename type="first">Jay</forename><forename type="middle">J</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">W</forename><surname>Conrath</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997-09-20">September 20 1997</date>
		</imprint>
	</monogr>
	<note>Comment: 15 pages. Postscript only</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Finding interesting rules from large sets of discovered association rules</title>
		<author>
			<persName><forename type="first">Mika</forename><surname>Klemettinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Heikki</forename><surname>Mannila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pirjo</forename><surname>Ronkainen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hannu</forename><surname>Toivonen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Inkeri</forename><surname>Verkamo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Third International Conference on Information and Knowledge Management (CIKM&apos;94)</title>
				<editor>
			<persName><forename type="first">Nabil</forename><forename type="middle">R</forename><surname>Adam</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Bharat</forename><forename type="middle">K</forename><surname>Bhargava</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Yelena</forename><surname>Yesha</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="1994-11">November 1994</date>
			<biblScope unit="page" from="401" to="407" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Combining local context and WordNet similarity for word sense identification</title>
		<author>
			<persName><forename type="first">Claudia</forename><surname>Leacock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Chodorow</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WordNet: An Electronic Lexical Database</title>
				<editor>
			<persName><forename type="first">Christiane</forename><surname>Fellbaum</surname></persName>
		</editor>
		<meeting><address><addrLine>Cambridge, Massachusetts</addrLine></address></meeting>
		<imprint>
			<publisher>The MIT Press</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="265" to="283" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Post-analysis of learned rules</title>
		<author>
			<persName><forename type="first">Bing</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wynne</forename><surname>Hsu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth National Conference on Artificial Intelligence and the Eighth Innovative Applications of Artificial Intelligence Conference</title>
				<meeting>the Thirteenth National Conference on Artificial Intelligence and the Eighth Innovative Applications of Artificial Intelligence Conference<address><addrLine>Menlo Park</addrLine></address></meeting>
		<imprint>
			<publisher>AAAI Press / MIT Press</publisher>
			<date type="published" when="1996">1996</date>
			<biblScope unit="page" from="828" to="834" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Using general impressions to analyze discovered classification rules</title>
		<author>
			<persName><forename type="first">Bing</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wynne</forename><surname>Hsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shu</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97)</title>
				<editor>
			<persName><forename type="first">David</forename><surname>Heckerman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Heikki</forename><surname>Mannila</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Daryl</forename><surname>Pregibon</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Ramasamy</forename><surname>Uthurusamy</surname></persName>
		</editor>
		<meeting>the Third International Conference on Knowledge Discovery and Data Mining (KDD-97)</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page">31</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Finding interesting patterns using user expectations</title>
		<author>
			<persName><forename type="first">Bing</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wynne</forename><surname>Hsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lai-Fun</forename><surname>Mun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hing-Yan</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Knowl. Data Eng</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="817" to="832" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Selecting among rules induced from a hurricane database</title>
		<author>
			<persName><forename type="first">John</forename><forename type="middle">A</forename><surname>Major</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><forename type="middle">J</forename><surname>Mangano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Intell. Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="39" to="52" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A survey of interestingness measures for knowledge discovery</title>
		<author>
			<persName><forename type="first">Kenneth</forename><surname>McGarry</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Eng. Review</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="39" to="61" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">On the discovery of unexpected rules in data mining applications</title>
		<author>
			<persName><forename type="first">B</forename><surname>Padmanabhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Procs. of the Workshop on Information Technology and Systems (WITS &apos;97)</title>
				<meeting>the Workshop on Information Technology and Systems (WITS &apos;97)</meeting>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="81" to="90" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A belief-driven method for discovering unexpected patterns</title>
		<author>
			<persName><forename type="first">Balaji</forename><surname>Padmanabhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD</title>
				<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="94" to="100" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Unexpectedness as a measure of interestingness in knowledge discovery</title>
		<author>
			<persName><forename type="first">Balaji</forename><surname>Padmanabhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Tuzhilin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999-01-09">January 09 1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">On characterization and discovery of minimal unexpected patterns in rule discovery</title>
		<author>
			<persName><forename type="first">Balaji</forename><surname>Padmanabhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Knowl. Data Eng</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="202" to="216" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Discovery, analysis, and presentation of strong rules</title>
		<author>
			<persName><forename type="first">Gregory</forename><surname>Piatetsky-Shapiro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Discovery in Databases</title>
				<imprint>
			<publisher>AAAI/MIT Press</publisher>
			<date type="published" when="1991">1991</date>
			<biblScope unit="page" from="231" to="233" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">The interestingness of deviations</title>
		<author>
			<persName><forename type="first">Gregory</forename><surname>Piatetsky-Shapiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">J</forename><surname>Matheus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD Workshop</title>
				<imprint>
			<date type="published" when="1994">1994</date>
			<biblScope unit="page" from="25" to="36" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Development and application of a metric on semantic nets</title>
		<author>
			<persName><forename type="first">R</forename><surname>Rada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Bicknell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Blettner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Systems, Man, and Cybernetics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="17" to="30" />
			<date type="published" when="1989-02">January-February 1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Discovering interesting knowledge from a science and technology database with a genetic algorithm</title>
		<author>
			<persName><forename type="first">Wesley</forename><surname>Romão</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alex</forename><forename type="middle">Alves</forename><surname>Freitas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Itana</forename><forename type="middle">Maria de Souza</forename><surname>Gimenes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Appl. Soft Comput</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="121" to="137" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Interestingness via what is not interesting</title>
		<author>
			<persName><forename type="first">Sigal</forename><surname>Sahar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD</title>
				<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="332" to="336" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Interestingness preprocessing</title>
		<author>
			<persName><forename type="first">Sigal</forename><surname>Sahar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDM</title>
				<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="489" to="496" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">On incorporating subjective interestingness into the mining process</title>
		<author>
			<persName><forename type="first">Sigal</forename><surname>Sahar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDM</title>
				<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="681" to="684" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">On subjective measures of interestingness in knowledge discovery</title>
		<author>
			<persName><forename type="first">Abraham</forename><surname>Silberschatz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD</title>
				<imprint>
			<date type="published" when="1995">1995</date>
			<biblScope unit="page" from="275" to="281" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">What makes patterns interesting in knowledge discovery systems</title>
		<author>
			<persName><forename type="first">Abraham</forename><surname>Silberschatz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="970" to="974" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Learning useful rules from inconclusive data</title>
		<author>
			<persName><forename type="first">Ramasamy</forename><surname>Uthurusamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Usama</forename><forename type="middle">M</forename><surname>Fayyad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Scott</forename><surname>Spangler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Discovery in Databases</title>
				<imprint>
			<date type="published" when="1991">1991</date>
			<biblScope unit="page" from="141" to="158" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Verb semantics and lexical selection</title>
		<author>
			<persName><forename type="first">Zhibiao</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martha</forename><surname>Palmer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Annual Meeting of the Association for Computational Linguistics</title>
				<meeting><address><addrLine>New Mexico State University; Las Cruces, New Mexico</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1994">1994</date>
			<biblScope unit="page" from="133" to="138" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
