<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evolutionary Learning of Boolean Queries by Genetic Programming</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Suhail</forename><forename type="middle">S J</forename><surname>Owais</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">V ŠB-Technical University of Ostrava</orgName>
								<address>
									<addrLine>17. listopadu 15</addrLine>
									<settlement>Ostrava -Poruba</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pavel</forename><surname>Krömer</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">V ŠB-Technical University of Ostrava</orgName>
								<address>
									<addrLine>17. listopadu 15</addrLine>
									<settlement>Ostrava -Poruba</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Václav</forename><surname>Snášel</surname></persName>
							<email>vaclav.snasel@vsb.cz</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">V ŠB-Technical University of Ostrava</orgName>
								<address>
									<addrLine>17. listopadu 15</addrLine>
									<settlement>Ostrava -Poruba</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Evolutionary Learning of Boolean Queries by Genetic Programming</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7CA4A98F6A25E2BE1D9E8B305FE4C9E6</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:14+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The performance of an information retrieval system is usually measured in terms of two different criteria, precision and recall. This way, the optimization of any of its components is a clear example of a multiobjective problem. However, although evolutionary programming have been widely applied in the information retrieval area, in all of these applications both criteria have been combined in a single scalar fitness function by means of a weighting scheme. In this paper, we deal with using of Genetic Programming in Information retrieval specially in optimizing of a Boolean query.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Ever since the advent of the public network Internet, the quantity of available information is rapidly rising. One of the most important uses of this public network is to find suitable information for such user query request. In such a huge and unstable information collection, todays greatest problem is to find relevant information to the user query.</p><p>Information filtering is concerned with finding information from unstable collections of documents such as the Internet. In the information filtering domain, the user query does not consists of a list of words or terms (word and term have the same meaning in our work) to search for but rather of combinations of words extracted from various examples. The most important problem to solve is to optimize the significance of the user query and obtaining accurate collection statistics for calculating the term arity.</p><p>After using evolutionary techniques for single-objective optimization during more than two decades, the incorporation of more than one objective in the fitness function has become a popular area of research for multiobjective problems. The use of evolutionary algorithms to solve problems with multiple objectives (known as Multi-objective Optimization Problems) has attracted much attention <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b17">18]</ref>. An information retrieval system is basically constituted of three main components: documentary database, query subsystem and matching or evaluation mechanism <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b12">13]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Evaluation of Information Retrieval System</head><p>Evaluation of the information retrieval system, measured by effectiveness, two statistics are used precision and recall, where these measures are evaluated over a set of documents called a collection of documents. All documents in this collection of documents are divided into four subsets: Relevant set "set of documents that are relevant to the user query"; Retrieved set "set of documents that are returned to the user"; and Relevant-Retrieved set "set of documents that are retrieved and relevant to the user query"; and finally the rest set of documents "set of documents that are not relevant and not retrieved". Where precision is the percentage of the retrieved documents that are relevant to the user query and recall is the percentage of the relevant documents that are retrieved for the requested query <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b7">8]</ref>.</p><formula xml:id="formula_0">Recall = RelevantRetrieved Relevant P recision = RelevantRetrieved</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Retrieved</head><p>In our work we introduce to use Genetic Programming for implementing the Information Retrieval system with Boolean queries, trying to evolve Boolean queries by genetic programming.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Genetic Algorithms</head><p>Most of the search engines in the internet depend on the user query and operate an information retrieval system to get the response of the user query request. Where the user query consists of set of terms and set of logical operators; especially and, or, of, and not operator see <ref type="bibr" target="#b7">[8]</ref>. For this our motivation in our work is to do the evolution of the Boolean queries using genetic programming in the information retrieval <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b14">15]</ref>.</p><p>Genetic Algorithm is an algorithm that used to find approximate solutions to problems that were difficult to solve it through set of methods or techniques inheritance or crossover, mutation, natural selection, and fitness function that are principles of evolutionary biology in computer science. For more detail about Genetic Algorithms see <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b15">16]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Genetic Programming</head><p>This section will present the implementation of information retrieval using genetic programming (for SQL we can see <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b3">4]</ref>). The GA is generally used to solve optimization problems <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b4">5]</ref>. GA starts on an initial population with fixed size of chromosomes "P-chromosomes". Each individual are coded according to chromosome length, where genes are allocated in each position in a chromosome with different data types, and each gene values is called allele. In information retrieval, query for relevant documents are representing for each individual or chromosome, and each document described by set of terms. For the collection of documents D, the description for document D i from l documents, where i = 1 . . . l, the set of terms for D i are T j , where j = 1 . . . n, thus D i = (w 1i , w 2i , . . . , w ni ). The value for each term will be 1 if this term exists in the document or 0 if not (Note: about another weights for terms were mentioned in paper <ref type="bibr" target="#b13">[14]</ref>), this indicate that the indexing function that is maps a given index term t and a given document d is</p><formula xml:id="formula_1">F : D × T → [0, 1]</formula><p>Defining a query will be combination from set of terms and set of Boolean operators and, or, of and not. The query set Q is defined as set of queries for documents, define the query processing mechanism by which documents can be evaluated in terms of their relevance to a given query <ref type="bibr" target="#b9">[10]</ref>.</p><p>Note: The of operator has the following general form:</p><p>N of(w 1 , w 2 , w 3 , . . . , w M ); M ≥ N and works like this: the document will be retrieved when it contains at least N terms from the list of M terms. For an example, 2 of(w 1 , w 2 , w 3 ) = ((w 1 and w 2 ) or(w 1 and w 3 ) or(w 2 and w 3 ))</p><p>In this work, we develop genetic programming for implementing GA operators with variable length of chromosomes and mixture symbolic of information, like real values and Boolean queries values.</p><p>Each chromosome from the initial population represented a tree structure for one query; an index was defined for each node in the tree. Genetic operators were operated over individuals. Queries will be encoded as trees, where each chromosome contains set of genes, and each gene mention to be a node in a tree and the value for each node known as allele. An example that show query encoding for chromosome in the population shown in Figure <ref type="figure" target="#fig_0">1</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Implement Genetic Operators to Evolve Boolean Queries</head><p>Genetic operators used in our work to evolve Boolean queries. Presenting for these operators Fitness, Selection, Crossover, and Mutation follows:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fitness function operator</head><p>For each individual the value of precision and recall will be computed and known as fitness values see RecallF itness and P recisionF itness respectively, this depends on the number of relevance documents r d in the collection of documents to the user query, number of retrieved document f d , and α and β are arbitrary weights. Where α and β are added specially to precision fitness function <ref type="bibr" target="#b9">[10]</ref>.</p><formula xml:id="formula_2">ReallF itness = d [r d × f d ] d [r d ] P recisionF itness = α d [r d × f d ] d [r d ] + β d [r d × f d ] d [f d ]</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Selection operator</head><p>Two individuals with best fitness values are chosen from a population, but if there are more than two individuals with the same highest fitness values, then two of them will be chosen randomly. The two selected chromosomes will be called parents and they will be used to produce two new offsprings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Crossover operator</head><p>Offsprings must have some inheritance from the two parents; single point crossover will do that by exchange subtree from parent1 with subtree from parent2. Positions for exchanging subtree1 and subtree2 will be select randomly. In our work we define the selection of the position for subtree to be:</p><p>1. The root node of the tree. 2. Each Boolean operator node.</p><p>3. Each leaf from the tree.</p><p>Producing two new offsprings from implementation of a single point crossover was shown as an example in Figure <ref type="figure" target="#fig_1">2</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Mutation operators</head><p>Mutation, random perturbation in the chromosome representation, is necessary to assure that the current generation is connected to the entire search space, and it is necessary to introduce new genetic material into a population that has stabilized level <ref type="bibr" target="#b9">[10]</ref>. In our implementation, mutation operator works as the most important operator for the evolutionary learning of Boolean query.</p><p>Each node from the new offsprings may be mutated; that depends on mutation value (by default 0.2). And we work with different type of mutations shown below:</p><p>-Mutation on Boolean operator: randomly exchanging one operator to another but both must be from the same arity, such as any exchange in ( and, or, of and not) are allowed. -Mutation on term node or leaf node: changing one term selected randomly from the offspring by any another one but the other one will be one from:</p><p>• The terms in a given collection of documents • The terms in an initial population.</p><p>• A specified list of terms.</p><p>• The terms appeared in the user query. -Mutation by inserting or deleting unary operator between two nodes in the offspring.</p><p>Where mutation was implemented on this way: For generated offspring select one node randomly and for this node we have two possibilities to mutate into another one or to apply insert an unary operator before it or delete it if and only if this node is an unary operator. Some examples were shown for mutations in Figure <ref type="figure" target="#fig_2">3</ref>. Presenting our work now to show how our research processed for Boolean queries evolutionary learning was done.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Introduction to experiments</head><p>We developed a genetic programming to process some experiments over a set of Boolean queries and various collections of documents; the documents are with various number of words or terms. All collections used in our experiments are described in Table <ref type="table" target="#tab_0">1</ref>, where collection 'Collection-1' consists of 10 different documents and 30 different words, where each document includes some of these words (one, two or more of them). For all of our experiments were used the following ten Boolean queries as an initial population for processing our genetic programming: 2 of(w 2 , w 8 ) 1 of(w 1 , w 2 , w 8 ) not(not w 13 and not w 8 ) (w 1 and(w 2 and w 8 )) or not(w 4 or w 2 ) not(w 1 or w 2 ) and((w 5 or w 4 ) and(w 3 and w 6 )) (w 9 and w 14 ) (not w 14 ) and w 1 (w 2 or w 6 ) or(w 8 and w 13 ) (w 3 and w 4 ) or((w 1 2 xor w 15 ) and w 8 ) (w 2 or w 8 ) or(w 1 and w 2 )</p><p>The Genetic programming execution was terminated when one or more chromosomes from the population reached the highest value of the selected fitness function, or when reaching a maximum number of generations, where the highest values for precision and recall are α + β and 1 respectively.</p><p>All experiments were done multiple times with the same options to see the differences in the results, because of probability used during genetic programming process. In all experiments were used following fixed options:</p><p>the arbitrary weights for α = 0.25, and β = 1.0 -crossover value = 0.8</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Experiments on different types of mutation and different fitness functions</head><p>Mutation value is probability of applying mutation operator on offsprings. In these experiments we observed how the changes of mutation value affect the result of genetic programming process, where we used different types of mutation as described above and two different sets of options where implemented on our experiments.</p><p>First set of options are:</p><p>-User query is:-(w 6 and w 8 ) and not w 10 -Collection name is:-Collection-1 -Used fitness measures are:-precision or recall -Highest number of generations are: -200 generations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiments on using precision fitness function:</head><p>All terms from the initial population were used for mutation on leaves, the results were shown in table <ref type="table" target="#tab_1">2</ref>. Where in all experiments the chromosomes fitness values in the final population reached to be highest value, and the same for recall fitness value is the highest too, where the number of generations was variant. All terms from the user query were used for mutation on leaves, and the results were shown in Table <ref type="table" target="#tab_2">3</ref>. In this case, mostly maximum number of generations were reached to obtain the optimized query, because of not reaching the highest value on the selected fitness function, when the highest values for recall were reached. All terms form the whole collection were used for mutation of leaves, and the results were shown in Table <ref type="table" target="#tab_3">4</ref>. Where in some experiments the termination of the program execution was done because of reaching the maximum number of generations or reaching the highest precision fitness value, where mostly in all experiments we reached the highest values for recall fitness values. Experiments on using recall fitness function:</p><p>All chromosomes in the final population had approximately the same highest value of recall, but mostly the values of precision are various, some of these results are shown in tables bellow. Table <ref type="table" target="#tab_4">5</ref> shows the results when the mutation over leaves used terms from user query only, table <ref type="table">6</ref> shows the results when the mutation over leaves used terms from initial population, and table <ref type="table" target="#tab_5">7</ref> shows the results when the mutation over leaves used terms from whole population. Some experiments were done using the second set of options with the following results shown in Table <ref type="table" target="#tab_7">8 and 9</ref>.</p><p>Second set of options are:</p><p>-User query is:-((not w 10 ) and(w 6 and w 8 )) -Collection name is:-Collection-2 -Used fitness measures are:-precision or recall -Highest number of generations are: -1200 generations. After increasing number of generations and experiments were done on Collection-2, there were a differences in the results because in some cases we reached the best solution depends on precision fitness function as shown in table 8 before, where the results in table <ref type="table" target="#tab_7">9</ref> shown before mostly all experiments were reached the highest value of recall within a few number of generations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Experiments over fitness functions:</head><p>The goal of optimization process of a Boolean query is to get a query with highest possible values of precision or recall depends on chosen fitness function. Results shown above demonstrate, that when we used recall as the fitness function, the program terminated within few number of generations because of reaching the highest value of recall as a fitness function, and when using precision as the fitness function recall has reached the highest value mostly in all experiments, where some of precision values were not high.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusions</head><p>In this paper, an optimization of Boolean query over a collection of documents is presented. We focused especially on different types of mutation and on comparison of two different fitness measures, precision and recall. Experiments were done over various types of document collections and different types of mutation and two types of fitness functions. So we obtained the following conclusions:</p><p>First, when applying mutation operator on terms in the chromosomes from the initial population, it is necessary to have all the terms from the search space at disposal for mutation. If only terms from user query or initial population were used for mutation, the results were worse than when terms from whole collection were used. Only then there can come into existence new queries, describing the same documents as user query, but containing terms not included into user query or initial population.</p><p>Second, recall seems to be more efficient than precision when chosen as a fitness function to reach an optimized query within less number of generations than when precision was chosen as a fitness function. So we retrieved all relevant documents with few number of non-relevant documents. But on choosing precision as a fitness function, we reached mostly the highest values of recall before program termination when the highest values of precision were mostly not reached and the process terminated after reaching maximum number of generations.</p><p>Third, in some cases, especially when we used for mutation over leaves the terms from collection or from initial population on two different types of fitness functions an optimized query was reached within few number of generations, but on chosen recall as fitness function the results were reached within less number of generations than when precision was used as a fitness function. But for mutation over leaves the terms from user query only and the fitness function was precision there were worse results than in other cases.</p><p>We will focus in our future work on weighted terms and weighted Boolean operators for implementing the fuzzy logic over terms and Boolean operators weights for optimizing user query in information retrieval systems, and also on using different methods for evaluating the performance of information retrieval such as Harmonic mean measure (F-score). We also want to consider the number of Boolean operators and the number of terms as the objectives for query optimization.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Chromosome encoding form a query</figDesc><graphic coords="3,151.80,457.86,311.80,99.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Single point crossover, Randomly select nodes on parents</figDesc><graphic coords="5,151.80,66.06,311.80,170.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Processing mutation over offsprings, where nodes are selected randomly</figDesc><graphic coords="6,151.80,65.92,311.80,255.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Description of Document Collections</figDesc><table><row><cell cols="3">Collection Name Number of Documents Maximum number of terms</cell></row><row><cell></cell><cell></cell><cell>in each document</cell></row><row><cell>Collection-1</cell><cell>10</cell><cell>30</cell></row><row><cell>Collection-2</cell><cell>50</cell><cell>200</cell></row><row><cell>Collection-3</cell><cell>5000</cell><cell>1000</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Precision, mutation over leaves using terms from all initial population mutation value number of generations final precision final recall</figDesc><table><row><cell>0.1</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>24</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>27</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>118</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>45</cell><cell>1.25</cell><cell>1.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Precision, mutation over leaves using terms from user query only mutation value number of generations final precision final recall</figDesc><table><row><cell>0.1</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>113</cell><cell>1.25</cell><cell>1.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 .</head><label>4</label><figDesc>Precision, mutation over leaves using terms from whole collection</figDesc><table><row><cell cols="4">mutation value number of generations final precision final recall</cell></row><row><cell>0.1</cell><cell>200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>58</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>11</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>187</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>42</cell><cell>1.25</cell><cell>1.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5 .</head><label>5</label><figDesc>Recall, mutation over leaves using terms from user query mutation value number of generations final precision final recall</figDesc><table><row><cell>0.1</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>5</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>5</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>6</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell cols="4">Table 6. Recall, mutation over leaves using terms from initial population</cell></row><row><cell cols="4">mutation value number of generations final precision final recall</cell></row><row><cell>0.1</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>5</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>5</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>7</cell><cell>0.583</cell><cell>1.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 7 .</head><label>7</label><figDesc>Recall, mutation over leaves using terms from whole collection mutation value number of generations final precision final recall</figDesc><table><row><cell>0.1</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>5</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>6</cell><cell>0.500</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>8</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>5</cell><cell>0.583</cell><cell>1.00</cell></row><row><cell cols="4">From these results shown in tables 5, 6 and 7 the executions of program were</cell></row><row><cell cols="4">terminated when the highest recall fitness function values of chromosomes were</cell></row><row><cell cols="2">reached within few number of generations.</cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 8 .</head><label>8</label><figDesc>Precision, mutation over leaves using terms from user query, Collection-2</figDesc><table><row><cell cols="4">mutation value number of generations final precision final recall</cell></row><row><cell>0.1</cell><cell>1200</cell><cell>1.00</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>1200</cell><cell>0.75</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>197</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>462</cell><cell>1.25</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>1200</cell><cell>0.75</cell><cell>1.00</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 9 .</head><label>9</label><figDesc>Recall, mutation over leaves using terms from user query, Collection-2 mutation value number of generations final precision final recall</figDesc><table><row><cell>0.1</cell><cell>16</cell><cell>0.3050</cell><cell>1.00</cell></row><row><cell>0.2</cell><cell>13</cell><cell>0.3050</cell><cell>1.00</cell></row><row><cell>0.3</cell><cell>23</cell><cell>0.3050</cell><cell>1.00</cell></row><row><cell>0.4</cell><cell>11</cell><cell>0.3050</cell><cell>1.00</cell></row><row><cell>0.5</cell><cell>17</cell><cell>0.3050</cell><cell>1.00</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Modern Information Retrieval</title>
		<author>
			<persName><forename type="first">R</forename><surname>Baeza-Yates</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ribeiro-Neto</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>Addison Wesley</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A machine learning approach to inductive query by examples: an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing</title>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Society for Information Science</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="693" to="705" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Evolutionary Learning of Boolean Queries by Multiobjective Genetic Programming</title>
		<author>
			<persName><forename type="first">O</forename><surname>Cordon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Herrera-Viedma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Luque¡</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">PPSN VII, LNCS</title>
				<editor>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Merelo Guervos</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="volume">2439</biblScope>
			<biblScope unit="page" from="710" to="719" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A Rule-Based View of Query Optimization</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Freytag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACM-SIGMOD</title>
				<meeting>ACM-SIGMOD</meeting>
		<imprint>
			<date type="published" when="1987">1987</date>
			<biblScope unit="page" from="173" to="180" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Genetic Algorithms in Search, Optimization and Machine Learning</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Goldberg</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1989">1989</date>
			<publisher>Addison-Wesley</publisher>
			<pubPlace>Reading, Massachusetts</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">PDE:A Pareto-frontier Differential Evolution Approach for Multi-objective Optimization Problems</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Abbass</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sarker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Newton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Congress on Evolutionary Computation 2001 (CEC&apos;2001)</title>
				<meeting>the Congress on Evolutionary Computation 2001 (CEC&apos;2001)<address><addrLine>Piscataway, New Jersey</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Service Center</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="971" to="978" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">On Optimizing an SQL-like Nested Query</title>
		<author>
			<persName><forename type="first">W</forename><surname>Kim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Database Systems</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="443" to="469" />
			<date type="published" when="1982">1982</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Information Storage and Retrieval</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Korfhage</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997">1997</date>
			<publisher>John Wiley &amp; Sons, Inc</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Genetic programming. On the programming of computers by means of natural selection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Koza</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1992">1992</date>
			<publisher>The MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Fuzzy Set Techniques in Information Retrieval</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">H</forename><surname>Kraft</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bordogna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Pasi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fuzzy Sets in Approximate Reasoning and Information Systems</title>
				<editor>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Bezdek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Didier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</editor>
		<meeting><address><addrLine>Norwell, MA</addrLine></address></meeting>
		<imprint>
			<publisher>Kluwer Academic Publishers</publisher>
			<date type="published" when="1999">1999</date>
			<biblScope unit="volume">3</biblScope>
		</imprint>
	</monogr>
	<note>The Handbook of Fuzzy Sets Series</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Evaluating Optimizers. Database Programming and Design</title>
		<author>
			<persName><forename type="first">D</forename><surname>Mcgoveran</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1990">1990</date>
			<biblScope unit="page" from="38" to="49" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">An Introduction to Genetic Algorithms</title>
		<author>
			<persName><forename type="first">M</forename><surname>Melanie</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>A Bradford Book The MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Rijsbergen</surname></persName>
		</author>
		<title level="m">Information Retrieval</title>
				<imprint>
			<publisher>Butterworth</publisher>
			<date type="published" when="1979">1979</date>
		</imprint>
	</monogr>
	<note>2nd edition</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Terms-Weighting approach in automatic text retrieval</title>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Buckley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing and management</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="513" to="523" />
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The use of genetic programming to build Boolean queries for text retrieval through relevance feedback</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Information Science</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="423" to="431" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Timetabling of Lectures in the Information Technology College at Al al-Bayt University Using Genetic Algorithms</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S J</forename><surname>Owais</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2003">2003</date>
			<pubPlace>Jordan</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Al al-Bayt University</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master thesis</note>
	<note>in Arabic</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Optimization of Query Algorithms</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Yao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Database Systems</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="133" to="155" />
			<date type="published" when="1979">1979</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications</title>
		<author>
			<persName><forename type="first">E</forename><surname>Zitzler</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<pubPlace>Zurich</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Swiss Federal Institute of Technology Zurich</orgName>
		</respStmt>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
