<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Preliminary Study Towards a Fuzzy Model for Visual Attention</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Anca</forename><surname>Ralescu</surname></persName>
							<email>anca.ralescu@uc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">EECS Department</orgName>
								<orgName type="institution">University of Cincinnati</orgName>
								<address>
									<addrLine>ML 0030</addrLine>
									<postCode>45221</postCode>
									<settlement>Cincinnati</settlement>
									<region>OH</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Isabelle</forename><surname>Bloch</surname></persName>
							<email>isabelle.bloch@telecom-paristech.fr</email>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">Institut Mines Telecom</orgName>
								<orgName type="department" key="dep2">Telecom Paristech</orgName>
								<orgName type="institution">CNRS LTCI</orgName>
								<address>
									<settlement>Paris</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roberto</forename><surname>Cesar</surname></persName>
							<email>cesar@ime.usp.br</email>
							<affiliation key="aff2">
								<orgName type="department">IME</orgName>
								<orgName type="institution">University of Sao Paulo</orgName>
								<address>
									<settlement>Sao Paulo</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Preliminary Study Towards a Fuzzy Model for Visual Attention</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">5AC8EB11917075898960C572C5A1F0A9</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T22:53+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Attention, in particular visual attention, has been a subject of studies in various disciplines, including cognitive science, experimental psychology, and computer vision. In cognitive science and experimental psychology the objective is to develop theories that can explain the attention phenomenon of cognition. In computer vision, the objective is to inform image understanding systems by hypotheses on the human visual attention. There is, however, very little influence of studies across these two disciplines. In a departure from this state of affairs, this study seeks to develop an algorithmic approach to visual attention as part of an image understanding system, by starting with a theory of visual attention put forward in experimental psychology. In the process, it will become useful to revise some of the concepts of this theory, in particular by adopting fuzzy set based representations and the necessary calculus for them.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>As a subject of human cognition, attention has attracted great interest from the fields of cognitive science and experimental psychology.</p><p>Visual attention is a wide field, largely addressed in the literature and covering many different aspects. Some works related to the present paper are briefly reviewed here, without aiming at exhaustiveness. One approach relies on Gestalt theory, and Gestalt and computer vision models are compared by <ref type="bibr" target="#b4">(Desolneux, Moisan, and Morel 2003)</ref>. Two sets of experiments for Gestalt detection methods are carried out and compared to computationally predicted results. Object size and noise are the two parameters taken into account in these experiments. The authors indicate that the qualitative thresholds predicted by the proposed computational approach to gestalt detection fit human perception.</p><p>Another approach is purely computational and based on image information. An important review on visual attention modeling is presented by <ref type="bibr">(Borji and Itti 2013)</ref>. The important aspect of saliency-based attention is specifically addressed in this review. Nearly 65 models are reviewed and classified in a didactical taxonomy that helps clarify the field. Visual saliency refers to a bottom-up phenomenon whereby some scene regions are detected as more prominent than others due to some visual features. There are different biological and computational approaches to model such phenomena. For instance, the center-surround hypothesis (a common issue for the analysis of receptive fields in the retina) is a classical model for bottom-up saliency <ref type="bibr" target="#b7">(Gao, Mahadevan, and Vasconcelos 2008)</ref>. In such settings, Gao and co-authors <ref type="bibr" target="#b7">(Gao, Mahadevan, and Vasconcelos 2008)</ref> incorporate discriminant features and a decision-theoretic model for saliency characterization. 
Saliency detection is important in many different imaging and vision applications <ref type="bibr" target="#b15">(Yan et al. 2013;</ref><ref type="bibr" target="#b16">Yang et al. 2013)</ref>. For instance, in medical imaging, saliency maps are useful to guide model-based image segmentation <ref type="bibr" target="#b6">(Fouquier, Atif, and Bloch 2012)</ref>, thus merging top-down and bottom-up approaches.</p><p>The mechanism of attention has been studied intensively in the fields of psychology and cognitive science, (Kahneman 1973), <ref type="bibr" target="#b12">(Treisman and Gelade 1980)</ref>, <ref type="bibr" target="#b13">(Treisman 1988)</ref>, <ref type="bibr" target="#b14">(Treisman 2014)</ref>, <ref type="bibr" target="#b8">(Humphreys 2014)</ref>, <ref type="bibr" target="#b1">(Bundesen, Habekost, and Kyllingsbaek 2005)</ref>, <ref type="bibr" target="#b2">(Bundesen, Vangkilde, and Petersen 2014)</ref>. In this paper we focus on the theory of visual attention introduced in <ref type="bibr" target="#b3">(Bundesen 1990)</ref>, where visual recognition and attentional selection are considered as the task of perceptual categorization, basically deciding to which category an object or element of the visual field belongs.</p><p>Following the notation of <ref type="bibr" target="#b3">(Bundesen 1990)</ref>, throughout this paper x is an input item, e.g. an image or image region, or more generally an item to be categorized or recognized. The collection of all items x is denoted by S. A category is denoted by i and the collection of all categories is denoted by R. A category can stand for an ontological category (e.g., an object, or a scene), or for subsets in the range of a particular attribute (e.g., red for the attribute color). Regardless of the situation, the conceptual treatment of categories and/or items is the same. E(x, i) denotes the event/statement "x is in category i". 
When viewed as an event, one can talk about its probability; when viewed as a statement, one can talk about its truth or its possibility.</p><p>From this point on, the paper is organized as follows: Section 2 contains a brief review of TVA concepts and its two mechanisms, filtering and pigeonholing. Section 3 presents the motivation for the introduction of fuzzy sets and the fuzzy mechanisms of filtering and pigeonholing. Conclusions and future research are in Section 4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">The Theory of Visual Attention (TVA)</head><p>In this section, we review and comment on the main concepts and modeling steps of the Theory of Visual Attention (TVA) by <ref type="bibr" target="#b3">(Bundesen 1990</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Attentional Weight</head><p>One of the main concepts introduced in TVA is that of the attentional weight, defined as follows:</p><formula xml:id="formula_0">w(x) = Σ_{i∈R} η(x, i)π(i)<label>(1)</label></formula><p>What are the possible interpretations of the quantities in Equation (1)? If η(x, i) is interpreted as the salience of x for category i, then w(x) could be interpreted as the salience of x across the family of categories R, averaged with respect to category pertinence. From the point of view of computer vision, η(x, i) is simply the output of an operator designed to provide information for category i.</p><p>Note that the pertinence of a category is (or must be) considered with respect to a task, which could be a categorization at a higher semantic/ontological level. Adopting this point of view, the product η(x, i)π(i) can then be interpreted as the pertinence of item x to the task with respect to which category i had pertinence π(i). More precisely, one can define π(x, T_i) = η(x, i)π(i)</p><p>as the pertinence of x to T_i, where T_i is the task to which category i has pertinence value π(i). For example, suppose that i is the color category "red" of the attribute color. Furthermore, suppose that the color category "red" has pertinence π(red) to the task of identifying visually an object such as, for instance, the "flag of some country". Now let x be a region in an image, and η(x, red) the output of evaluating it with respect to the color "red". Then π(x, T_red) = η(x, red)π(red) is the pertinence of x to the task T_red.</p><p>Taking max/min with respect to x yields:</p><formula xml:id="formula_1">x_max,red = arg max_{x∈S} π(x, T_red),</formula><p>the region in the input which is most pertinent to T_red, and</p><formula xml:id="formula_2">x_min,red = arg min_{x∈S} π(x, T_red),</formula><p>the region in the input which is least pertinent to T_red. 
Similarly, taking max/min over categories yields</p><formula xml:id="formula_3">i_max = arg max_{i∈R} π(i); i_min = arg min_{i∈R, π(i)&gt;0} π(i)</formula><p>the most/least pertinent categories, respectively. The condition π(i) &gt; 0 ensures that categories which are not pertinent at all, i.e. with π(i) = 0, are not taken into account, so the trivial case π(i_min) = 0 is never obtained. Then, for fixed x, η(x, i_max) and η(x, i_min) are the strengths of evidence for x to be in the highest/lowest pertinence category, and π(x, T_max) = η(x, i_max)π(i_max) and π(x, T_min) = η(x, i_min)π(i_min) are the importances of x to the tasks corresponding to the categories of highest/lowest pertinence value. Versions of the following "flag example" will be used in this paper to illustrate various points.</p><p>Example 1 Let T stand for the task to determine if an object identified in an image corresponds to a "flag of some country". The decision is to be based on color information only. Assume several color categories and their respective pertinences as shown in Table <ref type="table">1</ref>.</p><p>Table <ref type="table">1</ref>: Color categories and their respective pertinence values to the task "Identify flag of a country".</p><p>Color category i and category pertinence π(i):</p><formula xml:id="formula_4">red 0.8
yellow 0.3
black 0.1
green 0.2
(max π(i), i_max) = (0.8, red)
(min π(i), i_min) = (0.1, black)</formula><p>In this example π(x, T_red) = 0.8η(x, red); π(x, T_black) = 0.1η(x, black).</p><p>In Equation (<ref type="formula" target="#formula_0">1</ref>) only those categories i with π(i) &gt; 0 contribute to w(x). 
This means that categories which are not pertinent (i.e., π(i) = 0) are never considered for x, even when η(x, i) is very large.</p><p>To summarize, with the interpretation of η(x, i)π(i) as described above, the attentional weight w(x) defined by Equation (1) is the cumulative pertinence of x to a task T, obtained from the strength of the sensory evidence given by x to all categories, in proportion to their pertinence to the task T.</p></div>
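As an illustration (not part of the original model's specification), the computation of the attentional weight in Equation (1) on the flag example can be sketched in Python; the η values below are hypothetical sensory-evidence outputs, not values from the paper:

```python
# Sketch of Equation (1): w(x) = sum over i of eta(x, i) * pi(i),
# using the pertinence values of the flag example (Table 1).

pertinence = {"red": 0.8, "yellow": 0.3, "black": 0.1, "green": 0.2}

def attentional_weight(eta_x, pertinence):
    """w(x): pertinence-weighted sum of sensory evidence over categories.
    A category with pi(i) = 0 contributes nothing, however large eta(x, i) is."""
    return sum(eta_x.get(i, 0.0) * pi for i, pi in pertinence.items())

# Hypothetical sensory evidence for one image region x:
eta_x = {"red": 0.9, "yellow": 0.1, "black": 0.0, "green": 0.05}
w = attentional_weight(eta_x, pertinence)

# Pertinence of x to the task tied to the most pertinent category (Section 2.1):
pi_x_T_red = eta_x["red"] * pertinence["red"]
```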
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Hazard Function</head><p>In <ref type="bibr" target="#b3">(Bundesen 1990</ref>) the notion of a hazard function ν(x, i) is introduced as ν(x, i) = Prob(E(x, i)), that is, the probability that item x is in category i (e.g., image region x is red). It is assumed (see the 2nd assumption in <ref type="bibr" target="#b3">(Bundesen 1990</ref>)) that ν is computed as:</p><formula xml:id="formula_5">ν(x, i) = η(x, i)β(i)w(x)<label>(2)</label></formula><p>where η(x, i) and w(x) are as described above<ref type="foot" target="#foot_0">1</ref>, and β(i) is introduced to indicate a bias for category i. Since ν is interpreted as a probability, ν(x, i) ∈ [0, 1], which is ensured when η(x, i), β(i), w(x) ∈ [0, 1], without additional constraints on these values. Moreover, when R is an exhaustive set of exclusive (non-overlapping) categories, then ν should be normalized so that Σ_{i∈R} ν(x, i) = 1, in order to really satisfy its interpretation from <ref type="bibr" target="#b3">(Bundesen 1990</ref>) as a probability. More recently, in (Bundesen, Vangkilde, and Petersen 2014), β(i) is decomposed as</p><formula xml:id="formula_6">β(i) = Ap(i)u(i)<label>(3)</label></formula><p>where A ∈ [0, 1] is the level of alertness, and p(i) and u(i) are, respectively, the prior probability and the utility of category i. One can imagine that A also varies with the category, in which case A in Equation (<ref type="formula" target="#formula_6">3</ref>) is replaced by an A_i. This is justified by the fact that one may be more alert to one category than to others. In an image processing system, A, or A_i, could be tied to the performance of the image processing operators used. The components p(i), u(i) of β(i), and hence β(i) itself, must also be tied to a (higher level) task T. While p(i) may be obtained from past data and experiments on the task T, u(i) seems to be purely subjective, and to a large extent its role seems to overlap with that of π(i). 
Plugging w(x) and β(i) into (2) results in</p><formula xml:id="formula_7">ν(x, i) = Aη(x, i)p(i)u(i) Σ_{j∈R} η(x, j)π(j) = Ap(i)u(i)[η(x, i)²π(i) + η(x, i) Σ_{j≠i} η(x, j)π(j)]<label>(4)</label></formula><p>which suggests that the most important role in computing ν(x, i) is played by the sensory evidence. In particular, ν's largest value is obtained when</p><formula xml:id="formula_8">A = p(i) = u(i) = 1</formula><p>(i.e. under maximum alertness, maximum prior probability, and maximum utility), and in that case ν(x, i) is a function only of the sensory evidence. Stated differently, this means that A, p(i) and u(i) can only decrease the value of ν(x, i). However, they may provide a mechanism to account for different types of subjective information, and for ranking the values of ν(x, i) when they enter its definition as shown in Equations (<ref type="formula" target="#formula_5">2</ref>)-(<ref type="formula" target="#formula_7">4</ref>). The justification in (Bundesen, Vangkilde, and Petersen 2014) of Equation (<ref type="formula" target="#formula_6">3</ref>) is based on the fact that when any one of A, p(i), or u(i) is null, then β(i) = 0. However, the same result holds when these quantities enter the definition of β not through a product, but through other operations, such as the min or, more generally, t-norms.</p><p>The fact that the value of ν(x, i) decreases when Ap(i)u(i) &lt; 1 (i.e. at least one of these three values is less than 1, u(i) for instance) can be interpreted as follows: x will less probably be categorized in i if, for instance, the utility for i is low, which means that we do not really care for this category. This also goes with the interpretation as a rate of encoding information in memory, according to (Bundesen 1990), even without considering time information.</p><p>The two mechanisms for visual attention proposed in <ref type="bibr" target="#b3">(Bundesen 1990</ref>), filtering and pigeonholing, are described next.</p></div>
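A minimal numerical sketch of Equations (2) and (3), with hypothetical values (none come from the paper); it illustrates the observation that A, p(i), and u(i) can only decrease ν(x, i):

```python
# Sketch of the hazard function: nu(x, i) = eta(x, i) * beta(i) * w(x),
# with the decomposed bias beta(i) = A * p(i) * u(i) of Equation (3).

def beta(A, p_i, u_i):
    """Bias for category i: alertness times prior probability times utility."""
    return A * p_i * u_i

def nu(eta_xi, beta_i, w_x):
    """Equation (2): hazard value for item x and category i."""
    return eta_xi * beta_i * w_x

# Under maximum alertness, prior, and utility, beta(i) = 1 and nu depends
# only on the sensory evidence and the attentional weight:
b_max = beta(1.0, 1.0, 1.0)
v_max = nu(0.9, b_max, 0.8)

# Any A, p(i), u(i) below 1 can only decrease nu(x, i):
v_low = nu(0.9, beta(0.7, 0.5, 0.4), 0.8)
```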
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Filtering</head><p>Filtering <ref type="bibr" target="#b3">(Bundesen 1990</ref>) refers to the mechanism of selecting an item x ∈ S (given a higher level task) for a target category i. This mechanism seeks to (F1) increase ν(x, i) for some category i, while (F2) not changing the conditional probability of E(x, i) given that x is categorized.</p><p>Filtering can be achieved by increasing w(x) as follows. For a category j ∈ R assume π′(j) = aπ(j), where a &gt; 1. Then w(x) of Equation (1) becomes w′(x) = Σ_{i∈R, i≠j} η(x, i)π(i) + η(x, j)π′(j) = Σ_{i∈R, i≠j} η(x, i)π(i) + aη(x, j)π(j) &gt; w(x). Therefore, ν(x, i) becomes ν′(x, i) = η(x, i)β(i)w′(x) &gt; ν(x, i), which satisfies condition (F1) above. Computing now P(x is i | x is categorized) yields:</p><formula xml:id="formula_9">P(x is i | x is categorized) = ν(x, i) / Σ_{k∈R} ν(x, k) = η(x, i)β(i)w(x) / (w(x) Σ_{k∈R} η(x, k)β(k)) = η(x, i)β(i) / Σ_{k∈R} η(x, k)β(k)<label>(5)</label></formula><p>which does not depend on w, hence satisfies condition (F2). In Equation (<ref type="formula">5</ref>) the numerator is ν(x, i) since {x is i} ⊂ {x is categorized} and therefore P(x is i, x is categorized) = P(x is i), while the denominator uses an assumption of non-overlapping categories to write P(x is categorized) as Σ_{k∈R} ν(x, k). Dropping the constraint of non-overlapping categories is discussed later in this study.</p></div>
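The filtering argument can be checked numerically. The following sketch, with hypothetical η, β, and π values over two categories, scales one pertinence by a = 2 and verifies both conditions:

```python
# Numerical check of filtering (F1, F2): scaling one pertinence pi(j) by
# a > 1 increases w(x), hence nu(x, i), but leaves
# P(x is i | x is categorized) unchanged since w(x) cancels in the ratio.

def w(eta_x, pi):
    """Equation (1): attentional weight."""
    return sum(eta_x[i] * pi[i] for i in pi)

def nu(eta_x, beta, pi, i):
    """Equation (2): hazard value."""
    return eta_x[i] * beta[i] * w(eta_x, pi)

def conditional(eta_x, beta, pi, i):
    """P(x is i | x is categorized), Equation (5)."""
    total = sum(nu(eta_x, beta, pi, k) for k in pi)
    return nu(eta_x, beta, pi, i) / total

eta_x = {"red": 0.9, "yellow": 0.2}   # hypothetical sensory evidence
beta = {"red": 0.6, "yellow": 0.5}    # hypothetical biases
pi = {"red": 0.8, "yellow": 0.3}      # hypothetical pertinences
pi_scaled = {"red": 0.8, "yellow": 0.3 * 2.0}   # a = 2 applied to "yellow"

f1 = nu(eta_x, beta, pi_scaled, "red") > nu(eta_x, beta, pi, "red")
f2 = abs(conditional(eta_x, beta, pi_scaled, "red")
         - conditional(eta_x, beta, pi, "red")) < 1e-12
```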
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Pigeonholing</head><p>For a fixed item x ∈ S, pigeonholing <ref type="bibr" target="#b3">(Bundesen 1990</ref>) refers to the mechanism of selecting a category i ∈ R (given a higher level task), across a set of items S. It seeks to:</p><p>(P1) increase Σ_{x∈S} ν(x, i) for a category i pertinent to the task, such that (P2) for all j ∈ R, j ≠ i, Σ_{x∈S} ν(x, j) does not change.</p><p>Pigeonholing can be done by increasing β(i) for some i ∈ R as follows: For category i ∈ R, let</p><formula xml:id="formula_10">β′(i) = aβ(i), with a &gt; 1. Then ν′(x, i) = η(x, i)β′(i)w(x) = aη(x, i)β(i)w(x) &gt; η(x, i)β(i)w(x) = ν(x, i).</formula><p>Summing over x ∈ S yields</p><formula xml:id="formula_11">P′(i is selected) = Σ_{x∈S} η(x, i)β′(i)w(x) &gt; P(i is selected),<label>(6)</label></formula><p>which achieves (P1). At the same time, it is clear that for any other category j ≠ i, P(j is selected) does not change, and hence (P2) is satisfied too.</p><p>Equation (<ref type="formula" target="#formula_11">6</ref>) uses the assumption that the items x are non-overlapping, for example that they form a partition of the image. However, this partition need not be crisp, i.e. it may allow overlapping x's, for example when these are stated in qualitative terms. In such cases, Equation (<ref type="formula" target="#formula_11">6</ref>) does not hold. Dropping the constraint of non-overlapping items, discussed later, leads to a different interpretation of ν(x, i).</p></div>
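The pigeonholing argument can be checked numerically in the same way; all values below are hypothetical:

```python
# Numerical check of pigeonholing (P1, P2): scaling beta(i) for one
# category by a > 1 increases sum over x of nu(x, i), and leaves the sums
# for every other category j != i unchanged.

def selection_strengths(eta, beta, weights):
    """For each category i: sum over items x of eta(x, i) * beta(i) * w(x)."""
    return {i: sum(e * beta[i] * w for e, w in zip(eta[i], weights))
            for i in beta}

eta = {"red": [0.9, 0.2, 0.4],        # eta(x, i) for three items x
       "yellow": [0.1, 0.8, 0.3]}
weights = [0.7, 0.5, 0.9]             # w(x) for the same three items
beta = {"red": 0.6, "yellow": 0.5}
beta_scaled = {"red": 2.0 * 0.6, "yellow": 0.5}   # a = 2 applied to "red" only

before = selection_strengths(eta, beta, weights)
after = selection_strengths(eta, beta_scaled, weights)

p1 = after["red"] > before["red"]          # (P1): strength of "red" increases
p2 = after["yellow"] == before["yellow"]   # (P2): "yellow" is untouched
```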
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Fuzzy Mechanisms for Visual Attention</head><p>We consider in this section the situations where the values of the attentional weight and/or category pertinence are not exact. In such situations these values may be represented as fuzzy sets, and therefore the computation of the categorization of an item must resort to calculus with fuzzy sets. First, let us see why indeed such situations may arise.</p><p>Recall that in its original definition, for a given input x and category i, the strength of sensory evidence for E(x, i) is η(x, i) ∈ [0, 1]. Assuming that η(x, i) is the output of an operator/test for category i on item x, this output may be inexact because of the inexact nature of the category i. For example, if the category i = red of the attribute color, then for a given input pixel value x this category holds "more or less" and it may not be useful to commit to an exact 0/1 value.</p><p>Likewise, in its original definition, the pertinence π(i) of a category conveys its importance. Obviously, given a collection of visual categories and a task, the categories may be distinguished by their pertinence values. Moreover, several categories may have the same maximum importance for the given task. As an example, consider the pertinence of color categories for the detection of an object which is known to have one of two possible color categories, white or yellow, from the collection of all possible color categories. In this case, it is useful to be able to encode</p><formula xml:id="formula_12">π(white) = π(yellow) = 1,</formula><p>which is possible when π is considered as a possibility distribution on the color categories, regardless of the number of color categories allowed. By contrast, using a probability based approach, the cardinality of R, the collection of categories, restricts the values assigned to these equally possible categories to at most 0.5. 
That is, π(white) = π(yellow) ≤ 0.5 with equality when R = {yellow, white}.</p></div>
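The contrast can be made concrete with the following sketch; the category lists and values are illustrative:

```python
# A possibility distribution may assign full pertinence 1 to both "white"
# and "yellow", while a probability distribution over the same categories
# must sum to 1, capping each of two equally probable candidates at 0.5.

colors = ["white", "yellow", "red", "green"]

possibility = {c: 1.0 if c in ("white", "yellow") else 0.0 for c in colors}

# Equally probable candidates under a probability model:
probability = {c: 0.5 if c in ("white", "yellow") else 0.0 for c in colors}
```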
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">A new definition for w(x)</head><p>The departure point for the new definition of w(x) is the interpretation of a special case of Equation (<ref type="formula" target="#formula_0">1</ref>). Let R_a = {i ∈ R | π(i) = a} and consider the special case R = R_0 ∪ R_1, that is, all categories in R are either "fully" pertinent, π(i) = 1 (i ∈ R_1), or not pertinent, π(i) = 0 (i ∈ R_0). Then (1) becomes</p><formula xml:id="formula_13">w(x) = Σ_{i∈R_1} η(x, i)</formula><p>Next let η_max = max_{i∈R_1} η(x, i), and recall that η(x, i) ≤ 1. Then</p><formula xml:id="formula_14">w(x) ≤ Σ_{i:π(i)=1} η_max = η_max Σ_{i∈R_1} 1 = η_max |R_1| ≤ |R_1|,</formula><p>where |R_1| denotes the cardinality of the set R_1. That is, w(x) is bounded by the number of categories i with pertinence π(i) = 1. If η(x, i) = 1 for all i ∈ R_1, then w(x) is exactly the number of such categories. This meaning of w(x) is very natural and appealing. Indeed, one would expect the item x to count to the extent that it supports more categories. To generalize this notion, define, for fixed x ∈ S and fixed task T, µ_(x,T)(i) = η(x, i)π_T(i), the degree to which category i, pertinent to task T, is supported by the (data) item x, as shown by the strength of sensory evidence η(x, i). Therefore, µ_(x,T) : R → [0, 1] is the membership function of a fuzzy set on the set of categories.<ref type="foot" target="#foot_1">2</ref> The weight of item x is now defined as the cardinality of this fuzzy set. That is,</p><formula xml:id="formula_15">w(x) = Card {(i, µ_x(i)) | i ∈ R}<label>(7)</label></formula><p>Several formulas for the cardinality of a fuzzy set have been put forward. Here, for illustration purposes, the definition from <ref type="bibr" target="#b10">(Ralescu 1986</ref>) is used to obtain</p><formula xml:id="formula_16">Card({µ_x(i) | i ∈ R})(k) = µ_x,(k) ∧ (1 − µ_x,(k+1))<label>(8)</label></formula><p>where µ_x,(k) denotes the kth largest value of µ_x(•), and µ_x,(|R|+1) = 0. 
Thus, the cardinality defined in Equation (<ref type="formula">7</ref>) is a fuzzy set on {0, ..., |R|}. For an exact value of w(x), the 0.5-level set of w(x) (which is an interval), or its classic cardinality, can be used <ref type="bibr" target="#b11">(Ralescu 1995)</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">A new definition for β(i)</head><p>Following the discussion in Section 2.2, define</p><formula xml:id="formula_17">β(i) = min{A, p(i), u(i)}<label>(9)</label></formula><p>As in the case of β defined in (3), β(i) = 0 whenever A = 0, or p(i) = 0, or u(i) = 0, and the discussion of <ref type="bibr" target="#b3">(Bundesen 1990</ref>) holds: that is, category i biases the selection to the extent that the system is alert, and category i is possible and useful. Alternatively, (9) means that the bias for the selection of i cannot be greater than the system alertness, the possibility of i, or its utility. Furthermore, replacing the product by the min also eliminates the possibility of values for β smaller than each one of A, p(i), and u(i), which is the well-known drowning effect of multiplying positive values smaller than 1. More importantly, it should be mentioned that the min can handle ordinal or qualitative values, without needing to specify precise numbers. Specifying such precise values might be difficult when subjective assessments are made. By contrast, in the case of such assessments, ordinal or qualitative values are usually easily produced.</p><p>As already mentioned, in the fuzzy set framework, the product and the min are but two particular cases of a t-norm (conjunction operator). A, p(i), and u(i) are interpreted, respectively, as the degree of alertness, the possibility (rather than probability) of i being selected, and the utility of category i, and the bias for i is defined as the conjunction of these. This interpretation makes (9) meaningful beyond a mere computational artifice. 
Another choice for defining β is to select a more general aggregation operator H : [0, 1] × [0, 1] × [0, 1] → [0, 1], which would allow more than one of A, p(i), u(i) to contribute towards β.</p></div>
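The new definitions of w(x) and β(i) can be sketched as follows. The `fuzzy_cardinality` function implements Equation (8) after (Ralescu 1986), with the padding conventions µ_(0) = 1 and µ_(|R|+1) = 0 being our reading of the definition; the membership values are hypothetical:

```python
def fuzzy_cardinality(memberships):
    """Fuzzy cardinality of Equation (8): with memberships sorted so that
    mu_(1) >= ... >= mu_(n), the degree that the cardinality equals k is
    min(mu_(k), 1 - mu_(k+1)), giving a fuzzy set on {0, ..., n}."""
    mu = sorted(memberships, reverse=True)
    mu = [1.0] + mu + [0.0]          # conventions: mu_(0) = 1, mu_(n+1) = 0
    return [min(mu[k], 1.0 - mu[k + 1]) for k in range(len(memberships) + 1)]

# mu_x(i) = eta(x, i) * pi(i) for four categories (hypothetical values):
mu_x = [0.9, 0.6, 0.2, 0.0]
card = fuzzy_cardinality(mu_x)       # the new w(x) of Equation (7)

def beta_min(A, p_i, u_i):
    """The min-based bias of Equation (9): null whenever any component is
    null, and never greater than alertness, possibility, or utility."""
    return min(A, p_i, u_i)
```

For these memberships the most possible cardinality is 2, matching the intuition that x strongly supports two categories.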
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">A new definition for ν(x, i)</head><p>With the new definitions of w(x) and β(i), the meaning of ν(x, i) also changes from a probability to a possibility, more precisely Possibility(x is i):</p><formula xml:id="formula_18">Possibility(x is i) = H(η(x, i), β(i), w(x))<label>(10)</label></formula><p>where H is again an aggregation operator, and hence the definition of ν(x, i) from <ref type="bibr" target="#b3">(Bundesen 1990</ref>) is a particular case, when H is the product.</p><p>For defining H, one may rely on the huge literature on information fusion, for which fuzzy set theory provides a number of useful operators (see e.g. <ref type="bibr" target="#b5">(Dubois and Prade 1985;</ref><ref type="bibr" target="#b14">Yager 1991;</ref><ref type="bibr" target="#b0">Bloch 1996)</ref> for reviews on fuzzy fusion operators). The large choice offered by these operators allows modeling different combination behaviors (conjunctive, disjunctive, compromise, etc.), with different degrees (e.g. the min is a less severe conjunction than the product). Operators can also behave differently depending on whether the values to be combined are small, large, of the same order of magnitude, or have different priorities. The operator H could also be set differently for the three values. For instance, η and w, which depend on x, could be combined using an operator H_1, and the result combined with β, which depends on i only, using another operator H_2.</p></div>
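Equation (10) with two choices of H, the product (which recovers the original ν of Equation (2)) and the min, can be sketched as follows; the numeric inputs are illustrative:

```python
import math

def possibility(eta_xi, beta_i, w_x, H):
    """Equation (10): nu as a possibility, for an aggregation operator H
    mapping a list of values in [0, 1] to a value in [0, 1]."""
    return H([eta_xi, beta_i, w_x])

v_prod = possibility(0.9, 0.6, 0.8, math.prod)  # product: the original form
v_min = possibility(0.9, 0.6, 0.8, min)         # min: a less severe conjunction
```

On [0, 1] the min always dominates the product, which is one concrete sense in which it is the less severe of the two conjunctions.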
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusions and Future Work</head><p>This paper discussed an attentional model developed in the fields of psychology and cognitive science and set in a probabilistic framework. The basic concepts of this model were discussed and an alternative, fuzzy set based approach was suggested. In the fuzzy set framework, modeling would be easier and more natural (for instance, replacing numbers by ordinal or qualitative values), and it would allow for more flexible ways of combining the different terms. This discussion paves the way for a new attentional model, whose complete development is left for future work.</p></div>			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Note that the expression of <ref type="bibr" target="#b3">(Bundesen 1990</ref>) involves a normalized version of w, i.e. w(x)/Σ_{x∈S} w(x). Here we implicitly assume that w is normalized, in order to simplify the equations.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">In the following, assuming only one task, T , for ease of notation, the subscript T will be dropped, to write µx(i).</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Acknowledgments</head><p>Anca Ralescu's contribution was partially supported by a visit to Telecom ParisTech.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Information Combination Operators for Data Fusion: A Comparative Review with Classification</title>
		<author>
			<persName><forename type="first">I</forename><surname>Bloch</surname></persName>
		</author>
		<author>
			<persName><surname>Borji</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="185" to="207" />
			<date type="published" when="1996">1996. 2013</date>
		</imprint>
	</monogr>
	<note>IEEE Transactions on Systems, Man, and Cybernetics</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A neural theory of visual attention: bridging cognition and neurophysiology</title>
		<author>
			<persName><forename type="first">C</forename><surname>Bundesen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Habekost</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kyllingsbaek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological review</title>
		<imprint>
			<biblScope unit="volume">112</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">291</biblScope>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Bundesen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vangkilde</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Petersen</surname></persName>
		</author>
		<title level="m">Recent developments in a computational theory of visual attention</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note type="report_type">Vision research</note>
	<note>tva</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A theory of visual attention</title>
		<author>
			<persName><forename type="first">C</forename><surname>Bundesen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Psychological review</title>
		<imprint>
			<biblScope unit="volume">97</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">523</biblScope>
			<date type="published" when="1990">1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Computational gestalts and perception thresholds</title>
		<author>
			<persName><forename type="first">A</forename><surname>Desolneux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Moisan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Morel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Physiology-Paris</title>
		<imprint>
			<biblScope unit="volume">97</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="311" to="324" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A Review of Fuzzy Set Aggregation Connectives</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dubois</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="85" to="121" />
			<date type="published" when="1985">1985</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Sequential model-based segmentation and recognition of image structures driven by visual features and spatial relations</title>
		<author>
			<persName><forename type="first">G</forename><surname>Fouquier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Atif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bloch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Vision and Image Understanding</title>
		<imprint>
			<biblScope unit="volume">116</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="146" to="165" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The discriminant center-surround hypothesis for bottom-up saliency</title>
		<author>
			<persName><forename type="first">D</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mahadevan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Vasconcelos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="497" to="504" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Feature confirmation in object perception: Feature integration theory 26 years on from the Treisman Bartlett lecture</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">W</forename><surname>Humphreys</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Quarterly Journal of Experimental Psychology</title>
		<imprint>
			<biblScope unit="page" from="1" to="49" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note>just-accepted</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Attention and Effort</title>
		<author>
			<persName><forename type="first">D</forename><surname>Kahneman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1973">1973</date>
			<publisher>Prentice-Hall</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A note on rule representation in expert systems</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Ralescu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="193" to="203" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Cardinality, quantifiers, and the aggregation of fuzzy criteria</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ralescu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Fuzzy Sets and Systems</title>
		<imprint>
			<biblScope unit="volume">69</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="355" to="365" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A feature-integration theory of attention</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Treisman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gelade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cognitive Psychology</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="97" to="136" />
			<date type="published" when="1980">1980</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Features and objects: The fourteenth Bartlett memorial lecture</title>
		<author>
			<persName><forename type="first">A</forename><surname>Treisman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Quarterly Journal of Experimental Psychology</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="201" to="237" />
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The psychological reality of levels of processing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Treisman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Levels of processing in human memory</title>
		<imprint>
			<biblScope unit="page" from="301" to="330" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14a">
	<analytic>
		<title level="a" type="main">Connectives and quantifiers in fuzzy sets</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Yager</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Fuzzy Sets and Systems</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="39" to="75" />
			<date type="published" when="1991">1991</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Hierarchical saliency detection</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="1155" to="1162" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Saliency detection via graph-based manifold ranking</title>
		<author>
			<persName><forename type="first">C</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ruan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-H</forename><surname>Yang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="3166" to="3173" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
