=Paper= {{Paper |id=Vol-1424/Paper7 |storemode=property |title=Preliminary Study Towards a Fuzzy Model for Visual Attention |pdfUrl=https://ceur-ws.org/Vol-1424/Paper7.pdf |volume=Vol-1424 |dblpUrl=https://dblp.org/rec/conf/ijcai/RalescuBC15 }} ==Preliminary Study Towards a Fuzzy Model for Visual Attention== https://ceur-ws.org/Vol-1424/Paper7.pdf
                Preliminary Study Towards a Fuzzy Model for Visual Attention

                                  Anca Ralescu1 , Isabelle Bloch2 , and Roberto Cesar3
        1. EECS Department, University of Cincinnati, ML 0030, Cincinnati, OH 45221, USA - anca.ralescu@uc.edu
       2. Institut Mines Telecom, Telecom Paristech, CNRS LTCI, Paris, France - isabelle.bloch@telecom-paristech.fr
                            3. University of Sao Paulo, IME, Sao Paulo, Brazil - cesar@ime.usp.br



Abstract

Attention, in particular visual attention, has been a subject of study in various disciplines, including cognitive science, experimental psychology, and computer vision. In cognitive science and experimental psychology the objective is to develop theories that can explain the attention phenomenon of cognition. In computer vision, the objective is to inform image understanding systems by hypotheses on human visual attention. There is, however, very little influence of studies across these two disciplines. In a departure from this state of affairs, this study seeks to develop an algorithmic approach to visual attention as part of an image understanding system, by starting with a theory of visual attention put forward in experimental psychology. In the process, it will become useful to revise some of the concepts of this theory, in particular by adopting fuzzy set based representations and the necessary calculus for them.

1    Introduction

As a subject of human cognition, attention has attracted great interest from the fields of cognitive science and experimental psychology.

Visual attention is a wide field, and the literature addresses many of its aspects. Some works related to the present paper are briefly reviewed here, without aiming at exhaustiveness. One approach relies on Gestalt theory, and Gestalt and computer vision models are compared by (Desolneux, Moisan, and Morel 2003). Two sets of experiments for Gestalt detection methods are carried out and compared to computationally predicted results. Object size and noise are the two parameters taken into account in these experiments. The authors indicate that the qualitative thresholds predicted by the proposed computational approach of gestalt detection fit human perception.

Another approach is purely computational and based on image information. An important review on visual attention modeling is presented by (Borji and Itti 2013). The important aspect of saliency-based attention is specifically addressed in this review. Nearly 65 models are reviewed and classified in a didactical taxonomy that helps clarify the field. Visual saliency refers to a bottom-up phenomenon where some scene regions are detected as more prominent than others due to some visual features. There are different biological and computational approaches to model such phenomena. For instance, the center-surround hypothesis (a common issue for the analysis of receptive fields in the retina) is a classical model for bottom-up saliency (Gao, Mahadevan, and Vasconcelos 2008). In such settings, Gao and co-authors (Gao, Mahadevan, and Vasconcelos 2008) incorporate discriminant features and a decision-theoretic model for saliency characterization. Saliency detection is important in many different imaging and vision applications (Yan et al. 2013; Yang et al. 2013). For instance, in medical imaging, saliency maps are useful to guide model-based image segmentation (Fouquier, Atif, and Bloch 2012), thus merging top-down and bottom-up approaches.

The mechanism of attention has been studied intensively in the field of psychology and cognitive science (Kahneman 1973), (Treisman and Gelade 1980), (Treisman 1988), (Treisman 2014), (Humphreys 2014), (Bundesen, Habekost, and Kyllingsbæk 2005), (Bundesen, Vangkilde, and Petersen 2014). In this paper we focus on the theory of visual attention introduced in (Bundesen 1990), where visual recognition and attentional selection are considered as the task of perceptual categorization, basically deciding to which category an object or element of the visual field belongs.

Following the notation of (Bundesen 1990), throughout this paper, x is an input item, e.g. an image or image region, or more generally an item to be categorized or recognized. The collection of all items x is denoted by S. A category is denoted by i and the collection of all categories is denoted by R. A category can stand for an ontological category (e.g., an object, or a scene), or for subsets in the range of a particular attribute (e.g., red for the attribute color). Regardless of the situation, the conceptual treatment of categories and/or items is the same. E(x, i) denotes the event/statement "x is in category i". When viewed as an event, one can talk about its probability; when viewed as a statement, one can talk about its truth or its possibility.

From this point on this paper is organized as follows: Section 2 contains a brief review of TVA concepts and mechanisms, namely filtering and pigeonholing. Section 3 presents the motivation for the introduction of fuzzy sets and the fuzzy mechanisms of filtering and pigeonholing. Conclusions and future research are in Section 4.
2     TVA concepts and mechanisms of attention

In this section, we review and comment on the main concepts and modeling steps of the Theory of Visual Attention (TVA) by (Bundesen 1990).

2.1    Attentional Weight

One of the main concepts introduced in TVA is that of attentional weight, defined as follows:

$$ w(x) = \sum_{i \in R} \eta(x, i)\,\pi(i) \tag{1} $$

What are the possible interpretations of the quantities in Equation (1)? If η(x, i) is interpreted as the salience of x for category i, then w(x) could be interpreted as the salience of x across the family of categories R, averaged with respect to category pertinence. From the point of view of computer vision, η(x, i) is simply the output of an operator designed to provide information for category i.

Note that pertinence of a category is (or must be) considered with respect to a task, which could be a categorization at a higher semantic/ontological level. Adopting this point of view, the product η(x, i)π(i) can then be interpreted as the pertinence of item x to the task with respect to which category i has pertinence π(i). More precisely, one can define

$$ \pi(x, T_i) = \eta(x, i)\,\pi(i) $$

as the pertinence of x to T_i, where T_i is the task to which category i has pertinence value π(i).

For example, suppose that i is the color category "red" of the attribute color. Furthermore, suppose that the color category "red" has pertinence π(red) to the task of visually identifying an object such as, for instance, the "flag of some country". Let now x be a region in an image, and η(x, red) the output of evaluating it with respect to the color "red". Then π(x, T_red) = η(x, red)π(red) is the pertinence of x to the task T_red.

Taking max/min with respect to x yields

$$ x_{\max,\mathrm{red}} = \arg\max_{x \in S} \pi(x, T_{\mathrm{red}}), $$

the region in the input which is most pertinent to T_red, and

$$ x_{\min,\mathrm{red}} = \arg\min_{x \in S} \pi(x, T_{\mathrm{red}}), $$

the region in the input which is least pertinent to T_red.

Similarly, taking max/min over categories yields

$$ i_{\max} = \arg\max_{i \in R} \pi(i); \qquad i_{\min} = \arg\min_{i \in R,\ \pi(i) > 0} \pi(i), $$

the most/least pertinent categories respectively. The condition π(i) > 0 ensures that categories which are not pertinent at all, i.e. with π(i) = 0, are not taken into account, so the trivial case π(i_min) = 0 is never obtained. Then, for fixed x, η(x, i_max) and η(x, i_min) are the strengths of evidence for x to be in the highest/lowest pertinence category, and

$$ \pi(x, T_{\max}) = \eta(x, i_{\max})\,\pi(i_{\max}), \qquad \pi(x, T_{\min}) = \eta(x, i_{\min})\,\pi(i_{\min}) $$

are the importance of x to the task corresponding to the category of highest/lowest pertinence value. Versions of the following "flag example" will be used in this paper to illustrate various points.

Example 1 Let T stand for the task to determine if an object identified in an image corresponds to a "flag of some country". The decision is to be based on color information only. Assume several color categories and their respective pertinences as shown in Table 1.

Table 1: Color categories and their respective pertinence values to the task "Identify flag of a country".

    Color category i         Category pertinence π(i)
    red                      0.8
    yellow                   0.3
    black                    0.1
    green                    0.2
    (max π(i), i_max)        (0.8, red)
    (min π(i), i_min)        (0.1, black)

In this example π(x, T_red) = 0.8 η(x, red) and π(x, T_black) = 0.1 η(x, black).

In Equation (1) only those categories i with π(i) > 0 contribute to w(x). This means that categories which are not pertinent (i.e., π(i) = 0) are never considered for x, even when η(x, i) is very large.

To summarize, with the interpretation of η(x, i)π(i) as described above, the attentional weight w(x) defined by Equation (1) is the cumulative pertinence of x to a task T, obtained from the strength of the sensory evidence given by x to all categories, in proportion to their pertinence to the task T.

2.2    Hazard Function

In (Bundesen 1990) the notion of a hazard function ν(x, i) is introduced as ν(x, i) = Prob(E(x, i)), that is, the probability that item x is in category i (e.g., image region x is red). It is assumed (see 2nd assumption in (Bundesen 1990)) that ν is computed as:

$$ \nu(x, i) = \eta(x, i)\,\beta(i)\,w(x) \tag{2} $$

where η(x, i) and w(x) are as described above¹, and β(i) is introduced to indicate a bias for category i. Since ν is interpreted as a probability, ν(x, i) ∈ [0, 1], which is ensured when η(x, i), β(i), w(x) ∈ [0, 1], without additional constraints on these values. Moreover, when R is an exhaustive set of exclusive (non-overlapping) categories, ν should be normalized so that Σ_{i∈R} ν(x, i) = 1, in order to really satisfy its interpretation from (Bundesen 1990) as a probability. More recently, in (Bundesen, Vangkilde, and Petersen 2014) β(i) is decomposed as

$$ \beta(i) = A\,p(i)\,u(i) \tag{3} $$

where A ∈ [0, 1] is the level of alertness, and p(i) and u(i) are, respectively, the prior probability and utility of category i. One can imagine that A also varies with the category.

¹ Note that the expression of (Bundesen 1990) involves a normalized version of w, i.e. w(x) / Σ_{x∈S} w(x). Here we implicitly assume that w is normalized, in order to simplify equations.
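As a minimal numerical sketch of Equations (1) to (3), the following Python code uses the pertinence values of Table 1. The two region names, the sensory evidence values in `ETA`, and all function names are assumptions introduced for illustration only; they do not come from (Bundesen 1990). The weights are normalized over the items, following the footnote on the normalization of w, and at least one item is assumed to have a nonzero weight.

```python
# Pertinence values pi(i) taken from Table 1 (task: "identify flag of a country").
PERTINENCE = {"red": 0.8, "yellow": 0.3, "black": 0.1, "green": 0.2}

# Assumed sensory evidence eta(x, i) for two hypothetical image regions.
ETA = {
    "region_a": {"red": 0.9, "yellow": 0.2, "black": 0.05, "green": 0.1},
    "region_b": {"red": 0.1, "yellow": 0.7, "black": 0.3, "green": 0.4},
}


def task_pertinence(eta_x, pertinence):
    """pi(x, T_i) = eta(x, i) * pi(i) for every category i."""
    return {i: eta_x[i] * pertinence[i] for i in pertinence}


def attentional_weight(eta_x, pertinence):
    """Equation (1): w(x) = sum over i of eta(x, i) * pi(i)."""
    return sum(task_pertinence(eta_x, pertinence).values())


def normalized_weights(eta, pertinence):
    """Divide each w(x) by the sum of w over all items in S."""
    raw = {x: attentional_weight(eta_x, pertinence) for x, eta_x in eta.items()}
    total = sum(raw.values())  # assumed to be nonzero
    return {x: w / total for x, w in raw.items()}


def extreme_categories(pertinence):
    """Return (i_max, i_min); i_min skips categories with pi(i) = 0."""
    positive = {i: p for i, p in pertinence.items() if p > 0}
    return max(positive, key=positive.get), min(positive, key=positive.get)


def bias(alertness, prior, utility):
    """Equation (3): beta(i) = A * p(i) * u(i)."""
    return alertness * prior * utility


def hazard(eta_xi, beta_i, w_x):
    """Equation (2): nu(x, i) = eta(x, i) * beta(i) * w(x)."""
    return eta_xi * beta_i * w_x
```

For instance, `extreme_categories(PERTINENCE)` returns the pair of categories listed in the last two rows of Table 1, and `hazard(ETA["region_a"]["red"], bias(1.0, 0.5, 0.8), normalized_weights(ETA, PERTINENCE)["region_a"])` evaluates Equation (2) for the assumed region and assumed values of A, p(red), and u(red).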
If A varies with the category, A in Equation (3) is replaced by a category-specific A_i. This is justified by the fact that one may be more alert to one category than to others. In an image processing system, A or A_i could be tied to the performance of the image processing operators used. The components p(i), u(i) of β(i), and hence β(i), must also be tied to a (higher level) task T. While p(i) may be obtained from past data and experiments on the task T, u(i) seems to be purely subjective, and to a large extent its role seems to overlap with that of π(i). Plugging w(x) and β(i) in (2) results in

$$ \begin{aligned} \nu(x, i) &= A\,\eta(x, i)\,p(i)\,u(i) \sum_{j \in R} \eta(x, j)\,\pi(j) \\ &= A\,p(i)\,u(i) \Big[ \eta(x, i)^2\,\pi(i) + \eta(x, i) \sum_{j \neq i} \eta(x, j)\,\pi(j) \Big] \end{aligned} \tag{4} $$

which suggests that the most important role in computing ν(x, i) is played by the sensory evidence. In particular, ν's largest value is obtained when A = p(i) = u(i) = 1 (i.e. under maximum alertness, maximum prior probability, and maximum utility), and in that case ν(x, i) is a function only of the sensory evidence. Stated differently, this means that A, p(i) and u(i) can only decrease the value of ν(x, i). However, they may provide a mechanism to account for different types of subjective information, and for ranking the values of ν(x, i) when they enter its definition as shown in Equations (2) to (4). The justification in (Bundesen, Vangkilde, and Petersen 2014) of Equation (3) is based on the fact that when any one of A, p(i), or u(i) is null, then β(i) = 0. However, the same result holds when these quantities enter the definition of β not through a product, but through other operations, such as the min, or more generally, t-norms.

The fact that the value of ν(x, i) decreases when A p(i) u(i) ≠ 1 (i.e. at least one of these three values is less than 1, u(i) for instance) can be interpreted as follows: x is less likely to be categorized in i if, for instance, the utility for i is low, which means that we do not really care for this category. This also agrees with the interpretation as a rate of encoding information in the memory, according to (Bundesen 1990), even without considering time information.

The two mechanisms for visual attention proposed in (Bundesen 1990), filtering and pigeonholing, are described next.

2.3   Filtering

Filtering (Bundesen 1990) refers to the mechanism of selecting an item x ∈ S (given a higher level task), for a target category i. This mechanism seeks to

 (F1) increase ν(x, i) for some category i, while
 (F2) not changing the conditional probability of E(x, i) given that x is categorized.

Filtering can be achieved by increasing w(x) as follows: For category j ∈ R assume π′(j) = aπ(j), where a > 1. Then w(x) of Equation (1) becomes

$$ w'(x) = \sum_{i \in R,\ i \neq j} \eta(x, i)\,\pi(i) + \eta(x, j)\,\pi'(j) = \sum_{i \in R,\ i \neq j} \eta(x, i)\,\pi(i) + a\,\eta(x, j)\,\pi(j) > w(x), $$

where the inequality is strict when η(x, j)π(j) > 0. Therefore, ν(x, i) becomes ν′(x, i) = η(x, i)β(i)w′(x) > ν(x, i), which satisfies condition (F1) above. Computing now P(x is i | x is categorized) yields:

$$ P(x \text{ is } i \mid x \text{ is categorized}) = \frac{\nu(x, i)}{\sum_{k \in R} \nu(x, k)} = \frac{\eta(x, i)\,\beta(i)\,w(x)}{w(x) \sum_{k \in R} \eta(x, k)\,\beta(k)} = \frac{\eta(x, i)\,\beta(i)}{\sum_{k \in R} \eta(x, k)\,\beta(k)} \tag{5} $$

which does not depend on w, hence satisfies condition (F2). In Equation (5) the numerator is ν(x, i) since

$$ \{x \text{ is } i\} \subset \{x \text{ is categorized}\} $$

and therefore P(x is i, x is categorized) = P(x is i), while the denominator uses an assumption of non-overlapping categories to write P(x is categorized) as Σ_{k∈R} ν(x, k). Dropping the constraint of non-overlapping categories is discussed later in this study.

2.4   Pigeonholing

For fixed item x ∈ S, pigeonholing (Bundesen 1990) refers to the mechanism of selecting a category i ∈ R (given a higher level task), across a set of items S. It seeks to:

 (P1) increase Σ_{x∈S} ν(x, i) for category i pertinent to the task, such that
 (P2) for all j ∈ R, j ≠ i, Σ_{x∈S} ν(x, j) does not change.

Pigeonholing can be done by increasing β(i) for some i ∈ R as follows: For category i ∈ R, let β′(i) = aβ(i), with a > 1. Then

$$ \nu'(x, i) = \eta(x, i)\,\beta'(i)\,w(x) = a\,\eta(x, i)\,\beta(i)\,w(x) > \eta(x, i)\,\beta(i)\,w(x) = \nu(x, i). $$

Summing over x ∈ S yields

$$ P'(i \text{ is selected}) = \sum_{x \in S} \eta(x, i)\,\beta'(i)\,w(x) > P(i \text{ is selected}), \tag{6} $$

which achieves (P1). At the same time, it is clear that for any other category j ≠ i, P(j is selected) does not change, and hence (P2) is satisfied too.

Equation (6) uses the assumption that items x are non-overlapping, for example that they form a partition of the image. However, this partition need not be crisp, i.e. it may allow overlapping x's, for example when the items are stated in qualitative terms. In such cases, Equation (6) does not hold. Dropping the constraint of non-overlapping items, discussed later, leads to a different interpretation of ν(x, i).

  3    Fuzzy Mechanisms for Visual Attention

We consider in this section the situations when the values of the attentional weight and/or category pertinence are not exact. In such situations these values may be represented as fuzzy sets, and therefore the computation of the categorization of an item must resort to calculus with fuzzy sets. First, let us see why such situations may indeed arise.

Recall that in its original definition, for a given input x and category i, the strength of sensory evidence for E(x, i) satisfies η(x, i) ∈ [0, 1]. Assuming that η(x, i) is the output of an operator/test for category i on item x, this output may be inexact because of the inexact nature of the category i. For
example, if the category is i = red of the attribute color, then for a given input pixel value x this category holds "more or less" and it may not be useful to commit to an exact 0/1 value.

Likewise, in its original definition, the pertinence of a category, π(i), conveys its importance. Obviously, given a collection of visual categories and a task, the categories may be distinguished by their pertinence values. Moreover, several categories may have the same, maximum importance for the given task. As an example, consider the pertinence of color categories for the detection of an object which is known to have one of two possible color categories, white or yellow, from the collection of all possible color categories. In this case, it is useful to be able to encode

$$ \pi(\mathrm{white}) = \pi(\mathrm{yellow}) = 1, $$

which would be possible when π is considered as a possibility distribution on the color categories, regardless of the number of color categories allowed. By contrast, using a probability based approach, the cardinality of R, the collection of categories, restricts the values assigned to these equally possible categories to at most 0.5. That is, π(white) = π(yellow) ≤ 0.5, with equality when R = {yellow, white}.

3.1    A new definition for w(x)

The departure point for the new definition for w(x) is the interpretation of a special case of Equation (1). Let R_a = {i ∈ R | π(i) = a} and consider the special case R = R_0 ∪ R_1, that is, all categories in R are either "fully" pertinent, π(i) = 1 (i ∈ R_1), or not pertinent, π(i) = 0 (i ∈ R_0). Then (1) becomes

$$ w(x) = \sum_{i \in R_1} \eta(x, i) $$

Next let η_max = max_{i∈R_1} η(x, i), and recall that η(x, i) ≤ 1. Then

$$ w(x) \leq \sum_{i:\ \pi(i) = 1} \eta_{\max} = \eta_{\max} \sum_{i \in R_1} 1 = \eta_{\max}\,|R_1| \leq |R_1|, $$

where |R_1| denotes the cardinality of the set R_1. That is, w(x) is bounded by the number of categories i with pertinence π(i) = 1. If η(x, i) = 1 for all i ∈ R_1 then w(x) is exactly the number of such categories.

This meaning of w(x) is very natural and appealing. Indeed, one would expect the item x to count to the extent that it supports more categories. To generalize this notion, define for fixed x ∈ S and fixed task T

$$ \mu_{(x,T)}(i) = \eta(x, i)\,\pi_T(i) $$

the degree to which category i, pertinent to task T, is supported by the (data) item x as shown by the strength of sensory evidence, η(x, i). Therefore, µ_(x,T) : R → [0, 1] is the membership function of a fuzzy set on the set of categories.² The weight of item x is now defined as the cardinality of this fuzzy set. That is

$$ \tilde{w}(x) = \mathrm{Card}\,\{(i, \mu_x(i)) \mid i \in R\} \tag{7} $$

² In the following, assuming only one task, T, for ease of notation, the subscript T will be dropped, to write µ_x(i).

Several formulas for the cardinality of a fuzzy set have been put forward. Here, for illustration purposes, the definition from (Ralescu 1986) is used to obtain

$$ \mathrm{Card}\left(\{\mu_x(i) \mid i \in R\}\right)(k) = \mu_{x,(k)} \wedge \left(1 - \mu_{x,(k+1)}\right) \tag{8} $$

where µ_{x,(k)} denotes the kth largest value of µ_x(·), and µ_{x,(|R|+1)} = 0. Thus, the cardinality defined in Equation (7) is a fuzzy set on {0, ..., |R|}. When an exact value of w(x) is needed, the 0.5-level set of w̃(x) (which is an interval), or its classic cardinality, can be used (Ralescu 1995).

3.2   A new definition for β(i)

Following the discussion from Section 2.4, define

$$ \tilde{\beta}(i) = \min\{A, p(i), u(i)\} \tag{9} $$

As in the case of β defined in (3), β̃(i) = 0 whenever A = 0, or p(i) = 0, or u(i) = 0, and the discussion of (Bundesen 1990) holds: that is, category i biases the selection to the extent that the system is alert, and category i is possible and useful. Alternatively, (9) means that the bias for the selection of i cannot be greater than the system alertness, the possibility of i, or its utility. Furthermore, replacing the product by min also eliminates the possibility of values for β̃ smaller than each one of A, p(i), and u(i), which is the well-known drowning effect of multiplication of positive values smaller than 1. More importantly, it should be mentioned that the min can handle ordinal or qualitative values, without needing to specify precise numbers. Specifying such precise values might be difficult when subjective assessments are made. By contrast, in the case of such assessments, ordinal or qualitative values are usually easily produced.

As already mentioned, in the fuzzy set framework, the product and min are but two particular cases of a t-norm (conjunction operator). A, p(i), and u(i) are interpreted, respectively, as degrees of alertness, possibility (rather than probability) of i to be selected, and utility for the category i, and the bias for i is defined as the conjunction of these. This interpretation makes (9) meaningful beyond a mere computational artifice. Another choice for defining β̃ is to select a more general aggregation operator, H : [0, 1] × [0, 1] × [0, 1] → [0, 1], which would allow the contribution of more than one of A, p(i), u(i) towards β̃.

3.3   A new definition for ν(x, i)

With the new definitions w̃(x) and β̃ of w(x) and β respectively, the meaning of ν(x, i) also changes from a probability to a possibility, more precisely, Possibility(x is i):

$$ \mathrm{Possibility}(x \text{ is } i) = H\left(\eta(x, i), \tilde{\beta}(i), \tilde{w}(x)\right) \tag{10} $$

where H is again an aggregation operator, and hence the definition of ν(x, i) from (Bundesen 1990) is a particular case, when H is the product.

For defining H, one may rely on the huge literature on information fusion, for which fuzzy set theory provides a number of useful operators (see e.g. (Dubois and Prade 1985; Yager 1991; Bloch 1996) for reviews on fuzzy fusion operators). These operators offer a large choice.
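The definitions of this section, namely the fuzzy cardinality of Equation (8), the min-based bias of Equation (9), and the aggregation of Equation (10), can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the convention µ_(0) = 1 for the case k = 0 is assumed rather than stated in the text, and in `possibility` the weight is assumed to be a single number in [0, 1] already derived from the fuzzy weight, a step the text leaves open.

```python
from functools import reduce


def fuzzy_cardinality(memberships):
    """Equation (8): Card(k) = mu_(k) AND (1 - mu_(k+1)), for k = 0..|R|.

    mu_(k) is the k-th largest membership degree. mu_(|R|+1) = 0 as in the
    text; mu_(0) = 1 is an assumed convention for the case k = 0.
    """
    ordered = sorted(memberships, reverse=True)
    padded = [1.0] + ordered + [0.0]
    return [min(padded[k], 1.0 - padded[k + 1]) for k in range(len(ordered) + 1)]


def bias_product(alertness, prior, utility):
    """Equation (3): beta(i) = A * p(i) * u(i)."""
    return alertness * prior * utility


def bias_min(alertness, possibility_i, utility):
    """Equation (9): fuzzy bias as the min of A, p(i), and u(i)."""
    return min(alertness, possibility_i, utility)


def product(*values):
    """Product t-norm extended to any number of arguments."""
    return reduce(lambda a, b: a * b, values, 1.0)


def possibility(eta_xi, beta_i, w_x, aggregate=min):
    """Equation (10) with a pluggable aggregation operator H.

    w_x is assumed to be a number in [0, 1]; with aggregate=product the
    expression has the same form as the hazard function of Equation (2).
    """
    return aggregate(eta_xi, beta_i, w_x)
```

For a crisp membership list such as `[1.0, 1.0, 0.0]`, `fuzzy_cardinality` puts full membership on k = 2 and zero elsewhere, which agrees with the classic count of two supported categories. Comparing `bias_product(0.9, 0.8, 0.7)` with `bias_min(0.9, 0.8, 0.7)` illustrates the drowning effect: the product falls below all three arguments, while the min equals the smallest one.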
This large choice of operators allows modeling different combination behaviors (conjunctive, disjunctive, compromise, etc.), with different degrees (e.g. the min is a less severe conjunction than the product). Operators can also behave differently depending on whether the values to be combined are small, large, of the same order of magnitude, or have different priorities. The operators H could also be set differently for the three values. For instance η and w̃, which depend on x and i, could be combined using an operator H_1, and the result combined with β̃, which depends on i only, using another operator H_2.

        4    Conclusions and Future Work

This paper discussed an attentional model developed in the field of psychology and cognitive science and set in a probabilistic framework. The basic concepts of this model were discussed and an alternative, fuzzy set based approach was suggested. In the fuzzy set framework, modeling would be easier and more natural (for instance replacing numbers by ordinal or qualitative values), and it would allow for more flexible ways of combining the different terms. This discussion paves the way for a new attentional model, whose complete development is left for future work.

                5    Acknowledgments

Anca Ralescu's contribution was partially supported by a visit to Telecom ParisTech.

                        References

Bloch, I. 1996. Information Combination Operators for Data Fusion: A Comparative Review with Classification. IEEE Transactions on Systems, Man, and Cybernetics 26(1):52–67.

Borji, A., and Itti, L. 2013. State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1):185–207.

Bundesen, C.; Habekost, T.; and Kyllingsbæk, S. 2005. A neural theory of visual attention: bridging cognition and neurophysiology. Psychological Review 112(2):291.

Bundesen, C.; Vangkilde, S.; and Petersen, A. 2014. Recent developments in a computational theory of visual attention (TVA). Vision Research.

Bundesen, C. 1990. A theory of visual attention. Psychological Review 97(4):523.

Desolneux, A.; Moisan, L.; and Morel, J.-M. 2003. Computational gestalts and perception thresholds. Journal of Physiology-Paris 97(2):311–324.

Dubois, D., and Prade, H. 1985. A Review of Fuzzy Set Aggregation Connectives. Information Sciences 36:85–121.

Fouquier, G.; Atif, J.; and Bloch, I. 2012. Sequential model-based segmentation and recognition of image structures driven by visual features and spatial relations. Computer Vision and Image Understanding 116(1):146–165.

Gao, D.; Mahadevan, V.; and Vasconcelos, N. 2008. The discriminant center-surround hypothesis for bottom-up saliency. In Advances in Neural Information Processing Systems, 497–504.

Humphreys, G. W. 2014. Feature confirmation in object perception: Feature integration theory 26 years on from the Treisman Bartlett lecture. The Quarterly Journal of Experimental Psychology (just-accepted):1–49.

Kahneman, D. 1973. Attention and Effort. Prentice-Hall.

Ralescu, A. L. 1986. A note on rule representation in expert systems. Information Sciences 38(2):193–203.

Ralescu, D. 1995. Cardinality, quantifiers, and the aggregation of fuzzy criteria. Fuzzy Sets and Systems 69(3):355–365.

Treisman, A. M., and Gelade, G. 1980. A feature-integration theory of attention. Cognitive Psychology 12(1):97–136.

Treisman, A. 1988. Features and objects: The fourteenth Bartlett memorial lecture. The Quarterly Journal of Experimental Psychology 40(2):201–237.

Treisman, A. 2014. The psychological reality of levels of processing. Levels of processing in human memory 301–330.

Yager, R. R. 1991. Connectives and Quantifiers in Fuzzy Sets. Fuzzy Sets and Systems 40:39–75.

Yan, Q.; Xu, L.; Shi, J.; and Jia, J. 2013. Hierarchical saliency detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1155–1162.

Yang, C.; Zhang, L.; Lu, H.; Ruan, X.; and Yang, M.-H. 2013. Saliency detection via graph-based manifold ranking. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3166–3173.