<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Concepts, Proto-Concepts, and Shades of Reasoning in Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Augello</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Gaglio</string-name>
          <email>salvatore.gaglio@unipa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Oliveri</string-name>
          <email>gianluigi.oliveri@unipa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Pilato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Palermo</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>22</volume>
      <issue>2018</issue>
      <abstract>
        <p />
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Abstract</title>
      <p>One of the most important functions of concepts is that of
producing classifications; and since there are at least two different types
of such classification, we had better give a preliminary short description of
them both.</p>
      <p>The first kind of classification is based on the existence of a
property common to all the things that fall under a concept. The second,
instead, relies on similarities between the objects belonging to a
certain class A and certain elements of a subclass A_S of A, the so-called
`stereotypes.' In what follows, we are going to call `proto-concepts' all
those concepts whose power of classification depends on stereotypes,
reserving the term `concepts' for all the others.</p>
      <p>The main aim of this article is to show that, if a proto-concept
is given simply in terms of the ability to make the appropriate
distinctions, then there are stimulus-response cognitive systems, whose
way of manipulating information is based on Neural Networks (NN),
able to make the appropriate distinctions typical of proto-concepts
in the absence of high-level cognitive features such as consciousness,
understanding, representation, and intentionality. This, of course,
implies either that proto-concepts cannot be given simply in terms of the
ability to make the appropriate distinctions, or that we need to modify
our traditional conception of mind, because the induction-like
procedure followed by a NN in producing its classifications, far from being
the ultimate product of a `linguistic mind,' is, rather, inscribed in the
nuts and bolts of the biology/electronics of the system to which the NN
belongs.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>A standard way of producing a classification is by means of
sharp concepts. A concept C is sharp if and only if C refers to a property
P such that, if D is the domain of quantification, there exists a class A
such that A = {x | P(x)}, A ∩ A̅ = ∅, and A ∪ A̅ = D, where A̅ is the
complement of A in D. For example, if D = N, the
concept x is prime is sharp. And we say that `the number 2 falls under the
concept x is prime' or, more simply, that `2 is prime.'</p>
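      <p>The sharpness condition above can be made concrete with a short sketch.
The code below is an illustrative example (the finite slice of N is our
choice, not the paper's): it builds the extension A of the sharp concept
x is prime and checks that A and its complement partition the domain.</p>

```python
def is_prime(n):
    """Sharp predicate P: every element of the domain either falls under it or not."""
    if n in (0, 1):
        return False
    return all(n % k for k in range(2, int(n ** 0.5) + 1))

# A finite slice of the domain of quantification D = N
D = set(range(50))
A = {x for x in D if is_prime(x)}      # extension of "x is prime"
A_bar = D.difference(A)                # the complement of A in D

# Sharpness: A and its complement partition D, so excluded middle holds
assert A.intersection(A_bar) == set()
assert A.union(A_bar) == D
print(sorted(A)[:5])                   # -> [2, 3, 5, 7, 11]
```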
      <p>As is well known, when we are dealing with sharp concepts the law of
excluded middle holds, whereas this is not the case with concepts that are
not sharp like x is a heap. Within ordinary language we make an extensive,
and productive, use of fuzzy concepts like x is a heap, y is bald, z is tall, etc.
without even being aware of their problematic logical status.</p>
      <p>However, besides the kind of classification we obtain by means of
sharp/fuzzy concepts, there is a different type of classification which is
not based on the existence of one and the same property common to all the
members of a class A. It relies, rather, on similarities between the objects
belonging to a certain class A and the elements of a subclass A_S of A,
elements that we are going to call `stereotypes.' By way of example, take
A_S to contain two elements: a shark and a mullet. Starting from A_S we
could generate A by exploiting similarities between our stereotypes and
other things.</p>
      <p>Note that here by `similarity between a and b' we mean a property
common to a and b, where a and b belong to D, e.g. being an animal, having
fins, having scales, etc. Of course, the more properties a and b have
in common, the more similar a and b are to one another. The limiting case
is that expressed by the identity a = b, where the objects denoted by a
and b have the same properties, that is, when a and b denote the same object.</p>
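      <p>This notion of similarity as shared properties can be sketched as
follows; the property sets below are invented for illustration and are not
the paper's own.</p>

```python
# Each object is represented by the set of properties it has; similarity
# between a and b is the number of properties common to both (identity being
# the limiting case in which the property sets coincide).
properties = {
    "shark":   {"animal", "lives_in_water", "fins"},
    "mullet":  {"animal", "lives_in_water", "fins", "scales"},
    "dolphin": {"animal", "lives_in_water", "fins"},
    "sparrow": {"animal", "has_wings"},
}

def similarity(a, b):
    """The more properties a and b share, the more similar they are."""
    return len(properties[a].intersection(properties[b]))

print(similarity("shark", "dolphin"))  # -> 3
print(similarity("shark", "sparrow"))  # -> 1
```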
      <p>It is not difficult to see how, from such similarities with our
stereotypes, we can generate a concept of fish given in terms of: animal
living in water having either fins or scales or both. And, of course, from
such a way of characterising the concept of fish it would follow that whales,
dolphins, etc. are fish. But we know that contemporary zoology has
successfully challenged the above-mentioned use of the term `fish' by
introducing a distinction between mammals and fish, a distinction according
to which whales and dolphins are not fish, but mammals.</p>
      <p>The quasi-accidental, purely phenomenal nature of the classifications
obtained by means of brute correlations, such as those arising from mere
similarities between objects and a set of stereotypes, has led us to call the
cognitive representatives of these classifications `proto-concepts.' We are
going to use the term `concepts' only for the cognitive representatives of all
the other kinds of classification.</p>
      <p>The main aim of this article is to show that, if a proto-concept is
given simply in terms of the ability to make the appropriate distinctions,
then there are stimulus-response cognitive systems1, whose way of
manipulating information is based on Neural Networks (NN), able to make the
appropriate distinctions typical of proto-concepts in the absence of
high-level cognitive features such as consciousness, understanding,
representation, and intentionality. This, of course, implies either that
proto-concepts cannot be given simply in terms of the ability to make the
appropriate distinctions, or that we need to modify our traditional
conception of mind, because the induction-like procedure followed by a NN in
producing its classifications, far from being the ultimate product of a
`linguistic mind,' is, rather, inscribed in the nuts and bolts of the
biology/electronics of the system to which the NN belongs.</p>
      <p>The present paper is a follow-up to `Wittgenstein, Turing, and Neural
Networks' by G. Oliveri and S. Gaglio where, among other things, the authors
endeavour to bring out the genuine cognitive character of Neural Networks
(NN), a cognitive character exhibited, primarily, by their ability to learn
and to be trained to perform a certain task.</p>
    </sec>
    <sec id="sec-3">
      <title>The three main functions of concepts</title>
      <p>Concepts have always played a central rôle in philosophy and, especially,
in logic. One of the most important philosophical disputes in which
medieval philosophers engaged, the so-called `dispute about universals,'
was directly related to the attempt to provide a plausible explanation of the
classifying power of concepts. In fact, the realists argued, the reason why the
concept red is so useful in classifying certain objects, separating them out
from all the others, is that to such a concept there corresponds a property, a
universal, that is present in all and only those things of which we correctly
say that they are red.</p>
      <p>On the other hand, the nominalists thought that, in contrast with what
is asserted by the realists, universals do not exist. For if you take two red
things a and b you immediately realise that the shade of red of a is different
from that of b and that, therefore, `red' is just a word, a name, to which
no universal property corresponds. Therefore, according to the medieval
nominalists, `red' is a name whose usefulness in classifying boils down to the
possibility of putting together all and only those things that are similar to
one another with respect to (a certain) colour.</p>
      <p>
        1See on this [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Chapters 2 and 3.
      </p>
      <p>Concerning the importance of concepts in logic, we can mention here,
by way of example, the peculiar relation that, in pre-Fregean logic, a
concept/predicate was supposed to have to the subject in a judgment, a relation
exploited by Kant in the Critique of Pure Reason to draw the important
distinction between analytic and synthetic judgments.</p>
      <p>However, the modern theory of concepts starts with Frege, for whom a
concept is not an object, but a function2 that takes proper names (or
expressions performing the rôle of proper names) as arguments, and truth-values
as values. From this it immediately follows that, according to Frege, x is
bald, y is a heap, z is tall, etc. are not concepts, because the expressions `a
is bald,' `b is a heap,' `c is tall,' etc. may not have a truth-value for certain
proper names a, b, c.</p>
      <p>Although in Frege we do not find the analytical philosopher's
commitment to the idea that only a theory of language can provide a safe basis for
the construction of a theory of thought,3 nevertheless, for Frege, concepts
have an essential function in thought that consists in presiding over the
formulation and justification of judgments. It is only with the advent of
Gestalt psychology,4 Husserl's phenomenology,5 and the philosophy of
psychology of the later Wittgenstein (embodied, in particular, in the study of
the phenomenon known as seeing-as6) that some non-idealist philosophers and
psychologists discovered the very important rôle performed by concepts in
perception.</p>
      <p>Consider the Necker cube given in Figure 1. Now, apart from the
well-known possibility of shifting from perceiving face ABCD as `coming
forward' to seeing, instead, face 1234 as `coming forward,' depending on which
of the two faces you are focussing your attention on, the really interesting
thing here is that one of the necessary conditions for you to see the object
in Figure 1 as a cube is having the concept of cube.</p>
      <p>In fact, although a young child with no knowledge of mathematics would be
able to perceive the face-shifting phenomenon, and draw a fairly faithful
picture of the object in Figure 1, if he were asked to say what he sees the
object as, he would probably reply `a box,' `a lump of sugar,' `a brick,' `a
wire frame,' etc. but certainly not `a cube,' because he does not know what
a cube is.</p>
      <p>If what we have said so far is correct, concepts have at least three
different important functions: classifying, being an integral part of
judgments, and affecting some of our perceptions. Of these three functions,
two of them (being an integral part of judgments and affecting some of our
perceptions) presuppose the existence of a cognitive system able to produce
judgments/thoughts and representations of objects. The classifying function,
instead, seems to us to be somewhat independent of the existence of such a
cognitive system. For although, when a competent speaker of English says
`Socrates is a man,' the very meaningfulness of the assertion, and of the
thought expressed by it, presupposes the speaker's ability to classify some
of the objects of his domain of quantification as men; and when someone sees
something as a cube, the very possibility of his perception depends on his
ability to classify certain objects of his domain of quantification as cubes;
the ability, for example, to classify something as a fish (see §1) may not
presuppose either judging or seeing-as (representing).</p>
      <p>
        2See [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        3See on this [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], especially Chapter 10.
      </p>
      <p>
        4See on this [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], especially in relation to the impact that Gestalt psychology has on
what he calls `productive thinking.'
      </p>
      <p>
        5See on this [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Part Three, Chapter Three.
      </p>
      <p>
        6See on this [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], Part II, Chapter XI.
      </p>
      <p>Perhaps some light on these matters will be obtained from considering
how concepts are given to us, because in so doing we might come across
concepts that are given to us as means of classification and not as
instruments for judging or representing.</p>
    </sec>
    <sec id="sec-4">
      <title>Concepts and proto-concepts</title>
      <p>In investigating how concepts are given to us, one of the obvious things
to look at is language. Two of the main concept-producing devices present
in language are the so-called `prototypes'7 and `stereotypes.'8 A prototype
is an individual belonging to the domain of quantification that has certain
features `at their best.' Take, by way of example, D to be the set of the
elements of the colour spectrum projected on to a wall by a prism when this
is hit by a pencil of light rays (see Figure 2). The colour spectrum D, with
the Euclidean distance defined on it, is a metric space (D, d).</p>
      <p>Now, for each colour band i present in the colour spectrum, where i ∈ R,
choose the middle element of the band as the prototype of that colour, and
call it `p_i.' If k, p ∈ D, where p represents a prototype, then, as is shown
by the colour spectrum, the shorter the distance between k and p, the more
similar the colour represented by k is to the colour represented by p. Clearly,
if p_i is the prototype of the red colour band, then the set R = {x | d(x, p_i)
&lt; d(x, p_j), for any j such that j ≠ i} can be considered as a classification
of elements of D which can, eventually, be turned into the extension of a
concept, and precisely of the concept x is red.</p>
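      <p>A minimal sketch of this nearest-prototype classification, with
wavelengths in nanometers standing in for the metric space (D, d); the
prototype values are rough illustrative choices, not the paper's.</p>

```python
# One prototype p_i per colour band; d(x, p) is the Euclidean distance |x - p|.
prototypes = {"red": 700, "green": 530, "blue": 470}

def nearest_prototype(x):
    """x belongs to band i iff d(x, p_i) is smaller than d(x, p_j) for all j != i."""
    return min(prototypes, key=lambda c: abs(x - prototypes[c]))

# The set R of the text: all wavelengths nearest to the red prototype
R = {x for x in range(400, 751) if nearest_prototype(x) == "red"}
print(nearest_prototype(680))  # -> red
```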
      <p>A stereotype, on the other hand, is an individual s belonging to the
domain of quantification that appears to have certain features, but such
features are not necessarily given at their best in s. Take (D, d) to be as
above. If k, s ∈ D, where s represents a stereotype, then, as above, the
smaller the distance between k and s, the more similar the colour represented
by k is to the colour represented by s. As colour stereotypes s consider the
colours of the paints present on the palette of a Renaissance painter.</p>
      <p>Assuming that on the palette of a Renaissance painter there could be a
finite number of colour stereotypes s_1, ..., s_n, and that s_i is the only
stereotype of red, we have that C_i = {x | d(x, s_i) &lt; d(x, s_j), for any j
such that j ≠ i} is a classification of elements of D which can, eventually,
be turned into the extension of the concept x is red.</p>
      <p>
        7On a related concept of prototype see [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], especially Chapter 3, §3.9. See also [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] on prototypes and theory-like representations; and [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] on incorporating prototype theory in Convolutional Neural Networks.
      </p>
      <p>
        8On a linguistic concept of stereotype see [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>Although, as we have seen in the examples above, both prototypes and
stereotypes generate potential extensions of concepts, there are some
analogies and differences between these two types of objects that deserve
some attention.</p>
      <p>One of the main differences between prototypes and stereotypes is that
there is only one prototype of something, but you could have several
stereotypes of the same kind of thing. Indeed, the prototype of red is the
electromagnetic wave having a wavelength of 700 nanometers (see Figure 2),
whereas the so-called `Titian red' and `Pompeian red' are two different
possible stereotypes of red. Of course, if there is more than one stereotype
of, say, red, the classification induced by all the stereotypes of red
available is going to be the union of the classifications induced by each
single stereotype; and the larger the number of different stereotypes of red,
the better the chances of producing a correct classification of red
objects.</p>
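      <p>The union of the classifications induced by several stereotypes of the
same colour can be sketched as follows; the stereotype wavelengths (standing
in for `Titian red,' `Pompeian red,' etc.) are invented for illustration.</p>

```python
# Several stereotypes per concept: x falls under "red" when its nearest
# stereotype overall is any one of the red stereotypes, so the red region is
# the union of the regions induced by each single red stereotype.
stereotypes = {
    "red":   [685, 720],   # e.g. 'Titian red' and 'Pompeian red'
    "green": [510, 550],
}

def nearest_stereotype(x):
    pairs = [(c, s) for c, ss in stereotypes.items() for s in ss]
    return min(pairs, key=lambda cs: abs(x - cs[1]))[0]

# Union of the regions induced by the two red stereotypes
red_region = {x for x in range(400, 751) if nearest_stereotype(x) == "red"}
print(nearest_stereotype(700))  # -> red
```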
      <p>Secondly, the very idea of `features at their best,' present in the
definition of prototype, requires some form of theorising and judging to
distinguish, for example, between a fin at its best and a fin that is not at
its best. On the other hand, when it comes to producing stereotypes, the
situation looks rather different.</p>
      <p>To see this, consider the well-known ethological phenomenon of
imprinting. Imprinting is that type of learning that takes place only within
a certain number of hours from birth. As Konrad Lorenz has shown, if, within
a certain number of hours from its birth, a gosling is exposed to a human
being, rather than to a goose, it ends up considering the human being as its
parent, following him, etc. (see Figures 3 and 4). In other words, imprinting
is that form of learning whereby geese, and other animals, form a stereotype
not only of their mother/parent, but also of the species to which they belong.
This has two very important consequences for us. First, the phenomenon of
imprinting clearly shows that the formation of some stereotypes is brute,
that is, it does not take place through theorising, or any sort of reasoning
or representation influenced by previously acquired concepts.</p>
      <p>Secondly, there are stereotypes, some of which play a crucial rôle in
producing very important, basic classifications, that are independent of a
linguistic mind.</p>
      <p>But, be that as it may with regard to the connection between imprinting
and stereotype formation, we believe that, if sufficiently many samples are
provided, then the classification obtained of the elements of the domain of
quantification D can be turned into the extension of the corresponding
proto-concept by means of CNNs. Although the following section gestures in
this direction, substantiating this claim will be one of the objects of our
future investigations.</p>
    </sec>
    <sec id="sec-5">
      <title>CNNs and Inductively Generated Proto-Concepts?</title>
      <p>
        Traditional feed-forward Neural Network architectures receive a single
vector as input and process it through a series of hidden layers. Each hidden
layer is constituted by a set of artificial neurons, where each unit is fully
connected to all the units belonging to the previous layer. The neural units
belonging to the same layer carry out their computation in parallel with the
other units of that layer. Furthermore, the neurons of the same layer do not
share any connections. Traditional feed-forward architectures do not perform
well on image recognition and image segmentation tasks. In recent years a
category of Neural Network architectures, known in the literature as
Convolutional Neural Networks (CNNs) and inspired by the mammalian visual
system [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], [Fukushima, 1980], [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], has proven to be very effective in performing tasks like image
recognition and segmentation. The very first convolutional neural network
architecture was LeNet, developed by Yann LeCun, and it was effectively used
for character recognition tasks. This kind of architecture gave rise to the
general paradigm of Deep Learning.
      </p>
      <p>The main operations performed in a Convolutional Neural Network are
Convolution, Pooling (or Sub-Sampling), and Classification.</p>
      <p>Traditionally, ConvNets assume that the inputs are images.
A typical representation of a convolutional network is shown in Figure 5.</p>
      <p>The network processes the original image layer by layer, from the
original pixel values to the final classification output. The input layer
specifies the dimensions of the input images. CNNs derive their name from the
"convolution" operator. The main goal of convolution in this kind of
architecture is to extract features from the input image. The convolution
operation allows the network to learn image features using small portions of
the input data. A set of parameterized kernels constitutes the convolution
layer. Every kernel is spatially small, and it is applied to the whole image
through a spanning process.</p>
      <p>The input image is convolved with these multiple learned kernels, which
exploit shared weights. This operation generates a two-dimensional activation
map that gives the responses of that filter at every spatial position. The
size of the kernels gives rise to the locally connected structure and
produces a set of feature maps. Then the pooling layer reduces the size of
the image, attempting to preserve the information.</p>
      <p>
        The combination of the convolution and the pooling layers realizes the
feature-extraction part. Subsequently, the features are weighted and combined
in the fully-connected layer, which constitutes the classification layer of
the network [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
      </p>
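      <p>The two feature-extraction operations just described can be sketched in
a few lines of NumPy; the toy image and the edge-detecting kernel are
illustrative assumptions, and a real CNN would learn many such kernels.</p>

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide one small kernel (shared weights) over the whole image,
    stride 1, 'valid' borders, producing a single activation map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Reduce the map's size, keeping the strongest response in each block."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.zeros((6, 6))
image[:, 3:] = 1.0                     # left half dark, right half bright
kernel = np.array([[1.0, -1.0]])       # responds to vertical edges

activation_map = convolve2d(image, kernel)   # one feature map
pooled = max_pool(activation_map)            # smaller map, information kept
print(activation_map.shape, pooled.shape)    # -> (6, 5) (3, 2)
```

Stacking several such convolution-plus-pooling stages, followed by a
fully-connected layer, yields the overall architecture described above.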
      <p>
        Convolutional Neural Networks (CNNs) are powerful models capable of
achieving outstanding results, in particular for image classification and
segmentation tasks. Nowadays this kind of neural architecture has been
extremely successful in identifying faces, objects, and traffic signs, and is
widely used in vision for robots and self-driving cars. Moreover, ConvNets
have also been effectively used in several Natural Language Processing tasks.
Furthermore, they are capable of effectively extracting features from images:
pre-trained models are used as generic feature extractors [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. This goal is reached by removing the last layer, which gives the
output classification scores. The activations of the last fully-connected
layer define the features extracted from the input image [31].
      </p>
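      <p>The feature-extraction procedure (dropping the classification layer and
reading off the activations of the last fully-connected layer) can be
sketched as follows; the random toy weights merely stand in for a pre-trained
network.</p>

```python
import numpy as np

# A tiny two-layer stand-in for the tail of a pre-trained CNN.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((64, 32)), np.zeros(32)   # last fully-connected layer
W2, b2 = rng.standard_normal((32, 10)), np.zeros(10)   # classification layer

def forward(x):
    features = np.maximum(x @ W1 + b1, 0)  # activations kept as the feature vector
    scores = features @ W2 + b2            # class scores, discarded when the
    return features, scores                # network is used as a feature extractor

x = rng.standard_normal(64)                # stand-in for a CNN's pooled output
features, scores = forward(x)
print(features.shape, scores.shape)        # -> (32,) (10,)
```

The `features` vector is what would then be fed to a downstream classifier
such as an SVM or a Random Forest.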
      <p>
        These kinds of features extracted from pre-trained CNNs have been
successfully used in computer vision tasks such as scene recognition or
object-attribute detection, yielding better results than traditional
handcrafted features [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      </p>
      <p>
        Moreover, Athiwaratkun et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] have also demonstrated that Random Forests and SVMs can be used with
features extracted from a CNN to obtain better prediction accuracy than the
original CNN.
      </p>
      <p>
        Recent results indicate that very deep networks achieve even better
results on various benchmarks [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. One drawback of this trend, however, is the time required to train
such neural architectures.
      </p>
      <p>Summing up, we can say that CNNs can be used to perform a
generalization/classification based on the stereotypes belonging to the
training set and, as discussed in this section, with considerable
improvements over other traditional types of neural networks (even if the
limitations typical of neural networks, relating to the size of the training
set and the curse of dimensionality, still remain). This is a good baseline
for investigating, in future work, whether the classification obtained by a
CNN can be turned into the extension of corresponding proto-concepts.</p>
    </sec>
    <sec id="sec-8">
      <title>Conclusions</title>
      <p>This is a philosophy- and ethology-inspired paper relating to cognition.
Starting from a discussion of various kinds of classification, we are led to
distinguish between classifications operated on the basis of concepts and
classifications driven by what we have called `proto-concepts.' The difference
between the two kinds of classification is that, whereas concepts appeal to
the existence of a property common to all the objects falling under them,
proto-concepts, instead, derive their classificatory power from a set of what
we call `stereotypes' and the relevant similarities existing between these
stereotypes and the objects falling under the proto-concepts.</p>
      <p>Having discussed the difference between prototypes and stereotypes and
their rôle in producing classifications (classifications that are presented
as potential extensions of proto-concepts), we discuss the ethological
phenomenon of imprinting discovered and studied by Konrad Lorenz. As is well
known, imprinting is that cognitive phenomenon whereby goslings, if exposed
to a certain object K within a certain time from birth (a goose, a human
being, etc.), elect K as a stereotype (in our sense) of parent/representative
of their species and behave accordingly, following K, etc. This, of course,
implies that goslings subject to imprinting classify K, and themselves, as
belonging to the same class, which becomes the potential extension of a
proto-concept.</p>
      <p>We then engage in a discussion of a particular type of Neural Network,
the so-called `Convolutional Neural Network' (CNN). What we intend to show
in our discussion of CNNs is that cognitive agents that operate on the basis
of CNNs are able to produce classifications typical of proto-concepts in the
absence of high-level cognitive features such as consciousness,
understanding, representation, and intentionality. On the basis of this
result we ask ourselves whether this means that proto-concepts cannot be
given simply in terms of the ability to make the appropriate distinctions or
that we should, instead, modify our traditional conception of mind.</p>
      <p>[31] Garcia-Gasulla, Dario, Ferran Parés, Armand Vilalta, Jonatan
Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, and Toyotaro Suzumura:
"On the Behavior of Convolutional Nets for Feature Extraction."
Journal of Artificial Intelligence Research 61 (2018): 563-592.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Bonomi</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . (ed.):
          <year>1973</year>
          ,
          <article-title>La struttura logica del linguaggio, Valentino Bompiani</article-title>
          , Milano.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Dummett</surname>
            ,
            <given-names>M. A. E.</given-names>
          </string-name>
          :
          <year>1991</year>
          , The Logical Basis of Metaphysics, Duckworth, London.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Dummett</surname>
            ,
            <given-names>M.A.E.</given-names>
          </string-name>
          :
          <year>1993</year>
          , Origins of Analytical Philosophy, Harvard University Press, Cambridge, Massachusetts.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Frege</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <year>1973</year>
          , `Funzione e Concetto',
          <source>in [1]</source>
          , pp.
          <volume>411</volume>
          {
          <fpage>423</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Frege</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <year>1973</year>
          , `Concetto e Oggetto',
          <source>in [1]</source>
          , pp.
          <volume>373</volume>
          {
          <fpage>386</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Gardenfors, P.:
          <year>2004</year>
          ,
          <string-name>
            <given-names>Conceptual</given-names>
            <surname>Spaces</surname>
          </string-name>
          : The Geometry of Thought, MIT Press, Cambridge, Massachusetts.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Husserl</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <year>1998</year>
          ,
          <article-title>Ideas pertaining to a pure phenomenology and to a phenomenological philosophy, transl</article-title>
          . by
          <string-name>
            <given-names>F.</given-names>
            <surname>Kersten</surname>
          </string-name>
          , rst book, Kluwer Academic Publishers, Dordrecht.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Kant</surname>
          </string-name>
          , I.:
          <volume>1787</volume>
          (
          <year>1990</year>
          ), Critique of Pure Reason, transl. by Norman Kemp Smith, Macmillan, London.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Kohonen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <year>2001</year>
          , Self-Organizing Maps, Springer, Berlin.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Lieto</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <year>2018</year>
          , `
          <article-title>Heterogeneus Proxytypes Extended: Integrating Theory-like Representations and Mechanisms with Prototypes and Exemplars,'</article-title>
          <source>BICA 2018</source>
          , Springer,
          <source>Advances in Intelligent Systems and Computing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Lorenz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <year>2012</year>
          , L'anello di Re Salomone, Adelphi eBook, Milano.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>McCulloch</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Pitts</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <year>1943</year>
          ,
          <article-title>`A Logical Calculus of the Ideas Immanent in Nervous Activity'</article-title>
          ,
          <source>Bulletin of Mathematical Biophysics</source>
          ,
          <volume>5</volume>
          :
          <fpage>115</fpage>
          -
          <lpage>133</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Nilsson</surname>
            ,
            <given-names>N. J.</given-names>
          </string-name>
          :
          <year>2002</year>
          ,
          <source>Intelligenza artificiale</source>
          , edited by S. Gaglio, Apogeo, Milano.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Oliveri</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <year>1984</year>
          , `
          <article-title>Le Ricerche di Wittgenstein nella lettura di S. Kripke'</article-title>
          ,
          <source>Paradigmi</source>
          , Anno II, n. 6.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Oliveri</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Gaglio</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : In press, `
          <article-title>Wittgenstein, Turing, and Neural Networks'</article-title>
          ,
          <source>Giornale di Metafisica</source>
          , vol.
          <volume>1</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Putnam</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <year>1975</year>
          , `
          <article-title>The meaning of `meaning''</article-title>
          ,
          <source>Minnesota Studies in the Philosophy of Science</source>
          , vol.
          <volume>7</volume>
          , pp.
          <fpage>131</fpage>
          -
          <lpage>193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Rumelhart</surname>
            ,
            <given-names>D. E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G. E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          :
          <year>1986</year>
          , `
          <article-title>Learning Internal Representations by Error Propagation'</article-title>
          , in
          <string-name>
            <surname>Rumelhart</surname>
          </string-name>
          , D. E., and
          <string-name>
            <surname>McClelland</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          (Eds.) (
          <year>1986</year>
          ), Parallel Distributed Processing, MIT Press, Cambridge, MA, Vol.
          <volume>1</volume>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Saleh</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elgammal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feldman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <year>2016</year>
          , `
          <article-title>Incorporating Prototype Theory in Convolutional Neural Networks'</article-title>
          ,
          <source>Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence.</source>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Varela</surname>
            ,
            <given-names>F. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thompson</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosch</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <year>1993</year>
          , The Embodied Mind, The MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Wertheimer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <year>1965</year>
          ,
          <article-title>Il pensiero produttivo</article-title>
          , transl. by
          <string-name>
            <given-names>M.</given-names>
            <surname>Giacometti</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bolletti</surname>
          </string-name>
          , Giunti e Barbera, Firenze.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Wittgenstein</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <year>1981</year>
          (
          <year>1921</year>
          ),
          <article-title>Tractatus Logico-Philosophicus</article-title>
          , transl. by
          <string-name>
            <given-names>D.F.</given-names>
            <surname>Pears</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>B. F.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          , with the introduction by
          <string-name>
            <given-names>B.</given-names>
            <surname>Russell</surname>
          </string-name>
          , Routledge &amp; Kegan Paul, London &amp; Henley.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Wittgenstein</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <year>1983</year>
          ,
          <source>Philosophical Investigations</source>
          , Basil Blackwell, Oxford.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Hubel</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Wiesel</surname>
          </string-name>
          ,
          <article-title>Receptive fields of single neurones in the cat's striate cortex</article-title>
          ,
          <source>J. Physiol.</source>
          , vol.
          <volume>148</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>574</fpage>
          -
          <lpage>591</lpage>
          ,
          <year>1959</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>K.</given-names>
            <surname>Fukushima</surname>
          </string-name>
          ,
          <article-title>Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position</article-title>
          ,
          <source>Biol. Cybern.</source>
          , vol.
          <volume>36</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>202</lpage>
          , Apr.
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bottou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Haffner</surname>
          </string-name>
          ,
          <article-title>Gradient-based learning applied to document recognition</article-title>
          ,
          <source>Proc. IEEE</source>
          , vol.
          <volume>86</volume>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>2278</fpage>
          -
          <lpage>2324</lpage>
          , Nov.
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Lars</given-names>
            <surname>Hertel</surname>
          </string-name>
          , Erhardt Barth, Thomas Kater, Thomas Martinetz,
          <article-title>"Deep Convolutional Neural Networks as Generic Feature Extractors"</article-title>
          ,
          <source>2015 International Joint Conference on Neural Networks (IJCNN)</source>
          , Killarney,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sermanet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Anguelov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Erhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rabinovich</surname>
          </string-name>
          ,
          <article-title>Going deeper with convolutions</article-title>
          ,
          <source>presented at the Workshop ImageNet Large Scale Visual Recognition Challenge (ILSVRC)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>K.</given-names>
            <surname>Simonyan</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zisserman</surname>
          </string-name>
          ,
          <article-title>Very deep convolutional networks for large-scale image recognition</article-title>
          ,
          <source>arXiv preprint arXiv:1409.1556</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharif Razavian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Azizpour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sullivan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Carlsson</surname>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>CNN features off-the-shelf: an astounding baseline for recognition</article-title>
          .
          <source>In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops</source>
          , pp.
          <fpage>806</fpage>
          -
          <lpage>813</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>B.</given-names>
            <surname>Athiwaratkun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>"Feature representation in convolutional neural networks"</article-title>
          ,
          <source>CoRR</source>
          , vol.
          <source>abs/1507.02313</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>