=Paper= {{Paper |id=Vol-2325/paper-05 |storemode=property |title=What Stands-in for a Missing Tool?: A Prototypical Grounded Knowledge-based Approach to Tool Substitution |pdfUrl=https://ceur-ws.org/Vol-2325/paper-05.pdf |volume=Vol-2325 |authors=Madhura Thosar,Christian A. Mueller,Sebastian Zug |dblpUrl=https://dblp.org/rec/conf/kr/ThosarMZ18 }} ==What Stands-in for a Missing Tool?: A Prototypical Grounded Knowledge-based Approach to Tool Substitution== https://ceur-ws.org/Vol-2325/paper-05.pdf
                   What Stands-in for a Missing Tool?
                A Prototypical Grounded Knowledge-based
                     Approach to Tool Substitution

                 Madhura Thosar (1), Christian A. Mueller (2), Sebastian Zug (1)
                              (1) Faculty of Computer Science,
                            Otto von Guericke University Magdeburg, Germany
                                    (2) Robotics Group,
                          Computer Science & Electrical Engineering Department,
                                   Jacobs University Bremen, Germany
                           thosar@iks.cs.ovgu.de, zug@ivs.cs.uni-magdeburg.de,
                                     chr.mueller@jacobs-university.de



                                                                         especially when they are faced with unfavorable situa-
                                                                         tions. For example, if we don’t find a hammer to ham-
                         Abstract                                        mer a nail into a wall, we will use a heel of a shoe or a
                                                                         rock or if a tray is unavailable for serving the drinks,
    When a robot is operating in a dynamic en-                           we will use a plate for serving. In situations like these,
    vironment, it cannot be assumed that a tool                          humans seem to know - either from the past experi-
    required to solve a given task will always be                        ence or from observations or from the “necessity is the
    available. In case of a missing tool, an ideal                       mother of improvisation (invention)” type approach -
    response would be to find a substitute to com-                       what kind of object is needed as a substitute.
    plete the task. In this paper, we present a                              On the contrary, consider a robot performing a task
    proof of concept of a grounded knowledge-                            that involves tool use. When a robot is operating in a
    based approach to tool substitution. In order                        dynamic environment, it can not be assumed that a
    to validate the suitability of a substitute, we                      tool required in the task will always be available. In
    conducted experiments involving 22 substitu-                         situations like these, an effective way for a robot would
    tion scenarios. The substitutes computed by                          be to find an alternative as humans do, for example,
    the proposed approach were validated on the                          use an eating plate for serving, rather than wait un-
    basis of the experts’ choices for each scenario.                     til a tray becomes available. This skill is significant
    Our evaluation showed, in 20 out of 22 scenar-                       when operating in a dynamic, uncertain environment
    ios (91%), the approach identified the same                          because it allows a robot to adapt to unforeseen sit-
    substitutes as experts.                                              uations to a degree. The question is how can a robot
                                                                         determine which object in the environment is a viable
1    Introduction                                                        candidate for a substitute? One possible approach is
                                                                         to interact with an object in the manner in which
The sophistication pertaining to tool-use in humans                      the missing tool is maneuvered. However, it would be
involves not just the dexterity in manipulating a tool,                  time consuming for a robot to interact with every
but also the diversity in tool exploitation. The abil-                   object in the environment to determine its viability,
ity to exploit the tools has enabled humans to adapt                     which makes this approach less practical.
and thus exert control over an uncertain environment,                        In this prototypical work, we propose a non-invasive
                                                                         approach that identifies viable candidates from the
Copyright © by the paper’s authors. Copying permitted for
private and academic purposes.
                                                                         existing objects in the environment. This paper makes
In: G. Steinbauer, A. Ferrein (eds.): Proceedings of the 11th In-
                                                                         the following contributions: 1) An approach to create
ternational Workshop on Cognitive Robotics, Tempe, AZ, USA,              grounded knowledge about objects expressed in terms
27-Oct-2018, published at http://ceur-ws.org                             of their properties (Sec. 5.2), 2) an approach to identify




relevant properties of a missing tool and determine a            work discussed in [2] retrieves the knowledge about ob-
substitute on the basis of them (Sec. 5.3).                      jects from the ROAR [22] relational database and de-
                                                                 termines a substitute that shares similar affordances.
                                                                 However, in ROAR, the knowledge is acquired either
2   Related Work
                                                                 using machine learning techniques requiring training
Typically, a substitute for a missing tool is determined         examples or inferred or hand-coded. The work in [7]
by means of a knowledge base that provides knowledge             uses ConceptNet, where potential candidates are
about objects and similarity measures to determine the           extracted from the knowledge base if they share the
similarity between a missing tool and a potential sub-           same parent with a missing tool for the predetermined
stitute. In the following, in addition to the approaches         relations: has-property, capable-of and used-for. After
to determine a substitute, we also report the litera-            eliminating irrelevant candidates, a substitute is deter-
ture related to existing knowledge bases developed for           mined on the basis of the similarity metrics. The ap-
robotic applications.                                            proach proposed in [1] uses a part-based 3D model and
                                                                 weight of an object to determine the orientation and
                                                                 manipulation of a substitute to be used as a missing
Knowledge Base
                                                                 tool. In cases where a supervised machine learning
We reviewed in [24] nine existing knowledge bases                technique is used, providing a bulk of labeled examples
namely: KNOWROB [23], MLN-KB [25], NMKB [19],                    beforehand would not be realistic for a substitution
OMICS [10], OMRKF [21], ORO [14], OUR-K [15],                    problem scenario. On the other hand, the approaches
PEIS-KB [8], and RoboBrain [20]. The objective was               which rely on existing external knowledge bases are
to determine whether these existing knowledge bases              built around the available knowledge in the knowledge
1) contain ontological knowledge about the properties            bases, which imposes some constraints. We circumvent
of objects, 2) ground such knowledge in the robot’s              this issue by first identifying what knowledge is
perception, and 3) model intra-class variability in a            generally required to determine a substitute and then
property instead of expressing the property in binary            building an approach to acquire the required knowledge
form. We gained primarily the following insights                 and compute a substitute on the basis of it.
which form the basis for our work.
   We noted that the majority of the knowledge bases             3   Challenges
relied on the external human-centric commonsense
knowledge bases such as WordNet, Cyc, OpenCyc, and               How to characterize similarity between a missing tool
some either relied on the hand-coded knowledge or                and a potential substitute: A candidate for a substi-
on the knowledge acquired by human-robot interac-                tute is expected to be similar to a missing tool to some
tion. The main issue, we believe, is that the depth and          degree to ensure substitutability. The notion of simi-
breadth of the human-centric knowledge base is not               larity can be understood in various forms, for instance,
observable by a robot in its entirety due to its lim-            a distance between two objects denoted by two points
ited sensing capabilities. This causes a disconnect be-          in a multi-dimensional space or two objects belonging
tween human-centric knowledge and robot-centric per-             to the same cluster or aspects of the objects that are
ception. To address this issue, we aim to acquire the            identified as shared. In this work, the question will be
robot-centric perceptual data for different properties           addressed in a broader sense: it is not merely about
of objects. Such property data can then be used to               identifying a similar object by deploying some similar-
generate grounded knowledge about objects (see Sec.              ity measure; instead, it is about gaining access to
5.1).                                                            which aspects of the objects were found to be shared
                                                                 between the similar objects.
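The distinction drawn in the first challenge, between a bare similarity score and access to the shared aspects themselves, can be illustrated with a minimal sketch. It is not the paper's similarity computation: the Jaccard overlap is a stand-in chosen for brevity, and the function name and property labels are hypothetical.

```python
from typing import FrozenSet, Set, Tuple


def shared_aspects(a: Set[str], b: Set[str]) -> Tuple[float, FrozenSet[str]]:
    """Return a Jaccard overlap score together with the shared property labels."""
    shared = frozenset(a & b)
    union = a | b
    score = len(shared) / len(union) if union else 0.0
    return score, shared


# Hypothetical property labels, chosen only for illustration.
tray = {"rigid", "rectangular", "flat", "wooden"}
plate = {"rigid", "circular", "semi-flat", "white"}

score, shared = shared_aspects(tray, plate)
print(score, sorted(shared))
```

Returning the shared set alongside the score keeps the reason for a match inspectable, which a single distance value would hide.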
Substitution Computation                                             What kind of knowledge is required to determine the
                                                                 similarity: It has been demonstrated in the literature
One of the closest areas that study the usability of an          on tool use in humans and animals alike that in order
object is affordances of tools where the primary focus           to use an object in tasks one needs to have knowledge
is to examine various functional abilities of an object          about objects [4]. Baber in [5] also noted that con-
by exploring what actions can be performed on the                ceptual knowledge about objects is especially desired
object and observing its responses. As such, using a             in tool use where a systematic deliberation is called
substitute in place of a missing tool can also be seen as        for. For a robot, the story won’t be much different if
transferring of an affordance of the missing tool to the         it is expected to perform in the real world alongside
substitute after determining similarity between them.            humans. As a consequence, a robot needs conceptual
   In [3], a substitute for a missing tool is inferred on        knowledge about an object, where the object is not
the basis of inheritance and equivalence relations. The          only a physical entity that is merely to be perceived,




but also a concept which consists of distinct character-        for a tray. A tray can be defined as a rigid, rectangu-
istics and relations which set objects apart from each          lar, flat, wooden, brown-colored object, while a plate
other and also make them similar to each other.                 can be defined as a rigid, circular, semi-flat, white-col-
   How to acquire the necessary knowledge: The ac-              ored object and a mouse pad as a soft, rectangular, flat,
quisition of such conceptual knowledge is not with-             leather-based object. Bear in mind, however, that some
out challenge. From a robot’s standpoint, it is a trade-        properties are more relevant than others with respect
off between what needs to be known and what can                 to the designated purpose of the tool. For a tray whose
be known. The trade-off is a direct consequence of              designated purpose is to carry, rigid and flat are more
the limited perception capabilities of a robot, which           relevant than the material or the color of the tray.
often leads to partial understanding of the environ-            Consequently, to find the most appropriate substitute,
ment. While deploying multi-modal perception to                 the relevant properties of the unavailable tool need to
extract the required knowledge about objects would              correspond to as large a degree as possible to the prop-
be an ideal solution, it carries its own set                    erties of the possible choices for a substitute.
of complexities such as noisy sensors, dynamicity of               The proposed approach performs conceptual
the environment, and complexities of the composition            knowledge-driven computation to identify the relevant
of an object. For this prototypical work, the necessary         properties of the missing tool and determines the most
knowledge is acquired using human-centric as well as            similar substitute on the basis of those properties. Be-
machine-centric methods.                                        sides identifying the most similar object as a substitute
   How to maneuver a substitute as a missing tool:              for a missing tool, the proposed approach grants ex-
Once the substitute has been identified, a robot is ex-         plicit access to the relevant properties of the missing
pected to use it in place of a missing tool and achieve         tool, which carries two advantages: firstly, knowing
the same result as the missing tool in the task. The            which properties are primarily required in the poten-
challenge of estimating the maneuver and the grasping           tial substitute narrows down the search space and sec-
of a substitute is twofold: to determine whether the            ondly, in case of an unknown object instance, only the
maneuver and grasping knowledge of a missing tool               relevant properties will have to be learned to determine
can be transferred and utilized on a substitute, else,          a substitute.
estimate the maneuver and grasping for a substitute                The conceptual knowledge considered in this work
such that it can be used as a missing tool in the task.         primarily involves properties of the objects. The prop-
   For this work, we have focused on the first three            erties considered are divided into physical and func-
challenges and have developed a prototypical system             tional properties where physical properties describe
called ERSATZ (German word for a substitute or al-              the physicality of the objects such as rigidity, weight,
ternative) where the focus is to identify the required          hollowness while the functional properties ascribe the
knowledge to determine a substitute and develop a sys-          (functional) abilities or affordances to the objects
tem that computes a substitute for a missing tool.              such as containment, blockage, support. The functional
                                                                properties in the proposed approach play a primary
                                                                role in identifying the relevant properties of a missing
4   Approach
                                                                tool (see Sec. 5.3).
The proposed approach distinguishes a tool from a                  The functional properties considered in this work
substitute where a tool is defined as an artifact that          are derived from the theory of image schemas [9] which
is designed, manufactured and maneuvered in accor-              has its roots in cognitive linguistics. According to [12]
dance with its designated purpose in the tasks such             image schemas are patterns abstracted from spatio-
as a hammer for hammering or a tray for serving, while          temporal experiences. Essentially, image schemas cap-
a substitute is seen as an extension of a missing tool.         ture recurrent patterns that emerge from our percep-
Within the context of a designated purpose, the rela-           tual and bodily interactions with the environment.
tionship between a tool and a substitute is symmetric;          Since some of these patterns are posited on the op-
for instance, for hammering, a hammer can be replaced           erational abilities of objects, Kuhn postulated in [12]
by a heeled shoe and vice versa. However, it may not            that affordances, a.k.a. functional properties [9], for
always be the case once you step outside the context;           spatio-temporal processes can be derived from image
for instance, a hammer cannot replace a heeled shoe.            schemas. For example, the containment schema sug-
Our research work, therefore, focuses on searching for          gests an object’s ability to contain something, the
a substitute for a conventional tool required in the on-        support schema indicates an object’s ability to hold
going task as opposed to determining a substitute for           up something, and the blockage schema refers to the
itself.                                                         ability of an object to block or obstruct the movement
   Consider a scenario in which a robot has to choose           of another object. Currently, the proposed system is
between a plate and a mouse pad as an alternative               restricted to three functional properties based on image




                                                               type theory since we believe that the richness of a pro-
                                                               totype provides an unaltered perspective on the char-
                                                               acteristics of object instances. Based on the proposed
                                                               symbolic shape representation we analyze topology and
                                                               structure within the encoded symbol compositions in
                                                               order to discover persistent patterns that may repre-
                                                               sent shape concepts.
Figure 1: Illustration of the object shape conceptual-
                                                                  We introduce an iterative filtering process [18] to
ization approach [18]. Concepts are randomly colored.
                                                               associate instances with groups which may represent
schemas: containment, support and blockage. While the          shape concepts (see C in Fig. 1). Given the set of
functional properties as well as the designated purpose        learned concepts, for an unknown object, concept re-
can both be identified as affordances, the proposed ap-        sponses are retrieved (see D in Fig. 1) and exploited
proach is built by hypothesizing that the functional           as machine-generated geometric object property val-
properties are building blocks upon which designated           ues in our tool-substitution scenario. Note that in our
purposes of tools rest.                                        tool-substitution scenario, concepts are learned from
                                                               unlabeled object instances of the Object Discovery
5     Methodology                                              Dataset (ODD) [17]; the ODD provides a variety of ob-
                                                               jects, from teddy bears and flashlights to shoes, which
5.1   Knowledge Acquisition
                                                               facilitates an expressive concept generation.
Our ultimate objective is to acquire machine-centric              Human Generated Properties: The geometric
data from which property-specific data can be ex-              properties alone offer a very limited view of the phys-
tracted. Such property data will then be used to gener-        icality as well as the functionality of an object. There-
ate grounded knowledge about objects. As a first step,         fore, to close the gap, we also considered non-
our initial property acquisition focuses on the compos-        geometrical properties such as weight, rigidity, hollow-
ite of a machine-centric and a human-centric method.           ness as physical and support, blockage, containment
In the machine-centric approach, geometrical prop-             as functional. Note that, in general, these proper-
erties are acquired using a non-invasive vision-based          ties are challenging and cumbersome to extract solely
technique while non-geometric properties are acquired          from non-invasive visuoperceptual approaches. Conse-
by sampling from the data from the expert generated            quently, extracting such properties via multi-modal or
intuitive model for the properties.                            manipulation capabilities is needed, but this is beyond
   Machine Generated Properties: In this paper,                the scope of this paper. In the generation process, a set
we introduce a state-of-the-art data-driven approach           of labeled prototype objects selected from the Wash-
that conceptualizes shape without supervision from             ington dataset (see Table 1) was taken into account.
commonalities within object point clouds; it is dis-           The distribution of each property for particular object
cussed in detail in our work [18]. As a result of the          labels (cf. Table 1) was approximated by an expert
process, a set of shape concepts is generated, whose           to reflect the range of variations in the values
concept responses for an unknown object are used in            of the property in general. Consequently, given an ob-
the knowledge base as machine-generated geometric              ject and its label, a sample value was drawn from the
object properties.                                             a-priori generated property distribution.
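The sampling step for human-generated properties, drawing a value from an expert-approximated distribution given an object label, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the source does not specify the form of the distributions, so a Gaussian clamped to a [0, 1] scale is assumed here, and the table entries, names and numbers are hypothetical.

```python
import random
from typing import Dict, Tuple

# Hypothetical expert-specified model: (object label, property) -> (mean, std).
# The clamped Gaussian on [0, 1] is an assumption made for this sketch.
EXPERT_MODEL: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("plate", "rigidity"): (0.80, 0.05),
    ("plate", "weight"): (0.30, 0.10),
    ("sponge", "rigidity"): (0.15, 0.05),
}


def sample_property(label: str, prop: str, rng: random.Random) -> float:
    """Draw one property value for an object with the given class label."""
    mean, std = EXPERT_MODEL[(label, prop)]
    value = rng.gauss(mean, std)
    return min(1.0, max(0.0, value))  # keep the value inside the assumed scale


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_property("plate", "rigidity", rng) for _ in range(5)]
print(samples)
```

Drawing several values per label gives each class a spread of measurements rather than a single point, which is what later allows intra-class variation to be modeled.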
   In our previous work on shape concept learning [18],
raw sensor information in form of point clouds is ab-          5.2   Knowledge about Objects
stracted to a symbolic level in which point cloud seg-
ments [17] may represent meaningful shape compo-               Knowledge about objects is spread across three levels:
nents in a symbolic space [16]. Therein we introduce           the first level consists of the data about the machine-
a hierarchical learning procedure that leads to sym-           generated as well as human-generated properties, the
bols which are gradually organized to reflect generic-         second level consists of qualitative knowledge about
to-specific facets of shape components and can be sub-         individual object instances, while the third level con-
sequently used as building blocks that constitute ob-          sists of the aggregated qualitative fuzzy knowledge
jects (see A in Fig. 1).                                       about respective classes of object instances. The fuzzy
   An object shape representation is introduced that           formalism is used to model the intra-class variations
gradually encodes observed objects symbol composi-             in the objects. In the following, we discuss the for-
tions (see B in Fig. 1): from local components to com-         mal description of the methodology deployed to create
ponent groups that may represent object parts or ob-           grounded knowledge about objects.
jects as a whole. The proposed shape representation               Consider O as a given set of object class labels
incorporates aspects of exemplar, respectively, proto-         where (by abuse of notation) each object class is iden-




tified with its label. Let each object class O ∈ O be           bigger are its physical qualities. The semantic terms
a given set of its instances. Let ⋃O be the union of all        given above are meant for the readers to understand
object classes such that |⋃O| = n. Let P and F be               the qualitative measures of the properties.
the given sets of physical properties’ labels and a set
of functional properties’ labels respectively. By abuse         Attribution - Object Instance Knowledge
of notation, each physical and functional property is
                                                                The attribution process generates knowledge about
identified with its label. For each physical property
                                                                each object instance by aggregating all the physi-
P ∈ P as well as for a functional property F ∈ F,
                                                                cal and functional qualities assigned to the object
sensory data is acquired from each object instance              instance by the sub-categorization step. In other
o ∈ ⋃O. Let Pn and Fn represent sets of n number of
                                                                terms, the knowledge about an instance consists of
extracted sensory values from n number of object in-
                                                                the physical as well as functional qualities reflected
stances for a physical property P ∈ P and a functional
                                                                in the instance. Let Pη and Fη be the families of
property F ∈ F respectively.
                                                                sets containing the physical quality labels Pη and the
                                                                functional quality labels Fη for each physical property
Sub-categorization - From Continuous to Discrete                P ∈ P and functional property F ∈ F respectively.
                                                                Thus, each object instance o ∈ ⋃O is represented as
                                                                a set of all the physical as well as functional qualities
The sub-categorization process is performed to form
                                                                attributed to it, which are expressed by a symbol holds
(more intuitive) qualitative measures to represent the
                                                                as: holds ⊂ ⋃O × (Pη ∪ Fη). For example, knowledge
degree with which a property is reflected by an ob-
                                                                about the instance plate1 of a plate class can be given
ject instance. It is the first step in creating symbolic
                                                                as holds(plate1, medium), holds(plate1, harder),
knowledge about object classes where the symbols rep-
                                                                holds(plate1, can support), where medium is a phys-
resenting the qualitative measures of a physical or a
                                                                ical quality of size property, harder is a physical
functional property reflected in an object instance are
                                                                quality of rigidity property and can support is a
generated unsupervisedly by a clustering mechanism.
                                                                functional quality of support property.
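The sub-categorization and attribution steps can be sketched together: continuous measurements of one property are clustered, each cluster becomes a quality label, and the resulting (instance, quality) pairs form the holds relation. This is a minimal sketch and not the authors' implementation. The source does not name the clustering algorithm, so a small 1-D k-means is used as a stand-in; all function names, the `size_1`-style label format and the measurement values are assumptions.

```python
from typing import Dict, List, Set, Tuple


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[int]:
    """Tiny 1-D k-means; returns one cluster index (0..k-1) per input value."""
    lo, hi = min(values), max(values)
    # Deterministic start: k centres spread evenly over the observed range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(v: float) -> int:
        return min(range(k), key=lambda c: abs(v - centres[c]))

    for _ in range(iters):
        assign = [nearest(v) for v in values]
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = sum(members) / len(members)
    return [nearest(v) for v in values]


def sub_categorize(prop: str, measurements: Dict[str, float], k: int) -> Dict[str, str]:
    """Map each instance to a quality label built from the property and cluster index."""
    names = list(measurements)
    clusters = kmeans_1d([measurements[n] for n in names], k)
    return {n: f"{prop}_{c + 1}" for n, c in zip(names, clusters)}


def attribute(per_property: Dict[str, Dict[str, str]]) -> Set[Tuple[str, str]]:
    """Collect the holds relation as a set of (instance, quality) pairs."""
    return {(inst, q) for labels in per_property.values() for inst, q in labels.items()}


# Hypothetical size measurements for four object instances.
size = {"cup1": 0.08, "plate1": 0.25, "plate2": 0.26, "tray1": 0.45}
labels = sub_categorize("size", size, k=3)
holds = attribute({"size": labels})
print(sorted(holds))
```

The number of clusters `k` plays the role of the granularity parameter η: a larger value yields finer quality labels per property.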
A qualitative measure of a physical property is referred
to as a physical quality and that of a functional prop-
erty as a functional quality.                                   Conceptualization - Knowledge about Objects
   In this process, Pn and Fn, representing measure-            The conceptualization process aggregates the knowl-
ments of a physical property P ∈ P and a functional             edge about all the instances of an object class. The
property F ∈ F respectively extracted from n number             aggregated knowledge is regarded as conceptual knowl-
of object instances, are categorized into a given num-          edge about an object class.
ber of discrete clusters η using a clustering algorithm.           Let OKB be a knowledge base about object classes
Let ∇P and ∇F be partitions of the sets Pn and Fn               where each object class O ∈ O. Given the knowledge
after performing clustering on them. Let Pη and Fη              about all the instances of an object class O, in the con-
be the sets of labels, expressing physical qualities and        ceptualization process, the knowledge about the object
functional qualities, generated for a physical property         class OK ∈ OKB is expressed as a set of tuples con-
P ∈ P and a functional property F ∈ F respectively.             sisting of a physical or a functional quality and its
Given the label for a property, the quality labels are          proportion (membership) value in the object class. A
generated by combining a property label P and a clus-           tuple is expressed as ⟨O, t, m⟩ where t ∈ Pη ∪ Fη and
ter label (created by the clustering algorithm). For            a proportion value m is calculated using the follow-
instance, the quality labels for a property size are            ing membership function: m = P(holds(o, t) | o ∈ O).
represented as {size 1, size 2, size 3, size 4}. At the         The proportion value allows modeling the intra-class
end of the sub-categorization process, the clusters are         variations in the objects.
mapped to the generated symbolic labels for qualita-               For example, knowledge about object class plate
tive measures.                                                  can be expressed as: {⟨plate, harder, 0.6⟩, ⟨plate,
   Note that the number of clusters essentially de-             light weight, 0.75⟩, ⟨plate, less hollow, 0.67⟩, ⟨plate,
scribes the granularity with which each property can            hollow, 0.33⟩, ⟨plate, more support, 0.71⟩}, where the
qualitatively be represented. A higher number of                numbers indicate that, for instance, physical quality
clusters means that an object is described in finer             harder was observed in 60% of instances of object class
detail which may obstruct the selection of a substitute         plate. At the end of the conceptualization process,
since it may not be possible to find a substitute which         conceptual knowledge about an object class is cre-
is similar to a missing tool down to the finer details.         ated which is represented in a symbolic fuzzy form
For example, in size = {small, medium, big, bigger},            and grounded into the human-generated or machine-
size is a physical property and small, medium, big,             generated data about the properties of objects. The




knowledge about objects is then used to determine a substitute from the existing objects in the environment.

Figure 2: A typical process flow to determine a substitute for a missing tool from the available objects.

Conceptualization - Knowledge about Functional Properties
In addition to conceptual knowledge about objects, the conceptualization process also creates knowledge about each functional quality, termed a function model, by associating the occurrence of physical qualities in an object instance with the occurrence of a functional quality in the same instance and aggregating the result of such co-occurrences. The role of a function model is discussed later in Section 5.3. Given the knowledge about the object instances, a function model $f_d$ of a functional quality $f \in F_\eta$ is expressed as a set of tuples, each containing a functional quality $f \in F_\eta$, a physical quality $p \in P_\eta$ and a proportion value $d$. A tuple is represented as $\langle f, p, d \rangle$, where the proportion value $d$ is computed as $d = P(\mathit{holds}(o, p) \mid \mathit{holds}(o, f))$. For example, a function model for the functional quality more support is given as {$\langle \text{more support}, \text{harder}, 0.8 \rangle$, $\langle \text{more support}, \text{softer}, 0.2 \rangle$}, where the numbers indicate that, for instance, the functional quality more support and the physical quality harder co-occurred in the knowledge about the object instances 80% of the time.

5.3   Reasoner

Fig. 2 illustrates a process flow consisting of the primary operations involved in determining a substitute. The flow offers an approximate high-level view of the prototypical model of ERSATZ. When ERSATZ is queried to find a substitute for a missing tool $x$ from the set of available objects $Y$, the system checks whether the substitution model for $x$ exists in the knowledge base. If the substitution model does not exist, the reasoner computes the relevant functional and physical properties of the queried tool.

Representative Models
A representative physical model and a representative functional model of an object class consist of the physical or functional qualities, respectively, that are regarded as representative qualities of the object class, while the qualities which do not fall under representative qualities are regarded as exceptional or uncommon qualities.
   Let $O \in \mathcal{O}$ be an object class of a missing tool and let $\theta$ be a representative model threshold which qualifies a physical or a functional quality as stereotypical or representative of the object class $O$. $O_{rp}$ is called a representative physical model of an object class $O$ such that $O_{rp} = \{p : \mathit{implies}(O, p) \geq \theta,\ p \in P_\eta\}$, and $O_{rf}$ is called a representative functional model of an object class $O$ such that $O_{rf} = \{f : \mathit{implies}(O, f) \geq \theta,\ f \in F_\eta\}$. Similarly, let $f_d$ be a function model of a functional quality $f$; then $f_{rp}$ is called a representative physical model of the functional quality $f$ such that $f_{rp} = \{p : \mathit{implies}(f, p) \geq \theta,\ p \in P_\eta\}$.

Relevant Qualities
Due to the abstract nature of an image schema, and by extension of the corresponding functional property, a functional property can subsume various purposes of objects. For example, the functional property support can subsume the purposes place on, sit on and serve on of a table, a chair and a tray respectively. It is suggested in [6] that a certain assemblage of physical properties is an essential prerequisite to enable a functional property. Thus, it can be assumed that knowing the relevance of a functional property can help identify the relevant physical properties of different objects which are used for different purposes.
   The relevance of a representative functional quality is decided by examining whether the physical characterization of the function model of the representative functional quality of a tool is in close proximity to the physical characterization of the representative physical model of the tool. The proximity between a functional quality and the object class of the tool is determined using the Jaccard Index. The Jaccard Index measures the similarity between two sets $A$ and $B$ by dividing the magnitude of the intersection of $A$ and $B$ by the magnitude of the union of $A$ and $B$.
   Let $O_{rp}$ and $f_{rp}$ be the representative physical models of an object class $O$ of the missing tool and of a function model $f_d$ of a representative functional quality $f \in F_\eta$ of the object class $O$ respectively. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. Then, the Jaccard Index of $O_{rp}$ and $f_{rp}$ is computed as: $J(O_{rp}, f_{rp}) = \frac{|O_{rp} \cap f_{rp}|}{|O_{rp} \cup f_{rp}|}$. A representative functional quality $f$ of an object class $O$ is regarded as relevant if $J(O_{rp}, f_{rp}) > \phi$. Let $O_{F'}$ be the set of all relevant functional qualities of an object class $O$. Let $f_{rp}$ be a representative physical model of a function model $f_d$ of a relevant functional quality $f \in O_{F'}$. Let $O_{rp}$ be a representative physical model of $O$. Then, the relevant physical qualities of an object class $O$ are expressed by the set $O_{P'} = (O_{rp} \cap f_{rp})$.
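The definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ERSATZ implementation: the toy instances and quality labels are hypothetical, $\mathit{implies}(\cdot, \cdot)$ is read as the corresponding proportion value ($m$ or $d$), and, where an object class has several relevant functional qualities, the relevant physical qualities are taken as the union of the per-quality intersections.

```python
from collections import defaultdict

# Hypothetical toy knowledge: (object class, qualities observed in one instance).
INSTANCES = [
    ("plate", {"harder", "light_weight", "more_support"}),
    ("plate", {"harder", "light_weight", "more_support"}),
    ("plate", {"softer", "light_weight", "less_support"}),
    ("tray", {"harder", "heavy", "more_support"}),
]
PHYSICAL = {"harder", "softer", "light_weight", "heavy"}
FUNCTIONAL = {"more_support", "less_support"}


def membership(instances, object_class):
    """m = P(holds(o, t) | o in O) for every quality t observed in class O."""
    members = [qualities for cls, qualities in instances if cls == object_class]
    if not members:
        return {}
    counts = defaultdict(int)
    for qualities in members:
        for t in qualities:
            counts[t] += 1
    return {t: n / len(members) for t, n in counts.items()}


def function_model(instances, f):
    """d = P(holds(o, p) | holds(o, f)) for every physical quality p."""
    holders = [qualities for _, qualities in instances if f in qualities]
    if not holders:
        return {}
    counts = defaultdict(int)
    for qualities in holders:
        for p in qualities & PHYSICAL:
            counts[p] += 1
    return {p: n / len(holders) for p, n in counts.items()}


def representative(model, universe, theta):
    """Qualities from `universe` whose proportion value reaches theta."""
    return {t for t, value in model.items() if t in universe and value >= theta}


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevant_qualities(instances, object_class, theta, phi):
    """Return (relevant functional qualities, relevant physical qualities)."""
    knowledge = membership(instances, object_class)
    o_rp = representative(knowledge, PHYSICAL, theta)
    o_rf = representative(knowledge, FUNCTIONAL, theta)
    relevant_f, relevant_p = set(), set()
    for f in o_rf:
        f_rp = representative(function_model(instances, f), PHYSICAL, theta)
        if jaccard(o_rp, f_rp) > phi:
            relevant_f.add(f)
            relevant_p |= o_rp & f_rp  # assumption: union over relevant f
    return relevant_f, relevant_p
```

With both thresholds set to 0.35, the value reported in the evaluation, `relevant_qualities(INSTANCES, "plate", 0.35, 0.35)` keeps more support as the relevant functional quality of this toy plate class and harder and light weight as its relevant physical qualities.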
Table 1: Number of scans (#) per category (Σ# = 692) of the Washington RGBD dataset [13].

Category label   Inst.   Scans per Inst.   #
ball             1-7     5                 35
binder           1-3     10                30
bowl             1-6     5                 30
cap              1-4     8                 32
cereal box       1-5     6                 30
coffee mug       1-8     4                 32
flashlight       1-5     6                 30
food bag         1-8     4                 32
food box         1-12    3                 36
food can         1-14    2                 28
food cup         1-5     6                 30
food jar         1-6     5                 30
hand towel       1-5     6                 30
keyboard         1-5     6                 30
kleenex          1-5     6                 30
notebook         1-5     6                 30
pitcher          1-3     10                30
plate            1-7     5                 35
shampoo          1-6     5                 30
soda can         1-6     5                 30
sponge           1-12    3                 36
water bottle     1-9     4                 36

Reasoning about a Substitute
Let $O^\mu \in \mathcal{O}$ be an object class of a missing tool and let $O^\beta \in \mathcal{O}$ be an object class of a possible candidate for a substitute. Let $O^\mu_{P'}$ be a set of relevant physical qualities of $O^\mu$ and let $O^\beta_{rp}$ be a representative physical model of $O^\beta$. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. The substitutability of a candidate is determined by measuring the similarity between $O^\mu_{P'}$ and $O^\beta_{rp}$ using Jaccard's Index. $O^\beta$ is termed as a substitute, expressed as $O^{\beta+}$, if $J(O^\mu_{P'}, O^\beta_{rp}) > \phi$, else it is regarded as not a substitute and expressed as $O^{\beta-}$. Given the set of relevant physical qualities $O^\mu_{P'}$, the set of relevant functional qualities $O^\mu_{F'}$ and a positive substitute $O^{\beta+}$, and a negative substitute $O^{\beta-}$, a substitution model of $O^\mu$ is expressed as a tuple: $\langle O^\mu_{P'}, O^\mu_{F'}, O^{\beta+}, O^{\beta-} \rangle$. The knowledge about object $O^\mu \in \mathcal{O}$ is then extended in $O_{KB}$ to accommodate its substitution model.
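As a rough sketch of this reasoning step, the snippet below sorts candidates into substitutes and non-substitutes by thresholding the Jaccard Index and collects the result in a substitution-model tuple. The class name `SubstitutionModel`, the function `build_substitution_model` and all quality sets in the example query are hypothetical and chosen only for illustration; they are not taken from ERSATZ.

```python
from typing import Dict, NamedTuple, Set


class SubstitutionModel(NamedTuple):
    """Tuple <O_P', O_F', O^{beta+}, O^{beta-}> for one missing tool."""
    relevant_physical: Set[str]
    relevant_functional: Set[str]
    substitutes: Dict[str, float]      # candidate class -> Jaccard score
    non_substitutes: Dict[str, float]  # candidate class -> Jaccard score


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_substitution_model(relevant_physical, relevant_functional,
                             candidates, phi):
    """Classify each candidate by comparing J(O_P', O^beta_rp) with phi.

    `candidates` maps a candidate class name to its representative
    physical model (a set of physical quality labels).
    """
    substitutes, non_substitutes = {}, {}
    for name, rep_physical in candidates.items():
        score = jaccard(relevant_physical, rep_physical)
        target = substitutes if score > phi else non_substitutes
        target[name] = score
    return SubstitutionModel(set(relevant_physical), set(relevant_functional),
                             substitutes, non_substitutes)


# Hypothetical query: a tray is missing and three objects are available.
TRAY_RELEVANT_PHYSICAL = {"harder", "flat", "light_weight"}
TRAY_RELEVANT_FUNCTIONAL = {"more_support"}
CANDIDATES = {
    "plate": {"harder", "flat", "light_weight", "less_hollow"},
    "sponge": {"softer", "light_weight"},
    "bowl": {"harder", "hollow"},
}

model = build_substitution_model(TRAY_RELEVANT_PHYSICAL,
                                 TRAY_RELEVANT_FUNCTIONAL,
                                 CANDIDATES, phi=0.35)
```

In this toy query the plate shares three of four qualities with the tray (score 0.75) and is accepted, while the sponge and the bowl each share one of four (score 0.25) and fall below the tolerance of 0.35.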

6    Experimental Evaluation
The objective of the experimental evaluation of ERSATZ is to validate the suitability of the substitutes computed by ERSATZ by comparing the results with those of human experts. For the experimental evaluation, we used the images from the Washington Dataset [13] to generate human-based and machine-based properties. A total of 22 object categories were selected and, for each category, we selected random images from all the given instances of the category, leading to a total of 692 images. Table 1 lists the number of images selected from each category. For the experiment, we generated 22 queries based on the 22 object categories. Each query consisted of a missing tool and 5 randomly selected objects from which a substitute was to be selected. We gave the 22 queries to 14 human experts and asked them to select a substitute in each query. The distribution of the human selections for each scenario is illustrated in Fig. 3(a). Similarly, the queries were run on ERSATZ with the following (heuristically determined) optimal values of the target parameters: i) number of machine-generated properties set to 4 (Sec. 5.1), ii) number of clusters set to 4 (Sec. 5.2), iii) representative threshold (Sec. 5.3) and Minimum Similarity Tolerance (Sec. 5.3) set to 0.35.
   The results of both experiments were plotted as heat maps in which the y-axis shows the missing tools and the x-axis shows the available objects, as illustrated in Fig. 3. The grayed cells mean that the corresponding object categories were not available in the respective query. The marked cells represent substitutes selected by experts and ERSATZ. Out of 22 scenarios, ERSATZ and the experts identified the same substitutes in 20 scenarios (91%).

Figure 3: Substitution results w.r.t. human expert selection distribution and ERSATZ similarity responses: (a) human expert selection distributions; (b) ERSATZ selections with similarity to the missing tool. Both panels are heat maps with the missing tools on the y-axis and the substitutes on the x-axis; the color scale (distribution / similarity) ranges from 0.0 to 1.0. Note that gray cells correspond to object categories which are not available in the respective query, and marked cells represent substitutes selected by experts and ERSATZ.

7   Future Work
The paper presents a prototypical system to determine a substitute for a missing tool using the grounded knowledge about objects. The approach has drawn inspiration from symbol grounding, the theory of affordances and the theory of image schemas to represent the grounded knowledge and to determine a substitute. This is ongoing research with a focus on the




following aspects.
   Our immediate goal focuses on the fuzzification of the clustering method and the reasoning method to combat the migration of the data points within clusters. Moreover, we have derived three functional properties, namely, contain, support and block, from the image schemas Containment, Support and Blockage respectively. However, further investigation is needed to formalize the identification of additional functional properties to be derived from the existing image schema. For robot-centric property acquisition, we are currently developing a framework that allows a robot to extract properties of individual objects and build a knowledge base in a bottom-up manner such that the knowledge about properties of objects is constructed on the basis of what is sensed (see Fig. 4). We have proposed the preliminary framework in [11].

Figure 4: Multi-layered dataset to build a robot-centric grounded knowledge about objects. Layers from top to bottom: Fuzzy Conceptual Knowledge - Object Classes; Bivariate Joint Frequency Distributions; Fuzzy Conceptual Knowledge - Object Instances; Clustering Method; Functional Property Data - Object Instances; Aggregation; Physical Property Data - Object Instances; Property Extraction Methods; Sensory Data - Object Instances.

References

 [5] C. Baber. Introduction. In Cognition and Tool Use, chapter 1, pages 1–15. Taylor and Francis, 2003.
 [6] C. Baber. Working With Tools. In Cognition and Tool Use, pages 51–68. 2003.
 [7] A. Boteanu, A. St. Clair, A. Mohseni-Kabir, C. Saldanha, and S. Chernova. Leveraging Large-Scale Semantic Networks for Adaptive Robot Task Learning and Execution. Big Data, 4(4):217–235, 2016.
 [8] M. Daoutis, S. Coradeschi, and A. Loutfi. Grounding commonsense knowledge in intelligent systems. Journal of Ambient Intelligence and Smart Environments, 1(4):311–321, 2009.
 [9] P. Gärdenfors. Cognitive semantics and image schemas with embodied forces. Image (Rochester, N.Y.), pages 1–16, 1987.
[10] R. Gupta and M. J. Kochenderfer. Common Sense Data Acquisition for Indoor Mobile Robots. In Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, pages 605–610, San Jose, California, USA, 2004.
[11] G. Jäger, C. A. Mueller, M. Thosar, S. Zug, and A. Birk. Towards Robot-Centric Conceptual Knowledge Acquisition. In Robots that learn and reason in IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, 2018.
[12] W. Kuhn. An Image-Schematic Account of Spatial Categories. Spatial Information Theory, pages 152–168, 2007.
[13] K. Lai, L. Bo, X. Ren, and D. Fox. A Large-Scale Hierarchical Multi-View RGB-D Object Dataset. In IEEE International Conference on Robotics and Automation (ICRA), pages 1817–1824, Shanghai, China, 2011.
[14] S. Lemaignan, R. Ros, L. Mösenlechner, R. Alami, and M. Beetz. ORO, a knowledge management platform for cognitive architectures in robotics. IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings, (April):3548–3553, 2010.
 [1] P. Abelha, F. Guerin, and M. Schoeler. A model-based approach to finding substitute tools in 3D vision data. Proceedings - IEEE International Conference on Robotics and Automation, 2016-June, 2016.
 [2] A. Agostini, M. J. Aein, S. Szedmak, E. E. Aksoy, J. Piater, and F. Wörgötter. Using structural bootstrapping for object substitution in robotic executions of human-like manipulation tasks. IEEE International Conference on Intelligent Robots and Systems, 2015-Decem:6479–6486, 2015.
 [3] I. Awaad, G. K. Kraetzschmar, and J. Hertzberg. Challenges in Finding Ways to Get the Job
     Done. Planning and Robotics (PlanRob) Work-
                                                                                      [15] G. H. Lim, I. H. Suh, and H. Suh. Ontology-
     shop at 24th International Conference on Auto-
                                                                                           based unified robot knowledge for service robots
     mated Planning and Scheduling, 2014.
                                                                                           in indoor environments. IEEE Transactions on
 [4] C. Baber. Cognition and Tool Use. Taylor and                                          Systems, Man, and Cybernetics Part A:Systems
     Francis, 2003.                                                                        and Humans, 41(3):492–509, 2011.




[16] C. Mueller, K. Pathak, and A. Birk. Object shape categorization in
     RGBD images using hierarchical graph constellation models based on
     unsupervisedly learned shape parts described by a set of shape
     specificity levels. In International Conference on Intelligent Robots
     and Systems, 2014.

[17] C. A. Mueller and A. Birk. Hierarchical Graph-Based Discovery of
     Non-Primitive-Shaped Objects in Unstructured Environments. In
     International Conference on Robotics and Automation, 2016.

[18] C. A. Mueller and A. Birk. Conceptualization of Object Compositions
     Using Persistent Homology. In IEEE/RSJ International Conference on
     Intelligent Robots and Systems (IROS), Madrid, 2018.

[19] L. A. Pineda, A. Rodríguez, G. Fuentes, C. Rascón, and I. Meza. A light
     non-monotonic knowledge-base for service robots. Intelligent Service
     Robotics, 10(3):159–171, 2017.

[20] A. Saxena, A. Jain, O. Sener, A. Jami, D. K. Misra, and H. S. Koppula.
     RoboBrain: Large-Scale Knowledge Engine for Robots. arXiv, pages 1–11,
     2014.

[21] I. H. Suh, G. H. Lim, W. Hwang, H. Suh, J. H. Choi, and Y. T. Park.
     Ontology-based multi-layered robot knowledge framework (OMRKF) for
     robot intelligence. IEEE International Conference on Intelligent
     Robots and Systems, (October):429–436, 2007.

[22] S. Szedmak, E. Ugur, and J. Piater. Knowledge Propagation and Relation
     Learning for Predicting Action Effects. In IEEE/RSJ International
     Conference on Intelligent Robots and Systems, Chicago, 9 2014.

[23] M. Tenorth and M. Beetz. KNOWROB - Knowledge Processing for Autonomous
     Personal Robots. In IEEE/RSJ International Conference on Intelligent
     Robots and Systems, pages 4261–4266, 2009.

[24] M. Thosar, S. Zug, A. M. Skaria, and A. Jain. A Review of Knowledge
     Bases for Service Robots in Household Environments. In 6th
     International Workshop on Artificial Intelligence and Cognition, 2018.

[25] Y. Zhu, A. Fathi, and L. Fei-Fei. Reasoning About Object Affordance in
     a Knowledge Based Representation. European Conference on Computer
     Vision, (3):408–424, 2014.
