=Paper=
{{Paper
|id=Vol-2325/paper-05
|storemode=property
|title=What Stands-in for a Missing Tool?: A Prototypical Grounded Knowledge-based Approach
to Tool Substitution
|pdfUrl=https://ceur-ws.org/Vol-2325/paper-05.pdf
|volume=Vol-2325
|authors=Madhura Thosar,Christian A. Mueller,Sebastian Zug
|dblpUrl=https://dblp.org/rec/conf/kr/ThosarMZ18
}}
==What Stands-in for a Missing Tool?: A Prototypical Grounded Knowledge-based Approach to Tool Substitution==
Madhura Thosar (1), Christian A. Mueller (2), Sebastian Zug (1)
(1) Faculty of Computer Science, Otto von Guericke University Magdeburg, Germany
(2) Robotics Group, Computer Science & Electrical Engineering Department, Jacobs University Bremen, Germany
thosar@iks.cs.ovgu.de, zug@ivs.cs.uni-magdeburg.de, chr.mueller@jacobs-university.de
Abstract

When a robot is operating in a dynamic environment, it cannot be assumed that a tool required to solve a given task will always be available. In case of a missing tool, an ideal response would be to find a substitute to complete the task. In this paper, we present a proof of concept of a grounded knowledge-based approach to tool substitution. In order to validate the suitability of a substitute, we conducted experiments involving 22 substitution scenarios. The substitutes computed by the proposed approach were validated on the basis of the experts’ choices for each scenario. Our evaluation showed that, in 20 out of 22 scenarios (91%), the approach identified the same substitutes as the experts.

Copyright © by the paper’s authors. Copying permitted for private and academic purposes.
In: G. Steinbauer, A. Ferrein (eds.): Proceedings of the 11th International Workshop on Cognitive Robotics, Tempe, AZ, USA, 27-Oct-2018, published at http://ceur-ws.org

1 Introduction

The sophistication pertaining to tool-use in humans involves not just the dexterity in manipulating a tool, but also the diversity in tool exploitation. The ability to exploit the tools has enabled humans to adapt and thus exert control over an uncertain environment, especially when they are faced with unfavorable situations. For example, if we don’t find a hammer to hammer a nail into a wall, we will use a heel of a shoe or a rock, or if a tray is unavailable for serving the drinks, we will use a plate for serving. In situations like these, humans seem to know - either from the past experience or from observations or from the “necessity is the mother of improvisation (invention)” type approach - what kind of object is needed as a substitute.

On the contrary, consider a robot performing a task that involves tool use. When a robot is operating in a dynamic environment, it cannot be assumed that a tool required in the task will always be available. In situations like these, an effective way for a robot would be to find an alternative as humans do, for example, use an eating plate for serving, rather than wait until a tray becomes available. This skill is significant when operating in a dynamic, uncertain environment because it allows a robot to adapt to unforeseen situations to a degree. The question is how a robot can determine which object in the environment is a viable candidate for a substitute. A possible approach would be to interact with an object in the manner in which the missing tool is maneuvered. However, it would be time consuming if a robot interacted with every single object in the environment to determine its viability, which makes this approach less practical.

In this prototypical work, we propose a non-invasive approach that identifies viable candidate/s from the existing objects in the environment. This paper makes the following contributions: 1) An approach to create grounded knowledge about objects expressed in terms of their properties (Sec. 5.2), 2) an approach to identify
relevant properties of a missing tool and determine a substitute on the basis of them (Sec. 5.3).

2 Related Work

Typically, a substitute for a missing tool is determined by means of a knowledge base that provides knowledge about objects and similarity measures to determine the similarity between a missing tool and a potential substitute. In the following, in addition to the approaches to determine a substitute, we also report the literature related to existing knowledge bases developed for robotic applications.

Knowledge Base

We reviewed in [24] nine existing knowledge bases, namely: KNOWROB [23], MLN-KB [25], NMKB [19], OMICS [10], OMRKF [21], ORO [14], OUR-K [15], PEIS-KB [8], and RoboBrain [20]. The objective was to determine whether these existing knowledge bases 1) contain ontological knowledge about the properties of objects, 2) ground such knowledge in the robot’s perception, and 3) model intra-class variability in a property instead of expressing the property in a binary form. We gained primarily the following insights which form the basis for our work.

We noted that the majority of the knowledge bases relied on external human-centric commonsense knowledge bases such as WordNet, Cyc and OpenCyc, and some relied either on hand-coded knowledge or on knowledge acquired by human-robot interaction. The main issue, we believe, is that the depth and breadth of a human-centric knowledge base is not observable by a robot in its entirety due to its limited sensing capabilities. This causes a disconnect between human-centric knowledge and robot-centric perception. To deflect this issue, we aim to acquire the robot-centric perceptual data for different properties of objects. Such property data can then be used to generate grounded knowledge about objects (see Sec. 5.1).

Substitution Computation

One of the closest areas that study the usability of an object is affordances of tools, where the primary focus is to examine various functional abilities of an object by exploring what actions can be performed on the object and observing its responses. As such, using a substitute in place of a missing tool can also be seen as transferring an affordance of the missing tool to the substitute after determining the similarity between them. In [3], a substitute for a missing tool is inferred on the basis of inheritance and equivalence relations. The work discussed in [2] retrieves the knowledge about objects from the ROAR [22] relational database and determines a substitute that shares similar affordances. However, in ROAR, the knowledge is either acquired using machine learning techniques requiring training examples, or inferred, or hand-coded. The work in [7] uses ConceptNet, where potential candidates are extracted from the knowledge base if they share the same parent with a missing tool for the predetermined relations: has-property, capable-of and used-for. After eliminating irrelevant candidates, a substitute is determined on the basis of the similarity metrics. The approach proposed in [1] uses a part-based 3D model and the weight of an object to determine the orientation and manipulation of a substitute to be used as a missing tool. In the cases where a supervised machine learning technique is used, providing a bulk of labeled examples beforehand would not be realistic for a substitution problem scenario. On the other hand, the approaches which rely on existing external knowledge bases are built around the knowledge available in those knowledge bases, which does impose some constraints. We circumvent this issue by first identifying what knowledge is generally required to determine a substitute and then building an approach to acquire the required knowledge and compute a substitute on the basis of it.

3 Challenges

How to characterize similarity between a missing tool and a potential substitute: A candidate for a substitute is expected to be similar to a missing tool to some degree to ensure substitutability. The notion of similarity can be understood in various forms, for instance, a distance between two objects denoted by two points in a multi-dimensional space, or two objects belonging to the same cluster, or aspects of the objects that are identified as shared. In this work, the question will be addressed in a broader sense: it is not merely about identifying a similar object by deploying some similarity measure; instead, it is about gaining access to what aspects of the objects were found to be shared between the similar objects.

What kind of knowledge is required to determine the similarity: It has been demonstrated in the literature on tool use in humans and animals alike that in order to use an object in tasks one needs to have knowledge about objects [4]. Baber in [5] also noted that conceptual knowledge about objects is especially desired in tool use where a systematic deliberation is called for. For a robot, the story won’t be much different if it is expected to perform in the real world alongside humans. As a consequence, a robot needs conceptual knowledge about an object where the object will not be only a physical entity that is merely to be perceived,
but also a concept which consists of distinct characteristics and relations which set objects apart from each other and also make them similar to each other.

How to acquire the necessary knowledge: The acquisition of such conceptual knowledge is not without challenge. From a robot standpoint, it is a trade-off between what needs to be known and what can be known. The trade-off is a direct consequence of the limited perception capabilities of a robot which often leads to partial understanding of the environment. While deploying a multi-modal perception to extract the required knowledge about objects would be an ideal solution, it carries its own set of complexities such as noisy sensors, dynamicity of the environment, and complexities of the composition of an object. For this prototypical work, the necessary knowledge is acquired using human-centric as well as machine-centric methods.

How to maneuver a substitute as a missing tool: Once the substitute has been identified, a robot is expected to use it in place of a missing tool and achieve the same result as the missing tool in the task. The challenge of estimating the maneuver as well as the grasping of a substitute is twofold: to determine whether the maneuver and grasping knowledge of a missing tool can be transferred and utilized on a substitute, or else to estimate the maneuver and grasping for a substitute such that it can be used as a missing tool in the task.

For this work, we have focused on the first three challenges and have developed a prototypical system called ERSATZ (German word for a substitute or alternative) where the focus is to identify the required knowledge to determine a substitute and develop a system that computes a substitute for a missing tool.

4 Approach

The proposed approach distinguishes a tool from a substitute where a tool is defined as an artifact that is designed, manufactured and maneuvered in accordance with its designated purpose in the tasks such as hammer for hammering, tray for serving etc., while a substitute is seen as an extension of a missing tool. Within the context of a designated purpose, the relationship between a tool and a substitute is symmetric, for instance, for hammering, a hammer can be replaced by a heeled shoe and vice versa. However, it may not always be the case once you step outside the context, for instance, a hammer cannot replace a heeled shoe. Our research work, therefore, focuses on searching for a substitute for a conventional tool required in the ongoing task as opposed to determining a substitute for itself.

Consider a scenario in which a robot has to choose between a plate and a mouse pad as an alternative for a tray. A tray can be defined as a rigid, rectangular, flat, wooden, brown colored object while a plate can be defined as a rigid, circular, semi-flat, white colored object and a mouse pad as a soft, rectangular, flat, leather-based object. Bear in mind, however, that some properties are more relevant than others with respect to the designated purpose of the tool. For a tray whose designated purpose is to carry, rigid and flat are more relevant to carrying than the material or the color of the tray. Consequently, to find the most appropriate substitute, the relevant properties of the unavailable tool need to correspond to as large a degree as possible to the properties of the possible choices for a substitute.

The proposed approach performs conceptual knowledge-driven computation to identify the relevant properties of the missing tool and determines the most similar substitute on the basis of those properties. Besides identifying the most similar object as a substitute for a missing tool, the proposed approach grants explicit access to the relevant properties of the missing tool, which carries twofold advantages: firstly, knowing which properties are primarily required in the potential substitute narrows down the search space and secondly, in case of an unknown object instance, only the relevant properties will have to be learned to determine a substitute.

The conceptual knowledge considered in this work primarily involves properties of the objects. The properties considered are divided into physical and functional properties where physical properties describe the physicality of the objects such as rigidity, weight, hollowness while the functional properties ascribe the (functional) abilities or affordances to the objects such as containment, blockage, support. The functional properties in the proposed approach play a primary role in identifying the relevant properties of a missing tool (see Sec. 5.3).

The functional properties considered in this work are derived from the theory of image schemas [9] which has its roots in cognitive linguistics. According to [12], image schemas are patterns abstracted from spatio-temporal experiences. Essentially, image schemas capture recurrent patterns that emerge from our perceptual and bodily interactions with the environment. Since some of these patterns are posited on the operational abilities of objects, Kuhn postulated in [12] that affordances a.k.a. functional properties [9] for the spatio-temporal processes can be derived from image schemas. For example, the containment schema suggests an object’s ability to contain something, the support schema indicates an object’s ability to hold up something, and the blockage schema refers to the ability of an object to block or obstruct the movement of another object. Currently, the proposed system is restricted to three functional properties based on image
schemas: containment, support and blockage. While the functional properties as well as the designated purpose can both be identified as affordances, the proposed approach is built by hypothesizing that the functional properties are building blocks upon which designated purposes of tools rest.

5 Methodology

5.1 Knowledge Acquisition

Our ultimate objective is to acquire machine-centric data from which property-specific data can be extracted. Such property data will then be used to generate grounded knowledge about objects. As a first step, our initial property acquisition focuses on the composite of a machine-centric and a human-centric method. In the machine-centric approach, geometrical properties are acquired using a non-invasive vision-based technique, while non-geometric properties are acquired by sampling from an expert-generated intuitive model of the properties.

Machine Generated Properties: In this paper, we introduce a state-of-the-art data-driven approach that conceptualizes shape in an unsupervised manner according to commonalities within object point clouds, which is discussed in detail in our work [18]. As a result of the process, a set of shape concepts is generated whose concept responses for an unknown object are used in the knowledge base as machine-generated geometric object properties.

In our previous work on shape concept learning [18], raw sensor information in the form of point clouds is abstracted to a symbolic level in which point cloud segments [17] may represent meaningful shape components in a symbolic space [16]. Therein we introduce a hierarchical learning procedure that leads to symbols which are gradually organized to reflect generic-to-specific facets of shape components and can be subsequently used as building blocks that constitute objects (see A in Fig. 1).

Figure 1: Illustration of the object shape conceptualization approach [18]. Concepts are randomly colored.

An object shape representation is introduced that gradually encodes observed objects' symbol compositions (see B in Fig. 1): from local components to component groups that may represent object parts or objects as a whole. The proposed shape representation incorporates aspects of exemplar and prototype theory, since we believe that the richness of a prototype provides an unaltered perspective on the characteristics of object instances. Based on the proposed symbolic shape representation we analyze topology and structure within the encoded symbol compositions in order to discover persistent patterns that may represent shape concepts.

We introduce an iterative filtering process [18] to associate instances with groups which may represent shape concepts (see C in Fig. 1). Given the set of learned concepts, for an unknown object, concept responses are retrieved (see D in Fig. 1) and exploited as machine-generated geometric object property values in our tool-substitution scenario. Note that in our tool-substitution scenario, concepts are learned from unlabeled object instances of the Object Discovery Dataset (ODD) [17]; the ODD provides a variety of objects, from teddy bears and flashlights to shoes, which facilitates an expressive concept generation.

Human Generated Properties: The geometric properties alone offer a very limited scope of the physicality as well as the functionality of an object. Therefore, to compensate for the gap, we also considered non-geometrical properties such as weight, rigidity and hollowness as physical and support, blockage and containment as functional. Note that, in general, these properties are challenging and cumbersome to extract solely from non-invasive visuoperceptual approaches. Consequently, extracting such properties via multi-modal or manipulation capabilities is needed, but this is beyond the scope of this paper. In the generation process, a set of labeled prototype objects selected from the Washington dataset (see Table 1) was taken into account. The distribution of each property for particular object labels (cf. Table 1) was approximated by an expert to resemble the scope for the variations in the values of the property in general. Consequently, given an object and its label, a sample value was drawn from the a-priori generated property distribution.

5.2 Knowledge about Objects

Knowledge about objects is spread across three levels: the first level consists of the data about the machine-generated as well as human-generated properties, the second level consists of qualitative knowledge about individual object instances, while the third level consists of the aggregated qualitative fuzzy knowledge about respective classes of object instances. The fuzzy formalism is used to model the intra-class variations in the objects. In the following, we discuss the formal description of the methodology deployed to create grounded knowledge about objects.

Consider $\mathcal{O}$ as a given set of object class labels where (by abuse of notation) each object class is identified with its label. Let each object class $O \in \mathcal{O}$ be a given set of its instances. Let $\bigcup \mathcal{O}$ be a union of all object classes such that $|\bigcup \mathcal{O}| = n$. Let $\mathcal{P}$ and $\mathcal{F}$ be the given sets of physical properties' labels and functional properties' labels respectively. By abuse of notation, each physical and functional property is identified with its label. For each physical property $P \in \mathcal{P}$ as well as for each functional property $F \in \mathcal{F}$, sensory data is acquired from each object instance $o \in \bigcup \mathcal{O}$. Let $P_n$ and $F_n$ represent sets of $n$ sensory values extracted from the $n$ object instances for a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively.

Sub-categorization - From Continuous to Discrete

The sub-categorization process is performed to form (more intuitive) qualitative measures to represent the degree with which a property is reflected by an object instance. It is the first step in creating symbolic knowledge about object classes where the symbols representing the qualitative measures of a physical or a functional property reflected in an object instance are generated in an unsupervised manner by a clustering mechanism. A qualitative measure of a physical property is referred to as a physical quality and that of a functional property as a functional quality.

In this process, $P_n$ and $F_n$, representing measurements of a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively extracted from $n$ object instances, are categorized into a given number of discrete clusters $\eta$ using a clustering algorithm. Let $\nabla P$ and $\nabla F$ be partitions of the sets $P_n$ and $F_n$ after performing clustering on them. Let $P_\eta$ and $F_\eta$ be the sets of labels, expressing physical qualities and functional qualities, generated for a physical property $P \in \mathcal{P}$ and a functional property $F \in \mathcal{F}$ respectively. Given the label for a property, the quality labels are generated by combining a property label $P$ and a cluster label (created by the clustering algorithm). For instance, the quality labels for a property size are represented as {size 1, size 2, size 3, size 4}. At the end of the sub-categorization process, the clusters are mapped to the generated symbolic labels for qualitative measures.

Note that the number of clusters essentially describes the granularity with which each property can qualitatively be represented. A higher number of clusters suggests that an object is described in finer detail, which may obstruct the selection of a substitute since it may not be possible to find a substitute which is similar to a missing tool down to the finer details. For example, in size = {small, medium, big, bigger}, size is a physical property and small, medium, big, bigger are its physical qualities. The semantic terms given above are meant for the readers to understand the qualitative measures of the properties.

Attribution - Object Instance Knowledge

The attribution process generates knowledge about each object instance by aggregating all the physical and functional qualities assigned to the object instance by the sub-categorization step. In other terms, the knowledge about an instance consists of the physical as well as functional qualities reflected in the instance. Let $\mathcal{P}_\eta$ and $\mathcal{F}_\eta$ be the families of sets containing the physical quality labels $P_\eta$ and the functional quality labels $F_\eta$ for each physical property $P \in \mathcal{P}$ and functional property $F \in \mathcal{F}$ respectively. Thus, each object instance $o \in \bigcup \mathcal{O}$ is represented as a set of all the physical as well as functional qualities attributed to it, which are expressed by a symbol $\mathit{holds}$ as: $\mathit{holds} \subset \bigcup \mathcal{O} \times (\mathcal{P}_\eta \cup \mathcal{F}_\eta)$. For example, knowledge about the instance plate1 of a plate class can be given as holds(plate1, medium), holds(plate1, harder), holds(plate1, can support), where medium is a physical quality of the size property, harder is a physical quality of the rigidity property and can support is a functional quality of the support property.

Conceptualization - Knowledge about Objects

The conceptualization process aggregates the knowledge about all the instances of an object class. The aggregated knowledge is regarded as conceptual knowledge about an object class.

Let $O_{KB}$ be a knowledge base about object classes where each object class $O \in \mathcal{O}$. Given the knowledge about all the instances of an object class $O$, in the conceptualization process, the knowledge about the object class $O_K \in O_{KB}$ is expressed as a set of tuples consisting of a physical or a functional quality and its proportion (membership) value in the object class. A tuple is expressed as $\langle O, t, m \rangle$ where $t \in P_\eta \cup F_\eta$ and a proportion value $m$ is calculated using the following membership function: $m = P(\mathit{holds}(o, t) \mid o \in O)$. The proportion value allows the intra-class variations in the objects to be modeled.

For example, knowledge about object class plate can be expressed as: {⟨plate, harder, 0.6⟩, ⟨plate, light weight, 0.75⟩, ⟨plate, less hollow, 0.67⟩, ⟨plate, hollow, 0.33⟩, ⟨plate, more support, 0.71⟩}, where the numbers indicate that, for instance, physical quality harder was observed in 60% of the instances of object class plate. At the end of the conceptualization process, conceptual knowledge about an object class is created which is represented in a symbolic fuzzy form and grounded into the human-generated or machine-generated data about the properties of objects. The
knowledge about objects is then used to determine a substitute from the existing objects in the environment.

Conceptualization - Knowledge about Functional Properties

In addition to conceptual knowledge about objects, the conceptualization process also creates knowledge about a functional quality, termed a function model, by associating the occurrence of physical qualities in an object instance with the occurrence of a functional quality in the instance and aggregating the result of such concurrent occurrences. The role of a function model is discussed later in Sec. 5.3. Given the knowledge about the object instances, a function model $f_d$ of a functional quality $f \in F_\eta$ is expressed as a set of tuples containing a functional quality $f \in F_\eta$, a physical quality $p \in P_\eta$ and a proportion value $d$. A tuple is represented as $\langle f, p, d \rangle$ where $f \in F_\eta$, $p \in P_\eta$ and a proportion value $d$ is computed as $d = P(\mathit{holds}(o, p) \mid \mathit{holds}(o, f))$. For example, a function model for a functional quality more support is given as {⟨more support, harder, 0.8⟩, ⟨more support, softer, 0.2⟩}, where the number indicates that, for instance, functional quality more support and a physical quality harder co-occurred in the knowledge about the object instances 80% of the time.

5.3 Reasoner

Fig. 2 illustrates a process flow consisting of the primary operations involved in determining a substitute. The flow offers an approximated aerial view for the prototypical model of ERSATZ. When ERSATZ is queried to find a substitute for a missing tool $x$ from the set of available objects $Y$, the system checks if the substitution model for $x$ exists in the knowledge base. If the substitution model does not exist, then the reasoner computes the relevant functional and physical properties of the queried tool.

Figure 2: A typical process flow to determine a substitute for a missing tool from the available objects.

Representative Models

A representative physical model and a representative functional model of an object consist of the physical or functional qualities, respectively, that are regarded as representative qualities of the object class, while the qualities which do not fall under representative qualities are regarded as exceptional or uncommon qualities.

Let $O \in \mathcal{O}$ be an object class of a missing tool and let $\theta$ be a representative model threshold which qualifies a physical or a functional quality as stereotypical or representative of the object class $O$. $O_{rp}$ is called a representative physical model of an object class $O$ such that $O_{rp} = \{p : \mathit{implies}(O, p) \geq \theta, p \in P_\eta\}$ and $O_{rf}$ is called a representative functional model of an object class $O$ such that $O_{rf} = \{f : \mathit{implies}(O, f) \geq \theta, f \in F_\eta\}$. Similarly, let $f_d$ be a function model of functional quality $f$; then $f_{rp}$ is called a representative physical model of a functional quality $f$ such that $f_{rp} = \{p : \mathit{implies}(f, p) \geq \theta, p \in P_\eta\}$.

Relevant Qualities

Due to the abstract nature of an image schema and, by extension, of a corresponding functional property, it can subsume various purposes of objects. For example, the functional property support can subsume the purposes place on, sit on and serve on of a table, a chair and a tray respectively. It is suggested in [6] that a certain assemblage of physical properties is an essential prerequisite to enable a functional property. Thus, it can be assumed that knowing the relevance of one functional property can help identify the relevant physical properties of different objects which are used for different purposes.

The relevance of a representative functional quality is decided by examining whether the physical characterization of the function model of the representative functional quality of a tool is in close proximity to the physical characterization of a representative physical model of the tool. The close proximity between a functional quality and the object class of the tool is determined using the Jaccard Index. The Jaccard Index determines the similarity and dissimilarity between two sets $A$ and $B$, where the similarity is calculated by dividing the magnitude of the intersection of $A$ and $B$ by the magnitude of the union of $A$ and $B$.

Let $O_{rp}$ and $f_{rp}$ be the representative physical models of an object class $O$ of the missing tool and of a function model $f_d$ of a representative functional quality $f \in F_\eta$ of the object class $O$ respectively. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. Then, the Jaccard Index of $O_{rp}$ and $f_{rp}$ is computed as:

$J(O_{rp}, f_{rp}) = \frac{|O_{rp} \cap f_{rp}|}{|O_{rp} \cup f_{rp}|}$.

A representative functional quality $f$ of an object class $O$ is regarded as relevant if $J(O_{rp}, f_{rp}) > \phi$. Let $O_{F'}$ be a set of all relevant functional qualities of an object class $O$. Let $f_{rp}$ be a representative physical model of a function model $f_d$ of a relevant functional quality $f \in O_{F'}$. Let $O_{rp}$ be a representative physical model of $O$. Then, the relevant physical qualities of an object class $O$ are expressed by a set $O_{P'} = (O_{rp} \cap f_{rp})$.
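
The steps from conceptualization to relevant qualities can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the ERSATZ implementation: the paper does not define implies, so it is read here as the proportion value of the corresponding tuple (m for an object class, d for a function model); when several functional qualities are relevant, the relevant physical qualities are taken as the union of the individual intersections; and all function names, quality labels and instance data are hypothetical.

```python
from collections import Counter

def class_knowledge(instances):
    """Membership m = P(holds(o, t) | o in O) for every quality t of a class."""
    counts = Counter(q for inst in instances for q in inst)
    return {q: c / len(instances) for q, c in counts.items()}

def function_model(instances, f, physical):
    """Proportion d = P(holds(o, p) | holds(o, f)) for every physical quality p."""
    with_f = [inst for inst in instances if f in inst]
    counts = Counter(p for inst in with_f for p in inst if p in physical)
    return {p: c / len(with_f) for p, c in counts.items()}

def representative(model, theta, allowed):
    """Qualities whose proportion reaches theta (assumed reading of 'implies')."""
    return {q for q, v in model.items() if q in allowed and v >= theta}

def jaccard(a, b):
    """|A intersect B| / |A union B|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def relevant_qualities(tool_instances, all_instances, physical, functional, theta, phi):
    """Return (O_P', O_F') for the class described by tool_instances."""
    knowledge = class_knowledge(tool_instances)
    o_rp = representative(knowledge, theta, physical)
    o_rf = representative(knowledge, theta, functional)
    relevant_p, relevant_f = set(), set()
    for f in o_rf:
        f_rp = representative(function_model(all_instances, f, physical), theta, physical)
        if jaccard(o_rp, f_rp) > phi:
            relevant_f.add(f)
            relevant_p |= o_rp & f_rp  # assumption: union over all relevant f
    return relevant_p, relevant_f

# Hypothetical holds facts: each set lists the qualities of one object instance.
PHYSICAL = {"harder", "softer", "flat", "round"}
FUNCTIONAL = {"more_support", "less_support"}
trays = [
    {"harder", "flat", "more_support"},
    {"harder", "flat", "more_support"},
    {"harder", "round", "more_support"},
    {"softer", "flat", "less_support"},
]
others = [
    {"softer", "round", "less_support"},  # e.g. a ball
    {"harder", "flat", "more_support"},   # e.g. a plate
]
relevant_p, relevant_f = relevant_qualities(
    trays, trays + others, PHYSICAL, FUNCTIONAL, theta=0.5, phi=0.5
)
print(sorted(relevant_p), sorted(relevant_f))
```

With these toy facts, harder and flat each occur in three of the four tray instances, so both enter the representative physical model at θ = 0.5, and more support is the only representative functional quality.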
Table 1: Number of scans (#) per category (Σ# = 692) of the Washington RGBD dataset [13].

Category       Inst.   Scans per Inst.   #
ball           1-7     5                 35
binder         1-3     10                30
bowl           1-6     5                 30
cap            1-4     8                 32
cereal box     1-5     6                 30
coffee mug     1-8     4                 32
flashlight     1-5     6                 30
food bag       1-8     4                 32
food box       1-12    3                 36
food can       1-14    2                 28
food cup       1-5     6                 30
food jar       1-6     5                 30
hand towel     1-5     6                 30
keyboard       1-5     6                 30
kleenex        1-5     6                 30
notebook       1-5     6                 30
pitcher        1-3     10                30
plate          1-7     5                 35
shampoo        1-6     5                 30
soda can       1-6     5                 30
sponge         1-12    3                 36
water bottle   1-9     4                 36

Reasoning about a Substitute

Let $O^\mu \in \mathcal{O}$ be an object class of a missing tool and let $O^\beta \in \mathcal{O}$ be an object class of a possible candidate for a substitute. Let $O^\mu_{P'}$ be a set of relevant physical qualities of $O^\mu$ and let $O^\beta_{rp}$ be a representative physical model of $O^\beta$. Let $\phi$ be a Minimum Similarity Tolerance threshold for similarity. The substitutability of a candidate is determined by measuring the similarity between $O^\mu_{P'}$ and $O^\beta_{rp}$ using Jaccard’s Index. $O^\beta$ is termed as a substitute, expressed as $O^{\beta+}$, if $J(O^\mu_{P'}, O^\beta_{rp}) > \phi$, else it is regarded as not a substitute and expressed as $O^{\beta-}$. Given the set of relevant physical qualities $O^\mu_{P'}$, the set of relevant functional qualities $O^\mu_{F'}$, a positive substitute $O^{\beta+}$ and a negative substitute $O^{\beta-}$, a substitution model of $O^\mu$ is expressed as a tuple: $\langle O^\mu_{P'}, O^\mu_{F'}, O^{\beta+}, O^{\beta-} \rangle$. The knowledge about object $O^\mu \in \mathcal{O}$ is then extended in $O_{KB}$ to accommodate its substitution model.
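
As a rough illustration of this selection step, the following Python sketch scores each candidate class by the Jaccard index between the relevant physical qualities of the missing tool and the candidate's representative physical model, then splits the candidates into positive and negative substitutes. It is a sketch under assumptions rather than the authors' code: the function names, the dictionary layout of the substitution model, the quality labels and the threshold value are all hypothetical, and the data loosely mirrors the tray, plate and mouse pad scenario of Sec. 4.

```python
def jaccard(a, b):
    """|A intersect B| / |A union B|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def substitution_model(relevant_physical, relevant_functional, candidates, phi):
    """Build the tuple (O_P', O_F', positive substitutes, negative substitutes).

    candidates maps a class label to its representative physical model O_rp.
    The two result dicts keep the similarity score of every candidate.
    """
    positive, negative = {}, {}
    for label, o_rp in candidates.items():
        score = jaccard(relevant_physical, o_rp)
        if score > phi:
            positive[label] = score
        else:
            negative[label] = score
    return relevant_physical, relevant_functional, positive, negative

# Hypothetical query: a tray is missing, a plate and a mouse pad are available.
tray_relevant_physical = {"harder", "flat"}
tray_relevant_functional = {"more_support"}
candidates = {
    "plate": {"harder", "flat", "round"},   # J = 2/3
    "mouse_pad": {"softer", "flat"},        # J = 1/3
}
_, _, positive, negative = substitution_model(
    tray_relevant_physical, tray_relevant_functional, candidates, phi=0.5
)
best = max(positive, key=positive.get) if positive else None
print(best, positive, negative)
```

Keeping the scores alongside the labels makes it possible to rank several positive substitutes, which the tuple notation above leaves open.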
6 Experimental Evaluation

Figure 3: Substitution results w.r.t. human expert selection distribution and ERSATZ similarity responses. (a) Human expert selection distributions. (b) ERSATZ selections with similarity to the missing tool. Axes: missing tools and substitutes over the 22 object categories; color scale: distribution / similarity from 0.0 to 1.0. Note that gray cells correspond to object categories which are not available in the respective query, and marked cells represent substitutes selected by experts and ERSATZ.

The objective of the experimental evaluation of ERSATZ is to validate the suitability of the substitutes computed by ERSATZ by comparing the results with those of human experts. For the experimental evaluation, we used the images from the Washington Dataset [13] to generate human-based and machine-based properties. A total of 22 object categories were selected and for each category, we selected random images from all the given instances of the category, leading to a total of 692 images. Table 1 illustrates the number of images selected from each category. For the experiment, we generated 22 queries based on the 22 object categories. Each query consisted of a missing tool and 5 randomly selected objects from which a substitute was to be selected. We gave the 22 queries to 14 human experts and asked them to select a substitute in each query. The distribution of the human selections for each scenario is illustrated in Fig. 3(a). Similarly, the queries were run on ERSATZ with the following
(heuristically determined) optimal values of the target ERSATZ and the experts identified the same substi-
parameters: i) Number of machine-generated proper- tutes in 20 scenarios (91%).
ties is set to 4 (Sec. 5.1), ii) Number of clusters to 4
(Sec. 5.2), iii) Representative threshold (Sec. 5.3) and
7 Future Work
Minimum Similarity Tolerance (Sec. 5.3) to 0.35.
The results of both experiments were plotted as a The paper presents a prototypical system to deter-
heat map where the y-axis shows missing tools and x- mine a substitute for a missing tool using the grounded
axis shows the available objects illustrated in Fig. 3. knowledge about objects. The approach has drawn in-
The grayed cells mean the corresponding object cate- spiration from symbol grounding, the theory of affor-
gories were not available in the respective query. The dances and the theory of image schemas to represent
cells that are marked with represents substitutes se- the grounded knowledge and to determine a substi-
lected by experts and ERSATZ. Out of 22 scenarios, tute. This is an ongoing research with a focus on the
26
following aspects.

Our immediate goal focuses on the fuzzification of the clustering method and the reasoning
method to combat the migration of the data points within clusters. Moreover, we have derived
three functional properties, namely, contain, support, block from the image schemas Containment,
Support and Blockage respectively. However, further investigation is needed to formalize the
identification of additional functional properties to be derived from the existing image schema.

For robot-centric property acquisition, we are currently developing a framework that allows a
robot to extract properties of individual objects and build a knowledge base in a bottom-up
manner such that the knowledge about properties of objects is constructed on the basis of what
is sensed (see Fig. 4). We have proposed the preliminary framework in [11].

Figure 4: Multi-layered dataset to build a robot-centric grounded knowledge about objects.
Diagram labels, top to bottom: Fuzzy Conceptual Knowledge - Object Classes; Bivariate Joint
Frequency Distributions; Fuzzy Conceptual Knowledge - Object Instances; Clustering Method;
Functional Property Data - Object Instances; Aggregation; Physical Property Data - Object
Instances; Property Extraction Methods; Sensory Data - Object Instances.

References

[1] P. Abelha, F. Guerin, and M. Schoeler. A model-based approach to finding substitute tools
in 3D vision data. Proceedings - IEEE International Conference on Robotics and Automation,
2016-June, 2016.

[2] A. Agostini, M. J. Aein, S. Szedmak, E. E. Aksoy, J. Piater, and F. Wörgötter. Using
structural bootstrapping for object substitution in robotic executions of human-like
manipulation tasks. IEEE International Conference on Intelligent Robots and Systems,
2015-Decem:6479–6486, 2015.

[3] I. Awaad, G. K. Kraetzschmar, and J. Hertzberg. Challenges in Finding Ways to Get the Job
Done. Planning and Robotics (PlanRob) Workshop at 24th International Conference on Automated
Planning and Scheduling, 2014.

[4] C. Baber. Cognition and Tool Use. Taylor and Francis, 2003.

[5] C. Baber. Introduction. In Cognition and Tool Use, chapter 1, pages 1–15. Taylor and
Francis, 2003.

[6] C. Baber. Working With Tools. In Cognition and Tool Use, pages 51–68. 2003.

[7] A. Boteanu, A. St. Clair, A. Mohseni-Kabir, C. Saldanha, and S. Chernova. Leveraging
Large-Scale Semantic Networks for Adaptive Robot Task Learning and Execution. Big Data,
4(4):217–235, 2016.

[8] M. Daoutis, S. Coradeschi, and A. Loutfi. Grounding commonsense knowledge in intelligent
systems. Journal of Ambient Intelligence and Smart Environments, 1(4):311–321, 2009.

[9] P. Gärdenfors. Cognitive semantics and image schemas with embodied forces. Image
(Rochester, N.Y.), pages 1–16, 1987.

[10] R. Gupta and M. J. Kochenderfer. Common Sense Data Acquisition for Indoor Mobile Robots.
In Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth
Conference on Innovative Applications of Artificial Intelligence, pages 605–610, San Jose,
California, USA, 2004.

[11] G. Jäger, C. A. Mueller, M. Thosar, S. Zug, and A. Birk. Towards Robot-Centric Conceptual
Knowledge Acquisition. In Robots that learn and reason in IEEE/RSJ International Conference on
Intelligent Robots and Systems, Madrid, 2018.

[12] W. Kuhn. An Image-Schematic Account of Spatial Categories. Spatial Information Theory,
pages 152–168, 2007.

[13] K. Lai, L. Bo, X. Ren, and D. Fox. A Large-Scale Hierarchical Multi-View RGB-D Object
Dataset. In IEEE International Conference on Robotics and Automation (ICRA), pages 1817–1824,
Shanghai, China, 2011.

[14] S. Lemaignan, R. Ros, L. Mösenlechner, R. Alami, and M. Beetz. ORO, a knowledge management
platform for cognitive architectures in robotics. IEEE/RSJ 2010 International Conference on
Intelligent Robots and Systems, IROS 2010 - Conference Proceedings, (April):3548–3553, 2010.

[15] G. H. Lim, I. H. Suh, and H. Suh. Ontology-based unified robot knowledge for service
robots in indoor environments. IEEE Transactions on Systems, Man, and Cybernetics Part A:
Systems and Humans, 41(3):492–509, 2011.

[16] C. Mueller, K. Pathak, and A. Birk. Object shape categorization in rgbd images using
hierarchical graph constellation models based on unsupervisedly learned shape parts described
by a set of shape specificity levels. In International Conference on Intelligent Robots and
Systems, 2014.

[17] C. A. Mueller and A. Birk. Hierarchical Graph-Based Discovery of Non-Primitive-Shaped
Objects in Unstructured Environments. In International Conference on Robotics and Automation,
2016.

[18] C. A. Mueller and A. Birk. Conceptualization of Object Compositions Using Persistent
Homology. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
Madrid, 2018.

[19] L. A. Pineda, A. Rodríguez, G. Fuentes, C. Rascón, and I. Meza. A light non-monotonic
knowledge-base for service robots. Intelligent Service Robotics, 10(3):159–171, 2017.

[20] A. Saxena, A. Jain, O. Sener, A. Jami, D. K. Misra, and H. S. Koppula. RoboBrain:
Large-Scale Knowledge Engine for Robots. arXiv, pages 1–11, 2014.

[21] I. H. Suh, G. H. Lim, W. Hwang, H. Suh, J. H. Choi, and Y. T. Park. Ontology-based
multi-layered robot knowledge framework (OMRKF) for robot intelligence. IEEE International
Conference on Intelligent Robots and Systems, (October):429–436, 2007.

[22] S. Szedmak, E. Ugur, and J. Piater. Knowledge Propagation and Relation Learning for
Predicting Action Effects. In IEEE/RSJ International Conference on Intelligent Robots and
Systems, Chicago, 9 2014.

[23] M. Tenorth and M. Beetz. KNOWROB - Knowledge Processing for Autonomous Personal Robots. In
IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 4261–4266, 2009.

[24] M. Thosar, S. Zug, A. M. Skaria, and A. Jain. A Review of Knowledge Bases for Service
Robots in Household Environments. In 6th International Workshop on Artificial Intelligence and
Cognition, 2018.

[25] Y. Zhu, A. Fathi, and L. Fei-Fei. Reasoning About Object Affordance in a Knowledge Based
Representation. European Conference on Computer Vision, (3):408–424, 2014.