=Paper= {{Paper |id=Vol-2831/paper4 |storemode=property |title=Towards a Virtual Librarian for Biologically Inspired Design Knowledge-Based Methods for Document Understanding |pdfUrl=https://ceur-ws.org/Vol-2831/paper4.pdf |volume=Vol-2831 |authors=Ruth Petit-Bois,Jeffrey Jacob,Spencer Rugaber,Ashok Goel |dblpUrl=https://dblp.org/rec/conf/aaai/Petit-BoisJR021 }} ==Towards a Virtual Librarian for Biologically Inspired Design Knowledge-Based Methods for Document Understanding== https://ceur-ws.org/Vol-2831/paper4.pdf
           Towards a Virtual Librarian for Biologically Inspired Design –
             Knowledge-Based Methods for Document Understanding
                          Ruth Petit-Bois1, Jeffrey Jacob2, Spencer Rugaber3, Ashok Goel4
                      Design & Intelligence Lab, School of Interactive Computing, Georgia Institute of Technology1,2,4
                                       School of Computer Science, Georgia Institute of Technology3
                      petitbois@gatech.edu1, jeffrey.jacob@gatech.edu2, rugaber@cc.gatech.edu3, goel@cc.gatech.edu4




                              Abstract                                             We posit that AI can be a powerful ally in tracking and
   IBID is a virtual librarian that processes biology articles and              understanding scientific documents and that knowledge-
   builds semantic annotations based on the contents of an arti-                based methods that use ontologies can augment the under-
   cle. It then assists human designers by locating and present-                standing capability of AI agents. This kind of AI agent can
   ing biology articles related to a design query. IBID’s use of                serve as a sort of virtual librarian for scientific literature. The
   ontologies allows for knowledge extraction and assists users
   with the identification of key information in an article and                 IBID (Intelligent Biologically Inspired Design; Goel et al.
   comparison of the contents of two articles. In this paper, we                2020; Rugaber et al. 2016) interactive system is intended to
   describe how the addition of an environment ontology en-                     be a virtual librarian for the domain of biologically inspired
   hances IBID’s capability to understand the habitats of various               design in which designers of technological systems look to
   organisms. In a pilot study, we evaluated IBID’s performance                 the natural world for ideas (Goel 2013a; Goel, McAdams &
   against human subjects who read the same passage and high-
   lighted phrases pertaining to locations and habitats. The pre-               Stone 2014). In this paper, we describe how IBID’s use of
   liminary results indicate that the ability to add ontologies to              ontologies allows for knowledge extraction and can assist
   IBID allows it to extract meaning from new documents.                        users with tasks like identifying key information in an article
                                                                                and comparing the contents of two different articles. In par-
                                                                                ticular, we show how the addition of an environment ontol-
                       1 Introduction                                           ogy enhances IBID’s capability to understand the locations
Scientific documents are information-rich and are more                          and habitats of various organisms.
common and more available than ever before. However,
with this proliferation comes the challenge of tracking and
understanding scientific documents at scale. Traditionally, a
                                                                                                    2    Related Research
scientist could work with a librarian to find the literature rel-               Biologically inspired design, also known as biomimicry
evant to the problem of interest. Now, most scientific litera-                  (Benyus 1997) and as biomimetics (Vincent & Mann 2002),
ture has moved online, real librarians are hard to find, and it                 is a paradigm for sustainable and environmentally friendly
is increasingly difficult, even for experts, to track, read and                 design. Consider, for example, the Namib Desert Beetle:
understand all the new scientific documents that are being                      The insect survives in the arid desert by harvesting fog
generated on a given topic.                                                     droplets that stick to its wings (Naidu and Hattingh 1988).
   Understanding scientific documents is an involved pro-                       If engineers could successfully and efficiently mimic this
cess: there is a big difference between just reading text and                   ability in technological systems at scale, it might be possible
actually understanding it. We view scientific document un-                      to solve many water crises that exist in the world (Chen &
derstanding as the ability to process information and then be                   Zhang 2020).
able to draw useful inferences from it and not draw spurious                       However, there are several major hurdles in putting bio-
inferences. This view supports higher level tasks like com-                     logically inspired design paradigms into practice. From an
paring the contents of two different documents and identify-                    information-processing perspective, one big hurdle is locat-
ing similarities and differences between them.                                  ing biological cases relevant to a design problem. Given a


Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                            Figure 1. The Conceptual Architecture of IBID

problem, most designers search for articles describing rele-         systems (Vincent 2014) and focuses on identifying inter-re-
vant biological systems online. Observations of online in-           lations in biological systems.
formation-seeking behavior of (student) designers indicate              Of course, the goal of using publicly available scientific lit-
three problems (Vattam & Goel 2011, 2013): Findability –             erature to support human creativity extends far beyond the
designers have difficulty finding biology articles relevant to       domain of biologically inspired design. In the context of
a design problem; Recognizability – designers have diffi-            computational creativity more generally, Abgaz et al. (2017)
culty recognizing that an article describes a biological sys-        use natural language processing to find analogies between
tem that is relevant to their problem; and Understandability         constructs in research papers on computer graphics, and
– designers have difficulty understanding the biological sys-        Lavrac et al. (2019) describe text mining techniques for de-
tem described in an article.                                         tecting bridging concepts between seemingly unrelated
   As a result, there have been several attempts at using           terms in different articles such as migraine and magnesium.
natural language processing techniques to help designers locate
biology articles relevant to their problem. Shu (2010) de-
scribes an early approach in engineering for using natural             3    Intelligent Biologically Inspired Design
language processing for this task. Shu uses keywords for an-         The goal of the IBID project is to address the above men-
choring the natural language processing, but points out that         tioned problems of findability, recognizability and under-
the benefits of information extraction through natural               standability in the context of biologically inspired design.
language processing are not restricted to known patterns. Nagel,     Figure 1 shows the full functionality of IBID for its three
Stone & McAdams (2010) use an engineering to biology                 use cases: (1) End users such as engineers and designers
thesaurus that translates design queries in engineering to           looking for biology articles relevant to their design prob-
equivalent keywords in biology. Kruiper et al. (2017)                lems, (2) Knowledge engineers extending IBID’s
provide a more recent effort coming from biology. Their work         knowledge representation ontologies, and (3) System ad-
is based on a domain-specific ontology of biological                 ministrators adding to its repository of analyzed papers. Fig-
                                                                     ure 1 also specifies the actions available to each user type;
the arrows in the figure indicate progression of steps and/or      The mechanism by which fog water forms into large
access to/from the database.                                       droplets on a beaded surface has been described from
   The core of IBID’s approach to these problems is the use        the study of the elytra of beetles from the genus Sten-
of the Structure-Behavior-Function (SBF) models of tech-           ocara. The structures behind this process are believed
nological and natural systems (Goel 2013b; Goel, Rugaber           to be hydrophilic peaks surrounded by hydrophobic ar-
                                                                   eas; water carried by the fog settles on the hydrophilic
& Vattam 2009) that originate from Chandrasekaran’s
                                                                   peaks of the smooth bumps on the elytra of the beetle
Functional Representation scheme (Chandrasekaran 1994;             and form fast-growing droplets that - once large
Chandrasekaran, Goel & Iwasaki 1993). By an ontology we            enough to move against the wind - roll down towards
mean the specification of concepts and their relationships to      the head.
other concepts (Chandrasekaran, Josephson & Benjamins              IBID processes the above paragraph and identifies the
1999; Guarino, Oberle & Staab 2009). The SBF model of a         structure, behavior and function specified in it:
system, technological or natural, is based on an ontology        • Structure: IBID identifies the entity in question as elytra.
composed of several subontologies:                               • Behavior: IBID identifies the cause as droplets grow in
 • Structure Ontology: The components, elements, or sub-           size and the effect as they roll down towards the head.
   stances in a biological process.                              • Function: IBID identifies the result of the action as move
 • Behavior Ontology: The causal mechanisms or the pro-            the water droplets.
   cesses of a biological system.                               This list is only illustrative of IBID’s capabilities, not com-
 • Function Ontology: The outcome, result or the purpose of     prehensive. IBID performs this kind of automatic extraction
   a biological system.                                         of structure, behavior and function for whole articles and an-
 • Ontology of Relationships: Relationships between struc-      notates the articles with the extracted structure, behaviors
   ture and behavior and between behavior and function.         and functions.
   In earlier work on the KA project in the 1990s (Goel et         Given IBID’s annotation of biology articles in a corpus
al. 1996), we showed how an AI agent could learn an SBF         with the structure, behaviors and functions of biological sys-
model of a new device (such as a shaving cream can) from        tems described in them, users can perform faceted search on
its natural language description in The Way Things Work         the corpus (Prieto-Diaz 1991). Thus, a user may search for
(Macaulay 1988) by adapting the SBF model of a similar          the function move, or the structural element elytra, or both.
device (such as a fire extinguisher) stored in the agent’s      A user may also use IBID to perform a search using a design
memory. More recently, we have shown that manually              query expressed in plain English: given such a query, IBID
annotating biology articles with SBF models enhances their      extracts the structure, behaviors and functions of the desired
findability and recognizability (Vattam & Goel 2011) as         technological system from the query and then matches the
well as their understandability (Helms, Vattam & Goel           extractions with the SBF annotations on the articles in the
2010). IBID seeks to automatically extract the SBF models       corpus in a manner similar to the earlier KA project (Peter-
of the biological systems described in the articles.            son, Mahesh & Goel 1994). This helps IBID address the
                                                                problems of findability and recognizability we described
3.1 Structure-Behavior-Function Ontologies                      earlier.
                                                                   IBID also highlights the SBF annotations on a biology ar-
In IBID, the SBF ontologies come from several sources:          ticle. This helps IBID address the problem of understanda-
 • Structure Ontology is borrowed from Vincent’s (2014)         bility even for dense and long articles, such as the Norgaard
   ontology of biological systems.                              and Dacke (2010) article quoted above. This can potentially
 • Behavior Ontology builds on Khoo et al.’s (1998) patterns    help the user process biology articles more efficiently and
   of cause and effect.                                         easily, where the users may include not only biologists, but
 • Function Ontology was developed in our laboratory (Ru-       also engineers, designers, or even citizen scientists.
   gaber et al. 2016). Functional concepts are organized hi-
   erarchically: similar concepts are grouped together as
   families and more nuanced concepts are found deeper in       4    Adding an Environment Ontology to IBID
   the hierarchy.
The current version of IBID does not directly relate struc-     While the paragraph from the Norgaard and Dacke (2010)
ture, behaviors and functions of a biological system into a     article briefly mentions the location of elytra (elytra of bee-
complete SBF model.                                             tles), the above description of IBID has no way of identify-
   These ontologies help IBID construct a partial SBF model     ing the location of the structural elements of a biological
of the biological system described in a biology article.        system. However, for many biological systems, the loca-
Rugaber et al. (2016) provide an example of how IBID            tion, habitat, and, more generally, the environment of the
processes the following passage from Norgaard and Dacke         system is very important. The external environment is also
(2010):
important for technological systems: the specification of
many design problems includes a specification of the envi-
ronment of the desired technological system (Helms & Goel
2014). Thus, there is a need to add an environment ontology
so that IBID can identify the locations and habitats of bio-
logical systems.
  In fact, the environment has always been a part of SBF modeling
(Goel 2013b). For example, Prabhakar & Goel (1998) analyzed the
functioning of technological systems such as a room air conditioning
system not only in terms of its structure, behaviors and functions,
but also its external environment. The research question for the IBID
project is whether we can add an environment ontology to the SBF
ontology and whether IBID can use the new ontology to identify the
locations and habitats of biological systems just as it identifies
                                                                           Figure 2. An excerpt of stripped-down ENVO.
their structures, behaviors and functions.
   Instead of building a new environment ontology from                3.   The ability to use & export this information easily.
scratch, we decided to explore already existing ontologies.
After examining several candidates, we selected the Envi-            By using ENVO, it was clear that Objective 1 could be
ronment Ontology (ENVO) described by Buttigieg et al.             reached just by establishing that all future ontologies would
(2013) in the Journal of Biomedical Semantics. This               use the OWL format. Not only can OWL files be imported,
ontology is hosted on the OBO Foundry (Smith et al. 2007) and     parsed, modified, and exported easily, there are many tools
is quite comprehensive. A big advantage of this ontology          to help visualize and act on these OWL files such as Protégé
over many others is that it can be exported as a Web Ontol-       (Musen 2015). Protégé became the software used to scale
ogy Language (OWL) file. OWL files written in the Seman-          down ENVO, as well as rebuild the Structure, Behavior, and
tic Web Language are “designed to represent rich and com-         Function ontologies so they also conformed to the new
plex knowledge about things, groups of things, and relations      OWL standard. Adding new concepts or modifying existing
between things.” (McGuinness & Van Harmelen 2004; Web             concepts was simple using the Protégé software, thereby ad-
Ontology Language at www.w3.org/OWL/). Given that the             dressing Objective 2.
OWL file containing ENVO was developed by highly                     With the new converted ontologies, the issue of how to
skilled biologists, it eliminated the need for us to spend time   store these ontologies in a relational database arose. To re-
creating the links between concepts manually. Not only are        solve this, we developed a script that would take in an OWL
the links already made, but ENVO is made up of hundreds           file and convert it into its relational database equivalent. By
of nodes of concept names, definitions, parents, synonyms,        the end of the implementation, the structure, behavior, and
notes, and other metadata that describe ecosystems, entire        function ontologies were updated and reimported into
planets, and other astronomical bodies, and their parts           IBID’s relational database using the new OWL format. The
(Buttigieg et al. 2013). Integrating new knowledge into           environment ontology was also imported into IBID allowing
IBID is efficient and easy because of the use of OWL files,       for articles to be analyzed to extract environment concepts.
and sourcing them from places like OBO Foundry helps              In addition to this, IBID now has a pipeline for integrating
IBID leverage the knowledge of domain experts.                    new ontologies that are in the OWL format in an easy man-
   To provide a simpler testing ground of adding an ontol-        ner. Given that all of the data was imported into a relational
ogy into IBID and testing its effectiveness, we reduced           database, exporting this information from the database was
ENVO to just contain extremely basic concepts relating to         simple, and even using Protégé to export the OWL files into
ecosystems and their key environmental concepts. Figure 2         other formats was simple, addressing Objective 3.
illustrates a small excerpt from the stripped-down ENVO.
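
The conversion of an OWL file into relational tables can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not IBID's actual script: the table names (`concept`, `subclass_of`), the function `import_owl`, and the tiny inline ontology with its `example.org` IRIs are all hypothetical, and only named `owl:Class` nodes with `rdfs:label` and `rdfs:subClassOf` in the RDF/XML serialization are handled.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Standard namespace URIs used by OWL files serialized as RDF/XML.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical miniature ontology; the IRIs are placeholders, not ENVO identifiers.
SAMPLE_OWL = """<?xml version="1.0"?>
<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:about="http://example.org/env#biome">
    <rdfs:label>biome</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/env#forest">
    <rdfs:label>forest</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/env#biome"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/env#desert">
    <rdfs:label>desert</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/env#biome"/>
  </owl:Class>
</rdf:RDF>
"""


def import_owl(owl_text, conn):
    """Load named owl:Class nodes and their subClassOf links into two tables."""
    conn.execute("CREATE TABLE IF NOT EXISTS concept (iri TEXT PRIMARY KEY, label TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS subclass_of (child TEXT, parent TEXT)")
    root = ET.fromstring(owl_text)
    for cls in root.findall("owl:Class", NS):
        iri = cls.get(RDF_ABOUT)
        if iri is None:  # skip anonymous class expressions
            continue
        label = cls.findtext("rdfs:label", default=None, namespaces=NS)
        conn.execute("INSERT OR REPLACE INTO concept VALUES (?, ?)", (iri, label))
        for sup in cls.findall("rdfs:subClassOf", NS):
            parent = sup.get(RDF_RESOURCE)
            if parent is not None:  # ignore restrictions without a named parent
                conn.execute("INSERT INTO subclass_of VALUES (?, ?)", (iri, parent))
    conn.commit()


conn = sqlite3.connect(":memory:")
import_owl(SAMPLE_OWL, conn)

# Query the children of one concept by joining the two tables.
children = [row[0] for row in conn.execute(
    "SELECT c.label FROM subclass_of s JOIN concept c ON c.iri = s.child "
    "WHERE s.parent = ? ORDER BY c.label",
    ("http://example.org/env#biome",),
)]
print(children)
```

Keeping the parent links in their own table means that a concept with several parents needs no schema change, which is one plausible reason to prefer this layout when importing further ontologies.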
                                                                                   5    Experimentation
4.1 Generalizing the Approach
Three factors were especially important in adding the             With the ability to import new knowledge executed, the next
ENVO ontology to IBID:                                            step was experimenting and evaluating how well IBID could
    1. The ability to have a standard format by which to          leverage this knowledge. An experiment was conducted to
        import ontologies.                                        test the effectiveness of the environment ontology with ten
    2. The ability to add information quickly without             participants outside of the IBID project in the Design and
        breaking previous implementations.                        Intelligence Lab. In conjunction with this experiment, a
validation page was developed to test the functionality of the
environment ontology. The use case of comparing scientific documents
was also explored qualitatively.

5.1 Validation of Environment Ontology
IBID’s validation took a passage of text, ran it through IBID’s
analysis pipeline, and returned a list of results specific to the
environment ontology. The experiment compared IBID’s results with
human subjects analyzing the same passages. The results of this
experiment would reveal gaps in the environment ontology’s
functionality that could be used to make it more robust. The text for
the experiments came from Szalay (2014), en.wikipedia.org/wiki/Elephant,
and McTighe (2011).
   In total, 10 human participants completed the experiment. Each
participant read the same three passages on three different organisms.
The instructions were to underline terms in the passages they
considered to be related to the “environment” or the “habitat” in
which organisms live. The organisms in question were the King Cobra,
with a passage containing 4 sentences, the African Bush Elephant, with
a passage containing 14 sentences, and the Highland Streaked Tenrec,
with a passage containing 6 sentences.

 Passage                                          Avg. # of Terms   # of Terms
                                                  by Humans         by IBID
 King Cobra (4 sentences), Passage 1              8                 2
 African Bush Elephant (14 sentences), Passage 2  10                4
 Highland Streaked Tenrec (6 sentences), Passage 3  11              4

     Table 1. Results from the first pass of the experiment

   The highlighted phrases were pulled out exactly as they were marked
by the participant. The assumption here was that there was a
difference between a term having been highlighted in one straight
stroke, and a term being highlighted with spaces in between. Consider
this sentence from en.wikipedia.org/wiki/Elephant:
   The African bush elephant can be found in habitats as diverse as
   dry savannahs, deserts, marshes, and lake shores, and in elevations
   from sea level to mountain areas above the snow line.
It made a difference whether a participant highlighted “dry savannahs,
deserts, marshes, and lake shores” in one go, which counted as one
phrase, or highlighted “dry savannahs,” then “deserts,” then
“marshes,” then “lake shores” separately, which counted as 4 different
phrases. Of the 4 sentences based on the King Cobra’s habitat, the
humans were on average able to locate ~8 different environment terms.
Running the same passage in IBID led to it finding only 2. Of the 14
sentences based on the African Bush Elephant, the humans were able to
find 10 different environment terms on average; IBID was able to
identify 4. Finally, in the passage on the Highland Streaked Tenrec,
humans marked around 11 environment terms while IBID was able to
extract 4. The results are shown in Table 1.

 Sentence                                      Passage #  Phrase Selected
 (where >50% of Users Agree)                              (where at least 5 people agreed
                                                          on the concept/phrase)

 They prefer streams in dense or open forest,  Passage 1  streams in dense or open forest (x6)
 bamboo thickets, adjacent agricultural areas             bamboo thickets (x8)
 and dense mangrove swamps.                               adjacent agricultural areas (x5)
                                                          dense mangrove swamps (x5)

 The African bush elephant can be found in     Passage 2  dry savannahs (x7)
 habitats as diverse as dry savannahs,                    Deserts (x8)
 deserts, marshes, and lake shores, and in                Marshes (x8)
 elevations from sea level to mountain areas              lake shores (x8)
 above the snow line.

 Forest elephants mainly live in equatorial    Passage 2  equatorial forests (x7)
 forests but will enter gallery forests and
 ecotones between forests and savannahs.

 Asian elephants prefer areas with a mix of    Passage 2  dry thorn-scrub forests (x5)
 grasses, low woody plants, and trees,                    evergreen forests (x7)
 primarily inhabiting dry thorn-scrub forests
 in southern India and Sri Lanka and evergreen
 forests in Malaya.

 Elephants tend to stay near water sources.    Passage 2  stay near water sources (x6)

 Highland streaked tenrecs are found in        Passage 3  Schlerophyllous (x5)
 schlerophyllous and montane forests and                  montane forests (x5)
 adjacent areas at elevations of 1550 to
 1800 m.

 They occur both in primary rainforests and in Passage 3  primary rainforests (x6)
 introduced forests of eucalyptus and pine.               introduced forests of eucalyptus
                                                          and pine (x7)

     Table 2. The aggregated results for the three passages.
           Figure 3. Example of Comparing Results from Two Documents about an Eastern Box Turtle and Desert Tortoise.
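
One way to picture the side-by-side comparison in Figure 3 is as set operations over the concepts extracted from each document. The sketch below assumes that extraction has already produced one set of environment concepts per document; the function `compare_concepts` and the two example concept sets are hypothetical placeholders, not IBID output.

```python
def compare_concepts(doc_a, doc_b):
    """Split two collections of extracted concepts into shared and document-specific parts."""
    a, b = set(doc_a), set(doc_b)
    return {
        "shared": sorted(a & b),   # concepts both documents mention
        "only_a": sorted(a - b),   # concepts found only in the first document
        "only_b": sorted(b - a),   # concepts found only in the second document
    }


# Made-up concept sets standing in for the extractions from two documents.
doc_a_concepts = {"forest", "stream", "grassland"}
doc_b_concepts = {"desert", "grassland"}

comparison = compare_concepts(doc_a_concepts, doc_b_concepts)
print(comparison)
```

The shared and document-specific lists correspond to where two documents agree and where they differ. A fuller version might also compare concepts through their parents in the ontology rather than by exact label.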

Table 2 contains the concepts that a majority of participants      key concepts in a document helps the researcher quickly
agreed on. The criterion for “agreeing” is that, of the            compare two documents. If we have two documents about
aggregated list of results, at least 50% of the participants       the same or similar species, IBID can help the researcher
agreed that the selected sentence was one that contained an        compare and contrast information and see where two differ-
environment concept and at least 5 participants also agreed        ent documents are in agreement and where they disagree.
on the concept that indicated it related to the environment.       We believe that this can be a powerful tool and a major fea-
                                                                   ture in the realm of scientific document understanding.
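
The agreement criteria used for Table 2 can be sketched as a simple vote count over the participants' highlights. This is an illustrative sketch only: the function `agreed_phrases`, its parameter names, and the toy highlight data are assumptions, and phrases are compared exactly as marked, with no normalization.

```python
from collections import Counter


def agreed_phrases(highlights, n_participants,
                   min_phrase_votes=5, min_sentence_share=0.5):
    """Keep phrases that enough participants marked, in sentences that enough marked.

    highlights: one set of (sentence_id, phrase) pairs per participant.
    """
    phrase_votes = Counter()
    sentence_votes = Counter()
    for marked in highlights:
        phrase_votes.update(set(marked))                     # one vote per participant per phrase
        sentence_votes.update({sid for sid, _ in marked})    # one vote per participant per sentence
    result = {}
    for (sid, phrase), votes in phrase_votes.items():
        sentence_ok = sentence_votes[sid] >= min_sentence_share * n_participants
        if sentence_ok and votes >= min_phrase_votes:
            result[(sid, phrase)] = votes
    return result


# Toy data for 10 participants; not the study's raw highlights.
highlights = (
    [{(1, "bamboo thickets")}] * 5
    + [{(1, "bamboo thickets"), (1, "thickets")}] * 3
    + [{(2, "river")}] * 2
)

agreed = agreed_phrases(highlights, n_participants=10)
print(agreed)
```

Because phrases are matched verbatim, “thickets” and “bamboo thickets” are counted separately, mirroring the decision to record highlights exactly as participants marked them.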
5.2 Comparing Two Documents
As mentioned earlier, scientific document understanding al-                      6   Discussion and Results
lows an agent to perform higher level tasks and one such
task that is paramount in any kind of research is the ability      Based on the experiment above, we can see that the addition
to quickly compare the key points of two different docu-           of a new ontology, in this case the environment ontology,
ments. IBID is able to take in two documents and run its           improves IBID’s understanding in this domain. IBID ini-
analysis and display the results side-by-side. This process        tially had no understanding ability when it came to habitats
involves the same pipeline as discussed earlier and leverages      and locations, but the addition of this ontology led to in-
the same knowledge base. We tested this process with sev-          creased understanding as seen in Table 1. However, we
eral different excerpts taken from descriptions of the habi-       acknowledge that the number of participants in our experi-
tats of different species, an example of which is shown in         ment was small and IBID did not reach human level perfor-
Figure 3. It can be seen that IBID’s ability to understand the     mance. We still feel that these preliminary results show that
IBID’s ability to integrate new knowledge moves it towards        that IBID missed. For example, in the sentence from en.wik-
becoming a true virtual librarian.                                ipedia.org/wiki/Elephant:
   The experiment also showed some of the weaknesses
                                                                     Asian elephants prefer areas with a mix of grasses, low
IBID has. For example, there are many proper noun location           woody plants, and trees, primarily inhabiting dry thorn-
words (country names, cardinal directions, etc.) that many           scrub forests in southern India and Sri Lanka and ever-
participants deemed relevant to the environment of an ani-           green forests in Malaya.
mal. IBID’s knowledge base is strictly that of habitats as        It makes sense that humans marked “mix of grasses” and
described in ENVO. Take for instance the simple sentence          “low woody plants, and trees.” These phrases do not
from Passage 3 (McTighe 2011):                                    map to any concepts in ENVO.
   They are most commonly found at forest fringes on the          However, the verb “prefer” was identified by IBID and al-
   central plateau edge and near cultivated fields and rice       lowed the sentence to be extracted independent of the envi-
   paddies                                                        ronment terms found by the participants.
The key term was “forest” and it was pulled out by IBID;             These results show that IBID’s knowledge-based meth-
the term “forest” maps to an environment concept in ENVO.         ods show promise in efficiently extracting information from
In contrast to this, humans are able to look at a sentence say-   a scientific document and that the use of ontologies allows
ing, “southern Indian desert” and see that the whole phrase       for it to quickly integrate and leverage new knowledge,
indicates location while IBID would only be able to recog-        without the need for extensive data collection or training.
nize the term “desert”.                                           Another major benefit of IBID’s approach is better explai-
   Looking at the “Phrase Selected” column in Table 2, it is      nability. It is easy to determine gaps in IBID’s knowledge,
clear that there are many examples where humans agreed            like those identified in regard to proper nouns and cardinal
that adjectives describing habitats are just as important as      directions. It is also easy to see which knowledge IBID used
the habitat itself. Descriptive words like “dense mangrove        to extract information. The use of an ontology also allows
swamps” and “dry savannahs” might be difficult for IBID to        IBID and its users to leverage the relationships that are
parse because they are compound terms containing a de-            found for downstream inferencing tasks. The use of the
scriptive word followed by a habitat word. This issue could       standard OWL file format also allows users to edit the
be addressed by extending IBID’s parser to include adjec-         knowledge using tools like Protégé.
tives that might describe an environment term.
   One thing IBID does really well is identify the verb pred-
icates from a sentence. Verbs like “prefer”, “occur”, and                               Conclusions
“find”, occur frequently with environment related phrases         IBID demonstrates the effectiveness of knowledge-based
that were marked by the human participants. For example,          methods in augmenting scientific document understanding
in Passage 3 (McTighe 2011), IBID identifies the phrase,          and moves us towards a true virtual librarian. IBID’s use of
   tenrecs are found in schlerophyllous and montane for-          standardized ontologies allows it to quickly gain a deeper
   ests and adjacent areas at elevations of 1550 to 1800 m.       understanding of a new domain, without the need to acquire
where the verb used to identify this phrase is “find”.            lots of new data or to spend time learning a complex model.
Although the specific environment terms don’t map to con-         This ability also allows IBID to be extensible. The Environ-
cepts in the ontology, IBID was able to extract this infor-       ment Ontology was a working example, but the same pro-
mation.                                                           cess can be applied to new ontologies, thus growing IBID’s
   There are sentences where IBID identified information          understanding capability. These abilities allow IBID to fa-
that was right, but the term used to do so was not. For exam-     cilitate higher level tasks like document comparison, which
ple, in the sentence from en.wikipedia.org/wiki/Elephant,         can help users of IBID compare and contrast different ap-
                                                                  proaches to their engineering problem. We acknowledge
  The African bush elephant can be found in habitats as           that there is a need for augmenting the analysis and filling
  diverse as dry savannahs, deserts, marshes, and lake
                                                                  the gaps in IBID’s knowledge, but the use of knowledge-
  shores, and in elevations from sea level to mountain ar-
  eas above the snow line.                                        based methods helps the user to efficiently identify these
IBID pulled out the word “bush” instead of one of the envi-       gaps and easily make modifications or add extra processing.
ronment terms, even though “bush” is just part of the species
name. This means that in passages where the name of an
animal is an environment concept, IBID may pull out a false
positive.
  Finally, there were cases where humans identified vague
habitat phrases like “under a tree,” or “near water sources”
                           Acknowledgements
We are grateful to Julian Vincent for sharing his ontology of biological
systems; IBID’s structure ontology is a subset of his structure ontology. We
thank Pablo Boserman and Daniel Dias for their work on making the IBID system
functional, as well as members of the Georgia Tech Design & Intelligence Lab
for assisting with the IBID experiment including the evaluation of the
interactive system.

                              References
Abgaz, Y., Chaudhry, E., O’Donoghue, D., et al. 2017. Characteristics of
pro-c analogies and blends between research publications. In Proceedings of
the 8th International Conference on Computational Creativity, 1–8.
Beynus, J. 1997. Biomimicry: Innovation Inspired by Nature. New York: Harper
Perennial. doi.org/10.1002/inst.12116
Buttigieg, P., Morrison, N., Smith, B., et al. 2013. The environment
ontology: contextualising biological and biomedical entities. Journal of
Biomedical Semantics, 4(1):43. doi.org/10.1186/2041-1480-4-43
Chandrasekaran, B. 1994. Functional Representation: A Brief Historical
Perspective. Applied AI, 8(2): 173-197. doi.org/10.1080/08839519408945438
Chandrasekaran, B., Goel, A., & Iwasaki, Y. 1993. Functional representation
as design rationale. IEEE Computer, 48-56. doi.org/10.1109/2.179157
Chandrasekaran, B., Josephson, J., Benjamins, V. 1999. What are ontologies
and why do we need them? Intelligent Systems and their Applications. IEEE
Intelligent Systems, 14(1):20–26. doi.org/10.1109/5254.747902
Chen, Z., & Zhang, Z. 2020. Recent progress in beetle-inspired
superhydrophilic-superhydrophobic micropatterned water-collection materials.
Water Science & Technology 82(2): 207–226. doi.org/10.2166/wst.2020.238
Goel, A. 2013a. Biologically Inspired Design: A New Program for Computational
Sustainability, IEEE Intelligent Systems, 28(3), 80-84.
Goel, A. 2013b. One Thirty Year Long Case Study; Fifteen Principles:
Implications of an AI Methodology for Functional Modeling. Artificial
Intelligence for Engineering Design Analysis and Manufacturing, 27(3):
203-215, 2013. doi.org/10.1017/S0890060413000218
Goel, A., Hagopian, K., Zhang, S., & Rugaber, S. 2020. Towards a Virtual
Librarian for Biologically Inspired Design. In Proceedings of the 9th
International Conference on Design Computing and Cognition, 377-396.
Goel, A., Mahesh, K., Peterson, J., & Eiselt, K. 1996. Unification of
Language Understanding, Device Comprehension and Knowledge Acquisition. In
Proceedings of the 10th Knowledge Acquisition for Knowledge-Based Systems
Workshop, Banff, Canada.
Goel, A., McAdams, D., & Stone, R. 2014. Biologically Inspired Design:
Computational Methods and Tools. London: Springer-Verlag.
doi.org/10.1007/978-1-4471-5248-4
Goel, A., Rugaber, S., & Vattam, S. 2009. Structure, behavior, and function
of complex systems: The structure, behavior, and function modeling language.
Artificial Intelligence for Engineering Design Analysis and Manufacturing
23(1):23-35. doi.org/10.1017/S0890060409000080
Guarino, N.; Oberle, D.; & Staab, S. 2009. What is an ontology? Handbook on
Ontologies, Edited by R. Studer. 1-17. Berlin: Springer-Verlag Berlin
Heidelberg. doi.org/10.1007/978-3-540-92673-3_0
Helms, M., & Goel, A. 2014. The Four-Box Method: Problem Formulation and
Analogy Evaluation in Biologically Inspired Design. Journal of Mechanical
Design, 136(11):111106. doi.org/10.1115/1.4028172
Helms, M., Vattam, S., & Goel, A. 2010. The effects of functional modeling on
understanding complex biological systems. In Proceedings of the ASME 2010
International Design Engineering Technical Conferences and Computers and
Information in Engineering Conference. Montreal: American Society of
Mechanical Engineers, 107-115. doi.org/10.1115/DETC2010-28939
Khoo, C., Kornfilt, J., Oddy, R., & Myaeng, S-H. 1998. Automatic Extraction
of Cause-Effect Information from Newspaper Text Without Knowledge-based
Inferencing. Literary and Linguistic Computing, 13(4):177–186.
doi.org/10.1093/llc/13.4.177
Kruiper, R.; Vincent, J.; Chen-Burger, J. & Desmulliez, M. 2017. Towards
identifying biological research articles in computer-aided biomimetics. In
Proceedings of the Conference on Biomimetic and Biohybrid Systems, 242–254.
Springer. doi.org/10.1007/978-3-319-63537-8_21
Lavrac, N., Jursic, M., Sluban, et al. 2019. Bisociative Knowledge Discovery
for Cross-domain Literature Mining. In Computational Creativity, edited by
T. Veale & A. Cardoso, 121-139. Springer. doi.org/10.1007/978-3-319-43610-4_6
Macaulay, D. 1988. The Way Things Work. Boston, MA: Houghton Mifflin Company.
McGuinness, D., & Van Harmelen, F. 2004. OWL web ontology language overview.
W3C recommendation, 10(10).
McTighe, L. 2011. Hemicentetes nigriceps. Available at
animaldiversity.org/accounts/Hemicentetes_nigriceps/
Musen, M.A. 2015. The Protégé project: A look back and a look forward. AI
Matters. Association of Computing Machinery Specific Interest Group in
Artificial Intelligence, 1(4): 4-12. doi.org/10.1145/2757001.2757003
Nagel, J., Stone, R., & McAdams, D. 2010. An engineering-to-biology thesaurus
for engineering design. In Proceedings of the ASME 2010 International Design
Engineering Technical Conferences and Computers and Information in
Engineering Conference. Montreal: American Society of Mechanical Engineers.
doi.org/10.1115/DETC2010-28233
Naidu, S., & Hattingh, J. 1988. Water Balance and Osmoregulation in
Physadesmia Globosa, a Diurnal Tenebrionid Beetle from the Namib Desert.
Journal of Insect Physiology 34(10): 911-917.
doi.org/10.1016/0022-1910(88)90126-6
Norgaard, T., & Dacke, M. 2010. Fog-basking behavior and water collection
efficiency in Namib Desert Darkling beetles. Frontiers of Zoology 7(1), 1.
doi.org/10.1186/1742-9994-7-23
Peterson, J., Mahesh, K., Goel, A. 1994. Situating natural language
understanding within experience-based design. International Journal of
Human-Computer Studies 41(6): 881-913. doi.org/10.1006/ijhc.1994.1085
Prabhakar, S., & Goel, A. 1998. Functional modeling for enabling
adaptive design of devices for new environments. AI in Engineer-
ing 12(4): 417–444. doi.org/10.1016/S0954-1810(98)00003-X
Prieto-Diaz, R. 1991. Implementing faceted classification for
software reuse. Communications of the ACM, 34(5): 88-97.
doi.org/10.1145/103167.103176
Rugaber, S., Bhati, S., Goswami, et al. 2016. Knowledge Extrac-
tion and Annotation for Cross-Domain Textual Case-Based Rea-
soning in Biologically Inspired Design. In Proceedings of the 24th
International Conference on Case-Based Reasoning, 342-355.
doi.org/10.1007/978-3-319-47096-2_23
Shu, L. 2010. A Natural-language approach to biomimetic design.
Artificial Intelligence for Engineering Design Analysis and Manu-
facturing, 24(4):507–519. doi.org/10.1017/S0890060410000363
Smith, B.; Ashburner, M.; Rosse, C. et al. 2007. The OBO
Foundry: coordinated evolution of ontologies to support biomedi-
cal data integration. Nature Biotechnology, 25(11):1251–1255.
doi.org/10.1038/nbt1346
Szalay, J. 2014. Facts about Cobras. Live Science. 19 December,
2014. Available at https://www.livescience.com/43520-cobra-
facts.html.
Vattam, S., & Goel, A. 2011. Foraging for inspiration: Understand-
ing and supporting the information seeking practices of biologi-
cally inspired designers. In Proceedings of the ASME 2011 Inter-
national Design Engineering Technical Conferences and Comput-
ers and Information in Engineering Conference.
doi.org/10.1115/DETC2011-48238
Vattam, S., & Goel, A. 2013. Seeking Bioinspiration Online: A
Descriptive Account. In Proceedings of the 19th International Con-
ference on Engineering Design, 517-526.
Vincent, J. 2014. An ontology of biomimetics. In Biologically In-
spired Design: Computational Methods and Tools, edited by A.
Goel, D. McAdams & R. Stone, 269-285, London: Springer.
doi.org/10.1007/978-1-4471-5248-4_11
Vincent, J., & Mann, D. 2002. Systematic technology transfer from
biology to engineering. Philosophical Transactions of the Royal
Society of London A: Mathematical, Physical and Engineering Sci-
ences, 360(1791): 159-173. doi.org/10.1098/rsta.2001.0923
Web Ontology Language. 2011. By the OWL Working Group.
Available from www.w3.org/OWL/. Published 11 December
2011.