<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Frame Semantic Parsing using Framester Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Diego Reforgiato Recupero</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mehwish Alam</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aldo Gangemi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valentina Presutti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNR, ISTC</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Cagliari</institution>
          ,
          <addr-line>Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Université Paris 13</institution>
          ,
          <addr-line>Paris</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces TakeFive, a new algorithm that performs frame semantic parsing using the frame-oriented knowledge graph generated by Framester. TakeFive performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, and runs coercion techniques.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>So-called cognitive computing systems such as Google Now [<xref ref-type="bibr" rid="ref3">3</xref>], SIRI<sup>2</sup>, and IBM Watson<sup>3</sup> have provided strong evidence of what can be achieved with knowledge graphs used as background knowledge. In those cases, the knowledge graphs are proprietary resources represented in proprietary formats. However, a key point of knowledge graphs, including linked data, is to represent entities and their relations, possibly with additional attributes that may support temporal, spatial, and causal inferences. Regardless of format and copyright, existing knowledge graphs share a common limitation: they express facts that lack contextual and situational information. This makes it hard, if not impossible, to go beyond encyclopaedic question answering or limited human-machine interaction tasks.</p>
      <p>The ability to automatically perform frame-semantic parsing of natural language text is a requirement for evolving frame-oriented knowledge graphs. For example, FrameBase [<xref ref-type="bibr" rid="ref4">4</xref>] has shown the usefulness of linguistic frames as a cognitive tool for semantic interoperability. Frame-semantic parsing refers to the combined tasks of frame detection and semantic role labeling on natural language text. Its output can greatly enrich knowledge graphs and semantic interoperability. Let us consider the following sentence from the Wall Street Journal (WSJ) dataset<sup>4</sup>: Despite recent declines in yields, investors continue to pour cash into money funds.</p>
      <p>
        By performing frame-semantic parsing on this sentence, we recognize that the
text fragment to pour evokes, e.g., the frame Cause_motion from FrameNet,
meaning that the sentence provides an occurrence of this frame, and that the
text fragments investors and cash respectively denote the argument of the
role Agent.cause_motion and the argument of the role Theme.cause_motion,
both involved in the Cause_motion situation occurrence. FrameNet, VerbNet
and PropBank are three of the main resources for frames and roles, and are
widely used for Semantic Role Labeling (SRL). This paper proposes a novel
method, called TakeFive, that relies on dependency (instead of categorial)
parsing and on one (or more) reference resources available from the novel linguistic linked
data hub Framester [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We evaluate TakeFive with VerbNet frames and roles
and compare it against existing methods for SRL-based knowledge extraction.
      </p>
      <p><sup>1</sup> The research leading to these results has received funding from the European Union Horizon 2020, the Framework Programme for Research and Innovation (2014-2020), under grant agreement 643808, Project MARIO "Managing active and healthy aging with use of caring service robots".</p>
      <p><sup>2</sup> https://www.apple.com/ios/siri/</p>
      <p><sup>3</sup> https://www.ibm.com/watson/</p>
      <p><sup>4</sup> Available from https://catalog.ldc.upenn.edu/</p>
    </sec>
    <sec id="sec-2">
      <title>TakeFive, Semantic Role Labeling Algorithm</title>
      <p>TakeFive<sup>5</sup> addresses the problem of detecting the verb (lemma and VerbNet
verb class), along with its arguments, and relating them to their
corresponding VerbNet roles. Consider the sentence: The Spaniards conquered the Incas.
Here, our method should be able to detect the verb conquered, the fact that
The Spaniards is the filler of the VerbNet role Conqueror, and the fact that the Incas is
the filler of the VerbNet role Theme. Verbs, fillers and roles are therefore the
entities we are looking for and that we need to properly associate with the input
sentence. The backbone of TakeFive is a two-step approach: (i) preprocessing
the sentence, where syntactic and semantic information is extracted, and (ii)
detecting (CoreNLP-derived, mainly syntactic) interface roles and (VerbNet-based,
mainly semantic) specific roles for a certain frame, and checking the
compatibility between interface and semantically specific roles.</p>
      <p>Step 1: Framester and CoreNLP preprocessing. For a given input sentence, we
collect semantic information from Framester and syntactic information from
Stanford CoreNLP: Word Frame Disambiguation (WFD)<sup>6</sup> allows us to
detect the frames evoked by each verb when the verb is polysemous, whereas
CoreNLP provides a dependency tree along with the POS tags (see Figure 1).
Here, the dependency triple {nsubj, conquered-3, Spaniards-2} relates the verb conquered to
its argument Spaniards. Dependency types such as nsubj and dobj are generalized
to interface roles (e.g., Agent, Undergoer, Recipient, Eventuality, Oblique) to add
a semantic layer on top of the syntactic one, e.g., nsubj → Agent. By applying
our heuristic nsubj → Agent to the dependency triple {nsubj, conquered-3,
Spaniards-2}, we assign the role Agent to the argument Spaniards. As the next step,
we need to check whether the CoreNLP interface role is compatible with the VerbNet
interface role of the underlying verb (conquered in our example).</p>
      <p><sup>5</sup> Further details are available at https://lipn.univ-paris13.fr/framester/en/srl</p>
      <p><sup>6</sup> http://lipn.univ-paris13.fr/framester/</p>
      <p>Figure 1. Dependency tree of the example sentence The Spaniards conquered the Incas (arc labels: det, nsubj, root, dobj, det).</p>
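      <p>The dependency-to-interface-role heuristic of Step 1 can be sketched as follows. This is a minimal illustration and not the TakeFive implementation: it contains only the two mappings used in the running example (nsubj to Agent and dobj to Undergoer), and the table and function names are hypothetical.</p>
      <preformat>
```python
# Illustrative only: these two mappings are the ones used in the running
# example; the full heuristic table of TakeFive is not reproduced here.
DEP_TO_INTERFACE_ROLE = {
    "nsubj": "Agent",
    "dobj": "Undergoer",
}


def assign_interface_roles(triples):
    """Map (dependency, head, dependent) triples to (head, dependent, role).

    Dependencies without a known mapping (e.g. det) are skipped.
    """
    assigned = []
    for dep, head, dependent in triples:
        role = DEP_TO_INTERFACE_ROLE.get(dep)
        if role is not None:
            assigned.append((head, dependent, role))
    return assigned


# Dependency triples for "The Spaniards conquered the Incas".
triples = [
    ("det", "Spaniards-2", "The-1"),
    ("nsubj", "conquered-3", "Spaniards-2"),
    ("dobj", "conquered-3", "Incas-5"),
    ("det", "Incas-5", "the-4"),
]
roles = assign_interface_roles(triples)
print(roles)
```
      </preformat>
      <p>In this sketch the det arcs are ignored, so only the subject and the object of conquered receive an interface role.</p>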
      <p>Step 2: Compatibility between CoreNLP and VerbNet interface roles. TakeFive
introduces an algorithm for checking the compatibility between the CoreNLP
interface roles and the VerbNet roles with respect to a verb occurring in a sentence. The
first part of the algorithm takes as input a sentence, along with the CoreNLP and
Framester information for that sentence, and generates pairs of VerbNet
interface roles and VerbNet specific roles. Due to space constraints, we
explain the algorithm directly on our example sentence. Consider the two dependency triples
(Listing 1 from https://lipn.univ-paris13.fr/framester/en/srl) {nsubj,
conquered-3, Spaniards-2} and {dobj, conquered-3, Incas-5}. Using our
heuristics, we assign the CoreNLP interface roles Agent and Undergoer to Spaniards
and Incas, respectively. The VerbNet sense of the verb conquered is Conquer_42030000,
and the returned pairs (VerbNet interface role, VerbNet specific role) are (Agent,
Agent.conquer_42030000) and (Eventuality, Event.conquer_42030000).</p>
      <p>The
second part of the algorithm checks the compatibility of the CoreNLP interface roles
detected using the heuristics defined in Step 1 with the VerbNet interface roles
detected in the first part of the algorithm. The objective here is to return all
roles and fillers for each argument of the verbs in the input sentence. For our
example, the CoreNLP interface role Agent is equal to the VerbNet
interface role and is returned. The same applies to the CoreNLP interface role
Undergoer: Patient.conquer_42030000 is the VerbNet specific role that
is matched, and the role Patient is returned. Therefore, the final output
contains the role Agent for the argument Spaniards and the role Patient
for the argument Incas.</p>
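      <p>The compatibility check of Step 2 can be sketched as below, assuming that compatibility means equality of interface roles, as in the running example. The lookup table is a hypothetical stand-in for the Framester query, and its Undergoer entry is inferred from the matched role Patient rather than taken from the listed pairs.</p>
      <preformat>
```python
# Hypothetical lookup table standing in for the Framester query. Only the
# example verb sense is listed; the Undergoer entry is an assumption
# inferred from the role Patient matched in the running example.
VERBNET_ROLE_PAIRS = {
    "Conquer_42030000": [
        ("Agent", "Agent.conquer_42030000"),
        ("Undergoer", "Patient.conquer_42030000"),
        ("Eventuality", "Event.conquer_42030000"),
    ],
}


def compatible_roles(verb_sense, corenlp_roles):
    """Return (filler, role) pairs whose CoreNLP interface role equals a
    VerbNet interface role of the given verb sense."""
    by_interface = dict(VERBNET_ROLE_PAIRS.get(verb_sense, []))
    output = []
    for filler, interface_role in corenlp_roles:
        specific = by_interface.get(interface_role)
        if specific is not None:
            # Keep the role name before the dot, e.g. "Patient".
            output.append((filler, specific.split(".")[0]))
    return output


result = compatible_roles(
    "Conquer_42030000",
    [("Spaniards", "Agent"), ("Incas", "Undergoer")],
)
print(result)
```
      </preformat>
      <p>Fillers whose interface role has no counterpart for the verb sense are simply dropped in this sketch; a fuller version might fall back to other compatibility criteria.</p>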
    </sec>
    <sec id="sec-3">
      <title>Performance Evaluation</title>
      <p>
        Several experiments were conducted to test the performance of TakeFive,
and the results were compared with several existing tools such as SEMAFOR,
FRED, Pikes and PathLSTM. We have recently presented FRED [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as a
machine reader that produces frame-based knowledge graphs. We combined FRED
and TakeFive by adding the VerbNet roles and fillers extracted by FRED
to the results of TakeFive whenever the latter extracts no role information
for a particular filler, which is generally caused by the complexity of the sentence
grammar. If FRED detects a VerbNet role for a particular filler which
has not been detected by TakeFive, it is likely to be a correct pair thanks
to the Combinatory Categorial Grammar theory which FRED is built upon.
      </p>
      <p>The data set used for this purpose was the WSJ section of the Penn
Treebank PropBank annotated with VerbNet and PropBank annotations<sup>7</sup>. These
annotations indicate the VerbNet and PropBank roles associated with each verb
of each sentence contained in the dataset and related to each filler. The
evaluation was conducted as follows: each pair (role, filler)
returned by our approach was verified against the gold standard
annotations related to the same sentence and the same verb. For each pair, the
produced output contains (role<sub>OUT</sub>, filler<sub>OUT</sub>). This output was compared with
the annotated pairs (role<sub>ANN</sub>, filler<sub>ANN</sub>) and assigned a weighted score defined as
follows: if role<sub>OUT</sub> = role<sub>ANN</sub> and filler<sub>OUT</sub> = filler<sub>ANN</sub>, we assign 1; if
role<sub>ANN</sub> ≠ role<sub>OUT</sub> but either there exists a subsumption relation between
them or they are siblings, and filler<sub>OUT</sub> = filler<sub>ANN</sub>, then we assign a score
of either 0.5 or 0.25. Otherwise, the weighted score has a value of 0.</p>
      <p>We
performed a precision-recall analysis as follows: (i) true positives are counted when
the weighted score for a pair is greater than 0, (ii) false positives are counted when
the weighted score for the pair is equal to 0, (iii) false negatives are counted for
all the annotation pairs that were not successfully retrieved by a given method,
(iv) true negatives are represented by all the pairs (role, filler) not retrieved by
the algorithm for which there is no annotation. Table 1 shows the comparison
between our approach and its competitors.
      </p>
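      <p>The scoring scheme can be sketched as follows. Two points are assumptions rather than statements of the paper: subsumption is given 0.5 and siblings 0.25 (the text only says "either 0.5 or 0.25"), and F1 is taken to be the usual harmonic mean of precision and recall. The related callback and the toy sibling relation are hypothetical.</p>
      <preformat>
```python
def weighted_score(out_pair, ann_pair, related):
    """Score one output (role, filler) pair against one annotated pair.

    `related` is a hypothetical callback returning "subsumption",
    "sibling", or None for two distinct roles.
    """
    out_role, out_filler = out_pair
    ann_role, ann_filler = ann_pair
    if out_filler != ann_filler:
        return 0.0
    if out_role == ann_role:
        return 1.0
    # Assumption: 0.5 for subsumption and 0.25 for siblings.
    partial = {"subsumption": 0.5, "sibling": 0.25}
    return partial.get(related(out_role, ann_role), 0.0)


def precision_recall_f1(scores, num_missed):
    """`scores` holds one weighted score per returned pair; `num_missed`
    counts annotated pairs that were not retrieved (false negatives)."""
    tp = sum(1 for s in scores if s != 0)  # scores are never negative
    fp = len(scores) - tp
    precision = tp / (tp + fp) if scores else 0.0
    recall = tp / (tp + num_missed) if tp + num_missed else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def toy_related(role_a, role_b):
    """Toy relation used only for this example."""
    return "sibling" if {role_a, role_b} == {"Patient", "Theme"} else None


scores = [
    weighted_score(("Agent", "Spaniards"), ("Agent", "Spaniards"), toy_related),
    weighted_score(("Theme", "Incas"), ("Patient", "Incas"), toy_related),
    weighted_score(("Agent", "Incas"), ("Patient", "Incas"), toy_related),
]
precision, recall, f1 = precision_recall_f1(scores, num_missed=1)
print(scores, precision, recall, f1)
```
      </preformat>
      <p>Note that a partially correct pair counts as a full true positive in the precision-recall analysis, while the weighted score keeps the distinction between exact and partial matches.</p>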
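      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Method</th><th>Weighted Score</th><th>Precision</th><th>Recall</th><th>F1</th></tr>
          </thead>
          <tbody>
            <tr><td>TakeFive</td><td>0.174</td><td>0.156</td><td>0.22</td><td>0.185</td></tr>
            <tr><td>TakeFive + FRED</td><td>0.193</td><td>0.176</td><td>0.201</td><td>0.191</td></tr>
            <tr><td>SEMAFOR</td><td>0.050</td><td>0.038</td><td>0.031</td><td>0.034</td></tr>
            <tr><td>Pikes</td><td>0.181</td><td>0.155</td><td>0.122</td><td>0.137</td></tr>
            <tr><td>FRED</td><td>0.066</td><td>0.052</td><td>0.080</td><td>0.063</td></tr>
            <tr><td>PathLSTM</td><td>0.101</td><td>0.095</td><td>0.094</td><td>0.094</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>This paper introduces a new algorithm for semantic role labeling, TakeFive,
which aims at detecting verbs and their associated arguments. Our
experiments show that the proposed approach achieves a higher F1 than the compared state-of-the-art
systems for semantic role labeling (Table 1). Ongoing work focuses on defining a strategy
to combine the existing methods for performance improvements.</p>
      <p><sup>7</sup> https://github.com/ibeltagy/pl-semantics/blob/master/resources/semlink-1.2.2c/1.2.2c.okay</p>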
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asprino</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Framester: A wide coverage linguistic linked data hub</article-title>
          .
          <source>In: EKAW 2016</source>
          . pp.
          <fpage>239</fpage>
          –
          <lpage>254</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuzzolese</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Draicchio</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mongiovì</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semantic web machine reading with FRED</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>6</issue>
          ),
          <fpage>873</fpage>
          –
          <lpage>893</lpage>
          (
          <year>2017</year>
          ), https://doi.org/10.3233/SW-160240
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghunathan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srikant</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>User modeling for a personal assistant</article-title>
          .
          <source>WSDM '15</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
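          <string-name>
            <surname>Rouces</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Melo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,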
          <string-name>
            <surname>Hose</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>FrameBase: Representing n-ary relations using semantic frames</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>