<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neguess: Wikidata-entity guessing game with negative clues</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aditya Bikram Biswas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiba Arnaout</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simon Razniewski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Max Planck Institute for Informatics</institution>
          ,
          <addr-line>Saarland</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present Neguess, an entity-guessing game with a unique emphasis on challenging negative clues. The clues are automatically generated using the peer-based negation inference methodology [3]. The game i) can serve as an entertaining way to familiarize participants with the novel area of explicit negative knowledge in open-world knowledge bases; and ii) has the potential to be adopted in pedagogical approaches, such as game-based teaching practices. The demo is available at: https://neguess.mpi-inf.mpg.de.</p>
      </abstract>
      <kwd-group>
        <kwd>Negation</kwd>
        <kwd>RDF</kwd>
        <kwd>Knowledge Bases</kwd>
        <kwd>Wikidata</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Knowledge bases (KBs) operate under the open-world assumption (OWA),
meaning that statements asserted in them, in the form of (subject; predicate; object),
are true, like (Denmark; member of; European Union), and statements not
asserted are unknown, like (Iceland; member of; European Union). Given that
existing web-scale KBs are far from complete, it is not realistic to assume that
absent information is false. It is also not realistic to add every possible negation
to the KB (e.g., more than 280k actors with no Oscars<sup>1</sup>). For this reason, we have
seen a rising interest in augmenting open-world KBs with useful negative
statements. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], interesting negations are inferred about a given entity based on
observations made on similar entities. For instance, Iceland is a European
country like Denmark; however, the KB has no statement asserting
Iceland's membership in the European Union. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], an anti-KB containing common
factual mistakes has been built by mining Wikipedia edit logs. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the
focus is on obtaining meaningful negative information in commonsense KBs.
      </p>
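      <p>The difference between the open-world reading used here and a closed-world reading can be made concrete with a small sketch. The Python snippet below is only an illustration under stated assumptions: the toy KB and the function names owa_lookup and cwa_lookup are hypothetical and are not part of Neguess or Wikidata.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Toy KB holding the single asserted statement from the running example.
# Hypothetical data structure: a set of (subject, predicate, object) triples.
KB = {("Denmark", "member of", "European Union")}


def owa_lookup(kb, statement):
    """Open-world reading: absent statements are unknown, not false."""
    return "true" if statement in kb else "unknown"


def cwa_lookup(kb, statement):
    """Closed-world reading: absent statements are assumed to be false."""
    return "true" if statement in kb else "false"


denmark = ("Denmark", "member of", "European Union")
iceland = ("Iceland", "member of", "European Union")

print(owa_lookup(KB, denmark))  # asserted statement
print(owa_lookup(KB, iceland))  # absent statement, open-world reading
print(cwa_lookup(KB, iceland))  # absent statement, closed-world reading
```
]]></preformat>
      <p>Under the open-world reading, only an explicit negative statement can move the Iceland example from unknown to false, which is the gap that negative statements are meant to fill.</p>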
      <p>
        Neguess (short for "entity-guessing game with negative clues") builds on
the methodology introduced in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and shows multiple-choice guessing cards,
where the clues are entirely negated assertions, i.e., properties not satisfied by
the correct answer. For every guessing card, it i) picks a random entity as the
right answer, ii) retrieves similar entities as wrong answers (e.g., other countries
from the same continent), and iii) compiles challenging negative clues that are
mostly, or fully, applicable to the correct entity.
      </p>
      <sec id="sec-1-1">
        <title>1 https://w.wiki/3ZB9</title>
        <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
        <p>
          Peer-based negation inference. Neguess relies on the so-called peer-based
inference methodology [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to compile interesting negative statements. In particular,
given an entity e from a KB, the method:
1. Collects e's peers using a predefined similarity function (e.g.,
embedding-based similarity [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]). Peer grouping is based on three different functions:
(i) structured facets of entity e [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], for instance, people sharing the same
occupation, nationality, or a similar field of work, (ii) graph-based measures like
distance or connectivity [
          <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
          ], expressed as the number of predicate-object
pairs two entities share, and (iii) cosine similarity based on Wikipedia
embeddings [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>2. Produces a set of candidate negations (i.e., statements that are asserted in the
KB for at least one peer, but not for e).
3. Scores the set of candidates using various ranking metrics (e.g., frequency and
unexpectedness). The need for ranking stems from the very large set
of correct negative statements inferred. For example, a person entity is not
married to millions of people.</p>
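        <p>The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [3]: the toy KB, the hand-picked peer list, and the helper names candidate_negations and rank_by_peer_frequency are assumptions, and relative peer frequency stands in for the richer ranking metrics mentioned in step 3.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Toy KB (illustrative only): entity -> set of (predicate, object) pairs.
KB = {
    "Denmark": {("member of", "European Union"), ("continent", "Europe")},
    "Sweden": {("member of", "European Union"), ("continent", "Europe")},
    "Iceland": {("continent", "Europe")},
}


def candidate_negations(kb, entity, peers):
    """Step 2: statements asserted for at least one peer but not for the entity.

    Returns a Counter mapping each candidate to the number of peers asserting it.
    """
    counts = Counter()
    for peer in peers:
        for statement in kb[peer] - kb[entity]:
            counts[statement] += 1
    return counts


def rank_by_peer_frequency(counts, num_peers):
    """Step 3 (simplified): score a candidate by the fraction of peers asserting it."""
    scored = [(statement, n / num_peers) for statement, n in counts.items()]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Step 1 is assumed to have happened already: the peers are given by hand here.
peers = ["Denmark", "Sweden"]
ranked = rank_by_peer_frequency(candidate_negations(KB, "Iceland", peers), len(peers))
print(ranked)
```
]]></preformat>
        <p>In this toy setting the only candidate for Iceland is NOT (member of; European Union), asserted for both peers. On a real KB the candidate set is far larger, which is why the ranking step matters.</p>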
        <p>
          Further details about the methodology are in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>System Overview</title>
      <p>
        Neguess cards. We make use of the method, described briefly in Section 1, to
generate three challenging negative clues for Wikidata [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] entities of diverse
types. A challenging negative clue is an inferred negative statement
with a high score. Figure 1 shows a sample game card. Players can pick the type
of entities to guess (1); pick the similarity function to be used for collecting the
peers (i.e., the multiple options) (2); and pick the difficulty of the clues (3). Here,
difficulty reflects how unique the clues are to the correct answer. In the sample card,
the multiple options are the famous world leaders Roosevelt, Napoleon, and
Lincoln (5). They have been chosen as peers because they share the occupation
"statesperson" (relying on structured facets of the subject [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). In this case, the
difficulty is set to easy, which means that 2/3 of the clues are unique to the
correct answer, making it somewhat more distinguishable. The clues are shown in
two possible structured forms: i) (p; none), e.g., (educated at; none) and ii) NOT
(p; o), e.g., NOT (manner of death; natural causes). Unlike the others, Lincoln
was shot in the famous theatre incident. He is also known as one of the few
American presidents with no formal education. The third clue does not contribute to
the answer and is there to confuse the player, as none of them is Lutheran.
Moreover, players can track their progress in the game (6). Finally, players can
report a card if it contains any incorrect negations or technical problems (7).
      </p>
      <p>
Implementation and web interface. The Neguess front-end, i.e., the web
interface, is developed using React JS<sup>2</sup>, a JavaScript library for building user interfaces.
The back-end is developed using Spring Boot<sup>3</sup> with Java, running on an Apache
Tomcat server. We use PostgreSQL to create and manage our database. It stores
around 3m negative clues about 40k popular Wikidata entities of 5 diverse
types, namely, people, countries, literary works, organizations, and businesses.
Neguess runs on a server with a capacity of 1 TB and 8 GB of RAM. Retrieving
a guessing card takes 3 s on average.
      </p>
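      <p>The two clue forms and the notion of difficulty can be illustrated with a short Python sketch. It is a hedged example rather than the Neguess back-end (which is written in Java): the function names, the toy statements, and the predicate name "religion" for the third clue are assumptions made for illustration, and the toy data is not taken from Wikidata.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def format_clue(predicate, obj=None):
    """Render a clue as '(p; none)' or as 'NOT (p; o)'."""
    if obj is None:
        return f"({predicate}; none)"
    return f"NOT ({predicate}; {obj})"


def clue_holds(statements, predicate, obj=None):
    """A negative clue holds for an entity if the matching statement is absent."""
    if obj is None:
        return all(p != predicate for p, _ in statements)
    return (predicate, obj) not in statements


def unique_fraction(kb, answer, options, clues):
    """Fraction of clues that hold for the answer and for none of the other options."""
    others = [option for option in options if option != answer]
    unique = sum(
        1
        for clue in clues
        if clue_holds(kb[answer], *clue)
        and not any(clue_holds(kb[other], *clue) for other in others)
    )
    return unique / len(clues)


# Toy statements (illustrative only), modelled on the sample card in the text.
KB = {
    "Lincoln": {("manner of death", "homicide")},
    "Roosevelt": {("educated at", "Harvard University"),
                  ("manner of death", "natural causes")},
    "Napoleon": {("educated at", "École Militaire"),
                 ("manner of death", "natural causes")},
}
# Each clue is a (predicate, object) pair; object None encodes the (p; none) form.
CLUES = [
    ("educated at", None),
    ("manner of death", "natural causes"),
    ("religion", "Lutheranism"),  # assumed predicate name for the third clue
]

for clue in CLUES:
    print(format_clue(*clue))
print(unique_fraction(KB, "Lincoln", list(KB), CLUES))
```
]]></preformat>
      <p>With this toy data, two of the three clues single out Lincoln, which matches the 2/3 ratio described for the easy setting. How the other difficulty levels map to ratios is not specified here, so the sketch does not encode them.</p>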
    </sec>
    <sec id="sec-3">
      <title>Demonstration Experience</title>
      <p>Can you neguess? A player, who is very confident of her knowledge about
countries, chooses the type "country" with difficulty "hard". She gets two
consecutive cards, shown in Figure 2. The focus of the first card is countries of
Central and South America. She knows that the main emergency number in
Argentina is 911, so she immediately disregards this country as the answer. She is
certain that Chile does not share a border with Colombia, so Chile is a likely
option as the card's answer. She clicks on the Central American Bank for
Economic Integration and is led to the Wikidata (and then Wikipedia) page of the
institution. She finds out that Guatemala is one of the founding members. She
clicks on Chile as her final (and correct) pick!
Her second card covers Gulf countries. She does not know which electric plug
type these countries use, so this clue is not helpful to her. She is certain that
none are in Africa. However, the first clue confuses her the most. They are all
countries known for their oil production, so how is it possible that (at least)
one of them is not a member of OPEC?</p>
      <sec id="sec-3-1">
        <title>2 https://reactjs.org/ 3 https://spring.io/projects/spring-boot</title>
        <p>
          Hesitant about these clues, she picks
Bahrain as a lucky guess. She answers correctly, but is still not sure which clues
are applicable. She checks Bahrain's Wikidata page and does not find the OPEC
membership. She googles the fact and learns that Bahrain is not a member of
OPEC but of OPEC+, a grouping that includes non-OPEC countries which export crude oil.
Beyond fun and games. Neguess can be used to understand the peer-based
negation inference method it is based on [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. By choosing embeddings [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] as a
similarity function, for instance, countries which have latent shared information
start to appear together in a guessing card (for instance, U.S. and Russia). On
the other hand, when the peering function is changed to graph-based measures
(computed as the number of p-o pairs entities share), countries which share a lot
of geographical information start to appear together (for instance, U.S. and
Mexico). In addition, Neguess could be used as an entertaining tool to find and
understand modelling issues in Wikidata. One clue on a person card featuring
three famous computer scientists is NOT (field of work; computer science). This
is clearly an incorrect card that must be reported. Moreover, digging deeper into
the reason this card was generated, we find that two of these computer scientists
had Informatics and Information Technology as their field of work. Finally, we
use the game to gather feedback on the correctness of the inferred negations. A
player can flag a card and add her comment on the informativeness or correctness
of the clues. In future work, we would like to give players more opportunity to
give feedback (e.g., flagging individual clues, or correcting clues if they wish to).
        </p>
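        <p>The effect of switching the peering function can be reproduced on toy data. The Python sketch below is an assumption-laden illustration: the statements and the two-dimensional vectors are made up (they are not Wikidata statements or Wikipedia2Vec embeddings), and the function names are hypothetical. It only shows how counting shared p-o pairs and cosine similarity can rank the same candidates differently.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def shared_po_pairs(statements_a, statements_b):
    """Graph-based measure: number of predicate-object pairs two entities share."""
    return len(statements_a & statements_b)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_peers(entity, candidates, score, k=1):
    """Return the k candidates most similar to the entity under the given score."""
    others = [c for c in candidates if c != entity]
    return sorted(others, key=lambda c: (-score(entity, c), c))[:k]


# Made-up toy data, chosen only to mirror the behaviour described in the text.
STATEMENTS = {
    "U.S.": {("continent", "North America"), ("member of", "United Nations")},
    "Mexico": {("continent", "North America"), ("member of", "United Nations")},
    "Russia": {("continent", "Europe"), ("continent", "Asia"),
               ("member of", "United Nations")},
}
VECTORS = {"U.S.": [1.0, 0.0], "Russia": [0.9, 0.1], "Mexico": [0.2, 0.9]}


def graph_score(a, b):
    return shared_po_pairs(STATEMENTS[a], STATEMENTS[b])


def embedding_score(a, b):
    return cosine_similarity(VECTORS[a], VECTORS[b])


print(top_peers("U.S.", STATEMENTS, graph_score))
print(top_peers("U.S.", STATEMENTS, embedding_score))
```
]]></preformat>
        <p>Passing the score as a function keeps the peer selection independent of the similarity measure, so a facet-based function could be plugged in the same way.</p>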
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>
        In order to compile the set of negative clues for this game, the peer-based
methodology infers useful negative statements by assuming completeness in parts
of the KB, namely within peer groups. Although this approach outperformed
baseline methods in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], inferences (i.e., clues) may still be incorrect. At the
moment, we allow players to flag cards as incorrect, and would like to use this
feedback in the future to affect whether erroneous cards are displayed or disregarded. In
addition, we understand that wrapping up the negative statements in a game setting
does not allow users to inspect specific entities of interest. Another platform,
built upon the same research work, has been published recently, where users can
explore useful negations through entity summarization and structured
question answering interfaces [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Acknowledgments. This work is supported by the German Science
Foundation (DFG: Deutsche Forschungsgemeinschaft) by grant 4530095897: "Negative
Knowledge at Web Scale".</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Arnaout</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razniewski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.Z.</given-names>
          </string-name>
          :
          <article-title>Wikinegata: a knowledge base with interesting negative statements</article-title>
          .
          <source>PVLDB</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Arnaout</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , et al.:
          <article-title>Negative knowledge for open-world Wikidata</article-title>
          . Wiki Workshop at WWW (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Arnaout</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razniewski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Enriching knowledge bases with interesting negative statements</article-title>
          .
          <source>In: AKBC</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Balaraman</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , et al.:
          <article-title>Recoin: Relative completeness in Wikidata</article-title>
          . Wiki Workshop at WWW (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Karagiannis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.:
          <article-title>Mining an "anti-knowledge base" from Wikipedia updates with applications to fact checking and beyond</article-title>
          . In: VLDB (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Petrova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Entity comparison in RDF graphs</article-title>
          .
          <source>ISWC</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ponza</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferragina</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>C.S.</surname>
          </string-name>
          :
          <article-title>A two-stage framework for computing entity relatedness in Wikipedia</article-title>
          .
          <source>CIKM</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Safavi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koutra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Generating negative commonsense knowledge</article-title>
          .
          <source>KR2ML</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Vrandečić</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Wikidata: A free collaborative knowledge base</article-title>
          .
          <source>CACM</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Yamada</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , et al.:
          <article-title>Wikipedia2Vec: An optimized tool for learning embeddings of words and entities from Wikipedia</article-title>
          .
          <source>EMNLP</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>