Neguess: Wikidata-entity guessing game with
                  negative clues

          Aditya Bikram Biswas, Hiba Arnaout, and Simon Razniewski

               Max Planck Institute for Informatics, Saarland, Germany
                 {adbiswas,harnaout,srazniew}@mpi-inf.mpg.de


        Abstract. We present Neguess, an entity-guessing game with a unique
        emphasis on challenging negative clues. The clues are automatically
        generated using the peer-based negation inference methodology [3].
        The game i) offers an entertaining way to familiarize participants
        with the novel area of explicit negative knowledge in open-world
        knowledge bases, and ii) has the potential to be adopted in pedagogical
        approaches, like game-based teaching practices.
        The demo is available at: https://neguess.mpi-inf.mpg.de.

        Keywords: Negation · RDF · Knowledge Bases · Wikidata


1     Introduction
Knowledge bases (KBs) operate under the open-world assumption (OWA), meaning
that statements asserted in them, in the form of (subject; predicate; object),
are true, like (Denmark; member of; European Union), and statements not
asserted are unknown, like (Iceland; member of; European Union). Given that
existing web-scale KBs are far from complete, it is not realistic to assume that
absent information is false. It is also not realistic to add every possible negation
to the KB (e.g., more than 280k actors with no Oscars¹). For this reason, we have
seen a rising interest in augmenting open-world KBs with useful negative
statements. In [3], interesting negations are inferred about a given entity based on
observations made on similar entities. For instance, Iceland is a European
country like Denmark; however, the former does not have the statement asserting
its membership in the European Union. In [5], an anti-KB containing common
factual mistakes has been built by mining Wikipedia edit logs. In [8], the
focus is on obtaining meaningful negative information in commonsense KBs.
    Neguess (short for “entity-guessing game with negative clues”) builds on
the methodology introduced in [3], and shows multiple-choice guessing cards,
where the clues are entirely negated assertions, i.e., properties not satisfied by
the correct answer. For every guessing card, it i) picks a random entity as the
right answer, ii) retrieves similar entities as wrong answers (e.g., other countries
from the same continent), and iii) compiles challenging negative clues that are
mostly, or fully, applicable to the correct entity.

¹ https://w.wiki/3ZB9
    Copyright © 2021 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).


                     Fig. 1. An overview of a Neguess round.


Peer-based negation inference. Neguess relies on the so-called peer-based in-
ference methodology [3] to compile interesting negative statements. In particular,
given an entity e from a KB, the method:
1. Collects e’s peers using a predefined similarity function (e.g., embedding-
   based similarity [10]). Peer grouping can be based on three different
   functions: (i) structured facets of entity e [4], for instance, people sharing
   the same occupation, nationality, or field of work; (ii) graph-based measures
   like distance or connectivity [6,7], expressed as the number of
   predicate-object pairs two entities share; and (iii) cosine similarity based
   on Wikipedia embeddings [10].
2. Produces a set of candidate negations (i.e., statements that are asserted in
   the KB for at least one peer, but not for e).
3. Scores the set of candidates using various ranking metrics (e.g., frequency
   and unexpectedness). The need for ranking stems from the very large set
   of correct negative statements inferred. For example, a person-entity is not
   married to millions of people.
Further details about the methodology are in [3] and [2].
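
The three steps above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the implementation from [3]:
the toy KB, the function names, the use of a single shared facet as the peering
function, and the frequency-only score are all assumptions made for the example.

```python
from collections import Counter

# Toy KB (illustrative data only): entity -> set of (predicate, object) pairs.
KB = {
    "Denmark": {("continent", "Europe"), ("member of", "European Union"),
                ("member of", "NATO")},
    "Sweden": {("continent", "Europe"), ("member of", "European Union")},
    "Iceland": {("continent", "Europe"), ("member of", "NATO")},
}


def collect_peers(kb, entity, facet):
    """Step 1 (assumed peering function): entities sharing one structured facet."""
    return [e for e, stmts in kb.items() if e != entity and facet in stmts]


def candidate_negations(kb, entity, peers):
    """Step 2: statements asserted for at least one peer but not for the entity.

    Returns a Counter mapping each candidate to the number of peers asserting it.
    """
    counts = Counter()
    for peer in peers:
        for stmt in kb[peer] - kb[entity]:
            counts[stmt] += 1
    return counts


def rank_by_frequency(counts, num_peers):
    """Step 3 (assumed metric): score = fraction of peers asserting the statement."""
    scored = [(stmt, n / num_peers) for stmt, n in counts.items()]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


peers = collect_peers(KB, "Iceland", ("continent", "Europe"))
ranked = rank_by_frequency(candidate_negations(KB, "Iceland", peers), len(peers))
```

In this toy KB, both peers of Iceland assert (member of; European Union) while
Iceland does not, so that statement is the only candidate negation and receives
the highest possible frequency score.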

2   System Overview
Neguess cards. We make use of the method, described briefly in Section 1, to
generate three challenging negative clues for Wikidata [9] entities from diverse
types. A challenging negative clue is an inferred negative statement
with a high score. Figure 1 shows a sample game card. Players can pick the type
of entities to guess (1); pick the similarity function used for collecting the
peers (i.e., the multiple options) (2); and pick the difficulty of the clues (3). Here,
difficulty reflects how unique the clues are to the correct answer. In Figure 1,
the multiple options are the famous world leaders Roosevelt, Napoleon, and
Lincoln (5). They have been chosen as peers because they share the occupation
“statesperson” (relying on structured facets of the subject [4]). In this case, the
difficulty is set to easy, which means that 2 of the 3 clues are unique to the
correct answer, making it somewhat easier to distinguish. The clues are shown in
two possible structured forms: i) (p; none), e.g., (educated at; none) and ii) NOT
(p; o), e.g., NOT (manner of death; natural causes). Unlike the others, Lincoln
was shot in the famous theatre incident. He is also known as one of the few
American presidents with no formal education. The third clue does not contribute
to the answer and is there to confuse the player, as none of them is Lutheran.
Moreover, players can track their progress in the game (6). Finally, players can
report a card if it contains any incorrect negations or technical problems (7).
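
The two clue forms and the notion of difficulty can be made concrete with a
small sketch. Everything below is an assumption made for illustration rather
than the Neguess code: the class and function names, the toy statements for the
three options (including the placeholder school and the predicate name
"religion"), and the reading of a "unique" clue as one that holds for the
correct answer and for no other option.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Statements = Set[Tuple[str, str]]  # (predicate, object) pairs of one entity


@dataclass(frozen=True)
class NegativeClue:
    predicate: str
    obj: Optional[str] = None  # None encodes the (p; none) form

    def render(self) -> str:
        """Render the clue in one of the two structured forms."""
        if self.obj is None:
            return f"({self.predicate}; none)"
        return f"NOT ({self.predicate}; {self.obj})"

    def holds_for(self, statements: Statements) -> bool:
        """True if the given statements do not contradict the negative clue."""
        if self.obj is None:
            return all(p != self.predicate for p, _ in statements)
        return (self.predicate, self.obj) not in statements


def count_unique_clues(clues: List[NegativeClue], answer: str,
                       options: Dict[str, Statements]) -> int:
    """Count clues that hold for the answer and for no other option."""
    unique = 0
    for clue in clues:
        holders = [name for name, stmts in options.items() if clue.holds_for(stmts)]
        if holders == [answer]:
            unique += 1
    return unique


# Toy statements (illustrative only, not taken from Wikidata).
OPTIONS: Dict[str, Statements] = {
    "Lincoln": {("occupation", "statesperson"), ("manner of death", "homicide")},
    "Roosevelt": {("occupation", "statesperson"), ("educated at", "some school"),
                  ("manner of death", "natural causes")},
    "Napoleon": {("occupation", "statesperson"), ("educated at", "some school"),
                 ("manner of death", "natural causes")},
}
CLUES = [
    NegativeClue("educated at"),
    NegativeClue("manner of death", "natural causes"),
    NegativeClue("religion", "Lutheran"),  # hypothetical predicate name
]

unique_clues = count_unique_clues(CLUES, "Lincoln", OPTIONS)
```

With this toy data, the first two clues hold only for Lincoln and the third
holds for all three options, which matches the easy setting described above
(2 of the 3 clues unique to the correct answer).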

Implementation and web interface. The Neguess front-end (the web interface)
is developed using React JS², a JavaScript library for building user interfaces.
The back-end is developed using Spring Boot³ with Java, running on an Apache
Tomcat server. We use PostgreSQL to create and manage our database. It stores
around 3 million negative clues about 40k popular Wikidata entities of 5 diverse
types, namely, people, countries, literary works, organizations, and businesses.
Neguess runs on a server with a capacity of 1 TB and 8 GB of RAM. Retrieving a
guessing card takes 3 s on average.


3     Demonstration Experience
Can you neguess? A player, who is very confident of her knowledge about
countries, chooses the type “country” with difficulty “hard”. She gets two
consecutive cards, shown in Figure 2. The first card focuses on countries of
Central and South America. She knows that the main emergency number in
Argentina is 911, so she immediately disregards this country as the answer. She is
certain that Chile does not share a border with Colombia, so Chile is a likely
option as the card’s answer. She clicks on the Central American Bank for
Economic Integration and is led to the Wikidata (and then Wikipedia) page of the
institution. She finds out that Guatemala is one of the founding members. She
clicks on Chile as her final (and correct) pick!
Her second card covers Gulf countries. She does not know which electric plug
type these countries use, so this clue is not helpful to her. She is certain that
none are in Africa. However, the first clue confuses her the most. They are all
countries known for their oil production, so how is it possible that (at least)
one of them is not a member of OPEC? Hesitant about these clues, she picks
Bahrain as a lucky guess. She answers correctly, but is still not sure which clues
are applicable. She checks Bahrain’s Wikidata page and does not find the OPEC
membership. She googles the fact and learns that Bahrain is not a member of
OPEC but of OPEC+, a division for non-OPEC countries which export crude oil.

² https://reactjs.org/
³ https://spring.io/projects/spring-boot


                    Fig. 2. Two Neguess cards about countries.

Beyond fun and games. Neguess can be used to understand the peer-based
negation inference method it is based on [3]. For instance, when embeddings [10]
are chosen as the similarity function, countries that share latent information
start to appear together in a guessing card (for instance, U.S. and Russia). On
the other hand, when the peering function is changed to graph-based measures
(computed as the number of p-o pairs that entities share), countries that share a
lot of geographical information start to appear together (for instance, U.S. and
Mexico). In addition, Neguess could be used as an entertaining tool to find and
understand modelling issues in Wikidata. One clue on a person card featuring
three famous computer scientists is NOT (field of work; computer science). This
is clearly an incorrect card that must be reported. Moreover, digging deeper into
the reason this card was generated, we find that two of these computer scientists
have Informatics and Information Technology as their field of work. Finally, we
use the game to gather feedback on the correctness of the inferred negations. A
player can flag a card and add her comment on the informativeness or correctness
of the clues. In future work, we would like to give players more opportunities to
give feedback (e.g., flagging individual clues, or correcting clues if they wish to).
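
The effect of switching the peering function can be illustrated with a short
sketch. The statements and the two-dimensional vectors below are made up for
the example (they are not Wikidata content or Wikipedia2Vec output), and the
function names are assumptions; only the two measures themselves, the number of
shared p-o pairs and cosine similarity, come from the text.

```python
import math


def shared_po_pairs(a, b):
    """Graph-based measure: number of (predicate, object) pairs two entities share."""
    return len(a & b)


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def closest_peer(entity, data, score):
    """Return the other entity that maximizes score(data[entity], data[other])."""
    others = [e for e in data if e != entity]
    return max(others, key=lambda e: score(data[entity], data[e]))


# Toy statements (illustrative only).
STATEMENTS = {
    "U.S.": {("continent", "North America"),
             ("member of", "Organization of American States"),
             ("member of", "United Nations")},
    "Mexico": {("continent", "North America"),
               ("member of", "Organization of American States"),
               ("member of", "United Nations")},
    "Russia": {("continent", "Europe"), ("continent", "Asia"),
               ("member of", "United Nations")},
}

# Made-up 2-d vectors standing in for learned entity embeddings.
EMBEDDINGS = {
    "U.S.": [0.9, 0.8],
    "Mexico": [0.9, 0.1],
    "Russia": [0.8, 0.9],
}

graph_peer = closest_peer("U.S.", STATEMENTS, shared_po_pairs)
embedding_peer = closest_peer("U.S.", EMBEDDINGS, cosine_similarity)
```

On this toy data, the graph-based measure pairs the U.S. with Mexico, while the
embedding-based measure pairs it with Russia, mirroring the behaviour described
above.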


4   Discussion

In order to compile the set of negative clues for this game, the peer-based
methodology infers useful negative statements by assuming completeness in parts
of the KB, namely within peer groups. Although this approach outperformed
baseline methods in [3], inferences (i.e., clues) may still be incorrect. At the
moment, we allow players to flag cards as incorrect, and would like to use this
feedback in the future to decide whether erroneous cards are displayed or
discarded. In addition, we understand that wrapping up the negative statements in
a game setting does not allow users to inspect specific entities of interest.
Another platform, built upon the same research work, has been published recently,
where users can explore useful negations through entity summarization and
structured question answering interfaces [1].

Acknowledgments. This work is supported by the German Research Foundation
(DFG: Deutsche Forschungsgemeinschaft) under grant 4530095897: “Negative
Knowledge at Web Scale”.


References
 1. Arnaout, H., Razniewski, S., Weikum, G., Pan, J.Z.: Wikinegata: a knowledge base
    with interesting negative statements. PVLDB (2021)
 2. Arnaout, H., et al.: Negative knowledge for open-world Wikidata. Wiki Workshop
    at WWW (2021)
 3. Arnaout, H., Razniewski, S., Weikum, G.: Enriching knowledge bases with inter-
    esting negative statements. In: AKBC (2020)
 4. Balaraman, V., et al.: Recoin: Relative completeness in Wikidata. Wiki Workshop
    at WWW (2018)
 5. Karagiannis, G., et al.: Mining an “anti-knowledge base” from Wikipedia updates
    with applications to fact checking and beyond. In: VLDB (2019)
 6. Petrova, A., et al.: Entity comparison in RDF graphs. In: ISWC (2017)
 7. Ponza, M., Ferragina, P., Chakrabarti, S.: A two-stage framework for computing
    entity relatedness in Wikipedia. In: CIKM (2017)
 8. Safavi, T., Koutra, D.: Generating negative commonsense knowledge. KR2ML
    (2020)
 9. Vrandečić, D., Krötzsch, M.: Wikidata: A free collaborative knowledge base.
    CACM (2014)
10. Yamada, I., et al.: Wikipedia2Vec: An optimized tool for learning embeddings of
    words and entities from Wikipedia. EMNLP (2020)