<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Athens, Greece
mountant@ics.forth.gr (M. Mountantonakis); tzitzik@ics.forth.gr (Y. Tzitzikas)
https://users.ics.forth.gr/~mountant/ (M. Mountantonakis); https://users.ics.forth.gr/~tzitzik/ (Y. Tzitzikas)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Real-Time Validation of ChatGPT facts using RDF Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michalis Mountantonakis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Tzitzikas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Crete</institution>
          ,
          <addr-line>Heraklion</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science</institution>
          ,
          <addr-line>FORTH, Heraklion</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>ChatGPT is an innovative application of Large Language Models (LLMs) that produces detailed and articulate responses across many domains of knowledge. However, it does not provide evidence for its responses, and it returns several erroneous facts, even for popular persons, places and other entities. To tackle this limitation, we present the fact checking service of the research prototype GPT∙LODS, which can validate ChatGPT facts by using RDF Knowledge Graphs (KGs) containing high quality structured data. GPT∙LODS can generate triples for a question, an entity or a given text using ChatGPT. Afterwards, it can validate the generated ChatGPT triples in real time through DBpedia or the LODsyndesis KG (a KG that has indexed 400 other RDF KGs), by combining SPARQL queries, word embeddings and sentence similarity metrics. We present the functionality and use cases of GPT∙LODS, including fact checking, question answering, triples generation from text and comparison of different GPT models. Demo URL: https://demos.isl.ics.forth.gr/GPToLODS/FactChecking Demo Video: https://youtu.be/5DW1d37aPMc</p>
      </abstract>
      <kwd-group>
        <kwd>Fact Checking</kwd>
        <kwd>Validation</kwd>
        <kwd>Provenance</kwd>
        <kwd>ChatGPT</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Embeddings</kwd>
        <kwd>LODsyndesis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        ChatGPT (https://openai.com/) is a novel Artificial Intelligence (AI) chatbot, which is based on
Large Language Models (LLMs) and offers detailed responses across many domains. However,
it does not provide justifications for its responses, and can return erroneous and outdated facts,
even for popular places, persons and other entities. This limitation can be mitigated by
using RDF Knowledge Graphs (KGs) containing high quality structured data. Indeed, several
approaches that combine RDF KGs with ChatGPT have already been proposed, e.g., for summarizing
RDF KGs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], for entity matching [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], for question answering [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and for annotation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In this
demo we focus on providing a service for validating ChatGPT facts by using multiple RDF KGs.
      </p>
      <p>
        In particular, it seems that ChatGPT can produce valid RDF N-triples for a given model, e.g.,
DBpedia ontology [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] (see Fig. 1). In that real example, we asked ChatGPT for facts about the
famous Greek painter El Greco (in RDF N-triples format using the DBpedia model). The left side
shows the ChatGPT response, the middle the corresponding triple in DBpedia and the right
side a comparison between them. In the ideal scenario, the same triple is part of the KG (see
ID 1), e.g., DBpedia; however, this is not always the case, even for correct facts. In particular,
we can see correct ChatGPT facts with i) wrong/invalid URIs, e.g., disambiguation URIs, wrong
predicates, unknown URIs (see IDs 2-4) and/or ii) problems with literals, e.g., different formats
(see ID 5) or URIs instead of literals (see ID 6). On the other hand, there can be several erroneous
facts even for popular entities, including both wrong literals and URIs, e.g., see IDs 7-8 in Fig. 1.
      </p>
      <p>To enable the validation of all these cases, we demonstrate the GPT∙LODS fact checking
service, which generates RDF N-triples from ChatGPT and validates the generated ChatGPT
facts by combining SPARQL queries, word embeddings and sentence similarity metrics. To
increase the chance of finding the desired fact in an RDF KG, we also use the LODsyndesis KG
(https://demos.isl.ics.forth.gr/lodsyndesis), which has integrated data from 400 RDF KGs by
taking into account the transitive closure of their equivalence relationships. To the best of our
knowledge, this is the first service that validates ChatGPT facts using one or more RDF KGs.</p>
      <p>The rest of this demo paper is organized as follows: §2 introduces the related work, §3
describes the steps and the functionality of the Fact Checking Service of GPT∙LODS, §4 presents
the use cases and §5 concludes the paper and discusses future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Concerning the related approaches, in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] the authors used ChatGPT for the Question Answering
task, and they concluded that ChatGPT offered high precision answers for popular domains,
but low precision for unpopular ones. Moreover, ChatGPT has been used for entity matching
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], for named entity identification tasks over historical data [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and for summarizing RDF KGs
by using a ChatGPT constructed classifier [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Moreover, in our previous work [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], we have
used multiple RDF KGs for annotating ChatGPT responses by using popular Entity Recognition
tools, and each entity is enriched with more statistics and links to LODsyndesis [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Concerning the novelty, to the best of our knowledge, this is the first service that tries to
validate ChatGPT facts by using RDF KGs and techniques based on word embeddings.</p>
    </sec>
    <sec id="sec-steps">
      <title>3. Steps and Functionality of the ChatGPT Fact Checking Service</title>
      <p>Here, we present the steps and the functionality of the service, which are depicted in
Fig. 2 with a running example about the city of “Salzburg” (see the lower left part of Fig. 2). The
steps are generic and can be easily adjusted for performing fact checking for the response of
any LLM (e.g., a new version of ChatGPT or any existing/new LLM offering an analogous API).</p>
      <p>Step 1. Send a Prompt to ChatGPT. The user can type i) a question, ii) an entity or iii) a text,
and selects a ChatGPT model to use (i.e., “text-davinci-003” or “gpt-3.5-turbo-0301”).
According to the input type, a different query is sent to ChatGPT, e.g., for the entity case, the
query to ChatGPT is: “Give me K RDF N-triples using DBpedia format for the entity E”,
where K is the number of facts and E is the entity name (both K and E are given by the user).</p>
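      <p>As a minimal sketch of the entity case of Step 1, the prompt quoted above can be assembled from the two user inputs. The function name build_entity_prompt and its input checks are illustrative assumptions and are not taken from the GPT∙LODS code; the prompts for the question and text cases are not given in this paper and are therefore not sketched.</p>
      <preformat><![CDATA[
```python
def build_entity_prompt(entity: str, k: int) -> str:
    """Build the entity-case prompt quoted in Step 1 (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be a positive number of facts")
    entity = entity.strip()
    if not entity:
        raise ValueError("entity name must not be empty")
    # K and E are both supplied by the user, as described in Step 1.
    return f"Give me {k} RDF N-triples using DBpedia format for the entity {entity}"


if __name__ == "__main__":
    print(build_entity_prompt("Salzburg", 5))
```
]]></preformat>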
      <p>Step 2. Triples Generation from ChatGPT. We decided to use ChatGPT for generating
the triples, since ChatGPT can usually create dereferenceable DBpedia URIs and valid RDF
N-triples. Concerning this step, GPT∙LODS collects the ChatGPT response and keeps only the
RDF N-triples (since sometimes ChatGPT also returns a text with the RDF N-triples), which are
shown to the user in an HTML table. Additionally, the user can export the generated triples.</p>
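      <p>The filtering part of Step 2 could look roughly like the following sketch, which keeps only the lines of a response that resemble RDF N-triples and drops any surrounding text. The regular expression is a simplified assumption (it ignores blank nodes and some escape sequences), and the name extract_ntriples is hypothetical; the actual GPT∙LODS implementation may differ.</p>
      <preformat><![CDATA[
```python
import re

# Simplified N-Triples line pattern (an assumption, not the full grammar):
# IRI subject, IRI predicate, and an object that is either an IRI or a quoted
# literal with an optional datatype or language tag. Blank nodes are not covered.
IRI = r"<[^<>\s]+>"
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^' + IRI + r"|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?"
TRIPLE_RE = re.compile(rf"^\s*({IRI})\s+({IRI})\s+({IRI}|{LITERAL})\s*\.\s*$")


def extract_ntriples(response: str) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, object) for every line that looks like an N-triple."""
    triples = []
    for line in response.splitlines():
        match = TRIPLE_RE.match(line)
        if match:
            triples.append(match.groups())
    return triples


if __name__ == "__main__":
    # Made-up response text used only to exercise the filter.
    sample = (
        "Here are some triples:\n"
        "<http://dbpedia.org/resource/Salzburg> "
        "<http://www.w3.org/2000/01/rdf-schema#label> \"Salzburg\"@en .\n"
        "I hope this helps!"
    )
    for triple in extract_ntriples(sample):
        print(triple)
```
]]></preformat>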
      <p>
        Step 3. Facts Validation using RDF Knowledge Graphs and Word Embeddings. The
user selects which KG to use for validating the generated ChatGPT facts; DBpedia [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] or
LODsyndesis [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. LODsyndesis contains 2 billion facts from 400 RDF KGs; it has
precomputed and stored the transitive closure of owl:sameAs, owl:equivalentProperty
and owl:equivalentClass relationships, and it keeps all the available facts for each real entity in the
same index entry. For DBpedia, SPARQL queries are sent for retrieving the desired data,
whereas for LODsyndesis, we use its REST API. The user can select to validate either a single
fact or all the facts. The validation is based on 3 rules: A) Same/Equivalent Triple,
B) Same Subject-Predicate or Subject-Object and C) Most Similar Triples.
      </p>
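      <p>For the DBpedia case, rules A and B could plausibly be expressed as the SPARQL queries built in the sketch below. The exact queries used by the service are not given in this paper, so the query shapes and the function names are assumptions. Terms are expected in N-Triples syntax (IRIs in angle brackets, literals in quotes), which is also valid SPARQL term syntax.</p>
      <preformat><![CDATA[
```python
def ask_same_triple(subject: str, predicate: str, obj: str) -> str:
    """Rule A (sketch): ASK whether exactly this triple exists in the KG."""
    return f"ASK WHERE {{ {subject} {predicate} {obj} }}"


def select_same_subject_predicate(subject: str, predicate: str) -> str:
    """Rule B, first variant (sketch): all objects for the same subject and predicate."""
    return f"SELECT ?o WHERE {{ {subject} {predicate} ?o }}"


def select_same_subject_object(subject: str, obj: str) -> str:
    """Rule B, second variant (sketch): all predicates linking the same subject and object."""
    return f"SELECT ?p WHERE {{ {subject} ?p {obj} }}"


if __name__ == "__main__":
    s = "<http://dbpedia.org/resource/Salzburg>"
    p = "<http://dbpedia.org/ontology/country>"
    o = "<http://dbpedia.org/resource/Austria>"
    print(ask_same_triple(s, p, o))
    print(select_same_subject_predicate(s, p))
    print(select_same_subject_object(s, o))
```
]]></preformat>
      <p>The sketch only builds query strings; it performs no escaping or validation of its inputs, so in practice the terms would need to be checked before a query is sent to a SPARQL endpoint.</p>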
      <p>Concerning rule A, we search the KG for the same or an equivalent (for the LODsyndesis KG)
triple, and if it exists, the service returns the triple (and its provenance) to the user, e.g., see
rule A in the right side of Fig. 2. Regarding rule B, we search whether the same subject-predicate or
the same subject-object pair exists in the KG, and in such a case we return the results in descending
order according to their similarity score with the ChatGPT fact, e.g., see rule B in Fig. 2, where
we found the correct population of Salzburg (i.e., same subject-predicate but different object).
Concerning rule C (executed if the previous two failed), we collect all the triples containing
the main entity of the triple (e.g., Salzburg) and we create the embeddings for each of these
triples and the desired ChatGPT fact, by using a sentence similarity model from Hugging Face
(https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). Finally, we return the top-K
most similar triples according to the cosine similarity of their vectors with the vector of the given
ChatGPT fact. In Fig. 2 (lower right side), DBpedia uses a different property for the “mayor” and
a literal instead of a URI for “Harald Preuner”; however, GPT∙LODS managed to verify the fact.</p>
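      <p>The ranking in rule C can be illustrated with the following sketch, which scores candidate triples (rendered as text) by cosine similarity to the ChatGPT fact and returns the top-K. To keep the example self-contained, it uses a toy bag-of-words embedder as a stand-in; this is an assumption for illustration only and is not the all-MiniLM-L6-v2 model used by the service. All function names and the sample strings are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    fact: str,
    candidates: list[str],
    embed: Callable[[list[str]], list[Vector]],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Rank candidate triples by cosine similarity to the ChatGPT fact; keep the top k."""
    vectors = embed([fact] + candidates)
    fact_vec, cand_vecs = vectors[0], vectors[1:]
    scored = [(c, cosine_similarity(fact_vec, v)) for c, v in zip(candidates, cand_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def toy_embed(texts: list[str]) -> list[Vector]:
    """Stand-in embedder: bag-of-words counts over a shared vocabulary.

    This is NOT a sentence transformer; it only makes the sketch runnable offline.
    """
    tokenized = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*tokenized))
    return [[float(counts[w]) for w in vocab] for counts in tokenized]


if __name__ == "__main__":
    # Illustrative strings only; they are not actual DBpedia triples.
    fact = "Salzburg mayor Harald Preuner"
    kg_triples = ["Salzburg country Austria", "Salzburg leader Harald Preuner"]
    for triple, score in top_k_similar(fact, kg_triples, toy_embed, k=2):
        print(f"{score:.3f}  {triple}")
```
]]></preformat>
      <p>The embedder is passed in as a parameter so that a real sentence embedding model could presumably be substituted for toy_embed without changing the ranking logic.</p>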
      <p>Step 4. Browse/Export the Results. For each ChatGPT fact, the user can browse (through
HTML tables) the corresponding triple(s) in the KG with their provenance, the possible
differences between the ChatGPT fact and the triple(s) from the KG, and the cosine similarity
score of their vectors. Apart from the HTML tables, one can export the results in JSON format.</p>
      <p>Code &amp; Evaluation. In https://github.com/mountanton/GPToLODS_FactChecking, one
can browse the code and a preliminary evaluation whose results are promising: for 1,000
manually-labelled ChatGPT facts for famous Greek persons (including 812 correct and 188
erroneous facts), by using the above steps and LODsyndesis, we verified 92.2% of the correct
ChatGPT facts and we found the correct answer for 57.1% of the erroneous ChatGPT facts.</p>
    </sec>
    <sec id="sec-3">
      <title>4. Use Cases &amp; Demonstration</title>
      <p>We present four use cases where the fact checking service of GPT∙LODS can be useful, i.e.,
UC1–UC4, which are also presented in the following tutorial video: https://youtu.be/5DW1d37aPMc.
The demo webpage is available at https://demos.isl.ics.forth.gr/GPToLODS/FactChecking.</p>
      <p>UC1. Fact Validation &amp; Question Answering from Multiple KGs. The user can verify
ChatGPT facts from one or more RDF KGs, i.e., for confirming correct ChatGPT facts and/or
for finding the correct answer for erroneous ChatGPT facts. The presented process can
also be exploited for Question Answering applications, e.g., see the examples of Figures 1 and 2.
Moreover, through LODsyndesis more facts can be verified even for the same entity, e.g., in the
UC1 of Fig. 3 the ChatGPT fact was verified by using LODsyndesis (specifically from the Fishbase
KG) and not from DBpedia (which does not contain the native countries of the fish Mahi-mahi).</p>
      <p>UC2. Triples Generation. The user can generate (and export) RDF triples either for
well-known entities and facts or for any given text by exploiting ChatGPT. In the first case, the triples can
possibly be used for KG completion (creating new triples for existing KG entities). In the second
case, the triples can be used for KG generation, e.g., in Fig. 3 (see UC2), GPT∙LODS generated
triples for a given text about a village in Crete, for which a DBpedia page does not exist.</p>
      <p>UC3. Comparison of GPT models. One can use different GPT models for comparing the
validity of answers to the same questions. For instance, Fig. 3 shows a real example, where the
“text-daVinci” model provided the correct answer for the user question, whereas “gpt-3.5-turbo”
provided an erroneous one (although it is newer than daVinci). In both cases, GPT∙LODS
found the correct answer for the desired fact (however with a different similarity score).</p>
      <p>
        UC4. Combination of GPT∙LODS services - Annotation and Fact Checking. GPT∙LODS
also offers a service [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] which can annotate ChatGPT textual responses with links to
LODsyndesis and DBpedia (i.e., for information enrichment). The latter service is also connected with
the presented fact checking service; specifically the generated ChatGPT annotated textual
response can be converted to RDF triples (using ChatGPT) and the triples are validated using
the presented steps. In such a case two requests are sent to ChatGPT, e.g., see UC4 in Fig. 3.
      </p>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>In this paper, we demonstrated the fact checking service of GPT∙LODS, which exploits SPARQL
queries and sentence similarity techniques (based on word embeddings) for validating in real
time any ChatGPT response from multiple RDF Knowledge Graphs. We presented all the steps
and use cases of GPT∙LODS, including fact validation, triples generation and others. As future
work, we plan to extend the service i) by providing a REST API and ii) by adding more features,
e.g., providing the service as a chatbot and exploiting knowledge from web search engines.</p>
      <p>Acknowledgments. This work has received funding from the European Union’s Horizon
2020 coordination and support action 4CH (Grant agreement No 101004468).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Vassiliou</surname>
          </string-name>
          , et al.,
          <article-title>SummaryGPT: Leveraging ChatGPT for summarizing knowledge graphs</article-title>
          ,
          <source>ESWC</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Peeters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <article-title>Using ChatGPT for entity matching</article-title>
          ,
          <source>arXiv preprint arXiv:2305.03423</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Omar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Mangukiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kalnis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Mansour</surname>
          </string-name>
          ,
          <article-title>ChatGPT versus traditional question answering for knowledge graphs: Current status and future directions towards knowledge graph chatbots</article-title>
          ,
          <source>arXiv:2302.06466</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mountantonakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          ,
          <article-title>Using multiple RDF knowledge graphs for enriching ChatGPT responses</article-title>
          , in: ECML/PKDD 2023 Demo paper,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          , et al.,
          <article-title>DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          ,
          <source>Semantic Web 6</source>
          (
          <year>2015</year>
          )
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.-E.</given-names>
            <surname>González-Gallardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Girdhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          ,
          <article-title>Yes but.. can ChatGPT identify entities in historical documents?</article-title>
          ,
          <source>arXiv preprint arXiv:2303.17322</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mountantonakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          ,
          <article-title>Content-based union and complement metrics for dataset search over RDF knowledge graphs</article-title>
          ,
          <source>ACM JDIQ 12</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>