<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>without asserting approaches in RDF. The case of Conjectures.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valentina Pasqual</string-name>
          <email>valentina.pasqual2@unibo.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerald Manzano</string-name>
          <email>gerald.manzano@studio.unibo.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduart Uzeir</string-name>
          <email>eduart.uzeir@studio.unibo.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Tomasi</string-name>
          <email>francesca.tomasi@unibo.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Vitali</string-name>
          <email>fabio.vitali@unibo.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>RDF Reification, Efficiency assessment, Conjectures, Expressing Without Asserting</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bologna</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Digital Humanities Advanced Research Center, Department of Classical Philology and Italian studies, University of</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we evaluate the existing reification approaches for expressing without asserting (EWA) statements in RDF along with their related contextual information and compare them with a new method, called Conjectures. Conjectures express RDF statements with three states of knowledge: undisputed claims, disputed claims, and settled claims. Conjectures extend the semantics of RDF Named graphs and introduce a new syntactical form to represent both conjectural and asserted information. Our evaluation tests were performed on a large sample of Wikidata entities about artworks, interspersed with additional dummy statements to simulate alternative or abandoned claims and enrich the set of non-asserted claims. Our study evaluates metrics such as the total number of triples, loading time, dataset weight, and in particular query execution time for many different and meaningful types of queries. Results show that Conjectures is competitive with existing methods and outperforms them in terms of efficiency when retrieving debated statements, thereby demonstrating its potential as an effective tool for expressing nuanced RDF statements.</p>
      </abstract>
      <kwd-group>
        <kwd>Conjectures</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        RDF [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] is a powerful tool for expressing statements as absolute, asserted relationships between
entities. Yet, it falls short when it comes to representing nuances about such relationships,
e.g., to enrich them with contextual information, or to represent in full how they relate to
each other. While some statements may never need to be questioned (or there may be no
interest in questioning them), and can therefore be represented adequately with plain RDF
triples, scientific and critical discourse is frequently filled with concurrent
opinions and interpretations. This knowledge cannot simply be expressed as plain RDF triples
but requires the specification of contextual information (e.g. who is the author of such claims),
and possibly of multiple and competing statements, which the community may settle or discard
over time.
      </p>
      <p>
        This is for instance the case of the hypotheses made by scholars about the many (still
undetermined) locations of the Mona Lisa painting when it was stolen in 1911 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], or of the
attribution of the painting Salvator Mundi, possibly by Leonardo da Vinci and world-famous
for having been the most expensive painting ever sold at public auction. Its attribution is now
under discussion by many scholars [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Our guiding example in this paper, on the other hand,
is the debated attribution of the painting Napoleon crossing the Alps<sup>1</sup>, currently attributed to
Jacques-Louis David, but previously attributed to Jérôme-Martin Langlois and the workshop of
Jacques-Louis David.
      </p>
      <p>
        RDF reification<sup>2</sup> methods [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have been used to make statements about statements in RDF
[6], and even to allow the coexistence of multiple opinions in RDF. For instance, reification
makes it possible to express sentences like "JocondeLab<sup>3</sup> reports that the author of Napoleon crossing the Alps is
Jacques-Louis David". Reification methods typically involve the use of extra triples to
provide information about the original triple being reified. Listing 1 expresses the statement
above using standard reification. In the example, a new entity is introduced as an
instance of the class rdf:Statement to carry "JocondeLab reports", and the original triple
(the author of Napoleon crossing the Alps is Jacques-Louis David) is expressed by means of three
additional triples whose predicates are rdf:subject, rdf:predicate and rdf:object.
      </p>
      <preformat>@prefix rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
@prefix wd:  &lt;http://www.wikidata.org/entity/&gt; .
@prefix wdt: &lt;http://www.wikidata.org/prop/direct/&gt; .
@prefix :    &lt;http://example.org/&gt; .     # placeholder namespace (an assumption)

:statement1 rdf:type rdf:Statement ;
    rdf:subject   wd:Q19801150 ;         # Napoleon Crossing the Alps
    rdf:predicate wd:P170 ;              # creator
    rdf:object    wd:Q67215 .            # Jacques-Louis David
:statement1 wdt:P248 wd:Q29633776 .      # stated in JocondeLab</preformat>
      <p>Listing 1: Representation of the claim "JocondeLab states that the author of Napoleon Crossing the Alps is Jacques-Louis David" using standard reification</p>
      <p>Accepted statements can therefore be characterised by a plain triple that asserts the statement,
and the same triple under reification to provide additional information about the claim itself.
This would be the case of the settled attribution of Napoleon crossing the Alps, currently attributed
to Jacques-Louis David. Statements that represent alternative or historical information would
then be represented with just the claim in reified form and no assertion through a plain RDF
triple. This would be the case of the earlier attributions of Napoleon crossing the Alps to
Jérôme-Martin Langlois and the workshop of Jacques-Louis David. We call this approach Expressing
Without Asserting (EWA) [7] and we consider it a powerful tool to express claims with different
degrees of validity and even critical debate.</p>
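      <p>The distinction between asserted and merely expressed claims can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the data model of any specific reification method: triples are plain tuples, reified claims are stored in a dictionary keyed by a statement identifier, and all identifiers (ex:..., claim1, split_claims) are hypothetical names introduced for the example.</p>
      <preformat>
```python
# Hypothetical identifiers (ex:...) stand in for real Wikidata IRIs.
PAINTING = "ex:NapoleonCrossingTheAlps"
CREATOR = "ex:creator"

# Plain triples: everything in this set is asserted.
asserted_triples = {
    (PAINTING, CREATOR, "ex:JacquesLouisDavid"),
}

# Reified claims: statement id mapped to the triple it talks about.
reified_claims = {
    "claim1": (PAINTING, CREATOR, "ex:JacquesLouisDavid"),
    "claim2": (PAINTING, CREATOR, "ex:JeromeMartinLanglois"),
    "claim3": (PAINTING, CREATOR, "ex:WorkshopOfDavid"),
}


def split_claims(asserted, reified):
    """Return (accepted, expressed_only) sets of statement ids.

    A claim is accepted when its triple also appears as a plain triple;
    otherwise it is expressed without being asserted (EWA).
    """
    accepted = {sid for sid, triple in reified.items() if triple in asserted}
    expressed_only = set(reified) - accepted
    return accepted, expressed_only


accepted, expressed_only = split_claims(asserted_triples, reified_claims)
print(sorted(accepted))        # ['claim1']
print(sorted(expressed_only))  # ['claim2', 'claim3']
```
      </preformat>
      <p>In this sketch the attribution to David is both asserted and reified, whereas the two earlier attributions exist only in reified form and therefore do not contribute to the asserted content of the dataset.</p>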
      <p>In addition to the expressivity and effectiveness of the many existing reification
approaches, their efficiency is also an open issue. To the best of our knowledge, no study
has been proposed on the evaluation of the efficiency of EWA mechanisms with
ontology-independent approaches. This paper evaluates existing reification methods for expressing
without asserting in RDF with the related contextual information and compares them with a
new approach, called Conjectures.</p>
      <p>The paper is structured as follows: in section 2 we discuss a number of syntactical
approaches to EWA and the benchmarks adopted in the literature for their evaluation. In section
3 we briefly introduce and compare Conjectures as an alternative method to represent EWA
statements in addition to existing proposals such as RDF reification, RDF-star, and others. In
section 4 we discuss the dataset and the metrics used for our comparison of the various EWA
approaches, and in section 5 we analyze our findings. In section 6 we discuss the results before
drawing some conclusions in section 7.</p>
      <sec id="sec-2-1">
        <p>1 http://www.wikidata.org/entity/Q19801150</p>
        <p>2 https://www.w3.org/wiki/RdfReification</p>
        <p>3 http://www.wikidata.org/entity/Q29633776</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. State of the art</title>
      <p>Many reification methods have been implemented [8, 9, 10, 11, 12, 13] to represent statements
about RDF claims. Reification provides additional triples about such claims, enabling querying
and reasoning mechanisms to integrate the following key attributes [12, 14, 15]: provenance
enables the identification and representation of the source of a fact or statement (e.g. Who
claimed this fact?); time communicates the time-related information of a fact (e.g. Which
is the most up-to-date claim about this fact?); location reveals locational information about an
event (e.g. In which location is this claim applicable?); certainty indicates the level of
confidence that is attributed to a statement (e.g. How confident are we about a certain event? or Is
a given statement true?); versioning helps to keep track of the update history of RDF datasets
(e.g. Which data version am I using right now?).</p>
      <p>The scientific community has suggested the following major reification strategies to be able
to encapsulate all this variety of contextual information about RDF statements.</p>
      <p>Among existing Knowledge Graphs, Wikidata provides RDF data with its own reification method [11]
and a long list of qualifiers about provenance, time, and place to represent complex and
competing information. Additionally, the validity of each claim is conveyed by a customised approach
(a ranking mechanism<sup>4</sup>) to express without asserting [7] those claims whose truth cannot
be taken for granted. Consider for example the competing attributions of Napoleon
crossing the Alps, shown in figure 1, which are represented in Wikidata with three different claims:
the currently accepted attribution to Jacques-Louis David (marked with a preferred rank and
therefore asserted), the former attribution to his workshop (marked with a normal rank and
non-asserted) and the former attribution to Jérôme-Martin Langlois (marked with a deprecated
rank and therefore non-asserted). The two former attributions are also marked with
contextual information (e.g. reason for deprecated rank, former attribution) that
expresses additional information on the claim itself.</p>
      <p>To be widely accepted, every data representation method must be both effective and efficient.
While a method's effectiveness can be formally demonstrated, we must rely on empirical
observations to determine its efficiency. Every new method that is suggested typically includes
performance data as well. This is accomplished by carrying out the necessary tests to see how
the approach performs on indicators such as the number of triples, query execution time,
query complexity, dataset storage consumption, support by existing tools and implementations, and
other pertinent metrics. A vast number of academic papers adopt this kind of
experimental setup to benchmark the aforementioned reification methods [16, 12, 6, 14, 17, 18, 19].
All the benchmarks mentioned above rely on the same four components, namely
datasets, queries, triplestores and reification methods.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Conjectures</title>
      <p>Conjectures is an approach intended to express RDF statements with three main states of
knowledge: undisputed claims, disputed claims or evolving situations, and settled claims.
Conjectures is an RDF 1.1 compliant characterization of Named graphs (the weak form), and an
extension of the semantics of RDF Named graphs that distinguishes plain Named graphs from disputed
(conjectural) and settled ones (the strong form)<sup>5</sup>.</p>
      <p>Undisputed claims. Non-disputed claims are expressed as asserted Named graphs (and are
therefore introduced by the keyword GRAPH). For example, the main subject of Napoleon Crossing
the Alps has been recognised as Napoleon himself without any doubt, as shown in listing 2.</p>
      <p>Disputed claims. Conjectures is a prototypical extension of the syntax of TriG, where the
keyword GRAPH is replaced with CONJECTURE in front of a graph whose content is expressed but
not asserted, and which contains statements that are disputed or that convey evolving situations.
For example, the former attribution of Napoleon crossing the Alps to Jérôme-Martin Langlois can be
represented via a Conjecture as in listing 2. Naturally, graphs that are not marked as conjectures
maintain the same (locally-decided) semantics as before, and therefore they may or may not
contribute to the truth value of the entire dataset depending on such choice. A key aspect
is that Conjectures do not use reification, n-ary relationships or ad hoc classes (for example,
the rdf:Statement class, as employed in standard reification and illustrated in Listing 1), and
therefore they are orthogonal to, and fully compatible with, most of the other approaches.</p>
      <p>Settled claims. Settled claims (introduced by the keyword SETTLED) record both the dispute
and its subsequent resolution. This is specifically and intentionally different from a trivial
re-assertion of the disputed claims, in which we do not acknowledge or mention the dispute
at all (the case of GRAPH in listing 2). To handle settled disputes we introduce Settled Conjectures,
a third type of named graph that is at the same time conjectured and asserted. The collapse
graphs allow us to represent both the conjectural triples (inside the usual conjectural graph)
and the same triples completely asserted (inside the collapse graph). In addition,
the conj:collapses relation connects the conjecture and its settlement, simplifying the task of
exploring the relationships between disputes and their settlements. The rationale behind Settled
Conjectures is two-fold: on the one hand, to stress the difference between claims that have
not been challenged and claims that emerged as winning among competing and incompatible
hypotheses, and on the other, to represent the dual nature of settled claims as both conjectures
and assertions. For example, the current attribution of Napoleon Crossing the Alps to
Jacques-Louis David can be represented via a Settled Conjecture as in listing 2.</p>
      <sec id="sec-4-1">
        <p>4 https://www.Wikidata.org/wiki/Help:Ranking</p>
        <p>5 A complete overview of Conjectures semantics is available at https://conjectures.altervista.org//CONJ_semantics.pdf</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Testing EWA approaches</title>
      <p>The methodology adopted to run the tests outlined in this study can be summarized as follows:
first, a series of datasets were generated, scaled, and converted to RDF according to a set of EWA
approaches (see Section 4.1). Next, the hardware and software employed to run our experiments
on the collected data were set up (see Section 4.2). A set of metrics was then defined to
evaluate the efficiency of EWA approaches (see Section 4.3). Lastly, the outcomes of the tests
are presented in the next section (see Section 5).</p>
      <sec id="sec-5-1">
        <title>4.1. Data Acquisition, scaling and conversion</title>
        <p>The dataset on which these experiments have been run, named D3, is composed as follows:
• Art: a thematic set of claims about 300k artwork entities in Wikidata (e.g. paintings,
manuscripts, books). This corresponds to about 10% of all artwork entities currently
present in Wikidata.
• Random: to make the dataset more representative, we added some entropy in the form of
the claims of 300k random Wikidata entities.
• Dummy: a selection of dummy statements regarding artwork attributions
(represented by the properties wdt:P50 and wdt:P170, including from 1 to 4 authors in each
claim and the source of the claim) and artwork locations (represented by the property
wdt:P276, including 1 possible location, time constraints and source) has been created<sup>6</sup>.
These new statements contain arbitrary dummy information ranked as deprecated, and
therefore non-asserted, to represent alternative or historical claims to those contained in
the Art dataset. This design choice was made to increase the number of conjectural statements
in the final dataset.</p>
        <p>An excellent way to evaluate an algorithm's performance is to observe how it responds to
variations in input size [6]. We started by downloading the whole subset of artwork entities,
related individuals (basically, attributed authors) and locations. This dataset, called D4, is
composed of about 3.5 million artwork entities and 188 thousand related entities (humans and
locations). We have not used this dataset for our comparison due to the excessive number of
timeouts in many of the queries and methods we used. Thus we scaled the dataset down logarithmically
into three further sizes:
• Dataset D3: D3 is obtained by extracting one tenth of the data in D4 (D3 = D4/10).
• Dataset D2: D2 is obtained by extracting one tenth of the data in D3 (D2 = D3/10).
• Dataset D1: D1 is obtained by extracting one tenth of the data in D2 (D1 = D2/10).</p>
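        <p>The scaling scheme above can be sketched as follows. This is a minimal illustration rather than the conversion scripts used for the experiments: it assumes that each smaller dataset is drawn from the previous one by seeded uniform random sampling of entities, which is only one possible way of extracting one tenth of the data, and the function names scale_down and build_scaled_datasets are hypothetical.</p>
        <preformat>
```python
import random


def scale_down(entities, factor=10, seed=0):
    """Return a random sample holding 1/factor of the given entities."""
    rng = random.Random(seed)  # seeded so the extraction is reproducible
    return rng.sample(entities, len(entities) // factor)


def build_scaled_datasets(d4, levels=3, factor=10, seed=0):
    """Build D3, D2 and D1, each one tenth of the previous dataset."""
    datasets = {"D4": list(d4)}
    current = datasets["D4"]
    for n in range(levels, 0, -1):  # n = 3, 2, 1
        current = scale_down(current, factor, seed)
        datasets[f"D{n}"] = current
    return datasets


# Toy run with 10,000 integer ids standing in for Wikidata entities.
datasets = build_scaled_datasets(range(10_000))
print({name: len(items) for name, items in datasets.items()})
# {'D4': 10000, 'D3': 1000, 'D2': 100, 'D1': 10}
```
        </preformat>
        <p>Because each level is sampled from the one above it, D1 is a subset of D2, which is a subset of D3, so differences across sizes reflect input size rather than unrelated data.</p>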
        <p>
          We then surveyed the state of the art regarding reification methods to express without
asserting and selected a set of methods to compare with Conjectures in weak and strong form: Singleton properties [12], Named
graphs [10] (using Wikidata rankings to decide whether a triple is asserted or not), Wikidata
[11] and the recent RDF-star [13] approach. We converted Wikidata JSON files into the six
selected reification methods through automatic scripts. In table 1 we provide some data about
our datasets. At the end of this process, we obtained 18 new method-specific datasets. In other
words, for each dataset Dn, with n ∈ [1, 3], we constructed the following datasets:
        </p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Name</th><th>Serialization</th><th>Reification</th></tr>
            </thead>
            <tbody>
              <tr><td>Dn-Wikidata</td><td>Turtle</td><td>Wikidata</td></tr>
              <tr><td>Dn-rdfStar</td><td>Turtle</td><td>RDF-star</td></tr>
              <tr><td>Dn-conjStrong</td><td>TriG</td><td>Conjectures - strong form</td></tr>
              <tr><td>Dn-nGraphs</td><td>TriG</td><td>Named graphs</td></tr>
              <tr><td>Dn-conjWeak</td><td>TriG</td><td>Conjectures - weak form</td></tr>
              <tr><td>Dn-Singleton</td><td>Turtle</td><td>Singleton properties</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Further values of Table 1 whose column headers and row assignments are not recoverable: yes, yes, yes, yes, yes, via ranking; 66,768,937; 29,779,850.</p>
        <p>Consider the case of Napoleon crossing the Alps and its competing attributions. In addition to
the statements present in Wikidata, the listings below present an additional dummy statement
reporting the historical attribution of the painting to Sophie Chéradame (Q60804575), a
statement claimed by the source ContrivedAttributionsInArtHistory-VP and never adopted by
scholars (and therefore marked with deprecated rank). These competing claims
are represented in the listings below, each with a different reification method: Wikidata statements
(listing 3), RDF-star (listing 4), Conjectures in strong form (listing 5), Named graphs
(listing 6), Conjectures in weak form (listing 7) and Singleton properties (listing 8)<sup>7</sup>.</p>
        <p>6 The dummy claims were added because non-asserted statements made up only circa 1% of the Wikidata dump,
a low figure for this experiment.</p>
        <preformat>wd:Q19801150 wdt:P170 wd:Q83155 .
s:Q19801150 s:P170 wd:Q19801150-s1 ;
    ps:P170 wd:Q83155 ;
    wikibase:rank wikibase:NormalRank .
wd:Q19801150 s:P170 s:Q19801150-s2 ;
    ps:P170 "unknown value" ;
    pq:P1774 wd:Q83155 ;
    pq:P3831 wd:Q4233718 ;
    wikibase:rank wikibase:DeprecatedRank .
wd:Q19801150 s:P170 s:Q19801150-s3 ;
    ps:P170 wd:Q672158 ;
    wikibase:rank wikibase:NormalRank .
wd:Q19801150 s:P170 s:Q19801150dummy ;
    ps:P170 wd:Q60804575 ;
    pq:P248 conj:ContrievedAttributionsInArtHistory-VP ;
    wikibase:rank wikibase:DeprecatedRank .</preformat>
        <p>Listing 3: Wikidata statements</p>
        <preformat>wd:Q19801150 wdt:P170 wd:Q83155 .
&lt;&lt; wd:Q19801150 wdt:P170 wd:Q83155 &gt;&gt;
    wikibase:rank wikibase:PreferredRank .
&lt;&lt; wd:Q19801150 wdt:P170 "unknown value" &gt;&gt;
    pq:P1774 wd:Q83155 ;
    pq:P3831 wd:Q4233718 ;
    wikibase:rank wikibase:NormalRank .
&lt;&lt; wd:Q19801150 wdt:P170 wd:Q672158 &gt;&gt;
    wikibase:rank wikibase:NormalRank .
&lt;&lt; wd:Q19801150 s:P170 wd:Q60804575 &gt;&gt;
    pq:P248 conj:ContrievedAttributionsInArtHistory-VP ;
    wikibase:rank wikibase:DeprecatedRank .</preformat>
        <sec id="sec-5-1-1">
          <title>Listing 4: RDF-star</title>
          <preformat>SETTLED s:Q19801150-s1 {
    wd:Q19801150 wdt:P170 wd:Q83155 .
}
s:Q19801150-s1 wikibase:rank wikibase:PreferredRank .
CONJ s:Q19801150-s2 {
    wd:Q19801150 wdt:P170 "unknown value" .
}</preformat>
          <p>Listing 5: Conjectures in strong form (fragment)</p>
          <preformat>GRAPH s:Q19801150-s1 {
    wd:Q19801150 wdt:P170 wd:Q83155 .
}
s:Q19801150-s1 wikibase:rank wikibase:PreferredRank .</preformat>
          <p>Listing 6: Named graphs (fragment)</p>
          <preformat>GRAPH s:collapseOfQ19801150-s1 {
    wd:Q19801150 wdt:P170 wd:Q83155 .
    s:collapseOf-s1 conj:collapses s:Q19801150-s1
}
s:Q19801150-s1 wikibase:rank wikibase:PreferredRank .
GRAPH s:Q19801150-s2 {
    wd:Q19801150 conj02:P170 "unknown value" .
    conj02:P170 conj:isAConjecturalFormOf wdt:P170 .</preformat>
          <p>Listing 7: Conjectures in weak form (fragment)</p>
          <p>7 The approach adopted for the data acquisition and conversion is documented at https://github.com/</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Hardware and software configuration</title>
        <p>Tests have been run on a computer with an Intel Core i5-8259U CPU @ 2.30 GHz,
32 GB of RAM, Windows 10 Pro 64-bit, and a 1 TB hard disk. The TriG and SPARQL parsers
of our GraphDB engine were modified to parse Conjectures in strong form. Our GraphDB
configuration allocates 28 GB of RAM to the application<sup>8,9</sup> and uses a 10 GB cache size. A repository
has been created for each dataset with inference off, no rule set assigned, the predicate list index
enabled and (when possible) contexts enabled. All other parameters are left at their default
values. Repositories are already running before their performance tests are executed.</p>
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Metrics</title>
        <p>We decided to base the comparison of our reification methods on four major metrics, which
are well established in RDF quantitative analysis. Together, these criteria should cover the
performance-related features of the reification methods under consideration and
give us a clear picture of the benefits and drawbacks of each method.
• Total number of triples in endpoint: This value is particularly interesting since it makes it
possible to assess the verbosity of each method.
• Loading time: Time taken by each dataset to be uploaded to the SPARQL endpoint.
• Dataset weight in triplestore: The storage size of the dataset after it has been uploaded
and stored in the triplestore.
• Query execution time: Response time on a selected set of queries. Each query is executed
automatically ten times. The average value is then computed.</p>
        <p>8 https://graphdb.ontotext.com/documentation/10.1/configuring-graphdb-memory.html</p>
        <p>9 https://graphdb.ontotext.com/documentation/10.2/getting-started.html#:~:text=the%20aforementioned%20icon.</p>
        <p>4.3.1. Queries</p>
        <p>Two sets of SPARQL queries (GQn, FQn) have been designed. While GQn queries do not
include any filter, FQn queries restrict the results to paintings (Q3305213) only. Each query
set is composed of 6 queries covering all possible statuses of claim validity. In particular, the
queries retrieve the following: valid claims (Q1), debated claims (Q2), debated claims
with their provenance/time (Q3), currently disputed claims (Q4), accepted claims after being
debated (Q5), and undisputed claims (Q6). Considering that attributions of authors and locations
provide a simple yet effective use case to test the RDF representation of EWA over our dataset,
GQn and FQn have then been customised to retrieve authorship attributions (GQn-P170
and FQn-P170) and artwork locations (GQn-P276 and FQn-P276), using the
Wikidata properties P170 and P276 respectively. All queries return the same data and are all correct in their
respective datasets. Each query set has been automatically run 10 times and the average times
have been calculated. Table 4.3.1 summarizes the nature of the queries. All actual queries are
available at https://github.com/conjectures-rdf/expressing-without-asserting-efficiency-tests
as well as the full set of results.</p>
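        <p>The measurement procedure for query execution time (ten automatic runs per query, then the average) can be sketched as below. This is a minimal illustration under stated assumptions and not the test harness used in the experiments: run_query is a hypothetical zero-argument callable that would submit one SPARQL query to the endpoint and wait for the full response, and it is replaced here by a stand-in function.</p>
        <preformat>
```python
import statistics
import time


def average_execution_time(run_query, runs=10):
    """Run the query `runs` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # hypothetical: send the query and consume all results
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def fake_query():
    """Stand-in workload used only to make the sketch runnable."""
    return sum(range(100_000))


mean_seconds = average_execution_time(fake_query)
print(f"average over 10 runs: {mean_seconds:.6f} s")
```
        </preformat>
        <p>A monotonic high-resolution clock such as time.perf_counter is preferable to wall-clock timestamps for this kind of measurement, since it is not affected by system clock adjustments.</p>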
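        <table-wrap>
          <table>
            <thead>
              <tr><th>Query</th><th>Predicate</th><th>Data selected by query</th></tr>
            </thead>
            <tbody>
              <tr><td>GQ1</td><td>P170</td><td>All attributions of paintings (Q3305213) that are currently considered valid</td></tr>
              <tr><td>GQ1</td><td>P276</td><td>All locations of paintings (Q3305213) that are currently considered valid</td></tr>
              <tr><td>GQ2</td><td>P170</td><td>All attributions of paintings (Q3305213) that have been debated</td></tr>
              <tr><td>GQ2</td><td>P276</td><td>All past and debated locations of paintings (Q3305213)</td></tr>
              <tr><td>GQ3</td><td>P170</td><td>All attributions of paintings (Q3305213) that have been debated, with provenance</td></tr>
              <tr><td>GQ3</td><td>P276</td><td>All past and debated locations of paintings (Q3305213), with date of move</td></tr>
              <tr><td>GQ4</td><td>P170</td><td>All currently debated attributions of paintings (Q3305213)</td></tr>
              <tr><td>GQ4</td><td>P276</td><td>All locations of paintings (Q3305213) whose current location is uncertain</td></tr>
              <tr><td>GQ5</td><td>P170</td><td>All settled attributions of paintings (Q3305213)</td></tr>
              <tr><td>GQ5</td><td>P276</td><td>All current locations of paintings (Q3305213) that were moved</td></tr>
              <tr><td>GQ6</td><td>P170</td><td>All attributions of paintings (Q3305213) that were never debated</td></tr>
              <tr><td>GQ6</td><td>P276</td><td>All locations of paintings (Q3305213) that never moved</td></tr>
            </tbody>
          </table>
        </table-wrap>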
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Test results</title>
      <sec id="sec-6-1">
        <title>5.1. Number of triples in endpoint</title>
        <p>All reification methods either add triples to the existing ones to represent the
necessary metadata (e.g. Singleton properties) or extend the RDF 1.1 syntax (e.g. RDF-star). As
shown in table 2, Named graphs are the reification method with the lowest number
of triples, but they make no explicit distinction between asserted and non-asserted graphs. While
the other surveyed methods (in particular RDF-star, Wikidata statements and Singleton properties)
reify each claim and assert it with an additional triple, Conjectures use the Named
graph structure to express both statements and reification without adding triples,
which makes them the method for expressing without asserting with the lowest addition of triples.</p>
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Loading Time</title>
        <p>On datasets D1 and D2, Conjectures in the strong form remain competitive with the
most efficient methods, notably RDF-star, and outperform Wikidata statements and Singleton
properties as shown in 2. However, their loading times increase in D3. This performance
discrepancy is attributed to the way the triplestore's parser recognizes conjectural data. Specifically,
checking whether each resource is present in a collection during loading contributes to
the observed delays. In essence, the loading time of the dataset increases proportionally with
the quantity of non-asserted triples (conjectures).</p>
      </sec>
      <sec id="sec-6-3">
        <title>5.3. Dataset weights in triplestore</title>
        <p>The Singleton method exhibits a storage size tenfold greater than the alternative approaches, with
Conjectures in their weak form and Wikidata occupying intermediate positions. RDF-star,
Conjectures in their strong form, and Named graphs have similar sizes, as shown in
figure 2.</p>
      </sec>
      <sec id="sec-6-4">
        <title>5.4. Query Execution Time</title>
        <p>The average response time seems to increase linearly across the surveyed datasets
(D1, D2, D3). For this reason, figure 3 provides a snapshot of the execution time of queries
GQn and FQn on attributions and locations on dataset D3 only.</p>
        <p>As illustrated in figure 3, the response times obtained from the execution of general queries
(GQn) on dataset D3 show that the strong form of Conjectures is less efficient than other methods
when retrieving asserted data, particularly valid claims, currently disputed claims
and undisputed claims (queries GQ1, GQ4 and GQ6), for both creators and locations. However,
Conjectures perform better with disputed claims. In particular, Weak and Strong Conjectures
outperform the other surveyed methods in the retrieval of debated statements with and without
provenance information (queries GQ2 and GQ3) and of accepted claims after being debated (GQ5),
for both locations and creators.</p>
        <p>Similar to what was observed in GQn, Conjectures are less efficient in retrieving valid claims
(FQ1). By contrast, the strong form of Conjectures is the most efficient method for the
retrieval of debated claims with and without contextual information (queries FQ2 and FQ3)
and of accepted claims after debate (FQ5). In the remaining queries, currently disputed claims
(FQ4) and statements that have never been subject to debate (FQ6), Conjectures still maintain
competitive times with the rest of the methods. Strong Conjectures, in particular, address the
significant increase in response times for weak-form queries FQn[3:8]. Essentially, a notable
improvement in the performance of the strong form has been detected: it proves to be the most
efficient method in half of the selected queries and a valid competitor in the remaining ones.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Discussion</title>
      <sec id="sec-7-1">
        <title>6.1. EWA expressivity</title>
        <p>In the query performance assessment, we registered slightly different values for the number
of results of some queries. These differences are explained below by highlighting some intrinsic
differences in the models. Typically, a SPARQL query can still retrieve claims that won a certain
debate by accessing the competing ones stored in the Knowledge Graph (KG). In cases where debates
are not recorded, and only accepted statements are reported, reification approaches fall short.
Reification methods do not distinguish between claims that have never been questioned and
claims settled after a debate, since both are recorded as asserted triples. At this point, an
ontology-dependent solution, such as Wikibase ranking, must be adopted to represent this differentiation.
For instance, the painting Portrait of Dona Isabel de Requesens (Q29651096) has been attributed
to Giulio Romano and marked as the settled claim, but no concurrent claim is reported. The
concept of SETTLED in Conjectures can also capture this nuanced distinction in an
ontology-independent fashion that can be applied to any KG.</p>
        <p>Another instance is when two claims express the same triple but with different qualifiers.
Consider a scenario where a historical painting X was initially attributed to author Y (marked
as an attribution) and this attribution was afterwards considered settled by the community.
The historical attribution cannot be retrieved with a SPARQL query, since its content is also asserted.
However, while Wikidata and Singleton properties provide unique IDs that distinguish such claims
and can be used in the query to retrieve them, RDF-star associates all contextual triples with the
same quoted triple. This becomes more complex in multi-triple claims, which cannot feasibly be
expressed as individual Wikidata statements. Paintings attributed to the collaboration of
multiple individuals are a particularly interesting case. Conjectures use Named graphs to group
statements, which allows such complex statements to be expressed and retrieved with simple
SPARQL queries. Other methods would require other types of grouping (e.g. an n-ary relationship
and/or a blank node), with additional complexity and execution time.</p>
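        <p>The grouping of multi-triple claims can be sketched in Python as follows. This is a simplified illustration with plain tuples rather than a triplestore or a SPARQL engine: the quads, the ex: identifiers and the two helper functions are assumptions made for the example, with the fourth element of each tuple standing in for the name of a graph.</p>
        <preformat>
```python
from collections import defaultdict

# Hypothetical quads (subject, predicate, object, graph); all names are illustrative.
quads = [
    ("ex:paintingX", "ex:attributedTo", "ex:artistA", "ex:claim1"),
    ("ex:paintingX", "ex:attributedTo", "ex:artistB", "ex:claim1"),
    ("ex:paintingX", "ex:attributedTo", "ex:artistC", "ex:claim2"),
]


def group_by_graph(quads):
    """Collect the triples of each graph, so that a claim is retrieved as a unit."""
    graphs = defaultdict(set)
    for s, p, o, g in quads:
        graphs[g].add((s, p, o))
    return dict(graphs)


def claims_mentioning(quads, obj):
    """Return the names of the graphs that contain a triple with the given object."""
    return sorted({g for _, _, o, g in quads if o == obj})


graphs = group_by_graph(quads)
print(sorted(graphs["ex:claim1"]))
print(claims_mentioning(quads, "ex:artistB"))
```
        </preformat>
        <p>Here the collaborative attribution to two artists stays together under one graph name, while the alternative attribution remains a separate claim.</p>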
      </sec>
      <sec id="sec-7-2">
        <title>6.2. EWA efficiency</title>
        <p>Overall, several trends in the efficiency of the surveyed methods are immediately visible:
Singleton properties are systematically slower than the others, while Named graphs and
Conjectures in weak form perform at an intermediate level compared with the fastest methods: Wikidata,
Conjectures in strong form and RDF-star. The strong form also outperforms RDF-star in many
queries where the specifics of debated attributions and past locations become meaningful.
The strong form is the quickest method for expressing debates (disputed claims, GQ2 and FQ2;
disputed claims with provenance, GQ3 and FQ3; settled claims, GQ5 and FQ5), with a small loss
of performance on asserted claims (valid claims, GQn and FQn; undisputed
claims, GQ6 and FQ6) and currently disputed claims (GQ4 and FQ4).</p>
        <p>Conjectures in strong form are also competitive in terms of number of triples and
overall weight in the triplestore. They are competitive on loading time as well, but loading times show
a loss of performance for large datasets that needs to be investigated further.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>7. Conclusions</title>
      <p>This work evaluates the efficiency of EWA mechanisms by comparing several reification methods
(Wikidata, RDF-star, Named graphs, Singleton properties) and the novel Conjectures approach
(weak and strong form) on four major metrics (number of triples, loading time, dataset weight
and query execution time). Alongside the most efficient methods, such as RDF-star and Wikidata
statements, the strong form of Conjectures exhibits notable performance gains, particularly in
retrieving claims about debates (e.g., disputed claims with and without provenance information
and settled claims). In the future, we aim to optimize the efficiency of
the strong form of Conjectures. In particular, we will prioritize the optimization of the loading
process, aiming to reduce the loading time and enhance overall performance.
</p>
      <p>[6] F. Orlandi, D. Graux, D. O’Sullivan, Benchmarking rdf metadata representations:
Reification, singleton property and rdf*, 2021 IEEE 15th International Conference on
Semantic Computing (ICSC) (2021) 233–240. URL: https://api.semanticscholar.org/CorpusID:232151947.
[7] M. Daquino, V. Pasqual, F. Tomasi, F. Vitali, Expressing without asserting in the arts, in:
Proceedings of the Italian Research Conference on Digital Libraries. Padova, Italy, 2022.
[8] P. Hayes, Rdf semantics, w3c recommendation, http://www.w3.org/TR/rdf-mt/ (2004).
[9] N. Noy, A. Rector, P. Hayes, C. Welty, Defining n-ary relations on the semantic web, W3C
working group note 12 (2006).
[10] J. J. Carroll, C. Bizer, P. Hayes, P. Stickler, Named graphs, Journal of Web Semantics 3
(2005) 247–267.
[11] F. Erxleben, M. Günther, M. Krötzsch, J. Mendez, D. Vrandečić, Introducing Wikidata to
the Linked Data Web, in: P. Mika, T. Tudorache, A. Bernstein, C. Welty, C. Knoblock,
D. Vrandečić, P. Groth, N. Noy, K. Janowicz, C. Goble (Eds.), The Semantic Web – ISWC
2014, Springer International Publishing, Cham, 2014, pp. 50–65.
[12] V. Nguyen, O. Bodenreider, A. Sheth, Don’t like rdf reification? making statements
about statements using singleton property, in: Proceedings of the 23rd International
Conference on World Wide Web, WWW ’14, Association for Computing Machinery, New
York, NY, USA, 2014, pp. 759–770. URL: https://doi.org/10.1145/2566486.2567973. doi:10.1145/2566486.2567973.
[13] O. Hartig, Foundations of rdf* and sparql*:(an alternative approach to statement-level
metadata in rdf), in: AMW 2017 11th Alberto Mendelzon International Workshop on
Foundations of Data Management and the Web, Montevideo, Uruguay, June 7-9, 2017.,
volume 1912, Juan Reutter, Divesh Srivastava, 2017.
[14] A.-C. Ngonga Ngomo, I. Fundulaki, A. Krithara, J. Frey, K. Müller, S. Hellmann, E. Rahm,
M.-E. Vidal, A.-C. Ngonga Ngomo, I. Fundulaki, A. Krithara, Evaluation of metadata
representations in rdf stores, Semant. Web 10 (2019) 205–229. URL: https://doi.org/10.3233/SW-180307. doi:10.3233/SW-180307.
[15] V. Nguyen, O. Bodenreider, K. Thirunarayan, G. Fu, E. Bolton, N. Queralt-Rosinach, L. I.
Furlong, M. Dumontier, A. Sheth, On reasoning with rdf statements about statements
using singleton property triples (2015).
[16] D. Hernández, A. Hogan, M. Krötzsch, Reifying RDF: what works well with wikidata?, in:
T. Liebig, A. Fokoue (Eds.), Proceedings of the 11th International Workshop on Scalable
Semantic Web Knowledge Base Systems, volume 1457 of CEUR Workshop Proceedings,
CEUR-WS.org, 2015, pp. 32–47.
[17] A. Hogan, J. Umbrich, A. Harth, R. Cyganiak, A. Polleres, S. Decker, An empirical survey
of linked data conformance, J. Web Semant. 14 (2012) 14–44. doi:http://dx.doi.org/10.2139/ssrn.3198962.
[18] G. Demartini, I. Enchev, M. Wylot, J. Gapany, P. Cudré-Mauroux,
BowlognaBench—Benchmarking RDF Analytics, in: K. Aberer, E. Damiani, T. Dillon (Eds.), Data-Driven Process
Discovery and Analysis, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012, pp. 82–102.
[19] Y. Theoharis, V. Christophides, G. Karvounarakis, Benchmarking Database Representations
of RDF/S Stores, in: Y. Gil, E. Motta, V. R. Benjamins, M. A. Musen (Eds.), The Semantic
Web – ISWC 2005, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, pp. 685–701.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. J. C.</given-names>
            <surname>Graham Klyne</surname>
          </string-name>
          ,
          <article-title>Resource Description Framework (RDF): Concepts and Abstract Syntax (</article-title>
          <year>2004</year>
          ). URL: https://www.w3.org/TR/2004/REC-rdf-concepts-20040210/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Giunti</surname>
          </string-name>
          , G. Sergioli, G. Vivanet,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pinna</surname>
          </string-name>
          ,
          <article-title>Representing n-ary relations in the semantic web</article-title>
          ,
          <source>Logic Journal of IGPL</source>
          (
          <year>2019</year>
          ). doi:10.1093/jigpal/jzz047.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Freundschuh</surname>
          </string-name>
          ,
          <article-title>Crime stories in the historical urban landscape: narrating the theft of the mona lisa</article-title>
          ,
          <source>Urban History</source>
          <volume>33</volume>
          (
          <year>2006</year>
          )
          <fpage>274</fpage>
          -
          <lpage>292</lpage>
          . doi:10.1017/S0963926806003816.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <article-title>Salvator mundi: Going out on a limb</article-title>
          ,
          <source>KUR - Kunst und Recht</source>
          <volume>25</volume>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.15542/KUR/2023/2/3. doi:10.15542/KUR/2023/2/3.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hayes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Welty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Uschold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vatant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Manola</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Herman</surname>
          </string-name>
          , J. Lawrence,
          <article-title>Defining N-ary Relations on the Semantic Web (</article-title>
          <year>2006</year>
          ). URL: https://www.w3.org/TR/2006/NOTE-swbp-n-aryRelations-20060412/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>