=Paper= {{Paper |id=Vol-2578/PIE5 |storemode=property |title=Promoting Data Provenance Tracking in the Archaeological Interpretation Process |pdfUrl=https://ceur-ws.org/Vol-2578/PIE5.pdf |volume=Vol-2578 |authors=Sara Migliorini,Alberto Belussi,Elisa Quintarelli |dblpUrl=https://dblp.org/rec/conf/edbt/MiglioriniBQ20 }} ==Promoting Data Provenance Tracking in the Archaeological Interpretation Process== https://ceur-ws.org/Vol-2578/PIE5.pdf
                Promoting Data Provenance Tracking in the
                     Archaeological Interpretation Process
                Sara Migliorini                                       Alberto Belussi                            Elisa Quintarelli
          Dept. of Computer Science,                            Dept. of Computer Science,                   Dept. of Computer Science,
             University of Verona                                  University of Verona                         University of Verona
           sara.migliorini@univr.it                              alberto.belussi@univr.it                     elisa.quintarelli@univr.it

ABSTRACT
In this paper we propose a model and a set of derivation rules for tracking data provenance during the archaeological interpretation process. The interpretation process is the main task performed by an archaeologist who, starting from ground data about evidence and findings, tries to derive knowledge about an ancient object or event. In particular, in this work we concentrate on the dating process used by archaeologists to assign one or more time intervals to a finding in order to define its lifespan on the temporal axis, and we propose a framework to represent such information and infer new knowledge including provenance of data. Archaeological data, and in particular their temporal dimension, are typically vague, since many different interpretations can coexist; we therefore use Fuzzy Logic to assign a degree of confidence to values and Fuzzy Temporal Constraint Networks to model relationships between the datings of different findings.

KEYWORDS
Provenance, Temporal Constraints, Information discovery

1    INTRODUCTION
Interpretation and knowledge discovery represent a significant amount of the archaeological activity. Such an interpretation process is usually based on direct and indirect observations of domain experts (archaeologists), who also consider previous interpretations performed by themselves or other colleagues. Spatial and temporal dimensions are usually of considerable interest for archaeological research, because they allow new important relationships between findings to be derived, in particular with regard to stratigraphic analysis. A typical example involving such an interpretation process is the dating activity. Considering the process through which objects are usually manually dated by archaeologists, some proposals in the literature (e.g. [5, 7]) apply existing automatic techniques for time reasoning, in order to automatically derive new temporal knowledge or validate existing interpretations based on the available spatial and temporal information.

We can observe that archaeological interpretations depend not only on direct observations, but also on past interpretations performed by the same archaeologist or other colleagues. In general, archaeological data, and more specifically the temporal dimension, are typically vague since many different interpretations can coexist; each one has its own degree of confidence and consequently several different global interpretations can be derived from them. Each interpretation is typically identified by its author; moreover, the confidence greatly depends on the archaeologist's reputation in the field. For these reasons, during the interpretation process, it is necessary not only to infer new knowledge but also to track the provenance of the information that has affected the inference. More specifically, it is necessary to keep track of the pieces of information (past interpretations) from which the current new knowledge originated, together with their authorship.

In computer science, provenance is the ability to record the history of data and its place of origin, and is useful to determine the chronology of the ownership, custody or location of any object and to provide a critical foundation for assessing authenticity and enabling trust. As highlighted in [9], data provenance is separable from other forms of provenance. In our specific archaeological scenario, the term provenance comes originally from the art world and it has been applied in archaeology and paleontology as well, where it refers to keeping trace of all the steps involved in producing a scientific result, such as a finding, from experiment design through acquisition of raw data, and all the subsequent steps of data selection, analysis and visualization. Such information is necessary for the reproduction of a given result and can be useful to establish precedence (in case of patents, Nobel prizes, etc.) [11]; the notion is different from that of provenience.

In recent years there have been different proposals of formal models for provenance storage, maintenance, and querying; PROV is the W3C recommendation for provenance data model and language [1]. Data provenance [8] differs from other forms of meta-data because it is based on relationships among objects. Indeed, the ancestry relationships, used in provenance for correlated objects, form a directed graph that can be represented through semistructured data models. In [12] the authors have encoded provenance graphs into Datalog and expressed inference rules and constraints with the same declarative language, in order to determine inconsistencies with respect to temporal constraints or provenance information (e.g. inconsistent cycles).

The aim of this paper is to propose a model and a set of derivation rules that are able to track the data provenance during the archaeological interpretation process. More specifically, we concentrate on the dating process used by archaeologists to assign one or more lifespans to a finding. Such a process was initially modelled in [5, 7] for checking the temporal data consistency and vagueness reduction based on the use of Fuzzy Temporal Constraint Networks (FTCN) [4, 13]; here we extend it in order to manage and infer new knowledge including provenance of data and complex inferences.

The remainder of the paper is organized as follows: Sect. 2 provides a formal description of the problem, while Sect. 3 describes the proposed solution; Sect. 4 exemplifies the application of this solution to a real-world case scenario.

© 2020 Copyright held by the owner/author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference, March 30-April 2, 2020 on CEUR-WS.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC BY 4.0.

2    PROBLEM FORMULATION
This paper refers to the Spatio-Temporal ARchaeological model (Star) presented in [5, 7]. In the Star model three main objects
of interest can be recognized: ST_InformationSource, ST_ArchaeoPart and ST_ArchaeoUnit. An ST_ArchaeoUnit is a complex archaeological entity obtained from an interpretation process performed by the responsible officer. Such an interpretation is done based on some findings (represented by ST_ArchaeoPart instances) retrieved during an excavation process or a bibliographical analysis (represented by ST_InformationSource instances). Therefore, each ST_ArchaeoUnit is connected to one or more constituent ST_ArchaeoParts, each one representing a single result of an excavation or other investigation processes.

As regards the dating process, we can observe that the dating of an ST_ArchaeoPart instance (when not available from other objective measures) can also be determined from the dating of other correlated instances, or the dating of an overall ST_ArchaeoUnit can be obtained starting from the dating of its constituent partitions. In this paper we extend the model proposed in [5, 7] in order to keep track of the provenance of such information and to provide a measure of the contribution provided by each author of the considered past interpretations.

In the Star model, temporal information regarding an archaeological finding can be quantitative or qualitative: quantitative temporal information is represented by time instants, while qualitative information is temporal information defined using the well-known Allen's interval algebra [2]. Through the use of quantitative and qualitative temporal information it is possible to derive a topological structure composed of a set of related objects. Notice that inside a topological structure, some instants can be realized, namely they have an associated quantitative characterization (i.e., an associated time instant value), while others can be defined only qualitatively by means of relations with other nodes (i.e., represented as dummy nodes connected to other nodes).

Example 2.1. Let us consider four archaeological findings labeled as $f_1$, $f_2$, $f_3$ and $f_4$ which are coarsely dated as follows: $f_1$, $f_2$ have been located in the 19th century by archaeologist $a_1$, while $f_3$ has been dated 1850 by $a_2$ and $f_4$ has been dated 1820 by $a_3$. Besides these geometrical values, the following temporal relations have been detected: $f_1$ before $f_2$ and $f_3$ by $a_4$, while $f_2$ before $f_3$ and after $f_4$ by $a_5$. This knowledge can be represented by the topological complex in Fig. 1. Dates associated to nodes $f_3$ and $f_4$ are realized as the years 1850 and 1820, respectively. Conversely, dates related to nodes $f_1$ and $f_2$ are not realized, but they are located between two dummy nodes representing the years 1800 and 1899. Notice that both nodes and arcs can have an additional label representing the archaeologists that define such quantitative or qualitative temporal information. Given such topological relations some automatic reasoning techniques can be applied in order to specialize some coarse-grained dates and realize the dummy nodes. For instance, as regards this example, the geometric temporal value associated to $f_2$ can be restricted from 1800-1899 to 1820-1850, and consequently the dating of $f_1$ can be restricted from 1800-1899 to 1800-1820. When considering the provenance propagation, we can observe that the new dating of $f_2$ is determined by archaeologist $a_1$ who generally locates it in the 19th century, but also more specifically by $a_5$ who defines the relations with $f_3$, $f_4$ and by $a_2$ and $a_3$ who give a precise date to $f_3$ and $f_4$. Similar considerations can be made also for the new dating of $f_1$. □

Figure 1: Example of topological complex representing temporal relations between archaeological partitions.

3    PROPOSED SOLUTION
Temporal Constraint Network (TCN) [10] is a formalism for representing temporal knowledge based on metric constraints among pairs of time-points. A TCN can be represented by a directed graph, where each node is associated with a variable and each arc corresponds to the constraint between the connected variables. However, in the archaeological domain, temporal knowledge is generally characterized by a level of vagueness and dates are usually expressed as periods of great confidence together with an additional interval, i.e. the safety interval. For instance, the construction date of a building can be expressed as: between 1830 and 1850 with more confidence, plus or minus 10 years of safety.

Fuzzy set theory has been used to model the uncertainty of natural language and is able to handle the concept of partial truth (or degree of truth). In particular, given a fuzzy set $F$, the term support denotes the set of elements with a possibility greater than zero, while the term core denotes the set of elements with a possibility equal to 1. Therefore, a fuzzy representation of time seems to be the most appropriate solution for representing time dimensions in the archaeological context.

A fuzzy temporal constraint network (FTCN) is a generalization of TCN where a degree of possibility is associated with each possible value of a temporal constraint. In particular, a constraint between a pair of time-points represents a possibility distribution over temporal distances [13].

Definition 3.1 (fuzzy temporal constraint). Given two temporal variables $x_i$ and $x_j$, a fuzzy temporal constraint $C_{ij}$ between them is represented as a possibility distribution function $\pi_{ij} : \mathbb{R} \to [0, 1]$ that constrains the possible values for the temporal distance $x_j - x_i$. □

In other words, $\pi_{ij}(d)$ is the possibility degree for the distance $x_j - x_i$ to take the value $d$ under the constraint $C_{ij}$. As done in our previous work [5, 7], this paper considers only trapezoidal distributions, which are sufficiently expressive in practical contexts while being computationally less expensive during the reasoning. They can be represented as a 4-tuple $\langle a, b, c, d \rangle$, where the intervals $[b, c]$ and $[a, d]$ represent the core and the support of the fuzzy set, respectively. Such a tuple representation is enriched with a value $\alpha_k$, called degree of consistency, which denotes the height of the trapeze and allows the representation of non-normalized distributions. This is necessary in the general case, because even if the initial knowledge is always represented by a trapeze with unitary height, during the reasoning the conjunction of some constraints can produce trapezes with a height less than one.

Starting from this representation, in this paper we introduce the possibility to specify for each temporal constraint also its provenance (authorship). Moreover, we introduce a modified set of operations on these constraints which allow provenance information to be tracked and updated during the interpretation process. Given such considerations, the notion of provenance-aware fuzzy temporal constraint (PA-FTCN) can be defined as follows.

Definition 3.2 (provenance-aware fuzzy trapezoidal constraint). Given two variables $x_i$ and $x_j$, a provenance-aware fuzzy trapezoidal temporal constraint $C_{ij} = \{T_1, \ldots, T_m\}$ is a disjunction of
trapezoidal distributions $\pi_{T_k}$, each one denoted by a trapeze $T_k = \langle a_k, b_k, c_k, d_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$, where the characteristic 4-tuple is enriched with a degree of consistency $\alpha_k$ representing its height [3] and a set of provenance statements $\Omega = \{(o_1, d_1), \ldots, (o_n, d_n)\}$. Each provenance statement $\omega_i = (o_i, d_i)$ contains a label $o_i$ identifying the data owner and a number $d_i \in [0, 1]$ representing the degree of ownership. □

The components of a trapeze $T_k$ take values as follows: $a_k, b_k \in \mathbb{R} \cup \{-\infty\}$, $c_k, d_k \in \mathbb{R} \cup \{+\infty\}$, $\alpha_k \in [0, 1]$, $\Omega \subseteq A \times [0, 1]$ where $A$ is a set of labels representing known data owners. As mentioned before, the support of $\pi$ is defined as $supp(\pi_{T_k}) = \{x : \pi_{T_k}(x) > 0\} = [a_k, d_k]$, while the core as $core(\pi_{T_k}) = \{x : \pi_{T_k}(x) = \alpha_k\} = [b_k, c_k]$. Moreover, this paper considers only well-formed trapezes: a trapeze $T = \langle a, b, c, d \rangle$ is well-formed if $a \le b \le c \le d$. In the following the set of well-formed trapezes is denoted as $\mathcal{T}$. From this definition several shapes are allowed, as illustrated in Fig. 2.

Figure 2: Possible shapes of a trapezoidal possibility distribution function: (a) $a < b < c < d$, (b) $a = b < c < d$, (c) $a < b < c = d$, and (d) $a < b = c < d$.

The semantics of a constraint $C_{ij} = \{T_1, \ldots, T_m\}$ is the possibility distribution function $\pi_{C_{ij}}$ corresponding to the disjunction of the trapezoidal distributions $\pi_{T_k} : \mathbb{R} \to [0, 1]$ for $k = 1, \ldots, m$.

Definition 3.3 (trapezoid possibility distribution function). The possibility distribution function of a generic trapeze $T_k \in \mathcal{T}$ can be written as:
\[
\pi_{T_k}(x) =
\begin{cases}
0 & \text{if } x < a_k \lor x > d_k \\
\alpha_k \cdot \dfrac{x - a_k}{b_k - a_k} & \text{if } a_k \le x < b_k \\
\alpha_k \cdot \dfrac{d_k - x}{d_k - c_k} & \text{if } c_k < x \le d_k \\
\alpha_k & \text{otherwise}
\end{cases}
\]
□

Definition 3.4 (solution). Let $P = \langle X, C \rangle$ be a provenance-aware fuzzy temporal constraint network. An n-tuple $S = \{s_1, \ldots, s_n\}$, where $s_i \in \mathbb{R}$, is a possible solution of $P$ at degree $\alpha$ if and only if $deg(S) = \min_{i,j} \{\pi_{C_{ij}}(s_j - s_i)\} = \alpha$, where $\pi_{C_{ij}}$ stands for the possibility distribution associated to the constraint $C_{ij}$ and the degree corresponds to the least satisfied constraint [13]. □

In the case of a PA-FTCN, each solution is characterized by a degree of satisfaction reflecting a trade-off among potentially conflicting constraints, and a set of provenance statements characterizing the ownership of each constraint.

The most widely used algorithm for constraint propagation is the path-consistency algorithm.

Definition 3.5 (path-consistency algorithm). Given three variables $x_i$, $x_k$ and $x_j$ of a PA-FTCN $P$ and a local instantiation $x_i = d_i$, $x_j = d_j$, a new constraint between $x_i$ and $x_j$ can be induced from pre-existing constraints by the path consistency algorithm as follows: $\pi_{ij} \otimes (\pi_{ik} \circ \pi_{kj})(x)$, where $(\pi_{ik} \circ \pi_{kj})$ is the composition (addition between fuzzy sets) of the constraints between $x_i - x_k$ and $x_k - x_j$, while $\pi_{ij}$ is the existing constraint between $x_i - x_j$. □

In order to determine the result of the previous definition, it is necessary to define the required operations. More specifically, it is necessary to specialize some operations on fuzzy sets to operations on trapezoids with provenance statements. In particular, the specialization of the inversion ($T_k^{-1}$), composition ($T_1 \circ T_2$), conjunction ($T_1 \otimes_a T_2$) and disjunction ($T_1 \oplus_a T_2$) operations on trapezoidal distributions can be found in [6]. Here we specialize them in order to take care also of the provenance information. In particular, our aim is on the one hand to propagate provenance labels and on the other to provide a degree of ownership to each author; thus, we need to define the concept of similarity between two trapezes.

Definition 3.6 (trapeze similarity). Given two trapezes $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket$, the degree of similarity $sim(T_1, T_2) \in [0, 1]$ between them is defined as:
\[
sim(T_1, T_2) = \frac{area(T_1 \cap T_2)}{area(T_1 \cup T_2)} \tag{1}
\]

In other words the similarity is maximum (equal to 1) when the two trapezes coincide, while it is minimum (equal to 0) when the two trapezes are completely disjoint; otherwise it is proportional to the degree of overlap between them. Notice that there can be two cases where the degree of similarity is equal to 0: i) when the intersection is empty, and ii) when the union of the two trapezes generates an infinite trapeze. This second case is possible, for instance, when one of the trapezes represents a qualitative precedence constraint. In order to distinguish these two situations, we use the symbol 0 when the intersection is empty (no similarity at all), and the symbol ⊥ when the union is infinite (very low similarity).

During the various operations the degree of ownership assigned to each author is computed on the basis of the starting degree of ownership and the similarity between the original constraint and the newly obtained one.

Definition 3.7 (inversion). Given a constraint $C_{ij} = \{T_1, \ldots, T_m\}$ between variables $x_i$ and $x_j$, the constraint $C_{ij}^{-1}$ represents the equivalent constraint holding between $x_j$ and $x_i$. Such a constraint can be obtained by inverting each constituent trapezoid $T_k = \langle a_k, b_k, c_k, d_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$ contained in $C_{ij}$, as follows: $T_k^{-1} = \langle -d_k, -c_k, -b_k, -a_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$. □

Notice that in this case the provenance information is not affected by the operation.

Definition 3.8 (composition ◦). Given two constraints $C_1$ and $C_2$, the composition of two generic trapezoids $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket \in C_1$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket \in C_2$, assuming that $\alpha_1 \ge \alpha_2$, is defined as: $T_1 \circ T_2 = \langle a_1 + a_2, b_1' + b_2, c_1 + c_2', d_1 + d_2 \rangle[\min\{\alpha_1, \alpha_2\}]\llbracket\Omega_1 \cup \Omega_2\rrbracket$, where $b_1' = a_1 + (\alpha_2/\alpha_1)(b_1 - a_1)$ and $c_2' = d_2 - (\alpha_2/\alpha_1)(d_2 - c_2)$ and
\[
\llbracket\Omega_1 \cup \Omega_2\rrbracket = \{(o_i, d_i) \mid (o_i, d_i) \in \Omega_1 \lor (o_i, d_i) \in \Omega_2\} \tag{2}
\]
□

The composition of two constraints produces a bigger trapezoid w.r.t. the source trapezoids; thus, the provenance information is the union of the input ones, with the same degree of ownership.

The conjunction of two generic fuzzy possibility distribution functions $\pi_1$ and $\pi_2$ is defined as: $\forall d \in \mathbb{R}\ (\pi_1 \otimes \pi_2(d) = \min\{\pi_1(d), \pi_2(d)\})$. Unfortunately, this operation cannot be directly applied to trapezoids and is more complex to specialize than composition, because given two generic trapezoids $T_1$ and $T_2$, the function $T_1 \otimes T_2 = \min\{T_1, T_2\}$ is not always a trapeze: Fig. 3.a
and Fig. 3.c contain two examples of such a situation. Therefore, some sort of approximation of $T_1 \otimes T_2$ has to be defined to obtain a trapeze. For the application context considered by this paper, the following approximation criteria formulated in [3] are appropriate, where $T$ is the result of the approximated conjunction: $core(\pi_T) = core(\pi_{T_1} \otimes \pi_{T_2})$, $h(\pi_T) = h(\pi_{T_1} \otimes \pi_{T_2})$, and $supp(\pi_T) \subseteq supp(\pi_{T_1} \otimes \pi_{T_2})$. In other words, the approximation shall ensure that the core of the obtained distribution is maintained while the possibility of the support elements outside the core can be slightly modified. This operation is formalized as follows.

Figure 3: Two examples of approximated conjunction operation $\otimes_a$ between trapezoids: in (a) and (c) the result of the classical conjunction operation between fuzzy possibility distribution functions, and in (b) and (d) the corresponding approximation which produces a trapeze.

[...] the degree of ownership on the basis of the similarity between the original information and the obtained one. Notice that, when the same author $o_i$ is present in both trapezoids $T_1$ and $T_2$, we will compute its degree of ownership as $d_i = \max(sim(T_1, T), sim(T_2, T))$. Moreover, $\max(\bot, sim(T_i, T)) = sim(T_i, T)$.

Finally, the disjunction operation is not required by the path consistency algorithm, but it can be useful for eliminating redundant trapezes that are accidentally introduced by users or are due to constraint propagation. Thus, it is an operation useful for compressing available information.

The disjunction of two general fuzzy distribution functions $\pi_1$ and $\pi_2$ is defined as $\forall d \in \mathbb{R} : \pi_1 \oplus \pi_2(d) = \max\{\pi_1(d), \pi_2(d)\}$. However, like conjunction, disjunction is not closed in the algebra of trapezoids. Therefore, the idea is to compute a tentative trapeze, and then check whether it corresponds to the disjunction of the involved constraints (i.e., it corresponds to one of the two involved trapezes); otherwise the constraints will be kept separate.
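The pointwise minimum and maximum used for conjunction and disjunction also give a simple way to experiment with the similarity of Equation (1). The following Python sketch is only illustrative and rests on several assumptions that are not stated in the paper: area(T1 ∩ T2) and area(T1 ∪ T2) are read as the areas under the pointwise minimum and maximum of the two possibility functions, the areas are approximated numerically with a midpoint rule, and only finite trapezes are handled (so the ⊥ case is not covered). The names `Trapeze`, `area` and `similarity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapeze:
    """Finite trapeze <a, b, c, d>[alpha]; provenance is omitted here."""
    a: float
    b: float
    c: float
    d: float
    alpha: float = 1.0

    def possibility(self, x: float) -> float:
        """Possibility degree of x, following the four cases of Definition 3.3."""
        if x < self.a or x > self.d:
            return 0.0
        if x < self.b:                      # rising edge, a <= x < b
            return self.alpha * (x - self.a) / (self.b - self.a)
        if x > self.c:                      # falling edge, c < x <= d
            return self.alpha * (self.d - x) / (self.d - self.c)
        return self.alpha                   # core, b <= x <= c


def area(f, lo: float, hi: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the area under f on [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width


def similarity(t1: Trapeze, t2: Trapeze) -> float:
    """Equation (1), reading intersection/union as pointwise min/max (assumption)."""
    lo, hi = min(t1.a, t2.a), max(t1.d, t2.d)
    union = area(lambda x: max(t1.possibility(x), t2.possibility(x)), lo, hi)
    if union == 0.0:
        return 0.0                          # both trapezes have zero area
    inter = area(lambda x: min(t1.possibility(x), t2.possibility(x)), lo, hi)
    return inter / union


century = Trapeze(1800, 1800, 1899, 1899)   # crisp "19th century" interval
building = Trapeze(1820, 1830, 1850, 1860)  # 1830-1850 with 10 years of safety
print(similarity(century, building))
```

Under this reading, the building date lies entirely inside the century, so the ratio reduces to the area of the smaller trapeze over the area of the larger one, roughly 0.3. An exact, closed-form computation of the areas would be preferable in a real implementation; the numerical integration is used here only to keep the sketch short.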

Table 1: Possible intersection between two trapezes and
corresponding element of the conjunction result.                                        Figure 4: Two examples of approximated disjunction op-
                                                                                        eration ⊕a between trapezoids: in (a) the operation can
                                                                                        be performed, while in (b) the operation cannot be per-
  Situation                                Result
                                                                                        formed.
  a 2 ∈ (a 1, b1 )
                                                   b1   if α 1 = α 2 ∧ b1 > b2
                                                  
                                                  
                                           b′ =     b1   if α 1 < α 2
                                                  
  a1    a2
                                                   b2                                      Definition 3.10 (disjunction ⊕a ). Given two constraints C 1 and
                                                         otherwise
                                                  
                                                                                       C 2 , the disjunction between two trapezesT1 = ⟨a 1 , b1 , c 1 , d 1 ⟩[α 1 ]Ω1 ∈
  d 1 ∈ (c 2, d 2 )
                                                                                        C 1 and T2 = ⟨a 2 , b2 , c 2 , d 2 ⟩[α 2 ] ∈ C 2′ is defined as follows [3]:
                                                   c1   if α 1 = α 2 ∧ c 1 > c 2
                                                                                        T1 ⊕aT2 = ⟨a, b, c, d⟩[max{α 1 , α 2 }]JΩ1 ∪Ω2 K where a = min{a 1 , a 2 },
                                                  
                                                  
                                           c′ =     c1   if α 1 < α 2
                                                  
                       d1    d2
                                                   c2
                                                  
                                                         otherwise                      b = b1 if α 1 > α 2 or b = b2 if α 2 > α 1 or b = min{b1 , b2 } other-
                                                                                        wise, c = c 1 if α 1 > α 2 or c = c 2 if α 2 > α 1 or c = min{c 1 , c 2 }
                                                  

                  a2         d1                                                         otherwise, d = max{d 1 , d 2 } and
                                                                                            JΩ1 ∪ Ω2 K = {(oi , di ) | oi ∈ A1 ∪ A2 ∧ di = sim(Ti ,T )}        □
   a2        d1              a1   d2       b ′ is the highlighted intersection point.
                                                                                           Fig. 4.a illustrates a case where the disjunction is executed,
             a1         d2                                                              while Fig. 4.b illustrates a case where it cannot be executed. The
                                                                                        disjunction has the same behaviour on data provenance of the
        a2    d1             a1   d2       c ′ is the highlighted intersection point.   conjunction.

                                                                                        4     CASE STUDY
    Definition 3.9 (conjunction ⊗a ). Given two constraints C 1 and
                                                                                        This section illustrates an example of reasoning performed on
C 2 , the conjunction between two trapezoidsT1 = ⟨a 1 , b1 , c 1 , d 1 ⟩[α 1 ]
                                                                                        archaeological data that allows the identification of some new
JΩ1 K ∈ C 1 and T2 = ⟨a 2 , b2 , c 2 , d 2 ⟩[α 2 ] JΩ2 K ∈ C 2 is defined as
                                                                                        temporal and data provenance knowledge. It regards an archaeo-
follows: T1 ⊗a T2 = T ∈ T inf (T1 ,T2 ) : ∀T1 ∈ T inf (T1 ,T2 ), πTi ≤
                                                                                        logical object called Porta Borsari which is an ancient Roman gate
πT , where T inf (T1 ,T2 ) = {T | πT ≤ πT1 ⊗ πT2 ∧ h(πT ) =                             in Verona. This object has been modeled as an ST_ArchaeoUnit
h(πT1 ⊗πT2 )} [3]. The trapezoidT can be computed as follows:T =                        by author a 1 , who also identifies and dates three distinct phases
(max{a 1 , a 2 }, b ′, c ′, min{d 1 , d 2 })[min{α 1 , α 2 }]JΩ1 ∪ Ω2 K where           into its life:
b ′ and c ′ depends on the 8 possible intersections between T1 and
T2 illustrated in Table 1 and                                                                 • Phase A – first foundation as Porta Iovia during the Late
                                                                                                Republican Time, which spans from 200 B.C. to 27 B.C.;
  JΩ1 ∪ Ω2 K = {(oi , di ) | oi ∈ A1 ∪ A2 ∧ di = sim(Ti ,T )}                       □         • Phase B – reconstruction during the Claudian Time, which
   The set T inf is the set of trapeze that approximate the conjunc-                            spans from 41 A.C. to 54 A.C.;
tion from “below”, the result of the conjunction is the greatest                              • Phase C – Teodorician changes during the Middle-Age,
trapeze in this set. Some examples of T1 ⊗a T2 are illustrated                                  which spans from 312 A.C. to 553 A.C.
in Fig. 3. In case (d) it is evident that the height of the resulting                      This information is represented in Fig. 5-7 by using two nodes
trapeze can become less than one, hence the degree of consistency                       for each phase X , a node X s denoting the phase start and a
α becomes necessary.                                                                    node X e denoting the phase end. An arrow connects X s with the
   As regards to the provenance, in this case, we keep track of all                     network start node s, while another arrow connects X s with X e .
authors who contribute to the trapeze conjunction, but we update                        The labels on these arrows is derived from the phase duration
and its relation with date associated to the start node s (in our                                                        ,1)ǁ   P263     〈0,0
                                                                                                                                              ,173
                                                                                                                   〉ǁ(a 3
example 200 B.C.).                                                                                             ,165                                ,173
                                                                                                                                                        〉ǁ(a
                                                                                                         50,155                                             1 ,1)
                                                                                                    40,1                                                          ǁ
                                                                                                  1                      〈0,0,173,173〉ǁ(a ,1)ǁ
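As a rough illustration of this labelling, the sketch below derives the two edge labels of each phase from its start and end years. The names Trapezoid, ORIGIN_YEAR, offset and phase_edges are hypothetical and not part of the paper. The sketch assumes that years are written as signed integers with B.C. negative, that the start node s is anchored at 200 B.C., and that distances are plain year differences with no correction for the missing year zero, which is the convention that reproduces the labels visible in Fig. 5-7.

```python
from typing import Dict, NamedTuple, Tuple


class Trapezoid(NamedTuple):
    """Hypothetical container for a trapezoid <a, b, c, d> (unit height omitted)."""
    a: int
    b: int
    c: int
    d: int


# Assumption: the start node s is anchored at 200 B.C. (B.C. years are negative).
ORIGIN_YEAR = -200


def offset(year: int) -> int:
    """Distance in years from the start node s (no year-zero correction)."""
    return year - ORIGIN_YEAR


def phase_edges(start_year: int, end_year: int) -> Tuple[Trapezoid, Trapezoid]:
    """Return the labels of the edges s -> Xs and Xs -> Xe for one phase."""
    start, end = offset(start_year), offset(end_year)
    s_to_xs = Trapezoid(start, start, end, end)             # position of the phase w.r.t. s
    xs_to_xe = Trapezoid(0, 0, end - start, end - start)    # phase duration
    return s_to_xs, xs_to_xe


# Phases of Porta Borsari as dated by author a1 in the case study.
PHASES: Dict[str, Tuple[int, int]] = {"A": (-200, -27), "B": (41, 54), "C": (312, 553)}
EDGES = {name: phase_edges(*years) for name, years in PHASES.items()}

for name, (s_to_xs, xs_to_xe) in EDGES.items():
    print(name, tuple(s_to_xs), tuple(xs_to_xe))
# A (0, 0, 173, 173) (0, 0, 173, 173)
# B (241, 241, 254, 254) (0, 0, 13, 13)
# C (512, 512, 753, 753) (0, 0, 241, 241)
```

Provenance annotations are left out on purpose: in the case study every phase edge carries the same annotation ⟦(a1, 1)⟧, so only the numeric part of each label is computed here.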
   Subsequently, other archaeologists have identified some findings as archaeological partitions belonging to this archaeological unit. Table 2 reports some information about them together with the associated dating. As regards the dating, we assume that the first archaeologist who found an archaeological partition simply assigns it to one of the identified phases, while later the same or other authors will restrict such dating as soon as new information becomes available. The author responsible for the identification of the phase membership is reported in column Ph inside round brackets together with the phase name, while the author(s) responsible for the fine-grained dating is (are) reported in column Dating. Notice that, in order not to clutter the notation, we have omitted the original unitary height of the trapeze (namely [1]). Moreover, since the table reports initial information, we assume that when more than one author is present in Tab. 2, the contribution provided by each author is equal, i.e. the reported date is the result of a joint work.

Table 2: Dating of each partition and associated phase.

  Archaeo. Partition                       Ph       Dating
  P208 Foundation and North Tower          A (a1)   ⟨−110, −100, −1, +9⟩⟦(a2, 1)⟧                I B.C. ± 10 years
  P263 Structures of eastern facade        A (a1)   ⟨−60, −50, −45, −35⟩⟦(a3, 1)⟧                Middle of I B.C. ± 10 years
  P214 Front of the external facade        B (a1)   ⟨35, 45, 50, 60⟩⟦(a4, 1)⟧                    Middle of I A.C. ± 10 years
  P248 External Foundations                B (a1)   ⟨−9, 1, 100, 110⟩⟦(a1, 0.5), (a4, 0.5)⟧      I A.C. ± 10 years
  P275 Internal Foundations                B (a1)   ⟨−10, 1, 50, 100⟩⟦(a2, 0.5), (a3, 0.5)⟧      Middle of I A.C. ± 5 years
  P250 Defensive structures                C (a1)   ⟨401, 450, 500, 500⟩⟦(a2, 1)⟧                2nd half of V A.C.

   Finally, author a5 identifies the following temporal relations between partitions: P208 terminates before P263 starts, and P248 terminates before P214 starts. These precedence relations have to be modeled with an arc ⟨0, 0, ∞, ∞⟩⟦(a5, 1)⟧; however, to avoid cluttering the diagram, the figure reports only the author label.
   In accordance with the transformation rules of the previous section, the first operation to perform is the definition of a common coordinate reference system. The origin of such a system is placed at 200 B.C., since it is the earliest date in the model, while the granularity is the year, since for all dates the minimum granularity is at least a year. In order to simplify the presentation, the resulting network is presented through three portions, each one corresponding to a different phase. The overall network can be obtained by combining the three sub-networks and by adding an edge from phase A to phase B and an edge from phase B to phase C, both labeled with ⟨0, 0, +∞, +∞⟩⟦(a1, 1)⟧. These edges represent the precedence relations between phases.

Figure 5: Portion of FTCN related to phase A.

   Fig. 5 illustrates the subnetwork related to phase A: node s represents the starting point, nodes As and Ae represent the start and end points of the phase respectively, while nodes P263 and P208 represent the dating of the corresponding archaeological partitions. This portion of FTCN allows computing some derived constraints for the nodes based on the declared ones, using the formula in Def. 3.5: π′ij(x) = πij ⊗a (πik ◦ πkj(x)).
   In particular, a more precise relation can be derived between partition P208 and partition P263, which is initially represented simply as an edge labeled with the constraint ⟨0, 0, +∞, +∞⟩. In particular, by assuming i = P208, k = s and j = P263, the following new constraint π′ij can be derived between P208 and P263:

   π′ij = πij ⊗a (πik ◦ πkj)
        = πij ⊗a (πki⁻¹ ◦ πkj)
        = ⟨0, 0, ∞, ∞⟩⟦(a1, 1)⟧ ⊗a (⟨−209, −199, −100, −90⟩⟦(a2, 1)⟧ ◦ ⟨140, 150, 155, 165⟩⟦(a3, 1)⟧)
        = ⟨0, 0, ∞, ∞⟩⟦(a1, 1)⟧ ⊗a ⟨−69, −49, 55, 75⟩⟦(a2, 1), (a3, 1)⟧
        = ⟨0, 0, 55, 75⟩⟦(a1, ⊥), (a2, 0.52), (a3, 0.52)⟧

   From this derivation it follows that the distance between P208 and P263 can be from 0 to 75 years, with the highest possibility up to 55. This is consistent with the observation that P208 is located in I B.C., but it shall precede P263, which is located in the middle of I B.C. As regards the authors’ ownership, we can observe that all three authors participate in the final result, but with different degrees of ownership. In particular, the final degree of ownership for a2 and a3 is 0.52, computed using Def. 3.6, while a1 is reported with a degree of similarity equal to ⊥, since the union operator produces a figure with an infinite area.
   A similar operation can be performed on the FTCN portion in Fig. 6, where Bs and Be represent the start and end points of phase B, respectively. The constraint between partitions P248 and P214 can be restricted as follows, where i = P248, k = s and j = P214:

   π′ij = πij ⊗a (πik ◦ πkj)
        = πij ⊗a (πki⁻¹ ◦ πkj)
        = ⟨0, 0, ∞, ∞⟩⟦(a1, 1)⟧ ⊗a (⟨−209, −199, −100, −90⟩⟦(a1, 0.5), (a4, 0.5)⟧ ◦ ⟨140, 150, 155, 165⟩⟦(a4, 1)⟧)
        = ⟨0, 0, ∞, ∞⟩⟦(a1, 1)⟧ ⊗a ⟨−69, −49, 55, 75⟩⟦(a1, 0.5), (a4, 1)⟧
        = ⟨0, 0, 55, 75⟩⟦(a1, 0.52), (a4, 0.52)⟧

   The consideration is similar to the previous one, since P214 happens in the middle of the I A.C. and P248 is generally dated I A.C. but has to finish before P214 starts. Notice that in this case we have two ownership values for a1, thus we choose the maximum one.
   Finally, as regards phase C, whose corresponding sub-network is reported in Fig. 7, the dating of its partition can determine a restriction of the phase start as follows, by considering i = s, k =
P250 and j = Cs:

   π′ij = πij ⊗a (πik ◦ πkj)
        = πij ⊗a (πik ◦ πjk⁻¹)
        = ⟨512, 512, 753, 753⟩⟦(a1, 1)⟧ ⊗a (⟨601, 650, 700, 700⟩⟦(a2, 1)⟧ ◦ ⟨−241, −241, 0, 0⟩⟦(a1, 1)⟧)
        = ⟨512, 512, 753, 753⟩⟦(a1, 1)⟧ ⊗a ⟨360, 409, 700, 700⟩⟦(a1, 0.5), (a2, 0.5)⟧
        = ⟨512, 512, 700, 700⟩⟦(a1, 0.78), (a2, 0.30)⟧

   Clearly, these are only examples of the derivations that can be obtained by executing the path-consistency algorithm on the overall network and considering all the triangles. However, these examples make clear the utility of applying existing temporal reasoning techniques to archaeological data.

Figure 6: Portion of FTCN related to phase B.

Figure 7: Portion of FTCN related to phase C.

5 CONCLUSION
In this paper we have proposed an extension of a model that stores temporal information about archaeological findings, so that it also manages data provenance during the archaeological interpretation process. In particular, we have extended a set of fuzzy operators in order to represent and infer new knowledge, including the provenance of data and its degree of truth.

ACKNOWLEDGMENTS
This work was partially supported by the Italian National Group for Scientific Computation (GNCS-INDAM) and by “Progetto di Eccellenza” of the Computer Science Dept., Univ. of Verona, Italy.

REFERENCES
[1] 2013. World Wide Web Consortium - PROV-DM: The PROV Data Model. https://www.w3.org/TR/prov-dm/.
[2] J. F. Allen. 1983. Maintaining Knowledge About Temporal Intervals. Communications of the ACM 26, 11 (1983), 832–843.
[3] S. Badaloni, M. Falda, and M. Giacomin. 2004. Integrating Quantitative and Qualitative Fuzzy Temporal Constraints. AI Communications 17, 4 (2004), 187–200.
[4] S. Badaloni and M. Giacomin. 2006. The Algebra IAfuz: A Framework for Qualitative Fuzzy Temporal Reasoning. Artificial Intelligence 170, 10 (2006), 872–908.
[5] A. Belussi and S. Migliorini. 2014. A Framework for Managing Temporal Dimensions in Archaeological Data. In Proceedings of 21st International Symposium on Temporal Representation and Reasoning (TIME). 81–90. https://doi.org/10.1109/TIME.2014.15
[6] A. Belussi and S. Migliorini. 2014. Modeling Time in Archaeological Data: the Verona Case Study. Technical Report RR 93/2014. Department of Computer Science, University of Verona. http://www.di.univr.it/report
[7] A. Belussi and S. Migliorini. 2017. A spatio-temporal framework for managing archeological data. Annals of Mathematics and Artificial Intelligence 80, 3 (Aug 2017), 175–218. https://doi.org/10.1007/s10472-017-9535-0
[8] P. Buneman, S. Khanna, and W. C. Tan. 2001. Why and Where: A Characterization of Data Provenance. In Database Theory - ICDT 2001, 8th International Conference, London, UK, January 4-6, 2001, Proceedings. 316–330. https://doi.org/10.1007/3-540-44503-X_20
[9] P. Buneman and W. C. Tan. 2018. Data Provenance: What next? SIGMOD Record 47, 3 (2018), 5–16. https://doi.org/10.1145/3316416.3316418
[10] R. Dechter, I. Meiri, and J. Pearl. 1991. Temporal Constraint Networks. Artificial Intelligence 49, 1-3 (1991), 61–95.
[11] C. C. Kolb. 2014. Provenance Studies in Archaeology. Springer New York, New York, NY, 6172–6181. https://doi.org/10.1007/978-1-4419-0465-2_327
[12] P. Missier and K. Belhajjame. 2012. A PROV Encoding for Provenance Analysis Using Deductive Rules. In Provenance and Annotation of Data and Processes - 4th International Provenance and Annotation Workshop, IPAW. 67–81. https://doi.org/10.1007/978-3-642-34222-6_6
[13] Lluis V. and Lluis G. 1994. On Fuzzy Temporal Constraint Networks. Mathware and Soft Computing 3 (1994), 315–334.