Promoting Data Provenance Tracking in the Archaeological Interpretation Process

Sara Migliorini, Alberto Belussi, Elisa Quintarelli
Dept. of Computer Science, University of Verona
sara.migliorini@univr.it, alberto.belussi@univr.it, elisa.quintarelli@univr.it

ABSTRACT

In this paper we propose a model and a set of derivation rules for tracking data provenance during the archaeological interpretation process. The interpretation process is the main task performed by an archaeologist who, starting from ground data about evidence and findings, tries to derive knowledge about an ancient object or event. In particular, in this work we concentrate on the dating process used by archaeologists to assign one or more time intervals to a finding in order to define its lifespan on the temporal axis, and we propose a framework to represent such information and infer new knowledge including provenance of data. Archaeological data, and in particular their temporal dimension, are typically vague, since many different interpretations can coexist; thus we use Fuzzy Logic to assign a degree of confidence to values and Fuzzy Temporal Constraint Networks to model relationships between the datings of different findings.

KEYWORDS

Provenance, Temporal Constraints, Information discovery

© 2020 Copyright held by the owner/author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference, March 30-April 2, 2020 on CEUR-WS.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC BY 4.0.

1 INTRODUCTION

Interpretation and knowledge discovery represent a significant amount of the archaeological activity. Such interpretation process is usually based on direct and indirect observations of domain experts (archaeologists), who also consider previous interpretations performed by themselves or other colleagues. Spatial and temporal dimensions are usually of considerable interest for archaeological research, because they allow deriving new important relationships between findings, in particular concerning stratigraphic analysis. A typical example involving such interpretation process is the dating activity. Considering the process through which objects are usually manually dated by archaeologists, some proposals in the literature (e.g. [5, 7]) apply existing automatic techniques for time reasoning, in order to automatically derive new temporal knowledge or validate existing interpretations based on the available spatial and temporal information.

We can observe that archaeological interpretations depend not only on direct observations, but also on past interpretations performed by the same archaeologist or other colleagues. In general, archaeological data, and more specifically the temporal dimension, are typically vague since many different interpretations can coexist; each one has its own degree of confidence and consequently several different global interpretations can be derived from them. Each interpretation is typically identified by its author; moreover, the confidence greatly depends on the archaeologist's reputation in the field. For these reasons, during the interpretation process, it is necessary not only to infer new knowledge but also to track the provenance of the information that has affected the inference. More specifically, it is necessary to keep track of which pieces of information (past interpretations) the current new knowledge originated from, together with their authorship.

In computer science, provenance is the ability to record the history of data and its place of origin, and is useful to determine the chronology of the ownership, custody or location of any object and to provide a critical foundation for assessing authenticity and enabling trust. As highlighted in [9], data provenance is separable from other forms of provenance. In our specific archaeological scenario, the term provenance comes originally from the art world and it has been applied in archaeology and paleontology as well, where it refers to keeping trace of all the steps involved in producing a scientific result, such as a finding, from experiment design through acquisition of raw data, and all the subsequent steps of data selection, analysis and visualization. Such information is necessary for the reproduction of a given result, it can be useful to establish precedence (in case of patents, Nobel prizes, etc.) [11], and it is different from that of provenience.

In recent years there have been different proposals of formal models for provenance storage, maintenance, and querying; PROV is the W3C recommendation for provenance data model and language [1]. Data provenance [8] differs from other forms of meta-data because it is based on relationships among objects. Indeed, the ancestry relationships used in provenance for correlated objects form a directed graph that can be represented through semistructured data models. In [12] the authors have encoded provenance graphs into Datalog and expressed inference rules and constraints with the same declarative language, in order to determine inconsistencies with respect to temporal constraints or provenance information (e.g. inconsistent cycles).

The aim of this paper is to propose a model and a set of derivation rules that are able to track the data provenance during the archaeological interpretation process. More specifically, we concentrate on the dating process used by archaeologists to assign one or more lifespans to a finding. Such process was initially modelled in [5, 7] for checking temporal data consistency and reducing vagueness through the use of Fuzzy Temporal Constraint Networks (FTCN) [4, 13]; here we extend it in order to manage and infer new knowledge including provenance of data and complex inferences.

The remainder of the paper is organized as follows: Sect. 2 provides a formal description of the problem, while Sect. 3 describes the proposed solution; Sect. 4 exemplifies the application of this solution to a real-world case scenario.

2 PROBLEM FORMULATION

This paper refers to the Spatio-Temporal ARchaeological model (Star) presented in [5, 7]. In the Star model three main objects of interest can be recognized: ST_InformationSource, ST_ArchaeoPart and ST_ArchaeoUnit. An ST_ArchaeoUnit is a complex archaeological entity obtained from an interpretation process performed by the responsible officer. Such an interpretation is based on some findings (represented by ST_ArchaeoPart instances) retrieved during an excavation process or a bibliographical analysis (represented by ST_InformationSource instances). Therefore, each ST_ArchaeoUnit is connected to one or more constituent ST_ArchaeoParts, each one representing a single result of an excavation or other investigation processes.

As regards the dating process, we can observe that the dating of an ST_ArchaeoPart instance (when not available from other objective measures) can also be determined from the dating of other correlated instances, or the dating of an overall ST_ArchaeoUnit can be obtained starting from the dating of its constituent partitions. In this paper we extend the model proposed in [5, 7] in order to keep track of the provenance of such information and to provide a measure of the contribution provided by each author of the considered past interpretations.

In the Star model, temporal information regarding an archaeological finding can be quantitative or qualitative: quantitative temporal information is represented by time instants, while qualitative information is temporal information defined using the well-known Allen's interval algebra [2]. Through the use of quantitative and qualitative temporal information it is possible to derive a topological structure composed of a set of related objects. Notice that inside a topological structure, some instants can be realized, namely they have an associated quantitative characterization (i.e., an associated time instant value), while others can be defined only qualitatively by means of relations with other nodes (i.e., represented as dummy nodes connected to other nodes).

Example 2.1. Let us consider four archaeological findings labeled as $f_1$, $f_2$, $f_3$ and $f_4$ which are coarsely dated as follows: $f_1$, $f_2$ have been located in the 19th century by archaeologist $a_1$, while $f_3$ has been dated 1850 by $a_2$ and $f_4$ has been dated 1820 by $a_3$. Besides these geometrical values, the following temporal relations have been detected: $f_1$ before $f_2$ and $f_3$ by $a_4$, while $f_2$ before $f_3$ and after $f_4$ by $a_5$. This knowledge can be represented by the topological complex in Fig. 1. Dates associated with nodes $f_3$ and $f_4$ are realized as the years 1850 and 1820, respectively. Conversely, dates related to nodes $f_1$ and $f_2$ are not realized, but they are located between two dummy nodes representing the years 1800 and 1899. Notice that both nodes and arcs can have an additional label representing the archaeologists that define such quantitative or qualitative temporal information. Given such topological relations, some automatic reasoning techniques can be applied in order to specialize some coarse-grained dates and realize the dummy nodes. For instance, as regards this example, the geometric temporal value associated with $f_2$ can be restricted from 1800-1899 to 1820-1850, and consequently the dating of $f_1$ can be restricted from 1800-1899 to 1800-1820. When considering the provenance propagation, we can observe that the new dating of $f_2$ is determined by archaeologist $a_1$ who generally locates it in the 19th century, but also more specifically by $a_5$ who defines the relations with $f_3$, $f_4$ and by $a_2$ and $a_3$ who give a precise date to $f_3$ and $f_4$. Similar considerations can be made also for the new dating of $f_1$. □

[Figure 1: Example of topological complex representing temporal relations between archaeological partitions. Nodes $f_1$ to $f_4$ and dummy nodes 1800 and 1899; nodes and arcs are labeled with the responsible archaeologists $a_1$ to $a_5$.]

3 PROPOSED SOLUTION

Temporal Constraint Network (TCN) [10] is a formalism for representing temporal knowledge based on metric constraints among pairs of time-points. A TCN can be represented by a directed graph, where each node is associated with a variable and each arc corresponds to the constraint between the connected variables. However, in the archaeological domain, temporal knowledge is generally characterized by a level of vagueness and dates are usually expressed as periods of great confidence together with an additional interval, i.e. the safety interval. For instance, the construction date of a building can be expressed as: between 1830 and 1850 with more confidence, plus or minus 10 years of safety. Fuzzy set theory has been used to model the uncertainty of natural language and is able to handle the concept of partial truth (or degree of truth). In particular, given a fuzzy set $F$, the term support denotes the set of elements with a possibility greater than zero, while the term core denotes the set of elements with a possibility equal to 1. Therefore, a fuzzy representation of time seems to be the most appropriate solution for representing time dimensions in the archaeological context.

A fuzzy temporal constraint network (FTCN) is a generalization of TCN where a degree of possibility is associated with each possible value of a temporal constraint. In particular, a constraint between a pair of time-points represents a possibility distribution over temporal distances [13].

Definition 3.1 (fuzzy temporal constraint). Given two temporal variables $x_i$ and $x_j$, a fuzzy temporal constraint $C_{ij}$ between them is represented as a possibility distribution function $\pi_{ij} : \mathbb{R} \to [0, 1]$ that constrains the possible values for the temporal distance $x_j - x_i$. □

In other words, $\pi_{ij}(d)$ is the possibility degree for the distance $x_j - x_i$ to take the value $d$ under the constraint $C_{ij}$. As done in our previous work [5, 7], this paper considers only trapezoidal distributions, which are sufficiently expressive in practical contexts while being computationally less expensive during the reasoning. They can be represented as a 4-tuple $\langle a, b, c, d \rangle$, where the intervals $[b, c]$ and $[a, d]$ represent the core and the support of the fuzzy set, respectively. Such tuple representation is enriched with a value $\alpha_k$, called degree of consistency, which denotes the height of the trapeze and allows the representation of non-normalized distributions. This is necessary in the general case because, even if the initial knowledge is always represented by a trapeze with unitary height, during the reasoning the conjunction of some constraints can produce trapezes with a height less than one.

Starting from this representation, in this paper we introduce the possibility to specify for each temporal constraint also its provenance (authorship). Moreover, we introduce a modified set of operations on these constraints which allow tracking and updating provenance information during the interpretation process. Given such considerations, the notion of provenance-aware fuzzy temporal constraint (PA-FTCN) can be defined as follows.

Definition 3.2 (provenance-aware fuzzy trapezoidal constraint). Given two variables $x_i$ and $x_j$, a provenance-aware fuzzy trapezoidal temporal constraint $C_{ij} = \{T_1, \ldots, T_m\}$ is a disjunction of trapezoidal distributions $\pi_{T_k}$, each one denoted by a trapeze $T_k = \langle a_k, b_k, c_k, d_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$, where the characteristic 4-tuple is enriched with a degree of consistency $\alpha_k$ representing its height [3] and a set of provenance statements $\Omega = \{(o_1, d_1), \ldots, (o_n, d_n)\}$. Each provenance statement $\omega_i = (o_i, d_i)$ contains a label $o_i$ identifying the data owner and a number $d_i \in [0, 1]$ representing the degree of ownership. □

The components of a trapeze $T_k$ take values as follows: $a_k, b_k \in \mathbb{R} \cup \{-\infty\}$, $c_k, d_k \in \mathbb{R} \cup \{+\infty\}$, $\alpha_k \in [0, 1]$, $\Omega \subseteq A \times [0, 1]$, where $A$ is a set of labels representing known data owners. As mentioned before, the support of $\pi$ is defined as $supp(\pi_{T_k}) = \{x : \pi_{T_k}(x) > 0\} = [a_k, d_k]$, while the core is defined as $core(\pi_{T_k}) = \{x : \pi_{T_k}(x) = \alpha_k\} = [b_k, c_k]$. Moreover, this paper considers only well-formed trapezes: a trapeze $T = \langle a, b, c, d \rangle$ is well-formed if $a \le b \le c \le d$. In the following the set of well-formed trapezes is denoted as $\mathcal{T}$. From this definition several shapes are allowed, as illustrated in Fig. 2.

[Figure 2: Possible shapes of a trapezoidal possibility distribution function: (a) $a < b < c < d$, (b) $a = b < c < d$, (c) $a < b < c = d$, and (d) $a < b = c < d$.]

The semantics of a constraint $C_{ij} = \{T_1, \ldots, T_m\}$ is the possibility distribution function $\pi_{C_{ij}}$ corresponding to the disjunction of the trapezoidal distributions $\pi_{T_k} : \mathbb{R} \to [0, 1]$ for $k = 1, \ldots, m$.

Definition 3.3 (trapezoid possibility distribution function). The possibility distribution function of a generic trapeze $T_k \in \mathcal{T}$ can be written as:

$$\pi_{T_k}(x) = \begin{cases} 0 & \text{if } x < a_k \lor x > d_k \\ \alpha_k \cdot \dfrac{x - a_k}{b_k - a_k} & \text{if } a_k \le x < b_k \\ \alpha_k \cdot \dfrac{d_k - x}{d_k - c_k} & \text{if } c_k < x \le d_k \\ \alpha_k & \text{otherwise} \end{cases}$$
□

Definition 3.4 (solution). Let $P = \langle X, C \rangle$ be a provenance-aware fuzzy temporal constraint network. An n-tuple $S = \{s_1, \ldots, s_n\}$, where $s_i \in \mathbb{R}$, is a possible solution of $P$ at degree $\alpha$ if and only if $deg(S) = \min_{i,j}\{\pi_{C_{ij}}(s_j - s_i)\} = \alpha$, where $\pi_{C_{ij}}$ stands for the possibility distribution associated with the constraint $C_{ij}$ and the degree corresponds to the least satisfied constraint [13]. □

In the case of a PA-FTCN, each solution is characterized by a degree of satisfaction reflecting a trade-off among potentially conflicting constraints, and a set of provenance statements characterizing the ownership of each constraint.

The most widely used algorithm for constraint propagation is the path-consistency algorithm.

Definition 3.5 (path-consistency algorithm). Given three variables $x_i$, $x_k$ and $x_j$ of a PA-FTCN $P$ and a local instantiation $x_i = d_i$, $x_j = d_j$, a new constraint between $x_i$ and $x_j$ can be induced from pre-existing constraints by the path-consistency algorithm as follows: $\pi_{ij} \otimes (\pi_{ik} \circ \pi_{kj})(x)$, where $(\pi_{ik} \circ \pi_{kj})$ is the composition (addition between fuzzy sets) of the constraints between $x_i$ and $x_k$ and between $x_k$ and $x_j$, while $\pi_{ij}$ is the existing constraint between $x_i$ and $x_j$. □

In order to determine the result of the previous definition, it is necessary to define the required operations. More specifically, it is necessary to specialize some operations on fuzzy sets to operations on trapezoids with provenance statements. In particular, the specialization of the inversion ($T_k^{-1}$), composition ($T_1 \circ T_2$), conjunction ($T_1 \otimes_a T_2$) and disjunction ($T_1 \oplus_a T_2$) operations on trapezoidal distributions can be found in [6]. Here we specialize them in order to take care also of the provenance information. In particular, our aim is on one side to propagate provenance labels, and on the other to provide a degree of ownership for each author; thus, we need to define the concept of similarity between two trapezes.

Definition 3.6 (trapeze similarity). Given two trapezes $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket$, the degree of similarity $sim(T_1, T_2) \in [0, 1]$ between them is defined as:

$$sim(T_1, T_2) = \frac{area(T_1 \cap T_2)}{area(T_1 \cup T_2)} \qquad (1)$$

In other words, the similarity is maximum (equal to 1) when the two trapezes coincide, while it is minimum (equal to 0) when the two trapezes are completely disjoint; otherwise it is proportional to the degree of overlap between them. Notice that there can be two cases where the degree of similarity is equal to 0: i) when the intersection is empty, and ii) when the union of the two trapezes generates an infinite trapeze. This second case is possible, for instance, when one of the trapezes represents a qualitative precedence constraint. In order to distinguish these two situations, we use the symbol 0 when the intersection is empty (no similarity at all), and the symbol $\bot$ when the union is infinite (very low similarity).

During the various operations the degree of ownership assigned to each author is computed on the basis of the starting degree of ownership and the similarity between the original constraint and the newly obtained one.

Definition 3.7 (inversion). Given a constraint $C_{ij} = \{T_1, \ldots, T_m\}$ between variables $x_i$ and $x_j$, the constraint $C_{ij}^{-1}$ represents the equivalent constraint holding between $x_j$ and $x_i$. Such constraint can be obtained by inverting each constituent trapezoid $T_k = \langle a_k, b_k, c_k, d_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$ contained in $C_{ij}$, as follows: $T_k^{-1} = \langle -d_k, -c_k, -b_k, -a_k \rangle[\alpha_k]\llbracket\Omega\rrbracket$. □

Notice that in this case the provenance information is not affected by the operation.

Definition 3.8 (composition $\circ$). Given two constraints $C_1$ and $C_2$, the composition of two generic trapezoids $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket \in C_1$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket \in C_2$, assuming that $\alpha_1 \ge \alpha_2$, is defined as: $T_1 \circ T_2 = \langle a_1 + a_2, b_1' + b_2, c_1 + c_2', d_1 + d_2 \rangle[\min\{\alpha_1, \alpha_2\}]\llbracket\Omega_1 \cup \Omega_2\rrbracket$, where $b_1' = a_1 + (\alpha_2/\alpha_1)(b_1 - a_1)$ and $c_2' = d_2 - (\alpha_2/\alpha_1)(d_2 - c_2)$ and

$$\llbracket\Omega_1 \cup \Omega_2\rrbracket = \{(o_i, d_i) \mid (o_i, d_i) \in \Omega_1 \lor (o_i, d_i) \in \Omega_2\} \qquad (2)$$
□

The composition of two constraints produces a bigger trapezoid w.r.t. the source trapezoids; thus, the provenance information is the union of the input ones, with the same degree of ownership.

The conjunction of two generic fuzzy possibility distribution functions $\pi_1$ and $\pi_2$ is defined as: $\forall d \in \mathbb{R} : (\pi_1 \otimes \pi_2)(d) = \min\{\pi_1(d), \pi_2(d)\}$. Unfortunately, this operation cannot be directly applied to trapezoids and is more complex to specialize than composition, because given two generic trapezoids $T_1$ and $T_2$, the function $T_1 \otimes T_2 = \min\{T_1, T_2\}$ is not always a trapeze: Fig. 3.a and Fig. 3.c contain two examples of such situation. Therefore, some sort of approximation of $T_1 \otimes T_2$ has to be defined to obtain a trapeze. For the application context considered by this paper, the following approximation criteria formulated in [3] are appropriate, where $T$ is the result of the approximated conjunction: $core(\pi_T) = core(\pi_{T_1} \otimes \pi_{T_2})$, $h(\pi_T) = h(\pi_{T_1} \otimes \pi_{T_2})$, and $supp(\pi_T) \subseteq supp(\pi_{T_1} \otimes \pi_{T_2})$. In other words, the approximation shall ensure that the core of the obtained distribution is maintained, while the possibility of the support elements outside the core can be slightly modified. This operation is formalized as follows.

[Figure 3: Two examples of approximated conjunction operation $\otimes_a$ between trapezoids: in (a) and (c) the result of the classical conjunction operation between fuzzy possibility distribution functions, and in (b) and (d) the corresponding approximation which produces a trapeze.]

Table 1: Possible intersection between two trapezes and corresponding element of the conjunction result. (The situations are drawn graphically in the original; only the textual part is reported here.)

Situation | Result
$a_2 \in (a_1, b_1)$ | $b' = b_1$ if $\alpha_1 = \alpha_2 \land b_1 > b_2$; $b' = b_1$ if $\alpha_1 < \alpha_2$; $b' = b_2$ otherwise
$d_1 \in (c_2, d_2)$ | $c' = c_1$ if $\alpha_1 = \alpha_2 \land c_1 > c_2$; $c' = c_1$ if $\alpha_1 < \alpha_2$; $c' = c_2$ otherwise
(graphical cases with crossing left edges) | $b'$ is the highlighted intersection point
(graphical cases with crossing right edges) | $c'$ is the highlighted intersection point

Definition 3.9 (conjunction $\otimes_a$). Given two constraints $C_1$ and $C_2$, the conjunction between two trapezoids $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket \in C_1$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket \in C_2$ is defined as follows: $T_1 \otimes_a T_2 = T \in \mathcal{T}^{inf}(T_1, T_2) : \forall T_i \in \mathcal{T}^{inf}(T_1, T_2),\ \pi_{T_i} \le \pi_T$, where $\mathcal{T}^{inf}(T_1, T_2) = \{T \mid \pi_T \le \pi_{T_1} \otimes \pi_{T_2} \land h(\pi_T) = h(\pi_{T_1} \otimes \pi_{T_2})\}$ [3]. The trapezoid $T$ can be computed as follows: $T = \langle \max\{a_1, a_2\}, b', c', \min\{d_1, d_2\} \rangle[\min\{\alpha_1, \alpha_2\}]\llbracket\Omega_1 \cup \Omega_2\rrbracket$, where $b'$ and $c'$ depend on the 8 possible intersections between $T_1$ and $T_2$ illustrated in Table 1 and

$$\llbracket\Omega_1 \cup \Omega_2\rrbracket = \{(o_i, d_i) \mid o_i \in A_1 \cup A_2 \land d_i = sim(T_i, T)\}$$
□

The set $\mathcal{T}^{inf}$ is the set of trapezes that approximate the conjunction from "below"; the result of the conjunction is the greatest trapeze in this set. Some examples of $T_1 \otimes_a T_2$ are illustrated in Fig. 3. In case (d) it is evident that the height of the resulting trapeze can become less than one, hence the degree of consistency $\alpha$ becomes necessary.

As regards the provenance, in this case we keep track of all authors who contribute to the trapeze conjunction, but we update the degree of ownership on the basis of the similarity between the original information and the obtained one. Notice that, when the same author $o_i$ is present in both trapezoids $T_1$ and $T_2$, we compute its degree of ownership as $d_i = \max(sim(T_1, T), sim(T_2, T))$. Moreover, $\max(\bot, sim(T_i, T)) = sim(T_i, T)$.

Finally, the disjunction operation is not required by the path-consistency algorithm, but it can be useful for eliminating redundant trapezes that are accidentally introduced by users or are due to constraint propagation. Thus, it is an operation useful for compressing available information.

The disjunction of two general fuzzy distribution functions $\pi_1$ and $\pi_2$ is defined as $\forall d \in \mathbb{R} : (\pi_1 \oplus \pi_2)(d) = \max\{\pi_1(d), \pi_2(d)\}$. However, like conjunction, disjunction is not closed in the algebra of trapezoids. Therefore, the idea is to compute a tentative trapeze, and then check whether it corresponds to the disjunction of the involved constraints (i.e., it corresponds to one of the two involved trapezes); otherwise the constraints are kept separate.

[Figure 4: Two examples of approximated disjunction operation $\oplus_a$ between trapezoids: in (a) the operation can be performed, while in (b) the operation cannot be performed.]

Definition 3.10 (disjunction $\oplus_a$). Given two constraints $C_1$ and $C_2$, the disjunction between two trapezes $T_1 = \langle a_1, b_1, c_1, d_1 \rangle[\alpha_1]\llbracket\Omega_1\rrbracket \in C_1$ and $T_2 = \langle a_2, b_2, c_2, d_2 \rangle[\alpha_2]\llbracket\Omega_2\rrbracket \in C_2$ is defined as follows [3]: $T_1 \oplus_a T_2 = \langle a, b, c, d \rangle[\max\{\alpha_1, \alpha_2\}]\llbracket\Omega_1 \cup \Omega_2\rrbracket$, where $a = \min\{a_1, a_2\}$; $b = b_1$ if $\alpha_1 > \alpha_2$, $b = b_2$ if $\alpha_2 > \alpha_1$, $b = \min\{b_1, b_2\}$ otherwise; $c = c_1$ if $\alpha_1 > \alpha_2$, $c = c_2$ if $\alpha_2 > \alpha_1$, $c = \min\{c_1, c_2\}$ otherwise; $d = \max\{d_1, d_2\}$; and $\llbracket\Omega_1 \cup \Omega_2\rrbracket = \{(o_i, d_i) \mid o_i \in A_1 \cup A_2 \land d_i = sim(T_i, T)\}$. □

Fig. 4.a illustrates a case where the disjunction is executed, while Fig. 4.b illustrates a case where it cannot be executed. The disjunction has the same behaviour on data provenance as the conjunction.

4 CASE STUDY

This section illustrates an example of reasoning performed on archaeological data that allows the identification of some new temporal and data provenance knowledge. It regards an archaeological object called Porta Borsari, which is an ancient Roman gate in Verona. This object has been modeled as an ST_ArchaeoUnit by author $a_1$, who also identifies and dates three distinct phases in its life:

• Phase A – first foundation as Porta Iovia during the Late Republican Time, which spans from 200 B.C. to 27 B.C.;
• Phase B – reconstruction during the Claudian Time, which spans from 41 A.D. to 54 A.D.;
• Phase C – Teodorician changes during the Middle-Age, which spans from 312 A.D. to 553 A.D.

This information is represented in Fig. 5-7 by using two nodes for each phase $X$: a node $X_s$ denoting the phase start and a node $X_e$ denoting the phase end. An arrow connects $X_s$ with the network start node $s$, while another arrow connects $X_s$ with $X_e$. The labels on these arrows are derived from the phase duration and its relation with the date associated with the start node $s$ (in our example 200 B.C.).

Subsequently, other archaeologists have identified some findings as archaeological partitions belonging to this archaeological unit. Table 2 reports some information about them together with the associated dating. As regards the dating, we assume that the first archaeologist who found an archaeological partition simply assigns it to one of the identified phases, while later the same or other authors restrict such dating as soon as new information becomes available. The author responsible for the identification of the phase membership is reported in column Ph inside round brackets together with the phase name, while the author(s) responsible for the fine-grained dating is (are) reported in column Dating. Notice that, in order not to clutter the notation, we have omitted the original unitary height of the trapeze (namely [1]). Moreover, since the table reports initial information, we assume that when more than one author is present in Tab. 2, the contribution provided by each author is equal, i.e. the reported date is the result of a joint work.

Table 2: Dating of each partition and associated phase.

Archaeo. Partition | Ph | Dating
P208 Foundation and North Tower | A ($a_1$) | $\langle -110, -100, -1, +9 \rangle\llbracket(a_2, 1)\rrbracket$, I B.C. ± 10 years
P263 Structures of eastern facade | A ($a_1$) | $\langle -60, -50, -45, -35 \rangle\llbracket(a_3, 1)\rrbracket$, middle of I B.C. ± 10 years
P214 Front of the external facade | B ($a_1$) | $\langle 35, 45, 50, 60 \rangle\llbracket(a_4, 1)\rrbracket$, middle of I A.D. ± 10 years
P248 External Foundations | B ($a_1$) | $\langle -9, 1, 100, 110 \rangle\llbracket(a_1, 0.5), (a_4, 0.5)\rrbracket$, I A.D. ± 10 years
P275 Internal Foundations | B ($a_1$) | $\langle -10, 1, 50, 100 \rangle\llbracket(a_2, 0.5), (a_3, 0.5)\rrbracket$, middle of I A.D. ± 5 years
P250 Defensive structures | C ($a_1$) | $\langle 401, 450, 500, 500 \rangle\llbracket(a_2, 1)\rrbracket$, second half of V A.D.

Finally, author $a_5$ identifies the following temporal relations between partitions: P208 terminates before P263 starts, and P248 terminates before P214 starts. These precedence relations have to be modeled with an arc $\langle 0, 0, \infty, \infty \rangle\llbracket(a_5, 1)\rrbracket$; however, in order not to clutter the diagram, the figure reports only the author label.

In accordance with the transformation rules of the previous section, the first operation to perform is the definition of a common coordinate reference system. The origin of such system is placed at 200 B.C., since it is the earliest date in the model, while the granularity is the year, since for all dates the minimum granularity is at least a year. In order to simplify the presentation, the resulting network is presented through three portions, each one corresponding to a different phase. The overall network can be obtained by combining the three sub-networks and by adding an edge from phase A to phase B and an edge from phase B to phase C, both labeled with $\langle 0, 0, +\infty, +\infty \rangle\llbracket(a_1, 1)\rrbracket$. These edges represent the precedence relations between phases.

Fig. 5 illustrates the subnetwork related to phase A: node $s$ represents the starting point, nodes $A_s$ and $A_e$ represent the start and end points of the phase respectively, while nodes P263 and P208 represent the dating of the corresponding archaeological partitions. This portion of FTCN allows computing some derived constraints for the nodes based on the declared ones, using the formula in Def. 3.5: $\pi'_{ij}(x) = \pi_{ij} \otimes_a (\pi_{ik} \circ \pi_{kj})(x)$.

[Figure 5: Portion of FTCN related to phase A.]

In particular, a more precise relation can be derived between partition P208 and partition P263, which is initially represented simply as an edge labeled with the constraint $\langle 0, 0, +\infty, +\infty \rangle$. By assuming $i = P208$, $k = s$ and $j = P263$, the following new constraint $\pi'_{ij}$ can be derived between P208 and P263:

$$\begin{aligned}
\pi'_{ij} &= \pi_{ij} \otimes_a (\pi_{ik} \circ \pi_{kj}) \\
&= \pi_{ij} \otimes_a (\pi_{ki}^{-1} \circ \pi_{kj}) \\
&= \langle 0, 0, \infty, \infty \rangle\llbracket(a_1, 1)\rrbracket \otimes_a (\langle -209, -199, -100, -90 \rangle\llbracket(a_2, 1)\rrbracket \circ \langle 140, 150, 155, 165 \rangle\llbracket(a_3, 1)\rrbracket) \\
&= \langle 0, 0, \infty, \infty \rangle\llbracket(a_1, 1)\rrbracket \otimes_a \langle -69, -49, 55, 75 \rangle\llbracket(a_2, 1), (a_3, 1)\rrbracket \\
&= \langle 0, 0, 55, 75 \rangle\llbracket(a_1, \bot), (a_2, 0.52), (a_3, 0.52)\rrbracket
\end{aligned}$$

From this derivation it follows that the distance between P208 and P263 can be from 0 to 75 years, with great possibility until 55. This is consistent with the observation that P208 is located in I B.C., but it shall precede P263 which is located in the middle of I B.C. As regards the authors' ownership, we can observe that all three authors participate in the final result, but with different degrees of ownership. In particular, the final degree of ownership for $a_2$ and $a_3$ is 0.52, computed using Def. 3.6, while $a_1$ is reported with a degree of similarity equal to $\bot$, since the union operator produces a figure with an infinite area.

A similar operation can be performed on the FTCN portion in Fig. 6, where $B_s$ and $B_e$ represent the start and end points of phase B, respectively. The constraint between partitions P248 and P214 can be restricted as follows, where $i = P248$, $k = s$ and $j = P214$:

$$\begin{aligned}
\pi'_{ij} &= \pi_{ij} \otimes_a (\pi_{ik} \circ \pi_{kj}) \\
&= \pi_{ij} \otimes_a (\pi_{ki}^{-1} \circ \pi_{kj}) \\
&= \langle 0, 0, \infty, \infty \rangle\llbracket(a_1, 1)\rrbracket \otimes_a (\langle -209, -199, -100, -90 \rangle\llbracket(a_1, 0.5), (a_4, 0.5)\rrbracket \circ \langle 140, 150, 155, 165 \rangle\llbracket(a_4, 1)\rrbracket) \\
&= \langle 0, 0, \infty, \infty \rangle\llbracket(a_1, 1)\rrbracket \otimes_a \langle -69, -49, 55, 75 \rangle\llbracket(a_1, 0.5), (a_4, 1)\rrbracket \\
&= \langle 0, 0, 55, 75 \rangle\llbracket(a_1, 0.52), (a_4, 0.52)\rrbracket
\end{aligned}$$

[Figure 6: Portion of FTCN related to phase B.]

The consideration is similar to the previous one, since P214 happens in the middle of I A.D. and P248 is generally dated I A.D. but has to finish before P214 starts living. Notice that in this case we have two ownership statements for $a_1$, thus we choose the maximum one.

Finally, as regards phase C, whose corresponding sub-network is reported in Fig. 7, the dating of its partition can determine a restriction of the phase start as follows, by considering $i = s$, $k = P250$ and $j = C_s$:

$$\begin{aligned}
\pi'_{ij} &= \pi_{ij} \otimes_a (\pi_{ik} \circ \pi_{kj}) \\
&= \pi_{ij} \otimes_a (\pi_{ik} \circ \pi_{jk}^{-1}) \\
&= \langle 512, 512, 753, 753 \rangle\llbracket(a_1, 1)\rrbracket \otimes_a (\langle 601, 650, 700, 700 \rangle\llbracket(a_2, 1)\rrbracket \circ \langle -241, -241, 0, 0 \rangle\llbracket(a_1, 1)\rrbracket) \\
&= \langle 512, 512, 753, 753 \rangle\llbracket(a_1, 1)\rrbracket \otimes_a \langle 360, 409, 700, 700 \rangle\llbracket(a_1, 0.5), (a_2, 0.5)\rrbracket \\
&= \langle 512, 512, 700, 700 \rangle\llbracket(a_1, 0.78), (a_2, 0.30)\rrbracket
\end{aligned}$$

[Figure 7: Portion of FTCN related to phase C.]

Clearly, these are only examples of the derivations that can be obtained by executing the path-consistency algorithm on the overall network and considering all the triangles. However, these examples make clear the utility of applying existing temporal reasoning techniques to archaeological data.

5 CONCLUSION

In this paper we have proposed an extension of a model, able to store temporal information about archaeological findings, for managing also the data provenance during the archaeological interpretation process. In particular, we have extended a set of fuzzy operators in order to represent and infer new knowledge including provenance of data and its degree of truth.

ACKNOWLEDGMENTS

This work was partially supported by the Italian National Group for Scientific Computation (GNCS-INDAM) and by "Progetto di Eccellenza" of the Computer Science Dept., Univ. of Verona, Italy.

REFERENCES

[1] 2013. World Wide Web Consortium - PROV-DM: The PROV Data Model. https://www.w3.org/TR/prov-dm/.
[2] J. F. Allen. 1983. Maintaining Knowledge About Temporal Intervals. Communications of the ACM 26, 11 (1983), 832–843.
[3] S. Badaloni, M. Falda, and M. Giacomin. 2004. Integrating Quantitative and Qualitative Fuzzy Temporal Constraints. AI Communications 17, 4 (2004), 187–200.
[4] S. Badaloni and M. Giacomin. 2006. The Algebra IAfuz: A Framework for Qualitative Fuzzy Temporal Reasoning. Artificial Intelligence 170, 10 (2006), 872–908.
[5] A. Belussi and S. Migliorini. 2014. A Framework for Managing Temporal Dimensions in Archaeological Data. In Proceedings of 21st International Symposium on Temporal Representation and Reasoning (TIME). 81–90. https://doi.org/10.1109/TIME.2014.15
[6] A. Belussi and S. Migliorini. 2014. Modeling Time in Archaeological Data: the Verona Case Study. Technical Report RR 93/2014. Department of Computer Science, University of Verona. http://www.di.univr.it/report
[7] A. Belussi and S. Migliorini. 2017. A spatio-temporal framework for managing archeological data. Annals of Mathematics and Artificial Intelligence 80, 3 (Aug 2017), 175–218. https://doi.org/10.1007/s10472-017-9535-0
[8] P. Buneman, S. Khanna, and W. C. Tan. 2001. Why and Where: A Characterization of Data Provenance. In Database Theory - ICDT 2001, 8th International Conference, London, UK, January 4-6, 2001, Proceedings. 316–330. https://doi.org/10.1007/3-540-44503-X_20
[9] P. Buneman and W. C. Tan. 2018. Data Provenance: What next? SIGMOD Record 47, 3 (2018), 5–16. https://doi.org/10.1145/3316416.3316418
[10] R. Dechter, I. Meiri, and J. Pearl. 1991. Temporal Constraint Networks. Artificial Intelligence 49, 1-3 (1991), 61–95.
[11] C. C. Kolb. 2014. Provenance Studies in Archaeology. Springer New York, New York, NY, 6172–6181. https://doi.org/10.1007/978-1-4419-0465-2_327
[12] P. Missier and K. Belhajjame. 2012. A PROV Encoding for Provenance Analysis Using Deductive Rules. In Provenance and Annotation of Data and Processes - 4th International Provenance and Annotation Workshop, IPAW. 67–81. https://doi.org/10.1007/978-3-642-34222-6_6
[13] L. Vila and L. Godo. 1994. On Fuzzy Temporal Constraint Networks. Mathware and Soft Computing 3 (1994), 315–334.
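As an illustration of Definitions 3.2, 3.3 and 3.7, the following minimal Python sketch represents a provenance-aware trapeze and evaluates its possibility distribution and its inversion. The class name `Trapeze`, its field names and its methods are hypothetical choices made for this sketch; they do not come from the Star model or from any existing library. Infinite left bounds ($a = b = -\infty$) are supported only in the sense that they fall into the "otherwise" branch of Definition 3.3.

```python
from dataclasses import dataclass, field
from math import inf


@dataclass
class Trapeze:
    """Trapeze <a, b, c, d>[alpha] with provenance statements (Definition 3.2)."""
    a: float
    b: float
    c: float
    d: float
    alpha: float = 1.0                         # degree of consistency (height)
    omega: dict = field(default_factory=dict)  # owner label -> degree of ownership

    def is_well_formed(self) -> bool:
        # Well-formedness condition a <= b <= c <= d.
        return self.a <= self.b <= self.c <= self.d

    def pi(self, x: float) -> float:
        """Possibility of the distance x, following the cases of Definition 3.3."""
        if x < self.a or x > self.d:
            return 0.0
        if self.a <= x < self.b:
            return self.alpha * (x - self.a) / (self.b - self.a)
        if self.c < x <= self.d:
            return self.alpha * (self.d - x) / (self.d - self.c)
        return self.alpha

    def support(self) -> tuple:
        return (self.a, self.d)

    def core(self) -> tuple:
        return (self.b, self.c)

    def inverse(self) -> "Trapeze":
        """Inversion of Definition 3.7: provenance and height are unchanged."""
        return Trapeze(-self.d, -self.c, -self.b, -self.a,
                       self.alpha, dict(self.omega))


# Dating of P214 from Table 2 and a qualitative precedence arc.
p214 = Trapeze(35, 45, 50, 60, omega={"a4": 1.0})
before = Trapeze(0, 0, inf, inf, omega={"a5": 1.0})

# Edge s -> P208 in the common reference system (Table 2 shifted by 200 years).
s_to_p208 = Trapeze(90, 100, 199, 209, omega={"a2": 1.0})
p208_to_s = s_to_p208.inverse()
```

Here `p208_to_s` has the 4-tuple ⟨−209, −199, −100, −90⟩ that appears as the first operand of the composition in the phase A derivation of Sect. 4.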
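The composition of Definition 3.8 can be sketched in the same spirit. The code below transcribes the formula as it is printed in the definition (including the terms $b_1'$ and $c_2'$) and takes the plain union of the provenance statements, as in Eq. (2). The names `Trapeze`, `inverse` and `compose` are hypothetical, heights are assumed to be strictly positive, and infinite bounds are not handled. The example reproduces the composition step of the phase A derivation in Sect. 4, where both heights are 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapeze:
    a: float
    b: float
    c: float
    d: float
    alpha: float = 1.0
    omega: frozenset = frozenset()  # set of (owner, degree) provenance statements


def inverse(t: Trapeze) -> Trapeze:
    # Definition 3.7: mirror the 4-tuple, keep height and provenance.
    return Trapeze(-t.d, -t.c, -t.b, -t.a, t.alpha, t.omega)


def compose(t1: Trapeze, t2: Trapeze) -> Trapeze:
    """Composition as printed in Definition 3.8 (sketch, finite bounds only)."""
    # The definition assumes alpha1 >= alpha2; swap the operands otherwise.
    if t1.alpha < t2.alpha:
        t1, t2 = t2, t1
    ratio = t2.alpha / t1.alpha
    b1_prime = t1.a + ratio * (t1.b - t1.a)
    c2_prime = t2.d - ratio * (t2.d - t2.c)
    return Trapeze(
        t1.a + t2.a,
        b1_prime + t2.b,
        t1.c + c2_prime,
        t1.d + t2.d,
        min(t1.alpha, t2.alpha),
        t1.omega | t2.omega,  # Eq. (2): union with unchanged degrees
    )


# Edges s -> P208 and s -> P263 in the reference system with origin 200 B.C.
s_to_p208 = Trapeze(90, 100, 199, 209, 1.0, frozenset({("a2", 1.0)}))
s_to_p263 = Trapeze(140, 150, 155, 165, 1.0, frozenset({("a3", 1.0)}))

# Path P208 -> s -> P263.
p208_to_p263 = compose(inverse(s_to_p208), s_to_p263)
print(p208_to_p263)
```

With these inputs the resulting 4-tuple is ⟨−69, −49, 55, 75⟩ with provenance {($a_2$, 1), ($a_3$, 1)}, matching the intermediate result reported in the case study.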
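The similarity of Definition 3.6 can also be checked numerically. The sketch below assumes that the areas in Eq. (1) are the integrals of the pointwise minimum and maximum of the two possibility distributions; the text does not spell this out, so it is an interpretation. It further assumes that an empty intersection is tested before an infinite union, and it uses Python's `None` as a stand-in for the symbol ⊥. All names (`Trapeze`, `similarity`, `BOTTOM`) are hypothetical.

```python
import math
from dataclasses import dataclass

BOTTOM = None  # stands for the symbol ⊥ (infinite union, very low similarity)


@dataclass(frozen=True)
class Trapeze:
    a: float
    b: float
    c: float
    d: float
    alpha: float = 1.0

    def pi(self, x: float) -> float:
        # Cases of Definition 3.3.
        if x < self.a or x > self.d:
            return 0.0
        if self.a <= x < self.b:
            return self.alpha * (x - self.a) / (self.b - self.a)
        if self.c < x <= self.d:
            return self.alpha * (self.d - x) / (self.d - self.c)
        return self.alpha


def _areas(t1: Trapeze, t2: Trapeze) -> tuple:
    """Exact areas under min and max of two finite trapezes."""
    xs = sorted({t1.a, t1.b, t1.c, t1.d, t2.a, t2.b, t2.c, t2.d})
    area_min = area_max = 0.0
    for lo, hi in zip(xs, xs[1:]):
        w = hi - lo
        # Both distributions are linear on the open interval (lo, hi):
        # sample two interior points and extrapolate the one-sided limits.
        p, q = lo + w / 3, lo + 2 * w / 3
        f1, f2, g1, g2 = t1.pi(p), t1.pi(q), t2.pi(p), t2.pi(q)
        fl, fr = 2 * f1 - f2, 2 * f2 - f1
        gl, gr = 2 * g1 - g2, 2 * g2 - g1
        total = (fl + fr + gl + gr) / 2 * w          # integral of f + g
        hl, hr = fl - gl, fr - gr
        if hl * hr >= 0:                              # no crossing inside
            diff = abs(hl + hr) / 2 * w               # integral of |f - g|
        else:                                         # one crossing inside
            diff = (hl * hl + hr * hr) / (2 * (abs(hl) + abs(hr))) * w
        area_min += (total - diff) / 2
        area_max += (total + diff) / 2
    return area_min, area_max


def similarity(t1: Trapeze, t2: Trapeze):
    if max(t1.a, t2.a) > min(t1.d, t2.d):
        return 0.0                                    # empty intersection
    if any(math.isinf(v) for v in (t1.a, t1.d, t2.a, t2.d)):
        return BOTTOM                                 # infinite union
    area_min, area_max = _areas(t1, t2)
    return area_min / area_max if area_max > 0 else 0.0


# Phase A of the case study: composed constraint versus final result.
sim_a = similarity(Trapeze(-69, -49, 55, 75), Trapeze(0, 0, 55, 75))
# Phase C of the case study: original phase-start constraint versus final result.
sim_c = similarity(Trapeze(512, 512, 753, 753), Trapeze(512, 512, 700, 700))
print(round(sim_a, 2), round(sim_c, 2))
```

Under these assumptions the two values round to 0.52 and 0.78, the degrees reported in Sect. 4 for $a_2$, $a_3$ (phase A) and for $a_1$ (phase C), while comparing any finite trapeze with the precedence arc ⟨0, 0, ∞, ∞⟩ yields ⊥.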