Introducing VISU: Vagueness, Incompleteness, Subjectivity, and Uncertainty in Art Provenance Data

Fabio Mariani
Institute of Philosophy and Art History, Leuphana University Lüneburg, Universitätsallee 1, C5.418, 21335 Lüneburg, Germany

Abstract
The acronym VISU refers to Vagueness, Incompleteness, Subjectivity, and Uncertainty found in provenance records, which document the history of ownership and socio-economic custody changes of an object. VISU information represents the intellectual effort of researchers and its limits in reconstructing historical events from archival sources. Although provenance has mainly been used in the past to assess an object’s artistic and economic value, it has recently become crucial information from an ethical and legal viewpoint. In light of this, there is a growing interest in structuring provenance information in a machine-readable format and making this data openly accessible to anyone, e.g., by publishing provenance data as linked open data. However, with the impetus to publish provenance linked open data, we risk losing or simplifying VISU information. After describing VISU information and analysing current community standards, this article illustrates how to represent such information in publishing provenance linked open data.

Keywords
Provenance, Linked Open Data, CIDOC CRM, Linked Art, Nanopublication

1. Introduction

Provenance records document chains of events of ownership and socio-economic custody changes of an object. These records contain historical information that answers the question: from where did it come? This article focuses on the provenances of objects with artistic or cultural value held by a gallery, library, archive, or museum (GLAM). In the art market, documenting provenance has been a means of establishing the value of artworks since the eighteenth century [1].
For example, if a well-known and highly respected collector owned an object, then they would contribute to its supposed authenticity and aesthetic value, determining its economic value [2].

COMHUM 2022: Workshop on Computational Methods in the Humanities, June 09–10, 2022, Lausanne, Switzerland
fabio.mariani@leuphana.de (F. Mariani)
ORCID 0000-0002-7382-0187 (F. Mariani)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073, pages 63–84

By the late twentieth century, however, provenance’s moral, ethical, and legal entanglements became a subject of scrutiny and debate. As a consequence of colonialism, totalitarian regimes and two world wars, many objects improperly changed hands due to seizures, confiscations, and looting. For this reason, documenting and establishing the life story of an object has become crucial in establishing its rightful owner. The 1998 Washington Conference on Holocaust-Era Assets foregrounded the importance of provenance research to find and return art and cultural property confiscated by the Nazi regime [3]. At the conference, the 44 participating governments and 13 non-governmental organisations agreed on eleven non-binding principles (“The Washington Conference Principles on Nazi-Confiscated Art”) resolving disputes over Nazi-looted art through the study of provenance. As a result of these principles, provenance research has become more professionalised, acquiring interdisciplinary characteristics. In fact, it has become something of an academic field in its own right [4]. The increased importance of provenance from not only an economic perspective, but also an ethical and legal one, has put a spotlight on the responsibility of institutions.
Indeed, the accountability and transparency to which GLAM institutions are being held also depend on researching and publishing the provenance records of the objects for which they are responsible. However, recording provenance is a complex process and requires a considerable investment of resources. On the one hand, careful research of sources is necessary to reconstruct the history of an object. On the other hand, this effort requires consideration in curating and publishing any information obtained. Moreover, the efforts of a single institution must be coordinated with other stakeholders in the GLAM domain. Recently, digital tools and methodologies have opened up new possibilities to assist the curation, publishing, and analysis of provenance data. In particular, the publication of provenance linked open data promises unprecedented levels of standardisation, enabling researchers to analyse the context of object histories in their cross-institutional complexity [5, 6, 7]. Considering the benefits of provenance linked open data, it is crucial to identify and address its related risks and challenges. The exclusion or simplification of historical complexity could reduce the quality of information, which could, in turn, cause harm when considering the ethical and legal implications of provenance. It is no coincidence that among the principles that emerged from the 1998 Washington Conference, it is advised that “consideration should be given to unavoidable gaps or ambiguities in the provenance…” [3]. In this article, we aim to categorise such “unavoidable gaps or ambiguities in the provenance” as they are likely to be compromised in publishing provenance linked open data. Indeed, recording provenance requires considerable intellectual effort in interpreting sources and formulating hypotheses about an object’s history. Such a hermeneutic process is prone to produce Vagueness, Incompleteness, Subjectivity, and Uncertainty (VISU).
In publishing provenance linked open data, it is, therefore, critical to maintain the integrity of the intellectual process, with its hypothetical statements and its dealing with gaps in knowledge. Given provenance’s complexity, this article, in addition to identifying and classifying VISU information, introduces implementation solutions to represent it as linked open data. These solutions comply with current data publishing standards in the cultural heritage domain.

2. Vague, Incomplete, Subjective, and Uncertain Information

The growing requirement for institutions to be more transparent and accountable has prompted them to publish information about the provenance of objects in their collections. Currently, the provenance of an object is recorded manually as textual metadata through collection management software. Although there is not yet a shared standard for transcribing this information, the American Alliance of Museums (AAM) has drafted guidelines for compiling provenance texts [8]. To give an example, below is the provenance text of a painting by André Derain from 1910 titled “Cagnes”, which is published on the Art Institute of Chicago website and has been compiled according to the AAM guidelines:

    Galerie Kahnweiler, Paris, probably acquired directly from the artist. Louis Lion & Co., New York, by Feb. 1957 [verso inscription; this and the following according to letter from Knoedler and Co., Apr. 8, 1975, copy in curatorial file]; sold to Knoedler & Co., New York, Feb. 1957; sold to the Art Institute of Chicago, 1960.¹

According to the AAM guidelines, provenance editors should list events in chronological order, from the object’s creation to the acquisition by its current owner.² An event represents a change of ownership, or custody, of the object from one party to another. Each event consists of the acquisition method, location, date, names of the parties, and their related biographical information.
Punctuation separating events has a specific meaning: a semicolon implies that the transaction from one party to another was direct; a period indicates a gap in the reconstruction of the events. For example, the period at the end of the first recorded event listed above, when Galerie Kahnweiler received the object, indicates a gap in the provenance record of the painting “Cagnes”. This means, therefore, that it is unknown how the painting passed from Galerie Kahnweiler to Louis Lion & Co., its next recorded owner. Potentially, there could have been other owners of the object that have yet to be identified. When there is no sufficient certainty about an event, the AAM guidelines suggest using the terms “probably” and “possibly”, depending on the level of uncertainty. In analysing the provenance text of the painting “Cagnes”, we can see that the authors were not certain about the first recorded event, and therefore used the phrase “probably acquired directly from the artist”. Finally, notes can provide additional information regarding the provenance. In the above example, the Art Institute of Chicago uses notes in square brackets. Notes in compiling a provenance text are necessary since the chronology of events results from careful research of disparate archival sources, such as inventories, letters, and even photographs. Indeed, sometimes a provenance expert can find a source for reconstructing an event on the object itself. For example, we know that Louis Lion & Co. owned Derain’s artwork through an inscription on the back of the painting (“verso inscription”). From what has been discussed, it is clear that reconstructing ownership histories is not a straightforward process since it requires intellectual and critical effort in analysing the available historical sources and formulating hypotheses. Moreover, sources are not always available to reconstruct events, and some information may not be immediately evident. 
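The punctuation and qualifier conventions described above lend themselves to a mechanical reading. The following Python sketch is illustrative only: the names `ProvenanceEvent`, `qualifier`, and `split_events` are hypothetical, and the splitting heuristics (text in square brackets is treated as a note, and a period counts as a gap marker only when whitespace and a capital letter follow it) are assumptions of this example, not part of the AAM guidelines or of any existing tool.

```python
import re
from dataclasses import dataclass


@dataclass
class ProvenanceEvent:
    text: str        # event text, including any bracketed note
    certainty: str   # "certain", "probably", or "possibly"
    gap_after: bool  # True if a period (gap) separates it from the next event


def qualifier(text: str) -> str:
    """Return the uncertainty qualifier found in the event text, if any."""
    match = re.search(r"\b(probably|possibly)\b", text, re.IGNORECASE)
    return match.group(1).lower() if match else "certain"


def split_events(provenance: str) -> list[ProvenanceEvent]:
    """Split an AAM-style provenance text into events (heuristic sketch)."""
    events = []
    depth = 0   # nesting level of square-bracket notes
    start = 0   # index where the current event begins
    for i, ch in enumerate(provenance):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif depth == 0 and ch in ";.":
            rest = provenance[i + 1:]
            is_last = rest.strip() == ""
            # Assumed heuristic: a period such as in "Co.," or "Feb. 1957" is an
            # abbreviation unless whitespace and a capital letter follow it.
            if ch == "." and not is_last and not re.match(r"\s+[A-Z]", rest):
                continue
            text = provenance[start:i].strip()
            if text:
                gap = ch == "." and not is_last
                events.append(ProvenanceEvent(text, qualifier(text), gap))
            start = i + 1
    tail = provenance[start:].strip()
    if tail:
        events.append(ProvenanceEvent(tail, qualifier(tail), False))
    return events


CAGNES = (
    "Galerie Kahnweiler, Paris, probably acquired directly from the artist. "
    "Louis Lion & Co., New York, by Feb. 1957 [verso inscription; this and the "
    "following according to letter from Knoedler and Co., Apr. 8, 1975, copy in "
    "curatorial file]; sold to Knoedler & Co., New York, Feb. 1957; "
    "sold to the Art Institute of Chicago, 1960."
)

for event in split_events(CAGNES):
    print(event.certainty, event.gap_after, event.text[:40])
```

Applied to the “Cagnes” text, the sketch yields four events, marks the first as “probably”, and flags a gap only between Galerie Kahnweiler and Louis Lion & Co. Real provenance texts would require a more robust treatment of abbreviations and notes.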
We have classified these phenomena into four categories: Vagueness, Incompleteness, Subjectivity, and Uncertainty. We have gathered them under the acronym VISU, from the Latin de visu, meaning with your own eyes. Vagueness refers to information that is given with certainty but in an approximate way. An approximation can occur when describing spatial information (e.g., near Paris) or temporal information (e.g., circa 1945). In either case, the vagueness of the information does not affect the certainty of the event. Incompleteness refers to a lack of information in the reconstruction of an object’s provenance. In this case, provenance experts may not have formulated any hypotheses yet to address the missing information. Subjectivity concerns the expert’s interpretive context when reconstructing an object’s provenance, that is, how they formulated hypotheses through source analysis and deduction. Moreover, different assumptions may contradict each other. Finally, uncertainty refers to the level of confidence with which a provenance expert has expressed a hypothesis, using terms such as “possibly” or “probably”. Unlike vagueness, uncertainty questions the very occurrence of a given event.

¹ https://www.artic.edu/artworks/12402/cagnes (accessed 2023-08-11).
² Usually, the creation event is omitted in the provenance text as it is recorded in other appropriate metadata fields, such as author, date, and place of creation.

The categories of what we define as VISU have already been a topic of interdisciplinary debate, from philosophy and mathematics to, more recently, computer science [9]. In Smithson’s taxonomy of ignorance, for example, the concept of uncertainty represents a generic term that, in turn, can be divided into more specific concepts such as vagueness and probability [10]. The latter is closer to our definition of uncertainty.
In contrast, Smets distinguishes more sharply between uncertainty and imprecision in providing a taxonomy of imperfection [11]. Smets’ imprecision can be compared to the vagueness of VISU information. At least lexically speaking, a classification close to that of VISU is provided by Nagypál and Motik [12]. Here, the categories of uncertainty, subjectivity, and vagueness are defined in relation to expressing temporal knowledge. However, the meaning given to each term is different from that intended in VISU. In fact, according to their classifications, uncertainty (e.g., circa 1918), subjectivity (e.g., the dating of the Russian Revolution), and vagueness (e.g., in February 1918) are all ascribable to the concept of vagueness in VISU. In analysing uncertainty in the digital humanities domain, Piotrowski recognises the conflict of interpretations between scholars as an additional aspect of dealing with “uncertain, vague, incomplete, or missing information” [9]. In doing so, Piotrowski partially anticipates the classification we propose with the acronym VISU, since the conflict of interpretations is one aspect of what we define as subjectivity.

3. Provenance Linked Open Data

As previously discussed, institutions currently create and share provenance records in text format. Although provenance texts are stored and published online digitally via collection management systems, the text format limits the use of provenance as a research and study tool. Indeed, it is currently impossible to use provenance data to perform large-scale analyses across multiple institutions through the application of, for example, digital methods such as big data queries, network analysis, and spatial analysis [13]. These limitations can be attributable, on the one hand, to the fact that textual information is not machine-readable and, on the other hand, to the fact that it is not published according to FAIR principles [14].
Indeed, as it stands, provenance information, which is siloed as text in collection databases of institutions, is not findable, accessible, interoperable, or reusable. For these reasons, publishing provenance linked open data (LOD) has recently emerged as a promising possibility to address the standardization of provenance information produced by institutions in a machine-readable format compliant with FAIR principles [5]. Moreover, LOD respects the open data principles: that is, provenance LOD can be used by anyone for any purpose.³ Provenance data should be published as open data, not only because it involves historical facts but also because of the significance of provenance for institutional accountability and transparency.

³ https://opendefinition.org/ (accessed 2023-08-11).

A significant early experiment in publishing an institution’s provenance records as LOD was carried out within the Art Tracks project, an initiative of the Carnegie Museum of Art (CMOA), which took place from 2014 to 2017 [15]. In particular, Art Tracks implemented the CMOA Digital Provenance Standard for modelling provenance LOD following the CIDOC CRM schema, the international standard for exchanging digital information regarding cultural heritage (ISO 21127).⁴ The schema of CIDOC CRM is event-based since its semantic structure has temporal entities (crm:E2_Temporal_Entity) as its core [16]. A temporal entity, such as an event (crm:E5_Event), can link to time (crm:E52_Time-Span), space (crm:E53_Place), or event actors (crm:E39_Actor). However, the centrality of the temporal entity means that an actor, such as a person (crm:E21_Person), cannot link directly to a time or place. For example, CIDOC CRM does not express an individual’s birth date as a person’s attribute, but rather as a specific event, birth (crm:E67_Birth), linked to a time and involving that person. In turn, the birth event can be linked to the location of the event.
In order to make CIDOC CRM modelling more accessible to institutional practitioners, Linked Art, a community of cultural heritage institutions, developed a CIDOC CRM application profile.⁵ In addition to CIDOC CRM, the Linked Art Data Model integrates the Getty’s controlled vocabularies, such as the Art and Architecture Thesaurus (AAT), to identify domain-specific terms via URI.⁶ The integration of CIDOC CRM and Getty vocabularies, combined with the support of a large and active community behind the Linked Art Data Model, makes this application profile an ideal candidate for the standardisation of publishing provenance LOD. Indeed, modelling provenance LOD is one of the aspects that the Linked Art Data Model covers in detail. According to Linked Art, a provenance record structured as LOD is a succession of provenance events (or activities), structured in CIDOC CRM as crm:E7_Activity. An activity can itself consist of multiple activities (sub-activities), expressing more complex events. Linked Art provides a pattern for defining the characteristics of events: the object(s) involved, the actors participating, the location, and the time. Examples of how to structure the data are given depending on the different types of activities. For example, an activity describing the purchase of an object may contain two sub-activities. The first sub-activity consists of the acquisition of the object given by the seller and received by the buyer, while the second constitutes the payment made by the buyer to the seller. Similarly, exchanging two objects involves two sub-activities, each describing the respective ownership change.

⁴ CIDOC CRM (version 7.2) is the Conceptual Reference Model (CRM) implemented by the International Committee for Documentation (CIDOC) of the International Council of Museums (https://www.cidoc-crm.org/, accessed 2023-08-11).
⁵ https://linked.art/model/ (accessed 2023-08-11).
⁶ https://www.getty.edu/research/tools/vocabularies/aat/ (accessed 2023-08-11).
    @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Placeholder subject IRI (hypothetical).
    <https://example.org/activity/purchase-1960>
        a crm:E7_Activity ;
        rdfs:label "Purchased by the Art Institute of Chicago from Knoedler & Co. in 1960" ;
        crm:P2_has_type <http://vocab.getty.edu/aat/300055863> ;  # AAT "provenance"
        crm:P2_has_type <http://vocab.getty.edu/aat/300417642> ;  # AAT "purchase"
        crm:P4_has_time-span [
            a crm:E52_Time-Span ;
            crm:P82a_begin_of_the_begin "1960-01-01T00:00:00Z" ;
            crm:P82b_end_of_the_end "1960-12-31T23:59:59Z"
        ] ;
        crm:P9_consists_of [
            a crm:E8_Acquisition ;
            crm:P22_transferred_title_to [ a crm:E74_Group ; rdfs:label "The Art Institute of Chicago" ] ;
            crm:P23_transferred_title_from [ a crm:E74_Group ; rdfs:label "Knoedler & Co." ] ;
            crm:P24_transferred_title_of [ a crm:E22_Human-Made_Object ; rdfs:label "Cagnes" ]
        ] .

Listing 1: RDF description, serialized in Turtle format, of the purchase of the painting “Cagnes” by the Art Institute of Chicago from Knoedler & Co. in 1960.

Listing 1 shows the RDF description of the last provenance event of André Derain’s painting “Cagnes”: the purchase of the artwork by the Art Institute of Chicago from Knoedler & Co. in 1960. RDF, Resource Description Framework, is a World Wide Web Consortium standard for information exchange as LOD. The activity (crm:E7_Activity) is classifiable according to AAT vocabulary as “provenance” (aat:300055863) and “purchase” (aat:300417642). Moreover, the activity took place in 1960, a time span expressed through its time limits: the begin of the begin “1960-01-01T00:00:00Z” (the minimum possible date) and the end of the end “1960-12-31T23:59:59Z” (the maximum possible date). The activity has a sub-activity (crm:E8_Acquisition), which describes the acquisition of the painting “Cagnes” by the Art Institute of Chicago from Knoedler & Co. In making CIDOC CRM modelling usable to practitioners, Linked Art deliberately leaves out some aspects that would complicate the accessibility of the data model, such as uncertainty and data provenance. However, this choice compromises the integrity of VISU information when modelling provenance LOD.
As discussed in the previous section, VISU information is based on the intellectual work of provenance experts, who research and record provenance. Moreover, with VISU information, historical debate and hypothesis-making become critical to achieving the most scientifically accurate reconstruction of an object’s history. Forgoing VISU information thus not only compromises the integrity of the data but also prevents debate, thereby reducing its usefulness for research. This phenomenon, also referred to as the “lure of objectivity”, is one of the major challenges in digital humanities [17]. We, therefore, intend to safeguard the complexity of VISU information by making it machine-readable according to LOD standards and compatible with the Linked Art Data Model. The following sections describe the challenges, opportunities, and solutions in dealing with VISU information as LOD.

4. Vagueness

By introducing VISU information, we have established a clear distinction between the concepts of vagueness and uncertainty that previous scholarship, as noted, has not made consistently. Vagueness indicates the approximation of a datum. Approximating a datum per se does not compromise the statement’s certainty. For example, to say that an event occurred near Paris is to approximate the geographical location of a temporal entity. The fact is not called into question. Similarly, the existence of an event that occurred circa 1945 is not questioned by the temporal approximation of the date. Since vagueness concerns spatial and temporal information, it depends on the measures and language used by historical sources. Indeed, whereas technology allows us to calculate space and time with utmost precision, human language can hardly replicate its accuracy.
Compare, for example, the limitations of language in traditional art market information, such as the inventory of an art dealer, with the measures of modern digital data, such as an online auction house database. In the written inventory of an art dealer, an event date can achieve maximum accuracy by expressing the year, month, and day of the event. Usually, however, vague reference systems such as months or seasons are used. Seldom does an author of a source go into details such as the exact hour of an event. In contrast, an online auction house database can capture the moment of purchase to a thousandth of a second. Similarly, whereas human language cannot go beyond the precision of an address to indicate spatial information, technology allows us to pinpoint the geographical coordinates of a place with greater accuracy. In addition to measuring instruments and human language, an approximation can result from a lack of information. For example, in the provenance text of the painting “Cagnes”, the second provenance event states that Louis Lion & Co. owned the work “by Feb. 1957”. The author of the provenance record used this expression because they had no sources to establish when Louis Lion & Co. received the object precisely. We do not even know who the previous owner was, expressed using a period that signifies a gap in the painting’s provenance text. What the provenance expert can establish from the historical information available, however, is that Louis Lion & Co. had the object in February 1957 since the sources show that they sold the work to Knoedler & Co. in that month. Thus, we can assert that the acquisition of the painting by Louis Lion & Co. took place between 1910, the previous known date and thus the lower limit of the possible time interval, and 28 February 1957, the last day of the month in which Knoedler & Co. acquired the object. 
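The reasoning behind the interval for “by Feb. 1957” can be made explicit with a short computation. The following Python sketch is a minimal illustration; the function names `acquisition_window` and `to_crm_literal` are hypothetical, and the choice of 1 January of the creation year as the lower bound and of the last second of the stated month as the upper bound is an assumption that mirrors the convention of the listings in this article.

```python
import calendar
from datetime import datetime


def acquisition_window(earliest_year: int, by_year: int, by_month: int):
    """Return (lower, upper) bounds for an event that happened 'by' a given month.

    lower: first second of the earliest known year (e.g., the creation year).
    upper: last second of the month named in the 'by' expression.
    """
    lower = datetime(earliest_year, 1, 1, 0, 0, 0)
    last_day = calendar.monthrange(by_year, by_month)[1]  # handles leap years
    upper = datetime(by_year, by_month, last_day, 23, 59, 59)
    return lower, upper


def to_crm_literal(moment: datetime) -> str:
    """Format a bound in the style used for the time-span literals."""
    return moment.isoformat() + "Z"


# "by Feb. 1957" for a painting created in 1910
begin, end = acquisition_window(1910, 1957, 2)
print(to_crm_literal(begin))  # 1910-01-01T00:00:00Z
print(to_crm_literal(end))    # 1957-02-28T23:59:59Z
```

The two values correspond to the lower and upper limits of the possible time interval discussed above.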
Experts can formulate subjective hypotheses with different degrees of uncertainty based on a vague time expression such as “by Feb. 1957”. For example, according to the available information, it is possible, although very unlikely, that Louis Lion & Co. acquired the object on 28 February 1957 and sold it to Knoedler & Co. on the same day. In CIDOC CRM and Linked Art, one can already model some vague information. This ensures that information is not falsified when publishing provenance LOD, which runs the risk of making vague information seemingly precise. It also opens up possibilities for data analysis and visualization that include this layer of complexity. Concerning the approximation of spatial data, CIDOC CRM introduces the property crm:P189_approximates. Using this property makes it possible to establish an approximation relation between two places. For example, in Listing 2, we see how the place “Paris”, defined as a point in space, approximates the expression “near Paris”. In this way, we preserve the vagueness of the information on the one hand and, on the other hand, model a point in space that, albeit approximate, allows us to query the geospatial datum and visualize it on a map. The Linked Art Data Model already includes this modelling solution.

    @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Placeholder subject IRI (hypothetical).
    <https://example.org/place/paris>
        a crm:E53_Place ;
        rdfs:label "Paris" ;
        crm:P168_place_is_defined_by "POINT(2.2769957 48.8589466)" ;
        crm:P189_approximates [
            a crm:E53_Place ;
            rdfs:label "near Paris"
        ] .

Listing 2: RDF description, serialized in Turtle format, of the “near Paris” approximation.

As far as temporal information is concerned, as we have already seen when introducing CIDOC CRM and Linked Art in Listing 1, it is represented as a time span. Thanks to the properties crm:P82a_begin_of_the_begin and crm:P82b_end_of_the_end, this type of modelling makes it possible to model several vague chronological pieces of information [18].
For example, Listing 3 shows the modelling of the time span in which Louis Lion & Co. acquired the painting “Cagnes”. As previously discussed, the activity occurred sometime between 1910 (begin of the begin) and February 1957 (end of the end). In addition, this approach allows for modelling other approximate expressions in which an event occurred, such as months, seasons, years, decades, centuries, and millennia.

    @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Placeholder subject IRI (hypothetical).
    <https://example.org/time-span/1910-1957-02>
        a crm:E52_Time-Span ;
        rdfs:label "between 1910 and February 1957" ;
        crm:P82a_begin_of_the_begin "1910-01-01T00:00:00Z" ;
        crm:P82b_end_of_the_end "1957-02-28T23:59:59Z" .

Listing 3: RDF description, serialized in Turtle format, of the time span between 1910 and February 1957.

Although CIDOC CRM allows us to model temporal information as a time span, it does not allow the representation of an approximation that occurs around a date, such as the expression “circa 1945”. However, it is possible to integrate the CRMgeo module to overcome this limitation. This extension of CIDOC CRM, dedicated to a more complex representation of spatiotemporal data, introduces the property crmgeo:Q13_approximates [19]. Like crm:P189_approximates for places, the crmgeo:Q13_approximates property establishes an approximation relation between two time spans [20]. As an example, Listing 4 describes how the time span “1945” (with a begin of the begin as 1 January 1945 and an end of the end as 31 December 1945) approximates the vague time span “circa 1945”. Therefore, we believe this solution, similar to the one adopted for spatial approximation, can be integrated into the Linked Art Data Model.

    @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
    @prefix crmgeo: <http://www.ics.forth.gr/isl/CRMgeo/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Placeholder subject IRI (hypothetical).
    <https://example.org/time-span/1945>
        a crm:E52_Time-Span ;
        rdfs:label "1945" ;
        crm:P82a_begin_of_the_begin "1945-01-01T00:00:00Z" ;
        crm:P82b_end_of_the_end "1945-12-31T23:59:59Z" ;
        crmgeo:Q13_approximates [
            a crm:E52_Time-Span ;
            rdfs:label "circa 1945"
        ] .
Listing 4: RDF description, serialized in Turtle format, of the time expression “circa 1945” approximated by the time span 1945.

5. Incompleteness

In dealing with incompleteness, we must consider a trivial but essential fact: it is impossible to model as LOD what is unknown. Indeed, incompleteness is the only VISU information we cannot address directly in the modelling phase. However, conscious modelling of known information can help to address incompleteness through subsequent data analysis and in the hypothesis-making phase. Although we cannot model what we do not know, we can establish patterns of incompleteness against which we analyse the available information [21]. This approach first allows us to identify where and what information is missing and, secondly, to formulate new hypotheses with the help of data analysis. The first pattern of incomplete provenance information that we can identify concerns gaps in the object’s chain of activities. The importance of considering this kind of incompleteness for the integrity of a provenance record already emerges from the AAM guidelines. As we have already described, in provenance texts, events are divided by semicolons if transactions are direct and by periods if there are gaps in the ownership history of an object. Since we cannot directly model the presence of a gap as LOD, we must define a pattern to detect this incompleteness in the data. Linked Art describes the chronological linkage of provenance activities through the properties crm:P183_ends_before_the_start_of and crm:P183i_starts_after_the_end_of. These two properties allow us to determine whether an event occurred before or after another [22]. While they may help establish a chronological order of events, these properties are insufficient for identifying gaps between them.
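Because ordering alone does not reveal gaps, a detection rule has to compare the parties of consecutive activities. The following Python sketch is a minimal illustration under stated assumptions: it flags a gap whenever the party receiving the object in one activity is not the party parting with it in the next. The `Acquisition` class and the `find_gaps` function are hypothetical names, the activities are assumed to be already in chronological order, and plain strings stand in for the actors that would be identified by IRIs in actual LOD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acquisition:
    label: str
    transferred_from: Optional[str]  # None when the previous owner is not recorded
    transferred_to: str


def find_gaps(chain: list[Acquisition]) -> list[tuple[str, str]]:
    """Return pairs of consecutive activities between which ownership is unaccounted for."""
    gaps = []
    for earlier, later in zip(chain, chain[1:]):
        # A gap exists if the receiver of the earlier activity is not
        # the party that parts with the object in the later activity.
        if later.transferred_from != earlier.transferred_to:
            gaps.append((earlier.label, later.label))
    return gaps


cagnes_chain = [
    Acquisition("Acquired by Galerie Kahnweiler", "André Derain", "Galerie Kahnweiler"),
    Acquisition("Acquired by Louis Lion & Co.", None, "Louis Lion & Co."),
    Acquisition("Sold to Knoedler & Co.", "Louis Lion & Co.", "Knoedler & Co."),
    Acquisition("Sold to the Art Institute of Chicago", "Knoedler & Co.",
                "The Art Institute of Chicago"),
]

print(find_gaps(cagnes_chain))
```

For the “Cagnes” chain, the only pair reported is the one between the acquisition by Galerie Kahnweiler and the acquisition by Louis Lion & Co. In a triple store, the same comparison could be expressed as a query over crm:P22_transferred_title_to and crm:P23_transferred_title_from.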
To detect such gaps, we must formulate the incompleteness pattern of the chain of activities: there is a gap between two events, A and B, linked in chronological succession (Activity_A crm:P183_ends_before_the_start_of Activity_B) if the party who receives the object in Activity_A is not the one who parts with it in Activity_B. In Listing 5, we describe the activities involving the acquisition of the painting “Cagnes” by Galerie Kahnweiler from the artist and the subsequent acquisition by Louis Lion & Co. In this case, the scenario matches the incompleteness pattern of the chain of activities insofar as Galerie Kahnweiler is not recorded as the owner who gave the object to Louis Lion & Co. Identifying such a gap in analysis can lead to the formulation of new hypotheses since there may have been one or more intermediate owners prior to Louis Lion & Co. The gap in question is of significant interest to scholars as it conceals the events that caused the object to be moved from Paris to New York. Moreover, the gap overlaps with two world wars that affected, among other aspects, the circulation of artworks, legal or otherwise. In this scenario, the publication of provenance LOD is valuable because it allows us to analyse large amounts of provenance data from different institutions. Indeed, through network analysis, we can identify the most frequent pathways of artworks that, at some point in their lives, passed through Galerie Kahnweiler, as well as the most prominent agents from whom Louis Lion & Co. purchased artworks, thus opening up new hypotheses that try to bridge the gap.

    @prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # Subject IRIs are placeholders (hypothetical).
    # The type IRI is assumed to be AAT "provenance", as in Listing 1.
    <https://example.org/activity/kahnweiler>
        a crm:E7_Activity ;
        rdfs:label "Acquired by Galerie Kahnweiler from André Derain" ;
        crm:P2_has_type <http://vocab.getty.edu/aat/300055863> ;
        crm:P183_ends_before_the_start_of <https://example.org/activity/louis-lion> ;
        crm:P9_consists_of [
            a crm:E8_Acquisition ;
            crm:P22_transferred_title_to [ a crm:E74_Group ; rdfs:label "Galerie Kahnweiler" ] ;
            crm:P23_transferred_title_from [ a crm:E21_Person ; rdfs:label "André Derain" ] ;
            crm:P24_transferred_title_of [ a crm:E22_Human-Made_Object ; rdfs:label "Cagnes" ]
        ] .

    <https://example.org/activity/louis-lion>
        a crm:E7_Activity ;
        rdfs:label "Acquired by Louis Lion & Co." ;
        crm:P2_has_type <http://vocab.getty.edu/aat/300055863> ;
        crm:P9_consists_of [
            a crm:E8_Acquisition ;
            crm:P22_transferred_title_to [ a crm:E74_Group ; rdfs:label "Louis Lion & Co." ] ;
            crm:P24_transferred_title_of [ a crm:E22_Human-Made_Object ; rdfs:label "Cagnes" ]
        ] .

Listing 5: RDF description, serialized in Turtle format, of the acquisition of the painting “Cagnes” by Galerie Kahnweiler from the artist, and the subsequent acquisition by Louis Lion & Co.

Different patterns of incompleteness can result from other missing constituents of an activity. As we discussed in introducing Linked Art, the data model introduces a pattern of event constituents. An activity is determined not only by its participating actors, but also by time and place and the object(s) involved. In addition, an activity can consist of several sub-activities, depending on its type. Thus, Activity_A is incomplete if the time, place, or object(s) involved are not expressed, or if one or more of the sub-activities associated with its type are missing. When the time of an activity is unknown, incompleteness can be solved by generating vague information, that is, by defining that the event occurred in a time interval between the last previously known date before the activity and the first subsequently known date after the activity. As previously discussed in the section on vagueness, while we do not know when Louis Lion & Co.
acquired the painting “Cagnes”, we can infer that the activity occurred sometime between 1910 and 28 February 1957. The incompleteness of an activity’s location proves to be a more challenging piece of information to reconstruct from the sources, except for when an event is specific, like an auction. In provenance texts, we find mainly geographical information about the actors. This can sometimes be useful in hypothesising the locations where events occurred. For example, we can infer that the purchase of “Cagnes” by Knoedler & Co. from Louis Lion & Co. occurred in New York since both companies were located there. In contrast, the incompleteness concerning an activity and its sub-activities depends on the type of event, for which the Linked Art Data Model introduces a distinct structure. For example, as discussed, a purchase activity involves two sub-activities: 1) the acquisition of the sold object and 2) payment. In the previous section, we presented the LOD example of modelling the purchase of “Cagnes” in 1960 by the Art Institute of Chicago (Listing 1), the last event in the provenance record of that object. We can therefore assert that the activity is incomplete, since there is no sub-activity related to the payment made by the Art Institute of Chicago to the seller. Similarly, a provenance activity that concerns the exchange of one object for another will be incomplete if it consists of only one sub-activity, since one of the objects involved is not registered. Additional types of incompleteness, which are difficult to ascribe to a fixed pattern, concern the biographical information of the actors involved. Missing biographical information of interest to the reconstruction and study of provenance may be: birth and death (or formation and dissolution, in the case of organisations), period and place of activity, and relationships to other actors. 
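The chain-level gap pattern and the time-interval inference described earlier in this section can be sketched in a few lines of Python. This is only an illustrative sketch on plain data structures, not the author's implementation: the `Activity` class, the function names, and the assignment of the 1910 date to the Galerie Kahnweiler acquisition are assumptions made here so that the example reproduces the interval quoted in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Activity:
    """A provenance activity reduced to the fields the two patterns need."""
    label: str
    sender: Optional[str]            # party who parts with the object (P23)
    receiver: Optional[str]          # party who receives the object (P22)
    earliest: Optional[date] = None  # earliest possible date, if known
    latest: Optional[date] = None    # latest possible date, if known


def is_gap(a: Activity, b: Activity) -> bool:
    """True if the receiver in A is not the party who parts with the object in B.
    An unrecorded sender in B is treated as a gap as well."""
    return b.sender is None or a.receiver != b.sender


def find_gaps(chain: List[Activity]) -> List[Tuple[Activity, Activity]]:
    """Return every consecutive pair (A, B) of the chain that matches the pattern."""
    return [(a, b) for a, b in zip(chain, chain[1:]) if is_gap(a, b)]


def bound_undated(chain: List[Activity], index: int) -> Tuple[Optional[date], Optional[date]]:
    """Bound an undated activity by the nearest dated activity before it
    and the nearest dated activity after it."""
    lower = next((a.earliest for a in reversed(chain[:index]) if a.earliest), None)
    upper = next((a.latest for a in chain[index + 1:] if a.latest), None)
    return lower, upper


# Chain for "Cagnes" in chronological order.
# Assumption: the 1910 date belongs to the Galerie Kahnweiler acquisition.
chain = [
    Activity("Acquired by Galerie Kahnweiler from André Derain",
             sender="André Derain", receiver="Galerie Kahnweiler",
             earliest=date(1910, 1, 1), latest=date(1910, 12, 31)),
    Activity("Acquired by Louis Lion & Co.",
             sender=None, receiver="Louis Lion & Co."),
    Activity("Purchased by Knoedler & Co. from Louis Lion & Co.",
             sender="Louis Lion & Co.", receiver="Knoedler & Co.",
             earliest=date(1957, 2, 1), latest=date(1957, 2, 28)),
]

gaps = find_gaps(chain)
interval = bound_undated(chain, 1)
```

Here `gaps` contains the single pair formed by the Galerie Kahnweiler and Louis Lion & Co. acquisitions, and `interval` spans from 1910 to 28 February 1957. The bounds are deliberately the loosest ones the neighbouring activities allow; any further narrowing would have to come from additional evidence.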
In addition to the direct intervention of historians, it is possible to use external knowledge published as LOD to fill in such biographical gaps, for example the Getty’s Union List of Artist Names (ULAN).7 This controlled vocabulary can enrich our understanding of the actors of provenance activities with additional biographical information. In turn, enriching biographical information can help fill in other types of incompleteness. For example, by using the ULAN entity information of Louis Lion & Co. (ulan:500449799), we learn that the company has been in business since 1949. This new information allows us, in turn, to narrow down its purchase of the painting “Cagnes” to a time interval from 1949 to 28 February 1957.

7 https://www.getty.edu/research/tools/vocabularies/ulan/ (accessed 2023-08-11).

Finally, it should be noted that provenance texts have a considerable bias in the representation of women. Many women are represented by their husbands’ names (“Mrs John Doe”) or even by the expression “the artist’s wife”. Such expressions compromise the historical representation of women and make it difficult for historians to identify female actors. For example, expressions such as “the artist’s wife” are of little help if an artist had multiple wives. Modelling provenance LOD thus becomes an opportunity for historians to remedy such bias and finally give proper representation to people.

6. Subjectivity

Reconstructing the history of an art object is the result of laborious research by provenance experts, who hypothesise through the interpretation of sources what might have happened. Of course, the hypotheses of different experts may contradict each other, evolve with time, and become obsolete in light of new findings. As provenance texts stand, however, they cannot capture the hermeneutic and dialectical complexity of this intellectual process.
In fact, except for notes to provide additional context for specific hypotheses, the texts are not accompanied by any publication information. For example, the author’s name and publication date are critical metadata for information authority and versioning. The lack of versioning, in particular, can lead to the harmful practice of deleting a provenance text whenever an institution produces a new version. In this way, a debate concerning the provenance of an object is arbitrarily steered in a single direction, collapsing the idea that different historical interpretations can coexist.

It is possible to include publication information and versioning when publishing provenance LOD by implementing what is known as the data provenance of provenance data [23, 7, 24]. Just as we can trace an artwork’s ownership history, we can trace the recording history of a given datum through data provenance. The recording history tracks when a datum was created, by whom, and when it was modified.

CIDOC CRM introduces the class crm:E13_Attribute_Assignment, a subclass of crm:E7_Activity, to describe the context in which an assertion is made regarding an entity. An attribute assignment is the entity with which CIDOC CRM represents the n-ary relationship between the asserted entity and the assertion information. In this way, in addition to defining the asserted value, we can add additional statements to describe the context of the assertion, such as the author and date. Although this solution is also adopted in Linked Art to define, for example, authorship attribution, it tends to be verbose and redundant [25]. Focusing on the case of data provenance of provenance data, we found issues related to using attribute assignments to represent this type of information. An n-ary relation enables us to describe the context of an assertion pertaining only to a single statement.
However, in the case of provenance, hypothesis-making does not concern a single statement but the assertion of an entire event and, thus, multiple statements. In this scenario, should we model an attribute assignment for each statement, we would need to repeat the same information multiple times. This situation would be even more complex in the case of contradictory assumptions, as this requires us to produce multiple attribute assignments to describe conflicting hypotheses, resulting in an additional increase in statements. Moreover, such a solution would result in the coexistence in the same RDF graph of different and contradictory information about the same fact, compromising the usability of the data.

Given the nature of provenance information and the issues arising from the use of attribute assignments, we considered other approaches. Among the many methods to represent data provenance as LOD, nanopublication is one of the most suitable [26]. Nanopublication is a way of publishing an atomic unit of information as LOD, providing data provenance and publication information [27]. In this way, it is possible to trace and reference these atomic units of information independently of the entire dataset, making the knowledge expressed more authoritative and compliant with FAIR principles [28].

Table 1: HiCO classes and properties alignment with CIDOC CRM, importing the CRMinf module.

    HiCO                               CIDOC CRM (CRMinf)
    hico:InterpretationAct             crminf:I1_Argumentation
    hico:InterpretationCriterion       crm:E55_Type
    hico:hasInterpretationType         crm:P2_has_type
    crm:P14_carried_out_by             crm:P14_carried_out_by
    prov:startedAtTime                 crm:P4_has_time-span
    cito:citesAsEvidence               crm:P16_used_specific_object
    prov:wasGeneratedBy                crminf:J2_concluded_that → crminf:I2_Belief → crminf:J4_that
    hico:hasInterpretationCriterion    crm:P32_used_general_technique
    hico:isExtractedFrom               crm:P70i_is_documented_in
    prov:wasInfluencedBy               crm:P15_was_influenced_by

In presenting provenance LOD modelling according to the Linked Art Data Model, we have seen how provenance activities are the constitutive elements of an event-based model. In light of this, we consider the provenance activity as the atomic unit of a provenance record published as a nanopublication. Thus, publishing provenance LOD as a nanopublication implies publishing each provenance activity as a stand-alone, referenceable, and citable unit. In this way, two conflicting hypotheses about the same activity can coexist while older hypotheses that have become obsolete can remain accessible to scholars [29]. In addition, each nanopublication expresses metadata about the creation of the information and its publication. We can thus publish the data provenance of provenance data.

The structure of a nanopublication consists of three separate named graphs. A named graph is an RDF graph identified by a URI, which allows one to assert information about it [30]. The first graph of the nanopublication, the assertion graph, is devoted to the information on the published atomic unit. In the case of provenance data, it contains statements about a single provenance activity. The second graph, the provenance graph, is dedicated to the data provenance related to the assertion graph. It contains statements about how the knowledge expressed in the assertion graph was produced.
For example, in a nanopublication of a provenance activity, the provenance graph describes the context of the hypothesis formulated by an expert, including the author’s identity, the date, the scientific method used, and the sources consulted by the author. Finally, the publication info graph, the third graph of a nanopublication, provides metadata about the entire nanopublication, such as the creator, creation date, and license.

In expressing the subjectivity of information, such as the contexts of different hypotheses and the possible conflicts between them, it is necessary to focus on modelling the interpretation context in the provenance graph. The Historical Context Ontology (HiCO) is dedicated to expressing as LOD the context of a hermeneutic activity performed by a scholar in formulating a hypothesis through the interpretation of sources [31]. HiCO is an extension of the PROV ontology, the standard model dedicated to modelling data provenance on the web [32]. Given its purpose, such an ontology is ideal for representing provenance graph information. HiCO revolves around one activity: the interpretation (hico:InterpretationAct). This activity represents the action of the scholar in formulating a hypothesis of which, among other types of information, we can express the type of interpretation (hico:hasInterpretationType), the criterion of interpretation (hico:hasInterpretationCriterion), the time frame in which the interpretation was carried out (prov:startedAtTime), the resources used (cito:citesAsEvidence), and the influence of other hypotheses (prov:wasInfluencedBy). To integrate HiCO into the Linked Art Data Model, we propose aligning HiCO and CIDOC CRM, as shown in Table 1.
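One way to read Table 1 is as a set of rewriting rules from HiCO terms to CIDOC CRM terms. The Python sketch below applies those rules to triples held as plain tuples. It is a simplified illustration, not a published conversion tool: the `ex:` identifiers, the blank-node labels, and the function name are hypothetical, and the sketch only renames terms, so it does not adapt values whose expected form differs between the two models (for instance, a date literal versus a time-span entity). The one structural rule is the `prov:wasGeneratedBy` row, where a single property becomes a path through a belief node.

```python
# Class and property substitutions read off Table 1.
# crm:P14_carried_out_by maps to itself and therefore needs no entry.
CLASS_MAP = {
    "hico:InterpretationAct": "crminf:I1_Argumentation",
    "hico:InterpretationCriterion": "crm:E55_Type",
}
PREDICATE_MAP = {
    "hico:hasInterpretationType": "crm:P2_has_type",
    "prov:startedAtTime": "crm:P4_has_time-span",
    "cito:citesAsEvidence": "crm:P16_used_specific_object",
    "hico:hasInterpretationCriterion": "crm:P32_used_general_technique",
    "hico:isExtractedFrom": "crm:P70i_is_documented_in",
    "prov:wasInfluencedBy": "crm:P15_was_influenced_by",
}


def hico_to_crm(triples):
    """Rewrite (subject, predicate, object) triples according to Table 1."""
    out = []
    beliefs = 0
    for s, p, o in triples:
        if p == "prov:wasGeneratedBy":
            # HiCO: assertion wasGeneratedBy interpretation.
            # CRMinf: argumentation J2_concluded_that belief; belief J4_that assertion.
            beliefs += 1
            belief = f"_:belief{beliefs}"  # hypothetical blank-node label
            out.append((o, "crminf:J2_concluded_that", belief))
            out.append((belief, "rdf:type", "crminf:I2_Belief"))
            out.append((belief, "crminf:J4_that", s))
        elif p == "rdf:type":
            out.append((s, p, CLASS_MAP.get(o, o)))
        else:
            out.append((s, PREDICATE_MAP.get(p, p), o))
    return out


# Hypothetical input: an interpretation act that cites a letter as evidence
# and generates an assertion graph.
example = [
    ("ex:interpretation", "rdf:type", "hico:InterpretationAct"),
    ("ex:interpretation", "hico:hasInterpretationType", "ex:provenance-remark"),
    ("ex:interpretation", "cito:citesAsEvidence", "ex:letter"),
    ("ex:assertion", "prov:wasGeneratedBy", "ex:interpretation"),
]
converted = hico_to_crm(example)
```

Note that the direction of the generation link is inverted by the rewrite: in the HiCO input the assertion points to the interpretation, whereas in the output the argumentation points, via the belief, to the assertion.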
In aligning the two ontologies, it is necessary to use CRMinf, a CIDOC CRM module dedicated to modelling inference-making activities.8 Specifically, CRMinf introduces the argumentation activity (crminf:I1_Argumentation), semantically comparable with HiCO’s interpretation act (hico:InterpretationAct). In addition, the module allows for more granular modelling of the argumentation result, expressed in the assertion graph. While HiCO uses the PROV ontology property prov:wasGeneratedBy to indicate that the assertion graph resulted from an interpretation act, CRMinf uses an n-ary relation. As a result, the argumentation generates a belief (crminf:I2_Belief), which is, in turn, expressed by the assertion graph. As discussed in the next section on uncertainty, an n-ary relation allows one to assert information about the relation, which is impossible in a binary relation.

8 https://www.cidoc-crm.org/crminf/ (accessed 2023-08-11).

    @prefix crm: .
    @prefix crminf: .
    @prefix rdfs: .
    @prefix np: .
    @prefix dct: .

    {
        a np:Nanopublication ;
            np:hasAssertion ;
            np:hasProvenance ;
            np:hasPublicationInfo .
    }

    {
        a crm:E7_Activity ;
            rdfs:label "Purchased by Knoedler & Co. from Louis Lion & Co. in February 1957" ;
            crm:P2_has_type ;
            crm:P2_has_type ;
            crm:P4_has_time-span [
                a crm:E52_Time-Span ;
                crm:P82a_begin_of_the_begin "1957-02-01T00:00:00Z" ;
                crm:P82b_end_of_the_end "1957-02-28T23:59:59Z"
            ] ;
            crm:P9_consists_of [
                a crm:E8_Acquisition ;
                crm:P22_transferred_title_to [
                    a crm:E74_Group ;
                    rdfs:label "Knoedler & Co."
                ] ;
                crm:P23_transferred_title_from [
                    a crm:E74_Group ;
                    rdfs:label "Louis Lion & Co."
                ] ;
                crm:P24_transferred_title_of [
                    a crm:E22_Human-Made_Object ;
                    rdfs:label "Cagnes"
                ]
            ] .
    }

    {
        a crminf:I1_Argumentation ;
            crm:P2_has_type ;
            crm:P14_carried_out_by [
                a crm:E74_Group ;
                rdfs:label "The Art Institute of Chicago"
            ] ;
            crm:P16_used_specific_object [
                a crm:E33_Linguistic_Object ;
                rdfs:label "letter from Knoedler and Co., Apr. 8, 1975." ;
                crm:P2_has_type ;
                crm:P94i_was_created_by [
                    a crm:E65_Creation ;
                    crm:P4_has_time-span [
                        a crm:E52_Time-Span ;
                        crm:P82a_begin_of_the_begin "1975-04-08T00:00:00Z" ;
                        crm:P82b_end_of_the_end "1975-04-08T23:59:59Z"
                    ] ;
                    crm:P14_carried_out_by [
                        a crm:E74_Group ;
                        rdfs:label "Knoedler & Co."
                    ]
                ]
            ] ;
            crminf:J2_concluded_that [
                a crminf:I2_Belief ;
                crminf:J4_that
            ] .
    }

    {
        dct:created "2023-08-11T16:31:08Z" ;
            dct:creator ;
            dct:source ;
            dct:license .
    }

Listing 6: Nanopublication, serialized in TriG format, of the purchase of the painting “Cagnes” by Knoedler & Co. from Louis Lion & Co. in February 1957.

Listing 6 shows the nanopublication of the provenance activity in which Knoedler & Co. purchased the painting “Cagnes” from Louis Lion & Co. in February 1957. The structure of the nanopublication is defined using the Nanopublication Ontology.9 According to the note in the original provenance text, the assumption made by the Art Institute of Chicago is based on a “letter from Knoedler and Co., Apr. 8, 1975.” The information is structured using HiCO’s alignment to CIDOC CRM. The Getty AAT vocabulary is used to assign the entity types, as is standard practice in Linked Art. In particular, the argumentation has the entity type “provenance remark” (aat:300444173), while the linguistic object used to formulate hypotheses has the entity type “letter” (aat:300026879). The metadata of the publication info graph, such as creation date, creator, source and license, are structured using properties from the Dublin Core Metadata Initiative (DCMI) Metadata Terms, a set of standardized metadata elements to describe digital resources.10

9 https://nanopub.net/guidelines/working_draft/ (accessed 2023-08-11).
10 https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ (accessed 2023-08-11).

7. Uncertainty

According to the literature, the terms “uncertainty” and “vagueness” are related, if not conflated.
For example, in documenting evidence interpretation in archaeology using CIDOC CRM, Niccolucci and Hermon merge the concepts of vagueness and uncertainty into a single concept of reliability [33]. However, as discussed in previous sections, we distinguish between vagueness and uncertainty. The reliability of vague information lies in the accuracy of the data approximation. In contrast, the reliability of uncertain information lies in the probability of the data’s factuality. In light of what was discussed in the previous section, we can therefore correlate the concept of uncertainty to subjectivity, as it expresses the degree of confidence in making a hypothesis.

As we have already seen, AAM guidelines introduce the possibility of expressing uncertainty about a piece of information. Terms such as “possibly” or “probably” express levels of uncertainty depending on the provenance expert’s degree of confidence. Regarding provenance LOD modelling, Art Tracks uses a boolean value to express certainty about some information [34], while Linked Art deliberately avoids adding this degree of complexity. Examining other attempts to model uncertainty in CIDOC CRM, in the previously mentioned work by Niccolucci and Hermon, the reliability of information is expressed through fuzzy logic, with a subjective coefficient ranging from 0 (not credible) to 1 (absolutely true) [33].

When analysing provenance texts, we noticed that uncertainty coincides with the patterns we identified when dealing with incompleteness. In the presence of a gap, hypotheses become less confident. Since uncertainty is related to making hypotheses, we could have multiple contradictory hypotheses of varying degrees of certainty to fill a given gap. For this reason, the nanopublication solution is effective, since it can separate the various hypotheses, each with its own degree of certainty, into different assertion graphs and thus allow them to coexist.
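The coexistence of competing hypotheses can be illustrated with a small Python sketch in which each hypothesis lives in its own assertion graph and carries its own degree of certainty. Everything concrete in it is hypothetical and serves only as illustration: the graph names, the two claims, and the experts are invented, and ranking “probably” above “possibly” is an assumption about how the two AAM-style terms compare.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed ordering of the two uncertainty terms, least confident first.
CERTAINTY_RANK = {"possibly": 0, "probably": 1}


@dataclass(frozen=True)
class Hypothesis:
    graph: str      # hypothetical name of the assertion graph holding the claim
    gap: str        # the gap in the chain that the hypothesis tries to fill
    claim: str      # short description of the hypothesised event
    certainty: str  # "possibly" or "probably"
    author: str     # who formulated the hypothesis


def competing(hypotheses):
    """Group hypotheses by the gap they address, most confident first.
    Nothing is merged or discarded: every hypothesis keeps its own graph."""
    by_gap = defaultdict(list)
    for h in hypotheses:
        by_gap[h.gap].append(h)
    return {
        gap: sorted(hs, key=lambda h: CERTAINTY_RANK[h.certainty], reverse=True)
        for gap, hs in by_gap.items()
    }


# Two invented, mutually contradictory hypotheses about the same gap.
GAP = "Galerie Kahnweiler -> Louis Lion & Co."
hypotheses = [
    Hypothesis("ex:assertion-1", GAP,
               "sold directly by Galerie Kahnweiler", "possibly", "Expert A"),
    Hypothesis("ex:assertion-2", GAP,
               "passed through an unrecorded intermediate owner", "probably", "Expert B"),
]
ranked = competing(hypotheses)
```

Because the two claims are stored under different graph names, asserting one never overwrites the other; the ranking is only a view computed on top of them.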
While modelling uncertainty as information associated with the act of interpreting has already been proven possible using HiCO [35], we take a different approach. We align HiCO with CIDOC CRM, particularly with CRMinf. The use of this module to model uncertainty in provenance data has already been hypothesised by Smith in analysing the potential of provenance LOD [36]. As we have seen when dealing with subjectivity modelling, our alignment involves describing the product of the crminf:I1_Argumentation expressed in the assertion graph with an n-ary relation through the crminf:I2_Belief entity. The argumentation does not generate an assertion graph but instead concludes with a belief that is, in turn, expressed in the assertion graph. Thus, we can link additional information to the crminf:I2_Belief entity, such as the crminf:I6_Belief_Value. The belief value represents the truth value of a belief produced by an argumentation. The CRMinf module requires determining a belief value scale with at least three values. Staying true to the approach of the Linked Art Data Model, we delineate a belief value scale within Getty’s AAT vocabulary. A crminf:I2_Belief can have as crminf:I6_Belief_Value: “true” (aat:300068765), “probably” (aat:300435721), “possibly” (aat:300435722), and “obsolete” (aat:300404908). The uncertainty terminology already used according to the AAM guidelines reoccurs through this new scale of values. In addition, we include the option of assuming the obsolescence of a given assumption. This option is fundamental to the data provenance of provenance data as it allows hypotheses to be discarded without eliminating them permanently, thus leaving them as evidence of the hermeneutic process concerning a given fact. What is obsolete for one provenance expert may not be obsolete according to another.

    @prefix crm: .
    @prefix crminf: .
    @prefix rdfs: .
    @prefix np: .
    @prefix dct: .

    {
        a np:Nanopublication ;
            np:hasAssertion ;
            np:hasProvenance ;
            np:hasPublicationInfo .
    }

    {
        a crm:E7_Activity ;
            rdfs:label "Acquired by Galerie Kahnweiler from André Derain" ;
            crm:P2_has_type ;
            crm:P183_ends_before_the_start_of ;
            crm:P9_consists_of [
                a crm:E8_Acquisition ;
                crm:P22_transferred_title_to [
                    a crm:E74_Group ;
                    rdfs:label "Galerie Kahnweiler"
                ] ;
                crm:P23_transferred_title_from [
                    a crm:E21_Person ;
                    rdfs:label "André Derain"
                ] ;
                crm:P24_transferred_title_of [
                    a crm:E22_Human-Made_Object ;
                    rdfs:label "Cagnes"
                ]
            ] .
    }

    {
        a crminf:I1_Argumentation ;
            crm:P2_has_type ;
            crm:P14_carried_out_by [
                a crm:E74_Group ;
                rdfs:label "The Art Institute of Chicago"
            ] ;
            crminf:J2_concluded_that [
                a crminf:I2_Belief ;
                crminf:J5_holds_to_be ;
                crminf:J4_that
            ] .
    }

    {
        dct:created "2023-08-11T16:35:12Z" ;
            dct:creator ;
            dct:source ;
            dct:license .
    }

Listing 7: Nanopublication, serialized in TriG format, of the probable acquisition of the painting “Cagnes” by Galerie Kahnweiler from the artist.

Listing 7 shows the nanopublication of the provenance event in which Galerie Kahnweiler probably acquired the painting “Cagnes” directly from the artist. In this case, since the level of certainty was expressed with the term “probably” in the original text, we can describe the value held by the belief generated by the argumentation with the entity aat:300435721.

8. Discussion and Conclusion

The classification of VISU information differentiates among four distinct yet correlated types of information, each pertaining to a specific intervention by the provenance expert. Vagueness, subjectivity, and uncertainty represent information categories we depend on when provenance records are incomplete. In the absence of information, the provenance expert can fill the gap by approximating data, formulating hypotheses, and expressing varying degrees of confidence in reconstructing facts.
Although these terms are often used synonymously, the VISU classification distinguishes between vagueness and uncertainty. In the classification’s context, vagueness pertains to the approximation of temporal and geographical information, thereby addressing the precision of the data. CIDOC CRM offers valuable elements for representing vague temporal information by modelling dates as time spans. Additionally, it enables the representation of vague spatial information by utilising the property crm:P189_approximates. Linked Art already includes such solutions. As we have discussed, to extend the modelling of temporal information approximation, we integrate the CRMgeo module. In this way, it is possible to describe a relation between a vague time span and its approximation using the property crmgeo:Q13_approximates.

By its nature, incompleteness is the only VISU information we cannot model in LOD. However, we can address incompleteness in analysis and hypothesis-making by carefully modelling the available information. Thanks to the event-based schema of CIDOC CRM and the application profile of Linked Art, we can formulate patterns for analysing incompleteness between and within different events. On the one hand, these patterns make conscious modelling of the available information possible, for example, by always including the sender, that is, the party who parts with the object in an event. On the other hand, identifying and analysing gaps in provenance records makes it possible to gain new insights into the state of provenance research on a large scale, helping to determine which artworks, collectors, and historical periods to prioritise in research efforts.

In the classification of VISU information, subjective and uncertain information is correlated. Modelling it requires a change of approach from what CIDOC CRM proposes, since modelling the assertion context for each triple related to a single provenance event proves inefficient and repetitive.
For this reason, we introduced a different approach by publishing provenance LOD as a nanopublication. The nanopublication of provenance LOD involves publishing each provenance event as an atomic unit, of which we describe the data provenance information, thus implementing the data provenance of provenance data. In this way, we model the information asserted and the context of the hypothesis, such as author, date, and sources used. In addition, we can include conflicting hypotheses by modelling them in distinct RDF graphs. In the literature, there are already ontologies suitable for modelling the context in which a hypothesis is formulated, such as HiCO. We, therefore, aligned HiCO with CIDOC CRM, using the CRMinf module to describe inference-making activities. Since uncertain information is related to the degree of confidence with which an expert makes a hypothesis, it is possible to model this uncertainty as LOD by qualifying the hypothesis-making context and implementing a belief value scale using terms from the Getty’s AAT vocabulary.

Although compatible with the Linked Art Data Model, the solutions discussed for including VISU information in publishing provenance LOD should be considered an external module rather than a proposed extension. The representation and analysis of VISU information involve areas that, as we have seen, are deliberately outside the scope of Linked Art. One of the purposes of the Linked Art application profile is to make LOD information accessible and usable to institutional insiders. In this way, solutions such as nanopublications, although an established good practice in sharing scientific data compliant with FAIR principles, can be barriers to institutional practitioners. Indeed, the large volume of provenance texts from which we need to extract data would make it even more challenging to publish provenance LOD as nanopublications.
As discussed in this paper, VISU information is critical to the integrity of provenance LOD, and we cannot do without it in the name of simplicity. On the contrary, VISU information represents the complexity inherent in the effort to reconstruct historical events, as well as the contradictory assumptions that arise from the plurality of historical debates. Balancing the effort of structuring provenance information as LOD with the qualitative care of VISU information requires a human-in-the-loop approach [37]. This means that, on the one hand, quantitative data structuring from provenance texts can be performed automatically by addressing natural language processing tasks through AI [38]. On the other hand, the qualitative curation of the data remains the responsibility of domain experts who can evaluate de visu, with their own eyes, the most ambiguous information.

Acknowledgments

The author would like to thank the three anonymous reviewers for their constructive feedback. I extend my gratitude to Marilena Daquino for valuable input and to Max Koss, Lynn Rother, and Liza Weber for their efforts in editing the article.

References

[1] S. Raux, From Mariette to Joullain: Provenance and Value in Eighteenth-Century French Auction Catalogs, in: G. Feigenbaum, I. J. Reist (Eds.), Provenance: An Alternate History of Art, Getty Research Institute, Los Angeles, CA, 2012, pp. 85–103.
[2] J. Gramlich, Reflections on Provenance Research: Values – Politics – Art Markets, Journal for Art Market Studies 1 (2017). doi:10.23690/JAMS.V1I2.15.
[3] United States Department of State, Washington Conference Principles on Nazi-Confiscated Art, 1998. URL: https://www.state.gov/washington-conference-principles-on-nazi-confiscated-art/.
[4] C. Fuhrmeister, M. Hopp, Rethinking Provenance Research, Getty Research Journal 11 (2019) 213–231. doi:10.1086/702755.
[5] L. Rother, M. Koss, F.
Mariani, Taking Care of History: Toward a Politics of Provenance Linked Open Data in Museums, in: E. Canning, E. Fry (Eds.), Perspectives on Data, The Art Institute of Chicago, Chicago, IL, 2022. doi:10.53269/9780865593152/06.
[6] A. Luther, Digital Provenance, Open Access, and Data-Driven Art History, in: K. Brown (Ed.), The Routledge Companion to Digital Humanities and Art History, 1 ed., Routledge, Taylor & Francis Group, New York, NY, 2020, pp. 448–458. doi:10.4324/9780429505188-38.
[7] D. Newbury, L. Lippincott, Provenance in 2050, in: J. Milosch, N. Pearce (Eds.), Collecting and Provenance: A Multidisciplinary Approach, Rowman & Littlefield Publishers, Lanham, MD, 2020, pp. 101–109.
[8] N. H. Yeide, K. Akinsha, A. L. Walsh, The AAM Guide to Provenance Research, American Association of Museums, Washington, DC, 2001.
[9] M. Piotrowski, Accepting and Modeling Uncertainty, Zeitschrift für digitale Geisteswissenschaften 4 (2019). doi:10.17175/SB004_006A.
[10] M. Smithson, Ignorance and Uncertainty: Emerging Paradigms, Cognitive Science, Springer, New York, NY, 1989. doi:10.1007/978-1-4612-3628-3.
[11] P. Smets, Imperfect Information: Imprecision and Uncertainty, in: A. Motro, P. Smets (Eds.), Uncertainty Management in Information Systems, Springer, Boston, MA, 1997, pp. 225–254. doi:10.1007/978-1-4615-6245-0_8.
[12] G. Nagypál, B. Motik, A Fuzzy Model for Representing Uncertain, Subjective, and Vague Temporal Knowledge in Ontologies, in: R. Meersman, Z. Tari, D. C. Schmidt (Eds.), On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2003, pp. 906–923. doi:10.1007/978-3-540-39964-3_57.
[13] P. B. Jaskot, Digital Methods and the Historiography of Art, in: K. Brown (Ed.), The Routledge Companion to Digital Humanities and Art History, 1 ed., Routledge, Taylor & Francis Group, New York, NY, 2020, pp. 9–17. doi:10.4324/9780429505188-3.
[14] M. D. Wilkinson, M.
Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos, P. E. Bourne, J. Bouwman, A. J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C. T. Evelo, R. Finkers, A. Gonzalez-Beltran, A. J. G. Gray, P. Groth, C. Goble, J. S. Grethe, J. Heringa, P. A. C. ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S. J. Lusher, M. E. Martone, A. Mons, A. L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M. A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, B. Mons, The FAIR Guiding Principles for Scientific Data Management and Stewardship, Scientific Data 3 (2016). doi:10.1038/sdata.2016.18.
[15] D. Newbury, Art Tracks: Using Linked Open Data for Object Provenance in Museums, MW17: Museums and the Web 2017 (2017).
[16] M. Doerr, The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata, AI Magazine 24 (2003) 75. doi:10.1609/aimag.v24i3.1720.
[17] B. Rieder, T. Röhle, Digital Methods: Five Challenges, in: D. M. Berry (Ed.), Understanding Digital Humanities, Palgrave Macmillan UK, London, 2012, pp. 67–84. doi:10.1057/9780230371934_4.
[18] J. Holmen, C.-E. Ore, Deducing Event Chronology in a Cultural Heritage Documentation System, in: B. Frischer, J. Webb Crawford, D. Koller (Eds.), Making History Interactive. Computer Applications and Quantitative Methods in Archaeology (CAA). 37th International Conference, Williamsburg, Virginia, United States of America, March 22-26 (BAR International Series S2079), Archaeopress, Oxford, 2010, pp. 122–129.
[19] G. Hiebel, M. Doerr, Ø. Eide, CRMgeo: A Spatiotemporal Extension of CIDOC-CRM, International Journal on Digital Libraries 18 (2017) 271–279. doi:10.1007/s00799-016-0192-4.
[20] G. Hiebel, M. Doerr, K. Hanke, A.
Masur, How to Put Archaeological Geometric Data into Context? Representing Mining History Research with CIDOC CRM and Extensions, International Journal of Heritage in the Digital Era 3 (2014) 557–577. doi:10.1260/2047-4970.3.3.557.
[21] M. Destandau, J.-D. Fekete, The Missing Path: Analysing Incompleteness in Knowledge Graphs, Information Visualization 20 (2021) 66–82. doi:10.1177/1473871621991539.
[22] M. Papadakis, M. Doerr, Temporal Primitives, an Alternative to Allen Operators, in: P. Ronzino (Ed.), Proceedings of the Workshop on Extending, Mapping and Focusing the CRM co-located with 19th International Conference on Theory and Practice of Digital Libraries (2015), Poznań, Poland, September 17, 2015, CEUR Workshop Proceedings, CEUR-WS.org, 2015, pp. 69–78.
[23] C. Huemer, The Provenance of Provenances, in: J. Milosch, N. Pearce (Eds.), Collecting and Provenance: A Multidisciplinary Approach, Rowman & Littlefield Publishers, Lanham, MD, 2020, pp. 2–15.
[24] S. Al-Eryani, G. Bucher, S. Rühle, Ein Metadatenmodell für gemischte Sammlungen, Bibliotheksdienst 52 (2018) 548–564. doi:10.1515/bd-2018-0066.
[25] M. Daquino, V. Pasqual, F. Tomasi, F. Vitali, Expressing Without Asserting in the Arts, in: G. M. Di Nunzio, B. Portelli, D. Redavid, G. Silvello (Eds.), Proceedings of the 18th Italian Research Conference on Digital Libraries, Padua, Italy, February 24-25, 2022, CEUR Workshop Proceedings, CEUR-WS.org, 2022.
[26] L. F. Sikos, D. Philp, Provenance-Aware Knowledge Representation: A Survey of Data Models and Contextualized Knowledge Graphs, Data Science and Engineering 5 (2020) 293–316. doi:10.1007/s41019-020-00118-0.
[27] P. Groth, A. Gibson, J. Velterop, The Anatomy of a Nanopublication, Information Services & Use 30 (2010) 51–56. doi:10.3233/ISU-2010-0613.
[28] H. P. Sustkova, K. M. Hettne, P. Wittenburg, A. Jacobsen, T. Kuhn, R. Pergl, J. Slifka, P. McQuilton, B. Magagna, S.-A. Sansone, M. Stocker, M. Imming, L. Lannom, M. Musen, E.
Schultes, FAIR Convergence Matrix: Optimizing the Reuse of Existing FAIR-Related Resources, Data Intelligence 2 (2020) 158–170. doi:10.1162/dint_a_00038 . [29] I. Asif, I. Tiddi, A. J. G. Gray, Using Nanopublications to Detect and Explain Contradictory Research Claims, in: 2021 IEEE 17th International Conference on eScience, IEEE, New York, NY, 2021, pp. 1–10. doi:10.1109/eScience51609.2021.00010 . [30] J. J. Carroll, C. Bizer, P. Hayes, P. Stickler, Named Graphs, Provenance and Trust, in: Proceedings of the 14th International Conference on World Wide Web - WWW ’05, Association for Computing Machinery, New York, NY, 2005, pp. 613–622. doi:10.1145/ 1060745.1060835 . 83 Fabio Mariani CEUR Workshop Proceedings 63–84 [31] M. Daquino, F. Tomasi, Historical Context Ontology (HiCO): A Conceptual Model for Describing Context Information of Cultural Heritage Objects, in: E. Garoufallou, R. J. Hartley, P. Gaitanou (Eds.), Metadata and Semantics Research, Communications in Com- puter and Information Science, Springer International Publishing, Cham, 2015, pp. 424–436. doi:10.1007/978- 3- 319- 24129- 6_37 . [32] L. Moreau, P. Groth, Provenance: An Introduction to PROV, Synthesis Lectures on Data, Semantics, and Knowledge, Springer International Publishing, Cham, 2013. doi:10.1007/ 978- 3- 031- 79450- 6 . [33] F. Niccolucci, S. Hermon, Expressing Reliability with CIDOC CRM, International Journal on Digital Libraries 18 (2017) 281–287. doi:10.1007/s00799- 016- 0195- 1 . [34] T. Berg-Fulton, D. Newbury, T. Snyder, Art Tracks: Visualizing the Stories and Lifespan of an Artwork, MW2015: Museums and the Web 2015 (2015). [35] M. Daquino, V. Pasqual, F. Tomasi, Knowledge Representation of Digital Hermeneutics of Archival and Literary Sources, JLIS.it 11 (2020) 59–76. doi:10.4403/jlis.it- 12642 . [36] J. Smith, Toward “Big Data” in Museum Provenance, in: G. Schiuma, D. 
Carlucci (Eds.), Big Data in the Arts and Humanities: Theory and Practice, Data Analytics Applications, Auerbach Publishers, New York, NY, 2018, pp. 41–50. [37] L. Rother, F. Mariani, M. Koss, Interpreting Strings, Weaving Threads: Structuring Prove- nance Data with AI, in: Sammlungsforschung im digitalen Zeitalter. Chancen, Heraus- forderungen und Grenzen, Wallstein, Göttingen, 2023. Forthcoming. [38] L. Rother, F. Mariani, M. Koss, Hidden Value: Provenance as a Source for Economic and Social History, Economic History Yearbook 64 (2023) 111–142. doi:doi:10.1515/ jbwg- 2023- 0005 . 84