<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic knowledge-graph creation from historical documents: The Chilean dictatorship as a case study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Camila Díaz</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jocelyn Dunstan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorena Etcheverry</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonia Fonck</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandro Grez</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domingo Mery</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Reutter</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hugo Rojas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Pontificia Universidad Católica de Chile</institution>
          ,
          <country country="CL">Chile</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidad Alberto Hurtado</institution>
          ,
          <country country="CL">Chile</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>VioDemos</institution>
          ,
          <country country="CL">Chile</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present our results regarding the construction of a knowledge graph from historical documents related to the Chilean dictatorship period (1973-1990). Our approach uses LLMs to automatically recognize entities and relations between them and resolve conflicts between these values. To prevent hallucination, the interaction with the LLM is grounded in a simple ontology with four types of entities and seven types of relations. To evaluate our architecture, we use a gold standard graph constructed using a small subset of the documents, and compare this to the graph obtained from our approach when processing the same set of documents. Results show that the automatic construction manages to recognize a good portion of all the entities in the gold standard and that those not recognized are explained mainly by the level of granularity in which the information is structured in the graph and not because the automatic approach misses an important entity in the graph. Looking forward, we expect this report to encourage work on other similar projects focused on enhancing research in humanities and social science. However, we remark that better evaluation metrics are needed to accurately fine-tune these types of architectures.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Knowledge graphs have been identified as a promising tool for analyzing historical documents
[
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. Indeed, given a collection of documents, one can build a knowledge graph by identifying
all relevant entities and relations between them. The construction and usage of such knowledge
graphs allow shifting focus from a document-centric approach, in which analysts must find
their information within a collection of documents, to an entity-centric approach, in which
users can immediately find relevant entities in knowledge graphs, together with other helpful
information such as the neighborhood of these entities, the relation or paths between them,
and other more complex patterns.
      </p>
      <p>While promising, the construction of these knowledge graphs is a challenging, expensive
endeavor. To identify entities, one must read all relevant documents, and this list must be
constantly curated to avoid duplication. The same applies to the discovery of the relations
between entities. One way of partially preventing the cost of constructing knowledge graphs is
to look for automatic or semi-automatic techniques in which one leverages recent advances in
Natural Language Processing (NLP) to take care of the construction of the knowledge graph or
to deliver a preliminary result that is then later manually curated at a much lower cost.</p>
      <p>
        In this paper, we present results regarding the automatic construction of a knowledge graph
containing information about the human rights violations committed by the Chilean dictatorship
of Augusto Pinochet between 1973 and 1990. Transitional justice began in Chile after the
peaceful return to democracy in 1990. In the 34 years since, there has been notable progress
in the five different elements of transitional justice—the search for truth, justice, reparation,
memory, and non-repetition—although important challenges remain
pending [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
      <p>For the work presented here, we use documents retrieved from the digital archive
Memoria Viva (available at https://memoriaviva.com/nuevaweb/). This repository is part of the NGO Human Rights International Project, initially
established by Chilean refugees and human rights activists in London. Their goal is to collect
and make available to a broad public the documentation that records the crimes committed
by the dictatorial state apparatus, relying on primary and secondary sources such as Official
Reports, Judicial and Police Archives, Archives of Human Rights Organizations, Testimonials,
Journalistic Documentation/Books, and International Archives.</p>
      <p>Our ultimate goal is the creation of innovative computational methods specifically designed
for analyzing historical documents, which will enable the development of tools to integrate
currently fragmented information while adhering to the necessary quality standards. This
endeavor is also intended to impact the process of building historical knowledge, providing a
space for the interrelation between fragments that enables more integrated and comprehensive
analyses, supporting the work of different disciplines and organizations.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Information extraction tasks are usually divided into open and closed tasks. Open information
extraction (OIE) is designed to derive relation triplets from unstructured text by directly using
entities and relations from the sentences without adhering to a fixed schema. In contrast, closed
information extraction (CIE) focuses on extracting factual data from text that conforms to a
predetermined set of relations or entities, as detailed by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Our approach can be classified under
the CIE paradigm; however, instead of a predefined set of concrete entities or
relations, we use a simple domain ontology to guide the process of extracting
entities and the relationships between them.
      </p>
      <p>Regarding the machine learning models used to extract entities and relationships, several papers
in recent years have explored comprehensive methods that use a single machine learning model
for joint named-entity recognition and relation extraction (NERRE) [7]. These methods
typically take a sequence-to-sequence approach in which a model is trained to output tuples of
two or more named entities together with a relationship label drawn from a predefined set of possible
relationships between them. In [8], the authors present an approach using LLMs for NERRE in
the materials science domain, extracting hierarchical entities and relationships between them.
Our work with historical documents presents new challenges for this task: the documents we
process contain varied information, ranging from judicial processes to family relationships.
This means the extraction process must be general enough to accommodate heterogeneous
information instead of focusing on a particular domain.</p>
      <p>Although open and closed information extraction differ, both attempt to convert unstructured
text into structured knowledge, usually represented as triples that help outline relationships
but offer limited entity-level knowledge. It is often assumed that two triples refer to the same
entity if their topics coincide. However, this assumption may not hold in practice.
Furthermore, the evaluation of these tasks is based on precision, recall, and F1 at the triplet level, which
can lead to erroneous conclusions about entity understanding. Some recent works propose new
metrics to evaluate the quality of the results beyond these classical metrics. Among
them, the work defining the Approximate Entity Set OverlaP (AESOP) metric stands out [9]. In
this paper, we take inspiration from that work to find metrics suitable for comparing graphs
produced by two possibly different black-box architectures, allowing comparison only at the
level of the final answer.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Our approach</title>
      <p>Our approach is summarized in Fig. 1. The idea is to repeatedly prompt an LLM for entities and
relations and to resolve duplicates among them. The result is a graph, which is later post-processed to
remove additional redundancy and fix possible mistakes from the previous steps.</p>
      <sec id="sec-3-1">
        <title>3.1. A simple (fixed) ontology</title>
        <p>Our architecture considers a simple ontology in which we fix certain types of entities and
relationships. We then carefully design prompts that use the concepts of the ontology to
extract these entities and relationships. We do this to reduce hallucinations by the LLM,
while keeping the ontology general enough not to lose valuable information. The proposed
entities and relations are presented in Table 3.1. Furthermore, each relation type also restricts
the types of its origin and target entities. In the rest of the section, we omit entity types that
are clear from context.</p>
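        <p>For concreteness, the ontology can be encoded as plain data. The sketch below is illustrative rather than the authors' implementation: the four entity types and the seven relation names are those used in this paper (the first four in their shortened form), but the domain/range assignment of each shortened relation is our assumption from context.</p>

```python
# Illustrative sketch (not the authors' code): the fixed ontology as plain data.
# The four entity types and seven relation names appear in the paper; the
# subject/object types of the first four relations are assumptions from context.
ENTITY_TYPES = {"Individual", "Organization", "Location", "Event"}

# relation name: (subject entity type, object entity type)
RELATION_TYPES = {
    "IsPartOf": ("Organization", "Organization"),   # assumed domain/range
    "IsContainedIn": ("Location", "Location"),      # assumed domain/range
    "OccursAt": ("Event", "Location"),              # assumed domain/range
    "WasPresentAt": ("Individual", "Location"),     # assumed domain/range
    "IndividualIsRelatedToOrganization": ("Individual", "Organization"),
    "IndividualIsRelatedToEvent": ("Individual", "Event"),
    "OrganizationIsRelatedToEvent": ("Organization", "Event"),
}

def is_valid_triple(subj_type: str, relation: str, obj_type: str) -> bool:
    """Check a candidate triple against the ontology's type restrictions."""
    if relation not in RELATION_TYPES:
        return False
    expected_subj, expected_obj = RELATION_TYPES[relation]
    return subj_type == expected_subj and obj_type == expected_obj
```

        <p>Grounding the extraction prompts in such a table is what lets the pipeline reject triples whose endpoint types do not match the relation.</p>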
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Extracting Entities and Relations</title>
        <p>Entities and relations are extracted by prompting LLMs through OpenAI’s API (we use the
GPT-4o-mini model, but if the model takes too long or repeatedly fails for one prompt, we use
GPT-4o for that particular prompt and document; see https://platform.openai.com/docs/models). The approach
used is zero-shot prompting, as no examples of the expected output are included. However, the
expected behavior and the expected structure of the output are detailed in a JSON document. We extract each
type of entity or relation independently, with a different, manually created prompt that broadly
specifies the context of the documents, the data expected in the response, and the expected
structure.</p>
        <p>For example, the prompt to recognize entities of type Individual is as follows:</p>
        <sec id="sec-3-2-1">
          <title>Prompt to extract Individuals</title>
          <p>Prompt: Your goal is to identify all the individuals mentioned in
the document and provide the information about each person as a
structured object. You will receive a document related to the
Chilean dictatorship of 1973.</p>
          <p>Generate a new JSON object containing the identified individuals:
{
  "individual": [ // a list of all the individuals
    {
      "firstName": string, // first name of the individual
      "lastName": string, // last name of the individual
      "role": string, // individual's job, role, profession or activity. If not specified, make it "unspecified"
      "summary": string, // 1-sentence summary of the individual
      "origin_reference": string[] // If the document has parts identified with the field "ORIGIN_REFERENCE", here comes that value; if more than one corresponds, add them all as a list of strings
    }
  ]
}
Make sure that each individual found satisfies this sentence:
"individual person with first and last name". Before adding the
individual to the result, imagine a detailed explanation for why
you deduce that the individual satisfies the sentence. If the
explanation is not 100% convincing, ignore that result. Use only
lowercase letters without accents. Use the English language for
the summary.</p>
          <p>As for relations, recall that we restrict each relation to the types of entities that
participate in it. Then, for every relation type R between entities of types T1 and T2, we form a
prompt that concatenates all entities of types T1 and T2 discovered in the previous process and
asks the LLM to discover relations between these entities only. For example, here is the prompt
for the relation of type IndividualIsRelatedToOrganization (to which we must add the list of
entities of type Individual and Organization that were previously discovered in the document):</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>Prompt to extract IndividualIsRelatedToOrganization relations</title>
          <p>Prompt: You will be given a JSON with a list of individuals (denoted
between tags ’==LIST 1 START==’ and ’==LIST 1 END==’), a JSON with
a list of organizations (denoted between tags ’==LIST 2 START==’
and ’==LIST 2 END==’), and then a document (denoted between tags
’==DOCUMENT START==’ and ’==DOCUMENT END==’). Based on the context
you can identify from the document, find the individuals that are
related to an organization somehow. The result must be a JSON
object with the following structure:
{
  "nature": string, // The nature of the relation of the individual with the organization; it must be "affected by", "member", "chief", or "other"
  "individualId": number, // ID of the individual that is a member of the organization
  "organizationId": number, // ID of the organization
  "summary": string, // 1-sentence summary of the relation
  "origin_reference": string[] // If the document has parts identified with the field "ORIGIN_REFERENCE", here comes that value; if more than one corresponds, add them all as a list of strings
}
Each relationship should include a summary describing the nature of
the relationship between the entities. Ensure meaningful relations
are included, avoiding duplicates. Use only lowercase letters
without accents. Use the English language for the summary.</p>
        </sec>
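        <p>A single extraction call can be sketched as follows. This is an illustration, not the authors' code: the helper names and the stubbed completion function are ours, and in the real pipeline the completion function would call OpenAI's chat API, with the per-type prompt as the system message and the document fragment as the user message.</p>

```python
import json

# Illustrative sketch (not the authors' code) of one zero-shot extraction call.
# `complete` is passed in as a parameter so the sketch stays independent of
# any particular client library; in practice it would wrap the OpenAI API.

def build_messages(type_prompt: str, fragment: str) -> list:
    """Pair the manually written per-type prompt with one document fragment."""
    return [
        {"role": "system", "content": type_prompt},
        {"role": "user", "content": fragment},
    ]

def extract(type_prompt, fragment, complete):
    """Run one extraction; `complete` maps a message list to a reply string."""
    reply = complete(build_messages(type_prompt, fragment))
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return {}  # a malformed reply is treated as "nothing found"

# Example with a stub standing in for the API:
stub = lambda msgs: '{"individual": [{"firstName": "juan", "lastName": "perez"}]}'
result = extract("identify all individuals ...", "document text ...", stub)
```

        <p>Passing the completion function in also makes the model fallback described above (GPT-4o-mini, then GPT-4o on repeated failure) a local concern of that function.</p>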
      </sec>
      <sec id="sec-3-3">
        <title>3.3. The full text-to-graph pipeline</title>
        <p>In the following, we explain the entire pipeline that we use to go from text to a knowledge
graph, which we divide into three main steps. First, we split the text documents into
smaller blocks that fit into the prompts we send to OpenAI’s API. Then, we create a
different prompt for each entity type and relation type, and use these prompts to extract
entities and relations from each block. Lastly, we process the resulting entity/relation graph to
remove redundant or incorrect information based on the graph’s structure.</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Document Splitting</title>
          <p>Since we use OpenAI’s GPT API, which has a size limit for its prompt input and output, we
need to divide each text document into fragments of a smaller size. Moreover, we must choose a
size that produces a good enough response. Generally, with a size that is too small, the LLM may not
know what to look for and may produce forced or incorrect results; conversely, with a size that
is too big, the response may include only some of the occurrences. We tested lengths of 1000, 2000,
5000, 10000, and 15000 characters, making each cut either at a line break or at a sentence end. We
also added an overlap at the beginning and end of each fragment of 0.1 of its size to provide
the LLM with more context. After trying different values, we found that the best results were
obtained with fragments of 5000 characters with overlaps of 500 characters.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Prompting to extract Entities and Relations</title>
          <p>After splitting the documents into fragments, the next step is to extract each fragment’s entities
and relations, again using OpenAI’s GPT API. To extract the entities of a fragment, for each entity
type we prompt the API with the corresponding prompt as the system role and the fragment as
the user role, and receive a JSON object with the entities of that type found in that fragment.
Similarly, to extract the relations of a fragment, we prompt the API with the corresponding
prompt for each relation type and add all entities found in that fragment that are of the types
relevant to the relation, i.e., the types of the subject and object entities. We consider only one
subject type and one object type for each relation. With this prompt, we get a JSON object with
the relations of that type found in that fragment.</p>
        </sec>
        <sec id="sec-3-3-3">
          <title>3.3.3. Performing Entity Resolution</title>
          <p>Next comes the task of finding duplicated entities, whose JSON objects we then merge.
More precisely, we apply a series of rules that select possible pairs of duplicated entities. For
each of these pairs, we ask an LLM to merge the information contained in both JSON objects.
To name some of the rules: for individuals, we consider duplicated any two entities A and B for
which both the name and surname of A are equal to or contained in the name and surname of
B, respectively. As another example, for relations between entities of type Location, we remove
any containment relation between locations that does not follow the order country &gt; city &gt; street
&gt; building (we can do this because we specifically ask the LLM to specify the nature of the
location at the time of entity recognition).</p>
        </sec>
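        <p>The name-containment rule for individuals can be sketched as follows. The function and field names follow the prompt's JSON schema, but the code is our illustration, not the authors' implementation; since the stated rule is directional, we check both directions of an unordered pair.</p>

```python
# Illustrative sketch (not the authors' code) of the duplicate-selection rule
# for individuals: A and B are candidate duplicates when A's first and last
# names are equal to, or contained in, B's respective names (or vice versa).

def _contained(a: str, b: str) -> bool:
    return a == b or a in b

def possible_duplicates(a: dict, b: dict) -> bool:
    """True if one individual's names are contained in the other's."""
    forward = (_contained(a["firstName"], b["firstName"])
               and _contained(a["lastName"], b["lastName"]))
    backward = (_contained(b["firstName"], a["firstName"])
                and _contained(b["lastName"], a["lastName"]))
    return forward or backward
```

        <p>Pairs flagged by such rules are then handed to the LLM, which merges the information in their two JSON objects.</p>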
        <p>After this, we obtain our raw graph.</p>
        <sec id="sec-3-3-4">
          <title>3.3.4. Graph Post-Processing</title>
          <p>The raw graph is also post-processed to remove incorrect and redundant information by applying
rules to its structure.</p>
          <p>First, we focus on edge removal. Generally, it would not make sense to have cycles in this
kind of graph, so we assume that any cycle is due to errors at the time of relation extraction. To fix
this, we remove any existing relation participating in a cycle (we use a cycle-length limit of 5 to ease the
computation). The next step is to remove redundant edges that can be easily inferred from
our ontology. For this, we define four rules that declare relations redundant, which can then be
removed:</p>
          <p>1. If x1 IsPartOf x2 and x2 IsPartOf x3, then x1 IsPartOf x3 is redundant;
2. If e OccursAt l1 and l1 IsContainedIn l2, then e OccursAt l2 is redundant;
3. If i WasPresentAt l1 and l1 IsContainedIn l2, then i WasPresentAt l2 on the same date
is redundant;
4. If x1 IsRelatedTo y and x1 IsPartOf x2, then x2 IsRelatedTo y is redundant.</p>
          <p>For readability, we have shortened the relation names. The fourth rule applies to the
properties IndividualIsRelatedToOrganization, IndividualIsRelatedToEvent, and
OrganizationIsRelatedToEvent.</p>
          <p>After removing incorrect and redundant relations, we focus on merging redundant nodes.
(The LLM annotates every entity with a summary property, and every time a pair of entities is
merged, the resulting entity maintains both summaries in a list. The same is done for relations.)
Our initial experiments showed that the LLM sometimes confuses organizations and locations,
resulting in duplicate entities that must be resolved. Therefore, we address this issue by
identifying locations and organizations with the same name and resolving them into one entity:
a location, an organization, or both. The latter case makes sense in specific instances, such as
when an organization is located in a particular building and the name could refer to both the
organization and the building. We decide this based on which node has more information. The
API annotates organization and location entities with a possibly unspecified nature parameter. If
one node has this parameter defined and the other does not, then that node’s type is prioritized;
if both nodes have the value, or neither does, then the node keeps both types.</p>
          <p>The last step of our process is to merge duplicate event nodes. We aim to reduce the number
of event nodes by limiting their granularity in two ways. Note that every event is annotated
with a date property that contains the date on which it takes place. First, we impose that there
can be only one event per location per date, so all events with the same values are merged into
one. Second, an individual can be related to only one specific event on a particular date, so all
events that share these values must be merged. Since applying either of the two rules might enable
new applications of the other, we apply them iteratively until no changes are made to the graph.</p>
        </sec>
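        <p>The first redundancy rule, for instance, amounts to dropping IsPartOf edges that are implied by a two-step chain. The sketch below is our illustration of that single rule (a one-pass version over an edge set), not the authors' implementation.</p>

```python
# Illustrative sketch (not the authors' code) of redundancy rule 1:
# if x IsPartOf y and y IsPartOf z, then the edge x IsPartOf z is
# redundant and can be removed from the raw graph. One pass: an edge
# is dropped when it is directly implied by a two-step chain.

def remove_transitive_ispartof(edges: set) -> set:
    """edges: set of (subject, object) pairs for the IsPartOf relation."""
    implied = {
        (x, z)
        for (x, y1) in edges
        for (y2, z) in edges
        if y1 == y2 and x != z
    }
    return edges - implied

edges = {("division", "police"), ("police", "state"), ("division", "state")}
pruned = remove_transitive_ispartof(edges)
# ("division", "state") is implied by the other two edges, so it is dropped.
```

        <p>The remaining three rules follow the same shape, joining one relation's edges against IsContainedIn or IsPartOf edges to mark the implied ones.</p>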
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>In this section, we evaluate the quality of the proposed graph. Regarding the scalability of the
approach, we note that our algorithms have been put to use in a real-life scenario comprising
approximately 40,000 pages of text, generating a graph with more than 50,000 entities and
100,000 relations in about two hours, spending approximately 30 USD on requests to the ChatGPT
API, according to the prices published at the time of writing.</p>
      <p>To validate the quality of the graph, we examine documents from public records
collected from a known collection (in our case, memoriaviva.com). We asked a domain expert
to construct a gold-standard graph from a subset of the documents in this collection, totalling
around 7500 words, and we compared the output of our approach against this gold-standard graph.</p>
      <sec id="sec-4-1">
        <title>4.1. A gold standard sub-graph</title>
        <p>Our gold standard graph covers the facts presented in memoriaviva.com concerning the
location of Lonquen, Chile, where a series of corpses were found during the dictatorship.
The constructed graph contains 121 entities: 51 Individuals, 38 Events, 16 Locations, and 16
Organizations. At the time of writing, the domain expert was still resolving conflicts regarding
some of the discovered relations, so we prefer not to report metrics on relations at this stage,
leaving them for the camera-ready version.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Results</title>
        <p>To analyze how we fared on the task of recognizing entities in the graph, we count the number
of entities of each type that were adequately recognized by our automatic approach (Present in
both graphs), the number of entities that were extracted by the automatic approach
but were not included in the gold standard graph by the domain expert (Extra nodes not in gold
standard), and the number of entities we failed to identify (Missing nodes from gold standard).
Because of the sheer number of nodes, we again resorted to LLMs for this evaluation, asking
them to decide which nodes matched and which did not. Results are presented in Table 4.2.</p>
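        <p>Once the LLM has matched nodes between the two graphs, the three reported quantities reduce to set arithmetic. The sketch below is illustrative; the function and variable names are ours, with <italic>matches</italic> holding (automatic id, gold id) pairs for one entity type.</p>

```python
# Illustrative sketch (not the authors' code): computing the three per-type
# quantities of Table 4.2 from a fixed node matching between the two graphs.

def entity_counts(auto_ids: set, gold_ids: set, matches: set) -> dict:
    """matches: set of (automatic_id, gold_id) pairs for one entity type."""
    matched_auto = {a for (a, g) in matches}
    matched_gold = {g for (a, g) in matches}
    return {
        "present_in_both": len(matched_gold),
        "extra_not_in_gold": len(auto_ids - matched_auto),
        "missing_from_gold": len(gold_ids - matched_gold),
    }

counts = entity_counts({"a1", "a2", "a3"}, {"g1", "g2"}, {("a1", "g1")})
```

        <p>The hard part of the evaluation is of course the matching itself, which is why we delegate it to an LLM; the counting is mechanical.</p>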
        <p>Interestingly, the precision of our algorithm varies tremendously depending on the entity’s
nature. The algorithm extracts individuals with high precision, and the only two missed nodes
from the gold standard correspond to spouses of people who were reported to be detained and
mentioned only once in all documents.</p>
        <p>[Table 4.2: for each entity type (Individual, Organization, Location, Event), the number of
nodes present in both graphs, the number of extra nodes not in the gold standard, and the
number of missing nodes from the gold standard.]</p>
        <p>For the case of organizations, the comparison reveals a slight problem with our algorithm, as
the automatic graph creates a few extra organizations. This happens when the same organization
is mentioned in the document several times with different roles (e.g., a specific division of the
police is mentioned as detaining someone, and then this particular division is said to be led by a
specific officer). In this case, our automatic construction generates different entities for
the various natures of the organization (so one node corresponds to the police described as an
organization that detains people, and another to the police described as an organization led by
this officer).</p>
        <p>The results for locations can be explained by a completely different phenomenon: our automatic graph
creates a more detailed location graph than the one made by the experts. For example, if an event
is said to happen on street s of city c, the experts only mention the city c, but our graph
mentions both the city and the street. This suggests that our automatic graph could further refine
the gold standard.</p>
        <p>Finally, we observe a similar granularity issue for events, but in the opposite
direction: the experts provide much more detailed information about specific events, whereas our
automatic extraction tends to merge two or three events into a single one. For example, the experts
would detail all the events leading to the detention of victims, including the order in which
individuals were detained, how they were transported, and in which vehicle; the automatic graph
would instead merge all these events into a single detention event concerning all individuals, or
sometimes two events, one for the detention and another for the transportation of the victims.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and lessons learned</title>
      <p>The work presented here was a fully automatic analysis of a historical archive. We use large
language models to recognize entities, perform entity resolution, establish relations, and resolve
conflicts between identified relations. Importantly, we ensured that all the steps in our process
preserve the linkage of identified entities and relations. The final knowledge graph preserves
pointers to the original documents from which a given entity or relation was extracted (note
that a single entity or relation may point to several documents, as entities and relations we
identify are subject to a resolution step).</p>
      <p>One crucial issue in this work is measuring the quality of the final knowledge graph. As
several steps are involved, a slight change in the prompt used for any one step can create a
massive difference in the resulting knowledge graph. Our approach was to measure the final
product against a gold-standard graph. To test our algorithms, we first asked experts to manually
construct a graph from a small sample of Memoria Viva documents (121 entities). Then, we
compared the graph obtained by our approach, considering the same subset of texts, with the
expert graph.</p>
      <p>The resulting algorithms show promising results. Measured against the gold standard,
the extraction of Individuals was extremely precise. On the other hand, the extraction of
Organizations, Events, and Locations was less successful, but on closer inspection this can be
partially explained by differences in the granularity of locations and events, and one can
verify that the automatic approach captures the big picture of events just as well as the gold
standard. It remains to be seen whether this difference in granularity can be fixed by using
better prompts or whether we need new components in our architecture.</p>
      <p>In terms of future work, we plan to systematically evaluate the entities detected by ChatGPT.
To do so, we want to create an annotated corpus with the entities of interest [10]. Even though
annotation is time-consuming, such a corpus can be used to evaluate generative AI and other
possible approaches for detecting critical information, such as Named Entity Recognition
algorithms [11].</p>
      <p>Building a knowledge graph from collections of historical archives related to the Chilean
dictatorship is a way to add robust data science analyses to this critical event in Chilean history,
which still has so many open questions. Unlike manual analysis of fragmented archives, a unified
approach can provide a comprehensive view, filling information gaps, cross-validating data, and
enhancing contextual understanding. A holistic method can uncover more profound insights
into the complexities of the dictatorship, enabling researchers to answer more nuanced questions
and revealing connections and patterns that fragmented, manual analyses might miss. Of course,
the quality of our analysis depends on the quality of the data. Thus, future research should resort to
primary sources to complement and validate the compilation made by Memoria Viva.</p>
      <p>[7] D. Xu, W. Chen, W. Peng, C. Zhang, T. Xu, X. Zhao, X. Wu, Y. Zheng, E. Chen, Large
language models for generative information extraction: A survey, arXiv preprint
arXiv:2312.17617 (2023).
[8] J. Dagdelen, A. Dunn, S. Lee, N. Walker, A. S. Rosen, G. Ceder, K. A. Persson, A. Jain,
Structured information extraction from scientific text with large language models,
Nature Communications 15 (2024) 1–14. URL: https://www.nature.com/articles/s41467-024-45563-x.
doi:10.1038/s41467-024-45563-x.
[9] H. Wu, Y. Yuan, L. Mikaelyan, A. Meulemans, X. Liu, J. Hensman, B. Mitra, Learning to
extract structured entities using language models (2024).
[10] K. Fort, Collaborative annotation for reliable natural language processing: Technical and
sociological aspects, John Wiley &amp; Sons, 2016.
[11] M. Rojas, F. Bravo-Marquez, J. Dunstan, Simple yet powerful: An overlooked architecture
for nested named entity recognition, in: Proceedings of the 29th International Conference
on Computational Linguistics, International Committee on Computational Linguistics,
Gyeongju, Republic of Korea, 2022, pp. 2108–2117. URL: https://aclanthology.org/2022.coling-1.184.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          ,
          <article-title>Knowledge graphs</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>64</volume>
          (
          <year>2021</year>
          )
          <fpage>96</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Debruyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Munnelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kilgallon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>O'Sullivan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Crooks</surname>
          </string-name>
          ,
          <article-title>Creating a knowledge graph for ireland's lost history: Knowledge engineering and curation in the beyond 2022 project</article-title>
          ,
          <source>ACM Journal on Computing and Cultural Heritage (JOCCH)</source>
          <volume>15</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Opitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Born</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Nastase</surname>
          </string-name>
          ,
          <article-title>Induction of a large-scale knowledge graph from the regesta imperii</article-title>
          ,
          <source>in: Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage</source>
          ,
          <source>Social Sciences, Humanities and Literature</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>159</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Rojas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shaftoe</surname>
          </string-name>
          ,
          <article-title>Human rights and transitional justice in Chile</article-title>
          , Springer,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H. R.</given-names>
            <surname>Corral</surname>
          </string-name>
          ,
          <article-title>50 years after the 1973 coup in chile: Analysis of the processes of transition to democracy and transitional justice</article-title>
          ,
          <source>Seattle J. Soc. Just</source>
          .
          <volume>22</volume>
          (
          <year>2023</year>
          )
          <fpage>587</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Josifoski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peyrard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>West</surname>
          </string-name>
          ,
          <article-title>GenIE: Generative information extraction</article-title>
          ,
          <source>in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>