<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Linking Dutch Civil Certificates</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Joe</forename><surname>Raad</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vrije Universiteit Amsterdam</orgName>
								<address>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rick</forename><surname>Mourits</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Utrecht University</orgName>
								<address>
									<settlement>Utrecht</settlement>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Auke</forename><surname>Rijpma</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Utrecht University</orgName>
								<address>
									<settlement>Utrecht</settlement>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ruben</forename><surname>Schalk</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Utrecht University</orgName>
								<address>
									<settlement>Utrecht</settlement>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Richard</forename><surname>Zijdeman</surname></persName>
							<affiliation key="aff2">
								<orgName type="department">International Institute of Social History</orgName>
								<address>
									<settlement>Amsterdam</settlement>
									<region>NL</region>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="institution">University of Stirling</orgName>
								<address>
									<settlement>Stirling</settlement>
									<region>Scotland</region>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kees</forename><surname>Mandemakers</surname></persName>
							<affiliation key="aff2">
								<orgName type="department">International Institute of Social History</orgName>
								<address>
									<settlement>Amsterdam</settlement>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Albert</forename><surname>Meroño-Peñuela</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vrije Universiteit Amsterdam</orgName>
								<address>
									<region>NL</region>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department">International Institute of Social History</orgName>
								<address>
									<settlement>Amsterdam</settlement>
									<region>NL</region>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Linking Dutch Civil Certificates</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">EA7DB1D73A3BB213F9451F3F35750813</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T20:17+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>linked data</term>
					<term>digital humanities</term>
					<term>civil certificates linking</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Finding and linking different appearances of the same entity in an open Web setting is one of the primary challenges of the Semantic Web. In social and economic history, record linkage has dealt with this problem for a long time, linking historical individual records at a local database level. With the advent of semantic technologies, Knowledge Graphs containing these records have been published, raising the need for large-scale linking techniques that consider the particularities of historical individual linking. In this paper we focus on our current investigation of such techniques to link the Dutch civil certificates in the LINKS/CLARIAH project. We describe the production of the LINKS Knowledge Graph, and we show its potential at answering domain research questions through its large number of owl:sameAs links.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Finding and linking equivalent entities (persons, places, events, concepts) on the Web is one of the most important challenges of a Semantic Web of Linked Data. The distributed data publishing paradigm and the scale of the Web exacerbate this problem; various approaches have been proposed to address it, including heuristic-based linking (e.g. string similarity) <ref type="bibr" target="#b11">[12]</ref>, cluster-similarity linking <ref type="bibr" target="#b10">[11]</ref>, and deep learning-based knowledge graph completion <ref type="bibr" target="#b13">[14]</ref>. The goal is to produce identity links that use the owl:sameAs or skos:exactMatch predicates so data consumers are aware of identity clusters and classes <ref type="bibr" target="#b2">[3]</ref>.</p><p>Interestingly, the problem has also been dealt with in other fields of research, in particular in economic and social history. There, record linkage is a challenging and active area of research, as shown in a recent Historical Methods special issue on the subject <ref type="bibr" target="#b17">[19]</ref>, and it is becoming ever more important. Mass digitisation of archival material means that further insight can be obtained by linking individuals and households across different records, especially now that sources with complete population coverage are becoming available. Historical civil certificates are the authoritative sources of birth, marriage and death events in municipality registers, and allow for the reconstruction of lives of the past <ref type="bibr" target="#b3">[4]</ref> <ref type="bibr" target="#b21">[23]</ref>. In the Netherlands, the LINKS project <ref type="bibr">[15]</ref> has shown that this reconstruction is, however, often very challenging, as there is generally no ground truth. Individuals are not actively followed over time, but observed during the registration of a vital event. As a result, it is unclear whether, where, and when an individual can be observed. It is not even certain whether follow-up is available at all, because individuals could migrate out of the region of observation <ref type="bibr" target="#b3">[4]</ref>. To complicate matters further, large quantities of historical certificates have been indexed, which gives rise to data entry errors. These spelling mistakes can be hard to deal with, as twins and other multiple births often receive similar names. Furthermore, first names were often reused in families to "replace" earlier-born, deceased siblings. Finally, civil servants were known to indicate non-standard mutations, such as name changes, acknowledgement of children, and divorces, as side notes. As a result, important relational information is often not standardised <ref type="bibr" target="#b21">[23]</ref>.</p><p>In this paper, we summarise our efforts in the LINKS and CLARIAH projects to overcome these challenges, and link the appearances of the same person across 1.5 million birth (1812-1919), marriage (1812-1944) and death (1812-1969) certificates in the Dutch province of Zeeland. Specifically, our contributions are:</p><p>- A description of the LINKS knowledge graph production process using standard semantic technologies (Section 4)</p><p>- A highly scalable certificate linking method based on efficient string similarity using a Levenshtein automaton (Section 5)</p><p>- A preliminary evaluation based on SPARQL queries that use such links (Section 6)</p><p>In the next sections, we survey related work (Section 2), describe the original dataset (Section 3), explain our contributions (Sections 4, 5 and 6), and conclude (Section 7).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>Historical record linkage generally requires a previous effort on digitising large amounts of individual-level historical records, a goal shared by projects like NAPP/IPUMS <ref type="bibr" target="#b8">[9]</ref>, the Balsac Population Database<ref type="foot" target="#foot_1">6</ref>, the Utah Population Database<ref type="foot" target="#foot_2">7</ref>, Familysearch<ref type="foot" target="#foot_3">8</ref>, the Scottish Longitudinal Study<ref type="foot" target="#foot_4">9</ref>, Digitising Scotland<ref type="foot" target="#foot_5">10</ref>, the Norway Historical Population Register<ref type="foot" target="#foot_6">11</ref>, Link-Lives<ref type="foot" target="#foot_7">12</ref>, POPLINK/DDB<ref type="foot" target="#foot_8">13</ref>, the Scanian Economic Demographic Database (SEDD)<ref type="foot" target="#foot_9">14</ref>, the North Orkney Population History Project <ref type="bibr" target="#b12">[13]</ref> and Death and Burial Data in Ireland 1864-1922<ref type="foot" target="#foot_10">15</ref>. Linking individuals in the US 1850 and 1860 census is generally considered one of the earliest efforts <ref type="bibr" target="#b7">[8]</ref>, and similar approaches for Canada <ref type="bibr" target="#b1">[2]</ref> and Sweden <ref type="bibr" target="#b23">[25]</ref> have followed. <ref type="bibr" target="#b15">[17]</ref> provides a critical review of these and other historical record linkage efforts, with a focus on US data. We share with these efforts a focus on string-based comparison linkage. In other projects (e.g. Digitising Scotland) the goal is to perform group-level linkage as well <ref type="bibr" target="#b0">[1]</ref>. Machine learning approaches have recently gained considerable traction in the field <ref type="bibr" target="#b5">[6]</ref>. For example, recent work on historical US census data uses manually labelled data from familysearch.com as training data <ref type="bibr" target="#b16">[18]</ref>. In the Netherlands, earlier work on Dutch civil certificates focuses on methodological aspects of record linkage <ref type="bibr" target="#b19">[21]</ref>. The work done in the LINKS project <ref type="bibr">[15]</ref> constitutes a basis for our contribution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Dataset</head><p>The digitised civil registry currently consists of 27.5 million certificates. In total, there are 10.3 million birth certificates, 4.4 million marriage certificates, and 12.7 million death certificates in the digitised registry at the International Institute of Social History <ref type="foot" target="#foot_11">16</ref> . The numbers of available birth and death certificates differ strongly because privacy laws impose different embargo periods: birth certificates become available with a 100-year delay, marriage certificates with a 75-year delay, and death certificates with a 50-year delay. For the moment, the experiments in this paper are restricted to the civil registries produced in the Zeeland region. This dataset of Zeeland civil registries, known as LINKS Zeeland cleaned 2016 01 <ref type="bibr" target="#b14">[16]</ref>, consists of 1.5 million certificates, which represents ∼5.5% of the total certificates. Specifically, there are 698,285 birth certificates (6.7% of the total birth certificates), 193,921 marriage certificates (4.4%), and 665,999 death certificates (5.2%). This dataset is cleaned, standardised and distributed in a restricted manner <ref type="bibr">[15]</ref> in the form of three CSV files:</p><p>1. Locations: containing the locations that show up in the civil certificates, describing the municipality, province, region and country of a location. This file consists of 6 columns and 2,456 rows.</p><p>2. Registrations: containing general data from a certificate registration that goes beyond the individual level, such as the date and place of birth, marriage or death. This file consists of 10 columns and 1,558,205 rows, with each row representing a single registration in the Zeeland province.</p><p>3. Persons: containing all appearances of persons. In general, every birth certificate generates records for three persons (newborn child, mother and father), a marriage certificate generates at least six person records (the bride with her parents and the groom with his parents), and a death certificate generates three or four person records (deceased, father, mother and possibly a spouse). This file consists of 33 columns and 5,526,393 rows.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">LINKS Knowledge Graph</head><p>The process of converting the CSV files of the LINKS dataset into a Knowledge Graph consists of three steps. Firstly, we manually design a model for describing and enriching the civil registries data, following Linked Data best practices. Secondly, we transpose the CSV data into an RDF Knowledge Graph, according to our designed model. Finally, we make the graph available for browsing and querying in an efficient manner.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Designing the civil registries schema</head><p>For modelling the civil registries data, we designed a new, simple model that reuses existing vocabularies whenever possible. This model is presented in Figure <ref type="figure">1</ref>, and has four main components:</p><p>- Civil Registrations. The first component (concepts coloured in brown) describes each civil registration (birth, marriage, or death certificate), listing its identifier, its sequential number, and the location and date of the registration.</p><p>- Life Events. The second component (in green) describes the actual life events (birth, marriage, or death event), listing the main individuals involved in this event, and the location and date of this event. In this model, a distinction is made between civil registrations and their associated life events, as a civil registration can be produced on a different date and at a different location from those of the life event itself.</p><p>- Individuals. The third component (in blue) describes each individual involved in these life events, listing their names, sex, civil status, and birth dates.</p><p>- Locations. The final component (in orange) describes the location where each life event happened and the location where it was registered. This component can include the municipality, the province, the region, and the country.</p><p>The conversion process described in Section 4.2 takes 30 seconds for the Locations file, 100 minutes for Registrations, and around 5 hours for Persons on a machine with an SSD and 64GB of memory.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Transposing the data to RDF</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Accessing the RDF knowledge graph</head><p>Combining the three resulting N-Quads files yields the LINKS knowledge graph, composed of 58,513,388 triples. This knowledge graph can be accessed online through Druid<ref type="foot" target="#foot_15">21</ref> , the CLARIAH instance of the TriplyDB triple store <ref type="foot" target="#foot_16">22</ref> . Druid allows the storage of knowledge graphs, and provides tools to browse, query and visualise our data. For privacy reasons, the LINKS knowledge graph is uploaded as a private dataset on Druid, restricting its access <ref type="foot" target="#foot_17">23</ref> to members of the LINKS organisation <ref type="foot" target="#foot_18">24</ref> on Druid. We provide publicly accessible links to resources of the knowledge graph when possible.</p><p>In addition to accessing the LINKS knowledge graph through the Druid Web hub, authorised users can also access this dataset locally. To enable easy and efficient access on an ordinary local machine, we convert the LINKS knowledge graph from N-Quads to HDT (Header, Dictionary, Triples) <ref type="bibr" target="#b6">[7]</ref>. This compact data structure and binary serialisation format for RDF keeps big datasets compressed to save space while supporting search and browse operations without prior decompression. Converting the LINKS knowledge graph into HDT consists of two simple steps: (i) merge the three RDF N-Quads files into one larger N-Quads file, and (ii) convert the resulting merged file to HDT using the rdfhdt library <ref type="foot" target="#foot_19">25</ref> .</p></div>
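<div xmlns="http://www.tei-c.org/ns/1.0"><p>Step (i) of the HDT conversion can be sketched in Python as follows. This is a minimal illustration and not the script used in this work: the function name and the file names in the usage note are hypothetical. Since N-Quads is a line-based format, merging amounts to concatenating the statements of each file; step (ii) is left to the rdfhdt tooling.</p><p><code lang="python"><![CDATA[
```python
from pathlib import Path
from typing import Iterable


def merge_nquads(inputs: Iterable[Path], output: Path) -> int:
    """Concatenate N-Quads files into one file and return the number of statements written."""
    written = 0
    with output.open("w", encoding="utf-8") as out:
        for path in inputs:
            with path.open("r", encoding="utf-8") as src:
                for line in src:
                    statement = line.strip()
                    # N-Quads is line-based: every non-blank, non-comment
                    # line is one complete statement.
                    if not statement or statement.startswith("#"):
                        continue
                    out.write(statement + "\n")
                    written += 1
    return written
```
]]></code></p><p>For example, merge_nquads([Path("Locations.nq"), Path("Registrations.nq"), Path("Persons.nq")], Path("links.nq")) would write one merged file, assuming the three converted files carry these names.</p></div>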
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Certificate Linkage</head><p>For linking Dutch civil registries, we rely heavily on the string similarity between individuals' names. This is motivated by the high quality of the registered names in most civil certificates, and the limited spelling variation between different civil certificates for the same individual. An example of such quality maintenance can be observed in marriage registrations, where both the bride and the groom are required to bring their own birth certificates when registering their marriage. Moreover, married women in the Netherlands keep their own family name in the civil certificates, which greatly simplifies the problem at hand. Death registrations are generally made by next of kin (parents, spouses, children, or siblings), which also limits variation in name spelling <ref type="bibr" target="#b21">[23]</ref>.</p><p>Similarity between two names can be measured in several ways, such as calculating the Levenshtein, Jaccard, or Jaro-Winkler distances. In this work, we take the Levenshtein distance as a basis for matching individuals in civil certificates. This distance measures the number of single-character edits (insertions, deletions or substitutions) required to change one name into the other. The standard algorithm for calculating the Levenshtein distance between two names was proposed by Wagner and Fischer <ref type="bibr" target="#b22">[24]</ref>; its running time is proportional to the product of the lengths of the two names.</p><p>In this work, as we aim to match individuals from a list of millions of certificates to individuals in another large list of certificates, the standard approach (or its variants) of calculating the Levenshtein distance for each pair of certificates is not feasible, as the number of comparisons grows with the product of the sizes of the two lists. Therefore, we adopt the approach and the library proposed by Dylon Devo<ref type="foot" target="#foot_20">26</ref> , based largely on the work of Schulz and Mihov <ref type="bibr" target="#b20">[22]</ref>, for the fast selection of candidate individuals within a certain Levenshtein distance. In this approach, the list of target individuals is indexed as a Minimal Acyclic Finite-State Automaton (MA-FSA), and a Levenshtein transducer is initialised according to a maximum distance specified by the user. When a name is given as a source query with a maximum accepted Levenshtein distance, the states of the Levenshtein automaton corresponding to that name are constructed on demand as the automaton is evaluated. According to its author, this approach makes it possible to find, for a given name n, all candidate names in a list M in time linear in the length of n, independently of the size of M. In the following, we describe how we deploy this approach for matching newborns registered in birth certificates to their marriage certificates. The general process remains unchanged for other types of linkage, where only the roles of the considered individuals and the link's timeline consistency are adapted accordingly.</p></div>
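<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the standard dynamic-programming computation of the Levenshtein distance can be sketched as follows. This is the pairwise baseline discussed above, not the automaton-based approach of the adopted library; the function name is our own choice, and the sketch keeps only two rows of the table to limit memory use.</p><p><code lang="python"><![CDATA[
```python
def levenshtein(source: str, target: str) -> int:
    """Return the number of single-character insertions, deletions or
    substitutions needed to turn `source` into `target`."""
    if len(source) < len(target):
        # The distance is symmetric; keep the shorter string in the inner loop.
        source, target = target, source
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]
```
]]></code></p><p>Each call fills a table whose size is the product of the two name lengths, and comparing every name in one list with every name in another repeats this for every pair, which is what the indexed approach avoids.</p></div>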
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Approach</head><p>Finding the marriage certificate of a certain newborn, when applicable, requires matching three individuals: (i) the newborn in the birth certificate with the bride or groom of a certain marriage certificate, (ii) the newborn's mother with the bride's or groom's mother, and (iii) the newborn's father with the bride's or groom's father. Once a match, according to a maximum Levenshtein distance, between the three individuals of a birth certificate and a marriage certificate is found, we check whether the logical timeline is respected. Only when a match between two certificates based on the three individuals is found, with a correct logical timeline, is a match between the three individuals registered in the Knowledge Graph. Specifically, our approach for matching newborns to a marriage certificate can be divided into 5 main steps:</p><p>1. Create six indices, with each index representing an MA-FSA containing the list of all full names of a certain role in marriage certificates. For instance, the index of the role "bride" contains the full names (first name + last name) of all women who got married (i.e. role of bride). For each of these indices, a Levenshtein transducer is initialised according to a maximum Levenshtein distance, given by the user.</p><p>2. Create six Key-Value databases, with each database covering a single role r in the marriage certificate. A key in a database represents a full name fn, and the value represents a list of marriage certificate identifiers that have for the role r an individual with the name fn. For instance, the entry "Anna Aartsen" → {123323,232344} indicates that both these certificates have a bride registered with the full name "Anna Aartsen". While such information can be directly queried from the Knowledge Graph, Key-Value databases are a better means for frequent read requests. In particular, we rely on the RocksDB<ref type="foot">27</ref> disk-based Key-Value database.</p><p>3. Find marriage certificate candidate(s) for each birth certificate. For this, we first search for the full name of the newborn in the index of the bride or the groom. Considering that the newborn is a girl, this step retrieves a list of candidate names C_newborn from the bride index, each representing a spelling variation within the maximum Levenshtein distance specified by the user. If C_newborn is not empty, we retrieve from the bride's Key-Value database the list of candidate certificates E_newborn that contain this candidate's name. In the case where C_newborn contains several candidates, the result is the union of the E_newborn lists returned for each candidate. The same process is applied when searching for the full name of the newborn's mother and father in the bride's mother and father indices, respectively returning the lists of candidate certificates E_mother and E_father.</p><p>4. Filter resulting candidates. Since in the majority of cases a newborn is expected to have the same registered parents at marriage, we require the match between the birth and the marriage certificates to be based on the three individuals. Therefore, the preliminary marriage candidates consist of the intersection of E_newborn, E_mother and E_father. Finally, out of these preliminary candidates only those that respect the logical timeline are considered. In this case, which consists of matching a newborn to a bride or a groom, we expect the marriage certificate to be registered at least 14 years, and at most 70 years, after its matched birth registration.</p><p>5. Save links, in two formats, to accommodate the preferences of most researchers:</p><p>(a) a CSV file consisting of the birth certificate identifier and the matched marriage certificate identifier, with the link metadata consisting mainly of the Levenshtein distance between each matched individual in these certificates and the time difference between both registrations;</p><p>(b) an N-Quads file consisting of owl:sameAs links between each matched individual, with each link being asserted in a different named graph that describes its context. For instance, the statement iisg:newbornURI, owl:sameAs, iisg:brideURI, iisg:graph/birthToMarriage/0-2-1 indicates that the identity link between these two individuals was detected based on a Levenshtein distance of 0 between the newborn's and the bride's names, a Levenshtein distance of 2 between their mothers' names, and 1 for the fathers' names.</p></div>
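<div xmlns="http://www.tei-c.org/ns/1.0"><p>Steps 2 to 4 above can be illustrated with the following minimal Python sketch. It is not the implementation used in this work: a brute-force distance check stands in for the MA-FSA indices, in-memory dictionaries stand in for the RocksDB databases, only a single partner role is considered, and years replace full registration dates. All function names and record layouts are hypothetical.</p><p><code lang="python"><![CDATA[
```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical record layouts, used only for this illustration.
Birth = Tuple[int, str, str, str]          # (birth year, newborn, mother, father)
Marriage = Tuple[int, int, str, str, str]  # (certificate id, year, partner, mother, father)


def within_distance(a: str, b: str, max_dist: int) -> bool:
    """Brute-force test that the Levenshtein distance of a and b is at most max_dist."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j - 1] + (char_a != char_b),
                               current[j - 1] + 1,
                               previous[j] + 1))
        previous = current
    return previous[-1] <= max_dist


def build_index(marriages: Sequence[Marriage], position: int) -> Dict[str, Set[int]]:
    """Map each full name found at `position` to the certificates containing it (step 2)."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for record in marriages:
        index[record[position]].add(record[0])
    return index


def candidates(name: str, index: Dict[str, Set[int]], max_dist: int) -> Set[int]:
    """Union of the certificates of every indexed name within max_dist of `name` (step 3)."""
    found: Set[int] = set()
    for indexed_name, certificate_ids in index.items():
        if within_distance(name, indexed_name, max_dist):
            found |= certificate_ids
    return found


def link_birth(birth: Birth, marriages: Sequence[Marriage], max_dist: int = 1,
               min_gap: int = 14, max_gap: int = 70) -> List[int]:
    """Return marriage certificates matching all three individuals and the timeline (step 4)."""
    birth_year, newborn, mother, father = birth
    marriage_year = {record[0]: record[1] for record in marriages}
    e_newborn = candidates(newborn, build_index(marriages, 2), max_dist)
    e_mother = candidates(mother, build_index(marriages, 3), max_dist)
    e_father = candidates(father, build_index(marriages, 4), max_dist)
    # A link requires all three individuals to match the same certificate.
    preliminary = e_newborn & e_mother & e_father
    # Keep only marriages registered 14 to 70 years after the birth.
    return sorted(c for c in preliminary
                  if min_gap <= marriage_year[c] - birth_year <= max_gap)
```
]]></code></p><p>Unlike the automaton-based lookup, the candidates function above scans every indexed name, so it only conveys the logic of the steps and not their scalability.</p></div>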
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Experiments</head><p>For testing the scalability of our approach, we evaluated it on the Zeeland dataset described in Section 3. We first evaluated the process of matching newborns in birth certificates to brides/grooms in marriage certificates, and then evaluated the process of matching parents of brides/grooms in marriage certificates to their own marriage certificate. Table <ref type="table" target="#tab_0">1</ref> shows that matching the civil registries of a Dutch province takes no more than a few minutes<ref type="foot">28</ref> at a maximum Levenshtein distance of 1, with the runtime rising to 74 minutes as the maximum Levenshtein distance per individual increases to 3. It also shows that even with a maximum Levenshtein distance of 1 there is significant overlinking, since the number of detected links (271,230) is larger than the number of marriage certificates in this dataset (193,921). This indicates that a number of marriage certificates were matched to multiple birth certificates. The source code of this approach is publicly available<ref type="foot">29</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Preliminary Evaluation (Use Cases)</head><p>While we expect that a dataset containing information on every Dutch person born in the period 1812-1919 and their family relations will be useful to many researchers, it will be especially valuable to demographic, social, and economic historians working with individual-level data. One issue in particular that it can address is bias in results due to migration.</p><p>The key issue is that, currently, many analyses are based on records from one locality (village, town, or province). In other words, out-migrants are left out of the data. This is a problem because migrants are different from the rest of the population. For example, according to Ruggles <ref type="bibr" target="#b18">[20]</ref> they had different ages at marriage and life expectancies. A comparison of the civil registry of Zeeland and a smaller population register data set (HSN) that follows individuals as they move has also shown that the differences between such datasets can be explained by the exclusion of migrants out of Zeeland <ref type="bibr" target="#b4">[5]</ref>.</p><p>The new data created here can take substantial steps to resolve this issue. Only international out-migrants can now go missing, which is a far smaller share of the data. The query<ref type="foot" target="#foot_21">30</ref> in Listing 1.1 shows how migrants and non-migrants are easily identified in the data. To do this, we compare the location of a marriage with that of both the bride's and the groom's parents' marriage. The results of this query (Figure <ref type="figure" target="#fig_3">2</ref>) show that the share of non-migrants between 1840 and 1910 falls from 80 to 72 percent, which means that by the start of the twentieth century, more than a quarter of the couples moved between their marriage and that of their child. Linking a civil registry for the entire Netherlands as done here allows us to include this large group in future analyses.</p><note place="foot" n="28">Experiments conducted on a MacBook Pro with an SSD and 16GB of memory.</note><note place="foot" n="29">https://github.com/CLARIAH/wp4-links</note></div>
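<div xmlns="http://www.tei-c.org/ns/1.0"><p>The aggregation behind this comparison can be sketched in Python as follows. This is an illustrative re-statement of the computation, not part of the published code: the function name and the row layout are hypothetical, and the rows are assumed to have been retrieved beforehand, each holding a year and the locations of the child's marriage and of the two parental marriages.</p><p><code lang="python"><![CDATA[
```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical row layout: (year, loc1, loc2, loc3), where loc1 is the
# location of the child's marriage and loc2/loc3 those of the parents.
Row = Tuple[int, str, str, str]


def share_same_place(rows: Iterable[Row]) -> Dict[int, float]:
    """Return, per year, the share of rows where loc1 equals loc2 or loc3."""
    totals: Dict[int, int] = defaultdict(int)
    same: Dict[int, int] = defaultdict(int)
    for year, loc1, loc2, loc3 in rows:
        totals[year] += 1
        if loc1 == loc2 or loc1 == loc3:
            same[year] += 1
    # Averaging a 0/1 indicator per year is the same as this ratio.
    return {year: same[year] / totals[year] for year in sorted(totals)}
```
]]></code></p><p>A row counts as a non-migrant case when the child's marriage took place in the same location as at least one of the parental marriages, so one minus the returned share approximates the share of couples that moved.</p></div>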
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>In this work, we described our production process of the LINKS Knowledge Graph, containing civil certificates of the Dutch province of Zeeland. We presented our approach for linking all these certificates, within 5 minutes on a regular laptop, and showed how such links can be exploited for conducting demographic analyses using SPARQL. This work is in the process of being extended to cover all certificates of the Netherlands, enabling larger and more valuable demographic, social and economic analyses.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>For converting the Zeeland dataset to a Knowledge Graph, we use the tool CoW (CSV on the Web converter)<ref type="foot">17</ref>. This batch tool, developed within the CLARIAH project <ref type="bibr" target="#b9">[10]</ref>, allows the conversion of datasets expressed in CSV. It uses a JSON schema expressed using an extended version of the CSVW standard to convert CSV files to RDF in a scalable fashion. In the case of the Zeeland dataset, we run the conversion process separately for the three CSV files, by manually designing the JSON schema for Locations<ref type="foot">18</ref>, Registrations<ref type="foot">19</ref>, and Persons<ref type="foot">20</ref>. This manual design of these JSON files allows us to transpose the data according to the model presented in Figure 1. After designing a JSON file for each of the three CSV files, we use the command line to convert each of these files separately. For instance, having both the Locations.csv file and its associated JSON schema Locations.csv-metadata.json in the same directory, the following command is sufficient to convert the data to RDF, creating the RDF file Locations.nq encoded in the N-Quads format. $ cow_tool convert Locations.csv</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Listing 1.1:</head><label>1.1</label><figDesc>SPARQL query for identifying migrants and non-migrants in LINKS.</figDesc><p><code lang="sparql">SELECT ?year (avg(?samePlace) as ?shareSamePlace) WHERE {
  GRAPH ?g {
    ?fatherBride owl:sameAs ?fatherBride_asGroom .
    ?fatherGroom owl:sameAs ?fatherGroom_asGroom .
  }
  ?mar1 iisgv:fatherBride ?fatherBride ;
        iisgv:fatherGroom ?fatherGroom ;
        schema:location ?loc1 .
  ?mar2 iisgv:groom ?fatherBride_asGroom ;
        bio:date ?date ;
        schema:location ?loc2 .
  ?mar3 iisgv:groom ?fatherGroom_asGroom ;
        schema:location ?loc3 .
  FILTER (?date &gt; "1840-01-01"^^xsd:date &amp;&amp; ?date &lt; "1910-01-01"^^xsd:date)
  BIND (if(?loc1 = ?loc2 || ?loc1 = ?loc3, 1, 0) AS ?samePlace) .
  BIND (year(?date) as ?year) .
}
GROUP BY ?year
ORDER BY ?year</code></p></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 2 :</head><label>2</label><figDesc>Fig. 2: Share of marriages of father and bride in same location, 1840-1910.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="5,134.77,115.84,345.80,162.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Results of matching newborns in birth certificates to brides/grooms in marriage certificates (newbornToPartner), and matching parents of brides/grooms in marriage certificates to their own marriage certificate (partnerParentsToCouple).</figDesc><table><row><cell></cell><cell cols="2">NewbornToPartner</cell><cell cols="2">PartnerParentsToCouple</cell></row><row><cell>Maximum Levenshtein per Individual</cell><cell>Number of Links</cell><cell>Runtime (in mins)</cell><cell>Number of Links</cell><cell>Runtime (in mins)</cell></row><row><cell>1</cell><cell>271,230</cell><cell>5</cell><cell>205,477</cell><cell>2</cell></row><row><cell>2</cell><cell>289,937</cell><cell>18</cell><cell>224,785</cell><cell>8</cell></row><row><cell>3</cell><cell>310,232</cell><cell>74</cell><cell>244,343</cell><cell>25</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_0">Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_1">http://balsac.uqac.ca/english/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_2">https://uofuhealth.utah.edu/huntsman/utah-population-database/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_3">https://www.familysearch.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_4">https://sls.lscs.ac.uk/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_5">https://digitisingscotland.ac.uk/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_6">https://www.rhd.uit.no/nhdc/hpr.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_7">https://link-lives.dk/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_8">https://www.umu.se/en/centre-for-demographic-and-ageing-research/databases/parish-registers-databases/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="14" xml:id="foot_9">https://www.ed.lu.se/databases/sedd</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="15" xml:id="foot_10">https://www.dbdirl.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="16" xml:id="foot_11">https://iisg.amsterdam/en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="18" xml:id="foot_12">https://raw.githubusercontent.com/CLARIAH/wp4-civreg/master/json/locations.csv-metadata.json</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="19" xml:id="foot_13">https://github.com/CLARIAH/wp4-civreg/blob/master/json/registrations.csv-metadata.json</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="20" xml:id="foot_14">https://github.com/CLARIAH/wp4-civreg/blob/master/json/persons.csv-metadata.json</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="21" xml:id="foot_15">https://druid.datalegend.net/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="22" xml:id="foot_16">https://triply.cc/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="23" xml:id="foot_17">https://druid.datalegend.net/LINKS/links-zeeland/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="24" xml:id="foot_18">https://druid.datalegend.net/LINKS</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="25" xml:id="foot_19">http://www.rdfhdt.org/manual-of-the-java-hdt-library/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="26" xml:id="foot_20">https://github.com/universal-automata/liblevenshtein-java/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="30" xml:id="foot_21">https://github.com/CLARIAH/wp4-queries-links/blob/master/marriage-locations.rq</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Linking Scottish vital event records using family groups</title>
		<author>
			<persName><forename type="first">Ö</forename><surname>Akgün</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Dearle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Kirby</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Garrett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Dalton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Christen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Dibben</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Williamson</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.2019.1571466</idno>
		<ptr target="https://doi.org/10.1080/01615440.2019.1571466" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">0</biblScope>
			<biblScope unit="issue">0</biblScope>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2019-03">Mar 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Tracking people over time in 19th century Canada for longitudinal analysis</title>
		<author>
			<persName><forename type="first">L</forename><surname>Antonie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Inwood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Lizotte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Andrew</forename><surname>Ross</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10994-013-5421-0</idno>
		<ptr target="https://link.springer.com/article/10.1007/s10994-013-5421-0" />
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">95</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="129" to="146" />
			<date type="published" when="2014-04">Apr 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><surname>Beek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Raad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wielemaker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Van Harmelen</surname></persName>
		</author>
		<title level="m">sameAs.cc: The closure of 500M owl:sameAs statements</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="65" to="80" />
		</imprint>
	</monogr>
	<note>European semantic web conference</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Families in comparison: An individual-level comparison of life-course and family reconstructions between population and vital event registers</title>
		<author>
			<persName><forename type="first">N</forename><surname>Van Den Berg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">K</forename><surname>Van Dijk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Mourits</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">E</forename><surname>Slagboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A P O</forename><surname>Janssens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Mandemakers</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Population Studies: A Journal of Demography</title>
		<imprint>
			<biblScope unit="page" from="1" to="20" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Families in comparison: An individual-level comparison of life-course and family reconstructions between population and vital event registers</title>
		<author>
			<persName><forename type="first">N</forename><surname>Van Den Berg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">K</forename><surname>Van Dijk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Mourits</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">E</forename><surname>Slagboom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A P O</forename><surname>Janssens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Mandemakers</surname></persName>
		</author>
		<idno type="DOI">10.1080/00324728.2020.1718186</idno>
		<ptr target="https://doi.org/10.1080/00324728.2020.1718186" />
	</analytic>
	<monogr>
		<title level="j">Population Studies</title>
		<imprint>
			<biblScope unit="volume">0</biblScope>
			<biblScope unit="issue">0</biblScope>
			<biblScope unit="page" from="1" to="20" />
			<date type="published" when="2020-02">Feb 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Multiple Measures of Historical Intergenerational Mobility: Iowa 1915 to 1940</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Feigenbaum</surname></persName>
		</author>
		<idno type="DOI">10.1111/ecoj.12525</idno>
		<ptr target="https://onlinelibrary.wiley.com/doi/abs/10.1111/ecoj.12525" />
	</analytic>
	<monogr>
		<title level="j">The Economic Journal</title>
		<imprint>
			<biblScope unit="volume">128</biblScope>
			<biblScope unit="issue">612</biblScope>
			<biblScope unit="page" from="F446" to="F481" />
			<date type="published" when="2018-07">Jul 2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Binary RDF representation for publication and exchange (HDT)</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Martínez-Prieto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutiérrez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Polleres</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arias</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="22" to="41" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A New Sample of Males Linked from the Public Use Microdata Sample of the 1850 U.S. Federal Census of Population to the 1860 U.S. Federal Census Manuscript Schedules</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Ferrie</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.1996.10112735</idno>
		<ptr target="https://doi.org/10.1080/01615440.1996.10112735" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="141" to="156" />
			<date type="published" when="1996-10">Oct 1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">New Methods of Census Record Linking</title>
		<author>
			<persName><forename type="first">R</forename><surname>Goeken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Huynh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">A</forename><surname>Lynch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vick</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.2010.517152</idno>
		<ptr target="https://doi.org/10.1080/01615440.2010.517152" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">44</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="7" to="14" />
			<date type="published" when="2011-01">Jan 2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The datalegend ecosystem for historical statistics</title>
		<author>
			<persName><forename type="first">R</forename><surname>Hoekstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Meroño-Peñuela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rijpma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zijdeman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ashkpour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Dentler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Zandhuis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rietveld</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="49" to="61" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Is my:sameAs the same as your:sameAs? Lenticular lenses for context-specific identity</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Idrissou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hoekstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Van Harmelen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Khalili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Van Den Besselaar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Knowledge Capture Conference</title>
				<meeting>the Knowledge Capture Conference</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Silk Server - Adding missing links while consuming Linked Data</title>
		<author>
			<persName><forename type="first">R</forename><surname>Isele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jentzsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bizer</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First International Conference on Consuming Linked Data</title>
				<meeting>the First International Conference on Consuming Linked Data</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="volume">665</biblScope>
			<biblScope unit="page" from="85" to="96" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Interdisciplinary approach to spatiotemporal population dynamics: The North Orkney Population History Project</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Jennings</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Sparks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Murtha</surname></persName>
		</author>
		<ptr target="http://hdl.handle.net/10622/23526343-2019-0002?locatt=view:master" />
	</analytic>
	<monogr>
		<title level="j">Historical Life Course Studies</title>
		<imprint>
			<biblScope unit="page" from="27" to="51" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">MIDI2vec: Learning MIDI Embeddings for Reliable Prediction of Symbolic Music Metadata</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lisena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Meroño-Peñuela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Troncy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions of the International Society for Music Information Retrieval</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct>
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Mandemakers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Laan</surname></persName>
		</author>
		<title level="m">LINKS dataset Genes Germs and Resources, WieWasWie Zeeland, Civil Certificates</title>
		<imprint>
			<pubPlace>Amsterdam</pubPlace>
			<publisher>International Institute of Social History</publisher>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Mandemakers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Laan</surname></persName>
		</author>
		<title level="m">LINKS-Zeeland challenge, WieWasWie Zeeland, Civil Certificates, version 2016</title>
		<imprint>
			<pubPlace>Amsterdam</pubPlace>
			<publisher>International Institute of Social History</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Playing with matches: An assessment of accuracy in linked historical data</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">G</forename><surname>Massey</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.2017.1288598</idno>
		<ptr target="http://dx.doi.org/10.1080/01615440.2017.1288598" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">0</biblScope>
			<biblScope unit="issue">0</biblScope>
			<biblScope unit="page" from="1" to="15" />
			<date type="published" when="2017-03">Mar 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Combining Family History and Machine Learning to Link Historical Records</title>
		<author>
			<persName><forename type="first">J</forename><surname>Price</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Buckles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Van Leeuwen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Riley</surname></persName>
		</author>
		<idno type="DOI">10.3386/w26227</idno>
		<ptr target="http://www.nber.org/papers/w26227.pdf" />
		<imprint>
			<date type="published" when="2019-09">Sep 2019</date>
			<pubPlace>Cambridge, MA</pubPlace>
		</imprint>
		<respStmt>
			<orgName>National Bureau of Economic Research</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. Rep. w26227</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Record linkage in the Cape of Good Hope Panel</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rijpma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cilliers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fourie</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.2018.1517030</idno>
		<ptr target="https://doi.org/10.1080/01615440.2018.1517030" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">0</biblScope>
			<biblScope unit="issue">0</biblScope>
			<biblScope unit="page" from="1" to="16" />
			<date type="published" when="2019-02">Feb 2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Migration, Marriage, and Mortality: Correcting Sources of Bias in English Family Reconstitutions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ruggles</surname></persName>
		</author>
		<idno type="DOI">10.1080/0032472031000146486</idno>
		<ptr target="https://doi.org/10.1080/0032472031000146486" />
	</analytic>
	<monogr>
		<title level="j">Population Studies</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="507" to="522" />
			<date type="published" when="1992-11">Nov 1992</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Schraagen</surname></persName>
		</author>
		<ptr target="https://openaccess.leidenuniv.nl/handle/1887/29716" />
		<title level="m">Aspects of record linkage</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
		<respStmt>
			<orgName>Leiden Institute of Advanced Computer Science (LIACS), Faculty of Science, Leiden University</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Fast string correction with Levenshtein automata</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">U</forename><surname>Schulz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mihov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal on Document Analysis and Recognition</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="67" to="85" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F</forename><surname>Vulsma</surname></persName>
		</author>
		<title level="m">Burgerlijke stand en bevolkingsregister</title>
				<imprint>
			<pubPlace>'s-Gravenhage</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">The string-to-string correction problem</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Wagner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Fischer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the ACM (JACM)</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="168" to="173" />
			<date type="published" when="1974">1974</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Testing Methods of Record Linkage on Swedish Censuses</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Wisselgren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Edvinsson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Berggren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Larsson</surname></persName>
		</author>
		<idno type="DOI">10.1080/01615440.2014.913967</idno>
		<ptr target="https://doi.org/10.1080/01615440.2014.913967" />
	</analytic>
	<monogr>
		<title level="j">Historical Methods: A Journal of Quantitative and Interdisciplinary History</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="138" to="151" />
			<date type="published" when="2014-07">Jul 2014</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
