<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">The discourse of the French method: making old knowledge on market gardening accessible to machines and humans</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">David</forename><surname>Colliaux</surname></persName>
							<email>david.colliaux@sony.com</email>
							<affiliation key="aff0">
								<orgName type="laboratory">Computer Science Laboratories</orgName>
								<orgName type="institution">SonyParis</orgName>
								<address>
									<addrLine>6 Rue Amyot</addrLine>
									<postCode>75005</postCode>
									<settlement>Paris</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Remi</forename><surname>Van Trijp</surname></persName>
							<email>remi.vantrijp@sony.com</email>
							<affiliation key="aff0">
								<orgName type="laboratory">Computer Science Laboratories</orgName>
								<orgName type="institution">SonyParis</orgName>
								<address>
									<addrLine>6 Rue Amyot</addrLine>
									<postCode>75005</postCode>
									<settlement>Paris</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">The discourse of the French method: making old knowledge on market gardening accessible to machines and humans</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">DE26A12BBD4491530895D7F0B2F6C800</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:50+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>digital humanities</term>
					<term>grounded language</term>
					<term>corpus linguistics</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>A vast amount of our cultural heritage is at risk of being lost because it resides in old books that are difficult to access. It is therefore important to make this information available not only to human readers but also to machine analysis, so that new representations and insights based on this knowledge can be constructed. In our case study, we use a host of digital tools to extract and analyze a corpus of 19th century French texts about the practices of market gardening in Paris, and to apply a variety of possible visualizations in an integrated interface. Our work includes a Named Entity Recognition and Linking procedure for creating maps of the locations mentioned in these texts as well as the social networks of people cited in the books. We also consider how the analysis of verbs can approximate and represent the know-how of market gardening: we analyze the statistics of those verbs compared to their usage in a general corpus for French, and map the verbs using word embeddings. Finally, we also consider a semantic frame analysis to extract causal relations from texts to evaluate how well these relations support the biological knowledge embedded in those texts (such as how too much exposure to the sun may affect the quality of the garden's produce). Altogether, we show how the visualizations based on Natural Language Processing and Textual Statistics could support a convivial navigation through the corpus.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Digital libraries gather corpora of texts that are too large for any human to read in full. One of the tasks of digital humanities <ref type="bibr" target="#b20">[21]</ref> is thus to organize and analyze those texts so that they are easy to navigate. For instance, through distant reading <ref type="bibr" target="#b15">[16]</ref>, we may construct curves, graphs and maps that make this large quantity of information graspable for the human mind. Moreover, the information must be accessible not only to humans but also to machines, so that further processing may be applied to those texts.</p><p>A large body of work has pursued this direction, applied to literary texts <ref type="bibr" target="#b15">[16]</ref> and the press <ref type="bibr" target="#b5">[6]</ref>, showing the potential of text mining and natural language processing for such corpora. However, less attention has been paid to manuals, even though such texts are essential as they encapsulate the knowledge of a particular era about a certain topic. In our case, we focused on 19th century manuals about market gardening. Those manuals are both a record of the practices of the time and the beginning of the crystallization of this knowledge into a science, namely agronomy.</p><p>19th century texts are particularly interesting because shortly after that period, from the second half of the 20th century onwards, agriculture went through radical changes with the green revolution and the introduction of chemicals to control the growth and the environment of plants. These changes, which were driven by the agronomical institutions, were so sweeping that we can reasonably ask whether some part of the old knowledge was lost. 
To answer this question, it is necessary to mine the older texts; their analysis will also help visualize some interesting aspects of the history of agriculture.</p><p>We present here how we built the corpus, how we preprocessed the data, and some of the analyses we performed on the texts. First, we performed Named Entity Recognition and Linking to gather information on the places and people cited in those books. Then, we analyzed the verbs appearing in the corpus through semantic embeddings. Finally, we collected sentences expressing causal relations, as those are the most likely to contain agronomical knowledge. For each of these analyses, we provide visualizations that help navigate the corpus interactively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The Good Old Manuals corpus</head><p>The gardening manuals of the 19th century are a record of the development of very efficient methods for growing vegetables in an urban environment (many of these books focus on practices in the Paris area). These methods of cultivating dense mixtures of crops on small plots of land have inspired a movement in California and more recently in Europe commonly referred to as the Biointensive French Method <ref type="bibr" target="#b13">[14]</ref>, or French Method <ref type="bibr" target="#b3">[4]</ref> for short. The French Method is related to more recent practices like agroecology <ref type="bibr" target="#b22">[23]</ref> or permaculture <ref type="bibr" target="#b6">[7]</ref>, although the French Method insists on forcing the culture of vegetables out of season, so that products can be sold at a higher price early or late in the season. One book in particular, Manuel pratique de la culture maraîchère de Paris by Moreau and Daverne, was especially influential according to the actors of this revival <ref type="bibr" target="#b11">[12]</ref>, but there is a rich 19th century literature on the topic, from which we picked references to include in our corpus. We describe below how the manuals were selected to compose the Good Old Manuals corpus (GOM).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Selection of the books</head><p>The first selection of books was compiled from the recommended readings on an online platform about agroecological practices. The resulting GOM1 corpus is composed of the seven books listed in Table 1. We then added 14 more books to form the full GOM corpus, after discussions with specialists of market gardening. All books are related to market gardening and were published between 1802 and 1912. For the following textual analysis, we only consider the GOM1 section of the corpus. The list of books included in the full GOM corpus is available on the companion website<ref type="foot" target="#foot_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Text extraction and preprocessing</head><p>The first step in our analysis is to extract the layout of each page, identifying the regions occupied by text paragraphs, titles, figures or tables using an image segmentation algorithm based on Faster R-CNN trained on a large collection of publications <ref type="bibr" target="#b23">[24]</ref>. In this process, we extracted 1269 figures and 120 tables. The regions classified as text were then fed to the Tesseract library <ref type="bibr" target="#b19">[20]</ref> for optical character recognition (OCR).</p><p>As expected, the resulting text still contains many mistakes, so a first preprocessing step substitutes characters unlikely to appear in the text with their most likely replacement (for example ä-&gt;à). Next, to correct spelling mistakes from the OCR, we identified out-of-vocabulary words (using the reference lexicon MORPHALOU3; <ref type="bibr" target="#b18">[19]</ref>), for example "avans" instead of "avons". We corrected them with a Bayesian model <ref type="bibr" target="#b16">[17]</ref> that combines an estimate of the most likely mistakes (using the confusion matrix of the characters<ref type="foot" target="#foot_1">2</ref>) with the closest neighbors in edit distance, weighting candidates at edit distance 1 and at edit distance 2 differently. For a string s, we select the candidate valid word w maximizing P(w)·P(s|w), where P(w) is the frequency of occurrence of w in a base corpus (FRANTEXT <ref type="bibr" target="#b1">[2]</ref> in our case) and P(s|w) is the probability, given by the confusion matrix, of the substitutions that turn w into s. With this process, we reduced the number of out-of-vocabulary words from 80000 to 8000.</p></div>
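<div xmlns="http://www.tei-c.org/ns/1.0"><p>The correction step described above can be illustrated with a minimal sketch of a noisy-channel corrector. It is not the implementation used for the corpus: the alphabet, the word counts, the confusion probabilities, the distance weights and the fallback probability for unseen substitutions are all invented for illustration, and the channel model is simplified so that the confusion matrix is only consulted when the observed string and the candidate have the same length.</p>
<p>
```python
from collections import Counter

# Illustrative alphabet; a real run would cover every character in the lexicon.
ALPHABET = "abcdefghijklmnopqrstuvwxyzàâäçéèêëîïôöùûü"


def edits1(word):
    """Return every string one deletion, substitution or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    substitutions = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + substitutions + inserts)


def candidates(observed, lexicon):
    """Map each valid word within edit distance 2 to its distance (1 or 2)."""
    near = edits1(observed)
    found = {w: 1 for w in near if w in lexicon}
    for intermediate in near:
        for w in edits1(intermediate):
            if w in lexicon and w not in found:
                found[w] = 2
    return found


def channel_prob(observed, word, distance, confusion, weights):
    """Approximate P(s|w): a distance weight times confusion probabilities."""
    prob = weights[distance]
    if len(observed) == len(word):
        for seen, true in zip(observed, word):
            if seen != true:
                # Assumed small fallback for substitutions absent from the matrix.
                prob *= confusion.get((true, seen), 1e-4)
    return prob


def correct(observed, counts, confusion, weights):
    """Pick the valid word w maximizing P(w) * P(s|w); keep s if none is found."""
    if observed in counts:
        return observed
    found = candidates(observed, counts)
    if not found:
        return observed
    total = sum(counts.values())
    return max(
        found,
        key=lambda w: counts[w] / total
        * channel_prob(observed, w, found[w], confusion, weights),
    )


# Toy data, invented for illustration only.
COUNTS = Counter({"avons": 50, "avant": 30, "avis": 5})
CONFUSION = {("o", "a"): 0.05}  # P(OCR reads "a" | true character is "o")
WEIGHTS = {1: 1.0, 2: 0.1}

print(correct("avans", COUNTS, CONFUSION, WEIGHTS))
```
</p>
<p>With these toy values, the candidate "avons" wins over "avant" and "avis" because it is both frequent and reachable through a substitution that the confusion matrix considers likely. In practice the word counts would come from the base corpus and the confusion probabilities from the published matrix.</p></div>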
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Named entity recognition and linking</head><p>It is important to identify the places and people cited in the GOM corpus so that the texts can be properly situated in their geography and history. For this, we took the out-of-vocabulary words and selected those starting with a capital letter. We then matched this list against a dictionary of geographical locations that includes their GPS coordinates. Among the remaining words, we manually checked, through web search, whether the most frequently cited ones correspond to people.</p><p>Additionally, for places, there is a common ambiguity in our corpus: the name of a location may refer either to the location itself or to a plant variety originating from it. To disambiguate, we manually annotated every mention of a location name as one or the other.</p><p>Based on this recognition of places and people, we were able to visualize both aspects. First, in the graph in Fig. <ref type="figure" target="#fig_1">2</ref>, we represented the authors and the most cited people (cited more than twice). We drew an edge between an author and a cited person if this person was cited by the author. We see that some authors cite generously, while others mention only a few people. For example, in Moreau &amp; Daverne, only Héricart de Thury and Mr Gontier are cited. The book they wrote was a response to a call issued by the Royal Society of Horticulture, whose director was Héricart de Thury; Mr Gontier was a market gardener in the region of Nantes who was among the first to experiment with an innovative technology of the time, the thermosiphon. For places, in Fig. <ref type="figure" target="#fig_2">3</ref>, we placed circles on a map of France, with the radius denoting the frequency of occurrence of the place name in the GOM1 corpus. 
We notice that there are many mentions of places in the Paris region, which is expected since many of the practices we are interested in originate from there.</p></div>
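<div xmlns="http://www.tei-c.org/ns/1.0"><p>The selection and matching procedure described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function name, the toy lexicon, the toy gazetteer and its approximate coordinates are hypothetical, and the manual steps (checking people through web search, disambiguating places from plant varieties) are not modeled.</p>
<p>
```python
from collections import Counter


def link_capitalized_words(tokens, lexicon, gazetteer):
    """Split capitalized out-of-vocabulary tokens into places and leftovers."""
    places = Counter()
    unresolved = Counter()
    for token in tokens:
        if not token[:1].isupper() or token.lower() in lexicon:
            continue  # keep only capitalized words unknown to the lexicon
        if token in gazetteer:
            places[token] += 1
        else:
            unresolved[token] += 1
    # Attach coordinates to each place together with its number of mentions.
    located = {name: (gazetteer[name], n) for name, n in places.items()}
    return located, unresolved


# Toy inputs, invented for illustration; coordinates are approximate.
LEXICON = {"les", "maraîchers", "de", "et", "citent"}
GAZETTEER = {"Paris": (48.86, 2.35), "Nantes": (47.22, -1.55)}
TOKENS = ["Les", "maraîchers", "de", "Paris", "et", "de", "Nantes",
          "citent", "Gontier", "Paris"]

located, unresolved = link_capitalized_words(TOKENS, LEXICON, GAZETTEER)
print(located)
print(unresolved.most_common())
```
</p>
<p>The counts attached to each place could drive the circle radius on a map, while the most frequent unresolved words are the candidates to check manually as possible names of people.</p></div>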
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Mapping the key verbs in the GOM corpus</head><p>It is interesting to focus on the verbs mentioned in the GOM corpus as they reflect the actions that are important to a market gardener on their farm. We are particularly interested in the verbs that are specific to market gardening, which can be treated as a keyword identification problem. For this, we first lemmatize and POS-tag the texts using spaCy, a widely used tool for various NLP tasks<ref type="foot" target="#foot_2">3</ref>. Then, similarly to the keyness measure commonly used in corpus linguistics <ref type="bibr" target="#b17">[18]</ref>, we compute for each verb the logarithm of the ratio between $f_G$, the frequency of occurrence of the verb in the GOM1 corpus, and $f_F$, its frequency of occurrence in a reference corpus, FRANTEXT <ref type="bibr" target="#b1">[2]</ref>, which gathers 31 M words from periodicals of the 19th and 20th centuries:</p><formula xml:id="formula_0">$k = \log\left(\frac{f_G}{f_F}\right)$</formula><p>The word cloud in Fig. <ref type="figure" target="#fig_3">4</ref> shows in yellow the verbs with a size proportional to this index, and in red the verbs not appearing in FRANTEXT, with a size proportional to the log of their frequency of occurrence in GOM1.</p><p>In the previous representation, the location of words has no interpretation. We also want to represent the words in a space where two words located close together have similar meanings (in the distributional sense). That representation can be useful, for example, to reveal clusters of words with similar meanings. We represented each verb by its embedding in a word2vec model trained on a large French corpus <ref type="bibr" target="#b0">[1]</ref>, and we visualize the map of verbs after reducing the embeddings to 2 dimensions using UMAP <ref type="bibr" target="#b14">[15]</ref> in Fig. <ref type="figure">5</ref>. 
We can, for example, identify a cluster of verbs describing actions of the farmer in the field (sarcler-palisser-semer), or verbs related to biological processes of the crops (pommer-tacheter-fleurir) being grouped together. Such a map is useful to navigate the content of the manuals, and the embeddings may be useful to classify parts of the text.</p><p>The GOM corpus gathers a rich mixture of practical advice and knowledge. It is interesting to study whether the discourse in those books reflects this dichotomy between practices and knowledge. A key feature of the transition of discourse from practice to knowledge is nominalization, a linguistic process where nouns are derived from verbs <ref type="bibr" target="#b10">[11]</ref>. Taking the verb arroser ("to water") as an example, we plot its usage statistics in each of the 7 books of the GOM1 corpus. We see, in the top panel of Fig. <ref type="figure" target="#fig_4">6</ref>, that some authors strongly favor the verb over the noun, denoting a more practical and less abstract discourse. It is also interesting to note that the verb arroser actually had two corresponding nouns: arrosage and arrosement (both meaning "the watering (of crops)"). Plotting the frequencies of occurrence of these two terms in large corpora (Gallica and Google Books) shows that the 19th century is precisely the time during which the two terms coexisted, arrosement being used more frequently before, and arrosage becoming dominant after the 19th century. Some references (ATILF) mention a small difference in meaning between the two terms, arrosement being more related to a passive manner for plants to receive water and arrosage referring to a more active process in which a human provides the water.</p></div>
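<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a worked illustration of the keyness index defined in this section, the following sketch computes the log-ratio for a handful of verb lemmas. The function names, the toy lemma list and the reference frequencies are invented for illustration; in the actual analysis the lemmas would come from the POS-tagged corpus and the reference frequencies from FRANTEXT. Verbs absent from the reference corpus are returned separately with their corpus frequency, since the ratio is undefined for them.</p>
<p>
```python
import math
from collections import Counter


def relative_frequencies(lemmas):
    """Turn a list of verb lemmas into relative frequencies."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    return {lemma: n / total for lemma, n in counts.items()}


def keyness(corpus_freq, reference_freq):
    """Return log(f_G / f_F) per verb, plus the verbs unseen in the reference."""
    scores, unseen = {}, {}
    for verb, f_g in corpus_freq.items():
        f_f = reference_freq.get(verb, 0.0)
        if f_f > 0:
            scores[verb] = math.log(f_g / f_f)
        else:
            unseen[verb] = f_g  # no ratio is defined; keep the corpus frequency
    return scores, unseen


# Toy values, invented for illustration only.
GOM_VERBS = ["arroser", "arroser", "semer", "sarcler"]
REFERENCE_FREQ = {"arroser": 0.05, "semer": 0.25}

scores, unseen = keyness(relative_frequencies(GOM_VERBS), REFERENCE_FREQ)
print(scores)
print(unseen)
```
</p>
<p>A positive score marks a verb that is relatively more frequent in the gardening corpus than in the reference corpus, a score near zero marks a verb used at a similar rate in both, and the verbs returned in the second dictionary correspond to those drawn in red in the word cloud.</p></div>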
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Extracting causality frames</head><p>We were also interested in capturing the parts of the discourse reflecting causal relations, because sentences expressing causality may contain elements of biological knowledge. For example: "Autre observation : la pratique nous a appris que, pendant l'été, si nous arrosons nos romaines durant le grand soleil avec l'eau froide de nos puits, quand elles sont près de se coiffer ou déjà coiffées, cela détermine dans leur intérieur des taches de pourriture; nous disons alors que la romaine est mouchetée : dans cet état, elle n'est plus bonne pour la vente." "Another observation: practice has taught us that, during the summer, if we water our romaine plants in the hot sun with cold water from our wells, when they are about to be capped or have already been capped, this causes spots of rot inside them; we then say that the romaine is speckled: in this state, it is no longer fit for sale."</p><p>Here, the authors draw a causal relation between, on the one hand, watering the crops with cold water in hot weather at a specific growth stage and, on the other hand, the rotting of their leaves. Even though knowledge was too scarce at the time to fully explain this phenomenon, namely that these conditions favor the growth of fungi, it is clearly some kind of knowledge about biology that is encapsulated in the text.</p><p>To detect such causal relationships in a systematic manner, we are currently performing a Frame-Semantic analysis <ref type="bibr" target="#b7">[8]</ref> of the corpus. A Semantic Frame is a structured piece of knowledge that can be considered as a template of a scene with several open slots (called Frame Elements) that need to be filled in. 
One example is the Causality Frame, which comes with 'core' Frame Elements such as Cause and Effect, and 'non-core' elements that further qualify the relation. The linguistic sister theory of Frame Semantics is called Construction Grammar <ref type="bibr" target="#b8">[9]</ref>, which explores how semantic frames get expressed in language through associations of form and meaning called constructions. There are typically two types of constructions involved. The first kind are frame-evoking constructions (usually lexical items or multiword expressions), which activate a semantic frame. In French, numerous words and multiword expressions evoke the Causality frame, such as à cause de "because of", parce que "because", occasionner "to bring about", suite à "due to", and so on. The second type are grammatical constructions (typically argument structure constructions; <ref type="bibr" target="#b9">[10]</ref>), which identify which phrases of a sentence should be mapped onto which Frame Elements.</p><p>Figure <ref type="figure">5</ref>: A vector representation of verbs allows us to identify clusters of related activities. One cluster contains actions that focus on work in the field (such as 'sarcler', 'semer', and 'palisser') in the region on the right, while another cluster at the bottom left groups together biological processes of crops (such as 'pommer', 'fleurir' and 'tacheter').</p><p>Our Semantic Frame Extractor has been implemented in Fluid Construction Grammar (FCG; <ref type="bibr" target="#b21">[22]</ref>), an open-source computational grammar formalism for engineering Construction Grammars, following the methodology described by <ref type="bibr" target="#b2">[3]</ref>, who developed a Causality Frame Extractor for English. Our approach integrates several knowledge sources. Input sentences are preprocessed using both a dependency parser and a constituency parser (such as the Berkeley Neural Parser; <ref type="bibr" target="#b12">[13]</ref>). 
These different structures are integrated in a single syntactic representation of a sentence using feature structures. During the training phase, annotations of semantic frames are mapped onto the syntactic analysis to extract recurrent patterns of form-meaning associations (constructions). Patterns that are not frequent enough are pruned because they typically result from annotation errors. The semantic annotations were taken from the French FrameNet, developed within the ASFALDA project <ref type="bibr" target="#b4">[5]</ref>. The French FrameNet project has explicitly focused on Causality as one of its main domains, and includes 11 distinct Causality frames and 217 distinct frame-evoking elements. Fig. <ref type="figure" target="#fig_5">7</ref> illustrates the kind of information that can be extracted using this method. On the left is an input sentence, and on the right is a Causality frame that was detected. As can be seen, the verb form détermine (here: "causes") is the frame-evoking element (FEE). It has designated its subject (cela "that") as the Cause, and its direct object (taches de pourriture "spots of rot") as the Effect.</p><p>In its current form, a Causal Frame extractor is already useful because it can search through a text for instances of causal language, and then present the results to the human reader. We are currently evaluating how well a frame extractor trained on contemporary French data can be applied to the Good Old Manuals corpus. For this, we are annotating a test set of causal expressions found in the corpus. Moreover, as can be seen in Figure <ref type="figure" target="#fig_5">7</ref>, the Frame Extractor currently identifies Frame Elements through syntactic relations, so the syntactic subject cela was assigned the role of Cause rather than the semantic subject (printed in italics), which is what really matters for extracting knowledge. 
Future work will therefore have to include anaphora resolution and the tracking of entities across longer spans of the discourse.</p></div>
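<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the notion of a frame-evoking element more concrete, the sketch below retrieves sentences containing one of the causal expressions listed in this section. It is a plain lexical lookup and deliberately much simpler than the FCG-based extractor described above: it performs no syntactic analysis, assigns no Frame Elements, and only matches the literal forms given (so inflected forms of occasionner would need lemmatization first). The function name and the example sentences are invented for illustration.</p>
<p>
```python
import re

# Frame-evoking expressions quoted in the text above.
CAUSAL_TRIGGERS = ["à cause de", "parce que", "occasionner", "suite à"]


def find_causal_sentences(text, triggers=CAUSAL_TRIGGERS):
    """Return (sentence, trigger) pairs for sentences containing a trigger."""
    # Naive sentence splitting on terminal punctuation.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", text)]
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        for trigger in triggers:
            if trigger in lowered:
                hits.append((sentence, trigger))
    return hits


# Invented example sentences, not taken from the corpus.
TEXT = ("Les laitues pourrissent à cause de l'eau froide. "
        "Nous semons en mars. "
        "On arrose le soir parce que le soleil brûle les feuilles.")

for sentence, trigger in find_causal_sentences(TEXT):
    print(trigger, "::", sentence)
```
</p>
<p>Such a lookup can serve as a rough baseline for collecting candidate sentences to annotate, whereas identifying which phrases express the Cause and the Effect requires the grammatical constructions discussed above.</p></div>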
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>Old texts are often treasure troves of past knowledge that has become almost inaccessible or even forgotten as societies evolve. "Good old" manuals in particular, which have so far been neglected, are a great potential source of information about the knowledge and practices of a given time and place. In this paper, we have illustrated how a suite of techniques from Digital Humanities, natural language processing, statistical analysis and data visualization can be exploited to make such texts not only accessible, but also more meaningful to human readers.</p><p>More specifically, we have introduced the Good Old Manuals corpus of 19th century texts about French market gardening, particularly in the Paris region. These gardening techniques, known as the French Method, have recently gained renewed interest because they offer insights into more efficient farming on small plots of land. We have demonstrated how the most prominent actors at the time can be situated in a social and geographic network through named entity linking; how activities that are relevant and meaningful to specific topics such as market gardening can be visualized through word clouds and word embedding spaces; and how more fine-grained knowledge could potentially be mined through semantic parsing.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Covers of the books included in the GOM corpus.</figDesc><graphic coords="3,89.28,225.36,416.72,237.21" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Citations in the GOM corpus. Authors are listed in the left and right columns; while cited people are listed in the central columns. Names in purple refer to people mostly on the knowledge side (professors of agronomy or botany for example) and names in yellow refer to people involved on the practical side (market gardeners, seed sellers,...).</figDesc><graphic coords="5,89.28,84.17,416.72,270.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Map of the locations mentioned in the GOM1 corpus. The size of each circle reflects the number of occurrences in the corpus.</figDesc><graphic coords="6,89.28,84.16,416.72,234.29" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: This word cloud of verbs illustrates which actions were important for market gardeners (indicated through size). Red verbs do not appear in the FRANTEXT reference corpus and are therefore specific to market gardening.</figDesc><graphic coords="7,89.28,84.16,416.72,265.18" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: (Top) Usage, in the GOM corpus, of the verb "arroser" (in red) compared to its nominalizations "arrosage" (in black) and "arrosement" (in gray). Comparison of the frequency of occurrence of "arrosage" (in black) and "arrosement" (in gray) in Gallica (Middle) and Google Books (Bottom).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: This Figure shows an input sentence on the left, with its Frame Elements indicated in boldface, and its frame-evoking element underlined. On the right is a Causality frame that was extracted from this sentence, as it is visualized in Fluid Construction Grammar's web interface.</figDesc><graphic coords="10,89.28,84.18,416.72,65.41" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="8,89.28,84.17,416.72,280.93" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>List of the books included in the GOM1 corpus.</figDesc><table><row><cell>Author</cell><cell>Date</cell><cell>Title</cell></row><row><cell>Combles, Charles-Jean De</cell><cell>1802</cell><cell>L'école du jardin potager</cell></row><row><cell>Noisette, Louis</cell><cell>1825</cell><cell>Manuel complet du jardinier maraîcher</cell></row><row><cell>Courtois-Gérard, Claude Joseph</cell><cell>1843</cell><cell>Manuel pratique du jardinage</cell></row><row><cell>Moreau, J.G. and Daverne, Jean-Jacques</cell><cell>1845</cell><cell>Manuel pratique de la culture maraîchère de Paris</cell></row><row><cell>Deby, Julien and Rodigas, François/Emile</cell><cell>1853</cell><cell>Manuel de culture maraîchère</cell></row><row><cell>Gressent, Vincent</cell><cell>1863</cell><cell>Le potager moderne</cell></row><row><cell>Desmoulins, Philippe</cell><cell>1871</cell><cell>Guide pratique du jardinier français</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://sonycslparis.github.io/gom-webapp/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">We used the confusion matrix available at https://github.com/shaneweisz/OCR-Character-Confusion/blob/master/confusion_matrix/confusion_matrix_base.pkl</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://spacy.io</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Evaluation of word embeddings from large-scale French web content</title>
		<author>
			<persName><forename type="first">H</forename><surname>Abdine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xypolopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Eddine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Vazirgiannis</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2105.01990</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Computerized linguistic resources of the research laboratory ATILF for lexical and textual analysis: Frantext, TLFi, and the software Stella</title>
		<author>
			<persName><forename type="first">P</forename><surname>Bernard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lecomte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dendien</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Pierrel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC</title>
				<imprint>
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A computational construction grammar approach to semantic frame extraction</title>
		<author>
			<persName><forename type="first">K</forename><surname>Beuls</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Van Eecke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">S</forename><surname>Cangalovic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Linguistics Vanguard</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">20180015</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>De Carné-Carnavalet</surname></persName>
		</author>
		<title level="m">Le maraîchage sur petite surface: La French Method: une agriculture urbaine ou périurbaine</title>
				<imprint>
			<publisher>Editions de Terran</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Corpus annotation within the French FrameNet: a domain-by-domain methodology</title>
		<author>
			<persName><forename type="first">M</forename><surname>Djemaa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Candito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Muller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Vieu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Tenth International Conference on Language Resources and Evaluation (LREC 2016)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Computational Approaches to Digitised Historical Newspapers</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ehrmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Düring</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Neudecker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Doucet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Dagstuhl Seminar 22292</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Permaculture for agroecology: design, movement, practice, and worldview. A review</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Ferguson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Lovell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Agronomy for sustainable development</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="251" to="274" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">A frames approach to semantic analysis</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Fillmore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Baker</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Construction Grammar: A thumbnail sketch</title>
		<author>
			<persName><forename type="first">M</forename><surname>Fried</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-O</forename><surname>Östman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Construction Grammar in a cross-language perspective</title>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="86" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Constructions: A construction grammar approach to argument structure</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">E</forename><surname>Goldberg</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1995">1995</date>
			<publisher>University of Chicago Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Writing science: Literacy and discursive power</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A K</forename><surname>Halliday</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Martin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2003">2003</date>
			<publisher>Routledge</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Hervé-Gruyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hervé-Gruyer</surname></persName>
		</author>
		<title level="m">Miraculous abundance: One quarter acre, two French farmers, and enough food to feed the world</title>
		<imprint>
			<publisher>Chelsea Green Publishing</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Constituency parsing with a self-attentive encoder</title>
		<author>
			<persName><forename type="first">N</forename><surname>Kitaev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Klein</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1805.01052</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">French Intensive Gardening: A Retrospective</title>
		<author>
			<persName><forename type="first">O</forename><surname>Martin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">UMAP: Uniform manifold approximation and projection for dimension reduction</title>
		<author>
			<persName><forename type="first">L</forename><surname>McInnes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Healy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Melville</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1802.03426</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Graphs, maps, trees: abstract models for a literary history</title>
		<author>
			<persName><forename type="first">F</forename><surname>Moretti</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2005">2005</date>
			<publisher>Verso</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">How to write a spelling corrector</title>
		<author>
			<persName><forename type="first">P</forename><surname>Norvig</surname></persName>
		</author>
		<ptr target="http://norvig.com/spell-correct.html" />
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">From key words to key semantic domains</title>
		<author>
			<persName><forename type="first">P</forename><surname>Rayson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International journal of corpus linguistics</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="519" to="549" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Standards going concrete: from LMF to Morphalou</title>
		<author>
			<persName><forename type="first">L</forename><surname>Romary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Salmon-Alt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Francopoulo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The 20th International Conference on Computational Linguistics (COLING 2004)</title>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">An overview of the Tesseract OCR engine</title>
		<author>
			<persName><forename type="first">R</forename><surname>Smith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ninth international conference on document analysis and recognition (ICDAR 2007)</title>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="629" to="633" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Defining digital humanities: a reader</title>
		<author>
			<persName><forename type="first">M</forename><surname>Terras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Nyhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Vanhoutte</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>Routledge</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">The FCG Editor: An innovative environment for engineering computational construction grammars</title>
		<author>
			<persName><forename type="first">R</forename><surname>Van Trijp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Beuls</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Van Eecke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PLOS ONE</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page">e0269708</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Agroecology as a science, a movement and a practice. A review</title>
		<author>
			<persName><forename type="first">A</forename><surname>Wezel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bellon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Doré</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Francis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vallod</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>David</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Agronomy for sustainable development</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="503" to="515" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">PubLayNet: largest dataset ever for document layout analysis</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jimeno Yepes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2019 International conference on document analysis and recognition (ICDAR)</title>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1015" to="1022" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
