<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using Topic Generation Model to explore the French Parliamentary Debates during the early Third Republic (1881-1899)</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Nicolas</forename><surname>Bourgeois</surname></persName>
							<email>nicolas.bourgeois@epitech.eu</email>
							<affiliation key="aff0">
								<orgName type="laboratory">Méthodes Numériques pour les Sciences de l&apos;Humain et de la Société (MNSHS)</orgName>
								<address>
									<addrLine>Epitech, 14-16 rue Voltaire</addrLine>
									<postCode>94270</postCode>
									<settlement>Le Kremlin-Bicêtre</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Aurélien</forename><surname>Pellet</surname></persName>
							<email>aurelien.pellet@epitech.eu</email>
							<affiliation key="aff0">
								<orgName type="laboratory">Méthodes Numériques pour les Sciences de l&apos;Humain et de la Société (MNSHS)</orgName>
								<address>
									<addrLine>Epitech, 14-16 rue Voltaire</addrLine>
									<postCode>94270</postCode>
									<settlement>Le Kremlin-Bicêtre</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marie</forename><surname>Puren</surname></persName>
							<email>marie.puren@epitech.eu</email>
							<affiliation key="aff0">
								<orgName type="laboratory">Méthodes Numériques pour les Sciences de l&apos;Humain et de la Société (MNSHS)</orgName>
								<address>
									<addrLine>Epitech, 14-16 rue Voltaire</addrLine>
									<postCode>94270</postCode>
									<settlement>Le Kremlin-Bicêtre</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="laboratory">Centre Jean-Mabillon (CJM)</orgName>
								<address>
									<addrLine>École nationale des chartes, 65 rue de Richelieu</addrLine>
									<postCode>75002</postCode>
									<settlement>Paris</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using Topic Generation Model to explore the French Parliamentary Debates during the early Third Republic (1881-1899)</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9C45F71A90D9F589E34A4E8251EDEA51</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T13:20+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Natural Language Processing</term>
					<term>Topic Modelling</term>
					<term>Parliamentary Debates</term>
					<term>France</term>
					<term>Early Third Republic (1881-1899)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this long paper, we use NLP techniques to explore two decades (1881-1899) of parliamentary debates of the French Third Republic (1870-1940), and more specifically to analyse the importance of the army in the political debate. We use Latent Dirichlet Allocation to partition the vocabulary into topics, and then study the distribution of the topic "army" over time. We also examine its connection with other topics, in relation to the main political and military events of the period.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In this paper, we present the preliminary work we have carried out on a set of parliamentary reports from the early years of the French Third Republic (1881-1899), extracted from the Journal officiel de la République française. Débats parlementaires. Chambre des députés : compte rendu in-extenso, available online on the digital library of the National Library of France 1 . During the French Third Republic, which was the republican system of government in effect in France from September 1870 to July 1940, the debates in the lower house of the French Parliament (the upper house being the Senate) were carefully recorded and published in the Journal Officiel. Elected by universal male suffrage for four years, the Chamber of Deputies was created by the Constitutional Laws of 1875. We work on debates held since 1881, rather than since 1876 (the date of the first election of the new Chamber of Deputies), i.e. from the end of the second legislature 2 onwards <ref type="foot" target="#foot_1">3</ref> . This choice is dictated by the document we are working on: it is from 1881 that the debates held in the Chamber of Deputies are recorded in a publication specifically dedicated to them <ref type="foot" target="#foot_2">4</ref> .</p><p>The Chamber of Deputies played a considerable political role, especially in the nineteenth century. At that time, the government paid particular attention to this assembly <ref type="bibr" target="#b0">[1]</ref>. We thus have access to the full report of the debates, written by a body of specialised civil servants set up in 1847, whose techniques aim to recreate the naturalness of the deliberations <ref type="bibr" target="#b1">[2]</ref>. 
Parliamentary debates are therefore an essential historical source for political history <ref type="bibr" target="#b2">[3]</ref>, but also for other historical fields, since they make it possible to follow the major stages in the development of the legislative framework of various social, economic, religious or cultural fields of activity <ref type="bibr" target="#b3">[4]</ref>. They are also of interest to other disciplines: political science, sociology, linguistics <ref type="bibr" target="#b4">[5]</ref>, or legal history <ref type="bibr" target="#b5">[6]</ref>.</p><p>However, while all parliamentary debates since the French Revolution were made available online between 2009 and 2016, this has not prompted a new wave of research. Although they constitute a fundamental democratic institution, debates are indeed little known by the general public and little studied by specialists <ref type="bibr" target="#b0">[1]</ref>. By contrast, the availability of their Anglo-Saxon counterpart, the Hansard<ref type="foot" target="#foot_3">5</ref> , in the form of exploitable textual data, has stimulated new research in history and in political science <ref type="bibr" target="#b6">[7]</ref> but also in linguistics and natural language processing <ref type="bibr" target="#b7">[8]</ref>. The form of the French debates and the means made available to users to read them online make them a difficult source to work with: to navigate through the digitised reports, it is best to already know what you are looking for (for example, to search for the debates on a given law held on a specific date). It is possible to do a full-text search within an issue (which corresponds to a parliamentary sitting), but this does not allow users to explore the corpus as a whole, especially if they are interested in a major topic that was debated over several years.</p><p>Fortunately, it is possible to extract the text of these digitised documents. 
From a methodological point of view, parliamentary debates thus constitute an excellent case study for the computational exploration of large historical corpora. While digitisation provides access to an increasingly large amount of historical data, it requires the development of new ways of reading digitised ancient sources <ref type="bibr" target="#b8">[9]</ref>, such as the methods offered by "distant reading" as defined by Franco Moretti <ref type="bibr" target="#b9">[10]</ref>. Within the framework of the AGODA <ref type="foot" target="#foot_4">6</ref> project, funded by the National Library of France <ref type="bibr" target="#b10">[11]</ref>, our team is working on the development of tools to facilitate the exploration of this corpus. As part of this work, we propose to use topic modelling, a method that is particularly appropriate for the study of large historical corpora <ref type="bibr" target="#b11">[12]</ref>.</p><p>Topic modelling has shown its value in analysing similar sources, in particular the press (such as in <ref type="bibr" target="#b12">[13]</ref> or <ref type="bibr" target="#b13">[14]</ref>). Such corpora, large in volume, serial, and traversed by many different topics that evolve over time, are well suited to a topic-based exploration. We wish to show the value of such a method for analysing and exploring our corpus. Topic modelling indeed seems to us to be an interesting "entry point" into parliamentary debates. We start from the hypothesis that identifying the topics present in these debates makes it possible to better understand the evolution of political ideas and debates over time. We present here a first approach based on raw (uncorrected) data collected on a large scale. We have chosen to focus on the topic "army" and its co-occurring topics. The French army was indeed a stable institution during the period, one that officially did not depend directly on political governments. 
However, discussions concerning the army were numerous and repeated in the Chamber, as the MPs had to decide on various issues related to its functioning (budgets, reforms, conscription, etc.), its activities (wars and conflicts, external operations, etc.) or political events (such as the infamous Dreyfus affair). Although soldiers did not have the right to vote from 1872 to 1945, and the army was supposed to remain politically neutral <ref type="foot" target="#foot_5">7</ref> , the military were also surprisingly present on the French political scene in the nineteenth and twentieth centuries, even if they are still discreet in political history <ref type="bibr" target="#b14">[15]</ref>. It was the Minister of War who dealt with Parliament, which kept a close eye on him. In practice, the Minister proposed governmental projects, and Parliament chose whether or not to support them. The centre of decision making in defence policy, particularly in regard to projects concerning the colonial army, was therefore in Parliament <ref type="bibr" target="#b15">[16]</ref>.</p><p>We hypothesise that a topic generation model will allow us to better understand the action of the army over the identified period and its fields of intervention, and to grasp to what extent the French army (represented by the Minister of War) was able to participate in the elaboration and execution of political decisions. To assess this hypothesis, our study is divided into two parts. First, we assess the consistency of the topics that our model identifies, with particular attention to the topic "army". We study the results obtained in the light of current historical knowledge, in order to verify the validity of the model. We then examine a few topics co-occurring with the topic "army", assuming that the validity of these correlations can be verified with the historical data at our disposal.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Data set</head><p>Digitised by the National Library of France and the archives of the National Assembly, the records of the French parliamentary debates are available online on Gallica, a freely accessible digital library, together with some valuable metadata. Automatic transcription (OCR) has also been performed on these records, and the resulting texts have been made available online in ALTO-XML format and as raw text <ref type="foot" target="#foot_6">8</ref> . The transcription was generated on the fly at the time of digitisation by OCR software (ABBYY FineReader), and put online without extensive post-correction.</p><p>A detailed analysis of the quality of this transcription (and of how to improve it) is beyond the scope of this article. Let us just mention that while the current transcription is unfortunately not accurate enough for precise tasks such as named entity recognition, we believe it to be fit for the purpose of a broad analysis of the vocabulary. Most OCR errors are indeed located in specific parts of the text, namely near the binding of the documents, where the pages can be very curved <ref type="foot" target="#foot_7">9</ref> . Fortunately, this does not represent a significant part of the corpus.</p><p>We are interested in the place of the army in the parliamentary debates of the early Third Republic, a period marked by significant military activity, particularly with the wars of colonisation (protectorate over Tunisia (1881), Tonkin campaign (1883-1886), abuses committed by the military (Voulet-Chanoine mission in 1899), etc.) and the resulting tensions with its European "competitors" (Fashoda Incident in 1898). The army also intervened within the metropolitan borders to suppress strike movements, ending sometimes in bloodshed (Fusillade de Fourmies (Fourmies shooting) in 1891). 
The trauma of the 1870 defeat also led to a reform of the army from 1871 onwards, which continued in the following years with the expansion of recruitment (Freycinet laws in 1889). It was also a period marked by various scandals and affairs such as the Scandale des décorations (Medals scandal) (1887), the Schnaebelé Affair (1887), or the arrest and conviction of Captain Dreyfus (1894-1899). We have decided to limit our work to the years 1881-1899 in order to encompass these events without extending the size of the corpus beyond our reach. Over this period we have 2597 reports in text format (almost 4 per week), amounting to over 80 million words.</p><p>A parliamentary sitting is a long and composite event, during which several unrelated issues are discussed in succession. For this reason, we have divided the reports according to their sections (a section corresponds to a single debate, which deals with a well-defined issue), which usually focus on a single topic. We performed this division automatically by detecting intermediate headings, recognised as isolated sentences written in capital letters. Thus, our corpus consists of 35891 small documents, with an average size of 2200 words.</p></div>
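<div xmlns="http://www.tei-c.org/ns/1.0"><p>The division of a report into sections at capitalised headings can be sketched as follows. This is a minimal illustration and not the code actually used for the corpus: the function names, the minimum of four letters per heading and the sample text are assumptions made for the example, and real OCR output would probably need more tolerant rules.</p><p>
```python
def is_heading(line: str) -> bool:
    """Return True for an isolated line written entirely in capital letters."""
    letters = [ch for ch in line if ch.isalpha()]
    # Assumption: require a few letters so stray OCR marks are not headings.
    return len(letters) >= 4 and all(ch.isupper() for ch in letters)


def split_into_sections(report: str) -> list[tuple[str, str]]:
    """Split one report into (heading, body) pairs at capitalised headings."""
    sections = []
    heading, body = "", []
    for line in report.splitlines():
        stripped = line.strip()
        if is_heading(stripped):
            # Close the section in progress before starting a new one.
            if heading or body:
                sections.append((heading, "\n".join(body).strip()))
            heading, body = stripped, []
        else:
            body.append(line)
    if heading or body:
        sections.append((heading, "\n".join(body).strip()))
    return sections


# Made-up sample text, for illustration only.
sample = (
    "La séance est ouverte à deux heures.\n"
    "DISCUSSION DU BUDGET DE LA GUERRE\n"
    "M. le ministre de la guerre. Messieurs, ...\n"
    "INTERPELLATION SUR LES AFFAIRES DU TONKIN\n"
    "M. le président. La parole est à ...\n"
)
sections = split_into_sections(sample)
```
</p><p>Text found before the first heading is kept as a section with an empty heading, so that no part of a report is silently dropped.</p></div>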
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>The Latent Dirichlet Allocation (LDA) topic generation model was first presented in 2003 <ref type="bibr" target="#b16">[17]</ref>. It is based on a Bayesian probabilistic model, which is derived from the following theoretical assumption. Before any article is written, there are topics, this term designating semantic fields, i.e. sets of words linked by their meaning. Then, texts are produced by choosing words from a small subset of topics with a given probability distribution. In practice, this means that the texts are the observations derived from hidden variables, namely the topics, and that the statistical correlations in the texts are the direct results of the semantic similarities. We therefore hope to find the topics by reversing the generation process. In other words, we want to know the topics as word distributions and the texts as topic distributions, conditional on the observed word distribution. Unfortunately, computing the probability of the observed data, which normalises this conditional distribution, is not feasible, so this quantity has to be approximated. Many algorithms have been introduced in the literature to deal with this issue; here we simply use the original algorithm of <ref type="bibr" target="#b16">[17]</ref>, namely the variational mean field method.</p><p>Topic modelling has been widely used in many areas of the Humanities and Social Sciences <ref type="bibr" target="#b17">[18]</ref>, as it is a very powerful tool for extracting information from a large corpus in an unsupervised context, i.e. when classes are not defined a priori. The results are, however, most reliable when the assumptions of the model are satisfied by the study corpus. This includes: a large number of texts; each text dealing with a limited number of topics; each topic being distributed over several texts in the corpus; a common conceptual framework shared by all authors. 
While newspaper articles are the paradigmatic example <ref type="bibr" target="#b12">[13]</ref> <ref type="bibr" target="#b13">[14]</ref>, parliamentary debates also meet all these requirements. Once the topics are generated, they can be used as new variables for the study of vocabulary. This drastically reduces the size of the variable space (from more than 50000 forms to a few dozen topics) and thus makes visualisations possible, obviously at the cost of a significant loss of information. We can for example study the intensity of topics over time <ref type="bibr" target="#b18">[19]</ref> or study the correlation between topics.</p></div>
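<div xmlns="http://www.tei-c.org/ns/1.0"><p>The generation process that LDA assumes, and that inference tries to reverse, can be illustrated with a toy simulation. The sketch below is not the inference algorithm used in this study: it only draws a synthetic text from two hand-written topics. The word probabilities, the Dirichlet parameter and the function names are assumptions chosen for the example.</p><p>
```python
import random


def sample_dirichlet(alpha: list[float], rng: random.Random) -> list[float]:
    """Draw one probability vector from a Dirichlet distribution."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one text: draw a topic mixture, then a topic and a word per token."""
    theta = sample_dirichlet(alpha, rng)  # topic proportions of this text
    words = []
    for _ in range(length):
        # Hidden variable: the topic from which the next word is drawn.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        vocab = list(topics[k])
        probs = [topics[k][w] for w in vocab]
        # Observation: the word actually written in the text.
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


# Two toy topics given as word distributions (illustrative values only).
topics = [
    {"général": 0.4, "régiment": 0.3, "soldat": 0.3},
    {"salaire": 0.5, "grève": 0.3, "ouvrier": 0.2},
]
rng = random.Random(0)
theta, words = generate_document(topics, alpha=[0.5, 0.5], length=20, rng=rng)
```
</p><p>Only the list of words would be observed in a real corpus; the topic proportions and the topics themselves are the hidden quantities that the model estimates. Dirichlet parameters below 1, as assumed here, tend to concentrate each text on a small subset of topics.</p></div>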
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">General Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Structure and semantic coherence of the topics</head><p>The topics provided by the algorithm are remarkably coherent. If we consider the main keywords of some of them, it is easy to guess what they are representing (see Table <ref type="table">1</ref>). For instance, Topic 8 deals with the class struggle and the working class situation, with words like salaire (wages), patron (boss), syndicat (labour union), grève (strike) or ouvrier (worker). On the other hand, Topic 11 clearly relates to the army, with words like général (general), régiment (regiment), troupe (troop), soldat (soldier) or guerre (war).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Four straightforward topics: the working class <ref type="bibr" target="#b7">(8)</ref>, the army <ref type="bibr" target="#b10">(11)</ref>, the voting process <ref type="bibr" target="#b12">(13)</ref>, the state infrastructures <ref type="bibr" target="#b14">(15)</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Categorisation of topics into classes</head><p>These topics can easily be divided into two main categories. The first broad category includes topics related to the functioning of the Chamber: speaking turns, organisation of the sitting, votes, etc. The conduct of a sitting (even if it is sometimes disrupted) is highly codified. Even if the aim of the stenographers is to reproduce the naturalness of the exchanges and speeches, the transcription of the debates must accurately record each stage of the parliamentary sittings, from the bill's introduction to its final vote, but also all the elements relating to the functioning of the assembly (announcement of leave, composition of committees, questions to the government, etc.).</p><p>The second category includes topics that are semantically more significant for our study. These capture the different issues that dominated the parliamentary debates. Naturally, there are also some useless topics: for instance, Topic 12 is nothing but the list of all French departments. This is because the names of the departments often appear in debates: at the time of the verification of election results (each deputy represents a department), during debates which frequently concern local life, or at the time of the vote on bills because the voters are identified by their name and the department they represent.</p><p>Some topics are also very similar to each other, especially those dealing with how the Chamber works. We therefore grouped the 50 topics into 16 labelled classes (Table <ref type="table" target="#tab_1">2</ref>). In Table <ref type="table" target="#tab_1">2</ref>, we also calculated the contribution of each of these classes to the corpus (cf. column "Weight in the corpus"). 
This allows us to better understand whether a class of topics was more or less frequently addressed in the corpus.</p><p>If we disregard the first two classes, which bring together topics concerning the functioning of the Chamber of Deputies, we can see that "budget", "working class", "economy" and "trains/communications" are the four classes of topics that appear most often in the corpus. "Budget" is naturally the most important class of topics, because the key role of the Chamber of Deputies is to discuss the state budget and to allocate the funds needed to enforce government policy. The growth of the working class and the rise of socialism are also well reflected in the debates: MPs addressed social struggles in their speeches; we also see the (timid) beginning of social legislation in the 1890s. "Economy" is one of the classes of topics most often dealt with by the Chamber of Deputies, as it is frequently the subject of legislation (particularly with the question of taxation). This class of topics is also frequently present because it relates to many sectors (agriculture, trade agreements, industrialisation, etc.). "Train/communications" reflects the significant investment in the development of communications infrastructures, and the creation of the French railway network, one of the most developed in Europe at the beginning of the twentieth century. More generally, an examination of these figures confirms the coherence of the topics we have identified: they are quite consistent with the major themes that marked political life during the early Third Republic <ref type="bibr" target="#b19">[20]</ref>.</p><p>Beyond this simple comparison between their weight in the corpus, we see (Figure <ref type="figure" target="#fig_0">1</ref>) that the various classes have unequal variances. Let us consider the topic "army". While the quartiles are not extremely far from the median, there are some strong outliers. They can go up to 0.2, and 6 of them are greater than 0.1. 
It seems that when "army" is the main topic, it tends to become hegemonic. At the two extremes, the topic "budget" has a very high variance, while "school" never becomes really prominent: its maximum never exceeds 0.07.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Distribution of the topic "army" over time</head><p>Figure <ref type="figure">2</ref> shows, for every month, the percentage of the vocabulary that belongs to a given class of topics. As expected, this signal is quite noisy, since many topics are discussed over short periods of time. However, it is possible to notice some patterns in these graphs.</p><p>The topic "army" was very popular at four different times: first in 1881, then around 1884, then in 1887-1888, and finally around 1895. These four peaks can be explained perfectly well in terms of military history. The conquest of Tunisia began in 1881, with the intervention of the French army in Kroumirie, located in northwestern Tunisia. The year 1884 was marked by two events concerning the army: the Tonkin campaign, and discussions on reducing military service to three years. In 1887 and 1888, discussions resumed on the reform of military service. The year 1887 was also marked by renewed tensions with Germany (Schnaebelé affair). In 1895, the difficult conquest of Madagascar gave rise to new debates concerning the French colonial armies.</p><p>For the sake of comparison, we present in Figure <ref type="figure">2</ref> the evolution of other topics over time, namely "law enforcement", "school" and "colonies". Observe that "law enforcement" remains a significant proportion of the corpus, while "school" and "colonies" are almost non-existent over long periods. We can clearly see that these three graphs show different patterns, confirming that each topic captures distinct phenomena. 
For instance, "law enforcement" and "school" are more represented in the first half of the period, while "colonies" remains in the background of the corpus and gains in importance in the second half of the period, with strong peaks in 1886, 1895 and at the very end of the 1890s.</p><p>Figures <ref type="figure" target="#fig_2">3 and 4</ref> show the distribution of the topic "army", respectively for all the years considered, and for the year 1884 during which the topic is particularly present. In Figure <ref type="figure">3</ref>, several peaks can be seen that were not visible in the previous figure. These peaks can be explained by the colonial policy conducted by France, by the military reforms that took place during the period, and by the Dreyfus affair. In 1888 and 1889, the law reducing the length of military service was discussed and voted on. The years 1892 and 1893 were marked by the continuation of colonial conquests (Comoros, Tunisia, Sudan, Dahomey, Ivory Coast, Siam). The Dreyfus affair began in 1894 and continued until the end of the period studied. In 1897, the borders of the French colonial empire were stabilised with the last conquests (Indochina and Madagascar), and the Franco-Russian military alliance was affirmed in case of war.</p><p>Figure <ref type="figure" target="#fig_2">4</ref> shows that 1884 was a year in which there were several intense discussions about the army in the Chamber of Deputies. Two issues occupied the MPs. On the one hand, they deliberated on a bill to reduce the length of military service between April and June (see peaks in April and June). On the other hand, they also had to discuss the military operations carried out by France in Tonkin. 
The way in which the government conducted this war of conquest can be seen in the shape of the graph: (1) the government asked for new credits in February, which led to heated debates; (2) the government sought to increase the number of colonial troops to satisfy its ambitions and proposed a project to this effect in June; (3) the Chamber discussed the budget in December, and in particular the credits allocated to colonial troops. </p></div>
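<div xmlns="http://www.tei-c.org/ns/1.0"><p>The monthly signal discussed in this section (the share of the vocabulary that belongs to a topic, month by month) can be computed along the following lines. This is a sketch under assumptions: each document is taken to carry a date, a word count and its topic proportions, and the field names and figures below are invented for the example.</p><p>
```python
from collections import defaultdict


def monthly_share(documents, topic):
    """Return {(year, month): share of the month's words assigned to `topic`}."""
    topic_words = defaultdict(float)
    total_words = defaultdict(float)
    for doc in documents:
        key = (doc["year"], doc["month"])
        total_words[key] += doc["n_words"]
        # Weight the topic proportion by document length, so that long
        # debates count for more than short ones.
        topic_words[key] += doc["n_words"] * doc["topics"].get(topic, 0.0)
    return {key: topic_words[key] / total_words[key] for key in sorted(total_words)}


# Hypothetical documents: dates, lengths and topic proportions are made up.
documents = [
    {"year": 1884, "month": 2, "n_words": 2000, "topics": {"army": 0.30, "budget": 0.20}},
    {"year": 1884, "month": 2, "n_words": 1000, "topics": {"budget": 0.60}},
    {"year": 1884, "month": 6, "n_words": 3000, "topics": {"army": 0.10}},
]
shares = monthly_share(documents, "army")
```
</p><p>Weighting by document length is a design choice of this sketch; averaging the proportions of the documents of a month without weights would be an equally simple alternative.</p></div>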
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Cross study of the topics' prevalence</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Time-based correlation between topics</head><p>We are now looking for high co-intensity topics, i.e. topics that tend to be frequently associated with each other in specific parts of the corpus. Correlation is based on the number of text units that contain a significant percentage of both topics. These text units are not defined semantically, but on the basis of a fixed-length window of 6000 characters.</p><p>We create a first indicator by dividing our corpus according to the date of production of the texts composing it. This allows us to divide the corpus into smaller segments (by month or by year), and then determine whether a high proportion of the topic "army" is correlated with a high or low proportion of other topics in the same period. We take the average weight of each topic per month and calculate the Pearson correlation coefficient between the topic "army" and all other topics. We then observe that the intensity of the topic "army" over the course of a month (Figure <ref type="figure" target="#fig_3">5</ref>) is positively correlated with the following topics: "colonies", "navy", "foreign affairs". Perhaps most surprising is the strong correlation with the topic "school". However, this correlation is rather weak in absolute terms. In the course of a given period (even a single parliamentary sitting, i.e. a single day), many different issues are addressed by MPs. Information therefore tends to be spread across most topics. In particular, some topics such as the names of MPs, the departments they come from, and the vocabulary describing the functioning of parliament, are evenly spread over the period. We therefore decided to look for a correlation at the lowest level. 
We are in fact looking to answer the following question: what proportion of the blocks that address the topic "army" with high intensity (more than 15% of the vocabulary) also address another given topic with high intensity?</p><p>We find a very strong correlation between army and navy (15.5% of documents with a high proportion of the topic "navy" also have a high proportion of the topic "army"), followed by "colonies", "school", "law enforcement" and "budget" (see Table <ref type="table" target="#tab_2">3</ref> and Figure <ref type="figure" target="#fig_4">6</ref>). Since the topic "government/parliament" is fairly evenly distributed throughout the corpus, other topics such as "working class", "alcohol" or "local politics" are almost completely disconnected from the topic "army". The case of the names of the MPs and the departments is specific, as these two  </p></div>
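<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two indicators described in this section can be sketched as follows: a Pearson coefficient between two series of monthly topic weights, and the proportion of text blocks with a high share of one topic that also have a high share of another. The 6000-character window and the 15% threshold come from the text above; the function names, the use of the same threshold for both topics and the toy figures are assumptions of this sketch.</p><p>
```python
from math import sqrt


def fixed_windows(text, size=6000):
    """Cut a text into consecutive blocks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def co_intensity(blocks, topic_a, topic_b, threshold=0.15):
    """Share of the blocks high in `topic_a` that are also high in `topic_b`."""
    high_a = [b for b in blocks if b.get(topic_a, 0.0) > threshold]
    if not high_a:
        return 0.0
    both = [b for b in high_a if b.get(topic_b, 0.0) > threshold]
    return len(both) / len(high_a)


# Made-up monthly weights and block-level topic proportions.
army_by_month = [0.02, 0.05, 0.03, 0.08]
colonies_by_month = [0.01, 0.04, 0.02, 0.06]
r = pearson(army_by_month, colonies_by_month)

blocks = [
    {"army": 0.30, "navy": 0.20},
    {"army": 0.25, "budget": 0.40},
    {"navy": 0.50},
    {"army": 0.05, "navy": 0.30},
]
army_then_navy = co_intensity(blocks, "army", "navy")
navy_then_army = co_intensity(blocks, "navy", "army")
```
</p><p>The second indicator is not symmetric: conditioning on "army" and conditioning on "navy" generally give different proportions, so the direction has to be stated when a figure is reported.</p></div>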
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Study of topics with a strong correlation with the topic "army"</head><p>We aim here to focus on the strong correlations we have just identified. We will examine the nature of these correlations by "close reading" <ref type="bibr" target="#b9">[10]</ref> the texts we are studying. We will also check the validity of these results in the light of current historical knowledge. Figure <ref type="figure" target="#fig_5">7</ref> shows that there is a strong correlation between "army" and "school"; this is mainly related to the debates on the reform of military service. The law of 1872 had established a military service that could last up to five years, but with a certain number of exemptions (teachers, students of grandes écoles<ref type="foot" target="#foot_8">10</ref> and seminarians <ref type="bibr" target="#b15">[16]</ref>). The association of "army" with "school" refers to the exemptions granted to teachers and students of grandes écoles, which the Republican MPs wanted to put an end to.</p><p>This correlation was most intense in 1887, although the law removing these exemptions and reforming the military service was passed in 1889. This was because this law was in the making from the early 1880s <ref type="bibr" target="#b15">[16]</ref>. Between 1876 and 1889, there were twelve bills related to this issue <ref type="bibr" target="#b20">[21]</ref>. But it was really in 1887, with the renewed tensions between Germany and France, that the Ministry of War, in agreement with the Chamber of Deputies, decided to transform the law of 1872 <ref type="bibr" target="#b15">[16]</ref>. The report on this project was proposed and discussed between June and July 1887. 
After passing through the Senate, the bill was presented to the MPs again in December 1888 and voted on in January 1889 <ref type="bibr" target="#b20">[21]</ref>.</p><p>The topic "law enforcement" includes vocabulary related to the creation of the law, as well as references to punishments and means of control inside the military. Figure <ref type="figure" target="#fig_5">7</ref> reflects the intense legislative activity relating to the army during this period. Discussions about military service laws explain its strong correlation with "army" between 1881 and 1889; they also cause a strong peak in 1887 for the same reason.</p><p>These recruitment issues were also linked to the international context. The increase in the intensity of the association of "army" with "foreign affairs" in 1894-1895 (see Figure <ref type="figure" target="#fig_6">8</ref>) can be explained by the introduction of a law in November 1894 extending the duration of incorporation to two years, in order to increase the army's strength <ref type="bibr" target="#b15">[16]</ref>. This was a reaction to the changes that were taking place in the German military, whose growing power frightened the MPs. Between 1893 and 1894, the number of German soldiers increased following the introduction of the two-year service. Our model captures this trend well by clearly associating the topic "army" with the topic "foreign affairs". The peak in 1895 can be explained by the return of the project to the Chamber in June 1895, as the German strength had just increased by 70000 men <ref type="bibr" target="#b15">[16]</ref>. The association between "army" and "foreign affairs" also reveals the competition with Great Britain in colonial affairs. In 1884, France took control of Annam, while Great Britain extended its influence over Burma. Both imperialisms were in contact with South China, which led to tensions, especially in 1885 over Siam. 
In 1885, there were also strong tensions with the British over the growing French appetite for Madagascar. These tensions were also high at the time of the second Madagascar expedition in 1895 <ref type="bibr" target="#b21">[22]</ref> (see peaks in 1885 and 1895 in Figure <ref type="figure" target="#fig_6">8</ref>). The peak in 1881 can also be explained by tensions with another European competitor for the conquest of new territory: a Franco-Italian crisis broke out in June following the Treaty of Bardo, which placed Tunisia under the French protectorate.</p><p>We also note the correlation of the topic "army" with the topic "colonies". This correlation refers to the crucial role of the army in the acquisition and defence of colonies. This association follows a pattern quite similar to the previous association (see Figure <ref type="figure" target="#fig_6">8</ref>): the topic is present throughout the period, but with a strong intensity in the early years (1884-1885), and a second peak in the mid-1890s. Our model succeeds in capturing the way in which the executive power imposed its colonialist policy on Parliament. After the defeat of 1870, the colonial enterprises were blamed for the domestic defeat, as they were said to have taken away the men and funds needed for national defence. Public opinion, and the MPs with it, was at best indifferent, at worst hostile, to new conquests. The arrival in power of the opportunist Republicans in 1879 nevertheless saw the renewal of colonial expansion, which resumed in 1880 and continued intensively until 1885. This policy of conquest was carried out in parallel on several fronts: notably in Tunisia (1880-1881), Annam and Tonkin (1883-1885), not to mention Sudan, Congo and Madagascar <ref type="bibr" target="#b22">[23]</ref>. The government therefore had to "trick" public opinion and Parliament, and act on the sly to conceal the extent of its ambitions. 
Then, as the difficulties accumulated, it gradually obtained an increase in credits and the sending of increasingly large reinforcements, and irresistibly dragged the MPs into the spiral of conquest <ref type="bibr" target="#b21">[22]</ref>.</p><p>The 1884-1885 peak in Figure <ref type="figure" target="#fig_6">8</ref> is explained by the launching of the Tonkin expedition, for which the government asked the Chamber for new credits and troops in 1884 and early 1885. Debates were particularly intense on this subject in 1885, as the difficulties encountered by the French army (Retreat from Lạng Sơn at the end of March 1885) led to an outcry in the Chamber and the fall of the government <ref type="bibr" target="#b15">[16]</ref>. The Chamber elected in 1885 was more anti-colonialist than the previous one and avoided any colonial adventure of the importance of Tonkin; but from 1890 onwards, the opposition began to diminish until it disappeared. The very principle of colonisation was progressively accepted and increasingly supported by the MPs <ref type="bibr" target="#b22">[23]</ref>, even if this did not prevent stormy debates in the assembly. The intensity of the correlation between "army" and "colonies" from 1894 to 1896 is mainly explained by the second expedition led by the French army in Madagascar. In November 1894, the government submitted a request for credits to send an expeditionary corps to the island. This expedition was partly a failure; and in March 1895, the government was interpellated by the Chamber about the pitiful state of the troops. In July 1895, the conquest of Madagascar was resumed, but it stalled. A text was then presented to the Chamber to reform the recruitment of the colonial armies <ref type="bibr" target="#b15">[16]</ref>.</p><p>Let us examine the year 1896 in particular. We can see that the correlation between "army" and "colonies" is quite strong. In March and July 1896, a bill on colonial armies was discussed in the Chamber. 
It is interesting to note that, in its second version, the bill proposed to entrust the entire management of colonial units to the Navy <ref type="bibr" target="#b15">[16]</ref>. The text shows the birth of a new trend in the Chamber in favour of this branch of the armed forces. On 27 October 1896, the government proposed a new bill on the colonial army, which the Navy would be responsible for, as it was the only one capable of ensuring the continuity of transport and logistics<ref type="foot" target="#foot_9">11</ref> <ref type="bibr" target="#b15">[16]</ref>. This explains the strong correlation between the topics "army" and "navy" in 1896 (see Figure <ref type="figure" target="#fig_7">9</ref>). The topics "army" and "navy" are frequently associated, whether for cooperation (the Navy transports colonial troops) or competition between the two branches of the military. This association was rather weak during the 1880s but reached a peak in 1896. Until 1895, the Navy had been relatively indifferent to colonial troops. In June 1895, however, the Minister of the Navy claimed responsibility for the management of colonial units from the Ministry of War. This request was the consequence of the rivalry between the two branches over Madagascar, as the Navy could not bear the idea that the Army had taken charge of the expedition <ref type="bibr" target="#b15">[16]</ref>. The financial competition between the two branches was becoming tougher, especially as the Navy needed new investments to modernise the fleet and train staff <ref type="bibr" target="#b23">[24]</ref>. This is why in 1896 a great wave of legislative reforms was launched concerning the organisation of the Navy, notably the creation of a naval school <ref type="bibr" target="#b23">[24]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>The results of our study show the validity of topic modelling for the analysis of parliamentary debates. This confirms the interest of using such a method to facilitate the analysis of this major historical source. This study also allows us to draw a number of interesting insights on parliamentary debates, which we wish to explore further.</p><p>We see that the weight of the descriptive vocabulary of parliamentary activity itself is very important in the corpus; but this problem is rather well solved thanks to the topic model. We then observe that parliamentary debates follow their own rhythm. This rhythm is in fact imposed by the legislative process, which requires long debates before a law is finally voted. This means that subjects can be dealt with by the Chamber of Deputies long before they become newsworthy. Conversely, issues that make the news are rarely discussed during parliamentary sittings; they are usually dealt with long after they have made the headlines. Topic modelling therefore seems to us to be a method that makes it easier to identify underlying political trends.</p><p>We also note the reactive nature of parliamentary work: this means that a major legislative effort can take place a few months or even several years after the triggering events (as shown in <ref type="bibr" target="#b24">[25]</ref> for instance). Finally, there is another consequence of the way legislative work is carried out, namely the weight that discussions on sensitive subjects can take on, without leading to a vote or the production of a law. The Chamber can indeed seize on a subject to interpellate the government; this is for instance the case of Tonkin after the Retreat from Lạng Sơn in 1885.</p><p>While encouraging, these results are still preliminary. We are working in two directions to further complement and improve them. 
In order to obtain information on more specific and hopefully unexpected correlations (e.g. the role of the church in the army, or the influence of the executive branch), we will use additional tools, such as word embedding, to further divide the corpus into a few hundred groups, some of them very specific, and to study their life cycle in relation to the army. To improve our model, we are planning to enlarge the period studied, and to work on a less faulty corpus. Within the framework of the AGODA project, we are thus evaluating the solutions available to us to improve the results of the OCR, hoping to further enhance these first results.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Comparative variances of several topic classes</figDesc><graphic coords="8,126.31,106.46,340.17,249.95" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Distribution of four different topics over time</figDesc><graphic coords="9,140.49,106.56,311.81,209.68" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Distribution of the topic "army" over time, per day (1884)</figDesc><graphic coords="10,113.39,106.95,368.52,257.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Correlation between the topic "army" and the other identified topics (by month)</figDesc><graphic coords="11,127.56,106.95,340.17,266.61" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Graph of topics with high correlation</figDesc><graphic coords="12,127.56,106.46,340.20,344.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Cross-topics between "army", "school" and "law enforcement"</figDesc><graphic coords="13,111.35,106.95,184.26,138.19" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Cross-topics between "army", "colonies" and "foreign affairs"</figDesc><graphic coords="13,111.35,448.83,184.26,138.19" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: Cross-topics between "army" and "navy".</figDesc><graphic coords="15,205.51,173.07,184.26,138.19" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>List of labeled topics and contributions in the corpus. Some very similar topics are put together in a single class for the sake of readability.</figDesc><table><row><cell>Label</cell><cell>Topics</cell><cell>Examples of words</cell><cell>Weight in the corpus</cell></row><row><cell>Names of MPs</cell><cell>0,5,10,14,18,23,35,37,39</cell><cell>Duval, Sigismond, Jules, Martin</cell><cell>0.101</cell></row><row><cell>government/parliament</cell><cell>1,6,9,13,17,19,22,36,38,41,45,46,49</cell><cell>tribune, projet, adoption, majorité</cell><cell>0.284</cell></row><row><cell>economy</cell><cell>2,4,16</cell><cell>agriculture, commerce, patente, betterave</cell><cell>0.069</cell></row><row><cell>working class</cell><cell>7,8,31,34</cell><cell>travailler, salaire, usine, mutuelle</cell><cell>0.070</cell></row><row><cell>army</cell><cell>11,48</cell><cell>général, régiment, contrôle, militaire</cell><cell>0.041</cell></row><row><cell>department</cell><cell>12</cell><cell>Calais, Alpes, Saône, Charente</cell><cell>0.006</cell></row><row><cell>trains/communications</cell><cell>15,44</cell><cell>télégraphe, ingénieur, train, travaux</cell><cell>0.069</cell></row><row><cell>local politics</cell><cell>20,33</cell><cell>ville, arrondissement, local, département</cell><cell>0.030</cell></row><row><cell>law enforcement</cell><cell>21,40</cell><cell>police, préfet, tribunal, délit</cell><cell>0.055</cell></row><row><cell>school</cell><cell>24</cell><cell>lycée, faculté, classe, enfant</cell><cell>0.023</cell></row><row><cell>alcohol</cell><cell>25</cell><cell>bouilleur, degré, raisin, octroi</cell><cell>0.019</cell></row><row><cell>budget</cell><cell>26,29,30,43</cell><cell>chiffre, budget, dépense, exercice</cell><cell>0.097</cell></row><row><cell>colonies</cell><cell>28</cell><cell>métropole, juif, algérien, tonkin</cell><cell>0.018</cell></row><row><cell>navy</cell><cell>32</cell><cell>marin, flotte, mer, bâtiment</cell><cell>0.021</cell></row><row><cell>building works</cell><cell>27,42</cell><cell>construction, théâtre, hectare, terrain</cell><cell>0.024</cell></row><row><cell>foreign affairs</cell><cell>47</cell><cell>puissance, Madagascar, Angleterre, traité</cell><cell>0.034</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Number of topics with more than 15% of vocabulary from "army" and more than 15% from another topic</figDesc><table><row><cell>topic name</cell><cell>army</cell><cell>MPs</cell><cell>gov./parl.</cell><cell>economy</cell><cell>work. class</cell><cell>depart</cell><cell>trains</cell><cell>local pol.</cell></row><row><cell>topic + army</cell><cell>1737</cell><cell>5</cell><cell>1009</cell><cell>74</cell><cell>48</cell><cell>1</cell><cell>85</cell><cell>31</cell></row><row><cell>topic only</cell><cell>1737</cell><cell>11651</cell><cell>17731</cell><cell>2489</cell><cell>2554</cell><cell>534</cell><cell>2559</cell><cell>3678</cell></row><row><cell>proportion</cell><cell>100%</cell><cell>0.04%</cell><cell>5.69%</cell><cell>2.97%</cell><cell>1.88%</cell><cell>0.19%</cell><cell>3.32%</cell><cell>0.84%</cell></row><row><cell>topic name</cell><cell>law enf.</cell><cell>school</cell><cell>alcohol</cell><cell>budget</cell><cell>colonies</cell><cell>navy</cell><cell>building</cell><cell>foreign affairs</cell></row><row><cell>topic + army</cell><cell>202</cell><cell>69</cell><cell>1</cell><cell>310</cell><cell>62</cell><cell>124</cell><cell>33</cell><cell>65</cell></row><row><cell>topic only</cell><cell>2968</cell><cell>923</cell><cell>587</cell><cell>4678</cell><cell>723</cell><cell>800</cell><cell>962</cell><cell>1404</cell></row><row><cell>proportion</cell><cell>6.81%</cell><cell>7.48%</cell><cell>0.17%</cell><cell>6.63%</cell><cell>8.58%</cell><cell>15.50%</cell><cell>3.43%</cell><cell>4.63%</cell></row></table></figure>
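<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a minimal sketch, the "proportion" row of Table 3 can be recomputed from its two count rows by dividing each "topic + army" count by the corresponding "topic only" count. The counts below are transcribed from the table; the names counts, proportion and ranked are illustrative assumptions and do not come from the project's source code.</p>
<p>
```python
# Counts transcribed from Table 3 as (topic + army, topic only).
# All identifiers here are illustrative, not taken from the authors' code.
counts = {
    "MPs": (5, 11651),
    "gov./parl.": (1009, 17731),
    "economy": (74, 2489),
    "work. class": (48, 2554),
    "depart": (1, 534),
    "trains": (85, 2559),
    "local pol.": (31, 3678),
    "law enforcement": (202, 2968),
    "school": (69, 923),
    "alcohol": (1, 587),
    "budget": (310, 4678),
    "colonies": (62, 723),
    "navy": (124, 800),
    "building": (33, 962),
    "foreign affairs": (65, 1404),
}


def proportion(shared, total):
    """Return the share of a topic's count that is shared with 'army', in percent."""
    return round(100 * shared / total, 2)


# Percentage for every topic, then topics ordered from strongest to weakest overlap.
proportions = {name: proportion(s, t) for name, (s, t) in counts.items()}
ranked = sorted(proportions, key=proportions.get, reverse=True)

for name in ranked:
    print(f"{name:16s} {proportions[name]:6.2f}%")
```
</p>
<p>Ordering the topics by this ratio rather than by the raw counts keeps large classes such as "gov./parl." from dominating simply because of their size; with the values of Table 3, "navy" and "colonies" come first.</p></div>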
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Digital Parliamentary Data in Action (DiPaDA 2022) workshop, Uppsala, Sweden, March 15, 2022.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">There were in fact two parliamentary cycles between 1876 and 1881 following the dissolution of the assembly in June 1877.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">Between 1876 and 1880, the Annales du Sénat et de la Chambre des députés recorded the debates in both chambers. Until 1880, they were printed by a private printer, Alfred Wittersheim. From January 1881 onwards, the French state took over the printing and then published the parliamentary debates in the Journal officiel.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">This is the name given to the transcripts of the debates in Great Britain and Commonwealth countries. Thomas C. Hansard (1776-1833) was the first official publisher.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">Analyse sémantique et Graphes relationnels pour l'Ouverture et l'étude des Débats à l'Assemblée nationale.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_5">The nickname of the French army was the "Grande Muette" at the time, which meant that soldiers remained "mute" on political issues, in order to avoid any risk of political destabilisation.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_6">They can be retrieved with the following API: https://api.bnf.fr/fr/api-document-de-gallica#/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_7">The following digitised image illustrates this problem.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_8">For more information on the French system of grandes écoles, please see this Wikipedia article.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_9">The project was finally rejected in December.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We would like to thank the Bibliothèque nationale de France for its support in the framework of the BnF DataLab.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Online Resources</head><p>The data and source code are available via GitHub.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">L&apos;invention du compte rendu intégral des débats en France (1789-1848)</title>
		<author>
			<persName><forename type="first">H</forename><surname>Coniez</surname></persName>
		</author>
		<idno type="DOI">10.3917/parl.014.0146</idno>
	</analytic>
	<monogr>
		<title level="j">Parlement[s], Revue d&apos;histoire politique</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="146" to="159" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Scriptes de la démocratie : les sténographes et rédacteurs des débats (1848-2005)</title>
		<author>
			<persName><forename type="first">D</forename><surname>Gardey</surname></persName>
		</author>
		<idno type="DOI">10.4000/sdt.13695</idno>
	</analytic>
	<monogr>
		<title level="j">Sociologie du travail</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Les débats parlementaires au service de l&apos;histoire politique</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ouellet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Roussel-Beaulieu</surname></persName>
		</author>
		<idno type="DOI">10.7202/1060736ar</idno>
	</analytic>
	<monogr>
		<title level="j">Bulletin d&apos;histoire politique</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="23" to="40" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Lemercier</surname></persName>
		</author>
		<ptr target="https://halshs.archives-ouvertes.fr/halshs-0010745" />
		<title level="m">Le vocabulaire des débats sur la loi de 1841 sur le travail des enfants : Premiers résultats sur la chambre des pairs, mars 1840</title>
				<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="4" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>De Galembert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Rozenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Vigour</surname></persName>
		</author>
		<title level="m">Faire parler le parlement: méthodes et enjeux de l&apos;analyse des débats parlementaires pour les sciences sociales</title>
				<meeting><address><addrLine>Issy-les-Moulineaux</addrLine></address></meeting>
		<imprint>
			<publisher>LGDJ-Lextenso éditions</publisher>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">La majorité politique : Étude des débats parlementaires sur la fixation d&apos;un seuil</title>
		<author>
			<persName><forename type="first">B</forename><surname>Fournier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Pépratx</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Age et politique, La vie politique</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Percheron</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Rémond</surname></persName>
		</editor>
		<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<publisher>Economica</publisher>
			<date type="published" when="1991">1991</date>
			<biblScope unit="page" from="85" to="110" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">From antagonist to protagonist: &apos;democracy&apos; and &apos;people&apos; in British parliamentary debates, 1775-1885</title>
		<author>
			<persName><forename type="first">H</forename><surname>Bonin</surname></persName>
		</author>
		<idno type="DOI">10.1093/llc/fqz082</idno>
	</analytic>
	<monogr>
		<title level="j">Digital Scholarship in the Humanities</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="759" to="775" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The Hansard hazard: gauging the accuracy of British parliamentary transcripts</title>
		<author>
			<persName><forename type="first">S</forename><surname>Mollin</surname></persName>
		</author>
		<idno type="DOI">10.3366/cor.2007.2.2.187</idno>
	</analytic>
	<monogr>
		<title level="j">Corpora</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="187" to="201" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Vers de nouveaux modes de lecture des sources</title>
		<author>
			<persName><forename type="first">F</forename><surname>Clavert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Le temps des humanités digitales</title>
				<editor>
			<persName><forename type="first">O</forename><surname>Le Deuff</surname></persName>
		</editor>
		<meeting><address><addrLine>Roubaix</addrLine></address></meeting>
		<imprint>
			<publisher>FYP EDITIONS</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Distant Reading</title>
		<author>
			<persName><forename type="first">F</forename><surname>Moretti</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>Verso</publisher>
			<pubPlace>London</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Agoda : Analyse sémantique et graphes relationnels pour l&apos;ouverture et l&apos;étude des débats à l&apos;assemblée nationale</title>
		<author>
			<persName><forename type="first">M</forename><surname>Puren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vernus</surname></persName>
		</author>
		<ptr target="https://hal.archives-ouvertes.fr/hal-03382765" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Graham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Milligan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Weingart</surname></persName>
		</author>
		<title level="m">Exploring big historical data: the historian&apos;s macroscope</title>
				<meeting><address><addrLine>London</addrLine></address></meeting>
		<imprint>
			<publisher>Imperial College Press</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<author>
			<persName><forename type="first">G</forename><surname>Lavenir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bourgeois</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Old people, video games and French press: a topic model approach on a study about discipline, entertainment and self-improvement</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898-1920</title>
		<author>
			<persName><forename type="first">L</forename><surname>Viola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Verheul</surname></persName>
		</author>
		<idno type="DOI">10.1093/llc/fqz068</idno>
	</analytic>
	<monogr>
		<title level="j">Digital Scholarship in the Humanities</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="921" to="943" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<idno type="DOI">10.4000/books.psorbonne.61562</idno>
		<title level="m">Militaires en République, 1870-1962</title>
				<editor>
			<persName><forename type="first">O</forename><surname>Forcade</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">É</forename><surname>Duhamel</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Vial</surname></persName>
		</editor>
		<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<publisher>Éditions de la Sorbonne</publisher>
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">J.-C</forename><surname>Jauffret</surname></persName>
		</author>
		<title level="m">Parlement, gouvernement, commandement : l&apos;armée de métier sous la 3è république 1871-1914</title>
				<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1987">1987</date>
		</imprint>
		<respStmt>
			<orgName>Université de Paris I Panthéon Sorbonne</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Latent Dirichlet allocation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Topic modeling and digital humanities</title>
		<author>
			<persName><forename type="first">D</forename><surname>Blei</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Digital Humanities</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Topics over time: a non-Markov continuous-time model of topical trends</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>McCallum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting>the 12th ACM SIGKDD international conference on Knowledge discovery and data mining</meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="424" to="433" />
		</imprint>
	</monogr>
	<note>KDD&apos;06</note>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Mayeur</surname></persName>
		</author>
		<title level="m">Les débuts de la IIIe République 1871-1898</title>
				<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<publisher>Editions du Seuil</publisher>
			<date type="published" when="1973">1973</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Crépin</surname></persName>
		</author>
		<title level="m">Défendre la France : Les Français, la guerre et le service militaire, de la guerre de Sept Ans à Verdun</title>
				<meeting><address><addrLine>Rennes</addrLine></address></meeting>
		<imprint>
			<publisher>Presses universitaires de Rennes</publisher>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Battesti</surname></persName>
		</author>
		<title level="m">La Marine au XIXe siècle. Interventions extérieures et colonies</title>
				<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<publisher>Du May</publisher>
			<date type="published" when="1993">1993</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Bouche</surname></persName>
		</author>
		<title level="m">Histoire de la colonisation française. Flux et reflux (1815-1962)</title>
				<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
	<note>Le Grand livre du mois</note>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Monaque</surname></persName>
		</author>
		<title level="m">Une histoire de la marine de guerre française</title>
				<meeting><address><addrLine>Paris</addrLine></address></meeting>
		<imprint>
			<publisher>Perrin</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<author>
			<persName><forename type="first">J</forename><surname>Alerini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Olteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ridgway</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Markov and the Duchy of Savoy: Segmenting a century with regime-switching models</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
