<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="it">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">OCTIS 2.0: Optimizing and Comparing Topic Models in Italian Is Even Simpler!</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
							<email>s.terragni4@campus.unimib.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Milano-Bicocca</orgName>
								<address>
									<settlement>Milan</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
							<email>elisabetta.fersini@unimib.it</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Milano-Bicocca</orgName>
								<address>
									<settlement>Milan</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">OCTIS 2.0: Optimizing and Comparing Topic Models in Italian Is Even Simpler!</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4F13539E9F36E4339FFE36263A840BC7</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T03:44+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>English.</head><p>OCTIS is an open-source framework for training, evaluating and comparing Topic Models. This tool uses single-objective Bayesian Optimization (BO) to optimize the hyper-parameters of the models and thus guarantee a fairer comparison. Yet, a single-objective approach disregards that a user may want to simultaneously optimize multiple objectives. We therefore propose OCTIS 2.0: the extension of OCTIS that addresses the problem of estimating the optimal hyper-parameter configurations for a topic model using multi-objective BO. Moreover, we also release and integrate two pre-processed Italian datasets, which can be easily used as benchmarks for the Italian language.</p><p>Italiano. OCTIS è un framework open-source per il training, la valutazione e la comparazione di Topic Models. Questo strumento utilizza l'ottimizzazione Bayesiana (BO) a singolo obiettivo per ottimizzare gli iperparametri dei modelli e quindi garantire una comparazione più equa. Tuttavia, questo approccio ignora che un utente potrebbe voler ottimizzare più di un obiettivo. Proponiamo perciò OCTIS 2.0: l'estensione di OCTIS che affronta il problema della stima delle configurazioni ottimali degli iperparametri di un topic model usando la BO multi-obiettivo. In aggiunta, rilasciamo e integriamo anche due nuovi dataset in italiano preprocessati, che possono essere facilmente utilizzati come benchmark per la lingua italiana.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="it">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Topic models are statistical methods that aim to extract the hidden topics underlying a collection of documents <ref type="bibr" target="#b5">(Blei et al., 2003;</ref><ref type="bibr" target="#b6">Blei, 2012;</ref><ref type="bibr" target="#b7">Boyd-Graber et al., 2017)</ref>. Topics are often represented by sets of words that make sense together, e.g. the words "cat, animal, dog, mouse" may represent a topic about animals. Topic models' evaluations are usually limited to the comparison of models whose hyper-parameters are held fixed <ref type="bibr" target="#b9">(Doan and Hoang, 2021;</ref><ref type="bibr" target="#b23">Terragni et al., 2020a;</ref><ref type="bibr" target="#b24">Terragni et al., 2020b)</ref>. However, hyper-parameters can have a substantial impact on the models' performance, and fixing them therefore prevents researchers from discovering the best topic model on the selected dataset.</p><p>Recently, OCTIS <ref type="bibr" target="#b25">(Terragni et al., 2021a</ref>, Optimizing and Comparing Topic Models is Simple) has been released: a comprehensive and open-source framework for training, analyzing, and comparing topic models over several datasets and evaluation metrics. OCTIS determines the optimal hyper-parameter configuration according to a Bayesian Optimization (BO) strategy <ref type="bibr" target="#b1">(Archetti and Candelieri, 2019;</ref><ref type="bibr" target="#b21">Snoek et al., 2012;</ref><ref type="bibr" target="#b11">Galuzzi et al., 2020)</ref>. The framework already provides several features and resources, including at least 8 topic models, 4 categories of evaluation metrics, and 4 pre-processed datasets. However, the framework uses a single-objective Bayesian optimization approach, disregarding that a user may want to simultaneously optimize more than one objective <ref type="bibr" target="#b22">(Terragni and Fersini, 2021)</ref>. 
For example, a user may be interested in obtaining topics that are coherent but also diverse and separated from each other.</p><p>OCTIS <ref type="bibr">(Terragni et al., 2021a</ref>, Optimizing and Comparing Topic Models is Simple!) is an open-source evaluation framework for the comparison of topic models that allows a user to optimize the models' hyper-parameters for a fair experimental comparison. The framework is composed of different modules that interact with each other: (1) dataset and pre-processing tools, (2) topic modeling, (3) hyper-parameter optimization, and (4) evaluation metrics. OCTIS can be used both as a Python library and through a web dashboard. It also provides a set of pre-processed datasets, state-of-the-art topic models and several evaluation metrics.</p><p>We will now briefly describe the two components that we extend in this work: the pre-processed datasets and the hyper-parameter optimization module.</p><p>Pre-processing and Datasets. OCTIS currently provides functionalities for pre-processing the texts, which include the lemmatization of the text, the removal of punctuation, numbers and stop-words, and the removal of words based on their frequency. Moreover, the framework already provides 4 pre-processed datasets that are ready to use for topic modeling. These datasets are 20 NewsGroups,<ref type="foot" target="#foot_0">1</ref> M10 (Lim and Buntine, 2014), DBLP,<ref type="foot" target="#foot_1">2</ref> and BBC News <ref type="bibr" target="#b12">(Greene and Cunningham, 2006)</ref>. All the datasets are split into three partitions: training, testing and validation.</p><p>All the currently provided datasets are in English. OCTIS already provides language-specific pre-processing tools (e.g. lemmatizers for multiple languages), but it does not include datasets in other languages. 
Creating benchmark datasets for other languages is useful for investigating the peculiarities of different topic modeling methods.</p><p>Single-Objective Hyper-parameter Optimization. OCTIS uses single-objective Bayesian Optimization <ref type="bibr" target="#b21">(Snoek et al., 2012;</ref><ref type="bibr" target="#b20">Shahriari et al., 2015)</ref> to tune the topic models' hyper-parameters with respect to a selected evaluation metric. In particular, the user specifies the search space for the hyper-parameters and an objective metric. Then, BO sequentially explores the search space to determine the optimal hyper-parameter configuration. Since the models are usually probabilistic and can give different results with the same hyper-parameter configuration, the objective function is computed as the median value of the selected evaluation metric over a given number of model runs with the same hyper-parameter configuration. OCTIS uses the Scikit-Optimize library <ref type="bibr" target="#b13">(Head et al., 2018)</ref> for the implementation of the single-objective hyper-parameter Bayesian optimization.</p><p>The use of a single-objective approach is, however, limited: this strategy disregards all other objectives. For example, a user may want to optimize the coherence of the topics and their diversity at the same time.</p></div>
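The median-over-runs objective described above can be sketched in a few lines (a minimal illustration with a toy stochastic scoring function; `median_objective` and `noisy_coherence` are hypothetical names, not part of the OCTIS API):

```python
import random
import statistics

def median_objective(train_and_score, hyperparams, n_runs=5, seed=0):
    """Evaluate a stochastic model several times with the same
    hyper-parameter configuration and return the median metric value,
    smoothing out run-to-run variability as OCTIS does."""
    rng = random.Random(seed)
    scores = [train_and_score(hyperparams, rng) for _ in range(n_runs)]
    return statistics.median(scores)

# Toy stand-in for "train a topic model and compute a metric" (hypothetical).
def noisy_coherence(hyperparams, rng):
    base = 0.5 - abs(hyperparams["num_topics"] - 25) * 0.01
    return base + rng.uniform(-0.05, 0.05)

score = median_objective(noisy_coherence, {"num_topics": 25})
```

The fixed seed makes the evaluation reproducible; in a real BO loop, each candidate configuration would be scored this way before updating the surrogate model.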
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">OCTIS 2.0</head><p>New dataset resources for the Italian language. Since OCTIS provides only English datasets, we extend the set of datasets by including two new datasets in Italian. We build the two datasets from the Italian version of the Europarl dataset<ref type="foot" target="#foot_2">3</ref> and from the Italian abstracts of DBPedia.<ref type="foot" target="#foot_3">4</ref> In particular, we randomly sample 5000 documents from Europarl and we randomly sample 1000 Italian abstracts for each of 5 DBpedia types (event, organization, place, person, work), for a total of 5000 abstracts.</p><p>We pre-process the datasets using the following strategy: we lemmatize the text; we remove the punctuation, numbers and Italian stop-words; we filter out the words with a document frequency higher than 50% or lower than 0.1% for Europarl and 0.2% for DBPedia; and we remove the documents with fewer than 5 words. These values have been chosen by manually inspecting the resulting pre-processed datasets. We report the most relevant statistics of the novel Italian datasets in Table 1. Following the original paper, we split the datasets into three partitions: training (75%), validation (15%), and testing (15%).</p><p>From Single-objective to Multi-objective Hyper-parameter Bayesian Optimization. Given the limitations of the single-objective hyper-parameter optimization approach, we extend OCTIS by including a multi-objective approach <ref type="bibr" target="#b14">(Kandasamy et al., 2020;</ref><ref type="bibr" target="#b18">Paria et al., 2019)</ref>. Single-objective BO can in fact be generalized to multiple objective functions, where the final aim is to recover the Pareto frontier of the objective functions, i.e. the set of Pareto optimal points. A point is Pareto optimal if it cannot be improved in any of the objectives without degrading some other objective. 
Using a multi-objective hyper-parameter optimization approach thus allows us not only to identify the best performing model, but also to empirically discover competing objectives.</p><p>Since the original Scikit-Optimize library does not provide multi-objective optimization tools, we use the dragonfly library<ref type="foot" target="#foot_4">5</ref> <ref type="bibr" target="#b18">(Paria et al., 2019)</ref>. As in the single-objective case, the user must specify the hyper-parameter search space; in addition, they must also specify which functions they want to optimize. We report a simple coding example below. The snippet runs a multi-objective optimization experiment that returns the Pareto front of the diversity and coherence metrics on the Italian dataset DBPedia by optimizing the hyper-parameters (defined in a configuration file) of LDA with 25 topics.</p><p>In keeping with the spirit of the first version of OCTIS, the framework extension is open-source and easily accessible, in order to guarantee researchers and practitioners a fair, accessible and reproducible comparison between the models <ref type="bibr" target="#b2">(Bianchi and Hovy, 2021)</ref>. OCTIS 2.0 is available as an extension of the original library at the following link: https://github.com/mind-Lab/octis.</p></div>
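The Pareto frontier recovered by the multi-objective optimizer can be made concrete with a small helper for two maximized objectives (a generic sketch, independent of OCTIS and dragonfly):

```python
def pareto_front(points):
    """Return the Pareto-optimal subset of (objective1, objective2) pairs,
    assuming both objectives are maximized: a point is kept if no other
    point is at least as good in both objectives and different from it."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# e.g. (NPMI, IRBO) scores of four candidate configurations (toy values)
scores = [(0.10, 0.90), (0.20, 0.70), (0.05, 0.95), (0.15, 0.60)]
front = pareto_front(scores)
```

Here (0.15, 0.60) is dominated by (0.20, 0.70) and drops out; the remaining three points are incomparable trade-offs between coherence and diversity.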
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experimental Setting</head><p>In the following, we show the capabilities of the extended framework on the new datasets by carrying out a simple experimental campaign.</p><p>We assume an experimental setting in which a topic modeling practitioner is interested in discovering the main thematic information of the two novel datasets in Italian. However, the user has no prior knowledge of the datasets and therefore does not know which topic model is the most appropriate. Moreover, the user aims to get topics which are coherent and make sense together, but which are also diverse and separated from each other. Note that a user could consider a different set of metrics to optimize, by selecting one of the metrics already available in OCTIS or by defining novel ones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Evaluation Metrics</head><p>We briefly describe the two evaluation metrics (one of topic coherence and one of topic diversity) that we target as the two objectives of the multi-objective Bayesian optimization. Both metrics need to be maximized.</p><p>IRBO <ref type="bibr" target="#b3">(Bianchi et al., 2021a;</ref><ref type="bibr" target="#b26">Terragni et al., 2021b</ref>) is a measure of topic diversity (0 for identical topics and 1 for completely different topics). It is based on the Rank-Biased Overlap measure <ref type="bibr" target="#b27">(Webber et al., 2010)</ref>: topics with common words at different rankings are penalized less than topics sharing the same words at the highest ranks.</p></div>
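The intuition behind IRBO can be sketched with a simplified truncated variant of Rank-Biased Overlap (an illustration under that assumption, not the exact implementation used in OCTIS):

```python
def rbo(list1, list2, p=0.9):
    """Simplified truncated Rank-Biased Overlap: the set agreement at each
    depth d is weighted geometrically by p**(d-1), so words shared at the
    highest ranks count more than words shared deep in the lists."""
    k = min(len(list1), len(list2))
    score, weight_sum = 0.0, 0.0
    for d in range(1, k + 1):
        agreement = len(set(list1[:d]) & set(list2[:d])) / d
        score += (p ** (d - 1)) * agreement
        weight_sum += p ** (d - 1)
    return score / weight_sum

def irbo(topics, p=0.9):
    """Inverted RBO: 1 minus the average pairwise RBO over all topic pairs
    (0 for identical topics, 1 for completely different topics)."""
    pairs = [(i, j) for i in range(len(topics)) for j in range(i + 1, len(topics))]
    return 1.0 - sum(rbo(topics[i], topics[j], p) for i, j in pairs) / len(pairs)
```

Note how two topics with the same words in reversed order score strictly between 0 and 1: they overlap, but not at the top ranks.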
<div xmlns="http://www.tei-c.org/ns/1.0"><head>NPMI</head><p>(Lau et al., 2014) measures the Normalized Pointwise Mutual Information of each pair of words (w_i, w_j) in the top-10 words of each topic. It is a topic coherence measure that evaluates how much the words in a topic are related to each other.</p></div>
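A toy sketch of the computation, assuming probabilities estimated as document frequencies over a small corpus (actual NPMI implementations typically estimate co-occurrence with sliding windows over a reference corpus):

```python
import math

def npmi(w1, w2, documents):
    """Normalized Pointwise Mutual Information of a word pair:
    NPMI = log(p(w1,w2) / (p(w1)*p(w2))) / -log(p(w1,w2)), in [-1, 1],
    with probabilities estimated here as document frequencies."""
    n = len(documents)
    p1 = sum(w1 in d for d in documents) / n
    p2 = sum(w2 in d for d in documents) / n
    p12 = sum((w1 in d) and (w2 in d) for d in documents) / n
    if p12 == 0.0:
        return -1.0  # the pair never co-occurs: minimum of the measure
    if p12 == 1.0:
        return 1.0   # the pair occurs everywhere: degenerate case (convention here)
    return math.log(p12 / (p1 * p2)) / -math.log(p12)

# Tiny corpus of documents as word sets (toy data)
docs = [{"gatto", "cane"}, {"gatto", "cane"}, {"auto", "moto"}, {"auto", "ruota"}]
```

Words that always co-occur ("gatto", "cane") reach the maximum of 1, while words that never co-occur get the minimum of -1.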
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Topic Models and Hyper-Parameter Setting</head><p>We focus our experiments on four well-known topic models that OCTIS already provides: two of them are classical topic models and the other two are neural models. In particular, we trained Latent Dirichlet Allocation <ref type="bibr">(Blei et al., 2003, LDA)</ref>, Non-negative Matrix Factorization (Lee and Seung, 2000, NMF), Embedded Topic Model (Dieng et al., 2020, ETM), and Contextualized Topic Models (Bianchi et al., 2021a; Bianchi et al., 2021b, CTM). We summarize the models' hyper-parameters and their corresponding ranges in Table <ref type="table" target="#tab_1">2</ref>. For each model, we optimize the number of topics, ranging from 5 to 100. We select the ranges of the hyper-parameters similarly to previous work <ref type="bibr" target="#b22">(Terragni and Fersini, 2021)</ref>.</p><p>Regarding LDA, we also optimize the α and β priors, which control the sparsity of the topics in the documents and the sparsity of the words in the topic distributions, respectively. These hyper-parameters are set to range between 10^-3 and 10^-1 on a logarithmic scale.</p><p>The hyper-parameters of NMF are mainly related to the regularization applied to the factorized matrices. The regularization hyper-parameter controls whether the regularization is applied only to the matrix V, only to the matrix H, or to both. The regularization factor denotes the constant that multiplies the regularization terms; it ranges between 0 and 0.5 (0 means no regularization). The L1-L2 ratio controls the ratio between L1 and L2 regularization. It ranges between 0 and 1, where 0 corresponds to L2 regularization only, 1 corresponds to L1 regularization only, and intermediate values combine the two. We also optimize the initialization method for the two matrices W and H.</p><p>Since ETM and CTM are neural models, their hyper-parameters are mainly related to the network architecture. We optimize the number of neurons (ranging from 100 to 1000, with a step of 100). For simplicity, each layer has the same number of neurons. 
We also consider different variants of activation functions and optimizers. We set the dropout to range between 0 and 0.9 and the learning rate to range between 10^-3 and 10^-1 on a logarithmic scale. We fix the batch size to 200 and we adopt an early stopping criterion for determining the convergence of each model.</p><p>For CTM only, we also optimize the momentum, ranging between 0 and 0.9, and the number of layers (ranging from 1 to 5). Following <ref type="bibr" target="#b4">(Bianchi et al., 2021b)</ref>, we use the contextualized document representations derived from Sentence-BERT <ref type="bibr" target="#b19">(Reimers and Gurevych, 2019)</ref>. In particular, we use the pre-trained multilingual Universal Sentence Encoder.<ref type="foot" target="#foot_5">6</ref> For all the models, we set the remaining parameters to their default values. Finally, we train each model 30 times and consider the median of the 30 evaluations as the value of the function to be optimized. We sample the n initial configurations using Latin Hypercube Sampling, with n equal to the number of hyper-parameters to optimize plus 2, to provide enough configurations for fitting the initial surrogate model. The total number of BO iterations for each model is 125. We use a Gaussian Process as the probabilistic surrogate model and the Upper Confidence Bound (UCB) as the acquisition function.</p></div>
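The Latin Hypercube initial design mentioned above can be sketched as follows (a generic illustration of the sampling scheme, not dragonfly's implementation; `latin_hypercube` is a hypothetical helper):

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Latin Hypercube Sampling: each dimension's range is split into
    n_samples equal strata and each stratum is sampled exactly once,
    with the strata shuffled independently per dimension, so the initial
    configurations cover the whole search space evenly."""
    rng = random.Random(seed)
    samples = [[0.0] * len(bounds) for _ in range(n_samples)]
    for dim, (low, high) in enumerate(bounds):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n_samples  # a point inside stratum s
            samples[i][dim] = low + u * (high - low)
    return samples

# e.g. 7 initial points for (dropout, log10 of the learning rate)
init = latin_hypercube(7, [(0.0, 0.9), (-3.0, -1.0)])
```

Unlike plain random sampling, no stratum is sampled twice, so even a handful of initial points gives the surrogate model information about every region of each hyper-parameter's range.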
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results</head><p>In the following, we report the results of the comparative analysis between the considered models on the Italian datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Quantitative Results</head><p>Figure <ref type="figure">1</ref>: Pareto front of the performance of the considered models for the analyzed Italian datasets.</p><p>We jointly consider the results of both objectives by plotting the Pareto frontier of the results of topic diversity and topic coherence. Figure <ref type="figure">1</ref> shows the frontier of each model for the pair of metrics (NPMI, IRBO). We can notice that the topic models have similar frontiers on each dataset. The most competitive models are NMF and CTM. In particular, NMF outperforms the others in topic coherence, but obtains a lower coherence as the diversity increases. Therefore, CTM is the model to prefer if a user wants totally separated topics while still retaining a good coherence. Instead, LDA and ETM have lower performance than the others. We also noticed in our experiments that the performance of ETM is affected when the documents are shorter (on the Europarl dataset), often giving rise to the phenomenon of mode collapsing, i.e. all the topics becoming equal to each other.</p></div>
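The mode-collapsing phenomenon mentioned above can be detected with a simple redundancy check over the topics' top words (a generic sketch, not a diagnostic used in the paper):

```python
def collapsed_topic_pairs(topics, threshold=0.8):
    """Flag pairs of topics whose top-word sets have a Jaccard similarity
    above a threshold; many flagged pairs suggest mode collapsing, i.e.
    topics degenerating into near-identical word lists."""
    flagged = []
    for i in range(len(topics)):
        for j in range(i + 1, len(topics)):
            a, b = set(topics[i]), set(topics[j])
            jaccard = len(a & b) / len(a | b)
            if jaccard >= threshold:
                flagged.append((i, j))
    return flagged
```

A healthy model yields an empty list; a collapsed one flags most pairs.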
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Qualitative Results</head><p>In Table <ref type="table" target="#tab_3">3</ref> we report an example of the topics discovered by the models. We selected the best hyper-parameter configuration discovered by each model with 5 topics and randomly sampled a model run among the 30 runs. Note that, for the sake of simplicity, we fix the number of topics here and select a single run out of the 30. Therefore, the qualitative results reported in Table 3 may not reflect the overall results.</p><p>We can notice that NMF obtains more coherent and stable topics. CTM and LDA obtain topics that have a higher variance: in particular, CTM discovers a topic (the fourth one, NPMI=-0.51) that lowers the average coherence, while LDA discovers a topic (the second one, NPMI=0.48) that effectively increases the average coherence. On the other hand, the topics discovered by ETM are more stable but have a lower coherence on average. As already observed in previous work <ref type="bibr">(AlSumait et al., 2009;</ref><ref type="bibr" target="#b10">Doogan and Buntine, 2021)</ref>, obtaining junk or mixed topics is common in topic models, and this problem can be addressed by filtering out the less relevant topics.</p></div>
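The topic-filtering strategy mentioned above can be sketched as follows (`filter_topics` is a hypothetical helper; the example reuses the five CTM coherence values reported in Table 3, with the topics abbreviated to their first two words):

```python
def filter_topics(topics, scores, min_score=0.0):
    """Drop 'junk' topics whose coherence falls below a threshold,
    keeping the surviving topics and scores aligned."""
    kept = [(t, s) for t, s in zip(topics, scores) if s >= min_score]
    return [t for t, _ in kept], [s for _, s in kept]

# Abbreviated CTM topics and their NPMI values from Table 3
topics = [["contea", "america"], ["album", "the"], ["superare", "argentino"],
          ["partito", "battaglia"], ["st", "stella"]]
scores = [0.39, 0.26, -0.29, -0.08, -0.51]
kept_topics, kept_scores = filter_topics(topics, scores, min_score=0.0)
```

With a threshold of 0, only the two topics with positive coherence survive, removing in particular the NPMI=-0.51 topic that drags down the model's average.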
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion</head><p>In this paper, we presented OCTIS 2.0, the extension of the evaluation framework OCTIS for topic modeling. This tool can now address the problem of estimating the optimal hyper-parameter configurations of different topic models using a multi-objective Bayesian optimization approach. Moreover, we also released two novel datasets in Italian which can be used as benchmark datasets for the Italian topic modeling and NLP communities.</p><p>We conducted a simple experimental campaign to show the potential of the extended framework. We have seen that using a multi-objective hyper-parameter optimization approach allows us not only to identify the best performing model over the others, thus guaranteeing a fairer comparison among different models, but also to empirically discover the relationships between different objectives.</p><p>As future work, we aim to extend the framework by considering additional datasets in different and possibly low-resource languages, which require different pre-processing strategies and would allow researchers to investigate the peculiarities of different topic modeling methods.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc># loading of a pre-processed dataset
dataset = Dataset()
dataset.fetch_dataset("DBPedia_IT")
# model instantiation
model = LDA(num_topics=25)
# definition of the metrics to optimize
td = TopicDiversity()
coh = Coherence()
metrics = [td, coh]
# definition of the search space
config_file = "path/to/search/space/file"
# definition and launch of the optimization
mmm = MOOptimizer(dataset=dataset, model=model, config_file=config_file, metrics=metrics, maximize=True)
mmm.optimize()</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="5,73.81,300.41,214.65,292.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Statistics of the pre-processed datasets.</figDesc><table><row><cell>Dataset</cell><cell>Num. of documents</cell><cell>Avg. doc length (Std. dev.)</cell><cell>Num. of unique words</cell></row><row><cell>DBPedia</cell><cell>4251</cell><cell>5.5 (11.8)</cell><cell>2047</cell></row><row><cell>Europarl</cell><cell>3616</cell><cell>20.6 (19.3)</cell><cell>2000</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 :</head><label>2</label><figDesc>Hyper-parameters and ranges.</figDesc><table><row><cell>Model</cell><cell>Hyper-parameter</cell><cell>Values/Range</cell></row><row><cell>All</cell><cell>Number of topics</cell><cell>[5, 100]</cell></row><row><cell>LDA</cell><cell>α prior</cell><cell>[10^-3, 10]</cell></row><row><cell></cell><cell>β prior</cell><cell>[10^-3, 10]</cell></row><row><cell>NMF</cell><cell>Regularization factor</cell><cell>[0, 0.5]</cell></row><row><cell></cell><cell>L1-L2 ratio</cell><cell>[0, 1]</cell></row><row><cell></cell><cell>Initialization method</cell><cell>nndsvd, nndsvda, nndsvdar, random</cell></row><row><cell></cell><cell>Regularization</cell><cell>V matrix, H matrix, both</cell></row><row><cell>ETM</cell><cell>Activation function</cell><cell>elu, sigmoid, soft-plus, selu</cell></row><row><cell></cell><cell>Dropout</cell><cell>[0, 0.9]</cell></row><row><cell></cell><cell>Learning rate</cell><cell>[10^-3, 10^-1]</cell></row><row><cell></cell><cell>Number of neurons</cell><cell>{100, 200, ..., 900, 1000}</cell></row><row><cell></cell><cell>Optimizer</cell><cell>adam, sgd, rmsprop</cell></row><row><cell>CTM</cell><cell>Activation function</cell><cell>elu, sigmoid, soft-plus, selu</cell></row><row><cell></cell><cell>Dropout</cell><cell>[0, 0.9]</cell></row><row><cell></cell><cell>Learning rate</cell><cell>[10^-3, 10^-1]</cell></row><row><cell></cell><cell>Momentum</cell><cell>[0, 0.9]</cell></row><row><cell></cell><cell>Number of layers</cell><cell>1, 2, 3, 4, 5</cell></row><row><cell></cell><cell>Number of neurons</cell><cell>{100, 200, ..., 900, 1000}</cell></row><row><cell></cell><cell>Optimizer</cell><cell>adam, sgd, rmsprop</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc>fondare nome azienda noto francese compagnia parigi 0.06 guerra partito battaglia venire nord politico tedesco esercito regno militare 0.03 torneo situare comune giocare abitante edizione tennis tour regione uniti -0.10 film serie the dirigere gioco pubblicare statunitense televisivo venire romanzo 0.07 album pubblicare campionato squadra musicale the calcio statunitense singolo vincere -0.12</figDesc><table><row><cell cols="2">Model Top words</cell><cell>NPMI</cell></row><row><cell></cell><cell>de album pubblicare italiano the uniti situare fondare università noto</cell><cell>-0.05</cell></row><row><cell></cell><cell>torneo giocare tennis edizione tour atp ambito open categoria cemento</cell><cell>0.48</cell></row><row><cell>LDA</cell><cell>film pubblicare the album serie musicale venire statunitense rock band</cell><cell>0.11</cell></row><row><cell></cell><cell>guerra battaglia venire situare statunitense spagnolo partito esercito distretto mondiale</cell><cell>-0.14</cell></row><row><cell></cell><cell>comune campionato squadra abitante calcio regione situare società francese vincere</cell><cell>-0.03</cell></row><row><cell></cell><cell>comune abitante dipartimento regione situare francese alta distretto est grand</cell><cell>0.29</cell></row><row><cell></cell><cell>torneo giocare tennis tour atp open edizione ambito categoria cemento</cell><cell>0.48</cell></row><row><cell>NMF</cell><cell>album pubblicare studio the musicale statunitense records singolo cantante rock</cell><cell>0.29</cell></row><row><cell></cell><cell>calciatore ruolo allenatore calcio centrocampista difensore attaccante portiere settembre aprile</cell><cell>0.24</cell></row><row><cell></cell><cell>contea america uniti situare comune censimento designated census place capoluogo</cell><cell>0.39</cell></row><row><cell></cell><cell>album the pubblicare band statunitense singolo brano of 
musicale rock</cell><cell>0.26</cell></row><row><cell></cell><cell>superare argentino calciatore el buenos maria en svezia situare chiesa</cell><cell>-0.29</cell></row><row><cell>CTM</cell><cell>partito battaglia guerra venire politico de linea isola stazione regno</cell><cell>-0.08</cell></row><row><cell></cell><cell>st stella vendetta dollaro robert company ritorno west superiore soggetto</cell><cell>-0.51</cell></row><row><cell></cell><cell>edizione tennis giocare torneo vincere tour campionato maschile disputare squadra</cell><cell>0.18</cell></row><row><cell></cell><cell>sede de italiano</cell><cell></cell></row><row><cell>ETM</cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3 :</head><label>3</label><figDesc>Example of top words of 5 topics for each considered model and the corresponding topic coherence (NPMI).</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://people.csail.mit.edu/jrennie/20Newsgroups/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://github.com/shiruipan/TriDNR/tree/master/data</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://www.statmt.org/europarl/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.dbpedia.org/resources/ontology/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://github.com/dragonfly/dragonfly</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_5">Note that there is no Sentence-BERT-like model for Italian; therefore, we used a multilingual one: distiluse-base-multilingual-cased-v1.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Topic Significance Ranking of LDA Generative Models</title>
		<author>
			<persName><forename type="first">Loulwah</forename><surname>Alsumait</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Barbará</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Gentle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Carlotta</forename><surname>Domeniconi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">5781</biblScope>
			<biblScope unit="page" from="67" to="82" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Bayesian Optimization and Data Science</title>
		<author>
			<persName><forename type="first">Francesco</forename><surname>Archetti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Antonio</forename><surname>Candelieri</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>Springer International Publishing</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">On the gap between adoption and understanding in nlp</title>
		<author>
			<persName><forename type="first">Federico</forename><surname>Bianchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dirk</forename><surname>Hovy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="3895" to="3901" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Pre-training is a hot topic: Contextualized document embeddings improve topic coherence</title>
		<author>
			<persName><forename type="first">Federico</forename><surname>Bianchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dirk</forename><surname>Hovy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021</title>
				<meeting>the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2021">2021a</date>
			<biblScope unit="page" from="759" to="766" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Cross-lingual contextualized topic models with zero-shot learning</title>
		<author>
			<persName><forename type="first">Federico</forename><surname>Bianchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dirk</forename><surname>Hovy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Debora</forename><surname>Nozza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021b</date>
			<biblScope unit="page" from="1676" to="1683" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Latent Dirichlet allocation</title>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">I</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Probabilistic topic models</title>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Communications of the ACM</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="77" to="84" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Applications of topic models</title>
		<author>
			<persName><forename type="first">Jordan</forename><forename type="middle">L</forename><surname>Boyd-Graber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yuening</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">M</forename><surname>Mimno</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Found. Trends Inf. Retr</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="143" to="296" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Topic modeling in embedding spaces</title>
		<author>
			<persName><forename type="first">Adji</forename><forename type="middle">Bousso</forename><surname>Dieng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francisco</forename><forename type="middle">J R</forename><surname>Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Trans. Assoc. Comput. Linguistics</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="439" to="453" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Benchmarking neural topic models: An empirical study</title>
		<author>
			<persName><forename type="first">Thanh-Nam</forename><surname>Doan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tuan-Anh</forename><surname>Hoang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021</title>
				<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2021-08">August 2021</date>
			<biblScope unit="page" from="4363" to="4368" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Topic model or topic twaddle? re-evaluating semantic interpretability measures</title>
		<author>
			<persName><forename type="first">Caitlin</forename><surname>Doogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wray</forename><forename type="middle">L</forename><surname>Buntine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online</meeting>
		<imprint>
			<date type="published" when="2021-06-06">June 6-11, 2021</date>
			<biblScope unit="page" from="3824" to="3848" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Hyperparameter optimization for recommender systems through bayesian optimization</title>
		<author>
			<persName><forename type="first">Bruno</forename><forename type="middle">Giovanni</forename><surname>Galuzzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilaria</forename><surname>Giordani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Antonio</forename><surname>Candelieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Riccardo</forename><surname>Perego</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francesco</forename><surname>Archetti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Management Science</title>
		<imprint>
			<biblScope unit="page" from="1" to="21" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering</title>
		<author>
			<persName><forename type="first">Derek</forename><surname>Greene</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pádraig</forename><surname>Cunningham</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 23rd International Conference on Machine learning (ICML&apos;06)</title>
				<meeting>the 23rd International Conference on Machine learning (ICML&apos;06)</meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="377" to="384" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">Tim</forename><surname>Head</surname></persName>
		</author>
		<author>
			<persName><surname>MechCoder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gilles</forename><surname>Louppe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Iaroslav</forename><surname>Shcherbatyi</surname></persName>
		</author>
		<title level="m">scikit-optimize/scikit-optimize</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">5</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly</title>
		<author>
			<persName><forename type="first">Kirthevasan</forename><surname>Kandasamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Karun</forename><forename type="middle">Raju</forename><surname>Vysyaraju</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Willie</forename><surname>Neiswanger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Biswajit</forename><surname>Paria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">R</forename><surname>Collins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jeff</forename><surname>Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Barnabás</forename><surname>Póczos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eric</forename><forename type="middle">P</forename><surname>Xing</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">81</biblScope>
			<biblScope unit="page">27</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality</title>
		<author>
			<persName><forename type="first">Jey</forename><forename type="middle">Han</forename><surname>Lau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Newman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Timothy</forename><surname>Baldwin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014</title>
				<meeting>the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="530" to="539" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Algorithms for non-negative matrix factorization</title>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">D</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">Sebastian</forename><surname>Seung</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Papers from Neural Information Processing Systems (NIPS) 2000</title>
				<imprint>
			<publisher>MIT Press</publisher>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="556" to="562" />
		</imprint>
	</monogr>
	<note>Advances in Neural Information Processing Systems 13</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Bibliographic analysis with the citation network topic model</title>
		<author>
			<persName><forename type="first">Kar</forename><forename type="middle">Wai</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wray</forename><forename type="middle">L</forename><surname>Buntine</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth Asian Conference on Machine Learning</title>
				<meeting>the Sixth Asian Conference on Machine Learning<address><addrLine>ACML</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations</title>
		<author>
			<persName><forename type="first">Biswajit</forename><surname>Paria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kirthevasan</forename><surname>Kandasamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Barnabás</forename><surname>Póczos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI)</title>
				<meeting>the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI)<address><addrLine>Tel Aviv, Israel</addrLine></address></meeting>
		<imprint>
			<publisher>AUAI Press</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">115</biblScope>
			<biblScope unit="page" from="766" to="776" />
		</imprint>
	</monogr>
	<note>Proceedings of Machine Learning Research</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks</title>
		<author>
			<persName><forename type="first">Nils</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Iryna</forename><surname>Gurevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing</title>
				<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="3980" to="3990" />
		</imprint>
	</monogr>
	<note>EMNLP-IJCNLP</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Taking the human out of the loop: A review of bayesian optimization</title>
		<author>
			<persName><forename type="first">Bobak</forename><surname>Shahriari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kevin</forename><surname>Swersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ziyu</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ryan</forename><forename type="middle">P</forename><surname>Adams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nando</forename><forename type="middle">De</forename><surname>Freitas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE</title>
				<meeting>the IEEE</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">104</biblScope>
			<biblScope unit="page" from="148" to="175" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Practical Bayesian Optimization of Machine Learning Algorithms</title>
		<author>
			<persName><forename type="first">Jasper</forename><surname>Snoek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hugo</forename><surname>Larochelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ryan</forename><forename type="middle">P</forename><surname>Adams</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="2960" to="2968" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">An empirical analysis of topic models: Uncovering the relationships between hyperparameters, document length and performance measures</title>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Recent Advances in Natural Language Processing, RANLP 2021</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Constrained relational topic models</title>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Enza</forename><surname>Messina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">512</biblScope>
			<biblScope unit="page" from="581" to="594" />
			<date type="published" when="2020">2020a</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Which matters most? comparing the impact of concept and document relationships in topic models</title>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Debora</forename><surname>Nozza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Enza</forename><surname>Messina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First Workshop on Insights from Negative Results in NLP</title>
				<meeting>the First Workshop on Insights from Negative Results in NLP</meeting>
		<imprint>
			<date type="published" when="2020">2020b</date>
			<biblScope unit="page" from="32" to="40" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">OCTIS: Comparing and Optimizing Topic models is Simple!</title>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bruno</forename><forename type="middle">Giovanni</forename><surname>Galuzzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pietro</forename><surname>Tropeano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Antonio</forename><surname>Candelieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, EACL 2021</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, EACL 2021</meeting>
		<imprint>
			<date type="published" when="2021">2021a</date>
			<biblScope unit="page" from="263" to="270" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Word embedding-based topic similarity measures</title>
		<author>
			<persName><forename type="first">Silvia</forename><surname>Terragni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elisabetta</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Enza</forename><surname>Messina</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing and Information Systems -26th International Conference on Applications of Natural Language to Information Systems, NLDB 2021</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021b</date>
			<biblScope unit="volume">12801</biblScope>
			<biblScope unit="page" from="33" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">A similarity measure for indefinite rankings</title>
		<author>
			<persName><forename type="first">William</forename><surname>Webber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alistair</forename><surname>Moffat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Justin</forename><surname>Zobel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Trans. Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">38</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
