<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OCTIS 2.0: Optimizing and Comparing Topic Models in Italian Is Even Simpler!</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Silvia Terragni</string-name>
          <email>s.terragni4@campus.unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elisabetta Fersini</string-name>
          <email>elisabetta.fersini@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Milano-Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>4363</fpage>
      <lpage>4368</lpage>
      <abstract>
        <p>English. OCTIS is an open-source framework for training, evaluating and comparing Topic Models. This tool uses single-objective Bayesian Optimization (BO) to optimize the hyper-parameters of the models and thus guarantee a fairer comparison. Yet, a single-objective approach disregards that a user may want to simultaneously optimize multiple objectives. We therefore propose OCTIS 2.0: the extension of OCTIS that addresses the problem of estimating the optimal hyper-parameter configurations for a topic model using multi-objective BO. Moreover, we also release and integrate two pre-processed Italian datasets, which can be easily used as benchmarks for the Italian language.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italian. OCTIS is an open-source framework
for the training, evaluation and comparison of
Topic Models. This tool uses single-objective
Bayesian Optimization (BO) to optimize the
hyper-parameters of the models and thus guarantee a
fairer comparison. However, this approach
disregards that a user may want to optimize more
than one objective. We therefore propose OCTIS
2.0: the extension of OCTIS that addresses the
problem of estimating the optimal hyper-parameter
configurations of a topic model using
multi-objective BO. In addition, we also release and
integrate two new pre-processed datasets in
Italian, which can easily be used as benchmarks for
the Italian language.</p>
      <p>Copyright © 2021 for this paper by its authors. Use
permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>
        Topic models are statistical methods that aim to
extract the hidden topics underlying a collection
of documents
        <xref ref-type="bibr" rid="ref6 ref7">(Blei et al., 2003; Blei, 2012;
BoydGraber et al., 2017)</xref>
        . Topics are often represented
by sets of words that make sense together, e.g. the
words “cat, animal, dog, mouse” may represent a
topic about animals. Topic models’ evaluations
are usually limited to the comparison of models
whose hyper-parameters are held fixed
        <xref ref-type="bibr" rid="ref10 ref22 ref23 ref23 ref24 ref24 ref26 ref3 ref4">(Doan and
Hoang, 2021; Terragni et al., 2020a; Terragni et
al., 2020b)</xref>
        . However, hyper-parameters can have
a substantial impact on the models’ performance,
and therefore fixing the hyper-parameters prevents
researchers from discovering the best topic
model on the selected dataset.
      </p>
      <p>
        Recently, OCTIS
        <xref ref-type="bibr" rid="ref10 ref22 ref25 ref26 ref3 ref4 ref5">(Terragni et al., 2021a,
Optimizing and Comparing Topic Models is Simple)</xref>
has been released: a comprehensive and
open-source framework for training, analyzing, and
comparing topic models, over several datasets and
evaluation metrics. OCTIS determines the
optimal hyper-parameter configuration according to
a Bayesian Optimization (BO) strategy
        <xref ref-type="bibr" rid="ref11 ref18 ref19 ref2 ref21">(Archetti
and Candelieri, 2019; Snoek et al., 2012; Galuzzi
et al., 2020)</xref>
        . The framework already provides
several features and resources, including at least
8 topic models, 4 categories of evaluation metrics,
and 4 pre-processed datasets. However, the
framework uses a single-objective Bayesian
optimization approach, disregarding that a user may want
to simultaneously optimize more than one
objective
        <xref ref-type="bibr" rid="ref10 ref22 ref25 ref26 ref3 ref4 ref5">(Terragni and Fersini, 2021)</xref>
        . For example, a
user may be interested in obtaining topics that are
coherent but also diverse and separated from each
other.
      </p>
      <p>Contributions. In this paper, we propose
OCTIS 2.0, an extension of the existing
framework that integrates both a single-objective
and multi-objective hyper-parameter optimization
strategy, using Bayesian optimization. Moreover,
we also pre-process and include two novel datasets
in Italian. We then briefly show the
potential of the extended framework by comparing
different topic models on the newly released Italian
datasets. We believe these resources can be
useful for the topic modeling and NLP communities,
since they can be used as benchmarks for the
Italian language.</p>
    </sec>
    <sec id="sec-3">
      <title>OCTIS: Optimizing and Comparing Topic Models Is Simple!</title>
      <sec id="sec-3-1">
        <title>OCTIS 1.0</title>
        <p>
          OCTIS
          <xref ref-type="bibr" rid="ref10 ref22 ref25 ref26 ref3 ref4 ref5">(Terragni et al., 2021a, Optimizing and
Comparing is Simple!)</xref>
          is an open-source
evaluation framework for the comparison of topic
models, which allows a user to optimize the models’
hyper-parameters for a fair experimental
comparison. The evaluation framework is composed of
different modules that interact with each other: (1)
dataset and pre-processing tools, (2) topic
modeling, (3) hyper-parameter optimization, (4)
evaluation metrics. OCTIS can be used both as a Python
library and through a web dashboard. It also
provides a set of pre-processed datasets,
state-of-the-art topic models and several evaluation metrics.
        </p>
        <p>We will now briefly describe the two
components that we will extend in this work: the
pre-processed datasets and the hyper-parameter
optimization module.</p>
        <p>
          Pre-processing and Datasets. OCTIS currently
provides functionalities for pre-processing the
texts, which include the lemmatization of the text,
the removal of punctuation, numbers and
stopwords, and the removal of words based on their
frequency. Moreover, the framework already
provides 4 pre-processed datasets that are ready to
use for topic modeling. These datasets are 20
NewsGroups (http://people.csail.mit.edu/jrennie/20Newsgroups/), M10
          <xref ref-type="bibr" rid="ref15 ref17">(Lim and Buntine, 2014)</xref>
          ,
DBLP (https://github.com/shiruipan/TriDNR/tree/master/data), and BBC News
          <xref ref-type="bibr" rid="ref12">(Greene and
Cunningham, 2006)</xref>
          . All the datasets are split into three
partitions: training, testing and validation.
        </p>
        <p>All the currently provided datasets are in
English. OCTIS already provides language-specific
pre-processing tools (e.g. lemmatizers for
multiple languages), but it does not include datasets in
other languages. Creating benchmark datasets for
other languages is useful for investigating the
peculiarities of different topic modeling methods.</p>
        <p>
          Single-Objective Hyper-parameter
Optimization. OCTIS uses single-objective Bayesian
Optimization
          <xref ref-type="bibr" rid="ref20 ref21">(Snoek et al., 2012; Shahriari et al.,
2015)</xref>
          to tune the topic models’ hyper-parameters
with respect to a selected evaluation metric. In
particular, the user specifies the search space
for the hyper-parameters and an objective metric.
Then, BO sequentially explores the search space
to determine the optimal hyper-parameter
configuration. Since the models are usually
probabilistic and can give different results with the
same hyper-parameter configuration, the objective
function is computed as the median value of the
selected evaluation metric over a given number of
model runs (i.e., topic models trained with the same
hyper-parameter configuration). OCTIS uses the
Scikit-Optimize library
          <xref ref-type="bibr" rid="ref13">(Head et al., 2018)</xref>
          for the
implementation of single-objective hyper-parameter
Bayesian optimization.
        </p>
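        <p>For reference, a minimal sketch of this single-objective loop,
based on the Optimizer interface documented for the original OCTIS
library (the dataset name, search-space dimensions and values here are
illustrative assumptions):

# imports follow the OCTIS library layout
from octis.dataset.dataset import Dataset
from octis.models.LDA import LDA
from octis.evaluation_metrics.coherence_metrics import Coherence
from octis.optimization.optimizer import Optimizer
from skopt.space.space import Real

# load a pre-processed dataset and instantiate a model
dataset = Dataset()
dataset.fetch_dataset("20NewsGroup")
model = LDA(num_topics=25)

# the single objective metric to maximize
npmi = Coherence(texts=dataset.get_corpus())

# search space over the LDA priors
search_space = {"alpha": Real(low=1e-3, high=1e-1, prior="log-uniform"),
                "eta": Real(low=1e-3, high=1e-1, prior="log-uniform")}

optimizer = Optimizer()
result = optimizer.optimize(
    model, dataset, npmi, search_space,
    number_of_call=30,  # BO iterations
    model_runs=5,       # runs per configuration; the median is the objective
    save_path="results/")
        </p>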
        <p>The use of a single-objective approach is,
however, limited: this strategy disregards all other
objectives. For example, a user may want to
optimize the coherence of the topics and their
diversity at the same time.</p>
      </sec>
      <sec id="sec-3-2">
        <title>OCTIS 2.0</title>
        <p>New dataset resources for the Italian language.
Since OCTIS provides only English datasets, we
extend the set of datasets by including two new
datasets in Italian. We build the two datasets from
the Italian version of the Europarl dataset (https://www.statmt.org/europarl/)
and from the Italian abstracts of DBpedia (https://www.dbpedia.org/resources/ontology/).
In particular, we randomly sample 5000 documents
from Europarl, and 1000 Italian abstracts for each
of 5 DBpedia types (event, organization, place,
person, work), for a total of 5000 abstracts.</p>
        <p>We pre-process the datasets using the following
strategy: we lemmatize the text; we remove
punctuation, numbers and Italian stop-words; we
filter out the words with a document frequency
higher than 50% or lower than 0.1% for
Europarl (0.2% for DBpedia); and we remove
the documents with fewer than 5 words. These
values have been chosen by manually inspecting the
resulting pre-processed datasets.</p>
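        <p>A sketch of this pipeline using OCTIS's Preprocessing module
(the parameter names follow our reading of the OCTIS API and the input
path is a placeholder, so treat the exact signature as an assumption):

from octis.preprocessing.preprocessing import Preprocessing

preprocessor = Preprocessing(
    lemmatize=True,
    remove_punctuation=True,
    remove_numbers=True,
    language="italian",   # Italian stop-words and lemmatizer
    max_df=0.5,           # drop words in more than 50% of the documents
    min_df=0.001,         # 0.1% for Europarl (0.002 for DBpedia)
    min_words_docs=5)     # drop documents with fewer than 5 words
dataset = preprocessor.preprocess_dataset("corpus_it.tsv")
        </p>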
        <p>We report the most relevant statistics of the
novel Italian datasets in Table 1. Following the
original paper, we split the datasets in three
partitions: training (75%), validation (15%), and
testing (15%).</p>
        <p>[Table 1: statistics (e.g., number of documents) of the DBPedia and Europarl Italian datasets.]</p>
        <p>
          From Single-objective to Multi-objective
Hyper-parameter Bayesian Optimization.
Given the limitations of the single-objective
hyper-parameter optimization approach, we
extend OCTIS by including a multi-objective
approach
            <xref ref-type="bibr" rid="ref14 ref18">(Kandasamy et al., 2020; Paria et
al., 2019)</xref>
            . Single-objective BO can in fact be
generalized to multiple objective functions, where
the final aim is to recover the Pareto frontier of
the objective functions, i.e. the set of Pareto
optimal points. A point is Pareto optimal if
it cannot be improved in any of the objectives
without degrading some other objective. Using
a multi-objective hyper-parameter optimization
approach thus allows us not only to identify the
best performing model, but also to empirically
discover competing objectives.
          </p>
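          <p>As a concrete illustration, the Pareto front of a finite set of
evaluated configurations can be computed with a simple dominance check (a
generic sketch assuming all objectives are maximized; this is not OCTIS
code):

# extract the Pareto-optimal points from a list of objective vectors
def pareto_front(points):
    front = []
    for p in points:
        # p is dominated if some other point is at least as good in every
        # objective and strictly better in at least one (q >= p, q != p)
        dominated = any(all(q[i] >= p[i] for i in range(len(p))) and q != p
                        for q in points)
        if not dominated:
            front.append(p)
    return front

# e.g., evaluations as (topic diversity, NPMI) pairs
evals = [(0.9, 0.02), (0.7, 0.10), (0.8, 0.05), (0.6, 0.08)]
print(pareto_front(evals))  # [(0.9, 0.02), (0.7, 0.10), (0.8, 0.05)]
          </p>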
          <p>
            Since the original Scikit-Optimize library does
not provide multi-objective optimization tools, we
use the dragonfly library (https://github.com/dragonfly/dragonfly)
            <xref ref-type="bibr" rid="ref18">(Paria et al., 2019)</xref>
            . As in
the single-objective optimization, the user must
specify the hyper-parameter search space; in
addition, they also need to specify which functions
they want to optimize. We report a simple coding
example below:

# imports (module paths assumed from the OCTIS library layout;
# MOOptimizer ships with the OCTIS 2.0 extension)
from octis.dataset.dataset import Dataset
from octis.models.LDA import LDA
from octis.evaluation_metrics.diversity_metrics import TopicDiversity
from octis.evaluation_metrics.coherence_metrics import Coherence

# loading of a pre-processed dataset
dataset = Dataset()
dataset.fetch_dataset("DBPedia_IT")
# model instantiation
lda = LDA(num_topics=25)
# definition of the metrics to optimize
td = TopicDiversity()
coh = Coherence()
metrics = [td, coh]
# definition of the search space
config_file = "path/to/search/space/file"
# define and launch the optimization
mmm = MOOptimizer(
    dataset=dataset, model=lda,
    config_file=config_file,
    metrics=metrics, maximize=True)
mmm.optimize()
          </p>
          <p>The snippet will run a multi-objective
optimization experiment that returns the Pareto front of
the diversity and coherence metrics on the
Italian DBPedia dataset, optimizing the
hyper-parameters (defined in a configuration file) of LDA
with 25 topics.</p>
          <p>
            In keeping with the spirit of the first version of
OCTIS, the framework extension is open-source
and easily accessible, in order to guarantee
researchers and practitioners a fairer, more accessible,
and more reproducible comparison between the
models
            <xref ref-type="bibr" rid="ref10 ref22 ref26 ref3 ref4 ref5">(Bianchi and Hovy, 2021)</xref>
            . OCTIS 2.0 is
available as an extension of the original library at
the following link:
https://github.com/MIND-Lab/OCTIS.
          </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Setting</title>
      <p>In the following, we will show the capabilities of
the extended framework on the new datasets by
carrying out a simple experimental campaign.</p>
      <p>We assume an experimental setting in which a
topic modeling practitioner is interested in
discovering the main thematic information of the two
novel datasets in Italian. However, the user does
not have prior knowledge of the datasets and
therefore does not know which topic model is the most
appropriate. Moreover, the user aims to get topics
that are coherent and make sense together but
that are also diverse and separated from each
other. Let us notice that a user could consider a
different set of metrics to optimize, by selecting one
of the already defined metrics available in OCTIS
or by defining novel metrics.</p>
      <sec id="sec-4-1">
        <title>Evaluation Metrics</title>
        <p>We briefly describe the two evaluation metrics
(one for topic coherence and one for topic
diversity) that we will target as the two objectives of
the multi-objective Bayesian optimization. Both
metrics need to be maximized.</p>
        <p>
          IRBO
          <xref ref-type="bibr" rid="ref22 ref25 ref26 ref3 ref4 ref4 ref5 ref5">(Bianchi et al., 2021a; Terragni et al.,
2021b)</xref>
          is a measure of topic diversity (0 for
identical topics and 1 for completely different topics).
It is based on the Rank-Biased Overlap
measure
          <xref ref-type="bibr" rid="ref27">(Webber et al., 2010)</xref>
          . Topics with common
words at different rankings are penalized less than
topics sharing the same words at the highest ranks.
NPMI
          <xref ref-type="bibr" rid="ref15">(Lau et al., 2014)</xref>
          measures the
Normalized Pointwise Mutual Information of each pair of
words (wi, wj) in the top-10 words of each topic.
It is a topic coherence measure that evaluates how
much the words in a topic are related to each other.
        </p>
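        <p>For reference, NPMI for a word pair is commonly computed as
(standard formulation: the probabilities are estimated from word
co-occurrence counts in a reference corpus, and a small epsilon avoids the
logarithm of zero):

\mathrm{NPMI}(w_i, w_j) =
  \frac{\log \frac{P(w_i, w_j) + \epsilon}{P(w_i)\,P(w_j)}}
       {-\log \left( P(w_i, w_j) + \epsilon \right)}

The score of a topic is the average NPMI over all pairs of its top-10
words, and the score of a model is the average over its topics.</p>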
      </sec>
      <sec id="sec-4-2">
        <title>Topic Models and Hyper-Parameter Setting</title>
        <p>
          We focus our experiments on four well-known
topic models that OCTIS already provides: two
of them are considered classical topic models
and the others are neural models. In
particular, we train Latent Dirichlet Allocation
          <xref ref-type="bibr" rid="ref6">(Blei
et al., 2003, LDA)</xref>
          , Non-negative Matrix
Factorization
          <xref ref-type="bibr" rid="ref16">(Lee and Seung, 2000, NMF)</xref>
          , Embedded
Topic Model
          <xref ref-type="bibr" rid="ref9">(Dieng et al., 2020, ETM)</xref>
          , and
Contextualized Topic Models
          <xref ref-type="bibr" rid="ref3 ref3 ref4 ref4 ref5 ref5">(Bianchi et al., 2021a;
Bianchi et al., 2021b, CTM)</xref>
          .
        </p>
        <p>[Table 2: models' hyper-parameters and their values/ranges:
the number of topics; the LDA α and β priors; the NMF regularization
factor [0, 0.5], L1-L2 ratio [0, 1], initialization method, and
regularization target (V matrix, H matrix, or both); and the neural
models' architecture hyper-parameters.]</p>
          <p>
            We summarize the models’ hyper-parameters
and their corresponding ranges in Table 2. For
each model, we optimize the number of topics,
ranging from 5 to 100 topics. We select the
ranges of the hyper-parameters similarly to
previous work
            <xref ref-type="bibr" rid="ref10 ref22 ref25 ref26 ref3 ref4 ref5">(Terragni and Fersini, 2021)</xref>
            .
          </p>
          <p>Regarding LDA, we also optimize the
hyper-parameters α and β, i.e. the priors that control
the sparsity of the topics in the documents and the
sparsity of the words in the topic distributions,
respectively. These hyper-parameters are set to range
between 10^-3 and 10^-1 on a logarithmic scale.</p>
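          <p>Such a log-scale range can be expressed with a log-uniform
prior on the search-space dimension; a small sketch using the
Scikit-Optimize space objects on which OCTIS relies (the dimension names
"alpha" and "eta" follow the usual LDA naming and are assumptions):

from skopt.space.space import Integer, Real

search_space = {
    "num_topics": Integer(low=5, high=100),
    "alpha": Real(low=1e-3, high=1e-1, prior="log-uniform"),
    "eta": Real(low=1e-3, high=1e-1, prior="log-uniform"),  # the beta prior
}
          </p>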
          <p>The hyper-parameters of NMF are mainly
related to the regularization applied to the
factorized matrices. The regularization hyper-parameter
controls if the regularization is applied only to the
matrix V , or to the matrix H, or both. The
regularization factor denotes the constant that multiplies
the regularization terms. It ranges between 0 and
0.5 (0 means no regularization). L1-L2 ratio
controls the ratio between L1 and L2-regularization.
It ranges between 0 and 1, where 0 corresponds to
L2 regularization only, 1 corresponds to L1
regularization only, otherwise it is a combination of
the two types. We also optimize the initialization
method for the two matrices W and H.</p>
          <p>Since ETM and CTM are neural models, their
hyper-parameters are mainly related to the
network architecture. We optimize the number of
neurons (ranging from 100 to 1000, with a step of
100). For simplicity, each layer has the same
number of neurons. We also consider different variants
of activation functions and optimizers. We set the
dropout to range between 0 and 0.9 and the
learning rate to range between 10^-3 and 10^-1 on
a logarithmic scale. We fix the batch size to 200 and
we adopt an early stopping criterion for
determining the convergence of each model.</p>
          <p>
            Moreover, only for CTM we also optimize the
momentum, ranging between 0 and 0.9, and the
number of layers (ranging from 1 to 5).
Following
            <xref ref-type="bibr" rid="ref3 ref4 ref5">(Bianchi et al., 2021b)</xref>
            , we use the
contextualized document representations derived from
SentenceBERT
            <xref ref-type="bibr" rid="ref18 ref19 ref2">(Reimers and Gurevych, 2019)</xref>
            .
In particular, since no SentenceBERT-like model is
available for Italian, we use the pre-trained multilingual
model distiluse-base-multilingual-cased-v1.
          </p>
          <p>For all the models, we set the remaining
parameters to their default values. Finally, we train each
model 30 times and consider the median of the
30 evaluations as the value of the objective function to
be optimized. We sample the n initial
configurations using Latin Hypercube Sampling, with n
equal to the number of hyper-parameters to
optimize plus 2, to provide enough configurations for
fitting the initial surrogate model. The total
number of BO iterations for each model is 125. We
use a Gaussian Process as the probabilistic surrogate
model and the Upper Confidence Bound (UCB) as
the acquisition function.</p>
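          <p>The initial design described above can be reproduced with a
standard Latin Hypercube sampler; a minimal sketch using SciPy's
quasi-Monte Carlo module (the number and ranges of the hyper-parameters
are illustrative):

from scipy.stats import qmc

num_hyperparams = 3           # e.g., num_topics, alpha, eta for LDA
n_init = num_hyperparams + 2  # initial configurations for the surrogate

# n_init space-filling samples in the unit hypercube
sampler = qmc.LatinHypercube(d=num_hyperparams, seed=0)
unit_samples = sampler.random(n=n_init)  # shape (n_init, d), values in [0, 1]

# rescale each dimension to its actual range, e.g. num_topics in [5, 100]
configs = qmc.scale(unit_samples, [5, 1e-3, 1e-3], [100, 1e-1, 1e-1])
          </p>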
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>In the following, we report the results of the
comparative analysis between the considered models
on the Italian datasets.</p>
      <p>We jointly consider the results of both
objectives by plotting the Pareto frontier of the results
of topic diversity and topic coherence. Figure 1
shows the frontier of each model for the pair of
metrics (NPMI, IRBO). We can notice that the
topic models have similar frontiers in each dataset.
The most competitive models are NMF and CTM.
In particular, NMF outperforms the others in
topic coherence but gets a lower coherence as the
diversity increases. Therefore, CTM is the model
to prefer if a user wants to get completely separated
topics with good coherence. LDA and ETM,
instead, have lower performance than the others. We
also noticed in our experiments that the
performance of ETM suffers when the documents are
shorter (on the Europarl dataset), often giving rise
to the phenomenon of mode collapse, i.e.,
all the topics becoming identical to each other.</p>
      <sec id="sec-5-1">
        <title>Qualitative Results</title>
        <p>In Table 3 we report an example of the topics
discovered by the models. We selected the best
hyper-parameter configuration discovered by each model
with 5 topics and randomly sampled one model run
among the 30 runs. Let us notice that, for the sake
of simplicity, we had to fix the number of topics
here and select a single run out of the 30.
Therefore, the qualitative results reported in
Table 3 may not reflect the overall results.</p>
        <p>[Table 3: examples of topics (top words, in Italian) discovered by the models, with the corresponding NPMI scores.]</p>
        <p>
          We can notice that NMF obtains more coherent
and stable topics. CTM and LDA obtain topics
that have a higher variance: in particular, CTM
discovers a topic (the fourth one, NPMI=-0.51)
that lowers the average coherence, while LDA
discovers a topic (the second one, NPMI=0.48) that
effectively increases the average coherence. On
the other hand, the topics discovered by ETM are
more stable but have a lower coherence on
average. As already observed in previous work
          <xref ref-type="bibr" rid="ref1 ref10 ref22 ref26 ref3 ref4">(AlSumait et al., 2009; Doogan and Buntine, 2021)</xref>
          ,
obtaining junk or mixed topics is common in topic
models, and this problem can be addressed by
filtering out the topics that are less relevant.
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper, we presented OCTIS 2.0, the
extension of the evaluation framework OCTIS for topic
modeling. This tool can now address the problem
of estimating the optimal hyper-parameter
configurations of different topic models using a
multi-objective Bayesian optimization approach.
Moreover, we also released two novel datasets in Italian
which can be used as benchmark datasets for the
Italian topic modeling and NLP communities.</p>
      <p>We conducted a simple experimental campaign
to show the potential of the extended framework.</p>
      <p>We have seen that using a multi-objective
hyper-parameter optimization approach allows us not only
to identify the best performing model over the
others, thus guaranteeing a fairer comparison among
different models, but also to empirically discover
the relationships between different objectives.</p>
      <p>As future work, we aim to extend the framework
by considering additional datasets in different and
possibly low-resource languages, which require
different pre-processing strategies and would
allow researchers to investigate the peculiarities of
different topic modeling methods.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Loulwah</surname>
            <given-names>AlSumait</given-names>
          </string-name>
          , Daniel Barbará,
          <string-name>
            <given-names>James</given-names>
            <surname>Gentle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Carlotta</given-names>
            <surname>Domeniconi</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Topic Significance Ranking of LDA Generative Models</article-title>
          .
          <source>In Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD</source>
          <year>2009</year>
          , volume
          <volume>5781</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>67</fpage>
          -
          <lpage>82</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Archetti</surname>
          </string-name>
          and Antonio Candelieri.
          <year>2019</year>
          .
          <article-title>Bayesian Optimization and Data Science</article-title>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Federico</given-names>
            <surname>Bianchi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dirk</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>On the gap between adoption and understanding in nlp</article-title>
          .
          <source>In Findings of the Association for Computational Linguistics: ACL-IJCNLP</source>
          <year>2021</year>
          , pages
          <fpage>3895</fpage>
          -
          <lpage>3901</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Federico</given-names>
            <surname>Bianchi</surname>
          </string-name>
          , Silvia Terragni, and
          <string-name>
            <given-names>Dirk</given-names>
            <surname>Hovy</surname>
          </string-name>
          . 2021a.
          <article-title>Pre-training is a hot topic: Contextualized document embeddings improve topic coherence</article-title>
          .
          <source>In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP</source>
          <year>2021</year>
          , pages
          <fpage>759</fpage>
          -
          <lpage>766</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Federico</given-names>
            <surname>Bianchi</surname>
          </string-name>
          , Silvia Terragni, Dirk Hovy, Debora Nozza, and
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          . 2021b.
          <article-title>Cross-lingual contextualized topic models with zero-shot learning</article-title>
          .
          <source>In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics:</source>
          Main Volume,
          <source>EACL</source>
          <year>2021</year>
          , pages
          <fpage>1676</fpage>
          -
          <lpage>1683</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Latent dirichlet allocation</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>3</volume>
          :
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>David M Blei</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Probabilistic topic models</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>55</volume>
          (
          <issue>4</issue>
          ):
          <fpage>77</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Jordan L. Boyd-Graber</surname>
            ,
            <given-names>Yuening</given-names>
          </string-name>
          <string-name>
            <surname>Hu</surname>
          </string-name>
          , and
          <string-name>
            <surname>David</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Mimno</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Applications of topic models</article-title>
          .
          <source>Found. Trends Inf. Retr.</source>
          ,
          <volume>11</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>143</fpage>
          -
          <lpage>296</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Adji</given-names>
            <surname>Bousso</surname>
          </string-name>
          <string-name>
            <surname>Dieng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Francisco J. R.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Topic modeling in embedding spaces</article-title>
          .
          <source>Trans. Assoc. Comput. Linguistics</source>
          ,
          <volume>8</volume>
          :
          <fpage>439</fpage>
          -
          <lpage>453</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Thanh-Nam Doan</surname>
          </string-name>
          and
          <string-name>
            <surname>Tuan-Anh Hoang</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Benchmarking neural topic models: An empirical study</article-title>
          .
          <source>In Findings of the Association for Computational Linguistics: ACL-IJCNLP</source>
          <year>2021</year>
          . Caitlin Doogan
          and
          <string-name>
            <surname>Wray L. Buntine</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Topic model or topic twaddle? re-evaluating semantic interpretability measures</article-title>
          .
          <source>In Proceedings of the</source>
          <year>2021</year>
          <article-title>Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online</article-title>
          , June 6-11,
          <year>2021</year>
          , pages
          <fpage>3824</fpage>
          -
          <lpage>3848</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Bruno</given-names>
            <surname>Giovanni</surname>
          </string-name>
          <string-name>
            <surname>Galuzzi</surname>
          </string-name>
          , Ilaria Giordani, Antonio Candelieri, Riccardo Perego, and
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Archetti</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Hyperparameter optimization for recommender systems through bayesian optimization</article-title>
          .
          <source>Computational Management Science</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Derek</given-names>
            <surname>Greene</surname>
          </string-name>
          and Pádraig Cunningham.
          <year>2006</year>
          .
          <article-title>Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering</article-title>
          .
          <source>In Proceedings of the 23rd International Conference on Machine learning (ICML'06)</source>
          , pages
          <fpage>377</fpage>
          -
          <lpage>384</lpage>
          . ACM Press.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Tim</given-names>
            <surname>Head</surname>
          </string-name>
          , MechCoder, Gilles Louppe,
          <string-name>
            <surname>Iaroslav Shcherbatyi</surname>
          </string-name>
          , et al.
          <year>2018</year>
          .
          <source>scikit-optimize/scikit-optimize: v0.5.2</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Kirthevasan</given-names>
            <surname>Kandasamy</surname>
          </string-name>
          , Karun Raju Vysyaraju, Willie Neiswanger, Biswajit Paria, Christopher R. Collins, Jeff Schneider,
          Barnabás Póczos, and
          <string-name>
            <given-names>Eric P.</given-names>
            <surname>Xing</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>21</volume>
          :81:
          <fpage>1</fpage>
          -
          <lpage>81</lpage>
          :
          <fpage>27</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Jey</given-names>
            <surname>Han Lau</surname>
          </string-name>
          , David Newman,
          <string-name>
            <given-names>and Timothy</given-names>
            <surname>Baldwin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality</article-title>
          .
          <source>In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics</source>
          ,
          <string-name>
            <surname>EACL</surname>
          </string-name>
          <year>2014</year>
          , pages
          <fpage>530</fpage>
          -
          <lpage>539</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Daniel D.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. Sebastian</given-names>
            <surname>Seung</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Algorithms for non-negative matrix factorization</article-title>
          .
          <source>In Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS)</source>
          <year>2000</year>
          , pages
          <fpage>556</fpage>
          -
          <lpage>562</lpage>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Kar</given-names>
            <surname>Wai Lim and Wray L. Buntine</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Bibliographic analysis with the citation network topic model</article-title>
          .
          <source>In Proceedings of the Sixth Asian Conference on Machine Learning</source>
          ,
          <string-name>
            <surname>ACML</surname>
          </string-name>
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Biswajit</given-names>
            <surname>Paria</surname>
          </string-name>
          , Kirthevasan Kandasamy, and Barnabás Póczos.
          <year>2019</year>
          .
          <article-title>A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations</article-title>
          .
          <source>In Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI)</source>
          , volume
          <volume>115</volume>
          <source>of Proceedings of Machine Learning Research</source>
          , pages
          <fpage>766</fpage>
          -
          <lpage>776</lpage>
          ,
          <string-name>
            <surname>Tel</surname>
            <given-names>Aviv</given-names>
          </string-name>
          , Israel. AUAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Nils</given-names>
            <surname>Reimers</surname>
          </string-name>
          and
          <string-name>
            <given-names>Iryna</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks</article-title>
          .
          <source>In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing</source>
          ,
          <source>(EMNLP-IJCNLP)</source>
          , pages
          <fpage>3980</fpage>
          -
          <lpage>3990</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Bobak</given-names>
            <surname>Shahriari</surname>
          </string-name>
          , Kevin Swersky, Ziyu Wang,
          <string-name>
            <surname>Ryan P Adams</surname>
          </string-name>
          , and Nando De Freitas.
          <year>2015</year>
          .
          <article-title>Taking the human out of the loop: A review of bayesian optimization</article-title>
          .
          <source>Proceedings of the IEEE</source>
          ,
          <volume>104</volume>
          (
          <issue>1</issue>
          ):
          <fpage>148</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Jasper</given-names>
            <surname>Snoek</surname>
          </string-name>
          , Hugo Larochelle, and
          <string-name>
            <given-names>Ryan P.</given-names>
            <surname>Adams</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Practical Bayesian Optimization of Machine Learning Algorithms</article-title>
          .
          <source>In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems</source>
          , pages
          <fpage>2960</fpage>
          -
          <lpage>2968</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Terragni</surname>
          </string-name>
          and
          <string-name>
            <given-names>Elisabetta</given-names>
            <surname>Fersini</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>An empirical analysis of topic models: Uncovering the relationships between hyperparameters, document length and performance measures</article-title>
          .
          <source>In Recent Advances in Natural Language Processing (RANLP).</source>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Terragni</surname>
          </string-name>
          , Elisabetta Fersini, and
          <string-name>
            <given-names>Enza</given-names>
            <surname>Messina</surname>
          </string-name>
          . 2020a.
          <article-title>Constrained relational topic models</article-title>
          .
          <source>Information Sciences</source>
          ,
          <volume>512</volume>
          :
          <fpage>581</fpage>
          -
          <lpage>594</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Terragni</surname>
          </string-name>
          , Debora Nozza, Elisabetta Fersini, and
          <string-name>
            <given-names>Messina</given-names>
            <surname>Enza</surname>
          </string-name>
          . 2020b.
          <article-title>Which matters most? comparing the impact of concept and document relationships in topic models</article-title>
          .
          <source>In Proceedings of the First Workshop on Insights from Negative Results in NLP</source>
          , pages
          <fpage>32</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Terragni</surname>
          </string-name>
          , Elisabetta Fersini, Bruno Giovanni Galuzzi, Pietro Tropeano, and Antonio Candelieri. 2021a.
          <article-title>OCTIS: Comparing and Optimizing Topic models is Simple! In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations</article-title>
          ,
          <string-name>
            <surname>EACL</surname>
          </string-name>
          <year>2021</year>
          , pages
          <fpage>263</fpage>
          -
          <lpage>270</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Terragni</surname>
          </string-name>
          , Elisabetta Fersini, and
          <string-name>
            <given-names>Enza</given-names>
            <surname>Messina</surname>
          </string-name>
          . 2021b.
          <article-title>Word embedding-based topic similarity measures</article-title>
          .
          <source>In Natural Language Processing and Information Systems - 26th International Conference on Applications of Natural Language to Information Systems, NLDB</source>
          <year>2021</year>
          , volume
          <volume>12801</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>33</fpage>
          -
          <lpage>45</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>William</given-names>
            <surname>Webber</surname>
          </string-name>
          , Alistair Moffat, and
          <string-name>
            <given-names>Justin</given-names>
            <surname>Zobel</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>A similarity measure for indefinite rankings</article-title>
          .
          <source>ACM Trans. Inf</source>
          . Syst.,
          <volume>28</volume>
          (
          <issue>4</issue>
          ):
          <volume>20</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          :
          <fpage>38</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>