=Paper=
{{Paper
|id=Vol-3033/paper55
|storemode=property
|title=OCTIS 2.0: Optimizing and Comparing Topic Models in Italian Is Even Simpler!
|pdfUrl=https://ceur-ws.org/Vol-3033/paper55.pdf
|volume=Vol-3033
|authors=Silvia Terragni,Elisabetta Fersini
|dblpUrl=https://dblp.org/rec/conf/clic-it/TerragniF21
}}
==OCTIS 2.0: Optimizing and Comparing Topic Models in Italian Is Even Simpler!==
Silvia Terragni and Elisabetta Fersini
University of Milano-Bicocca, Milan, Italy
s.terragni4@campus.unimib.it, elisabetta.fersini@unimib.it
Abstract

English. OCTIS is an open-source framework for training, evaluating and comparing Topic Models. This tool uses single-objective Bayesian Optimization (BO) to optimize the hyper-parameters of the models and thus guarantee a fairer comparison. Yet, a single-objective approach disregards that a user may want to simultaneously optimize multiple objectives. We therefore propose OCTIS 2.0: the extension of OCTIS that addresses the problem of estimating the optimal hyper-parameter configurations for a topic model using multi-objective BO. Moreover, we also release and integrate two pre-processed Italian datasets, which can be easily used as benchmarks for the Italian language.

Italiano. OCTIS è un framework open-source per il training, la valutazione e la comparazione di Topic Models. Questo strumento utilizza l’ottimizzazione Bayesiana (BO) a singolo obiettivo per ottimizzare gli iperparametri dei modelli e quindi garantire una comparazione più equa. Tuttavia, questo approccio ignora che un utente potrebbe voler ottimizzare più di un obiettivo. Proponiamo perciò OCTIS 2.0: l’estensione di OCTIS che affronta il problema della stima delle configurazioni ottimali degli iperparametri di un topic model usando la BO multi-obiettivo. In aggiunta, rilasciamo e integriamo anche due nuovi dataset in italiano pre-processati, che possono essere facilmente utilizzati come benchmark per la lingua italiana.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Topic models are statistical methods that aim to extract the hidden topics underlying a collection of documents (Blei et al., 2003; Blei, 2012; Boyd-Graber et al., 2017). Topics are often represented by sets of words that make sense together, e.g. the words “cat, animal, dog, mouse” may represent a topic about animals. Topic models’ evaluations are usually limited to the comparison of models whose hyper-parameters are held fixed (Doan and Hoang, 2021; Terragni et al., 2020a; Terragni et al., 2020b). However, hyper-parameters can have an impressive impact on the models’ performance, and therefore fixing the hyper-parameters prevents the researchers from discovering the best topic model on the selected dataset.

Recently, OCTIS (Terragni et al., 2021a, Optimizing and Comparing Topic Models is Simple) has been released: a comprehensive and open-source framework for training, analyzing, and comparing topic models over several datasets and evaluation metrics. OCTIS determines the optimal hyper-parameter configuration according to a Bayesian Optimization (BO) strategy (Archetti and Candelieri, 2019; Snoek et al., 2012; Galuzzi et al., 2020). The framework already provides several features and resources, among which at least 8 topic models, 4 categories of evaluation metrics, and 4 pre-processed datasets. However, the framework uses a single-objective Bayesian optimization approach, disregarding that a user may want to simultaneously optimize more than one objective (Terragni and Fersini, 2021). For example, a user may be interested in obtaining topics that are coherent but also diverse and separated from each other.

Contributions. In this paper, we propose OCTIS 2.0, an extension of the existing framework that integrates both a single-objective and multi-objective hyper-parameter optimization
strategy, using Bayesian optimization. Moreover, we also pre-process and include two novel datasets in Italian. We will then briefly show the potentiality of the extended framework by comparing different topic models on the newly released Italian datasets. We believe these resources can be useful for the topic modeling and NLP communities, since they can be used as benchmarks for the Italian language.

2 OCTIS: Optimizing and Comparing Topic Models Is Simple!

2.1 OCTIS 1.0

OCTIS (Terragni et al., 2021a, Optimizing and Comparing is Simple!) is an open-source evaluation framework for the comparison of topic models that allows a user to optimize the models’ hyper-parameters for a fair experimental comparison. The evaluation framework is composed of different modules that interact with each other: (1) dataset and pre-processing tools, (2) topic modeling, (3) hyper-parameter optimization, (4) evaluation metrics. OCTIS can be used both as a Python library and through a web dashboard. It also provides a set of pre-processed datasets, state-of-the-art topic models and several evaluation metrics.

We will now briefly describe the two components that we will extend in this work: the pre-processed datasets and the hyper-parameter optimization module.

Pre-processing and Datasets. OCTIS currently provides functionalities for pre-processing the texts, which include the lemmatization of the text, the removal of punctuation, numbers and stop-words, and the removal of words based on their frequency. Moreover, the framework already provides 4 pre-processed datasets that are ready to use for topic modeling. These datasets are 20 NewsGroups[1], M10 (Lim and Buntine, 2014), DBLP[2], and BBC News (Greene and Cunningham, 2006). All the datasets are split into three partitions: training, testing and validation.

[1] http://people.csail.mit.edu/jrennie/20Newsgroups/
[2] https://github.com/shiruipan/TriDNR/tree/master/data

All the currently provided datasets are in English. OCTIS already provides language-specific pre-processing tools (e.g. lemmatizers for multiple languages), but it does not present datasets in other languages. Creating benchmark datasets for other languages is useful for investigating the peculiarities of different topic modeling methods.

Single-Objective Hyper-parameter Optimization. OCTIS uses single-objective Bayesian Optimization (Snoek et al., 2012; Shahriari et al., 2015) to tune the topic models’ hyper-parameters with respect to a selected evaluation metric. In particular, the user specifies the search space for the hyper-parameters and an objective metric. Then, BO sequentially explores the search space to determine the optimal hyper-parameter configuration. Since the models are usually probabilistic and can give different results with the same hyper-parameter configuration, the objective function is computed as the median of a given number of model runs (i.e., topic models run with the same hyper-parameter configuration) computed for the selected evaluation metric. OCTIS uses the Scikit-Optimize library (Head et al., 2018) for the implementation of the single-objective hyper-parameter Bayesian optimization.

The use of a single-objective approach is, however, limited: this strategy disregards all other objectives. For example, a user may require to optimize the coherence of the topics and their diversity at the same time.

2.2 OCTIS 2.0

New dataset resources for the Italian language. Since OCTIS provides only English datasets, we extend the set of datasets by including two new datasets in Italian. We build the two datasets from the Italian version of the Europarl dataset[3] and from the Italian abstracts of DBPedia[4]. In particular, we randomly sample 5000 documents from Europarl, and we randomly sample 1000 Italian abstracts for 5 DBpedia types (event, organization, place, person, work), for a total of 5000 abstracts.

[3] https://www.statmt.org/europarl/
[4] https://www.dbpedia.org/resources/ontology/

We preprocess the datasets using the following strategy: we lemmatize the text; we remove the punctuation, numbers and Italian stop-words; we filter out the words with a document frequency higher than 50% or lower than 0.1% for Europarl (0.2% for DBPedia); and we also remove the documents with less than 5 words. These values have been chosen by manually inspecting the resulting pre-processed datasets.
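The vocabulary filtering step just described can be sketched in a few lines of plain Python. This is an illustrative re-implementation, not the OCTIS code: the function name and the inclusive treatment of the two thresholds are our own choices, with the DBPedia values (maximum document frequency 50%, minimum 0.2%) as defaults.

```python
from collections import Counter

def filter_corpus(docs, max_df=0.5, min_df=0.002, min_doc_len=5):
    """Drop words whose document frequency falls outside [min_df, max_df],
    then drop documents left with fewer than min_doc_len words.
    `docs` is a list of tokenized (already lemmatized) documents."""
    n_docs = len(docs)
    # document frequency: in how many documents each word occurs
    df = Counter(word for doc in docs for word in set(doc))
    vocab = {word for word, count in df.items()
             if min_df <= count / n_docs <= max_df}
    kept = [[word for word in doc if word in vocab] for doc in docs]
    return [doc for doc in kept if len(doc) >= min_doc_len]
```

On the real corpora this runs after lemmatization and stop-word removal; the document counts in Table 1 result from the last step, which discards documents that become too short once rare and overly frequent words are removed.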
We report the most relevant statistics of the novel Italian datasets in Table 1. Following the original paper, we split the datasets in three partitions: training (75%), validation (15%), and testing (15%).

Dataset  | Num. of documents | Avg. doc length (Std. dev.) | Num. of unique words
DBPedia  | 4251              | 5.5 (11.8)                  | 2047
Europarl | 3616              | 20.6 (19.3)                 | 2000

Table 1: Statistics of the pre-processed datasets.

From Single-objective to Multi-objective Hyper-parameter Bayesian Optimization. Given the limitations of the single-objective hyper-parameter optimization approach, we extend OCTIS by including a multi-objective approach (Kandasamy et al., 2020; Paria et al., 2019). Single-objective BO can in fact be generalized to multiple objective functions, where the final aim is to recover the Pareto frontier of the objective functions, i.e. the set of Pareto optimal points. A point is Pareto optimal if it cannot be improved in any of the objectives without degrading some other objective. Using a multi-objective hyper-parameter optimization approach thus allows us not only to identify the best performing model, but also to empirically discover competing objectives.

Since the original Scikit-Optimize library does not provide multi-objective optimization tools, we use the dragonfly library[5] (Paria et al., 2019). Like the single-objective optimization, the user must specify the hyper-parameter search space. In addition, they also need to specify which functions they want to optimize. We report a simple coding example below:

# loading of a pre-processed dataset
dataset = Dataset()
dataset.fetch_dataset("DBPedia_IT")

# model instantiation
lda = LDA(num_topics=25)

# definition of the metrics to optimize
td = TopicDiversity()
coh = Coherence()
metrics = [td, coh]

# definition of the search space
config_file = "path/to/search/space/file"

# define and launch the optimization
mmm = MOOptimizer(
    dataset=dataset, model=lda,
    config_file=config_file,
    metrics=metrics, maximize=True)
mmm.optimize()

The snippet will run a multi-objective optimization experiment that will return the Pareto front of the diversity and coherence metrics on the Italian dataset DBPedia by optimizing the hyper-parameters (defined in a configuration file) of LDA with 25 topics.

[5] https://github.com/dragonfly/dragonfly

In keeping with the spirit of the first version of OCTIS, the framework extension is open-source and easily accessible, in order to guarantee researchers and practitioners a fairer, accessible and reproducible comparison between the models (Bianchi and Hovy, 2021). OCTIS 2.0 is available as an extension of the original library at the following link: https://github.com/mind-Lab/octis.

3 Experimental Setting

In the following, we will show the capabilities of the extended framework on the new datasets by carrying out a simple experimental campaign.

We assume an experimental setting in which a topic modeling practitioner is interested in discovering the main thematic information of the two novel datasets in Italian. However, the user does not have prior knowledge of the datasets and therefore does not know which topic model is the most appropriate. Moreover, the user aims to get topics which are coherent and make sense together, but which are also diverse and separated from the others. Let us notice that a user could consider a different set of metrics to optimize, by selecting one of the already defined metrics available in OCTIS or by defining novel metrics.

3.1 Evaluation Metrics

We briefly describe the two evaluation metrics (one of topic coherence and one of topic diversity) that we will target as the two objectives of the multi-objective Bayesian optimization. Both metrics need to be maximized.

IRBO (Bianchi et al., 2021a; Terragni et al., 2021b) is a measure of topic diversity (0 for identical topics and 1 for completely different topics). It is based on the Ranked-Biased Overlap measure (Webber et al., 2010). Topics with common words at different rankings are penalized less than topics sharing the same words at the highest ranks.
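As a toy illustration of this rank-aware behaviour, the following sketch computes a truncated, normalized variant of RBO and inverts its average over all topic pairs. It is a simplification for intuition only: the function names are ours, and the released metric follows the exact formulation of Webber et al. (2010).

```python
from itertools import combinations

def rbo(a, b, p=0.9):
    """Truncated, normalized Ranked-Biased Overlap of two ranked word
    lists: shared words at the top ranks weigh more than shared words
    further down, with weight p**(d-1) at depth d."""
    k = min(len(a), len(b))
    score = sum(p ** (d - 1) * len(set(a[:d]) & set(b[:d])) / d
                for d in range(1, k + 1))
    # normalize by the score of two identical lists of length k
    return score / sum(p ** (d - 1) for d in range(1, k + 1))

def irbo(topics, p=0.9):
    """1 minus the average pairwise RBO: 0 for identical topics,
    1 for topics that share no words at all."""
    pairs = list(combinations(topics, 2))
    return 1 - sum(rbo(a, b, p) for a, b in pairs) / len(pairs)
```

With identical topic word lists the sketch returns 0 and with fully disjoint lists it returns 1, matching the range reported above.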
NPMI (Lau et al., 2014) measures the Normalized Pointwise Mutual Information of each pair of words (wi, wj) in the 10 top words of each topic. It is a topic coherence measure that evaluates how much the words in a topic are related to each other.

3.2 Topic Models and Hyper-Parameter Setting

We focus our experiments on four well-known topic models that OCTIS already provides: two of them are considered classical topic models and the others are neural models. In particular, we trained Latent Dirichlet Allocation (Blei et al., 2003, LDA), Non-negative Matrix Factorization (Lee and Seung, 2000, NMF), Embedded Topic Model (Dieng et al., 2020, ETM), and Contextualized Topic Models (Bianchi et al., 2021a; Bianchi et al., 2021b, CTM).

Model | Hyper-parameter       | Values/Range
All   | Number of topics      | [5, 100]
LDA   | α prior               | [10^-3, 10]
LDA   | β prior               | [10^-3, 10]
NMF   | Regularization factor | [0, 0.5]
NMF   | L1-L2 ratio           | [0, 1]
NMF   | Initialization method | nndsvd, nndsvda, nndsvdar, random
NMF   | Regularization        | V matrix, H matrix, both
ETM   | Activation function   | elu, sigmoid, softplus, selu
ETM   | Dropout               | [0, 0.9]
ETM   | Learning rate         | [10^-3, 10^-1]
ETM   | Number of neurons     | {100, 200, ..., 900, 1000}
ETM   | Optimizer             | adam, sgd, rmsprop
CTM   | Activation function   | elu, sigmoid, softplus, selu
CTM   | Dropout               | [0, 0.9]
CTM   | Learning rate         | [10^-3, 10^-1]
CTM   | Momentum              | [0, 0.9]
CTM   | Number of layers      | 1, 2, 3, 4, 5
CTM   | Number of neurons     | {100, 200, ..., 900, 1000}
CTM   | Optimizer             | adam, sgd, rmsprop

Table 2: Hyper-parameters and ranges.

We summarize the models’ hyper-parameters and their corresponding ranges in Table 2. For each model, we optimize the number of topics, ranging from 5 to 100 topics. We select the ranges of the hyper-parameters similarly to previous work (Terragni and Fersini, 2021).

Regarding LDA, we also optimize the α and β priors, which control the sparsity of the topics in the documents and the sparsity of the words in the topic distributions, respectively. These hyper-parameters are set to range between 10^-3 and 10^-1 on a logarithmic scale.

The hyper-parameters of NMF are mainly related to the regularization applied to the factorized matrices. The regularization hyper-parameter controls if the regularization is applied only to the matrix V, or to the matrix H, or to both. The regularization factor denotes the constant that multiplies the regularization terms. It ranges between 0 and 0.5 (0 means no regularization). The L1-L2 ratio controls the ratio between L1 and L2 regularization. It ranges between 0 and 1, where 0 corresponds to L2 regularization only, 1 corresponds to L1 regularization only, and any value in between is a combination of the two types. We also optimize the initialization method for the two matrices W and H.

Since ETM and CTM are neural models, their hyper-parameters are mainly related to the network architecture. We optimize the number of neurons (ranging from 100 to 1000, with a step of 100). For simplicity, each layer has the same number of neurons. We also consider different variants of activation functions and optimizers. We set the dropout to range between 0 and 0.9 and the learning rate to range between 10^-3 and 10^-1, on a logarithmic scale. We fix the batch size to 200 and we adopted an early stopping criterion for determining the convergence of each model.

Moreover, only for CTM we also optimized the momentum, ranging between 0 and 0.9, and the number of layers (ranging from 1 to 5). Following (Bianchi et al., 2021b), we use the contextualized document representations derived from SentenceBERT (Reimers and Gurevych, 2019). In particular, we use the pre-trained multilingual Universal Sentence Encoder[6].

[6] Let us notice that there is not a SentenceBERT-like model for Italian. Therefore we used a multilingual one: distiluse-base-multilingual-cased-v1.

For all the models, we set the remaining parameters to their default values. Finally, we train each model 30 times and consider the median of the 30 evaluations as the evaluation of the function to be optimized.
We sample the n initial configurations using Latin Hypercube Sampling, with n equal to the number of hyper-parameters to optimize plus 2, to provide enough configurations for the initial surrogate model to fit. The total number of BO iterations for each model is 125. We use a Gaussian Process as the probabilistic surrogate model and the Upper Confidence Bound (UCB) as the acquisition function.

4 Results

In the following, we report the results of the comparative analysis between the considered models on the Italian datasets.

4.1 Quantitative Results

Figure 1: Pareto front of the performance of the considered models for the analyzed Italian datasets.

We jointly consider the results of both objectives by plotting the Pareto frontier of the results of topic diversity and topic coherence. Figure 1 shows the frontier of each model for the pair of metrics (NPMI, IRBO). We can notice that the topic models have similar frontiers in each dataset. The most competitive models are NMF and CTM. In particular, NMF outperforms the others for the topic coherence but gets a lower coherence as the diversity increases. Therefore, CTM is the model to prefer if a user wants to get totally separated topics but good coherence. Instead, LDA and ETM have lower performance than the others. We also noticed from our experiments that the performance of ETM is affected when the documents are shorter (on the Europarl dataset), often giving rise to the phenomenon of mode collapse, i.e. obtaining topics that are all equal to each other.

4.2 Qualitative Results

In Table 3 we report an example of topics discovered by the models. We selected the best hyper-parameter configuration discovered by the models with 5 topics and randomly sampled a model run among the 30 runs. Let us notice that, for the sake of simplicity, we have to fix the number of topics here and select a run among the total of 30 runs. Therefore, the qualitative results reported in Table 3 may not reflect the overall results.

We can notice that NMF obtains more coherent and stable topics. CTM and LDA obtain topics that have a higher variance: in particular, CTM discovers a topic (the fourth one, NPMI=-0.51) that lowers the average coherence, while LDA discovers a topic (the second one, NPMI=0.48) that effectively increases the average coherence. On the other hand, the topics discovered by ETM are more stable but have a lower coherence on average. As already observed in previous work (AlSumait et al., 2009; Doogan and Buntine, 2021), obtaining junk or mixed topics is common in topic models, and this problem can be addressed by filtering out the topics that are less relevant.

5 Conclusion

In this paper, we presented OCTIS 2.0, the extension of the evaluation framework OCTIS for topic modeling. This tool can now address the problem of estimating the optimal hyper-parameter configurations of different topic models using a multi-objective Bayesian optimization approach. Moreover, we also released two novel datasets in Italian which can be used as benchmark datasets for the Italian topic modeling and NLP communities.

We conducted a simple experimental campaign to show the potentiality of the extended framework. We have seen that using a multi-objective hyper-parameter optimization approach allows us not only to identify the best performing model over the others, thus guaranteeing a fairer comparison among different models, but also to empirically discover the relationships between different objectives.

As future work, we aim to extend the framework by considering additional datasets in different and possibly low-resource languages, which require different pre-processing strategies and would allow researchers to investigate the peculiarities of different topic modeling methods.

Model | Top words | NPMI
LDA | de album pubblicare italiano the uniti situare fondare università noto | -0.05
LDA | torneo giocare tennis edizione tour atp ambito open categoria cemento | 0.48
LDA | film pubblicare the album serie musicale venire statunitense rock band | 0.11
LDA | guerra battaglia venire situare statunitense spagnolo partito esercito distretto mondiale | -0.14
LDA | comune campionato squadra abitante calcio regione situare società francese vincere | -0.03
NMF | comune abitante dipartimento regione situare francese alta distretto est grand | 0.29
NMF | torneo giocare tennis tour atp open edizione ambito categoria cemento | 0.48
NMF | album pubblicare studio the musicale statunitense records singolo cantante rock | 0.29
NMF | calciatore ruolo allenatore calcio centrocampista difensore attaccante portiere settembre aprile | 0.24
NMF | contea america uniti situare comune censimento designated census place capoluogo | 0.39
CTM | album the pubblicare band statunitense singolo brano of musicale rock | 0.26
CTM | superare argentino calciatore el buenos maria en svezia situare chiesa | -0.29
CTM | partito battaglia guerra venire politico de linea isola stazione regno | -0.08
CTM | st stella vendetta dollaro robert company ritorno west superiore soggetto | -0.51
CTM | edizione tennis giocare torneo vincere tour campionato maschile disputare squadra | 0.18
ETM | sede de italiano fondare nome azienda noto francese compagnia parigi | 0.06
ETM | guerra partito battaglia venire nord politico tedesco esercito regno militare | 0.03
ETM | torneo situare comune giocare abitante edizione tennis tour regione uniti | -0.10
ETM | film serie the dirigere gioco pubblicare statunitense televisivo venire romanzo | 0.07
ETM | album pubblicare campionato squadra musicale the calcio statunitense singolo vincere | -0.12

Table 3: Example of top words of 5 topics for each considered model and the corresponding topic coherence (NPMI).

References

Loulwah AlSumait, Daniel Barbará, James Gentle, and Carlotta Domeniconi. 2009. Topic Significance Ranking of LDA Generative Models. In Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, volume 5781 of Lecture Notes in Computer Science, pages 67–82. Springer.

Francesco Archetti and Antonio Candelieri. 2019. Bayesian Optimization and Data Science. Springer International Publishing.

Federico Bianchi and Dirk Hovy. 2021. On the gap between adoption and understanding in NLP. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 3895–3901.

Federico Bianchi, Silvia Terragni, and Dirk Hovy. 2021a. Pre-training is a hot topic: Contextualized document embeddings improve topic coherence. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, pages 759–766. Association for Computational Linguistics.

Federico Bianchi, Silvia Terragni, Dirk Hovy, Debora Nozza, and Elisabetta Fersini. 2021b. Cross-lingual contextualized topic models with zero-shot learning. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021, pages 1676–1683. Association for Computational Linguistics.

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.

David M. Blei. 2012. Probabilistic topic models. Communications of the ACM, 55(4):77–84.

Jordan L. Boyd-Graber, Yuening Hu, and David M. Mimno. 2017. Applications of topic models. Foundations and Trends in Information Retrieval, 11(2-3):143–296.

Adji Bousso Dieng, Francisco J. R. Ruiz, and David M. Blei. 2020. Topic modeling in embedding spaces. Transactions of the Association for Computational Linguistics, 8:439–453.

Thanh-Nam Doan and Tuan-Anh Hoang. 2021. Benchmarking neural topic models: An empirical study. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 4363–4368. Association for Computational Linguistics.

Caitlin Doogan and Wray L. Buntine. 2021. Topic model or topic twaddle? Re-evaluating semantic interpretability measures. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, pages 3824–3848. Association for Computational Linguistics.

Bruno Giovanni Galuzzi, Ilaria Giordani, Antonio Candelieri, Riccardo Perego, and Francesco Archetti. 2020. Hyperparameter optimization for recommender systems through Bayesian optimization. Computational Management Science, pages 1–21.

Derek Greene and Pádraig Cunningham. 2006. Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering. In Proceedings of the 23rd International Conference on Machine Learning (ICML’06), pages 377–384. ACM Press.

Tim Head, Gilles Louppe, MechCoder, Iaroslav Shcherbatyi, et al. 2018. scikit-optimize/scikit-optimize: v0.5.2.

Kirthevasan Kandasamy, Karun Raju Vysyaraju, Willie Neiswanger, Biswajit Paria, Christopher R. Collins, Jeff Schneider, Barnabás Póczos, and Eric P. Xing. 2020. Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly. Journal of Machine Learning Research, 21:81:1–81:27.

Jey Han Lau, David Newman, and Timothy Baldwin. 2014. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014, pages 530–539.

Daniel D. Lee and H. Sebastian Seung. 2000. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS) 2000, pages 556–562. MIT Press.

Kar Wai Lim and Wray L. Buntine. 2014. Bibliographic analysis with the citation network topic model. In Proceedings of the Sixth Asian Conference on Machine Learning, ACML 2014.

Biswajit Paria, Kirthevasan Kandasamy, and Barnabás Póczos. 2019. A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations. In Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI), volume 115 of Proceedings of Machine Learning Research, pages 766–776, Tel Aviv, Israel. AUAI Press.

Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3980–3990. Association for Computational Linguistics.

Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando de Freitas. 2015. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175.

Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems, pages 2960–2968.

Silvia Terragni and Elisabetta Fersini. 2021. An empirical analysis of topic models: Uncovering the relationships between hyperparameters, document length and performance measures. In Recent Advances in Natural Language Processing (RANLP).

Silvia Terragni, Elisabetta Fersini, and Enza Messina. 2020a. Constrained relational topic models. Information Sciences, 512:581–594.

Silvia Terragni, Debora Nozza, Elisabetta Fersini, and Enza Messina. 2020b. Which matters most? Comparing the impact of concept and document relationships in topic models. In Proceedings of the First Workshop on Insights from Negative Results in NLP, pages 32–40.

Silvia Terragni, Elisabetta Fersini, Bruno Giovanni Galuzzi, Pietro Tropeano, and Antonio Candelieri. 2021a. OCTIS: Comparing and Optimizing Topic Models is Simple! In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, EACL 2021, pages 263–270. Association for Computational Linguistics.

Silvia Terragni, Elisabetta Fersini, and Enza Messina. 2021b. Word embedding-based topic similarity measures. In Natural Language Processing and Information Systems - 26th International Conference on Applications of Natural Language to Information Systems, NLDB 2021, volume 12801 of Lecture Notes in Computer Science, pages 33–45. Springer.

William Webber, Alistair Moffat, and Justin Zobel. 2010. A similarity measure for indefinite rankings. ACM Transactions on Information Systems, 28(4):20:1–20:38.