<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Detection of the Topical Structure of the Ministerial Posts on Social Networks</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Saint Petersburg State University</institution>
          ,
          <addr-line>Russia, 199034, Saint Petersburg, Universitetskaya emb. 11</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Journalism and Mass Communications, Saint Petersburg State University</institution>
          ,
          <addr-line>Russia, 199004, Saint Petersburg, VO, 1 Line, 26</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2041</year>
      </pub-date>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>The paper discusses the development of a corpus of Russian ministerial posts from the VKontakte social network. The study aims to reveal the topical structure of ministerial communities. We performed a series of experiments that include LDA topic modeling and automatic topic labeling, which helps to improve the interpretability of topics. To implement the procedures, we used Python libraries for NLP. The experiments allowed us to identify the pivotal topics that the government of Russia currently covers on social networks.</p>
      </abstract>
      <kwd-group>
        <kwd>Social Network</kwd>
        <kwd>Ministerial Post</kwd>
        <kwd>Corpus Linguistics</kwd>
        <kwd>Russian</kwd>
        <kwd>Topic Modeling</kwd>
        <kwd>LDA</kwd>
        <kwd>Automatic Topic Labeling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In the era of digitalization of society, many areas of life are reflected on the Internet.
Government agencies, services and ministries are no exception: using the available
tools and applications of online platforms, departments create communities in which
they publish posts about the current state of the country, as well as its regions.</p>
      <p>There are a great number of ministries in Russia: the Ministry of Defense, the
Ministry of Foreign Affairs, the Ministry of Culture, etc. At first glance, it seems that
the name of each ministry directly determines the topics of the
posts that it publishes on social networks. However, the topical structure of
ministerial groups is much more complicated. For example, the coronavirus pandemic has
affected many areas of the Apparatus of the Government of Russia, so some
issues are common to many ministries. The most obvious one concerns the
problems of online education and the ways to overcome them, which are dealt with by
both the Ministry of Education and the Ministry of Digital Development,
Communications and Mass Media.</p>
      <p>This study is aimed at detecting the main topical areas in ministerial posts on
the VKontakte social network. This social network is very popular among residents of Russia,
therefore ministries and other governmental departments are eager to create their
communities there. We use LDA and topic labeling algorithms to reveal the main
topics of ministerial posts. To our knowledge, these procedures have not previously been carried out on a
corpus of ministerial posts.</p>
    </sec>
    <sec id="sec-2">
      <title>Related works</title>
      <p>Large text collections, known as corpora, are becoming more and more popular
among linguists, political scientists, sociologists and other researchers,
who use corpora in their applied research.</p>
      <p>
        Papers written in English [
        <xref ref-type="bibr" rid="ref13 ref14 ref3 ref5">3, 5, 13, 14</xref>
        ] describe how politically oriented posts function (for
example, on Twitter or Facebook) and how they affect the public opinion of users.
Most of the authors note that the integration of the governmental apparatus and
social networks involves a certain degree of risk.
      </p>
      <p>
        As regards Russian studies of political discourse, only a small number of works
address this topic. The papers [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] are based on
a diachronic description of political vocabulary and its frequency behavior. The
experiments were conducted on the basis of Google Books Ngram Viewer<sup>2</sup>. The authors pay
special attention to the names of political figures and important events in the history
of Russia. The corpus used by the scholars is compiled from a
large number of digitized printed publications and does not focus on texts
found on social networks. In our study, we describe some linguistic
features of ministerial posts on online platforms, which helps to fill this gap in Russian
corpus linguistics and topic modeling research.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Experiment</title>
      <sec id="sec-3-1">
        <title>Corpus collection and preprocessing</title>
        <p>To conduct further experiments, we first needed to collect a corpus. We used Python<sup>3</sup>
and the beautifulsoup4<sup>4</sup> library to create a parser for scraping posts from 15
communities of ministries, agencies and services on the VKontakte social network. We took
posts from 2019–2020, as they reflect the current situation in Russia in various spheres of life.</p>
        <p>The size of the final corpus is 2 311 480 words. The procedure for preprocessing
the corpus involves the following steps:</p>
        <list list-type="order">
          <list-item><p>Removal of non-textual elements (emoticons, images, etc.);</p></list-item>
          <list-item><p>Obtaining tokens using regular expressions;</p></list-item>
          <list-item><p>Lemmatization (normalization) of the received tokens and resolution of
morphological ambiguity using pymorphy2<sup>5</sup>;</p></list-item>
          <list-item><p>Removal of words (pronouns, prepositions, etc.) that are in the stop-list;</p></list-item>
          <list-item><p>Adding bigrams and trigrams with the help of gensim<sup>6</sup>;</p></list-item>
          <list-item><p>Division of the preprocessed posts by ministry and department;</p></list-item>
          <list-item><p>Saving the preprocessed corpus in .txt format.</p></list-item>
        </list>
        <fn-group>
          <fn><label>2</label><p>https://books.google.com/ngrams</p></fn>
          <fn><label>3</label><p>https://www.python.org/</p></fn>
          <fn><label>4</label><p>https://www.crummy.com/software/BeautifulSoup/bs4/doc/</p></fn>
          <fn><label>5</label><p>https://pymorphy2.readthedocs.io/en/latest/</p></fn>
        </fn-group>
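        <p>Steps 1 to 4 of the procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used in the study: the regular expression, the tiny stop-list and the names preprocess and identity_lemmatizer are all hypothetical, and the identity lemmatizer only marks the place where a morphological analyzer such as pymorphy2 would be plugged in.</p>
        <preformat>
```python
import re

# Hypothetical miniature stop-list; the stop-list used in the paper is not given.
STOP_WORDS = {"и", "в", "на", "он", "что"}

# Assumed token pattern: runs of Cyrillic or Latin letters, optionally hyphenated.
TOKEN_RE = re.compile(r"[а-яёa-z]+(?:-[а-яёa-z]+)*")


def identity_lemmatizer(token):
    """Placeholder for a real lemmatizer (for example one built on pymorphy2)."""
    return token


def preprocess(post, lemmatize=identity_lemmatizer, stop_words=STOP_WORDS):
    """Tokenize, lemmatize and filter a single post."""
    # Steps 1-2: the pattern keeps alphabetic tokens only, so emoticons,
    # digits and punctuation are dropped during tokenization.
    tokens = TOKEN_RE.findall(post.lower())
    # Step 3: map every token to its lemma.
    lemmas = [lemmatize(token) for token in tokens]
    # Step 4: discard lemmas that are in the stop-list.
    return [lemma for lemma in lemmas if lemma not in stop_words]


example = "Министр и делегация прибыли в Москву 🙂 на форум-2020!"
print(preprocess(example))
```
        </preformat>
        <p>Passing the lemmatizer as a parameter keeps the sketch free of external dependencies; with a real analyzer, inflected forms such as "прибыли" would be reduced to their dictionary forms before the stop-list is applied.</p>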
      </sec>
      <sec id="sec-3-2">
        <title>Topic modeling</title>
        <p>Topic modeling builds compressed topical descriptions of
documents. The topic model of a text collection defines each topic as a discrete
distribution over a set of terms, and each document as a discrete distribution over a
set of topics. Each text is represented as a sequence of topics that are randomly and
independently drawn from some distribution. A topic in topic models is a set of
words characterized by co-occurrence within a document.</p>
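        <p>In the usual probabilistic notation, the two kinds of distributions mentioned above combine into a mixture. The symbols below are introduced here for illustration and do not appear in the original text: w is a term, d a document, t a topic and T the set of topics.</p>
        <disp-formula>
          <tex-math>p(w \mid d) = \sum_{t \in T} p(w \mid t)\, p(t \mid d)</tex-math>
        </disp-formula>
        <p>Here p(w | t) is the distribution of terms in topic t and p(t | d) is the distribution of topics in document d, so the probability of a term in a document is a weighted sum over topics.</p>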
        <p>
          Nowadays topic modeling is used in a great variety of computational linguistics
research: the algorithms can be used for detecting hidden communities on social
networks, determining the main topics of users’ posts on Twitter, or revealing topics of
social media news [
          <xref ref-type="bibr" rid="ref1 ref6 ref7">1, 6, 7</xref>
          ].
        </p>
        <p>To implement further procedures, we need to choose a suitable algorithm for the
experiment. We decided to focus on one of the popular algorithms for topic
modeling – Latent Dirichlet Allocation (LDA). The gensim library is used for LDA as it
provides a lot of possibilities to work with other libraries (for visualization, etc.).</p>
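        <p>First of all, we need to compute the optimal number of topics for the corpus using
the U-Mass measure. It reflects the topic coherence value, which is treated as the level of
human interpretability of the model based on the relatedness of words and documents
within a topic. The formula is as follows:</p>
        <disp-formula>
          <label>(1)</label>
          <tex-math>\mathrm{score}(w_i, w_j, \epsilon) = \log \frac{D(w_i, w_j) + \epsilon}{D(w_j)}</tex-math>
        </disp-formula>
        <p>where <inline-formula><tex-math>D(w_i, w_j)</tex-math></inline-formula> is the number of documents that contain both the words <inline-formula><tex-math>w_i</tex-math></inline-formula> and <inline-formula><tex-math>w_j</tex-math></inline-formula>,
and <inline-formula><tex-math>D(w_j)</tex-math></inline-formula> is the number of documents containing the word <inline-formula><tex-math>w_j</tex-math></inline-formula>. The highest value of
<inline-formula><tex-math>\mathrm{score}(w_i, w_j, \epsilon)</tex-math></inline-formula> shows that the model includes the appropriate number of topics. We
conducted a series of experiments with the following parameters: the minimum
number of topics was 5, the maximum was 70, and the step was 5. We obtained graphic
data (Figure 1).</p>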
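        <p>The coherence computation can be illustrated with a small pure-Python sketch. It is a simplified stand-in for the library implementation used in the study, written under assumptions: the function name umass_coherence, the toy documents, the default smoothing value epsilon = 1.0 and the choice to average (rather than sum) the pairwise scores are all illustrative.</p>
        <preformat>
```python
import math
from itertools import combinations


def umass_coherence(top_words, documents, epsilon=1.0):
    """Average pairwise U-Mass score for one topic's ranked top words."""
    doc_sets = [set(doc) for doc in documents]

    def doc_count(*words):
        # D(...): number of documents that contain all the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    scores = []
    # Pair every word w_i with each higher-ranked word w_j (j before i).
    for j, i in combinations(range(len(top_words)), 2):
        w_i, w_j = top_words[i], top_words[j]
        # Assumes every top word occurs in at least one document.
        scores.append(math.log((doc_count(w_i, w_j) + epsilon) / doc_count(w_j)))
    return sum(scores) / len(scores)


# Toy tokenized documents, for illustration only.
docs = [
    ["police", "russia", "service"],
    ["police", "russia"],
    ["school", "child", "russia"],
    ["school", "child"],
]
coherent = umass_coherence(["russia", "police"], docs)
mixed = umass_coherence(["police", "school"], docs)

# Candidate numbers of topics matching the sweep described above.
candidate_topic_counts = list(range(5, 71, 5))
print(coherent, mixed, candidate_topic_counts)
```
        </preformat>
        <p>In this toy corpus the words that co-occur receive a higher score than the words that never appear together. In a full experiment, one model would be trained for each value in candidate_topic_counts, the scores of its topics would be aggregated, and the number of topics with the highest coherence would be kept.</p>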
        <p>As regards the graphic representation of topics in the corpus, the resulting sets can be
visualized using pyLDAvis<sup>7</sup> (Figure 2).</p>
        <p>It is clear that the first topic covers the biggest part of the text collection while there is
much less information about the others.</p>
        <p>It is also important to note that neither bigrams nor trigrams appeared in the topics.
Their absence may be due to the low weights of the obtained n-grams.</p>
        <p>The main disadvantage of LDA is that it cannot select topic
labels automatically: a user may have difficulties interpreting the obtained
topics. To find the most accurate label, scholars try to improve topic modeling
algorithms and implement a method known as automatic topic labeling. The next section
discusses this procedure.</p>
        <fn-group>
          <fn><label>6</label><p>https://radimrehurek.com/gensim/</p></fn>
          <fn><label>7</label><p>https://github.com/bmabey/pyLDAvis</p></fn>
        </fn-group>
      </sec>
      <sec id="sec-3-3">
        <title>Automatic topic labeling</title>
        <p>A label is a word or a sequence of words that covers the general content of given sets
of words. Sometimes relevant labels are manually assigned to topics. However, the
procedure of automatic topic labeling allows us to facilitate the interpretation of
topics, as well as to save time and effort spent on manual assigning. Nowadays, it is
being developed both for English and Russian corpora.</p>
        <p>
          There are two main ways to get topic labels: external sources and internal
sources [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. External sources can refer to online thesauri (WordNet, etc.) or sites that
help to obtain topical labels for a set of words (Wikipedia, etc.). As for internal
sources, topical labels can be extracted using procedures of automatic summarization
or creating word2vec models [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          In our research we used some methods proposed in [
          <xref ref-type="bibr" rid="ref12 ref7">7, 12</xref>
          ]. First, we transformed
the topical words into a query for Google, extracted the first 5 site headlines, identified
collocations with the help of pymorphy2 (ADJ+NOUN, NOUN+NOUN, etc.) and
ranked them according to the number of occurrences in the Russian National Corpus.
If a candidate had no occurrences, we did not consider it as a label.
        </p>
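        <p>The final ranking step can be sketched as follows. This is only an illustration: the function name rank_candidates is hypothetical, and the frequency values are invented placeholders, not real counts from the Russian National Corpus.</p>
        <preformat>
```python
def rank_candidates(candidates, corpus_frequency):
    """Order label candidates by reference-corpus frequency, dropping unseen ones."""
    # Candidates with no occurrences in the reference corpus are discarded.
    attested = [c for c in candidates if corpus_frequency.get(c, 0) > 0]
    # The remaining candidates are sorted from most to least frequent.
    return sorted(attested, key=lambda c: corpus_frequency[c], reverse=True)


# Hypothetical counts, for illustration only (not real corpus figures).
frequencies = {
    "внешняя политика": 120,
    "российская федерация": 950,
    "международная жизнь": 15,
}
candidates = [
    "внешняя политика",
    "международная жизнь",
    "российская федерация",
    "главный портал",
]
ranked = rank_candidates(candidates, frequencies)
print(ranked)
```
        </preformat>
        <p>The candidate that is absent from the frequency table is removed, which mirrors the rule that labels without occurrences are not considered.</p>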
        <p>
          Then we created two word2vec CBOW models: the first one is based on the
ministerial corpus itself, and the second one is based on the corpus of users’ posts on the VKontakte
social network proposed in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The parameters of the models are as follows: the minimal
frequency of occurrence is 5, the vector size is 100, and the context window is 5. The results
are ranked by cosine similarity.
        </p>
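        <p>Ranking by cosine similarity can be illustrated with a short sketch. Several things here are assumptions: the paper does not state how a topic is turned into a vector, so a single topic_vector is simply taken as given; the three-dimensional toy vectors stand in for the 100-dimensional word2vec vectors; and the function names are hypothetical.</p>
        <preformat>
```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(topic_vector, candidate_vectors):
    """Return (word, similarity) pairs, most similar first."""
    scored = [
        (word, cosine_similarity(topic_vector, vector))
        for word, vector in candidate_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, for illustration only.
topic_vector = [1.0, 0.0, 0.0]
candidate_vectors = {
    "candidate_a": [2.0, 0.0, 0.0],
    "candidate_b": [1.0, 1.0, 0.0],
    "candidate_c": [0.0, 3.0, 0.0],
}
ranking = rank_by_similarity(topic_vector, candidate_vectors)
print(ranking)
```
        </preformat>
        <p>Because cosine similarity ignores vector length, candidate_a is ranked first even though its vector is twice as long as the topic vector.</p>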
        <p>Below we present a comparative table of candidates for topic labels.</p>
        <table-wrap>
          <table>
            <thead>
              <tr>
                <th>Topic words</th>
                <th>Labels from external sources</th>
                <th>word2vec labels (ministerial corpus)</th>
                <th>word2vec labels (corpus of users’ posts)</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>россия, российский, страна, дело, министр, международный, иностранный, федерация, вопрос, отношение (russia, russian, country, affair, minister, international, foreign, federation, issue, attitude)</td>
                <td>российская федерация, официальный представитель, внешняя политика, министерство иностранных дел, международная жизнь (russian federation, official representative, foreign policy, ministry of foreign affairs, international affairs)</td>
                <td/>
                <td/>
              </tr>
              <tr>
                <td>россия, полиция, сотрудник, мвд, дело, российский, служба, полицейский, область, внутренний (russia, police, employee, ministry of internal affairs, affair, russian, service, police, region, internal)</td>
                <td>мвд россии, правоохранительная система, управление мвд, полиция россии, рубеж россии (ministry of internal affairs of russia, law enforcement system, department of the ministry of internal affairs, police of russia, border of russia)</td>
                <td/>
                <td/>
              </tr>
              <tr>
                <td>военный, россия, российский, оборона, учение, сила, международный, конкурс, армия, флот (military, russia, russian, defence, exercises, force, international, competition, army, navy)</td>
                <td>российская федерация, министерство обороны, военно-морской флот, военная техника, военный округ (russian federation, ministry of defence, navy, military equipment, military district)</td>
                <td/>
                <td/>
              </tr>
              <tr>
                <td/>
                <td/>
                <td>павел, субъект, правительство, полковник, совместно (pavel, subject, government, colonel, mutually)</td>
                <td>чиновник, активист, следственный, конституционный, антиконституционный, должность (official, activist, investigative, constitutional, anticonstitutional, occupation)</td>
              </tr>
              <tr>
                <td/>
                <td/>
                <td>проходить, совместно, андрей, выступить, назначить (pass, mutually, andrew, come out, appoint)</td>
                <td>социалистический, великобритания, гражданский, украинский, оборона (socialistic, great britain, civil, ukrainian, defence)</td>
              </tr>
              <tr>
                <td>россия, российский, образование, школа, день, ребёнок, работа, культура, проект, мир (russia, russian, education, school, day, child, work, culture, project, world)</td>
                <td>повышение квалификации, первая школа, государственный университет, история образования, главный портал (advanced training, first school, state university, history of education, main portal)</td>
                <td>совместно, проходить, андрей, республика, инициатива (mutually, pass, andrew, republic, initiative)</td>
                <td>молодёжь, вуз, инженер, отцовство, обучаться (youth, university, engineer, paternity, study)</td>
              </tr>
              <tr>
                <td>россия, российский, проект, развитие, министр, производство, страна, работа, область, участие (russia, russian, project, development, minister, production, country, work, region, participation)</td>
                <td>национальная программа, министерство науки, деловая программа, управление экономики, цифровая экономика (national program, ministry of science, business program, department of economics, digital economy)</td>
                <td>генеральный, моисеев, дума, проходить, грамотность (general, moiseev, duma, pass, literacy)</td>
                <td>предпринимательство, стратегический, экологический, региональный, федеральный (enterprise, strategic, ecological, regional, federal)</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The easiest way of obtaining a topic label is to choose the first word in a set, but
russia is the first or second word in every set, as it is one of the most frequent words in the corpus. It
cannot be a topical word, so there is a need to analyze the results of using internal and
external sources. The idea is to find the same words or collocations in all the columns
or to find words that describe the same semantic field.</p>
        <p>Consider the word2vec model based on the ministerial corpus. The resultant labels are
sometimes repeated, which may be due to the size of the corpus. The
second corpus is almost four times as large as the first one, so it yields
more relevant results. At the same time, a lot of proper names appear as
candidates for topic labels. The names denote people working in a particular ministry: for
instance, Moiseev works in the Ministry of Finance. Although names are indirectly
related to the topical sets of words, they cannot be candidates for labels.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results and Evaluation</title>
      <p>After running several experiments, we obtained the topical structure of ministerial
posts on VKontakte social network. The main topics are related to:</p>
      <list list-type="order">
        <list-item><p>foreign policy;</p></list-item>
        <list-item><p>internal affairs;</p></list-item>
        <list-item><p>country defense;</p></list-item>
        <list-item><p>education;</p></list-item>
        <list-item><p>country development.</p></list-item>
      </list>
      <p>Such topics as health, transport, protection of the environment, etc. were not represented in
the models. This may be due to several reasons.</p>
      <list list-type="order">
        <list-item><p>A ministry does not have a topical community on the social network, or it does not
publish many posts there, so users are unable to get full
information on the current state of the ministry. This fact is closely linked to the degree of
state “openness”<sup>8</sup>.</p></list-item>
        <list-item><p>An obtained topic itself contains more specific ones. For instance, the last set of
words describes both economic and industrial processes in Russia.</p></list-item>
        <list-item><p>The health topic, which is still acute because of the coronavirus pandemic, is
scattered across various ministerial communities: each department assesses the impact
of the coronavirus on its own area. This is why the topic of health is most likely
absorbed by larger topics.</p></list-item>
      </list>
      <p>
        As regards the assessment of the procedures, we discuss their main advantages and
disadvantages. First of all, many scholars have analyzed different Russian
corpora of social networks with the help of the LDA algorithm, as a great number of papers
show [
        <xref ref-type="bibr" rid="ref1 ref12 ref2">1, 2, 12</xref>
        ]. Moreover, we can reveal the inner structure of topics, which is described
by certain syntagmatic and paradigmatic relations between words within each topic
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <table-wrap>
        <table>
          <thead>
            <tr>
              <th>Syntagmatic relations</th>
              <th>Paradigmatic relations</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Adjective-modifier relations: российский – федерация (russian – federation); международный – отношение (international – relation)</td>
              <td>Hypernymy and hyponymy: страна – россия (country – russia); сотрудник – полицейский (employee – policeman)</td>
            </tr>
            <tr>
              <td>Noun-modifier relations: оборона – сила (defence – force)</td>
              <td>Derivational relations: россия – российский (russia – russian); полиция – полицейский (police – policeman)</td>
            </tr>
            <tr>
              <td>Adjective-modifier and noun-modifier relations: министр – иностранный – дело (minister – foreign – affair)</td>
              <td>Meronymy and holonymy: оборона – армия, флот (defence – army, navy); образование – школа, ребёнок (education – school, child); россия – область (russia – region); работа – производство (work – production)</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Note that verb-modifier relations are not included in the table. The reason is that
verbs in the corpus of ministerial posts are either high-frequency or
low-frequency, e.g. сказать, встретить, подписать, обсудить (say, meet, sign, discuss),
etc., so they were not included in the resultant topic models.</p>
      <p>
        As regards topic labels, the main advantage is the use of a
double-ranking algorithm (Google PageRank and the number of occurrences in the Russian
National Corpus), which helped to avoid candidates with low frequencies. The problem
was to assess the quality of the obtained labels, as there is no gold standard for
Russian topic labels. We decided to involve 5 independent assessors, who were asked
to assess each label on Google Forms using the following grades: 1 for relevant
labels, 0 for irrelevant ones. Some results are presented below (Figure 3).
      </p>
      <fig>
        <label>Fig. 3.</label>
        <caption>
          <p>Diagram of candidates for topic labels for the topic: russia, police, employee, ministry
of internal affairs, affair, russian, service, police, region, internal</p>
        </caption>
      </fig>
      <p>
        Then we transformed the chart into a table for better interpretation. Part of this
table is given below (Table 4).
      </p>
      <p>
        As can be seen, the assessors agreed that the label ministry of internal affairs of
russia is the best option for the topic, and all the unigram labels are the worst ones.
This can be explained by the fact that n-grams better describe specific topics, while
unigrams are often used to cover common topics [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <fn-group>
        <fn><label>8</label><p>https://ach.gov.ru/news/gosudarstvo-sredney-zakrytosti-rezultaty-novogo-reytingaotkrytosti-gosorganov</p></fn>
      </fn-group>
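      <p>The binary grading scheme used by the assessors can be aggregated with a few lines of Python. This is a sketch only: the function name mean_relevance is hypothetical, and the grades below are invented for illustration and are not the assessors' actual answers.</p>
      <preformat>
```python
def mean_relevance(grades_by_label):
    """Share of assessors who graded each label as relevant (grade 1)."""
    return {
        label: sum(grades) / len(grades)
        for label, grades in grades_by_label.items()
    }


# Hypothetical grades from five assessors, for illustration only.
grades = {
    "ministry of internal affairs of russia": [1, 1, 1, 1, 1],
    "law enforcement system": [1, 1, 0, 1, 0],
    "police": [0, 0, 1, 0, 0],
}
scores = mean_relevance(grades)
# The label with the highest share of positive grades is the preferred one.
best_label = max(scores, key=scores.get)
print(scores, best_label)
```
      </preformat>
      <p>With binary grades the mean is simply the proportion of assessors who accepted a label, which makes labels of different lengths directly comparable.</p>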
    </sec>
    <sec id="sec-5">
      <title>Summary</title>
      <p>Automatic ways of analyzing texts on social networks have become pivotal
nowadays. In this paper, we created the corpus of ministerial posts, performed linguistic
analysis of the data with the help of LDA topic modeling and topic labeling. We
described the topical structure of the text collection and found out the main topics which
are important for the Russian government. We also analyzed paradigmatic and
syntagmatic relations between lexical units within each topic.</p>
      <p>The results obtained during the experiments support the consistency of the
statistical model and provide general knowledge about the current issues of Russian ministries.</p>
      <p>Further research can be carried out in the following directions:</p>
      <list list-type="bullet">
        <list-item><p>enlarging the corpus by adding information from other social networks
(Facebook, etc.) and comparing their topical structures;</p></list-item>
        <list-item><p>comparing several topic modeling algorithms (for instance, LDA and LSI);</p></list-item>
        <list-item><p>developing a gold standard for automatic topic labeling of Russian topic models;</p></list-item>
        <list-item><p>classifying and clustering texts of the ministerial posts within one corpus;</p></list-item>
        <list-item><p>automatically extracting keywords from ministerial posts.</p></list-item>
      </list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bodrunova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blekanov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kukarkin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Topic modeling for Twitter discussions: Model selection and quality assessment</article-title>
          .
          <source>In: Proceedings of the 6th SGEM International Multidisciplinary Scientific Conferences on SOCIAL SCIENCES and ARTS SGEM2018, Science and Humanities</source>
          ,
          <fpage>207</fpage>
          -
          <lpage>214</lpage>
          .
          <publisher-name>STEF92 Technology Ltd.</publisher-name>
          ,
          <publisher-loc>Sofia, Bulgaria</publisher-loc>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bodrunova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blekanov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kukarkin</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Topics in the Russian Twitter and relations between their interpretability and sentiment</article-title>
          .
          <source>In: Sixth International Conference on Social Networks Analysis, Management and Security</source>
          ,
          <fpage>549</fpage>
          -
          <lpage>554</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bodrunova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blekanov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smoliarova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litvinenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Beyond Left and Right: RealWorld Political Polarization in Twitter Discussions on Inter-Ethnic Conflicts</article-title>
          .
          <source>Media and Communication</source>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>119</fpage>
          -
          <lpage>132</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cano Basave</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Automatic Labelling of Topic Models Learned from Twitter by Summarisation</article-title>
          .
          <source>In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          ,
          <fpage>618</fpage>
          -
          <lpage>624</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          ,
          <publisher-loc>Stroudsburg, PA, USA</publisher-loc>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Garrett</surname>
            ,
            <given-names>R. K.</given-names>
          </string-name>
          :
          <article-title>Social media's contribution to political misperceptions in U.S. Presidential elections</article-title>
          .
          <source>PLOS ONE</source>
          ,
          <volume>14</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Koltsov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pashakhin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dokuka</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A full-cycle methodology for news topic modeling and user feedback research</article-title>
          . In:
          <string-name>
            <surname>Staab</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koltsova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ignatov</surname>
            ,
            <given-names>D.I.</given-names>
          </string-name>
          (eds.)
          <source>SocInfo 2018. LNCS</source>
          ,
          <volume>11185</volume>
          ,
          <fpage>308</fpage>
          -
          <lpage>321</lpage>
          . Springer, Cham (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mamaev</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Automatic Detection of Hidden Communities in the Texts of Russian Social Network Corpus</article-title>
          . In:
          <string-name>
            <surname>Filchenkov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kauttonen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pivovarova</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (eds.)
          <source>Artificial Intelligence and Natural Language. AINL 2020. Communications in Computer and Information Science</source>
          ,
          <volume>1292</volume>
          ,
          <fpage>17</fpage>
          -
          <lpage>33</lpage>
          . Springer, Cham (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Masevich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Corpus Linguistics Methods in Humanitarian Studies</article-title>
          .
          <source>Computational linguistics and computational ontologies</source>
          ,
          <fpage>24</fpage>
          -
          <lpage>43</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Masevich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Variation in the representation of names of political functionaries in diachronic researches based on text corpora</article-title>
          .
          <source>Computational linguistics and computational ontologies</source>
          ,
          <fpage>56</fpage>
          -
          <lpage>73</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Probabilistic Topic Modeling of the Russian Text Corpus on Musicology</article-title>
          .
          <source>In: LMAC</source>
          <year>2015</year>
          , CCIS
          <volume>561</volume>
          ,
          <fpage>69</fpage>
          -
          <lpage>76</lpage>
          . Springer Nature (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mirzagitova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Automatic assignment of labels in topic modeling for Russian corpora</article-title>
          .
          <source>In: Proceedings of the 7th Tutorial and Research Workshop on Experimental Linguistics</source>
          ,
          <fpage>115</fpage>
          -
          <lpage>118</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sampetova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mamaev</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moskvina</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukharev</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Topic modelling of the Russian corpus of Pikabu posts: author-topic distribution and topic labelling</article-title>
          .
          <source>In: Proceedings of the International Conference «Internet and Modern Society» (IMS 2020), International Workshop «Computational Linguistics» (CompLing-2020)</source>
          (
          <year>2020</year>
          , in press).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>H. B.</given-names>
          </string-name>
          :
          <article-title>Government's Presence on Social Media. A Study with Special Reference to Jordan</article-title>
          .
          <source>In: Research Journal of Applied Sciences, Engineering and Technology</source>
          ,
          <volume>7</volume>
          ,
          <fpage>4813</fpage>
          -
          <lpage>4816</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Stieglitz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Xuan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Social media and political communication: a social media analytics framework</article-title>
          .
          <source>Soc. Netw. Anal. Min</source>
          .
          <volume>3</volume>
          ,
          <fpage>1277</fpage>
          -
          <lpage>1291</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>