<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Implementing a Natural Language Processing Approach for an Online Exercise in Urban Design</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aleksei Romanov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>romanov@corp.ifmo.ru</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Artem Chirkin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>chirkin@arch.ethz.ch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arina Sender</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>arisend@gmail.com</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Andrey Dergachev</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dmitry Mouromtsev</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dmitry Volchek</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>ETH Zu ̈rich Zu ̈rich</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>ITMO University Saint- Petersburg</institution>
          ,
          <addr-line>Russian Federation</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the analysis of an online educational experiment. The idea of the experiment consists in using an interactive exercise platform to complement a Massive Open Online Course (MOOC) in Urban Design. The platform provides students with an opportunity to map different urban structures and premises within a district or city and to create notes and descriptions, so they can express themselves and share their views. At the same time, the platform uses students' responses to grade their submissions and records students' actions to aid research in Urban Design. First, we overview the platform functionality and the exercise setting. Then, we take a closer look at the data sources provided by the platform. Finally, we describe a method that combines techniques of natural language processing and semantic technology to analyze students' responses and feedback. The analysis allows a better understanding of students' ideas and concerns related to the problem they are asked to solve.</p>
      </abstract>
      <kwd-group>
        <kwd>online education</kwd>
        <kwd>Massive Open Online Course</kwd>
        <kwd>urban design</kwd>
        <kwd>natural language processing</kwd>
        <kwd>semantic technology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The object of the presented study is the natural language responses of students of a Massive
Open Online Course (MOOC). The MOOC used in this study, called “Future Cities”,
is conducted by the Chair of Information Architecture at ETH Zurich and the ETH Future
Cities Lab, and is hosted on the edX site1. The MOOC teaches the understanding of a
city as a whole, with its people, components, and functions, and is designed along the lines
of Citizen Design Science. The main principles behind this approach are described by
Mueller et al. in the article “Citizen Design Science: A strategy for crowd-creative urban
design” [Mueller et al. 2018]. The online course consists of a series of videos,
questionnaires, and exercises. The focus of the presented study is a pair of exercises in Urban
Design provided by an external tool called Quick Urban Analysis Kit (qua-kit). The main
problem in analyzing the results of the course is the large amount of text material,
such as reviews and discussions, and the need for its analysis. The article describes
an analysis method that uses NLP algorithms. This method allows us to extract
the most common concepts operated by the MOOC students when describing their
design submissions. The analysis of these concepts gives us an insight into the ideas of the
students.</p>
      <p>Qua-kit
Qua-kit is a web platform for simple urban design editing, sharing, and discussion2.
Qua-kit is used in the MOOC in the form of two exercises:
1. Design exercise – work on a single design submission for a predefined scenario.
2. Compare exercise – given a series of twenty randomly selected submission pairs,
select (vote for) the design in each pair that performs better according to a given
design criterion.</p>
      <p>Both the design and compare exercises require students to consider a list of design criteria. In
qua-kit, a design criterion is represented by a description and an illustration; every exercise
is linked with a list of one to four criteria.</p>
      <p>Design exercise In the first qua-kit exercise, a student is asked to redesign a part of
the container terminal area in Tanjong Pagar, Singapore. Figure 1 presents the qua-kit user
interface for this exercise. The tool uses WebGL to manipulate 3D geometry in the browser.
The student can move, delete, or create (from templates) individual objects. After the
work is finished, the student submits the design with an optional textual explanation of
their ideas. At any moment, the student can come back to the site and update their
submission. Other students can open the submission and write reviews (Figure 1, panel
on the right). A review consists of a criterion id (shown as an icon), a like/dislike tag,
and an optional textual explanation. Therefore, the design exercise provides two sources
of natural language feedback: the submission descriptions and the user reviews.</p>
      <sec id="sec-1-1">
        <title>1https://www.edx.org/course/future-cities-ethx-fc-01x</title>
      </sec>
      <sec id="sec-1-2">
        <title>2https://github.com/achirkin/qua-kit</title>
        <p>Compare exercise In the second exercise, a student is asked to assess a series of design
pairs. Figure 2 shows the interface for comparing two designs in a pair.</p>
        <p>At the top of the page, the student sees the name and the icon of the design criterion
under consideration. The student sees previews and descriptions of two randomly chosen
designs. Then, the student has to select the one that seems better according to the
given criterion by clicking on it. Optionally, a student can add a textual explanation for
their choice.</p>
        <p>Grading and the gallery When a number of students finish both of the exercises,
qua-kit possesses enough statistical data to assign grades to the designs based on peer
voting. Qua-kit runs a ranking system that updates design ratings according to the design
criteria based on the votes from the compare exercise. Figure 3 shows the qua-kit gallery
with student submissions. The gallery is available live at https://qua-kit.ethz.ch.
The grades in qua-kit are assigned for each criterion independently. The algorithm of the
ranking system may be interpreted as follows:</p>
        <p>Design the weighted majority of the best compare voters defines the submission grade;
Compare a voter gets a better grade if their votes agree with the majority.
Qua-kit constantly updates the relevant grades when a user submits a design, updates a
design, or votes for a pair of designs. Once a day, qua-kit averages the per-criterion
grades for an updated design submission or a voter and sends the result to edX. This way,
the students get graded in the MOOC.
Qua-kit records all kinds of student activities: design (geometry) changes, new
submissions, comments, reviews, and votes. This opens many possibilities for research in Urban
Design and education. In particular, the data allows studying the behavior of a student
designer. The qua-kit designer interface as a tool defines the ways a designer can interact
with a design [Volchek et al. 2017]. The simplicity of the tool enforces strong constraints
on a designer's expressiveness and limits the understanding of the design context. For
example, by observing individual submissions we have found that students have
different understandings of what individual object models represent in reality, or different
understandings of building density in the same design context. Therefore, one goal of the
presented research is to study the relationship between a student's submission and student
opinions (description and reviews) on it.</p>
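        <p>The ranking rules above can be sketched as follows. This is an illustrative reconstruction under simplifying assumptions (unweighted votes, hypothetical function and data names), not the actual qua-kit ranking algorithm: a design's score is the share of pairwise votes it wins, and a voter's score is the share of their votes agreeing with the per-pair majority.</p>
        <p>
```python
from collections import Counter

# Illustrative sketch only (NOT the exact qua-kit algorithm):
# votes is a list of (voter, winner_design, loser_design) tuples.
def design_scores(votes):
    """Score a design by the fraction of its comparisons it wins."""
    wins, seen = Counter(), Counter()
    for _, winner, loser in votes:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    return {d: wins[d] / seen[d] for d in seen}

def voter_scores(votes):
    """Score a voter by how often their vote agrees with the pair majority."""
    pair_votes = Counter()
    for _, winner, loser in votes:
        pair_votes[(winner, loser)] += 1
    agree, total = Counter(), Counter()
    for voter, winner, loser in votes:
        in_majority = pair_votes[(winner, loser)] >= pair_votes[(loser, winner)]
        agree[voter] += 1 if in_majority else 0
        total[voter] += 1
    return {v: agree[v] / total[v] for v in total}
```
        </p>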
        <p>To get an insight into student opinions and feedback in qua-kit, we need to analyze
the natural language responses of the students. The moderate amount of data produced
by the exercises suggests the use of natural language processing (NLP) techniques. As
described in the previous section, qua-kit provides three sources of textual data:</p>
        <sec id="sec-1-2-1">
          <title>1. descriptions attached to the design submissions;</title>
        </sec>
        <sec id="sec-1-2-2">
          <title>2. user reviews (like/dislike w.r.t. a criterion);</title>
        </sec>
        <sec id="sec-1-2-3">
          <title>3. comments attached to the compare votes.</title>
          <p>Unfortunately for the analysis, all types of textual feedback are optional. The most
informative source of text is the submission descriptions. The students spend a lot of
effort developing their submissions and usually feel eager to write at least a few sentences
about their ideas. The user reviews often contain comments, but the total number of
reviews is not very large, because the reviews are not the part of the edX exercises.
Lastly, the students rarely add comments to their compare votes, because this implies
increasing the effort for the compare exercise significantly.
3</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
      <p>Semantic technologies allow us to represent and analyze large volumes of information.
This approach enables fast processing and a flexible system design. In addition, the use of the RDF
description language3 allows us to present information in a form that is convenient not
only for humans, but also for subsequent processing by machine agents (robots,
crawlers, etc.). To achieve this goal, it is necessary to:</p>
      <p>Extract the existing data from the PostgreSQL database. The analysis
requires the submissions sent by users, namely the text descriptions of the completed task;
Extract the concepts using NLP algorithms. In particular, it is necessary to
use algorithms that analyze the given text and return a list of the concepts
present in it;
Develop an ontological model. The model should represent the subject area in
a semantic form, namely the main objects, the relationships between them, and their
attributes;
Map the extracted data. Only a correct mapping makes it possible to execute queries and
analyze the results;
Deploy the RDF storage. A triple store is necessary for storing all the received
data, and it also provides an interface for the execution of queries (a SPARQL endpoint);
Define and create queries. Queries are designed to extract the necessary
information from the RDF storage for subsequent interpretation and analysis. Queries
also check the correctness of the mapping procedure: if the queries are not
executed correctly, the mapping process must be improved;</p>
      <p>Analyze the results.</p>
      <p>Data
The obtained data consist of user exercise submissions, their reviews, grades, and votes for
different criteria, stored in a PostgreSQL database. All data are easily obtained by executing
queries. All data are converted to triple format for import into the RDF storage.</p>
      <p>Concept extraction
The key factor in keyword extraction is language. The distinction between nouns, verbs,
adjectives, and adverbs is extremely useful in many natural language
processing tasks. There are several fundamental techniques in NLP, including sequence labeling,
n-gram models, and evaluation.</p>
      <p>POS-tagging is an important step in text analysis. This step consists in
classifying words into their parts of speech, also known as word classes or lexical categories,
and labeling them accordingly. A POS-tagger processes a sequence of words and attaches a part-of-speech
tag to each word. Corpus readers are used for these tasks; they provide a uniform
interface for tagging.</p>
      <p>For the purposes of extracting keywords, the most relevant parts of
speech are nouns, including nouns after determiners and adjectives [Kim Su Nam et al. 2013].
Adjectives and adverbs are important word classes too. Adjectives describe nouns and
can be used as modifiers or in predicates. Adverbs modify verbs to specify the place,
direction, etc., and may also modify adjectives. The most common approaches to tagging are
regular expressions and unigrams. A regular expression tagger assigns tags to tokens
using pattern matches. Unigram taggers use statistical algorithms.</p>
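      <p>The unigram idea can be sketched without any corpus infrastructure: train a per-word most-frequent-tag table from a tagged sample and fall back to a default tag for unseen words. The tiny training set below is hypothetical; the project itself relied on NLTK's taggers.</p>
      <p>
```python
from collections import Counter, defaultdict

# Minimal unigram POS-tagger sketch: each word receives its most frequent
# tag from a (hypothetical) tagged training sample; unseen words get "NN".
def train_unigram(tagged_sents):
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag_ in sent:
            counts[word.lower()][tag_] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(words, model, default="NN"):
    return [(w, model.get(w.lower(), default)) for w in words]

train = [[("the", "DT"), ("open", "JJ"), ("space", "NN")],
         [("open", "JJ"), ("parks", "NNS")]]
model = train_unigram(train)
```
      </p>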
      <p>But it is not enough to know the parts of speech; one also needs to work with the text
itself. Tokenization is a way to split text into parts (tokens). These tokens can be
sentences or individual words. There are several approaches to dividing the text. We used
a verbose regular expression that handles abbreviations, hyphenated compound
words, currency, percentages, and other separate tokens.</p>
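      <p>A verbose tokenizing regular expression in that spirit can be sketched as follows; the project's exact pattern is not given in the text, so this one is illustrative only.</p>
      <p>
```python
import re

# Illustrative verbose tokenizer: abbreviations, currency/percentages,
# hyphenated compounds, and single punctuation tokens.
TOKEN_PATTERN = r"""(?x)
      (?:[A-Z]\.)+                # abbreviations, e.g. U.S.A.
    | \$?\d+(?:\.\d+)?%?          # currency and percentages, e.g. $3.50, 82%
    | \w+(?:-\w+)*                # words, including hyphenated compounds
    | [^\w\s]                     # any other character as a separate token
"""

def tokenize(text):
    return re.findall(TOKEN_PATTERN, text)
```
      </p>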
      <p>In natural language processing, it also matters where the data comes from.
Our data were obtained from the reviews and descriptions of the students' home tasks.
In such circumstances, it is necessary to account for possible typos and errors, so spelling
and grammar correction is necessary.</p>
      <p>The final step is lemmatization. It is the process of grouping together the different
inflected forms of a word; the main goal is to determine the lemma for a given word. This is
a complex task, depending on the concepts that need to be obtained. Types of buildings,
places, zones, roads, etc., play an important role in this project. We used complex metrics
that consider the hyponyms of a word and its semantic description. In addition, an important
role is played by the names of streets, famous buildings, districts, and cities. Information on
such membership can be obtained from text corpora [Kovriguina et al. 2017].</p>
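      <p>The project uses the WordNet lemmatizer (see below), which relies on its lexical database. The POS-aware idea itself can be sketched with a toy exception table; the entries here are hypothetical illustration data, not the WordNet rules.</p>
      <p>
```python
# Toy POS-aware lemmatizer sketch: irregular forms come from an exception
# table keyed by (pos, word); regular plural nouns are stripped of suffixes.
EXCEPTIONS = {("n", "buildings"): "building", ("n", "people"): "person"}

def lemmatize(word, pos="n"):
    w = word.lower()
    if (pos, w) in EXCEPTIONS:
        return EXCEPTIONS[(pos, w)]
    if pos == "n":
        if w.endswith("ies"):
            return w[:-3] + "y"          # cities -> city
        if w.endswith("s") and not w.endswith("ss"):
            return w[:-1]                # parks -> park
    return w
```
      </p>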
      <p>All these methods allow us to obtain concepts related to the field of Urban Design. As a
result, it is possible to carry out a comprehensive analysis of the data on the tasks completed
by the students.</p>
      <p>The development followed the steps described in the method. For
spelling correction we used the Python autocorrect library4, a simple implementation that
anybody can use to add basic autocorrect features.</p>
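      <p>The same correction idea can be sketched with the standard library alone (the project itself used the autocorrect package): replace an unknown word by its closest match in a domain vocabulary. The vocabulary below is a hypothetical sample.</p>
      <p>
```python
import difflib

# Sketch of spelling correction by closest-match lookup; VOCABULARY is a
# hypothetical domain word list, not the one used in the project.
VOCABULARY = {"building", "public", "space", "district", "waterfront"}

def correct(word):
    if word in VOCABULARY:
        return word
    close = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
    return close[0] if close else word
```
      </p>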
      <p>The NLTK library was used for POS-tagging. Tokenization is done by regular
expression. An example of a tagged and chunked sentence:</p>
      <p>Creating/VBG
(NP (NBAR privacy/NN))
inside/RB
and/CC
open/VB
to/TO
(NP (NBAR new/JJ public/JJ spaces/NNS))
./.</p>
      <p>Tokens were selected according to the grammatical mask NBAR within NP phrases,
with standard POS abbreviations: NN – noun, singular or mass; NNS – noun, plural;
NNP – proper noun, singular; and JJ – adjective.</p>
      <p>NBAR:
{&lt;NN.*|JJ&gt;*&lt;NNS|NN&gt;}
{&lt;NNP&gt;*&lt;NNP&gt;}
NP:
{&lt;NBAR&gt;}
{&lt;NBAR&gt;&lt;IN&gt;&lt;NBAR&gt;}</p>
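        <p>The NBAR rule above (a run of adjectives/nouns ending in a noun) can be sketched without NLTK by matching a regular expression over the POS-tag sequence; this library-free reconstruction is illustrative, and the tagged sentence is the example from the text.</p>
        <p>
```python
import re

# Library-free sketch of the NBAR chunk rule: over the space-joined tag
# sequence, match runs of JJ/NN-like tags that end in NN or NNS.
tagged = [("Creating", "VBG"), ("new", "JJ"), ("public", "JJ"),
          ("spaces", "NNS"), (".", ".")]

def nbar_chunks(tagged):
    tags = " ".join(t for _, t in tagged)
    chunks = []
    for m in re.finditer(r"(?:(?:NN\S*|JJ) )*(?:NNS|NN)\b", tags):
        start = tags[:m.start()].count(" ")   # token index of the match
        length = m.group().count(" ") + 1     # number of tokens matched
        chunks.append(" ".join(w for w, _ in tagged[start:start + length]))
    return chunks
```
        </p>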
      <p>Thus, the concepts or phrases are obtained. First of all, the concepts were
lemmatized by the WordNet lemmatizer, which takes the POS tag into account. WordNet is a
large lexical database of English. Nouns, verbs, adjectives, and adverbs are grouped into
sets of cognitive synonyms (synsets). Synsets are interlinked by means of conceptual-semantic and
lexical relations.</p>
      <p>Concepts = [’privacy’, ’new public space’, ’community’]</p>
      <p>So, after tagging and lemmatization, all concepts are converted to the synset format. This
format gives us access to possible synonyms and hyponyms:</p>
      <p>house – [Synset('building.n.01')]
area, city – [Synset('region.n.03')]</p>
      <p>In addition, based on the theme of the project, the concepts referring to important
geographical places and city spots were retrieved. This is possible due to the semantic
enrichment of concepts and named entity recognition (NER). We used a machine learning model
trained on the Groningen Meaning Bank (GMB) corpus5.</p>
      <p>In the resulting annotation, 'Tanjong Pagar', 'Central Business District', and 'Singapore' are members
of the 'org' (organization) and 'geo' (geographical) groups.</p>
      <p>Additionally, all concepts were semantically searched among open data sets, such as
Wikidata6, DBpedia7, etc. This allows concepts to be grouped together, for example
by the "way" criterion.</p>
      <p>Ontology development
Ontological modeling is one of the most important stages in projects like this. The developed
model should not only adequately reflect the subject area, its main classes, and the relations
between them, but also contain links to top-level ontologies.</p>
      <sec id="sec-2-1">
        <title>5http://gmb.let.rug.nl</title>
      </sec>
      <sec id="sec-2-2">
        <title>6https://wikidata.org</title>
      </sec>
      <sec id="sec-2-3">
        <title>7http://wiki.dbpedia.org</title>
        <p>Our ontology is based on the following top-level ontologies [Keßler et al. 2013].
Firstly, AIISO8, which provides classes and properties to describe the internal
organizational structure of an academic institution. Secondly, FOAF9 (an acronym of Friend of a
Friend), an ontology describing people, their activities, and their relations to other people
and objects. Finally, TEACH (Teaching Core Vocabulary)10, a lightweight vocabulary
providing terms teachers use to relate objects in their courses.</p>
        <p>We used Protégé11 for ontology development and Ontodia12 for visualization. A part
of the ontology that describes the relations between concepts, submissions, and reviews
is shown in Figure 5.
Given the ontology and the extracted concepts, it becomes possible to start the
mapping procedure. We used the N-Triples format to populate the RDF storage. To map a concept onto
the ontology, several steps are necessary: define the concept as a "Named
Individual", state that it belongs to the class "Concept", and note that it may belong to a "Group". An example
is shown below.</p>
        <p>Urban:SomeConcept rdf:type owl:NamedIndividual.</p>
        <p>Urban:SomeConcept rdf:type Urban:Concept.</p>
        <p>Urban:SomeConcept Urban:belongsToGroup Urban:someGroup .</p>
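        <p>The mapping step for a single concept can be sketched as plain string templating (an RDF library would normally do this); the concept and group names are placeholders, mirroring the example above.</p>
        <p>
```python
# Sketch of emitting the mapping triples for one concept in the prefixed
# form shown above; "Privacy" and "Space" below are placeholder names.
def concept_triples(concept, group=None):
    triples = [
        f"Urban:{concept} rdf:type owl:NamedIndividual .",
        f"Urban:{concept} rdf:type Urban:Concept .",
    ]
    if group:
        triples.append(f"Urban:{concept} Urban:belongsToGroup Urban:{group} .")
    return triples
```
        </p>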
      </sec>
      <sec id="sec-2-4">
        <title>8 http://purl.org/vocab/aiiso/schema.</title>
      </sec>
      <sec id="sec-2-5">
        <title>9http://xmlns.com/foaf/spec/ 10http://linkedscience.org/teach/ns/teach.rdf 11http://protege.stanford.edu 12http://ontodia.org</title>
        <p>Mapping of the rest of the extracted data onto the ontology was done in a similar way.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>To evaluate the results of the project, it is necessary to analyze the data, namely, to
identify the main dependencies and estimate the existing trends. The resulting triple store
contains 36,926 statements. To extract the necessary information, the SPARQL query language
was used. For example, a query that returns the frequencies of concept mentions
is presented below:</p>
      <p>PREFIX Urban: &lt;http://www.semanticweb.org/Urban&gt;
SELECT ?concept (COUNT(?concept) AS ?all)
WHERE { ?submission Urban:hasConcept ?concept . }
GROUP BY ?concept
ORDER BY DESC(?all)</p>
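      <p>The same aggregation over an in-memory list of (submission, concept) pairs is a plain counting operation; collections.Counter mirrors the GROUP BY/COUNT of the query. The sample pairs below are hypothetical.</p>
      <p>
```python
from collections import Counter

# In-memory equivalent of the SPARQL query above: count how often each
# concept is attached to a submission, most frequent first.
pairs = [("s1", "public space"), ("s2", "public space"), ("s2", "park"),
         ("s3", "public space"), ("s3", "community")]

concept_freq = Counter(concept for _, concept in pairs)
```
      </p>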
      <p>The distribution is presented in Table 1. 1491 users participated in the exercise;
however, only 721 tasks were submitted. Peer assessment was used to evaluate the
results of the exercise. A total of 308 reviews were left. At the same time, only 88
submissions had at least one review. Submissions were rated according to the offered
criteria. A total of 2704 ratings were left.</p>
      <p>To analyze the relation between what students say and what they actually implement, we
extracted the concepts from the submissions and the reviews. A total of 1816 concepts
were extracted. The distribution of the most popular concepts is presented in Figure 6.</p>
      <p>A significant part of the most popular concepts consists of overly common ideas like
"building", "road", and "city", which have a neutral meaning and do not reflect user preferences.
If we exclude such concepts, we can analyze the concepts that really reflect the users' vision.</p>
      <p>It is possible to group concepts into several main clusters and find out the popularity
of each cluster. The results are presented in Figure 7. Among them:</p>
      <p>People. This cluster includes the concepts 'people', 'community', 'neighborhood', etc.,
and reflects concern for the individual and their relationship with the community.
Space. Includes concepts responsible for describing the environment: 'space',
'open space', 'public space', etc.</p>
      <p>Centrality. This criterion is used to understand the structural properties of complex
relational networks. Centrality measures identify that, in a network, some nodes
are more central than others.</p>
      <p>Green areas. Includes concepts like 'park', 'green area', 'green space', etc.
Visibility. Visibility analysis shows the visual impact from a point onto the
surrounding environment, affected by obstructions and shaping the skyline. In the
city, urban elements such as topography, buildings, trees, etc., form part of the
urban atmospheric visibility.</p>
      <p>Accessibility. Accessibility can be defined as the ease with which one place can be
reached from another, and it depends on the spatial distribution of the given
locations.</p>
      <p>Water. This cluster includes concepts that describe water environment: ’water’,
’waterfront’, etc.</p>
      <p>Connectivity. For urban networks, connectivity is used to understand spatial
conditions affecting pedestrian activities and behaviors in cities.
Density. Density can be defined as the mass of an object per unit area. In
architecture and urban planning, physical density is a numerical measure of the
concentration of individuals or physical structures within a given unit area.</p>
      <p>Adjectives are used to describe and clarify the types of zones, places, etc. The list of
the most popular and significant adjectives encountered in the submissions is presented
in Figure 8.</p>
      <p>According to this information, students were highly concerned with the individual, their
needs, and their role as a member of society. The second important aspect
is space organization. Centrality, Visibility, Accessibility, Connectivity, and Density are
the criteria by which the works were rated, so students are expected to mention these
terms in their submissions. Finally, parks, green zones, and the water environment were quite
popular topics in the submissions.</p>
      <p>Students were offered several tasks, and these tasks were to be rated
according to different criteria. We united similar criteria and counted the number of ratings
and the number of corresponding concept mentions in the submissions. The results can
be seen in the table below:</p>
      <p>According to the data, among the 721 submissions, concepts reflecting the criteria are
mentioned in only 30 submissions on average. But there is a connection between the
number of ratings for each criterion and the number of corresponding concepts. The Pearson
correlation coefficient is sufficiently high (0.86). Thus, if a submission contains a concept
reflecting the corresponding criterion, it has a better chance of being rated.</p>
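      <p>The reported coefficient follows the standard Pearson formula, which can be computed directly; the function below is a self-contained version (the actual per-criterion counts are in the table above and are not reproduced here).</p>
      <p>
```python
import math

# Pearson correlation between two paired samples, e.g. per-criterion rating
# counts vs. concept-mention counts.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```
      </p>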
      <p>The above data collection is the result of processing all tasks performed by students.
Finally, the task from Responsive Cities [EdX FC-04x-2] was considered separately.</p>
      <p>We used LDA for the purposes of data visualization. LDA is an unsupervised
technique, meaning that before running the model we do not know how many topics exist
in our corpus. Topic coherence is one of the main techniques used to estimate the
number of topics [Rosner et al. 2018] [Muhammad Omar et al. 2015].</p>
      <p>On the left side of Figure 9 we can see circles representing different topics and the
distances between them. This approach allows us to clearly see the density of the task
descriptions, as well as to identify any anomalies.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>Qua-kit was used for conducting the MOOC exercises. It provided the data of peer reviews and
task descriptions for subsequent analysis. We proposed a method to identify and classify
keywords reflecting the concepts operated by MOOC students. The keywords and exercise
materials, such as peer reviews and descriptions, were represented in an RDF storage. This
allowed handling these materials with queries that are natural for humans. Then,
we performed an LDA analysis of the data. We analyzed how the students described
their design submissions and evaluated the submissions of their peers.</p>
      <p>In the future, we plan to integrate and automate the method to improve the exercise tasks.
Additionally, we are looking into automating the verification of peer assessments and searching for
anomalies. Another research direction is combining the analysis of submission content
(geometry) with the analysis of submission descriptions to study the relations between the
two.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This work is partially supported by the Scientific Technological Cooperation Program
Switzerland-Russia (STCPSR) 2015 (project IZLRZ1164056).</p>
      <p>The work is partially supported by the RGNF grant 16-23-41007.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[Keßler et al. 2013]
          <string-name><surname>Keßler</surname>, <given-names>Carsten</given-names></string-name>,
          <string-name><surname>d'Aquin</surname>, <given-names>Mathieu</given-names></string-name>, and
          <string-name><surname>Dietze</surname>, <given-names>Stefan</given-names></string-name>
          (<year>2013</year>)
          <article-title>Linked Data for science and education</article-title>.
          //<source>Semantic Web</source>, p.
          <fpage>1</fpage>-<lpage>25</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[Kim Su Nam et al. 2013]
          <string-name><surname>Kim</surname>, <given-names>Su Nam</given-names></string-name>,
          <string-name><surname>Medelyan</surname>, <given-names>Olena</given-names></string-name>,
          <string-name><surname>Kan</surname>, <given-names>Min-Yen</given-names></string-name>, and
          <string-name><surname>Baldwin</surname>, <given-names>Timothy</given-names></string-name>
          (<year>2013</year>)
          <article-title>Automatic keyphrase extraction from scientific articles</article-title>.
          //<source>Language Resources and Evaluation</source>, p.
          <fpage>723</fpage>-<lpage>742</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[Mueller et al. 2018]
          <string-name><surname>Mueller</surname>, <given-names>Johannes</given-names></string-name>,
          <string-name><surname>Lu</surname>, <given-names>Hangxin</given-names></string-name>,
          <string-name><surname>Chirkin</surname>, <given-names>Artem</given-names></string-name>,
          <string-name><surname>Klein</surname>, <given-names>Bernhard</given-names></string-name>, and
          <string-name><surname>Schmitt</surname>, <given-names>Gerhard</given-names></string-name>
          (<year>2018</year>)
          <article-title>Citizen Design Science: A strategy for crowd-creative urban design</article-title>.
          //<source>Cities</source>, p.
          <fpage>181</fpage>-<lpage>188</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Rosner et al.
          <year>2018</year>
          <article-title>] Rosner, Frank and Hinneburg, Alexander and Ro¨der, Michael and Nettling, Martin and Both</article-title>
          ,
          <string-name>
            <surname>Andreas</surname>
          </string-name>
          (
          <year>2013</year>
          )
          <article-title>Evaluating topic coherence measures</article-title>
          .
          <source>//Conference: Neural Information Processing Systems Foundation (NIPS</source>
          <year>2013</year>
          ), p.
          <fpage>23</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[Kovriguina et al. 2017]
          <string-name><surname>Kovriguina</surname>, <given-names>L.</given-names></string-name>,
          <string-name><surname>Shilin</surname>, <given-names>I.</given-names></string-name>,
          <string-name><surname>Putintseva</surname>, <given-names>A.</given-names></string-name>, and
          <string-name><surname>Shipilo</surname>, <given-names>A.</given-names></string-name>
          (<year>2017</year>)
          <article-title>Russian Tagging and Dependency Parsing Models for Stanford CoreNLP Natural Language Toolkit</article-title>.
          //<source>International Conference on Knowledge Engineering and the Semantic Web</source>, p.
          <fpage>101</fpage>-<lpage>111</lpage>. Springer, <year>2017</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>[Muhammad</surname>
          </string-name>
          Omar et al.
          <year>2015</year>
          ]
          <article-title>LDA topics: Representation and evaluation (2015) Russian Tagging and Dependency Parsing Models for Stanford CoreNLP Natural Language Toolkit</article-title>
          . //Journal of Information Science, p.
          <fpage>662</fpage>
          -
          <lpage>675</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Volchek et al. 2017]
          <string-name>
            <surname>Volchek</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romanov</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouromtsev</surname>
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2017</year>
          )
          <article-title>Towards the Semantic MOOC: Extracting, Enriching and Interlinking E-Learning Data in Open edX Platform</article-title>
          . //Knowledge Engineering and
          <string-name>
            <surname>Semantic Web. KESW</surname>
          </string-name>
          <year>2017</year>
          .
          <article-title>Communications in Computer</article-title>
          and Information Science, p.
          <fpage>662</fpage>
          -
          <lpage>675</lpage>
          . - Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>