<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Principal Component Analysis in Topic Modelling of Short Text Document Collections</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hennadii Dobrovolskyi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nataliya Keberle</string-name>
          <email>nkeberle@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Zaporizhzhya National University</institution>
          ,
          <addr-line>Zhukovskogo st. 66, 69600, Zaporizhzhya</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>48</fpage>
      <lpage>54</lpage>
      <abstract>
        <p>This paper presents the motivation for and the preliminary theoretical investigations of the PhD project by the first author. The objective of the research is to propose and to experimentally verify an approach applying the eigendecomposition of principal component analysis to topic modelling of short text document collections. The main hypothesis examined in this project is that principal component analysis applied to word co-occurrence statistics turns topic modelling into a well-defined problem with a unique solution and natural fitting parameters. The project is performed at the Dept. of Computer Science of Zaporizhzhya National University.</p>
      </abstract>
      <kwd-group>
        <kwd>text mining</kwd>
        <kwd>short text document</kwd>
        <kwd>topic modelling</kwd>
        <kwd>principal component analysis</kwd>
        <kwd>eigendecomposition</kwd>
        <kwd>clusterization</kwd>
        <kwd>KeyTerms</kwd>
        <kwd>MathematicalModel</kwd>
        <kwd>MachineIntelligence</kwd>
        <kwd>DescriptiveModel</kwd>
        <kwd>KnowledgeRepresentation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This paper presents a PhD project aimed at developing a method for
probabilistic topic modelling of collections of short text documents. It is assumed
that the documents are literate texts or have another known structure that allows
all the links between terms to be discovered in a natural way. For instance, a
collection of scientific paper abstracts and titles contains well-formed sentences that
can be parsed with NLP tools. The project concentrates on the analysis of
scientific abstracts covering one vague domain of knowledge. That means the documents
have one principal topic and some extra ones associated with related domains
of knowledge.</p>
      <p>It is well known that the number of scientific publications grows faster than
an average scientist can analyze. Thus it is important to have a tool that can maintain
the current state of a document collection, ensuring that it completely covers a domain
of interest. Therefore, a well-grounded method to determine the topics an
unknown document belongs to is useful.</p>
      <p>
        The main hypothesis examined in this project is that principal component
analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] applied to word co-occurrence statistics turns topic modelling into a
well-defined problem with a unique solution and natural fitting parameters.
      </p>
      <p>The discovered topics can be used to represent short documents as vectors
of real numbers appropriate for retrieval and clustering. Moreover, the terms
associated with topics can be used to search for new documents and to extend
the collection.</p>
      <p>
        Common document topic modelling is an ill-posed problem that
does not have a unique solution. Therefore, different additional conditions are
added and combined to get comprehensible topic models. Often these restrictions
are poorly grounded heuristics that require diverse tricks to combine them [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Applying common document topic modelling to short texts leads to lower
quality of the discovered topics compared to ones derived from long texts.
      </p>
      <p>The objective of the presented project is to develop and evaluate a method
to determine the set of topics of a short text document collection and assign topic
weights to each document in the collection.</p>
      <p>
        As a theoretical background the project uses natural language processing
(NLP) methods such as part-of-speech tagging [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], stemming [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], sentence splitting
and parsing [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Information retrieval approaches [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] are used to exclude
insignificant information during preprocessing. Following the mainstream of
probabilistic topic modelling, Principal Component Analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is applied to derive the
most significant collection topics from word co-occurrence frequencies. The HEP
collection [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] provides sample data to verify the suggested method.
      </p>
      <p>The rest of the paper is organized as follows. Section 2 contains a short
review of the related work. This is followed by a short description of the suggested
method in Section 3. Experiment goals and workflow are illustrated and explained
in Section 4. Finally, concluding remarks are presented briefly in Section 5
and several possible future directions are also pointed out.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works and Motivation</title>
      <p>
        Probabilistic topic models [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] are a set of algorithms providing a statistical
solution to the problem of handling large collections of documents. The basic idea
behind topic modelling is to construct a low-dimensional document
representation using a few groups of tightly connected significant terms instead of separate
words. The best known method of topic modelling is Latent Dirichlet Allocation
(LDA) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which overcomes deficiencies of earlier approaches and is successful
and simple enough. A general introduction and survey of topic modelling
can be found in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] along with a novel approach, called Additive Regularization
of Topic Models. However, the primary direction of topic model enhancement
is still regularization, i.e. incorporating different restrictions into the basic
algorithm. The origins of the restrictions are not limited, and sometimes (as in LDA) an
additional condition is applied simply because it is manageable and it works.
      </p>
      <p>
        Another drawback of the common topic model is that the shorter the documents in a
collection, the less accurate the result. This is overcome by approaches utilizing
word co-occurrence statistics [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] instead of counting document-word pairs.
      </p>
      <p>
        Similar results can be reached in quite a different way, with a combination of
NLP and clustering algorithms [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. However, clustering in a high-dimensional
discrete space is a time-demanding problem, so mapping documents to a
low-dimensional space R^n can accelerate the clustering and subsequent analysis.
      </p>
      <p>Therefore, a method of topic modelling that replaces the magic restrictions
with comprehensible ones will be valuable. The project presented in this paper
aims at the development, evaluation and application of a method based on the
PCA approach for word co-occurrence probabilities.</p>
    </sec>
    <sec id="sec-3">
      <title>Method Description</title>
      <p>Let D be a collection of documents and W a dictionary containing all terms used in
D. Each document d ∈ D is a sequence of n_d terms (w_1, …, w_{n_d}). A term can
occur many times in a document. A "term" may be a word or a group of words.</p>
      <p>
        The suggested method shares with the common document topic model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
the following assumptions:
      </p>
      <p>Assumption 1: Each term w in the document d is related to a topic t from a
set of topics T. The collection of documents is formed as a set of triples (d, w, t),
independently drawn at random from a discrete probability p(d, w, t) defined
over the set D × W × T. The document d and the term w are observable; the
topic t is a hidden parameter.</p>
      <p>Assumption 2: The order of terms in a document does not matter.</p>
      <p>Assumption 3: The order of documents in the collection does not matter.</p>
      <p>Assumption 4: The conditional probability p(w|d, t) does not depend on the
document d, i.e. p(w|d, t) = p(w|t).</p>
      <p>
        As well as Word Network Topic Model [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and Biterm Topic Model [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] the
suggested method utilizes the probability p(w_i, w_k) that both word w_i and word w_k
occur in the same document or document fragment:
p(w_i, w_k) = ∑_{t=1}^{T} p(w_i|t) p(t) p(w_k|t)   (1)
      </p>
      <p>where p(w_i, w_k) is a joint probability and t is a topic identifier. In the presented
project the probability is estimated as the relative number of pairs (w_i, w_k).</p>
      <p>Term pairs (w_i, w_k) are collected in two steps. First, each document d_k in the
collection is mapped to a set of short term sequences S(d_k) = (s_k1, s_k2, …), where
s_kq = (w_kq1, …, w_kqr). Second, each sequence s_kq is mapped to the pairs (w_i, w_k),
w_i ∈ s_kq, w_k ∈ s_kq, w_i ≠ w_k.</p>
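      <p>The two-step pair collection above can be sketched as follows; this is an illustrative sketch, not the project's implementation, and the function name sentence_pairs and the decision to count each unordered pair once per sequence are assumptions:</p>

```python
from itertools import combinations

def sentence_pairs(sequences):
    """Map short term sequences s_kq to unordered term pairs (w_i, w_k),
    w_i != w_k, as described above; each pair is counted once per sequence."""
    pairs = []
    for terms in sequences:
        for w_i, w_k in combinations(sorted(set(terms)), 2):
            pairs.append((w_i, w_k))
    return pairs

# a toy document split into two term sequences
doc = [["topic", "model", "short", "text"], ["topic", "text"]]
print(len(sentence_pairs(doc)))  # 7 pairs in total
```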
      <p>Topic model creation is the estimation of the probabilities p(w_i|t) and p(t|d_k). It
is assumed that the number of significant topics is far smaller than the number of
words and the number of documents, which simplifies further manipulations
such as search, comparison, clustering, etc.</p>
      <p>In our document generation model, the document d_k is represented by the
set of conditional probabilities p(t|d_k). d_k is a bag of terms, and we apply
Gibbs sampling to create such a bag of terms. First, the document covariance
matrix p(w_i, w_k) is calculated using Eq. (1), where the topic probabilities p(t) are
replaced with p(t|d_k). Second, a random topic t is selected according to the
conditional probability p(t|d_k), and the initial set of N terms for the document is
randomly selected based on the term probabilities p(w_i|t). Third, repetitive
sampling is used to replace each term in the document. One iteration of the
sampling is a three-step process:
1. choose the term position j that will be updated;
2. calculate the naive Bayes probability (2) for each term w in the dictionary W,
where Z is a normalizing denominator and w_i is the term placed at the i-th
position in the document;
3. get a new random term w according to the probability (2) and place it at
position j.</p>
      <p>In the presented work, the dimensionality of the covariance matrix is decreased
through stemming and by omitting words which are not nouns or adjectives,
stop-words, and rare words.</p>
      <p>
        Words which are not nouns or adjectives are readily detected with a
part-of-speech tagger [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. They have been shown to make a small contribution to document topic
assignment [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>Stop-words are terms that do not affect topic detection. There are two
groups of stop-words: collection-specific and common stop-words. Various lists
of common stop-words are available online1, but the collection-specific ones have
to be constructed.</p>
      <p>To extract the set of collection-specific stop-words the covariance p(w_i, w_j)
is employed. The hypothesis is that a stop-word has a large value of the Shannon
information entropy
H(w_i) = −∑_{j=1}^{|W|} p(w_i, w_j) log [p(w_i, w_j)]   (3)
A value of H(w_i) exceeding some threshold Hmax signals that w_i can accompany
any other word. Therefore it is not effective for detecting topics and should be
dropped out. Hmax may be considered an additional parameter for adjusting the
algorithm.</p>
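      <p>The entropy-based stop-word filter can be sketched with NumPy; the function name and the toy matrix are illustrative assumptions, with rows of the joint probability matrix as input:</p>

```python
import numpy as np

def stopword_mask(P, h_max):
    """Flag terms w_i whose co-occurrence entropy
    H(w_i) = -sum_j P[i, j] * log P[i, j] exceeds h_max (cf. Eq. (3))."""
    logs = np.where(P > 0, np.log(np.where(P > 0, P, 1.0)), 0.0)
    H = -(P * logs).sum(axis=1)
    return H > h_max

# row 0 co-occurs uniformly (stop-word-like), row 1 is concentrated
P = np.array([[0.25, 0.25, 0.25, 0.25],
              [0.97, 0.01, 0.01, 0.01]])
print(stopword_mask(P, h_max=1.0))  # [ True False]
```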
      <p>Rare words are detected by comparing the single-word probability p(w_i)
with a threshold value Pmax, where
p(w_i) = ∑_{j=1}^{|W|} p(w_i, w_j)   (4)
1 For instance, a list of English stop-words is available at the Snowball stemmer site
http://snowball.tartarus.org/algorithms/english/stop.txt</p>
      <p>One of the ways to define Pmax is to require that the cumulative distribution
function equals some parameter α:
α = ∑_{p(w_i) ≥ Pmax} p(w_i)   (5)
That means the kept terms cover a predefined percentage of occurrences.</p>
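      <p>Choosing the rare-word threshold through a coverage parameter amounts to keeping the most probable terms until their cumulative probability reaches the target fraction; a sketch, where the function name and toy probabilities are assumptions:</p>

```python
import numpy as np

def keep_terms(p_w, alpha):
    """Keep the most probable terms whose cumulative probability
    covers a fraction alpha of occurrences (cf. Eq. (5))."""
    order = np.argsort(p_w)[::-1]           # most probable first
    covered = np.cumsum(p_w[order])
    n_keep = int(np.searchsorted(covered, alpha)) + 1
    keep = np.zeros(len(p_w), dtype=bool)
    keep[order[:n_keep]] = True
    return keep

p_w = np.array([0.5, 0.3, 0.15, 0.05])      # word probabilities, cf. Eq. (4)
print(keep_terms(p_w, alpha=0.9))  # [ True  True  True False]
```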
      <p>After all the excessive words are dropped, the joint probability matrix
becomes much smaller and can be decomposed into a product of three
matrices according to Eq. (1). The main point of the presented method is setting the
number of topics T to the dimensionality of the square covariance matrix P_ij. Then
Eq. (1) becomes an eigendecomposition problem whose solution produces the
conditional word probabilities p(w_j|t) as eigenvectors and the topic probabilities p(t)
as eigenvalues.</p>
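      <p>The eigendecomposition step can be sketched with numpy.linalg.eigh for a symmetric matrix. Note this is only the algebraic step: eigenvectors are orthonormal rather than proper probability distributions, and the toy matrix construction is an assumption:</p>

```python
import numpy as np

# build a toy symmetric matrix standing in for the joint probabilities of Eq. (1)
rng = np.random.default_rng(0)
A = rng.random((4, 4))
P = (A + A.T) / 2                       # symmetrize, as a co-occurrence matrix is

p_t, p_w_t = np.linalg.eigh(P)          # eigenvalues ~ p(t), eigenvectors ~ p(w|t)
p_t, p_w_t = p_t[::-1], p_w_t[:, ::-1]  # largest "topics" first

# the full decomposition reconstructs P exactly; truncation is the PCA step
assert np.allclose((p_w_t * p_t) @ p_w_t.T, P)
```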
      <p>
        The next step is to reduce the number of topics. The method of Principal Component
Analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] states that the matrix P_ij can be approximated by setting the
smallest values of the topic probabilities p(t) to zero. PCA also suggests a way to select
the most significant topics relying on the calculated topic probabilities. After the values
of p(w_j|t) are calculated, the topic detection of a document is performed using
the expression
p(t|d) = ∑_{i=1}^{|W|} p(t|w_i) p(w_i|d)   (6)
where p(t|w_i) is found from the Bayes equation
p(t|w_i) = p(w_i|t) p(t) / p(w_i)   (7)
p(w_i) (see Eq. (4)) is the probability that word w_i occurs in the collection, and
p(w_i|d) is the relative frequency of word w_i in document d.
      </p>
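      <p>The two Bayes steps above combine into a few lines of linear algebra; the helper name topic_weights and the toy numbers are assumptions:</p>

```python
import numpy as np

def topic_weights(p_w_given_t, p_t, p_w_given_d):
    """p(t|w_i) = p(w_i|t) p(t) / p(w_i)   (Eq. (7)), then
    p(t|d) = sum_i p(t|w_i) p(w_i|d)       (Eq. (6))."""
    p_w = p_w_given_t @ p_t                          # marginal p(w_i), cf. Eq. (4)
    p_t_given_w = (p_w_given_t * p_t) / p_w[:, None]
    return p_t_given_w.T @ p_w_given_d

p_w_given_t = np.array([[0.7, 0.1],   # |W| x T matrix of p(w|t)
                        [0.2, 0.2],
                        [0.1, 0.7]])
p_t = np.array([0.5, 0.5])
p_w_given_d = np.array([1.0, 0.0, 0.0])  # document consisting of word 0 only
print(topic_weights(p_w_given_t, p_t, p_w_given_d))  # [0.875 0.125]
```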
      <p>
        A comprehensive and automated evaluation measure of topic quality is
Topic Coherence [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
C(z, M) = ∑_{t1=2}^{T} ∑_{t2=1}^{t1−1} log [ (D(m_t1, m_t2) + ϵ) / (D(m_t1) D(m_t2)) ]   (8)
where M = (m_1, …, m_T) is the list of the T most probable terms in a topic
z, D(m) counts the number of documents containing the term m, D(m_1, m_2)
counts the number of documents containing both m_1 and m_2, and ϵ is used
to avoid log(0). The evaluation metric of the entire topic model is the average
coherence score over all topics. Topic coherence is directly related to the probability
that the top topic terms can be found in the same document, so higher topic
coherence indicates better topic quality.</p>
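      <p>A sketch of the pairwise coherence score for a single topic, assuming every top term occurs in at least one document; the function name and toy data are illustrative:</p>

```python
import math

def topic_coherence(top_terms, docs, eps=1e-12):
    """Sum over pairs of top terms of log[(D(m1, m2) + eps) / (D(m1) D(m2))],
    where D counts documents containing the given term(s) (cf. Eq. (8))."""
    def D(*terms):
        return sum(all(t in doc for t in terms) for doc in docs)
    score = 0.0
    for a in range(1, len(top_terms)):
        for b in range(a):
            m1, m2 = top_terms[a], top_terms[b]
            score += math.log((D(m1, m2) + eps) / (D(m1) * D(m2)))
    return score

docs = [{"pca", "topic"}, {"pca", "text"}, {"topic", "text"}]
print(round(topic_coherence(["pca", "topic"], docs), 3))  # -1.386
```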
    </sec>
    <sec id="sec-4">
      <title>Experiment planning</title>
      <p>
        The experiments aim to check whether the method presented above provides valid
and high-quality topic definitions. To answer the main question the experiments
should explore the impact of the factors listed in Section 3 on the quality of
topic definitions, namely:
1. How does the topic model quality depend on the threshold value of
information entropy?
2. How does the topic model quality depend on the threshold value of rare-word
frequency?
3. Which of the available PCA decompositions fits the explored method?
4. How does the lower limit of topic probability influence the quality of results?
5. Which method of word pair extraction is better:
(a) combination of consecutive words (as in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]);
(b) combination of adjacent terms in a grammar tree;
(c) combination of all possible words in a separate sentence;
(d) all possible word pairs in a sliding window of size r [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>The general experiment workflow contains the following steps:
1. Extract the title and abstract from each document of the collection;
2. Extract word pairs from titles and abstracts using one of the word pair
extraction methods;
3. Apply stemming and omit words which are not nouns or adjectives, stop-words,
and rare words, setting Hmax and α;
4. Apply one of the PCA alternatives to extract word probabilities in topics, setting
the minimal value of topic probability;
5. Calculate the average topic coherence to measure the quality of the topic set.</p>
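      <p>Steps 2 and 4 of the workflow above can be sketched end to end on toy data; the preprocessing of step 3 is omitted and all names are illustrative assumptions:</p>

```python
import numpy as np
from itertools import combinations

def pair_matrix(docs):
    """Count within-document term pairs and normalize to a joint
    probability matrix p(w_i, w_k) (step 2 plus the Eq. (1) estimate)."""
    vocab = sorted({w for d in docs for w in d})
    idx = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for d in docs:
        for a, b in combinations(sorted(set(d)), 2):
            C[idx[a], idx[b]] += 1
            C[idx[b], idx[a]] += 1
    return vocab, C / C.sum()

docs = [["pca", "topic", "model"], ["topic", "model"], ["pca", "model"]]
vocab, P = pair_matrix(docs)
p_t, p_w_t = np.linalg.eigh(P)        # step 4: eigendecomposition of Eq. (1)
print(vocab, round(P.sum(), 6))       # ['model', 'pca', 'topic'] 1.0
```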
      <p>
        The experiment will use the HEP data collection [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which is oriented to the
study of multi-label text classifiers. It consists of scientific papers in the field
of High Energy Physics (HEP) obtained from the document server of the European
Organization for Nuclear Research (CERN).
      </p>
      <p>The experiments should show the dependencies of the average topic coherence
on Hmax, α, the minimal value of topic probability, the type of PCA and the
extraction method.</p>
    </sec>
    <sec id="sec-5">
      <title>Concluding remarks and future work</title>
      <p>The project was started in December 2016 and is at the stage of detailed
planning of experiments and exploration of background technologies and theories.
Future plans include (a) implementation of all the necessary software
components; (b) evaluating the quality of the proposed basic algorithms; (c) component
integration and running the whole workflow; (d) application of the developed method
to practical tasks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Jolliffe</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          : Principal Component Analysis (2nd ed.). Springer,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Vorontsov</surname>
            ,
            <given-names>K.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potapenko</surname>
            ,
            <given-names>A.A.:</given-names>
          </string-name>
          <article-title>Tutorial on Probabilistic Topic Modeling: Additive Regularization for Stochastic Matrix Factorization</article-title>
          . In: Ignatov,
          <string-name>
            <surname>D.I.</surname>
          </string-name>
          et al. (
          <source>Eds.) Proc. AIST2014</source>
          , CCIS 436, pp.
          <fpage>29</fpage>
          -
          <lpage>46</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network</article-title>
          . In: Hearst,
          <string-name>
            <given-names>M.</given-names>
            and
            <surname>Ostendorf</surname>
          </string-name>
          , M. (Eds.)
          <source>Proc. HLT-NAACL2003</source>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>259</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghavan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schütze</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          : Introduction to Information Retrieval. Cambridge University Press,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>A Fast and Accurate Dependency Parser using Neural Networks</article-title>
          . In: Moschitti,
          <string-name>
            <surname>A.</surname>
          </string-name>
          et al. (
          <source>Eds.) Proc. EMNLP</source>
          <year>2014</year>
          , pp.
          <fpage>740</fpage>
          -
          <lpage>750</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Montejo-Ráez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinberger</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ureña-López</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Adaptive Selection of Base Classifiers in One-Against-All Learning for Large Multi-Labeled Collections</article-title>
          . In:
          <string-name>
            <surname>Vicedo J. L</surname>
          </string-name>
          . et al. (
          <source>Eds.) Proc. 4th Intl Conf. Advances in Natural Language Processing (EsTAL2004)</source>
          ,
          <source>LNAI 3230</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jordan</surname>
            ,
            <given-names>M.I.</given-names>
          </string-name>
          :
          <article-title>Latent Dirichlet Allocation</article-title>
          .
          <source>In: J. Machine Learning Research</source>
          , Vol.
          <volume>3</volume>
          , pp.
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Zuo</surname>
            ,
            <given-names>Yu.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Ji.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Word Network Topic Model: A Simple but General Solution for Short and Imbalanced Texts</article-title>
          . The Computing Research Repository (CoRR), http://arxiv.org/abs/1412.5404,
          <year>December 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>A Biterm Topic Model For Short Texts</article-title>
          . In: Schwabe,
          <string-name>
            <surname>D.</surname>
          </string-name>
          et al.(
          <source>Eds.) Proc. 22nd Intl Conf. World Wide Web, ACM</source>
          , pp.
          <fpage>1445</fpage>
          -
          <lpage>1456</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Popova</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khodyrev</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Egorov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Logvin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulayev</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karpova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouromtsev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Sci-Search: Academic Search and Analysis System Based on Keyphrases</article-title>
          . In: Klinov,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Mouromtsev</surname>
          </string-name>
          ,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (Eds.)
          <source>Proc. 4th Intl Conf. Knowledge Engineering and the Semantic Web, CCIS 394</source>
          , pp.
          <fpage>281</fpage>
          -
          <lpage>288</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Aletras</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevenson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Evaluating Topic Coherence Using Distributional Semantics</article-title>
          .
          <source>In: Proc. 10th Intl Workshop on Computational Semantics (IWCS2013)</source>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>22</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>