<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Black-Box Classification Techniques for Demographic Sequences: from Customised SVM to RNN</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anna Muratova</string-name>
          <email>amuratova@hse.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavel Sushko</string-name>
          <email>sushkope@mail.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas H. Espy</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Sociology of the Russian Academy of Sciences</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Research University Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Pittsburgh</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <fpage>31</fpage>
      <lpage>40</lpage>
      <abstract>
<p>Nowadays there is a large amount of demographic data which should be analysed and interpreted. More useful information can be extracted from accumulated demographic data by applying modern methods of data mining. The aim of this study is to compare methods of classification of demographic data by customising SVM kernels using various similarity measures. Since demographers are interested in sequences without discontinuity, formulas for similarity measures of such sequences were derived. These were then used as kernels in the SVM method, which is the novelty of this study. Recurrent neural network algorithms, such as SimpleRNN, GRU and LSTM, are also compared. The best classification result with the SVM method is obtained with a special kernel that transforms sequences into features, but recurrent neural networks outperform SVM.</p>
      </abstract>
      <kwd-group>
        <kwd>data mining</kwd>
        <kwd>demographics</kwd>
        <kwd>support vector machines</kwd>
        <kwd>neural networks</kwd>
        <kwd>classification</kwd>
        <kwd>sequences similarity</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>Nowadays researchers from different countries have access to a large amount of
demographic data about important demographic events and their sequences. More useful
information from accumulated demographic data can be extracted by applying
modern methods of data mining.</p>
      <p>
The main task of this study is to find the most accurate classification method for
analysing demographic sequences. Various methods are used for classification: decision trees, support vector machines (SVM), k-nearest neighbours (kNN), neural
networks and others. This paper is a continuation of [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], in which decision trees,
kNN and SVM were compared. The purpose of this paper is to compare methods for
classifying demographic data by customising the SVM kernel using various similarity
measures for sequences of events. Neural network algorithms are also compared.
Alternative treatment of the problem by means of Pattern Mining [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Formal
Concept Analysis [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and Pattern Structures [
        <xref ref-type="bibr" rid="ref10 ref11">10,11</xref>
        ], in particular, is given in [
        <xref ref-type="bibr" rid="ref13 ref14">13,14</xref>
        ].
      </p>
<p>Data were obtained from the scientific laboratory of socio-demographic policy at
HSE and contain the results of a survey of 6,626 people: 3,314 men and 3,312
women. The database records the dates of significant events in respondents’ lives,
such as partner, marriage, break-up, divorce, education, work, separation from
parents and birth of a child. The data also include personal features: type of education
(general, higher, professional), location (city, town, country), religion, frequency of
church attendance, generation (Soviet, 1930-1969; modern, 1970-1986) and gender.</p>
<p>Section 2 presents a brief theoretical framework on sequence similarity measures,
namely “the longest common subsequence” (LCS) and “all common subsequences”
(ACS). Section 3 presents the results on the classification of demographic
data by sequences of events without discontinuities. In Section 3.1, special kernel
variants are used in the SVM (Support Vector Machines) method, and in Section 3.2
the results of recurrent neural networks (SimpleRNN, LSTM, GRU) are presented.
Section 4 is devoted to comparing the different classification methods and Section 5
presents the conclusions of the work.</p>
<p>The novelty of this work lies in the use of special kernel variants in the SVM
method. In addition, the results are improved with the help of recurrent neural
network algorithms.</p>
    </sec>
    <sec id="sec-2">
      <title>Sequence similarity measures</title>
<p>
        Sequence analysis is an important task in data analysis and machine learning [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1-4</xref>
        ].
Pairwise relations between sequences are often used in such tasks: methods
such as clustering and kernel methods depend on the calculation of distances and similarity
measures between sequences. When calculating similarity measures, complex combinatorial aspects must be taken into account, since sequences are
ordered collections of objects. Below we consider sequence similarity measures
based on the common subsequences contained in the sequences.
      </p>
<p>
        The similarity measure “all common subsequences” (ACS) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] between two sequences S and T is defined as
sim_ACS(S, T) = φ(S, T) / max{φ(S, S), φ(T, T)},   (1)
where φ(S, T) is the number of all common subsequences of S and T.
The similarity measure “longest common subsequence” (LCS) is calculated by the
formula
sim_LCS(S, T) = |LCS(S, T)| / max{|S|, |T|}.   (2)
      </p>
      <p>Demographers are interested in sequences without discontinuity (gaps). We are
interested in two options: common prefixes and common subsequences without
discontinuity. Let us transform the original formulas.</p>
<p>First consider the prefixes. In this case, the number of common prefixes of two
sequences is equal to the length of the longest common prefix of these sequences. Prefixes of
length zero are not considered. The number of prefixes of the sequence S is equal to
the length of the sequence, |S|.</p>
<p>Thus, the formulas for all common prefixes and for the longest common prefix of
two sequences coincide:
sim_LCSP(S, T) = sim_ACSP(S, T) = sim_CP(S, T) = |CP(S, T)| / max{|S|, |T|},   (3)
where LCSP is the longest common sequence prefix, ACSP is the set of all common
prefixes of the sequences S and T and CP is the set of common prefixes.</p>
<p>Now consider subsequences without discontinuities. First consider the case of the
longest common subsequence. As in the previous case, we get:
sim_LCS(S, T) = |LCS(S, T)| / max{|S|, |T|},   (4)
where LCS is the longest common subsequence of S and T without discontinuities.</p>
<p>Now let us look at all common subsequences without discontinuities. For this we
consider all common subsequences of S and T of different lengths without
discontinuities. The number of subsequences of the sequence S without discontinuities is
|S|(|S| + 1) / 2.   (5)
Since a longer sequence has more subsequences, we obtain the formula:
sim_ACS(S, T) = φ(S, T) / max{|S|(|S| + 1)/2, |T|(|T| + 1)/2},   (6)
where φ(S, T) is the number of common subsequences of S and T without discontinuities.</p>
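As an illustration, the gap-free measures above can be sketched in Python. This is a minimal sketch: the function names are ours, and counting distinct common gap-free subsequences in acs_sim is one reasonable reading of φ(S, T), not the authors' exact implementation.

```python
def common_prefix_sim(s, t):
    """CP/LCSP/ACSP similarity, formula (3): longest common prefix over max length."""
    k = 0
    while k < min(len(s), len(t)) and s[k] == t[k]:
        k += 1
    return k / max(len(s), len(t))

def lcs_contiguous(s, t):
    """Length of the longest common gap-free subsequence (i.e. common substring)."""
    best = 0
    dp = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best

def lcs_sim(s, t):
    """LCS similarity, formula (4)."""
    return lcs_contiguous(s, t) / max(len(s), len(t))

def acs_sim(s, t):
    """ACS similarity, formula (6): common gap-free subsequences, normalised
    by the |S|(|S| + 1)/2 count of formula (5) for the longer sequence."""
    def subs(x):
        return {tuple(x[i:j]) for i in range(len(x)) for j in range(i + 1, len(x) + 1)}
    common = len(subs(s) & subs(t))
    norm = max(len(s) * (len(s) + 1) // 2, len(t) * (len(t) + 1) // 2)
    return common / norm
```

All three functions return values between 0 (no similarity) and 1 (equal sequences), as required of the kernel entries in Section 3.1.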
    </sec>
    <sec id="sec-3">
      <title>Classification results</title>
      <sec id="sec-3-1">
        <title>Using a special kernel in the SVM method</title>
        <p>
          The SVM method is implemented in the scikit-learn library [
          <xref ref-type="bibr" rid="ref6">6</xref>
]. Its functions support classification of data by features; there are no built-in
functions for sequence analysis. The implementation does, however, provide the ability to connect
custom kernel functions. We use this feature to code and connect functions that
analyse sequences.
        </p>
<p>The kernels of the SVM method are based on sequence similarity measures without
discontinuity. In this work, a precomputed Gram matrix was passed to
the SVM method. Each element of the matrix in row i and column j contains
the calculated similarity measure of the two sequences with indices i and j.
The size of the matrix is n × n, where n is the number of sequences; accordingly, the
calculation time depends quadratically on the number of sequences.</p>
        <p>The following functions for calculating sequence similarity without discontinuity
were used: ACS, LCS and CP. Functions are written in Python.</p>
        <p>The values of the functions ranged from 0 to 1. The value 1 is obtained for equal
sequences, in particular, for the diagonal elements of the matrix. The value 0 is
obtained when there is no similarity between sequences.</p>
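The precomputed-kernel workflow described above can be sketched as follows. The toy sequences, labels and the prefix similarity function are illustrative placeholders, not the survey data or the authors' exact code; only the use of SVC(kernel="precomputed") with a Gram matrix reflects the scikit-learn API.

```python
import numpy as np
from sklearn.svm import SVC

def common_prefix_sim(s, t):
    """CP similarity: longest common prefix over max length."""
    k = 0
    while k < min(len(s), len(t)) and s[k] == t[k]:
        k += 1
    return k / max(len(s), len(t))

def gram_matrix(seqs_a, seqs_b, sim):
    """Matrix of pairwise similarities; O(len(seqs_a) * len(seqs_b)) evaluations."""
    return np.array([[sim(a, b) for b in seqs_b] for a in seqs_a])

# toy event sequences (each letter is one coded life event) and a binary target
train = ["pme", "pmc", "emw", "pmw"]
y = [0, 0, 1, 1]
test = ["pmb", "emc"]

clf = SVC(kernel="precomputed")
clf.fit(gram_matrix(train, train, common_prefix_sim), y)   # n_train x n_train Gram matrix
pred = clf.predict(gram_matrix(test, train, common_prefix_sim))  # n_test x n_train
```

Note that prediction also requires similarities of each test sequence against every training sequence, which is why both the fitting and prediction times in Table 1 grow with the number of sequences.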
<p>To evaluate the classification quality of the different methods, sex was used as the target
class for the test sample. The initial data were divided into training and test sets in
the ratio 80/20; the rows of data were shuffled beforehand for a more even
distribution between the training and test sets. Calculation time and accuracy results for the
similarity functions are presented in Table 1. For the CP function: model fitting time
400.97 sec, prediction time 98.66 sec, total time 499.62 sec.</p>
<p>The classification accuracies of the CP and ACS functions differ only slightly, and the
calculation time for CP is much smaller; the quality of the LCS prediction is not satisfactory.</p>
        <p>In this table and in all of the following tables, methods are compared by accuracy;
time is shown for reference.</p>
      </sec>
      <sec id="sec-3-2">
<title>Classification by features using the SVM method</title>
        <p>For comparison, we classify not by the sequences but by the respondents’ features: type of
education, place of residence, religiosity, frequency of church attendance and
generation. We use the SVM method with default parameters (RBF kernel function).
Results are presented in Table 2.</p>
<p>Model fitting time: 3.62 sec; prediction time: 0.52 sec; total time: 4.14 sec.</p>
        <p>The results in the table show that the accuracy is worse than with classification
by sequences of events. The built-in kernel function is much faster, since it is implemented in
C and does not compute the sequence similarity functions.</p>
      </sec>
<sec id="sec-3-3">
        <title>Classification by sequences, by features and by weighted sum of probabilities of sequences and features</title>
        <p>We can try to improve the result by combining the two methods
of classification: by sequences and by features. This can be done using the
probabilities of assignment to a certain class, calculated by the SVM method. To get the
probability values, you need to set the classifier parameter probability=True:
clf = svm.SVC(probability=True)</p>
        <p>As a result, the method returns a matrix with the number of columns equal to the
number of classes. In each position, there will be a probability of assigning a
sequence to the corresponding class.</p>
<p>Having obtained the probability tables for each method, we can classify based on
the weighted sum of the probabilities of the two methods. Since the methods give
different classification accuracies, the final probability of assigning an object to a
class is calculated by the formula:
P = (As · Ps + Af · Pf) / (As + Af),   (7)
where
As — accuracy by sequences,
Af — accuracy by features,
Ps — probability by sequences,
Pf — probability by features.</p>
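Combining rule (7) amounts to a weighted average of the two probability matrices returned by predict_proba. A minimal NumPy sketch, with toy probability matrices and accuracy values in place of the real ones:

```python
import numpy as np

def combine(P_seq, P_feat, A_seq, A_feat):
    """Formula (7): P = (As * Ps + Af * Pf) / (As + Af)."""
    return (A_seq * P_seq + A_feat * P_feat) / (A_seq + A_feat)

# toy predict_proba outputs (rows: objects, columns: classes)
P_seq = np.array([[0.7, 0.3], [0.4, 0.6]])
P_feat = np.array([[0.6, 0.4], [0.2, 0.8]])

P = combine(P_seq, P_feat, A_seq=0.70, A_feat=0.65)
labels = P.argmax(axis=1)   # final class of each object
```

Since each input row sums to 1 and the weights are normalised by As + Af, each output row also sums to 1, so P remains a valid probability table.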
<p>Formula (7) takes the accuracy of each method into account in the final probability
calculation: the probability calculated by each method enters the final result with a
coefficient equal to that method’s accuracy. Results are presented in Table 3, which
compares the accuracy of sequence classification (SVM with the custom kernel
functions CP, ACS and LCS), the accuracy of classification by features (SVM
default, RBF) and the accuracy of classification by the weighted sum of
probabilities (7).</p>
        <p>The table shows a noticeable improvement in the final result when two methods
are combined.</p>
      </sec>
<sec id="sec-3-4">
        <title>Classification by features and sequences transformed into features</title>
        <p>Another possible method of classification by sequences is to reduce each sequence to a set of
features. After that, existing feature-based classification methods can be used.</p>
<p>We consider as features the set of subsequences without discontinuity up to
a certain length, such that all the sequences can be composed from them. We compose
a dictionary of all possible subsequences without discontinuity occurring in the available
sequences; such subsequences are regarded as features of a sequence. We then
replace each sequence with the set of attributes corresponding to the subsequences that
appear in it, and the SVM method can be applied to this set of attributes.</p>
<p>The maximum number of different subsequences without discontinuity of a sequence is
n(n + 1) / 2,   (8)
where n is the length of the sequence.</p>
        <p>The dependence of the number of subsequences without discontinuity on the length
of the sequence is quadratic. For large sequences, it is necessary to introduce
constraints. Let us consider subsequences without discontinuity no greater than a certain
length.</p>
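The dictionary construction described above can be sketched as follows; the function names and the binary (presence/absence) encoding are our illustrative choices, not the authors' exact representation.

```python
def gapless_subsequences(seq, max_len):
    """All distinct gap-free subsequences of seq up to length max_len."""
    return {seq[i:i + L]
            for i in range(len(seq))
            for L in range(1, max_len + 1)
            if i + L <= len(seq)}

def sequences_to_features(seqs, max_len=3):
    """Build a dictionary of gap-free subsequences and encode each sequence
    as a binary feature vector over that dictionary."""
    vocab = sorted(set().union(*(gapless_subsequences(s, max_len) for s in seqs)))
    index = {sub: k for k, sub in enumerate(vocab)}
    rows = []
    for s in seqs:
        row = [0] * len(vocab)
        for sub in gapless_subsequences(s, max_len):
            row[index[sub]] = 1
        rows.append(row)
    return rows, vocab
```

The max_len cap is the constraint motivated above: without it, the number of gap-free subsequences grows quadratically with sequence length per formula (8).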
<p>This algorithm is similar to the ACS measure used in the kernel function; however,
the kernel function considers only the number of common subsequences without
discontinuity between two sequences, whereas this algorithm also takes into account
which subsequences they are: each unique subsequence is considered as a feature of
the sequence. Let us investigate the work of the algorithms on the available sequences
of demographic events. In the provided initial data, the maximum length of a
sequence is eight, hence the maximum possible number of subsequences without
discontinuity is 36, according to (8). We do not consider all subsequences, but only
one subsequence of maximum length for each sequence. In this case, there is only one
feature per sequence, which significantly reduces the amount of computation. Results
are presented in Table 4.</p>
<p>Table 4. Classification by sequences transformed into features. Number of
sequences: 6626; number of unique sequences of maximum length (number of
feature values): 1228; number of initial features: 0; number of generated features
(from sequences): 1; model fitting time: 4.80 sec; prediction time: 0.79 sec; total
time: 5.59 sec.</p>
        <p>It turned out that classification by the sequences transformed into features, in
combination with the initial features, gives the best result in comparison with the
methods above.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Recurrent neural network algorithms</title>
<p>
          We performed classification by sequences using recurrent neural networks
from the Keras and Tensorflow software [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Keras, a high-level library, is used to
describe the structure of a neural network as an add-on over Tensorflow,
which performs the simulation of the neural network. The simulation was performed
on a GeForce GT 710 GPU.
        </p>
<p>
          A recurrent neural network (RNN) allows us to reveal regularities in sequences [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
Three types of recurrent layers in Keras were compared: SimpleRNN, GRU and
LSTM [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. GRU and LSTM, in comparison with SimpleRNN, have more complex
algorithms for detecting regularities. However, on the sequences in the original
demographic data, because of their small lengths, LSTM and GRU showed no
advantage in classification over SimpleRNN, while their simulation time was several
times larger. Therefore, for subsequent classification by sequences together with the
features, only the SimpleRNN algorithm was used. The results are shown in Table 5:
number of sequences 6626; number of unique sequences of maximum length 1228;
number of initial features 5; number of generated features 1; model fitting time
5.79 sec; prediction time 0.91 sec; total time 6.70 sec; accuracy 0.716.
        </p>
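A SimpleRNN classifier of the kind compared above can be sketched in Keras as follows. This is a sketch under stated assumptions: sequences are integer-coded and padded to the maximum length of eight events, the target (sex) is binary, and the layer sizes are illustrative rather than the authors' configuration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense

n_event_codes = 9   # 8 event types plus padding code 0 (assumed encoding)
max_len = 8         # maximum sequence length in the data

model = Sequential([
    Embedding(input_dim=n_event_codes, output_dim=8),  # event code -> vector
    SimpleRNN(16),                                     # reads the event sequence
    Dense(1, activation="sigmoid"),                    # binary target: sex
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# X: (n_samples, max_len) array of padded integer-coded sequences; y: 0/1 labels
# model.fit(X, y, epochs=..., validation_split=0.2)
```

Swapping SimpleRNN(16) for GRU(16) or LSTM(16) reproduces the comparison in the text; on such short sequences the gated layers mainly add simulation time.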
<p>Recurrent neural networks give the best result, since they account for the
dependencies within the sequences better than the other algorithms do.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Comparison of all methods</title>
<p>Thus, the best classification result on the demographic data is given by recurrent
neural networks (Keras, Tensorflow) on sequences together with features, with an
accuracy of 0.754.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
<p>
        In the course of this work, we derived formulas for calculating sequence
similarity measures without discontinuity; these were then incorporated into the kernels of the
SVM method. We studied several methods for classifying demographic data by
sequences of events without discontinuities, namely variants of a custom kernel in the
SVM method and recurrent neural networks, and compared these
methods with the algorithm from the earlier paper [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. To complete the work, we
wrote Python programs with which we processed the initial
demographic data. We obtained solid classification results by using the custom kernel
function in SVM that transforms sequences into features, and even better results with
the recurrent neural network SimpleRNN. These two methods take into account event
regularities in the sequences, unlike most other methods, which work only with
features. This work can be applied to the analysis of various kinds of sequences. Of course, many
other classification methods based on different similarity measures of demographic
sequences can be used, such as statistical methods or other types of neural
networks, for example convolutional neural networks (CNN). These methods may be
investigated in future research.
      </p>
<p>Acknowledgments. We would like to thank our colleagues from the research and
study group “Models and Methods of Demographic Sequence Analysis”, Dmitry
Ignatov and Danil Gizdatullin, for their advice, and Ekaterina Mitrofanova for the
provided data.</p>
      <p>This article was prepared within the framework of the Academic Fund Program at
the National Research University Higher School of Economics (HSE) in 2016-2017
(grant № 16-05-0011 “Development and testing of demographic sequence analysis
and mining techniques”) and by the Russian Academic Excellence Project "5-100".</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Elzinga</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liefbroer</surname>
            <given-names>A.C.</given-names>
          </string-name>
          :
          <article-title>De-standardization of Family-Life Trajectories of Young Adults. A Cross-National Comparison Using Sequence Analysis</article-title>
          .
          <source>European Journal of Population</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <fpage>225</fpage>
          -
          <lpage>250</lpage>
          (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Elzinga</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Algorithms for subsequence combinatorics</article-title>
          .
          <source>Theoretical Computer Science</source>
          <volume>409</volume>
          (
          <issue>3</issue>
          ),
          <fpage>394</fpage>
          -
          <lpage>404</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Egho</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raïssi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calders</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jay</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>On measuring similarity for sequences of itemsets</article-title>
          .
          <source>Data Mining Knowledge Discovery</source>
          <volume>29</volume>
          (
          <issue>3</issue>
          ),
          <fpage>732</fpage>
          -
          <lpage>764</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Lodhi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saunders</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shawe-Taylor</surname>
          </string-name>
          , J.,
          <string-name>
            <surname>Cristianini</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watkins</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Text Classification using String Kernels</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>2</volume>
          ,
          <fpage>419</fpage>
          -
          <lpage>444</lpage>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
<mixed-citation>
          5. Understanding LSTM Networks, http://colah.github.io/posts/2015-08-UnderstandingLSTMs/, last accessed
          <year>2017</year>
          /02/15.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Scikit-learn:
          <article-title>Scientific library for Machine Learning in Python</article-title>
          , http://scikit-learn.org/,
          <source>last accessed</source>
          <year>2017</year>
          /01/28.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Keras:
          <article-title>Deep Learning library for Theano and TensorFlow</article-title>
          , https://keras.io/,
          <source>last accessed</source>
          <year>2017</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <article-title>The Unreasonable Effectiveness of Recurrent Neural Networks</article-title>
          , http://karpathy.github.io/
          <year>2015</year>
          /05/21/rnn-effectiveness/,
          <source>last accessed</source>
          <year>2016</year>
          /12/20.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ignatov</surname>
            ,
            <given-names>D.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muratova</surname>
            <given-names>A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gizdatullin</surname>
            <given-names>D.K.</given-names>
          </string-name>
          :
          <article-title>Pattern Mining and Machine Learning for Demographic Sequences</article-title>
          .
          <source>In: Knowledge Engineering and Semantic Web: 6th International Conference, KESW 2015</source>
          , vol.
          <volume>518</volume>
          , pp.
          <fpage>225</fpage>
          -
          <lpage>243</lpage>
          . Springer, Switzerland (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Buzmakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Egho</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nicolas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raïssi</surname>
          </string-name>
          , Ch.:
          <article-title>On mining complex sequential data by means of FCA and pattern structures</article-title>
          .
          <source>Int. J. General Systems</source>
          <volume>45</volume>
          (
          <issue>2</issue>
          ),
          <fpage>135</fpage>
          -
          <lpage>159</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          :
          <article-title>Pattern structures and their projections</article-title>
          . In: Delugach, H.S., Stumme, G. (eds.) ICCS
          <year>2001</year>
          .
          <article-title>LNCS (LNAI)</article-title>
          , vol.
          <volume>2120</volume>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>142</lpage>
          . Springer, Heidelberg (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis</source>
          . Springer, Berlin (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Gizdatullin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baixeries</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ignatov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muratova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Espy</surname>
            ,
            <given-names>T.H.</given-names>
          </string-name>
          :
          <article-title>Learning Patterns from Demographic Sequences</article-title>
          .
          <source>In.: Intelligent Data Processing, IDP 2016</source>
          , Springer (to appear)
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Gizdatullin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ignatov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitrofanova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muratova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Classification of Demographic Sequences Based on Pattern Structures and Emerging Patterns</article-title>
          .
          <source>In.:14th International Conference on Formal Concept Analysis</source>
          ,
          <source>Supplementary proceedings, ICFCA</source>
          <year>2017</year>
          , Rennes, France (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Aggarwal</surname>
          </string-name>
          , Ch. C.,
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>J</given-names>
          </string-name>
          .:
          <source>Frequent Pattern Mining</source>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>