<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UAEMex System for Identifying Personality Traits from Source Code</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eder Vázquez Vázquez</string-name>
          <email>eder2v@hotmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Omar González Brito</string-name>
          <email>gonzalezbritoomar@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jovani A. García</string-name>
          <email>jovani_2807@hotmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel García Calderón</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriela Villada Ramírez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alan J. Serrano León</string-name>
          <email>alan.serrano.leon@outlook.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>René A. García-Hernández</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yulia Ledeneva</string-name>
          <email>yledeneva@yahoo.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad Autónoma del Estado de México, UAPT Tianguistenco, Instituto Literario</institution>
          ,
          <addr-line>100, Toluca, Edo. Méx. 50000</addr-line>
          ,
          <country country="MX">México</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the UAEMex participation in the Personality Recognition in Source Code (PR-SOCO 2016) task, whose principal challenge is to identify the five personality traits of a developer from his or her source code. In the first phase of the task, a training dataset with the programs of 50 developers and the degree of incidence of each personality trait was provided. In the second phase, a test dataset with the programs of 21 developers had to be classified. Our method extracts only 41 features from the source code, including the comments, in order to classify it (we test four models). Under the evaluation metrics proposed by PR-SOCO, our system ranks among the best systems for both metrics. Finally, using the RMSE and PC metrics, we propose a combined ranking measure.</p>
      </abstract>
      <kwd-group>
        <kwd>PR-SOCO</kwd>
        <kwd>Support Vector Machine</kwd>
        <kwd>Symbolic Regression</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Personality Trait</kwd>
        <kwd>Genetic Algorithms</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Personality is an inherent aspect of human nature that influences
our activities. That is, personality is the set of characteristics that
describes a person and makes him or her different from others [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Nowadays, identifying the degree of the personality
traits to determine whether a candidate fits a job is as important
as skills and experience [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. After decades of research, the Big-Five
Theory is the most accepted model for assessing personality [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
This model has a hierarchical organization of personality traits with
five classes: Extroversion (E), Agreeableness (A),
Conscientiousness (C), Neuroticism (N), and Openness to
experience (O) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Given a small set of Java source codes in the PR-SOCO task, the main
objective is to identify the degree of presence of the five classes of
personality [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To approximate which aspects determine the
personality, the NEO-PI R test (which is based on the Big-Five
theory) can be answered to measure the personality traits [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Many structured surveys based on NEO-PI R are
available on-line for anybody to predict the personality of
the user. Based on these aspects, we propose extracting
41 features as the main information for training four classifiers.
In this paper, we present the working notes of the UAEMex
participation in the PR-SOCO 2016 task.
      </p>
      <p>This paper is organized as follows. In section 2, the
methodology is described. In section 3, the results of the test
dataset experiments are presented. In section 4, using the evaluation
metrics proposed by PR-SOCO, we rank our results against the other
systems by personality trait. In section 5, the conclusions are
presented.</p>
    </sec>
    <sec id="sec-2">
      <title>2. METHODOLOGY</title>
      <p>The proposed methodology is divided into four steps: Corpus
Analysis, Feature Extraction, Feature Representation and
Classification.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Corpus Analysis</title>
      <p>The training dataset is composed of 1741 Java source codes of 50
developers that were evaluated with the Big-Five Theory
personality traits, where each trait ranges between 20 and
80. However, since the number of different values per personality trait in
the samples is small, we decided to handle each program
separately in order to get a good representation. The number of
different values per class varies by personality trait; the
distribution is shown in table 1.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption><p>Number of different values per personality trait.</p></caption>
        <table>
          <tbody>
            <tr><td>Number of different values</td><td>13</td><td>14</td><td>11</td><td>14</td><td>12</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>2.2 Feature Extraction</title>
      <p>Using a few source codes written by our team members, we identified some
personal features in order to find similar elements. As a
result, we detected that the indentation, identifier and comment features
are important to determine the author of such codes. These features
can be extracted independently of the content or objective of the
source code. The first 25 features were calculated as averages
and the last 16 as frequencies. The extracted features
can be classified as follows.</p>
      <p>Indentation Features: spaces in code, spaces in comments,
spaces between classes, spaces between source code blocks, spaces
between methods, spaces between control sentences and spaces inside
grouping characters “(), [], {}”. These features are measured as
averages.</p>
      <p>Identifier Features: The presence of underscores, uppercase and
lowercase letters in the name of an identifier is measured in a binary way.
We also extract the average number of characters and the average
length of identifier names as features. These features are
extracted for class, method and variable names. Also, the
percentage of initialized variables is extracted.</p>
      <p>Comment Features: The presence of line and block comments is
extracted as binary features. Also, the presence of comments with
all letters in uppercase is extracted as a binary feature. Finally, the
average size of the comments is extracted as a feature.</p>
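      <p>To make the feature definitions above concrete, the following is a minimal sketch (not the actual task code) of how a few of the indentation, identifier and comment features could be extracted from a Java source string; the exact feature names and definitions here are illustrative assumptions.</p>

```python
import re

def extract_features(source: str) -> dict:
    """Sketch of a few of the indentation, identifier and comment
    features described above (definitions are illustrative)."""
    lines = source.splitlines()

    # Indentation: average number of leading spaces per non-empty line.
    indents = [len(l) - len(l.lstrip(" ")) for l in lines if l.strip()]
    avg_indent = sum(indents) / len(indents) if indents else 0.0

    # Identifier-like tokens (Java keywords included, for simplicity).
    idents = re.findall(r"\b[A-Za-z_][A-Za-z0-9_]*\b", source)
    has_underscore = int(any("_" in i for i in idents))
    avg_ident_len = sum(len(i) for i in idents) / len(idents) if idents else 0.0

    # Comments: line ("//") and block ("/* ... */") presence as binary
    # features, plus the average comment length in characters.
    line_comments = re.findall(r"//[^\n]*", source)
    block_comments = re.findall(r"/\*.*?\*/", source, flags=re.S)
    comments = line_comments + block_comments
    avg_comment_len = sum(len(c) for c in comments) / len(comments) if comments else 0.0

    return {
        "avg_indent": avg_indent,
        "has_underscore_ident": has_underscore,
        "avg_ident_len": avg_ident_len,
        "has_line_comment": int(bool(line_comments)),
        "has_block_comment": int(bool(block_comments)),
        "avg_comment_len": avg_comment_len,
    }
```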
    </sec>
    <sec id="sec-5">
      <title>2.3 Feature Representation</title>
      <p>
        For every source code, the 41 features are extracted to represent it in
a vector space model, where each source code is represented by
a vector of the 41 features [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
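      <p>A minimal sketch of this representation step, assuming the extractor returns a feature dict: fixing a global feature order turns every source code into a comparable vector (shown here with six illustrative dimensions standing in for the full 41).</p>

```python
# Global feature order, so every source code is projected onto the same
# vector space (six illustrative dimensions stand in for the full 41).
FEATURE_ORDER = [
    "avg_indent", "has_underscore_ident", "avg_ident_len",
    "has_line_comment", "has_block_comment", "avg_comment_len",
]

def to_vector(features: dict) -> list:
    """Map a feature dict to a fixed-order vector, 0.0 for absent features."""
    return [float(features.get(name, 0.0)) for name in FEATURE_ORDER]
```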
    </sec>
    <sec id="sec-6">
      <title>2.4 Classification</title>
      <p>
        Once the source codes are represented in a vector space model, we
train the system with the following classifiers. The rationale for testing
different classifiers is that, if the extracted features are good,
then we should get, in general, good results with all of them. It
is worth saying that these classifiers have been widely used in other
language processing tasks; in particular we trust the Symbolic
Regression model, since the training dataset only has a few
values per trait.
2.4.1 Symbolic Regression (SR)
Finding the structure, coefficients and appropriate elements of a
model at the same time as trying to solve the problem is a challenge for
which no efficient mathematical method exists; therefore,
traditional mathematical techniques are not the best for empirical
modeling problems due to their nonlinearity. Hence, there is a
need for an artificial expert that can create or define a model
from the available data of a specific task without requiring a full
understanding of the problem [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Symbolic Regression is a type of artificial expert that
evolves models from available data observations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], whose main
objective is to find a model that describes the relationship
between the dependent variable and the independent variables as
accurately as possible [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Because Symbolic Regression works directly with Genetic
Programming, it is possible to evolve equations or mathematical
functions in order to estimate the behavior of a dataset. The
symbolic regression technique stands out as a viable solution to the
problem of this work because it does not assume the form of the
answer to the problem, but discovers it [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
2.4.2 Support Vector Machine (SVM)
SVM maps a set of examples to a set of points in the same space,
trying to find the optimal hyperplane, defined as
the hyperplane with maximal separation between the two classes [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
SVM makes predictions based on which side of the gap the examples fall on
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In this work, we used the SVM implementation LIBSVM [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
2.4.3 K Nearest Neighbor (KNN)
KNN is one of the simplest machine learning algorithms, known as a lazy
classifier because the classification function is only approximated
locally. KNN is trained using vectors in the feature space; each vector
must have a class label.
      </p>
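      <p>To illustrate the core idea of symbolic regression (searching over model structures, not only coefficients), the following toy sketch replaces GP evolution with an exhaustive search over a tiny expression grammar; the grammar and the data are illustrative assumptions, not the system actually used in our runs.</p>

```python
import itertools

# Building blocks for candidate model structures. Real symbolic
# regression evolves such expression trees with Genetic Programming;
# this toy simply enumerates every form u(x) op c exhaustively.
UNARY = {"id": lambda x: x, "sq": lambda x: x * x}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
CONSTS = [0.0, 1.0, 2.0, 3.0]

def search(xs, ys):
    """Return (expression description, MSE) of the best candidate model."""
    best = (None, float("inf"))
    for (un, uf), (op, opf), c in itertools.product(
            UNARY.items(), BINARY.items(), CONSTS):
        preds = [opf(uf(x), c) for x in xs]
        mse = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)
        if mse < best[1]:
            best = (f"{op}({un}(x), {c})", mse)
    return best

xs = [0.0, 1.0, 2.0, 3.0]
ys = [x * x + 1.0 for x in xs]   # hidden model: x^2 + 1
expr, mse = search(xs, ys)       # the search recovers the structure exactly
```

<p>Unlike fitting coefficients of a fixed equation, the search discovers the form x² + 1 itself, which is what makes SR attractive when no model is assumed in advance.</p>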
      <p>
        The training phase consists of storing the feature vectors and class labels
of the training dataset. In the classification phase it is necessary to define
a constant k and send an unlabeled vector to the KNN algorithm to
calculate the minimal distance between the stored classes and the input
vector [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. We use the Weka implementation of the KNN algorithm [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
2.4.4 Back Propagation Neural Network (BP-NN)
Neural networks are elemental processors that receive a vector as
input data. The feature vector is sent to the input layer and then every
neuron processes its inputs with the corresponding weights and returns
an output. Neural networks are used to approximate functions
according to the input data [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      </p>
      <p>
        When a neural network implements back-propagation of error, the
output of the neural network is compared with the desired output to
calculate the network error, and then the weights of every
neuron in the hidden layers are corrected [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
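      <p>The KNN scheme described above (used with k = 3 in run 4) can be sketched in a few lines; this is an illustrative stand-in for the Weka implementation, predicting a trait value as the average of the k nearest training vectors.</p>

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Predict a trait value for vector x as the mean of its k nearest
    training vectors under Euclidean distance (k = 3, as in run 4)."""
    dists = sorted((math.dist(v, x), y) for v, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

<p>Averaging the neighbors' values (rather than voting) suits this task, since each trait is a degree between 20 and 80 rather than a discrete class.</p>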
    </sec>
    <sec id="sec-7">
      <title>3. RUN RESULTS</title>
      <p>In this section, the results submitted for the PR-SOCO test dataset
are described.</p>
      <p>Run 1: This run was generated using symbolic regression (SR) over
the vector space model, but we eliminated the source codes of five
developers according to the following criteria: the person with the
highest presence in all the personality traits, the person with the
lowest presence in all the personality traits, the person with an
average presence in all the personality traits, the person with the
most source codes and the person with the fewest source codes.
Run 2: Similar to run 1, this run was generated using SR, but for
each personality trait the developers (between 12 and 20) with an
average presence of that trait were eliminated.</p>
      <p>Run 3: For this run, the whole training dataset was used with a Back
Propagation Neural Network.</p>
      <p>Run 4: The whole training dataset was used with KNN and
constant k = 3.</p>
      <p>Run 5: We used a genetic algorithm, but this run is not described
because we found a mistake in it.</p>
      <p>Run 6: The whole training dataset was used to classify with an
SVM.</p>
      <p>Root Mean Square Error (RMSE) and Pearson Correlation (PC)
were used by the PR-SOCO task as evaluation metrics for ranking the
results. A minimum RMSE is desired for a system; in contrast, for the
PC metric a value closer to 1 or -1 is desired. In table 2, the RMSE
scores of our runs are presented, with the best scores highlighted in
bold. As can be seen, the first and sixth runs get the best scores,
where the SR and SVM classifiers were used, respectively.
In table 3, the results with the Pearson Correlation metric are shown,
with the best score highlighted in bold.</p>
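      <p>Both evaluation metrics can be computed directly; the following is a minimal sketch of RMSE and Pearson Correlation as used to score the runs.</p>

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error: lower values are better."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson(xs, ys):
    """Pearson Correlation: values near 1 or -1 indicate strong correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```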
      <table-wrap id="tbl3">
        <label>Table 3</label>
        <caption><p>Pearson Correlation results of our runs per personality trait (values listed in run order).</p></caption>
        <table>
          <thead>
            <tr><th>Trait</th><th>PC values</th></tr>
          </thead>
          <tbody>
            <tr><td>N</td><td>-0.29, -0.14, 0.35, 0.04, 0.13</td></tr>
            <tr><td>E</td><td>-0.14, -0.15, -0.10, -0.04, 0</td></tr>
            <tr><td>O</td><td>0.45, 0.04, 0.28, 0.10, 0</td></tr>
            <tr><td>A</td><td>0.22, 0.19, 0.33, 0.29, 0</td></tr>
            <tr><td>C</td><td>0.11, -0.30, -0.01, -0.07, 0</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-8">
      <title>4. RANKING RESULTS</title>
      <p>In PR-SOCO 2016, eleven teams participated in this task, with two
baselines: the baseline bow (bl bow), based on character trigrams, and
the baseline mean (bl mean), a method that predicts the
mean of the observed values. In table 4, the best RMSE
results of those teams for every personality trait are shown
according to their rank. In general, our results (uaemex) were ranked
in good positions, outperforming the baselines except for
Extroversion: for Neuroticism and Agreeableness we
were ranked in second position, for Openness we got the
first rank, and for Conscientiousness we got the fourth position,
between the two baselines.</p>
      <table-wrap id="tbl4">
        <label>Table 4</label>
        <caption><p>Best RMSE results per team for every personality trait, in rank order (team shown where known).</p></caption>
        <table>
          <thead>
            <tr><th>Trait</th><th>RMSE (rank order)</th></tr>
          </thead>
          <tbody>
            <tr><td>N</td><td>9.78; 9.84 (uaemex); 9.97; 10.04; 10.24</td></tr>
            <tr><td>E</td><td>8.6; 8.69; 8.8; 8.96; 9.01; 9.06 (bl bow); 9.22 (bl mean); 9.49 (uaemex); 11.18; 16.67; 27.39</td></tr>
            <tr><td>O</td><td>6.95 (uaemex); 7.16; 7.19; 7.27; 7.42; 7.57 (bl mean); 7.74 (bl bow); 8.19; 8.21; 8.43; 15.97; 22.57</td></tr>
            <tr><td>A</td><td>8.79; 8.97 (uaemex); 9 (bl bow); 9.04 (bl mean); 9.16; 9.32; 9.36; 9.39; 9.55; 10.31; 11.5; 21.1; 28.63</td></tr>
            <tr><td>C</td><td>8.38; 8.39; 8.47 (bl bow); 8.53 (uaemex); 8.54 (bl mean); 8.59; 8.61; 8.69; 8.77; 8.85; 9.99; 15.53; 22.36</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In table 5, the best PC results of those teams for every personality
trait are shown according to the positive correlation results. In
general, our results (uaemex) were ranked in good positions,
outperforming the baseline configurations. For
Neuroticism, Openness, Agreeableness and Conscientiousness we
were ranked in second position, but not for the Extroversion trait. In
general, it is possible to observe that the rank of our results for the
RMSE metric corresponds with the rank of our results for the PC
metric.</p>
      <table-wrap id="tbl5">
        <label>Table 5</label>
        <caption><p>Best PC results per team for the Conscientiousness trait, in rank order (team shown where known).</p></caption>
        <table>
          <thead>
            <tr><th>Trait</th><th>PC (rank order)</th></tr>
          </thead>
          <tbody>
            <tr><td>C</td><td>0.33; 0.32 (uaemex); 0.31; 0.21; 0.19; 0.16; 0.13; 0.07; -0.12 (bl mean); -0.2 (bl bow); -0.23</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In PR-SOCO 2016, two evaluation metrics were used, giving two
ways of ranking the results: the RMSE, which measures the average
error between the observed and predicted values, and the PC, which
measures the correlation between variables. In this paper, we
propose ranking the results using both the RMSE and PC measures.
This measure is applied only to results with a positive PC.
Since the RMSE is not normalized, we propose to multiply both
results. This ranking is a metric where the best values are those closer
to zero. Table 6 shows the best results evaluated with our
proposed measure.</p>
      <table-wrap id="tbl6">
        <label>Table 6</label>
        <caption><p>Best results with the proposed combined measure, in rank order (team shown where known).</p></caption>
        <table>
          <thead>
            <tr><th>Trait</th><th>Combined measure (rank order)</th></tr>
          </thead>
          <tbody>
            <tr><td>N</td><td>6.39 (uaemex); 6.54; 6.74; 7.67; 8.84; 8.91; 9.3; 9.67 (bl bow)</td></tr>
            <tr><td>E</td><td>5.32; 5.59; 6.03; 6.07; 7.52; 7.97 (bl bow); 8.49; 9.06 (bl mean)</td></tr>
            <tr><td>O</td><td>3.82 (uaemex); 4.60; 4.79; 5.13; 5.26; 7.28; 8.23</td></tr>
            <tr><td>A</td><td>5.88; 6.36 (uaemex); 6.71; 6.98; 7.2 (bl bow); 8.24; 8.49; 9.04 (bl mean); 16.47</td></tr>
            <tr><td>C</td><td>6.24; 6.78; 7.03 (bl bow); 7.47; 7.55; 7.59 (uaemex); 8.54 (bl mean); 11.33</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>As we can see in table 6, our results achieve a better balance between
RMSE and PC. In table 6, the uaemex team ranks in first position
for the Neuroticism and Openness traits, in second place for
Agreeableness and in sixth place for Conscientiousness. However, in
this new ranking Extroversion does not outperform either
baseline.</p>
    </sec>
    <sec id="sec-9">
      <title>5. CONCLUSIONS</title>
      <p>This paper presents our results in personality trait prediction,
describing the participation of UAEMex at PR-SOCO 2016.
The submitted runs outperform the baselines despite the
corpus containing noise, such as repeated and obfuscated source code,
and having few samples.</p>
      <p>The training set has different classes of personality. The
classes are unbalanced and there are not enough examples per class
value. In this approach, we did not apply any preprocessing, because
all the information in the corpus was considered relevant to the
task. Personality trait prediction from source code is a new task and
there are no reference approaches to it yet, so it was difficult to
identify which features should be extracted.</p>
      <p>The best results among our runs were obtained with the symbolic
regression model, because its training phase tries to approximate the
output for each input vector.</p>
      <p>Also, we propose a new ranking measure that combines the RMSE and
PC measures in order to obtain a single evaluation of the results.
According to our experiments on the training dataset, we note that it is
better than the RMSE or PC evaluated alone, since RMSE is a minimization
metric while PC is a maximization metric.</p>
    </sec>
    <sec id="sec-10">
      <title>6. ACKNOWLEDGMENTS</title>
      <p>Thanks to the Autonomous University of the State of Mexico
(UAEMex), the Consejo Nacional de Ciencia y Tecnología
(CONACyT) and the Consejo Mexiquense de Ciencia y Tecnología
(COMECyT) for the support granted to this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Montaño</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palacios</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gantiva</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2009</year>
          . Teorías de la personalidad.
          <article-title>Un análisis histórico del concepto y su medición</article-title>
          . Psychologia Avances de la disciplina,
          <fpage>81</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>P.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>R.R.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>NEO PI-R, Revised NEO Personality Inventory</article-title>
          .
          <source>TEA Ediciones S.A.</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shahzad</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Syeda</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Personality and career choices</article-title>
          .
          <source>African Journal of Business Management (AJBM) 6</source>
          ,
          <fpage>2255</fpage>
          -
          <lpage>2260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>González</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Restrepo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>PAN at FIRE: Overview of the PR-SOCO track on personality recognition in source code</article-title>
          .
          <source>In Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016, CEUR Workshop Proceedings</source>
          . CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Salton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          <year>1975</year>
          .
          <article-title>A vector space model for automatic indexing</article-title>
          .
          <source>Commun. ACM</source>
          <volume>18</volume>
          ,
          <fpage>613</fpage>
          -
          <lpage>620</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Dabhi</surname>
            ,
            <given-names>V.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vij</surname>
            ,
            <given-names>S.K.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Empirical modeling using symbolic regression via postfix Genetic Programming</article-title>
          .
          <source>Image Information Processing (ICIIP)</source>
          ,
          <source>2011 International Conference on, 1-6.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Koza</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          <year>1992</year>
          .
          <article-title>Genetic programming: on the programming of computers by means of natural selection</article-title>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Murari</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peluso</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gelfusa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lupelli</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lungaroni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaudio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Symbolic regression via genetic programming for data driven derivation of confinement scaling laws without any assumption on their mathematical form</article-title>
          .
          <source>Plasma Physics and Controlled Fusion</source>
          <volume>57</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Kommenda</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Affenzeller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burlacu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kronberger</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winkler</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Genetic programming with data migration for symbolic regression</article-title>
          .
          <source>In: Proceedings of the 2014 conference companion on Genetic and evolutionary computation companion</source>
          ,
          <volume>1361</volume>
          -
          <fpage>1366</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Can</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heavey</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Comparison of experimental designs for simulation-based symbolic regression of manufacturing systems</article-title>
          .
          <source>Computers and Industrial Engineering</source>
          <volume>61</volume>
          ,
          <fpage>447</fpage>
          -
          <lpage>462</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Hearst</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          <year>1998</year>
          .
          <article-title>Support Vector Machines</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>13</volume>
          ,
          <fpage>18</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>Support-Vector Networks</article-title>
          .
          <source>Machine Learning</source>
          <volume>20</volume>
          ,
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.-C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.-J.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>LIBSVM: A library for support vector machines</article-title>
          .
          <source>ACM Trans. Intell. Syst. Technol. 2</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Stone</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          <year>1977</year>
          . Consistent Nonparametric Regression.
          <fpage>595</fpage>
          -
          <lpage>620</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfahringer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutemann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Witten</surname>
            ,
            <given-names>I. H.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>The WEKA data mining software: an update</article-title>
          .
          <source>SIGKDD Explor. Newsl</source>
          .
          <volume>11</volume>
          ,
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>McCulloch</surname>
            ,
            <given-names>W.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitts</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <year>1988</year>
          .
          <article-title>A logical calculus of the ideas immanent in nervous activity</article-title>
          . In:
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenfeld</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.) Neurocomputing: foundations of research, 15-
          <fpage>27</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Rumelhart</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          <year>1986</year>
          .
          <article-title>Learning internal representations by error propagation</article-title>
          . In:
          <string-name>
            <surname>Rumelhart</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McClelland</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          and the PDP Research Group
          <article-title>(eds.) Parallel distributed processing: explorations in the microstructure of cognition</article-title>
          , vol.
          <volume>1</volume>
          ,
          <fpage>318</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>