<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Spam Detection System Combining Cellular Automata and Naive Bayes Classifier</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>N. Barigou</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2012</year>
      </pub-date>
      <fpage>250</fpage>
      <lpage>260</lpage>
      <abstract>
        <p>In this study, we focus on the problem of spam detection. Based on a cellular automaton approach and naïve Bayes technique which are built as individual classifiers we evaluate a novel method combining multiple classifiers diversified both by feature selection and different classifiers to determine whether we can more accurately detect Spam. This approach combines decisions from three cellular automata diversified by feature selection with that of naïve Bayes classifier. Experimental results show that the proposed combination increases the classification performance as measured on LingSpam dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Spam is rapidly becoming a major problem on the Internet. Some recent studies
show that about 80% of the e-mails sent daily are Spam [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The major problem
concerning spam is that it is the receiver who pays, in terms of time, bandwidth
and disk space. To address this growing problem, many solutions have
emerged. Some of them are based on the header of the e-mail, such as black lists, white
lists and DNS checking. Other solutions are based on the text content of the message,
such as filtering based on machine learning. Many techniques have been developed to
classify e-mails; for a good review the reader can look at, e.g., [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In a previous study
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], we proposed CASD (a Cellular Automaton for Spam Detection), a new approach
to spam detection based on symbolic induction by cellular automata [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Experiments
show a very high quality of prediction when using stemming and information gain as
a feature selection function [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A performance improvement is also observed over
the NB and KNN classifiers proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] on the LingSpam corpus. In this paper, our aim is to
further improve spam detection by adopting a strategy of classifier combination.
One technique to create an ensemble of classifiers is to use a different feature subset
for each individual classifier. We believe that by varying the feature subsets used to train
the classifiers we can improve filtering performance, since it is possible to
incorporate diversity and produce classifiers that tend to have high variety in their
predictions. In a set of experiments to prove this, the same learning algorithm of
CASD is trained over three different subsets of features and combined by voting with
a naïve Bayes algorithm.
      </p>
      <p>The remainder of this paper is organized as follows: in section 2, we give an
overview of the different types of strategies for classifier combination and we follow
with related work on combining multiple classifiers for spam detection. Section 3
first introduces the Naïve Bayes classifier and the cellular-automaton-based CASD,
and then moves to the proposed combination approach. Experimental results are
presented in section 4. Conclusions are finally drawn in section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>A general overview of classifier combination is given in section 2.1. Some
background on spam detection using classifier combination is given in section 2.2.</p>
      <sec id="sec-2-1">
        <title>Combining Classifiers</title>
        <p>
          An ensemble of classifiers combines the decisions of several classifiers in some
way in an attempt to obtain better results than the individual members. Such systems
are also known under the names multiple classifiers, committees or classifier fusion.
Numerous studies have shown that combining classifiers yields better results than
achievable with an individual classifier. A good overview of different ways of
constructing ensembles, as well as an explanation of why an ensemble is able to
outperform its single members, is given in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          An ensemble of classifiers must be both diverse and accurate in order to improve
accuracy, compared to a single classifier. Diversity guarantees that all the individual
classifiers do not make the same errors. If the classifiers make identical errors, these
errors will propagate to the whole ensemble and no accuracy gain can be achieved
by combining classifiers. In addition to diversity, the accuracy of individual classifiers is
important, since too many poor classifiers can overwhelm the correct predictions of good
classifiers [
          <xref ref-type="bibr" rid="ref15 ref7">7, 15</xref>
          ].
        </p>
        <p>In order to make individual classifiers diverse, many ensemble methods use
feature selection so that each classifier works with a specific feature set. To contribute
to this research, we propose to employ multiple classifiers, each making predictions
based on a subset of features.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Spam detection using multiple classifiers</title>
        <p>
          In the context of spam filtering, a number of ensemble classification methods have
been studied. Sakkis et al. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] combined Naïve Bayes (NB) and k-nearest neighbor
(k-NN) classifiers by a stacking method and found that the ensemble achieved better
performance. Carreras and Marquez [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] used boosted decision trees with the
AdaBoost algorithm. Compared with two other learning algorithms, decision tree
induction (DT) and Naïve Bayes, AdaBoost clearly outperformed both
in terms of the F1 measure. Rios and Zha [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] applied random forests, an
ensemble of decision trees, using a combination of text and meta data features. For
low false positive spam rates, RF was shown to be overall comparable with support
vector machines (SVM) in classification accuracy. Also, Koprinska et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] studied
the application of random forests to Spam filtering. The LingSpam and PU1 corpora
with 10-fold cross-validation were used, selecting 256 features based on either
information gain or the proposed term-frequency variance. Random forests produced
the best overall results. Shih et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] proposed an architecture for collaborative
agents, in which algorithms running in different clients can interact for the
classification of messages. The individual methods considered include NB, Fisher’s
probability combination method, DT and neural networks. In the framework
developed, the classification given by each method is linearly combined, with the
weights of the classifiers that agree (disagree) with the overall result being increased
(decreased). The authors argued that the proposed framework has important
advantages, such as robustness to failure of single methods and easy implementation
in a network.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Proposed Framework</title>
      <p>
        In this research, we propose an ensemble of classifiers diversified by both
manipulating input data and using two different classifiers Cellular automaton CASD
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and the Naïve Bayes approach. These two classifiers are described in sections 3.1 and 3.2,
while the design of the proposed combination is discussed in section 3.3.
      </p>
      <sec id="sec-3-1">
        <title>Naive Bayes Classifier</title>
        <p>
          Naïve Bayes (NB), which has been widely used for spam filtering [
          <xref ref-type="bibr" rid="ref1 ref13 ref2">1,2, 13</xref>
          ], is
a simple but highly effective classifier. It uses the training data to estimate the
probability that an instance belongs to a particular class. NB requires little storage
space during both the training and classification stages; the strict minimum is the
memory needed to store the prior and conditional probabilities. In our experiments,
each message is represented as a binary vector (x1, . . . , xm), where xi =1 if a particular
token Xi of the vocabulary is present, otherwise xi=0.
        </p>
        <p>From Bayes’ theorem, the probability that a message with vector x = (x1, . . . , xm)
belongs in category c (c = spam or legitimate) is:</p>
        <p>P(c | x) = P(c) · P(x | c) / P(x).</p>
        <p>NB classifies each e-mail in the category c that maximizes the product P(c) · P(x | c). The a priori
probabilities P(c) are typically estimated by dividing the number of training e-mails of
category c by the total number of training e-mails, and the probabilities P(x | c) are
calculated, under the naive independence assumption and with Laplace smoothing, as follows:</p>
        <p>P(x | c) = ∏i P(xi | c), with P(xi | c) = (N(Xi, c) + 1) / (N(c) + |vocabulary|),
where N(Xi, c) is the number of occurrences of token Xi in e-mails with label c, N(c) is the
total number of token occurrences in e-mails labeled c and |vocabulary| is the number of unique
tokens across all e-mails.</p>
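        <p>The estimation and decision steps above can be sketched as follows. This is an illustrative Python sketch, not part of the original system (the function names are ours); it assumes each e-mail is given as the set of tokens it contains, with Laplace smoothing in the conditional probability estimate:</p>

```python
import math
from collections import defaultdict

def train_nb(messages, labels):
    """Estimate the prior P(c) and the token counts needed for P(xi | c).
    Each message is the set of tokens present in the e-mail."""
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: defaultdict(int) for c in classes}  # N(Xi, c)
    totals = {c: 0 for c in classes}                 # N(c)
    vocab = set()                                    # unique tokens over all e-mails
    for msg, c in zip(messages, labels):
        for tok in msg:
            counts[c][tok] += 1
            totals[c] += 1
            vocab.add(tok)
    return prior, counts, totals, vocab

def classify_nb(msg, prior, counts, totals, vocab):
    """Return the category maximizing log P(c) + sum_i log P(xi | c),
    with Laplace-smoothed P(xi | c) = (N(Xi, c) + 1) / (N(c) + |vocabulary|)."""
    best, best_score = None, -math.inf
    for c in prior:
        score = math.log(prior[c])
        for tok in msg:
            score += math.log((counts[c][tok] + 1) / (totals[c] + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best
```

        <p>Log-probabilities are summed rather than multiplying raw probabilities, which avoids numeric underflow on long messages.</p>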
        <p>CASD: a Cellular Automaton for Spam Detection</p>
        <p>
          CASD is a classifier which is built on the cellular automaton CASI [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Besides its
high classification accuracy, CASD also has advantages in terms of simplicity,
classification speed, and storage space [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>Cellular automaton CASI (Cellular Automaton for Symbolic Induction) is a
cellular method of generation, representation and optimization of induction graphs
generated from a set of learning examples. It produces conjunctive rules from a
Boolean induction graph representation that can power a cellular inference engine.
This cellular-symbolic system is organized into cells where each cell is connected
only with its neighbors (a subset of cells). All cells obey in parallel the same rule,
called the local transition function, which results in an overall transformation of the
system. CASI uses a knowledge base in the form of two layers of finite automata. The
first one, called CelFact, represents the fact base and the second one, called CelRule,
represents the rule base. In each layer, the content of a cell determines whether and
how it participates in each inference step; at every step, a cell can be active or
passive, and can take part in the inference or not. The states of the cells are composed of three
parts: EF, IF and SF, and ER, IR and SR, which are the input, internal state and output
parts of the CelFact cells and of the CelRule cells, respectively. The neighborhood of
cells is defined by two incidence matrices called RE and RS, which represent the
input and output relations of the facts, respectively, and are used in forward
chaining.</p>
        <p>• The input relation, noted iREj, is: if (fact i ∈ Premise of rule j) then iREj = 1, else
iREj = 0.
• The output relation, noted iRSj, is: if (fact i ∈ Conclusion of rule j) then iRSj = 1,
else iRSj = 0.</p>
        <p>The cellular automaton dynamics is implemented as a cycle of an inference
engine made up of two local transition functions, δfact and δrule, each mapping a global
state (EF, IF, SF, ER, IR, SR) to the next one.</p>
        <p>The transition function δfact corresponds to the evaluation, selection and
filtering phases: using the input matrix RE, it selects the rules whose premise facts are all established.</p>
        <p>The transition function δrule corresponds to the execution phase: using the output
matrix RS, it establishes the conclusion facts of the rules selected by δfact.</p>
        <p>Learning classifier system</p>
        <p>[Figure 1: induction graph produced during learning (root node S0: 1810 legitimate, 361 spam e-mails).]</p>
        <p>During the learning phase, the Sipina method produces a graph. From this graph, a
set of rules is inferred, of the form "if condition 1 and condition 2 and
… condition n then conclusion". For example, in the graph of Figure 1, if we look at
partition number 2 at node number 1 (S1) we have the rule "if the term 'buy' is not
present then the e-mail is legitimate", because the majority of the e-mails (1693) which do
not contain this term are legitimate.</p>
        <p>The set of rules generated from the induction graph is modeled by the CASI
automaton as follows:</p>
        <p>- The set of all conditions and conclusions is represented by a Boolean fact base
called CelFact.</p>
        <p>- The set of rules is represented by a Boolean rule base called CelRule.
- An input matrix RE which memorizes the conditions of the rules.
- And finally, an output matrix RS which memorizes the conclusions of the rules.</p>
        <p>Forward chaining allows the model to move from the initial configuration G0 to the
next configurations G1, ..., Gn. The inference stops after stabilization with a final
configuration. At this step the construction of the cellular model is complete.</p>
        <p>Table 1 presents the final configuration corresponding to the example of Figure 1.
Three rules, represented by CelRule layer are deduced from the graph. The conditions
and conclusions of these rules are stored in the CelFact layer. The premises are the terms
used in classification and the last two facts represent the two classes. Note that no facts
are established: EF = 0.</p>
        <p>In the input matrix RE (respectively output matrix RS) are stored the premises
(respectively the conclusions) of each rule. For example, the rule R2, has premises
"buy = 1", “science=0” and a conclusion “class = spam”.</p>
        <p>Interaction between these two layers (CelFact and CelRule) is done by δfact and
δrule.</p>
        <p>Table 1. Final Configuration: CelRule, CelFact, RE, and RS.</p>
        <p>CelRule layer (rules): R1, R2 and R3, each with ER = 0, IR = 1 and SR = 0.</p>
        <p>CelFact layer (facts): buy=0, buy=1, science=0, science=1, S3:class=spam and S5:class=legitimate.</p>
        <p>We can use the model composed of CelFact, CelRule, RE and RS to classify new
e-mails. The classification process is illustrated in Figure 2.</p>
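        <p>This classification cycle can be rendered, in simplified form, as a Boolean forward-chaining loop over the CelFact and CelRule layers. The following is an illustrative Python sketch under our own simplifications, not the exact δfact/δrule definitions of CASI:</p>

```python
def cellular_forward_chaining(facts, RE, RS):
    """facts: Boolean vector EF of established facts.
    RE[i][j] = 1 if fact i is a premise of rule j (input matrix);
    RS[i][j] = 1 if fact i is a conclusion of rule j (output matrix).
    Each cycle evaluates which rules have all premises established
    (the role of delta_fact) and establishes their conclusions
    (the role of delta_rule), until the configuration stabilizes."""
    EF = list(facts)
    n_rules = len(RE[0])
    while True:
        # A rule fires when every one of its premise facts is established.
        ER = [all(EF[i] for i in range(len(EF)) if RE[i][j])
              for j in range(n_rules)]
        # Conclusions of fired rules become established facts.
        new_EF = [EF[i] or any(ER[j] and RS[i][j] for j in range(n_rules))
                  for i in range(len(EF))]
        if new_EF == EF:
            return EF
        EF = new_EF
```

        <p>With the fact ordering of Table 1 (buy=0, buy=1, science=0, science=1, class=spam, class=legitimate) and a single rule with premises "buy=1" and "science=0" and conclusion "class=spam", an e-mail containing "buy" but not "science" establishes the spam class after one cycle.</p>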
      </sec>
      <sec id="sec-3-2">
        <title>Proposed classifier combination</title>
        <p>[Figure 3: 3CA-1NB architecture. Features are extracted from the training e-mails; four feature subsets are then extracted (subsets 1 and 4 by the IG selector, subset 2 by the χ2 selector, subset 3 by the MI selector) to train the base classifiers CASD-1, CASD-2, CASD-3 and NB.]</p>
        <p>
          Methods for creating ensembles [
          <xref ref-type="bibr" rid="ref15 ref7">7, 15</xref>
          ] focus on producing diversified base
classifiers. Indeed, combination can be done by manipulating the training data,
manipulating the input features, or applying different learning techniques to the same data.
In this paper, we have chosen to combine by manipulating the features
and using two different classifiers (CASD and NB).
        </p>
        <p>
          The proposed approach termed 3CA-1NB (See Figure 3) combines three cellular
automata classifiers (CASD), where each one is trained only with a feature subset.
These subsets are generated with three different feature selection functions [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]:
Information gain (IG), mutual information (MI), and the Chi-2 statistic, respectively. We
combine the decisions of these classifiers with the Naïve Bayes decision using a
voting strategy: an e-mail is classified as spam when at least two classifiers decide spam.
Our motivation for using this combining technique, varying
feature selectors and using two different classifiers, emerged from our preliminary
results [
          <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
          ] which indicate:
• The set of features selected by CASD during the learning phase depends on the
selection function used to select features. For example, we observe that the
feature subset used by CASD after a selection based on information gain is generally
different from that which was selected by the Chi-2 statistic or the MI function.
        </p>
        <p>Therefore, we are guaranteed of having high feature set diversity.
• When using CASD, The quality of detection is better when features selection is
done by MI or χ2 (we have precision=100%), while the coverage is very low in
the case of a selection with χ2 (recall is very low) but very good with IG selector
(see table 2 below). We want a classifier with high quality of detection and high
coverage.
• Besides their simplicity, classification speed, CASD and NB also have advantages
in terms of high classification accuracy.</p>
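        <p>The voting rule used by 3CA-1NB (an e-mail is classified spam when at least two of the four base classifiers decide spam) can be sketched as follows (illustrative Python, hypothetical function name):</p>

```python
def vote_3ca_1nb(decisions):
    """decisions: list of 'spam'/'legitimate' labels produced by the four
    base classifiers (CASD-1, CASD-2, CASD-3 and NB) for one e-mail.
    The e-mail is classified spam when at least two classifiers say spam."""
    return "spam" if decisions.count("spam") >= 2 else "legitimate"
```
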
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental study and results</title>
      <p>
        We used the publicly available LingSpam corpus [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It comprises 2893 different
e-mails, of which 2412 are legitimate e-mails obtained by downloading digests from
the list and 481 are spam e-mails retrieved from one of the authors of the corpus [
        <xref ref-type="bibr" rid="ref1 ref13">1,
13</xref>
        ].
      </p>
      <p>Linguistic preprocessing and feature selection</p>
      <p>The first step in the process of constructing a classifier is the transformation of the
e-mails into a format appropriate for the classification algorithms. We use an indexing
module to:
(a) Tokenize texts and establish an initial list of terms;
(b) Eliminate stop words using a pre-defined stop list and;
(c) Perform stemming with a variant of the Porter2 algorithm.</p>
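      <p>Steps (a) to (c) can be sketched as follows. This is an illustrative Python sketch; the abbreviated stop list and the crude suffix-stripping stand-in for the Porter stemmer are our own simplifications:</p>

```python
import re

# Abbreviated stop list for illustration; a real filter uses a full pre-defined one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is"}

def preprocess(text):
    """(a) tokenize, (b) remove stop words, (c) stem."""
    tokens = re.findall(r"[a-z]+", text.lower())         # (a) tokenize
    tokens = [t for t in tokens if t not in STOP_WORDS]  # (b) stop-word removal

    def stem(t):
        # (c) naive suffix stripping, a stand-in for the Porter algorithm
        for suf in ("ing", "ed", "s"):
            if t.endswith(suf) and len(t) > len(suf) + 2:
                return t[: -len(suf)]
        return t

    return [stem(t) for t in tokens]
```
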
      <p>
        Prior experiments [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] have shown that stemming improves classification
performance; in this paper we report results on stemmed data. Since the number of
terms after this preprocessing phase is very high, to reduce the computational cost
and improve the classification performance we must select those that best represent
the e-mails and remove less informative and noisy ones. Based on a study [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]
indicating the most used feature selectors in text categorization, we have implemented
three feature selectors: information gain (IG), mutual information (MI) and the
χ2 statistic (CHI). The system calculates the chosen measure for all the terms, and then
takes the first k terms with the largest scores. In our experiments the
threshold parameter is set to k = 500. After the feature selection process, each e-mail is
represented by a vector that contains a weight for every selected term. This
weight represents the importance of that term in that e-mail. In this paper, we deal
with binary weighting: the kth document is represented by the characteristic vector
Xk = (a1k, a2k, …, aMk), where aik = 1 if the term i is present in document k, 0 otherwise,
and M is the index size.
2 http://tartarus.org/~martin/PorterStemmer/
      </p>
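      <p>As an illustration of one of the three selectors, information gain over binary term presence and top-k selection can be sketched as follows (illustrative Python; the helper names are ours):</p>

```python
import math

def information_gain(docs, labels, term):
    """IG of a binary term: H(class) - H(class | term present/absent).
    docs: list of token sets, labels: parallel list of class labels."""
    def entropy(lbls):
        h, n = 0.0, len(lbls)
        for c in set(lbls):
            p = lbls.count(c) / n
            h -= p * math.log2(p)
        return h

    n = len(docs)
    with_term = [l for d, l in zip(docs, labels) if term in d]
    without = [l for d, l in zip(docs, labels) if term not in d]
    cond = 0.0
    for part in (with_term, without):
        if part:
            cond += len(part) / n * entropy(part)
    return entropy(labels) - cond

def select_top_k(docs, labels, k=500):
    """Score every vocabulary term and keep the k with the largest scores."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]
```

        <p>The MI and χ2 selectors would plug into the same top-k ranking, replacing only the scoring function.</p>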
      <sec id="sec-4-1">
        <title>Performance measures</title>
        <p>To evaluate performance we calculated spam precision (SP), spam recall (SR),
spam F1 measure (F1) and accuracy (shown in equations 1 to 4). Let TN be the number
of legitimate e-mails classified as legitimate (true negatives), TP the number of spam
e-mails classified as spam (true positives), FP the number of legitimate e-mails
classified as spam (false positives) and FN the number of spam e-mails classified as
legitimate (false negatives); then we have:</p>
        <p>SP = TP / (TP + FP) (1)</p>
        <p>SR = TP / (TP + FN) (2)</p>
        <p>F1 = 2 · SP · SR / (SP + SR) (3)</p>
        <p>Accuracy = (TP + TN) / (TP + TN + FP + FN) (4)</p>
        <p>Weighted accuracy (WA), in which each legitimate e-mail counts with weight λ, was also
calculated. More formally, WA is defined as follows:</p>
        <p>WA = (λ · TN + TP) / (λ · (TN + FP) + TP + FN) (5)</p>
        <p>Three scenarios are evaluated and compared with previous work:
(a) λ = 1: no cost considered;
(b) λ = 9: semi-automatic scenario for a moderately accurate filter, and
(c) λ = 999: completely automatic scenario for a highly accurate filter.</p>
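        <p>Equations 1 to 5 can be computed directly from the four counts (illustrative Python, hypothetical function name):</p>

```python
def spam_metrics(TP, TN, FP, FN, lam=1):
    """Spam precision, recall, F1, accuracy and weighted accuracy,
    where lam is the weight given to legitimate e-mails."""
    SP = TP / (TP + FP)                                  # (1) spam precision
    SR = TP / (TP + FN)                                  # (2) spam recall
    F1 = 2 * SP * SR / (SP + SR)                         # (3) F1 measure
    acc = (TP + TN) / (TP + TN + FP + FN)                # (4) accuracy
    WA = (lam * TN + TP) / (lam * (TN + FP) + TP + FN)   # (5) weighted accuracy
    return SP, SR, F1, acc, WA
```

        <p>Note that for λ = 1, WA reduces to plain accuracy; larger λ penalizes false positives (legitimate e-mails classified as spam) more heavily.</p>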
        <p>The experiments were performed with k-fold cross-validation with k = 10. In this
way, our dataset was split into 10 different pairs of learning sets (90% of the
total dataset) and testing sets (10% of the total data). We conduct the training-test
procedure ten times and use the average of the ten performances as the final result.</p>
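        <p>The 10-fold split can be sketched as follows (illustrative Python, hypothetical helper name):</p>

```python
def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k folds; each fold serves once as the test
    set (10% of the data) while the remaining folds (90%) form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for test in folds:
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        yield train, test
```
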
      </sec>
      <sec id="sec-4-2">
        <title>Results and discussion</title>
        <p>
          To evaluate 3CA-1NB and to show improvement over our previous work, we
include the results of experiments on the LingSpam corpus with the CASD classifier
using three subsets of features, and with the NB classifier. In Table 2, we reproduce the best
performing configurations. These configurations were used as members of the
ensemble. The proposed 3CA-1NB approach gives better performance than the four base
classifiers used separately. The ensemble approach exploits the differences in
misclassification by the individual classifiers and improves the overall performance.
We also compare 3CA-1NB with the ensemble approaches developed by [
ensemble approaches developed by [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Table 3 reports the best results that we have
achieved with 3CA-1NB and which are actually better than the results of [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>[Figure: accuracy and F1 measures (%) of the NB, CASD-1, CASD-2, CASD-3 and 3CA-1NB classifiers.]</p>
        <p>In this paper a new approach for creating a diverse ensemble of classifiers is
proposed. This method uses feature subset selection to train and construct a
diversified set of base classifiers. We combine the predictions of the different
classifiers by a voting technique in order to increase the performance of spam
detection.</p>
        <p>
          The results of experiments on the LingSpam dataset show the better performance of the
proposed method. As a future perspective, we will investigate the effect of combining
more types of classifiers, and also explore other combination techniques [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] to
further increase accuracy.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koutsias</surname>
            ,
            <given-names>J</given-names>
          </string-name>
          (
          <year>2000a</year>
          ),
          <article-title>“An Evaluation of Naive Bayesian Anti-Spam Filtering”</article-title>
          ,
          <source>In: Machine Learning in the New Information Age. Barcelona Spain</source>
          (
          <year>2000</year>
          )
          <fpage>9</fpage>
          -
          <lpage>17</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sakkis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spyropoulos</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Stamatopoulos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2000b</year>
          ).
          <article-title>“Learning to filter spam e-mail: a comparison of a naïve Bayesian and a memory based approach”</article-title>
          .
          <source>In Proc. of the Workshop on ML and Textual Information Access, PKDD</source>
          <year>2000</year>
          , France.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Atmani</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beldjilali</surname>
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Knowledge Discovery in Database: Induction Graph and Cellular Automaton</article-title>
          ,
          <source>Computing and Informatics Journal</source>
          ,
          <volume>26</volume>
          ,
          <fpage>171</fpage>
          -
          <lpage>197</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Barigou</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atmani</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beldjilali</surname>
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Utilisation de la machine cellulaire pour la détection des courriels indésirables</article-title>
          .
          <source>EGC</source>
          <year>2011</year>
          :
          <fpage>321</fpage>
          -
          <lpage>322</lpage>
          , Revue des Nouvelles Technologies de l'Information, RNTI-E-
          <volume>20</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Barigou</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barigou</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atmani</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <article-title>“A Boolean model for spam detection”</article-title>
          ,
          <source>In: Proceedings of the International Conference on Communication, Computing and Control Applications</source>
          , Tunisia
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Carreras</surname>
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marquez</surname>
            <given-names>L.</given-names>
          </string-name>
          , (
          <year>2001</year>
          ), “
          <article-title>Boosting Trees for Anti-Spam Email Filtering</article-title>
          ”
          <source>in Proc. of RANLP-01, 4th International Conference on Recent Advances in Natural Language Processing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dietterich</surname>
            <given-names>T.G.</given-names>
          </string-name>
          ,
          <article-title>Ensemble methods in machine learning</article-title>
          . In: Kittler J.,
          <string-name>
            <surname>Roli</surname>
            <given-names>F</given-names>
          </string-name>
          . (eds),
          <source>Proc. of 1st Int. Workshop on Multiple Classifier Systems</source>
          , Springer Verlag LNCS
          <year>1857</year>
          ,
          <year>2000</year>
          ,
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Garcia</surname>
            <given-names>F.D.</given-names>
          </string-name>
          , Hoepman, J.-H., van Nieuwenhuizen, J.,
          <article-title>“Spam filter analysis”</article-title>
          ,
          <source>in Proc. of 19th IFIP International Information Security Conference</source>
          ,
          <year>2004</year>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Guzella</surname>
            <given-names>T. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caminhas</surname>
            <given-names>W. M.</given-names>
          </string-name>
          <year>2009</year>
          , “
          <article-title>A review of machine learning approaches to spam filtering”</article-title>
          ,
          <source>Expert Systems with Applications</source>
          ,
          <volume>36</volume>
          (
          <issue>7</issue>
          ),
          <fpage>10206</fpage>
          -
          <lpage>10222</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Koprinska</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poon</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            <given-names>J</given-names>
          </string-name>
          .
          <article-title>Learning to classify e-mail.</article-title>
          <source>Information Sciences</source>
          ,
          <volume>177</volume>
          :
          <fpage>2167</fpage>
          -
          <lpage>2187</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kuncheva</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <source>Combining Pattern Classifiers: Methods and Algorithms</source>
          , Wiley InterScience,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Rios</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zha</surname>
            <given-names>H</given-names>
          </string-name>
          .
          <article-title>Exploring support vector machines and random forests for spam detection</article-title>
          ,
          <source>in: Proc. First International Conference on Email and Anti Spam (CEAS)</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Sakkis</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Androutsopoulos</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paliouras</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spyropoulos</surname>
            <given-names>C. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatopoulos</surname>
            <given-names>P.</given-names>
          </string-name>
          .
          <article-title>Stacking classifiers for anti-spam filtering of e-mail</article-title>
          .
          <source>Proceedings of 6th Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <volume>1</volume>
          :
          <fpage>44</fpage>
          -
          <lpage>50</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Shih</surname>
            <given-names>D. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiang</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            <given-names>I. B.</given-names>
          </string-name>
          .
          <article-title>Collaborative spam filtering with heterogeneous agents</article-title>
          .
          <source>Expert systems with applications</source>
          ,
          <volume>34</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1555</fpage>
          -
          <lpage>1566</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Valentini</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masulli</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <article-title>Ensembles of Learning Machines</article-title>
          . In: R. Tagliaferri, M. Marinaro (eds.),
          <source>Neural Nets WIRN Vietri-2002</source>
          , Springer-Verlag LNCS, vol.
          <volume>2486</volume>
          ,
          <year>2002</year>
          ,
          <fpage>3</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Zighed</surname>
          </string-name>
          .
          <article-title>Graphes d'induction: Apprentissage et data mining</article-title>
          .
          <source>HERMES</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Yang</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedersen</surname>
            <given-names>J. O.</given-names>
          </string-name>
          ,
          <article-title>A comparative study on feature selection in text categorization</article-title>
          , Fisher D. H. (ed.),
          <source>Proceedings of ICML-97, 14th International Conference on Machine Learning</source>
          , Nashville, US, Morgan Kaufmann Publishers,
          <fpage>412</fpage>
          -
          <lpage>420</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>