<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A.R.E.S : Automatic Rogue Email Spotter Crypt Coyotes</article-title>
      </title-group>
      <contrib-group>
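        <contrib contrib-type="author">
          <string-name>Vysakh S Mohan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Naveen J R</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vinayakumar R</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soman KP</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Computational Engineering and Networking (CEN), Amrita School of Engineering, Coimbatore</institution>
          ,
          <institution>Amrita Vishwa Vidyapeetham</institution>
          ,
          <country country="IN">India</country>
        </aff>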
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>62</volume>
      <abstract>
        <p>Be it formal or casual, email is undoubtedly the most popular means of communication in modern times. Its popularity owes to the fact that it is reliable, fast and, moreover, free to use. One issue that plagues this otherwise solid technology is phishing email. Phishing emails have always bothered users, as they waste storage, time, money and other resources. Many previous attempts to eradicate or at least block phishing emails have proved futile. This work uses word embeddings as the text representation in a supervised classification approach to identify phishing emails. Rule-based and machine learning models with feature engineering have been attempted, but they fall short because of the ever-increasing variety of threats and the lack of scalability of the models. Deep learning based models have been shown to surpass the older techniques in spam email detection. This work attempts the same using CNN/RNN/MLP networks with Word2vec embeddings on a phishing email corpus, where Word2vec helps to capture the syntactic and semantic similarity of phishing and legitimate emails in the corpus. This work aims to show the ability of word embeddings to solve issues related to cybersecurity use cases.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</p>
      <p>In: R. Verma, A. Das (eds.): Proceedings of the 1st AntiPhishing Shared Pilot at 4th ACM International Workshop on Security and Privacy Analytics (IWSPA 2018), Tempe, Arizona, USA, 21-03-2018, published at http://ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>The Internet, and staying connected through it, is what distinguishes this era from the previous ones. More and more people rely on the Internet for their communication as well as their data transaction requirements. Email has revolutionized the way people communicate over the web. Since its inception, electronic mail has outgrown its real-world counterpart to become mainstream, and it serves as both a casual and an official way of passing a message. Several service providers now offer email platforms for free and with a plethora of features, so the number of people taking advantage of these services has grown dramatically. This mass adoption is one aspect any malicious adversary could use to his benefit. Such malicious emails are called spam[CM01]: unsolicited junk information that the user usually does not want. They are commonly characterized by the following: they are mass mailed, may contain explicit content, useless advertisements or fraudulent offers, and may contain hidden links to phishing websites. On a personal front, the user could face issues such as annoyance due to irrelevant information, unwanted use of bandwidth, waste of storage, a less productive communication channel because of time lost sorting junk mail, unnecessary use of computing power, the spread of viruses, and loss of money via phishing.</p>
      <p>These issues have brought immense focus on the safety of users against spam emails. The massive pool of users on these platforms is one reason email is targeted so often: it is an inexpensive means of reaching millions of people. The most dangerous type of emails are the spam emails[KRA+07]. They may arrive via a spam email server or from personal servers, and may contain malicious URLs that direct users to phishing sites. Detecting them is a challenging task, and many solutions have been devised over the past few years, but they all come with some downsides. One reason the task is challenging is the variety of ways in which an attacker can serve a spam email. A frequently used method is the blended attack, and malware delivery through such attacks may vary. Usually the email itself does not contain the malware but contains a link to some compromised website. These emails may look normal while containing a mix of legitimate and malicious content. Earlier research by IBM's X-Force team found that more than 50% of the emails produced worldwide are fraudulent. These figures are expected to increase in the subsequent years.</p>
      <p>One reason such attacks succeed is carelessness on the part of the average user. Most Internet users are not well versed in cybersecurity, and they simply ignore the safety precautions that need to be exercised in the online space. There is no sure-shot way to check whether a person has been a victim of such an attack, but attacks can be prevented by being a bit cautious: a user can inspect the email headers and look for grammatical mistakes. These checks may not be sufficient when the scale of such attacks escalates. Such situations require an automated solution to detect spam email.</p>
      <p>Email headers can help to a certain extent. They can be used as features for machine learning based classifiers[LT04, S+09]. The advantages of using header features over body features are detailed in[ZZY04]. Header features such as sender address and message ID were used in[WC07] to make the detection.</p>
      <p>Most of the popular machine learning techniques consist of two steps: obtaining a proper feature representation from the data, and using these features for learning and prediction. The first step focuses on extracting useful information from the given input, such as a URL, which is stored as a vector so that different machine learning models can be fit to it. Different categories of features have been considered[SLH17]; lexical features, content features, host-based features and context features are some of the popular ones. An algorithm requires some form of mathematical representation to work with. This work uses Word2vec embedding methods for effective representation of the data.</p>
      <p>Spam filtering is a supervised classification problem, treated as a binary classification task with two classes: legitimate (good) emails and spam emails. Tretyakov used machine learning algorithms such as naive Bayes and K-NN for spam detection[Tre04]; that work does not deal with feature selection but is beneficial for beginners. Spam detection, or automatic email filtering, started primarily with statistical approaches. The development began with the popular naive Bayes approaches, which reduce the problem to a space where dependencies between the data and correlation issues are ignored[SDHH98]; that is, the multivariate nature of the problem breaks down to a univariate one without compromising on accuracy. Different authors have tried to incorporate modifications on top of the naive Bayes pipeline, but the approach was unable to find the correlation between words and the algorithm failed in certain tasks. In 2004, Chih-Chin Lai and Tsai[LT04] introduced TF-IDF, K-NN and SVM to overcome the issues in the email filtering task. SVM and TF-IDF gave satisfactory results, while K-NN gave the worst result among them. Blanzieri and Bryl presented feature extraction methods in[BB08], along with SVM. During this time, unsupervised machine learning techniques were also developed, in which data were clustered into spam and ham. In 2011, Whissell and Clarke[WC11] presented novel research on spam clustering, which attained state-of-the-art results compared with all the previous methods. Since spam filtering is a diverse area, ensemble methods (combining different algorithms on the same problem), such as boosting and bagging[GGWM+10], are applied to get effective classification. Caruana and Li (2012) focused[CL12] on the distributed computing paradigm using SVM and ANN, removing the interoperability and implementation issues.</p>
      <p>Machine learning models usually rely on some sort of engineered features generated from the data, and they have been shown to surpass the accuracy of their predecessors in spam email classification[FRID+07, AAY11]. In contrast, very few machine learning models for phishing emails exist today, and most of them are in their infancy. With acquired domain knowledge, various feature engineering strategies are employed on the data to build the model[SAZ18], [PHS18], [VH13], [VSH12], [MG18], [HDC+18], [MBA18]. A main advantage of this method is the reduced effort needed to train a classifier compared with developing complex rules for a filter. However, feature engineering can also leave the system vulnerable to manipulation, and the model may not scale well to newer threats. Deep learning models can be used to overcome this issue, as they learn the features themselves and adapt them to newer inputs. On top of that, these models are comparatively more accurate and scalable. Deep learning models combined with word embeddings have recently given good performance for various cybersecurity use cases[VSP18a], [VSP18b], [VSP17], [LF17], [SKP18]. This motivated the use of word embeddings with deep learning models such as the Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM).</p>
    </sec>
    <sec id="sec-3">
      <title>Background</title>
      <p>This section details the theory behind the various deep learning models used.</p>
      <p>Word2vec is a model proposed by Mikolov[MSC+13] to learn word embeddings. It is inspired by the distributed representation introduced by Hinton[HMR+86], but in the Word2vec framework the word representation is learned using a shallow neural network. The fundamental assumption in word embedding, or distributional, methods is that words with similar senses tend to occur in similar contexts, so the embeddings capture the similarity between words[BG17], [BG18]. Word2vec is a popular model for generating word embeddings on text data. It can reproduce the linguistic context of words by training a shallow two-layer architecture. The input to the Word2vec model may be a huge corpus, and the generated outputs are vectors in some multi-dimensional space, with each unique word in the corpus having a corresponding vector associated with it. The shallow architecture makes learning the word representation significantly faster than with the previous methods. In the Word2vec framework, the distributed representation of the words in the vocabulary is learned in an unsupervised way. Learning can be done via two architectures: skip-gram and continuous bag of words.</p>
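      <p>
        <disp-formula>
          <label>(1)</label>
          <tex-math>\frac{1}{N} \sum_{n=1}^{N} \sum_{-s \le k \le s,\; k \ne 0} \log p(Q_{n+k} \mid Q_n)</tex-math>
        </disp-formula>
      </p>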
      <p>The skip-gram method tries to maximize the average log probability in (1) over the word sequence Q_1, Q_2, ..., Q_N. Here s indicates the size of the training context around the center word Q_n, and p(Q_{n+k} | Q_n) is given by a softmax function. In the skip-gram model, the context (surrounding) word is predicted given the center word as the input; in the Continuous Bag of Words (CBOW) model, the center word is predicted given the surrounding words.</p>
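      <p>The skip-gram objective in (1) can be illustrated with a small, self-contained sketch. It is not the training code used in this work: the vectors, the function names and the choice to skip context positions that fall outside the sequence are assumptions made purely for illustration, and the softmax is computed over the whole toy vocabulary.</p>
      <preformat>
```python
import math

def log_softmax_prob(center_vec, context_id, out_vecs):
    """Return log p(context | center) under a softmax over all output vectors."""
    scores = [sum(c * o for c, o in zip(center_vec, out)) for out in out_vecs]
    top = max(scores)  # subtract the maximum for numerical stability
    log_norm = top + math.log(sum(math.exp(sc - top) for sc in scores))
    return scores[context_id] - log_norm

def skipgram_objective(word_ids, in_vecs, out_vecs, s):
    """Average log-probability of (1) for a sequence of word indices.

    Context positions outside the sequence are skipped (an assumption).
    """
    n_words = len(word_ids)
    total = 0.0
    for n in range(n_words):
        for k in range(-s, s + 1):
            if k == 0 or (n + k) not in range(n_words):
                continue
            total += log_softmax_prob(in_vecs[word_ids[n]], word_ids[n + k], out_vecs)
    return total / n_words

# Hypothetical toy setup: a 3-word vocabulary with 2-dimensional vectors.
IN_VECS = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.0]]
OUT_VECS = [[0.2, 0.1], [-0.1, 0.0], [0.0, 0.3]]
print(skipgram_objective([0, 1, 2, 1], IN_VECS, OUT_VECS, s=1))
```
      </preformat>
      <p>Training would adjust the input and output vectors to increase this value; the sketch only evaluates it for fixed vectors.</p>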
      <sec id="sec-3-1">
        <title>Convolutional neural network (CNN)</title>
        <p>CNNs are commonly used for computer vision tasks, where their local receptive field is advantageous for feature learning in images. CNN models are also used for text classification tasks. A CNN can be thought of as an artificial neural network that is able to detect patterns and make sense of them, and this pattern detection makes CNNs useful for data analysis. A CNN has hidden layers called convolutional layers, which differ somewhat from the layers of an MLP. For each convolutional layer, the number of filters needs to be specified; each filter then slides over the rows and columns of the input matrix. In this matrix each individual row is a vector representing one word; more precisely, these vectors come from word embedding models such as GloVe<sup>1</sup> or Word2vec<sup>2</sup>. This work applied a Word2vec model before the CNN. CNNs perform well on sequential data with faster training times and are well suited to predictive analysis. A CNN normally consists of an input layer followed by convolutional layers, max-pooling layers for dimensionality reduction, and fully connected layers with a specific nonlinear activation function (ReLU in this work). In this text-based phishing email detection task, a one-dimensional (1D) CNN is used, with one-dimensional max-pooling layers and fully connected layers. The filters in this network slide over the embedding vectors to output a continuous value at each step, which yields better representations of the word vectors.</p>
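        <p>To make the sliding-filter description concrete, the following minimal sketch applies a single one-dimensional filter with ReLU to a toy embedding matrix and then max-pools the result. The matrix, the filter weights and the function names are hypothetical; a real model would learn many filters and typically use a deep learning library.</p>
        <preformat>
```python
def conv1d_relu(embeddings, filt, bias=0.0):
    """Slide one filter over consecutive word vectors and apply ReLU.

    embeddings: list of word vectors, one row per word.
    filt: list of weight rows; its length is the filter width in words.
    """
    width = len(filt)
    outputs = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        score = bias + sum(
            w * x
            for f_row, e_row in zip(filt, window)
            for w, x in zip(f_row, e_row)
        )
        outputs.append(max(0.0, score))  # ReLU activation
    return outputs

def max_pool1d(values, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]

# Hypothetical 4-word email with 2-dimensional embeddings and one width-2 filter.
EMBEDDINGS = [[1, 0], [0, 1], [1, 1], [0, 0]]
FILTER = [[1, -1], [1, 1]]
features = conv1d_relu(EMBEDDINGS, FILTER)
print(features, max_pool1d(features, 2))
```
        </preformat>
        <p>Each output value summarizes one window of adjacent words, and pooling keeps the strongest response per region, which is the dimensionality reduction described above.</p>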
      </sec>
      <sec id="sec-3-2">
        <title>Multi-layer perceptron (MLP)</title>
        <p>Rosenblatt introduced the concept of a single perceptron. A multi-layer perceptron (MLP) is typically a network of perceptrons, or simple neurons. An MLP consists of one input layer and one output layer; the dimensions of the input and output nodes depend on the number of sample vectors and the number of label vectors present in the input data. Between these two layers, many hidden layers may be present. The output of each layer is fed as input to the following hidden layer, and each unit does a relatively straightforward computation: it takes an input X, multiplies it by a weight W, performs a summation, and passes the result through an activation function, such as sigmoid or tanh, to yield the output. Perceptrons compute a score, or a single output, from a sequence of inputs that are usually real valued. This score is used in the backward pass, where a cost function is calculated by comparing the predicted output with the true label and is expressed as a root mean square (RMS) error. The RMS error is minimized using gradient descent, which determines the optimum weight and bias values of the network. One characteristic of the MLP is the fully connected architecture within its deep layers.</p>
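        <p>The per-unit computation and the RMS error described above can be sketched as follows. This shows only a forward pass under assumed toy weights; the layer sizes, the weight values and the function names are hypothetical, and the gradient descent update itself is not shown.</p>
        <preformat>
```python
import math

def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b) for each unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(x, layers):
    """Feed x through a list of (weights, biases, activation) layers."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

def rms_error(predictions, targets):
    """Root mean square error between predicted and true values."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical network: 2 inputs, one tanh hidden layer of 2 units, 1 sigmoid output.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1], math.tanh),
    ([[0.3, -0.6]], [0.05], sigmoid),
]
prediction = mlp_forward([1.0, 2.0], LAYERS)
print(prediction, rms_error(prediction, [1.0]))
```
        </preformat>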
      </sec>
      <sec id="sec-3-3">
        <title>Recurrent neural network (RNN)</title>
        <p>The problem associated with the MLP and CNN models is that every input and output vector is treated as independent. In other words, these models cannot capture the sequential information between the words. In phishing email</p>
        <p>1 https://nlp.stanford.edu/projects/glove</p>
        <p>2 https://www.tensorflow.org/tutorials/Word2vec</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>DATASET DESCRIPTION</title>
      <p>The dataset[EDMB+18] used is provided by the 4th ACM International Workshop on Security and Privacy Analytics shared task[EDB+18]. The task was to detect phishing emails. Details of the dataset are shown in Tables 2 and 3.</p>
      <p>The proposed tool is christened A.R.E.S, which stands for Automatic Rogue Email Spotter. A detailed visualization of the model is shown in Fig. 1. The architecture is a combination of word embedding with a CNN, an RNN and an MLP. The task is divided into two sub tasks: emails with 'no header' and emails 'with header'. We did not extract any additional features from the header, and the methodology used to convert raw email samples into feature vectors is the same for both sub tasks. In both sub tasks, the raw email corpus is fed to the embedding layer, which uses a Word2vec model to generate distributed word embeddings. The learned word embedding model is used to represent the input data, which is then fed to a deep learning model. The hyperparameters used to create the Word2vec model are detailed in Table 1.</p>
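      <p>As a rough illustration of how a raw email can be turned into the matrix of word vectors that the deep learning models consume, the sketch below tokenizes the text, looks each token up in an embedding table, and pads to a fixed length. The tokenizer, the zero-vector handling of unknown words and padding, and the hand-written embedding table are all assumptions; in this work the table would come from the trained Word2vec model.</p>
      <preformat>
```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def email_to_matrix(text, embeddings, max_len, dim):
    """Map an email to a max_len x dim matrix of word vectors.

    Unknown words and padding positions are represented by zero vectors.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok, zero)) for tok in tokenize(text)[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows

# Hypothetical 2-dimensional embedding table standing in for a trained model.
TOY_EMBEDDINGS = {"verify": [0.9, 0.1], "account": [0.8, 0.3]}
for row in email_to_matrix("Verify your ACCOUNT now!", TOY_EMBEDDINGS, max_len=5, dim=2):
    print(row)
```
      </preformat>
      <p>Fixing the number of rows lets emails of different lengths share one input shape, which the convolutional and fully connected layers require.</p>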
      <p>The deep learning models learn additional features, which are passed to the fully connected layer. Previous work on similar problems suggests using an RNN for such tasks, but in order to have a better analysis of the performance of different models we also incorporated a CNN and an MLP in this work. Finally, due to the binary nature of the task, we used a sigmoid output with a threshold to separate legitimate emails from phishing emails, and binary cross entropy as the loss function.</p>
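      <p>The output stage described above can be written out in a few lines. This is a sketch rather than the code used in this work: the 0.5 threshold, the convention that label 1 means phishing, and the clipping constant are assumptions chosen for illustration.</p>
      <preformat>
```python
import math

def sigmoid(z):
    """Logistic function mapping a raw score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross entropy; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def classify(score, threshold=0.5):
    """Return 1 (phishing, by assumption) if sigmoid(score) reaches the threshold."""
    return 1 if sigmoid(score) >= threshold else 0

# Hypothetical raw network scores for three emails and their true labels.
SCORES = [2.0, -1.5, 0.3]
LABELS = [1, 0, 0]
probs = [sigmoid(sc) for sc in SCORES]
print([classify(sc) for sc in SCORES], binary_cross_entropy(LABELS, probs))
```
      </preformat>
      <p>On an unbalanced dataset the threshold is a tunable choice; moving it away from 0.5 trades missed phishing emails against false alarms.</p>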
      <p>From the statistics shown in Tables 4 and 5, the word embedding model along with an MLP network gives a commendable score for both sub tasks. Further, when the same word embedding model is passed</p>
      <p>Result table rows: Word embedding + MLP; Word embedding + CNN; Word embedding + RNN.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Phishing emails have always plagued even the average user, and classifying them properly is a challenging task. Where earlier machine learning techniques failed, deep learning models have provided state-of-the-art performance. The CNN/RNN/MLP architecture along with the Word2vec embeddings used in this work has outperformed earlier rule-based and machine learning based models. During training the models gave high accuracy, while the test accuracy was comparatively low due to the highly unbalanced nature of the dataset. In the proposed system, no external data was used to train the model. On the test data, the CNN performed slightly better than the RNN on sub task 1, and the RNN performed well on sub task 2. For sub task 1 the CNN managed a score of 95.2%, almost comparable to the RNN, and for sub task 2 the RNN managed a score of 93.1%, making the RNN the better and more versatile overall performer. More accuracy can be achieved with these trained models by extrap</p>
      <p>[ABC+16] [BB08] [BG17]</p>
      <sec id="sec-5-1">
        <title>Tiago A Almeida, Jurandy Almeida, and Akebo Yamakami. Spam filtering: how the dimensionality reduction affects the accuracy of naive Bayes classifiers.</title>
        <p>Journal of Internet Services and
Applications, 1(3):183–200, 2011.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Martín Abadi, Paul Barham, Jianmin</title>
        <p>Chen, Zhifeng Chen, Andy Davis,
Jeffrey Dean, Matthieu Devin, Sanjay
Ghemawat, Geoffrey Irving, Michael
Isard, et al. TensorFlow: A system for
large-scale machine learning. In OSDI,
volume 16, pages 265–283, 2016.</p>
      </sec>
      <sec id="sec-5-3">
        <title>Enrico Blanzieri and Anton Bryl. A survey of learning-based techniques of email spam filtering. Artificial Intelligence Review, 29(1):63–92, 2008.</title>
      </sec>
      <sec id="sec-5-4">
        <title>Reshma U. Anand man K.P. Barathi Representation of</title>
      </sec>
      <sec id="sec-5-5">
        <title>Kumar M. SoGanesh, H.B. target classes [BG18]</title>
        <p>[GGWM+10] Pedro H Calais Guerra, Dorgival
Guedes, J Wagner Meira, Cristine
Hoepers, MHPC Chaves, and Klaus
Steding-Jessen. Exploring the spam
arms race to characterize spam
evolution. In Proceedings of the 7th
Collaboration, Electronic messaging,
Anti-Abuse and Spam Conference (CEAS),
Redmond, WA, 2010.
[HDC+18]
[HMR+86]
[KRA+07]
[LF17]
[LT04]
[MBA18]
[MG18]
[MSC+13]</p>
      </sec>
      <sec id="sec-5-6">
        <title>Reza Hassanpour, Erdogan Dogdu,</title>
        <p>Roya Choupani, Onur Goker, and
Nazli Nazli. Phishing e-mail detection by
using deep learning algorithms. In
Proceedings of the ACMSE 2018
Conference, page 45. ACM, 2018.</p>
      </sec>
      <sec id="sec-5-7">
        <title>Geoffrey E Hinton, James L McClelland,</title>
        <p>David E Rumelhart, et al.
Distributed representations. Parallel
distributed processing: Explorations in the
microstructure of cognition, 1(3):77–109, 1986.</p>
      </sec>
      <sec id="sec-5-8">
        <title>Ponnurangam Kumaraguru, Yong</title>
        <p>Rhee, Alessandro Acquisti, Lorrie Faith
Cranor, Jason Hong, and Elizabeth
Nunge. Protecting people from
phishing: the design and evaluation of an
embedded training email system. In
Proceedings of the SIGCHI
conference on Human factors in computing
systems, pages 905–914. ACM, 2007.</p>
      </sec>
      <sec id="sec-5-9">
        <title>Ruidan Li and Errin W Fulp. Evolutionary</title>
        <p>approaches for resilient
surveillance management. In 2017 IEEE
Security and Privacy Workshops (SPW),
pages 23–28. IEEE, 2017.</p>
      </sec>
      <sec id="sec-5-10">
        <title>Chih-Chin Lai and Ming-Chi Tsai. An</title>
        <p>empirical performance comparison of
machine learning methods for spam
email categorization. In Hybrid
Intelligent Systems, 2004. HIS'04. Fourth
International Conference on, pages 44–48.
IEEE, 2004.</p>
      </sec>
      <sec id="sec-5-11">
        <title>Youness Mourtaji, Mohammed</title>
        <p>Bouhorma, and Daniyal
Alghazzawi. New phishing hybrid detection
framework. Journal of Theoretical &amp;
Applied Information Technology, 96(6),
2018.</p>
      </sec>
      <sec id="sec-5-12">
        <title>Ankur Mishra and BB Gupta. Intelligent phishing detection system using similarity matching algorithms.</title>
        <p>International Journal of Information
and Communication Technology,
12(12):51–73, 2018.</p>
      </sec>
      <sec id="sec-5-13">
        <title>Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality.</title>
        <p>In Advances in neural information
processing systems, pages 3111–3119, 2013.
[SAZ18]</p>
      </sec>
      <sec id="sec-5-14">
        <title>Tianrui Peng, Ian Harris, and Yuki</title>
        <p>Sawa. Detecting phishing attacks
using natural language processing and
machine learning. In Semantic Computing
(ICSC), 2018 IEEE 12th International
Conference on, pages 300–301. IEEE,
2018.</p>
      </sec>
      <sec id="sec-5-15">
        <title>Jyh-Jian Sheu et al. An efficient two-phase spam filtering method based on e-mails categorization. IJ Network Security, 9(1):34–43, 2009.</title>
      </sec>
      <sec id="sec-5-16">
        <title>Sami Smadi, Nauman Aslam, and</title>
        <p>Li Zhang. Detection of online
phishing email using dynamic evolving neural
network based on reinforcement
learning. Decision Support Systems, 2018.</p>
      </sec>
      <sec id="sec-5-17">
        <title>Vysakh S Mohan Soman Kp, Vinayakumar R</title>
        <p>and Prabaharan
Poornachandran. S.p.o.o.f net: Syntactic
patterns for identification of ominous
online factors. In 2018 IEEE Security and
Privacy Workshops (SPW). IEEE,
[In Press], 2018.</p>
      </sec>
      <sec id="sec-5-18">
        <title>Doyen Sahoo, Chenghao Liu, and</title>
        <p>Steven CH Hoi. Malicious url detection
using machine learning: A survey. arXiv
preprint arXiv:1701.07179, 2017.</p>
      </sec>
      <sec id="sec-5-19">
        <title>Konstantin Tretyakov. Machine learning techniques in spam filtering. In</title>
        <p>Data Mining Problem-oriented
Seminar, MTAT, volume 3, pages 60–79,
2004.</p>
      </sec>
      <sec id="sec-5-20">
        <title>Rakesh Verma and Nabil Hossain. Semantic feature selection for text with application to phishing email detection.</title>
        <p>In International Conference on
Information Security and Cryptology, pages
455–468. Springer, 2013.
[VSP18a]
[WC07]
[ZZY04]</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [VSP18b] [WC11]
          <string-name>
            <given-names>R</given-names>
            <surname>Vinayakumar</surname>
          </string-name>
          , KP Soman, and
          <string-name>
            <given-names>Prabaharan</given-names>
            <surname>Poornachandran</surname>
          </string-name>
          .
          <article-title>Deep encrypted text categorization</article-title>
          .
          <source>In Advances in Computing, Communications and Informatics (ICACCI)</source>
          , 2017 International Conference on, pages
          <fpage>364</fpage>
          –
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>R</given-names>
            <surname>Vinayakumar</surname>
          </string-name>
          , KP Soman, and
          <string-name>
            <given-names>Prabaharan</given-names>
            <surname>Poornachandran</surname>
          </string-name>
          .
          <article-title>Detecting malicious domain names using deep learning approaches at scale</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          ,
          <volume>34</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1355</fpage>
          –
          <lpage>1367</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>R</given-names>
            <surname>Vinayakumar</surname>
          </string-name>
          , KP Soman, and
          <string-name>
            <given-names>Prabaharan</given-names>
            <surname>Poornachandran</surname>
          </string-name>
          .
          <article-title>Evaluating deep learning approaches to characterize and classify malicious urls</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          ,
          <volume>34</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1333</fpage>
          –
          <lpage>1343</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <article-title>Using header session messages to antispamming</article-title>
          .
          <source>Computers &amp; Security</source>
          ,
          <volume>26</volume>
          (
          <issue>5</issue>
          ):
          <fpage>381</fpage>
          –
          <lpage>390</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <article-title>Clustering for semi-supervised spam filtering</article-title>
          .
          <source>In Proceedings of the 8th Annual Collaboration</source>
          , Electronic messaging,
          <source>Anti-Abuse and Spam Conference</source>
          , pages
          <fpage>125</fpage>
          –
          <lpage>134</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Le</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Jingbo Zhu, and
          <string-name>
            <given-names>Tianshun</given-names>
            <surname>Yao</surname>
          </string-name>
          .
          <article-title>An evaluation of statistical spam filtering techniques</article-title>
          .
          <source>ACM Transactions on Asian Language Information Processing (TALIP)</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ):
          <fpage>243</fpage>
          –
          <lpage>269</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>