<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Scherer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Vicher</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavla Dráždilová</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Martinovič</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiří Dvorský</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Václav Snášel</string-name>
          <email>vaclav.snasel@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, FEI, VSB - Technical University of Ostrava</institution>
          ,
          <addr-line>Departmen1t7o.fliCstoompapduute1r5S, c7ie0n8c3e3,,FOEsIt,rVavSaB-P</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>peter.s1c7h.elrisetro</institution>
          ,
          <addr-line>pamdaurt1i5n,.7v0i8c3h3e,rO,sptraavvlaa-.Pdorrauzbdai,lCozveac,h jRaenp.umbalrictinovic</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <fpage>108</fpage>
      <lpage>119</lpage>
      <abstract>
        <p>An Intrusion Detection System (IDS) is a system that monitors network traffic and tries to detect suspicious activity. In this paper we discuss the possibilities of applying clustering algorithms and Support Vector Machines (SVM) in an IDS. We used K-means, Farthest First Traversal and COBWEB as the clustering algorithms, and a classification SVM of type 1, also known as C-SVM, as the classifier. By an appropriate choice of the kernel and the SVM parameters we achieved improvements in the detection of intrusions into the system. Finally, we experimentally verified the efficiency of the applied algorithms in an IDS.</p>
      </abstract>
      <kwd-group>
        <kwd>Intrusion Detection System</kwd>
        <kwd>K-means</kwd>
        <kwd>Farthest First Traversal</kwd>
        <kwd>COBWEB/CLASSIT</kwd>
        <kwd>SVM</kwd>
        <kwd>clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Clustering Algorithms and Their Classification</title>
      <p>Cluster analysis is the process of grouping objects (usually represented as
vectors of measurements, or points in a multidimensional space) so that the
objects of one cluster are similar to each other whereas objects of different clusters
are dissimilar.</p>
      <p>
        Clustering is the unsupervised classification of objects (observations, data
items, instances, cases, patterns, or feature vectors) into groups, called clusters. In
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] the author notes that, from a machine learning perspective, clusters correspond
to hidden patterns, the search for clusters is unsupervised learning, and the
resulting system represents a data concept. Therefore, clustering is unsupervised
learning of a hidden data concept.
      </p>
      <p>
        The applications of clustering often deal with large datasets and data with
many attributes. Clustering is related to many other fields. The classic
introduction to clustering in pattern recognition is given in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Machine learning
clustering algorithms were applied to image segmentation and computer vision [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
The various clustering algorithms can be classified according to how they create
clusters of objects. Such a division of clustering algorithms is shown in Fig. 1.
      </p>
      <p>For our intended use of the clustering algorithms in an IDS, we need
algorithms that can determine the membership of an object X in a cluster even if the
object X was not included in the set of objects from which the clusters were generated.
For this purpose we chose the algorithms K-means and Farthest First Traversal
(which are partitional algorithms) and COBWEB/CLASSIT (which is a conceptual
clustering algorithm).</p>
      <p>Partitional Algorithms. Partitional algorithms divide the objects into several
disjoint sets and create one level of non-overlapping clusters. The problem
is to determine how many clusters the algorithm should detect.</p>
      <p>Algorithms of Conceptual Clustering. Algorithms of conceptual clustering
incrementally build a structure over the data by dividing the observed
objects into subclasses. The result of these algorithms is a classification tree.
Each node of the tree contains the objects of its child nodes, so the root of the
tree contains all objects. According to the above classification, these
algorithms are hierarchical, incremental algorithms that combine both the
agglomerative and the divisive approach.</p>
      <sec id="sec-2-1">
        <title>Farthest First Traversal</title>
        <p>
          The Farthest First Traversal (FFT) algorithm is a partitional clustering algorithm.
The algorithm first selects K objects as the centers of the clusters and then assigns
the other objects to the clusters (according to a measure of dissimilarity [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to the centers of the clusters). The first cluster center is chosen randomly,
the second center is chosen as the object most dissimilar to the first, and every
further center is chosen as the object whose measure of dissimilarity to the
previously selected centers is greatest.
        </p>
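        <p>For illustration, the selection of centers can be sketched as follows. This is a
minimal sketch, not the authors' implementation; it uses Euclidean distance as the
dissimilarity measure, whereas the experiments below use the cosine measure.</p>
        <preformat>
import numpy as np

def farthest_first(X, k, rng=np.random.default_rng(0)):
    """Pick k cluster centers by farthest-first traversal."""
    centers = [X[rng.integers(len(X))]]   # first center: a random object
    for _ in range(k - 1):
        # distance of every object to its nearest already-chosen center
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])   # next center: the farthest object
    return np.array(centers)
        </preformat>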
      </sec>
      <sec id="sec-2-2">
        <title>K-means</title>
        <p>
          According to the classification above, the K-means algorithm is a partitional
clustering algorithm. The main idea of the algorithm is to find K cluster centers
(one for each cluster). The question is how to choose these centers, because this
choice significantly affects the resulting clusters. The best choice is to pick cluster
centers that are least similar to each other. The next step is to assign each object
from the data set to the cluster center to which it is most similar. Once this is
done, the next step is to determine the new center of each cluster (the centers are
derived from the objects of the clusters). The classification of objects into clusters
according to their dissimilarity [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]
to the new centers is then performed again. These steps are repeated until the
centers of the clusters no longer change or until a maximum number of iterations
is reached.
        </p>
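        <p>A minimal sketch of the iteration described above (again with Euclidean
distance standing in for the dissimilarity measure used in the experiments):</p>
        <preformat>
import numpy as np

def kmeans(X, k, max_iter=100, rng=np.random.default_rng(0)):
    """Plain K-means: alternate assignment and center update until stable."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # initial centers
    for _ in range(max_iter):
        # assignment step: each object joins its nearest (most similar) center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # update step: each center becomes the mean of its assigned objects
        new_centers = np.array([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):  # centers no longer change
            break
        centers = new_centers
    return centers, labels
        </preformat>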
      </sec>
      <sec id="sec-2-3">
        <title>COBWEB/CLASSIT</title>
        <p>
          This incremental clustering algorithm creates a hierarchical structure of clusters
using four operators (an operator for creating a new cluster, for inserting an object
into an existing cluster, for merging two clusters into one, and for splitting a cluster
into two) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and the categorization utility [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. When an object is processed into a cluster, one of the operators is always
applied; all four operators are always tried, and the categorization utility evaluates
the distribution of clusters after applying each of them. The distribution evaluated
(by the categorization utility) as the best is finally chosen as the result.
        </p>
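        <p>For reference, we assume the categorization utility of [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] to be the standard category utility of the conceptual-clustering literature,
which scores a partition into clusters C1, . . . , CK over nominal attributes Ai with
values Vij as
          \[
            CU = \frac{1}{K} \sum_{k=1}^{K} P(C_k) \Big[ \sum_{i} \sum_{j} P(A_i = V_{ij} \mid C_k)^{2} - \sum_{i} \sum_{j} P(A_i = V_{ij})^{2} \Big],
          \]
i.e. the increase in the expected number of attribute values that can be correctly
guessed given the partition, normalized by the number of clusters.</p>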
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Classification SVM of type 1 (C-SVM) and its parameters</title>
      <sec id="sec-3-1">
        <title>Support Vector Machines Classifier</title>
        <p>
          The Support Vector Machine (SVM) is a popular technique for linear binary data
classification. In [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] the authors state that a classification task usually involves
separating the data into training and testing sets. Each instance in the training set
contains one target value (i.e. the class label) and several attributes (i.e. the
features or observed variables). The goal of SVM is to produce a model (based
on the training data) which predicts the target values of the test data given only
the test data attributes.
        </p>
        <p>Given a binary training set (xi, yi), xi ∈ Rn, yi ∈ {−1, 1}, i = 1, . . . , m, the
basic variant of the SVM algorithm attempts to generate a separating hyper-plane
in the original space of n coordinates (the coordinates xi of vector x) between two
distinct classes, Fig. 2. During the training phase the algorithm seeks a hyper-plane
which best separates the samples of the two classes (classes 1 and −1). Let
h1 : w · x + b = 1 and h−1 : w · x + b = −1 (w, x ∈ Rn, b ∈ R) be hyper-planes
such that the majority of class 1 instances lie above h1 and the majority of class −1
instances fall below h−1, whereas the elements coinciding with h1, h−1 are taken
as the Support Vectors. Finding another hyper-plane h : w · x + b = 0 as the best
separating one (lying in the middle between h1 and h−1) amounts to calculating
w and b, i.e. solving a nonlinear convex programming problem. The notion of the
best separation can be formulated as finding the maximum margin M that separates
the data of both classes. Since M = 2‖w‖−1, maximizing the margin comes down
to minimizing ‖w‖, Eq. (1).</p>
        <p>
          \[
            \min_{w,\,b}\ \frac{1}{2}\|w\|^{2} + C \sum_{i} \varepsilon_{i}
            \quad \text{subject to} \quad
            1 - \varepsilon_{i} - y_{i}(w \cdot x_{i} + b) \le 0,\quad -\varepsilon_{i} \le 0,\quad i = 1, 2, \ldots, m
            \tag{1}
          \]
        </p>
        <p>Regardless of having some elements misclassified (Fig. 2), it is possible to
balance between the incorrectly classified instances and the width of the separating
margin. In this context, the positive slack variables εi and the penalty parameter
C are introduced. The slacks represent the distances of misclassified points to the
initial hyper-plane, while the parameter C models the penalty for misclassified
training points and trades off the margin size against the number of erroneous
classifications (the bigger the C, the smaller the number of misclassifications and
the smaller the margin). The goal is to find a hyper-plane that minimizes
misclassification errors while maximizing the margin between the classes. This
optimization problem is usually solved in its dual form (in the dual space of
Lagrange multipliers):
          \[ w^{*} = \sum_{i=1}^{m} \alpha_{i} y_{i} x_{i} \tag{2} \]
where C ≥ αi ≥ 0, i = 1, . . . , m, and where w∗ is a linear combination of
training examples for an optimal hyper-plane. However, it can be shown that w∗
is a linear combination only of the Support Vectors xi, those for which the
corresponding Lagrange multipliers αi are non-zero. Support Vectors for which the
condition C &gt; αi &gt; 0 holds belong either to h1 or to h−1. Let xa and xb be two
such Support Vectors (C &gt; αa, αb &gt; 0) for which ya = 1 and yb = −1. Then b
can be calculated from b∗ = −0.5 w∗ · (xa + xb), so that the classification (decision)
function finally becomes:</p>
        <p>
          \[ f(x) = \operatorname{sgn}\Big( \sum_{i=1}^{m} \alpha_{i} y_{i} (x_{i} \cdot x) + b^{*} \Big) \tag{3} \]
        </p>
        <p>
          To solve non-linear classification, one can propose a mapping of the instances
into a so-called feature space of very high dimension, ϕ : Rn → Rd, n ≪ d, i.e.
x → ϕ(x). The basic idea of this mapping into a high-dimensional space is to
transform the non-linear case into a linear one and then use the general algorithm
already explained above, Eqs. (1), (2), and (3). In such a space, the dot-product
from Eq. (3) transforms into ϕ(xi) · ϕ(x). Functions for which k(x, y) = ϕ(x) · ϕ(y)
holds are called kernels [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. They represent dot-products in some high-dimensional dot-product spaces
(feature spaces), and yet can easily be computed in the original space. As an
example, the Radial Basis Function, Eq. (4), also known as the Gaussian kernel [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], was chosen; it was one of the kernels implemented in the experimental
procedure.
        </p>
        <p>
          \[ k(x, y) = \exp(-\gamma \|x - y\|^{2}) \tag{4} \]
          Now Eq. (3) becomes:
          \[ f(x) = \operatorname{sgn}\Big( \sum_{i=1}^{m} \alpha_{i} y_{i}\, k(x_{i}, x) + b^{*} \Big) \tag{5} \]
        </p>
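        <p>As a concrete illustration of Eqs. (4) and (5), the following minimal sketch
(not the paper's implementation) evaluates an RBF-kernel decision function for
hypothetical trained values of the multipliers αi, the labels yi, the Support Vectors
and the bias b∗:</p>
        <preformat>
import numpy as np

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel, Eq. (4)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def decision(x, support_vectors, alphas, labels, b, gamma):
    """Kernelized decision function, Eq. (5)."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return np.sign(s + b)
        </preformat>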
        <p>
          After removing all training data that are not Support Vectors and retraining
the classifier, the same result would be obtained [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] by applying the function above. Thus, once identified, the Support Vectors can
replace the entire training set, which is the central idea of the SVM implementation.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>
        The data used for training and testing were prepared for the DARPA
intrusion detection evaluation program in 1998 at MIT Lincoln Labs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Experiments were performed on a collection containing five pairs of data sets: a
learning set (5092 vectors of 42 attributes) and a testing set (6890 vectors of 42
attributes). Each pair represents the learning and testing data for one of five
classes of network attacks. The individual vectors describing the network traffic
are described by 41 attributes (in the range 0 − 1, so normalization is not
necessary). The 42nd attribute was used in the learning process; it determines the
type of network attack in question. In testing, this attribute was ignored, and we
measured only the classification accuracy of the vectors that describe the network
attack.
      </p>
      <sec id="sec-4-1">
        <title>Classification Using SVM type 1 (C-SVM)</title>
        <p>It is necessary to determine an appropriate combination of the parameters C
and γ for better efficiency. In our experiments, the parameter C ranges from 2−5
to 215 in increments of powers of 2, and the parameter γ ranges from 2−15 to 23
in increments of powers of 2. We used 110 combinations of the parameters C and
γ in total. In the case of identical prediction results with different parameters C
and γ, the combination of parameters with the lowest computation time for the
model was chosen. Tables 1, 2, 3, and 4 show the best resulting combinations.</p>
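        <p>A sketch of such a grid search, using scikit-learn's SVC (a wrapper around
LIBSVM [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]) as a stand-in for the authors' setup; X_train and y_train are hypothetical
placeholders for the training vectors and attack labels. Stepping both exponents by
two yields exactly 11 × 10 = 110 combinations, matching the count above.</p>
        <preformat>
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train = rng.random((200, 41))         # placeholder: 41 traffic attributes in 0-1
y_train = rng.integers(0, 2, size=200)  # placeholder: attack / normal labels

param_grid = {
    "C":     [2.0 ** e for e in range(-5, 16, 2)],   # 2^-5 .. 2^15
    "gamma": [2.0 ** e for e in range(-15, 4, 2)],   # 2^-15 .. 2^3
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
        </preformat>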
        <p>
          The four most commonly used kernel functions (linear, polynomial, RBF and
sigmoid) were used in the learning process. As the technology, we used the LibSVM
library [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Classification with Algorithm Farthest First Traversal</title>
        <p>During the experiments with the Farthest First Traversal algorithm we tried to
reveal the effect of the number of generated clusters on the success rate of the
classification of network traffic and on the training time. The measure used by this
algorithm was the cosine measure. Tables 5 and 6 show the results of the
experiments with the FFT algorithm. From these it is possible to deduce that the
training time increases with the number of generated clusters. We tried to optimize
this algorithm by using the KD-tree data structure. The training time of the
algorithm with and without the KD-tree is shown in Tables 5 and 6; as can be seen
there, the training time with the KD-tree was reduced by almost half. Table 7
presents the results of the FFT algorithm with a KD-tree for each class of attack.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Classification with Algorithm K-means</title>
        <p>During the experiments with the K-means algorithm we tried to reveal the
influence of the number of generated clusters on the training time and on the
success rate of the network traffic classification. The measure used by this
algorithm was again the cosine measure. Tables 8, 9 and 10 show the results of each
experiment. From these it is possible to deduce that the training time increases
with the number of generated clusters. We tried to optimize this algorithm by using
the KD-tree data structure. The training time of the algorithm with and without
the KD-tree is shown in Tables 8 and 9. As can be seen there, the training time
with the KD-tree did not decline as significantly as for the FFT algorithm; for
certain numbers of generated clusters the training time was even worse than
without the KD-tree. This is due to the overhead of creating the KD-tree in each
iteration of the algorithm; for a small number of generated clusters it is more
effective to search sequentially for the cluster into which an object falls than to use
the KD-tree. Table 10 presents the results of the K-means algorithm with a
KD-tree for each class of attack (Normal, Probe, DOS, U2R, R2L).</p>
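        <p>The KD-tree optimization replaces the sequential scan over cluster centers in
the assignment step with a nearest-neighbour query. A minimal sketch using SciPy's
cKDTree (our choice of KD-tree implementation, not necessarily the authors'):
because the vectors can be normalized to unit length, the Euclidean nearest center
coincides with the center of greatest cosine similarity.</p>
        <preformat>
import numpy as np
from scipy.spatial import cKDTree

def assign_with_kdtree(X, centers):
    """Nearest-center assignment via a KD-tree instead of a sequential scan.

    Rows are normalized to unit length, so Euclidean nearest-neighbour
    search is equivalent to assignment by greatest cosine similarity.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    tree = cKDTree(Cn)          # rebuilt once per iteration over the centers
    _, labels = tree.query(Xn)  # index of the nearest center for each object
    return labels
        </preformat>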
      </sec>
      <sec id="sec-4-4">
        <title>Classification with Algorithm COBWEB/CLASSIT</title>
        <p>To achieve the best success rate it is necessary to determine the values of the
parameters Acuity and Cutoff. These parameters must be selected manually, and
no method is known for selecting the best combination. We ran experiments with
the values of these parameters: the parameter Acuity was varied over the interval
0.225 to 0.01 with step 0.025 while the parameter Cutoff was held constant at 0.1,
and the parameter Cutoff was varied over the interval 0.1 − 1 with step 0.1 while
the parameter Acuity was held constant at 0.1. Based on these experiments we
chose the value 0.1 for the parameter Acuity and 0.6 for the parameter Cutoff.
Table 11 shows the results of the COBWEB/CLASSIT algorithm for each class of
attack.</p>
      </sec>
    </sec>
    <sec id="sec-conclusion">
      <title>Conclusion</title>
      <p>In this paper we have described and illustrated the prediction accuracy obtained
by using clustering algorithms and SVM in an IDS. Table 13 shows, for each
algorithm used, the success rate for each class of attack. The best average success
rate was achieved by the SVM, at more than 99% (the best of all is the SVM using
the RBF kernel, with a success rate of 99.722%). The average success rates of the
other algorithms were between 91.228% and 98.998%. It will be useful to compare
these two methods on other document collections. In our future work we will
investigate other kernel functions in a search for better attack prediction in the IDS,
SVM parallelization, and optimization of the clustering algorithms.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgment</title>
      <p>This work was partially supported by the Grant Agency of the Czech Republic,
grant No. 205/09/1079, and by SGS, VSB – Technical University of Ostrava, Czech
Republic, grant No. SP2011/172.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Abe</surname>
          </string-name>
          .
          <article-title>Support Vector Machines for pattern classification</article-title>
          . London, Springer,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Abraham</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Jain</surname>
          </string-name>
          .
          <article-title>Soft Computing Models for Network Intrusion Detection Systems. Classification and Clustering for Knowledge Discovery Studies in Computational Intelligence</article-title>
          , p.
          <fpage>191</fpage>
          -
          <lpage>207</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>B.</given-names>
            <surname>Al-Shboul</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Myaeng</surname>
          </string-name>
          .
          <article-title>Initializing k-means using genetic algorithms</article-title>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>P.</given-names>
            <surname>Berkhin</surname>
          </string-name>
          .
          <article-title>A Survey of Clustering Data Mining Techniques</article-title>
          .
          <source>Grouping Multidimensional Data</source>
          , p.
          <fpage>25</fpage>
          -
          <lpage>71</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Chih-Chung</given-names>
            <surname>Chang</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chih-Jen</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>LIBSVM: a library for support vector machines</article-title>
          ,
          <year>2001</year>
          http://www.csie.ntu.edu.tw/~cjlin/libsvm
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          .
          <article-title>An Introduction to Support Vector Machines and other kernel-based learning methods</article-title>
          . Cambridge, Cambridge University Press,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>R.</given-names>
            <surname>Duda</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Hart</surname>
          </string-name>
          .
          <article-title>Pattern Classification and Scene Analysis</article-title>
          . John Wiley &amp; Sons, New York,
          <year>1973</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Fisher</surname>
          </string-name>
          .
          <article-title>Knowledge Acquisition Via Incremental Conceptual Clustering</article-title>
          . Kluwer Academic Publishers,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>G.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Data Clustering: Theory, Algorithms, and Applications</article-title>
          . ASA-SIAM,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chang</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <article-title>A Practical Guide to Support Vector Classification</article-title>
          ,
          <source>journal Bioinformatics</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>S.</given-names>
            <surname>Chavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abraham</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Sanyal</surname>
          </string-name>
          .
          <article-title>Adaptive Neuro-Fuzzy Intrusion Detection Systems</article-title>
          ,
          <source>International Conference on Information Technology: Coding and Computing (ITCC'04)</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jain</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Flynn</surname>
          </string-name>
          .
          <article-title>Image segmentation using clustering</article-title>
          . In
          <source>Advances in Image Understanding: A Festschrift for Azriel Rosenfeld</source>
          , IEEE Press,
          <fpage>65</fpage>
          -
          <lpage>83</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>13. MIT Lincoln Laboratory http://www.ll.mit.edu/IST/ideval/</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>S.</given-names>
            <surname>Owais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Snasel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kromer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Abraham</surname>
          </string-name>
          .
          <article-title>Survey: Using Genetic Algorithm Approach in Intrusion Detection Systems Techniques</article-title>
          , p.
          <fpage>300</fpage>
          -
          <lpage>307</lpage>
          ,
          <source>Computer Information Systems and Industrial Management Applications</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>N.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          .
          <article-title>Incremental hierarchical clustering of text documents</article-title>
          .
          <source>adviser: Jamie Callan</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>V.</given-names>
            <surname>Snasel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Platos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kromer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abraham</surname>
          </string-name>
          .
          <article-title>Matrix Factorization Approach for Feature Deduction and Design of Intrusion Detection Systems</article-title>
          , p.
          <fpage>172</fpage>
          -
          <lpage>179</lpage>
          ,
          <source>The Fourth International Conference on Information Assurance and Security</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>