<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Closure Structure: a Deeper Insight</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tatiana Makhalova</string-name>
          <email>tatiana.makhalova@inria.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergei O. Kuznetsov</string-name>
          <email>skuznetsov@hse.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amedeo Napoli</string-name>
          <email>amedeo.napoli@loria.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Research University Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
<institution>Université de Lorraine</institution>
          ,
          <addr-line>CNRS, Inria, LORIA, F-54000 Nancy</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Discovery of the pattern search space is an essential component of Itemset Mining. The most common approach to reducing the pattern search space is to compute only frequent closed itemsets. Restricting attention to frequent patterns is known to be a poor choice, since it omits useful infrequent itemsets, and the number of frequent itemsets explodes exponentially as the frequency threshold decreases. In our previous work we proposed the closure structure, which allows computing itemsets level by level without any preset parameters. In this work we experimentally study some properties of the closure levels.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
<p>Itemset Mining (IM) encompasses a wide variety of tasks and methods related to
computing and selecting itemsets. Its main challenges can be summarized in two
questions: which itemsets to compute (and how), and which of them to use (and
how)? IM is usually considered unsupervised learning, meaning that one selects
itemsets based on characteristics such as coverage, diversity, interestingness w.r.t. a
certain measure [9], etc. However, IM also includes supervised approaches,
e.g., rule-based classifiers, where itemsets are selected based on a standard
quality measure of classifiers. Both supervised and unsupervised approaches
may use the same methods for computing itemsets. To date, frequency remains
the major criterion for computing itemsets. The methods for computing
frequent itemsets are brought together under the name Frequent Itemset Mining
(FIM). The main drawback of FIM is that it omits interesting and useful infrequent
itemsets, while its main advantage is efficiency, in the sense that any
FIM approach computes frequent itemsets and only them (because of the
anti-monotonicity of frequency w.r.t. the order of pattern inclusion).</p>
<p>Nowadays there are almost no other (anti-)monotone measures that are
commonly used in IM for computing itemsets. In [4] the authors propose to generate
closed itemsets based on the Δ-measure, which is monotone w.r.t. projections.
The authors propose an efficient polynomial algorithm; however, the lack of an experimental
study of the quality of the generated itemsets may hamper wide use of this
approach in practice.</p>
<p>In our previous work [11], we proposed the closure structure of the concept lattice
(i.e., of the whole set of closed itemsets) and an algorithm for its gradual
computation. The algorithm computes closed itemsets (formal concepts) by levels with
polynomial delay. Each level may contain itemsets of different frequency;
however, the number of frequent itemsets decreases with each new level. In [11] we
presented some theoretical results and described characteristics of closed
itemsets by levels based on experiments. In this work, we further study the topology of
the closure structure and the applicability of concepts for classification by closure
levels.</p>
      <p>The paper has the following structure. In Section 2 we recall the basic notions.
In Section 3 we describe the GDPM algorithm for computing closure levels and
discuss its flaws. In Section 4 we present a simple model of a rule-based classifier.
Section 5 contains the results of the experiments. In Section 6 we conclude and
give directions of future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Basic notions</title>
      <sec id="sec-2-1">
        <title>Concepts and the partial order between them</title>
<p>A formal context [7] is a triple (G, M, I), where G is called a set of objects, M is
called a set of attributes and I ⊆ G × M is a relation called the incidence relation,
i.e., (g, m) ∈ I if object g has attribute m. The derivation operators (·)′ are
defined for A ⊆ G and B ⊆ M as follows:</p>
        <p>A′ = {m ∈ M | ∀g ∈ A : gIm} , B′ = {g ∈ G | ∀m ∈ B : gIm} .</p>
<p>Sets A ⊆ G, B ⊆ M, such that A = A′′ and B = B′′, are said to be closed.
For A ⊆ G, B ⊆ M, a pair (A, B) such that A′ = B and B′ = A is called
a formal concept; A and B are called its extent and intent, respectively. In Data
Mining, an intent is also called a closed itemset (or closed pattern).</p>
<p>A partial order ≤ is defined on the set of concepts as follows: (A, B) ≤ (C, D)
iff A ⊆ C (equivalently, D ⊆ B); the pair (A, B) is called a subconcept of (C, D), while (C, D) is a
superconcept of (A, B). With respect to this partial order, the set of all formal
concepts forms a complete lattice L called the concept lattice of the formal
context (G, M, I).</p>
      </sec>
      <sec id="sec-2-2">
        <title>Equivalence classes and key sets</title>
<p>Let B be a closed itemset. Then all subsets D ⊆ B such that D′′ = B are called
generators of B, and the set of all generators is called the equivalence class of B,
denoted by Equiv(B) = {D | D ⊆ B, D′′ = B}. A subset D ∈ Equiv(B) is a
key [2, 13] or minimal generator of B if for every E ⊂ D one has E′′ ≠ D′′ = B′′,
i.e., every proper subset of a key is a member of the equivalence class of a
smaller closed set. We denote the set of keys (key set) of B by K(B). The set of
keys is an order ideal, i.e., any subset of a key is a key [13]. The minimum key
set Kmin(B) ⊆ K(B) is the subset of the key set that contains the keys of
minimum size, i.e., Kmin(B) = {D | D ∈ K(B), |D| = min{|E| : E ∈ K(B)}}. In an
equivalence class there can be several keys, but only one closed itemset, which
is maximal in this equivalence class. An equivalence class is called trivial if it
consists only of a closed itemset.</p>
        <p>For the sake of simplicity, we denote attribute sets by strings of characters,
e.g., abc instead of {a, b, c}.</p>
<p>Example. Let us consider the formal context given in Table 1. Five concepts have
nontrivial equivalence classes, namely ({g1}, acf), ({g3}, ade), ({g5, g6}, bdf),
({g5}, bdef) and (∅, abcdef). Among them, only bdf and abcdef have
minimum key sets that differ from their key sets, i.e., Kmin(bdf) = {b}, K(bdf) =
{b, df} and Kmin(abcdef) = {ab}, K(abcdef) = {ab, adf, aef, cef}.</p>
        <p>[Table 1: the running-example formal context over objects g1–g6 and attributes a–f, with the key sets and equivalence classes of its closed itemsets: ad → {ad, ade}; af, cf → {af, cf, acf}; b, df → {b, df, bd, bf, bdf}; be, ef → {be, ef, bde, bef, def, bdef}; ab, adf, aef, cef → {ab, adf, aef, cef, ..., abcdef}∗. ∗The equivalence class includes all itemsets that contain a key from K(abcdef).]</p>
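<p>The notions of equivalence class, key, and minimum key can be sketched as follows; the toy context and all function names here are illustrative, not taken from the paper:</p>

```python
from itertools import combinations

# Illustrative toy context (not the paper's Table 1).
G = {"g1", "g2", "g3"}
M = ("a", "b", "c")
I = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c"), ("g3", "b")}

def closure(B):
    """B'' via the two derivation operators."""
    A = [g for g in G if all((g, m) in I for m in B)]
    return frozenset(m for m in M if all((g, m) in I for g in A))

def equivalence_class(B):
    """Equiv(B) = {D | D subset of B, D'' = B}."""
    B = frozenset(B)
    return [frozenset(D) for k in range(len(B) + 1)
            for D in combinations(sorted(B), k) if closure(D) == B]

def keys(B):
    """Keys = minimal generators: no proper subset has the same closure."""
    eq = equivalence_class(B)
    return [D for D in eq if not any(E < D for E in eq)]

def min_keys(B):
    """Kmin(B): the keys of minimum size."""
    ks = keys(B)
    smallest = min(len(D) for D in ks)
    return [D for D in ks if len(D) == smallest]
```

<p>For instance, in this toy context closure({"a"}) is the closed itemset {a, b}, whose only key (and hence minimum key) is {a}.</p>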
      </sec>
      <sec id="sec-2-3">
        <title>Level-wise structure on minimum key sets</title>
        <p>In [11] we introduced the minimum closure structure induced by minimum key
sets. Here we recall the main notions.</p>
<p>Let C be the set of all closed itemsets and Kmin(B) be the minimum key set
of a closed itemset B ∈ C. We denote by level the function that maps a closed itemset
to the size of its minimum keys, i.e., level : C → {0, . . . , |M|}, such
that level(B) = |D|, where D is an arbitrary itemset chosen from
Kmin(B). The minimum structural level k is given by all minimum keys of size
k, i.e., Kkmin = ∪ {Kmin(B) | B ∈ C, level(B) = k}. We say that B belongs to the minimum
structural level k if the keys in Kmin(B) have size k. We denote the corresponding
set of closed itemsets of level k by Ckmin. More formally, Ckmin = {B | B ∈
C, level(B) = k}. We call the minimum structural complexity of C the maximal
number of nonempty levels, Ncmin = max{k | k = 1, . . . , |M|, Kkmin ≠ ∅}.</p>
        <p>Example. The closure structure of the concept lattice from the running example
is given in Fig. 1. It consists of 3 closure levels, and the minimum structural complexity
is equal to 3.</p>
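<p>Once the minimum key sizes are known, the levels and the minimum structural complexity are straightforward to compute. A small sketch (the itemsets and key sizes below are illustrative values, not derived from Table 1):</p>

```python
# Sketch: grouping closed itemsets into minimum structural levels.
# The mapping below (closed itemset -> size of its minimum keys) uses
# illustrative values, not ones computed from the paper's Table 1.
min_key_size = {
    frozenset("b"): 1,
    frozenset("ab"): 1,
    frozenset("acf"): 2,
    frozenset("bdef"): 2,
    frozenset("abcdef"): 3,
}

def level(B):
    """level(B) = |D| for any D in Kmin(B)."""
    return min_key_size[B]

def level_sets(closed_itemsets):
    """C_k^min = {B in C | level(B) = k}, grouped by k."""
    levels = {}
    for B in closed_itemsets:
        levels.setdefault(level(B), set()).add(B)
    return levels

levels = level_sets(min_key_size)
complexity = max(levels)  # minimum structural complexity N_c^min
```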
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The GDPM algorithm and related issues</title>
      <p>
        Efficiency is a principal parameter of the algorithms for computing closed
itemsets (concepts). Apart of polynomial delay, we pay attention to other important
characteristics, namely (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) the strategy of computing a concept given already
generated ones, and (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) the test of uniqueness of generated concepts.
      </p>
      <p>
        Taking into account that the set of minimum keys is an order ideal, we may
generate closure levels using a strategy similar to one used in the Titanic and
Pascal algorithms, i.e., computing a new minimum key by merging two minimum
keys, that differ in one element, from the previous level. This strategy may be
non-optimal because (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) for each concept we should keep all its minimum keys,
and (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) the time complexity of the key candidate generation is quadratic w.r.t.
the size of the last level.
      </p>
<p>An alternative strategy is to extend each key with an attribute that is not included
in the corresponding intent. For this strategy it is important to use an efficient
procedure for verifying whether a concept is generated for the first time. We
cannot use the canonicity test as is done, for example, in CbO, because a key
of a concept may be lexicographically smaller than its minimum key.</p>
<p>The simplest way to ensure the presence of only one minimum key per
concept is to use a lexicographic tree that contains all previously generated
concepts. Thus, for each generated key we additionally need O(|M|) time to
check whether its concept was generated at a previous iteration.</p>
<p>We proposed an algorithm called GDPM (Gradual Discovery in Pattern
Mining) to compute the closure structure of a concept lattice by levels. Its detailed
description and an example are given in [11] (where it is referred to as CbO-Gen). Here
we give a brief description.</p>
      <p>The pseudocode of GDPM is given in Algorithm 1. The algorithm computes
concepts by breadth-first traversal, i.e., at level k it computes
all concepts that have a minimum key of size k. Each newly generated key is
obtained as the union of a minimum key from the previous level and
an attribute outside the closure of that key (lines 3-13). The
key is added to Kk∗ only if its closure is not in the lexicographic tree that stores
all previously generated intents (lines 8-11). For each concept we store only one
minimum key in Kk∗.</p>
      <sec id="sec-3-1">
        <title>Algorithm 1 GDPM</title>
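<p>The exact pseudocode of Algorithm 1 is given in [11]. A rough, self-contained sketch of the level-wise strategy described above (with a plain Python set standing in for the lexicographic tree) might look as follows; the context used in the example call is illustrative:</p>

```python
# A rough sketch of the GDPM level-wise strategy described in the text:
# extend a stored minimum key by an attribute outside its closure, and keep
# the new key only if its closure was never generated before. A Python set
# stands in for the lexicographic tree of [11]; this is not the authors'
# exact Algorithm 1.
def gdpm(G, M, I):
    def closure(B):
        A = [g for g in G if all((g, m) in I for m in B)]
        return frozenset(m for m in M if all((g, m) in I for g in A))

    seen = {closure(frozenset())}       # level 0: closure of the empty set
    keys = {frozenset(): closure(frozenset())}
    levels = []
    while keys:
        next_keys = {}
        for key, intent in keys.items():
            for m in set(M) - intent:   # attributes outside the closure
                candidate = key | {m}
                new_intent = closure(candidate)
                if new_intent not in seen:  # uniqueness test
                    seen.add(new_intent)
                    next_keys[candidate] = new_intent
        if next_keys:
            levels.append(set(next_keys.values()))
        keys = next_keys
    return levels                       # levels[k-1] holds the intents of level k

levels = gdpm(
    ["g1", "g2", "g3"], ["a", "b", "c"],
    {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c"), ("g3", "b")},
)
```

<p>On this toy context the first level contains the intents ab and bc, and the second level the bottom intent abc.</p>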
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Concepts as classifiers. Baseline classification model</title>
<p>In order to evaluate intents (closed itemsets) as representatives of the classes, we
propose to use the class labels of objects, which were unavailable during the computation
of the closure structure. Below we describe a simple concept-based classification
model. This model is closely related to the JSM-method proposed by Finn, which
is widely used in the FCA community [10, 3, 8].</p>
<p>Let (G, M, I) be a training context, where each object g belongs to one class
label(g) ∈ Y, and Y is the set of class labels. We use concepts as classifiers.
Let c = (A, B) ∈ C be a formal concept; then its class is given by class(c) =
arg max_{y ∈ Y} Σ_{g ∈ A} [label(g) = y], where [·] is the indicator function, taking value 1 if
the condition in brackets is true and 0 otherwise. An object g is classified by
a set of concept classifiers C∗ based on the weighted majority vote as follows:
classify(g, C∗, w, θ) = arg max_{y ∈ Y} Σ_{c ∈ C∗, w(c) ≥ θ, class(c) = y} w(c), where w(·) is a weight
function and θ is a weight threshold. For example, for a concept c = (A, B)
the weight can be defined by one of the following functions:</p>
      <p>prec(c) = tp(c) / (tp(c) + fp(c)), recall(c) = tp(c) / (tp(c) + fn(c)), F1(c) = 2 · prec(c) · recall(c) / (prec(c) + recall(c)),</p>
      <p>where tp(c) = |{g | label(g) = class(c), g ∈ A}|, fp(c) = |A| − tp(c), tn(c) = |{g |
label(g) ≠ class(c), g ∈ G \ A}|, fn(c) = |G \ A| − tn(c).</p>
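<p>These per-concept measures follow directly from the definitions. In the sketch below, the labels follow the running example (g1 and g2 in class "+", g3 to g6 in class "-"), while the extent in the example call and the function name are illustrative:</p>

```python
# Sketch of the concept-quality measures defined above: prec, recall and F1
# of a concept classifier with extent A and predicted class cls.
def concept_scores(extent, labels, cls):
    objects = set(labels)
    tp = sum(1 for g in extent if labels[g] == cls)
    fp = len(extent) - tp
    tn = sum(1 for g in objects - set(extent) if labels[g] != cls)
    fn = len(objects - set(extent)) - tn
    prec = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * prec * recall / (prec + recall)
    return prec, recall, f1

# Labels as in the running example; the extent {g1, g2, g3} is illustrative.
labels = {"g1": "+", "g2": "+", "g3": "-", "g4": "-", "g5": "-", "g6": "-"}
prec, recall, f1 = concept_scores({"g1", "g2", "g3"}, labels, "+")
```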
      <p>As a set of classifiers we use either a single level Ckmin or all concepts up to
level k, i.e., ∪j≤kCjmin.</p>
<p>Example. Let us consider the context from Table 1. We take class labels
such that objects g1 and g2 belong to class “+”, and g3–g6 belong to class “−”.
The weights of concept classifiers are given by the precision over the extents, e.g., prec(a) =
prec(c) = prec(d) = prec(f) = 2/3. The threshold is θ = 2/3. The intents a and c
are from the “+” class, d and f from the “−” class. Then, to classify an object gtest
described by acdf we use the intents a, c, d, f from the 1st level and ac, acf
from the 2nd level. Using the intents of the 1st level we cannot classify
gtest, since prec(a) + prec(c) = prec(d) + prec(f) = 4/3. For the intents of the 2nd level
we have prec(ac) + prec(acf) = 2 for the “+” class and 0 for the “−” class; thus, we classify
gtest as “+”.</p>
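<p>The weighted vote of this example can be replayed in code. The intents, classes, and precision weights below are the ones quoted above; the helper name classify is illustrative:</p>

```python
# Sketch of the weighted majority vote from this section, replayed on the
# running example. Intents are written as strings of attribute characters.
def classify(description, classifiers, theta):
    """Sum the weights w(c) >= theta of matching classifiers per class."""
    votes = {}
    for intent, cls, w in classifiers:
        if w >= theta and set(intent) <= set(description):
            votes[cls] = votes.get(cls, 0.0) + w
    if not votes:
        return None
    best = max(votes.values())
    winners = [y for y, v in votes.items() if v == best]
    return winners[0] if len(winners) == 1 else None  # None on a tie

# (intent, class, precision weight) as in the example above.
level1 = [("a", "+", 2 / 3), ("c", "+", 2 / 3),
          ("d", "-", 2 / 3), ("f", "-", 2 / 3)]
level2 = [("ac", "+", 1.0), ("acf", "+", 1.0)]

first = classify("acdf", level1, theta=2 / 3)            # tie: unclassified
second = classify("acdf", level1 + level2, theta=2 / 3)
```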
<p>The proposed model is based on all intents from a given level that meet the
weight requirements. However, more sophisticated models, e.g., Classy [12] or
Krimp [14], can be adapted to use the intents from closure levels instead of frequent
itemsets. A more careful combination of the intents may improve the classification
quality.</p>
      <p>For large datasets the closure structure can be computed for each class
independently.</p>
    </sec>
    <sec id="sec-5">
      <title>Experiments</title>
<p>In this section we report the results of an experimental study of the minimum
closure structure, i.e., of the closed itemsets within the levels Ckmin. We use freely available
datasets from the LUCS/KDD data set repository [5]; their characteristics are
given in Table 2.</p>
      <sec id="sec-5-1">
        <title>Concept contrast based on F1 measure</title>
<p>In IM, apart from descriptive qualities of itemsets such as coverage or diversity, one is
interested in assessing the quality of itemsets as representatives of the classes.
The latter is closely related to emerging patterns [6, 1]. In this study we evaluate
the contrast of formal extents by the F1 measure. As was shown in [11], the average F1
measure usually decreases w.r.t. closure levels. However, there are some datasets
with atypical behavior, where the average F1 measure increases at the last levels
of the closure structure. To investigate the underlying causes of this behavior we
study the values of the F1 measure within frequency ranges of size 0.1, i.e.,
(0.0, 0.1], (0.1, 0.2], (0.2, 0.3], etc.</p>
<p>Fig. 2 shows the results for some datasets. Our experiments showed that
the value of the F1 measure of the concepts within a fixed frequency range
usually remains almost the same at all levels. Thus, the average F1 at a closure level is
determined by the proportion of concepts of a certain frequency. Since the ratio
of frequent (infrequent) concepts decreases (increases) with the level number,
the F1 measure decreases as well. Thus, we may expect an increase of the average F1
measure at the last levels for datasets with a large number of frequent and
“coherent” attributes and a subset of infrequent “incoherent” attributes.</p>
<p>In the previous study we showed that the sizes of the closure levels resemble
binomial coefficients, i.e., the largest levels are located in the middle
of the closure structure, while the first and last levels are the smallest ones. In
Fig. 3 we show the sizes of the levels w.r.t. 10 frequency ranges. As for the whole
set of concepts, for the subset of concepts of a given frequency we observe quite
similar level size distributions: the largest level is in the middle, and the smallest
ones are the first and last levels. The index of the largest level shifts:
the lower the frequency, the larger the index of the largest level.</p>
        <p>We now report the average accuracy by 10-fold cross-validation of
the rule-based classifier described in Section 4. We use 8 folds (training set)
to compute itemsets, 1 fold (test set) to select the best parameters, and 1 fold
(validation set) to assess the performance of the classifiers. We report the average
values on the validation sets. We use both concepts from a single closure level
(single level, SL) and concepts from all levels up to a given level (cumulative
levels, CL) to build a classifier. As a weight function we use precision with the
following threshold values: 0.0, 0.6, 0.7, 0.8 and 0.9.</p>
<p>The experiments showed that both SL- and CL-classifiers may achieve quite
high accuracy. The average accuracy for 8 datasets is given in Fig. 4. The
maximal (or close to maximal) accuracy of CL-classifiers is achieved at the first
levels and usually changes only slightly when the classifier is extended with further
closure levels. For SL-classifiers the maximal (or close to maximal) accuracy
is usually achieved at one of the first levels.</p>
        <p>
          We also compared the proposed classifier with the state-of-the-art classifiers
from the Sklearn library. We consider SVM, Naive Bayes, and 3 tree-based
models: Random Forests, CART, C5.0. We also use three sets to select the best
parameters for each classifiers, i.e., the number of trees for tree-based classifiers
(50 or 100), the maximum tree depth (
          <xref ref-type="bibr" rid="ref2 ref5">2, 5, 10, 15</xref>
          ) and the kernel types for SVM
(polynomial, Radial basis function, sigmoid).
        </p>
<p>The average accuracy is reported in Table 3. The experiments show that
even with the simplest model of classifiers based on closure levels we can achieve
accuracy comparable to that of state-of-the-art classifiers. A more careful
selection and combination of the generated concepts may provide better
quality.</p>
        <p>Based on the obtained results we may conclude that the proposed level-wise
strategy allows us to generate concepts that describe meaningful groups of
objects, and the intents from the first closure levels may be used as an alternative
to frequent itemsets.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
<p>In this paper, we further study the closure structure of a concept lattice by
focusing on the ability of concepts to describe meaningful subsets of objects. Our
experiments show that the levels of the closure structure are a good alternative
to frequency. Each closure level is computed with polynomial delay, and the
quality of itemsets decreases after the first levels.</p>
<p>One of the main directions of future work is to develop more efficient
algorithms for computing the closure levels and to study other practical
applications where the proposed closure structure may provide better results than
frequency-based concept generation.</p>
      <p>J. (eds.) Contrast Data Mining: Concepts, Algorithms, and Applications, pp. 269–282. CRC Press (2013)</p>
      <p>7. Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer (1999)</p>
      <p>8. Ganter, B., Grigoriev, P.A., Kuznetsov, S.O., Samokhin, M.V.: Concept-based data mining with scaled labeled graphs. In: International Conference on Conceptual Structures, pp. 94–108. Springer (2004)</p>
      <p>9. Geng, L., Hamilton, H.J.: Interestingness measures for data mining: A survey. ACM Computing Surveys (CSUR) 38(3), 9–es (2006)</p>
      <p>10. Kuznetsov, S.O.: Interpretation on graphs and complexity characteristics of a search for specific patterns. Automatic Documentation and Mathematical Linguistics 24(1), 37–45 (1989)</p>
      <p>11. Makhalova, T., Kuznetsov, S.O., Napoli, A.: Gradual discovery with closure structure of a concept lattice. In: The 15th International Conference on Concept Lattices and Their Applications (2020)</p>
      <p>12. Proença, H.M., van Leeuwen, M.: Interpretable multiclass classification by MDL-based rule lists. Information Sciences 512, 1372–1393 (2020)</p>
      <p>13. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing iceberg concept lattices with Titanic. Data &amp; Knowledge Engineering 42(2), 189–222 (2002)</p>
      <p>14. Vreeken, J., van Leeuwen, M., Siebes, A.: Krimp: mining itemsets that compress. Data Mining and Knowledge Discovery 23(1), 169–214 (2011)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Asses</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buzmakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourquard</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A hybrid classification approach based on FCA and emerging patterns - an application for the classification of biological inhibitors</article-title>
          .
          <source>In: Proceedings of the 9th International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings</source>
          , vol.
          <volume>972</volume>
          , pp.
          <fpage>211</fpage>
          -
          <lpage>222</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bastide</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taouil</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquier</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Mining frequent patterns with counting inference</article-title>
          .
          <source>ACM SIGKDD Explorations Newsletter</source>
          <volume>2</volume>
          (
          <issue>2</issue>
          ),
          <fpage>66</fpage>
          -
          <lpage>75</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Blinova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dobrynin</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finn</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Pankratova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Toxicology analysis by means of the jsm-method</article-title>
          .
          <source>Bioinformatics</source>
          <volume>19</volume>
          (
          <issue>10</issue>
          ),
          <fpage>1201</fpage>
          -
          <lpage>1207</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Buzmakov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
<article-title>Sofia: How to make FCA polynomial?</article-title>
          <source>In: Proceedings of the 4th International Conference on What can FCA do for Artificial Intelligence?-</source>
          Volume
          <volume>1430</volume>
          . pp.
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          . CEUR-WS. org (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Coenen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The LUCS-KDD discretised/normalised ARM and CARM Data Library</article-title>
          . http://www.csc.liv.ac.uk/ frans/KDD/Software/LUCS KDD DN,
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
<string-name>
            <surname>Cuissart</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poezevara</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crémilleux</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lepailleur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bureau</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Emerging Patterns as Structural Alerts for Computational Toxicology</article-title>
          . In: Dong, G., Bailey,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>