<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Warehouse Development to Identify Regions with High Rates of Cancer Incidence in México through a Spatial Data Mining Clustering Task.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joaquin Pérez Ortega</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María del Rocío Boone Rojas</string-name>
          <email>rboone@cs.buap.mx</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María Josefa Somodevilla García</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mariam Viridiana Meléndez Hernández</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Benemérita Universidad Autónoma Puebla</institution>
          ,
          <addr-line>Fac. Cs. de la Computación</addr-line>
          ,
          <country country="MX">México</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Centro Nacional de Investigación y Desarrollo Tecnológico</institution>
          ,
          <addr-line>Cuernavaca Mor. Mex</addr-line>
        </aff>
      </contrib-group>
      <fpage>37</fpage>
      <lpage>47</lpage>
      <abstract>
        <p>Data warehouses arise in many contexts, such as business, medicine and science, in which a repository that integrates and organizes heterogeneous data sources under a unified framework facilitates analysis and supports the decision-making process. The scope and usefulness of these repositories increase when they are used for data mining tasks, which can extract new, useful and valuable knowledge from large amounts of data. This paper presents the design and implementation of a population-based data warehouse on the incidence of cancer in Mexico, based on the multidimensional model at the conceptual level and on the ROLAP (Relational On-Line Analytical Processing) model at the implementation level. The data warehouse is built to serve as input for clustering data mining tasks, in particular the k-means algorithm, in order to identify regions in Mexico with high rates of cancer incidence. The identified regions, together with the dimension describing the geographic location of the municipalities and their cancer incidence rates, are processed by IRIS, a Geographic Information System developed at the National Institute of Statistics, Geography and Informatics of Mexico.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Data warehouses arise in many contexts, such as business, medicine and science,
in which a repository that integrates and organizes heterogeneous data sources
under a unified framework facilitates analysis and supports the
decision-making process. The scope and usefulness of these repositories
increase when they are used for data mining tasks, which can extract new,
useful and valuable knowledge from large amounts of data.</p>
      <p>
        Data warehouses have been applied mainly in the commercial and business
areas [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. More recently there have been some applications in the health field
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], along with a trend towards integrating them with other technologies [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        Moreover, according to the literature, data mining systems have seen limited
use in the analysis of massive population-based health databases. Noteworthy
works include: Constructing Overview + Detail Dendrogram-Matrix Views [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Application of data mining techniques to population-based cancer
databases [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], Subgroup discovery in cervical cancer analysis using data mining
techniques [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and Data mining for cancer management in Egypt [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In the
case of Mexico, to the best of our knowledge, the work developed
at the Centro Nacional de Investigación y Desarrollo Tecnológico and BUAP is
the first in this field.
      </p>
      <p>
        This work was preceded by studies on the incidence of other cancers, such
as stomach and lung cancer [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. It is part of a larger
project aimed at improving the k-means algorithm in
aspects such as effectiveness and efficiency, reported in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and at applying it in the health field.
      </p>
      <p>
        This article presents the data warehouse design and integration for the
development of a data mining task on cancer incidence by regions in Mexico,
based on the integration of complementary technologies such as clustering and
geographical information systems. As a case study, the results for the incidence
of cervical cancer are presented. This cancer is of special interest because in
Mexico it is the leading cause of cancer death in women [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The rest of this paper is organized as follows. Section 2
describes the data sources and the design and implementation of the
data warehouse. Section 3 provides an overview of the data mining application. Section 4
presents the results for the case of cervical cancer and their visualization with the INEGI IRIS GIS [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Finally, Section 5 presents the conclusions and perspectives of this work.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 The Data Warehouse</title>
      <p>Collecting and integrating the data warehouse on cancer incidence by
region in Mexico required selecting the data sources needed to accomplish the
data mining task. This section describes the data sources, the conceptual
design based on the multidimensional model, and the implementation of the
data warehouse under the ROLAP approach.</p>
      <sec id="sec-2-1">
        <title>2.1 The Data Sources</title>
        <p>In the study, the processed databases have been derived from official records of
the National Institute of Public Health (INSP) and the National Institute of
Statistics, Geography and Informatics (INEGI) of Mexico.</p>
        <p>
          Data on cancer incidence were obtained through the Remote
Consultation System for Health Information (SCRIS), a subsystem of the INSP [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. In
particular, the databases were queried for cases of cancer mortality, and the results
were configured by levels of aggregation such as national level, state,
division (jurisdiction, municipality), year, age range, gender and cause
(including tumors).
        </p>
        <p>The information on the population and the actual geographical location of the
municipalities was obtained from INEGI official databases through its
Geographic Information System IRIS, which holds statistical information on
a wide range of demographic, social and economic subjects. It also
includes aspects of the physical environment, natural resources and
infrastructure. This wealth of statistical and geographical data was obtained
through activities such as the population and housing census, the
economic census, and the generation of basic and census cartography.</p>
        <p>The information in the databases of the above institutions is integrated into a
data warehouse (see Fig. 1). Following the conventions of the health
field, this study considers only the municipalities with more than one hundred
thousand inhabitants.</p>
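        <p>The integration step described above can be illustrated with a minimal sketch. It assumes that both sources can be keyed by the municipality code and that the size filter is applied at load time; the dictionary layout, the function name and the record with code 99999 are hypothetical, while the two real records reuse values listed in Table 2.</p>
        <preformat>
```python
# Hypothetical extracts of the two sources, keyed by municipality code.
# Values for 07089 and 17006 are taken from Table 2; record "99999" is
# invented only to show the effect of the population filter.
population = {
    "07089": {"municipality": "Tapachula", "state": "Chiapas", "inhabitants": 271674},
    "17006": {"municipality": "Cuautla", "state": "Morelos", "inhabitants": 153329},
    "99999": {"municipality": "Example", "state": "Example", "inhabitants": 42000},
}
deaths = {"07089": 27, "17006": 14, "99999": 3}

MIN_INHABITANTS = 100_000


def integrate(population, deaths, min_inhabitants=MIN_INHABITANTS):
    """Join both sources on the municipality code and apply the size filter."""
    rows = []
    for code, info in sorted(population.items()):
        if info["inhabitants"] > min_inhabitants and code in deaths:
            rows.append({"code": code, **info, "deaths": deaths[code]})
    return rows


warehouse_rows = integrate(population, deaths)
for row in warehouse_rows:
    print(row["code"], row["municipality"], row["inhabitants"], row["deaths"])
```
        </preformat>
        <p>Only the two municipalities above the threshold reach the integrated rows; the invented small municipality is dropped.</p>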
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Data Warehouse Multidimensional Model for a population-based incidence of cancer in Mexico.</title>
        <p>
          According to [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], the conceptual data model most widely used for data
warehouses is the multidimensional model. The data are organized around
facts, which have attributes or measures that can be viewed in more or less detail
along certain dimensions. In our case, the data warehouse design at the
conceptual level is based on the multidimensional model, with the
dimensions CAUSE, TIME and PLACE. The basic fact for a country
is "deaths", which may have associated
attributes such as number of cases, incidence rate, mean, variance, etc. A fact can
be detailed along several dimensions, such as cause of death, place of death, date of
death, etc. Fig. 1 shows the fact "deaths" and three dimensions with various
levels of aggregation. The arrows can be read as "is aggregated into". As shown in Fig. 1,
each dimension has a hierarchical structure, which is not necessarily linear. When the
number of dimensions does not exceed three, each combination of aggregation
levels can be represented as a cube.
        </p>
        <p>The cube is made up of boxes, with one box for each possible combination of values of the
dimensions at the corresponding level of aggregation. In this view, each box
represents a fact. Fig. 2 shows a three-dimensional cube corresponding to the fact
"According to the 2000 census, in the town of Atlixco there were 15 deaths from
cervical cancer", in which the dimensions Cause, Place and Time have been aggregated
by type of disease (cancer), Municipality and Census. The representation of a fact
therefore corresponds to a box in the cube. The value of the box is the
observed measure (in this case, the number of deaths).</p>
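        <p>As an illustration of the cube view, the following sketch stores each box as an entry addressed by one value per dimension and aggregates the measure along the PLACE hierarchy. The dictionary layout and the function name are assumptions made for this example. The Atlixco fact is the one quoted above; the Tapachula and Cuautla counts come from Table 2 and are assumed to refer to the same census.</p>
        <preformat>
```python
from collections import defaultdict

# Each box of the cube is addressed by one value per dimension:
# (cause, place, time) -> number of deaths (the measure).
cube = {
    ("cervical cancer", "Atlixco", 2000): 15,
    ("cervical cancer", "Tapachula", 2000): 27,
    ("cervical cancer", "Cuautla", 2000): 14,
}

# Part of the PLACE hierarchy: municipality -> state.
state_of = {"Atlixco": "Puebla", "Tapachula": "Chiapas", "Cuautla": "Morelos"}


def roll_up_place(cube, state_of):
    """Aggregate the measure from municipality level to state level."""
    totals = defaultdict(int)
    for (cause, municipality, year), n_deaths in cube.items():
        totals[(cause, state_of[municipality], year)] += n_deaths
    return dict(totals)


by_state = roll_up_place(cube, state_of)
print(cube[("cervical cancer", "Atlixco", 2000)])
print(by_state[("cervical cancer", "Chiapas", 2000)])
```
        </preformat>
        <p>Reading an arrow of a dimension hierarchy as "is aggregated into" corresponds to one such roll-up step.</p>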
      </sec>
      <sec id="sec-2-3">
        <title>2.3 ROLAP (Relational OLAP) implementation scheme of the data warehouse on population-based cancer incidence in Mexico.</title>
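        <p>Under a ROLAP approach the multidimensional model is stored in relational tables. The following is a minimal sketch, assuming a star schema with one fact table and one table per dimension (CAUSE, TIME, PLACE). All table and column names are hypothetical and SQLite is used only for illustration; the actual schema and database engine of the warehouse may differ.</p>
        <preformat>
```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_cause (cause_id INTEGER PRIMARY KEY, disease_type TEXT, cause TEXT);
CREATE TABLE dim_time  (time_id INTEGER PRIMARY KEY, census_year INTEGER);
CREATE TABLE dim_place (place_id INTEGER PRIMARY KEY, state TEXT, municipality TEXT);
CREATE TABLE fact_deaths (
    cause_id INTEGER REFERENCES dim_cause(cause_id),
    time_id  INTEGER REFERENCES dim_time(time_id),
    place_id INTEGER REFERENCES dim_place(place_id),
    deaths   INTEGER
);
""")
conn.execute("INSERT INTO dim_cause VALUES (1, 'cancer', 'cervical cancer')")
conn.execute("INSERT INTO dim_time VALUES (1, 2000)")
conn.executemany(
    "INSERT INTO dim_place VALUES (?, ?, ?)",
    [(1, "Puebla", "Atlixco"), (2, "Chiapas", "Tapachula")],
)
conn.executemany(
    "INSERT INTO fact_deaths VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 15), (1, 1, 2, 27)],
)

# Roll the fact up to state level for one cause and one census year.
query = """
SELECT p.state, SUM(f.deaths)
FROM fact_deaths f
JOIN dim_place p ON p.place_id = f.place_id
JOIN dim_cause c ON c.cause_id = f.cause_id
JOIN dim_time  t ON t.time_id  = f.time_id
WHERE c.cause = 'cervical cancer' AND t.census_year = 2000
GROUP BY p.state
ORDER BY p.state
"""
deaths_by_state = conn.execute(query).fetchall()
print(deaths_by_state)
```
        </preformat>
        <p>With this layout, each level of a dimension hierarchy becomes a column of the dimension table, so changing the level of aggregation only changes the GROUP BY clause.</p>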
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Data Mining Application on Cancer Incidence</title>
      <p>The implemented data warehouse has been used to develop a spatial data mining
task based on integrating the data warehouse with complementary technologies,
such as clustering and Geographic Information Systems, which in this case are
well suited to identifying and displaying areas with a high incidence of cancer in
Mexico. The following is a general description of the process used to integrate
the technologies and tools (Fig. 3) for this application.</p>
      <p>The data warehouse integrates the following information for our application:
the spatial component, which allows the regions of municipalities to be displayed;
population data, such as the death rate and the incidence rate; and the time component,
which in this case is the census year.</p>
      <p>
        The INEGI IRIS GIS [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] allows, through its options, the retrieval of
population data and the actual location of the municipalities, which are integrated
into the data warehouse.
      </p>
      <p>Since IRIS stores the geographical representation of municipalities as polygons in the
standardized vector "shape" format, a conversion of
shapes and formats is needed in order to obtain a numerical
representation of each municipality. In this case, the representation is a point at the
center of the municipality, which is obtained primarily through the tools
of ESRI's ArcInfo GIS.</p>
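      <p>The reduction of a polygon to a single point can be sketched as follows. The example assumes that the representative point is the area-weighted centroid of a simple polygon; the method actually applied by the GIS tools is not detailed here, and the outline coordinates are hypothetical.</p>
      <preformat>
```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon")
    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical municipality outline (not real coordinates).
outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
center = polygon_centroid(outline)
print(center)
```
      </preformat>
      <p>For strongly concave municipalities the centroid may fall outside the polygon, in which case a different representative point may be preferable.</p>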
      <p>
        Given the numerical representation of each municipality as a point (x,
y), along with its cancer incidence rate, the Matlab programming
environment and its implementation of the k-means algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] are used
to generate groups of municipalities and the corresponding centroids.
      </p>
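      <p>The clustering step can be sketched in a few lines. The example below is a plain Lloyd-style k-means written in Python for illustration only; it is not the Matlab implementation used in this work, and the rows of (x, y, incidence rate) values, the seed and the function names are hypothetical.</p>
      <preformat>
```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Lloyd-style k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no change means convergence
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a group becomes empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical (x, y, incidence rate) rows; not data from the study.
rows = [
    (1.0, 1.0, 12.0), (1.2, 0.9, 11.5), (0.8, 1.1, 12.5),
    (8.0, 8.0, 3.0), (8.2, 7.9, 2.5), (7.9, 8.1, 3.5),
]
labels, centroids = k_means(rows, k=2)
print(labels)
print(centroids)
```
      </preformat>
      <p>Because location and incidence rate are measured on different scales, the relative scaling of the three attributes influences the resulting groups; the sketch applies no rescaling.</p>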
      <p>Once the above results are available, the data must be converted from
numeric format back to shape format, a process similar to the one above that uses ArcInfo tools
and allows the results to be viewed in the IRIS GIS.</p>
      <p>Finally, the groups of municipalities and their corresponding centroids are
passed as GIS layers to IRIS for display on the geographic map of Mexico.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Results and visualization with IRIS</title>
      <p>In this project we carried out grouping tasks according to the affinity of location
and incidence rate of the municipalities. A series of experimental tests was run on the data
warehouse for municipalities with more than 100,000 inhabitants. The numbers of groups
considered were k = 5, 10, 15, 20 and 30. The best result was obtained for k =
20.</p>
      <p>As a case study, this paper presents the results obtained by the k-means algorithm
in Matlab for the cervical cancer data warehouse. Fig. 4 shows
the 20 regions identified.</p>
      <p>From the results, we distinguish the groups headed by the three
municipalities with the highest incidence rates: Atlixco, Apatzingán and Tapachula
(Chiapas). Fig. 5 shows in detail the group corresponding to the
region of Chiapas and its incidence of cervical cancer. Table 1 provides the
data for this group, together with the mean and standard
deviation.</p>
      <p>
        The groups identified with high incidence rates, Tapachula and Apatzingán,
match municipalities identified in other studies [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and correspond to
population characteristics identified in work from the medical field [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
such as poverty, lack of schooling, limited access to effective health
services and the initiation of sexual activity at an early age. This allows us to
assert that the grouping is valid. In addition, the study
revealed other municipalities that had not been identified in other research,
such as the group of Atlixco, which shows the highest incidence rate in
the country (see Table 2).
      </p>
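      <p>In order to perform a global analysis of our results, Table 2 provides
information on the ten municipalities with the highest incidence rate in the
country.</p>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>Code</th><th>State</th><th>Municipality</th><th>Population</th><th>Deaths</th></tr>
          </thead>
          <tbody>
            <tr><td></td><td></td><td>Atlixco</td><td>117111</td><td>15</td></tr>
            <tr><td></td><td></td><td>Apatzingán</td><td>117949</td><td>13</td></tr>
            <tr><td>07089</td><td>Chiapas</td><td>Tapachula</td><td>271674</td><td>27</td></tr>
            <tr><td>17006</td><td>Morelos</td><td>Cuautla</td><td>153329</td><td>14</td></tr>
            <tr><td>28021</td><td>Tamaulipas</td><td>El Mante</td><td>112602</td><td>10</td></tr>
            <tr><td>06007</td><td>Colima</td><td>Manzanillo</td><td>125143</td><td>11</td></tr>
            <tr><td>30039</td><td>Veracruz-Llave</td><td>Coatzacoalcos</td><td>267212</td><td>23</td></tr>
            <tr><td>18017</td><td>Nayarit</td><td>Tepic</td><td>305176</td><td>26</td></tr>
            <tr><td></td><td></td><td></td><td>153001</td><td>13</td></tr>
            <tr><td></td><td></td><td></td><td>118593</td><td>10</td></tr>
            <tr><td></td><td></td><td>General Mean</td><td></td><td></td></tr>
            <tr><td></td><td></td><td>Standard Deviation</td><td></td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>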
      <p>Figure 6 illustrates where the above incidence rates lie relative to the
national average and the corresponding standard deviation.</p>
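      <p>The comparison of rates against a mean and a standard deviation can be sketched as follows. The example assumes that the incidence rate is expressed as deaths per 100,000 inhabitants, which is a common convention but is not stated explicitly here, and it pairs the population and death figures in the order in which they appear in Table 2. Since the national mean and standard deviation are not reproduced in the text, the sketch computes both over the ten listed municipalities only.</p>
      <preformat>
```python
from statistics import mean, stdev

# (population, deaths) pairs in the order listed in Table 2.
table2 = [
    (117111, 15), (117949, 13), (271674, 27), (153329, 14), (112602, 10),
    (125143, 11), (267212, 23), (305176, 26), (153001, 13), (118593, 10),
]


def rate_per_100k(population, deaths):
    """Deaths per 100,000 inhabitants (assumed definition of the rate)."""
    return deaths / population * 100_000


rates = [rate_per_100k(p, d) for p, d in table2]
group_mean = mean(rates)
group_sd = stdev(rates)  # sample standard deviation of the ten rates
# Rates lying more than one standard deviation above the group mean.
above = [round(r, 2) for r in rates if r > group_mean + group_sd]

print([round(r, 2) for r in rates])
print(round(group_mean, 2), round(group_sd, 2), above)
```
      </preformat>
      <p>Under this assumed definition the computed rates decrease from the first row to the last, which is consistent with the table being ordered by incidence rate.</p>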
    </sec>
    <sec id="sec-5">
      <title>5 Conclusions</title>
      <p>The multidimensional model turned out
to be very appropriate for the conceptual design of the data warehouse, since it is easily scalable and allows the
information to be analyzed from different perspectives. Future studies are expected
to process other variables related to the municipalities and included in this design, such
as socioeconomic status, type of region, gender and access to health services,
among others. Moreover, implementing the data warehouse on the
ROLAP model has made it possible to take advantage of the facilities developed for
relational databases. In addition, the design and implementation
of the data warehouse are expected to be reusable in other applications.</p>
      <p>Processing the spatial component of our data warehouse with the
INEGI IRIS GIS has produced a high-quality visual representation of our
results, based on the actual physical location of the municipalities and on an INEGI
topographic map of the Mexican Republic. Experience has also
been gained in techniques for converting shapes (polygons, points) and
formats (numeric to shape) through ArcView GIS tools.</p>
      <p>We are currently working to complete studies on other cancer types. In addition,
data mining tasks will be developed on the incidence of conditions such as
diabetes, influenza and cardiovascular diseases, among others.</p>
      <p>Acknowledgement. R. Boone expresses her gratitude to Ms. Rocío Pérez Osorno
from INEGI, Puebla (a graduate of the Faculty of Cs. Computing, BUAP), for her
advice and support in plotting the results of this work through the IRIS GIS.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Barrón Vivanco M. Arandine</surname>
            ,
            <given-names>Pérez O. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miranda</surname>
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Fátima</surname>
          </string-name>
          , Pazos R., XII Congreso de Investigación en Salud Pública, Aplicación de técnicas de minería de datos a bases de datos poblacionales de cáncer, CENIDET, México, Secretaría de Saúde do Estado de Pernambuco, Brasil, Abril (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Forgy</surname>
            <given-names>E.</given-names>
          </string-name>
          <article-title>Cluster analysis of multivariate data: Efficiency vs. interpretability of classification</article-title>
          .
          <source>Biometrics</source>
          , vol.
          <volume>21</volume>
          , pp.
          <fpage>768</fpage>
          -
          <lpage>780</lpage>
          ,
          <year>1965</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hernández-Orallo</surname>
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramírez-Quintana</surname>
            <given-names>M. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferri-Ramírez</surname>
            <given-names>C.</given-names>
          </string-name>
          , Introducción a la Minería de Datos, Ed. Pearson Prentice Hall, Madrid (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hidalgo-Martínez</surname>
            <given-names>Ana C.</given-names>
          </string-name>
          <article-title>El cáncer cérvico-uterino su impacto en México. Porqué no funciona el programa nacional de detección oportuna</article-title>
          .
          <source>Revista Biomédica</source>
          , Centro Nal. De Investigaciones Regionales Dr. Hideyo Noguchi, UADY,
          <year>2006</year>
          , México.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. IRIS 4. http://mapserver.inegi.gob.mx. SNIEG Sistema Nacional de Información Estadística y Geográfica.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chen</surname>
            <given-names>Jin</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MacEachren</surname>
            <given-names>Alan M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peuquet</surname>
            <given-names>Donna</given-names>
          </string-name>
          .
          <article-title>Constructing Overview + Detail Dendrogram-Matrix Views</article-title>
          .
          <source>IEEE Transactions on Visualization &amp; Computer Graphics</source>
          , Vol.
          <volume>15</volume>
          , Issue
          <issue>6</issue>
          , p.
          <fpage>889</fpage>
          -
          <lpage>896</lpage>
          , Dec.
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>MacQueen</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Some Methods for Classification and Analysis of Multivariate Observations</article-title>
          .
          <source>In Proceedings Fifth Berkeley Symposium Mathematics Statistics and Probability</source>
          . Vol.
          <volume>1</volume>
          . Berkeley, CA (
          <year>1967</year>
          )
          <fpage>281</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Martínez</surname>
            <given-names>M. Francisco</given-names>
          </string-name>
          <string-name>
            <surname>Javier</surname>
          </string-name>
          .
          <article-title>Epidemiología del cáncer del cuello uterino</article-title>
          .
          <source>Medicina Universitaria</source>
          <year>2004</year>
          ,
          <fpage>39</fpage>
          -
          <lpage>46</lpage>
          . Vol.
          <volume>6</volume>
          ,
          <string-name>
            <surname>N.</surname>
          </string-name>
          <year>22</year>
          ,
          <string-name>
            <surname>UANL</surname>
          </string-name>
          , México.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>NAIIS</given-names>
            <surname>Instituto Nacional de Salud Pública</surname>
          </string-name>
          ,
          <string-name>
            <surname>SCRIS</surname>
          </string-name>
          , Mortalidad, http://sigsalud.insp.mx/naais/, Cuernavaca, Morelos, México, (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Labib</surname>
            <given-names>Nevine M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malek</surname>
            <given-names>Michael N.</given-names>
          </string-name>
          :
          <article-title>Data Mining for Cancer Management in Egypt</article-title>
          .
          <source>Transactions on Engineering, Computing and Technology</source>
          V8, October
          <year>2005</year>
          (ISSN 1305-5313).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Pérez-C. Nelson</surname>
          </string-name>
          ,
          <string-name>
            <surname>Abril-Frade D.O. Estado Actual de las Tecnologías de Bodegas de Datos Espaciales</surname>
          </string-name>
          .
          <source>Ing. E Investigación</source>
          . Vol.
          <volume>27</volume>
          , No. 1, Univ. Nal. De Colombia.
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Pérez-O.</surname>
            <given-names>J.</given-names>
          </string-name>
          , R. Pazos R., L. Cruz R., G. Reyes S.
          <article-title>“Improving the Efficiency and Efficacy of the K-means Clustering Algorithm through a New Convergence Condition”</article-title>
          .
          <source>Computational Science and Its Applications - ICCSA 2007 - International Conference Proceedings</source>
          . Springer Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Pérez-O.</surname>
            <given-names>J.</given-names>
          </string-name>
          , M.F. Henriques, R. Pazos, L. Cruz, G. Reyes, J. Salinas, A. Mexicano.
          <article-title>Mejora al Algoritmo de K-means mediante un Nuevo criterio de convergencia y su aplicación a bases de datos poblacionales de cancer</article-title>
          . 2do Taller Latino Iberoamericano de Investigación de Operaciones, México,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pérez-O.</surname>
            <given-names>J.</given-names>
          </string-name>
          , Rocío Boone Rojas, María J. Somodevilla García.
          <article-title>Research issues on K-means Algorithm: An Experimental Trial Using Matlab</article-title>
          .
          <source>Advances on Semantic Web and New Technologies</source>
          . Vol
          <volume>534</volume>
          . http://ceur-ws.org/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rangel-Gómez</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Lazcano-Ponce</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Palacio-Mejía</surname>
          </string-name>
          ,
          <article-title>Cáncer cervical, una enfermedad de la pobreza: diferencias en la mortalidad por áreas urbanas y rurales en México</article-title>
          , http:// www.insp.mx/salud/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Scotch</surname>
            <given-names>Matthew</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmanto</surname>
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monaco</surname>
            <given-names>V.</given-names>
          </string-name>
          <article-title>Evaluation of SOVAT: An OLAP-GIS decision support system for community health assessment data analysis</article-title>
          .
          <source>BMC Medical Informatics &amp; Decision Making</source>
          Vol.
          <volume>8</volume>
          (1-12).
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Simonet</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Landais</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guillon</surname>
            <given-names>D.</given-names>
          </string-name>
          <article-title>A multi-source Information System for end-stage renal disease</article-title>
          .
          <source>Comptes Rendus Biologies</source>
          ,
          <year>2002</year>
          , Vol.
          <volume>325</volume>
          I4, p.
          <fpage>515</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Thangavel</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaganathan</surname>
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Esmy</surname>
            <given-names>P. O.</given-names>
          </string-name>
          ,
          <article-title>Subgroup Discovery in Cervical Cancer Analysis Using Data Mining Techniques</article-title>
          , Department of Computer Science, Periyar University; Department of Computer Science and Applications, Gandhigram Rural Institute-Deemed University, Gandhigram; Radiation Oncologist, Christian Fellowship Community Health Centre, Tamil Nadu, India.
          <source>AIML journal</source>
          , Vol(
          <volume>6</volume>
          ), Issue(
          <issue>1</issue>
          ), January,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>