<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Data Warehouse Development to Identify Regions with High Rates of Cancer Incidence in México through a Spatial Data Mining Clustering Task</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Joaquin</forename><surname>Pérez Ortega</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Centro Nacional de Investigación y Desarrollo Tecnológico</orgName>
								<address>
									<settlement>Cuernavaca Mor. Mex</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">María</forename><forename type="middle">del Rocío</forename><surname>Boone Rojas</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Centro Nacional de Investigación y Desarrollo Tecnológico</orgName>
								<address>
									<settlement>Cuernavaca Mor. Mex</settlement>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Fac. Cs. de la Computación</orgName>
								<orgName type="institution">Benemérita Universidad Autónoma Puebla</orgName>
								<address>
									<country key="MX">México</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">María</forename><forename type="middle">Josefa</forename><surname>Somodevilla García</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Fac. Cs. de la Computación</orgName>
								<orgName type="institution">Benemérita Universidad Autónoma Puebla</orgName>
								<address>
									<country key="MX">México</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mariam</forename><forename type="middle">Viridiana</forename><surname>Meléndez Hernández</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Fac. Cs. de la Computación</orgName>
								<orgName type="institution">Benemérita Universidad Autónoma Puebla</orgName>
								<address>
									<country key="MX">México</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Data Warehouse Development to Identify Regions with High Rates of Cancer Incidence in México through a Spatial Data Mining Clustering Task</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4FCA13D0D43B9501FED821F79943687E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T09:13+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Data warehouses arise in many contexts, such as business, medicine and science, in which a repository that integrates heterogeneous data sources and organizes them under a unified framework facilitates analysis and supports decision making. Their scope and usefulness increase when they feed data mining tasks, which can extract new, useful and valuable knowledge from large amounts of data.</p><p>This paper presents the design and implementation of population-based data warehouses on the incidence of cancer in Mexico, based on the multidimensional model at the conceptual level and on the ROLAP (Relational On-Line Analytical Processing) model at the implementation level.</p><p>A data warehouse is built to serve as input for clustering data mining tasks, in particular the k-means algorithm, in order to identify regions in Mexico with high rates of cancer incidence.</p><p>The identified regions, together with the dimension describing the geographic location of the municipalities and their cancer incidence rates, are processed with IRIS, a Geographic Information System developed at the National Institute of Statistics, Geography and Informatics of Mexico.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Data warehouses arise in many contexts, such as business, medicine and science, in which a repository that integrates heterogeneous data sources and organizes them under a unified framework facilitates analysis and supports decision making. Their scope and usefulness increase when they feed data mining tasks, which can extract new, useful and valuable knowledge from large amounts of data.</p><p>Data warehouses have been applied mainly in commercial and business areas <ref type="bibr" target="#b2">[3]</ref>; more recently there have been some applications in the health field <ref type="bibr" target="#b14">[16]</ref>  <ref type="bibr" target="#b15">[17]</ref>, along with a trend towards their integration with various technologies <ref type="bibr">[11] [16]</ref>.</p><p>Moreover, according to the literature, the use of data mining systems for the analysis of massive population-based health databases has been limited. Noteworthy works include: Constructing Overview+Detail Dendrogram Matrix Views <ref type="bibr" target="#b4">[6]</ref>, Application of data mining techniques to population databases of cancer <ref type="bibr" target="#b0">[1]</ref>, Subgroup discovery in cervical cancer using data mining techniques <ref type="bibr" target="#b16">[18]</ref> and Data mining for cancer management in Egypt <ref type="bibr" target="#b8">[10]</ref>. In the case of Mexico, to the best of our knowledge, the works developed at the Centro Nacional de Investigación y Desarrollo Tecnológico and BUAP are the first in this field.</p><p>This work was preceded by studies on the incidence of other cancers, such as stomach and lung cancer <ref type="bibr" target="#b13">[15]</ref>.
It is part of a larger project aimed at proposing improvements to the k-means algorithm in aspects such as effectiveness and efficiency, reported in <ref type="bibr" target="#b10">[12]</ref>, <ref type="bibr" target="#b11">[13]</ref> and <ref type="bibr" target="#b12">[14]</ref>, and at applying it in the health field.</p><p>This article presents the design and integration of the data warehouse for a data mining task on cancer incidence by region in Mexico, based on the integration of complementary technologies such as clustering and geographical information systems. As a case study, results are presented for the incidence of cervical cancer, which is of special interest because cervical cancer is the leading cause of cancer death among women in Mexico <ref type="bibr" target="#b9">[11]</ref>.</p><p>The paper is organized as follows. After this introduction, Section 2 describes the data sources and the design and implementation of the data warehouse. Section 3 gives an overview of the data mining application. Section 4 presents the results for the case of cervical cancer and their visualization with INEGI's IRIS GIS [5]. Finally, Section 5 presents the conclusions and perspectives of this work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">The Data Warehouse</head><p>Collecting and integrating the data warehouse on cancer incidence by region in Mexico required selecting the data sources needed for the data mining task. This section describes the data sources, the conceptual design based on the multidimensional model, and the implementation of the data warehouse under the ROLAP approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">The Data Sources</head><p>The databases processed in this study are derived from official records of the National Institute of Public Health (INSP) and the National Institute of Statistics, Geography and Informatics (INEGI) of Mexico.</p><p>Data on cancer incidence were obtained through the Remote Consultation System for Health Information (SCRIS) subsystem of the INSP <ref type="bibr" target="#b7">[9]</ref>. In particular, the databases were queried for cancer mortality cases, and the results were configured by levels of aggregation such as national, state, division (jurisdiction, municipality), year, age range, gender and cause (including tumors).</p><p>The information on the population and the actual geographical location of the municipalities was obtained from official INEGI databases through its Geographic Information System IRIS, which holds statistical information on a wide range of geographic, demographic, social and economic subjects, and also covers aspects of the physical environment, natural resources and infrastructure. This wealth of statistical and geographical data was obtained through activities such as population and housing censuses, economic censuses and the generation of basic cartography.</p><p>The information from the databases of these institutions is integrated into a data warehouse (see Fig. <ref type="figure">1</ref>) and, following the conventions in the area of health, only municipalities with more than one hundred thousand inhabitants were considered in this study. According to <ref type="bibr" target="#b3">[4]</ref>, the conceptual data model most widely used for data warehouses is the multidimensional model. The data are organized around facts, which have attributes or measures that can be viewed in more or less detail along certain dimensions.
In our case, the data warehouse design at the conceptual level is based on the multidimensional model, with the dimensions CAUSE, TIME and PLACE. The basic fact considered is "deaths", which may have associated attributes such as number of cases, incidence rate, mean and variance. A fact can be detailed along several dimensions, such as cause of death, place of death and date of death. Fig. <ref type="figure">1</ref> shows the fact "deaths" and three dimensions with various levels of aggregation. The arrows can be read as "is aggregated into". As shown in Fig. <ref type="figure">1</ref>, each dimension has a hierarchical structure, which is not necessarily linear. When the number of dimensions does not exceed three, each combination of aggregation levels can be represented as a cube.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The cube is made up of cells, one for each possible combination of values of the dimensions at the corresponding level of aggregation. In this view, each cell represents a fact. Fig. <ref type="figure" target="#fig_1">2</ref> shows a three-dimensional cube for the fact "According to the 2000 census, there were 15 deaths from cervical cancer in the municipality of Atlixco", in which the dimensions Cause, Place and Time are aggregated by type of disease (cancer), municipality and census. The representation of a fact therefore corresponds to a cell of the cube, and the value of the cell is the observed measure (in this case, the number of deaths). One of the most efficient ways to implement a multidimensional model using relational databases is based on the ROLAP model <ref type="bibr" target="#b3">[4]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Data Mining Application on Cancer Incidence</head><p>The implemented data warehouse has been used to develop a spatial data mining task that integrates the data warehouse with complementary technologies, namely clustering and Geographic Information Systems, which are well suited to identifying and displaying areas with cancer incidence in Mexico. The following is a general description of the process of integrating technologies and tools (Fig. <ref type="figure">3</ref>) for this application.</p><p>The data warehouse integrates the following information for our application: the spatial component, which allows the regions of municipalities to be displayed; population data, such as the death rate and the incidence rate; and the time component, which in this case is the census year.</p><p>INEGI's IRIS GIS [5] provides options to retrieve the population data and the actual location of the municipalities, which are integrated into the data warehouse.</p><p>Since IRIS stores the geographical representation of municipalities as polygons in the standardized vector format "shape", a conversion of shapes and formats is needed to obtain a numerical representation of each municipality, which in this case is a point at the center of the municipality. This conversion is carried out mainly with ESRI's ArcInfo GIS tools.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 3 Integration of Technology and Data Mining Tools</head><p>Given the numerical representation of each municipality as a point (x, y), together with its cancer incidence rate, the Matlab programming environment and its implementation of the k-means algorithm <ref type="bibr">[2] [7]</ref> are used to generate patterns (groups) of municipalities and the corresponding centroids.</p><p>Once these results are available, the numerical data must be converted back to the shape format, again with ArcInfo tools, so that they can be displayed in the IRIS GIS.</p><p>Finally, the groups of municipalities and their centroids are passed to IRIS as GIS layers for display on the geographic map of Mexico.</p></div>
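<div xmlns="http://www.tei-c.org/ns/1.0"><p>The clustering step described above can be illustrated with a minimal sketch. The study itself relies on Matlab's k-means implementation; the Python code below is only a plain Lloyd-style k-means written for illustration. The function name kmeans, its parameters and the six sample rows (x, y, rate) are hypothetical and are not data from the warehouse.</p>
<p>
```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd-style k-means on equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Forgy-style start: k distinct input rows serve as initial centroids.
    centroids = rng.sample(points, k)
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are final
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical rows: (x, y, incidence rate) for six made-up municipalities.
rows = [
    (0.0, 0.0, 9.0), (0.2, 0.1, 9.5), (0.1, 0.3, 8.8),
    (10.0, 10.0, 2.0), (10.2, 9.9, 2.5), (9.8, 10.1, 1.8),
]
centroids, labels = kmeans(rows, k=2)
for row, label in zip(rows, labels):
    print(row, "belongs to group", label)
```
</p>
<p>On these sample rows the two spatially separated groups are recovered, and each centroid carries the mean location and mean rate of its group. The text does not state whether the coordinates and the rate were scaled before clustering, so the sketch uses the raw values.</p></div>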
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results and visualization with IRIS</head><p>In this project, municipalities were grouped according to their similarity in location and incidence rate. A series of experimental tests was carried out on the data for municipalities with more than 100,000 inhabitants. The numbers of groups considered were k = 5, 10, 15, 20 and 30. The best result was obtained for k = 20.</p><p>As a case study, this paper presents the results obtained with the k-means algorithm in Matlab for the cervical cancer data warehouse. Fig. <ref type="figure">4</ref> shows the 20 regions identified.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 4 Regions of the Municipalities with an incidence of Cervical Cancer.</head><p>From the results, we highlight the groups led by the three municipalities with the highest incidence rates: Atlixco, Apatzingán and Tapachula (Chiapas). Fig. <ref type="figure">5</ref> shows in detail the group corresponding to the region of Chiapas and its incidence of cervical cancer. Table <ref type="table" target="#tab_1">1</ref> lists the data for this group, together with the mean and standard deviation of the rates.</p></div>
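<div xmlns="http://www.tei-c.org/ns/1.0"><p>The figures in Table 1 can be checked with a short calculation. The sketch below assumes that the Rate column is the number of deaths per 100,000 inhabitants, a definition the text does not state explicitly but which matches the printed values up to the last digit (the table appears to truncate rather than round). The helper name rate_per_100k is hypothetical.</p>
<p>
```python
import statistics

# (municipality, population, deaths) copied from three rows of Table 1.
rows = [
    ("Tapachula", 271674, 27),
    ("Coatzacoalcos", 267212, 23),
    ("Ocosingo", 146696, 2),
]


def rate_per_100k(deaths, population):
    """Deaths per 100,000 inhabitants (assumed definition of the Rate column)."""
    return deaths * 100_000 / population


for name, population, deaths in rows:
    print(f"{name}: {rate_per_100k(deaths, population):.2f}")

# The fourteen Rate values of Table 1, as printed.
table1_rates = [9.93, 8.60, 8.49, 7.60, 6.79, 6.68, 5.06,
                5.04, 4.83, 4.79, 4.64, 4.47, 4.42, 1.36]
mean_rate = statistics.mean(table1_rates)
sample_sd = statistics.stdev(table1_rates)  # sample form, n - 1 denominator
print(f"mean = {mean_rate:.2f}, sd = {sample_sd:.2f}")
```
</p>
<p>With the sample (n - 1) form of the standard deviation, the last line reproduces the 5.91 and 2.23 reported at the bottom of Table 1.</p></div>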
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 5 Tapachula Chiapas Group</head><p>The groups identified with high incidence rates, Tapachula and Apatzingán, match municipalities identified in other studies <ref type="bibr" target="#b3">[4]</ref> and correspond to population characteristics identified in medical studies <ref type="bibr" target="#b6">[8]</ref>, <ref type="bibr" target="#b13">[15]</ref>, such as poverty, lack of education, lack of access to effective health services and the initiation of sexual activity at an early age. This supports the validity of the grouping. On the other hand, the study revealed other municipalities that had not been identified in previous research, such as the Atlixco group, which shows the highest incidence rate in the country (see Table <ref type="table" target="#tab_2">2</ref>). For a global analysis of our results, Table <ref type="table" target="#tab_2">2</ref> lists the ten municipalities with the highest incidence rates in the country. Figure <ref type="figure" target="#fig_2">6</ref> compares these incidence rates with the national average and the corresponding standard deviation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions</head><p>The multidimensional model proved very appropriate for the conceptual design of the data warehouse, since it scales easily and allows the information to be analyzed from different perspectives. Future studies are expected to process other municipality-related variables included in this design, such as socioeconomic status, type of region, gender and access to health services. Moreover, implementing the data warehouse on the ROLAP model made it possible to take advantage of the facilities developed for relational databases. The design and implementation of the data warehouse are also expected to be reusable in other applications.</p><p>Processing the spatial component of our data warehouse with INEGI's IRIS GIS produced a high-quality visual representation of our results, based on the actual physical location of the municipalities and on INEGI's topographic map of the Mexican Republic. Experience was also gained in techniques for converting shapes (polygons, points) and formats (number-shape) with ArcView GIS tools.</p><p>We are currently working to complete the studies of other cancer types. In addition, data mining tasks will be developed on the incidence of conditions such as diabetes, influenza and cardiovascular diseases, among others.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1</head><label>1</label><figDesc>Fig. 1 Multidimensional model of the data warehouse on the incidence of cancer in Mexico.</figDesc><graphic coords="3,158.06,348.86,297.37,236.12" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2</head><label>2</label><figDesc>Fig. 2 Display of a fact in a multidimensional model</figDesc><graphic coords="4,199.90,440.72,213.53,182.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 6</head><label>6</label><figDesc>Figure 6. Incidence rates of the top ten municipalities.</figDesc><graphic coords="9,149.23,176.24,315.05,160.79" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>In our case, the tables for the ROLAP model have the following schemes:</figDesc><table><row><cell>Snowflake Tables</cell></row><row><cell>Cause dimension</cell></row><row><cell>DISEASE (Clave_Enfermedad, name, IdGama, CategoryID)</cell></row><row><cell>GAMA (IdGama, CategoryID, Description)</cell></row><row><cell>CATEGORY (CategoryID, Description)</cell></row><row><cell>Place dimension</cell></row><row><cell>STATE (Clave_Estado, name, población_total)</cell></row><row><cell>MUNICIPALITY (Clave_Municipio, Clave_Estado, name, población_total, Loc_x, Loc_y, extension, tipo_zona, nivel_socioeconómico)</cell></row><row><cell>Time dimension</cell></row><row><cell>YEAR (Idan)</cell></row><row><cell>CENSUS (IdCenso, Idan, number, name)</cell></row><row><cell>Fact Tables</cell></row><row><cell>DEATH (IdEnfermedad, IdCenso, IdMunicipio, no_casos, rate, mean, variance)</cell></row><row><cell>Star Tables</cell></row><row><cell>TIME (Idan, IdCenso)</cell></row><row><cell>CAUSE (IdEnfermedad, IdGama, CategoryID)</cell></row><row><cell>PLACE (IdCiudad, IdMunicipio)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 Municipalities Incidence Rates of Cervical-Uterine Cancer</head><label>1</label><figDesc></figDesc><table><row><cell>State</cell><cell>Municipality</cell><cell>Population</cell><cell>Deaths</cell><cell>Rate</cell></row><row><cell>Chiapas</cell><cell>Tapachula</cell><cell>271674</cell><cell>27</cell><cell>9.93</cell></row><row><cell>Veracruz-Llave</cell><cell>Coatzacoalcos</cell><cell>267212</cell><cell>23</cell><cell>8.60</cell></row><row><cell>Veracruz-Llave</cell><cell>Minatitlán</cell><cell>153001</cell><cell>13</cell><cell>8.49</cell></row><row><cell>Chiapas</cell><cell>Comitán de Domínguez</cell><cell>105210</cell><cell>8</cell><cell>7.60</cell></row><row><cell>Chiapas</cell><cell>San Cristóbal de las Casas</cell><cell>132421</cell><cell>9</cell><cell>6.79</cell></row><row><cell>Tabasco</cell><cell>Comalcalco</cell><cell>164637</cell><cell>11</cell><cell>6.68</cell></row><row><cell>Tabasco</cell><cell>Cárdenas</cell><cell>217261</cell><cell>11</cell><cell>5.06</cell></row><row><cell>Tabasco</cell><cell>Huimanguillo</cell><cell>158573</cell><cell>8</cell><cell>5.04</cell></row><row><cell>Chiapas</cell><cell>Tuxtla Gutiérrez</cell><cell>434143</cell><cell>21</cell><cell>4.83</cell></row><row><cell>Tabasco</cell><cell>Cunduacán</cell><cell>104360</cell><cell>5</cell><cell>4.79</cell></row><row><cell>Campeche</cell><cell>Carmen</cell><cell>172076</cell><cell>8</cell><cell>4.64</cell></row><row><cell>Tabasco</cell><cell>Macuspana</cell><cell>133985</cell><cell>6</cell><cell>4.47</cell></row><row><cell>Tabasco</cell><cell>Centro</cell><cell>520308</cell><cell>23</cell><cell>4.42</cell></row><row><cell>Chiapas</cell><cell>Ocosingo</cell><cell>146696</cell><cell>2</cell><cell>1.36</cell></row><row><cell>Average</cell><cell></cell><cell></cell><cell></cell><cell>5.91</cell></row><row><cell cols="2">Standard deviation</cell><cell></cell><cell></cell><cell>2.23</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 Top Ten Municipalities Incidence Rates of Cervical-Uterine Cancer</head><label>2</label><figDesc></figDesc><table><row><cell>Key</cell><cell>State</cell><cell>Municipality</cell><cell>Population</cell><cell>Deaths</cell><cell>Rate</cell></row><row><cell>21019</cell><cell>Puebla</cell><cell>Atlixco</cell><cell>117111</cell><cell>15</cell><cell>12.80</cell></row><row><cell>16006</cell><cell>Michoacán</cell><cell>Apatzingán</cell><cell>117949</cell><cell>13</cell><cell>11.02</cell></row><row><cell>07089</cell><cell>Chiapas</cell><cell>Tapachula</cell><cell>271674</cell><cell>27</cell><cell>9.93</cell></row><row><cell>17006</cell><cell>Morelos</cell><cell>Cuautla</cell><cell>153329</cell><cell>14</cell><cell>9.13</cell></row><row><cell>28021</cell><cell>Tamaulipas</cell><cell>El Mante</cell><cell>112602</cell><cell>10</cell><cell>8.88</cell></row><row><cell>06007</cell><cell>Colima</cell><cell>Manzanillo</cell><cell>125143</cell><cell>11</cell><cell>8.78</cell></row><row><cell>30039</cell><cell>Veracruz-Llave</cell><cell>Coatzacoalcos</cell><cell>267212</cell><cell>23</cell><cell>8.60</cell></row><row><cell>18017</cell><cell>Nayarit</cell><cell>Tepic</cell><cell>305176</cell><cell>26</cell><cell>8.51</cell></row><row><cell>30108</cell><cell>Veracruz-Llave</cell><cell>Minatitlán</cell><cell>153001</cell><cell>13</cell><cell>8.49</cell></row><row><cell>30118</cell><cell>Veracruz-Llave</cell><cell>Orizaba</cell><cell>118593</cell><cell>10</cell><cell>8.43</cell></row><row><cell cols="2">General Mean</cell><cell></cell><cell></cell><cell></cell><cell>4.70</cell></row><row><cell cols="2">Standard Deviation</cell><cell></cell><cell></cell><cell></cell><cell>1.95</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgement. R. Boone expresses her gratitude to Ms. Rocío Pérez Osorno of INEGI, Puebla (a graduate of the Faculty of Computer Science, BUAP), for advice and support in plotting the results of this work with the IRIS GIS.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">Barrón</forename><surname>Vivanco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arandine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">J</forename><surname>Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Miranda</surname></persName>
		</author>
		<author>
			<persName><surname>Fátima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pazos</surname></persName>
		</author>
		<title level="m">XII Congreso de Investigación en Salud Pública, Aplicación de técnicas de minería de datos a bases de datos poblacionales de cáncer</title>
				<meeting><address><addrLine>CENIDET, México; Brasil; Abril</addrLine></address></meeting>
		<imprint>
			<publisher>Secretaría de Saúde</publisher>
			<date type="published" when="2007">2007</date>
		</imprint>
		<respStmt>
			<orgName>do Estado de Pernambuco</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Cluster analysis of multivariate data: Efficiency vs. Interpretability of classification</title>
		<author>
			<persName><forename type="first">E</forename><surname>Forgy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrics</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="768" to="780" />
			<date type="published" when="1965">1965</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Hernández-Orallo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Ramírez-Quintana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ferri-Ramírez</surname></persName>
		</author>
		<title level="m">Introducción a la Minería de Datos</title>
				<meeting><address><addrLine>Madrid</addrLine></address></meeting>
		<imprint>
			<publisher>Pearson Prentice Hall</publisher>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">El cáncer cérvico-uterino su impacto en México. Porqué no funciona el programa nacional de detección oportuna</title>
		<author>
			<persName><forename type="first">C</forename><surname>Hidalgo-Martínez Ana</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Revista Biomédica</title>
				<meeting><address><addrLine>Centro Nal; UADY; México</addrLine></address></meeting>
		<imprint>
			<publisher>De Investigaciones Regionales Dr. Hineyo Noguchi</publisher>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Constructing Overview+Detail Dendrogram Matrix Views</title>
		<author>
			<persName><forename type="first">Jin</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alan</forename><forename type="middle">M</forename><surname>MacEachren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Donna</forename><surname>Peuquet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Visualization &amp; Computer Graphics</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="889" to="896" />
			<date type="published" when="2009-12">Dec. 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Some Methods for Classification and Analysis of Multivariate Observations</title>
		<author>
			<persName><forename type="first">J</forename><surname>MacQueen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings Fifth Berkeley Symposium Mathematics Statistics and Probability</title>
				<meeting>Fifth Berkeley Symposium Mathematics Statistics and Probability<address><addrLine>Berkeley, CA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1967">1967</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="281" to="297" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Epidemiología del cáncer del cuello uterino</title>
		<author>
			<persName><forename type="first">M</forename><surname>Martínez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Javier</forename><surname>Francisco</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Medicina Universitaria</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">22</biblScope>
			<biblScope unit="page" from="39" to="46" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<ptr target="http://sigsalud.insp.mx/naais/" />
		<title level="m">NAIIS Instituto Nacional de Salud Pública, SCRIS, Mortalidad</title>
				<meeting><address><addrLine>Cuernavaca; Morelos, México</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Data Mining for Cancer Management in Egypt</title>
		<author>
			<persName><forename type="first">M</forename><surname>Nevine</surname></persName>
		</author>
		<author>
			<persName><surname>Labib</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Michael</surname></persName>
		</author>
		<author>
			<persName><surname>Malek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions on Engineering, Computing and Technology</title>
		<idno type="ISSN">1305-5313</idno>
		<imprint>
			<date type="published" when="2005-10">October 2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Estado Actual de las Tecnologías de Bodegas de Datos Espaciales</title>
		<author>
			<persName><forename type="first">-</forename><forename type="middle">C</forename><surname>Pérez</surname></persName>
		</author>
		<author>
			<persName><surname>Nelson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">O</forename><surname>Abril-Frade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Ing. E Investigación</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2007">2007</date>
			<publisher>Univ. Nal. De Colombia</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Improvement the Efficiency and Efficacy of the K-means Clustering Algorithm through a New Convergence Condition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pérez-O</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pazos R.</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cruz R.</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reyes S.</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computational Science and Its Applications - ICCSA 2007 - International Conference Proceedings</title>
				<imprint>
			<publisher>Springer Verlag</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Mejora al algoritmo de K-means mediante un nuevo criterio de convergencia y su aplicación a bases de datos poblacionales de cáncer</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pérez-O</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">F</forename><surname>Henriques</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pazos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Cruz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reyes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Salinas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mexicano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2do Taller Latino Iberoamericano de Investigación de Operaciones</title>
		<meeting><address><addrLine>México</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Research issues on K-means Algorithm: An Experimental Trial Using Matlab</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pérez-O</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rocío</forename><surname>Boone Rojas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">María</forename><forename type="middle">J</forename><surname>Somodevilla García</surname></persName>
		</author>
		<ptr target="http://ceur-ws.org/" />
	</analytic>
	<monogr>
		<title level="m">Advances on Semantic Web and New Technologies</title>
				<imprint>
			<biblScope unit="volume">534</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Cáncer cervical, una enfermedad de la pobreza: diferencias en la mortalidad por áreas urbanas y rurales en México</title>
		<author>
			<persName><forename type="first">G</forename><surname>Rangel-Gómez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Lazcano-Ponce</surname></persName>
		</author>
		<author>
			<persName><surname>Palacio-Mejía</surname></persName>
		</author>
		<ptr target="http://www.insp.mx/salud/index.html" />
	</analytic>
	<monogr>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Evaluation of SOVAT: An OLAP-GIS decision support system for community health assessment data analysis</title>
		<author>
			<persName><forename type="first">Matthew</forename><surname>Scotch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Parmanto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Monaco</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Medical Informatics and Decision Making</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A multi-source Information System for end-stage renal disease</title>
		<author>
			<persName><forename type="first">A</forename><surname>Simonet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Landais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Guillon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Comptes Rendus Biologies</title>
		<imprint>
			<biblScope unit="volume">325</biblScope>
			<biblScope unit="page">515</biblScope>
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Subgroup Discovery in Cervical Cancer Analysis Using Data Mining Techniques</title>
		<author>
			<persName><forename type="first">K</forename><surname>Thangavel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Jaganathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">O</forename><surname>Esmy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">AIML Journal</title>
		<imprint>
			<biblScope unit="issue">6</biblScope>
			<date type="published" when="2006-01">January, 2006</date>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, Periyar University; Department of Computer Science and Applications, Gandhigram Rural Institute-Deemed University, Gandhigram; Radiation Oncologist, Christian Fellowship Community Health Centre</orgName>
		</respStmt>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
