<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">OntoBio: Designing New Features to Improve Modeling and Implementation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Andréa</forename><surname>Corrêa</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Flôres</forename><surname>Albuquerque</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Instituto de Ciência da Computação (IComp)</orgName>
								<orgName type="institution">Universidade Federal do Amazonas (UFAM)</orgName>
								<address>
									<settlement>Manaus-AM</settlement>
									<country key="BR">Brasil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">José</forename><forename type="middle">Laurindo</forename><surname>Campos Dos Santos</surname></persName>
							<affiliation key="aff1">
								<orgName type="laboratory">Laboratório de Interoperabilidade Semântica (LIS)</orgName>
								<orgName type="institution">Instituto Nacional de Pesquisas da Amazônia (INPA)</orgName>
								<address>
									<settlement>Manaus-AM</settlement>
									<country key="BR">Brasil</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Alberto</forename><surname>Nogueira De Castro Júnior</surname></persName>
							<email>alberto@icomp.ufam.edu.br</email>
							<affiliation key="aff0">
								<orgName type="department">Instituto de Ciência da Computação (IComp)</orgName>
								<orgName type="institution">Universidade Federal do Amazonas (UFAM)</orgName>
								<address>
									<settlement>Manaus-AM</settlement>
									<country key="BR">Brasil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">OntoBio: Designing New Features to Improve Modeling and Implementation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D1B419EB47F14B95C3684F0FDEF59F46</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T19:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>OntoBio is a formal ontology developed in the context of biological collections and field data collection of biotic entities. Given the complex and dynamic nature of biodiversity data and information, error-prone modeling and implementation decisions can occur. This paper presents OntoBio's limitations regarding conceptualization and implementation, together with new features, in order to provide recommendations for OntoBio's evolution. It emphasizes several aspects that must be considered when designing a new version of the ontology.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Current research on data integration has focused on semantics, aiming to mitigate conflicts between heterogeneous data sources rather than to design the structure of an integration architecture.</p><p>One strategy adopted to deal with such problems is the use of integrative elements, such as ontologies, to manage and eliminate semantic conflicts. In the scope of biodiversity data and information, ontologies can be a valuable resource for strategic planning and can contribute to conservation <ref type="bibr" target="#b0">[Albuquerque et al., 2015]</ref>. Demand for this data is growing markedly in several applications, such as environmental impact assessment, definition of environmental preservation areas, protection of endangered species, land reclamation, bio-prospecting, public policy making and environmental legislation, among others.</p><p>Due to the wide-ranging characteristics of the data and the diverse profiles of experts, there is still much work to be done in the specification of ontologies for this domain. This is one of the reasons why the integration of biodiversity data and ecological studies is not trivial. Interoperability solutions are needed for research in this field.</p><p>In this context, OntoBio, a formal ontology applied to biodiversity data, has provided important results with already validated technology for the adoption of formal ontologies for knowledge acquisition and integration in the biodiversity field. OntoBio was developed in a research initiative involving IComp/UFAM and INPA's Biological Collection Program. It was modeled conceptually in the OntoUML language <ref type="bibr" target="#b3">[Guizzardi 2005]</ref> and developed with the SABIO method <ref type="bibr" target="#b2">[Falbo et al. 1998]</ref>.</p><p>OntoBio is divided into five sub-ontologies, connected by relationships between concepts and axioms: collection<ref type="foot" target="#foot_0">1</ref>; material entity, which is composed of biotic entity and abiotic entity; spatial location; ecosystem; and environment <ref type="bibr" target="#b0">[Albuquerque et al., 2015]</ref>.</p><p>Given the complex and dynamic nature of the biodiversity domain, extensions and evolution of the ontology are expected, according to the views of experts. Requirements elicited from INPA researchers guided the identification of new entities, categorizations, relationships and some new sub-ontologies. During the development of OntoBio, much of the experts' knowledge that was not present in the structured databases supporting the ontology was not represented, and was thus lost. Empirical evidence indicated that this knowledge could become essential for adding semantic expressiveness to ontologies. A conceptual framework was proposed to aggregate scientific tacit knowledge into ontologies <ref type="bibr" target="#b1">[Albuquerque et al., 2016]</ref>. The new version of OntoBio incorporates more semantics into the model and provides features that allow its use in more complex applications (taxonomic classification).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">OntoBio's New Features</head><p>The Alloy language has been used to evaluate graphical models, aiding the professionals who build them <ref type="bibr" target="#b4">[Jackson, 2002]</ref>. OntoUML is a well-founded language for building ontologies. The existence of algorithms that translate models developed in this language into Alloy specifications helped the validation of OntoBio.</p><p>Due to the complexity of OntoBio and the size restriction (number of modeled classes) imposed by Alloy, the ontology was validated in a segmented way: the strongly connected sub-ontologies were validated first, followed by the intersections of these sub-models. The validation identified recurrent error-prone modeling decisions, which were presented in <ref type="bibr" target="#b5">[Sales 2012]</ref>.</p><p>In addition to the validation and the suggested improvements found in <ref type="bibr" target="#b5">[Sales 2012]</ref>, some conceptual modeling aspects were considered:</p><p>• Collection. This sub-ontology can be split into two sub-ontologies: Acquisition and Research Institution;</p><p>• Acquisition. A Collection is defined as the acquisition of an organism, whether animal, plant, fungal or microbial. In Acquisition, the Collection entity is called Expedition, which is one of the ways of acquiring specimens. Other forms that must be considered are Purchase, Donation, Legacy and Exchange. The collections performed by an Expedition must follow specific collection protocols.</p><p>• Research Institution. Currently, the Research Institution has a broader representation, into which additional features can be incorporated. For official Brazilian institutions, a biological collection comprises properly treated biological material, maintained and documented in accordance with norms and standards to ensure the safety, accessibility, quality, longevity, integrity and interoperability of the collection's data, and belonging to the scientific institution in order to support scientific or technological research and ex situ conservation.</p><p>• Ecosystem. It can be absorbed by the Environment sub-ontology, as can the phytophysiognomy, vegetation and climatic region modules of the Spatial Location sub-ontology.</p><p>• Environment. New specializations of micro and macro environment must be added to the Environment sub-ontology.</p><p>• Material Entity. In the first version of OntoBio, this sub-ontology captured the taxonomic ranks of family, genus and species. The complete taxonomic classification of the organism is required, which leads to the creation of the Taxonomic Classification sub-ontology for this purpose. Food habits and maturity stage can be included in this sub-ontology.</p><p>• Taxonomic Classification. This sub-ontology would allow OntoBio to capture the detailed taxonomic structure of any specimen. It must follow the latest change to the international botanical nomenclature, which accepts phylum and division as the same taxonomic level.</p></div>
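<div xmlns="http://www.tei-c.org/ns/1.0"><p>The requirement that phylum and division be treated as the same taxonomic level can be illustrated with a minimal Python sketch. It is not part of OntoBio: the names Rank, build_classification and lineage are hypothetical, the list of ranks is a simplified assumption, and the sample taxa are illustrative values only.</p>
<p><code lang="python" xml:space="preserve">
```python
from enum import IntEnum


class Rank(IntEnum):
    """Simplified taxonomic ranks, ordered from most general to most specific."""
    KINGDOM = 1
    PHYLUM = 2
    DIVISION = 2  # same value as PHYLUM, so Python treats it as an alias
    CLASS = 3
    ORDER = 4
    FAMILY = 5
    GENUS = 6
    SPECIES = 7


def build_classification(names):
    """Map rank names (e.g. "division") to Rank members, rejecting duplicates."""
    classification = {}
    for rank_name, taxon in names.items():
        rank = Rank[rank_name.upper()]  # "DIVISION" resolves to Rank.PHYLUM
        if rank in classification:
            raise ValueError(f"rank {rank.name} given twice")
        classification[rank] = taxon
    return classification


def lineage(classification):
    """Return the taxa sorted from the most general rank to the most specific."""
    return [classification[rank] for rank in sorted(classification)]


# Illustrative values only; ranks that are unknown are simply left out.
brazil_nut = build_classification({
    "kingdom": "Plantae",
    "division": "Magnoliophyta",
    "family": "Lecythidaceae",
    "genus": "Bertholletia",
    "species": "Bertholletia excelsa",
})
print(lineage(brazil_nut))
```
</code></p>
<p>In this sketch a botanical record that supplies a division and a zoological record that supplies a phylum end up under the same key, and supplying both for one organism is rejected as a duplicate.</p></div>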
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">OntoBio's Evolution Based on Tacit Knowledge Through a Conceptual Framework</head><p>In general, tacit knowledge modeling is not considered part of the formal scientific research life cycle, but it can inspire hypotheses that lead to a scientific view of knowledge.</p><p>When modeled and made available, knowledge (implicit-explicit) becomes essential in the process of generating new knowledge. There are still open questions related to the representation, modeling, formalization and integration of tacit knowledge. A conceptual framework can be used to integrate specialists' mental models, aiming to map semantic components of attachable structures to formal ontologies. It also explores semantic annotation for dissemination and reuse. The framework adds semantic expressiveness to formal ontologies and uses OntoBio as its object of study. It guides the management of scientific tacit knowledge by presenting different levels of representation and by retaining the knowledge needed to answer questions that OntoBio cannot currently answer.</p><p>Applying the conceptual framework for integrating scientific tacit knowledge to OntoBio <ref type="bibr" target="#b1">[Albuquerque et al., 2016]</ref> suggested several changes, each associated with elicited Mental Models (MMs). Among them:</p><p>• Create a formal relation (has) between Biotic Entity and Organ (MM14);</p><p>• Create a formal self-relation (occurs) between Biotic Entity (1..*) and Biotic Entity (1..*). This means that an organism's occurrence is subject to the occurrence of another organism (MM14).</p><p>The original OntoBio and a trial version of OntoBio with some of these changes can be found at portal.inpa.gov.br/ctin/lis/ontobio/. More details of the framework and the formalization files used to apply it to generate OntoBio's recommendations for evolution can be found at portal.inpa.gov.br/ctin/lis/frameworkconceitual/.</p></div>
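<div xmlns="http://www.tei-c.org/ns/1.0"><p>The recommended formal relations and their 1..* cardinalities can be sketched in Python as follows. This is only an illustration under stated assumptions: the class FormalRelation, its methods and the individual names (epiphyte_a, host_tree_b, pollinator_c) are hypothetical and do not come from OntoBio, and the minimum cardinality is checked in one direction only, which is a simplification.</p>
<p><code lang="python" xml:space="preserve">
```python
from collections import defaultdict


class FormalRelation:
    """A named binary relation with a minimum cardinality on the target end."""

    def __init__(self, name, min_targets=1):
        self.name = name
        self.min_targets = min_targets
        self._targets = defaultdict(set)

    def link(self, source, target):
        """Record that source is related to target."""
        self._targets[source].add(target)

    def targets(self, source):
        """Individuals that source is related to."""
        return set(self._targets.get(source, set()))

    def sources(self, target):
        """Individuals that are related to target."""
        return {s for s, ts in self._targets.items() if target in ts}

    def violations(self, individuals):
        """Individuals with fewer targets than the minimum cardinality requires."""
        return sorted(i for i in individuals
                      if self.min_targets > len(self.targets(i)))


# Self-relation "occurs" between Biotic Entity (1..*) and Biotic Entity (1..*).
occurs = FormalRelation("occurs")
occurs.link("epiphyte_a", "host_tree_b")
occurs.link("host_tree_b", "pollinator_c")

# pollinator_c has no "occurs" link yet, so it is reported as a violation.
print(occurs.violations(["epiphyte_a", "host_tree_b", "pollinator_c"]))
```
</code></p>
<p>The same hypothetical class could hold the (has) relation between Biotic Entity and Organ. Passing min_targets=0 would model an optional end.</p></div>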
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Implementational Issues</head><p>OntoBio is developed using a sequence of tools in order to obtain better resulting code. The ontological schema must be designed in a tool with graphical support for UML, such as Sparx Systems Enterprise Architect<ref type="foot">3</ref> (EA). EA is used to design OntoBio's ontological schema using OntoUML primitives. Once the ontological schema is complete in the .eap file format, it can be exported to the .xmi file format to be used in the OntoUML Lightweight Editor (OLED)<ref type="foot">4</ref>, whose current version is named Menthor Editor<ref type="foot">5</ref>.</p><p>The ontological schema in .xmi format must be imported into Menthor and can then be converted to the .owl file format.</p><p>The final implementation phase is to use the .owl file in an OWL editor such as Protégé. The editor manipulates OWL ontologies and also provides a list of inference tools for testing the logical consistency of the ontology. Some implementation issues emerged during OntoBio's evaluation and are regarded as limitations of ontology modeling languages in the specific domain of biodiversity:</p><p>• OntoUML does not support a zero (Ø) lower bound in cardinalities (Ø..1, Ø..N). This limitation of cardinality representation justifies the adoption of a taxonomic representation with three ranks in the original OntoBio: family, genus and species. Every species is associated with a family, a genus and a species;</p><p>• OntoUML does not support higher-order types, which are essential for taxonomic classification. It supports only kind, which does not model these concepts appropriately;</p><p>• OntoUML does not allow modeling a sub-collection of a sub-collection. For example, states are sub-collections of countries, and cities are sub-collections of states;</p><p>• There are inconsistencies in the .owl file generated from the .eap file. Although Menthor allows the automatic generation of OWL code from the designed ontological schema, it is important to remember that an analysis-level language for designing ontologies, such as OntoUML, has more expressive power than an implementation-level ontology language, such as OWL. Thus, automatically generated OWL code does not faithfully reflect the reality modeled. Adjustments are required to maintain the integrity of what has been modeled, which justifies the use of Protégé. This is a recurring issue in the development of ontologies that still requires additional research and well-elaborated solutions;</p><p>• OntoUML is based on UML and OWL is based on set theory. This implies that there is no direct mapping between these ontology languages. Some OntoUML definitions may be missing in the OWL mapping at the application level;</p><p>• Protégé does not support powertypes, which can be used in OntoUML.</p><p>3 http://www.sparxsystems.com.au/products/ea/ 4 https://github.com/nemo-ufes/ontouml-lightweight-editor 5 http://www.menthor.net/menthor-editor.html</p></div>
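<div xmlns="http://www.tei-c.org/ns/1.0"><p>Because the generated .owl file may omit concepts that exist in the OntoUML schema, a simple check before manual adjustment in Protégé can help locate the gaps. The sketch below is an assumption-laden illustration, not part of the tool chain described above: it assumes the generated file uses the RDF/XML serialization with owl:Class elements identified by rdf:about, and the function names declared_classes and missing_concepts are hypothetical.</p>
<p><code lang="python" xml:space="preserve">
```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def declared_classes(owl_xml):
    """Return the local names of all owl:Class elements in an RDF/XML string."""
    root = ET.fromstring(owl_xml)
    names = set()
    for element in root.iter(f"{{{OWL_NS}}}Class"):
        iri = element.get(f"{{{RDF_NS}}}about")
        if iri:
            # Keep only the fragment after "#", e.g. "BioticEntity".
            names.add(iri.rsplit("#", 1)[-1])
    return names


def missing_concepts(owl_xml, expected):
    """List expected concept names that the generated OWL does not declare."""
    return sorted(set(expected) - declared_classes(owl_xml))
```
</code></p>
<p>Given the text of a generated file and a list of concept names taken from the OntoUML schema, missing_concepts returns the names that would need to be added or corrected by hand. It checks class declarations only; relations, cardinalities and stereotypes would need separate checks.</p></div>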
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>This research revealed some conceptual misunderstandings and ontology language limitations. These issues must be dealt with according to the domain, allowing the ontology to evolve in its resources. Despite its limitations, OntoUML is a highly expressive formal ontology modeling language that carries a lower risk of losing semantic expressiveness than other ontology modeling languages. It produces logically and ontologically consistent models, but to do so it is necessary to: 1) understand the meaning of each stereotype in OntoUML in order to use the appropriate meta-category for concepts; and 2) validate the modeled ontology by checking all the model's possibilities. A complex domain such as biodiversity facilitates the identification of limitations in OntoUML and, as a result, generates demands for improvements in the language. When these bottlenecks are solved, the ontology engineer will benefit from more resources for modeling.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Changes to OntoBio recommended by the application of the conceptual framework, each associated with the elicited Mental Models (MMs).</figDesc><table><row><cell>• Create a formal relation (can have) between Biotic Entity (1..*) and Popular Name (1..*). This means that a Biotic Entity can be associated with multiple Popular Names and that a Popular Name can be associated with more than one Biotic Entity (MMs 1 to 10);</cell></row><row><cell>• Specialize Macro Environment/Aquatic into Macrophyte Bank (MM1), Soaked Trunk (MM2), Well and Bench of Submerged Leaves (MM5), Submerged Branches (MM6), Inland Water Transition Zone (MM9), Leaves Bunch (MM10) and Mainland Igarapé<ref type="foot" target="#foot_1">2</ref> (MM12);</cell></row><row><cell>• Specialize Collection Method into Bait (MMs 3, 13a, 13b) and Hand Net (MMs 7a, 7b, 9);</cell></row></table><note>• Create a formal relation (feeds on) between Material Entity (1..*) and Biotic Entity (1..*). This means that a Biotic Entity can eat multiple Material Entities and that a Material Entity can be the food of more than one Biotic Entity (MMs 3, 4, 13a, 13b); • Create a formal relation between Environment (1..*) and Collection Method (1..*). This means that an Environment can be associated with more than one Collection Method and that a Collection Method can be used in more than one Environment, depending on the Biotic Entity that is going to be collected (MM8); • Create a formal relation between Biotic Entity (1..*) and Habitat (1..*). This means that a Biotic Entity can have multiple Habitats and that a Habitat can be used by more than one Biotic Entity (MMs 11a, 11b, 11c); • Create a component-of relation (composed by) between Habitat (1..*) and Environment (1..*). This means that a Habitat is composed of multiple Environments and that an Environment can be part of more than one Habitat (MMs 11a, 11b, 11c); • Create a new concept Organ and instantiate it with Flower (MM14);</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Collection here means the act of collecting material entities in an environment.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">Small body of water, generally a tributary river or a canal. It's a word used by indigenous Tupi tribes when referring to a small strait or canal between two islands, or between an island and the mainland. Igarapés can only give way to small vessels (such as canoes, hence its Tupi denomination), as they are shallow, and ordinarily have very dark waters, being located deep within wealds or Amazonian thickets or forests.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>We would like to thank GSI-ICOMP-UFAM, LIS-INPA, FAPEAM (Foundation for the State of Amazonas Research), Grant Number 021/2011 062.03101 / 2012-DO and CNPq (National Council for Scientific and Technological Development) Grant Number 486333 / 2011-6 for partially funding this research.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">OntoBio: A Biodiversity Domain Ontology for Amazonian</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C F</forename><surname>Albuquerque</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L C</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Castro</surname><genName>Jr</genName></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 48th Hawaii International Conference on System Sciences</title>
				<meeting>48th Hawaii International Conference on System Sciences<address><addrLine>Kauai, Hawaii</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015-01-05">January 5-8, 2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A Conceptual Framework to Integrate Scientific Tacit Knowledge</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C F</forename><surname>Albuquerque</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L C</forename><surname>Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Castro</surname><genName>Jr</genName></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of SAI Intelligent Systems Conference, IntelliSys 2016</title>
				<meeting>SAI Intelligent Systems Conference, IntelliSys 2016<address><addrLine>London, UK</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2016-09-21">September 21-22, 2016</date>
		</imprint>
	</monogr>
	<note>To appear</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A Systematic Approach for Building Ontologies</title>
		<author>
			<persName><forename type="first">R</forename><surname>Falbo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IBERAMIA&apos;98 (Proceedings of the 6th Ibero-American Conference on AI)</title>
		<title level="s">Lecture Notes in Artificial Intelligence</title>
		<editor>
			<persName><forename type="first">H</forename><surname>Coelho</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg, Lisbon, Portugal</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="volume">1484</biblScope>
			<biblScope unit="page" from="349" to="360" />
		</imprint>
	</monogr>
	<note>Artificial Intelligence</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Ontological Foundations for Structural Conceptual Models</title>
		<author>
			<persName><forename type="first">G</forename><surname>Guizzardi</surname></persName>
		</author>
		<idno>ISSN 1381-3617</idno>
		<imprint>
			<date type="published" when="2005">2005</date>
			<publisher>Holanda</publisher>
			<biblScope unit="volume">15</biblScope>
		</imprint>
		<respStmt>
			<orgName>CUM LAUDE), University of Twente, The Netherlands</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">CTIT PhD-thesis</note>
	<note>Published as the same name book in Telematica Institut Fundamental Research. Series No. 05-74</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Alloy: A Lightweight Object Modelling Notation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jackson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions on Software Engineering and Methodology</title>
		<imprint>
			<biblScope unit="page">11</biblScope>
			<date type="published" when="2002">2002</date>
			<publisher>TOSEM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Identificação de Padrões de Erro em Modelagem Conceitual Por Meio de Validação de Ontologias OntoUML Utilizando ALLOY</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">P</forename><surname>Sales</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
		<respStmt>
			<orgName>Universidade Federal do Espírito Santo</orgName>
		</respStmt>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
