<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">NFDI4DSO: Towards a BFO Compliant Ontology for Data Science</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Genet</forename><forename type="middle">Asefa</forename><surname>Gesese</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Leibniz Institute for Information Infrastructure</orgName>
								<orgName type="institution">FIZ Karlsruhe</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Karlsruhe Institute of Technology</orgName>
								<address>
									<settlement>KIT</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jörg</forename><surname>Waitelonis</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Leibniz Institute for Information Infrastructure</orgName>
								<orgName type="institution">FIZ Karlsruhe</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Karlsruhe Institute of Technology</orgName>
								<address>
									<settlement>KIT</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Zongxiong</forename><surname>Chen</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Fraunhofer FOKUS</orgName>
								<address>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sonja</forename><surname>Schimmler</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Fraunhofer FOKUS</orgName>
								<address>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Harald</forename><surname>Sack</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Leibniz Institute for Information Infrastructure</orgName>
								<orgName type="institution">FIZ Karlsruhe</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Karlsruhe Institute of Technology</orgName>
								<address>
									<settlement>KIT</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">NFDI4DSO: Towards a BFO Compliant Ontology for Data Science</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">D806791A340A74E8122247EEBA0F04AA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:46+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Data Science</term>
					<term>Artificial Intelligence</term>
					<term>Ontology</term>
					<term>Knowledge Graph</term>
					<term>NFDI4DS</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The NFDI4DataScience (NFDI4DS) project aims to enhance the accessibility and interoperability of research data within Data Science (DS) and Artificial Intelligence (AI) by connecting digital artifacts and ensuring they adhere to FAIR (Findable, Accessible, Interoperable, and Reusable) principles. To this end, this poster introduces the NFDI4DS Ontology, which describes resources in DS and AI and models the structure of the NFDI4DS consortium. Built upon the NFDICore ontology and mapped to the Basic Formal Ontology (BFO), this ontology serves as the foundation for the NFDI4DS knowledge graph currently under development.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The German National Research Data Infrastructure (NFDI)<ref type="foot">1</ref> is a non-profit association founded to coordinate the activities for establishing a national research data infrastructure. It comprises 26 consortia spanning a wide range of scientific disciplines, from cultural sciences, social sciences, humanities, and engineering to life sciences and natural sciences. The NFDI consortia share common goals and concepts, such as their members, structure, data repositories, and services <ref type="bibr" target="#b0">[1]</ref>. To enhance interoperability across these consortia, the NFDICore ontology<ref type="foot">2</ref> has been developed. It acts as a mid-level ontology for representing metadata related to NFDI resources, including individuals, organizations, projects, data portals, and more. NFDICore provides mappings to a broad range of standards, such as the Basic Formal Ontology (BFO) <ref type="bibr" target="#b1">[2]</ref> and Schema.org <ref type="bibr" target="#b2">[3]</ref>, to advance knowledge representation, data exchange, and collaboration across diverse domains. <note place="foot">SEMANTiCS 2024: 20th International Conference on Semantic Systems, September 17-19, 2024, Amsterdam, The Netherlands. * Corresponding author. Email: genet-asefa.gesese@fiz-kalrsruhe.de (G. A. Gesese); Joerg.Waitelonis@fiz-Karlsruhe.de (J. Waitelonis); zongxiong.chen@fokus.fraunhofer.de (Z. Chen); sonja.schimmler@fokus.fraunhofer.de (S. Schimmler); harald.sack@fiz-kalrsruhe.de (H. Sack). Web: https://tinyurl.com/3cx37b9x (G. A. Gesese); https://shorturl.at/UwDND (J. Waitelonis); https://www.fokus.fraunhofer.de/009785fd54551039 (S. Schimmler); https://www.aifb.kit.edu/web/Harald_Sack (H. Sack). ORCID: 0000-0003-3807-7145 (G. A. Gesese); 0000-0001-7192-7143 (J. Waitelonis); 0000-0003-2452-0572 (Z. Chen); 0000-0002-8786-7250 (S. Schimmler); 0000-0001-7069-9804 (H. Sack).</note>
To address domain-specific research questions for each consortium, NFDICore follows a modular architecture. Examples of modular extensions include the NFDI4Culture ontology module CTO<ref type="foot" target="#foot_0">3</ref> <ref type="bibr" target="#b3">[4]</ref> and the NFDI-MatWerk ontology module MWO<ref type="foot" target="#foot_1">4</ref>, which are specifically designed for the cultural heritage and materials science domains, respectively. In this paper, we present NFDI4DSO, an ontology for the data science domain that is a domain-specific modular extension of NFDICore.</p><p>NFDI4DataScience (NFDI4DS)<ref type="foot" target="#foot_2">5</ref> is one of the NFDI consortia, and its project aims to enhance the accessibility and interoperability of research data in the domain of Data Science (DS) and Artificial Intelligence (AI). DS is a multidisciplinary field combining different aspects of mathematics, statistics, computer science, and domain-specific knowledge to extract meaningful insights from diverse data sources. DS and AI involve various artifacts, e.g., datasets, models, ontologies, code repositories, execution platforms, and other repositories. The project pursues this aim by linking digital artifacts and ensuring that they follow the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, thereby fostering collaboration across various DS and AI platforms. To this end, the NFDI4DS Ontology (NFDI4DSO) is built.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The NFDI4DataScience Ontology (NFDI4DSO)</head><p>As mentioned earlier, NFDI4DSO is created in a modular fashion, building upon NFDICore. Like NFDICore, NFDI4DSO is developed using a bottom-up, iterative, user-centered approach. NFDICore comprises 51 classes, 55 object properties, 8 data properties, 18 annotation properties, and 5 SWRL rules <ref type="bibr" target="#b4">[5]</ref> (for details, refer to the NFDICore documentation<ref type="foot" target="#foot_3">6</ref>). NFDI4DSO adds 42 classes, 38 object properties, 9 data properties, and 8 SWRL rules to those provided by NFDICore. The NFDI4DSO ontology not only describes various data science artifacts but also provides information about the resources of the NFDI4DS Consortium, such as personas, consortium members, spokespersons, and task area leads. As in NFDICore, the classes introduced in NFDI4DSO are mapped to the top-level ontology BFO and to other ontologies such as Schema.org, the FaBiO ontology <ref type="bibr" target="#b5">[6]</ref>, and the Conference Ontology<ref type="foot" target="#foot_4">7</ref>.</p><p>NFDI4DSO contains various kinds of classes, such as processes, roles, and independent continuants. For instance, Figure <ref type="figure" target="#fig_0">1</ref> depicts how NFDI4DSO represents the relationship between the independent continuant nfdi4dso:SonjaSchimmler and her spokesperson role nfdi4dso:SpokespersonRole by mapping it to BFO. By using roles and processes, NFDI4DSO enables a detailed representation of the relationships between different entities, which increases the ontology's expressivity. On the other hand, to support easier integration and the use of less complex relations, NFDI4DSO also introduces easy-to-use direct shortcut properties, which can be expanded to fully fledged BFO-compliant complex path expressions. 
For instance, in Figure <ref type="figure" target="#fig_0">1</ref>, the shortcut relation nfdi4dso:spokesperson is provided, and its corresponding SWRL<ref type="foot" target="#foot_5">8</ref> rule is given below.</p><p>Person(?p) ∧ Consortium(?c) ∧ SpokespersonRole(?sr) ∧ Leading(?l) ∧ participates in(?p, ?l) ∧ participates in(?c, ?l) ∧ has role(?p, ?sr) ∧ realised in(?sr, ?l) → spokesperson(?c, ?p)</p><p>Ontology Implementation. The Protégé ontology editor<ref type="foot" target="#foot_6">9</ref> for the OWL-based formalization of terminological knowledge has been used to develop and implement NFDI4DSO. Widoco<ref type="foot" target="#foot_7">10</ref> has been used to automatically create enriched and customized documentation of the ontology. The stable version of the ontology, NFDI4DSO v1.0.0, is available at https://github.com/ISE-FIZKarlsruhe/NFDI4DS-Ontology/tree/main, and the latest development version is at https://github.com/ISE-FIZKarlsruhe/NFDI4DS-Ontology/tree/develop-1.0.1.</p></div>
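<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate how such a shortcut rule behaves, the following minimal Python sketch applies the spokesperson rule from Section 2 to a small set of facts. It is not part of NFDI4DSO or its tooling: the tuple-based triple representation, the function name apply_spokesperson_rule, and the ex: individuals are assumptions made for illustration. Only the class and property labels and the individual nfdi4dso:SonjaSchimmler are taken from the text.</p>
<p><![CDATA[
```python
from itertools import product

# Facts are (subject, predicate, object) tuples; this encoding is an
# illustrative assumption, not the ontology's own serialization.
TYPE = "rdf:type"


def apply_spokesperson_rule(triples):
    """Return the spokesperson(?c, ?p) facts implied by the shortcut rule."""
    facts = set(triples)

    def instances(cls):
        # All individuals asserted to be members of the given class.
        return {s for (s, p, o) in facts if p == TYPE and o == cls}

    inferred = set()
    # Try every binding of ?p, ?c, ?sr, ?l that satisfies the class atoms.
    for person, consortium, role, leading in product(
        instances("Person"),
        instances("Consortium"),
        instances("SpokespersonRole"),
        instances("Leading"),
    ):
        body = [
            (person, "participates in", leading),
            (consortium, "participates in", leading),
            (person, "has role", role),
            (role, "realised in", leading),
        ]
        # The head is derived only if every property atom of the body holds.
        if all(atom in facts for atom in body):
            inferred.add((consortium, "spokesperson", person))
    return inferred


# Hypothetical example data: the ex: individuals are invented for this sketch.
facts = {
    ("nfdi4dso:SonjaSchimmler", TYPE, "Person"),
    ("ex:Consortium1", TYPE, "Consortium"),
    ("ex:role1", TYPE, "SpokespersonRole"),
    ("ex:leading1", TYPE, "Leading"),
    ("nfdi4dso:SonjaSchimmler", "participates in", "ex:leading1"),
    ("ex:Consortium1", "participates in", "ex:leading1"),
    ("nfdi4dso:SonjaSchimmler", "has role", "ex:role1"),
    ("ex:role1", "realised in", "ex:leading1"),
}

print(apply_spokesperson_rule(facts))
```
]]></p>
<p>In this sketch, the four property assertions form the fully expanded path through the role and the process, and the derived spokesperson fact is the direct shortcut. Removing any one of the four assertions leaves the rule body unsatisfied, so no shortcut is derived.</p></div>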
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">NFDI4DSO in Use</head><p>NFDI4DSO is designed to form the foundation of the NFDI4DS Knowledge Graph (NFDI4DS-KG), which is currently under development. The NFDI4DS-KG consists of two main components: the Research Information Graph (RIG) and the Research Data Graph (RDG). The RIG includes metadata about the NFDI4DS consortium's resources, persons, and organizations, while the RDG encompasses content-related index data from the consortium's heterogeneous data sources. The RIG serves as the backend for the NFDI4DS web portal, facilitating interactive access to and management of this data. Both the RIG and the RDG will be accessible and searchable via the NFDI4DS Registry platform. Additionally, the NFDI4DS consortium plans to collaborate with other NFDI consortia to further integrate domain-specific knowledge into the RDG. Currently, the first version of the NFDI4DS-KG<ref type="foot" target="#foot_8">11</ref>, which contains the RIG, is publicly available. For example, to view the list of co-spokespersons of the NFDI4DS Consortium, you can either navigate through the data using SHMARQL<ref type="foot">12</ref>, as depicted in Figure <ref type="figure" target="#fig_1">2</ref>, or query it using SPARQL, as shown in Figure <ref type="figure" target="#fig_2">3</ref>.</p></div>
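<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough sketch of how a SPARQL request against the public endpoint could be prepared, the Python fragment below builds a GET URL that carries a query over the shortcut property nfdi4dso:spokesperson. The query is not the one shown in Figure 3, and the namespace IRI bound to the nfdi4dso prefix is a placeholder that must be replaced with the IRI given in the ontology documentation. The helper name build_request_url is likewise an assumption. The sketch only constructs the URL and does not contact the endpoint.</p>
<p><![CDATA[
```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint named in the text (footnote 11 and the caption of Figure 3).
ENDPOINT = "https://nfdi.fiz-karlsruhe.de/4ds/sparql"

# ASSUMPTION: placeholder namespace IRI, not the real NFDI4DSO namespace.
NFDI4DSO_NS = "https://example.org/nfdi4dso#"

# Illustrative query over the shortcut property named in Section 2.
QUERY = f"""
PREFIX nfdi4dso: <{NFDI4DSO_NS}>
SELECT ?consortium ?person
WHERE {{
  ?consortium nfdi4dso:spokesperson ?person .
}}
"""


def build_request_url(endpoint, query):
    """Encode a query as a GET URL using the 'query' parameter of the SPARQL 1.1 Protocol."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```
]]></p>
<p>The resulting URL could then be fetched with any HTTP client, typically with an Accept header requesting a SPARQL result format. Whether the endpoint accepts GET requests in this form is assumed here rather than verified.</p></div>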
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion and Future Work</head><p>This paper presents the NFDI4DS Ontology and its use for the NFDI4DS-KG, which is currently under development. The ontology facilitates the representation and interoperability of data science artifacts within and outside of NFDI4DS. NFDI4DSO is built on top of the NFDICore ontology and mapped to BFO and other ontologies. In the future, we plan to evaluate the ontology extensively using competency questions based on the persona definitions from the NFDI4DS consortium.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Example of representing roles, where the prefixes ro and obi denote the http://purl.obolibrary.org/obo/ro.owl and http://purl.obolibrary.org/obo/obi.owl ontologies, respectively.</figDesc><graphic coords="3,89.29,84.19,430.80,224.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: A screenshot of part of the SHMARQL interface with the list of NFDI4DS co-spokespersons (refer to https://shorturl.at/eNb5e to navigate it fully.)</figDesc><graphic coords="4,124.91,84.19,345.45,128.10" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: An example SPARQL query that returns a list of the co-spokespersons of the NFDI4DS Consortium. (It is possible to query it live at https://nfdi.fiz-karlsruhe.de/4ds/sparql.)</figDesc><graphic coords="5,128.04,84.18,339.20,307.60" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">https://gitlab.rlp.net/adwmainz/nfdi4culture/knowledge-graph/culture-ontology</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">https://git.rwth-aachen.de/nfdi-matwerk/ta-oms/mwo</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">https://www.nfdi4datascience.de/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">https://ise-fizkarlsruhe.github.io/nfdicore/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_4">http://www.scholarlydata.org/ontology/doc/#toc</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_5">https://ise-fizkarlsruhe.github.io/NFDI4DS-Ontology/#d4e7620</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_6">https://protege.stanford.edu/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_7">https://github.com/dgarijo/Widoco</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_8">https://nfdi.fiz-karlsruhe.de/4ds/sparql, https://nfdi.fiz-karlsruhe.de/4ds/shmarql</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This publication was written by the NFDI consortium NFDI4DataScience in the context of the work of the association German National Research Data Infrastructure (NFDI) e.V. NFDI is financed by the Federal Republic of Germany and the 16 federal states and funded by the Federal Ministry of Education and Research (BMBF) - funding code M532701 / the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - project number NFDI4DataScience (460234259).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schrade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bruns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Posthumus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tietz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Norouzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Waitelonis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Fliegl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Söhn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tolksdorf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Steller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Azócar Guzmán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Fathalla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zainul Ihsan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Hofmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sandfeld</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Fritzen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Laadhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schimmler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mutschke</surname></persName>
		</author>
		<title level="m">Knowledge Graph Based RDM Solutions: NFDI4Culture-NFDI-MatWerk-NFDI4DataScience</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>1st Conf. on Research Data Infrastructure</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">BFO: Basic formal ontology</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Otte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Beverley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ruttenberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Ontology</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Schema.org: evolution of structured data on the web</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">V</forename><surname>Guha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Brickley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Macbeth</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>Communications of the ACM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Tietz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Bruns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Söhn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tolksdorf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Posthumus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Steller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Fliegl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Norouzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Waitelonis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schrade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sack</surname></persName>
		</author>
		<title level="m">From Floppy Disks to 5-Star LOD: FAIR Research Infrastructure for NFDI4Culture</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>DaMaLOS, co-located with ESWC</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">F</forename><surname>Patel-Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Boley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tabet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Grosof</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dean</surname></persName>
		</author>
		<title level="m">SWRL: A semantic web rule language combining OWL and RuleML</title>
				<imprint>
			<publisher>W3C Member submission</publisher>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">FaBiO and CiTO: Ontologies for describing bibliographic resources and citations</title>
		<author>
			<persName><forename type="first">S</forename><surname>Peroni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Shotton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
