<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CosmOntology: Creating an Ontology of the Cosmos</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vasilis Efthymiou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FORTH-ICS</institution>
          ,
          <addr-line>N. Plastira 100, GR 70013, Heraklion, Crete</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>Today, we are gathering more and more new astrophysics knowledge, but we are using obsolete ways of representing and processing it. This position paper discusses potential ways of constructing an astronomical KG, semantically annotating and reasoning on such data, using neuro-symbolic methods. AI is already revolutionizing knowledge acquisition and management, allowing computers to understand and process resources that are otherwise consumable only by humans. A common requirement for such technologies to operate successfully is a machine-readable conceptual modelling of the domain of interest. Having such a model enables, in turn, reasoning on the data to infer new knowledge. Typically, domain-specific models (e.g., SNOMED in healthcare, GeneOntology in genomics) are stored in knowledge graphs (KGs), formatted as ontologies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>2. Managing Astrophysics Data with CosmOntology</title>
      <p>
In recent years, a plethora of works have tried to exploit the potential of tables
available on the Web for a multitude of applications, ranging from knowledge base augmentation
to question answering, schema linking, and data integration [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ]. The first steps towards
constructing a rich KG (an ontology) of the cosmos can be the extraction [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], semantic
annotation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], partial KG creation [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] and unification [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] of knowledge from tables
found in scientific publications (e.g., astro-ph within arXiv), as well as in online catalogues.
      </p>
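      <p>As a minimal sketch of the partial KG creation step described above, the following Python snippet turns the rows of a small table into subject-predicate-object triples. The <monospace>cosmo:</monospace> prefix, the column-to-property mapping, and the two sample rows are assumptions made for illustration only; in practice the mapping would come from a semantic annotation step and the table from an extraction step.</p>
      <preformat>
```python
import csv
import io

# Hypothetical namespace prefix and column-to-property mapping; a real system
# would obtain this mapping from a semantic table annotation step.
PREFIX = "cosmo:"
COLUMN_TO_PROPERTY = {"type": "rdf:type", "constellation": PREFIX + "inConstellation"}

# Tiny illustrative table standing in for one extracted from a paper or catalogue.
TABLE = """name,type,constellation
M31,Galaxy,Andromeda
Sirius,Star,Canis Major
"""


def to_identifier(label):
    """Turn a cell value into a prefixed identifier (spaces become underscores)."""
    return PREFIX + label.strip().replace(" ", "_")


def table_to_triples(text, subject_column="name"):
    """Emit one (subject, predicate, object) triple per annotated non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = to_identifier(row[subject_column])
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = row.get(column, "")
            if value:
                triples.append((subject, prop, to_identifier(value)))
    return triples


triples = table_to_triples(TABLE)
for triple in triples:
    print(triple)
```
      </preformat>
      <p>Each table yields only a small, partial graph; unifying many such graphs is a separate step.</p>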
      <p>
        In more detail, a human-in-the-loop approach is probably more promising for the generation
and curation of CosmOntology: a board of experts handles requests to update the ontology, and
such requests are made both by algorithms and by humans. Many shallow ontologies can be
generated initially from structured data and later enriched by combining them into one [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], through ontology matching [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and
a consensus-based ontology curation platform (e.g., based on [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]).
      </p>
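      <p>The workflow above can be sketched roughly as follows, assuming that each shallow ontology is just a child-to-parent map and that simple label normalisation stands in for a real ontology matcher. The catalogue names and class labels are hypothetical. Conflicts are not resolved automatically; they are collected as update requests for the board of experts.</p>
      <preformat>
```python
def normalize(label):
    """Crude label normalisation, standing in for a real ontology matcher."""
    return label.lower().replace("_", " ").strip()


def merge_shallow_ontologies(ontologies):
    """Merge child-to-parent maps; return (merged, requests_for_expert_review)."""
    merged = {}
    review_requests = []
    for source, ontology in ontologies.items():
        for child, parent in ontology.items():
            key, value = normalize(child), normalize(parent)
            if key not in merged:
                merged[key] = value
            elif merged[key] != value:
                # Conflicting parents are not resolved automatically:
                # they become update requests for the board of experts.
                review_requests.append((key, merged[key], value, source))
    return merged, review_requests


# Hypothetical shallow ontologies, e.g. derived from two different catalogues.
shallow = {
    "catalogue_a": {"Spiral_Galaxy": "Galaxy", "Pulsar": "Neutron_Star"},
    "catalogue_b": {
        "spiral galaxy": "Galaxy",
        "Pulsar": "Star",
        "Quasar": "Active_Galactic_Nucleus",
    },
}

merged, review_requests = merge_shallow_ontologies(shallow)
print(merged)
print(review_requests)
```
      </preformat>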
      <p>
        A recent work [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], which exploits textual information from AI publications with
state-of-the-art NLP tools, has shown very promising results. A combination of such tools that process
textual, tabular, and image data in the field of astrophysics would set new standards in mining
knowledge available online and modeling it in a unified way.
      </p>
      <p>
        Applications. A unified KG modeling of astronomical concepts will allow a number of
AI tools and methods to become available. For example, accessing astrophysics data even
with natural language interfaces [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], like query answering systems or even chatbots, can
become seamless. Graph data analysis tasks, such as clustering, node classification, and link
prediction, have seen significant advances in the presence of ontologies (e.g., [
        <xref ref-type="bibr" rid="ref13 ref19">13, 19</xref>
        ]). Such tasks
can be used to generate new data insights and visualizations and to boost new scientific
discoveries. An ontology of astronomical concepts can constitute a global point of reference
for astrophysicists, computer scientists and practitioners that work with astrophysics data.
Furthermore, pre-trained KG embeddings can become publicly available to facilitate such tasks.
      </p>
      <p>
        Some of the reasoning tasks that could be enabled after such a KG construction are: managing
inconsistencies [
        <xref ref-type="bibr" rid="ref20">20, 21</xref>
        ] that may arise after matching and curation, enriching question answering
based on inferred knowledge from the constructed ontology [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], KG summarization and
modular reuse of ontologies [22, 23].
      </p>
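      <p>To illustrate how inferred knowledge can enrich question answering, the sketch below answers a class membership question by following subclass links, so that objects asserted only as members of a subclass are also returned. The class names and the toy KG are assumptions for illustration; a real system would rely on an ontology reasoner rather than this hand-written traversal.</p>
      <preformat>
```python
# Hypothetical toy KG: subclass axioms and instance assertions.
SUBCLASS_OF = {
    "Pulsar": "NeutronStar",
    "NeutronStar": "CompactObject",
    "BlackHole": "CompactObject",
}
INSTANCE_OF = {"PSR B1919+21": "Pulsar", "Sagittarius A*": "BlackHole"}


def superclasses(cls):
    """Return cls and all its ancestors by following subclass links."""
    seen = []
    while cls is not None and cls not in seen:
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def instances_of(target):
    """Answer 'which objects belong to target?' using inferred class membership."""
    return sorted(
        obj for obj, cls in INSTANCE_OF.items() if target in superclasses(cls)
    )


compact_objects = instances_of("CompactObject")
neutron_stars = instances_of("NeutronStar")
print(compact_objects)
print(neutron_stars)
```
      </preformat>
      <p>Without the subclass axioms, a query for compact objects would return nothing, since no object is asserted directly as one.</p>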
      <p>Provenance and explainability are also major issues for astronomers, determining the
credibility of the underlying KG. Data and workflow provenance information can be used for
explainability, answering questions like “WHY am I (not) seeing this result?” and “HOW was
this data acquired?” (e.g., with which instruments, under which conditions), respectively.</p>
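      <p>One simple way to support such questions, sketched here under stated assumptions, is to attach a provenance record to every statement in the KG. The record fields, the source label, and the instrument name below are hypothetical placeholders, not part of any existing provenance vocabulary.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    source: str      # where the statement came from (data provenance)
    instrument: str  # how it was acquired (workflow provenance)


# Hypothetical statement with placeholder provenance values.
KG = {
    ("cosmo:M31", "rdf:type", "cosmo:Galaxy"): Provenance(
        source="catalogue_a, table 2", instrument="telescope_x"
    ),
}


def why(triple):
    """Explain why a statement is (not) in the KG by pointing to its source."""
    record = KG.get(triple)
    if record is None:
        return "not asserted: no source supports this statement"
    return "asserted by " + record.source


def how(triple):
    """Report how a statement was acquired, or None if it is not in the KG."""
    record = KG.get(triple)
    return None if record is None else "acquired with " + record.instrument


known = ("cosmo:M31", "rdf:type", "cosmo:Galaxy")
unknown = ("cosmo:M31", "rdf:type", "cosmo:Star")
print(why(known))
print(how(known))
print(why(unknown))
```
      </preformat>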
    </sec>
    <sec id="sec-2">
      <title>3. Concluding Remarks</title>
      <p>This paper discusses some possible steps for the construction of a KG that captures our
knowledge of the cosmos, mostly relying on tabular data found in astrophysics publications and
catalogues. Such a KG will boost research in astrophysics, widen the public knowledge of
astronomy, and pose new challenges that can further improve research in computer science.
For example, the resources produced by this effort could be used to improve deep-learning tools
that mine such data or semantically annotate them (e.g., systems participating in SemTab). The
processes followed in this endeavor should be thoroughly documented, with the ultimate goal
of creating a generalizable, open-access methodology that other disciplines can adopt.
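      </p>
      <ref-list>
        <ref id="ref21">
          <mixed-citation>[21] S. Schlobach, R. Cornet, <article-title>Non-standard reasoning services for the debugging of description logic terminologies</article-title>, <source>in: IJCAI</source>, <year>2003</year>.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] B. C. Grau, I. Horrocks, Y. Kazakov, U. Sattler, <article-title>Modular reuse of ontologies: Theory and practice</article-title>, <source>J. Artif. Intell. Res.</source> <volume>31</volume> (<year>2008</year>) <fpage>273</fpage>-<lpage>318</lpage>.</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] A. Bonifati, S. Dumbrava, H. Kondylakis, <article-title>Graph summarization</article-title>, <source>CoRR abs/2004.14794</source> (<year>2020</year>).</mixed-citation>
        </ref>
      </ref-list>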
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Borne</surname>
          </string-name>
          ,
          <article-title>Astroinformatics: data-oriented astronomy research and education</article-title>
          ,
          <source>Earth Sci. Informatics</source>
          <volume>3</volume>
          (
          <year>2010</year>
          )
          <fpage>5</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <source>Ontology of astronomical object types, version 1.3</source>
          ,
          <year>2010</year>
          . URL: https://www.ivoa.net/documents/Notes/AstrObjectOntology/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Sarro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Martínez-Tomás</surname>
          </string-name>
          ,
          <article-title>First steps towards an ontology for astrophysics</article-title>
          ,
          <source>in: KES</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ritze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lehmberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Oulabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <article-title>Profiling the potential of web tables for augmenting cross-domain knowledge bases</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Lehmberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <article-title>Ontology augmentation through matching with web tables</article-title>
          ,
          <source>in: OM@ISWC</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cutrona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Abdelmageed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hulsebos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pesquita</surname>
          </string-name>
          ,
          <article-title>Results of SemTab 2021</article-title>
          ,
          <source>in: SemTab@ISWC</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gilani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Qasim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Malik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Shafait</surname>
          </string-name>
          ,
          <article-title>Table detection using deep learning</article-title>
          ,
          <source>in: ICDAR</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N. X. R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Burdick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Tablelab: An interactive table extraction system with adaptive deep learning</article-title>
          ,
          <source>in: IUI</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Özcan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Quamar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sankaranarayanan</surname>
          </string-name>
          ,
          <article-title>Ontology-based natural language query interfaces for data exploration</article-title>
          ,
          <source>IEEE Data Eng. Bull</source>
          .
          <volume>41</volume>
          (
          <year>2018</year>
          )
          <fpage>52</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Zahera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Heindorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Balke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Haupt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Voigt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Walter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Witter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          ,
          <article-title>Tab2onto: Unsupervised semantification with knowledge graph embeddings</article-title>
          ,
          <source>in: ESWC</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chaves-Fraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <article-title>Declarative description of knowledge graphs construction automation: Status &amp; challenges</article-title>
          ,
          <source>in: KGCW@ESWC</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mudgal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rekatsinas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Deep</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Arcaute</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Raghavendra</surname>
          </string-name>
          ,
          <article-title>Deep learning for entity matching: A design space exploration</article-title>
          ,
          <source>in: SIGMOD</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Quamar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Özcan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>MEDTO: medical data to ontology matching using hybrid graph neural networks</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Alani</surname>
          </string-name>
          ,
          <article-title>Position paper: ontology construction from online ontologies</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Ontology-based data access: A survey</article-title>
          ,
          <source>in: IJCAI</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Patkos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Flouris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bikakis</surname>
          </string-name>
          ,
          <article-title>Symmetric multi-aspect evaluation of comments - extended abstract</article-title>
          ,
          <source>in: ECAI</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dessì</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Osborne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Recupero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Buscaldi</surname>
          </string-name>
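          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sack</surname>
          </string-name>
          ,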
          <article-title>AI-KG: an automatically generated knowledge graph of artificial intelligence</article-title>
          ,
          <source>in: ISWC</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Quamar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Efthymiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Özcan</surname>
          </string-name>
          ,
          <article-title>Natural language interfaces to data</article-title>
          ,
          <source>Found. Trends Databases</source>
          <volume>11</volume>
          (
          <year>2022</year>
          )
          <fpage>319</fpage>
          -
          <lpage>414</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. M.</given-names>
            <surname>Holter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Antonyrajah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>OWL2Vec*: embedding of OWL ontologies</article-title>
          ,
          <source>Mach. Learn.</source>
          <volume>110</volume>
          (
          <year>2021</year>
          )
          <fpage>1813</fpage>
          -
          <lpage>1845</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>van Harmelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>ten Teije</surname>
          </string-name>
          ,
          <article-title>Reasoning with inconsistent ontologies</article-title>
          ,
          <source>in: IJCAI</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>