<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alignments⋆</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cassia Trojahn</string-name>
          <email>cassia.trojahn@irit.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institut de Recherche en Informatique de Toulouse</institution>
          ,
          <addr-line>Université de Toulouse 2, Toulouse</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>A number of recommendations have been proposed so far for making data FAIR, including more recent ones on how to publish FAIR ontologies on the Web. However, less attention has been given to producing FAIR ontology alignments. This paper reviews existing FAIR data initiatives and discusses the efforts required for generating and publishing FAIR alignments on the Web. It aligns the four principles (F, A, I and R) to the actions and requirements towards the generation and sharing of FAIR alignments. It ends with a discussion on further developments.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology alignment</kwd>
        <kwd>FAIR principles</kwd>
        <kwd>FAIR alignment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Management
(e.g., practitioners, repositories or the Semantic Web community). Several frameworks have also
been proposed to assess the degree of FAIRness of resources<sup>1</sup>, such as the well-known “FAIR Data
Maturity Model” [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which mainly consists of a list of evaluation items in a spreadsheet format.
Other works have proposed more automated approaches for FAIRness evaluation [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] based
on web applications. More recently, proposals have addressed the evaluation of vocabularies
and ontologies as well as best practices for implementing FAIR vocabularies and ontologies on
the Web [
        <xref ref-type="bibr" rid="ref7">7, 8</xref>
        ], with tools available to support this evaluation process, such as FOOPS! [9] and
O’FAIRe (for Ontology FAIRness Evaluator) for the evaluation of semantic resources in general
(e.g., vocabularies, terminologies, thesauri) [10].
      </p>
      <p>Despite this wave of efforts to make data and ontologies FAIR, little attention has been
given to producing FAIR ontology alignments, in particular to the generation of (rich) alignment
metadata. While alignment representation languages such as the RDF Alignment API format and
EDOAL (Expressive and Declarative Ontology Alignment Language) [11] have become the de facto
standard in the ontology matching field, they were not designed to
provide rich metadata on the alignments. Metadata in terms of interpretation, explanation,
quality, provenance, usage license, version history, etc., both at the level of the alignment and
at the level of the correspondences, are so far missing. This lack of documentation
makes it hard to exploit, combine and reproduce alignments.</p>
      <p>Recently, the EOSC has addressed the problem of “semantic mapping sharing”, reporting
on the requirements for creating, documenting, and publishing alignments and cross-walks
within a particular scientific community, as well as across scientific domains [12]. This effort
drew on 26 interviews and on observations about the work in
the ESFRI (European Strategy Forum on Research Infrastructures) initiatives, in the realm of
EOSC discussions, and the work in RDA. A complementary effort is the Simple Standard for
Sharing Ontological Mappings (SSSOM) [13], which proposes a machine-readable and extensible
vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in
correspondences explicit. Tools and software libraries working with the standard are made
publicly available. However, there is still no clear ‘alignment’ between the FAIR principles and
the requirements for making FAIR alignments.</p>
      <p>This paper reviews the efforts required for generating and publishing FAIR alignments on
the Web, which brings to light many still unsolved issues in the field, such as the lack of rich
alignment metadata models, the lack of ontology alignment repositories for alignment publishing
and sharing (as LOV provides for ontologies and vocabularies), the lack of common good practices, etc. It aligns the
four principles (F, A, I and R) to the actions and requirements towards the generation and sharing
of FAIR alignments on the Web. The paper ends with a discussion on further developments.</p>
    </sec>
    <sec id="sec-2">
      <title>2. FAIR alignment requirements</title>
      <p>
        The list of FAIR guiding principles defined in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is given below. Each principle is
presented together with the efforts required for making FAIR alignments.
<sup>1</sup> Most of which are listed on https://fairassist.org/ (accessed on 10th August 2022).
      </p>
      <p>Findable. F1. (meta)data are assigned a globally unique and persistent identifier; F2. data are
described with rich metadata (defined by R1 below); F3. metadata clearly and explicitly include
the identifier of the data it describes; F4. (meta)data are registered or indexed in a searchable
resource. Alignments should be findable in order to facilitate their reuse. The data here are the alignments
themselves. They have to be exposed and stored in dedicated repositories (e.g., GitHub), and ideally
indexed in searchable alignment catalogs. They have to be described with enough information
to allow their full exploitation (see R below): a rich description of the alignment content (data) and
a rich description of the alignment itself (metadata). The question of having globally unique and persistent
identifiers arises and has to be discussed in the ontology matching community.</p>
      <p>Accessible. A1. (meta)data are retrievable by their identifier using a standardized
communications protocol; A1.1. the protocol is open, free, and universally implementable; A1.2. the protocol
allows for an authentication and authorization procedure, where necessary; A2. metadata are
accessible, even when the data are no longer available. Alignments should be accessible in order to
facilitate their reuse. They should be made available along with (open) mechanisms and protocols
for content negotiation, allowing for both automated and human exploration (with at least one
RDF serialization and HTML, as recommended by FOOPS! for ontologies).</p>
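      <p>As an illustration of the content negotiation mentioned above, the following is a minimal sketch, assuming a server that stores one HTML page and two RDF serializations of the same alignment. The file names, the AVAILABLE table and the functions parse_accept and negotiate are hypothetical and only show how an HTTP Accept header could be mapped to a serialization; a real deployment would more likely rely on the web server or framework for this.</p>
      <preformat>
```python
# Hypothetical mapping from media types to stored serializations of one alignment.
AVAILABLE = {
    "text/html": "alignment.html",
    "text/turtle": "alignment.ttl",
    "application/rdf+xml": "alignment.rdf",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality value when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(prefs, key=lambda item: item[1], reverse=True)


def negotiate(header, default="text/html"):
    """Pick the (media_type, file) to serve for a given Accept header."""
    if not header.strip():
        header = "*/*"  # a missing Accept header means anything is acceptable
    for media_type, q in parse_accept(header):
        if not q > 0:
            continue  # q=0 means the client explicitly refuses this type
        if media_type in AVAILABLE:
            return media_type, AVAILABLE[media_type]
        if media_type == "*/*":
            return default, AVAILABLE[default]
    return None  # the caller would answer 406 Not Acceptable
```
      </preformat>
      <p>With this sketch, a browser sending an Accept header that prefers text/html would receive the human-readable page, while a client asking for text/turtle would receive the RDF serialization of the same alignment.</p>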
      <p>Interoperable. I1. (meta)data use a formal, accessible, shared, and broadly applicable language
for knowledge representation; I2. (meta)data use vocabularies that follow FAIR principles; I3.
(meta)data include qualified references to other (meta)data. The main question that arises here is:
what should interoperable alignments be? A debatable view, but one consistent with the ‘I’ principle,
is that their metadata should be formally represented and
accessible, using FAIR ontologies and vocabularies. Again, in the sense of the FOOPS! evaluation
checks, such a vocabulary has to include references to existing vocabularies in its metadata
annotations, classes and properties. Alignments have to be documented both in a human-readable manner
and with formal languages that are expressive enough to capture the semantics of each
correspondence (meaning of the relation between the ontology entities being aligned, how the
confidence should be considered, explanations on how the correspondence has been found, etc.).
Alignments have to be clearly described in order to be consumed by communities other than the ones
producing them. Alignments have to be understood.</p>
      <p>Reusable. R1. (meta)data are richly described with a plurality of accurate and relevant
attributes; R1.1. (meta)data are released with a clear and accessible data usage license; R1.2.
(meta)data are associated with detailed provenance; R1.3. (meta)data meet domain-relevant
community standards. Alignments should be produced to be reusable. Besides the type of
documentation referred to under ‘I’, alignments have to be exposed with clear information on the usage
license and their provenance (who has created the alignment, with which tool and tool version, on which
ontology versions, etc.).</p>
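      <p>The license and provenance items listed above can be read as a small checklist. The following is a minimal sketch, assuming a flat metadata record per alignment; the class AlignmentMetadata, its field names and the function missing_fields are hypothetical and are not taken from any existing alignment format.</p>
      <preformat>
```python
from dataclasses import dataclass, fields


@dataclass
class AlignmentMetadata:
    """Hypothetical alignment-level record covering license and provenance."""
    identifier: str = ""
    license: str = ""
    creator: str = ""                   # who created the alignment
    tool: str = ""                      # which matching tool produced it
    tool_version: str = ""
    source_ontology_version: str = ""
    target_ontology_version: str = ""


def missing_fields(meta):
    """Return the names of the fields that are still empty, in declaration order."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name).strip()]


# Made-up example: the tool is named, but its version and the ontology versions are not.
EXAMPLE = AlignmentMetadata(
    identifier="https://example.org/alignments/a1",
    license="https://creativecommons.org/licenses/by/4.0/",
    creator="Jane Doe",
    tool="ExampleMatcher",
)
```
      </preformat>
      <p>Calling missing_fields(EXAMPLE) lists tool_version, source_ontology_version and target_ontology_version, which is the kind of feedback an alignment producer would need before publishing.</p>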
    </sec>
    <sec id="sec-3">
      <title>3. Discussion</title>
      <p>While the previous section has introduced the ideal requirements for making FAIR alignments,
this section discusses further developments and the need for joint efforts in those directions.</p>
      <p>Making alignments findable is still an open issue in the field. In fact,
alignments generated in research papers are rarely available, and OAEI alignments are
available as zip files on dedicated web pages without any indexing. While some
(domain-specific) portals, such as BioPortal<sup>2</sup> and AgroPortal<sup>3</sup>, catalog curated alignments, there is still a
lack of catalog services to index general-purpose alignments, as LOV<sup>4</sup> does for ontologies.
A good practice could be to expose alignments on repositories such as GitHub, indexed
by alignment catalogs that can alternatively also offer the storage service. The field urgently
needs a LOA (Linked Open Alignment) service. Versioning also has to be taken into account
at the metadata level (version annotations, such as owl:priorVersion and owl:versionInfo for
ontologies).</p>
      <p>Alignments (data) have to be accessible (besides their metadata being findable). As stated
above, they should be made available along with mechanisms for content negotiation, allowing
for both automated and human exploration. While there are specialised tools for generating ontology documentation in a
human-readable format, including content-negotiation configuration, such as WIDOCO [14], alignment
documentation tools should be able to deal with both alignment metadata and alignment
content visualisation.</p>
      <p>Alignments have to be interoperable. At the very least, their metadata has to be formally represented
and accessible, using FAIR vocabularies. A number of vocabularies have been proposed to
represent metadata in general (Dublin Core, VoID, Schema.org, DCAT, DCAT-AP), with extensions
for accommodating specific kinds of data, such as geo-spatial data (GeoDCAT-AP) or statistical
data (StatDCAT-AP). The question that arises is “what are FAIR vocabularies?” In [15], eleven
features of a FAIR vocabulary are proposed, covering requirements for identifiers, access
protocols, knowledge representation, etc. First, FAIR vocabularies for alignment metadata have
to be proposed and then reused. SSSOM brings a format that improves on the metadata and
provenance information associated with the correspondences, enhancing their understanding
and potential future reuse. It seems to be a good candidate for a FAIR model for alignment
metadata.</p>
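      <p>To make the idea of metadata at both alignment and correspondence level more concrete, the following is a minimal sketch that writes an SSSOM-like tab-separated file: alignment-level metadata as commented header lines, followed by one row per correspondence. The slot names (mapping_set_id, license, mapping_tool, subject_id, predicate_id, object_id, mapping_justification, confidence) follow our reading of SSSOM and should be checked against the specification; the values, the function to_sssom_like_tsv and the example identifiers are made up for illustration.</p>
      <preformat>
```python
import csv
import io

# Alignment-level metadata (slot names assumed from SSSOM; values are made up).
SET_METADATA = {
    "mapping_set_id": "https://example.org/alignments/a1",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "mapping_tool": "ExampleMatcher",
}

# Correspondence-level columns (again assumed from SSSOM).
COLUMNS = [
    "subject_id",
    "predicate_id",
    "object_id",
    "mapping_justification",
    "confidence",
]

# One made-up correspondence between two hypothetical ontologies ex1 and ex2.
CORRESPONDENCES = [
    {
        "subject_id": "ex1:Paper",
        "predicate_id": "skos:exactMatch",
        "object_id": "ex2:Article",
        "mapping_justification": "semapv:LexicalMatching",
        "confidence": 0.87,
    },
]


def to_sssom_like_tsv(metadata, correspondences):
    """Serialize metadata as '# key: value' lines followed by a TSV table."""
    out = io.StringIO()
    for key, value in metadata.items():
        out.write(f"# {key}: {value}\n")
    writer = csv.DictWriter(
        out, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in correspondences:
        writer.writerow(row)
    return out.getvalue()
```
      </preformat>
      <p>Keeping the license and tool information in the same file as the correspondences means that the metadata travels with the alignment, whatever repository or catalog it ends up in.</p>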
      <p>Alignments have to be generated to be as reusable as possible. It is still difficult to
reuse alignments, mainly because they are hardly findable, accessible, and interoperable. As
stated above, alignment representation languages still lack descriptive metadata, explanation
and justification components. It is difficult to interpret a correspondence involving complex
constructors, the truth relation expressed between the ontology entities involved in a
correspondence, etc. The languages also fail to offer support for representing “how”
a given correspondence has been found, both in a human-readable format and formally. This
goes beyond provenance, which can be documented using existing vocabularies such as PROV-O<sup>5</sup>
to represent and interchange provenance information (agent, activity, entity, etc.).
<sup>2</sup> https://bioportal.bio
<sup>3</sup> https://agroportal.lirmm.fr
<sup>4</sup> https://lov.linkeddata.es/dataset/lov/
<sup>5</sup> https://www.w3.org/TR/prov-o/ (accessed on 10th August 2022)</p>
      <p>Last but not least, the Ontology Alignment Evaluation Initiative (OAEI), as an international
coordinated forum for matching system developers, also has to lead the initiative towards the
generation of FAIR alignments: agreement on a metadata model for adoption (SSSOM is a
good candidate) and coordination with the EOSC group [12] on the development of the semantic
mapping framework, to cite a few concrete actions.</p>
      <p>Summing up, there is still a way to go before alignments are produced and exposed in full compliance with the
FAIR principles, in particular: i) which vocabulary to use to explicitly describe them both at
alignment level (descriptive metadata, provenance, licence, versioning) and at correspondence
level (relation interpretation, confidence interpretation, explanation, justification); ii) how to
index them (LOA); iii) how to assist alignment producers in sharing alignments; iv) how to
evaluate the degree of FAIRness of alignments.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Future Work</title>
      <p>
        Concretely, in the short term, the next steps to integrate the FAIR principles into ontology
alignment have to be: adopting SSSOM in the well-known OAEI campaigns; working on
SSSOM extensions to provide alignment metadata at both correspondence and alignment levels;
providing clear guidelines for making FAIR alignments (with an identification of the requirements
to do so), which involves ‘aligning’ the FAIR guidelines and recommendations ([
        <xref ref-type="bibr" rid="ref3">3, 8, 15, 12</xref>
        ], to
cite a few); and developing a LOA.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] M. Wilkinson, M. Dumontier, et al., <article-title>The FAIR Guiding Principles for scientific data management and stewardship</article-title>, <source>Scientific Data</source> 3 (<year>2016</year>) 1–9.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] M. Poveda-Villalón, P. Espinoza-Arias, D. Garijo, Ó. Corcho, <article-title>Coming to terms with FAIR ontologies</article-title>, in: C. M. Keet, M. Dumontier (Eds.), <source>Knowledge Engineering and Knowledge Management - 22nd International Conference, EKAW</source>, <year>2020</year>, pp. 255–270. URL: https://doi.org/10.1007/978-3-030-61244-3_18. doi:10.1007/978-3-030-61244-3_18.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] Y. Le Franc, J. Parland-von Essen, L. Bonino, H. Lehväslaiho, G. Coen, C. Staiger, <article-title>D2.2 FAIR semantics: First recommendations</article-title>, <year>2020</year>. URL: https://doi.org/10.5281/zenodo.3707985. doi:10.5281/zenodo.3707985.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] FAIR Data Maturity Model Working Group RDA, <source>FAIR Data Maturity Model. Specification and Guidelines</source>, <year>2020</year>. URL: https://doi.org/10.15497/rda00050. doi:10.15497/rda00050. Accessed 6 May 2022.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] M. Wilkinson, M. Dumontier, et al., <article-title>Evaluating FAIR maturity through a scalable, automated, community-governed framework</article-title>, <source>Sc. Data</source> 6 (<year>2019</year>) 1–12.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] A. Devaraju, R. Huber, M. Mokrane, P. Herterich, L. Cepinskas, J. de Vries, H. L'Hours, J. Davidson, A. White, <source>FAIRsFAIR Data Object Assessment Metrics 0.5</source>, Technical Report, Research Data Alliance (RDA), <year>2020</year>. URL: https://zenodo.org/record/6461229. doi:10.5281/zenodo.6461229. Accessed 3 May 2022.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] D. Garijo, M. Poveda-Villalón, <article-title>Best practices for implementing FAIR vocabularies and ontologies on the web</article-title>, <source>CoRR</source> abs/2003.13084 (<year>2020</year>). URL: https://arxiv.org/abs/2003.13084. arXiv:2003.13084.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] S. J. D. Cox, A. N. Gonzalez-Beltran, B. Magagna, M.-C. Marinescu, <article-title>Ten simple rules for making a vocabulary FAIR</article-title>, <source>PLOS Computational Biology</source> 17 (<year>2021</year>) 1–15. URL: https://doi.org/10.1371/journal.pcbi.1009041. doi:10.1371/journal.pcbi.1009041.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] D. Garijo, Ó. Corcho, M. Poveda-Villalón, <article-title>FOOPS!: An Ontology Pitfall Scanner for the FAIR principles</article-title>, in: <source>Proc. of the ISWC 2021 Posters, Demos and Industry Tracks</source>, <year>2021</year>. URL: http://ceur-ws.org/Vol-2980/paper321.pdf.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] E. Amdouni, S. Bouazzouni, C. Jonquet, <article-title>O’FAIRe: Ontology FAIRness Evaluator in the AgroPortal semantic resource repository</article-title>, in: <source>ESWC 2022 Poster and demos</source>, Greece, <year>2022</year>. URL: https://hal-lirmm.ccsd.cnrs.fr/lirmm-03630543.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] J. David, J. Euzenat, F. Scharffe, C. Trojahn, <article-title>The alignment API 4.0</article-title>, <source>Semantic Web</source> 2 (<year>2011</year>) 3–10. URL: https://doi.org/10.3233/SW-2011-0028. doi:10.3233/SW-2011-0028.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] D. Broeder, P. Budroni, E. Degl’Innocenti, Y. Le Franc, W. Hugo, K. Jeffery, C. Weiland, P. Wittenburg, C. M. Zwolf, <article-title>SEMAF: A Proposal for a Flexible Semantic Mapping Framework</article-title>, <year>2021</year>. URL: https://doi.org/10.5281/zenodo.4651421. doi:10.5281/zenodo.4651421.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] N. Matentzoglu, J. P. Balhoff, S. M. Bello, et al., <article-title>A Simple Standard for Sharing Ontological Mappings (SSSOM)</article-title>, <source>Database</source> 2022 (<year>2022</year>) baac035. URL: https://doi.org/10.1093/database/baac035. doi:10.1093/database/baac035.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] D. Garijo, <article-title>WIDOCO: a wizard for documenting ontologies</article-title>, in: <source>International Semantic Web Conference</source>, Springer, Cham, <year>2017</year>, pp. 94–102. URL: http://dgarijo.com/papers/widoco-iswc2017.pdf. doi:10.1007/978-3-319-68204-4_9.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] F. Xu, N. S. Juty, C. A. Goble, S. Jupp, H. E. Parkinson, M. Courtot, <article-title>Features of a FAIR vocabulary</article-title>, in: <source>13th International Conference on Semantic Web Applications and Tools for Health Care and Life Sciences, SWAT4HCLS</source>, <year>2022</year>, pp. 118–148. URL: http://ceur-ws.org/Vol-3127/paper-15.pdf.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>