<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Gradio-Based Toolkit for Remote Sensing Data Fusion Literature</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Caleb Cheruiyot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jonathan P. Leidig</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiaxin Du</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Computing, Grand Valley State University</institution>
          ,
          <addr-line>Allendale, MI</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>We present a proof-of-concept toolkit focused on remote sensing data fusion literature that turns research articles into searchable, ontology-backed records and a lightweight knowledge graph. The system offers an interactive, Gradio-based interface featuring searchable cards and a simple subgraph preview. Uncertainty is incorporated as basic, human-editable tags. A small, binary text-classification component (BERT/RoBERTa) supports triage of abstracts into “fusion-related” vs. “other remote sensing” to aid curation. We demonstrate the end-to-end pipeline and provide preliminary classifier metrics on a small labeled set, emphasizing the system's scope as a practical starting point. Code, ontology, and configuration are released for reproducibility. This work supports the AI-for-science goal of developing transparent, interpretable, and uncertainty-aware AI tools for scientific knowledge integration.</p>
      </abstract>
      <kwd-group>
        <kwd>Remote Sensing</kwd>
        <kwd>Data Fusion</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>Uncertainty Tagging</kwd>
        <kwd>Gradio</kwd>
        <kwd>BERT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Data fusion—the integration of heterogeneous data sources into a unified representation—is critical
for enhancing coverage, resolution, and interpretability across diverse domains. A widely adopted
structure for organizing fused data is the knowledge graph, which represents entities and their semantic
relationships as nodes and edges, enabling queryable, interconnected representations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However,
constructing and maintaining such structured knowledge remains labor-intensive, especially when
accounting for uncertainty inherent in source data. Recent research increasingly highlights the dual
importance of data fusion and uncertainty modeling [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. For example, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] demonstrated the value
of integrating multi-scale measurement data to reveal latent relationships. Similarly, BUGPan [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
effectively addressed spatial uncertainty in image fusion, underscoring the necessity of managing
uncertainty for robust integration. Given the resource-intensive nature of building knowledge graphs,
emerging approaches aim to improve efficiency by combining ML-based toolkits [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] with strategically
minimized human input [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>Building on these developments, we present a lightweight, end-to-end toolkit that transforms remote
sensing data fusion literature into searchable records and an ontology-backed knowledge graph with
basic uncertainty annotations. Implemented via a Gradio-based interface<sup>1</sup>, the system supports search,
graph previews, and triage using a binary text classifier. Specifically, we offer:</p>
      <list list-type="bullet">
        <list-item><p>An ontology-driven schema linking papers, datasets, and fusion methods in the remote sensing domain.</p></list-item>
        <list-item><p>A lightweight binary classifier (BERT/RoBERTa) for abstract-level triage.</p></list-item>
        <list-item><p>Basic uncertainty tagging to support interpretability.</p></list-item>
        <list-item><p>A simple, reproducible interface for search and visualization, with released artifacts for extension.</p></list-item>
      </list>
      <p>The 2nd International Workshop on Artificial Intelligence for the Science of Science (AI4SciSci 2025), co-located with ACM/IEEE-CS JCDL 2025. CEUR Workshop Proceedings, ISSN 1613-0073.</p>
      <p>In the context of digital libraries, such lightweight AI tools can accelerate domain curation while
preserving human oversight. By emphasizing interpretability and incremental uncertainty annotation,
our approach aligns with AI4SciSci’s vision of transparent and reproducible AI systems for scientific
knowledge integration.</p>
    </sec>
    <sec id="sec-2">
      <title>2. System Architecture</title>
      <p>A modular data pipeline was designed to integrate structured data handling, semantic enrichment,
machine learning, and lightweight visualization. Figure 1 illustrates the high-level architecture of
the proposed toolkit. The architecture is organized into five layers: data ingestion, storage, semantic
enrichment, AI-based classification, and user interaction. This layered design supports flexible querying,
ontology-based linkage, and editable uncertainty tagging across datasets and fusion methods. At this
stage, uncertainty metadata are qualitative and manually assigned rather than formally modeled or
propagated computationally.</p>
      <p>
        The pipeline begins with data ingestion, where tabular datasets (CSV and Excel) describing research
papers, data fusion techniques, and datasets are pre-processed using pandas and stored in a PostgreSQL
database. These structured data are exported to an RDF-compatible schema through an OWL ontology
layer implemented with Owlready2<sup>2</sup>, enabling semantic relationships between papers, datasets, fusion
methods, and associated uncertainty tags [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. A semantic knowledge graph is then generated from the
database using NetworkX<sup>3</sup> to visualize entity relationships and support graph-based exploration.
      </p>
      <p>
        To assist document triage, a small BERT-based classifier [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] identifies whether an abstract is
fusion-related or general remote sensing. This classification facilitates organization within the ontology but
does not directly determine graph structure. Finally, a Gradio<sup>4</sup> interface enables users to search, filter,
and interpret curated records through card views, keyword filters, subgraph previews, and classification
outputs, completing the end-to-end process from ingestion to human-interpretable exploration [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Ontology Development</title>
        <p>
          An OWL ontology was designed to provide semantic structure for literature-level representation of
remote sensing data fusion. Rather than aiming for comprehensive domain coverage, the ontology was
intentionally kept lightweight and application-driven, with the goal of supporting search, linking, and
human-editable uncertainty tagging within a small curated corpus. This design was informed by prior
work on scholarly knowledge graphs [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], which highlighted the trade-offs between comprehensive
coverage and usability for expert-driven curation. Existing ontologies were considered; however, they
proved too complex, incomplete, or insufficiently tailored to the remote sensing data fusion domain,
motivating the minimal schema adopted here.
        </p>
        <p>
          The core classes include FusionMethod, Dataset, and Uncertainty, with supporting classes such
as Paper and Publisher. These are connected through properties including integrates, usesData,
isPublishedIn, and hasUncertainty. A numeric attribute, hasConfidenceLevel, allows storage
of confidence-related indicators extracted from text or assigned by curators. Uncertainty modeling
was motivated by existing research on scientific uncertainty detection and integrated modeling [
          <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
          ].
Instead of formal reasoning or probabilistic propagation, uncertainty is represented as qualitative and
numeric metadata to support filtering and interpretation during search.
        </p>
        <p>The ontology was implemented using the Owlready2 library and serialized in RDF format for
integration with the knowledge graph pipeline. Structural consistency was verified using the Owlready2
reasoner to ensure valid class and property definitions. In addition, a subset of entities and relations was
manually reviewed by domain experts to confirm semantic correctness. No external ontology alignment
or quantitative quality metrics were applied in this prototype. Figure 2 illustrates the ontology structure
and its key relationships.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Knowledge Graph Construction</title>
        <p>
          Two domain-specific knowledge graphs were constructed to support exploration of data fusion and
remote sensing literature [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Source data consisted of expert-curated tables describing publications,
datasets, and fusion methods. Preprocessing included normalization, removal of non-standard characters,
case harmonization, and basic consistency checks. The cleaned data were stored in a PostgreSQL
database to enable structured querying and controlled schema evolution.
        </p>
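        <p>The cleaning steps listed above can be illustrated with the Python standard library. This is a minimal sketch under stated assumptions, not the toolkit's released code: the function names clean_field and is_consistent, the NFKC normalization form, the reading of "non-standard characters" as control and format code points, lower-casing as the case convention, and the required column names are all hypothetical choices made for the example.</p>
        <preformat><![CDATA[
```python
import unicodedata


def clean_field(value: str) -> str:
    """Clean one text cell from a curated table (illustrative sketch)."""
    # Unicode normalization: NFKC folds compatibility forms such as ligatures.
    text = unicodedata.normalize("NFKC", value)
    # Collapse whitespace runs (tabs, newlines, repeated spaces) to single spaces.
    text = " ".join(text.split())
    # Assumption: "non-standard characters" means control/format code points,
    # i.e. Unicode general categories starting with "C" (Cc, Cf, ...).
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("C"))
    # Case harmonization: lower-casing is an assumed convention.
    return text.lower()


def is_consistent(row: dict, required=("title", "fusion_method")) -> bool:
    """Basic consistency check: every required column is present and non-empty."""
    # The column names in `required` are hypothetical.
    return all(str(row.get(key) or "").strip() for key in required)


cleaned = clean_field("  Pan\u200b-Sharpening\tof  Landsat Imagery ")
```
]]></preformat>
        <p>In this sketch, whitespace is collapsed before control characters are stripped so that tab-separated words are not accidentally joined.</p>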
        <p>A graph representation was then generated using the NetworkX library. Nodes correspond to entities
defined in the ontology (e.g., Paper, Dataset, FusionMethod), while directed edges encode semantic
relations such as integrates and usesData. Only valid entity links were instantiated to preserve
referential integrity.</p>
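        <p>The referential-integrity rule described above can be sketched as follows. To stay dependency-free, the example uses plain dictionaries as a stand-in for a NetworkX directed graph; the node identifiers, the sample rows, and the direction assumed for integrates (FusionMethod to Dataset) are hypothetical and are not taken from the released artifacts.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical entities keyed by identifier, each mapped to its ontology class.
nodes = {
    "paper:1": "Paper",
    "method:pansharpening": "FusionMethod",
    "dataset:a": "Dataset",
}

# Candidate (source, relation, target) triples; the last one points at a
# dataset that is not in `nodes` and should therefore be rejected.
candidate_edges = [
    ("paper:1", "usesData", "dataset:a"),
    ("method:pansharpening", "integrates", "dataset:a"),  # direction is an assumption
    ("paper:1", "usesData", "dataset:missing"),
]


def build_graph(nodes, candidate_edges):
    """Keep only directed edges whose source and target are both known entities."""
    adjacency = defaultdict(list)
    rejected = []
    for source, relation, target in candidate_edges:
        if source in nodes and target in nodes:
            adjacency[source].append((relation, target))
        else:
            rejected.append((source, relation, target))
    return dict(adjacency), rejected


graph, rejected = build_graph(nodes, candidate_edges)
```
]]></preformat>
        <p>Returning the rejected triples alongside the graph lets a curator inspect dangling links instead of silently dropping them; whether the toolkit reports them this way is not stated in the paper.</p>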
        <p>Each graph instance includes optional uncertainty annotations inherited from ontology metadata.</p>
        <p><sup>2</sup> https://owlready2.readthedocs.io/</p>
        <p><sup>3</sup> https://networkx.org</p>
        <p><sup>4</sup> https://www.gradio.app</p>
        <p>
          Graphs were exported in GraphML format and rendered for interactive inspection using pyvis<sup>5</sup>. In
parallel, the relational backend supports keyword- and attribute-based queries. Separate graphs were
maintained for data fusion and general remote sensing to align with the binary classifier’s scope and
reduce semantic drift between domains [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. As the toolkit is a prototype, evaluation focused on structural
validity, correct linkage of entities, and interface consistency rather than formal graph metrics.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Text Classification</title>
        <p>
          Although most literature tagging in the system was performed manually, machine learning was
incorporated to assist abstract-level triage by identifying whether a paper is related to data fusion or general
remote sensing. Following human-supervised knowledge curation practices [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], a transformer-based
classifier was used as a support tool rather than as a fully automated decision mechanism.
        </p>
        <p>
          We fine-tuned a pre-trained BERT model [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] for binary sequence classification using abstract text.
Tokenization was performed via the HuggingFace tokenizer, and training used mini-batch optimization
with class weighting and random oversampling to address label imbalance. Experiments were also
conducted using a RoBERTa variant for comparison, though BERT achieved the best overall performance.
        </p>
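        <p>The two imbalance remedies named above can be illustrated in plain Python. The sketch below is an assumption-laden example rather than the training code used here: the inverse-frequency weighting formula, the function names, and the fixed random seed are illustrative choices, and in practice the weights would be passed to the loss function of the fine-tuning framework.</p>
        <preformat><![CDATA[
```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count) (assumed formula)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority examples until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Sample with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy labels (1 = fusion-related, 0 = other remote sensing); not real data.
toy_texts = ["a", "b", "c", "d"]
toy_labels = [1, 0, 0, 0]
weights = inverse_frequency_weights(toy_labels)
balanced_texts, balanced_labels = random_oversample(toy_texts, toy_labels)
```
]]></preformat>
        <p>Oversampling should be applied to the training split only, so that duplicated abstracts do not leak into validation.</p>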
        <p>
          A total of 120 abstracts were independently annotated by two domain experts, achieving strong
agreement (Cohen’s κ = 0.86) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The dataset was divided into 80% for training and 20% for validation
using stratified sampling. Because of the limited data size, a separate test set was not created, and the
results are presented as a feasibility assessment rather than a benchmark evaluation.
        </p>
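        <p>For readers unfamiliar with the agreement statistic reported above, Cohen's kappa compares observed agreement with the agreement expected by chance from each annotator's label frequencies. The following sketch computes it for two annotators; the function name and the toy labels are illustrative and do not reproduce the study's annotations.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item list.")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy annotations (1 = fusion-related, 0 = other remote sensing); not real data.
kappa = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```
]]></preformat>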
        <p>Predictions with confidence below 0.6 were flagged and reviewed by human annotators through
the Gradio interface, enabling iterative correction and refinement. Although no classical baselines
(e.g., TF–IDF with SVM) were included in this prototype, future work will incorporate comparative
evaluations and larger-scale validation.</p>
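        <p>The review rule described above can be sketched as a small routing function. The 0.6 threshold comes from the text; everything else is an assumption made for illustration, including the definition of confidence as the maximum softmax probability, the label order, and the function names.</p>
        <preformat><![CDATA[
```python
import math

REVIEW_THRESHOLD = 0.6  # threshold stated in the text
LABELS = ("other remote sensing", "fusion-related")  # label order is an assumption


def softmax(logits):
    """Numerically stable softmax over a list of raw classifier scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def triage(logits):
    """Return the predicted label and flag low-confidence predictions for review."""
    probs = softmax(logits)
    # Assumption: "confidence" is the probability of the predicted class.
    confidence = max(probs)
    return {
        "label": LABELS[probs.index(confidence)],
        "confidence": confidence,
        "needs_review": confidence < REVIEW_THRESHOLD,
    }


uncertain = triage([0.0, 0.2])   # nearly tied scores, routed to a human
confident = triage([-2.0, 2.0])  # clear margin, accepted automatically
```
]]></preformat>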
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Visualization Interface</title>
        <p>The user interface was implemented using Gradio, chosen for its simplicity and flexibility in building
interactive web applications. The interface provides four main views: Search, Graph, Classify, and
Ontology. Users can query the system by research topic, fusion method, or dataset name, and results
are presented as styled cards with DOI links, metadata, abstracts, and embedded uncertainty tags.</p>
        <p>Uncertainty is visualized using color-coded highlights and tooltips to improve interpretability. The
ontology view presents a structured outline of the knowledge model, while the graph view renders
filtered subgraphs for interactive exploration. The classification tab allows users to submit abstracts
and receive automated predictions regarding their relevance to data fusion. Figure 3 shows an overview
of the interface.</p>
        <p>
          Overall, the interface serves as a lightweight digital library front end [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], enabling users to explore
ontology-backed literature relationships while observing uncertainty cues and classification outputs
interactively.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Evaluation</title>
      <p>The evaluation of the developed system was conducted across multiple dimensions, including search
accuracy, classification reliability, interface responsiveness, and overall system robustness. The
preliminary results demonstrate that the integrated architecture performs effectively in delivering dynamic,
context-aware domain insights with associated uncertainties.</p>
      <sec id="sec-4-1">
        <title>4.1. Search and Query Evaluation</title>
        <p>
          The search engine embedded in the user interface mitigates common query issues, such as partial
terms and misspellings, by combining exact matching with approximate matching via Python’s
difflib.get_close_matches() function. Test cases were designed to validate core functionalities such as searching by paper title, dataset
name, and fusion method using related keywords. In all tested scenarios, the system returned card-styled
results with enriched metadata, abstracts, and DOI references that were deemed relevant by expert
evaluators. Across 20 representative test queries spanning fusion methods, datasets, and publication
titles, the system achieved full accuracy on exact matches and an average relevance of 92% for partial
matches, as assessed by two domain experts. Each retrieved record displayed associated uncertainty
tags [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] from the ontology, providing contextual cues for reliability and potential data ambiguity.
        </p>
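        <p>A tiered lookup of the kind described above might look like the following sketch. Only difflib.get_close_matches() is taken from the text; the tier order (exact, then substring, then fuzzy), the default cutoff of 0.6, the function name search, and the sample method names are assumptions for illustration.</p>
        <preformat><![CDATA[
```python
import difflib


def search(query, candidates, n=3, cutoff=0.6):
    """Return exact matches, else substring matches, else fuzzy matches."""
    q = query.strip().lower()
    # Map lower-cased names back to their original spelling.
    lowered = {c.lower(): c for c in candidates}
    if q in lowered:
        return [lowered[q]]
    partial = [orig for low, orig in lowered.items() if q in low]
    if partial:
        return partial
    # Fuzzy fallback for misspellings; n and cutoff are difflib's own defaults.
    close = difflib.get_close_matches(q, list(lowered), n=n, cutoff=cutoff)
    return [lowered[c] for c in close]


# Hypothetical index of fusion-method names.
METHODS = ["Pan-sharpening", "Spatiotemporal fusion", "Decision-level fusion"]

exact_hits = search("pan-sharpening", METHODS)
partial_hits = search("fusion", METHODS)
fuzzy_hits = search("pansharpenning", METHODS)  # misspelled query
```
]]></preformat>
        <p>Trying exact and substring matches first keeps results predictable for well-formed queries and reserves the similarity-based fallback for likely typos.</p>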
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Text Classification Performance</title>
        <p>To assess the effectiveness of transformer-based models for abstract triage, both BERT-Base-Uncased
and RoBERTa-Base were evaluated on the labeled abstract set. Table 1 summarizes the performance
results.</p>
        <p>The fine-tuned BERT model achieved an accuracy of 0.966 and an F1 score of 0.963 on the validation
subset, outperforming the RoBERTa variant. Precision was perfect for the fusion class, while recall was
slightly lower, indicating conservative classification behavior. These results suggest that transformer-based
classifiers are effective for abstract-level domain filtering even with limited data.</p>
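        <p>The relationship between the reported metrics can be made concrete with a short sketch. The labels below are toy values chosen to show the same qualitative pattern (perfect precision with lower recall); they are not the study's validation data, and treating the fusion class as the positive class for F1 is an assumption.</p>
        <preformat><![CDATA[
```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example (1 = fusion-related): one fusion abstract is missed,
# and no non-fusion abstract is labeled as fusion.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0]
metrics = binary_metrics(toy_true, toy_pred)
```
]]></preformat>
        <p>A classifier that never raises a false alarm but misses some positives, as in this toy case, is the "conservative" behavior described above.</p>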
        <p>Classification outputs were linked to ontology instances such as FusionMethod and Dataset, enabling
automated tagging during knowledge graph construction. Low-confidence predictions were routed to
human annotators via the interface for confirmation or correction, preventing classification errors from
propagating into the graph.</p>
        <p>Because the dataset is small and manually curated, results should be interpreted as evidence of
feasibility rather than generalization performance. Nonetheless, the findings confirm that lightweight
fine-tuning can meaningfully support expert-driven literature organization.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Interface Responsiveness and User Experience</title>
        <p>The interface was evaluated for usability and performance. Pagination mechanisms were tested by
injecting simulated datasets to ensure stable loading and filtering. Visual transitions, layout adjustments,
resizing, and mobile responsiveness were validated. Functional buttons such as “View Fusion
Possibilities” and “View Data Fusion Recommendations” successfully directed users to relevant contextual
sections, activated embedded logic, and met anticipated use case scenarios. A small pilot evaluation with
three research assistants (familiar with remote sensing data) confirmed that the color-coded uncertainty
indicators improved interpretability, allowing users to distinguish between confirmed and tentative
links in the graph. Participants highlighted the system’s transparency and ease of use, though they also
noted the need for richer visual encoding of uncertainty in future iterations. Figure 4 shows a sample
knowledge graph query highlighting multiple aspects of uncertainty within search results.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>This work demonstrates the feasibility of a lightweight toolkit that integrates ontology-based
representation, uncertainty tagging, and human-in-the-loop machine learning for organizing remote sensing data
fusion literature. The system prioritizes interpretability and usability over full automation, supporting
expert-guided exploration rather than autonomous knowledge extraction.</p>
      <p>
        The ontology was intentionally designed as a minimal, application-driven schema to enable linking,
filtering, and navigation across papers, datasets, and fusion methods. Structural consistency was verified
using a reasoner, and concepts were refined through iterative curation. Uncertainty is represented as
qualitative metadata to support interpretability rather than formal inference, consistent with prior work
on uncertainty modeling in scientific knowledge systems [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>The classifier serves as a triage mechanism, with expert validation of low-confidence predictions to
prevent error propagation. While evaluation is limited in scale, the results indicate that lightweight
fine-tuning can effectively support expert workflows.</p>
      <p>Limitations include the small dataset, absence of large-scale benchmarking, and lack of automated
uncertainty reasoning. These reflect the prototype nature of the system and motivate future work on
ontology alignment, uncertainty modeling, and scaling.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
      <p>This paper presented a Gradio-based toolkit for curating remote sensing data fusion literature into
searchable, ontology-backed records with editable uncertainty annotations. A lightweight transformer-based
classifier supports abstract-level triage, with expert validation integrated to preserve reliability
and interpretability.</p>
      <p>Future work will focus on expanding ontology coverage, introducing uncertainty provenance
tracking, and incorporating baseline model comparisons. Additional directions include structured user
evaluations, multi-label classification, and scaling the framework to larger literature collections.
Interoperability with external knowledge sources will also be explored.</p>
      <p>Artifacts (code, ontology, and configuration) are available at https://doi.org/10.6084/m9.figshare.29914361
to support reproducibility and community engagement.</p>
      <p>We extend our sincere gratitude to Timothy Mugambi, Nancy Odhiambo, and Braiden Betway for their
invaluable contributions to this research project. This work was supported by seed funding from the
College of Computing at Grand Valley State University.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this manuscript, the author(s) used Generative AI tools for limited purposes
such as grammar and language refinement. All technical content, interpretation of results, and
conclusions were created by the author(s). The author(s) reviewed and edited the output as needed and take(s)
full responsibility for the accuracy, originality, and integrity of the work in compliance with the CEUR
Generative AI policy.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Jaradeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stocker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          , “
          <article-title>Generate FAIR literature surveys with scholarly knowledge graphs,”</article-title>
          <source>in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL</source>
          <year>2020</year>
          ), Wuhan, China,
          <year>June 2020</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>106</lpage>
          , doi: 10.1145/3383583.3398520.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kirchner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mitter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U. A.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sommer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Falkner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Schmid</surname>
          </string-name>
          , “
          <article-title>Uncertainty concepts for integrated modeling - Review and application for identifying uncertainties and uncertainty propagation pathways</article-title>
          ,”
          <source>Environmental Modelling &amp; Software</source>
          , vol.
          <volume>135</volume>
          , p.
          <fpage>104905</fpage>
          ,
          <year>2021</year>
          , doi: 10.1016/j.envsoft.2020.104905.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Ningrum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mayr</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Atanassova</surname>
          </string-name>
          , “
          <article-title>UnScientify: Detecting scientific uncertainty in scholarly full text</article-title>
          ,”
          <source>in Proceedings of the Joint Workshop on the 4th EEKE and the 3rd AII, co-located with JCDL</source>
          <year>2023</year>
          , Santa Clara, CA, USA,
          <year>June 2023</year>
          , pp.
          <fpage>52</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Davidoff</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Chau</surname>
          </string-name>
          , “
          <article-title>Nested fusion: A method for learning high-resolution latent structure of multi-scale measurement data on Mars,”</article-title>
          <source>in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD</source>
          <year>2024</year>
          ), Barcelona, Spain,
          <year>August 2024</year>
          , pp.
          <fpage>5969</fpage>
          -
          <lpage>5978</lpage>
          , doi: 10.1145/3637528.3671596.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hou</surname>
          </string-name>
          et al.,
          “
          <article-title>Bidomain uncertainty gated recursive network for pan-sharpening</article-title>
          ,”
          <source>Information Fusion</source>
          , vol.
          <volume>118</volume>
          , p.
          <fpage>102938</fpage>
          ,
          <year>2025</year>
          , doi: 10.1016/j.inffus.2025.102938.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stocker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          , “
          <article-title>TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation</article-title>
          ,”
          <source>in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL</source>
          <year>2022</year>
          ),
          <year>2022</year>
          , doi: 10.1145/3529372.3533285.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pirklbauer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Balke</surname>
          </string-name>
          , “
          <article-title>A toolbox for the nearly-unsupervised construction of digital library knowledge graphs,”</article-title>
          <source>in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL</source>
          <year>2021</year>
          ), pp.
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>2021</year>
          , doi: 10.1109/JCDL52503.2021.00014.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , “
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,” arXiv preprint arXiv:1810.04805,
          <year>2018</year>
          , doi: 10.48550/ARXIV.1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gohsen</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , “
          <article-title>Assisted knowledge graph authoring: Human-supervised knowledge graph construction from natural language</article-title>
          ,”
          <source>in Proceedings of the ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR</source>
          <year>2024</year>
          ), Sheffield, UK,
          <year>March 2024</year>
          , pp.
          <fpage>376</fpage>
          -
          <lpage>380</lpage>
          , doi: 10.1145/3627508.3638340.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Koh</surname>
          </string-name>
          , “
          <article-title>Teamwork dimensions classification using BERT,”</article-title>
          <source>in Proceedings of the International Conference on Artificial Intelligence in Education (AIED</source>
          <year>2023</year>
          ): Posters and Industry Track, Tokyo, Japan,
          <year>July 2023</year>
          , pp.
          <fpage>254</fpage>
          -
          <lpage>259</lpage>
          , doi: 10.1007/978-3-031-36336-8_39.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>