<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Examining the Representation of Uncertainty in Knowledge Graphs: A Case Study in Copolymer Chemistry</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sarah T. Bachinger</string-name>
          <email>sarah.bachinger@uni-jena.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Heinz Nixdorf Chair for Distributed Information Systems, Friedrich Schiller University Jena</institution>
          ,
          <addr-line>Leutragraben 1, 07743 Jena</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>2</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>In copolymer chemistry, researchers aim to create molecules with desirable properties. The experimental process leading to a successful synthesis can be described as a structured process and inherently contains uncertainties in different forms. Using ontologies to describe this process could help to predict the synthesis setup and conserve resources. In this thesis project, we therefore examine how representing different uncertainties in the ontology can increase prediction success. Guided by five research questions, we detail our approach, which consists of expert interviews, a mapping of the uncertainties from the interviews to known representations in Knowledge Graphs (KGs), and its evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Uncertain Knowledge Graphs</kwd>
        <kwd>Experimental Uncertainty</kwd>
        <kwd>Uncertainty Representation</kwd>
        <kwd>Copolymer Chemistry</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Problem statement and importance</title>
      <p>https://www.fmi.uni-jena.de/en/14619/sarah-t-bachinger (S. T. Bachinger)</p>
      <p>[Figure: workflow diagram with panel letters a, b, c, d, and f; only labels survive. Labels: Chemist with a research question; Knowledge acquisition; Traditional workflow: trial and error; Setup of synthesis; Synthesis setup; Error prediction with respect to uncertain knowledge representation; Biodegradable copolymer.]</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>During a preliminary literature review, we focused on different definitions of the term uncertainty and
its representation in Knowledge Graphs, as well as on ontologies and data sources in the copolymer chemistry
domain.</p>
      <sec id="sec-2-1">
        <title>2.1. Definitions of Uncertainty</title>
        <p>What researchers view as uncertainty varies between scientific domains. For example, Ningrum et
al. [11] discuss this and list different definitions of uncertainty in their study on scientific uncertainty
in scholarly articles.</p>
        <p>In the final report of the W3C Incubator Group on uncertainty reasoning for the World Wide Web, the
authors define uncertainty as “a variety of aspects of imperfect knowledge, including incompleteness,
inconclusiveness, vagueness, ambiguity, and others” [12].</p>
        <p>As a starting point, and building on the W3C Incubator Group definition above, uncertainty in this thesis
refers to everything that hinders the prediction of the experimental outcome, including,
among others, statistical, ethical, societal, or experimental factors. We expect this definition to be adapted
and refined as the research progresses.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Uncertainty in Knowledge Graphs</title>
        <p>Jarnac et al. [13] classify data from different sources with conflicting information in the construction
of a KG into two groups of so-called knowledge deltas and differentiate six types of conflicting data:
invalidity, vagueness, fuzziness, timeliness, ambiguity, and incompleteness.</p>
        <p>In 2008, a W3C Incubator Group released its final report [12] on uncertainty reasoning for the World Wide
Web with two goals: identifying World Wide Web challenges that would
benefit from uncertainty reasoning and, building on that, determining which techniques could be used to improve and
standardize such reasoning. In the report, they describe an uncertainty ontology and list common approaches
to modeling uncertainty in the World Wide Web. We will apply and extend their work for our experimental
domain.</p>
        <p>Recent work on combining KGs with uncertainty includes Ni et al. [14], who propose a
multi-hop reasoning framework combining LLMs with KGs to address uncertainty estimation in LLM-KG
systems. Deng et al. [15] explore the creation of artificial intelligence systems based on structured
knowledge for systems such as generative AI and investigate the handling of uncertainty. Yang et
al. [16] review KG reliability from the perspective of knowledge correctness and uncertainty. Building
upon knowledge representation learning, they compare score function modification, representation
vector optimization, loss function adjustment, and textual information integration. Freedman et al. [17]
combine Bayesian probabilism with existing KGs to transform them into what they call probabilistic KGs.</p>
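        <p>To make the general idea of attaching uncertainty to KG statements more concrete, the following minimal Python sketch stores a confidence value with each triple and filters statements by a threshold. It is an illustration under stated assumptions and is not taken from any of the cited works: the class name UncertainTriple, the helper filter_by_confidence, the example entities, and the confidence values are all hypothetical.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainTriple:
    """A subject-predicate-object statement with a confidence in [0, 1]."""

    subject: str
    predicate: str
    obj: str
    confidence: float

    def __post_init__(self):
        # Reject values that cannot be read as a probability-like score.
        if self.confidence > 1.0 or 0.0 > self.confidence:
            raise ValueError("confidence must lie in [0, 1]")


def filter_by_confidence(triples, threshold):
    """Keep only statements whose confidence reaches the threshold."""
    return [t for t in triples if t.confidence >= threshold]


# Hypothetical example statements; names and values are illustrative only.
triples = [
    UncertainTriple("copolymerA", "hasMonomer", "monomerX", 0.95),
    UncertainTriple("copolymerA", "hasGlassTransition", "approx. 60 C", 0.40),
]
reliable = filter_by_confidence(triples, 0.5)
print([t.predicate for t in reliable])
```
</preformat>
        <p>A single scalar confidence is only one of several possible designs; it cannot distinguish, for example, vagueness from incompleteness, which is why richer representations are of interest here.</p>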
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Copolymer Data and Ontologies</title>
        <p>Currently available datasets for copolymer chemistry include the polyOne dataset [18], which contains
100 million hypothetical polymers along with several predicted parameters, and PoLyInfo [19], which
was introduced in 2011 and recently examined in “NIMS polymer database PoLyInfo (I): an overarching
view of half a million data points” [20]. Its purpose is to collect different polymers and related
information on their structure and processing methods, including measurements, properties, and
monomers. The PoLyInfo Knowledge Collection [6] consists, among other things, of the PoLyInfo
ontology and aims to systematize polymer chemistry and provide a “machine-readable understanding of
the polymer knowledge contained in ‘PoLyInfoRDF’” [21].</p>
        <p>In the paper “NIMS polymer database PoLyInfo (II): machine-readable standardization of polymer
knowledge expression” [22], Ishii et al. describe how polymers, including copolymers, are managed:
they use a shape expression language and IDs to refer to the different variable parts of the
copolymer, such as the repeating unit.</p>
        <p>With PolyNERE [5], the authors created an ontology for polymers, related entities, and relationships,
together with a corpus of abstracts annotated with these features.</p>
        <p>The Conjugated Polymer Process Ontology [7] is used for the design of experiments for organic
field-effect transistors and aims to provide FAIR data for experiments.</p>
        <p>NanoMine [23] aims to provide an open source data resource for polymer nanocomposites.</p>
        <p>Other ontologies that include copolymers as concepts are the Devices, Experimental Scaffolds,
and Biomaterials Ontology [24] and the Materials Data Vocabulary [25, 26].</p>
        <p>PolyMAT [27] is an ontology for polymer membrane research that focuses on use in
electronic laboratory notebooks. We find that PolyNERE and PoLyInfo in particular, as extensive sources
for polymers including copolymers, could serve as interesting applications for examining the occurrence of
missing, incomplete, or uncertain data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Research questions and hypotheses</title>
      <p>Using the domain of copolymer chemistry, we want to examine how uncertainty in different aspects
of the experimentation process can be represented in KGs. Therefore, we pose the following research
questions (RQs).</p>
      <p>• RQ1: What does the term “Uncertainty” refer to in the experimental sciences?
• RQ2: What factors (domain-specific uncertainties) influence the outcome of experiments in a
domain such as copolymer chemistry?
• RQ3: To what extent are existing uncertainty representation approaches able to represent the
specific uncertainties in the experimental sciences, in particular copolymer chemistry?
• RQ4: How can approaches that represent uncertainty in KGs be applied to predict the outcome
of experiments?
• RQ5: If the exploration of the previous RQs shows that there is a need for improvement: How can
a novel uncertainty representation approach with regard to the specific needs of the copolymer
chemistry domain be designed?</p>
      <p>We operate under the assumption that there are certain types of uncertainty in research that are not yet
(well) represented in KGs as a medium for connecting information across different domains. If
these uncertainties were represented, this would help domain experts because the modeled
knowledge could then be used to predict the outcome of experiments. Copolymer chemistry was chosen
as the application domain because of its large potential for improvement. To explore the above-mentioned
research questions, we pose the following hypotheses, where each hypothesis corresponds to
the research question with the same number. The bullet points under each hypothesis list the planned work steps.</p>
      <p>1. Hypothesis 1: The definition of uncertainty depends on the scientific domain.
• Surveying taxonomies and definitions of uncertainties
• Exploratory expert interviews with ten researchers from different scientific domains</p>
      <p>2. Hypothesis 2: There are certain factors in experimental domains that hinder the prediction of
experimental outcomes.
• Domain-specific expert interviews with five experts
• Conducting a requirement analysis for the prediction of experiment outcomes
• Decision on metrics for the evaluation
• Establishing a use case for the remainder of the thesis</p>
      <p>3. Hypothesis 3: Current approaches are not yet able to represent all of the specific uncertainties in
experimental sciences.
• Mapping between the state of the art and the uncertainty types discovered in the
interviews from H1 and H2
• Literature review on the representation of uncertainty in ontologies and specifically in
Knowledge Graphs
• Literature review on Knowledge Graphs in copolymer chemistry</p>
      <p>4. Hypothesis 4: A representation design with uncertainty in mind would enhance reasoning and
therefore have a practical impact.
• Comparative analysis of the different representation approaches on the chosen use case</p>
      <p>5. Hypothesis 5: A new, specific representation approach and the corresponding reasoning abilities improve
results on the chosen use case.
• Development of a novel representation approach
• Evaluation and comparison of the new representation approach with the state-of-the-art
representations</p>
    <sec id="sec-4">
      <title>4. Preliminary results</title>
      <p>From January to May 2025, ten interviews were conducted with scientists from different domains about
uncertainty in their research. This lays the foundation for further exploration and more detailed
questioning in the future. To address Hypothesis 1, we use the sociological method of guideline-based
expert interviews [28] to access domain expert knowledge and uncover areas of insufficient data uncertainty
coverage. We designed a guideline encompassing the following questions, which are intended to be as open as possible:
• What kind of uncertainty do you deal with in your research?
• What kind of uncertainty occurs in your data?
• How do you deal with that (referring to the occurring uncertainty)?
• How is this (referring to both the uncertainty and its handling) represented for future reuse?
We experimented with including the question “How do you define uncertainty?” at the beginning, but
found that it narrows the participants’ answers to mostly statistical uncertainty.</p>
      <p>[Table residue; only the column “Career-level” survives, with the rows: PhD Candidate, Postdocs, (assistant) professors; PhD Candidate; PhD Candidate.]</p>
      <p>For a standardized transcription process, we use the seven working steps outlined by Kuckartz [29].
First, to ensure comparable transcriptions, we adapted the transcription rules from [29]. Second, the
interviews were transcribed using the software “noScribe” [30] in version 0.6.1 and manually corrected.
Currently, we are redacting personal information from the interviews and will use pseudonymized
transcripts for the collection of uncertainty types.</p>
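      <p>The redaction step can be illustrated with a minimal Python sketch that replaces known surface strings (for example, names or places) with stable pseudonyms. This is an assumption-laden illustration and not the procedure actually used for the interviews: the function name pseudonymize, the replacement table, and the example transcript are hypothetical, and a dictionary-based replacement of this kind does not remove the need for manual review.</p>
      <preformat>
```python
import re


def pseudonymize(text, replacements):
    """Replace each listed surface string with its pseudonym.

    `replacements` maps strings found in the transcript to placeholders.
    Several strings may map to the same placeholder (full name, surname).
    """
    # Longest strings first, so a full name is handled before its parts.
    for surface in sorted(replacements, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        pseudonym = replacements[surface]
        # A function replacement avoids backslash interpretation in pseudonyms.
        text = re.sub(pattern, lambda match, p=pseudonym: p, text)
    return text


# Hypothetical replacement table and transcript excerpt.
replacements = {"Jane Doe": "[P01]", "Doe": "[P01]", "Jena": "[CITY]"}
transcript = (
    "Jane Doe: In our lab in Jena, yields vary between batches. "
    "Interviewer: Thank you, Dr. Doe."
)
redacted = pseudonymize(transcript, replacements)
print(redacted)
```
</preformat>
      <p>Mapping several surface forms to one placeholder keeps statements by the same person linkable in the pseudonymized transcript, which matters when uncertainty types are later collected per interviewee.</p>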
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation</title>
      <p>In this section, we connect the research questions and hypotheses from Section 3 with concrete
evaluation steps and expected outcomes. We started from a very generic definition of uncertainty as
everything that hinders the prediction of the experimental outcome, including, among others, statistical,
ethical, societal, or experimental factors. To operate under a more grounded definition, we pose the first
research question, RQ1: What does the term “Uncertainty” refer to in the experimental sciences? To
answer it, we will survey definitions from different disciplines such as mathematics, philosophy, sociology,
and the experimental sciences. We conduct interviews with ten researchers from different scientific
domains to explore the notion of uncertainty in their domain. As an outcome, we expect to obtain diverse
definitions of the term uncertainty as well as examples of uncertainty in different scientific domains.</p>
      <p>At the same time, we will look more specifically at our application domain with the second research
question, RQ2: What factors influence the outcome of experiments in a domain like copolymer chemistry?,
to extract uncertainties specific to copolymer chemistry. Examples that come to mind are incomplete or missing data,
representations of copolymers, and distributions of values. We plan to conduct interviews
with five domain experts about uncertainties in the experimentation process, which will also help us to
establish a use case for the remainder of the thesis. Taking the domain-specific challenges into account,
we conduct a requirement analysis to extract the specific requirements for predicting the outcome
of an experiment and, building on that, metrics for evaluating a knowledge representation approach
with reasoning that includes uncertainty.</p>
      <p>After exploring RQ1 and RQ2, we can compare existing representation methods on our specific use
case and therefore ask RQ3: To what extent are existing uncertainty representation approaches able to
represent the specific uncertainties in the experimental sciences, in particular copolymer chemistry? In a
first step, we will map the uncertainties from the interviews to the definitions. This will enable us to
make grounded assumptions about how to represent these uncertainties, as we can use established methods
for the use case specific to the application domain. For a nuanced comparison, we need to review existing
ontologies in the copolymer domain. An extensive review of the coverage of the application area
copolymer science needs to include both domain-specific entities, such as monomer types, and
information about license and reusability. Second, we will analyze the existing ontologies
with respect to further aspects, including the domain area, whether they are actively used in the
chemistry domain, and whether the described domain includes uncertain data. Furthermore, with the
mapping of domain-specific uncertainties (RQ2) to definitions (RQ1), we can evaluate how well
different types of uncertainty are covered by existing representation and reasoning approaches for KGs. We will
use these insights to decide whether a new representation method is needed because a certain type of
uncertainty is not well represented.</p>
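      <p>The coverage evaluation described above can be sketched in a few lines of Python: given the uncertainty types observed in the interviews and the types each representation approach can express, compute the covered share per approach and the types that no approach covers. This is only an illustrative sketch; the approach names, the type assignments, and the function coverage_report are hypothetical, and the type labels are borrowed from the classification in Section 2.2 purely as examples.</p>
      <preformat>
```python
def coverage_report(observed, coverage):
    """Return (share of observed types covered per approach, uncovered types)."""
    if not observed:
        raise ValueError("at least one observed uncertainty type is required")
    shares = {
        name: len(observed.intersection(types)) / len(observed)
        for name, types in coverage.items()
    }
    # Types expressible by at least one approach.
    covered_by_any = set().union(*coverage.values()) if coverage else set()
    uncovered = observed - covered_by_any
    return shares, uncovered


# Hypothetical inputs; real values would come from the interviews and the
# literature review.
observed_types = {"incompleteness", "vagueness", "ambiguity", "timeliness"}
approach_coverage = {
    "approach_A": {"incompleteness", "vagueness"},
    "approach_B": {"vagueness", "ambiguity"},
}

shares, uncovered = coverage_report(observed_types, approach_coverage)
print(shares)
print(sorted(uncovered))
```
</preformat>
      <p>In this toy input, each approach covers half of the observed types and “timeliness” is covered by neither, which is the kind of gap that would motivate RQ5.</p>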
      <p>To report on RQ4: How can approaches that represent uncertainty in KGs be applied to predict the
outcome of experiments?, we will compare the performance of different state-of-the-art approaches
on our chosen use case, using the evaluation criteria established in H2. With this comparison, we
can (1) evaluate the strengths and drawbacks of each approach and (2) judge whether all aspects of the
use case are appropriately covered.</p>
      <p>Our last research question, RQ5: If the exploration of the previous RQs shows that there is a need for
improvement: How can a novel uncertainty representation approach with regard to the specific needs of
the copolymer chemistry domain be designed?, uses the insights from RQ3 and RQ4 to develop a novel
representation approach. We will use the comparison from RQ4 as a baseline for the novel approach.</p>
      <p>Continuous work packages are (1) keeping up to date with the current literature on
uncertainty in KGs and ontologies for the copolymer domain and (2) thesis writing. We opt for writing and
researching in parallel instead of consecutive research and writing phases. Even though this thesis is
not planned as a cumulative dissertation, we plan to submit milestones to suitable conferences and use
the published results in the thesis.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Reflection and future work</title>
      <p>In conclusion, we have laid out how we plan to investigate the representation of uncertainty in KGs and how
we will apply this to the copolymer chemistry domain. We now reflect on the main steps and outline
future work.</p>
      <sec id="sec-6-1">
        <title>6.1. Expert Interviews</title>
        <p>As reported in Section 4, we have conducted ten exploratory interviews. The next steps are
pseudonymization and coding to cluster different types of uncertainty. Among the copolymer chemists,
only two PhD candidates have been interviewed so far. Further interviews with senior scientists or industry
specialists are necessary to ensure a complete picture. With the start of a new graduate school focused
on the intersection of copolymer chemistry and computer science (COIN), a new pool of potential
interviewees with different levels of expertise is available. Furthermore, the exploratory
interviews make more directed questioning possible: we plan to use them to develop
an enhanced and more focused questionnaire for the domain of copolymer chemistry. As a guideline,
we aim for five interview partners who hold a PhD in copolymer chemistry and will use those interviews to
further clarify the uncertainty types. In addition, we aim to extract at least two types of synthesis
as a use case for the thesis. For RQ1 and RQ2, we depend on the availability of experts and
their willingness to participate in multiple iterations as the use case develops and we classify
uncertainties.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Literature Review</title>
        <p>Although the literature review is an ongoing process, we have identified several areas, such as uncertainty
representation in KGs and copolymer ontologies, where an in-depth review will be beneficial. This includes the
topic of uncertainty in different scientific fields, to capture the state of the art in uncertainty taxonomies.
We plan to map the insights from the interviews to current uncertainty representation methods in
Knowledge Graphs to analyze where gaps in current research exist.</p>
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Development of a novel representation approach</title>
        <p>Generative AI is transforming scientific practice across scientific domains, as recent examples
show [31, 32]. Works such as [33, 34] explore the current abilities and drawbacks of using Large
Language Models for materials science. For example, Miret and Krishnan [33] point out several future
challenges, one of them being the development of domain-specific reasoning techniques that implement
the principles of the domain, namely materials science. While this is still in its early stages, the
combination of LLMs and ontologies for representing the uncertainty in copolymer chemistry might be
an interesting research direction for this thesis.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This thesis is funded by a doctoral scholarship from the State of Thuringia and supported by the Friedrich
Schiller University Jena, the ZiF group “Mapping Evidence to Theory in Ecology”, and the DFG
research training group (RTG) 3040 “Copolymer Informatics (COIN): How digital technologies shape
copolymer chemistry – From design to application” (DFG project number 527537972). It is supervised
by Prof. Dr. Birgitta König-Ries.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author used Writefull for grammar and spelling
checks. After using this service, the author reviewed and edited the content as needed and takes full
responsibility for the publication’s content.</p>
      <p>Knowledge Graphs, number 22 in Synthesis Lectures on Data, Semantics, and Knowledge, Springer,
2021.
[11] P. K. Ningrum, P. Mayr, N. Smirnova, I. Atanassova, Annotating Scientific Uncertainty: A
comprehensive model using linguistic patterns and comparison with existing approaches, Journal of
Informetrics 19 (2025) 101661. doi:10.1016/j.joi.2025.101661. arXiv:2503.11376.
[12] K. J. Laskey, K. B. Laskey, P. C. G. Costa, M. M. Kokar, T. Martin,
T. Lukasiewicz, Uncertainty Reasoning for the World Wide Web, W3C Incubator Group
Report, 2008. URL: http://www.w3.org/2005/Incubator/urw3/XGR-urw3/.
[13] L. Jarnac, Y. Chabot, M. Couceiro, Uncertainty Management in the Construction of Knowledge
Graphs: A Survey, Transactions on Graph Data and Knowledge 3 (2025) 3:1–3:48. URL:
https://drops.dagstuhl.de/storage/08tgdk/tgdk-vol003/tgdk-vol003-issue001/TGDK.3.1.3/TGDK.3.1.3.pdf.
doi:10.4230/tgdk.3.1.3, publisher: Schloss Dagstuhl – Leibniz-Zentrum fuer Informatik.
[14] B. Ni, Y. Wang, L. Cheng, E. Blasch, T. Derr, Towards Trustworthy Knowledge Graph Reasoning:
An Uncertainty Aware Perspective, Proceedings of the AAAI Conference on Artificial Intelligence
39 (2025) 12417–12425. doi:10.1609/aaai.v39i12.33353.
[15] S. Deng, N. Zhang, T. Wu, From Certainty to Uncertainty in Knowledge: Exploring Modeling,
Extraction, Representation, and Applications, in: Handbook on Neurosymbolic AI and Knowledge
Graphs, IOS Press, 2025, pp. 319–362. doi:10.3233/FAIA250214.
[16] Y. Yang, J. Chen, Y. Xiang, A review on the reliability of knowledge graph: From a knowledge
representation learning perspective, World Wide Web 28 (2024) 4. doi:10.1007/s11280-024-01316-w.
[17] H. Freedman, N. Abolhassani, J. Metzger, S. Paul, Ontology Modeling for Probabilistic Knowledge
Graphs, in: 2023 IEEE 17th International Conference on Semantic Computing (ICSC), 2023, pp.
252–259. doi:10.1109/ICSC56153.2023.00049.
[18] C. Kuenneth, R. Ramprasad, polyOne Data Set - 100 million hypothetical polymers including 29
properties, 2022. doi:10.5281/zenodo.7766806.
[19] S. Otsuka, I. Kuwajima, J. Hosoya, Y. Xu, M. Yamazaki, PoLyInfo: Polymer Database for Polymeric
Materials Design, in: 2011 International Conference on Emerging Intelligent Data and Web
Technologies, 2011, pp. 22–29. doi:10.1109/EIDWT.2011.13.
[20] M. Ishii, T. Ito, H. Sado, I. Kuwajima, NIMS polymer database PoLyInfo (I): An overarching view
of half a million data points, Science and Technology of Advanced Materials: Methods 4 (2024)
2354649. doi:10.1080/27660400.2024.2354649.
[21] M. Ishii, T. Takemura, M. Tanifuji, PoLyInfo RDF: A Semantically Reinforced Polymer Database
for Materials Informatics, in: International Workshop on the Semantic Web, 2019.
[22] M. Ishii, T. Ito, K. Sakamoto, NIMS polymer database PoLyInfo (II): Machine-readable
standardization of polymer knowledge expression, Science and Technology of Advanced Materials: Methods
4 (2024) 2354651. doi:10.1080/27660400.2024.2354651.
[23] V. Rawte, J. P. McCusker, H. Zhao, L. C. Brinson, W. Chen, L. Schadler, D. L. McGuinness, An
Ontology for a Polymer Nanocomposite Community Data Resource, in: Proceedings of the 2017
ACM on Web Science Conference, WebSci ’17, Association for Computing Machinery, New York,
NY, USA, 2017, pp. 411–412. doi:10.1145/3091478.3098866.
[24] O. Hakimi, J. L. Gelpi, M. Krallinger, F. Curi, D. Repchevsky, M.-P. Ginebra, The Devices,
Experimental Scaffolds, and Biomaterials Ontology (DEB): A Tool for Mapping, Annotation, and
Analysis of Biomaterials Data, Advanced Functional Materials 30 (2020) 1909910.
doi:10.1002/adfm.201909910.
[25] A. Medina-Smith, C. A. Becker, R. L. Plante, L. M. Bartolo, A. Dima, J. A. Warren, R. J. Hanisch, A
Controlled Vocabulary and Metadata Schema for Materials Science Data Discovery, Data Science
Journal 20 (2021) 18. doi:10.5334/dsj-2021-018.
</p>
      <p>[26] A. Medina-Smith, C. Becker, Simple Knowledge Organization System (SKOS) version of Materials
Data Vocabulary, 2017. doi:10.18434/T4/1435037.
[27] M. Dembska, M. Held, S. Schindler, PolyMat ontology, 2024. doi:10.5281/ZENODO.10286389.
[28] C. Helfferich, Leitfaden- und Experteninterviews, in: N. Baur, J. Blasius (Eds.), Handbuch
Methoden der empirischen Sozialforschung, Springer Fachmedien, Wiesbaden, 2022, pp. 875–892.
doi:10.1007/978-3-658-37985-8_55.
[29] U. Kuckartz, Qualitative Inhaltsanalyse. Methoden, Praxis, Computerunterstützung, 2018.
[30] K. Dröge, Kaixxx/noScribe, 2025. URL: https://github.com/kaixxx/noScribe.
</p>
      <p>[31] K. M. Jablonka, Q. Ai, A. Al-Feghali, S. Badhwar, J. D. Bocarsly, A. M. Bran, S. Bringuier, L. C.
Brinson, K. Choudhary, D. Circi, S. Cox, W. A. de Jong, M. L. Evans, N. Gastellu, J. Genzling, M. V.
Gil, A. K. Gupta, Z. Hong, A. Imran, S. Kruschwitz, A. Labarre, J. Lála, T. Liu, S. Ma, S. Majumdar,
G. W. Merz, N. Moitessier, E. Moubarak, B. Mouriño, B. Pelkie, M. Pieler, M. C. Ramos, B. Ranković,
S. G. Rodriques, J. N. Sanders, P. Schwaller, M. Schwarting, J. Shi, B. Smit, B. E. Smith, J. V. Herck,
C. Völker, L. Ward, S. Warren, B. Weiser, S. Zhang, X. Zhang, G. A. Zia, A. Scourtas, K. J. Schmidt,
I. Foster, A. D. White, B. Blaiszik, 14 examples of how LLMs can transform materials science
and chemistry: A reflection on a large language model hackathon, Digital Discovery 2 (2023)
1233–1250. doi:10.1039/D3DD00113J.
[32] J. V. Herck, M. V. Gil, K. M. Jablonka, A. Abrudan, A. S. Anker, M. Asgari, B. Blaiszik, A. Bufo,
L. Choudhury, C. Corminboeuf, H. Daglar, A. M. Elahi, I. T. Foster, S. Garcia, M. Garvin, G. Godin,
L. L. Good, J. Gu, N. X. Hu, X. Jin, T. Junkers, S. Keskin, T. P. J. Knowles, R. Laplaza, M. Lessona,
S. Majumdar, H. Mashhadimoslem, R. D. McIntosh, S. M. Moosavi, B. Mouriño, F. Nerli, C. Pevida,
N. Poudineh, M. Rajabi-Kochi, K. L. Saar, F. H. Saboor, M. Sagharichiha, K. J. Schmidt, J. Shi,
E. Simone, D. Svatunek, M. Taddei, I. Tetko, D. Tolnai, S. Vahdatifar, J. Whitmer, D. C. F. Wieland,
R. Willumeit-Römer, A. Züttel, B. Smit, Assessment of fine-tuned large language models for
real-world chemistry and material science applications, Chemical Science 16 (2025) 670–684.
doi:10.1039/D4SC04401K.
</p>
      <p>[33] S. Miret, N. M. A. Krishnan, Enabling large language models for real-world materials discovery,
Nature Machine Intelligence 7 (2025) 991–998. doi:10.1038/s42256-025-01058-y.
[34] E. O. Pyzer-Knapp, M. Manica, P. Staar, L. Morin, P. Ruch, T. Laino, J. R. Smith, A. Curioni,
Foundation models for materials discovery – current state and future directions, npj Computational
Materials 11 (2025) 61. doi:10.1038/s41524-025-01538-0.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Feldman</surname>
          </string-name>
          , Polymer History,
          <source>Designed Monomers and Polymers</source>
          <volume>11</volume>
          (
          <year>2008</year>
          )
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . doi:10.1163/156855508X292383.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Lynd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Meuler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hillmyer</surname>
          </string-name>
          ,
          <article-title>Polydispersity and block copolymer self-assembly</article-title>
          ,
          <source>Progress in Polymer Science</source>
          <volume>33</volume>
          (
          <year>2008</year>
          )
          <fpage>875</fpage>
          -
          <lpage>893</lpage>
          . doi:10.1016/j.progpolymsci.2008.07.003.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Tyler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Trauner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Glorius</surname>
          </string-name>
          ,
          <article-title>Reaction development: A student's checklist</article-title>
          ,
          <source>Chemical Society Reviews</source>
          <volume>54</volume>
          (
          <year>2025</year>
          )
          <fpage>3272</fpage>
          -
          <lpage>3292</lpage>
          . doi:10.1039/D4CS01046A.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Strömert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hunold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Castro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Koepler</surname>
          </string-name>
          ,
          <article-title>Ontologies4Chem: The landscape of ontologies in chemistry</article-title>
          ,
          <source>Pure and Applied Chemistry</source>
          <volume>94</volume>
          (
          <year>2022</year>
          )
          <fpage>605</fpage>
          -
          <lpage>622</lpage>
          . doi:10.1515/pac-2021-2007.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.-T.</given-names>
            <surname>Phi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Teranishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Oka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ishii</surname>
          </string-name>
          ,
          <article-title>PolyNERE: A Novel Ontology and Corpus for Named Entity Recognition and Relation Extraction in Polymer Science Domain</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Calzolari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-Y.</given-names>
            <surname>Kan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hoste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sakti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Xue</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)</source>
          , ELRA and ICCL, Torino, Italia,
          <year>2024</year>
          , pp.
          <fpage>12856</fpage>
          -
          <lpage>12866</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ishii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sakamoto</surname>
          </string-name>
          ,
          <source>PoLyInfo Knowledge Collection</source>
          ,
          <year>2024</year>
          . doi:10.48505/NIMS.4413.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Venkatesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Bonsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Volkovinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Reichmanis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Grover</surname>
          </string-name>
          ,
          <article-title>Conjugated Polymer Process Ontology and Experimental Data Repository for Organic Field-Effect Transistors</article-title>
          ,
          <source>Chemistry of Materials</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
          <fpage>8816</fpage>
          -
          <lpage>8826</lpage>
          . doi:10.1021/acs.chemmater.3c01842.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Jarnac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Couceiro</surname>
          </string-name>
          ,
          <article-title>Uncertainty Management in the Construction of Knowledge Graphs: A Survey</article-title>
          ,
          <year>2024</year>
          . doi:10.48550/arXiv.2405.16929. arXiv:2405.16929.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Janjua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <article-title>A Survey on State-of-the-art Techniques for Knowledge Graphs Construction and Challenges ahead</article-title>
          ,
          in:
          <source>2021 IEEE Fourth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>99</fpage>
          -
          <lpage>103</lpage>
          . doi:10.1109/AIKE52691.2021.00021.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hogan</surname>
          </string-name>
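          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Blomqvist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cochez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>d'Amato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>de Melo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kirrane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Labra Gayo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Neumaier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C.</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Rashid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmelzeisen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sequeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,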
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>