<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic Creation of Knowledge Graphs from IEC 61850-Tagged Datasets for Renewable Energy Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lina Nachabe</string-name>
          <email>lina.nachabe@emse.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maxime Lefrançois</string-name>
          <email>maxime.lefrancois@emse.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cyril Efantin</string-name>
          <email>cyril.efantin@edf.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antoine Zimmermann</string-name>
          <email>antoine.zimmermann@emse.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EDF R&amp;D</institution>
          ,
          <addr-line>Palaiseau</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Mines Saint-Etienne, Univ Clermont Auvergne, INP Clermont Auvergne, CNRS, UMR 6158 LIMOS</institution>
          ,
          <addr-line>Saint-Étienne</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapid growth of renewable energy, particularly solar, in the move toward Industry 4.0 has amplified the need for findable, accessible, interoperable, and reusable (FAIR) datasets collected from meteorological stations, sensors, renewable energy equipment, infrastructure, and other sources. In particular, solar plants generate large volumes of operational data, often structured using the IEC 61850 standard, a widely adopted protocol for energy systems that defines models for communication between intelligent devices. While IEC 61850 ensures consistency at the data exchange level, it does not provide the full semantic interoperability needed for advanced analytics, data discovery, cross-domain integration, and automated reasoning. This paper describes an automatic pipeline for transforming IEC 61850-tagged data into knowledge graphs (KGs), enabling integration with diverse datasets such as weather and grid structure for operation and maintenance services as well as energy prediction services. The pipeline transforms these tags into a KG aligned with the Omega-X ontology, reducing the need for manual intervention, enabling continuous data integration, and enhancing the reuse of these data by service providers in the energy sector. We evaluate this approach in the context of the Omega-X project, using real-world datasets from a solar park that combine meteorological and electrical parameters.</p>
      </abstract>
      <kwd-group>
        <kwd>IEC 61850</kwd>
        <kwd>knowledge graph construction pipeline</kwd>
        <kwd>renewables solar ontology</kwd>
        <kwd>solar datasets</kwd>
        <kwd>FAIR</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The continuous use of renewable energy systems has driven the emergence of new digital services
designed to accelerate the integration of photovoltaic (PV) technologies. These services include the
optimization of Operation &amp; Maintenance (O&amp;M) workflows for early fault detection, as well as
congestion forecasting and energy production estimation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Such services rely on heterogeneous
datasets collected from solar park equipment, weather stations, infrastructure systems, and other related
sources. However, they are often developed using siloed datasets originating from individual companies
or even specific departments within an organisation. As a result, the models and services built on
these limited datasets may not be reusable by other stakeholders and have limited performance when
applied across diverse environments or varying operational contexts. The European Data
Space (DS) initiative aims to overcome these barriers by promoting data sharing across multiple stakeholders.
In particular, the European Energy DS aims to enable data exchange between data providers and service
providers in order to develop more resilient and adaptable services that enhance the performance of PV systems [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
This approach enables interoperability, innovation, and efficient energy management while maintaining
data sovereignty. In the context of the European Energy DS, the Omega-X project<fn><label>1</label><p>Omega-X: Orchestrating an interoperable sovereign federated Multi-vector Energy data space built on open standards and ready for GAia-X, funded by the European Union’s Horizon Europe Framework Programme under Grant Agreement No. 101069287, https://omega-x.eu</p></fn> addressed semantic
interoperability challenges across four use case families (renewables, flexibility, local energy
communities, and electro-mobility) using ontologies and Knowledge Graphs (KGs). Within
the project, datasets exchanged between data providers and service providers are semantified using
the Omega-X Ontology<sup>2</sup>, which models the structure of diverse energy datasets and services. In the
renewable use case family, different data providers wanted to share solar energy datasets about pilot
sites. Électricité de France (EDF) was one of them and wanted to share a heterogeneous set of CSV files
that rely on the International Electrotechnical Commission IEC 61850 standard [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and are generated
on a weekly basis for a pilot site<sup>3</sup>. The IEC 61850 standard is part of the IEC Technical Committee 57
reference architecture for electric power systems, and defines a communication protocol for power
utility automation. It defines a general tagging mechanism for data elements, specialised for several
domains such as substation automation<sup>4</sup>, hydropower plants, wind plants, and solar plants [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Although
standardised, IEC 61850 tags are difficult for non-domain experts to interpret, which makes it difficult to
integrate datasets based on these tags with other datasets. In addition, the solar power datasets shared
by EDF presented different structures with no explicit semantics, hindering both automated integration
and cross-dataset analysis. This paper reports on a pipeline for creating KGs conformant to the Omega-X
ontology from these IEC 61850-tagged datasets.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Automatic pipeline for KG creation</title>
      <p>
        In IEC 61850 tags, each physical device (Ph, in this case the solar park) consists of logical devices
(LD), which consist of logical nodes (LN) that represent specific functions, and have data objects (DO)
qualified by data attributes (DA) and functional constraints (FC) that categorise their purpose [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The
specification defines different ways to express IEC 61850 tags. In this paper, we use the
following:
      </p>
      <p>Ph_LD\LN.DO.DA.FC</p>
      <p>For example, the tag PARK_ECP001_S3_SHL001_Inverter01\s4dinv.heatsinktmp.mag.f can be
decomposed as follows: PARK is the Ph and represents a PV park. ECP001_S3_SHL001_Inverter01
is the LD and represents an inverter. s4dinv is the LN and represents the specific function of averaging
some property of the inverter over a period of 10 minutes. heatsinktmp is the DO and means that the property of
interest is the heat sink temperature of the inverter. The DA.FC part, mag.f, means magnitude expressed as
a float.</p>
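      <p>The decomposition above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline's actual code: the regular expression, the class name Iec61850Tag, and the function parse_tag are assumptions introduced here, and the sketch assumes that the Ph contains no underscore and that LN, DO, DA, and FC contain no dots.</p>
      <preformat><![CDATA[
```python
import re
from dataclasses import dataclass

# Pattern for the Ph_LD\LN.DO.DA.FC layout.
# Assumption: Ph is the text before the first underscore, and
# LN, DO, DA and FC are dot-separated and contain no dots themselves.
TAG_PATTERN = re.compile(
    r"^(?P<ph>[^_\\]+)_(?P<ld>[^\\]+)\\"
    r"(?P<ln>[^.]+)\.(?P<do>[^.]+)\.(?P<da>[^.]+)\.(?P<fc>[^.]+)$"
)


@dataclass(frozen=True)
class Iec61850Tag:
    """Components of one IEC 61850 tag (hypothetical helper class)."""
    ph: str  # physical device, e.g. the PV park
    ld: str  # logical device, e.g. an inverter
    ln: str  # logical node (function)
    do: str  # data object (property of interest)
    da: str  # data attribute
    fc: str  # functional constraint

    @property
    def evaluation_point(self) -> str:
        """Ph_LD part, which identifies the device the data is about."""
        return f"{self.ph}_{self.ld}"


def parse_tag(tag: str) -> Iec61850Tag:
    """Split a tag into its six components, or raise ValueError."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a Ph_LD\\LN.DO.DA.FC tag: {tag!r}")
    return Iec61850Tag(**match.groupdict())


tag = parse_tag(r"PARK_ECP001_S3_SHL001_Inverter01\s4dinv.heatsinktmp.mag.f")
print(tag.evaluation_point, tag.ln, tag.do, tag.da, tag.fc)
```
]]></preformat>
      <p>Grouping CSV headers by the evaluation_point value is one way to obtain the per-device segmentation that the pipeline relies on.</p>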
      <p>
        These tags are the column headers of the CSV files, which are extracted each week. Each file
encompasses 250 tags. The aim is to transform these CSV files into a KG conformant with the Omega-X
pattern.<sup>5</sup> To this end, we implemented a semantic Extract, Transform, Load (ETL) pipeline (illustrated
in Figure 1) that transforms CSV files into a unified and understandable representation using KGs.
Since the ontologies and data schemas were pre-defined in the project, and the input data followed
standard protocols, we adopted a top-down approach for KG creation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The proposed pipeline starts
after the extraction of the CSV files.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Pipeline tools</title>
        <p>
          In the process of building KGs from heterogeneous data sources, RML (RDF Mapping Language) is used
to define declarative rules that map data from formats like CSV, JSON, or XML, into RDF [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ]. However,
manually writing RML rules for each dataset can be time-consuming and error-prone, especially when
dealing with frequent updates or structurally similar files from different sources. To streamline this
process, we use Jinja, a Python templating engine, to dynamically generate RML rules.
When a new dataset needs to be integrated, these templates are automatically populated with the
specific configuration, producing valid RML mappings on the fly. To process RML files, we selected
SDM-RDFizer [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] due to its open-source nature, ease of use, and Python-based implementation. For
KG storage, Ontotext GraphDB<sup>6</sup> is used as a triple store tailored for storing, managing, and
querying RDF data. It supports advanced features such as reasoning and querying with the SPARQL query
language, which enables flexible and expressive querying over linked datasets.
        </p>
        <fn-group>
          <fn><label>2</label><p>https://w3id.org/omega-x/repository</p></fn>
          <fn><label>3</label><p>Detailed information about the pilot site cannot be disclosed in this paper for confidentiality reasons.</p></fn>
          <fn><label>4</label><p>Substations are key facilities of the power grid where voltage is transformed and electricity is routed between generation, transmission, and distribution networks.</p></fn>
          <fn><label>5</label><p>The Omega-X pattern can be found on GitHub: https://github.com/NaveenVarmaK/Pipeline_Mapping_IEC61850_OmegaX-CSDM/images/pattern.jpg</p></fn>
        </fn-group>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Pipeline workflow</title>
        <p>The pipeline is FAIRly available on GitHub<sup>7</sup>, is illustrated in Figure 1, and consists of the following steps:</p>
        <list list-type="order">
          <list-item><p>Extract: The extraction step begins with the normalisation of timestamps, converting all temporal
data into the ISO 8601 date &amp; time format with the UTC timezone. Following this, the list of
devices is identified, and the dataset is segmented by device. For each device, a dedicated CSV file
is generated. This step makes it possible to attach the Ph_LD evaluation point to the data collection.</p></list-item>
          <list-item><p>Transform: This stage constitutes the core of our semantic enrichment pipeline. The CSV files
are transformed into KGs using RML mappings generated from a Jinja template<sup>8</sup>. The RML files
are then processed using SDM-RDFizer to create the KG.</p></list-item>
          <list-item><p>Load: Once generated, the KGs are loaded into a GraphDB triple store, which serves as the semantic
data endpoint. The stored data is queried using SPARQL, enabling the validation and evaluation
of the system against a set of competency questions.</p></list-item>
        </list>
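        <p>As an illustration of the Transform step, the following sketch renders an RML mapping from a text template. It is not the project's template: it uses Python's standard string.Template as a stand-in for Jinja so that it runs without third-party packages, it only covers the ets:DataPoint, ets:belongsTo, and ets:dataTime terms visible in the example KG of this paper, and the column names timestamp and row_id, the file name, and the function render_rml are hypothetical.</p>
        <preformat><![CDATA[
```python
from string import Template

# rr, rml and ql are the R2RML/RML namespaces; ets is the Omega-X
# EventTimeSeries namespace used in the paper's example KG.
PREFIXES = """\
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ets: <https://w3id.org/omega-x/ontology/EventTimeSeries/> .
"""

# One triples map per time series. "${name}" placeholders are filled in by
# Python; "{row_id}" is left untouched so that the RML processor can resolve
# it against the (hypothetical) row_id column of the CSV file.
TRIPLES_MAP = Template("""\
<#DataPoints_${map_id}> a rr:TriplesMap ;
  rml:logicalSource [ rml:source "${csv_path}" ; rml:referenceFormulation ql:CSV ] ;
  rr:subjectMap [ rr:template "${series_iri}_DP_{row_id}" ; rr:class ets:DataPoint ] ;
  rr:predicateObjectMap [ rr:predicate ets:belongsTo ;
    rr:objectMap [ rr:constant <${series_iri}> ] ] ;
  rr:predicateObjectMap [ rr:predicate ets:dataTime ;
    rr:objectMap [ rml:reference "${time_column}" ; rr:datatype xsd:dateTime ] ] .
""")


def render_rml(device, week, csv_path, series_names, time_column="timestamp"):
    """Render an RML document with one triples map per time series of a device."""
    maps = []
    for name in series_names:
        maps.append(TRIPLES_MAP.substitute(
            # Keep only characters that are safe in the fragment identifier.
            map_id="".join(c if c.isalnum() else "_" for c in name),
            csv_path=csv_path,
            series_iri=f"{device}/{week}/{name}",
            time_column=time_column,
        ))
    return PREFIXES + "\n" + "\n".join(maps)


print(render_rml(
    "PARK_ECP001_S3_SHL001Inverter01", "week-1",
    "inverter01.csv", ["s4dinv.heatsinktmp"],
))
```
]]></preformat>
        <p>Because string.Template only substitutes placeholders that start with a dollar sign, the {row_id} reference in rr:template passes through unchanged. With Jinja, a loop inside the template would play the role of the Python loop used here.</p>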
        <p>
          To illustrate the proposed pipeline, we use the example of two CSV datasets, one containing
meteorological data and another containing inverter data from the EDF pilot site. We first extract the list of devices from
the CSV files to create the topology KG. Typically, the topology creation step is performed once and
its result is reused by the service providers to enable intelligent services. Weather station and inverter datasets
are then transmitted periodically, typically on a weekly basis. To determine the unit and the
property, rule-based parsing (regex + dictionary lookup) is used. Regular expressions allow
flexible pattern matching on text, while dictionary lookup ensures correct identification of known
units and properties, improving the accuracy and maintainability of the parsing system [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The dictionary
contains, for each LN and DO that may be found in an IEC 61850 tag, the corresponding unit from the
QUDT unit vocabulary<sup>9</sup> and the corresponding property from the Omega-X property module. The produced KGs are stored
in the GraphDB endpoint. Figure 2 depicts a KG for PARK_ECP001_S3_SHL001Inverter01.
        </p>
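        <p>The rule-based parsing described above might look like the following sketch. It is an assumption-laden illustration, not the project's implementation: the names UNIT_PROPERTY_LOOKUP, LN_DO_PATTERN, and resolve_unit_and_property are introduced here, lower-casing the keys is an assumption, and the only dictionary entry shown is the heat sink temperature pairing that appears in this paper's example.</p>
        <preformat><![CDATA[
```python
import re

# Lookup table keyed by (LN, DO). Only the heat sink entry is taken from the
# paper's example; a real dictionary would hold one entry per LN/DO pair
# that can occur in the datasets.
UNIT_PROPERTY_LOOKUP = {
    ("s4dinv", "heatsinktmp"): ("unit:DEG_C", "prop:HeatSinkTemperature"),
}

# Captures the LN and DO that follow the backslash in a
# Ph_LD\LN.DO.DA.FC column header.
LN_DO_PATTERN = re.compile(r"\\(?P<ln>[^.\\]+)\.(?P<do>[^.]+)\.")


def resolve_unit_and_property(header):
    """Return (unit, property) for a CSV header, or None when it is unknown."""
    match = LN_DO_PATTERN.search(header)
    if match is None:
        return None  # not an IEC 61850 tag, e.g. a timestamp column
    key = (match.group("ln").lower(), match.group("do").lower())
    return UNIT_PROPERTY_LOOKUP.get(key)


header = r"PARK_ECP001_S3_SHL001_Inverter01\s4dinv.heatsinktmp.mag.f"
print(resolve_unit_and_property(header))
```
]]></preformat>
        <p>Returning None for unknown pairs makes missing dictionary entries easy to detect and report before any mapping is generated.</p>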
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Pipeline Evaluation and lessons learned</title>
        <fig>
          <label>Figure 2</label>
          <preformat>@prefix ets: &lt;https://w3id.org/omega-x/ontology/EventTimeSeries/&gt; .
@prefix eds: &lt;https://w3id.org/omega-x/ontology/EnergyDataSet/&gt; .
@prefix prop: &lt;https://w3id.org/omega-x/ontology/Property/&gt; .
@prefix unit: &lt;http://qudt.org/vocab/unit/&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .

&lt;PARK_ECP001_S3_SHL001Inverter01/week-1&gt; a ets:DataCollection, eds:EnergyDataSet ;
  eds:includesEvaluationPoint &lt;PARK_ECP001_S3_SHL001Inverter01&gt; ;
  eds:isExchangedIn &lt;PARK_ECP001_S3_SHL001Inverter01/week-1/EC&gt; ;
  ets:comprises &lt;PARK_ECP001_S3_SHL001Inverter01/week-1/4dinv.heatsinktmp&gt; .

&lt;PARK_ECP001_S3_SHL001Inverter01/week-1/4dinv.heatsinktmp&gt; a ets:DataCollection ;
  ets:isAboutProperty prop:HeatSinkTemperature ;
  ets:hasUnit unit:DEG_C .

&lt;PARK_ECP001_S3_SHL001Inverter01/week-1/4dinv.heatsinktmp_DP_6163972&gt; a ets:DataPoint ;
  ets:belongsTo &lt;PARK_ECP001_S3_SHL001Inverter01/week-1/4dinv.heatsinktmp&gt; ;
  ets:hasPropertyValue &lt;PARK_ECP001_S3_SHL001Inverter01/week-1/s4dinv.heatsinktmp_DP_6163972_PV6163972&gt; ;
  ets:dataTime "2025-01-06 00:13:20"^^xsd:dateTime .</preformat>
        </fig>
        <p>To validate our pipeline, we conducted tests using data from sixteen inverters, sixteen weather
stations, and approximately one hundred combiner boxes over eighteen weeks. Each week, the system
processed around 1009 data points. The inverter data included DC current, DC voltage, DC power,
enclosure temperature, heat sink temperature, and total active power. From the weather stations, we
captured ambient temperature, plane of array insolation, and back-of-panel temperature. The evaluation
of the produced KG focuses on its completeness, consistency, size, and usability for intelligent services
in renewable energy use cases.</p>
        <p>To assess the completeness of the KG, we executed a list of fifty competency questions (CQs),
written as SPARQL queries and executed in GraphDB. The queries aim to
retrieve the device, the property, the unit of measure, the timestamp, and the values. In addition, for predictive
maintenance services, we used queries depicting the topology of the park. These queries retrieve the
connected devices as well as some datasheet properties, such as maximum DC current and maximum DC
voltage for each inverter. All needed information was successfully retrieved. The responses to these
queries proved highly valuable for the service providers of the Omega-X project, who had found
it difficult to interpret the raw CSV files before semantic enrichment. To ensure semantic consistency,
we used the Pellet reasoner [10]. Furthermore, we evaluated the number of triples of the produced
KG. The weekly inverter KG consists of 10240 triples generated from a CSV file with 39 columns and
1009 rows, while the weekly weather station KG consists of 6144 triples generated from a CSV file
with 19 columns and 1009 rows.</p>
        <p>The semantic enrichment significantly improves the data’s utility
and its interoperability with data from other providers who did not use IEC 61850 tags
but reused the same Omega-X pattern to semantically describe their measurements. For example, the
predictive maintenance service was tested on two distinct datasets, provided by EDF (which uses IEC 61850)
and by ESTABANELL-EYPESA. This demonstration illustrates the potential of deploying services across
different companies that use heterogeneous dataset formats by relying on a common data representation
model. Moreover, the shared taxonomy of properties used in the KG allows the dataset to be reused
in flexibility services. To evaluate the usability of the produced KGs, we assessed their adoption by
service providers. Four out of five service providers in Omega-X were able to successfully use the KGs
to build their services, demonstrating the practicality and effectiveness of the graph structure and
data model. The remaining one encountered difficulties due to missing descriptions of some inverter
properties, whose inclusion requires significant manual effort and time.</p>
        <p>On average, the
complete pipeline for generating a weekly KG for a single device takes less than 1 minute. To better
understand performance at a finer granularity, we analysed a representative execution using a CSV file
for an inverter. All experiments and performance evaluations were conducted on a personal laptop
with the following specifications: CPU: 11th Gen Intel® Core™
i7-1185G7 @ 3.00 GHz, 4 cores, 8 threads; CPU frequency: 400 MHz (min) to 4.8 GHz (max); RAM: 16
GB; operating system: Ubuntu 22.04 LTS (64-bit). Over the span of 18 weeks, the creation
of the KG took 47.3 seconds on average, with a peak memory usage of 75.20 MB.</p>
        <fn-group>
          <fn><label>6</label><p>https://www.ontotext.com/products/graphdb/</p></fn>
          <fn><label>7</label><p>All source files are available on GitHub: https://github.com/NaveenVarmaK/Pipeline_Mapping_IEC61850_OmegaX-CSDM</p></fn>
          <fn><label>8</label><p>https://jinja.palletsprojects.com/en/stable/templates/</p></fn>
          <fn><label>9</label><p>https://qudt.org/3.1.4/vocab/unit</p></fn>
        </fn-group>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion</title>
      <p>This paper described the steps of an automatic ETL pipeline that creates KGs from IEC 61850-tagged data,
structured using the Omega-X ontology. The proposed solution consists of three main components:
extraction of devices and grouping of data per device; KG generation using Jinja and RML files; and KG storage
in a GraphDB triple store for information retrieval using SPARQL queries defined from CQs. To the best
of our knowledge, using text-based templates to generate RML mappings and then applying these mappings
automatically is an innovative approach that proved to be both simple and a pragmatic fit for our needs.
Since RML mappings are expressed in RDF, an alternative would be to use RML itself to generate
RML mappings, although we believe it would be less understandable and maintainable. The produced
KGs have been used by various service providers in the Omega-X project
to deliver services related to predictive maintenance and energy prediction. This has improved the
understandability and reusability of the datasets. Moreover, the application of semantic techniques
has proven effective in addressing the interoperability challenges between datasets from the same
data provider as well as across different data providers. The creation of these KGs paves
the way for their application across diverse use case families. Notably, they enhance scenarios where
energy production prediction is required, such as in flexibility-related use cases. EDF is planning
to test this pipeline in departments other than solar production where IEC 61850 is used for data
exchange. However, a major challenge lies in the scale of these KGs, the diversity of data sources, and
the complexity introduced by IEC 61850 tags. There is a clear need for an adaptable approach in which the
mappings are generated based on the IEC 61850 tags encountered in the data.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <collab>ENERShare Consortium</collab>, <article-title>Blueprint of the Common European Energy Data Space (CEEDS)</article-title>, <source>Technical Report</source>, <year>2024</year>. Accessed: 2025-05-07.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <collab>OMEGA-X Consortium</collab>, <article-title>D3.4: Data Analytic Services and Requirements Related to Interoperability, Security, Privacy, and Data Sovereignty</article-title>, <source>Technical Report</source>, <year>2024</year>. Accessed: 2025-05-07.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <collab>Tatsoft</collab>, <article-title>IEC 61850 overview and implementation</article-title>, <year>2021</year>. URL: https://tatsoft.com/wp-content/uploads/2021/10/IEC61850.pdf, accessed: 2025-04-01.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] <string-name><given-names>E.</given-names> <surname>Tebekaemi</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Wijesekera</surname></string-name>, <article-title>Designing an IEC 61850 based power distribution substation simulation/emulation testbed for cyber-physical security studies</article-title>, <source>in: Proceedings of the First International Conference on Cyber-Technologies and Cyber-Systems</source>, <year>2016</year>, pp. <fpage>41</fpage>-<lpage>49</lpage>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] <string-name><given-names>Z.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>S.-K.</given-names> <surname>Han</surname></string-name>, <string-name><given-names>I.-M.</given-names> <surname>So</surname></string-name>, <article-title>Architecture of knowledge graph construction techniques</article-title>, <source>International Journal of Pure and Applied Mathematics</source> <volume>118</volume> (<year>2018</year>) <fpage>1869</fpage>-<lpage>1883</lpage>. URL: https://acadpubl.eu/jsi/2018-118-19/articles/19b/24.pdf.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] <string-name><given-names>A.</given-names> <surname>Dimou</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Vander Sande</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Colpaert</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Verborgh</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Mannens</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Van de Walle</surname></string-name>, <article-title>RML: A generic language for integrated RDF mappings of heterogeneous data</article-title>, <source>LDOW</source> <volume>1184</volume> (<year>2014</year>).</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] <string-name><given-names>A.</given-names> <surname>Iglesias-Molina</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Van Assche</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Arenas-Guerrero</surname></string-name>, <string-name><given-names>B.</given-names> <surname>De Meester</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Debruyne</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Jozashoori</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Maria</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Michel</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Chaves-Fraga</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Dimou</surname></string-name>, <article-title>The RML ontology: a community-driven modular redesign after a decade of experience in mapping heterogeneous data to RDF</article-title>, <source>in: International Semantic Web Conference</source>, Springer, <year>2023</year>, pp. <fpage>152</fpage>-<lpage>175</lpage>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] <string-name><given-names>E.</given-names> <surname>Iglesias</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Jozashoori</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Chaves-Fraga</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Collarana</surname></string-name>, <string-name><given-names>M.-E.</given-names> <surname>Vidal</surname></string-name>, <article-title>SDM-RDFizer: An RML interpreter for the efficient creation of RDF knowledge graphs</article-title>, <source>in: Proceedings of the 29th ACM international conference on Information &amp; Knowledge Management</source>, <year>2020</year>, pp. <fpage>3039</fpage>-<lpage>3046</lpage>.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] <string-name><given-names>L.</given-names> <surname>Chiticariu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>F. R.</given-names> <surname>Reiss</surname></string-name>, <article-title>Rule-based information extraction is dead! long live rule-based information extraction systems!</article-title>, <source>in: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing</source>, <year>2013</year>, pp. <fpage>827</fpage>-<lpage>832</lpage>.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] <string-name><given-names>S.</given-names> <surname>Abburu</surname></string-name>, <article-title>A survey on ontology reasoners and comparison</article-title>, <source>International Journal of Computer Applications</source> <volume>57</volume> (<year>2012</year>).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>