<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PODIO: A Political Discourse Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ibai Guillén-Pacho</string-name>
          <email>ibai.guillen@upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ana Iglesias-Molina</string-name>
          <email>ana.iglesiasm@upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlos Badenes-Olmedo</string-name>
          <email>carlos.badenes@upm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oscar Corcho</string-name>
          <email>oscar.corcho@upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, Universidad Politécnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ontology Engineering Group, Universidad Politécnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this study, we present the POlitical DIscourse Ontology (PODIO) and the accompanying Knowledge Graph, both designed to enhance the formalization and accessibility of political discourse in digital media. The core contribution of this work is the development of a comprehensive ontological framework and its instantiation in a Knowledge Graph that systematically organizes and represents political discourse. This framework provides structured insights to better understand and analyze political debates on various digital platforms. PODIO is specifically engineered to address the complexities of political communication, enabling detailed analysis of political proposals, ideological foundations, target audiences, and the temporal context of discourse. Through the integration of diverse existing datasets into the Knowledge Graph, we demonstrate PODIO's effectiveness in encapsulating a broad spectrum of semantically annotated political discourses.</p>
      </abstract>
      <kwd-group>
        <kwd>ontology</kwd>
        <kwd>political discourse</kwd>
        <kwd>knowledge graph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Political discourse, a strategic and dynamic form of communication in democratic societies, is crafted
by political agents to adapt to changing sociopolitical contexts and to engage with various audiences
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Platforms like TikTok, X (formerly Twitter), and Facebook have transformed political discourse
by adding immediacy and interactivity, facilitating real-time engagement and broad dissemination of
political messages. In contrast, traditional media such as electoral programs and government bulletins
offer a more formal, less interactive communication style.
      </p>
      <p>
        Research on political discourse incorporates a range of methodologies that focus on different elements,
such as national and regional party manifestos [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], politician activities on social media [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], state head
speeches [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and enacted legislation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Traditional research has often been siloed by communication
channel, limiting data integration across diverse sources. To address this challenge and improve
the interoperability of data, we have developed an ontology that redefines traditional classifications,
facilitating a more nuanced representation of political discourse across various media. The ontology
extends beyond existing models like the BBC Politics ontology,<sup>1</sup> which lacks the detail needed for
complex political analyses, by offering a more versatile Discourse class. Additionally, integration
with the Legal Knowledge Graph<sup>2</sup> via federated queries enhances our capability to combine different
data sources, moving toward a more integrated and comprehensive approach to modeling political
discourse.
      </p>
      <p>This paper introduces the Political Discourse Ontology (PODIO)<sup>3</sup> and its Knowledge Graph,<sup>4</sup> both aimed
at advancing political discourse analysis. PODIO integrates diverse political communications, such as
speeches, tweets, and manifestos, into a comprehensive Knowledge Graph. This integration covers not only
the social media activity of political figures but also legislative outputs, supporting
the interpretation and cross-media analysis of political strategies. The
approach builds on the existing Political Marketing Ontology and extends it by focusing
on the representation of political discourse rather than only its dissemination.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The PODIO Ontology</title>
      <p>
        The PODIO ontology has been developed using the Linked Open Terms (LOT) methodology [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which
guides the creation of ontologies through four main stages: Requirements Specification, Implementation,
Publication, and Maintenance. Initially, the ontology’s purpose and competency questions are defined,<sup>5</sup>
specifying how it should represent political discourse across various channels and the types of political
information it must handle, such as data from social media, political manifestos, and party proposals.
Following the specification, the ontology is built and evaluated using tools like Protégé and OOPS! [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
for model validation, ensuring no inconsistencies before publication. Once completed, the ontology is
documented, published, and continuously updated through community feedback on GitHub.<sup>6</sup>
      </p>
      <p>PODIO incorporates two key concepts: Conversational Discourse and Expository Discourse.
Conversational Discourse, interactive in nature, encompasses content such as texts, videos, and images shared via
social media platforms, enriched with metadata such as language and engagement statistics. Listing 1
shows an instance of podio:Conversational: a Discourse on Twitter created by the account of the PSOE
political party, together with its creation date, the number of likes obtained, the shared content, etc.</p>
      <preformat>:Post/Twitter/1729783306588180721 a podio:Conversational ;
    sioc:has_creator :UserAccount/Twitter/psoe_m ;
    sioc:has_container :Hashtag/viviendas ;
    terms:created "2023-11-2 13:56:00" ;
    terms:source &lt;https://twitter.com/i/status/1720062424819040555&gt; ;
    podio:content "El gobierno regional lleva dos legislaturas prometiendo [...]" ;
    schema:sharedContent [ a schema:VideoObject ;
        schema:contentUrl &lt;https://twitter.com/i/status/1720062424819040555&gt; ] ;
    sioc:mentions :UserAccount/Twitter/CristinaGAlvare ;
    schema:interactionStatistic [ a schema:InteractionCounter ;
        schema:interactionType schema:UserLikes ;
        schema:userInteractionCount 107 ] .</preformat>
      <p>Listing 1. Instance of Conversational Discourse created from a Tweet.</p>
      <p>
        Expository Discourse, on the other hand, covers non-interactive elements such as policy proposals and
approved policies, often embedded in legislative documents and party manifestos. These policies can
be of two types: Approved Policies (podio:ApprovedPolicy), which form or have formed part of the
legislation; or Policy Proposals (podio:PolicyProposal), which can form part of a Party Manifesto
(podio:PartyManifesto). On the one hand, Approved Policies are changes that have been introduced
into legislation by an authorised entity. To reflect this and import the specification of legislative
documents, we have extended the class lkg:Legislation of the LKG ontology [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. By making Approved Policies equivalent to lkg:Legislation, we ensure that these policies are part of
the political discourse, have a definition provided by domain experts, and that both Knowledge Graphs interoperate and
are jointly exploitable. Listing 2 shows an instance of lkg:Legislation,
which is equivalent to podio:ApprovedPolicy and, in addition to the data already present in the LKG, is
enriched with further data introduced by PODIO (language, creator, creation date, source, and publisher).
        <preformat>&lt;https://apis.lynx-project.eu/document-platforms/BOE-A-2018-17773&gt;
    a lkg:Legislation ;
    a podio:ApprovedPolicy ;
    terms:language &lt;http://id.loc.gov/vocabulary/iso639-2/spa&gt; ;</preformat>
        <fn><label>5</label><p>https://github.com/oeg-upm/PODIO/tree/main/queries</p></fn>
        <fn><label>6</label><p>https://github.com/oeg-upm/PODIO</p></fn>
      </p>
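      <p>The practical effect of declaring podio:ApprovedPolicy and lkg:Legislation equivalent can be sketched in a few lines of Python. The snippet below is only an illustration and not part of the PODIO tooling: triples are plain tuples, prefixed names are treated as opaque strings, the function name materialize_equivalent_types is hypothetical, and the use of owl:equivalentClass is an assumption about how the equivalence is encoded (the text only states that the classes are made equivalent).</p>
      <preformat><![CDATA[
```python
# Minimal sketch: propagate rdf:type across owl:equivalentClass axioms.
# Triples are (subject, predicate, object) tuples of strings; prefixed
# names are opaque identifiers here, so no RDF library is required.

RDF_TYPE = "rdf:type"
EQUIVALENT_CLASS = "owl:equivalentClass"  # assumed encoding of the equivalence


def materialize_equivalent_types(triples):
    """Return the input triples plus the rdf:type triples implied by equivalence."""
    # Record each equivalence in both directions, since it is symmetric.
    equivalents = {}
    for s, p, o in triples:
        if p == EQUIVALENT_CLASS:
            equivalents.setdefault(s, set()).add(o)
            equivalents.setdefault(o, set()).add(s)

    inferred = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE:
            # One inference step: an instance of a class is also an
            # instance of every class declared equivalent to it.
            for other in equivalents.get(o, ()):
                inferred.add((s, RDF_TYPE, other))
    return inferred


doc = "<https://apis.lynx-project.eu/document-platforms/BOE-A-2018-17773>"
graph = {
    ("podio:ApprovedPolicy", EQUIVALENT_CLASS, "lkg:Legislation"),
    (doc, RDF_TYPE, "lkg:Legislation"),
}
closure = materialize_equivalent_types(graph)
print((doc, RDF_TYPE, "podio:ApprovedPolicy") in closure)  # prints True
```
]]></preformat>
      <p>The sketch applies a single inference step and does not chain equivalences transitively; in practice a reasoner or a triple store with OWL support would normally provide this behaviour.</p>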
      <p>Alternatively, Policy Proposals are legislative suggestions made by an agent that may not necessarily
come to fruition but are often included in Party Manifestos. These manifestos are comprehensive
documents categorized as Bibliographic Resource, detailing language, creation date, source, description,
content, ideology, and proposed candidates for political positions. Published by political parties, these
documents gather various policy proposals to outline a party’s agenda and intentions. This
ontology structure supports the capture and analysis of political communication, accommodating both
dynamic social interactions and formal content like legislation, both of which shape public policy and
political narratives.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Political Discourse Knowledge Graph</title>
      <p>
        We constructed a Knowledge Graph (KG) based on the PODIO ontology using various data sources
categorized into political and legislative domains. In the political domain, we included speeches from
politicians and political parties, such as the American Presidency Project’s electoral programs [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
social media posts from platforms like Facebook [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and Twitter [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and speeches by the Spanish
head of state [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For the legislative domain, we incorporated data from the official Spanish gazette [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
and a European legislation KG [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The resources were carefully downloaded, refined, modified, and
manually linked with Wikidata, ensuring all discourse authors are connected to their respective URIs.
      </p>
      <p>Data was transformed into RDF using a three-step approach centered on declarative
mappings written in the RDF Mapping Language (RML). First, we used YARRRML, a user-friendly
serialization of RML, to describe the transformation of each data source, specifying how
each data field corresponds to elements of the ontology. Second, these mappings were converted into RML
using the yarrrml-parser tool. Finally, the RMLMapper tool processed the RML mappings to produce the RDF
triples that form the KG. This method lets the KG interoperate across various data sources,
including social networks, external legal KGs, legal documents, and institutional discourses, and link
to Wikidata, so that the diverse information sources within PODIO are interconnected.
GraphDB was used to store and manage the RDF triples of the KG, which is
accessible through a SPARQL endpoint.<sup>4</sup> The KG is validated using SPARQL queries designed to answer
competency questions that address unmet needs within the domain, such as aggregating data across
social platforms and enriching data with external knowledge sources.</p>
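      <p>The core idea of a declarative mapping, in which each source field is tied to one ontology term, can be illustrated without the RML toolchain. The Python sketch below is a simplified stand-in for the YARRRML, yarrrml-parser, and RMLMapper pipeline, not a reproduction of it. The record layout, the FIELD_TO_PREDICATE table, and the function name record_to_turtle are assumptions made for illustration; the predicates and the identifier pattern follow the notation of Listing 1.</p>
      <preformat><![CDATA[
```python
import json

# Hypothetical declarative mapping: source field -> predicate used in Listing 1.
FIELD_TO_PREDICATE = {
    "created_at": "terms:created",
    "text": "podio:content",
}


def record_to_turtle(record):
    """Render one tweet-like record as a podio:Conversational description."""
    # Subject and creator identifiers mirror the notation of Listing 1.
    subject = f":Post/Twitter/{record['id']}"
    lines = [
        f"{subject} a podio:Conversational",
        f"    sioc:has_creator :UserAccount/Twitter/{record['account']}",
    ]
    for field, predicate in FIELD_TO_PREDICATE.items():
        if field in record:
            # json.dumps adds the surrounding quotes and escapes quotes,
            # backslashes, and newlines inside the literal.
            literal = json.dumps(record[field], ensure_ascii=False)
            lines.append(f"    {predicate} {literal}")
    # Predicate-object pairs are separated by ';' and the block ends with '.'.
    return " ;\n".join(lines) + " ."


example = {
    "id": "1729783306588180721",
    "account": "psoe_m",
    "created_at": "2023-11-2 13:56:00",
    "text": "El gobierno regional lleva dos legislaturas prometiendo [...]",
}
print(record_to_turtle(example))
```
]]></preformat>
      <p>Keeping the field-to-predicate table separate from the rendering code mirrors the motivation for declarative mappings: adding a field changes the table, not the transformation logic.</p>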
      <p>To illustrate the interoperability and analytical capabilities of our Knowledge Graph, we have
developed an interactive demo environment (Figure 2), available online,<sup>7</sup> where users can run
queries that draw on external legislative resources to enrich political discourse
analysis. The platform lets users see firsthand how queries retrieving the latest legislation by
jurisdiction, or the ideologies and histories of political parties, enrich our KG with data from Wikidata.
Such integration facilitates complex analyses, like assessing the legislative and propagandistic activities
of Spanish leftist parties during various legislative periods, which shows the value of external
knowledge for obtaining comprehensive insights.</p>
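      <p>Queries that combine local data with an external endpoint rely on the SERVICE clause of SPARQL 1.1 Federated Query. The following Python sketch assembles such a query as text. It is an illustration only: the namespace IRIs and the endpoint URL are example.org placeholders because the real values are not given in the text, and the function name build_federated_query is hypothetical. Only the two classes named in the paper, podio:ApprovedPolicy and lkg:Legislation, are used.</p>
      <preformat><![CDATA[
```python
from string import Template

# Placeholder values: the real namespace IRIs and the LKG endpoint are not
# stated in the text, so example.org stand-ins are used here.
PODIO_NS = "https://example.org/podio#"
LKG_NS = "https://example.org/lkg#"
LKG_ENDPOINT = "https://example.org/lkg/sparql"

QUERY_TEMPLATE = Template("""\
PREFIX podio: <$podio_ns>
PREFIX lkg: <$lkg_ns>
SELECT ?policy ?property ?value WHERE {
  # Local part: policies known to the PODIO Knowledge Graph.
  ?policy a podio:ApprovedPolicy .
  # Remote part: the same resource as described by the external legal KG.
  SERVICE <$endpoint> {
    ?policy a lkg:Legislation ;
            ?property ?value .
  }
}
LIMIT $limit
""")


def build_federated_query(endpoint=LKG_ENDPOINT, limit=10):
    """Return SPARQL 1.1 text joining local policies with a remote service."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return QUERY_TEMPLATE.substitute(
        podio_ns=PODIO_NS, lkg_ns=LKG_NS, endpoint=endpoint, limit=limit
    )


print(build_federated_query())
```
]]></preformat>
      <p>The resulting text could be submitted to any SPARQL 1.1 endpoint that permits federation; whether the demo environment issues queries of exactly this shape is not stated in the paper.</p>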
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions and Future Work</title>
      <p>This paper introduces the PODIO ontology, designed to encapsulate political discourse across various
communication channels including social media, electoral programs, and legislation from the United
States, Spain, and the European legislative domain. PODIO addresses the shortcomings of existing
ontologies by providing detailed modeling of political speeches that allow for direct interaction (like
social media posts) and those that do not (such as legislative texts). It integrates diverse data types and
sources, normalizing them to enable interoperability within a unified Knowledge Graph (KG). This KG,
built following the Linked Open Terms (LOT) methodology, has been validated for consistency and
error-free structure, ensuring that it meets 33 specific requirements tailored to the political discourse
domain. We hope this resource will be of help to experienced SPARQL users interested in analyzing
political discourse. Future enhancements of PODIO aim to include multimedia content and extend its
geopolitical applicability beyond the initial regions, focusing on expanding both the source variety and
the discourse characteristics analyzed, such as intentionality and the presence of offensive content.
        <fn><label>7</label><p>https://w3id.org/podio/demo</p></fn>
        <fn><label>8</label><p>CQ15 takes the date on which the query is made as inGovernmentUntil and as legislatureEndTime if they have not
yet finished.</p></fn>
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work is supported by the Predoctoral Grant (PIPF-2022/COM-25947) of the Consejería de Educación,
Ciencia y Universidades de la Comunidad de Madrid, Spain.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>De Vos</surname>
          </string-name>
          ,
          <article-title>Discourse theory and the study of ideological (trans-)formations: Analysing social democratic revisionism</article-title>
          ,
          <source>Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA)</source>
          <volume>13</volume>
          (
          <year>2003</year>
          )
          <fpage>163</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Franzmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Burst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Regel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Riethmüller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Volkens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weßels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zehnter</surname>
          </string-name>
          ,
          <article-title>The Manifesto Data Collection. Manifesto Project (MRG/CMP/MARPOR)</article-title>
          .
          <source>Version 2023a</source>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.25522/manifesto.mpds.2023a. doi:10.25522/manifesto.mpds.2023a.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Schmitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Wüst</surname>
          </string-name>
          ,
          <article-title>Euromanifestos Project (EMP) 1979-2004</article-title>
          , GESIS Data Archive, Cologne.
          <source>ZA4457 Data file Version 1.0.0</source>
          , https://doi.org/10.4232/1.4457,
          <year>2012</year>
          . doi:10.4232/1.4457.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nulty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Theocharis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Popa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Parnet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Benoit</surname>
          </string-name>
          ,
          <article-title>Social media and political communication in the 2014 elections to the European Parliament</article-title>
          ,
          <source>Electoral Studies</source>
          <volume>44</volume>
          (
          <year>2016</year>
          )
          <fpage>429</fpage>
          -
          <lpage>444</lpage>
          . URL: http://dx.doi.org/10.1016/j.electstud.2016.04.014. doi:10.1016/j.electstud.2016.04.014.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Álvarez-Mellado</surname>
          </string-name>
          ,
          <article-title>A Corpus of Spanish Political Speeches from 1937 to 2019</article-title>
          , in:
          <source>Proceedings of the Twelfth Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2020</year>
          , pp.
          <fpage>928</fpage>
          -
          <lpage>932</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Borrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Laurer</surname>
          </string-name>
          ,
          <source>The CEPS EurLex dataset: 142.036 EU laws from 1952-2019 with full text and 22 variables</source>
          ,
          <year>2020</year>
          . URL: https://doi.org/10.7910/DVN/0EGYWY. doi:10.7910/DVN/0EGYWY.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Poveda-Villalón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fernández-Izquierdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernández-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>García-Castro</surname>
          </string-name>
          ,
          <article-title>LOT: An industrial oriented ontology engineering framework</article-title>
          ,
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>111</volume>
          (
          <year>2022</year>
          )
          <elocation-id>104755</elocation-id>
          . doi:10.1016/j.engappai.2022.104755.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Poveda-Villalón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Suárez-Figueroa</surname>
          </string-name>
          ,
          <article-title>OOPS! (OntOlogy Pitfall Scanner!): An on-line tool for ontology evaluation</article-title>
          ,
          <source>International Journal on Semantic Web and Information Systems (IJSWIS)</source>
          <volume>10</volume>
          (
          <year>2014</year>
          )
          <fpage>7</fpage>
          -
          <lpage>34</lpage>
          . doi:10.4018/ijswis.2014040102.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Moreno Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rehm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Montiel-Ponsoda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rodríguez-Doncel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Martín-Chozas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Navas-Loro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaltenböck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Revenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Karampatakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sageder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gracia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Maganza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kernerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lonke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lagzdins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bosque Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Verhoeven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gomez Diaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Boil Ballesteros</surname>
          </string-name>
          ,
          <article-title>Lynx: A knowledge-based AI service platform for content processing, enrichment and analysis for the legal domain</article-title>
          ,
          <source>Information Systems</source>
          <volume>106</volume>
          (
          <year>2022</year>
          )
          <elocation-id>101966</elocation-id>
          . URL: https://www.sciencedirect.com/science/article/pii/S0306437921001563. doi:10.1016/j.is.2021.101966.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Woolley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <source>The American Presidency Project</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Baviera Puig</surname>
          </string-name>
          ,
          <source>Facebook ads 2019 Spanish general elections dataset</source>
          ,
          <year>2020</year>
          . URL: http://dx.doi.org/10.4995/Dataset/10251/146502. doi:10.4995/dataset/10251/146502.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Moya</surname>
          </string-name>
          , Tweets Política España,
          <year>2023</year>
          . URL: https://www.kaggle.com/datasets/ricardomoya/tweets-poltica-espaa.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Arganda del Rey City Council</surname>
          </string-name>
          ,
          <source>Publicaciones BOE / BOCM 1985-2023</source>
          ,
          <year>2023</year>
          . URL: https://datos.gob.es/es/catalogo/l01280148-publicaciones-boe-2023.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>