<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Open Data Foraging: a Semantic Approach to Findability and Understandability</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eudes Adiba</string-name>
          <email>eudes.adiba@unamur.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antoine Clarinval</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anthony Simonofski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Namur Digital Institute, University of Namur</institution>
          ,
          <addr-line>Rue de Bruxelles 61, 5000 Namur</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Open Government Data</institution>
          ,
          <addr-line>Semantic annotation, Information Foraging Theory</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Proceedings EGOV-CeDEM-ePart conference</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Swedish Center for Digital Innovation, Department of Informatics, Umeå University</institution>
          ,
          <addr-line>Umeå</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>Findability and understandability are two main barriers to Open Government Data (OGD) reuse. Rather than addressing these barriers at the level of individual datasets, we redefine them through Information Foraging Theory and propose a semantic annotation methodology that addresses them by reshaping the information space and improving the quality of information scent from each dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-0">
      <title>1. Introduction</title>
      <p>The potential benefits of effective OGD reuse face two persistent obstacles: the difficulty of finding relevant datasets (findability) and the difficulty of understanding them sufficiently for reuse (understandability) [<xref ref-type="bibr" rid="ref1">1</xref>]. Current approaches often address these issues by improving metadata at the individual dataset level. While these efforts are necessary, they are not sufficient. Exploring open data is not simply a one-off search operation [<xref ref-type="bibr" rid="ref2">2</xref>]. It is a dynamic search process, made of trial and error, forking and backtracking, in which the user progressively develops an understanding of the information territory (OGD portals). In this context, what matters is not just the isolated quality of the (meta)data, but the way in which the data environment guides or disorients navigation. This shift of perspective calls for redefining the two problems not as properties of the datasets themselves but as experiences shaped by the structure of the information space and the signals it emits.</p>
      <p>To model this perspective, the current work draws on Information Foraging Theory (IFT) to redescribe dataset search on OGD portals and proposes semantic annotation, designed as an information overlay, to address the two obstacles by (1) reshaping the information space and (2) improving the quality of information scent from each dataset. With this poster, we aim to give an overview of this new perspective informed by IFT and of the semantic annotation methodology. The goal is to engage in discussions on how these can (re)shape the OGD community's work.</p>
    </sec>
    <sec id="sec-1">
      <title>2. IFT Perspective and Contribution</title>
      <sec id="sec-1-1">
        <title>2.1. OGD Search from IFT Perspective</title>
        <p>Information Foraging Theory (IFT) [<xref ref-type="bibr" rid="ref3">3</xref>] describes how people seek information, drawing on models of animal food-foraging strategies. It has successfully explained people’s information-seeking behaviors in various domains and is based on a set of constructs (Table 1). IFT models information (dataset) seeking as an adaptive behavior in which the citizen (OGD portal user) explores an information environment (the OGD portal) in search of prey (relevant datasets). This environment is made up of patches (thematic collections) connected by links (hyperlinks, tags, suggestions), and each link provides cues. The information scent corresponds to the user’s estimate, based on these cues, of the potential usefulness of the dataset at the other end of the link. Through this mechanism, information scent influences the browsing choices between the patches.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>CEUR Workshop Proceedings (ISSN 1613-0073)</title>
    </sec>
    <sec id="sec-3">
      <p>OGD portals often organize data into collections that correspond to patches. These patches are built from tags which, because they are applied inconsistently, can produce patches with heterogeneous content, which can confuse users during their search. Moreover, users must rely on the portal’s predefined structure to navigate, as there are no clear links within or between patches. This is tedious and adds to the user’s cognitive load. Another key point is the information scent emanating from the metadata, which helps the user judge the relevance of a dataset. Metadata terminology differs from one data provider to another, which greatly weakens the scent.</p>
      <p>Semantic annotation (the process of identifying real-world concepts in a knowledge graph and linking them to the different elements of tabular data) makes it possible to structure patches more coherently by leveraging the semantic relations between the associated concepts. Using the relations of the knowledge graph, one can add meaningful links within and between patches. This also helps to increase information scent, as the cues provided are based on concepts and terminologies familiar to users. We proposed a probabilistic approach (Conditional Random Field) to the OGD annotation task that integrates the diversity of data types and the richness of OGD structural information (headers, metadata, and values) into the process (see Figure 1).</p>
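      <p>As a minimal sketch of the linking idea described above, the following Python fragment groups datasets into patches and derives within-patch and between-patch links from shared concepts. The dataset names, the concepts, the <monospace>BROADER</monospace> relation, and the helper functions are all hypothetical. The annotations are written by hand for illustration and are not the output of the Conditional Random Field approach.</p>
      <preformat>
```python
from itertools import combinations

# Hypothetical output of semantic annotation: each dataset is mapped to the
# knowledge-graph concepts assigned to its columns (hand-written, not CRF output).
ANNOTATIONS = {
    "bus_stops_namur": {"BusStop", "Municipality"},
    "train_stations_be": {"TrainStation", "Municipality"},
    "air_quality_2024": {"AirQualityMeasurement", "Municipality"},
    "school_list": {"School"},
}

# Hypothetical "broader concept" relation taken from the knowledge graph.
BROADER = {
    "BusStop": "TransportFacility",
    "TrainStation": "TransportFacility",
    "AirQualityMeasurement": "EnvironmentalObservation",
    "School": "EducationFacility",
}


def patches_of(concepts):
    """Return the broader concepts of a dataset; each one names a patch."""
    return {BROADER[c] for c in concepts if c in BROADER}


def build_patches(annotations):
    """Group datasets into patches keyed by broader concept."""
    patches = {}
    for dataset in sorted(annotations):
        for patch in sorted(patches_of(annotations[dataset])):
            patches.setdefault(patch, []).append(dataset)
    return patches


def build_links(annotations):
    """Link dataset pairs that share a patch or, failing that, a concept."""
    links = []
    for a, b in combinations(sorted(annotations), 2):
        common_patches = patches_of(annotations[a]).intersection(
            patches_of(annotations[b])
        )
        common_concepts = annotations[a].intersection(annotations[b])
        if common_patches:
            links.append((a, b, "within-patch", sorted(common_patches)))
        elif common_concepts:
            links.append((a, b, "between-patch", sorted(common_concepts)))
    return links


if __name__ == "__main__":
    print(build_patches(ANNOTATIONS))
    for link in build_links(ANNOTATIONS):
        print(link)
```
      </preformat>
      <p>Each link in this sketch carries the concepts that justify it, which is the kind of cue a portal could display next to a link to strengthen information scent.</p>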
      <p>[Figure: an OGD portal shown in two panels, "Before semantic annotation" and "After semantic annotation". Recoverable labels: "Navigation predefined structure"; "Patch 1", "Patch 2", "Patch 3"; "Within patch link"; "Between patch link"; "Weak, unqualitative scent". A further label reads "Example of 3rd column annotation of a dataset".]</p>
    </sec>
    <sec id="sec-4">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kremen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Necasky</surname>
          </string-name>
          ,
          <article-title>Improving Discoverability of Open Government Data with Rich Metadata Descriptions Using Semantic Government Vocabulary</article-title>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Crusoe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ahlin</surname>
          </string-name>
          ,
          <article-title>Users' activities for using open government data - a process framework</article-title>
          ,
          <source>Transforming Government: People, Process and Policy</source>
          <volume>13</volume>
          (
          <year>2019</year>
          )
          <fpage>213</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pirolli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Card</surname>
          </string-name>
          ,
          <article-title>Information foraging</article-title>
          ,
          <source>Psychological Review</source>
          <volume>106</volume>
          (
          <year>1999</year>
          )
          <fpage>643</fpage>
          -
          <lpage>675</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>