<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the PENG project</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gabriella Pasi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gloria Bordogna</string-name>
          <email>gloria.bordogna@idpa.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Research project PENG (PErsonalised News content programminG), IST-004597, funded within the Sixth Framework Programme, Priority 2, Information Society Technology, Thematic Priority: Cross Media Content for Leisure and Entertainment. Gabriella Pasi is with the Università degli Studi di Milano Bicocca (DISCO)</institution>
          ,
          <addr-line>Via Bicocca degli Arcimboldi 8, 20126 Milano, Italy</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a concise overview of the main aims, characteristics and innovations of the PENG project. PENG (“PErsonalised News content programminG”) is a Specific Targeted Research Project (IST-004597) funded within the Sixth Framework Programme of the European Research Area.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Filtering</kwd>
        <kwd>Distributed Information Retrieval</kwd>
        <kwd>Personalized User Profiles</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>The main objective of the PENG project was to define an
innovative technological solution for personalised access to,
and composition and presentation of, multimedia news, with
an emphasis on personalised filtering, retrieval and
composition. The proposed system
aims to collect news from both news feeds and specialised
archives in a personalised way, so as to provide media
professionals with a fully customisable environment.</p>
      <p>This is performed by pushing personalised news towards
the user, and by allowing her or him to expand a selected topic
by searching for additional information or to edit the final
news through a personalised presentation approach. This
involves the integration of personalised filtering and
distributed information retrieval, producing information that
constitutes a first but very important aid to the writing
activity of a journalist (the initial target user for PENG). The target users of
PENG are classified according to a two-dimensional schema
defined in terms of their level of interest in the news and their
topical interests. Possible target users include
information-intensive workers, students of communication faculties,
and journalists with specific topical interests.</p>
      <p>An important characteristic ensured by the system is
flexibility in modelling the user's topical interests and
context. This means that the system is tolerant of
the vagueness and uncertainty in the user-system
interaction, and adaptive in learning users' changing
preferences over time. In particular, the PENG prototype
system supports both the filtering of news based on a new
multi-criteria decision approach, and the cataloguing of news
into overlapping topics. These characteristics are not met by
the systems currently used by journalists to carry out
their habitual tasks, and replacing those systems with the PENG
system could substantially change the current way of making news.
The PENG project was mainly addressed to news
professionals, such as journalists and editors, with a view to
extending the use of the defined system to more general users
in the future. In this context, the term news refers to
any kind of news, including information regarding leisure and
entertainment.</p>
      <p>The PENG prototype is conceived as a personal assistant,
supporting journalists in all stages of the news lifecycle.
Information (text, images, and videos) is gathered from
different sources (including the Web) using a combination of
push and pull technologies and is presented to the user in a
personalised way.</p>
    </sec>
    <sec id="sec-2">
      <title>II. PENG SYSTEM’S ARCHITECTURE</title>
      <p>The main functionalities of the PENG system are separated into three main phases: a push phase, a pull phase and a presentation phase.</p>
      <p>In the push phase, a filtering system
makes a first selection of news from
newswires and other news archives. This filtering is based on a
dynamic user profile that includes the user’s personal trust in the
information sources.</p>
      <p>In the pull phase, a user query and the user profile are used
to retrieve further and more specific information, both from the
same sources used in the push phase and from additional
sources automatically selected in relation to the content of the
query and the user profile. A distributed information retrieval
approach is used, in which the query can be automatically
generated from user feedback on the information presented by
the push phase.</p>
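      <p>The pull phase described above can be illustrated with a minimal sketch. It assumes that a query is formed from the most frequent terms of the items the user marked as relevant in the push phase, boosted by profile keywords, and that additional sources are selected by term overlap with the query. The function names, the weighting and the source descriptions are hypothetical and do not reproduce the distributed retrieval method developed in PENG.</p>
      <preformat>
```python
from collections import Counter


def build_query(liked_texts, profile_keywords, size=3):
    """Form a query from terms frequent in items the user marked as relevant."""
    counts = Counter()
    for text in liked_texts:
        # Ignore very short words; a real system would use a stop-word list.
        counts.update(w for w in text.lower().split() if len(w) > 3)
    for keyword in profile_keywords:
        counts[keyword] += 2  # assumed boost for terms already in the profile
    # Highest count first; ties are broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]]


def select_sources(query, source_terms):
    """Pick sources whose descriptive terms overlap the query."""
    return sorted(
        name for name, terms in source_terms.items()
        if set(query).intersection(terms)
    )


liked = ["Wind farm subsidies rise", "Offshore wind capacity doubles"]
query = build_query(liked, ["energy"])
sources = select_sources(query, {
    "energy-archive": {"wind", "energy", "solar"},
    "sports-feed": {"football", "tennis"},
})
print(query)
print(sources)
```
      </preformat>
      <p>In this sketch the feedback on pushed items and the profile both shape the query, and only the source whose description shares a term with the query is consulted.</p>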
      <p>The presentation phase uses multi-document and
multimedia visualisation to present the results of the push and
pull phases to the user, taking into account the trust the user
places in the information sources. The results visualised by the
system can be personalised not only to the user's information
need but also to visualisation preferences and to the user's subjective
trust in the sources of information.</p>
      <p>Figure 1 presents a high-level view of the PENG
architecture, with the modules corresponding to the three main
phases highlighted in gray.</p>
      <sec id="sec-2-10">
        <p>Figure 1: A sketch of the PENG system architecture, showing the Information Presentation, Information Filtering and Information Retrieval modules, the database and communication layer, the user profile database, and third-party information sources.</p>
        <p>
The user accesses the system locally through an interface
provided by the presentation module. The three main
modules communicate via an intermediary layer that also
manages access to the common databases required by the
system, the most important of which is the user profile
database (the other databases are not shown, for clarity). This
database and communication layer is composed of a user
profile manager (which manages the user profile database) and
a common database manager (which manages the other common
databases and coordinates the communication between the
modules). In the PENG system, a user profile contains the data
relating to a single user: personal information (such as name
and email), information preferences (what information is
relevant to the user, and from which sources), presentation preferences
(how this information is to be displayed) and interaction
history (the history of the user's interaction with the PENG
system). Since a user may be interested in many different
subjects, the information and presentation preferences are
split into a set of distinct user interests. Each interest is
personal to the user to whom it belongs, and plays an
important part in the filtering module (which is intended not
only to filter documents for users, but also to adjust the user interests)
and in the information retrieval module (where it provides a context in
which a search can take place).</p>
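        <p>The profile structure described above can be sketched as a pair of record types. This is only an illustration under stated assumptions: the class and field names (UserProfile, Interest, layout and so on) are hypothetical and are not taken from the PENG implementation.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Interest:
    """One topical interest of a user (all names here are illustrative)."""
    topic: str
    keywords: list[str] = field(default_factory=list)           # information preferences
    preferred_sources: list[str] = field(default_factory=list)  # "from where"
    layout: str = "list"                                        # presentation preference


@dataclass
class UserProfile:
    """Data relating to a single user, following the description in the text."""
    name: str
    email: str
    interests: dict[str, Interest] = field(default_factory=dict)
    history: list[str] = field(default_factory=list)            # interaction history

    def add_interest(self, interest: Interest) -> None:
        """Register an interest under its topic name."""
        self.interests[interest.topic] = interest

    def record(self, event: str) -> None:
        """Append one event to the interaction history."""
        self.history.append(event)


profile = UserProfile(name="A. Journalist", email="journalist@example.org")
profile.add_interest(Interest("energy policy", ["nuclear", "renewables"], ["newswire-a"]))
profile.record("opened: energy policy")
print(sorted(profile.interests))
```
        </preformat>
        <p>Keeping information and presentation preferences inside each interest, rather than at the profile level, mirrors the split into distinct user interests described in the text.</p>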
        <p>Importantly, the profile stores the degree to which a user
trusts different information sources ('trust scores'); this information is
hypothesised to be important in news gathering and
filtering. Trust scores are conceived as indications of the
potential reliability of the information sources to a specific
user (or category of users) with respect to a given topical area.</p>
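        <p>As a minimal sketch of how trust scores might enter the filtering step, the following example stores one score per pair of source and topical area and combines it with a content relevance value. The weighted average, the threshold and all names are assumptions made for illustration; they do not reproduce the multi-criteria decision approach developed in PENG.</p>
        <preformat>
```python
def trust_score(trust, source, topic, default=0.5):
    """Look up the user's trust in a source for a given topical area."""
    return trust.get((source, topic), default)


def filter_news(items, trust, topic, threshold=0.6, trust_weight=0.4):
    """Keep items whose combined score reaches the threshold.

    The weighted average below is an illustrative assumption,
    not the aggregation actually used in PENG.
    """
    kept = []
    for item in items:
        t = trust_score(trust, item["source"], topic)
        score = (1 - trust_weight) * item["relevance"] + trust_weight * t
        if score >= threshold:
            kept.append((round(score, 2), item["title"]))
    # Highest combined score first.
    return sorted(kept, reverse=True)


trust = {("newswire-a", "economy"): 0.9, ("blog-b", "economy"): 0.2}
items = [
    {"title": "Rates held", "source": "newswire-a", "relevance": 0.7},
    {"title": "Rumour on rates", "source": "blog-b", "relevance": 0.8},
]
result = filter_news(items, trust, "economy")
print(result)
```
        </preformat>
        <p>In this example the item from the less trusted source is dropped even though its content relevance is higher, which is the kind of effect a per-user, per-topic trust score is meant to produce.</p>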
        <p>PENG has the potential to contribute substantially to the
continuing development of filtering and retrieval systems, for
the benefit of journalists and ultimately of all users of
news services. Professionals, such as journalists or editors, can
tune the contribution of the distinct sources to their
information gathering, filtering and editing tasks. This is
achieved by specifying queries that express constraints on the
multimedia and time-dependent content of the news so as to
focus on a particular event. This enables the tuning of a
personalised gathering and presentation of news that expresses
an individual's view and opinion on an event, a condition for
journalism that has become predominant in recent years and a
very important condition for a personalised presentation of
news to the general user. In fact, while this can greatly reduce
the time needed for a journalist to consult the distinct sources
and to report on a given topic of interest, it also enables the
presentation of news tailored to a specific user's interests.</p>
        <p>The automatic classification of the news into thematic
clusters represented by sets of keywords can be combined with
subsequent use of PENG to yield personalised presentations
of up-to-date topics. This can help in drafting a personalised
multimedia newspaper and can thus be a powerful tool for the
editorial staff of a news organisation.</p>
      </sec>
    </sec>
  </body>
  <back>
  </back>
</article>