<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Content Processing in Web Portals</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Felicitas Löffler</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bahar Sateli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birgitta König-Ries</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>René Witte</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Computer Science, Friedrich-Schiller University of Jena</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Web portals provide a standardized way of integrating multiple information sources and applications in a single web interface. However, they currently do not provide semantic support for users who need to navigate the often overwhelming amount of content. We demonstrate our open source portal architecture “hanüwa” that integrates text mining web services, based on the Semantic Assistants framework, with the Liferay portal server.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>I.</label>
      <title>INTRODUCTION</title>
      <p>Web portals are a specific kind of web-based system
that integrates diverse information sources
and applications. Deployed for a concrete scenario in an
organization, they typically address the information needs of a
wide range of users and their tasks through both internal and
external services.</p>
      <p>While a web portal provides convenient access to
information, there is no standardized way to further
process the available content in order to support users in their
tasks. There is also a lack of appropriate technologies for
document filtering within a web portal. We envision a new
generation of web portals that can provide context-sensitive
support through semantic analysis services, in particular services based
on natural language processing (NLP). These services are
deployed on shared or private servers and can be dynamically
requested by users who ask for help with a specific task: e.g.,
finding entities in a document, summarizing a text, answering
a question, or linking content to external sources. As such,
they perform the role of AI “assistants” that support their
users. Furthermore, we imagine enhancing web portals with a
personalization component that adapts the content to the user’s
needs. Sorting documents or highlighting terms according to a
specific user’s interests would help the user
and would be a step towards reducing information overload.</p>
      <p>
        In previous work, Bakalov et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] demonstrated the
feasibility and usability of a portal integration with natural
language processing services. However, this implementation
was tied to a specific, commercial portal engine (IBM
WebSphere<fn id="fn1"><label>1</label><p>IBM WebSphere, http://www.ibm.com/software/websphere</p></fn>). The work presented here is a complete redesign
and re-implementation of the NLP-portal integration, taking
into account future extensions and based exclusively on open
source software. Similar to the solution presented in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we
rely on the Semantic Assistants framework [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for brokering
text mining pipelines as web services, but our new architecture
is based on the Liferay<fn id="fn2"><label>2</label><p>Liferay, http://www.liferay.com/</p></fn> open source portal server.
      </p>
      <p>Our new portlets can be deployed in any existing
Liferay-based portal to offer natural language processing services to
its users. Here, we demonstrate the core functionality with
named entity recognition in a given article, but the framework
is not limited to a single domain: a clear separation of concerns
allows a language engineer to make new NLP services available
without requiring knowledge of portal technology, and a web
engineer can easily design a new web portal that incorporates
language technology.</p>
    </sec>
    <sec id="sec-2">
      <label>II.</label>
      <title>ARCHITECTURE</title>
      <p>Our novel Semantic Assistants-portal integration
architecture, illustrated in Fig. 1, is designed to allow various portlets
to benefit from NLP techniques on their content. The core
idea is to enable generic portlets to communicate with the
Semantic Assistants portlet, which is specifically designed to connect to
the back-end Semantic Assistants server and to let portal users
inquire about and invoke NLP pipelines.</p>
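      <fig id="fig-1">
        <label>Fig. 1.</label>
        <caption>
          <p>The Semantic Assistants-Portal Integration architecture. Labels recoverable from the figure: User, (Embedded) Browser, Portal, Portlet Controller, Other Portlet, Semantic Assistants, Database, NLP Service Connector, Semantic Assistants Server, Language Service Descriptions, Service Information, Service Invocation, NLP Service Results.</p>
        </caption>
      </fig>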
      <p>In this architecture, all available portlets in a page can
communicate with the Semantic Assistants portlet by sending
content for analysis and receiving the results. To commence
an analysis session, users interact with the portal via their web
browser, for example, on their desktop computer or from a
mobile device. Through this integration, users can select an NLP
service to execute on a portlet’s content from a dynamically generated
list of available assistants in the Semantic Assistants
server repository. Where applicable, users can also customize
the services’ behaviour by setting runtime parameters. An
execution request is then sent to the Semantic Assistants server
from the Semantic Assistants portlet in the form of a W3C<fn id="fn3"><label>3</label><p>World Wide Web Consortium (W3C), http://www.w3.org</p></fn> standard
web service call that triggers the execution of the designated
NLP pipeline on the provided content.</p>
    </sec>
    <sec id="sec-3">
      <p>The results of each
successful service execution are first received by the Semantic
Assistants portlet and then passed on to the portlet that requested
the service execution. The NLP pipelines are described in
the OWL<sup>4</sup> language and the Semantic Assistants server uses
SPARQL<sup>5</sup> for a dynamic discovery of available services upon
each user request. Hence, adding or removing NLP services
to the integration requires no modification to the code base of
the portal.</p>
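      <p>The message flow described above can be illustrated with a minimal Python sketch. It is not the implementation of the portlets or the server: all class and function names are hypothetical, an in-memory dictionary stands in for the OWL service descriptions, a dictionary lookup stands in for the SPARQL discovery query, and a trivial function stands in for a real NLP pipeline. The sketch only shows that the list of assistants is computed on every request, so registering a new service requires no change to the portlet code.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Hypothetical stand-in for one OWL-described NLP pipeline."""
    name: str
    run: Callable[[str], List[str]]


@dataclass
class AssistantsServer:
    """Toy server that consults its descriptions on every request."""
    descriptions: Dict[str, ServiceDescription] = field(default_factory=dict)

    def list_services(self) -> List[str]:
        # Stands in for the per-request discovery query (SPARQL in the paper).
        return sorted(self.descriptions)

    def invoke(self, name: str, text: str) -> List[str]:
        return self.descriptions[name].run(text)


@dataclass
class AssistantsPortlet:
    """Mediator between content portlets and the server."""
    server: AssistantsServer

    def available(self) -> List[str]:
        return self.server.list_services()

    def analyse(self, name: str, content: str) -> List[str]:
        # Results arrive here first and are then returned to the caller.
        return self.server.invoke(name, content)


def capitalised_words(text: str) -> List[str]:
    """Trivial placeholder pipeline, not a real NLP service."""
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]


server = AssistantsServer()
portlet = AssistantsPortlet(server)
before = portlet.available()
# Registering a new description changes no portlet code.
server.descriptions["Capitalised Words"] = ServiceDescription(
    "Capitalised Words", capitalised_words)
after = portlet.available()
result = portlet.analyse("Capitalised Words", "Alice visited Jena.")
```
      </preformat>
      <p>In this sketch, <italic>before</italic> is an empty list, <italic>after</italic> contains the newly registered service name, and <italic>result</italic> holds the two capitalised words of the sample sentence.</p>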
      <p>
        The basis of the personalization component will be an
ontology-based user profile, in which all user interests are recorded
automatically while the user browses the portal and reads
documents. A user interface, embedded into a portlet, allows
a user to control interests, add new terms, and delete or change
concepts. The user can also enable or disable the personalization
mode. When personalization is enabled, the documents are
re-sorted and the relevant terms of the user profile are highlighted
within the text. In contrast to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the personalization feature
will be available to various portlets in the form of services, rather
than as a concrete implementation on a per-portlet basis.
      </p>
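      <p>The planned re-sorting and highlighting can be sketched as follows. This is an assumption-laden illustration, not the personalization component itself: the profile is simplified to a flat dictionary of lowercase single-word terms with hypothetical weights (the actual profile is intended to be ontology-based), the scoring rule is a plain sum of weights, and asterisks stand in for whatever markup a portlet would use for highlighting.</p>
      <preformat preformat-type="code">
```python
import re
from typing import Dict, List


def score(document: str, interests: Dict[str, float]) -> float:
    """Sum the weights of all profile terms that occur in the document."""
    words = set(re.findall(r"\w+", document.lower()))
    return sum(weight for term, weight in interests.items() if term in words)


def personalise(documents: List[str], interests: Dict[str, float]) -> List[str]:
    """Re-sort documents by descending score; ties keep their original order."""
    return sorted(documents, key=lambda d: score(d, interests), reverse=True)


def highlight(document: str, interests: Dict[str, float]) -> str:
    """Wrap each whole-word occurrence of a profile term in asterisks."""
    if not interests:
        return document
    pattern = r"\b(" + "|".join(map(re.escape, interests)) + r")\b"
    return re.sub(pattern, r"*\1*", document, flags=re.IGNORECASE)


# Hypothetical user interests: lowercase single-word terms with weights.
profile = {"portal": 1.0, "ontology": 2.0}
docs = ["Weather report", "Portal news", "Ontology in a portal"]
ranked = personalise(docs, profile)
marked = highlight(ranked[0], profile)
```
      </preformat>
      <p>With this sample profile, the document mentioning both terms is ranked first and the document mentioning neither is ranked last. Disabling personalization would simply mean skipping both calls and showing the documents in their original order.</p>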
    </sec>
    <sec id="sec-3a">
      <label>III.</label>
      <title>APPLICATION</title>
      <p>The integration of NLP assistants within a portal context
allows for a multitude of applications. Fig. 2 shows an example
scenario in which a portal user needs assistance in analyzing
the textual content available in the content portlet (left). Such
assistance can be offered to the user through the NLP services
listed in the Semantic Assistants portlet (right). This portlet
allows the user to connect to different Semantic Assistants
servers and review the list of their available pipelines in
order to find a suitable assistant for the task at hand. In
our example, the list of assistants contains a “Person and
Location Extractor” service that extracts entities of person
and location types from a given text. The user then sends the
text in the content portlet to the Semantic Assistants portlet
for analysis and requests the service execution by clicking on
the “Run Assistant” button. This interaction asks the
designated Semantic Assistants server to execute the
ANNIE pipeline, provided by GATE.<sup>6</sup> Subsequently, the results
are returned to the content portlet in the form of annotations, which are
listed in a tabular format and highlighted in the text based on their
offsets. The processing time for different scenarios depends on
both the length of the input text and the actual NLP pipeline.
Naturally, sophisticated NLP pipelines with deep syntactic or
semantic analysis require more time to process.</p>
      <p>Currently, we
are working on a personalization scenario aimed at tackling
the user’s information overload by filtering the portal’s
content according to the user’s interests. The idea is to embed
this capability directly within portlets, allowing users to
switch between various personalization modes.</p>
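      <p>The offset-based presentation of results in the entity extraction example can be sketched in a few lines of Python. The annotation record, the bracket notation used for highlighting, and the hand-written sample annotations are assumptions made for illustration; they are not the output format of the ANNIE pipeline or of the Semantic Assistants server.</p>
      <preformat preformat-type="code">
```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    start: int   # offset of the first character
    end: int     # offset one past the last character
    kind: str    # e.g. "Person" or "Location"


def highlight(text: str, annotations: List[Annotation]) -> str:
    """Wrap each annotated span in [kind: ...] markers.

    Spans are processed from right to left so that earlier offsets stay
    valid while the string grows. Overlapping spans are not handled.
    """
    for ann in sorted(annotations, key=lambda a: a.start, reverse=True):
        span = text[ann.start:ann.end]
        text = text[:ann.start] + f"[{ann.kind}: {span}]" + text[ann.end:]
    return text


def as_table(text: str, annotations: List[Annotation]) -> List[tuple]:
    """Return (type, covered text, start, end) rows in document order."""
    return [(a.kind, text[a.start:a.end], a.start, a.end)
            for a in sorted(annotations, key=lambda a: a.start)]


# Hand-written sample annotations, not real pipeline output.
sample = "Alice moved to Jena."
found = [Annotation(15, 19, "Location"), Annotation(0, 5, "Person")]
rows = as_table(sample, found)
marked = highlight(sample, found)
```
      </preformat>
      <p>Inserting the markers from the last annotation to the first is the design choice that matters here: each insertion lengthens the text, so working backwards keeps the remaining offsets correct without recomputing them.</p>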
    </sec>
    <sec id="sec-4">
      <label>IV.</label>
      <title>CONCLUSIONS</title>
      <p>In this paper, we described our open source integration
of natural language processing capabilities within a portal
environment. We also intend to integrate a personalization
feature into portals to adapt their content to a user’s
needs. Furthermore, we want to provide a user interface that gives
users control over their recorded
interests. The NLP-portal integration will be available as part
of the Semantic Assistants distribution hosted on SourceForge.<sup>7</sup></p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bakalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sateli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Witte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-J.</given-names>
            <surname>Meurs</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>König-Ries</surname>
          </string-name>
          , “
          <article-title>Natural Language Processing for Semantic Assistance in Web Portals</article-title>
          ,” in
          <source>IEEE International Conference on Semantic Computing (ICSC 2012)</source>
          . Palermo, Italy: IEEE, September
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Witte</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Gitzinger</surname>
          </string-name>
          , “
          <article-title>Semantic Assistants - User-Centric Natural Language Processing Services for Desktop Clients</article-title>
          ,” in
          <source>3rd Asian Semantic Web Conference (ASWC 2008)</source>
          , ser.
          <series>LNCS</series>
          , vol.
          <volume>5367</volume>
          . Bangkok, Thailand: Springer, Feb. 2-5, 2009,
          <year>2008</year>
          , pp.
          <fpage>360</fpage>
          -
          <lpage>374</lpage>
          . [Online]. Available: http://rene-witte.net/semantic-assistants-aswc08
        </mixed-citation>
      </ref>
    </ref-list>
    <fn-group>
      <fn id="fn4"><label>4</label><p>Web Ontology Language, http://www.w3.org/2004/OWL/</p></fn>
      <fn id="fn5"><label>5</label><p>SPARQL Query Language, http://www.w3.org/TR/rdf-sparql-query/</p></fn>
      <fn id="fn6"><label>6</label><p>General Architecture for Text Engineering (GATE), http://gate.ac.uk/</p></fn>
      <fn id="fn7"><label>7</label><p>Semantic Assistants, http://sourceforge.net/projects/semantic-assist/</p></fn>
    </fn-group>
  </back>
</article>