<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adaptive Retrieval and Composition of Socio-Semantic Content for Personalised Customer Care</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ben Steichen</string-name>
          <email>Ben.Steichen@cs.tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincent Wade</string-name>
          <email>Vincent.Wade@cs.tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Knowledge and Data Engineering Group, School of Computer Science and Statistics, Trinity College</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The parallel rise of the Semantic and Social Web provides unprecedented possibilities for the development of novel search system architectures. However, many traditional search systems have so far followed a simple one-size-fits-all paradigm by ignoring the different user information needs, preferences and intentions. In the last few years, we have begun to see initial evidence that personalisation may be applied within web search engines; however, little detail has been published other than adaptation based on user histories. Moreover, current implementations often fail to combine the mutual benefits of both structured and unstructured information resources. This paper presents techniques and architectures for leveraging socio-semantic content and adaptively retrieving and composing such content in order to provide personalised result presentations. The system is presented in a customer care scenario, which provides an application area for personalisation in terms of available heterogeneous resources as well as user preferences, context and characteristics. The presented architectures combine techniques from the fields of Information Retrieval, Semantic Search as well as Adaptive Hypermedia in order to enable efficient adaptive retrieval as well as personalised compositions.</p>
      </abstract>
      <kwd-group>
        <kwd>Adaptive Information Retrieval</kwd>
        <kwd>Adaptive Result Composition</kwd>
        <kwd>Socio-Semantic Search</kwd>
        <kwd>Personalised Search</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The vast growth of the World Wide Web has resulted in search engines playing an
integral part in people’s daily pursuit of information. In particular, with the rise of
the Social Web, or Web 2.0, a significant part of the growing number of resources
consists of user-generated content such as forum posts, tags, media uploads, etc.
Although web search engines have become very efficient at indexing, retrieving and
ranking unstructured documents (including such Web 2.0 resources), traditionally
they have often followed a one-size-fits-all paradigm: the same results are returned in
the same form and order for each user with the same query. More recently the notion
of Personalised Information Retrieval (PIR) has emerged in research projects in order
to retrieve more relevant results for users’ personal information needs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However,
the conceived solutions have mainly focussed on improving ranked list scores by
boosting documents depending on their similarity to a mined user profile. They do not
take into account the different search expertise, preferences or knowledge levels of
users, nor do they make use of search strategies in order to assist more complex
informational queries. The rise of the Semantic Web has provided new possibilities
for representing information using semantic data formats such as ontologies, allowing
the development of Semantic Search (SS) systems. However, the current state of the
art of such systems has largely followed the IR approach of ranking relevant
documents and presenting them in ranked lists. They have so far failed to use
semantic knowledge to provide an improved guidance for querying users. The field of
Adaptive Hypermedia (AH) has traditionally focussed on providing such guidance
using personalised result compositions and presentations through multi-dimensional
adaptivity. However, the reliance of AH systems on heavily marked-up content has often
hampered the inclusion of open-corpus documents such as user-generated content.
      </p>
      <p>This paper proposes to combine techniques and architectures from PIR, SS and AH
in order to provide Adaptive Information Retrieval and Composition. The proposed
system consolidates both social and semantic data sources and provides a single query
interface that supports personalised query responses. Customer Care is used as an
example field where such a personalised system can be applied, since in addition to
providing traditional technical documentation, many organisations now provide their
own versions of community resources where users increasingly engage in forums in
order to solve technical problems. By applying our search system across these
different data sources, we are able to provide users with result compositions that are
(i) personalised to their own needs with respect to the product, (ii) semantically
structured according to organisational knowledge and (iii) combined from closed
(semantic) as well as open (social) content.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>A variety of techniques and technologies have been developed in several research
fields in order to i) search across increasingly large volumes of data and ii) tailor the
content retrieval towards users’ personal interests and preferences. A broad
characterisation of such techniques reveals three distinct research areas: Personalised
Information Retrieval, Semantic Search and Adaptive Hypermedia.</p>
      <p>
        The field of Information Retrieval [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] has typically focused on improving ranked
result lists using one-size-fits-all algorithms. More recently, Personalised Information
Retrieval systems make use of personal information (e.g. gathered from previous
search interactions [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]) in order to either expand the original user query with
personalised keywords [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and logical operators (e.g. AND, OR, NOT) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], or to bias
traditional ranking algorithms towards more personally relevant information [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
Alternative composition and presentation attempts such as result clustering [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] have
most often been confined to keyword frequency calculations, largely lacking a more
fine-grained representation of i) the knowledge space that is being queried and ii) the
user’s personal knowledge state and preferences.
      </p>
      <p>In order to overcome this lack of structured representations of both domain and
user models, Semantic Search engines draw from the expressive power of ontologies,
which can be used for modelling and reasoning across the knowledge space as well as
user interests [8]. Although early Semantic Search systems often made use of manual
one-to-one mappings between documents and ontological concepts, more
“lightweight” systems [9][10] now (semi-) automatically annotate documents using
multiple concepts drawn from ontologies. These annotations can then be used in order
to rank open corpus documents not only by their statistical similarity to a user’s
keywords, but also by ranking them according to the importance of their particular
annotations [10]. The usage of semantic user models such as in [8] has advanced the
field to more personalised rankings of documents; however, the sole dimension of
adaptation has again been that of user interests. Moreover, user guidance has so far
been largely neglected, as documents have mostly been composed and presented in a
flat ranked list format, failing to guide the user through the result space.</p>
      <p>Adaptive Hypermedia (AH) [11] is a field that has inherently focussed on
providing multi-dimensional adaptation by creating personalised information
compositions and presentations. Since the earliest systems such as AHA! [12] and
APeLS [13], their focus has been on providing information compositions, which
contain documents that are not only adaptively selected for the particular users, but
also sequenced according to current user knowledge states as well as to a variety of
user preferences. Moreover, presentational cues such as link hiding [12] or link
annotation [14] provide additional navigation guidance across the document space.
This increased adaptivity is facilitated by a new type of model called the Adaptation
Model [12] or Narrative Model [13]. This model describes the strategy by which
concepts can be traversed to support particular objectives. For example, a “how to”
query of an inexperienced user might have a narrative that would first choose content
containing a general introduction of the topic and its concepts, followed by examples
on how to carry out the queried task. However, AH systems have inherently been
hampered by their reliance on fine-grained concept-to-content indexing of the
document space, making it hard to incorporate “unknown” open corpus data.</p>
      <p>An additional search paradigm that has emerged over the last years is the notion of
social search or collaborative recommendation. In these systems, users are presented
with documents or items that are either globally popular [15] or recommended by
users with similar interests (e.g. Amazon<sup>1</sup> recommendations). With the growth of
online communities, these techniques might become increasingly powerful for future
adaptive and personalised search. However, such collaborative techniques are out of
the scope of the research presented in this paper.</p>
      <p>In conclusion, the major gap in current search systems lies in the failure to
augment Personalised Information Retrieval with Semantic Search and Adaptive
Hypermedia techniques in order to create Personalised Result Compositions and
Presentations. In order to overcome this gap, search systems need to integrate the
notion of query adaptation based on a wider variety of user characteristics in order to
enable more personalised retrieval. Moreover, the expressive power of ontologies that
drives Semantic Search systems needs to be integrated in order to model both the
knowledge domain as well as the system users. Finally, the Adaptive Hypermedia
notion of a Narrative Model needs to be incorporated in order to i) retrieve documents
that most closely correspond to the current domain and user model states and ii)
adaptively compose and present the results to improve the guidance of users.</p>
      <p><sup>1</sup> http://www.amazon.com</p>
    </sec>
    <sec id="sec-3">
      <title>3 Methodology</title>
      <p>In order to study and address the identified gaps in current adaptive search techniques,
a vertical application area is needed, which provides i) the necessary heterogeneous
content and ii) an authentic evaluation scenario. For the research presented in this
paper, a case-based study of customer care has been chosen, which represents an
application area where users are currently already searching across both structured
(closed corpus) and unstructured (open corpus) content. Additionally, this case study
provides the necessary context for addressing different user information needs, skills
and preferences.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Personalised Customer Care</title>
      <p>Customer care is a crucial area for companies wishing to establish long-term
relationships with their user base. Despite offering a strong product or service, it is
often the post-purchase assistance that influences a user’s decision to consider
purchasing more products or services from this particular company [16]. However, it
is surprising that the type of support in this massive area has been confined to the
simple one-size-fits-all paradigm. Users are left having to either consult complete user
manuals in order to find the relevant section for the problem in question, or perform a
keyword query and search through traditional ranked result lists regardless of their
personal background in terms of product knowledge, skills and preferences.</p>
      <p>From a technical perspective, there are three types of help files that are available
for supported products. First of all, a company internally produces technical
documentation that is often sliced to a fine granularity in order to ensure its reuse in
the case of software updates. These smaller units are then compiled into manuals in
order to be shipped as complete user guides. By composing these knowledge items
into manuals, chapters and sections, companies provide a great array of implicit
metadata information that can potentially be used for adaptive and personalised
retrieval. In addition to these highly structured data sources, companies often produce
a second type of document, which contains knowledge resources that have been
generated by support staff following a direct interaction with customers. These types
of documents are generally less structured than technical support documents,
containing limited metadata such as topic categorisation. Nevertheless, these articles
contain valuable information for an end-user who might be facing a similar issue.
Finally, a third type of document is emerging increasingly with the rise of the social
web, or Web 2.0. Users increasingly engage in community forums, asking questions to
the general user community in the hope that either a similar problem might have been
solved previously or that a user in the community has the technical knowledge to
identify the problem area. In terms of technical markup, these documents contain the
least structure for several reasons. First of all, users inherently use different
terminologies depending on their linguistic and technical background. Secondly,
when users categorise or tag forum posts, they might have differing intentions and
perceptions of what might be relevant for future use. Finally, even if users agree on
the type of tags, categorisations and language, the problems of synonymy and
polysemy increase the mismatch between user-generated terms and the organisational
terminology.</p>
      <p>It becomes apparent that current customer care is not lacking in terms of support
document quantity, but rather in terms of aggregating and structuring existing content
in order to make it i) consistent, ii) reusable and iii) suitable for adaptation and
personalisation. Hence it is necessary to develop new techniques and architectures for
structuring and aggregating the different document types. Additionally, new search
architectures are required that leverage such improved data models in order to make
full use of the complete document space.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Structuring Heterogeneous Content</title>
      <p>The heterogeneous support content that is available for software products needs to be
transformed to a semantically richer form in order to allow reasoning, adaptation and
personalisation across it. As mentioned earlier, Semantic Web technologies such as
ontologies offer a basis for such structuring and markup. The
different types of content can be broadly categorised by their amount of existing
metadata and structure. Consequently, different types of usage can be drawn from
each: whereas highly structured content (such as technical documentation) can be
used to derive an ontology of the knowledge domain, unstructured content (such as
forum posts) can be marked up in order to provide querying users with a larger range
of problem solutions. Key challenges in using marked up content and ontologies lie in
identifying (i) how high the quality of the markup needs to be, (ii) how extensive the
vocabulary can be and (iii) how extensive the ontology needs to be.</p>
      <p><bold>5.1 Structuring organisational content</bold></p>
      <p>
Organisational structured content is often of a fine granularity in order to ensure its
reusability for future product updates. Transforming both the individual knowledge
items as well as their compositions (e.g. from product manuals) to a domain ontology
allows the content to be more reusable and suitable for adaptation and personalisation.</p>
      <p>First of all, for each individual knowledge item, there exist a number of content
fields such as title, paragraph, procedure, etc., as well as metadata fields such as index
terms (i.e. keywords) or media type (e.g. text, image). By modelling the different
fields as ontological classes, each knowledge item and its constituent parts can be
populated as instances of these classes. This is particularly useful in the case of
content and metadata fields that can be used for reasoning and adaptation (e.g. a
metadata field indicating a procedure). For example, if a particular user has only just
installed the product, explanatory items should introduce the user to a particular
feature first, before showing a detailed procedure on how to configure this feature.
The difficulty of an item can also be inferred from a variety of structural features
contained such as the number of procedural steps, the content length, the number of
paragraphs, etc. Corporate product documentation is often extensively marked up to a
deep structural level, allowing such a detailed content analysis.</p>
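      <p>As a minimal sketch of how item difficulty might be inferred from structural features, the following Python fragment scores a knowledge item by its number of procedural steps, its content length and its number of paragraphs. The class name, the field names, the weights and the cut-off values are all illustrative assumptions and are not taken from the system described here.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class KnowledgeItem:
    """Hypothetical view of one knowledge item from the technical documentation."""
    title: str
    procedural_steps: int  # number of steps in a procedure field (0 if none)
    word_count: int        # content length in words
    paragraphs: int        # number of paragraphs


def estimate_difficulty(item: KnowledgeItem) -> str:
    """Map structural features to a coarse difficulty label.

    The weights and cut-offs are assumptions chosen for illustration only.
    """
    score = 2 * item.procedural_steps + item.word_count / 200 + item.paragraphs
    if score >= 20:
        return "advanced"
    if score >= 8:
        return "intermediate"
    return "basic"
```
]]></preformat>
      <p>In practice such a label would more plausibly be stored as a property of the corresponding ontology instance, so that it can be combined with the other metadata fields during reasoning.</p>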
      <p>Secondly, the composition of knowledge items is transformed to ontological form
by creating classes for the hierarchical components of the document (see Figure 1).
Moreover, components such as chapters, sections and subsections often contain
additional data (e.g. overview titles), providing valuable information about the overall
subject of its constituent knowledge items. The individual content items (e.g.
chapters, sections, subsections) are used to populate the different ontological classes
as instances, with instance relationships ensuring the ability to reason across
connected items. For example, if a section explains a particular product feature, its
subsections typically provide more detailed information. Again in the case of a less
experienced user, it is important to not only show the detailed information about how
to configure a particular feature, but also to introduce the feature with the explanation
that is contained in a higher level section.</p>
      <p>By transforming the complete technical documentation into classes and instances, a
domain ontology can be created, which accurately describes the subject area from the
point of view of the product provider. In particular, implicit knowledge from the
existing item compositions in product manuals is effectively transformed into a form
that allows making this knowledge explicit using ontological reasoning. Since the
technical documentation is marked up consistently according to predefined schemas,
most of the transformations can be applied automatically. However, in order to extract
additional, more high-level concepts, a certain amount of manual effort is involved.
For example, in the case of several product manual chapters referring to the same
product features (one chapter explaining its installation, another one its
configuration), the domain ontology should capture these cross-chapter relationships.
Unless such references to higher level concepts (e.g. particular product features) are
mentioned explicitly in the document markup, a domain expert needs to manually add
these ontology classes and relationships.</p>
      <p>
After a domain ontology has been generated, it is possible to link new “unknown”
documents with the existing ontological instances. Two separate components are
needed in order to generate i) the right granularity from the open corpus content and
ii) conceptual indexing according to the ontological structure (see Figure 2).</p>
      <p>First of all, a content slicer described in [17] is responsible for transforming the
original documents into fine-grained “slices”. Such slices are viewed as stand-alone
pieces of information, containing their own semantic properties and metadata. During
the slicing of the original open-corpus data (i.e. forum content and knowledge
resources), structural as well as semantic analysis techniques are applied in order to
generate fine-grained knowledge items as well as an initial set of metadata fields.</p>
      <p>In a second step, the Web 2.0 concept of “crowd sourcing” is used to generate
additional and more accurate annotations by presenting the content slices and their
initial associated metadata to voluntary annotators (similar to [18]). Ideally, this
socio-semantic annotation client is embedded within the actual community forum,
allowing the initial content generators to tag their own posts. The domain ontology is
also available to annotators as a preferred vocabulary, so that they can map their
conceptual understanding of the slice onto the terminology of the underlying semantic
knowledge representation. The ontology is presented in hierarchical form, allowing
annotators to easily browse and select concepts for the displayed slice. Furthermore,
the annotation user interface includes several drop-down lists, which offer an
annotation vocabulary for additional properties, such as the difficulty or interactivity
level of the content. Finally, the selected annotations are stored in a triple store.</p>
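      <p>The annotation step can be pictured as adding subject-property-value statements about a slice. The sketch below uses a small in-memory class as a stand-in for a triple store; the class, the property names (hasConcept, difficulty) and the slice identifiers are hypothetical and do not reflect the API of any particular store.</p>
      <preformat><![CDATA[
```python
class AnnotationStore:
    """Minimal in-memory stand-in for a triple store (illustration only)."""

    def __init__(self):
        # Each annotation is a (slice id, property, value) triple.
        self.triples = set()

    def annotate(self, slice_id, prop, value):
        """Record one annotation selected by an annotator."""
        self.triples.add((slice_id, prop, value))

    def slices_with(self, prop, value):
        """Return the ids of all slices carrying the given annotation."""
        return sorted(s for s, p, o in self.triples if p == prop and o == value)


store = AnnotationStore()
store.annotate("slice-17", "hasConcept", "Firewall")
store.annotate("slice-17", "difficulty", "basic")
store.annotate("slice-42", "hasConcept", "Firewall")
```
]]></preformat>
      <p>Looking up all slices annotated with the concept "Firewall" then returns both slice identifiers, which is the kind of link between open-corpus content and ontology concepts that the later retrieval stages rely on.</p>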
      <p>As a result of this two-stage approach, the original user-generated forum content as
well as the knowledge resources have been annotated and consequently integrated
with the semantic knowledge representation of the domain ontology. Even if the
annotations are not as complete or accurate as the fine-grained technical
documentation, they nevertheless enable partial reasoning, adaptation and
personalisation during the content retrieval and composition stage.
</p>
    </sec>
    <sec id="sec-6">
      <title>6 Knowing and adapting to the user</title>
      <p>Knowing the different characteristics, context and preferences of users is crucial
for the development of any adaptive and personalised system. In the particular case of
Personalised Customer Care, there are a number of user characteristics that product
providers can adapt to.</p>
      <p>First of all, a customer is using one or more particular products or services out of a
potentially large portfolio from the company in question. Instead of leaving users
to sort through search results in order to find the information that is related to their
particular product, a system can automatically adapt the information retrieval and
result composition accordingly. Secondly, upon each interaction with a search system,
the user has a particular product state. For example, a user might have just purchased
the product, consequently finding him-/herself at the “product installation and
activation state”. Other examples would be the state of “configuring” after installation
or the execution of “pro-active actions” (e.g. the user simply wants to find out more
about a certain feature) or “re-active actions” (e.g. an error message has occurred in
the product and the user wants to solve the problem). Another characteristic of a
customer is one’s personal knowledge state, which depends on previous interactions
with the product and the search system. Users could range from being complete
novices to being considerable experts regarding particular parts of the product.
Additionally, users can have different content preferences, for example some users
might prefer looking at the content that contains the procedures for solving a problem,
whereas other users might prefer to consult explanations or overviews first. Also,
language preferences of users can be used during the adaptation phase, given that
most of the content produced by a company as well as the community forums are
available in different languages. In particular, consider a user who types in a keyword
query in his/her native language other than English. If this particular user also speaks
English, the system can adaptively retrieve additional resources in the case of poor
coverage in the user’s native language.</p>
      <p>In addition to these user characteristics and preferences, there are additional axes
of adaptation that arise at query time. A particular query can have a question type,
which represents the type of intent of the user’s question. For example, a user can
have a query that is a “what”-type question, which requires an explanation as an
answer. On the other hand, a “how” question requires the result to be a tutorial or
procedure that the user has to follow in order to solve a particular problem.
Additionally, the preferred answer structure might vary from query to query. For
example, some queries are preferably answered with a “highly structured” result
composition (including overviews, explanations, tutorials, related items, etc.),
whereas a “quick” answer would simply provide a tutorial or reference resources (e.g.
registry entry values, etc.).</p>
      <p>The different user characteristics and preferences are stored using a hybrid user
model, consisting of simple key-value pairs (e.g. for language preferences), semantic
structures that mirror the domain ontology (i.e. overlay user model), as well as
keyword vectors that represent users’ historical interactions with the system (i.e.
based on resources a user has looked at/clicked on).
</p>
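      <p>One possible way to picture the three parts of such a hybrid user model is sketched below in Python. The class, its field names, the knowledge gain per viewed resource and the novice threshold are assumptions made for illustration; the source does not specify how the model is implemented.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HybridUserModel:
    # Simple key-value pairs, e.g. language preferences.
    preferences: Dict[str, str] = field(default_factory=dict)
    # Overlay model: domain-ontology concept -> estimated knowledge in [0, 1].
    concept_knowledge: Dict[str, float] = field(default_factory=dict)
    # Keyword vector built from resources the user has viewed or clicked on.
    keyword_weights: Dict[str, float] = field(default_factory=dict)

    def record_view(self, keywords, concept, gain=0.2):
        """Update the history vector and the overlay after a resource is viewed."""
        for kw in keywords:
            self.keyword_weights[kw] = self.keyword_weights.get(kw, 0.0) + 1.0
        current = self.concept_knowledge.get(concept, 0.0)
        self.concept_knowledge[concept] = min(1.0, current + gain)

    def is_novice(self, concept, threshold=0.3):
        """Treat concepts with little or no recorded knowledge as unfamiliar."""
        return self.concept_knowledge.get(concept, 0.0) < threshold
```
]]></preformat>
      <p>Keeping the three representations separate means that each can be updated and queried in its own way: preferences by direct lookup, the overlay through the domain ontology, and the keyword vector through similarity measures.</p>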
    </sec>
    <sec id="sec-7">
      <title>7 Retrieval and Composition System architecture</title>
      <p>In order to provide multi-dimensional adaptation, the domain and user models need
to be consolidated with the Adaptive Hypermedia concept of a Narrative/Adaptation
Model (as mentioned in section 2). This model contains the particular rules on i) what
should be adapted on and ii) how the adaptation should occur. In this section, a
Retrieval and Composition system architecture will be explained, which incorporates
these three models in order to deliver Personalised Customer Care. The retrieval and
composition process is broken down into several stages (see Figure 3) and incorporates
influences from the areas of Adaptive Hypermedia, Semantic Search and Information
Retrieval. In particular, this work extends an initial prototype presented in [18], which
has already proven the benefits of personalised retrieval and composition of
open-corpus content in an educational scenario.</p>
      <p>In the first stage, a user is requested to input a standard keyword query, along with
a drop-down selection of query types (i.e. what/how). Additionally, users indicate
their current activity or intent regarding the product, i.e. getting started, reacting to a
problem, etc. Ideally, this property would already be stored in a user model (e.g. from
previous interactions with the product or search engine), thus not requiring a user to
manually select this information. The keyword query is executed on an indexed
version of the domain ontology, yielding a collection of instance results. From these
results, several statistics can be generated. First of all, it is possible to determine
which “conceptual area” of the domain ontology has yielded the most results, i.e.,
which are the high level concepts that have the most results. For example, by ordering
the results by their corresponding chapter or subject, one can infer the particular part
of the domain ontology that contains many of the keywords. Additionally, by
analysing the search results, it is also possible to generate statistics about the type of
content that is retrieved, such as the activity-level (i.e. amount of procedures and
tutorials), the compositional properties (number of detailed subsections results), etc.</p>
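      <p>The first-stage statistics could be computed along the following lines. This is a sketch under the assumption that each keyword-search hit is available as a dictionary with a "chapter" key (the high-level concept the instance belongs to) and a "type" key; both keys and the function name are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def summarise_results(results):
    """Compute first-stage statistics over keyword-search hits.

    Each hit is a dict with the hypothetical keys 'chapter' and 'type'
    (e.g. 'procedure', 'tutorial', 'overview').
    """
    # Count hits per high-level concept to find the dominant conceptual area.
    by_area = Counter(hit["chapter"] for hit in results)
    # Activity level: share of hits that are procedures or tutorials.
    procedural = sum(1 for hit in results if hit["type"] in ("procedure", "tutorial"))
    return {
        "top_area": by_area.most_common(1)[0][0] if by_area else None,
        "area_counts": dict(by_area),
        "activity_level": procedural / len(results) if results else 0.0,
    }
```
]]></preformat>
      <p>The dominant area and the activity level are then available as plain values that a narrative rule can test when deciding how to group and extend the results.</p>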
      <p>These initial statistics are used in a second stage to group results and to extend the
subject space in order to personalise the results shown to the user. By consolidating
the initial results with the domain and user model, a strategy is then applied to provide
a “storyline” across the conceptual space. Particular ontological relationships of the
initial results are followed depending on user model preferences. For example, in the
case of a user who has just purchased the product, knowledge items (i.e. instances)
that focus on installing and activating the product are added to the results. Another
example would be to add related instances that fill a particular user’s current
knowledge gap (e.g. overview resources about a product feature, related features,
etc.). Also, the activity level and difficulty level of instances influence their inclusion
in the result space based on the user model preferences. At the end of this second step,
a complete personalised result space has been selected from the domain ontology,
which is not only more personally relevant than the initial results, but also more
diversified and complete, containing additional relevant instances that would not have
been found using conventional keyword search. The different results are composed
according to their ontological relationships (provided by the domain ontology), their
subject coverage, as well as their relevance to the querying user.</p>
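      <p>The extension of the subject space can be illustrated with a small rule-following sketch. The relation names (introducedBy, installedBy), the user state label and the knowledge threshold are assumptions introduced for this example; the actual narrative rules are not given in the source.</p>
      <preformat><![CDATA[
```python
def extend_result_space(initial, relations, user_state, knowledge):
    """Add related instances to the initial hits (illustrative rules only).

    relations: instance id -> list of (relation, target id) pairs
    knowledge: instance id -> estimated user knowledge in [0, 1]
    """
    extended = list(initial)
    for instance in initial:
        for relation, target in relations.get(instance, []):
            wanted = (
                # Fill knowledge gaps with higher-level introductions.
                (relation == "introducedBy" and knowledge.get(target, 0.0) < 0.5)
                # New customers also get installation and activation items.
                or (relation == "installedBy" and user_state == "installation")
            )
            if wanted and target not in extended:
                extended.append(target)
    return extended
```
]]></preformat>
      <p>For a user in the installation state with no recorded knowledge, a configuration hit would be extended with both its introduction and the related installation item, whereas an experienced user in the configuring state would receive the hit on its own.</p>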
      <p>In stages 4 and 5, additional resources are retrieved by generating and executing
expanded information retrieval queries across the user-generated content base. For
each instance result in the extended subject space, an adapted query is generated,
which contains the various aspects of resources that should be retrieved (in terms of
keywords and metadata attributes). By indexing the content as well as the
user-generated annotations, structured queries can be used to retrieve topically as well as
personally relevant data. Additionally, logical operators and query term weights are
used in order to minimise overlap between the different result sets.</p>
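      <p>A sketch of how one adapted query per ontology instance might be assembled is given below. The field names (text, concept, difficulty, lang) are hypothetical, and the string merely follows the general style of Lucene's classic query parser (fielded terms, boosts written with a caret, and the operators AND, OR and NOT); it is not the query format used by the described system.</p>
      <preformat><![CDATA[
```python
def build_adapted_query(keywords, concept, user_prefs, exclude_concepts=()):
    """Build one expanded query string for a single ontology instance.

    Field names are assumptions; the syntax imitates Lucene's classic
    query parser (field:term, term^boost, AND/OR/NOT).
    """
    # The user's original keywords, matched against the slice text.
    parts = ["(" + " OR ".join(f"text:{kw}" for kw in keywords) + ")"]
    # Boost slices that annotators linked to the instance's concept.
    parts.append(f'concept:"{concept}"^2')
    # Metadata attributes taken from the user model.
    for key in ("difficulty", "lang"):
        if key in user_prefs:
            parts.append(f"{key}:{user_prefs[key]}")
    query = " AND ".join(parts)
    # Exclude concepts covered by other instance queries to limit overlap.
    for other in exclude_concepts:
        query += f' NOT concept:"{other}"'
    return query
```
]]></preformat>
      <p>Generating the query as a string keeps the sketch independent of any particular search library; a real implementation would more likely build the query through the library's own query objects.</p>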
      <p>In the final step, the different results are composed together with the instance
results from the domain ontology in order to provide a complete result space. The
combined sets are grouped, sequenced and linked according to the particular structure
of the personalised subject space that was generated in step 2. This additional notion
of sequence or narrative corresponds to a typical Adaptive Hypermedia presentation
that guides users through the result space rather than presenting a flat list [13]. For
example, for a novice user, advanced features are preceded by simpler
(overview-type) resources, and followed by additionally relevant/related results. Also, due to
these highly structured and personalised characteristics of the result space, additional
Adaptive Hypermedia techniques can be applied. For example, on the result overview
page, visual cues and link annotations guide a user to the currently most appropriate
items to look at. Lastly, the composition of both organisational content as well as
user-generated content ensures structure while still maintaining great topic coverage.
</p>
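      <p>The sequencing idea can be sketched as a sort over result kinds whose order depends on the user. The item keys (title, kind, score) and the two orderings are assumptions made for illustration and are not the rule set of the described narrative.</p>
      <preformat><![CDATA[
```python
def sequence_for_user(items, is_novice):
    """Order a result group so that it reads as a narrative.

    Each item is a dict with the hypothetical keys 'title', 'kind' and
    'score'. The order of kinds is an assumed narrative.
    """
    if is_novice:
        order = ["overview", "explanation", "procedure", "related"]
    else:
        order = ["procedure", "explanation", "related", "overview"]
    rank = {kind: position for position, kind in enumerate(order)}
    # Unknown kinds go last; ties within a kind are broken by relevance score.
    return sorted(items, key=lambda it: (rank.get(it["kind"], len(order)), -it["score"]))
```
]]></preformat>
      <p>With this ordering a novice sees an overview before the procedure and the related forum content, while an experienced user is taken to the procedure first.</p>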
    </sec>
    <sec id="sec-8">
      <title>8 Ongoing Work</title>
      <p>The system implementation is currently being completed using a variety of
technologies. The organisational content has been transformed into the Web Ontology
Language (OWL)<sup>2</sup> using customised scripts, whereas the annotation store consists of a
standard installation of the ARC triple store<sup>3</sup>. To ensure both efficiency as well as
reasoning capabilities, the domain ontology is stored in both eXist<sup>4</sup> (which allows
efficient indexing using the built-in Lucene<sup>5</sup> functionality), as well as its ontological
form (for reasoning during the extended subject search stage). The retrieval and
composition system builds on work presented in an educational scenario [18] (see
Figure 4) and uses an Adaptive Engine to consolidate the User, Domain and Narrative
Models. Ontological reasoning is performed within the Adaptive Engine using the
Jena Framework6. Similarly, the extended queries are generated by the rules encoded
in the narrative, which can either be scripted (JavaScript), or rule-based (Drools7).
The adapted queries are executed on an indexed version of the annotated content
slices and the results are presented in a web-based interface using JSP and JavaScript.</p>
      <p>The system evaluation will consist of authentic users performing activities over the
domain content, with assessment measures focussing on retrieval accuracy and
2 http://www.w3.org/2004/OWL/
3 http://arc.semsol.org/
4 http://exist.sourceforge.net/
5 http://www.exist-db.org/lucene.html
6 http://jena.sourceforge.net/
7 http://www.jboss.org/drools/
appropriateness, as well as the general task assistance in terms of task completion
time and user effort. A second evaluation will capture typical user queries, which will
be used as test evaluations of system response accuracy by product experts.</p>
    </sec>
    <sec id="sec-9">
      <title>9 Conclusions</title>
      <p>This paper has presented a novel approach to providing personalised information
retrieval and composition from a variety of heterogeneous data sources. The presented
architectures for structuring and retrieving both structured and user-generated content
combine the latest advances in Personalisation, Semantic Search, Information
Retrieval and the Social Web. Firstly, existing content resources are leveraged
and structured in order to make them reusable, as well as suitable for adaptation and
personalisation. Secondly, large sets of user-generated content are annotated using a
socio-semantic annotation tool. Finally, an adaptive retrieval and composition
architecture is responsible for aggregating the different data sources into personalised
result presentations, which guide users towards relevant and appropriate resources.</p>
      <p>The system is presented in a Customer Care scenario, which provides both the
necessary heterogeneous data sources and the context for different user
information needs and preferences. It makes full use of existing organisational
structured knowledge and applies this across the user-generated content. The intended
user experience is an improved customer care service that provides
automated, personalised assistance without the need for technical support staff
intervention. Existing socio-semantic resources are hence leveraged and combined not
only to improve customer satisfaction, but also to save costs for the product provider.</p>
    </sec>
    <sec id="sec-10">
      <title>Acknowledgements</title>
      <p>This research is supported by Science Foundation Ireland (Grant 07/CE/I1142) as
part of the Centre for Next Generation Localisation (http://www.cngl.ie) at Trinity
College Dublin. We would like to acknowledge the contributions of the Localisation
Department at Symantec, especially Fred Hollowood, Johann Roturier and Jason Rickard,
who have provided us with a variety of customer care content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Micarelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gasparetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sciarrone</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Gauch</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Personalized Search on the World Wide Web</article-title>
          .
          <source>In: The Adaptive Web, LNCS</source>
          , vol.
          <volume>4321</volume>
          , pp.
          <fpage>195</fpage>
          -
          <lpage>230</lpage>
          . (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baeza-Yates</surname>
            ,
            <given-names>R. A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ribeiro-Neto</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Modern Information Retrieval</article-title>
          .
          <publisher-name>Addison-Wesley Longman Publishing Co., Inc.</publisher-name>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Speretta</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Gauch</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Personalized Search Based on User Search Histories</article-title>
          .
          <source>In: Web Intelligence, WI2005</source>
          , pp.
          <fpage>622</fpage>
          -
          <lpage>628</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Teevan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S. T.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Horvitz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Personalizing search via automated analysis of interests and activities</article-title>
          .
          <source>In: Proceedings of the 28th Annual international ACM SIGIR Conference on Research and Development in information Retrieval</source>
          , SIGIR '05. (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Koutrika</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ioannidis</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A Unified User Profile Framework for Query Disambiguation and Personalization</article-title>
          .
          <source>In: Proceedings of the Workshop on New Technologies for Personalized Information Access, PIA2005</source>
          , pp.
          <fpage>44</fpage>
          -
          <lpage>53</lpage>
          , Edinburgh, Scotland, UK (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Micarelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Sciarrone</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Anatomy and empirical evaluation of an adaptive web-based information filtering system</article-title>
          .
          <source>In: User Modeling and User-Adapted Interaction</source>
          ,
          <volume>14</volume>
          ,
          <issue>2-3</issue>
          , pp.
          <fpage>159</fpage>
          -
          <lpage>200</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Lau</surname>
            ,
            <given-names>F. C.</given-names>
          </string-name>
          :
          <article-title>A new visual search interface for web browsing</article-title>
          .
          <source>In: Second ACM international Conference on Web Search and Data Mining</source>
          , Barcelona, Spain (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>