=Paper= {{Paper |id=None |storemode=property |title=Multilingual Ontology-based User Profile Enrichment |pdfUrl=https://ceur-ws.org/Vol-571/paper8.pdf |volume=Vol-571 |dblpUrl=https://dblp.org/rec/conf/www/LucaPKA10 }} ==Multilingual Ontology-based User Profile Enrichment== https://ceur-ws.org/Vol-571/paper8.pdf
       Multilingual Ontology-based User Profile Enrichment

              Ernesto William De Luca, Till Plumbaum, Jérôme Kunegis, Sahin Albayrak
                                            DAI Lab, Technische Universität Berlin
                                         Ernst-Reuter-Platz 7, 10587 Berlin, Germany
            {ernesto.deluca, till.plumbaum, jerome.kunegis, sahin.albayrak}@dai-labor.de




Copyright is held by the author/owner(s).
WWW2010, April 26-30, 2010, Raleigh, North Carolina.

ABSTRACT
In this paper, we discuss the possibility of enriching user profiles with multilingual information. English is currently the de facto standard language of commerce and science; however, users also speak and interact in other languages. This creates the need to enrich user profiles with multilingual information. Therefore, we propose to combine ontology-based user modeling with the information included in the RDF/OWL EuroWordNet hierarchy. In this way, we can personalize retrieval results according to user preferences, filtering relevant information while taking into account the multilingual background of the user.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing

General Terms
RDF/OWL, Web 2.0, Multilingualism, EuroWordNet

Keywords
Multilingual Semantic Web, User Modeling

1.   INTRODUCTION
At present most of the demand for text retrieval is well satisfied by monolingual systems, because English is the de facto standard language of commerce and science. However, there is a wide variety of circumstances in which a reader might find multilingual retrieval techniques useful. Being able to read a document in a foreign language does not always imply that a person can formulate appropriate queries in that language as well. Furthermore, dealing with polysemous words seems to be more difficult in multilingual than in monolingual retrieval tasks.

Every text retrieval approach has two basic components: the first for representing texts (queries and documents) and the other for comparing them. This automated process is successful when its results are similar to those produced by human comparison of queries and documents. Queries and documents often differ in length, however. While a query is often quite short, documents might be up to hundreds of pages long. Moreover, users frequently adopt a vocabulary that is not contained in the documents, known as the paraphrase problem.

Multilingual Retrieval. When working in a multilingual environment, words have to be disambiguated both in the native and in the other languages. In this case the combination of multilingual text retrieval and word sense disambiguation (WSD) approaches is crucial [2]. In order to retrieve the same concept in different languages, some relations between the searched concept and its translations have to be built. WSD is used to convert relations between words into relations between concepts; sense disambiguation can be acquired for words, but it is more difficult for documents. To have accurate WSD, we need a larger coverage of semantic and linguistic knowledge than is available in current lexical resources.

Because we focus on multilingual concepts, we decided to use EuroWordNet [6], a variant of WordNet, the best-known lexical database available. In previous work, we extended the RDF/OWL WordNet representation [5] for multilingualism, leading to our own RDF/OWL EuroWordNet representation [3].

Ontology-based User Modeling. With the advent of the Web 2.0 and the growing impact of the Internet on our everyday life, people use more and more different web applications. They manage their bookmarks in social bookmarking systems, communicate with friends on Facebook (http://www.facebook.com/) and use services like Twitter (http://twitter.com/) to express personal opinions and interests. Thereby, they generate and distribute personal and social information like interests, preferences and goals [4]. This distributed and heterogeneous corpus of user information, stored in the user model (UM) of each application, is a valuable source of knowledge for adaptive systems like information filtering services. These systems can utilize such knowledge for personalizing search results, recommending products or adapting the user interface to user preferences. Adaptive systems are highly needed, because the amount of information available on the Web is increasing constantly, requiring more and more effort from users to manage it adequately. Therefore, these systems need more, and more precise, information about users' interests, preferences, needs and goals. However, this personal and social information stored in the distributed UMs usually exists in different languages, because we communicate with friends all over the world. Also, today's adaptive systems are usually part of web applications and typically only have access to the information stored in that specific application. Therefore, we enhance the user model aggregation process by adding valuable and important meta-data, which leads to better user models and thus to better adaptive systems. For this reason, we propose a combination of RDF/OWL EuroWordNet with ontology-based aggregation techniques.
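As a rough illustration of the combination proposed above, the following minimal sketch merges interests from several user models into one profile keyed by language-independent concepts. It is not the authors' implementation: the dictionary `TOY_ILI` is a hand-written stand-in for the EuroWordNet Interlingual Index, the concept identifiers such as `ili:drive` are invented, and the function names `aggregate_profiles` and `translations` are hypothetical.

```python
from collections import defaultdict

# Toy stand-in for the EuroWordNet Interlingual Index (ILI): it maps a
# (language, word) pair to a language-independent concept identifier.
# Identifiers and entries are invented for illustration only.
TOY_ILI = {
    ("en", "drive"): "ili:drive",
    ("nl", "rijden"): "ili:drive",
    ("es", "conducir"): "ili:drive",
    ("it", "guidare"): "ili:drive",
    ("en", "ride"): "ili:ride",
    ("es", "cabalgar"): "ili:ride",
    ("it", "cavalcare"): "ili:ride",
}


def aggregate_profiles(user_models, ili=TOY_ILI):
    """Merge per-application interests into one concept-keyed profile.

    user_models maps an application name to a list of (language, word)
    interests. Words missing from the index are returned separately
    instead of being guessed.
    """
    aggregated = defaultdict(set)
    unresolved = []
    for app_name, interests in user_models.items():
        for language, word in interests:
            concept = ili.get((language, word.lower()))
            if concept is None:
                unresolved.append((app_name, language, word))
            else:
                aggregated[concept].add((language, word.lower()))
    return dict(aggregated), unresolved


def translations(concept, language, ili=TOY_ILI):
    """List the words that the index links to a concept in one language."""
    return sorted(
        word for (lang, word), c in ili.items()
        if c == concept and lang == language
    )


if __name__ == "__main__":
    models = {
        "bookmarks": [("en", "drive"), ("it", "guidare")],
        "microblog": [("es", "conducir"), ("es", "cabalgar"), ("de", "fahren")],
    }
    profile, unresolved = aggregate_profiles(models)
    print(profile)
    print(unresolved)
    print(translations("ili:drive", "nl"))
```

Keying the aggregated profile by concept rather than by word is what lets interests expressed in different languages count as the same interest; a real system would resolve words against the RDF/OWL EuroWordNet data and would need word sense disambiguation before the lookup.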
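[Figure 1 (image not reproduced): the attributes of several user models are merged into an aggregated profile and linked through the Interlingual Index of RDF/OWL EuroWordNet to equivalent words in several languages, e.g. move, drive, ride; gaan, rijden, berijden; mover, conducir, cabalgar; andare, guidare, cavalcare.]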
Figure 1: Integrating semantic knowledge about multilingual dependencies with the information stored in the user models.

2.   PROPOSED SEMANTIC USER MODELING AGGREGATION
RDF/OWL EuroWordNet opens new possibilities for overcoming the problem of language heterogeneity in different user models and thus allows better user model aggregation. Therefore, we propose an ontology-based user modeling approach that combines mediator techniques, which aggregate user models from different applications, with the EuroWordNet information, which is used to handle the multilingual information in the models. Based on this idea, we define two requirements that we have to fulfill.

Requirement 1: Ontology-based profile aggregation. We need an approach to aggregate information that is both application independent and application overarching. This requires a solution that allows us to semantically define relations and correspondences between different attributes of different UMs. The linked attributes must be easily accessible by applications such as recommender and information filtering systems. In addition, similarity must be expressed in these defined relations.

Requirement 2: Integrating semantic knowledge. A solution to handle the multilingual information for enriching user profiles is needed. Hence, we introduce a method to incorporate information from semantic data sources such as EuroWordNet and to aggregate complete profile information. We decided to use an ontology as the conceptual basis of our approach to meet the first requirement explained above. Therefore, a meta-ontology is used to link attributes of different UMs that contain equal or similar content.

The definition of a meta-model based on the meta-ontology can be divided into two steps. First, we define a concrete meta-model for a specific domain we want to work with, such as music, movies or personal information. The meta-model can be an already existing model, like FOAF (http://www.foaf-project.org/), or a proprietary model that only certain applications understand. Next, we describe how to connect multilingual attribute information stored in different user models.

3.   MULTILINGUAL ONTOLOGY-BASED AGGREGATION
To enrich the user model with multilingual information, as described above, we decided to utilize the knowledge available in RDF/OWL EuroWordNet [3]. We want to leverage this information and use it for more precise and qualitatively better user modeling. We treat the external semantic resources as a huge semantic profile that can be used to enrich the user model and add valuable extra information (see Figure 1). The aggregation of information into semantic profiles and user models is performed similarly to the approach described in [1], by using components that mediate between the different models. We extend this approach by using a combined user model, aggregated with the proposed ontology.

To use the information contained in RDF/OWL EuroWordNet, we developed a framework that allows us to define several mediators that take the information from user models and query different sources in the Semantic Web for more information. These mediators are specialized components that read a user model and collect additional data from an external source.

4.   CONCLUSION
In this paper, we presented the possibility of enriching user profiles with information included in the RDF/OWL EuroWordNet hierarchy to better filter results during the search process. This aggregated information can be used in our multilingual semantic information retrieval system, which is described in more detail in [2]. In this work, we have shown that we can handle the high heterogeneity of distributed data, especially concerning multilingual heterogeneity, using aggregated user profiles that have been enriched with information contained in the RDF/OWL EuroWordNet representation. This gives us the possibility to personalize retrieval results according to user preferences, filtering relevant information while taking into account the multilingual background of the user.

5.   REFERENCES
[1] Shlomo Berkovsky, Tsvi Kuflik, and Francesco Ricci. Mediation of user models for enhanced personalization in recommender systems. User Modeling and User-Adapted Interaction, 18(3):245–286, 2008.
[2] Ernesto William De Luca. Semantic Support in Multilingual Text Retrieval. Shaker Verlag, Aachen, Germany, 2008.
[3] Ernesto William De Luca, Martin Eul, and Andreas Nürnberger. Converting EuroWordNet in OWL and extending it with domain ontologies. In Proc. Workshop on Lexical-semantic and Ontological Resources, 2007.
[4] Till Plumbaum, Tino Stelter, and Alexander Korth. Semantic web usage mining: Using semantics to understand user intentions. In Proc. Conf. on User Modeling, Adaptation and Personalization, pages 391–396, 2009.
[5] Mark van Assem, Aldo Gangemi, and Guus Schreiber. WordNet in RDFS and OWL. Technical report, W3C, 2004.
[6] Piek Vossen. EuroWordNet general document, version 3, final, 1999.



