=Paper= {{Paper |id=Vol-1388/DeCat2015-preface |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1388/DeCat2015-preface.pdf |volume=Vol-1388 }} ==None== https://ceur-ws.org/Vol-1388/DeCat2015-preface.pdf
                 DeCAT 2015 - Workshop on
             Deep Content Analytics Techniques
           for Personalized and Intelligent Services

              Lora Aroyo (1), Geert-Jan Houben (2), Pasquale Lops (3),
                  Cataldo Musto (3), and Giovanni Semeraro (3)

    (1) Department of Computer Science, VU University Amsterdam, The Netherlands
    (2) Delft University of Technology (TU Delft), The Netherlands
    (3) Department of Computer Science, University of Bari “A. Moro”, Italy




1       Introduction

According to a recent claim by IBM, 90% of the data available today have been
created in the last two years. This uncontrolled and exponential growth of the
online information gave new life to the research in the area of user modelling
and personalization, since information about users' preferences, sentiment and
opinions can now be obtained by mining data gathered from many heterogeneous
sources.
    As an example, many recent works rely on the analysis of the content posted
by people on social networks and micro-blogs to unveil latent information about
their interests, automatically extract people's personality traits, build
preference models on the basis of textual reviews, and so on. At the same time,
the recent
phenomenon of (Linked) Open Data fueled this research line by making available
a huge amount of machine-readable textual data.
    All these trends paved the way to the design of intelligent and personalized
systems able to extract some real value from this plethora of raw textual
content produced on the Web: examples of such services are online brand
monitoring platforms, social recommender systems and smart-city applications,
such as incident detection systems or personalized city tour planners.
    However, a complete exploitation of such textual streams requires a
comprehension of the information conveyed by people. In turn, this requires a
deep understanding of the language, which is not trivial. The major goal of
this workshop is to draw the attention of the scientific community to the
aforementioned topics. The workshop aims to provide a forum for discussing open
problems, challenges and innovative research approaches in the area, in order
to investigate whether the adoption of techniques for semantic content
representation and deep content analytics can be effective in building a new
generation of intelligent and personalized services based on the analysis of
Social, Big and Linked Open Data.

2   Motivations and Workshop Topics

The importance of user modeling and personalization is taken for granted in
several scenarios. According to this widespread paradigm, each user can be
modeled through some (explicitly or implicitly gathered) information about her
knowledge or her preferences, in order to adapt the behavior of a generic
intelligent system to her specific characteristics.
    However, the rapid growth of social networks changed the rules for
personalization, since the spread of these platforms radically changed and
renewed many consolidated behavioral paradigms. Indeed, people today exploit
these platforms for decision-making tasks, to support causes, to provide their
circles with recommendations or even to express opinions and discuss the city
or the place where they live. Thanks to the heterogeneous nature of the
discussions that take place on social networks, a lot of new data are
continuously available and can be gathered and exploited to build richer and
more complete user models, to discover latent communities, to infer information
about users' emotions and personality traits, and also to study very complex
phenomena, such as those related to the psycho-social sphere, in a totally new
way. At the same time, thanks to crowdsourcing, a huge amount of content-based
information has been made available in open knowledge sources such as Wikipedia
and the Linked Open Data Cloud.
    Given that most of the information stored in these modern data silos is
made available as textual content, a complete exploitation of these rich
information sources requires a big effort on the definition of models and
techniques able to effectively process the content and to represent it in a
machine-readable form, in order to unveil the latent semantics and trigger more
effective personalization and adaptation pipelines. This is not a trivial task,
since this process requires a deep comprehension of the language, which in turn
typically requires a combination of techniques coming from the Machine Learning
and Natural Language Processing areas.
    The main goal of the workshop is to stimulate the discussion around
problems, challenges and research directions regarding the exploitation of
content-based information sources (Big, Social and Linked Data) for
personalization and adaptation tasks and to foster the design of a new
generation of intelligent user-centered services.
    We hope the workshop will stimulate discussions around the presented pa-
pers, the invited talk and the following questions:

 – What is the impact of semantics in personalization and adaptation tasks?
 – Can social media improve the representation of user interests?
 – Can semantic analysis techniques improve the representation of user interests?
 – Can data silos (Wikipedia, DBpedia, Freebase) be useful for
   personalization and adaptation tasks?
 – Which data silos are more effective to model user interests and preferences?
 – What content-based information is more useful to personalize and adapt the
   behavior of modern intelligent systems?
 – Does a semantic representation of the information improve the effectiveness
   of personalization tasks?
 – Does a semantic representation of the information improve the transparency
   of such platforms?
 – Can the analysis of content coming from social media provide some infor-
   mation about user personality traits?
 – How do people deal with privacy issues? Are they willing to trade more
   extensive tracking of their activities on the Web for better personalization?
 – Is it possible to think about a novel generation of adaptive platforms able
   to completely exploit all the available information?

3   Contributions

Five papers will be presented at DeCAT 2015. The papers were accepted after
a peer-review process: each paper was reviewed by at least two members of the
Program Committee and evaluated in terms of Significance, Technical Quality
and Novelty of the approach.
    In their contribution, Abela et al. [1] tackle the Personal Information
Management (PIM) problem, and propose a methodology to automatically organise
personal information accessed by the user into task-clusters. To this aim, the
authors transparently exploit the user's behaviour while performing some tasks.
A distinguishing aspect of their work is the use of the PiMx app, a tool which
can be of interest to other researchers working on task clustering.
    Next, Papadopoulos et al. [2] present ongoing work on the formalization of
a person's creativity, modelling it in terms of four characteristics of the
person's content creations, namely novelty, surprise, rarity and recreational
effort. Based on this formalization, the paper also presents the Creativity
Profiling Server (CPS), a system implementing the aforementioned user modelling
framework for computing and maintaining creativity profiles.
    The analysis of social media is the focus of the work proposed by Matta et al.
[3]. In this paper the authors perform an interesting analysis of the connection
between Bitcoin's price and the volume of tweets about the topic. Specifically,
the authors use an external API to crawl Twitter data and assign a sentiment
to it. Next, they analyze how the price of Bitcoin changed over time and look
for connections between these aspects. A thorough analysis of the time series
shows that some connection (calculated as the cross-correlation between the
time series) exists.
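
    The cross-correlation mentioned above can be illustrated with a minimal
sketch. It is not the code used by Matta et al.: the data are invented toy
values, the function names (pearson, cross_correlation) are hypothetical, and
the lag convention (a positive lag means the second series trails the first)
is an assumption made here for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cross_correlation(x, y, max_lag):
    """Correlate x[t] with y[t + lag] for every lag in -max_lag..max_lag.

    A positive lag means y trails x by `lag` time steps.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        result[lag] = pearson(a, b)
    return result


# Toy data, invented for illustration: each spike in tweet volume is
# followed by a price spike on the next day.
tweet_volume = [10, 12, 30, 14, 11, 40, 13, 12, 35, 11]
price = [200, 201, 203, 221, 204, 202, 231, 203, 202, 226]

ccf = cross_correlation(tweet_volume, price, max_lag=2)
best_lag = max(ccf, key=lambda k: ccf[k])
print("lag with the strongest correlation:", best_lag)
```

    Scanning several lags rather than a single one is what distinguishes a
cross-correlation from a plain correlation: it indicates not only whether two
series move together, but also which one tends to move first.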
    In the only short paper accepted, Pentel investigates the relation between
reading and writing skills in the task of age-based categorization. In this
contribution [4] he presents the results of a study on age-based categorization
of short texts (85 words per author). He introduces a novel set of features
that work reliably with short texts and are easy to extract from the text
itself, without any external databases.
    Finally, Basile et al. [5] propose a content-based and time-aware movie
recommendation approach. The novel contribution is the time-adaptivity for a
content-based technique. The authors propose an approach that models short-term
preferences by adopting a content-based sliding window: when a new rating comes
into the system, the replacement of an older one is performed by taking into
account both a decay function for user interests and the content similarity
between the items on which ratings are provided, computed by distributional
semantics models. The authors carried out an evaluation that demonstrates that
their approach outperforms the baseline FIFO strategy.
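
    A content-based sliding window of this kind can be sketched as follows.
This is only one possible reading, not the method of Basile et al.: the way
decay and similarity are combined (a product of an exponential staleness term
and cosine similarity), the decay rate, the toy two-dimensional vectors
standing in for distributional semantic vectors, and all names are assumptions
introduced here for illustration.

```python
from math import exp, sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def update_window(window, new_entry, now, size, decay_rate=0.1):
    """Return the window after adding new_entry, evicting one entry if full.

    Each entry is a dict with keys 'item', 'vector' and 'time'.
    Assumed eviction rule: drop the entry that is both stale (old) and
    similar in content to the newly rated item, so that the new rating
    replaces the older rating it most plausibly supersedes.
    """
    if len(window) < size:
        return window + [new_entry]

    def eviction_score(old):
        staleness = 1.0 - exp(-decay_rate * (now - old["time"]))
        return staleness * cosine(old["vector"], new_entry["vector"])

    victim = max(window, key=eviction_score)
    return [e for e in window if e is not victim] + [new_entry]


# Toy window of size 2: A is the oldest entry, B is newer but has the same
# content vector as the incoming item C.
window = [
    {"item": "A", "vector": [1.0, 0.0], "time": 0},
    {"item": "B", "vector": [0.0, 1.0], "time": 5},
]
new_entry = {"item": "C", "vector": [0.0, 1.0], "time": 10}

updated = update_window(window, new_entry, now=10, size=2)
kept = [e["item"] for e in updated]
print(kept)
```

    In this toy case a FIFO strategy would evict A, the oldest entry, whereas
the sketched rule evicts B, because B is the entry whose content the new
rating overlaps with; A, which covers a different kind of content, stays in
the window.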


References
1. Abela, C., Staff, C., Handschuh, S. Automatic Task-Cluster Generation based on
   Document Switching and Revisitation. In Proceedings of DeCAT 2015 - 1st Work-
   shop on Deep Content Analytics Techniques for Personalized and Intelligent Ser-
   vices, co-located with UMAP 2015, Dublin (2015).
2. Papadopoulos, G., Karampiperis, P., Koukourikos, A., Konstantinidis, S. Creativity
   Profiling Server: Modelling the Principal Components of Human Creativity over
   Texts. In Proceedings of DeCAT 2015 - 1st Workshop on Deep Content Analytics
   Techniques for Personalized and Intelligent Services, co-located with UMAP 2015,
   Dublin (2015).
3. Matta, M., Lunesu, M.I., Marchesi, M. Bitcoin Spread Prediction Using Social And
   Web Search Media. In Proceedings of DeCAT 2015 - 1st Workshop on Deep Con-
   tent Analytics Techniques for Personalized and Intelligent Services, co-located with
   UMAP 2015, Dublin (2015).
4. Pentel, A. Employing Relation Between Reading and Writing Skills on Age Based
   Categorization of Short Estonian Texts. In Proceedings of DeCAT 2015 - 1st Work-
   shop on Deep Content Analytics Techniques for Personalized and Intelligent Ser-
   vices, co-located with UMAP 2015, Dublin (2015).
5. Basile, P., Caputo, A., de Gemmis, M., Lops, P., Semeraro, G. Modeling Short-
   Term Preferences in Time-Aware Recommender Systems. In Proceedings of DeCAT
   2015 - 1st Workshop on Deep Content Analytics Techniques for Personalized and
   Intelligent Services, co-located with UMAP 2015, Dublin (2015).