<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic and Bayesian profiling services for textual resource retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eufemia Tinelli</string-name>
          <email>tinelli@di.uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierpaolo Basile</string-name>
          <email>basilepp@di.uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugenio Di Sciascio</string-name>
          <email>disciascio@poliba.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <email>semeraro@di.uniba.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari, Italy</institution>
          ,
          <addr-line>Bari 70126</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Italy</institution>
          ,
          <addr-line>Bari 70126</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Computer Science, University of Bari, Italy</institution>
          ,
          <addr-line>Bari 70126</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents an integrated approach to textual resource retrieval, which combines logical inference services with user profiles in which a structured representation of the user interests is maintained. Learning is performed on documents that have been disambiguated by exploiting the WordNet lexical database, in an attempt to discover concepts describing user interests. The proposed approach relies on several additional features compared to classical lexical knowledge systems, including structured user recommendation, numeric value management, and the definition of strict and negotiable constraints and keywords to retrieve potentially interesting resources with respect to both user request and profile.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>The main goal of this paper is to propose a strategy
to design advanced semantic search engines based on the
idea of combining semantic matchmaking with Bayesian text
categorization. By means of formal ontologies, modeled using
OWL [24], the knowledge on a specific domain is modeled and
exploited in order to make explicit the implicit knowledge, and
to reason on it by means of the formal semantics expressed in
OWL. On the other hand, a content-based recommender, which
is able to learn user profiles from disambiguated documents, is
used for customized search. The recommender exploits lexical
knowledge in the linguistic ontology WordNet [26].</p>
      <p>The success of a retrieval system strongly relies also on
query formulation and ranking functions. Especially for an
ontology-based system, the query language has to be very
simple for the end user but, at the same time, its expressiveness
must be able to capture the real user needs and to retrieve only
what the user is really looking for. In this paper we present a
system able both to help the user during the query formulation
process via an intensional navigation of the ontology, and to
return relevant resources via a ranking function exploiting both
the ontology-related semantics of the query and the user profile
managed by the content-based recommender.</p>
      <p>Hence, the system suggests interesting items to the user by
taking into account three elements: user profiles, semantic item
descriptions and lexical item descriptions.</p>
      <p>The rest of the paper is structured as follows: the next
section outlines the work which mainly inspired this paper. In
Section III a brief summary of semantic-based matchmaking
in Description Logics is presented together with a Naïve Bayes
method for user profiling. The description of the framework
architecture and of a domain reference ontology, together with an
example query satisfying user needs, is presented
in Section IV, while conclusions close the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>II. RELATED WORK</title>
      <p>Recent years have witnessed a growing interest in
profiling-based resource retrieval. Among the most relevant
systems adopting a Bayesian classifier we cite LIBRA [22],
which produces content-based book recommendations by
exploiting product descriptions obtained from Amazon.com Web
pages. Documents are represented by using keywords and are
subdivided into slots, each one corresponding to a specific
section of the document, such as authors, title and abstract. SiteIF [20]
exploits a sense-based representation to build a user profile
as a semantic network whose nodes represent senses of the
words in documents requested by the user. OntoSeek [14]
explores the role of linguistic ontologies in knowledge-based
retrieval systems. AMAYA [27] delivers context-aware
recommendations, which are based on provided feedback, context
data, and an ontology-based content categorization scheme.
The former systems deal with user profiles and on this basis can
provide a prediction/recommendation about interesting items
for the end user, whereas ITR [11] uses content-based filtering
algorithms.</p>
      <p>Our reference system is ITem Recommender (ITR), whose
strategy is to shift from a keyword-based document
representation to a sense-based one in order to integrate lexical
knowledge in the indexing step of training documents. Several
methods have been proposed to accomplish this task. In [21]
it is proposed to include WordNet information at the feature level
by expanding each word in the training set with all its
synonyms, including those available for each sense, in order
to avoid a word sense disambiguation (WSD) process [28].
This approach has shown a decrease of effectiveness in the
obtained classifier, mostly due to the word ambiguity problem.
In [28] it is pointed out that some kind of disambiguation is
required in any case. Subsequent works [3], [32] show that
embedding WSD in document classification tasks can improve
classification accuracy.</p>
      <p>Besides, various example-based search tools have been
developed to improve search and visualization, such as
SmartClient [33]. SmartClient uses constraint satisfaction
techniques, allows users to refine (critique) preference values specified
in the first step of the search and supports trade-off analysis
among different attributes, e.g., when looking for an apartment a
user can make a compromise between distance and rent (more
distant, less expensive). Also in [19] a candidate/critiques
model has been presented, which allows users to refine
candidate solutions proposed. Here, preferences are elicited
incrementally by analyzing critiques through subsequent
iterations. It is an Automated Travel Assistant (ATA) for planning
airline travels, and similarly to SmartClient, ATA exploits CSP
techniques: preferences are described using soft constraints
defined on the values of attributes. AptDecision [31] is a
tool supporting elicitation of preferences in the real estate
domain: browsing the domain, users can discover new features
of interest and through their refinement of apartment features,
agents can build a profile of their preferences using learning
techniques. FindMe [6] uses case-based reasoning as a way
of recommending products in e-commerce catalogs. FindMe,
and its enhanced version The Wasabi Personal Shopper [4],
combines instance-based browsing and tweaking by difference.
Different FindMe-like systems have been developed, in various
domains. Among systems based on FindMe the most renowned
is Entrée [5], a restaurant recommender, which allows users
to refine a query on the basis of the results displayed, so it is
possible to choose a restaurant less expensive or closer than
the one shown after the first query.</p>
      <p>Recently, there has been a growing interest toward systems
supporting semantics exploitation, in different domains. In
[15] an application is presented, improving traditional web
searching using semantic web technologies: two Semantic
Search applications are presented, running on an application
framework called TAP, which provides a set of simple
mechanisms for sites to publish data onto the Semantic Web and
for applications to consume these data via a query interface
called GetData. The results provided by the system are then
compared with traditional text search results of Google.it
Web pages. Story Fountain [23] is an ontology-based tool,
which provides a guided exploration of digital stories using
a reasoning engine for the selection and organization of
resources. Story Fountain provides support for six different
exploration facilities to aid users engaged in the exploration
process. The system is being used by the tour guides at Bletchley
Park. The approach has been further investigated in [8]. An
intelligent query interface exploiting an ontology-based search
engine is presented in [7]; the system enables access to data
sources through an integrated ontology and supports a user
in formulating a query even in the case of ignorance of the
vocabulary of the underlying information system.</p>
      <p>We do not present here related work on semantic
matchmaking (the interested reader is referred to [9]) but only a
framework of semantic-enabled e-marketplaces aimed at fully
exploiting semantics of supply/demand descriptions in B2C
and C2C e-marketplaces [13]. The main features of this framework
are the following: full exploitation of non-standard inferences
for explanation services in the query-retrieval-refinement loop;
semantic-based ranking in the request answering; a fully
graphical and usable interface, which requires no prior knowledge
of any logic principles, though fully exploiting logic in the
back office.</p>
    </sec>
    <sec id="sec-3">
      <title>III. BASIC SERVICES AND ALGORITHMS</title>
      <p>A close relation exists between OWL and Description
Logics. In fact, the formal semantics of the OWL DL sub-language
is grounded in the theoretical studies on Description Logics. We
assume the reader to be familiar with the basics of Description
Logics and with two standard inference services provided by
a DL reasoner: Subsumption and Satisfiability [2].</p>
      <p>Given a query Q and an item to be retrieved I the following
match classes can be identified with respect to an ontology
T (see [12], [18], [25]).</p>
      <p>• exact - T ⊨ Q ≡ I. I is semantically equivalent to Q.
All the characteristics expressed in Q are present in I
and I does not expose any additional characteristic with
respect to Q.</p>
      <p>• full - T ⊨ I ⊑ Q. I is more specific than Q. All the
characteristics expressed in Q are provided by I and I
also exposes other characteristics, both not required by Q
and not in conflict with the ones in Q.</p>
      <p>• plug-in - T ⊨ Q ⊑ I. Q is more specific than I. All the
characteristics expressed in I are provided by Q and Q
also requires other characteristics, both not exposed by I
and not in conflict with the ones in I.</p>
      <p>• potential - T ⊭ I ⊓ Q ⊑ ⊥. Q is compatible with I.
Nothing in Q is logically in conflict with anything in I.</p>
      <p>• partial - T ⊨ I ⊓ Q ⊑ ⊥. Q is not compatible with
I. Something in Q is logically in conflict with some
characteristic in I.</p>
      <p>With respect to the above classification, in case of potential
match a similarity measure is needed to understand “how
potentially” I satisfies Q.</p>
      <p>The semantic similarity between a query and an item to
be retrieved can be computed with the aid of the algorithm
rankPotential [12]. Starting from the unfolded version (i.e.,
normalized with respect to the reference ontology) of both
the query and the item description, the algorithm is able to
quantify how much of the information requested in the query is
missing in the item description. In order to understand the
approach we consider the following trivial example where the
ontology is just a simple taxonomy<sup>1</sup>.</p>
      <p>
        <disp-formula><tex-math><![CDATA[\mathcal{T} = \left\{ \begin{array}{l} B \sqsubseteq A \\ C \sqsubseteq B \sqcap E \\ D \equiv A \sqcap F \end{array} \right.]]></tex-math></disp-formula>
      </p>
      <p>With respect to the previous T, consider the query Q = C ⊓ D
and the item I = E ⊓ B ⊓ G. Referring to the above
classification we see that Q and I are a potential match.
Unfolding T in both Q and I we obtain Q = C ⊓ B ⊓ A ⊓ E ⊓ F
and I = E ⊓ B ⊓ A ⊓ G. Since the third axiom in the ontology
is an equivalence axiom, we rewrite D with A ⊓ F instead
of expanding it as for B and C. Once we have the unfolded
version of Q and I we say that two pieces of information,
{C, F}, are missing in I in order to completely satisfy Q
(and then reach a full match). Since the maximum number
of missing pieces of information is equal to the length of the
unfolded Q, in this case five, we assign a normalized semantic
similarity score of (1 − 2/5) to the previous example match.
Then in the most general case, given an ontology T and two
concepts Q and I, the semantic similarity score is computed by
the following formula:
      </p>
      <p>
        <disp-formula id="eq1"><label>(1)</label><tex-math><![CDATA[\mathit{rank} = 1 - \frac{\mathit{rankPotential}(I, Q)}{\mathit{rankPotential}(\top, Q)}]]></tex-math></disp-formula>
where ⊤ is the most generic concept in every DL ontology.
Obviously the previous score is equal to 1 only in the case of
full match.
      </p>
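      <p>As a minimal sketch of the worked example above, the following Python fragment reproduces the score under strong simplifying assumptions: concepts are conjunctions of atomic names only (no roles or number restrictions) and the taxonomy is acyclic. The names SUBSUMED_BY, EQUIVALENT_TO, unfold, rank_potential and rank are hypothetical helpers introduced here for illustration; they are not the rankPotential implementation of [12].</p>
      <preformat><![CDATA[
```python
# Toy ontology from the example. Inclusion axioms add the parents of a name,
# the equivalence axiom replaces the defined name with its definition.
SUBSUMED_BY = {"B": {"A"}, "C": {"B", "E"}}   # B ⊑ A, C ⊑ B ⊓ E
EQUIVALENT_TO = {"D": {"A", "F"}}             # D ≡ A ⊓ F


def unfold(concept):
    """Return the set of atomic names obtained by unfolding a conjunction."""
    result, todo = set(), list(concept)
    while todo:
        name = todo.pop()
        if name in EQUIVALENT_TO:
            todo.extend(EQUIVALENT_TO[name])        # rewrite, drop the name itself
        elif name not in result:
            result.add(name)
            todo.extend(SUBSUMED_BY.get(name, ()))  # keep the name, add its parents
    return result


def rank_potential(item, query):
    """Count the atoms of the unfolded query that the unfolded item lacks."""
    return len(unfold(query) - unfold(item))


def rank(item, query):
    """Equation (1): 1 - rankPotential(I, Q) / rankPotential(TOP, Q)."""
    worst = rank_potential(set(), query)   # the empty conjunction stands for ⊤
    if worst == 0:                         # empty query: nothing can be missing
        return 1.0
    return 1 - rank_potential(item, query) / worst


query = {"C", "D"}
item = {"E", "B", "G"}
print(rank(item, query))
```
]]></preformat>
      <p>With the query and item of the example, two of the five atoms of the unfolded query are missing in the item, so the sketch yields 1 − 2/5.</p>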
      <p>On the other hand, we consider the problem of learning
user profiles as a binary text categorization task [29]. Each
document has to be classified as interesting or not with respect
to the user preferences. Therefore, the set of categories is C =
{c<sub>+</sub>, c<sub>−</sub>}, where c<sub>+</sub> is the positive class (user-likes) and c<sub>−</sub>
the negative one (user-dislikes). We present a method able to
learn sense-based profiles by exploiting an indexing procedure
based on WordNet.</p>
      <p><sup>1</sup> For the sake of simplicity, in this example we do not consider roles, even
if rankPotential is able to deal with them for ALN ontologies.</p>
      <p>We extend the classical bag-of-words (BOW) model [29] to a model in
which the senses (meanings) corresponding to the words in
the documents are considered as features. The goal of the
WSD algorithm is to associate the appropriate sense s to a
word w in document d, by exploiting its context C (a set of
words that precede and follow w). The sense s is selected
from a predefined set of possibilities, usually known as sense
inventory, which in our algorithm is obtained from WordNet
[26]. The basic building block of WordNet is the SYNSET
(SYNonym SET), a set of words with synonymous meanings
which represents a specific sense of a word. The text in
d is processed in two basic phases: the document is first
tokenized and then, after removing stopwords, part of speech
(POS) ambiguities are solved for each token. Reduction to
lemmas is performed and then synset identification with WSD
is performed: w is disambiguated by determining the degree of
semantic similarity among candidate synsets for w and those
of each word in C. The proper synset assigned to w is the one with
the highest similarity with respect to its context of use. The
semantic similarity measure adopted is the Leacock-Chodorow
measure [17]. Similarity between synsets a and b is inversely
proportional to the distance between them in the WordNet is-a
hierarchy, measured by the number of hops in the shortest path
from a to b. The algorithm starts by defining the context C of
w as the set of words in the same slot of w having the same
POS as w, then it identifies both the sense inventory X<sub>w</sub> for w
and the sense inventory X<sub>j</sub> for each word w<sub>j</sub> in C. The sense
inventory T for the whole context C is given by the union
of all X<sub>j</sub>. After this step, we measure the similarity of each
candidate sense s<sub>i</sub> ∈ X<sub>w</sub> to that of each sense s<sub>h</sub> ∈ T and then
the sense assigned to w is the one with the highest similarity
score. Each document is mapped into a list of WordNet synsets
following three steps.</p>
      <p>1) each monosemous word w in a slot of d is mapped into
the corresponding WordNet synset;
2) for each pair of words ⟨noun, noun⟩ or ⟨adjective, noun⟩,
a search in WordNet is made to verify if at least one
synset exists for the bigram ⟨w<sub>1</sub>, w<sub>2</sub>⟩. In the positive
case, the algorithm is applied on the bigram, otherwise
it is applied separately on w<sub>1</sub> and w<sub>2</sub>; in both cases
all words in the slot are used as the context C of the
word(s) to be disambiguated;
3) each polysemous unigram w is disambiguated by the
algorithm, using all words in the slot as the context C
of w.</p>
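      <p>The disambiguation step can be illustrated with a small, self-contained sketch. It assumes a hypothetical toy is-a hierarchy (PARENT) and sense inventory (SENSES) in place of WordNet, and it uses the usual form of the Leacock-Chodorow measure, −log(nodes on the shortest path / (2 × taxonomy depth)); none of the identifiers below come from the ITR implementation.</p>
      <preformat><![CDATA[
```python
import math

# Hypothetical toy is-a hierarchy (child -> parent); a real system uses WordNet.
PARENT = {
    "animal": "entity", "artifact": "entity",
    "feline": "animal", "cat.animal": "feline", "mouse.animal": "animal",
    "device": "artifact", "mouse.device": "device", "keyboard": "device",
}
# Hypothetical sense inventory: word -> candidate synsets.
SENSES = {
    "mouse": ["mouse.animal", "mouse.device"],
    "keyboard": ["keyboard"],
    "cat": ["cat.animal"],
}
MAX_DEPTH = 4  # nodes on the longest chain, e.g. entity > animal > feline > cat.animal


def ancestors(synset):
    """Path from a synset up to the root, the synset itself included."""
    path = [synset]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def hops(a, b):
    """Number of is-a edges on the shortest path between two synsets."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(node for node in up_a if node in up_b)
    return up_a.index(common) + up_b.index(common)


def lch_similarity(a, b):
    """Leacock-Chodorow: -log(nodes_on_path / (2 * taxonomy_depth))."""
    return -math.log((hops(a, b) + 1) / (2 * MAX_DEPTH))


def disambiguate(word, context):
    """Pick the sense of `word` most similar to some sense of the context words."""
    context_senses = [s for w in context for s in SENSES[w]]
    if not context_senses:               # no context: fall back to the first sense
        return SENSES[word][0]
    return max(
        SENSES[word],
        key=lambda cand: max(lch_similarity(cand, s) for s in context_senses),
    )


print(disambiguate("mouse", ["keyboard"]))
print(disambiguate("mouse", ["cat"]))
```
]]></preformat>
      <p>In this toy setting the context word decides the sense: next to "keyboard" the device sense is closer in the hierarchy, next to "cat" the animal sense is.</p>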
      <p>A new version of the WSD algorithm has been recently
produced [30].</p>
      <p>The WSD procedure is used to obtain a synset-based vector
space representation that we called Bag-Of-Synsets (BOS). In
this model, a synset vector corresponds to a document, instead
of a word vector. Each document is represented by a set of
slots. Each slot is a textual field corresponding to a specific
feature of the document, in an attempt to take into account
also the structure of documents.</p>
      <p>Formally, assume that we have a collection of N documents,
each document being subdivided into M slots. For n = 1, 2, ..., N
and slot index m = 1, 2, ..., M, the n-th document is reduced
to M bags of synsets, one for each slot:</p>
      <p>
        <disp-formula><tex-math><![CDATA[d_{nm} = \langle t_{nm1}, t_{nm2}, \ldots, t_{nmD_{nm}} \rangle]]></tex-math></disp-formula>
where t<sub>nmk</sub> is the k-th synset in slot s<sub>m</sub> of document d<sub>n</sub> and
D<sub>nm</sub> is the total number of synsets appearing in the m-th slot
of document d<sub>n</sub>. For all n, k and m, t<sub>nmk</sub> ∈ V<sub>m</sub>, which is
the vocabulary for the slot s<sub>m</sub> (the set of all different synsets
found in slot s<sub>m</sub>). Document d<sub>n</sub> is finally represented in the
vector space by M synset-frequency vectors:</p>
      <p>
        <disp-formula><tex-math><![CDATA[f_{nm} = \langle w_{nm1}, w_{nm2}, \ldots, w_{nmD_{nm}} \rangle]]></tex-math></disp-formula>
where w<sub>nmk</sub> is the weight of the synset t<sub>nmk</sub> in the slot s<sub>m</sub>
of document d<sub>n</sub>. The weight can be computed in different ways:
it can simply be the number of times synset t<sub>nmk</sub> appears in
slot s<sub>m</sub> or a more complex TF-IDF score. Our hypothesis is
that the proposed indexing procedure helps to obtain profiles
able to recommend documents semantically closer to the user
interests.</p>
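      <p>A minimal sketch of the BOS representation follows, assuming that each document has already been disambiguated into a list of synset identifiers per slot. The identifiers ("s1", "s2", ...) and the function names are hypothetical, the weight used is the plain occurrence count, and the vector is indexed by the slot vocabulary rather than by synset position.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def bag_of_synsets(document):
    """Turn {slot: [synset, ...]} into {slot: Counter of synset frequencies}."""
    return {slot: Counter(synsets) for slot, synsets in document.items()}


def frequency_vector(bag, vocabulary):
    """Frequencies of one slot's bag, ordered as in that slot's vocabulary."""
    return [bag[synset] for synset in vocabulary]


document = {"title": ["s1", "s2", "s1"], "abstract": ["s3"]}
bos = bag_of_synsets(document)
print(frequency_vector(bos["title"], ["s1", "s2", "s3"]))
```
]]></preformat>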
      <p>As a strategy to learn user profiles on BOS-indexed
documents, ITem Recommender (ITR) uses a Naïve Bayes text
categorization algorithm to build profiles as binary classifiers
(user-likes vs user-dislikes). The induced probabilistic model
estimates the a posteriori probability, P(c<sub>j</sub>|d<sub>i</sub>), of document
d<sub>i</sub> belonging to class c<sub>j</sub> as follows:</p>
      <p>
        <disp-formula id="eq2"><label>(2)</label><tex-math><![CDATA[P(c_j|d_i) = P(c_j) \prod_{t_k \in d_i} P(t_k|c_j)^{N(d_i,t_k)}]]></tex-math></disp-formula>
where N(d<sub>i</sub>, t<sub>k</sub>) is the number of times token t<sub>k</sub> occurs in
document d<sub>i</sub>. In ITR, each document is encoded as a vector
of BOS, one for each slot. Therefore, equation (2) becomes:
        <disp-formula id="eq3"><label>(3)</label><tex-math><![CDATA[P(c_j|d_i) = \frac{P(c_j)}{P(d_i)} \prod_{m=1}^{|S|} \prod_{k=1}^{|b_{im}|} P(t_k|c_j,s_m)^{n_{kim}}]]></tex-math></disp-formula>
where S = {s<sub>1</sub>, s<sub>2</sub>, . . . , s<sub>|S|</sub>} is the set of slots, b<sub>im</sub> is the
BOS in the slot s<sub>m</sub> of d<sub>i</sub>, and n<sub>kim</sub> is the number of occurrences
of token t<sub>k</sub> in b<sub>im</sub>. Training is performed on BOS-represented
documents, thus tokens are WordNet synsets, and the induced
model relies on synset frequencies. To calculate (3), the system
has to estimate P(c<sub>j</sub>) and P(t<sub>k</sub>|c<sub>j</sub>, s<sub>m</sub>) in the training phase.
The documents used to train the system are rated on a discrete
scale from 1 to MAX, where MAX is the maximum rating that
can be assigned to a document. According to an idea proposed
in [22], each training document d<sub>i</sub> is labeled with two scores, a
“user-likes” score w<sub>+</sub><sup>i</sup> and a “user-dislikes” score w<sub>−</sub><sup>i</sup>, obtained
from the original rating r:
        <disp-formula id="eq4"><label>(4)</label><tex-math><![CDATA[w_{+}^{i} = \frac{r - 1}{MAX - 1}; \qquad w_{-}^{i} = 1 - w_{+}^{i}]]></tex-math></disp-formula>
      </p>
      <p>
        The scores in (4) are exploited for weighting the occurrences
of tokens in the documents and to estimate their probabilities
from the training set TR. The prior probabilities of the classes
are computed according to the following equation:
        <disp-formula id="eq5"><label>(5)</label><tex-math><![CDATA[\hat{P}(c_j) = \frac{\sum_{i=1}^{|TR|} w_{j}^{i} + 1}{|TR| + 2}]]></tex-math></disp-formula>
      </p>
      <p>
        Witten-Bell smoothing [34] is adopted to compute
P(t<sub>k</sub>|c<sub>j</sub>, s<sub>m</sub>), by taking into account that documents are
structured into slots and that token occurrences are weighted
using scores in equation (4):
        <disp-formula id="eq6"><label>(6)</label><tex-math><![CDATA[\hat{P}(t_k|c_j,s_m) = \begin{cases} \dfrac{N(t_k,c_j,s_m)}{V_{c_j} + \sum_i N(t_i,c_j,s_m)} & \text{if } N(t_k,c_j,s_m) \neq 0 \\[2ex] \dfrac{V_{c_j}}{V_{c_j} + \sum_i N(t_i,c_j,s_m)} \cdot \dfrac{1}{V - V_{c_j}} & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
where N(t<sub>k</sub>, c<sub>j</sub>, s<sub>m</sub>) is the count of the weighted occurrences
of token t<sub>k</sub> in the slot s<sub>m</sub> in the training data for class c<sub>j</sub>, V<sub>c<sub>j</sub></sub>
is the total number of unique tokens in class c<sub>j</sub>, and V is the
total number of unique tokens across all classes. N(t<sub>k</sub>, c<sub>j</sub>, s<sub>m</sub>)
is computed as follows:
        <disp-formula id="eq7"><label>(7)</label><tex-math><![CDATA[N(t_k,c_j,s_m) = \sum_{i=1}^{|TR|} w_{j}^{i} \, n_{kim}]]></tex-math></disp-formula>
In (7), n<sub>kim</sub> is the number of occurrences of token t<sub>k</sub> in
slot s<sub>m</sub> of document d<sub>i</sub>. The sum of all N(t<sub>k</sub>, c<sub>j</sub>, s<sub>m</sub>) in the
denominator of equation (6) denotes the total weighted length
of the slot s<sub>m</sub> in class c<sub>j</sub>. In other words, P̂(t<sub>k</sub>|c<sub>j</sub>, s<sub>m</sub>) is
estimated as the ratio between the weighted occurrences of t<sub>k</sub>
in slot s<sub>m</sub> of class c<sub>j</sub> and the total weighted length of the slot.
The final outcome of the learning process is a probabilistic
model used to classify a new document in the class c<sub>+</sub> or c<sub>−</sub>.
This model is the user profile, which includes those tokens
that turn out to be most indicative of the user preferences,
according to the value of the conditional probabilities in (6).
      </p>
    </sec>
    <sec id="sec-4">
      <title>IV. SYSTEM FEATURES AND ARCHITECTURE</title>
      <p>
        Based on the previous techniques we built a system (see Figure 1)
enabling users to perform:
• semantic searching by selecting ontology classes and
properties;
• personalized searching based on user profiles and item
information;
• semantic-personalized searching obtained by combining
the two types of searching.
      </p>
      <p>The system relies on the following information:
• item information - defined by two sets of information.
The first set is composed of intrinsic information. For
example, in a bibliographic research scenario, intrinsic
information could be: item identification number, authors,
title, abstract, slot types. The second set is composed of
information produced by the classifier during the training
step: the bag of synsets obtained by the WSD process on
each slot;
• user profile - learned by the ITR system, as described in
the previous section;
• item semantic description - an OWL description of the
reference ontology.</p>
      <p>All the above information is stored in a repository, while the
reference vocabulary adopted to define semantic profiles is the
WordNet lexical database. The recommender system architecture
is composed of several modules; each one has a specific
role and instantiates a part of the repository. The Interface
Module allows users to define semantic item descriptions and user
requests. It provides a GUI to browse the hierarchy of concepts
and to outline properties of the selected class. This feature
of the GUI helps the user to define requests as descriptions
which are logically consistent w.r.t. the reference ontology.
The user request is split into two main parts:
• full: in the full part of the request, the user sets the
constraints she wants to be satisfied by (in full match
with) the retrieved items.
• potential: here the user sets her preferences, i.e., her
wishful options. The ontology-based score is computed
measuring “how many” of these constraints are satisfied
by a retrieved item.</p>
      <p>We stress the fact that, in order to fulfill user needs, the
search process could be an iterative one. On the basis of the
returned results, a user is able to refine the previous request by
exchanging a full constraint for a potential one and vice versa.
Our recommender system considers user query features as full
constraints by default and as potential only when explicitly
stated by the user [10].</p>
      <p>The Profile Engine Module prepares the item information
by performing WSD on the textual descriptions of the items to
be recommended. Mainly built on the ITR system, it performs
the training step on the disambiguated text in order to infer the
user profiles which will be exploited in the recommendation
process. Each inferred user profile is a binary classifier able to
categorize an item as interesting or not interesting according
to the classification score of the class user-likes.</p>
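      <p>The training and classification steps performed by the Profile Engine, i.e. equations (3) to (7) of Section III, can be sketched as follows. This is an illustrative sketch and not the ITR code: the function names, the dictionary-based data layout and the synset identifiers are hypothetical, and it assumes that both classes receive some weighted occurrences during training. The guard used when V equals V<sub>c<sub>j</sub></sub> is an assumption added here, since equation (6) is undefined in that case.</p>
      <preformat><![CDATA[
```python
import math
from collections import defaultdict

CLASSES = ("+", "-")  # user-likes, user-dislikes


def train(training_set, max_rating):
    """training_set: list of (rating, {slot: {synset: occurrences}}) pairs."""
    mass = dict.fromkeys(CLASSES, 0.0)
    counts = {c: defaultdict(lambda: defaultdict(float)) for c in CLASSES}
    for rating, doc in training_set:
        w_plus = (rating - 1) / (max_rating - 1)                  # equation (4)
        for cls, weight in zip(CLASSES, (w_plus, 1 - w_plus)):
            mass[cls] += weight
            for slot, bag in doc.items():
                for synset, n in bag.items():
                    if weight * n > 0:                            # skip empty contributions
                        counts[cls][slot][synset] += weight * n   # equation (7)
    priors = {c: (mass[c] + 1) / (len(training_set) + 2) for c in CLASSES}  # equation (5)
    return priors, counts


def token_probability(counts, cls, slot, synset):
    """Witten-Bell estimate of P(t_k | c_j, s_m), equation (6)."""
    in_class = {t for bag in counts[cls].values() for t in bag}
    vocabulary = {t for c in CLASSES for bag in counts[c].values() for t in bag}
    v_c, v = len(in_class), len(vocabulary)
    slot_bag = counts[cls].get(slot, {})
    slot_length = sum(slot_bag.values())      # total weighted length of the slot
    n = slot_bag.get(synset, 0.0)
    if n != 0:
        return n / (v_c + slot_length)
    # Assumption: guard against V == V_cj, which equation (6) leaves undefined.
    return v_c / (v_c + slot_length) / max(v - v_c, 1)


def log_score(priors, counts, cls, doc):
    """Logarithm of equation (3) without P(d_i), which is equal for both classes."""
    total = math.log(priors[cls])
    for slot, bag in doc.items():
        for synset, n in bag.items():
            total += n * math.log(token_probability(counts, cls, slot, synset))
    return total


def classify(priors, counts, doc):
    """Return "+" (user-likes) or "-" (user-dislikes)."""
    return max(CLASSES, key=lambda c: log_score(priors, counts, c, doc))


training = [
    (5, {"title": {"s_match": 2, "s_onto": 1}}),
    (4, {"title": {"s_match": 1}}),
    (1, {"title": {"s_cook": 2}}),
    (2, {"title": {"s_cook": 1, "s_onto": 1}}),
]
priors, counts = train(training, max_rating=5)
print(classify(priors, counts, {"title": {"s_match": 1}}))
```
]]></preformat>
      <p>Working in log space avoids numeric underflow when the product in equation (3) ranges over many synsets.</p>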
      <p>The Match Engine Module implements matchmaking and
ranking algorithms. It compares the query with the
descriptions of items referring to the same OWL ontology.
The reasoner is not embedded within the application, so the
Ontology Manager communicates with the Matchmaker via a
DIG 1.1 interface over HTTP. The Match Engine is used to
manage full and potential constraints. We think that when the
user sets a request characteristic as full, she wants to be sure
that the full features are explicitly mentioned in the item description
to be retrieved. The Match Engine evaluates full or potential
matches in the following way:
• all potential constraints - if the user sets all request features
as potential ones, then the returned items will be those
potentially satisfying the user request. According to the Open
World Assumption, returned items could have additional
or missing features w.r.t. the request but no features in
logical conflict with any in the request. In this case the
module runs a potential match;
• all full constraints - if the user sets all request features as
full ones, then the returned items have to express explicitly
at least all the requested features. Of course this does not
prevent the item description from also including features that
were not requested. In this case the module evaluates a full match;
• mixed - in this case the module first computes the full
matches, considering a temporary request composed of
full features only. The set of items returned in the previous
step could also contain features in logical conflict with
some of those in the potential request, so the module runs a
potential match with the potential part of the request to
discard these results.</p>
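      <p>The three cases above can be approximated with a short sketch in which a DL reasoner is replaced by plain set operations. This is only an assumption made for illustration: items and requests are sets of atomic feature names, a full match is set inclusion, and a logical conflict is a pair of features listed in a hypothetical DISJOINT table. The actual system delegates these checks to a reasoner through the DIG interface.</p>
      <preformat><![CDATA[
```python
# Hypothetical disjointness axioms, taken from the bibliographic example.
DISJOINT = {
    frozenset({"Book", "Article"}),
    frozenset({"Book", "inProceeding"}),
}


def in_conflict(features_a, features_b):
    """True if some pair of features is declared disjoint."""
    return any(frozenset({a, b}) in DISJOINT
               for a in features_a for b in features_b)


def retrieve(items, full=frozenset(), potential=frozenset()):
    """Return the ids of items satisfying the full and potential parts."""
    result = []
    for item_id, features in items.items():
        if not set(full) <= features:          # full match: every feature is explicit
            continue
        if in_conflict(features, potential):   # potential match: no logical conflict
            continue
        result.append(item_id)
    return result


items = {
    "i1": {"inProceeding", "ResearchProject"},
    "i2": {"Book", "ResearchProject"},
    "i3": {"inProceeding"},
}
print(retrieve(items, full={"ResearchProject"}, potential={"inProceeding"}))
```
]]></preformat>
      <p>Calling retrieve with only the full argument or only the potential argument corresponds to the first two cases; passing both corresponds to the mixed case.</p>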
      <p>
        Results returned by the Profile Engine are defined by the
pair ⟨item identifier, relevance rate⟩, whilst the ones
returned by the Match Engine are defined by the pair
⟨item identifier, rank value⟩, according to equations (3)
and (1), respectively.
      </p>
      <p>The Recommender Manager allows the communication and
synchronization between the Profile and Match Engines. This
module is designed to deal with two issues: results ordering and
synchronization. In web-based retrieval systems, ranked lists
are surely preferable to unordered sets of items. Nevertheless,
they become less effective and usable as the complexity of the
item description increases, together with possible user
preferences. At least two implementations of the Recommender
Engine are possible:
1) the Recommender Manager operates in two steps. In the first
step, this module receives user requests, which are OWL
descriptions translated into DIG; then it activates the Profile
and Match Engines. In the second step, the Recommender
Engine fetches the results of the user request from the two previous
modules, then it computes a unique score by adding the match
rank and the classifier relevance rate value. The newly ranked
results are sent to the GUI;
2) the Match Engine works as a filter to produce a subset of
items which is sent to the Profile Engine. The Profile Engine
uses only this subset to answer the user request.</p>
      <p>Our recommender system implements the first approach
and computes the result score according to the following simple
formula: score = α ∗ relevance + β ∗ rank, where α and
β are numeric coefficients. As an initial attempt we set
both coefficients to 0.5. The evaluation of several
experimental tests and the usage of different reference ontologies
could require changing the values of these coefficients.</p>
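      <p>A minimal sketch of this score combination is given below. The function name combine and the dictionary layout are hypothetical; treating an item that is missing from one of the two result lists as having a value of 0 there is an assumption of this sketch, since the text does not say how such items are handled.</p>
      <preformat><![CDATA[
```python
def combine(relevance, rank, alpha=0.5, beta=0.5):
    """Merge {item: relevance} and {item: rank} into a list sorted by score."""
    scored = {
        # Assumption: an item absent from one result list contributes 0 there.
        item: alpha * relevance.get(item, 0.0) + beta * rank.get(item, 0.0)
        for item in relevance.keys() | rank.keys()
    }
    # Highest score first; ties are broken by item identifier.
    return sorted(scored.items(), key=lambda pair: (-pair[1], pair[0]))


profile_results = {"a": 0.9, "b": 0.2}          # <item identifier, relevance rate>
match_results = {"a": 0.6, "b": 1.0, "c": 0.8}  # <item identifier, rank value>
for item, score in combine(profile_results, match_results):
    print(item, round(score, 2))
```
]]></preformat>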
      <p>As a real scenario we propose queries formulated in terms of
an ontology which models the bibliographic research domain
[16]. The ontology is defined by the following classes (Figure 2):
• Item - has subclasses such as Article, inProceeding, Book;
• Event - has subclasses such as Conference, Workshop, Meeting;
• Topic - the topic hierarchy will be defined by specific
research topics; for example, we may use the Computer Science terms
of the ACM Topic Hierarchy [1];
• Author - the author hierarchy will be defined by sev
• Project - the project hierarchy will be defined by specific
research projects, adding properties such as financedBy.
Besides, the item concept can be defined by the following
properties: aboutProject (has project class as range),
hasPublicationYear, presentedAt (has event class as range),
hasAuthor (has author class as range), developedBy
(has organization class as range) and
hasTopic (has topic class as range). Obviously this domain ontology
can be extended by specific journal item properties like
hasVolume and hasMonth, and by specific book item properties
like hasPublisher and hasEdition.</p>
      <p>Finally, in order to retrieve only interesting items,
several disjoint sets are defined. For example, Book is disjoint
from Article, inProceeding, inBook, inCollection and
Proceedings, while inProceeding is disjoint from Book,
Thesis, Booklet and Manual.</p>
      <p>In this scenario a user can propose queries such as the
combination of (a) “I’m looking for an inProceeding item
published after 2004, developed in a research project by
both Enterprise and University” and (b) “I am interested
in items with matchmaking as keyword in title, including
my profile”. In the previous request, (a) is the semantic
query for the matchmaker and (b) is the profile-based query for
the classifier. According to the domain ontology, the
semantic query (a) is translated into the following DL
description:</p>
      <p>inProceeding ⊓ (≥ 2004 hasPublicationYear) ⊓
∀aboutProject.(ResearchProject ⊓ ∀developedBy.
(Enterprise ⊓ University)).</p>
    </sec>
    <sec id="sec-5">
      <title>V. CONCLUSION</title>
      <p>In this paper, we have described a strategy to design an
advanced semantic search engine able to combine logic-based
matchmaking with Bayesian text categorization. WordNet
has been exploited in order to enhance the text
categorization performed by a Bayesian approach for automated
user profile learning. By combining probabilistic and logic-based
similarity measures, we have shown how to compute a score
representing the match degree between a query and an item
description. The score takes into account both the
ontology-based query and the user profile.</p>
      <p>An initial prototype implementing the proposed approach
has been developed and presented in the paper. Currently,
we are performing experiments on large datasets in order to
validate the proposed approach.</p>
    </sec>
  </body>
</article>