=Paper=
{{Paper
|id=None
|storemode=property
|title=Semantic Link Prediction through Probabilistic Description Logics
|pdfUrl=https://ceur-ws.org/Vol-778/paper8.pdf
|volume=Vol-778
|dblpUrl=https://dblp.org/rec/conf/semweb/RevoredoLC11
}}
==Semantic Link Prediction through Probabilistic Description Logics==
Semantic Link Prediction through Probabilistic
Description Logics
Kate Revoredo1 , José Eduardo Ochoa Luna2 , and Fabio Gagliardi Cozman2
1
Departamento de Informática Aplicada, Unirio
Av. Pasteur, 458, Rio de Janeiro, RJ, Brazil
2
Escola Politécnica, Universidade de São Paulo,
Av. Prof. Mello Morais 2231, São Paulo - SP, Brazil
katerevoredo@uniriotec.br,eduardo.ol@gmail.com,fgcozman@usp.br
Abstract. Predicting potential links between nodes in a network is a
problem of great practical interest. Link prediction is mostly based on
graph-based features and, recently, on approaches that consider the se-
mantics of the domain. However, there is uncertainty in these predictions;
by modeling it, one can improve prediction results. In this paper, we pro-
pose an algorithm for link prediction that uses a probabilistic ontology
described through the probabilistic description logic crALC. We use an
academic domain in order to evaluate this proposal.
1 Introduction
Many social, biological, and information systems can be well described by net-
works, where nodes represent objects (individuals), and links denote the rela-
tions or interactions between nodes. Predicting a possible link in a network is
an interesting issue that has recently gained attention, due to the growing inter-
est in social networks. For instance, one may be interested on finding potential
friendship between two persons in a social network, or a potential collaboration
between two researchers. Thus link prediction [12, 20] aims at predicting whether
two nodes (i.e. people) should be connected given that we know previous infor-
mation about their relationships or interests. A common approach is to exploit
the network structure, where numerical information about nodes is analyzed
[12, 20, 9]. However, knowledge about the objects represented in the nodes can
improve prediction results. For instance consider that the researchers Joe and
Mike do not have a publication in common, thus they do not share a link in a
collaboration network. Moreover, graph features do not indicate a potential link
between them. However, they have published in the same journal and they both
teach the same course in their respectively universities. This information can be
an indication of a potential collaboration between them. Given this, approaches
that are based on the semantics related to the domain of the objects represented
by the nodes [21, 18] have been proposed. In some of them, an ontology modeling
the domain and the object interests were used in the prediction task.
However, there is uncertainty in such predictions. Often, it is not possible
to guarantee the relationship between two objects (nodes). This is maybe due
to the fact that information about the domain is incomplete. Thus, it would
be interesting if link prediction approaches could handle the probability of a
link conditioned on the information about the domain. In our example, knowing
that the probability of the relationship between Joe and M ike conditioned on
the knowledge of them publishing in the same journal and teaching the same
course is high implies a link between them in the network; otherwise, a link is
not suggested. In graph-based approaches, probabilistic models learned through
machine learning algorithms were used for link prediction. Some examples of
probabilistic models are Probabilistic Relational Model (PRM) [6], Probabilistic
Entity Relationship Model (PERM) [7] and Stochastic Relational Model (SRM)
[22]. On approaches based on semantic we claim that ontologies must be used
to model the domain. Therefore, to model uncertainty, probabilistic approaches,
such as probabilistic ontologies, must be considered.
An ontology can be represented through a description logic [2], which is typi-
cally a decidable fragment of first-order logic that tries to reach a practical bal-
ance between expressivity and complexity. To encode uncertainty, a probabilistic
description logic (PDL) must be contemplated. The literature contains a number
of proposals for PDLs [8, 10, 19, 13]. In this paper we adopt a recently proposed
PDL, called Credal ALC (crALC) [4, 16, 5], that extends the popular logic ALC
[2]. In crALC one can specify sentences such as P (Professor|Researcher) = 0.4,
indicating the probability that an element of the domain is a Professor given
that it is a Researcher. These sentences are called probabilistic inclusions. Exact
and approximate inference algorithms that deal with probabilistic inclusions
have been proposed [4, 5], using ideas inherited from the theory of Relational
Bayesian Networks (RBN)[11].
In this paper, we propose to use a probabilistic ontology defined with the
PDL crALC for semantic link prediction.
The paper is organized as follows. Section 2 reviews basic concepts of PDLs
and crALC. Section 3 presents our algorithm for semantic link prediction through
the PDL crALC. Experiments are discussed in Section 4, and Section 5 con-
cludes the paper.
2 Probabilistic Description Logics and crALC
Description logics (DLs) form a family of representation languages that are typi-
cally decidable fragments of first order logic (FOL) [2]. Knowledge is expressed
in terms of individuals, concepts, and roles. The semantic of a description is
given by a domain D (a set) and an interpretation ·I (a functor). Individuals
represent objects through names from a set NI = {a, b, . . .}. Each concept in the
set NC = {C, D, . . .} is interpreted as a subset of a domain D. Each role in the
set NR = {r, s, . . .} is interpreted as a binary relation on the domain.
Several probabilistic descriptions logics (PDLs) have appeared in the litera-
ture. Heinsohn [8], Jaeger [10] and Sebastiani [19] consider probabilistic inclusion
axioms such as PD (Professor) = α, meaning that a randomly selected object is a
Professor with probability α. This characterizes a domain-based semantics: prob-
abilities are assigned to subsets of the domain D. Sebastiani also allows inclusions
such as P (Professor(John)) = α, specifying probabilities over the interpretations
themselves. For example, one interprets P (Professor(John)) = 0.001 as assigning
0.001 to be the probability of the set of interpretations where John is a Professor.
This characterizes an interpretation-based semantics.
The PDL crALC is a probabilistic extension of the DL ALC that adopts an
interpretation-based semantics. It keeps all constructors of ALC, but only allows
concept names on the left hand side of inclusions/definitions. Additionally, in
crALC one can have probabilistic inclusions such as P (C|D) = α or P (r) = β
for concepts C and D, and for role r. If the interpretation of D is the whole
domain, then we simply write P (C) = α. The semantics of these inclusions is
roughly (a formal definition can be found in [5]) given by:
∀x ∈ D : P (C(x)|D(x)) = α,
∀x ∈ D, y ∈ D : P (r(x, y)) = β.
We assume that every terminology is acyclic; no concept uses itself. This as-
sumption allows one to represent any terminology T through a directed acyclic
graph. Such a graph, denoted by G(T ), has each concept name and role name
as a node, and if a concept C directly uses concept D, that is if C and D appear
respectively in the left and right hand sides of an inclusion/definition, then D
is a parent of C in G(T ). Each existential restriction ∃r.C and value restriction
∀r.C is added to the graph G(T ) as nodes, with an edge from r and C to each
restriction directly using it. Each restriction node is a deterministic node in that
its value is completely determined by its parents.
Example 1. Consider a terminology T1 with concepts A, B, C, D. Suppose
P (A) = 0.9, B v A, C v B t ∃r.D, P (B|A) = 0.45, P (C|B t ∃r.D) = 0.5, and
P (D|∀r.A) = 0.6. The last three assessments specify beliefs about partial overlap
among concepts. Suppose also P (D|¬∀r.A) = ² ≈ 0 (conveying the existence of
exceptions to the inclusion of D in ∀r.A). Figure 1 depicts G(T ).
Fig. 1. G(T ) for terminology T in Example 1 and its grounding for domain D = {a, b}.
The semantics of crALC is based on probability measures over the space of
interpretations, for a fixed domain. Inferences, such as P (Ao (a0 )|A) for an ABox
A, can be computed by propositionalization and probabilistic inference (for exact
calculations) or by a first order loopy propagation algorithm (for approximate
calculations) [5].
3 Link Prediction by using crALC
In this section we describe how to apply the PDL crALC for semantic link
prediction. We borrowed some syntax from the graph-based approach where
each node (a person in a social network) is represented by A, B, C, and we are
interested in defining whether a link between A and B is suitable given there is
no link between these nodes. Interests between the nodes are modeled through a
probabilistic ontology represented by the PDL crALC. The prediction link task
can be described as:
Given:
• a network defining relationship between objects;
• an ontology represented by crALC describing the domain of the objects;
• the ontology role that defines the semantic of the relationship between
objects;
• the ontology concept that describes the network objects.
Find:
• a revised network defining relationship between objects.
The proposed algorithm for link prediction receives a network of a specific
domain. For instance, in a collaboration network the nodes represent researchers
and the relationship can have the semantic ”has a publication with” or ”is ad-
vised by”. Therefore, the ontology represented by crALC describes the domain
of publications between researchers, having concepts like Researcher, Publication,
StrongRelatedResearcher and NearCollaborator and roles like hasPublication,
hasSameInstitution and sharePublication. This ontology can be learned automat-
ically through a learning algorithm as the ones proposed in [15, 17]. Thus, the
nodes represent instances of one of the concepts described in the PDL crALC
and the semantic of the links is described by one of the roles in the PDL crALC.
These concept and role must be informed as inputs to the proposed algorithm.
The link prediction algorithm is described in Algorithm 1.
The algorithm starts looking for all pairs of instances of the concept C defined
as the concept that provides the semantic for the network nodes. For each pair
it checks whether the corresponding nodes exist in the network (this can be
improved by exploring graph-based properties). If not the probability of the
link is calculated through the probability of the defined role conditioned on
evidence. The evidence is provided by the instances of the ontology. As many
instances the ontology have the better is the inference performed. The inference
is performed through the Relational Bayesian network build from ontology O. If
the probability inferred is greater than a threshold then the corresponding link
Require: a network N , an ontology O, the role r( , ) representing the semantic of
the network link, the concept C describing the objects of the network and a
threshold.
Ensure: a revised network Nf
1: define Nf as N
2: for all pair of instances (a, b) of concept C do
3: if does not exist a link between nodes a and b in the network N then
4: infer probability P (r(a, b)|evidences) using the RBN created through the
ontology O
5: if P (r(a, b)|evidences) > threshold then
6: add a link between a and b in network Nf
7: end if
8: end if
9: end for
Algorithm 1: Algorithm for link prediction through crALC.
is added to the network. Alternatively, when the threshold to be considered is
not known a priori, a rank of the inferred links based on their probability is done
and the top-k, where k would be a parameter, are chosen.
4 Preliminary Results
Experiments were run over a collaborative network of researchers. Data was ga-
thered from the Lattes curriculum platform 3 , the public repository for Brazilian
curriculum researchers. In this platform, every researcher has a unique Lattes
code that allows one to link to other researchers according to: shared publica-
tions, advising tasks, and examination board participations. Given this collabo-
rative network we are interested in predicting further links among researchers in
order to either promote further collaborations (suitable co-workers to research
tasks would be suggested) or gather information about research groups. Due
to form-filling errors there are many missing links among researchers; thus, we
are unable to completely state co-working relationships using only the Lattes
platform.
To tackle link prediction we firstly have collected information about 1200
researchers and learned a probabilistic ontology [15, 17], represented by the PDL
crALC, for modeling their research interests. A simplified probabilistic ontology
3
http://lattes.cnpq.br/
is given by:
P (Publication) = 0.3
P (Board) = 0.33
P (sharePublication) = 0.22
P (wasAdvised) = 0.05
P (hasSameInstitution) = 0.14
P (sameExaminationBoard) = 0.31
ResearcherLattes ≡ Person
u(∃hasPublication.Publication
u∃advises.Person u ∃participate.Board)
P (PublicationCollaborator | Researcher u ∃sharePublication.Researcher) = 0.91
P (SupervisionCollaborator | Researcher u ∃wasAdvised.Researcher) = 0.94
P (SameInstitution | Researcher u ∃hasSameInstitution.Researcher) = 0.92
P (SameBoard | Researcheru
∃sameExaminationBoard.Researcher) = 0.92
P (NearCollaborator | Researcher u ∃sharePublication.∃hasSameInstitution.
∃sharePublication.Researcher) = 0.95
FacultyNearCollaborator ≡ NearCollaborator
u ∃sameExaminationBoard.Researcher
P (NullMobilityResearcher | Researcher u ∃wasAdvised.
∃hasSameInstitution.Researcher) = 0.98
StrongRelatedResearcher ≡ Researcher
u (∃sharePublication.Researcher u
∃wasAdvised.Researcher)
InheritedResearcher ≡ Researcher
u (∃sameExaminationBoard.Researcher u
∃wasAdvised.Researcher)
In this probabilistic ontology concepts and probabilistic inclusions denote mu-
tual research interests. For instance, a PublicationCollaborator inclusion refers to
Researchers who shares a Publication, thus relates two nodes (Researcher) in a col-
laboration graph. Therefore, the concept Researcher and the role sharePublication
are inputs to the algorithm we proposed in Algorithm 1.
To perform inferences and therefore to obtain link predictions, a proposition-
alization step (a resulting relational Bayesian network) is required.
In addition, a collaboration graph, based on shared publications, was also
defined. Statistical information was computed accordingly. Figure 2 depicts col-
laborations among 303 researchers. Several relationships and clusterings can also
be observed.
If we carefully inspect this collaboration graph (Figure 3 shows a subgraph
obtained from Figure 2) we could be interested, for instance, in predicting links
among researchers from different groups.
Thus, in Figure 3 one could further investigate whether a link between re-
searcher R (red octagon node) and the researcher B (blue polygon node) is
suitable. In order to infer this, the probability of a possible link between R and
Nilton César Furtado Canto
Iberê Luiz Caldas
Vanessa Oliva Araujo
Luiz Henrique Alves Monteiro
André Alves Ferreira
Renato Teodoro Ramos Henrique Schützer Del Nero
José Guilherme de Souza Chaui Mattos Berlinck
Eduardo Massad
1
1
Fabiane Bizinella Nardon 1 Atila Madureira Bueno
1
1
1 1
Paulo Bandiera Paiva
Marilene de Pinho Santos Mazza
1
Vivian Rodrigues Fiales
1
1
Denis Roberto Zamignani
Paulo Sérgio Licciardi Messeder Barreto
José Roberto Castilho Piqueira
1
1
1
1
1
Eduardo Moacyr Krieger
1
Griselda Esther Jara de Garrido
1
1
1
Sergio Shiguemi Furuie
1
Cecil Chow Robilotta
1
Andre Fabio Kohn
Jose Jaime da Cruz
1
Silvio Ikuyo Nabeta
Luiz Lebensztajn
Renato Carlson
Pedro Pereira de Paula
Emilio Del Moral Hernandez
Mauricio Barbosa de Camargo Salles
Maurício Caldora Costa
1
Ivan Eduardo Chabu
Marcos Antonio Ruggieri Franco Roberto Moura Sales
Sérgio Luís Lopes Verardi
1
Mario Leite Pereira Filho
1
1
1
1
1
1
Yasmara Conceicao De Polli Migliano
Wanderlei Marinho da Silva 1 1 Bernardo Pinheiro de Alvarenga
1 1
Antonio Carlos de Jesus Paes
1 Paulo Roberto do Lago Helene
João Cyro André
1
1 1 Luiz Eduardo Teixeira Ferreira
1
Ricardo Amorim Einsfeld
Fabio Henrique Pereira
1
Djonny Weinzierl
1
1
1
José Roberto Cardoso 1
1
Pedro Afonso de Oliveira Almeida
Luiz Fernando Campos Ramos Martha
1
1
Viviane Cristine Silva
José Augusto Pereira da Silva 1
1
Jefferson Bettini Tulio Nogueira Bittencourt
1
1
José Alberto Cerri Gilson Fujii
Marcos Cramer Esteves Fernando Rebouças Stucchi
1 1
Ricardo Hauch Ribeiro de Castro 1
1
Samuel Marcio Toffoli
1
Jaime Tupiassú Pinho de Castro José Umberto Arnaud Borges
Eduardo Parente Prado
Reginaldo Muccillo Edson Roberto Leite Marco Antonio Meggiolaro
Jose Deodoro Trani Capocchi
Joaquim Bento Cavalcante Neto
1
Ing Hwie Tan
Carlos Renato Rambo
1
1
1
1
Neylor Antunes Stevo
1
Julio Cezar Adamowski
1 1
1
1
1
Emílio Carlos Nelli Silva
Reinaldo Morabito Neto
Carlos Shiniti Muranaka Murilo Daniel de Mello Innocentini
Douglas Gouvêa 1
Victor Carlos Pandolfelli
Guilherme Canuto da Silva
Gracinda Marina Castelo da Silva
1
1
Kazuo Nishimoto
1
1 Jose de Anchieta Rodrigues
Helmut Born
1
1
1
1
Maria Cristina Motta de Toledo
1
1
Jun Okamoto Junior
Giuliana Ratti
Ronaldo Câmara Cozza 1
Frank Patrick Missell
1 Raimundo Carlos Silvério Freire Jose Renato Coury
Wildor Theodoro Hennies
11
Walman Benicio de Castro 1 Antonio Carlos Vieira Coelho Fernando Resende
Marjorie Benegra
1 Leonardo Tolaine Massetto
Marcos de Sales Guerra Tsuzuki
1
Paulo Carlos Kaminski Viviane Carillo Ferrari
Rafael Giuliano Pileggi
1
Alexandre Magno Barros de Freitas Rodrigo Magnabosco
1 Julio Carlos Teixeira
Alex Kenya Abiko
1
1
Silvio Burrattino Melhado
Lilia Mascarenhas Sant Agostino 1
Izabel Fernanda Machado 1
1 1
Thiago Melanda Mendes
1
Lucas Antonio Moscato 1 André Rocha Studart
Geraldo Caixeta Guimarães Tathyana Moratti
Flávio Buiochi
1 1 1
Julio Cesar Klein das Neves Mário Sousa Takeashi
Suzilene Real Janasi Vahan Agopyan
1 1
1 Heliana Comin Vargas
Newton Kiyoshi Fukumasu
1 1
1 1
1
Vanessa Gomes da Silva
1 1
Raul Gonzalez Lima 1
Paulo Eigi Miyagi Marcio Gustavo Di Vernieri Cuppari 1
1
1 Roberto Cesar de Oliveira Romano
Fábio Cardoso Chagas 1 1
Henrique Kahn Anna Helena Reali COSTA
1 1
1
1 Márcia Terra da Silva
1 1 1 1
Carina Ulsen
1
1 Silvia Maria de Souza Selmo
1
Fernando Jose Gomes Landgraf 1 1
1 1
1 Dairo Hernán Mesa Grajales
Heloísa Cristina Fernandes
Giuseppe Pintaúde 1 1
Renato Seiji Tavares 1
Jose Daniel Biasoli de Mello 1 Sérgio Cirelli Angulo
11 Miguel Aloysio Sattler 1
1
1 1
1 1 1
1 1
Mauro Zilbovicius
1
Flavio Beneduce Neto 1
Carlos Torres Formoso
Andrea Murillo Betioli
Roberto Martins de Souza 1 1
1
1 1 1
Deniol Katsuki Tanaka Orestes Marraccini Goncalves
Daniel Rodrigues
1
1 Ivan Gilberto Sandoval Falleiros
1 1
1
1 1
1
Humberto Naoyuki Yoshimura Vanderley Moacyr John
1 1
1 1
Helio Goldenstein 1
1
1
Ricardo Fuoco 1
1 1 1
1 1 Mercia Maria Semensato Bottura de Barros
1 1 Anne Komatsu Agopyan
1
1 Adriana Gouveia Rodrigo
1 Erika Paiva Tenório de Holanda
1 1 1
1 1 1
1
1
1 Francisco Ferreira Cardoso
1
1
Cristiane Ueda
1 Ubiraci Espinelli Lemes de Souza
1
1 Andre Paulo Tschiptschin 1
Racine Tadeu Araujo Prado Neide Matiko Nakata Sato
Amilton Sinatora 1 1
1 1
1
Artemária Coêlho de Andrade
Miguel Angelo de Carvalho
1 1 Adriano Gameiro Vivancos
1
1
1
1
1 Ailton Camargo
Sara Aida Rodriguez Pulecio
1 Brunoro Leite Giordano
1
1
Maristela Gomes da Silva 1
1
1 1
1
Cleber Marcos Ribeiro Dias
1
Henrique Lindenberg Neto
Renato Rufino Xavier Marcela Paula Maria Zanin Meneguetti
Antonio Domingues de Figueiredo
Wilson Luiz Guesser 1
Wang Shu Hui 1
1 Abel André Cândido Recco 1
1 11
Arnaldo Manoel Pereira Carneiro
1
Bruno Luís Damineli 1
1
1
1 1
Mônica de Fátima Lobão Granero
Neusa Alonso-Falleiros 1
Paulo Roberto Mei
1 Fernando Henrique Sabbatini
Cristian Camilo Viáfara Arango
Eliana Kimie Taniguti
1
Linilson Rodrigues Padovese
Marcelo Nelson Páez Carreño
Marco Giulietti
Maria Elena Santos Taqueda
1
Stephan Wolynec 1 1
Idalina Vieira Aoki
1
Marco Antônio Penido de Rezende
Maria Aparecida Godoy Soler Pajanian
Cláudia Nascimento de Jesus
Inés Pereyra
1
1
Célia Aparecida Lino dos Santos
1 1
Celso Pupo Pesce
1
1
Cláudio Vicente Mitidieri Filho
1
Henrique Eisi Toma
1
Gerson dos Santos 1 1
1 Joao Batista de Almeida e Silva
Leni Campos Akcelrud 1
Patricia Alexandra Antunes
1
Fred Borges da Silva
1
1 1
1 Geraldo Narciso da Rocha Filho
11 Lucia Helena Terra Guerra
1
1 1
1 Viviane Miranda Araujo
1 Leoberto Costa Tavares
Adnei Melges de Andrade
1
1 Tiago Martinello
1 11
Susana Ines Cordoba de Torresi Assis Vicente Benedetti
Joao Paulo Sinnecker Vitor Angelo Fonseca Deichmann
Patricia Hatsue Suegama Attilio Converti
1
1
1 1
1
1
1 André Rolim Baby
Elfriede Marianne Bacchi
1
Carmen Cecilia Tadini
1
1
1 1 1
1 1
Carmen Cecilia Tadini
1
Antonio Riul Júnior
John Paul Hempel Lima Luis Fernando Peffi Ferreira
1
1 1 1
1
1
1
1
1
Fernando Josepetti Fonseca 1
Salvador Pinillos Gimenez
Rudolf Theoderich Bühler
1 1
Hercílio Gomes de Melo 1
1
1
1 1 Milene Galeti
1
1
Rodrigo Andrade Martinez 1 1
Paulo Sergio Santos
1
Victor Sonnenberg
1
Elizabete Wenzel de Menezes
1
1 1 1
1 1 1 1
1 1
1
Luciano Mendes Camillo
Roberto Guardani 1 1
1 1
1 Antônio Otávio de Toledo Patrocínio 1 1 1
1 1
Claudio Augusto Oller do Nascimento
1 1
1 Flavio Cesar Cunha Galeazzo Marcelo Antonio Pavanello
1 Osvaldo Chiavone Filho Javier Telis Romero
Pedro Vitoriano de Oliveira
1
1
1 1
Miguel Alexandre Novak
1 Sebastiao Gomes dos Santos Filho
1 João Antonio Martino
1 1 1
Pedro de Alcântara Pessôa Filho
1
1
Rodrigo Fernando Bianchi 1 Pricila Veiga dos Santos
1
Carlos Eduardo Calmanovici
Aparecido Sirley Nicolett
1
1
1
1
Ely Antonio Tadeu Dirani 1
Cinthia Tiemi Muranaka Frank Herbert Quina Isolda Costa
1 1 1
1 1 Patrícia Ponce Antonio Luis Pacheco Rotondaro
1 1 1 1
Susy Frey Sabato
Marystela Ferreira
Suzana Caetano da Silva Lannes Alan Rodrigo Navia
Mitiko Saiki
1 1
1 1
Marcello Bellodi
Jorge Andrey Wilhelms Gut
1 Jose Mauricio Pinto
Michelly de Souza
Michele Rodrigues
Aurea Yuki Sugai
1 1 Maysa Terada
Isabela Bellot de Souza Will 11
11
Paulo Cesar de Morais
Antonio Esio Bresciani
Marina Magnani
Efraim Cekinski
Rocio del Pilar Bendezú Hernández
Rinaldo Gregorio Filho
1
Risomá Chaves Marcel Dupret Lopes Barbosa
Fernando Morais dos Reis
Amilcar Machulek Junior
Katia Ribeiro Antonio Carlos Silva Costa Teixeira
Everaldo Carlos Venancio
Sylvana Cavedon Presti Migliavacca
1
Airton Jose de Luna
Esleide Lopes Casella
1
Reinaldo Giudici
1
1
1
1
1
1 1
1
Pedro Henrique Hermes de Araujo
Tah Wun Song
Heizir Ferreira de Castro
Fabio Bentes Freire
Alexandre Eliseu Stourdze Visconti
Ariovaldo Bolzan
Teresa Cristina Brazil de Paiva
Celso Ricardo Denser Pamboukian
1
Aldo Tonso
1
1
Mateus Meneghesso da Conceição
1
1
1
1 1
Fabiano Henrique Marques
1
1
1
1
Andreas Karoly Gombert
Mari Cleide Sogayar
Valéria Mendes Rodas
Irene Fernandes
Soraia Attie Calil Jorge
Felipe Santiago Chambergo Alcalde Carlos Augusto Pereira
Patricia Barros dos Santos
Ari José Scattone Ferreira
Fig. 2. Collaboration graph among researchers.
B is calculated, P (link(R, B)|E), where E denotes evidence about researchers
such as publications, institution, examination board participations and so on.
The role sharePublication is the one defining the semantic of the links in the
graph. Therefore, it is through it that we must calculate P (link(R, B)|E). Since
the concept PublicationCollaborator is defined by the role sharePublication and
considering as evidence Researcher(R) u ∃hasSameInstitution.Researcher(B) one
can infer P (link(R, B)|E) through:
Viviane Cristine Silva
José Augusto Pereira da Silva
Ivan Eduardo Chabu
Giuseppe Pintaúde Célia Aparecida Lino dos Santos Bernardo Pinheiro de Alvarenga
Newton Kiyoshi Fukumasu
Cristian Camilo Viáfara Arango
Yasmara Conceicao De Polli Migliano
Fabio Henrique Pereira
Roberto Moura Sales
Daniel Rodrigues 1
Helio Goldenstein 1
1 1 1
Marcio Gustavo Di Vernieri Cuppari
Wilson Luiz Guesser 1
1
1
1 1
Wanderlei Marinho da Silva Jose Jaime da Cruz
1 1
1
Julio Cesar Klein das Neves 1 Mauricio Barbosa de Camargo Salles
1 Renato Rufino Xavier
1 1 1 Sérgio Luís Lopes Verardi
1 1
1
Amilton Sinatora
1
1
Linilson Rodrigues Padovese
Marcelo Nelson Páez Carreño
José Roberto Cardoso
1
1
1
1
1 Pedro Pereira de Paula
Renato Carlson
Paulo Roberto Mei
1
1
1
1
1 Andre Paulo Tschiptschin
1
1
Roberto Martins de Souza
1 Andrea Murillo Betioli
1 Silvio Ikuyo Nabeta
1
1
1
Ricardo Fuoco Vanessa Gomes da Silva 1
Miguel Angelo de Carvalho Carlos Shiniti Muranaka
Inés Pereyra 1
Antonio Carlos de Jesus Paes
1
Antonio Domingues de Figueiredo
Celso Pupo Pesce Artemária Coêlho de Andrade
Thiago Melanda Mendes
Ivan Gilberto Sandoval Falleiros Marcos Antonio Ruggieri Franco
Emilio Del Moral Hernandez
Deniol Katsuki Tanaka
Orestes Marraccini Goncalves Djonny Weinzierl
Mario Leite Pereira Filho
Henrique Kahn
Vahan Agopyan
Luiz Lebensztajn
Cleber Marcos Ribeiro Dias
Maurício Caldora Costa
1
1
1 Brunoro Leite Giordano
1
1
Sérgio Cirelli Angulo
1
Anne Komatsu Agopyan 1
1
Maria Elena Santos Taqueda
1
Racine Tadeu Araujo Prado
1
1
1 1
1
Silvio Burrattino Melhado
Márcia Terra da Silva Mário Sousa Takeashi
Neide
1 Matiko Nakata Sato
Vanderley Moacyr John 1
Fernando Resende
1
Fred Borges da Silva 1
Carina Ulsen
1
1
Silvia Maria de Souza Selmo
1 1
1
1
Roberto Cesar de Oliveira Romano
Viviane Miranda Araujo
Carlos Torres Formoso Ubiraci Espinelli Lemes de Souza
1
1
1 1 1
1
1
1
1
Maristela Gomes da Silva
Marcela Paula Maria Zanin Meneguetti Miguel Aloysio Sattler Rafael Giuliano Pileggi
Mercia Maria Semensato Bottura de Barros
1
1 1
Adriana Gouveia Rodrigo
1
Henrique Lindenberg Neto 1
1
Antonio Carlos Vieira Coelho
1
Bruno Luís Damineli
Alex Kenya Abiko
Francisco Ferreira Cardoso
1
Heliana Comin Vargas
Arnaldo Manoel Pereira Carneiro
1
Heloísa Cristina Fernandes
Joao Paulo Sinnecker
Antonio Riul Júnior
Fernando Henrique Sabbatini 1
John Paul Hempel Lima
1 1
1
Rinaldo Gregorio Filho
Patricia Alexandra Antunes
Leni Campos Akcelrud
1 Paulo Sergio Santos
1
1
1
1 1
Mauro Zilbovicius
1
1 1
1 1 Ely Antonio Tadeu Dirani
1 Marco Antônio Penido de Rezende
Cláudia Nascimento de Jesus 1 Everaldo Carlos Venancio
Tathyana Moratti
1
1
1 Gerson dos Santos
Fernando Josepetti Fonseca
1
Antônio Otávio de Toledo Patrocínio
1
1 1
1
Marystela Ferreira
Adriano Gameiro Vivancos 1 1
Leonardo Tolaine Massetto
Rodrigo Fernando Bianchi 1 1
Eliana Kimie Taniguti Vitor Angelo Fonseca Deichmann
Adnei Melges de Andrade
Paulo Cesar de Morais
Miguel Alexandre Novak
Anna Helena Reali COSTA
Erika Paiva Tenório de Holanda
Cláudio Vicente Mitidieri Filho Maria Aparecida Godoy Soler Pajanian Rodrigo Andrade Martinez
Fig. 3. Collaboration subgraph.
P (PublicationCollaborator(R) |Researcher(R)
u∃hasSameInstitution.Researcher(B)) = 0.57.
If we took a threshold of 0.60, the link between R and B would not be
included.
One could gain more evidence, such as information about nodes that in-
directly connect these two groups (Figure 3), denoted by I1 , I2 . The inference
would be
P (PublicationCollaborator(R) |Researcher(R)
u∃sharePublication(I1 ).∃sharePublication(B)
u∃sharePublication(I2 ).∃sharePublication(B)) = 0.65.
Because more information was provided the probability inferred was different.
The same threshold now would preserve the link.
Other inferences are possible by considering the suggestion of links between
surrounding nodes, i.e. nodes directly linked to the two nodes R and B , denoted
by R1 , . . . , Rk , and B1 , . . . , Bn respectively. For each i = 1, ..., k and j = 1, ..., n,
calculates P (link(Ri , Rj )|E) and P (link(Bi , Bj )|E).
As a rule, if we are interested in discovering whether A and B could be linked,
probabilistic inference P (link(A, B)) should be performed.
In a more general framework, graph information could be useful to deal with a
large number of link predictions. Note that graph adjacency allow us to address
probabilistic inference for promising nodes. In a naive approach, each pair of
nodes in the collaboration graph would be evaluated so, multiple probabilistic
relational Bayesian inference calls would be required.
On the other hand, if graph-based information is used, such naive scheme
could be improved. In our approach, two nodes are probabilistically evaluated
if there is a path between them (number of incoming/outgoing edges, number
of mutual friends, node distances are also considered). Thus, numerical graph-
based information guides the inference process in the relational Bayesian network
(linked to the probabilistic ontology). In addition, other candidates sharing any
kind of evidence are also evaluated, i.e., interests based features (linked to onto-
logical knowledge) allow us to further explore link prediction.
Alternatively, by completing an overall link predicting task we can devise
further functionalities to the resulting collaboration network. The resulting graph
can be considered as being a probabilistic network, i.e., probabilities inferred for
each link could be denote strenght of the relationship.
5 Conclusion
We have presented an approach for predicting links that resorts both to graph-
based and ontological information. Given a collaborative network, e.g., a social
network, we encode interests and graph features through a crALC probabilistic
ontology. In order to predict links we resort to probabilistic inference. Prelimi-
nary results focused on an academic domain, and we aimed at predicting links
among researchers. These preliminary results showed the potential of the idea.
Previous combined approaches for link prediction [3, 1] have focused on ma-
chine learning algorithms [14]. In such schemes, numerical graph-based features
and ontology-based features are computed; then both features are input into a
machine learning setting where prediction is performed. Differently from such
approaches, in our work we adopt a generic ontology (instead of a hierarchi-
cal ontology, expressing only is-a relationships among interests). Therefore, our
approach uses more information about the domain to help the prediction.
Acknowledgements
The third author is partially supported by CNPq. The work reported here has
received substantial support by FAPESP grant 2008/03995-5.
References
1. W. Aljandal, V. Bahirwani, D. Caragea, and H.W. Hsu. Ontology-aware classifica-
tion and association rule mining for interest and link prediction in social networks.
In AAAI 2009 Spring Symposium on Social Semantic Web: Where Web 2.0 Meets
Web 3.0, Standford,CA, 2009.
2. F. Baader and W. Nutt. Basic description logics. In Description Logic Handbook,
pages 47–100. Cambridge University Press, 2002.
3. D. Caragea, V. Bahirwani, W. Aljandal, and W. H. Hsu. Ontology-based link
prediction in the livejournal social network. In Symposium on Abstraction, Refor-
mulation and Approximation, 2009.
4. F. G. Cozman and R. B. Polastro. Loopy propagation in a probabilistic descrip-
tion logic. In Sergio Greco and Thomas Lukasiewicz, editors, Second International
Conference on Scalable Uncertainty Management, Lecture Notes in Artificial In-
telligence (LNAI 5291), pages 120–133. Springer, 2008.
5. F. G. Cozman and R. B. Polastro. Complexity analysis and variational inference for
interpretation-based probabilistic description logics. In Conference on Uncertainty
in Artificial Intelligence, 2009.
6. N. Friedman, L. Getoor, D. Koller, and A. Pfeffer. Learning probabilistic relational
models. In Proc. 16th Int. Joint Conference on Artificial Intelligence, pages 1300–
1309, 1999.
7. D. Heckerman, C. Meek, and D. Koller. Probabilistic entity-relationship models,
prms, and plate models. In In Proceedings of the 21st International Conference on
Machine Learning, 2004.
8. J. Heinsohn. Probabilistic description logics. In International Conf. on Uncertainty
in Artificial Intelligence, pages 311–318, 1994.
9. W.H. Hsu, A.L. King, M.S.R. Paradesi, T. Pydimarri, and T. Wneinger. Collab-
orative and structural recommendation of friends using weblog-based social net-
work analysis. In Proceedings of Computational Approaches to Analysing WebLogs
(AAAI), 2006.
10. M. Jaeger. Probabilistic reasoning in terminological logics. In Principals of Know-
ledge Representation (KR), pages 461–472, 1994.
11. M. Jaeger. Relational Bayesian networks: a survey. Linkoping Electronic Articles
in Computer and Information Science, 6, 2002.
12. D. Liben-Nowell and J. Kleinberg. The link prediction problem for social net-
works. In Proceedings of the twelfth international conference on Information and
Knowledge Management, pages 556–559, New York, NY, USA, 2003. ACM.
13. T. Lukasiewicz and U. Straccia. Managing uncertainty and vagueness in description
logics for the semantic web. Web Semant., 6:291–308, November 2008.
14. T. Mitchell. Machine Learning. McGraw-Hill, New York, 1997.
15. J. Ochoa-Luna, K. Revoredo, and F.G. Cozman. Learning sentences and assess-
ments in probabilistic description logics. In Bobillo, F., et al. (eds.) Proceedings of
the 6th International Workshop on Uncertainty Reasoning for the Semantic Web,
volume 654, pages 85–96, Shangai, China, 2010. CEUR-WS.org.
16. R. B. Polastro and F. G. Cozman. Inference in probabilistic ontologies with at-
tributive concept descriptions and nominals. In 4th International Workshop on
Uncertainty Reasoning for the Semantic Web (URSW) at the 7th International
Semantic Web Conference (ISWC), Karlsruhe, Germany, 2008.
17. K. Revoredo, J. Ochoa-Luna, and F. Cozman. Learning terminologies in proba-
bilistic description logics. In Antônio da Rocha Costa, Rosa Vicari, and Flavio
Tonidandel, editors, Advances in Artificial Intelligence SBIA 2010, volume 6404
of Lecture Notes in Computer Science, pages 41–50. Springer / Heidelberg, Berlin,
2010.
18. M. Sachan and R. Ichise. Using semantic information to improve link prediction
results in network datasets. International Journal of Computer Theory and Enge-
neering, 3:71–76, 2011.
19. F. Sebastiani. A probabilistic terminological logic for modelling information re-
trieval. In ACM Conf. on Research and Development in Information Retrieval
(SIGIR), pages 122–130, 1994.
20. B. Taskar, M. FaiWong, P. Abbeel, and D. Koller. Link prediction in relational
data. In Proceedings of the 17th Neural Information Processing Systems (NIPS),
2003.
21. T. Wohlfarth and R. Ichise. Semantic and event-based approach for link predic-
tion. In Proceedings of the 7th International Conference on Practical Aspects of
Knowledge Management, 2008.
22. K. Yu, W. Chu, S. Yu, V. Tresp, and Z. Xu. Stochastic relational models for
discriminative link prediction. In Proceedings of the Neural Information Processing
Systems (NIPS), 2006.