<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Increasing community participation for learning and validating ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cheng, D. C.</string-name>
          <email>danny.cheng@dlsu.edu.ph</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Espiritu, L., Ph.D.</string-name>
          <email>lloyd.espiritu@dlsu.edu.ph</email>
        </contrib>
        <aff>DLSU - Manila, Taft Avenue, Manila</aff>
      </contrib-group>
      <pub-date>
        <year>2008</year>
      </pub-date>
      <volume>5318</volume>
      <abstract>
        <p>In recent years, it has been recognized that having experts develop ontologies for the various applications on the web is time-consuming and costly. As such, various studies have aimed to take advantage of the social community to assist in building the ontology. Some researchers have focused on integrating various Web 2.0 technologies and paradigms to build an ontology collaboratively, while others have focused on the automated discovery of taxonomic and non-taxonomic relationships from unstructured text such as folksonomies. However, in this research, the community is still mostly aware that it is building an ontology, and this can limit the community's participation in the process. In this paper, we suggest the use of an explicit social network to assist in the validation and learning of the ontology being constructed, in conjunction with existing automated ontology discovery and construction processes. The goal is to tap into the social network to allow the stakeholders to participate in the construction of the ontology based on how the community perceives the relationships among the concepts. At the same time, the process is embedded into common tasks that the community takes part in, so as to hide the complexity and increase the community's participation. Because the network is explicit, we discuss the localization of the resulting ontologies to the community, the incentive mechanism to encourage the community to assist in the validation, the use of natural language processing to generate the questions fielded to the community to validate the ontology, and the possible identification of pseudo-experts within the community in terms of the value they contribute to the validation and alignment of the ontologies.</p>
      </abstract>
      <kwd-group>
        <kwd>Social Computing</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Ontology</kwd>
        <kwd>Folksonomies</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Ontology engineering is the process by which ontologies are produced that are useful, consensual, rich, current, complete, and interoperable [1]. However, ontology engineering and management can be time-consuming, requiring experts from both ontology engineering and the domain of interest. This may be feasible if the application domain is limited and high in value, but not for applications as varied as the semantic web. In light of this, semi-automatic approaches that automatically extract and annotate from large volumes of text become an important component of the semantic web [2].</p>
      <p>Given that full automation is not yet possible, and given the disadvantages of expert-driven ontology engineering, communities of stakeholders have to be involved in the engineering process. However, there is a lack of participation from end users because the ontology engineering process is considered a top-down, expert-based approach. Under the current assumptions of the tools being developed, the engineering and alignment of ontologies is done in a small number of long sessions by experts, and the output is a generic ontology that can be reused. The ontology is not personalized or localized, and the actual terminology and organization being used do not match the language, perception, and understanding of the end user [3].</p>
      <p>As the problem becomes more evident, along with the need to involve users more in order to develop non-toy ontologies, several research efforts have started to look into how to take advantage of the existing infrastructure and workflow provided by Web 2.0. One such effort looks into the possibility of using the wiki model to allow end users to collaboratively create a lightweight ontology. The motivation behind this is to keep pace with reality, build an ontology that represents the view of the community, and distribute the cost of building the ontology [4] [5]. These two works have done much to bring ontology engineering to the community, taking different perspectives and approaches: one uses a question-and-answer mechanism, while the other uses an explicit construction scheme based on the wiki model. There has also been work stating the need for a comprehensive approach to deriving ontologies from folksonomies by integrating multiple resources and techniques [6].</p>
      <p>There have also been improvements in the automated construction of ontologies from folksonomies. Research in this area has focused on the automated enrichment of folksonomies with expert-developed ontologies to discover taxonomic relationships and build a collabulary [7]. Though much effort has been put into ontology learning, the knowledge acquisition process is typically focused on the taxonomic aspect. “The discovery of non-taxonomic relationships is often neglected, even though it is a fundamental point in structuring domain knowledge” [8].</p>
      <p>Although much work has already been done in the field, there are still many opportunities for improvement. In this paper, we discuss an approach that builds on this existing research to improve the evaluation of discovered relationships by tapping an explicit social network community to provide feedback and help correct the system, while hiding the engineering process to encourage more participation from the community. We define a means of selecting a subcommunity within the network from which to solicit feedback, and devise a non-intrusive presentation scheme, in the form of a QA system integrated into the users' common tasks, through which they can give feedback or perform disambiguation tasks. Currently, such feedback comes from domain experts or knowledge engineers [9]. Finally, we define a mechanism for inferring and performing peer evaluation on the contributed feedback and disambiguation to determine pseudo-experts within the community, to speed up the development of the ontology and improve its actual quality.</p>
    </sec>
    <sec id="sec-2">
      <title>2. OUR APPROACH</title>
      <p>Social networks are a recent technology that has seen enormous growth and holds great potential. The potential lies in the large number of users within the social network who can be tapped to perform work that would normally be difficult, if not impossible, to do automatically given the current state of technology. Recent works have tried to model after this framework and apply it to ontology engineering. We take an alternative approach in that we try to embed the processes needed to validate, refine, and discover ontologies within the common tasks that the community of non-experts is familiar with. At the same time, the mapping process is translated into a form that allows ordinary members of the community to contribute to the system. The approach is composed of tag enrichment, question-and-answer validation and discovery, an incentive scheme, participant selection, and a peer-evaluation scheme (see figure 1). The community base and social relationships will be derived from Facebook, while these users will be linked to their accounts in Delicious, which will contain the tags and the objects. The information in Delicious will be mined, gardened, and used to discover ontologies. Upon processing, the resulting candidate ontology will be fielded into the social network and its applications to allow for user and community validation. Every action taken by the community is aimed at building consensus, and these actions are localized within a community only. Additional tag gardening can occur as part of the validation or question-and-answer process.
</p>
      <p>As the input to the system, tags have been made popular by current Web 2.0 sites like Flickr, del.icio.us, Facebook, and YouTube. Currently, the tags are free-form, which can create consistency and recall issues. In this research, we look into the possibility of managing and organizing the tags (i.e. tag gardening) [10] so as to improve consistency and recall, aside from possible enrichment of the tags. Tag reuse through a personomy [10] can be used to address this issue, and relating it to the community can help suggest other tags to use. This can be applied in all stages of the process where free-form user input or tags are solicited. The tags to use can be suggested by identifying the object referred to by the tag, or through tag co-occurrences among the tags used within the social network or community. The actual reuse of the tags by the community can then serve as a feedback mechanism to determine the applicability of the term used in the tag. Aside from this, tags can be enriched by trying to identify their context of use via user modeling and by implicitly gathering additional context information when possible, e.g. through mobile devices that provide location information in the case of photos, which can in turn also be used in the gardening process.</p>
    </sec>
    <sec id="sec-3">
      <title>2.2 Feedback application in social network</title>
      <p>One foreseeable issue with soliciting participation from the non-expert community of stakeholders is the design of the interface through which feedback will be solicited. The traditional presentation of ontologies in the form of a tree or a graph would typically intimidate the common user, making feedback unlikely. The wiki [5] has been used to gather a consensus from the open community, but this still means that users know they are building an ontology, in this case a lightweight ontology. We propose the development of applications within social networks that behave similarly to currently existing applications (see figure 2), so as not to intimidate end users and to encourage their participation. These could be presented in the form of a game as well [10].</p>
      <p>
For our work, we initially propose reputation and rewards as the incentive schemes to be provided to the community. Rewards could take the form of virtual items, as the game could combine role-play and arcade characteristics. These items could be used for trade or decorative purposes. The actual items given as rewards would depend on the application or game developed. Group or team scores can also be used to further motivate the community to participate and contribute. To provide a reputation-based incentive, the actors within the community who repeatedly provide good responses will be tracked and displayed by the system on a per-subcommunity and per-object basis. An overall point system for the entire community can also be employed as a further incentive. Such schemes are currently used in systems like BOINC, which is used by SETI@home. Aside from this, questions could also be posted as part of the user’s status message to mimic asking for help and word-of-mouth activities. Another possible incentive is to build into the application or game the ability for users to explicitly solicit action from their friends.</p>
    </sec>
    <sec id="sec-4">
      <title>2.4 Feedback participant selection</title>
      <p>One of the goals of this research is to increase the involvement of the community in providing feedback on the validity of the learned concepts and relationships. As mentioned, these concepts and relationships depend on the target audience, and they evolve over time. One concern in this process is therefore that asking for feedback about a concept that is foreign to the current user would not yield useful results. To address this, the research uses the relationship among the actor, the tag, and the object instance. Actors are classified as the owner, the tagger, or the viewer. An object is owned by the owner and is tagged by the owner or a tagger. Given this relationship, the questions generated for feedback purposes can be fielded using affinity measures within the social network, with the owner and the tagger as the basis or center of the affinity. The rationale is our assertion that an object tagged or uploaded by the tagger or owner would be recognizable to the people in close affinity with the tagger or owner. Also, the words used to tag and identify the object would be understandable within the same community, whereas they might not make sense when given
to any arbitrary person in the entire social network.</p>
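      <p>The affinity-based selection described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the system's implementation: the text does not fix an affinity measure, so hop distance in the friendship graph is assumed here, and the names select_participants, friends, seeds, and max_hops are hypothetical.</p>
      <preformat>
```python
from collections import deque


def select_participants(friends, seeds, max_hops=2):
    """Return users within max_hops of any seed (owner or tagger), nearest first.

    friends: dict mapping a user to the set of that user's friends (assumed layout).
    seeds: the owner and tagger(s) of the object the question is about.
    Hop distance stands in for the affinity measure, which the text leaves open.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        user = queue.popleft()
        if distance[user] == max_hops:
            continue  # do not expand beyond the affinity radius
        for friend in friends.get(user, ()):
            if friend not in distance:
                distance[friend] = distance[user] + 1
                queue.append(friend)
    # Seeds are excluded: they supplied the object or tag, so they are not asked here.
    candidates = [u for u in distance if u not in seeds]
    return sorted(candidates, key=lambda u: (distance[u], u))


# Hypothetical friendship graph for illustration only.
friends = {
    "owner": {"ana", "ben"},
    "ana": {"owner", "carl"},
    "ben": {"owner"},
    "carl": {"ana", "dina"},
    "dina": {"carl"},
}
print(select_participants(friends, ["owner"], max_hops=2))
```
      </preformat>
      <p>With this sample graph, the users within two hops of the owner are returned nearest first, while dina, three hops away, is not asked. A weighted measure, such as interaction frequency, could replace hop distance without changing the overall shape of the selection.</p>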
    </sec>
    <sec id="sec-5">
      <title>2.5 Generation of questions for the feedback mechanism</title>
      <p>Current mechanisms for receiving feedback from the end user assume an expert user well versed in ontology engineering. We look into the use of natural language generation techniques to present the feedback mechanism in the form of a question that is part of the application or game. The question can contain both textual and non-textual information if needed. As the tags or concepts are associated with the objects, the user can be provided with enriched information showing examples or instances of the concept, which will help the user answer the question and provide feedback. The questions that will be asked fall under the following categories or usages:</p>
      <list list-type="order">
        <list-item>
          <p>Validation of tags learned in the folksonomy, to determine whether a tag is really a valid tag or word that can be reused later, or just a personalized term invented by the owner who tagged the object. To perform this, questions will be phrased in the following manner:</p>
          <list list-type="bullet">
            <list-item><p>Are you familiar with this [tag/concept]?</p></list-item>
            <list-item><p>Is [tag/concept] a common word?</p></list-item>
          </list>
          <p>By statistically analyzing the answers, it should be possible to minimize the number of words that are not relevant, even if these words are not yet stored in lexical resources such as WordNet. Social affinity could also be used to localize the tags or words used, as these words may only have meaning within a local grouping, subgrouping, or community.</p>
        </list-item>
        <list-item>
          <p>For taxonomic relationships, the questions will be phrased in the following manner:</p>
          <list list-type="bullet">
            <list-item><p>Is [tag/concept] a kind of [tag/concept]? This will be used to determine and validate subsumption.</p></list-item>
            <list-item><p>Is [tag/concept] the same as [tag/concept]? Or are these two [tag/concept] the same? This will be used to determine equivalence.</p></list-item>
            <list-item><p>Does this [tag/concept] belong to [actor]? This will be used to determine instances. This is possible because the tag's connection to the actual instance of the object is maintained with regard to the owners or those who tagged the object.</p></list-item>
          </list>
        </list-item>
        <list-item>
          <p>For non-taxonomic relationships, the question will be phrased as follows:</p>
          <list list-type="bullet">
            <list-item><p>Does a [tag/concept] [verb/relationship] [tag/concept]? This is used to validate the learned non-taxonomic relationships.</p></list-item>
          </list>
        </list-item>
        <list-item>
          <p>For discovering new relationships, the question will be phrased as follows:</p>
          <list list-type="bullet">
            <list-item><p>What can [tag/concept] do with [tag/concept]? This will be used to discover, from the community, new relationships between concepts that were previously not available in the existing resources.</p></list-item>
          </list>
        </list-item>
      </list>
      <p>This is the initial set of questions identified that can be fielded to the community with a reasonable expectation that they will be answered. However, we also foresee fielding similar questions in which, instead of tags or words, the objects being compared are photos, videos, web sites, or even documents. The only issue with complex content such as videos, web sites, and documents is that the community may not be able to answer the questions with just a single simple glance at the content. If this is the case, then it might discourage the
community from participating due to its nature of complexity.</p>
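      <p>The question patterns listed in this section lend themselves to simple slot filling. The sketch below is a deliberately minimal stand-in for the natural language generation step, which the text does not specify further; the TEMPLATES table, the make_question helper, the category keys, and the slot names a, b, verb, and actor are all hypothetical names introduced for illustration.</p>
      <preformat>
```python
# Question patterns taken from the list above; keys and slot names are assumptions.
TEMPLATES = {
    "tag_familiarity": "Are you familiar with this {a}?",
    "tag_common": "Is {a} a common word?",
    "subsumption": "Is {a} a kind of {b}?",
    "equivalence": "Is {a} the same as {b}?",
    "instance": "Does this {a} belong to {actor}?",
    "non_taxonomic": "Does a {a} {verb} {b}?",
    "discovery": "What can {a} do with {b}?",
}


def make_question(kind, **slots):
    """Fill the template for `kind`; raises KeyError on an unknown kind or a missing slot."""
    return TEMPLATES[kind].format(**slots)


# Illustrative tags only; real input would come from the mined folksonomy.
print(make_question("subsumption", a="poodle", b="dog"))
print(make_question("non_taxonomic", a="chef", verb="cook", b="pasta"))
```
      </preformat>
      <p>A production version would also need article and number agreement, which plain templates do not handle; that is one place where fuller natural language generation techniques would come in.</p>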
    </sec>
    <sec id="sec-6">
      <title>2.6 Validation and Identification of pseudo-experts</title>
      <p>The research proposes a feedback loop through which the owners or authors of the information validate the assertions of the community during the creation of the ontology. The tags and identified relationships would be fed back to the authors or owners of the content or object in question, since as owners they know the content and can validate the assertions. The advantage of this approach is that the system can then identify, through statistical approaches, a set of pseudo-experts within the community. This would increase the number of experts available for the creation or validation of the ontologies. It can also be used to determine trust and reputation within the community. If an actor or user in the community frequently provides answers or feedback that the community does not agree with, that actor's feedback can be given less weight. This could later be used to minimize security risks such as spam, while also helping to maintain the integrity and consistency of the various relationships learned. This step is necessary to prevent a flood of erroneous feedback that would affect the integrity of the system, thereby discouraging the community from participating in the
effort.</p>
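      <p>One way to picture the statistical identification of pseudo-experts is to score each user by how often their answers match the owner's validation, and then weight later votes by that score. The text does not name a specific statistic, so the agreement rate, the thresholds min_rate and min_answers, the default weight for unknown users, and all function names below are assumptions made for this sketch.</p>
      <preformat>
```python
from collections import defaultdict


def agreement_scores(answers, owner_verdicts):
    """Fraction of each user's answers that match the owner's verdict.

    answers: iterable of (user, question_id, answer) tuples.
    owner_verdicts: dict mapping question_id to the owner's validated answer.
    Returns a dict mapping user to (agreement_rate, answers_counted).
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for user, qid, answer in answers:
        if qid not in owner_verdicts:
            continue  # the owner has not validated this question yet
        total[user] += 1
        hits[user] += int(answer == owner_verdicts[qid])
    return {user: (hits[user] / total[user], total[user]) for user in total}


def pseudo_experts(scores, min_rate=0.8, min_answers=3):
    """Users who agree with owners often enough, over enough answers (assumed thresholds)."""
    return sorted(
        user
        for user, (rate, count) in scores.items()
        if rate >= min_rate and count >= min_answers
    )


def weighted_vote(votes, scores, default_weight=0.5):
    """Weighted share of 'yes' votes; each vote counts by the voter's agreement rate."""
    weights = [scores.get(user, (default_weight, 0))[0] for user, _ in votes]
    yes = sum(w for w, (_, vote) in zip(weights, votes) if vote)
    total = sum(weights)
    return yes / total if total else 0.0


# Hypothetical feedback log: (user, question id, yes/no answer).
answers = [
    ("ana", "q1", True), ("ana", "q2", False), ("ana", "q3", True),
    ("ben", "q1", False), ("ben", "q2", False), ("ben", "q3", False),
    ("carl", "q1", True),
]
owner_verdicts = {"q1": True, "q2": False, "q3": True}
scores = agreement_scores(answers, owner_verdicts)
print(pseudo_experts(scores))
print(weighted_vote([("ana", True), ("ben", False)], scores))
```
      </preformat>
      <p>In this sample, ana qualifies as a pseudo-expert, carl agrees with the owner but has answered too few questions to qualify, and ben's frequent disagreement reduces the weight of his later votes, which mirrors the down-weighting described above.</p>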
    </sec>
    <sec id="sec-7">
      <title>3. PERCEIVED ISSUES AND LIMITATIONS</title>
      <p>One of the limitations of this approach is that it currently assumes a social network in which the members have explicitly expressed their relationships via friends lists or groupings. It has not yet considered the scenario of a community-driven site wherein the members’ relationships are not explicitly stated. However, a possible starting point for this scenario would be the work in [6], which analyzes the possibility of inferring relationships, or the implicit social network, from the tags and objects in use within the network. Ad hoc groupings based on domain and context are also not yet included in the current research. These would be useful, as they could further refine the entire process. Another limitation of the current system is that, because it uses tags generated by the users, it is highly possible that the words will not appear in an existing resource like WordNet. As such, certain non-taxonomic relationships may not be discovered by the system. Also, since the input could relate to non-textual data sources such as photos and videos, the analysis of content to determine non-taxonomic relationships would have to be re-evaluated, as the existing approach relies on sentence structure to infer relationships. In our experience, when users construct content or compose their thoughts online, their sentence structure may not be as formal as that of a document, as some reuse SMS or instant messaging lingo when posting online. Also, uncommon words or phrases such as nicknames or slang have not been considered. Lastly, stopping attempts to circumvent the schemes put in place through malicious use is a current limitation of the research.</p>
    </sec>
    <sec id="sec-9">
      <title>4. FUTURE DIRECTIONS</title>
      <p>We have done similar work on determining relevant participants in a social network and on tag enrichment schemes through the use of mobile devices. During the implementation of the system, additional research can focus on the possibility of varying languages being used, on the non-textual data involved, and on other tag enrichment and disambiguation approaches for input or data other than those related to photos. Tweaks and refinements to the various stages or steps proposed in this research should also be made to allow the system to reach a critical mass. The modeling of context should also be considered: aside from the social affinity, the context should be used to personalize the ontology in order to provide a more accurate mapping of results. Finally, research on crossing social groupings for ontology alignment and mapping will be tackled in future work.
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>5. REFERENCES</title>
      <ref>
        <label>[3]</label>
        <mixed-citation>Conroy, C., O’Sullivan, D. and Lewis, D. Towards Ontology Mapping for Ordinary People. 5th European Semantic Web Conference Ph.D. Symposium, Tenerife, Spain. CEUR Workshop Proceedings, 2008.</mixed-citation>
      </ref>
      <ref>
        <label>[4]</label>
        <mixed-citation>Hepp, M., Siorpaes, K. and Bachlechner, D. Harvesting Wiki Consensus: Using Wikipedia Entries as Vocabulary for Knowledge Management. IEEE Internet Computing, 2007, Vol. 11, No. 5, pp. 54-65.</mixed-citation>
      </ref>
      <ref>
        <label>[6]</label>
        <mixed-citation>Van Damme, C., Hepp, M. and Siorpaes, K. FolksOntology: An Integrated Approach for Turning Folksonomies into Ontologies. Workshop Bridging the Gap between Semantic Web and Web 2.0 at the ESWC 2007. Innsbruck: s.n., 2007.</mixed-citation>
      </ref>
      <ref>
        <label>[8]</label>
        <mixed-citation>Sánchez, D. and Moreno, A. Learning non-taxonomic relationships from web documents for domain ontology construction. Data &amp; Knowledge Engineering, 2008, pp. 600-623.</mixed-citation>
      </ref>
      <ref>
        <label>[9]</label>
        <mixed-citation>Villaverde, J., et al. Supporting the discovery and labeling of non-taxonomic relationships in ontology learning (article in press). Expert Systems with Applications, 2009.</mixed-citation>
      </ref>
      <ref>
        <label>[12]</label>
        <mixed-citation>Farzan, R., et al. When the experiment is over: Deploying an incentive system to all the users. Symposium on Persuasive Technology, in conjunction with the AISB 2008 Convention, 2008.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>