=Paper=
{{Paper
|id=None
|storemode=property
|title=Semantic Representation of Provenance in Wikipedia
|pdfUrl=https://ceur-ws.org/Vol-670/paper_7.pdf
|volume=Vol-670
|dblpUrl=https://dblp.org/rec/conf/semweb/OrlandiCP10
}}
==Semantic Representation of Provenance in Wikipedia==
Fabrizio Orlandi (Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland, fabrizio.orlandi@deri.org)
Pierre-Antoine Champin (LIRIS, Université de Lyon, CNRS, UMR5205, Université Claude Bernard Lyon 1, F-69622 Villeurbanne, France, pchampin@liris.cnrs.fr)
Alexandre Passant (Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland, alexandre.passant@deri.org)
Abstract—Wikis are often considered as being a wide source of information. However, identifying provenance information about their content is crucial, whether it is for computing trust in public wiki pages or for identifying experts in corporate wikis. In this paper, we address this issue by providing a lightweight ontology for provenance management in wikis, based on the W7 model. Furthermore, we showcase the use of our model in a framework that computes provenance information in Wikipedia, also using DBpedia to compute provenance and contribution information per category, and not only per page.

This work is funded by the Science Foundation Ireland under grant number SFI/08/CE/I1380 (Líon 2) and by an IRCSET scholarship.

I. INTRODUCTION

From public encyclopedias to corporate knowledge management tools, wikis are often considered as being a wide source of information. Yet, since wikis generally offer an open publishing process where everyone can contribute, identifying provenance information in their pages is an important requirement. In particular, this information can be used to compute trust values for pages or page fragments [2], as well as to identify experts based on the number of contributions [9] or on other criteria such as the users' social graphs [10]. By providing this information as RDF [6], provenance metadata becomes more transparent and offers new opportunities for the previous use-cases, as well as letting people link to provenance information from other sources and personalize trust metrics based on the trust they have in a person regarding a particular topic [5].

This paper describes three of our contributions to address this issue and make provenance information in MediaWiki-powered wikis (MediaWiki is the wiki engine that powers Wikipedia, www.mediawiki.org) available on the Semantic Web:

1) a lightweight ontology to represent provenance information in wikis, based on the W7 theory [13] and using SIOC and its extensions;
2) a software architecture to extract and model provenance information about Wikipedia pages and categories, using the aforementioned ontology;
3) a user-interface to make this information openly available on the Web, both to human and software agents and directly within Wikipedia pages.

In the next section, we discuss some related work in the realm of provenance management on the Semantic Web. Then, we give some background information regarding SIOC and various extensions used in our work. In Section IV, we present the W7 theory and the lightweight ontology we have built to represent it in RDFS. We then describe our software architecture and how we compute provenance information in Wikipedia, and finally present the user interface to access this information, before concluding the paper.

II. RELATED WORK

The representation and extraction of provenance information is not a recent research topic. Many studies have been conducted on representing the provenance of data [15], but few of them have focused on integrating provenance information into the Web of data [6]. Providing this information as RDF would make provenance metadata more transparent and interlinked with other sources, and it would also offer new scenarios for evaluating trust and data quality on top of it. In this regard, a W3C Provenance Incubator Group (established in September 2009, http://www.w3.org/2005/Incubator/prov/) has recently been created. The mission of the group is to "provide a state-of-the-art understanding and develop a roadmap in the area of provenance for Semantic Web technologies, development, and possible standardization". Requirements for provenance on the Web (http://www.w3.org/2005/Incubator/prov/wiki/User_Requirements), as well as several use cases and technical requirements, have been provided by the working group. A comprehensive analysis of approaches and methodologies for publishing and consuming provenance metadata on the Web is exposed in [7].

Another research topic relevant to our work is the evaluation of trust and data quality in wikis. Recent studies proposed several different algorithms for wikis that automatically calculate users' contributions and evaluate their quantity and quality, in order to study the authors' behavior, produce trust measures for the articles, and find experts. WikiTrust (http://wikitrust.soe.ucsc.edu/) [2] is a project aimed at measuring the quality of author contributions on Wikipedia. Its authors developed a tool that computes the origin and author of every word on a wiki page, as well as "a measure of text trust that indicates the extent with which text has been revised". On the same topic, other researchers tried
to solve the problem of evaluating articles' quality, not only by examining the users' history quantitatively [9], but also by using social network analysis techniques [10].

From our perspective, there is a need for publishing provenance information as Linked Data from websites hosting a wide source of information (such as Wikipedia). Yet, most of the work on the provenance of data is either not focused on integrating the information generated on the Web of data, or mainly based on provenance for resource descriptions or already structured data. On the other hand, the interesting work done so far on analyzing trust and quality in wikis does not take into account the importance of making the extracted information available on the Web of data.

III. BACKGROUND

A. Using SIOC for wiki modelling

The SIOC Ontology — Semantically-Interlinked Online Communities [1] — provides a model for representing online communities and their contributions (http://sioc-project.org). It is mainly centered around the concepts of users, items and containers, so it can be used to model content created by a particular user on several platforms, enabling a distributed perspective on the management of User-Generated Content on the Web. In particular, the atomic elements of the Web applications described by SIOC are called Items. They are grouped in Containers, which can themselves be contained in other Containers. Finally, every Container belongs to a Space. As an example, a Site (subclass of Space) may contain a number of Wikis (subclass of Container), and every Wiki contains a set of WikiArticles (subclass of Item) generated by UserAccounts. For more details about SIOC, we invite the reader to consult the W3C Member Submission [1] and its online specification (http://rdfs.org/sioc/spec/).

While the SIOC Types module provides several subclasses of Container and Item, including Wiki and WikiArticle, some characteristics of wikis required further modelling. Hence, in our previous work [11] we extended the SIOC Ontology to take such characteristics into account (e.g. multi-authoring, versioning, etc.). Some tools to generate and consume data from wikis using our model have also been developed [12].

B. The SIOC Actions module

While SIOC represents the state of a community at a given time, SIOC-actions [4] can be used to represent its dynamics, i.e. how it evolves. Hence, SIOC provides a document-centric view of online communities, while SIOC-actions focuses on an action-centric view. More precisely, the evolution of an online community is represented as a set of actions, performed by a user (sioc:UserAccount) at some time and impacting a number of objects (sioc:Item). SIOC-actions provides an extensible hierarchy of properties for representing the effect of an action on its items, such as creates, modifies, uses, etc. Besides the SIOC ontology, SIOC-actions relies on the vocabulary for Linking Open Descriptions of Events (LODE, http://linkedevents.org/ontology/). The core of the module is the Action class (a subclass of event:Event from the Event Ontology), which is a timestamped event involving an agent (e.g. a UserAccount) and a number of digital artifacts (e.g. Items). For more details about SIOC Actions and its implementation, see Section IV.

IV. REPRESENTING THE W7 MODEL USING RDFS/OWL

The W7 model is an ontological model created to describe the semantics of data provenance [13]. It is a conceptual model and, to the best of our knowledge, an RDFS/OWL representation of this model has not been implemented yet. Hence, we focus on an implementation of this model for the specific context of wikis. As a comparison, in [14] the authors use the example of Wikipedia to illustrate theoretically how their proposed W7 model can capture domain- or application-specific provenance.

The W7 model is based on Bunge's ontology [3]; furthermore, it is built on the concept of tracking the history of the events affecting the status of things during their life cycle. In this particular case we consider the data life cycle. Bunge's ontology, developed in 1977, is considered one of the main sources of constructs for modelling real systems and information systems. Since Bunge's work is theoretical, there has been some effort from the scientific community to translate it into machine-readable ontologies (Evermann J. provides an OWL description of Bunge's ontology at http://homepages.mcs.vuw.ac.nz/~jevermann/Bunge/v5/index.html).

The W7 model represents data provenance using seven fundamental elements or interrogative words: what, when, where, how, who, which, and why. It has been purposely built on general and extensible principles, hence it is possible to capture provenance semantics for data in different domains. We refer to [13] for a detailed description of the mappings between the W7 and Bunge's models, and in Table I we provide a summary of the W7 elements (as in [14]).

Looking at the structure of the W7 model, the motivation for choosing the SIOC Actions module as the core of our model is clear: most of the concepts in the Actions module are the same as in the W7 model. Furthermore, wikis are community sites, and the Actions module has been implemented to represent dynamic, action-centric views of online communities.

In the following sections we give a detailed description of how we answered each of these seven questions.

A. What

The What element represents an event that affected data during its life cycle. It is a change of state and the core of the model. In this regard, there are three main events affecting data: creation, modification and deletion. In the context of wikis, each of them can appear: users can (1) add new sentences (or characters), (2) remove sequences of characters, or (3) modify characters by removing and then adding content
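To make the SIOC containment structure described above concrete, the following Python sketch emits the corresponding triples for the Site/Wiki/WikiArticle example; the property names (sioc:has_space, sioc:has_container, sioc:has_creator) are taken from the SIOC specification, while the example.org URIs and the helper function are hypothetical illustrations of ours:

```python
# Hypothetical sketch: triples for a Site (Space) containing a Wiki
# (Container) containing a WikiArticle (Item) created by a UserAccount,
# mirroring the SIOC example in the text. URIs are invented.
def sioc_triples(site, wiki, article, user):
    base = "http://example.org/"
    return [
        (base + site, "rdf:type", "sioc:Site"),
        (base + wiki, "rdf:type", "sioct:Wiki"),
        (base + wiki, "sioc:has_space", base + site),        # Container in a Space
        (base + article, "rdf:type", "sioct:WikiArticle"),
        (base + article, "sioc:has_container", base + wiki),  # Item in a Container
        (base + article, "sioc:has_creator", base + user),    # generated by a UserAccount
    ]

triples = sioc_triples("site", "wiki/Main", "wiki/Main/Dublin_Core", "user/alice")
```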
TABLE I
DEFINITION OF THE 7 WS BY RAM S. AND LIU J.

Provenance element | Construct in Bunge's ontology | Definition
What  | Event  | An event (i.e. change of state) that happens to data during its life time
How   | Action | An action leading to the events. An event may occur when it is acted upon by another thing, which is often a human or a software agent
When  | Time   | Time, or more accurately the duration of an event
Where | Space  | Locations associated with an event
Who   | Agent  | Agents, including persons or organizations, involved in an event
Which | Agent  | Instruments or software programs used in the event
Why   | -      | Reasons that explain why an event occurred

in the same position of the article. In addition, in systems like Wikipedia, some other specific events can affect the data on the wiki, for example a "quality assessment" or a "change in access rights" of an article [14]; however, they can be expressed with the three broader types defined above.

Since (1) wikis commonly provide a versioning mechanism for their content and (2) every action on a wiki article leads to the generation of a new article revision, the core event describing our What element is the creation of an article version. In particular, we model this creation, and the related modification of the latest version (i.e. the permalink), using the SIOC-Actions model as shown in Listing 1.

<http://example.com/action?title=Dublin_Core#380106133>
    sioca:creates <http://en.wikipedia.org/w/index.php?title=Dublin_Core&oldid=380106133> ;
    sioca:modifies <http://en.wikipedia.org/wiki/Dublin_Core> ;
    a sioca:Action .

Listing 1. Representing the "What" element

As we can see from the example above, expressed in Turtle syntax, we have a sioca:Action identified by the URI <http://example.com/action?title=Dublin_Core#380106133> that leads to the creation of a revision of the main wiki article about "Dublin Core". The creation of a new revision originated from a modification (sioca:modifies) of the main Wikipedia article <http://en.wikipedia.org/wiki/Dublin_Core>. Details about the type of event are exposed in the next section about the How element, where we identify the type of action involved in the event creation.

B. How

The How element in W7 is equivalent to the Action element from Bunge's ontology, and describes the action leading to an event. In wikis, the possible actions leading to an event (i.e. the creation of a new revision) are all the edits applied to a specific article revision. By analyzing the diff between two subsequent revisions of a page, we can identify the type of action involved in the creation of the newer revision. In particular, we focus on modelling the following types of edits: Insertion, Update and Deletion of both Sentences and References. With the term Sentence we refer here to every sequence of characters that does not include a reference or a link to another source, and with Reference we refer to every action that involves a link or a so-called Wikipedia reference. As discussed in [14], another type of edit would be a Revert, i.e. an undo of the effects of one or more previous edits. However, in Wikipedia, a revert does not restore a previous version of the article, but creates a new version with content similar to that of an earlier selected version. In this regard, we decided to model a revert like all the other edits, and not as a particular pattern. The distinction between a revert and other types of action can still be identified, with an acceptable level of precision, by looking at the user comment entered when doing the revert, since most users add a related revert comment (note that we could also compare the n-1 and n+1 versions of each page to identify whether a change is a revert).

Going further, and to represent provenance data for the action involved in each wiki edit, we modelled the diffs appearing between pages. To model the differences calculated between subsequent revisions, we created a lightweight Diff ontology, inspired by the Changeset vocabulary (http://purl.org/vocab/changeset/schema#). Yet, instead of describing changes to RDF statements, our model aims at describing changes to plain text documents. It provides a main class, the diff:Diff class, and six subclasses: SentenceUpdate, SentenceInsertion, SentenceDeletion and ReferenceUpdate, ReferenceInsertion, ReferenceDeletion, based on the previous How patterns.

Fig. 1. Modeling differences in plain text documents with the Diff vocabulary

The main Diff class represents all information about the change between two versions of a wiki page (see Fig. 1). The Diff properties subjectOfChange and objectOfChange point respectively to the version changed by this diff and to the newly created version. Details about the time and the creator of the change are provided respectively by dc:created and sioc:has_creator. Moreover, the comment about the change is provided by the diff:comment property, with range rdfs:Literal. In
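The comment-based revert heuristic mentioned in the How discussion can be sketched in a few lines of Python; the marker list below is our guess at typical Wikipedia revert comments (such as those produced by the "undo" feature), not a rule stated in the paper:

```python
# Hypothetical sketch of the revert-detection heuristic described in the
# text: classify an edit as a revert by inspecting its user comment.
REVERT_MARKERS = ("revert", "reverted", "rv ", "rvv", "undid revision", "undo")

def looks_like_revert(comment):
    """Return True if the edit comment suggests the edit is a revert."""
    c = comment.lower()
    return any(marker in c for marker in REVERT_MARKERS)
```

As the text notes, this heuristic is only acceptably precise; comparing the n-1 and n+1 versions of the page would give a stronger signal.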
Figure 1 we also display a Diff class linking to another Diff class. The latter represents one of the six Diff subclasses described earlier in this section. Since a single diff between two versions can be composed of several atomic changes (or "sub-diffs"), a Diff class can point to several subclasses using the dc:hasPart property. Each Diff subclass can have at most one TextBlock removed and one added: if it has both, then the type of change is an Update; otherwise the type is an Insertion or a Deletion.

The TextBlock class is part of the Diff ontology and represents a sequence of characters added or removed at a specific position of a plain text document. It exposes the content itself of this sequence of characters (content) and a pointer to its position inside the document (lineNumber). It is important to note that the document content is usually organized in sets of lines, as in wiki articles, but this class is generic enough to be reusable with other types of text organization. Note also that each of the six subclasses of the Diff class inherits the properties defined for the parent class, although this is not displayed in Figure 1 for space reasons.

With the model presented, it is possible to address an important requirement for provenance: the reproducibility of a process. Starting from an older revision of a wiki article, and just following the diffs between the newer revisions and the TextBlocks added or removed, it is possible to reconstruct the latest version of the article. This approach goes a step further than just storing the different data versions: it provides details of the entire process involved in the data life cycle.

C. When

The When element in W7 is equivalent to the Time element from Bunge's ontology, and refers to the time an event occurs, which is recorded in every wiki platform for page edits. As depicted in Figure 1, each Diff class is linked to the timestamp of the event using the dc:created property. The same timestamp is also linked to each Diff subclass using the same property (not shown in Fig. 1 for space reasons). The time of the event is modelled in more detail in the Action element, as shown in Listing 2 (for all the namespaces see http://prefix.cc).

<http://example.com/action?title=Dublin_Core#380106133>
    dc:created "2010-08-21T06:36:17Z"^^xsd:dateTime ;
    lode:atTime [
        a time:Instant ;
        time:inXSDDateTime "2010-08-21T06:36:17Z"^^xsd:dateTime
    ] ;
    a sioca:Action .

Listing 2. Representing the "When" element in Turtle syntax

In this context we consider actions to be instantaneous. As in [4], we track the instant at which an action takes effect on a wiki (i.e. when a wiki page is saved). Usually, this creation time is represented using dc:created. Another option, provided by the LODE ontology, uses the lode:atTime property to link to a class representing a time interval or an instant.

D. Where

The Where element represents the online "Space" or the location associated with an event. In wikis, and in particular in Wikipedia, this is one of the most controversial elements of the W7 model. If the location of an article update is considered to be the location of the user when updating the content, then this information on Wikipedia is not completely provided or accurate. Indeed, we can extract this information only from the IP addresses of anonymous users, not for all Wikipedia users. Note that it is possible to link a sioc:UserAccount (e.g. <http://en.wikipedia.org/wiki/User:96.245.230.136>) to the related IP address using the SIOC ip_address property.

E. Who

The Who element describes an agent involved in an event, i.e. a person or an organization. On a wiki it represents the editor of a page, who can be either a registered user or an anonymous user. A registered user might also have different roles on the Wikipedia site and, on this basis, different permissions are granted to their account. With this work we are only interested in keeping track of the user account involved in each event, and not in its role on the wiki. Users are modelled with the sioc:UserAccount class and linked to each sioca:Action, sioct:WikiArticle and diff:Diff with the property sioc:has_creator. A sioc:UserAccount represents a user account, in an online community site, owned by a physical person, a group or an organization (i.e. a foaf:Agent). Hence, a physical person, represented by foaf:Person (a subclass of foaf:Agent), can be linked to several sioc:UserAccounts.

Fig. 2. Modeling the Who element with sioc:UserAccount

F. Which

The Which element represents the instruments used in the event. In our particular case it is the software used for the edit, which might be a bot or the wiki software used by the editor. Since there is no direct and precise way to identify whether an edit has been made by a human or a bot, our model does not make this distinction. A naive method could be to look at the username and check whether it contains the "bot" string.

G. Why

The Why element represents the reasons behind the event occurrence. On Wikipedia it is defined by the justifications for a change inserted by a user in the "comment" field. This is
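The two mechanisms just described, classifying a sub-diff from its TextBlocks and replaying TextBlocks to reproduce a newer revision, can be sketched as follows; the function names are ours, and the actual pipeline described in Section V is a PHP script:

```python
# Hypothetical sketch. A sub-diff carries at most one removed and one
# added TextBlock: both present -> Update, only added -> Insertion,
# only removed -> Deletion (the rule described in the text).
def classify(removed, added):
    if removed is not None and added is not None:
        return "Update"
    if added is not None:
        return "Insertion"
    if removed is not None:
        return "Deletion"
    raise ValueError("a sub-diff must add or remove a TextBlock")

# Replaying diffs: a TextBlock is modelled as (line_number, content).
# Applying the removed/added blocks of every diff in order reconstructs
# the newer revision from an older one (the reproducibility property).
def apply_subdiff(lines, removed, added):
    lines = list(lines)
    if removed is not None:
        n, content = removed
        assert lines[n] == content  # sanity-check against the old revision
        del lines[n]
    if added is not None:
        n, content = added
        lines.insert(n, content)
    return lines

old = ["intro", "body", "outro"]
new = apply_subdiff(old, removed=(1, "body"), added=(1, "better body"))
```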
not a mandatory field for the user when editing a wiki page, but the Wikipedia guidelines recommend filling in this text field. We model the comment left by the user with the property diff:comment, linking the diff:Diff class to the related rdfs:Literal.

V. APPLICATION USING PROVENANCE DATA FROM WIKIPEDIA

A. Collecting the data from the Web

In order to validate and test our modelling solution for provenance in wikis, and in particular for the Wikipedia website, we collected data from the English Wikipedia and the DBpedia service. The DBpedia project (http://dbpedia.org), since it extracts and publishes structured information from the English Wikipedia, is considered as its RDF export. Collecting data not only from Wikipedia but also from DBpedia has an important advantage: it directly provides us with structured data modelled with popular, standard, lightweight ontologies in RDF. We use the DBpedia data especially for the categories that hierarchically structure the articles on Wikipedia. We ran our experiment collecting a portion of the Wikipedia articles, in particular the articles belonging to the whole hierarchy under a given category. By doing this, we could limit our dataset to articles strongly related to each other, and collect a user community with a common interest. A PHP script has been developed to extract all the articles belonging to a category and all its subcategories, and for each article its whole revision history. In more detail, this program:

• executes a SPARQL (http://www.w3.org/TR/rdf-sparql-query/) query over the DBpedia endpoint to get the categories hierarchy;
• stores the categories hierarchy (modelled with the SKOS vocabulary, http://www.w3.org/TR/skos-reference/) in a local triplestore;
• queries the DBpedia endpoint again to get all the articles belonging to the categories collected;
• for all the articles collected, generates (and stores locally) RDF data using the SIOC-MediaWiki exporter (http://ws.sioc-project.org/mediawiki/);
• using the sioc:previous_version property, exports RDF for all the previous revisions of each article.

The advantage of using DBpedia in this process is clear, since we collected structured data by just executing two lightweight SPARQL queries.

A second PHP script has been developed to extract detailed provenance information from the articles collected in the previous step. This script calculates the diff function between consecutive versions of the articles, and retrieves more related information from the Wikipedia API. The data retrieved from the API comprises all the information needed for the creation of the model described in the previous section. Therefore, information about the editor, the timestamp, the comment and the ID of the versions is identified. Moreover, the algorithm is not only capable of extracting the diff function, but also of computing the type of change for each of the differences identified. This allows us to mark each change with one of the Sentence or Reference Insertion/Update/Deletion subclasses of the diff:Diff class. Finally, the script generates RDF data according to the model described before and inserts it into the local triplestore. In order to test our application, we ran the data extraction algorithm starting from the category "Semantic Web" on the English Wikipedia, and generated data for all the 166 wiki articles belonging to this category and its subcategories, recursively. As we can see, using Semantic Web technologies we have the advantage of a single, standard language to query wiki and provenance data together, while developers who need to query the original systems have to learn a new API for each new system they want to query.

B. A Firefox plug-in for provenance from Wikipedia

In order to show the potential of the data collected and the data model created, we built an application that shows some interesting statistics extracted from the provenance information of the analyzed articles. The application displays a table directly on top of each Wikipedia article, exposing some information about the most active users on the article and their edits. In particular, it has been developed using a Greasemonkey (http://www.greasespot.net/) script: a Mozilla Firefox extension that allows users to install scripts that make on-the-fly changes to HTML web page content. This script is developed in JavaScript and is now compatible with other popular Web browsers. The structure of the application is composed of the following elements: 1) the triplestore containing the data collected, exposing a SPARQL endpoint for querying the data; 2) a PHP script, used as an interface between the Greasemonkey script and the triplestore; 3) a Greasemonkey script, which retrieves the URL of the loaded Wikipedia page, sends the request to the PHP script, and then displays the returned HTML data on the Wikipedia page. The PHP script in this application is important because it is responsible for executing the SPARQL queries on the triplestore. Furthermore, it retrieves the results and creates the HTML code to embed in the Wikipedia page. A screenshot of the result of the process is displayed in Figure 3.

The tables displayed in Figure 3 appear only on top of the Wikipedia articles and categories that we analyzed with the method described in Section V-A. A different type of table is shown when the visited page is a category page. In the top table of Figure 3, we can see the six users who made the largest number of edits on the article. For each of these users we compute: (1) their total number of edits on the page; (2) their percentage of "ownership" of the page (or better, the percentage of their edits compared to all the edits done on the article); (3) the number of lines they added to the article; (4) the number of lines they removed from the article; (5) their total number of lines added and removed across all the articles belonging to the category "Semantic Web". In the other use-case, when the user visits a Wikipedia category page, we display different
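The diff-and-classify step performed by the second script can be approximated with Python's standard difflib, treating any line containing a link as a Reference per the definition in Section IV-B; this is a simplified stand-in of ours for the actual PHP implementation:

```python
import difflib

# Hypothetical sketch of the diff-and-classify step: compare two revisions
# line by line and tag each change as a Sentence or Reference
# Insertion/Deletion (an Update would pair a deletion with an insertion).
def is_reference(line):
    # Per Sec. IV-B, a Reference involves a link or a Wikipedia reference.
    return "http://" in line or "[[" in line or "<ref>" in line

def classify_changes(old_lines, new_lines):
    changes = []
    for line in difflib.ndiff(old_lines, new_lines):
        tag, text = line[:2], line[2:]
        if tag == "- ":  # removed from the older revision
            kind = "Reference" if is_reference(text) else "Sentence"
            changes.append((kind + "Deletion", text))
        elif tag == "+ ":  # added in the newer revision
            kind = "Reference" if is_reference(text) else "Sentence"
            changes.append((kind + "Insertion", text))
    return changes

old = ["Dublin Core is a metadata standard.", "See http://dublincore.org"]
new = ["Dublin Core is a metadata vocabulary.", "See http://dublincore.org"]
changes = classify_changes(old, new)
```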
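The category-hierarchy query described in the first step can be built as a plain string; the exact query used by the PHP script is not given in the paper, so this is an assumed reconstruction relying on the skos:broader property that DBpedia uses to link categories:

```python
# Hypothetical reconstruction of the first DBpedia query described in the
# text: fetch one level of the subcategory hierarchy under a category.
# The script would iterate per level to collect the whole hierarchy,
# since recursive traversal is not available in SPARQL 1.0.
def subcategories_query(category_uri):
    return ("PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
            "SELECT ?sub WHERE { ?sub skos:broader <%s> . }" % category_uri)

q = subcategories_query("http://dbpedia.org/resource/Category:Semantic_Web")
```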
types of information, but using the same method (see the table at the bottom of Figure 3). Browsing a wiki category page, the application shows a list of the users with the largest number of edits on the articles of the whole category (and related subcategories). It also shows the related percentages of their edits compared to the total edits on the category. The second table, on the right, exposes a list of the most edited articles in the category during the last three months. Note also that at the bottom of each table there is a link pointing to a page where a longer list of results is displayed.

At the moment, the PHP script developed is available at http://vmuss06.deri.ie/WikiProvenance/index.php. Just using this script, it is possible to get the same information displayed by the Greasemonkey script, and also to get the RDF descriptions of the requested page. In order to represent this statistical information in RDF, we use SCOVO, the Statistical Core Vocabulary [8]. It relies on the concept of items and dimensions to represent statistical information. In our context, an item is one piece of statistical information (e.g. user "X" edited 10 lines on page "Y"), and various dimensions are involved in the description: (1) the type of information that we want to represent (number of edits, percentage, lines added and removed, etc.); (2) the page or the category impacted; (3) the user involved. Hence, we created four instances of scv:Dimension to represent the first dimension, and then relied simply on the scv:dimension property for the other ones. As an example, the following snippet represents that the user KingsleyIdehen made 11 edits on the SIOC page.

ex:123 a scovo:Item ;
    rdf:value 11 ;
    scv:dimension :Edits ;
    scv:dimension <http://en.wikipedia.org/wiki/Semantically-Interlinked_Online_Communities> ;
    scv:dimension <http://en.wikipedia.org/wiki/User:KingsleyIdehen> .

Listing 3. Representing the number of edits by a user with SCOVO

Fig. 3. A screenshot of the application on the "Linked Data" page and the table from the Category "Semantic Web" page

VI. CONCLUSION AND FUTURE WORK

The goal of this paper was to provide a solution for representing and managing the provenance of data from Wikipedia (and other wikis) using Semantic Web technologies. To solve this problem we provided: a specific lightweight ontology for provenance in wikis, based on the W7 model; a framework for the extraction of provenance data from Wikipedia; and an application for accessing the generated data in a meaningful way and exposing it to the Web of data. We showed that the W7 model is a good choice for modelling provenance information in general, and in wikis in particular, but, because of its high abstraction level, it has to be refined using, for instance, other specific lightweight ontologies. In our case this has been done using SIOC and the Actions module. Future developments will include a refinement of the proposed model and a subsequent alignment with other general-purpose ontologies for representing provenance as Linked Data (e.g. the Open Provenance Model). We also plan to improve and extend the potential of our application, offering more features and providing a wider range of data, with an architecture that automatically updates the data as soon as it changes on Wikipedia.

REFERENCES

[1] SIOC Core Ontology Specification. W3C Member Submission 12 June 2007, World Wide Web Consortium, 2007. http://www.w3.org/Submission/sioc-spec/.
[2] B.T. Adler, L. de Alfaro, I. Pye, and V. Raman. Measuring author contributions to the Wikipedia. In Proceedings of WikiSym '08. ACM, 2008.
[3] Mario Bunge. Treatise on Basic Philosophy: Ontology I: The Furniture of the World. Reidel, Boston, 1977.
[4] P.A. Champin and A. Passant. SIOC in Action - Representing the Dynamics of Online Communities. In Proceedings of the 6th International Conference on Semantic Systems (I-SEMANTICS 2010). ACM, 2010.
[5] J. Golbeck, B. Parsia, and J. Hendler. Trust networks on the Semantic Web. In Cooperative Information Agents VII, pages 238-249, 2003.
[6] Olaf Hartig. Provenance information in the Web of data. In 2nd Workshop on Linked Data on the Web (LDOW 2009) at WWW, 2009.
[7] Olaf Hartig and Jun Zhao. Publishing and Consuming Provenance Metadata on the Web of Linked Data. In Proceedings of the 3rd International Provenance and Annotation Workshop, 2010.
[8] M. Hausenblas, W. Halb, Y. Raimond, L. Feigenbaum, and D. Ayers. SCOVO: Using statistics on the Web of data. In Semantic Web in Use Track of the 6th European Semantic Web Conference (ESWC 2009), 2009.
[9] B. Hoisl, W. Aigner, and S. Miksch. Social Rewarding in Wiki Systems - Motivating the Community. In Proceedings of the 2nd International Conference on Online Communities and Social Computing, pages 362-371. Springer-Verlag, 2007.
[10] N.T. Korfiatis, M. Poulos, and G. Bokos. Evaluating authoritative sources using social networks: an insight from Wikipedia. Online Information Review, 2006.
[11] Fabrizio Orlandi and Alexandre Passant. Enabling cross-wikis integration by extending the SIOC ontology. In 4th Semantic Wiki Workshop (SemWiki 2009). CEUR-WS, 2009.
[12] Fabrizio Orlandi and Alexandre Passant. Semantic Search on Heterogeneous Wiki Systems. In International Symposium on Wikis (WikiSym 2010). ACM, 2010.
[13] Sudha Ram and Jun Liu. Understanding the semantics of data provenance to support active conceptual modeling, pages 17-29. Springer Berlin / Heidelberg, LNCS edition, 2007.
[14] Sudha Ram and Jun Liu. A New Perspective on Semantics of Data Provenance. In First International Workshop on the role of Semantic Web in Provenance Management, 2009.
[15] Y.L. Simmhan, B. Plale, and D. Gannon. A survey of data provenance techniques. Technical report, Computer Science Department, Indiana University, Bloomington IN, 47405, 2005.