=Paper=
{{Paper
|id=None
|storemode=property
|title=The SWO Project: A Case Study for Applying Agile Ontology Engineering Methods
for Community Driven Ontologies
|pdfUrl=https://ceur-ws.org/Vol-897/session4-paper20.pdf
|volume=Vol-897
|dblpUrl=https://dblp.org/rec/conf/icbo/CopelandBPSM12
}}
==The SWO Project: A Case Study for Applying Agile Ontology Engineering Methods for Community Driven Ontologies==
The SWO Project: A Case Study of Applying Agile Ontology
Engineering Methods in Community Driven Ontologies
Maria Copeland 1, Andy Brown 1, Helen Parkinson 2, Robert Stevens 1 and James Malone 2∗
1 Department of Computer Science, University of Manchester, Manchester, UK
2 European Bioinformatics Institute, Cambridge, UK
ABSTRACT
The Software Ontology Project (SWO) is a community effort to build an ontology that models software used in the generation and analysis of data for curation and preservation purposes in areas such as biomedicine. In community driven efforts, requirements are elicited from the members of these communities to help ensure the ontology is fit for purpose. This requires methods which are able to engage with users with a wide range of expertise, allow close collaboration between developers and users and that are able to respond rapidly to changing knowledge. We describe an Agile Ontology Engineering method for developing ontologies, adapted from modern Agile software engineering methods. The approach was applied within the SWO project and demonstrated promising results in engaging a diverse set of community representatives and an objective measure of ontology success, of much relevance to the active community of bio-ontology developers and users.

1 INTRODUCTION
Community driven development of ontologies has become the norm in biomedicine (Garcia et al., 2010; The Gene Ontology Consortium, 2010). Engaging with a diverse set of users from a community presents challenges; some of these challenges are technical, such as when multiple ontology authors need to synchronously or asynchronously edit from multiple sites, and to allow users to view ontologies and directly comment on the artefacts (Alexander et al., 2011). For such challenges, tools such as CollaborativeProtégé (Tudorache et al., 2008) (CP) and resources such as BioPortal (Noy et al., 2009) have been developed, and versioning tools can be used to support the use of code bases in a similar way as for software projects. Tools such as CP offer technical solutions for the technically savvy, but methods for collaboration need to be much broader than addition of axioms (a necessary, but not sufficient task for ontology building). Some of these challenges are more about the process used and are sociological in nature, as much as technical (Randall et al., 2011). They include, but are not limited to: working with people that are not ontology engineers; the ability to appropriately capture requirements from a diverse set of users; prioritising those requirements by seeking group consent and resolutions; judging when an ontology meets a community’s requirements; responding to issues, requests and shifts in domain knowledge; and taking into account and predicting the various types of the ontology’s users.

We used the Software Ontology (SWO) Project to explore ways of including a broad range of user backgrounds into a community building process. The SWO is a community driven, collaborative project with the aim of producing an ontology that captures formal descriptions of software used in the production and analysis of data for curation and preservation purposes, in order to promote standardisation and reusability of knowledge (Brown, 2011). This need is particularly pertinent in the computational biology field, where an understanding of how data were analysed is an important part of the scientific record that allows proper re-interpretation and re-use of the discipline’s data.

There is a need in many Ontology Engineering efforts to engage with a community of domain experts that may not be ontology or knowledge representation experts. This presents a challenge of how to engage an audience without the developers having to learn their domain knowledge or the audience having to become ontology engineers. In addition to this challenge, there are difficulties of keeping the community engaged, and the ontology up-to-date and responsive to the customers. Some of these challenges in community ontology development are similar to those seen in Software Engineering.

One popular method in software engineering for tackling these issues is Agile Software Development (Martin, 2002), which is a set of software engineering methods that drive the development effort around the requirements, use cases, and continuous participation with users. Agile methods require that iterations are frequent and that software is released often with an emphasis on collaboration between developers, especially with regard to customer requirements. Typically through these short iterations, the whole life cycle is reproduced, including some requirements analysis, testing and product delivery. Some of the advantages of this approach are that changing requirements are considered part of the normal life cycle and that the users are more ‘involved’ with the development process, making this approach more able to respond to change as determined by the customer. There is also an emphasis on working code and test-driven development, so the software works for each of the features added at each iteration.

In community ontology projects like the SWO, the community should act as the customers from which requirements are elicited and used to help make the ontology fit for purpose. Unsurprisingly then, ontology engineering methods have been developed that aim to tackle similar aspects of the development process as in Software Engineering. In a review of ontology engineering methods, Tempich and colleagues (Tempich et al., 2006) were critical of many methods for their lack of consideration for evolution of ontologies, thus indicating a failure in continuing a collaborative relationship with the community of users as part of the ontology’s evolution. A wider study surveyed 148 ontology engineering projects in academia and industry, concluding that ontology engineering has become
∗ To whom correspondence should be addressed: malone@ebi.ac.uk
Copeland et al
an important discipline, though there is still work to be done in the area, such as the need for better documentation of decisions taken during the ontology engineering process (Simperl et al., 2010).

In response to such needs, some aspects of Agile methods are beginning to be adopted by the Ontology Engineering community. One such method is the RapidOWL method (Auer and Herre, 2007), whose authors claim that the adoption of agile methods for Ontology Engineering benefits greatly when these methods are carried out in a unified process and with the continuous involvement of the user community. This is a positive step in community ontology development; however, more analysis, case studies, and empirical evaluations are needed in order to recommend and refine the application of Agile Ontology Development.

In this paper we present a case study in which an Agile Ontology Development method was applied to the SWO project. We use the SWO as a case study of how an agile software approach can be adapted to the development of an ontology. We also reflect on what we did in our agile method and what we think could and should be changed to make it better.

2 SWO’S AGILE METHOD
Eliciting requirements from and engaging with the life science community is important to ensure ontologies meet the needs of the stakeholders. Given a broad similarity of activities within both software and ontology development (requirements gathering, evaluation, design, implementation, testing, publishing, maintaining, etc.), moving methods from one to the other has an a priori appeal. The organisers of the SWO project approached the ontology building task by adapting agile methods from software development into the engineering process for ontologies. SWO focused on the following agile principles (Martin, 2002):
• The introduction of requirements gathering and ontology modelling sessions as iterative and incremental activities;
• That requirements evolve throughout the engineering cycle;
• The encouragement of self-organised and cross-functional teams;
• The provision of rapid and responsive ontology development;
• That domain experts, users, and ontology engineers are all active contributors throughout the process;
• The use of a test driven approach to development;
• The provision of regular and frequent builds to the participants for discussion, testing, refinement, and agreement.

These methods have several events whose activities are planned to deliver information to other events in a cyclic manner. Events can, however, sometimes run in parallel. The Agile Ontology Engineering Method is summarised as (see also Figure 1):
• Requirements Gathering Event: This event focuses on the capture of requirements by identifying competency questions and desirable features for the ontology. These activities are use case or scenario driven.
• Requirements Prioritisation Event: This event has two parts, both adapted from Agile Software Engineering techniques. The first requires participants to estimate the complexity required to implement a requirement based on ‘planning poker’ (Grenning, 2002). The second part has participants rank features collected in the requirements gathering phase by importance by ‘bidding’ for individual features based on the ‘Buy a Feature’ method (Kirk, 2011).
• Implementation of Top Requirements Event: The modelling and coding of the ontology takes place, focusing on the features ‘bought’ from the previous step. Modular development is used to allow concurrent development from co-located or distributed developers. Content is also gathered by participants completing templates taken from the implemented ontology.
• Evaluation Event: Evaluation of the ontology is done by all participants of the development process, thus helping competency questions to be satisfied. Testing is conducted with defined classes acting as queries based on the competency questions against the ontology. This is disseminated to the stakeholders who evaluate the ontology against the requirements.

Fig. 1. SWO’s Ontology Development Flow of Events.
Figure labels: Requirements; Feature Request; Competency Questions; Features Ranking; Planning Poker; Implementation of top requirements; Modular Development; Assemble using Reasoning; Compositional Approach; Evaluation; Testing with Defined Classes.

Requirements Gathering Event: The following ground-rules were used in the SWO workshops:
• Presenting the community of users and contributors by introducing stakeholders and typical users of the ontology;
• ‘No death by PowerPoint’; the workshops are hands on events where results and instant feedback will be the focus of activity;
• No up-front ‘Ontologising’ in the workshops; the resolution of the pattern of axioms for representing the user’s needs is vital, but does not need to be done interactively in this setting, especially with those that are not familiar with ontologies;
• Everyone participates. Activities were designed such that everyone could join in and contribute and that ‘powerful’ voices could not dominate.

These rules of engagement were overseen by a moderator to help keep the level of detail and discussion appropriate for the activities. The universe of discourse was created by asking participants to organise into groups and write on ‘sticky-notes’ the information they wanted to record about software. The notes were then clustered according to similarity by the participants, invoking more discussion. A similar exercise was used for gathering competency questions. Both sets of clusters were reviewed for correspondence between features and questions and any gaps highlighted by one cluster in the other set of clusters were discussed and addressed.
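
The review of feature clusters against competency-question clusters can be pictured with a minimal sketch. It assumes each sticky note and each question has already been labelled with a cluster name. The labels and counts are hypothetical apart from the ‘platform’ figures (10 notes, 4 questions) reported in section 3, and the `min_ratio` threshold and the function name `find_gaps` are assumptions made for illustration, not part of the SWO method.

```python
from collections import Counter


def find_gaps(note_clusters, question_clusters, min_ratio=0.5):
    """Return clusters where notes and questions do not correspond well.

    note_clusters / question_clusters: one cluster label per sticky note
    or competency question.
    min_ratio: hypothetical threshold (questions per note) below which a
    cluster is flagged for discussion; not taken from the paper.
    """
    notes = Counter(note_clusters)
    questions = Counter(question_clusters)
    gaps = {}
    for cluster in sorted(set(notes) | set(questions)):
        n, q = notes[cluster], questions[cluster]
        # Flag clusters missing on one side, or with few questions per note.
        if n == 0 or q == 0 or q / n < min_ratio:
            gaps[cluster] = (n, q)
    return gaps


# Hypothetical workshop output, except for the 'platform' counts.
notes = ["platform"] * 10 + ["license"] * 3 + ["version"] * 2
questions = ["platform"] * 4 + ["license"] * 3 + ["interface"] * 1

gaps = find_gaps(notes, questions)
for cluster, (n, q) in gaps.items():
    print(f"{cluster}: {n} notes, {q} questions")
```

In the workshops this comparison was made by the participants and the moderator through discussion; the sketch only shows the kind of mismatch that prompted that discussion.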
Agile Ontology Engineering
The validation of the clusters and understanding of competency questions were conducted by the next activity, which instructed users to create personas. A persona represents an individual user of the ontology with specific characteristics that provide context to the interactions and needs of that particular user. In contrast to use cases, personas define a specific individual of a user group with a detailed ‘story’ of the user and an example of usage of the ontology. The moderator asked for each persona to give: their age, background, dress code, favourite food, work, and a detailed task that includes the competency question in the form of a story. Personas are meant to motivate participants to have some basis for the decisions they make, rather than making decisions based on personal preference. The use of personas is a requirements gathering method that aims to reduce the effect of sample bias in participant events.

Requirements Prioritisation Event - Planning Poker: The set of clusters, questions, and usage examples were prioritised by adopting the Agile technique of Planning Poker (Grenning, 2002). The SWO project’s version of this technique was divided into two voting activities. In the first voting activity participants were asked to rate the difficulty of describing the clusters by casting a vote using ‘voting’ cards with ‘?’, ‘0’, ‘1’, ‘2’, ‘3’, ‘5’, ‘8’, ‘13’, ‘20’, ‘40’, and ‘100’; where zero is the easiest and 100 is the hardest. Discussion was encouraged by asking participants to contribute their views when their vote was not close to the average vote for that cluster. The consensus or mean was used to derive a cost for the feature in question.

In the second voting activity users were given 100 points to spend on the clusters they wanted to be represented using the ‘Buy a Feature’ game. In this activity voting and polling drive the prioritisation; if a discussion about a cluster was prolonged a re-vote was taken to allow participants the opportunity to change their investment after the discussion. The buying phase results in a set of features that the participants most wish to have in the ontology. Such prioritisation events should be iterative over the entire life of the ontology.

Implementation of Top Requirements Event: Implementation of the requirements was conducted in a modular development approach using normalisation (Rector, 2003). The requirements indicated the main modules to be made (for example data in a data OWL module). Classes of Software are then described in terms of these feature modules via restrictions, and defined classes of software establish the hierarchy of software. In addition to versioning standards, other standards such as coding standards, URI naming conventions, and labelling standards were chosen or devised, and implemented by the ontology engineers. The Implementation Event ran concurrently with the Evaluation Event in order to adhere to the Agile principle of continuous and gradual testing and validation.

In the Implementation Event, user contribution continued by allowing users to populate templates of software descriptions as part of the development process. These templates, based on the axiom patterns in SWO, were used both to gather input for software descriptions and to test the ability of the SWO to enable descriptions. The spreadsheet had a software entity in the first column and subsequent columns represented the prioritised features for software descriptions. Participants either filled in from the SWO’s existing software descriptions or provided their own new terms.

Evaluation Event: Evaluation of the ontology was conducted by combining the information from the Requirements Gathering Event, namely software clusters and their descriptions, with test cases. The SWO ontology engineers produced a series of software description test cases based on the competency questions and samples supplied by the participants during the two workshops, the email lists and blog posts. All commonly used test cases were covered without spending too much time on rarely used software descriptions not covered in the test cases. The ontology was tested with the use of personas and competency questions through the participation of all SWO project members throughout all the events in the development cycle.

3 RESULTS
The SWO project had a six-month duration. The Requirements Gathering Events took place during the first face-to-face workshop (WS1) and then again in an iteration four months later in a second workshop (WS2); there were 18 participants in WS1 and 13 participants in WS2. Seven of these people participated in both workshops. The first set of requirements was produced following the activities described in section 2. This resulted in 17 clusters of features (formed by clustering possible features of interest) and 91 competency questions aligned to these features. The matching of competency questions to feature clusters provided interesting validation regarding what ‘can be said’ as opposed to what is ‘desirable to ask’ about a feature. For instance, one of the main clusters identified was software ‘platform’. Platform had 10 sticky notes about information that can be recorded about a platform’s software requirements (e.g. will it run on Linux, Android etc.); however, when questions were solicited from participants, only 4 questions were asked. The moderator highlighted this discrepancy and explained that the usage of a cluster is as important as the detail recorded in the cluster.

Following the initial gathering activities, the Requirements Prioritisation Event took place. Table 1 presents a summary of the features bought by the method described in section 2. It should be noted that the ‘cost’ of each feature does not translate directly to a real effort or cost (e.g. days or money). The metric is meant as a comparable measure within features only.

Table 1. Features that were initially identified, then given a complexity estimate (cost) and later prioritised through a ‘bidding’ process. * indicates features that were bought (i.e. their cost was met).

Feature                     Cost    Total Bid
Algorithms*                 75      75
Architecture                87      0
Capability                  254     247
Configure/run parameters    542     174
Cost of ownership           295     70
Data*                       300     300
Dependencies                276     91
Function*                   44      44
Interface*                  74      74
Licenses*                   38      38
Life cycle*                 188     188
Platform                    169     50
Source code location*       25      25
Supplier*                   14      14
Version*                    110     110
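
As a rough illustration of the two prioritisation steps from section 2, the sketch below flags planning-poker votes that sit far from the mean and then applies the ‘Buy a Feature’ rule that a feature is bought once its total bid meets its cost. The cost and bid figures are those of Table 1. The example votes, voter names, the `tolerance` threshold, and the function names are assumptions made for illustration and do not come from the SWO workshops.

```python
from statistics import mean

# Card values used in the SWO workshops; "?" means "no estimate".
DECK = {"?", "0", "1", "2", "3", "5", "8", "13", "20", "40", "100"}


def poker_round(votes, tolerance=0.5):
    """Return the mean estimate and the voters asked to explain their vote.

    `tolerance` is a hypothetical cut-off: a vote is flagged when it differs
    from the mean by more than this fraction of the mean.
    """
    unknown = [card for card in votes.values() if card not in DECK]
    if unknown:
        raise ValueError(f"cards not in the deck: {unknown}")
    # "?" cards carry no number, so they are left out of the mean.
    numeric = {who: int(card) for who, card in votes.items() if card != "?"}
    estimate = mean(numeric.values())
    outliers = sorted(
        who for who, value in numeric.items()
        if abs(value - estimate) > tolerance * estimate
    )
    return estimate, outliers


# (cost, total bid) per feature, copied from Table 1.
TABLE_1 = {
    "Algorithms": (75, 75), "Architecture": (87, 0),
    "Capability": (254, 247), "Configure/run parameters": (542, 174),
    "Cost of ownership": (295, 70), "Data": (300, 300),
    "Dependencies": (276, 91), "Function": (44, 44),
    "Interface": (74, 74), "Licenses": (38, 38),
    "Life cycle": (188, 188), "Platform": (169, 50),
    "Source code location": (25, 25), "Supplier": (14, 14),
    "Version": (110, 110),
}


def bought(features):
    """Features whose pooled bids meet or exceed their estimated cost."""
    return sorted(name for name, (cost, bid) in features.items() if bid >= cost)


# Hypothetical votes for one cluster.
estimate, to_discuss = poker_round(
    {"ann": "5", "bob": "5", "cy": "20", "di": "8", "eve": "?"}
)
purchased = bought(TABLE_1)
print(estimate, to_discuss)
print(purchased)
```

Applied to Table 1, the `bid >= cost` rule selects the nine features marked with an asterisk. In the workshops the flagged voters were asked to explain their view before a re-vote, a step that a script cannot replace.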
Participants continued to contribute to the population of descriptions and clusters by providing examples of usage, software descriptions, and new entities via the emailing list and blog posts. After four months WS2 was held, at which progress was evaluated. An initial exercise of asking the participants to each describe software, following the requirements set out previously, was conducted. Then, a re-prioritisation occurred, with the intention of finding out if, given the current ontology and experience of describing software, the current set of priorities was still relevant. Some new features that emerged from this included the need to capture software suites or packages, and software documentation. The platform upon which the software runs was seen to be a more important feature in WS2, although it was not bought in WS1. Axiom patterns changed according to feedback and the descriptions captured in the spreadsheets used to evaluate SWO’s ability to describe software were added to SWO. These spreadsheets captured extensions to various parts of the SWO, so all parts of the SWO’s descriptions of software features changed. Three iterations occurred, resulting in three releases during the six months. The final ontology after iteration 3 had 903 classes, 101 individuals, and 34 properties.

Table 2. Examples of competency questions for some of the features and how they were met in the ontology, evaluated using an OWL defined class, which equated to the competency question. The example answers are actual answers given from the ontology (not exhaustive).

Feature    Competency Question             Manchester OWL Test Question                              Example Answer
Data       Which software has MAGE tab     has specified input some (data and has format             ArrayExpress and Bioconductor
           as input?                       specification some ‘MAGE tab format’)
Function   What software performs          achieves objective some ‘molecular sequencing analysis’   EMBOSS
           molecular sequencing analysis?
Algorithm  Which software implements a     implements some ‘Bayesian Model’                          GeneSelector
           Bayesian Model?
Interface  What software has command       has interface some ‘Command Line Interface’               DROID
           line interface?
Version    Which version of Microsoft      ‘Microsoft Excel’ and (has version some                   Microsoft Excel 2010
           Excel came after 2007?          (‘version name or number’ and (preceeded by value
                                           ‘Microsoft 2007 version’)))

4 DISCUSSION
In our experience of building SWO, the use of Agile methods in ontology development appeared to have several strengths that should be of relevance to the biomedical community. There is, of course, no formal evaluation of the process; what we report here is reflection on what we did with the SWO and in our involvement in other ontology development efforts. During the SWO project the link between participation of its members and its corresponding effects on actions in the development cycle was self-evident. The focus of development was the community of users rather than a particular ontology technology, formal ontology or philosophical paradigm. The ‘ontologising’ is, of course, important, but we treated it only as a means to an end. Also, the process we describe is neutral to ontological paradigm.

This emphasis on community and commitment to Agile principles appears to us to have produced the following benefits:
• An open and transparent process, present throughout the development cycle of the ontology. This seemed to have encouraged participation and a sense of membership to the SWO project as a whole rather than unconnected contributions to prescribed phases.
• Users’ competency questions, personas, and test cases were important drivers and validators of the development process.
• The resulting ontology covered the knowledge needed by the community as communicated through concrete examples, users’ descriptions, and competency questions.

The approach enabled the SWO developers to engage with users with a varied background and experience in ontology building. Around 80% of those who contributed to the workshops and requirements had no experience of ontologies or OWL, yet all participants actively contributed to the ontology. The SWO’s approach relied on the leadership of the moderators and organisers of the project during meetings with users. The organisers were familiar with, but not experts in, Agile techniques; this suggests that the techniques may be easily adopted.

An element of the workshops that appeared to be particularly successful was that of conflict resolution with regard to estimating the effort required to add features. One of the benefits of the planning poker game was that it appeared to allow for both an independent estimate of effort and then, if estimates were individually far apart, discussion as a group to resolve those discrepancies. Several such discrepancies occurred during the initial SWO workshop, often as a result of disagreement in exactly what adding such a feature would require. Discussion then helped to come to a common understanding and a re-estimate was taken, often with closer agreement.

Deciding what was to be included in the SWO was a collaborative effort using the buy a feature game. The vast majority of features were not able to be bought by any one individual and thus co-operation in bidding was required. There were some features on which much of the ‘money’ spent on that feature was done so by one individual in an attempt to have it ‘accepted’, despite no one else bidding for that feature. Some features may be important for one domain but less so for another; such features may be ‘show-stoppers’ for that community and this could be taken into account in the process. An example of this was the ‘algorithm’ feature which was bid for by one person with a large amount of money, but not
bought outright as it was deemed less important to almost all other users.

This feature buying process helped to reduce development time on areas that were collectively deemed unimportant. This is of particular importance in the life science domain, in which the scope is vast. Prioritising areas of importance should provide a cost benefit.

Although one of the Agile principles was to encourage self-organised and cross-functional teams, this would not have been possible without the input from the project’s organisers. Skills in group organisation, team building, and responsive feedback were as important as the method implemented. Experience of community ontology building was also a factor that allowed the organisers to prevent and resolve conflict during the requirements gathering. The cyclical nature of the method allowed for a continuous feedback process to all participants and presented working versions of the ontology as requirements and competency questions were refined.

One area of the method that was under-employed was the ‘persona’. Personas are used in Software Engineering with the purpose of validating specific user interacting features such as user interfaces. Requirements not only cover features but also cover the needs of stakeholders that may or may not be explicitly known to the users. SWO could have increased the scope of the ontology by combining use cases and personas, and by utilising both as assets to the different phases of the project such as during Planning Poker.

Consistency of modelling was achieved in SWO by the establishment and enforcement of standards, the ontology engineers’ communication and participation in requirements gathering and prioritising sessions, and effectively using communication tools such as the SWO blog to resolve modelling conflicts and questions.

These observations suggest that participants, users, and engineers valued open communication and collaboration throughout the project and benefited from a mutual understanding of each other’s background and motivation. The commitment to shared values and trust in a methodology are social questions that are hard to quantify and reproduce in all projects, but they appeared to help make this instance of the agile method work.

5 CONCLUSION
This paper presented a reflection on one project’s experience of applying an agile method to ontology development in the Software Ontology (SWO). The success or failure of the adoption of Agile methods in Ontology Engineering has much to do with the delivery of such methods under an organised, flexible, responsive and collaborative social and technical environment. The technical tools employed to deliver collaborative ontologies must be complemented with organisation, responsive feedback, and transparency of process. The SWO project should be judged not only on the coverage and consistency of its knowledge representation, but on the active participation of its members in making it a maintainable ontology grounded in its community’s needs. Such methods should be important to the bio-ontology community, where user participation is key in building community driven ontologies that are relevant, adaptable to change and therefore more likely to be used. In addition, the agile method provides objective, documented evidence that an ontology meets a user’s needs. Requirements are more than capturing the entities and their relationships that exist in a given domain, since for most domains, this is not feasible and can introduce ontologies that are unnecessarily complex. The features of interest should be prioritised with the users’ needs in mind, but also with the developing team in mind, since person power and money are limited; setting out to represent all of reality is infeasible in most cases and often unnecessary. Our experience from SWO is that an agile approach to ontology authoring could deliver what a biomedical community needs by making ontology authoring agilely responsive to users.

6 ACKNOWLEDGEMENTS
This work was funded by JISC. We would like to thank all the participants who gave their time to the SWO project.

REFERENCES
Alexander, P., Nyulas, C., Tudorache, T., Whetzel, P., Noy, N., and Musen, M. (2011). Semantic infrastructure to enable collaboration in ontology development. In 2011 International Conference on Collaboration Technologies and Systems (CTS), pages 423–430.
Auer, S. and Herre, H. (2007). RapidOWL: an agile knowledge engineering methodology. In I. Virbitskaite and A. Voronkov, editors, Perspectives of Systems Informatics, volume 4378 of Lecture Notes in Computer Science, pages 424–430. Springer Berlin / Heidelberg.
Brown, A. (2011). An overview of SWO. http://softwareontology.wordpress.com/2011/02/23/an-overview-of-sword/.
Garcia, A., O’Neill, K., Garcia, L. J., Lord, P., Stevens, R., Corcho, O., and Gibson, F. (2010). Developing ontologies within decentralised settings. In H. Chen, Y. Wang, and K.-H. Cheung, editors, Semantic e-Science, volume 11 of Annals of Information Systems, pages 99–139. Springer US.
Grenning, J. (2002). Planning poker or how to avoid analysis paralysis while release planning.
Kirk, G. (2011). Democracy unleashed: Bringing agility to citizen engagement. In AGILE Conference 2011, pages 209–215.
Martin, R. (2002). Agile Software Development, Principles, Patterns and Practices. Prentice Hall.
Noy, N. F., Shah, N. H., Whetzel, P. L., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D. L., Storey, M.-A., Chute, C. G., and Musen, M. A. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 37(suppl 2), W170–W173.
Randall, D., Procter, R., Lin, Y., Poschen, M., Sharrock, W., and Stevens, R. (2011). Distributed ontology building as practical work. Int. J. Hum.-Comput. Stud., 69, 220–233.
Rector, A. L. (2003). Modularisation of domain ontologies implemented in description logics and related formalisms including OWL. In Proceedings of the 2nd International Conference on Knowledge Capture.
Simperl, E., Machol, M., and Burger, T. (2010). Achieving maturity: the state of practices in ontology engineering in 2009. Int Journal of Computer Science and Applications, 7, 4565.
Tempich, C., Pinto, H. S., and Staab, S. (2006). Ontology engineering revisited: an iterative case study with DILIGENT. In Proc. of the 3rd European Semantic Web Conference (ESWC 2006), pages 110–124.
The Gene Ontology Consortium (2010). The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Research, 38, D331–D335.
Tudorache, T., Vendetti, J., and Noy, N. (2008). Web-Protégé: A Lightweight OWL Ontology Editor for the Web. Fifth OWLED Workshop on OWL: Experiences and Directions.