=Paper=
{{Paper
|id=None
|storemode=property
|title=An Organizational Environment for in Silico Experiments in Molecular Biology
|pdfUrl=https://ceur-ws.org/Vol-737/paper4.pdf
|volume=Vol-737
|dblpUrl=https://dblp.org/rec/conf/red/LinLSML11
}}
==An Organizational Environment for in Silico Experiments in Molecular Biology==
An organizational environment for in silico
experiments in molecular biology
Yuan Lin1 , Marie-Angélique Laporte1,3 , Lucile Soler4 , Isabelle Mougenot1,2 ,
and Thérèse Libourel1,2
1
LIRMM, UMR5506 CNRS-UM2, 161, rue Ada, 34095 Montpellier, Cedex 5, France
firstname.lastname@lirmm.fr
2
UMR ESPACE DEV IRD-UM2, 500 rue J.F. Breton, 34093 Montpellier, Cedex 5,
France
firstname.lastname@univ-montp2.fr
3
Centre d’Ecologie Fonctionnelle et Evolutive, UMR5175 CNRS, 1919, route de
Mende, 34293 Montpellier, Cedex 5, France
firstname.lastname@cefe.cnrs.fr
4
CIRAD-PERSYST, Campus International de Baillarguet, 34398 Montpellier cedex
5, France
firstname.lastname@cirad.fr
Abstract. Molecular biologists, just like geneticists, make use of various
experimental mechanisms and devices to conduct research and to validate
or invalidate their theories or initial hypotheses. Mechanisms powered
by information technology, called in silico, put data and analysis tools
at the centre of the experiments, and are thus different from in vivo, ex
vivo and in vitro mechanisms.
Multiple resources (data sources as well as analysis tools) are widely
available and, very often, allow various modes of operation, requiring cer-
tain expertise for their optimal use. This is especially true when drawing
up complex analysis scenarios based on the sequential use of appropri-
ate processing tools. To facilitate the construction of these experimenta-
tion mechanisms, we propose a scientific workflow infrastructure which
uses an organizational environment to allow abstract planning of the ex-
perimentation, followed by its concretization. The concretization phase
includes a verification of the conformity of the planned process chains
composition to avoid any error during execution.
Keywords: Scientific workflow, analysis pipeline, specification language, val-
idation aspects of service composition.
1 Introduction
Life sciences often rely on the chaining of data and application resources to
express the experimentation process. Valuable resources for biology, while avail-
able in ever-increasing quantities, remain, for the most part, cost-expensive and
time-consuming to acquire and thus their reuse becomes almost a necessity.
To design these complex experiments, scientists often need to locate suitable
resources and then to organize or reorganize them. In addition, each experiment
deserves to be saved so that it can be re-executed several times, either in various
different configurations or with diverse test data. In such a context, the use of
a scientific workflow proves to be an invaluable help. Several dedicated software
applications for this purpose now exist, most notably in the financial sector, and
research in the field is relatively advanced. A first study [7] presented our ap-
proach based on the concept of the scientific workflow environment. Its objective
is to help the user to:
– design experimentation process chains (in as abstract a manner as possible),
– better organize resources (data and processes) which will be elements in the
concretization of these process chains,
– capitalize on the existing by constructing new processes from previously
devised experimentation plans.
This article develops our research advances in terms of resource organiza-
tion and semi-automatic verification of validity of workflows designed within a
prototype.
This article is structured as follows: section 2 presents a brief state of the
art, section 3 proposes an architecture for implementing a scientific workflow and
section 4 provides a glimpse of the organization brought about. Section 5 covers
the proposed verification of conformity, section 6 illustrates with an example
the validation of conformity of a concrete process chain, and section 7 presents
perspectives in progress.
2 State of the art
A study was conducted based on characteristics we deemed relevant [8]:
– The existence of a meta level for describing and creating process chains. In
fact, the generic aspect conferred by meta-modelling appears to be funda-
mental for all of us.
– Taking the experimental aspect into account. The unique characteristics of
scientific data and processes should show through at the formalism level.
We present here only two representative projects, Kepler[1] and Taverna[6],
which gain a certain amount of popularity among workflow scientists.
2.1 KEPLER
KEPLER 5 is a complete scientific workflow environment based on the Ptolemy
II platform of the University of Berkeley. As far as process chains are concerned,
5
http://kepler-project.org/
KEPLER adopts a human organization metaphor. It is Actor-Based and con-
siders all components of a process chain as actors. Actors (services) are accessed
via a structure corresponding to the business ontology of the concerned domain.
The workflow is represented using a graphical language in the form of a
graph linking ports (input/output parameters) of actors via channels. One or
more actors in charge, Directors, plan tasks for other actors of the organization;
they do so based on the available ontology. The execution plan of a process chain
(or a portion of a process chain) is therefore created by a Director of the system.
Any necessary adaptations are achieved by intermediary sender and receiver
programs, which ensure the compatibility of data transferred over a channel.
The process chain is saved in the form of MoML (Modelling Markup Language)
files. (MoML is an XML-based language.) At the environment-interface level, a
specific zoom feature is associated with the concept of an opaque actor (cf. figure
1). An opaque actor appearing in a process chain can be opened, thus revealing
its constituent details.
Fig. 1. Overview of a process chain in the KEPLER environment
2.2 Taverna
Taverna is a workflow project created by the my Grid team in England and used
mainly in the life sciences. A workflow in Taverna is considered as a process
graph in which processes are connected by data links or control links. Processes
used are essentially web services (which can be supplemented by local libraries,
manuscript scripts, etc.). During process composition, the user manually couples
input/output parameters of web services or invokes shim services, specific adap-
tors existing from couplings constructed and tested for experiments. In addition,
the process chain is saved in the form of a SCUFL (Simple Conceptual Unified
Flow Language) file. (SCUFL is an XML-based language.)
Fig. 2. A concrete workflow in Taverna (taken from the myExperiment Taverna sharing
site)
2.3 Other related works of interest
The Taverna and Kepler projects both provide generic models for instantiation
and composition of services. Additionally, some other approaches are also highly
relevant to scientific workflow management:
– The project BioMoby [17], as a first attempt to assist process chaining by
using scientific resources, which are described and classified in the MOBY
Central.
– PISE ans its revised system Mobyle [18] that provides a web environment
(a Web Portal) to define and execute bioinformatics analyses. Registered
analysis programs are pre-classified in a hierarchy, as well as some frequently-
used workflows. Experts can easily find them by using the search function
panel that is integrated in the web site.
– The project ProtocolDB [19] proposed to model scientific workflows at two
different layers (design protocol/ implementation protocol). An implementa-
tion protocol for a given design protocol is realized by mapping design tasks
to different implementation tasks (scientific resources like database queries/
tools), and by connecting them together.
– In [20–22], scientific workflow modeling is supported by resource discovery
approaches.
In this manuscript we focus mainly on scientific workflows and the way they
are modeled and implemented. Our proposal introduces an additional level of
abstraction, whose purpose is to describe the business domain prior to creat-
ing the process chains. This additional modelling level is predicted to facilitate
the construction of process chains by allowing biologists to use their expertise
of their domain, but without requiring them to have expert and often precise
knowledge of the underlying resources and their locations. It also plays the role
of a prescription model, to which instantiation and service composition models
have to conform.
3 Workflow architecture
Our efforts have been guided by the business point of view, that of the experi-
menters. Designing an experimental protocol corresponds to general model with
three stages: 1) Definition: abstract definition of a process chain corresponding
to an experimentation sequence (planning the experiments), 2) Instantiation:
a more specific definition after identifying the various elements of the chain
(data/processes), 3) Execution: customized execution (according to strategies
corresponding to the requirements).
Based on this experimental life cycle, and inspired by the architectural styles
proposed by OMG [11], we propose the following 3-level architectural vision (cf.
figure 3):
Language used to define a Workflow Meta-Model
workflow business model
Conforms
Business description
of the process chain Business model Business model ..... Business model Static
Instance of
......... Intermediate
Model instantiated Instantiated model Instantiated model
from a business model
Choice of the Dynamic
execution strategy Centralized / Decentralized execution
Fig. 3. 3-level architecture of a workflow component
The static level concerns the design phase. It is a matter of constructing
(abstract) business-process models using a simple language. The intermediate
level represents an instantiation and pre-verification phase. Using the business
process model, the user constructs the real process chain by selecting and locating
the processes and data most appropriate to the planned experimentation. The
pre-verification is semi-automatized (cf. section 4). The dynamic level concerns
the actual execution phase. It takes place based on the various strategies defined
by both the user and the operational configurations.
The static level has been studied in some detail in our [7, 8]. We have ana-
lyzed various language standards such as UML (activity diagram) [9] and SPEM
[10], as also various existing projects such as BioSide [5], Meta-model WDO-It!
[12] and CIMFlow [4]. Following this study, we proposed a simple but complete
language. It is based on a language defined by a meta-model whose abstract
elements, tasks or processes, are connected by unidirectional links and by the
intermediary of ports. To facilitate the manipulation of abstract process chains,
a corresponding graphical language was created within a prototype (cf. the top
part of the figure 4). By using this workflow definition language, a simple exam-
ple is modelled and shown in the lower part of the figure 4 6 .
Atomic task Role Data Port (parameter) Data link
Task page
page data
Role Data
Abstract model page
page
Visualization Image
1
Protein
page
page Similarity search page
page
Alignment
sequence page
2 Tree reconstruction page
Tree
Fig. 4. Some essential elements of our graphical language and a simple example
We currently focus on the intermediate level, which consists of two essential
stages:
– instantiation of the abstract model with existing resources (data/processes);
– validation of the concrete model instantiated from the organizational envi-
ronment.
4 Organizational environment
To carry out the experimental protocols, the abstract model instantiation stage
consists of finding and reusing existing resources. To facilitate this search, we
base ourselves on the concept of organizational environment. This environment
relies on the description of resources (data and processes) in the form of metadata
6
This example is also used in the later sections, we will explain it in detail during the
following sections.
(expressed in XML schema format). The resource descriptions are hierarchized
in resource categories and in concrete resources. As shown in figure 5, it consists
of:
– an organization relating to processes. It manages the hierarchy of descrip-
tions of process categories and of concrete processes. The concept of Con-
verter corresponds to the concept of a specific process responsible for adapt-
ing data between different formats of the same data category.
– an organization relating to data. It manages a hierarchy of descriptions of
data categories, of concrete data and of the various associated data formats7 .
Environment
1 1
Organization of Organization of
processes data
specification
specification *
* * * *
* InputCategories * Data linked to Concrete
Process linked to Concrete
category 1 * data
category process * OutputCategories *
1 *
* * is in 1
format sub format
1 1
InputFormats * 1
Normal
Converter Data format
process *
OutputFormats *
Fig. 5. Organizational environment
To illustrate this concept of the environment, we take an example from the
world of molecular biology (cf. figure 6). The upper part of each hierarchy (pro-
cesses and data) represent a set of categories (shown as ovals) sorted according
to the generalization/specialization relationship. The descriptions of concrete
resources (data or processes) are then associated to their category.
The description of a concrete data describes its format, whereas that of a
concrete process corresponds to its signature, which we formalize thus:
Definition 1. Formalized signature of a concrete process
Name (Input parameter list) : (Output parameter list), where each parameter is
described by the doublet (Data category : data format).
A set of data formats (Fasta, xml, MultiFasta, Clustal, Newick, Jpeg) is also
presented. Figure 6 is therefore complemented by the description of signatures
of some example concrete processes:
7
Remark: It should be noted that several data categories can share the same format.
Blastp(ProteinSeq:Fasta) : (SeqPairs:xml)
ClustalW (ProteinDataBank:MultiFasta) : (MultipleAlignment:Clustal)
InteractiveSelection(SeqPairs:xml) : (ProteinDataBank:MultiFasta)
Logo(MultipleAlignment:Clustal) : (Image:jpeg)
PhyML(MultipleAlignment:Clustal) : (PhylogeneticTree:Newick)
Fig. 6. Illustration of an organizational environment in a biological context
5 Conformities
5.1 The problem
As already mentioned, the second important stage of the intermediate level con-
sists of validating the concrete model instantiated from the abstract model.
Let us take an example described by using the workflow language, corre-
sponding to an abstract process chain model that a biologist designs with the
intention of characterizing a protein sequence which interests him in the context
of his putative functional domains.
At the concrete level, the idea is to begin by using the Blast similarity-search
tool to compare the protein sequence under consideration with a data bank of
protein sequences and to thus identify segments with high similarity shared
both by the protein sequence under consideration and by various sequences in
the sequence data bank. These similar segments indicate the possible presence
of functional domains. The biologist then continues his study by reusing the
results output from the Blast tool [2], either to construct a phylogenetic tree
and retrace the evolutionary history of the sequence via the PhyML tool [3] or
to display the preserved positions common to all the similar segments via the
Logo tool [13]. This simplified example of a process chain in molecular biology
allows us to highlight the difficulties encountered by the biologist in using the
results output by one tool as input to another tool. The difficulties relate, at the
same time, to the nature of the data (here characterized as data category), to the
format of this data, and, finally, to the biologists expertise. In the example, we
make willing use of the discrepancy which arises between the Blast tool, which
outputs a collection of simple alignments, and the PhyML and Logo tools, which
require multiple alignments to run. In fact, Blast leads to multiple discrepancies
two-by-two, involving the sequence under consideration and one of the sequences
from the sequence data bank which is similar to it; whereas PhyML and Logo
use the shared similarity by a set of sequences which includes the sequence under
consideration. This example highlights what we will subsequently term semantic
incompatibility.
In its upper part, the figure 7 shows the abstract process chain and in
the lower the concrete chain obtained after locating data descriptions S1 and
adapted processes Blastp and PhyML. The problem which we designate as one
of validation of the instantiated (concrete) model consists of verifying the com-
patibility of each composition. A composition corresponds to the link between an
output parameter p1 of a process T and an input parameter p2 of the process
following T; we denote it (p1 → p2).
5.2 Identifying situations of compatibility
Verification is undertaken by analyzing the signatures of linked processes. To do
so, we have to take two important aspects into account:
– The syntactic aspect : relating to the data formats used by the parameters.
Fig. 7. Problem at hand
– The semantic aspect : relating to the processs functionality. It not only de-
pends on the processs name but also on the signification of the input/output
parameters.
For two processes T1(dc1:fo1) : (dc2:fo2, dc3:fo3) and T2(dc4:fo4) : (dc5:fo5),
let us suppose that there exists a composition, denoted p1→p2, between the p1
(dc3:fo3) output parameter of process T1 and the p2 (dc4:fo4) input parameter
of process T2.
Syntactic and semantic compatibilities are defined as follows:
Definition 2. Syntactic compatibility
p1 → p2 is syntactically compatible if (fo3 = fo4) ∨ (fo3 is a sub-format
Syn
of fo4), denoted p1 → p2. Two parameters are syntactically compatible if they
use the same data format or if they use an output format which is a sub-format
Syn
of the input format. Else p1 9 p2.
Definition 3. Semantic compatibility
p1 → p2 is semantically compatible if (dc3 = dc4) ∨ (dc3 is a sub-category
Sem
of dc4), denoted p1 → p2. Two parameters are semantically compatible if they
use the same category, or if they use an output category which is a sub-category
Sem
of the input category. Else p1 9 p2.
The verification of a compositions compatibility is thus done at two levels:
syntactic and semantic. Three types of situations can arise:
Sem Syn
– Situation 1 (p1 → p2) ∧ (p1 → p2): p1 and p2 are compatible at the
semantic and syntactic levels. This is the ideal situation in our context; we
designate it as valid.
Sem Syn
– Situation 2 (p1 → p2) ∧ (p1 9 p2): p1 and p2 are compatible at the
semantic level but not at the syntactic level. The composition is syntactically
adaptable. An adaptation between the two data formats will be necessary
(cf. converters).
Sem
– Situation 3 p1 9 p2 : The two parameters are not semantically compatible.
In such a case, it is pointless to proceed to verify their syntactic compatibility
(in fact, for us, two parameters with different significations cannot be paired).
The composition is semantically adaptable.
From these definitions, we develop our proposed approach for resolving the
incompatibilities.
6 Validation of the experimental chain
Of the three compatibility situations identified, the latter two require an adap-
tation stage before going on to the execution phase. It is a matter of finding one
or more intermediate processes which can overcome the compositions incompat-
ibility. For situations 2 and 3, two types of adaptations are proposed:
– semantic adaptation (for situation 3). The incompatibility of situation 3
represents the case where the two parameters of a composition use incom-
patible data categories. The adaptation here consists of finding a possible
intermediate process chain between these two categories.
– syntactic adaptation (for situation 2). In situation 2, where the composition
is already semantically compatible, the problem can be expressed as a di-
vergence between the data formats used by the two connected parameters.
All that is required is to find converters to convert one data format into the
other.
These adaptations are based on the organizational environment. The search
for intermediate processes can be equated to a search for itineraries between
two incompatible data categories or formats. We will illustrate this using the
example and the organizational environment constructed earlier (cf. figure 6).
Let us consider again the previous example. The verification conducted on
the instantiation of the abstract model detects a semantic incompatibility in the
composition between Blastp and Logo or between Blastp and PhyML due to dif-
ference in categories Pairs of sequences and Multiple Alignment (Incompatibility
situation 3 ). The (semantic) adaptation will be applied; it consists of finding in
what we call the (semantic) resource graph the path allowing the conversion of
categories.
The construction of the (semantic) resource graph consists of extracting,
from the organizational environment, the descriptions of processes and of data
categories referenced by their parameters. Such a (semantic) resource graph gen-
erated from the environment described in the figure 6 is shown in the figure 8.
A graph traversal algorithm is used to find all the possible paths between
the two concerned data categories (Pairs of sequences and Multiple Alignment).
A single path is found in the graph: Pairs of sequences → InteractiveSelection
→ ProteinDataBank → ClustalW → Multiple Alignment. The two processes,
InteractiveSelection and ClustalW, will therefore be added to the incompatible
chain (cf. figure 9).
Fig. 8. (Semantic) resource graph generated from the organizational environment of
the figure 6
MultipleAlignment : Clustal Image : jpeg
SeqPairs : XML 1 Logo page
page
Image
Protein sequences
page Blastp Pairs of sequences
page
page
page
(Fasta) (XML)
2 PhyML page
page
Tree
ProteinSeq : Fasta
MultipleAlignment : Clustal Tree : Newick
MultipleAlignment : Clustal Image : jpeg
ProteinDataBank : XML MultipleAlignment : Clustal Logo page
page
Image
Pairs of sequences Interactive
... ... page
page ClustalW
(XML) selection
SeqPairs : XML ProteinDataBank : MultiFasta PhyML page
page
Tree
MultipleAlignment : Clustal Tree : Newick
Fig. 9. Semantic adaptation
Once this adaptation is done, there still remains the existing syntactic in-
compatibility of the composition between the InteractiveSelection and ClustalW
processes because even though InteractiveSelection outputs the same data cate-
gory that is accepted for input by ClustalW, their data formats are different (xml
and MultiFasta). Syntactic adaptation consists of finding specific converters, or
compositions of converters, necessary for these conversions. We will not cover
this stage in detail; it is simply enough to understand that converters (or their
composition) can be added to obtain the required validity.
7 Conclusion and perspectives
A prototype (http://www.lirmm.fr/ lin/project/) illustrating the key aspects of
our approach for designing and validating scientific process chains is currently
being developed. This prototype serves as a basis for an inductive experimen-
tal approach using data of BAC and EST nucleic sequences as well as physical
and genetic maps for identifying and characterizing genetic markers relating to
sex of the Nile tilapia (Oreochromis niloticus). Over a longer term, we intend
to integrate the current prototype into a platform with a search engine based
on resource descriptions to be able to undertake the execution using real re-
sources, after requisite validation of experimentation chain. It will eventually
also use open-source controlled vocabularies such as PFO (Protein Feature On-
tology)[14], SO (Sequence Ontology)[15], and GO (Gene Ontology)[16] to enrich
data categories by additional representations and thus extend the descriptive
capacities of the organizational environment.
References
1. I. Altintas, B. Ludäscher, S. Klasky, and M. A. Vouk. S04 - introduction to scientific
workflow management and the kepler system. In SC, page 205, 2006.
2. S. Altschul, W. Gish, W. Miller, E. Myers and D. Lipman. Basic local alignment
search tool. In Journal of Molecular Biology, vol 215, pages 403-410, 1990.
3. S. Guindon and O. Gascuel. A simple, fast, and accurate algorithm to estimate large
phylogenies by maximum likelihood, in Systematic Biology, vol 52, pages 696-704,
2003.
4. L. Haibin, F. Yushun, CIMFlow: A Workflow Management System Based on Inte-
gration Platform Environment. In Proceedings of 7th IEEE International Confer-
ence on Emerging Technologies and Factory Automation. Barcelona : ETFA, 1999:
187-193.
5. M. Hallard & al. Bioside : faciliter l’acceès des biologistes aux ressources bio-
informatiques, JOBIM, Montreéal 2004, p 64.
6. D. Hull, K. Wolstencroft, R. Stevens, C. A. Goble, M. R. Pocock, P. Li, and T.
Oinn. Taverna: a tool for building and running workflows of services. Nucleic Acids
Research, 34(Web-Server-Issue):729732, 2006.
7. T. Libourel, Y. Lin, I. Mougenot, C. Pierkot, JC. Desconnets, A Platform Ded-
icated to Share and Mutualize Environmental Applications. Proceedings of 12th
International Conference on Enterprise Information Systems, Madere, 2010.
8. Y. Lin, T. Libourel, I. Mougenot, A Workflow Language for the Experimental
Sciences, Proceedings of 11th International Conference on Enterprise Information
Systems, Milan, 2009.
9. Object Management Group (OMG), OMG Unified Modeling LanguageTM (OMG
UML), Infrastructure Version 2.3. OMG Document Number: formal/2010-05-03.
10. Object Management Group (OMG), SPEM - Software & Systems Process
Engineering Meta-Model Specification, Version 2.0. OMG Document Number:
formal/2008-04-01.
11. Object Management Group (OMG), Meta Object Facility (MOF) Core Spec-
ification OMG Available Specification Version 2.0, OMG Document Number:
formal/06-01-01.
12. P. Pinheiro da Silva, L. Salayandia, A.Q. Gates, WDO-It! A Tool for Building Sci-
entific Workflows from Ontologies (2007). Departmental Technical Reports (CS).
Paper 201.
13. T. D. Schneider and R. M. Stephens, Sequence Logos: A New Way to Display
Consensus Sequences. In Nucleic Acids Res., vol 18, pages 6097-6100, 1990.
14. G.A. Reeves, K.Eilbeck, M.Magrane, C.O’Donovan, L.Montecchi-Palazzi, M.A.
Harris, S.E. Orchard, R.C. Jimenez, A.Prlic, T. J. P. Hubbard, H.Hermjakob,
J.M. Thornton. The Protein Feature Ontology: a tool for the unification of protein
feature annotations. In Bioinformatics, vol 24, pages 2767-2772, 2008.
15. K.Eilbeck, S.E Lewis, C.J Mungall, M.Yandell, L.Stein, R.Durbin, M.Ashburner.
The Sequence Ontology: a tool for the unification of genome annotations. In
Genome Biology, vol 6, pages R44, 2005.
16. M.Ashburner, C.A. Ball, J.A. Blake, D.Botstein, H.Butler, J. Michael Cherry, A.P.
Davis, K.Dolinski, S.S. Dwight, J.T. Eppig, M.A. Harris, D.P. Hill, L.Issel-Tarver,
A.Kasarskis, S.Lewis, J.C. Matese, J. E. Richardson, M.Ringwald, G.M. Rubin,
G.Sherlock, Gene ontology: tool for the unification of biology. The Gene Ontology
Consortium. In Nature Genetics, vol 25, pages 25-29, 2000.
17. Michael DiBernardo, Rachel Pottinger, Mark Wilkinson: Semi-automatic web ser-
vice composition for the life sciences using the BioMoby semantic web framework.
Journal of Biomedical Informatics 41(5): 837-847 (2008).
18. Bertrand Néron, Hervé Ménager, Corinne Maufrais, Nicolas Joly, Julien Maupetit,
Sébastien Letort, Sébastien Carrère, Pierre Tufféry, Catherine Letondal: Mobyle: a
new full web bioinformatics framework. Bioinformatics 25(22): 3005-3011 (2009).
19. Michel Kinsy, Zoé Lacroix, Christophe Legendre, Piotr Wlodarczyk, Nadia Yacoubi
Ayadi: ProtocolDB: Storing Scientific Protocols with a Domain Ontology. WISE
Workshops 2007: 17-28
20. Zoé Lacroix: Resource Discovery, Second International Workshop, RED 2009,
Lyon, France, August 28, 2009. Revised Papers Springer 2010.
21. Zoé Lacroix, Cartik R. Kothari, Peter Mork, Rami Rifaieh, Mark Wilkinson, Ju-
liana Freire, Sarah Cohen Boulakia: Biological Resource Discovery. Encyclopedia
of Database Systems 2009: 220-223.
22. Nadia Yacoubi Ayadi, Zoé Lacroix, Maria-Esther Vidal: A Deductive Approach
for Resource Interoperability and Well-Defined Workflows. OTM Workshops 2008:
998-1009.