<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A UML Profile for Functional Modeling Applied to the Molecular Function Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patryk Burek</string-name>
          <email>patryk.burek@imise.uni-</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Loebe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heinrich Herre</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Institute, University of Leipzig</institution>
          ,
          <addr-line>Augustusplatz 10, 04109 Leipzig</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig</institution>
          ,
          <addr-line>Haertelstrasse 16-18, 04107 Leipzig</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Gene Ontology (GO) is the largest, and steadily growing, resource for cataloging gene products. Naturally, its growth raises issues regarding its structure. Modeling and refactoring big ontologies such as GO is far from simple. Human-friendly graphical modeling languages, such as the Unified Modeling Language (UML), could be helpful for that task. In the current paper we investigate whether UML can be utilized for making the structural organization of the Molecular Function Ontology (MFO), a sub-ontology of GO, more explicit. In addition, we examine whether and how using UML can support the refactoring of MFO. We utilize UML and its extension mechanism to define a UML dialect, called Function Modeling Language (FuML), which is suited for modeling functions. Next, we use FuML for capturing the structure of molecular functions. Finally, we propose and demonstrate some refactoring options for MFO.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        The Molecular Function Ontology (MFO) is a sub-ontology of the
Gene Ontology (GO), the largest and steadily growing resource
for cataloging gene products. In 2000 GO contained fewer than 5,000
terms; in 2003 it contained 13,000
        <xref ref-type="bibr" rid="ref10">(Gene Ontology Consortium, 2004)</xref>
        , in 2010
it exceeded 30,000
        <xref ref-type="bibr" rid="ref9">(du Plessis et al., 2011)</xref>
        , whereas at the beginning
of 2015 its size is above 42,000 terms. The growth of the ontology
leads to a suboptimal structure
        <xref ref-type="bibr" rid="ref9">(du Plessis et al., 2011)</xref>
        , which
motivates refactoring initiatives such as
        <xref ref-type="bibr" rid="ref1 ref13">(Guardia et al., 2012; Alterovitz
et al., 2010)</xref>
        , in addition to the work of the GO Consortium itself, which
constantly improves and evolves GO. Modeling and
refactoring big ontologies such as GO is a difficult task that
can be supported by a human-friendly representation
format. The serialization formats used for machine processing of
ontologies, such as the OBO flat file format
        <xref ref-type="bibr" rid="ref15">(Horrocks, 2007)</xref>
        or
the Web Ontology Language (OWL)
        <xref ref-type="bibr" rid="ref28">(W3C OWL Working Group,
2009)</xref>
        , are not easy for a human user to read. This motivates the
adoption of human-friendly graphical notations like those used in
software engineering for the task of ontology representation
        <xref ref-type="bibr" rid="ref2 ref20">(Kogut
et al., 2002; Belghiat and Bourahla, 2012)</xref>
        for certain purposes.
      </p>
      <p>
        The de facto standard for graphical conceptual modeling of
software systems is the Unified Modeling Language (UML)
        <xref ref-type="bibr" rid="ref24">(Rumbaugh et al., 2005)</xref>
        , currently developed and maintained by the
Object Management Group (OMG)
        <xref ref-type="bibr" rid="ref23">(Object Management Group,
2014)</xref>
        . UML has great potential for various applications that go
beyond software engineering, among them the modeling of biological
knowledge and biological ontologies
        <xref ref-type="bibr" rid="ref13 ref26">(Shegogue and Zheng, 2005;
Guardia et al., 2012)</xref>
        .
      </p>
      <p>
        UML is well suited for modeling biological systems, not least
due to its rich infrastructure and the available tools. In particular,
the UML built-in extension mechanisms such as stereotypes and
profiles permit the easy construction of domain- or task-specific
UML dialects, e.g., the OBO relations profile
        <xref ref-type="bibr" rid="ref13">(Guardia et al., 2012)</xref>
        .
Numerous tools for UML modeling are available on the market and
can be used out of the box for visualizing biological ontologies as a
whole or in part.
      </p>
      <p>In the present paper we investigate if UML can be utilized for
making the structure of MFO more explicit and if it can support
the refactoring of MFO. We use UML and its extension mechanism
for the definition of a UML dialect, called Function Modeling
Language (FuML), which is suited for function modeling. Next, we use
FuML for modeling the structure of molecular functions. Finally,
we propose and demonstrate some refactoring options for MFO.
</p>
    </sec>
    <sec id="sec-2">
      <title>METHODS</title>
      <sec id="sec-2-1">
        <title>Molecular Function Ontology</title>
        <p>
          Like all GO terms, functions in MFO are specified by id, name,
natural language definition and an optional list of synonyms. For
instance, the function of catalyzing carbohydrate transmembrane
transport is specified by id: GO:0015144; name: carbohydrate
transmembrane transporter activity; definition: catalysis of the
transfer of carbohydrate from one side of the membrane to the other;
synonym: sugar transporter. Additionally, for each function its
relations with other concepts can be captured. The semantics of the
relations that are used for this purpose is provided by serialization
languages such as the OBO flat file format or OWL, and/or by the
OBO relations ontology (RO)
          <xref ref-type="bibr" rid="ref27">(Smith et al., 2005)</xref>
          . In particular,
functions in MFO are organized into a hierarchy by means of the
is_a link from RO; furthermore, they are linked with processes by
the part_of relationship from RO; and in some cases they have
relations with concepts of other ontologies such as ChEBI
          <xref ref-type="bibr" rid="ref8">(Degtyarenko
et al., 2008)</xref>
          . For instance, GO:0015144 is linked, by means of the
RO is_a relation, to its parent functions GO:1901476 carbohydrate
transporter activity and GO:0022891 substrate-specific
transmembrane transporter activity, by means of the RO part_of relation to
the process GO:0034219: carbohydrate transmembrane transport,
and by means of the RO transports_or_maintains_localization_of to
CHEBI:16646: carbohydrate.
        </p>
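        <p>As a minimal sketch, the fields listed above can be held in a small record type. The Python classes Term and Relation below are hypothetical illustrations and not part of GO, OBO or any existing library; the identifiers and values are those of the GO:0015144 example.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A typed link to another term: relation name plus target identifier."""
    kind: str
    target: str


@dataclass
class Term:
    """Hypothetical container for the fields of a GO term described in the text."""
    id: str
    name: str
    definition: str
    synonyms: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def targets(self, kind):
        """Return the identifiers reached via relations of the given kind."""
        return [r.target for r in self.relations if r.kind == kind]


carbohydrate_tm_transporter = Term(
    id="GO:0015144",
    name="carbohydrate transmembrane transporter activity",
    definition=("catalysis of the transfer of carbohydrate from one side "
                "of the membrane to the other"),
    synonyms=["sugar transporter"],
    relations=[
        Relation("is_a", "GO:1901476"),
        Relation("is_a", "GO:0022891"),
        Relation("part_of", "GO:0034219"),
        Relation("transports_or_maintains_localization_of", "CHEBI:16646"),
    ],
)

# List the parent functions of the example term.
print(carbohydrate_tm_transporter.targets("is_a"))
```
        </preformat>
        <p>In this sketch only the relations are machine-readable structure, whereas the definition remains free text, which mirrors the observation that the semantics of MFO functions is largely informal.</p>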
        <p>From the above we see that the semantics of functions in MFO is
provided to a large extent by informal natural language expressions
and partially by relations with other concepts.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Intensional Subsumption</title>
        <p>
          We propose defining the notion of function subsumption, which is a
backbone of MFO, upon an intensional interpretation of the is_a
relation. Typically, in the field of ontology engineering the extensional
aspect of the is_a relation is stressed; in OWL, for instance, A is a
subclass of B if every instance of A is an instance of B. The same
interpretation is used in RO, where is_a is defined by the reference
to the sets of all instances (extensions) of the concepts. According
to this understanding the is_a relation is often called extensional
subsumption, in contrast to its intensional counterpart(s), where we
focus on structural subsumption
          <xref ref-type="bibr" rid="ref29">(Woods, 1991)</xref>
          . Instead of
referring to instances, this type of subsumption is defined based on the
structure of concepts. The latter can be understood as a
composition of conceptual parts by means of various composing relations.
For illustration within GO itself, GO:0005215: transporter activity
is justified to intensionally subsume GO:0022857: transmembrane
transporter activity, because, following
          <xref ref-type="bibr" rid="ref29">(Woods, 1991)</xref>
          , both are
activities and both are (partially) defined by part-of relations, to
GO:0006810: transport and to GO:0055085: transmembrane
transport, respectively, where the latter process is subsumed by the former. Overall, the
main assumption is that concepts are complex structures which
can be organized into a subsumption hierarchy. The reading of
intensional subsumption is similar to inheritance in object-oriented
languages, where one class inherits its structure from another. That
enables the structuring of classes into hierarchies.
        </p>
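        <p>The contrast between the two readings can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a definition taken from RO or OWL: the dictionaries IS_A and PART_OF are toy data restricted to the transporter example, each term has at most one parent, and all function names are hypothetical.</p>
        <preformat>
```python
# Toy data from the transporter example; the dictionaries are illustrative only.
IS_A = {"GO:0055085": "GO:0006810"}  # transmembrane transport is_a transport
PART_OF = {
    "GO:0005215": "GO:0006810",  # transporter activity part_of transport
    "GO:0022857": "GO:0055085",  # transmembrane transporter activity
}


def is_subsumed(child, parent, is_a):
    """Follow single-parent is_a links upward from child until parent is found."""
    current = child
    while current is not None:
        if current == parent:
            return True
        current = is_a.get(current)
    return False


def structurally_subsumes(general, specific):
    """Intensional reading: compare the processes the two functions are part of."""
    return is_subsumed(PART_OF[specific], PART_OF[general], IS_A)


def extensionally_subsumes(general_instances, specific_instances):
    """Extensional reading: every instance of the specific class is an
    instance of the general class."""
    return set(specific_instances).issubset(general_instances)


print(structurally_subsumes("GO:0005215", "GO:0022857"))
```
        </preformat>
        <p>The structural check never inspects instances; it relies only on how the two concepts are composed, which is the point of the intensional reading.</p>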
      </sec>
      <sec id="sec-2-3">
        <title>UML Profiles and FuML</title>
        <p>UML is a graphical modeling language founded on the explicit
distinction between the static and the dynamic views of a system; it
introduces thirteen diagram types, grouped into two sets: structural
modeling diagrams and behavioral modeling diagrams. UML lacks
constructs dedicated to function modeling as such, but it provides
several built-in mechanisms that allow for an easy extension of the
language. Among them are profiles.</p>
        <p>A profile is a light-weight UML mechanism, typically used for
extending the language for particular platforms, domains or tasks. It
specifies a set of extensions of the UML standard metamodel which
include, among others, stereotypes. With stereotypes it is possible
to extend the standard UML vocabulary with new model elements.
A stereotype can be graphically represented by a dedicated icon,
though in the most straightforward form it is represented by a
stereotype name, surrounded by guillemets and placed above the name of
the stereotyped UML element, cf. «Function» in Figure 1.</p>
        <p>
          We used the profile mechanism for developing a UML extension,
called Function Modeling Language (FuML), aimed at
supporting the modeling of functions, function ascription, and function
decomposition. FuML defines 15 stereotypes for representing
functions and function structure, and 8 stereotypes for modeling function
decomposition, subsumption and function dependencies. The full
specification of FuML stereotypes is provided in
          <xref ref-type="bibr" rid="ref5">(Burek and Herre,
2014)</xref>
          . In the remaining part of the current paper we analyze how far
FuML can be used for modeling and refactoring MFO.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>ANALYSIS</title>
      <sec id="sec-3-1">
        <title>Modeling Molecular Functions with FuML</title>
        <p>3.1.1 Functions. FuML enables graphical modeling of functions
in a compact and in an extended form. The compact form is
particularly suited for big models containing many functions, whereas the
extended form is designed for visualizing the dependencies within
the structure of a single function or between several functions.
Figure 1 presents an example FuML model, depicting the
structure of the function GO:0015144: carbohydrate transmembrane
transporter activity. The upper part of the figure presents the
compact notation, whereas the extended notation is shown in the lower
part. The stereotypes utilized in the figure are discussed in the
remainder of the current section.</p>
        <p>
          A function in FuML is interpreted as a role that an entity plays
in the context of some goal achievement, such as a
teleological process. This account of functions is similar to
          <xref ref-type="bibr" rid="ref17">(Karp, 2000)</xref>
          ,
where a biological function of a molecule is described as the role
that the molecule plays in a biological process. In this sense, the
function GO:0015144: carbohydrate transmembrane transporter
activity, defined in GO as “catalysis of the transfer of carbohydrate
from one side of the membrane to the other”, depicts the catalyst
role in the teleological process of transferring carbohydrate from
one side of the membrane to the other. In terms of the structure we
can therefore say that a function specification contains as its part
a specification of a goal achievement, understood as a teleological
entity which is specified in terms of a transformation from an input
situation to an output situation. As presented in Figure 1, a
function is depicted by a UML classifier with a stereotype «Function».
It connects to its goal achievement by an association with a
stereotype «has-goal-achievement» in the extended notation, whereas the
compact notation utilizes the attribute goal_achievement.
        </p>
        <p>3.1.2 Goal Achievements. In FuML, a goal achievement (GA) x
is defined as a category the instances of which are transitions from
certain input situations to output situations. Input and output are
defined as follows:
        </p>
        <p>The input y of a goal achievement x is a situation
category such that every instance of x is a transition starting
from a situation instantiating y.</p>
        <p>The output y of a goal achievement x is a situation
category specifying the situations in which instances of x result:
every instance of x is a transition resulting in a situation
instantiating y.</p>
        <p>For example, the goal achievement carbohydrate transmembrane
transport establishes the input category, the instances of which are
situations of carbohydrate being on the one side of the membrane,
and the output category, the instances of which are situations of
carbohydrate being on the other side of the membrane. This means that
every instance of carbohydrate transmembrane transport exhibits a
transition from an instance of the input category to an instance of
the output category, i.e. from individual situations of carbohydrate
located on one side of the membrane, to individual situations of
carbohydrate located on the other side of the membrane.</p>
        <p>As shown in Figure 1, an input is indicated in the extended
notation by the association with stereotype «has-input», and by the input
attribute of a function in the compact notation. The representation
of outputs is analogous.</p>
        <p>
          Typically, a transformation from an input to an output situation is
a process, and then the GA can be understood as a process category.
In the running example, the GA is a teleological process category,
namely of carbohydrate transfer from one side of the membrane to
the other. This process exhibits the causal transition from the
situation of carbohydrate being on one side of the membrane to the
situation where carbohydrate is on the other side of the membrane.
        </p>
        <p>3.1.3 Mode of Goal Achievement. In some cases the specification
of a function is not reduced to a mere input-output pair, but also defines
constraints on the method of function realization. For example, the
molecular functions GO:0015399: primary active transmembrane
transporter activity and GO:0015291: secondary active
transmembrane transporter activity share the same input: solute is on one side
of the membrane, and the same output: solute is on the other side of
the membrane. Therefore, the pure input-output views of the
functions are equal. However, they are distinct due to the way in which
they achieve the goal. The former function is realized by means of
some primary energy source, for instance, a chemical, electrical or
solar source, whereas the latter relies on a uniporter, symporter or
antiporter protein. Thus the functions give the same
answer to the question of what is to be achieved, but they
give different answers to the question of how it is realized. In order to represent
this distinction, in FuML we introduce another component of
function structure, called Mode of Goal Achievement. The mode x of
the goal achievement y specifies the way in which y transforms the
input to the output situation. For GO:0015399 the mode is: some
primary energy source, for instance chemical, electrical or solar
source, and for GO:0015291 it is: uniporter, symporter or antiporter
protein. The mode is a constraint on the function realization, which
does not affect the input or the output. For example, if one adds
to the function of transmembrane transport the constraint that the
transport should be realized by the uniporter protein then the input
and the output remain unchanged. However, the function as such
changes in that not every transportation process realizes it, but only
those that are driven by a uniporter protein.
        </p>
        <p>3.1.4 Participants. Often goal achievements are expressed by
action sentences of natural language and thus the results of linguistic
analysis of action sentences can be applied to the analysis of the
structure of goal achievements. In linguistics, the role that a noun
phrase plays with respect to the action or state described by the verb
of a sentence is called a thematic role
          <xref ref-type="bibr" rid="ref14">(Harley, 2010)</xref>
          . The
specifications of molecular functions in MFO often contain two thematic
roles – a patient (called an operand in FuML) and an actor (called a
doer in FuML). An operand indicates the entity undergoing the
effect of the action. We say that y is an operand of the goal achievement
x if y is a category such that instances of x operate on instances
of y. For example, GO:0015144 operates on (transports) carbohydrate.
        </p>
        <p>A doer is not as common in MFO as an operand. For example,
in the discussed carbohydrate transmembrane transport function no
doer is indicated. Typically, a doer is a part of the GA in cases
where the mode of realization is provided. For instance, the
functions GO:0015292 uniporter activity and GO:0015293 symporter
activity both specify the mode of realization and each indicates its
doer, namely the respective protein.
</p>
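        <p>The components introduced in this section (input, output, mode, operand and doer) can be summarized in a small data structure. The following Python sketch is only an illustration: the class and field names are hypothetical and do not reproduce the FuML stereotypes, and the operand value "solute" is an assumption read off the input description of the two functions.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GoalAchievement:
    """Category of transitions from an input situation to an output situation."""
    input: str
    output: str
    operand: Optional[str] = None  # category the transitions operate on
    doer: Optional[str] = None     # category of the acting entity, if specified
    mode: Optional[str] = None     # constraint on how the goal is achieved

    def io_view(self):
        """Return the pure input-output view, ignoring participants and mode."""
        return (self.input, self.output)


@dataclass(frozen=True)
class Function:
    """A function connected to its goal achievement, as in the compact notation."""
    id: str
    name: str
    goal_achievement: GoalAchievement


def solute_transfer(mode):
    """Build the shared solute-transfer goal achievement with the given mode."""
    return GoalAchievement(
        input="solute is on one side of the membrane",
        output="solute is on the other side of the membrane",
        operand="solute",  # assumed from the input description
        mode=mode,
    )


primary = Function(
    "GO:0015399",
    "primary active transmembrane transporter activity",
    solute_transfer("some primary energy source"),
)
secondary = Function(
    "GO:0015291",
    "secondary active transmembrane transporter activity",
    solute_transfer("uniporter, symporter or antiporter protein"),
)

# The two functions agree on what is achieved but differ in how.
same_io = primary.goal_achievement.io_view() == secondary.goal_achievement.io_view()
same_ga = primary.goal_achievement == secondary.goal_achievement
print(same_io, same_ga)
```
        </preformat>
        <p>Keeping the mode as a separate field makes the distinction explicit: comparing only the input-output view cannot tell the two functions apart, whereas comparing the full goal achievements can.</p>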
      </sec>
    </sec>
    <sec id="sec-4">
      <title>PATTERNS OF FUNCTION SUBSUMPTION</title>
      <p>
        Functional subsumption implicitly covers various distinct
relations
        <xref ref-type="bibr" rid="ref6">(Burek et al., 2009)</xref>
        . FuML introduces several
distinct patterns for function subsumption
        <xref ref-type="bibr" rid="ref5">(Burek and Herre, 2014)</xref>
        .
In the following subsections we discuss the application of three of those
patterns to the modeling of MFO.
      </p>
      <p>In FuML the notion of function subsumption is founded on the
subsumption of goal achievements. We say that the function x is
subsumed by the function y if the goal achievement of x is
subsumed by the goal achievement of y. Since goal achievements are
quite complex entities, it is not trivial to answer the question of what
it means that one goal achievement subsumes another. Here,
however, the analysis of GA structure is helpful, which pertains to the
intensional aspects of the corresponding GA category, as discussed
in previous sections. Based on this approach one can detect various
patterns of function subsumption.
</p>
      <sec id="sec-4-1">
        <title>Operand Specialization</title>
        <p>Since function specifications often contain operands, it is very
common to construct a hierarchy of functions on the basis of
the taxonomic hierarchy of their operands. In fact, this pattern
is applied frequently in MFO. Consider, for instance, the
functions GO:0015075: ion transmembrane transporter activity and
GO:0008324: cation transmembrane transporter activity, linked by
the is_a relation in GO. The relation between those two functions
is based on the relation of their operands, as cation is subsumed by
ion. In FuML function subsumption by operand specialization is
depicted with a dependency link with stereotype «operand-spec». The
supplier of the link is the subsumed function and the client is the
subsumer.
</p>
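        <p>A possible check for this pattern is sketched below in Python. It is an illustration under stated assumptions rather than the FuML definition: functions are reduced to dictionaries with an operand and an input-output label, the requirement that both functions share the same input-output pattern is our simplification, and the operand taxonomy contains only the cation/ion link mentioned above.</p>
        <preformat>
```python
# Toy operand taxonomy (child to parent); only the cation/ion link is from the text.
OPERAND_IS_A = {"cation": "ion"}


def specializes(child, parent, taxonomy):
    """True if child equals parent or reaches it by following is_a links upward."""
    current = child
    while current is not None:
        if current == parent:
            return True
        current = taxonomy.get(current)
    return False


def is_operand_spec(sub, sup, taxonomy):
    """Check whether sub specializes sup purely through its operand.

    Assumption of this sketch: both functions share the same input-output
    pattern, and the operand of sub is a proper specialization of that of sup.
    """
    return (sub["io"] == sup["io"]
            and sub["operand"] != sup["operand"]
            and specializes(sub["operand"], sup["operand"], taxonomy))


ION_TRANSPORTER = {"id": "GO:0015075", "operand": "ion",
                   "io": "one side to other side of membrane"}
CATION_TRANSPORTER = {"id": "GO:0008324", "operand": "cation",
                      "io": "one side to other side of membrane"}

print(is_operand_spec(CATION_TRANSPORTER, ION_TRANSPORTER, OPERAND_IS_A))
```
        </preformat>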
      </sec>
      <sec id="sec-4-2">
        <title>Mode Addition</title>
        <p>Another pattern of function subsumption, frequently met in
MFO, is based on modes of goal achievement. Consider two
functions, GO:0022857: transmembrane transporter activity and
GO:0022804: active transmembrane transporter activity. Both
share the same operand, namely substance, as well as the same
input-output pair – operand is on one side of the membrane and
operand is on the other side of the membrane. In this sense those
functions are equal. However, they differ in that the former does
not define any mode of realization, whereas the latter has the
following mode defined: the transporter binding the solute undergoes
a series of conformational changes. Therefore, one can say that
GO:0022804 specializes GO:0022857 by addition of a mode. We
say that function x is subsumed by the function y by mode
addition if x is subsumed by y and x has some mode, whereas y has no
mode assigned. Function subsumption by mode addition is depicted
in FuML by means of a dependency link with stereotype
«mode-added». The subsumed function is the supplier of the link and the
subsuming function is the client.</p>
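        <p>The definition of mode addition translates almost directly into a predicate. In the Python sketch below, the set SUBSUMED_BY and the dictionary MODE are hypothetical stand-ins for information that would come from GO and from a FuML model; the subsumption pair is taken as given.</p>
        <preformat>
```python
# Pairs (subsumed, subsumer) taken as given, e.g. from is_a links in GO.
SUBSUMED_BY = {("GO:0022804", "GO:0022857")}

# Mode per function; None means that no mode is assigned.
MODE = {
    "GO:0022857": None,
    "GO:0022804": ("the transporter binding the solute undergoes "
                   "a series of conformational changes"),
}


def is_mode_added(x, y):
    """True if x is subsumed by y, x has some mode and y has no mode."""
    return ((x, y) in SUBSUMED_BY
            and MODE.get(x) is not None
            and MODE.get(y) is None)


print(is_mode_added("GO:0022804", "GO:0022857"))
```
        </preformat>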
      </sec>
      <sec id="sec-4-3">
        <title>Mode Specialization</title>
        <p>Subsumption of functions can be based on the mode of realization
also in cases where a parent function already has a mode assigned.
Consider, for instance, the function GO:0022804: active
transmembrane transporter activity having the mode: transporter binds
the solute and undergoes a series of conformational changes and
the function GO:0015291: secondary active transmembrane
transporter activity with the mode: transporter binds the solute and
undergoes a series of conformational changes driven by chemiosmotic
energy sources, including uniport, symport or antiport. The
latter clearly characterizes particular modes of active transmembrane
transport. Consequently, it seems intuitive to say that GO:0015291
specializes GO:0022804 (as is the case in GO). We call this type of
function subsumption the subsumption by mode specialization and
define it as follows: The function x is subsumed by the function
y by mode specialization if x is subsumed by y and mode r of x
specializes mode s of y. In FuML function subsumption by mode
specialization is depicted with a dependency link with stereotype
«mode-spec». The subsumed function is the supplier of the link and
the subsuming (specialized) function is the client.</p>
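        <p>Mode specialization can be sketched in the same style. The source does not define formally when one mode specializes another, so the Python example below makes an explicit assumption: a mode is modeled as a set of constraints, and mode r specializes mode s if r keeps every constraint of s and adds at least one more. All names are hypothetical.</p>
        <preformat>
```python
# Pairs (subsumed, subsumer) taken as given, e.g. from is_a links in GO.
SUBSUMED_BY = {("GO:0015291", "GO:0022804")}

# Each mode is modeled as a set of constraints (an assumption of this sketch).
MODE = {
    "GO:0022804": frozenset({
        "transporter binds the solute",
        "undergoes a series of conformational changes",
    }),
    "GO:0015291": frozenset({
        "transporter binds the solute",
        "undergoes a series of conformational changes",
        "driven by chemiosmotic energy sources",
    }),
}


def mode_specializes(r, s):
    """Mode r specializes mode s if r keeps every constraint of s and adds more."""
    return s.issubset(r) and r != s


def is_mode_spec(x, y):
    """True if x is subsumed by y and the mode of x specializes the mode of y."""
    r, s = MODE.get(x), MODE.get(y)
    if r is None or s is None:
        return False
    return (x, y) in SUBSUMED_BY and mode_specializes(r, s)


print(is_mode_spec("GO:0015291", "GO:0022804"))
```
        </preformat>
        <p>Other notions of mode specialization, for example one based on a taxonomy of mechanisms, could be substituted for mode_specializes without changing the rest of the check.</p>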
      </sec>
    </sec>
    <sec id="sec-5">
      <title>APPLICATION</title>
      <p>The application of FuML to GO pursues two objectives. The first
objective is the usage of FuML for establishing a semantic basis
for molecular functions that supports the representation of
functions in an organized way beyond the textual description. Moreover,
the discussed patterns represent basic knowledge on the
interrelations between biological processes and molecular functions.
The part_of relation between biological processes and molecular
functions can be mapped to the has-goal-achievement association
between functions and goal achievements.</p>
      <p>The second and the main objective of applying FuML to MFO is
to explicitly document design choices and the subsumption patterns
utilized implicitly in MFO. Figure 2 presents such a documentation
for a fragment of MFO in terms of FuML. The patterns are indicated
by stereotypes of FuML, which enables an easy-to-grasp
visualization of the structure of MFO as well as the underlying design
choices. One benefit of this approach is that the explicit specification
of the design choices makes the ontology much more intelligible for
a human user.</p>
      <p>Furthermore, the application of FuML reveals potential for
refactoring and revising GO. For instance, the application of FuML in
modeling the functions GO:0022857: transmembrane transporter
activity and GO:0022891: substrate-specific transmembrane
transporter activity reveals that both share similar goal achievements:
transfer of an operand from one side of a membrane to the other,
with input: operand is on one side of the membrane, and output:
operand is on the other side of the membrane. Consequently,
following FuML, a potential difference between GO:0022857 and
GO:0022891 can be sought in their operands. For GO:0022857
that is a substance, whereas for GO:0022891 it is a specific
substance or group of substances. Therefore, the first refactoring option
would be to explicitly document the pattern of subsumption
between GO:0022857 and GO:0022891 as operand specialization.
The alternative refactoring option is driven by the further
analysis of operands of those functions, in particular by clarifying
what the difference between “a substance” and “a specific
substance or group of substances” is. The answer could be found in
GO:0022892 substrate-specific transporter activity, a parent
function of GO:0022891. An operand of GO:0022892 is exemplified by
macromolecules, small molecules or ions. In that case, however,
it seems that functions like GO:0090482: vitamin transmembrane
transporter activity and GO:0015238: drug transmembrane
transporter activity should also be considered substance-specific
transmembrane transport and should specialize GO:0022891 by operand
specialization, which is currently not the case.</p>
      <p>Finally, the third possible refactoring option could be based on
the assumption that the distinction between those two operands is
only superficial and GO:0022891 is merely used for the
organization of the function taxonomy, i.e., for grouping all functions
that are distinguished by operands such as ion, alcohol, and water.
According to this view, GO:0022891 would in fact be a
duplication of GO:0022857, introduced into MFO only for the purpose of
structuring it, but not as a specification of particular biological
functions. As illustrated in Figure 2, FuML enables the replacement of
GO:0022891 with an explicit specification of the design choices by
stereotyped links.</p>
      <p>The decision on the refactoring option, as in any modeling
enterprise, is the responsibility of the modeler(s), GO developers in this
case. Yet, the above analysis demonstrates how graphical languages
such as FuML can, as in software and systems engineering,
drive and support that task for biological ontologies such as MFO.</p>
    </sec>
    <sec id="sec-6">
      <title>RELATED WORK</title>
      <p>
        The ideas underlying the structure of functions, introduced in
FuML, are the result of an analysis of the current state of the art of
function modeling in software, systems and ontological
engineering. For instance, the interpretation of a function in terms of a role
is common not only in biological systems
        <xref ref-type="bibr" rid="ref17">(Karp, 2000)</xref>
        , but also
in functional modeling in mechanical engineering
        <xref ref-type="bibr" rid="ref19 ref21 ref7">(Kitamura et al.,
2006; Lind, 1994; Chandrasekaran and Josephson, 2000)</xref>
        .
      </p>
      <p>
        The notion of goal achievement captures the teleological character
of a function, its orientation toward some goal. This aspect is stressed in
many approaches to function representation, e.g.
        <xref ref-type="bibr" rid="ref11 ref16 ref25">(Sasajima et al.,
1995; Iwasaki et al., 1995; Gero, 1990)</xref>
        . In particular, defining
a function in terms of input-output pairs is present in modeling
technical artifacts
        <xref ref-type="bibr" rid="ref12 ref3">(Borgo et al., 2011; Goel et al., 2009)</xref>
        .
      </p>
      <p>
        The mode of realization, also called the
way-of-function-achievement, which specifies the constraints on the method of function
realization, is present in
        <xref ref-type="bibr" rid="ref18">(Kitamura et al., 2002)</xref>
        , among others.
      </p>
      <p>
        To the best of our knowledge, the presented patterns of function
decomposition are not collected and integrated into any other single
modeling framework, though the techniques themselves are
commonly used, especially in software and systems engineering, e.g. see
the function-means-context link in
        <xref ref-type="bibr" rid="ref4">(Bracewell and Wallace, 2001)</xref>
        or
the decomposition with zigzagging in
        <xref ref-type="bibr" rid="ref22">(Nam, 2001)</xref>
        .
      </p>
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION</title>
      <p>
        In the current paper we present and discuss applications of UML and
patterns for function subsumption to the modeling and refactoring
of biological ontologies. In particular, we developed a UML profile
for functional modeling, called the Function Modeling Language
(FuML)
        <xref ref-type="bibr" rid="ref5">(Burek and Herre, 2014)</xref>
        , and applied it to the modeling and
refactoring of a segment of the Molecular Function Ontology.
      </p>
      <p>The application of FuML enables the systematic, graphical
representation of information that is currently available in MFO mainly
in the form of textual descriptions. We demonstrate that behind
the extensional is_a relation, which is used for the construction of
MFO, several different patterns of intensional subsumption can be
determined. Modeling MFO via FuML helps in identifying these
patterns and, moreover, provides the means for representing them
directly in the hierarchy of molecular functions. We argue that this
can help make the ontology structure more comprehensible for
human users and support communication. The claim is illustrated
by an analysis and a model of an MFO fragment with FuML, from
which we derive several refactoring options.</p>
      <p>Besides proposing FuML and the particular refactoring options
in this paper, for future work we consider first the continued
analysis of MFO. Extending this to a larger scale may require establishing
software support, e.g., for identifying subsumption pattern instances
within MFO (semi-)automatically. Moreover, FuML and its
methods may also be transferred to or yield new methods for common
languages of biomedical ontologies, nowadays including OWL.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Alterovitz</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hill</surname>
            ,
            <given-names>D. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lomax</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherkassky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dreyfuss</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolan</surname>
            ,
            <given-names>M. E.</given-names>
          </string-name>
          , et al. (
          <year>2010</year>
          ).
          <article-title>Ontology engineering</article-title>
          .
          <source>Nature biotechnology</source>
          ,
          <volume>28</volume>
          (
          <issue>2</issue>
          ),
          <fpage>128</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Belghiat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bourahla</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Automatic generation of OWL ontologies from UML class diagrams based on meta-modelling and graph grammars</article-title>
          .
          <source>World Academy of Science, Engineering and Technology</source>
          ,
          <volume>6</volume>
          (
          <issue>8</issue>
          ),
          <fpage>380</fpage>
          -
          <lpage>385</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Borgo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrara</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garbacz</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vermaas</surname>
            ,
            <given-names>P. E.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>A formalization of functions as operations on flows</article-title>
          .
          <source>Journal of Computing and Information Science in Engineering</source>
          ,
          <volume>11</volume>
          (
          <issue>3</issue>
          ),
          <fpage>031007</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Bracewell</surname>
            ,
            <given-names>R. H.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wallace</surname>
            ,
            <given-names>K. M.</given-names>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>Designing a representation to support function-means based synthesis of mechanical design solutions</article-title>
          . In S. Culley, A. Duffy, C. McMahon, and K. Wallace, editors,
          <source>Proceedings of ICED01</source>
          , Glasgow, Scotland, UK, Aug 21-23
          , pages
          <fpage>275</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Burek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Herre</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <source>FuML Specification v1.0. Onto-med report</source>
          , University of Leipzig.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Burek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herre</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Loebe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Ontological analysis of functional decomposition</article-title>
          . In H. Fujita and V. Mařík, editors,
          <source>Proceedings of the 8th International Conference on Software Methodologies, Tools and Techniques, SoMeT 2009</source>
          , Prague, Czech Republic, Sep 23-25
          , pages
          <fpage>428</fpage>
          -
          <lpage>439</lpage>
          , Amsterdam. IOS Press.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Chandrasekaran</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Josephson</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>Function in device representation</article-title>
          .
          <source>Engineering with computers</source>
          ,
          <volume>16</volume>
          (
          <issue>3-4</issue>
          ),
          <fpage>162</fpage>
          -
          <lpage>177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Degtyarenko</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Matos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ennis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hastings</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zbinden</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNaught</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alcántara</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darsow</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guedj</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>ChEBI: a database and ontology for chemical entities of biological interest</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>36</volume>
          (
          <issue>Suppl 1</issue>
          ),
          <fpage>D344</fpage>
          -
          <lpage>D350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>du Plessis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Škunca</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Dessimoz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>The what, where, how and why of gene ontology - a primer for bioinformaticians</article-title>
          .
          <source>Briefings in Bioinformatics</source>
          ,
          <volume>12</volume>
          (
          <issue>6</issue>
          ),
          <fpage>723</fpage>
          -
          <lpage>735</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <collab>Gene Ontology Consortium</collab>
          (
          <year>2004</year>
          ).
          <article-title>The Gene Ontology (GO) database and informatics resource</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>32</volume>
          (
          <issue>Suppl 1</issue>
          ),
          <fpage>D258</fpage>
          -
          <lpage>D261</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Gero</surname>
            ,
            <given-names>J. S.</given-names>
          </string-name>
          (
          <year>1990</year>
          ).
          <article-title>Design prototypes: a knowledge representation schema for design</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>11</volume>
          (
          <issue>4</issue>
          ),
          <fpage>26</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Goel</surname>
            ,
            <given-names>A. K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rugaber</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vattam</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Structure, behavior, and function of complex systems: The structure, behavior, and function modeling language</article-title>
          .
          <source>Artificial Intelligence for Engineering Design, Analysis and Manufacturing</source>
          ,
          <volume>23</volume>
          (
          <issue>01</issue>
          ),
          <fpage>23</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Guardia</surname>
            ,
            <given-names>G. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vêncio</surname>
            ,
            <given-names>R. Z.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>de Farias</surname>
            ,
            <given-names>C. R.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>A UML profile for the OBO relation ontology</article-title>
          .
          <source>BMC Genomics</source>
          ,
          <volume>13</volume>
          (
          <issue>Suppl 5</issue>
          ),
          <fpage>S3</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Harley</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Thematic roles</article-title>
          .
          <source>The Cambridge Encyclopedia of the Language Sciences</source>
          , pages
          <fpage>861</fpage>
          -
          <lpage>862</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>OBO flat file format syntax and semantics and mapping to OWL Web Ontology Language</article-title>
          .
          <source>Technical report</source>
          , University of Manchester.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Iwasaki</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vescovi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fikes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Chandrasekaran</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>1995</year>
          ).
          <article-title>Causal functional representation language with behavior-based semantics</article-title>
          .
          <source>Applied Artificial Intelligence: An International Journal</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>5</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Karp</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>An ontology for biological function based on molecular interactions</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>16</volume>
          (
          <issue>3</issue>
          ),
          <fpage>269</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Kitamura</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sano</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Namba</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mizoguchi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>A functional concept ontology and its application to automatic identification of functional structures</article-title>
          .
          <source>Advanced Engineering Informatics</source>
          ,
          <volume>16</volume>
          (
          <issue>2</issue>
          ),
          <fpage>145</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Kitamura</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koji</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mizoguchi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>An ontological model of device function: industrial deployment and lessons learned</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <volume>1</volume>
          (
          <issue>3</issue>
          ),
          <fpage>237</fpage>
          -
          <lpage>262</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Kogut</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cranefield</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dutra</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baclawski</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kokar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>UML for ontology development</article-title>
          .
          <source>The Knowledge Engineering Review</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <fpage>61</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Lind</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>1994</year>
          ).
          <article-title>Modeling goals and functions of complex industrial plants</article-title>
          .
          <source>Applied Artificial Intelligence: An International Journal</source>
          ,
          <volume>8</volume>
          (
          <issue>2</issue>
          ),
          <fpage>259</fpage>
          -
          <lpage>283</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Nam</surname>
            ,
            <given-names>P. S.</given-names>
          </string-name>
          (
          <year>2001</year>
          ).
          <source>Axiomatic design: Advances and applications</source>
          . Oxford University Press, New York.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <collab>Object Management Group</collab>
          (
          <year>2014</year>
          ). http://www.omg.org/.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Rumbaugh</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacobson</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Booch</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <source>The Unified Modeling Language Reference Manual</source>
          . Addison Wesley, Reading, Massachusetts, 2nd edition.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Sasajima</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kitamura</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ikeda</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mizoguchi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>1995</year>
          ).
          <article-title>FBRL: A function and behavior representation language</article-title>
          .
          In
          <source>Proc. of IJCAI 1995</source>
          , pages
          <fpage>1830</fpage>
          -
          <lpage>1836</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Shegogue</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>W. J.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Integration of the Gene Ontology into an object-oriented architecture</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ),
          <fpage>113</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klagges</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Köhler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lomax</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A. L.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Relations in biomedical ontologies</article-title>
          .
          <source>Genome biology</source>
          ,
          <volume>6</volume>
          (
          <issue>5</issue>
          ),
          <fpage>R46</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <collab>W3C OWL Working Group</collab>
          (
          <year>2009</year>
          ).
          <article-title>OWL 2 Web Ontology Language Document Overview</article-title>
          .
          <source>Technical report</source>
          , World Wide Web Consortium.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Woods</surname>
            ,
            <given-names>W. A.</given-names>
          </string-name>
          (
          <year>1991</year>
          ).
          <article-title>Understanding subsumption and taxonomy: A framework for progress</article-title>
          .
          In
          <source>Principles of Semantic Networks</source>
          , pages
          <fpage>45</fpage>
          -
          <lpage>94</lpage>
          . Morgan Kaufmann.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>