=Paper= {{Paper |id=Vol-1515/regular3 |storemode=property |title=A UML profile for functional modeling applied to the Molecular Function Ontology |pdfUrl=https://ceur-ws.org/Vol-1515/regular3.pdf |volume=Vol-1515 |dblpUrl=https://dblp.org/rec/conf/icbo/BurekLH15 }} ==A UML profile for functional modeling applied to the Molecular Function Ontology== https://ceur-ws.org/Vol-1515/regular3.pdf
A UML Profile for Functional Modeling Applied to the Molecular Function Ontology

Patryk Burek 1∗, Frank Loebe 2 and Heinrich Herre 1

1 Institute for Medical Informatics, Statistics and Epidemiology (IMISE), University of Leipzig, Haertelstrasse 16-18, 04107 Leipzig, Germany
2 Computer Science Institute, University of Leipzig, Augustusplatz 10, 04109 Leipzig, Germany




∗ To whom correspondence should be addressed: patryk.burek@imise.uni-leipzig.de

ABSTRACT
Gene Ontology (GO) is the largest, and steadily growing, resource for cataloging gene products. Naturally, its growth raises issues regarding its structure. Modeling and refactoring big ontologies such as GO is far from simple. It seems that human-friendly graphical modeling languages, such as the Unified Modeling Language (UML), could be helpful for that task. In the current paper we investigate if UML can be utilized for making the structural organization of the Molecular Function Ontology (MFO), a sub-ontology of GO, more explicit. In addition, we examine if and how using UML can support the refactoring of MFO. We utilize UML and its extension mechanism for the definition of a UML dialect, which is suited for modeling functions and is called Function Modeling Language (FuML). Next, we use FuML for capturing the structure of molecular functions. Finally, we propose and demonstrate some refactoring options for MFO.

1   INTRODUCTION
The Molecular Function Ontology (MFO) is a sub-ontology of the Gene Ontology (GO) – the largest, and steadily growing, resource for cataloging gene products. In 2000 GO contained fewer than 5,000 terms, in 2003 – 13,000 (Gene Ontology Consortium, 2004), in 2010 it exceeded 30,000 (du Plessis et al., 2011), whereas at the beginning of 2015 its size is above 42,000 terms. The growth of the ontology leads to a suboptimal structure (du Plessis et al., 2011), which motivates refactoring initiatives such as (Guardia et al., 2012; Alterovitz et al., 2010), besides the work of the GO Consortium itself that constantly improves and evolves GO. It turns out that modeling and refactoring big ontologies such as GO is a difficult task, the realization of which can be supported by a human-friendly representation format. The serialization formats used for machine processing of the ontologies, such as the OBO flat file format (Horrocks, 2007) or the Web Ontology Language (OWL) (W3C OWL Working Group, 2009), are not the easiest for a human user. This motivates the adoption of human-friendly graphical notations like those used in software engineering for the task of ontology representation (Kogut et al., 2002; Belghiat and Bourahla, 2012) for certain purposes.

The de facto standard for graphical conceptual modeling of software systems is the Unified Modeling Language (UML) (Rumbaugh et al., 2005), currently developed and maintained by the Object Management Group (OMG) (Object Management Group, 2014). UML has a big potential for various applications that go beyond software engineering, among them for modeling biological knowledge and biological ontologies (Shegogue and Zheng, 2005; Guardia et al., 2012).

UML is well-suited for modeling biological systems, not least due to the rich infrastructure and the available tools. In particular, the UML built-in extension mechanisms such as stereotypes and profiles permit the easy construction of domain- or task-specific UML dialects, e.g. the OBO relations profile (Guardia et al., 2012). Numerous tools for UML modeling are available on the market and can be used out of the box for visualizing biological ontologies as a whole or in part.

In the present paper we investigate if UML can be utilized for making the structure of MFO more explicit and if it can support the refactoring of MFO. We use UML and its extension mechanism for the definition of a UML dialect, called Function Modeling Language (FuML), which is suited for function modeling. Next, we use FuML for modeling the structure of molecular functions. Finally, we propose and demonstrate some refactoring options for MFO.

2   METHODS
2.1   Molecular Function Ontology
Like all GO terms, functions in MFO are specified by id, name, natural language definition and an optional list of synonyms. For instance, the function of catalyzing carbohydrate transmembrane transport is specified by id: GO:0015144; name: carbohydrate transmembrane transporter activity; definition: catalysis of the transfer of carbohydrate from one side of the membrane to the other; synonym: sugar transporter. Additionally, for each function its relations with other concepts can be captured. The semantics of the relations that are used for this purpose is provided by serialization languages such as the OBO flat file format or OWL, and/or by the OBO relations ontology (RO) (Smith et al., 2005). In particular, functions in MFO are organized into a hierarchy by means of the is_a link from RO; furthermore, they are linked with processes by the part_of relationship from RO; and in some cases they have relations with concepts of other ontologies such as ChEBI (Degtyarenko et al., 2008). For instance, GO:0015144 is linked, by means of the RO is_a relation, to its parent functions GO:1901476 carbohydrate transporter activity and GO:0022891 substrate-specific transmembrane transporter activity, by means of the RO part_of relation to the process GO:0034219: carbohydrate transmembrane transport, and by means of the RO transports_or_maintains_localization_of to CHEBI:16646: carbohydrate.

From the above we see that the semantics of functions in MFO is provided to a large extent by informal natural language expressions and partially by relations with other concepts.
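
The specification of a function in MFO described above can be pictured as a small record type. The following is only an illustrative sketch, not a format used by GO or FuML: the class name GOTerm, its field names and the helper related are assumptions made for this example, while the ids, names and relation names are the ones quoted in the text.

```python
from dataclasses import dataclass, field


@dataclass
class GOTerm:
    """Hypothetical record for one MFO entry (names are illustrative only)."""
    id: str
    name: str
    definition: str
    synonyms: list[str] = field(default_factory=list)
    # Maps a relation name (e.g. "is_a") to the ids of the related concepts.
    relations: dict[str, list[str]] = field(default_factory=dict)

    def related(self, relation: str) -> list[str]:
        """Return the ids linked to this term by the given relation."""
        return self.relations.get(relation, [])


# The running example from the text, GO:0015144.
sugar_transporter = GOTerm(
    id="GO:0015144",
    name="carbohydrate transmembrane transporter activity",
    definition=("catalysis of the transfer of carbohydrate from one side "
                "of the membrane to the other"),
    synonyms=["sugar transporter"],
    relations={
        "is_a": ["GO:1901476", "GO:0022891"],
        "part_of": ["GO:0034219"],
        "transports_or_maintains_localization_of": ["CHEBI:16646"],
    },
)

print(sugar_transporter.related("is_a"))     # ids of the parent functions
print(sugar_transporter.related("part_of"))  # id of the linked process
```

Note that the record holds the definition only as free text, which mirrors the observation that most of the semantics of a function is carried by informal natural language rather than by the typed relations.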



Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.
2.2   Intensional Subsumption
We propose defining the notion of function subsumption, which is a backbone of MFO, upon an intensional interpretation of the is_a relation. Typically, in the field of ontology engineering the extensional aspect of the is_a relation is stressed; in OWL, for instance, A is a subclass of B if every instance of A is an instance of B. The same interpretation is used in RO, where is_a is defined by reference to the sets of all instances (extensions) of the concepts. According to this understanding the is_a relation is often called extensional subsumption, in contrast to its intensional counterpart(s), where we focus on structural subsumption (Woods, 1991). Instead of referring to instances, this type of subsumption is defined based on the structure of concepts. The latter can be understood as a composition of conceptual parts by means of various composing relations. For illustration within GO itself, GO:0005215: transporter activity is justified to intensionally subsume GO:0022857: transmembrane transporter activity, because, following (Woods, 1991), both are activities and they are (partially) defined by part-of relations, to GO:0006810: transport and to GO:0055085: transmembrane transport, resp., and the latter is subsumed by the former. Overall, the main assumption is that concepts are complex structures which can be organized into a subsumption hierarchy. The reading of intensional subsumption is similar to inheritance in object-oriented languages, where one class inherits its structure from another. That enables the structuring of classes into hierarchies.

2.3   UML Profiles and FuML
UML is a graphical modeling language founded on the explicit distinction between the static and the dynamic views of a system; it introduces thirteen diagram types, grouped into two sets: structural modeling diagrams and behavioral modeling diagrams. UML lacks constructs dedicated to function modeling as such, but it provides several built-in mechanisms that allow for an easy extension of the language. Among them are profiles.

A profile is a light-weight UML mechanism, typically used for extending the language for particular platforms, domains or tasks. It specifies a set of extensions of the UML standard metamodel which include, among others, stereotypes. With stereotypes it is possible to extend the standard UML vocabulary with new model elements. A stereotype can be graphically represented by a dedicated icon, though in the most straightforward form it is represented by a stereotype name, surrounded by guillemets and placed above the name of the stereotyped UML element, cf. «Function» in Figure 1.

We used the profile mechanism for developing a UML extension, called Function Modeling Language (FuML), aimed at supporting the modeling of functions, function ascription, and function decomposition. FuML defines 15 stereotypes for representing functions and function structure, and 8 stereotypes for modeling function decomposition, subsumption and function dependencies. The full specification of FuML stereotypes is provided in (Burek and Herre, 2014). In the remaining part of the current paper we analyze how far FuML can be used for modeling and refactoring MFO.

3   ANALYSIS
3.1   Modeling Molecular Functions with FuML
3.1.1 Functions
FuML enables graphical modeling of functions in a compact and in an extended form. The compact form is particularly suited for big models containing many functions, whereas the extended form is designed for visualizing the dependencies within the structure of a single function or between several functions.

Figure 1. A FuML model of a molecular function, displayed in the compact notation at the top and in the extended form at the bottom.

Figure 1 presents an exemplary FuML model, depicting the structure of the function GO:0015144: carbohydrate transmembrane transporter activity. The upper part of the figure presents the compact notation, whereas the extended notation is shown in the lower part. The stereotypes utilized in the figure are discussed in the remainder of the current section.

A function in FuML is interpreted as a role that an entity plays in the context of some goal achievement, e.g. a teleological process. This account of functions is similar to (Karp, 2000), where a biological function of a molecule is described as the role that the molecule plays in a biological process. In this sense, the function GO:0015144: carbohydrate transmembrane transporter activity, defined in GO as “catalysis of the transfer of carbohydrate from one side of the membrane to the other”, depicts the catalyst role in the teleological process of transferring carbohydrate from one side of the membrane to the other. In terms of the structure we can therefore say that a function specification contains as its part a specification of a goal achievement, understood as a teleological entity which is specified in terms of a transformation from an input situation to an output situation. As presented in Figure 1, a function is depicted by a UML classifier with a stereotype «Function». It connects to its goal achievement by an association with a stereotype «has-goal-achievement» in the extended notation, whereas the compact notation utilizes the attribute goal_achievement.

3.1.2 Goal Achievements
In FuML, a goal achievement (GA) x is defined as a category the instances of which are transitions from certain input situations to output situations. Input and output are defined as follows:
   • The input category y of the goal achievement x is a situation category such that every instance of x is a transition starting from a situation instantiating y.
   • The output y of a goal achievement x is a situation category specifying the situations in which instances of x result by transition. Every instance of x is a transition resulting in a situation instantiating y.

For example, the goal achievement carbohydrate transmembrane transport establishes the input category, the instances of which are situations of carbohydrate being on the one side of the membrane, and the output category, the instances of which are situations of carbohydrate being on the other side of the membrane. This means that every instance of carbohydrate transmembrane transport exhibits a transition from an instance of the input category to an instance of the output category, i.e. from individual situations of carbohydrate located on one side of the membrane, to individual situations of carbohydrate located on the other side of the membrane.

As shown in Figure 1, an input is indicated in the extended notation by the association with stereotype «has-input», and by the input attribute of a function in the compact notation. The representation of outputs is analogous.

Typically, a transformation from an input to an output situation is a process, and then the GA can be understood as a process category. In the running example, the GA is a teleological process category, namely of carbohydrate transfer from one side of the membrane to the other. This process exhibits the causal transition from the situation of carbohydrate being on one side of the membrane to the situation where carbohydrate is on the other side of the membrane.

3.1.3 Mode of Goal Achievement
In some cases the specification of a function is not reduced to a mere input-output pair, but it defines constraints on the method of function realization. For example, the molecular functions GO:0015399: primary active transmembrane transporter activity and GO:0015291: secondary active transmembrane transporter activity share the same input: solute is on one side of the membrane, and the same output: solute is on the other side of the membrane. Therefore, the pure input-output views of the functions are equal. However, they are distinct due to the way in which they achieve the goal. The former function is realized by means of some primary energy source, for instance, a chemical, electrical or solar source, whereas the latter relies on a uniporter, symporter or antiporter protein. Thus we see that the functions provide the same answer to the question of what is to be achieved; however, they provide different answers on how that is realized. In order to represent this distinction, in FuML we introduce another component of function structure, called Mode of Goal Achievement. The mode x of the goal achievement y specifies the way in which y transforms the input to the output situation. For GO:0015399 the mode is: some primary energy source, for instance chemical, electrical or solar source, and for GO:0015291 it is: uniporter, symporter or antiporter protein. The mode is a constraint on the function realization, which does not affect the input or the output. For example, if one adds to the function of transmembrane transport the constraint that the transport should be realized by the uniporter protein then the input and the output remain unchanged. However, the function as such changes in that not every transportation process realizes it, but only those that are driven by a uniporter protein.

3.1.4 Participants
Often goal achievements are expressed by action sentences of natural language and thus the results of linguistic analysis of action sentences can be applied to the analysis of the structure of goal achievements. In linguistics, the role that a noun phrase plays with respect to the action or state described by the verb of a sentence is called a thematic role (Harley, 2010). The specifications of molecular functions in MFO often contain two thematic roles – a patient (called an operand in FuML) and an actor (called a doer in FuML). An operand indicates the entity undergoing the effect of the action. We say that an operand y of the goal achievement x specifies a category y such that instances of x operate on instances of y. GO:0015144 operates on (transports) carbohydrate.

A doer is not as common in MFO as an operand. For example, in the discussed carbohydrate transmembrane transport function no doer is indicated. Typically, a doer is a part of the GA in cases where the mode of realization is provided. For instance, the functions GO:0015292 uniporter activity and GO:0015293 symporter activity both specify the mode of realization and each indicates its doer, namely the respective protein.

4   PATTERNS OF FUNCTION SUBSUMPTION
Functional subsumption implicitly covers various distinct relations (Burek et al., 2009). FuML introduces several distinct patterns for function subsumption (Burek and Herre, 2014). In the following section we discuss the application of three of those patterns for the modeling of MFO.

In FuML the notion of function subsumption is founded on the subsumption of goal achievements. We say that the function x is subsumed by the function y if the goal achievement of x is subsumed by the goal achievement of y. Since goal achievements are quite complex entities, it is not trivial to answer the question of what it means that one goal achievement subsumes another. Here, however, the analysis of GA structure is helpful, which pertains to the intensional aspects of the corresponding GA category, as discussed in previous sections. Based on this approach one can detect various patterns of function subsumption.

4.1   Operand Specialization
Since function specifications often contain operands, it is very common to construct a hierarchy of functions on the basis of the taxonomic hierarchy of their operands. In fact, this pattern is applied frequently in MFO. Consider, for instance, the functions GO:0015075: ion transmembrane transporter activity and GO:0008324: cation transmembrane transporter activity, linked by the is_a relation in GO. The relation between those two functions is based on the relation of their operands, as cation is subsumed by ion. In FuML function subsumption by operand specialization is depicted with a dependency link with stereotype «operand-spec». The supplier of the link is the subsumed function and the client is the subsumer.

4.2   Mode Addition
Another pattern of function subsumption, frequently met in MFO, is based on modes of goal achievement. Consider two functions, GO:0022857: transmembrane transporter activity and GO:0022804: active transmembrane transporter activity. Both share the same operand, namely substance, as well as the same input-output pair – operand is on one side of the membrane and operand is on the other side of the membrane. In this sense those functions are equal. However, they differ in that the former does not define any mode of realization, whereas the latter has the following mode defined: the transporter binding the solute undergoes a series of conformational changes. Therefore, one can say that
GO:0022804 specializes GO:0022857 by addition of a mode. We say that function x is subsumed by the function y by mode addition if x is subsumed by y and x has some mode, whereas y has no mode assigned. Function subsumption by mode addition is depicted in FuML by means of a dependency link with stereotype «mode-added». The subsumed function is the supplier of the link and the subsuming function is a client.

4.3   Mode Specialization
Subsumption of functions can be based on the mode of realization also in cases where a parent function already has a mode assigned. Consider, for instance, the function GO:0022804: active transmembrane transporter activity having the mode: transporter binds the solute and undergoes a series of conformational changes and the function GO:0015291: secondary active transmembrane transporter activity with the mode: transporter binds the solute and undergoes a series of conformational changes driven by chemiosmotic energy sources, including uniport, symport or antiport. The latter clearly characterizes particular modes of active transmembrane transport. Consequently, it seems intuitive to say that GO:0015291 specializes GO:0022804 (as is the case in GO). We call this type of function subsumption the subsumption by mode specialization and define it as follows: The function x is subsumed by the function y by mode specialization if x is subsumed by y and mode r of x specializes mode s of y. In FuML function subsumption by mode specialization is depicted with a dependency link with stereotype «mode-spec». The subsumed function is the supplier of the link and the specialized function is a client.

5   APPLICATION
The application of FuML to GO pursues two objectives. The first objective is the usage of FuML for establishing a semantic basis for molecular functions that supports the representation of functions in an organized way beyond the textual description. Moreover, the discussed patterns represent basic knowledge on the interrelations between biological processes and molecular functions. The part_of relation between biological processes and molecular functions can be mapped to the has-goal-achievement association between functions and goal achievements.

The second and the main objective of applying FuML to MFO is to explicitly document design choices and the subsumption patterns utilized implicitly in MFO. Figure 2 presents such a documentation for a fragment of MFO in terms of FuML. The patterns are indicated by stereotypes of FuML, which enables an easy-to-grasp visualization of the structure of MFO as well as the underlying design choices. One benefit of this approach is that the explicit specification of the design choices makes the ontology much more intelligible for a human user.

Furthermore, the application of FuML reveals potential for refactoring and revision of GO. For instance, the application of FuML in modeling the functions GO:0022857: transmembrane transporter activity and GO:0022891: substrate-specific transmembrane transporter activity reveals that both share similar goal achievements: transfer of an operand from one side of a membrane to the other, with input: operand is on one side of the membrane, and output: operand is on the other side of the membrane. Consequently and following FuML, a potential difference between GO:0022857 and GO:0022891 can be sought in their operands. For GO:0022857 that is a substance, whereas for GO:0022891 it is a specific substance or group of substances. Therefore, the first refactoring option would be to explicitly document the pattern of subsumption between GO:0022857 and GO:0022891 as operand specialization. The alternative refactoring option is driven by the further analysis of operands of those functions, in particular by clarifying what the difference between “a substance” and “a specific substance or group of substances” is. The answer could be found in GO:0022892 substrate-specific transporter activity, a parent function of GO:0022891. An operand of GO:0022892 is exemplified by macromolecules, small molecules or ions. In that case, however, it seems that functions like GO:0090482: vitamin transmembrane transporter activity and GO:0015238: drug transmembrane transporter activity should also be considered as substance specific transmembrane transport and specialize GO:0022891 by operand specialization, which is currently not the case.

Finally, the third possible refactoring option could be based on the assumption that the distinction between those two operands is only superficial and GO:0022891 is merely used for the organization of the function taxonomy, i.e., for grouping all functions that are distinguished by operands such as ion, alcohol, and water. According to this view, GO:0022891 would in fact be a duplication of GO:0022857, introduced into MFO only for the purpose of structuring it, but not as a specification of particular biological functions. As illustrated in Figure 2, FuML enables the replacement of GO:0022891 with an explicit specification of the design choices by stereotyped links.

Figure 2. An MFO segment modeled with FuML.

The decision on the refactoring option, as in any modeling enterprise, is the responsibility of the modeler(s), GO developers in this case. Yet, the above analysis demonstrates how graphical languages,
such as FuML, as in software and systems engineering, can drive and support that task for biological ontologies such as MFO.

6   RELATED WORK
The ideas underlying the structure of functions, introduced in FuML, are the result of an analysis of the current state of the art of function modeling in software, systems and ontological engineering. For instance, the interpretation of a function in terms of a role is common not only in biological systems (Karp, 2000), but also in functional modeling in mechanical engineering (Kitamura et al., 2006; Lind, 1994; Chandrasekaran and Josephson, 2000).

The notion of goal achievement grasps the teleological character of a function, its orientation on some goal. This aspect is stressed in many approaches to function representation, e.g. (Sasajima et al., 1995; Iwasaki et al., 1995; Gero, 1990). In particular, defining a function in terms of input-output pairs is present in modeling technical artifacts (Borgo et al., 2011; Goel et al., 2009).

The mode of realization, also called the way-of-function-achievement, specifying the constraints on the method of function realization, is present in (Kitamura et al., 2002), among others.

REFERENCES
Belghiat, A. and Bourahla, M. (2012). Automatic generation of OWL ontologies from UML class diagrams based on meta-modelling and graph grammars. World Academy of Science, Engineering and Technology, 6(8), 380–385.
Borgo, S., Carrara, M., Garbacz, P., and Vermaas, P. E. (2011). A formalization of functions as operations on flows. Journal of Computing and Information Science in Engineering, 11(3), 031007.
Bracewell, R. H. and Wallace, K. M. (2001). Designing a representation to support function-means based synthesis of mechanical design solutions. In S. Culley, A. Duffy, C. McMahon, and K. Wallace, editors, Proceedings of ICED01, Glasgow, Scotland, UK, Aug 21-23, pages 275–282.
Burek, P. and Herre, H. (2014). FuML Specification v1.0. Onto-med report, University of Leipzig.
Burek, P., Herre, H., and Loebe, F. (2009). Ontological analysis of functional decomposition. In H. Fujita and V. Mařík, editors, Proceedings of the 8th International Conference on Software Methodologies, Tools and Techniques, SoMeT 2009, Prague, Czech Republic, Sep 23-25, pages 428–439, Amsterdam. IOS Press.
Chandrasekaran, B. and Josephson, J. R. (2000). Function in device representation. Engineering with computers, 16(3-4), 162–177.
Degtyarenko, K., De Matos, P., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M., and Ashburner, M. (2008). ChEBI: a database and ontology for chemical entities of biological interest. Nucleic acids research, 36(Suppl 1), D344–D350.
du Plessis, L., Škunca, N., and Dessimoz, C. (2011). The what, where, how and why of gene ontology – a primer for bioinformaticians. Briefings in Bioinformatics, 12(6), 723–735.
   To the best of our knowledge, the presented patterns of function                     Gene Ontology Consortium (2004). The Gene Ontology (GO) database and informatics
decomposition are not collected and integrated into any other single                       resource. Nucleic acids research, 32(Suppl 1), D258–D261.
modeling framework, though the techniques themselves are com-                           Gero, J. S. (1990). Design prototypes: a knowledge representation schema for design.
                                                                                           AI Magazine, 11(4), 26–36.
monly used, especially in software and systems engineering, e.g. see                    Goel, A. K., Rugaber, S., and Vattam, S. (2009). Structure, behavior, and function
the function-means-context link in (Bracewell and Wallace, 2001) or                        of complex systems: The structure, behavior, and function modeling language. Ar-
the decomposition with zig-zaging in (Nam, 2001).                                          tificial Intelligence for Engineering Design, Analysis and Manufacturing, 23(01),
                                                                                           23–35.
                                                                                        Guardia, G. D., Vêncio, R. Z., and de Farias, C. R. (2012). A UML profile for the OBO
7    CONCLUSION                                                                            relation ontology. BMC Genomics, 13(Suppl 5), S3.
In the current paper we present and discuss applications of UML and                     Harley, H. (2010). Thematic roles. The Cambridge Encyclopedia of the Language
patterns for function subsumption to the modeling and refactoring                          Sciences, pages 861–862.
                                                                                        Horrocks, I. (2007). OBO flat file format syntax and semantics and mapping to OWL
of biological ontologies. In particular, we developed a UML profile                        Web Ontology Language. Technical report, University of Manchester.
for functional modeling, called the Function Modeling Language                          Iwasaki, Y., Vescovi, M., Fikes, R., and Chandrasekaran, B. (1995). Causal func-
(FuML) (Burek and Herre, 2014), and apply it to the modeling and                           tional representation language with behavior-based semantics. Applied Artificial
refactoring of a segment of the Molecular Function Ontology.                               Intelligence: An International Journal, 9(1), 5–31.
                                                                                        Karp, P. D. (2000). An ontology for biological function based on molecular interactions.
   The application of FuML enables the systematic, graphical repre-
                                                                                           Bioinformatics, 16(3), 269–285.
sentation of information that is currently available in MFO mainly                      Kitamura, Y., Sano, T., Namba, K., and Mizoguchi, R. (2002). A functional con-
in the form of textual descriptions. We demonstrate that behind                            cept ontology and its application to automatic identification of functional structures.
the extensional is_a relation, which is used for the construction of                       Advanced Engineering Informatics, 16(2), 145–163.
MFO, several different patterns of intensional subsumption can be                       Kitamura, Y., Koji, Y., and Mizoguchi, R. (2006). An ontological model of device
                                                                                           function: industrial deployment and lessons learned. Applied Ontology, 1(3), 237–
determined. Modeling MFO via FuML helps in identifying these                               262.
patterns and, moreover, provides the means for representing them                        Kogut, P., Cranefield, S., Hart, L., Dutra, M., Baclawski, K., Kokar, M., and Smith,
directly in the hierarchy of molecular functions. We argue that this                       J. (2002). UML for ontology development. The Knowledge Engineering Review,
can help making the ontology structure more comprehensible for                             17(1), 61–64.
                                                                                        Lind, M. (1994). Modeling goals and functions of complex industrial plants. Applied
human users and supports communication. The claim is illustrated
                                                                                           Artificial Intelligence: An International Journal, 8(2), 259–283.
by an analysis and a model of an MFO fragment with FuML, from                           Nam, P. S. (2001). Axiomatic design: Advances and applications. Oxford University
which we derive several refactoring options.                                               Press, New York.
   Besides proposing FuML and the particular refactoring options                        Object Management Group (2014). http://www.omg.org/.
in this paper, for future work we consider first the continued analy-                   Rumbaugh, J., Jacobson, I., and Booch, G. (2005). The Unified Modeling Language
                                                                                           Reference Manual. Addison Wesley, Reading, Massachusetts, 2. edition.
sis of MFO. Extending this to a larger scale may require establishing                   Sasajima, M., Kitamura, Y., Ikeda, M., and Mizoguchi, R. (1995). FBRL: A function
software support, e.g., for identifying subsumption pattern instances                      and behavior representation language. In Proc. of IJCAI 1995, pages 1830–1836.
within MFO (semi-)automatically. Moreover, FuML and its meth-                           Shegogue, D. and Zheng, W. J. (2005). Integration of the Gene Ontology into an object-
ods may also be transferred to or yield new methods for common                             oriented architecture. BMC Bioinformatics, 6(1), 113.
                                                                                        Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., Mungall,
languages of biomedical ontologies, nowadays including OWL.
                                                                                           C., Neuhaus, F., Rector, A. L., and Rosse, C. (2005). Relations in biomedical
                                                                                           ontologies. Genome biology, 6(5), R46.
REFERENCES                                                                              W3C OWL Working Group (2009). OWL 2 Web Ontology Language Document
                                                                                           Overview. Technical report, World Wide Web Consortium.
Alterovitz, G., Xiang, M., Hill, D. P., Lomax, J., Liu, J., Cherkassky, M., Dreyfuss,
                                                                                        Woods, W. A. (1991). Understanding subsumption and taxonomy: A framework for
   J., Mungall, C., Harris, M. A., Dolan, M. E., et al. (2010). Ontology engineering.
                                                                                           progress. In Principles of Semantic Networks, pages 45–94. Morgan Kaufmann.
   Nature biotechnology, 28(2), 128–130.




Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.