=Paper= {{Paper |id=Vol-2137/paper_40.pdf |storemode=property |title=Tawny-SBOL: Using Ontologies to Design and Constrain Genetic Circuits |pdfUrl=https://ceur-ws.org/Vol-2137/paper_40.pdf |volume=Vol-2137 |authors=Goksel Misirli,Phillip Lord |dblpUrl=https://dblp.org/rec/conf/icbo/MisirliL17 }} ==Tawny-SBOL: Using Ontologies to Design and Constrain Genetic Circuits== https://ceur-ws.org/Vol-2137/paper_40.pdf
Tawny-SBOL: Using ontologies to design and constrain genetic circuits

Goksel Misirli 1,∗ and Phillip Lord 2
1 School of Computing and Mathematics, Keele University, UK
2 School of Computing Science, Newcastle University, UK




∗ To whom correspondence should be addressed: g.misirli@keele.ac.uk

ABSTRACT
Synthetic biology is a data-driven engineering discipline, and designing novel genetic circuits often requires utilizing existing information where possible. Semantic Web technologies, and particularly ontologies, are important to formalize knowledge for computational design processes and to facilitate data interoperability. The Synthetic Biology Open Language has already emerged as a data standard and is based on RDF/XML. This language is well suited to representing information as graphs in which nodes and edges are defined using multiple properties. Terms from ontologies and controlled vocabularies are used to indicate the meaning of these properties. A semantic representation of these nodes and edges would simplify both the representation of information and the querying of underlying information. Here, we present Tawny-SBOL as a domain-specific language and a framework to address these issues. Tawny-SBOL is a proof-of-concept project, based on the Tawny-OWL ontology library, to specify genetic circuit designs. Users can query and potentially constrain these designs. As a result, designs can be evolved based on predefined requirements. Due to the native Clojure language support, users can extend Tawny-SBOL programmatically and work interactively.

1   INTRODUCTION
The Synthetic Biology Open Language (SBOL) (Bartley et al., 2015) has been developed to computationally exchange information about genetic circuits. Using this language, complex genetic circuits can be defined in terms of simpler constituent components such as DNA, proteins and small molecules. Designs can be hierarchical, formed of many sub-designs, and querying the underlying information becomes challenging due to the complexity of the relationships between different components. Each component may have additional properties such as the intended biological role, its molecular composition and so on.

These details are encoded using RDF, and it can be difficult to construct SBOL documents manually. Although there are discussions about adopting the Turtle format in the future, RDF/XML is currently used, and the ability to utilise existing Semantic Web tooling is particularly valuable. There are already ongoing developments to create SBOL APIs, which are available for Java, C, Python and JavaScript. Although these APIs are necessary to create SBOL documents, they require detailed knowledge of the SBOL data model and of how each SBOL entity is related to the others. These APIs can be used by experienced programmers who are expert in the programming language of their chosen API, and these programmers usually follow the development of SBOL closely. Ideally, biologists should use tools built upon these APIs. However, learning to interact with different tools takes time and effort. A simplified textual representation is another way of sketching genetic circuit designs and improving them later on. Moreover, decoupling the connection between APIs and complex data structures may further facilitate the development of useful tools.

ShortBOL (Pocock et al., 2016) has been developed specifically as a shorthand language to produce complex SBOL documents more easily. It is a human-readable textual language that allows design components and their composition to be defined. The mechanism behind ShortBOL is template expansion, in which templates can be hierarchical and each template adds further graph attributes until fully serialized SBOL RDF graphs are created. For example, to represent a promoter in SBOL, a component needs to be declared as a ComponentDefinition with the sbol:type of biopax:DnaRegion and the sbol:role of SO:0000167 (the Sequence Ontology promoter term). In ShortBOL, a promoter is already defined as a template, and the SBOL Shorthand compiler utilises this template to inject the required RDF triples.

In this work, we present Tawny-SBOL, based on the promising development of ShortBOL, and utilise ontologies to provide the meaning of design entities through subsumption. This ontological representation, based on a domain-specific language (DSL), allows simple semantic queries to be executed that would be quite complex if written as graph queries. Moreover, as opposed to creating static documents that can be exchanged between researchers, our aim is to provide an interactive design environment, where users can create semantic constraints and queries, and designs can evolve over time.

2   THE SBOL ONTOLOGY
Standardised SBOL terms to describe the SBOL data model already exist. However, these terms are part of a controlled vocabulary, which is embedded in the SBOL specification documents as free text. To enable an ontological representation of SBOL documents, we created the SBOL ontology programmatically using Tawny-OWL (Lord, 2013) (Figure 1). We defined classes for SBOL entities that are represented as RDF resources. Some of the SBOL entities are not serialised but act as interfaces to group others. In this work, superclasses have been defined to represent these interface entities. Moreover, SBOL-specific terms that are only referenced to uniquely identify features of SBOL entities have been represented as classes. These include classes to indicate Access, Direction and Refinement types in SBOL.

3   TAWNY-SBOL
Tawny-SBOL provides a simple DSL to create SBOL data. It is implemented using Clojure and therefore inherits the properties of





    (sboldocument "http://virtualparts.org/v2#" "v2")
    ...
    (cds "lacI"
         {name "lacI",
          description "lacI coding sequence",
          designedBy "..."})
    ...
    (design "lacI_expression prom1 1..40:+ rbs1 41..50:+ lacI 51..800:+ term1 801..850:+")
    (design "prom1 lac1 1..10:+ lac2 30..40:+")

    (save "lacI_expression")

Fig. 2. Partial information about the lacI expression genetic circuit design, formed of a promoter, an RBS, a CDS and a terminator. The numeric range provides the location information; the + and - signs indicate the DNA strand. Only the representation of the CDS component is included here. The promoter component is further annotated with two LacI binding sites using the second design command.
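
The paper does not spell out the grammar accepted by the design command, so the following Python sketch only illustrates how the two design strings in Fig. 2 could be read. It assumes that the first token names the design and that the remaining tokens come in pairs of a component name and a start..end:strand location. The names parse_design, SubComponent and LOCATION are hypothetical and are not part of Tawny-SBOL, which is written in Clojure.

```python
import re
from dataclasses import dataclass

# Assumed shape of a location token, e.g. "41..50:+" (start..end:strand).
LOCATION = re.compile(r"^(\d+)\.\.(\d+):([+-])$")


@dataclass(frozen=True)
class SubComponent:
    """One placed component inside a design (hypothetical helper type)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def parse_design(text):
    """Split a design string into (design name, list of SubComponent)."""
    tokens = text.split()
    # One design name followed by at least one (name, location) pair.
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError("expected: <design> (<component> <start>..<end>:<strand>)+")
    parent, rest = tokens[0], tokens[1:]
    parts = []
    for name, location in zip(rest[0::2], rest[1::2]):
        match = LOCATION.match(location)
        if match is None:
            raise ValueError(f"bad location {location!r} for {name!r}")
        start, end = int(match.group(1)), int(match.group(2))
        if start > end:
            raise ValueError(f"start after end in {location!r}")
        parts.append(SubComponent(name, start, end, match.group(3)))
    return parent, parts


# The two design strings from Fig. 2.
parent, parts = parse_design(
    "lacI_expression prom1 1..40:+ rbs1 41..50:+ lacI 51..800:+ term1 801..850:+"
)
promoter_parent, binding_sites = parse_design("prom1 lac1 1..10:+ lac2 30..40:+")
```

Under these assumptions the first string yields a design named lacI_expression with four placed components, and the second attaches the lac1 and lac2 binding sites to prom1, which is what makes the design hierarchical.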


    ComponentDefinition and (role some SO:0000167)
    ComponentDefinition and ((component some lac1) or
        (component some lac1Parent))

Fig. 3. Using ontological queries to extract information about biological components and genetic circuits.
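
To make the intent of the two queries in Fig. 3 concrete, the sketch below mimics them in plain Python over a small in-memory model. It is not OWL reasoning and does not use Tawny-OWL: the role lookup is a simple filter, and the lac1Parent class is approximated by a graph traversal that repeatedly collects definitions containing an already-found component. The ComponentDefinition dataclass, the helper names and the sample data are assumptions, loosely based on Fig. 2; in particular, tagging prom1 with the promoter role is illustrative.

```python
from dataclasses import dataclass, field

PROMOTER = "SO:0000167"  # Sequence Ontology promoter term cited in the text


@dataclass
class ComponentDefinition:
    """Minimal stand-in for an SBOL ComponentDefinition (illustrative only)."""
    name: str
    roles: set = field(default_factory=set)
    components: set = field(default_factory=set)  # names of direct children


def with_role(definitions, role):
    """Analogue of 'ComponentDefinition and (role some <role>)'."""
    return sorted(d.name for d in definitions.values() if role in d.roles)


def parents_of(definitions, child):
    """Names of every definition that uses `child` directly or indirectly."""
    found = set()
    frontier = {child}
    while frontier:
        # Definitions with a child in the current frontier and not yet seen.
        frontier = {
            d.name
            for d in definitions.values()
            if d.components & frontier and d.name not in found
        }
        found |= frontier
    return sorted(found)


# Sample data (assumed), following the hierarchy sketched in Fig. 2.
definitions = {
    d.name: d
    for d in [
        ComponentDefinition("lac1"),
        ComponentDefinition("lac2"),
        ComponentDefinition("prom1", roles={PROMOTER}, components={"lac1", "lac2"}),
        ComponentDefinition(
            "lacI_expression", components={"prom1", "rbs1", "lacI", "term1"}
        ),
    ]
}

promoters = with_role(definitions, PROMOTER)
lac1_parents = parents_of(definitions, "lac1")
```

With this sample data the role filter returns only prom1, while the traversal returns both prom1 and lacI_expression, because lac1 is used in prom1 and prom1 is in turn used in lacI_expression. An OWL reasoner reaches the same kind of answer declaratively, through the recursive definition of the lac1Parent class.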




Fig. 1. OWL classes representing various SBOL entities.

this language, such as using parentheses to create a block of SBOL data or to perform specific actions such as saving the results. A specific Tawny-SBOL keyword is used to indicate the type of a simple biological component, such as a promoter, coding sequence, ribosome binding site or terminator. Complex designs formed of simple components are represented using the design command. This command takes a parameter specified according to a grammar (Figure 2).

The resulting files include not only SBOL-specific information but also additional classes that facilitate the execution of semantic reasoners. These classes are injected by Tawny-SBOL. Currently, queries can be written using the OWL syntax and can be executed directly using the Tawny-OWL framework. For example, the first query in Figure 3 lists promoter resources, which are represented using SBOL's ComponentDefinition entity and have the role of the SO:0000167 term. In the second query, all the parents of the lac1 component are queried. The query in this case is an ontology class named lac1Parent and is used to recursively find all the uses of the child component in parent designs. In the future, we will further simplify the querying process by introducing SBOL-specific commands in Tawny-SBOL.

4   CONCLUSION
Ontologies can be extremely useful to capture domain knowledge and to execute logical queries in synthetic biology (Misirli et al., 2016). Tawny-SBOL has been developed to exploit these features for the ontological representation of genetic circuit designs. Here, we introduced the SBOL ontology together with a human-readable textual DSL for SBOL. This DSL is based on Tawny-OWL and the Clojure programming language, providing users with an extensible and interactive environment to add new design information when it becomes available, to query design information, and to create logical constraints. As the design-build-test cycle of engineering biological systems can take several iterations and span long timescales, this constraint-based approach will help to achieve desired systems and to evolve designs in a controlled manner.

REFERENCES
Bartley, B., Beal, J., Clancy, K., Misirli, G., Roehner, N., Oberortner, E., Pocock, M., Bissell, M., Madsen, C., Nguyen, T., Zhang, Z., Gennari, J. H., Myers, C., Wipat, A., and Sauro, H. (2015). Synthetic Biology Open Language (SBOL) Version 2.0.0. Journal of Integrative Bioinformatics, 12(2), 272.
Lord, P. (2013). The semantic web takes wing: Programming ontologies with Tawny-OWL. arXiv preprint arXiv:1303.0213.
Misirli, G., Hallinan, J., Pocock, M., Lord, P., McLaughlin, J. A., Sauro, H., and Wipat, A. (2016). Data integration and mining for synthetic biology design. ACS Synthetic Biology, 5(10), 1086–1097.
Pocock, M., Taylor, C., McLaughlin, J., Misirli, G., and Wipat, A. (2016). ShortBOL: A shorthand for SBOL. In 8th International Workshop on Bio-Design Automation.



