Hazard Estimation and Method Comparison with
             OWL-Encoded Toxicity Decision Trees

          Leonid L. Chepelev¹, Dana Klassen¹, and Michel Dumontier¹,²,³

     ¹ Department of Biology, ² Institute of Biochemistry, and ³ School of Computer Science,
             Carleton University, 1125 Colonel By Drive, K1S 5B6, Ottawa, Canada
                  {leonid.chepelev, dana.klassen, michel.dumontier}@gmail.com


          Abstract. Industrial and regulatory evaluation of chemical toxicity is often
          done via statistical analysis of chemical features focusing on chemical structure
          and function. One popular method to characterize chemical toxicity involves the
          development of decision trees based on large sets of empirical toxicological
          data where chemicals are assigned toxicity or activity classes. In this paper, we
          describe the representation of decision trees as OWL ontologies that can be
          used to carry out initial evaluation of toxicity and activity of prospective
          chemical products. We further discuss how trees derived from different datasets
          can be semantically compared by examining the logical equivalence of the
          toxicity and bioactivity classes in different trees. Taken together, this initial
          work forms the basis for continued investigation into an OWL-driven semantic
          framework for toxicity evaluation.
          Keywords: Chemical Hazard Estimation, Computational Toxicology, Decision
          Trees


1 Introduction

Our industrialized society relies on millions of diverse chemical entities in
applications as broad as energy production, combating disease, and manufacturing. As
novel chemicals are developed and as industrial processes evolve, we become heavily
exposed to an increasingly diverse pool of environmental pollutants and their poorly
characterized by-products. The resource commitment necessary to fully characterize
the toxicity of even a single chemical entity experimentally is very substantial. Since
the pool of chemicals in need of routine toxicity screening by organizations such as
environmental protection agencies and pharmaceutical companies is practically
infinite and the resources for this task are often scarce, alternative means of toxicity
screening are often applied to prioritize compound screening or alert chemical
researchers to the potential adverse effects of their molecule of interest, especially in
the early stages of compound development.
   Such predictive in silico approaches may be broadly characterized into two major
categories: data-driven systems and expert systems [1]. Data-driven systems involve
the generation of mathematical models (regression, neural network, or any other
method) to correlate computed or observed physicochemical molecular properties to
their experimentally obtained functional characteristics, such as toxicity, binding
affinity for a given enzyme, or biological activity of a given type. The results of
data-driven systems are quantitative structure-activity relationships (QSARs) that are
specific to the class of compounds represented in the training set and are often
difficult to interpret logically or to integrate, even for a human operator.
   Expert-based systems, on the other hand, strive to capture the knowledge of human
toxicology experts into machine-readable models with the aim of automating
chemical classification and chemical information analysis. Expert-based systems can
take a number of forms, among which rule-based and decision tree-based systems are
quite prominent. Rule-based systems rely on the formulation of a number of
independent rules that can be integrated to construct a logical conclusion about the
toxicity or activity of a given compound. Decision tree-based systems involve the
sequential execution of a series of logical tests, with each branch point of the tree
containing a logical test, and each leading either to a final classification, or a deferral
to further tests (Fig. 1).


Fig. 1. A simple toxicity decision tree: at each branching point, a rule is evaluated, and based on the
outcome of this rule, either a final activity decision is made, or judgment is deferred to another node.
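
The sequential evaluation described above can be sketched in a few lines of Python. This is only an illustration of the general idea: the rule names (`has_nitro_group`, `has_epoxide`) and the tree shape are hypothetical and are not taken from any tree discussed in this paper.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # final activity decision, e.g. "toxic"


@dataclass
class Node:
    rule: str  # name of the Boolean rule evaluated at this branch point
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def classify(tree: Union[Node, Leaf], features: Dict[str, bool]) -> str:
    """Walk from the root, deferring to child nodes until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if features[node.rule] else node.if_false
    return node.label


# Hypothetical two-rule tree, for illustration only.
tree = Node(
    rule="has_nitro_group",
    if_true=Leaf("toxic"),
    if_false=Node(rule="has_epoxide", if_true=Leaf("toxic"), if_false=Leaf("non-toxic")),
)

print(classify(tree, {"has_nitro_group": False, "has_epoxide": False}))
```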

   Since their introduction four decades ago [2], decision tree-based toxicity and
activity prediction systems have gained acceptance by academics and industrial
researchers alike, finding applications in predicting molecular properties such as
mutagenicity, toxicity, and skin sensitization among others [3]. Furthermore,
automated objective methods have appeared to emulate the work of human experts by
creating decision trees in which rules and tree structures are drawn based on the
analysis of empirical toxicity data [4]. Aside from simplifying and automating
classification efforts, and unlike data-driven toxicology prediction systems, decision
tree-encoded expert knowledge is understandable to humans and machines alike.
   Unfortunately, the potential of OWL ontologies to formally capture and enact such
expert-based decisions in chemical toxicology and many other fields has not yet been
fully realized. Consequently, the decision frameworks and the supporting databases
for making such decisions are still largely fragmented along discipline, software, and
institutional divides. Since biological and chemical information is increasingly
standardized and integrated into the Semantic Web through initiatives like Bio2RDF
[5], we find ourselves at the point where OWL-based formalization of expert rule
bases and decision trees, combined with ready access to vast amounts of linked data
can yield unprecedented, tangible benefits in integrated bioactivity and toxicity
prediction and predictive method comparison and integration.
   In this work, we demonstrate the automated generation of biologically relevant
decision trees and their subsequent representation as OWL ontologies. We show how
the OWL ontologies can be used for classification over RDF-based linked data and
discuss the potential for the application of OWL-based decision trees on large RDF
chemical knowledgebases. Finally, we demonstrate the automated logical comparison
and integration of bioactivity/toxicity classes on the example of automatically derived
decision trees for drug-likeness and toxicity prediction. We believe that this work is
an important initial development in the formalization, standardization, and integration
of computational toxicology resources and predictive classification methods.

2 Methods

In order to explore the practical utility of decision trees for predictive chemical
toxicology, we first built decision trees using a popular toolkit with experimental and
molecular features from a chemical carcinogenicity dataset. These trees were
converted to OWL ontologies, which were used in classification of RDF-based data
using automated reasoning. Finally, we demonstrate the possibility of inferring
toxicity/bioactivity class logical equivalence for different OWL-based decision trees.

2.1 Data Sources and Data Preparation

Our analysis made use of empirically and theoretically derived datasets. A
carcinogenic toxicity dataset, from which 1400 chemical entities were selected, was
obtained from the ToxCast database [6]. These compounds were either active or
inactive with respect to single cell mutagenicity. Then, 318 non-redundant features for
each molecule were computed using the ToxTree API [7] to determine a Boolean
value for each feature: true for feature presence and false for absence. These features
corresponded to rules at decision tree branch points: true if satisfied, and false if not.
  Features for the Rule of Five training set, consisting of 7000 compounds selected
from HMDB [10], were computed using the Chemistry Development Kit [8], and the
drug-likeness attribute was derived using the logical tests outlined by Lipinski [9].
Software and data are available upon request.
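
As an illustration of how a drug-likeness label can be derived from computed descriptors, the following Python sketch applies the four commonly cited Rule of Five cutoffs. It assumes that all four tests must pass (common usage often tolerates one violation), and it is not the authors' CDK-based code; the `Descriptors` type and the descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    molecular_weight: float  # in Da
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def is_drug_like(d: Descriptors) -> bool:
    """Assumed strict reading: every one of the four cutoffs must be satisfied."""
    return (
        d.molecular_weight <= 500
        and d.log_p <= 5
        and d.h_bond_donors <= 5
        and d.h_bond_acceptors <= 10
    )


# Made-up descriptor values, for illustration only.
small = Descriptors(molecular_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(molecular_weight=720.9, log_p=6.3, h_bond_donors=7, h_bond_acceptors=12)

print(is_drug_like(small), is_drug_like(large))
```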

2.2 Decision Tree Construction and Validation

Weka [11] was used to construct and validate binary decision trees using the
experimental and computed feature information. Decision trees were constructed
using the J48 algorithm [4]. We applied ten-fold cross-validation to derive a set of
statistical measures of tree predictive ability. Though these statistical measures are not
directly relevant for this work, they have been included as annotations on resultant
OWL-encoded decision trees for completeness. For the purposes of discussion in this
work, we generated five decision trees: Lipinski Rule of Five, modified Lipinski Rule
of Five, as well as trees resulting from different partitions of the ToxCast datasets.
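
J48 is Weka's implementation of C4.5 [4], which selects the rule for each branch point using the gain ratio criterion. The Python sketch below computes the gain ratio of a single Boolean feature. It is a simplified illustration rather than Weka's code: pruning, missing values, and numeric attributes are not handled, and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(feature: List[bool], labels: List[str]) -> float:
    """Information gain of a Boolean split divided by its split information."""
    total = len(labels)
    remaining = 0.0   # weighted entropy of the two partitions
    split_info = 0.0  # entropy of the partition sizes themselves
    for value in (True, False):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        if not subset:
            continue
        weight = len(subset) / total
        remaining += weight * entropy(subset)
        split_info -= weight * math.log2(weight)
    gain = entropy(labels) - remaining
    return gain / split_info if split_info > 0 else 0.0


# Toy data: one feature separates the classes perfectly, the other not at all.
labels = ["active", "active", "inactive", "inactive"]
informative = [True, True, False, False]
uninformative = [True, False, True, False]

print(gain_ratio(informative, labels), gain_ratio(uninformative, labels))
```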

2.3 Representation of Decision Trees as OWL Ontologies

OWL ontologies were constructed using the OWL API [12] from the decision tree
graphs represented in the DOT graph description language. Each decision node is
defined as equivalent to a class expression: the parent decision node intersected
with a restriction on the value (true or false) of the attribute that the parent node
tests (e.g. whether the substance contains an alcohol moiety). For example, consider
three substance classes (A, B, and C), where A is the parent and B and C are defined
by the value of the parent feature X. Given the Substance classes, a
‘has attribute’ object property, and a functional ‘has value’ datatype property, the
equivalent class expressions corresponding to Substance B and Substance C are:

Substance B EquivalentClass
       Substance A and ‘has attribute’ some (Attribute X and ‘has value’ true)
Substance C EquivalentClass
      Substance A and ‘has attribute’ some (Attribute X and ‘has value’ false)

EquivalentClass axioms were added to terminal nodes corresponding to the final
classification, e.g. toxic or non-toxic. This enabled us to reflect both the structure of
the decision tree and the formal axioms leading to the classification of a given
chemical entity into a given biological functional class. We did not include covering
axioms (e.g. stating that A is covered by the disjoint subclasses B and C) because we
wanted to avoid inconsistencies in some manually created trees, where multiple
classification outcomes may be possible and the most hazardous one is selected.
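
The mapping from tree edges to equivalent class expressions can be sketched as follows. This Python fragment only renders the expressions as text in the notation used above (with straight quotes); it does not use the OWL API or parse DOT files, and the `Edge` type is a hypothetical helper introduced for illustration.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    parent: str     # class name of the parent decision node
    child: str      # class name of the node reached through this edge
    attribute: str  # attribute (rule) tested at the parent node
    value: bool     # outcome of the test that leads to the child


def equivalent_class_axiom(edge: Edge) -> str:
    """Render one edge as an equivalent class expression for the child node."""
    value = "true" if edge.value else "false"
    return (
        f"{edge.child} EquivalentClass\n"
        f"    {edge.parent} and 'has attribute' some "
        f"({edge.attribute} and 'has value' {value})"
    )


edges: List[Edge] = [
    Edge("Substance A", "Substance B", "Attribute X", True),
    Edge("Substance A", "Substance C", "Attribute X", False),
]

for edge in edges:
    print(equivalent_class_axiom(edge))
```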

2.4 Ontology Integration and Comparison

For direct comparison of simple ontologies to logically identify predicted toxicity and
bioactivity class equivalence, we used the Pellet reasoner through the OWL API in
Java. We fused ontologies through a direct import and carried out ontology
classification using Pellet [13]. In cases where an equivalence or subclass relationship
between the final bioactivity or toxicity classes was identified, we noted this
relationship directly.

2.5 Chemical Classification

Molecular entities were instantiated using conventions set out by the Chemical Entity
Semantic Specification (CHESS) [14] and the Chemical Information Ontology
(CHEMINF) [15]. These entities annotated with chemical feature data were classified
using Pellet through the OWL API into the predicted toxicity classes using our
automatically generated OWL-based decision trees.

3 Results and Discussion

3.1 OWL-Based Decision Trees: Rule of Five

The first task that we addressed with our automated OWL ontology decision tree
generator was the construction of simple ontologies where the classification rules
involved the evaluation of numerical values associated with various molecular
descriptors. This is a fairly common mode of preliminary screening of large
compound datasets in initial stages of cheminformatics analysis. The decision tree
generated by Weka using computed data reproduced the Rule of Five criteria (Fig. 2).


Fig. 2. A decision tree generated from a computationally derived dataset of drug-like compounds. Drug-
like compound classification is indicated as true. Correctly classified molecule counts are given in brackets.
No classification was incorrect.

   It was no surprise that the Rule of Five criteria (used here as an example, not as a
practical application), which we imposed on the computationally derived dataset, were
perfectly recovered after data-based decision tree construction in Weka. However,
this demonstrated that, given a sufficient amount of data with low levels of
noise, one can derive meaningful and useful numerical cutoff-based
decision trees, which can subsequently be converted to predictive ontologies.
   To carry out the conversion, we followed the scheme described in
Section 2.3 to obtain a set of substance classes that follow numerical cutoff rules,
such as the following.
   Substance_N1:
         Substance_N0 and has_attribute some (MolecularWeight and has_value
         some double[<= "500"^^double])
   Applying our generator yielded an ontology that fully
captured the Rule of Five decision tree (Fig. 3).

Fig. 3. The structure of an automatically generated OWL representation of a Rule of Five tree (Fig. 2).


3.2 OWL-Based Decision Trees: Large-Scale Boolean Feature-Based Trees

Unfortunately, biological information is often subject to extensive variation,
whether due to noise in experimental conditions or to the abundance of variable
parameters that may differ even within a single laboratory and experiment.
Compounding this is the limited availability of experimental data to characterize most
forms of biological activity, especially for experiments that are not high-throughput at
inception. As a result, real-world data are rarely as neatly classifiable as in the
decision tree above. However, our primary concern in this work has been the proof of
principle for the utility of OWL-based decision trees. To this end, we were able
to generate a number of usable trees with the full 318-feature set (not shown due to
complexity), as well as with more presentation-friendly limited feature sets (Fig. 4).
   Upon closer examination of such increasingly complex decision trees, we
identified several unanticipated classification challenges. The greatest surprise
came from the identification of the logical equivalence of several branches within
some of the generated trees. While this was considered entirely plausible at the
level of individual nodes, the subsequent identification of the logical equivalence
of the final toxicity and bioactivity classifications upon the application of reasoners to
our generated ontologies raised concerns over the validity and applicability
of our approach. Clearly, the equivalence of the class of toxic compounds to the class
of non-toxic compounds is neither an anticipated nor a desirable effect for an ontology
used to replace the existing classification systems. Further, in order to make the
decision tree more transparent, we needed a way to trace the logical path taken to
activity classification leaves, while still preserving broad activity classification capacity.

Fig. 4. A simplified carcinogenic toxicity decision tree generated from a ToxCast dataset, using a restricted
set of chemical features for ease of presentation. Note the repetition of some rules at multiple decision tree
nodes. The path taken to classify acetaminophen, as detected with the explanation functionality of Protégé,
has been highlighted with red arrows.

   After careful consideration of the logical explanation of the equivalence of these
practically distinct classes, we identified the cause of the problem to lie in the
repetition of rules within a single decision tree and the lack of the distinction between
the nodes that executed rules in a particular order. As such, it was quite possible to
arrive at a situation where, having ignored the context of the rest of the tree, the
classifier technically correctly assigned class equivalence between the toxic and non-
toxic compounds simply because parts of the paths taken to these classifications were
similar, while the other parts were not mutually exclusive.
   To rectify this problem, we recognized that node-specific classification rule
tracking had to be implemented. Thus, we amended our generator to include a local
set of node-specific classification features within a given ontology. This translated
into alterations to substance classifications, as follows.
Substance_N6:
   Substance_N0 and has_attribute some (RuleToxicFunctionalGroups_N0 and
   has_value value false)
   Note that what used to be the RuleToxicFunctionalGroups descriptor became the
RuleToxicFunctionalGroups_N0 descriptor. This amendment was effective in solving
our misclassification problem. However, the introduction of ontology-specific
descriptors would negate our ability to integrate and compare the different ontologies,
as well as to draw on existing repositories of chemical entities annotated with the
general standard descriptors and features. To rectify this ontology comparison deficiency,
we created versions of our decision tree ontologies in which node-specific rules
were explicitly defined as subclasses of their generic counterparts. Similarly,
node-specific activity leaves were introduced to enable tracing of classification paths. Thus,
although we had to artificially distinguish activity categories and rules, we were still
able to query for the compounds falling into the general activity classes, as well as to
trace classification paths, important in e.g. automated toxicity tree comparison.
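
The node-specific naming and the accompanying subclass declarations can be sketched as follows. This Python fragment produces the axioms as plain text and is not the authors' OWL API-based generator; the node numbers and the `RuleAlcohol` and `active`/`inactive` assignments are hypothetical, while the `_N<id>` suffix follows the naming shown above.

```python
from typing import Dict, List


def node_specific(name: str, node_id: int) -> str:
    """Append the node suffix, e.g. RuleToxicFunctionalGroups -> RuleToxicFunctionalGroups_N0."""
    return f"{name}_N{node_id}"


def subclass_axioms(rule_by_node: Dict[int, str], leaf_by_node: Dict[int, str]) -> List[str]:
    """Declare each node-specific rule or activity leaf a subclass of its generic counterpart."""
    axioms = []
    for node_id, rule in sorted(rule_by_node.items()):
        axioms.append(f"{node_specific(rule, node_id)} SubClassOf {rule}")
    for node_id, activity in sorted(leaf_by_node.items()):
        axioms.append(f"{node_specific(activity, node_id)} SubClassOf {activity}")
    return axioms


# Hypothetical assignment of rules and activity leaves to tree nodes.
rules = {0: "RuleToxicFunctionalGroups", 2: "RuleAlcohol"}
leaves = {3: "active", 5: "inactive"}

for axiom in subclass_axioms(rules, leaves):
    print(axiom)
```

Because every node-specific class remains a subclass of a generic class, a query for the generic activity class still returns members of all its node-specific leaves.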

3.3 Chemical Entity Classification

While the above amendment permitted comparison between multiple ontologies and
still avoided erroneous class equivalence conclusions, it did not address drawing on
existing data repositories, as there is no direct inference that if a general rule bears a
particular value, there exists an instance of its subclass that bears the same value. Our
first, intuitive solution was to modify our generator to also create
ontologies in which the general rules were specified as subclasses of the node-specific
rules. This allowed us to automatically make the inference necessary to import
data from existing chemical knowledge repositories in RDF.
   However, upon carrying out the classification within such ontologies, we
found that, due to the equivalences introduced at the data
level, some of our instances could adopt both active and inactive
classifications. To rectify this problem, we defined node-specific final
classifications (e.g. active_N3), which were declared to be subclasses of the general
final classes (e.g. active and inactive) (Fig. 5).


Fig. 5. A fragment of the final, classification-friendly decision tree.

Classification was successfully carried out by querying whether a given instance
belonged to one of these general classes. Using the decision tree-based
ontologies thus constructed, we encountered no problems classifying numerous RDF-encoded
molecules bearing the requisite information. A sample OWL model is available [18].

  Feature                          Value
  -------------------------------  -----
  Acyl Transfer Rule               TRUE
  Rule Alcohol                     TRUE
  Rule Aldehydes                   FALSE
  Rule Aromatic Aldehyde           FALSE
  Rule Aromatic Azo                FALSE
  Rule Aryl Methyl Halide          FALSE
  Rule Tertiary Amine              FALSE
  Rule Toxic Functional Groups     FALSE
  SN2 Rule                         FALSE

Fig. 6. Relevant features of acetaminophen used in classification.

As an example, consider the case of acetaminophen, a known non-carcinogen. Its
attributes (Fig. 6) were imported from its CHESS [14] representation, and it was correctly
classified as inactive according to the decision tree presented earlier (Fig. 4). This
classification was also reproduced using numerical trees (omitted for brevity).
Further, unlike the traditional classification systems, which are essentially black
boxes, our approach has allowed us to automatically trace the exact route taken to
classifying acetaminophen as a non-carcinogen, using the explanation feature of
Protégé [16]. The fact that we created artificially distinct activity classes in our trees
did not prevent us from querying for chemical activity in terms of general categories.

3.4 Ontology Integration and Concept Comparison

Thanks to the automatically generated ontology structure (Section 3.2), it was
possible to integrate and compare multiple predictive toxicology ontologies in order
to identify equivalence or subclass relationships between their toxicity and bioactivity
classifications. Perhaps the easiest to demonstrate is the integration of two Rule of
Five-based ontologies. In one set, one of the requirements for a compound to be drug-
like was a molecular weight less than 500 Da (Fig. 2), while in another, small drug-
like compounds were introduced, with a molecular weight under 250 Da. Simple
import of one ontology into the other and classification with Pellet resulted in small
drug-like compounds inferred to be a subclass of drug-like compounds.
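
The intuition behind this inference can be illustrated with a much simpler check than full description logic reasoning. The Python sketch below assumes that each class is defined only by exclusive upper bounds on named descriptors. It is a stand-in for the idea, not a description of how Pellet works; the dictionary encoding is hypothetical.

```python
from typing import Dict


def is_subclass(narrow: Dict[str, float], broad: Dict[str, float]) -> bool:
    """True if every entity satisfying `narrow` necessarily satisfies `broad`.

    Each dictionary maps a descriptor name to an exclusive upper bound.
    """
    return all(
        descriptor in narrow and narrow[descriptor] <= limit
        for descriptor, limit in broad.items()
    )


# Molecular weight cutoffs from the two Rule of Five variants discussed above.
drug_like = {"MolecularWeight": 500.0}
small_drug_like = {"MolecularWeight": 250.0}

print(is_subclass(small_drug_like, drug_like))  # narrower cutoff implies the broader one
print(is_subclass(drug_like, small_drug_like))
```

A reasoner generalizes this check to arbitrary class expressions, which is why the comparison can be delegated to standard OWL tooling once the trees are expressed as ontologies.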

4 Conclusions

4.1 Significance

In this work, we have demonstrated for the first time the automated construction and
practical application of OWL-encoded decision trees in chemical toxicology. The
OWL ontologies that we generate can capture numerical cutoff-based rules, as well as
Boolean-based rules, and can be used to represent both automatically generated and
expert-generated decision trees. Using our approach, decision trees that form the basis for
predictive chemical toxicology classification and are either manually (expert-based
systems) or algorithmically (data-based systems) generated can be routinely converted
to OWL ontologies. Due to the explicit and formal specification of concepts within
these ontologies, toxicity and bioactivity classes can be exposed for comparison and
logical integration. In addition to this, these ontologies can also be easily applied to
classify chemical entities in the rapidly growing knowledgebases of RDF-encoded
chemical information. In replacing framework-, software-, and domain-specific
classification engines with standard OWL ontologies, we allow for the chemical
toxicology efforts to break free of their respective boundaries and support their
current shift towards the Semantic Web technologies. As this shift occurs, we are
confident that the work we present here will play an important role in informing
future efforts in integrating and analyzing the future Chemical Semantic Web to
support open, transparent, and reproducible chemical toxicology research.

4.2 Future Applications and Developments

This work marks a first step towards an OWL-based predictive toxicology framework
that is currently under development. In this framework, ontologies that capture
decision tree-based toxicology and bioactivity mathematical models are generated on
the fly from linked open data. These ontology-specified models will subsequently be
accessible for further automated classification of large collections of semantically
represented chemical entities. Preliminary results point to the possibility of logically
comparing formalized decision trees of multiple types so as to provide explanations
for [16] and to identify points of equivalence of toxicity and bioactivity classes.
Finally, the capture of classification statistics presents an interesting avenue to
explore probabilistic reasoning [17] using description logics which would be well
suited for toxicity prediction within a set of confidence intervals.

Acknowledgments. The authors are financially supported by a Discovery Grant from
the Natural Sciences and Engineering Research Council of Canada, a Health Canada
grant and CANARIE.

References
1. Helma, C.: In Silico predictive toxicology: The state-of-the-art and strategies to predict
   human health effects. Curr. Opin. Drug Discov. Devel. 8, 27-31 (2005)
2. Cramer, G.M., Ford, R.A., Hall, R.L.: Estimation of Toxic Hazard - A Decision Tree
   Approach. Food Cosmet. Toxicol. 16, 255-276 (1978)
3. Kroes, R., et al.: Structure based thresholds of toxicological concern. Food Chem. Toxicol.
   42, 65–83 (2004)
4. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)
5. Belleau, F. et al.: Bio2RDF: towards a mashup to build bioinformatics knowledge systems.
   J. Biomed. Inform. 41, 706-716 (2008)
6. Carcinogenic Potency Database. http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html
7. Patlewicz, G. et al.: An evaluation of the implementation of the Cramer classification
   scheme in the Toxtree software. SAR QSAR Environ. Res. 19, 495-524 (2008)
8. Steinbeck, C. et al.: The Chemistry Development Kit (CDK) J. Chem. Inf. Comput. Sci. 43,
   493-500 (2003)
9. Lipinski, C.A. et al.: Experimental and computational approaches to estimate solubility and
   permeability in drug discovery and development. Adv. Drug. Del. Rev. 46, 3–26 (2001)
10. Wishart, D.S., et al.: HMDB: a knowledgebase for the human metabolome. Nucleic Acids
    Res. 37, D603-D610 (2009)
11. Hall, M. et al.: The WEKA Data Mining Software: An Update. SIGKDD Explorations. 11,
    10-18 (2009)
12. Horridge, M., Bechhofer, S.: The OWL API. OWLED 2009, 6th OWL Experiences and
    Directions Workshop. Chantilly, Virginia, USA. (2009)
13. Sirin, E., Parsia, B., et al.: Pellet: A practical OWL-DL reasoner. Software Engineering and
    the Semantic Web. 5, 51-53 (2007)
14. Chemical Entity Semantic Specification. http://semanticscience.org/projects/chess/
15. CHEMINF. http://semanticchemistry.googlecode.com/svn/trunk/ontology/cheminf.owl
16. Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in OWL. In: Proc. of
    ISWC-08, LNCS. 5318, 323-338 (2008)
17. Klinov, P.: Pronto: A Non-monotonic Probabilistic Description Logic Reasoner. In: The
    Semantic Web: Research and Applications, LNCS. 5021, 826-830 (2008)
18. Semtox Project Page. http://semanticscience.org/projects/semtox/