<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Characterization of pathogenic germline mutations in Human Protein Kinases</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jose MG Izarzugaza</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lisa EM Hopcroft</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anja Baresic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christine A Orengo</string-name>
          <email>orengo@biochem.ucl.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew CR Martin</string-name>
          <email>andrew@bioinf.org.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfonso Valencia</string-name>
          <email>avalencia@cnio.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Structural and Molecular Biology, Division of Biosciences, University College London</institution>
          ,
          <addr-line>Gower Street, London WC1E 6BT</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO)</institution>
          ,
          <addr-line>C/Melchor Fernandez Almagro 3, E28029 Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Background: Protein Kinases are a superfamily of proteins involved in crucial cellular processes such as cell cycle regulation and signal transduction. Accordingly, they play an important role in cancer biology. To contribute to the study of the relation between kinases and disease we compared pathogenic mutations to neutral mutations. First, we analyzed native and mutant proteins in terms of amino acid composition. Secondly, mutations were characterized according to their potential structural effects and finally, we assessed the location of the different classes of polymorphisms with respect to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites. Results: Pathogenic Protein Kinase mutations perturb essential aspects of protein function, including disruption of substrate binding and/or effector recognition at family-specific positions. Interestingly, these mutations in Protein Kinases display a tendency to avoid structurally relevant positions, which represents a significant difference with respect to the average distribution of pathogenic mutations in other protein families. Conclusions: Disease-associated mutations display clear differences with respect to neutral mutations: several amino acids are specific to each mutation type, different structural properties characterize each class and the distribution of pathogenic mutations within the consensus structure of the Protein Kinase domain is substantially different from that of non-pathogenic mutations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Background</title>
      <p>
        Point mutations of nucleotide bases are a mechanism of crucial importance in the evolution of proteins, and
hence in the evolution of organisms. A biologically relevant class of point mutation, accounting for about
90% of sequence polymorphisms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] at an overall frequency of about one per 1000 bases [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is the single
nucleotide point mutation or PM. Traditionally, polymorphisms are classified according to their genomic
location into coding or non-coding. Coding PMs can be further classified depending on whether the
resulting protein product is changed owing to the genomic polymorphism. Non-synonymous PMs (nsPMs) are
those that alter the amino acid sequence of the protein product through either amino acid substitution or
the introduction of truncating mutations. We refer to those which generate a single amino acid substitution
as 'single amino acid polymorphisms' or SAAPs. In contrast, synonymous PMs (also referred to as silent or
sPMs) are those that do not alter the amino acid sequence of the protein product expressed. A particular
case of PMs corresponds to single nucleotide polymorphisms (SNPs): those germline mutations frequently
found (&gt;1%) in normal individuals and considered neutral. A major effort to catalogue and annotate SNPs
is dbSNP [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Although most amino acid changes are tolerated in the native protein structure, not all PMs
are neutral. An increasing number of mutations are being associated with aberrant phenotypes and
disease. Disease-associated mutations occur at much lower frequencies in the population and have a severe
effect on phenotype. Here, we use the term 'pathogenic deviation' (PD hereafter) to refer to any single base
change reported to be correlated with disease. Although both PDs and nsSNPs result in a change in the
expressed protein product, the former are reported to have a severe effect on phenotype whereas nsSNPs are
expected to have a non-deleterious phenotypic effect.
      </p>
      <p>
        About 1% of all human genes are known to contribute to cancer as a result of acquired mutations. The family
of genes most frequently contributing to cancer is the Protein Kinase gene family [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] which is implicated
in a huge number of tumorigenic functions including immune evasion, proliferation, antiapoptotic activity,
metastasis and angiogenesis, possibly due to the simplicity of the mechanism of attaching an ATP-derived
phosphate to a substrate protein [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Protein Kinases are one of the most ubiquitous families of
signaling molecules in the human cell, accounting for approximately 2% of the proteins encoded by the human
genome [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Protein Kinases show a wide-scale similarity both at sequence and structure level, attributable
to the fact that all kinases transfer the terminal phosphate of ATP to a serine, threonine or tyrosine residue
in a target protein. Empirical studies to date also suggest a common, with a few exceptions, catalytic
mechanism whereby ATP and an active site divalent cation are bound in identical manners and phospho-transfer
is carried out by a shared set of amino acids [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Studies [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] on yeast models have shown that kinases can
be very promiscuous, phosphorylating a huge number of different protein substrates albeit showing
remarkable specificity. This apparent contradiction suggests that kinases have a region committed to the general function
of catalysis, with another region (or regions) customizable to confer substrate specificity on the enzyme
without any particular need to alter fold, compromise ligand binding or modify the subsequent reaction
mechanism. Protein Kinases are a thoroughly studied protein family and a plethora of mutations have been
previously reported in the literature [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. These studies often include evidence of association with disease.
Concomitantly, several efforts [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] are devoted to the prediction of the pathogenicity of somatic kinase
mutations in cancer samples. These mutations are classified into two main categories: those that are involved
in cancer onset and development (driver mutations) and those that are biologically neutral (passenger
mutations). For a detailed review, see Baudot et al., 2009 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Previous work [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] characterized the preferential
distribution of cancer driver kinase somatic mutations in regions of importance for protein function,
where they can disrupt substrate binding and/or effector recognition at family-specific positions, while often avoiding
structurally relevant positions.
      </p>
      <p>The objectives of the work presented here are two-fold. Firstly, we wanted to clarify whether the trends
detected for driver somatic kinase mutations can be extended to other disease-related mutations independently
of the nature of the mutation. Secondly, we wanted to provide additional information to the discussion on
the interpretation of the role of kinase driver somatic mutations in the onset of cancer.
Consequently, we carried out a detailed multi-level comparative analysis of the differences between pathogenic
and neutral (not pathogenic) germline mutations within the framework of the human kinome: amino acid
composition of the polymorphisms was compared, mutations were characterized according to their potential
structural effects and finally, we assessed the location of the different classes of polymorphisms with respect
to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites.</p>
    </sec>
    <sec id="sec-2">
      <title>Results and Discussion</title>
      <sec id="sec-2-1">
        <title>Sequence features of deleterious kinase mutations</title>
        <p>We mapped 130 pathogenic deviations and 200 neutral SNPs to sequences within the Protein Kinase domain
(PD<sub>PK</sub>s and SNP<sub>PK</sub>s, respectively). The native residue in the pathogenic (PD<sub>PK</sub>) set was enriched in
glycines (p=0.01) and leucines (p=0.04) when the sequences were compared using a two-sided Fisher exact
test, whereas the set was enriched in prolines (p=3x10<sup>-5</sup>) when the mutant amino acids were considered. As
expected, in the SNP<sub>PK</sub> dataset none of the residue types were particularly enriched, either in the native
sequences or in the mutated ones. Considering the native and mutant as a residue pair, three mutations
were found more often in the PD<sub>PK</sub> dataset: leucine-proline (p=0), lysine-glutamate (p=0.02) and
arginine-proline (p=0.02). Again, in the SNP<sub>PK</sub> dataset, no significant enrichment was found for the
wild-type/mutated pairs. The complete set of results can be found in Supplementary Table S2.</p>
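        <p>As a minimal sketch of the kind of comparison described above, a two-sided Fisher exact test can be computed for a single residue type from a 2x2 contingency table (residue present or absent in each dataset). The function name and the glycine counts are hypothetical and only illustrate the calculation; only the dataset sizes (130 and 200) come from the text, and this is not presented as the authors' implementation.</p>
        <preformatted>
```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is no more probable than the observed one
    # (a small relative tolerance guards against floating-point ties).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if p_observed * (1 + 1e-7) >= prob(x)
    )
    return min(1.0, p_value)


# Hypothetical counts: native glycines among 130 pathogenic and 200 neutral mutations.
pd_gly, pd_total = 20, 130
snp_gly, snp_total = 12, 200
p = fisher_exact_two_sided(pd_gly, pd_total - pd_gly, snp_gly, snp_total - snp_gly)
print(f"two-sided p-value for glycine enrichment: {p:.4f}")
```
        </preformatted>
        <p>Repeating the test for each of the 20 amino acids, and for each native/mutant pair, would give the per-residue and per-pair p-values of the kind reported in Supplementary Table S2; any correction for multiple testing is left out of this sketch.</p>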
      </sec>
      <sec id="sec-2-2">
        <title>Hypothesized structural effects of deleterious kinase mutations</title>
        <p>
          SAAPdb [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] provides a characterization of the structural consequences of mutations. When these features
were compared, some differences between the groups appeared. PD<sub>PK</sub>s were often observed at the interface
(p=0.03), including sites of inter-chain binding as well as ligand binding. By contrast, SNP<sub>PK</sub>s more
often introduced empty spaces in the core of the protein (p=0.04). Comparing Protein Kinase pathogenic mutations
with pathogenic mutations in general (PD<sub>PK</sub>/PD<sub>nPK</sub>) produced striking results: PD<sub>nPK</sub>s were
more often explained by structural analyses, in that the modified residues affected stability, affected functional
residues, introduced an empty region in the interior of the protein, and affected interaction-prone positions
(as annotated in MMDBBIND). By contrast, PD<sub>PK</sub> mutations were not significantly related to any of
those categories. A complete description of these results can be found in Supplementary Table S3.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Proximity of deleterious kinase mutations to kinase-specific protein features</title>
        <p>
          We present here the results of mapping the different types of mutations, PD<sub>PK</sub>s and SNP<sub>PK</sub>s, onto a
representative structural model from the Protein Kinase superfamily. The mutations were analyzed in terms
of their distribution relative to evolutionarily conserved positions and known functional regions. We were
able to map 47 positions containing at least one of the 62 PD<sub>PK</sub> mutations and 27 positions with at least
one of the 36 SNP<sub>PK</sub>s to the consensus model (Fig. 1) described in the Methods section and in previous
studies [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. The results of these analyses are summarized in Table 1.
        </p>
        <sec id="sec-2-3-1">
          <title>Proximity of kinase mutations to known functional regions</title>
        </sec>
        <sec id="sec-2-3-2">
          <title>a) Kinase mutations and the catalytic region</title>
          <p>
            The active region of Protein Kinases includes the ATP binding site, the peptide-substrate binding sites and
the catalytic loop implicated in the transfer of the phosphate group. We defined the kinase-binding site
as the set of residues extracted from the FireDB database [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ]. This definition includes 32 residues (Fig. 1)
that directly contact the ATP in the binding pocket, among them the five highly conserved residues that
play a critical role in positioning ATP and stabilizing the active conformation in the catalytic mechanism [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ]
(see Fig. 1). The distance distribution histograms (Fig. 2, panels A and B) and the results in Table 1 (see
also Fig. S2, where the PD<sub>PK</sub>s, SNP<sub>PK</sub>s and catalytic residues are represented) showed a very strong
tendency of PD<sub>PK</sub>s to be located, if not at catalytic residues, at least close to them. Indeed, 13 out of the 32
residues in the binding pocket were annotated as pathogenic deviations whereas only three were annotated
as neutral SNP<sub>PK</sub>s. Moreover, 2 out of the 5 residues described as essential for the correct functioning of
the ATP binding pocket [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] were annotated as PD<sub>PK</sub>s.
          </p>
        </sec>
        <sec id="sec-2-3-3">
          <title>b) Kinase mutations and regions of functional subspecificity</title>
          <p>
            In this study we used the position of the tree-determinant residues as a proxy for functionally important
regions in Protein Kinases, particularly those related to the specific functions of each of the subfamilies.
Residues specific to the various subfamilies of Protein Kinases were identified for each of the eight subfamilies
into which KinBase categorizes the human kinome [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]. Our recent implementation of the sequence-space
approach, S3det [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ] identified 32 unique positions in the model as containing information relevant for
differentiating between subfamilies; that is, residues that tend to be conserved in the specific subfamilies and
vary to different degrees in the others (Fig. S3 depicts the distribution of the mutations along with these
tree-determinant residues). Out of the 32 tree-determinants in the model, five were annotated as pathogenic
(residues 50, 173, 190, 217 and 233 in the generated structural model) whereas only two were annotated
as neutral. Pathogenic deviations, when not located exactly at tree-determinant positions, clustered around
tree-determinant residues in general. This tendency was especially relevant for tree-determinants in the ATP
binding pocket, but also appreciable in the other function-specific tree-determinants. PD<sub>PK</sub>s were closer
to tree-determinant positions than neutral SNP<sub>PK</sub>s (Fig. 2C). This was also clear in the difference of Xd
values (-1.98), indicating the existence of significant differences between PD<sub>PK</sub>s and SNP<sub>PK</sub>s with respect
to proximity to positions characterized as important for the function and subspecificity of the kinases.
          </p>
        </sec>
        <sec id="sec-2-3-4">
          <title>Proximity of kinase mutations to the protein core</title>
          <p>With the accessibility parameters defined in Methods, 99 residues were classified as buried in the kinase
structural model; 20 of these buried residues were annotated as PD<sub>PK</sub>s and 12 were annotated as SNP<sub>PK</sub>s
(three residues, 58, 197 and 218, are described in both datasets) (Fig. S4). The distribution of distances
(Fig. 2D) shows a clear tendency of PD<sub>PK</sub>s to be closer to buried residues. Indeed,
the analysis of the mean distances of mutated positions to buried residues (2.94 Å and 3.63 Å, respectively)
revealed a tendency for the pathogenic deviations (PD<sub>PK</sub>s) to be closer to buried residues than the neutral
polymorphisms. This was supported by an Xd difference of -0.83.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Examples of well-characterized disease-associated mutations affecting kinase function</title>
        <p>
          The statistically significant results obtained from analyzing the sequence features of the single amino acid
polymorphisms (see the section on sequence features of deleterious kinase mutations) are supported by several
examples in the literature. Karkkainen et al. (2000) [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] associated human hereditary lymphedema, a
particular case of lymphatic obstruction, with mutations in the vascular endothelial growth factor receptor 3
(VEGFR-3). The endothelial cells lining blood and lymphatic vessels depend on signal transduction mediated
by specific receptor tyrosine kinases for their differentiation into a primary vascular plexus (vasculogenesis)
and for the sprouting and splitting of new capillaries from previously existing vessels (angiogenesis). Two
mutations were reported as the cause of the disease, due to the loss of tyrosine kinase activity and hence
impaired downstream signaling, and a slower rate of internalization of the receptor. The suggested model
highlights a mutation to proline of the second arginine in the highly conserved HRDLAARN motif in the catalytic
site, a residue that facilitates the hydrogen bond between aspartate and a hydroxyl group from ATP at the
binding pocket. This aspartate is critical for protein function as it is believed to act as the catalytic base
in the phosphotransfer reaction. Moreover, their study revealed that a mutation from leucine to proline
disturbed protein function through both a loss of structural integrity and an interference with
ATP binding. Since the mutated residue is located in the middle of a β-strand, introducing a residue with restricted
mobility such as proline disrupts the interaction with the surrounding β-strands, destabilizing the
protein fold in the region. In addition, the sidechain of this leucine is part of the adenine-binding pocket;
alterations at such a relevant position therefore modify the shape of the cleft and might interfere with ATP
binding. With respect to the significant lysine-glutamate mutation, Mao et al. (2001) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] described the
commonly cited mutation K430E (amongst others) in Bruton's tyrosine kinase (BTK, Q06187), speculating
on how it might cause severe XLA (X-linked agammaglobulinemia), and providing a solved
crystal structure. BTK is expressed in B cells and its kinase activity is crucial in proliferation
and differentiation to mature B lymphocytes. According to their mechanism, upon the trans-phosphorylation
of Y551, the highly conserved K430 facilitates a hydrogen bond with E445, causing a shift of the C helix.
The K430E mutation disables the C helix repositioning, which is crucial for the catalytic activity of BTK,
hence impairing the creation of mature B cells.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>Assessing the possible functional role of relevant kinase mutations by their sequence-structure characteristics</title>
        <p>
          Our general analysis of the distribution of mutations in Protein Kinases not only provided an overview of their
relation with function and structure, but also provided an insight into their specific biomedical implications.
In the work presented here, we summarized all the knowledge accumulated for the Protein Kinase domain in a
single framework structure (Fig. 1) under the assumption that regions important for the structure/function of
the kinases are common to the whole family and hence they can be used as a reference for the interpretation
of the mutations in any of the individual kinases. The accumulation of information clearly increases the
significance of the results provided and makes the distribution of the polymorphisms more reliable and
accurate. For instance, mutations in the insulin receptor gene in humans (INSR) have been reported as
disease associated in the literature. Defects in INSR are the cause of insulin-resistant diabetes mellitus
with acanthosis nigricans type A (IRAN type A, MIM:610549), a syndrome characterized by the association
of severe insulin resistance, manifested by marked hyperinsulinemia and a failure to respond to exogenous
insulin, with the skin lesion acanthosis nigricans and ovarian hyperandrogenism in adolescent female patients.
The relation of the disease with mutations in kinases is thoroughly reported in OMIM. Recent studies have
further characterized the relationship between insulin resistance and disease (for example, [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]). Moreover,
several studies have associated acanthosis nigricans with mutations in other kinases, such as the fibroblast
growth factor receptors II and III [21-23]. Here, we identified A1161T (residue 173 in the model), a
well-characterized mutation that introduces an alanine-to-threonine substitution in the ATP binding pocket of INSR. This
mutation has been defined in our analysis as a pathogenic deviation. Concomitantly, it is a catalytic residue
in FireDB and is important for family specificity. This perturbation of the ATP binding pocket might
explain the impaired phosphorylation and therefore the reduced enzymatic activity leading to the aberrant
phenotype.
        </p>
        <p>
          Finally, we considered not only a single mutation but a pair of consecutive mutations, T341P and C342F
(positions 233 and 234 in the structural model, respectively), in the human fibroblast growth factor receptor
type 2 (FGFR2) to demonstrate that although only T341P is reported to be a pathogenic deviation in our
dataset, the analysis can provide insights into complex diseases caused by more than one mutation at the
same time. Several diseases caused by uncontrolled cell growth have been associated with defects in FGFR2.
Among them, two related syndromes, Pfeiffer syndrome (PS) and Crouzon syndrome (CS), have been
reported to be associated with the pair of mutations of interest (T341P in PS and CS, C342F in CS). In our
analysis, we described T341P as a mutation that introduces a change from threonine to proline in a buried
sequence-conserved position. Other authors [
          <xref ref-type="bibr" rid="ref15 ref24">15, 24</xref>
          ] have shown that mutations to proline are very often
associated with disease. Additionally, we have characterized mutation C342F as a replacement of a buried
residue. We also identified both mutated positions as tree-determinant residues, which are considered to be related to
binding specificity; this indicates the importance of these two positions for protein function.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusions</title>
      <p>
        We have analyzed point mutations in the structure of Protein Kinases in order to characterize the structural
and functional singularities of pathogenic and neutral mutations. Although the definition of the groups is
by no means stable and the groups are constantly being redefined as new studies on the pathogenicity of
mutations arise, this classification can be used as a proxy to deepen the knowledge of the underlying mechanisms of
disease. The human kinome is particularly amenable to this type of study since much is known about
the structure and function of this protein family and very relevant cancer-associated mutations have been
published for these proteins [
        <xref ref-type="bibr" rid="ref11 ref12 ref14">11, 12, 14</xref>
        ].
      </p>
      <p>
        To address this point, pathogenic deviations mapped to the kinase domain (PD<sub>PK</sub>) and single nucleotide
polymorphisms mapped to the kinase domain (SNP<sub>PK</sub>) were compared on the basis of sequence,
hypothesized structural effect and proximity to known kinase-specific features within the framework of a modeled
consensus structure, representative of the whole superfamily. At the sequence level, several mutations were
differentially observed in the PD<sub>PK</sub> and SNP<sub>PK</sub> datasets. The leucine-proline mutation emerged as an
interesting feature of the PD<sub>PK</sub> dataset: it was identified both when analyzing the native and mutant
residues separately, and when analyzing the mutation pairs, and it has been identified as being indicative
of disease elsewhere, both specifically in a kinase disease dataset [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and in wider, non-disease-specific
datasets [
        <xref ref-type="bibr" rid="ref15 ref25 ref26">15, 25, 26</xref>
        ]. In addition, replacing an arginine with a proline was more often observed in the PD<sub>PK</sub>
dataset. Again, this mutation has been described as being associated both with diseases predicted to be
related to mutations in kinases [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and across all other diseases [
        <xref ref-type="bibr" rid="ref15 ref26">15,26</xref>
        ]. Finally, replacing the positive charge
of lysine with the bulkier, negatively charged glutamic acid sidechain is identified in the PD<sub>PK</sub> dataset.
Unlike the well-characterized previous mutations, this is a novel observation. These findings provide evidence
of distinctive differences between the two types of mutations even at such a coarse-grained
level as the amino acid composition.
      </p>
      <p>
        In the second part of the analysis we focused on the singularities of the mutations in structurally and
functionally relevant regions. Thus, we characterized PD<sub>PK</sub> mutations in terms of their structural consequences
by comparing them to SNP<sub>PK</sub> mutations. PD<sub>PK</sub> mutations were more likely to occur at the interface with
ligands and other protein chains and SNP<sub>PK</sub> mutations were more likely to introduce a cavity in the protein
core. This is consistent with previous publications concluding that disease-associated mutations in kinases
often affect the site of ATP binding [
        <xref ref-type="bibr" rid="ref16 ref18 ref19">16, 18, 19</xref>
        ]. In addition, PD<sub>PK</sub> mutations were compared to other PD
mutations (PD<sub>nPK</sub>) to comment on the mechanisms by which mutations in kinases might lead to disease.
In general, it is easier to explain PD<sub>nPK</sub> in structural terms. Most noticeably, very few PD<sub>PK</sub> mutations
introduced a significant cavity (an empty space) in the protein core or affected protein stability at all. An
increasing body of literature attributes the pathogenic nature of PDs to their destabilizing effect on native
protein structure [15, 27-30]. The results here, specific to the mutations in the Protein Kinase domain,
contradict this trend, indicating that the pathogenicity of kinase PDs cannot be simply attributed to
decreased stability. This might be explained by the fact that Protein Kinases are known to be a highly
structurally conserved superfamily, while varying significantly with respect to sequence; as such, the
structures must be tolerant to sequence variation. Hence, considerable flexibility must exist within the protein
structure to be robust to sequence diversity. Although outside the scope of this work, it remains interesting
to analyze other protein families at a genome-wide scale to determine whether this is just a singularity of
the Protein Kinase family.
      </p>
      <p>
        In order to provide more detailed insight into the distinctive implication of pathogenic mutations in the
disruption of protein function, we characterized the differences between PD<sub>PK</sub>s and SNP<sub>PK</sub>s by their
association with kinase-specific functional and structural features. There were significant differences between the
two types of mutations in terms of conservation, accessibility, distance to active/binding site residues and
distance to family-specific binding sites. It is interesting to compare these results with the ones previously
obtained for driver/passenger mutations in Protein Kinases [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Drivers are those mutations predicted to
be involved in cancer onset whereas passenger mutations are those that are supposed to accumulate
during cancer progression while being neutral with respect to the origin of the cancer. In the previous work, the main
conclusion was that driver mutations tended to be located near important residues such as sequence-conserved
positions, family-specific regions and active/catalytic sites whereas passenger mutations were located
closer to structurally conserved regions. The difference observed here for the set of PD<sub>PK</sub>s and SNP<sub>PK</sub>s
mimicked the one previously observed for driver and passenger mutations. The two datasets proved to be
non-redundant. Only 4 residues were common to both driver and pathogenic datasets, all of them in the human
B-raf proto-oncogene. The small overlap encountered is consistent with the very different nature (germline
and somatic) of the mutations in each of the disease-prone datasets. Thus, the new results presented here
can be seen as a confirmation of the functional/structural role of the mutations that are more likely to be
pathogenic (drivers and PDs). Given that the definition of mutations as drivers and passengers is somewhat
controversial (for a review, see Baudot et al., 2009 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]), it is good to see that the current results could be
interpreted as an indirect support to the categorization of mutations into drivers (disease associated) and
passengers (disease neutral) and their use as a proxy for the study of the involvement of somatic mutations
in cancer biology.
      </p>
      <p>In summary, we have confirmed that the pathogenicity of mutations within the Protein Kinase superfamily
is related to essential aspects of protein function, including perturbation of substrate binding and
recognition by effectors at family-specific positions. Indeed, pathogenic deviations accumulate in key
functional regions whereas they seem to be absent from structurally relevant positions. These observations
reinforce the idea that the pathogenicity of disease-associated mutations can be attributed to a disruption of
native protein function while avoiding drastic changes prone to disrupt the protein globally. This tendency
was not observed for neutral polymorphisms, which are apparently less disruptive of protein function and
tend to be tolerated. Consequently, they are more often found in normal individuals. However, it is clear
that further classification of the mutations into more specific subgroups will be necessary to provide a deeper
knowledge of the mechanisms leading to disease. A typical example would be the large-scale characterization of
mutations as gain-of-function or loss-of-function.</p>
      <p>The analysis presented here provides not only a characterization of the mutations, but also in some cases
additional insight into the specific biomedical implications of the mutations. This type of approach will be
particularly useful as part of the bioinformatics platform developed for the International Cancer Genome
Consortium and other cancer genome projects.</p>
    </sec>
    <sec id="sec-4">
      <title>Methods</title>
      <sec id="sec-4-1">
        <title>Protein Kinase domain sequences</title>
        <p>
          The KinBase resource [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] is a repository of the currently accepted classification of eukaryotic Protein
Kinases. At the time of the analysis, KinBase contained 620 human protein sequences, of which 516 were
Protein Kinases not defined as pseudogenes by the database curators. Although it has been described that
some kinase pseudogenes are transcribed and might even have a residual or scaffolding function [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], kinase
pseudogenes were not considered in the analysis performed here. KinBase does not map directly onto
UniProtKB [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. The mapping was performed using BlastP [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] for each sequence against a custom database
containing all human entries in UniProtKB annotated with a Protein Kinase domain. We were able to map
488 KinBase identifiers to a valid UniProtKB entry, 474 of them (97.13%) at sequence identity levels of at
least 95%.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Classification of the mutations</title>
        <p>
          SAAPdb [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] is a database of single amino acid polymorphisms (SAAPs) mapped to protein structure.
SAAPdb aims to provide likely structural effects of mutations and identify differences in potential structural
consequences between neutral and pathogenic mutations. The disease dataset is derived mostly from OMIM
[
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] whereas the neutral dataset comes from dbSNP. For the Protein Kinase domains of the 488 kinases,
SAAPdb contains 130 pathogenic deviations (PD<sub>PK</sub>s) and 200 neutral polymorphisms (SNP<sub>PK</sub>s). Of these
130 PD<sub>PK</sub>s, 62 were successfully mapped to a residue in a solved PDB structure. Similarly, of the 200
SNP<sub>PK</sub>s mapped to sequence, 36 were mapped to a PDB structure (see Supplementary Table S1). To
create a control dataset, 9263 non-kinase PDs (PD<sub>nPK</sub>) were retrieved from SAAPdb, of which 4652
mapped to a structure. All three datasets contain only unique, non-synonymous (missense) mutations. We
excluded nonsense and synonymous mutations because these have, respectively, a known truncating effect or no effect on
the protein structure. A unique mutation was defined by the combination of four parameters: UniProtKB
accession number, sequence position, native amino acid and mutated amino acid.
        </p>
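        <p>The four-parameter definition of a unique mutation can be illustrated with a minimal sketch. The record type, the function name and the example accession are hypothetical and are not taken from SAAPdb; the sketch only shows how repeated records collapse when the four fields are used together as a key.</p>
        <preformat>
```python
from typing import Iterable, NamedTuple


class Mutation(NamedTuple):
    """One missense mutation, identified by the four parameters in the text."""
    accession: str  # UniProtKB accession number
    position: int   # sequence position
    native: str     # native amino acid (one-letter code)
    mutant: str     # mutated amino acid (one-letter code)


def unique_mutations(records: Iterable[Mutation]) -> list[Mutation]:
    """Drop repeated records, keeping the first occurrence of each mutation."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


# Hypothetical records: the second entry repeats the first one, whereas the
# third differs in the mutated amino acid and therefore counts as distinct.
records = [
    Mutation("ACC001", 600, "V", "E"),
    Mutation("ACC001", 600, "V", "E"),
    Mutation("ACC001", 600, "V", "K"),
]
deduplicated = unique_mutations(records)
```
        </preformat>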
      </sec>
      <sec id="sec-4-3">
        <title>Comparing mutations with respect to the native and/or mutant residues</title>
        <p>To compare the mutations with respect to their sequence features, the native and mutant residues were
extracted from SAAPdb, as described above. Two-sided Fisher exact tests were carried out since they allow
robust comparison of datasets of disparate sizes and evaluation of contingency tables with empty cells.</p>
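        <p>As an illustration of the test, the following sketch computes a two-sided Fisher exact p-value for a 2x2 contingency table using only the standard library. It is not the code used in the analysis: the function name is hypothetical, the counts in the example are made up, and the two-sided p-value is assumed to be the sum of the probabilities of all tables that are no more probable than the observed one, which is one common convention.</p>
        <preformat>
```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x: int) -> float:
        # Probability that the top-left cell equals x given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    observed = probability(a)
    tolerance = 1e-9               # guards against floating-point ties
    total = sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if observed * (1 + tolerance) >= p
    )
    return min(1.0, total)


# Made-up counts: rows are two datasets, columns are the presence or absence
# of a residue type. Tables with empty cells are handled in the same way.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
p_value_empty_cells = fisher_exact_two_sided(5, 0, 0, 5)
```
        </preformat>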
      </sec>
      <sec id="sec-4-4">
        <title>Structural effects of mutations</title>
        <p>
          The SAAPdb hypothesized structural effects are fully described elsewhere [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] and are summarised briefly
below. (a) Mutations affecting stability: mutations on the surface of proteins that replace a hydrophilic
residue with a hydrophobic residue are identified as introducing unfavourable hydrophobicity on the surface.
Similarly, mutations in the core that replace a hydrophobic residue with a hydrophilic residue are identified
as introducing unfavourable hydrophilicity in the core. Buried mutations that create a charge shift are also
identified. Using a geometric analysis of PDB structures, SAAPdb identifies mutations that affect potential
disulphide bonds. Mutations that introduce large cavities in the protein core or that break hydrogen bonds are
identified as well. (b) Mutations affecting folding: unfavourable (with respect to torsion angles) mutations
from cis-proline, from glycine and to proline are identified. Mutations that would clash with existing residues
are also identified. (c) Mutations to UniProtKB annotated residues: mutations at the site of residues annotated
by UniProtKB as functionally relevant. (d) Mutations to binding residues: PDB structures are analysed
to identify residues binding to proteins, DNA or small molecules. These data are augmented by data
from MMDBBIND. (e) Mutations disrupting the quaternary structure. (f) Mutations to sequence conserved
residues. In addition, we provide results for two summary categories: structurally explained (where
at least one structural explanation is true, i.e., any explanation apart from sequence conservation) and
explained (where at least one explanation is true). To compare the mutations with respect to their
structural effects, a binary explanatory vector was calculated for each mutation and a two-sided Fisher exact
test was carried out.
        </p>
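        <p>The two summary categories and the counts entering each Fisher test can be sketched as follows. The category names and function names are hypothetical labels chosen for this example, one per explanation group (a) to (f); they are not SAAPdb field names.</p>
        <preformat>
```python
# Hypothetical category names, one per explanation group (a)-(f) in the text.
STRUCTURAL_CATEGORIES = ("stability", "folding", "annotated_site", "binding", "quaternary")
CONSERVATION_CATEGORY = "sequence_conserved"


def summarise_explanations(flags: dict) -> dict:
    """Add the two summary categories to a per-mutation dictionary of booleans."""
    structurally_explained = any(
        bool(flags.get(name, False)) for name in STRUCTURAL_CATEGORIES
    )
    explained = structurally_explained or bool(flags.get(CONSERVATION_CATEGORY, False))
    return {
        **flags,
        "structurally_explained": structurally_explained,
        "explained": explained,
    }


def contingency_table(set_a: list, set_b: list, category: str) -> list:
    """2x2 counts (with and without the category) for two sets of mutations."""
    def counts(mutations: list) -> list:
        positive = sum(1 for flags in mutations if flags.get(category, False))
        return [positive, len(mutations) - positive]

    return [counts(set_a), counts(set_b)]


# Made-up mutations: one explained only by conservation, one by binding.
conserved_only = summarise_explanations({"sequence_conserved": True})
binding_hit = summarise_explanations({"binding": True})
table = contingency_table(
    [conserved_only, binding_hit], [conserved_only], "structurally_explained"
)
```
        </preformat>
        <p>Each resulting table has one row per dataset and can be passed to any two-sided Fisher exact test implementation.</p>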
      </sec>
      <sec id="sec-4-5">
        <title>Generation of a consensus model summarizing Protein Kinase structures</title>
        <p>
          A consensus model (Fig. 1) of the basic structure of the kinase domain was created. This consensus model
represents the average structure of a large number of kinases spread across the human kinome, and
it is therefore useful for summarizing global characteristics of the structures. To build the model we first
selected MAP3K1 as a standard representative sequence of the family from a manually curated multiple
sequence alignment of the human kinome constructed using the alignment package MUSCLE [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. The
selected sequence was submitted to Modeller [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ], which assembled the model from all the closely
related template PDB structures returned by a BLAST search. The predicted model has previously been
used as a consensus of the Protein Kinase domain [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Finally, mutations were transferred from their own
PDB coordinates to the consensus model for comparison.
        </p>
      </sec>
      <sec id="sec-4-6">
        <title>Calculation of Important Regions</title>
        <sec id="sec-4-6-1">
          <title>Calculation of Accessibility</title>
          <p>
            NACCESS (Hubbard, unpublished) is a stand-alone program that calculates accessible areas by rolling a
probe of a given radius over the van der Waals surface of the molecule. A residue is defined as buried if 16%
or less of the residue's surface is exposed to the probe. This is a common threshold [
            <xref ref-type="bibr" rid="ref37">37</xref>
            ] that ensures a
reasonable number of buried residues.
          </p>
        </sec>
        <sec id="sec-4-6-2">
          <title>Definition of Catalytic Sites</title>
          <p>
            The FireDB database [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ] contains a comprehensive curated set of substrate binding and catalytic residues,
extracted directly from the PDB [
            <xref ref-type="bibr" rid="ref38">38</xref>
            ] or from the Catalytic Site Atlas [
            <xref ref-type="bibr" rid="ref39">39</xref>
            ]. FireDB binding residues for the
various kinases were mapped into the general model using the corresponding multiple structure alignments.
          </p>
        </sec>
        <sec id="sec-4-6-3">
          <title>Prediction of Tree Determinant positions</title>
          <p>
            S3det [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ] is an algorithm for detecting groups of proteins within a family with potential functional
specificities and for identifying the residues that are characteristic of each group. S3det is based on the
simultaneous quantitative analysis of sequences and residues within a multiple sequence alignment on related
multidimensional spaces. Those residues associated with specific sequence subfamilies tend to be
determinants of functional specificity and are located in functional regions of protein families, including substrate
binding sites, functional sites and protein interaction sites.
          </p>
        </sec>
      </sec>
      <sec id="sec-4-7">
        <title>Xd analysis</title>
        <p>
          To assess the significance of the proximity of different sets of mutations to areas of the protein (buried,
functional, conserved, etc.) we used the harmonic deviation, Xd, measure introduced previously [
          <xref ref-type="bibr" rid="ref40">40</xref>
          ].
        </p>
        <p>
          <disp-formula><label>(1)</label><tex-math>X_d = \sum_{i=1}^{n} \frac{P_{ic} - P_{ia}}{d_i \, n}</tex-math></disp-formula>
where n is the number of distance bins in the distributions, d<sub>i</sub> is the upper limit of bin i, P<sub>ic</sub> is the
percentage of residues in the set of interest with distance between d<sub>i−1</sub> and d<sub>i</sub>, and P<sub>ia</sub> is the same percentage for all residues
in the protein. Defined this way, positive values of Xd indicate that the population of residues is shifted
to smaller distances with respect to the population of all residues. In practice, we used a difference in Xd
values of 0.75 to indicate distributions of residues that are significantly different regarding their proximity
to previously defined areas of the protein. This threshold, albeit arbitrary, is based on manual inspection
of previous results and has proved valid in a similar context [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
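        <p>The Xd calculation can be sketched as follows, assuming that Xd is the sum over the n bins of the difference between the two percentages divided by the upper limit of the bin and by n, as the definitions above suggest. The function names, the bin limits and the distances are hypothetical, and placing distances beyond the last limit in the last bin is an assumption of this sketch rather than part of the published method.</p>
        <preformat>
```python
from bisect import bisect_left
from typing import Sequence


def bin_percentages(distances: Sequence[float], upper_limits: Sequence[float]) -> list[float]:
    """Percentage of distances falling in each bin (d_{i-1}, d_i]."""
    if not distances:
        raise ValueError("at least one distance is required")
    counts = [0] * len(upper_limits)
    for distance in distances:
        # Index of the first bin whose upper limit is not below the distance.
        # Distances beyond the last limit go to the last bin (assumption).
        index = min(bisect_left(upper_limits, distance), len(upper_limits) - 1)
        counts[index] += 1
    return [100.0 * count / len(distances) for count in counts]


def harmonic_deviation(
    subset: Sequence[float],
    all_residues: Sequence[float],
    upper_limits: Sequence[float],
) -> float:
    """Xd of a subset of residues relative to all residues in the protein."""
    n = len(upper_limits)
    p_c = bin_percentages(subset, upper_limits)
    p_a = bin_percentages(all_residues, upper_limits)
    return sum((pc - pa) / (d * n) for pc, pa, d in zip(p_c, p_a, upper_limits))


# Made-up distances (in Angstroms) to a feature, with three sorted bin limits.
limits = [4.0, 8.0, 12.0]
all_distances = [1.0, 3.0, 5.0, 6.0, 7.0, 9.0, 10.0, 11.0]
subset_distances = [1.0, 2.0, 3.0, 5.0]
xd = harmonic_deviation(subset_distances, all_distances, limits)
```
        </preformat>
        <p>In this example the subset lies closer to the feature than the residues as a whole, so Xd is positive; a subset with the same distribution as all residues gives an Xd of zero.</p>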
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Competing interests</title>
      <p>The authors declare that they have no competing interests.</p>
    </sec>
    <sec id="sec-6">
      <title>Authors' contributions</title>
      <p>AV, CO and AM conceived the idea. All authors planned the analysis. JMGI, LEMH and AB generated
the datasets. JMGI and LEMH performed the analysis. All authors discussed the results. JMGI, LEMH
and AV wrote the manuscript. All authors read and approved the manuscript.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The authors want to thank D. Juan, A. Rausell, E. Leon, A. Carro, G. Lopez, O. Redfern and M. L. Tress
for their help, interesting discussions and ideas. This work was supported by Grant BIO2007-66855 from the
Spanish Ministerio de Ciencia e Innovacion.</p>
      <p>Fig. 1 - The model structure of human Protein Kinase, based on MAP3K1, shows the basic two-lobe kinase fold, with
the N- and C-terminal lobes (green and orange, respectively) joined by a hinge region (magenta). Substrate
recognition is through interaction with the activation segment (blue), a region in the C-terminal lobe. The
substrate-binding groove is located between the catalytic loop, the P+1 loop (activation segment), helix D,
helix F, helix G and helix H. ATP binds at a site between the two lobes (yellow) that includes five conserved
residues: (i) Lysine 74, which interacts with the alpha and beta phosphates of ATP, thereby stabilizing
it; (ii) a nearby glutamic acid (E96), which forms a salt bridge with Lysine 74, reinforcing the stabilization network;
(iii) Aspartate 171, the catalytic base that initiates phosphotransfer by deprotonating the acceptor serine,
threonine or tyrosine; (iv) Asparagine 176, which interacts with a secondary divalent cation, thereby positioning
the gamma-phosphate of ATP; and finally (v) Aspartate 190, which chelates the primary divalent cation,
indirectly positioning ATP at the same time.</p>
      <p>Fig. 2 - Histograms of the distribution of distances between mutated residues and the analyzed features.
(A) Catalytic residues according to FireDB. (B) Catalytic residues according to Knight et al. (2007). (C)
Tree-Determinants. (D) Buried residues.</p>
    </sec>
    <sec id="sec-8">
      <title>Tables</title>
      <p>Table 1 - Results of the Xd analysis comparing PD<sub>PK</sub>s and SNP<sub>PK</sub>s</p>
      <p>PD (Å) and SNP (Å) stand for the average closest distances, in Angstroms, between the feature residues
and the PDs and SNPs respectively, and ΔXd for the difference in Xd values. Negative values indicate that
the PDs are closer to the feature residues whereas positive values indicate that SNPs are in the surroundings
of feature residues.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Collins</surname>
            <given-names>FS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brooks</surname>
            <given-names>LD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chakravarti</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>A DNA polymorphism discovery resource for research on human genetic variation</article-title>
          .
          <source>Genome Res</source>
          <year>1998</year>
          ,
          <volume>8</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1229</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Taillon-Miller</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gu</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Q</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hillier</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kwok</surname>
            <given-names>PY</given-names>
          </string-name>
          :
          <article-title>Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms</article-title>
          .
          <source>Genome Res</source>
          <year>1998</year>
          ,
          <volume>8</volume>
          (
          <issue>7</issue>
          ):
          <fpage>748</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Sherry</surname>
            <given-names>ST</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ward</surname>
            <given-names>MH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kholodov</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phan</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smigielski</surname>
            <given-names>EM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sirotkin</surname>
            <given-names>K</given-names>
          </string-name>
          :
          <article-title>dbSNP: the NCBI database of genetic variation</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2001</year>
          ,
          <volume>29</volume>
          :
          <fpage>308</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Futreal</surname>
            <given-names>PA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coin</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marshall</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Down</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hubbard</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wooster</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahman</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stratton</surname>
            <given-names>MR</given-names>
          </string-name>
          :
          <article-title>A census of human cancer genes</article-title>
          .
          <source>Nat Rev Cancer</source>
          <year>2004</year>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>177</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Garber</surname>
            <given-names>K</given-names>
          </string-name>
          :
          <article-title>The second wave in kinase cancer drugs</article-title>
          .
          <source>Nat Biotechnol</source>
          <year>2006</year>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ):
          <fpage>127</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Manning</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whyte</surname>
            <given-names>DB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sudarsanam</surname>
            <given-names>S</given-names>
          </string-name>
          :
          <article-title>The protein kinase complement of the human genome</article-title>
          .
          <source>Science</source>
          <year>2002</year>
          ,
          <volume>298</volume>
          (
          <issue>5600</issue>
          ):
          <fpage>1912</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Knight</surname>
            <given-names>JDR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qian</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kothary</surname>
            <given-names>R</given-names>
          </string-name>
          :
          <article-title>Conservation, variability and the modeling of active protein kinases</article-title>
          .
          <source>PLoS ONE</source>
          <year>2007</year>
          ,
          <volume>2</volume>
          (
          <issue>10</issue>
          ):
          <fpage>e982</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Ubersax</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woodbury</surname>
            <given-names>EL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quang</surname>
            <given-names>PN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paraz</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blethrow</surname>
            <given-names>JD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shokat</surname>
            <given-names>KM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morgan</surname>
            <given-names>DO</given-names>
          </string-name>
          :
          <article-title>Targets of the cyclin-dependent kinase Cdk1</article-title>
          .
          <source>Nature</source>
          <year>2003</year>
          ,
          <volume>425</volume>
          (
          <issue>6960</issue>
          ):
          <fpage>859</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ptacek</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Devgan</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michaud</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fasolo</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jona</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Breitkreutz</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sopko</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCartney</surname>
            <given-names>RR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            <given-names>MC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rachidi</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>SJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mah</surname>
            <given-names>AS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meng</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stark</surname>
            <given-names>MJR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stern</surname>
            <given-names>DF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Virgilio</surname>
            <given-names>CD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tyers</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andrews</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerstein</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schweitzer</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Predki</surname>
            <given-names>PF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snyder</surname>
            <given-names>M</given-names>
          </string-name>
          :
          <article-title>Global analysis of protein phosphorylation in yeast</article-title>
          .
          <source>Nature</source>
          <year>2005</year>
          ,
          <volume>438</volume>
          (
          <issue>7068</issue>
          ):
          <fpage>679</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Krallinger</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Izarzugaza</surname>
            <given-names>JMG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez-Penagos</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>Extraction of human kinase mutations from literature, databases and genotyping studies</article-title>
          .
          <source>BMC Bioinformatics</source>
          <year>2009</year>
          ,
          <volume>10</volume>
          (
          <issue>Suppl 8</issue>
          ):
          <fpage>S1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Greenman</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stephens</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dalgliesh</surname>
            <given-names>GL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bignell</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davies</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teague</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butler</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edkins</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Meara</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vastrik</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            <given-names>EE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avis</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barthorpe</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhamra</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buck</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choudhury</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clements</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cole</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dicks</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Forbes</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halliday</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harrison</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hills</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jenkinson</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Menzies</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mironenko</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perry</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raine</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richardson</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shepherd</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Small</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tofts</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varian</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Webb</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>West</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Widaa</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yates</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cahill</surname>
            <given-names>DP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Louis</surname>
            <given-names>DN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldstraw</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nicholson</surname>
            <given-names>AG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brasseur</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Looijenga</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            <given-names>BL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiew</surname>
            <given-names>YE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeFazio</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greaves</surname>
            <given-names>MF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Green</surname>
            <given-names>AR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campbell</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Birney</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Easton</surname>
            <given-names>DF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chenevix-Trench</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            <given-names>MH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khoo</surname>
            <given-names>SK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teh</surname>
            <given-names>BT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yuen</surname>
            <given-names>ST</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leung</surname>
            <given-names>SY</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wooster</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Futreal</surname>
            <given-names>PA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stratton</surname>
            <given-names>MR</given-names>
          </string-name>
          :
          <article-title>Patterns of somatic mutation in human cancer genomes</article-title>
          .
          <source>Nature</source>
          <year>2007</year>
          ,
          <volume>446</volume>
          (
          <issue>7132</issue>
          ):
          <fpage>153</fpage>
          –
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Wood</surname>
            <given-names>LD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsons</surname>
            <given-names>DW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sjoblom</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leary</surname>
            <given-names>RJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boca</surname>
            <given-names>SM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barber</surname>
            <given-names>TD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ptak</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silliman</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szabo</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dezso</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ustyanksky</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolskaya</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolsky</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karchin</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilson</surname>
            <given-names>PA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminker</surname>
            <given-names>JS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croshaw</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willis</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dawson</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shipitsin</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willson</surname>
            <given-names>JKV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sukumar</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polyak</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
            <given-names>BH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pethiyagoda</surname>
            <given-names>CL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pant</surname>
            <given-names>PVK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ballinger</surname>
            <given-names>DG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sparks</surname>
            <given-names>AB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartigan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>DR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suh</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papadopoulos</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckhaults</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markowitz</surname>
            <given-names>SD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmigiani</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kinzler</surname>
            <given-names>KW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velculescu</surname>
            <given-names>VE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vogelstein</surname>
            <given-names>B</given-names>
          </string-name>
          :
          <article-title>The genomic landscapes of human breast and colorectal cancers</article-title>
          .
          <source>Science</source>
          <year>2007</year>
          ,
          <volume>318</volume>
          (
          <issue>5853</issue>
          ):
          <fpage>1108</fpage>
          –
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Baudot</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Real</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Izarzugaza</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>From cancer genomes to cancer models: bridging the gaps</article-title>
          .
          <source>EMBO Rep</source>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Izarzugaza</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Redfern</surname>
            <given-names>O</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orengo</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>Cancer-associated mutations are preferentially distributed in protein kinase functional sites</article-title>
          .
          <source>Proteins</source>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Hurst</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McMillan</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Porter</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allen</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fakorede</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>The SAAPdb web resource: A large-scale structural analysis of mutant proteins</article-title>
          .
          <source>Hum Mutat</source>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Lopez</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tress</surname>
            <given-names>ML</given-names>
          </string-name>
          :
          <article-title>FireDB--a database of functionally important residues from proteins of known structure</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2007</year>
          ,
          <volume>35</volume>
          (Database issue):
          <fpage>D219</fpage>
          –
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rausell</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Juan</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pazos</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>Protein interactions and ligand binding: from protein subfamilies to functional specificity</article-title>
          .
          <source>Proc Natl Acad Sci USA</source>
          <year>2010</year>
          ,
          <volume>107</volume>
          (
          <issue>5</issue>
          ):
          <fpage>1995</fpage>
          –
          <lpage>2000</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Karkkainen</surname>
            <given-names>MJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferrell</surname>
            <given-names>RE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawrence</surname>
            <given-names>EC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kimak</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levinson</surname>
            <given-names>KL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McTigue</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alitalo</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finegold</surname>
            <given-names>DN</given-names>
          </string-name>
          :
          <article-title>Missense mutations interfere with VEGFR-3 signalling in primary lymphoedema</article-title>
          .
          <source>Nat Genet</source>
          <year>2000</year>
          ,
          <volume>25</volume>
          (
          <issue>2</issue>
          ):
          <fpage>153</fpage>
          –
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Mao</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uckun</surname>
            <given-names>FM</given-names>
          </string-name>
          :
          <article-title>Crystal structure of Bruton's tyrosine kinase domain suggests a novel pathway for activation and provides insights into the molecular basis of X-linked agammaglobulinemia</article-title>
          .
          <source>J Biol Chem</source>
          <year>2001</year>
          ,
          <volume>276</volume>
          (
          <issue>44</issue>
          ):
          <fpage>41435</fpage>
          –
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Caceres</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teran</surname>
            <given-names>CG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medina</surname>
            <given-names>M</given-names>
          </string-name>
          :
          <article-title>Prevalence of insulin resistance and its association with metabolic syndrome criteria among Bolivian children and adolescents with obesity</article-title>
          .
          <source>BMC Pediatr</source>
          <year>2008</year>
          ,
          <volume>8</volume>
          :
          <fpage>31</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Leroy</surname>
            <given-names>JG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuytinck</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lambert</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Naeyaert</surname>
            <given-names>JM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mortier</surname>
            <given-names>GR</given-names>
          </string-name>
          :
          <article-title>Acanthosis nigricans in a child with mild osteochondrodysplasia and K650Q mutation in the FGFR3 gene</article-title>
          .
          <source>Am J Med Genet A</source>
          <year>2007</year>
          ,
          <volume>143A</volume>
          (
          <issue>24</issue>
          ):
          <fpage>3144</fpage>
          –
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Zankl</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elakis</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Susman</surname>
            <given-names>RD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inglis</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gardener</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckley</surname>
            <given-names>MF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roscioli</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>Prenatal and postnatal presentation of severe achondroplasia with developmental delay and acanthosis nigricans (SADDAN) due to the FGFR3 Lys650Met mutation</article-title>
          .
          <source>Am J Med Genet A</source>
          <year>2008</year>
          ,
          <volume>146A</volume>
          (
          <issue>2</issue>
          ):
          <fpage>212</fpage>
          –
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Fonseca</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa-Lima</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cosentino</surname>
            <given-names>V</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orioli</surname>
            <given-names>IM</given-names>
          </string-name>
          :
          <article-title>Second case of Beare-Stevenson syndrome with an FGFR2 Ser372Cys mutation</article-title>
          .
          <source>Am J Med Genet A</source>
          <year>2008</year>
          ,
          <volume>146A</volume>
          (
          <issue>5</issue>
          ):
          <fpage>658</fpage>
          –
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Torkamani</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schork</surname>
            <given-names>NJ</given-names>
          </string-name>
          :
          <article-title>Distribution analysis of nonsynonymous polymorphisms within the human kinase gene family</article-title>
          .
          <source>Genomics</source>
          <year>2007</year>
          ,
          <volume>90</volume>
          :
          <fpage>49</fpage>
          –
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Krishnan</surname>
            <given-names>VG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westhead</surname>
            <given-names>DR</given-names>
          </string-name>
          :
          <article-title>A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function</article-title>
          .
          <source>Bioinformatics</source>
          <year>2003</year>
          ,
          <volume>19</volume>
          (
          <issue>17</issue>
          ):
          <fpage>2199</fpage>
          –
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Vitkup</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sander</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Church</surname>
            <given-names>GM</given-names>
          </string-name>
          :
          <article-title>The amino-acid mutational spectrum of human genetic disease</article-title>
          .
          <source>Genome Biol</source>
          <year>2003</year>
          ,
          <volume>4</volume>
          (
          <issue>11</issue>
          ):
          <fpage>R72</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Wang</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moult</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <article-title>SNPs, protein structure, and disease</article-title>
          .
          <source>Hum Mutat</source>
          <year>2001</year>
          ,
          <volume>17</volume>
          (
          <issue>4</issue>
          ):
          <fpage>263</fpage>
          –
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Ferrer-Costa</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orozco</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de la Cruz</surname>
            <given-names>X</given-names>
          </string-name>
          :
          <article-title>Characterization of disease-associated single amino acid polymorphisms in terms of sequence and structure properties</article-title>
          .
          <source>J Mol Biol</source>
          <year>2002</year>
          ,
          <volume>315</volume>
          (
          <issue>4</issue>
          ):
          <fpage>771</fpage>
          –
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Ferrer-Costa</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orozco</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de la Cruz</surname>
            <given-names>X</given-names>
          </string-name>
          :
          <article-title>Sequence-based prediction of pathological mutations</article-title>
          .
          <source>Proteins</source>
          <year>2004</year>
          ,
          <volume>57</volume>
          (
          <issue>4</issue>
          ):
          <fpage>811</fpage>
          –
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Yue</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moult</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <article-title>Loss of protein structure stability as a major causative factor in monogenic disease</article-title>
          .
          <source>J Mol Biol</source>
          <year>2005</year>
          ,
          <volume>353</volume>
          (
          <issue>2</issue>
          ):
          <fpage>459</fpage>
          –
          <lpage>73</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Manning</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plowman</surname>
            <given-names>GD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sudarsanam</surname>
            <given-names>S</given-names>
          </string-name>
          :
          <article-title>Evolution of protein kinase signaling from yeast to man</article-title>
          .
          <source>Trends Biochem Sci</source>
          <year>2002</year>
          ,
          <volume>27</volume>
          (
          <issue>10</issue>
          ):
          <fpage>514</fpage>
          –
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Jain</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bairoch</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duvaud</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phan</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Redaschi</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suzek</surname>
            <given-names>BE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            <given-names>MJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGarvey</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gasteiger</surname>
            <given-names>E</given-names>
          </string-name>
          :
          <article-title>Infrastructure for the life sciences: design and implementation of the UniProt website</article-title>
          .
          <source>BMC Bioinformatics</source>
          <year>2009</year>
          ,
          <volume>10</volume>
          :
          <fpage>136</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Altschul</surname>
            <given-names>SF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madden</surname>
            <given-names>TL</given-names>
          </string-name>
          , Scha er
          <string-name>
            <given-names>AA</given-names>
            ,
            <surname>Zhang</surname>
          </string-name>
          <string-name>
            <given-names>J</given-names>
            ,
            <surname>Zhang</surname>
          </string-name>
          <string-name>
            <given-names>Z</given-names>
            ,
            <surname>Miller</surname>
          </string-name>
          <string-name>
            <given-names>W</given-names>
            ,
            <surname>Lipman</surname>
          </string-name>
          <string-name>
            <surname>DJ</surname>
          </string-name>
          :
          <article-title>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>1997</year>
          ,
          <volume>25</volume>
          (
          <issue>17</issue>
          ):
          <fpage>3389</fpage>
          –
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Hamosh</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            <given-names>AF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amberger</surname>
            <given-names>JS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bocchini</surname>
            <given-names>CA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McKusick</surname>
            <given-names>VA</given-names>
          </string-name>
          :
          <article-title>Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2005</year>
          ,
          <volume>33</volume>
          (Database issue):
          <fpage>D514</fpage>
          –
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Edgar</surname>
            <given-names>RC</given-names>
          </string-name>
          :
          <article-title>MUSCLE: a multiple sequence alignment method with reduced time and space complexity</article-title>
          .
          <source>BMC Bioinformatics</source>
          <year>2004</year>
          ,
          <volume>5</volume>
          :
          <fpage>113</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Fiser</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sali</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>Modeller: generation and refinement of homology-based protein structure models</article-title>
          .
          <source>Meth Enzymol</source>
          <year>2003</year>
          ,
          <volume>374</volume>
          :
          <fpage>461</fpage>
          –
          <lpage>91</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Ezkurdia</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bartoli</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fariselli</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casadio</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tress</surname>
            <given-names>ML</given-names>
          </string-name>
          :
          <article-title>Progress and challenges in predicting protein-protein interaction sites</article-title>
          .
          <source>Brief Bioinformatics</source>
          <year>2009</year>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ):
          <fpage>233</fpage>
          –
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Berman</surname>
            <given-names>HM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westbrook</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feng</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilliland</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhat</surname>
            <given-names>TN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weissig</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shindyalov</surname>
            <given-names>IN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourne</surname>
            <given-names>PE</given-names>
          </string-name>
          :
          <article-title>The Protein Data Bank</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2000</year>
          ,
          <volume>28</volume>
          :
          <fpage>235</fpage>
          –
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Porter</surname>
            <given-names>CT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bartlett</surname>
            <given-names>GJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thornton</surname>
            <given-names>JM</given-names>
          </string-name>
          :
          <article-title>The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2004</year>
          ,
          <volume>32</volume>
          (Database issue):
          <fpage>D129</fpage>
          –
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          40.
          <string-name>
            <surname>Pazos</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Helmer-Citterich</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ausiello</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valencia</surname>
            <given-names>A</given-names>
          </string-name>
          :
          <article-title>Correlated mutations contain information about protein-protein interaction</article-title>
          .
          <source>J Mol Biol</source>
          <year>1997</year>
          ,
          <volume>271</volume>
          (
          <issue>4</issue>
          ):
          <fpage>511</fpage>
          –
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>