<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mitochondrial Locus Specific</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shamna Mole</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Saakshi Jalali</string-name>
          <email>saakshi.jalali@igib.res.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vinod Scaria</string-name>
          <email>vinods@igib.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anshu Bhardwaj</string-name>
          <email>anshu@csir.res.in</email>
          <email>anshub@osdd.net</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Council of Scientific and Industrial Research (CSIR)</institution>
          ,
          <addr-line>2 Rafi Marg, Delhi 110001</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Genomics and Molecular Medicine, Institute of Genomics and Integrative Biology, CSIR</institution>
          ,
          <addr-line>Mall Road, Delhi 110007</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Stella Maris College, University of Madras</institution>
          ,
          <addr-line>Chennai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Background: Human mitochondrial DNA (mtDNA) encodes a set of 37 genes which are essential structural and functional components of the electron transport chain. Variations in these genes have been implicated in a broad spectrum of diseases and are extensively reported in literature and various databases. In this study, we describe MitoLSDB, an integrated platform to catalogue disease association studies on mtDNA (http://mitolsdb.igib.res.in). The main goal of MitoLSDB is to provide a central platform for direct submissions of novel variants that can be curated by the Mitochondrial Research Community. Description: MitoLSDB provides access to standardized and annotated data from literature and databases encompassing information from 5231 individuals, 675 populations and 27 phenotypes. This platform is developed using the Leiden Open (source) Variation Database (LOVD) software. MitoLSDB houses information on all 37 genes in each population, amounting to 132397 variants, of which 5147 are unique. For each variant, its genomic location as per the Revised Cambridge Reference Sequence, codon and amino acid change for variations in protein-coding regions, frequency, disease/phenotype, population, reference and remarks are also listed. MitoLSDB curators have also reported errors documented in literature, which include 94 phantom mutations, 10 NUMTs, six documentation errors and one artefactual recombination. Conclusion: MitoLSDB is the largest repository of mtDNA variants systematically standardized and presented using the LOVD platform. We believe that this is a good starting resource to curate mtDNA variants and will facilitate direct submissions enhancing data coverage, annotation in context of pathogenesis and quality control by ensuring non-redundancy in reporting novel disease associated variants.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Background</title>
      <p>Mitochondria are the essential energy-generating organelles in eukaryotes, possessing the oxidative
phosphorylation system (OXPHOS). Mitochondrial disorders are caused by mutations in mitochondrial genes
encoded by nuclear or mitochondrial DNA (mtDNA) [1]. The OXPHOS comprises five protein
complexes, and the majority of their protein subunits are nuclear encoded, with only a subset of 37 genes encoded
by mtDNA [2]. Of these 37 genes, 13 encode protein subunits, 22 encode tRNAs and 2 encode rRNAs. These genes are
essential components of electron transport chain complexes I, III and IV and complex V (ATP synthase)
[3]. Mutations in these genes have been associated with a broad spectrum of diseases [4]. At least 1 in
every 200 births is thought to have a potentially pathogenic mitochondrial DNA mutation [5]. The disease
phenotypes attributed to mutations in mtDNA have diverse and overlapping symptoms and also multi-organ
involvement [6]. Many deleterious point mutations have been identified to date, the most frequent ones
being the m.3243A&gt;G MELAS mutation [7], the LHON primary mutations [8], and the m.8344A&gt;G MERRF
mutation [9]. Others are found less often, while still others have been described only as case studies or
in families. The investigation of pathogenic mtDNA mutations has revealed a complex relation between
patient genotype and phenotype [10]. The phenotypic variability is due to the peculiarities of mitochondrial
properties, such as heteroplasmy, different mutation rates in different tissues and the highly polymorphic nature
of mtDNA [11-13]. Therefore, the patho-mechanisms of mtDNA point mutations are still not very well understood.
Furthermore, there appears to be a class of slightly deleterious mutations that modify the risks of developing
certain complex diseases or traits [14]. Besides, heteroplasmic and homoplasmic mtDNA variants have also been
observed, along with a large number of basal polymorphisms in the mitochondrial genome, across databases
like OMIM [http://www.ncbi.nlm.nih.gov/omim] [15], MitoMap [16], Mitovariome [17] and mtDB [18]. These
facts highlight the challenges in assessing the role of mtDNA variants in diseases or phenotypes.</p>
      <p>Recent reports indicate that mitochondrial dysfunction plays a role in the pathogenesis of, or influences the risk
of, diseases such as Alzheimer's, Parkinson's and cardiovascular disease including cardiomyopathy [19-21].
However, the genotype-phenotype relationship is unclear and debatable [22, 23]. More than 5000 complete or
coding-region sequences of publicly available mtDNA were analyzed to study the diversity of the global
human population [24]. This study generated useful data in the form of all possible transitions and
transversions, and their analysis led to interesting observations that may help in understanding the role of
mtDNA variants in disease. Besides, there has been an increase in DNA variant data resulting from new
automated sequencing technologies [25]. Thus, it is imperative to catalogue this information on a standard
web-based platform for sharing and evaluating the potential pathological effects of mtDNA variants. To
this end, we have used the Leiden Open (source) Variation Database (LOVD) software [26, 27] to
create a catalogue of human mtDNA variants through manual curation of data from literature and from public
databases. LOVD is a commonly used tool for organizing locus-centric variation data. As of now, MitoLSDB
has patient and variant information from 5231 individuals from 675 different populations [24, 28] and 27
different groups, including patients with Alzheimer's disease, asthenozoospermia, atypical psychosis, breast
cancer, diabetes, angiopathy, deafness, glioma, Parkinson's disease, teratozoospermia and thyroid cancer, among others,
and can be accessed at http://mitolsdb.igib.res.in. MitoLSDB is a Locus-Specific DataBase (LSDB) for
human mtDNA genes and provides access to standardized and annotated data compiled from different
resources which are otherwise difficult to search and comprehend. This is in line with the objectives of LSDBs,
which are expected to contain comprehensive information from disparate resources and are open for direct
submissions. It has also been observed that a large amount of variant data from case studies or reports
never gets published, and LSDBs have served as a viable platform for the scientific community to benefit from
and actively contribute to [29]. For each variant curated in MitoLSDB, its genomic location as per the
Revised Cambridge Reference Sequence, codon and amino acid change for variations in protein-coding regions,
frequency, disease/phenotype, population, reference and remarks are also listed. MitoLSDB curators have
also reported errors documented in literature, which include 94 phantom mutations, 10 NUMTs (nuclear
mitochondrial DNA sequences), six documentation errors and one artefactual recombination. We believe
that this is a good starting resource to curate mtDNA variants and will facilitate direct submissions
enhancing data coverage, annotation in context of pathogenesis and quality control by ensuring non-redundancy in
reporting novel disease associated variants.</p>
    </sec>
    <sec id="sec-3">
      <title>Data Collection and Integration</title>
      <sec id="sec-3-1">
        <title>Data Collection</title>
        <p>Variant data and other patient information for 5139 individuals from different populations, belonging to
26 different groups, were obtained from the study by Pereira et al [24] and from the public resources
www.phylotree.org [30] and Ian Logan's website http://www.ianlogan.co.uk/checker/genbank.htm [31]. A further set of 92
complete genomes came from sporadic ataxia patients [28]. A set of Perl scripts was developed to extract
variant data from the sources. The dataset obtained from Pereira et al's study [24] gives the details on
sample ID, variant, reference, haplotype reported, origin/ethnicity and phenotype. However, this dataset
only provides variant positions in each sample. This information was complemented with variant details with the
help of other resources. These variants are reported as per the coordinates of the Revised
Cambridge Reference Sequence (rCRS) [GI:251831106]. We have also reported errors documented in literature,
which include 94 phantom mutations [31], 10 NUMTs [32], 6 documentation errors and one artefactual
recombination [31], in the remarks section of the database. The data were converted to match the `import file'
specifications of LOVD.</p>
      </sec>
      <sec id="sec-3-2">
        <title>The Database</title>
        <p>The database is customized on the LOVD platform, which is supported on the backend by a MySQL relational
database management system. Links are provided to genes to assist the user in searching detailed information
related to the gene. In addition, plug-ins have been created to export the data to a standard meta-tagged
format for interoperability with other resources. This gives the user a genome-centered and
holistic view of the variants, which is helpful in interpreting the biological impact of variations.
In human mtDNA there are five instances of overlapping bases among genes, and thus these have been
mapped to both genes. Codon assignments for mtDNA are different from the universal genetic code, and
thus the alternate codon table is utilized for reporting codon changes [33]. For each variant, the database
provides information on its genomic location, gene name, frequency, phenotype, tissue, sample information,
methodology, and codon and amino acid change for protein variation. It also offers a variant submission link,
advanced search options and registration guidelines for a new submitter.</p>
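        <p>The effect of using the alternate codon table can be illustrated with a minimal Python sketch. It is not part of the MitoLSDB code base: the names MITO_CODE and classify_codon_change are hypothetical, and the table is assumed to be the vertebrate mitochondrial code, which differs from the standard code at four codons (AGA and AGG read as stops, ATA as Met, TGA as Trp).</p>
        <preformat preformat-type="code">
```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Assumption: vertebrate mitochondrial code, i.e. the standard code
# with four reassigned codons.
MITO_CODE = dict(STANDARD_CODE)
MITO_CODE.update({"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})


def classify_codon_change(ref_codon, alt_codon, table=MITO_CODE):
    """Return (ref_aa, alt_aa, kind) for a single codon change.

    Hypothetical helper: kind is 'synonymous' when both codons
    translate to the same residue in the chosen table.
    """
    ref_aa = table[ref_codon.upper()]
    alt_aa = table[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# The same change is classified differently under the two tables.
print(classify_codon_change("ATA", "ATG"))
print(classify_codon_change("ATA", "ATG", table=STANDARD_CODE))
```
        </preformat>
        <p>Under these assumptions an ATA to ATG change is synonymous (Met to Met) with the mitochondrial table but non-synonymous (Ile to Met) with the standard one, which is why the choice of table matters when codon changes are reported.</p>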
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results and Discussion</title>
      <p>MitoLSDB comprises variant data from 5231 individuals from 675 populations, which belong to 27
different categories [Figure 1 and Supplementary Table 1]. The base changes are reported as per the genomic
coordinates of rCRS so that this data can be compared easily with other datasets. Overall, 132397 variants
are catalogued in MitoLSDB. Of these, 5147 are unique genomic variants, of which 4226 belong to
protein-coding genes, 538 to rRNA genes and 383 to tRNA genes. Of the 4226 protein-coding variants, 158 are
nucleotide ambiguities. Of the remaining 4068 protein-coding variants, 1349 are non-synonymous
and 2719 are synonymous changes, with 1066, 528 and 2474 at the first, second and third codon positions,
respectively. The codon position at which a variation occurs may also be related to the strength of association
with the phenotype. In disease association studies, variations occurring with high frequency in patients as
compared to normal individuals are considered to be disease associated. For most of the disease phenotypes
included in MitoLSDB, there is no information on the normal population variants and hence it is not possible
to report disease association based on frequency differences. We have instead reported the frequency of each
variant within each population, which may assist in evaluating the pathogenic status of these variants during
subsequent data analyses.</p>
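      <p>A within-population frequency of this kind can be sketched in Python as follows. This is an illustration only, not the MitoLSDB implementation: it assumes that the frequency of a variant is the fraction of individuals in a population that carry it, and the function name, the record layout and the toy population labels are all hypothetical.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def within_population_frequency(samples):
    """Compute the carrier fraction of each variant per population.

    samples: iterable of (population, variants) pairs, one pair per
    individual (hypothetical layout). Returns a dict mapping
    (population, variant) to a frequency between 0 and 1.
    """
    population_sizes = Counter()
    carriers = Counter()
    for population, variants in samples:
        population_sizes[population] += 1
        # set() guards against a variant listed twice for one individual
        for variant in set(variants):
            carriers[(population, variant)] += 1
    return {
        key: count / population_sizes[key[0]]
        for key, count in carriers.items()
    }


# Toy data with made-up population labels.
toy_samples = [
    ("Population A", ["m.750A>G", "m.15326A>G"]),
    ("Population A", ["m.15326A>G"]),
    ("Population B", ["m.750A>G"]),
]
for key, freq in sorted(within_population_frequency(toy_samples).items()):
    print(key, freq)
```
      </preformat>
      <p>In this sketch a value of 1.0 means that every individual in the population carries the variant. Because no control population is involved, the value describes prevalence within the group and is not by itself a measure of disease association.</p>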
      <p>A closer look at the data highlights that MT-ATP8 shows the maximum number of non-synonymous
variations after normalization for gene length. Similarly, MT-ND6 shows the maximum number of
synonymous changes [Figure 2]. In contrast, MT-ND2 harbors the least number of synonymous changes
and MT-ND4L the least number of non-synonymous changes.</p>
      <p>MT-CYB shows the maximum number of polymorphisms with frequency one (2048).
For example, the variant m.15326A&gt;G is seen in all the 77 samples from the Finland CADASIL population,
and there are many more variants captured with frequency one. m.8860A&gt;G, m.750A&gt;G and m.15326A&gt;G
are some of the variants seen repeatedly in different samples from various populations. This highlights
the systemic involvement of these mitochondrial variants in diseases or phenotypes. The statistics of base
changes show a clear skew towards transitions (4293), which are more common than transversions
(575) [Figure 3]. The A&gt;G transitions are most common, while G&gt;T transversions are least frequent. As
stated earlier, variations are more frequent at the third position in codons as compared to the second position.</p>
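      <p>The distinction between transitions and transversions used in these statistics can be expressed as a short Python sketch. A transition exchanges a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); any other substitution is a transversion. The function name and the regular expression are hypothetical and handle only simple substitutions written in the form m.8860A&gt;G.</p>
      <preformat preformat-type="code">
```python
import re
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Simple substitutions only, e.g. "m.8860A>G" (assumed notation).
VARIANT_PATTERN = re.compile(r"^m\.(\d+)([ACGT])>([ACGT])$")


def classify_substitution(variant):
    """Return (position, 'transition' or 'transversion')."""
    match = VARIANT_PATTERN.match(variant)
    if match is None:
        raise ValueError(f"not a simple substitution: {variant!r}")
    position = int(match.group(1))
    ref, alt = match.group(2), match.group(3)
    if ref == alt:
        raise ValueError(f"reference and variant base are identical: {variant!r}")
    pair = {ref, alt}
    is_transition = pair == PURINES or pair == PYRIMIDINES
    return position, "transition" if is_transition else "transversion"


# Tally the substitution classes for the variants named in the text.
variants = ["m.8860A>G", "m.750A>G", "m.15326A>G"]
tally = Counter(classify_substitution(v)[1] for v in variants)
print(tally)
```
      </preformat>
      <p>Insertions, deletions and ambiguous bases do not match the pattern and raise an error in this sketch, so they would need to be counted separately.</p>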
      <sec id="sec-4-1">
        <title>Errors Detected and Reported</title>
        <p>We have reported a number of errors documented in literature for the datasets integrated in MitoLSDB. These
errors include 94 phantom mutations, 10 NUMT (nuclear mitochondrial DNA sequence) contaminations,
6 documentation errors and one sequence with artefactual recombination. Data reported by Pereira et
al [24] are directly retrieved from GenBank. It has been observed that many mtDNA sequences available
in GenBank contain errors and unintended mistakes [31, 34]. Many of these errors have already been
reported in literature or sometimes even corrected by the authors. Unfortunately, in several instances the
corrected versions of sequences have not been updated in GenBank [31]. These documentation errors include
missing variants, phantom mutations and artefactual recombinants that may lead to wrong conclusions.
Missing variants are those that are expected in a particular mtDNA haplotype according to its haplogroup
status. For example, the sequence [GenBank:DQ826448] lacks nine additional variants that would be expected
for it to be grouped into haplogroup M7b1. Phantom mutations are defined by the exclusive presence of a rare
transversion [31]. These are systematic artefacts generated in the course of the sequencing process. The
amount of artefacts depends not only on the automated sequencer and sequencing chemistry employed, but
also on other lab-specific factors [35], and it is also observed that the pattern of phantom mutations differs
significantly from that of natural mutations [36]. In particular, phantom mutation hotspots could lead to
spurious mapping of somatic mutations and to misinterpretations in clinical mtDNA studies [1]. Another
type of error reported in GenBank mtDNA sequences is NUMTs. These are mitochondrial DNA
sequences in the nuclear genome [37] (nuclear mitochondrial pseudogenes), which on accidental amplification
can pose a serious problem for mitochondrial disease studies [32]. Primers designed for amplification of
mtDNA can potentially anneal with sequences in the nuclear genome that show high homology to mtDNA.
In fact, NUMTs have already been mistaken for heteroplasmic positions in the case of the reported association
of mutations in MT-CO1 and MT-CO2 with development of Alzheimer's disease, which was later shown
to be an artefact resulting from the accidental amplification of nuclear mitochondrial pseudogenes [38].
Studies on artefactual recombinations [39, 40] and various missing mutations [41, 42] have also been used
to report the status of variants in the data sets used in MitoLSDB. For example, the mtDNA sequence
[GenBank:DQ834258] may be a recombinant since it bears m.8701A&gt;G (MT-ATP6) and m.9540T&gt;C
(MT-CO3), characteristic of non-N status, but this sequence was misclassified as haplogroup HV due to artefactual
recombination [31]. This sequence is reported as `recombinant sequence' in the database remarks section.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>MitoLSDB is a systematic compilation of variant information and is expected to facilitate the submission
of novel variants by the users. This is proposed as a good starting resource to curate mitochondrial DNA
variants, which would assist researchers in genotype-phenotype studies and also streamline the task of
reporting novel mutations. It would also allow cross-comparison of different mtDNA association studies
and help understand the molecular correlates of mitochondrial disease phenotypes, which otherwise is a
very daunting and challenging task given the complexity of mitochondrial genetics. Variants are integrated
in MitoLSDB in a standard updatable format, with a very user friendly interface [Figure 4]. We believe that
MitoLSDB may work as a central repository for reporting novel pathogenic variants and provide a solution
to documented issues in context of spurious reports and faulty conclusions on disease association status of
mtDNA variants [43].</p>
      <p>MitoLSDB is a freely accessible website that allows researchers to retrieve mitochondrial genome
variation data on 5231 individuals from various populations. Unlike other available sources, users can browse
and obtain the variation data gene wise. It also allows the user to list the variants based on patient origin.
In contrast to other existing resources, MitoLSDB provides data on variants caused by insertions and
deletions. The MitoLSDB curators do not report the missing mutation or haplogroup information in the first
version of the database because of the ambiguities reported in the haplogroup status, which may lead the
researcher to wrong conclusions. We have reported the available corrected errors in the database remarks
column. It would be best to get these ambiguities confirmed by the original authors.</p>
    </sec>
    <sec id="sec-6">
      <title>Future Perspectives</title>
      <p>To the best of our knowledge, MitoLSDB is the largest repository of mtDNA variants systematically
standardized and presented using the LOVD platform. The curators have integrated data from 675 populations
comprising 5231 individuals and 5147 unique variants. We are attempting to make the data interoperable
with various genomic databases and computational workflows, which would facilitate easy and automated
analysis of the variants. This would assist researchers in genotype-phenotype studies. MitoLSDB would
also allow cross-comparison or meta-analysis of different mtDNA association studies and help understand the
molecular correlates of mitochondrial disease phenotypes, which otherwise is a very daunting and challenging
task given the complexity of mitochondrial genetics. It has been demonstrated earlier that publications
contain a significant number of reporting errors that have been corrected or reported by curators and submitters
of LSDBs [29]. We expect a similar trend for mtDNA variations and believe that community participation
will further enhance data coverage, improve annotations in the context of the pathogenic status of variants and
strengthen quality checks for spurious reports and correctness of the submitted data.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The authors thank Dr. Arijit Mukhopadhyay and Dr. Mohammed Faruq for helpful comments on
the manuscript. The research leading to results of MitoLSDB has received funding from the European
Community's Seventh Framework Programme under grant agreement 200754 (the GEN2PHEN project).</p>
    </sec>
    <sec id="sec-8">
      <title>Figures</title>
      <p>Figure 1 - Frequency distribution of variants across various disease phenotypes
The abbreviations in the pie are: AD - Alzheimer's disease; AT - Ataxia; AP - Atypical Psychosis; OB
- Obese; Non-OB - Non obese; PCT - primary cancerous tissue; BC - breast cancer; ThC - thyroid cancer;
TZ - Teratozoospermic; semi-SC - semi-supercentenarian; T2D - Type 2 diabetes; T2DA - Type 2 Diabetes
with Angiopathy; DD - Diabetes and deafness; NFTT1 - Neurofibromatosis type I; AZ - Asthenozoospermic;
PD - Parkinson's disease; POLG1-T251I - genotype: POLG1 variant T251I; POLG1-G268A - genotype:
POLG1 G268A; MELAS - Mitochondrial Encephalopathy, Lactic acidosis and Stroke like episodes; LHON
- Leber's Hereditary Optic Neuropathy; MERRF - Myoclonic Epilepsy with Ragged Red Fibers; CADASIL
- Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy; Stroke
and vision loss - stroke-like episodes, lactic acidosis, exercise intolerance similar to that seen in MELAS; two
episodes of transient central vision loss similar to that seen in LHON; CPEO - chronic progressive external
ophthalmoplegia; OXPHOS deficiency; Glioma; Centenarian</p>
      <p>20. Kazuno AA, Munakata K, Nagai T, Shimozono S, Tanaka M, Yoneda M, Kato N, Miyawaki A,
Kato T: Identification of mitochondrial DNA polymorphisms that alter mitochondrial
matrix pH and intracellular calcium dynamics. PLoS Genet 2006, 2(8):e128.</p>
      <p>21. Cho YM, Park KS, Lee HK: Genetic factors related to mitochondrial function and risk of
diabetes mellitus. Diabetes Res Clin Pract 2007, 77 Suppl 1:S172-177.</p>
      <p>22. Hanagasi HA, Ayribas D, Baysal K, Emre M: Mitochondrial complex I, II/III, and IV
activities in familial and sporadic Parkinson's disease. Int J Neurosci 2005, 115(4):479-493.</p>
      <p>23. Saxena R, de Bakker PI, Singer K, Mootha V, Burtt N, Hirschhorn JN, Gaudet D, Isomaa B, Daly
MJ, Groop L et al: Comprehensive association testing of common mitochondrial DNA
variation in metabolic disease. Am J Hum Genet 2006, 79(1):54-61.</p>
      <p>24. Pereira L, Freitas F, Fernandes V, Pereira JB, Costa MD, Costa S, Maximo V, Macaulay V,
Rocha R, Samuels DC: The diversity present in 5140 human mitochondrial genomes. Am J
Hum Genet 2009, 84(5):628-640.</p>
      <p>25. van Eijsden RG, Gerards M, Eijssen LM, Hendrickx AT, Jongbloed RJ, Wokke JH, Hintzen RQ,
Rubio-Gozalbo ME, De Coo IF, Briem E et al: Chip-based mtDNA mutation screening
enables fast and reliable genetic diagnosis of OXPHOS patients. Genet Med 2006,
8(10):620-627.</p>
      <p>26. Fokkema IF, den Dunnen JT, Taschner PE: LOVD: easy creation of a locus-specific sequence
variation database using an "LSDB-in-a-box" approach. Hum Mutat 2005, 26(2):63-68.</p>
      <p>27. Fokkema IF, Taschner PE, Schaafsma GC, Celli J, Laros JF, den Dunnen JT: LOVD v.2.0: the
next generation in gene variant databases. Hum Mutat 2011, 32(5):557-563.</p>
      <p>28. Bhardwaj A, Mukerji M, Sharma S, Paul J, Gokhale CS, Srivastava AK, Tiwari S: MtSNPscore:
a combined evidence approach for assessing cumulative impact of mitochondrial variations
in disease. BMC Bioinformatics 2009, 10 Suppl 8:S7.</p>
      <p>29. Celli J, Dalgleish R, Vihinen M, Taschner PE, den Dunnen JT: Curating gene variant
databases (LSDBs): Toward a universal standard. Hum Mutat 2011.</p>
      <p>30. van Oven M, Kayser M: Updated comprehensive phylogenetic tree of global human
mitochondrial DNA variation. Hum Mutat 2009, 30(2):E386-394.</p>
      <p>31. Yao YG, Salas A, Logan I, Bandelt HJ: mtDNA data mining in GenBank needs surveying. Am
J Hum Genet 2009, 85(6):929-933; author reply 933.</p>
      <p>32. Yao YG, Kong QP, Salas A, Bandelt HJ: Pseudomitochondrial genome haunts disease studies.
J Med Genet 2008, 45(12):769-772.</p>
      <p>33. Knight RD, Landweber LF, Yarus M: How mitochondria redefine the code. J Mol Evol 2001,
53(4-5):299-313.</p>
      <p>34. Bandelt HJ, Achilli A, Kong QP, Salas A, Lutz-Bonengel S, Sun C, Zhang YP, Torroni A, Yao
YG: Low "penetrance" of phylogenetic knowledge in mitochondrial disease studies.
Biochem Biophys Res Commun 2005, 333(1):122-130.</p>
      <p>35. Brandstatter A, Sanger T, Lutz-Bonengel S, Parson W, Beraud-Colomb E, Wen B, Kong QP,
Bravi CM, Bandelt HJ: Phantom mutation hotspots in human mitochondrial DNA.
Electrophoresis 2005, 26(18):3414-3429.</p>
      <p>36. Bandelt HJ, Quintana-Murci L, Salas A, Macaulay V: The fingerprint of phantom mutations in
mitochondrial DNA data. Am J Hum Genet 2002, 71(5):1150-1160.</p>
      <p>37. Bensasson D, Zhang D, Hartl DL, Hewitt GM: Mitochondrial pseudogenes: evolution's
misplaced witnesses. Trends Ecol Evol 2001, 16(6):314-321.</p>
      <p>38. Goios A, Prieto L, Amorim A, Pereira L: Specificity of mtDNA-directed PCR-influence of
NUclear MTDNA insertion (NUMT) contamination in routine samples and techniques. Int J
Legal Med 2008, 122(4):341-345.</p>
      <p>39. Behar DM, Villems R, Soodyall H, Blue-Smith J, Pereira L, Metspalu E, Scozzari R, Makkan H,
Tzur S, Comas D et al: The dawn of human matrilineal diversity. Am J Hum Genet 2008,
82(5):1130-1140.</p>
      <p>40. Kong QP, Salas A, Sun C, Fuku N, Tanaka M, Zhong L, Wang CY, Yao YG, Bandelt HJ:
Distilling artificial recombinants from large sets of complete mtDNA genomes. PLoS One
2008, 3(8):e3016.</p>
      <p>41. Sun C, Kong QP, Palanichamy MG, Agrawal S, Bandelt HJ, Yao YG, Khan F, Zhu CL,
Chaudhuri TK, Zhang YP: The dazzling array of basal branches in the mtDNA
macrohaplogroup M from India as inferred from complete genomes. Mol Biol Evol 2006,
23(3):683-690.</p>
      <p>42. Palanichamy MG, Sun C, Agrawal S, Bandelt HJ, Kong QP, Khan F, Wang CY, Chaudhuri TK,
Palla V, Zhang YP: Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on
complete sequencing: implications for the peopling of South Asia. Am J Hum Genet 2004,
75(6):966-978.</p>
      <p>43. Bandelt HJ, Salas A, Taylor RW, Yao YG: Exaggerated status of "novel" and "pathogenic"
mtDNA sequence variants due to inadequate database searches. Hum Mutat 2009,
30(2):191-196.</p>
      <p>Figure 2 - Normalized gene wise distribution of synonymous and non-synonymous changes
[Figure not reproduced; x-axis: protein coding genes; series: Non-syn, Syn]</p>
      <p>Figure 3 - Frequency of transitions and transversions
[Figure not reproduced; x-axis: Base change (A&gt;G, C&gt;T, T&gt;C, G&gt;A, C&gt;A, A&gt;T, A&gt;C, T&gt;G, C&gt;G, T&gt;A, G&gt;C, G&gt;T); y-axis: Frequency, 0 to 1600]</p>
      <p>Supplementary Table 1: Distribution of phenotypes across individuals from different</p>
      <sec id="sec-8-1">
        <title>Phenotype</title>
        <p>Alzheimer disease
Asthenozoospermic
Atypical psychosis
Breast cancer
CADASIL
Centenarian
diabetes and deafness
Diabetes Type II
Diabetic with angiopathy
genotype: POLG1 G268A
genotype: POLG1 variant T251I
Glioma
LHON
MELAS
chronic progressive external ophthalmoplegia (CPEO)</p>
      </sec>
      <sec id="sec-8-2">
        <title>Ethnicity/Population</title>
        <p>Chiba Japan
Portugal
Japan
Italy
Romania
Finland
Gifu Japan
Gifu Japan
Tokyo Japan
Bonn, Germany
The Netherlands
stroke-like episodes, lactic acidosis, exercise intolerance similar
to that seen in MELAS; two episodes of transient central vision Ashkenazi, Jew, Poland
loss similar to that seen in LHON
Germany
Teratozoospermic
Thyroid cancer
23
64
1
1
No suitable term
Thyroid Neoplasms</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Taylor</surname>
            <given-names>RW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turnbull</surname>
            <given-names>DM</given-names>
          </string-name>
          :
          <article-title>Mitochondrial DNA mutations in human disease</article-title>
          .
          <source>Nat Rev Genet</source>
          <year>2005</year>
          ,
          <volume>6</volume>
          (
          <issue>5</issue>
          ):
          <fpage>389</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Wallace</surname>
            <given-names>DC</given-names>
          </string-name>
          :
          <article-title>Mitochondrial diseases in man and mouse</article-title>
          .
          <source>Science</source>
          <year>1999</year>
          ,
          <volume>283</volume>
          (
          <issue>5407</issue>
          ):
          <fpage>1482</fpage>
          -
          <lpage>1488</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Wallace</surname>
            <given-names>DC</given-names>
          </string-name>
          :
          <article-title>A mitochondrial paradigm of metabolic and degenerative diseases, aging, and cancer: a dawn for evolutionary medicine</article-title>
          .
          <source>Annu Rev Genet</source>
          <year>2005</year>
          ,
          <volume>39</volume>
          :
          <fpage>359</fpage>
          -
          <lpage>407</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Gropman</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>TJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perng</surname>
            <given-names>CL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krasnewich</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chernoff</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tifft</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            <given-names>LJ</given-names>
          </string-name>
          :
          <article-title>Variable clinical manifestation of homoplasmic G14459A mitochondrial DNA mutation</article-title>
          .
          <source>Am J Med Genet A</source>
          <year>2004</year>
          ,
          <volume>124A</volume>
          (
          <issue>4</issue>
          ):
          <fpage>377</fpage>
          -
          <lpage>382</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Elliott</surname>
            <given-names>HR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samuels</surname>
            <given-names>DC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eden</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Relton</surname>
            <given-names>CL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chinnery</surname>
            <given-names>PF</given-names>
          </string-name>
          :
          <article-title>Pathogenic mitochondrial DNA mutations are common in the general population</article-title>
          .
          <source>Am J Hum Genet</source>
          <year>2008</year>
          ,
          <volume>83</volume>
          (
          <issue>2</issue>
          ):
          <fpage>254</fpage>
          -
          <lpage>260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Zeviani</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bertagnolio</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uziel</surname>
            <given-names>G</given-names>
          </string-name>
          :
          <article-title>Neurological presentations of mitochondrial diseases</article-title>
          .
          <source>J Inherit Metab Dis</source>
          <year>1996</year>
          ,
          <volume>19</volume>
          (
          <issue>4</issue>
          ):
          <fpage>504</fpage>
          -
          <lpage>520</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Lam</surname>
            <given-names>CW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lau</surname>
            <given-names>CH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            <given-names>JC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            <given-names>YW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            <given-names>LJ</given-names>
          </string-name>
          :
          <article-title>Mitochondrial myopathy, encephalopathy, lactic acidosis and stroke-like episodes (MELAS) triggered by valproate therapy</article-title>
          .
          <source>Eur J Pediatr</source>
          <year>1997</year>
          ,
          <volume>156</volume>
          (
          <issue>7</issue>
          ):
          <fpage>562</fpage>
          -
          <lpage>564</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Yang</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tong</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Han</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            <given-names>X</given-names>
          </string-name>
          :
          <article-title>Novel A14841G mutation is associated with high penetrance of LHON/C4171A family</article-title>
          .
          <source>Biochem Biophys Res Commun</source>
          <year>2009</year>
          ,
          <volume>386</volume>
          (
          <issue>4</issue>
          ):
          <fpage>693</fpage>
          -
          <lpage>696</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Enriquez</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chomyn</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Attardi</surname>
            <given-names>G</given-names>
          </string-name>
          :
          <article-title>MtDNA mutation in MERRF syndrome causes defective aminoacylation of tRNA(Lys) and premature translation termination</article-title>
          .
          <source>Nat Genet</source>
          <year>1995</year>
          ,
          <volume>10</volume>
          (
          <issue>1</issue>
          ):
          <fpage>47</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Schon</surname>
            <given-names>EA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonilla</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DiMauro</surname>
            <given-names>S</given-names>
          </string-name>
          :
          <article-title>Mitochondrial DNA mutations and pathogenesis</article-title>
          .
          <source>J Bioenerg Biomembr</source>
          <year>1997</year>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>131</fpage>
          -
          <lpage>149</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Morgan-Hughes</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sweeney</surname>
            <given-names>MG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cooper</surname>
            <given-names>JM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hammans</surname>
            <given-names>SR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brockington</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schapira</surname>
            <given-names>AH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harding</surname>
            <given-names>AE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>JB</given-names>
          </string-name>
          :
          <article-title>Mitochondrial DNA (mtDNA) diseases: correlation of genotype to phenotype</article-title>
          .
          <source>Biochim Biophys Acta</source>
          <year>1995</year>
          ,
          <volume>1271</volume>
          (
          <issue>1</issue>
          ):
          <fpage>135</fpage>
          -
          <lpage>140</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Wong</surname>
            <given-names>LJ</given-names>
          </string-name>
          :
          <article-title>Diagnostic challenges of mitochondrial DNA disorders</article-title>
          .
          <source>Mitochondrion</source>
          <year>2007</year>
          ,
          <volume>7</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>45</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Kierdaszuk</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jamrozik</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tonska</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bartnik</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaliszewska</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminska</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kwiecinski</surname>
            <given-names>H</given-names>
          </string-name>
          :
          <article-title>Mitochondrial cytopathies: clinical, morphological and genetic characteristics</article-title>
          .
          <source>Neurol Neurochir Pol</source>
          <year>2009</year>
          ,
          <volume>43</volume>
          (
          <issue>3</issue>
          ):
          <fpage>216</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Wallace</surname>
            <given-names>DC</given-names>
          </string-name>
          :
          <article-title>Mitochondrial DNA sequence variation in human evolution and disease</article-title>
          .
          <source>Proc Natl Acad Sci U S A</source>
          <year>1994</year>
          ,
          <volume>91</volume>
          (
          <issue>19</issue>
          ):
          <fpage>8739</fpage>
          -
          <lpage>8746</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <collab>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD)</collab>
          :
          <article-title>Online Mendelian Inheritance in Man, OMIM (TM)</article-title>
          . [http://www.ncbi.nlm.nih.gov/omim]
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <source>Nucleic Acids Res</source>
          <year>2007</year>
          ,
          <volume>35</volume>
          (Database issue):
          <fpage>D823</fpage>
          -
          <lpage>828</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Lee</surname>
            <given-names>YS</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            <given-names>WY</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ji</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            <given-names>JH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhak</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <article-title>MitoVariome: a variome database of human mitochondrial DNA</article-title>
          .
          <source>BMC Genomics</source>
          <year>2009</year>
          ,
          <volume>10</volume>
          (
          <issue>Suppl 3</issue>
          ):
          <fpage>S12</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Ingman</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gyllensten</surname>
            <given-names>U</given-names>
          </string-name>
          :
          <article-title>mtDB: Human Mitochondrial Genome Database, a resource for population genetics and medical sciences</article-title>
          .
          <source>Nucleic Acids Res</source>
          <year>2006</year>
          ,
          <volume>34</volume>
          (Database issue):
          <fpage>D749</fpage>
          -
          <lpage>751</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Chandrasekaran</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giordano</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brady</surname>
            <given-names>DR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoll</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            <given-names>LJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rapoport</surname>
            <given-names>SI</given-names>
          </string-name>
          :
          <article-title>Impairment in mitochondrial cytochrome oxidase gene expression in Alzheimer disease</article-title>
          .
          <source>Brain Res Mol Brain Res</source>
          <year>1994</year>
          ,
          <volume>24</volume>
          (
          <issue>1-4</issue>
          ):
          <fpage>336</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>