<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Alternative Base Callers Aid Real-Time Analysis of SARS-CoV-2 Sequencing Runs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vladimír Boža</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matej Fedor</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kristína Boršová</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viktória Čabanová</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jana Černíková</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viktória Hodorová</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Perešíni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Klára Sládečková</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Boris Klempa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jozef Nosek</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Broňa Brejová</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tomáš Vinař</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Biomedical Research Center of the Slovak Academy of Sciences</institution>
          ,
          <addr-line>Bratislava</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Mathematics, Physics and Informatics</institution>
          ,
          <institution>Comenius University</institution>
          ,
          <addr-line>Bratislava</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Faculty of Natural Sciences, Comenius University</institution>
          ,
          <addr-line>Bratislava</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>One of the advantages of nanopore sequencing is its ability to provide data in real time, which allows monitoring, early stopping, and fast identification of mutations in the sequenced material. A nanopore sequencer measures the electrical current induced by DNA passing through a pore, and this signal needs to be translated into a string over the alphabet {A,C,G,T} through a process called base calling. To achieve base calling in real time, the mainstream tools (such as Guppy provided by Oxford Nanopore Technologies) require the support of high-performance GPUs, which is prohibitive in many settings. Here, we evaluate the accuracy of several alternative base callers that only require a desktop CPU or a low-cost USB-connected accelerator. While their accuracy is, in general, lower than that of Guppy in the high-accuracy mode using GPUs, we show that these alternative base callers can act as a replacement for monitoring and mutation detection in SARS-CoV-2 sequencing runs, without sacrificing the accuracy of the final result. Availability: http://compbio.fmph.uniba.sk/sars-cov-2-sequencing/</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The ARTIC protocol was originally developed for
sequencing viral genomes with nanopore sequencing devices
(Quick et al., 2016), and it has become a commonly used
protocol for SARS-CoV-2 sequencing
        <xref ref-type="bibr" rid="ref10 ref12">(Tyson et al., 2020)</xref>
        .
Briefly, overlapping segments of the viral genome are first
amplified using PCR, and the resulting amplicons are
sequenced using nanopore sequencing (see a simplified
illustration in Figure 1). Typically, multiple samples are
sequenced in parallel using barcoding. In bioinformatics
post-processing, the individual reads are first assigned to
individual samples, using demultiplexing according to the
barcodes. Stricter parameters (requiring the presence of
barcodes on both ends of the read) are typically used in
order to avoid barcode bleeding and to discard partially
sequenced reads. The reads are then aligned to the
reference genome and mutations are discovered with the aid of
the raw sequencing signal using nanopolish
        <xref ref-type="bibr" rid="ref5">(Loman et al.,
2015)</xref>
        .
      </p>
      <p>One of the problems with this protocol is that the PCR
amplification step introduces wide variation in coverage,
both between samples and between different amplicons
within a sample. Due to the high error rate of nanopore
sequencing, it is not advisable to determine mutations in
regions with low coverage (the standard pipeline sets the
coverage threshold at 20). In such scenarios, it is
difficult to estimate when to stop data acquisition. Fortunately,
results of nanopore sequencing can be processed in
real time, and on-the-fly monitoring during sequencing helps to
inform decisions on when to stop the run.</p>
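      <p>The stopping decision described above can be sketched as follows. This is a minimal illustration and not part of the ARTIC or RAMPART code: the function names, the input layout (per-sample dictionaries of amplicon read depths), and the target_fraction parameter are assumptions made for the example; only the coverage threshold of 20 comes from the text.</p>
      <preformat>
```python
def fraction_covered(depth_per_amplicon, min_depth=20):
    """Return the fraction of amplicons whose read depth is at least min_depth."""
    if not depth_per_amplicon:
        return 0.0
    covered = sum(1 for depth in depth_per_amplicon.values() if depth >= min_depth)
    return covered / len(depth_per_amplicon)


def should_stop(samples, min_depth=20, target_fraction=1.0):
    """Return True once every sample has enough sufficiently covered amplicons.

    samples maps a barcode name to a dict of amplicon name to read depth
    (a hypothetical layout chosen for this sketch).
    """
    return all(
        fraction_covered(depths, min_depth) >= target_fraction
        for depths in samples.values()
    )


# Illustrative depths only; these are not measurements from the paper.
samples = {
    "barcode01": {"amplicon_1": 150, "amplicon_2": 35},
    "barcode02": {"amplicon_1": 80, "amplicon_2": 12},
}
print(should_stop(samples))
```
      </preformat>
      <p>In practice one might lower target_fraction so that a few poorly amplified regions do not keep the run going indefinitely; the appropriate value depends on the samples and is not specified here.</p>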
      <p>A nanopore sequencer reads an electrical signal induced
by the DNA passing through a pore, and before
subsequent analysis this signal needs to be translated to DNA
bases via base calling. The base caller provided by the
manufacturer (called Guppy) requires a machine with a
high-performance GPU, which is not available in many laptop
computers and is also a problem for desktops due to current
NVIDIA GPU shortages.</p>
      <p>
        In this work, we propose to use alternative base callers
with lower demands on computational resources, albeit
producing reads with a slightly lower accuracy (
        <xref ref-type="bibr" rid="ref1">Boža
et al., 2020</xref>
        ;
        <xref ref-type="bibr" rid="ref8">Perešíni et al., 2020</xref>
        ;
        <xref ref-type="bibr" rid="ref2">Boža et al., 2021</xref>
        ). We
demonstrate that our alternative base callers not only
allow monitoring, but can also produce a final sequence
of similar quality to that obtained with the standard base caller.
Moreover, we are able to call tentative variants during
sequencing from incomplete sequence data using a custom-made
classifier. This allows us to report important information about
the virus lineage already during the sequencing
run, well before the full sequence is determined.
      </p>
      <p>2 Evaluation of Alternative Base Callers</p>
      <p>
We have evaluated three alternative base callers that
can achieve real-time base calling without the use of a
GPU: Deepnano-blitz (
        <xref ref-type="bibr" rid="ref1">Boža et al., 2020</xref>
        ), Deepnano-Coral
(
        <xref ref-type="bibr" rid="ref8">Perešíni et al., 2020</xref>
        ), and Osprey (
        <xref ref-type="bibr" rid="ref2">Boža et al., 2021</xref>
        ).
There are also other alternative base callers such as
Bonito (
        <xref ref-type="bibr" rid="ref11">Seymour, 2020</xref>
        ) and SACall
        <xref ref-type="bibr" rid="ref10 ref12 ref4">(Huang et al., 2020)</xref>
        , but
none of them offers real-time base calling on a CPU or
a low-power USB-connected TPU.
      </p>
      <sec id="sec-1-1">
        <title>Figure 1</title>
        <p>[Figure 1: simplified illustration of the ARTIC protocol, panels a) to f). Recoverable panel labels: Input cDNA; Primer pool 1; Primer pool 2; Combination of pools. The remaining panel content (example read sequences and alignments) is not reproduced here.]</p>
      </sec>
      <sec id="sec-1-3">
        <p>
          Deepnano-blitz (
          <xref ref-type="bibr" rid="ref1">Boža et al., 2020</xref>
          ) is a real-time CPU
base caller based on recurrent neural networks.
Deepnano-blitz allows adjustment of the speed vs. accuracy tradeoff
by changing the size of the neural network model. The smaller
version (48) can run in real time on a single CPU core, while the
larger version (96) requires multiple cores to achieve
real-time performance. The accuracy of the smaller version is
slightly lower than that of Guppy 4.4 in the fast
mode; the larger version is comparable to Guppy 4.4 in the
fast mode.
        </p>
        <p>
          Deepnano-Coral (
          <xref ref-type="bibr" rid="ref8">Perešíni et al., 2020</xref>
          ) is a
convolutional neural network base caller. It requires a Coral Edge
TPU, a sub-$100 accelerator from Google that
connects to a USB port and has very low power
requirements. Deepnano-Coral is best suited for laptop
computers that do not have GPU support, as well as for
scenarios where power consumption may become a limiting
factor (such as sequencing in the field). The accuracy of
real-time base calling with Deepnano-Coral falls between
the Guppy 4.4 fast and high-accuracy (HAC) modes.
        </p>
        <p>
          Osprey (
          <xref ref-type="bibr" rid="ref2">Boža et al., 2021</xref>
          ) is a CPU-based base caller
that uses an architecture similar to Deepnano-Coral, but is
further improved by a technique called dynamic
pooling and by decoding via transducers. The accuracy of
real-time base calling is equivalent to Guppy 3.4 HAC and
better than Guppy 4.4 fast. Its computational requirements
are similar to those of Deepnano-blitz 96.
        </p>
        <p>
          Using faster base callers usually results in sacrificing
accuracy at the individual read level. However, in the case of
the ARTIC pipeline, multiple reads are aligned to each
region, and only differences that consistently occur in many
reads are considered proper variants. Moreover, the
ARTIC pipeline uses Nanopolish
          <xref ref-type="bibr" rid="ref5">(Loman et al., 2015)</xref>
          , which
works directly with the raw sequencing signal, as an
underlying variant caller. Therefore, base calling
accuracy is less critical, since base calls are only used for
demultiplexing and for the initial alignment of the read to
the reference in Nanopolish.
        </p>
        <p>The ARTIC pipeline sometimes calls a particular base
as unknown (denoted as N in the sequence). This can
happen for two reasons: low coverage of an amplicon or
conflicting information from different sequencing reads.
Assigning an unknown base represents a conservative
decision and is used wherever it is impossible to decide
with sufficient confidence whether a particular base matches
the reference or represents a mutation.</p>
        <p>
          We have evaluated the performance of each of the
above-mentioned base callers in the context of the
ARTIC pipeline. For evaluation purposes, we used
a sequencing run from January 13, 2021 with 23 barcoded
SARS-CoV-2 samples (the 24th sample was excluded due
to very low coverage), sequenced on a MinION with an R9.4.1
flow cell, LSK109 chemistry, and the 2-kbp amplicon scheme
by
          <xref ref-type="bibr" rid="ref10">Resende et al. (2020)</xref>
          . In the standard software pipeline,
we use Guppy 4.4 (the highest version available at the time of
analysis) in the high-accuracy mode to base call the reads,
followed by the ARTIC pipeline for variant calling. We
used the results of this standard pipeline as the ground truth.
        </p>
        <p>[Figure 2: coverage plot. Recoverable legend entries: barcodes 16 to 20 and 22 to 24; recoverable axis ticks: 20000, 25000, 30000.]</p>
        <p>For each of the above-mentioned base callers, as well
as for Guppy 3.4 in the high-accuracy mode, we reran the
ARTIC pipeline with their base calls and compared the
results (see Table 1). Guppy 3.4 was used as a
representative base caller from a year ago. We also ran all of our
base callers with a lower demultiplexing threshold, which
slightly increases the coverage, because more reads are
demultiplexed to individual samples.</p>
        <p>Only very few positions (up to 2 in 23 samples) are
called differently (B→B column). Even though these
clearly represent erroneous base calls (see Table 2), there
are so few of them that they do not impact the overall
accuracy significantly. The largest problem is an
increased number of “unknown” calls (B→N column).
These are mainly concentrated within a single 310 bp
region (21242-21551), which in several samples had
extremely low coverage (see Figure 2). With lower efficiency
of demultiplexing due to base calling errors, the coverage
of this region was in some samples pushed below the
minimum coverage threshold of the ARTIC pipeline, and the
region was consequently masked with Ns in the result. There were
several additional “unknown” calls of individual bases, which
were clustered around certain positions in the genome. We
suspect that this is due to biases stemming from
nanopore sequencing, where variants of some bases in
certain contexts are difficult to distinguish.</p>
        <p>On the other hand, some additional bases are called
compared to the baseline (N→B column). In all cases, these
were called as the original reference. Almost all cases
were at positions 16255 and 16256, and one case was in the
region 21220-21296, where coral-q50 increased the
coverage above the minimum threshold.</p>
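        <p>The three kinds of differences discussed above (a called base replaced by a different base, a called base replaced by N, and an N replaced by a called base) can be counted by a position-wise comparison of two consensus sequences. The following is a minimal sketch under the assumption that both sequences are in the same reference coordinates and have equal length; the function name and the dictionary keys are hypothetical and do not come from the evaluation scripts used for Table 1.</p>
        <preformat>
```python
def compare_consensus(baseline, candidate):
    """Count differences between two equal-length consensus sequences.

    Keys: "B->B" a base called differently, "B->N" a baseline base that
    became unknown, "N->B" a baseline unknown that became a called base.
    """
    if len(baseline) != len(candidate):
        raise ValueError("sequences must have equal length")
    diffs = {"B->B": 0, "B->N": 0, "N->B": 0}
    for old, new in zip(baseline, candidate):
        if old == new:
            continue  # identical calls (including N versus N) are not counted
        if old != "N" and new != "N":
            diffs["B->B"] += 1
        elif new == "N":
            diffs["B->N"] += 1
        else:
            diffs["N->B"] += 1
    return diffs


# Toy sequences for illustration only.
baseline = "ACGTNNAC"
candidate = "ACTTNANC"
print(compare_consensus(baseline, candidate))
```
        </preformat>
        <p>Insertions and deletions are ignored in this sketch, since it assumes the two sequences are already position-aligned.</p>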
        <p>While in some cases the use of our alternative base
callers may result in an incomplete sequence (compared
to the baseline), in general our results show that each of
these tools is a viable alternative to the standard base
calling with Guppy 4.4 in the high-accuracy mode, yielding
a final sequence of similar quality. While Guppy 4.4 HAC
requires a high-performance GPU to achieve a reasonable
running time, the alternatives only require a CPU or a
sub-$100 accelerator connected through a standard USB port.</p>
        <p>3 Determining Virus Lineages During Sequencing from Incomplete Data</p>
        <p>
          One of the key tasks in the analysis of sequenced
SARS-CoV-2 samples is determination of the virus lineage according
to the standardized lineage classification
          <xref ref-type="bibr" rid="ref10 ref12 ref9">(Rambaut et al.,
2020)</xref>
          . The standard tool to accomplish this task is
pangolin
          <xref ref-type="bibr" rid="ref6">(O’Toole et al., 2021)</xref>
          , which uses a machine
learning approach to determine the lineage from the finished
sequence. Pangolin currently fits a (single) decision tree
classifier to sequence data to determine the lineage. While
this approach seems to have high accuracy for complete
sequence data, it handles incomplete sequences by simply
filling the gaps with bases from the reference sequence. This
naturally leads to unpredictable changes in classification
as the sequence is being completed, since each new mutation
might lead to a complete change in the decision tree path.
        </p>
        <p>To quickly make a provisional lineage classification, even
for incomplete sequences during the sequencing run, we
propose a simple classification scheme based on a manually
curated list of characteristic mutations. We identify
such characteristic mutations for the lineages of
interest expected in a particular country at a particular time.
Each lineage also has a threshold on the number of characteristic
mutations that must be present to make a call, as shown in the
example in Figure 3.</p>
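        <p>A threshold-based scheme of this kind can be sketched in a few lines. The lineage names, mutation labels, and thresholds below are placeholders invented for illustration; they are not the curated list shown in Figure 3, and the function name is hypothetical.</p>
        <preformat>
```python
# Hypothetical panel: each lineage has a set of characteristic mutations
# and the minimum number of them that must be observed to make a call.
LINEAGE_PANEL = {
    "lineage_X": {"mutations": {"A100G", "C200T", "G300A", "T400C"}, "threshold": 3},
    "lineage_Y": {"mutations": {"C200T", "G500T", "A600C"}, "threshold": 2},
}


def call_lineages(observed, panel=LINEAGE_PANEL):
    """Return the lineages whose characteristic-mutation threshold is met.

    observed is the set of mutations provisionally called so far for a sample.
    """
    calls = []
    for name, spec in panel.items():
        hits = len(spec["mutations"].intersection(observed))
        if hits >= spec["threshold"]:
            calls.append(name)
    return calls


print(call_lineages({"A100G", "C200T", "G300A"}))
```
        </preformat>
        <p>Because mutations only accumulate as more reads arrive, a call made by such a rule never has to be retracted for lack of data. If lineages in the panel share mutations, more than one lineage may pass its threshold; how to resolve that is left open in this sketch.</p>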
        <p>During the sequencing run, we use a fast base caller
(Deepnano-blitz 48 in our experiments) to provide live
base calling. We then make provisional variant calls by
aligning individual sequencing reads
to the reference sequence and simply counting the
support for a mutation at a particular position. Note that this would be highly
imprecise for insertions and deletions due to the frequent indels
in nanopore sequencing reads. For this reason, we
focus only on single nucleotide variants. Once the number of
characteristic mutations passes the threshold for a
particular sample, the lineage is provisionally called.</p>
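        <p>Counting the support for a single nucleotide variant can be illustrated as follows. This is a simplified sketch, not the tool integrated into RAMPART: reads are represented as (start, sequence) pairs already aligned to the reference without gaps, and the min_fraction parameter and its default are assumptions made for the example. The default minimum depth of 20 mirrors the coverage threshold mentioned earlier for the standard pipeline.</p>
        <preformat>
```python
from collections import Counter


def snv_support(aligned_reads, position):
    """Count the bases that reads place at a 0-based reference position."""
    counts = Counter()
    for start, seq in aligned_reads:
        offset = position - start
        if offset >= 0 and len(seq) > offset:
            counts[seq[offset]] += 1
    return counts


def provisional_snv(reference, aligned_reads, position, min_depth=20, min_fraction=0.5):
    """Return a label such as "G3T" if a non-reference base dominates, else None."""
    counts = snv_support(aligned_reads, position)
    depth = sum(counts.values())
    if depth == 0 or min_depth > depth:
        return None  # not enough reads cover this position yet
    base, support = counts.most_common(1)[0]
    if base == reference[position] or min_fraction > support / depth:
        return None
    return f"{reference[position]}{position + 1}{base}"


# Toy data for illustration only.
reference = "ACGTACGT"
reads = [(0, "ACTTAC"), (1, "CTTACG"), (2, "TTACGT"), (0, "ACGT")]
print(provisional_snv(reference, reads, 2, min_depth=3))
```
        </preformat>
        <p>The label uses the common 1-based convention (reference base, position, observed base). Real alignments contain insertions and deletions, so a practical implementation would walk the alignment operations of each read rather than index into an ungapped string.</p>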
        <p>
          We have integrated our tool within the RAMPART
sequencing run monitoring framework
          <xref ref-type="bibr" rid="ref3">(Hadfield, 2021)</xref>
          and
tested its performance on three runs: one run with 24
barcoded samples, and two with 96 barcoded samples each.
There were no disagreements when both our tool and
pangolin called the lineage; however, in certain cases one or
the other tool did not make a call (see a summary of results
in Table 3).
        </p>
        <p>Figure 4 shows that our tool can provide early
information about lineages detected in the sequencing run. Even
though barcodes in our samples were highly unbalanced,
some samples can be identified within minutes of
starting the run, and our tool provided accurate detection
of lineages for 50% of barcodes as early as 40 minutes
from the start of a 96-barcode run. Due to the low quality
of some samples, we typically run the sequencing for
approximately 24 hours, so such on-the-fly analysis gives
us an opportunity to report basic information on the
sequenced samples to health authorities as early as one day
before the final analysis is finished.</p>
        <p>While our determination of single nucleotide sequence
variants is somewhat simplistic, Figure 5 shows that on
real data even such a simple method can achieve results
with high confidence. In all cases, mutations were
supported by over 85% of reads and there were no calls that
would suffer from ambiguity.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4 Conclusions and Discussion</title>
      <p>One of the great advantages of nanopore sequencing is the
ability to analyze data as they are sequenced. Fast base
callers that can replace the default base callers provided by
Oxford Nanopore Technologies are key to utilizing this
advantage. Here, we have evaluated fast base callers in the
context of the ARTIC pipeline and determined that they
can provide results of similar quality at a fraction of
the computational cost.</p>
      <p>[Figure: y-axis "Samples classified", ticks 0 to 80; caption not recovered.]</p>
      <p>In the case of the ARTIC pipeline, the quality of base
calls mainly affects the demultiplexing stage and does
not play as important a role in variant calling, since this
is done with the assistance of the raw sequencing signal.
Moreover, we have also demonstrated that fast base callers
can be used in the context of the RAMPART monitoring tool
to identify virus lineages on-the-fly during sequencing.
Such an application allows us to relay important information
to health authorities much faster.</p>
      <p>
        One of the advantages of the RAMPART monitoring tool
is that it can monitor in real time the coverage of all
regions in all barcoded samples, allowing us to make an
informed decision on when to stop the sequencing run. As
future work, we would like to use a similar framework
in connection with selective sequencing
        <xref ref-type="bibr" rid="ref7">(Payne et al.,
2021)</xref>
        to achieve a more uniform coverage between
samples, as well as to mitigate uneven coverage within
samples stemming from varying efficiency of individual PCR
primers, by rejecting reads belonging to the regions that
are already well covered.
      </p>
      <p>Acknowledgements. This research was supported by a
grant ITMS:313011ATL7 “Pangenomics for personalized
clinical management of infected persons based on
identified viral genome and human exome” from the Operational
Program Integrated Infrastructure (90%) co-financed by
the European Regional Development Fund. The research
was also supported by VEGA 1/0458/18 to TV (10%).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Boža</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perešíni</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brejová</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vinař</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Deepnano-blitz: a fast base caller for MinION nanopore sequencers</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>36</volume>
          (
          <issue>14</issue>
          ):
          <fpage>4191</fpage>
          -
          <lpage>4192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Boža</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perešíni</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brejová</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vinař</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Dynamic Pooling Improves Nanopore Base Calling Accuracy</article-title>
          .
          <source>London Calling 2021</source>
          poster.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Hadfield</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Rampart: Read assignment, mapping, and phylogenetic analysis in real time</article-title>
          . https://github.com/artic-network/rampart.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nie</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ni</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>SACall: a neural network basecaller for Oxford Nanopore sequencing data based on self-attention mechanism</article-title>
          .
          <source>IEEE/ACM Transactions on Computational Biology and Bioinformatics</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Loman</surname>
            ,
            <given-names>N. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quick</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>A complete bacterial genome assembled de novo using only nanopore sequencing data</article-title>
          .
          <source>Nat Methods</source>
          ,
          <volume>12</volume>
          (
          <issue>8</issue>
          ):
          <fpage>733</fpage>
          -
          <lpage>735</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>O'Toole</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scher</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Underwood</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jackson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hill</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCrone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abu-Dahab</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yeats</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>du Plessis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aanensen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pybus</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Rambaut</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>pangolin: lineage assignment in an emerging pandemic as an epidemiological tool</article-title>
          . github.com/cov-lineages/pangolin.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Payne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clarke</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Munro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Debebe</surname>
            ,
            <given-names>B. J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Loose</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2021</year>
          ).
          <article-title>Readfish enables targeted nanopore sequencing of gigabase-sized genomes</article-title>
          .
          <source>Nature biotechnology</source>
          ,
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <fpage>442</fpage>
          -
          <lpage>450</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Perešíni</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boža</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brejová</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vinař</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Nanopore Base Calling on the Edge</article-title>
          .
          <source>Technical Report arXiv:2011.04312</source>
          , arXiv.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Rambaut</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>E. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Toole</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hill</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCrone</surname>
            ,
            <given-names>J. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>du Plessis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Pybus</surname>
            ,
            <given-names>O. G.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology</article-title>
          .
          <source>Nat Microbiol</source>
          ,
          <volume>5</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1403</fpage>
          -
          <lpage>1407</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Resende</surname>
            ,
            <given-names>P. C.</given-names>
          </string-name>
          et al. (
          <year>2020</year>
          ).
          <article-title>SARS-CoV-2 genomes recovered by long amplicon tiling multiplex approach using nanopore sequencing and applicable to other sequencing platforms</article-title>
          .
          <source>Technical Report doi:10.1101/2020.04.30.069039</source>
          , bioRxiv.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Seymour</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Bonito: A PyTorch basecaller for Oxford Nanopore reads</article-title>
          . https://github.com/nanoporetech/bonito.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Tyson</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          et al. (
          <year>2020</year>
          ).
          <article-title>Improvements to the ARTIC multiplex PCR method for SARS-CoV-2 genome sequencing using nanopore</article-title>
          .
          <source>Technical Report doi:10.1101/2020.09.04.283077</source>
          , bioRxiv.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>