<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ITAI: Adaptive Neural Machine Translation Platform</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thierry Etchegoyhen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Ponce</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Harritxu Gete Ugarte</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Victor Ruiz Gomez</string-name>
        </contrib>
        <aff>Vicomtech Foundation, Basque Research and Technology Alliance (BRTA)</aff>
        <email>{tetchegoyhen, adponce, hgete, vruizg}@vicomtech.org</email>
      </contrib-group>
      <fpage>49</fpage>
      <lpage>52</lpage>
      <abstract>
        <p>We describe an adaptive neural machine translation platform which integrates continuous learning and supports multiple use cases in the translation industry. The application is being developed and evaluated within the applied research project ITAI. Research within the project has shown the potential of the platform to cover the main identified use cases and provide rapid adaptation via continuous learning.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Neural Machine Translation (NMT)
        <xref ref-type="bibr" rid="ref1 ref8">(Bahdanau, Cho, and Bengio, 2015; Vaswani et
al., 2017)</xref>
has brought significant gains in Machine Translation (MT) quality and has become the dominant paradigm in both academic research and commercial exploitation.
      </p>
      <p>This technology is being increasingly
integrated in the translation industry to support
growing translation needs in the digital era.</p>
      <p>Providing adequate support to the
translation industry requires taking two main
aspects into account.</p>
      <p>First, actual practices in the industry feature a wide array of scenarios depending on the IT infrastructure at hand and the network of translators working for a specific company in the field. Translation may thus be performed via computer-assisted translation (CAT) tools such as SDL Trados Studio or Wordfast, to name two of the main ones, in Content Management System (CMS) environments, or directly in document editors such as LibreOffice or MS Word. This disparity makes it difficult to bring MT technology to a significant portion of the translation industry.</p>
      <p>
        Secondly, MT systems are usually only updated periodically, when significant volumes of new training data become available, and therefore do not provide timely adaptation to the MT output corrections generated via post-editing. This limitation can result in a loss of productivity and increased frustration on the part of translators tasked with repeatedly correcting identical errors over time when querying MT engines. Continuous learning (CL) addresses this issue via continuous updates of MT models on the basis of post-edited machine translation output fed back to model training processes. In NMT, CL usually takes the form of Online Learning (OL), where each new pair of source sentence and post-edited translation is used to update the corresponding model
        <xref ref-type="bibr" rid="ref2 ref5 ref6 ref9">(Peris and Casacuberta, 2019; Turchi et al., 2017; Wuebker, Simianer, and DeNero, 2018; Domingo et al., 2019)</xref>
        ,
although CL could also be performed via
micro-batches with slightly delayed
integration of user feedback.
      </p>
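      <p>The per-pair update at the heart of OL can be sketched as a single stochastic gradient step. The following toy example (a linear model with squared loss, not the actual NMT optimiser) contrasts the immediate per-sample update with the slightly delayed micro-batch variant mentioned above:</p>
      <preformat>
```python
# Toy illustration of continuous learning (not the actual NMT optimiser):
# a linear model with squared loss, updated with one stochastic gradient
# step per (source, post-edited) pair, versus a slightly delayed
# micro-batch update.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def online_update(w, x, y, lr=0.1):
    """One SGD step on a single data point, as in online learning."""
    err = predict(w, x) - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)]

def microbatch_update(w, batch, lr=0.1):
    """Delayed alternative: average the gradient over a micro-batch."""
    grads = [0.0] * len(w)
    for x, y in batch:
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            grads[i] += err * xi
    return [wi - lr * g / len(batch) for wi, g in zip(w, grads)]
```
      </preformat>
      <p>With a micro-batch of size one, the two updates coincide; larger batches trade immediacy of adaptation for smoother updates.</p>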
      <p>One key aspect in CL is determining the proper trade-off between rapid adaptation of the models from user corrections and model stability over time, a topic which has only been partially explored so far. Optimal integration of CL methods remains a matter of active research and is of key importance to provide useful adaptive machine translation technology.</p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>In this paper, we describe ITAI, an
adaptive neural machine translation
platform which integrates continuous learning
and supports multiple use-cases in the
translation industry. The platform is being
developed within the applied research project
ITAI, partially supported by the Department
of Economic Development of the Basque
Government (Spri). The project started in
April 2019 and will finalise in December 2021.</p>
      <p>It is carried out by the following consortium: MondragonLingua1 (project coordinator), Emun2, iAmetza3, Mixer4, Tai Gabe5 and Vicomtech6. The project takes into account the translation requirements from each company and continuous translators' feedback across development cycles.</p>
    </sec>
    <sec id="sec-2">
      <title>2 ITAI</title>
      <p>The architecture of ITAI is described in Figure 1. The application consists of the following main elements:
• Front-ends, from which users may interact with the application. The front-ends include a web-based user interface (UI), plugins for specific CAT tools, and support for CMS integration.
• A REST API, which exposes the functionality of the back-end and handles user authentication.
• A back-end, which includes the required components to perform machine translation, manage the data generated from the use of the system, and manage the training and selection of continuously updated NMT models.</p>
      <p>The core workflow involves users requesting machine translation of texts or documents, post-editing the automated translations as needed, and sending validated translations to the system. These validated translations are fed back to the NMT models via continuous learning to produce incremental improvements of the models.</p>
      <p>1 https://www.mondragonlingua.com/en/ 2 https://www.emun.eus/en/ 3 https://iametza.eus/ 4 http://www.mixer.com.es/es/mixer/ 5 https://www.naiz.eus/ 6 https://www.vicomtech.org</p>
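      <p>This feedback loop can be sketched as follows; the class and method names are illustrative stand-ins, not the actual ITAI back-end interface, and a dictionary update stands in for a real continuous-learning step:</p>
      <preformat>
```python
# Minimal sketch of the translate / post-edit / validate / feed-back loop.
# Names are illustrative, not the actual (Go) back-end API; the dictionary
# update below stands in for a real model update.

class AdaptiveMT:
    def __init__(self):
        self.corrections = []   # validated (source, post-edited) pairs
        self.memory = {}        # adapted translations learned from feedback

    def translate(self, source):
        # Return the adapted translation if this source was already
        # corrected, otherwise fall back to the (stubbed) base model.
        return self.memory.get(source, "mt:" + source)

    def validate(self, source, post_edited):
        # Validated post-edits are fed back for continuous learning.
        self.corrections.append((source, post_edited))
        self.memory[source] = post_edited
```
      </preformat>
      <p>After a user validates a correction, the next request for the same source reflects it, mirroring the incremental improvement loop described above.</p>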
      <p>The ITAI UI is implemented in Angular and the back-end components in Go. We describe the main components and functionality in more detail in the following sections.</p>
      <sec id="sec-2-1">
        <title>2.1 Front-ends</title>
        <p>As noted in the introduction, translation activities in the industry cover a wide array of usage scenarios. To provide support for the main identified use cases, the application supports different entry points.</p>
        <p>
          We first addressed the most commonly used frameworks for multilingual content generation. CAT tools are an important part of the translation ecosystem, and we developed a specific plugin to connect the popular SDL Trados environment to the application, similarly to
          <xref ref-type="bibr" rid="ref2">Domingo et al. (2019)</xref>
          .
Other frameworks such as Wordfast Classic have also been configured and tested within the project. Other CAT environments with support for custom MT can be easily configured to interact with ITAI via its REST API. Additionally, ITAI supports integration within CMS environments, and specific developments are being carried out within the project for the Ubiquo7 environment.
        </p>
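        <p>As an illustration of this integration pattern, a client could assemble its requests as below; the endpoint URL, payload fields and token header are hypothetical, shown only to indicate how a CAT tool or CMS might call a custom-MT REST API:</p>
        <preformat>
```python
import json

# How a CAT tool or CMS could call a custom-MT REST API such as ITAI's.
# The endpoint URL, payload fields and token header below are hypothetical,
# shown only to illustrate the integration pattern.

API_URL = "https://itai.example/api/translate"   # hypothetical endpoint

def build_translate_request(text, src_lang, tgt_lang, token):
    """Assemble the pieces an HTTP client would send."""
    return {
        "url": API_URL,
        "headers": {
            "Authorization": "Bearer " + token,   # user authentication
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "source_lang": src_lang,
            "target_lang": tgt_lang,
            "text": text,
        }),
    }
```
        </preformat>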
        <p>We also developed a web-based user interface to provide an additional access point to the functionalities of the system. Such an environment was identified as necessary for two main reasons. First, translation is also being carried out professionally outside dedicated environments such as CAT tools, and without any technological support to improve productivity. Secondly, some proprietary CAT tools do not support the transfer of post-edited translations to external applications, so users' feedback cannot be reflected in the MT models. In either case, users are thus limited in their interaction with supporting MT technology.</p>
        <p>To address these issues, the ITAI UI offers full-fledged access to the MT technology supported by the back-end. Users with little or no access to MT technology may thus upload documents and retrieve automatically translated documents maintaining the original format. The translated documents can then be post-edited in an external environment and the resulting validated translations uploaded via the UI, where the content will be extracted to feed the MT models. Dual use cases are also supported, where users may work with a CAT tool that supports the integration of ITAI MT services, and upload translation memories or documents containing their post-edited data via the ITAI UI. Additionally, the UI provides a simple editor where users can directly post-edit machine-translated output that has been automatically segmented and filtered, as a default environment to correct and validate translations prior to sending them for integration in the MT models.</p>
        <p>7 https://www.ubiquo.me/</p>
        <p>[Figure 1: ITAI architecture. Front-ends (UI, CAT tools, CMS) connect through the REST API to the back-end; users translate text or documents, then post-edit and validate.]</p>
        <p>The user interface also provides
dashboards to monitor volumes of translated and
validated data, and list content that is
pending validation to dissociate use of MT from
the provision of post-edited data, as priorities
usually differ for these two activities.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Back-end</title>
        <p>The ITAI back-end provides support for three main types of functionality, which we describe in turn in the next sections.</p>
        <p>2.2.1 Machine translation</p>
        <p>Machine translation is carried out with Vicomtech's Itzuli MT toolkit, via its two main components: itzuli-translator and itzuli-doctrans.</p>
        <p>
          The former performs text translation and can be deployed in scalable Kubernetes mode or as a standalone platform on a dedicated server. It integrates MarianNMT
          (
          <xref ref-type="bibr" rid="ref4">Junczys-Dowmunt et al., 2018</xref>
          ) to perform efficient NMT inference and training.
        </p>
        <p>Document translation is done via itzuli-doctrans, a robust component for the translation of documents in a variety of formats (odt, docx, xlsx, pptx, html or xliff, among others). It performs content extraction, text translation via itzuli-translator, and document reconstruction with format preservation.</p>
        <p>Itzuli is a validated platform which
supports large-scale translation services and
provides ITAI with robust MT functionality.8</p>
        <p>
          All NMT models currently deployed in
ITAI are Transformer models
          <xref ref-type="bibr" rid="ref8">(Vaswani et
al., 2017)</xref>
          , trained on large volumes of
parallel, comparable and synthetic data.
Although the platform is agnostic in terms of
language pairs and domains, special
emphasis is placed within the project on
translation between Basque and English, French or
Spanish, to contribute to improving language
technology for the Basque language.
        </p>
        <sec id="sec-2-2-1">
          <title>2.2.2 Data management</title>
          <p>As one of the main goals of the project is to
gradually increase the quality of MT models
via continuous learning from user-generated
corrections and validations, data
management is a key functionality of the platform.</p>
          <p>
            Since the platform allows for the provision of post-edited data via documents, in addition to the provision of segment-level data, the component supports sentence alignment with a combination of the metrics generated by the HunAlign
            <xref ref-type="bibr" rid="ref7">(Varga et al., 2005)</xref>
            and STACC
            <xref ref-type="bibr" rid="ref3">(Etchegoyhen and Azpeitia, 2016)</xref>
            aligners. It also performs several types of filtering, to identify misaligned or noisy data, via alignment scores and regular expression-based filters. Data selection is then performed to determine relevant data for continuous learning, given previous history.
          </p>
          <p>8 It notably supports the internal and public MT services of the Basque Government (https://www.euskadi.eus/traductor/), MondragonLingua's commercial MT services for Basque (https://lingua.eus/eu/itzultzailea) and domain-specific translation, and Vicomtech's public platform for the improvement of Basque translation technology (https://www.batua.eus/).</p>
          <p>Finally, the component also generates
translation memories from validated aligned
data, which users can download as a
byproduct of the data management processes.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.3 Model management</title>
          <p>Previously unseen data validated by users
reach the model management component,
where continuous learning takes place. New
pairs, consisting of a source sentence and its
validated translation, are used to adapt the
relevant models, with a single update for
online learning using the appropriate learning
rate for the selected optimiser. Automatic
evaluation then takes place to measure the
impact of the update on both the new pairs
and static test sets for the models at hand.</p>
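          <p>A possible shape for this post-update check is sketched below; the toy accuracy metric and tolerance value stand in for the automatic MT evaluation actually used:</p>
          <preformat>
```python
# Sketch of a post-update check: score the adapted model on the new pairs
# (adaptation) and on a static test set (stability), and reject the update
# if stability degrades beyond a tolerance. accuracy() and the tolerance
# value are toy stand-ins for the automatic MT metrics.

def accuracy(model, pairs):
    hits = sum(1 for src, ref in pairs if model(src) == ref)
    return hits / len(pairs)

def accept_update(old_model, new_model, new_pairs, static_set, tolerance=0.02):
    adapted = accuracy(new_model, new_pairs) > accuracy(old_model, new_pairs)
    drop = accuracy(old_model, static_set) - accuracy(new_model, static_set)
    return adapted and tolerance >= drop
```
          </preformat>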
          <p>Although online learning is a relevant framework to adapt MT models on the fly, it is still an open matter to determine an optimal balance between aggressive adaptation, required for online learning to take effect on the basis of single data points, and model stability over time, necessary to maintain the overall quality of the models.</p>
          <p>Several experiments are being carried out within the ITAI project to determine the appropriate configurations in this respect. Current results tend to favour a hybrid approach, with online learning performed for rapid MT adaptation useful to the users of the system, and batch fine-tuning over prior model training checkpoints once the volumes of accumulated new data reach a significant threshold. Model management processes for continuous learning will be adapted as necessary as final conclusions are reached within the project regarding continuous learning for neural machine translation.</p>
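          <p>Such a hybrid strategy can be sketched as a simple controller; the threshold value and the event strings standing in for real training hooks are illustrative:</p>
          <preformat>
```python
# Sketch of a hybrid continuous-learning controller: an online update for
# every validated pair, plus a batch fine-tuning run from a prior
# checkpoint once accumulated data passes a threshold. The threshold value
# and the event strings standing in for real training hooks are
# illustrative.

class HybridCL:
    def __init__(self, batch_threshold=1000):
        self.batch_threshold = batch_threshold
        self.accumulated = []   # validated pairs since the last fine-tuning
        self.events = []        # record of triggered training actions

    def feed(self, pair):
        self.events.append("online_update")       # rapid per-pair adaptation
        self.accumulated.append(pair)
        if len(self.accumulated) >= self.batch_threshold:
            self.events.append("batch_finetune")  # consolidate from checkpoint
            self.accumulated = []
```
          </preformat>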
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Conclusions</title>
      <p>In this paper, we described a neural machine translation platform which supports continuous learning and multiple use cases in the translation industry. The application is already operative within the applied research project ITAI and will be finalised in 2021. Research within the project has shown the potential of the platform to cover the main identified use cases and provide rapid adaptation via continuous learning. It also uncovered the need to further explore continuous learning for neural machine translation in order to reach an optimal balance between rapid adaptation and model stability over time.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>Bahdanau, D., K. Cho, and Y. Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proc. of ICLR.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>Domingo, M., M. García-Martínez, A. Estela Pastor, L. Bié, A. Helle, A. Peris, F. Casacuberta, and M. Herranz Perez. 2019. Demonstration of a neural machine translation system with online learning for translators. In Proc. of ACL, pages 70–74.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>Etchegoyhen, T. and A. Azpeitia. 2016. Set-Theoretic Alignment for Comparable Corpora. In Proc. of ACL, pages 2009–2018.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>Junczys-Dowmunt, M., R. Grundkiewicz, T. Dwojak, H. Hoang, K. Heafield, T. Neckermann, F. Seide, U. Germann, A. Fikri Aji, N. Bogoychev, A. F. T. Martins, and A. Birch. 2018. Marian: Fast neural machine translation in C++. In Proc. of ACL, pages 116–121, July.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>Peris, A. and F. Casacuberta. 2019. Online learning for effort reduction in interactive neural machine translation. Computer Speech &amp; Language, 58:98–126.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>Turchi, M., M. Negri, A. Farajian, and M. Federico. 2017. Continuous learning from human post-edits for neural machine translation. The Prague Bulletin of Mathematical Linguistics, 108:233–244.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>Varga, D., L. Nemeth, P. Halacsy, A. Kornai, V. Tron, and V. Nagy. 2005. Parallel corpora for medium density languages. In Proc. of RANLP, pages 590–596.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 6000–6010.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>Wuebker, J., P. Simianer, and J. DeNero. 2018. Compact personalized models for neural machine translation. In Proc. of EMNLP.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>