<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>and Yoram Singer. 2011.
Adaptive Subgradient Methods for Online Learning
and Stochastic Optimization. Journal of Machine
Learning Research</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Online Neural Automatic Post-editing for Neural Machine Translation</article-title>
      </title-group>
      <contrib-group>
        <aff>Fondazione Bruno Kessler - Trento, Italia</aff>
        <aff>MMT Srl - Trento, Italia</aff>
        <email>[negri,turchi,bertoldi,federico]@fbk.eu</email>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <abstract>
        <p>English. Machine learning from user corrections is key to the industrial deployment of machine translation (MT). We introduce the first on-line approach to automatic post-editing (APE), i.e. the task of automatically correcting MT errors. We present experimental results of APE on English-Italian MT by simulating human post-edits with human reference translations, and by applying online APE on MT outputs of increasing quality. By evaluating APE on generic vs. specialised and static vs. adaptive neural MT, we address the question: At what cost on the MT side will APE become useless?</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Italiano (in English translation). Machine learning from user corrections is fundamental to the industrial development of machine translation (MT). In this work, we introduce the first online approach to automatic post-editing (APE), i.e. the task of automatically correcting MT errors. We present results of online APE on English-to-Italian MT, simulating human corrections with already available manual translations and using MT of increasing quality. By evaluating APE on generic or specialised, static or adaptive neural MT, we address the underlying question: at what cost on the MT side will APE become useless?</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>
        Automatic post-editing (APE) for MT is a supervised learning task aimed at correcting errors in machine-translated text
        <xref ref-type="bibr" rid="ref13 ref23">(Knight and Chander, 1994; Simard
et al., 2007)</xref>
        . Cast as a problem of
“monolingual translation” (from raw MT output into
improved text in the same target language), APE
has followed a similar evolution to that of MT.
As in MT, APE research received a strong boost
from shared evaluation exercises like those
organized within the well-established WMT
Conference on Machine Translation
        <xref ref-type="bibr" rid="ref18 ref24 ref7">(Chatterjee et al.,
2018)</xref>
        . In terms of approaches, early MT-like
phrase-based solutions
        <xref ref-type="bibr" rid="ref15 ref2 ref20 ref4">(Béchara et al., 2011; Rosa
et al., 2013; Lagarda et al., 2015; Chatterjee et
al., 2015)</xref>
        have been recently outperformed and
replaced by neural architectures that now represent
the state of the art
        <xref ref-type="bibr" rid="ref10 ref11 ref18 ref21 ref24 ref3 ref5 ref6 ref7">(Junczys-Dowmunt and
Grundkiewicz, 2016; Chatterjee et al., 2017a;
Tebbifakhr et al., 2018; Junczys-Dowmunt and
Grundkiewicz, 2018)</xref>
        . From the industry standpoint,
APE has started to attract MT market players
interested in combining the two technologies to
support human translation in professional workflows
        <xref ref-type="bibr" rid="ref8">(Crego et al., 2016)</xref>
        .
      </p>
      <p>
        Focusing on this industry-oriented perspective, this paper takes a further step in APE research by exploring an online neural approach to the task. The goal is to leverage human feedback (post-edits) to improve a neural APE model on the fly, without the need to stop it for fine-tuning or re-training from scratch. Online learning capabilities are crucial (both for APE and MT) in computer-assisted translation scenarios where professional translators operate on suggestions provided by machines. In such scenarios, human corrections represent an invaluable source of knowledge that systems should exploit to enhance users’ experience and increase their productivity. Towards these objectives we provide three contributions. The first is an online approach to neural APE, the first of its kind. Indeed, while MT-like online learning techniques have been proposed for phrase-based APE
        <xref ref-type="bibr" rid="ref1 ref19 ref20 ref22 ref3 ref5 ref6">(Ortiz-Martínez and Casacuberta, 2014; Simard
and Foster, 2013; Chatterjee et al., 2017b)</xref>
        , nothing has been done yet under the state-of-the-art neural paradigm. The second contribution is the first evaluation of neural APE run on the output of neural MT (NMT). So far, published results report significant gains<sup>1</sup> when APE is run to correct the output of a phrase-based MT system. To our knowledge, the true potential of APE with higher-quality NMT output has not been investigated yet. This last observation introduces a more general discussion on the relation between MT and APE. Since, by definition, APE’s raison d’être is the sub-optimal quality of MT output, one might wonder whether the level of current MT technology still justifies efforts on APE. Along this direction, our third contribution is an analysis of online neural APE applied to the output of NMT systems featuring different levels of performance. Our competitors range from a generic model trained on large parallel data (mimicking the typical scenario in which industry users, e.g. Language Service Providers (LSPs), rely on web-based services or other black-box systems) to highly customized online models (like those that LSPs would desire but typically cannot afford). Our experiments in this range of conditions aim to shed light on the future of APE from the industry standpoint by answering the question: At what cost on the MT side will APE become useless?
      </p>
    </sec>
    <sec id="sec-3">
      <title>2 Online neural APE</title>
      <p>
        APE training data usually consist of (src, mt, hpe) triplets whose elements are: a source sentence (src), its translation (mt) and a human correction of the translated sentence (hpe). Models trained on such triplets are then used to correct the mt element of (src, mt) test data. Neural approaches to the task have shown their effectiveness in batch conditions, in which a static pre-trained model is run on the whole test corpus. In an online setting, instead, APE systems should ideally evolve continuously by learning stepwise from the interaction with the user. This means that, each time a new post-edit becomes available, the model has to update its parameters on the fly in order to produce better output for the next incoming sentence. To this end, we extend a batch APE model with the capability to learn continuously from human corrections of its own output. This is done in two steps:
      </p>
      <p>(1) Before post-editing, by means of an instance selection mechanism that updates the model by learning from previously collected triplets that are similar to the input test item (see lines 2-5 in Algorithm 1);</p>
      <p>(2) After post-editing, by means of a model adaptation procedure that learns from human revisions of the last automatic correction generated by the system (lines 8-10).</p>
      <p><sup>1</sup> Up to 7.6 BLEU points at WMT 2017 <xref ref-type="bibr" rid="ref3">(Bojar et al., 2017)</xref>.</p>
      <p>
        Similar to the methods proposed in
        <xref ref-type="bibr" rid="ref3 ref5 ref6">(Chatterjee et al., 2017b)</xref>
        and
        <xref ref-type="bibr" rid="ref5 ref9">(Farajian et al., 2017)</xref>
        ,
the instance-selection technique (first update step)
consists of two components: i) a knowledge base
(KB) that is continuously fed with the processed
triplets, and ii) an information retrieval engine
that, given the (src, mt) test item, selects the most
similar triplet (lines 2-3). The engine is
simultaneously queried using both src and mt segments
and it returns the triplet that has the highest
cosine similarity with both (Top(R)). If the
similarity is above a threshold τ, a few training iterations
are run to update the model parameters (line 5).
Depending on the application scenario, KB can be
pre-filled with the APE training data or left empty
and filled only with the incoming triplets. In our
experiments, the repository is initially empty.
      </p>
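      <p>The retrieval step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words representation, the averaging of the src-side and mt-side cosine scores, and the names cosine and retrieve_top are assumptions made only for illustration.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bag-of-words token lists."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top(src, mt, kb):
    """Return (best_triplet, similarity), or (None, 0.0) for an empty KB.

    Assumption: the joint score is the mean of the src-side and
    mt-side cosine similarities; the paper does not specify this.
    """
    best, best_sim = None, 0.0
    for triplet in kb:
        kb_src, kb_mt, _hpe = triplet
        sim = 0.5 * (cosine(src.split(), kb_src.split())
                     + cosine(mt.split(), kb_mt.split()))
        if sim > best_sim:
            best, best_sim = triplet, sim
    return best, best_sim


# Hypothetical knowledge base of (src, mt, hpe) triplets.
kb = [
    ("open the file", "apri il file", "apri il file"),
    ("close the window", "chiudi la finestra", "chiudi la finestra"),
]
top, sim = retrieve_top("open the file now", "apri il file ora", kb)
```
]]></preformat>
      <p>In this sketch, a retrieved triplet would trigger a model update only when the returned similarity exceeds the threshold.</p>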
      <preformat>Algorithm 1: Online neural APE
Require M: Trained APE model
Require Ts: Stream of test data
Require KB: Pool of (src, mt, hpe) triplets
 1: while pop (src, mt) from Ts do
 2:   R ← Retrieve((src, mt), KB)
 3:   (src_top, mt_top, hpe_top) ← Top(R)
 4:   if Sim((src_top, mt_top, hpe_top), (src, mt)) &gt; τ then
 5:     M* ← Update(M, (src_top, mt_top, hpe_top))
 6:   ape ← APE(M*, (src, mt))
 7:   hpe ← HumanPostEdit((src, ape))
 8:   KB ← KB ∪ (src, mt, hpe)
 9:   M** ← Update(M*, (src, mt, hpe))
10:   M ← M**
11: end while</preformat>
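      <p>The control flow of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the system used in the paper: MemoryModel is a toy stand-in that simply memorises corrections, exact_retrieve replaces the similarity-based retrieval engine, and the model is updated in place instead of producing separate intermediate copies. All names are hypothetical.</p>
      <preformat><![CDATA[
```python
class MemoryModel:
    """Toy stand-in for a neural APE model: memorises (src, mt) -> hpe."""

    def __init__(self):
        self.memory = {}

    def update(self, triplet):
        src, mt, hpe = triplet
        self.memory[(src, mt)] = hpe

    def correct(self, src, mt):
        # Fall back to the raw MT output when nothing has been learned.
        return self.memory.get((src, mt), mt)


def exact_retrieve(src, mt, kb):
    """Simplified retrieval: exact match scores 1.0, anything else 0.0."""
    for triplet in kb:
        if triplet[0] == src and triplet[1] == mt:
            return triplet, 1.0
    return None, 0.0


def online_ape(model, stream, kb, retrieve, human_post_edit, tau=0.5):
    """Online loop; comments give the matching lines of Algorithm 1."""
    outputs = []
    for src, mt in stream:                      # line 1
        top, sim = retrieve(src, mt, kb)        # lines 2-3
        if top is not None and sim > tau:       # line 4
            model.update(top)                   # line 5
        ape = model.correct(src, mt)            # line 6
        hpe = human_post_edit(src, ape)         # line 7
        kb.append((src, mt, hpe))               # line 8
        model.update((src, mt, hpe))            # lines 9-10
        outputs.append(ape)
    return outputs
```
]]></preformat>
      <p>As in the experiments described in the paper, the human post-edit can be simulated by passing a function that returns the reference translation of each source sentence.</p>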
      <p>Once the hpe has been generated, the second update step takes place (line 9) by running a few training iterations on the (src, hpe) pair. When training on a single data point, the learning rate and the number of epochs play a crucial role, because values that are too high or too low can make training unstable or inefficient. To avoid such problems, we connect the two parameters by applying a time-based learning rate decay that reduces the learning rate as the number of epochs increases (i.e. lr = lr / (1 + num_epoch)). In our experiments, this strategy results in better performance than setting a fixed learning rate.</p>
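      <p>The decay rule can be made concrete with a short Python sketch. It assumes that the decay is applied to the initial learning rate and that epochs are counted from zero; the text does not state either detail, and the function name decayed_learning_rates is hypothetical.</p>
      <preformat><![CDATA[
```python
def decayed_learning_rates(base_lr, num_epochs):
    """Per-epoch learning rates under lr = base_lr / (1 + epoch).

    Assumptions: `epoch` is zero-based and the decay is computed from
    the initial rate rather than compounded from the previous epoch.
    """
    return [base_lr / (1 + epoch) for epoch in range(num_epochs)]


# With the values reported later in the paper (rate 0.01, 3 epochs).
schedule = decayed_learning_rates(0.01, 3)
```
]]></preformat>
      <p>Under these assumptions the first epoch uses the full rate, the second half of it and the third one third of it.</p>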
    </sec>
    <sec id="sec-4">
      <title>3 Experiments</title>
      <p>We run our experiments on English-Italian data, by comparing the performance of different neural APE models (batch and online) used to correct the output of NMT systems of increasing quality.</p>
      <sec id="sec-4-1">
        <title>3.1 Data</title>
        <p>To train our NMT models we use both generic and in-domain data. Generic data cover a variety of domains. They comprise about 53M parallel sentences collected from publicly available collections (i.e. all the English-Italian parallel corpora available on OPUS<sup>2</sup>) and about 50M sentence pairs from proprietary translation memories. Generic data, whose size is per se sufficient to train a competitive general-purpose engine, are used to build our basic NMT model. On top of it, in-domain (information technology) data are used in different ways to obtain improved, domain-adapted models. In-domain data are selected to emulate the online setting of industrial scenarios where input documents are processed sequentially on a sentence-by-sentence basis. They consist of a proprietary translation project of about 421K segments, which are split into training (416K segments) and test (5,472 segments) sets, keeping the sentence order. Post-edits are simulated using references.</p>
        <p>
          To train the APE models we use the English-Italian section of the eSCAPE corpus
          <xref ref-type="bibr" rid="ref18 ref7">(Negri et al.,
2018)</xref>
          . It consists of about 6.6M synthetically created triplets in which the mt element is produced with phrase-based and neural MT systems.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2 NMT models</title>
        <p>
          Our NMT models feature increasing levels of complexity, so as to represent a range of conditions in which a user (say, a Language Service Provider) has access to different resources in terms of MT technology and/or data for training and adaptation. Our systems, ranked in terms of complexity with respect to these two dimensions, are:
        </p>
        <p>Generic (G). This model is trained on the large (103M) multi-domain parallel corpus. It represents the situation in which our LSP entirely relies on an off-the-shelf, black-box MT engine that cannot be improved via domain adaptation.</p>
        <p>Generic Online (GO). This model extends G with the capability to learn from the incoming human post-edits (5,472 test items). Before and after translation, a few training iterations adapt it to the domain of the input document. The adaptation steps implement the same strategies as the online APE system (see §2). This setting represents the situation in which our LSP has access to the inner workings of a competitive online NMT system.</p>
        <p>
          Specialized (S). This model is built by fine-tuning
          <xref ref-type="bibr" rid="ref16 ref17 ref4">(Luong and Manning, 2015)</xref>
          G on the in-domain training data (416K). It reflects the condition in which our LSP has access both to the customer’s data and to the inner workings of a competitive batch NMT engine. The adaptation routine, however, is limited to the standard approach of performing additional training steps on the in-domain data.
        </p>
        <p>Specialized Online (SO). This model is built by combining the functionalities of GO and S. It uses the in-domain training data for fine-tuning and the incoming (src, hpe) pairs for online adaptation to the target domain. This setting represents the situation in which our LSP has access to: i) the customer’s in-domain data and ii) the inner workings of a competitive online NMT engine.</p>
        <p><sup>2</sup> http://opus.lingfil.uu.se, dump of mid June 2017.</p>
        <p>
          All the models are trained with the ModernMT open source software,<sup>3</sup> which is built on top of OpenNMT-py
          <xref ref-type="bibr" rid="ref12">(Klein et al., 2017)</xref>
          . It employs
an LSTM-based recurrent architecture with
attention
          <xref ref-type="bibr" rid="ref1">(Bahdanau et al., 2014)</xref>
          using 2 bi-directional
LSTM layers in the encoder, 4 left-to-right LSTM
layers in the decoder, and a dot-product attention
model
          <xref ref-type="bibr" rid="ref16 ref17">(Luong et al., 2015)</xref>
          . In our experiments we used an embedding size of 1024, LSTMs of size 1024, and a source and target vocabulary of 32K words, jointly trained with the BPE algorithm
          <xref ref-type="bibr" rid="ref21">(Sennrich et al., 2016)</xref>
          . The fact that ModernMT
already implements the online adaptation method
presented in
          <xref ref-type="bibr" rid="ref5 ref9">(Farajian et al., 2017)</xref>
          simplified our tests with online neural APE run on the output of competitive NMT systems (GO and SO).
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3 APE models</title>
        <p>We experiment with two neural APE systems:</p>
        <p>Generic APE. This batch system is trained only on generic data (6.6M triplets from eSCAPE) and is similar to those tested in the APE shared task at WMT. The main difference is that the training data are neither merged with in-domain triplets nor selected based on target domain information.</p>
        <p>Online APE. This system is trained on the generic data and continuously learns from human post-edits of the test set as described in §2.</p>
        <p><sup>3</sup> http://github.com/ModernMT/MMT</p>
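        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>MT Type</th><th>MT (BLEU)</th><th>Generic APE (BLEU)</th></tr>
            </thead>
            <tbody>
              <tr><td>Generic (G)</td><td>40.3</td><td>39.0</td></tr>
              <tr><td>Gen. Online (GO)</td><td>45.6</td><td>41.9</td></tr>
              <tr><td>Specialized (S)</td><td>52.1</td><td>45.5</td></tr>
              <tr><td>Spec. Online (SO)</td><td>55.0</td><td>47.4</td></tr>
            </tbody>
          </table>
        </table-wrap>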
        <p>
          The two systems are based on a multi-source
attention-based encoder-decoder approach
similar to
          <xref ref-type="bibr" rid="ref3 ref5 ref6">(Chatterjee et al., 2017a)</xref>
          . It employs a GRU-based recurrent architecture with attention and uses two independent encoders to process the src and mt segments. Similar to the NMT systems, it is trained on sub-word units by using BPE, with a vocabulary created by selecting the 50K most frequent sub-words. Word embedding and GRU hidden state sizes are set to 1024. Network parameters are optimized with Adagrad (Duchi et al., 2011) with a learning rate of 0.01. A development set randomly extracted from the training data is used to set the similarity threshold used by the online model for the first update step (τ = 0.5) as well as the learning rate (0.01) and the number of epochs (3) of both adaptation steps.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Results and discussion</title>
      <p>APE results computed on different levels of
translation quality are reported in Table 1. Looking
at the NMT performance, all the adaptation
techniques yield significant improvements over the
Generic model (G). The large gain achieved via
fine-tuning on in-domain data (S: +11.8 BLEU) is
further increased when adding online learning
capabilities on top of it to create the most
competitive Specialized Online system (SO: +14.7).</p>
      <p>
        As expected, the batch APE model trained on generic data only (that is, without in-domain information) is unable to improve the quality of raw MT output. Moreover, although APE results increase with higher translation quality, the performance gap with respect to the more competitive NMT systems also widens (from -1.3 points for G to -7.6 for SO). These results confirm the WMT findings about the importance of domain customization for batch APE
        <xref ref-type="bibr" rid="ref3">(Bojar et
al., 2017)</xref>
        , and call for online solutions capable of maximizing knowledge exploitation at test time by learning from user feedback.
      </p>
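      <p>The gaps quoted above follow directly from the BLEU scores in Table 1. The Python sketch below only redoes that arithmetic; the dictionary and function names are illustrative.</p>
      <preformat><![CDATA[
```python
# BLEU scores as reported in Table 1.
mt_bleu = {"G": 40.3, "GO": 45.6, "S": 52.1, "SO": 55.0}
generic_ape_bleu = {"G": 39.0, "GO": 41.9, "S": 45.5, "SO": 47.4}


def deltas(scores, baseline):
    """Per-system difference scores - baseline, rounded to one decimal."""
    return {name: round(scores[name] - baseline[name], 1) for name in scores}


# Gap between the generic (batch) APE output and the raw MT output.
ape_vs_mt = deltas(generic_ape_bleu, mt_bleu)

# Gain of each NMT system over the Generic model G.
nmt_vs_generic = {name: round(score - mt_bleu["G"], 1)
                  for name, score in mt_bleu.items()}
```
]]></preformat>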
      <p>Online APE achieves significant<sup>4</sup> improvements not only over the output of G (+6.8) and its online extension GO (+2.5), but also over the specialized model S (+1.4). The gain over GO is particularly interesting: it shows that even when APE and MT use the same in-domain data for online adaptation, the APE model is more reactive to human feedback. Though the APE model is trained on a much smaller generic corpus (6.6M triplets versus 103M parallel sentences), the possibility to leverage richer information in the form of (src, mt, hpe) instances at test time seems to have a positive impact. A deeper exploration of this aspect falls outside the scope of this paper and is left as future work.</p>
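      <p>Significance in these comparisons is assessed with paired bootstrap resampling (Koehn, 2004). The Python sketch below illustrates the general procedure under a simplifying assumption: each system is summarised by per-sentence scores whose sum is compared on every resampled test set, whereas a BLEU-based test would recompute corpus-level BLEU on each resample. The function name and its parameters are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A beats system B.

    Assumption: quality is the sum of per-sentence scores; with BLEU,
    the corpus-level score would be recomputed on each resample instead.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_samples):
        # Draw sentence indices with replacement, same draw for both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_samples


# Hypothetical per-sentence scores for two systems.
system_a = [0.9, 0.8, 0.7, 0.95]
system_b = [0.1, 0.2, 0.3, 0.05]
win_rate = paired_bootstrap(system_a, system_b)
```
]]></preformat>
      <p>In this scheme, a win rate of at least 0.95 corresponds to a difference that is significant at the 95% level.</p>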
      <p>Also with online APE, the gains become smaller as MT quality increases, reaching a point where the system can only approach the highest MT performance of SO (with a non-significant -0.2 BLEU difference). This confirms that correcting the output of competitive NMT engines is a hard task, even for a dynamic APE system that learns from the interaction with the user. However, besides improving its performance by learning from user feedback acquired at test time (similar to the APE system), SO also relies on previous fine-tuning on a large in-domain corpus (similar to S). To answer our initial question (“At what cost on the MT side will APE become useless?”) it is worth remarking that leveraging in-domain training/adaptation data is a considerable advantage for MT but it comes at a cost that should not be underestimated. In terms of the data itself, collecting enough parallel sentences for each target domain is a considerable bottleneck that limits the scalability of competitive NMT solutions. In addition, the technology requirements (i.e. having access to the inner workings of the NMT engine) and the computational costs involved (for fine-tuning the generic model) are constraints that few LSPs are probably able to satisfy.</p>
    </sec>
    <sec id="sec-6">
      <title>5 Conclusion</title>
      <p>
        We introduced an online neural APE system, which is trained on generic data and only exploits user feedback to improve its performance, and evaluated it on the output of NMT systems featuring increasing complexity and in-domain data demand. Our results show the effectiveness of current APE technology in the typical setting of most LSPs. We also conclude that, in terms of resources (especially in-domain data) and technical expertise needed, developing MT engines that make APE useless is still a prerogative of few.
      </p>
      <p><sup>4</sup> Statistical significance is computed with paired bootstrap resampling <xref ref-type="bibr" rid="ref14">(Koehn, 2004)</xref>.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Dzmitry</given-names>
            <surname>Bahdanau</surname>
          </string-name>
          , Kyunghyun Cho, and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Neural Machine Translation by Jointly Learning to Align and Translate</article-title>
          .
          <source>arXiv preprint arXiv:1409.0473</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          Hanna Béchara, Yanjun Ma, and Josef van Genabith.
          <year>2011</year>
          .
          <article-title>Statistical Post-Editing for a Statistical MT System</article-title>
          .
          <source>In Proceedings of the 13th Machine Translation Summit</source>
          , pages
          <fpage>308</fpage>
          -
          <lpage>315</lpage>
          , Xiamen, China, September.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Ondřej</given-names>
            <surname>Bojar</surname>
          </string-name>
          , Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shujian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Turchi</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Findings of the 2017 Conference on Machine Translation (WMT17)</article-title>
          .
          <source>In Proceedings of the Second Conference on Machine Translation</source>
          , Volume
          <volume>2</volume>
          :
          Shared Task Papers
          , pages
          <fpage>169</fpage>
          -
          <lpage>214</lpage>
          , Copenhagen, Denmark, September.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Rajen</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          , Marion Weller, Matteo Negri, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Turchi</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Exploring the Planet of the APEs: a Comparative Study of State-of-the-art Methods for MT Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>156</fpage>
          -
          <lpage>161</lpage>
          , Beijing, China, July.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Rajen</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Amin</given-names>
            <surname>Farajian</surname>
          </string-name>
          , Matteo Negri, Marco Turchi, Ankit Srivastava, and Santanu Pal. 2017a.
          <article-title>Multi-source Neural Automatic Post-Editing: FBK's participation in the WMT 2017 APE shared task</article-title>
          .
          <source>In Proceedings of the Second Conference on Machine Translation</source>
          , Volume
          <volume>2</volume>
          :
          Shared Task Papers
          , pages
          <fpage>630</fpage>
          -
          <lpage>638</lpage>
          , Copenhagen, Denmark, September.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Rajen</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          , Gebremedhen Gebremelak, Matteo Negri, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Turchi</surname>
          </string-name>
          . 2017b.
          <article-title>Online Automatic Post-editing for MT in a Multi-Domain Translation Environment</article-title>
          .
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume</source>
          <volume>1</volume>
          ,
          Long Papers
          , pages
          <fpage>525</fpage>
          -
          <lpage>535</lpage>
          , Valencia, Spain, April.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Rajen</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          , Matteo Negri, Raphael Rubino, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Turchi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Findings of the WMT 2018 Shared Task on Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the Third Conference on Machine Translation</source>
          , Brussels, Belgium, October. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Josep</given-names>
            <surname>Crego</surname>
          </string-name>
          , Jungi Kim, Guillaume Klein, Anabel Rebollo, Kathy Yang, Jean Senellart, Egor Akhanov, Patrice Brunelle, Aurelien Coquard,
          <string-name>
            <given-names>Yongchao</given-names>
            <surname>Deng</surname>
          </string-name>
          , et al.
          <year>2016</year>
          .
          <article-title>SYSTRAN's Pure Neural Machine Translation Systems</article-title>
          .
          <source>arXiv preprint arXiv:1610.05540</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>M. Amin</given-names>
            <surname>Farajian</surname>
          </string-name>
          , Marco Turchi, Matteo Negri, and
          <string-name>
            <given-names>Marcello</given-names>
            <surname>Federico</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Multi-Domain Neural Machine Translation through Unsupervised Adaptation</article-title>
          .
          <source>In Proceedings of the Second Conference on Machine Translation</source>
          , pages
          <fpage>127</fpage>
          -
          <lpage>137</lpage>
          , Copenhagen, Denmark, September.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Marcin</given-names>
            <surname>Junczys-Dowmunt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Roman</given-names>
            <surname>Grundkiewicz</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Log-linear Combinations of Monolingual and Bilingual Neural Machine Translation Models for Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the First Conference on Machine Translation</source>
          , pages
          <fpage>751</fpage>
          -
          <lpage>758</lpage>
          , Berlin, Germany,
          August
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Marcin</given-names>
            <surname>Junczys-Dowmunt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Roman</given-names>
            <surname>Grundkiewicz</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Microsoft and University of Edinburgh at WMT2018: Dual-Source Transformer for Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the Third Conference on Machine Translation</source>
          , Brussels, Belgium, October.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Guillaume</given-names>
            <surname>Klein</surname>
          </string-name>
          , Yoon Kim, Yuntian Deng, Jean Senellart, and
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Rush</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>OpenNMT: Open-Source Toolkit for Neural Machine Translation</article-title>
          .
          <source>In Proceedings of ACL 2017, System Demonstrations</source>
          , pages
          <fpage>67</fpage>
          -
          <lpage>72</lpage>
          ,
          July
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Knight</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ishwar</given-names>
            <surname>Chander</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>Automated Post-Editing of Documents</article-title>
          .
          <source>In Proceedings of AAAI</source>
          , volume
          <volume>94</volume>
          , pages
          <fpage>779</fpage>
          -
          <lpage>784</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Philipp</given-names>
            <surname>Koehn</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Statistical Significance Tests for Machine Translation Evaluation</article-title>
          .
          <source>In Proceedings of the Empirical Methods on Natural Language Processing</source>
          , pages
          <fpage>388</fpage>
          -
          <lpage>395</lpage>
          , Barcelona, Spain, July.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          Antonio L. Lagarda, Daniel Ortiz-Martínez, Vicent Alabau, and Francisco Casacuberta.
          <year>2015</year>
          .
          <article-title>Translating without In-domain Corpus: Machine Translation Post-Editing with Online Learning Techniques</article-title>
          .
          <source>Computer Speech &amp; Language</source>
          ,
          <volume>32</volume>
          (
          <issue>1</issue>
          ):
          <fpage>109</fpage>
          -
          <lpage>134</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Minh-Thang</given-names>
            <surname>Luong</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Stanford Neural Machine Translation Systems for Spoken Language Domains</article-title>
          .
          <source>In Proceedings of the International Workshop on Spoken Language Translation (IWSLT'15)</source>
          , pages
          <fpage>76</fpage>
          -
          <lpage>79</lpage>
          , Da Nang, Vietnam, December.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Minh-Thang</given-names>
            <surname>Luong</surname>
          </string-name>
          , Hieu Pham, and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Effective Approaches to Attention-based Neural Machine Translation</article-title>
          .
          <source>arXiv preprint arXiv:1508.04025</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Matteo</given-names>
            <surname>Negri</surname>
          </string-name>
          , Marco Turchi, Rajen Chatterjee, and
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Bertoldi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</source>
          , Miyazaki, Japan, May.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
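          <string-name>
            <given-names>Daniel</given-names>
            <surname>Ortiz-Martínez</surname>
          </string-name>
          and
          <string-name>
            <given-names>Francisco</given-names>
            <surname>Casacuberta</surname>
          </string-name>
          .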
          <year>2014</year>
          .
          <article-title>The New THOT Toolkit for Fully-Automatic and Interactive Statistical Machine Translation</article-title>
          .
          <source>In Proceedings of the 14th Annual Meeting of the European Association for Computational Linguistics</source>
          , pages
          <fpage>45</fpage>
          -
          <lpage>48</lpage>
          , Gothenburg, Sweden, April.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Rudolf</given-names>
            <surname>Rosa</surname>
          </string-name>
          , David Marecek, and
          <string-name>
            <given-names>Ales</given-names>
            <surname>Tamchyna</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Deepfix: Statistical Post-editing of Statistical Machine Translation Using Deep Syntactic Analysis</article-title>
          .
          <source>In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>172</fpage>
          -
          <lpage>179</lpage>
          , Sofia, Bulgaria, August.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Rico</given-names>
            <surname>Sennrich</surname>
          </string-name>
          , Barry Haddow, and
          <string-name>
            <given-names>Alexandra</given-names>
            <surname>Birch</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Neural Machine Translation of Rare Words with Subword Units</article-title>
          .
          <source>In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , pages
          <fpage>1715</fpage>
          -
          <lpage>1725</lpage>
          , Berlin, Germany, August.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Michel</given-names>
            <surname>Simard</surname>
          </string-name>
          and
          <string-name>
            <given-names>George</given-names>
            <surname>Foster</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>PEPr: Postedit Propagation Using Phrase-based Statistical Machine Translation</article-title>
          .
          <source>In Proceedings of the XIV Machine Translation Summit</source>
          , pages
          <fpage>191</fpage>
          -
          <lpage>198</lpage>
          , Nice, France, September.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Michel</given-names>
            <surname>Simard</surname>
          </string-name>
          , Cyril Goutte, and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Isabelle</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Statistical Phrase-Based Post-Editing</article-title>
          .
          <source>In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics</source>
          , pages
          <fpage>508</fpage>
          -
          <lpage>515</lpage>
          , Rochester, New York, April.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Amirhossein</given-names>
            <surname>Tebbifakhr</surname>
          </string-name>
          , Ruchit Agrawal, Rajen Chatterjee, Matteo Negri, and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Turchi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Multi-source Transformer with Combined Losses for Automatic Post-Editing</article-title>
          .
          <source>In Proceedings of the Third Conference on Machine Translation</source>
          , Brussels, Belgium, October.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>