<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the VQA-Med Task at ImageCLEF 2020: Visual Question Answering and Generation in the Medical Domain</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Asma Ben Abacha</string-name>
          <email>asma.benabacha@nih.gov</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vivek V. Datla</string-name>
          <email>vivek.datla@philips.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sadid A. Hasan</string-name>
          <email>sadidhasan@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dina Demner-Fushman</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Henning Muller</string-name>
          <email>henning.mueller@hevs.ch</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CVS Health</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Lister Hill Center, National Library of Medicine</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Philips Research Cambridge</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Applied Sciences Western Switzerland</institution>
          ,
          <addr-line>Sierre</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents an overview of the Medical Visual Question Answering (VQA-Med) task at ImageCLEF 2020. This third edition of VQA-Med included two tasks: (i) Visual Question Answering (VQA), where participants were tasked with answering abnormality questions from the visual content of radiology images and (ii) Visual Question Generation (VQG), consisting of generating relevant questions about radiology images based on their visual content. In VQA-Med 2020, 11 teams participated in at least one of the two tasks and submitted a total of 62 runs. The best team achieved a BLEU score of 0.542 in the VQA task and 0.348 in the VQG task.</p>
      </abstract>
      <kwd-group>
        <kwd>Visual Question Answering</kwd>
        <kwd>Visual Question Generation</kwd>
        <kwd>Data Creation</kwd>
        <kwd>Radiology Images</kwd>
        <kwd>Medical Questions and Answers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>With the increasing interest in artificial intelligence technologies to support
clinical decision making and improve patient engagement, opportunities to generate
and leverage algorithms for automated medical image interpretation are being
explored at a faster pace. The clinicians' confidence in interpreting complex
medical images can be enhanced by a "second opinion" provided by an automated
system. Also, since patients may now access structured and unstructured data
related to their health via patient portals, such access motivates the need to help
them better understand their conditions regarding their available data, including
medical images.</p>
      <p>
        To offer more training data and evaluation benchmarks, we organized the first
visual question answering (VQA) task in the medical domain in 2018 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and
continued the task in 2019 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as part of the ImageCLEF initiatives [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Following
the strong engagement from the research community in both editions of VQA
in the medical domain (VQA-Med) and the ongoing interest from both the
computer vision and the medical informatics communities, we continued the task
this year (VQA-Med 2020) within the scope of ImageCLEF-2020 initiatives [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
by putting an enhanced focus on answering questions about abnormalities from
the visual content of associated radiology images. Furthermore, we introduced
an additional task this year, visual question generation (VQG), consisting of
generating relevant questions about radiology images.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Task Description</title>
      <p>For the visual question answering task, similar to 2019, given a radiology medical
image accompanied by a clinically relevant question, participating systems were
tasked with answering the question based on the visual image content. In
VQA-Med 2020, we specifically focused on questions about abnormality (e.g., "what
is most alarming about this ultrasound image?"), which can be answered from
the image content without requiring additional medical knowledge or
domain-specific inference. Additionally, the visual question generation (VQG) task was
introduced for the first time in this third edition of the VQA-Med challenge.
This task required participants to generate relevant natural language questions
about radiology images using their visual content.</p>
    </sec>
    <sec id="sec-3">
      <title>Data Creation</title>
      <sec id="sec-3-1">
        <title>VQA Data</title>
        <p>For the visual question answering task, we automatically constructed the
training, validation, and test sets by: (i) applying several filters to select relevant
images and associated annotations, and (ii) creating patterns to generate the
questions and their answers. We selected relevant medical images from the
MedPix<sup>5</sup> database with filters based on their captions, localities, and diagnosis
methods. We selected only the cases where the diagnosis was made based on the
image. Examples of the selected diagnosis methods include: CT/MRI imaging,
angiography, characteristic imaging appearance, radiographs, imaging features,
ultrasound, and diagnostic radiology.</p>
        <p>Finally, we selected the list of abnormalities to be used to create the
question-answer pairs. The final list covers 330 medical problems; each problem occurs at
least 10 times in the created VQA data.</p>
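        <p>The selection and generation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline that was actually used: the record fields (<monospace>image_id</monospace>, <monospace>modality</monospace>, <monospace>diagnosis</monospace>, <monospace>diagnosis_method</monospace>), the function name, and the way patterns are assigned to images are all hypothetical. The question patterns are modelled on the example questions shown for the test set, and the filter values are the diagnosis methods named in the text.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter

# Diagnosis methods named in the text; only cases diagnosed from the image are kept.
IMAGE_BASED_METHODS = {
    "ct/mri imaging", "angiography", "characteristic imaging appearance",
    "radiographs", "imaging features", "ultrasound", "diagnostic radiology",
}

# Illustrative question patterns (assumption: modelled on the published examples).
PATTERNS = [
    "what abnormality is seen in the image?",
    "what is abnormal in the {modality}?",
    "what is the primary abnormality in this image?",
]


def build_qa_pairs(cases, min_frequency=10):
    """Return (image_id, question, answer) triples from hypothetical case records."""
    # Step 1: keep only cases whose diagnosis was made from the image.
    kept = [c for c in cases if c["diagnosis_method"].lower() in IMAGE_BASED_METHODS]
    # Step 2: count how often each abnormality occurs among the kept cases.
    counts = Counter(c["diagnosis"].lower() for c in kept)
    pairs = []
    for index, case in enumerate(kept):
        answer = case["diagnosis"].lower()
        # Step 3: drop abnormalities that occur fewer than min_frequency times.
        if counts[answer] < min_frequency:
            continue
        # Step 4: fill a pattern; cycling through patterns is an arbitrary choice.
        pattern = PATTERNS[index % len(PATTERNS)]
        question = pattern.format(modality=case["modality"].lower())
        pairs.append((case["image_id"], question, answer))
    return pairs
```
]]></preformat>
        <p>Under these assumptions, the answer is simply the lower-cased diagnosis, and the frequency threshold mirrors the requirement that each medical problem occurs at least 10 times.</p>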
        <p>Examples of medical problems (and their frequency) in the VQA data:</p>
        <list list-type="bullet">
          <list-item><p>pulmonary embolism (114),</p></list-item>
          <list-item><p>acute appendicitis (109),</p></list-item>
          <list-item><p>angiomyolipoma (68),</p></list-item>
          <list-item><p>osteochondroma (63),</p></list-item>
          <list-item><p>adenocarcinoma of the lung (60),</p></list-item>
          <list-item><p>sarcoidosis (58).</p></list-item>
        </list>
        <p>
          <fn><label>5</label><p>https://medpix.nlm.nih.gov/</p></fn>
        </p>
        <p>The VQA training set includes 4,000 radiology images with 4,000
Question-Answer (QA) pairs. The validation set consists of 500 radiology images with
500 QA pairs. The test set includes 500 radiology images and 500 questions. To
further ensure the quality of the data, the test set was manually validated by
a medical doctor. Figure 1 presents examples from the VQA-Med-2020 test set.
The participants were also encouraged to utilize the VQA-Med-2019 dataset as
additional training data.</p>
      </sec>
      <sec id="sec-3-2">
        <title>VQG Data</title>
        <p>For the visual question generation task, we automatically constructed the
training, validation, and test sets in a similar fashion by using a separate collection
of radiology images and their associated captions. We first generated questions
semi-automatically from the image captions by using a rule-based
sentence-to-question generation approach<sup>6</sup>; three annotators then manually curated
the list of question-answer pairs by removing or editing noise related to
grammatical inconsistencies. The final curated corpus for the VQG task
comprised 780 radiology images with 2,156 associated questions (and
answers) for training, 141 radiology images with 164 questions for validation, and
80 radiology images for testing.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Submitted Runs</title>
      <p>Out of 47 online registrations, 30 participants submitted signed end user
agreement forms. Finally, 11 groups submitted a total of 49 successful runs for the
VQA task<sup>7</sup> (cf. Figure 2), while 3 groups submitted a total of 13 successful runs
for the VQG task<sup>8</sup>, indicating a notable interest in the VQA-Med 2020
challenge. Table 1 and Table 2 give an overview of all participants and the number
of submitted runs (please note that only 5 runs were allowed per team).</p>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>
        Similar to the evaluation setup of the VQA-Med 2019 challenge [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the
evaluation of the participant systems for the VQA task in the VQA-Med 2020 challenge
is also conducted based on two primary metrics: accuracy and BLEU. We used
an adapted version of accuracy from the general domain VQA<sup>9</sup> task that strictly
considers exact matching of a participant provided answer and the ground truth
answer. To compensate for the strictness of the accuracy metric, BLEU [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is
used to capture the word overlap-based similarity between a system-generated
answer and the ground truth answer. The overall methodology and resources for
the BLEU metric are essentially similar to last year's VQA task [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The BLEU
metric is also used to evaluate the submissions for the VQG task, where we
essentially compute the word overlap-based average similarity score between the
system-generated questions and the ground truth question for each given test
image. The overall results of the participating systems are presented in Table 3
and Table 4 in a descending order of the accuracy and average BLEU scores
respectively (the higher the better).
      </p>
      <p>
        <fn><label>6</label><p>http://www.cs.cmu.edu/~ark/mheilman/questions/</p></fn>
        <fn><label>7</label><p>https://www.aicrowd.com/challenges/imageclef-2020-vqa-med</p></fn>
        <fn><label>8</label><p>https://www.aicrowd.com/challenges/imageclef-2020-vqa-med-vqg</p></fn>
      </p>
      <fig>
        <label>Figure 1</label>
        <caption>
          <p>Examples from the VQA-Med 2020 test set.
(a) Q: what abnormality is seen in the image? A: ovarian torsion.
(b) Q: what is abnormal in the ct scan? A: partial anomalous pulmonary venous return.
(c) Q: what is the primary abnormality in this image? A: necrotizing enterocolitis.
(d) Q: is the x-ray normal? A: no.
(e) Q: what abnormality is seen in the image? A: ollier's disease, enchondromatosis.
(f) Q: what is abnormal in the ultrasound? A: cirrhosis of the liver.
(g) Q: what is abnormal in the mammograph? A: infiltrating ductal carcinoma.
(h) Q: what is the primary abnormality in this image? A: dural fistula, avf.</p>
        </caption>
      </fig>
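      <p>As an illustration of the two metrics, the sketch below computes exact-match accuracy and a simplified sentence-level BLEU score for answer strings. It is not the official evaluation script: the lower-casing and whitespace tokenization, the function names, and the capping of the n-gram order at the shorter sentence length (so that one- or two-word answers are not scored zero) are assumptions made here for the sake of a short, self-contained example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def normalize(text):
    # Lower-case and split on whitespace (a simplifying assumption;
    # the official evaluation may preprocess answers differently).
    return text.lower().split()


def exact_match_accuracy(predictions, references):
    # Fraction of predictions identical to the ground truth after normalization.
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def ngrams(tokens, n):
    # Multiset of all n-grams of length n in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    cand, ref = normalize(candidate), normalize(reference)
    if not cand or not ref:
        return 0.0
    # Assumption of this sketch: cap the n-gram order at the shorter length.
    order = min(max_n, len(cand), len(ref))
    log_precision = 0.0
    for n in range(1, order + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped counts: a candidate n-gram is credited at most as often
        # as it appears in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precision += math.log(clipped / sum(cand_counts.values())) / order
    # Brevity penalty: equals 1 unless the candidate is shorter than the reference.
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity_penalty * math.exp(log_precision)


def average_bleu(predictions, references):
    # Mean sentence-level BLEU over all prediction/reference pairs.
    scores = [sentence_bleu(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```
]]></preformat>
      <p>With this sketch, a partially correct answer such as "cirrhosis" for the reference "cirrhosis of the liver" counts as wrong under exact-match accuracy but still receives a non-zero BLEU score, which is the behaviour the second metric is meant to capture.</p>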
    </sec>
    <sec id="sec-6">
      <title>Discussion</title>
      <p>
        Similar to the last two years, participants continued to use state-of-the-art deep
learning techniques to build their VQA-Med systems for both the VQA and VQG
tasks [
        <xref ref-type="bibr" rid="ref2 ref4">4, 2</xref>
        ]. In particular, most systems leveraged encoder-decoder architectures
with, e.g., deep convolutional neural networks (CNNs) like VGGNet or ResNet.
A variety of pooling strategies, e.g., global average pooling, were explored to
encode image features, and transformer-based architectures like BERT or recurrent
neural networks (RNNs) were used to extract question features (for the VQA task).
Various types of attention mechanisms were also used, coupled with different pooling
strategies such as multimodal factorized bilinear (MFB) pooling or multimodal
factorized high-order (MFH) pooling, to combine multimodal features,
followed by bilinear transformations to finally predict the possible answers in the
VQA task and generate possible question words in the VQG task. Additionally,
the top-performing systems first classified the questions into two types, yes/no
and abnormality, and then added another multi-class classification framework for
abnormality-related question answering, while using the same backbone
architecture and additional training data, leading to better results.
      </p>
      <p>
        <fn><label>9</label><p>https://visualqa.org/evaluation.html</p></fn>
      </p>
      <p>
        Analyses of the results in Table 3 suggest that, in general, participating
systems performed well on the VQA task and achieved better accuracy
than last year's results for answering abnormality-related questions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
They obtained slightly lower BLEU scores, as we focused only on abnormality
questions this year, which are generally more complex than the modality, plane, or organ
category questions given last year. Overall, the VQA task results obtained
this year suggest that the provided dataset is more robust than last year's,
owing to the enhanced focus on abnormality-related questions for corpus
creation. For the VQG task, results in Table 4 suggest that the task was
comparatively more challenging than the VQA task, as the systems achieved lower
BLEU scores. As BLEU is not an ideal metric for semantically comparing the
generated questions with the ground-truth questions, an embedding-based
similarity metric could be explored in future editions of this task.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>
        In this paper, we presented the VQA-Med 2020 tasks, datasets, and official
results. We created new datasets for the visual question generation and visual
question answering tasks with a focus on questions about abnormality. In the
VQA task, the best team achieved a BLEU score of 0.542 and an accuracy of 0.496. The
VQG task was more challenging, with a best BLEU score of 0.348. In future
editions of VQA-Med, we will focus on expanding the VQG dataset with more
images and questions [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to enable effective development of deep learning models
and on designing new evaluation metrics for both tasks.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Al-Sadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Theiabat</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Ayyoub</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The inception team at vqa-med 2020: Pretrained vgg with data augmentation for medical vqa and vqg</article-title>
          .
          <source>In: CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki,
          <source>Greece (September</source>
          <volume>22</volume>
          -25
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Ben</given-names>
            <surname>Abacha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Datla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.V.</given-names>
            ,
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          , Muller, H.:
          <article-title>Vqa-med: Overview of the medical visual question answering task at imageclef 2019</article-title>
          . In: Working Notes of CLEF 2019 -
          <article-title>Conference and Labs of the Evaluation Forum</article-title>
          , Lugano, Switzerland, September 9-
          <issue>12</issue>
          ,
          <year>2019</year>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>2380</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Hcp-mic at vqa-med 2020: Effective visual representation for medical visual question answering</article-title>
          .
          <source>In: CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki,
          <source>Greece (September</source>
          <volume>22</volume>
          -25
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farri</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Lungren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Overview of imageclef 2018 medical domain visual question answering task</article-title>
          .
          <source>In: Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum</source>
          , Avignon, France,
          <source>September 10-14</source>
          ,
          <year>2018</year>
          . (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Peteri</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Ben</given-names>
            <surname>Abacha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Datla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Kozlovski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.D.</given-names>
            ,
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Pelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.M.</given-names>
            ,
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.G.S.</given-names>
            ,
            <surname>Ninh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.T.</given-names>
            ,
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.K.</given-names>
            ,
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Piras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Riegler</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          , Halvorsen,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.T.</given-names>
            ,
            <surname>Lux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Gurrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Dang-Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.T.</given-names>
            ,
            <surname>Chamberlain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Campello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Fichou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Berari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Brie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Dogariu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Stefan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.D.</given-names>
            ,
            <surname>Constantin</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.G.</surname>
          </string-name>
          :
          <article-title>Overview of the ImageCLEF 2020: Multimedia retrieval in lifelogging, medical, nature, and internet applications</article-title>
          .
          <source>In: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the 11th International Conference of the CLEF Association (CLEF</source>
          <year>2020</year>
          ), vol.
          <volume>12260</volume>
          .
          <source>LNCS Lecture Notes in Computer Science</source>
          , Springer, Thessaloniki,
          <source>Greece (September</source>
          <volume>22</volume>
          - 25
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Peteri</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klimuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarasau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben</surname>
            <given-names>Abacha</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Datla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Dang-Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.T.</given-names>
            ,
            <surname>Piras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Riegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.T.</given-names>
            ,
            <surname>Lux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Gurrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Pelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.M.</given-names>
            ,
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.G.S.</given-names>
            ,
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Kavallieratou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>del Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.R.</given-names>
            , Rodríguez, C.C.,
            <surname>Vasillopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Karampidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Chamberlain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Campello</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>ImageCLEF 2019: Multimedia retrieval in medicine, lifelogging, security and nature</article-title>
          . In:
          <article-title>Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the 10th International Conference of the CLEF Association (CLEF</source>
          <year>2019</year>
          ),
          <source>LNCS Lecture Notes in Computer Science</source>
          , Springer, Lugano,
          <source>Switzerland (September 9-12</source>
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harada</surname>
          </string-name>
          , T.:
          <article-title>bumjun jung at vqa-med 2020: Vqa model based on feature extraction and multi-modal feature fusion</article-title>
          .
          <source>In: CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki,
          <source>Greece (September</source>
          <volume>22</volume>
          -25
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liao</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van den Hengel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verjans</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering</article-title>
          . In:
          <source>CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki, Greece (September 22-25,
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Shengyan at VQA-Med 2020: An encoder-decoder model for medical domain visual question answering task</article-title>
          . In:
          <source>CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki, Greece (September 22-25,
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Papineni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roukos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ward</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>W.J.</given-names>
          </string-name>
          :
          <article-title>BLEU: a method for automatic evaluation of machine translation</article-title>
          . In:
          <source>Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>311</fpage>
          –
          <lpage>318</lpage>
          . Association for Computational Linguistics (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Sarrouti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>NLM at VQA-Med 2020: Visual question answering and generation in the medical domain</article-title>
          . In:
          <source>CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki, Greece (September 22-25,
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Sarrouti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben Abacha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Visual question generation from radiology images</article-title>
          . In:
          <source>Proceedings of the First Workshop on Advances in Language and Vision Research (ALVR)</source>
          . Association for Computational Linguistics, Seattle, Washington (July
          <year>2020</year>
          ), https://alvr-workshop.github.io/proceedings/ALVR_2020_15_Paper.pdf
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Umada</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aono</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>kdevqa at VQA-Med 2020: focusing on GLU-based classification</article-title>
          . In:
          <source>CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki, Greece (September 22-25,
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Verma</surname>
            ,
            <given-names>H.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          :
          <article-title>Harendrakv at VQA-Med 2020: Sequential VQA with attention for medical visual question answering</article-title>
          . In:
          <source>CLEF 2020 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, Thessaloniki, Greece (September 22-25,
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>