<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ImageCLEF 2018: Semantic descriptors for Tuberculosis CT Image Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Abdelkader HAMADI</string-name>
          <email>abdelkader.hamadi@univ-mosta.dz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Djamel Eddine YAGOUB</string-name>
          <email>djamel.ed.y@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Abdelhamid Ibn Badis Mostaganem, Faculty of Exact Sciences and Computer Science, Mathematics and Computer Science Department, Mostaganem</institution>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this article, we present the methods used in our participation in two sub-tasks of the ImageCLEF 2018 Tuberculosis Task (the TBT and SVR tasks). We propose to extract a single semantic descriptor from the 3D CT image to describe each patient, rather than using all of the patient's slices as separate samples. In the TBT task, the resulting descriptors are exploited in a second learning stage to identify the type of tuberculosis among five given classes. In the SVR task, the same experimental design is used to predict the degree of severity of the disease. We reached a Kappa coefficient of about 0.0629 in the TBT sub-task, and our best run on SVR was ranked 12th out of 36 submissions and 5th out of 7 participating teams. We believe that our approach could give better results if applied properly.</p>
      </abstract>
      <kwd-group>
        <kwd>ImageCLEF Tuberculosis Task Deep Learning CT Image Tuberculosis CT Image Classification Tuberculosis Severity Scoring</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Tuberculosis is an infectious disease caused by the bacterium Mycobacterium
tuberculosis. With a high mortality rate worldwide, it remained one of the top
ten causes of death in 2015. Diagnosing the disease quickly and accurately is a
vital goal that would limit its spread and damage. One of its major problems is
that traditional tests are inaccurate or take too long to produce results. For
these reasons, researchers have taken an interest in diagnosing this disease,
particularly in the context of the international challenges ImageCLEF 2017 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and ImageCLEF 2018 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], where two tasks
(three tasks in ImageCLEF 2018) have been reserved for it. The first aims to
detect the multi-drug resistant (MDR) status of patients. The goal of the second
task is to identify the type of tuberculosis. A third task was introduced
in ImageCLEF 2018 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which consists in predicting the degree of severity of the
patient's case. In all three tasks, the predictions are based on 3D CT scans.
Algorithms involving deep learning have been tested to diagnose the presence or
absence of tuberculosis. The results obtained were interesting. However, they
must be improved to allow better control and effective diagnosis, helping
doctors to make decisions and choose the necessary treatments at the right time.
      </p>
      <p>We can summarize the objectives of the Tuberculosis task through the
following points:</p>
      <list list-type="bullet">
        <list-item><p>Helping medical doctors in the diagnosis of drug-resistant TB and in TB type identification through image processing techniques;</p></list-item>
        <list-item><p>Introducing work towards inexpensive and quick methods for early detection of the MDR status and TB types in patients;</p></list-item>
        <list-item><p>Quickly predicting the type of TB and its degree of severity to help doctors make quick decisions and give effective treatments.</p></list-item>
      </list>
      <p>In the following, we present the work carried out in the context of our
participation in two sub-tasks of the ImageCLEF 2018 Tuberculosis Task:
Tuberculosis Type classification (TBT) and Tuberculosis Severity Scoring (SVR).</p>
      <p>The remainder of this article is organized as follows. Section 2 describes the
two tasks in which we participated. In Section 3, we present our contribution
by detailing the system deployed to produce our submissions. Section 4 details
the experimental protocols followed to generate our predictions and analyzes
the results obtained. We conclude in the last section by presenting our
perspectives and future work.</p>
      <p>Participation in ImageCLEF 2018</p>
    </sec>
    <sec id="sec-2">
      <title>Tasks description</title>
      <p>In this paper, we focus on our participation in the TBT and the SVR sub-tasks
that we describe in the following sections.</p>
      <p>In both tasks, the data is provided as 3D CT scans. For some patients,
several 3D CT scans are given, while for others only one is provided. All the
CT images are stored in NIfTI file format with the .nii.gz file extension
(g-zipped .nii files). Along each of the three dimensions of the CT image, the
number of slices varies from about 50 to 400. Each slice has a size of about
512 x 512 pixels.</p>
      <p>A training collection is provided at the beginning of the task with its
ground truth (the labels of the samples). Participants prepare and train their
systems on this dataset. A test collection is provided at a later date.
Participants run their systems on it and return their predictions to the
organizers, who evaluate them to compare the performance of the systems.</p>
      <p>The TBT task consists of the automatic categorization of TB cases into 5 target
classes based on the CT scans of patients. The five types considered are:</p>
      <sec id="sec-2-1">
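        <list list-type="order">
          <list-item><p>Infiltrative</p></list-item>
          <list-item><p>Focal</p></list-item>
          <list-item><p>Tuberculoma</p></list-item>
          <list-item><p>Miliary</p></list-item>
          <list-item><p>Fibro-cavernous</p></list-item>
        </list>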
        <p>The results are evaluated using unweighted Cohen's Kappa and accuracy.</p>
        <p>The SVR task aims to predict the degree of severity of TB cases. Given a TB
patient, the main goal is to predict the severity score from the patient's 3D CT
scan. The degree of severity is modeled by 5 discrete values, from 1
("critical/very bad") to 5 ("very good"). The score is simplified so that values
1, 2 and 3 correspond to the "high severity" class, and values 4 and 5
correspond to "low severity".</p>
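        <p>The simplification of the five severity scores into two classes can be sketched as follows. This is only an illustration of the rule stated above; the function name and the returned strings are assumptions made for the example and are not taken from the task definition.</p>
        <preformat><![CDATA[
```python
def severity_class(score):
    """Map a 1-5 severity score to the binary class used in the SVR task."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"severity score must be in 1..5, got {score!r}")
    # Scores 1, 2 and 3 are "high severity"; scores 4 and 5 are "low severity".
    return "HIGH" if score <= 3 else "LOW"
```
]]></preformat>
        <p>For instance, a patient with score 3 falls in the high severity class, while a patient with score 4 falls in the low severity class.</p>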
        <p>The classification problem is evaluated using the area under the ROC curve
(AUC) computed from the probabilities provided by the participants. For the
regression problem, the root mean square error (RMSE) is used.</p>
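        <p>For readers unfamiliar with the measures named in this section, the following minimal sketch computes accuracy, unweighted Cohen's Kappa and RMSE from their standard definitions. It is not the organizers' evaluation script; the function names are hypothetical, and AUC is left out for brevity.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance, from the two marginal label distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: a single label on both sides (convention chosen here).
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```
]]></preformat>
        <p>A Kappa close to 0 means that the agreement with the ground truth is close to what chance alone would give, which helps to interpret low Kappa values alongside accuracy.</p>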
        <p>Our contribution</p>
        <p>
          We propose to extract semantic descriptors from 3D CT scans. We noticed that
participants in the ImageCLEF 2017 TBT task used each extracted slice as a
separate sample. Thus, hundreds of slices are treated as separate learning
samples even though they represent the same patient, which introduces a lot of
noise. In addition, each slice is assigned the label of the patient (the TB
type), even slices whose content carries no information for identifying the type
of TB. This introduces more noise. The majority of the participants [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] of
ImageCLEF 2017 highlighted this problem and its impact on the results.
        </p>
        <p>To overcome this problem, we believe that the simplest solution is to
produce a single descriptor for each patient. This constitutes the key idea of our
contribution.</p>
        <p>Our proposed system goes through three main stages:</p>
      </sec>
      <sec id="sec-2-2">
        <list list-type="order">
          <list-item><p>Input data pre-processing</p></list-item>
          <list-item><p>Feature extraction</p></list-item>
          <list-item><p>Learning a classification model</p></list-item>
        </list>
        <p>We detail each step in the following.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Input data pre-processing</title>
      <p>Recall that in both tasks, 3D CT scans are provided in compressed NIfTI
format. First, we decompress the files and extract the slices. We end up with
three sets of slices corresponding to the three dimensions of the 3D image.
For each dimension and each NIfTI image, we obtain between 50 and 400 slices
as JPEG images.</p>
      <p>The visual content of the images extracted along the different dimensions is
not similar, since the images of each dimension are taken from a different
viewing angle. We noticed in our experiments that the slices of the Y dimension
give better results than those of the two others (X and Z). However, the
following steps can be applied to the slices of any of the three dimensions.</p>
      <sec id="sec-3-4">
        <p>[Figure 1: Pre-processing pipeline. CT scans in NIfTI format (.nii.gz) are converted to JPG image slices along the three dimensions X, Y and Z; one dimension (Y) is selected and the slices are filtered to keep 60 selected slices per CT image / patient.]</p>
        <p>Moreover, not all slices necessarily contain information that is useful for
identifying the type of TB. It is therefore essential to filter the slices and
keep only those likely to be informative. In addition, since we want to extract
a single descriptor per patient, each patient must be represented by the same
number of slices. We found that there are usually at most 60 visually
informative slices. Since the slices are ordered, the 60 most informative ones
are usually at the center of the list. We therefore propose to keep the 60
middle slices. This is not optimal, but we opted for this choice to keep the
approach fully automatic. It could be improved by manual filtering with the
help of a human expert, preferably with medical skills in TB. Figure 1
summarizes the process.</p>
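        <p>Keeping the 60 middle slices of an ordered list can be sketched as below. The function name and the handling of scans with fewer than 60 slices are assumptions made for this example; the text does not say how such scans are treated.</p>
        <preformat><![CDATA[
```python
def select_middle_slices(slices, n_keep=60):
    """Return the n_keep central items of an ordered sequence of slices."""
    if n_keep <= 0:
        raise ValueError("n_keep must be positive")
    if len(slices) <= n_keep:
        # Assumption: a short scan is returned unchanged; padding it to a
        # fixed length, if needed, is left to the caller.
        return list(slices)
    start = (len(slices) - n_keep) // 2
    return list(slices[start:start + n_keep])
```
]]></preformat>
        <p>With 100 ordered slices, for example, this keeps the slices at positions 20 to 79 (counting from 0) and discards the 20 slices at each end.</p>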
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Features extraction</title>
      <p>
        After slice extraction and filtering, we propose to extract a single descriptor
per patient. Transfer learning is an interesting option in this context. The
results of SGEast [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and of other teams in the same
task of ImageCLEF 2017 proved the efficiency of this approach [
        <xref ref-type="bibr" rid="ref11 ref4">4, 11</xref>
        ]. Indeed,
SGEast opted for transfer learning, exploiting the output of a layer of a
ResNet-50 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] deep learner. However, this idea raises a problem of
descriptor size: SGEast considered one descriptor per slice and not per patient.
Since we want a single descriptor per patient, the information extracted from
each slice must not be very large. Therefore, we propose to describe each slice
by semantic information. This idea is inspired by the work presented in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>We therefore choose to exploit the probabilities predicted by a deep learner
trained on the set of slices. If K is the number of classes considered, this
information corresponds to the K probability values predicted for the K classes
(five probabilities for the five TB types in the TBT task, or for the five
severity degrees in the SVR task). We thus obtain K values for each slice.</p>
      <sec id="sec-4-10">
        <p>[Figure 2: Semantic feature extraction for one patient. A deep learner trained on slices and their labels (K classes) produces, for each of the 60 pre-processed slices of the patient's CT image, a probability score Pr-i,j for the i-th slice regarding the j-th class. The scores of class j over all slices form the semantic sub-descriptor D-j, and the final semantic descriptor for the patient is the concatenation of all sub-descriptors D-1, ..., D-K.]</p>
        <p>K sub-descriptors are then generated: D1, D2, D3, D4, ..., DK. Each
sub-descriptor Di contains the probabilities predicted for class i for all the
slices of the patient. The final semantic descriptor is constructed by
concatenating the K sub-descriptors. Figure 2 details the process of semantic
feature extraction for one patient.</p>
        <p>In the third step, learning a classification model, we exploit the semantic
descriptors of patients obtained in the previous step. Any supervised
classification approach can be applied, as shown in Figure 3.</p>
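        <p>The construction of the per-patient semantic descriptor (K sub-descriptors, one per class, concatenated) can be sketched as follows. The input is assumed to be a list with one row of K predicted probabilities per slice; the function name and this data layout are choices made for the example.</p>
        <preformat><![CDATA[
```python
def build_patient_descriptor(slice_probs):
    """Concatenate the K per-class sub-descriptors of one patient.

    slice_probs[i][j] is the probability predicted for slice i and class j.
    The result is D_1 + D_2 + ... + D_K, where D_j lists the probabilities
    of class j over all slices, in slice order.
    """
    if not slice_probs:
        raise ValueError("at least one slice is required")
    k = len(slice_probs[0])
    if any(len(row) != k for row in slice_probs):
        raise ValueError("every slice must have the same number of classes")
    descriptor = []
    for j in range(k):
        # Sub-descriptor D_j: class-j probability of every slice.
        descriptor.extend(row[j] for row in slice_probs)
    return descriptor
```
]]></preformat>
        <p>With 60 slices and K = 5 classes, the descriptor has 300 values per patient, which is small enough to feed to a standard supervised learner.</p>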
      </sec>
      <sec id="sec-4-17">
        <p>[Figure 3: Learning a classification model. Semantic feature extraction is applied to the CT images of the train corpus, and the resulting semantic descriptors, together with their labels, are used to train a supervised learner. The learned model then takes the semantic descriptor of a test patient's CT image and outputs the probabilities Pr-1, ..., Pr-k of the classes C-1, ..., C-k, from which one of the K classes is predicted.]</p>
        <p>We describe in the following sections our runs submitted to the TBT and SVR
tasks.</p>
        <p>We implemented the semantic descriptor approach described in Section 3, using
the following tools:</p>
        <list list-type="bullet">
          <list-item><p>the Caffe framework [<xref ref-type="bibr" rid="ref10">10</xref>] for deep learning;</p></list-item>
          <list-item><p>Weka [<xref ref-type="bibr" rid="ref6">6</xref>] for testing several learning and classification algorithms;</p></list-item>
          <list-item><p>med2image [<xref ref-type="bibr" rid="ref1">1</xref>] for the conversion of NIfTI medical images to the classic JPEG format.</p></list-item>
        </list>
        <p>We chose to use the slices of the Y dimension because our experiments showed
that they are more suitable than those of the two other dimensions and gave
better results.</p>
        <p>
          For descriptor extraction, our approach consists in training a deep model to
generate the semantic information. Unfortunately, we had problems with the
machines used for training our deep learner and, due to lack of time, we could
not complete the training. As an alternative, we deployed the same model as the
one proposed by the SGEast team [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] at the ImageCLEF 2017
TBT task. The model is publicly available [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. It is based on a
ResNet-50 [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and obtained the best results at the TBT task of the 2017
edition. We therefore exploited the outputs of the last layer (named prob) of
the ResNet-50, which correspond to the probabilities of the 5 considered classes.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>TBT task</title>
      <p>Dataset: The dataset used in the TBT task includes chest CT scans of TB patients
along with the TB type. Some patients have more than one scan. All scans
belonging to the same patient present the same TB type. Table 1 summarizes
the distribution of CT scans according to the five types of TB considered.</p>
      <p>Experimental protocol: We used the training collection provided by the
organizers and split it into two sub-collections: 80% for training and 20% as a
validation set. All our runs exploit the semantic descriptors generated as
previously described. We tested several learners in the classification step and
finally submitted three main runs. The other submissions are variants of these
three runs or are generated by fusing some of them:</p>
      <list list-type="bullet">
        <list-item><p>Run 1 (TBT mostaganemFSEI run1): random forest as the supervised classifier. We tuned two parameters: the number of iterations performed and the number of randomly selected features;</p></list-item>
        <list-item><p>Run 2 (TBT mostaganemFSEI run2): bagging of a set of random forest learners. We tuned the number of learners for the bagging and the same two random forest parameters as in Run 1;</p></list-item>
        <list-item><p>Run 4 (TBT mostaganemFSEI run4): hierarchical classification. We organized the five classes into a hierarchical structure as described in Figure 4. We created two new virtual classes V1 and V2: V1 groups the two classes Type 1 and Type 2, and V2 contains the classes Type 3, Type 4 and Type 5. We reorganized our collections in order to perform a classification on two different levels. In the first level, we classify the samples into the two virtual classes V1 and V2. In the second level, we classify each sample among the classes belonging to the virtual class predicted in the first level. In both classification stages we used a random forest learner, tuning its two parameters as described for Run 1.</p></list-item>
      </list>
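      <p>The two-level scheme of Run 4 can be sketched as follows, assuming the grouping V1 = {Type 1, Type 2} and V2 = {Type 3, Type 4, Type 5} shown in the hierarchy figure. The trained classifiers are represented by plain callables, so any learner (a random forest, for instance) could be plugged in; all names below are hypothetical.</p>
      <preformat><![CDATA[
```python
# Assumed grouping of the five TB types into two virtual classes.
GROUPS = {
    "V1": ("Type1", "Type2"),
    "V2": ("Type3", "Type4", "Type5"),
}


def to_virtual_label(label, groups=GROUPS):
    """Return the virtual class that contains a TB type label."""
    for virtual, members in groups.items():
        if label in members:
            return virtual
    raise ValueError(f"unknown label: {label!r}")


def split_by_level(samples, groups=GROUPS):
    """Reorganize (descriptor, label) pairs into per-level training sets.

    Returns the first-level set, labelled with virtual classes, and one
    second-level set per virtual class, restricted to its own TB types.
    """
    first_level = [(x, to_virtual_label(y, groups)) for x, y in samples]
    second_level = {
        virtual: [(x, y) for x, y in samples if y in members]
        for virtual, members in groups.items()
    }
    return first_level, second_level


def hierarchical_predict(descriptor, first_model, second_models):
    """Predict the virtual class first, then the TB type inside it."""
    virtual = first_model(descriptor)
    return second_models[virtual](descriptor)
```
]]></preformat>
      <p>A consequence of this design is that an error at the first level cannot be corrected at the second level, since each second-level model only knows the types of its own virtual class.</p>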
      <p>[Figure 4: Hierarchy of classes used for Run 4. First level: V-1 and V-2. Second level: Type 1 and Type 2 under V-1; Type 3, Type 4 and Type 5 under V-2.]</p>
      <p>Results: Table 2 shows the results obtained by our runs on the validation collection.</p>
      <p>As the validation results show, Run 4 was our best submission; it also
obtained the best results on the test collection compared to Run 1 and Run 2.</p>
      <p>[Figures: results of the runs submitted to the TBT task in terms of Kappa coefficient and accuracy, respectively. Kappa plot: axes RUN and KAPPA, mean = 0.08.]</p>
      <p>
        Although our submissions are not well ranked compared to those at the top of
the list, we can notice that several runs belong to the same teams that had
good results, and they probably do not differ much from one another. We also
recall that our semantic descriptors were extracted using a model that was not
very well trained. In fact, we met problems with our machines during the
training of our own deep learner. Moreover, although the model deployed by
SGEast obtained the best results at the ImageCLEF 2017 TBT task, we were not
able to reproduce exactly the pre-processing performed by this team as
described in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. We believe that our semantic descriptors could
give better results if they were extracted from a better adapted and
well-trained deep model.
      </p>
      <p>[Figure: TBT task, accuracy of the submitted runs (axes: RUN, ACCURACY; mean = 0.30).]</p>
      <p>Dataset: The dataset used in the SVR task includes chest CT scans of TB patients
along with the corresponding severity score (1 to 5). Scores from 1 to 3
correspond to "High" severity, whereas scores 4 and 5 refer to "Low" severity.
Table 4 summarizes the distribution of CT scans according to the two severity
classes.</p>
      <p>Experimental protocol: In a first step, we generated the semantic descriptors
following the approach described in Section 3. For the prediction of TB
severity scores, we treated the problem as a classification problem, using two
approaches:</p>
      <list list-type="order">
        <list-item><p>Multi-class classification: we considered the five scores as separate classes. We then tested several classifiers and selected the two that were most effective among those tested: random forest, and bagging of a set of random forest learners.</p></list-item>
        <list-item><p>Hierarchical classification: we organized our data in order to carry out a two-level hierarchical classification following the hierarchy described in Figure 7. In the first level, the samples are classified into the "High" or "Low" class. In the second level, the samples are reclassified into the child classes of the class predicted in the first level.</p></list-item>
      </list>
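      <p>Treating severity scoring as a five-class problem leaves open how the class probabilities are turned into the two quantities evaluated in the SVR task: a probability of high severity (for AUC) and a score (for RMSE). The text does not describe this conversion. The sketch below shows one plausible option, stated purely as an assumption: the high-severity probability is the sum of the probabilities of scores 1 to 3, and the predicted score is the most probable one. All names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math

SCORES = (1, 2, 3, 4, 5)


def _check(probs):
    """Validate a vector holding one probability per severity score."""
    if len(probs) != len(SCORES):
        raise ValueError("expected one probability per severity score")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must sum to 1")


def high_severity_probability(probs):
    """Assumed rule: P(high) = P(score 1) + P(score 2) + P(score 3)."""
    _check(probs)
    return sum(probs[:3])


def most_probable_score(probs):
    """Assumed rule: predict the score with the largest probability."""
    _check(probs)
    best = max(range(len(SCORES)), key=lambda i: probs[i])
    return SCORES[best]
```
]]></preformat>
      <p>For the probabilities (0.1, 0.2, 0.3, 0.25, 0.15), this rule gives a high-severity probability of about 0.6 and a predicted score of 3.</p>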
      <p>[Figure 7: Hierarchy of severity classes. First level: High and Low. Second level: scores 1, 2 and 3 under High; scores 4 and 5 under Low.]</p>
      <p>We submitted the following runs:</p>
      <list list-type="order">
        <list-item><p>Run 1 (SVR mostaganemFSEI run1): multi-class model using random forest as the classifier. We tuned two parameters: the number of iterations performed and the number of randomly chosen features;</p></list-item>
        <list-item><p>Run 2 (SVR mostaganemFSEI run2): multi-class model using a bagging of a set of random forest learners with sub-sampling of the main training collection. We created two sub-collections by balancing the number of samples for the 5 classes, then merged the results obtained on the two sub-collections;</p></list-item>
        <list-item><p>Run 3 (SVR mostaganemFSEI run3): hierarchical classification using a bagging of a set of random forest learners at each level of the hierarchical classification process;</p></list-item>
        <list-item><p>Run 4 (SVR mostaganemFSEI run4): fusion of Run 1 and Run 2;</p></list-item>
        <list-item><p>Run 6 (SVR mostaganemFSEI run6): fusion of Run 3 and Run 1.</p></list-item>
      </list>
      <p>Results: Table 5 shows the results obtained by our runs on the validation collection.</p>
      <p>[Table 5: validation results per run. Rows: Run 1 (SVR mostaganemFSEI run1), Run 2 (SVR mostaganemFSEI run2), Run 3 (SVR mostaganemFSEI run3), Run 4 (SVR mostaganemFSEI run4), Run 6 (SVR mostaganemFSEI run6).]</p>
      <p>Run 3 obtained our best results in terms of RMSE, both on the validation
collection and on the test data. In terms of AUC, however, Run 2 seems to be
more effective.</p>
      <p>[Figures: results of the runs submitted to the SVR task in terms of RMSE and AUC values, respectively. RMSE plot: axes RUN and RMSE, mean = 1.064.]</p>
      <p>Our best run is ranked 12th out of 36 submissions. However, the difference
between the performances of the 12 best runs is not very significant. We recall
that our best result was achieved by a hierarchical classification approach
using a bagging of random forest learners at each level of the hierarchy. We
believe that our approach could give better results with a well-trained deep
model in the semantic feature extraction step.</p>
      <p>[Figure: SVR task, AUC of the submitted runs (axes: RUN, AUC; mean = 0.64).]</p>
      <p>Conclusion and future work</p>
      <p>We have described in this article our contributions to the TBT and SVR tasks
of ImageCLEF Tuberculosis 2018. We proposed an approach that consists in
extracting a single semantic descriptor for each CT image (that is, for each
patient) instead of considering all the slices as separate samples.
Unfortunately, we could not complete the training of our own deep learner.
Nevertheless, the results obtained suggest that this approach could be much more
effective and give more interesting results if applied properly.</p>
      <p>
        As perspectives, we plan to adopt enrichment strategies and learning-sample
selection. Indeed, one characteristic of the problem addressed in the SVR and
TBT tasks is the nature of the provided data collections, which are small and
noisy because of the presence of many slices that do not contain useful
information. The bagging and sub-sampling strategies adopted in our experiments
confirmed this. In addition, we noticed during the sub-sampling of our data
that deleting or adding some samples had an impact on the results. Moreover,
filtering slices effectively to keep only those that are truly informative is a
key idea that could further improve system performance, as reported by several
participating teams [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Furthermore, we noticed in our
experiments that the precision achieved differs from one class to another:
some classes are more difficult to identify than others. This is also an
interesting direction to study.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. med2image: https://github.com/fnndsc/med2image. Last checked: 30/05/2018.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <article-title>SGEast model for ImageCLEF 2017 tuberculosis task</article-title>
          : https://github.com/maizesix92/imageclef2017 tb sgeast. Last checked: 30/05/2018.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalinovsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of the imageclef 2017 tuberculosis task - predicting tuberculosis type and drug resistances</article-title>
          .
          <source>In: Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum</source>
          , Dublin, Ireland,
          <source>September 11-14</source>
          ,
          <year>2017</year>
          . (
          <year>2017</year>
          ), http://ceur-ws.org/Vol1866/invited paper 1.pdf
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dicente Cid</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalinovsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEFtuberculosis 2017 - predicting tuberculosis type and drug resistances</article-title>
          .
          <source>In: CLEF2017 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;, Dublin, Ireland (September 11-14
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dicente Cid</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEFtuberculosis 2018 - detecting multi-drug resistance, classifying tuberculosis type, and assessing severity score</article-title>
          .
          <source>In: CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;, Avignon, France (September 10-14
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfahringer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutemann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Witten</surname>
            ,
            <given-names>I.H.</given-names>
          </string-name>
          :
          <article-title>The WEKA data mining software: an update</article-title>
          .
          <source>SIGKDD Explorations</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>10</fpage>
          –
          <lpage>18</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hamadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Extended conceptual feedback for semantic multimedia indexing</article-title>
          .
          <source>Multimedia Tools Appl</source>
          .
          <volume>74</volume>
          (
          <issue>4</issue>
          ),
          <fpage>1225</fpage>
          –
          <lpage>1248</lpage>
          (
          <year>2015</year>
          ). https://doi.org/10.1007/s11042-014-1937-y
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Deep residual learning for image recognition</article-title>
          .
          <source>arXiv preprint arXiv:1512.03385</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Villegas</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andrearczyk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farri</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lungren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riegler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEF 2018: Challenges, datasets and evaluation</article-title>
          .
          <source>In: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Ninth International Conference of the CLEF Association (CLEF 2018)</source>
          , LNCS Lecture Notes in Computer Science, Springer, Avignon, France (September 10-14
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jia</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shelhamer</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Donahue</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karayev</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Long</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Girshick</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guadarrama</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Darrell</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Caffe: Convolutional architecture for fast feature embedding</article-title>
          .
          <source>arXiv preprint arXiv:1408.5093</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chong</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>Y.X.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Binder</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>ImageCLEF 2017: ImageCLEF tuberculosis task - the SGEast submission</article-title>
          .
          <source>In: Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum</source>
          , Dublin, Ireland, September 11-14, 2017 (
          <year>2017</year>
          ), http://ceur-ws.org/Vol-1866/paper_130.pdf
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>