=Paper=
{{Paper
|id=Vol-2748/Paper54
|storemode=property
|title=Pattern-based Classification Using Entropy Coding For MRI Data Classification
|pdfUrl=https://ceur-ws.org/Vol-2748/IAM2020_paper_54.pdf
|volume=Vol-2748
|authors=Nadjet Bouchaour,Smaine Mazouzi
|dblpUrl=https://dblp.org/rec/conf/iam/BouchaourM20
}}
==Pattern-based Classification Using Entropy Coding For MRI Data Classification==
<pdf width="1500px">https://ceur-ws.org/Vol-2748/IAM2020_paper_54.pdf</pdf>
<pre>
Pattern-based Classification Using Entropy Coding
For MRI Data Classification
Nadjet Bouchaour (a), Smaine Mazouzi (a)
(a) LICUS, 20 aout 1955 Skikda, BP 26 Route el Hadaik 21000, Algeria


                                         Abstract
                                         Dealing with artifacts in order to segment medical images remains a challenge. In this paper we
                                         propose new features for Magnetic Resonance Image (MRI) data that enhance tissue segmentation
                                         regardless of the levels of the inherent artifacts, namely noise and intensity non-uniformity (INU).
                                         We show that using features based on the spatial entropy of intensities with different classifiers
                                         significantly enhances brain matter segmentation. In addition to their discriminative power, the
                                         proposed features are computationally inexpensive. Experiments were conducted on the BrainWeb
                                         database, and the obtained results allow us to conclude that the proposed features are well suited
                                         to represent MRI data for image segmentation.

                                         Keywords
                                         MRI, Entropy, Classification, Neural classifier, Naïve Bayes classifier


1. Introduction
Pattern-based classification approaches have gained increasing interest in the last decade.
According to such approaches, features are inferred not only from the data themselves, but also
from the patterns that exist in the data, which are used both for feature definition and for
classification. It has been stated in several works [1] that pattern mining can help to enhance
data classification, mainly in fields with structured data, such as object recognition and image
analysis. For several applications in these fields, patterns can be defined as sequences of
graphs in the raw data [2, 3]. Defining classification patterns thus allows enhancing
classification, compared to using only raw data.
   In this work, we are interested in the classification of MRI data, for which we propose a new
energy-based feature, namely the entropy of intensity. In contrast to most of the published
works, and instead of using the raw image data stored individually at the different voxels, we
consider the neighborhood of the voxels to form patterns, and then use these patterns as features
for classification. To do this, and in order to show that entropy-based coding yields the best
results, we proceed according to two different patterns. In the first, we use a simple aggregation
of the voxels that surround the voxel in question. The pattern in this case consists of the set
of the voxels that form the neighborhood of the voxel to classify.
   For the second pattern, we use an energy-based coding, assuming that the energy defined
within the neighborhood of a voxel represents well its interaction with its neighborhood. The
model is therefore based on defining, within the neighborhood of every voxel, an energy function
represented by a spatial entropy. Such an energy function allows considering, in addition
to the image intensity, the specific geometrical features that characterize MRI data.

IAM’20: Third conference on informatics and applied mathematics, 21–22 October 2020, Guelma, ALGERIA
Email: n.bouchaour@univ-skikda.dz (N. Bouchaour); mazouzi_smaine@yahoo.fr (S. Mazouzi)
© 2020 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

   We show that the aggregation of data according to the considered patterns, especially with
energy coding, together with the classification of these patterns with two classifiers, a Neural
Network classifier and a Naïve Bayes classifier, significantly improves the results of MRI data
classification. According to the experiments conducted, MRI segmentation based on the proposed
features was strongly enhanced. This enhancement is due first to the ability of the entropy to
capture interactions between neighboring voxels. Second, the diversity of feature instances of
the entropy allows overcoming both the over-training problem that characterizes MRI data, and
the problem of early convergence of the classification algorithm when it is optimization-based,
as in the neural-based methods [4, 5].
   The remainder of the paper is organized as follows. In Section 2 we review segmentation
methods for MRI data, including those based on machine learning, and the main works published
in the literature. In Section 3 we introduce our approach by presenting the data aggregation
method through energy coding. Section 4 is devoted to the experimental evaluation of the proposed
features, where we introduce the MRI database used and the obtained results, followed by an
analysis and a discussion of these results.


2. Related Work
Image segmentation is one of the most important tasks in pattern recognition using visual
data. It consists of subdividing the pixels / voxels of an image into distinct and homogeneous
regions. There are dozens of different segmentation methods. However, all these methods can
be classified according to three main families:

1. Contour-based methods: the principle common to these methods is that they detect
discontinuities in visual data. These discontinuities represent the edges in the image.
The detected edges are generally disjoint and open, and therefore they must be joined and
closed for adequate use in the subsequent recognition process.

2. Region-based methods: their principle consists in grouping the pixels / voxels of the image
having similar features, in disjoint but homogeneous subsets according to a given homogeneity
criterion. These homogeneous subsets are called regions.

3. Methods by classification: their major asset is that they allow learning from labeled
data, called the training set. Segmentation by classification consists of assigning a label to
every pixel / voxel of the image using a classifier (single or ensemble, classical or deep).
Given that we are interested in this last family in this work, we devote the remainder of this
section to introducing some methods of MRI segmentation by data classification.
   Classification-based segmentation methods can themselves be subdivided into two
subcategories:
1. Heuristic-based methods: where one or more heuristics are considered to define a pixel
/ voxel labeling criterion. The heuristics consider given priors, related to the image, to the
noise, or to the distortions that the image could undergo during its acquisition. For instance,
we can cite the Fuzzy c-means (FCM) algorithm [6], where the classification prior consists in
considering that, for the pixels / voxels at the borders of the regions and elsewhere, there
exists a mixture of information, each part relating to one of the data classes. Markovian
methods consider the prior of "smoothness", where the data are considered piecewise homogeneous,
and every piece corresponds to a homogeneous region of the image. Also, Markovianity can express
some spatial constraints that the data must fit.

2. Methods by learning: where machine learning techniques are used. Their common principle
is to train classifiers using labeled data, the so-called training set, and then to use the
trained classifiers to classify other data, called in this case the test set. Following this
approach, several new methods based on the combination of classifiers have emerged.
Except for deep learning methods, feature representation and extraction is a major issue for
data classification, including visual data in medical imaging.
   Magnetic Resonance Imaging (MRI) is of great importance for the establishment of correct
diagnoses and thus the prescription of appropriate treatments. The segmentation of an MRI
consists in extracting the main tissues in which physicians and radiologists are mainly
interested. These tissues are respectively CSF (Cerebrospinal Fluid), GM (Gray Matter) and WM
(White Matter), for structural MRI, and also LM (Lesion Matter) for pathological MRI. Several
methods of MRI segmentation have been published, starting with contour detectors, passing through
region extractors, and ending with machine-learning based methods. Richard et al. [7, 8] used
a distributed system with Markovian and Bayesian categorization of MRI tissues. The principle
of their method is to segment the volume into sub-volumes and then make autonomous agents
cooperate to produce an overall image segmentation. The method suffers from several problems,
including the ad hoc subdivision of volumes. Also, Markovian methods are known for their very
time-consuming minimization procedures. By adopting the same paradigm, Scherrer et al. [9]
proposed a distributed Markovian model for the classification of MRI data. In their work, they
were able to formalize the classification by using both a multi-agent system for data
distribution and processing, and a Markovian representation of MRI data, allowing its
classification using Markovian classifiers that deal with spatial constraints.
   Several works have proposed machine-learning methods for MRI segmentation. Some of
them have combined classifiers, where the only feature used was mainly the voxel intensity.
In most of the reviewed works, authors extracted features and then used them as inputs for
classifiers in order to label MRI data. Rajasree et al. have considered a fractal
representation of MRI data, by using the Brownian motion technique [10]. The adopted features
are then used with the Adaboost algorithm to detect tumors in MRI data.
   Gustavo et al. have combined Genetic Algorithms (GA) and Adaboost classification to detect
the tumor area in the MRI [11]. After thresholding the data using the GA algorithm in order
to delimit the tumor area, Adaboost is trained using the classification obtained by the GA
algorithm, then used to finally detect the tumor as the largest connected component in the
whole image. Recently, deep learning techniques, mainly convolutional neural networks (CNN),
have been widely proposed for MRI data processing. Their strong advantage is that they do not
need explicit feature representation and extraction. In such techniques, the input MRI data are
convolved with kernels in the middle layers, so features are automatically produced. Output
layers classify voxels according to the produced features [12, 13].
   Entropy-based features for MRI processing are rare in the literature. Sarita et al. have
combined a probabilistic neural network and wavelet entropy for feature extraction to classify
MRI data [14]. Entropy-based features were also used with optimization-based clustering, such as
in the work introduced by Pham et al. [15], where the authors combined fuzzy entropy clustering
and multi-objective particle swarm optimization. Contrary to the previous works, where entropy
is computed according to likelihood states of pixels, our entropy-based features are spatial
and are computed based on disparities between the intensities of the pixels/voxels in a given
neighborhood.


3. Pattern-based Features for MRI Data Classification
We define in this section two pattern-based features that will be tested for brain tissue
classification. In both cases, Multilayer Perceptron and Naïve Bayes classifiers are used to
label the voxels in the MRI volume. The MRI is first preprocessed using a skull-stripping
algorithm, namely the FSL Brain Extraction Tool (BET) [16, 17], to remove non-brain tissues.
We have preferred to avoid noise filtering for two reasons: first, filtering usually alters the
MRI data at the voxels close to the borders between the different tissues; second, the proposed
patterns reduce the effect of the noise during the classification, even without MRI data
smoothing, given that they consider the pixel/voxel in question together with its neighborhood.

3.1. MRI Data
The MRI volume obtained after skull-stripping is a set of voxels, each of which belongs to
one of the three remaining tissues, namely the Cerebrospinal Fluid (CSF), the Gray Matter
(GM), and the White Matter (WM). Each tissue is characterized by its mean intensity and
the corresponding standard deviation (𝜇𝑐 , 𝜎𝑐 ), 𝑐 ∈ {𝐶𝑆𝐹 , 𝐺𝑀, 𝑊 𝑀}. We also assume that the
intensity distribution in each tissue is Gaussian (see Formula 1):

    f_c(x_i; \mu_c, \sigma_c) = \frac{1}{\sigma_c \sqrt{2\pi}} \, e^{-(x_i - \mu_c)^2 / (2\sigma_c^2)}        (1)

where 𝑥𝑖 is the intensity of the voxel at the location 𝑖.

3.2. Local Neighborhood-based Classification
For this first pattern-based classification of the MRI data, we consider for each voxel in the
MRI volume the neighboring voxels that surround it, except for voxels situated on the sides
of the volume, for which a particular processing is dedicated. The Neural Network and the Naïve
Bayes classifiers are trained on a training MRI volume with its ground truth labeling.
Figure 1 depicts the principle of MRI segmentation based on this pattern. According to this
pattern, a voxel is not labeled according to its intensity alone but according to the intensities
of the voxels surrounding it. Noise and INU are thus indirectly considered, because voxels that
would be wrongly labeled due to high variations of their intensities are corrected using their
respective neighborhoods. Furthermore, the partial volume effect artifact is also considered. At
the borders of the different tissues, a voxel is likely to be labeled according to both its
intensity and the dominant class in the neighborhood. So the resulting class for such a voxel
will likely be that of the tissue with close intensity and high occurrence.

Figure 1: Principle of the proposed energy-coding-based classification.
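
As an illustration of the neighborhood pattern described in this section, the following minimal Python sketch builds the feature vector of one voxel from the intensities of the voxels around it. The paper does not state the neighborhood size or how border voxels are processed, so the 3 × 3 × 3 cube (`radius=1`), the clamping of out-of-range coordinates, and the function name `neighborhood_features` are all assumptions made for this example.

```python
from itertools import product


def neighborhood_features(volume, z, y, x, radius=1):
    """Return the intensities of the (2*radius+1)^3 cube centered on (z, y, x).

    `volume` is a nested list indexed as volume[z][y][x]. Coordinates that
    fall outside the volume are clamped to the nearest valid index; this
    border rule is an assumption, not the processing used in the paper.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    offsets = range(-radius, radius + 1)
    features = []
    for dz, dy, dx in product(offsets, repeat=3):
        zz = min(max(z + dz, 0), depth - 1)
        yy = min(max(y + dy, 0), height - 1)
        xx = min(max(x + dx, 0), width - 1)
        features.append(volume[zz][yy][xx])
    return features


# Toy 3x3x3 volume whose intensity encodes the voxel position (100*z + 10*y + x).
toy = [[[100 * z + 10 * y + x for x in range(3)] for y in range(3)] for z in range(3)]
center = neighborhood_features(toy, 1, 1, 1)   # full cube around the central voxel
corner = neighborhood_features(toy, 0, 0, 0)   # border voxel, uses clamping
```

Each call returns 27 values, which would then be passed, together with the ground truth label of the central voxel, to the classifier during training.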

3.3. Energy Coding-based Classification
Our proposed entropy-based pattern aims to capture interactions between the voxels belonging
to a local neighborhood. Such interactions can be represented by an energy function.
The proposed pattern for a given voxel 𝑖 in the MRI volume is a vector of three components,
where each component represents the spatial entropy of the intensities of the similar voxels
in the neighborhood. Such subsets of similar voxels are obtained by the 𝑘-means algorithm,
applied on the voxel’s neighborhood with three classes (CSF, GM, WM) (see Equation 2).

    E_c = - \sum_{i \in D_c} P_i \log_2 P_i                                    (2)

  where 𝐷𝑐 denotes the set of the voxels belonging to the class 𝑐, and 𝑃𝑖 is the probability that
the voxel belongs to the class 𝑐, and:

    P_i = \frac{\frac{1}{\sigma_c \sqrt{2\pi}} \, e^{-(x_i - \mu_c)^2 / (2\sigma_c^2)}}
               {\sum_{c'} \frac{1}{\sigma_{c'} \sqrt{2\pi}} \, e^{-(x_i - \mu_{c'})^2 / (2\sigma_{c'}^2)}}        (3)

   𝜇𝑐 , 𝜎𝑐 are respectively the mean and the standard deviation of the intensities of the voxels
belonging to the class 𝑐 and situated in the neighborhood of the voxel in question (𝑖). A
clustering by the 𝑘-means algorithm is thus performed on the voxel neighborhood, which yields the
three subsets of voxels and their respective pairs (𝜇𝑐 , 𝜎𝑐 ), 𝑐 ∈ {𝐶𝑆𝐹 , 𝐺𝑀, 𝑊 𝑀}. As can be
seen in Fig. 2, the vector of features that will be used for classification by the Neural
Network or Naïve Bayes is composed of the intensity of the voxel in question and the three
spatial entropies 𝐸1, 𝐸2 and 𝐸3, obtained according to the clustering of the set of the voxels
forming the local neighborhood. Such a pattern captures well the interactions of the voxels,
and expresses well the spatial constraints that exist within the MRI data. Entropies 𝐸1, 𝐸2, and
𝐸3 allow distinguishing the cases where the voxel is in the neighborhood of a tissue border from
those where it is not. They also allow distinguishing whether or not a voxel is affected by a
high deviation due to noise. The intensity of the voxel in question is still considered for
classification, so the resulting class is likely that of the tissue with the closest mean
intensity, but adjusted if needed by the voxels in the neighborhood.

Figure 2: Principle of the proposed energy-coding-based classification.

  For the case where the entropies of the voxels in the neighborhood of the voxel in question
are also considered, the vector of features that will be used for classification by the Neural
Network or Naïve Bayes is composed of the intensity of the voxel in question and the entropies
of all its neighbors, calculated according to the same principle as for the central voxel.
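
The entropy coding of Equations 2 and 3 can be illustrated by the minimal Python sketch below. It clusters the intensities of one neighborhood with a plain 1-D 𝑘-means, estimates (𝜇𝑐 , 𝜎𝑐 ) per cluster, and sums −𝑃𝑖 log2 𝑃𝑖 over the voxels of each cluster. Several details are assumptions made for this example and are not specified in the paper: the quantile-based initialization of 𝑘-means, the fixed number of iterations, the floor `min_sigma` that avoids a zero standard deviation, the zero entropy assigned to an empty cluster, and all function names.

```python
import math


def kmeans_1d(values, k=3, iterations=20):
    """Plain 1-D k-means; centers start at evenly spaced quantiles (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def gaussian(x, mu, sigma):
    """Gaussian density of Formula 1."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def entropy_features(neighborhood, k=3, min_sigma=1.0):
    """Return [E_1, ..., E_k], one spatial entropy per cluster (Equations 2 and 3)."""
    labels = kmeans_1d(neighborhood, k)
    stats = []
    for j in range(k):
        members = [v for v, lab in zip(neighborhood, labels) if lab == j]
        if not members:
            stats.append(None)  # empty cluster: contributes no density
            continue
        mu = sum(members) / len(members)
        var = sum((v - mu) ** 2 for v in members) / len(members)
        stats.append((mu, max(math.sqrt(var), min_sigma)))  # floor avoids sigma = 0
    entropies = []
    for j in range(k):
        e = 0.0
        if stats[j] is not None:
            for v, lab in zip(neighborhood, labels):
                if lab != j:
                    continue
                densities = [gaussian(v, *s) if s else 0.0 for s in stats]
                p = densities[j] / sum(densities)  # Equation 3
                if p > 0:
                    e -= p * math.log2(p)          # Equation 2
        entropies.append(e)
    return entropies


def voxel_features(intensity, neighborhood):
    """Feature vector of one voxel: its intensity followed by the entropies."""
    return [intensity] + entropy_features(neighborhood)


homogeneous = entropy_features([100] * 27)
border = voxel_features(110, [90, 95, 100, 105, 110, 112, 118, 120, 125])
```

In this sketch a perfectly homogeneous neighborhood gives zero entropies, while a neighborhood that mixes close intensity populations gives strictly positive ones, which is the behavior the text relies on to detect tissue borders and noisy voxels.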


4. Experimentation and Evaluation
The proposed patterns were evaluated using MRI volumes from the well-known BrainWeb
database [18]. This database provides a large set of MRI volumes with their ground truth
labeling, which allows researchers to test their machine-learning based methods and to evaluate
them quantitatively. Furthermore, MRIs can be obtained with various levels of artifacts, namely
noise and INU. All MRIs have a size of 181 × 217 × 181 voxels. In this work, only MRIs with the
T1 modality are considered.

4.1. Performance evaluation
Two main indexes are usually computed to evaluate and compare segmentation methods based
on classification and clustering, namely the Jaccard and Dice indexes. Based on the true
positive (𝑇 𝑃), false positive (𝐹 𝑃), and false negative (𝐹 𝑁 ) labeling instances, the Jaccard
coefficient is expressed as follows:

    Jaccard = \frac{TP}{TP + FP + FN}                                          (4)

  The Dice coefficient can be expressed as:

    Dice = \frac{2\,TP}{2\,TP + FP + FN}                                       (5)

Table 1
Segmentation results according to the Dice index (%) for the different MRIs and the different
brain matters (WM, GM and CSF). The classification features, using a Neural Network, for this
case are the intensity of the voxel to classify and the intensities of the neighboring voxels.
Rows give the INU level, columns the noise level N (%).

          |              WM               |              GM               |              CSF
 INU \ N  |    1       3       5       7  |    1       3       5       7  |    1       3       5       7
 0%       |  91.55   87.54   87.49   83.16 |  75.59   69.61   69.43   64.41 |  51.26   49.15   33.97   29.81
 20%      |  90.28   87.85   85.60   82.59 |  74.23   68.82   68.39   64.59 |  51.08   49.17   24.67   31.41
 40%      |  87.92   86.70   82.14   80.61 |  70.97   68.39   64.45   62.89 |  51.36   49.27   43.43   27.95
 60%      |  84.34   77.06   67.52   69.48 |  65.10   54.96   40.36   45.54 |  47.62   35.63   28.94   17.18
 90%      |  76.87   74.46   70.08   68.02 |  55.26   54.64   48.44   46.83 |  41.97   39.33   26.66   18.70


   We opted for Dice index given that it is more cited in the literature and it has been considered
in works with which we compare our results.
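
For illustration, the two indexes can be computed per tissue from a predicted labeling and its ground truth as in the minimal Python sketch below. It uses the standard definitions, in which the denominator counts the false positives and the false negatives of the tissue. The function name `overlap_scores`, the label strings, and the convention of returning 1.0 when a tissue is absent from both labelings are assumptions made for this example.

```python
def overlap_scores(predicted, truth, tissue):
    """Return (Jaccard, Dice) of one tissue from two equal-length label sequences."""
    tp = fp = fn = 0
    for p, t in zip(predicted, truth):
        if p == tissue and t == tissue:
            tp += 1          # voxel correctly assigned to the tissue
        elif p == tissue:
            fp += 1          # voxel wrongly assigned to the tissue
        elif t == tissue:
            fn += 1          # voxel of the tissue that was missed
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0      # tissue absent from both labelings (assumed convention)
    return tp / union, 2 * tp / (2 * tp + fp + fn)


# Toy labeling of six voxels; one WM voxel is mislabeled as GM.
truth = ["WM", "WM", "WM", "GM", "GM", "CSF"]
predicted = ["WM", "WM", "GM", "GM", "GM", "CSF"]
wm_jaccard, wm_dice = overlap_scores(predicted, truth, "WM")
gm_jaccard, gm_dice = overlap_scores(predicted, truth, "GM")
csf_jaccard, csf_dice = overlap_scores(predicted, truth, "CSF")
```

The two indexes are monotonically related (Dice = 2J / (1 + J)), so they rank methods identically; the Dice values reported in the tables are these ratios expressed as percentages.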

4.2. Experimental Results
In order to show the impact of entropy on improving the segmentation quality, we introduce
a series of experiments in which several experimental elements are varied. First, we present the
segmentation results using the gray levels of the voxel and those of its neighbors. Then, we
show the results obtained by using the entropy as a classification feature, and finally the
results obtained based on the entropy but including the voxels in the neighborhood of the voxel
to be classified. We also considered two different classifiers, in order to show that the
improvement of the results is not due to the classifier itself, but to the proposed features,
namely the spatial entropy. These classifiers are respectively the Neural Network classifier
and the Naïve Bayes one.

   Table 1 shows the segmentation results according to the Dice index using a Neural classifier,
and considering the value of the voxel in question and the values of its neighboring voxels as
classification features.

   The obtained results for the white and the gray matter are relatively acceptable, especially
for low levels of noise and INU. However, the results are very unsatisfactory for cerebrospinal
fluid (CSF). This is due to the fact that the CSF voxels are located in narrow regions, where
the neighborhood of such voxels overlaps the neighboring tissues, which corrupts the
classification data.

  Table 2 shows the results of MRI segmentation at different noise and INU levels, using the
spatial entropy of the voxel as the unique feature.

 According to the results introduced in Table 2, we notice a significant improvement in the
segmentation results when the spatial entropy is used as a classification feature. This
improvement can be explained by the capture, in the expression of the entropy, of the interaction
between neighboring voxels. This interaction is expressed as an energy, formulated by the spatial
entropy. A voxel is not classified solely according to its intensity but according to the
strength of its interaction with neighboring voxels.

Table 2
Segmentation results according to the Dice index (%) for the different MRIs and the different
brain matters (WM, GM and CSF). The features of classification by the Neural Network are the
voxel intensity and its spatial entropies. Rows give the INU level, columns the noise level N (%).

          |              WM               |              GM               |              CSF
 INU \ N  |    1       3       5       7  |    1       3       5       7  |    1       3       5       7
 0%       |  98.21   94.49   92.87   87.48 |  96.16   90.72   86.69   79.91 |  96.54   94.18   91.32   88.01
 20%      |  96.79   94.98   92.83   88.83 |  93.75   90.99   87.26   80.92 |  95.25   93.87   91.74   88.16
 40%      |  94.11   92.72   90.24   85.87 |  89.69   86.84   82.81   74.96 |  93.51   92.45   89.97   86.54
 60%      |  87.92   83.88   80.66   76.91 |  76.11   96.19   66.63   56.43 |  78.19   77.77   74.35   66.14
 90%      |  82.38   80.26   75.69   74.19 |  70.84   67.10   55.11   57.53 |  77.16   73.62   71.27   66.87


Table 3
Segmentation results according to the Dice index (%) for the different MRIs and the different
brain matters (WM, GM and CSF). The features of classification by the Neural Network are the set
of the neighboring voxels and the spatial entropy of the voxel in question. Rows give the INU
level, columns the noise level N (%).

          |              WM               |              GM               |              CSF
 INU \ N  |    1       3       5       7  |    1       3       5       7  |    1       3       5       7
 0%       |  97.48   95.96   94.85   93.44 |  95.55   92.14   89.75   86.97 |  97.23   93.75   92.10   90.37
 20%      |  97.10   95.52   93.99   91.73 |  94.27   91.39   88.91   84.18 |  95.78   93.76   91.54   89.89
 40%      |  94.80   93.81   90.24   88.19 |  90.35   88.74   82.20   78.34 |  94.09   92.69   89.06   87.60
 60%      |  91.92   88.81   87.44   84.07 |  86.68   81.25   78.38   72.03 |  92.29   88.37   85.73   82.40
 90%      |  85.68   84.17   81.68   79.24 |  76.64   76.24   69.77   64.72 |  90.54   87.60   83.61   78.75

   Table 3 presents the segmentation results using spatial entropy and by considering the voxel
in question and its neighboring voxels.

   Compared to the results obtained with the entropy of the voxel taken alone, we note a stronger
robustness against noise and INU (see Tables 2 and 3). For instance, for white matter with an
INU of 0%, the Dice index varies from 97.48 to 93.44 when the noise varies from 1% to 7%
(Table 3), while it varied from 98.21 to 87.48 for the same variation of noise with the entropy
of the voxel alone (Table 2). The same holds for the robustness against the INU: with the noise
level set at 1%, the Dice index varies from 98.21 to 82.38 when the INU varies from 0% to 90%
(Table 2), against a variation from 97.48 to 85.68 for the same variation of the INU (Table 3).
The same behavior is observed for the other two tissues, namely the gray matter and the CSF (see
Tables 2 and 3). The entropies of the voxel to be classified and those of the neighboring voxels
form a set of features that allows better segmentation of the tissues, even with high levels of
noise and INU. Figure 3 shows the variation of the Dice index according to the level of noise at
a constant INU level.

Figure 3: Variation of the Dice index according to the noise level with fixed values of the INU.
(a) Variation for INU = 0%, (b) variation for INU = 40%.
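
The comparison above can be made explicit by computing the drop of the Dice index for white matter in each configuration. The short Python sketch below only restates values copied from Tables 2 and 3; the dictionary layout and the key names are choices made for this example.

```python
# White matter Dice values (%) copied from Tables 2 and 3.
wm_dice = {
    "voxel entropy (Table 2)": {"base": 98.21, "noise_7": 87.48, "inu_90": 82.38},
    "neighborhood entropy (Table 3)": {"base": 97.48, "noise_7": 93.44, "inu_90": 85.68},
}

# Drop in Dice points when going from the mildest setting (noise 1%, INU 0%)
# to noise 7% at INU 0%, and to INU 90% at noise 1%.
drops = {
    name: {
        "noise_1_to_7": round(v["base"] - v["noise_7"], 2),
        "inu_0_to_90": round(v["base"] - v["inu_90"], 2),
    }
    for name, v in wm_dice.items()
}

for name, d in drops.items():
    print(f"{name}: -{d['noise_1_to_7']:.2f} (noise), -{d['inu_0_to_90']:.2f} (INU)")
```

With the entropy of the voxel alone, the index loses 10.73 points under noise and 15.83 points under INU, against 4.04 and 11.80 points when the neighborhood entropies are included.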

  For low values of the INU, the method based on the entropy of the voxel taken alone produces
better results than the method where the voxel is considered with its neighborhood. However, we
can see in Figure 4 that taking into account the neighborhood improves the results in terms of
robustness against noise and INU. The segmented images obtained with the voxel neighborhood have
more regular boundaries, and several voxels have been correctly reassigned. This can be explained
by the discarding of the voxels whose gray levels are close to that of the tissue in question,
but which are in reality noise voxels located outside the region of the considered tissue.

   All the experiments carried out using the Neural classifier were repeated under the same
conditions using the Naïve Bayes classifier. In terms of performance, we have seen a slight drop
in Dice index values for all the test images. However, we have also seen a high improvement
in results with the use of entropy. We also noticed a stability of the results against noise and
INU when the entropy is used as a feature while taking into account the neighborhood of the
voxel. Figure 5 shows the variation of the Dice index according to the noise level for the white
matter for INU = 40%. As with the Neural classifier, we notice a stabilization of the results
against noise.

   In order to show the effectiveness of the proposed features, we introduce in Table 4 a
comparison between the obtained MRI data classification results and those of some well-cited
works from the literature. We have considered MRIs with 20% INU and different noise levels, and
we compared results for WM and GM tissues.
Figure 4: MRI segmentation results with noise level N = 3% and INU = 40%. (a), (b), (c) are
respectively the white matter, the gray matter and the CSF for the case where the classification
features are the intensity of the voxel and its entropies. (d), (e) and (f) depict the
corresponding tissues where the features are the entropies of both the voxel and its neighbors.
For the last images, the extracted tissues are more compact because a large number of voxels in
the periphery of the tissues were discarded.


   We can notice from Table 4 that our method, especially when the neighborhood is taken into
account, scores close to the best one, namely Fast [19], when compared with the others [20, 21].

4.3. Result analysis and discussion
According to the different results introduced above, we notice a strong improvement in
segmentation results when spatial entropy is used as a classification feature. Indeed, despite
taking the neighborhood into account when classifying voxels based on intensity alone, the
results were not so satisfactory, especially with high levels of noise and INU. However, using
only the spatial entropy value of the voxel to classify has allowed a large increase in the
values of the Dice index, even with high levels of noise and INU. This can be explained by the
fact that the proposed spatial entropy, in addition to its ability to consider the voxel and its
neighborhood, has an energy nature and thus expresses the interaction force between the voxels
in the MRIs. A voxel is therefore not classified solely according to its value or the values of
its neighbors, but also according to the force of interaction of the voxel with its neighborhood.
Taking into account neighboring voxels in the context of spatial entropy has shown better
robustness against noise and INU, especially when the levels of these artifacts are high. Such a
result can be explained by the widening of the voxel’s field of interaction beyond its local
neighborhood.

Figure 5: Variation of the Dice index according to the noise level, and for constant values of
INU with the Naïve Bayes classifier. (a) Variations for INU = 0%, (b) variations for INU = 40%.

Table 4
Result comparison with some works from the literature (Dice index, %, for MRIs with 20% INU).
Columns give the noise level (%).

                          |        WM         |        GM
 Method                   |  1    3    5    7 |  1    3    5    7
 Fast                     | 97   95   94   92 | 96   94   91   91
 SMP5                     | 94   94   90   86 | 93   92   90   87
 NL-FCM                   | 94   91   90   83 | 94   93   90   87
 Voxel Entropy            | 98   98   96   94 | 97   96   93   89
 Neighborhood Entropy     | 98   98   96   94 | 97   96   93   89


5. Conclusion
In this work, we have introduced a new feature for MRI data representation that considerably
improves the classification of voxels, and thus the segmentation of this type of image.
It consists of the spatial entropy, whose purpose is to capture the interaction between
neighboring voxels, which allows them to be better classified. We considered two use cases of
the proposed spatial entropy: as one feature of the voxel to be classified, and as a set of
features of the voxel in question and its neighboring voxels. The obtained results, by varying
the different artifact levels, showed a strong improvement in the results for the first use case,
and a good robustness for the second case. In future work, the proposed feature should be
tested with different classifiers, using ensembles of classifiers as well as deep classifiers.


References
 [1] B. Bringmann, S. Nijssen, A. Zimmermann, Pattern-based classification: A uni-
     fying perspective, CoRR abs/1111.6191 (2011). URL: http://arxiv.org/abs/1111.6191.
     arXiv:1111.6191.
 [2] C. Zhou, B. Cule, B. Goethals, Pattern based sequence classification, IEEE Transactions
     on Knowledge and Data Engineering 28 (2016) 1285–1298.
 [3] A. A. Roma, A. Diaz De Vivar, K. J. Park, I. Alvarado-Cabrero, G. Rasty,
     J. G. Chanona-Vilchis, Y. Mikami, S. R. Hong, N. Teramoto, R. Ali-Fehmi, J. K. L. Rutgers,
     D. Barbuto, E. G. Silva, Invasive endocervical adenocarcinoma: a new pattern-based
     classification system with important clinical significance, The American Journal of
     Surgical Pathology 39 (2015) 667–672. URL: https://doi.org/10.1097/PAS.0000000000000402.
     doi:10.1097/pas.0000000000000402.
 [4] K. Hornik, Approximation capabilities of multilayer feedforward networks,
     Neural Netw. 4 (1991) 251–257. URL: https://doi.org/10.1016/0893-6080(91)90009-T.
     doi:10.1016/0893-6080(91)90009-T.
 [5] J. Park, I. W. Sandberg, Approximation and radial-basis-function networks, Neural
     Comput. 5 (1993) 305–316. URL: https://doi.org/10.1162/neco.1993.5.2.305.
     doi:10.1162/neco.1993.5.2.305.
 [6] J. Bezdek, R. Ehrlich, W. E. Full, FCM: The fuzzy c-means clustering algorithm, Computers
     & Geosciences 10 (1984) 191–203.
 [7] N. Richard, M. Dojat, C. Garbay, Automated segmentation of human brain MR images
     using a multi-agent approach, Artificial Intelligence in Medicine 30 (2004) 153–176.
 [8] N. Richard, M. Dojat, C. Garbay, Distributed Markovian segmentation: Application to MR
     brain scans, Pattern Recogn. 40 (2007) 3467–3480.
     URL: https://doi.org/10.1016/j.patcog.2007.03.019. doi:10.1016/j.patcog.2007.03.019.
 [9] B. Scherrer, F. Forbes, C. Garbay, M. Dojat, Distributed local MRF models for tissue and
     structure brain segmentation, IEEE Transactions on Medical Imaging 28 (2009) 1278–1295.
[10] R. Rajasree, C. C. Columbus, Brain tumour image segmentation and classification
     system based on the modified AdaBoost classifier, International Journal of Applied
     Engineering Research 10 (2015).
[11] G. C. Oliveira, R. Varoto, A. C. Jr., Brain tumor segmentation in magnetic resonance
     images using genetic algorithm clustering and AdaBoost classifier, in: Proceedings of the
     11th International Joint Conference on Biomedical Engineering Systems and Technologies
     - Volume 2: BIOIMAGING, INSTICC, SciTePress, 2018, pp. 77–82.
     doi:10.5220/0006534900770082.
[12] W. Zhang, R. Li, H. Deng, L. Wang, W. Lin, S. Ji, D. Shen, Deep convolutional neural
     networks for multi-modality isointense infant brain image segmentation, NeuroImage
     108 (2015) 214–224.
[13] A. de Brébisson, G. Montana, Deep neural networks for anatomical brain segmentation,
     CoRR abs/1502.02445 (2015). URL: http://arxiv.org/abs/1502.02445. arXiv:1502.02445.
[14] M. Saritha, K. Paul Joseph, A. T. Mathew, Classification of MRI brain images using
     combined wavelet entropy based spider web plots and probabilistic neural network,
     Pattern Recogn. Lett. 34 (2013) 2151–2156. URL: https://doi.org/10.1016/j.patrec.2013.08.017.
     doi:10.1016/j.patrec.2013.08.017.
[15] T. X. Pham, P. Siarry, H. Oulhadj, A multi-objective optimization approach for brain MRI
     segmentation using fuzzy entropy clustering and region-based active contour methods,
     Magnetic Resonance Imaging 61 (2019) 41–65.
[16] S. Smith, Fast robust automated brain extraction, Human Brain Mapping 17 (2002).
[17] M. Jenkinson, BET2: MR-based estimation of brain, skull and scalp surfaces, Eleventh
     Annual Meeting of the Organization for Human Brain Mapping, 2005 (2005).
     URL: https://ci.nii.ac.jp/naid/10030066593/en/.
[18] C. Cocosco, V. Kollokian, R.-S. Kwan, A. Evans, Simulated brain database homepage, 1997.
     URL: https://brainweb.bic.mni.mcgill.ca/brainweb, accessed: 2020-06-13.
[19] Y. Zhang, M. Brady, S. Smith, Segmentation of brain MR images through a hidden Markov
     random field model and the expectation-maximization algorithm, IEEE Transactions on
     Medical Imaging 20 (2001) 45–57.
[20] J. Ashburner, K. J. Friston, Unified segmentation, 2005.
[21] B. Caldairou, N. Passat, P. Habas, C. Studholme, F. Rousseau, A non-local fuzzy
     segmentation method: Application to brain MRI, Pattern Recognition 44 (2011) 1916–1927.
     URL: https://hal.archives-ouvertes.fr/hal-00476587. doi:10.1016/j.patcog.2010.06.006.

</pre>