<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.18287/1613-0073-2016-1638-313-319</article-id>
      <title-group>
        <article-title>ADJUSTED POLYNOMIAL FEATURES FOR ANALYSIS OF LUNG CT IMAGES</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A.V. Gaidel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Samara National Research University</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
          ,
          <institution>Image Processing Systems Institute - Branch of the Federal Scientific Research Centre "Crystallography and Photonics" of Russian Academy of Sciences</institution>
          ,
          <addr-line>Samara</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>1638</volume>
      <fpage>313</fpage>
      <lpage>319</lpage>
      <abstract>
        <p>We introduced new polynomial features described as polynomials on the image pixels domain. After imposing natural conditions these features transform into linear combinations of the image autocovariance function readings. We proposed a way to adjust these features using learning sample textural properties. Experiments on a real diagnostic dataset of lung CT images showed a decrease of the error probability in comparison with the previous works.</p>
      </abstract>
      <kwd-group>
        <kwd>texture analysis</kwd>
        <kwd>discriminant analysis</kwd>
        <kwd>feature construction</kwd>
        <kwd>feature selection</kwd>
        <kwd>computer-aided diagnostics</kwd>
        <kwd>polynomial features</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Computer-aided diagnostics provides automatization, objectivization and cheapening
of medical procedures. It also belongs to the diagnostic image analysis performed by
medical professionals using the lesion detection. Emphysema diagnostics by 2D slices
of lung CT images is an example of such procedure (Fig. 1). In the presence of the
disease one can note specific dark areas on the image.</p>
      <p>
        One of the most important and least studied steps of the image recognition process is
the selection of the features, which quantitatively describes the characteristics that are
most different on the images of different classes. Development of the automatic
feature construction methods appropriated to the particular task is an open problem.
For the selection of a small finite number of features one can use the brute force,
allowing him to select the best subset of features for specific applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. If the
number of features exceeds several tens, heuristic selection algorithms are used, one
of which was previously used to analyze lung CT images [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In addition, even early
texture analysis was used to analyze these images in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] mostly with a help of
runlength features.
Approaches to the selection of the infinite collections of features are much less
studied. In most papers, such as [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the new features are selected as a number of
several sequential transforms of the existing primitive features. However, approaches
to the selection of certain feature parameters using evolutionary algorithms are also
proposed [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The theory of the efficient linear local features construction is described
in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        The general approach to the adjusted feature selection is described in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. There was
introduced a simple adjusting algorithm based on image rotation.
      </p>
      <p>The aim of this work was to use the new adjusted polynomial features for the lung CT
images classification, as well as the study of the feature space quality criteria and
adjusting algorithms, the most suitable for this problem.</p>
    </sec>
    <sec id="sec-2">
      <title>Polynomial features and quadratic features</title>
      <p>Let Ω be a set of images we want to recognize, and each image is as follows
m, n : D  Q ,
where the pixels domain D  Z2 is a region of interest, and Q  0, Q 1  Z is a
set consisting of Q gray levels.</p>
      <p>We constructed polynomial features as the polynomials on Ω . Say that a multi-index
of degree q  0 is a tuple
    1  2
  D </p>
      <p>T
such that k  1; D   Z :  k  </p>
      <p>0 and
D
   k   q .
k 1
Here we have</p>
      <p>0   j  Z | j  0 , and the operator A denotes a number of
elements in the finite set A .</p>
      <p>Also suppose there is a lexicographical order on D such as
m1, n1 </p>
      <p>m2 , n2   m1  m2  m1  m2  n1  n2 
and all pixels  m, n  D are numerated in correspondence with that order. It defines</p>
      <p>D
a finite sequence of pixels mk , nk k 1 for which
i, j 1, 2, , D  : i  j    mi , ni 
</p>
      <p> m j , n j  .</p>
      <p>Assuming Iq is a set of all multi-indices of degree q , we can introduce a polynomial
of degree q as</p>
      <p>q D
P ,          mk , nk 
p0 Ip k 1
k</p>
      <p>,
where     : Iq </p>
      <p>
        Here D  m, n is understood as a set of pixels  m, n from the region of interest
for which the pixels  m  m, n  n also belong to the region of interest:
The operator (1) determines the broad collection of parametric features differing by
their coefficients  , some of which may be better or worse appropriated to the
particular application. The adjusted polynomial feature selection is to select the best
coefficients  for the specified problem, for which images from the learning sample are
well divided into classes in the feature space formed by the operator (1).
In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] we proved that under natural constraints related to the invariance to some
texture-independent transforms polynomial features (1) become the form
P2 ,  
      </p>
      <p>
m,nWd</p>
      <p>m, n Rˆ m, n ,
where Rˆ  m, n  are the readings of the image autocovariance function
Rˆ  m, n 
1</p>
      <p>
D  m, n m,nD m,n</p>
      <p>m, n m  m, n  n
calculated within the square window of radius d
Wd  d; d 2  Z2 .</p>
      <p>is its coefficient corresponding to the term with the multi-index
(1)
(2)
(3)
D m, n  m, n  D | m  m, n  n  D .</p>
      <p>In fact, the operator (2) is a linear combination of image autocovariance function
readings. This is only a subset of quadratic polynomials.</p>
      <p>We assumed that the images from the learning sample are standardized, i.e. have zero
mean value and unit standard deviation. Standardization consists of the mean value
subtraction and the dividing by the standard deviation.</p>
    </sec>
    <sec id="sec-3">
      <title>Feature space quality criteria</title>
      <p>
        In this work we used the operator (2) to calculate features and the nearest neighbor
algorithm as a classifier, but we calculated a distance between feature vectors by their
standard score. We determined the coefficients   m, n for the features (2) solving
the optimization problem, where one of the feature space quality criteria computed
from the learning sample is used as an objective function. We investigated the
following feature space quality criteria.
1. The accuracy of recognition J is a fraction of correctly recognized images from
the learning sample calculated using leave-one-out cross-validation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
2. Bhattacharya distance
      </p>
      <p>
         1 
   
 2 
where al and Rl are the means and the correlation matrices of the feature vectors
3. The first discriminant analysis criterion from [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
from the l -th class.
      </p>
      <p>J1  tr  R1R ,
J4  tr1  R  tr  R .</p>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>
        where R is a correlation matrix of feature vectors from the both class distributions
mixture and R   R1  R2  / 2 is a mean intraclass correlation matrix.
4. The fourth discriminant analysis criterion from [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
Experiments were carried out according to the scheme described in detail in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The
dataset for the experiments consists of 160 lung CT images with resolution of
140 × 200 pixels. Each experiment was carried out separately for each of the quality
criteria described in the previous section, and for each of the optimization algorithms
suitable for this criterion.
      </p>
      <p>
        For all criteria except J1 we used one of three optimization algorithms: simple
random search, genetic algorithm or simulated annealing. For the criterion J1 there is a
known optimization algorithm used in the discriminant analysis and based on the
principal component analysis [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        We carried out each set of experiments as follows.
1. We randomly divided the entire set of images into a learning sample and a test
sample of equal volumes.
2. With a help of a given optimization procedure we determined coefficients
  m, n maximizing the specified quality criterion. The output was a taught-in
system that stores in memory all adjusted feature vectors corresponding to the
images from the learning sample.
3. Among the adjusted features we selected a subset of the most efficient features
using the heuristic feature ordering procedure described in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
4. We verified quality of the recognition system by the recognition error rate
assessment using the test sample.
      </p>
      <p>
        Specific parameters of the algorithms described in detail in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]:
d  2 is a window (3) range, in which covariance function readings where
computed;
K  13 is a number of the quadratic features;
Nopt  200 is a number of steps of the global optimization iterative algorithms;
Npop  10 is a population size of the genetic algorithm;
pmut  0,1 is a mutation probability of the genetic algorithm;
 t0  10 is an initial temperature of the simulated annealing;
1   m, n  1 – constraints on the coefficients for the cases in which
optimization procedures can’t work without constraints.
Research findings on adjusted quadratic features (2) effectiveness for the problem of
automatic emphysema diagnosis by the lung CT images are shown in the Tab. 1. For
each adjusting method we presented the best feature space quality criterion, the best
error rate of the test sample images, as well as the number of features in the most
efficient set. We highlighted the line with the lowest error rate.
      </p>
      <p>
        Agresti-Coull interval for the best recognition error rate with the 95 % confidence
level is (0.02, 0.14). We obtained the best results using random search, but they much
surpass their counterparts obtained in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for the same problem.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>
        We investigated new polynomial features allowing adjusting for the textural
properties of the learning sample images in the lung CT image recognition problem.
Analyzing experimental results we found out the decrease of the recognition error rate from
0.11 to 0.06 in comparison with the results given in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In the future we plan
to investigate the effectiveness of these features to other image analysis problems.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>The work was partially funded by Russian Science Foundation (RSF), grant No.
1407-97040-р_поволжье_а, the Russian Federation Ministry of Education and Science
and Fundamental Research Program NITD RAS «Bioinformatics, modern
information technologies and mathematical methods in medicine».</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ilyasova</surname>
            <given-names>NYu</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kupriyanov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paringer</surname>
            <given-names>RA</given-names>
          </string-name>
          .
          <article-title>Formation of features for improving the quality of medical diagnosis based on discriminant analysis methods</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2014</year>
          ;
          <volume>38</volume>
          (
          <issue>4</issue>
          ):
          <fpage>851</fpage>
          -
          <lpage>855</lpage>
          . [In Russian]
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gaidel</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zelter</surname>
            <given-names>PM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kapishnikov</surname>
            <given-names>AV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khramov</surname>
            <given-names>AG</given-names>
          </string-name>
          .
          <article-title>Computed tomography texture analysis possibilities of the chronic obstructive pulmonary disease diagnosis</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2014</year>
          ;
          <volume>38</volume>
          (
          <issue>4</issue>
          ):
          <fpage>843</fpage>
          -
          <lpage>850</lpage>
          . [In Russian]
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ginsburg</surname>
            <given-names>SB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lynch</surname>
            <given-names>DA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowler</surname>
            <given-names>RP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schroeder</surname>
            <given-names>JD</given-names>
          </string-name>
          .
          <source>Automated Texture-based Quantification of Centrilobular Nodularity and Centrilobular Emphysema in Chest CT Images. Acad Radiol</source>
          ,
          <year>2012</year>
          ;
          <volume>19</volume>
          (
          <issue>10</issue>
          ):
          <fpage>1241</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Neshatian</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnston</surname>
            <given-names>M.</given-names>
          </string-name>
          <article-title>Feature construction and dimension reduction using genetic programming</article-title>
          .
          <source>Lecture Notes in Comput. Sc.</source>
          ,
          <year>2007</year>
          ;
          <volume>4830</volume>
          :
          <fpage>160</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Fan</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhong</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peng</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verscheure</surname>
            <given-names>O</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            <given-names>Q</given-names>
          </string-name>
          .
          <article-title>Generalized and heuristic-free feature construction for improved accuracy</article-title>
          .
          <source>Proceedings of the 10th SIAM International Conference on Data Mining; Columbus</source>
          ,
          <string-name>
            <given-names>OH</given-names>
            ,
            <surname>United</surname>
          </string-name>
          <string-name>
            <surname>States</surname>
          </string-name>
          ,
          <volume>29</volume>
          <fpage>April</fpage>
          - 1
          <string-name>
            <surname>May</surname>
          </string-name>
          ,
          <year>2010</year>
          :
          <fpage>629</fpage>
          -
          <lpage>640</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Lillywhite</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>D-J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tippetts</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Archibald</surname>
            <given-names>J.</given-names>
          </string-name>
          <article-title>A feature construction method for general object recognition</article-title>
          .
          <source>Pattern Recogn</source>
          ,
          <year>2013</year>
          ;
          <volume>71</volume>
          (
          <issue>3</issue>
          ):
          <fpage>514</fpage>
          -
          <lpage>527</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>VV</given-names>
          </string-name>
          .
          <article-title>Constructing efficient linear local features in image processing and analysis problems</article-title>
          . Automat Rem Contr+,
          <year>2010</year>
          ;
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>58</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Myasnikov</surname>
            <given-names>VV</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bavrina</surname>
            <given-names>AY</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Titova</surname>
            <given-names>OA</given-names>
          </string-name>
          .
          <article-title>Analysis of methods for construction of efficient linear local features for digital signals and images description</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2010</year>
          ;
          <volume>34</volume>
          (
          <issue>3</issue>
          ):
          <fpage>374</fpage>
          -
          <lpage>381</lpage>
          . [In Russian]
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Gaidel</given-names>
            <surname>AV</surname>
          </string-name>
          .
          <article-title>A method for adjusting directed texture features in biomedical image analysis problems</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2015</year>
          ;
          <volume>39</volume>
          (
          <issue>2</issue>
          ):
          <fpage>287</fpage>
          -
          <lpage>293</lpage>
          [In Russian].
          <source>DOI: 10.18287/0134- 2452-2015-39-2-287-293</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Gaidel</surname>
            <given-names>AV</given-names>
          </string-name>
          .
          <article-title>Matched polynomial features for the analysis of grayscale biomedical images</article-title>
          .
          <source>Computer Optics</source>
          ,
          <year>2016</year>
          ;
          <volume>40</volume>
          (
          <issue>2</issue>
          ):
          <fpage>232</fpage>
          -
          <lpage>239</lpage>
          [In Russian].
          <source>DOI: 10</source>
          .18287/
          <fpage>2412</fpage>
          -6179-2016-40- 2-
          <fpage>232</fpage>
          -239.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fukunaga</surname>
            <given-names>K.</given-names>
          </string-name>
          <article-title>Introduction to statistical pattern recognition</article-title>
          . San Diego: Academic Press;
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>