=Paper=
{{Paper
|id=Vol-233/paper-8
|storemode=property
|title=Combining Color with Spatial and Temporal Position of the Endoscopic Capsule for Improved Topographic Classification and Segmentation
|pdfUrl=https://ceur-ws.org/Vol-233/p17.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/CoimbraKCC06
}}
==Combining Color with Spatial and Temporal Position of the Endoscopic Capsule for Improved Topographic Classification and Segmentation==
M. Coimbra, J. Kustra, P. Campos, and J.P. Silva Cunha
Abstract—Capsule endoscopy is a recent technology with a clear need for automatic tools that reduce the long annotation times of its exams. We have previously developed a topographic segmentation method, which is now improved by using spatial and temporal position information. Two approaches are studied: using this information as a confidence measure for our previous segmentation method, and directly integrating this data into the image classification process. These allow us not only to know automatically when we have obtained results with error magnitudes close to human errors, but also to reduce these automatic errors to much lower values. All the developed methods have been integrated in the CapView annotation software, currently used in clinical practice by hospitals responsible for over 250 capsule exams per year, where we estimate that the two-hour annotation times are reduced by around 15 minutes.

Index Terms—Endoscopic capsule, image classification, biomedical engineering, medical imaging.

Manuscript received July 11, 2006. This work was supported in part by the IEETA institute and the Fundação para a Ciência e Tecnologia (grant nr. SFRH/BPD/20479/2004/YPH2). The authors would like to thank Dr. José Soares of the gastroenterology department of Santo António General Hospital in Porto, Portugal (www.hgsa.pt) for providing and annotating all the anonymous data that made this work possible and for his contributions regarding the medical importance of capsule endoscopy. M. Coimbra, J. Kustra, and P. Campos are with the IEETA institute, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal (email: {miguel.coimbra, jacek, pcampos}@ieeta.pt). J.P. Silva Cunha is with the Departments of Electronics, Telecommunications and Informatics of the University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal (email: jcunha@det.ua.pt).

I. INTRODUCTION

The clinical importance of the endoscopic capsule is now solidly established in the literature: Iddan [1], Qureshi [2], etc. Due to space limitations, we refer to our previous work [3, 4] for more extensive details on the capsule and its clinical importance. All this work attempts to solve an important limitation of the endoscopic capsule: excessively long annotation times. Currently it takes about 2 hours to fully view and annotate an exam and write its corresponding report. Our clinical studies show that the task of topographic segmentation is both difficult (the median error made by three senior capsule specialists was about 400 images) and time-consuming (around 15 minutes can be saved by automation). The main contribution of this paper is the improvement of our previous topographic segmentation methods, which use color and texture, by incorporating not only temporal but also spatial position information in the image classification process.

II. METHODS

The ultimate objective of the presented methods is to reliably divide the video of the gastrointestinal tract into its 4 constituent parts (entrance, stomach, small intestine, large intestine) and thus determine its corresponding junctions (eso-gastric junction, pylorus, ileo-cecal valve).

A. Capsule Position and Velocity

We can theoretically estimate the spatial position of a capsule via antenna signal triangulation. We have selected 47 capsule exams where a clinical specialist manually annotated the temporal locations of the pylorus (tPYL) and of the ICV (tICV) in the video using the CapView annotation software. We have then used our automatic topographic segmentation algorithm to determine these same temporal locations. Using our 2D position information, we can then obtain the corresponding spatial locations: xPYL, yPYL, xICV, yICV, etc. For comparison purposes, these were normalized. Besides analysing 2D position information, we have also looked at the average capsule displacement velocity (the modulus of the displacement vector between two points with temporal references t and t+1).
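The displacement-velocity measure just described is simple to express in code. Below is a minimal sketch, assuming capsule positions arrive as a per-frame sequence of normalized (x, y) coordinates; the function name and data layout are illustrative, not taken from the CapView implementation.

```python
import numpy as np

def displacement_velocity(positions: np.ndarray) -> np.ndarray:
    """Modulus of the displacement vector between consecutive positions.

    positions: shape (T, 2), one normalized (x, y) row per temporal
    reference t. Returns shape (T - 1,), where entry t is the displacement
    between t and t + 1; with one sample per frame this doubles as a
    per-frame velocity estimate.
    """
    deltas = np.diff(positions, axis=0)    # displacement vectors
    return np.linalg.norm(deltas, axis=1)  # their moduli

# Example: a capsule that moves, pauses, then moves again.
track = np.array([[0.10, 0.20], [0.12, 0.24], [0.12, 0.24], [0.15, 0.30]])
print(displacement_velocity(track))        # [0.0447... 0.     0.0670...]
```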
B. Topographic Segmentation Algorithm

Our previously developed automatic topographic segmentation method, from now on referred to as TSA, is described in Coimbra [4].

C. Spatial Information as a Confidence Measure

Two high-confidence areas were defined, one for the pylorus and another for the ICV. We have measured the median segmentation error SEz for all marks (z12, the eso-gastric junction; z23, the pylorus; z34, the ileo-cecal valve) and the error SE for all exams whose junctions lie inside and outside these areas. The results presented in Section III show that this information is indeed useful as a confidence measure for automatic segmentation results.
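As a concrete illustration of the confidence measure, the sketch below tests whether an automatically estimated junction location falls inside its high-confidence area. The paper does not specify the geometry of these areas, so axis-aligned rectangles in the normalized coordinate space are assumed here, and the bounds are placeholders rather than the values actually used.

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """Axis-aligned rectangle in the normalized 2D position space."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

# Hypothetical high-confidence areas, one per junction (placeholder bounds).
HIGH_CONFIDENCE = {
    "pylorus": Rect(-0.4, 0.2, -0.6, 0.0),
    "icv": Rect(-0.2, 0.4, -0.2, 0.4),
}

def is_high_confidence(junction: str, x: float, y: float) -> bool:
    """True if the estimated junction position lies in its high-confidence
    area, i.e. the automatic segmentation result can be flagged as having
    an expected error magnitude close to a human annotator's."""
    return HIGH_CONFIDENCE[junction].contains(x, y)
```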
D. Integrating Spatial and Temporal Information for Classification

An alternative way of using this information is to use it directly for individual image classification. Our previous method trained 4 SVM classifiers, one for each zone, and determined the topographic section each image belongs to as the classifier with the highest positive distance to the SVM hyperplane (see [4] for details). We can, however, use these distances to build a feature vector for each image, along with additional information such as spatial and temporal location. Our new feature vector F is now defined as:

F = [x, y, Z1, Z2, Z3, Z4, V, t]   (1)

where x and y are the normalized spatial location coordinates, Z1, Z2, Z3, Z4 are the SVM classifier results [4] (distances to the SVM hyperplanes), V is the spatial velocity, and t is the temporal location in number of frames. The combination of these different features into a single vector requires that all coefficients are previously normalized. A variety of well-known distances was used for classification (L1 norm, Euclidean, Mahalanobis). Finally, we have measured the relevance of each coefficient for the segmentation process using a step-wise elimination analysis.
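To make the classification step concrete, the sketch below builds the feature vector of Eq. (1) and assigns an image to the zone with the nearest per-zone prototype under one of the listed distances; the Max Z rule of the earlier method is included for comparison. The prototype representation and function names are illustrative assumptions; the paper only specifies the feature vector, the normalization requirement, and the candidate distances.

```python
import numpy as np

def build_feature_vector(x, y, z, v, t):
    """F = [x, y, Z1, Z2, Z3, Z4, V, t] as in Eq. (1).

    All inputs are assumed already normalized; z holds the four SVM
    hyperplane distances Z1..Z4.
    """
    return np.concatenate(([x, y], z, [v, t]))

def max_z(z):
    """Previous method ('Max Z' in Table 2): the zone whose SVM reports
    the largest positive hyperplane distance."""
    return int(np.argmax(z))

def classify(f, prototypes, metric="l2", cov_inv=None):
    """Assign f to the zone whose prototype (e.g. the per-zone mean of
    training feature vectors, an assumption here) is closest."""
    if metric == "l1":
        d = [np.abs(f - p).sum() for p in prototypes]
    elif metric == "l2":
        d = [np.linalg.norm(f - p) for p in prototypes]
    elif metric == "mahalanobis":
        # cov_inv: inverse covariance matrix of the training features
        d = [np.sqrt((f - p) @ cov_inv @ (f - p)) for p in prototypes]
    else:
        raise ValueError(f"unknown metric: {metric}")
    return int(np.argmin(d))
```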
III. RESULTS

An analysis of Table 1 and Figure 1 shows that high-confidence areas contain almost all correct estimations, and low-confidence areas mainly contain incorrect estimations.

[Figure 1: two scatter plots over the normalized 2D position space (both axes from -1 to 1), one for the ICV (ICV-Correct, ICV-Error, High Confidence) and one for the pylorus (Pylorus-Correct, Pylorus-Error, High Confidence).]
Fig. 1. Spatial distribution of correct (green) and incorrect (blue) estimations. Points in high-confidence areas are highlighted with a black bounding box. We can observe that most correct detections fall into high-confidence areas, while incorrect ones are more distributed over the whole 2D space.

Table 1. Numerical analysis of the spatial distribution of automatic topographic estimations. Accuracy = correct estimations / total estimations; recall = correct estimations / total annotations; mean and median segmentation errors are given in number of images.

                      Pylorus                   ICV
                 Accuracy   Recall        Accuracy   Recall
  Correct          80 %      93 %           58 %      96 %
  Incorrect        89 %      70 %           92 %      39 %

                 Mean Err.  Median Err.   Mean Err.  Median Err.
  All              2157       158           3966       844
  High-confidence   493        45           2096       246

Table 2. Segmentation results using various distances and classifiers for feature vector F. L1, L2, and Full Multivariate were previously defined. Max Z corresponds to our previously used classification method [4], which is the maximum positive distance to the SVM hyperplanes. Finally, we use Mahalanobis distances on a reduced feature vector F = [Z1, Z2, Z3, Z4] (Multivariate Color).

                        Accuracy    SE    SE-EGJ  SE-PYL  SE-ICV
  L1                     82.2%     2285      7      50     2228
  L2                     83.1%     1730      5      23     1702
  Max Z                  79.4%     3063      5      16     3042
  Multivariate Color     77.4%     1052      5      22     1025
  Full Multivariate      79.7%     2285      6     433     1846

Table 3. SE values as coefficients are removed from the feature vector F in a step-wise elimination process. In each step we eliminate the coefficient that produces the minimum SE when removed from the vector (these cells were marked in light grey in the original table; discrepancies were highlighted in dark grey). The corresponding individual classification accuracy is presented instead. Columns correspond to successive elimination steps.

            Step 1   Step 2   Step 3   Step 4   Step 5   Step 6   Step 7
  Maximum   83.6%    83.1%    83.5%    82.8%    82.2%    82.1%    79.2%
  x         82.3%    82.3%    83.1%    82.8%    82.2%    82.1%    79.2%
  y         83.6%    83.1%    83.1%    82.8%    82.2%    82.1%    79.2%
  Z1        81.8%    81.2%    81.7%    80.8%    79.9%    79.2%    79.2%
  Z2        81.9%    81.0%    82.8%    82.8%    82.2%    82.1%    79.2%
  Z3        80.9%    82.6%    83.4%    79.3%    70.3%    69.8%    68.7%
  Z4        80.8%    82.5%    83.5%    82.2%    82.2%    82.1%    79.2%
  V         83.2%    82.4%    83.1%    82.8%    82.1%    82.1%    79.2%
  t         80.4%    79.9%    79.9%    79.5%    66.4%    63.0%    54.2%
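The elimination procedure behind Table 3 can be sketched as backward step-wise feature selection: at each step, drop the coefficient whose removal yields the lowest segmentation error. The evaluate callback below, mapping a subset of feature indices to an SE value, stands in for the full segmentation pipeline and is assumed rather than given in the paper.

```python
def stepwise_elimination(feature_names, evaluate):
    """Backward step-wise elimination; yields (removed_name, error) pairs.

    evaluate: callable taking a list of kept feature indices and returning
    the resulting segmentation error (lower is better).
    """
    remaining = list(range(len(feature_names)))
    while len(remaining) > 1:
        # Try removing each remaining coefficient in turn.
        trials = [(evaluate([i for i in remaining if i != j]), j)
                  for j in remaining]
        best_err, victim = min(trials)  # cheapest coefficient to lose
        remaining.remove(victim)
        yield feature_names[victim], best_err

# Usage, with the Eq. (1) coefficients:
# for name, err in stepwise_elimination(
#         ["x", "y", "Z1", "Z2", "Z3", "Z4", "V", "t"], evaluate):
#     print(f"removed {name}: SE = {err}")
```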
IV. DISCUSSION

Results show that doctors can trust that automatic segmentation errors in high-confidence areas are as low as human ones. Including other information has allowed us to improve segmentation results significantly. The step-wise elimination analysis has shown us that the most relevant features for segmentation are the capsule temporal position and the color recognition of the entrance and small intestine topographic sections. It has also shown us that spatial location is not a relevant factor for individual image classification.
REFERENCES

[1] G. Iddan, G. Meron, A. Glukhovsky, and P. Swain, "Wireless Capsule Endoscopy", Nature, vol. 405, 2000, p. 417.
[2] W.A. Qureshi, "Current and future applications of the capsule camera", Nature Reviews Drug Discovery, vol. 3, 2004, pp. 447-450.
[3] M. Coimbra and J.P. Silva Cunha, "MPEG-7 visual descriptors – Contributions for automated feature extraction in capsule endoscopy", IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 5, 2006, pp. 628-637.
[4] M. Coimbra, P. Campos, and J.P. Silva Cunha, "Topographic Segmentation and Transit Time Estimation for Endoscopic Capsule Exams", in Proc. of IEEE ICASSP 2006, Toulouse, France, 2006.