=Paper=
{{Paper
|id=Vol-233/paper-20
|storemode=property
|title=A Framework and User Interface for Automatic Region Based Segmentation Algorithms
|pdfUrl=https://ceur-ws.org/Vol-233/p41.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/McGuinnessKAO06
}}
==A Framework and User Interface for Automatic Region Based Segmentation Algorithms==
<pdf width="1500px">https://ceur-ws.org/Vol-233/p41.pdf</pdf>
<pre>
A Framework and User Interface for Automatic Region Based
                Segmentation Algorithms
                                 Kevin McGuinness, Gordon Keenan, Tomasz Adamek, Noel O’Connor


   Abstract: In this paper we describe a framework and tool developed for
running and evaluating automatic region based segmentation algorithms. The
tool was designed to allow simple integration of existing and future
segmentation algorithms, both single image based algorithms and those that
operate on video data. Our framework supports plug-in segmenters, media
decoders, and region-map codecs. We provide several sophisticated
implementations of these plug-ins, including a video decoder capable of
frame-accurate decoding of a large variety of video formats, an image
decoder which also handles a comprehensive collection of formats, and an
efficient implementation of a region-map codec. The tool includes both a
graphical user interface to allow users to browse, visually inspect, and
evaluate the algorithm output, and a batch processing interface for
segmentation of large data collections.
   The application allows researchers to focus more on the development and
evaluation of segmentation methods, relying on the framework for
encoding/decoding input and output, and the front end for visualization.

   Index Terms: Image Segmentation, Video Segmentation, Framework, User
Interface, Integration, Evaluation.


                           I. INTRODUCTION
   Several different approaches to segmentation were developed and
contributed by each of the partners in the K-Space(1) project. Each method
has its own particular merits and limitations, often as a result of being
designed with a different application domain in mind. Generally, each tool
has its own unique interface, and can only accept one or two input formats.
Output formats also tend to vary across tools. With such a rich set of
tools, the task of selecting and integrating the best tool for a given
experiment or domain is time consuming and non-trivial.
   Automatic evaluation of segmentation algorithms is a very difficult task.
It is often not possible to evaluate automatically how effective an
algorithm is in a given domain (e.g. semantic reasoning applications or
search tasks). Most automatic evaluation methods compare, in some way, a
manual human segmentation with an automatic segmentation, and produce a
measure of the match. This is not usually an adequate representation of the
usefulness of a segmentation in an application context. A user, however, may
be able to intuitively determine what algorithm would be best for a
particular domain context by simply examining some segmentation results.
   As one of our research activities is development, testing and evaluation
of segmentation algorithms, we decided that a tool that would allow us to
easily integrate currently available algorithms, and develop future ones,
would be invaluable.

                   II. FEATURES AND FUNCTIONALITY
   The following is an overview of the main features of the platform.
     Image and Video Formats: The framework provides an interface for
seekable, frame-accurate video decoding. The built-in video decoder supports
many video formats, including MPEG-1, 2, 4, Motion-JPEG, Quicktime and WMF.
We also provide an image decoder capable of decoding both individual images
and sequences of key-frames transparently. It supports a large range of
image formats, including JPEG, PNG, PNM, GIF and BMP.
     Region-Map Format: The framework encodes region-maps using an
efficient, portable format based on a subset of PNG. This allows segmenting
video sequences with minimal space overhead.
     User Interface: The user interface provides a range of functionality,
including automatic decoder selection, concurrent browsing of video frames
and segmented images, selected-range segmentation, useful visualization
methods, and a simple interface for selecting algorithms and their
parameters.
     Batch Processing Interface: The batch processing interface allows
command line segmentation of large image/video collections. All the
parameters that can be selected in the graphical user interface can be input
into a parameter file. Files, ranges and increments can be selected for
highly configurable segmentation.

                    III. ARCHITECTURAL OVERVIEW
Fig. 2.   High Level Overview of Software Architecture.

   The framework is arranged into three main areas. The top-level module,
the Application, hosts the user interface, user preferences, batch
processing interface and integration logic. The application layer implements
all of its encoding, decoding and segmentation via interfaces specified in
the module below this, the External API. This API consists of a set of
interfaces for plug-in developers, as well as commonly required utilities to
simplify development. The bottom layer contains all the plug-ins; built-in
plug-ins and externally developed plug-ins are treated the same.
     Application: The main components hosted by the Application module are
the user interface and the batch interface.
   The user interface provides a convenient and powerful way to perform
segmentation operations, parameter selection, frame browsing, region
visualization and plug-in configuration. This interface provides two
visualization modes for viewing region maps: contrast stretching and color
averaging.
   The batch interface is designed for off-line processing of larger data
sets. It is completely configurable from a parameter file, including
decoder/segmenter/output selection and parameters, input files, ranges and
increments. Output of batch operations can later be loaded and browsed in
the user interface.

  Centre for Digital Video Processing, Dublin City University, Glasnevin,
Dublin 9, Ireland.
  (1) K-Space: Knowledge Space of semantic inference for automatic
annotation and retrieval of multimedia content.
Fig. 1.   Screenshot of the Application User Interface
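
   The contrast-stretching visualization mode mentioned for the user
interface can be illustrated with a minimal sketch. The paper does not give
the mapping it uses, so the linear stretch below is an assumption, and the
class and method names (ContrastStretch, stretch, render) are hypothetical.
The idea is only that consecutive region labels, which would look almost
identical as raw gray levels, are spread across the full 0 to 255 range.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of the framework described in the paper.
final class ContrastStretch {

    // Map a region label in [0, regionCount) linearly onto a gray level in [0, 255].
    static int stretch(int label, int regionCount) {
        if (regionCount <= 1) {
            return 0; // a single region has nothing to contrast with
        }
        return (int) Math.round(label * 255.0 / (regionCount - 1));
    }

    // Render a label map (labels[y][x]) as an 8-bit gray image for display.
    static BufferedImage render(int[][] labels, int regionCount) {
        int height = labels.length;
        int width = labels[0].length;
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Write the raw sample so no color conversion is applied.
                image.getRaster().setSample(x, y, 0, stretch(labels[y][x], regionCount));
            }
        }
        return image;
    }
}
```

   A linear stretch keeps neighbouring labels ordered; a viewer that needs
adjacent regions to differ more strongly could permute the labels first.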

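
   The batch interface is described as fully configurable from a parameter
file, including input files, ranges and increments. The paper does not
specify the file syntax, so the following sketch assumes a Java properties
file with hypothetical keys (input, segmenter, range.start, range.end,
increment). It shows only how a range and increment could be turned into the
list of frame indices to segment.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical batch configuration. The key names are assumptions,
// not the tool's actual parameter-file syntax.
final class BatchConfig {
    final String input;      // path of the image or video to segment
    final String segmenter;  // name of the segmenter plug-in to run
    final int start;         // first frame index (inclusive)
    final int end;           // last frame index (inclusive)
    final int increment;     // step between processed frames

    BatchConfig(Properties p) {
        input = p.getProperty("input");
        segmenter = p.getProperty("segmenter");
        start = Integer.parseInt(p.getProperty("range.start", "0"));
        end = Integer.parseInt(p.getProperty("range.end", "0"));
        increment = Integer.parseInt(p.getProperty("increment", "1"));
        if (increment < 1 || end < start) {
            throw new IllegalArgumentException("invalid range or increment");
        }
    }

    // Parse parameter-file text in java.util.Properties syntax.
    static BatchConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new BatchConfig(p);
    }

    // Frame indices the batch run would visit, in order.
    List<Integer> frames() {
        List<Integer> indices = new ArrayList<>();
        for (int i = start; i <= end; i += increment) {
            indices.add(i);
        }
        return indices;
    }
}
```

   With range.start=10, range.end=20 and increment=5, frames() yields the
indices 10, 15 and 20.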

     Segmentation: Developers wishing to integrate segmenters must implement
the Segmenter interface. This includes all the functions required to
configure parameters and perform the segmentation. When a segmenter is
implemented and added to the platform, the algorithm name and parameter
configuration will appear in the user interface.
   The segmentation interface contains a segment method responsible for
performing segmentation on a single frame. For each frame, the segment
method is passed a context object. This contains information that may be
required to perform the operation, including the frame and index, a frame
decoder, region map object, and an interface for acquiring previously
segmented frames. This design allows each segmentation to be a single
operation, while also providing enough contextual information for segmenters
that require previous segmentations or frames. It thus simplifies the
integration of single frame based segmenters while still supporting
segmenters that operate in the temporal domain. Of course, the internal
implementation of a segmenter is entirely up to the developer, who may
decide to buffer previous segmentations internally. In this case, no runtime
overhead is incurred by the segmentation.

     Image and Video Decoder: As the tool is frame based, a single interface
is provided for both image and video decoders. This way the segmenter can
handle single images (sequences of length 1), multiple images (e.g.
key-frames) and videos in the same way. A powerful set of decoders is
provided with the application, and the framework’s plug-in mechanism ensures
additional decoders can easily be added.
   The tool’s integrated video decoder provides frame-accurate decoding of
multiple video formats. To achieve this, we decided to use the FFmpeg
audio-visual codec library [7] as a base for the video decoder. FFmpeg
supports many video formats, so was ideal for our purposes. However, FFmpeg
does not natively support frame-accurate video seeking. A frame-accurate
decoder is required to ensure consistency across runs and for frame-accurate
segmentation.
   To attain fast, frame-accurate decoding from an arbitrary stream index,
it was necessary to add a video packet parsing layer to determine (and
sometimes interpolate) packet presentation timestamps, durations and other
necessary information in advance of seeking in a stream. This and some
additional functionality is provided by the FFmpeg proxy layer. Standalone
C++ and Java interfaces were built for this layer, and are fully reusable.

   One advantage of using FFmpeg as a base for the video decoder is that new
codecs and improvements are constantly being added to it. As FFmpeg grows to
support more formats, a simple recompilation of the FFmpeg proxy layer
automatically adds this support to the tool.

   The provided image and key-frame decoder plug-ins use the built-in Java
image decoders as well as the JAI Image I/O library [6], which together
support a comprehensive collection of image formats.

     Region Storage: For the standard region map codec provided, we decided
to utilize the open and widely accepted PNG format [1], specifically its
8-bit and 16-bit gray-level encodings.
   For region maps of fewer than 256 regions, we employ the 8-bit gray-level
encoding; for more regions, the 16-bit gray-level format. The codec can thus
support up to 65536 regions. Our experiments revealed that the compression
rate of the codec was quite favorable. A typical segmentation of 10 seconds
of MPEG-1 video (resolution 352x240, frame rate 29.97 fps) required less
than 500 KB of storage.
   One advantage of our chosen format is that it can be viewed in various
imaging applications, simply by stretching the contrast between the regions.
Several freely available software libraries decode PNG images, such as
libpng [5], ImageMagick, and JAI Image I/O, making the format suitable for
interchange.

                     IV. INTEGRATED ALGORITHMS
   For our experiments, we integrated the Syntactic Modified RSST Algorithm
[2], a fast Mean-Shift Algorithm [3], and a version of the Normalized Cuts
[4] algorithm. Work is currently in progress to integrate more algorithms
into the framework.

                          V. FUTURE WORK
   Possible enhancements for the framework include: more visualization
algorithms, MPEG-7 region description output, an API for integrating
automatic evaluation tools, and the ability to label regions for semantic
reasoning applications. We would also like to use the framework components
to develop a semi-automatic segmentation tool for ground truth generation.

                        VI. ACKNOWLEDGMENT
   This material is based upon work supported by the European Commission
under contract FP6-027026, K-Space: Knowledge Space of semantic inference
for automatic annotation and retrieval of multimedia content.

                            REFERENCES
[1] Portable Network Graphics (PNG): Functional specification, ISO/IEC
    15948:2004, March, 2004.
[2] N. O’Connor, T. Adamek, S. Sav, N. Murphy, S. Marlow, Qimera: a
    software platform for video object segmentation and tracking, WIAMIS
    2003, London, pp. 204-209, Apr., 2003.
[3] W. Bailer, P. Schallauer, H. B. Haraldsson, H. Rehatschek, Optimized
    mean shift algorithm for color segmentation in image sequences,
    Proceedings of the SPIE, vol. 5685, pp. 522-529, 2005.
[4] J. Shi, J. Malik, Normalized Cuts and Image Segmentation, IEEE
    Transactions on Pattern Analysis and Machine Intelligence, vol. 22,
    no. 8, pp. 888-905, Aug., 2000.
[5] libpng, PNG reference library: http://www.libpng.org/.
[6] JAI Image I/O: https://jai-imageio.dev.java.net/.
[7] FFmpeg Multimedia System: http://ffmpeg.mplayerhq.hu/.
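
   The Segmenter plug-in contract described in Section III (parameter
configuration plus a per-frame segment method that receives a context
object) can be sketched as follows. The paper names the Segmenter interface
and the segment method but gives no signatures, so every type and method
signature below is a hypothetical illustration. The frame decoder that the
real context also carries is left out for brevity, and the thresholding
segmenter is a toy stand-in, not one of the integrated algorithms.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical region map: one region label per pixel, row-major.
final class RegionMap {
    final int width;
    final int height;
    final int[] labels;

    RegionMap(int width, int height) {
        this.width = width;
        this.height = height;
        this.labels = new int[width * height];
    }
}

// Hypothetical context handed to the segmenter for each frame.
interface SegmentationContext {
    int frameIndex();                   // index of the current frame
    int[] frame();                      // gray values of the frame, row-major
    RegionMap regionMap();              // output region map to fill in
    RegionMap previous(int frameIndex); // earlier result, or null if none
}

// Hypothetical plug-in contract: a name and parameters for the user
// interface, plus a single-frame segmentation operation.
interface Segmenter {
    String name();
    Map<String, String> parameters();
    void setParameter(String key, String value);
    void segment(SegmentationContext context);
}

// Toy single-frame segmenter: splits pixels into two regions by threshold.
final class ThresholdSegmenter implements Segmenter {
    private final Map<String, String> params = new LinkedHashMap<>();

    ThresholdSegmenter() {
        params.put("threshold", "128");
    }

    public String name() { return "Threshold (toy example)"; }
    public Map<String, String> parameters() { return params; }
    public void setParameter(String key, String value) { params.put(key, value); }

    public void segment(SegmentationContext context) {
        int threshold = Integer.parseInt(params.get("threshold"));
        int[] frame = context.frame();
        int[] labels = context.regionMap().labels;
        for (int i = 0; i < frame.length; i++) {
            labels[i] = frame[i] >= threshold ? 1 : 0;
        }
    }
}

// Minimal context for a single image (a sequence of length 1).
final class SingleFrameContext implements SegmentationContext {
    private final int[] frame;
    private final RegionMap map;

    SingleFrameContext(int[] frame, int width, int height) {
        this.frame = frame;
        this.map = new RegionMap(width, height);
    }

    public int frameIndex() { return 0; }
    public int[] frame() { return frame; }
    public RegionMap regionMap() { return map; }
    public RegionMap previous(int frameIndex) { return null; }
}
```

   A temporal segmenter would use the same segment entry point and consult
previous(frameIndex - 1) instead of ignoring it.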

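   The region storage scheme (8-bit gray-level PNG for fewer than 256
regions, 16-bit gray-level PNG otherwise) can be sketched with the standard
javax.imageio PNG plug-in. This is an illustrative sketch under that
assumption, not the codec shipped with the tool; the class RegionMapPng and
its methods are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical region-map codec built on the JDK's PNG support.
final class RegionMapPng {

    // Follow the rule stated in the paper: fewer than 256 regions use 8 bits,
    // anything larger (up to 65536 regions) uses 16 bits.
    static int bitDepthFor(int regionCount) {
        if (regionCount < 1 || regionCount > 65536) {
            throw new IllegalArgumentException("region count out of range");
        }
        return regionCount < 256 ? 8 : 16;
    }

    // Encode row-major labels as a gray-level PNG held in memory.
    static byte[] encode(int[] labels, int width, int height, int regionCount) {
        int type = bitDepthFor(regionCount) == 8
                ? BufferedImage.TYPE_BYTE_GRAY
                : BufferedImage.TYPE_USHORT_GRAY;
        BufferedImage image = new BufferedImage(width, height, type);
        // Write raw samples through the raster so no color conversion
        // can alter the label values.
        image.getRaster().setSamples(0, 0, width, height, 0, labels);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("no PNG writer available");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode a gray-level PNG back into row-major labels.
    static int[] decode(byte[] png) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(png));
            return image.getRaster().getSamples(
                    0, 0, image.getWidth(), image.getHeight(), 0, (int[]) null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

   Because the labels are stored as ordinary gray levels, the resulting file
opens in any PNG-capable viewer, which is the interchange property noted
under Region Storage.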
</pre>