=Paper=
{{Paper
|id=Vol-233/paper-20
|storemode=property
|title=A Framework and User Interface for Automatic Region Based Segmentation Algorithms
|pdfUrl=https://ceur-ws.org/Vol-233/p41.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/McGuinnessKAO06
}}
==A Framework and User Interface for Automatic Region Based Segmentation Algorithms==
Kevin McGuinness, Gordon Keenan, Tomasz Adamek, Noel O’Connor
Centre for Digital Video Processing, Dublin City University, Glasnevin, Dublin 9, Ireland.

Abstract: In this paper we describe a framework and tool developed for running and evaluating automatic region-based segmentation algorithms. The tool was designed to allow simple integration of existing and future segmentation algorithms, both single-image algorithms and those that operate on video data. Our framework supports plug-in segmenters, media decoders, and region-map codecs. We provide several sophisticated implementations of these plug-ins, including a video decoder capable of frame-accurate decoding of a large variety of video formats, an image decoder which also handles a comprehensive collection of formats, and an efficient implementation of a region-map codec. The tool includes both a graphical user interface to allow users to browse, visually inspect, and evaluate the algorithm output, and a batch processing interface for segmentation of large data collections.

The application allows researchers to focus more on the development and evaluation of segmentation methods, relying on the framework for encoding/decoding input and output, and the front end for visualization.

Index Terms: Image Segmentation, Video Segmentation, Framework, User Interface, Integration, Evaluation.

I. INTRODUCTION

Several different approaches to segmentation were developed and contributed by the partners in the K-Space project (Knowledge Space of semantic inference for automatic annotation and retrieval of multimedia content). Each method has its own particular merits and limitations, often as a result of being designed with a different application domain in mind. Generally, each tool has its own unique interface, and can only accept one or two input formats. Output formats also tend to vary across tools. With such a rich set of tools, the task of selecting and integrating the best tool for a given experiment or domain is time consuming and non-trivial.

Automatic evaluation of segmentation algorithms is a very difficult task. The effectiveness of an algorithm in a domain (semantic reasoning applications, search tasks) is often not possible to evaluate automatically. Most automatic evaluation methods compare, in some way, a manual human segmentation with an automatic segmentation, and produce a measure of the match. This is not usually an adequate representation of the usefulness of a segmentation in an application context. A user, however, may be able to intuitively determine which algorithm would be best for a particular domain context by simply examining some segmentation results.

As one of our research activities is the development, testing and evaluation of segmentation algorithms, we decided that a tool that would allow us to easily integrate currently available algorithms, and develop future ones, would be invaluable.

II. FEATURES AND FUNCTIONALITY

The following is an overview of the main features of the platform.

Image and Video Formats: The framework provides an interface for seekable, frame-accurate video decoding. The built-in video decoder supports many video formats, including MPEG-1, 2, 4, Motion-JPEG, Quicktime and WMF. We also provide an image decoder capable of decoding both individual images and sequences of key-frames transparently. It supports a large range of image formats, including JPEG, PNG, PNM, GIF and BMP.

Region-Map Format: The framework encodes region-maps using an efficient, portable format based on a subset of PNG. This allows segmenting video sequences with minimal space overhead.

User Interface: The user interface provides a range of functionality, including automatic decoder selection, concurrent browsing of video frames and segmented images, selected-range segmentation, useful visualization methods, and a simple interface for selecting algorithms and their parameters.

Fig. 1. Screenshot of the Application User Interface

Batch Processing Interface: The batch processing interface allows command line segmentation of large image/video collections. All the parameters that can be selected in the graphical user interface can be input into a parameter file. Files, ranges and increments can be selected for highly configurable segmentation.

III. ARCHITECTURAL OVERVIEW

Fig. 2. High Level Overview of Software Architecture.

The framework is arranged into three main areas. The top-level module, the Application, hosts the user interface, user preferences, batch processing interface and integration logic. The application layer implements all of its encoding, decoding and segmentation via interfaces specified in the module below this, the External API. This API consists of a set of interfaces for plug-in developers, as well as commonly required utilities to simplify development. The bottom layer contains all the plug-ins; built-in plug-ins and externally developed plug-ins are treated the same.

Application: The main components hosted by the Application module are the user interface and the batch interface.

The user interface provides a convenient and powerful way to perform segmentation operations, parameter selection, frame browsing, region visualization and plug-in configuration. This interface provides two visualization modes for viewing region maps: contrast stretching and color averaging.

The batch interface is designed for off-line processing of larger data sets. It is completely configurable from a parameter file, including decoder/segmenter/output selection and parameters, input files, ranges and increments. Output of batch operations can later be loaded and browsed in the user interface.
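As a rough illustration of how the range and increment settings of a batch parameter file might translate into the list of frames to be segmented, consider the following Java sketch. The paper does not specify the parameter file syntax, so the key names (input, segmenter, range.first, range.last, range.increment) and the class BatchJobSketch are assumptions made purely for this example and are not part of the framework.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the framework described in the paper.
public class BatchJobSketch {

    /** Returns first, first+step, ... up to and including last. */
    static List<Integer> selectFrames(int first, int last, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("increment must be positive");
        }
        List<Integer> frames = new ArrayList<>();
        // A long counter avoids integer overflow when last is close to Integer.MAX_VALUE.
        for (long i = first; i <= last; i += step) {
            frames.add((int) i);
        }
        return frames;
    }

    /** Reads the assumed range keys from parameter text and selects frames. */
    static List<Integer> framesFromParameters(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int first = Integer.parseInt(p.getProperty("range.first", "0"));
        int last = Integer.parseInt(p.getProperty("range.last", "0"));
        int step = Integer.parseInt(p.getProperty("range.increment", "1"));
        return selectFrames(first, last, step);
    }

    public static void main(String[] args) {
        // Assumed key=value layout; the real parameter file format may differ.
        String params = "input=clip.mpg\n"
                + "segmenter=rsst\n"
                + "range.first=0\n"
                + "range.last=100\n"
                + "range.increment=25\n";
        System.out.println(framesFromParameters(params)); // prints [0, 25, 50, 75, 100]
    }
}
```

Keeping the frame selection separate from the parsing means the same range logic could serve both a graphical front end and a command line front end, which is in the spirit of the shared parameter set described above.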
Segmentation: Developers wishing to integrate segmenters must implement the Segmenter interface. This includes all the functions required to configure parameters and perform the segmentation. When a segmenter is implemented and added to the platform, the algorithm name and parameter configuration will appear in the user interface.

The segmentation interface contains a segment method responsible for performing segmentation on a single frame. For each frame, the segment method is passed a context object. This contains information that may be required to perform the operation, including the frame and index, a frame decoder, a region map object, and an interface for acquiring previously segmented frames. This design allows each segmentation to be a single operation, while also providing enough contextual information for segmenters that require previous segmentations or frames. It simplifies the integration of single-frame segmenters, but provides enough information for segmenters that operate in the temporal domain. Of course, the internal implementation of a segmenter is entirely up to the developer, who may decide to buffer previous segmentations internally. In this case, no runtime overhead is incurred by the segmentation.

Image and Video Decoder: As the tool is frame based, a single interface is provided for both image and video decoders. This way the segmenter can handle single images (sequences of length 1), multiple images (e.g. key-frames) and videos in the same way. A powerful set of decoders is provided with the application, and the framework’s plug-in mechanism ensures additional decoders can easily be added.

The tool’s integrated video decoder provides frame-accurate decoding of multiple video formats. To achieve this, we decided to use the ffmpeg audio visual codec library [7] as a base for the video decoder. FFmpeg supports many video formats, so was ideal for our purposes. However, ffmpeg does not natively support frame-accurate video seeking. A frame-accurate decoder is required to ensure consistency across runs and for frame-accurate segmentation.

To attain fast, frame-accurate decoding from an arbitrary stream index, it was necessary to add a video packet parsing layer to determine (and sometimes interpolate) packet presentation timestamps, durations and other necessary information in advance of seeking in a stream. This and some additional functionality is provided by the ffmpeg proxy layer. Standalone C++ and Java interfaces were built for this layer, and are fully reusable.

One advantage of using ffmpeg as a base for the video decoder is that new codecs and improvements are constantly being added to it. As ffmpeg grows to support more formats, a simple recompilation of the ffmpeg proxy layer automatically adds this support to the tool.

The provided image and key-frame decoder plug-ins use the built-in Java image decoders as well as the JAI Image I/O library [6], which together support a comprehensive collection of image formats.

Region Storage: For the standard region map codec provided, we decided to utilize the open and widely accepted PNG format [1]. Specifically, we use the 8-bit and 16-bit gray-level PNG encodings. For region maps of fewer than 256 regions, we employ the 8-bit gray-level encoding; for more regions, the 16-bit gray-level format. The codec can thus support up to 65536 regions. Our experiments revealed that the compression rate of the codec was quite favorable. A typical segmentation of 10 seconds of MPEG-1 video (resolution 352x240, frame rate 29.97 fps) required less than 500 KB of storage.

An advantage of our chosen format is that it can be viewed in various imaging applications, simply by stretching the contrast between the regions. There are several freely available software libraries for decoding PNG images, such as libpng [5], ImageMagick, and JAI Image I/O, making the format suitable for interchange.

IV. INTEGRATED ALGORITHMS

For our experiments, we integrated the Syntactic Modified RSST Algorithm [2], a fast Mean-Shift Algorithm [3], and a version of the Normalized Cuts [4] algorithm. Work is currently in progress to integrate more algorithms into the framework.

V. FUTURE WORK

Possible enhancements for the framework include: more visualization algorithms, MPEG-7 region description output, an API for integrating automatic evaluation tools, and the ability to label regions for semantic reasoning applications. We would also like to use the framework components to develop a semi-automatic segmentation tool for ground truth generation.

VI. ACKNOWLEDGMENT

This material is based upon work supported by the European Commission under contract FP6-027026, K-Space: Knowledge Space of semantic inference for automatic annotation and retrieval of multimedia content.

REFERENCES

[1] Portable Network Graphics (PNG): Functional specification, ISO/IEC 15948:2004, March 2004.
[2] N. O’Connor, T. Adamek, S. Sav, N. Murphy, S. Marlow, Qimera: a software platform for video object segmentation and tracking, WIAMIS 2003, London, pp. 204-209, April 2003.
[3] W. Bailer, P. Schallauer, H. B. Haraldsson, H. Rehatschek, Optimized mean shift algorithm for color segmentation in image sequences, Proceedings of the SPIE, vol. 5685, pp. 522-529, 2005.
[4] J. Shi, J. Malik, Normalized Cuts and Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, August 2000.
[5] libpng, PNG reference library: http://www.libpng.org/.
[6] JAI Image I/O: https://jai-imageio.dev.java.net/.
[7] FFmpeg Multimedia System: http://ffmpeg.mplayerhq.hu/.