<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Shape Extraction Framework for Similarity Search in Image Databases</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jan</forename><surname>Klíma</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Software Engineering</orgName>
								<orgName type="institution">Charles University in Prague</orgName>
								<address>
									<addrLine>Malostranské nám. 25</addrLine>
									<postCode>118 00</postCode>
									<settlement>Prague</settlement>
									<region>FMP</region>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Department of Software Engineering</orgName>
								<orgName type="institution">Charles University in Prague</orgName>
								<address>
									<addrLine>Malostranské nám. 25</addrLine>
									<postCode>118 00</postCode>
									<settlement>Prague</settlement>
									<region>FMP</region>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Tomáš</forename><surname>Skopal</surname></persName>
							<email>tomas.skopal@mff.cuni.cz</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Software Engineering</orgName>
								<orgName type="institution">Charles University in Prague</orgName>
								<address>
									<addrLine>Malostranské nám. 25</addrLine>
									<postCode>118 00</postCode>
									<settlement>Prague</settlement>
									<region>FMP</region>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Department of Software Engineering</orgName>
								<orgName type="institution">Charles University in Prague</orgName>
								<address>
									<addrLine>Malostranské nám. 25</addrLine>
									<postCode>118 00</postCode>
									<settlement>Prague</settlement>
									<region>FMP</region>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Shape Extraction Framework for Similarity Search in Image Databases</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">1916BFB070FDE418585BB157DA872FAA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The task of similarity search in image databases has been studied for decades, while there have been many feature extraction techniques proposed. Among the mass of low-level techniques dealing with color, texture, layout, etc., an extraction of shapes provides better semantic description of the content in raster image. However, even such specific task as shape extraction is very complex, so the mere knowledge of particular raster transformation and shape-extraction techniques does not give us an answer what methods should be preferred and how to combine them, in order to achieve the desired effect in similarity search.</p><p>In this paper we propose a framework consisting of low-level interconnectable components, which allows the user to easily configure the flow of transformations leading to shape extraction. Based on experiments, we also propose typical scenarios of transformation flow, with respect to the best shape-based description of the image content.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Similarity search in image databases <ref type="bibr" target="#b9">[10]</ref> is becoming increasingly important, due to rapidly growing volumes of available image data. Simultaneously, the textbased image retrieval systems become useless, since the requirements on manual annotation exceed human possibilities and resources. The metadata-based search systems are of similar kind, we need an additional explicit information to effectively describe multimedia objects (e.g. structured semantic description, as class hierarchies or ontologies), which is not available in most cases. <ref type="foot" target="#foot_0">1</ref> The only practicable way how to search the vast volumes of raw image data is the content-based similarity search, i.e. we consider the real content of each particular raster image (e.g. a photography), where the images are ranked according to similarity to a query image (the example). Only such images are retrieved, which have been ranked as sufficiently similar to the query image. The similarity measure returns a real-valued similarity score for any two models of multimedia objects.</p><p>Unlike other similarity search applications, the task of highly semantic contentbased search in image/video databases is extremely difficult. Because of generally unrestricted origination of a particular raster image, its visual content is not structured and, on the other side, hides rich semantics (as perceived by human). The most general techniques providing extraction of distinguished features from an image are based on processing of low-level characteristics, like color histograms, texture correlograms, color moments, color layout (possibly considered under spatial segmentation). Unfortunately, the low-level features emphasize just local/global relationships between pixels (their colors, respectively), hence, they do not capture high-level (semantic) features. In turn, usage of the low-level features in similarity search tasks leads to poor retrieval results, which is often referred to as the "semantic gap" <ref type="bibr" target="#b9">[10]</ref>.</p><p>In real-world applications, a design of high-level feature extraction is restricted to the domain-specific image databases. For instance, images of human faces can be processed so that biometric features (like eyes, nose, chin, etc.) are identified and properly represented. Although domain-specific image retrieval systems reach high effectiveness in precision/recall, they cannot be used to manage heterogeneous collections, e.g. images on web.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Shape Extraction</head><p>The shapes (contours, bounded surfaces) in an image could be understood as a medium-level feature, since shape is an entity recognized also by human's perception (unlike low-level features). Moreover, shape is a practical feature for query-by-sketch support (i.e. a query-by-example where the "example" consists of user-drawn strokes), where extraction of colors or textures is meaningless. Shape Reconstruction. The most common technique is to vectorize the contiguous lines or areas in the raster image. Prior to this, the image has to be preprocessed still on the raster basis (edge detection <ref type="bibr" target="#b16">[17]</ref>, smoothing, morphologic operations, skeletonization, etc.). The subsequent raster-to-vector transformation step follows (e.g. a binary mask is vectorized into a set of (poly)lines).</p><p>Naturally, we are not done at this moment, the hardest task is to filter and combine the "tangle" of short lines (as typically produced) into several (or even single) distinguished major shapes (polylines/polygons). This involves polyline simplification <ref type="bibr" target="#b10">[11]</ref>, removal of artifacts, line connection, etc. The most complex but invaluable part of shape reconstruction should derive the prototypical shape which is approximated by the vectorized information obtained so far.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Shape Representation &amp; Similarity</head><p>Measuring. Once we have sufficiently simplified shapes found in an image, we have to represent them in order to support measuring of similar shapes. The polygonal representation itself is not very suitable for similarity measuring, because of high sensitivity to translation, scale, rotation, orientation, noise, distortion, skew, vertex spacing/offset, etc. More likely, the raw shape is often transformed into a single vector or time series <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b8">9]</ref>, where the shape characteristics are preserved but the transformation non-invariant characteristics are removed. The time series representations are usually measured by Euclidean distance, Dynamic time warping <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b6">7]</ref>, Longest common subsequence <ref type="bibr" target="#b14">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Motivation &amp; Paper Contributions</head><p>As overviewed above, the process of shape extraction (starting from the raster image and producing a vector or time series) is very complex task, where there do not exist general recommendations about particular transformation/extraction steps. In this paper, we propose a component-based framework allowing to encapsulate and connect various image/vector-processing algorithms. Hence, a network consisting of many components can be easily created, through which a particular shape extraction scenario is configured. Following this framework, we have also implemented a catalogue of basic components which, when properly connected, can provide an effective environment for shape extraction experimentation. Finally, we present several domain scenarios for shape-extraction based on experimental results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3">Related Work</head><p>Although we are aware of many existing image processing libraries, most of them lack support for end-user dataflow configuration or vector processing.</p><p>"Filters" <ref type="bibr" target="#b0">[1]</ref> is a library of image, video and vector processing functions, based on idea of configurable filters that perform various tasks. Dataflow configuration is obtained by hardcoding or via python scripts.</p><p>"JaGrLib" <ref type="bibr" target="#b11">[12]</ref> is a 2D/3D graphics library primarily aimed for educational purposes. JaGrLib offers both XML and GUI oriented dataflow configuration, but currently has limited shape extraction capabilities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">IVP Framework</head><p>The idea of Image and Vector Processing Framework <ref type="bibr" target="#b1">[2]</ref> is to separate objects that usually figure in image processing (color bitmaps, grayscale bitmaps, binary bitmaps, gradient maps, vectors, polylines, etc.) and algorithms which work with these objects on an input → output basis. Each algorithm can be considered as a black box that expects certain input data and produces a defined kind of output data. With this point of view the whole shape extraction application reduces to a network of algorithms that send data to each other. An example of the idea is depicted in Figure <ref type="figure" target="#fig_1">1</ref>.</p><p>This gives a view into the IVPF design: it's advantageous to code and store algorithms separately, the role of the client application is to allow user to specify which algorithms should be used, and also their output → input dependence. In the final effect, many specialized applications can be implemented (configured respectively) at high application level, just by specifying the algorithms, their settings and mutual dependencies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Interface &amp; Components</head><p>Each particular component class encapsulating<ref type="foot" target="#foot_2">2</ref> an algorithm (as mentioned above) must implement the IIVPFComponent interface in such a fashion that all  public component features are accessible in a simple and transparent way. In particular, using IIVPFComponent interface the components are interconnected (port registration), and checked for compatibility.</p><p>Higher Level Functionality. At a higher application level, the components are just configured and their ports connected together to create a network. This can be done either via GUI or by loading configuration/connections from an XML file. During the whole network execution, all components are run starting from pure output components (no input ports) and propagating the intermediate data through the entire network to pure input components (no output ports). The whole approach enables to change the behavior of the network in two ways:</p><p>by changing component(s) configuration by changing the structure of the entire component network</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Component Catalogue</head><p>In this section we propose several components already implemented in the IVP framework. In order to ease the understanding, we use the following formal declaration of a component:</p><formula xml:id="formula_0">type of input → component name (parameters) → type of output</formula><p>where the type of input/output we distinguish either bitmap (any 2D bitmap of color, intensity, gradient, etc.) or vectors (collection of polygons/polylines). The single arrow '→' means there is just a single type of connection to input/output supported, while the double arrow '⇒' supports several types of input/output<ref type="foot" target="#foot_3">3</ref> . The parameters declared in parentheses are component-specific, thus they have to be configured by the user.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Image Processing Components</head><p>The first class provides components serving as the basic-processing tools, which do eliminate resolution dependencies, filter the noise, and separate pixels within a specified range of colors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Image Loader Component</head><formula xml:id="formula_1">ImageFromFile (FileName) ⇒ bitmap (of colors) + bitmap (of intensities)</formula><p>To process an image, it must be loaded from a file first. During this phase grayscale image representation is computed and offered simultaneously. In noisy images where one would expect problematic edge detection, smoothing step is required. Gaussian smoothing is a well-established method and the implementation allows to configure both the Sigma parameter of the Gaussian function and the window size.</p><formula xml:id="formula_2">Image</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Edge Detection Components</head><p>For edge detection, the Canny operator <ref type="bibr" target="#b7">[8]</ref> was chosen as a main approach, as it is acceptably stable and configurable. The edge detection is performed in multiple stages, starting on an intensity (grayscale) bitmap, which usually involve 1. Source image smoothing (typically by Gaussian convolution) 2. Gradient approximation using first derivative operator 3. Non-maximum suppression 4. Hysteresis thresholding to identify edge pixels</p><p>The Canny operator is formed by a chain of components (the latter described below): GaussianIntensityFilter → GradientOperator → NonMaximaSuppression → HysteresisThresholding. For an example of the dataflow see Figure <ref type="figure" target="#fig_2">2</ref>. Gradient map obtained by first derivative operator often contains thick regions with high gradient magnitude but to extract edges one would like areas with high gradient magnitude to be as much thin as possible. Non-maximum suppression archives this by ignoring pixels where the gradient magnitude is not maximal in the gradient direction. Based on two thresholds gradient map is traced and edge pixels are extracted. Each pixel with gradient magnitude greater than the Upper threshold is marked as edge immediately. Remaining pixels are marked as edges only if they have their gradient magnitude greater than the Lower threshold and if they are connected to some existing edge pixel chain.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Hysteresis Thresholding Component</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Binary Image Processing Components</head><p>The following components process binary bitmap inputs (e.g. obtained from the edge detection). These components could be used to simplification of contours. Refinement of binary images is often needed (to close gaps, round curved objects, etc.). The operators of erosion, dilatation, opening and closing (combined with a properly chosen mask) are usually a good choice to handle some input image defects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Erosion And Dilatation Component</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Thinning Component bitmap (binary) → Thinning (ThinningType) → bitmap (binary)</head><p>Thinning components that implement Stendiford <ref type="bibr" target="#b13">[14]</ref> and Zhang-Suen <ref type="bibr" target="#b15">[16]</ref> thinning algorithms are handy when it comes to polish results given by edge detection, or when there is a need to turn thick objects into one pixel thin lines. A staircase removal to refine lines contained within the binary image also fits in this category of algorithms. An example of the thinning process is given in Figure <ref type="figure" target="#fig_5">3</ref> (the first two images). In some cases (when the binary image contains thick objects), an information about contour is more valuable than the object's thinned form. Implemented method is based on approach found in <ref type="bibr" target="#b2">[3]</ref> although it uses complete image matrix instead of run length encoding as the binary image representation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Vectorization Component bitmap (binary) → Vectorization → vectors</head><p>The vectorization component is responsible to turn binary image with marked edge pixels into a set of vectors (polylines). It is based on approach mentioned by <ref type="bibr" target="#b2">[3]</ref> or <ref type="bibr" target="#b5">[6]</ref>.</p><p>First, the input binary mask is went through by shifting a 3x3 window over every pixel in order to mark critical points. Those are either endpoints (pixels with only one neighbor within the 3x3 window) or intersections (pixels with more than 2 neighbors).</p><p>In second phase all polylines between critical points are traced. Another pass through the image is needed then to identify closed polylines that were left untouched by the previous step. The resulting pixel chains are turned immediately into polylines along with junction and connectivity information (i.e. with the topology).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Vector Processing Components</head><p>Once we get a vectorized form of shape, we move to waters of geometry and graphs algorithms. Although we got rid of the raster information, we now face a tangle of (poly)lines to be meaningfully managed.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Polyline Simplification Component vectors → PolylineSimplification(Type, Error) → vectors</head><p>The polylines obtained from vectorization usually carry much more information than required. It also happens that there are undesired irregularities in polylines caused by straightforward vectorization. Hence, a polyline simplification is needed. Multiple approaches are implemented, including those of Douglas-Peucker <ref type="bibr" target="#b10">[11]</ref> and Rosin-West <ref type="bibr" target="#b12">[13]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Artifact Removal Component vectors→ShortVectorRemoval(ArtifactsType,ArtifactCharacteristics)→vectors</head><p>Aside from polyline simplification, the resulting vector image contains artifacts at higher logical level. For a list list o typical artifacts see Figure <ref type="figure" target="#fig_6">4</ref>. Artifact removal component handles these artifacts based on configuration and produces result with reduced noise and longer (more meaningful) vectors, as shown in Figure <ref type="figure" target="#fig_6">4</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Iterative Pruning vectors → IterativePruning (UpperBound) → vectors</head><p>One of the goals of vector processing is to find a given number of the most significant vectors to represent the original image. With this component an experiment was made to find if a metric based on vector length is a good criterion for selection the most significant vectors.</p><p>The algorithm (our original design) works as follows: An upper bound is selected by user that represents the maximum number of vectors to obtain. First, all vectors are sorted with respect to their lengths. The algorithm works in iterations and tends toward the desired number of vectors. In order to guarantee convergence, a half of the vectors are expected to be thrown away in each iteration (the number of vectors to throw away can be easily configured as well, giving slower of faster convergence).</p><p>Which vectors should be thrown away is decided upon their lengths(short vectors are considered first) and upon the knowledge similar to that used in Artifact Removal Component (most probable artifacts are thrown away first). A typical output is depicted in Figure <ref type="figure" target="#fig_7">5</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Polyline Connection Component vectors→PolylineConnection(Scenarios,ScenariosCharacteristics)→vectors</head><p>It happens (especially in edge detection) that a significant polyline gets divided into multiple parts as a result of inaccurate thresholding, noise or overlap.</p><p>Three phase algorithm is employed to join parts together into bigger entities. The first phase operates on polylines that have their endpoints very close to each other and can be joined together without additional heuristics. Second phase takes care of larger gaps between endpoints with respect to multiple criteria (vector length, distance, orientation, etc.). Third phase tries to handle disconnected corners. Figure <ref type="figure" target="#fig_8">6</ref> features typical scenarios of these three stages. All phases are configurable and optionally offer many important concepts like double sided check, identical angle orientation check etc. Many of these concepts are mentioned in <ref type="bibr" target="#b5">[6]</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Vector Output Components</head><p>The last group of components provides an output to external storage (now file). Besides an one-to-one export to a well-known format (like WMF, DXF), components in this class cover also secondary feature extraction techniques needed for particular tasktypically representations for subsequent similarity measuring/search.</p><p>Vectors Output Component vectors → VectorToFile(FileName, Format)</p><p>This component saves its vector input into a specified format (DXF, textfile, etc.).</p><p>Angles Extraction Component vectors → AnglesToFile(FileName, AngleType, NumberOfAngles)</p><p>A specialized feature component transforming polylines to (time) series. Each polyline is normalized first by splitting it to a number of parts of the same length. A set of angles is then saved to the output. The angles can be measured as the smaller angles between each pair of successive line segments, as clockwise (or counter-clockwise) angles between them, or as angles between each line segment and the X-axis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Domain Scenarios</head><p>The components of the previously introduced catalogue have been designed in order to easily create scenarios suitable for extraction of simple shapes. Hence, we are primarily interested in simplified representation of shapes inside an image, rather than in a comprehensive image-to-shape transformation preserving as much information as possible.</p><p>A simplified shape could serve as a descriptor involved in measuring similarity of two images (based on shapes found inside). The requirement on small-sized descriptor is justified by handling the shape information by similarity measure. Since the similarity measure is supposed to be evaluated many times on entities of a huge image database, the similarity measuring should be as fast (yet precise) as possible. However, this goal can be achieved by a smart shape extraction providing prototypical descriptors. An image represented by a single (or very few) polylines/polygons limited to several tens or hundreds of vertices (say up to 1 kilobyte) -this is our desirable descriptor.</p><p>On the other side, we are aware of the difficulties when trying to establish such a simple descriptor. Therefore, we have performed experimentation on various images (photography, drawing, etc.) and tried to assemble several configurations of component networks (called domain scenarios) which gave us the best results for a particular image type (with respect to desirable descriptor properties). In the following sections we present three such scenarios.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Drawing</head><p>As for the drawings, the vectorization task is slightly simpler thanks to the fact that the source image contains the desired information in an easily identifiable form (monochromatic strokes describing shapes, colored areas in case of cartoons, etc.). The extracted layer or layers described by binary images can be further processed by means of thinning (in case of strokes) or contour extraction (in case of thick areas). The result often embodies only a low level of noise and can be directly used as is or further simplified. In Figure <ref type="figure" target="#fig_9">7</ref> see the scheme of component configuration and interconnection under Drawing scenario. Note that the two branches to two different types of shape extraction (skeleton and contour). For an example of data flow regarding to the Drawing scenario, see Figure <ref type="figure" target="#fig_10">8</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Simple Object</head><p>For high contrast images containing unsophisticated shapes, the edge detection alone is a reliable way to extract required feature information. When this is known, the artifact removal is a relatively safe operation without the risk of removing important features. A reconnection of disconnected lines (which follows then) almost completely reconstructs the full shape information. Finally, the polyline simplification should be done to straighten jagged lines and minimize the produced number of line segments. In Figure <ref type="figure" target="#fig_12">10</ref> see the scheme of component configuration and interconnection under the Simple object scenario. For an example of data flow, see Figure <ref type="figure" target="#fig_12">10</ref>.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Complex Scene</head><p>In real-world images (photos), the edge detection cannot guarantee getting clean shapes, on the contrary there are usually huge amounts of false detected or unwanted edges. The iterative pruning is supposed to take care of most of the "trash" in the vector output and even then, maximum effort must be directed into connecting disrupted polylines, corner detection and polygonal approximation. In Figure <ref type="figure" target="#fig_13">11</ref> see the scheme of component configuration and interconnection under the Complex scene scenario. For an example of data flow, see Figure <ref type="figure" target="#fig_14">12</ref>.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions &amp; Future Work</head><p>In this paper we have presented a highly configurable framework for shape extraction from raster images. Based on the framework, we have proposed a catalogue of components, which have been designed to be easily configured into a network. Based on experiments, we have recommended three domain scenarios for extraction of simple shapes, in order to create useful descriptors for similarity search applications.</p><p>In the future we would like to investigate similarity measures suitable for shapebased similarity search. An extraction of simple prototypical shapes from images (as proposed in this paper) is crucial for similarity measuring, so it is an unavoidable step when trying to "bridge the semantic gap" in image retrieval. Furthermore, we would like to automate the scenario recommendation process (where each component in scenario evaluates the goodness of what it produces), resulting in a kind of unsupervised extraction technique.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig.1. Algorithms as black boxes: The example shows algorithms A-G which form an application with arrows representing their output → input dependence. A is an input algorithm that vectorizes the input image and sends the resulting vectors into the network. B-C and E-F-G are branches that transform their vector input somehow. D takes two vectorial inputs and chooses one, based on certain criteria. Then both D and G save their outputs in specified formats (to a file/stream/anything else).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. An example showing input image, image's gradient map and marked edge pixels.</figDesc><graphic coords="6,49.04,220.09,113.46,96.45" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Input binary image along with its thinned form and refined vector result containing n = 18 polylines.</figDesc><graphic coords="7,102.67,313.01,70.05,70.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. The first picture gives examples of the most typical small artifacts: a. unconnected polylines b. insignificant cycles c. 1-connected polylines d. spurious loops e. insignificant junction connections. The second picture shows an input vectorized image. The third picture shows the output of the artifact removal component (configured to ignore short artifacts) followed by the Douglas-Peucker polyline simplification algorithm.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Fig. 5 .</head><label>5</label><figDesc>Fig. 5. The first picture shows a line drawing containing n = 12,617 polylines made of m = 38,433 line segments. The second picture shows the result of the iterative pruning component (with UpperBound = 20) combined with Douglas-Peucker simplification. The output contains n = 19 polylines with m = 473 line segments.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Fig. 6 .</head><label>6</label><figDesc>Fig.6. An example demonstrating typical scenarios when connecting polylines: a. connection of lines based on their orientation, angle, . . . to create one longer polyline b. closing disrupted cycles c. connecting near endpoints to form a corner d. connecting far endpoints to form a corner, guessing corner shape.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Fig. 7 .</head><label>7</label><figDesc>Fig. 7. Scenario 1 -Drawing.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Fig. 8 .</head><label>8</label><figDesc>Fig. 8. Scenario 1 -Drawing -Example of data flow.</figDesc><graphic coords="11,94.44,344.73,72.05,72.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_11"><head>Fig. 9 .</head><label>9</label><figDesc>Fig. 9. Scenario 2 -Simple object.</figDesc><graphic coords="12,220.38,199.76,71.87,72.11" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_12"><head>Fig. 10 .</head><label>10</label><figDesc>Fig. 10. Scenario 2 -Simple object -Example of data flow.</figDesc><graphic coords="12,87.60,199.98,72.04,72.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_13"><head>Fig. 11 .</head><label>11</label><figDesc>Fig. 11. Scenario 3 -Complex scene.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_14"><head>Fig. 12 .</head><label>12</label><figDesc>Fig. 12. Scenario 3 -Complex scene -Example of data flow.</figDesc><graphic coords="13,297.44,155.65,98.26,73.69" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">The image search at images.google.com is a successful example of a metadata-based engine, where the metadata are extracted from the web pages referencing the images.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2">The framework has been implemented in .NET framework 2.0.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_3">Naturally, an output port of one component can be connected to input ports of multiple components (providing we obey port compatibility).</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments.</head><p>This research has been partially supported by GA ČR grant 201/05/P036 provided by the Czech Science Foundation.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<ptr target="http://sourceforge.net/projects/filters/" />
		<title level="m">Filters library</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Image and vector processing framework</title>
		<editor>cuni</editor>
		<imprint/>
	</monogr>
	<note>available at siret</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Introduction to Interpretation of Graphic Images</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Ablameyko</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1997">1997</date>
			<publisher>SPIE</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Efficient contour-based shape representation and matching</title>
		<author>
			<persName><forename type="first">T</forename><surname>Adamek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>O'connor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th ACM SIGMM international workshop on Multimedia information retrieval</title>
				<meeting>the 5th ACM SIGMM international workshop on Multimedia information retrieval<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="138" to="143" />
		</imprint>
	</monogr>
	<note>MIR &apos;03</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A multiscale representation method for nonrigid shapes with a single closed contour</title>
		<author>
			<persName><forename type="first">T</forename><surname>Adamek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">E</forename><surname>O'connor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Circuits Syst. Video Techn</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="742" to="753" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Digitalization of map</title>
		<author>
			<persName><forename type="first">P</forename><surname>Altman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
		<respStmt>
			<orgName>Charles University, Department of Software and Computer Science Education</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Master&apos;s thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Warp: Accurate retrieval of shapes using phase of fourier descriptors and time warping distance</title>
		<author>
			<persName><forename type="first">I</forename><surname>Bartolini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Patella</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Pattern Anal. Mach. Intell</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="142" to="147" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
	<note>Member-Paolo Ciaccia</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A computational approach to edge detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Canny</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. Pattern Anal. Mach. Intell</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="679" to="698" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A survey of shape similarity assessment algorithms for product design and manufacturing applications</title>
		<author>
			<persName><forename type="first">A</forename><surname>Cardone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Karnik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Computing and Information Science in Engineering</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="109" to="118" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Image Databases : Search and Retrieval of Digital Imagery</title>
		<author>
			<persName><forename type="first">V</forename><surname>Castelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Bergman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2002">2002</date>
			<publisher>Wiley-Inter</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Algorithms for the reduction of the number of points required to represent a line or its caricature</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">H</forename><surname>Douglas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Peucker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Canadian Cartographer</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="112" to="122" />
			<date type="published" when="1973">1973</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Jagrlib: Library for computer graphics education</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pelikán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kostlivý</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">WSCG (Posters)</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="125" to="128" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Segmentation of edges into lines and arcs</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">L</forename><surname>Rosin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A W</forename><surname>West</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Image Vision Comput</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="109" to="114" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Some new heuristics for thinning binary handprinted characters for ocr</title>
		<author>
			<persName><forename type="first">F</forename><surname>Stentiford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mortimer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Systems Man and Cybernetics</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="81" to="84" />
			<date type="published" when="1983">1983</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Indexing multidimensional time-series with support for multiple distance measures</title>
		<author>
			<persName><forename type="first">M</forename><surname>Vlachos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hadjieleftheriou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Gunopulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Keogh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD &apos;03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="216" to="225" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A fast parallel algorithm for thinning digital patterns</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">Y</forename><surname>Suen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="236" to="239" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Edge detection techniques -an overview</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ziou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tabbone</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Pattern Recognition and Image Analysis</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="537" to="559" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
