<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">TELECOM ParisTech at ImageClef 2009: Large Scale Visual Concept Detection and Annotation Task</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Marin</forename><surname>Ferecatu</surname></persName>
							<email>marin.ferecatu@telecom-paristech.fr</email>
							<affiliation key="aff0">
								<orgName type="institution">Institut Telecom, Telecom ParisTech</orgName>
								<orgName type="laboratory" key="lab1">CNRS LTCI</orgName>
								<orgName type="laboratory" key="lab2">UMR 5141</orgName>
								<address>
									<addrLine>46, rue Barrault</addrLine>
									<postCode>75634</postCode>
									<settlement>Paris Cedex</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Hichem</forename><surname>Sahbi</surname></persName>
							<email>hichem.sahbi@telecom-paristech.fr</email>
							<affiliation key="aff0">
								<orgName type="institution">Institut Telecom, Telecom ParisTech</orgName>
								<orgName type="laboratory" key="lab1">CNRS LTCI</orgName>
								<orgName type="laboratory" key="lab2">UMR 5141</orgName>
								<address>
									<addrLine>46, rue Barrault</addrLine>
									<postCode>75634</postCode>
									<settlement>Paris Cedex</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">TELECOM ParisTech at ImageClef 2009: Large Scale Visual Concept Detection and Annotation Task</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">CB3598A9391AD12969D839A2E9070AEE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T23:00+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing</term>
					<term>H.3.3 Information Search and Retrieval</term>
					<term>H.3.4 Systems and Software</term>
					<term>H.3.7 Digital Libraries</term>
					<term>H.2.3 [Database Management]: Languages - Query Languages</term>
					<term>Measurement</term>
					<term>Performance</term>
					<term>Experimentation</term>
					<term>Image annotation</term>
					<term>Canonical Correlation Analysis</term>
					<term>Text and image descriptors</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper we describe the participation of TELECOM ParisTech in the Large Scale Visual Concept Detection and Annotation Task at the ImageClef 2009 challenge. This year, the focus was on extending (i) the amount of data available for training and testing, and (ii) the number of concepts to be annotated. We use Canonical Correlation Analysis to infer a latent space where text and visual descriptions are highly correlated. Starting from a visual description of a test image, we first map it into the latent space, then we predict the underlying text features (and hence the annotations) that best fit the visual ones in the latent space. Our method is very fast while achieving good results.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Task description</head><p>The Large Scale Visual Concept Detection and Annotation Task (referred to here as VCDAT) offers the participating teams a unified framework for image annotation: the goal is to annotate each test image with keywords describing its visual content and its semantic interpretation. The task provides images annotated with 53 concepts; all images have multiple annotations and the concepts are organized in a small ontology. The participants are allowed to use the relations between concepts to solve the annotation task. The training set consists of 5,000 images annotated with the 53 visual concepts and the test data consists of 13,000 photos. The participants are allowed to use only the training data to tune their algorithms.</p><p>Two evaluation measures are proposed: (a) per concept: false positive and false negative rates and (b) per image: a hierarchical measure that considers partial matches and calculates misclassification costs for each missing or wrongly annotated concept, based on structure information (distance between concepts in the hierarchy) and relationships from the ontology <ref type="bibr" target="#b12">[13]</ref>.</p><p>This year, the VCDAT focuses on scaling annotation algorithms to thousands of images and possibly more, which is a very difficult task. Image annotation is still an unsolved problem and recent state-of-the-art algorithms perform less than satisfactorily on most image databases <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b4">5]</ref>. Image annotation is one branch of computer vision related to object detection and recognition; its goal is to decide whether an image contains one or multiple targeted objects and, if so, to find their locations. 
This problem is well studied and reasonably well solved for particular objects such as faces <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b16">17]</ref> but remains reputedly difficult for many other classes of objects <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b14">15]</ref>.</p><p>Generally, local approaches, for instance those relying on keypoint extraction or image segmentation, are likely to offer better results, but at the expense of a much higher computational effort <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b13">14]</ref>. Regardless of the computational issues, VCDAT uses 53 concepts and many of them are holistic<ref type="foot" target="#foot_0">1</ref>, so local (and also object-based) methods are unlikely to provide decent results at this particular level of difficulty. Furthermore, local approaches suffer from the extreme variability of objects (concepts) in scenes and from the limited number of training images available to capture this variability.</p><p>Instead, we focus on global approaches, i.e., those which extract global image descriptions and easily handle large scale databases and annotations. This scalability is achieved at the cost of a slight decrease in precision. Moreover, as we shall see, adding new concepts to our system and training it with them is straightforward and does not require a separate model for each one.</p><p>The remainder of this paper is organized as follows. We first describe our visual and text features (see §3), then we discuss the application of Canonical Correlation Analysis (CCA) to infer a latent space where the two underlying representations are highly correlated (§4). Given a visual description of a new (test) image, we first project it into the CCA latent space, then we infer text features as a linear combination of basic concepts that correlate best with the visual features. 
Finally, we back-project the resulting text features into the (input) concept space and we normalize the projection coefficients between 0 and 1. A value close to 1 means that the corresponding concept is likely to be present in an image, while a value close to 0 corresponds to an unlikely concept.</p></div>
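<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pipeline outlined above can be summarized in formulas. The following is only a sketch: the notation is introduced here for illustration and is an assumption of this summary. We write C_xx and C_yy for the covariance matrices of the (centered) visual and text training features, C_xy for their cross-covariance, W_x and W_y for the matrices whose columns are the leading canonical directions, x for a centered visual descriptor of a test image, z for its latent representation, and the superscript + for the Moore-Penrose pseudo-inverse.</p>
<formula notation="TeX">
\[
(w_x, w_y) = \arg\max_{w_x,\, w_y}
\frac{w_x^{\top} C_{xy}\, w_y}
     {\sqrt{w_x^{\top} C_{xx}\, w_x}\,\sqrt{w_y^{\top} C_{yy}\, w_y}}
\]
\[
z = W_x^{\top} x, \qquad \hat{y} = \left(W_y^{\top}\right)^{+} z
\]
</formula>
<p>Under these assumptions, the vector on the left of the last equation holds one raw score per concept (53 in this task); these scores are then normalized to the interval between 0 and 1 as described above.</p></div>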
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Text and visual content description</head><p>Visual descriptors. Global image descriptors have some properties that are very desirable in our case: (a) they have a small memory footprint and thus fit into standard PCs without any specific storage requirements; (b) they are very fast to compute as they involve simple distance computation operations, guaranteeing real-time responses; and (c) they do not include any a priori object model and thus can be applied to any target category. Indeed, global descriptors have been shown to perform well in this framework, for example with machine learning and data mining algorithms <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b3">4]</ref>.</p><p>More precisely, we use a combination of color, texture and shape features, as follows. To represent color we use weighted color histograms: they provide a summary description of the color information, including a spatial measure that emphasizes image regions that are interesting with respect to the visual content <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b0">1]</ref>. As texture features we use the power spectral density distribution in the complex plane. This has been shown to perform well when combined with color and shape histograms <ref type="bibr" target="#b10">[11]</ref>. Roughly, a high energy spectrum concentrated at low frequencies highlights large scale information in an image, while high frequencies correspond to textured regions (small scale details). In order to describe the shape content of an image we use standard edge orientation histograms. First, edges are extracted from images, then the gradient is computed using only the edge pixels. The orientation of the gradient is quantized w.r.t. the angle, resulting in a histogram that is sensitive to the general flow of lines in the image <ref type="bibr" target="#b7">[8]</ref>. 
More details on image descriptors can be found in <ref type="bibr" target="#b2">[3]</ref>.</p><p>Text descriptors. We use the annotations provided for the training set in order to compute the text features. The latter have 53 dimensions, one for each concept c, indicating the presence or the absence of c. The resulting feature vectors are very sparse; in fact, when applying principal component analysis (PCA), we found that 48 dimensions are sufficient to capture 100% of the statistical variance of the training data.</p></div>
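<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the edge orientation histogram described in this section more concrete, the following Python fragment gives a minimal sketch of the idea. It is an illustration under stated assumptions and not the implementation used in our experiments: a Sobel operator with a magnitude threshold stands in for the edge extraction step, and the function name, the number of bins and the threshold value are hypothetical choices.</p>
<p><code lang="python"><![CDATA[
```python
import math


def edge_orientation_histogram(image, n_bins=8, edge_threshold=1.0):
    """Normalized histogram of gradient orientations over edge pixels.

    `image` is a list of rows of gray levels. A pixel counts as an edge
    pixel when its Sobel gradient magnitude exceeds `edge_threshold`
    (an assumed, deliberately simple edge detector).
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel responses along the x and y axes.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if math.hypot(gx, gy) <= edge_threshold:
                continue  # not an edge pixel, so it does not vote
            # Fold the orientation into [0, pi): opposite gradients share a bin.
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[bin_index] += 1.0
    total = sum(hist)
    # Normalize so that images of different sizes are comparable.
    return [h / total for h in hist] if total > 0 else hist
```
]]></code></p>
<p>In this sketch only the pixels that pass the edge test vote, so the histogram reflects the general flow of lines and ignores flat regions; an image without any edge yields an all-zero histogram.</p></div>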
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Prediction using CCA</head><p>Canonical Correlation Analysis was first introduced by Hotelling <ref type="bibr" target="#b6">[7]</ref> and is used to capture linear relationships between two (or more) ordered<ref type="foot" target="#foot_1">2</ref> sample sets in different feature spaces. Canonical correlation analysis seeks a pair of linear transformations, one for each of the feature spaces, which map training and testing data into a common latent space. The latter is built in order to maximize the correlation between the sample sets in different feature spaces <ref type="bibr" target="#b5">[6]</ref>.</p><p>Given a test image, we first extract its visual feature vector and project it into the CCA latent space. Then, we back-project the latent feature vector into the 53 dimensions of the text space using the Moore-Penrose pseudo-inverse of the CCA transformation matrix. Annotations then correspond to the entries among the 53 dimensions where the score is larger than a given threshold.</p><p>Training data consists of 5,000 images sharing 53 concepts. Fig. <ref type="figure" target="#fig_1">1</ref> shows the distribution of the number of images across the different concepts. The most frequent one appears in 4656 images while the least frequent annotates only 18 images. Notice that both "very frequent" and "very rare" concepts are difficult to learn as the underlying positive and negative classes are clearly unbalanced. We randomly split the training set into two parts: one used for learning the CCA transform (4,000 images) and the other one used to evaluate the performance (1,000 images). Since the output of the algorithm has an asymptotically normal distribution, we normalize it to a mean of 0.5 and a standard deviation of 1/6. This ensures that 99.7% of the predicted scores lie between 0 and 1. Scores less than 0 (resp. larger than 1) are mapped to 0 (resp. 
1).</p><p>The evaluation measure we use is the annotation error defined as the expected false negatives and false positives. For each concept c, we fix a threshold τ(c) and we annotate images with c if the underlying scores are larger than τ(c). Notice that τ(c) is fixed in order to minimize the error rate. We then linearly shift τ(c) to 0.5 in order to comply with the submission format.</p><p>On these challenging test images, our annotation method achieves reasonable performance; the false positive error rate is 0.18 while the false negative one reaches 0.21. Nevertheless, our method is very efficient; in practice it took about a second to achieve training and prediction using a standard Pentium-M processor (2500 MHz).</p><p>We also extended our method in order to use the ontology suggested by the challenge. Text features were enriched using this ontology in order to include all intermediate concepts and then propagate the annotation along the hypernym tree. Notice that predictions include only the 53 concepts required by the benchmark. However, the ontology is too small to provide a noticeable improvement. Indeed, its total number of nodes is 68, of which 53 correspond to the candidate annotations. Again, we found that text features still lie in a subspace of 48 dimensions, which clearly shows that the new extended concepts provide the same amount of information as the initial ones.</p></div>
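<div xmlns="http://www.tei-c.org/ns/1.0"><p>The score post-processing described in this section (normalization to a mean of 0.5 and a standard deviation of 1/6, clipping to the interval between 0 and 1, per-concept threshold selection and the final shift to 0.5) can be sketched in Python as follows. This is a minimal illustration and not the code used in our experiments. The function names are hypothetical, the error is assumed to be the sum of the false positive and false negative rates, and the clipping applied after the shift is likewise an assumption.</p>
<p><code lang="python"><![CDATA[
```python
import statistics


def normalize_scores(raw_scores, target_mean=0.5, target_std=1.0 / 6.0):
    """Rescale raw scores to the target mean and std, then clip to [0, 1]."""
    mean = statistics.fmean(raw_scores)
    std = statistics.pstdev(raw_scores)
    if std == 0:
        return [target_mean for _ in raw_scores]
    rescaled = [target_mean + target_std * (s - mean) / std for s in raw_scores]
    # Scores below 0 are mapped to 0, scores above 1 are mapped to 1.
    return [min(1.0, max(0.0, s)) for s in rescaled]


def best_threshold(scores, labels):
    """Threshold for one concept minimizing FP rate + FN rate (assumed error).

    An image is annotated when its score is strictly larger than the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    candidates = [0.0] + sorted(set(scores))
    best_tau, best_error = 0.0, float("inf")
    for tau in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s > tau and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s <= tau and y)
        error = fp / max(negatives, 1) + fn / max(positives, 1)
        if error < best_error:
            best_tau, best_error = tau, error
    return best_tau


def shift_to_half(scores, tau):
    """Shift scores so that the decision threshold tau lands on 0.5."""
    return [min(1.0, max(0.0, s - tau + 0.5)) for s in scores]
```
]]></code></p>
<p>With a standard deviation of 1/6, the interval between 0 and 1 spans three standard deviations on each side of the mean, which is why about 99.7% of normally distributed scores fall inside it before clipping. After the shift, comparing a score with 0.5 gives the same decision as comparing the unshifted score with the selected threshold.</p></div>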
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusion and perspectives</head><p>In this work we described the participation of TELECOM ParisTech in the Large Scale Visual Concept Detection and Annotation Task at ImageClef 2009. This year the task focuses on the scalability of annotation methods to large databases. Consequently, we use global, fast and easy to compute image descriptors that require very few computational resources. Our method constructs a latent space, using Canonical Correlation Analysis, where text and image features are highly correlated. It is extremely fast: it runs in less than a second both for training and for testing on a standard 2.5 GHz PC, which makes annotation effective and efficient enough to handle large scale databases.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Number of images per concept.</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Holistic means that the annotation is based on a global impression of a scene and not necessarily related to its physical objects.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">One may define any arbitrary order for each sample set but should keep that order in different feature spaces.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://aveir.lip6.fr</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This work was supported by the French National Research Agency (ANR) under the AVEIR<ref type="foot" target="#foot_2">3</ref> project, ANR-06-MDCA-002.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Ikona: Interactive generic and specific image retrieval</title>
		<author>
			<persName><forename type="first">Nozha</forename><surname>Boujemaa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Julien</forename><surname>Fauqueur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marin</forename><surname>Ferecatu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">François</forename><surname>Fleuret</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Valérie</forename><surname>Gouet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bertrand</forename><surname>Le Saux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hichem</forename><surname>Sahbi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International workshop on Multimedia Content-Based Indexing and Retrieval (MMCBIR&apos;2001)</title>
				<meeting>the International workshop on Multimedia Content-Based Indexing and Retrieval (MMCBIR&apos;2001)</meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Image retrieval: Ideas, influences, and trends of the new age</title>
		<author>
			<persName><forename type="first">Ritendra</forename><surname>Datta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dhiraj</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jia</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1" to="60" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Image retrieval with active relevance feedback using both visual and keyword-based descriptors</title>
		<author>
			<persName><forename type="first">Marin</forename><surname>Ferecatu</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
		<respStmt>
			<orgName>INRIA-University of Versailles Saint Quentin-en-Yvelines, France</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Content-based image retrieval: An overview</title>
		<author>
			<persName><forename type="first">Theo</forename><surname>Gevers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Arnold</forename><forename type="middle">W M</forename><surname>Smeulders</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Emerging Topics in Computer Vision</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Medioni</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Kang</surname></persName>
		</editor>
		<imprint>
			<publisher>Prentice Hall</publisher>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A survey of methods for image annotation</title>
		<author>
			<persName><forename type="first">Allan</forename><surname>Hanbury</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Visual Languages and Computing</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="617" to="627" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Canonical correlation analysis: An overview with application to learning methods</title>
		<author>
			<persName><forename type="first">David</forename><forename type="middle">R</forename><surname>Hardoon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sandor</forename><surname>Szedmak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><surname>Shawe-Taylor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Computation</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="2639" to="2664" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Relations between two sets of variates</title>
		<author>
			<persName><forename type="first">H</forename><surname>Hotelling</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrika</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="312" to="377" />
			<date type="published" when="1936">1936</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Shape-based retrieval: a case study with trademark image databases</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vailaya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page" from="1369" to="1390" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Content-based multimedia information retrieval: State-of-the-art and challenges</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lew</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sebe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Djeraba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Jain</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Multimedia Computing, Communications, and Applications</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="19" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Distinctive image features from scale-invariant keypoints</title>
		<author>
			<persName><forename type="first">D</forename><surname>Lowe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Vision</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="91" to="110" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m">Introduction to MPEG-7: Multimedia Content Description Interface</title>
				<editor>
			<persName><forename type="first">B</forename><forename type="middle">S</forename><surname>Manjunath</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Salembier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Sikora</surname></persName>
		</editor>
		<imprint>
			<publisher>Wiley</publisher>
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A performance evaluation of local descriptors</title>
		<author>
			<persName><forename type="first">Krystian</forename><surname>Mikolajczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cordelia</forename><surname>Schmid</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis &amp; Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page" from="1615" to="1630" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Multilabel classification evaluation using ontology information</title>
		<author>
			<persName><forename type="first">Stefanie</forename><surname>Nowak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hanna</forename><surname>Lukashevich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the IRMLES Workshop</title>
				<meeting>of the IRMLES Workshop</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Towards category-level object recognition</title>
		<author>
			<persName><forename type="first">Jean</forename><surname>Ponce</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martial</forename><surname>Hebert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Cordelia</forename><surname>Schmid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><surname>Zisserman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>Springer</publisher>
			<biblScope unit="volume">4170</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Sharing visual features for multiclass and multiview object detection</title>
		<author>
			<persName><forename type="first">A</forename><surname>Torralba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Murphy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">T</forename><surname>Freeman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of CVPR</title>
				<meeting>of CVPR</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Upgrading color distributions for image retrieval: can we do better?</title>
		<author>
			<persName><forename type="first">Constantin</forename><surname>Vertan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nozha</forename><surname>Boujemaa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Visual Information Systems (Visual2000)</title>
				<imprint>
			<date type="published" when="2000-11">November 2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Sharing visual features for multiclass and multiview object detection</title>
		<author>
			<persName><forename type="first">P</forename><surname>Viola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jones</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of ICCV</title>
				<meeting>of ICCV</meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Face recognition: A literature survey</title>
		<author>
			<persName><forename type="first">W</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Chellappa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Phillips</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rosenfeld</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="399" to="458" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
