<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>An Evaluation of Search Strategies for User-Generated Video Content</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christopher G. Harris</string-name>
          <xref ref-type="aff" rid="aff0"/>
        </contrib>
        <aff id="aff0">
          <institution>Informatics Program, The University of Iowa</institution>
          ,
          <addr-line>Iowa City, IA 52242</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <volume>8</volume>
      <issue>2012</issue>
      <abstract>
        <p>As the amount of user-generated content (UGC) on websites such as YouTube has experienced explosive growth, the demand for searching for relevant content has expanded at a similar pace. Unfortunately, the minimal required production effort and the decentralization of content make these searches problematic. In addition, most UGC search efforts rely on notoriously noisy user-supplied tags and comments. In this paper, we examine UGC search strategies on YouTube using video requests from several knowledge markets such as Yahoo! Answers. We compare crowdsourcing and student search efforts to YouTube's own search interface and apply these strategies to different types of information needs, ranging from easy to difficult. We evaluate our findings using two different assessment methods and discuss how the relative time and financial costs of these three search strategies affect our results.</p>
      </abstract>
      <kwd-group>
        <kwd>CrowdSearch</kwd>
        <kwd>crowdsourcing</kwd>
        <kwd>search strategies</kwd>
        <kwd>user-generated content</kwd>
        <kwd>YouTube</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Copyright © 2012 for the individual papers by the papers’ authors.
Copying permitted for private and academic purposes. This volume is published
and copyrighted by its editors.</p>
      <p>CrowdSearch 2012 workshop at WWW 2012, Lyon, France.</p>
      <p>YouTube is among the most searched terms on Google [2] and was the third-most visited
website as measured by Alexa [3]. Many visitors do more than
simply view content: more video content is uploaded to
YouTube in 60 days than the three major US networks have
created in the past 60 years [1].</p>
      <p>The lower required production effort, exponential growth, and
decentralization of UGC videos often make searches for
specific content challenging: to compensate, searchable content on
UGC websites is often restricted to producer-supplied categories
and tags or obtained from viewer-supplied comments. YouTube
comment text is frequently noisy and insufficient to produce a set
of content and/or context terms from which to search effectively
[4]. In addition, despite the Web 2.0 features YouTube has
integrated to encourage user participation, an examination by Cha
et al. [5] found the level of active user participation is remarkably
low: comments on YouTube videos are provided by a mere
0.16% of total viewers. This limited contribution to searchable
text also has a negative impact on search quality.</p>
      <p>
        Categories used on UGC websites are often too broad and lack the
discriminative power for use in most searches; YouTube, for
example, contains 15 broad categories with labels such as Autos &amp;
Vehicles, Comedy, and Education. In contrast, producer-supplied
tags on UGC websites are usually quite sparse and do not always
represent the true video content. In a study of more than one
million YouTube videos conducted by Geisler and Burns in [4],
the median number of tags applied per video was 6.0. One of the
study’s findings was that many tags did not adequately describe the
actual video content. Rarely do the terms used by video content
producers match those used in searches, as Bischoff et al.
illustrated in [
        <xref ref-type="bibr" rid="ref1">6</xref>
        ]. For example, people tagging music videos
would likely use terms associated with its genre, such as “rock,”
whereas people generally do not search for music videos via
genre, instead opting for searches containing song title and/or
artist.
      </p>
      <p>
        Despite these shortcomings, the search function on YouTube’s
website remains the most frequently used method to find videos,
according to Zhou et al. [
        <xref ref-type="bibr" rid="ref2">7</xref>
        ], yet many user queries for UGC go
unsatisfied. Knowledge market websites, such as Yahoo!
Answers1 and Answers.com2, contain unfulfilled and
partially-fulfilled user requests for videos; as of January 2012, Yahoo!
Answers alone had more than 250,000 requests for assistance to
locate videos for a specific information need. Some studies, such
as one conducted by Dearman and Truong [
        <xref ref-type="bibr" rid="ref3">8</xref>
        ] and another by
Bian et al. [
        <xref ref-type="bibr" rid="ref4">9</xref>
        ] found that inadequate phrasing of a question and/or
corresponding answer on knowledge market websites negatively
affects utility. Consequently, the ability to effectively search for
UGC, particularly on rare or noisy topics, remains a challenge.
Crowdsourcing may provide a viable solution for searching UGC.
The use of the crowd as a search strategy is compelling; it
introduces diversity of search terms since different members of
the crowd will apply different search strategies based on their
familiarity with the search topic. Moreover, the crowd has been
shown to provide good quality in studies involving relevance
judgments. Even with diversity, we can still expect search quality:
some studies on prediction in crowdsourcing systems demonstrate
that reliability of the average of predicted scores by the crowd
improves as the size of the crowd increases [
        <xref ref-type="bibr" rid="ref5 ref6">10, 11</xref>
        ]. Likewise,
search quality is expected to improve as the number of searchers
in the crowd expands. Crowdsourcing contrasts with knowledge
markets in level of engagement; Nielsen mentions in [
        <xref ref-type="bibr" rid="ref7">12</xref>
        ] that
over 90% of knowledge market participants fail to
contribute; crowdsourcing, by contrast, introduces a direct
financial incentive to motivate task participation.
      </p>
      <p>The objective of this paper is to examine whether the crowd can provide
a more precise set of UGC search results, given a query,
compared with other multimedia search tools. The contributions
of this paper are as follows. First, we compare the retrieval
performance of different retrieval models in terms of precision on
several categories using UGC video requests taken from leading
knowledge market websites. We then compare YouTube’s own
search interface with a search conducted by students as well as a
search approach using crowdsourcing. We evaluate our results
using two methods: mean average precision determined after
applying pooling, and a simple list preference, where the entire
list of videos judged relevant by each method is compared.
The remainder of the paper is organized as follows. In Section 2
we put our work in the context of previous work. In Section 3 we
discuss our experimental setup. Section 4 offers a discussion of
the results. We conclude and provide insight into future work in
Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>
        Even prior to Web 2.0, there has been significant research in
multimedia search methods, including several organized
competitions that involve traditional search strategies. The
popular TRECVid [
        <xref ref-type="bibr" rid="ref8">13</xref>
        ] benchmarking competition focuses on the
detection of specific features within non-UGC multimedia
collections. Wikipedia Retrieval, a task in ImageCLEF [
        <xref ref-type="bibr" rid="ref9">14</xref>
        ]
involves locating relevant images from the Wikipedia image
collection based on a provided text query and several sample
images. While Wikipedia Retrieval examines noisy and
unstructured textual annotations in Wikipedia multimedia, the
semi-structured content evaluated in ImageCLEF is far less noisy
and more structured than content searches on YouTube.
Several studies have examined search quality on user-supplied
tags in other Web 2.0 applications. Diversity of image tag search
results in Flickr using an implicit relevance feedback model is
explored by van Zwol et al. [
        <xref ref-type="bibr" rid="ref10">15</xref>
        ], concluding that diversity is an
important component when retrieval is based on small data sets,
such as those found in image tags. Hotho et al. explore
folksonomy tagging, which is bound by the same noisy
unstructured restrictions as YouTube tags [
        <xref ref-type="bibr" rid="ref11">16</xref>
        ], but their study was
primarily focused on recommender systems usage of these tags.
Others have examined multimedia search effectiveness on
knowledge market websites, such as Chua et al. in [
        <xref ref-type="bibr" rid="ref12">17</xref>
        ] and Li et
al. in [
        <xref ref-type="bibr" rid="ref13">18</xref>
        ]; however, their focus is to locate all content addressing
a specific question (e.g. “how to” and “why” question types)
whereas the focus of our study is on finding and ranking videos
that fulfill a specific search request (e.g., “help find a video”).
A few studies have examined the effectiveness of crowds on noisy
data searches. Steiner et al. demonstrated event
detection methods for searching YouTube videos at the fragment level [
        <xref ref-type="bibr" rid="ref14">19</xref>
        ].
Hsueh et al. examined searches in political blogs in [
        <xref ref-type="bibr" rid="ref15">20</xref>
        ] which,
although noisy, do not experience the restrictions inherent in
multimedia tags. In [
        <xref ref-type="bibr" rid="ref16">21</xref>
        ], Yan et al. provided an innovative
approach called CrowdSearch, which offers near-real-time
assessment of images. Although the authors’ focus was on
labeling images, their approach could feasibly be extended to
locating similar media on YouTube.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. EXPERIMENTAL SETUP</title>
    </sec>
    <sec id="sec-4">
      <title>3.1 Retrieval Process</title>
      <p>
        Our objective is to compare the search results obtained from
crowdsourcing, human searchers, and YouTube’s own search
interface. YouTube’s search interface is a version of Google’s
search that has been refined for YouTube, and represents a
significant share of Google-based searches. Since late 2008,
metrics confirm that video searches on YouTube account for
more than a quarter of all Google search queries in the U.S., and a
similar share in most other countries [
        <xref ref-type="bibr" rid="ref17">22</xref>
        ]. We began by
extracting a set of 100 questions randomly taken from four
knowledge market sources (Yahoo! Answers, Answers.com,
Blurtit3 and Allexperts4) that contained the terms “find” and “video”
and remained either unanswered or partially-answered (i.e., the
requestor did not indicate their query had been satisfied). We
pared our list of questions down to 45 by removing those where
the requestor’s need could not be clearly determined or we could
not find any candidate videos on YouTube’s website that
appeared to meet the stated criteria through a preliminary search.
Our method is similar to that used by Kofler et al. in [
        <xref ref-type="bibr" rid="ref18">23</xref>
        ]. For
each request, we removed noisy terms from the original request
(i.e., we retained only those terms that support the identified information
need); we call this a Restated Query. An example of this query
refinement procedure is shown in Table 1. We classify each
request into one of three categories based on our own assessment
of the difficulty of the Restated Query, using the following
guidelines. For requests classified as “easy,” it is relatively
straightforward to find one (or more) videos that match the stated
request; these were likely listed as a result of requestor laziness or
inexperience with search tools. Requests classified as “medium”
require some additional refinement, such as an expansion of terms
or enhancement using synonyms. Requests classified as
“difficult” require significant term refinement to obtain links to
YouTube videos. Our final set of queries contained 15 of each
difficulty level. This retrieval process is outlined in Figure 1.
Examples of Restated Queries categorized as “easy”, “medium”,
and “difficult” appear in Table 2.
      </p>
      <p>For the student search method, we asked five university students
to perform each search. We paid $10 per hour to
search each of the Restated Queries, a typical wage for this type
of task in our area. Each student was instructed to provide a list
(of up to 40) YouTube video links for each Restated Query.
Although given unlimited time, the student group took an average
of just under 90 minutes to complete all 45 queries. Participants
were told they could use any available search methods or tools.
For the crowdsourced search method, we use the Amazon
Mechanical Turk5 platform (MTurk) to list tasks, and provide
each worker with the Restated Query for each question, with
instructions to return at least 10, but not more than 40, of the most
relevant YouTube video links. Using MTurk, we created 45
queries, called Human Intelligence Tasks (HITs), amounting to
one HIT for each Restated Query. As with the student searchers,
crowdsourcing participants were told they could use any search
tools they desired and thus were not constrained to using
YouTube’s search interface. We paid participants $0.10 per
completed HIT, which is a typical wage for this type of
crowdsourcing task; to maximize the use of the crowd model and
differentiate it from the student search model, crowdsourcing
participants were not able to participate in more than one HIT.
5 http://www.mturk.com</p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Evaluation</title>
      <p>The result sets were scored and ranked two different ways:
pooling, which has been used in TRECVID, and simple list
preference, where each result set is first validated and
compared as a whole.</p>
      <sec id="sec-5-1">
        <title>3.2.1 Pooling</title>
        <p>
Exhaustively judging the relevance of every candidate video is impractical,
so the following pooling technique is used instead. We
employ the pooling method used in TRECVID [
          <xref ref-type="bibr" rid="ref20">25</xref>
          ]. First, a pool
of potentially-relevant YouTube video links is obtained by
gathering the sets of links returned by the YouTube query, the
human searchers, and the crowdsourcing group. These sets are
then merged, duplicate links are removed, and the relevance of
only this subset of YouTube video links is assessed.
        </p>
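The merge-and-deduplicate step above can be sketched in a few lines of Python; the link identifiers below are illustrative, and the pool preserves the order of first occurrence:

```python
# Pooling: merge the ranked lists returned by the three search strategies,
# remove duplicate links, and assess relevance only for this merged subset.
def build_pool(*ranked_lists):
    pool = []
    seen = set()
    for lst in ranked_lists:
        for link in lst:
            if link not in seen:
                seen.add(link)
                pool.append(link)
    return pool

youtube = ["v1", "v2", "v3"]
students = ["v2", "v4"]
crowd = ["v1", "v5"]
print(build_pool(youtube, students, crowd))  # ['v1', 'v2', 'v3', 'v4', 'v5']
```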
        <p>The performance measure used to evaluate and rank the results is
average precision (AP):</p>
        <p>AP = (1/|R|) · Σ_k (|L_k ∩ R| / k) · 1(l_k ∈ R)
where L_k = {l_1, l_2, …, l_k} is the ranked version of the answer set, A.
At any given rank k, let |L_k ∩ R| be the number of relevant videos
in the top k of L, where R is the set of relevant videos and |R| is their total number.
The indicator function 1(l_k ∈ R) = 1 if l_k ∈ R and 0 otherwise. Since the
denominator k and the value of the indicator function are
dominant in determining average precision, it can be understood
that this favors relevant videos appearing towards the top of the
list. Mean average precision (MAP), which is the mean of the
average precision values over a set of queries, has been a key
standard evaluation measure in TRECVID.6 We used the list of
all relevant videos for each question as our determination of
ground truth.</p>
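Under the definitions above, AP and MAP can be computed with a short sketch (the link identifiers and relevant sets are illustrative):

```python
# Average precision: AP = (1/|R|) * sum over relevant ranks k of |L_k ∩ R| / k.
def average_precision(ranked, relevant):
    hits = 0
    total = 0.0
    for k, link in enumerate(ranked, start=1):
        if link in relevant:
            hits += 1            # |L_k ∩ R| at this rank
            total += hits / k    # precision at rank k, counted only when l_k is relevant
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    # runs: list of (ranked_list, relevant_set) pairs, one per query
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

ap = average_precision(["v1", "v2", "v3", "v4"], {"v1", "v3"})
print(round(ap, 3))  # (1/1 + 2/3) / 2 = 0.833
```

Relevant videos near the top of the list contribute larger terms (small k), which is why AP rewards early precision.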
      </sec>
      <sec id="sec-5-2">
        <title>3.2.2 Simple List Preference</title>
        <p>Perhaps a more holistic metric is the simple list preference, which
utilizes the lists returned by each of the three search strategies as
entities. The videos on each list are validated for relevance
against the video request, and those that are judged irrelevant are
removed. The remaining lists are evaluated in pairwise fashion.
6 In recent years, average precision has been replaced by inferred
average precision (IAP), which closely approximates the AP
measure but requires only a subset of the pooled results to be
manually evaluated.
Figure 2 gives a high-level overview of this evaluation method.
For each of the 45 requests, we have three result sets: one for
the YouTube search, one for the student search, and one for
crowdsourcing. The mean size of the 135 result sets was 27.7
video links, with a standard deviation of 6.8.</p>
        <p>The first step is validation. We separated the 3743 video links into
groups of 15, comprising 250 separate HITs. To each HIT, we
introduced two “trap” links: clearly irrelevant links added to ensure
assessor attention to detail. We posted these 250 HITs, each
containing 17 binary relevance judgments and paid $0.25 per
completed HIT. Each of the 250 crowd assessors was only
permitted to evaluate a single HIT. Thirteen of the HITs were
rejected and had to be relisted due to the assessor failing the trap
link judgment. This validation step reduced the 3743 YouTube
links by just over eighty percent to 728, averaging 5.4 relevant
video links for each of the 135 result sets.</p>
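The trap-link rejection rule can be sketched as follows, assuming each HIT's judgments are collected as a link-to-verdict mapping (the function and link names are hypothetical, not part of the MTurk API):

```python
# Accept a relevance-judgment HIT only if neither "trap" link (known to be
# irrelevant) was marked relevant by the assessor.
def hit_accepted(judgments, trap_links):
    # judgments: dict mapping video link -> bool (judged relevant)
    return not any(judgments[t] for t in trap_links)

judgments = {"v1": True, "v2": False, "trap_a": False, "trap_b": True}
print(hit_accepted(judgments, ["trap_a", "trap_b"]))  # False: trap_b was judged relevant
```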
        <p>Using the validated links, within each result set, we presented the
lists in pairs along with an individual thumbnail from each video
as a HIT. For each of the original 45 requests, lists were
presented in random order to avoid selection bias, along with the
Restated Query. We asked each worker to choose the list that best
answered the restated query. We posted each pairwise judgment
at least twice in order to ensure that the highly-subjective
determination of ground truth was made by two different people.
Workers were paid $0.10 per judgment and were restricted from
rating more than one query. If the two raters had a difference in
list preference or the resulting list preference was cyclical (i.e.,
1&gt;2, 2&gt;3, 3&gt;1), we hired an additional rater from the crowd to
establish a clear preference order. Two assessors each made 3
pairwise judgments across the 45 Restated Queries, with a
Cohen’s kappa of 0.624, representing reasonably strong
inter-annotator agreement. Of the 270 pairwise decisions, 21 required
the use of a tiebreaker, and no cyclical preferences were
encountered. For each set of results obtained by our Restated
Query, we then apply a Condorcet method to each pairwise
preference among strategies and evaluate based on the lists of
relevant UGC videos they contain.</p>
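The reported Cohen's kappa can be computed from the two assessors' preference labels; a minimal sketch with illustrative labels ("A"/"B" standing for which list was preferred):

```python
# Cohen's kappa for two raters' categorical judgments:
# kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
# p_e is the agreement expected by chance from each rater's label frequencies.
def cohens_kappa(r1, r2):
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n
    labels = set(r1) | set(r2)
    pe = sum((r1.count(c) / n) * (r2.count(c) / n) for c in labels)
    return (po - pe) / (1 - pe)

r1 = ["A", "A", "B", "B", "A", "B"]
r2 = ["A", "B", "B", "B", "A", "B"]
print(round(cohens_kappa(r1, r2), 3))  # 0.667
```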
      </sec>
    </sec>
    <sec id="sec-6">
      <title>4. RESULTS</title>
    </sec>
    <sec id="sec-7">
      <title>4.1 Pooling</title>
      <p>Using the pooling evaluation method, we calculate the MAP
scores for each of the search efforts. These are given in Table 3.
While these scores seem reasonable, this is likely due to two factors:
our calculation of ground truth and the fact that, for most searches,
only a small percentage of YouTube videos were considered
relevant. The crowdsourcing search strategy and the student
search strategies performed better than the YouTube search
interface as measured by MAP, a result that is statistically
significant (two-tailed, p &lt; 0.05).
Since Restated Queries were grouped into three separate
categories (easy, medium, and difficult), we evaluated them
separately for each search strategy. The results are reported in
Table 4.</p>
      <p>
        Second, although the MAP score gap is small between student
search and crowdsourcing, we do notice that the five students
consistently performed slightly better than the crowd. Each
student performed all 45 queries, refining their sources and
techniques as they encountered each new query – all five
participants performed faster and provided better search results
towards the end of their query session than in the beginning (we
cannot observe this improvement with the crowd as each crowd
participant provided results for only a single query). The crowd
had the smallest deviation in MAP scores across the 3 search
categories, primarily because the larger number of people
searching reduces the variation, as discussed in [
        <xref ref-type="bibr" rid="ref5">10</xref>
        ] and [
        <xref ref-type="bibr" rid="ref6">11</xref>
        ].
Third, we can see the value of using human input in these MAP
scores, but Table 4 does not take the costs in both time and money
into consideration. We make the assumption that YouTube’s
search has no cost in terms of time and money and use it as our
baseline. We kept track of the elapsed time taken by the crowd
and for the students as well, so we can evaluate this in aggregate.
This is reported in Tables 5 and 6.
      </p>
      <p>To illustrate, in Tables 5 and 6, for Restated Queries classified as
“difficult”, to obtain an increase in MAP of 0.001 using students,
we would expect to spend 0.06 minutes and incur a cost of 2.723
cents. To obtain an equivalent increase in MAP using
crowdsourcing, we would expect to spend, on average, 0.04
minutes and incur a cost of 1.111 cents. Note that these numbers
represent long-term averages. Thus, we observe that using the
crowd, as compared with students, requires 40% of the cost and
takes two thirds the time, on average, to raise MAP by an
equivalent amount. Thus, when obtaining more precise results is
our paramount objective, using students or the crowd is expected
to provide the best results. If time or financial costs are also a
consideration, our results show that using the crowd will provide
the best tradeoff between time, financial cost, and precision.</p>
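The quoted ratios follow directly from the per-0.001-MAP figures for "difficult" queries given above:

```python
# Cost (cents) and time (minutes) per +0.001 MAP for "difficult" queries,
# as quoted in the text from Tables 5 and 6.
student_cost, student_time = 2.723, 0.06
crowd_cost, crowd_time = 1.111, 0.04

print(round(crowd_cost / student_cost, 2))  # 0.41: crowd costs about 40% as much
print(round(crowd_time / student_time, 2))  # 0.67: crowd takes two-thirds the time
```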
    </sec>
    <sec id="sec-8">
      <title>4.2 Simple List Preference</title>
      <p>
        We apply Copeland’s pairwise aggregation method, described in
[
        <xref ref-type="bibr" rid="ref21 ref22">26, 27</xref>
        ], a Condorcet method used to evaluate pairwise
preferences. Copeland’s pairwise aggregation method examines
two lists for a given query in a pairwise fashion and records the
assessor’s preference as a “victory”. Search strategies are ordered
by number of victories over each opponent to determine an overall
winner. We examine each pairwise preference for the three result
lists for all 45 queries. These comparison results are given in
Table 7.
From Table 7, we observe that student search is our Condorcet
winner, beating all other search strategies in pairwise
comparisons. As with the pooling assessment method, there was a
slight preference of student search results over the crowdsourcing
supplied video lists. However, when financial costs are disclosed
to the assessors along with the scores, crowdsourcing is our
Condorcet winner, as observed in Table 8.
      </p>
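Copeland's aggregation as described above can be sketched as follows; the victory counts are hypothetical (the actual counts appear in Table 7):

```python
from itertools import combinations

# Copeland's method: a strategy scores a point for each opponent it beats
# in more pairwise comparisons than it loses; strategies are ordered by score.
def copeland_ranking(strategies, wins):
    # wins[(a, b)]: number of queries where assessors preferred a's list to b's
    score = {s: 0 for s in strategies}
    for a, b in combinations(strategies, 2):
        if wins[(a, b)] > wins[(b, a)]:
            score[a] += 1
        elif wins[(b, a)] > wins[(a, b)]:
            score[b] += 1
    return sorted(strategies, key=lambda s: score[s], reverse=True)

strategies = ["students", "crowd", "youtube"]
wins = {("students", "crowd"): 25, ("crowd", "students"): 20,
        ("students", "youtube"): 30, ("youtube", "students"): 15,
        ("crowd", "youtube"): 28, ("youtube", "crowd"): 17}
print(copeland_ranking(strategies, wins))  # ['students', 'crowd', 'youtube']
```

A strategy that beats every other strategy head-to-head, as student search does here, is the Condorcet winner.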
    </sec>
    <sec id="sec-9">
      <title>5. CONCLUSION</title>
      <p>This study has examined the effects of using students,
crowdsourcing, and YouTube’s search interface on UGC
searches. We observe that human computation efforts provide
better MAP scores than YouTube’s own search interface across
all categories. In addition, our study examines the costs, in terms
of time and money, of this MAP score increase for each search
strategy. Although this study did not explicitly vary the financial
incentives offered to students or the crowd, we do observe there is
a tradeoff between better precision and search costs (in terms of
time and money); it is up to each search requester to decide if
these costs outweigh the need for improved precision.
We also examine the retrieval lists as complete entities. We see
that a simple list preference favors the student search strategy
when costs are not considered; if time and cost are to be
considered, crowdsourcing merits additional consideration due
to the cost savings it offers over student search. This reinforces
the findings observed through pooling evaluation.</p>
      <p>In future studies, we plan to vary the financial incentives
offered in order to examine the marginal benefit of achieving better
precision. Similarly, we plan to investigate whether we can
incentivize the crowd to increase their performance without
increasing time and financial costs. We also plan to examine
different types of searches, such as those specific to a particular
domain, to observe whether searches can be performed effectively when
specific domain knowledge is required.</p>
    </sec>
    <sec id="sec-10">
      <title>6. REFERENCES</title>
      <p>[2] Google Insights for Search. http://google.com/
insights/search. Retrieved January 8, 2012.
[4] Geisler, G. and Burns, S. Tagging video: conventions and strategies
of the YouTube community. ACM. 2007.
[5] Cha, M., Kwak, H., Rodriguez, P., Ahn, Y.-Y. and Moon, S. I tube,
you tube, everybody tubes: analyzing the world's largest user-generated
content video system. In Proceedings of the 7th ACM
SIGCOMM conference on Internet measurement (San Diego,
California). ACM. 2007.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Bischoff</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Firan</surname>
            ,
            <given-names>C. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nejdl</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Paiu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>Can all tags be used for search?</article-title>
          ACM.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khemmarat</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>The impact of YouTube recommendation system on video views</article-title>
          .
          <source>ACM</source>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Dearman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Truong</surname>
            ,
            <given-names>K. N.</given-names>
          </string-name>
          <article-title>Why users of yahoo!: answers do not answer questions</article-title>
          .
          <source>ACM</source>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Bian</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agichtein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zha</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>Finding the right facts in the crowd: factoid question answering over social media</article-title>
          .
          <source>ACM</source>
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Surowiecki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>The Wisdom of Crowds</article-title>
          . Anchor Press. New York.
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Pennock</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>The wisdom of the ProbabilitySports crowd</article-title>
          . http://blog.oddhead.com/
          <year>2007</year>
          /01/04/
          <article-title>the-wisdomof-the-probabilitysports-crowd/</article-title>
          .
          <source>Retrieved January 12</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Nielsen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Participation inequality: Encouraging more users to contribute</article-title>
          . http://www.useit.com/alertbox/ participation_inequality.html.
          <source>Retrieved January 12</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Smeaton</surname>
            ,
            <given-names>A. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Over</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kraaij</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <article-title>Evaluation campaigns and TRECVid</article-title>
          .
          <source>In Proceedings of the 8th ACM international workshop on Multimedia information retrieval (New York)</source>
          .
          <source>ACM</source>
          .
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Tsikrika</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popescu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kludas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Overview of the wikipedia image retrieval task at ImageCLEF 2011</article-title>
          . Amsterdam.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [15]
          <string-name>
            <surname>van Zwol</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murdock</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pueyo</surname>
            ,
            <given-names>L. G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ramirez</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Diversifying image search with user generated content</article-title>
          .
          <source>In of the 1st ACM international conference on Multimedia information retrieval (Vancouver</source>
          <year>2008</year>
          ). ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Hotho</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jäschke</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmitz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Information retrieval in folksonomies: Search and ranking</article-title>
          .
          <source>The Semantic Web: Research and Applications</source>
          ,
          <year>2006</year>
          . pp.
          <fpage>411</fpage>
          -
          <lpage>426</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Chua</surname>
            ,
            <given-names>T. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>From text question-answering to multimedia QA on web-scale media resources</article-title>
          .
          <source>ACM</source>
          .
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ming</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chua</surname>
            ,
            <given-names>T. S.</given-names>
          </string-name>
          <article-title>Video reference: question answering on YouTube</article-title>
          .
          <source>ACM</source>
          .
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Steiner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van de Walle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hausenblas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vallés</surname>
            ,
            <given-names>J. G.</given-names>
          </string-name>
          <article-title>Crowdsourcing Event Detection in YouTube Videos</article-title>
          .
          <source>In Proceedings of the 10th International Semantic Web Conference (Koblenz, Germany</source>
          ,
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Hsueh</surname>
            ,
            <given-names>P.-Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melville</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sindhwani</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <article-title>Data quality from crowdsourcing: a study of annotation selection criteria</article-title>
          .
          <source>In Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing (Boulder, Colorado</source>
          ,
          <year>2009</year>
          ). ACL, Stroudsburg, PA.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ganesan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones</article-title>
          .
          <source>ACM</source>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [22]
          <article-title>ComScore: YouTube Now 25 Percent Of All Google Searches</article-title>
          . http://techcrunch.com/2008/12/18/comscore-youtube-now-25-percent-of-all-google-searches/. Retrieved January 22,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Kofler</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hanjalic</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>To seek, perchance to fail: expressions of user needs in internet video search</article-title>
          .
          <source>Advances in Information Retrieval</source>
          .
          <year>2011</year>
          . pp.
          <fpage>611</fpage>
          -
          <lpage>616</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Spink</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolfram</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jansen</surname>
            ,
            <given-names>M. B. J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Saracevic</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>Searching the web: The public and their queries</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology, 52:3, 2001</source>
          . pp.
          <fpage>226</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Over</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Awad</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smeaton</surname>
            ,
            <given-names>A. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lanagan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Creating a web-scale video collection for research</article-title>
          .
          <source>In Proceedings of the 1st workshop on Web-scale multimedia corpus (Beijing, China)</source>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Copeland</surname>
            ,
            <given-names>A. H.</given-names>
          </string-name>
          <article-title>A reasonable social welfare function</article-title>
          . mimeo. University of Michigan,
          <year>1951</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Moulin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>Choosing from a tournament</article-title>
          .
          <source>Social Choice and Welfare</source>
          ,
          <volume>3</volume>
          :
          <issue>4</issue>
          ,
          <year>1986</year>
          . pp.
          <fpage>271</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>