<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Web Search - Challenges and Opportunities</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Berthier Ribeiro-Neto</string-name>
          <email>berthier@dcc.ufmg.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CS Department, UFMG &amp; Google Engineering Belo Horizonte</institution>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>From IR to Search Engines</institution>
        </aff>
      </contrib-group>
      <fpage>16</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>As Udi Manber, currently a VP of Engineering at Google, likes to say: "Search is hard!" Yet, it is possible to build a rather powerful search engine that provides rather accurate answers most of the time. In the talk, I briefly discuss some of the fundamental concepts and technologies behind search engines, starting with a quick review of Information Retrieval and then moving through 15 years of search engine evolution. I then discuss some of the current challenges related to the search task, as well as open opportunities for research.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>From IR to Search Engines</title>
      <p>breaks down with larger collections. To handle large collections, such as
those indexed by modern search engines, the retrieval model needed to evolve.</p>
      <p>
        In the early days of the Web, search engines were implemented as classic
IR systems. Their first big evolutionary step was to recognize that the links
among Web pages provide additional information that is of value for ranking.
Indeed, a Web page that is pointed to by many links carries a degree of authority
on a given topic of interest, a degree of authority that can be computed as a
normalized count of its inlinks. Thus, when a user poses a query,
the engine can first return the pages that match the query and have the highest
degrees of authority. This is the idea behind the now famous PageRank, the
ranking function used to build Google [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
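      <p>The inlink-based notion of authority described above can be illustrated with a minimal sketch. It assumes a toy link graph held in memory, and the names <monospace>inlink_authority</monospace> and <monospace>rank</monospace> are hypothetical. It normalizes each page's inlink count by the largest inlink count and orders matching pages by that value. This is only the simplified inlink count, not the full PageRank computation.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def inlink_authority(links: dict[str, list[str]]) -> dict[str, float]:
    """Return each page's inlink count divided by the largest inlink count."""
    counts: Counter = Counter()
    for source, targets in links.items():
        # Count each (source, target) link once and ignore self-links.
        for target in set(targets):
            if target != source:
                counts[target] += 1
    # Every page that appears as a source or as a target gets a score.
    pages = set(links) | {t for targets in links.values() for t in targets}
    top = max(counts.values(), default=0)
    return {page: (counts[page] / top if top else 0.0) for page in pages}


def rank(query: str, texts: dict[str, str], authority: dict[str, float]) -> list[str]:
    """Return pages containing every query term, highest authority first."""
    terms = set(query.lower().split())
    matches = [page for page, text in texts.items()
               if terms.issubset(text.lower().split())]
    # Ties in authority are broken alphabetically to keep the order stable.
    return sorted(matches, key=lambda page: (-authority.get(page, 0.0), page))


# Toy data (assumed for illustration only).
links = {"a": ["c"], "b": ["c", "d"], "c": ["d"], "d": ["c"]}
texts = {
    "a": "web search",
    "b": "search engines",
    "c": "web search engines",
    "d": "web search ranking",
}
authority = inlink_authority(links)
print(rank("web search", texts, authority))  # ['c', 'd', 'a']
```
      </preformat>
      <p>In this toy graph, page c has three inlinks and page d has two, so both are returned ahead of page a, which matches the query but has no inlinks. Page b is not returned because it does not contain the term "web".</p>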
      <p>Since then, search engines have evolved to incorporate into the ranking function a
variety of signals originating from the document collection, from the query stream,
and from user actions. As a result, modern search engines now combine
hundreds of signals into a single ranking function in an attempt to better answer
user queries. During the talk, I will discuss some of these signals and how
they are used for ranking purposes.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Challenges and Opportunities</title>
      <p>
While search engines have evolved continuously over the last 15 years, they still
face many challenges, particularly with infrequent and detailed queries. A user
who is looking for their doctor's phone number will frequently be frustrated
with the answers produced by the search engine. To cope with queries of this
nature, search engines need to evolve further and incorporate
knowledge encoded in a form that is useful for ranking purposes.</p>
      <p>Equally important, determining the most relevant document for a query
frequently requires interpreting the content of the document and
determining its central topics. Document understanding is still poorly
understood and needs much more research if search is to be further
improved. During the talk, I will discuss additional examples of challenges and
opportunities in the space of search.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ribeiro-Neto</surname>
          </string-name>
          :
          <source>Modern Information Retrieval: The Technology and Concepts Behind Search Engines</source>
          .
          <publisher-name>Pearson</publisher-name>
          , 917 pages
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          :
          <article-title>The anatomy of a large-scale hypertextual Web search engine</article-title>
          .
          <source>World Wide Web Conference</source>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>G.</given-names>
            <surname>Salton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Yang</surname>
          </string-name>
          :
          <article-title>A Vector Space Model for Automatic Indexing</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>18</volume>
          (
          <issue>11</issue>
          ),
          <fpage>613</fpage>
          -
          <lpage>620</lpage>
          (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>