TREC 2018 News Track

             Shudong Huang                            Ian Soboroff                      Donna K. Harman
            100 Bureau Drive                       100 Bureau Drive                      100 Bureau Drive
              Gaithersburg                           Gaithersburg                          Gaithersburg
          Maryland 20899-8940                    Maryland 20899-8940                   Maryland 20899-8940
         shudong.huang@nist.gov                  ian.soboroff@nist.gov                donna.harman@nist.gov
                                    National Institute of Standards and Technology


                       Abstract

    While more and more people rely on social media for news feeds,
    serious news consumers still turn to well-established news outlets
    for more accurate and in-depth reporting and analysis. They may also
    look for reports on related earlier events and other background
    information in order to better understand the event being reported.
    Many news outlets already create sidebars and embed hyperlinks to
    help news readers, often through manual effort. Technologies in IR
    and NLP already exist to support those features, but standard test
    collections do not address the tasks of modern news consumption. To
    help advance such technologies and transfer them to news reporting,
    NIST, in partnership with the Washington Post, is starting a new
    TREC track in 2018 known as the News Track.

Copyright © 2018 for the individual papers by the papers’ authors.
Copying permitted for private and academic purposes. This volume is
published and copyrighted by its editors.
In: D. Albakour, D. Corney, J. Gonzalo, M. Martinez, B. Poblete,
A. Vlachos (eds.): Proceedings of the NewsIR’18 Workshop at ECIR,
Grenoble, France, 26-March-2018, published at http://ceur-ws.org

1    Motivation
News content has long been part of information retrieval test
collections, but the search task that those collections measure is ad hoc
search. Ad hoc search is a task where the user is seeking any and all
information about a topic of interest. As such, articles are judged to be
relevant to a topic if they mention the topic, even in a minimal way, as
long as that mention is worth including in a report on the topic.
   In 2018, people consume news overwhelmingly via social media
recommendation, but also through web browsing, search, and advertising
recommendation. Traditional news outlets are increasingly taking a
“digital first” strategy, rather than hewing to the notion of a newspaper
front page. But the greatest change has come from social recommendation
and news aggregators. Google News, started in 2002, marked the end of
publisher-driven news delivery by pivoting the focus from the publisher
to the story. The diversification of news delivery has democratized news
publishing, and current news outlets reflect an enormous range of
journalistic standards and methods.
   NIST realized the time had come to reinvent news search as a focus for
information retrieval and natural language processing research. In
partnership with the Washington Post, NIST launched the News Track as
part of the 2018 Text Retrieval Conference (TREC).1 One component of this
is a new document collection, the TREC Washington Post Collection, which
is available as a free download from NIST. The second component is a pair
of IR tasks driven by how content is structured for the Post’s website.

    1 http://trec.nist.gov/

2    Data
In partnership with the Washington Post, we have made a large archive of
digital news content available to participants, extending from 2012
through August 2017. It contains both news articles and blogs as
originally published by the Washington Post, with a total of 608,180
documents (about 6.9 GB uncompressed in size), divided into 12 text
files. Each text file represents a collection of either news articles or
blogs in one of those 6 years. The documents are stored in JSON format,
with each line representing
a single news or blog document. Each document has metadata including
article title, original article URL, author, date of publication, and
sources for text and embedded media. For more information on how to
obtain the TREC Washington Post collection, visit
http://trec.nist.gov/data/wapost/.
   We also have a reformatted dump of English Wikipedia from close to the
time of the latest news articles available for download.

3    Tasks
On news outlets’ websites, article content and hyperlinks are used to
provide context and background. In other words, browsing is not arbitrary
but is guided through stories in the sidebar and hyperlinks in the story
to permit the reader to read more deeply. On the Washington Post’s
website, for example, related stories are manually linked both on the
side and at the end of articles, and links within the article frequently
lead to related stories or further information about entities in the
story.
   However, creating such links manually is a tedious and costly process.
It is not surprising that crucial background stories, whether previously
reported or externally available, are not always provided. Consider for
example an article on February 4, 2018 titled “N. Korea to send nominal
head of state to S. Korea”. There is not a single link to background
information on the current state of the Korean conflict (other than one
about Kim Jong Un’s sister, dated later than the current article, that
was generated under “Most Read World” at the time of accessing this
article), and there are no links to recent stories such as “Hot heads or
cold feet? North Korea’s mixed Olympic messages” and “North Korean
athletes arrive in South Korea for Olympics”, reported just a few days
earlier, or “North Korea agrees to send athletes to Winter Olympics,
South says” and “Vice President Pence will lead U.S. delegation to
Olympics in South Korea”, reported a month before. There was also a
report back in 2014 about the North’s high-level visit to the South at
the end of the Asian Games, titled “North Korean officials pay rare and
surprising visit to the South”. Needless to say, many names mentioned in
the current story have appeared in previous news articles and/or have
entries in other online resources such as Wikipedia. If the journalist
had had a utility that could automatically retrieve those relevant
stories in order of significance and link important entity mentions to
more in-depth articles about them elsewhere, he or she could easily have
made them available to the reader.
   Getting context to the reader is very difficult in the modern news
landscape, but is more important than ever. IR and NLP technology can
support journalists by suggesting links to articles and entities that
provide background and promote a deeper understanding of a news story.
For the first tasks in the News Track, we have chosen to work on
background linking and entity ranking.

3.1    Task 1: Background Linking
The main task for this new track will be “Background Linking”, defined as
follows: given a news article, the system should retrieve other news
articles that provide important context and/or background information
that helps the reader better understand the query article. This task is
essentially an ad hoc search with a specialized relevance criterion.
Relevance for this task will be graded along a categorized scale:

0: the document provides little or no useful background or contextual
    information that would help the user understand the broader context
    of the query article.

1: the document provides some useful background . . .

2: the document provides significant useful background . . .

3: the document provides essential useful background . . .

4: the document MUST appear in the sidebar; otherwise critical context
    is missing.

   We will refine this category scale with the help of our partners at
the Washington Post. The critical points are that relevance hinges on
providing “useful background information or context”, and that there are
levels that align with utility for the reader. We anticipate that these
relevance judgments will be made at NIST by NIST assessors, with
training support from journalists and data scientists at the Washington
Post.
   As a research problem, we would like to investigate how this relevance
criterion differs from “traditional topical relevance”, both in how it is
applied by assessors and in how it measures systems differently. To that
end, we may also ask whether the article is topically relevant to the
query article. This could be implemented by adding one more level to the
above scale to capture topical relevance.
   We will use nDCG@5 [Jarvelin:2002] as the primary effectiveness
measure: the sidebar has limited real estate, and should ideally contain
the best contextualizing links. We will also report average precision and
the other standard trec_eval measures.
   The initial task is intentionally simple: we want to establish a
baseline for the state of the art and use
that performance to consider refinements to the task. These might
include:

    • Having assessors cluster equivalent background articles, to allow
      the measure to support “retrieve one of these critical articles”.

    • Snippet generation for the sidebar, where the snippet should
      provide the critical context without the need to click through.

    • Categories of background, for example about people and
      organizations. This would be measured using diversity metrics.

3.2    Task 2: Entity Ranking
The second task is “Entity Ranking”: given an article, identify important
entities mentioned in the article and rank those entities linkable to
Wikipedia entries in order of importance, so as to support the reader’s
understanding of the story. An example of an important entity might be
“the Supreme Court”, whereas an example of an unimportant entity might be
“Washington” in a dateline.
   By structuring this as a ranking task, we are separating out the core
NLP problems of entity detection and linking from determining importance
to the user. The provided mentions and links may be useful to researchers
working on entity extraction as well.
   Again working with criteria developed in conjunction with Post staff,
we will identify the top entities in each article along a graded
relevance scale, and measure the task as a retrieval task using nDCG.

4    Summary
We have set out the News Track with two initial tasks, Background Linking
and Entity Ranking, which we believe are valuable to both the news
creator and the news consumer. At the time of submitting this position
paper, we are still in the process of refining the tasks and performance
measurements by working with journalists and data scientists at the
Washington Post and researchers in the IR and NLP communities. We welcome
feedback and suggestions on the current tasks as well as recommendations
on future tasks. We also encourage participation from researchers around
the world.

Acknowledgements
Special thanks to Sam Han at the Washington Post for coordinating the
efforts between the two organizations. We also appreciate the input from
the other team members of the Retrieval Group at NIST.

References
[Jarvelin:2002] K. Järvelin and J. Kekäläinen. Cumulated Gain-based
           Evaluation of IR Techniques. ACM Trans. Inf. Syst.,
           20(4):422–446, October 2002.
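
   As an illustration of the nDCG measure proposed for Background Linking
(Section 3.1) and Entity Ranking (Section 3.2), the following is a minimal
sketch in Python, not the track's official scoring code. It assumes the
commonly used log2(rank + 1) discount with the graded judgment (0 to 4)
taken directly as the gain, treats unjudged items as grade 0, and expects
each item to appear at most once in a ranking. The function names and the
document identifiers are hypothetical.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulated gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranking: List[str], judgments: Dict[str, int], k: int = 5) -> float:
    """nDCG@k for one query article.

    ranking   -- system output, best first (item ids assumed unique)
    judgments -- graded relevance per judged item; unjudged items count as 0
    """
    gains = [judgments.get(item_id, 0) for item_id in ranking[:k]]
    # The ideal ranking lists the judged items in decreasing order of grade.
    ideal_gains = sorted(judgments.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical judgments for one query article on the 0-4 scale.
example_judgments = {"doc_a": 4, "doc_b": 2, "doc_c": 0, "doc_d": 1}
perfect = ndcg_at_k(["doc_a", "doc_b", "doc_d"], example_judgments)
swapped = ndcg_at_k(["doc_b", "doc_a"], example_judgments)
```

   With a cutoff of 5, only the links that would plausibly fit in the
sidebar contribute to the score, and placing a grade-4 article below a
grade-2 article (as in `swapped`) is penalized relative to the ideal
ordering.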