=Paper=
{{Paper
|id=Vol-1567/preface
|storemode=property
|title=Editorial for the 3rd Bibliometric-Enhanced Information Retrieval Workshop at ECIR 2016
|pdfUrl=https://ceur-ws.org/Vol-1567/editorial.pdf
|volume=Vol-1567
|dblpUrl=https://dblp.org/rec/conf/ecir/MayrFC16a
}}
==Editorial for the 3rd Bibliometric-Enhanced Information Retrieval Workshop at ECIR 2016==
<pdf width="1500px">https://ceur-ws.org/Vol-1567/editorial.pdf</pdf>
<pre>
                                      BIR 2016 Workshop on Bibliometric-enhanced Information Retrieval


      Editorial for the 3rd Bibliometric-Enhanced
    Information Retrieval Workshop at ECIR 2016

          Philipp Mayr1 , Ingo Frommholz2 , and Guillaume Cabanac3
      1
         GESIS - Leibniz-Institute for the Social Sciences, Cologne, Germany,
                              philipp.mayr@gesis.org
   2
     Institute for Research in Applicable Computing, University of Bedfordshire,
                                     Luton, UK,
                            ingo.frommholz@beds.ac.uk
 3
   University of Toulouse, Computer Science Department, IRIT UMR 5505, France
                        guillaume.cabanac@univ-tlse3.fr


1    Introduction
Following the successful workshops at ECIR 20144 and 20155 , respectively, this
workshop was the third in a series of events that brought together experts of
communities which often have been perceived as different ones: bibliometrics /
scientometrics / informetrics on the one hand and information retrieval on the
other. Our motivation as organizers of the workshop started from the observa-
tion that main discourses in both fields are different, that communities are only
partly overlapping and from the belief that a knowledge transfer would be prof-
itable for both sides [1]. The first BIR workshop in 2014 set the research agenda
by introducing each group to the other, illustrating state-of-the-art methods,
reporting on current research problems, and brainstorming about common in-
terests. The second workshop in 2015 further elaborated these themes. This third
full-day BIR workshop6 at ECIR 2016 aimed to foster a common ground for the
incorporation of bibliometric-enhanced services into scholarly search engine in-
terfaces. In particular we addressed specific communities, as well as studies on
large, cross-domain collections like Mendeley and ResearchGate. This third BIR
workshop addressed explicitly both scholarly and industrial researchers.


2    Overview of the papers
This year 15 papers were submitted to the workshop, 7 of which were finally
accepted for presentation and inclusion in the proceedings. The workshop fea-
tured one keynote talk and three paper sessions. The first session discussed text
and reference mining approaches while the second session focused on bibliomet-
ric and IR tools. The final position paper session gave an outlook on further
research. The following briefly describes the keynote and sessions.
4
  http://gesis.org/en/events/events-archive/conferences/ecirworkshop2014
5
  http://gesis.org/en/events/events-archive/conferences/ecirworkshop2015
6
  http://gesis.org/en/events/events-archive/conferences/ecirworkshop2016


                                       1
                                       BIR 2016 Workshop on Bibliometric-enhanced Information Retrieval


2.1   Keynote
The keynote “Bibliometrics in online book discussions: Lessons for complex
search tasks” [2] was given by Marijn Koolen from the University of Amster-
dam. Koolen explores the potential relationships between book search informa-
tion needs and bibliometric analysis. The Social Book Search Lab is introduced,
which utilizes data from Amazon, LibraryThing (LT), the Library of Congress
and the British Library. LT discussions indicate some complex search tasks. Users
catalogue, tag, and relate books to each other. The hypothesis is that reviews,
catalogues, and discussion threads could be interpreted as (implicit) co-citation
and citation structures. Analyzing comments and reviews, several information
need patterns can be identified. Koolen also discusses how the data at hand can
be utilized for information retrieval.

2.2   Text and Reference Mining
In their paper “Weak links and strong meaning: The complex phenomenon of
negational citations” [3], Marc Bertin and Iana Atanassova designed a method
to extract negational citations from full-text publications. They revealed the fre-
quency distribution of such citations appearing throughout the regular IMRaD
structure of about 80,000 PLOS papers. Qualifying the polarity of citations has
many practical applications. This valuable knowledge might inform the scientific
community about papers attracting negative feedback that should be reconsid-
ered and potentially retracted.
    In their paper “Towards a more fine grained analysis of sceintific authorship:
Predicting the number of authors using stylometric features” [4] Andi Rexha,
Stefan Klampfl, Mark Kröll, and Roman Kern aimed to chunk papers accord-
ing to stylometric features. The resulting segments were then attributed to the
corresponding author(s) listed in the byline of the paper (i.e., the individuals
who co-signed the paper). This contribution is likely to enhance paper/passage
retrieval by author name.
    In their paper “The references of references: Enriching library catalogs via
domain-specific reference mining” [5], Giovanni Colavizza, Matteo Romanello,
and Frédéric Kaplan enhanced a digital library by collecting references from
domain-specific reference monographs in the Humanities. Their experiment on a
corpus dedicated to the history of Venice stresses the necessity of including such
overlooked references to improve search effectiveness in such corpora.

2.3   Tools for Bibliometric IR
In the paper “Bibliometrics: a publication analysis tool” [6] by Rosa Padrós-
Cuxart, Clara Riera-Quintero, and Francesc March-Mir, the authors present a
bibliometric data management and consultation tool that can be utilized to
study and analyze an institution’s scientific activity. The tool is able to gen-
erate bibliometric reports on scientific outputs at different analysis levels like
author, journal, and institution. The tool includes data from various sources


                                        2
                                       BIR 2016 Workshop on Bibliometric-enhanced Information Retrieval


like WOS/Scopus and provides different indicators like productivity, visibility,
impact, and collaboration.
    In the paper “Engineering a tool to detect automatically generated papers” [7]
by Nguyen Minh Tien and Cyril Labbé, the authors are focussing on detecting
fake academic papers that are automatically created. The authors work on de-
tection approaches based on distance/similarity measurement and introduce a
tool which is able to detect automatically generated papers, the SciDetect sys-
tem. The authors evaluate the SciDetect system against pattern matching and
Kullback-Leibler Divergence on three different text corpora.

2.4    IR Position Papers
In his article “Bag of works retrieval: TF*IDF weighting of co-cited works”,
Howard D. White proposes an alternative to the well-known bag of words model
called bag of works [8]. This model can in particular be used for finding similar
documents to a given seed one. In the proposed bag of works model, the tf and
idf measures are re-defined based on (co-)citation counts. The properties of the
retrieved documents are discussed and an example is provided.
    In their article “On the need for and provision for an ‘IDEAL’ scientific in-
formation retrieval test collection” [9], Birger Larsen and Christina Lioma argue
there is a need for test collections tailored to bibliometric IR. They discuss
several challenges coming along with creating such a collection (e.g., regarding
size, domain-specific dissemination and retrieval, realistic queries and relevance
judgements, pooling strategies as well as format). Furthermore, procedures to
create an ideal test collection are examined.


3     Outlook

With this continuing workshop series we have built up a sequence of explorations,
visions, results documented in scholarly discourse, and created a sustainable
bridge between bibliometrics and IR.
    As a next iteration we will organize a Joint Workshop on Bibliometric-
enhanced Information Retrieval and Natural Language Processing for Digital
Libraries (BIRNDL 2016)7 at the JCDL conference 2016. The BIRNDL work-
shop will be co-organized together with the natural language processing group
of Min-Yen Kan, National University of Singapore, which includes a shared task
(the CL-SciSumm Shared Task8 ). The shared task tackles automatic paper sum-
marization in the Computational Linguistics (CL) domain.


7
    http://wing.comp.nus.edu.sg/birndl-jcdl2016/
8
    http://wing.comp.nus.edu.sg/cl-scisumm2016/


                                        3
                                          BIR 2016 Workshop on Bibliometric-enhanced Information Retrieval


References
1. Mayr, P., Scharnhorst, A.: Scientometrics and Information Retrieval: weak-links
   revitalized. Scientometrics 102(3) (2015) 2193–2199
2. Koolen, M.: Bibliometrics in online book discussions: Lessons for complex search
   tasks. In: Proc. of the 3rd Workshop on Bibliometric-enhanced Information Retrieval
   (BIR2016). 5–13
3. Bertin, M., Atanassova, I.: Weak links and strong meaning: The complex phe-
   nomenon of negational citations. In: Proc. of the 3rd Workshop on Bibliometric-
   enhanced Information Retrieval (BIR2016). 14–25
4. Rexha, A., Klampfl, S., Kröll, M., Kern, R.: Towards a more fine grained anal-
   ysis of scientific authorship: Predicting the number of authors using stylometric
   features. In: Proc. of the 3rd Workshop on Bibliometric-enhanced Information Re-
   trieval (BIR2016). 26–31
5. Colavizza, G., Romanello, M., Kaplan, F.: The references of references: Enriching
   library catalogs via domain-specific reference mining. In: Proc. of the 3rd Workshop
   on Bibliometric-enhanced Information Retrieval (BIR2016). 32–43
6. Padrós-Cuxart, R., Riera-Quintero, C.: Bibliometrics: a publication analysis tool.
   In: Proc. of the 3rd Workshop on Bibliometric-enhanced Information Retrieval
   (BIR2016). 44–53
7. Tien, N.M., Labbé, C.: Engineering a tool to detect automatically generated pa-
   pers. In: Proc. of the 3rd Workshop on Bibliometric-enhanced Information Retrieval
   (BIR2016). 54–62
8. White, H.D.: Bag of works retrieval: TF*IDF weighting of co-cited works. In: Proc.
   of the 3rd Workshop on Bibliometric-enhanced Information Retrieval (BIR2016).
   63–72
9. Larsen, B., Lioma, C.: On the need for and provision for an ‘IDEAL’ scholarly
   information retrieval test collection. In: Proc. of the 3rd Workshop on Bibliometric-
   enhanced Information Retrieval (BIR2016). 73–81


                                           4

</pre>