<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MultiLingMine 2016: Modeling, Learning and Mining for Cross/Multilinguality</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Salvatore Romeo</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Tagarelli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dino Ienco</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mathieu Roche</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Rosso</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRAD, LIRMM</institution>
          ,
          <addr-line>Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DIMES, University of Calabria</institution>
          ,
          <addr-line>Rende</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>IRSTEA, LIRMM</institution>
          ,
          <addr-line>Montpellier</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Qatar Computing Research Institute</institution>
          ,
          <addr-line>Doha</addr-line>
          ,
          <country country="QA">Qatar</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Universitat Politecnica de Valencia</institution>
          ,
          <addr-line>Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <abstract>
        <p>The increasing availability of text information coded in many different languages poses new challenges to modern information retrieval and mining systems in order to discover and exchange knowledge at a larger world-wide scale. The 1st International Workshop on Modeling, Learning and Mining for Cross/Multilinguality (dubbed MultiLingMine 2016) provides a venue to discuss research advances in cross-/multilingual related topics, focusing on new multidisciplinary research questions that have not been deeply investigated so far (e.g., in CLEF and related events relevant to CLIR). This includes theoretical and experimental on-going works about novel representation models, learning algorithms, and knowledge-based methodologies for emerging trends and applications, such as cross-view cross-/multilingual information retrieval and document mining, (knowledge-based) translation-independent cross-/multilingual corpora, applications in social network contexts, and more.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>In the last few years the phenomenon of multilingual information overload has
received significant attention due to the huge availability of information coded
in many different languages. We have in fact witnessed a growing popularity of
tools designed for collaborative editing by contributors across
the world, which has led to an increased demand for methods capable of
effectively and efficiently searching, retrieving, managing and mining document
collections written in different languages. The multilingual information overload
phenomenon introduces new challenges to modern information retrieval systems.
By better searching, indexing, and organizing such rich and heterogeneous
information, we can discover and exchange knowledge at a larger world-wide scale.
However, since research on multilingual information is relatively young,
important issues remain open:
– how to define a translation-independent representation of the documents
across many languages;
– whether existing solutions for comparable corpora can be enhanced to
generalize to multiple languages without depending on bilingual dictionaries or
incurring bias in merging language-specific results;
– how to profitably exploit knowledge bases to enable translation-independent
preserving and unveiling of content semantics;
– how to define proper indexing and multidimensional data structures to
better capture the multi-topic and/or multi-aspect nature of multilingual
documents;
– how to detect duplicate or redundant information among different languages
or, conversely, novelty in the produced information;
– how to enrich and update multilingual knowledge bases from documents;
– how to exploit multilingual knowledge bases for question answering;
– how to efficiently extend topic modeling to deal with multi/cross-lingual
documents in many languages;
– how to evaluate and visualize retrieval and mining results.</p>
    </sec>
    <sec id="sec-2">
      <title>Objectives, topics, and outcomes</title>
      <p>The aim of the 1st International Workshop on Modeling, Learning and
Mining for Cross/Multilinguality (dubbed MultiLingMine 2016), held in
conjunction with the 2016 ECIR Conference, is to establish a venue to discuss research
advances in cross-/multilingual related topics. MultiLingMine 2016 has been
structured as a full-day workshop. Its program schedule includes invited talks
as well as a panel discussion among the participants. It is mainly geared
towards students, researchers and practitioners actively working on topics related
to information retrieval, classification, clustering, indexing and modeling of
multilingual corpora. A major objective of this workshop is to focus on
research questions that have not been deeply investigated so far. Special interest
is devoted to contributions that consider the following aspects:
– Modeling: methods to develop suitable representations for multilingual
corpora, possibly embedding information from different views/aspects, such as
tensor models and decompositions, word-to-vector models, statistical
topic models, representational learning, etc.
– Learning: any unsupervised, supervised, and semi-supervised approach in
cross/multilingual contexts.
– The use of knowledge bases to support the modeling stage, the learning stage,
or both, in multilingual corpora analysis.
– Emerging trends and applications, such as cross-view cross-/multilingual
IR, multilingual text mining in social networks, etc.</p>
      <p>Main research topics of interest in MultiLingMine 2016 include the following:
– Multilingual/cross-lingual information access, web search, and ranking
– Multilingual/cross-lingual relevance feedback
– Multilingual/cross-lingual text summarization
– Multilingual/cross-lingual question answering
– Multilingual/cross-lingual information extraction
– Multilingual/cross-lingual document indexing
– Multilingual/cross-lingual topic modeling
– Multi-view/Multimodal representation models for multilingual corpora and
cross-lingual applications
– Cross-view multi/cross-lingual information retrieval and document mining
– Multilingual/cross-lingual classification and clustering
– Knowledge-based approaches to model and mine multilingual corpora
– Social network analysis and mining for multilinguality/cross-linguality
– Plagiarism detection for multilinguality/cross-linguality
– Sentiment analysis for multilinguality/cross-linguality
– Deep learning for multilinguality/cross-linguality
– Novel validity criteria for cross-/multilingual retrieval and learning tasks
– Novel paradigms for visualization of patterns mined in multilingual corpora
– Emerging applications for multilingual/cross-lingual domains</p>
      <p>Workshop website: http://events.dimes.unical.it/multilingmine/</p>
      <p>The ultimate goal of the MultiLingMine workshop is to increase the
visibility of the above research themes, and also to bridge closely related research
fields such as information access, searching and ranking, information extraction,
feature engineering, text mining and machine learning.</p>
    </sec>
    <sec id="sec-3">
      <title>Advisory board</title>
      <p>The scientific significance of the workshop is assured by a Program Committee
which includes 20 research scholars, coming from different countries and widely
recognized as experts in cross/multi-lingual information retrieval:
Ahmet Aker, Univ. Sheffield, United Kingdom
Rafael Banchs, I2R Singapore
Martin Braschler, Zurich Univ. of Applied Sciences, Switzerland
Philipp Cimiano, Bielefeld University, Germany
Paul Clough, Univ. Sheffield, United Kingdom
Andrea Esuli, ISTI-CNR, Italy
Wei Gao, QCRI, Qatar
Cyril Goutte, National Research Council, Canada
Parth Gupta, Universitat Politècnica de València, Spain
Dunja Mladenic, Jozef Stefan International Postgraduate School, Slovenia
Alejandro Moreo, ISTI-CNR, Italy
Alessandro Moschitti, Univ. Trento, Italy; QCRI, Qatar
Matteo Negri, FBK - Fondazione Bruno Kessler, Italy
Simone Paolo Ponzetto, Univ. Mannheim, Germany
Achim Rettinger, Institute AIFB, Germany
Philipp Sorg, Institute AIFB, Germany
Ralf Steinberger, JRC in Ispra, Italy
Marco Turchi, FBK - Fondazione Bruno Kessler, Italy
Vasudeva Varma, IIIT Hyderabad, India
Ivan Vulic, KU Leuven, Belgium</p>
    </sec>
    <sec id="sec-4">
      <title>Related events</title>
      <p>
        A COLING'08 workshop [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] was one of the earliest events that emphasized the
importance of analyzing multilingual document collections for information
extraction and summarization purposes. The topic also attracted attention from
the semantic web community: in 2014, [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] solicited works to discuss principles
on how to publish, link and access mono- and multilingual knowledge data
collections; in 2015, another workshop [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] took place on similar topics in order to
allow researchers to continue to address multilingual knowledge management
problems. A tutorial on Multilingual Topic Models was presented at WSDM 2014 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
focusing on how to statistically model document collections written in different
languages. In 2015, a WWW workshop aimed at advancing the state of the art in
Multilingual Web Access [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: the contributing papers covered different aspects
of multilingual information analysis, drawing attention to the shortcomings of current
information retrieval techniques and the need for new techniques specifically
tailored to manage, search, analyze and mine multilingual textual information.
      </p>
      <p>
        The main event related to our workshop is the CLEF initiative [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which
has long provided a premier forum for the development of new information
access and evaluation strategies in multilingual contexts. However, unlike
MultiLingMine, it has not emphasized research contributions on tasks such
as searching, indexing, mining and modeling of multilingual corpora.
      </p>
      <p>Our intention is to follow the lead of previous events on multilingual
related topics, but from a broader perspective that is relevant to various
information retrieval and document mining fields. We aim at soliciting
contributions from scholars and practitioners in information retrieval who are interested
in multi/cross-lingual document management, search, mining, and evaluation
tasks. Moreover, unlike previous workshops, we emphasize some
specific trends, such as cross-view cross/multilingual IR, as well as the
increasingly tight interaction between knowledge-based and statistical/algorithmic
approaches in order to deal with multilingual information overload.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bandyopadhyay</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poibeau</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saggion</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yangarber</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <source>Procs. of the Workshop on Multi-source Multilingual Information Extraction and Summarization (MMIES)</source>
          .
          <source>ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chiarcos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montiel</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simov</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Branco</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calzolari</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osenova</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Slavcheva</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vertan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <source>Procs. of the 3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and NLP (LDL).</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vulcu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <source>CEUR Procs. of the 4th Workshop on the Multilingual Semantic Web (MSW4)</source>
          , Vol.
          <volume>1532</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Moens</surname>
            ,
            <given-names>M.-F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vulić</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Multilingual Probabilistic Topic Modeling and Its Applications in Web Mining and Search</article-title>
          .
          <source>In Procs. of the 7th ACM WSDM Conf.</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Steichen</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chi</surname>
            ,
            <given-names>E. E.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <source>Procs. of the Int. Workshop on Multilingual Web Access (MWA).</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. The CLEF Initiative. http://www.clef-initiative.eu/.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>