<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Cognitive Automation Approach for a Smart Lending and Early Warning Application</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ermelinda Oro</string-name>
          <email>linda.oro@icar.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Massimo Rufolo</string-name>
          <email>massimo.rufolo@icar.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fausto Pupo</string-name>
          <email>fausto.pupo@altiliagroup.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Altilia.ai</institution>
          ,
          <addr-line>Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>High Performance Computing and Networking Institute of the National Research Council</institution>
          ,
          <addr-line>Rende (CS)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <kwd-group>
          <kwd>Augmented Intelligence</kwd>
          <kwd>Machine Reading Comprehension</kwd>
          <kwd>Question Answering</kwd>
          <kwd>Cognitive Automation</kwd>
          <kwd>Heterogeneous Data</kwd>
          <kwd>Financial Services</kwd>
          <kwd>Smart Lending</kwd>
          <kwd>Early Warning</kwd>
          <kwd>Information Extraction</kwd>
          <kwd>Natural Language Processing</kwd>
          <kwd>Document Layout Analysis</kwd>
        </kwd-group>
      </contrib-group>
      <abstract>
        <p>The rapid development of the Internet and the dissemination of information and documents through a myriad of heterogeneous data sources is having an ever-increasing impact on the financial domain. Corporate and Investment Banks (CIBs) need to improve and automate business and decision-making processes, simplifying the way they access data sources to get alternative data and answers. Manual or traditional approaches to data gathering are not sufficient to effectively and efficiently exploit the information contained in all available data sources, and they represent a bottleneck to process automation. This paper presents a cognitive automation approach that makes use of Artificial Intelligence (AI) algorithms for automatically and efficiently searching, reading, and understanding documents and contents intended for humans. The paper also presents the system that implements the proposed approach, by means of an application in the area of financial risk evaluation and lending automation. The presented approach allows CIBs to obtain answers and analyses useful to improve the ability of different bank areas to manage lending processes, forecast situations involving risks, facilitate lead generation, and develop customized marketing and sales strategies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Financial organizations are constantly looking for innovative
ways to generate opportunities, automate and optimize business
and decision-making processes, reduce risks and mitigate
adverse events. In order to build long-term partnerships with their
customers, the Corporate and Investment Banks (CIBs) need to
develop customized marketing and sales strategies, and at the
same time manage financial risks, based on a deep knowledge
of corporate customers and markets in which they operate. The
answers to CIBs’ questions about the entities involved in the
business and decision-making processes must be sought within a
myriad of heterogeneous data sources. Financial markets change
rapidly, therefore CIBs need to quickly process big data available
in both traditional data sources (such as financial statements)
and alternative sources of information (such as social and
online media, corporate websites and online financial document
repositories).</p>
      <p>Traditional approaches are not sufficient to effectively and
efficiently exploit the information contained in these sources. Indeed,
the ability to select, collect, analyze, and interpret big data requires
Artificial Intelligence (AI) algorithms capable of automatically and
efficiently searching, reading, and understanding documents and
content designed for humans.</p>
      <p>In this paper, we present a cognitive automation approach and
the related system, along with a financial application, that
automate and simplify business and decision-making tasks and
processes requiring human cognitive abilities. The presented
cognitive automation approach allows CIBs to obtain answers and
analyses useful to improve the ability of different bank areas to
manage lending processes, forecast situations involving risks,
facilitate lead generation, and optimize sales activities. Examples
of required answers, alternatively referred to as data points in
this paper, are: entities and relationships between them, yes/no
answers, sentiments, perceptions, and opinions.</p>
      <p>The rest of the paper is organized as follows: Section 2
describes related work useful to comprehend the modules of the
proposed system. Section 3 introduces the proposed approach and
the related system. Section 4 presents how we address some needs
of CIBs by implementing a smart lending and early-warning
application. Finally, Section 5 concludes the work.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>The proposed approach and system rely on strong
capabilities in machine reading comprehension (MRC) that
exploit pre-trained language models and human-in-the-loop machine
learning. In this section, we briefly review related work on
these main aspects.</p>
      <sec id="sec-2-1">
        <title>Machine Reading Comprehension</title>
        <p>
          Machine Reading Comprehension (MRC) is the ability to answer
questions asked in natural language by automatically reading texts. The objective
is to greatly simplify the way in which humans interrogate large
volumes of information sources [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. MRC is related to Natural
Language Processing (NLP) and, more specifically, to Natural
Language Understanding (NLU), which refers to the ability of machines
to understand natural language. NLU is considered an AI-hard
problem, and all its activities can be framed within an MRC
framework [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. MRC allows for exploring many aspects of language
understanding simply by posing questions. MRC can also be
seen as an extended task of question answering (QA).
        </p>
        <p>
          Recently, MRC methods have attracted a lot of attention from
researchers and scholars around the world. Indeed, many new
reading comprehension datasets have been developed
in recent years, such as SQuAD [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], NEWSQA [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], SearchQA
[
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], TriviaQA [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], HotpotQA [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], where the latter requires multi-hop
reasoning over paragraphs, and ReCoRD [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] and COSMOS
QA [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] that are designed for challenging reading comprehension
with commonsense reasoning. However, these datasets mainly
concern understanding general text, and they are not related
to specific knowledge domains. With deep learning (DL),
end-to-end models have produced promising results on some MRC
tasks. Unlike traditional machine learning, these models do not
need complex feature engineering. Deep learning techniques for
MRC have achieved very high performance on large standard
datasets in general domains [
          <xref ref-type="bibr" rid="ref14 ref29 ref4">4, 14, 29</xref>
          ] and, more recently, big
successes have been obtained with approaches based on pre-trained
language models.
        </p>
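Viewed as a task, extractive QA over a document can be illustrated with a toy sentence-retrieval baseline in Python (a deliberately minimal sketch; the neural readers evaluated on the datasets above work very differently):

```python
import re

def tokens(text):
    # lowercase word tokens, punctuation stripped
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def best_sentence(question, document):
    """Toy QA baseline: return the sentence sharing the most
    words with the question."""
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    q = tokens(question)
    return max(sentences, key=lambda s: len(q.intersection(tokens(s))))
```

Such lexical-overlap baselines are easily fooled, which is precisely why the datasets above were built to stress deeper comprehension.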
      </sec>
      <sec id="sec-2-2">
        <title>Pre-trained Language Models</title>
        <p>
          We are entering the "Golden Age of NLP"1. With BERT of Google AI Language [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], initially
published in 2018 as a preprint on arXiv, which obtained
outstanding performance on multiple NLP tasks (such as sentiment
analysis, question answering, and sentence similarity), pre-training
with fine-tuning has become one of the most effective and widely
used methods for solving NLP problems. Compared to
word-level vectors (e.g., Word2Vec [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], released in 2013 and still quite
popular, GloVe [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], and FastText [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]), BERT trains sentence-level
vectors and captures more information from context. Before BERT,
other pre-trained general language representations had been
introduced. ELMo [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], which uses a bi-directional LSTM,
generalizes traditional word embedding research along a different
dimension by extracting context-sensitive features. OpenAI GPT
[
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] demonstrates that strong results can be obtained by
generative pre-training of a language model on a diverse corpus of
unlabeled text, followed by discriminative fine-tuning on each
specific task. ULMFiT [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] uses an LSTM and produces contextual
token representations; it is pre-trained on
unlabeled text and fine-tuned for a supervised downstream task.
Unlike these earlier models, BERT uses a bi-directional Transformer.
Transformers were introduced by Vaswani et al. [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. Subsequently, many
BERT-based models for natural language processing and
understanding have shown even better results than BERT.
ERNIE [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] is pre-trained by masking semantic units such
as entity concepts, rather than tokens. Liu et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] measure
the impact of many key hyperparameters and training data size
and present RoBERTa. Lan et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] present ALBERT, which
implements two parameter-reduction techniques to lower memory
consumption and increase the training speed of BERT. In this
paper, we use a BERT-based MRC method that allows us to
extract data points.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Human-in-the-Loop Machine Learning</title>
        <p>
          Deep learning, in particular when applied to
unstructured data, needs very large training sets to learn the
parameters, hyperparameters, and the desired models [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Therefore, despite the obvious
advantages of deep learning-based MRC systems, their use is
often limited to academic contexts where the performance of
MRC techniques is tested on artificial datasets. These datasets
poorly match the characteristics of the data of real business
contexts, such as the financial sector, where a complex language
with specialized terminology is used.
1https://medium.com/@thresholdvc/neurips-2019-entering-the-golden-age-ofnlp-c8f8e4116f9d
        </p>
        <p>
          To facilitate the learning of MRC models in the financial
domain, it is necessary to develop methods and interfaces for
human-in-the-loop machine learning. Using these tools, humans
can transfer domain knowledge to machines by annotating and
validating datasets and models that can be used in the learning
process. Currently, in the literature, there are some weakly
supervised machine learning methods and systems that allow for
creating annotated datasets from a human-driven perspective.
For example, Snorkel2 [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], based on the data programming
paradigm [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], is a recently proposed framework that enables users
to generate large volumes of training data by writing labeling
functions (such as rules and patterns) that capture domain
knowledge. Under data programming, such labeling functions
can vary in accuracy and coverage, and they may be arbitrarily
correlated. Other weakly supervised annotation tools include,
for instance, Prodigy3, Figure Eight4, and Amazon Mechanical
Turk5. These methods can use and be combined with: (i) transfer
learning [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] that exploits labeled data, parameters, or knowledge
available in other tasks to reduce the need for labeled data for the
specific new task, (ii) active learning [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] that selects data points
for human annotators to label, and (iii) reinforcement learning
[
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] that enables learning from feedback received through
interactions with an external environment. Such weakly supervised,
human-driven methods can facilitate the adoption of MRC methods in complex
domains, such as the financial one, in order to automate and
simplify the extraction and interrogation of data in various formats
from heterogeneous sources. For these reasons, in our approach, we
implement human-driven annotation methods.
        </p>
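The data programming paradigm behind tools such as Snorkel can be sketched in a few lines of Python (the labeling functions, label names, and voting rule below are illustrative assumptions, not the actual rules used by the system):

```python
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical labeling functions that encode domain knowledge as
# simple keyword rules; each one either votes a label or abstains.
def lf_mentions_default(text):
    return NEGATIVE if "default" in text.lower() else ABSTAIN

def lf_mentions_growth(text):
    return POSITIVE if "growth" in text.lower() else ABSTAIN

def weak_label(text, lfs):
    """Majority vote over the non-abstaining labeling functions."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

LFS = [lf_mentions_default, lf_mentions_growth]
```

In Snorkel proper, a generative label model replaces the naive majority vote, precisely to handle labeling functions of varying accuracy and correlation.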
      </sec>
    </sec>
    <sec id="sec-3">
      <title>COGNITIVE AUTOMATION APPROACH</title>
      <p>In this section, we present the proposed approach for
implementing cognitive automation in decisional and operational business
processes. The key steps of the presented approach are:
• Search documents, perform layout analysis, and classify them.
• Dynamically exploit the knowledge of users for training
and correcting the extraction algorithms, thus enabling
continuous learning.
• Extract answers to relevant questions concerning the
entities involved in business processes by exploiting
machines' capabilities to read and comprehend documents.
• Harmonize and store extracted information in knowledge
graphs.
• Explore the obtained information, and visualize synthetic and
easily interpretable charts.</p>
      <p>In the following, we describe the modules of the system shown in
Figure 1 that implements the proposed approach.</p>
      <sec id="sec-3-1">
        <title>Documents and Contents Gathering and Analysis</title>
        <p>
          This module allows, through specific connectors and methods for web
scraping and wrapping, the acquisition of heterogeneous
contents and documents from different information sources. In order
to obtain a machine-readable format, it processes image
documents by using optical character recognition (OCR) algorithms.
Then, it applies document layout analysis and understanding
algorithms, also based on spatial reasoning [
          <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
          ], to recognize
the structures of the documents (e.g., columns, sections, tables, lists
of records) and the reading order. Finally, the module enables
indexing of documents and their portions.
2Snorkel https://www.snorkel.org/
3Prodigy https://prodi.gy
4Figure Eight https://www.figure-eight.com
5Amazon Mechanical Turk https://www.mturk.com
        </p>
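The indexing step at the end of this pipeline can be pictured as a simple inverted index over document portions (a minimal stdlib sketch; the system's actual index is, of course, far richer):

```python
import re
from collections import defaultdict

def build_index(portions):
    """Map each term to the set of portion ids containing it."""
    index = defaultdict(set)
    for pid, text in portions.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(pid)
    return index

def search(index, query):
    """Return the ids of portions containing every query term."""
    term_sets = [index.get(t, set())
                 for t in re.findall(r"[a-z0-9]+", query.lower())]
    return set.intersection(*term_sets) if term_sets else set()
```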
        <p>Training sets Modeling. This module allows the
human-driven annotation of portions of documents that answer
specific questions, exploiting a semi-automatic, interactive, and
iterative process. This process involves the user by means of actions,
mainly visual and/or based on simple rules, aimed at creating
training sets for deep learning algorithms.</p>
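A training example produced by such annotation might resemble a SQuAD-style record; the field names below are an assumption based on common MRC datasets, not the system's internal format:

```python
def make_example(question, context, answer_text):
    """Build a SQuAD-style example, locating the answer span in
    the context by character offset."""
    start = context.find(answer_text)
    if start == -1:
        raise ValueError("answer not found in context")
    return {
        "question": question,
        "context": context,
        "answers": [{"text": answer_text, "answer_start": start}],
    }
```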
        <p>
          Machine Reading Comprehension (MRC). This module
allows for learning models that extract data from documents in
the form of answers to questions in natural language. It is
based on different components:
(i) Retriever, which selects a list of documents and portions
that are most likely to contain the answer to a question
obtained as input. It is implemented as a voting system
that considers different versions of matching (e.g., based
on Elasticsearch6, the DrQA [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] Reader that uses TF-IDF
features exploiting uni-grams and bi-grams, and S-Reader
[
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] that uses different embeddings and hyperparameters
with respect to DrQA).
(ii) Reader, which takes as input the question and the portions
chosen by the Retriever and outputs the most probable
answers it can find. This sub-module is based on a
pre-trained deep learning model, essentially a
PyTorch version of the well-known NLP model BERT [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ],
made available by Hugging Face7. To fine-tune
the model, the training sets created in the modeling phase are
exploited.
(iii) Selector, which compares the answers' scores obtained by
using an internal function and outputs the most likely
answer according to the scores.
(iv) A graphical user interface that enables the human-machine
interaction used to implement reinforcement learning. By
exploiting a graphical user interface that highlights
results on portions of documents, users validate and give
feedback to the deep learning algorithms, which learn and
improve performance by exploiting the user feedback.
6Elasticsearch https://www.elastic.co/
        </p>
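The Retriever's voting scheme can be sketched as follows (a hypothetical combination rule for illustration; the internal weighting actually used by the system is not detailed here):

```python
from collections import Counter

def vote(rankings, k=3):
    """Each retriever casts one vote per document in its top-k list;
    documents are returned by vote count, ties broken by id."""
    counts = Counter()
    for ranking in rankings:
        for doc_id in ranking[:k]:
            counts[doc_id] += 1
    return sorted(counts, key=lambda d: (-counts[d], d))
```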
        <p>Data Harmonization. This module enables the manipulation
of data in a scalable way by using workflows based on Spark8.
Workflows enable users to visually create complex processes
that allow for gathering and processing data, performing data
analysis, and storing results in knowledge graphs, simply by
combining and concatenating blocks. A block embeds algorithms that
implement a specific task, for instance, the learned model for
extracting data points, or descriptive, predictive, and prescriptive
analytics. For the same task, different blocks that embed different
logics (e.g., various ways to collect data depending on the formats
of the sources) can be used.
7Hugging Face Transformers https://github.com/huggingface/transformers
8Spark https://spark.apache.org/</p>
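The block-and-workflow idea reduces to function composition (a sketch only; the block names and logic below are illustrative, and the real system executes such workflows on Spark):

```python
import re

# Illustrative blocks: each takes the previous block's output and
# returns the data consumed by the next one.
def gather(_):
    return ["Total revenues were 5M", "Net debt was 2M"]

def extract_millions(texts):
    return [int(m) for t in texts for m in re.findall(r"(\d+)M", t)]

def total(values):
    return sum(values)

def run_workflow(blocks, data=None):
    """Concatenate blocks: feed each block's output to the next."""
    for block in blocks:
        data = block(data)
    return data
```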
        <p>Data Storage. The obtained results, including answers and
metadata (e.g., the paragraphs where the answer was found and the
title of the document), are stored in knowledge graphs (KGs).
The current implementation of KGs is based on a multi-structured
database that combines information retrieval capabilities with
the storage capabilities of graph databases.</p>
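Storing answers together with their provenance metadata can be pictured as a toy in-memory triple store (the multi-structured database actually used is not shown here; names and fields are illustrative):

```python
class KnowledgeGraph:
    """Minimal triple store: (subject, predicate, object) plus
    arbitrary provenance metadata per triple."""

    def __init__(self):
        self.triples = []

    def add(self, subject, predicate, obj, **metadata):
        self.triples.append((subject, predicate, obj, metadata))

    def query(self, subject=None, predicate=None):
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (predicate is None or t[1] == predicate)]
```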
        <p>Data Exploration and Visualization. Results can be
explored through application programming interfaces (APIs) that
allow integration with external applications, and they can be
displayed in reports, dashboards, and presentations that
visually track, analyze, and show key performance indicators (KPIs),
metrics, and key data points.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>SMART LENDING AND EARLY WARNING APPLICATION</title>
      <p>The rapid development of web content and the dissemination
of information through social networks, blogs, and newspapers
has had an ever-increasing impact on the financial domain. How to
rapidly and accurately mine key information from big data is
a challenging research problem, and it has become
one of the key issues for investors and decision-makers. Indeed,
the ability to automatically answer business questions enables
cognitive automation in decisional and operational business
processes in different Corporate and Investment Bank (CIB) areas.
CIBs need to decide whether it is convenient to grant a loan to a
company, to know the risk conditions of their customer portfolios,
and to develop customized marketing and sales strategies. To
this end, CIBs need deep knowledge and a
careful evaluation of:
(i) corporate customers (e.g., board members, the
environmental impact of the business, how they are
perceived, the solidity of their business),
(ii) the markets in which their customers are located and
operate (e.g., the solidity of the market, information about used
commodities, competitors).</p>
      <p>In practical terms, CIBs asked for a system capable of
automatically: (i) answering specific questions asked in natural language,
i.e., extracting data points, and (ii) visualizing queryable and navigable
customer profiles that can be used for credit scoring, early warning,
and marketing and sales activities.</p>
      <p>In the following, we describe our solution, which is based on the
approach presented in the previous Section 3.</p>
    </sec>
    <sec id="sec-6">
      <title>Documents and Contents Gathering and Analysis</title>
      <p>Financial operators search for answers that can be obtained or
inferred by reading and studying, even simultaneously, various
information sources, such as financial documents (e.g., annual
reports, 10-k forms, sustainability reports, notes to balance sheets),
as well as web sources (e.g., news, blogs, social media). Examples
of required answers (i.e., data points) are: the perception of a
corporate brand on social media (customer brand perception),
the geographical distribution of a company’s debts, credits, and
revenues, and the volume of R&amp;D investments.</p>
      <p>For the specific application, the proposed approach enables
extracting interesting data points in a scalable way from a huge
number of web sources related to a large number of companies.
Data points enrich different aspects of customer profiles, such
as Environment, Society, Governance (ESG) knowledge, which,
for instance, can be used by credit scoring algorithms. The
implemented web scraping tools are used to download news and
financial documents from the websites of companies or from the SEC
(Securities and Exchange Commission) website, and to collect
information and reviews from booking websites. These tools are
flexible, easily configurable, and maintainable. More in detail, the
web scraping process consists of the definition of a configuration
file for each different typology of website to scrape. The wrapper
uses DOM information and XPath along with similarities between
different websites, reducing the work needed to design wrappers.
In addition, the kind of data/information to extract from the
websites can be defined by a data model to fill. For instance, in
order to collect information about restaurants and hotels, the
scraping tools navigate booking websites extracting reviews and
attributes such as authors, title, date, and all relevant info. In
order to collect PDF documents (e.g., annual reports, 10-K forms,
notes to balance sheets), the scraping tools navigate the
companies' websites searching the investor relations and press
release sections. Alternatively, the scraping tools download documents
from financial document providers like the SEC. Downloaded PDF
files are processed by using document layout analysis algorithms,
even exploiting optical character recognition (OCR) techniques
when needed, to extract portions (such as columns, paragraphs,
tables, notes). Then, the different portions of documents (along
with their relations, reading order information, links to the
original document, and metadata) are stored in knowledge graphs
and indexed in the system for further elaboration.</p>
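The configuration-driven wrapping step can be sketched as a mapping from target fields of the data model to paths in a parsed page; here a nested dict stands in for the DOM/XPath machinery, which is an illustrative simplification:

```python
def extract_record(page, config):
    """Fill a record by following a slash-separated path for each
    field; missing paths yield None."""
    record = {}
    for field, path in config.items():
        node = page
        for step in path.split("/"):
            node = node.get(step, {}) if isinstance(node, dict) else {}
        record[field] = node if node != {} else None
    return record
```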
    </sec>
    <sec id="sec-8">
      <title>Training sets Modeling and MRC</title>
      <p>During the training set modeling phase, a user can define labeling
functions or visually annotate label-entity or question-answer
pairs looking at input documents and information stored in the
system. Figure 2 shows the graphical user interface that aids the
creation of labeling functions.
Labeling functions can exploit syntactic, spatial, and ontological information. In addition, they
can use: (i) built-in functions that call machine learning procedures or
complex algorithms used as black boxes, (ii) functions and
concepts defined in other imported labeling files. The editor provides
some facilities to simplify the writing of labeling functions,
exploiting relationships between labels and values, titles and paragraphs, or
images and captions, table structures, and grammatical relationships
like subject-verb-object (fact). In the upper right part of the
interface, taxonomies of the desired concepts to label are visualized.
The GUI also shows the chosen PDF files used to visually
evaluate the results of the executed labeling functions. Result details
(attributes of the labeled concepts) can be visualized in the lower
right-hand corner of the interface. In Figure 2, the labeling
functions annotate revenues in a financial statement.</p>
      <p>In addition, as shown in Figure 3, the GUI also enables users to
visually annotate texts, for instance, to assign labels or to select
answers to questions in the documents.</p>
      <p>The created training sets are then exploited within machine/deep
learning algorithms, as described in Section 3.</p>
    </sec>
    <sec id="sec-9">
      <title>Data Exploration and Visualization</title>
      <p>Banks are interested in creating reports, dashboards, and
presentations to visualize customer profiles. In the following, we show
some examples of dashboards and PowerPoint slides obtained by
analyzing extracted data points related to a target company and
considering peer companies used for benchmarking.</p>
      <p>Figure 5 shows a comparison of the main financial data of the
selected target client (e.g., revenue growth, EBITDA margin and
growth, and net debt-to-EBITDA ratio) with the mean values
of the benchmarking companies (peers in the same industry as the
target company). Target companies and peers can be dynamically
selected to see real-time updates of the charts.</p>
    </sec>
    <sec id="sec-10">
      <title>Data Harmonization and Storage</title>
      <p>To scale up KPI extraction, a workflow can be designed,
deployed in the cloud, and executed in a parallel and scheduled way.
In Figure 4, the designed workflow enables searching for and extracting
text portions from PDF documents related to the target questions.</p>
      <p>In the shown example, we are interested in extracting data
points from balance sheets related to financial information (e.g.,
customer, industry, year, financial costs, commodities prices, total
revenues, EBIT, EBITDA) of more than 3000 companies. The extracted
information is saved in knowledge graphs and can be provided in
different formats selected by the customer (e.g., CSV, Excel, or
JSON).</p>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>In this paper, we presented a cognitive automation approach and
the related system, along with a financial application, that
enable CIBs to automatically: (i) extract data points from textual
data sources, and (ii) visualize dashboards and presentations
containing customers' data and comparisons between customers
and their peers. The greater wealth and depth of information
on risks and opportunities improve the ability to manage
lending processes, provide real-time early warnings, and help sales
activities. In particular, the implemented solution enables: (i)
automatic, faster, and predictive credit/risk scoring (customer
qualification); (ii) digitalization of lending processes (loan
underwriting); (iii) smarter and more effective early warnings;
(iv) reduction of losses due to unforeseen defaults. In this way,
different areas of banks can benefit from developing
customized marketing and sales strategies, as well as building efficient
and effective lending processes, based on a deep knowledge of
corporate customers and the markets in which they operate.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          , Edouard Grave, Armand Joulin, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          ),
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Danqi</given-names>
            <surname>Chen</surname>
          </string-name>
          , Adam Fisch, Jason Weston, and
          <string-name>
            <given-names>Antoine</given-names>
            <surname>Bordes</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Reading wikipedia to answer open-domain questions</article-title>
          .
          <source>ICLR</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Hsinchun</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Roger H. L.</given-names>
            <surname>Chiang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Veda C.</given-names>
            <surname>Storey</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Business Intelligence and Analytics: From Big Data to Big Impact</article-title>
          .
          <source>MIS Quarterly</source>
          <volume>36</volume>
          (
          <year>2012</year>
          ),
          <fpage>1165</fpage>
          -
          <lpage>1188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Clark</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matt</given-names>
            <surname>Gardner</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Simple and Effective Multi-Paragraph Reading Comprehension</article-title>
          . In ACL.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          .
          <source>In NAACL-HLT.</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Dunn</surname>
          </string-name>
          , Levent Sagun,
          <string-name>
            <given-names>Mike</given-names>
            <surname>Higgins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V Ugur</given-names>
            <surname>Guney</surname>
          </string-name>
          , Volkan Cirik, and
          <string-name>
            <given-names>Kyunghyun</given-names>
            <surname>Cho</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>SearchQA: A new Q&amp;A dataset augmented with context from a search engine</article-title>
          .
          <source>arXiv preprint arXiv:1704.05179</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Jeremy</given-names>
            <surname>Howard</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Ruder</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Universal Language Model Fine-tuning for Text Classification</article-title>
          .
          <source>In ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Lifu</given-names>
            <surname>Huang</surname>
          </string-name>
          , Ronan Le Bras, Chandra Bhagavatula, and
          <string-name>
            <given-names>Yejin</given-names>
            <surname>Choi</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning</article-title>
          . In EMNLP/IJCNLP.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Mandar</given-names>
            <surname>Joshi</surname>
          </string-name>
          , Eunsol Choi, Daniel S. Weld, and
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension</article-title>
          . In ACL.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ankit</given-names>
            <surname>Kumar</surname>
          </string-name>
          , Ozan Irsoy,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Ondruska</surname>
          </string-name>
          , Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher.
          <year>2016</year>
          .
          <article-title>Ask me anything: Dynamic memory networks for natural language processing</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          .
          <fpage>1378</fpage>
          -
          <lpage>1387</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Zhen-Zhong</given-names>
            <surname>Lan</surname>
          </string-name>
          , Mingda Chen,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Goodman</surname>
          </string-name>
          , Kevin Gimpel, Piyush Sharma, and
          <string-name>
            <given-names>Radu</given-names>
            <surname>Soricut</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>ALBERT: A Lite BERT for Self-supervised Learning of Language Representations</article-title>
          .
          <source>arXiv preprint arXiv:1909.11942</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Yinhan</given-names>
            <surname>Liu</surname>
          </string-name>
          , Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen,
          <string-name>
            <given-names>Omer</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mike</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Veselin</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>RoBERTa: A Robustly Optimized BERT Pretraining Approach</article-title>
          .
          <source>arXiv preprint arXiv:1907.11692</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Kai Chen, Greg Corrado, and
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Sewon</given-names>
            <surname>Min</surname>
          </string-name>
          , Victor Zhong, Richard Socher, and
          <string-name>
            <given-names>Caiming</given-names>
            <surname>Xiong</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Efficient and Robust Question Answering from Minimal Context over Documents</article-title>
          .
          <source>arXiv preprint arXiv:1805.08092</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Ermelinda</given-names>
            <surname>Oro</surname>
          </string-name>
          and
          <string-name>
            <given-names>Massimo</given-names>
            <surname>Rufolo</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A Method for Web Content Extraction and Analysis in the Tourism Domain</article-title>
          .
          <source>In International Conference on Enterprise Information Systems</source>
          , Vol.
          <volume>2</volume>
          . SCITEPRESS,
          <fpage>365</fpage>
          -
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Ermelinda</given-names>
            <surname>Oro</surname>
          </string-name>
          and
          <string-name>
            <given-names>Massimo</given-names>
            <surname>Rufolo</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Object extraction from presentation-oriented documents using a semantic and spatial approach</article-title>
          .
          <source>US Patent 9,582,494</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Sinno Jialin</given-names>
            <surname>Pan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Qiang</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>A Survey on Transfer Learning</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>22</volume>
          (
          <year>2010</year>
          ),
          <fpage>1345</fpage>
          -
          <lpage>1359</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Pennington</surname>
          </string-name>
          , Richard Socher, and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          .
          <source>In EMNLP</source>
          , Vol.
          <volume>14</volume>
          .
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Matthew E.</given-names>
            <surname>Peters</surname>
          </string-name>
          , Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Deep contextualized word representations</article-title>
          .
          <source>arXiv preprint arXiv:1802.05365</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Junfei</given-names>
            <surname>Qiu</surname>
          </string-name>
          , Qihui Wu, Guoru Ding, Yuhua Xu, and
          <string-name>
            <given-names>Shuo</given-names>
            <surname>Feng</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A survey of machine learning for big data processing</article-title>
          .
          <source>EURASIP Journal on Advances in Signal Processing</source>
          <year>2016</year>
          ,
          <volume>1</volume>
          (
          <year>2016</year>
          ),
          <fpage>67</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Alec</given-names>
            <surname>Radford</surname>
          </string-name>
          , Karthik Narasimhan, Tim Salimans, and
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Improving language understanding by generative pre-training</article-title>
          .
          <source>URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/languageunsupervised/language_understanding_paper.pdf</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Pranav</given-names>
            <surname>Rajpurkar</surname>
          </string-name>
          , Jian Zhang, Konstantin Lopyrev, and
          <string-name>
            <given-names>Percy</given-names>
            <surname>Liang</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>SQuAD: 100,000+ Questions for Machine Comprehension of Text</article-title>
          . In EMNLP.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Ratner</surname>
          </string-name>
          , Stephen H. Bach, Henry Ehrenberg, Jason Fries,
          <string-name>
            <given-names>Sen</given-names>
            <surname>Wu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Ré</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Snorkel: Rapid training data creation with weak supervision</article-title>
          .
          <source>Proceedings of the VLDB Endowment</source>
          <volume>11</volume>
          ,
          <issue>3</issue>
          (
          <year>2017</year>
          ),
          <fpage>269</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Alexander J.</given-names>
            <surname>Ratner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christopher M.</given-names>
            <surname>De Sa</surname>
          </string-name>
          , Sen Wu, Daniel Selsam, and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Ré</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Data programming: Creating large training sets, quickly</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          .
          <fpage>3567</fpage>
          -
          <lpage>3575</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Burr</given-names>
            <surname>Settles</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Active Learning Literature Survey</article-title>
          .
          <source>Technical Report</source>
          . University of Wisconsin-Madison Department of Computer Sciences.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Trischler</surname>
          </string-name>
          , Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and
          <string-name>
            <given-names>Kaheer</given-names>
            <surname>Suleman</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>NewsQA: A Machine Comprehension Dataset</article-title>
          . In Rep4NLP@ACL.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Ashish</surname>
            <given-names>Vaswani</given-names>
          </string-name>
          , Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
          <string-name>
            <given-names>Aidan N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , Lukasz Kaiser, and
          <string-name>
            <given-names>Illia</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Attention is All you Need</article-title>
          .
          <source>In NIPS.</source>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Zhilin</given-names>
            <surname>Yang</surname>
          </string-name>
          , Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering</article-title>
          .
          <source>In EMNLP.</source>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Adams Wei</given-names>
            <surname>Yu</surname>
          </string-name>
          , David Dohan,
          <string-name>
            <given-names>Minh-Thang</given-names>
            <surname>Luong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rui</given-names>
            <surname>Zhao</surname>
          </string-name>
          , Kai Chen, Mohammad Norouzi, and
          <string-name>
            <given-names>Quoc V.</given-names>
            <surname>Le</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension</article-title>
          .
          <source>arXiv preprint arXiv:1804.09541</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Sheng</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Kevin Duh, and Benjamin Van Durme.
          <year>2018</year>
          .
          <article-title>ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension</article-title>
          .
          <source>arXiv preprint arXiv:1810.12885</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Zhengyan</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu.
          <year>2019</year>
          .
          <article-title>ERNIE: Enhanced Language Representation with Informative Entities</article-title>
          .
          <source>In ACL.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>