A Cognitive Automation Approach for a Smart Lending and Early Warning Application

[Industrial and Application paper]

Ermelinda Oro
High Performance Computing and Networking Institute of the National Research Council; Altilia.ai
Rende (CS), Italy
linda.oro@icar.cnr.it

Massimo Ruffolo
High Performance Computing and Networking Institute of the National Research Council; Altilia.ai
Rende (CS), Italy
massimo.ruffolo@icar.cnr.it

Fausto Pupo
Altilia.ai
Rende (CS), Italy
fausto.pupo@altiliagroup.com

ABSTRACT
The rapid development of the Internet and the dissemination of information and documents through a myriad of heterogeneous data sources is having an ever-increasing impact on the financial domain. Corporate and Investment Banks (CIBs) need to improve and automate business and decision-making processes, simplifying the way they access data sources to obtain alternative data and answers. Manual or traditional approaches to data gathering are not sufficient to effectively and efficiently exploit the information contained in all available data sources and represent a bottleneck to process automation. This paper presents a cognitive automation approach that makes use of Artificial Intelligence (AI) algorithms for automatically and efficiently searching, reading, and understanding documents and content intended for humans. The paper also presents the system that implements the proposed approach through an application in the area of financial risk evaluation and lending automation. The presented approach allows CIBs to obtain answers and analyses useful to improve the ability of different bank areas to manage lending processes, forecast situations involving risks, facilitate lead generation, and develop customized marketing and sales strategies.

KEYWORDS
Augmented Intelligence, Machine Reading Comprehension, Question Answering, Cognitive Automation, Heterogeneous Data, Financial Services, Smart Lending, Early Warning, Information Extraction, Natural Language Processing, Document Layout Analysis.

© 2020 Copyright for this paper by its author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen, Denmark) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Financial organizations are constantly looking for innovative ways to generate opportunities, automate and optimize business and decision-making processes, reduce risks, and mitigate adverse events. In order to build long-term partnerships with their customers, Corporate and Investment Banks (CIBs) need to develop customized marketing and sales strategies and, at the same time, manage financial risks, based on a deep knowledge of corporate customers and the markets in which they operate. The answers to CIBs' questions about the entities involved in business and decision-making processes must be sought within a myriad of heterogeneous data sources. Financial markets change rapidly, therefore CIBs need to quickly process big data available in both traditional data sources (such as financial statements) and alternative sources of information (such as social and online media, corporate websites, and online financial document repositories).

Traditional approaches are not sufficient to effectively and efficiently exploit the information contained in these sources. Indeed, the ability to select, collect, analyze, and interpret big data requires Artificial Intelligence (AI) algorithms capable of automatically and efficiently searching, reading, and understanding documents and content designed for humans.

In this paper, we present a cognitive automation approach and the related system, along with a financial application, that automate and simplify business and decision-making tasks and processes requiring human cognitive abilities. The presented cognitive automation approach allows CIBs to obtain answers and analyses useful to improve the ability of different bank areas to manage lending processes, forecast situations involving risks, facilitate lead generation, and optimize sales activities. Examples of required answers, alternatively referred to as data points in this paper, are: entities and the relationships between them, yes/no answers, sentiments, perceptions, and opinions.

The rest of the paper is organized as follows: Section 2 describes related work useful to comprehend the modules of the proposed system. Section 3 introduces the proposed approach and the related system. Section 4 presents how we address the needs of CIBs by implementing a smart lending and early warning application. Finally, Section 5 concludes the work.

2 RELATED WORK
The proposed approach and system rely on strong capabilities in machine reading comprehension (MRC) that exploit pre-trained language models and human-in-the-loop machine learning. In this section, we briefly review related work regarding these main aspects.

Machine Reading Comprehension. Machine Reading Comprehension (MRC) is the ability to answer questions asked in natural language by automatically reading texts. The objective is to greatly simplify the way in which humans interrogate large volumes of information sources [3]. MRC is related to Natural Language Processing (NLP) and, more specifically, to Natural Language Understanding (NLU), which refers to the ability of machines to understand natural language. NLU is considered an AI-hard problem, and all its activities can be framed within an MRC framework [10]. MRC allows for exploring many aspects of language understanding simply by posing questions. MRC can also be seen as an extended task of question answering (QA).

Recently, MRC methods have attracted a lot of attention among researchers around the world. Indeed, many new datasets for reading comprehension have been developed in recent years, such as SQuAD [22], NewsQA [26], SearchQA [6], TriviaQA [9], HotpotQA [28] (which requires multi-hop reasoning over paragraphs), and ReCoRD [30] and Cosmos QA [8], which are designed for challenging reading comprehension with commonsense reasoning. However, these datasets mainly concern the understanding of general text, and they are not related to specific knowledge domains. With deep learning (DL), end-to-end models have produced promising results on some MRC tasks. Unlike traditional machine learning, these models do not need complex feature engineering. Deep learning techniques for MRC have achieved very high performance on large standard datasets in general domains [4, 14, 29] and, more recently, big successes have been obtained with approaches based on pre-trained language models.

Pre-trained Language Models. We are entering the "Golden Age of NLP" (https://medium.com/@thresholdvc/neurips-2019-entering-the-golden-age-of-nlp-c8f8e4116f9d). With BERT by Google AI Language [5], initially published in 2018 as an e-print on arXiv, which obtained outstanding performance on multiple NLP tasks (such as sentiment analysis, question answering, and sentence similarity), pre-training with fine-tuning has become one of the most effective and widely used methods for solving NLP problems. Compared to word-level vectors (e.g., Word2Vec [13], released in 2013 and still quite popular, GloVe [18], and FastText [1]), BERT trains sentence-level vectors and captures more information from context. Before BERT, other pre-trained general language representations had been introduced. ELMo [19], which uses a bi-directional LSTM, generalizes traditional word embedding research along a different dimension by extracting context-sensitive features. OpenAI GPT [21] demonstrates that strong results can be obtained by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. ULMFiT [7] uses an LSTM and produces contextual token representations; it has been pre-trained on unlabeled text and fine-tuned for supervised downstream tasks. Unlike previous work, BERT uses a bi-directional Transformer. Transformers were introduced by Vaswani et al. [27]. Since then, many BERT-based approaches to natural language processing and understanding have shown even better results than BERT. The ERNIE model [31] is pre-trained by masking semantic units such as entity concepts, rather than tokens. Liu et al. [12] measure the impact of many key hyperparameters and training data sizes and present RoBERTa. Lan et al. [11] present ALBERT, which implements two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. In this paper, we use a BERT-based MRC method that allows us to extract data points.

Human-in-the-Loop Machine Learning. Deep learning, in particular when applied to unstructured data, needs very large training sets to learn parameters, hyperparameters, and the desired models [22]. Therefore, despite the obvious advantages of deep learning-based MRC systems, their use is often limited to academic contexts where the performance of MRC techniques is tested on artificial datasets. These datasets poorly match the characteristics of the data of real business contexts, such as the financial sector, where a complex language with specialized terminology is used.

To facilitate the learning of MRC models in the financial domain, it is necessary to develop methods and interfaces for human-in-the-loop machine learning. Using these tools, humans can transfer domain knowledge to machines by annotating and validating datasets and models that can be used in the learning process. Currently, the literature offers some weakly supervised machine learning methods and systems that allow for creating annotated datasets from a human-driven perspective. For example, Snorkel (https://www.snorkel.org/) [23], based on the data programming paradigm [24], is a recently proposed framework that enables users to generate large volumes of training data by writing labeling functions (such as rules and patterns) that capture domain knowledge. Under data programming, such labeling functions can vary in accuracy and coverage, and they may be arbitrarily correlated. Other weakly supervised machine learning tools are, for instance, Prodigy (https://prodi.gy), Figure Eight (https://www.figure-eight.com), and Amazon Mechanical Turk (https://www.mturk.com). These methods can use and be combined with: (i) transfer learning [17], which exploits labeled data, parameters, or knowledge available from other tasks to reduce the need for labeled data on a specific new task; (ii) active learning [25], which selects data points for human annotators to label; and (iii) reinforcement learning [20], which enables learning from feedback received through interactions with an external environment. Weakly supervised, human-driven methods can facilitate the adoption of MRC methods in complex domains, such as the financial one, in order to automate and simplify the extraction and interrogation of data of various formats in heterogeneous sources. For these reasons, in our approach, we implement human-driven annotation methods.

3 COGNITIVE AUTOMATION APPROACH
In this section, we present the proposed approach, which is useful to implement cognitive automation in decisional and operational business processes. The key steps of the presented approach are:
• Search, perform layout analysis on, and classify documents.
• Dynamically exploit the knowledge of users for training and correcting the extraction algorithms, thus enabling continuous learning.
• Extract answers to relevant questions concerning the entities involved in business processes by exploiting machine capabilities for reading and comprehending documents.
• Harmonize and store extracted information in knowledge graphs.
• Explore the obtained information, and visualize synthetic and easily interpretable charts.

In the following, we describe the modules of the system, shown in Figure 1, that implements the proposed approach.

Figure 1: Cognitive Automation System.

Documents and Contents Gathering and Analysis. This module allows, through specific connectors and methods for web scraping and wrapping, the acquisition of heterogeneous contents and documents from different information sources. To obtain a machine-readable format, it processes image documents using optical character recognition (OCR) algorithms. Then, it applies document layout analysis and understanding algorithms, also based on spatial reasoning [15, 16], to recognize the structures of documents (e.g., columns, sections, tables, lists of records) and the reading order. Finally, the module enables the indexing of documents and their portions.

Training Sets Modeling. This module allows the human-driven annotation of portions of documents that answer specific questions, exploiting a semi-automatic, interactive, and iterative process. This process involves the user by means of actions, mainly visual and/or based on simple rules, aimed at creating training sets for deep learning algorithms.

Machine Reading Comprehension (MRC). This module allows for learning models that extract data from documents in the form of answers to questions in natural language. It is based on different components:
(i) A Retriever that selects the list of documents and portions that are most likely to contain the answer to a question given as input. It is implemented as a voting system that considers different versions of matching (e.g., matching based on Elasticsearch (https://www.elastic.co/), the DrQA [2] retriever that uses TF-IDF features exploiting uni-grams and bi-grams, and S-Reader [14], which uses different embeddings and hyperparameters with respect to DrQA).
(ii) A Reader that takes as input the question and the portions chosen by the Retriever and outputs the most probable answers it can find. This sub-module is based on a pre-trained deep learning model, essentially a PyTorch version of the well-known NLP model BERT [5], made available by Hugging Face (https://github.com/huggingface/transformers). To fine-tune the model, the training sets created in the modeling phase are exploited.
(iii) A Selector that compares the answers' scores, obtained by using an internal function, and outputs the most likely answer according to the scores.
(iv) A graphical user interface that enables the human-machine interaction used to implement reinforcement learning. By exploiting a graphical user interface that highlights results on portions of documents, users validate and give feedback to the deep learning algorithms, which learn and improve their performance by exploiting the user feedback.

Data Harmonization. This module enables the scalable manipulation of data by using workflows based on Spark (https://spark.apache.org/). Workflows enable users to visually create complex processes that gather and process data, perform data analysis, and store results in knowledge graphs, simply by combining and concatenating blocks. A block embeds algorithms that implement a specific task, for instance, the learned model for extracting data points, or descriptive, predictive, and prescriptive analytics. For the same task, different blocks that embed different logics (e.g., various ways to collect data depending on the formats of the sources) can be used.

Data Storage. The obtained results, including answers and metadata (e.g., the paragraph where an answer was found and the title of the document), are stored in knowledge graphs (KGs). The current implementation of KGs is based on a multi-structured database that combines information retrieval capabilities with the ability to store data as graph databases.

Data Exploration and Visualization. Results can be explored through application programming interfaces (APIs) that allow integration with external applications, and they can be displayed in reports, dashboards, and presentations that visually track, analyze, and show key performance indicators (KPIs), metrics, and key data points.

4 SMART LENDING AND EARLY WARNING APPLICATION
The rapid development of web content and the dissemination of information through social networks, blogs, and newspapers have had an ever-increasing impact on the financial domain. How to rapidly and accurately mine key information from big data is a challenging research problem, and it has become one of the key issues for investors and decision-makers. Indeed, the ability to automatically answer business questions enables cognitive automation in the decisional and operational business processes of different Corporate and Investment Bank (CIB) areas. CIBs need to decide whether it is convenient to grant a loan to a company, to know the risk conditions of their customer portfolios, and to develop customized marketing and sales strategies. To this end, CIBs need deep knowledge and a careful evaluation of:
(i) corporate customers (e.g., board members, the environmental impact of the business, how they are perceived, the solidity of their business),
(ii) the markets in which their customers are located and operate (e.g., the solidity of the market, information about used commodities, competitors).

In practical terms, CIBs asked for a system capable of automatically: (i) answering specific questions asked in natural language, i.e., extracting data points, and (ii) visualizing queryable and navigable customer profiles that can be used for credit scoring, early warning, and marketing and sales activities. In the following, we describe our solution, which is based on the approach presented in the previous Section 3.

4.1 Documents and Contents Gathering and Analysis
Financial operators search for answers that can be obtained or inferred by reading and studying, even simultaneously, various information sources, such as financial documents (e.g., annual reports, 10-K forms, sustainability reports, notes to balance sheets), as well as web sources (e.g., news, blogs, social media). Examples of required answers (i.e., data points) are: the perception of a corporate brand on social media (customer brand perception), the geographical distribution of a company's debts, credits, and revenues, and the volume of R&D investments.

For this specific application, the proposed approach enables the extraction of interesting data points, in a scalable way, from a huge number of web sources related to a large number of companies. Data points enrich different aspects of customer profiles, such as Environmental, Social, and Governance (ESG) knowledge, which, for instance, can be used by credit scoring algorithms. The implemented web scraping tools are used to download news and financial documents from the websites of companies or from the SEC (Securities and Exchange Commission) website, and to collect information and reviews from booking websites. These tools are flexible, easily configurable, and maintainable. More in detail, the web scraping process consists of the definition of a configuration file for each typology of website to scrape. The wrapper uses DOM information and XPath, along with similarities between different websites, reducing the work needed to design wrappers. In addition, the kind of data/information to extract from the websites can be defined by a data model to fill. For instance, to collect information about restaurants and hotels, the scraping tools navigate booking websites, extracting reviews and attributes such as authors, titles, dates, and other relevant information. To collect PDF documents (e.g., annual reports, 10-K forms, notes to balance sheets), the scraping tools navigate companies' websites, searching the investor relations and press release sections. Alternatively, the scraping tools download documents from financial document providers like the SEC (https://www.sec.gov/edgar/searchedgar/companysearch.html). Downloaded PDF files are processed using document layout analysis algorithms, exploiting optical character recognition (OCR) techniques when needed, to extract portions (such as columns, paragraphs, tables, and notes). Then, the different portions of documents (along with their relations, reading-order information, links to the original documents, and metadata) are stored in knowledge graphs and indexed in the system for further elaboration.

4.2 Training Sets Modeling and MRC
During the training set modeling phase, a user can define labeling functions or visually annotate label-entity or question-answer pairs by looking at input documents and information stored in the system. Figure 2 shows the graphical user interface that aids the creation of labeling functions.

Figure 2: Labeling Functions GUI.

In the left part of the interface, the editor for defining labeling functions is displayed. These functions can exploit different kinds of syntactic, spatial, and ontological information. In addition, they can use: (i) built-ins that call machine learning procedures or complex algorithms used as black boxes, and (ii) functions and concepts defined in other imported labeling files. The editor provides some facilities to simplify the writing of labeling functions, exploiting relationships between labels and values, titles and paragraphs, or images and captions, table structures, and grammatical relationships such as subject-verb-object (facts). In the upper right part of the interface, taxonomies of the desired concepts to label are visualized. The GUI also shows the chosen PDF files used to visually evaluate the results of the executed labeling functions. Result details (attributes of the labeled concepts) can be visualized in the lower right-hand corner of the interface. In Figure 2, the labeling functions annotate revenues in a financial statement.

In addition, as shown in Figure 3, the GUI also enables users to visually annotate texts, for instance, to assign labels or to select the answers to questions in the documents.

Figure 3: A visual annotation of a concept related to the financial domain.

The created training sets are exploited within machine/deep learning algorithms, as described in Section 3.

4.3 Data Harmonization and Storage
To scale up KPI extraction, a workflow can be designed, deployed in the cloud, and executed in a parallel and scheduled way. In Figure 4, the designed workflow enables searching and extracting text portions from PDF documents related to the target questions.

Figure 4: Workflow to scale-up KPIs extraction.

In the shown example, we are interested in extracting data points from balance sheets related to financial information (e.g., customer, industry, year, financial costs, commodity prices, total revenues, EBIT, EBITDA) of more than 3000 companies. The extracted information is saved in knowledge graphs and can be provided in different formats selected by the customer (e.g., CSV, Excel, or JSON).

4.4 Data Exploration and Visualization
Banks are interested in creating reports, dashboards, and presentations to visualize customer profiles. In the following, we show some examples of dashboards and PowerPoint slides obtained by analyzing extracted data points related to a target company and considering peer companies used for benchmarking.

Figure 5 shows a comparison of the main financial data of the selected target client (e.g., revenue growth, EBITDA margin and growth, and net debt-to-EBITDA ratio) with the mean values of benchmarking companies (peers in the same industry as the target company). Target companies and peers can be dynamically selected to see real-time updates of the charts.

Figure 5: A comparison of main financial data between the target company and peers.

Figure 6 shows the exposure of a customer to financial risks in the form of a presentation slide exported by the system. In detail, the figure shows the exposure to the risk of interest rate changes on loans, to the risk of forex rate changes, and to the risk of commodity price variations, also making comparisons with peers (benchmark percentages). Figure 7 shows a deep dive on the foreign activities (forex risk) of the selected company (e.g., revenues and credits/debits by country, non-euro revenues as a percentage of the total) and the forex derivatives it already has in its portfolio (i.e., a derivatives usage table split by type of instrument).
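To make the Retriever component of the MRC module more concrete, the following minimal sketch ranks document portions against a question using TF-IDF scores over uni-grams and bi-grams, in the spirit of the DrQA-style matching mentioned in Section 3. All texts, function names, and the exact scoring scheme are illustrative assumptions, not the system's actual implementation.

```python
# Illustrative sketch of a TF-IDF uni-/bi-gram retriever (assumed scoring
# scheme; not the system's actual code). Given a question, it ranks document
# portions so that a downstream reader model can inspect the best ones first.
import math
import re
from collections import Counter

def ngrams(text, n_max=2):
    """Lowercased uni-grams and bi-grams of a text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    grams = list(tokens)
    for n in range(2, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

def tfidf_rank(question, portions):
    """Return portion indices sorted from most to least relevant."""
    docs = [Counter(ngrams(p)) for p in portions]
    n_docs = len(docs)
    df = Counter()                      # document frequency of each n-gram
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        score = 0.0
        for g in set(ngrams(question)):
            if g in d:
                # Smoothed IDF; rare n-grams weigh more than common ones.
                idf = math.log((1 + n_docs) / (1 + df[g])) + 1.0
                score += d[g] * idf
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: -scores[i])

# Toy portions standing in for extracted balance-sheet paragraphs.
portions = [
    "The board of directors approved the dividend policy.",
    "Total revenues for 2019 amounted to 4.2 billion euros.",
    "Research and development expenses were 310 million euros.",
]
ranking = tfidf_rank("What were the total revenues?", portions)
print(portions[ranking[0]])  # the portion most likely to contain the answer
```

In the full pipeline described above, the top-ranked portions would then be passed to the BERT-based Reader, and the Selector would compare the answer scores returned for each portion.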
Figure 6: Exposure of a company to financial risks.

Figure 7: Deep dive on the forex risk of a selected company.

5 CONCLUSION
In this paper, we presented a cognitive automation approach and the related system, along with a financial application, that enables CIBs to automatically: (i) extract data points from textual data sources, and (ii) visualize dashboards and presentations containing customers' data and comparisons between customers and their peers. The greater wealth and depth of information on risks and opportunities improves the ability to manage lending processes, provide real-time early warnings, and support sales activities. In particular, the implemented solution enables: (i) automatic, faster, and predictive credit/risk scoring (customer qualification); (ii) digitalization of lending processes (loan underwriting); (iii) smarter and more effective early warnings; and (iv) reduction of losses due to unforeseen defaults. In this way, different areas of banks can benefit from developing customized marketing and sales strategies, as well as building efficient and effective lending processes, based on a deep knowledge of corporate customers and the markets in which they operate.

REFERENCES
[1] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics 5 (2017), 135–146.
[2] Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to Answer Open-Domain Questions. In ACL.
[3] Hsinchun Chen, Roger H. L. Chiang, and Veda C. Storey. 2012. Business Intelligence and Analytics: From Big Data to Big Impact. MIS Quarterly 36 (2012), 1165–1188.
[4] Christopher Clark and Matt Gardner. 2017. Simple and Effective Multi-Paragraph Reading Comprehension. In ACL.
[5] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.
[6] Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, and Kyunghyun Cho. 2017. SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine. arXiv preprint arXiv:1704.05179 (2017).
[7] Jeremy Howard and Sebastian Ruder. 2018. Universal Language Model Fine-tuning for Text Classification. In ACL.
[8] Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. 2019. Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning. In EMNLP/IJCNLP.
[9] Mandar Joshi, Eunsol Choi, Daniel S. Weld, and Luke Zettlemoyer. 2017. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In ACL.
[10] Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. 2016. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. In International Conference on Machine Learning. 1378–1387.
[11] Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv abs/1909.11942 (2019).
[12] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv abs/1907.11692 (2019).
[13] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. arXiv preprint arXiv:1301.3781 (2013).
[14] Sewon Min, Victor Zhong, Richard Socher, and Caiming Xiong. 2018. Efficient and Robust Question Answering from Minimal Context over Documents. arXiv preprint arXiv:1805.08092 (2018).
[15] Ermelinda Oro and Massimo Ruffolo. 2017. A Method for Web Content Extraction and Analysis in the Tourism Domain. In International Conference on Enterprise Information Systems, Vol. 2. SCITEPRESS, 365–370.
[16] Ermelinda Oro and Massimo Ruffolo. 2017. Object Extraction from Presentation-Oriented Documents Using a Semantic and Spatial Approach. US Patent 9,582,494.
[17] Sinno Jialin Pan and Qiang Yang. 2010. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering 22 (2010), 1345–1359.
[18] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In EMNLP, Vol. 14. 1532–1543.
[19] Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. arXiv preprint arXiv:1802.05365 (2018).
[20] Junfei Qiu, Qihui Wu, Guoru Ding, Yuhua Xu, and Shuo Feng. 2016. A Survey of Machine Learning for Big Data Processing. EURASIP Journal on Advances in Signal Processing 2016, 1 (2016), 67.
[21] Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).
[22] Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In EMNLP.
[23] Alexander Ratner, Stephen H. Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christopher Ré. 2017. Snorkel: Rapid Training Data Creation with Weak Supervision. Proceedings of the VLDB Endowment 11, 3 (2017), 269–282.
[24] Alexander J. Ratner, Christopher M. De Sa, Sen Wu, Daniel Selsam, and Christopher Ré. 2016. Data Programming: Creating Large Training Sets, Quickly. In Advances in Neural Information Processing Systems. 3567–3575.
[25] Burr Settles. 2009. Active Learning Literature Survey. Technical Report. University of Wisconsin-Madison, Department of Computer Sciences.
[26] Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and Kaheer Suleman. 2016. NewsQA: A Machine Comprehension Dataset. In Rep4NLP@ACL.
[27] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. In NIPS.
[28] Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In EMNLP.
[29] Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. 2018. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. arXiv preprint arXiv:1804.09541 (2018).
[30] Sheng Zhang, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Kevin Duh, and Benjamin Van Durme. 2018. ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension. arXiv abs/1810.12885 (2018).
[31] Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu. 2019. ERNIE: Enhanced Language Representation with Informative Entities. In ACL.