<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SSN_NLP@IECSIL-FIRE-2018: Deep Learning Approach to Named Entity Recognition and Relation Extraction for Conversational Systems in Indian Languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>D. Thenmozhi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>B. Senthil Kumar</string-name>
          <email>senthil@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chandrabose Aravindan</string-name>
          <email>aravindanc@ssn.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of CSE, SSN College of Engineering</institution>
          ,
          <addr-line>Chennai</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>Named Entity Recognition (NER) focuses on the classification of proper nouns into generic named entity (NE) classes such as person names, organizations, locations, currency and dates. NER has several applications such as conversational systems, machine translation, automatic summarization and question answering. Relation Extraction (RE) is an information extraction process used to identify the relationships between NEs. RE is very important in applications such as short answer grading, conversational systems, question answering and ontology learning. NER and RE in Indian languages are difficult tasks due to their agglutinative nature and rich morphological structure. Further, developing a language-independent framework that supports all Indian languages is a challenging task. In this paper, we present a deep learning methodology for both NER and RE in five Indian languages, namely Hindi, Kannada, Malayalam, Tamil and Telugu. We propose a common approach that works for both the NER and RE tasks, and we have used a neural machine translation architecture to implement our methodology for these tasks. Our approach was evaluated using the data set given by the IECSIL@FIRE2018 shared task. For the NER task, we obtained accuracies of 94.41%, 95.23%, 95.97% and 96.02% for our four variations on the pre-evaluation test set, and 95.9%, 95.85% and 95.05% for the three runs on the final-evaluation test set. For the RE task, we obtained accuracies of 56.19%, 60.74%, 60.7%, 75.43% and 79.11% for our five variations on the pre-evaluation test set, and 79.44%, 76.01% and 61.11% for Run 1, Run 2 and Run 3 respectively on the final-evaluation test set.</p>
      </abstract>
      <kwd-group>
        <kwd>Named Entity Recognition (NER)</kwd>
        <kwd>Relation Extraction</kwd>
        <kwd>Information Extraction</kwd>
        <kwd>Text mining</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Indian Languages</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
Named Entity Recognition (NER) is an Information Extraction (IE) task that
identifies and classifies the proper names in text into predefined classes such
as names of persons, organizations, dates, locations, numbers and currency. NER
is very important for many NLP applications such as machine translation, text
summarization, question answering and short answer grading. NER has been studied
since the 1970s for English and other European languages, and several approaches
have been reported for NER in these languages. Deep learning methods have
also been employed for English, European and Chinese languages [
        <xref ref-type="bibr" rid="ref19 ref5 ref7 ref9">5, 9, 7, 19</xref>
        ].
However, NER for Indian languages is a very challenging task due to
characteristics such as the lack of a closed-set vocabulary, the absence of capitalization,
polysemy and ambiguity. Also, due to their complex morphological structure and
agglutinative nature, IE in Indian languages is still an open challenge. Several
methodologies, including rule-based, statistical and machine learning approaches, have been
reported for NER in Indian languages. However, these approaches are language
dependent, and no Indian language except Bengali has reported above 90% F-score
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Moreover, developing a language-independent framework that supports all
Indian languages is a challenging task.
      </p>
      <p>
Relation Extraction (RE) is the process of extracting the relationships between
NEs. It is also an IE task, which extracts and classifies the relationships between
entities. For example, “lives in” is the relationship between a person and a
location. RE is very important for applications such as ontology learning,
conversational systems, question answering and short answer grading. Several methods
have been reported in the literature for learning relations automatically from English
documents [
        <xref ref-type="bibr" rid="ref14 ref16 ref22 ref24 ref25 ref8">22, 24, 14, 16, 8, 25</xref>
        ]. Many of them are domain dependent [
        <xref ref-type="bibr" rid="ref25 ref8">8, 25</xref>
        ] and
a few are domain-independent approaches [
        <xref ref-type="bibr" rid="ref14 ref22">22, 14</xref>
        ]. These works use rule-based,
supervised and unsupervised approaches to learn taxonomic [
        <xref ref-type="bibr" rid="ref15 ref23">15, 23</xref>
        ] or semantic
relations [
        <xref ref-type="bibr" rid="ref17 ref8">8, 17</xref>
        ] between the entities. Only very few approaches have been presented
that learn both taxonomic and semantic relations for any domain [
        <xref ref-type="bibr" rid="ref22 ref26">22, 26</xref>
        ]. Deep
learning methods have also been employed for RE in recent years [
        <xref ref-type="bibr" rid="ref10 ref13 ref4">10, 13, 4</xref>
        ]. However,
RE for Indian languages [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is still an open challenge. Further, developing a
language-independent framework that supports all Indian languages for extracting
relations is a challenging task.
      </p>
      <p>
The shared task IECSIL@FIRE2018 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] focuses on IE for conversational
systems in Indian languages, namely Hindi, Kannada, Malayalam, Tamil and
Telugu. It comprises two sub-tasks, namely the NER task and the RE task.
The goal of the IECSIL task is to research and develop techniques to extract
information using a language-independent framework. IECSIL@FIRE2018 is a shared
task on information extraction for conversational systems in Indian languages,
collocated with FIRE-2018 (Forum for Information Retrieval Evaluation). This
paper addresses both sub-tasks of IECSIL@FIRE2018, namely NER and RE.
We propose a common approach that identifies and classifies the NEs into one
of the generic classes, namely name, occupation, location, things, organization,
datenum, number and other, and also classifies the relations between NEs into
one of the classes namely information_1, information_2, information_3,
information_4, information_per, information_quant, information_closed,
information_so, information_neg, information_cc, action_1, action_2, action_3,
action_per, action_so, action_quant, action_neg, and other.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Proposed Methodology</title>
      <p>
We have used a deep learning approach based on the Sequence to Sequence (Seq2Seq)
model [
        <xref ref-type="bibr" rid="ref21 ref6">21, 6</xref>
        ]. We have adopted a common approach that addresses both the NER
and RE problems, using the Neural Machine Translation (NMT)
framework [
        <xref ref-type="bibr" rid="ref11 ref12">12, 11</xref>
        ] based on the Seq2Seq model for both tasks. Figures
1 and 2 depict the flow of our approach for the NER and RE tasks respectively.
      </p>
      <p>The steps are detailed below.</p>
      <p>The given text consists of the tokens of the sentences and their corresponding
NEs for the NER task, and of the input sentences and their corresponding
relation labels for the RE task. Sample inputs for both tasks are given in Figures 3
and 4.</p>
      <p>We have prepared the data in such a way that the Seq2Seq deep learning
algorithm can be applied. The input sentences and the NER / RE label sequences
are constructed separately based on the delimiter “newline”. For example, for the
NER task, the above sequence of tokens is converted to input sentences and
NER label sequences as shown in Figure 5 and Figure 6 respectively.</p>
      <p>Similarly, for the RE task, the input data is converted to input sentences and
relation label outputs, as shown in Figure 7 and Figure 8.</p>
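      <p>This preparation step can be sketched as follows (the tab-separated file layout and the function name are our illustrative assumptions, not the exact shared-task format): blank lines mark sentence boundaries, and each sentence yields one source sequence and one parallel target sequence.</p>

```python
def prepare_ner_data(lines):
    """Group token/NE-label pairs (blank line = sentence boundary)
    into parallel source and target sequences for Seq2Seq training."""
    sentences, label_seqs = [], []
    tokens, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:  # "newline" delimiter ends the current sentence
            if tokens:
                sentences.append(" ".join(tokens))
                label_seqs.append(" ".join(labels))
                tokens, labels = [], []
            continue
        token, label = line.split("\t")
        tokens.append(token)
        labels.append(label)
    if tokens:  # flush the last sentence
        sentences.append(" ".join(tokens))
        label_seqs.append(" ".join(labels))
    return sentences, label_seqs

# Toy example with placeholder tokens, not actual task data:
src, tgt = prepare_ner_data(["Ram\tname", "Chennai\tlocation", "", "born\tother"])
```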
      <p>Then the input sentences and the NER / RE label sequences are split
into training sets and development sets. The vocabularies for both the input sentences
and the NER / RE label sequences are determined. For the NER task, the input
sentence with n words w1, w2, ..., wn and the NER label sequence with n labels
l1, l2, ..., ln are given to the embedding layer. However, for the RE task, the input
sentence with n words w1, w2, ..., wn and a single relation label rl are
given to the embedding layer.</p>
      <p>To build a deep neural network model, a multi-layer RNN (Recurrent Neural
Network) with LSTM (Long Short-Term Memory) as the recurrent unit is used.
This neural network consists of several layers, namely an embedding layer, an
encoding layer, a decoding layer, a projection (softmax) layer and a loss layer. The
embedding layer learns weight vectors for the input sentence and the NER / RE
label sequence based on their vocabularies. These embeddings are fed into a
multi-layer LSTM where encoding and decoding are performed. The word
embedding vector x_wi for each word wi, where wi constitutes a time step, is the
input to the LSTM network. The computation of the hidden layer at time t and the
output can be represented as follows.</p>
      <p>i_t = σ(W_x^(i) x_t + W_h^(i) h_(t-1) + b^(i)) (1)
f_t = σ(W_x^(f) x_t + W_h^(f) h_(t-1) + b^(f) + 1) (2)
o_t = σ(W_x^(o) x_t + W_h^(o) h_(t-1) + b^(o)) (3)
c̃_t = tanh(W_x^(c) x_t + W_h^(c) h_(t-1) + b^(c)) (4)
c_t = f_t ∘ c_(t-1) + i_t ∘ c̃_t (5)
h_t^(b/f) = o_t ∘ tanh(c_t) (6)
where the W terms are the weight matrices, h_(t-1) is the hidden layer state at time
t-1, i_t, f_t and o_t are the input, forget and output gates respectively at time t,
c̃_t is the candidate cell state, and h_t^(b/f) is the hidden state of the backward /
forward LSTM cells. For better efficiency of the LSTM, the bias value in the
forget gate is set to a default value of 1.</p>
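      <p>A minimal NumPy sketch of one LSTM time step following these update equations, with the forget-gate bias set to 1 (weight shapes and initialization here are illustrative; the actual models use TensorFlow's LSTM cells):</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b, forget_bias=1.0):
    """One LSTM time step. W maps [x; h_prev] to the four gate
    pre-activations (input, forget, output, candidate)."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i = sigmoid(i)                  # input gate
    f = sigmoid(f + forget_bias)    # forget gate, with bias set to 1
    o = sigmoid(o)                  # output gate
    g = np.tanh(g)                  # candidate cell state
    c = f * c_prev + i * g          # cell state update
    h = o * np.tanh(c)              # hidden state
    return h, c

rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
W = rng.standard_normal((4 * d_hid, d_in + d_hid))
b = np.zeros(4 * d_hid)
h, c = lstm_step(rng.standard_normal(d_in), np.zeros(d_hid), np.zeros(d_hid), W, b)
```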
      <p>
        The attention mechanism [
        <xref ref-type="bibr" rid="ref1 ref11">1, 11</xref>
        ] is used to handle longer sentences. The
softmax or projection layer is a dense layer used to obtain the NER / RE label
output sequence. The loss layer computes the training loss during model
building. Once the model is built, the NER / RE label output sequences are
obtained by using the model for sequence mapping.
      </p>
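      <p>The scaled-luong variant used in our models is Luong's general attention with a learned scaling factor; a simplified NumPy sketch of the attention computation (the weight matrix and the fixed scale here are illustrative stand-ins for learned parameters):</p>

```python
import numpy as np

def luong_attention(h_t, H_s, W, scale=1.0):
    """Luong-style (general) attention: scores = scale * h_t^T W h_s,
    softmax over source positions, then a weighted context vector."""
    scores = scale * (H_s @ (W @ h_t))      # one score per encoder state
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                # softmax attention weights
    context = weights @ H_s                 # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(1)
H_s = rng.standard_normal((5, 3))   # 5 encoder states of size 3
h_t = rng.standard_normal(3)        # current decoder state
W = rng.standard_normal((3, 3))
context, weights = luong_attention(h_t, H_s, W)
```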
      <p>The target sequences we have obtained are the sequences of NER labels with
respect to the given sentence. Thus, the NER label sequences are tokenized
further to obtain the NE class for each term. However, for the RE task, the target
sequence is itself an RE label, and thus it is not required to post-process the
output of our deep neural network.</p>
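      <p>Since the decoder emits one label per input token, this post-processing step for NER can be sketched as follows (the helper name is ours; the labels follow the task's tag set):</p>

```python
def align_labels(sentence, label_sequence):
    """Pair each input token with the corresponding predicted NE label.
    zip() truncates to the shorter side if the decoder emits
    too few or too many labels."""
    tokens = sentence.split()
    labels = label_sequence.split()
    return list(zip(tokens, labels))

# Illustrative example (English placeholders, not actual task data):
pairs = align_labels("Ram lives in Chennai", "name other other location")
```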
    </sec>
    <sec id="sec-3">
      <title>Implementation</title>
      <p>
        Our methodology was implemented using TensorFlow for the IECSIL shared tasks,
namely NER and RE. The data set [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] used to evaluate the NER and RE tasks
consists of a training set and two test sets, namely a pre-evaluation set and a
final-evaluation set, for five Indian languages, namely “Hindi”, “Kannada”,
“Malayalam”, “Tamil” and “Telugu”. The details of the NER and RE data are
given in Tables 1 and 2.
      </p>
      <p>The input sentences and the NER / RE label sequences are constructed
based on the delimiter “newline”. We have split these sequences into training
and development sets to feed into the deep neural network. The details of the
splits are given in Tables 3 and 4 for the NER and RE tasks respectively.</p>
      <p>
        We have used TensorFlow code based on the tutorial code released by the Neural
Machine Translation project (https://github.com/tensorflow/nmt) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which was developed based on Sequence-to-Sequence
(Seq2Seq) models [
        <xref ref-type="bibr" rid="ref1 ref12 ref21">21, 1, 12</xref>
        ], to implement our deep learning approach for the NER
and RE tasks. We have implemented several variations of the Seq2Seq model by
varying the directionality, depth and number of training steps, with a dropout
of 0.2 and a batch size of 128, to show the effectiveness of our proposed model:
an 8-layer bi-LSTM with attention.
      </p>
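      <p>As an illustration, the strongest configuration can be expressed as a hyperparameter set; the key names mirror the tensorflow/nmt tutorial's command-line options, and any value not stated in the text is an assumption:</p>

```python
# Hyperparameters for the 8-layer bi-LSTM model with scaled-luong attention.
# Key names mirror the tensorflow/nmt command-line options.
nmt_hparams = {
    "attention": "scaled_luong",   # attention mechanism used in our models
    "encoder_type": "bi",          # bi-directional encoder
    "num_layers": 8,               # depth of the network
    "unit_type": "lstm",           # recurrent unit
    "dropout": 0.2,                # as used in all our variations
    "batch_size": 128,             # as used in all our variations
    "num_train_steps": 100000,     # largest step count among our variations
}
```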
      <p>The implementation details of model building for both the NER and RE tasks
are explained below.</p>
      <sec id="sec-3-1">
        <title>NER Models</title>
        <p>We have used four variations of model building for the NER task. The variations are
given below.</p>
        <p>– Model 1: 2-layer, uni-directional LSTM, without attention, 50,000 steps
– Model 2: 4-layer, uni-directional LSTM, with scaled-luong attention, 75,000
steps
– Model 3: 8-layer, bi-directional LSTM, with scaled-luong attention, 75,000
steps
– Model 4: 8-layer, bi-directional LSTM, with scaled-luong attention, 100,000
steps</p>
        <p>The development BLEU scores obtained for these variations are given in Table
5.</p>
        <p>It is observed from Table 5 that, except for the “Malayalam” language, the
bi-directional LSTM with attention, 8 layers of depth and a larger number of
training steps works well for all the other languages.</p>
      </sec>
      <sec id="sec-3-2">
        <title>RE Models</title>
        <p>We have implemented a total of five variations for finding the relation labels for
the given sentences, to show the effectiveness of our proposed model. The first
three variations use machine learning approaches and the last two use the deep
learning approach.</p>
        <p>The three variations using the machine learning approach are given below.
– Model 1: Term frequency vectorizer
– Model 2: TF-IDF vectorizer, ignoring terms with a document frequency
less than 1
– Model 3: TF-IDF vectorizer, ignoring terms with a document frequency
less than 2</p>
        <p>For these machine learning approaches, bag-of-words features are extracted
from the training instances. We have used the Scikit-learn machine learning library
to vectorize the training instances and to implement the classifier for the relation
extraction and classification task.</p>
        <p>In the first variation, the CountVectorizer of sklearn is used for vectorization.
Term Frequency - Inverse Document Frequency (TF-IDF) is used for
vectorization with min_df set to 1 and 2 in the second and third variations respectively.
min_df builds the vocabulary by ignoring terms that have a document
frequency lower than the given value. The TfidfVectorizer of sklearn is used for these
two variations. We have used a neural network classifier with the Stochastic Gradient
Descent (SGD) optimizer to classify the relation labels.</p>
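      <p>A minimal Scikit-learn sketch of the third variation, assuming MLPClassifier with the SGD solver as the neural network classifier (the toy sentences, labels and hidden-layer size are illustrative, not task data):</p>

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# TF-IDF vectorizer ignoring terms with document frequency below 2,
# followed by a small neural network trained with the SGD solver.
clf = make_pipeline(
    TfidfVectorizer(min_df=2),
    MLPClassifier(hidden_layer_sizes=(16,), solver="sgd",
                  max_iter=500, random_state=0),
)

# Toy training sentences with relation labels (placeholders only):
X = ["he lives in the city", "she lives in the town",
     "he works at the office", "she works at the plant"]
y = ["information_1", "information_1", "action_1", "action_1"]
clf.fit(X, y)
preds = clf.predict(X)
```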
        <p>The two variations using the deep learning approach for RE are given below.
– Model 4: 4-layer, uni-directional LSTM, with scaled-luong attention, 50,000
steps
– Model 5: 8-layer, bi-directional LSTM, with scaled-luong attention, 75,000
steps</p>
        <p>The development accuracy scores obtained for these two variations are given
in Table 6.</p>
        <p>It is observed from Table 6 that, except for the “Telugu” language, the bi-directional
LSTM with attention and 8 layers of depth performs well for all the other languages
in the RE task.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>We have evaluated our models on the data set provided by the IECSIL shared task.
The results obtained for both the NER and RE tasks are discussed in this section.</p>
      <sec id="sec-4-1">
        <title>NER Results</title>
        <p>We have obtained the accuracies as 94.41%, 95.23%, 95.97% and 96.02% for
Model 1, Model 2, Model 3 and Model 4 respectively on the pre-evaluation test
data.</p>
        <p>We have submitted three runs based on our three models namely Model 4,
Model 3 and Model 2 as Run 1, Run 2 and Run 3 respectively for the task. Table
8 shows the accuracies we have obtained for the final-evaluation test data using
our three models. We have obtained the accuracies as 95.9%, 95.85% and 95.05%
for Run 1, Run 2 and Run 3 respectively.</p>
        <p>It is observed from Table 8 that the bi-directional LSTM with attention and 8
layers of depth works well for all five Indian languages. However, an increase in the
number of training steps alone does not show significant improvement.</p>
        <p>Table 9 shows the F1-scores we have obtained for the final-evaluation test
data using our three runs. This table shows the F1-score for the individual classes
of all five languages. It is observed from the table that we have obtained a low
F1-score for the “datenum” class and a high F1-score for the “other” class. This may
be due to a low recall value for the “datenum” class and a low precision value for
the “other” class. Also, the overall F1-score across all the classes is low for the Kannada
language. This is due to the size of the dataset, which is smaller for Kannada
compared with all the other languages.</p>
      </sec>
      <sec id="sec-4-2">
        <title>RE Results</title>
        <p>We have obtained the accuracies for the pre-evaluation test data as 56.19%,
60.74%, 60.7%, 75.43% and 79.11% for Model 1, Model 2, Model 3, Model 4 and
Model 5 respectively.</p>
        <p>We have submitted three runs based on our three models namely Model
5, Model 4 and Model 2 as Run 1, Run 2 and Run 3 respectively for the task.
Table 11 shows the accuracies we have obtained for the final-evaluation test data
using our three models. We have obtained the accuracies as 79.44%, 76.01% and
61.11% for Run 1, Run 2 and Run 3 respectively.</p>
        <p>Tables 10 and 11 show that the bi-directional LSTM with attention and 8
layers of depth performs better for all the languages except the “Telugu” language.</p>
        <p>Table 12 and Table 13 show the F1-scores we have obtained for the
final-evaluation test data using our three runs for the relation extraction task. These
tables show the F1-scores for the individual classes of all five languages, in which
“A_” and “I_” indicate the “Action_” and “Information_” classes respectively. It
is observed from the tables that the overall F1-score across all the classes is very
low for the Kannada language compared with all the other languages when we
apply the deep learning techniques. This is due to the size of the dataset, which is
smaller for the Kannada language. However, Kannada gives a better F1-score than
the Hindi and Malayalam languages when we use the machine learning
algorithms.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We have presented a deep learning approach for NER and RE in Indian
languages, namely “Hindi”, “Kannada”, “Malayalam”, “Tamil” and “Telugu”. We
have used a neural machine translation model to implement both the NER and RE
tasks. Our approach is a common approach that identifies and classifies the NEs
into any of the generic classes, namely name, occupation, location, things,
organization, datenum, number and other, and also classifies the relations into any
of the relation labels such as information_1, action_1, etc. We have evaluated
four deep learning models for NER and two deep learning models for RE by
varying the directionality, depth and number of training steps, with and without
the scaled-luong attention mechanism. To show the effectiveness of the deep
learning approach, we have also implemented three variations of traditional machine
learning models for RE using bag-of-words features, term frequency / TF-IDF
vectorizers and a neural network classifier with the SGD optimizer, with minimum
document frequencies of one and two, to extract relations. We have evaluated these
models using the data set given by the IECSIL@FIRE2018 shared task for NER and
RE. We have used accuracy as the metric to measure the performance of the different
variations of our approach. For the NER task, we have obtained accuracies of
94.41%, 95.23%, 95.97% and 96.02% for Model 1, Model 2, Model 3 and Model 4
respectively on the pre-evaluation test data, and 95.9%, 95.85% and 95.05% for Run
1, Run 2 and Run 3 respectively on the final-evaluation test data. For the RE task, we
have obtained accuracies of 56.19%, 60.74%, 60.7%, 75.43% and 79.11% for
Model 1, Model 2, Model 3, Model 4 and Model 5 respectively on the pre-evaluation
test data, and 79.44%, 76.01% and 61.11% for Run 1, Run 2 and Run 3
respectively on the final-evaluation test data. The performance may be improved further
by incorporating different attention mechanisms, including more hidden layers,
and increasing the number of training steps.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.:</given-names>
          </string-name>
          <article-title>Neural machine translation by jointly learning to align and translate</article-title>
          .
          <source>arXiv preprint arXiv:1409.0473</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Barathi</given-names>
            <surname>Ganesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.B.</given-names>
            ,
            <surname>Soman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.P.</given-names>
            ,
            <surname>Reshma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            ,
            <surname>Mandar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Prachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Gouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Anitha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Anand</surname>
          </string-name>
          <string-name>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Information extraction for conversational systems in indian languages - arnekt iecsil</article-title>
          .
          <source>In: Forum for Information Retrieval Evaluation</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Barathi</given-names>
            <surname>Ganesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.B.</given-names>
            ,
            <surname>Soman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.P.</given-names>
            ,
            <surname>Reshma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            ,
            <surname>Mandar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Prachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Gouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Anitha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Anand</surname>
          </string-name>
          <string-name>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Overview of arnekt iecsil at fire-2018 track on information extraction for conversational systems in indian languages</article-title>
          .
          <source>In: FIRE (Working Notes)</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chikka</surname>
            ,
            <given-names>V.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karlapalem</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A hybrid deep learning approach for medical relation extraction</article-title>
          .
          <source>arXiv preprint arXiv:1806.11189</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chiu</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nichols</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Named entity recognition with bidirectional lstm-cnns</article-title>
          .
          <source>arXiv preprint arXiv:1511.08308</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Merriënboer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulcehre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bougares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Learning phrase representations using rnn encoder-decoder for statistical machine translation</article-title>
          .
          <source>arXiv preprint arXiv:1406.1078</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dugas</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nichols</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Deepnnner: Applying blstm-cnns and extended lexicons to named entity recognition in tweets</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)</source>
          . pp.
          <fpage>178</fpage>
          -
          <lpage>187</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Frunza</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inkpen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>T.:</given-names>
          </string-name>
          <article-title>A machine learning approach for identifying disease-treatment relations in short texts. Knowledge and Data Engineering</article-title>
          , IEEE Transactions on
          <volume>23</volume>
          (
          <issue>6</issue>
          ),
          <fpage>801</fpage>
          -
          <lpage>814</lpage>
          (
          <year>2011</year>
          ). https://doi.org/10.1109/TKDE.
          <year>2010</year>
          .152
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lample</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ballesteros</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kawakami</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Neural architectures for named entity recognition</article-title>
          .
          <source>arXiv preprint arXiv:1603.01360</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Leng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A deep learning approach for relationship extraction from interaction context in social manufacturing paradigm</article-title>
          .
          <source>Knowledge-Based Systems 100</source>
          ,
          <fpage>188</fpage>
          -
          <lpage>199</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brevdo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Neural machine translation (seq2seq) tutorial</article-title>
          . https://github.com/tensorflow/nmt (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Effective approaches to attention-based neural machine translation</article-title>
          .
          <source>arXiv preprint arXiv:1508.04025</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lv</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guan</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Clinical relation extraction with deep learning</article-title>
          .
          <source>IJHIT</source>
          <volume>9</volume>
          (
          <issue>7</issue>
          ),
          <fpage>237</fpage>
          -
          <lpage>248</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Paukkeri</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Plaza</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fresno</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Unanue</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Honkela</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Learning a taxonomy from a set of text documents</article-title>
          .
          <source>Applied Soft Computing</source>
          <volume>12</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1138</fpage>
          -
          <lpage>1148</lpage>
          (
          <year>2012</year>
          ). https://doi.org/10.1016/j.asoc.2011.11.009
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yuan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Aneec: A quasi-automatic system for massive named entity extraction and categorization</article-title>
          .
          <source>The Computer Journal</source>
          <volume>56</volume>
          (
          <issue>11</issue>
          ),
          <fpage>1328</fpage>
          -
          <lpage>1346</lpage>
          (
          <year>2013</year>
          ). https://doi.org/10.1093/comjnl/bxs114
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Poon</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Domingos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Unsupervised ontology induction from text</article-title>
          .
          <source>In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics</source>
          . pp.
          <fpage>296</fpage>
          -
          <lpage>305</lpage>
          . ACL, Uppsala, Sweden (July 11-16,
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Punuru</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Learning non-taxonomical semantic relations from domain texts</article-title>
          .
          <source>Journal of Intelligent Information Systems</source>
          <volume>38</volume>
          (
          <issue>1</issue>
          ),
          <fpage>191</fpage>
          -
          <lpage>207</lpage>
          (
          <year>2012</year>
          ). https://doi.org/10.1007/s10844-011-0149-4
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Senthil Kumar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thenmozhi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Named entity recognition in dravidian languages - state of the art</article-title>
          .
          <source>International Journal of Applied Engineering Research</source>
          <volume>10</volume>
          (
          <issue>34</issue>
          ),
          <fpage>27295</fpage>
          -
          <lpage>27300</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yun</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lipton</surname>
            ,
            <given-names>Z.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kronrod</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anandkumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Deep active learning for named entity recognition</article-title>
          .
          <source>arXiv preprint arXiv:1707.05928</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Sinha</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Identification and classification of relations for indian languages using machine learning approaches for developing a domain specific ontology</article-title>
          .
          <source>In: 2016 International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT)</source>
          . pp.
          <fpage>415</fpage>
          -
          <lpage>420</lpage>
          . IEEE (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinyals</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>Q.V.</given-names>
          </string-name>
          :
          <article-title>Sequence to sequence learning with neural networks</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . pp.
          <fpage>3104</fpage>
          -
          <lpage>3112</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Thenmozhi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aravindan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>An automatic and clause based approach to learn relations for ontologies</article-title>
          .
          <source>The Computer Journal</source>
          <volume>59</volume>
          (
          <issue>6</issue>
          ),
          <fpage>889</fpage>
          -
          <lpage>907</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Velardi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faralli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Ontolearn reloaded: A graph-based algorithm for taxonomy induction</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>39</volume>
          (
          <issue>3</issue>
          ),
          <fpage>665</fpage>
          -
          <lpage>707</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnaghi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bargiela</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Probabilistic topic models for learning terminological ontologies</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>22</volume>
          (
          <issue>7</issue>
          ),
          <fpage>1028</fpage>
          -
          <lpage>1040</lpage>
          (
          <year>2010</year>
          ). https://doi.org/10.1109/TKDE.2009.122
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Weichselbraun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wohlgenannt</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scharl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Refining non-taxonomic relation labels with external structured data to support ontology learning</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          <volume>69</volume>
          (
          <issue>8</issue>
          ),
          <fpage>763</fpage>
          -
          <lpage>778</lpage>
          (
          <year>2010</year>
          ). https://doi.org/10.1016/j.datak.2010.02.010
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Zouaq</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nkambou</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Evaluating the generation of domain ontologies in the knowledge puzzle project</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>21</volume>
          (
          <issue>11</issue>
          ),
          <fpage>1559</fpage>
          -
          <lpage>1572</lpage>
          (
          <year>2009</year>
          ). https://doi.org/10.1109/TKDE.2009.25
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>