<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Task-oriented Conversational Agent Self-learning Based on Sentiment Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Serena Leggeri</string-name>
          <email>leggeri.1228424@studenti.uniroma1.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Esposito</string-name>
          <email>andrea.esposito@badgebox.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Iocchi</string-name>
          <email>iocchi@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>BadgeBox srl</institution>
          ,
          <addr-line>Roma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Sapienza University of Rome</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>4</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>One of the biggest issues in creating a task-oriented conversational agent with natural language processing based on machine learning comes from the size and correctness of the training dataset. Data collection can take months or even years, and the resulting static resource may soon become out of date, thus requiring a significant amount of work to supervise it. To overcome these difficulties, we implemented an algorithm able to improve learning efficiency, automatically and in real time, based on the emotions and reactions arising from the conversation between a user and the bot. To this end, we have studied an error function that, as in any closed-loop control system, corrects the input to improve the output. The proposed method is based on both calibrating the interpretation given to the initial dataset and expanding the dictionary with new terms. With this approach, the satisfaction of the interlocutors is higher than with algorithms using a static dataset or semi-automatic self-learning rules.</p>
      </abstract>
      <kwd-group>
        <kwd>task-oriented conversational agent</kwd>
        <kwd>supervised learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Task-oriented conversational agents are software components based on artificial
intelligence that are able to simulate an intelligent conversation with the user in
a chat and offer a functional support service through the main messaging
platforms such as Slack, Telegram and Facebook Messenger. Conversational agents
are created for various purposes: customer care, the dissemination of
news, offers and promotions, and support for the activation of a service. The
strength of these solutions is that they are autonomous and available 24 hours a day to
offer help to any user who requests it.</p>
      <p>When a developer designs a task-oriented conversational agent, the main
purpose is to make sure that it fulfills all user requests on the specific
topic for which it was designed, trying to find the most relevant answer to the
question that was sent, without the intervention of human operators. Agents
based on machine learning techniques make use of a training dataset. Initially,
the dataset contains a finite number of contexts that describe the topic for which
the bot was created, and for each context there is a finite number of sentences
describing the user's intention. As the dataset is created to satisfy certain types
of requests, it may cover only a limited number of ways in which the user can ask a question
and a limited number of types of answers. In some cases, the answer may not be adequate
to the question, so it is important to keep improving the efficacy of the agent
during operation. However, this requires manual labelling of new
samples to refine the learning process.</p>
      <p>The idea of the method described in this paper is to exploit the analysis of
user satisfaction to improve the effectiveness of the learning process of the agent.
More specifically, we aim at the on-line and automatic generation of new labelled
samples to be included in the training dataset to refine the agent's learned model.</p>
      <p>Starting from this idea, we have developed a method that grows
the dataset automatically and in real time, inserting new terms and recalibrating
those already present in the dataset, thus improving the recognition of the user's
intentions. To do this, we analyzed the emotional reactions that the bot's answers
generate in the users.</p>
      <p>
        The proposed approach has been deployed and validated on a real use case:
a commercial chatbot that acts as a customer care assistant for
timesheet and employee management in a company [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
evaluation also includes a comparison with other techniques. The results
show that, compared with techniques that do not use sentiment analysis, the
proposed method can automatically grow the dataset in real time and
improve the quality of the chatbot's answers. The proposed method is also faster
at recognizing contexts than the other techniques.
      </p>
      <p>Although the deployment and experimental evaluation have focussed
on a particular real use case, the proposed method has no domain-specific
components or assumptions, and thus we believe that it can be applied to
other domains as well.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        Over the course of time, numerous chatbots have been created to provide
information, help make decisions, give access to services or simply entertain [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Initially, the development of a bot was based on two fundamental
components [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>A Natural Language Understanding module, used by the Dialogue Manager,
that processes the user input to search for keywords that identify
the action to be taken.</p>
        </list-item>
        <list-item>
          <p>A Natural Language Generation module that generates answers from the
information gathered by the Dialogue Manager.</p>
        </list-item>
      </list>
      <p>
        Over time, the development of task-oriented
conversational agents has evolved considerably thanks to the availability of deep learning techniques [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
These agents are typically trained for several years, with humans
supervising them and verifying the correctness of their answers. In this
way, they became able to carry out operations that earlier assistants could not
perform, such as making payments or booking a trip.
      </p>
      <p>The most difficult challenge for a task-oriented conversational agent is to be
able to incorporate the linguistic context (all that is said during the
conversation and that gives meaning to each turn in the dialogue) and
the physical context (for example user information, place, date and time of the
conversation, etc.). A good exploitation of the linguistic context is possible only
when the bot is trained on a good training dataset supervised by human experts
in order to minimize errors.</p>
      <p>In addition to task-oriented conversational agents built assuming a large
amount of training data and the availability of human experts during all the
training phases, it is interesting to study the development of agents with minimal
requirements in terms of availability of training data and human supervision.
This goal is motivated by the needs of small companies developing chatbots,
for which large amounts of data and human supervision cost too much
compared to the commercial benefit of the chatbot product.</p>
      <p>Therefore, our goal is to study the definition of a task-oriented conversational
agent that understands user intents and performs actions starting from a limited
dataset and improves its performance over time in a semi-automatic way.</p>
      <p>Some approaches to machine learning classification that exploit both labeled
and unlabeled data include:</p>
      <list list-type="bullet">
        <list-item>
          <p>
            Co-training [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ] [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], where a labeled dataset is used to train two classifiers A
and B, while the unlabeled dataset is divided into two subsets: each subset is
classified by one classifier (e.g., A) and the confident predictions are used
to train the other classifier (e.g., B).
          </p>
        </list-item>
        <list-item>
          <p>
            Re-weighting [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] and Common Components Using EM [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ], which aim at finding
a function using the labeled dataset to redefine the unlabeled dataset.
          </p>
        </list-item>
      </list>
      <p>Although these approaches provide interesting ways of exploiting
unlabeled data, they are not appropriate to our problem of creating agents from a
limited dataset and with minimal human supervision. In the first case, we do
not have a function to estimate the error, thus a scheduled human intervention
would be required to check for it. In the other two cases, our labeled set is too
small to define a real and effective function to apply to the unlabeled set.</p>
      <p>The method proposed in this paper is inspired by the co-training approach,
combined with an estimation of the error by sentiment analysis so that the bot learns
what is right and what is wrong. Moreover, we want to rely on the user's
satisfaction without asking for explicit feedback, analyzing instead the user's answers through
sentiment analysis.</p>
      <p>A novel contribution of this paper is thus the use of sentiment analysis to
drive the on-line automatic learning of an agent.</p>
    </sec>
    <sec id="sec-3">
      <title>Proposed method</title>
      <p>Definition 1 (Intents). An intent is a semantic label representing an intention
of the end-user.</p>
      <p>For each intent, we have defined a set of sentences that represent it. Each
sentence that describes an intent contains entities that are attributes specific
to the given intent.</p>
      <p>Definition 2 (Entities). Entities are the parameters of the intent that help in
defining the specific user request.</p>
      <sec id="sec-3-1">
        <title>Example 1</title>
        <p>An example taken from the dataset is shown below.</p>
        <p>Request: "I need days off from tomorrow to the day after tomorrow".</p>
        <p>Intent: LEAVE_REQUEST</p>
        <p>Entities:</p>
        <list list-type="bullet">
          <list-item>
            <p>start_date: tomorrow.</p>
          </list-item>
          <list-item>
            <p>end_date: day after tomorrow.</p>
          </list-item>
        </list>
        <p>In this example, the scenario of an employee requesting holidays is
represented by the LEAVE_REQUEST intent and by the start_date and end_date
entities.</p>
        <p>An example of how the proposed method works in case of positive sentiment
is shown below.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Example 2. Time off request</title>
        <p>Request: "Hi" {intent: [HELLO] detected: [Hi] sentiment: [neutral]}</p>
        <p>Bot: "Hi, Dave"</p>
        <p>Request: "I'm stuck in traffic, I'll be there soon" {intent: [TIMEOFF_REQUEST]
detected: [there, soon] new words: [stuck, traffic] sentiment: [neutral]}</p>
        <p>Bot: "Ok, do you want to create a time off request?"</p>
        <p>Request: "Yes, thank you!" {intent: [CONFIRM] detected: [Yes, thank, you]
sentiment: [positive]}</p>
        <p>In this example the agent detects the correct intent from the words there and
soon and enriches the dictionary with stuck and traffic. In the future, if these
words are often used for a time off request, they will become characteristic of
this intent (e.g., in the future Dave will be able to write "I'm stuck in traffic" or
"There is traffic" to request time off).</p>
        <p>An example of how the proposed method works in case of negative sentiment
is shown below.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Example 3. Time off request</title>
        <p>Request: "Hi" {intent: [HELLO] detected: [Hi] sentiment: [neutral]}</p>
        <p>Bot: "Hi, Dave"</p>
        <p>Request: "Tomorrow I'll be busy" {intent: [LEAVE_REQUEST] detected:
[tomorrow, busy] sentiment: [neutral]}</p>
        <p>Bot: "Ok, do you want to create a leave request?"</p>
        <p>Request: "No, that's not what I want!" {intent: [NOT_CONFIRM] detected: [No,
that's not, what, I, want] sentiment: [negative]}</p>
        <p>In this example the agent detects an incorrect intent from the words tomorrow
and busy. In the future, if the bot keeps receiving a negative response to the
request that it proposes, the words found will no longer be characteristic
of the detected intent and can be eliminated entirely.</p>
        <p>Our approach is frame-based: the conversational agent extracts from the text
the main information needed to fill the user's request and, if it is not enough, it can
directly ask for the missing information.</p>
        <p>We have also defined a dictionary that the bot uses to map
some words to a type. For example, the terms "tomorrow" and "day after tomorrow" are
assigned to the type date.</p>
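        <p>As an illustration only, the frame-based extraction and the type dictionary described above could be sketched as follows. The names TYPE_DICTIONARY, REQUIRED_SLOTS, ParsedRequest, find_typed_terms and fill_frame are hypothetical, as is the rule that the first date found fills the start date and the second fills the end date; none of them comes from the deployed system.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field

# Hypothetical type dictionary: maps surface terms to an entity type.
TYPE_DICTIONARY = {
    "tomorrow": "date",
    "day after tomorrow": "date",
}

# Hypothetical frame definition: slots each intent needs, in order.
REQUIRED_SLOTS = {"LEAVE_REQUEST": ["start_date", "end_date"]}


@dataclass
class ParsedRequest:
    intent: str
    entities: dict = field(default_factory=dict)


def find_typed_terms(sentence):
    """Return (term, type) pairs in order of appearance, longest terms first."""
    text = sentence.lower()
    found = []
    for term in sorted(TYPE_DICTIONARY, key=len, reverse=True):
        position = text.find(term)
        while position != -1:
            found.append((position, term, TYPE_DICTIONARY[term]))
            # Blank out the match so shorter terms cannot match inside it.
            text = text[:position] + " " * len(term) + text[position + len(term):]
            position = text.find(term)
    found.sort()
    return [(term, kind) for _, term, kind in found]


def fill_frame(intent, sentence):
    """Fill the slots of the intent and report the ones still missing."""
    dates = [term for term, kind in find_typed_terms(sentence) if kind == "date"]
    slots = REQUIRED_SLOTS[intent]
    entities = dict(zip(slots, dates))
    missing = [slot for slot in slots if slot not in entities]
    return ParsedRequest(intent, entities), missing
```
]]></preformat>
        <p>With this sketch, the request of Example 1 fills both slots, while "Tomorrow I'll be busy" leaves end_date missing, which is the case in which the agent would ask the user for the missing information.</p>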
        <p>Notice that the concepts of intents and entities are domain independent,
while of course the values associated with them must be provided by an expert of
the chatbot application domain.</p>
        <sec id="sec-3-3-1">
          <title>Classification of intents</title>
          <p>The classification problem considered in our method is determining the intent
and the entities associated with a given user sentence.</p>
          <p>
            User sentences are represented as a bag of words [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ], without
considering the order of the words. To improve classification accuracy, we also use a
vocabulary of n-word sequences built with an N-gram model [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ].
          </p>
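          <p>A minimal sketch of such a representation is given below, assuming a simple lowercase tokenizer and n-grams up to length two. The function names tokenize and bag_of_ngrams and the parameter max_n are illustrative choices, not part of the original implementation.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def bag_of_ngrams(sentence, max_n=2):
    """Count every n-gram of length 1..max_n; unigrams ignore word order."""
    tokens = tokenize(sentence)
    bag = Counter()
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            bag[" ".join(tokens[start:start + n])] += 1
    return bag
```
]]></preformat>
          <p>With max_n set to 1 the result is a plain bag of words, so "days off" and "off days" produce the same bag; with max_n set to 2 the bigram "days off" is kept as an additional feature, which recovers part of the word order.</p>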
          <p>
            The classification algorithm is based on the Naive Bayes Text Classifier [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ], a
statistical technique able to estimate the probability of an element belonging to a
certain class. The Naive Bayes technique estimates the conditional probabilities
of each word given the classification category by associating with every word that
conveys the same meaning across the intents a numerical value, which we will consider
as a weight. The words that characterize an intent will have greater weight
because they are found only within that intent: their occurrence is limited
compared to non-characterizing words, which are found in numerous intents.
          </p>
          <p>More formally, let <inline-formula><tex-math>Z_1, \dots, Z_n</tex-math></inline-formula> be the words that form the user input; the
classification process aims at retrieving the intent <inline-formula><tex-math>I_n</tex-math></inline-formula> such that <inline-formula><tex-math>P\{I_n \mid Z_1, \dots, Z_n\}</tex-math></inline-formula>
is maximum.</p>
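          <p>The maximization above can be sketched with a small multinomial Naive Bayes classifier. This is only an illustration under stated assumptions: Laplace smoothing (the parameter alpha), whitespace tokenization, the class name NaiveBayesIntentClassifier and the toy training sentences are ours and are not taken from the deployed system.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter, defaultdict


class NaiveBayesIntentClassifier:
    """Multinomial Naive Bayes over words, with Laplace smoothing (assumed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # intent -> word -> count
        self.intent_counts = Counter()           # intent -> number of sentences
        self.vocabulary = set()

    def train(self, examples):
        for sentence, intent in examples:
            words = sentence.lower().split()
            self.intent_counts[intent] += 1
            self.word_counts[intent].update(words)
            self.vocabulary.update(words)

    def log_posterior(self, words, intent):
        """Unnormalized log P(intent | words) under the naive assumption."""
        total_sentences = sum(self.intent_counts.values())
        score = math.log(self.intent_counts[intent] / total_sentences)
        total_words = sum(self.word_counts[intent].values())
        denominator = total_words + self.alpha * len(self.vocabulary)
        for word in words:
            if word not in self.vocabulary:
                continue  # unknown words carry no evidence in this sketch
            count = self.word_counts[intent][word]
            score += math.log((count + self.alpha) / denominator)
        return score

    def classify(self, sentence):
        """Return the intent with the highest posterior for the sentence."""
        words = sentence.lower().split()
        return max(self.intent_counts,
                   key=lambda intent: self.log_posterior(words, intent))
```
]]></preformat>
          <p>Working with sums of logarithms instead of products of probabilities avoids numerical underflow on long sentences and does not change which intent attains the maximum.</p>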
          <p>Example 4. Given an intent Leave representing requests from a user regarding
leaves, we would like sentences such as "I want go to holidays", "I'm tired, I
need to rests", "I want holidays for this month", etc. to be classified as Leave.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>Self-learning based on sentiment analysis</title>
          <p>The main idea developed in this paper is to provide the agent with the ability to
automatically collect feedback about its answers in order to improve its
knowledge base. To this end, we experimented with the use of sentiment analysis. At the
beginning, the agent acts according to a model derived from the initial training
set, but during use the model is updated according to the self-learning
process explained in this section. In particular, we have defined an error function for
the agent that exploits the sentiment analysis derived from the dialogue between
the user and the bot.</p>
          <p>
            To detect the sentiment of user sentences, we have defined another
classification problem, from user input to three classes (Positive, Negative and
Neutral), and again use a Naive Bayes approach to train this classifier on a specific
dataset [
            <xref ref-type="bibr" rid="ref22">22</xref>
            ]. For every user sentence, we keep track of a local and a global sentiment
score: the local score refers to the last sentence, while the global score is an average value across
the dialogue. Furthermore, to refine this idea, we can define some particular
intents <inline-formula><tex-math>A_n</tex-math></inline-formula> that act as modifiers. For example, when the user corrects the bot
with phrases like "I'm sorry I did not mean this", this is considered negative
feedback, while phrases containing specific thanks, such as "Thank you! I was
trying to do exactly this!", provide positive feedback.
          </p>
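          <p>The bookkeeping of local and global scores can be sketched as follows. The numeric mapping of the three classes to -1, 0 and 1 and the class name SentimentTracker are assumptions made for this example; the text does not specify how the classes are turned into scores.</p>
          <preformat preformat-type="code"><![CDATA[
```python
class SentimentTracker:
    """Keeps the local (last sentence) and global (dialogue average) score."""

    # Assumed numeric encoding of the three sentiment classes.
    SCORES = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}

    def __init__(self):
        self.history = []

    def update(self, label):
        """Record the sentiment class predicted for the latest user sentence."""
        self.history.append(self.SCORES[label])

    @property
    def local_score(self):
        return self.history[-1] if self.history else 0.0

    @property
    def global_score(self):
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)
```
]]></preformat>
          <p>For the dialogue of Example 2 (neutral, neutral, positive) the local score is 1 and the global score is one third, so a single enthusiastic reply weighs less on the dialogue as a whole than on the last turn.</p>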
          <p>
            Based on the result of the sentiment analysis, we can recalibrate the
calculated weights of words <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> for the Intent classifier and, if necessary, add new
terms to the dataset. We have implemented a low-pass filter [
            <xref ref-type="bibr" rid="ref28">28</xref>
            ] that smooths
high-frequency changes in the update function in order to mitigate possible errors.
          </p>
          <p>The main algorithm for self-learning is summarized below.</p>
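          <p>Let:</p>
          <list list-type="order">
            <list-item>
              <p><inline-formula><tex-math>I_i</tex-math></inline-formula> be the i-th detected intent in the user input <inline-formula><tex-math>U</tex-math></inline-formula>;</p>
            </list-item>
            <list-item>
              <p><inline-formula><tex-math>w_{ij}</tex-math></inline-formula> be the value of the weight of the j-th word in <inline-formula><tex-math>U</tex-math></inline-formula>;</p>
            </list-item>
            <list-item>
              <p><inline-formula><tex-math>c</tex-math></inline-formula> be a value between 0 and 1 that represents the sentiment of the user
during the dialog, computed from the negative or positive
nearness of the words;</p>
            </list-item>
            <list-item>
              <p><inline-formula><tex-math>m_v</tex-math></inline-formula> and <inline-formula><tex-math>M_v</tex-math></inline-formula> be constant values (in our experimental sessions we set <inline-formula><tex-math>m_v = 0.1</tex-math></inline-formula>
and <inline-formula><tex-math>M_v = 0.3</tex-math></inline-formula>);</p>
            </list-item>
            <list-item>
              <p><inline-formula><tex-math>v</tex-math></inline-formula> be a variable set to <inline-formula><tex-math>m_v</tex-math></inline-formula> by default, and set to <inline-formula><tex-math>M_v</tex-math></inline-formula> if <inline-formula><tex-math>I_i \in A</tex-math></inline-formula> (where <inline-formula><tex-math>A</tex-math></inline-formula> is the set of
known positive or negative intents);</p>
            </list-item>
            <list-item>
              <p><inline-formula><tex-math>k</tex-math></inline-formula> be a constant (in our experimental session we set <inline-formula><tex-math>k = 0.4</tex-math></inline-formula>).</p>
            </list-item>
          </list>
          <list list-type="bullet">
            <list-item>
              <p>If the sentiment analysis is positive, a new word is added to the vocabulary
(if it does not exist) and the weight of every word in the user input (if it is
already present) is recalculated according to this formula:
                <disp-formula>
                  <label>(1)</label>
                  <tex-math>w_{ij} = w_{ij} (1 - v) + n_{ij} v</tex-math>
                </disp-formula>
where <inline-formula><tex-math>n_{ij} = w_{ij} + ck</tex-math></inline-formula>.</p>
            </list-item>
            <list-item>
              <p>If the sentiment analysis is neutral, no changes are made.</p>
            </list-item>
            <list-item>
              <p>If the sentiment analysis is negative, then the weight of every word in the
user input (if it is present) is recalculated according to formula (1), where
<inline-formula><tex-math>n_{ij} = w_{ij} - ck</tex-math></inline-formula>. When <inline-formula><tex-math>w_{ij}</tex-math></inline-formula> becomes negative, the word is removed from the
detected intent.</p>
            </list-item>
          </list>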
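          <p>The update rule of formula (1) can be sketched in a few lines, under some assumptions that the text leaves open. In particular, the starting weight given to a newly added word (initial_weight), the choice of leaving a new word untouched in the turn in which it is added, and all function and parameter names are hypothetical; the constants are the values reported for the experimental sessions.</p>
          <preformat preformat-type="code"><![CDATA[
```python
M_V_DEFAULT = 0.1   # m_v in the text
M_V_MODIFIER = 0.3  # M_v in the text, used when the intent belongs to A
K = 0.4             # k in the text


def update_weights(weights, words, sentiment, c, modifier_intent=False,
                   initial_weight=0.0):
    """Return a new word -> weight mapping for the detected intent.

    initial_weight is a hypothetical starting value for new words; the
    text does not say which value a newly added word receives.
    """
    updated = dict(weights)
    if sentiment == "neutral":
        return updated  # neutral sentiment: no changes are made
    v = M_V_MODIFIER if modifier_intent else M_V_DEFAULT
    sign = 1.0 if sentiment == "positive" else -1.0
    for word in words:
        if word not in weights:
            if sentiment == "positive":
                updated[word] = initial_weight  # enrich the vocabulary
            continue
        w = weights[word]
        n = w + sign * c * K             # target value n_ij
        w_new = w * (1.0 - v) + n * v    # formula (1)
        if w_new < 0.0:
            updated.pop(word, None)      # negative weight: drop the word
        else:
            updated[word] = w_new
    return updated
```
]]></preformat>
          <p>Expanding formula (1) with the definition of the target value shows that each update shifts the weight by exactly c k v, upwards for positive and downwards for negative sentiment. With the reported constants a single turn can therefore move a weight by at most 0.04 (or 0.12 for a modifier intent), which is the smoothing effect expected from the low-pass filter.</p>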
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental results</title>
      <p>
        The task-oriented conversational agent described in this paper was created to be
used as a virtual assistant to help people understand the use of BadgeBox [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and
to help them perform actions directly and easily. The experiments reported
in this section have been performed during normal operations of the chatbot in
a real operational setting. More specifically, to verify the effectiveness of
self-learning through the use of sentiment analysis, four groups of six people were
recruited to interact with the chatbot for 28 days. We selected users between
20 and 50 years old, with an almost equal gender distribution (13 males and
11 females). The groups were formed at random but with a fair distribution of genders
and ages. We created 4 instances of the bot and invited each group to use the
corresponding instance. Each day, at 18:00, we sent the users a survey
about their use of the system and how useful it was for their purposes. The survey asked
for a rating from -5 to 5 of how responsive the bot was to their requests.
From these ratings we derived their satisfaction. None of
the chosen people had worked on the project development, so none of the users
was aware of the phrases present in the dataset. The four variants of the
chatbot tested were:
      </p>
      <sec id="sec-4-1">
        <title>1. a baseline without self-learning; 2. a method using random self-learning; 3. supervision by a human who updates the dataset manually; 4. the proposed self-learning method based on sentiment analysis</title>
        <p>The experimental results are summarized in Table 1, where the columns have
the following meaning:</p>
        <list list-type="bullet">
          <list-item>
            <p>Day: day on which the experimental data are collected.</p>
          </list-item>
          <list-item>
            <p>Method: variant of the chatbot tested.</p>
          </list-item>
          <list-item>
            <p>Interactions: the number of requests made to the bot during the testing
period.</p>
          </list-item>
          <list-item>
            <p>Satisfaction: the average satisfaction score of the user surveys.</p>
          </list-item>
          <list-item>
            <p>Identified intents: correctness of the bot answers compared to the user
requests. We have checked them by analyzing the operations run on the
application during the test.</p>
          </list-item>
          <list-item>
            <p>Variance: satisfaction variance.</p>
          </list-item>
        </list>
        <p>Table 1 shows the results obtained after 7, 14, 21 and 28 days of tests. We
can see how the system proposed in this paper (Method 4) improves the level of
satisfaction compared to the baseline methods (Methods 1 and 2) and has
similar or superior recognition performance compared to the manually supervised
one (Method 3). It is also interesting to notice that the self-learning method
retains the advantage over the manually supervised system, probably because
the manual supervision was updated not in real time but on a weekly basis. This
result further confirms the benefits of on-line self-learning.</p>
        <p>At the moment, we are running the test in a production environment and we
are constantly monitoring how the novel approach is performing. After a month
of testing, with about 1,500 users, we have observed an increase of about 10 percentage points in the
correctness of the bot's answers with respect to the right intent (from 65%
to 75.4%).</p>
        <p>From the results obtained during these experiments, we can conclude that
the agent is actually able to self-learn through sentiment analysis and achieves
performance similar to that of agents that learn through human
supervision, but with significantly less effort in the human resources needed for training
purposes.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and future works</title>
      <p>In this paper, we have presented a self-learning method for a chatbot, using the
analysis of the emotionality of the responses sent by the user as an error function
and as a self-learning balancer.</p>
      <p>One of the difficulties we overcame was precisely the nature of the
discursive data source, which cannot be defined in advance, leaving the domain of the
unlabeled set unknown until it is written by the user.</p>
      <p>With our implementation and our experimental results, we have shown that
the sentiment analysis approach is able to improve the initial dataset in real
time and automatically, choosing on the basis of the answers what to learn,
continuously and in a collaborative way among all the users who interact with
the bot.</p>
      <p>The results also show that our approach performs similarly to, and in some cases better than,
the one supervised by a human expert, where answers and learning are carried
out thanks to the intervention of a person who corrects and improves the dataset
manually.</p>
      <p>Consequently, the proposed method, in addition to significantly reducing
maintenance costs, paves the way for many applications aimed at customer
satisfaction and makes the software closer and more similar to the people who have
to use it.</p>
      <p>Learning slang and new specific words, and consequently improving and
expanding the training set, makes the service more effective for users, as it includes more
complex vocabularies and adapts to the target audience.</p>
      <p>After the various tests performed, we have deployed the bot in a
production application, tracking the number of actions performed and the degree of
satisfaction. The system is still on-line and allows the
software to be used via chat in a natural way that covers the users' requests
increasingly well.</p>
      <p>Among future applications, we can think of working on the parameters to
give the bot a personality, making it more surly or more helpful, or to adapt it to
the user. We can also increase its learning ability by giving it the opportunity
to learn new intents through the use of certain actions that expand its skills.
Finally, we can also use this approach to evaluate and measure the skills of bots
without self-learning.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. BadgeBox, https://www.badgebox.com/en/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Shawar</surname>
          </string-name>
          , E. Atwell:
          <article-title>Chatbots: are they really useful?</article-title>
          .
          <source>Journal for Language Technology and Computational Linguistics</source>
          , vol.
          <volume>22</volume>
          , no.
          <source>1. GSCL German Society for Computational Linguistics</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <surname>H.</surname>
          </string-name>
          <article-title>James: Speech and language processing an introduction to natural language processing, computational linguistics, and speech</article-title>
          .
          <source>Pearson Education</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Collobert</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          <article-title>Weston: A unified architecture for natural language processing: Deep neural networks with multitask learning</article-title>
          ,
          <source>in Proceedings of the 25th international conference on Machine learning. ACM</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>160</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          , Y. Bengio,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          <article-title>: Deep learning</article-title>
          .
          <source>in Nature: International weekly journal of science</source>
          , vol.
          <volume>521</volume>
          , no. 7553.
          <string-name>
            <surname>Macmillan</surname>
          </string-name>
          ,
          <year>2015</year>
          , pp.
          <fpage>436</fpage>
          -
          <lpage>444</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Blum</surname>
          </string-name>
          , A., Mitchell, T.:
          <article-title>Combining labeled and unlabeled data with co-training</article-title>
          .
          <source>In Proceedings of the Workshop on Computational Learning Theory</source>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>100</lpage>
          , Madison,
          <string-name>
            <surname>WI</surname>
          </string-name>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Nigam</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Ghani</surname>
          </string-name>
          , R.:
          <article-title>Analyzing the effectiveness and applicability of co-training</article-title>
          .
          <source>In Proceedings of Ninth International Conference on Information and Knowledge Management</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>86</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Crook</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Banasik</surname>
          </string-name>
          , J.:
          <article-title>Sample selection bias in credit scoring models</article-title>
          .
          <source>International Conference on Credit Risk Modeling and Decisioning</source>
          , Philadelphia, PA,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ghahramani</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jordan</surname>
            ,
            <given-names>M.I.</given-names>
          </string-name>
          :
          <article-title>Learning from incomplete data</article-title>
          .
          <source>Technical Report 108</source>
          , MIT Center for Biological and
          <source>Computational Learning</source>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Uyar</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A mixture of experts classifier with learning based on both labeled and unlabeled data</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          <volume>9</volume>
          ,
          <year>1997</year>
          , pp.
          <fpage>571</fpage>
          -
          <lpage>578</lpage>
          , MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oles</surname>
            ,
            <given-names>F.J.:</given-names>
          </string-name>
          <article-title>A probability analysis on the value of unlabeled data for classification problems</article-title>
          .
          <source>In Proceedings of Seventeenth International conference on Machine Learning</source>
          ,
          <year>2000</year>
          , pp
          <fpage>1191</fpage>
          -
          <lpage>1198</lpage>
          , Stanford, CA.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Seeger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Learning with labeled and unlabeled data</article-title>
          .
          <source>Technical report</source>
          , Institute for ANC, Edinburgh, UK,
          <year>2000</year>
          . http://www.dai.ed.ac.uk/~seeger/papers.html.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kremer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stacey</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>NIPS 2001 Workshop and Competition on unlabeled data for supervised learning</article-title>
          ,
          <year>2001</year>
          . http://q.cis.guelph.ca/~skremer/NIPS2001/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Karakoulas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salakhutdinov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Semi-supervised Mixture of Experts Classification</article-title>
          .
          <source>In Proceedings of the Fourth IEEE International Conference on Data Mining</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>145</lpage>
          , Brighton UK.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Goldman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Enhancing supervised learning with unlabeled data</article-title>
          .
          <source>In Proceedings of the Seventeenth International Conference on Machine Learning</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>327</fpage>
          -
          <lpage>334</lpage>
          , San Francisco, CA.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Provost</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fawcett</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Robust classification for imprecise environments</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>42</volume>
          ,
          <year>2001</year>
          ,
          <fpage>203</fpage>
          -
          <lpage>231</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
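          <string-name>
            <given-names>Tomoya</given-names>
            <surname>Sakai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marthinus Christoffel</given-names>
            <surname>du Plessis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Gang</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Masashi</given-names>
            <surname>Sugiyama</surname>
          </string-name>
          :
          <article-title>Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data</article-title>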
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>S.</given-names>
            <surname>George</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Joseph</surname>
          </string-name>
          :
          <article-title>Text Classification by Augmenting Bag of Words (BOW) Representation with Co-occurrence Feature</article-title>
          .
          <source>IOSR Journal of Computer Engineering (IOSR-JCE)</source>
          , e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume
          <volume>16</volume>
          , Issue
          <issue>1</issue>
          , Ver. V (Jan.
          <year>2014</year>
          ), pp.
          <fpage>34</fpage>
          -
          <lpage>38</lpage>
          . www.iosrjournals.org
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>Yoav</given-names>
          </string-name>
          :
          <article-title>Neural Network Methods in Natural Language Processing</article-title>
          . 1st edn. Morgan and Claypool publishers,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
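          20.
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>James H.</given-names>
            <surname>Martin</surname>
          </string-name>
          :
          <article-title>Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition</article-title>
          . 2nd edn. Pearson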
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          :
          <article-title>Naive Bayes Text Classifier</article-title>
          .
          <source>IEEE International Conference on Granular Computing (GRC 2007)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Russo</surname>
            ,
            <given-names>Irene</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Frontini</surname>
            ,
            <given-names>Francesca</given-names>
          </string-name>
          and
          <string-name>
            <surname>Quochi</surname>
            ,
            <given-names>Valeria</given-names>
          </string-name>
          ,
          <year>2016</year>
          ,
          <article-title>OpeNER Sentiment Lexicon Italian - LMF</article-title>
          , ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", National Research Council, in Pisa, http://hdl.handle.net/20.500.11752/ILC-73.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
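          23.
          <string-name>
            <given-names>Paul</given-names>
            <surname>Prasse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Sawade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Niels</given-names>
            <surname>Landwehr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tobias</given-names>
            <surname>Scheffer</surname>
          </string-name>
          :
          <article-title>Learning to Identify Concise Regular Expressions that Describe Email Campaigns</article-title>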
          .
          <source>Journal of Machine Learning Research</source>
          <volume>16</volume>
          (
          <year>2015</year>
          )
          <fpage>3687</fpage>
          -
          <lpage>3720</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>Jugal</given-names>
            <surname>Kalita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marc</given-names>
            <surname>Moreno Lopez</surname>
          </string-name>
          :
          <article-title>Deep Learning applied to NLP</article-title>
          .
          <source>arXiv:1703.03091v1 [cs.CL]</source>
          , 9 Mar
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Z. H.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Understanding bag-of-words model: A statistical framework</article-title>
          .
          <source>International Journal of Machine Learning and Cybernetics</source>
          ,
          <volume>1</volume>
          (
          <issue>1-4</issue>
          ),
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
          . DOI: 10.1007/s13042-010-0001-0
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>Jonathan J.</given-names>
            <surname>Webster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chunyu</given-names>
            <surname>Kit</surname>
          </string-name>
          :
          <article-title>Tokenization as the initial phase in NLP</article-title>
          .
          <source>Proceedings of COLING '92, the 14th Conference on Computational Linguistics</source>
          , Volume
          <volume>4</volume>
          , pp.
          <fpage>1106</fpage>
          -
          <lpage>1110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
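          27.
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoffmann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis</article-title>
          .
          <source>Proceedings of HLT '05, the Conference on Human Language Technology and Empirical Methods in Natural Language Processing</source>
          , pp.
          <fpage>347</fpage>
          -
          <lpage>354</lpage>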
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <article-title>Low-pass filter</article-title>
          , https://en.wikipedia.org/wiki/Low-pass_filter.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>