Active Stream Learning with an Oracle of
 Unknown Availability for Sentiment Prediction

                       Elson Serrao and Myra Spiliopoulou

                Otto-von-Guericke-University Magdeburg, Germany
                     elson.serrao@gmail.com, myra@ovgu.de


      Abstract. Active learning holds the promise of learning models from
      the data with minimal expert input. However, it assumes that the expert
      is either always available or available only at the beginning. We waive
      this assumption and investigate to what extent active learning is effective in practice.
      We focus on sentiment classification over real streams of opinions. We
      show that at least for the two real streams we have analyzed, the random
      strategy is very competitive, and querying the expert in an intelligent way
      does not bring many advantages, at least when the expert is irregularly
      available.

      Keywords: active learning, oracle availability, polarity model learning,
      opinion stream mining


1   Introduction

The objective of active learning is to obtain better or comparable performance
to a fully supervised learner with fewer labels if the learner is given the oppor-
tunity to select the instances for which it requires labels [1]. Active learning is
thus very suitable in those scenarios where there is an abundance of unlabeled
data and obtaining new labels is rather expensive. Labels are obtained using an
oracle who can, for example, be a domain expert or a human annotator from
a crowd-sourcing platform. However, it is often assumed that there is a single
oracle that is always correct, always available and inexpensive to query. While
there are surveys [1], [2], [3] and studies [4], [5] that provide insights into the
above-mentioned challenges in active learning, only a few studies focus on the
availability of the oracle for streaming data [6].
    In this paper, we consider a stream of opinionated documents and try to
predict the sentiment of the document as being either positive, negative or neu-
tral. Over time drift may be observed in the stream due to evolving topics, data
and vocabulary, requiring the classifier to adapt to the opinionated stream. For
this we use active learning to obtain new labels from the data stream. Instances
to be labeled by the oracle are sampled using an appropriate query strategy.
However, we assume that the oracle is available irregularly i.e. according to a
pattern unknown to the learner. This implies that the oracle may be queried
at each moment, but will respond by delivering the label only if it is available.
If the oracle is unavailable, the instance is not used. This workflow is shown in
Figure 1.
    The remainder of the paper is organized as follows. Section 2 discusses the
related work. In Section 3 we detail our framework and the active learning query
strategies used. Section 4 describes the setup for our experiments, while
Section 5 discusses the results of those experiments. We present our concluding
remarks in Section 6.

     Fig. 1. Interaction of the stream learner and an irregularly available oracle


2    Related Work
Most of the algorithms for active learning on streams either assume infinite
verification latency, whereupon they invoke semi-supervised learning [7], or
assume that the oracle is always available to provide labels [8], [9], [10], [11],
[12].
    Shickel and Rashidi in [6] propose a framework that is aware of the oracle’s
availability for data streams. Their framework considers multiple oracles and fo-
cuses on querying first those oracles that have a higher availability. They try to
achieve a cost-benefit tradeoff by using a dynamic labeling budget proportional
to the oracle’s availability with the cost of labeling an instance inversely pro-
portional to the oracle’s availability. However, such a cost-benefit tradeoff seems
unrealistic in real-world scenarios where the cost is likely to be regulated by the
difficulty in obtaining the label and other factors [5]. No experiments were con-
ducted with oracles of varying expertise, which is often seen in active learning
literature considering multiple oracles.


3    Active Learning on an Opinionated Stream
We consider a data stream D of opinionated documents observed at distinct
timepoints t0 , t1 , . . . , ti , . . . where at each timepoint ti we receive a batch of
documents. We define the timepoint on a temporal level where, for example, the
timepoint could be a week. Consequently, all the documents arriving during the
time period from ti−1 to ti would comprise the batch of documents for ti .
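As an illustration of this batching, documents could be grouped by the week in which they arrive. The sketch below is only one possible reading: the function name `weekly_batches`, the (date, text) tuple layout and the choice of Monday as the start of a week are assumptions made here, not details given in the paper.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_batches(documents):
    """Group (date, text) pairs into batches keyed by the Monday of their week.

    Returns a list of (week_start, [texts]) tuples sorted by week_start.
    """
    batches = defaultdict(list)
    for day, text in documents:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        batches[week_start].append(text)
    return sorted(batches.items())


docs = [
    (date(2009, 1, 5), "great food"),     # a Monday
    (date(2009, 1, 7), "slow service"),   # Wednesday of the same week
    (date(2009, 1, 12), "okay overall"),  # Monday of the following week
]
batches = weekly_batches(docs)
```

Each element of `batches` then plays the role of the batch received at one timepoint.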



   Our framework encompasses an algorithmic core described in Section 3.1 and
the query strategies in Section 3.2. We link this framework with a simulator of
the oracle’s availability, described in Section 4.1, and with an algorithm for the
preparation of the opinionated data stream, described in Section 4.2.

3.1   Modeling Oracle Unavailability during Querying
We consider the beginning of the stream at t0 to be characterized by the avail-
ability of an initial set of labeled documents L0 . The initially labeled documents
L0 are used to initialize the classifier ∆. At subsequent timepoints i.e. t1 on-
wards, we receive unlabeled data Ut . If the budget B is not exceeded, for every
unlabeled document x, we use the trained classifier ∆ to predict the probability
P (ŷc |x) ∀c ∈ C, where c represents the sentiment of the document, namely, pos-
itive, negative or neutral. Our method calculates the confidence of the learner’s
prediction I using metric φ, and launches a request for the true label y when
necessary. The oracle provides the true label y only if it is available and the
document x is added to the labeled data for the next iteration. In the event that
the oracle is unavailable, x is not used to adapt the learner. An overview of our
framework is shown in Algorithm 1.
     We assume the cost of labeling is the same for every document at any time
t. If nt is the number of queries sent to the oracle at t, then the utilized budget
at t is given by
                    \frac{n_t}{|U_t|} < B                                 (1)
     To adapt to the evolving data stream, we utilize a sliding window W [13]. At
every timepoint t, we add the labeled documents Lt to the window. Once the
window is full, the documents from the oldest timepoint within the window are
forgotten. Depending on the chosen classifier ∆, for every iteration, there may
be a need to retrain ∆ with the documents in the window.
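A sliding window of this kind could be sketched as follows. The class name `SlidingWindow` and its two methods are hypothetical; the sketch assumes the window size is counted in timepoints, so that adding a batch to a full window drops the batch of the oldest timepoint.

```python
from collections import deque


class SlidingWindow:
    """Keeps the labeled batches of the most recent `size` timepoints."""

    def __init__(self, size):
        # A deque with maxlen discards its oldest entry automatically when full.
        self._batches = deque(maxlen=size)

    def add(self, labeled_batch):
        """Add the labeled documents of one timepoint to the window."""
        self._batches.append(list(labeled_batch))

    def instances(self):
        """All (document, label) pairs currently inside the window."""
        return [pair for batch in self._batches for pair in batch]


window = SlidingWindow(size=2)
window.add([("d1", "positive")])
window.add([("d2", "negative")])
window.add([("d3", "neutral")])  # the batch holding d1 is forgotten
```

Retraining the classifier then amounts to fitting it on `window.instances()`.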
     Although the framework is capable of using any confidence metric such as
maximum posterior probability or the maximum margin between the first and
second most probable class, we propose calculating the confidence of a prediction
using entropy as
      \phi_H = 1 - \left[ -\sum_{c \in C} P(\hat{y}_c \mid x) \, \log_{|C|} P(\hat{y}_c \mid x) \right]           (2)
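A minimal sketch of the entropy-based confidence of Equation 2 is given below. The function name `entropy_confidence` is an assumption, as is the convention that classes with zero probability contribute nothing to the entropy. Because the logarithm is taken in base |C|, the entropy lies in [0, 1], so the confidence is 1 for a fully certain prediction and 0 for a uniform one.

```python
import math


def entropy_confidence(probabilities):
    """Confidence 1 - H(p), with the entropy H taken in base |C|.

    `probabilities` holds P(y_c | x) for every class c (at least two classes)
    and is expected to sum to 1. Terms with p == 0 are skipped, which treats
    0 * log 0 as 0.
    """
    n_classes = len(probabilities)
    entropy = -sum(p * math.log(p, n_classes) for p in probabilities if p > 0)
    return 1.0 - entropy


certain = entropy_confidence([1.0, 0.0, 0.0])
uniform = entropy_confidence([1 / 3, 1 / 3, 1 / 3])
```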


3.2   Query Strategies
We use uncertainty based query strategies for the active learner. We also use the
variable uncertainty and variable randomized uncertainty strategies introduced
by Žliobaitė et al. in [9], [10] and [11]. The difference in our implementation is
that we have generalized the strategies to allow the use of any confidence metric
while determining if an instance needs to be sampled.
   Random Strategy: This strategy randomly selects an instance to be labeled
by the oracle with a probability given by the budget B. In this sense, it is very
naive and is used as the baseline strategy.




 Algorithm 1: Active Learning with an Irregularly Available Oracle
   Input: ∆ - classifier with relevant parameters; size - window size; B - budget;
          o - oracle; φ - confidence metric; queryStrategy - query strategy with
          parameters
   Initialize: t ← 0; W ← SlidingWindow(size)
   Receive labeled data Lt
   W ← addToWindow(Lt)
   for t = 1, 2, . . . do
       Train classifier ∆ with instances in W
       Receive unlabeled data Ut
       nt ← 0
       Lt ← ∅
       for each instance x ∈ Ut do
           Pc ← P∆(ŷc | x) ∀c ∈ C      // predict the probability
           ŷ ← arg maxc (Pc)           // predicted label
           if (nt / |Ut|) < B then
               I ← φ(Pc)               // compute the confidence
               if queryStrategy(B, I, . . . ) = True then
                   nt ← nt + 1
                   if isAvailable(o) then
                       y ← get true label of x from the oracle o
                       Lt ← Lt ∪ {(x, y)}
       W ← addToWindow(Lt)
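The inner loop of Algorithm 1 for a single timepoint might be sketched in Python as follows. All names (`query_batch` and its callable parameters) are hypothetical, and the classifier, confidence metric, query strategy and oracle are passed in as plain functions to keep the sketch self-contained. The predicted label ŷ is omitted because it does not influence the querying decision.

```python
def query_batch(unlabeled, predict_proba, confidence, query_strategy,
                is_available, true_label, budget):
    """Process the unlabeled batch of one timepoint.

    Returns the (x, y) pairs labeled by the oracle and the number of
    queries sent, including those the oracle did not answer.
    """
    labeled, n_queries = [], 0
    for x in unlabeled:
        probs = predict_proba(x)
        if n_queries / len(unlabeled) < budget:   # budget not yet used up
            score = confidence(probs)
            if query_strategy(score):
                n_queries += 1                    # counted even if unanswered
                if is_available():
                    labeled.append((x, true_label(x)))
    return labeled, n_queries


labeled, sent = query_batch(
    unlabeled=list(range(10)),
    predict_proba=lambda x: [0.4, 0.3, 0.3],
    confidence=max,                               # maximum posterior probability
    query_strategy=lambda score: score < 0.9,
    is_available=lambda: True,
    true_label=lambda x: "neutral",
    budget=0.3,
)
```

Note that a query consumes budget whether or not the oracle answers, so a less available oracle yields fewer labels for the same budget.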

   Fixed Uncertainty Strategy: This strategy samples those instances that
the learner is least confident about, by comparing the confidence of the
prediction for an instance to a fixed threshold θ.
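The random and fixed uncertainty strategies are simple enough to state in a few lines. The function names and signatures below are assumptions made for illustration; `rng` is any object with a `random()` method, such as the `random` module or a seeded `random.Random` instance.

```python
import random


def random_strategy(budget, rng=random):
    """Query with probability equal to the budget, ignoring the confidence."""
    return rng.random() < budget


def fixed_uncertainty(confidence, threshold):
    """Query when the confidence falls below the fixed threshold."""
    return confidence < threshold


query_uncertain = fixed_uncertainty(confidence=0.5, threshold=0.9)
query_confident = fixed_uncertainty(confidence=0.95, threshold=0.9)
```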

    Variable Uncertainty Strategy: For a learner to adapt to an evolving
data stream, obtaining labels for the least confident instances within each time-
point would be more beneficial. The variable uncertainty strategy described in
Algorithm 2 provides a variable threshold that adjusts itself to the incoming
data stream. When the data stream speeds up, the confidence of the learner de-
creases. In such cases, it decreases its threshold so that least confident instances
are queried first. On the other hand, at times when the learner is confident of its
prediction, the threshold is increased to capture the most uncertain instances.

 Algorithm 2: VariableUncertainty(I, s)
   Input: I - confidence score; s ∈ (0, 1] - threshold adjustment step
   Output: True if true label is required else False
   Initialize: labeling threshold θ ← 1
   if I < θ then
       θ ← θ (1 − s)   // uncertain instance: decrease the threshold
       return True
   else
       θ ← θ (1 + s)   // confident instance: increase the threshold
       return False
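Because the threshold θ has to persist between calls, Algorithm 2 maps naturally onto a small stateful object. The following is a sketch under that assumption; the class name and the use of `__call__` are illustrative choices, not taken from the paper.

```python
class VariableUncertainty:
    """Query strategy whose threshold adapts after every decision."""

    def __init__(self, step=0.01, threshold=1.0):
        self.step = step
        self.threshold = threshold

    def __call__(self, confidence):
        if confidence < self.threshold:
            self.threshold *= 1 - self.step   # uncertain: decrease the threshold
            return True
        self.threshold *= 1 + self.step       # confident: increase the threshold
        return False


strategy = VariableUncertainty(step=0.5)
decisions = [strategy(0.6) for _ in range(3)]
```

With the deliberately large step of 0.5, the threshold moves 1 → 0.5 → 0.75 → 0.375, so the same confidence of 0.6 is queried, skipped and queried again.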



    Variable Randomized Uncertainty Strategy: Uncertainty based active
learning query strategies generally focus on sampling those instances that are
close to the decision boundary of the learner. In evolving data streams changes
may occur anywhere in the instance space. So as to not miss the change that
may occur elsewhere, the variable randomized uncertainty strategy occasionally
samples those instances that the learner is confident of. As shown in Algorithm 3,
the threshold is multiplied by a normally distributed random variable to sample
the confident instances.
  Algorithm 3: VariableRandomizedUncertainty(I, s, δ)
    Input: I - confidence score; s ∈ (0, 1] - threshold adjustment step;
           δ - variance of threshold randomization
    Output: True if true label is required else False
    Initialize: labeling threshold θ ← 1
    θrandomized ← θ × η, where η ∼ N(1, δ) is a random multiplier
    if I < θrandomized then
        θ ← θ (1 − s)   // uncertain instance: decrease the threshold
        return True
    else
        θ ← θ (1 + s)   // confident instance: increase the threshold
        return False
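A possible Python rendering of Algorithm 3 is sketched below. The class name and constructor parameters are hypothetical. Since δ is described as a variance while `random.gauss` expects a standard deviation, the sketch takes the square root of δ; this conversion is an assumption about how δ is meant to be used.

```python
import math
import random


class VariableRandomizedUncertainty:
    """Variable threshold that is randomized before each comparison."""

    def __init__(self, step=0.01, variance=1.0, threshold=1.0, rng=None):
        self.step = step
        self.sigma = math.sqrt(variance)   # gauss() takes a standard deviation
        self.threshold = threshold
        self.rng = rng or random.Random()

    def __call__(self, confidence):
        eta = self.rng.gauss(1.0, self.sigma)   # random multiplier
        if confidence < self.threshold * eta:
            self.threshold *= 1 - self.step     # uncertain: decrease the threshold
            return True
        self.threshold *= 1 + self.step         # confident: increase the threshold
        return False


# With zero variance the multiplier is always 1, so the strategy reduces to
# the plain variable uncertainty strategy.
degenerate = VariableRandomizedUncertainty(step=0.5, variance=0.0,
                                           rng=random.Random(0))
degenerate_decisions = [degenerate(0.6) for _ in range(3)]
```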


4     Experiment Setup
The goal of our experiments is to study how the performance of the learner is
affected when the availability of the oracle changes. The following sub-sections
describe the oracle availability simulator, the datasets used, and the evaluation
criteria and strategy.

4.1    Simulator of Oracle Availability
At timepoint t, we consider the oracle to be available with a probability of αt . In
our experiments, we set the oracle’s availability to a constant αt = α at every
timepoint t, and the availability at t is considered to be independent of the
availability at t − 1. Algorithm 4 provides a more formal definition of our simulator.
  Algorithm 4: Oracle Availability Simulator
    Input: α ∈ (0, 1] - availability of the oracle
    Output: True if the oracle is available else False
    return uniform(0, 1) ≤ α
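In Python, Algorithm 4 reduces to a single comparison. The function name `is_available` and the optional `rng` argument (used here only to make the example reproducible) are assumptions.

```python
import random


def is_available(alpha, rng=random):
    """Return True with probability alpha, independently on every call."""
    return rng.uniform(0.0, 1.0) <= alpha


rng = random.Random(42)
answered = sum(is_available(0.4, rng) for _ in range(10000))
fraction_answered = answered / 10000
```

Over many calls, `fraction_answered` approaches the availability α, here 0.4.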


4.2    Datasets and Feature Engineering
Yelp: The Yelp Dataset (https://www.yelp.com/dataset/challenge) contains about
5.2 million reviews of various businesses over a period of 13 years from 11
metropolitan areas across 4 countries. We filtered the dataset for English reviews
and additionally removed those reviews whose length was less than 15 words. For
our data stream, we considered the reviews from 2009 onwards.
     Amazon: The Amazon dataset introduced in [14] contains reviews of several
product categories. Using the 5-core datasets of the product categories, we
built a dataset with the intention of introducing concept drift. For this purpose,
we randomly selected three product categories every three months from
among the following nine categories: Home and Kitchen; Kindle Store; Health
and Personal Care; Cell Phones and Accessories; Apps for Android; Electronics;
Clothing, Shoes and Jewelry; CDs and Vinyl; and Beauty. We removed the
duplicate reviews arising from products having multiple categories.
     Both the Yelp and Amazon datasets employ a 5-star rating scheme, where 5 stars
is the highest rating and 1 star is the lowest. We considered 1- and 2-star
ratings to be negative, 3-star ratings neutral, and 4- and 5-star ratings positive.
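The mapping from star ratings to sentiment labels can be written down directly. The function name `rating_to_sentiment` and the label strings are illustrative choices.

```python
def rating_to_sentiment(stars):
    """Map a 1-5 star rating to a sentiment label."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


labels = [rating_to_sentiment(s) for s in (1, 2, 3, 4, 5)]
```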
     Feature Engineering: We preprocessed the reviews by replacing URLs,
negations and currencies with the tokens URL, NEGATION and CURRENCY
respectively, and some emoticons with tokens like SMILE and HEART. We further
suppressed repeated letters, expanded contractions (e.g. "I'm" into "I am"),
removed stopwords and replaced words by their lemmas.
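Part of this preprocessing could be approximated with regular expressions, as in the rough sketch below. Every pattern here is an assumption: the paper does not specify which words count as negations, how currencies are detected, how many repeated letters are kept, or which contractions are expanded. Emoticon replacement, stopword removal and lemmatization are left out because they depend on external resources.

```python
import re

# Illustrative patterns only; none of them is taken from the paper.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
CURRENCY_RE = re.compile(r"[$€£]\s?\d+(?:[.,]\d+)?")
REPEAT_RE = re.compile(r"([A-Za-z])\1{2,}")   # three or more of the same letter
NEGATION_RE = re.compile(r"\b(?:not|no|never)\b", re.IGNORECASE)
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "can't": "can not"}


def preprocess(text):
    """Apply token replacement, letter squeezing and contraction expansion."""
    text = URL_RE.sub("URL", text)
    text = CURRENCY_RE.sub("CURRENCY", text)
    text = REPEAT_RE.sub(r"\1\1", text)        # "sooo" becomes "soo"
    for short, full in CONTRACTIONS.items():
        text = re.sub(re.escape(short), full, text, flags=re.IGNORECASE)
    text = NEGATION_RE.sub("NEGATION", text)   # runs after "don't" -> "do not"
    return text


cleaned = preprocess(
    "I'm sooo happy, see http://example.com it cost $5.50 and I don't regret it"
)
```

Expanding contractions before replacing negations lets forms such as "don't" contribute a NEGATION token; whether the original pipeline used this order is not stated.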
     After preprocessing, we extracted features from the reviews using word
n-grams with n = 3 along with their corresponding frequencies of occurrence. We
selected the 15000 most relevant features using the chi-square test. Our feature
vectors were then constructed using the TF-IDF weighting scheme (see footnote 2).

4.3    Evaluation Strategy and Evaluation Criteria
We define the duration of a timepoint to be a week and maintain a sliding
window of five weeks. We perform prequential evaluation [13]: we first test on
all documents for the incoming week and then adapt by using only the sampled
documents whose labels have been provided by the oracle.
    As we use entropy to calculate our confidence measure, we evaluate on log
loss decrease [2]. The log loss lt at timepoint t is given by
      l_t = -\frac{1}{|U_t|} \sum_{i \in U_t} \sum_{j \in C} b_{ij} \log p_{ij}                 (3)
where bij is a binary indicator of whether or not label j is the correct classification
for instance i, and pij is the model probability of assigning label j to instance i.
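A direct implementation of Equation 3 might look as follows. The function name, the dictionary layout of the predictions, the use of the natural logarithm and the clipping constant `eps` are assumptions; the paper does not state the logarithm base. Because bij is 1 only for the correct label, the inner sum reduces to a single term per instance.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the correct label.

    `predicted_probs[i]` maps each label j to the model probability p_ij.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0)   # clip to avoid log(0)
        total += math.log(p)
    return -total / len(true_labels)


perfect = log_loss(["pos"], [{"pos": 1.0, "neg": 0.0}])
coin_flip = log_loss(
    ["pos", "neg"],
    [{"pos": 0.5, "neg": 0.5}, {"pos": 0.5, "neg": 0.5}],
)
```

A perfect prediction gives a loss of 0, while assigning probability 0.5 to the correct label of every instance gives log 2.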

5     Experimental Evaluation
As the Stochastic Gradient Descent classifier was found to be effective for
sentiment analysis in [15], we used it as our base learner, with hinge loss, an
l2 penalty and an alpha value of 0.0001, which optimizes the objective function
of a linear support vector machine. The base learner was calibrated using Platt
scaling to obtain probabilistic outputs.
2   Code and supplementary material available at https://github.com/elrasp/osm



    For each timepoint t, we set a fixed budget B = 0.1. For the query strategies,
we set the fixed threshold θ = 0.9, and the suggested values of 0.01 and 1 for
the threshold adjustment step s and the variance of the normally distributed
random number generator δ respectively [11].
    In Section 5.1, we describe the underlying class distribution of the data
stream for these datasets and in Section 5.2, we analyze the influence of the
oracle’s availability on the performance of the learner.


5.1    Distribution of Data

The weekly underlying class distributions of the data in the stream of opinionated
documents for the Yelp and Amazon datasets are shown in Figure 3(a) and
Figure 3(b) respectively.
   Yelp: For the Yelp dataset, we observe a gradual increase in the number of
reviews obtained over time. The proportion of neutral reviews received remains
almost constant for the entire data stream. In comparison, the numbers of positive
and negative reviews are always increasing. Also, we find that the positive reviews
dominate the class distribution, accounting for more than 50% of the reviews in
any week.
   Amazon: Unlike the Yelp dataset, the Amazon dataset exhibits sudden
bursts in the volume of reviews received. This occurs as some chosen product
categories are more popular than others and receive more reviews. We also
observe that the bursts occur mostly in the positive reviews rather than in
the negative and neutral reviews. Similar to the Yelp dataset, the positive
reviews dominate the class distribution.


5.2    Impact of the Oracle’s Availability on Learning

Figure 2 shows how the oracle availability, simulated by the method of Sec-
tion 4.1, affects the number of queries answered. We vary the availability be-
tween 1.0 (all queries answered) and 0.1 (only 10% of the queries per batch are
answered) in steps of 0.1.


[Figure 2: two panels plotting the number of queries answered over time for
availabilities 1.0, 0.7, 0.4 and 0.1.
 (a) Yelp Dataset: Queries answered by the Oracle
 (b) Amazon Dataset: Queries answered by the Oracle]

Fig. 2. Queries answered by the Oracle for varying availabilities



    In Figure 3 we show the results of evaluation for the Yelp (on the left) and
Amazon (on the right) datasets. We aggregated the log loss results for all the
timepoints every two months and plot the mean (lines) and standard deviation
(colored area around the line). For the Yelp dataset, in general, we observe that
the log loss gradually decreases as the stream progresses, irrespective
of the oracle's availability and the query strategy used. On the other hand, the
learner finds it much more difficult to adapt to the evolving data stream of the
Amazon dataset.
    In the early stages of the stream, where the data volume in the stream is low,
we observe the learner performing better for lower oracle availabilities. As more
and more data is accumulated, the need to have the oracle for the Yelp dataset
drops but remains for the Amazon dataset as it exhibits more drift.
    We further conducted experiments with oracle availabilities varying between
0.01 and 0.1 in steps of 0.01 and compared the different query strategies at vary-
ing availabilities with the non-parametric Friedman’s test followed by Nemenyi
post-hoc [16]. Friedman’s test proceeds by ranking the models under considera-
tion. The best performing model is given the rank 1, the second best 2 and so
on. If two or more models have identical performance they are given an average
rank. The null hypothesis of Friedman’s test states that all the models perform
the same and thus, will have the same average rank. If the test rejects the null
hypothesis, we proceed with the Nemenyi post-hoc test that makes pair-wise
comparisons of the different models. Two models are considered significantly
different if the difference between their average ranks exceeds the critical distance.
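The ranking step and the critical distance can be illustrated with a small pure-Python sketch. The function names and the toy scores are hypothetical. The critical distance uses the standard Nemenyi formula CD = q_alpha * sqrt(k (k + 1) / (6 N)) for k models and N measurements; the value q_alpha must be looked up in a table of critical values (2.343 is the commonly tabulated value for three models at the 0.05 level).

```python
import math


def average_ranks(scores):
    """Average rank of each model over all measurements (rank 1 = lowest loss).

    `scores[m]` lists the losses of model m, one per measurement; all lists
    have equal length. Tied models share the mean of the ranks they span.
    """
    models = list(scores)
    n = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in models}
    for i in range(n):
        ordered = sorted(models, key=lambda m: scores[m][i])
        pos = 0
        while pos < len(ordered):
            end = pos
            while (end + 1 < len(ordered)
                   and scores[ordered[end + 1]][i] == scores[ordered[pos]][i]):
                end += 1
            shared = (pos + end) / 2 + 1       # mean of ranks pos+1 .. end+1
            for m in ordered[pos:end + 1]:
                totals[m] += shared
            pos = end + 1
    return {m: totals[m] / n for m in models}


def critical_distance(q_alpha, n_models, n_measurements):
    """Nemenyi critical distance for the given critical value q_alpha."""
    return q_alpha * math.sqrt(n_models * (n_models + 1) / (6.0 * n_measurements))


toy_scores = {
    "random": [0.5, 0.6, 0.7],
    "fixed": [0.6, 0.6, 0.8],
    "variable": [0.7, 0.5, 0.9],
}
ranks = average_ranks(toy_scores)
cd = critical_distance(q_alpha=2.343, n_models=3, n_measurements=3)
```

In the toy example, "random" and "fixed" tie on the second measurement and each receive rank 2.5 there. Two models would be reported as significantly different only if their average ranks differ by more than `cd`.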
    Figure 4(a) and Figure 4(b) show the critical distance diagrams of the Yelp
and Amazon datasets respectively. In these diagrams the better performing models
are ranked higher and are placed to the right. The model name corresponds
to a combination of the query strategy and the value of the oracle availability,
whereupon "rand" in the name refers to the random strategy. The models whose
difference in average rank is less than the calculated critical distance (CD) are
connected to each other by a horizontal line, indicating that these models are
not statistically different from each other.
    As we can see in these figures, there are strategies that perform significantly
better when the oracle availability changes. At very low oracle availabilities
we find that there is no strategy that performs significantly better than the
others. For the Amazon dataset, even at higher oracle availabilities there is no
difference in the performance of the strategies. This could mainly be attributed
to the nature of the drift exhibited in both the datasets.


6   Conclusions

The need for an oracle depends on the overall variability of the dataset. If there
is convergence over time as in the case of the Yelp dataset, the need for an
oracle is limited, because the learner can predict the labels by itself. Hence, low
availability of the oracle is only relevant if there is drift.




[Figure 3: ten panels, Yelp on the left and Amazon on the right. Panels (a) and
(b) plot the weekly number of positive, neutral and negative reviews; panels (c)
to (j) plot the log loss over time for availabilities 0.1, 0.4, 0.7 and 1.0.
 (a) Yelp Dataset: Weekly Distribution of Data
 (b) Amazon Dataset: Weekly Distribution of Data
 (c) Yelp Dataset: Random Strategy
 (d) Amazon Dataset: Random Strategy
 (e) Yelp Dataset: Fixed Uncertainty Strategy
 (f) Amazon Dataset: Fixed Uncertainty Strategy
 (g) Yelp Dataset: Variable Uncertainty Strategy
 (h) Amazon Dataset: Variable Uncertainty Strategy
 (i) Yelp Dataset: Variable Randomized Uncertainty Strategy
 (j) Amazon Dataset: Variable Randomized Uncertainty Strategy]

Fig. 3. Evaluation results of the Yelp (on the left) and Amazon (on the right) datasets



(a) Yelp Dataset: CD=4.90464453415; rank axis from 24 (left) to 1 (right).
    Labels on the left side, top to bottom: variable_0.01, variable_rand_0.01,
    fixed_0.01, random_0.01, variable_0.05, variable_rand_0.05, random_0.05,
    fixed_0.05, variable_rand_0.1, variable_0.1, fixed_0.1, random_0.1.
    Labels on the right side, top to bottom: variable_1.0, variable_rand_1.0,
    variable_0.7, fixed_1.0, random_1.0, variable_rand_0.7, fixed_0.7,
    variable_0.3, random_0.7, random_0.3, variable_rand_0.3, fixed_0.3.

(b) Amazon Dataset: CD=7.7549239207; rank axis from 24 (left) to 1 (right).
    Labels on the left side, top to bottom: variable_0.01, random_0.01,
    fixed_0.01, variable_rand_0.01, variable_0.05, variable_rand_0.05,
    random_0.05, fixed_0.05, random_1.0, random_0.7, fixed_1.0, variable_0.1.
    Labels on the right side, top to bottom: variable_rand_0.3, fixed_0.3,
    variable_0.3, random_0.3, variable_1.0, variable_0.7, variable_rand_0.7,
    variable_rand_0.1, fixed_0.1, variable_rand_1.0, fixed_0.7, random_0.1.

Fig. 4. Critical Distance Diagram. Each model is identified by its query strategy
and oracle availability. The best performing models are shown to the right, and models
that are not significantly different from each other are connected by a horizontal line.
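
For reference, the critical distance in such a diagram can be computed as in the
Nemenyi post-hoc test described in [16] (there called the critical difference):
CD = q_alpha * sqrt(k(k+1)/(6N)) for k models ranked over N measurements. The
sketch below is a minimal illustration under that assumption, not the code used
to produce Fig. 4. The function names are hypothetical, and q_alpha must be
supplied by the caller from a table of critical values, since neither q_alpha
nor N for this study is stated here.

```python
import math


def critical_distance(q_alpha: float, k: int, n: int) -> float:
    """Nemenyi critical distance: CD = q_alpha * sqrt(k * (k + 1) / (6 * n)).

    q_alpha: critical value for k models at the chosen significance level
             (looked up by the caller; not computed here).
    k:       number of compared models.
    n:       number of measurements over which the models were ranked.
    """
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 models and n >= 1 measurements")
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


def differ_significantly(avg_rank_a: float, avg_rank_b: float, cd: float) -> bool:
    """Two models differ if their average ranks are at least CD apart."""
    return abs(avg_rank_a - avg_rank_b) >= cd


# Worked example from [16]: k = 4 classifiers, N = 14 data sets,
# q_0.05 = 2.569, which gives a CD of about 1.25.
cd_example = critical_distance(2.569, k=4, n=14)
```

With the 24 models compared in Fig. 4, sqrt(k(k+1)/6) equals 10, so the critical
distance reduces to 10 * q_alpha / sqrt(N).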


    If an oracle is available, the random strategy is a good choice, as it shows the
same tendency as the other strategies, is easy to implement and is fast. However,
more experiments are needed to check whether stronger active learning strategies
can beat the random sampler in this setting, by capitalizing more effectively on
the few available labels. Experiments are also required for different domains.
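
To illustrate why the random strategy is easy to implement, the following is a
minimal sketch of random querying over one block of instances when the oracle may
not answer. It rests on assumptions that are not taken from our implementation:
the budget is a fraction of the block, availability is modelled as an independent
Bernoulli draw per query, and the names random_query and toy_oracle are
hypothetical.

```python
import random


def random_query(block, budget, availability, oracle, rng):
    """Request labels for a random share of one block of instances.

    budget:       fraction of the block the learner may query (assumption).
    availability: probability that the oracle answers a query (assumption:
                  independent Bernoulli draw per query).
    oracle:       callable that returns the label of an instance.
    Returns the (instance, label) pairs that were actually obtained.
    """
    n_queries = int(budget * len(block))
    chosen = rng.sample(range(len(block)), n_queries)
    labeled = []
    for i in chosen:
        # rng.random() lies in [0, 1), so availability 1.0 always answers
        # and availability 0.0 never does.
        if rng.random() < availability:
            labeled.append((block[i], oracle(block[i])))
    return labeled


# Toy usage with a hypothetical oracle that applies an arbitrary placeholder rule.
rng = random.Random(0)
block = ["good", "bad", "fine", "awful", "great", "poor", "ok", "meh", "nice", "ugh"]
toy_oracle = lambda doc: "pos" if len(doc) % 2 == 0 else "neg"
always = random_query(block, budget=0.5, availability=1.0, oracle=toy_oracle, rng=rng)
never = random_query(block, budget=0.5, availability=0.0, oracle=toy_oracle, rng=rng)
```

The strategy needs no model output to choose its queries, which is what keeps it
fast; queries that go unanswered are simply lost in this sketch.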
    To improve model quality, we intend to consider more elaborate querying
strategies [17], and to investigate whether instance-based active learning might
deliver better results than the block-based active learning paradigm we currently
use.
    To compensate for oracle unavailability, we also intend to combine active
learning with semi-supervised learning. Semi-supervised methods are used to
propagate labels to the arriving instances, cf. [18], [19]. In that case, we want to
investigate how disagreement between oracle and self-learner can be alleviated
in a seamless way.


   Acknowledgements. This work was inspired and partially conducted (last
author) within the project OSCAR (Opinion Stream Classification with Ensembles
and Active Learners), funded by the German Research Foundation.



References
 1. Settles, B.: Active learning literature survey. Computer Sciences Technical Report
    1648, University of Wisconsin–Madison (2009)
 2. Fu, Y., Zhu, X., Li, B.: A survey on instance selection for active learning. Knowl-
    edge and Information Systems 35(2) (May 2013) 249–283
 3. Lughofer, E.: On-line active learning: A new paradigm to improve practical
    useability of data stream modeling methods. Information Sciences 415–416
    (Nov 2017) 356–376
 4. Wu, W., Liu, Y., Guo, M., Wang, C., Liu, X.: A probabilistic model of active
    learning with multiple noisy oracles. Neurocomputing 118 (2013) 253–262
 5. Donmez, P., Carbonell, J.G.: Proactive learning: Cost-sensitive active learning
    with multiple imperfect oracles. In: Proceedings of the 17th ACM Conference on
    Information and Knowledge Management. CIKM ’08, New York, NY, USA, ACM
    (2008) 619–628
 6. Shickel, B., Rashidi, P.: ART: an availability-aware active learning framework for
    data streams. In Markov, Z., Russell, I., eds.: Proceedings of the Twenty-Ninth
    International Florida Artificial Intelligence Research Society Conference, FLAIRS
    2016, Key Largo, Florida, May 16-18, 2016, AAAI Press (2016) 92–97
 7. Zimmermann, M., Ntoutsi, E., Spiliopoulou, M.: Adaptive semi supervised opinion
    classifier with forgetting mechanism. In: Proceedings of the 29th Annual ACM
    Symposium on Applied Computing. SAC ’14, New York, NY, USA, ACM (2014)
    805–812
 8. Zimmermann, M., Ntoutsi, E., Spiliopoulou, M.: Incremental active opinion learn-
    ing over a stream of opinionated documents. WISDOM’15 (Workshop on Issues of
    Sentiment Discovery and Opinion Mining) 2015 at Knowledge Discovery and Data
    Mining, KDD’15 Workshops 2015, Sydney, Australia, August 10, 2015 (2015)
 9. Žliobaitė, I., Bifet, A., Pfahringer, B., Holmes, G.: Active learning with evolving
    streaming data. In Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M.,
    eds.: Machine Learning and Knowledge Discovery in Databases, Berlin, Heidelberg,
    Springer Berlin Heidelberg (2011) 597–612
10. Žliobaitė, I., Bifet, A., Holmes, G., Pfahringer, B.: MOA concept drift active
    learning strategies for streaming data. In Diethe, T., Balcazar, J., Shawe-Taylor, J.,
    Tirnauca, C., eds.: Proceedings of the Second Workshop on Applications of Pat-
    tern Analysis. Volume 17 of Proceedings of Machine Learning Research., CIEM,
    Castro Urdiales, Spain, PMLR (19–21 Oct 2011) 48–55
11. Žliobaitė, I., Bifet, A., Pfahringer, B., Holmes, G.: Active learning with drifting
    streaming data. IEEE Transactions on Neural Networks and Learning Systems
    25(1) (Jan 2014) 27–39
12. Smailović, J., Grčar, M., Lavrač, N., Žnidaršič, M.: Stream-based active learning
    for sentiment analysis in the financial domain. Inf. Sci. 285(C) (November 2014)
    181–203
13. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on
    concept drift adaptation. ACM Comput. Surv. 46(4) (March 2014) 44:1–44:37
14. McAuley, J., Targett, C., Shi, Q., van den Hengel, A.: Image-based recommen-
    dations on styles and substitutes. In: Proceedings of the 38th International ACM
    SIGIR Conference on Research and Development in Information Retrieval - SIGIR
    2015, ACM Press (2015)
15. Bifet, A., Frank, E.: Sentiment knowledge discovery in Twitter streaming data.
    In: Proceedings of the 13th International Conference on Discovery Science. DS’10,
    Berlin, Heidelberg, Springer-Verlag (2010) 1–15


16. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach.
    Learn. Res. 7 (December 2006) 1–30
17. Kottke, D., Krempl, G., Spiliopoulou, M.: Probabilistic active learning in datas-
    treams. In Fromont, E., De Bie, T., van Leeuwen, M., eds.: Advances in Intelligent
    Data Analysis XIV, Cham, Springer International Publishing (2015) 145–157
18. Dyer, K.B., Capo, R., Polikar, R.: COMPOSE: A semisupervised learning frame-
    work for initially labeled nonstationary streaming data. IEEE Transactions on
    Neural Networks and Learning Systems 25(1) (Jan 2014) 12–26
19. Souza, V.M.A., Silva, D.F., Gama, J., Batista, G.E.A.P.A.: Data stream classifica-
    tion guided by clustering on nonstationary environments and extreme verification
    latency. In Venkatasubramanian, S., Ye, J., eds.: Proceedings of the 2015 SIAM
    International Conference on Data Mining, Society for Industrial and Applied
    Mathematics (Jun 2015) 873–881

