Towards Continuous Monitoring of Environment under
            Uncertainty: A Fuzzy Granular Decision Tree Approach
              Preetham N. Reddy                                      Sahith N. Dambekodi                               Tirtharaj Dash
              Department of EEE                                       Department of EEE                         Data Science Research Group
            BITS Pilani, Goa Campus                                 BITS Pilani, Goa Campus                        BITS Pilani, Goa Campus
               Goa 403726, India                                       Goa 403726, India                               Goa 403726, India
         f2015174@goa.bits-pilani.ac.in                          f2015192@goa.bits-pilani.ac.in                 tirtharaj@goa.bits-pilani.ac.in

ABSTRACT                                                                              complexity of the environment or availability of the required equip-
Smart monitoring of environment has been an essential area of                         ment for the process. In the past, this has led the theoretical and
research where decision-making process is inevitable. Reliability                     computational computer scientists to develop mathematical models
of the whole system depends on the stability and consistency of                       that could almost simulate the actual target environment.
its decision-making unit. Real-time decision making is another                           Towards advancing the field forward, a research was carried out
challenge in the field on which the research community has been                       very recently by Huerta et al.[2] that focused on online decorrela-
focusing on improving the performance of the underlying models.                       tion of humidity and temperature in chemical sensors for continu-
The underlying models are usually the learning models, that act                       ous monitoring. The work focused on automated processing of data
as a smart engine after being sufficiently trained for the process.                   from simultaneous and continuous readings of the variations in
In this paper, we propose to use a decision tree model that has                       humidity and temperature in the targeted environment. Their work
the capability of handling uncertainty in the acquired data from                      included eight(8) metal-oxide sensors (MOX) sensors that were
the environment. The resulting model is called as Fuzzy Granular                      continuously sensing the environment for 537 days with a sam-
Decision Tree (FGDT). Series of evaluation of FGDT shows that the                     pling rate of 1 sample per second [2]. They estimated the effects of
model is stable and powerful for the presently considered problem.                    variations in air humidity and temperature on the chemical sensors
                                                                                      signals by using standard energy band model with an assumption
CCS CONCEPTS                                                                          that the variations in sensor conductivity can be expressed as a
                                                                                      nonlinear function of changes in the energy bands in the presence
•Computing methodologies →Machine learning approaches;
                                                                                      of external humidity and temperature. Their study showed that
Classification and regression trees; Uncertainty quantifica-
                                                                                      the major factors that were affecting the environment were the
tion;
                                                                                      changes in humidity and the correlated changes in temperature and
                                                                                      humidity. To visualize the process, they used a gas discrimination
KEYWORDS                                                                              system that could discriminate among banana, wine, and baseline
Environment Monitoring; Decision Tree; Fuzzy Logic; Gas Sensors                       response. They had used a variant of Support Vector Machine (SVM)
ACM Reference format:                                                                 algorithm to build the discriminatory model for the process.
Preetham N. Reddy, Sahith N. Dambekodi, and Tirtharaj Dash. 2017. To-                    In the process of continuous environment monitoring, there lies
wards Continuous Monitoring of Environment under Uncertainty: A Fuzzy                 a certain degree of uncertainty in the acquired data. An uncertainty
Granular Decision Tree Approach. In Proceedings of Workshop on Develop-               in the acquired data could lead to the failure or adverse functioning
ment aspects of Intelligent Adaptive Systems, Co-located with 10th ISEC-2017,         of the system. The uncertainty could arise due to the following
Jaipur, India, (DIAS 2017), 5 pages.                                                  situations such as failures of sensors, a sudden change in environ-
DOI:                                                                                  ment due to additional and unknown factors. In such a case, the
                                                                                      discriminatory model that was built may not be trustworthy and
1    INTRODUCTION                                                                     hence the results should, in turn, be imprecise. This impreciseness
                                                                                      or degree of uncertainty in the acquired data or desired outcome
Smart monitoring of environment is increasingly becoming popu-                        is called ‘fuzziness’. An adaptive system that is meant to serve
lar because of the obvious fact that very little human intervention                   for the purpose of continuous environment monitoring should not
is required for such systems to perform. Moreover, it has been a                      only just adapt to the environmental changes with time but also
very challenging and yet interesting area of research in the last                     should be robust with respect to the uncertainties as read through
several years [1]. This field has motivated the research community                    its sensors. To solve the purpose of handling uncertainty or the
to design automated and intelligent models (or systems) towards                       fuzziness present in the acquired data, and to make a robust decision
continuous monitoring of an environment in industrial plants, med-                    based on the data, we propose a granular method – that specifically
ical environment or biological processes. However, designing an                       captures the uncertain inputs – to be induced within an adaptive
almost-accurate system has been a challenge given the background                      system. The research carried out by [2] was experimentally sound
                                                                                      and our present work mainly serves the purpose of embedding the
DIAS 2017, Jaipur, India
Copyright ©2017 for the individual papers by the papers’ authors. Copying permitted   uncertainty aspect of the work. In our present work, we redesign a
for private and academic purposes. This volume is published and copyrighted by its    discriminatory model that also incorporates possible uncertainty
editors. .                                                                            in the acquired data. It should be noted that our work is validated
DIAS 2017, , Jaipur, India                                                  Preetham N. Reddy, Sahith N. Dambekodi, and Tirtharaj Dash


with the experimental data from [2], further information of which
has been presented in the following sections.
   The rest of the paper is organized as follows. The proposed
and implemented model has been discussed in the section 2. The
results that have been obtained for the present problem has been
elaborated in section 3. The obtained results are discussed in the
section 4 and the paper has been concluded in the section 5.

2     METHODOLOGY
Our proposed work has following major phases: data acquisition
phase, uncertainty handling phase, discrimination phase.

2.1    Data acquisition phase
In the data acquisition phase, the sensors acquire the data from
the environment for further processing in later phases. For more
information, one could refer [2].                                        Figure 1: A sample Gaussian fuzzy membership transforma-
                                                                         tion function (X-axis: value of a sample sensor reading, Y–
2.2    Handling uncertainty in acquired data                             axis: degree of belongingness, µ)
In this work, uncertainty in the acquired data has been taken care
with the use of feature transformation using a special fuzzy mem-        2.3      Discrimination phase
bership function [3, 4]. The membership function transforms a
                                                                         Discrimination is the process of assigning a class level to a set of
single sensor reading (datum) to granules containing the informa-
                                                                         inputs i.e. a set of sensor readings at a particular time. Consider
tion of different belongingness. For example, a temperature reading
                                                                         a set of sensor readings at any particular time be represented as
of 38° can be called as a low temperature, medium temperature or
                                                                         {x 1 , x 2 , . . . , x n } whose class is unknown. It should be noted that
high temperature. Hence, it can have some degree of belongingness
                                                                         there are three classes in the present work: banana, wine, and
to each of low, medium and high temperature. In this work, we
                                                                         baseline response. After the fuzzy transformation, the vector can
chose to use the Gaussian fuzzy membership function that is given
                                                                         be represented as
in equation (1) where a is the maximum value of the membership
                                                                                                              hiдh                           hiдh
function, σ is the standard deviation of the readings obtained by               x = {x 1low , x 1medium , x 1        , x 2low , x 2medium , x 2     ,...,
a sensor. The value of x̂ is different for each of the low, medium                                                                                          (2)
                                                                                                                                                  hiдh
and high belongingness. For transformation to low degree of be-                                                            x nlow , x nmedium , x n }
longingness, x̂ = min{x}; for transformation to medium degree            The discrimination phase works with a model that has been built
of belongingness x̂ = mean{x}; and for transformation to higher          with prior knowledge, called training data, to discriminate the
degree of belongingness x̂ = max {x}. Please note that x is a vector     sensor readings in the real time, called test data. In this work,
that contains all the sensor readings for any particular feature. For    we implement a decision tree (DT) classifier that works with the
ith feature (alternatively, for ith sensor), x should be represented     transformed fuzzy granular feature space that has been obtained
as xi .                                                                  from the second phase. Therefore, the whole model can be known as
                                       (x − x̂)2                         Fuzzy Granular Decision Tree (FGDT). The principal reason behind
                                                
                       µ(x) = a exp −                              (1)
                                          2σ 2                           choosing DT over other machine learning models is that it is non-
    The full transformation can be visualized in Figure 1 that rep-      parametric that can learn from the training data in a supervised
resents the curve for low, medium and high membership function           manner. More specifically, we implement CART that is very similar
(from left to right). The belongingness can have a maximum value         to C4.5 decision tree [6]. The FGDT predicts the class of a set of
of 1 (a = 1) and a minimum value of 0. So, for a single datum, the       sensor readings by learning simple decision rules inferred from the
transformation function generates three different granules. At a         features. For completeness and clarity of the readers, we present
particular time t, if there are n sensors, there can be n sensor read-   the working principle of the FGDT as follows.
ings, called features. After the transformation, the feature vector         Given a training dataset in fuzzy granular input feature space
gets transformed to a higher dimension of 3n.                            where each of the features is represented as xi ∈ R 3n . Note that
    A major intuition behind such a high dimensional transformation      each xi is a vector in 3n dimension represented as equation (2). Let
is that a discriminatory model is expected to perform better with        the class labels be represented as d. The FGDT recursively partitions
high dimension than in low dimension. This is as per Cover’s             the training pattern space such that the patterns with same class
theorem on the separability of patterns [5]. Since this work is          labels are grouped together. The following Algorithm 1 explains
focused on discrimination of the sensor readings into different          the generic steps of the FGDT that take the fuzzy granular sensor
classes, the transformation should make the patterns separable in        readings training instances (D et r ain ) and generates a decision tree
higher dimensional space, if they are not easily separable in a lower    from that (eT).1
dimension. In a further section, we shall see that this is indeed the    1 The e
                                                                               · symbol is used to represent uncertainty in input data and the tree is generated
case for this specific problem.                                          from this fuzzy granular data.
Towards Continuous Monitoring of Environment under Uncertainty: A Fuzzy Granular Decision Tree Approach DIAS 2017, , Jaipur, India


   Data: A transformed training set, D          et r ain                                node of the tree into two sub-trees (binary split). It is also possible
   Result: A decision tree, T       e                                                   to split a node into multiple sub-trees (multiple splits) based on
   if All the instances belong to same class then                                       more than one thresholds at the node [6].
        Return a e T with a leaf node labelled with the class;                              Effect on computational complexity? – It is quite obvious that the
   end                                                                                  size of the feature set is increased by the transformation into the
       et r ain = {ϕ} then                                                              fuzzy feature space under the three membership functions such
   if D
                                                                                        as low, medium and high. If we consider that there are ninput
        Return e T = ϕ with warning;
                                                                                        number of features in the original data, after the transformation
   end
                                                                                        into the fuzzy pattern space, the size becomes 3 ×ninput . If the time
   if If there is no feature in D    et r ain then
                                                                                        complexity for the classical decision tree learning is T (ninput ), the
        Returns a leaf labeled with the most frequent class or the                      FGDT has a time complexity of T (3 × ninput ). However, it should
          disjunction of all the classes)                                               be noted that a fuzzy granular decision tree learner would have to
   end                                                                                  learn once with the available training data; and the learned model
  while Until stopping condition is not met do                                          is just deployed for its testing. So, this cost over time would occur
        Find the feature j ∈ D      et r ain with highest informational
                                                                                        just once. Moreover, with the availability of high-performance
          gain or lowest impurity;                                                      computing architectures, this complexity could be lowered and
        Based on feature j, Split the present node of e      T to form                  scalability of the proposed model could be improved. Moreover,
          sub-trees eTl e f t , e
                                Tr iдht ;                                               although a decision tree learner learns from the data by partitioning
        Repeat for e Tl e f t , e
                                Tr iдht ;                                               the space into multiple subspaces with conditions, it is already
   end                                                                                  shown that the decision tree scales well with higher dimensional
 Algorithm 1: FGDT algorithm (The first three if statements are                         data [7].
 called base cases. An impurity function can also be used for the
 information gain: lower the impurity, higher the information gain
 and vice versa.)
                                                                                        3    RESULTS
                                                                                        An adaptive system should not only just adapt to the environmental
                                                                                        changes but also should be robust with respect to the uncertainties
   The stopping conditions while generating the e     T from Det r ain                  as reading through the sensors. This could be done via robust mod-
could be one of the following. A fully grown most generalized                           eling of the environment or via robust decision making withing
tree has been obtained, the training pattern space has been well                        the system. However, our present work demonstrates the former
partitioned into multiple sub-spaces based on classes, the generated                    category where the implemented model adapts to the changes in
tree returns lowest error on validation data2 . For measuring the                       the environment by also incorporating the possible uncertainties by
impurity, an ‘entropy’ estimate is used which is given in equation                      transforming the problem into an imprecise problem. To support
(3).                                                                                    our claim about the improvement, we used the experimental data
                               m
                              Õ                                                         available from the work of Huerta et al. [2] to study the perfor-
                 H (X, e
                       T) = −    P(Xi ) log2 (P(Xi ))               (3)
                                                                                        mance of our proposed fuzzy granular model, FGDT. To elaborate,
                                     i=1
                                                                                        to evaluate the impact on discrimination performance of our pro-
Here, P(·) is the probability estimate. Based on the estimation in
                                                                                        posed FGDT due to decorrelation of signals from temperature and
equation (3), the entropy before split and entropy after splitting a
                                                                                        humidity sensors, four different feature sets were used. The first
node is computed for each feature. The feature which would provide
                                                                                        set of features is a set of raw sensor time series (RS); the second
lowest entropy after split is considered to be split up based on some
                                                                                        feature set is a set of raw sensor data with humidity and tempera-
condition, usually a threshold. Entropy after split is computed from
                                                                                        ture (RS, T, H); the third set of features is a set of filtered data (FS)
the entropy estimate of each of the new sub-trees as
                                                                                        by decorrelating sensors; the fourth set of features is a set of raw
                                        k
                                        Õ                                               sensor data with filtered sensor data (RS, FS). We used these same
           Haf t er Spl it (X, e
                               T) = −         P(X) log2 (Hi (Xi ), e
                                                                   Ti ),         (4)    feature set for a proper evaluation of our proposed model. However,
                                        i=1
                                                                                        additionally, we also use a new dataset that also contains ‘t0’ and
where, k is number of splits at a node of the tree, Hi (Xi ), e
                                                              Ti ) is the               ‘dt’ along with filtered sensor data (FS) for evaluation.
entropy measure for the sub-tree that would be produced after split.                       To properly estimate the generalization ability of the proposed
By using these entropy estimates that is entropy before split and                       FGDT model, we used the standard procedures in machine learning
entropy after split, the information gain which is denoted as G can                     to evaluate the performance of FGDT when discriminating sam-
be computed as                                                                          ples not used for training the classifier so as to remove evaluation
              G = Hbe f or eSpl it (X, e
                                       T) − Haf t er Spl it (X, e
                                                                T)               (5)    bias during the testing phase. There are 919438 number of sensor-
                                                                                        reading instances in the dataset. The available data was randomly
The feature that would produce the highest information gain should                      partitioned into 80% and 20%. The partition with 20% of data was
be used to split at a particular node in the tree e
                                                  T. It should be noted                 used for the independent test. The 80% partition was used for 5-fold
that the discussion and algorithm implemented in this work splits a                     cross validation (5CV). For a fair comparison, we used accuracy as
2 A set of instances that were not used during the training process but are used to     the performance evaluation measure for this work as the same was
check the performance of the generated model after training. These are not test data.   also used in [2], although accuracy should not be always considered
DIAS 2017, , Jaipur, India                                                      Preetham N. Reddy, Sahith N. Dambekodi, and Tirtharaj Dash


reliable [8]. However, as the number of instances in the dataset is         results during independent real-time tests. Moreover, the number
very large, at the order of 9 × 105 , it would not have a large effect on   of instances in a training set and the number of features do also play
the accuracy measure. We considered the fold-wise performance in            crucial roles in the process. With a higher number of instances with
5CV along with independent test performance for the evaluation              a small set of features might not generate an adequate function that
purpose. The results have been noted and depicted in Table 1. To            would generate output from the inputs. After transformation to
have a better evaluation of the model, the 20% partition was not            fuzzy space and hence increasing the dimension of the input space,
made fixed from the beginning, rather, with each fold evaluation,           such a function would be possible for the purpose. This could lead to
the test partition has been chosen.                                         the proper generalization of the FGDT and hence the independent
   A comparison of the result obtained by our proposed FGDT                 test performance. One should note that the present work uses the
with the results obtained by Huerta et al.[2], as shown in Table 2,         CART algorithm available in Scikit-learn [9]. However, one could
suggests that the proposed FGDT model is superior to the ISVM               try many other machine learning approaches for the same task
model for this problem for most of the datasets with selected fea-          after incorporating our proposed fuzzy granular transformation
tures. However, for the dataset with only FS as feature set, the            approach as a step in feature engineering before training of the
performance of FGDT is far lower than the ISVM model. This is               learning model.
probably because of the fact that the FS dataset is quite difficult to
be separable which could be revealed from a sample plot between             5    CONCLUSIONS
F1 vs F2 of the FS feature set (see Figure 2).
                                                                            The uncertainty arising in the acquired data from the environment
                                                                            in the process of continuous home monitoring has been handled by
                                                                            developing a new fuzzy granular approach that has been combined
                                                                            with a decision tree for decision making. The proposed model FGDT
                                                                            has been evaluated with regard to experimentally validated data
                                                                            from continuous monitoring environment. The performance of our
                                                                            FGDT model is found to be superior to a recently published work
                                                                            on the same problem. Moreover, the statistical results also show
                                                                            that FGDT has not only better discrimination capability but also it
                                                                            is quite stable and consistent.
                                                                               Although the present work discusses the applicability of fuzzy
                                                                            modeling towards the uncertainty handling in the aspect of envi-
                                                                            ronmental monitoring systems; there are numerous possible appli-
                                                                            cations of such techniques in real-world problems such as medical
                                                                            diagnosis [10], robotic navigation control mechanisms [11, 12],
                                                                            handling uncertainty in software testing [13]. All these mentioned
                                                                            real-world problems may not always be subject to precise computa-
            Figure 2: Scatter plot between F1 and F2                        tion, and hence uncertainty handling methods such as fuzzy gran-
                                                                            ular decision tree (FGDT) approach would aid to achieve greater
                                                                            performance of the intended adaptive systems.
4    DISCUSSION
The FGDT algorithm does handle uncertainty in the acquired data
                                                                            REFERENCES
                                                                             [1] Joey F Boatman and Bryan S Reichel. Environment monitoring system, April 6
by transforming it to a feature space that is higher in dimension                1999. US Patent 5,892,690.
than that of the original data. This could led the decision tree             [2] Ramon Huerta, Thiago Mosqueiro, Jordi Fonollosa, Nikolai F Rulkov, and Irene
classifier capture the underlying relationship between the inputs                Rodriguez-Lujan. Online decorrelation of humidity and temperature in chemical
                                                                                 sensors for continuous monitoring. Chemometrics and Intelligent Laboratory
and outputs in a better way. This is evident from the fact that                  Systems, 157:169–176, 2016.
the higher dimension of the transformed data makes the visibility            [3] Sushmita Mitra, Rajat K De, and Sankar K Pal. Knowledge-based fuzzy mlp
                                                                                 for classification and rule generation. IEEE Transactions on Neural Networks,
sparse and hence drawing a clear boundary between two different                  8(6):1338–1350, 1997.
classes becomes easy [5], which could not have been possible by              [4] Tirtharaj Dash, Sanjib Kumar Nayak, and HS Behera. Hybrid gravitational
using SVM classifier as used in [2]. More specifically, when a lower             search and particle swarm based fuzzy mlp for medical data classification. In
                                                                                 Computational Intelligence in Data Mining-Volume 1, pages 35–43. Springer, 2015.
dimensional input space is projected onto a higher dimensional               [5] Thomas M Cover. Geometrical and statistical properties of systems of linear in-
space, the sparsity in the projected space increases and thereby                 equalities with applications in pattern recognition. IEEE transactions on electronic
increasing the chance of learning a possibly-optimal hyperplane                  computers, (3):326–334, 1965.
                                                                             [6] Leo Breiman, Jerome Friedman, Charles J Stone, and Richard A Olshen. Classifi-
that would be serving as the boundary among various resulting                    cation and regression trees. CRC press, 1984.
groups or classes of patterns. Technically, the distance between             [7] Guosheng Lin, Chunhua Shen, Qinfeng Shi, Anton van den Hengel, and David
                                                                                 Suter. Fast supervised hashing with decision trees for high-dimensional data. In
any pattern in the project space and the learned hyperplane does                 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
improve which makes a strong impact on the training of the decision              pages 1963–1970, 2014.
tree learner. This interpretation could probably be true because of          [8] Siddharth Dinesh and Tirtharaj Dash. Reliable evaluation of neural network for
                                                                                 multiclass classification of real-world data. arXiv preprint arXiv:1612.00671, 2016.
the fact that a more generalized model that is built after training and      [9] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel,
validation would possess a higher capability of generating accurate              Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss,
Towards Continuous Monitoring of Environment under Uncertainty: A Fuzzy Granular Decision Tree Approach DIAS 2017, , Jaipur, India

                   Table 1: Performance (accuracy) of FGDT with different feature set for 5CV and independent test

                            Features→                 RS               RS, T, H                 FS               FS, RS              FS, t0, dt
                               Fold            CV           Test     CV       Test       CV           Test    CV      Test         CV        Test
                                 1            96.99        97.01    97.03 97.03         37.30        37.32   96.96 96.94          99.10 99.09
                                 2            96.95        96.99    96.96 97.03         37.20        37.32   96.95 96.97          99.06 99.09
                                 3            96.91        97.01    96.96 97.04         37.27        37.31   96.99 96.96          99.06 99.09
                                 4            96.97        97.00    96.99 97.04         37.36        37.32   97.01 96.98          99.11 99.09
                                 5            96.93        97.01    96.96 97.04         37.38        37.32   96.90 96.98          99.11 99.09
                             average          95.95        97.01    96.98 97.04         37.30        37.33   96.96 96.97          99.10 99.09
                             ±std.dev.        ±0.03        ±0.01    ±0.03 ±0.01         ±0.01        ±0.01   ±0.04 ±0.01          ±0.02 ±0.01

                                      Table 2: Comparison of FGDT with results obtained by Huerta et al. [2]

                                                                         ISVM [2]                  FGDT
                                                      Feature Set
                                                                        CV Test             CV           Test
                                                          RS            78.5 76.5       95.95 ± 0.03 97.01 ± 0.01
                                                       RS, T, H         73.3 71.1       96.98 ± 0.03 96.98 ± 0.03
                                                          FS            72.4 71.2       37.30 ± 0.01 37.33 ± 0.01
                                                        RS, FS          82.6 80.9       96.96 ± 0.04 96.97 ± 0.01
                                                       FS, t0, dt         –    –        96.96 ± 0.04 96.97 ± 0.01


     Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of               network. In Proceedings of the 2nd international conference on perception and
     Machine Learning Research, 12(Oct):2825–2830, 2011.                                        machine intelligence, pages 196–200. ACM, 2015.
[10] Tanistha Nayak, Tirtharaj Dash, D Chandrasekhar Rao, and Prabhat K Sahu.              [12] Tirtharaj Dash. Automatic navigation of wall following mobile robot using
     Evolutionary neural networks versus adaptive resonance theory net for breast               adaptive resonance theory of type-1. Biologically Inspired Cognitive Architectures,
     cancer diagnosis. In Proceedings of the International Conference on Informatics            12:1–8, 2015.
     and Analytics, page 97. ACM, 2016.                                                    [13] Salman Abdul Moiz. Uncertainty in software testing. In Trends in Software
[11] Tirtharaj Dash, Tanistha Nayak, and Rakesh Ranjan Swain. Controlling wall                  Testing, pages 67–87. Springer, 2017.
     following robot navigation based on gravitational search and feed forward neural