     A Study of Supervised Machine Learning Techniques for Structural Health
                                  Monitoring

William Nick, Dept. of Comp. Sci., North Carolina A&T State U., Greensboro, NC 27411, wmnick@ncat.edu
Joseph Shelton, Dept. of Comp. Sci., North Carolina A&T State U., Greensboro, NC 27411, jashelt1@ncat.edu
Kassahun Asamene, Dept. of Mechanical Eng., North Carolina A&T State U., Greensboro, NC 27411, glbulloc@ncat.edu
Albert Esterline, Dept. of Comp. Sci., North Carolina A&T State U., Greensboro, NC 27411, esterlin@ncat.edu


                             Abstract

We report on work that is part of the development of an agent-based structural health monitoring system. The data used are acoustic emission signals, and we classify these signals according to source mechanisms. The agents are proxies for communication- and computation-intensive techniques and respond to the situation at hand by determining an appropriate constellation of techniques. It is critical that the system have a repertoire of classifiers with different characteristics so that a combination appropriate for the situation at hand can generally be found. We use unsupervised learning for identifying the existence and location of damage but supervised learning for identifying the type and severity of damage. This paper reports on results for supervised learning techniques: support vector machines (SVMs), naive Bayes classifiers (NBs), feed-forward neural networks (FNNs), and two kinds of ensemble learning, random forests and AdaBoost. We found the SVMs to be the most precise and the techniques that required the least time to classify data points. We were generally disappointed in the performance of AdaBoost.

                          Introduction

Structural health monitoring (SHM) provides real-time data and consequently information on the condition of the monitored structure, whose integrity may be threatened by such things as corrosion and cracking. This paper reports on research related to SHM that has been carried out as part of the NASA Center for Aviation Safety (CAS) at North Carolina A&T State University. Ultimately, the target structures will be aircraft, but experiments at this stage are carried out on laboratory specimens.

Our architecture involves a multiagent system that directs a workflow system. Agents typically serve as proxies for techniques with intensive communication or computation requirements. Wooldridge defined an agent as an autonomous, problem-solving, computational entity that is capable of effectively processing data and functioning singly or in a community within dynamic and open environments (Wooldridge 2009). The agents in our system negotiate to determine a pattern of techniques for solving the task at hand, and they communicate this pattern to our workflow engine (implemented on one or more high-performance platforms), which actually carries out the tasks on the data streams provided. The multiagent system is thus the brains and the workflow engine the brawn of our SHM system (Foster, Jennings, and Kesselman 2004). Much of the intelligence here lies in finding the appropriate techniques for the situation at hand. In one situation, we might want a given task done quickly with only rough accuracy, while in another situation accuracy may be paramount and speed of only secondary importance. Regarding the results of machine learning for SHM, we would like an assortment of classifiers to provide a range of possibilities for the diversity of situations that arises in SHM.

The data we use are acoustic signals, and the condition of greatest interest is crack growth. Since signal sources are unobservable, classifying acoustic signals by their source must be based on machine learning. Sensing here is passive: no energy is required to generate or sense the signals (although energy is required to store and communicate the data). Once an event that is sensed via its acoustic emission has been classified, we may address a multitude of issues and provide diagnoses of the problems. Note that there may be more than one valid classification scheme for events detected via their acoustic emissions.

In SHM, data is interpreted by extracting streams of vectors of feature values from the sensor-data streams. Feature vectors are classified as to the events producing the sensed signals by classifiers that have been trained with machine-learning techniques. For our experiments, a correlation coefficient is computed between an observed waveform and each of six reference waveforms that are generated from numerical simulations of acoustic emission events; the vector of all six correlation coefficients characterizes the waveform (a short sketch of this computation appears below). Our dataset consists of 60 samples from the work reported by Esterline and his colleagues (Esterline et al. 2010).

Worden and his colleagues (Worden et al. 2007) have formulated seven axioms for SHM that capture general aspects that have emerged in several decades of experience. Of particular interest is their Axiom III, which states that unsupervised learning can be used for identifying the existence and location of damage but identifying the type and severity of damage can only be done with supervised learning. Supervised learning tries to generalize responses based on a training set with the correct responses indicated. Unsupervised learning tries to categorize the inputs based on their similarities.
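To make the feature extraction concrete, the following is a minimal sketch in Python, the language of the tools used in our experiments. The array names and shapes are illustrative assumptions, not the code behind the reported dataset.

    import numpy as np

    def feature_vector(observed, references):
        # Correlate the observed waveform with each of the six simulated
        # reference waveforms; the six Pearson correlation coefficients
        # form the feature vector that characterizes the waveform.
        return np.array([np.corrcoef(observed, ref)[0, 1]
                         for ref in references])

    # Hypothetical usage: reduce 60 observed waveforms to a 60 x 6 dataset.
    # X = np.vstack([feature_vector(w, references) for w in waveforms])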
Following Axiom III, our previous research investigated two unsupervised and three supervised learning techniques for different aspects of the SHM problem. The objective is to explore these techniques and note their characteristics so that various combinations of them may be used appropriately in various circumstances. The results of all five techniques for acoustic test data are reported in (Nick et al. 2015). The current paper reviews the results for the three previously investigated supervised learning techniques and reports results for two new techniques, which are both varieties of ensemble learning. The previously investigated supervised learning techniques are support vector machines (SVM), naive Bayes classifiers, and feed-forward neural networks (FNN). For each technique, we tested a version with principal component analysis (PCA) as a frontend to reduce the dimensionality of the data (usually to three principal components), and we tested another version without PCA. Since PCA generally did not result in significant improvement, the new techniques were tested only without PCA. (A sketch of such a PCA frontend appears at the end of this section.)

For our supervised-learning experiments, class labels on data points indicate one of six possible source types: impulses of three different durations applied to the neutral axis (equidistant between the two surfaces) or to the surface of the specimen. These are cleanly defined events, ideal for testing our learning techniques. In practice, class labels would include sources that are crack growth and fretting (friction-producing), the former being a threat, the latter generally being innocuous.

The approach followed here can be generalized for exploring the characteristics of machine-learning techniques for monitoring various kinds of structures. One must first determine what signals are appropriate for monitoring the structures. (For example, acoustic signals are appropriate for monitoring metallic structures, while signals propagated through optical fiber are appropriate for bridge-type structures.) One then determines the sensor and communication infrastructure. Finally, as per this paper, one determines the characteristics of various supervised and unsupervised learning techniques for monitoring the structures in question (given the signals and infrastructure chosen). Admittedly, the repertoire of techniques explored here is far from complete, but we have included the ones most often encountered in structural health monitoring.

The remainder of this paper is organized as follows. The next section provides a brief overview of SHM, and the following section looks into previous work in machine learning for SHM. The section after that explains the supervised machine learning techniques we use, and the penultimate section presents our results. The last section concludes.
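As a concrete illustration of the PCA frontend mentioned above, the following sketch projects the six correlation-coefficient features onto three principal components before training a classifier. The pipeline shown is an assumed setup for illustration, not our exact configuration.

    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    # Reduce the six correlation coefficients to three principal
    # components, then train an SVM on the reduced features.
    clf = make_pipeline(PCA(n_components=3), SVC(kernel='linear'))
    # clf.fit(X_train, y_train); predictions = clf.predict(X_test)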
            Structural Health Monitoring

In general, damage is defined as change introduced into a system that will adversely affect its current or future performance (Farrar and Worden 2007). For mechanical structures, damage can be defined more narrowly as change to the material and/or geometric properties. SHM provides real-time information on the integrity of the structure. It allows better use of resources than scheduled maintenance, which may take place when there is no need.

In characterizing the state of damage in a system, we can ask whether there is damage, where in the system it is, what kind of damage it is, and how severe it is. Damage prognosis is the estimation of the remaining useful life of a mechanical structure (Farrar and Lieven 2007).

The field of SHM has matured to the point where several fundamental axioms or general principles have emerged. Worden and his colleagues (Worden et al. 2007) suggest seven axioms for SHM. The following are three that are particularly relevant to this paper.

  Axiom IVa: Sensors cannot measure damage. Feature extraction through signal processing and statistical classification is necessary to convert sensor data into damage information.

  Axiom IVb: Without intelligent feature extraction, the more sensitive a measurement is to damage, the more sensitive it is to changing operational and environmental conditions.

  Axiom V: The length- and time-scales associated with damage initiation and evolution dictate the required properties of the SHM sensing system.

The following, however, is the most relevant.

  Axiom III: Identifying the existence and location of damage can be done in an unsupervised learning mode, but identifying the type of damage present and the damage severity can generally only be done in a supervised learning mode.

As we address supervised learning in this paper, we expect our techniques to be able to identify the type of damage and its severity.

       Previous Work in Machine Learning for SHM

Bridge-like structures have been the main structures addressed in the literature on machine learning for SHM. We have a quick look at this rather mature area before turning to our subject, which targets aircraft. (Farrar and Worden 2012) is a text that addresses machine learning for SHM in general. It is directed to mechanical engineers and dedicates most of its space to background. Considering original results, Figueiredo and his colleagues performed an experiment on a three-story frame aluminum structure that used a load cell and four accelerometers (Figueiredo et al. 2011). For each tested state condition, the features were estimated using a least-squares technique applied to time series from all four accelerometers and stored in feature vectors. They used four machine learning techniques in an unsupervised learning mode: 1) auto-associative neural network (AANN), 2) factor analysis (FA), 3) singular value decomposition (SVD), and 4) Mahalanobis squared distance (MSD). First the features from all undamaged states were taken into account. Then those feature vectors were split into training and testing sets. In this case, a feed-forward neural network was used to build up the AANN-based algorithm to perform mapping and de-mapping. The network had ten nodes in each of the mapping and de-mapping layers and two nodes in the bottleneck layer. The network was trained using back-propagation. The AANN- and MSD-based algorithms performed better at detecting damage. The SVD- and FA-based algorithms performed better at avoiding false indications of damage.
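The following is a rough sketch of the AANN-based novelty detection just described, using scikit-learn's MLPRegressor as a stand-in autoencoder (the original study's software differed, and the solver choice and iteration count here are assumptions). The layer sizes follow the text: ten mapping nodes, a two-node bottleneck, and ten de-mapping nodes.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Auto-associative network: learn to reproduce the input features.
    aann = MLPRegressor(hidden_layer_sizes=(10, 2, 10), solver='sgd',
                        max_iter=5000)
    # X_undamaged holds feature vectors from undamaged states only.
    # aann.fit(X_undamaged, X_undamaged)   # train the identity mapping
    # residual = np.linalg.norm(X_new - aann.predict(X_new), axis=1)
    # Large reconstruction residuals flag candidate damage.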
Tibaduiza and his colleagues (Tibaduiza et al. 2013), in investigating SHM for an aircraft fuselage and a carbon fiber reinforced plastic (CFRP) composite plate, made use of multiway principal component analysis (MPCA), discrete wavelet transform (DWT), squared prediction error (SPE) measures, and a self-organizing map (SOM) for the classification and detection of damage. Each PCA was created using 66 percent of the whole data set from the undamaged structure. Signals from the remaining 34 percent of this data set plus 80 percent of the data set of the damaged structure were used in classifying with the SOM. This approach had an area under the ROC curve of 0.9988. (A ROC chart is a display of the performance of a binary classifier, plotting true positive rate against false positive rate.)

Esterline and his colleagues (Esterline et al. 2010) (also targeting aircraft) ran an experiment with two approaches. Their first approach used as training instances experimental data with eighteen traditional acoustic emission features to train an SVM, while their second approach used six correlation coefficients between basic modes and waveforms from simulation data, also to train an SVM. The SVM with the second approach performed as well as or better than the SVM using the first approach, suggesting the superiority of a set of correlation coefficients over a substantial set of traditional acoustic emission features for learning to identify the source of acoustic emissions. It is for this reason that the work reported here uses the six correlation coefficients.
                                                                 ing a collection of trees, classifiers can be built. The col-
                        Approach                                 lection of trees and the randomness of features lead to this
Recall that the supervised learning techniques we previously     algorithm being called random forest (Breiman 2001) (Liaw
investigated are FNN, SVM, and nave Bayes classifiers and        and Wiener 2002). The random forest algorithm works as
that the supervised learning techniques we are reporting on      follows. A number of user specified trees are randomly cre-
for the first time here are ensemble techniques, specifically    ated, and each tree has the same depth. The training data
random-forest learning and AdaBoost. An artificial neural        is then used to fill in the leaves, which forms predictions
network (ANN) is a computational model based on the struc-       for the classifier. The many trees are formed as a commit-
ture and functions of a biological neural network (Bishop        tee machine of sorts to form a classifier. If features are too
2006). In a FNN, or multilayer perceptron, input vectors are     irrelevant, then the classifying performance will not be ade-
put into input nodes and fed forward in the network. The in-     quate since there will be a small number of features chosen.
puts and first-layer weights will determine whether the hid-     The number of trees is important for the classifying process.
den nodes will fire. The output of the neurons in the hidden     If there are enough trees, the randomness of features cho-
layer and the second-layer weights are used to determine         sen will be overridden by the number of relevant features
which of the output layer neurons fire. The error between        selected. Meanwhile, the effects of the completely random
the network output and targets is computed using the sum-        features will be diminished.
of-squares difference. This error is fed backward through the       The concept of boosting involves using a series of weakly
network to update the edge weights in a process known as         performing classifiers to form some strong performing clas-
back propagation.                                                sifier. Each classifier can be given some weight that has
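The following is a minimal sketch of such a network in PyBrain, the library used in our experiments; the hidden-layer size, epoch count, and one-hot encoding shown here are illustrative assumptions.

    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.datasets import SupervisedDataSet
    from pybrain.supervised.trainers import BackpropTrainer

    # Six correlation-coefficient inputs, one hidden layer,
    # six source classes.
    net = buildNetwork(6, 10, 6)
    ds = SupervisedDataSet(6, 6)        # input dim 6, one-hot target dim 6
    # for x, label in zip(X, y):
    #     target = [0] * 6
    #     target[label] = 1             # one-hot encoding of the class
    #     ds.addSample(x, target)
    trainer = BackpropTrainer(net, ds)  # minimizes sum-of-squares error
    # for _ in range(100): trainer.train()          # one epoch per call
    # predicted_class = int(net.activate(x_new).argmax())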
SVMs rely on preprocessing to represent patterns in the data in a high dimension, usually higher than the original feature space, so that classes that are entangled in the original space are separated by hyperplanes in the higher dimension. Training an SVM (Duda, Hart, and Stork 2001) involves choosing a (usually nonlinear) function that maps the data to a higher-dimensional space. Choices are generally decided by the user's knowledge of the problem domain. SVMs can reduce the need for labeled training instances.

Naïve Bayes classifiers (NBs) form a supervised learning technique that belongs to a family of classifiers based on Bayes' theorem with a strong assumption about the independence of features (Duda, Hart, and Stork 2001). The assumptions and the underlying probabilistic model allow us to capture any uncertainty about the model. This is generally done in a principled way by determining the probabilities of the outcomes. NBs were introduced to solve diagnostic and predictive problems. Bayesian classification provides practical learning through the use of algorithms, prior knowledge, and observation of the data in combination. A Gaussian NB assumes that the conditional probabilities follow a Gaussian (normal) distribution.

Ensemble learning is a supervised machine learning technique that uses multiple hypothesis spaces for predicting a solution to a problem (Dietterich 2000; Bennett, Demiriz, and Maclin 2002; Maclin and Opitz 1999). Generally, a solution found in a hypothesis space may be a weak solution, even if the space is constrained to optimal solutions. Ensemble methods combine different solutions to form accurate decisions for a problem. A unique characteristic of ensemble methods is that the solutions in the ensemble can all be accurate yet diverse. Diversity, however, will occur only if the problem is unstable. "Unstable" means that minor changes to the training set greatly affect classification performance. We investigate two forms of ensemble learning: random forest and AdaBoost.

Choosing a structure for a tree and training decision trees is time consuming for deep trees. Creating the leaves for the trees is relatively less time consuming. One solution to this is to use fixed tree structures and random features. By using a collection of trees, classifiers can be built. The collection of trees and the randomness of features lead to this algorithm being called random forest (Breiman 2001; Liaw and Wiener 2002). The random forest algorithm works as follows. A user-specified number of trees are randomly created, and each tree has the same depth. The training data is then used to fill in the leaves, which forms the predictions for the classifier. The trees together act as a committee machine of sorts, forming a single classifier. If too many of the randomly chosen features are irrelevant, classification performance will suffer, since few relevant features are chosen. The number of trees is important for the classifying process. If there are enough trees, the randomness of the features chosen will be overridden by the number of relevant features selected, and the effects of the completely random features will be diminished.

The concept of boosting involves using a series of weakly performing classifiers to form a strongly performing classifier. Each classifier can be given some weight correlated with its performance. As different classifiers are added, the weights are readjusted. Weights can be minimized or maximized depending on the boosting algorithm. One popular boosting algorithm is AdaBoost, the adaptive boosting algorithm (Schapire 1999; Rätsch, Onoda, and Müller 2001). AdaBoost works as follows. Each data point is given some weight based on its significance. A series of classifiers is then trained on the training data. A classifier's weight is then determined based on the predictions it makes on the training data. The weight can be used to determine an adaptive value, which is the importance of the classifier. The adaptive value changes based on the classifiers that have been checked. Poorer performing classifiers have lower weights than better performing classifiers.
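The following sketch shows how the two ensemble techniques can be instantiated in scikit-learn, the implementation used in our experiments; the estimator counts mirror those tested below, and the other settings are library defaults rather than a record of our configuration.

    from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

    forest = RandomForestClassifier(n_estimators=75)  # 6, 50, or 75 trees
    boost = AdaBoostClassifier(n_estimators=75)       # 6, 50, or 75 weak
                                                      # learners (decision
                                                      # stumps by default)
    # Both follow the usual fit/predict protocol:
    # forest.fit(X_train, y_train); forest.predict(X_test)
    # boost.fit(X_train, y_train);  boost.predict(X_test)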
                           Results

The learning techniques were run on a machine running a Windows 7 64-bit operating system with a 2.4 GHz quad-core processor and 4 GB of RAM. Software from scikit-learn (Pedregosa et al. 2011) was used for SVM, Gaussian NBs, random forests, and AdaBoost. Software from PyBrain (Schaul et al. 2010) was used for the FNN. Both scikit-learn and PyBrain are written in Python. We recorded the time taken by the classifiers produced by each technique to classify the data points in our test set. This involved executing Python code.

To avoid overfitting, we used stratified five-fold cross-validation with our set of 60 data points. In five-fold cross-validation, the data points are divided into five sets (called folds), all as nearly as possible of the same size. The classifier is learned using four of the folds, and the remaining fold is held out for testing. In multiple runs, different folds are held out for testing. In stratified five-fold cross-validation, the folds are stratified, that is, each contains approximately the same proportion of the labels as the complete data set.

For each learning technique, we had 26 groups of cross-validation runs. In each group, we performed stratified five-fold cross-validation five times, each time holding out a different fold. For each cross-validation run, we computed the precision for the test fold. The precision is defined as tp/(tp + fp), where tp is the number of true positives and fp is the number of false positives. We also recorded the time it took to classify the 12 data points in the test fold. We then computed the average precision and average classification time for all five runs in the group. We found the minimum, maximum, and standard deviation of the average precision and average time to classify 12 points across the 26 groups of runs.
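A sketch of this evaluation protocol using scikit-learn follows; the micro-averaging of the multiclass precision is our assumption, chosen to match the tp/(tp + fp) definition above.

    from sklearn.model_selection import StratifiedKFold
    from sklearn.metrics import precision_score

    def fold_precisions(clf, X, y):
        # X and y are assumed to be NumPy arrays.  Each stratified test
        # fold holds 12 of the 60 points with roughly the class
        # proportions of the whole set.
        scores = []
        for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X, y):
            clf.fit(X[train_idx], y[train_idx])
            pred = clf.predict(X[test_idx])
            scores.append(precision_score(y[test_idx], pred,
                                          average='micro'))
        return scores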
We ran an SVM with four types of kernel function: linear, radial basis (RBF, with γ = 0.03125), polynomial, and sigmoid (again with γ = 0.03125). Table 1 displays the mean (over 26 groups of five runs each) precision with which our SVMs classified the 12 data points in our test folds. Note that, for each kernel function, the mean precision and the standard deviation turned out the same for each of the 26 groups of runs.

    Kernel      RBF     Polynomial     Linear    Sigmoid
     Mean       0.83       0.70         0.88       0.87
    St. Dev.    0.07       0.11         0.11       0.04

Table 1: Precision of the SVMs (12 points, 26 groups of 5 runs)

A Gaussian NB classifier and an FNN were trained and tested, again with 26 groups of five runs each of five-fold cross-validation. Table 2 shows the resulting ranges of mean precision values and standard deviations of the precision values for the 12 data points in the test fold.

    Technique     Gaussian NB         FNN
      Mean         0.78 - 0.78     0.58 - 0.74
     St. Dev.      0.10 - 0.10     0.00 - 0.009

Table 2: Precision of the Gaussian NB and FNN (12 points, 26 groups of 5 runs)

Random-forest and AdaBoost classifiers were also trained and tested with 26 groups of five runs each of five-fold cross-validation. Both of these techniques were implemented with 6, 50, and 75 constituent classifiers. For AdaBoost, the mean precision for each of the 26 runs for all numbers of constituent classifiers was 0.57, and the standard deviation for each run was 0.10. Table 3 shows, for the random forest classifiers, the range in the mean precision values of each group of five runs and the range in the standard deviations of the precision values for these runs.

    No. of Est.         6             50            75
      Mean         0.73 - 0.87    0.82 - 0.88   0.82 - 0.92
     St. Dev.      0.05 - 0.19    0.06 - 0.17   0.04 - 0.17

Table 3: Precision of the random forests with various numbers of estimators (12 points, 26 groups of 5 runs)

Regarding precision, the best techniques were SVM with linear (88%) and sigmoid (87%) kernel functions. The random forest with 75 estimators had average precision values in the range 82-92% and ranks with these two SVM classifiers. The random forest with 50 estimators is close behind (82-88%). Next comes the SVM with an RBF kernel function (83%), followed by the random forest with six estimators (73-87%), and then the Gaussian NB (78%). The FNN performed poorly (58-74%), and the AdaBoosts with any number of estimators were the worst performing techniques (57%).

Turning to the time it took the classifiers trained with the various techniques to classify the 12 data points in the test fold, Tables 4 and 5 show the range of the 26 five-run means of this time (in milliseconds) for each of the kernel functions of our SVM, along with the standard deviations for these times. All four kernels classified the 12 points in 0.08 to 0.12 msec. Table 6 shows the range of the means and standard deviations of this time in milliseconds for Gaussian NB and FNN to classify the 12 data points.

    Kernel          RBF           Polynomial
     Mean        0.09 - 0.11      0.09 - 0.10
    St. Dev.    0.0004 - 0.02    0.0002 - 0.02

Table 4: Time (msec.) for SVMs (RBF and polynomial kernels) to classify the 12 data points in the test fold (26 groups of 5 runs)
    Kernel        Linear          Sigmoid
     Mean       0.08 - 0.11     0.10 - 0.12
    St. Dev.   0.001 - 0.04     0.001 - 0.02

Table 5: Time (msec.) for SVMs (linear and sigmoid kernels) to classify the 12 data points in the test fold (26 groups of 5 runs)

    Technique     Gaussian NB          FNN
      Mean         0.22 - 0.27      1.60 - 1.91
     St. Dev.     0.0003 - 0.07    0.001 - 0.64

Table 6: Time (msec.) for Gaussian NB and FNN to classify the 12 data points in the test fold (26 groups of 5 runs)

Finally, Table 7 shows these times (in msec.) for AdaBoost with 6, 50, and 75 estimators, respectively, to classify 12 data points, and Table 8 shows the same for random forest.

    No. of Est.         6              50             75
      Mean         0.49 - 0.59     3.70 - 3.99    5.59 - 5.88
     St. Dev.      0.001 - 0.18    0.04 - 0.18    0.04 - 0.57

Table 7: Time (msec.) for AdaBoost with various numbers of estimators to classify the 12 data points in the test fold (26 groups of 5 runs)

    No. of Est.         6               50             75
      Mean         0.32 - 0.36     1.58 - 1.75    2.29 - 2.45
     St. Dev.     0.0004 - 0.06    0.02 - 0.28    0.03 - 0.15

Table 8: Time (msec.) for random forest with various numbers of estimators to classify the 12 data points in the test fold (26 groups of 5 runs)

The SVM classifiers with all the kernel functions investigated, at around 0.10 msec. to classify 12 data points, were significantly faster than the next fastest technique, which was Gaussian NB, in the range 0.22-0.27 msec. Random forest with 6 estimators (0.32-0.36 msec.) was close behind, followed at a significant interval by AdaBoost with 6 estimators (0.49-0.59 msec.). The remaining classifiers took well over one msec. FNN (1.60-1.91 msec.) was close to random forest with 50 estimators (1.58-1.75 msec.). AdaBoost with 50 estimators (3.70-3.99 msec.) was slower than random forest with 75 estimators (2.29-2.45 msec.), and AdaBoost with 75 estimators (5.59-5.88 msec.) was significantly slower still.

SVM with a linear or sigmoid kernel function was the most precise technique (87 or 88%) and the technique that classified data points fastest (taking about 0.1 msec. to classify 12 data points). Random forest had an increase in precision of only 1 to 12% going from 6 to 50 estimators, but the time required to classify 12 data points went from 0.32-0.36 msec. to 1.58-1.75 msec. Increasing the number of estimators from 50 to 75 (by 50%) increased the precision modestly (from 0.82-0.88 to 0.82-0.92), but enough to rival the SVMs, while increasing the time to classify 12 data points by 40-45%. So the random forest technique proved reasonably precise if somewhat on the slow side. AdaBoost was a complete disappointment, as its precision (57%) was worse than that of any other technique, the second worst being FNN (58-74%). With six estimators, AdaBoost was about three times faster than FNN, but with 50 estimators (3.70-3.99 msec.) it is significantly slower than FNN (1.60-1.91 msec.).
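For reference, classification times like those above can be measured with a sketch like the following; the use of time.perf_counter and the millisecond conversion are our assumptions about how such timing might be done, not a record of our exact harness.

    import time

    def classify_time_ms(clf, X_test):
        # Time only the prediction call on the 12-point test fold.
        start = time.perf_counter()
        clf.predict(X_test)
        return (time.perf_counter() - start) * 1000.0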
                        Conclusion

We report here on work that is part of our development of an agent-based structural health monitoring (SHM) system. The data used are acoustic signals, and one attempts to classify these signals according to source. The agents are for the most part proxies for communication- and computation-intensive techniques. They negotiate to determine a pattern of techniques for understanding the situation at hand. Such a pattern determines a workflow. The agents respond in an intelligent way by determining a constellation of techniques appropriate for the situation at hand. It is critical that the system have a repertoire of classifiers with different characteristics so that a combination appropriate for the situation at hand can generally be found.

Following Worden and his colleagues (Worden et al. 2007), we use unsupervised learning for identifying the existence and location of damage but supervised learning for identifying the type and severity of damage. Our objective at this stage is to explore various machine-learning techniques and note their characteristics so that various combinations of them may be used appropriately in various circumstances. This paper in particular reports on experiments with supervised learning techniques using data typical of our domain. The supervised learning techniques investigated are support vector machines (SVMs), naive Bayes classifiers (NBs), and feed-forward neural networks (FNNs), as well as those newly reported with this paper, the ensemble techniques random forests and AdaBoost. SVMs were used with four kernel functions: linear, radial basis (RBF, with γ = 0.03125), polynomial, and sigmoid (also with γ = 0.03125). Random forest and AdaBoost were both implemented with 6, 50, and 75 estimators.

As before, SVM with a linear or sigmoid kernel function was the most precise technique and the technique that classified data points fastest. The random forest technique proved reasonably precise but somewhat slow. Increasing the number of estimators made no difference in the precision of AdaBoost and only a modest improvement for random forest, but the time required to classify data points appeared to be nearly linear in the number of estimators. AdaBoost was a complete disappointment, as it produced the worst precision of any of the techniques, and even with just six estimators it took twice as long to classify data points as Gaussian NB.

These results apparently leave no room for intelligent decision by our multiagent system, as it appears that a classifier trained as an SVM with either a linear or sigmoid kernel function should be chosen every time. But recall that we consider combinations of classifiers trained in unsupervised and supervised learning modes, the first to find the existence and location of damage and the second to determine the extent and type of damage. For unsupervised learning, we found (Nick et al. 2015) that self-organizing maps (SOMs) appear to give more reliable classifications than k-means classifiers, although they take much longer to classify data points. So with unsupervised learning there are tradeoffs and a meaningful choice. In fact, there is still a large number of techniques to investigate, even when restricting ourselves to ensemble techniques. And many techniques can be adapted in subtle ways not considered here. Finally, even among supervised learning techniques, some might be better than others in specific circumstances while being inferior in general.

In a practical situation, we look at a large number of events and watch for cases where hundreds are classified as originating from crack growth. So we can tolerate a certain amount of inaccuracy. Cracks, however, grow over months, yet relevant events may be only milliseconds apart, and monitoring a large structure may put a premium on speed. So the extent to which classification time is critical is an involved issue.

Future work will continue investigating supervised and unsupervised learning techniques, looking for combinations of techniques appropriate for various situations. One specific topic will be random forests with boosting. We stated how our approach can be generalized for exploring the characteristics of machine-learning techniques for monitoring various kinds of structures. We intend also to make this generalization explicit.

                   Acknowledgments

The authors would like to thank the Army Research Office (proposal number 60562-RT-REP) and NASA (Grant # NNX09AV08A) for financial support. Thanks are also due to members of the ISM lab and Dr. M. Sundaresan of the Mechanical Engineering Department at North Carolina A&T State University for their assistance.
                        References

Bennett, K. P.; Demiriz, A.; and Maclin, R. 2002. Exploiting unlabeled data in ensemble methods. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 289-296. New York, NY: ACM.

Bishop, C. M. 2006. Pattern Recognition and Machine Learning. New York, NY: Springer.

Breiman, L. 2001. Random forests. Machine Learning 45(1):5-32.

Dietterich, T. G. 2000. Ensemble methods in machine learning. In Multiple Classifier Systems, 1-15. New York, NY: Springer.

Duda, R. O.; Hart, P. E.; and Stork, D. G. 2001. Pattern Classification. Hoboken, NJ: John Wiley & Sons.

Esterline, A.; Krishnamurthy, K.; Sundaresan, M.; Alam, T.; Rajendra, D.; and Wright, W. 2010. Classifying acoustic emission data in structural health monitoring using support vector machines. In Proceedings of AIAA Infotech@Aerospace 2010 Conference.

Farrar, C. R., and Lieven, N. A. 2007. Damage prognosis: The future of structural health monitoring. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 365(1851):623-632.

Farrar, C. R., and Worden, K. 2007. An introduction to structural health monitoring. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 365(1851):303-315.

Farrar, C. R., and Worden, K. 2012. Structural Health Monitoring: A Machine Learning Perspective. Hoboken, NJ: John Wiley & Sons.

Figueiredo, E.; Park, G.; Farrar, C. R.; Worden, K.; and Figueiras, J. 2011. Machine learning algorithms for damage detection under operational and environmental variability. Structural Health Monitoring 10(6):559-572.

Foster, I.; Jennings, N. R.; and Kesselman, C. 2004. Brain meets brawn: Why grid and agents need each other. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, Volume 1, 8-15. Piscataway, NJ: IEEE Computer Society.

Liaw, A., and Wiener, M. 2002. Classification and regression by randomForest. R News 2(3):18-22.

Maclin, R., and Opitz, D. 1999. Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research.

Nick, W.; Asamene, K.; Bullock, G.; Esterline, A.; and Sundaresan, M. 2015. A study of machine learning techniques for detecting and classifying structural damage. Forthcoming.

Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. 2011. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research 12:2825-2830.

Rätsch, G.; Onoda, T.; and Müller, K.-R. 2001. Soft margins for AdaBoost. Machine Learning 42(3):287-320.

Schapire, R. E. 1999. A brief introduction to boosting. In IJCAI, volume 99, 1401-1406.

Schaul, T.; Bayer, J.; Wierstra, D.; Sun, Y.; Felder, M.; Sehnke, F.; Rückstieß, T.; and Schmidhuber, J. 2010. PyBrain. The Journal of Machine Learning Research 11:743-746.

Tibaduiza, D.-A.; Torres-Arredondo, M.-A.; Mujica, L.; Rodellar, J.; and Fritzen, C.-P. 2013. A study of two unsupervised data driven statistical methodologies for detecting and classifying damages in structural health monitoring. Mechanical Systems and Signal Processing 41(1):467-484.

Wooldridge, M. 2009. An Introduction to Multiagent Systems. Hoboken, NJ: John Wiley & Sons.

Worden, K.; Farrar, C. R.; Manson, G.; and Park, G. 2007. The fundamental axioms of structural health monitoring. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science 463(2082):1639-1664.