=Paper= {{Paper |id=Vol-1583/CoCoNIPS_2015_paper_12 |storemode=property |title=Early Detection of Combustion Instability by Neural-Symbolic Analysis on Hi-Speed Video |pdfUrl=https://ceur-ws.org/Vol-1583/CoCoNIPS_2015_paper_12.pdf |volume=Vol-1583 |authors=Soumalya Sarkar,Kin Gwn Lore,Soumik Sarkar |dblpUrl=https://dblp.org/rec/conf/nips/SarkarLS15 }} ==Early Detection of Combustion Instability by Neural-Symbolic Analysis on Hi-Speed Video== https://ceur-ws.org/Vol-1583/CoCoNIPS_2015_paper_12.pdf

Early Detection of Combustion Instability by
Neural-Symbolic Analysis on Hi-Speed Video

Soumalya Sarkar
United Technology Research Center
East Hartford, CT 06118
sms388@gmail.com

Kin Gwn Lore
Mechanical Engineering, Iowa State University
Ames, IA 50011
kglore@iastate.edu

Soumik Sarkar
Mechanical Engineering, Iowa State University
Ames, IA 50011
soumiks@iastate.edu

Abstract

This paper proposes a neural-symbolic framework for analyzing a large volume
of sequential hi-speed images of combustion flame for early detection of insta-
bility that is extremely critical for engine health monitoring and prognostics. The
proposed hierarchical approach involves extracting low-dimensional semantic fea-
tures from images using deep Convolutional Neural Networks (CNN) followed by
capturing the temporal evolution of the extracted features using Symbolic Time
Series Analysis (STSA). Furthermore, the semantic nature of the CNN features
enables expert-guided data exploration that can lead to better understanding of the
underlying physics. Extensive experimental data have been collected in a swirl-
stabilized dump combustor at various operating conditions for validation.

1 Introduction

Recent advancements in deep learning show that neural approaches are excellent at low-level
feature extraction from raw data, automated learning and discriminative tasks. However, such
models may still be less suited to logical reasoning, interpretation and incorporation of domain
knowledge. On the other hand, symbolic approaches can potentially alleviate such issues, as they
have been shown to be effective in high-level reasoning and in capturing sequences of actions.
Therefore, a hybrid neural-symbolic [1] learning architecture has the potential to execute
high-level reasoning tasks using the symbolic part, based on the automated features extracted by
the neural segment.
In this paper, we propose a neural-symbolic anomaly detection framework for the crucial physical
process of combustion, where a pure black-box model is unacceptable because domain interpretation
and a better understanding of the underlying complex physics are required. Combustion instability,
which reduces the efficiency and longevity of a gas-turbine engine, is considered a significant
anomaly characterized by high-amplitude flame oscillations at discrete frequencies. These
frequencies typically represent the natural acoustic modes of the combustor. Combustion
instability arises from a positive coupling between the heat release rate oscillations and the
pressure oscillations, provided this driving force is higher than the damping present in the
system. Coherent structures are fluid mechanical structures associated with a coherent phase of
vorticity and high levels of vorticity, among other definitions [2]. These structures, whose
generation mechanisms vary from system to system, cause large scale

Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.
velocity oscillations and overall flame shape oscillations by curling and stretching. These
structures can be generated and shed at the duct acoustic modes when the forcing (pressure)
amplitudes are high. The interesting case in which the natural shedding frequency of these
structures causes acoustic oscillations has been observed by Chakravarthy et al. [3]. There has
been considerable recent research interest in detecting these coherent structures and correlating
them with heat release rate and unsteady pressure. Popular methods for detecting coherent
structures are proper orthogonal decomposition (POD) [4] (similar to principal component analysis
[5]) and dynamic mode decomposition (DMD) [6], which use tools from spectral theory to derive
spatial coherent structure modes.
Although it is known that an abundant presence of coherent structures indicates instability, it is
quite difficult to visually characterize such structures. Furthermore, it becomes particularly
difficult to identify precursors of instability due to the lack of physical understanding of the
coherent structures.
In this paper, we show that a deep CNN [7] based feature extractor can learn meaningful patterns
from unstable flame images that can be interpreted as coherent structures. A symbolic model can
then capture the temporal dynamics of the appearance of such patterns as a flame makes the
transition from stable to unstable states, which results in early detection of instability.
Specifically, we use the recently reported symbolic time series analysis (STSA) [8], a fast
probabilistic graphical modeling approach. Among many other applications, such as fault detection
in gas turbine engines [9], STSA has recently been applied to pressure and chemiluminescence time
series for early detection of lean blowout [10] and thermo-acoustic instability [11]. Note that a
fully neural temporal model (e.g., deep RNN) would not be preferable, as it is important to
understand specific transitions among various coherent structures. Major contributions of the
paper are delineated below.

• A novel data-driven framework, with CNN at lower layer and STSA at upper layer, is
proposed for early detection of thermo-acoustic instability from hi-speed videos.
• In the above framework, the CNN layers extract meaningful shape-features to represent the
coherent structures of varied sizes and orientations in the flame images. This phenomenon
enables STSA at the temporal modeling layer to capture all the fast time scale precursors
before attaining persistent instability.
• The proposed theory and the associated algorithms have been experimentally validated for
transition data at multiple operating conditions in a swirl-stabilized combustor by charac-
terizing the stable and unstable states of combustion.
• Training and testing of the proposed framework have been performed on different operating
conditions (e.g., air flow rate, fuel flow rate, and air-fuel premixing level) of the combustion
process to test the transferability of the approach. The performance of the proposed framework
(‘CNN+STSA’) has been evaluated by comparison with that of a framework in which CNN
is replaced by another extensively used dimensionality reduction tool, principal component
analysis (PCA) [5].

2 Problem Setup and Experiments
To collect training data for learning the coherent structures, thermo-acoustic instability was
induced in a laboratory-scale combustor with a 30 mm swirler (60 degree vane angles with a
geometric swirl number of 1.28). Figure 1 (a) shows the setup, and a detailed description can be
found in [12]. In the combustor, 4 different instability conditions were induced: 3 seconds of
hi-speed video (i.e., 9000 frames) were captured at 45 lpm (liters per minute) FFR (fuel flow
rate) and 900 lpm AFR (air flow rate), and at 28 lpm FFR and 600 lpm AFR, for both levels of
premixing. Figure 1 (b) presents a sequence of images of dimension 100 × 237 pixels for the
unstable state (AFR = 900 lpm, FFR = 45 lpm and full premixing). The flame inlet is on the right
side of each image and the flame flows downstream to the left. As the combustion is unstable,
Figure 1 (b) shows the formation of a mushroom-shaped vortex (coherent structure) at t = 0 and
0.001 s and its shedding downstream from t = 0.002 s to t = 0.004 s. For testing the proposed
architecture, 5 transition videos of 7 seconds length were collected, in which stable combustion
progressively becomes unstable via the intermittency phenomenon (fast switching between stability
and instability as a precursor to persistent
instability) by reducing FFR or increasing AFR. The transition protocols are as follows (all units
are lpm): (i) AFR = 500 and FFR = 40 to 28, (ii) AFR = 500 and FFR = 40 to 30, (iii) FFR = 40 and
AFR = 500 to 600, (iv) AFR = 600 and FFR = 50 to 35, (v) FFR = 50 and AFR = 700 to 800. These
data sets are referred to as $500_{40to38}$, $500_{40to30}$, $40_{500to600}$, $600_{50to35}$ and
$50_{700to800}$, respectively, throughout the rest of this paper.

[Figure 1 (b) panels: flame images at t = 0 s, t = 0.001 s, t = 0.002 s, t = 0.003 s and t = 0.004 s]

Figure 1: (a) Schematics of the experimental apparatus. 1 - settling chamber, 2 - inlet duct, 3 - inlet
optical access module (IOAM), 4 - test section, 5 & 6 - big and small extension ducts, 7 - pressure
transducers, Xs - swirler location, Xp - transducer port location, Xi - fuel injection location. (b)
Visible coherent structure in greyscale images at 900 lpm AFR and full premixing for 45 lpm FFR.

3 Neural symbolic dynamics
This section describes the proposed architecture for early detection of thermo-acoustic instability in
a combustor via analyzing a sequence of hi-speed images. Figure 2 presents the schematics of the
framework where a deep CNN is stacked with symbolic time series analysis (STSA). In the training
phase, images (or a segment of the images) from unstable state for various operating conditions are
used as the input to the CNN.
[Figure 2 diagram. Deep convolutional neural network: input hi-speed flame video (100 × 237)
→ C1 convolution (20 kernels, 33 × 34), feature maps 68 × 204 → S1 subsampling (2 × 2 2D
maxpooling), feature maps 34 × 102 → C2 convolution (50 kernels, 23 × 23), feature maps 12 × 80
→ S2 subsampling (2 × 2 2D maxpooling), feature maps 6 × 40 → fully connected output layer;
training with detailed condition of instability. STSA: partitioning of the output time series →
symbol sequence (… α α α β β γ β α α β γ β γ β γ …) → state splitting → state merging →
generalized D-Markov machine → state transition matrix → instability measure over time (early
detection of instability).]

Figure 2: Neural-Symbolic Dynamics Architecture

While testing, sigmoid outputs from the fully connected layer can be symbolized into a symbol
sequence to help capture the temporal evolution of coherent structures in the flame, thereby
serving as a precursor in the early detection of unstable combustion flames. In the STSA module,
the time series is symbolized by partitioning the signal space, and a symbol sequence is created
as shown in Figure 2. A generalized D-Markov machine is constructed from the symbol sequence via
state splitting and state merging [13, 14]; it models the transitions from one state to another
as a state transition matrix. The state transition matrix is the extracted feature that
represents the sequence of images, essentially capturing the temporal evolution of coherent
structures in the flame. The deep CNN and STSA structures are explained in the sequel.

3.1 Deep Convolutional Neural Network

The recent success of deep learning architectures can be largely attributed to the strong emphasis
on modeling multiple levels of abstraction (from low-level features to higher-order
representations, i.e., features of features) from data. For example, in a typical image processing
application, while low-level features can be partial edges and corners, high-level features may be
a combination of edges and corners that forms part of an image [7]. Among various deep learning
techniques, the Convolutional Neural Network (CNN) [15] is an attractive option for extracting
pertinent features from images in a hierarchical manner for detection, classification, and
prediction. For the purpose of this study, a CNN is a suitable choice as it preserves the local
structures in an image at various scales. Hence, it is capable of extracting local coherent
structures of various sizes in a flame image. CNNs are also easier to train and achieve comparable
(and often better) performance, even though they have fewer parameters than fully connected
networks with the same number of hidden layers.
In CNNs, data is represented by multiple feature maps in each hidden layer, as shown in Figure 2.
Feature maps are obtained by convolving the input image with multiple filters in the corresponding
hidden layer. To further reduce the dimension of the data, these feature maps typically undergo
non-linear downsampling with 2 × 2 or 3 × 3 maxpooling. Maxpooling essentially partitions the
input image into sets of non-overlapping rectangles and takes the maximum value of each partition
as the output. After maxpooling, multiple dimension-reduced representations of the input are
acquired, and the process is repeated in the next layer to learn a higher-level representation of
the data. At the final pooling layer, the resultant outputs are linked to the fully connected
layer, where sigmoid outputs from the hidden units are post-processed by a softmax function in
order to predict the class that has the highest probability given the input data. This way,
coherent structures in the unstable flame can be learned at different operating conditions.
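
As an illustration of the maxpooling step described above, the following minimal NumPy sketch
applies non-overlapping 2 × 2 maxpooling to a single feature map. The function name `maxpool2d`
and the choice to drop incomplete border blocks are assumptions of this sketch, not details taken
from the implementation used in the paper.

```python
import numpy as np

def maxpool2d(feature_map: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping size x size max pooling of a 2-D feature map."""
    rows, cols = feature_map.shape
    # Assumption of this sketch: trailing rows/columns that do not fill
    # a complete block are dropped.
    rows, cols = rows - rows % size, cols - cols % size
    blocks = feature_map[:rows, :cols].reshape(rows // size, size, cols // size, size)
    # Take the maximum inside every size x size block.
    return blocks.max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = maxpool2d(fmap)  # [[5., 7.], [13., 15.]]
```

Applied to a 68 × 204 map, the same function returns a 34 × 102 map, which matches the first
subsampling stage shown in Figure 2.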

3.2 Symbolic Time Series Analysis (STSA)

STSA [16] deals with discretization of dynamical systems in both space and time. The notion of
STSA has led to the development of a (nonlinear) data-driven feature extraction tool for dynamical
systems based on probabilistic finite state automata (PFSA). Rao et al. [17] and Bahrampour et al.
[18] have shown that the performance of this PFSA-based tool as a feature extractor for
statistical pattern recognition is comparable (and often superior) to that of other existing
techniques (e.g., Bayesian filters, Artificial Neural Networks, and Principal Component Analysis
[5]). The trajectory of the dynamical system is partitioned into finitely many mutually exclusive
and exhaustive cells for symbolization, where each cell corresponds to a single symbol belonging
to a (finite) alphabet Σ. There are different types of partitioning tools, such as maximum entropy
partitioning (MEP), uniform partitioning (UP) [19] and maximally bijective partitioning [20]. This
paper adopts MEP for symbolization of time series, which maximizes the entropy of the generated
symbols by placing an (approximately) equal number of data points in each partition cell. The next
step is to construct a PFSA from the symbol strings to encode the embedded statistical
information. A PFSA is a 4-tuple K = (Σ, Q, δ, π) which consists of a finite set of states (Q)
interconnected by transitions [21], where each transition corresponds to a symbol in the finite
alphabet (Σ). At each step, the automaton moves from one state to another (including self loops)
via the transition map (δ : Q × Σ → Q) according to the probability morph function
(π : Q × Σ → [0, 1]), and thus generates a corresponding block of symbols, so that the probability
distributions over the set of all possible strings defined over the alphabet are represented in
the space of PFSA.
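
A minimal sketch of MEP is given below, assuming a scalar time series and using empirical
quantiles as cell boundaries so that each cell receives approximately the same number of points.
The names `mep_boundaries` and `symbolize`, the synthetic Gaussian signal and the 0-based symbol
indices are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def mep_boundaries(signal, alphabet_size):
    """Interior cell boundaries that place ~equal numbers of points in each cell."""
    probs = np.arange(1, alphabet_size) / alphabet_size
    return np.quantile(signal, probs)

def symbolize(signal, boundaries):
    """Map each sample to a symbol index in {0, ..., alphabet_size - 1}."""
    return np.searchsorted(boundaries, signal, side="right")

rng = np.random.default_rng(0)
x = rng.normal(size=3000)              # hypothetical stand-in for a measured time series
bounds = mep_boundaries(x, 3)          # two boundaries for an alphabet of size 3
symbols = symbolize(x, bounds)
counts = np.bincount(symbols, minlength=3)
```

Because the boundaries are quantiles of the data, the symbol counts are nearly equal, which is the
equal-occupancy property that maximizes the entropy of the symbol distribution.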
3.2.1 Generalized D-Markov Machine [10]
A D-Markov machine is a model of probabilistic languages based on the algebraic structure of PFSA.
In D-Markov machines, the future symbol is causally dependent on the (most recently generated)
finite set of (at most) D symbols, where D is a positive integer. The underlying FSA in the PFSA
of a D-Markov machine is deterministic. The complexity of a D-Markov machine is reflected by the
entropy rate, which also represents its overall capability of prediction. A D-Markov machine and
its entropy rate are formally defined as:

Definition 3.1 (D-Markov) A D-Markov machine is a statistically stationary stochastic process
$S = \cdots s_{-1} s_0 s_1 \cdots$ (modeled by a PFSA in which each state is represented by a
finite history of at most D symbols), where the probability of occurrence of a new symbol depends
only on the last D symbols, i.e.,
$$P[s_n \mid \cdots s_{n-D} \cdots s_{n-1}] = P[s_n \mid s_{n-D} \cdots s_{n-1}] \tag{1}$$
D is called the depth. Q is the finite set of states with cardinality $|Q| \le |\Sigma|^D$, i.e.,
the states are represented by equivalence classes of symbol strings of maximum length D, where
each symbol belongs to the alphabet Σ. δ : Q × Σ → Q is the state transition function that
satisfies the following condition: if $|Q| = |\Sigma|^D$, then there exist α, β ∈ Σ and
$x \in \Sigma^\star$ such that δ(αx, β) = xβ and αx, xβ ∈ Q.
Definition 3.2 (D-Markov Entropy Rate) The D-Markov entropy rate of a PFSA (Σ, Q, δ, π) is
defined in terms of the conditional entropy as:
$$H(\Sigma \mid Q) \triangleq \sum_{q \in Q} P(q) H(\Sigma \mid q) = -\sum_{q \in Q} \sum_{\sigma \in \Sigma} P(q) P(\sigma \mid q) \log P(\sigma \mid q)$$
where P(q) is the probability of a PFSA state q ∈ Q and P(σ|q) is the conditional probability of a
symbol σ ∈ Σ given that a PFSA state q ∈ Q is observed.

3.2.2 Construction of a D-Markov Machine [13]
The underlying procedure for construction of a D-Markov machine from a symbol sequence consists
of two major steps: state splitting and state merging [13, 14]. In general, state splitting increases the
number of states to achieve more precision in representing the information content of the dynamical
system. State merging reduces the number of states in the D-Markov machine by merging those
states that have similar statistical behavior. Thus, a combination of state splitting and state merging
leads to the final form of the generalized D-Markov machine as described below.
State Splitting: The number of states of a D-Markov machine of depth D is bounded above by
$|\Sigma|^D$, where $|\Sigma|$ is the cardinality of the alphabet Σ. As this relation is
exponential in nature, the number of states rapidly increases as D is increased. However, from the
perspective of modeling a symbol sequence, some states may be more important than others in terms
of their embedded information contents. Therefore, it is advantageous to have a set of states that
correspond to symbol blocks of different lengths. This is accomplished by starting off with the
simplest set of states (i.e., Q = Σ for D = 1) and subsequently splitting the current state that
results in the largest decrease of the D-Markov entropy rate. The process of splitting a state
q ∈ Q is executed by replacing the symbol block q by its branches as described by the set
{σq : σ ∈ Σ} of words. Maximum reduction of the entropy rate is the governing criterion for
selecting the state to split. In addition, the generated set of states must satisfy the
self-consistency criterion, which only permits a unique transition to emanate from a state for a
given symbol. If δ(q, σ) is not unique for each σ ∈ Σ, then the state q is split further. The
process of state splitting is terminated by either the threshold parameter $\eta_{spl}$ on the
rate of decrease of the entropy rate or a maximal number of states $N_{max}$. For construction of
the PFSA, each element π(q, σ) of the morph matrix Π is estimated by frequency counting as the
ratio of the number of times, $N(q\sigma)$, the state q is followed (i.e., suffixed) by the symbol
σ and the number of times, $N(q)$, the state q occurs; the details are available in [13]. The
estimated morph matrix $\hat{\Pi}$ and the stationary state probability vector $\hat{P}(q)$ are
obtained as:
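$$\hat{\pi}(q, \sigma) \triangleq \frac{1 + N(q\sigma)}{|\Sigma| + N(q)} \quad \forall \sigma \in \Sigma \ \forall q \in Q; \qquad \hat{P}(q) \triangleq \frac{1 + N(q)}{|Q| + \sum_{q' \in Q} N(q')} \quad \forall q \in Q \tag{2}$$
where $\sum_{\sigma \in \Sigma} \hat{\pi}(q, \sigma) = 1 \ \forall q \in Q$. Then, the D-Markov
entropy rate (see Definition 3.2) is computed as:
$$H(\Sigma \mid Q) = -\sum_{q \in Q} \sum_{\sigma \in \Sigma} P(q) P(\sigma \mid q) \log P(\sigma \mid q) \approx -\sum_{q \in Q} \sum_{\sigma \in \Sigma} \hat{P}(q) \hat{\pi}(q, \sigma) \log \hat{\pi}(q, \sigma)$$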
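
The estimates in Eq. (2) and the resulting entropy rate can be sketched in Python as follows. For
brevity, this sketch assumes a full depth-D machine in which every word of length D is a state (no
state splitting or merging is performed), and it counts N(q) as the number of times state q is
followed by a symbol. The function names `estimate_pfsa` and `entropy_rate` are hypothetical.

```python
import numpy as np

def estimate_pfsa(symbols, n_symbols, depth=1):
    """Estimate the morph matrix and state probabilities of a full depth-D machine."""
    symbols = np.asarray(symbols)
    n_states = n_symbols ** depth
    counts = np.zeros((n_states, n_symbols))          # counts[q, s] = N(q s)
    for t in range(depth, len(symbols)):
        # Encode the last `depth` symbols as a base-|Sigma| state index.
        q = 0
        for s in symbols[t - depth:t]:
            q = q * n_symbols + int(s)
        counts[q, symbols[t]] += 1
    n_q = counts.sum(axis=1)                          # N(q)
    morph = (1 + counts) / (n_symbols + n_q[:, None]) # Eq. (2), morph matrix estimate
    state_prob = (1 + n_q) / (n_states + n_q.sum())   # Eq. (2), state probability estimate
    return morph, state_prob

def entropy_rate(morph, state_prob):
    """Plug-in estimate of the D-Markov entropy rate (natural logarithm)."""
    return float(-np.sum(state_prob[:, None] * morph * np.log(morph)))

periodic = np.tile([0, 1, 2], 100)                    # highly predictable sequence
morph, state_prob = estimate_pfsa(periodic, n_symbols=3)
h_periodic = entropy_rate(morph, state_prob)

rng = np.random.default_rng(0)
random_symbols = rng.integers(0, 3, size=300)         # nearly unpredictable sequence
h_random = entropy_rate(*estimate_pfsa(random_symbols, n_symbols=3))
```

The additive constants in Eq. (2) keep every estimated probability strictly positive, so the
logarithm is always defined, and the periodic sequence yields a much lower entropy rate than the
random one.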
State Merging: Merging states reduces the size of the machine but risks degrading the precision of
the representation. The algorithm mitigates this risk via a stopping rule that is constructed by
specifying an acceptable threshold $\eta_{mrg}$ on the distance Ψ(·, ·) between the merged PFSA
and the PFSA generated from the original time series. The distance metric Ψ(·, ·) between two
PFSAs $K_1 = (\Sigma, Q_1, \delta_1, \pi_1)$ and $K_2 = (\Sigma, Q_2, \delta_2, \pi_2)$ is as
follows:
$$\Psi(K_1, K_2) \triangleq \lim_{n \to \infty} \sum_{j=1}^{n} \frac{\left\| P_1(\Sigma^j) - P_2(\Sigma^j) \right\|_{\ell_1}}{2^{j+1}} \tag{3}$$
where $P_1(\Sigma^j)$ and $P_2(\Sigma^j)$ are the steady state probability vectors of generating
words of length j from the PFSA $K_1$ and $K_2$, respectively, i.e.,
$P_1(\Sigma^j) \triangleq [P(w)]_{w \in \Sigma^j}$ for $K_1$ and
$P_2(\Sigma^j) \triangleq [P(w)]_{w \in \Sigma^j}$ for $K_2$. States that behave similarly (i.e.,
have similar morph probabilities) have a higher priority for merging. The similarity of two
states, q, q′ ∈ Q, is measured in terms of the respective morph functions of future symbol
generation as the distance between the two rows of the estimated morph matrix $\hat{\Pi}$
corresponding to the states q and q′. The ℓ1-norm has been adopted as the distance function, as
seen below.
$$\mathcal{M}(q, q') \triangleq \left\| \hat{\pi}(q, \cdot) - \hat{\pi}(q', \cdot) \right\|_{\ell_1} = \sum_{\sigma \in \Sigma} \left| \hat{\pi}(q, \sigma) - \hat{\pi}(q', \sigma) \right|$$
Hence, the two closest states (i.e., the pair of states q, q′ ∈ Q having the smallest value of
M(q, q′)) are merged using the merging algorithm explained in [13]. The merging algorithm updates
the morph matrix and transition function in such a way that no ambiguity due to nondeterminism is
introduced [8]. Subsequently, the distance Ψ(·, ·) of the merged PFSA from the PFSA generated from
the initial symbol string is evaluated. If $\Psi < \eta_{mrg}$, where $\eta_{mrg}$ is the
specified merging threshold, then the merged machine structure is retained and the states next on
the priority list are merged. On the other hand, if $\Psi \ge \eta_{mrg}$, then the process of
merging the given pair of states is aborted and another pair of states with the next smallest
value of M(q, q′) is selected for merging. This procedure is terminated if no pair of states
exists for which $\Psi < \eta_{mrg}$.
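
The selection of merge candidates can be illustrated with a short sketch that computes the ℓ1
distance between rows of a morph matrix and returns the closest pair of states. The function name
`closest_state_pair` and the example matrix are hypothetical; the merging algorithm of [13] and
the evaluation of Ψ are not reproduced here.

```python
from itertools import combinations

import numpy as np

def closest_state_pair(morph):
    """Return (distance, q, q_prime) for the two rows with the smallest l1 distance."""
    best = None
    for q, q_prime in combinations(range(morph.shape[0]), 2):
        dist = float(np.abs(morph[q] - morph[q_prime]).sum())
        if best is None or dist < best[0]:
            best = (dist, q, q_prime)
    return best

# Hypothetical 3-state, 3-symbol morph matrix (each row sums to one).
example_morph = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.1, 0.8],
                          [0.6, 0.3, 0.1]])
dist, q, q_prime = closest_state_pair(example_morph)  # states 0 and 2 are closest
```

In this example, states 0 and 2 generate the next symbol with similar probabilities, so they would
be the first candidates for merging, subject to the threshold test on Ψ described above.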
4 Results and Discussions
This section discusses the results that are obtained when the proposed framework is applied on the
experimental data of hi-speed video for early detection of thermo-acoustic instability.
4.1 CNN training

The network is trained using flame images from the 4 different unstable combustion conditions
mentioned in Section 2. The data consist of 24,000 examples for training and 12,000 examples for
cross-validation. In the first convolutional layer, 20 filters of size 33 × 34 pixels (px) reduce
the input image of dimension 100 × 237 px to feature maps of 68 × 204 px. Next, the feature maps
are downsampled with 2 × 2 max-pooling, resulting in pooled maps of 34 × 102 px. Each of these
maps passes through another convolutional layer with 50 filters of 23 × 23 px, which produces
feature maps of 12 × 80 px and, after 2 × 2 max-pooling, pooled maps of 6 × 40 px. All generated
maps are connected to the fully connected layer of 100 hidden units followed by 10 output units,
where the sigmoid activations are extracted. Training is performed with a batch size of 20 and a
learning rate of 0.1. Convolution is done without any padding and with a stride of 1.
Visualizations of a few filters at the first and second convolutional layers are shown in
Figure 3 (a) and (b). The second-layer visualization shows that it captures fragments of the flame
coherent structures. Figure 3 (c) presents a couple of the feature maps of a stable frame (top)
and an unstable frame (bottom) after convolving with a first-layer filter. The red outline at the
bottom shows how the mushroom-shaped coherent structure is highlighted on the unstable frame
feature map.
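
The feature map dimensions quoted above follow from the filter sizes when convolution is unpadded
with a stride of 1 and pooling is non-overlapping. The short check below reproduces this
arithmetic; the helper names `conv_valid` and `pool` are assumptions of this sketch.

```python
def conv_valid(shape, kernel):
    """Output size of an unpadded, stride-1 convolution."""
    return (shape[0] - kernel[0] + 1, shape[1] - kernel[1] + 1)

def pool(shape, size=2):
    """Output size of non-overlapping size x size max-pooling."""
    return (shape[0] // size, shape[1] // size)

c1 = conv_valid((100, 237), (33, 34))   # (68, 204)
s1 = pool(c1)                           # (34, 102)
c2 = conv_valid(s1, (23, 23))           # (12, 80)
s2 = pool(c2)                           # (6, 40)

# With 50 maps after the second pooling stage, the fully connected layer
# receives 50 * 6 * 40 = 12000 inputs (derived here, not stated in the text).
flattened_inputs = 50 * s2[0] * s2[1]
```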

Figure 3: Filter visualization at convolutional layer (a) one and (b) two. (b) shows fragmented
representations of coherent structures that are visible in unstable flame. (c) Feature maps of a
stable frame (top) and an unstable frame (bottom) after applying a first convolutional layer
filter. The red outline on the unstable flame visualization shows how the mushroom-shaped coherent
structure is highlighted.

4.2 STSA-based Instability measure
Once the CNN is trained on the sets of unstable data, every frame of the transition data sets
(i.e., $500_{40to38}$, $500_{40to30}$, $40_{500to600}$, $600_{50to35}$ and $50_{700to800}$ as
mentioned in Section 2) is fed to the CNN. Each of the ten sigmoid activation units at the last
fully connected layer generates a time series for one transition data set. To capture fast changes
in the transition data, a window of 0.5 seconds (1500 frames) is traversed over the hi-speed video
with an overlap of 80% to keep the response speed at 10 Hz, which is necessary for real-time
combustion instability control. The time window output of a sigmoid activation unit is symbolized
by maximum entropy partitioning (MEP) with an alphabet size of |Σ| = 3. Considering the first
window to be the reference stable state, a generalized D-Markov machine is constructed by state
splitting with $N_{max} = 10$ and state merging with $\eta_{mrg} = 0.05$. $N_{max}$ is chosen as
10 because the window length is not sufficient to learn a large state machine. For the alphabet
{1, 2, 3}, the set of states after state splitting is
{11, 21, 31, 2, 113, 213, 313, 23, 133, 233, 333} and state merging leads to
{11, 21, 31, 2, {113, 313}, 213, {23, 133}, {233, 333}} for one of the sigmoid activation outputs
in the transition video $600_{50to35}$. The state probability vector arising from the D-Markov
machine at each time window is the feature capturing the extent of instability that is transmitted
through the corresponding sigmoid hidden unit. The instability measure of a time window is defined
as the $\ell_2$-norm distance between its state probability vector and that of the reference
stable time window.
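
The windowing scheme and the distance-based measure can be sketched as follows. A 1500-frame
window with 80% overlap advances by 300 frames, i.e., 0.1 s at the 3000 frames per second implied
by 1500 frames per 0.5 s, which gives the 10 Hz update rate. As a loudly simplifying assumption,
this sketch uses a depth-1 machine (states equal to symbols) with MEP boundaries fixed from the
reference window, instead of the generalized D-Markov machine with state splitting and merging
used in the paper; the function names and the synthetic activation series are hypothetical.

```python
import numpy as np

def window_starts(n_frames, window=1500, overlap=0.8):
    """Start indices of sliding windows; 80% overlap gives a hop of 300 frames."""
    hop = int(round(window * (1 - overlap)))
    return list(range(0, n_frames - window + 1, hop))

def state_prob_vector(symbols, n_symbols):
    """Smoothed state probabilities of a depth-1 machine, in the form of Eq. (2)."""
    counts = np.bincount(symbols, minlength=n_symbols)
    return (1 + counts) / (n_symbols + counts.sum())

def instability_measure(series, n_symbols=3, window=1500, overlap=0.8):
    """l2 distance of each window's state probability vector from the first window."""
    starts = window_starts(len(series), window, overlap)
    reference = series[:window]
    # Assumption: the partition is computed once on the reference (stable) window.
    bounds = np.quantile(reference, np.arange(1, n_symbols) / n_symbols)

    def vec(segment):
        return state_prob_vector(np.searchsorted(bounds, segment, side="right"), n_symbols)

    ref_vec = vec(reference)
    return np.array([np.linalg.norm(vec(series[s:s + window]) - ref_vec) for s in starts])

rng = np.random.default_rng(0)
stable = rng.uniform(0.0, 0.3, size=6000)      # synthetic "stable" activations
unstable = rng.uniform(0.0, 1.0, size=6000)    # synthetic "unstable" activations
series = np.concatenate([stable, unstable])
measure = instability_measure(series)
```

The measure is zero for the reference window by construction and grows once the distribution of
the activation departs from that of the reference window.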

Figure 4: Variation of the proposed instability measure with time for the transition video named
$600_{50to35}$. Multiple regions on the measure curve denote different combustion states such as
stable, temporary intermittency (a significant precursor to persistent instability) and unstable.
They correspond to varied coherent structures (bounded by red boxes) that are detected by the
‘CNN+STSA’ framework. On the right, the rms variation of the pressure is shown, as it is one of
the most commonly used instability measures. The progression of $P_{rms}$ cannot detect the
aforementioned precursors.

Figure 4 shows an aggregated progression (summation of the individual instability measures
obtained from each sigmoid activation unit) of the instability measure for $600_{50to35}$. The rms
curve of the pressure on the right of Figure 4 gives a rough idea of the ground truth regarding
stability. The proposed instability measure has two advantages over $P_{rms}$: (i) the
intermittency phenomenon (regions 2 and 3 in Figure 4) is captured by this measure because the
‘CNN+STSA’ framework can detect variable-size, mildly illuminated mushroom-shaped coherent
structures (bounded by red boxes in Figure 4), whereas $P_{rms}$ ignores these important
precursors to instability, and (ii) region 4 of Figure 4 shows that the proposed measure rises
faster towards instability. Other transition data sets also exhibit similar behavior with this
measure. Hence, the proposed measure performs better in early detection of instability than other
commonly used measures such as $P_{rms}$.
4.3 Comparison with ‘PCA+STSA’

To compare with the proposed approach, Principal Component Analysis (PCA) [5], a well-known
dimensionality reduction technique, is used as a replacement for the CNN module. Figure 5 (a)
shows that the transition (stable to unstable) increment of the aggregated instability measure for
‘CNN+STSA’ is larger than that for ‘PCA+STSA’ in all transition data. This can result in more
precise instability control in real time. The condition $50_{700to800}$ is examined in
Figure 5 (b), as the transition jumps for the two frameworks are very close. A close observation
of the instability measure variation reveals that ‘CNN+STSA’ can detect an intermittency precursor
(region 1 in Figure 5 (b)) although the coherent structure formation is not very prominent.
However, ‘PCA+STSA’ misses this precursor before arriving at the inception of persistent
instability. A probable rationale behind this observation is that, while PCA averages the image
vector based only on maximum spatial variance, the CNN learns semantic features based on the
coherent structures of varied illumination, size and orientation seen during unstable combustion.
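
For reference, the PCA baseline can be sketched as a projection of flattened frames onto their
leading principal components, each of which then yields a time series for STSA. The number of
retained components is not stated in the text, so `n_components` is left as a free parameter; the
function name `pca_features` and the random frames are hypothetical stand-ins.

```python
import numpy as np

def pca_features(frames, n_components):
    """Project flattened frames onto their leading principal components."""
    X = frames.reshape(frames.shape[0], -1).astype(float)
    X -= X.mean(axis=0)                         # center every pixel over time
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T

rng = np.random.default_rng(0)
frames = rng.random((50, 10, 12))               # stand-in for a short image sequence
features = pca_features(frames, n_components=4) # one time series per column
```

Each principal direction is chosen purely to maximize variance over the pixels, which is the
property contrasted above with the shape-selective filters learned by the CNN.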

[Figure 5 panels: (a) transition jump in instability measure versus transition condition (1 to 5)
for CNN + STSA and PCA + STSA; (b) instability measure versus time (sec) for CNN+STSA and
PCA+STSA, with the intermittency region (1) marked.]

Figure 5: (a) Comparison of the sudden change in instability measure when instability sets in for
different transition conditions, which are 1. $500_{40to38}$, 2. $500_{40to30}$,
3. $40_{500to600}$, 4. $600_{50to35}$ and 5. $50_{700to800}$. The jump is larger for ‘CNN+STSA’
than for ‘PCA+STSA’. (b) Variation of the instability measure for both ‘CNN+STSA’ and ‘PCA+STSA’
at transition condition $50_{700to800}$. The measure arising from ‘CNN+STSA’ can detect the
intermittency precursor, whereas it is mostly ignored by ‘PCA+STSA’. A frame with an intermittency
coherent structure in a red box is shown on the top.

5 Conclusions and future work
The paper proposes a framework that synergistically combines the recently introduced concepts of
CNN and STSA for early detection of thermo-acoustic instability in gas turbine engines. An
extensive set of experiments has been conducted on a swirl-stabilized combustor for validation of
the proposed method. Sequences of hi-speed greyscale images are fed into a multi-layered CNN to
model the fluctuating coherent structures in the flame, which are dominant during unstable
combustion. Fragments of coherent structures are observed in the CNN filter visualization.
Therefore, an ensemble of time series data is constructed from the sequence of images based on the
sigmoid activation probability vectors of the last hidden layer of the CNN. Then, STSA is applied
to the time series generated from an image sequence, and ‘CNN+STSA’ is found to exhibit a larger
change in instability measure during the transition to instability than ‘PCA+STSA’. The proposed
framework detects all the intermittent precursors for different transition protocols, which is the
most significant step towards detecting the onset of instability early enough for mitigation. In
summary, while the CNN captures the semantic features (i.e., coherent structures) of the
combustion flames at varied illuminations, sizes and orientations, STSA models the temporal
fluctuation of those features at a reduced dimension. One of the primary advantages of the
proposed semantic dimensionality reduction (as opposed to abstract dimensionality reduction, e.g.,
using PCA) would be the seamless involvement of domain experts in the data analytics framework for
expert-guided data exploration activities. Developing novel use-cases in this neural-symbolic
context will be key future work. Some other near-term research tasks are: (i) dynamically tracking
multiple coherent structures in the flame to characterize the extent of instability, (ii)
multi-dimensional partitioning for direct usage of the last sigmoid layer and (iii) learning the
CNN and STSA together.

Acknowledgment
The authors sincerely acknowledge the extensive data collection performed by Vikram Ramanan and
Dr. Satyanarayanan Chakravarthy at the Indian Institute of Technology Madras (IITM), Chennai. The
authors also gratefully acknowledge the support of NVIDIA Corporation with the donation of the
GeForce GTX TITAN Black GPU used for this research.

References
[1] A. Garcez, T. R. Besold, L. de Raedt, P. Foeldiak, P. Hitzler, T. Icard, K. Kuehnberger, L. C. Lamb,
R. Miikkulainen, and D. L. Silver. Neural-symbolic learning and reasoning: Contributions and challenges.
Proceedings of the AAAI Spring Symposium on Knowledge Representation and Reasoning: Integrating
Symbolic and Neural Approaches, Stanford, March 2015.
[2] A. K. M. F. Hussain. Coherent structures - reality and myth. Physics of Fluids, 26(10):2816–2850, 1983.
[3] S. R. Chakravarthy, O. J. Shreenivasan, B. Böhm, A. Dreizler, and J. Janicka. Experimental characterization
of onset of acoustic instability in a nonpremixed half-dump combustor. Journal of the Acoustical Society
of America, 122:120–127, 2007.
[4] G Berkooz, P Holmes, and J L Lumley. The proper orthogonal decomposition in the analysis of turbulent
flows. Annual Review of Fluid Mechanics, 25(1):539–575, 1993.
[5] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, New York, NY, USA, 2006.
[6] P. J. Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of Fluid Me-
chanics, 656:5–28, 2010.
[7] K. Kavukcuoglu, P. Sermanet, Y.-L. Boureau, K. Gregor, M. Mathieu, and Y. LeCun. Learning convolutional
feature hierarchies for visual recognition. In NIPS, 2010.
[8] A. Ray. Symbolic dynamic analysis of complex systems for anomaly detection. Signal Processing,
84(7):1115–1130, July 2004.
[9] S. Sarkar, K. Mukherjee, S. Sarkar, and A. Ray. Symbolic dynamic analysis of transient time series
for fault detection in gas turbine engines. Journal of Dynamic Systems, Measurement, and Control,
135(1):014506, 2013.
[10] S. Sarkar, A. Ray, A. Mukhopadhyay, R. R. Chaudhari, and S. Sen. Early detection of lean blow out (lbo)
via generalized d-markov machine construction. In American Control Conference (ACC), 2014, pages
3041–3046. IEEE, 2014.
[11] V. Ramanan, S. R. Chakravarthy, S. Sarkar, and A. Ray. Investigation of combustion instability in a swirl-
stabilized combustor using symbolic time series analysis. In Proc. ASME Gas Turbine India Conference,
GTIndia 2014, New Delhi, India, pages 1–6, December 2014.
[12] S. Sarkar, K.G. Lore, S. Sarkar, V. Ramanan, S.R. Chakravarthy, S. Phoha, and A. Ray. Early detection of
combustion instability from hi-speed flame images via deep learning and symbolic time series analysis.
In Annual Conference of The Prognostics and Health Management, pages pre–prints. PHM, 2015.
[13] K. Mukherjee and A. Ray. State splitting and state merging in probabilistic finite state automata for signal
representation and analysis. Signal Processing, 104:105–119, November 2014.
[14] S. Sarkar, A. Ray, A. Mukhopadhyay, and S. Sen. Dynamic data-driven prediction of lean blowout in
a swirl-stabilized combustor. International Journal of Spray and Combustion Dynamics, 7(3):in–press,
2015.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural
networks. In NIPS, 2012.
[16] C. Daw, C. Finney, and E. Tracy. A review of symbolic analysis of experimental data. Review of Scientific
Instruments, 74(2):915–930, 2003.
[17] C. Rao, A. Ray, S. Sarkar, and M. Yasar. Review and comparative evaluation of symbolic dynamic
filtering for detection of anomaly patterns. Signal, Image and Video Processing, 3(2):101–114, 2009.
[18] S. Bahrampour, A. Ray, S. Sarkar, T. Damarla, and N.M. Nasrabadi. Performance comparison of feature
extraction algorithms for target detection and classification. Pattern Recognition Letters, 34(16):2126–
2134, December 2013.
[19] V. Rajagopalan and A. Ray. Symbolic time series analysis via wavelet-based partitioning. Signal Pro-
cessing, 86(11):3309–3320, November 2006.
[20] S. Sarkar, A. Srivastav, and M. Shashanka. Maximally bijective discretization for data-driven modeling
of complex systems. In Proceedings of American Control Conference, Washington, D.C., 2013.
[21] M. Sipser. Introduction to the Theory of Computation, 3rd ed. Cengage Publishing, Boston, MA, USA,
2013.