=Paper=
{{Paper
|id=Vol-1583/CoCoNIPS_2015_paper_12
|storemode=property
|title=Early Detection of Combustion Instability by Neural-Symbolic Analysis on Hi-Speed Video
|pdfUrl=https://ceur-ws.org/Vol-1583/CoCoNIPS_2015_paper_12.pdf
|volume=Vol-1583
|authors=Soumalya Sarkar,Kin Gwn Lore,Soumik Sarkar
|dblpUrl=https://dblp.org/rec/conf/nips/SarkarLS15
}}
==Early Detection of Combustion Instability by Neural-Symbolic Analysis on Hi-Speed Video==
Soumalya Sarkar, United Technology Research Center, East Hartford, CT 06118, sms388@gmail.com

Kin Gwn Lore, Mechanical Engineering, Iowa State University, Ames, IA 50011, kglore@iastate.edu

Soumik Sarkar, Mechanical Engineering, Iowa State University, Ames, IA 50011, soumiks@iastate.edu

===Abstract===
This paper proposes a neural-symbolic framework for analyzing a large volume of sequential hi-speed images of combustion flame for early detection of instability, which is critical for engine health monitoring and prognostics. The proposed hierarchical approach involves extracting low-dimensional semantic features from images using deep Convolutional Neural Networks (CNN), followed by capturing the temporal evolution of the extracted features using Symbolic Time Series Analysis (STSA). Furthermore, the semantic nature of the CNN features enables expert-guided data exploration that can lead to better understanding of the underlying physics. Extensive experimental data have been collected in a swirl-stabilized dump combustor at various operating conditions for validation.

===1 Introduction===
Recent advancements in deep learning show that neural approaches are excellent at low-level feature extraction from raw data, automated learning and discriminative tasks. However, such models may still be poorly suited to logical reasoning, interpretation and incorporation of domain knowledge. Symbolic approaches, on the other hand, can potentially alleviate such issues, as they have been shown to be effective in high-level reasoning and in capturing sequences of actions. Therefore, a hybrid neural-symbolic [1] learning architecture has the potential to execute high-level reasoning tasks using the symbolic part, based on the automated features extracted by the neural segment.
In this paper, we propose a neural-symbolic anomaly detection framework for the crucial physical process of combustion, where a pure black-box model is unacceptable because domain interpretation and a better understanding of the underlying complex physics are required. Combustion instability, which reduces the efficiency and longevity of a gas-turbine engine, is considered a significant anomaly characterized by high-amplitude flame oscillations at discrete frequencies. These frequencies typically represent the natural acoustic modes of the combustor. Combustion instability arises from a positive coupling between the heat release rate oscillations and the pressure oscillations, provided this driving force is higher than the damping present in the system. Coherent structures are fluid mechanical structures associated with a coherent phase of vorticity or high levels of vorticity, among other definitions [2]. These structures, whose generation mechanisms vary from system to system, cause large-scale velocity oscillations and overall flame shape oscillations by curling and stretching. They can be shed or generated at the duct acoustic modes when the forcing (pressure) amplitudes are high. The interesting case of the natural shedding frequency of these structures causing acoustic oscillations has been observed by Chakravarthy et al. [3]. There is considerable recent research interest in detecting these coherent structures and correlating them with heat release rate and unsteady pressure. Popular methods for detecting coherent structures are proper orthogonal decomposition (POD) [4] (similar to principal component analysis [5]) and dynamic mode decomposition (DMD) [6], which use tools from spectral theory to derive spatial coherent structure modes.

Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.
Although it is known that an abundant presence of coherent structures indicates instability, it is quite difficult to characterize such structures visually. Furthermore, it becomes particularly difficult to identify precursors of instability due to the lack of physical understanding of the coherent structures. In this paper, we show that a deep CNN [7] based feature extractor can learn meaningful patterns from unstable flame images that can arguably be interpreted as coherent structures. A symbolic model can then capture the temporal dynamics of the appearance of such patterns as a flame makes the transition from a stable to an unstable state, which results in early detection of instability. Specifically, we use the recently reported symbolic time series analysis (STSA) [8], a fast probabilistic graphical modeling approach. Among many other applications such as fault detection in gas turbine engines [9], STSA has recently been applied to pressure and chemiluminescence time series for early detection of lean blowout [10] and thermo-acoustic instability [11]. Note that a fully neural temporal model (e.g., a deep RNN) would not be preferable, as it is important to understand specific transitions among various coherent structures. Major contributions of the paper are delineated below.

• A novel data-driven framework, with CNN at the lower layer and STSA at the upper layer, is proposed for early detection of thermo-acoustic instability from hi-speed videos.

• In the above framework, the CNN layers extract meaningful shape features to represent the coherent structures of varied sizes and orientations in the flame images. This enables STSA at the temporal modeling layer to capture all the fast time scale precursors before persistent instability is attained.

• The proposed theory and the associated algorithms have been experimentally validated for transition data at multiple operating conditions in a swirl-stabilized combustor by characterizing the stable and unstable states of combustion.
• Training and testing of the proposed framework have been performed on different operating conditions (e.g., air flow rate, fuel flow rate, and air-fuel premixing level) of the combustion process to test the transferability of the approach. Performance of the proposed framework (‘CNN+STSA’) has been evaluated by comparison with that of a framework in which the CNN is replaced by another extensively used dimensionality reduction tool, principal component analysis (PCA) [5].

===2 Problem Setup and Experiments===
To collect training data for learning the coherent structures, thermo-acoustic instability was induced in a laboratory-scale combustor with a 30 mm swirler (60 degree vane angles with a geometric swirl number of 1.28). Figure 1 (a) shows the setup, and a detailed description can be found in [12]. In the combustor, 4 different instability conditions are induced: 3 seconds of hi-speed video (i.e., 9000 frames) were captured at 45 lpm (liters per minute) FFR (fuel flow rate) and 900 lpm AFR (air flow rate) and at 28 lpm FFR and 600 lpm AFR, for both levels of premixing. Figure 1 (b) presents a sequence of images of dimension 100 × 237 pixels for the unstable state (AFR = 900 lpm, FFR = 45 lpm and full premixing). The flame inlet is on the right side of each image and the flame flows downstream to the left. As the combustion is unstable, Figure 1 (b) shows the formation of a mushroom-shaped vortex (coherent structure) at t = 0 and 0.001 s and its shedding downstream from t = 0.002 s to t = 0.004 s. For testing the proposed architecture, 5 transition videos of 7 seconds length were collected, in which stable combustion progressively becomes unstable via the intermittency phenomenon (fast switching between stability and instability as a precursor to persistent instability) by reducing FFR or increasing AFR.
The transition protocols are as follows (all units are lpm): (i) AFR = 500 and FFR = 40 to 28, (ii) AFR = 500 and FFR = 40 to 30, (iii) FFR = 40 and AFR = 500 to 600, (iv) AFR = 600 and FFR = 50 to 35, (v) FFR = 50 and AFR = 700 to 800. These data sets are referred to as 50040to38, 50040to30, 40500to600, 60050to35 and 50700to800, respectively, throughout the rest of this paper.

Figure 1: (a) Schematics of the experimental apparatus. 1 - settling chamber, 2 - inlet duct, 3 - inlet optical access module (IOAM), 4 - test section, 5 & 6 - big and small extension ducts, 7 - pressure transducers, Xs - swirler location, Xp - transducer port location, Xi - fuel injection location. (b) Visible coherent structure in greyscale images (frames at t = 0 s, 0.001 s, 0.002 s, 0.003 s and 0.004 s) at 900 lpm AFR and full premixing for 45 lpm FFR.

===3 Neural symbolic dynamics===
This section describes the proposed architecture for early detection of thermo-acoustic instability in a combustor by analyzing a sequence of hi-speed images. Figure 2 presents the schematics of the framework, in which a deep CNN is stacked with symbolic time series analysis (STSA). In the training phase, images (or a segment of the images) from the unstable state at various operating conditions are used as the input to the CNN.
Figure 2: Neural-Symbolic Dynamics Architecture. The deep CNN takes the input hi-speed flame video (100 × 237) through convolution C1 (20 kernels of 33 × 34, feature maps of 68 × 204), 2 × 2 max-pooling subsampling S1 (34 × 102), convolution C2 (50 kernels of 23 × 23, feature maps of 12 × 80) and 2 × 2 max-pooling subsampling S2 (6 × 40) to the fully connected and output layers. The output time series is partitioned into a symbol sequence over {α, β, γ}, from which a generalized D-Markov machine is built by state splitting and state merging; its state transition matrix yields the instability measure used for early detection of instability.

While testing, sigmoid outputs from the fully connected layer can be utilized as a symbol sequence to capture the temporal evolution of coherent structures in the flame, thereby serving as a precursor in the early detection of unstable combustion flames. In the STSA module, the time series is symbolized by partitioning the signal space, and a symbol sequence is created as shown in Figure 2. A generalized D-Markov machine is constructed from the symbol sequence via state splitting and state merging [13, 14], which models the transition from one state to another as a state transition matrix. The state transition matrix is the extracted feature that represents the sequence of images, essentially capturing the temporal evolution of coherent structures in the flame. The deep CNN and STSA structures are explained in the sequel.

====3.1 Deep Convolutional Neural Network====
The recent success of deep learning architectures can be largely attributed to the strong emphasis on modeling multiple levels of abstraction (from low-level features to higher-order representations, i.e., features of features) from data.
For example, in a typical image processing application, low-level features can be partial edges and corners, whereas high-level features may be a combination of edges and corners that forms part of an image [7]. Among various deep learning techniques, the Convolutional Neural Network (CNN) [15] is an attractive option for extracting pertinent features from images in a hierarchical manner for detection, classification, and prediction. For the purpose of this study, a CNN is a suitable choice as it preserves the local structures in an image at various scales. Hence, it is capable of extracting local coherent structures of various sizes in a flame image. CNNs are also easier to train while achieving comparable (and often better) performance, even though they have fewer parameters than fully connected networks with the same number of hidden layers. In CNNs, data is represented by multiple feature maps in each hidden layer, as shown in Figure 2. Feature maps are obtained by convolving the input image with multiple filters in the corresponding hidden layer. To further reduce the dimension of the data, these feature maps typically undergo non-linear downsampling with 2 × 2 or 3 × 3 max-pooling. Max-pooling essentially partitions the input image into sets of non-overlapping rectangles and takes the maximum value of each partition as the output. After max-pooling, multiple dimension-reduced vector representations of the input are acquired, and the process is repeated in the next layer to learn a higher-level representation of the data. At the final pooling layer, the resultant outputs are linked with the fully connected layer, where sigmoid outputs from the hidden units are post-processed by a softmax function in order to predict the class that possesses the highest joint probability given the input data. In this way, coherent structures in the unstable flame can be learned at different operating conditions.
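As a rough illustration of the convolution and max-pooling steps described above, the following NumPy sketch computes feature-map sizes and applies non-overlapping max-pooling. It assumes stride-1 convolution without padding and uses the kernel sizes reported for this network (33 × 34 and 23 × 23); the function names `valid_conv_shape` and `max_pool` are hypothetical helpers introduced here, not part of the authors' implementation.

```python
import numpy as np

def valid_conv_shape(in_shape, kernel_shape):
    """Output size of a stride-1 convolution without padding (assumed setting)."""
    return tuple(i - k + 1 for i, k in zip(in_shape, kernel_shape))

def max_pool(feature_map, pool=(2, 2)):
    """Non-overlapping max-pooling: split the map into pool-sized rectangles
    and keep the maximum of each. Rows/columns that do not fill a whole
    rectangle are dropped."""
    ph, pw = pool
    h, w = feature_map.shape
    h, w = h - h % ph, w - w % pw
    blocks = feature_map[:h, :w].reshape(h // ph, ph, w // pw, pw)
    return blocks.max(axis=(1, 3))

# Feature-map sizes for a 100 x 237 input frame.
c1 = valid_conv_shape((100, 237), (33, 34))      # first convolution
s1 = max_pool(np.zeros(c1)).shape                # first 2 x 2 max-pooling
c2 = valid_conv_shape(s1, (23, 23))              # second convolution
s2 = max_pool(np.zeros(c2)).shape                # second 2 x 2 max-pooling
print(c1, s1, c2, s2)

# Max-pooling on a small 4 x 4 example.
toy = np.arange(16.0).reshape(4, 4)
print(max_pool(toy))
```

The printed shapes can be compared with the sizes quoted for Figure 2; the sketch only checks the size arithmetic and does not learn any filters.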
====3.2 Symbolic Time Series Analysis (STSA)====
STSA [16] deals with discretization of dynamical systems in both space and time. The notion of STSA has led to the development of a (nonlinear) data-driven feature extraction tool for dynamical systems. Rao et al. [17] and Bahrampour et al. [18] have shown that the performance of this tool, which is based on probabilistic finite state automata (PFSA), as a feature extractor for statistical pattern recognition is comparable (and often superior) to that of other existing techniques (e.g., Bayesian filters, Artificial Neural Networks, and Principal Component Analysis [5]). The trajectory of the dynamical system is partitioned into finitely many mutually exclusive and exhaustive cells for symbolization, where each cell corresponds to a single symbol belonging to a (finite) alphabet Σ. There are different types of partitioning tools, such as maximum entropy partitioning (MEP), uniform partitioning (UP) [19] and maximally bijective partitioning [20]. This paper adopts MEP for symbolization of time series, which maximizes the entropy of the generated symbols by putting an (approximately) equal number of data points in each partition cell. The next step is to construct a PFSA from the symbol strings to encode the embedded statistical information. A PFSA is a 4-tuple K = (Σ, Q, δ, π) which consists of a finite set of states (Q) interconnected by transitions [21], where each transition corresponds to a symbol in the finite alphabet (Σ). At each step, the automaton moves from one state to another (including self loops) via the transition map (δ : Q × Σ → Q) according to the probability morph function (π : Q × Σ → [0, 1]), and thus generates a corresponding block of symbols, so that the probability distributions over the set of all possible strings defined over the alphabet are represented in the space of PFSA.

=====3.2.1 Generalized D-Markov Machine [10]=====
A D-Markov machine is a model of probabilistic languages based on the algebraic structure of PFSA.
In D-Markov machines, the future symbol is causally dependent on the (most recently generated) finite set of (at most) D symbols, where D is a positive integer. The finite state automaton (FSA) underlying the PFSA of a D-Markov machine is deterministic. The complexity of a D-Markov machine is reflected by its entropy rate, which also represents its overall capability of prediction. A D-Markov machine and its entropy rate are formally defined as:

Definition 3.1 (D-Markov) A D-Markov machine is a statistically stationary stochastic process <math>S = \cdots s_{-1} s_0 s_1 \cdots</math> (modeled by a PFSA in which each state is represented by a finite history of at most D symbols), where the probability of occurrence of a new symbol depends only on the last D symbols, i.e.,

<math>P[s_n \mid \cdots s_{n-D} \cdots s_{n-1}] = P[s_n \mid s_{n-D} \cdots s_{n-1}] \qquad (1)</math>

D is called the depth. Q is the finite set of states with cardinality <math>|Q| \le |\Sigma|^D</math>, i.e., the states are represented by equivalence classes of symbol strings of maximum length D, where each symbol belongs to the alphabet Σ. δ : Q × Σ → Q is the state transition function that satisfies the following condition: if <math>|Q| = |\Sigma|^D</math>, then there exist α, β ∈ Σ and <math>x \in \Sigma^\star</math> such that δ(αx, β) = xβ and αx, xβ ∈ Q.

Definition 3.2 (D-Markov Entropy Rate) The D-Markov entropy rate of a PFSA (Σ, Q, δ, π) is defined in terms of the conditional entropy as:

<math>H(\Sigma \mid Q) \triangleq \sum_{q \in Q} P(q)\, H(\Sigma \mid q) = -\sum_{q \in Q} \sum_{\sigma \in \Sigma} P(q)\, P(\sigma \mid q) \log P(\sigma \mid q)</math>

where P(q) is the probability of a PFSA state q ∈ Q and P(σ|q) is the conditional probability of a symbol σ ∈ Σ given that a PFSA state q ∈ Q is observed.

=====3.2.2 Construction of a D-Markov Machine [13]=====
The underlying procedure for construction of a D-Markov machine from a symbol sequence consists of two major steps: state splitting and state merging [13, 14]. In general, state splitting increases the number of states to achieve more precision in representing the information content of the dynamical system.
State merging reduces the number of states in the D-Markov machine by merging those states that have similar statistical behavior. Thus, a combination of state splitting and state merging leads to the final form of the generalized D-Markov machine as described below.

State Splitting: The number of states of a D-Markov machine of depth D is bounded above by <math>|\Sigma|^D</math>, where |Σ| is the cardinality of the alphabet Σ. As this relation is exponential in nature, the number of states rapidly increases as D is increased. However, from the perspective of modeling a symbol sequence, some states may be more important than others in terms of their embedded information content. Therefore, it is advantageous to have a set of states that correspond to symbol blocks of different lengths. This is accomplished by starting with the simplest set of states (i.e., Q = Σ for D = 1) and subsequently splitting the current state that results in the largest decrease of the D-Markov entropy rate. The process of splitting a state q ∈ Q is executed by replacing the symbol block q by its branches, as described by the set {σq : σ ∈ Σ} of words. Maximum reduction of the entropy rate is the governing criterion for selecting the state to split. In addition, the generated set of states must satisfy the self-consistency criterion, which only permits a unique transition to emanate from a state for a given symbol. If δ(q, σ) is not unique for each σ ∈ Σ, then the state q is split further. The process of state splitting is terminated either by the threshold parameter η_spl on the rate of decrease of the entropy rate or by a maximal number of states N_max. For construction of the PFSA, each element π(q, σ) of the morph matrix Π is estimated by frequency counting as the ratio of the number of times, N(qσ), the state q is followed (i.e., suffixed) by the symbol σ to the number of times, N(q), the state q occurs; the details are available in [13].
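The frequency-counting estimate and the entropy rate of Definition 3.2 can be sketched for the simplest case Q = Σ (depth D = 1), which is the starting point of state splitting. This is only an illustrative sketch: the function names are hypothetical, the add-one smoothing is taken from Equation (2), the natural logarithm is an assumption (the text does not fix the base), and N(q) is counted here as the number of times q is followed by some symbol so that each row of the morph matrix sums to one.

```python
import numpy as np

def estimate_d1_machine(symbols, alphabet_size):
    """Morph matrix and state probabilities for a depth-1 machine (Q = Sigma).

    `symbols` is a sequence of integers in {0, ..., alphabet_size - 1}.
    """
    s = np.asarray(symbols)
    counts = np.zeros((alphabet_size, alphabet_size))
    # counts[q, sigma] = N(q sigma): times state q is followed by symbol sigma
    np.add.at(counts, (s[:-1], s[1:]), 1)
    n_q = counts.sum(axis=1)  # N(q), counted over positions that have a successor
    morph = (1.0 + counts) / (alphabet_size + n_q)[:, None]
    # |Q| equals |Sigma| for D = 1
    state_prob = (1.0 + n_q) / (alphabet_size + n_q.sum())
    return morph, state_prob

def entropy_rate(morph, state_prob):
    """D-Markov entropy rate, using the natural logarithm (assumed base)."""
    return float(-np.sum(state_prob[:, None] * morph * np.log(morph)))

# A strictly periodic symbol string is highly predictable, so its entropy
# rate should lie well below the maximum log|Sigma|.
periodic = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
morph, state_prob = estimate_d1_machine(periodic, alphabet_size=3)
h = entropy_rate(morph, state_prob)
print(morph)
print(state_prob, h, np.log(3))
```

In the full procedure, a candidate split of state q would be scored by re-estimating these quantities for the enlarged state set and comparing entropy rates; that step is omitted here.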
The estimated morph matrix <math>\hat{\Pi}</math> and the stationary state probability vector <math>\hat{P}(q)</math> are obtained as:

<math>\hat{\pi}(q, \sigma) \triangleq \frac{1 + N(q\sigma)}{|\Sigma| + N(q)} \quad \forall \sigma \in \Sigma\ \forall q \in Q; \qquad \hat{P}(q) \triangleq \frac{1 + N(q)}{|Q| + \sum_{q' \in Q} N(q')} \quad \forall q \in Q \qquad (2)</math>

where <math>\sum_{\sigma \in \Sigma} \hat{\pi}(q, \sigma) = 1\ \forall q \in Q</math>. Then, the D-Markov entropy rate (see Definition 3.2) is computed as:

<math>H(\Sigma \mid Q) = -\sum_{q \in Q} \sum_{\sigma \in \Sigma} P(q)\, P(\sigma \mid q) \log P(\sigma \mid q) \approx -\sum_{q \in Q} \sum_{\sigma \in \Sigma} \hat{P}(q)\, \hat{\pi}(q, \sigma) \log \hat{\pi}(q, \sigma)</math>

State Merging: Merging states reduces the size of the machine but can degrade its precision. The algorithm mitigates this risk via a stopping rule that is constructed by specifying an acceptable threshold η_mrg on the distance Ψ(·, ·) between the merged PFSA and the PFSA generated from the original time series. The distance metric Ψ(·, ·) between two PFSAs K1 = (Σ, Q1, δ1, π1) and K2 = (Σ, Q2, δ2, π2) is as follows:

<math>\Psi(K_1, K_2) \triangleq \lim_{n \to \infty} \sum_{j=1}^{n} \frac{\left\| P_1(\Sigma^j) - P_2(\Sigma^j) \right\|_{\ell_1}}{2^{j+1}} \qquad (3)</math>

where <math>P_1(\Sigma^j)</math> and <math>P_2(\Sigma^j)</math> are the steady state probability vectors of generating words of length j from the PFSA K1 and K2, respectively, i.e., <math>P_1(\Sigma^j) \triangleq [P(w)]_{w \in \Sigma^j}</math> for K1 and <math>P_2(\Sigma^j) \triangleq [P(w)]_{w \in \Sigma^j}</math> for K2. States that behave similarly (i.e., have similar morph probabilities) have a higher priority for merging. The similarity of two states, q, q′ ∈ Q, is measured in terms of the respective morph functions of future symbol generation as the distance between the two rows of the estimated morph matrix <math>\hat{\Pi}</math> corresponding to the states q and q′. The ℓ1-norm has been adopted as the distance function, as seen below.

<math>\mathcal{M}(q, q') \triangleq \left\| \hat{\pi}(q, \cdot) - \hat{\pi}(q', \cdot) \right\|_{\ell_1} = \sum_{\sigma \in \Sigma} \left| \hat{\pi}(q, \sigma) - \hat{\pi}(q', \sigma) \right|</math>

Hence, the two closest states (i.e., the pair of states q, q′ ∈ Q having the smallest value of M(q, q′)) are merged using the merging algorithm explained in [13]. The merging algorithm updates the morph matrix and transition function in such a way that no ambiguity of nondeterminism is permitted [8]. Subsequently, the distance Ψ(·, ·) of the merged PFSA from the initial symbol string is evaluated.
If Ψ < η_mrg, where η_mrg is a specified merging threshold, then the machine structure is retained and the states next on the priority list are merged. On the other hand, if Ψ ≥ η_mrg, then the process of merging the given pair of states is aborted and another pair of states with the next smallest value of M(q, q′) is selected for merging. This procedure is terminated if no pair of states exists for which Ψ < η_mrg.

===4 Results and Discussions===
This section discusses the results that are obtained when the proposed framework is applied to the experimental hi-speed video data for early detection of thermo-acoustic instability.

====4.1 CNN training====
The network is trained using flame images from the 4 different unstable combustion conditions mentioned in Section 2. The data consist of 24,000 examples for training and 12,000 examples for cross-validation. In the first convolutional layer, 20 filters of size 33 × 34 pixels (px) reduce the input image of dimension 100 × 237 px to feature maps of 68 × 204 px. Next, the feature maps are downsampled with 2 × 2 max-pooling, resulting in pooled maps of 34 × 102 px. Each of these maps undergoes another convolutional layer with 50 filters of 23 × 23 px, which produces feature maps of 12 × 80 px (before 2 × 2 max-pooling) and pooled maps of 6 × 40 px after max-pooling. All generated maps are connected to the fully connected layer of 100 hidden units, followed by 10 output units from which the sigmoid activations are extracted. Training is performed with a batch size of 20 and a learning rate of 0.1. Convolution is done without any padding and with a stride of 1. Visualization of a few filters at the first and second convolutional layers is shown in Figure 3 (a), (b). The second layer visualization shows that it captures fragments of the flame coherent structures. Figure 3 (c) presents a couple of the feature maps of a stable frame (top) and an unstable frame (bottom) after convolving with a first layer filter.
The red outline at the bottom shows how the mushroom-shaped coherent structure is highlighted on the unstable frame feature map.

Figure 3: Filter visualization at convolutional layer (a) one and (b) two. (b) shows fragmented representations of coherent structures that are visible in unstable flame. (c) Feature maps of a stable frame (top) and an unstable frame (bottom) after applying a first convolutional layer filter. The red outline on the unstable flame visualization shows how the mushroom-shaped coherent structure is highlighted.

====4.2 STSA-based Instability measure====
Once the CNN is trained on the sets of unstable data, every frame of the transition data sets (i.e., 50040to38, 50040to30, 40500to600, 60050to35 and 50700to800, as mentioned in Section 2) is fed to the CNN. Each of the ten sigmoid activation units at the last fully connected layer generates a time series for one transition data set. To capture fast changes in the transition data, a window of 0.5 seconds (1500 frames) is traversed over the hi-speed video with an overlap of 80% to keep the response speed at 10 Hz, which is necessary for real-time combustion instability control. The time window output of a sigmoid activation unit is symbolized by maximum entropy partitioning (MEP) with an alphabet size of |Σ| = 3. Considering the first window to be the reference stable state, a generalized D-Markov machine is constructed by state splitting with N_max = 10 and state merging with η_mrg = 0.05. N_max is chosen as 10 because the window is not long enough to learn a large state machine. For the alphabet {1, 2, 3}, the set of states after state splitting is {11, 21, 31, 2, 113, 213, 313, 23, 133, 233, 333} and state merging leads to {11, 21, 31, 2, {113, 313}, 213, {23, 133}, {233, 333}} for one of the sigmoid activation outputs in the transition video 60050to35.
The state probability vector arising from the D-Markov machine at each time window is the feature that captures the extent of instability transmitted through the corresponding sigmoid hidden unit. The instability measure of a time window is defined as the ℓ2-norm distance of its state probability vector from that of the reference stable time window.

Figure 4: Variation of the proposed instability measure with time for the transition video named 60050to35. Multiple regions on the measure curve denote different combustion states such as stable, temporary intermittency (a significant precursor to persistent instability) and unstable. They correspond to varied coherent structures (bounded by red boxes) that are detected by the ‘CNN+STSA’ framework. On the right, the rms variation of the pressure is shown, as it is one of the most commonly used instability measures. The progression of P_rms cannot detect the aforementioned precursors.

Figure 4 shows an aggregated progression (summation of the individual instability measures obtained from each sigmoid activation unit) of the instability measure for 60050to35. The rms curve of the pressure on the right of Figure 4 gives a rough idea of the ground truth regarding stability. The proposed instability measure has two advantages over P_rms: (i) the intermittency phenomenon (regions 2 and 3 in Figure 4) is captured by this measure, because the ‘CNN+STSA’ framework can detect variable-size, mildly illuminated mushroom-shaped coherent structures (bounded by red boxes in Figure 4), whereas P_rms ignores these important precursors to instability, and (ii) region 4 of Figure 4 shows that the proposed measure rises faster towards instability. The other transition data sets exhibit similar behavior with respect to this measure. Hence, the proposed measure performs better in early detection of instability than other commonly used measures such as P_rms.
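The windowing, symbolization and distance steps of this section can be sketched end to end under simplifying assumptions. The sketch below uses depth-1 states (Q = Σ) instead of the generalized D-Markov machine with state splitting and merging, fits the MEP boundaries on the reference (first) window and reuses them for all later windows, and runs on synthetic data standing in for one sigmoid activation time series; all function names and the synthetic signal are hypothetical.

```python
import numpy as np

def window_starts(n_samples, window, overlap):
    """Start indices of sliding windows and the hop between them."""
    hop = int(round(window * (1.0 - overlap)))
    return list(range(0, n_samples - window + 1, hop)), hop

def mep_boundaries(reference, alphabet_size):
    """Maximum entropy partitioning: boundaries at equally spaced quantiles,
    so each cell holds roughly the same number of reference points."""
    return np.quantile(reference, np.arange(1, alphabet_size) / alphabet_size)

def symbolize(x, boundaries):
    """Map each sample to the index of its partition cell."""
    return np.searchsorted(boundaries, x, side="right")

def state_probabilities(symbols, alphabet_size):
    """Smoothed state probability vector for depth-1 states (assumption)."""
    counts = np.bincount(symbols, minlength=alphabet_size)
    return (1.0 + counts) / (alphabet_size + counts.sum())

def instability_measure(series, window=1500, overlap=0.8, alphabet_size=3):
    """l2 distance of each window's state probabilities from the first window."""
    starts, _ = window_starts(len(series), window, overlap)
    bounds = mep_boundaries(series[:window], alphabet_size)
    ref = state_probabilities(symbolize(series[:window], bounds), alphabet_size)
    return np.array([
        np.linalg.norm(
            state_probabilities(symbolize(series[s:s + window], bounds), alphabet_size) - ref
        )
        for s in starts
    ])

# Synthetic stand-in for one sigmoid activation: low and steady, then high and noisy.
rng = np.random.default_rng(0)
series = np.clip(np.concatenate([rng.normal(0.2, 0.02, 6000),
                                 rng.normal(0.6, 0.10, 6000)]), 0.0, 1.0)
starts, hop = window_starts(len(series), 1500, 0.8)
measure = instability_measure(series)
print(hop, len(starts), measure[0], measure[-1])
```

With 9000 frames per 3 seconds (3000 frames per second), the hop of 300 frames gives one update every 0.1 s, i.e., the 10 Hz response speed mentioned above. The aggregated measure of Figure 4 would be the sum of such curves over the ten sigmoid units.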
====4.3 Comparison with ‘PCA+STSA’====
To compare with the proposed approach, Principal Component Analysis (PCA) [5], a well-known dimensionality reduction technique, is used as a replacement for the CNN module. Figure 5 (a) shows that the transition (stable to unstable) increment of the aggregated instability measure for ‘CNN+STSA’ is larger than that for ‘PCA+STSA’ in all transition data. This will result in more precise instability control in real time. The condition 50700to800 is examined in Figure 5 (b) because the transition jumps for the two frameworks are very close. A close observation of the instability measure variation reveals that ‘CNN+STSA’ can detect an intermittency precursor (region 1 in Figure 5 (b)) although the coherent structure formation is not very prominent. However, ‘PCA+STSA’ misses this precursor before arriving at the inception of persistent instability. A probable rationale behind this observation is that, while PCA is averaging the image vector based on just maximum spatial variance, CNN is learning semantic features based on the coherent structures of varied illumination, size and orientation seen during unstable combustion.

Figure 5: (a) Comparison of the sudden change in instability measure (transition jump) when instability sets in for different transition conditions, which are 1. 50040to38, 2. 50040to30, 3. 40500to600, 4. 60050to35 and 5. 50700to800. The jump is larger for ‘CNN+STSA’ than for ‘PCA+STSA’. (b) Variation of the instability measure with time (sec) for both ‘CNN+STSA’ and ‘PCA+STSA’ at transition condition 50700to800. The measure arising from ‘CNN+STSA’ can detect the intermittency precursor (region 1), whereas it is mostly ignored by ‘PCA+STSA’.
A frame with an intermittency coherent structure in a red box is shown on the top.

===5 Conclusions and future work===
The paper proposes a framework that synergistically combines the recently introduced concepts of CNN and STSA for early detection of thermo-acoustic instability in gas turbine engines. An extensive set of experiments has been conducted on a swirl-stabilized combustor for validation of the proposed method. Sequences of hi-speed greyscale images are fed into a multi-layered CNN to model the fluctuating coherent structures in the flame, which are dominant during unstable combustion. Fragments of coherent structures are observed in the CNN filter visualization. Therefore, an ensemble of time series data is constructed from the sequence of images based on the sigmoid activation probability vectors of the last hidden layer of the CNN. Then, STSA is applied to the time series generated from an image sequence, and ‘CNN+STSA’ is found to exhibit a larger change in the instability measure during the transition to instability than ‘PCA+STSA’. The proposed framework detects all the intermittent precursors for the different transition protocols, which is the most significant step towards detecting the onset of instability early enough for mitigation. In summary, while the CNN captures the semantic features (i.e., coherent structures) of the combustion flames at varied illuminations, sizes and orientations, STSA models the temporal fluctuation of those features at a reduced dimension. One of the primary advantages of the proposed semantic dimensionality reduction (as opposed to abstract dimensionality reduction, e.g., using PCA) would be the seamless involvement of domain experts in the data analytics framework for expert-guided data exploration activities. Developing novel use-cases in this neural-symbolic context will be a key item of future work.
Some other near-term research tasks are: (i) dynamically tracking multiple coherent structures in the flame to characterize the extent of instability, (ii) multi-dimensional partitioning for direct usage of the last sigmoid layer and (iii) learning CNN and STSA together.

===Acknowledgment===
Authors sincerely acknowledge the extensive data collection performed by Vikram Ramanan and Dr. Satyanarayanan Chakravarthy at Indian Institute of Technology Madras (IITM), Chennai. Authors also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX TITAN Black GPU used for this research.

===References===
[1] A. Garcez, T. R. Besold, L. de Raedt, P. Foeldiak, P. Hitzler, T. Icard, K. Kuehnberger, L. C. Lamb, R. Miikkulainen, and D. L. Silver. Neural-symbolic learning and reasoning: Contributions and challenges. Proceedings of the AAAI Spring Symposium on Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches, Stanford, March 2015.
[2] A. K. M. F. Hussain. Coherent structures - reality and myth. Physics of Fluids, 26(10):2816–2850, 1983.
[3] S. R. Chakravarthy, O. J. Shreenivasan, B. Böhm, A. Dreizler, and J. Janicka. Experimental characterization of onset of acoustic instability in a nonpremixed half-dump combustor. Journal of the Acoustical Society of America, 122:120–127, 2007.
[4] G. Berkooz, P. Holmes, and J. L. Lumley. The proper orthogonal decomposition in the analysis of turbulent flows. Annual Review of Fluid Mechanics, 25(1):539–575, 1993.
[5] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, New York, NY, USA, 2006.
[6] P. J. Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of Fluid Mechanics, 656:5–28, 2010.
[7] K. Kavukcuoglu, P. Sermanet, Y.-L. Boureau, K. Gregor, M. Mathieu, and Y. LeCun. Learning convolutional feature hierarchies for visual recognition. In NIPS, 2010.
[8] A. Ray. Symbolic dynamic analysis of complex systems for anomaly detection. Signal Processing, 84(7):1115–1130, July 2004.
[9] S. Sarkar, K. Mukherjee, S. Sarkar, and A. Ray. Symbolic dynamic analysis of transient time series for fault detection in gas turbine engines. Journal of Dynamic Systems, Measurement, and Control, 135(1):014506, 2013.
[10] S. Sarkar, A. Ray, A. Mukhopadhyay, R. R. Chaudhari, and S. Sen. Early detection of lean blow out (LBO) via generalized D-Markov machine construction. In American Control Conference (ACC), 2014, pages 3041–3046. IEEE, 2014.
[11] V. Ramanan, S. R. Chakravarthy, S. Sarkar, and A. Ray. Investigation of combustion instability in a swirl-stabilized combustor using symbolic time series analysis. In Proc. ASME Gas Turbine India Conference, GTIndia 2014, New Delhi, India, pages 1–6, December 2014.
[12] S. Sarkar, K. G. Lore, S. Sarkar, V. Ramanan, S. R. Chakravarthy, S. Phoha, and A. Ray. Early detection of combustion instability from hi-speed flame images via deep learning and symbolic time series analysis. In Annual Conference of The Prognostics and Health Management, pages pre–prints. PHM, 2015.
[13] K. Mukherjee and A. Ray. State splitting and state merging in probabilistic finite state automata for signal representation and analysis. Signal Processing, 104:105–119, November 2014.
[14] S. Sarkar, A. Ray, A. Mukhopadhyay, and S. Sen. Dynamic data-driven prediction of lean blowout in a swirl-stabilized combustor. International Journal of Spray and Combustion Dynamics, 7(3):in–press, 2015.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
[16] C. Daw, C. Finney, and E. Tracy. A review of symbolic analysis of experimental data. Review of Scientific Instruments, 74(2):915–930, 2003.
[17] C. Rao, A. Ray, S. Sarkar, and M. Yasar. Review and comparative evaluation of symbolic dynamic filtering for detection of anomaly patterns. Signal, Image and Video Processing, 3(2):101–114, 2009.
[18] S. Bahrampour, A. Ray, S. Sarkar, T. Damarla, and N. M. Nasrabadi. Performance comparison of feature extraction algorithms for target detection and classification. Pattern Recognition Letters, 34(16):2126–2134, December 2013.
[19] V. Rajagopalan and A. Ray. Symbolic time series analysis via wavelet-based partitioning. Signal Processing, 86(11):3309–3320, November 2006.
[20] S. Sarkar, A. Srivastav, and M. Shashanka. Maximally bijective discretization for data-driven modeling of complex systems. In Proceedings of American Control Conference, Washington, D.C., 2013.
[21] M. Sipser. Introduction to the Theory of Computation, 3rd ed. Cengage Publishing, Boston, MA, USA, 2013.