<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Image Recognition by a Second-Order Convolutional Neural Network with Dynamic Receptive Fields</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oksana Mezentseva</string-name>
          <email>omezentceva@ncfu.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maksim Brodnikov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information Systems &amp; Technologies, North-Caucasus Federal University</institution>
          ,
          <addr-line>2, Kulakov Prospect, Stavropol, Russian Federation</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <abstract>
        <p>This paper describes a method for synthesizing the parameters of convolutional neural networks with dynamic receptive fields and second-order neurons in the convolutional layers. Pattern recognition experiments showing that the combination of second-order neurons and dynamic receptive fields reduces the generalization error are described in the article.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Adaptation of a Convolutional Neural Network towards Dynamic Receptive Fields and Second-Order Neurons</title>
      <p>The idea of a CNN with dynamic RFs is that if one changes the set of RFs for some layers, the same pattern can be
perceived in different ways by the network; hence, one can extend the training set. The classical
form of an RF for a CNN is a square. We propose to use a template for generating non-standard RFs. The elements
of a template are indices indicating neighbors within two discrete steps of the element on a pixel array. If all
RFs lying on a feature map are changed, additional information will influence the configurable parameters, which
leads to the extraction of a better invariant (Fig. 1).</p>
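      <p>The template idea above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name random_rf_template, the change probability, and the seeding are assumptions, since the text only states that each template element is an index (0..24) pointing at a neighbor within two discrete steps.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rf_template(kc=5, p_change=0.3):
    """Sketch: start from the classical square RF (index 0 = 'no offset'
    for every tap) and randomly re-point some taps to one of the 25
    positions within two discrete steps (indices 0..24)."""
    template = np.zeros((kc, kc), dtype=int)
    mask = rng.random((kc, kc)) < p_change        # which taps to displace
    template[mask] = rng.integers(0, 25, size=mask.sum())
    return template
```

      <p>Regenerating such a template before each pattern makes the same input look different to the network, which is the training-set extension effect described above.</p>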
      <p>High-order CNNs are an extension of feed-forward networks in which polynomial summators are used as neurons. This
type of neuron enables a network to extract information from the input signal more effectively
[6]. Higher-order functions (square, cube, trigonometric, etc.) are used instead of the usual
weighted summation. The main drawback of this approach is the need to introduce additional coefficient arrays,
but this is offset by the higher quality of the extracted features (hence the generalization error is reduced).</p>
      <p>The activation function of a standard neuron with two inputs is calculated according to formula (1); the
activation function of a second-order neuron is calculated according to formula (2).</p>
      <p>f(w_1 x_1 + w_2 x_2) = f\left(\sum_{i=1}^{2} w_i x_i\right) \quad (1)</p>
      <p>f(w_1 x_1 + w_2 x_2 + u_1 x_1^2 + u_2 x_2^2) = f\left(\sum_{i=1}^{2} (w_i x_i + u_i x_i^2)\right) \quad (2)</p>
      <p>where w and u are vectors of adjustable coefficients, x is the input vector, and f(\cdot) is the activation function of the
neuron.</p>
      <p>This approach does not touch the activation functions of neurons or the principles of error calculation, so the
only difference from the standard back-propagation algorithm is the calculation of the weighted sum.</p>
      <p>To know how much the weights w and u should be changed, we need to calculate two partial derivatives
according to formula (3).</p>
      <p>\frac{\partial v_j(n)}{\partial w_{ji}(n)} = \left(\sum_{i=0}^{m} w_{ji}(n)\, y_i(n) + u_{ji}(n)\, y_i^2(n)\right)'_{w_{ji}(n)} = y_i(n), \qquad
\frac{\partial v_j(n)}{\partial u_{ji}(n)} = \left(\sum_{i=0}^{m} w_{ji}(n)\, y_i(n) + u_{ji}(n)\, y_i^2(n)\right)'_{u_{ji}(n)} = y_i^2(n) \quad (3)</p>
      <p>where v_j(n) is the weighted sum of neuron j at iteration n, and y_i(n) is the i-th input value of neuron j.</p>
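      <p>The second-order weighted sum and its gradients can be sketched in a few lines. The function names are illustrative, not from the original implementation; the logic follows the diagonal second-order form of formulas (2) and (3).</p>

```python
import numpy as np

def second_order_sum(w, u, x, b=0.0):
    """Weighted sum of a second-order neuron: v = b + sum_i (w_i*x_i + u_i*x_i^2)."""
    return b + np.dot(w, x) + np.dot(u, x ** 2)

def second_order_grads(x):
    """Partial derivatives of v w.r.t. each w_i and u_i (formula 3):
    dv/dw_i = x_i, dv/du_i = x_i^2."""
    return x, x ** 2

# Tiny usage example with two inputs
x = np.array([2.0, 3.0])
w = np.array([0.5, -1.0])
u = np.array([0.1, 0.2])
v = second_order_sum(w, u, x)    # 0.5*2 - 1*3 + 0.1*4 + 0.2*9 = 0.2
dw, du = second_order_grads(x)   # dw = x, du = x**2
```

      <p>As the sketch shows, back propagation is unchanged except for these weighted-sum derivatives: the error term is simply multiplied by x_i for w and by x_i^2 for u.</p>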
      <p>The algorithm [7], which changes the RFs of neurons belonging to any combination of convolutional layers
(C-layers) before a pattern is fed, was developed. A synthesis method for the parameters of the mathematical model of a CNN
with dynamic RFs and second-order neurons in the C-layers is proposed. This method was developed on the basis
of the RF-change algorithm and the adaptation of forward and back propagation to second-order neurons
(2, 3). The method adapts the back-propagation algorithm and consists of the following steps:
1. During training, before the next pattern is fed, change the RFs of neurons in the desired
combination of C-layers using the RF-shape-changing algorithm.</p>
      <p>2. During forward propagation, calculate the output values of the C-layer neurons according to
formula (4).</p>
      <p>C^i_{m,n} = \varphi\left(b + \sum_{q \in Q_i} \sum_{k=0}^{K_C-1} \sum_{l=0}^{K_C-1} \left( X^q_{m+k+F_i(\cdot),\, n+l+F_j(\cdot)}\, W^q_{k,l} + \left(X^q_{m+k+F_i(\cdot),\, n+l+F_j(\cdot)}\right)^2 A^q_{k,l} \right)\right), \qquad
S^i_{m,n} = \varphi\left(b + u \sum_{k=0}^{K_S-1} \sum_{l=0}^{K_S-1} C^i_{m K_S+k,\, n K_S+l}\right) \quad (4)</p>
      <p>where C^i_{m,n} is the output of the neuron at position m, n on the i-th map of the C-layer; \varphi(p) = A \tanh(B p)
with A = 1.7159, B = 2/3; b is a bias; Q_i is the set of indices of the previous-layer maps associated with the map C_i;
K_C is the size of the square RF of the neuron C^i_{m,n}; X^q_{m+k,n+l} is an input value of the neuron C^i_{m,n}; the
vectors W and A are the adjustable weights of the second-order neurons of the C-layer; S^i_{m,n} is the output of a neuron of the
averaging layer; F_i(\cdot) and F_j(\cdot) stand for F_i(RF_{m,n}, k, l) and F_j(RF_{m,n}, k, l), the functions which return the offsets
for the row and the column for the RF template belonging to neuron m, n at position k, l within the template;
index_{k,l} is the template element at position k, l, index_{k,l} = 0..24.</p>
      <p>These functions are determined by the following formulas:</p>
      <p>F_i(\cdot) = \begin{cases} 0, &amp; index_{k,l} \in \{0, 4, 5, 16, 17\} \\ 1, &amp; index_{k,l} \in \{6, 7, 8, 18, 19\} \\ 2, &amp; index_{k,l} \in \{20, 21, 22, 23, 24\} \\ -1, &amp; index_{k,l} \in \{1, 2, 3, 14, 15\} \\ -2, &amp; index_{k,l} \in \{9, 10, 11, 12, 13\} \end{cases} \qquad
F_j(\cdot) = \begin{cases} 0, &amp; index_{k,l} \in \{0, 2, 7, 11, 22\} \\ 1, &amp; index_{k,l} \in \{3, 5, 8, 12, 23\} \\ 2, &amp; index_{k,l} \in \{13, 15, 17, 19, 24\} \\ -1, &amp; index_{k,l} \in \{1, 4, 6, 10, 21\} \\ -2, &amp; index_{k,l} \in \{9, 14, 16, 18, 20\} \end{cases} \quad (5)</p>
      <p>3. During back propagation, obtain the local gradient for the C-layer according to formula (6).</p>
      <p>\frac{\partial E}{\partial (W_{k,l})^q} = \sum_{m=0}^{Size_C} \sum_{n=0}^{Size_C} \delta^{\lambda}_{m,n}\, y^{\lambda-1}_{m+k+F_i(RF^{\lambda}_{m,n},k,l),\, n+l+F_j(RF^{\lambda}_{m,n},k,l)}, \qquad
\frac{\partial E}{\partial (A_{k,l})^q} = \sum_{m=0}^{Size_C} \sum_{n=0}^{Size_C} \delta^{\lambda}_{m,n} \left( y^{\lambda-1}_{m+k+F_i(RF^{\lambda}_{m,n},k,l),\, n+l+F_j(RF^{\lambda}_{m,n},k,l)} \right)^2 \quad (6)</p>
      <p>where \delta^{\lambda}_{m,n} is the residual gathered for the neuron with coordinates m, n within the map of layer \lambda; q is
the part of the kernel of configurable features for which the components of the gradient are obtained; Size_C
is the size of the C-layer map; A^q_{k,l} and W^q_{k,l} are the q-th parts of the adjustable parameters, responsible for
interaction with the q-th map of the previous layer.</p>
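      <p>Steps 2 and 3 above can be sketched for a single map and a single output position. This is an illustrative sketch, not the original implementation: the function name conv_output and the padding-free interior indexing are assumptions; the offset tables implement formula (5), and the second-order term follows formula (4).</p>

```python
import numpy as np

# Row (F_i) and column (F_j) offsets for each of the 25 template indices (formula 5).
F_I = {**{i: 0 for i in (0, 4, 5, 16, 17)},
       **{i: 1 for i in (6, 7, 8, 18, 19)},
       **{i: 2 for i in (20, 21, 22, 23, 24)},
       **{i: -1 for i in (1, 2, 3, 14, 15)},
       **{i: -2 for i in (9, 10, 11, 12, 13)}}
F_J = {**{i: 0 for i in (0, 2, 7, 11, 22)},
       **{i: 1 for i in (3, 5, 8, 12, 23)},
       **{i: 2 for i in (13, 15, 17, 19, 24)},
       **{i: -1 for i in (1, 4, 6, 10, 21)},
       **{i: -2 for i in (9, 14, 16, 18, 20)}}

def conv_output(x, w, a, template, m, n, b=0.0):
    """Second-order convolution response at position (m, n), single input map
    (formula 4): each kernel tap reads the pixel displaced by the template
    offsets, then first- and second-order terms are summed and squashed."""
    kc = w.shape[0]
    v = b
    for k in range(kc):
        for l in range(kc):
            idx = template[k, l]
            p = x[m + k + F_I[idx], n + l + F_J[idx]]  # dynamically displaced input
            v += p * w[k, l] + p * p * a[k, l]          # W-term and A-term
    return 1.7159 * np.tanh(2.0 / 3.0 * v)              # phi(p) = A*tanh(B*p)
```

      <p>With an all-zero template (index 0 everywhere) the offsets vanish and the expression reduces to an ordinary square-RF convolution, which makes the dynamic-RF mechanism easy to verify in isolation.</p>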
      <p>The CNN learning algorithm was developed on the basis of the proposed method of adapting the back-propagation
algorithm to second-order neurons and dynamic RFs (Fig. 2).</p>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>Experiments were carried out to evaluate the generalization capability of a CNN with second-order neurons and
dynamic RFs.</p>
      <p>The Small NORB stereo dataset [8] was used as the sample database for invariant pattern recognition. The use
of a classical CNN of the LeNet-5 type gives a generalization error of 8.4% [9].</p>
      <p>
        The use of a CNN with dynamic RFs and the general structure of LeNet-5 gives a generalization error of 4.3%
[
        <xref ref-type="bibr" rid="ref5">10</xref>
        ]. Table 1 shows the results of LeNet-5 with dynamic RFs and second-order neurons in various combinations
of the C-layers.
      </p>
      <p>In Table 1, Ci and Si are the numbers of the corresponding convolution and averaging layers, I is the input
layer, O is the output layer, and the number 2 denotes second-order neurons in the corresponding layer. The value of
the generalization error was calculated as the average over 15 experiments.</p>
      <p>The table shows that not all combinations of second-order neurons in the C-layers provide a significant
reduction of the generalization error. Variants number 6 and number 9 show the best results. The best
combination can only be found through an exhaustive search of the possible values, which limits the use of higher-order
CNNs.</p>
      <p>As a result, the proposed CNN with dynamic RFs and second-order neurons allowed us to reduce the
generalization error by an average of 2% in comparison with the CNN with dynamic RFs alone, and
by 6.1% in comparison with the CNN with simply alternating C- and S-layers. It is shown that the use of an artificial
neural network with higher-order neurons leads to a decrease in the generalization error. However,
additional experiments have to be carried out in order to find the necessary combination of C-layers with the
modified neuron activation functions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Deep Learning (Adaptive Computation and Machine Learning series)</article-title>
          .
          <source>Adaptive Computation and Machine Learning series</source>
          .
          <year>2016</year>
          . P.
          <volume>800</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Mirza</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courville</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <source>Generalizable Features From Unsupervised Learning</source>
          ,
          <source>2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , pages
          <fpage>4945</fpage>
          -
          <lpage>4949</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2a">
        <mixed-citation>
          <string-name>
            <surname>Nemkov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mezentseva</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mezintsev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Using of a Convolutional Neural Network with Changing Receptive Fields in the Tasks of Image Recognition /</article-title>
          <source>Proceedings of the First International Scientific Conference Intelligent Information Technologies for Industry (IITI16)</source>
          ,
          <source>Volume 451 of the series Advances in Intelligent Systems and Computing</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>The 4th International Scientific Conference: Applied Natural Sciences. - Novy Smokovec</source>
          ,
          <year>2013</year>
          . - pp.
          <fpage>284</fpage>
          -
          <lpage>289</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Nemkov</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mezentseva</surname>
            ,
            <given-names>O. S.</given-names>
          </string-name>
          :
          <article-title>Dynamical change of the perceiving properties of convolutional neural networks and its impact on generalization</article-title>
          .
          <source>Neurocomputers: development and application</source>
          ,
          <year>2015</year>
          , no.
          <issue>2</issue>
          , pp.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] Lagunov, N.: Allocation and recognition of objects with the help of an optimized algorithm of selective search and a high-order convolutional neural network. Fundamental Research, 2015, no. 5, pp. 511-516.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>Nemkov, R.: The method of a mathematical model parameters synthesis for a convolution neural network with an expanded training set. Modern Problems of Science and Education, 2015, no. 1. URL: http://www.science-education.ru/125-19867/</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>Nemkov, R.: Dynamical Change of the Perceiving Properties of Neural Networks as Training with Noise and Its Impact on Pattern Recognition. Young Scientists International Workshop on Trends in Information Processing (YSIP), 2014. URL: http://ceur-ws.org/Vol-1145/paper4.pdf</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Nemkov</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          :
          <article-title>Synthesis method of mathematical model parameters of the convolutional neural network with extended training set</article-title>
          . URL: http://www.science-education.ru/125-19867/ (accessed 30.01.<year>2016</year>).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>