                                On Lindenmayer Systems and Autoencoders

                                                                 Andrej Lucny

                                        Comenius University, Bratislava 84248, Slovakia,
                                                   lucny@fmph.uniba.sk,
                                WWW home page: http://dai.fmph.uniba.sk/w/Andrej_Lucny/en

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract: Lindenmayer systems can serve deep learning as more than a generator of simulated datasets. They can provide datasets of images generated from very few parameters, which enables a better study of the latent space, crucial for the majority of deep neural networks. We process a dataset generated by a parametric Lindenmayer system with a convolutional autoencoder. We aim to recognize the values of the Lindenmayer system parameters with its encoder part. Finally, we partially turn a generator based on its decoder part into a neural network that generates images from the dataset upon the Lindenmayer system parameters.

1   Introduction

Deep neural networks for vision are typically trained on datasets of annotated images. Preparing them is a manual job that we can sometimes avoid by using a so-called simulated dataset. There are several grammar-based systems that we can use for the simulation [5]. One of them is Lindenmayer systems [8], which have already been used for this purpose [14]. However, this approach raises several further questions that we would like to deal with in this paper. Can we find the parameters of the Lindenmayer system somewhere inside a neural network that processes a dataset produced by the Lindenmayer system? Can we create a neural network that generates the same images as the Lindenmayer system? And could it make them from the parameters of the Lindenmayer system?

1.1   Parametric Lindenmayer Systems

We employ the parametric Lindenmayer system proposed in [10]. It generates rose leaves upon eight parameters, of which just two can vary significantly: the angle of the stem of the rose leaf, and the angle between the left and right venations and the stem. It has a set of production rules which are applied to an initial axiom iteratively and, within each iteration, simultaneously (Table 1). As a result, the system generates the strings shown in Table 2.

Table 1: Production rules for rose-leaf images, borrowed from [10], page 126.

  ω0 : [{A(0, 0).}][{A(0, 1).}]
  p1 : A(t, d) : d = 0 → .G(LA, RA).[+B(t)G(LC, RC, t).}][+B(t){.]A(t + 1, d)
  p2 : A(t, d) : d = 1 → .G(LA, RA).[−B(t)G(LC, RC, t).}][−B(t){.]A(t + 1, d)
  p3 : B(t) : t > 0 → G(LB, RB)B(t − 1)
  p4 : G(s, r) → G(s ∗ r, r)
  p5 : G(s, r, t) : t > 1 → G(s ∗ r, r, t − 1)

Here d = 0 means the left side and d = 1 the right side of the leaf, and t is timing. G(length, growth_rate) corresponds to venations; + and − represent rotation; [ and ] define the tree structure of the generated string. Dots represent points on the leaf that are structured into polygons by { and }. LA, LB, LC are parameters for the initial lengths of the main segment, the lateral segment, and the marginal notch; RA, RB, RC represent their growth rates. + and − also have a parameter: the angle between stem and venations. The last parameter is the direction of the stem, which we select when we interpret the string and turn it into an image.

Table 2: Strings that represent rose leaves, generated by the Lindenmayer system. From top to bottom: the axiom and the first, second, and third iterations.

  [{A(0, 0).}][{A(0, 1).}]

  [{.G(5, 1.15).[+B(0)G(3, 1.19, 0).}][+B(0){.]A(1, 0).}]
  [{.G(5, 1.15).[−B(0)G(3, 1.19, 0).}][−B(0){.]A(1, 1).}]

  [{.G(5.75, 1.15).[+B(0)G(3, 1.19, 0).}][+B(0){.].G(5, 1.15).[+B(1)G(3, 1.19, 1).}][+B(1){.]A(2, 0).}]
  [{.G(5.75, 1.15).[−B(0)G(3, 1.19, 0).}][−B(0){.].G(5, 1.15).[−B(1)G(3, 1.19, 1).}][−B(1){.]A(2, 1).}]

  [{.G(6.6125, 1.15).[+B(0)G(3, 1.19, 0).}][+B(0){.].G(5.75, 1.15).[+G(1.3, 1.25)B(0)G(3, 1.19, 1).}][+G(1.3, 1.25)B(0){.].G(5, 1.15).[+B(2)G(3, 1.19, 2).}][+B(2){.]A(3, 0).}]
  [{.G(6.6125, 1.15).[−B(0)G(3, 1.19, 0).}][−B(0){.].G(5.75, 1.15).[−G(1.3, 1.25)B(0)G(3, 1.19, 1).}][−G(1.3, 1.25)B(0){.].G(5, 1.15).[−B(2)G(3, 1.19, 2).}][−B(2){.]A(3, 1).}]

We turn the generated strings into images in two steps. First, we use turtle graphics, following the symbols G, + and − (go, rotate left, rotate right) structured by [ and ]. In this way, we calculate the exact positions of all dot symbols that represent points. In the second step, we structure these points into polygons following the symbols { and }, and draw the polygons. As a result, we get an image containing a rose leaf (Figure 1). By varying the parameters of the Lindenmayer system, we can generate a whole dataset of such images.
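To make the rewriting mechanism concrete, the following Python sketch implements the derivation step of Table 1. It is illustrative rather than the author's actual code (which is available in the repository linked at the end of the paper): modules are represented as (symbol, parameters) tuples, and the growth constants are those that appear in Table 2.

  LA, RA, LB, RB, LC, RC = 5.0, 1.15, 1.3, 1.25, 3.0, 1.19  # constants from Table 2

  def rewrite(module):
      # Return the successor modules of one module, or the module itself
      # when no production rule applies.
      sym, p = module
      if sym == 'A':                                    # rules p1 (d=0) and p2 (d=1)
          t, d = p
          s = '+' if d == 0 else '-'
          return [('.', ()), ('G', (LA, RA)), ('.', ()),
                  ('[', ()), (s, ()), ('B', (t,)), ('G', (LC, RC, t)),
                  ('.', ()), ('}', ()), (']', ()),
                  ('[', ()), (s, ()), ('B', (t,)), ('{', ()), ('.', ()), (']', ()),
                  ('A', (t + 1, d))]
      if sym == 'B' and p[0] > 0:                       # rule p3
          return [('G', (LB, RB)), ('B', (p[0] - 1,))]
      if sym == 'G' and len(p) == 2:                    # rule p4
          return [('G', (p[0] * p[1], p[1]))]
      if sym == 'G' and len(p) == 3 and p[2] > 1:       # rule p5
          return [('G', (p[0] * p[1], p[1], p[2] - 1))]
      return [module]

  def iterate(word):
      # One derivation step: all rules are applied simultaneously.
      return [m for module in word for m in rewrite(module)]

  # The axiom of Table 1; three steps reproduce the strings of Table 2.
  word = [('[', ()), ('{', ()), ('A', (0, 0)), ('.', ()), ('}', ()), (']', ()),
          ('[', ()), ('{', ()), ('A', (0, 1)), ('.', ()), ('}', ()), (']', ())]
  for _ in range(3):
      word = iterate(word)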
Figure 1: Rose-leaf images generated by the Lindenmayer system.

   Given any dataset, we can get an idea of the number of parameters that generated it via Principal Component Analysis (PCA) [9]. We treat the two-dimensional images as one-dimensional vectors, i.e. we put their pixels row by row into one line. Thus we turn the dataset of 28x28 images into a set of 784-dimensional vectors. Then we can calculate their covariance matrix and find its eigenvectors. Following the corresponding eigenvalues, we find that far fewer than 784 eigenvectors are significant. In our case, it is enough to consider from 8 to 16 eigenvectors (Figure 2). Now we can express each image from the dataset as the sum of the mean and multiples of the eigenvectors (Figure 3). We can also build a generator that turns manually selected values of the eigenvector multipliers into images, but its quality regarding the generation of rose leaves is low.

Figure 2: The eigenvalues confirm that the generated dataset is driven by only a few parameters.

Figure 3: Any image in the dataset can be expressed as a sum of the mean and multiples of the eigenvectors.
1.2   Deep learning and Autoencoders

Deep learning [4] is a young but well-known and very successful part of machine learning, based on artificial neural networks with a specific architectural design. These networks enhance classic neural networks like the perceptron [12], which are theoretically strong but for which processing larger inputs such as images is not a tractable task in practice. Deep neural networks typically employ a gradual decrease of the data dimension and turn the input image into a feature vector whose dimension is small enough to be processed further in the classic way. This approach is reflected in their architecture by a deep sequence of convolutional layers (which usually implement 3x3 or 5x5 kernel-based operators), interlaced with MaxPooling layers (which are responsible for the dimension reduction, since they replace e.g. 2x2 values by their maximum), followed by a few fully connected layers corresponding to the classic perceptron. The features do not need to be designed manually; they are found automatically in the process of end-to-end training [13], which corresponds to the minimization of a suitable loss function. The feature vector can be regarded as a point in the so-called latent space. We wish that similar images are mapped to close points and different images to distant points in that space. We also want the feature vector to contain as much information about the corresponding image as possible. The trick of how to push the neural network to learn such feature extraction is the core of the whole of deep learning. It can be demonstrated on a neural network called the autoencoder (Figure 4).

Figure 4: Autoencoder.

   The autoencoder not only reduces the dimension of the input data into the feature vector but then performs the opposite process and expands the data back to their original size, using UpSampling layers (which replace each value with, e.g., 2x2 copies of it). Then we train it to produce the same output as the given input. If we succeed, then we are sure that the feature vector represents the input image well, because it is possible to generate the image from the feature vector.
After such training, we can cut the autoencoder into two parts: an encoder and a decoder. The encoder turns images into feature vectors and can be combined with a perceptron to perform classification or detection tasks. The decoder turns feature vectors into images and can be used as a generator of images, even of images that have never been presented to the network.

   Of course, it is typically difficult to understand the representation in the latent space when we are working with real images that have many parameters. Will it be simpler if we present the autoencoder with a dataset precisely generated from a very concrete and small number of parameters (which Lindenmayer systems can do for us)? The organization of the latent space is crucial, as is shown by advanced versions like the variational autoencoder [6]. Therefore, we would like to play with this idea a bit. In the next chapters, we prepare a suitable dataset (chapter 2), train an autoencoder and compare the set of images with the set of their feature vectors (chapter 3), try to recognize the parameters of the Lindenmayer system by the encoder (chapter 4), and turn the decoder from a generator based on feature vectors into a generator based on the parameters of the Lindenmayer system (chapter 5).

2   Dataset preparation

We have employed the Lindenmayer system defined in Table 1 to generate our dataset of rose-leaf images. We have implemented the Lindenmayer system in Python 3.6 using OpenCV 4.3.0 [1]. For simplicity, we have varied mainly the stem angle and the angle between stem and venations. The other parameters can vary just slightly; otherwise, the resulting image is far from a rose leaf. We have also turned the output images into binary form and resized them to 28x28. That enables us to use a proven autoencoder architecture, which requires this input size.

   We have decided that the stem always starts in the top left corner. This decision enables us to process the dataset also with straightforward methods like eigenimages and compare their results with those of the autoencoder. Altogether, our dataset had 1498 images. A few samples can be seen in Figure 5. Of course, we have also recorded the parameters used for the generation of each image. In this way, we have created an annotated dataset free of charge.

Figure 5: A few samples from the generated dataset. The images are annotated by the parameters of the Lindenmayer system used for their generation.
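The post-processing of the rendered images can be done with a few OpenCV calls. In the sketch below, draw_leaf stands for the turtle-graphics interpreter described in section 1.1 (a hypothetical name), and the angle ranges are purely illustrative:

  import cv2
  import numpy as np

  dataset, annotations = [], []
  for stem_angle in np.arange(-30.0, 31.0, 2.5):         # illustrative range
      for venation_angle in np.arange(30.0, 61.0, 2.5):  # illustrative range
          img = draw_leaf(stem_angle, venation_angle)    # grayscale rendering
          _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
          small = cv2.resize(binary, (28, 28), interpolation=cv2.INTER_AREA)
          dataset.append(small.astype(np.float32) / 255.0)
          annotations.append((stem_angle, venation_angle))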


3   Autoencoder training

Involving deep learning, we start with the training of the autoencoder; thus, so far, we do not work with the image annotations. We utilize a proven autoencoder architecture from [3][11]. On the input, the neural network receives grayscale images (pixels in the range 0.0-1.0). They are processed by a convolutional layer with sixteen 3x3 kernels, and then the dimension is reduced by a MaxPooling layer. The output is processed by the next convolutional layer with eight kernels and reduced again. This repeats until the input data shape 28x28x1 has been turned through 28x28x16, 14x14x8 and 7x7x8 into 4x4x8, which is flattened into the 128-dimensional feature vector. Then, as in a mirror, we expand the data by convolutional and UpSampling layers back to the original size 28x28x1 (Figure 6).

   For non-linearity, the convolutional layers use the ReLU activation function, except in two places. A sigmoid is used just before the latent space to ensure that the values in the latent space are from the interval <0.0, 1.0>. And there is a sigmoid on the output of the network, not only to enable us to interpret the output as an image with pixels in the range 0.0-1.0 but also to enable us to use the binary cross-entropy loss function, which performs better than the classic MSE.
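This architecture can be written down in Keras as follows; the sketch below is reconstructed from the layer summary in Figure 6 rather than taken from the author's original code, so implementation details may differ:

  from keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                            Flatten, Activation, Reshape)
  from keras.models import Model

  inp = Input(shape=(28, 28, 1))
  x = Conv2D(16, (3, 3), activation='relu', padding='same')(inp)
  x = MaxPooling2D((2, 2))(x)                           # 14x14x16
  x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
  x = MaxPooling2D((2, 2))(x)                           # 7x7x8
  x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
  x = MaxPooling2D((2, 2), padding='same')(x)           # 4x4x8
  x = Flatten()(x)                                      # 128 values
  code = Activation('sigmoid')(x)                       # latent space in <0.0,1.0>
  x = Reshape((4, 4, 8))(code)
  x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
  x = UpSampling2D((2, 2))(x)                           # 8x8x8
  x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
  x = UpSampling2D((2, 2))(x)                           # 16x16x8
  x = Conv2D(16, (3, 3), activation='relu')(x)          # valid padding: 14x14x16
  x = UpSampling2D((2, 2))(x)                           # 28x28x16
  out = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

  autoencoder = Model(inp, out)
  autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy',
                      metrics=['accuracy'])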
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 28, 28, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 16)        160
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 16)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 8)         1160
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 8)           0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 7, 7, 8)           584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 8)           0
_________________________________________________________________
flatten_1 (Flatten)          (None, 128)               0
_________________________________________________________________
activation_1 (Activation)    (None, 128)               0
_________________________________________________________________
reshape_1 (Reshape)          (None, 4, 4, 8)           0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 4, 4, 8)           584
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 8, 8, 8)           0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 8)           584
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 16, 16, 8)         0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 14, 16)        1168
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 28, 28, 16)        0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 1)         145
=================================================================
Total params: 4,385




Figure 6: The architecture of the used autoencoder in detail [11].

   We train the autoencoder with Keras 2.3.1 [3], using TensorFlow 2.1.0 as a backend, with the Adadelta batch gradient descent algorithm. After 200 epochs, the accuracy is 98.38% on the training set and 98.60% on the testing set (10% of the samples) (Figure 7). The achieved quality is good (Figure 8). Now the autoencoder can code each image into a vector of 128 floats in the range 0.0-1.0 and decode that vector into a very similar image (Figure 9), so we can continue with splitting it into its two parts: the encoder and the decoder.

Figure 7: Training of the autoencoder.

Figure 8: Sample input images from our dataset in the top line and the corresponding output images calculated by our autoencoder.
                                                                    [0.98625815, 0.66299981, 0.99246186, 0.57722825, 0.5       ,
                                                                     0.90062493, 0.93622261, 0.5       , 0.5       , 0.95261234,
                                                                     0.99352407, 0.5       , 0.5       , 0.98512369, 0.5       ,
                                                                     0.82000422, 0.54893059, 0.98727709, 0.71470815, 0.5       ,
                                                                     0.5       , 0.94741422, 0.5       , 0.95114863, 0.75802684,
                                                                     0.9719044 , 0.5       , 0.5       , 0.5       , 0.82802659,
                                                                     0.51641005, 0.97730321, 0.9978047 , 0.80023605, 0.99893409,
                                                                     0.6468786 , 0.5       , 0.94131184, 0.99782324, 0.88698453,
                                                                     0.5       , 0.5       , 0.99995816, 0.50391984, 0.5       ,
                                                                     0.5       , 0.50378138, 0.5       , 0.5       , 0.78034681,
                                                                     0.99993455, 0.53947443, 0.5       , 0.5       , 0.5       ,
                                                                     0.5       , 0.5       , 0.98526198, 0.91212052, 0.5       ,
                                                                     0.5       , 0.87080342, 0.5       , 0.66338164, 0.99515474,
                                                                     0.83904874, 0.99057883, 0.5       , 0.5       , 0.95147383,
                                                                     0.99873096, 0.96877152, 0.77010125, 0.5       , 0.99994564,
                                                                     0.96718907, 0.5       , 0.5       , 0.99889612, 0.5       ,
                                                                     0.5       , 0.5       , 0.99997294, 0.88351119, 0.5       ,
                                                                     0.5       , 0.86901689, 0.5       , 0.5       , 0.5       ,
                                                                     0.9869802 , 0.80338162, 0.5       , 0.5       , 0.5       ,
                                                                     0.5       , 0.91442615, 0.8813709 , 0.5       , 0.5       ,
                                                                     0.5       , 0.81748968, 0.93058556, 0.98220086, 0.87647492,
                                                                     0.5       , 0.97868299, 0.95324636, 0.5       , 0.77138752,
 0.99827719, 0.88651741, 0.5       , 0.5       , 0.99581003,
                                                                     0.99394023, 0.5       , 0.5       , 0.99652219, 0.5       ,
                                                                     0.5       , 0.5       , 0.90829074, 0.9915418 , 0.5       ,
                                                                     0.5       , 0.5       , 0.5       ]
Figure 9: An example of an input image from the dataset, its feature vector in the latent space of the autoencoder, and the corresponding output image.

   While we can employ the encoder part to generate another dataset that contains the feature vectors, the decoder part can be used as a generator of rose-leaf images. It is not a very handy generator, since we have to properly set 128 values in the range 0.0-1.0, but it is possible to generate something like a leaf just from the feature vector values (Figure 10, on the left).

   Can we make such a generator handier? Yes, we can, in a similar way to how we created the generator based on eigenimages. We perform PCA on the dataset of feature vectors and keep just the main components. We express the feature vector as the sum of the mean and multiples of the eigenvectors; then we need to set manually just the multipliers of a few significant eigenvectors. We have used only eight multipliers, from which we calculate the 128 items of the feature vector that we then put into the decoder to obtain the corresponding image. This generator is handier, and it provides pretty rose leaves (Figure 10, on the right), though not only them.
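A sketch of this generator, assuming encoder and decoder are the two parts of the trained autoencoder and dataset is the array of training images:

  import numpy as np

  F = encoder.predict(dataset.reshape(-1, 28, 28, 1))    # N x 128 feature vectors
  mean = F.mean(axis=0)
  eigvals, eigvecs = np.linalg.eigh(np.cov(F - mean, rowvar=False))
  top = eigvecs[:, np.argsort(eigvals)[::-1][:8]]        # 8 main components

  def generate(multipliers):
      # Turn 8 manually selected multipliers into a rose-leaf image.
      feature = mean + top @ np.asarray(multipliers)     # mean + multiples
      return decoder.predict(feature.reshape(1, -1))[0]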
Figure 10: On the left: an image generated from 128 manually selected values of the feature vector; the quality is quite poor. On the right: an image generated from the 8 most significant multipliers of the latent-space eigenvectors; the quality is better.

4   Recognition of the Lindenmayer system parameters

Though we can now generate rose leaves from a few parameters, it is hopeless to look for the parameters of the Lindenmayer system among them. Neither a parameter of the latent space nor a multiplier of its eigenvectors directly corresponds to a parameter of the Lindenmayer system. Even when we perform PCA over the set of feature vectors calculated by the encoder from the images in the dataset, we find that it has the same distribution of the main components (Figure 11).

Figure 11: The eigenvalues of the latent space show that the encoder does not reduce the number of parameters (compare to Figure 2).

   However, we can easily reveal that they are not so far from them. In the beginning, we aimed to train a perceptron to map the feature vectors to the Lindenmayer system parameters. Though this approach was operational, we later found it over-engineered: linear regression provides results as good as the perceptron here. In both cases, we can sufficiently recognize the stem angle: 97% by regression and 99% by the perceptron. On the other hand, we have failed to recognize the angle between stem and venations. This is perhaps due to the small resolution and binary form of the images, a limitation coming from the used architecture and our hardware.

   Linear regression can be added to the encoder neural network as one fully connected layer without bias and with the linear activation function. In this way, we have constructed a neural network that receives an image generated by the Lindenmayer system and recognizes the values of the Lindenmayer system parameters (Figure 12).

Figure 12: The architecture of the recognizer of the Lindenmayer system parameters.

5   Neural network generating images from the Lindenmayer system parameters

Though recognition of the Lindenmayer system parameters from the feature vector is straightforward, the inverse operation is not; this is clear even without a trial. However, we can still train a perceptron that approximates the inverse relation. We put all eight parameters from the annotation of our dataset (two of which vary significantly and six of which are almost constant) into the perceptron input, and expect the corresponding feature vector (128 values) calculated by the encoder from the dataset image. Then we search for a suitable number of hidden layers and suitable numbers of neurons in those layers. We have trained each such candidate architecture. We followed mainly the validation loss, since the accuracy was very low (up to 40%). Fortunately, this does not mean that the trained network does not work, because some items of the feature vector are less important than others, and the error on them can be high without a bad impact. Finally, we have used a perceptron with two hidden layers, each containing 256 neurons with the hyperbolic tangent activation function. And when we joined the perceptron and the decoder, we got a neural network (Figure 13) that can generate images from the parameters of the Lindenmayer system, namely from the stem angle (Figure 14).
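A sketch of this construction, assuming encoder and decoder are the parts of the trained autoencoder, annotations is the N x 8 array of Lindenmayer system parameters, and dataset is the corresponding array of images (the output activation and the number of epochs are our assumptions, not taken from the paper):

  from keras.layers import Input, Dense
  from keras.models import Model

  params_in = Input(shape=(8,))
  h = Dense(256, activation='tanh')(params_in)          # two hidden layers,
  h = Dense(256, activation='tanh')(h)                  # 256 neurons each
  features = Dense(128, activation='sigmoid')(h)        # assumed activation
  perceptron = Model(params_in, features)
  perceptron.compile(optimizer='adadelta', loss='binary_crossentropy')
  targets = encoder.predict(dataset.reshape(-1, 28, 28, 1))
  perceptron.fit(annotations, targets, epochs=200)      # illustrative epochs

  # Joining the perceptron with the decoder yields the generator of Figure 13.
  generator = Model(params_in, decoder(perceptron.output))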
Figure 13: The architecture of the generator of images from the Lindenmayer system parameters.

Figure 14: Generating images from the Lindenmayer system parameters.

6   Conclusion

In this paper, we have dealt with the potential of Lindenmayer systems to pose attractive questions related to deep learning. We have prepared a dataset generated by a Lindenmayer system; thus we have got its annotation, in the form of the Lindenmayer system parameters, free of charge. Then we used the dataset for training a convolutional autoencoder. Further, we have investigated the relationship between its latent space (feature vectors) and the Lindenmayer system parameters. We found that at least some parameters of the Lindenmayer system can easily be recognized from the feature vectors. Finally, we have tried to create a neural-network-based generator analogous to the Lindenmayer system, i.e. a neural network that generates the same images as the Lindenmayer system from the Lindenmayer system parameters. This last job was only partially successful. Our future work should concentrate on the hyper-parameters of the autoencoder architecture. We need an operational architecture that has a larger input image and a latent space as small as possible, containing just parameters that directly correspond to the Lindenmayer system.

All code developed during the preparation of this paper is available at GitHub: https://github.com/andylucny/On-Lindenmayer-Systems-and-Autoencoders.git

Acknowledgement. This research was supported by the project VEGA 1/0796/18.

References

[1] Bradski, G.: The OpenCV Library. Dr. Dobb's Journal of Software Tools (2000)
[2] Brownlee, J.: Deep Learning for Computer Vision. Edition v1.4, machinelearningmastery.com (2019)
[3] Chollet, F.: Deep Learning with Python. Manning Publications Co., Greenwich, CT, USA (2017)
[4] Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
[5] Kelemen, J., Kelemenova, A., Mitrana, V.: Towards Biolinguistics. Grammars 4 (2001), pp. 187-292
[6] Kingma, D., Welling, M.: An Introduction to Variational Autoencoders. Foundations and Trends in Machine Learning 12(4) (2019), pp. 307-392
[7] Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25(2) (2012)
[8] Lindenmayer, A.: Mathematical models for cellular interaction in development. J. Theoret. Biology 18 (1968), pp. 280-315
[9] Pearson, K.: On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine 2(11) (1901), pp. 559-572
[10] Prusinkiewicz, P., Lindenmayer, A., Hanan, J.: The Algorithmic Beauty of Plants. Springer-Verlag, New York (1990)
[11] Rosebrock, A.: Deep Learning for Computer Vision with Python. ImageNet Bundle, 2nd edition. PyImageSearch (2018)
[12] Rosenblatt, F.: The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review 65(6) (1958), pp. 386-408
[13] Rumelhart, D., Hinton, G., Williams, R.: Learning internal representations by error propagation. In: Parallel Distributed Processing, Vol. 1: Foundations. MIT Press, Cambridge, MA (1986)
[14] Ubbens, J., Cieslak, M., Prusinkiewicz, P., Stavness, I.: The use of plant models in deep learning: an application to leaf counting in rosette plants. Plant Methods 14(1) (2018)