Comparisons of two approaches of pattern recognition
for text detection
Ewa Lis, Roman Kluger
Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice



Abstract
The paper describes two methods of pattern recognition: one based on soft sets, the other based on neural networks. Soft set theory is a quite recently developed branch of AI. That approach has a rather simple mathematical background, but it can perform satisfactorily. The widespread neural network approach demands much more computing power to perform at the same level of accuracy. Some hints on making a neural network perform better are given. The authors explain the necessary theoretical terms and definitions, then describe their idea of the two AI systems. At the end, some results of the first system are presented.

Keywords
Text detection, Pattern recognition, Soft reasoning, Simple soft classifier, Weighted soft classifier, Weighted mean soft classifier


1. Introduction

Pattern recognition is a widespread problem in modern applications. It has many solutions, which require the usage of different kinds of neural networks, some image preprocessing [1] and plenty of other techniques. We stated our problem in the following way: given a photo or scan of some text (handwritten or printed) and a photo of some word or letter (the pattern), find all occurrences of the pattern. Our first solution was to use the soft sets technique, which gave rather satisfactory results. Another approach was to use a neural network classifier, but it failed due to problems highlighted in this article. Our solution needs some optimization, but it is a good starting point for further research.

Our first goal was to recognize names in historical handwritten documents, but we had difficulties in finding scans of these with sufficient quality of handwriting. There were also problems with the binarization of such documents. The problem of historical document binarization is discussed in [2, 3, 4, 5, 6, 7]. Due to the aforementioned problems we decided to simplify our task and use scans of books or our own notes. These scans proved to be much easier to binarize and helped us to concentrate on the algorithmic part of our project.

1.1. Related works

There are many possible variations of intelligent systems which are trained to detect patterns. In [8] a system developed to detect patterns of the human voice in spectrograms was presented. Some patterns are also detected in medical informatics, where we can search for disorders of patients [9]. In [10] it was discussed how to develop a rule-based system for the detection of nodules in X-ray lung images. Recent research is also oriented toward the soft sets approach. In [11] it was proposed to mix neural networks with soft sets as detectors of patterns in images from various places.

Pattern detection in writing style is a complex problem, and there are several approaches. In [12] transverse sequence detection was implemented. In [13] a discussion was given of how instance segmentation of images can influence the efficiency of correct recognition. In [14] a rectified attention method for text recognition was proposed.

In our model we have developed a simplified, yet efficient mechanism. The idea is based on the soft sets approach. We have defined a relation table which is used to compare symbols and thereby decide which of them match the pattern. Our experiments show that this idea is efficient both for handwritten and printed texts.


SYSTEM 2020: Symposium for Young Scientists in Technology, Engineering and Mathematics, Online, May 20, 2020
ewalis343@student.polsl.pl (E. Lis); romaklu253@student.polsl.pl (R. Kluger)


2. Mathematical part of the algorithm

2.1. Soft sets - introduction

Soft set theory is one of the most recently developed ideas, which is quite surprising given its mathematical simplicity
and, as we shall see, its powerful performance. The first results were published by the Russian scholar Dmitri Molodtsov in his paper [15] in 1999. His idea was to simulate the uncertainty of membership in a given set.

Definition 1 (Soft set). Let 𝕌 be the universe and 𝔼 be the set of parameters describing elements of 𝕌. A soft set is an ordered pair (F, A), where A ⊆ 𝔼 and F : A ⟶ P(𝕌). By P(𝕌) we denote the power set of 𝕌. We shall refer to F as the membership function.

As we can see from Definition 1, the name soft originates from the parametric description of membership. It can be clearly seen that the classical sets considered in set theory are special cases of soft sets: the membership function of such a set is its indicator function.

Definition 2 (Soft subset). Let 𝕌 be the universe and (F, A), (G, B) be soft sets specified in 𝕌. (F, A) is a soft subset of (G, B) if

• A ⊆ B in the classical sense,
• ∀e ∈ A : F(e) = G(e).

We can also define a soft set in a relation form. That representation is very useful in further applications.

Definition 3 (Soft set in relation form). Let (F, A) be a soft set in a given universe 𝕌. Let

    R_A = {(u, e) : e ∈ A, u = F(e)}.    (1)

R_A is the soft set (F, A) in relation form.

Now we can consider the Cartesian product of A and the image of F (denoted Im(F)). It can be seen that:

    R_A ⊆ A × Im(F).    (2)

Using relation (2) we can express the soft set (F, A) in the language of classical set theory. In particular, we can define an indicator function of that soft set.

Definition 4 (Indicator function of a soft set). Let R_A be a soft set in relation form corresponding to the soft set (F, A) and (u, e) ∈ A × Im(F). The indicator function of the soft set (F, A) is given by the following formula:

    χ_{R_A}((u, e)) = { 1, when (u, e) ∈ R_A,
                        0, otherwise.            (3)

Using the function χ_{R_A}((u, e)), a special matrix called the binary relation table can be constructed. Its dimensions are m × n, where m denotes the cardinality of the set A and n denotes the cardinality of Im(F). That matrix R is given by the formula:

    R = [a_{i,j}]_{m×n} =
        ⎡ a_{1,1}  a_{1,2}  ⋯  a_{1,n} ⎤
        ⎢ a_{2,1}  a_{2,2}  ⋯  a_{2,n} ⎥
        ⎢    ⋮        ⋮     ⋱     ⋮    ⎥
        ⎣ a_{m,1}  a_{m,2}  ⋯  a_{m,n} ⎦ ,    (4)

where a_{i,j} = χ_{R_A}((u_i, e_j)). More definitions and a theoretical introduction can be found in [16]. Our system will depend only on the aforementioned terms.
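To make the constructions above concrete, the following minimal Python sketch (our illustration, not code from the original system) stores a soft set as a mapping from parameters to subsets of 𝕌 and derives the binary relation table of formula (4), oriented as in Table 3 with objects as rows:

# A minimal sketch (not the authors' implementation): a soft set stored as a
# mapping from parameters e to the subsets F(e) of the universe U, and its
# binary relation table built with the indicator function chi from (3).

U = ["apple", "orange", "spinach"]           # universe of objects
F = {                                        # membership function F: A -> P(U)
    "fresh":  {"apple", "orange"},
    "green":  {"apple", "spinach"},
    "frozen": {"spinach"},
}
A = list(F)                                  # parameter set A

# The entry is 1 exactly when u belongs to F(e), i.e. when (u, e) is in R_A.
table = [[1 if u in F[e] else 0 for e in A] for u in U]

for u, row in zip(U, table):
    print(f"{u:>8}: {row}")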
2.2. Soft reasoning

The process of making decisions using soft sets will be called soft reasoning. The classical approach to soft reasoning can be divided into several steps:

1. choose your universe 𝕌 of objects and a set A ⊆ 𝕌,
2. choose parameters which describe your objects and construct the set 𝔼,
3. construct the membership function of every u ∈ 𝕌,
4. choose a set 𝔻, which will be called the set of demands,
5. for every object u ∈ A calculate the value of the classifier,
6. choose the best options using the results of the classifier.

First of all, we shall discuss step 4. A demand d will be a vector of the same length as the cardinality of 𝔼. The elements of d will be real numbers from the interval [0, 1]; each element corresponds to the priority of one parameter from the set 𝔼.

Now, let us consider step 5. A soft classifier is a function which takes the vector of parameters of an object and the vector of demands and returns a numeric value [17]. We shall discuss several kinds of soft classifiers in the next section.

Lastly, we shall present a few notes about step 6. The determination of the number of best options depends on the specific problem. For example, sometimes we wish to choose the single best option; at other times we wish to find all options better than a given threshold value.
2.3. Soft classifiers

We shall discuss three kinds of soft classifiers:

• the simple soft classifier (SSC),
• the weighted soft classifier (WSC),
• the weighted mean soft classifier (WMSC).



For the SSC we simply need a binary vector of demands d: its values are only 0 or 1. For readability we will write that kind of vector by naming those parameters which have the value 1. Let n be the length of the vector of demands d. The classifier for a demand d and an object u is given by the formula:

    SSC(d, u) = ∑_{i=1}^{n} d_i u_i.    (5)

The WSC does not demand binarity of the vector of demands; we use the same weighted sum in our calculations:

    WSC(d, u) = ∑_{i=1}^{n} d_i u_i.    (6)

The WMSC enables us to set a convenient threshold value for our classification. It is given by the formula:

    WMSC(d, u) = ( ∑_{i=1}^{n} d_i u_i ) / ( ∑_{i=1}^{n} d_i ).    (7)
2.4. SSC - example

Let 𝕌 be the set of goods in a grocery store and A ⊆ 𝕌 such that:

    A = {apple, orange, pepper, spinach, tomato, potato}.    (8)

Now we can choose a set of interesting parameters:

    𝔼 = {fresh, frozen, hot, sweet, green, red, local, tropical, leafy, tuber}.

Now we can construct a binary relation table using the standard χ function. That table is presented in Table 3.

Table 3
SSC - binary relation table

               fresh  frozen  hot  sweet  green  red  local  tropical  leafy  tuber
    apple        1      0      0     1      1     0     1       0        0      0
    orange       1      0      0     1      0     0     0       1        0      0
    pepper       1      0      1     0      0     1     0       1        0      0
    spinach      0      1      0     0      1     0     1       0        1      0
    tomato       1      0      0     0      0     1     1       0        0      0
    potato       1      0      0     0      0     0     1       0        0      1

Now consider three binary vectors of demands:

• A = {fresh, hot, red},
• B = {frozen, green, sweet, leafy},
• C = {fresh, green, red, sweet}.

We calculate the values of the SSC, which are presented in Table 1. Now we can choose the goods with the highest value of the SSC for every demand:

• A - pepper,
• B - spinach,
• C - apple.

Table 1
SSC - results

               A    B    C
    apple      1    2    3
    orange     1    1    2
    pepper     3    0    2
    spinach    0    3    1
    tomato     2    0    2
    potato     1    0    1

2.5. WSC - example

Let 𝕌, A be defined as in the previous example. Now we shall construct non-binary vectors of demands (we shall also omit zero-valued elements):

• A = {green - 0.7, red - 0.3, frozen - 1, local - 0.4, tropical - 0.6},
• B = {hot - 0.6, sweet - 0.4, frozen - 0.5, leafy - 0.3, tuber - 0.7}.

The values of the WSC are presented in Table 2. As we can see, the best option for each of the demands is spinach. It can be clearly seen that the SSC is a special case of the WSC, but the WSC gives much more precise results.

Table 2
WSC - results

               A      B
    apple      1.1    0.4
    orange     0.6    0.4
    pepper     0.9    0.6
    spinach    2.1    0.8
    tomato     0.7    0
    potato     0.4    0.7

2.6. Neural networks - model of neuron

To analyze neural network architecture we have to start with the definition of the McCulloch-Pitts neuron model. It is an attempt to simulate the real neurons known from neurobiology. Three main parts can be separated in that model:

• input signals,
• activation function,
• output signal.
Input signals come to the neuron through synapses, which are connections with other neurons. Let us suppose that we have n synapses in a given neuron. The signal of the i-th synapse shall be referred to as s_i. Each synapse has a special number called its weight; we denote by w_i the weight assigned to the i-th synapse. A weighted sum of the values of the input signals is calculated:

    S = ∑_{i=1}^{n} s_i w_i.    (9)

Every neuron can also have a special number b called the bias. It acts as a fixed threshold value of the neuron and is added to the weighted sum S:

    Ŝ = S + b.    (10)

Now a special function f(Ŝ), called the activation function, is calculated. The result of that calculation is assigned to the output signal s of the neuron. There exist plenty of possible activation functions. One of the simplest is the threshold function:

    f₁(Ŝ) = { 1 when Ŝ > a,
              0 otherwise.    (11)

In further applications differentiability of the function f(Ŝ) is demanded. A very popular activation function is the sigmoid function:

    f₂(Ŝ) = e^Ŝ / (e^Ŝ + 1).    (12)

Another widespread activation function is the hyperbolic tangent:

    f₃(Ŝ) = tanh(αŜ) = (e^{αŜ} − e^{−αŜ}) / (e^{αŜ} + e^{−αŜ}).    (13)

We can observe analogies and similarities between a single neuron and the soft classifiers: the neuron can be considered a slightly more sophisticated soft classifier.
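A single neuron as described above fits in a few lines. The sketch below is our illustration (the weights are arbitrary), combining the weighted sum (9), the bias (10) and the sigmoid activation (12):

import math

def sigmoid(x):
    return math.exp(x) / (math.exp(x) + 1.0)   # formula (12)

def neuron(signals, weights, bias, activation=sigmoid):
    s = sum(si * wi for si, wi in zip(signals, weights))   # S, formula (9)
    return activation(s + bias)                            # f(S + b), formula (10)

print(neuron([0.5, 0.2], [0.8, -0.4], bias=0.1))           # about 0.60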
2.7. Neural networks - architecture

A neural network is a set of interconnected neurons. Neurons are organized into layers; every neuron of the next layer is connected to every neuron in the previous layer. We define three categories of layers:

• input layer,
• hidden layers,
• output layer.

The dimension of the input layer has to be equal to the number of parameters describing the given category of objects. It is the first layer in the network, so its neurons are only connected to the next layer. For example, if we want to classify pictures of dimension 28 × 28 pixels, we have to use 28 × 28 = 784 input neurons. There are no set rules about the optimal number and dimension of hidden layers, but generally more hidden layers give better performance of the neural network. Those numbers have to be tailored for each task separately. There will be k neurons in the output layer, where k corresponds to the number of classes in our system. For binary classification 1 neuron is sufficient, but for 10 classes we need 10 neurons in the output layer.

2.8. Neural reasoning

To make a decision using a neural network we have to feed the input signal forward. We calculate the output of each layer, and the output of the last layer is the decision of the network. The formula for the output of the i-th neuron in the j-th layer is as follows:

    s_{i,j} = f( ∑_{k=1}^{K} w^j_{k,i} s_{k,j−1} + b_{i,j} ),    (14)

where:

• K is the number of neurons in the (j−1)-st layer,
• w^j_{k,i} is the weight of the connection between the k-th neuron in the (j−1)-st layer and the i-th neuron in the j-th layer,
• b_{i,j} is the bias of the i-th neuron in the j-th layer,
• f is the activation function.

Formula (14) has to be applied to every layer, starting with the first hidden layer. After several steps we get the output of the network. That process, however, requires well-suited weights. Due to the complexity of the architecture there can be thousands of weights to adjust, a task which is unbearable to do manually.

A special algorithm called back-propagation of error was developed to improve the performance of the network. The process of adjusting the weights is called training or learning of the network.
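Formula (14), applied layer by layer, is a short loop. The sketch below is ours, with randomly initialized parameters standing in for trained ones and the 784-input example from Section 2.7:

import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Feed-forward pass, formula (14): weights[j][i] is the weight vector of the
# i-th neuron on layer j, biases[j][i] its bias.
def feed_forward(x, weights, biases, activation=sigmoid):
    for layer_w, layer_b in zip(weights, biases):
        x = [activation(sum(w * s for w, s in zip(neuron_w, x)) + b)
             for neuron_w, b in zip(layer_w, layer_b)]
    return x

# A 784-30-10 network: 28 x 28 inputs, one hidden layer, 10 output classes.
sizes = [784, 30, 10]
weights = [[[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [[random.gauss(0, 1) for _ in range(n)] for n in sizes[1:]]

print(len(feed_forward([0.0] * 784, weights, biases)))   # -> 10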
2.9. Back-propagation algorithm

The differentiability of the activation function is used in this algorithm. We start with randomly chosen weights and biases.






We shall compare the desired output d of the network with the actual output a of the network. We have to define a mean square error function:

    MSE(a, d) = ∑_{i=1}^{N} (1/2)(a_i − d_i)².    (15)

Let us consider how much MSE(a, d) is influenced by each of the weights and biases. That influence can be expressed as a partial derivative with respect to a given parameter p:

    ∂MSE(a, d) / ∂p.    (16)

We have to apply the chain rule to calculate the derivative (16); that is the reason for demanding differentiability of the activation function. Now we have to define the learning rate η ∈ (0, 1). That coefficient tells us how fast we want to correct the parameters of the network. We can formulate the correction equations:

    w = w − η ∂MSE(a, d)/∂w,    (17)
    b = b − η ∂MSE(a, d)/∂b,    (18)

where w and b are given weights and biases, respectively. After several steps we can significantly reduce the error of the network. A neural classifier can be applied only after training to give acceptable results. As we can see, large networks demand a lot of calculation to be trained, because the gradients (16) have to be calculated for each of the many parameters.
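The correction equations (17)-(18) amount to gradient descent on the error (15). As a compact illustration (ours; back-propagation obtains the same gradients analytically through the chain rule, whereas here they are estimated numerically for brevity), one update step on a flat parameter vector might look like this:

def mse(a, d):
    return sum(0.5 * (ai - di) ** 2 for ai, di in zip(a, d))   # formula (15)

def gradient_step(params, loss, eta=0.1, eps=1e-6):
    grads = []
    for k in range(len(params)):
        shifted = params[:]
        shifted[k] += eps
        grads.append((loss(shifted) - loss(params)) / eps)     # approximates (16)
    return [p - eta * g for p, g in zip(params, grads)]        # updates (17)-(18)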
3. Description of proposed system

3.1. Soft classifier

Our system takes two pictures: one large, which will be referred to as the page, and one small, which will be referred to as the pattern. First, the page is converted to gray-scale with the value:

    grayscale(x, y) = (red(x, y) + green(x, y) + blue(x, y)) / 3,    (19)

where red(x, y), green(x, y), blue(x, y) correspond to the RGB values of the pixel with coordinates (x, y). Let:

    sum(x, y) = ∑_{i=−1}^{1} ∑_{j=−1}^{1} grayscale(x + i, y + j),    (20)

and:

    surr(x, y) = sum(x, y) − grayscale(x, y).    (21)

After converting to gray-scale, the image is binarized using the formula:

    f(x, y) = { 1 when grayscale(x, y) < surr(x, y)/8,
                0 otherwise.    (22)

Then the pattern is converted to gray-scale using the same function and normalized into the interval [0, 1] using the following formula:

    g(x, y) = (255 − grayscale(x, y)) / 255.    (23)

The gray-scale is reversed; that operation enables us to treat dark points as having higher values.

Now the binarized page is divided into segments of the size of the pattern. Then both the pattern and each segment of the page are converted into one-dimensional vectors. After repeating the whole process for every possible segment of the page, we have our universe 𝕌 consisting of segments u. For every u we have a membership function and we can build the binary relation table. Then we take the vector built from the pattern and check which of the segments meet the demands expressed by numbers from the interval [0, 1].

The simple soft classifier and the weighted soft classifier do not work well in our case, because they demand knowledge of the number of patterns present on the page, or calculating threshold values for every pattern. After taking these issues into account, we decided to use the WMSC. After several trials we determined sufficient values for the aforementioned threshold; the results are presented in Table 4. After calculation, every segment with a WMSC value greater than the specified threshold value is highlighted and the final result is saved.
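The preprocessing of formulas (19)-(23) can be sketched as follows (our illustration; the page is assumed to be a 2-D list of RGB triples, and border pixels are skipped for brevity):

def grayscale(img, x, y):
    r, g, b = img[y][x]
    return (r + g + b) / 3                                    # formula (19)

def binarize(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            surr = sum(grayscale(img, x + i, y + j)           # formulas (20)-(21)
                       for i in (-1, 0, 1) for j in (-1, 0, 1)
                       if (i, j) != (0, 0))
            out[y][x] = 1 if grayscale(img, x, y) < surr / 8 else 0   # formula (22)
    return out

def normalize(img):
    return [[(255 - grayscale(img, x, y)) / 255               # formula (23)
             for x in range(len(img[0]))] for y in range(len(img))]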
Figure 1: Handwriting recognition

Figure 2: Printed text recognition
Table 4
Threshold values for WMSC

    single handwritten symbol    0.45
    handwritten word             0.4
    printed symbol               0.6
3.2. Neural classifier

We also propose a neural classifier for this task. We wanted to train our classifier on the MNIST dataset of handwritten digits to recognize similar patterns, and then apply it to our binarized page and pattern. Our idea was to build a neural network with 2 × 28 × 28 = 1568 inputs and one binary output telling us whether the two images are similar or not. After training, it had to go through every segment of the page and find similarities with the pattern.
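A network of the described shape (1568 inputs, one sigmoid output deciding whether a segment and the pattern match) could be declared, for instance, in PyTorch. The sketch below is our guess at such an architecture, not the authors' code; the hidden layer sizes are illustrative:

import torch
from torch import nn

# The flattened 28 x 28 segment and 28 x 28 pattern are concatenated into
# 2 * 28 * 28 = 1568 inputs; the single sigmoid output estimates whether
# the two images show the same symbol.
similarity_net = nn.Sequential(
    nn.Linear(2 * 28 * 28, 256),
    nn.ReLU(),
    nn.Linear(256, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),
)

pair = torch.rand(1, 1568)            # [segment | pattern], flattened
print(similarity_net(pair).item())    # a value in (0, 1)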
4. Experiments

4.1. Soft classifier

After some trial and error we determined satisfying threshold values for the WMSC. We prepared a test set for our classifier consisting of pairs (page, pattern). We manually recorded the type of each document and ran the program; then we also manually checked the behavior of the classifier. The results are presented in the following figures:

• searching for a printed sign - fig. 2,
• searching for a handwritten word or letter - fig. 1.

The threshold values determined experimentally are shown in Table 4.

4.2. Neural classifier

As mentioned earlier, we constructed a few architectures of neural networks and tried to train them on the MNIST dataset. All of the attempts unfortunately failed: the network learned to return the same answers rather than recognize patterns. After some research we found the main disadvantages of our system:

• too shallow an architecture,
• too little accessible computing power,
• too large training batches.

Keeping that in mind, we can construct classifiers with better performance in the future.

5. Conclusions

As we can see, soft classifiers perform quite well in the given task. We plan to develop a method for determining the threshold values of the soft classifier and apply them to more demanding tasks. We also want to improve the training of the neural classifier. To achieve that goal we will have to optimize the code of the training algorithm to perform calculations faster. That will allow us to explore deeper architectures of neural networks and then find a sufficient number of layers for our task.

References

[1] G. Capizzi, S. Coco, G. Lo Sciuto, C. Napoli, A new iterative FIR filter design approach using a Gaussian approximation, IEEE Signal Processing Letters 25 (2018) 1615–1619.
[2] C. Mello, A. Oliveira, A. Sanchez, Historical document image binarization, volume 1, 2008, pp. 108–113.
[3] C. Napoli, G. Pappalardo, E. Tramontana, An agent-driven semantical identifier using radial basis neural networks and reinforcement learning, arXiv preprint arXiv:1409.8484 (2014).
[4] M. Almeida, R. Lins, R. Bernardino, D. Jesus, B. Lima, A new binarization algorithm for historical documents, Journal of Imaging 4 (2018) 27.
[5] A. Venckauskas, A. Karpavicius, R. Damaševičius, R. Marcinkevičius, J. Kapočiūte-Dzikienė, C. Napoli, Open class authorship attribution of Lithuanian internet comments using one-class classifier, in: 2017 Federated Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2017, pp. 373–382.
[6] O. Boudraa, W. K. Hidouci, D. Michelucci, Degraded historical documents images binarization using a combination of enhanced techniques, 2019.
[7] C. Napoli, E. Tramontana, G. L. Sciuto, M. Wozniak, R. Damaevicius, G. Borowik, Authorship semantical identification using holomorphic Chebyshev projectors, in: 2015 Asia-Pacific Conference on Computer Aided System Engineering, IEEE, 2015, pp. 232–237.
[8] D. Połap, M. Woźniak, R. Damaševičius, R. Maskeliūnas, Bio-inspired voice evaluation mechanism, Applied Soft Computing 80 (2019) 342–357.
[9] F. Beritelli, G. Capizzi, G. Lo Sciuto, C. Napoli, M. Woźniak, A novel training method to preserve generalization of RBPNN classifiers applied to ECG signals diagnosis, Neural Networks 108 (2018) 331–338.
[10] G. Capizzi, G. Lo Sciuto, C. Napoli, D. Polap, M. Woźniak, Small lung nodules detection based on fuzzy-logic and probabilistic neural network with bio-inspired reinforcement learning, IEEE Transactions on Fuzzy Systems 6 (2020).
[11] M. Woźniak, D. Połap, Soft trees with neural components as image-processing technique for archeological excavations, Personal and Ubiquitous Computing (2020) 1–13.
[12] Y. Liu, L. Jin, S. Zhang, C. Luo, S. Zhang, Curved scene text detection via transverse and longitudinal sequence connection, Pattern Recognition 90 (2019) 337–345.
[13] Y. Zhu, J. Du, TextMountain: Accurate scene text detection via instance segmentation, Pattern Recognition (2020) 107336.
[14] C. Luo, L. Jin, Z. Sun, MORAN: A multi-object rectified attention network for scene text recognition, Pattern Recognition 90 (2019) 109–118.
[15] D. Molodtsov, Soft set theory—first results, Computers & Mathematics with Applications 37 (1999) 19–31.
[16] Onyeozili, T. M. Gwary, A study of the fundamentals of soft set theory, International Journal of Scientific & Technology Research 3 (2014) 132–143.
[17] G. Cardarilli, L. Di Nunzio, R. Fazzolari, A. Nannarelli, M. Re, S. Spano, N-dimensional approximation of Euclidean distance, IEEE Transactions on Circuits and Systems II: Express Briefs 67 (2020) 565–569.