Oriented Local Binary Patterns for Writer Identification

Anguelos Nicolaou, Marcus Liwicki and Rolf Ingold

Institute of Computer Science and Applied Mathematics, University of Bern, Neubrückstrasse 10, 3012 Bern, Switzerland. Email: anguelos.nicolaou@gmail.com
Document, Image and Voice Analysis (DIVA) Group, University of Fribourg, Bde des Perolles 90, Fribourg, Switzerland. Email: firstname.lastname@unifr.ch

Abstract—In this paper we present an oriented texture feature set and apply it to the problem of offline writer identification. Our feature set is based on local binary patterns (LBP), which were broadly used for face recognition in the past. These features are inherently texture features. Thus, we approach the writer identification problem as an oriented texture recognition task and obtain remarkable results, comparable to the state of the art. Our experiments were conducted on the ICDAR 2011 and ICFHR 2012 writer identification contest datasets. On these datasets we investigate the strengths of our approach as well as its limitations.

I. INTRODUCTION

A. Local Binary Patterns

Local binary patterns (LBP) were broadly popularized in 2002 with the work of Ojala et al. [1] as a texture feature set extracted directly from grayscale images. As Ojala demonstrated, the histogram of specific binary patterns is a very powerful feature set. LBP are inherently texture features, but they have been used in a very broad range of applications in Computer Vision (CV), many of which exceed typical texture recognition tasks. In 2004, Ahonen et al. [2] successfully used LBP for face recognition. In 2007, Zhao et al. [3] extended the operator to a 2D-plus-time voxel version of LBP, called VLBP, and used it successfully for facial gesture recognition. In 2009, Wang et al. [4] combined LBP features with HOG features to address the problem of partial occlusions in human detection.

B. Writer Identification

While graphology, i.e. the detection of personality traits based on handwriting, has been associated with bad science [5] and has failed to provide experimentally sound significant results [6], handwriting style can be considered an invariant attribute of the individual. Writer identification has traditionally been performed by Forensic Document Examiners using visual examination. In recent decades there has been an effort to automate the process and codify this knowledge into automated methods. In 2005, Bensefia et al. [7] successfully used features derived from statistical analysis of graphemes, bigrams, and trigrams. In 2008, He et al. [8] used Gabor-filter-derived features, and in 2010 Du et al. [9] introduced LBP on the wavelet domain. Even though the method of Du uses LBP for feature extraction in writer identification, the similarities end there. Our method makes no assumptions specific to handwriting and treats the problem as a generic oriented binary texture classification problem. The extent to which handwriting contains invariant characteristics of the writer is an open question. While forensic document examiners have been tested in detecting disguised handwriting by Bird et al. [10], Malik et al. [11] have started to address the issue of different writing styles for automated offline writer identification systems. It remains an open question whether handwriting style can provide us with real biometric markers, invariant to the sample acquisition conditions. By preserving the generic attributes of our method, we can safely avoid addressing many complications that are specific to handwriting analysis and writer detection.

II. LBP FEATURE SET

Although writer identification seems to require scale invariant features, scale sensitive features might be suited as well. Writers tend to write at a specific size, so the scale of the texture tends to depend directly on the sampling rate. The task of writer identification is almost always performed with respect to a dataset, where the sampling rate is defined or at least known when performing feature extraction. It is feasible, and probably worth the effort, to resample all text images to a standard sampling resolution, rather than improvising a scale invariant feature set. Our feature set, as is the norm, is derived from the histogram of occurring binary patterns.
A. The LBP operator

LBP were defined in [1] as a local structural operator, operating on the periphery of a circular neighborhood. LBP are encoded as integers, which in binary notation map each sample on the periphery to a binary digit. As can be seen in Fig. 1 and (2), LBP are defined by the radius of the circular neighborhood and the number of pixels sampled on the periphery. The sampling neighborhood N_{r,b} is formally defined in (1):

    ∀n, φ : n ∈ [0..b−1] ∧ φ = 2πn/b
    N_{r,b}(I(x,y), n) = I(x + sin(φ)·r, y + cos(φ)·r)    (1)

Given a binary operator f(x1, x2) : R^2 → {0, 1}, the LBP operator is defined in (2):

    LBP_{r,b,f}(x,y) = f(N_{r,b}(I(x,y), b−1), I(x,y))·2^{b−1}
                     + f(N_{r,b}(I(x,y), b−2), I(x,y))·2^{b−2}
                     + … + f(N_{r,b}(I(x,y), 0), I(x,y))·2^0    (2)

Fig. 1: Indicative LBP operators: LBP_{1,4} (a), LBP_{1,8} (b), LBP_{1.5,8} (c), LBP_{2,8} (d), LBP_{2,12} (e), LBP_{2,16} (f), LBP_{3,8} (g), LBP_{3,16} (h). Dark green represents pixels with 100% contribution, green represents pixels with 50%, light green pixels with 25%, and black is the reference pixel.

When defined on grayscale images, LBP are obtained by thresholding each pixel on the periphery by the central pixel. Because we work on binary images as input, many more operations than greater-or-equal (thresholding) are possible as the binary operation. We generalized our definition of LBP in (2) to consider the boolean operator, marked as f, a third defining characteristic of the operator LBP_{r,b,f}, along with the radius r and the number of samples b.

We took several factors into account when selecting the appropriate LBP binary operator. Concerning the bit count, a bit count of 8 presents many benefits. Implementation-wise, the LBP transform is an image that uses one byte per pixel. Its histogram has 256 bins, providing a high feature-vector dimensionality and good discriminative properties. Additionally, a distinct LBP count of 256 guarantees highly representative sampling on relatively small surfaces of text.
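As an illustration, the sampling of (1) and the bit encoding of (2) can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: it assumes nearest-pixel sampling (no interpolation) and wrap-around borders, and uses pixel equality as the binary operator, as discussed for binary input images.

```python
import numpy as np

def lbp_transform(img, radius=1.0, bits=8):
    """Generalized LBP transform of a binary image (eq. 1 and 2).

    Each output pixel encodes, as an integer, the comparison of the
    central pixel with `bits` samples on a circle of `radius`.
    """
    out = np.zeros(img.shape, dtype=np.uint8)
    for n in range(bits):
        phi = 2.0 * np.pi * n / bits                  # eq. (1)
        dx = int(round(np.sin(phi) * radius))         # nearest pixel,
        dy = int(round(np.cos(phi) * radius))         # no interpolation
        # shift the image so neighbor (dx, dy) aligns with the center
        neighbor = np.roll(np.roll(img, -dx, axis=0), -dy, axis=1)
        out |= ((neighbor == img).astype(np.uint8) << n)  # "equals" as f
    return out
```

Note that with this operator, any pixel whose whole neighborhood equals it (uniform foreground or uniform background) maps to the all-ones pattern 255.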
B. The LBP function

While LBP are traditionally derived from grayscale images, when dealing with text it is better to use binarized text images as input, thus avoiding all information coming from the text background. We considered many different binary operations and chose the binary operator "equals" (3) as f() in (2):

    f(x_center, x_periphery) = 1 if x_center = x_periphery, 0 if x_center ≠ x_periphery    (3)

"Equals" as a boolean function on an image is true for any background pixel in the peripheral neighborhood of a background pixel, true for any foreground pixel in the peripheral neighborhood of a foreground pixel, and false for everything else. When using the "equals" function as the binary function in an 8-bit-count LBP, all pixels with only foreground or only background in their neighborhood have an LBP value of 255. By suppressing (ignoring) the 255 bin, we make the LBP histogram surface invariant. All occurrences left in the histogram represent pixels on the border between foreground and background. The core of the feature set comprises the 255 histogram bins normalized to a sum of 1. This normalization renders the features derived from the histogram invariant to the number of signal pixels in the image.

C. Redundant Features

Having the normalized 255 bins of the histogram as the core of the feature set, we calculate some redundant features that amplify aspects of the LBP we consider significant for the writer identification task. Our goal is a feature set discriminative enough to work well with naive classifiers such as nearest neighbor or, even more, to classify writers by clustering the samples without any training.

The first redundant feature group we use is edge participation. We consider each pattern to have a specific probability of belonging to an edge of a specific orientation; from now on we call that probability its contribution. The sum of the occurrences of each pattern, multiplied by its contribution factor, makes up the oriented edge occurrences. In Fig. 2a all top-edge patterns can be seen along with their contributions; in Fig. 2b we see the top-left-edge patterns and their contributions, which are derived from the top-edge patterns by rotating them counter-clockwise. By rotating the contributing patterns of the top edge, we obtain the contributing patterns of all eight edge orientations. We also add the more general edge orientations horizontal, vertical, ascending, and descending as separate features, each calculated as the sum of the respective pair of edge orientations. Finally we calculate the two aggregations perpendicular and diagonal, which are the sums of horizontal and vertical, and of ascending and descending, respectively. In total we obtain 14 edge features, which we then normalize to a sum of 1. One of our aims when introducing these redundant features is to enhance characteristics that have been associated with writer identification, such as text slant.

Fig. 2: LBP edge patterns. In (a) the top-edge contributing patterns and in (b) the top-left-edge contributing patterns can be seen. Contribution: black 100%, dark gray 50%, gray 25%, and light gray 12.5%.
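The aggregation of oriented edge occurrences described above can be sketched as follows. The actual contribution table must be read off Fig. 2a, so the `top_edge_contrib` argument and the mapping of rotation steps to orientation names are illustrative assumptions:

```python
import numpy as np

def rotate_pattern(p, k, bits=8):
    """Rotate the bits of an LBP pattern left by k positions."""
    return ((p << k) | (p >> (bits - k))) & ((1 << bits) - 1)

def edge_features(hist, top_edge_contrib):
    """14 edge-participation features from a 256-bin LBP histogram.

    top_edge_contrib maps top-edge patterns to contribution factors
    (placeholders for the values of Fig. 2a); the assignment of
    rotation steps to orientations below is an assumption.
    """
    oriented = np.zeros(8)  # one score per 45-degree edge orientation
    for k in range(8):
        for pattern, weight in top_edge_contrib.items():
            oriented[k] += weight * hist[rotate_pattern(pattern, k)]
    horizontal, vertical = oriented[0] + oriented[4], oriented[2] + oriented[6]
    ascending, descending = oriented[1] + oriented[5], oriented[3] + oriented[7]
    feats = np.concatenate([oriented,
                            [horizontal, vertical, ascending, descending,
                             horizontal + vertical,        # perpendicular
                             ascending + descending]])     # diagonal
    total = feats.sum()
    return feats / total if total else feats  # normalize to a sum of 1
```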
The second redundant feature group we implemented is the rotation invariant hashes. We grouped all patterns so that each pattern in a group can be transformed into any other pattern of that group by rotation. For an 8-sample LBP, there are 36 distinct rotation invariant patterns in total [1]. Some pattern groups contain only one pattern, e.g. pattern 0, while other groups contain up to 8 patterns, such as the patterns with a single bit set: 1, 2, 4, 8, 16, 32, 64, 128. We took the number of occurrences of each group in the input image and normalized them to a sum of 1, thus providing 36 rotation invariant features. A feature group complementary to the rotation invariant patterns is what we named the rotation phase. For each group, we took the pattern with the minimum numeric value and designated it as the group-hash. The number of clockwise rotations a pattern needs in order to become its group-hash is what we call its rotation phase. By definition, the distinct phases in an LBP image are as many as the number of samples of the LBP. The frequencies of all phases, normalized to a sum of 1, provide 8 more redundant features that are complementary to the rotation invariant hashes.
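A group-hash and rotation phase can be computed by taking the minimum over all bit rotations of a pattern; assuming that a clockwise rotation corresponds to a right bit shift, a sketch could look like:

```python
def group_hash_and_phase(pattern, bits=8):
    """Rotation-invariant group hash and rotation phase of an LBP pattern.

    The group hash is the minimum value among all bit rotations of the
    pattern; the phase is the number of rotations needed to reach it.
    """
    mask = (1 << bits) - 1
    best, phase = pattern, 0
    p = pattern
    for k in range(1, bits):
        p = ((p >> 1) | (p << (bits - 1))) & mask  # rotate one step
        if p < best:
            best, phase = p, k
    return best, phase
```

Over all 256 eight-bit patterns this yields exactly 36 distinct group hashes, matching the count reported above.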
A third group of redundant features we introduced to our feature set is what we called the beta function, defined in (4), along with the bit count of every pattern:

    ∀n ∈ [1..bitcount], ∀lbp ∈ [0..2^bitcount − 1]:
    d(lbp, n) = 1 if bit n is set in lbp and bit n−1 is not set in lbp, 0 otherwise
    β(lbp) = Σ_n d(lbp, n)    (4)

When the sample count is 8, the β function has up to 5 distinct values. The histogram of the β function (5 bins) normalized to a sum of 1, and the histogram of the bit count of every pattern, likewise normalized to 1, are the last redundant feature group we defined. The β function becomes an important feature when the LBP radius is greater than the pen stroke thickness. In such situations, e.g., a β count of one indicates the ending of a line, and a β count of three or four indicates lines crossing.

Putting it all together, we have the 255 histogram bins, plus 36 rotation invariant features, plus 8 rotation phase features, plus 14 edge features, plus 5 β function features, plus 9 sample-count features, for a total of 327 features; this is the proposed feature set. The redundant features make the set well suited for naive classifiers. By setting the 255 histogram bin to 0, the feature set ignores all non-signal areas in the image. The normalization of all bins to a sum of 1, as well as the nullification of the last bin, renders our feature set invariant with respect to non-signal (white) areas.
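A direct transcription of (4), assuming 0-indexed bits so that n ranges over [1, bits−1]:

```python
def beta(lbp, bits=8):
    """Beta of eq. (4): the number of 0->1 transitions encountered while
    scanning the bits of an LBP pattern from bit 0 upward."""
    return sum(1 for n in range(1, bits)
               if (lbp >> n) & 1 and not (lbp >> (n - 1)) & 1)
```

For 8-bit patterns, β ranges over 0 to 4, giving the 5 distinct values mentioned above (the alternating pattern 0b10101010 attains the maximum of 4).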
D. The Classifier

Once we transform a given set of images into feature vectors, we can either use them in a nearest neighbor classifier or perform clustering on them. While clustering seems the more generic approach, it is constrained by the need to process all samples at the same time. Such a constraint makes the clustering approach very well suited for research purposes but hard to match to real world scenarios. The construction of the classifier consists of four steps. In the first step, we extract the image features. In the second step, we rebase the features along the principal components of a given dataset by performing principal component analysis. This step might, in a very broad sense of the term, be considered training, because our method acquires information from a given dataset. In the third step we scale the rebased features by a scaling vector, which was defined by evolutionary optimization on the train set. The optimization process is also performed on a given dataset and should likewise be considered a training stage. While not required, it makes more sense for both training steps to be performed on the same dataset. The fourth and last step is to calculate the L1 norm on the scaled and rebased feature vectors. Steps two and three can be combined into a linear operation on the feature space and in many respects should be viewed as a statistically derived heuristic matrix. Our classifier, as implemented, has two inputs, a subject dataset and a reference dataset. The output consists of a table where each row refers to a sample in the subject dataset and contains all samples of the reference dataset ranked by similarity to that sample. When benchmarking classification rates of our method, we can simply run our classifier with an annotated dataset as both subject dataset and reference dataset. In this case, the first column contains the subject sample and the second column contains the most similar sample in the dataset other than itself. The rate at which the classes in the first column agree with the classes in the second column is the nearest neighbor classification rate.

E. Scale Vector Optimisation

Describing the optimization process of the scaling vector in detail would go beyond the scope of this paper. In brief, we optimized using an evolutionary algorithm. We used as input the 125 most prominent components of the features and the id of the writer of each sample. We optimized on the ICFHR 2012 writer identification competition dataset [13], which contains 100 writers contributing 4 writing samples each. Individuals of the algorithm were modeled as vectors of continuous scaling factors, one for each feature in the feature space. The fitness function was based on the classification rate a nearest neighbor classifier obtains when the feature space is scaled by each individual. The stopping criterion was set to 2000 generations, and each generation had 20 individuals. Suitable parents were determined by the rank they obtained in their generation.
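The four-step classifier can be sketched as below; this is an illustrative reconstruction, not the authors' code. `train_transform` folds steps two and three into a single matrix (the PCA mean is omitted because a common shift cancels in pairwise L1 differences), and `rank_references` implements step four.

```python
import numpy as np

def train_transform(train_feats, scale_vector):
    """Combine PCA rebasing and feature scaling into one matrix
    (steps two and three of the classifier)."""
    centered = train_feats - train_feats.mean(axis=0)
    # principal components via SVD; the rows of vt are the components
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return np.diag(scale_vector) @ vt

def rank_references(subject, reference, matrix):
    """For each subject sample, rank all reference samples by L1 distance
    in the rebased and scaled feature space (step four)."""
    s = subject @ matrix.T
    r = reference @ matrix.T
    dists = np.abs(s[:, None, :] - r[None, :, :]).sum(axis=2)  # L1
    return np.argsort(dists, axis=1)  # per row: reference indices, nearest first
```

Running the classifier with the same annotated set as subject and reference, each row's first entry is the sample itself and the second entry is its nearest neighbor, as in the benchmarking setup described above.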
III. EXPERIMENTAL PROCEDURE

In order to have a proper understanding of our method's performance, its robustness, and its limitations, we conducted a series of experiments. We used two datasets: the dataset from the ICDAR 2011 writer identification contest [12], hereafter the 2011 dataset, and the dataset from the ICFHR 2012 writer identification challenge [13], hereafter the 2012 dataset. The 2011 dataset has 26 writers contributing samples in Greek, English, German, and French with 8 samples per writer. The 2012 dataset has 100 writers contributing samples in Greek and English with 4 samples per writer. While the 2011 dataset was given as the train set for the 2012 contest, we used them in the opposite manner. In order to avoid overfitting during the optimization step, we deemed the "harder" dataset, containing more classes and fewer samples per class, better suited for training.

A. Performance

As previously described, our method consists of four stages: feature extraction, principal component analysis, scaling vector optimization, and L1 distance estimation. Stages two and three require a training dataset, while stages one and four are totally independent of any data. TABLE I shows analytical scores of our method in various modalities. Apart from the nearest neighbor accuracy we also report the hard TOP-N and soft TOP-N criteria [12], [13]. The soft TOP-N criterion is the percentage of samples in the test set that have at least one sample of the same class among their N nearest neighbors. The hard TOP-N criterion is the percentage of samples in the test set that have only samples of the same class among their N nearest neighbors. In more detail, TABLE I shows various versions of our method and their performance, as well as some state of the art methods for reference. The methods Tsinghua, MCS-NUST, and Tebessa [14] are the top performing methods of the ICDAR 2011 writer identification contest. We must point out that our method had a vastly superior train set, consisting of 400 samples, and that we had access to the test set while working. Our method has two parts that were optimized on our train set, the 2012 dataset: the principal components of the train set and the scaling of the feature space. "No PC, No train" is the raw feature space without any training, just the features in an L1 nearest neighbor setup. "PC, No train" is the feature space rebased along the principal components of the train set in an L1 nearest neighbor setup. "PC, Train" is the feature space rebased along the principal components of the train set and scaled along the optimized vector in an L1 nearest neighbor setup. As we can see, our method almost reaches the overall performance of the state of the art when it incorporates the fully trained heuristics, but it also provides very good results in its untrained form.

TABLE I: Performance results. Various modalities of our method on the 2011 dataset [12], with state of the art methods for reference.

    NAME            | Nearest Neighbor | Hard Top-2 | Hard Top-3 | Hard Top-5 | Hard Top-7 | Soft Top-5 | Soft Top-10
    Tsinghua        | 99.5%            | 97.1%      | NA         | 84.1%      | 44.1%      | 100%       | 100%
    MCS-NUST        | 99.0%            | 93.3%      | NA         | 78.9%      | 38.9%      | 99.5%      | 99.5%
    Tebessa         | 98.6%            | 97.1%      | NA         | 81.3%      | 50.0%      | 100%       | 100%
    No PC, No train | 96.63%           | 87.02%     | 79.33%     | 63.94%     | 28.84%     | 98.56%     | 99.04%
    PC, No train    | 98.56%           | 91.35%     | 84.62%     | 68.27%     | 34.62%     | 98.56%     | 98.56%
    PC, Train       | 98.56%           | 95.19%     | 91.83%     | 84.13%     | 50.48%     | 99.04%     | 99.04%
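The soft and hard TOP-N criteria defined above can be computed from the classifier's ranking table as follows (a sketch; `order` is assumed to exclude each sample itself, as in the benchmarking setup described earlier):

```python
def top_n_rates(order, labels, n):
    """Soft and hard TOP-N criteria from a ranking table.

    order[i] lists sample indices ranked by similarity to sample i
    (self excluded); labels holds the class of each sample.
    Soft: at least one of the n nearest shares sample i's class.
    Hard: all of the n nearest share sample i's class.
    """
    soft = hard = 0
    for i, ranked in enumerate(order):
        neighbor_classes = [labels[j] for j in ranked[:n]]
        soft += labels[i] in neighbor_classes
        hard += all(c == labels[i] for c in neighbor_classes)
    return soft / len(order), hard / len(order)
```

With n = 1 the two criteria coincide with the nearest neighbor classification rate.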
B. Qualitative Experiments

Apart from providing a comprehensive accuracy score that is comparable to other methods, in order to describe the strengths and limitations of our method we performed a series of experiments that simulate frequently appearing distortions of the data.

1) Rotation: Text orientation is a text image characteristic that is definitely affected by the writer. Under controlled homogeneous experimental conditions of data acquisition, text orientation should depend only on the writer. Quite often in real life scenarios we have no way of knowing whether an image has been rotated, nor to what extent. One of the important characteristics of a writer identification system is robustness against moderate rotation. We address this issue with an experiment in which we try to recognize samples of a dataset with rotated versions of the database. More specifically, we took the 2012 dataset and rotated its samples in steps of 1° from −20° to 20°. We obtained our measurements by classifying the original 2012 dataset against the rotated versions. Fig. 3 demonstrates the rotation sensitivity of our method with two different measurements. The first, noted as Sample Self Recognition, is the nearest neighbor accuracy including the test sample; the Sample Self Recognition rate is by definition 100% when no rotation occurs. The second, marked as Nearest Neighbor, is the accuracy of the nearest neighbor excluding the first occurrence; Nearest Neighbor is by definition the accuracy when no rotation occurs. As can be seen in Fig. 3, our method demonstrates some rotation tolerance from −5° to +5° with sustainable accuracy rates, but performance drops significantly beyond this limit (samples rotated by more than 5° could be manually corrected during sample acquisition). It is also worth noticing that −1° and +1° rotations perform slightly worse than −2° and +2°; a possible explanation could be aliasing phenomena.

2) Downsampling: As stated previously, in most real world scenarios the sampling resolution will be known to a writer identification system, but not always controlled, as sometimes the data are acquired by external sources or at different times. We devised an experiment that demonstrates the behavior and limitations of our method with respect to resolution. We took the ICDAR 2011 writer identification dataset and rescaled it to various scales, from 100% down to 10%. As can be seen in Fig. 4, we obtained three measurements. The first, marked as Self Recognition Unscaled Sample, is the nearest neighbor accuracy when classifying the initial dataset with the subsampled dataset as a database. The second, marked as Nearest Neighbor Unscaled Sample, is the second nearest neighbor accuracy when classifying the initial dataset with the subsampled dataset as a database; we presume that the first nearest neighbor will always be the same sample at different scales and therefore disregard it for this measurement. The third, named Nearest Neighbor Scaled Sample, is the accuracy of the second nearest neighbor when classifying the scaled dataset with the scaled dataset as a database. The first two measurements describe the sensitivity of our method in comparing samples of different sampling resolutions, and therefore scales, while the third demonstrates how well our method would work on datasets of lower resolution. We should also point out that the optimization process was performed on the original resolution. As we expected and as can be seen in Fig. 4, our method has no tolerance for comparing samples from different sampling rates. We can also conclude that our method tolerates lower than standard resolutions, but benefits mostly from higher resolutions. The out-of-the-norm measurement in Nearest Neighbor Scaled Sample posed us with a puzzle; the most probable explanation is that it is related to aliasing, but it is worth investigating further.

Fig. 3: Rotation sensitivity. Fig. 4: Resolution/scale sensitivity. Fig. 5: Grapheme quantity sensitivity.
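The paper does not state how the rescaling of the downsampling experiment was implemented; a crude nearest-neighbor rescale such as the following sketch would reproduce the setup, though a real experiment would use proper interpolation to limit the aliasing effects mentioned above:

```python
import numpy as np

def rescale_binary(img, factor):
    """Crude nearest-neighbor rescaling of a binary image to `factor`
    of its original size (a stand-in for proper resampling)."""
    h, w = img.shape
    ys = (np.arange(int(h * factor)) / factor).astype(int)
    xs = (np.arange(int(w * factor)) / factor).astype(int)
    return img[np.ix_(ys, xs)]
```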
3) Removing Graphemes: A very important characteristic of writer identification methods is how much text is required to reach the claimed accuracy. We conducted an experiment to answer specifically this question. Our strategy was to create groups of datasets that vary only in the amount of signal (text) and then compare results on these datasets. As the primary dataset we took the ICDAR 2011 writer identification dataset, because it provides relatively large text samples. In order to quantify the available signal, we took the 2011 dataset and, for each image, produced 20 images retaining different amounts of connected components from the original image. Due to the very high locality of our feature set, the fact that we removed connected components instead of text lines should be negligible, and at the same time it gave us much finer control over the signal quantity. As can be seen in Fig. 5, the results are quite surprising: instead of a gradual drop in performance, the performance is unaffected down to 30% of the graphemes; below that point, performance drops linearly.

4) Writer vs Writing Style: We submitted an earlier version of our method to the SigWiComp2013 competition. The goal of the writer identification part of the competition is to measure the performance of writer identification systems when the handwriting style has been altered. A sample dataset was made available by the organizers of the competition. The dataset contained 55 writers contributing 3 text samples each, with each sample written in a different writing style. Having access to the sample dataset, we performed a simple experiment to determine whether our features encapsulate writer biometric information or simply the writing style. We separated the dataset of 165 samples into left and right halves. We then performed a pair matching of the left halves to the right halves based on nearest neighbor classification. We obtained two measurements: first, the percentage of left samples whose assigned right sample was written by the same writer (55 classes), and second, the percentage of left samples having the specific sample's complementary right half as the nearest neighbor (165 classes). The writer identification rate was 87.27%, while the specific sample recognition rate was 86.06%. By definition the writer identification rate is greater than or equal to the sample recognition rate. We performed a one-tailed t-test on the results of the 165 sample classifications and obtained a p-value of 0.3734, which by all standards makes the two recognition rates indistinguishable. This experiment indicates that for our method, any two samples written in different writing styles are equally different regardless of whether they were written by the same individual or not. From a forensic perspective, these measurements imply that our method does not distinguish between disguised writing style and natural writing style.
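The two rates of the pair-matching experiment can be computed from a ranking of right halves per left half; this is an illustrative sketch, and the variable names (`writer`, `pair`) are hypothetical:

```python
def pair_matching_rates(order, writer, pair):
    """Writer identification rate vs sample recognition rate for the
    left/right half pair-matching experiment.

    order[i] ranks right-half indices by similarity to left half i;
    writer[j] is the writer id of right half j (left half i shares
    writer[i]); pair[i] is the index of left half i's complementary
    right half.
    """
    n = len(order)
    writer_hits = sum(writer[r[0]] == writer[i] for i, r in enumerate(order))
    sample_hits = sum(r[0] == pair[i] for i, r in enumerate(order))
    return writer_hits / n, sample_hits / n
```

By construction the first rate can never be lower than the second, consistent with the inequality stated above.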
We obtained two untrained classifier (96.63%) to the state of the art (99.5%) measurements, first the percentage of left-samples having an is quite unfair towards our method. On the other hand, a assigned right-sample written by the same writer (55 classes), comparison of our trained classifier (98.56%) to the state of and second the percentage of left-samples having the spe- the art (99.5%) is a bit unfair towards the state of the art. In cific sample’s complementary right-half as the nearest neigh- the authors opinion, a fair comparison would be a lot closer to bor (165 classes). The writer identification rate was 87.27%, the trained classifier than to the untrained. The performance of while the specific sample recognition rate was 86.06%. By the untrained classifier demonstrates clearly the potency of our definition the writer identification rate is greater or equal to feature set. The qualitative experiments were not performed with forensics in mind, except for the last one, writer vs [4] X. Wang, T. X. Han, and S. Yan, “An hog-lbp human detector with writing style. In writer vs writing style we tried to determine partial occlusion handling,” in Computer Vision, 2009 IEEE 12th the extent to which our feature set can deal with disguising International Conference on. IEEE, 2009, pp. 32–39. writers; the quick answer is, no our method can not deal with [5] G. A. Dean, I. W. Kelly, D. H. Saklofske, and A. Furnham, “Graphology disguising writers. There are many subtleties in the conclusions and human judgment.” 1992. that can be drawn from the writer vs writing style experiment [6] A. Furnham, “Write and wrong: The validity of graphological analysis,” about what phenomena is that our features model. One could The Hundreth Monkey and Other Paradigms of the Paranormal, pp. even say that our method is more about texture similarity than 200–205, 1991. about writer similarity; assuming there are biometric features [7] A. Bensefia, T. Paquet, and L. 
Heutte, “Handwritten document analysis in handwriting, the proposed feature set does not seem to for automatic writer recognition,” Electronic letters on computer vision encapsulate them. From a software engineering perspective the and image analysis, vol. 5, no. 2, pp. 72–86, 2005. approach of treating writer identification as a distance metric [8] Z. He, X. You, and Y. Y. Tang, “Writer identification of chinese instead of a classifier [12] seems more efficient and modular, it handwriting documents using hidden markov tree model,” Pattern allows for simplification and standardization of benchmarking. Recognition, vol. 41, no. 4, pp. 1295 – 1307, 2008. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0031320307004037 The fact that the proposed features encapsulate no structural information what so ever, makes them a very good candidate [9] L. Du, X. You, H. Xu, Z. Gao, and Y. Tang, “Wavelet domain local for fusion with other feature sets. binary pattern features for writer identification,” in Pattern Recognition (ICPR), 2010 20th International Conference on. IEEE, 2010, pp. 3691– 3694. ACKNOWLEDGMENT [10] C. Bird, B. Found, and D. Rogers, “Forensic document examiners skill The first author of this paper would like to thank Georgios in distinguishing between natural and disguised handwriting behaviors,” Louloudis for his precious insights on the subject of writer Journal of forensic sciences, vol. 55, no. 5, pp. 1291–1295, 2010. identification and performance evaluation. The first author [11] M. I. Malik, M. Liwicki, L. Alewijnse, W. Ohyama, M. Blumenstein, would also like to thank Muhammad Imran Malik for his and B. Found, “Signature verification and writer identification competi- effort and assistance in the participation of this method on tions for on- and offline skilled forgeries (sigwicomp2013),” in 12th Int. the SigWiComp2013 competition. Conf. on Document Analysis and Recognition, Washigton, DC, USA, 2013, p. n.A. R EFERENCES [12] G. 
Louloudis, N. Stamatopoulos, and B. Gatos, “Icdar 2011 writer identification contest,” in Document Analysis and Recognition (ICDAR), [1] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale 2011 International Conference on. IEEE, 2011, pp. 1475–1479. and rotation invariant texture classification with local binary patterns,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, [13] G. Louloudis, B. Gatos, and N. Stamatopoulos, “Icfhr 2012 com- vol. 24, no. 7, pp. 971–987, 2002. petition on writer identification challenge 1: Latin/greek documents,” [2] T. Ahonen, A. Hadid, and M. Pietikäinen, “Face recognition with local in Frontiers in Handwriting Recognition (ICFHR), 2012 International binary patterns,” in Computer Vision-ECCV 2004. Springer, 2004, pp. Conference on. IEEE, 2012, pp. 829–834. 469–481. [14] D. Chawki and S.-M. Labiba, “A texture based approach for arabic [3] G. Zhao and M. Pietikainen, “Dynamic texture recognition using local writer identification and verification,” in Machine and Web Intelligence binary patterns with an application to facial expressions,” Pattern (ICMWI), 2010 International Conference on. IEEE, 2010, pp. 115– Analysis and Machine Intelligence, IEEE Transactions on, vol. 29, no. 6, 120. pp. 915–928, 2007.