Customized error clustering of industrial surface inspection images

Melinda Pap
pap.melinda@uni-eszterhazy.hu
Eszterházy Károly University, Eger, Hungary

Abstract

In industrial quality control, the optical surface inspection of the produced items frequently plays an important role. The typical aim of industrial visual inspection is to determine whether the produced item meets certain requirements or contains errors. In our particular dataset, due to the nature of the errors and the special properties of the construction of the surface inspection images, the error regions are in most cases no longer connected. The goal of our study is to provide a clustering process that finds a reasonable grouping of the disconnected error regions, such that the inspection of these images can be automated. By leaving out human supervision, we can reduce uncertainty and inconsistency, and accelerate the inspection. Furthermore, our aim was to find a computationally efficient method that does not require high-level domain knowledge and is able to extract the required information only from the analysed images themselves. We set up a clustering pipeline, select the proper methods for preprocessing, experiment with several available clustering methods (SLINK [FLP+51], DBSCAN [EpKSX96], TURN* [FZ02], RDG [PP05] and CTS [FJ02]) and observe the possibilities of automating the parameter selection. We evaluate the clustering methods with the aid of external validity indices, but we also propose an internal validity index for the case when no ground truth is available [HBV01].

1 Introduction

In industry, inspection is the process of determining if a product deviates from a given set of specifications. It is used, for instance, to determine if the quality of the product is acceptable for further usage, to guide the process stages, to collect statistical information, and to sort the final products according to their quality or appearance.
Examples of industrial visual inspection applications can be found in [MTPL13, BKJJ13]. This task can be divided into several steps, illustrated in Fig. 1, where each step checks a specific criterion [DSAW+99].

Copyright © by the paper's authors. Copying permitted for private and academic purposes. In: E. Vatai (ed.): Proceedings of the 11th Joint Conference on Mathematics and Computer Science, Eger, Hungary, 20th – 22nd of May, 2016, published at http://ceur-ws.org

Figure 1: Steps of a typical industrial visual inspection task

In conventional industrial surface inspection processes, the image of the observed product has to undergo some preprocessing steps before it is analysed by image processing tools. In most cases it is known which image areas need to be analysed, thus they can be isolated from the rest of the image. This restriction results in image information of reduced size that accelerates the further processing. A specific isolation process of objects is the extraction of regions of interest (ROIs). In some cases an ideal template of the product is stored for verification. This template is not necessarily a real image taken from a physically existing "master piece", but can be a computer-generated one. If such information is available, it simplifies the task of ROI extraction to a simple image arithmetic operation, i.e. template matching or image subtraction [NJ95]. The next step is to preprocess our dataset; this frequently means noise removal and dimensionality reduction. The next stage is to segment the images to obtain the objects present in them. Depending on our task, we either stop here, if our goal was only to find objects, or continue to support a classification task by feature selection. The latter is most frequently used when it comes to quality assurance, where we want to sort and differentiate the products or decide whether to accept or reject them.
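The subtraction-based ROI extraction mentioned above can be sketched as follows. This is a minimal illustration with NumPy; the toy 4×4 arrays and the threshold of 30 gray levels are hypothetical choices, not values from our pipeline:

```python
import numpy as np

def deviation_image(master, sample, threshold=30):
    # Pixels whose gray value differs from the template by more than
    # `threshold` levels are marked as potential error pixels (ROIs).
    diff = np.abs(master.astype(np.int16) - sample.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# toy example: the sample deviates from the master in a single pixel
master = np.zeros((4, 4), dtype=np.uint8)
sample = master.copy()
sample[1, 2] = 200
roi = deviation_image(master, sample)  # roi[1, 2] == 1, all others 0
```

In practice the subtraction is preceded by an alignment step; as discussed later, imperfect alignment is one source of pseudo-errors.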
It is important to note that digital image processing in many applications still depends on human experts. A human perceives most information through the sense of vision, and visual inspection has undoubtedly been used in industry since the very beginning, long before computers. There are still tasks that can only be solved effectively by humans. One example of such a case is the task of perceptual grouping. The human visual system can detect many classes of patterns and statistically significant arrangements of image elements. However, image processing tasks performed by human operators are more sensitive to errors, more subjective and dependent on the experience of the operator. Furthermore, automated inspection allows objects to be inspected in environments unsafe for people or where production is done at high speed (no need to stop the product line and position the piece for inspection). Moreover, automated inspection often results in lower costs and improved and more consistent quality [NJ95].

Several papers have been published in the field of industrial image processing. Some of them aim at fault detection using a template image for the ROI extraction [BKJJ13, CP16]. However, only a few deal with further clustering or segmentation of the extracted ROIs [EHL+10, LGW+17].

The rest of this paper is organized as follows. First the problem is introduced in detail, then our solution proposal is explained, which involves the pre-processing and clustering steps. Afterwards we provide a parameter selection method that is one of the contributions of this paper. The results and discussion are presented in Section 4, where multiple clustering methods are evaluated. Finally, conclusions and further work are outlined in Section 4.4.

2 Problem statement

Deviation images (difference images) are constructed based on a stored template: a master image (reference image, template) (see Fig.
2(a)) that represents the ideal look of the type of the item that is inspected, and a sample image (current image) that is the image of the currently observed item that we would like to analyse (see Fig. 2(b)). The deviation image is constructed by the subtraction of the master and the sample image, and thus contains regions where the sample image deviates from the master image (see Fig. 2(c)). The deviation image construction is a widely used method in print quality inspection and PCB defect detection [DSAW+99, ANDAM11]. In the case of gray-scale images, the result of the deviation image construction is a gray-scale image as well, where black pixels represent the correct areas and bright pixels the regions where the master and the sample image deviate from each other. Hence, the bright (non-black) areas represent the regions of interest (ROIs), denoting potential error regions. Since all deviating areas are detected, the bright pixels can represent either real defects or pseudo-errors, i.e. artefacts caused by varying illumination conditions or improper image alignment.

(a) Master image (b) Sample image (c) Deviation image
Figure 2: Example of deviation image construction.

The set of images used in this paper comes from an industrial product line. It consists of 660 high-resolution (12264 × 9406, ≈ 3m) gray-scale deviation images of the produced items. The sample images were taken at the final stage of the production, with cameras placed above the conveyor belt, then the deviation images were constructed immediately. The produced items in this particular case are transparent disks with a black, bar-code-like pattern painted on them. This pattern is circular and consists of bars of varying lengths and widths. An example can be found in Fig. 3.

Figure 3: Slice of a sample image on which the bar-code-like pattern can be observed.

It can happen that after constructing the deviation images, the contained ROIs are no longer continuous.
There can be several reasons behind this phenomenon, e.g. the improper alignment of the master and the sample image during the deviation image construction, inaccurate threshold selection, or, in our particular case, the uncommon pattern, texture or colour of the master image. Examples of this phenomenon can be seen in Fig. 4. In this scenario the company wants to know the real size of the defective areas, such that they can base their decision upon this information, e.g. reject the disk if the size of the defective area exceeds a certain threshold. We can see that in most cases the real size of the defects is greater than what appears on the deviation images. Thus, our aim is to find which objects belong to one error region and thereby support such decision making.

3 Our approach

Our aim is to perform the grouping of the non-continuous bright pixel segments on the deviation images without any knowledge about the master or the sample images. This would lead us to a more flexible and more widely applicable solution that can be used in other applications with different types of items and data as well. The first step is to apply preprocessing on the dataset, such as removing noise, reducing the dimensionality of the raw data, and selecting relevant features that are descriptive enough to represent our dataset. The next step is to find the best grouping method. In order to achieve this, we examined several clustering algorithms that could perform the grouping on our data. The clustering algorithms that suit our needs best will be evaluated on our dataset. For this purpose we analysed the available clustering evaluation techniques as well and performed the evaluation of the results of the selected clustering algorithms.

(a) Pores (b) Pores on the deviation image (c) Remaining paint (d) Remaining paint on the deviation image
Figure 4: Example of errors and their appearance on the deviation images.
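The core difficulty can be shown with a minimal sketch, assuming a hypothetical toy binarized deviation image `dev`: plain 8-connected component labelling (the behaviour of the CC baseline discussed later) keeps the fragments of one physical defect in separate groups:

```python
import numpy as np
from scipy import ndimage

# Hypothetical binarized deviation image: two bright fragments that
# belong to the same physical defect, separated by a dark gap.
dev = np.array([[1, 1, 0, 0, 0],
                [1, 1, 0, 0, 0],
                [0, 0, 0, 1, 1],
                [0, 0, 0, 1, 1]], dtype=np.uint8)

# 8-connectivity labelling: plain connected components keep the two
# fragments apart, which is what the clustering step has to compensate for
labels, n = ndimage.label(dev, structure=np.ones((3, 3)))
# n == 2 here, although the fragments form a single error region
```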
Our task here does not include the actual classification of ROIs; we do not need to conclude which error type a ROI corresponds to. However, the correct clustering is necessary to maintain information (e.g. the characterization of clusters = objects by features such as area and perimeter) that can be used for a classification process at a later stage.

3.1 Ground truth construction

We extracted 60 images from the complete dataset (cut out regions from deviation images that contain ROIs of all kinds) that will be used as "training" data. Due to the high resolution, the number of bright pixels in these images varies in the range of 500 to 600000. This subset of the complete dataset will be used to derive some conclusions about the type of ROIs and cluster structures. Moreover, the ground truth dataset is not only good for evaluation, but can also be used to analyse the expected cluster structures. Table 1 presents some general information about the training data for which the ground truth is available. From the analysis of the ground truth files we can conclude that we are looking for clusters of arbitrary shapes and varying sizes. Furthermore, in our project we expect clustering algorithms to assume that pixels that belong to the same object of an image (connected component) should belong to the same cluster as well.

Figure 5: Examples of ground truth labelled matrices in image form. The different colours represent the different labels.

3.2 Pre-processing

First of all, we decided to binarize the gray-scale images. This will reduce the computational costs, since we will operate on 2-dimensional logical matrices. The acceleration of clustering is highly important when it has to be applied during the production process.

                                  Possible range per image [Min, Max]
  Number of points                        551        540008
  Number of clusters                        3           112
  Average cluster size                   55.1       50724.3
  Length of gap within a cluster            1           110

Table 1: General information about the training data derived from the ground truth files. The possible ranges of the number of points, the number of clusters and the average cluster sizes per image in pixels. The last row shows the range of the length of the possible discontinuity within a cluster, i.e. the shortest distance between two objects that still belong to the same cluster.

Furthermore, since our goal does not include the classification of errors, the gray-level information does not carry significantly more information. As already mentioned before, it is often the case that images contain noise. In our case we also have to deal with the problem of pseudo-errors. After observing the deviation images, we concluded that the noise and the Moiré-pattern type of pseudo-errors normally appear on the images as small bright areas. We decided to remove them by applying morphological filtering (morphological opening followed by closing). This removes objects from the image that are smaller than the structuring element (in our case: a 2×2 matrix), while distorting the other objects as little as possible. The result of this operation can be seen in Fig. 6.

(a) Binarized deviation image containing pseudo-errors. (b) Binarized deviation image after applying morphological filtering.
Figure 6: Morphological filtering for pseudo error removal.

3.3 Clustering algorithm selection

When we talk about a grouping process in the context of machine vision, we refer to the task of clustering. Clustering is the task of grouping a set of objects (image pixels) in such a way that objects in the same group (called a cluster) are more similar (with respect to some characteristic) to each other than to those in other groups (clusters) [GMW07, TSK05]. Thus, its principle is similar to the perceptual grouping of humans. Cluster analysis groups data points based only on the information found in the data that describes the objects and their relationships.
It also helps to discover the distribution of patterns and interesting correlations in large data sets [JMF99]. Cluster analysis is sometimes also referred to as unsupervised classification, as it creates a labelling of data points with class (cluster) labels. It is unsupervised, since there are no predefined classes and no examples that would show what kind of desirable mapping between features and classes should be valid among the data.

There are thousands of clustering algorithms available, because they mainly differ in their notion of a cluster. We defined that we are looking for clusters with arbitrary shapes and varying sizes. We can assume that we need algorithms that do not require the number of clusters as an input parameter and can handle large datasets in a short time, which is essential if a method is used during industrial production. In Table 2 we compared several clustering algorithms in order to see which of them suits our aforementioned needs.

  Algorithm               Type              Arbitrary   Non-uniform   Noise       Approx. time
                                            shapes      density                   complexity
  k-means [Mac67]         Prototype based   No          No            No          O(n)
  CLARANS [NHS05]         Prototype based   No          No            No          >= O(kn^2)
  2-MSTClus [ZMW10]       Graph based       Yes         Yes           No          O(n^2)
  RDG [PP05]              Graph based       Yes         Yes           Yes         O(n^2)
  AUTOCLUST [ECL00]       Graph based       Yes         Partially     Partially   O(n log n)
  TRICLUST [LNS08]        Graph based       Yes         Yes           Yes         O(n log n)
  SLINK [FLP+51]          Hierarchical      Yes         No            No          O(n^2)
  BIRCH [ZRL96]           Hierarchical      No          No            Yes         O(n)
  CURE [GRS98]            Hierarchical      No          No            Yes         O(n^2 log n)
  Chameleon [KHK99]       Hierarchical      Yes         Partially     No          > O(n log n)
  DBSCAN [EpKSX96]        Density based     Yes         No            Yes         O(n log n)
  TURN* [FZ02]            Density based     Yes         No            Yes         O(n log n)
  ADACLUS [NLS08]         Density based     Yes         Yes           Partially   O(n)
  FCM [Bez81]             Fuzzy prototype   No          No            Yes         O(n)
  DifFUZZY [CMS+10]       Fuzzy graph       Yes         No            No          >= O(n^2)
  WaveCluster [SCZ98]     Grid based        Yes         No            Yes         O(n)
  DENCLUE 2.0 [HG07]      Grid based        Yes         No            Yes         O(n log n)
  BMCA [PZLB93]           Morphological     Yes         No            Yes         O(n^2)
  Genetic k-means [PS12]  Search based      No          No            No          O(n)
  SVC [NS06]              Search based      Yes         No            Yes         < O(n^2)

Table 2: Comparison of some clustering algorithms, where n is the number of data points in the dataset and k is the number of clusters. The middle three columns show whether the algorithm discovers clusters with arbitrary shapes, handles non-uniform density, and handles noise.

We selected some of the available clustering algorithms for our experiments based on their complexity, sensitivity to their input parameters and whether they are capable of discovering clusters with arbitrary shapes and sizes. Hence, the following clustering algorithms were selected:

1. CC: The connected components algorithm with 8-connectivity neighbourhood, which can extract the continuous pixel regions. This method is already used in our specific industrial production line (however, with insufficient performance), thus we use its results as the baseline in our experiments.

2. CCM: The connected components algorithm extended by morphological preprocessing.
This is a morphology-based clustering method that operates by applying morphological closing, with a disc-shaped structuring element, on the image before extracting the connected components in it. This is a simplified version of the BMCA [PZLB93] algorithm, since in this case the dataset is already 2-dimensional and binary. This method has one parameter r, the radius of the structuring element.

3. RDG [PP05]: The Reduced Delaunay Graph algorithm is a graph-based clustering approach that constructs a Delaunay graph of the point set and cuts the edges if they exceed a certain threshold tRDG. This threshold can be found by sorting the edges of the Delaunay graph according to their weights in descending order and plotting this sorted list against its rank. The knee-point of this graph will correspond to the most appropriate value for tRDG, as proposed in [EHL+10].

4. SLINK [FLP+51]: The single-link hierarchical clustering method that constructs a clustering hierarchy. It starts with each point considered as a cluster, and in each step the two closest clusters are merged into a new one. The algorithm stops when all points are merged into one cluster. We used the implementation provided by the authors of [EHL+10] that only calculates the cluster hierarchy until the similarity of the currently merged clusters is above a threshold called cutOff.

5. DBSCAN [EpKSX96]: A density-based clustering method that is based on the assumption that the clusters are dense regions in the data space, separated by regions of lower density. Each cluster is a maximal set of density-connected points. Points not connected are considered outliers. It has two parameters: Eps, which determines a radius around a point p, and MinPts, the minimum number of points required within the Eps radius of the point p in order to call it a core point. The authors of [EpKSX96] proposed an algorithm that helps the parameter selection.
It works in such a way that if we set MinPts to a fixed value, we can find the best value for Eps automatically with the help of a k-distance graph.

6. TURN* [FZ02]: A density-based clustering method that operates on multiple resolutions of the data and requires no input parameters. It consists of an overall algorithm called TURN and two component algorithms: TURN-RES and TurnCut. The first is an efficient resolution-dependent clustering algorithm that returns both a clustering result and global statistics corresponding to that result. The latter is an automatic method for finding the optimum resolutions from a set of resolution results from TURN-RES. TurnCut detects a change (knee-point) in the third differential of the series of statistics returned across resolutions, built by repeated calls to TURN-RES by the TURN* algorithm.

7. CTS [FJ02]: The Connected-triple based similarity method is a link-based consensus clustering algorithm that analyses the co-occurrence of points in the same cluster among the partitions of the cluster ensemble. It uses single-link clustering in its consensus function. CTS has three parameters: the number of partitions in the ensemble m, the decay factor DC, and a threshold t that is used in the consensus function. According to the authors of [IoG10], a suitable value of m is in the range [10, 30], and setting the value of DC to 0.8 usually leads to good results.

3.4 Resampling

It follows from the nature of our data that we prefer clustering algorithms that group two pixels into one cluster if they belong to the same object (connected component) of an image. Hence, the calculation of distances between points that are contained in the same object is not always necessary; e.g. in the case of graph-based clustering methods we only wish to obtain the edges between objects and not within them. Thus, we can apply some preprocessing steps in order to reduce the amount of data points to be processed.
First we obtain the one-pixel-wide boundaries of the objects, which is achieved by applying morphological boundary extraction. Then we extract the set of connected components Dcon = {C1, C2, ..., Cm} of the resulting image, where m is the number of objects. In this case the set will contain the m connected boundaries of the original objects, i.e. the object representatives. After obtaining the set of object representatives Dcon, the graphs used by the clustering algorithms are created by analysing the distances between the pairs of object representatives Ci and Cj.

We have already mentioned the consensus clustering methods proposed in the literature. The core idea of such methods is to aggregate several input data clusterings to generate a single output clustering; thus the first step of a consensus clustering method is to generate a cluster ensemble that contains several partitions of the original dataset. Most of the consensus clustering methods that use point co-occurrence as their consensus function only require the set of cluster labels associated with each point in the dataset and do not need the coordinates of the points to generate the output clustering. If we use base clustering algorithms that do not violate our definition of the expected cluster structure, then we can expect that two points that belong to the same connected component of an image will belong to one cluster in any of the partitions of the cluster ensemble. In this case, we can conclude that we do not need to consider the co-occurrence of two points that belong to the same object, because those will always belong to the same cluster. Hence, we can significantly reduce the number of points used in the consensus function by not using all the points of a connected component, but only one.
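The boundary-based resampling step above can be sketched as follows. This is a minimal illustration using scipy.ndimage; the 7×7 test image is hypothetical:

```python
import numpy as np
from scipy import ndimage

def object_representatives(binary_img):
    # Morphological boundary extraction: an object minus its erosion
    # leaves its one-pixel-wide boundary; labelling the boundaries
    # (8-connectivity) then yields the m object representatives.
    img = binary_img.astype(bool)
    boundary = img & ~ndimage.binary_erosion(img)
    labels, m = ndimage.label(boundary, structure=np.ones((3, 3)))
    # pixel coordinates of each of the m representatives
    return [np.argwhere(labels == i + 1) for i in range(m)]

# a filled 5x5 square: its representative is the 16-pixel boundary ring
square = np.zeros((7, 7), dtype=np.uint8)
square[1:6, 1:6] = 1
reps = object_representatives(square)  # one object, 16 boundary pixels
```

Distances between objects can then be computed on these much smaller point sets instead of on all object pixels.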
3.5 Parameter selection

A general approach that can be used with any type of clustering algorithm to find the best partitioning of the dataset automatically is based on the use of an evaluation metric, e.g. any internal or external cluster validity index. This approach requires multiple runs of the specific clustering algorithm with varying parameter settings. Then the resulting partitions are rated based on a specific evaluation metric, and the one that has the best rating is returned as the final result [JD88, HBV01]. However, in most cases the parameters of clustering algorithms can vary in a wide range, thus it might be necessary to run one clustering method hundreds or thousands of times, which makes this approach extremely inefficient, especially if the cluster validation algorithm itself has high complexity. Here we would like to propose an alternative method to find the best parameters of some clustering algorithms without requiring them to run multiple times.

Several clustering algorithms, e.g. graph-theoretical and hierarchical clustering algorithms, are based on the assumption that points within a cluster are closer to each other than to any point not included in that cluster. The distance is the measure of dissimilarity. This statement usually holds for connectivity-based clusters with even density. These algorithms search for the clusters iteratively until a stopping criterion is fulfilled. One such stopping criterion can be a distance threshold, such that if the distance between two points exceeds this threshold, they cannot be merged into one cluster. Other algorithms, like morphological clustering, can use the distance information to size the structuring element. The diameter of the structuring element determines the shortest distance between two clusters that can be merged into one new cluster (although the merge also depends on the shape of the structuring element).
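The distance-threshold stopping criterion can be illustrated with single-link clustering cut at a fixed distance. This is a sketch using SciPy, not the implementation of [EHL+10]; the points and the threshold value are made up:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical object-representative points: two groups whose gap is
# wider than the distance threshold used as the stopping criterion.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])

threshold = 3.0  # illustrative value, not a tuned parameter
Z = linkage(pdist(points), method='single')
# stop merging as soon as the single-link distance exceeds the threshold
labels = fcluster(Z, t=threshold, criterion='distance')
# two clusters remain: crossing the gap would require a merge at a
# distance far above the threshold
```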
Such a distance threshold can also be the parameter of the previously mentioned DBSCAN algorithm, as it can determine the value of the parameter Eps. We decided to search for such a distance threshold with the help of a relative neighbourhood graph (RNG). This can be calculated in O(n log n) time. The steps of obtaining a proper distance threshold based on the RNG of images with continuous pixel regions are the following (illustrated in Fig. 7), where m is the number of objects (connected components):

1. Apply re-sampling by morphological boundary extraction, then extract the connected components. This results in a set that contains the m object representatives of the original objects, Dcon = {C1, C2, ..., Cm}.

2. Calculate the m×m proximity matrix Pm, in which the elements Pm(i, j) correspond to the shortest distances between object representatives Ci and Cj.

3. Calculate the RNG from the proximity matrix Pm.

4. Sort the edge lengths and use the L-method introduced in [SC04] to calculate the knee-point of the resulting curve. Return the found knee-point as the distance threshold.

We used this distance threshold as the parameter for the CCM, SLINK and DBSCAN clustering algorithms in our experiments.

4 Evaluation

In this section we compare the results of the clustering algorithms selected previously: CC, CCM, SLINK, RDG, DBSCAN, TURN* and CTS. We evaluate the results of the clustering algorithms with the help of the available ground truth files mentioned before. We determine the similarity between the ground truth and the clustering result by using the Jaccard Coefficient (J ↑) [HBV01] external validity index, which can be considered as a percentage of similarity.
It is calculated as follows:

    J ↑ = a / (a + b + c)    (1)

where a is the number of point pairs that belong to the same cluster in the ground truth as well as in the clustering result, b is the number of point pairs that belong to the same cluster in the ground truth but to different ones in the clustering result, and c is the number of point pairs that are in different clusters in the ground truth but in the same cluster in the clustering result.

(a) Original binary image (b) Object representatives marked by different colours (c) RNG of the object representatives (d) Sorted RNG edge list (a suitable knee-point is at 4.24)
Figure 7: The steps of obtaining the RNG graph of an image with continuous pixel regions. By the analysis of the list of edges, a suitable threshold to cut the graph and gain the four clusters is at 4.24.

4.1 Experimental setup

In our experiments we used a computer with a 64-bit Windows Server 2012 operating system, an Intel Duo CPU at 2.80 GHz and 31 GB of memory. The clustering algorithms CCM, SLINK, DBSCAN and RDG are implemented in Matlab and were provided by the authors of [EHL+10]. We implemented the TURN* algorithm in C#. Furthermore, we used the built-in CC algorithm of Matlab. The Matlab package (LinkCluE) containing the consensus clustering method CTS was provided by the authors of [IOBG08].

4.2 Results of clustering

First of all, we compared how well the automated parameter selection method described in Sec. 3.5 performed compared to the evaluation metric based iterative approach. We ran the three chosen algorithms (CCM, SLINK, DBSCAN) with varying parameters and evaluated them by the aforementioned external validity index J ↑. In the case of CCM we searched for the best value of r in the range [5, 110] with step size 5. We used the same range and step size for the parameter cutOff of the SLINK algorithm and for the parameter Eps of the DBSCAN algorithm.
The parameter MinPts of the latter algorithm was searched for in the range [0.5 · Eps, 12 · Eps]. In Table 3 we compare the results of the external evaluation metric based iterative approach and the automated parameter selection supported clustering. According to the Wilcoxon signed-rank test [Wil45], there is no significant difference between the results of one clustering algorithm using iterative and automatic parameter selection (p > 0.05 in all three cases).

  Clustering method   Iterative approach   Automated parameter selection
  CCM                 91.23 ± 16.57%       87.34 ± 19.50%
  SLINK               93.21 ± 13.40%       89.19 ± 18.06%
  DBSCAN              93.68 ± 12.86%       88.37 ± 19.52%

Table 3: The average quality of clustering with standard deviation. First, the best results of iterative parameter selection, then of automatic parameter selection (described in Sec. 3.5).

We compared the results of the other clustering algorithms as well. The TURN* algorithm does not require any input parameters, and we used the RDG with its built-in parameter selection method. In the case of the CTS method we used DC = 0.8, as proposed by the authors of the algorithm, and we set the number of ensembles m to 10. We used 5 SLINK and 5 CCM methods for the cluster ensemble generation and selected the corresponding input parameters by taking m equidistant points from the range [5, 60]. In this comparison we used the algorithms CCM, SLINK and DBSCAN with the automated parameter selection, which we indicate by a subscript A: CCM_A, SLINK_A and DBSCAN_A.

  Clustering algorithm   J ↑
  CC                     42.49 ± 31.75%
  CCM_A                  87.34 ± 19.50%
  RDG                    82.59 ± 23.42%
  SLINK_A                89.19 ± 18.06%
  DBSCAN_A               88.37 ± 19.52%
  TURN*                  83.74 ± 21.36%
  CTS                    85.54 ± 21.21%

Table 4: Final results of the clustering algorithms. The average quality and standard deviation of the results are shown; the best result is that of SLINK_A. The subscript A indicates that the clustering algorithm was supported by our automated parameter selection method.

In Table 4
the average results of the clustering algorithms are shown with standard deviation. We can see that SLINK_A achieved the highest accuracy with 89.19%. However, according to the Wilcoxon signed-rank test, there is no significant difference between the results of the CCM_A, DBSCAN_A and SLINK_A algorithms (p > 0.05 in all three cases).

The corresponding runtimes of the clustering algorithms are shown in Table 5. According to the Wilcoxon signed-rank test [Wil45], there are significant differences between the runtimes of the algorithms. We can see that even though DBSCAN_A performed with high accuracy on the dataset, it is almost the worst in this respect. On the other hand, the CCM_A and SLINK_A algorithms run similarly fast, but there is still a significant difference (p = 0.013) in their runtimes. The average runtime of the CCM_A algorithm for the complete dataset is 00:01:43, while it is 00:02:41.84 for the SLINK_A algorithm. Hence, we can conclude that the CCM_A algorithm is the best in terms of runtime. Thus the CCM_A method is the best method to choose, since it is not significantly worse than the algorithm with the highest accuracy (SLINK_A), but it is significantly faster.

4.3 Internal evaluation

In our experiments we used external cluster validity indices to evaluate the clustering results. Unfortunately, these require the existence of ground truth files. Our motivation was to find an internal validity index that approximates well the actual accuracy of the clustering result even when no ground truth data is available any more. Since we know that our clusters can appear in different shapes and sizes, we focused mainly on the indices that are capable of evaluating such a dataset. We found that the graph-based internal validity indices described in [Dun73, Ilc12, PB97] were unable to handle datasets of this size, unlike the indices Silhouettes (Sil ↑) [Rou87] and Davies-Bouldin (DB ↓) [DB79].
The Sil ↑ value varies in the range [−1, 1]; the greater the index, the better. We adjusted the indices to the [0, 1] scale to be able to compare them to the used external validity index J ↑. For the DB ↓ we know that the lower the index, the better the clustering result; hence, we negated the values. We also know that although the upper bound of this index is known to be 0 in this case, there is no fixed lower bound. Thus we took the minimum value of this index on this dataset as the lower bound, then adjusted it to the [0, 1] scale. The average deviation of the Sil ↑ index from the J ↑ was 25%, while that of the DB ↓ was 14%.

               CPU runtime with pixel number:
  Algorithm    551            17593          540008
  CC           00:00:00.003   00:00:00.138   00:00:00.183
  CCM_A        00:00:19.95    00:09:10.14    00:22:25.29
  RDG          00:00:00.91    00:14:52.02    -
  SLINK_A      00:00:00.13    00:00:27.50    00:52:24.50
  DBSCAN_A     00:00:01.82    05:24:56.97    18:02:02.00
  TURN*        00:00:00.16    00:01:45.86    24:22:03.00
  CTS          00:00:45.94    00:48:03.41    00:50:12.13

Table 5: The elapsed CPU times of clustering algorithms run on datasets of different sizes (the sizes are given in pixels). The blank cell indicates that the algorithm was not able to handle the corresponding dataset. The TURN* algorithm explored 6 resolutions. The subscript A indicates that the clustering algorithm was supported by our automated parameter selection method.

5 Conclusions

The goal of our project was to provide a clustering method that finds a reasonable grouping of the disconnected error regions in our set of industrial images, such that the inspection of these images can be automated. Furthermore, our aim was to find a method that does not require high-level domain knowledge and is able to extract the required information only from the analysed images themselves. Since we would like to support the industry with the found clustering method, another important criterion was that the clustering method has to be computationally efficient, such that it can be applied to product lines in real time.
We managed to achieve these goals with 87.34% clustering accuracy by the CCM clustering method combined with our automatic parameter selection method (CCM_A). Although the SLINK_A method achieved 89.19% accuracy on our training data, there is no significant difference between the results of the CCM_A and SLINK_A methods, and the former is significantly faster than the latter. With our automatic parameter selection method we avoided running the clustering methods multiple times on one image to find the best parameters; therefore, the computation time did not increase significantly in any of the cases. Furthermore, with this method we approximated the best parameter values well, such that the results came close to the real optimal solutions (without using any external information on the true cluster structure). We used two types of cluster validity measures: external and internal cluster validity indices. The former were used to compare how well the results of the applied clustering algorithms approximated the true cluster structures, using pre-labelled images as ground truth files. This was important for finding the clustering algorithm that performs best on our dataset. However, the ground truth files are only available for a subset of the images. The second type of validity measures are the internal validity indices, which can be used at a later stage when no ground truth file is available any more. We found the DB ↓ index to be the closest to the external index, with 14% average deviation. This index can be used later, along with the CCM_A clustering method, to give an impression of the quality of the clustering result.

6 Acknowledgements

The research was supported by the grant EFOP-3.6.1-16-2016-00001 "Complex improvement of research capacities and services at Eszterhazy Karoly University".

References

[ANDAM11] Eckhard Ammann, Ismael Navas-Delgado, and José F. Aldana-Montes. A semantic-supported knowledge development conception.
In Proceedings of the World Congress on Engineering, WCE 2011, London, UK, 6–8 July 2011, volume 2. Newswood Limited, 2011.

[Bez81] James C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Norwell, MA, USA, 1981.

[BKJJ13] Csaba Benedek, Olivér Krammer, Mihály Janóczki, and László Jakab. Solder paste scooping detection by multilevel visual inspection of printed circuit boards. IEEE Transactions on Industrial Electronics, 60(6):2318–2331, 2013.

[CMS+10] Ornella Cominetti, Anastasios Matzavinos, Sandhya Samarasinghe, Don Kulasiri, Sijia Liu, Philip Maini, and Radek Erban. Diffuzzy: a fuzzy clustering algorithm for complex datasets. International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 1(4):402–417, 2010.

[CP16] Ssu-Han Chen and Der-Baau Perng. Automatic optical inspection system for IC molding surface. Journal of Intelligent Manufacturing, 27(5):915–926, 2016.

[DB79] David L. Davies and Donald W. Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, (2):224–227, 1979.

[DSAW+99] Christian Demant, Bernd Streicher-Abel, Peter Waszkewitz, Michaela Strick, and Gary Schmidt. Industrial Image Processing: Visual Quality Control in Manufacturing. Springer, Berlin, Heidelberg, Paris, 1999.

[Dun73] Joseph C. Dunn. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics, 3(3):32–57, 1973.

[ECL00] Vladimir Estivill-Castro and Ickjai Lee. AUTOCLUST: Automatic clustering via boundary extraction for mining massive point-data sets. In Proceedings of the 5th International Conference on GeoComputation, pages 23–25, 2000.

[EHL+10] Christian Eitzinger, Wolfgang Heidl, Edwin Lughofer, Stefan Raiser, Jim E. Smith, Muhammad Atif Tahir, Davy Sannen, and Hendrik Van Brussel. Assessment of the influence of adaptive components in trainable surface inspection systems. Machine Vision and Applications, 21(5):613–626, 2010.
[EpKSX96] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pages 226–231. AAAI Press, 1996.

[FJ02] Ana L. N. Fred and Anil K. Jain. Data clustering using evidence accumulation. In Proceedings of the 16th International Conference on Pattern Recognition, volume 4, pages 276–280. IEEE, 2002.

[FLP+51] K. Florek, J. Lukaszewicz, J. Perkal, Hugo Steinhaus, and S. Zubrzycki. Sur la liaison et la division des points d'un ensemble fini. In Colloquium Mathematicum, volume 2, pages 282–285. Institute of Mathematics, Polish Academy of Sciences, 1951.

[FZ02] Andrew Foss and Osmar R. Zaïane. A parameterless method for efficiently discovering clusters of arbitrary shape in large datasets. In ICDM, pages 179–186. IEEE Computer Society, 2002.

[GMW07] Guojun Gan, Chaoqun Ma, and Jianhong Wu. Data Clustering: Theory, Algorithms, and Applications. SIAM, 2007.

[GRS98] Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. CURE: an efficient clustering algorithm for large databases. SIGMOD Record, 27(2):73–84, June 1998.

[HBV01] Maria Halkidi, Yannis Batistakis, and Michalis Vazirgiannis. On clustering validation techniques. Journal of Intelligent Information Systems, 17:107–145, 2001.

[HG07] Alexander Hinneburg and Hans-Henning Gabriel. DENCLUE 2.0: Fast clustering based on kernel density estimation. In Proceedings of the 7th International Conference on Intelligent Data Analysis, IDA'07, pages 70–80, Berlin, Heidelberg, 2007. Springer-Verlag.

[Ilc12] Nejc Ilc. Modified Dunn's cluster validity index based on graph theory. (2), 2012.

[IOBG08] Natthakan Iam-On, Tossapon Boongoen, and Simon Garrett. Refining pairwise similarity matrix for cluster ensemble problem with cluster relations. In Discovery Science, pages 222–233. Springer, 2008.

[IoG10] Natthakan Iam-on and Simon Garrett. LinkCluE: A MATLAB package for link-based cluster ensembles. Journal of Statistical Software, 36(9):1–36, 2010.
[JD88] Anil K. Jain and Richard C. Dubes. Algorithms for Clustering Data. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.

[JMF99] A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: A review. ACM Computing Surveys, 31(3):264–323, 1999.

[KHK99] George Karypis, Eui-Hong (Sam) Han, and Vipin Kumar. Chameleon: Hierarchical clustering using dynamic modeling. Computer, 32(8):68–75, 1999.

[LGW+17] Xiaoqing Li, Bin Gao, Wai Lok Woo, Gui Yun Tian, Xueshi Qiu, and Liangyong Gu. Quantitative surface crack evaluation based on eddy current pulsed thermography. IEEE Sensors Journal, 17(2):412–421, 2017.

[LNS08] Dongquan Liu, Gleb V. Nosovskiy, and Olga Sourina. Effective clustering and boundary detection algorithm based on Delaunay triangulation. Pattern Recognition Letters, 29(9):1261–1273, July 2008.

[Mac67] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In L. M. Le Cam and J. Neyman, editors, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. University of California Press, 1967.

[MTPL13] Miha Možina, Dejan Tomaževič, Franjo Pernuš, and Boštjan Likar. Automated visual inspection of imprint quality of pharmaceutical tablets. Machine Vision and Applications, 24(1):63–73, 2013.

[NHS05] Raymond T. Ng and Jiawei Han. CLARANS: A method for clustering objects for spatial data mining. IEEE Transactions on Knowledge and Data Engineering, pages 1003–1017, 2005.

[NJ95] T. S. Newman and A. K. Jain. A survey of automated visual inspection. Computer Vision and Image Understanding, 61(2):231–262, March 1995.

[NLS08] Gleb V. Nosovskiy, Dongquan Liu, and Olga Sourina. Automatic clustering and boundary detection algorithm based on adaptive influence function. Pattern Recognition, 41(9):2757–2776, 2008.

[NS06] J. Saketha Nath and Shirish Krishnaj Shevade. An efficient clustering scheme using support vector methods. Pattern Recognition, 39(8):1473–1480, 2006.

[PB97] N. R. Pal and J. Biswas. Cluster validation using graph theoretic concepts. Pattern Recognition,
30(6):847–857, 1997.

[PP05] Giuseppe Papari and Nicolai Petkov. Algorithm that mimics human perceptual grouping of dot patterns. In Massimo De Gregorio, Vito Di Maio, Maria Frucci, and Carlo Musio, editors, BVAI, volume 3704 of Lecture Notes in Computer Science, pages 497–506. Springer, 2005.

[PS12] Piyaphol Phoungphol and Inthira Srivrunyoo. Boosting-genetic clustering algorithm. In ICMLC, pages 1218–1223, 2012.

[PZLB93] J.-G. Postaire, R. D. Zhang, and C. Lecocq-Botte. Cluster analysis by binary morphology. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(2):170–180, February 1993.

[Rou87] Peter Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20(1):53–65, 1987.

[SC04] Stan Salvador and Philip Chan. Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms. In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), pages 576–584. IEEE, 2004.

[SCZ98] Gholamhosein Sheikholeslami, Surojit Chatterjee, and Aidong Zhang. WaveCluster: A multi-resolution clustering approach for very large spatial databases. pages 428–439, 1998.

[TSK05] P. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison-Wesley, 2005.

[Wil45] Frank Wilcoxon. Individual comparisons by ranking methods. Biometrics Bulletin, 1(6):80–83, 1945.

[ZMW10] Caiming Zhong, Duoqian Miao, and Ruizhi Wang. A graph-theoretical clustering method based on two rounds of minimum spanning trees. Pattern Recognition, 43(3):752–766, March 2010.

[ZRL96] Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: an efficient data clustering method for very large databases. In ACM SIGMOD Record, volume 25, pages 103–114. ACM, 1996.