=Paper= {{Paper |id=Vol-2844/fashion2 |storemode=property |title=Towards Fashion Image Annotation: A Clothing Category Recognition Procedure |pdfUrl=https://ceur-ws.org/Vol-2844/fashion2.pdf |volume=Vol-2844 |authors=Tryfon-Rigas Tzikas,Alexandros-Charalampos Kyprianidis,Maria Kotouza,Sotirios-Filippos Tsarouchis,Antonios Chrysopoulos,Pericles Mitkas |dblpUrl=https://dblp.org/rec/conf/setn/TzikasKKTCM20 }} ==Towards Fashion Image Annotation: A Clothing Category Recognition Procedure== https://ceur-ws.org/Vol-2844/fashion2.pdf
Towards Fashion Image Annotation: A Clothing Category Recognition Procedure

Tryfon-Rigas Tzikas (tzikasta@ece.auth.gr)
Alexandros-Charalampos Kyprianidis (alexandros.kyprianidis@issel.ee.auth.gr)
Maria Kotouza (maria.kotouza@issel.ee.auth.gr)
Sotirios-Filippos Tsarouchis (sotiris.tsarouchis@issel.ee.auth.gr)
Antonios Chrysopoulos (achryso@issel.ee.auth.gr)
Pericles Mitkas (mitkas@auth.gr)

All authors: Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece

Abstract                                                                       1    Introduction
In the contemporary clothing industry, design, development and                 Fashion clothing is one of the oldest industries, occupying
procurement teams are constantly asked to present more                         one of the highest market shares. In this age of fast fashion,
products with fewer resources in a shorter time. Thus, cloth-                  trends change very frequently, making it an
ing companies that aim to remain competitive in today’s                        appropriate field for applying optimization techniques to ef-
market have to deploy new Artificial Intelligence techniques                   ficiently extract valuable information from the huge amount
aiming at the automation of their traditional procedures. In                   of generated data. To this end, contemporary clothing brands
this direction, the presented approach utilizes a deep learning                tend to introduce Artificial Intelligence (AI) techniques, aim-
model to accurately classify fashion images. The predictions                   ing to improve the processes of supply chain, while keeping
are intended to be used in a personalized recommendation                       up to date with the newest fashion trends. Fashion houses
system that acts as an assistant for fashion designers.                        such as Hugo Boss¹ and Tommy Hilfiger² have already devel-
Two well-established architectures are studied, VGG and                        oped AI-driven tools to improve the design process, whereas
ResNet, as well as a variation of ResNet. The realized experi-                 Prada³ uses AI to deliver high-quality content faster.
ments include: (a) architecture comparison, (b) hyperparam-                       The development of such tools was not feasible before
eter tuning and classification, and (c) transfer learning. Two                 the evolution of Deep Learning and Computer Vision: image
fashion datasets are used for the model training and classifi-                 recognition, detection, segmentation and generation, as well
cation: DeepFashion (for training the model from scratch)                      as 3D reconstruction, are some of the techniques that are
and iMaterialist (used to evaluate the transferability of the                  being used in the development of fashion related solutions.
produced model). The results show that the first set of ex-                    The emergence of an abundance of related projects is explained
periments achieved 80.5% accuracy, whereas using the pre-trained               by the rapid growth of these scientific fields.
model on the second dataset reduced training time by 60%                          In this paper, Deep Learning algorithms for clothing cate-
while attaining satisfactory results.                                          gory classification are evaluated. Two datasets are used as
                                                                               inputs, DeepFashion and iMaterialist, while data augmenta-
CCS Concepts: • Computing methodologies → Object                               tion techniques are applied on them. The first one is used to
recognition; Supervised learning by classification; Neural                     train the model from scratch, while the second one to eval-
networks; • Applied computing → Consumer products.                             uate the transferability of the produced model. The models
                                                                               that were used during the experiments are VGG16, ResNet50
Keywords: object classification, fashion clothing images,                      and a variation of ResNet50.
fine-tuning, convolutional neural networks


1 https://www.hugoboss.com/fashionstories/digitalisation-is-and-remains-a-big-trend-which-has-already-been-embraced-by-hugo-boss/fs-story-1e6xd6hk2kr8e.html
2 https://www.ibm.com/blogs/think/2018/01/tommyhilfiger-ai/
3 https://www.pradagroup.com/en/news-media/news-section/prada-group-expands-collaboration-with-adobe.html

AI4FASHION2020, September 02–04, 2020, Athens, Greece
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

   The proposed solution is part of the Data Annotation            variation of ResNet50 (ResNet50v2), by using the DeepFash-
module introduced in our previous work [11], which proposed        ion dataset, after applying image pre-processing techniques.
an AI-enabled system for improving the clothing design             The next step selects the architecture with the highest
process. Specifically, the aforemen-                               accuracy and performs a grid search for the
tioned system is responsible for retrieving, organizing and        image augmentation parameters and the model’s training hy-
combining data from many different sources, while taking           perparameters. In the last step, the fine-tuned model is used
into account the designers’ preferences, in order to suggest       on the iMaterialist dataset, to evaluate the transferability of
clothing products of interest and help fashion designers with      the produced model.
the decision-making process.
   The rest of the paper is organized as follows. Section 2 lists  3.1   Image Pre-processing
related works. Section 3 introduces the methodology. Sec-          The efficiency of the model is heavily dependent on the input
tion 4 presents the experimental setup, datasets and results.      dataset that is used during the training process. Taking this
Section 5 contains the conclusion and future work.                 into consideration, the images need to be cropped, using
                                                                   the provided bounding boxes from the dataset, to exclude
                                                                   non-relatable objects as well as background noise, in order
2   Related Work                                                   to restrain the model from capturing irrelevant information.
Several research works have addressed the field of                 Moreover, in a multi-class classification problem, each image
AI-enabled fashion applications. Many works have tried to group    corresponds to exactly one label, so images that contain mul-
the AI applications in the fashion industry into                   tiple garments must be avoided, as they can mislead the training
four categories [7]: (a) apparel design, (b) manufacturing, (c)    process and degrade its performance.
retailing, (d) supply chain management. In [13] a                     In order to achieve higher performance and reduce over-
comprehensive review of AI systems in apparel supply chains        fitting, Data Augmentation techniques are applied to the
is presented, while in [5] an empirical review of existing         available training set in the following order: 1) rotation, 2)
apparel recommendation systems is conducted.                       shearing, 3) horizontal flip and 4) zoom in or out. Each
   Fashion image analysis has emerged as a challenging task.       technique is tuned in turn: starting
The majority of the approaches that have been used over            with the first technique, a range of low values was tested
time can be described as follows: (a) traditional features         and the optimal value was kept before moving on to the next.
learning methods based on manually created features which
are then processed by machine learning algorithms [15], (b)        3.2   Clothes Recognition with ResNet
Deep Learning algorithms based on deep neural networks             There are many state-of-the-art solutions in the literature
and especially convolutional neural networks. In most cases,       related to image recognition using Deep Learning techniques.
the models that have been developed achieve strong results         Architectures like VGG [14] and ResNet [10] have proved
concerning image classification and recognition [12] [3] [9].      well suited to recognizing clothing categories from fashion
   In the area of fashion image classification, Hidayati et        images [1] [2]. More specifically, VGG16 and ResNet50 are
al. [9] proposed a classification technique that recognizes        commonly used in this field.
clothing genres based on visually differentiable style ele-           In this work, experiments with VGG16 and ResNet50
ments. Additionally, Cychnerski et al. [2] presented a set of      were carried out. Additionally, a variation of ResNet50 was in-
experiments in order to evaluate ResNet and SqueezeNet.            vestigated, which is characterized by an architecture with
   Many datasets have been introduced as test-beds to apply        the following modification in the skip connection: the batch
various AI techniques in the field of fashion. DeepFashion         normalization and the ReLU function take place before the
[12] is composed of 800,000 images which are richly an-            convolutional layer [2]. This variation of ResNet50 was cho-
notated with attributes, clothing landmarks and correspon-         sen as the one with the best performance amongst the
dence of images taken under different scenarios. DeepFash-         variations tried on the input dataset.
ion2 [3] is an improved version of DeepFashion, with en-
riched annotations; style, scale, viewpoint, occlusion, bound-     3.3   Hyperparameter Tuning
ing box and dense landmarks were added.                            Hyperparameter tuning is a crucial task towards achieving
                                                                   the optimal performance in Deep Learning modelling. In
                                                                   this process, a set of optimizers was investigated in order
3   Methodology                                                    to find the appropriate one for the problem at hand. More
The clothing category classification, as well as the fine-tuning   specifically, the optimizers examined are Adam, Adadelta,
of an existing model to another dataset are challenging tasks.     Adamax, Adagrad, SGD.
Figure 1 outlines the proposed approach, which is divided             Weight initialization of a Deep Learning network strongly
into three steps. As a first step, three different deep learning   affects the performance of the model, since problems like
architectures are tested: (a) VGG16, (b) ResNet50 and (c) a        vanishing and exploding gradients are tackled by using the
                                       Figure 1. Overview of the proposed methodology

correct initializer. The following initializers were used in      evaluation of the produced models’ performance. The sec-
the experiments: (a) Random Normal, (b) He Normal [8],            tion is composed of three sets of experiments, as follows:
(c) Glorot Normal [4], (d) Zeros, (e) He Uniform [8], and (f)     (1) architecture comparison, (2) hyperparameter tuning and
Glorot Uniform [4].                                               classification, and (3) transfer learning.
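The scale-dependent initializers listed above can be illustrated with a minimal sketch. It assumes the standard definitions from [4] and [8], in which the spread of the initial weights depends on the number of input and output units of a layer (`fan_in`, `fan_out`). The function name `init_scale` is hypothetical and is not taken from the paper's code; Random Normal is left out because its standard deviation is a fixed, user-chosen constant.

```python
import math


def init_scale(name: str, fan_in: int, fan_out: int) -> float:
    """Return the stddev (normal schemes) or the +/- limit (uniform schemes).

    Assumes the usual Glorot [4] and He [8] definitions; illustrative only.
    """
    if name == "glorot_normal":
        return math.sqrt(2.0 / (fan_in + fan_out))
    if name == "glorot_uniform":
        return math.sqrt(6.0 / (fan_in + fan_out))
    if name == "he_normal":
        return math.sqrt(2.0 / fan_in)
    if name == "he_uniform":
        return math.sqrt(6.0 / fan_in)
    if name == "zeros":
        # Every weight starts at 0, so all units in a layer begin identical.
        return 0.0
    raise ValueError(f"unknown initializer: {name}")


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    for scheme in ("glorot_normal", "glorot_uniform", "he_normal", "he_uniform", "zeros"):
        print(scheme, round(init_scale(scheme, fan_in=512, fan_out=256), 4))
```

With a zero scale every unit in a layer starts identical, which is one plausible reading of why a Zeros initializer would be expected to perform worst among the six.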
   In addition, regularization restricts the exponential growth
of the model’s weights and prevents the model from overfitting.   4.1   Datasets
The techniques employed in the proposed approach are a            Two datasets were used for the training and evaluation of the
combination of regularizers and weight decay. Both of these       models, DeepFashion and iMaterialist. The DeepFashion dataset
parameters are investigated with regard to the learning rate,     [12] consists of 800,000 images characterized by many fea-
as they are correlated with it. The regularizers examined are     tures and labels. The iMaterialist dataset [6] consists of 1,000,000
as follows: (a) L1, (b) L2, (c) L1 & L2, while the weight decay   images and contains 8 groups of 228 fine-grained attributes.
values are: (a) 0.98, (b) 0.95, (c) 0.75.                         The imbalanced class distribution in each dataset
                                                                  was balanced by randomly choosing 5,000 images for every
3.4   Transfer Learning                                           clothing category, using 50,000 images in total. They were
After the completion of the first set of experiments, focused     split into training, validation and test sets with ratios of 0.7,
on the multi-class classification problem of clothing cate-       0.15, 0.15, respectively.
gories, we proceed with the examination of the second set,
which deals with the evaluation of the performance of an          4.2   Experimental Setup
already trained model in another dataset, making use of           Input images were scaled down to 224x224 RGB images and
transfer learning techniques. The evaluation of the model         classified into 10 classes including coat and jacket, dress, top,
in a second dataset can be broken down in two cases: (a)          shorts, trousers, skirt, leggings and jeggings, outfit, special
evaluating the pre-trained model without further training,        occasion and suits. The models were trained on an Nvidia
and (b) using the pre-trained model as a starting point to        Tesla K40c GPU with 32 GB of RAM and an
re-train either the whole model or only specific layers. The      Intel Xeon E5-2630 processor. The batch size that was used
whole idea is based on the similarity between the two fashion     during training is 32 and the initial learning rate was set
datasets and on the fact that they share common low-level         according to the Keras default values for each optimizer (0.01
features, which are also captured by the weights of the           for SGD and 0.001 for the rest of them).
bottom layers of the model. The main hypothesis is that
transfer learning benefits the model, as it can achieve comparable   4.3   Results
results in significantly less time.                               4.3.1 Architecture Comparison. The architectures tested
                                                                  for the classification of the provided clothing categories are
4     Experiments                                                 the following: VGG16, ResNet50 and a variation of ResNet50
This section contains the experimental process on the prob-       (ResNet50v2) [2]. They were all tested using the same values
lem of multi-class clothing categories classification and the     on each hyperparameter, based on the configuration in Table
1. Moreover, Table 1 makes clear that ResNet50v2 outperforms
the rest of the models, achieving an accuracy of 74%; thus it
is selected for the rest of the experiments.
   The performance of the models was measured using the
following evaluation metrics: accuracy, precision,
recall and F1 score.

Table 1. Model initialization parameters and architecture comparison

Parameters              Values
Optimizer               Adam
Initializer             Glorot Uniform
Learning Rate           0.01
Weight Decay            0.9
Regularizer             L1
Image Augmentation      None

Model                   Accuracy
VGG16                   67%
ResNet50                70%
ResNet50v2              74%

4.3.2 Classification Results. To improve the produced model’s performance, many
experiments were conducted in order to find the best configuration of the
available hyperparameters. During this process, a grid search was performed over
the image augmentation parameters, as well as the model’s training
hyperparameters, in order to boost the accuracy of the model. The experiments
were performed in the following order: (a) Image augmentation, (b) Initializer,
(c) Optimizer, (d) Learning rate and Regularizer, (e) Learning rate and weight
decay. In the following experiments the default parameters mentioned in Table 1
are used for the initial configuration. The order in which each parameter’s
experiments are conducted is important: upon completion of each one, the optimal
value of the corresponding parameter is extracted and used in the configuration
of the following experiments.
   The results of the image augmentation experiments are presented in Table 2.
The optimal values for each technique are the following: (a) Rotation: 10,
(b) Shear: 0.2, (c) Zoom: 0.05 and (d) Horizontal flip: True. The optimal values
led the produced model not only to achieve better performance, but also to avoid
overfitting. It is clear that the model performs better when the image
augmentation process causes moderate changes in the datasets.

Table 2. Image augmentation experiments

Rotation   Accuracy      Shear   Accuracy
0          71.0%         0       73.0%
10         73.0%         0.05    76.0%
30         67.0%         0.1     71.0%
90         52.0%         0.2     77.0%

Zoom       Accuracy      Horizontal Flip   Accuracy
0          77.0%         True              77.5%
0.05       77.2%         False             77.2%
0.1        76.0%
0.2        74.0%

   In Table 3, the results of the various initializers and optimizers are
presented. Among the initializers, Glorot Normal achieved the best results,
while Zeros provided the worst, as expected. As far as the optimizers are
concerned, they all achieved similar results, except for SGD. The reason behind
this is that SGD demands additional fine-tuning to determine the appropriate
hyperparameters, in contrast with the rest of the optimizers, which are adaptive
gradient methods. Among the optimizers, Adadelta achieved the highest accuracy.

Table 3. Initializer and optimizer experiments

Initializer        Accuracy      Optimizer   Accuracy
Random Normal      78.3%         Adam        78.0%
He Normal          77.8%         Adagrad     77.8%
Glorot Normal      78.8%         Adadelta    80.0%
Zeros              10.0%         SGD         71.0%
He Uniform         77.6%         Adamax      79.0%
Glorot Uniform     77.5%

   The results of the experiments conducted in order to determine the weight
decay and regularizer are presented in Table 4. Their optimal values are
strongly dependent on the learning rate parameter. For this reason each
parameter is tested with respect to different values of the learning rate. The
best values of the three parameters, taken in pairs, are as follows:
(a) learning rate: 1, regularizer: L2, (b) learning rate: 1, weight
decay: 0.75.

Table 4. Learning rate, weight decay and regularizer experiments

                        Regularizer                 Weight Decay
Learning rate      L1      L2      L1 & L2      0.98    0.95    0.75
0.01               67%     63%     68%          67%     63%     60%
0.1                65%     78%     78%          76%     78%     76%
1                  73%     80%     75%          80%     80%     81%

Table 5. Model optimization parameters

Image Augmentation   Values      Parameters      Values
Rotation             10          Optimizer       Adadelta
Shear                0.2         Initializer     Glorot Normal
Zoom                 0.05        Learning Rate   1
Horizontal Flip      True        Weight Decay    0.75
                                 Regularizer     L2
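The stage-by-stage tuning order described in Section 4.3.2 can be sketched as a greedy sequential search: each stage tries its candidate values, keeps the best one, and passes it on to the next stage. The sketch below is illustrative only. The accuracies are copied from Tables 2, 3 and 4, while `evaluate`, `sequential_search` and `best_pair` are hypothetical helper names, and `evaluate` simply looks up the reported number instead of training a model.

```python
# Reported accuracies (%) copied from Tables 2 and 3 of the paper.
REPORTED = {
    "rotation": {0: 71.0, 10: 73.0, 30: 67.0, 90: 52.0},
    "shear": {0: 73.0, 0.05: 76.0, 0.1: 71.0, 0.2: 77.0},
    "zoom": {0: 77.0, 0.05: 77.2, 0.1: 76.0, 0.2: 74.0},
    "horizontal_flip": {True: 77.5, False: 77.2},
    "initializer": {"Random Normal": 78.3, "He Normal": 77.8, "Glorot Normal": 78.8,
                    "Zeros": 10.0, "He Uniform": 77.6, "Glorot Uniform": 77.5},
    "optimizer": {"Adam": 78.0, "Adagrad": 77.8, "Adadelta": 80.0,
                  "SGD": 71.0, "Adamax": 79.0},
}

# Table 4, indexed as table[learning_rate][candidate] -> accuracy (%).
TABLE4_REGULARIZER = {
    0.01: {"L1": 67, "L2": 63, "L1 & L2": 68},
    0.1: {"L1": 65, "L2": 78, "L1 & L2": 78},
    1: {"L1": 73, "L2": 80, "L1 & L2": 75},
}
TABLE4_WEIGHT_DECAY = {
    0.01: {0.98: 67, 0.95: 63, 0.75: 60},
    0.1: {0.98: 76, 0.95: 78, 0.75: 76},
    1: {0.98: 80, 0.95: 80, 0.75: 81},
}


def evaluate(name, value, config):
    """Hypothetical stand-in for one training run.

    A real run would train with `config` plus `name=value` and return the
    test accuracy; here the reported number is looked up instead.
    """
    return REPORTED[name][value]


def sequential_search(stages, evaluate_fn):
    """Tune one parameter at a time, fixing each winner before the next stage."""
    config = {}
    for name, candidates in stages.items():
        scores = {value: evaluate_fn(name, value, config) for value in candidates}
        config[name] = max(scores, key=scores.get)
    return config


def best_pair(table):
    """Return the (learning rate, candidate) pair with the highest accuracy."""
    pairs = [(lr, value) for lr, row in table.items() for value in row]
    return max(pairs, key=lambda pair: table[pair[0]][pair[1]])


stages = {name: list(results) for name, results in REPORTED.items()}
chosen = sequential_search(stages, evaluate)

if __name__ == "__main__":
    print(chosen)
    print(best_pair(TABLE4_REGULARIZER), best_pair(TABLE4_WEIGHT_DECAY))
```

Replaying the reported numbers this way selects the same values as Table 5. A greedy search of this kind evaluates far fewer configurations than a full grid, at the cost of ignoring interactions between parameters tuned in different stages.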
   The final model, trained with the optimal parameters
listed in Table 5, achieved 80.5% accuracy. Figure
2 shows the confusion matrix of the model for each class. The
diagonal of the matrix presents the true positive value per
class. The classes Skirt, Trousers, Dress and Shorts are classi-
fied better than the rest, while many samples of Outfit and
Suits are misclassified as Coat and Dress respectively, since
there is a strong resemblance between the images of these
classes.

Table 6. Transfer Learning Experiments

Experiments                    Precision   Recall   F1 Score   Accuracy
Benchmark (No training)        40.7%       37.8%    37.3%      38.0%
Last layer                     46.3%       42.4%    42.1%      42.0%
Whole model                    65.2%       64.6%    64.7%      62.5%
Without pre-trained weights    65.1%       64.9%    64.8%      65.0%

                                                                      In the second step of the experimental process, all the
                                                                   layers of the model were frozen except for the last one, in
                                                                   order to keep the learned features intact and modify only
                                                                   the weights of the classifier, which constitutes the last
                                                                   layer of the model. The results show a slight improvement
                                                                   over the benchmark on each evaluation metric.
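The last-layer experiment can be illustrated with a framework-free toy sketch of layer freezing: frozen layers are skipped by the update step, so only the classifier weights change. The `Layer` class, `freeze_all_but_last` and `sgd_step` are hypothetical names used only for illustration and do not reproduce the paper's training code; in Keras, the comparable switch is the `trainable` attribute of a layer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Toy layer: a name, a flat list of weights and a trainable flag."""
    name: str
    weights: List[float]
    trainable: bool = True


def freeze_all_but_last(layers: List[Layer]) -> None:
    """Freeze the feature extractor and leave only the final classifier trainable."""
    for layer in layers[:-1]:
        layer.trainable = False
    layers[-1].trainable = True


def sgd_step(layers: List[Layer], grads: List[List[float]], lr: float) -> None:
    """Apply one plain gradient step, skipping every frozen layer."""
    for layer, layer_grads in zip(layers, grads):
        if layer.trainable:
            layer.weights = [w - lr * g for w, g in zip(layer.weights, layer_grads)]


# Hypothetical three-layer model with made-up weights and gradients.
model = [
    Layer("conv_stem", [1.0, 2.0]),
    Layer("residual_block", [3.0]),
    Layer("classifier", [0.5, 0.5]),
]
freeze_all_but_last(model)
sgd_step(model, grads=[[1.0, 1.0], [1.0], [1.0, 1.0]], lr=0.25)

if __name__ == "__main__":
    for layer in model:
        print(layer.name, layer.trainable, layer.weights)
```

Setting every flag back to trainable before the update corresponds to the whole-model experiment that follows.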
                                                                      To further improve the model’s performance on the new
                                                                   dataset, the whole model was unfrozen, which led to
                                                                   significantly better results. The model achieved 62.5% accu-
                                                                   racy, about 20 percentage points higher than the previous
                                                                   best (42.0%), revealing that even though the datasets share
                                                                   common features, as they both contain fashion clothing
                                                                   images, their inputs also differ noticeably.
                                                                      To highlight this last point, the confusion matrix of the
                                                                   last experiment is presented in Figure 3. The classes Shorts,
                                                                   Trousers and Coat are classified with greater confidence, while
                                                                   Leggings are misclassified as Trousers, and Dress as Skirt and
                                                                   vice versa. This behavior may derive from either annotation
     Figure 2. ResNet50v2 evaluated on DeepFashion                 faults or the fact that these two classes share many visual
                                                                   characteristics, as a long skirt can be easily misjudged as a
                                                                   dress.
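The confusion-matrix reading used for Figures 2 and 3 can be sketched in a few lines. The sketch assumes the common convention of rows for true classes and columns for predicted classes, so the diagonal holds the correctly classified samples per class. The labels below are made-up toy data chosen to mimic the Dress/Skirt and Leggings/Trousers confusions; they are not the paper's results, and the helper names are hypothetical.

```python
CLASSES = ["Dress", "Skirt", "Trousers", "Leggings"]


def confusion_matrix(y_true, y_pred, classes):
    """Count samples per (true class, predicted class); rows are true classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def per_class_recall(matrix):
    """Diagonal entry divided by the row total, i.e. the share of each class found."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def top_confusion(matrix, classes):
    """Return (true class, predicted class, count) for the largest off-diagonal cell."""
    size = len(classes)
    cells = [(i, j) for i in range(size) for j in range(size) if i != j]
    i, j = max(cells, key=lambda cell: matrix[cell[0]][cell[1]])
    return classes[i], classes[j], matrix[i][j]


# Toy labels, invented for illustration only.
y_true = ["Dress", "Dress", "Dress", "Skirt", "Skirt", "Trousers", "Leggings", "Leggings"]
y_pred = ["Dress", "Skirt", "Dress", "Skirt", "Dress", "Trousers", "Trousers", "Trousers"]

matrix = confusion_matrix(y_true, y_pred, CLASSES)
recall = per_class_recall(matrix)

if __name__ == "__main__":
    for name, row in zip(CLASSES, matrix):
        print(f"{name:10s}", row)
    print(top_confusion(matrix, CLASSES))
```

Large symmetric off-diagonal cells, as between Dress and Skirt here, point to two classes that are mutually confused, whereas a one-sided cell, as for Leggings predicted as Trousers, points to one class being absorbed by another.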
4.3.3 Transfer Learning Results. In this section, the per-            Lastly, the model was trained from scratch, without using
formance of the Deep Learning model produced from the first        any weights originating from the pre-trained model. The
set of experiments is evaluated on the iMaterialist dataset,       model achieved 65% accuracy, surpassing the previous results.
which was not used previously. The datasets have many              The result is justified, as the newly estimated
visual features in common, as they both are used for classi-       hyperparameters are more suitable for whole-model training,
fying fashion clothing images into categories. Therefore, it is    whereas fine-tuning requires a lower learning rate
assumed that the pre-trained model can be used as a baseline,      to slightly adjust the weights. Comparing the performance
upon which we can apply a set of slight weight adjustments         of the model trained from scratch and the model trained
through fine-tuning to improve its performance, while using        using the pre-trained weights, the second one achieved
a low value for the training learning rate. In order to have       2.5 percentage points lower accuracy. However, this is
comparable results, the same hyperparameters and the eval-         compensated by its training time, which was 60% shorter
uation results of the pre-trained model on iMaterialist were       than that of the first one (8 hours versus 20 hours),
maintained as the benchmark in the fine-tuning experiments.        saving a significant amount of computation time.
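The trade-off between the fine-tuned model and the model trained from scratch can be checked with a short worked calculation. The sketch below uses only the figures reported in this section and in Table 6 (8 and 20 hours of training; 42.0%, 62.5% and 65.0% accuracy); the function and variable names are hypothetical.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is smaller than `baseline`."""
    return (baseline - new) / baseline


# Training times in hours, as reported in the text.
HOURS_FROM_SCRATCH = 20
HOURS_FINE_TUNED = 8

# Accuracies in percent, copied from Table 6.
ACC_LAST_LAYER = 42.0
ACC_WHOLE_MODEL = 62.5
ACC_FROM_SCRATCH = 65.0

time_saved = relative_reduction(HOURS_FROM_SCRATCH, HOURS_FINE_TUNED)
speedup = HOURS_FROM_SCRATCH / HOURS_FINE_TUNED
accuracy_gap = ACC_FROM_SCRATCH - ACC_WHOLE_MODEL
gain_over_last_layer = ACC_WHOLE_MODEL - ACC_LAST_LAYER

if __name__ == "__main__":
    print(f"training time reduced by {time_saved:.0%} ({speedup:.1f}x speed-up)")
    print(f"accuracy gap to training from scratch: {accuracy_gap} points")
    print(f"gain of whole-model over last-layer fine-tuning: {gain_over_last_layer} points")
```

Read this way, the 60% figure is a reduction in wall-clock training time, which corresponds to a 2.5x speed-up, and the accuracy differences are percentage points rather than relative changes.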
   Table 6 contains the comparison results of the fine-tuning
experiments against the ones achieved by the pre-trained           5    Conclusion and Future Work
model, which is the benchmark and has not undergone any            In this work, a classification model capable of recognizing
further training. The experiments differ in which of the           10 different categories of clothing images was presented.
model’s layers are trained each time.                              The process followed for analyzing the Deep Learning ar-
Thus, for the first step of the Transfer Learning process the      chitectures of VGG, ResNet and a variation of ResNet was
pre-trained model was applied to the input dataset as is,          described in detail, as well as the techniques performed to
without changing any of the pre-defined hyperparameters.           find the optimal model and boost its performance.
The results were very poor, since the model achieved a mere           DeepFashion was used for model training, while iMaterial-
38% accuracy, indicating that the two datasets contain differ-     ist was used for evaluating the transferability of the produced
ent content and they cannot be processed by the produced           model. The work was mainly focused on hyperparameter
model without additional training.                                 tuning, which is a necessary but time-consuming process
                                                                              [3] Yuying Ge, Ruimao Zhang, Lingyun Wu, Xiaogang Wang, Xiaoou
                                                                                  Tang, and Ping Luo. 2019. DeepFashion2: A Versatile Benchmark
                                                                                  for Detection, Pose Estimation, Segmentation and Re-Identification
                                                                                  of Clothing Images. CoRR abs/1901.07973 (2019). arXiv:1901.07973
                                                                                  http://arxiv.org/abs/1901.07973
[4] Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty
    of training deep feedforward neural networks. In Proceedings of the
    Thirteenth International Conference on Artificial Intelligence and
    Statistics (Proceedings of Machine Learning Research, Vol. 9), Yee Whye
    Teh and Mike Titterington (Eds.). PMLR, Chia Laguna Resort, Sardinia,
    Italy, 249–256. http://proceedings.mlr.press/v9/glorot10a.html
[5] Congying Guan, Sheng-feng Qin, Wessie Ling, and Guofu Ding. 2016.
    Apparel recommendation system evolution: an empirical review.
    International Journal of Clothing Science and Technology 28 (11 2016),
    854–879. https://doi.org/10.1108/IJCST-09-2015-0100
[6] Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui,
    Yuan Li, Matthew R. Scott, Hartwig Adam, and Serge J. Belongie. 2019.
    The iMaterialist Fashion Attribute Dataset. CoRR abs/1906.05750 (2019).
    arXiv:1906.05750 http://arxiv.org/abs/1906.05750
[7] Z.X. Guo, W. Wong, SYS Leung, and Min Li. 2011. Applications of
    artificial intelligence in the apparel industry: A review. Textile
    Research Journal 81 (11 2011), 1871–1892.
    https://doi.org/10.1177/0040517511411968
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Delving
    deep into rectifiers: Surpassing human-level performance on imagenet
    classification. In Proceedings of the IEEE international conference
    on computer vision. 1026–1034.
[9] Shintami Chusnul Hidayati, Chuang-Wen You, Wen-Huang Cheng,
    and Kai-Lung Hua. 2018. Learning and Recognition of Clothing Genres
    From Full-Body Images. IEEE Transactions on Cybernetics 48 (2018),
    1647–1659.
[10] Riaz Ullah Khan, Xiaosong Zhang, Rajesh Kumar, and Emelia Opoku
    Aboagye. 2018. Evaluating the Performance of ResNet Model Based on
    Image Recognition. In Proceedings of the 2018 International Conference
    on Computing and Artificial Intelligence (Chengdu, China) (ICCAI 2018).
    Association for Computing Machinery, New York, NY, USA, 86–90.
    https://doi.org/10.1145/3194452.3194461
[11] Maria Th Kotouza, Sotirios-Filippos Tsarouchis, Alexandros-Charalampos
    Kyprianidis, Antonios C Chrysopoulos, and Pericles A Mitkas. 2020.
    Towards Fashion Recommendation: An AI System for Clothing Data
    Retrieval and Analysis. In IFIP International Conference on Artificial
    Intelligence Applications and Innovations. Springer, 433–444.
[12] Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, and Xiaoou Tang. 2016.
    DeepFashion: Powering Robust Clothes Recognition and Retrieval
    With Rich Annotations. In The IEEE Conference on Computer Vision
    and Pattern Recognition (CVPR).
[13] E.W.T. Ngai, S. Peng, Paul Alexander, and Karen Moon. 2014. Decision
    support and intelligent systems in the textile and apparel supply
    chain: An academic review of research articles. Expert Systems with
    Applications: An International Journal 41 (01 2014), 81–91.
    https://doi.org/10.1016/j.eswa.2013.07.013
[14] Karen Simonyan and Andrew Zisserman. 2014. Very Deep Convolutional
    Networks for Large-Scale Image Recognition. arXiv:1409.1556 [cs.CV]
[15] S. Vittayakorn, K. Yamaguchi, A. C. Berg, and T. L. Berg. 2015. Runway
    to Realway: Visual Analysis of Fashion. In 2015 IEEE Winter Conference
    on Applications of Computer Vision. 951–958.

Figure 3. Retrained ResNet50 on iMaterialist based on the
pretrained weights

for achieving the highest accuracy. The produced model
achieved 80.5% accuracy on DeepFashion, while the fine-tuning
of the pre-trained model on iMaterialist led to a 62.5%
accuracy with a 60% reduction in training time, compared to
the corresponding model trained from scratch.
   Future work involves improving the input datasets by
manually correcting their misplaced labels, which can be
precisely identified using already trained models, and
enhancing them with more samples, so that the produced
model provides more robust results. Moreover, a wider set
of experiments can be conducted to improve the performance
of the model, such as further investigation into selecting a
proper model architecture, detailed tuning of the
hyperparameters in the pre-trained model’s fine-tuning
process, and testing other training techniques in the
fine-tuning process.

Acknowledgments

This research has been co-financed by the European Regional
Development Fund of the European Union and Greek national
funds through the Operational Program Competitiveness,
Entrepreneurship and Innovation, under the call
RESEARCH – CREATE – INNOVATE (project code: T1EDK-03464).

References

[1] Kuan-Ting Chen and Jiebo Luo. 2016. When Fashion Meets Big
    Data: Discriminative Mining of Best Selling Clothing Features. CoRR
    abs/1611.03915 (2016). arXiv:1611.03915
    http://arxiv.org/abs/1611.03915
[2] J. Cychnerski, A. Brzeski, A. Boguszewski, M. Marmolowski, and M.
    Trojanowicz. 2017. Clothes detection and classification using
    convolutional neural networks. In 2017 22nd IEEE International
    Conference on Emerging Technologies and Factory Automation (ETFA). 1–8.