=Paper=
{{Paper
|id=Vol-2203/146
|storemode=property
|title=Preference Learning by Matrix Factorization on Island Models
|pdfUrl=https://ceur-ws.org/Vol-2203/146.pdf
|volume=Vol-2203
|authors=Stepan Balcar
|dblpUrl=https://dblp.org/rec/conf/itat/Balcar18
}}
==Preference Learning by Matrix Factorization on Island Models==
S. Krajči (ed.): ITAT 2018 Proceedings, pp. 146–151
CEUR Workshop Proceedings Vol. 2203, ISSN 1613-0073, © 2018 Štěpán Balcar
Preference learning by matrix factorization on island models
Štěpán Balcar
Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic
Stepan.Balcar@mff.cuni.cz
Abstract: This paper presents island models, methods, implementation and experiments connecting stochastic optimization methods with the (collaborative) recommendation task. Our models and methods are based on matrix factorization. Parallel runs of the methods optimize the RMSE metric from an island model point of view.

We dealt with two research hypotheses. First, whether island models always bring improvement; we will show that they almost always do. Second, whether an evolutionary algorithm does or does not always find the best solution; this will be confirmed only on the smaller data. Experiments were carried out on MovieLens 100k and 1M data.

1 Introduction

This paper studies recommender systems, especially learning user/customer preferences. The main idea of this paper is to connect this study to evolutionary island models. Here, island models serve as a computational tool for matrix factorization.

Instances of one stochastic optimization method run in parallel on island models, searching the same state space. Such a method traverses the state space of solutions. The actual states the instances visit are influenced by (mutually independent) stochasticity. Hence, different instances can visit different parts of the state space. Island model parallelization is based on cooperation of the methods during the computation.

This resembles ensemble and/or hybrid learning, where different optimization methods are:

• run in parallel

• allowed to run in different state spaces

The solution (usually one of them) is chosen at the end by an additional aggregation/decision method.

Using evolutionary computing for recommendation is surveyed in the paper of Horvath [1]. This survey shows that the usage of evolutionary methods is quite frequent in recommender systems (see also [2]). One of the approaches is model based. Our approach uses matrix factorization, and hence the model is constituted by the factors. In [1] only one paper [3] is mentioned where the evolved individuals are matrix factors. The authors of [3] provide results on ML100k data on minimizing squared error depending on the size of the population and the probability of mutation (other parameters are fixed).

In this paper, the parameters of our methods were chosen after experiments on sample data. We will try to improve RMSE using parallelization. We will provide results on both ML100k and ML1M data.

This paper also comments on the architecture and some implementation decisions. The main contributions of this paper are:

• A multiple island model brings a statistically significant improvement of recommendation in comparison to single instance optimization.

• Our implementation is able to handle individuals (matrix factors) whose size is several orders of magnitude bigger than usual in evolutionary computation.

This paper is organized as follows: Chapter 2 "Related work" concerns recommender systems and matrix factorization, evolutionary algorithms and the island model. Chapter 3 "Methods, models and parameters" sketches our view of stochastic optimization methods and the parameters and settings of the island model used in the tests; its section "Data" describes the realization of the experiments, the design of the tests and the computing resources. Finally, Section 4 "Results" gives both a numeric and a graphic presentation of our results and discusses confirmation of our hypotheses. After the conclusions, the paper ends with future work.

2 Related work

This section gives basic notation for recommender systems and the evolutionary computation used, and overviews some relevant literature.

2.1 Recommender systems and matrix factorization

It was the Netflix Prize (announced in October 2006) which excited the public and changed the landscape of recommender systems research. The BellKor squad included Chris Volinsky and his AT&T colleagues Robert Bell and Yehuda Koren, along with four other engineers from the United States, Austria, Canada and Israel (BellKor's Pragmatic Chaos); on June 26, 2009, they finally crossed the 10% finish line. The main ingredient of the winning solution was matrix factorization (MF), see e.g. [4].

We focus on latent model based recommendation, which is a part of the collaborative filtering technique (Co),
asked for in the Netflix Prize competition. Co was introduced in [4].

Let us denote by U the set of user ids and by I the set of item ids. The input data can be viewed as a partial function r : U × I → P, where P is the preference scale (e.g. one to five stars, or real numbers). An alternative representation is an algebraic matrix R ∈ P^(|U|×|I|), with dimensions representing users and items respectively. The rating of user i ∈ U for an item j ∈ I is the content of the corresponding matrix cell r_ij ∈ P. Usually this matrix is very sparse, so in practice the input data are represented in the form of a database table R with schema R(uid = user id, iid = item id, rating). Although the implementation uses a database table, the formal model of [4] is easier to describe in the language of matrices. The matrix R is factorized into a lower dimensional representation

    R ≈ U × I^T = R̂    (1)

where U ∈ P^(|U|×d) and I ∈ P^(|I|×d) are the user and item latent factor matrices and × is the matrix product. d is much smaller than |U| and |I|. The approximation of R by R̂ is measured by

    e²_ij = (r_ij − r̂_ij)² = (r_ij − Σ_{k=1..d} u_ik i_jk)²    (2)

where r̂_ij serves as the approximation of the value r_ij.

Our main optimization method is SGD, the stochastic gradient descent method. For other approaches and dimensions of research in recommender systems we refer to [5].

2.2 Evolutionary algorithms

Evolutionary algorithms became popular after 1970, thanks to John Holland [6], as a means for solving optimization problems. A solution is represented as an individual; the set of individuals forms the population. An evolutionary algorithm is formed by the phases: initialization, selection (parent selection for the next generation offspring), crossover, mutation and evaluation of the whole population. The stochastic computation over individuals seeks convergence to the global optimum.

The contribution of evolutionary algorithms in the area of recommender systems was reviewed by Horváth and de Carvalho in [1]. This review analyzed more than 65 research papers using EC techniques for user recommendations. It analyzed different approaches according to five aspects: (i) recommendation technique used, (ii) datasets employed in the experiments, (iii) evaluation methods used in the experiments, (iv) baselines used to compare with the proposed methods and (v) reproducibility of the research.

Most of the nature-inspired algorithms reviewed in [1] find application in solving partial problems such as feature weighting extraction, model based approaches, clustering and function learning.

Computing latent factor models by means of evolutionary algorithms has emerged as a possible approach [3]. The available publications do not mention the parallelization of evolutionary algorithms in combination with the computation of latent factors.

Motivated by this, in our paper individuals are represented by the concatenation of U^T and I^T (like a worm d thick and |U| + |I| long):

    u_11 … u_1|U|   i_11 … i_1|I|
     ⋮   ⋱   ⋮       ⋮   ⋱   ⋮
    u_d1 … u_d|U|   i_d1 … i_d|I|

Figure 1: Individual represented by the latent vectors of users and the latent vectors of items

In our experiments we use the RMSE as the fitness function

    RMSE = sqrt( Σ_{i=1..|U|} Σ_{j=1..|I|} e²_ij / (|U| · |I|) )    (3)

2.3 Island model

Island models are one of the approaches to parallelizing optimization algorithms. Their characterization is that there is no central master and the populations are distributed. An island model consists of parallel algorithms enriched by sending and accepting migrants. Migrants are individuals from the populations of other islands. The hope is that this propagation of genetic material can speed up convergence and escape from the trap of local optima.

The parameters of an island model are the frequency of sending and receiving migrants and the way a migrant is chosen from the local population (remember that the parameters of the methods are fixed and were chosen after experiments on sample data). The optimal choice of these parameters depends on the type of optimization method and is a subject of study. These aspects of island models are studied in [7].

A specific type of island models, heterogeneous ones, was used in an application of multi-criteria optimization by Martin Pilát and Roman Neruda [8]. They designed a new algorithm, MOGASOLS (Multiobjective Genetic Algorithm with Single Objective Local Search), combining a multi-criteria genetic algorithm (MOGA) with a single-criteria genetic algorithm (SOGA). It is shown that parallelization of the time-consuming MOGA method alone achieves worse results than running the MOGA and SOGA methods in parallel. They were tested with the NSGA-II [9] and IBEA [10] algorithms. This was further developed in [11].

The first to come up with heterogeneous island models (HeIM), the way we understand them, was Martin Pilát ([11]). In these models, the islands may carry diverse computational methods, differing not only in parameters but also in the structure of the whole stochastic algorithm [11].
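The latent factor model and the RMSE fitness of Equations 1–3 can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation; the dictionary of known ratings stands in for the sparse table R(uid, iid, rating), and in practice the error sum runs only over the rated cells:

```python
import math

def predict(U, I, i, j):
    """Estimated rating r_hat_ij = sum_k u_ik * i_jk (Eqs. 1-2).

    U and I are lists of d-dimensional latent vectors,
    one per user and one per item respectively."""
    return sum(u * v for u, v in zip(U[i], I[j]))

def rmse(U, I, ratings):
    """RMSE fitness (Eq. 3), averaged over the known ratings only."""
    sq_err = sum((r - predict(U, I, i, j)) ** 2
                 for (i, j), r in ratings.items())
    return math.sqrt(sq_err / len(ratings))
```

For a perfect factorization the fitness is zero; e.g. with U = [[1, 0]], I = [[2, 0]] and the single known rating r_00 = 2, rmse returns 0.0.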
The central planner is responsible for choosing the optimal methods for the given problem; it replaces unsuccessful methods by more successful ones during the whole computation. In publication [11] the original Balcar's tool [12] was used.

In this paper this tool is used with homogeneous islands to find the optimal user and item latent factors. Different optimization techniques are then used to make the factors converge to optimal ones. For illustration we give the formulas for stochastic gradient descent. Recall e²_ij; the next generation values of the user and item factors then equal

    u'_ik = u_ik + α ∂e²_ij/∂u_ik ,    i'_jk = i_jk + α ∂e²_ij/∂i_jk    (4)

Note that in our implementation we do not consider a normalization factor; this is left for the future. Nevertheless, some normalization effect is obtained by the migration and synergy of islands. α is not fixed, just bounded from above. More on our system is in [12].

3 Methods, models and parameters

Island models were originally developed for the parallelization of evolutionary algorithms. In this paper we will also use other stochastic optimization methods.

3.1 Stochastic optimization methods

Evolutionary algorithms try to balance the trade-off between exploration and exploitation. This property should help an evolutionary algorithm to escape from local extremes and simultaneously converge to the optimal solution. For this ability they pay with a higher time complexity because of the parallel development of a bigger population. Because of this fact, on some types of problems, the winners are other stochastic optimization methods; an example is the TSP solution by hill climbing with 2-opt ([13]). This is the main reason we consider several additional stochastic optimization methods (see Table 1).

All these methods use stochastic gradient descent. A general reference for the description of stochastic optimization methods is [14].

Our implementation of the stochastic optimization methods was changed in two ways. First, we had to create operators enabling them to work with latent factor based individuals. Second, the methods were modified for parallel runs in island models with migration. They were enriched with the ability to cooperate: they receive individuals and enrich their local population. Please notice that only evolution and differential evolution have a bigger population; the remaining methods have a one-individual population, and in this case enrichment means replacement. They provide their solution from the local population to the neighboring islands. The methods manipulate incoming individuals in concordance with the basic algorithm idea, e.g. the tabu search method inserts a newcomer into the tabu set.

An individual equals a solution. A solution is represented by the pair of latent factors of users and items (Figure 1). Multiplying these latent factors we get a matrix which represents our estimation of the ratings. The length of the first vector is the number of users and the length of the second vector equals the number of items. The size of the individual depends on the size of the input problem. The width of the latent vector is an optional parameter.

3.2 Parameters and settings of the island model used in tests

Now we describe the island model, the environment in which the above methods will be parallelized. It is the environment which ensures the dissemination of genetic material. The main tool for this is migration. This migration does not mean only the exchange of the best solutions. We would like to spread across the system (between islands) genetic material which has the potential to contribute to further improvement of population quality and to speed up convergence. Migration is inspired by processes in nature ([15], [16]). Most island models exchange migrants in a synchronized way. In our system the exchange of migrants is not synchronized, so in fact our islands are more general.

The key parameters of island models, as nature shows ([17]), are the frequency of migration and the size of the local populations. The bigger the frequency of migration, the bigger the chance that the islands will help each other. On the other side, each hardware and software is limited by data throughput.

Matrix factorization needs migration of much bigger individuals than e.g. continuous optimization, where an individual is a point in an n-dimensional space. In this paper we had to change the architecture of the model used in [11]. In [11] the input problems were TSP (traveling salesman problem of size cca. 1000 cities), BP (bin packing, one dimensional with 1000 items), CO (continuous optimization, 10-dimensional function) and vertex cover (1000 vertexes). A solution of preference learning by matrix factorization on island models is represented by much bigger individuals; a rough estimation of our state space is > #TSP^20. Evolutionary algorithms use incoming individuals in groups and after a while; hence it is advantageous to store them in a front. As the individuals are big and memory is limited, we had to limit the size of the fronts (see Table 2).

3.3 Data

We used data from the movie recommendation domain in the experiments. The effectiveness of parallelization has been verified on the datasets ml-100k¹ and ml-1m². The datasets are formed by movie evaluations (1–5 stars); for the trials we use the triple (user, movie, rating).

The training set consists of four fifths of the dataset, the rest is the test set. Computing derivatives for SGD leverages the training set; the test set is used to evaluate individuals.

¹ http://grouplens.org/datasets/movielens/100k/
² https://grouplens.org/datasets/movielens/1m/

Table 1: Methods and parameters

Algorithm      | Tool                              | Parameters
HillClimbing   | SGD, random rating from each line | numberOfNeighbors=10
RandomSearch   | generation of latent vectors      |
Evolution      | uniform crossover + SGD           | popSize=10, mutationRate=0.9, crossRate=0.1, parentSelector=CompareTwoRandomSelectors
BruteForce     | SGD according to the i-th rating  |
TabuSearch     | SGD, random rating from each line | tabuModelSize=50
Sim.Annealing  | SGD, random rating from each line | temperature=10000, coolingRate=0.002
Diff.Evolution | differential crossover            | popSize=50, F=0.25

Table 2: Parameters of the island model

Parameter            | Value
Number of iterations | 50, period = 60 seconds
Number of islands    | 4 (AMD Opteron 6376)
Neighbors of method  | 3 (distributed to everyone)
Migration frequency  | 5 seconds

Figure 2: Architecture of the system (three islands, each running a method as an agent, connected to a central Manager (planner) and a Monitor (statistics))

3.4 Realization of the experiment

Our main idea is to increase the stochasticity of searching the state space (which is enormous for the movie data). The first level of stochasticity is enabled by the stochastic optimization method. The second level is enabled by several independent islands. The third level is attained by migration. Our system enables all of these. Moreover, all of these run in parallel with mutually assisting (not competing) methods.

In order to be able to obtain the best solution from such a computation resource, the computation must be continuously monitored (Figure 2). The solutions represent individuals that are unpredictably moving across the island model. We can never know when and where the best solution will appear (sometimes the best solution does not appear at the end, see Figure 3).

Our implementation uses the agent middleware Jade³ for the achievement of higher adaptivity of methods (in our three levels of stochasticity).

Here we were facing a main decision: whether to prefer higher adaptivity (based e.g. on Jade) or better effectiveness on specific hardware (based e.g. on C). From pure experimental curiosity we went in the agent based middleware framework direction.

The central brain of the multi-agent based system is the manager of migration, which directs the computation and measures genetic material during the evolution. The use of this system lies in the implementation of methods as agents, which can be developed as adaptive computing resources. The infrastructure of the system is made up of two central points: the Agent-manager, which manages the computation, and the Agent-monitor, which monitors the genetic material in the system.

The software allows multiple ways of monitoring. The monitor statistically processes the quality of the individuals. Another way of observing what happens in the system is to produce the pedigrees of individuals. The basic idea is to enrich every individual with a pedigree that contains information on the involvement of concrete islands in its creation.

For our experiments, we used statistics that are computed from each monitored individual in one iteration of the planner. We will present the results as a measure over the system, monitored along the planner iterations.

Our evolutionary methods (Table 1) use uniform crossover. In the mutation phase they apply stochastic gradient descent on a randomly chosen rating.

Differential evolution combines 3 latent vectors (individuals which are models/solutions): LV1, which is a concatenation of U1^T and I1^T, and similarly LV2 and LV3. The result of the differential operator is the latent vector

    LV_new = LV1 + F ∗ (LV2 − LV3)

where F is 0.25.

³ http://jade.tilab.com/
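The two variation operators described above can be sketched as follows. This is an illustrative re-implementation, not the distributed code: the SGD mutation of Equation 4 with the usual constant factors folded into the step size, and the differential operator applied componentwise to a flattened latent vector; the function names and the explicit error-gradient form are our assumptions:

```python
import random

def sgd_mutation(U, I, ratings, lr=0.01, rng=random):
    """One SGD step on a randomly chosen known rating (Eq. 4).

    Moves u_ik and i_jk so that the squared error e_ij^2 of the
    picked rating decreases; constant factors are folded into lr."""
    (i, j), r = rng.choice(list(ratings.items()))
    e = r - sum(u * v for u, v in zip(U[i], I[j]))
    for k in range(len(U[i])):
        u_old, v_old = U[i][k], I[j][k]
        U[i][k] += lr * e * v_old
        I[j][k] += lr * e * u_old

def differential_operator(lv1, lv2, lv3, F=0.25):
    """LV_new = LV1 + F * (LV2 - LV3), componentwise on the
    flattened, d * (|U| + |I|) long, latent vector."""
    return [a + F * (b - c) for a, b, c in zip(lv1, lv2, lv3)]
```

Repeated calls of sgd_mutation drive the predicted rating toward the observed one; in the island model this step plays the role of mutation for the methods of Table 1.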
3.5 Test design and computing resources

Two pairs of tests were created, comparing single methods and island models, differing in the size of the input MovieLens dataset. The width of the latent vector was set to 10 (best after initial experiments on smaller samples). The tests were run with all initial parameter settings (Table 3) 9 times; as our computations are nondeterministic, this makes a difference (see Figures 4 and 5). These are computationally demanding calculations.

Table 3: Parameters of the test

Parameter           | Value
Number of runs      | 9
Computing nodes     | AMD Opteron 6376, 64GB memory
Latent factor width | 10
Datasets            | ml-100k, ml-1m

4 Results

Our experiments validated two hypotheses: first, that evolution does not necessarily give the best solution, and second, that island models improve results. We will present the results for the two datasets separately. We will show the best solution of 9 runs and the average of all runs (for the distribution see Figures 4 and 5).

On the smaller dataset the winner is always Evolution and the islands always give a better solution.

On the bigger dataset the single-instance winner is simulated annealing (in both minimum and average). With islands the winner is hill climbing (in both minimum and average). On the bigger data we cannot always observe the advantage brought by the parallelization by islands and migration. This can be observed especially with simulated annealing. One possible explanation is that hill climbing does not risk going in a wrong direction as much as simulated annealing (sometimes) does. Bigger data bring a bigger state space and hence risk is necessary, but 50 iterations of the planner is probably not enough. In future research we will run island models with more iterations.

Figure 3: Methods, ML1m, investigation of the median run (comparison of single methods on MFML1m: Single-BruteForce, Single-Evolution, Single-HillClimbing, Single-SimulatedAnnealing, Single-TabuSearch; x: time in planner iterations, 1 iteration = 60 seconds; y: fitness RMSE)

Figure 4: Methods, ML100k, boxplot of results (comparison of single methods and islands on MFML100k: Single and Island variants of BruteForce, Evolution, HillClimbing and TabuSearch; y: fitness RMSE)

5 Conclusions

We showed that island models are a good computing tool for recommender systems. Our experiments have shown the following: island models brought a clear improvement on the smaller dataset; on the bigger data this also holds, except for simulated annealing. Moreover, on the bigger data evolution does not find the best solution.

Our implementation is publicly available on GitHub⁴ and hence enables the repeatability of our experiments (see [1], requirement (v)).

⁴ https://github.com/sbalcar/distributedea

6 Future work

In our future work we plan to extend this to heterogeneous island models. We also plan to develop new planners specifically designed for the recommendation domain. We would also like to consider islands with statistical learning methods. A challenge is the extension to content based recommendation.

References

[1] Tomáš Horváth and André C. P. L. F. de Carvalho. Evolutionary computing in recommender systems: a review of recent research. Natural Computing, 16(3):441–462, 2017.

[2] M. Rezaei and R. Boostani. Using genetic algorithm to enhance nonnegative matrix factorization initialization. In 2010 6th Iranian Conference on Machine Vision and Image Processing, pages 1–5, Oct 2010.
Table 4: Comparison of single methods and island models. Dataset: MFML100k, Runs: 9

Methods               | Single-min | Islands-min | Single-average | Islands-average
BruteForce            | 0.98787115 | 0.94367838  | 0.99088728     | 0.94571058
DifferentialEvolution | 1.50105714 | 1.49096267  | 1.51771324     | 1.49957429
Evolution             | 0.88196787 | 0.87695296  | 0.88705747     | 0.87952888
HillClimbing          | 0.98047412 | 0.97400051  | 0.98207866     | 0.97824895
RandomSearch          | 1.60448414 | 1.58333447  | 1.61703977     | 1.60802325
SimulatedAnnealing    | 1.06110779 | 1.03108626  | 1.07797515     | 1.03806128
TabuSearch            | 0.98048607 | 0.97681064  | 0.98217126     | 0.97963446

Table 5: Comparison of single methods and island models. Dataset: MFML1m, Runs: 9 (* marks the violations of island improvement, shown in red in the original)

Methods               | Single-min | Islands-min  | Single-average | Islands-average
BruteForce            | 1.22268511 | 1.18410586   | 1.23279212     | 1.2033589
DifferentialEvolution | 1.55227103 | 1.55057106   | 1.55780106     | 1.55422412
Evolution             | 1.10254503 | 1.07069968   | 1.13526207     | 1.09140823
HillClimbing          | 0.96832015 | 0.96727131   | 0.97065502     | 0.96840234
RandomSearch          | 1.65483811 | 1.66265835 * | 1.66738744     | 1.6660426
SimulatedAnnealing    | 0.96655732 | 0.97118289 * | 0.97048761     | 0.9729405 *
TabuSearch            | 0.96873215 | 0.96735123   | 0.97106628     | 0.96854622
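The island-improvement comparison behind Table 5 can be reproduced from the printed numbers. A small check, with the minima values copied from Table 5, flags the same violations that the table marks:

```python
# Single-min and Islands-min RMSE values copied from Table 5 (ML1m).
table5_min = {
    "BruteForce":            (1.22268511, 1.18410586),
    "DifferentialEvolution": (1.55227103, 1.55057106),
    "Evolution":             (1.10254503, 1.07069968),
    "HillClimbing":          (0.96832015, 0.96727131),
    "RandomSearch":          (1.65483811, 1.66265835),
    "SimulatedAnnealing":    (0.96655732, 0.97118289),
    "TabuSearch":            (0.96873215, 0.96735123),
}

# A method violates the island-improvement hypothesis when the island
# runs did not reach a lower (better) minimum RMSE than the single runs.
violations = [name for name, (single, islands) in table5_min.items()
              if islands > single]
# -> ['RandomSearch', 'SimulatedAnnealing']
```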
Figure 5: Methods, ML1m, boxplot of results (comparison of single methods and islands on MFML1m: Single and Island variants of HillClimbing, SimulatedAnnealing and TabuSearch; y: fitness RMSE)

[3] Dariush Zandi Navgaran, Parham Moradi, and Fardin Akhlaghian. Evolutionary based matrix factorization method for collaborative filtering systems. In 2013 21st ICEE, pages 1–5, 2013.

[4] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.

[5] L. Peska and P. Vojtas. Using implicit preference relations to improve recommender systems. Journal on Data Semantics, 6(1):15–30, 2017.

[6] John H. Holland. Adaptation in Natural and Artificial Systems. MIT Press, 1992.

[7] Darrell Whitley, Soraya Rana, and Robert B. Heckendorn. The island model genetic algorithm: On separability, population size and convergence. J. Comp. Inf. Tech., 7:33–47, 1998.

[8] Martin Pilát and Roman Neruda. Combining multiobjective and single-objective genetic algorithms in heterogeneous island model. In CEC 2010, pages 1–8, 2010.

[9] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. Trans. Evol. Comp, 6(2):182–197, 2002.

[10] Eckart Zitzler, Marco Laumanns, and Lothar Thiele. SPEA2: Improving the strength Pareto evolutionary algorithm. Technical report, ETH Zurich, 2001.

[11] S. Balcar and M. Pilát. Heterogeneous island model with re-planning of methods. In GECCO'18, accepted as poster, 2018.

[12] Štěpán Balcar. Heterogeneous island model with re-planning of methods (in Czech). Master thesis, Martin Pilát supervisor, 2017.

[13] G. A. Croes. A method for solving traveling salesman problems. Operations Research, pages 791–812, 1958.

[14] R. K. Arora. Optimization: Algorithms and Applications. CRC Press, 2015.

[15] Jinliang Wang and Michael C. Whitlock. Estimating effective population size and migration rates from genetic samples over space and time. Genetics, 163(1):429–446, 2003.

[16] William Astle and David J. Balding. Population structure and cryptic relatedness in genetic association studies. Statist. Sci., 24(4):451–471, 2009.

[17] John Wakeley. The coalescent in an island model of population subdivision with variation among demes. Theoretical Population Biology, 59(2):133–144, 2001.

Acknowledgements. This work was supported by a Ph.D. grant SVV2018-260451 under the supervision of Peter Vojtas.