=Paper= {{Paper |id=Vol-1917/paper14 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1917/paper14.pdf |volume=Vol-1917 }} ==None== https://ceur-ws.org/Vol-1917/paper14.pdf
                    PRIMPing Boolean Matrix Factorization by
                   Proximal Alternating Linearized Minimization

                                 Sibylle Hess, Nico Piatkowski, and Katharina Morik

                          TU Dortmund, Computer Science, LS 8, 44221 Dortmund, Germany


                        Abstract. We propose a novel Boolean matrix factorization algorithm,
                        based on recent results from optimization theory. We demonstrate the
                        superior robustness of the new approach in the presence of several kinds
                        of noise and the interpretability on synthetic and real-world data.

                        Keywords: Boolean Matrix Factorization, Minimum Description Length,
                        Proximal Minimization, Nonconvex-Nonsmooth Minimization


                 Given the task to explore binary data, Boolean Matrix Factorization (BMF) is a
                 method of choice. BMF yields a simultaneous clustering of rows and columns of
                 the data matrix into binary, thus interpretable, cluster representatives. Further-
                 more, state-of-the art methods automatically estimate the number of prevalent
                 clusters through the application of the minimum description length principle [2].
                                                     Unfortunately, existing algorithms are greedy
                                                     and rely on heuristics to solve the NP-
                                                     hard problem of BMF. We propose with the
                      Picture     Factorization 1    procedure Primp to apply the optimization
                                                     scheme PALM to a real-valued relaxation of
                                                     the objective. PALM enables the minimiza-
                                                     tion of the generally nonconvex description
                  Factorization 2 Factorization 3    length and the nonsmooth penalization of
                                                     non-binary values under convergence guaran-
                 Fig. 1: Reconstructions of the tees. A rounding procedure rounds the result
                 image top left by three outer to binary values and decides over the number
                 products returned by Primp. of returned clusters. For more information,
                 Best viewed in color.               we refer to [1].

                 Acknowledgments SFB 876, projects A1 and C1.

                 References
                 1. Sibylle Hess, Katharina Morik, and Nico Piatkowski. 2017. The PRIMPING routine–
                    Tiling through proximal alternating linearized minimization. Data Min. Knowl. Dis-
                    cov. 31, 4 (July 2017), 1090-1131.
                 2. Pauli Miettinen and Jilles Vreeken. 2014. MDL4BMF: Minimum Description Length
                    for Boolean Matrix Factorization. ACM Trans. Knowl. Discov. Data 8, 4, Article 18
                    (October 2014).




Copyright © 2017 by the paper’s authors. Copying permitted only for private and academic purposes.
In: M. Leyer (Ed.): Proceedings of the LWDA 2017 Workshops: KDML, FGWM, IR, and FGDB.
Rostock, Germany, 11.-13. September 2017, published at http://ceur-ws.org