<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Robust Spectral Clustering for Noisy Data</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>1. Bojchevski</institution>
          ,
          <addr-line>A., Matkovic, Y., Gunnemann, S.</addr-line>
          <institution>: Robust Spectral Clustering for Noisy Data. ACM SIGKDD Conference on Knowledge Discovery and Data Mining</institution>
          ,
          <addr-line>2017</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Aleksandar Bojchevski</institution>
          ,
          <addr-line>Yves Matkovic, and Stephan Gunnemann</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technical University of Munich, Munich, Germany Department of Informatics and Institute for Advanced Study</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <abstract>
        <p>Spectral clustering is one of the most prominent clustering approaches, with a multitude of successful applications from computer vision to network analysis. Since spectral clustering relies on a similarity graph only, it is applicable in many domains. However, one of the biggest limitations of spectral clustering is its sensitivity to noisy input data. In this work, we introduce a principle to robustify spectral clustering. We propose a sparse and latent decomposition of the similarity graph used in spectral clustering. Thus, instead of operating on the original graph, the spectral clustering will be performed on the latent representation. In our model, we jointly learn the spectral embedding as well as the latent decomposition { thus, enhancing the clustering performance overall. We provide algorithmic solutions for our model for all three established versions of spectral clustering using di erent Laplacian matrices. For our solutions we relate to principles such as Eigenvalue perturbation and the multidimensional Knapsack problem. In each case, the complexity of the overall method is linear in the number of edges. Our experimental analysis con rms the signi cant potential of our approach for robust spectral clustering, with up to 15 percentage points improvement in accuracy on real-world data compared to standard spectral clustering. Moreover, we propose two novel measures { local purity and global separation { which enable us to evaluate the intrinsic quality of an embedding without relying on a speci c clustering technique.</p>
      </abstract>
    </article-meta>
  </front>
  <body />
  <back>
    <ref-list />
  </back>
</article>