<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Role of the Human Expert in the Era of Big Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emille E. O. Ishida</string-name>
          <email>emilleishida@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNRS/Laboratoire de Physique de Clermont (LPC), Université Clermont-Auvergne (UCA)</institution>
          ,
          <addr-line>Clermont Ferrand</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Abstract</title>
      <p>The full exploitation of the next generation of large-scale photometric surveys depends
heavily on our ability to provide a reliable early-epoch classification based solely on
photometric data. In preparation for this scenario, there have been many attempts to
apply different machine learning algorithms to a series of classification problems in
astronomy. Although these methods achieve varying degrees of success, textbook
machine learning methods fail to address a crucial issue: the spectroscopic (training)
samples are not representative of the photometric (target) samples. In this talk I will
show how Active Learning (or optimal experiment design) can be used as a tool for
optimizing the construction of spectroscopic samples for classification purposes. I will
present results showing that designing spectroscopic samples from the beginning of the
survey can achieve optimal classification results with far fewer spectra, and show how
this strategy is being applied to the current ZTF alert stream by the Fink broker. I will
also describe how such strategies have proven effective in the search for scientifically
interesting anomalies within the efforts of the SNAD collaboration.</p>
      <p>Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>