<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <abstract>
        <p>Aiming at high-performance computing for deep neural networks (DNNs), we conduct comprehensive studies of the layer-wise characteristics of typical DNN architectures and of the instruction-level workloads of DNN inference on Single Instruction Multiple Data (SIMD) CPUs. These data provide the fundamental theoretical basis for the subsequent high-performance computing research. At the software level, we propose two methods for compressing DNNs: a refined channel-level pruning method based on layer-wise sparsity and channel-wise importance indexes (SI-Pruning), and layer-level pruning (LL-Pruning). At the hardware level, we first accelerate DNNs at the SIMD-instruction level, and we then implement DNN acceleration on FPGA. Because DNN optimization ultimately aims to improve the efficiency of feature extraction, we further propose an enhanced pooling function, max-average pooling (FMAPooling), and an improved channel-attention mechanism (FMAttn), to strengthen the feature extraction capability of DNNs.</p>
      </abstract>
    </article-meta>
  </front>
  <body />
  <back>
    <ref-list />
  </back>
</article>