<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detection of odor-related objects in images based on everyday odors in Japan</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuki Eda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haruka Matsukura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuji Nozaki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maki Sakamoto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The University of Electro-Communications</institution>
          ,
          <addr-line>1-5-1 Chofugaoka, Chofu, Tokyo 182-8585</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>27</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>This paper reports on the detection of odor-related objects in images. An image data set of odor-related objects classified into 12 categories is built to train an object detection model. The results show that around 60% accuracy is obtained with the trained network. The detection system for odor-related objects has the potential to be applied not only to entertainment purposes but also to olfactory sensing systems, which may contribute to human well-being.</p>
      </abstract>
      <kwd-group>
        <kwd>odor</kwd>
        <kwd>object detection</kwd>
        <kwd>object recognition</kwd>
        <kwd>well-being</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        This paper reports on the detection of odor-related objects
in images. Object detection has received great attention
and has been widely used for various purposes [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Generally
speaking, object detection is a technique for recognizing
and locating objects in images based on visual features.
Meanwhile, the authors have attempted to detect objects
based on whether or not they emit a smell. To this end, we
built an image data set of odor-related objects to train an
object detection model based on an algorithm
called YOLOv7 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], as shown in Fig. 1.
      </p>
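      <p>For illustration, a minimal sketch of running detection with a trained YOLOv7 model in Python is given below. It assumes the torch.hub entry point of the WongKinYiu/yolov7 repository; the weights file "odor12.pt" and the input image are hypothetical examples, not artifacts of this work.</p>
      <preformat>
# A minimal sketch (not the authors' exact pipeline): load YOLOv7 weights
# via torch.hub and run detection on one image.
import torch

# "odor12.pt" is a hypothetical checkpoint trained on the 12 odor classes.
model = torch.hub.load("WongKinYiu/yolov7", "custom", "odor12.pt")

results = model("kitchen.jpg")  # hypothetical input image
results.print()                 # detected classes, confidences, and boxes
      </preformat>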
      <p>
        The data set of odor-related images was built with reference to
a classification of odors [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The perception of odors varies
with various factors, including culture and environment.
In this research, we focus on the odors perceived in
everyday life in Japan. There is already a report in which
detection of odor-related objects was attempted [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. That
research, however, addressed only four specific
categories: aqua, coffee, orange, and rose. Our research
handles 12 categories, which are broader
and vaguer.
      </p>
      <p>
        Many land animals, including humans, have
a keen and sophisticated sense of smell and are able
to distinguish numerous odors by detecting faint
chemical substances in the air. Various studies have
reported sensing systems for chemical substances
that mimic the olfactory mechanism of animals so as to differentiate
odors [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This kind of sensing system is called an electronic
nose (e-nose) and is applied in a wide variety of situations.
Some researchers are considering the e-nose as a tool
to overcome anosmia, which is known as smell
blindness. However, the e-nose systems developed so far have
not reached a level at which they can substitute for animals' olfaction.
Our research has the potential to assist such sensing systems
by enhancing the accuracy of odor discrimination.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Image data set of odor-related objects</title>
        <p>rottenness/feces, sulfur, dust, burning odor,
gasoline/rubber, and thinner. These seven groups were excluded from
our data set at the current stage because they may cause
discomfort when the corresponding odors are actually presented
to users in an application of our detection system.</p>
        <p>
          The images of coffee, flowers, fruits, menthol,
incense, and woods were collected from Google Images
using a Python library called google-images-download.
The images of sweets were collected from ImageNet.
The images of curry, vinegar, garlic, soy sauce, and
butter were collected from UECFOOD-256, a meal image
data set that includes many representative images of
Japanese dishes [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The collected images were annotated
with bounding boxes using annotation tools. The
number of collected images varies across odor groups: the
minimum is 292 images for menthol and the maximum
is 1559 images for soy sauce.
        </p>
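        <p>As an illustration of this collection step, a minimal sketch using the google-images-download library is given below; the keyword, limit, and output directory are hypothetical values, not the exact settings used in this work.</p>
        <preformat>
# A minimal sketch of collecting images for one odor group with the
# google-images-download library mentioned above.
from google_images_download import google_images_download

downloader = google_images_download.googleimagesdownload()
# Keyword, limit, and output directory are illustrative values only.
downloader.download({"keywords": "coffee",
                     "limit": 300,
                     "output_directory": "dataset/coffee"})
        </preformat>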
        <p>As for the images of vinegar, garlic, soy sauce, and
butter, we collected images of dishes that include them as
seasonings and spices rather than images of the ingredients themselves.
For example, for garlic, images of various foods smelling
of garlic, such as Chinese dumplings and fried rice, were
selected. To this end, a website of Japanese cooking recipes
called Delish Kitchen was used to find dishes that include
the four classes. Four to ten different dishes were
extracted for each class, and images corresponding to
the extracted dishes were taken from UECFOOD-256, as sketched below.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Training and Evaluation</title>
        <p>Among various object detection models, the YOLOv7
model was employed in this paper because it has a high
analysis speed and enables real-time detection.
Evaluation of the precision of object detection is conducted
with the mean average precision (mAP) value, which is an
index commonly used in this research field.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments and Discussion</title>
      <p>The YOLOv7 model uses ELAN and E-ELAN modules in its basic
architecture, which allow for faster processing. In
addition, the model uses an auxiliary loss with a guided label
assignment strategy, which aims to improve recall.</p>
      <p>This detection model was trained for 300 epochs
on 6,356 images with a batch size of 8, starting from the pre-trained
model "yolov7-e6.pt". The percentage of correct
detections for each class is summarised in the confusion
matrix shown in Fig. 2. Finally, the mean of the AP values
(mAP) reached 0.66 under the condition that the
intersection over union (IoU) threshold was 0.5.</p>
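      <p>For illustration, the training run described above could be launched as sketched below; train_aux.py is the trainer provided by the YOLOv7 repository for auxiliary-head models such as yolov7-e6, and "odor12.yaml" is a hypothetical data set configuration for our 12 classes.</p>
      <preformat>
# A sketch of the training invocation, assuming the YOLOv7 repository's
# command-line interface; "odor12.yaml" is a hypothetical data config.
import subprocess

subprocess.run([
    "python", "train_aux.py",
    "--weights", "yolov7-e6.pt",   # pre-trained checkpoint (Section 3)
    "--data", "odor12.yaml",       # hypothetical 12-class data set config
    "--epochs", "300",             # number of epochs reported above
    "--batch-size", "8",           # batch size reported above
], check=True)
      </preformat>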
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>This paper reported on the detection of odor-related
objects in images. A detection model was trained with a data
set of images covering 12 classes of odor-related
objects. As a result, an mAP of 0.66 was obtained. In
future work, improvement of the detection accuracy will be
addressed by using techniques such as label smoothing.
This work was supported by JSPS KAKENHI Grant
Number 22K12124.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Jiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <article-title>A survey of deep learning-based object detection</article-title>
          ,
          <source>IEEE Access</source>
          <volume>7</volume>
          (
          <year>2019</year>
          )
          <fpage>128837</fpage>
          -
          <lpage>128868</lpage>
          . doi:10.1109/ACCESS.2019.2939201.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bochkovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-Y. M.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <article-title>YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors</article-title>
          ,
          <source>arXiv preprint arXiv:2207.02696</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Saito</surname>
          </string-name>
          ,
          <article-title>Expressions of offensive odors and everyday odors using words (in Japanese)</article-title>
          ,
          <source>Journal of Japan Association on Odor Environment</source>
          <volume>44</volume>
          (
          <year>2013</year>
          )
          <fpage>363</fpage>
          -
          <lpage>379</lpage>
          . doi:10.2171/jao.44.363.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Seeing is smelling: Localizing odor-related objects in images</article-title>
          ,
          <source>in: Proceedings of the 9th Augmented Human International Conference</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lekha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suchetha</surname>
          </string-name>
          ,
          <article-title>Recent advancements and future prospects on e-nose sensors technology and machine learning approaches for non-invasive diabetes diagnosis: A review</article-title>
          ,
          <source>IEEE Reviews in Biomedical Engineering</source>
          <volume>14</volume>
          (
          <year>2021</year>
          )
          <fpage>127</fpage>
          -
          <lpage>138</lpage>
          . doi:10.1109/RBME.2020.2993591.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kawano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yanai</surname>
          </string-name>
          ,
          <article-title>Automatic expansion of a food image dataset leveraging existing categories with domain adaptation</article-title>
          ,
          <source>in: Proc. of ECCV Workshop on Transferring and Adapting Source Knowledge in Computer Vision (TASK-CV)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>