<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Leveraging Egocentric and Surrounding Environment Data to Adaptively Measure a Personal Air Quality Index</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dang-Hieu Nguyen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Minh-Tam Nguyen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Loc Tai Tan Nguyen</string-name>
          <email>locntt.12@grad.uit.edu.vn</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>27</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>This paper introduces a new solution for measuring the personal air quality index, which reflects how individuals perceive their surrounding environment from an egocentric perspective. Two instances of the solution are introduced and evaluated using the MediaEval 2019 Insights for Wellbeing task dataset and evaluation metric. The first instance calculates the Air Quality Index (AQI) from sensor data, then uses the user's tags and visual features to measure the personal AQI adaptively. The second instance leverages the average value of the user's tags and the features of the route to determine the personal AQI. The performance of these two instances is also discussed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the authors gather evidence from many
reference sources and point out the impact of air pollution
on individuals from several perspectives (e.g., health, psychology).
The pollution factors discussed include environmental
factors (e.g., fine particulate matter PM2.5, nitrogen dioxide NO2,
ozone O3, sulfur dioxide SO2), weather variables (e.g.,
temperature, humidity), urban nature, and traffic. Unfortunately,
most investigations in this domain focus on
measuring the air quality index from sensor data, without
considering how people perceive the air quality around
them.
      </p>
      <p>
        MediaEval 2019 Insights for wellbeing task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] introduces
an interesting subtask of measuring the personal air quality
index (PAQI). The PAQI is defined as a person’s perceived
AQI, as compared to the real AQI calculated from sensor
data. The subtask asks participants to measure the PAQI using
egocentric data (e.g., lifelog images, heartbeat, step counts, user’s
annotations) and surrounding environment data (e.g., air
pollution, weather).
      </p>
      <p>
        The definitions, dataset and evaluation metric of this
subtask are described in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>METHODOLOGY</title>
      <p>As mentioned above, environmental factors, weather
variables, urban nature, and traffic all affect individuals.
Observing the dataset provided by the subtask, we found that
main streets with heavy traffic and fewer trees tend to have a
low PAQI, and vice versa. This observation motivates
our solution, which measures PAQI using AQI, user’s tags,
and visual features. Two instances of the solution are
introduced and evaluated using the MediaEval 2019 Insights
for Wellbeing task dataset and evaluation metric. The first
instance calculates the Air Quality Index (AQI) from sensor
data, then uses the user’s tags and visual features to measure
the personal AQI adaptively. The second instance leverages
the average value of the user’s tags and the features of the route
to determine the personal AQI.</p>
    </sec>
    <sec id="sec-3">
      <title>Data Processing</title>
      <p>First, data along each route are pre-processed to remove
noise and outliers. Missing data are filled in by
interpolation where necessary. Then, two instances (runs)
of the proposed solution are constructed as follows:</p>
      <p>Run 1: From the dataset, we identify a group of users
walking along a specific route. Since the 2018 dataset is
recorded at one-second intervals, we aggregate the recordings
by minute, retaining the highest value of each factor within
each minute. Then we calculate AQI from these
factors (e.g., PM2.5, NO2, O3). Next, visual features are extracted
from images.</p>
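      <p>The per-minute aggregation in Run 1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name per_minute_max, the record layout (a timestamp paired with a dictionary of factor readings), and the sample values are all assumptions made for the example.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from datetime import datetime


def per_minute_max(records):
    """Collapse per-second readings into per-minute maxima.

    `records` is an iterable of (timestamp, {factor_name: value}) pairs
    (a hypothetical layout). Returns {minute: {factor_name: max value}}.
    """
    buckets = defaultdict(dict)
    for ts, factors in records:
        minute = ts.replace(second=0, microsecond=0)  # truncate to the minute
        for name, value in factors.items():
            if value is None:  # skip missing readings
                continue
            prev = buckets[minute].get(name)
            buckets[minute][name] = value if prev is None else max(prev, value)
    return dict(buckets)


# Hypothetical sample readings, not taken from the task dataset.
readings = [
    (datetime(2019, 1, 1, 9, 0, 5), {"PM2.5": 12.0, "NO2": 30.0}),
    (datetime(2019, 1, 1, 9, 0, 40), {"PM2.5": 18.5, "NO2": 28.0}),
    (datetime(2019, 1, 1, 9, 1, 10), {"PM2.5": 9.0, "NO2": None}),
]
per_minute = per_minute_max(readings)
```
]]></preformat>
      <p>Keeping the maximum rather than the mean preserves short pollution peaks within each minute, which matches the stated goal of retaining the highest value of each factor.</p>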
      <p>Run 2: We first collect all data in the same group, then
keep only the data associated with user’s tags. Next, we divide
each segment of a route into four smaller segments. The
aim is to make each segment as straight as possible so that a
circular scan can cover all the tagged points on the segment. Then,
we scan with a radius determined by the distance between
the small segments; if any tagged points fall within this
range, we collect them and calculate the average value of
those user’s tags. For example, assume the distance between line_start
and line_end is 100 m. We divide it into four road segments
of 25 m each, which yields three new points in between.</p>
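      <p>The segment splitting and radius scan of Run 2 might look like the sketch below. Several details are assumptions made for illustration: coordinates are treated as planar positions in metres, the scan circles are centred on the three interior points, the radius equals the 25 m spacing, and the function names and sample tags are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def split_segment(start, end, parts=4):
    """Return the parts-1 interior points that divide start->end evenly."""
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * k / parts, y0 + (y1 - y0) * k / parts)
        for k in range(1, parts)
    ]


def average_tags_near(centers, radius, tagged_points):
    """Average the tag values of points within `radius` of any center.

    `tagged_points` is a list of ((x, y), tag_value) pairs.
    Returns None when no tagged point is in range.
    """
    selected = [
        tag
        for point, tag in tagged_points
        if any(math.dist(point, c) <= radius for c in centers)
    ]
    return sum(selected) / len(selected) if selected else None


# Hypothetical 100 m segment with three tagged points.
line_start, line_end = (0.0, 0.0), (100.0, 0.0)
centers = split_segment(line_start, line_end)   # three interior points
radius = math.dist(line_start, line_end) / 4    # 25 m spacing
tags = [((20.0, 5.0), 3), ((60.0, -10.0), 4), ((50.0, 80.0), 1)]
avg_tag = average_tags_near(centers, radius, tags)
```
]]></preformat>
      <p>With circles centred on the interior points and a radius equal to the spacing, every position on the straight segment, including both endpoints, lies inside at least one circle. In this example the third tagged point is 80 m from the segment and is therefore ignored.</p>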
    </sec>
    <sec id="sec-4">
      <title>Visual Features Extraction</title>
      <p>We use the visual features provided by the task’s organizers.
In addition, we developed a tool that crawls images from Google
Street View using the coordinates provided in the dataset, in order
to enrich the image dataset. Finally, we extract
traffic and tree density from these images.</p>
    </sec>
    <sec id="sec-5">
      <title>PAQI Measurement</title>
      <p>In this section, we use the input data obtained in Section
2.1 (Data Processing) for each run.</p>
      <p>2.3.1 Run 1: We first calculate the AQI
using the AQI calculation formula. Then we use the user’s
tags, traffic density, and tree density to adjust the AQI values.
We build a function to adaptively adjust AQI into PAQI as
follows:</p>
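      <p>
        <disp-formula id="eq-1"><label>(1)</label><tex-math><![CDATA[f(x) = \sum_{i=1}^{n} \left(\mathit{factor}_i \cdot \alpha_i\right)]]></tex-math></disp-formula>
      </p>
      <p>where <inline-formula><tex-math><![CDATA[\sum_{i=1}^{n} \alpha_i = 1]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[\mathit{factor}_i]]></tex-math></inline-formula> is input data such as the user’s
tags and visual features.</p>
      <p>The PAQI value is specified by:</p>
      <p>
        <disp-formula id="eq-2"><label>(2)</label><tex-math><![CDATA[\mathrm{PAQI} = \mathrm{AQI} \cdot f(x)]]></tex-math></disp-formula>
      </p>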
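      <p>Equations (1) and (2) translate directly into a few lines of code. The sketch below is illustrative only: the function names and the sample factor values and weights are assumptions, and the factors are assumed to be pre-scaled so that f(x) stays close to 1.</p>
      <preformat><![CDATA[
```python
def adjustment(factors, alphas):
    """Equation (1): f(x) = sum_i factor_i * alpha_i, with sum_i alpha_i = 1."""
    if len(factors) != len(alphas):
        raise ValueError("factors and alphas must have the same length")
    if abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("alphas must sum to 1")
    return sum(f * a for f, a in zip(factors, alphas))


def paqi(aqi, factors, alphas):
    """Equation (2): PAQI = AQI * f(x)."""
    return aqi * adjustment(factors, alphas)


# Hypothetical values: user's tags, traffic density, tree density.
factors = [1.2, 0.9, 0.6]
alphas = [0.5, 0.25, 0.25]
value = paqi(80.0, factors, alphas)
```
]]></preformat>
      <p>Here f(x) is approximately 0.975, so an AQI of 80 is scaled to a PAQI of about 78.</p>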
      <p>Finally, we adjust the value of αi according to a route to
get the final PAQI. The value of αi is calculated based on the
factors’ values. If the factors’ values are high, αi increases
and the PAQI is high. If factors’ values are low, αi decreases
and the PAQI is low.</p>
      <p>We set the parameters as follows: factor1 ← user’s tags,
factor2 ← traffic density, factor3 ← tree density, with α1 + α2 +
α3 = 1. First, we initialize α1 = α2 = α3 = 1/3. Then we use an
ad-hoc approach to calculate the factors’ values and adjust
the value of α for each factor. For factor1,
if its value is larger than a predefined threshold (2.5 in our
case), α1 increases; otherwise, α1 decreases. For factor2,
if its value is high, α2 decreases; otherwise, α2 increases.
For factor3, if its value is high, α3 increases; otherwise, α3
decreases. This optimization loop continues until
convergence, with each α bounded between a minimum of 0
and a maximum of 1.</p>
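      <p>One way such an adjustment loop could be organised is sketched below. The paper does not specify the update rule, so most of this is assumed for illustration: the step size, its decay, the stopping tolerance, the renormalisation that keeps the weights summing to 1, and the 0.5 thresholds for traffic and tree density. Only the equal initial weights, the 2.5 threshold for user's tags, the direction of each adjustment, and the [0, 1] bounds come from the text.</p>
      <preformat><![CDATA[
```python
def adjust_alphas(factors, thresholds, directions, step=0.05, decay=0.5,
                  tol=1e-4, max_iter=100):
    """Nudge each alpha up or down, clip to [0, 1], renormalise to sum 1.

    directions[i] is +1 when a high factor should raise alpha_i
    (user's tags, tree density) and -1 when it should lower it (traffic).
    The step, decay, tol and max_iter parameters are assumptions.
    """
    n = len(factors)
    alphas = [1.0 / n] * n  # start from equal weights
    for _ in range(max_iter):
        proposed = []
        for a, f, t, d in zip(alphas, factors, thresholds, directions):
            sign = d if f > t else -d  # high factor follows d, low factor opposes it
            proposed.append(min(1.0, max(0.0, a + sign * step)))
        total = sum(proposed)
        if total == 0:  # degenerate case: keep the previous weights
            break
        proposed = [p / total for p in proposed]
        change = max(abs(p - a) for p, a in zip(proposed, alphas))
        alphas = proposed
        step *= decay  # a shrinking step forces the loop to settle
        if change < tol:
            break
    return alphas


# Hypothetical route: high tags, heavy traffic, few trees.
route_factors = [3.0, 0.8, 0.2]
route_thresholds = [2.5, 0.5, 0.5]  # 2.5 is from the text; 0.5 values are assumed
route_directions = [+1, -1, +1]
route_alphas = adjust_alphas(route_factors, route_thresholds, route_directions)
```
]]></preformat>
      <p>For this hypothetical route the weight on user's tags rises above 1/3 while the weights on traffic and tree density fall below it, and the three weights still sum to 1.</p>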
      <p>2.3.2 Run 2: First, we use the line_start and line_end
points to determine the features of routes 1, 2, and 3, which are
listed in Table 1. Second, we calculate the average value
of the user’s tags. Third, we calculate the weight of each route,
w<sub>r</sub> (Equation 3), where PAQI<sub>input</sub> is based on the
Development Dataset of routes 1, 2, and 3, and avg(user’s tags) is the
average of the user’s tags on routes 1, 2, and 3.</p>
      <p>Because we cannot find the “Mountain trail” feature in
the Development Dataset, we assume that the weight of the “Mountain
trail” feature has the same value as that of the “Bayside path” feature. Table
1 shows the weights of routes 1, 2, and 3 when running on
the development dataset.</p>
      <p>Finally, PAQI<sub>output</sub>, the predicted value for routes 4 and 5,
is obtained from w<sub>r</sub>, the weight from Table 1, and
avg(user’s tags), the average of the user’s tags on routes 4 and 5.</p>
    </sec>
    <sec id="sec-6">
      <title>RESULTS AND ANALYSIS</title>
      <p>The experimental results on the training dataset are
reported in Table 2. The results show that we can measure PAQI
with acceptable accuracy. Table 3 shows the results
on the testing dataset. Table 2 does not include
the result of Run 2 because we only obtain the route
weights from the Development data and use them to infer PAQI for
the Testing data.</p>
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION</title>
      <p>In this paper, we report our solution to the challenge posed
by the MediaEval 2019 Insights for Wellbeing task, subtask 2.
We introduce an ad-hoc approach that adaptively weights the
user’s tags, traffic density, and tree density observed along a route
to adjust the AQI value into an acceptable
personal AQI value.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Minh-Son</given-names>
            <surname>Dao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Peijiang</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tomohiro</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Koji</given-names>
            <surname>Zettsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Duc-Tien</given-names>
            <surname>Dang-Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cathal</given-names>
            <surname>Gurrin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ngoc-Thanh</given-names>
            <surname>Nguyen</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Overview of MediaEval 2019: Insights for Wellbeing Task: Multimodal Personal Health Lifelog Data Analysis</article-title>
          . In
          <source>MediaEval 2019 Working Notes (CEUR Workshop Proceedings)</source>
          .
          <publisher-name>CEUR-WS.org</publisher-name>
          (<ext-link ext-link-type="uri" xlink:href="http://ceur-ws.org">http://ceur-ws.org</ext-link>),
          <publisher-loc>Sophia Antipolis, France</publisher-loc>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
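          <string-name>
            <given-names>Siqi</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jianghao</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cong</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaonan</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Matthew E</given-names>
            <surname>Kahn</surname>
          </string-name>
          .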
          <year>2019</year>
          .
          <article-title>Air pollution lowers Chinese urbanites' expressed happiness on social media</article-title>
          .
          <source>Nature Human Behaviour</source>
          <volume>3</volume>
          ,
          <issue>3</issue>
          (
          <year>2019</year>
          ),
          <fpage>237</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>