<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How to Cope with Bias in Wellbeing AI? - Towards Fairness in Wellbeing AI by Personal and Long-term Evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Keiki Takadama</string-name>
          <email>keiki@inf.uec.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The University of Electro-Communications</institution>
        </aff>
      </contrib-group>
      <fpage>4</fpage>
      <lpage>7</lpage>
      <abstract>
        <p>This paper focuses on fairness in ML (machine learning), meaning that the output of ML should not be “biased,” and aims to clarify bias in wellbeing AI. From an analysis of bias from the viewpoint of healthcare, the bias in wellbeing AI can be reduced by employing personal and long-term evaluation, while many biases arise in ML in general. To investigate the effectiveness of personal and long-term evaluation, our previous research conducted a human subject experiment focusing on the sleep of aged persons in a care house and found that our wellbeing AI based on personal and long-term evaluation succeeded in extracting knowledge for good sleep and in estimating the mind change of an aged person from her sleep quality change.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>“Can ML (machine learning) provide a fair decision?” To
answer this question, this paper starts with one example. As
is well known, Amazon developed an ML-based personnel
recruitment system but stopped it in 2018 because women
tended not to be recruited in comparison with men, for the
reason that most of the input data for the ML was men’s
data (Dastin, 2018). This example suggests the importance
of fairness in ML. In other words, the output of ML should
be fair, i.e., should not be “biased.” What should be noted here
is that many healthcare systems based on ML (hereafter
called “wellbeing AI”) also have a risk of providing
biased outputs, and such outputs are critical for our
daily life. Given this, this paper aims to investigate
what kinds of bias arise in wellbeing AI and how such
biases can be reduced. For this issue, the
paper first explains bias in general, focusing on bias
on the Internet and bias in ML, and then clarifies bias in
wellbeing AI.</p>
    </sec>
    <sec id="sec-2">
      <title>Bias on the Internet</title>
      <p>According to <xref ref-type="bibr" rid="ref1">Baeza-Yates (2018)</xref>, the following
seven biases arise on the Web: (1) activity bias, (2) data bias,
(3) sampling bias, (4) algorithmic bias, (5) interaction bias,
(6) self-selection bias, and (7) second-order bias. The essential
differences among them are summarized as follows.
(1) Activity bias</p>
      <p>This bias arises from the different numbers of active and
silent users. For example, only the top 4% of Amazon users
post reviews, which means that we cannot receive
messages from all users; i.e., the messages we do receive are biased.
(2) Data bias</p>
      <p>This bias arises from differences in the amount of data. For
example, Western faces tend to outnumber Asian faces in
face-picture datasets such as the MS (Microsoft) celebrity
dataset.
(3) Sampling bias</p>
      <p>This bias arises from the fact that sampled data does
not always follow the true distribution. For example, the
asthma patient rate near a highway tends to be higher
than the rate over the whole area. This means that data from
a big city differs from data for the whole area.
(4) Algorithmic bias</p>
      <p>This bias arises from the different outcomes produced by
different algorithms. For example, a search ranking by
Google differs from the ranking by Bing. This means
that users’ behaviors are biased by the search engine
they use.
(5) Interaction bias</p>
      <p>This bias arises from the different interactions induced
by web presentation. Consider the medicine list
on a pharmacy website, for example: it can be
displayed one item at a time or as all images at once. In the
one-by-one representation, users can hardly see the less
prioritized medicines because they need to scroll the web
page to find them. In the all-images representation, on the
other hand, users tend to look at the upper-left image
because we usually read a line from left to right and lines
from top to bottom. Such
different representations cause bias in the selection of medicines.
(6) Self-selection bias</p>
      <p>This bias arises from the different numbers of users who
are or are not willing to participate. For example, many
questionnaires are returned by healthy persons but not
by non-healthy persons. This is because healthy
persons do not hesitate to share their health information,
while non-healthy persons do not
want to report their health information honestly because they
worry about it.
(7) Second-order bias</p>
      <p>This bias arises from an original bias. After biased
information (in a high ranking) is spread, for example,
active users post other messages related to that
information; such messages are then sampled with high
probability and raise the information’s rank in the search engine. This cycle
amplifies the original bias.</p>
    </sec>
    <sec id="sec-3">
      <title>Bias in Wellbeing AI</title>
      <p>
        To clarify the bias in wellbeing AI,
let us start by simplifying bias from the viewpoint of ML.
According to Mehrabi’s survey
        <xref ref-type="bibr" rid="ref2">(Mehrabi et al. 2022)</xref>
        , bias in
ML arises in the cycle of (i) users, (ii) data, and (iii)
algorithm, as shown in Fig. 1. The connection of the seven biases
on the Web to the cycle from (i) to (iii) is summarized as follows.
Firstly, the activity bias and self-selection bias arise in the
part of the cycle from “user” to “data” because both biases are caused
by users and affect data. Secondly, the data bias and sampling
bias arise from “data” to “algorithm” because
both biases are found in data and affect the algorithm. Thirdly,
the algorithmic bias and interaction bias arise
from “algorithm” to “user” because both biases are caused
by the algorithm and affect users. Finally, the second-order bias
arises over the whole cycle, the same as on the
Web.
      </p>
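      <p>The mapping above can be sketched as a small data structure (a hypothetical illustration of the correspondence described in the text, not part of any actual system; the edge labels are our own):</p>

```python
# Edges of the ML bias cycle (Mehrabi et al. 2022): user -> data -> algorithm -> user.
CYCLE_EDGES = ("user->data", "data->algorithm", "algorithm->user")

# Where each of the seven Web biases (Baeza-Yates 2018) arises in that cycle,
# following the mapping in the text. Labels are illustrative only.
BIAS_EDGE = {
    "activity bias": "user->data",
    "self-selection bias": "user->data",
    "data bias": "data->algorithm",
    "sampling bias": "data->algorithm",
    "algorithmic bias": "algorithm->user",
    "interaction bias": "algorithm->user",
    "second-order bias": "whole cycle",  # spans the entire loop
}

def biases_on_edge(edge):
    """Return the biases that arise on a given edge of the cycle."""
    return sorted(b for b, e in BIAS_EDGE.items() if e == edge)
```

      <p>For example, <code>biases_on_edge("user->data")</code> returns the two biases that are caused by users and affect data, mirroring the first step of the summary above.</p>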
      <p>To consider the features of wellbeing within this cycle in
which the biases of ML arise, connected with the biases on the Web, the
following features should be taken into consideration.
⚫ Personal information</p>
      <p>Information that is good for others is not always good for oneself. For
example, the knowledge of good sleep for a certain person
is not always useful for other persons. This indicates that
personal data is very important in wellbeing.
⚫ Long-term evaluation</p>
      <p>Evaluating only current health is not enough, because
keeping good health and achieving better health (a better life) are more
important than current health alone. This indicates that
long-term evaluation is very important in wellbeing.
From the viewpoint of personal information and
long-term evaluation, the seven biases do not arise or can be
reduced for the following reasons, as shown in Figure 2. Firstly,
the activity bias and self-selection bias do not arise because
the data comes from only one person: the
“single” user provides the “personal” data. Secondly, the
data bias and sampling bias can be reduced if we can obtain
long-term daily data, because such data is
less heavily biased than short-term daily
data owing to the large amount of data. Thirdly, the influence
of the algorithmic bias and interaction bias is very small because
only one person is affected. Finally, the second-order bias
can be reduced because the other biases in the cycle are reduced
for the above reasons. From this analysis, the algorithm in
wellbeing AI analyzes the “personal” data that comes from
the “single” user and returns the result to that user. This
indicates that “personal and long-term evaluation”
(precisely, personal and long-term evaluation based on
personal data) is important for fairness in AI.</p>
    </sec>
    <sec id="sec-4">
      <title>Personal and long-term evaluation</title>
      <p>The goals of many examples of wellbeing AI are roughly
classified into the following two categories.
⚫ Keeping good health (not getting a disease)
Since many patients, such as those with dementia, diabetes, and sleep
apnea syndrome (SAS), do not want their health to worsen, it is
important to find something wrong early.
For this issue, personal and long-term evaluation is
needed for early detection.
⚫ Better health (improving activities)</p>
      <p>For better health, it is important to know (measure) the
accumulated small progress and changes in activities of
daily living (ADL) for aged persons, in performance after a
nap for office workers, and in sleep quality for all ages. For
this issue, personal and long-term evaluation is also
needed.</p>
      <p>
        Among these, this paper focuses on the sleep of aged persons in a
care house because sleep is significant for aged persons. For
example, an aged person easily wakes up due to light sleep and
may wander at midnight. Sleep can also carry a
message about the mind change of aged persons when their sleep
quality changes from good to bad. This is because persons tend
to have deep sleep when free of anxiety but change to light sleep
when they worry about something. To tackle these issues,
our previous research developed a wellbeing AI system,
for the first issue, to extract knowledge for good sleep
        <xref ref-type="bibr" rid="ref3">(Takadama et al. 2015)</xref>
        and, for the second issue, to estimate the
mind change of an aged person from sleep
        <xref ref-type="bibr" rid="ref4">(Takadama 2013)</xref>
        .
These studies took the approach of personal and
long-term evaluation (in detail, we investigated the data of
individual persons over one year). Technically, we
developed a sleep quality estimation system based on vital
vibration data from a pressure sensor
        <xref ref-type="bibr" rid="ref5">(Harada et al. 2016)</xref>
        .
      </p>
      <sec id="sec-4-1">
        <title>Knowledge extraction for good sleep</title>
        <p>In our experiment, the daily activities and sleep quality were
recorded every day. In one day, many activities, such as
meals, rehabilitation, and bathing, are scored as integer
values. For example, when a full amount of a meal is eaten, the
score is 3; when there is no rehabilitation, the score is 0. In
addition to the daily activities, the sleep quality (i.e., the ratio
of deep sleep) is estimated by our sleep stage estimation,
which classifies each sleep stage as deep or light sleep.</p>
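        <p>The daily record described above can be sketched as follows (a minimal illustration: only the meal = 3 for a full meal and rehabilitation = 0 for no rehabilitation scores come from the text; the field names and other values are assumptions):</p>

```python
# One day's record: integer activity scores plus the estimated sleep quality.
# Sleep quality is the ratio of deep-sleep epochs, as described in the text.
# Field names and the example values are illustrative, not the real dataset.
def make_daily_record(day, meal, rehabilitation, bathing, sleep_stages):
    """sleep_stages: a sequence of 'deep'/'light' labels, one per sleep epoch."""
    deep = sum(1 for s in sleep_stages if s == "deep")
    return {
        "day": day,
        "meal": meal,                      # 3 = ate the full amount
        "rehabilitation": rehabilitation,  # 0 = no rehabilitation that day
        "bathing": bathing,
        "sleep_quality": deep / len(sleep_stages),  # ratio of deep sleep
    }

record = make_daily_record("day-001", meal=3, rehabilitation=0, bathing=1,
                           sleep_stages=["deep", "deep", "light", "deep"])
```

        <p>With such records accumulated over a year for one person, the relation between activity scores and the deep-sleep ratio can be mined person by person, which is the setting of the experiment above.</p>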
        <p>Figure 3 shows the extracted knowledge for good sleep. For person
A, when the aged person had rehabilitation in the AM, he
became tired and mostly took a nap, which caused him not to
sleep well at night. For this problem, our wellbeing AI
suggested changing the time of rehabilitation from
AM to PM. As a result, he could have a deep sleep. What
should be noted here is that this knowledge is not always
useful for other persons. For person B, when the aged person
had rehabilitation in the PM, he lost his appetite due to the tiredness of
rehabilitation and could not eat dinner as usual, which caused him
not to sleep well because of hunger at night. For this
problem, our wellbeing AI suggested changing the time of
rehabilitation from PM to AM. As a result, he could have
a deep sleep. These results clearly show that the knowledge
for good sleep differs among persons.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Mind change estimation of aged person</title>
        <p>Figure 4 shows the sleep quality of an aged person with diabetes
before and after the Great East Japan Earthquake, where
the blue dots indicate deep sleep while the red dots
indicate light sleep. The horizontal axis (f1) indicates the
achievement degree of what the aged person wants to do,
while the vertical axis (f2) indicates the achievement degree
of what a care worker wants to do for the aged person. In
this figure, before the earthquake, the blue dots are located on the right side while
the red dots are located on the left side.
This is because the aged person tended to have a deep sleep
when she could achieve her activities (such as eating as
usual), not being worried about anything, while she
tended to have a light sleep when it was hard for her to achieve her
activities (such as eating less than usual) because she was
worried about something.</p>
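        <p>The before-earthquake pattern in Figure 4, with deep-sleep nights on the right (high f1) and light-sleep nights on the left, can be sketched as a simple separation check (a hypothetical illustration; the numbers below are invented, not the experimental data):</p>

```python
# Each night: (f1, f2, sleep), where f1 is the person's own achievement degree,
# f2 the care worker's, and sleep is "deep" (blue dot) or "light" (red dot).
# The values are made up for illustration only.
nights_before = [
    (0.9, 0.7, "deep"), (0.8, 0.5, "deep"),
    (0.2, 0.6, "light"), (0.3, 0.4, "light"),
]

def separated_along_f1(nights):
    """True if every deep-sleep night has a higher f1 than every light-sleep
    night, i.e., the blue dots sit to the right of the red dots."""
    deep_f1 = [f1 for f1, _, s in nights if s == "deep"]
    light_f1 = [f1 for f1, _, s in nights if s == "light"]
    return min(deep_f1) > max(light_f1)
```

        <p>Under this sketch, the pre-earthquake nights separate along f1, whereas adding a light-sleep night with a high f1 (as happened after the earthquake, when the dots became mixed) breaks the separation.</p>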
        <p>What should be noted here is that the blue and red dots
were mixed after the earthquake, which possibly signaled
some mind change of the aged person. For
this issue, our wellbeing AI estimated that the amount of
breakfast should change from full to medium and the time of
rehabilitation should change from none to AM. After
these changes, she could sleep as usual. To verify these
suggestions, the care workers asked her, and she said that she
had not been willing to eat a full amount of breakfast because of news
of the deaths of many people in the tsunami caused by the
earthquake. Regarding rehabilitation, she had disliked it and
was often absent from it. After the earthquake,
she realized that the many people killed by the tsunami could not
extend their lives, while she could extend hers by having
rehabilitation to tackle her diabetes. This changed her mind, making her
willing to exercise.</p>
        <p>To explore the answer to the question of how we should
cope with bias in wellbeing AI, this paper first focused
on bias on the Internet and bias in ML, and then analyzed the bias
in wellbeing AI by connecting the biases on the Internet with the
biases in ML. From this analysis, the bias in wellbeing AI
can be reduced by employing personal and long-term
evaluation, while many biases arise in ML in general. This paper
discussed fairness in wellbeing AI from the viewpoint of
personal and long-term evaluation and found that our
wellbeing AI based on personal and long-term
evaluation showed its potential by extracting knowledge for good
sleep and estimating the mind change of an aged person from her
sleep quality change. Future work includes an investigation
of other domains.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Baeza-Yates</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>“Bias on the Web.” Communications of the ACM</article-title>
          , Vol.
          <volume>61</volume>
          , No.
          <issue>6</issue>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Mehrabi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morstatter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saxena</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lerman</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Galstyan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2022</year>
          .
          <article-title>“A Survey on Bias and Fairness in Machine Learning</article-title>
          .
          <source>” ACM Computing Surveys</source>
          , Vol.
          <volume>54</volume>
          ,
          <string-name>
            <surname>Issue</surname>
            <given-names>6</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Article</given-names>
            <surname>No</surname>
          </string-name>
          .
          <volume>115</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Takadama</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nakata</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2015</year>
          . “
          <article-title>Extracting Both Generalized and Specialized Knowledge by XCS using Attribute Tracking</article-title>
          and Feedback,”
          <source>2015 IEEE Congress on Evolutionary Computation (CEC2015)</source>
          , pp.
          <fpage>3034</fpage>
          -
          <lpage>3041</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Takadama</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <year>2013</year>
          . “
          <article-title>Towards a Care Support System that Can Guess The Way Aged Persons Feel,” The AAAI 2013 Spring Symposia, Data Driven Wellness: From Self-Tracking to Behavior Change, AAAI (The Association for the</article-title>
          <source>Advancement of Artificial Intelligence)</source>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Harada</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uwano</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Komine</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tajima</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kawashima</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morishima</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Takadama</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <year>2016</year>
          . “
          <article-title>Real-time Sleep Stage Estimation from Biological Data with Trigonometric Function Regression Model,” The AAAI 2016 Spring Symposia, Well-Being Computing: AI Meets Health and Happiness Science, AAAI (The Association for the</article-title>
          <source>Advancement of Artificial Intelligence)</source>
          , pp.
          <fpage>348</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Dastin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>“Amazon scraps secret AI recruiting tool that showed bias against women”</article-title>
          ,
          <source>Reuters</source>
          , Oct. 11.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>