<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Feature Concepts as Pattern Language for Data-Federative Innovations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yukio Ohsawa</string-name>
          <email>ohsawa@sys.t.u-tokyo.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sae Kondo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Teruaki Hayashi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>7-3-1 Hongo</institution>
          ,
          <addr-line>Bunkyo-ku, Tokyo 113-8656</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Mie University</institution>
        </aff>
      </contrib-group>
      <fpage>96</fpage>
      <lpage>97</lpage>
      <abstract>
        <p>Feature concepts, an essential tool for data-federative innovation processes, are introduced here as a language to express the model of knowledge to be acquired from data. A feature concept can be represented by a simple feature, such as a single variable, or by a conceptual illustration of the abstract information obtained from the data. Useful feature concepts for satisfying the latent or explicit requirements in society, or the market of data, are found to have been elicited so far via creative communication among stakeholders. Here, the contribution of feature concepts to useful findings is shown with a couple of use cases, for example, explanation of change in markets and earthquakes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The need to elicit information about data-use contexts, that is,
the situations in which data are used and/or the services or
products created from data are received, has been positioned as
a key issue in creating solutions that satisfy requirements in
businesses. Participants in workplaces for innovations
using/reusing data (e.g., [Ohsawa et al. 2013], a marketplace
for data-federative innovation) have been urged to voice
requirements and ideas for satisfying them, so that they can add
or revise the data jackets (DJs) and store the used ideas in the
background database. Even so, the missing links between the data
and the requirements cannot be covered. To cope with this
problem, this study introduces a method to illustrate the abstract
image of the information to be acquired from datasets in order to
satisfy a requirement.</p>
    </sec>
    <sec id="sec-2">
      <title>Feature concepts</title>
      <p>A feature concept is an abstract image of the information or
knowledge to be acquired from data, linked to the method (how and
why) and the dataset(s) (what) that should be used to satisfy a
requirement. In the examples shown below, we find that human
creativity in data utilization has been enhanced by eliciting, using,
and sharing concepts in various forms. These concepts, if explicitly
represented by their creator, are regarded as feature concepts.
Below, we consider the feature concepts illustrated in Fig. 1. For
example, unsupervised machine learning methods, such as clustering
with noise removal (e.g., [Fränti and Yang 2018]), are algorithms in
which a hidden cluster is restored from data that include scattered
noise signals. Thus, embedded clusters, as the desired information
to be acquired from the data, can be interpreted as a feature
concept, as shown in Fig. 1. In addition, the decision tree
[Quinlan 1986] realizes a tree as a feature. Feature concepts, if
represented explicitly via the communication of participants in the
data market, play the role of bridging social requirements and
features in datasets, as illustrated in Fig. 2.</p>
      <p>In T. Kido, K. Takadama (Eds.), Proceedings of the AAAI 2022 Spring Symposium
“How Fair is Fair? Achieving Wellbeing AI”, Stanford University, Palo Alto, California,
USA, March 21–23, 2022. Copyright © 2022 for this paper by its authors. Use permitted
under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>Fig. 1. Examples of feature concepts for three basic methods
for data mining (left: from [Ohsawa 2018b]).</p>
      <p>Fig. 2. The images and positions of feature concepts (FC#) in the
communication to connect requirements to solutions and DJs.</p>
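      <p>To make the “embedded clusters” feature concept of Fig. 1 more concrete, the following Python sketch restores dense groups of points from data that include scattered noise. It is only an illustration under stated assumptions: it uses a simple density filter followed by proximity grouping, not the medoid-shift method of [Fränti and Yang 2018], and the function name embedded_clusters and the parameters eps and min_neighbors are hypothetical.</p>
      <preformat>
```python
import math


def embedded_clusters(points, eps=1.5, min_neighbors=2):
    """Drop sparse points as noise, then group the remaining points by proximity.

    Assumption: a point is noise when fewer than `min_neighbors` other
    points lie within distance `eps` of it.
    """
    def near(p, q):
        return eps >= math.dist(p, q)

    # Noise cut: keep only points with enough other points nearby.
    dense = [p for p in points
             if sum(near(p, q) for q in points if q is not p) >= min_neighbors]

    # Grouping: flood-fill over the graph that links points within eps.
    clusters, unvisited = [], list(dense)
    while unvisited:
        frontier = [unvisited.pop()]
        cluster = []
        while frontier:
            p = frontier.pop()
            cluster.append(p)
            neighbors = [q for q in unvisited if near(p, q)]
            for q in neighbors:
                unvisited.remove(q)
            frontier.extend(neighbors)
        clusters.append(sorted(cluster))
    return sorted(clusters)
```
      </preformat>
      <p>In this reading, the returned groups play the role of the embedded clusters, that is, the information to be acquired, while the discarded isolated points correspond to the scattered noise signals.</p>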
    </sec>
    <sec id="sec-3">
      <title>Examples of feature concepts in data utilities</title>
    </sec>
    <sec id="sec-4">
      <title>Feature concepts and pattern language</title>
      <p>
        An example is change explanation in businesses and
sciences, which was elicited as a requirement for supermarkets.
In contrast to the detection or prediction of changes
using machine learning technologies
        <xref ref-type="bibr" rid="ref2">(e.g., [Fearnhead and
Liu 2007, Miyaguchi and Yamanishi 2017])</xref>
        , change
explanation means linking the observed change in the data to
human understanding of the dynamics in the real world. Thus,
it is essential to create a feature concept that enables data
visualization that inspires humans to understand the
underlying dynamics. Borrowing the idea of diversity shift in a
market proposed by Kahn [Kahn 1995], we drew an image
corresponding to the feature concept in Fig. 3 and invented
graph-based entropy (GBE [Ohsawa 2018a]), an index of the
diversity of events in terms of their distribution over the
clusters in the co-occurrence graph of items in the market. A
change in GBE is a sign of structural change in the target
real world and is informative in explaining changes if
coupled with the graph shown in Fig. 3, where the bridging edge
between the two clusters is cut in the 10th week of the year,
which is interpreted as the growth of the lower cluster
corresponding to spices for cooking stew. The 10th week in the
data was a hot period in August, but the frequency of the
query “stew” in Google increased from August in Japan
every year.
      </p>
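      <p>As a rough illustration of the idea behind GBE, the following Python sketch computes the Shannon entropy of baskets distributed over the clusters of an item co-occurrence graph. It is a minimal sketch under stated assumptions, not the definition given in [Ohsawa 2018a]: the function names, the co-occurrence threshold min_count, the use of connected components as clusters, and the majority-vote assignment of a basket to a cluster are all choices made here for illustration.</p>
      <preformat>
```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_clusters(baskets, min_count=2):
    """Map each item to a cluster id (a connected component of the graph).

    Assumption: two items are linked when they co-occur in at least
    `min_count` baskets.
    """
    pair_counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1

    # Union-find over items; every item starts as its own cluster.
    parent = {item: item for basket in baskets for item in basket}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in pair_counts.items():
        if count >= min_count:
            parent[find(a)] = find(b)

    return {item: find(item) for item in parent}


def graph_based_entropy(baskets, min_count=2):
    """Entropy (bits) of baskets distributed over co-occurrence clusters."""
    cluster_of = cooccurrence_clusters(baskets, min_count)
    assigned = Counter()
    for basket in baskets:
        # Assign the basket to the cluster holding most of its items.
        votes = Counter(cluster_of[item] for item in set(basket))
        if votes:
            assigned[votes.most_common(1)[0][0]] += 1
    total = sum(assigned.values())
    probs = [n / total for n in assigned.values()]
    return sum(-p * math.log2(p) for p in probs)
```
      </preformat>
      <p>For four toy baskets that form two separate item pairs (two baskets each), this sketch yields an entropy of 1 bit. Adding baskets that bridge the two pairs merges them into one cluster, and the entropy falls to 0. Under these assumptions, cutting or creating a bridging edge therefore shows up as a change in the entropy value.</p>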
      <p>The author then applied the diversity shift to an analysis of
earthquakes [Ohsawa 2018b]. Here, a model was introduced
to explain the dynamics of earthquakes in two phases: (1)
the increase in the diversity of epicenter clusters, and (2) the
coupling of the clusters due to new activity in the seismic
gap, followed by a large earthquake. The entropy defined on
the distribution of the epicenters increases in phase (1) and
decreases in phase (2). Thus, the feature concept (FC) of
diversity shift, used for marketing, was reused to explain
earthquake precursors.</p>
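      <p>The two-phase reading above can be sketched in Python as an entropy series over time windows, with the peak marking the boundary between the rising phase (1) and the declining phase (2). This is a sketch under stated assumptions rather than the method of [Ohsawa 2018b]: the function names are hypothetical, each window is assumed to be a list of epicenter cluster labels, and a single global peak is assumed.</p>
      <preformat>
```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy (bits) of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def split_phases(windows):
    """Return (entropies, peak_index) for a non-empty list of windows.

    Windows up to `peak_index` are read as phase (1), rising diversity;
    windows after it are read as phase (2), coupling of clusters.
    """
    entropies = [shannon_entropy(w) for w in windows]
    peak_index = max(range(len(entropies)), key=entropies.__getitem__)
    return entropies, peak_index
```
      </preformat>
      <p>For example, with four windows labeled [A, A, A, A], [A, A, B, B], [A, B, C, D], and [A, A, A, B], the entropy rises from 0 to 1 to 2 bits and then drops, so the third window is reported as the peak.</p>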
      <p>
        Feature concepts may be regarded as a customized pattern
language, a notion initially proposed in urban planning [
        <xref ref-type="bibr" rid="ref1">Alexander et
al. 1977</xref>
        ] and since applied to the design of other systems. There, a
set of patterns composed of urban elements, such as parks,
bridges, and houses, was used to explain and design the
structures of urban areas. Each pattern, with an illustration, is
linked to a context, a problem, and a solution to the problem.
Individual thinking and communication toward consensus
within a team engaged in design or other collaborative tasks
can be facilitated by using the patterns as a common
language for expressing contexts, problems, and solutions.
Furthermore, the patterns can be connected to each other via
relationships, which may be hierarchical relations or the
likelihood of being combined. Similarly, once a feature concept
is created and shared with others, it becomes a tool for
innovators who think and communicate to federate and/or use
data. In addition, the relationships among feature concepts
can be, as with patterns in a pattern language, hierarchical
(e.g., “diversity shift” over “diversity”) or connective (e.g.,
diversity shift can be connected with clusters or with
networks). Thus, the links between feature concepts, or from
feature concepts to the real world, should come from
communication among data scientists or between data
scientists and others.
      </p>
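      <p>One way to picture feature concepts as entries of a pattern language is a small catalogue in which each entry carries a context, a problem, and a solution, together with hierarchical and connective links. The Python sketch below is purely illustrative: the class FeatureConcept, its field names, the helper functions, and the example field values are assumptions made here and are not part of the original study.</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class FeatureConcept:
    """One entry of a hypothetical feature-concept catalogue."""
    name: str
    context: str = ""    # where the concept applies
    problem: str = ""    # requirement it addresses
    solution: str = ""   # information to be acquired from data
    narrower: list = field(default_factory=list)  # hierarchical relation
    connected: set = field(default_factory=set)   # names of combinable concepts


def connect(a, b):
    """Record that two feature concepts can be combined (symmetric link)."""
    a.connected.add(b.name)
    b.connected.add(a.name)


def names_under(concept):
    """List the names of all concepts below `concept` in the hierarchy."""
    found = []
    for child in concept.narrower:
        found.append(child.name)
        found.extend(names_under(child))
    return found


# Illustrative entries based on the relations named in the text.
diversity = FeatureConcept("diversity")
diversity_shift = FeatureConcept(
    "diversity shift",
    context="market baskets; epicenter records",
    problem="explain an observed change",
    solution="rise and fall of diversity across clusters",
    narrower=[diversity],
)
clusters = FeatureConcept("clusters")
networks = FeatureConcept("networks")
connect(diversity_shift, clusters)
connect(diversity_shift, networks)
```
      </preformat>
      <p>Keeping the links as plain data leaves their content open, so that they can be added or revised as the communication among data scientists and other stakeholders proceeds.</p>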
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This study was partially supported by JSPS 20K20482.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name><surname>Alexander</surname>, <given-names>C.</given-names></string-name>,
          <string-name><surname>Ishikawa</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Silverstein</surname>, <given-names>M.</given-names></string-name>
          <year>1977</year>.
          <source>A Pattern Language: Towns, Buildings, Construction</source>.
          Oxford Univ. Press, USA.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><surname>Fearnhead</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Liu</surname>, <given-names>Z.</given-names></string-name>
          <year>2007</year>.
          <article-title>Online Inference for Multiple Changepoint Problems</article-title>.
          <source>J. Royal Statistical Soc. B</source>
          <volume>69</volume>(<issue>4</issue>), ISSN 1369-7412.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name><surname>Fränti</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Yang</surname>, <given-names>J.</given-names></string-name>
          <year>2018</year>.
          <article-title>Medoid-Shift for Noise Removal to Improve Clustering</article-title>.
          In: Rutkowski L., et al. (eds),
          <source>Artificial Intelligence and Soft Computing, LNCS 10841</source>.
          Springer.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name><surname>Kahn</surname>, <given-names>B.K.</given-names></string-name>
          <year>1995</year>.
          <article-title>Consumer variety seeking among goods and services</article-title>.
          <source>J. Retailing and Consumer Services</source>
          <volume>2</volume>,
          <fpage>139</fpage>-<lpage>148</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name><surname>Miyaguchi</surname>, <given-names>K.</given-names></string-name>, and
          <string-name><surname>Yamanishi</surname>, <given-names>K.</given-names></string-name>
          <year>2017</year>.
          <article-title>Online detection of continuous changes in stochastic processes</article-title>.
          <source>Int J. Data Science and Analytics</source>
          <volume>3</volume>(<issue>3</issue>),
          <fpage>213</fpage>-<lpage>229</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name><surname>Ohsawa</surname>, <given-names>Y.</given-names></string-name>,
          <string-name><surname>Kido</surname>, <given-names>H.</given-names></string-name>,
          <string-name><surname>Hayashi</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Liu</surname>, <given-names>C.</given-names></string-name>
          <year>2013</year>.
          <article-title>Data Jackets for Synthesizing Values in the Market of Data</article-title>.
          <source>Procedia Computer Science</source>
          <volume>22</volume>,
          <fpage>709</fpage>-<lpage>716</lpage>,
          doi.org/10.1016/j.procs.2013.09.152
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name><surname>Ohsawa</surname>, <given-names>Y.</given-names></string-name>
          <year>2018a</year>.
          <article-title>Graph-Based Entropy for Detecting Explanatory Signs of Changes in Market</article-title>.
          <source>Rev Socionetwork Strat</source>
          <volume>12</volume>,
          <fpage>183</fpage>-<lpage>203</lpage>.
          https://doi.org/10.1007/s12626-018-0023-8
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name><surname>Ohsawa</surname>, <given-names>Y.</given-names></string-name>
          <year>2018b</year>.
          <article-title>Regional Seismic Information Entropy for Detecting Earthquake Activation Precursors</article-title>.
          <source>Entropy</source>
          <volume>20</volume>(<issue>11</issue>),
          <fpage>861</fpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name><surname>Quinlan</surname>, <given-names>J. R.</given-names></string-name>
          <year>1986</year>.
          <article-title>Induction of Decision Trees</article-title>.
          <source>Mach. Learn.</source>
          <volume>1</volume>(<issue>1</issue>),
          <fpage>81</fpage>-<lpage>106</lpage>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>