<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Augmented Visualization of Association Rules for Data Mining</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Engineering and Architecture, Arturo Prat University</institution>
          ,
          <addr-line>Iquique</addr-line>
          <country country="CL">Chile</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Engineering and Geological Sciences, North Catholic University</institution>
          ,
          <addr-line>Antofagasta</addr-line>
          <country country="CL">Chile</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes a proposal for enhanced visualization of a data-mining model generated with Association Rule (AR) techniques by applying Self-Organizing Maps (SOM). A visual perception model of AR, based on a method called AVM-DM (Augmented Visualization Models for Data Mining), is established together with data and patterns that support the visual exploration stage, thus fitting in the context of the KDD (Knowledge Discovery in Databases) process. This methodology seeks to answer generic user questions regarding the inner workings of the model and to support understanding of the generated model. The use of the SOM technique as a visual enhancer applied to an AR model serves a dual purpose: to obtain the spatial distribution of the subset of data associated with a rule, and to display this subset using a map. The visualization of the AR model proposed in this work is implemented through a software tool that gives users different interaction mechanisms. Results of user experiments demonstrate the usefulness of the proposed SOM technique in visually enhancing the AR model and helping users understand it.</p>
      </abstract>
      <kwd-group>
        <kwd>Data mining</kwd>
        <kwd>visual data mining</kwd>
        <kwd>visualization of data mining models</kwd>
        <kwd>visualization of association rules</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        results of the KDD process [<xref ref-type="bibr" rid="ref1">1</xref>]. In this regard, appropriate visualizations of DM models can transform them into understandable tools that convert data into knowledge. In addition, appropriate visualizations of patterns can facilitate knowledge discovery by letting users interpret and evaluate these patterns visually [2, 3].
      </p>
      <p>
        This work proposes to visually enhance the DM model generated by an Association Rule (AR) mining technique by combining it with the SOM technique and creating complementary views of the different rules or model components. This method seeks to answer generic user questions regarding the inner workings of the model. The approach is based on the Augmented Visualization Models for Data Mining (AVM-DM) method [<xref ref-type="bibr" rid="ref4">4</xref>], which proposes a model of visual perception and user interaction focused on the stage of adjustment or refinement of the generated DM model, within the wider context of the entire KDD process.
      </p>
      <p>
        The proposed work includes the implementation of part of the AVM-DM method in a prototype tool that accepts a suitable data set and an AR model. Finally, a subjective evaluation of the prototype is presented through a user experiment consisting of a survey. Participants provided information about the performance, usability, and management of views of the developed tool, and about the support it provides in understanding a previously generated DM model.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Visualization of Association Rules</title>
      <p>
        An AR represents a relationship between several variables: it is an implication of the form X → Y, where X is a set of items called antecedents and Y is the set of consequent items. At least five parameters should be considered in the visualization of an AR: the set of antecedent items, the set of consequent items, the associations between antecedents and consequents, the rule’s support, and its confidence [<xref ref-type="bibr" rid="ref5">5</xref>].
      </p>
      <p>
        Research on the visualization of AR can be categorized into three main groups, depending on whether it is based on tables, matrices, or graphs. Table-based techniques are the most common and traditional approach to representing AR. The columns of a table generally represent the items of the AR model, while each row represents a rule. Examples of table-based techniques can be found in several commercial systems, including SAS Enterprise Miner and DB Miner [<xref ref-type="bibr" rid="ref6">6</xref>]. Matrix-based techniques, such as those implemented in MineSet [<xref ref-type="bibr" rid="ref7">7</xref>] and InfoVis [<xref ref-type="bibr" rid="ref8">8</xref>], use a grid of coordinate axes that represent the antecedents and consequents. The last group consists of graph-based techniques, which use nodes to represent the items and edges to represent the associations between items in the rules. Some of these techniques adopt representations known for handling large data sets, such as hyperbolic trees [<xref ref-type="bibr" rid="ref9">9</xref>].
      </p>
      <p>
        In summary, although these efforts to improve the visualization of ARs supplement rule mining with graphics that allow each rule to be observed in detail, we failed to find visualization tools that allow interaction with each rule while also visualizing how the data in each rule are spatially distributed. A comparative review of visualization tools for DM techniques (including AR) by Castillo [<xref ref-type="bibr" rid="ref5">5</xref>] concluded that: a) most research recommends combining DM techniques with appropriate views, b) the mechanisms for user interaction must be considered in the design of views, and c) the role of visualization in the KDD process must be extended to all its stages.
      </p>
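      <p>The support and confidence parameters listed at the start of this section can be made concrete with a small calculation. The following is a minimal sketch, not the implementation used by the prototype: it assumes transactions are given as Python sets of items, and the function name rule_metrics and the toy data are hypothetical.</p>
      <preformat>
```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of all transactions containing X and Y together
    confidence = fraction of transactions containing X that also contain Y
    """
    if not transactions:
        raise ValueError("transactions must not be empty")
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    # Count transactions that contain the antecedent, and the whole rule.
    n_x = sum(1 for t in transactions if x.issubset(t))
    n_xy = sum(1 for t in transactions if xy.issubset(t))
    support = n_xy / len(transactions)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

support, confidence = rule_metrics(transactions, {"bread"}, {"milk"})
print(f"support={support:.2f} confidence={confidence:.2f}")
```
      </preformat>
      <p>In this toy data, the rule {bread} → {milk} holds in 2 of the 4 transactions (support 0.5) and in 2 of the 3 transactions that contain bread (confidence of about 0.67).</p>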
    </sec>
    <sec id="sec-3">
      <title>The AVM-DM Scheme</title>
      <p>
        The AVM-DM scheme proposed by Castillo in [<xref ref-type="bibr" rid="ref4">4</xref>] considers the characteristics of the analyzed models of perception and includes the most relevant aspects of each, particularly with regard to integrating visualization into the stage of adjustment or refinement and evaluation of DM models. AVM-DM introduces the concept of “Augmented Visualization” for DM models. Given a DM technique to be visualized, called the Primary DM Technique (PT-DM), the scheme suggests that the user should be able to incorporate into its display different visual elements suited to the type of model and the data domain. It also requires applying another DM technique, called the Secondary DM Technique (ST-DM), as a visual enhancer that allows exploring the PT-DM. The selected ST-DM must be a descriptive DM technique that is appropriate to the domain of the data being analyzed with the PT-DM.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Augmented Visualization of AR Model using SOM</title>
      <p>
        In the case of AR techniques, several visualization methods analyzed in [<xref ref-type="bibr" rid="ref5">5</xref>] propose a static display, without any possibility for the user to interact with each rule. Most DM visualization tools deliver an overview of the ARs but cannot combine DM techniques to provide information on the model rules and the instances supporting each rule, and only a few tools provide interaction mechanisms for the user. The proposed use of the SOM technique as an AR visual augmenter serves a dual purpose: to obtain the spatial distribution of the subset of data associated with each rule, and to display this subset using a map.
      </p>
      <p>
        The prototype implements the AVM-DM scheme for hierarchical structure techniques in DM (decision tree and AR); in this paper, we concentrate on the AR mining technique. It incorporates a set of visual elements: data table, pie chart (by rule and general), dot plot, and parallel coordinates plot. Available interaction mechanisms include zoom, rule selection, and parameter setting. Figure 1 shows the main interface of the experimental prototype, where ARs are displayed in the central part, together with complementary views and visual elements on the right side. All architecture components of the proposed AVM-DM scheme are implemented in this tool. The user can maximize the image located in section c) of the interface by clicking on it, which opens a window presenting a detailed view of this technique. The user can then reconfigure the initial parameters for a selected rule and apply the SOM technique, and can also see the shape of the distribution of the instances covered by this rule.
      </p>
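      <p>The dual purpose described above (select the subset of data covered by a rule, then obtain and display its spatial distribution) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the prototype's code: the paper does not specify its SOM implementation, so a small rectangular SOM with a Gaussian neighborhood and a linearly decaying learning rate is assumed. The names covered_by, best_matching_unit, train_som, and hit_map and the toy data are hypothetical, and numeric attributes are assumed to be scaled to [0, 1].</p>
      <preformat>
```python
import math
import random


def covered_by(rows, antecedent):
    """Return the rows matching every (attribute, value) pair of a rule antecedent."""
    return [r for r in rows if all(r[k] == v for k, v in antecedent.items())]


def best_matching_unit(weights, v):
    """Grid cell whose weight vector is closest (squared Euclidean) to v."""
    return min(weights, key=lambda c: sum((a - b) ** 2 for a, b in zip(weights[c], v)))


def train_som(vectors, width, height, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return {(col, row): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(i, j): [rng.random() for _ in range(dim)]
               for i in range(width) for j in range(height)}
    radius0 = max(width, height) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)   # shrinking neighborhood
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, v)
            for cell, w in weights.items():
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius * radius))
                for d in range(dim):
                    w[d] += lr * h * (v[d] - w[d])
    return weights


def hit_map(weights, vectors, width, height):
    """Count how many vectors fall on each SOM cell (rows of the grid first)."""
    counts = [[0] * width for _ in range(height)]
    for v in vectors:
        i, j = best_matching_unit(weights, v)
        counts[j][i] += 1
    return counts


# Hypothetical toy data; the rule antecedent is outlook = sunny.
rows = [
    {"outlook": "sunny", "temp": 0.9, "humidity": 0.8},
    {"outlook": "sunny", "temp": 0.8, "humidity": 0.9},
    {"outlook": "sunny", "temp": 0.2, "humidity": 0.1},
    {"outlook": "rainy", "temp": 0.3, "humidity": 0.9},
    {"outlook": "sunny", "temp": 0.1, "humidity": 0.2},
    {"outlook": "rainy", "temp": 0.5, "humidity": 0.5},
]

subset = covered_by(rows, {"outlook": "sunny"})
vectors = [[r["temp"], r["humidity"]] for r in subset]
weights = train_som(vectors, width=3, height=3)
counts = hit_map(weights, vectors, width=3, height=3)
for row in counts:
    print(row)
```
      </preformat>
      <p>Each cell of counts gives the number of covered instances mapped to that SOM unit; rendering this grid as a heat map is one possible way to display the subset associated with a rule as a map.</p>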
    </sec>
    <sec id="sec-5">
      <title>Controlled Experiment: Evaluations &amp; Analysis</title>
      <p>The following controlled experiment provides a comparison and subjective evaluation of the visualization of ARs obtained through a DM task performed by a set of users. Its aim is to check whether SOM-enhanced visualization of AR mining, along with the set of visual elements provided by the prototype software, can improve the understanding of the model (for example, by looking at the distribution of data in each rule) compared with the visualization provided by another DM tool that does not have this focus or visualization scheme. The experiment was conducted with 17 users of varying levels of expertise in DM processes and in the use of DM tools. We asked participants to perform a generic description task in which they answered questions about the model and its components and related the model to the characteristics of the data from which it was generated.</p>
      <p>Subsequently, after completing the DM task prepared for this experiment, the users answered a survey designed to gather the subjective opinion of the group regarding the performance of both tools, the visualization of the generated AR model, usability, the utility of visual elements, the desirability of combining the SOM technique to achieve a visually augmented model, and the efficiency in understanding the model. Most users stated that both the combination of the SOM technique with the AR model and the use of graphic elements on the data of the rules allowed them to improve their understanding of the generated AR model, with a score distribution of 54.9% good and 33.33% very good, as shown in Figure 2 a). Also, as shown in Figure 2 b), users mostly expressed a positive ability (76.5% high and 11.8% very high) to</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>The preliminary results of the presented study allow us to confirm the suitability and utility of combining the AR mining technique with the SOM technique to achieve augmented visualization of the AR model and to visualize the spatial distribution of the data covered by each rule, thus helping to improve the understanding of the model's inner workings. The visual tools provided in the prototype software also support the analysis and examination of the AR model. As future work, we are evaluating other descriptive DM techniques that can provide alternative views for visually enhanced AR models.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>We thank the anonymous reviewers for their helpful suggestions. In particular, we are grateful for the effort invested in the editing phase, which substantially improved the readability of the paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Keim, D.A. (<year>1997</year>). <article-title>Visual Techniques for Exploring Databases</article-title>. Third International Conference on KDD &amp; Data Mining. Newport Beach, CA, August.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Meneses, C.J. &amp; Grinstein, G.G. (<year>2001</year>). <article-title>Visualization for Enhancing the Data Mining Process</article-title>. In Proceedings of the Data Mining &amp; KDD: Theory, Tools, and Technology III Conference. Orlando, FL, April.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Thearling, K., Becker, B., Mawby, B., Pilote, M., Sommerfield, D. (<year>1998</year>). <article-title>Visualizing Data Mining Models</article-title>. In Proceedings of the Integration of Data Mining and Data Visualization Workshop, Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Castillo-Rojas, W., Meneses, C., &amp; Medina, F. (<year>2013</year>). <article-title>Augmented Decision Tree Models Using SOM</article-title>. 6th Latin American Conference on Human Computer Interaction, Costa Rica. Proceedings pp. <fpage>148</fpage>-<lpage>155</lpage>. Springer LNCS 8278, ISBN 978-3-319-03067-8.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Castillo-Rojas, W., &amp; Meneses, C. (<year>2012</year>). <article-title>Comparative Review of Schemes of Multidimensional Visualization for Data Mining Techniques</article-title>. III International Congress of Informatics. August, Arica, Chile.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Han, J., Kamber, M. (<year>2001</year>). <source>Data Mining: Concepts and Techniques</source>. Morgan Kaufmann.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Brunk, C., Kelly, J. and Kohavi, R. (<year>1997</year>). <article-title>MineSet: An Integrated System for Data Mining</article-title>. Proc. of the Third Intl. Conf. on Knowledge Discovery and Data Mining, pages <fpage>135</fpage>-<lpage>138</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. Wong, P.C., Whitney, P. (<year>1999</year>). <article-title>Visualizing association rules for text mining</article-title>. INFOVIS, pages <fpage>120</fpage>-<lpage>123</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Lamping, J., Rao, R., and Pirolli, P. (<year>1995</year>). <article-title>A focus+context technique based on hyperbolic geometry for visualizing large hierarchies</article-title>. In Proceedings of the ACM Conference on Human Factors in Computing Systems, ACM Press, USA, pages <fpage>401</fpage>-<lpage>408</lpage>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>