=Paper= {{Paper |id=Vol-162/paper-6 |storemode=property |title=Evaluation of IPAQ questionnaire by FCA |pdfUrl=https://ceur-ws.org/Vol-162/paper6.pdf |volume=Vol-162 |dblpUrl=https://dblp.org/rec/conf/cla/SklenarZS05 }} ==Evaluation of IPAQ questionnaire by FCA== https://ceur-ws.org/Vol-162/paper6.pdf
              Evaluation of IPAQ questionnaire by FCA

                  Vladimír Sklenář, Jiří Zacpal and Erik Sigmund
     Dept. Computer Science, Palacký University, Tomkova 40, CZ-779 00 Olomouc,
                                   Czech Republic
              {vladimir.sklenar,jiri.zacpal,erik.sigmund}@upol.cz


        Abstract. This paper presents the use of Formal Concept Analysis (FCA)
        in the evaluation of the IPAQ questionnaire. IPAQ is a global
        epidemiological questionnaire collecting physical activity data. It tries
        to capture the state of physical activity (inactivity) in a representative
        sample of the population. The goal of the authors was to find dependencies
        between demographic data (age, gender, education, occupation, ...) and the
        degree of physical activity. We tried to obtain these dependencies from
        the intents of the concept lattice created from the questionnaire. Because
        the whole concept lattice was very large and contained a number of
        concepts that were not interesting for the expert who evaluated the
        questionnaire data, we used binary relations to constrain it. Primarily,
        we focused on equivalence relations.


 Keywords: FCA, evaluation of questionnaire, constrained concept lattice, equivalence
 relation


 1    Preliminaries and Problem Setting
 Evaluating a questionnaire is a traditional way to discover properties
 (attributes) shared by an important set of respondents (objects) and
 dependencies between properties of respondents. The standard technique for such
 evaluation is the use of statistical methods. In this paper we show another
 method for extracting information from data gathered from a large set of
 respondents. We used Formal Concept Analysis (FCA) to evaluate data recorded by
 more than 4000 respondents in the IPAQ questionnaire.

 Formal concept analysis. In its basic setting, formal concept analysis deals with
 input data in the form of a table with rows corresponding to objects and columns
 corresponding to attributes, which describes a relationship between the objects
 and attributes. The data table is formally represented by a so-called formal
 context, which is a triplet ⟨X, Y, I⟩ where I is a binary relation between X and
 Y, ⟨x, y⟩ ∈ I meaning that the object x has the attribute y. For each A ⊆ X
 denote by A↑ a subset of Y defined by
                        A↑ = {y | for each x ∈ A : ⟨x, y⟩ ∈ I}.
 Similarly, for B ⊆ Y denote by B↓ a subset of X defined by
                        B↓ = {x | for each y ∈ B : ⟨x, y⟩ ∈ I}.


Radim Bělohlávek, Václav Snášel (Eds.): CLA 2005, pp. 60–69, ISBN 80–248–0863–3.
That is, A↑ is the set of all attributes from Y shared by all objects from A (and
similarly for B↓). A formal concept in ⟨X, Y, I⟩ is a pair ⟨A, B⟩ of A ⊆ X and
B ⊆ Y satisfying A↑ = B and B↓ = A. That is, a formal concept consists of a
set A (extent) of objects which fall under the concept and a set B (intent) of
attributes which fall under the concept such that A is the set of all objects sharing
all attributes from B and, conversely, B is the collection of all attributes from Y
shared by all objects from A. The set B(X, Y, I) = {⟨A, B⟩ | A↑ = B, B↓ = A}
of all formal concepts in ⟨X, Y, I⟩ can be naturally equipped with a partial order
defined by

          ⟨A1, B1⟩ ≤ ⟨A2, B2⟩ iff A1 ⊆ A2 (or, equivalently, B2 ⊆ B1).

 Under ≤, B(X, Y, I) happens to be a complete lattice, called a concept lattice.
    We refer to [11] for background information in formal concept analysis (FCA).
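
    The definitions above can be illustrated by a minimal Python sketch. The toy
context and the names up, down and concepts are assumptions made for this
illustration only; the brute-force enumeration is feasible only for very small
contexts and is not the algorithm used for the IPAQ data.

```python
from itertools import combinations

# Toy context (hypothetical data): object -> set of attributes it has.
context = {
    "r1": {"young", "active"},
    "r2": {"young", "active", "smoker"},
    "r3": {"old", "smoker"},
}
attributes = set().union(*context.values())


def up(objs):
    """A↑: attributes shared by every object in objs."""
    return {y for y in attributes if all(y in context[x] for x in objs)}


def down(attrs):
    """B↓: objects that have every attribute in attrs."""
    return {x for x in context if attrs <= context[x]}


def concepts():
    """All formal concepts, obtained by closing every subset of objects."""
    found = set()
    objs = sorted(context)
    for k in range(len(objs) + 1):
        for subset in combinations(objs, k):
            intent = up(set(subset))
            extent = down(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


all_concepts = concepts()
for extent, intent in sorted(all_concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair satisfies A↑ = B and B↓ = A; ordering the pairs by inclusion of
extents gives the concept lattice of the toy context.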
    Formal concept analysis thus treats both the individual objects and the in-
dividual attributes as distinct entities for which there is no further information
available except for the relationship I saying which objects have which attributes.
    However, when evaluating a questionnaire, it is necessary to work with
some additional information. First, the identity of a single concrete object is
not interesting (respondents are often anonymous). We want to find properties
common to some subsets of respondents (for example, young females). Thus we have
to define these interesting subsets and consider only concepts whose extents
contain all (or the majority of) respondents from these subsets. Second, we have
to reckon with some noise in the data: for example, a small number of respondents
may have different properties than the others in their subset.


2   IPAQ questionnaire
In 1996, Dr. Michael Booth of Sydney, Australia, initiated a collaborative effort
to develop a valid and reliable questionnaire measuring health-related physical
activity suitable for both research and surveillance. An international group of
physical activity assessment experts were invited to form a working group, re-
ferred to as the International Consensus Group for the Development of an Inter-
national Physical Activity Questionnaire. A year later, the consensus group came
together for a meeting at the World Health Organisation (WHO) in Geneva,
Switzerland. The purpose of the International Physical Activity Questionnaires
(IPAQ) is to provide a set of well-developed instruments that can be used in-
ternationally to obtain comparable estimates of physical activity. In response
to the global demand for comparable and valid measures of physical activity
within and between countries, IPAQ was developed for surveillance activities
and to guide policy development related to health-enhancing physical activity
across various life domains. The IPAQ questionnaire contains many attributes,
such as age, gender, education, occupation and other particulars of physical
activity (PA) and physical inactivity (PI), for a representative sample of the
Czech population between 18 and 65 years old. In 2004, data for the analysis of
PA and PI patterns were obtained from 2300 women and 2018 men. Because of the
large amount of additional information characterizing PA and PI, an evaluation
by "classical" statistics with hypothesis testing is almost inexhaustible.


3    Concept lattices of contexts with binary relations

In our recent papers we presented how further information additionally supplied
with the basic object-attribute data table can be utilized [2],[5],[6],[7]. We now
recall the basic concepts of [6].

Definition 1. A formal context with a binary relation (R-context, for short) is
a structure ⟨X, Y, I, ≡⟩ (written also ⟨⟨X, ≡⟩, Y, I⟩) where ⟨X, Y, I⟩ is a formal
context and ≡ is a binary relation on X.

Remark 1. (1) We are primarily interested in the case when ≡ is an equivalence
relation. Then x1 ≡ x2 means that objects x1 and x2 are equivalent from some
point of view (similar, indistinguishable).
    (2) The equivalence ≡ may be supplied by an expert or may result from some
previous analysis or external source. For example, objects from X may be
partitioned by some clustering (based on attributes from Y or some other data
available) or some convention (a catalogue). Such a partition naturally gives
rise to an equivalence relation.
    If ≡ represents an indistinguishability (or intended indistinguishability), it
might be desirable to consider only those formal concepts which do not separate
indistinguishable objects. We call such formal concepts compatible.

Definition 2. For an R-context ⟨⟨X, ≡⟩, Y, I⟩, a formal concept
⟨A, B⟩ ∈ B(X, Y, I) is called compatible with ≡ if for each x1, x2 ∈ X, if x1 ∈ A,
and x1 ≡ x2 or x2 ≡ x1, then x2 ∈ A.

    Compatible concepts are thus certain formal concepts from B(X, Y, I)
satisfying a natural restriction with respect to ≡. The set of all formal concepts
from B(X, Y, I) which are compatible with ≡ will be denoted by B(⟨X, ≡⟩, Y, I).

   For an equivalence ≡ on X, extents of compatible formal concepts are unions
of ≡-classes (recall that the ≡-class corresponding to x ∈ X is the set
[x]≡ = {x′ ∈ X | x ≡ x′}; the collection of all ≡-classes is denoted by X/≡).
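
   For an equivalence relation, the compatibility test therefore reduces to
checking that an extent is a union of whole classes. The following Python sketch
shows this check; the function name is_compatible and the toy partition are
assumptions made for illustration.

```python
def is_compatible(extent, classes):
    """True if extent is a union of whole equivalence classes.

    A class may either lie entirely inside the extent or be disjoint from it;
    a class that is only partly inside means the concept separates
    indistinguishable objects.
    """
    return all(cls <= extent or not (cls & extent) for cls in classes)


# Hypothetical partition X/≡ of three respondents.
classes = [frozenset({"r1", "r2"}), frozenset({"r3"})]

print(is_compatible({"r1", "r2"}, classes))  # both classes are whole or absent
print(is_compatible({"r2", "r3"}, classes))  # splits the class {r1, r2}
```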


Theorem 1. ([6]) B(⟨X, ≡⟩, Y, I) equipped with ≤ is a complete lattice in which
arbitrary infima coincide with infima in B(X, Y, I), i.e. it is a complete
⋀-sublattice of B(X, Y, I).

   It can be shown by an easy example that suprema in B(⟨X, ≡⟩, Y, I) do not
generally coincide with suprema in B(X, Y, I).
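
   One such example can be sketched in Python. The four-object context, the
partition and all function names below are assumptions constructed for this
illustration; they are not taken from [6].

```python
# Hypothetical context: objects 1..4, attributes a, b, c.
context = {1: {"a", "c"}, 2: {"b", "c"}, 3: {"c"}, 4: set()}
attributes = {"a", "b", "c"}
classes = [{1}, {2}, {3, 4}]  # the partition X/≡


def up(objs):
    return {y for y in attributes if all(y in context[x] for x in objs)}


def down(attrs):
    return {x for x in context if attrs <= context[x]}


def closure(objs):
    """Extent of the least concept of B(X, Y, I) whose extent contains objs."""
    return down(up(objs))


def is_compatible(extent):
    return all(cls <= extent or not (cls & extent) for cls in classes)


def saturate(objs):
    """Add the whole ≡-class of every object in objs."""
    return set().union(*[cls for cls in classes if cls & objs])


def compatible_closure(objs):
    """Extent of the least compatible concept whose extent contains objs."""
    current = set(objs)
    while True:
        nxt = closure(saturate(current))
        if nxt == current:
            return current
        current = nxt


# {1} and {2} are extents of compatible concepts.
print(closure({1}), closure({2}))
# Supremum in B(X, Y, I): contains 3 but not the equivalent object 4.
print(closure({1, 2}), is_compatible(closure({1, 2})))
# Supremum among compatible concepts: the whole set X.
print(compatible_closure({1, 2}))
```

In this toy context the supremum of the two compatible concepts computed in
B(X, Y, I) has extent {1, 2, 3}, which separates the equivalent objects 3 and 4,
while their supremum among compatible concepts has extent {1, 2, 3, 4}.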


4   P%-compatible concepts
When we evaluate a questionnaire, we want to find properties that are shared
by the majority of respondents (or of an interesting subset of respondents, for
example all young females). Thus our previous definition of a compatible concept
is too strict, because it is unnecessary to require that the attributes in
compatible intents are shared by all equivalent objects. In most cases it is
sufficient that the extent contains only a significant portion (given in percent)
of the class of equivalent objects [x]≡.
Definition 3. For an R-context ⟨⟨X, ≡⟩, Y, I⟩ and 0 ≤ p ≤ 100, a formal concept
⟨A, B⟩ ∈ B(X, Y, I) is called p%-compatible with ≡ if for each x ∈ A,
|[x]≡ ∩ A| ≥ |[x]≡| · p/100.

    That is, if an object x belongs to the extent, then at least p% of the
objects from the same equivalence class (x included) must belong to the extent
as well. The set of all formal concepts from B(X, Y, I) which are p%-compatible
with ≡ will be denoted by Bp(⟨X, ≡⟩, Y, I).

   The following lemma is obvious. It states the natural result that the smaller
the required percentage of objects from [x]≡, the more formal concepts satisfy
the restriction.
Lemma 1. If p1 ≤ p2 then Bp2(⟨X, ≡⟩, Y, I) ⊆ Bp1(⟨X, ≡⟩, Y, I).
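
   Definition 3 and Lemma 1 can be checked on a small example. The Python sketch
below is only an illustration; the function name is_p_compatible and the two toy
classes of sizes 10 and 4 are assumptions.

```python
def is_p_compatible(extent, classes, p):
    """Definition 3: every class met by the extent has at least p% of its
    objects inside the extent (integer arithmetic avoids rounding issues)."""
    return all(
        len(cls & extent) * 100 >= len(cls) * p
        for cls in classes
        if cls & extent
    )


# Hypothetical partition: one class of 10 objects and one class of 4 objects.
classes = [frozenset(range(0, 10)), frozenset(range(10, 14))]
# The extent holds 9 of 10 objects (90%) and 3 of 4 objects (75%).
extent = set(range(0, 9)) | {10, 11, 12}

for p in (100, 90, 75):
    print(p, is_p_compatible(extent, classes, p))
```

The extent fails for p = 100 and p = 90 (the second class reaches only 75%) and
passes for p = 75, in line with Lemma 1: lowering p can only admit more concepts.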

5   Evaluation of IPAQ questionnaire by FCA
Creation of context. The first step in analysing the IPAQ questionnaire by FCA is
the creation of a context from the questionnaire data table. The set of objects
is the set of respondents. The set of attributes is given by the questions in the
questionnaire (age, sex, location, BMI, ...). Because the data are not in
bivalent form, we have to transform them to bivalent form by scaling. The data
for scaling were provided by the expert, who assigned the borders between the
degrees of each attribute. For example, the characteristic age is divided into
three attributes: young (age is less than 20 years), middle (age is between 21
and 55 years) and old (age is more than 55 years). The transformation to a
context is very important, because badly placed borders can distort the whole
concept lattice. Part of the questionnaire is shown in Fig. 1. Part of the
context is shown in Fig. 2.
     The resulting context has 72 attributes. We can calculate the concept lattice
for this context. The resulting lattice has about 21 million concepts, which is
far too many for finding dependencies between attributes. Therefore we constrain
the lattice by an equivalence relation and consider only p%-compatible concepts.
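
     The scaling step can be sketched in Python as follows. The function names,
the layout of the raw answers and the sample respondents are hypothetical. The
age borders follow the text, which leaves age 20 unassigned; assigning it to
young is an assumption of this sketch.

```python
def scale_age(age):
    """Map a numeric age to one binary attribute, using the borders above."""
    if age <= 20:  # assumption: the text does not say where age 20 belongs
        return "AGE - young"
    if age <= 55:
        return "AGE - middle"
    return "AGE - old"


def build_context(rows):
    """rows: respondent id -> raw answers (hypothetical layout).

    Returns a bivalent context: respondent id -> set of binary attributes.
    """
    context = {}
    for rid, answers in rows.items():
        attrs = {scale_age(answers["age"])}
        attrs.add("SEX - " + answers["sex"])  # nominal scaling of a category
        context[rid] = attrs
    return context


# Hypothetical raw data for three respondents.
rows = {
    "r1": {"age": 19, "sex": "man"},
    "r2": {"age": 40, "sex": "woman"},
    "r3": {"age": 60, "sex": "man"},
}
context = build_context(rows)
for rid in sorted(context):
    print(rid, sorted(context[rid]))
```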

6   Obtaining equivalence relations
The key question is how to obtain particular equivalences. The most important
role belongs to the expert, who has to specify which set of attributes is
interesting for him. One class of equivalent objects then contains all objects
(respondents) that have the same subset of interesting attributes.

                          Fig. 1. Part of questionnaire

                              Fig. 2. Part of context

    More formally, for a formal context ⟨X, Y, I⟩ and a set of interesting
(important) attributes M ⊂ Y, we denote by ≡M the binary relation defined on X by

    x1 ≡M x2 if and only if {x1}↑ ∩ M = {x2}↑ ∩ M.

    In other words, x1 ≡M x2 if and only if x1 and x2 have the same subset of
attributes from M. Obviously, ≡M is an equivalence relation on X.
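
    The classes of ≡M can be computed by grouping objects by their attributes
restricted to M. In the Python sketch below, the function name partition_by, the
toy context and the attribute labels (for example SMOKING - yes) are hypothetical.

```python
from collections import defaultdict


def partition_by(context, M):
    """Classes of ≡M: objects grouped by {x}↑ ∩ M."""
    groups = defaultdict(set)
    for obj, attrs in context.items():
        groups[frozenset(attrs & M)].add(obj)
    return groups


# Hypothetical context: respondent -> set of binary attributes.
context = {
    "r1": {"SEX - man", "AGE - old", "SMOKING - yes"},
    "r2": {"SEX - man", "AGE - old", "SMOKING - yes", "BIKE - has"},
    "r3": {"SEX - woman", "AGE - old"},
}
# Interesting attributes chosen by the expert (hypothetical choice).
M = {"SEX - man", "SEX - woman", "AGE - old", "SMOKING - yes"}

classes = partition_by(context, M)
for key, members in classes.items():
    print(sorted(key), sorted(members))
```

Respondents r1 and r2 fall into the same class because they differ only in
BIKE - has, which is not in M.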

    In our case, the expert was interested in discovering attributes that are
common to all (or a significant part of) respondents from a given class (for
example, smoking old men).
    Together with the expert we defined 32 sets of important attributes, which
come from 4 main groups:
 – Physical activity, age and gender of respondents.
 – Physical activity, age, gender and education of respondents.
 – Physical activity, age, gender and body mass index (BMI) of respondents.
 – Physical activity, age, gender and smoking of respondents.
All the above-mentioned attributes are many-valued attributes with a nominal
scale. Each set of many-valued attributes induces equivalence classes of
respondents who have identical values of these attributes. We calculated
constrained concept lattices for each equivalence relation. Because the data
from respondents are very sensitive to noise, we also built lattices containing
90%-compatible and 75%-compatible concepts. We delivered these constrained
lattices to the expert for analysis. He looks for “interesting” concepts, which
describe dependencies between demographic data and physical activity or
inactivity. Each compatible concept is interesting for its intent, which contains
at most one value for each many-valued important attribute. In addition to these
attributes, the intent may contain other attributes. The occurrence of such
attributes is interesting for the expert, because they are shared by the majority
of respondents from the given equivalence class. The cardinality of the extent is
also interesting, because it gives the number of respondents who have the
attributes in the intent.
    We demonstrate this method on one group of attributes.


7    Example
The expert selected these attributes: gender, age, education and intensive
physical activity (PA). Since gender is scaled to 2 attributes (Man, Woman), age
to 3 attributes (young, middle, old) and intensive PA to 3 attributes
(below-average, average, above-average), and education is scaled as well, we have
54 equivalence classes. The corresponding constrained concept lattice has 188
concepts. The sets of all p%-compatible concepts contain 418 concepts (90%) and
1 449 concepts (75%). Now the expert can analyze the lattices. He chooses one
equivalence class, for example: SEX - man, AGE - middle, EDUCATION - secondary,
intensive physical activity (PA) - above-average. For these attributes he finds
the greatest (with respect to the concept lattice ordering) concept whose intent
contains all these attributes. This concept, together with all concepts below it,
forms a sublattice. The expert goes through this sublattice and looks for intents
that contain attributes other than those characterizing the given class (or group
of classes). First we analyze the sublattice that contains 100%-compatible
concepts. The corresponding sublattice is shown in Fig. 3.
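
    The selection of such a sublattice can be sketched in Python. The concepts
below belong to a hypothetical three-respondent context, and the function name
sublattice_below is an assumption; the sketch relies only on the fact that a
concept lies below the greatest concept containing the class attributes exactly
when its intent contains those attributes.

```python
def sublattice_below(concepts, class_attrs):
    """Concepts whose intent contains all attributes describing the class."""
    return [(ext, intent) for ext, intent in concepts if class_attrs <= intent]


# All concepts of a hypothetical context, as (extent, intent) pairs.
concepts = [
    (frozenset({"r1", "r2", "r3"}), frozenset({"SEX - man"})),
    (frozenset({"r1", "r2"}), frozenset({"SEX - man", "AGE - middle"})),
    (
        frozenset({"r1"}),
        frozenset({"SEX - man", "AGE - middle", "SITTING - low"}),
    ),
]
class_attrs = frozenset({"SEX - man", "AGE - middle"})

selected = sublattice_below(concepts, class_attrs)
# The greatest selected concept is the one with the largest extent.
top = max(selected, key=lambda c: len(c[0]))

for extent, intent in selected:
    extra = intent - class_attrs  # attributes beyond those defining the class
    print(len(extent), sorted(extra))
```

The printed extra attributes are the ones the expert inspects, together with the
number of respondents sharing them.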

   [Figure: greatest concept labelled SEX - man, AGE - middle, EDUCATION -
   secondary, INTENSIVE PA - above-average (574 respondents); least concept
   (0 respondents).]

             Fig. 3. Sublattice containing 100%-compatible concepts



    This sublattice has only a least and a greatest element; there is no other
concept in it. This is caused by the requirement that all objects (respondents)
from an equivalence class have to be contained in a concept. For the expert, the
intent of the greatest concept is important, because it contains all attributes
common to all respondents from the given class. In this example we can see that
it contains only the attributes that determine the given equivalence class. More
interesting is the lattice containing 90%-compatible concepts. The corresponding
sublattice is shown in Fig. 4.
    There are 3 additional concepts. The first includes 569 respondents (99% of
all members of the equivalence class) who have the value of attribute SITTING
equal to low. This confirms the expectation that respondents with high intensive
PA sit little. The second concept includes respondents who have the value of
attribute NATIONALITY equal to Czech. One might interpret this as saying that
only Czech respondents have high intensive PA; in fact it means that the majority
of all respondents had Czech nationality. The third concept is the infimum of the
previous two concepts.
    The largest sublattice comes from the lattice containing 75%-compatible
concepts. The corresponding sublattice is shown in Fig. 5.
    This sublattice has some interesting concepts. For example, the concept that
includes 457 respondents (79% of all members of the equivalence class) with the
value of attribute SPORT ACTIVITY equal to yes, the value of attribute SITTING
equal to low and the value of attribute NATIONALITY equal to Czech. We can deduce
from this concept that above-average physical activity is closely associated with
the fact that a person sits little and does some sport activity during his
leisure time.

   [Figure: greatest concept labelled SEX - man, AGE - middle, EDUCATION -
   secondary, INTENSIVE PA - above-average (574 respondents); two concepts with
   569 respondents each, labelled SITTING - low and NATIONALITY - Czech; their
   infimum (564 respondents); least concept (0 respondents).]

            Fig. 4. Sublattice containing 90%-compatible concepts

   [Figure: greatest concept labelled SEX - man, AGE - middle, EDUCATION -
   secondary, INTENSIVE PA - above-average (574 respondents); attribute labels
   below it: SITTING - low, NATIONALITY - Czech, WALKING - high, BIKE - has,
   SPORT ACTIVITY - yes, CAR - has, EMPLOYED; intermediate concepts with between
   451 and 569 respondents; least concept (0 respondents).]

            Fig. 5. Sublattice containing 75%-compatible concepts


8    Future research

We now comment on some further topics and future research (some of these are
studied in [6]).

 – A concept lattice may be thought of as a hierarchical clustering scheme.
   The partition corresponding to ≡ represents another clustering (more
   generally, we can think of a hierarchical clustering scheme). Several
   interesting problems arise here (constraining one clustering by the other,
   comparing the clusterings, measuring their mutual consistency, etc.); work on
   these is in progress.
 – There are more ways of creating a context from a questionnaire. A natural way
   is to use fuzzy logic and fuzzy sets. We will experiment with creating a
   fuzzy context and with methods of constraining the resulting fuzzy concept
   lattice by fuzzy relations. The main ideas of fuzzy concept analysis are in
   [3],[4].


9    Conclusion

Our way of evaluation gives a new point of view on the data contained in a
questionnaire. Based on the first response from the expert who worked with our
results, we can say that our approach may be useful for finding dependencies
between properties of questionnaire respondents.

Acknowledgment Supported by grant No. 1ET101370417 of the GA AV CR.


References
1. G. Ammons, D. Mandelin, R. Bodik, J. R. Larus. Debugging temporal specifications
  with concept analysis. In Proc. ACM SIGPLAN’03 Conference on Programming
  Language Design and Implementation, pages 182–195, San Diego, CA, June 2003.
2. R. Bělohlávek, V. Sklenář, J. Zacpal. Formal concept analysis with hierarchically
  ordered attributes. Int. J. General Systems 33(4)(2004), 283-294.
3. Bělohlávek R.: Fuzzy Relational Systems: Foundations and Principles. Kluwer Aca-
  demic/Plenum Publishers, New York, 2002.
4. Bělohlávek R.: Concept lattices and order in fuzzy logic. Annals of Pure and Applied
  Logic 128(1-3)(2004), 277-298.
5. Bělohlávek R., Sklenář V.: Formal Concept Analysis Constrained by ADF. In: Proc.
  ICFCA 2005, pp. 176–191. [ISBN 3-540-24525-1]
6. Bělohlávek R., Sklenář V., Zacpal J.: Concept lattices constrained by equivalence
  relations. In: Proc. CLA 2004, pp. 58–66. [ISBN 80-248-0597-9]
7. Bělohlávek R., Sklenář V., Zacpal J.: Concept lattices constrained by systems of
  partitions. In: Proc. Znalosti 2005, pp. 5–8.
8. C. Carpineto, R. Romano. A lattice conceptual clustering system and its application
  to browsing retrieval. Machine Learning 24:95–122, 1996.
9. R. Cole, P. Eklund. Scalability in formal context analysis: a case study using medical
  texts. Computational Intelligence 15:11–27, 1999.
10. U. Dekel, Y. Gill. Visualizing class interfaces with formal concept analysis. In
  OOPSLA’03, pages 288–289, Anaheim, CA, October 2003.
11. B. Ganter, R. Wille. Formal Concept Analysis. Mathematical Foundations.
  Springer-Verlag, Berlin, 1999.
12. O. S. Kuznetsov, S. A. Obiedkov. Comparing performance of algorithms for gen-
  erating concept lattices. J. Exp. Theor. Artif. Intelligence 14(2/3):189–216, 2002.
13. D. Maier. The Theory of Relational Databases. Computer Science Press, Rockville,
  1983.
14. O. Ore. Galois connections. Trans. Amer. Math. Soc. 55:493–513, 1944.
15. G. Stumme, R. Wille, U. Wille. Conceptual knowledge discovery in databases using
  formal concept analysis methods. In J. M. Zytkow, M. Quafafou (Eds.). Principles
  of Data Mining and Knowledge Discovery. LNAI 1510, pages 450–458, Springer,
  Heidelberg, 1998.
16. P. Valtchev, R. Missaoui, R. Godin, M. Meridji. Generating frequent itemsets
  incrementally: two novel approaches based on Galois lattice theory. J. Exp. Theor.
  Artif. Intelligence 14(2/3):115–142, 2002.