<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Saleh Alwahaishi, Jan Martinovič, Václav Snášel, and Miloš Kudělka</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saleh Alwahaishi</string-name>
          <email>s@b.vcszb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Martinovič</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Václav Snášel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miloš Kudělka</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FEECS, VŠB - Technical University of Ostrava, Department of Computer Science</institution>
          ,
          <addr-line>FEECS, VSB</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <fpage>132</fpage>
      <lpage>139</lpage>
      <abstract>
        <p>The definitive classification of scientific journals depends on the details of their aims and scopes. In this paper, we present an approach to facilitate the classification of the journals in the DBLP dataset. For the analysis, the DBLP data were pre-processed by assigning to each journal attributes defined by its topics. We then show how the theory of formal concept analysis can be applied to analyze the relations between journals and the topics extracted from their aims and scopes, and how this approach can be used to facilitate the classification of scientific journals.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Digital Bibliography &amp; Library Project (DBLP)</title>
      <p>Digital libraries are collections of resources and services stored in digital formats and
accessed by computers. They offer an interesting case study for researchers
for two reasons: first, they grow quickly; second, they represent a
multidisciplinary domain that has attracted researchers from a wide range of
expertise. DBLP (Digital Bibliography &amp; Library Project) is a computer science
bibliography database hosted at the University of Trier in Germany.</p>
      <p>
        DBLP was started at the end of 1993 and listed more than one million articles on
computer science in January 2010. These articles were published in journals such as
VLDB, the IEEE and ACM Transactions, and in conference proceedings [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
Besides being a credible resource for finding publications, the DBLP dataset has
been widely investigated in studies related to data mining and social
networks, for tasks such as recommender systems, expert finding, and name
ambiguity. Although the DBLP dataset provides abundant information about
author relationships, conferences, and scientific communities, it has a major limitation:
its records provide only the paper title, without the abstract or index terms.
      </p>
      <p>
        In addition to using the DBLP dataset for finding academic experts, it has been used
extensively in academic recommender systems. A number of studies were conducted
to recommend academic events and collaborators for researchers using different
methods and techniques. For example, a recommender system for academic
collaboration called DBconnect was presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The authors of that paper used
DBLP data to generate bipartite (author-conference) and tripartite
(author-conference-topic) graph models, and designed a random walk algorithm for these models to
calculate the relevance score between authors. In another study [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] a
recommender system for events and scientific communities for researchers was
proposed based on social network analysis.
      </p>
      <p>
        Querying large datasets produces large result sets, which leaves the user unable to
decide where to start looking at the results. To address this problem,
clustering and ranking have been suggested in many papers. A system to visualize author
information and relationships simultaneously was presented in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The authors
applied two types of clustering, keyword clustering and author clustering to visualize
the relationships and groupings of authors. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] document clustering was applied to
provide an overview of the recent trends in data mining activities. Clustering and
ranking are often applied separately but in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] a novel framework called RankClus
was proposed to integrate them. To increase the accuracy of IR clustering, the authors
in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] proposed transferring knowledge available on the word side to the document
side; they introduced a model based on nonnegative matrix factorization to achieve it.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Concept Analysis of Journals Classification</title>
      <p>This section describes how formal concept analysis is employed to analyze the
classification of DBLP journals. Formal concept analysis can be used as an
unsupervised clustering technique. The starting point of the analysis is a database
table consisting of rows G (i.e. objects), columns M (i.e. attributes) and crosses
I ⊆ G×M (i.e. relationships between objects and attributes). The mathematical structure
used to represent such a cross table is called a formal context (G, M, I).</p>
      <p>
        A group of similar journals of interest, all within the scope of computer
science, was selected. The list of selected journals (objects) was obtained from the
well-known DBLP database, which contains information about published articles and their
authors. The selected list of journal links contains 115 items. The
next step was to identify the main topics (attributes) that each journal covers.
From the journal web sites we obtained the aim and scope of each journal and
manually extracted the main topics, such as Pattern Recognition, Image Processing,
etc. Because journals use their own names, or similar names, for topics, each journal
was then identified using an existing classifier provided by a company. The classifier
contains about 1224 sub-disciplines, which are grouped into disciplines, which are in turn
grouped into discipline fields; e.g. the sub-discipline Pattern Recognition is in the discipline Artificial
Intelligence and Image Processing, which is in the field Information and Computing Sciences
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. We selected only sub-disciplines in the fields Technology and Information and
Computing Sciences. In many cases our manually extracted topics
correspond to the classified disciplines, but in some cases it was necessary to assign the
extracted topic to the most similar sub-discipline. Journals were therefore
classified into a list of topics based on their relation to each topic. The classification
process ended with ten main topics that have twenty-nine subfields or disciplines.
Table 2 shows the main topics and their subfields.
      </p>
      <p>A journal is represented as a list of topics. The topics are the disciplines
covered by the journals, based on the data extracted from their aims and scopes. Each
topic is assigned a weight of 0 or 1. A topic’s weight for a journal expresses
whether the journal covers the topic: a value of 1 denotes that the
journal covers the column’s topic and 0 denotes the lack of coverage. Formally, these
data can be represented as a journal-topic matrix whose m rows and n columns
correspond to m journals and n topics, respectively. The elements of the journal-topic
matrix are the weights of each topic for a particular journal, that is:
<disp-formula><tex-math><![CDATA[
Y = \begin{pmatrix}
y_{11} & y_{12} & \cdots & y_{1n} \\
y_{21} & y_{22} & \cdots & y_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
y_{m1} & y_{m2} & \cdots & y_{mn}
\end{pmatrix}, \qquad y_{ij} \in \{0, 1\}
]]></tex-math></disp-formula>
where yij denotes the weight assigned to topic Tj for journal Ji.</p>
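      <p>As a minimal sketch of the representation described above, the journal-topic matrix and the incidence relation of the corresponding formal context could be built as follows. The journal labels J1 to J3 and their topic assignments are hypothetical and are used only for illustration; only the topic abbreviations appear in this paper.</p>
      <preformat><![CDATA[
```python
# Hypothetical toy data: the journal labels J1..J3 and their topic sets are
# illustrative only; they are not taken from the paper's tables.
journal_topics = {
    "J1": {"AIIP", "CTM"},
    "J2": {"CTM"},
    "J3": {"AIIP", "DIP"},
}


def build_matrix(journal_topics):
    """Return (journals, topics, Y) where Y[i][j] is 1 if journal i covers topic j."""
    journals = sorted(journal_topics)
    topics = sorted(set().union(*journal_topics.values()))
    Y = [[1 if t in journal_topics[j] else 0 for t in topics] for j in journals]
    return journals, topics, Y


def incidence(journals, topics, Y):
    """Incidence relation of the formal context: all (journal, topic) pairs with weight 1."""
    return {
        (journals[i], topics[j])
        for i, row in enumerate(Y)
        for j, y in enumerate(row)
        if y == 1
    }


journals, topics, Y = build_matrix(journal_topics)
I = incidence(journals, topics, Y)

# Print the cross table row by row (1 = covered, 0 = not covered).
print("  ", topics)
for name, row in zip(journals, Y):
    print(name, row)
```
]]></preformat>
      <p>Each row of the printed table corresponds to one journal and each column to one topic, so a 1 in the sketch plays the role of a cross in the cross table.</p>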
      <p>The formal concept analysis of the data starts with the creation of a formal context.
The formal objects of the formal context are the journals Ji that were retrieved from
the DBLP database. The set of these journals is denoted by J. The topics Tj, which were
extracted from the aims and scopes of the journals in J and indicate which topics
those journals cover, constitute the formal attributes of
the formal context. The set containing these attributes is denoted by T.</p>
      <p>The cross table of the resulting formal context has a row for each journal in J, a
column for each topic in T, and a cross in the row of Ji and the column of Tj if the
corresponding weight yij is 1. To reduce the size of the cross table, journal impact factors
are used to decrease the number of journals analyzed: only journals with an
impact factor of 3.0 or above are included in the matrix, which reduces the number of
selected journals to 18, as shown in Table 3. After the formal context is
constructed, formal concept analysis is applied to produce the concept lattice.</p>
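      <p>The two steps above, filtering by impact factor and deriving the concepts, can be sketched as follows. This is an illustrative brute-force enumeration under stated assumptions, not the tool or algorithm used for the analysis in this paper: the journal labels, impact factors and topic sets are invented, and only the threshold of 3.0 comes from the text.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical records: labels, impact factors and topic sets are made up
# for illustration and do not reproduce the paper's tables.
records = {
    "J1": (4.2, {"AIIP", "CTM"}),
    "J2": (1.1, {"CTM"}),
    "J3": (3.0, {"AIIP", "DIP"}),
    "J4": (3.5, {"AIIP", "CTM", "DIP"}),
}

THRESHOLD = 3.0  # the cut-off stated in the text


def filter_by_impact(records, threshold):
    """Keep journals whose impact factor is at least the threshold."""
    return {j: topics for j, (impact, topics) in records.items() if impact >= threshold}


def all_concepts(context):
    """Enumerate all formal concepts of a small context by brute force.

    Every concept intent is the set of topics shared by some set of journals,
    so we intersect the topic sets of every subset of journals. This is
    exponential in the number of journals and only meant for tiny examples.
    """
    journals = sorted(context)
    all_topics = set().union(*context.values()) if context else set()
    intents = set()
    for r in range(len(journals) + 1):
        for group in combinations(journals, r):
            shared = set(all_topics)
            for j in group:
                shared &= context[j]
            intents.add(frozenset(shared))
    concepts = []
    for intent in intents:
        # The extent is the set of journals that cover every topic of the intent.
        extent = frozenset(j for j in journals if intent <= context[j])
        concepts.append((extent, intent))
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))


context = filter_by_impact(records, THRESHOLD)
for extent, intent in all_concepts(context):
    print(sorted(extent), sorted(intent))
```
]]></preformat>
      <p>For a context of realistic size a dedicated formal concept analysis algorithm would be preferable; the exhaustive enumeration is chosen here only because it follows the definition directly.</p>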
      <p>Table 4 represents the formal context. A cross in the row of Ji and the column of Tj
indicates that Tj is considered to be a topic covered by journal Ji.</p>
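      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Journals’ impact factors and abbreviations</p>
        </caption>
        <table>
          <thead>
            <tr><th>Abbreviation</th><th>Journal</th><th>Impact Factor</th></tr>
          </thead>
          <tbody>
            <tr><td>A</td><td>Nucleic Acids Research</td><td>6.878</td></tr>
            <tr><td>B</td><td>IEEE Transactions on Pattern Analysis and Machine Intelligence</td><td>5.96</td></tr>
            <tr><td>C</td><td>International Journal of Computer Vision</td><td /></tr>
            <tr><td>D</td><td>Computer Applications in the Biosciences</td><td /></tr>
            <tr><td>E</td><td>Journal of Selected Areas in Communications</td><td /></tr>
            <tr><td>F</td><td>Transactions on Medical Imaging</td><td /></tr>
            <tr><td>G</td><td>Transactions on Information Theory</td><td /></tr>
            <tr><td>H</td><td>BMC Bioinformatics</td><td /></tr>
            <tr><td>I</td><td>Transactions on Neural Networks</td><td /></tr>
            <tr><td>J</td><td>Journal of Chemical Information and Computer Sciences</td><td /></tr>
            <tr><td>K</td><td>Transactions on Fuzzy Systems</td><td /></tr>
            <tr><td>L</td><td>Journal of Computational Chemistry</td><td /></tr>
            <tr><td>M</td><td>Transactions on Graphics</td><td /></tr>
            <tr><td>N</td><td>Transactions on Mobile Computing</td><td /></tr>
            <tr><td>O</td><td>Transactions on Image Processing</td><td /></tr>
            <tr><td>P</td><td>Pattern Recognition</td><td /></tr>
            <tr><td>Q</td><td>Automatica</td><td /></tr>
            <tr><td>R</td><td>Information Sciences</td><td /></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The intent of each formal concept contains precisely those topics covered by all
journals in the extent. Conversely, the extent contains precisely those journals sharing
all topics in the intent.</p>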
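      <p>The relationship between extent and intent stated above can be checked mechanically. The following sketch assumes a small hypothetical context (the journal labels J1 to J3 and their topic assignments are invented) and tests whether a given pair of journal set and topic set is a formal concept.</p>
      <preformat><![CDATA[
```python
# Hypothetical context: journal labels and topic assignments are illustrative only.
context = {
    "J1": {"CS", "CTM"},
    "J2": {"CS", "CTM", "STVV"},
    "J3": {"CS"},
}


def common_topics(journals, context):
    """Topics covered by every journal in the given set (all topics if the set is empty)."""
    shared = set().union(*context.values())
    for j in journals:
        shared &= context[j]
    return shared


def common_journals(topics, context):
    """Journals that cover every topic in the given set."""
    return {j for j, covered in context.items() if set(topics) <= covered}


def is_concept(extent, intent, context):
    """A pair (extent, intent) is a formal concept exactly when each set determines the other."""
    return (
        common_topics(extent, context) == set(intent)
        and common_journals(intent, context) == set(extent)
    )


print(is_concept({"J2"}, {"CS", "CTM", "STVV"}, context))  # True
print(is_concept({"J1"}, {"CS", "CTM"}, context))  # False: J2 also covers both topics
```
]]></preformat>
      <p>The second pair fails because its extent is not maximal: every journal sharing all topics of the intent must belong to the extent.</p>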
      <p>The line diagram of the concept lattice in Fig. 1 shows the partially ordered set of
concepts using the minimal set of edges necessary; all other order relations
can be derived by reflexivity and transitivity. Journals and topics label the node
that represents the formal concept they generate. All concept nodes above a node
labeled by a journal have the journal in their extent. All concept nodes below a node
labeled by a topic have the topic in their intent. The extent of the concept node labeled
by the topic “STVV”, for example, is found by collecting the journals that label
this concept node and the nodes on paths going downward from it; here this yields only the journal H.
The intent of this concept is found by first collecting the topic “STVV” and then going
upward to collect the topics “CTM” and “CS”, which label the two concepts found on
paths going upward. The resulting extent-intent pair of this concept is ({H},
{CTM,CS,STVV}).</p>
      <p>[Table 4. Formal context: rows are the journals A to R; a cross (x) marks each topic covered by a journal.]</p>
      <p>The concept generated by the topic “DF” is a subconcept of the concept generated
by the topic “AIIP”, because the extent of the former concept is contained in the extent of
the latter. All journals classified by the topic “DF” were also classified by the
topic “AIIP”, suggesting that within the given formal context “DF” is a more specific
topic than “AIIP”.</p>
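      <p>The subconcept test used in this argument compares extents. As a sketch under stated assumptions, the following code builds the concepts generated by two topics in a hypothetical context and checks the ordering; only the topic abbreviations come from the text, while the journals J1 to J4 and their assignments are invented.</p>
      <preformat><![CDATA[
```python
# Hypothetical context: only the topic abbreviations come from the text;
# the journals and their assignments are invented for illustration.
context = {
    "J1": {"AIIP", "DF"},
    "J2": {"AIIP"},
    "J3": {"AIIP", "DF", "CTM"},
    "J4": {"CTM"},
}


def extent_of(topics, context):
    """Journals covering every topic in the given set."""
    return frozenset(j for j, covered in context.items() if set(topics) <= covered)


def intent_of(journals, context):
    """Topics shared by every journal in the given set."""
    shared = set().union(*context.values())
    for j in journals:
        shared &= context[j]
    return frozenset(shared)


def attribute_concept(topic, context):
    """Concept generated by a single topic: its extent and the topics that extent shares."""
    extent = extent_of({topic}, context)
    return extent, intent_of(extent, context)


def is_subconcept(lower, upper):
    """One concept is a subconcept of another when its extent is contained in the other's."""
    return lower[0] <= upper[0]


df = attribute_concept("DF", context)
aiip = attribute_concept("AIIP", context)
print(sorted(df[0]), sorted(df[1]))
print(is_subconcept(df, aiip), is_subconcept(aiip, df))
```
]]></preformat>
      <p>In this toy context every journal covering “DF” also covers “AIIP”, so the concept generated by “DF” lies below the one generated by “AIIP”, and not the other way round.</p>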
      <p>A more complex example is the concept node
labeled by the topic “DIP”. Its extent is found by collecting the journal Q, which labels this
concept node on a path going downward. Its intent is found by
collecting the topics “CTM”, “CA”, and “ISLIS”, which label the three concepts found on
paths going upward. The concepts generated by the latter two topics, however, are subconcepts of the concept
generated by the topic “AIIP”. The resulting extent-intent pair of this concept is ({Q},
{AIIP,CTM,CS,ISLIS,CA,DIP}).</p>
      <p>Fig. 1. Concept lattice for journals classification</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>The concept lattice uncovers relational and contextual information. Journals’ topic
categorizations are put into relational context depending on how they are associated
by the journals’ aims and scopes. The topics “Computer Theory and Mathematics (CTM)”
and “Data and Information Processing (DIP)”, for example, are shown as
related because these topics share a similar classification context. The implicit
structures revealed help researchers to classify journals more efficiently. This
approach has the potential to support the emergence of new knowledge by identifying
concept relations, making these explicit and enabling researchers to inspect these
concept relations.</p>
      <p>
        Concept lattices are not intended to build or replace traditional static ontologies;
rather, they aim to support the specification of less rigorous relations, or associations
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which might be more intuitive to knowledge workers and lead to more
interesting links via associations.
  
 
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>1982</year>
          ).
          <article-title>Restructuring lattice theory: an approach based on hierarchies of concepts</article-title>
          .
          In I. Rival (Ed.),
          <source>Ordered sets</source>
          . Reidel, Dordrecht-Boston.
          <fpage>445</fpage>
          -
          <lpage>470</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>1999</year>
          )
          <article-title>Formal Concept Analysis: Mathematical foundations</article-title>
          . Springer
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Beydoun</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2009</year>
          )
          <article-title>Using Formal Concept Analysis towards Cooperative E-Learning</article-title>
          .
          <string-name>
            <given-names>D.</given-names>
            <surname>Richards</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.H.</given-names>
            <surname>Kang</surname>
          </string-name>
          (Eds.): PKAW, LNAI
          <volume>5465</volume>
          ,
          <fpage>109</fpage>
          -
          <lpage>117</lpage>
          . Springer
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Priss</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          (
          <year>2006</year>
          ),
          <article-title>Formal Concept Analysis in Information Science</article-title>
          . Cronin, Blaise (ed.).
          <source>Annual Review of Information Science and Technology, ASIST</source>
          , Vol.
          <volume>40</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>Scale Coarsening as Feature Selection</article-title>
          . R. Medina and S. Obiedkov (Eds.): ICFCA, LNAI
          <volume>4933</volume>
          ,
          <fpage>217</fpage>
          -
          <lpage>228</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stumme</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          (
          <year>1998</year>
          )
          <article-title>Conceptual knowledge discovery in databases using Formal Concept Analysis Methods</article-title>
          . PKDD,
          <fpage>450</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ley</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2002</year>
          )
          <article-title>The dblp computer science bibliography: Evolution, research issues, perspectives</article-title>
          .
          <source>SPIRE 2002: Proceedings of the 9th International Symposium on String Processing and Information Retrieval</source>
          . London, UK: Springer-Verlag, pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. URL, http://en.wikipedia.org/wiki/DBLP.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zaiane</surname>
            ,
            <given-names>O. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Goebel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2007</year>
          )
          <article-title>Dbconnect: mining research community on dblp data</article-title>
          .
          <source>WebKDD/SNA-KDD '07: Proceedings of the 9th WebKDD and 1st SNAKDD 2007 workshop on Web mining and social network analysis</source>
          . New York, NY, USA: ACM, pp.
          <fpage>74</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>R.</given-names>
            <surname>Klamma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Cuong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cao</surname>
          </string-name>
          (
          <year>2009</year>
          )
          <article-title>You never walk alone: Recommending academic events based on social network analysis</article-title>
          .
          <source>Complex (1)</source>
          , pp.
          <fpage>657</fpage>
          -
          <lpage>670</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pon</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Cardenas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2006</year>
          )
          <article-title>Visualization and clustering of author social networks</article-title>
          .
          <source>Distributed Multimedia Systems Conference</source>
          , pp.
          <fpage>174</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2006</year>
          )
          <article-title>Recent trends in data mining: Document clustering of dm publications</article-title>
          .
          <source>International Conference on Service Systems and Service Management</source>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>1653</fpage>
          -
          <lpage>1659</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yin</surname>
          </string-name>
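          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cheng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Wu</surname>
          </string-name>
          , “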
          <article-title>Rankclus: integrating clustering with ranking for heterogeneous information network analysis,”</article-title>
          <source>in EDBT '09: Proceedings of the 12th International Conference on Extending Database Technology</source>
          . New York, NY, USA: ACM,
          <year>2009</year>
          , pp.
          <fpage>565</fpage>
          -
          <lpage>576</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Shao</surname>
          </string-name>
          , “
          <article-title>Knowledge transformation from word space to document space,”</article-title>
          <source>in SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval</source>
          . New York, NY, USA: ACM,
          <year>2008</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Obadi</surname>
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Drazdilova</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hlavacek</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinovic</surname>
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Snasel</surname>
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2010</year>
          )
          <article-title>A Tolerance Rough Set Based Overlapping Clustering for the DBLP Data</article-title>
          .
          <source>IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology</source>
          , pp.
          <fpage>57</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Krohn</surname>
            <given-names>U</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davies</surname>
            <given-names>NJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weeks</surname>
            <given-names>R</given-names>
          </string-name>
          (
          <year>1999</year>
          )
          <article-title>Concept lattices for knowledge management</article-title>
          .
          <source>BT Technology Journal</source>
          ,
          <volume>17</volume>
          (
          <issue>4</issue>
          ):
          <fpage>108</fpage>
          -
          <lpage>116</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>