<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of the Community of Learning Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sadia Nawaz</string-name>
          <email>sadia@alumni.purdue.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Farshid Marbouti</string-name>
          <email>fmarbout@purdue.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Strobel</string-name>
          <email>jstrobel@purdue.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Purdue University</institution>
          ,
          <addr-line>West Lafayette, IN</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents trends in the learning analytics community in terms of authors, their affiliations and their geographical locations. It thereby brings out the most influential authors, institutes, and countries that have been actively contributing to this field. In addition, the paper identifies collaborations among authors, institutes, and countries, and explores the research themes followed by the learning analytics community.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Graph Metric (graph theory terminologies)</title>
      <sec id="sec-1-1">
        <title>Total unique vertices / nodes (authors)</title>
        <p>Unique edges (an edge is a loop for single-author articles &amp; a straight line otherwise)
Edges with duplicates (i.e., edge weight is greater than 1)
(These edges show joint authorship in more than one publication)</p>
      </sec>
      <sec id="sec-1-2">
        <title>Total edges</title>
      </sec>
      <sec id="sec-1-3">
        <title>Self-loop (single author articles)</title>
      </sec>
      <sec id="sec-1-4">
        <title>Multi-author article count</title>
        <p>Connected components (authors forming a cluster based on authorship)</p>
      </sec>
      <sec id="sec-1-5">
        <title>Single-vertex connected components (count of the authors of single-author articles who did not collaborate); Maximum vertices in a connected component; Maximum edges in a connected component</title>
        <p>2008: 74, 100, 17
2009: 79, 106, 18, 124, 1, 31, 22, 0, 7
2010: 151, 208, 50, 258, 3, 61, 38, 3, 15
2011: 193, 251, 42, 293, 10, 75, 53, 8, 29
2012: 281, 435, 48, 483, 8, 96, 79, 7, 22
Total: 623, 938, 337, 1275, 26, 27, 140, 14, 113</p>
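        <p>The connected-component metrics named above can be illustrated with a minimal sketch. It assumes each article is available as a list of author names; the function name connected_components and the toy papers list are hypothetical and are not taken from NodeXL or the LAK dataset. Authors are grouped with a simple union-find structure, so that any two authors linked by a chain of shared articles fall into the same component.</p>
        <preformat>
```python
def connected_components(papers):
    """Group authors into clusters linked by shared authorship (union-find)."""
    parent = {}

    def find(a):
        # Register unseen authors as their own root, then walk up to the root.
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving keeps trees shallow
            a = parent[a]
        return a

    for authors in papers:
        if not authors:
            continue
        first = find(authors[0])
        for other in authors[1:]:
            # Attach the co-author's cluster to the first author's cluster.
            parent[find(other)] = first

    groups = {}
    for author in parent:
        groups.setdefault(find(author), set()).add(author)
    return list(groups.values())


# Hypothetical toy data: one author list per article.
papers = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
components = connected_components(papers)
sizes = sorted(len(c) for c in components)
single_vertex = sum(1 for c in components if len(c) == 1)
print(len(components), single_vertex, max(sizes))
```
        </preformat>
        <p>In this toy example the solo author D forms a single-vertex component, which corresponds to an author of a single-author article who did not collaborate with anyone else.</p>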
        <sec id="sec-1-5-1">
          <title>4. COLLABORATION TRENDS</title>
          <p>Collaboration, as defined in the Oxford dictionary [8], is the ‘action of
working with someone to produce something’; in the current
context it represents co-authorship of an article by two or more
researchers. The term can be extended to institutes and even
countries, so collaboration patterns are also
extracted between and within institutes and countries.
Table 2 shows that 938 pairs of authors
collaborated just once (this number includes single-author articles,
since in that case a self-loop serves as an edge from an author to itself).
Put differently, these pairs account for 73.57% of the 1275 edges in Table 2. This could
mean either that new collaborations are forming or that the
authors published just once and then moved on to other
research areas, other co-authors or other
venues. Initiatives such as the LAK Data Challenge may therefore
attract more researchers to this field and help the
authorship networks grow and develop further.</p>
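          <p>The edge counts discussed here can be sketched as follows. This is an illustration only: it assumes each article is a list of author names, treats a single-author article as a self-loop, and assumes that the count of edges with duplicates includes every occurrence of a repeated pair. The function names and the toy papers list are hypothetical. The last line reproduces the share quoted in the text from the two totals in Table 2.</p>
          <preformat>
```python
from collections import Counter
from itertools import combinations


def coauthor_edge_weights(papers):
    """Count how often each author pair (or solo self-loop) occurs."""
    weights = Counter()
    for authors in papers:
        unique = sorted(set(authors))
        if len(unique) == 1:
            # A single-author article is modelled as a self-loop.
            weights[(unique[0], unique[0])] += 1
        else:
            for pair in combinations(unique, 2):
                weights[pair] += 1
    return weights


def edge_summary(weights):
    """Split edges into those seen once and those seen repeatedly."""
    once = sum(1 for w in weights.values() if w == 1)
    # Assumption: every occurrence of a repeated pair counts as one edge.
    repeated = sum(w for w in weights.values() if w != 1)
    total = once + repeated
    share = once / total if total else 0.0
    return {"once": once, "repeated": repeated, "total": total,
            "share_once": share}


# Hypothetical toy data, not taken from the LAK dataset.
papers = [["A", "B"], ["A", "B"], ["A", "C"], ["D"]]
summary = edge_summary(coauthor_edge_weights(papers))
print(summary)

# Share implied by Table 2: 938 single-occurrence edges out of 1275.
print(round(938 / 1275 * 100, 2))
```
          </preformat>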
        </sec>
        <sec id="sec-1-5-2">
          <title>5. DIVERSITY</title>
          <p>Diversity in this context is the count of distinct researchers a
given author has worked with. Table 4 identifies
the contributors who have worked with the most diverse groups of
authors; e.g., K.R. Koedinger has worked with 34 distinct authors
and Ryan Baker with 25. We also
extracted the graph of these top contributors (ranked by degree),
i.e., a graph that includes the top authors and all of their
collaborators. This graph consists of 128
authors (roughly 21% of all authors), which indicates
how much the top authors contribute to EDM, LAK, JETS and
to learning analytics in general.</p>
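          <p>A minimal sketch of this diversity measure follows, assuming each article is a list of author names. The helper names and the toy data are hypothetical. The first function computes the set of distinct co-authors per author (its size is the degree), and the second collects the k highest-degree authors together with all of their collaborators, which is the kind of subgraph described above.</p>
          <preformat>
```python
from collections import defaultdict


def distinct_coauthors(papers):
    """Map each author to the set of distinct people they published with."""
    neighbours = defaultdict(set)
    for authors in papers:
        for author in authors:
            # Solo authors still get an (empty) entry.
            neighbours[author].update(a for a in authors if a != author)
    return neighbours


def top_author_neighbourhood(neighbours, k):
    """Return the k highest-degree authors plus all of their collaborators."""
    # Sort by descending degree, breaking ties alphabetically.
    ranked = sorted(neighbours, key=lambda a: (-len(neighbours[a]), a))
    top = ranked[:k]
    covered = set(top)
    for author in top:
        covered |= neighbours[author]
    return top, covered


# Hypothetical toy data, not taken from the LAK dataset.
papers = [["A", "B", "C"], ["A", "D"], ["E", "F"], ["G"]]
neighbours = distinct_coauthors(papers)
top, covered = top_author_neighbourhood(neighbours, 1)
share = len(covered) / len(neighbours)
print(top, sorted(covered), round(share, 2))
```
          </preformat>
          <p>The ratio of covered authors to all authors plays the same role as the 128 of 623 authors reported above.</p>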
        </sec>
        <sec id="sec-1-5-3">
          <title>6. GEOGRAPHICAL LOCATION</title>
          <p>
            Next, the geographical analysis of this dataset is presented, which
explores the countries that have been advancing this field,
especially through contributions to the venues EDM, LAK and
JETS. There have been contributions from 41 different countries.
To extract this information, all aliases of a country’s name
were merged; e.g., Netherland, Netherlands and The_Netherlands
were treated as one country. The top countries with
international collaborations are listed in Table 5; the USA
and the UK top the list. To illustrate the collaboration
patterns between countries, Figure 1 was drawn using ‘NetDraw’. In
this figure an edge between two countries depicts
co-authorship between researchers from these countries. The edge
width (also shown as a number) gives the strength of the
collaboration. Different symbols are used for
nodes according to their ‘betweenness’ values. ‘Betweenness
centrality’ is the “number of times a node acts as a bridge along
the shortest path between two other nodes” [9]. The USA, the UK
and Germany top this list on both degree and centrality
measures. Most of the nodes have a ‘betweenness’
value of zero, depicted with a ‘+’ symbol. This indicates the
peripheral nature of these nodes and suggests that the field is still
emerging or growing: newer nodes are being added and the
graph is currently sparse. Figure 2 illustrates the geographical
diversity of collaborators. Smaller circles show less
diversity in collaboration (with researchers from other
countries), whereas larger circles indicate countries
whose researchers have a more diverse group of co-authors (from
across the world). A small table at the bottom of this figure gives
the count of papers from each continent, and so brings out the
most active regions for research in learning analytics:
North America and Europe are at the top of this list
(the complete geographical mapping is available at [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]).
          </p>
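          <p>The two steps described in this section, merging country aliases and computing betweenness, can be sketched as follows. This is not the NetDraw procedure used for Figure 1; it is a small stand-in that assumes an unweighted, undirected country graph and uses Brandes' algorithm, which splits credit fractionally when several shortest paths exist. The ALIASES table, the function names and the toy papers list are hypothetical.</p>
          <preformat>
```python
from collections import deque

# Hypothetical alias table; the aliases follow the example in the text.
ALIASES = {"Netherland": "Netherlands", "The_Netherlands": "Netherlands"}


def canonical(name):
    """Map a country alias to one canonical spelling."""
    return ALIASES.get(name, name)


def country_graph(papers):
    """Build an undirected graph linking countries that co-author a paper."""
    graph = {}
    for countries in papers:
        merged = {canonical(c) for c in countries}
        for c in merged:
            graph.setdefault(c, set()).update(merged - {c})
    return graph


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] == -1:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair of endpoints was visited from both sides.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy data: one list of author countries per paper.
papers = [["USA", "UK"], ["USA", "Netherland"],
          ["USA", "The_Netherlands"], ["USA", "Germany"], ["Spain"]]
graph = country_graph(papers)
scores = betweenness(graph)
print(sorted(scores.items()))
```
          </preformat>
          <p>In this toy graph only the hub country has a non-zero score, while every other node has a betweenness of zero, mirroring the peripheral nodes marked with ‘+’ in Figure 1.</p>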
        </sec>
        <sec id="sec-1-5-4">
          <title>7. AUTHOR AFFILIATION</title>
          <p>Next, the institutional affiliation of authors was analyzed, and it
was found that there have been contributions from 200 different
institutes worldwide. The ranking of the top few institutes in
terms of collaboration with other institutes is provided in Table
6. The term degree represents the count of unique institutes that a
given institute has worked with. It can be
influenced by both the ‘article counts’ and the ‘co-author
counts’. Table 7 lists the institutes with the highest count of
intra-institute collaboration, and Table 8 lists the ‘institute
pairs’ with the highest collaboration. Such analysis is
beneficial to research institutes and organizations that wish to
collaborate and extend further studies in the field of
learning analytics. Figure 3 illustrates trends of collaboration
between institutes.</p>
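          <p>The three institute-level measures mentioned here (degree, intra-institute collaboration and institute pairs) can be sketched as below. The sketch assumes each article is given as a list with one institute name per author, and it assumes that an intra-institute collaboration is an article with at least two authors from the same institute; the paper does not spell out its exact counting rule. All names and data are hypothetical.</p>
          <preformat>
```python
from collections import Counter
from itertools import combinations


def institute_collaboration(papers):
    """Return (degree, intra, pair_weights) from per-author institute lists."""
    partners = {}
    intra = Counter()
    pair_weights = Counter()
    for institutes in papers:
        distinct = sorted(set(institutes))
        for inst in distinct:
            others = [i for i in distinct if i != inst]
            partners.setdefault(inst, set()).update(others)
            # Assumption: two or more authors from one institute on the
            # same article count as one intra-institute collaboration.
            if institutes.count(inst) > 1:
                intra[inst] += 1
        for pair in combinations(distinct, 2):
            pair_weights[pair] += 1
    # Degree is the number of unique partner institutes.
    degree = {inst: len(p) for inst, p in partners.items()}
    return degree, intra, pair_weights


# Hypothetical toy data, not taken from the LAK dataset.
papers = [["A", "A", "B"], ["A", "C"], ["A", "B"], ["C", "C"]]
degree, intra, pair_weights = institute_collaboration(papers)
print(degree)
print(dict(intra))
print(pair_weights.most_common(1))
```
          </preformat>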
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Institute</title>
      <p>Carnegie Mellon University
University of Cordoba
Stanford University
Fraunhofer Institute for Applied Information Technology
Dept. Computer wetenschappen, KU Leuven
Worcester Polytechnic Institute
Open University of the Netherlands
University of Pittsburgh</p>
      <sec id="sec-2-1">
        <title>8. RESEARCH THEMES</title>
        <p>
          To track the research themes followed by the learning
analytics community and to see how they emerged over time, the
authors conducted a keyword-based analysis. The information
for this analysis was initially extracted from the keyword (subject)
field of the data provided on the Society for Learning Analytics
Research (SoLAR) website [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. However, this field is empty for the first two years
(2008-2009) and for some of the articles
in later years. Therefore, it was decided to
use the ‘title’ field for keyword extraction. The
choice of the ‘title’ field over the ‘abstract’ field
relies on an earlier study by the
authors of this paper [10]. The Hermetic Word Frequency
Counter (HWFC) software [11] was then used to extract the top 30
keywords for each year. Common English words
are already ignored by this software through its stop-word
list. Other words that are obvious given the nature of the venues
EDM, LAK and JETS were then manually eliminated, since
they would not add insight to this
analysis, e.g., student, learn, knowledge and education. Further
refinement merged variants of the same
word, such as ‘visual, visualize, visualization’. Then, IBM’s
Many Eyes software was used to obtain the matrix chart
shown in Figure 4, which presents the top 30 keywords for each
year. Since the count
of articles and venues has increased over the years,
the relative rank or position of keywords is discussed rather
than absolute frequency counts. The figure shows
that the usage of some keywords, such as ‘visualization,
intelligent, network*’, is increasing over time. Some keywords,
such as ‘model*, system*, tutor*’, retain their ranks. The
keywords ‘online, collaborat*, performance’ show
fluctuating trends. Other trends can be interpreted similarly.
The authors further extracted the context of these keywords:
‘visualization’ co-occurs with ‘data-mining’, and
‘intelligent’ appears with ‘tutoring system’. The word ‘online’ has
a broader class of co-occurring keywords, which includes
‘learning, education, university, assessment systems, tutoring,
courses, curriculum’. Interestingly, in 2012 the context
changed to ‘online communities, interactions and social
learning’. Due to space restrictions, further analysis cannot be
provided in this paper.
        </p>
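        <p>The keyword pipeline described above (title words, stop-word removal, manual removal of venue words, merging of variants, top keywords per year, comparison by rank) can be approximated with a short sketch. It does not reproduce HWFC or Many Eyes. The stop-word list, the venue prefixes, the stems and the toy titles are all assumptions made for illustration, and variants are merged by simple prefix matching.</p>
        <preformat>
```python
import re
from collections import Counter

# Assumed, illustrative lists; the lists used in the paper are not given in full.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}
VENUE_PREFIXES = ("student", "learn", "knowledge", "education")
# Variants sharing one of these stems are merged and marked with an asterisk.
STEMS = ("visual", "model", "tutor", "network", "collaborat")


def normalise(word):
    """Collapse variants such as visual/visualize/visualization."""
    for stem in STEMS:
        if word.startswith(stem):
            return stem + "*"
    return word


def top_keywords(titles_by_year, n=30):
    """Return the n most frequent title keywords for each year."""
    ranked = {}
    for year, titles in titles_by_year.items():
        counts = Counter()
        for title in titles:
            for word in re.findall(r"[a-z]+", title.lower()):
                if word in STOP_WORDS or word.startswith(VENUE_PREFIXES):
                    continue
                counts[normalise(word)] += 1
        ranked[year] = [w for w, _ in counts.most_common(n)]
    return ranked


def rank_of(ranked, keyword):
    """Return the 1-based rank of a keyword per year, or None if absent."""
    return {year: words.index(keyword) + 1 if keyword in words else None
            for year, words in ranked.items()}


# Hypothetical titles, not taken from the LAK dataset.
titles_by_year = {
    2011: ["Visualizing student networks", "A visual model of tutoring"],
    2012: ["Online learning and social networks",
           "Network visualization for online tutors"],
}
ranked = top_keywords(titles_by_year)
ranks = rank_of(ranked, "visual*")
print(ranked)
print(ranks)
```
        </preformat>
        <p>Comparing ranks rather than raw counts follows the reasoning in the text: the number of articles grows from year to year, so absolute frequencies are not directly comparable.</p>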
      </sec>
      <sec id="sec-2-2">
        <title>CONCLUSION</title>
        <p>This paper analyzes five years of publications related to
learning analytics. The trends show an increasing
number of authors and more collaboration between authors as
well as between institutes. Geographical analysis of authors shows that
scholars from different countries have been collaborating and
contributing to this field. Top authors, collaborators, and
institutes are identified in this paper. The authors also attempted
to bring out the research themes followed by the learning
analytics community based on the frequency of keyword
usage.</p>
        <p>The authors plan to extend this study to cover authors’
disciplinary diversity and the association between authors
and the research areas they explore within learning analytics.</p>
        <sec id="sec-2-2-1">
          <title>Worcester Polytechnic Institute</title>
          <p>Claremont Graduate University
University of Belgrade
Northern Illinois University
Hochschule fur Wirtschaft und Recht
Beuth Hochschule fur Technik Berlin
Universidade Federal de Alagoas
Fraunhofer Institute for Applied Information Technology</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Institute</title>
      <p>Carnegie Mellon University
University of Memphis
Simon Fraser University
University of Memphis
Hochschule fur Technik und Wirtschaft
Hochschule fur Technik und Wirtschaft
Carnegie Mellon University
Saarland University
Edge weight</p>
      <p>[8] OXFORD DICTIONARIES, 2013. Oxford dictionary collaboration. http://oxforddictionaries.com/definition/english/collaboration</p>
      <p>[9] WIKIPEDIA, 2013. Wikipedia centrality. http://en.wikipedia.org/wiki/Betweenness#Betweenness_centrality</p>
      <p>[10] Nawaz, S., Strobel, J., 2013. IEEE Transactions on Education – authorship and content analysis, under preparation</p>
      <p>[11] HERMETIC, 2013. Hermetic Word Frequency Counter. http://www.hermetic.ch/wfc/wfc.htm</p>
      <p>Figure 4: Keyword analysis for research theme extraction</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Taibi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dietze</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <article-title>Fostering analytics on learning analytics research: the LAK dataset</article-title>
          ,
          <source>Technical Report</source>
          , 03/2013
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>LUXON</surname>
            <given-names>SOFTWARE</given-names>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Luxon software converter</article-title>
          . http://www.luxonsoftware.com/converter/xmltocsv
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>NODEXL</surname>
          </string-name>
          ,
          <year>2013</year>
          . NodeXL. http://nodexl.codeplex.com/
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Borgatti</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          ,
          <year>2002</year>
          .
          <article-title>NetDraw Software for Network Visualization</article-title>
          . Analytic Technologies: Lexington, KY
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>IBM</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Many eyes</article-title>
          . http://www958.ibm.com/software/analytics/manyeyes/visualizations/analysis-of-the-community-of-learn
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Ferguson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>The State Of Learning Analytics in 2012: A Review and Future Challenges</article-title>
          .
          <source>Technical Report KMI 12-01</source>
          , Knowledge Media Institute, The Open University, UK. http://kmi.open.ac.uk/publications/techreport/kmi-12-01
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>YWORKS</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Y works developer’s guide glossary</article-title>
          . http://docs.yworks.com/yfiles/doc/developersguide/glossary.html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>