<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Browsing Publication Data using Tag Clouds over Concept Lattices Constructed by Key-Phrase Extraction</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Gillian J. Greene, Marcel Dunaiski, Bernd Fischer Computer Science Division Stellenbosch University</institution>
          ,
          <country country="ZA">South Africa</country>
        </aff>
      </contrib-group>
      <fpage>10</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>To find research on a specific topic, or to get an overview of the topics published at different academic venues, academics need to browse data from existing academic publications. The title and abstract of a publication contain useful key-phrases that indicate its topic, but these need to be extracted and presented in a browsable format so that the user can find relevant publications. We extract key-phrases and use them to construct a concept lattice for a dataset of publications. We then present the information in an intuitive, interactive tag cloud browser in which navigation is supported by the underlying concept lattice.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>To find research on a specific topic, or to get an overview of the topics that
are common at different academic venues, academics need to browse data from
existing academic publications. Publications are often associated with specific
keywords, but these frequently come from a restricted vocabulary and thus may not
be comprehensive. Relevant information about a publication is, however, provided
as free text in the title and abstract, which give an overview of the work
and may contain key-phrases that are useful in characterizing the research.
These key-phrases first need to be extracted from the free text.</p>
      <p>
        We use our ConceptCloud browser [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which is based on a novel combination
of concept lattices and tag clouds, to present key-phrases which we extract from
academic publications, and so enable users to browse the publication data by
selecting a combination of key-phrases. Our tag clouds allow users to navigate
along different paths, and to aggregate the publication data in different ways.
      </p>
      <p>
        Tag clouds are a simple visualization method for textual data where the
importance of each tag (typically its frequency) is reflected in its size. Navigation
using tag clouds has previously been explored using a Bayesian approach [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ];
however, navigation in our browser is supported by a novel combination of tag
clouds and concept lattices [
        <xref ref-type="bibr" rid="ref14 ref37 ref8">37, 14, 8</xref>
        ]. Concept lattices have been shown to be
useful for browsing data [
        <xref ref-type="bibr" rid="ref25 ref7 ref9">9, 25, 7</xref>
        ] but large lattices do not provide a suitable
data visualization because the relationships between the concepts are difficult to
identify in a large Hasse diagram.
      </p>
      <p>Our navigation algorithm maintains a focus concept in the underlying lattice.
We derive the tag cloud visualization from the current focus concept and update
it after each navigation step. Navigation is driven by the user's selection (or
deselection) of tags in the tag cloud.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <sec id="sec-2-1">
        <title>Key-Phrase Extraction</title>
        <p>
          A key-phrase extraction system typically extracts a list of words or phrases that
serve as candidate phrases using some heuristics [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] and then determines which
of these candidate phrases are key-phrases using supervised or unsupervised
learning approaches. Typical heuristics include using a stop word list to remove
commonly occurring stop words [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], and using words with certain part-of-speech
tags (e.g., nouns, adjectives, verbs) as the keywords [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ]. N-grams [
          <xref ref-type="bibr" rid="ref38">38</xref>
          ] or noun
phrases [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] that satisfy pre-defined lexico-syntactic patterns [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] can also be
used as the candidate phrases. Since most of these approaches are restricted to
the boundaries of a sentence, key-phrases can also be extracted directly from
the paragraph using a specific representation called a "parse thicket" [
          <xref ref-type="bibr" rid="ref10 ref13">10, 13</xref>
          ]. In that
case extended phrases (including discourse relations) are used instead of regular
ones [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>
          Supervised learning approaches used to select key-phrases from the pool of
candidate key-phrases make use of different types of features, such as statistical,
syntactic, structural or external-resource features. Unsupervised learning approaches to
key-phrase identification typically make use of a graph-based [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] or clustering
approach [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Formal Concept Analysis</title>
        <p>
          Formal Concept Analysis (FCA) [
          <xref ref-type="bibr" rid="ref14 ref37 ref8">37, 14, 8</xref>
          ] uses lattice-theoretic methods to
investigate abstract relations between objects and their attributes. Such contexts
can be imagined as cross tables where the rows are objects and the columns are
attributes. Note that we follow the definitions of [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
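        <p>Definition 1 A formal context is a triple <inline-formula><tex-math>(G, M, I)</tex-math></inline-formula> where G and M are sets
of objects and attributes, respectively, and <inline-formula><tex-math>I \subseteq G \times M</tex-math></inline-formula> is an arbitrary incidence
relation.</p>
        <p>Definition 2 The common attributes of the objects in A are given as
<inline-formula><tex-math>A' := \{ m \in M \mid (g, m) \in I \text{ for all } g \in A \}</tex-math></inline-formula> for <inline-formula><tex-math>A \subseteq G</tex-math></inline-formula>. The common objects of the
attributes in B are given as <inline-formula><tex-math>B' := \{ g \in G \mid (g, m) \in I \text{ for all } m \in B \}</tex-math></inline-formula> for
<inline-formula><tex-math>B \subseteq M</tex-math></inline-formula>.</p>
        <p>Definition 3 A formal concept of the context <inline-formula><tex-math>(G, M, I)</tex-math></inline-formula> is a pair <inline-formula><tex-math>(A, B)</tex-math></inline-formula> with
<inline-formula><tex-math>A \subseteq G</tex-math></inline-formula> and <inline-formula><tex-math>B \subseteq M</tex-math></inline-formula> such that <inline-formula><tex-math>A' = B</tex-math></inline-formula> and <inline-formula><tex-math>B' = A</tex-math></inline-formula>.</p>
        <p>Under the ordering <inline-formula><tex-math>(A_1, B_1) \leq (A_2, B_2) :\Leftrightarrow A_1 \subseteq A_2</tex-math></inline-formula> the concepts from any
formal context form a complete lattice which is called the concept lattice.</p>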
        <p>
          Efficient algorithms exist for the computation of concept lattices and the
meet and join of concepts in the lattice (for example [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]).
        </p>
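        <p>As a minimal sketch of Definitions 1 to 3, the two derivation operators can be written over a context stored as a set of (object, attribute) pairs. The function names, the paper identifiers and the attribute values below are hypothetical and chosen only for illustration; this is not the lattice algorithm used by the tool.</p>
        <preformat>
```python
def common_attributes(objs, attributes, incidence):
    """A': the attributes shared by every object in objs."""
    return {m for m in attributes if all((g, m) in incidence for g in objs)}


def common_objects(attrs, objects, incidence):
    """B': the objects that carry every attribute in attrs."""
    return {g for g in objects if all((g, m) in incidence for m in attrs)}


def is_formal_concept(extent, intent, objects, attributes, incidence):
    """(A, B) is a formal concept iff A' = B and B' = A."""
    return (common_attributes(extent, attributes, incidence) == set(intent)
            and common_objects(intent, objects, incidence) == set(extent))


# Toy context (illustrative data only, not taken from the dataset).
objects = {"p1", "p2", "p3"}
attributes = {"model checking", "testing", "2010"}
incidence = {
    ("p1", "model checking"), ("p1", "2010"),
    ("p2", "model checking"), ("p2", "testing"),
    ("p3", "testing"), ("p3", "2010"),
}

# Closing the attribute set {"model checking"} yields a formal concept.
extent = common_objects({"model checking"}, objects, incidence)
intent = common_attributes(extent, attributes, incidence)
print(sorted(extent), sorted(intent))
```
</preformat>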
      </sec>
      <sec id="sec-2-3">
        <title>Tag Clouds</title>
        <p>Tag clouds are a common visualization of textual data. In Web 2.0
applications tag clouds are often built from user-generated tags for particular content.
However, tag clouds can also be generated directly from common words in text.
Figure 1 shows a tag cloud generated from the content of Obama and Bush's
inaugural speeches. The most commonly used words are presented in the largest
font size. The tag cloud provides a simple overview of the speech content and
enables comparison between the two speeches.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Navigation Framework</title>
      <sec id="sec-3-1">
        <title>Contexts from Publications</title>
        <p>In order to generate a context from publication data, we use each paper,
represented by its paper-id, as an object in the context table and assign it attributes
from the paper's authors, its year of publication and the key-phrases extracted from
its title and abstract.</p>
        <p>Building the context in this way allows us to see papers that share keywords
and also papers that share authors. Selecting an author tag displays in the
tag cloud all attributes of that author's publications (including co-authors), with
each tag sized according to how often the attribute occurs.</p>
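        <p>The construction of the context table can be sketched as follows. The record layout (the keys id, authors, year and key_phrases) and the tagging of each attribute with its category are assumptions made for this example, as are the sample records; the actual input format of the tool may differ.</p>
        <preformat>
```python
def build_context(papers):
    """Return (objects, attributes, incidence) for a list of paper records."""
    objects, attributes, incidence = set(), set(), set()
    for paper in papers:
        pid = paper["id"]
        objects.add(pid)
        # Each attribute is a (category, value) pair so that an author name
        # can never collide with a key-phrase of the same spelling.
        tags = [("author", name) for name in paper["authors"]]
        tags.append(("year", str(paper["year"])))
        tags.extend(("phrase", phrase) for phrase in paper["key_phrases"])
        for tag in tags:
            attributes.add(tag)
            incidence.add((pid, tag))
    return objects, attributes, incidence


# Hypothetical records, for illustration only.
papers = [
    {"id": "p1", "authors": ["A. Author", "B. Writer"], "year": 2010,
     "key_phrases": ["model checking", "state space"]},
    {"id": "p2", "authors": ["A. Author"], "year": 2012,
     "key_phrases": ["model checking"]},
]
objects, attributes, incidence = build_context(papers)
```
</preformat>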
      </sec>
      <sec id="sec-3-2">
        <title>Tag Clouds from Concepts</title>
        <p>We generate tag clouds directly from concepts in a concept lattice instead of free
text or user-generated tag information. The tag cloud provides a more intuitive
interface to the information contained in our concept lattice.</p>
        <p>Since a concept comprises a set of objects and a set of attributes, it is
tempting to use the attributes (i.e., the intent) as the tag cloud. However, this produces
degraded clouds because (i) the intent only contains the attributes common to
all objects, and (ii) each attribute only occurs once so that all tags would have
the same size. Instead, we use the intents of the extents; more precisely, we
collect all attributes of the defining concept of each object in the extent of the focus
concept; we also add the objects themselves, to allow their direct selection in
the tag cloud.</p>
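        <p>Definition 4 The tag cloud from a concept <inline-formula><tex-math>c = (A, B) \in \mathfrak{B}(C)</tex-math></inline-formula> is defined as
<inline-formula><tex-math>\tau(c) = A \uplus \biguplus_{a \in A} M(\gamma(a))</tex-math></inline-formula>, where <inline-formula><tex-math>M(c)</tex-math></inline-formula> refers to the concept c's intent and <inline-formula><tex-math>\gamma(a)</tex-math></inline-formula>
refers to the defining concept of object a.</p>
        <p>Here <inline-formula><tex-math>\uplus</tex-math></inline-formula> denotes multiset union. By construction, the objects in the tag cloud
induce subconcepts of the concept from which the tag cloud was derived;
moreover, all tags have a non-bottom meet with that concept.</p>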
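        <p>A minimal sketch of Definition 4: because the intent of an object's defining concept is exactly the set of attributes of that object, the tag cloud can be computed as a multiset that contains every object of the extent once and every attribute once per object that carries it. The function name and the toy context are assumptions made for this example.</p>
        <preformat>
```python
from collections import Counter


def tag_cloud(extent, incidence):
    """Multiset of tags for a focus concept with the given extent."""
    cloud = Counter(extent)  # the objects themselves, each with weight 1
    for obj, attr in incidence:
        if obj in extent:
            cloud[attr] += 1  # one occurrence per object with this attribute
    return cloud


# Toy context (illustrative data only).
incidence = {
    ("p1", "model checking"), ("p1", "2010"),
    ("p2", "model checking"), ("p2", "testing"),
    ("p3", "testing"), ("p3", "2010"),
}
cloud = tag_cloud({"p1", "p2"}, incidence)
```
</preformat>
        <p>In this toy cloud the tag "model checking" has weight 2, while the objects and the remaining attributes each have weight 1; these weights are what the size scaling later operates on.</p>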
      </sec>
      <sec id="sec-3-3">
        <title>Navigating Concept Lattices with Tag Clouds</title>
        <p>The browser maintains a focus concept, from which it renders the tag cloud
as described above; when the user selects (or deselects) a tag, the browser
updates the focus and re-renders the tag cloud. The focus, or more precisely, its
extent, contains the subset of objects (i.e., academic papers) that share all
currently selected tags. The initial focus (corresponding to an empty selection set)
is therefore the lattice's top element, whose extent contains the entire dataset.</p>
        <p>Navigation is refinement-based: when the user selects another tag, the browser
updates the focus by computing the meet of that tag's defining concept and the
old focus, rather than recomputing it from the full selection set.</p>
        <p>For example, Figure 2 shows the initial tag cloud generated from the top
concept in the lattice; after selection of the tags "model" and "checking" (Figure
4), the tag cloud is regenerated from an updated focus concept whose intent
contains "model" and "checking".</p>
        <p>Intuitively, deselection should be the inverse of selection: deselecting the last
selected tag should move the focus back to its previous position. Therefore for
deselection we recompute the focus as the meet of the defining concepts of the
remaining selected tags.</p>
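        <p>The navigation steps can be sketched on extents alone, since the extent of a meet of two concepts is the intersection of their extents. The class below is a simplified illustration under that assumption: it handles attribute tags only, its names are hypothetical, and it is not the ConceptCloud implementation.</p>
        <preformat>
```python
class Browser:
    """Tracks the extent of the focus concept during navigation."""

    def __init__(self, objects, incidence):
        self.objects = set(objects)
        self.incidence = set(incidence)
        self.selected = []
        self.focus = set(self.objects)  # extent of the top concept

    def _tag_extent(self, tag):
        """Extent of the tag's defining concept: all objects with the tag."""
        return {g for g in self.objects if (g, tag) in self.incidence}

    def select(self, tag):
        """Refine: meet of the old focus and the tag's defining concept."""
        self.selected.append(tag)
        self.focus = self.focus.intersection(self._tag_extent(tag))

    def deselect(self, tag):
        """Recompute the focus from the remaining selected tags."""
        self.selected.remove(tag)
        focus = set(self.objects)
        for remaining in self.selected:
            focus = focus.intersection(self._tag_extent(remaining))
        self.focus = focus


# Toy context (illustrative data only).
objects = {"p1", "p2", "p3"}
incidence = {
    ("p1", "model checking"), ("p1", "2010"),
    ("p2", "model checking"), ("p2", "testing"),
    ("p3", "testing"), ("p3", "2010"),
}
browser = Browser(objects, incidence)
browser.select("model checking")
browser.select("testing")
```
</preformat>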
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Key-Phrase Extraction</title>
      <p>Our key-phrase extraction technique consists of two steps: we first extract
candidate key-phrases and then we remove stop words from the collected phrases.</p>
      <p>
        In order to reduce the key-phrases to only those including more technical
information, we extract only noun phrases which do not include pronouns. For
each single-word phrase we apply lemmatizing [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and extract only one phrase
for each group of words having the same lemmas.
      </p>
      <p>
        We parse the abstract using the Stanford Natural Language
Processing (NLP) tools [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which provide tokenization, sentence splitting,
part-of-speech tagging, lemmatization and syntactic parsing. We take all noun phrases with
a length of less than 5 words as key-phrases. To do so, we compute, for each syntax
tree corresponding to the paper's abstract, all subtrees that have a noun
part-of-speech tag at the root and fewer than 5 leaves.
      </p>
      <p>We then remove stop words from the single-word phrases that we have
extracted. The stop list includes words that are common in research papers
but are not domain-specific, such as "paper" and "research". For our task
we consider all multi-word phrases meaningful and do not remove them.</p>
      <p>For example, from the sentence "Software development is the process of
computer programming, documenting, testing and bug fixing" our system will
extract the following key-phrases: "software development", "process", "computer
programming", "documenting", "testing", "bug fixing".</p>
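      <p>The two steps can be illustrated on a hand-written parse tree for the example sentence. The sketch below does not call the Stanford tools: the tree is typed in by hand as nested tuples, and the tag names follow the usual Penn Treebank conventions. Keeping only the outermost noun phrase with fewer than 5 leaves and dropping determiners are assumptions made here so that the toy output matches the example; the stop list contains only the two words named in the text.</p>
      <preformat>
```python
STOP_WORDS = {"paper", "research"}  # only the two examples named in the text


def leaves(tree):
    """Return (pos_tag, word) pairs for the leaves of a nested-tuple tree."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return [(label, children[0])]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def noun_phrases(tree, max_leaves=5):
    """Collect outermost NP subtrees with fewer than max_leaves leaves."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return []  # a single tagged word is not a phrase node
    words = leaves(tree)
    if label == "NP" and max_leaves > len(words):
        if any(tag.startswith("PRP") for tag, _ in words):
            return []  # skip noun phrases that include pronouns
        kept = [word.lower() for tag, word in words if tag != "DT"]
        return [" ".join(kept)] if kept else []
    found = []
    for child in children:
        found.extend(noun_phrases(child, max_leaves))
    return found


def key_phrases(tree):
    """Drop stop words, but only from the single-word phrases."""
    return [p for p in noun_phrases(tree) if " " in p or p not in STOP_WORDS]


# Hand-written parse of the example sentence (an assumption, not parser output).
TREE = (
    "S",
    ("NP", ("NN", "Software"), ("NN", "development")),
    ("VP",
     ("VBZ", "is"),
     ("NP",
      ("NP", ("DT", "the"), ("NN", "process")),
      ("PP",
       ("IN", "of"),
       ("NP",
        ("NP", ("NN", "computer"), ("NN", "programming")),
        (",", ","),
        ("NP", ("NN", "documenting")),
        (",", ","),
        ("NP", ("NN", "testing")),
        ("CC", "and"),
        ("NP", ("NN", "bug"), ("NN", "fixing")))))),
)
print(key_phrases(TREE))
```
</preformat>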
    </sec>
    <sec id="sec-5">
      <title>ConceptCloud Tool</title>
      <p>
        We use our ConceptCloud browser [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], a web application available at
www.conceptcloud.org, to make the academic publication set
browsable. ConceptCloud comprises two main components: a concept constructor tool
that builds a context table in the desired format, and a tag cloud display
that shows the interactive tag cloud (see Section 5.2) of the resulting lattice.
      </p>
      <sec id="sec-5-1">
        <title>Concept Constructor Tool</title>
        <p>
          ConceptCloud's ConceptConstructor automates the process of creating a tag
cloud visualization from an XML or JSON file and provides a wizard that allows
users to construct the table with their desired combination of pre-processing
steps. The browser is generic and can show tag clouds of different context types.
It is also completely automatic: there are no manual pre-processing steps, and
the user only needs to upload the dataset, choose which of the pre-processing steps
to apply and export the table with the desired objects and attributes.
ConceptCloud's ConceptConstructor also allows users to export tables in the ".cxt"
format so that they can also be used to generate a lattice diagram. A more
detailed description of the tool architecture is available in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>
          For the lattice construction, we use a method based on the Colibri/Java
library [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which constructs concepts on the fly. We thus never need to compute
the full lattice and are able to render an initial tag cloud relatively quickly.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>Tag Cloud Visualization</title>
        <p>
          We make use of a tag cloud visualization that can be customized to show different
views on the publications. Multiple different visualizations for different metrics
were found to confuse users [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. We therefore propose one uniform visualization
that can be used to explore various aspects of a data archive.
        </p>
        <p>
          The simplest and most popular tag cloud layout [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] is as an alphabetically
sorted list of tags in a roughly rectangular shape which was found by Schrammel
et al. to perform better than random or semantic layouts [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ]; we use this layout
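because it simplifies textual search within the tag cloud. We scale each tag i
between the given minimum and maximum font sizes f<sub>min</sub> and f<sub>max</sub> according
to its weight t<sub>i</sub> in relation to the minimum and maximum weights in the context
table, t<sub>min</sub> and t<sub>max</sub>; hence,
          <disp-formula>
            <label>(1)</label>
            <tex-math>\mathit{size}(i) = \frac{(f_{\max} - f_{\min})\,(t_i - t_{\min})}{t_{\max} - t_{\min}} + f_{\min}</tex-math>
          </disp-formula>
for t<sub>i</sub> &gt; t<sub>min</sub>, and size(i) = f<sub>min</sub> otherwise.
        </p>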
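        <p>The scaling rule above is a linear interpolation between the minimum and maximum font sizes. As a small sketch, with a hypothetical function name and example font sizes of 10 and 40 that are not taken from the tool:</p>
        <preformat>
```python
def tag_size(t_i, t_min, t_max, f_min, f_max):
    """Font size of a tag with weight t_i, scaled linearly into [f_min, f_max]."""
    if t_i > t_min:
        return (f_max - f_min) * (t_i - t_min) / (t_max - t_min) + f_min
    return f_min  # also covers the case t_max == t_min


# Weights range from 1 to 10; fonts range from 10 to 40 (example values).
sizes = [tag_size(t, 1, 10, 10, 40) for t in (1, 5.5, 10)]
print(sizes)
```
</preformat>
        <p>A tag with the minimum weight is drawn at the minimum font size, a tag with the maximum weight at the maximum font size, and a tag halfway between the two weights at the midpoint of the two sizes.</p>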
        <p>
          A variety of alternative tag layout methods have been proposed, such as tag
flakes by Caro et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Tag flakes are used to provide context for tags,
as basic tag clouds fail to show how the tags are related. However, instead of
using such a complex visualization that depicts the relationships between the tags,
we use incremental refinement in the tag cloud to provide context and structure
to the tag clouds. When a tag is selected, the resulting cloud
provides background for the selected tag.
        </p>
        <p>The initial tag cloud shown in ConceptCloud includes tags from all attributes
and objects in the context table (using the top concept in the lattice as the focus).
This allows the user to select any tag from the extracted publication data. Tags
in the initial tag cloud will be at their largest size because we scale all tags
according to the maximum and minimum tag weights in this cloud. Making selections in
the initial tag cloud will result in clouds with smaller tags, indicating that the
cloud is only showing attribute tags from a subset of the total objects in the
context table.</p>
        <p>
          A tag is implied if it has not been selected explicitly, but corresponds to
an attribute in the focus' intent. Implied tags thus reveal the dataset's internal
structure, similar to the way association rules reveal the implicit structure of
shopping baskets [
          <xref ref-type="bibr" rid="ref39">39</xref>
          ] but without any additional cost.
        </p>
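        <p>Implied tags can be read off the focus directly: they are the attributes in the focus's intent that the user did not select. The following sketch uses a hypothetical function name and an invented toy context in which every paper tagged "state space" is also tagged "model checking".</p>
        <preformat>
```python
def implied_tags(selected, objects, attributes, incidence):
    """Attributes in the focus's intent that were not selected explicitly."""
    # Extent of the focus: objects that carry every selected tag.
    focus = {g for g in objects if all((g, m) in incidence for m in selected)}
    # Intent of the focus: attributes shared by every object in the extent.
    intent = {m for m in attributes if all((g, m) in incidence for g in focus)}
    return intent - set(selected)


# Toy context (illustrative data only).
objects = {"p1", "p2", "p3"}
attributes = {"model checking", "state space", "testing"}
incidence = {
    ("p1", "model checking"), ("p1", "state space"),
    ("p2", "model checking"),
    ("p3", "testing"),
}
print(implied_tags(["state space"], objects, attributes, incidence))
```
</preformat>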
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Illustrative Case Study</title>
      <p>
        We build a tag cloud from data extracted from the proceedings of the
Automated Software Engineering Conference [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This dataset comprises 1400
papers and contains their titles, abstracts, author information and some optional
IEEE/ACM keywords for the papers.
      </p>
      <p>Figure 3 shows the 200 most common key-phrases extracted from the
abstracts and titles of the same set of publications; here we see the introduction of
key-phrases that better characterize the topic of the research. We see
key-phrases such as "Model Checking", "Design Patterns" and "Formal
Specifications", which are difficult to identify in the single words of Figure 2. Key-phrases
thus better highlight the content of the conference proceedings. The key-phrase
extraction has also removed verbs from the keywords, and from Figure 2 we see
that the verbs contain little information when compared to the nouns.</p>
      <p>Selecting the tag for "Model Checking" (indicated in red) in Figure 5, while
still showing the 200 most common tags, shows which authors commonly work
on "Model Checking" at this conference and also which other keywords, such as
"State Space", are associated with "Model Checking", sized according to how
often they appear together. Using only single words from the abstract in the tag
cloud would mean that phrases are not automatically visible in the tag cloud
and have to be chosen by selecting two individual tags. When the tags for
"Model" and "Checking" are selected separately (see Figure 4) and not as a
key-phrase, the two words need not appear together in the abstract, and so the selection may show
papers of which "Model Checking" is not a topic. From Figure 4, where the words
in the title and abstract have only been split, we can also see that identifying
keywords related to "Model Checking" through the tag cloud is difficult because
it is unclear which of the other keywords in the cloud are related to each other.
For example, from this cloud we would not be able to see that the keywords
"state" and "space" often occur together along with "model" and "checking"
unless we were to select these as additional tags.</p>
      <p>The addition of key-phrase extraction allows users to refine the
publication set to only those papers referring to a particular subset of the domain. In addition,
when a key-phrase is selected the tag cloud shows which other phrases are
commonly used together with the selected phrase in the same publication. This
allows the user to investigate key-phrases related to a particular research topic.</p>
    </sec>
    <sec id="sec-7">
      <title>Related Work</title>
      <sec id="sec-7-1">
        <title>Tag Clouds and Navigation</title>
        <p>
          Mesnage and Carmen use a Bayesian approach for navigation in tag clouds that
allows tags related to one or more selected tags to be shown in the cloud, where
previously clouds could only be created for one selected tag [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. Gwizdka and
Bakelaar look at displaying a tag cloud history, which allows users to keep track
of their previous navigation steps, when clouds are used for pivot navigation
[
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. This approach is not directly applicable to our tag clouds since we use
refinement navigation where multiple tags can be selected. Hernandez et al.
use multiple linked tag clouds to browse semi-structured clinical trial data [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ].
These tag clouds are generated from the results of an initial search query and
each represents one facet (e.g., medical condition) of the data. A multi-faceted
view can also be created in ConceptCloud by moving tag categories into separate
tag clouds.
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>Key-Phrase Extraction from Scientific Articles</title>
        <p>
          Key-phrase extraction from scientific texts is an application of common
extraction techniques (see Section 2.1) to a dataset of research publications. Our
approach focuses on the candidate phrase (in the form of nouns or noun phrases)
selection step of the key-phrase extraction process. Given a document, candidate
identification is the task of detecting all key-phrases. Candidate phrase selection
methods are largely based on n-grams [
          <xref ref-type="bibr" rid="ref22 ref33 ref36">22, 36, 33</xref>
          ] or parts-of-speech (POS) tag
sequences [
          <xref ref-type="bibr" rid="ref24 ref32 ref5">5, 32, 24</xref>
          ]. A comprehensive analysis of the accuracy and coverage of
candidate extraction methods was carried out by Hulth [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. She compared three
methods: n-grams (excluding those that begin or end with a stop word), pre-defined POS
sequences, and noun phrase (NP) chunks, excluding initial
determiners ("a", "an" and "the"). In our approach we make use of a modification of
the standard approach based on the extraction of NP-chunks.
        </p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusions and Future Work</title>
      <p>We have combined key-phrase extraction with tag clouds and concept lattices in
order to provide an interface through which users can browse academic
publications using key-phrases. Our approach allows formal contexts to be built
automatically using the user's desired combination of pre-processing steps and key-phrase
extraction. Browsing of the dataset is then supported by our ConceptCloud tool.
The addition of key-phrases to the tag clouds, as opposed to only keywords,
allows users to investigate research topics more accurately and also to identify
related topics.</p>
      <p>We see many avenues for future work. The key-phrase extraction process
typically includes an extraction and selection step. Our current model is based on
a simple stop-word selection technique for the extracted single words. Currently,
the stop list is manually defined from common words. This does
not scale in size or across academic domains, since different disciplines
use varying common phrases. To overcome this drawback, a solution would be
to cluster the papers into topics, compute the frequencies of words within each
cluster, and build an adaptable and more comprehensive stop-word list from
the intersection of frequently used words from the clusters. In future we could
also improve our key-phrase extraction by using a ranking or learning approach
based on computing tf/idf-like scores and features for the extracted phrases.</p>
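      <p>The proposed stop-word construction can be sketched as follows, assuming the papers have already been clustered into topics by some other means. The function names, the cut-off of three words per cluster and the sample texts are hypothetical; the clustering step itself is not shown.</p>
      <preformat>
```python
from collections import Counter


def frequent_words(documents, top_k):
    """The top_k most frequent words in one cluster of documents."""
    counts = Counter(word for doc in documents for word in doc.lower().split())
    return {word for word, _ in counts.most_common(top_k)}


def adaptive_stop_words(clusters, top_k=3):
    """Words that are frequent in every cluster, i.e. not topic-specific."""
    frequent_sets = [frequent_words(docs, top_k) for docs in clusters]
    return set.intersection(*frequent_sets) if frequent_sets else set()


# Two hand-made topic clusters (illustrative text only).
clusters = [
    ["this paper presents model checking", "model checking paper",
     "paper on model checking tools"],
    ["testing paper", "this paper studies testing",
     "paper about mutation testing"],
]
print(adaptive_stop_words(clusters))
```
</preformat>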
      <p>
        We could use a structural syntactic and discourse representation (a so-called
"parse thicket" [
        <xref ref-type="bibr" rid="ref11 ref12 ref35">11, 12, 35</xref>
        ]) of the whole abstract as an attribute in the context
table to provide more navigation structure for the dataset. It would then also be
possible to use soft matching between the abstracts in the context table to link
related papers. We could also extract keywords from the publication's full text
in order to enrich the tag cloud.
      </p>
      <p>Our tag cloud for academic paper browsing could also be improved by adding
additional data to the context table, such as citation counts for the papers and
the authors' university affiliations.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>This research is funded in part by a STIAS Doctoral Scholarship, NRF Grant
93582, RFBR Grant 14-01-93960 and the MIH Media Lab. We thank Jean
Breytenbach for building the ConceptConstructor component of ConceptCloud.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. ASE conferences. http://ase-conferences.org/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Stanford NLP. http://nlp.stanford.edu/software/.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <article-title>Tag cloud of Obama and Bush's inaugural speeches</article-title>
          . https://en.wikipedia.org/wiki/Tag_cloud#/media/File:State_of_the_union_word_clouds.png.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>C.</given-names>
            <surname>Anslow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Marshall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Noble</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Biddle</surname>
          </string-name>
          .
          <article-title>SourceVis: Collaborative software visualization for co-located environments</article-title>
          .
          <source>In Software Visualization (VISSOFT), 2013 First IEEE Working Conference on</source>
          , pages
          <fpage>1</fpage>
          –
          <lpage>10</lpage>
          , Sept
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>K.</given-names>
            <surname>Barker</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Cornacchia</surname>
          </string-name>
          .
          <article-title>Using noun phrase heads to extract document keyphrases</article-title>
          .
          <source>In Proceedings of the 13th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence</source>
          , AI '00, pages
          <fpage>40</fpage>
          –
          <lpage>52</lpage>
          , London, UK,
          <year>2000</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Caro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Candan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Sapino</surname>
          </string-name>
          .
          <article-title>Navigating within news collections using tag-flakes</article-title>
          .
          <source>Journal of Visual Languages and Computing</source>
          ,
          <volume>22</volume>
          (
          <issue>2</issue>
          ):
          <fpage>120</fpage>
          –
          <lpage>139</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>C.</given-names>
            <surname>Carpineto</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Romano</surname>
          </string-name>
          .
          <article-title>A lattice conceptual clustering system and its application to browsing retrieval</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ):
          <fpage>95</fpage>
          –
          <lpage>122</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Davey</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Priestley</surname>
          </string-name>
          .
          <source>Introduction to Lattices and Order</source>
          (2nd ed.). Cambridge University Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>B.</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <article-title>Specification-based browsing of software component libraries</article-title>
          .
          <source>Autom. Softw. Eng.</source>
          ,
          <volume>7</volume>
          (
          <issue>2</issue>
          ):
          <fpage>179</fpage>
          –
          <lpage>200</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>B.</given-names>
            <surname>Galitsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Dobrocsi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>De La Rosa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          .
          <article-title>From generalization of syntactic parse trees to conceptual graphs</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          ,
          <volume>6208</volume>
          LNAI:
          <fpage>185</fpage>
          –
          <lpage>190</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>B.</given-names>
            <surname>Galitsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ilvovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Strok</surname>
          </string-name>
          .
          <article-title>Matching sets of parse trees for answering multi-sentence questions</article-title>
          .
          pages
          <fpage>285</fpage>
          –
          <lpage>293</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>B.</given-names>
            <surname>Galitsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ilvovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Strok</surname>
          </string-name>
          .
          <article-title>Finding maximal common sub-parse thickets for multi-sentence search</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          ,
          <volume>8323 LNAI</volume>
          :
          <fpage>39</fpage>
          –
          <lpage>57</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>B.</given-names>
            <surname>Galitsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Usikov</surname>
          </string-name>
          .
          <article-title>Parse thicket representation for multisentence search</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          ,
          <volume>7735 LNCS</volume>
          :
          <fpage>153</fpage>
          –
          <lpage>172</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>B.</given-names>
            <surname>Ganter</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          .
          <source>Formal concept analysis: mathematical foundations</source>
          . Springer Science &amp; Business Media
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>D. N.</given-names>
            <surname>Gotzmann</surname>
          </string-name>
          .
          <source>Colibri/java</source>
          . http://code.google.com/p/colibri-java/,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>G.</given-names>
            <surname>Greene</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <article-title>Interactive tag cloud visualization of software version control repositories</article-title>
          .
          <source>In Software Visualization (VISSOFT), 2015 IEEE 3rd Working Conference on</source>
          , pages
          <fpage>56</fpage>
          –
          <lpage>65</lpage>
          , Sept.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Greene</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <article-title>ConceptCloud: A tagcloud browser for software archives</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014</source>
          , pages
          <fpage>759</fpage>
          –
          <lpage>762</lpage>
          , New York, NY, USA,
          <year>2014</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>M.</given-names>
            <surname>Grineva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grinev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Lizorkin</surname>
          </string-name>
          .
          <article-title>Extracting key terms from noisy and multitheme documents</article-title>
          .
          <source>In Proceedings of the 18th International Conference on World Wide Web, WWW '09</source>
          , pages
          <fpage>661</fpage>
          –
          <lpage>670</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>J.</given-names>
            <surname>Gwizdka</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bakelaar</surname>
          </string-name>
          .
          <article-title>Tag trails: navigation with context and history</article-title>
          .
          <source>In CHI'09 Extended Abstracts on Human Factors in Computing Systems</source>
          , pages
          <fpage>4579</fpage>
          –
          <lpage>4584</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Hasan</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <article-title>Automatic keyphrase extraction: A survey of the state of the art</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , pages
          <fpage>1262</fpage>
          –
          <lpage>1273</lpage>
          , Baltimore, Maryland, June
          <year>2014</year>
          . Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>M.-E.</given-names>
            <surname>Hernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Falconer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Storey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Carini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Sim</surname>
          </string-name>
          .
          <article-title>Synchronized tag clouds for exploring semi-structured clinical trial data</article-title>
          .
          <source>In Proceedings of the 2008 Conference of the Center for Advanced Studies on Collaborative Research: Meeting of Minds, CASCON '08</source>
          , pages
          <fpage>4:42</fpage>
          –
          <lpage>4:56</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hulth</surname>
          </string-name>
          .
          <article-title>Improved automatic keyword extraction given more linguistic knowledge</article-title>
          .
          <source>In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, EMNLP '03</source>
          , pages
          <fpage>216</fpage>
          –
          <lpage>223</lpage>
          , Stroudsburg, PA, USA,
          <year>2003</year>
          . Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hulth</surname>
          </string-name>
          .
          <article-title>Combining machine learning and natural language processing for automatic keyword extraction</article-title>
          .
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Baldwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.-Y.</given-names>
            <surname>Kan</surname>
          </string-name>
          .
          <article-title>The use of topic representative words in text categorization</article-title>
          .
          <source>In Australasian document computing symposium (ADCS</source>
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lindig</surname>
          </string-name>
          .
          <article-title>Concept-based component retrieval</article-title>
          .
          <source>In IJCAI</source>
          , pages
          <fpage>21</fpage>
          –
          <lpage>25</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lindig</surname>
          </string-name>
          .
          <article-title>Fast concept analysis</article-title>
          .
          <source>In Working with Conceptual Structures</source>
          , pages
          <fpage>152</fpage>
          –
          <lpage>161</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <article-title>Clustering to find exemplar terms for keyphrase extraction</article-title>
          .
          <source>In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1, EMNLP '09</source>
          , pages
          <fpage>257</fpage>
          –
          <lpage>266</lpage>
          , Stroudsburg, PA, USA,
          <year>2009</year>
          . Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>S.</given-names>
            <surname>Lohmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Tetzlaff</surname>
          </string-name>
          .
          <article-title>Comparison of tag cloud layouts: Task-related performance and visual exploration</article-title>
          .
          <source>In INTERACT (1)</source>
          , pages
          <fpage>392</fpage>
          –
          <lpage>404</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Mesnage</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Carman</surname>
          </string-name>
          .
          <article-title>Tag navigation</article-title>
          .
          <source>In Proceedings of the 2nd International Workshop on Social Software Engineering and Applications, SoSEA '09</source>
          , pages
          <fpage>29</fpage>
          –
          <lpage>32</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Tarau</surname>
          </string-name>
          .
          <article-title>TextRank: Bringing order into texts</article-title>
          .
          <source>In Conference on Empirical Methods in Natural Language Processing</source>
          , Barcelona, Spain,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <given-names>C. Q.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. T.</given-names>
            <surname>Phan</surname>
          </string-name>
          .
          <article-title>An ontology-based approach for key phrase extraction</article-title>
          .
          <source>In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, ACLShort '09</source>
          , pages
          <fpage>181</fpage>
          –
          <lpage>184</lpage>
          , Stroudsburg, PA, USA,
          <year>2009</year>
          . Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.-Y.</given-names>
            <surname>Kan</surname>
          </string-name>
          .
          <article-title>Keyphrase extraction in scientific publications</article-title>
          .
          <source>In Asian Digital Libraries. Looking Back 10 Years and Forging New Frontiers</source>
          , pages
          <fpage>317</fpage>
          –
          <lpage>326</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <given-names>M.-S.</given-names>
            <surname>Paukkeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. T.</given-names>
            <surname>Nieminen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Honkela</surname>
          </string-name>
          .
          <article-title>A language-independent approach to keyphrase extraction and evaluation</article-title>
          .
          <source>In COLING (Posters)</source>
          , pages
          <fpage>83</fpage>
          –
          <lpage>86</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <given-names>J.</given-names>
            <surname>Schrammel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leitner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tscheligi</surname>
          </string-name>
          .
          <article-title>Semantically structured tag clouds: An empirical evaluation of clustered presentation approaches</article-title>
          .
          <source>In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '09</source>
          , pages
          <fpage>2037</fpage>
          –
          <lpage>2040</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <given-names>F.</given-names>
            <surname>Strok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Galitsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ilvovsky</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          .
          <article-title>Pattern structure projections for learning discourse structures</article-title>
          .
          <source>Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          ,
          <volume>8722</volume>
          :
          <fpage>254</fpage>
          –
          <lpage>260</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <given-names>T.</given-names>
            <surname>Tomokiyo</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hurst</surname>
          </string-name>
          .
          <article-title>A language model approach to keyphrase extraction</article-title>
          .
          <source>In Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment</source>
          , volume
          <volume>18</volume>
          , pages
          <fpage>33</fpage>
          –
          <lpage>40</lpage>
          . Association for Computational Linguistics,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          .
          <article-title>Restructuring lattice theory: an approach based on hierarchies of concepts</article-title>
          .
          <source>In Ordered sets</source>
          , pages
          <fpage>445</fpage>
          –
          <lpage>470</lpage>
          . Reidel
          ,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Paynter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Nevill-Manning</surname>
          </string-name>
          .
          <article-title>Kea: Practical automatic keyphrase extraction</article-title>
          .
          <source>In Proceedings of the Fourth ACM Conference on Digital Libraries, DL '99</source>
          , pages
          <fpage>254</fpage>
          –
          <lpage>255</lpage>
          , New York, NY, USA,
          <year>1999</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Zaki</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Ogihara</surname>
          </string-name>
          .
          <article-title>Theoretical foundations of association rules</article-title>
          .
          <source>In 3rd ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery</source>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>