<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bilal Hayat Butt</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Rafi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arsal Jamal</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raja Sami Ur Rehman</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Syed Muhammad Zubair Alam</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Bilal Alam</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Research is a continuous and recursive process: every study builds on the outcome of some earlier work. A common approach to reviewing the literature on a problem is to sort earlier work on that problem into positive and negative citations. In this paper, we propose a novel automated technique that classifies whether an earlier work is cited with positive or negative sentiment. Our approach first extracts the cited portion of text from the citing paper, taking a window of at most five (5) sentences around the place of citation (the corpus). Using a sentiment lexicon, we then classify the citation as positive or negative with a Naïve-Bayes classifier. The algorithm is evaluated on a manually annotated, class-labelled collection of 150 research papers from the domain of computer science. Our preliminary results show an accuracy of 80%. We assert that our approach can be generalized to the classification of scientific research papers in different disciplines.</p>
      </abstract>
      <kwd-group>
        <kwd>Document Classification</kwd>
        <kwd>Sentiment Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Preliminaries and Formalism</title>
      <p>In this section we introduce the formal terms and preliminaries needed to follow the
technical details of the paper.</p>
      <sec id="sec-1-1">
        <title>Terms used with citation analysis</title>
        <p>
          In citation analysis, the root paper (or cited paper) is the research paper that is cited in another
research paper. A citation is a quotation used to refer to a root paper. The citing paper is the
research paper that contains a reference to the root paper. The cited area is the paragraph in
the citing paper that contains the citation of the root paper. These terms are adapted from
          <xref ref-type="bibr" rid="ref7">Nanba (2000)</xref>
          and illustrated
in Figure 1. Referencing styles are the conventions used to organize citations and references
when writing a research paper.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>Terms used with sentiment analysis</title>
        <p>In text categorization, tokenization is the process of breaking the sentences of a text into
words for further processing. It is done using regular expressions, a generic way of
describing a string pattern to search for in a large body of text. Sentiment analysis
classifies the corpus into positive or negative categories based on the words extracted during
tokenization. A lexicon is a dictionary of key words divided into positive
and negative categories. The Positive Incrementer and Negative Incrementer are dictionaries
used in our approach alongside the lexicon; they consist of phrases that were
observed to incline a sentence towards an overall positive or negative weight.</p>
      </sec>
      <sec id="sec-1-3">
        <title>Terms used with experimental study</title>
        <p>
          In our experimental study, target data is the data extracted from the cited paragraph. Citation
types are the categories into which target data are placed after sentiment analysis.
The Naïve-Bayes classifier is a popular machine learning algorithm for text categorization.
The Monte Carlo technique is a computational method that relies on repeated random sampling.
Precision, recall and F1-score are calculated as defined in
          <xref ref-type="bibr" rid="ref6">Manning (2008)</xref>
          .
        </p>
        <p>Precision is the fraction of retrieved instances that are relevant:</p>
        <p>P = |{Relevant Documents} ∩ {Retrieved Documents}| / |{Retrieved Documents}|</p>
        <p>Recall is the fraction of relevant instances that are retrieved:</p>
        <p>R = |{Relevant Documents} ∩ {Retrieved Documents}| / |{Relevant Documents}|</p>
        <p>The F-measure combines precision and recall as their harmonic mean. The balanced F1-score is:</p>
        <p>F1 = 2 * (P * R) / (P + R)</p>
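        <p>As a worked illustration of the three measures, the following minimal Python sketch computes them from two sets of document identifiers. The function name precision_recall_f1 and the toy identifiers are assumptions introduced for this example only; they are not taken from the evaluation code used in the study.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def precision_recall_f1(relevant, retrieved):
    """Compute precision, recall and balanced F1 from two collections of ids."""
    relevant, retrieved = set(relevant), set(retrieved)
    overlap = len(relevant & retrieved)
    # Guard against empty sets so the sketch never divides by zero.
    precision = overlap / len(retrieved) if retrieved else 0.0
    recall = overlap / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: four relevant documents, three retrieved, two in common.
# Precision is 2/3, recall is 1/2 and F1 is 4/7.
p, r, f1 = precision_recall_f1({1, 2, 3, 4}, {3, 4, 5})
print(round(p, 3), round(r, 3), round(f1, 3))
```
]]></preformat>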
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Authors have taken different strategies in approaching citation analysis;
        <xref ref-type="bibr" rid="ref11">Tanguy (2009)</xref>
        provides a detailed overview. One line of work focuses on identifying citation categories
or classification schemes, while another focuses on identifying cue words that support better
annotation and classification of papers.
      </p>
      <sec id="sec-2-1">
        <title>Classification of Citations</title>
        <p>
          <xref ref-type="bibr" rid="ref7">Nanba (2000)</xref>
          manually classified cited papers into three categories using cue words.
The authors describe a prototype system called PRESRI that relies on citation relationships.
Taking this research as our base, we enhanced and automated the procedure, classifying
papers with lexicon dictionaries instead of cue words.
          <xref ref-type="bibr" rid="ref11">Tanguy (2009)</xref>
          describes an automated technique that identifies citations and uses linguistic cues
to analyse French humanities articles through natural language processing. Their
approach is quite similar to our system, CRC, in its use of a lexicon to identify the
category a particular sentence belongs to. However, their approach is limited to APA-style
referencing, while we have extended it to AMA and IEEE as well.
        </p>
        <p>
          <xref ref-type="bibr" rid="ref3">Cohen (2006)</xref>
          discusses the importance of automated classification of document citations in
reducing the time experts spend reviewing journal articles. The paper applies
classification to research papers in medicine, specifically those on drugs and their
use in the treatment of diseases. Through an automated classification process, it puts
forward a review system for a selected list of drugs, classifying each drug as either
positive or negative against a disease. Similar to CRC, the paper locates the
cited areas that contain reviews of the drug and extracts these areas to
perform classification. Their research is specific to medical science, whereas our
approach can be generalized to a variety of disciplines.
        </p>
        <p>
          Most of the techniques reviewed focus on one discipline, while we assert that our approach
can be generalized to the classification of scientific research papers in different disciplines because,
instead of cue words for a specific domain, we use the generalized sentiment lexica used in
          <xref ref-type="bibr" rid="ref4">Evert (2014)</xref>
          . The authors identified sentiment lexica as one of the most important features, along
with bag-of-words unigrams and bigrams, for a machine learning classifier.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Classification Schemes</title>
        <p>
          <xref ref-type="bibr" rid="ref1">Agarwal (2010)</xref>
          developed an eight-category classification scheme, annotated data using that
scheme, and developed and evaluated supervised machine learning classifiers on the annotated
data. As discussed in the paper, inter-annotator agreement could not be reached for overlapping
categories, and the authors suspect that this issue could grow with a larger collection of articles. To
overcome this issue, we have combined the overlapping categories.
        </p>
        <p>
          <xref ref-type="bibr" rid="ref12">Teufel (2006)</xref>
          proposed a scheme of 12 categories for any citation being made. Table 1 shows
another such scheme, suggested by
          <xref ref-type="bibr" rid="ref10">Spiegel (1977)</xref>
          , with 13 different motivations that could
lead an author to cite a research paper. This scheme is discussed in detail because it is based
on scholarly articles published in science studies.
        </p>
        <list list-type="order">
          <list-item><p>The cited paper provides historical facts regarding the research question under study.</p></list-item>
          <list-item><p>The citing paper continues the research from the point where the cited paper finished.</p></list-item>
          <list-item><p>The paper is cited to use its ideas, definitions or terms in the current research.</p></list-item>
          <list-item><p>The paper is cited to refer to data also used in the current research.</p></list-item>
          <list-item><p>The paper is cited to refer to the data it used and to draw similarities with that data.</p></list-item>
          <list-item><p>The cited paper contains data and material used throughout different phases.</p></list-item>
          <list-item><p>The paper is cited to adopt part or all of the methodology it used for a certain task.</p></list-item>
          <list-item><p>The cited paper verified or proved a statement, or elaborates on its details.</p></list-item>
          <list-item><p>The cited paper is evaluated positively.</p></list-item>
          <list-item><p>The cited paper is evaluated negatively.</p></list-item>
          <list-item><p>The ongoing research gives proof of a statement in the cited paper.</p></list-item>
          <list-item><p>The ongoing research gives a rebuttal of a statement in the cited paper.</p></list-item>
          <list-item><p>The ongoing research gives a new interpretation of the findings or statements in the cited paper.</p></list-item>
        </list>
        <p>In our approach we have grouped these citation types into three generalized sentiment types:</p>
        <list list-type="bullet">
          <list-item><p>TYPE-I: Positive</p></list-item>
          <list-item><p>TYPE-II: Negative</p></list-item>
          <list-item><p>TYPE-III: Neutral</p></list-item>
        </list>
        <p>
          Referring to Table 1, Type-I covers motivations 1, 2, 3, 4, 5, 6, 7, 8, 9 and 11, Type-II covers
10, 12 and 13, and Type-III covers 4. However, 1 and 6 may fall under all three types, and 2 and 3 may
fall under both Type-I and Type-II. This overlapping nature of the categories creates disagreement
between annotators, as reported by
          <xref ref-type="bibr" rid="ref1">Agarwal (2010)</xref>
          and
          <xref ref-type="bibr" rid="ref11">Tanguy (2009)</xref>
          , so we narrowed down the categories.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>In this paper, we propose a novel automated technique that classifies whether an earlier work
is cited with positive or negative sentiment. Our approach first extracts the cited portion of
text from the citing paper, taking a window of at most five (5) sentences around the place of
citation (the corpus). Using a sentiment lexicon, we then classify the citation as positive or negative.
The algorithm is evaluated on a manually annotated, class-labelled collection of 150 research
papers from the domain of computer science. The complete process is shown in Figure 2, and
references to the example papers used there are provided, the root paper first, followed by its citing papers.</p>
      <sec id="sec-3-1">
        <title>Dataset Collection</title>
        <p>We collected a dataset of 150 research papers, manually downloaded in PDF format
from Google Scholar. To strengthen the basis for our citation types, we manually
annotated the citation corpus of every research paper in the dataset and
classified each as Type-I or Type-II. The results are given in Table 2.</p>
        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th>Citation type</th><th>Papers</th></tr>
            </thead>
            <tbody>
              <tr><td>Type-I</td><td>109</td></tr>
              <tr><td>Type-II</td><td>41</td></tr>
              <tr><td>Total</td><td>150</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Figure 2. Overview of CRC</p>
        <p>All annotation was done manually by the co-authors, and the results were
cross-checked by other co-authors. Inter-annotator agreement was fair; however, to further strengthen
our annotation results, we feel that annotation should be done by individuals who hold at least
a postgraduate degree in a field related to the corpus. In future, we plan to work with PhD
scholars in computer science to provide annotated data in their fields of research.</p>
        <p>For every root paper that we queried, we stored its top 10 citing papers. To make our approach
more robust, we included research papers in different referencing styles: the Institute of Electrical
and Electronics Engineers (IEEE) standard style, the American Psychological Association (APA)
style and the American Medical Association (AMA) Manual of Style. Table 3 summarizes the variations in
referencing style, along with the percentage of sample papers in our dataset that use each style.</p>
        <table-wrap>
          <label>Table 3</label>
          <table>
            <thead>
              <tr><th>Style</th><th>Reference Style</th><th>In-Text Citation</th><th>Percentage of papers</th></tr>
            </thead>
            <tbody>
              <tr><td>IEEE</td><td>[1] Name Paper, Journal. Year</td><td>[1], [1,2]</td><td>67%</td></tr>
              <tr><td>APA</td><td>Name (Year). Paper, Journal</td><td>Name (year)</td><td>15%</td></tr>
              <tr><td>AMA</td><td>1. Name Paper, Journal. Year</td><td>[1], [1,2] or 1-2</td><td>18%</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-2">
        <title>Data Conversion Module</title>
        <p>
          The root paper and all its citing papers were placed in separate folders. The first step was to convert
all PDF files in the dataset into text format. The Python library PDFMiner
          <xref ref-type="bibr" rid="ref9">Shinyama (2010)</xref>
          was used
for file conversion. All converted text files were placed in their respective folders.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Data Extraction Module</title>
        <p>
          To prepare data for the analysis phase, we extract useful information from all text files in the dataset.
From the root paper we extract the title and the abstract. From each citing paper we
extract its title, the reference number that in-text citations use to point to the
root paper, the reference entry for the root paper, the corpus, any continuing reference (a reference to a research
paper other than the root paper within the corpus region) and the annotated label assigned to the citing paper.
First, the title of each root paper is extracted. Because pdf-to-text conversion tends to alter
formatting, it is difficult to construct a regular expression that identifies the title of every
root paper. Instead, we programmed a scraping module in Python
          <xref ref-type="bibr" rid="ref13">Venthur (2014)</xref>
          .
The scraper searches for the input on Google Scholar and scrapes the title of the target paper.
The extracted title and its path in our system directory are stored in a CSV file. We use these root
paper titles to search for the reference in their respective citing papers. First, we extract the reference
section of each citing paper; let us call this cropReference. Next, we search for the root paper’s title
in cropReference. Depending on the referencing style of the citing paper, a style-specific regular expression
is run to extract the reference correctly. With the help of this reference, we then form
another regular expression to extract a corpus region of at most five (5) sentences. We search for
and remove any sentence in the corpus that includes a reference to any paper other than our root
paper. All this information is stored in an XML file. The process is illustrated in Figure 3.
        </p>
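        <p>The corpus extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module used in the study: it handles only IEEE-style bracketed citations, splits sentences with a naive regular expression, and centres the window on the first citing sentence. The function names split_sentences, cited_numbers and extract_corpus are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cited_numbers(sentence):
    """Return the reference numbers cited in IEEE-style brackets, e.g. [1] or [1,2]."""
    numbers = set()
    for group in re.findall(r"\[(\d+(?:\s*,\s*\d+)*)\]", sentence):
        numbers.update(int(n) for n in re.split(r"\s*,\s*", group))
    return numbers


def extract_corpus(text, ref_number, max_sentences=5):
    """Return up to max_sentences sentences around the first citation of ref_number."""
    sentences = split_sentences(text)
    hits = [i for i, s in enumerate(sentences) if ref_number in cited_numbers(s)]
    if not hits:
        return []
    # Assumption: the window is centred on the first citing sentence.
    start = max(0, hits[0] - max_sentences // 2)
    window = sentences[start:start + max_sentences]
    # Drop sentences that cite any paper other than the root paper.
    return [s for s in window if not (cited_numbers(s) - {ref_number})]
```
]]></preformat>
        <p>For APA-style or AMA-style papers, cited_numbers would need a different pattern, in line with the style-specific regular expressions mentioned above.</p>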
      </sec>
      <sec id="sec-3-4">
        <title>Parsing XML file</title>
        <p>The XML parser creates a separate file for each citing paper, containing its class (label), reference
(number) and corpus. These files are then placed in the respective subfolder (Type I or Type II).</p>
      </sec>
      <sec id="sec-3-5">
        <title>Feature Extractor</title>
        <p>
          Once the dataset is ready, we build the features on which citing papers are automatically
classified. For this purpose we use the following lexicon dictionaries:
        </p>
        <list list-type="bullet">
          <list-item>
            <p>
              Positive and negative lexica: derived from
              <xref ref-type="bibr" rid="ref4">Evert (2014)</xref>
              , consisting of approximately
28,000 distinct positive and 31,000 distinct negative words.
            </p>
          </list-item>
          <list-item>
            <p>
              Positive Incrementer and Negative Incrementer: derived from
              <xref ref-type="bibr" rid="ref7">Nanba (2000)</xref>
              . Each
includes approximately 75 entries that are unigrams, bigrams or trigrams.
            </p>
          </list-item>
        </list>
        <p>
          The feature extractor iterates over the files. It uses the TextBlob library
          <xref ref-type="bibr" rid="ref5">Loria (2014)</xref>
          in Python
for tokenization. First, the corpus is tokenized into sentences, which are compared with the Positive and Negative
Incrementers; then each sentence is tokenized into words, which are compared with the positive and
negative lexica. For each match or mismatch between a token and a dictionary, we add an entry to
a feature-set list, composed of the token value (‘1’ for a match and ‘0’ for a mismatch) and the label of that
citing paper.
        </p>
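        <p>A minimal sketch of this kind of binary feature extraction is given below. It is an illustration under assumptions rather than the study's implementation: plain regular expressions stand in for TextBlob so that the example needs only the standard library, the four small word and phrase collections are toy stand-ins for the real lexica and incrementers, and the feature names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re

# Toy stand-ins for the lexica and incrementers (assumed entries, for illustration only).
POSITIVE = {"novel", "effective", "improves"}
NEGATIVE = {"fails", "limited", "poor"}
POS_INCREMENTERS = ["state of the art"]
NEG_INCREMENTERS = ["does not scale"]


def split_sentences(text):
    """Naive sentence splitter used in place of a library tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case a sentence and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def extract_features(corpus):
    """Map a corpus to binary features: 1 for a dictionary match, 0 for a mismatch."""
    sentences = [s.lower() for s in split_sentences(corpus)]
    words = {w for s in sentences for w in tokenize(s)}
    features = {}
    # Sentence-level comparison against the incrementer phrases.
    for phrase in POS_INCREMENTERS + NEG_INCREMENTERS:
        features["phrase:" + phrase] = int(any(phrase in s for s in sentences))
    # Word-level comparison against the positive and negative lexica.
    for word in sorted(POSITIVE | NEGATIVE):
        features["word:" + word] = int(word in words)
    return features


# Pair the features with the annotated label of the citing paper.
feature_set = [(extract_features("This novel method improves recall."), "Type-I")]
```
]]></preformat>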
      </sec>
      <sec id="sec-3-6">
        <title>Running the Classifier</title>
        <p>
          We used a Naïve-Bayes classifier for sentiment analysis, using the Scikit-learn library
          <xref ref-type="bibr" rid="ref8">Pedregosa (2011)</xref>
          and the NLTK library
          <xref ref-type="bibr" rid="ref2">Bird (2009)</xref>
          . Because our data are limited, we had to decide what
portion of the feature-set to use for training and what portion to use for testing the
classifier. In our first approach we trained and tested the classifier on the entire feature-set. Next, we used
different techniques to divide the feature-set into training and testing sets.
        </p>
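        <p>To make the classification step concrete, the sketch below implements a small Bernoulli Naïve-Bayes classifier over binary features in plain Python. It is written from scratch for illustration and is not the Scikit-learn or NLTK configuration used in the study; the function names train_nb and classify_nb, the add-one smoothing and the toy feature names are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train on a list of (feature_dict, label) pairs with 0/1 feature values."""
    label_counts = Counter(label for _, label in examples)
    # on_counts[label][feature] = number of examples of that label with feature == 1
    on_counts = defaultdict(Counter)
    features = set()
    for feats, label in examples:
        for name, value in feats.items():
            features.add(name)
            if value:
                on_counts[label][name] += 1
    return label_counts, on_counts, sorted(features)


def classify_nb(model, feats):
    """Return the label with the highest posterior log-probability."""
    label_counts, on_counts, features = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior of the label
        for name in features:
            # Add-one smoothed estimate of P(feature = 1 | label).
            p_on = (on_counts[label][name] + 1) / (n + 2)
            score += math.log(p_on if feats.get(name, 0) else 1 - p_on)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data with two hypothetical binary features.
train = [
    ({"pos_word": 1, "neg_word": 0}, "Type-I"),
    ({"pos_word": 1, "neg_word": 0}, "Type-I"),
    ({"pos_word": 0, "neg_word": 1}, "Type-II"),
    ({"pos_word": 0, "neg_word": 1}, "Type-II"),
]
model = train_nb(train)
print(classify_nb(model, {"pos_word": 1, "neg_word": 0}))
```
]]></preformat>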
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Study</title>
      <p>
        In this section we discuss our findings for data extraction and classification. For data extraction,
our results show 86% accuracy, which is higher than the precision of 80% estimated by
        <xref ref-type="bibr" rid="ref11">Tanguy (2009)</xref>
        for syntactic parsing; in addition, we have generalized the approach to
different referencing styles. The few papers from which we could not accurately extract data failed
either because of pdf-to-text conversion limitations for images or because
the author did not follow the referencing style correctly. Results are given in Table 4.
      </p>
      <sec id="sec-4-1">
        <title>Classification Results</title>
        <p>After the data were extracted from the citing papers, they were analysed with a
Naïve-Bayes classifier. Our preliminary results show an accuracy of 80%, detailed in Table 5.
The recall for Type II was low because only 21 papers were manually
annotated as negative, while the remaining 109 papers were annotated as positive. With so little
data, our classifier could not train accurately on Type II papers.</p>
        <p>To further strengthen
our result and remove the low-sampling bias of Type II papers, we used two approaches. In
our first approach, the dataset was adjusted to 42 research papers, of which half were used for
training and half for testing. To balance Type I and Type II papers, the technique was applied to different
windows of Type I papers, while using all 21 Type II papers. The windows were set to
0-21, 22-43, 44-65, 66-87, 88-109 and 110-125. A different accuracy was obtained for each
window, and the average accuracy was approximately 78%. Results are given in Table 6.
Secondly, the data was analysed using the Monte-Carlo Technique, to obtain stability in
classification. Further changes were made to the Type I windows accordingly, explained in</p>
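        <p>The idea of balancing the two classes and repeating the evaluation over random samples can be sketched as below. This is a hedged illustration of repeated random sub-sampling, not the exact procedure of the study: the function monte_carlo_accuracy, the number of trials, the half-and-half split and the train_and_score callback (which is expected to train a classifier on the first list and return its accuracy on the second) are all assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random
from statistics import mean


def monte_carlo_accuracy(type1, type2, train_and_score, trials=100, seed=0):
    """Average accuracy over repeated balanced random samples.

    type1 and type2 are lists of (features, label) pairs; type1 is the larger class.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        # Draw as many Type I items as there are Type II items, then mix them.
        sample = rng.sample(type1, len(type2)) + list(type2)
        rng.shuffle(sample)
        half = len(sample) // 2
        # First half trains the classifier, second half tests it.
        scores.append(train_and_score(sample[:half], sample[half:]))
    return mean(scores)
```
]]></preformat>
        <p>Fixing the seed makes a run repeatable, while averaging over many trials reduces the dependence of the reported accuracy on any single choice of Type I papers.</p>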
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>The purpose of our paper was to help research scholars by minimizing the time required to
find relevant research on a topic of interest. In CRC, we proposed to classify citations
into three categories: Type-I (Positive), Type-II (Negative) and Type-III (Neutral). A
researcher who wants to know about further advancements of a paper under study can refer directly
to Type-I papers, and one who wants to find research rebutting the findings of the paper
under study can refer to Type-II papers. Our preliminary results show an accuracy of 80%. We
assert that the technique can be generalized to the classification of scientific research papers.
Support for Type-III papers in CRC is currently in progress, and we are working on a lexicon
dictionary of neutral words. In future, we plan to provide a web portal that assists research
scholars in automatically searching for and downloading the citing papers of a root paper and
classifying them into sentiment categories.</p>
      <sec id="sec-5-1">
        <title>References for the papers used in the explanation (Figure 2)</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choubey</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Automatically classifying the role of citations in biomedical articles</article-title>
          .
          <source>In AMIA Annual Symposium Proceedings</source>
          (Vol.
          <volume>2010</volume>
          , p.
          <fpage>11</fpage>
          ). American Medical Informatics Association.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Loper</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <source>Natural language processing with Python</source>
          .
          <publisher-name>O'Reilly Media, Inc.</publisher-name>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>W. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peterson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Yen</surname>
            ,
            <given-names>P. Y.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Reducing workload in systematic review preparation using automated citation classification</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          ,
          <volume>13</volume>
          (
          <issue>2</issue>
          ),
          <fpage>206</fpage>
          -
          <lpage>219</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Evert</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Proisl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greiner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kabashi</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>SentiKLUE: Updating a Polarity Classifier in 48 Hours</article-title>
          .
          <source>SemEval</source>
          ,
          <fpage>551</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Loria</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <source>TextBlob: Simplified Text Processing</source>
          . Retrieved 4 May 2015 from: http://textblob.readthedocs.org/en/dev/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghavan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Schütze</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <source>Introduction to Information Retrieval</source>
          (Vol.
          <volume>1</volume>
          , p.
          <fpage>496</fpage>
          ).
          <publisher-loc>Cambridge</publisher-loc>
          :
          <publisher-name>Cambridge University Press</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Nanba</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kando</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Okumura</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>Classification of research papers using citation links and citation types: Towards automatic review article generation</article-title>
          .
          <source>11th ASIS SIG/CR Classification Research Workshop</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Duchesnay</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>The Journal of Machine Learning Research</source>
          ,
          <volume>12</volume>
          ,
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Shinyama</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <source>PDFMiner: Python PDF parser and analyzer</source>
          . Retrieved 11 June 2015 from: http://www.unixuser.org/~euske/python/pdfminer/
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Spiegel-Rösing</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>1977</year>
          ).
          <article-title>Science studies: Bibliometric and content analysis</article-title>
          .
          <source>Social Studies of Science</source>
          ,
          <fpage>97</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Tanguy</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lalleman</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>François</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Séguéla</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>RHECITAS: citation analysis of French humanities articles</article-title>
          .
          <source>In Corpus Linguistics 2009</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Teufel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siddharthan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Tidhar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2006</year>
          , July).
          <article-title>Automatic classification of citation function</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          (pp.
          <fpage>103</fpage>
          -
          <lpage>110</lpage>
          ). Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Venthur</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>GScholar: Query Google Scholar with Python</article-title>
          .
          Retrieved 5 April 2015 from: http://github.com/venthur/gscholar/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Felt</surname>
            ,
            <given-names>A. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanna</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wagner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2011</year>
          , October).
          <article-title>Android permissions demystified</article-title>
          .
          <source>In Proceedings of the 18th ACM conference on Computer and communications security</source>
          (pp.
          <fpage>627</fpage>
          -
          <lpage>638</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Grace</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sadeghi</surname>
            ,
            <given-names>A. R.</given-names>
          </string-name>
          (
          <year>2012</year>
          , April).
          <article-title>Unsafe exposure analysis of mobile in-app advertisements</article-title>
          .
          <source>In Proceedings of the fifth ACM conference on Security and Privacy in Wireless and Mobile Networks</source>
          (pp.
          <fpage>101</fpage>
          -
          <lpage>112</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Grace</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          (
          <year>2012</year>
          , June).
          <article-title>Riskranker: scalable and accurate zero-day android malware detection</article-title>
          .
          <source>In Proceedings of the 10th international conference on Mobile systems, applications, and services</source>
          (pp.
          <fpage>281</fpage>
          -
          <lpage>294</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Kelley</surname>
            ,
            <given-names>P. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cranor</surname>
            ,
            <given-names>L. F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sadeh</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2013</year>
          , April).
          <article-title>Privacy as part of the app decision-making process</article-title>
          .
          <source>In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems</source>
          (pp.
          <fpage>3393</fpage>
          -
          <lpage>3402</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saïdi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2012</year>
          , August).
          <article-title>Aurasium: Practical Policy Enforcement for Android Applications</article-title>
          .
          <source>In USENIX Security Symposium</source>
          (pp.
          <fpage>539</fpage>
          -
          <lpage>552</lpage>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>