<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Categorization of Segments of Ancient and Mediaeval Zoological Texts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Catherine Faron-Zucker</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Irene Pajón Leyra</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantina Poulida</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea G. B. Tettamanzi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Inria Sophia Antipolis</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Univ. Nice Sophia Antipolis</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>68</lpage>
      <abstract>
        <p>In this paper we present a preliminary work conducted in the framework of the multidisciplinary research network Zoomathia, which aims at studying the transmission of zoological knowledge from Antiquity to the Middle Ages through compilation literature. We propose an approach of knowledge extraction from ancient texts consisting in semantically categorizating text segments based on machine learning methods applied to a representation of segments built by processing their translations in modern languages with Natural Language Processing (NLP) methods and by exploiting a dedicated thesaurus of zoology-related concepts. The nal aim is to semantically annotate the ancient texts and reason on these annotations to help epistemologists, historians and philologists in their analysis of these texts.</p>
      </abstract>
      <kwd-group>
        <kwd>History of Zoology</kwd>
        <kwd>Knowledge Extraction from Texts</kwd>
        <kwd>Semantic Categorization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Semantic Web has a key role to play in supporting cultural studies. During
the last decade, several works have addressed semantic annotation and search
in Cultural Heritage collections and Digital Library systems. They focus on
producing Cultural Heritage RDF datasets, aligning these data and their
vocabularies on the Linked Data cloud, and exploring and searching among
heterogeneous semantic data stores. In the framework of the international research
network Zoomathia, we address the challenge of adopting such a Linked Data
cloud-based approach to support multidisciplinary studies in the History of Science.
Zoomathia primarily focuses on the transmission of zoological knowledge from
Antiquity to the Middle Ages through textual resources, and considers
compilation literature such as encyclopaedias.</p>
      <p>
        The automatic annotation of the Zoomathia corpus of selected texts is a first
step towards automatic reasoning on these annotations, supporting the
evaluation and interpretation of the development of zoological knowledge through
the ages. The work presented in this paper continues
Tounsi et al.'s work presented in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] on (i) the automatic extraction of zoonyms
and zoological topics (ethology, anatomy, medicinal properties, etc.) from the
fourth book of the late mediaeval encyclopaedia Hortus Sanitatis (15th century),
written in Latin and compiling ancient texts on fishes, and (ii) the semantic
annotation of the units of this text. The approach for extracting zoonyms was
relatively simple, based on a set of patterns (syntactic rules) to recognize the
occurrence of terms from a taxonomy among the lemmas identified in the Latin
texts. The performance of the approach closely depends on the available
taxonomic resources. We can now rely on the translation of the TAXREF taxonomic
thesaurus of zoological and botanical names into SKOS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. As for the extraction
of zoological topics, the proposed approach consisted of (i) semi-automatically
building a list of semantically related terms for each of the 8 targeted zoological
topics, based on the eXtended WordNet Domains (XWND, http://adimen.si.ehu.es/web/XWND)
and BabelNet (http://babelnet.org/) terminological resources; and (ii) automatically
annotating a text segment with a topic when the number of its terms belonging to
the set of terms representing that topic was greater than a given threshold. While
the overall approach was promising and created real momentum among the participants
of the Zoomathia network, the results achieved with the proposed method of knowledge
extraction were limited, and the method itself had shortcomings: (i) it required a
manual step to build a representative set of terms for each considered topic;
(ii) it required translating the semantically related terms of each topic into Latin,
which had to be done manually by a philologist; (iii) the criterion used to assign
a topic to a text segment was too simplistic.
      </p>
      <p>
        To overcome these limitations, we conceived a possibly more promising method
to automatically annotate segments of ancient texts with zoological concepts.
First, we take advantage of the terminological work conducted in the
meantime in Zoomathia, which led to the publication of the THEZOO thesaurus
in SKOS, gathering all the zoology-related concepts encountered in Pliny the
Elder's Naturalis Historia (1st century; for the moment, only books VIII–XI are
covered, dealing respectively with terrestrial animals, aquatic animals, birds,
and insects and other terrestrial invertebrates), considered as representative of the
zoological knowledge in the Zoomathia corpus of texts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Second, we reuse
state-of-the-art Natural Language Processing (NLP) methods and supervised
learning algorithms and libraries for the categorization of text segments. A text
segment may be classified into several categories: our classifier is a set of binary
classifiers deciding, for each considered category, whether a segment belongs to it
or not. Categories can be any concepts of the THEZOO thesaurus, and the
semantics of the subsumption relations among concepts is taken into account in
our classifier. Third, to take advantage of the wealth of terminological
resources developed in the community for modern languages (such resources being much
rarer for ancient languages), we consider modern translations of the ancient texts;
and to compensate for the possible loss of precision in processing a translation
rather than the original text, we consider several modern translations of each
ancient text and combine the results of their processing. Finally, we use the
identified categories to annotate the original ancient text.
      </p>
      <p>Our research question is thus: how can we effectively categorize ancient text
segments by relying on their translations into modern languages, taking
advantage of the terminological resources and NLP APIs available for modern
languages?</p>
      <p>This paper is organized as follows: Section 2 presents our approach to the
automatic classification of ancient texts. Section 3 presents the experiments applying
our approach to the classification of text segments of Book 9 of Pliny's Naturalis
Historia on aquatic animals and discusses the obtained results. Section 4
concludes.</p>
    </sec>
    <sec id="sec-2">
      <title>A Semantic Approach to Segment Classification</title>
      <p>
        The problem we tackle is essentially a particular case of text categorization,
which may be defined as the classification of documents into a fixed number
of predefined categories, where each document may belong to one, more than
one, or no category at all [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The solution we propose falls within statistical
text categorization, in that we rely on machine-learning methods to learn
automatic classification rules based on human-labeled training "documents" (in our
case, text segments). In addition, to take advantage of linked-data resources and
structured domain knowledge, we follow a variant of text segment vector
representation whereby the features correspond to senses (i.e., meanings) of words
or phrases occurring in the text, rather than directly to the words or phrases
themselves. In this sense, our approach may be called semantic.
      </p>
      <p>A semantic approach is also a fundamental aspect of the
philological work: the general idea of THEZOO is precisely to go beyond the lexical
and grammatical levels of texts and to work at the level of meaning.</p>
      <p>One specificity of our problem is that the texts we are interested in
categorizing are written in ancient languages (primarily Latin and ancient Greek), for
which computational linguistic resources like structured machine-readable lexica
and parsers are hard to find, somewhat incomplete, or not interoperable with
Semantic Web technologies. We propose, as a workaround, to use one or more
translations into modern languages (for which such resources are available) as
proxies for the original text. As a matter of fact, translations into modern
languages exist for most ancient and medieval texts; furthermore, such translations
are of particularly high quality, being the work of well-trained philologists who
strive to convey, as accurately as they can, the full meaning of the ancient text.</p>
      <sec id="sec-2-1">
        <title>Dataset Construction</title>
        <p>Our approach proceeds in two steps. The first is a semantics-based
extraction from the texts of a representation of text segments, which is then
processed in a second step to categorize them.</p>
        <p>We first process the corpus of texts under study to extract from WordNet the list
of synsets occurring at least once in the corpus. Each text segment is then
represented by a binary vector of the size of this list, indicating the presence or absence
in the segment of terms belonging to each synset. The vectors are then weighted
using the term frequency-inverse document frequency (TF-IDF) statistic to
reflect how important each synset is to a text segment. This processing step
mainly relies on tools available in the Natural Language Toolkit (NLTK, http://www.nltk.org/).</p>
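        <p>As an illustration, this synset-based TF-IDF representation can be sketched as follows; the synset lookup table below is a toy stand-in for NLTK's WordNet interface, and all names are invented for illustration:</p>
        <preformat>
```python
import math
from collections import Counter

# Hypothetical stand-in for an NLTK WordNet lookup (wn.synsets(word));
# every word below simply maps to an invented synset id.
TOY_SYNSETS = {
    "dolphin": ["dolphin.n.01"], "fish": ["fish.n.01"],
    "swims": ["swim.v.01"], "sea": ["sea.n.01"],
    "bird": ["bird.n.01"], "flies": ["fly.v.01"],
}

def synset_multiset(segment):
    """Collect the synset ids of every known word of a segment."""
    ids = []
    for word in segment.lower().split():
        ids.extend(TOY_SYNSETS.get(word, []))
    return Counter(ids)

def tfidf_vectors(segments):
    """Represent each segment as a dict mapping synset id to TF-IDF weight."""
    counts = [synset_multiset(s) for s in segments]
    n = len(segments)
    # document frequency: in how many segments each synset occurs
    df = Counter(syn for c in counts for syn in c)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1
        vectors.append({syn: (freq / total) * math.log(n / df[syn])
                        for syn, freq in c.items()})
    return vectors

segments = ["dolphin swims sea", "fish swims sea", "bird flies"]
vecs = tfidf_vectors(segments)
```
        </preformat>
        <p>A synset occurring in every segment gets weight zero, while rarer synsets are weighted up, as intended.</p>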
        <p>Second, for each concept of interest in the thesaurus, a binary classifier is
constructed, with a training set built from the manual annotation of
a subset of the text segments in the corpus with terms from the THEZOO
thesaurus. This manual annotation was carried out by a philologist. At
this step, the semantics of the thesaurus is taken into account by considering all
the concepts specializing the concept targeted by each classifier.</p>
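        <p>A minimal sketch of how subsumption can widen a classifier's training set: a segment annotated with any concept specializing C counts as a positive example for C. The NARROWER map is a toy stand-in for THEZOO's skos:narrower links, and all concept names here are invented:</p>
        <preformat>
```python
# Toy stand-in for the skos:narrower relations of the thesaurus.
NARROWER = {
    "anatomy": ["fin", "gill"],
    "fin": ["caudal_fin"],
}

def descendants(concept):
    """All concepts specializing `concept`, including itself."""
    result = {concept}
    for child in NARROWER.get(concept, []):
        result |= descendants(child)
    return result

def positives(annotations, concept):
    """Segments whose annotations fall under `concept` by subsumption."""
    below = descendants(concept)
    return [seg for seg, concepts in annotations.items()
            if any(c in below for c in concepts)]

annotations = {"p1": ["caudal_fin"], "p2": ["ethology"], "p3": ["gill"]}
found = positives(annotations, "anatomy")
# p1 and p3 become positive examples for "anatomy" via subsumption
```
        </preformat>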
        <p>Finally, with the same training sets, we tested several implementations of
classifiers available in the Weka machine learning suite (http://weka.wikispaces.com/).</p>
      </sec>
      <sec id="sec-2-2">
        <title>Combining Several Modern Translations</title>
        <p>When subjected to the treatment described in the previous section, even the most
accurate modern translation of an ancient text is likely to introduce noise into the
process.</p>
        <p>
          To begin with, contemporary translation studies [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] have made it clear that,
when applied to texts of cultural and literary relevance, translation is not just
a means of recovering a source text, but also a process of interpretation and
production of literary meaning and value. The translator faces multiple choices
when having to render the sense of a word or phrase in the target language and
some of these choices imply an interpretation of the meaning of the original text
which might be subject to debate. Whereas all possible choices are implicitly
contained, in potentiality, in the original text, once the translator commits to a
particular interpretation and choice, there is necessarily a loss of meaning.
        </p>
        <p>At the same time, and besides the possible loss of meaning, there is also
the risk of introducing novel meaning, which was not necessarily implied by the
original text, and this because the terms employed by the translator to convey
the intended meaning of the original text may be ambiguous or polysemous.</p>
        <p>One way to obviate both the problem of sense loss and the problem of
ambiguity/polysemy is to consider multiple translations, in the same or different
modern languages. We concentrate on the case of combining translations in different
modern languages, because it is the most general: by solving all the challenges
it poses, an approach providing for it is then suitable for dealing with multiple
translations in the same language as well. Besides, different languages are not
always equally capable of expressing the nuances of the original text; therefore,
using versions in different languages helps recover a more complete
perspective of the original meaning.</p>
        <sec id="sec-2-2-1">
          <title/>
          <p>An essential requirement for combining multiple translations is that the
original text and its translations be aligned. In the case of classic and
medieval texts, a conventional segmentation of the text into books, chapters, and
paragraphs is generally agreed upon by philologists. Therefore, if the
granularity of the segments we are interested in categorizing is, as we assume here, the
same as the smallest unit of such traditional segmentation, this step does not
pose particular problems, all the more so because, in general, translations into
modern languages preserve it.</p>
          <p>At the level of a given segment, the combination of multiple translations
works as follows:
1. the multiset Si of synsets giving the senses of all terms (after eliminating
stopwords) occurring in each translation Ti of the segment is computed;
2. each multiset Si is converted into a multiset Si′ by mapping every synset id
sij ∈ Si to the corresponding synset id s′ij in the Princeton WordNet; if
no corresponding synset id can be determined based on the available index
files, sij is simply dropped;
3. the intersection of the converted multisets, S = ∩i Si′, is computed and
used as the basis for constructing the feature vector representation of the
segment, using TF-IDF as described above.</p>
          <p>The main rationale for taking the intersection of the multisets computed from
the various translations is that, by keeping only the senses which are shared
among them, we hope to reduce the noise due to polysemous terms occurring in
the translations and, indirectly, to disambiguate the original text. One
possible drawback of taking the intersection is that, if two of the translations
considered were based on radically different interpretations of the original text,
the synsets corresponding to some important term in the original text might
disappear altogether. However, this is very unlikely to happen in practice, for even
if two different senses of the same word are construed by two translators, chances
are that the terms employed to render them are not too distant semantically,
so that the intersection of their respective synsets is not empty. A quantitative
investigation of this claim, however, is left for future work.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>To test our approach, we focused our attention on Book 9 of Pliny the Elder's
Naturalis Historia on aquatic animals, which consists of 186 paragraphs. In this
case, paragraphs are the segments of text which are categorized; on average they
are 56 words long.</p>
      <p>
        We have used translations which are now in the public domain, namely [
        <xref ref-type="bibr" rid="ref1 ref7">1, 7</xref>
        ]
for English, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] for French, and [
        <xref ref-type="bibr" rid="ref9">9</xref>
          ] for German. As for linguistic resources, we have
used Princeton WordNet (https://wordnet.princeton.edu/) for English, WOLF (Wordnet
Libre du Français, http://alpage.inria.fr/~sagot/wolf-en.html) for French, and
GermaNet (http://www.sfs.uni-tuebingen.de/GermaNet/index.shtml) for German.
      </p>
      <sec id="sec-3-1">
        <title/>
        <p>Seven pairs of training and test datasets have been constructed for the
following translation languages or combinations of languages:
1. English;
2. French;
3. German;
4. English and French;
5. English and German;
6. French and German;
7. English, French, and German.</p>
        <p>Each paragraph has been transformed into a vector of features, where each
feature is the TF-IDF in the paragraph of a synset whose lexicalization occurs
in the translations of Book 9 in the modern languages considered. When the
translations in two or three modern languages are considered, the synsets of
languages other than English are mapped to the corresponding Princeton
WordNet synsets, and the intersection of the synsets from each modern translation
is taken to compute the feature vector for that paragraph.</p>
        <p>We manually assigned paragraphs (and, by extension, their associated feature
vectors) to the categories corresponding to their topic. A paragraph may belong
to more than one category.</p>
        <p>The training and test datasets for a category C (i.e., a topic against which
paragraphs are to be classified) are obtained by randomly selecting half of the
feature vectors (or records) classified as C and half of the feature vectors classified
as ¬C for the training dataset, taking the remaining half for the test set, so that
the training and test datasets contain the same fraction of C and ¬C records.</p>
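        <p>The stratified 50/50 split described above might be sketched as follows; function and variable names are our own, not taken from the paper's code:</p>
        <preformat>
```python
import random

# Stratified half/half split: half of each class goes to training,
# half to test, so both sets keep the same class proportions.
def split_half(records, labels, seed=0):
    rng = random.Random(seed)
    train, test = [], []
    for label in (True, False):          # C and not-C
        idx = [i for i, l in enumerate(labels) if l == label]
        rng.shuffle(idx)
        half = len(idx) // 2
        train += [(records[i], label) for i in idx[:half]]
        test += [(records[i], label) for i in idx[half:]]
    return train, test

records = list(range(10))
labels = [True] * 4 + [False] * 6
train, test = split_half(records, labels)
# train gets 2 positive and 3 negative records; test gets the rest
```
        </preformat>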
        <p>The datasets thus obtained, however, are imbalanced. For instance, out of
the 186 paragraphs in Book 9 of Pliny the Elder's Naturalis Historia, 55, or
29.6%, are about "anatomy"; most paragraphs are not about anatomy. Such an
imbalance, if not properly corrected, may lead many classification methods to
take the shortcut of classifying all paragraphs as "not anatomy", which would
be an easy way of obtaining a 70% accuracy.</p>
        <p>
          Random under- and oversampling are two popular techniques to obtain a
balanced training set from an imbalanced one [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. However, undersampling, which
works by removing examples from the most represented class, is not suitable
for cases like ours, where training data are scarce, and could remove
important examples; random oversampling, which injects into the least
represented class additional copies of its examples, may on the other hand lead
to overfitting if some examples get sampled more than others. To obviate this
problem, we adopted a deterministic oversampling strategy which constructs a
perfectly balanced dataset of a size n much larger than the size of the
original imbalanced dataset by alternately picking an example from either class
and wrapping around when all the examples of a class have been exhausted, as
shown in Algorithm 1. As a result, two examples of the same class will always
get sampled a number of times that can differ by at most 1. By taking a
sufficiently large n, one can make the maximum deviation between the frequencies of
examples as small as desired.
        </p>
        <sec id="sec-3-1-1">
          <title>Algorithm 1: balance(d, n)</title>
          <p>Specifically, for our experiments, we set n = 1000.</p>
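          <p>Since the body of Algorithm 1 is not reproduced in this copy, the following is a reconstruction from the prose alone: examples are picked alternately from either class, wrapping around within each class.</p>
          <preformat>
```python
# Reconstruction of the deterministic balancing described in the text:
# alternately pick from each class, cycling through a class's examples
# again once they are exhausted, until n examples have been drawn.
def balance(positives, negatives, n):
    """Build a perfectly balanced dataset of n examples."""
    out = []
    for i in range(n):
        pool = positives if i % 2 == 0 else negatives
        out.append(pool[(i // 2) % len(pool)])
    return out

pos = ["p1", "p2"]
neg = ["n1", "n2", "n3"]
sampled = balance(pos, neg, 10)
# each class contributes 5 examples; within a class, sample counts
# differ by at most one (p1: 3, p2: 2 and n1: 2, n2: 2, n3: 1)
```
          </preformat>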
          <p>A number of classification methods implemented in Weka, including
complement and multinomial naive Bayes, k-nearest neighbors, and support vector
machines, have been applied to the datasets thus obtained. Support vector
machines proved to give the best results.</p>
          <p>Table 1 summarizes the results obtained by support vector machines when
used to classify paragraphs not used for training (test set) with respect to the
category "anatomy". In this table, accuracy is the percentage of correct
classifications; precision is the percentage of paragraphs classified as "anatomy" by the
model that were annotated as such by the human expert; recall is the percentage
of paragraphs annotated as "anatomy" that are correctly recognized; F-measure
is the average of the F-scores for class "anatomy" and for its complement.</p>
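          <p>For reference, these four measures can be computed from a binary confusion matrix as follows; the counts used here are purely illustrative, not the paper's results:</p>
          <preformat>
```python
# Accuracy, precision, recall and macro F-measure from a binary
# confusion matrix; the F-measure averages the F-scores of the class
# and of its complement, as in the text.
def f_score(precision, recall):
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_pos = f_score(precision, recall)
    # for the complement class, tn plays the role of tp
    f_neg = f_score(tn / (tn + fn), tn / (tn + fp))
    f_measure = (f_pos + f_neg) / 2
    return accuracy, precision, recall, f_measure

acc, prec, rec, f = metrics(tp=20, fp=10, fn=8, tn=55)
```
          </preformat>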
          <p>
            In terms of accuracy, these results constitute an improvement over the results
obtained in [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ].
          </p>
          <p>Although the performance in terms of accuracy looks promising, in reality,
when one focuses on the capability of the classification model to recognize and
thus automatically annotate a paragraph about a given topic (category), these
results are quite disappointing, with precision and recall figures well below an
acceptable level.</p>
          <p>A rather surprising fact, which calls for a more in-depth investigation, is that
the results obtained by combining translations in three languages (cf. the last
row of Table 1) are no better than those obtained by combining translations in
two languages, which, in turn, are no better than those obtained by considering a
single translation. This preliminary evidence would thus suggest that combining
translations in different languages is not a good idea, but we are cautious about
jumping to such a conclusion, and we think more evidence, based on a larger corpus
of texts, should be gathered before dismissing this proposal.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>Despite the disappointing preliminary results, we believe the proposed approach
to have the potential to provide a viable solution to the problem of automatic
or semi-automatic annotation of ancient texts.</p>
      <p>We think the reason for the observed poor performance of the classification
models in our preliminary experiments may be twofold: on the one hand,
the number of examples available for training the models is exceedingly small in
the face of a very high-dimensional feature space (ranging from 2,500 to 10,500
features); on the other hand, the features that could prove useful for reaching the
correct classification are drowned among all the other features. Coming up with
a heuristic to select a small number of relevant features for a given category would
probably alleviate both problems. We plan on concentrating our future efforts
in that direction. In addition, we are aware that many tools for semi-automatic
analysis are currently under development, for example in the Perseus Project.
Currently, NLTK does not make it possible to exploit the Latin or Classical Greek
versions of WordNet. For some phases of our work, a framework like the
Classical Language Toolkit (http://cltk.org), an extension of NLTK, could perhaps
be useful. Conversely, the research work described here could somehow contribute to
these efforts. For example, we plan on aligning the THEZOO thesaurus with WordNet. An
implicit assumption of our methodological choice is that the categories in ancient,
medieval, 19th-century, and contemporary texts match perfectly.
Of course this is not the case, and using a specific thesaurus like THEZOO might
contribute to making our approach more anthropologically aware.
Acknowledgments. Zoomathia is an International Research Group (GDRI)
supported by the French National Scientific Research Center (CNRS).</p>
      <p>GermaNet (http://www.sfs.uni-tuebingen.de/GermaNet/) is a German
lexical-semantic resource developed at the Linguistics Department of the
University of Tübingen.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>J.</given-names>
            <surname>Bostock</surname>
          </string-name>
          and H. T. Riley, editors.
          <source>Pliny the Elder, The Natural History</source>
          , Vol. II. Taylor and Francis, London,
          <year>1890</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>C.</given-names>
            <surname>Callou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faron-Zucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Martin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Montagnat</surname>
          </string-name>
          .
          <article-title>Towards a shared reference thesaurus for studies on history of zoology, archaeozoology and conservation biology</article-title>
          . In A. Zucker, I. Draelants,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faron-Zucker</surname>
          </string-name>
          ,
          and A. Monnin, editors,
          <source>Proceedings of the First International Workshop Semantic Web for Scienti c Heritage at the 12th ESWC 2015 Conference</source>
          , Portoroz, Slovenia, June 1st,
          <year>2015</year>
          , volume
          <volume>1364</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <volume>15</volume>
          –
          <fpage>22</fpage>
          . CEUR-WS.org,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Chawla</surname>
          </string-name>
          .
          <article-title>Data mining for imbalanced datasets: An overview</article-title>
          . In O. Maimon and L. Rokach, editors,
          <source>Data Mining and Knowledge Discovery Handbook</source>
          , 2nd ed., pages
          <volume>875</volume>
          –
          <fpage>886</fpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>E.</given-names>
            <surname>Gentzler</surname>
          </string-name>
          .
          <source>Contemporary Translation Theories: Revised 2nd Edition</source>
          . Multilingual Matters, Clevedon,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. M. E. Littré, editor. Histoire Naturelle de Pline,
          <article-title>avec la traduction en français</article-title>
          .
          <source>Firmin-Didot et Cie</source>
          , Paris,
          <year>1877</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. I.
          <string-name>
            <surname>Pajon-Leyra</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Zucker</surname>
            , and
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Faron-Zucker</surname>
          </string-name>
          .
          <article-title>Thezoo : un thésaurus de zoologie ancienne et médiévale pour l'annotation de sources de données hétérogènes</article-title>
          . to appear
          <source>in ALMA (Archivum Latinitatis Medii Aevi)</source>
          ,
          <volume>73</volume>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. H. Rackham, editor. Pliny:
          <article-title>Natural History, volume III (Books VIII–XI)</article-title>
          . Cambridge, Massachusetts,
          <year>1940</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tounsi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faron-Zucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Villata</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Cabrio</surname>
          </string-name>
          .
          <article-title>Studying the history of pre-modern zoology by extracting linked zoological data from mediaeval texts and reasoning on it</article-title>
          . In The Semantic Web:
          <article-title>ESWC 2015 Satellite Events</article-title>
          , Portoroz, Slovenia,
          <year>2015</year>
          , Revised Selected Papers, volume
          <volume>9341</volume>
          <source>of LNCS</source>
          . Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. G. C. Wittstein, editor.
          <source>Die Naturgeschichte des Cajus Plinius Secundus</source>
          ,
          <article-title>ins Deutsche ubersetzt und mit Anmerkungen versehen, zweiter Band (VII{XI Buch)</article-title>
          .
          <source>Gressner &amp; Schramm</source>
          , Leipzig,
          <year>1881</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Joachims</surname>
          </string-name>
          .
          <article-title>Text categorization</article-title>
          .
          <source>Scholarpedia</source>
          ,
          <volume>3</volume>
          (
          <issue>5</issue>
          ):
          <fpage>4242</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>