<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tomaž Erjavec</string-name>
          <email>tomaz.erjavec@ijs.si</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joh Dokler</string-name>
          <email>joh.dokler@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petra Vide Ogrin</string-name>
          <email>petra.vide@zrc-sazu.si</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Knowledge Technologies, Jožef Stefan Institute</institution>
          ,
          <addr-line>Jamova cesta 39, Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Seven Past Nine Ltd.</institution>
          <addr-line>Vrtača 3, Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Slovenian Academy of Sciences and Arts, Library</institution>
          ,
          <addr-line>Novi trg 3, Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <fpage>16</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>The paper presents the Slovenian Biography (SB) portal and its data. The SB currently comprises almost 9,000 person entries, each composed of free text from three source biographical lexicons and a semi-automatically extracted structured authority record. These records contain detailed information about the person's name(s), the date and geolocated place of birth and death, and the occupation(s) of the person, drawn from a richly developed taxonomy that covers both historical and contemporary occupations. The encoding of the data follows the Text Encoding Initiative (TEI) Guidelines, in particular the module for biographical and prosopographical data. The Web portal of the SB is built on the BaseX XML database engine, queried with XPath / XQuery expressions, the Lucene search engine and the Django web framework. The portal supports faceted search, visualisation on a map, export of the entries in TEI, etc. The SB is still a work in progress, with new entries being added on a regular basis. We are also enriching its encoding, e.g. by marking up all person names appearing in the text, adding relations between persons and expanding the numerous abbreviations carried over from the print editions.</p>
      </abstract>
      <kwd-group>
        <kwd>Slovenian biography</kwd>
        <kwd>TEI encoding</kwd>
        <kwd>Structured data</kwd>
        <kwd>Web portal</kwd>
        <kwd>Natural language processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Slovenia became an independent country only in 1991. However, long before that
Slovenians had a very strong national identity, mostly centred on their language.
Slovenians, i.e. people born where Slovenian is or was spoken, or having significant
influence in these areas, can thus be traced back over more than one thousand years.</p>
      <p>
        This paper presents the data and portal of the Slovenian
Biography (SB, http://www.slovenska-biografija.si/), which details the life and work of important
Slovenians. It comprises data from three biographical
lexicons, the first two of which were published in print in
several volumes:
        <list list-type="bullet">
          <list-item><p>SBL: the Slovenian Biographical Lexicon <xref ref-type="bibr" rid="ref2">(Cankar et al., 1925–1991)</xref>, containing 5,048 entries;</p></list-item>
          <list-item><p>PSBL: the Primorska Slovenian Biographical Lexicon <xref ref-type="bibr" rid="ref6">(Jevnikar et al., 1974–1994)</xref>, containing 4,429 entries;</p></list-item>
          <list-item><p>NSBL: the New Slovenian Biographical Lexicon <xref ref-type="bibr" rid="ref10">(Svetina et al., 2013–)</xref>, currently containing 455 entries.</p></list-item>
        </list>
      </p>
      <p>SBL is included in the SB in its entirety; PSBL is still in
the process of digitisation, with about 40% of
its entries currently included, while NSBL is being added to the SB
gradually, as its publication is an ongoing process.</p>
      <p>The SBL, the largest of the three lexicons, strives to be an
authoritative resource: the authors of the articles follow
strict scientific standards, using a responsible historical
and biographical method, meaning that all data is checked
against the relevant historical materials and pre-existing
publications. For example, biographical and other
dates are always compared to those in registers and other
primary sources, literary citations are compared with the
originals, sources are cited at the end of the articles, and
the publication includes an index of all person names that
appear in the articles as well as a list of abbreviations. It is thus a
reference work and a precious resource for any serious
research in the fields of Slovenian humanities, social
sciences and history of natural sciences.</p>
      <p>The SB is a long-term joint project of the Slovenian
Academy of Sciences and Arts (Slovenska akademija
znanosti in umetnosti, SAZU), the Scientific Research
Centre at SAZU, the Jožef Stefan Institute and Seven Past
Nine Ltd. This paper details its current state: Section 2
gives an overview of the digitisation and up-translation of the source
biographical lexicons to XML, including the editing
environment and the structure and content of the resulting
documents; Section 3 explains the technologies used and
the user interface of the web portal; Section 4 outlines
our ongoing work; and Section 5 gives some conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Developing the database</title>
      <p>
        The digitisation of the printed edition of the SBL, comprising
16 volumes (the beginning of an example entry is given in Figure 1),
began in 2007, with the aim of making
this main biographical resource freely accessible. The
digitised version was first made available on-line on a
platform based on Fedora Commons (<xref ref-type="bibr" rid="ref4 ref5">Javoršek et al., 2009a, 2009b</xref>),
which has since been discontinued following our migration to the
current platform presented in this paper.
      </p>
      <p>After finishing the SBL, the work moved on to digitising
the PSBL. The scanning, OCR and manual correction of
the transcription have now also been completed, and its
articles are being added gradually to the SB: the
extracted structured data of each article is manually corrected
before the article is fully integrated into the SB.</p>
      <p>Finally, the publication of NSBL is an ongoing process
and volumes are to be published successively in
alphabetical order. The articles are born-digital and the
structured biographical data is being added manually and
integrated into the SB simultaneously with the free text
articles, as they become available.</p>
      <sec id="sec-2-1">
        <title>2.1 Editing environment</title>
        <p>
          The SB uses XML as the markup language to encode and
structure its information. The XML complies with the
Guidelines for Electronic Text Encoding and Interchange of the
Text Encoding Initiative (TEI, http://www.tei-c.org/; TEI Consortium),
which define an extensive and flexible schema for
representing texts in digital form. We chose TEI as our base
encoding not only because of the wide range of text types
that it covers and its continuous development, but also
because we have substantial prior experience in using it,
e.g. for developing text-critical digital editions of
Slovenian literature
          <xref ref-type="bibr" rid="ref3">(Erjavec &amp; Ogrin, 2005)</xref>
          .
        </p>
        <p>The editorial team works directly on the XML using the
specialized XML editor Oxygen, which supports
validation against the TEI schema. Schema validation allows editors to
encode rich data structures in a data-safe manner
without the need to
implement complicated user interfaces. In this sense,
XML is itself a user interface, and one that editors can extend
themselves. In the context of the project we did in fact
experiment with a simplified web-based user interface (on
top of relational database models) that allowed
editing without prior knowledge of XML. However, this
proved to be both limiting and inflexible in terms of
project growth. More importantly, once the editors had
learned XML, they preferred working with it.</p>
        <p>A very important component of our XML workflow is the
use of Git as the version control system. This allows
different editors to work simultaneously on the same
source XML files, track changes and history, and resolve
conflicts. Since Git can be a challenging technology to use
in its original form (on the command line), we use the
GUI offered by GitHub (https://desktop.github.com/). While this client provides only
a subset of Git functionality, it proved to be adequate for
our use case and, importantly, required the editors to learn
only a handful of concepts and commands, such as pull,
commit and push.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Database design and content</title>
        <p>The complete database of the SB consists of four TEI
documents. SBL, PSBL and NSBL are each encoded as one
TEI document, which contains the TEI header, giving its
metadata, and the body with full text entries. The entries
are lightly structured, containing a series of paragraphs
and the bibliography list. Crucially, each entry has a
reference to the person entry in the fourth “authority” (or
“index”) document (SBI), which contains the structured
biographic data, further discussed below.</p>
        <p>Table 1 gives a quantitative overview of the current state
of the database. The first three lines give the number of
full-text entries in the SBL, PSBL, and NSBL documents,
and the fourth their sum. Next, the numbers of entries in
the authority SBI document are given. The SBI-family is
the number of family entries (as opposed to individual
persons) included in the document. Next, SBI-main, the
number of “main” person entries is given – these are the
entries that are linked to from the individual lexicons. It
should be noted that the sum of the entries in the lexicons
is greater than the sum of the family and person records in
the authority document, as some persons from the three
lexicons overlap. The table next gives the number of
SBI-sub entries, i.e. the number of “subordinate” person
entries in the authority document. These are structured
person entries that, however, do not have a corresponding
full-text description in the individual lexicons. As further
described in Section 4, these are persons related to the
main persons included in the lexicons, which were added
manually to the authority document and linked to their
main person. Finally, the table gives the sum of all the
family and person records in the SBI, currently almost
9,000.</p>
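        <p>[Table 1: Quantitative overview of the current state of the SB database; the surviving row labels are SBL, PSBL, NSBL and Σ.]</p>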
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Structure of the authority records</title>
        <p>The SBI authority document is encoded using the TEI
module for names, dates, people and places (TEI, Chapter
13), which allows for their detailed annotation. Individual
persons are encoded in &lt;person&gt; elements, while families
are encoded in &lt;personGrp&gt; elements.</p>
        <sec id="sec-2-3-2">
          <p>The structure of a typical person entry is illustrated in
Figure 2. The &lt;person&gt; element is given its canonical ID
and contains &lt;sex&gt;, &lt;persName&gt;, &lt;occupation&gt;, &lt;birth&gt;,
&lt;death&gt;, further containing &lt;date&gt; and &lt;placeName&gt;.
Dates have their ISO 8601 value in the @when attribute,
while other TEI attributes are used in cases when the exact
date is not known, e.g. &lt;date notAfter="0940"
source="#nsbl"&gt;pred letom 940&lt;/date&gt; (before year
940, also giving the source of this information, in this case
NSBL).</p>
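          <p>The handling of exact and imprecise dates can be illustrated with a minimal Python sketch. It is not part of the SB code base: the function name date_bounds is hypothetical, and only the @when, @notBefore and @notAfter attributes are considered. Real TEI documents place their elements in the TEI namespace, which the fragments below omit for brevity.</p>
          <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET


def date_bounds(date_xml):
    """Return (earliest, latest) ISO 8601 strings for a TEI <date> fragment.

    None marks a bound that the record leaves open.
    """
    el = ET.fromstring(date_xml)
    when = el.get("when")
    if when is not None:
        # Exact date: both bounds coincide.
        return (when, when)
    # Imprecise date: use whichever bounding attributes are present.
    return (el.get("notBefore"), el.get("notAfter"))


exact = date_bounds('<date when="1800-12-03">3. dec. 1800</date>')
fuzzy = date_bounds('<date notAfter="0940" source="#nsbl">pred letom 940</date>')
print(exact)  # ('1800-12-03', '1800-12-03')
print(fuzzy)  # (None, '0940')
```
]]></preformat>
          <p>Returning a pair of bounds rather than a single value lets a chronological index treat exact and approximate dates uniformly, at the cost of leaving open bounds to be interpreted by the caller.</p>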
          <p>For the &lt;settlement&gt; element, we use gazetteers to ensure
the standard form of settlement names. We also encode
information about historical settlements for those that are
not in existence any more or for settlements that still exist
but have changed their name.</p>
          <p>Of course, the actual structure, described above in
principle, varies to a certain extent and depends on the
information on a particular person; a detailed overview of
all the elements used is given in Section 2.5.</p>
          <p>The basic structured data was semi-automatically
extracted from the corrected full text. First, regular
expressions were written in Perl to extract pieces of
information from the full texts and produce the initial
authority records. For the most part, these regular
expressions were quite simple, e.g. transforming “r. 3.
dec. 1800” to &lt;birth&gt;&lt;date when="1800-12-03"&gt;3. dec.
1800&lt;/date&gt;&lt;/birth&gt;. The automatically produced
authority records were then manually checked and, where
necessary, corrected.</p>
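          <p>The kind of transformation described above can be sketched as follows. The original extraction was written in Perl; the Python version below is only an illustration under stated assumptions: the pattern, the function name extract_birth and the table of month abbreviations are ours, and the actual lexicons may use other abbreviated forms.</p>
          <preformat><![CDATA[
```python
import re

# Assumed Slovenian month abbreviations (without the trailing full stop).
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "maj": 5, "jun": 6,
          "jul": 7, "avg": 8, "sep": 9, "okt": 10, "nov": 11, "dec": 12}

# "r." (born), then day, month abbreviation and four-digit year.
BIRTH_RE = re.compile(r"\br\.\s+((\d{1,2})\.\s+([a-z]+)\.?\s+(\d{4}))")


def extract_birth(text):
    """Return a TEI <birth> fragment for the first birth date found, else None."""
    m = BIRTH_RE.search(text)
    if m is None:
        return None
    label, day, month_abbr, year = m.groups()
    month = MONTHS.get(month_abbr)
    if month is None:
        return None  # unknown month form: leave for manual annotation
    when = f"{int(year):04d}-{month:02d}-{int(day):02d}"
    label = " ".join(label.split())  # normalise line breaks inside the date
    return f'<birth><date when="{when}">{label}</date></birth>'


print(extract_birth("r. 3. dec. 1800"))
```
]]></preformat>
          <p>Cases that the pattern does not cover return None, which mirrors the workflow described above: automatic extraction produces initial records, and everything else is left to manual checking.</p>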
          <p>
            It should be noted that even where no automatic
annotation was possible, obligatory distinctions were
tagged manually, such as different variants of names that
were mentioned somewhere in the text of the article,
important activities apart from the primary occupation of
the person, or certain periods of time that marked
important milestones in their life. The major aspects of
this conversion process have been reported in more detail
in
            <xref ref-type="bibr" rid="ref12">Vide Ogrin and Erjavec (2007)</xref>.
          </p>
          <p>Where we were able to obtain high-quality images of the
person, these are also included. More recently, we started
to geocode all place names, i.e. to assign them GPS coordinates.
This is a four-step process:
            <list list-type="order">
              <list-item><p>extracting the place names from the source XML document,</p></list-item>
              <list-item><p>geocoding the extracted place names using Google’s geo-location service (https://developers.google.com/maps/documentation/geolocation),</p></list-item>
              <list-item><p>checking the results for correctness and resolving cases that returned no or multiple results, and</p></list-item>
              <list-item><p>populating the source XML with the GPS coordinates.</p></list-item>
            </list>
          </p>
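          <p>The four steps can be sketched in Python as shown below. This is a simplified illustration and not the SB implementation: the geocoder is a hypothetical stand-in for the external service (it returns a list of candidate coordinates), the coordinates are illustrative values, and appending &lt;geo&gt; to &lt;placeName&gt; is an assumption about where the coordinates are stored.</p>
          <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET


def geocode_places(root, geocoder):
    """Add <geo> to each <placeName>; return names that need manual review."""
    needs_review = []
    for place in list(root.iter("placeName")):
        settlement = place.find("settlement")
        if settlement is None or place.find("geo") is not None:
            continue
        name = (settlement.text or "").strip()   # step 1: extract the name
        hits = geocoder(name)                    # step 2: geocode it
        if len(hits) != 1:                       # step 3: no or multiple results
            needs_review.append(name)
            continue
        lat, lon = hits[0]
        geo = ET.SubElement(place, "geo")        # step 4: populate the XML
        geo.text = f"{lat} {lon}"
    return needs_review


# Hypothetical stand-in for the geolocation service; illustrative values only.
FAKE_GAZETTEER = {
    "Ljubljana": [(46.0569, 14.5058)],
    "Gorica": [(45.94, 13.62), (45.95, 13.65)],  # ambiguous on purpose
}


def fake_geocoder(name):
    return FAKE_GAZETTEER.get(name, [])


SAMPLE = """<listPerson>
<person><birth><placeName><settlement>Ljubljana</settlement></placeName></birth></person>
<person><birth><placeName><settlement>Gorica</settlement></placeName></birth></person>
<person><birth><placeName><settlement>Neznano</settlement></placeName></birth></person>
</listPerson>"""

root = ET.fromstring(SAMPLE)
review = geocode_places(root, fake_geocoder)
print(review)
```
]]></preformat>
          <p>Passing the geocoder in as a function keeps the sketch independent of any particular service and makes the review list, i.e. the input to the manual checking step, explicit.</p>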
          <p>To support the review process, we developed a separate
web application that allows the editors to quickly verify
and amend the resolved location on an interactive map.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>2.4 Occupation taxonomy</title>
        <p>Along with the detailed markup, we also recognized the
need for a taxonomy of occupations. Its construction
proceeded in a bottom-up fashion, i.e. we extracted and
normalised the occupations and
activities that occur in the full text of the articles. The
taxonomy is already quite detailed, with 1,077 categories,
of which 160 have subordinate categories (e.g.
“poklici-za-osebne-storitve”,
occupations for personal services), while 917 are leaf nodes (e.g. “brivec”,
barber); the maximum depth of the taxonomy is 3.
Apart from its formal identifier, each category has one or
(in about 20% of the cases) two Slovenian glosses, the second giving
the name of the occupation for the female gender, as
these often differ in Slovenian, e.g. “politik/političarka”,
male/female politician.</p>
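        <p>The shape of such a taxonomy can be made concrete with a small Python sketch. The Category class and the stats function are hypothetical, the toy tree contains only the examples mentioned above, and placing “brivec” under “poklici-za-osebne-storitve” as well as the identifier “politik” are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Category:
    ident: str                      # formal identifier
    glosses: tuple                  # (male,) or (male, female) gloss
    children: list = field(default_factory=list)


def stats(roots):
    """Return (inner_nodes, leaf_nodes, max_depth) for a forest of categories."""
    inner = leaves = max_depth = 0
    stack = [(cat, 1) for cat in roots]
    while stack:
        cat, depth = stack.pop()
        max_depth = max(max_depth, depth)
        if cat.children:
            inner += 1
            stack.extend((child, depth + 1) for child in cat.children)
        else:
            leaves += 1
    return inner, leaves, max_depth


toy = [
    Category("poklici-za-osebne-storitve", ("poklici za osebne storitve",),
             [Category("brivec", ("brivec",))]),
    Category("politik", ("politik", "političarka")),
]
print(stats(toy))  # (1, 2, 2)

# The reported figures are consistent: 160 inner + 917 leaf = 1,077 categories.
assert 160 + 917 == 1077
```
]]></preformat>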
        <p>The taxonomy is still a work in progress, the most obvious
reason being that new persons are being added
with the new NSBL volumes, and their occupations may be
new to the existing taxonomy.</p>
        <p>[Table 2: Elements used in the body of the authority SBI document. The surviving “Element” column lists: listPerson, personGrp, person, sex, idno, persName, forename, surname, name, genName, nameLink, birth, death, date, floruit, trait, occupation, roleName, placeName, settlement, country, geo, geogName, region, district.]</p>
        <p>To give a comprehensive picture of the types and numbers
of distinctions made, Table 2 lists all the elements
used in the body of the authority SBI document.
The elements are grouped roughly according to their
function. The first group repeats the SBI information from
the second part of Table 1, while the second group gives
the basic information about a person, including the
fine-grained classification of the parts of their names and
information about their birth, death, occupation(s) and
activities. The next group comprises the elements concerned with
locations, including the &lt;geo&gt; element, which gives their
geocoding. Next comes a group that specifies the picture
of the person, where available (currently just over a
thousand). The final group contains the list of relations
and the number of relations between persons that it
includes, followed by 750 notes, which give some basic
free-text information about a person, independent of the
lexicon documents. Altogether, the authority document
thus currently uses almost 160,000 TEI elements.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Implementation of the portal</title>
      <sec id="sec-3-1">
        <title>3.1 Technologies used</title>
        <p>Following the “XML as a data model” approach, we
chose BaseX (http://basex.org/), a high-performance and mature XML
database server that provides a rich querying mechanism
through XQuery (https://www.w3.org/XML/Query/) and XPath, as well as full-text
search via the integrated Lucene search engine. With this
approach, we are able to use the original XML data
directly in the database, with only minor structural
transformations for performance optimization.</p>
        <p>The implementation of the web portal and administrative
interfaces is based on the Django (https://www.djangoproject.com/) web framework, which
in this case serves primarily as the “glue” between BaseX
and the web interface. We also use Django to implement
specific editorial tools, such as the geolocation of place names.</p>
        <p>The Slovenian Biography recently started to serve part of
the data as JSON-LD (https://json-ld.org/spec/latest/json-ld/) segments, which are embedded in
the served HTML pages. JSON-LD is a recent serialization
format for Linked Data, or Semantic Web data, that
makes semantic annotation and publishing of data
relatively easy. However, due to the lack of extensive
biographical ontologies, it is not possible at this time to
export a large part of the data that is otherwise available in
the HTML (non-semantic) version. This lack of available
specialized ontologies or schemas is also a problem when
it comes to annotating and exposing the occupational
taxonomy discussed in Section 2.4.</p>
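        <p>As an illustration of embedding JSON-LD in a served page, the following Python sketch maps a flat person record to a JSON-LD description and wraps it in a script element. The vocabulary actually used by the portal is not specified here; schema.org and its Person properties are an assumption, as are the function names and the invented example record.</p>
        <preformat><![CDATA[
```python
import json


def person_jsonld(record):
    """Map a flat person record to a schema.org Person description (assumed vocabulary)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": record["name"],
    }
    # Optional fields are only emitted when the record provides them.
    for key, prop in (("birth_date", "birthDate"), ("death_date", "deathDate")):
        if record.get(key):
            data[prop] = record[key]
    if record.get("birth_place"):
        data["birthPlace"] = {"@type": "Place", "name": record["birth_place"]}
    return data


def embed(data):
    """Serialise the description as a <script> block for inclusion in an HTML page."""
    payload = json.dumps(data, ensure_ascii=False)
    # Guard against a literal "</script>" inside string values.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


# Invented record, for illustration only.
record = {"name": "Janez Novak", "birth_date": "1800-12-03", "birth_place": "Ljubljana"}
html = embed(person_jsonld(record))
print(html)
```
]]></preformat>
        <p>Only the fields that have a counterpart in the chosen vocabulary can be exported this way, which illustrates why the absence of richer biographical ontologies limits the amount of data that can be exposed.</p>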
      </sec>
      <sec id="sec-3-2">
        <title>3.2 The Web front-end</title>
        <p>The highly structured biographical data and metadata
are presented via the user-friendly web pages of
the portal. The pages render the TEI elements, which can be
rather complex in the case of detailed person names: these can
use up to five different elements and sometimes have
several variants.</p>
        <p>As shown in Figure 3, the available metadata is used for a
number of aggregation and navigation views such as an
alphabetical index, chronological index, browsable
occupational taxonomy and interactive map, which all
help users to explore the available data. The data is also
searchable through the use of simple and advanced
searches.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Current work</title>
      <p>The SB is a continuous work in progress: not only are
new entries being added on a regular basis, but we also
strive to make the biographical information more
accessible both to the average and to the more
demanding, research-oriented user.</p>
      <p>We are also working on the enrichment of the encoding, in
several directions. First, we are manually adding relations
between persons, mostly family relations, but also close
contemporaries, co-workers etc. This involves adding
&lt;relation&gt; elements, which give the type of relation and
references to the IDs of the related persons, e.g. &lt;relation
name = "parents" passive = "#sbi215416" active =
"#sbi215416-0 #sbi215416-1"/&gt;. As already mentioned
in Section 2.2, we are also adding and linking new
“subordinate” persons, which are related to the existing
ones, to the authority document.</p>
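      <p>How such a &lt;relation&gt; element can be read is shown in the following minimal Python sketch. The function name relation_triples is hypothetical; the sketch relies only on the TEI convention that @active and @passive hold space-separated pointers to person IDs.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET


def relation_triples(relation_xml):
    """Expand a TEI <relation> into (active_id, relation_name, passive_id) triples."""
    el = ET.fromstring(relation_xml)
    name = el.get("name")

    def ids(attr):
        # Attribute values are space-separated pointers of the form "#id".
        return [ref.lstrip("#") for ref in (el.get(attr) or "").split()]

    return [(a, name, p) for a in ids("active") for p in ids("passive")]


triples = relation_triples(
    '<relation name="parents" passive="#sbi215416" '
    'active="#sbi215416-0 #sbi215416-1"/>'
)
for triple in triples:
    print(triple)
# ('sbi215416-0', 'parents', 'sbi215416')
# ('sbi215416-1', 'parents', 'sbi215416')
```
]]></preformat>
      <p>In the example, the two “subordinate” records sbi215416-0 and sbi215416-1 are read as the parents of the main person sbi215416.</p>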
      <p>
        Second, named entities (i.e. person, location, organisation
and “other”) appearing in the text have been automatically
annotated using the Stanford Named Entity Recognizer
(NER) trained for Slovenian (<xref ref-type="bibr" rid="ref8">Ljubešić et al., 2013</xref>). On a
general, manually marked-up test corpus this NER
tool achieved an overall precision of 73% and a recall of
67%. However, the accuracy depends very much on the
type of named entity: it is highest for persons (P = 82.2%,
R = 86.7%) and lowest for the type “other” (P = 29.2%, R
= 15.4%). The automatically assigned NER markup is
now being manually verified and corrected.
      </p>
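      <p>For easier comparison across entity types, the reported precision and recall can be combined into the F1 score, their harmonic mean. The F1 values below are not reported by the cited evaluation; they are derived here from the figures quoted above, as a small Python sketch.</p>
      <preformat><![CDATA[
```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the text, expressed as fractions.
reported = {
    "overall": (0.73, 0.67),
    "person": (0.822, 0.867),
    "other": (0.292, 0.154),
}

scores = {label: f1(p, r) for label, (p, r) in reported.items()}
for label, score in scores.items():
    print(f"{label}: F1 = {score:.3f}")
```
]]></preformat>
      <p>The derived scores are approximately 0.70 overall, 0.84 for persons and 0.20 for the type “other”, which underlines how much manual correction the latter type requires.</p>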
      <p>Third, as can already be seen in Figure 1, the SB contains
many abbreviations, which were numerous and typical in
the print editions and were carried over into the digitised
version, where they are no longer necessary. We
plan to expand these abbreviations semi-automatically,
which poses two problems.
First, the citation form of the abbreviation must be known,
and, second, it needs to be inserted into the text in the
correct inflected form, which depends on
the context; as Slovenian is a highly inflected language,
this is a difficult problem. We plan to approach it in the
same way as the other tasks, i.e. by first using an automatic
method to pre-annotate the abbreviations and then
manually verifying the results. We have so far manually
annotated a sample of the SB containing 50 entries, and
we will use this dataset to train a machine learning system
to automatically expand and inflect the abbreviations. It
should be noted that we also have a background lexicon
containing most of the abbreviations used, together with
their expansions to the citation form.</p>
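      <p>The lexicon-based part of this task, i.e. pre-annotating abbreviations with their citation form, could look roughly like the Python sketch below. It deliberately does not attempt the inflection step, which is the hard part. The two lexicon entries, the function name pre_annotate and the use of the TEI &lt;choice&gt;, &lt;abbr&gt; and &lt;expan&gt; elements for the output are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
import re
from xml.sax.saxutils import escape

# Illustrative entries only: abbreviation -> expansion in citation form.
LEXICON = {"r.": "rojen", "u.": "umrl"}


def pre_annotate(text, lexicon=LEXICON):
    """Wrap known abbreviations in <choice><abbr/><expan/></choice> markup."""
    # Longest abbreviations first, so that longer entries win over their prefixes.
    alternatives = "|".join(
        re.escape(abbr) for abbr in sorted(lexicon, key=len, reverse=True)
    )
    # An abbreviation must not be the tail of a longer word (e.g. "doktor.").
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")")
    out, pos = [], 0
    for m in pattern.finditer(text):
        abbr = m.group(1)
        out.append(escape(text[pos:m.start()]))
        out.append(
            f"<choice><abbr>{escape(abbr)}</abbr>"
            f"<expan>{escape(lexicon[abbr])}</expan></choice>"
        )
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


print(pre_annotate("r. 3. dec. 1800"))
```
]]></preformat>
      <p>Keeping both the abbreviation and its expansion in the markup means that a later step, automatic or manual, can replace the citation form with the correctly inflected one without losing the original reading.</p>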
      <p>As already mentioned, we are also further elaborating our
occupational taxonomy. The taxonomy IDs and the
category descriptions are currently only in Slovenian,
and, apart from adding new categories, further work will
concentrate on translating the taxonomy to English and
harmonising it with standard occupational taxonomies,
such as SOC (Standard Occupational Classification), a
necessary step towards also exposing our data as
RDF linked open data.</p>
      <p>To improve search precision and provide a better user
experience, we are moving the implementation of the
search to Elasticsearch, which will allow us to
implement more fine-grained and weighted full-text
search, to include a lemmatiser and to offer a type-ahead
(autosuggest) search box.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>The paper presented the Slovenian Biography, i.e. the
source biographical lexicons that it contains, the process
of their up-translation to the TEI encoded digital edition,
methods used in editing and enhancing the data, the
architecture and functionality of the web portal and the
on-going work.</p>
      <p>
        The SB is already extensively used: the analysis of access
logs shows that in 2017 the portal had over 135,000
different users, who, during their 199,000 visits, accessed
332,000 pages, indicating that the SB is already perceived
as a valuable and useful resource. Nevertheless, in our
further work we also aim to focus on outreach, and on
enriching its data, by linking the SB documents better
with other Internet resources. We would like to connect
the person descriptions of SB to other on-line
biographical lexica, but also with relevant books, articles,
pictures and multimedia content found on stable
locations. We are also considering publishing the
authority data in the scope of the CLARIN.SI repository
under an open licence, so others can make use of it
directly, thus enabling e.g. statistical
        <xref ref-type="bibr" rid="ref1">(Anderson, 2007)</xref>
        or
GIS-based investigations
        <xref ref-type="bibr" rid="ref7">(Knowles &amp; Hillier, 2008)</xref>
        over
the data.
      </p>
      <p>Finally, most users will likely search for person names via
Google, and will typically first find the Wikipedia article
on the person, where it exists. We have already taken the
step of adding external links to SB articles from some
Wikipedia articles, and this practice should be continued and
intensified, also by adding stub articles to Wikipedia for
missing persons, with a link to the SB. In this way,
we also make it easier for others to write the relevant Wikipedia
articles.</p>
      <sec id="sec-5-1">
        <title>Acknowledgements</title>
        <p>The authors would like to thank the three anonymous
reviewers for their helpful comments and suggestions,
which we have taken into account to the limit of our
abilities. The work presented here was partially supported
by the Slovenian research infrastructure CLARIN.SI and
the Slovenian Research Agency programme “Knowledge
Technologies”.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name><surname>Anderson</surname>, <given-names>M.</given-names></string-name>
          (<year>2007</year>).
          <article-title>Quantitative history</article-title>.
          In W. Outhwaite &amp; S. Turner (Eds.),
          <source>The Sage Handbook of Social Science Methodology</source>.
          London: Sage Publications.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><surname>Cankar</surname>, <given-names>I.</given-names></string-name>
          et al. (eds.) (<year>1925</year>–1991).
          <source>Slovenski biografski leksikon</source>.
          Ljubljana: SAZU.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name><surname>Erjavec</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Ogrin</surname>, <given-names>M.</given-names></string-name>
          (<year>2005</year>).
          <article-title>Digital Critical Editions of Slovenian Literature: an Application of Collaborative Work Using Open Standards</article-title>.
          In Dobreva, M., Engelen, J. (eds.),
          <source>From Author to Reader: Challenges for the Digital Content Chain: Proceedings of the 9th ICCC International Conference on Electronic Publishing</source>,
          Arenberg Castle. Leuven: Peeters,
          <fpage>151</fpage>-<lpage>156</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name><surname>Javoršek</surname>, <given-names>J. J.</given-names></string-name>,
          <string-name><surname>Erjavec</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Vide Ogrin</surname>, <given-names>P.</given-names></string-name>
          (<year>2009a</year>).
          <article-title>Slovenian Biographical Lexicon - From a Digital Edition to an On-Line Application</article-title>.
          In:
          <source>The Future of Information Sciences: Digital Information and Heritage: Proceedings of the 1st International Conference The Future of Information Sciences - INFuture 2009</source>.
          Zagreb: Odsjek za informacijske znanosti, Filozofski fakultet, Sveučilište u Zagrebu,
          <fpage>115</fpage>-<lpage>124</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name><surname>Javoršek</surname>, <given-names>J. J.</given-names></string-name>,
          <string-name><surname>Erjavec</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Vide Ogrin</surname>, <given-names>P.</given-names></string-name>
          (<year>2009b</year>).
          <article-title>The digitisation and deployment of the Slovenian Biographical Lexicon</article-title>.
          In:
          <source>Research infrastructure for digital lexicography: proceedings of the 12th International Multiconference Information Society 2009, Mondilex Fifth Open Workshop</source>,
          Ljubljana, Slovenia, October 14-15, 2009. Ljubljana: Institut Jožef Stefan, pp.
          <fpage>64</fpage>-<lpage>71</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name><surname>Jevnikar</surname>, <given-names>M.</given-names></string-name>
          et al. (eds.) (<year>1974</year>–1994).
          <source>Primorski slovenski biografski leksikon</source>.
          Gorica: Goriška Mohorjeva družba.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Knowles</surname>
            ,
            <given-names>A. K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hillier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Placing history: how maps, spatial data, and GIS are changing historical scholarship</article-title>
          .
          <source>ESRI</source>
          , Inc.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name><surname>Ljubešić</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Stupar</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Jurić</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Agić</surname>, <given-names>Ž.</given-names></string-name>
          (<year>2013</year>).
          <article-title>Combining available datasets for building named entity recognition models of Croatian and Slovene</article-title>.
          <source>Slovenščina 2.0</source>
          (ISSN 2335-2736), thematic issue Jezikovne tehnologije,
          <volume>1</volume>(<issue>2</issue>).
          Ljubljana: Trojina, zavod za uporabno slovenistiko, pp.
          <fpage>35</fpage>-<lpage>57</lpage>.
          http://www.trojina.org/slovenscina2.0/arhiv/2013/2/Slo2.0_2013_2_03.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name><surname>Svetina</surname>, <given-names>B.</given-names></string-name>,
          et al. (<year>2013</year>–).
          <source>Novi Slovenski biografski leksikon</source>.
          Ljubljana: Založba ZRC, 2013–&lt;2017&gt;.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          TEI Consortium, eds.
          <source>Guidelines for Electronic Text Encoding and Interchange</source>
          . http://www.tei-c.org/P5/.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name><surname>Vide Ogrin</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Erjavec</surname>, <given-names>T.</given-names></string-name>
          (<year>2007</year>).
          <article-title>Towards a Digital Edition of the Slovenian Biographical Lexicon</article-title>.
          In:
          <source>The Future of Information Sciences: Digital Information and Heritage: Proceedings of the 1st International Conference The Future of Information Sciences - INFuture 2007</source>.
          Zagreb: Odsjek za informacijske znanosti, Filozofski fakultet, Sveučilište u Zagrebu,
          <fpage>115</fpage>-<lpage>124</lpage>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>