<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Practical Course in Corpus Linguistics for Students with a Humanist Background</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mihaela Vela</string-name>
          <email>m.vela@mx.uni-saarland.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hannah Kermes</string-name>
          <email>h.kermes@mx.uni-saarland.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Language Science and Technology, Saarland University</institution>
        </aff>
      </contrib-group>
      <fpage>49</fpage>
      <lpage>56</lpage>
      <abstract>
        <p>We present a practical course in corpus linguistics that provides students from a humanities background with the knowledge and skills needed to carry out an empirical study as the basis for a term paper or a BA or MA thesis. The course is part of a new Bachelor program and is combined with a theoretically oriented course on corpus linguistics. The challenge is to give students the necessary understanding of the underlying concepts and skills of corpus linguistics without overwhelming them with technical detail. The course material is modular, which makes it easy to update, modify and adapt, and it can be reused for different target groups, settings, and applications.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>In this paper we present a practical course in corpus
linguistics that provides students from
a humanities background with the
knowledge and skills needed to carry out an empirical study as the basis for
a term paper or a BA or MA thesis. The course is part
of the new Bachelor program Language Science and
is combined with a theoretically oriented course
on corpus linguistics. The students in the program
come from various backgrounds, including
translatology and language studies. Most
of them have little or no experience in
natural language processing.</p>
      <p>The challenge is to give them the
necessary understanding of the underlying concepts and
skills of corpus linguistics without overwhelming
them with technical detail. The course
material is modular, which makes it easy to update,
modify and adapt, and it can be reused
for different target groups, settings, and
applications. The described processes, analyses and
exercises are reproducible and portable to further
studies.</p>
      <p>In the following we discuss challenges for
teachers and students (Section 2) and describe the
general concept of the course (Section 3) and its
composition (Section 4). We conclude with a brief
summary and closing remarks (Section 5).</p>
    </sec>
    <sec id="sec-2">
      <title>2 Challenges for teachers and students</title>
      <p>A practical course on corpus linguistics for students
with a humanities background poses challenges for
both teachers and students.</p>
      <p>The challenges stem from the seemingly
opposed character of the digital applications and the
humanities disciplines as well as from the
character of a practical course requiring a lot of active
learning on the side of the students.</p>
      <p>Challenges for teachers include:</p>
      <list list-type="bullet">
        <list-item><p>motivating students and lowering the psychological and practical barriers</p></list-item>
        <list-item><p>trying to avoid or solve technical problems</p></list-item>
        <list-item><p>dealing with heterogeneous groups, both with regard to the prior knowledge of the students and with regard to their different learning paces</p></list-item>
        <list-item><p>keeping track of the learning success of the group and of individual students, and adjusting the teaching speed and/or type accordingly</p></list-item>
      </list>
      <sec id="sec-2-1">
        <title>Challenges for students include:</title>
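        <list list-type="bullet">
          <list-item><p>engaging with a potentially new kind of subject matter</p></list-item>
          <list-item><p>dealing with and solving technical problems</p></list-item>
          <list-item><p>coping with the high demands of active learning</p></list-item>
        </list>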
        <p>
          Motivating students to engage with the technical
aspects of corpus linguistics often boils down to
answering the question of how useful the
methodology is. Good and obvious examples of
applications in the students’ own discipline(s) demonstrate
the added value. Simple and understandable practical
exercises are useful in this respect, both to illustrate
the methods and to help lower potential
psychological barriers towards technical applications.
Motivating students remains a challenge throughout
a course, as technical aspects can easily become
cumbersome and tedious. Active learning plays an
important role here. We understand active learning as
an instructional method that engages students
in meaningful learning activities in the classroom
(e.g. doing exercises, working on and discussing
problems/results)
          <xref ref-type="bibr" rid="ref11 ref2">(Prince, 2004; Bonwell and
Eison, 1991)</xref>
          . It keeps the students active
and involved and gives them immediate feedback
on their learning success. The broader goal of the
session, however, should be made clear so that students
see why their activities are necessary.
        </p>
        <p>Technical problems with applications
and code are a challenge for both teachers and
students. Many technical problems, e.g. with
installing software, can be avoided by using
online tools, e.g. online corpora or web services
for corpus annotation. Another way to limit
technical problems is to provide sample code for
more complex examples, which only needs to be
modified or completed for exercises or later
application. This helps students focus on the main
aspects of the methodology, as the technical
difficulties are reduced to a minimum. Nevertheless,
technical problems cannot be avoided completely,
and it can even be good to provoke particular problems
in class. Problem solving, especially finding errors
in code or regular expressions, is also part of
corpus linguistic research. Working through
examples and exercises in class will inevitably lead to
some technical problems. However, as they occur
in class, the teacher can immediately help
to find and solve them.</p>
        <p>Teachers are often confronted with
heterogeneous groups, both with regard to prior
knowledge and with regard to learning speed.
Although this is a general challenge in teaching,
groups are often more heterogeneous in digital
humanities settings, and the differences are especially pronounced
when teaching technical skills. Again, an active
learning environment where examples and
exercises are worked through in class can help to adapt
to individual needs. It is easier for the teacher to
find out about individual problems and to provide
immediate support. It is also possible to provide
exercises for different levels or extra exercises for
more advanced students. Working on a problem as
a group can also foster deeper understanding.</p>
        <p>Having students present the results of exercises
and discuss them as a group helps teachers
keep track of the learning success of the group and of
individual students. This is important in order to
adjust the teaching speed and/or type accordingly.
The discussion of the results can also be used to
sum up and point to important aspects of a teaching
unit. This helps the students to reflect on their own
learning success and to evaluate the personal
organization and structure of their learning activities.</p>
        <p>In the following we describe the general
concept of the course and how it addresses the
challenges described above.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 General concept</title>
      <p>The course covers the two main aspects of corpus
linguistics: (i) corpus building and (ii) corpus
analysis. The main emphasis, however, is on corpus
analysis as depicted in Figure 1. Tutorials lead the
students through the main steps in corpus building
from the digitized text to an annotated searchable
corpus and from the linguistic research question
and corpus extraction to corpus analysis.</p>
      <p>The course is constructed like a sample study,
with each tutorial representing a particular step in the
process. In this sense, the tutorials in both parts
build on one another, as each tutorial produces the
input data for the next. However, as the necessary
sample data is provided at the beginning of each
tutorial, the tutorials are also self-contained. In the
first part, we create a corpus out of a small plain
text sample, adding metadata and basic linguistic
annotation. In the second part, we look at a
sample research question, extracting and analyzing the
respective data. Structuring the course
as a sample study allows the students to get
acquainted with the process of an empirical corpus
linguistic study, which facilitates a later application of
the methodology to a study of their own.</p>
      <p>The course material is provided as a website
with detailed step-by-step tutorials including the
necessary background information, examples with
sample data and exercises. Links to external
knowledge sources and tutorials provide access to
additional information, including more (technical)
details, more profound background or more complex
applications. The tutorials may be worked through
at an individual pace and adapted to the specific needs
of different target groups and individual students
by skipping sections or providing additional
information or exercises.</p>
      <p>The tutorials are written in R Markdown
and converted into HTML websites. Using
R Markdown for the source documents has several
advantages: (i) the tutorials are easy to modify, (ii)
additional information as well as new material can
easily be integrated, (iii) the students can download
the source document, which allows them to add individual
notes and to reproduce the provided sample
analysis.</p>
      <p>
        Although the tutorials were initiated as course
material for a university course, they can also
function as self-learning tutorials or as a knowledge
base and memory aid for later use when writing
a term paper or a BA or MA thesis. As a university
course it follows the concept of the inverted or flipped
classroom
        <xref ref-type="bibr" rid="ref1 ref5 ref5">(Bergmann and Sams, 2012; Handke
et al., 2012)</xref>
        in the sense that the sample study and
exercises are worked through in the classroom
individually or as a group, followed by a group
discussion. This makes it possible to address problems immediately,
discuss them as a group, work through advanced
concepts and engage in collaborative learning and
problem solving
        <xref ref-type="bibr" rid="ref14">(Tucker, 2012)</xref>
        , again more or less
simulating a research process in a team.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4 Tutorial Description</title>
      <p>The practical course described here consists of ten
incremental sessions on Corpus Linguistics. The
goal, as described in the previous section, is to get
students acquainted with the concepts and notions
of Corpus Linguistics. Each session is planned as
an interactive combination of a tutorial
(presented by the lecturers or prepared by the students)
and the corresponding exercises (solved in class
by the students with the lecturers’ assistance). The
tutorial corresponds to the theoretical part,
introducing a new topic, while the exercises are meant
as practical applications.</p>
      <p>In this section we describe the course in detail,
including not only the structure, content and technical
aspects of the course, but also the
infrastructure needed to teach this type of class. We decided
to publish the entire material of the course online<sup>1</sup>,
facilitating the availability and reproducibility of
all course materials. As mentioned before, we use
R Markdown for this purpose, which allows us to
combine narrative text and code and to produce
formatted output.</p>
      <p>
        As shown in Figure 1, the course is structured
into four blocks, Corpus Building, Corpus
Annotation, Corpus Query and Data Analysis, distributed
over ten sessions. Each session introduces a
new concept while building on previously
introduced concepts. In terms of corpora we use
the Royal Society Corpus (RSC)
        <xref ref-type="bibr" rid="ref9">(Kermes et al.,
2016)</xref>
        , a historical corpus of written scientific
English, as well as the BROWN
        <xref ref-type="bibr" rid="ref4">(Francis and Kučera,
1979)</xref>
        , FLOB
        <xref ref-type="bibr" rid="ref10">(Mair, 1999b)</xref>
        , LOB
        <xref ref-type="bibr" rid="ref8">(Johansson et al.,
1978)</xref>
        and FROWN
        <xref ref-type="bibr" rid="ref10">(Mair, 1999a)</xref>
        corpora, covering different time periods and registers
for both American and British English.
        <fn id="fn1"><label>1</label><p>http://fedora.clarin-d.uni-saarland.de/teaching/Corpus_Linguistics/index.html</p></fn>
      </p>
      <p>The course is structured as follows.</p>
      <sec id="sec-4-1">
        <title>Session 1: Corpus building with XML and TEI</title>
      </sec>
      <sec id="sec-4-2">
        <title>Session 2: Tagging with TreeTagger</title>
      </sec>
      <sec id="sec-4-3">
        <title>Session 3: Corpus annotation with WebLicht<break/>Session 4: Corpus query with regular expressions</title>
      </sec>
      <sec id="sec-4-4">
        <title>Session 5: Corpus query with patterns</title>
      </sec>
      <sec id="sec-4-5">
        <title>Session 6: Data extraction and data formats</title>
      </sec>
      <sec id="sec-4-6">
        <title>Session 7: Data analysis and data evaluation with Excel</title>
      </sec>
      <sec id="sec-4-7">
        <title>Session 8: Manipulating data sets with R<break/>Session 9: Normalization and frequency distribution with R</title>
      </sec>
      <sec id="sec-4-8">
        <title>Session 10: Plotting analysis results with R</title>
        <p>Session 1 belongs to the Corpus Building block
and provides an introduction to XML (Extensible
Markup Language) and TEI (Text Encoding
Initiative). The goal of this class is to make students
understand the importance and the syntax of
markup languages when working with corpora. The
class starts with an exercise in which students are asked to
mark the title, paragraphs and sentences in a .txt file.
The different solutions show the
possible variation in marking linguistic units, here title,
paragraphs and sentences, and illustrate why
a standardized markup language is necessary. The
tutorial first introduces the XML syntax, followed
by the TEI syntax. To round things off, the session ends
with an exercise on encoding the same text
according to the TEI guidelines.</p>
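        <p>The encoding step can be illustrated with a minimal sketch, which is not taken from the course material. It assumes Python and its standard xml.etree.ElementTree module, a hypothetical helper encode_as_tei, and an invented two-paragraph sample. The element names (TEI, teiHeader, fileDesc, titleStmt, title, text, body, p, s) are TEI elements, but the result is deliberately simplified: it omits the TEI namespace and the other header parts a schema-valid TEI document would require.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET


def encode_as_tei(title, paragraphs):
    """Wrap a title and paragraphs (lists of sentences) in a simplified TEI-like tree.

    Hypothetical helper for illustration; the output is not schema-valid TEI.
    """
    tei = ET.Element("TEI")
    header = ET.SubElement(tei, "teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = title

    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    for sentences in paragraphs:
        p = ET.SubElement(body, "p")          # one <p> per paragraph
        for sentence in sentences:
            ET.SubElement(p, "s").text = sentence  # one <s> per sentence
    return tei


# Invented sample text: two paragraphs, three sentences in total.
sample = encode_as_tei(
    "A short sample text",
    [
        ["This is the first sentence.", "This is the second one."],
        ["A new paragraph starts here."],
    ],
)
tei_xml = ET.tostring(sample, encoding="unicode")
print(tei_xml)
```
]]></preformat>
        <p>Because every unit is marked with the same element names, two annotators who encode the same text this way produce comparable files, which is the point the opening exercise is meant to make.</p>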
        <p>As described in Section 3, we provide
additional information beyond the scope of the specific
class. In Session 1 this additional information is
provided as links, as shown in Figure 2.</p>
        <p>
          Session 2 and Session 3 are part of the Corpus
Annotation block. Session 2 deals with
part-of-speech tagging, including its definition and the
introduction of the concept of a tagset. More
specifically, this class also provides an introduction to
the usage and configuration options of the
TreeTagger
          <xref ref-type="bibr" rid="ref12 ref13">(Schmid, 1994; Schmid, 1995)</xref>
          . The exercise
deals with the installation and usage of the
TreeTagger and with tagging both .txt files
and .xml files, as depicted in Figure 3.
        </p>
        <p>
          Session 3 goes one step further by introducing
additional annotation layers on top of the already existing
part-of-speech annotation. This is carried out with
WebLicht
          <xref ref-type="bibr" rid="ref7">(Hinrichs et al., 2010)</xref>
          , a web-based
environment for the annotation of corpora. It includes
tools for tokenization, lemmatization, part-of-speech tagging
and parsing (among others), which can be
combined individually into tool chains. The WebLicht
tutorial describes the usage of the tool with
screenshots and examples. The exercise
for the students is to build a processing chain in
WebLicht including at least a tokenizer and the
TreeTagger. The files used for this exercise are the
same as in Session 2.
        </p>
        <p>
          Session 4 and Session 5 belong to the Corpus Query block
and are concerned with the qualitative analysis of the
texts. Session 4 is meant as an introduction to
regular expressions, defining the concept of a regular
expression and explaining, by means of examples, the
special characters and their role in formulating a
regular expression. After practicing formulating
queries with regular expressions in Notepad++<sup>2</sup>,
the students are introduced to the Saarland
University CQPweb
          <xref ref-type="bibr" rid="ref6">(Hardie, 2012)</xref>
          platform and to
the CQP syntax
          <xref ref-type="bibr" rid="ref3">(Evert and Hardie, 2011)</xref>
          <sup>3</sup>. The
tutorial consists of a series of example queries
meant to consolidate the knowledge about regular
expressions. The corresponding exercises consist
of a set of queries to be carried out in CQPweb
on the RSC and BROWN corpora. An example of
such an exercise can be found in Figure 4.
        </p>
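        <p>The role of a few special characters can be sketched as follows. This is an illustration under our own assumptions, not one of the course exercises: it uses Python's standard re module rather than Notepad++ or CQPweb, and the token list is invented. We use re.fullmatch so that a pattern has to cover the whole token.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re

# Invented token list for illustration.
tokens = ["analyse", "analyze", "analysed", "analysis",
          "colour", "color", "observations"]

# [sz] is a character class: exactly one of "s" or "z".
spelling_variants = [t for t in tokens if re.fullmatch(r"analy[sz]e", t)]

# ? makes the preceding character optional: "u" may or may not occur.
colour_variants = [t for t in tokens if re.fullmatch(r"colou?r", t)]

# . matches any single character, * repeats it zero or more times.
analy_prefix = [t for t in tokens if re.fullmatch(r"analy.*", t)]

# + requires at least one repetition: any non-empty string followed by "s".
ending_in_s = [t for t in tokens if re.fullmatch(r".+s", t)]

print(spelling_variants)
print(colour_variants)
print(analy_prefix)
print(ending_in_s)
```
]]></preformat>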
        <p>Session 5 continues Session 4. Building
on regular expression syntax, it extends the simple
search queries introduced before to more complex
queries including patterns. In the exercise part
of the class the students are asked to build
their own patterns using the CQP syntax and to
query the RSC and BROWN corpora again using
CQPweb, as shown in Figure 5.
          <fn id="fn2"><label>2</label><p>https://notepad-plus-plus.org/</p></fn>
          <fn id="fn3"><label>3</label><p>https://corpora.clarin-d.uni-saarland.de/cqpweb/</p></fn>
        </p>
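        <p>What a pattern over annotated tokens does can be approximated outside CQPweb. The following sketch is our own simplification in Python, with a hypothetical helper find_pattern and an invented, hand-tagged token sequence. It matches a fixed-length sequence of part-of-speech tags, comparable in spirit to a CQP query such as [pos="JJ.*"] [pos="NN.*"], and does not cover quantifiers or the other features of the CQP query language.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re

# Invented, hand-tagged sample: (word, part-of-speech tag) pairs.
tagged = [
    ("the", "DT"), ("royal", "JJ"), ("society", "NN"),
    ("published", "VVD"), ("many", "JJ"), ("scientific", "JJ"),
    ("papers", "NNS"),
]


def find_pattern(tokens, tag_patterns):
    """Return all word sequences whose tags match tag_patterns position by position.

    Hypothetical helper: each entry in tag_patterns is a regular expression
    that must match the complete tag of the token at that position.
    """
    n = len(tag_patterns)
    matches = []
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(re.fullmatch(pattern, tag)
               for pattern, (_, tag) in zip(tag_patterns, window)):
            matches.append(" ".join(word for word, _ in window))
    return matches


# An adjective directly followed by a noun (singular or plural).
adj_noun = find_pattern(tagged, ["JJ.*", "NN.*"])
print(adj_noun)
```
]]></preformat>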
        <p>Session 6 belongs to both the Corpus Query and
the Data Analysis block, as it deals with the
results of a query. The result of a specific linguistically
motivated query is usually a data set containing
information about a particular linguistic phenomenon
extracted from a particular corpus. In this class
students are introduced to the concept of a data set
and to creating, formatting and manipulating it using
the online search tool CQPweb. Closely related to
the data set, this class introduces the notions of
observations, features and values of features.
During the practical part students are asked to create
their own data set based on a research question
(e.g. distribution of content verbs and their
parts-of-speech across registers in a specific corpus) and
to formulate the research question in terms of query,
observations and features.</p>
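        <p>The relation between observations, features and values can be made concrete with a small sketch. It is an assumption-laden illustration in Python, not an export from CQPweb: the query hits, the feature names and the tab-separated layout are invented for the example. Each row is one observation, each column one feature, and each cell the value of that feature.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import csv
import io

# Invented query hits: one tuple per observation.
hits = [
    ("observed", "VVD", "science"),
    ("observe", "VV", "science"),
    ("said", "VVD", "fiction"),
    ("observed", "VVN", "fiction"),
]

# Hypothetical feature names chosen for this example.
features = ["word", "pos", "register"]

# One dictionary per observation, mapping feature name to value.
data_set = [dict(zip(features, hit)) for hit in hits]

# Write the data set as a tab-separated table with a header row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=features,
                        delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(data_set)
table = buffer.getvalue()
print(table)
```
]]></preformat>
        <p>A table in this shape can be opened directly in a spreadsheet program or read into R as a data frame.</p>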
        <p>Sessions 7 to 10 belong to the Data Analysis
block, introducing basic data analysis and data
evaluation methods such as frequency distribution,
normalization and statistical significance testing using
the χ<sup>2</sup> (chi-square) test. The statistical analyses in
these sessions are based on the queries and data
extracted in the previous sessions.</p>
        <p>Session 7 is a gentle introduction to data
analysis, introducing (with relevant examples) all
theoretical notions related to frequency distribution,
normalization and χ<sup>2</sup>. These concepts
are put into practice in exercises carried out in
Excel/LibreOffice. Excel/LibreOffice is not the
state of the art in statistical analysis but has one big
advantage: the statistical analysis can be performed
step by step, allowing students to follow the
path to the final formula. Understanding how a
specific formula works (including the intermediate
steps) is a great benefit for learners, who need to
use this kind of knowledge later in their academic
studies.</p>
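        <p>The intermediate steps can also be written out explicitly. The sketch below is a hedged illustration in Python, not the spreadsheet used in class: the function names per_million and chi_square_2x2 are hypothetical and the counts are invented. It normalizes raw frequencies to occurrences per million tokens and computes Pearson's χ<sup>2</sup> statistic for a 2x2 table (word versus all other tokens, in two corpora) without continuity correction.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def per_million(count, corpus_size):
    """Normalize a raw frequency to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def chi_square_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson's chi-square statistic for a word in two corpora.

    Rows are the corpora, columns are "the word" and "all other tokens".
    No continuity correction is applied.
    """
    observed = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [freq_a + freq_b, total - freq_a - freq_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the assumption of no difference.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Invented counts: 150 hits in corpus A, 100 hits in corpus B,
# each corpus containing one million tokens.
norm_a = per_million(150, 1_000_000)
norm_b = per_million(100, 1_000_000)
chi2 = chi_square_2x2(150, 1_000_000, 100, 1_000_000)

# Critical value of the chi-square distribution for 1 degree of
# freedom at the 0.05 significance level.
CRITICAL_05_DF1 = 3.841
significant = chi2 > CRITICAL_05_DF1

print(norm_a, norm_b, round(chi2, 3), significant)
```
]]></preformat>
        <p>Each expected count in the loop corresponds to one intermediate cell that students would fill in by hand in a spreadsheet before summing the contributions.</p>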
        <p>Session 8 introduces statistical analysis with R in
RStudio. It introduces basic notions related to
R, including data manipulation such as adding
column names, adding additional variables (columns),
summarizing the data, and merging and combining two
or more data sets, and presents appropriate
examples for each of these topics. In the exercise part
of this session students are asked to extract
similar data sets from other corpora, applying the same
data manipulation as used in the examples. At the
end all data sets are combined into one large data set,
which is then analyzed in Session 9. Figure 7
shows the introductory part related to data frames
in R.</p>
        <p>Session 9 relates to Session 7 in that it is
concerned with basic data analysis and data evaluation
methods such as frequency distribution (see
Figure 8), normalization and statistical significance
testing using the χ<sup>2</sup> test. The two sessions differ in
the tools used for the analysis. While data analysis
in Session 7 is exemplified using Excel, Session
9 uses R. The exercises from Session 7 are repeated
in Session 9 to show how the tools relate to each other.
However, it is also shown that R is more powerful when
dealing with multivariate data sets, extending the
analysis performed in Session 7.</p>
        <p>Session 10 continues Session 9,
introducing additional aspects of data analysis and showing
how to visualize and interpret data from different
perspectives (see Figure 8). Students prepare the
different visualizations and their interpretation in
groups. The results are then discussed together in
class. These discussions make explicit how important it is to verify the
interpretation of the macro perspective of the visualization
against the micro perspective, i.e. the examples from the
corpora (concordance lines), thereby linking and intertwining quantitative
and qualitative analysis.</p>
        <p>The R Markdown documents used throughout
Sessions 8 to 10 include sample code for data
manipulation and data analysis (see Figures 8 and 9).</p>
        <p>The documents are modular, so that they can be applied
to different data sets and the included code can be copied, modified
and adapted. The modification
and adaptation is exemplified in the exercises, and
the students are encouraged to make notes about
technical aspects and interpretations.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion</title>
      <p>We presented a ten-session practical course on
Corpus Linguistics for students with a humanities
background. The structure of the course is based on
active learning methods to address the challenges
of teaching a technical course to students with little
or no technical background. An active learning
environment encourages students to work on research
questions alone or as a group, addressing (technical)
challenges, solving technical and research-related
problems and discussing results. The role of the
teacher shifts towards that of an assistant who
answers questions, nudges students in the right direction
and helps to find solutions. The course and the
course material, as presented here, can easily be
modified, adapted and extended.
This makes the course and its
material applicable to different target groups and
settings, and makes the creation of such material worth
the effort.</p>
      <p>Christian Mair, 1999b. The Freiburg-LOB Corpus (FLOB).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Jonathan</given-names>
            <surname>Bergmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Sams</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Flip Your Classroom: Reach Every Student in Every Class Every Day</article-title>
          . International Society for Technology in Education, Eugene, Or.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Charles C.</given-names>
            <surname>Bonwell</surname>
          </string-name>
          and
          <string-name>
            <given-names>James A.</given-names>
            <surname>Eison</surname>
          </string-name>
          .
          <year>1991</year>
          .
          <source>Active Learning: Creating Excitement in the Classroom</source>
          . Number 1, 1991 in ASHE-ERIC Higher Education Report. School of Education and Human Development, George Washington University, Washington, DC.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Evert</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Hardie</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Twenty-First Century Corpus Workbench: Updating a Query Architecture for the New Millennium</article-title>
          .
          <source>In Proceedings of the Corpus Linguistics 2011 Conference</source>
          , Birmingham, UK.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>W.N.</given-names>
            <surname>Francis</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Kučera</surname>
          </string-name>
          .
          <year>1979</year>
          .
          <article-title>Manual of Information to Accompany A Standard Corpus of Present-Day Edited American English, for Use with Digital Computers</article-title>
          . Brown University, Department of Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Jürgen</given-names>
            <surname>Handke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Sperl</surname>
          </string-name>
          , and Deutsche ICM-Konferenz, editors.
          <year>2012</year>
          .
          <source>Das inverted classroom model: Begleitband zur ersten deutschen ICM-Konferenz</source>
          . Oldenbourg, München. OCLC: 810266426.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Hardie</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>CQPweb - Combining Power, Flexibility and Usability in a Corpus Analysis Tool</article-title>
          .
          <source>International Journal of Corpus Linguistics</source>
          ,
          <volume>17</volume>
          (
          <issue>3</issue>
          ):
          <fpage>380</fpage>
          -
          <lpage>409</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Erhard</given-names>
            <surname>Hinrichs</surname>
          </string-name>
          , Marie Hinrichs, and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Zastrow</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>WebLicht: Web-based LRT services for German</article-title>
          .
          <source>In Proceedings of the ACL 2010 System Demonstrations</source>
          , pages
          <fpage>25</fpage>
          -
          <lpage>29</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Stig</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Geoffrey</given-names>
            <surname>Leech</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Helen</given-names>
            <surname>Goodluck</surname>
          </string-name>
          .
          <year>1978</year>
          .
          <article-title>Manual of Information to Accompany the Lancaster-Oslo/Bergen Corpus of British English, for Use with Digital Computers</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Hannah</given-names>
            <surname>Kermes</surname>
          </string-name>
          , Jörg Knappen, Stefania Degaetano-Ortlieb, and
          <string-name>
            <given-names>Elke</given-names>
            <surname>Teich</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The Royal Society Corpus: From Uncharted Data to Corpus</article-title>
          .
          <source>In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'16)</source>
          , Portorož, Slovenia.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Christian</given-names>
            <surname>Mair</surname>
          </string-name>
          ,
          <year>1999a</year>
          .
          <source>The Freiburg-Brown Corpus (Frown)</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Michael</given-names>
            <surname>Prince</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Does active learning work? A review of the research</article-title>
          .
          <source>Journal of engineering education</source>
          ,
          <volume>93</volume>
          (
          <issue>3</issue>
          ):
          <fpage>223</fpage>
          -
          <lpage>231</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Helmut</given-names>
            <surname>Schmid</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>Probabilistic Part-of-Speech Tagging Using Decision Trees</article-title>
          .
          <source>In International Conference on New Methods in Language Processing</source>
          , pages
          <fpage>44</fpage>
          -
          <lpage>49</lpage>
          , Manchester, UK.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Helmut</given-names>
            <surname>Schmid</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Improvements in Part-of-Speech Tagging with an Application to German</article-title>
          .
          <source>In Proceedings of the ACL SIGDAT-Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Bill</given-names>
            <surname>Tucker</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>The flipped classroom</article-title>
          .
          <source>Education next</source>
          ,
          <volume>12</volume>
          (
          <issue>1</issue>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>