<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Treebanks for the Ordinary Working Grammarian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joel Priestley</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anders Nøklestad</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kristin Hagen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anu Laanemets</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dag Trygve Truslew Haug</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Linguistics and Scandinavian Studies, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Humit - Centre for Digital Development, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <fpage>313</fpage>
      <lpage>321</lpage>
      <abstract>
        <p>In this paper we present how three treebanks of Norwegian have been incorporated into the Glossa search interface, allowing users without specialized training to formulate queries based on syntactic information. One of the treebanks contains written material (mostly newspaper text, but also blogs, magazines and other genres), and the other two are based on transcriptions of spoken dialects. The user interface is simple and only allows access to selected features of the annotation. We show through two case studies how it can nevertheless be useful for the large group of linguists who do not have the time or inclination to learn a full treebank query language. We argue that our tool fills an important gap and can help bring treebank data to new users.</p>
      </abstract>
      <kwd-group>
        <kwd>corpora</kwd>
        <kwd>query interfaces</kwd>
        <kwd>syntax</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>By now, text corpora are a standard tool in linguistics, found across most subdisciplines from
historical linguistics to theoretical syntax. Typically, the corpora that are used consist of raw
text with rich metadata (with features such as genre, dialect, date, author gender and age, and
much more) and some linguistic annotation such as part-of-speech tags. Numerous tools and
web pages are available that make the exploration of such data easy without specialized
training, such as Sketch Engine, the COCA web interface, and – especially for Norwegian corpora –
the Glossa web interface<sup>1</sup>.</p>
      <p>Until recently, more complex linguistic annotation, such as syntax, was only found in a few
specialized resources such as the Penn Treebank [8]. But over the last decade, the Universal
Dependencies (UD) framework [14, 2] for annotating dependency syntax has spurred the creation
of many more treebanks. Currently more than 200 treebanks for more than 150 languages are
available on the Universal Dependencies webpage<sup>2</sup>. But there are few if any tools available
that make treebanks accessible to ordinary humanities scholars without special training.</p>
      <p>Partly this is due to the inherent complexity of syntactic annotation, which is not reducible
to attributes of words, but involves labelled relations between words that give rise to a nested
structure. Most tools that let users explore syntactically annotated corpora are therefore based
on a tree (or graph) description language such as INESS [9], Grew [5] or Semgrex [1]. These
let the user specify arbitrarily complex constraints on the syntactic structure, but the learning
curve is often steep.</p>
      <p>In this paper, we describe a different approach, where we offer easy access to some important
aspects of syntactic annotation within Glossa, a corpus tool focused on user-friendliness, where
all queries can be done with a combination of dropdown menus and a Google-like text search
box. We describe how three treebanks of Norwegian, one with written language texts and
two with spoken language texts, have been imported into Glossa and show with two
case studies that interesting queries can be formulated even in this simpler setting. While
acknowledging that in-depth studies of treebank data will require more advanced tools, we
argue that easy-access tools like ours allow a broad audience of linguists and other humanities
scholars to make use of data that otherwise would be out of reach.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The data</title>
      <p>At present there are three treebanks imported into Glossa: the Norwegian Dependency
Treebank (NDT, [13]), the LIA Treebank (LIA, [11]) and the Nordic Dialect Corpus Treebank
(NDC, [6]). NDT has two parts, one for text written in
the Norwegian standard “Nynorsk” and one for “Bokmål”<sup>3</sup>. LIA and NDC are treebanks with
transcriptions of Norwegian spoken dialects, LIA with transcriptions in Nynorsk and NDC with
transcriptions in Bokmål.</p>
      <p>NDT, LIA and NDC are all dependency treebanks comprising words annotated with
morphological features, syntactic functions, and hierarchical structures. The treebanks are available
for download in CoNLL format. The annotations were made with different automatic tools,
but every annotation was subsequently proofread and corrected by one or two linguists; see
the references for details of each treebank.</p>
      <p>NDT Nynorsk and NDT Bokmål have approximately 300,000 tokens each, collected from
newspapers, magazines, and blogs. For the most part, the annotations follow the analyses in
the Norwegian Reference Grammar [4], but detailed annotation guidelines were also
developed to document the dependency grammar analyses<sup>4</sup>. The Norwegian Dependency Treebank
was developed by Språkbanken at the National Library.</p>
      <p>The LIA Treebank includes 7,536 speech segments and 77,701 tokens from transcriptions in
Nynorsk from the speech corpus LIA Norwegian - Corpus of historical dialect recordings. The
recordings in the treebank took place between 1958 and 1981, and the 41 speakers come from
21 places in different dialect areas of Norway.</p>
      <p><sup>2</sup> https://universaldependencies.org/</p>
      <p><sup>3</sup> “Nynorsk” and “Bokmål” are two different written standards for Norwegian.</p>
      <p><sup>4</sup> https://www.nb.no/sbfil/dok/20140314_guidelines_ndt_english.pdf</p>
      <p>The NDC Treebank contains 4,637 speech segments and 66,042 tokens from the Bokmål
transcriptions in the Norwegian part of the Nordic Dialect Corpus. The recordings took place
between 2007 and 2010, and the 43 speakers come from 17 places in the same dialect areas as
the speakers in the LIA Treebank.</p>
      <p>Both the LIA and the NDC Treebank have been transcribed in two ways, one (quasi)
phonetically and one orthographically. In the Glossa interface you can search both transcriptions.
On the results page you can also listen to and watch (there are video recordings in NDC) the
original recordings for the search results.</p>
      <p>Since spoken language contains speech features like pauses, unfinished/incomplete words
and disfluencies such as repairs and deletions, we had to adapt and add to the NDT guidelines to
cover transcription and annotation of the spoken treebanks. For example, the transcribed texts
are divided into speech segments. A speech segment is our spoken language approximation
of a sentence. Speech segments can lack otherwise required syntactic features like verbs and
subjects, or they can contain only adverbials or interjections. Pauses are transcribed simply as
# and ## and incomplete words are written as they are spoken, demarcated with a hyphen (“-”)
and given the morphological label “ufullst” (short for incomplete). Repairs and deletions get
their own syntactic labels: REP and SLETT (delete). For a more detailed description, see [11,
6].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Syntax in Glossa</title>
      <p>Glossa is a user-friendly and functional search interface developed and upgraded at the
University of Oslo over the past twenty years [10]. Glossa is used for more than 40 written,
multilingual and speech corpora. The easiest search option is to write one or more words in a
Google-like search box and filter the results using a metadata menu on the left. The results are
given as concordances, and for speech corpora the results are linked to audio and video. There
is also an easy-to-use extended search box with clickable boxes and menus for
finding e.g. lemmas or parts of speech and other morphological information, see Figure 1, as well
as an option to access the underlying query language (CQP, see below). Search results can be
exported in Excel or CSV/TSV format. In the following we describe how Glossa was adapted
to handle treebanks with syntactic information.</p>
      <p>The query engine used in Glossa is the Corpus Query Processor (CQP) from the IMS Open
Corpus Workbench (CWB; [3]). Although the CWB has some limited support for (non-recursive)
structural attributes, it is mainly geared towards searching in token-level annotation. This
makes dependency grammar a better fit for Glossa than phrase structure grammars or other
types of grammar based on hierarchical structures.</p>
      <p>For the treebanks in Glossa, information about dependents and heads is fetched from files
in CoNLL format and stored in positional CWB attributes, i.e., attributes associated with
individual tokens. The CWB format uses XML tags for structural attributes, such as sentences,
speech segments, etc. These tags encompass a one-word-per-line, tab-separated representation
of texts, where each column holds a specified annotation. The additional annotations from the
CoNLL files are simply appended as successive columns: Function, Index (1-based, as a Dependency
value of 0 implies root status) and Dependency.</p>
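      <p>As a rough illustration of the conversion described above, the following Python sketch turns one CoNLL sentence into a CWB-style vertical block. It is a minimal sketch and not the actual Glossa import code: the function name conll_to_vertical, the CoNLL-X column positions, the order of the leading word, lemma and part-of-speech columns, and the sample rows (including the SUBJ and FINV labels) are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
# Hypothetical sketch: one CoNLL sentence to CWB-style vertical text.
# Assumes CoNLL-X column order: ID, FORM, LEMMA, CPOSTAG, POSTAG,
# FEATS, HEAD, DEPREL. The output column order is also an assumption.


def conll_to_vertical(conll_sentence):
    """Wrap one sentence in <s> tags, one tab-separated token per line."""
    lines = ["<s>"]
    for row in conll_sentence.strip().splitlines():
        cols = row.split("\t")
        index, form, lemma, pos = cols[0], cols[1], cols[2], cols[3]
        head, function = cols[6], cols[7]
        # Existing columns first, then the appended syntactic columns:
        # Function, Index, Dependency (0 marks the root).
        lines.append("\t".join([form, lemma, pos, function, index, head]))
    lines.append("</s>")
    return "\n".join(lines)


# Invented sample rows for "De er raske" (labels are illustrative only).
sample = "\n".join([
    "1\tDe\tde\tpron\tpron\t_\t2\tSUBJ",
    "2\ter\tvaere\tverb\tverb\t_\t0\tFINV",
    "3\traske\trask\tadj\tadj\t_\t2\tSPRED",
])

print(conll_to_vertical(sample))
```
]]></preformat>
      <p>Because every syntactic value ends up as an ordinary token-level column, it can be queried like any other positional attribute.</p>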
      <p>While these extra annotations are taken directly from the CoNLL files, a fourth attribute is
derived, namely syntactic level, which is assigned one of three clause types: Main, Dependent
or Infinitive. To achieve this, a sentence is scanned to identify verbs in non-root positions, i.e.,
verbs with a non-zero Dependency value. Such verbs are then either tagged as Dependent or
Infinitive, according to their morphosyntactic features. A simple parse of the sentence is then
performed to create a tree structure. Starting at the identified node in the parse tree, all nodes
in its subtree can then be given the same syntactic level tag. All other nodes receive the Main
tag.</p>
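      <p>The derivation of the syntactic level can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not the procedure actually implemented: the Token class, the feature string format, the rule that a non-root verb carrying an "inf" feature opens an Infinitive clause while any other non-root verb opens a Dependent clause, and the choice to let the innermost clause win when clauses are nested are all hypothetical.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int  # 1-based position in the sentence
    head: int   # index of the head token; 0 marks the root
    pos: str    # part of speech
    feats: str  # morphosyntactic features, "|"-separated (assumed format)


def assign_levels(tokens):
    """Return a dict mapping token index to Main, Dependent or Infinitive."""
    children = {}
    for tok in tokens:
        children.setdefault(tok.head, []).append(tok.index)
    by_index = {tok.index: tok for tok in tokens}
    levels = {tok.index: "Main" for tok in tokens}

    # Walk the tree top-down from the root, carrying the current level.
    stack = [(idx, "Main") for idx in children.get(0, [])]
    while stack:
        idx, level = stack.pop()
        tok = by_index[idx]
        if tok.pos == "verb" and tok.head != 0:
            # Assumed rule: a non-root verb opens a new clause level.
            is_inf = "inf" in tok.feats.split("|")
            level = "Infinitive" if is_inf else "Dependent"
        levels[idx] = level
        # The whole subtree inherits the level unless a deeper verb overrides it.
        stack.extend((child, level) for child in children.get(idx, []))
    return levels


# "Hvis han ikke har rett er saken avgjort" with an invented analysis.
sentence = [
    Token(1, 4, "sbu", ""), Token(2, 4, "pron", ""), Token(3, 4, "adv", ""),
    Token(4, 6, "verb", "pres"), Token(5, 4, "noun", ""),
    Token(6, 0, "verb", "pres"), Token(7, 6, "noun", ""), Token(8, 6, "adj", ""),
]
print(assign_levels(sentence))
```
]]></preformat>
      <p>In this invented example, tokens 1 to 5 (the conditional clause) come out as Dependent and tokens 6 to 8 as Main. A real implementation would need more refined feature tests, for instance for non-finite verbs inside complex verb groups.</p>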
      <p>There are two alternatives for accessing the new layers of annotation in Glossa. Queries can
be formulated directly in the CQP query field. This requires some basic knowledge of regular
expressions as well as CQP-specific features. A simple search for the token “mellom” within a
dependent clause, for example, would be as follows: [word="mellom" %c &amp; niv="led"].</p>
      <p>A more intuitive option is to use the extended menu, which lists all available attributes, such
as parts of speech, their relevant morphosyntactic features, as well as syntactic functions and
level, as shown in Figure 1. A benefit of using this method is that the risk of selecting mutually
exclusive attributes is removed.</p>
      <p>Trees are rendered with SVG (Scalable Vector Graphics). Glossa packs CQP output into a
JSON object which is passed to a React component. The component first groups dependent
nodes according to their proximity. Adjacent nodes will be assigned the lowest arching edges.
Height is increased with distance, avoiding edges crossing and enhancing readability. Once this
is done, the component plots the nodes and joining edges using the SVG path and text elements.
For the highlighting effect, a mouseover/mouseout event listener is added to each node. A
function provided to the listener will, when triggered, traverse the chain of dependencies up
to the root, adding or removing CSS styling as appropriate. The resulting SVG object is then
returned to Glossa and rendered as in Figure 2.</p>
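      <p>The idea of giving adjacent nodes the lowest arcs and raising arcs with distance can be sketched as follows. The sketch is in Python rather than React and is only one plausible way to compute arc heights: the function name arc_levels and the rule that an arc sits one level above the highest arc contained in its span are assumptions, not a description of the actual component.</p>
      <preformat><![CDATA[
```python
def arc_levels(arcs):
    """Map each (head, dependent) arc to a height level, starting at 1."""
    levels = {}
    # Shortest spans first, so arcs between adjacent tokens are lowest.
    for head, dep in sorted(arcs, key=lambda a: abs(a[0] - a[1])):
        lo, hi = min(head, dep), max(head, dep)
        # Levels of already placed arcs that lie inside this arc's span.
        inner = [
            lvl for (h, d), lvl in levels.items()
            if lo <= min(h, d) and max(h, d) <= hi
        ]
        # Sit one level above anything nested underneath.
        levels[(head, dep)] = 1 + max(inner, default=0)
    return levels


# Invented arcs for an eight-token sentence with heads 4 and 6.
arcs = [(4, 1), (4, 2), (4, 3), (6, 4), (4, 5), (6, 7), (6, 8)]
for arc, level in sorted(arc_levels(arcs).items()):
    print(arc, level)
```
]]></preformat>
      <p>The resulting level could then be scaled to a pixel height when the SVG path for each edge is drawn.</p>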
    </sec>
    <sec id="sec-4">
      <title>4. Case studies</title>
      <p>In this section we present two case studies to demonstrate the possibilities and benefits of these
new query functions.</p>
      <sec id="sec-4-1">
        <title>4.1. Case 1 – subject complement construction</title>
        <p>
          Typical examples in a grammar book on subject complement constructions [7] are as in (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ).
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )
        </p>
        <sec id="sec-4-1-1">
          <title>a. De er raske. they are fast.PL b. Ho er feminist. she is feminist</title>
          <p>
            In a subject complement construction, the typical complement is either an adjective, as raske
(‘fast’) in example (1-a), or a noun, as feminist in example (1-b), and is linked to the subject by a
copula/linking verb (most typically vaere (‘be’)). As is apparent, the construction involves large
word classes (adjectives and nouns), which may have many different functions in a sentence,
and the verb be, which is both a high-frequency verb in a text and can have different functions
in a sentence. Thus, if we are interested in studying subject complement constructions in a
corpus with only part-of-speech and morphological annotation, we will most probably get a lot
of examples with no relevance for the study, as the possibilities to narrow down the query in an
appropriate way are sparse. The best we could do would be to search for the lemma vaere (‘be’)
followed by an empty slot for any kind of word (e.g., a modifier or an adverb) followed by an
adjective. In Glossa, this query can be expressed with Google-like search boxes and dropdown
menus. By clicking on the CQP query link, we get the translation to CQP shown in (
            <xref ref-type="bibr" rid="ref2">2</xref>
            ).
(
            <xref ref-type="bibr" rid="ref2">2</xref>
            )
          </p>
          <p>
            [lemma="vaere" %c] []{,1} [pos="adj"]
Clearly, the Glossa interface makes it much easier to express the query. Nevertheless, the
results are not very precise. This query results in 4046 matches in the NDT treebank (Bokmål,
311,277 tokens). A quick look at the examples reveals that many of them are not
relevant. More precisely, among the first 50 matches, 24 were not relevant. If we replace the
adjectival complement (pos="adj") in the CQP query exemplified in (
            <xref ref-type="bibr" rid="ref2">2</xref>
            ) with a nominal
complement (pos="noun") and repeat the search, we get 2443 matches. Here again, 25 matches
out of the first 50 turn out not to be relevant. As the search results show, roughly every second
match is not relevant for the research purpose. Such a high percentage of irrelevant
examples necessitates a lot of manual sorting afterwards in order to avoid misleading
conclusions. In cases like this, the possibility to search on syntactic functions will reduce the number
of irrelevant examples.
          </p>
          <p>
            We can take the query string in (
            <xref ref-type="bibr" rid="ref2">2</xref>
            ) and add the specification of syntactic function SPRED
(subject complement, in Norwegian subjektspredikativ) by simply clicking the function box.
This yields the query shown in Figure 3, and if we click the CQP query link, the query specified
in the GUI gets translated to CQP as in (
            <xref ref-type="bibr" rid="ref3">3</xref>
            ).
(
            <xref ref-type="bibr" rid="ref3">3</xref>
            )
          </p>
          <p>[lemma="vaere" %c] []{,1} [(pos="adj") &amp; (fun="SPRED")]
Again, because the syntactic representation is simplified to attributes on words, it can easily
be accommodated within the Glossa graphical interface.</p>
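          <p>To make the point concrete, the mapping from menu selections to a CQP string can be imitated in a few lines of Python. This is a hypothetical sketch, not Glossa's own query builder: the helper name token and its flags parameter are invented for the example, and the sketch only covers the simple attribute tests used in this section.</p>
          <preformat><![CDATA[
```python
def token(flags="", **attrs):
    """Build one CQP token expression from attribute=value pairs."""
    if not attrs:
        return "[]"  # empty slot matching any token
    tests = [f'{name}="{value}"' for name, value in attrs.items()]
    if len(tests) == 1:
        body = tests[0]
        if flags:
            body += f" {flags}"  # e.g. %c for case-insensitive matching
    else:
        if flags:
            raise ValueError("flags are only supported for a single test here")
        body = " & ".join(f"({test})" for test in tests)
    return f"[{body}]"


# Three slots: the lemma, an optional intervening word, the complement.
query = " ".join([
    token(lemma="vaere", flags="%c"),
    token() + "{,1}",
    token(pos="adj", fun="SPRED"),
])
print(query)
```
]]></preformat>
          <p>Since every constraint is a test on a single token, each slot in the graphical interface corresponds to one bracketed expression, and the string assembled here has the same shape as the query in (3).</p>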
          <p>The new, specified search results in 2644 matches, which means a reduction by 1402 matches,
i.e., 35%. The same specification of syntactic function can be added to the search with nominal
complements. Here again we see a significant reduction of matches (from 2443 to 1237, i.e., a
reduction by approx. 50%).</p>
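          <p>The reductions quoted above can be recomputed from the match counts. The short Python check below uses only the four figures reported in the text; the function name reduction is ours.</p>
          <preformat><![CDATA[
```python
def reduction(before, after):
    """Return the absolute drop and the relative drop in percent."""
    drop = before - after
    return drop, 100 * drop / before


# Match counts reported in the text, without and with fun="SPRED".
cases = [("adjectival", 4046, 2644), ("nominal", 2443, 1237)]
for label, before, after in cases:
    drop, pct = reduction(before, after)
    print(f"{label}: {drop} fewer matches ({pct:.1f}%)")
```
]]></preformat>
          <p>This gives 1402 fewer matches (34.7%) for adjectival complements and 1206 fewer matches (49.4%) for nominal complements, in line with the rounded figures of 35% and approximately 50%.</p>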
          <p>The possibility to specify the syntactic function also provides several other search options.
One could search on the syntactic function alone, with no other specifications (CQP query: [fun="SPRED"]).
With this query, we get a concordance list of all subject complements in the NDT
treebank. We can then combine the list of concordances with other functions in Glossa, such as the
calculation of frequencies based on e.g. part of speech. By doing so we can find out that many
other words can be heads in phrases that constitute the subject complements, such as
prepositions, pronouns, determiners, adverbs, and even infinitive and nominal clauses. We can also
find out which other verbs, besides the most typical vaere (‘be’) and bli (‘become’), can link a
subject complement. The search query [pos="verb"] [fun="SPRED"] reveals
that nearly one hundred different verb lexemes can link a subject complement. This shows the
usefulness of even a simplified syntactic representation reduced to attributes of words, which
moreover has the advantage of being easily queried in a graphical interface, as shown in Figure
3, without the need for a specialised query language.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Case 2 – conditional clauses with inversion</title>
        <p>
          In the second case study, we will demonstrate the benefits of yet another new query function,
namely the specification of syntactic level as either main, dependent, or infinitive clause. We
will illustrate this function by looking at a special conditional clause construction in Norwegian.
In Norwegian, there are two basic word orders, one typically used in main clauses, and one
typically used in subordinate clauses. In subordinate clauses, the subject and the adverbial
precede the finite verb like in the conditional clause illustrated in (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ).
        </p>
        <p>
          In Norwegian, there is another possibility to express a conditional clause without the explicit
hvis (‘if’) and with an inverted word order as illustrated in (
          <xref ref-type="bibr" rid="ref5">5</xref>
          ).
(
          <xref ref-type="bibr" rid="ref4">4</xref>
          )
(
          <xref ref-type="bibr" rid="ref5">5</xref>
          )
        </p>
        <sec id="sec-4-2-1">
          <title>Hvis han ikke har rett, er saken avgjort. if he not has right is case-DEF settled ‘If he is not right, the case is settled.’</title>
        </sec>
        <sec id="sec-4-2-2">
          <title>Har han ikke rett, er saken avgjort.</title>
          <p>
            has he not right is case-DEF settled
‘If he is not right, the case is settled.’
This word order is similar to the word order in main clauses, especially so with yes/no
interrogatives as illustrated in (
            <xref ref-type="bibr" rid="ref6">6</xref>
            ):
(
            <xref ref-type="bibr" rid="ref6">6</xref>
            )
          </p>
        </sec>
        <sec id="sec-4-2-3">
          <title>Har han ikke rett?</title>
          <p>
            has he not right
‘Isn’t he right?’
If you are interested in studying this special type of conditional clause (without the initial
hvis and with the finite verb first) in a corpus with only morphosyntactic annotations, the
possibilities to narrow down the search query are limited and you will most probably end up
with many irrelevant examples, such as yes/no interrogatives, which also have a verb-initial
word order. In Glossa, we can formulate an extended search by selecting the category ‘verb’
among the part-of-speech tags and then ticking the box for ‘sentence initial’ position. The CQP query for
this search is shown in (
            <xref ref-type="bibr" rid="ref7">7</xref>
            ):
(
            <xref ref-type="bibr" rid="ref7">7</xref>
            )
          </p>
          <p>
            &lt;s&gt;[pos="verb"]
A search like this results in 778 matches. As predicted, the results include many irrelevant
examples such as yes/no interrogatives, but also imperatives and incomplete clauses with omitted
subjects, which are often used in newspaper headlines. Because many corpora contain a large
quantity of newspaper text, the latter category is not insignificant. As described before, the new
annotation layers implemented in Glossa provide the possibility to search for a construction
in either main clauses, subordinate clauses or infinitive clauses. We can restrict the query to
subordinate clauses by adding the attribute niv="led" to the search string in (
            <xref ref-type="bibr" rid="ref7">7</xref>
            ), as illustrated
in (
            <xref ref-type="bibr" rid="ref8">8</xref>
            ), and repeat the search.
(
            <xref ref-type="bibr" rid="ref8">8</xref>
            )
          </p>
          <p>&lt;s&gt;[(pos="verb") &amp; (niv="led")]
This search results in 78 matches. A closer look at the examples confirms that the vast majority
of them are relevant for our purpose, that is, they are conditional clauses with inversion. This
means that the new query function significantly reduces the number of irrelevant examples,
leaving us with only around 10% of the initial search result. Or to put it another way, without
the new function we got a results list where only about every tenth match was relevant to our
purpose, while the rest would have had to be sorted out manually.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Outlook</title>
      <p>We have shown how a suitably simplified syntactic representation can easily be queried in the
Glossa search interface, while still remaining useful for many queries. While there are
real limits to what can be done (e.g. one cannot simultaneously constrain the features of a
dependent and its head) compared to what is possible in a full tree-based query formalism, our
tool is also much easier to learn. Even compared to a tool like Treebank.info [12], which
like the current project is built on CWB, and which also allows the use of menus and other
graphical elements to specify a query, our system still seems considerably simpler to learn,
with a correspondingly narrower scope. This makes complicated syntactic data accessible to
ordinary linguists without specialised training and thereby opens the way for more widespread use
of treebank data in linguistics. The recent surge in the creation of treebanks has not seen a
corresponding increase in the use of the data, partly, we think, for accessibility reasons. Recruiting
users through the simple Glossa interface may in time even increase the interest in full-fledged
query languages.</p>
      <p>There are also many other types of annotation that are often created with an eye towards
NLP applications, but that could be useful for general linguists as well. For the Norwegian
Dependency Treebank, this includes named entity recognition, animacy and coreference
annotations. Each of these annotations introduces its own complexities. Coreference works across
sentences, for example, while animacy can be assigned at token level, but also at phrase level, and
can be nested, so that a token can participate in animacy on multiple levels. In future work, we
plan to address this complexity and integrate these annotation layers into the Glossa query
interface to make them similarly accessible to a wider community.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kiddon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          . “
          <article-title>Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs”</article-title>
          .
          <source>In: Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT</source>
          , GURT/SyntaxFest
          <year>2023</year>
          ). Washington, D.C.,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>de Marneffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nivre</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Zeman</surname>
          </string-name>
          . “
          <article-title>Universal dependencies”</article-title>
          .
          <source>In: Computational linguistics 47.2</source>
          (
          <issue>2021</issue>
          ), pp.
          <fpage>255</fpage>
          -
          <lpage>308</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Evert</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hardie</surname>
          </string-name>
          . “
          <article-title>Twenty-first century Corpus Workbench: Updating a query architecture for the new millennium”</article-title>
          .
          <source>In: Proceedings of the Corpus Linguistics 2011 conference. Birmingham</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Faarlund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. I.</given-names>
            <surname>Vannebo</surname>
          </string-name>
          . Norsk referansegrammatikk. Oslo: Universitetsforlaget,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Guibon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Courtin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Gerdes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Guillaume</surname>
          </string-name>
          . “
          <article-title>When collaborative treebank curation meets graph grammars”</article-title>
          .
          <source>In: LREC 2020-12th Language Resources and Evaluation Conference. Marseille</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kåsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nøklestad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Priestley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Solberg</surname>
          </string-name>
          , and D. T. T. Haug. “
          <article-title>The Norwegian Dialect Corpus Treebank”</article-title>
          .
          <source>In: Proceedings of the Thirteenth Language Resources and Evaluation Conference</source>
          . Marseille, France,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lie</surname>
          </string-name>
          .
          <article-title>Innføring i norsk syntaks</article-title>
          . Oslo: Universitetsforlaget,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Santorini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Marcinkiewicz</surname>
          </string-name>
          . “
          <article-title>Building a large annotated corpus of English: The Penn Treebank”</article-title>
          .
          <source>In: Computational linguistics 19.2</source>
          (
          <issue>1993</issue>
          ), pp.
          <fpage>313</fpage>
          -
          <lpage>330</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Meurer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Butt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. H.</given-names>
            <surname>King</surname>
          </string-name>
          . “
          <article-title>INESS-Search: A search system for LFG (and other) treebanks”</article-title>
          .
          <source>In: Proceedings of the LFG'12 Conference, LFG Online Proceedings</source>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nøklestad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Bondi</given-names>
            <surname>Johannessen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kosek</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Priestley</surname>
          </string-name>
          . “
          <article-title>A modernised version of the Glossa corpus search system”</article-title>
          .
          <source>In: Proceedings of the 21st Nordic Conference on Computational Linguistics. Gothenburg</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Øvrelid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kåsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nøklestad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Solberg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Johannessen</surname>
          </string-name>
          . “
          <article-title>The LIA Treebank of Spoken Norwegian Dialects”</article-title>
          .
          <source>In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</source>
          . Miyazaki, Japan,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Proisl</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Uhrig</surname>
          </string-name>
          . “
          <article-title>Efficient Dependency Graph Matching with the IMS Open Corpus Workbench”</article-title>
          .
          <source>In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)</source>
          . Istanbul, Turkey,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P. E.</given-names>
            <surname>Solberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Skjærholt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Øvrelid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hagen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Johannessen</surname>
          </string-name>
          . “
          <article-title>The Norwegian Dependency Treebank”</article-title>
          .
          <source>In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)</source>
          . Reykjavik, Iceland,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
            <surname>Zeman</surname>
          </string-name>
          et al.
          <source>Universal Dependencies 2.14</source>
          .
          <year>2024</year>
          . url: http://hdl.handle.net/11234/1-5502.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>