<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Ital-IA</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>AI-Assisted Legal Holding Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Praveen Bushipaka</string-name>
          <email>praveen.bushipaka@santannapisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniele Licari</string-name>
          <email>daniele.licari@santannapisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriele Marino</string-name>
          <email>gabriele.marino@santannapisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Comandé</string-name>
          <email>giovanni.comande@santannapisa.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Artificial Intelligence, BERT, Summarization, Legal Holding Extraction, Rhetorical Roles, Legal AI</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Scuola Superiore Sant'Anna</institution>
          ,
          <addr-line>P.zza dei Martiri della Libertà, Pisa, 56100</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>3</volume>
      <fpage>29</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>This paper provides an overview of the investigations being carried out at Scuola Superiore Sant'Anna on the use of Artificial Intelligence techniques for the automated extraction of rhetorical roles and legal holdings from Italian case documents. These activities are framed within the “Giustizia Agile” project, funded by the Ministry of Justice, which aims to improve the efficiency of the Italian justice system through, among other means, advanced information technology.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In every country, the efficiency of the judicial system has an impact on the social and economic life of citizens. Italy has been constantly trying to make its legal system more efficient and in line with other European countries. For example, in the 2022 EU Justice Scoreboard [<xref ref-type="bibr" rid="ref1">1</xref>], Italy was reported as being among the countries with the least efficient judicial systems, with more than 500 days needed for the first-instance judgment, 800 days for the appeal, and up to 1300 days for final judgments by the Supreme Court [<xref ref-type="bibr" rid="ref2">2</xref>]. One factor believed to bring a tremendous potential for improving the efficiency of public administration systems in general, including judicial systems, is the widespread adoption of Information and Communication Technologies (ICTs), supporting fully digitalized processes. Indeed, the CEPEJ report by the Council of Europe [<xref ref-type="bibr" rid="ref3">3</xref>] includes a survey on the use of ICT in judicial systems, highlighting for example that Italy exhibits the lowest score among EU countries on the Criminal justice ICT index (but with a much better ICT index on Civil and Administrative justice).</p>
      <p>In this context, we can understand the efforts being made in the “Giustizia Agile” project<fn><p>More information is available at https://www.unitus.it/it/unitus/</p></fn>, funded by the Italian Ministry of Justice. This project is framed within a wider initiative to enhance the performance of judicial offices. It aims at a significant reduction of the backlog by identifying the major bottlenecks and critical factors that adversely affect the duration of judicial processes, by investigating opportunities for innovation in the management and organization of those processes, and by embracing a wider digitalization of the processes through the use of ICTs.</p>
      <p>This very last topic is the one where this paper fits, reporting on some key experimentation with Artificial Intelligence tools, and specifically Large Language Models (LLMs), in the area of automated summarization and rhetorical role classification of Italian case documents. We focused on the extraction of legal holdings from Italian administrative justice documents. This activity is carried out by the highest exponents of Italian justice and is crucial to facilitate access to justice, create a 'precedent' and ensure transparency in decisions. Furthermore, this is a delicate task because lawyers and judges rely on legal holdings to select case-relevant documents when searching for similar cases. Extracting this information from a judgment is a complex task that requires time-consuming efforts and specific skills. Here, we present an innovative approach that combines a rhetorical role classifier, text summarization, and a scalable search engine to accurately and efficiently retrieve and analyze legal holdings from Italian case documents. Identifying the rhetorical roles of the different text segments allows for a better understanding of the structure and content of the document, which can help guide the summarization process. Irrelevant information (e.g. the introduction) can be filtered out in pre-processing, allowing the summary model to focus exclusively on the most important information in the document.</p>
      <p>Previous attempts at using rhetorical role classification for text summarization in the legal field date back to 2004. For example, LetSum [<xref ref-type="bibr" rid="ref4">4</xref>] assigns rhetorical roles to the sentences and uses TF-IDF to rank them. A percentage of sentences for each rhetorical role is then selected to be a part of the summary. The work in [<xref ref-type="bibr" rid="ref5">5</xref>] approached the identification of rhetorical roles with Conditional Random Fields, extending to extractive text summarization using term distribution. There are different classes of text summarization methods. Extractive summarization involves identifying the most important sentences or passages from the original text and combining them to create the summary [<xref ref-type="bibr" rid="ref6">6</xref>]. This method has been widely experimented with in the legal area, and a few tools were developed specifically for the legal domain. On the other hand, abstractive summarization generates new text which is not present in the processed documents. This method has been explored in [<xref ref-type="bibr" rid="ref7">7</xref>] and proved efficient.</p>
      <p>In our approach, we focused on an extractive method, for two main reasons: (i) it highlights the most relevant sentences in a given document, constituting an effective way to speed up a judge's work, and (ii) summarizing long documents is extractive in nature [<xref ref-type="bibr" rid="ref8">8</xref>], as this method takes advantage of the discourse structure [<xref ref-type="bibr" rid="ref9">9</xref>] to generate factually consistent summaries, preserving the meaning of the original document [<xref ref-type="bibr" rid="ref10">10</xref>]. However, previous efforts in this area were done only on English datasets.</p>
      <p>In this paper, we propose a platform based on Italian-LEGAL-BERT models [<xref ref-type="bibr" rid="ref11">11</xref>] to extract legal holdings from Italian administrative justice documents using rhetorical role classification and extractive text summarization. We use a Hierarchical BERT model to identify only the most important sentences and apply an extractive summarization algorithm to improve the performance of the summarization. Later, we feed this information as metadata into an efficient information retrieval system.</p>
      <p>© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License. CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Legal Holding Research Platform Overview</title>
      <p>The platform we are building consists of three stages, exemplified in Figure 1. In the first stage, we identify the most important rhetorical roles of each sentence present in a legal document using Hierarchical BERT. In the final stage, we ingest the documents, sentences, and holdings into Elasticsearch, letting the search engine index these additional metadata, to ease later searches by users.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1. Rhetorical Roles Classification</title>
      <p>In the first phase of the model, we predict rhetorical roles for each sentence. We used a Hierarchical BERT model for this task. Each sentence is categorized into a single role. Overall, we categorized 5 different roles (INTRODUCTION, PARTIES, DEVELOPMENT, REASON, and CONCLUSION). We then retain only the REASON sentences, which are those that contain the information on the legal holding.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Legal Holding Extraction</title>
      <p>We used an extractive summarization method to extract holdings from the legal documents. We used BERT with a regression head to score each sentence. The top 5 sentences are picked based on the scores and chosen as holdings.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3. Legal Search Engine</title>
      <p>The final stage of the AI system includes a search engine for efficient retrieval over a large corpus of Italian legal documents. The collected documents, generated roles, and extracted summaries are given to the data store. A web app will be developed for easier usage.</p>
    </sec>
    <sec id="sec-6">
      <title>3. Methodology</title>
      <p>Our work is based on fine-tuning the Italian-LEGAL-BERT [<xref ref-type="bibr" rid="ref11">11</xref>] model for both rhetorical role classification and holding extraction. However, we used different approaches for these two tasks, as explained below.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1. Dataset Description</title>
      <p>We used the ITA-CASEHOLD dataset, which consists of 1101 pairs of judgments and holdings from 2019 to 2022, collected from the Italian Administrative Justice. The dataset covers a wide range of issues, including public contracts, environmental protection, public services, immigration, taxes, and compensation for damages caused by the State. Administrative justice also provides citizens with the opportunity to challenge administrative decisions in an independent and impartial trial. The dataset was further divided into 792 documents in the training set, 88 in the validation set, and 221 in the test set. The token-level compression ratio between a document and its holding shows a high standard deviation across all splits with respect to the length of documents and holdings. This is because the documents are quite long, whilst their holdings are much shorter.</p>
      <p>A new dataset was created for the training and evaluation of the rhetorical role classifier model. We extracted and annotated (mainly using regular expressions) 152,368 sentences from 1,503 Italian civil cases. For each sentence of the dataset, we derived its rhetorical role among the following:</p>
    </sec>
    <sec id="sec-11">
      <list list-type="order">
        <list-item><p>INTRODUCTION: an indication of the judge who pronounced the judgment; an indication of the parties and their lawyers;</p></list-item>
        <list-item><p>CONCLUSION OF THE PARTIES: the conclusions of the prosecutor (if any) and those of the parties;</p></list-item>
        <list-item><p>DEVELOPMENT OF THE TRIAL: summary of the appealed judgment and reasons of appeal;</p></list-item>
        <list-item><p>REASON: the concise statement of the factual and legal reasons for the decision (the statement of reasons);</p></list-item>
        <list-item><p>CONCLUSION: the decisional content of the judgment.</p></list-item>
      </list>
      <p>Both datasets were split 80% for training models and 20% for model testing. The data were obtained through scientific collaboration agreements between some Italian courts and the Scuola Superiore Sant'Anna. The ITA-CASEHOLD dataset will be publicly released.</p>
    </sec>
    <sec>
      <title>3.2. Rhetorical Roles Classification</title>
      <p>The roles that different text segments play in a larger document are identified using a hierarchical BERT approach, in order to contextualize a single sentence based on the content of the document. The model has a layered architecture whose bottom layer is Italian-LEGAL-BERT and whose top layer is a 2-layer transformer encoder. The sentences to classify are tokenized and given as input to Italian-LEGAL-BERT. The CLS output tokens are then retrieved and fed to the transformer encoder, which extracts relevant features for each sentence. These features are then processed by a simple softmax-based classification layer to get the final predictions. We will provide precise details about the training and performance of this model in further work.</p>
    </sec>
    <sec id="sec-12">
      <title>3.3. Italian-LEGAL-BERT Holding Extraction</title>
      <p>We used a novel extractive method called Harmonic Mean-BERT (HM-BERT). This approach involves fine-tuning the Italian-LEGAL-BERT model to predict a score for each sentence in a document. The scores for training and evaluation were given by the harmonic mean of the ROUGE R-1 and R-2 scores (generated by ITA-ROUGE, a modified version of the ROUGE metric for the Italian language) between the sentence and the corresponding document holding. We generate these scores only for the training and validation sets.</p>
      <p>Since the sentences had already been classified by rhetorical role, we used the generated scores to choose only the roles whose sentences have the highest importance. The higher the score of a sentence, the higher the similarity between the sentence and its corresponding holding. In our experiment, REASON was the most important role and DEVELOPMENT OF THE TRIAL (DEVELOPMENT) was the second most important. We determined this from their scores: the 75th percentile of the scores of these sentences was above 2.5, whereas for the other roles it was near zero.</p>
      <p>We made two datasets by removing sentences with other roles: (i) with REASON only, and (ii) with REASON and DEVELOPMENT. After computing the scores and choosing only the important sentences of a document, we fine-tuned the Italian-LEGAL-BERT model with a regression head to predict these scores.</p>
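      <p>As a worked illustration of the training target described above, the score of a sentence with respect to its document holding can be written as follows. The symbols s_i (sentence), h (holding) and HM are notation introduced here for illustration only, and setting the score to zero when both ROUGE values are zero is an assumption, since the harmonic mean is undefined in that case.</p>
      <disp-formula>
        <tex-math>
\[
\mathrm{HM}(s_i, h) = \frac{2\, R_1(s_i, h)\, R_2(s_i, h)}{R_1(s_i, h) + R_2(s_i, h)},
\qquad
\mathrm{HM}(s_i, h) = 0 \ \text{ if } \ R_1(s_i, h) + R_2(s_i, h) = 0
\]
        </tex-math>
      </disp-formula>
      <p>For instance, a sentence with R-1 = 0.6 and R-2 = 0.3 would receive HM = (2 × 0.6 × 0.3) / (0.6 + 0.3) = 0.4. Because the harmonic mean is pulled towards the smaller of the two values, a sentence needs both unigram and bigram overlap with the holding to obtain a high score.</p>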
      <p>In more detail, the following steps were followed:</p>
      <list list-type="order">
        <list-item><p>R-1 and R-2 scores between each sentence and its respective document holding were computed for the training and validation sets.</p></list-item>
        <list-item><p>To obtain a single score from R-1 and R-2, we computed their harmonic mean for each sentence.</p></list-item>
        <list-item><p>Based on these scores and the roles previously predicted by the rhetorical role classifier, we chose the most important roles. The higher the score, the higher the importance. Two datasets were created on this basis.</p></list-item>
        <list-item><p>Italian-LEGAL-BERT was fine-tuned on the regression task of predicting the score of a given sentence.</p></list-item>
        <list-item><p>The validation set was used to determine the optimal number k of top-scoring sentences that compose the final holding. We tried k = 3, 5, 7 and found that k = 5 yielded the best results.</p></list-item>
      </list>
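      <p>The scoring and selection steps listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: <monospace>rouge_n_recall</monospace> is a simplified, whitespace-tokenized ROUGE-N recall that stands in for ITA-ROUGE, and all function names are hypothetical. In the described pipeline, the scores passed to <monospace>select_top_k</monospace> at test time would come from the fine-tuned regression model rather than from the reference holding.</p>
      <preformat>
```python
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset of n-grams found in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n):
    """Simplified ROUGE-N recall (assumption: lowercase whitespace tokens).

    This stands in for ITA-ROUGE, whose Italian-specific processing
    is not reproduced here.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def harmonic_mean(r1, r2):
    """Harmonic mean of two scores; defined as 0 when both are 0 (assumption)."""
    if r1 + r2 == 0:
        return 0.0
    return 2 * r1 * r2 / (r1 + r2)


def target_score(sentence, holding):
    """Regression target for one sentence: harmonic mean of R-1 and R-2."""
    return harmonic_mean(rouge_n_recall(sentence, holding, 1),
                         rouge_n_recall(sentence, holding, 2))


def select_top_k(scores, k=5):
    """Indices of the k highest-scoring sentences, restored to document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```
      </preformat>
      <p>For example, <monospace>select_top_k([0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.05], k=3)</monospace> returns <monospace>[1, 3, 5]</monospace>: the three best-scoring sentences are kept and then listed in their original order, so that the extracted holding reads in the same sequence as the judgment.</p>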
      <p>For testing, we followed the steps detailed below:</p>
      <list list-type="order">
        <list-item><p>Two datasets were created based on roles, as for the training and validation sets. However, the scores are not computed beforehand here. Instead, the trained model is used to predict them.</p></list-item>
        <list-item><p>The sentences were then grouped into documents based on their document id.</p></list-item>
        <list-item><p>The trained model was used to compute the score of each sentence.</p></list-item>
        <list-item><p>The sentences were sorted by predicted score.</p></list-item>
        <list-item><p>The top 5 sentences were selected and sorted according to their index position in the original document to compose the final holding.</p></list-item>
        <list-item><p>The ROUGE scores between the extracted and the original holdings were computed.</p></list-item>
      </list>
      <p>The experiments were carried out on an NVIDIA DGX system equipped with a 32 GB Tesla V100 GPU. Our software stack included PyTorch, Hugging Face transformers, and Py-Rouge. We used Italian-LEGAL-BERT as the encoder. This model has an embedding dimension of 768, an input token size of 512, 12 hidden layers with 12 attention heads, and an attention dropout of 0.1. A sequence regression head (i.e. a linear layer) was added to the pooled output. The training was carried out with an AdamW optimizer and a linear scheduler. We trained on both datasets for 4 epochs, using a batch size of 16 and setting 256 as the maximum sequence length.</p>
    </sec>
    <sec id="sec-13">
      <title>3.4. Legal Holding Search Engine</title>
      <p>For the final stage, we adopted Elasticsearch for data storage and retrieval. It is built on the Apache Lucene [<xref ref-type="bibr" rid="ref12">12</xref>] architecture, which uses an inverted index and Okapi BM25 [<xref ref-type="bibr" rid="ref13">13</xref>] for ranking. The documents, along with their generated rhetorical roles and extracted summaries, will be indexed into the Elasticsearch data store. Additional metadata available for each document will also be indexed along with the documents. A tokenization layer on top of the Elasticsearch data store will be added to tokenize the input text. The search engine will be developed with a web app so that judicial staff can use it. This efficient retrieval of documents and their holdings might speed up the process of searching through documents.</p>
      <p>Apart from search, Elasticsearch can also be used for analyzing data. This will be explored alongside the main search engine functionality while developing the final system.</p>
    </sec>
    <sec id="sec-14">
      <title>4. Preliminary Results</title>
      <p>Our experiments showed that the hierarchical approach based on BERT and a Transformer encoder improved the classification of the rhetorical roles of sentences by about 12% in relative terms on the Matthews Correlation Coefficient (from 0.81 to 0.91) compared to a model based only on BERT.</p>
      <p>The experiments on holding extraction were run on two datasets with different filters on the rhetorical roles: (1) only REASON, and (2) REASON + DEVELOPMENT OF THE TRIAL (REASON + DEVELOPMENT). Their performance was evaluated with ITA-ROUGE, a modified version of the ROUGE metrics for the Italian language. REASON outperforms REASON + DEVELOPMENT, showing that using rhetorical roles and picking only the important sentences can yield better results.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Comparison on ROUGE scores.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>R-1</th></tr>
          </thead>
          <tbody>
            <tr><td>REASON</td><td/></tr>
            <tr><td>REASON + DEVELOPMENT</td><td/></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-15">
      <title>5. Conclusions and Future Work</title>
      <p>In this paper, we showed that the quality of extractive summarization can be increased by adding a rhetorical role layer and choosing only the most important parts of the document. This outperforms the HM-BERT model, which was trained on the same Italian-LEGAL-BERT without rhetorical roles. Our future work involves the further development of AI tools that can improve the performance of judicial offices. This includes information retrieval, summarization, classification, question answering, and others. As immediate future work, we will explore the possibilities of using a search engine paired with the summarization and role classification prototypes we already built.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>European Commission</collab>
          ,
          <article-title>The 2022 EU Justice Scoreboard</article-title>
          , https://europa.eu/!CJdXbP,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lettig</surname>
          </string-name>
          ,
          <article-title>Italy, EU's least-efficient judicial system</article-title>
          , https://www.euractiv.com/section/politics/short_news/italy-eus-least-efficient-judicial-system/
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <source>European judicial systems CEPEJ Evaluation Report - 2022 Evaluation Cycle (2020 Data) - Part 1 Tables, graphs and analyses</source>
          , https://rm.coe.int/cepej-report-2020-22-e-web/1680a86279,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Farzindar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lapalme</surname>
          </string-name>
          ,
          <article-title>Legal text summarization by exploration of the thematic structure and argumentative roles</article-title>
          ,
          <source>in: Text Summarization Branches Out</source>
          , Association for Computational Linguistics
          , Barcelona, Spain,
          <year>2004</year>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          . URL: https://aclanthology.org/W04-1006.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Saravanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ravindran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Raman</surname>
          </string-name>
          ,
          <article-title>Automatic identification of rhetorical roles using conditional random fields for legal document summarization</article-title>
          ,
          <source>in: Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I</source>
          ,
          <year>2008</year>
          . URL: https://aclanthology.org/I08-1063.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lapata</surname>
          </string-name>
          ,
          <article-title>Neural summarization by extracting sentences and words</article-title>
          ,
          <source>in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , Association for Computational Linguistics, Berlin, Germany,
          <year>2016</year>
          , pp.
          <fpage>484</fpage>
          -
          <lpage>494</lpage>
          . URL: https://aclanthology.org/P16-1046. doi:10.18653/v1/P16-1046.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kalamkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tiwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Karn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Modi</surname>
          </string-name>
          ,
          <article-title>Corpus for automatic structuring of legal documents</article-title>
          ,
          <source>in: Proceedings of the Thirteenth Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2022</year>
          , pp.
          <fpage>4420</fpage>
          -
          <lpage>4429</lpage>
          . URL: https://aclanthology.org/2022.lrec-1.470.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H. Y.</given-names>
            <surname>Koh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>An empirical survey on long document summarization: Datasets, models, and metrics</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>55</volume>
          (
          <year>2022</year>
          ). URL: https://doi.org/10.1145/3545176. doi:10.1145/3545176.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mircea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C. K.</given-names>
            <surname>Cheung</surname>
          </string-name>
          ,
          <article-title>Discourse-aware unsupervised summarization for long scientific documents</article-title>
          ,
          <source>in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>1089</fpage>
          -
          <lpage>1102</lpage>
          . URL: https://aclanthology.org/2021.eacl-main.93. doi:10.18653/v1/2021.eacl-main.93.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Sliding selector network with dynamic memory for extractive summarization of long documents</article-title>
          ,
          <source>in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>5881</fpage>
          -
          <lpage>5891</lpage>
          . URL: https://aclanthology.org/2021.naacl-main.470. doi:10.18653/v1/2021.naacl-main.470.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Licari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Comandé</surname>
          </string-name>
          ,
          <article-title>ITALIAN-LEGAL-BERT: A Pre-trained Transformer Language Model for Italian Law</article-title>
          ,
          <source>in: The Knowledge Management for Law Workshop (KM4LAW)</source>
          , CEUR Workshop Proceedings
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Białecki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Muir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ingersoll</surname>
          </string-name>
          ,
          <article-title>Apache Lucene 4</article-title>
          , in: OSIR@SIGIR,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Amati</surname>
          </string-name>
          ,
          <article-title>BM25</article-title>
          , Springer US, Boston, MA,
          <year>2009</year>
          , pp.
          <fpage>257</fpage>
          -
          <lpage>260</lpage>
          . URL: https://doi.org/10.1007/978-0-387-39940-9_921. doi:10.1007/978-0-387-39940-9_921.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>