<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Prompt Design and Answer Processing for Knowledge Base Construction from Pre-trained Language Models (KBC-LM)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xiao Fang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alex Kalinowski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Haoran Zhao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ziao You</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuhao Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuan An</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Computing and Informatics, Drexel University</institution>
          ,
          <addr-line>Philadelphia, PA 19104</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Prompt design on pre-trained large language models (prompt-design&amp;PLM) has become an emerging paradigm for a range of NLP tasks. Although increased effort has been put into reformulating many classic NLP problems as prompt-based learning, knowledge base construction from PLMs remains less explored. The ISWC-2022 challenge on Knowledge Base Construction from Pre-trained Language Models (KBC-LM) provides 12 pre-defined relations, each equipped with a number of train and dev triples. In participating in the challenge, we manually developed relation-specific prompt templates to probe BERT-related LMs. Given a (SubjectEntity, relation) pair, we predicted none, one, or many ObjectEntitys to complete the pair as a triple. The test results on unseen (SubjectEntity, relation) pairs showed that our prompt design achieved a 49% overall macro average F1-score, an 18-point improvement over the baseline's 31% F1-score. The insights we gained about the “knowledge” of a language model will guide our selection of appropriate LMs for future knowledge base construction tasks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Pre-trained large Language Models (PLM) such as BERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], RoBERTa [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], GPT-3 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and T5 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
have attracted significant attention in the AI and NLP communities. A recent emerging paradigm
leveraging pre-trained LMs (PLMs) is to use textual prompts to solve a range of NLP tasks.
For example, for Sentiment Analysis, given a piece of text “This is a boring movie.”,
we can use a textual prompt “This is a boring movie. The review is __” to ask a
pre-trained language model (PLM) to fill in the blank with a positive or negative label.
The downstream applications using this paradigm are reformulated as if we needed to
predict a missing or next word using a pre-trained LM. We dub this paradigm prompt design
on pre-trained large language models, or prompt-design&amp;PLM for short. To effectively apply
prompt-design&amp;PLM, one needs to address several critical issues, including selecting a relevant
LM, designing appropriate prompts, and extracting the final predictions. Typically, one develops
templates to generate prompts such that a template processes the original text with some extra
tokens. For example, the template “[TEXT] The review is [MASK]” generates the prompt
we used earlier for Sentiment Analysis, where “[TEXT]” corresponds to the original sentence
and the token “[MASK]” stands for a blank to be filled in. Optionally, one may further develop a
verbalizer to project original labels to words in the vocabulary of the LM for final prediction. For
example, the verbalizer for our Sentiment Analysis example is {“positive”:“interesting”,
“negative”:“boring”}.
      </p>
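        <p>As a minimal sketch of the template and verbalizer mechanics described above (pure Python; the verbalizer dictionary is inverted here so that predicted words map back to labels, and all function names are ours):</p>

```python
# Sketch of prompt templating and verbalizing for the Sentiment Analysis example.
def apply_template(text, template="[TEXT] The review is [MASK]"):
    """Instantiate a prompt by substituting the original text into the template."""
    return template.replace("[TEXT]", text)

# Verbalizer, inverted for lookup: LM-predicted word -> task label.
VERBALIZER = {"interesting": "positive", "boring": "negative"}

def verbalize(predicted_token):
    """Project an LM-predicted token onto a task label (None if uncovered)."""
    return VERBALIZER.get(predicted_token)

prompt = apply_template("This is a boring movie.")
print(prompt)               # -> This is a boring movie. The review is [MASK]
print(verbalize("boring"))  # -> negative
```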
      <p>
        A flurry of studies have been reported using prompt-design&amp;PLM to solve text classification
[
        <xref ref-type="bibr" rid="ref3 ref5 ref6 ref7">3, 5, 6, 7</xref>
        ], named-entity recognition [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], natural language inference [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ], sentiment analysis
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], relation extraction [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], text summarization [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and parsing [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Despite these efforts,
an under-explored area is to directly extract structured knowledge from PLMs to construct a
knowledge base. The ISWC-2022 Challenge on Knowledge Base Construction from Pre-trained
Language Models (KBC-LM) aims to explore the capability of various pre-trained language
models (PLMs) for constructing a knowledge base with a set of given predicates/relationships.
The problem is formally defined as follows:
Definition 1. Given a set of inputs, each of which contains a SubjectEntity (s) and a relation (r),
predict the set of correct ObjectEntitys {o1, o2, ..., on} using an LM probing method for each input.
      </p>
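        <p>Definition 1 can be made concrete with a small sketch of the expected input/output shape. The field names (SubjectEntity, Relation, ObjectEntities) are assumptions modeled on the challenge's JSON-lines data, and the stub prediction is illustrative only; a real system would probe an LM in place of the stub:</p>

```python
import json

# Hypothetical input row; field names are assumptions, not guaranteed to
# match the official challenge format exactly.
test_row = {"SubjectEntity": "Germany", "Relation": "CountryBordersWithCountry"}

def predict(row):
    """Return the (possibly empty) list of ObjectEntitys completing the pair.
    A real system would probe an LM here; this stub returns fixed answers."""
    if row["Relation"] == "CountryBordersWithCountry":
        return ["France", "Poland"]  # placeholder prediction for illustration
    return []  # zero ObjectEntitys is a legal answer

prediction = {**test_row, "ObjectEntities": predict(test_row)}
print(json.dumps(prediction))
```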
      <p>
        A significant difference between the ISWC-2022 KBC-LM challenge and the existing baseline,
e.g., LAMA presented in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], is that there is no constraint on the number of ObjectEntitys that
can participate in a (SubjectEntity, relation) pair. Specifically, a SubjectEntity can be joined with zero, one,
or many ObjectEntitys in a relation. There are two tracks in this challenge. Track 1 explores the
pre-trained BERT-related LMs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] such as BERT Base Cased Model (BERT-base) and BERT Large
Cased Model (BERT-large). Track 2 explores other LMs including RoBERTa, Transformer-XL,
GPT-2, BART, etc. The outputs are evaluated using the established F1-score KB metric. We
participated in the Track 1 challenge using BERT-related LMs. This paper reports our prompt
design, answer-processing steps, and test results on the unseen test data held back by the
challenge organizers.
      </p>
      <p>The rest of the paper is organized as follows. Section 2 discusses related work. Section 3
presents the relation-specific prompts for LM probing. Section 4 presents test results. Section
5 discusses the prompt design and lessons learned. Section 6 describes the structure of the
implementation. Finally, Section 7 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Pre-trained Language Model (PLM). Standard language models are trained to predict text in
an autoregressive fashion, that is, predicting the tokens in the sequence one at a time. This is
usually done from left to right, but can be done in other orders as well. Representative examples
of modern pre-trained left-to-right autoregressive LMs include GPT-3 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. A disadvantage of
autoregressive language models is their unidirectionality in processing text. To predict text based on
surrounding text, masked language models (MLMs) have been developed that use a bidirectional
objective function. Representative pre-trained models using MLMs include BERT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], ERNIE
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and many variants. The prefix LM is a left-to-right LM that decodes a target text y
conditioned on a prefixed sequence x, as in translation. Example prefix LMs include UniLM
1-2 [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ] and ERNIE-M [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The encoder-decoder model uses a left-to-right LM to decode a
target text y conditioned on a separate encoder for text x with a fully-connected mask. Example
encoder-decoder pre-trained models include T5 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], BART [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], MASS [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and their variants.
Prompt Design. In general, there are two types of prompts: cloze prompts, which fill in
blanks in a textual string, and prefix prompts, which continue a string prefix. Prompts can be
designed manually based on human intuition [
        <xref ref-type="bibr" rid="ref13 ref3 ref5">13, 3, 5</xref>
        ] or automatically through mining [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ],
paraphrasing [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ], gradient-based search [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], generation [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], and scoring [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. In addition
to discrete hard prompts, researchers have also developed continuous soft prompts that interact
directly with LMs in the embedding space. Soft prompts have their own parameters that can be
tuned through different strategies including prefix tuning [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], hard-prompt initialized tuning
[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], and hybrid tuning [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
      </p>
      <p>
        Answer Processing. An answer can be (1) one of the tokens in the PLM’s vocabulary, (2) a
short multi-token span, or (3) a sentence or document. Answer processing aims to extract the
correct answers from the output space of a PLM. Researchers have developed manual approaches
using verbalizers [
        <xref ref-type="bibr" rid="ref13 ref29 ref30 ref8">13, 29, 30, 8</xref>
        ], and automatic methods through paraphrasing [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], pruning [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ],
and label decomposition [32].
      </p>
      <p>
        Few-shot Learning. In addition to the zero-shot prompt-design&amp;PLM setting, there are also
methods that use training data to optimize parameters in prompts or PLMs. Few-shot learning
methods have been developed for tuning prompts only [33, 34] and for tuning both prompts and
PLMs [
        <xref ref-type="bibr" rid="ref10 ref24 ref28">24, 28, 10</xref>
        ].
      </p>
      <p>
        Knowledge Base Construction from LM Probing. The seminal work on KBC-LM is LAMA
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] which manually created cloze templates to probe knowledge in PLMs. Few-shot learning
on the original LAMA datasets has also been evaluated [35]. More studies have been reported on
probing PLMs for complicated knowledge [36], temporal knowledge [37], and domain-specific
knowledge [38, 39].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Relation-Specific Prompt Design and Answer Processing</title>
      <p>For participating in this challenge, we chose the BERT Large Cased model to construct the
knowledge base through PLM probing. The challenge uses a diverse set of 12 relations, each
covering a different topic and equipped with a set of (SubjectEntity, relation, ObjectEntity) triples as
ground truth. Table 1 lists the relations along with their descriptions and ground truth examples.
We describe our prompt design for each individual relation below. Notice that for the KBC-LM
task, we usually do not have a verbalizer to project prediction labels.</p>
      <sec id="sec-3-1">
        <title>3.1. ChemicalCompoundElement</title>
        <p>At first glance, the semantics of this relation seems ambiguous, because a chemical
component can be an object at the molecular, ionic, or atomic level. After analyzing the training
and development datasets, we found that more than 98% of the object entities were chemical
elements, and fewer than 2% of the objects were ionic groups. The limited number of
chemical elements allows us to filter the LM outputs by keeping only chemical elements for
filling in the blanks.</p>
        <p>In addition, we noticed that names of some object entities follow simple linguistic rules.
For example, the entity named “Zinc phosphate” has chemical compound elements “zinc” and
“phosphorus”. The first four characters of each token in “Zinc phosphate” are respectively the
same as those of “zinc” and “phosphorus”. In terms of prompt design, we noticed that the names
of some subject entities consist of two or more words, including words such as “acid” and “hydroxide”.</p>
        <p>Table 1 pairs each of the 12 relations with its description:
ChemicalCompoundElement (a chemical compound (s) consists of an element (o));
PersonCauseOfDeath (a person (s) died due to a cause (o));
CompanyParentOrganization (a company (s) has another company (o) as its parent organization);
PersonInstrument (a person (s) plays an instrument (o));
PersonEmployer (a person (s) is or was employed by a company (o));
PersonPlaceOfDeath (a person (s) died at a location (o));
RiverBasinsCountry (a river (s) basins in a country (o));
PersonLanguage (a person (s) speaks in a language (o));
PersonProfession (a person (s) held a profession (o));
CountryBordersWithCountry (a country (s) shares a land border with another country (o));
CountryOfficialLanguage (a country (s) has an official language (o));
StateSharesBorderState (a state (s) of a country shares a land border with another state (o)).</p>
        <p>The
basic knowledge in Chemistry indicates that “acid” must contain hydrogen, and “hydroxide”
must contain hydrogen and oxygen. We hypothesized that if we split the compound names
into individual tokens and feed the single tokens to the language model, we might obtain more
correct answers from the LM. Using “Chloric acid” as an example, not only can we ask the language
model “[MASK] is a chemical compound element of Chloric acid”, we can also
ask “[MASK]... of Chloric” and “[MASK]... of acid”. By implementing this idea, we
observed improved recall metrics.</p>
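        <p>The two tricks above, filtering candidates to chemical elements and probing the individual tokens of a compound name as well as the full name, can be sketched as follows. The probe function is a mock standing in for a real masked-LM query, and its candidate lists (and the abbreviated element set) are our own illustrative choices:</p>

```python
# Abbreviated element list; a real system would use all element names.
ELEMENTS = {"hydrogen", "oxygen", "chlorine", "zinc", "phosphorus"}

def probe_lm(prompt):
    """Stand-in for a masked-LM query; returns candidate tokens for [MASK]."""
    fake_outputs = {
        "[MASK] is a chemical compound element of Chloric acid.": ["chlorine", "salt"],
        "[MASK] is a chemical compound element of Chloric.": ["chlorine"],
        "[MASK] is a chemical compound element of acid.": ["hydrogen", "water"],
    }
    return fake_outputs.get(prompt, [])

def predict_elements(compound):
    """Probe the full compound name and each of its tokens, then keep elements only."""
    candidates = []
    queries = [compound] + (compound.split() if " " in compound else [])
    for q in queries:
        candidates += probe_lm(f"[MASK] is a chemical compound element of {q}.")
    # deduplicate and sort for a stable output; keep chemical elements only
    return sorted({c for c in candidates if c in ELEMENTS})

print(predict_elements("Chloric acid"))  # -> ['chlorine', 'hydrogen']
```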
      </sec>
      <sec id="sec-3-2">
        <title>3.2. PersonCauseOfDeath</title>
        <p>For this relation, we analyzed the training dataset and found that 50% of the SubjectEntitys in
the dataset are still alive, with an empty “cause of death”. Therefore, we split the probing into
two steps. First, we probed the LM about the life status of the SubjectEntity, i.e., determining
whether the SubjectEntity is “dead” or “alive”. Second, we probed the LM about the cause of
death under the premise that the SubjectEntity is already deceased.</p>
        <p>To address the first problem, we began with prompts corresponding to the direct question
“Did XXX die?”. Unfortunately, we found that a small perturbation in the prompts would
cause significantly different results. Using the following set of prompts: “Did XXX really
die? [MASK].” and “Is XXX still alive? [MASK].”, we noticed that the results would
be more skewed towards “No”s. Sometimes, we received the answer “No” to both questions for
the same subject. This is unreasonable from a human understanding, because nobody can be “not
alive” and “not dead” at the same time. In a slightly different way, if we asked “Did XXX die?
[MASK].” or “Is XXX alive? [MASK].”, the results generally contained more “Yes”s. The
propensity of answers can easily be affected by mood words in the prompts, such as “really” and
“still”. On the contrary, it is not sensitive to the keyword itself: asking about “dead” and “alive” will
even get trend-aligned answers. If we designed prompts in this way, the answer extraction step
would be extremely unreliable and unstable.</p>
        <p>A language model captures massive statistics about word co-occurrences in a context. To
probe more effectively, a prompt should reproduce the context in which the SubjectEntity and
ObjectEntity tend to co-occur. For example, if a person XXX is dead, it is more likely that the
following phrase appears in the corpus: “XXX is dead”. Based on the co-occurrence statistics, we
designed prompts in the format “[SubjectEntity] (is|has) [MASK]”, where “(is|has)”
means choosing one of the strings separated by |. We treated the tokens predicted by the
language model as an answer space. We detected the “alive” or “dead” status of the SubjectEntity
based on the presence of “die” and its variants. For example, if the token “die” exists in the
answer space, we consider the SubjectEntity dead, otherwise alive. This strategy indeed
achieved improved results on both the training and development sets. To address the
second problem, we followed our intuition and experimented with different prompts and answer
thresholds, with moderate improvements compared to the baseline.</p>
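        <p>The life-status check described above can be sketched as follows. The answer spaces below are illustrative stand-ins for tokens an LM might predict for “[SubjectEntity] (is|has) [MASK]”, and the list of “die” variants is our own choice:</p>

```python
# Variants of "die" whose presence in the answer space marks the subject as dead.
DIE_VARIANTS = {"die", "dies", "died", "dying", "dead", "deceased"}

def is_dead(answer_space):
    """True if any 'die' variant appears among the LM's predicted tokens."""
    return any(tok.lower() in DIE_VARIANTS for tok in answer_space)

# Hypothetical answer spaces for two subjects:
print(is_dead(["died", "born", "famous"]))    # -> True  (treated as dead)
print(is_dead(["alive", "popular", "back"]))  # -> False (treated as alive)
```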
      </sec>
      <sec id="sec-3-3">
        <title>3.3. CompanyParentOrganization</title>
        <p>For this relation, we probed in two steps: (1) does the SubjectEntity have a parent organization? and
(2) if yes, which entry in the answer list is likely a parent organization? A challenge in addressing
the first problem is to distinguish the subsidiary and parent-organization relations. We found
that the LM under-performed on this problem no matter how the prompts were designed.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. PersonInstrument</title>
        <p>The prompt is: “[SubjectEntity] (loves|likes playing) [MASK], which is a
instrument.” The prompt uses the article “a” instead of “an” in front of “instrument” for
better performance. We also noticed that the verb phrase (loves|likes playing) improved
the model performance.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. PersonEmployer</title>
        <p>We tried many different prompts and adjusted thresholds for top_k answers. However, we
still obtained the lowest performance on this relation. The current prompt is: “[SubjectEntity]
joined and work at [MASK] as an employer, which is a company”.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. PersonPlaceOfDeath</title>
        <p>In the same way as for the relation PersonCauseOfDeath, we probed
for PersonPlaceOfDeath in two steps: (1) checking whether the SubjectEntity is “dead” or
“alive” and (2) discovering the place of death if the SubjectEntity is deceased. Assuming
we found that the SubjectEntity is dead, the prompt for detecting the place of death is:
“[SubjectEntity] died at home or hospital in [MASK].”</p>
      </sec>
      <sec id="sec-3-7">
        <title>3.7. RiverBasinsCountry</title>
        <p>The prompt is: “[SubjectEntity] river basins in [MASK].” Other prompts did not
achieve better performance than this one. The F1-score improved from 0.38
with BERT-base to 0.55 with BERT-large.</p>
      </sec>
      <sec id="sec-3-8">
        <title>3.8. PersonLanguage</title>
        <p>The prompt is: “[SubjectEntity] speaks in [MASK], which is a language.” As
with many other relations, we found that a prompt containing a subordinate clause such as
“which is a language” improved performance. The F1-score improved from 0.43
with BERT-base to 0.70 with BERT-large.</p>
      </sec>
      <sec id="sec-3-9">
        <title>3.9. PersonProfession</title>
        <p>The prompt is: “[SubjectEntity] is (a or an) [MASK], which is a profession.”
Using the phrase “(a or an)”, which includes both forms of the article, improved the
performance. The F1-score improved from 0.0 with BERT-base to 0.25 with BERT-large.
As an intermediate summary, Table 2 shows the probing parameters and validation results
for these three relations: RiverBasinsCountry, PersonLanguage, and PersonProfession.</p>
        <p>Table 2 (all three relations probed with BERT-large):
RiverBasinsCountry: top_k 4, threshold 0.071, Precision 0.643, Recall 0.590, F1 0.546.
PersonLanguage: top_k 1, threshold 0.184, Precision 0.840, Recall 0.654, F1 0.701.
PersonProfession: top_k 5, threshold 0.010, Precision 0.365, Recall 0.202, F1 0.249.</p>
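        <p>The answer-selection step behind the top_k and threshold parameters in Table 2 can be sketched as follows (the default values are the BERT-large settings reported for RiverBasinsCountry; the scored candidates are illustrative):</p>

```python
# Keep at most top_k candidates and drop those below the probability threshold.
def select_answers(candidates, top_k=4, threshold=0.071):
    """candidates: list of (token, probability) pairs; returns surviving tokens."""
    ranked = sorted(candidates, key=lambda kv: kv[1], reverse=True)[:top_k]
    return [tok for tok, p in ranked if p >= threshold]

scored = [("Brazil", 0.41), ("Peru", 0.22), ("Chile", 0.05),
          ("Spain", 0.09), ("France", 0.02)]
print(select_answers(scored))  # -> ['Brazil', 'Peru', 'Spain']
```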
      </sec>
      <sec id="sec-3-10">
        <title>3.10. CountryBordersWithCountry</title>
        <p>The prompt is: “[SubjectEntity] and [MASK] are neighboring country. They
share the border.” We experimented with more than twenty prompts and finally found that
joining the “[SubjectEntity]” and the “[MASK]” with “and” as the subject of the prompt
performs better than the prompt “[SubjectEntity] is neighboring with [MASK].
They share the border.” In particular, the recall improved from 8.7% to 66.2%, and
the F1-score from 12.2% to 54.8%. The advantage of our prompt is that it strengthens the
relationship between the “[SubjectEntity]” and the “[MASK]” as neighboring countries.
The model takes the entire string “[SubjectEntity] and [MASK]” as a whole to probe the
LM.</p>
      </sec>
      <sec id="sec-3-11">
        <title>3.11. CountryOfficialLanguage</title>
        <p>The prompt is: “[SubjectEntity]’s official language is [MASK].” First, we
experimented with changing the order of the “[SubjectEntity]” and “[MASK]” in the prompt
sentence. For example, we tried “[MASK] is the official language of [SubjectEntity].”
We also tried placing different adjectives such as “national”, “official”, and “country” in
front of the word “language”. Finally, we settled on the template “[SubjectEntity]’s
(adjective) language is [MASK].”, using “official” to describe “language”.
Overall, we improved the recall by 6.5% and the F1-score from 78.6% to 81.2% over the
default baseline.</p>
      </sec>
      <sec id="sec-3-12">
        <title>3.12. StateSharesBorderState</title>
        <p>This is a relatively difficult relation to deal with, because “a state” could refer to different
geographical entities in different countries. It would be difficult to retrieve a correct answer if
the location of the state cannot be determined during probing. We took two steps to address
the problem. First, we query the LM to discover a list of possible countries where a state is
located using the first prompt: “[SubjectEntity] is a state in [MASK], which
is a country.” Second, we embed a possible country name in the next prompt
to probe for bordering states. The second prompt is: “[SubjectEntity] and [MASK] are
neighboring states in [ObjectEntity]”, where the “[ObjectEntity]” is replaced
by a result of probing with the first prompt. This strategy effectively narrowed the scope of the
prompt query and successfully improved the precision and recall metrics. We ended up
improving the F1-score from the baseline’s 0.01% to 31%.</p>
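        <p>The two-step probing described above can be sketched as follows, with a mock standing in for the masked-LM query: step 1 locates the state's country, and step 2 embeds that country in the border-state prompt to narrow the query. The candidate lists are illustrative stand-ins:</p>

```python
def probe_lm(prompt):
    """Stand-in for a masked-LM query; returns candidate fillers for [MASK]."""
    fake_outputs = {
        "Bavaria is a state in [MASK], which is a country.": ["Germany"],
        "Bavaria and [MASK] are neighboring states in Germany.": ["Hesse", "Saxony"],
    }
    return fake_outputs.get(prompt, [])

def border_states(subject):
    """Step 1: find candidate countries; step 2: probe for states within each."""
    results = []
    for country in probe_lm(f"{subject} is a state in [MASK], which is a country."):
        results += probe_lm(f"{subject} and [MASK] are neighboring states in {country}.")
    return results

print(border_states("Bavaria"))  # -> ['Hesse', 'Saxony']
```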
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Test Results</title>
      <p>We submitted our predictions to the challenge's evaluation on CodaLab
(https://codalab.lisn.upsaclay.fr/competitions/5815) and compared our test results with the
results of the “Baseline” (https://github.com/lm-kbc/dataset/blob/main/baseline.py).
The comparison shows that we improved the overall average F1-score
from the Baseline’s 31% to 49%.</p>
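      <p>The overall score can be read as a macro average of per-input, set-based F1-scores. The sketch below is a standard formulation, not the challenge's official scorer, which may differ in details such as how empty gold and prediction sets are credited (here a both-empty pair scores 1.0):</p>

```python
# Set-based F1 for one (SubjectEntity, relation) pair, macro-averaged over pairs.
def f1(pred, gold):
    pred, gold = set(pred), set(gold)
    if not pred and not gold:
        return 1.0  # correctly predicting "no objects" gets full credit
    if not pred or not gold:
        return 0.0
    p = len(pred & gold) / len(pred)
    r = len(pred & gold) / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0

def macro_f1(rows):
    """rows: list of (predicted_objects, gold_objects) pairs."""
    return sum(f1(p, g) for p, g in rows) / len(rows)

rows = [(["France", "Poland"], ["France", "Austria"]), ([], [])]
print(round(macro_f1(rows), 2))  # -> 0.75
```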
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>All the prompts were manually designed based on human intuition and trial and error. At the
end, we also manually aggregated the results and removed stop words. It would be more useful
and helpful if prompts could be developed in a systematic and general way for future KBC-LM
tasks. We will investigate automatic methods that can learn appropriate prompts by matching
training triples to text corpora.</p>
      <p>By participating in this challenge, we have learned valuable insights about the “knowledge”
of a language model, in particular, the BERT Large Cased Model. We found that it was relatively
easier to probe scientific knowledge from the LM than to retrieve facts about social events such
as the cause of the death of a famous person. A possible reason could be that the text corpora
used for training the LM contained noisier information about social events than about scientific
facts and rules.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Implementation</title>
      <p>Our implementation is available in a GitHub repository:
https://github.com/anyuanay/KBC-LM-Drexel. The repository contains the following content:
• main.py: the main entry point
• MyTools.py: prompts and other intermediate processing
• Processors.py: optimizing parameters such as top_k and thresholds
• MyHelpers.py: helper functions for some of the logic
• baseline.py: script provided by the organizers (https://github.com/lm-kbc/dataset/blob/main/baseline.py); called by our program
• file_io.py: script provided by the organizers; called by our program
• README.txt
• data/
– test.jsonl
– predictions.jsonl</p>
      <p>The README file in the directory contains instructions to run the system.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>This challenge provides multiple types of relations for LM probing. We designed and tested
relation-specific prompts and answer-processing steps. The test results showed that our probing
significantly improved on the baseline, from a 31% to a 49% macro average F1-score. The
insights we learned from this challenge will guide our selection of appropriate LMs for future
knowledge base construction tasks.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This project is partially supported by the Drexel Office of Faculty Affairs’ 2022 Faculty Summer
Research awards #284213.
[32] X. Chen, N. Zhang, X. Xie, S. Deng, Y. Yao, C. Tan, F. Huang, L. Si, H. Chen, KnowPrompt:
Knowledge-aware prompt-tuning with synergistic optimization for relation extraction,
in: Proceedings of the ACM Web Conference 2022, WWW ’22, New York, NY, USA,
2022, pp. 2778–2788. URL: https://doi.org/10.1145/3485447.3511998. doi:10.1145/3485447.3511998.
[33] X. L. Li, P. Liang, Prefix-tuning: Optimizing continuous prompts for generation, in:
ACL, Association for Computational Linguistics, Online, 2021, pp. 4582–4597. URL:
https://aclanthology.org/2021.acl-long.353. doi:10.18653/v1/2021.acl-long.353.
[34] B. Lester, R. Al-Rfou, N. Constant, The power of scale for parameter-efficient prompt tuning,
in: EMNLP, Online and Punta Cana, Dominican Republic, 2021, pp. 3045–3059. URL:
https://aclanthology.org/2021.emnlp-main.243. doi:10.18653/v1/2021.emnlp-main.243.
[35] T. He, K. Cho, J. R. Glass, An empirical study on few-shot knowledge probing for pretrained
language models, CoRR abs/2109.02772 (2021). URL: https://arxiv.org/abs/2109.02772. arXiv:2109.02772.
[36] N. Poerner, U. Waltinger, H. Schütze, E-BERT: Efficient-yet-effective entity embeddings
for BERT, arXiv e-prints (2019) arXiv:1911.03681.
[37] B. Dhingra, J. R. Cole, J. M. Eisenschlos, D. Gillick, J. Eisenstein, W. W. Cohen,
Time-aware language models as temporal knowledge bases, Transactions of the Association for
Computational Linguistics 10 (2022) 257–273. URL: https://aclanthology.org/2022.tacl-1.15. doi:10.1162/tacl_a_00459.
[38] M. Sung, J. Lee, S. Yi, M. Jeon, S. Kim, J. Kang, Can language models be biomedical
knowledge bases?, in: EMNLP, 2021.
[39] Z. Meng, F. Liu, E. Shareghi, Y. Su, C. Collins, N. Collier, Rewire-then-probe: A contrastive
recipe for probing biomedical knowledge of pre-trained language models, in: 60th ACL, 2022.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in: NAACL, Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          , V. Stoyanov,
          <article-title>RoBERTa: A Robustly Optimized BERT Pretraining Approach</article-title>
          , arXiv e-prints (
          <year>2019</year>
          ) arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Herbert-Voss</surname>
          </string-name>
          , G. Krueger,
          <string-name>
            <given-names>T.</given-names>
            <surname>Henighan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Winter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          , E. Sigler,
          <string-name>
            <given-names>M.</given-names>
            <surname>Litwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chess</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Berner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McCandlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          , in:
          <string-name>
            <given-names>H.</given-names>
            <surname>Larochelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ranzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hadsell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Balcan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lin</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          . URL: https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Narang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Matena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Exploring the limits of transfer learning with a unified text-to-text transformer</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>67</lpage>
          . URL: http://jmlr.org/papers/v21/20-074.html.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Schick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          ,
          <article-title>Exploiting cloze-questions for few-shot text classification and natural language inference</article-title>
          ,
          <source>in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics</source>
          , Online,
          <year>2021</year>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>269</lpage>
          . URL: https://aclanthology.org/2021.eacl-main.20. doi:10.18653/v1/2021.eacl-main.20.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Schick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          ,
          <article-title>It's not just size that matters: Small language models are also few-shot learners</article-title>
          , in: NAACL, Online,
          <year>2021</year>
          , pp.
          <fpage>2339</fpage>
          -
          <lpage>2352</lpage>
          . URL: https://aclanthology.org/2021.naacl-main.185. doi:10.18653/v1/2021.naacl-main.185.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Making pre-trained language models better few-shot learners</article-title>
          , in: ACL, Online,
          <year>2021</year>
          , pp.
          <fpage>3816</fpage>
          -
          <lpage>3830</lpage>
          . URL: https://aclanthology.org/2021.acl-long.295. doi:10.18653/v1/2021.acl-long.295.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          , J. Liu,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Template-based named entity recognition using BART</article-title>
          , in: Findings of ACL-IJCNLP, Online,
          <year>2021</year>
          , pp.
          <fpage>1835</fpage>
          -
          <lpage>1845</lpage>
          . URL: https://aclanthology.org/2021.findings-acl.161. doi:10.18653/v1/2021.findings-acl.161.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis</article-title>
          , arXiv e-prints (
          <year>2021</year>
          ) arXiv:2109.08306.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>PTR: Prompt tuning with rules for text classification</article-title>
          ,
          <source>arXiv preprint arXiv:2105.11259</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aghajanyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Okhonko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          , L. Zettlemoyer,
          <article-title>HTLM: Hyper-Text Pre-Training and Prompting of Language Models</article-title>
          , arXiv e-prints (
          <year>2021</year>
          ) arXiv:2107.06955.
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Choe</surname>
          </string-name>
          , E. Charniak,
          <article-title>Parsing as language modeling</article-title>
          , in: EMNLP, Austin, Texas,
          <year>2016</year>
          , pp.
          <fpage>2331</fpage>
          -
          <lpage>2336</lpage>
          . URL: https://aclanthology.org/D16-1257. doi:10.18653/v1/D16-1257.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bakhtin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>Language models as knowledge bases?</article-title>
          , in: EMNLP-IJCNLP, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>2463</fpage>
          -
          <lpage>2473</lpage>
          . URL: https://aclanthology.org/D19-1250. doi:10.18653/v1/D19-1250.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , X. Han,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          , Q. Liu,
          <article-title>ERNIE: Enhanced language representation with informative entities</article-title>
          , in: ACL, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>1441</fpage>
          -
          <lpage>1451</lpage>
          . URL: https://aclanthology.org/P19-1139. doi:10.18653/v1/P19-1139.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , H.-W. Hon,
          <article-title>Unified Language Model Pre-Training for Natural Language Understanding and Generation</article-title>
          , Curran Associates Inc., Red Hook, NY, USA,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Piao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , H.-W. Hon,
          <article-title>UniLMv2: Pseudo-masked language models for unified language model pre-training</article-title>
          ,
          <source>in: Proceedings of the 37th International Conference on Machine Learning, ICML'20</source>
          , JMLR.org,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>ERNIE-M: Enhanced multilingual representation by aligning cross-lingual semantics with monolingual corpora</article-title>
          , in: EMNLP, Online and Punta Cana, Dominican Republic,
          <year>2021</year>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>38</lpage>
          . URL: https://aclanthology.org/2021.emnlp-main.3. doi:10.18653/v1/2021.emnlp-main.3.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          , L. Zettlemoyer,
          <article-title>BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          , in: ACL, Online,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          . URL: https://aclanthology.org/2020.acl-main.703. doi:10.18653/v1/2020.acl-main.703.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          , T.-Y. Liu,
          <article-title>MASS: Masked sequence to sequence pre-training for language generation</article-title>
          ,
          <source>in: International Conference on Machine Learning</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>5926</fpage>
          -
          <lpage>5936</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. F.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Araki</surname>
          </string-name>
          , G. Neubig,
          <article-title>How Can We Know What Language Models Know?</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>423</fpage>
          -
          <lpage>438</lpage>
          . URL: https://doi.org/10.1162/tacl_a_00324. doi:10.1162/tacl_a_00324.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yuan</surname>
          </string-name>
          , G. Neubig, P. Liu,
          <article-title>BARTScore: Evaluating generated text as text generation</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Ranzato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beygelzimer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dauphin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Vaughan</surname>
          </string-name>
          (Eds.),
          <source>NeurIPS</source>
          , volume
          <volume>34</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>27263</fpage>
          -
          <lpage>27277</lpage>
          . URL: https://proceedings.neurips.cc/paper/2021/file/e4d2b6e6fdeca3e60e0f1a62fee3d9dd-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Haviv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Berant</surname>
          </string-name>
          , A. Globerson,
          <article-title>BERTese: Learning to speak to BERT</article-title>
          ,
          <source>in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</source>
          , Online,
          <year>2021</year>
          , pp.
          <fpage>3618</fpage>
          -
          <lpage>3623</lpage>
          . URL: https://aclanthology.org/2021.eacl-main.316. doi:10.18653/v1/2021.eacl-main.316.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>E.</given-names>
            <surname>Wallace</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kandpal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Universal adversarial triggers for attacking and analyzing NLP</article-title>
          , in: EMNLP-IJCNLP, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>2153</fpage>
          -
          <lpage>2162</lpage>
          . URL: https://aclanthology.org/D19-1221. doi:10.18653/v1/D19-1221.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ben-David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Oved</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reichart</surname>
          </string-name>
          ,
          <article-title>PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>414</fpage>
          -
          <lpage>433</lpage>
          . URL: https://doi.org/10.1162/tacl_a_00468. doi:10.1162/tacl_a_00468.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Davison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Feldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <article-title>Commonsense knowledge mining from pretrained models</article-title>
          , in: EMNLP-IJCNLP, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>1173</fpage>
          -
          <lpage>1178</lpage>
          . URL: https://aclanthology. org/D19-1109. doi:
          <volume>10</volume>
          .18653/v1/
          <fpage>D19</fpage>
          -1109.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>X. L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Prefix-tuning: Optimizing continuous prompts for generation</article-title>
          , in: ACL, Online,
          <year>2021</year>
          , pp.
          <fpage>4582</fpage>
          -
          <lpage>4597</lpage>
          . URL: https://aclanthology.org/2021.acl-long.353. doi:10.18653/v1/2021.acl-long.353.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Factual probing is [MASK]: Learning vs. learning to recall</article-title>
          , in: NAACL,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <article-title>GPT understands, too</article-title>
          ,
          <source>CoRR abs/2103.10385</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2103.10385. arXiv:2103.10385.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anastasopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Araki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Neubig</surname>
          </string-name>
          ,
          , X-FACTR:
          <article-title>Multilingual factual knowledge retrieval from pretrained language models</article-title>
          , in: EMNLP, Online,
          <year>2020</year>
          , pp.
          <fpage>5943</fpage>
          -
          <lpage>5959</lpage>
          . URL: https://aclanthology.org/2020.emnlp-main.479. doi:10.18653/v1/2020.emnlp-main.479.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <article-title>Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach</article-title>
          , in: EMNLP-IJCNLP, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>3914</fpage>
          -
          <lpage>3923</lpage>
          . URL: https://aclanthology.org/D19-1404. doi:10.18653/v1/D19-1404.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>T.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Razeghi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Logan IV</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wallace</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts</article-title>
          , in: EMNLP, Online,
          <year>2020</year>
          , pp.
          <fpage>4222</fpage>
          -
          <lpage>4235</lpage>
          . URL: https://aclanthology.org/2020.emnlp-main.346. doi:10.18653/v1/2020.emnlp-main.346.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>