=Paper= {{Paper |id=Vol-3747/paper3 |storemode=property |title=Incorporating Type Information into Zero-Shot Relation Extraction |pdfUrl=https://ceur-ws.org/Vol-3747/text2kg_paper3.pdf |volume=Vol-3747 |authors=Ricardo Usbeck,Cedric Möller |dblpUrl=https://dblp.org/rec/conf/text2kg/UsbeckM24 }} ==Incorporating Type Information into Zero-Shot Relation Extraction== https://ceur-ws.org/Vol-3747/text2kg_paper3.pdf
                         Incorporating Type Information into Zero-Shot Relation
                         Extraction
                         Cedric Möller¹,*, Ricardo Usbeck²
                         ¹ Universität Hamburg, Department of Informatics, Semantic Systems, Germany
                         ² Leuphana Universität Lüneburg, Institute for Information Systems, Artificial Intelligence and Explainability, Germany


                                         Abstract
                                         The task of zero-shot relation extraction focuses on the extraction of relations not seen during training time.
                                         Commonly, additional information about the relation such as the relation name or a description of the relation is
                                         utilised. In this work, we analyze whether a relation extractor can benefit from the inclusion of fine-grained type
                                         information about the involved entities. This is based on the intuition that relation descriptions might contain
                                         ontological information on the domain and range of the entity types that are usually put into relation. For that,
                                         we follow a cross-encoding setup in which we encode both the entity information and the relation information as one
                                         sequence and learn to score the representation. We examine this method on several datasets and show that the
                                         inclusion of the fine-grained type information leads to an improvement in performance.

                                         Keywords
                                         Relation Extraction, Zero-shot, Entity types




                         1. Introduction
                         Identifying the relation that is expressed between entities is a very important subproblem of various
                         downstream tasks. For instance, it is critical to handle semantic-web-related tasks such as knowledge
                         graph question answering or knowledge graph population. Usually, it is assumed that the encountered
                         relations are known beforehand. Zero-shot relation extraction breaks with this assumption: at inference
                         time, the goal is to extract entirely new relations that were not seen during training.
                            With the establishment of pre-trained models, this goal becomes achievable. Those models are trained
                         on large corpora of textual data in an unsupervised way. In zero-shot relation extraction, one assumes
                         that some information on the new relations is available. The simplest kind of information is a label
                         describing the relation. But this only works if the relation label co-occurs with a similar context as
                         encountered during the training of the pre-trained models. If this is not the case, using additional
                         information such as a description of the relation is necessary.
                            In this work, we analyse the impact of combining fine-grained type information and the relation
                         description on the relation extraction performance. This is based on the assumption that the descriptions
                         contain valuable information on the types of the involved entities. For example, the description of the
                         relation director states "director(s) of film, TV-series, stageplay, video game or
                         similar". It is therefore clear that the relation should not be used when talking about board members
                         of a company, who are also sometimes referred to as directors. We incorporate fine-grained type information
                         extracted from Wikidata together with the relation descriptions in the relation extraction process.¹
                            The contributions are:

                                 • Zero-shot relation extraction model using fine-grained type information and relation descriptions
                                 • Ablation study on the impact of fine-grained type information and relation descriptions on the
                                   performance

                         TEXT2KG 2024: Third International Workshop on Knowledge Graph Generation from Text, May 26-30, 2024, co-located with
                         Extended Semantic Web Conference (ESWC), Hersonissos, Greece
                         * Corresponding author.
                         Email: cedric.moeller@uni-hamburg.de (C. Möller); ricardo.usbeck@leuphana.de (R. Usbeck)
                         ORCID: 0000-0001-6700-3482 (C. Möller); 0000-0002-0191-7211 (R. Usbeck)
                         © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                         ¹ Code/Data available at: https://github.com/semantic-systems/zero-shot-re

2. Methodology
2.1. Problem Definition
The problem of relation extraction can be defined as follows: Given an input text 𝑐 with an annotated head
entity ℎ and tail entity 𝑡, identify the correct relation 𝑟 expressed in the text. Zero-shot relation extraction
separates the set of relations encountered during training from the ones encountered during inference.
Hence, during training time, the set of available relations is 𝑅train, while during test time, the set is 𝑅test.
It holds that 𝑅train ∩ 𝑅test = ∅. Also, no annotated examples containing any relations in the set 𝑅test are
available during inference time. Instead, additional information is available: we assume that relation
labels, relation descriptions and type information on the entities are given.

2.2. Method
To study the impact of fine-grained type information, we opt to extend a simple but powerful model
introduced by Lan et al. [1]. Hence, we cross-encode the information of the text and the relation
information in a single input. Different from their work, we do not solely rely on the relation label
but also include the relation description. Additionally, we assume the existence of fine-grained types
for both the head and the tail entity, extracted using the P31 (instance of) relation in Wikidata. We include the
relation description under the assumption that it contains valuable ontological information referring to
the fine-grained types of the considered entities. For example, for the relation shipping port, the
description is
      shipping port of the vessel (if different from
      "ship registry"): For civilian ships, the
      primary port from which the ship operates ...
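
  The P31 lookup mentioned above could, for instance, be expressed as a query against the public Wikidata SPARQL service. The following is only a minimal sketch: the paper does not state whether the types were obtained from the query service or from a dump, and the function name `build_type_query` and the entity identifier in the usage line are illustrative assumptions.

```python
def build_type_query(entity_id, language="en"):
    """Build a SPARQL query listing the P31 (instance of) types of an entity.

    Assumption: the query targets the public Wikidata query service, where the
    prefixes wd:, wdt:, wikibase: and bd: are predefined.
    """
    return (
        "SELECT ?type ?typeLabel WHERE {\n"
        f"  wd:{entity_id} wdt:P31 ?type .\n"
        "  SERVICE wikibase:label { "
        f'bd:serviceParam wikibase:language "{language}" . '
        "}\n"
        "}"
    )


# Illustrative usage with an arbitrary entity identifier.
print(build_type_query("Q42"))
```

  The `?typeLabel` variable returned by such a query corresponds to the type labels that are later concatenated into the model input.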

  We denote the types of the head entity by 𝒯ℎ and the types of the tail entity by 𝒯𝑡 . Additionally, for
each type of an entity, we extract the label describing the type (e.g., human for Q5). The input 𝑥 to the
model then consists of four different segments. The first segment describes the head entity:
    Head Entity : {𝑙ℎ } with Types : {𝑇ℎ }

and the second segment describes the tail entity:
    Tail Entity : {𝑙𝑡 } with Types : {𝑇𝑡 }

where 𝑙□ denotes the label of the head entity ℎ or tail entity 𝑡. 𝑇□ is the concatenation of the labels of
the types of the head or tail entity: 𝑇□ = ⨁_{𝑢 ∈ 𝒯□} 𝑙𝑢.
  The third segment gives information on the input text:
    Context : {𝑐}

  The final segment gives information on the relation:
    {𝑙𝑟 } defined as {𝑑𝑟 }

  where 𝑙𝑟 denotes the label of the relation 𝑟 and 𝑑𝑟 is the description of the relation 𝑟.
  All segments are then combined into a single coherent text as follows:
    [CLS] Given the Head Entity : {𝑙ℎ } with
    Types : {𝑇ℎ }, Tail Entity : {𝑙𝑡 } with
    Types : {𝑇𝑡 } and Context : {𝑐}, the context
    expresses the relation [SEP] {𝑙𝑟 } defined
    as {𝑑𝑟 } [SEP]
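
  The template above can be sketched as a small helper that fills in the placeholders. This is an illustration under stated assumptions, not the released implementation: the function names, the comma used to join several type labels, and the shortened relation description in the usage example are choices made here. In practice a BERT-style tokenizer would insert [CLS] and [SEP] itself when given the two segments as a pair.

```python
def build_segments(head_label, head_types, tail_label, tail_types,
                   context, relation_label, relation_description,
                   type_separator=", "):
    """Return the two text segments of the template (without special tokens).

    Assumption: multiple type labels are joined with `type_separator`.
    """
    head_type_text = type_separator.join(head_types)
    tail_type_text = type_separator.join(tail_types)
    entity_and_context = (
        f"Given the Head Entity : {head_label} with Types : {head_type_text}, "
        f"Tail Entity : {tail_label} with Types : {tail_type_text} "
        f"and Context : {context}, the context expresses the relation"
    )
    relation = f"{relation_label} defined as {relation_description}"
    return entity_and_context, relation


def render(entity_and_context, relation):
    """Spell out the special tokens a BERT-style tokenizer would normally add."""
    return f"[CLS] {entity_and_context} [SEP] {relation} [SEP]"


# Illustrative usage with the second example from the case study.
segments = build_segments(
    head_label="Putnam Griswold", head_types=["human"],
    tail_label="bass", tail_types=["voice type"],
    context="Putnam Griswold (1875-1914) was an American opera singer (bass), "
            "born in Minneapolis, Minnesota.",
    relation_label="voice type",
    relation_description="person's voice type",
)
print(render(*segments))
```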

   The whole text is then fed into an encoder-only model 𝑓(𝑥), which returns a sequence of vectors
𝑒[𝐶𝐿𝑆] . . . 𝑒[𝑆𝐸𝑃]. The vector 𝑒[𝐶𝐿𝑆] is then fed as input to a linear layer 𝑙, which returns the
final score:
                                                 𝑠𝑟 = 𝑙(𝑒[𝐶𝐿𝑆] )
Figure 1: Model overview. Green marks the types, blue the entities, orange the context, red the relation label
and purple the description of the relation.


   This is done for each candidate relation, which gives us |𝑅test| scores at inference time. The relation with
the highest score is taken as the prediction. All candidate relations are known beforehand. During training,
the model is optimized using a cross-entropy loss. Each example contains a single positive relation, and the
model learns to distinguish it from incorrect relations. For that, 𝑛 other
relations are sampled and used as negative examples.²
   An overview of the model can be found in Figure 1.
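
   The selection of the highest-scoring relation and the training objective can be sketched in plain Python. This is a simplified illustration, assuming that the cross-entropy is a softmax over the score of the positive relation and the scores of the 𝑛 sampled negatives; the function names are hypothetical and the scores are stand-ins for the outputs of the linear layer.

```python
import math
import random


def predict_relation(scores):
    """Return the candidate relation with the highest score (argmax)."""
    return max(scores, key=scores.get)


def sample_negatives(positive, train_relations, n, rng):
    """Sample n distinct relations that differ from the positive relation."""
    candidates = [r for r in train_relations if r != positive]
    return rng.sample(candidates, n)


def cross_entropy(positive_score, negative_scores):
    """Softmax cross-entropy with the positive relation as the target class."""
    all_scores = [positive_score] + list(negative_scores)
    max_score = max(all_scores)  # subtracted for numerical stability
    log_sum_exp = max_score + math.log(
        sum(math.exp(s - max_score) for s in all_scores)
    )
    return log_sum_exp - positive_score


# Illustrative usage with made-up scores and relation names.
rng = random.Random(0)
relations = ["genre", "manufacturer", "use", "voice type", "director", "shipping port"]
negatives = sample_negatives("genre", relations, n=5, rng=rng)
loss = cross_entropy(2.0, [0.0] * len(negatives))
prediction = predict_relation({"genre": 1.2, "manufacturer": -0.3})
```

   With one positive and 𝑛 negatives, an untrained scorer that assigns equal scores to all candidates yields a loss of log(𝑛 + 1).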


3. Evaluation
We evaluate the model on two popular datasets, FewRel and Wiki-ZSL. Both datasets were annotated
on Wikipedia article texts. FewRel is originally a few-shot relation extraction dataset annotated by Han
et al. [2]. The dataset was modified for zero-shot purposes by Chia et al. [3]. They split the training,
validation and test examples by their relations into disjoint sets. Wiki-ZSL is a zero-shot relation
extraction dataset created by Chen et al. [4] based on the Wiki-KB [5]. As the entities and relations in
both datasets are linked to Wikidata, we focus on it as the knowledge graph providing the fine-grained
entity types.
   In each dataset, the sets of relations in the training and test data are disjoint and randomly assigned.
Three different settings are examined per dataset. Each setting considers a different number of relations
in the train, validation and test set. The number of relations in the validation/test set is either
𝑚 = 5, 𝑚 = 10 or 𝑚 = 15. These relations are randomly picked and the remaining relations
are assigned to the training set.
   To handle the considerable noise induced by the random selection of the relations, the datasets for
𝑚 = 5, 𝑚 = 10 and 𝑚 = 15 were each randomly split into train, validation and test sets five times. A
method is evaluated on each split and the results are averaged.
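
   The splitting protocol can be sketched as follows. The sketch assumes that 𝑚 relations each are drawn for the validation and the test set and that the five splits differ only in the random seed; the function names, the seeds and the `evaluate` callback are hypothetical.

```python
import random


def split_relations(relations, m, seed):
    """Randomly pick m test and m validation relations; the rest is used for training."""
    rng = random.Random(seed)
    shuffled = sorted(relations)  # sort first so the split depends only on the seed
    rng.shuffle(shuffled)
    test = set(shuffled[:m])
    validation = set(shuffled[m:2 * m])
    train = set(shuffled[2 * m:])
    return train, validation, test


def average_over_splits(relations, m, evaluate, seeds=(0, 1, 2, 3, 4)):
    """Average a metric returned by `evaluate(train, validation, test)` over several splits."""
    results = [evaluate(*split_relations(relations, m, seed)) for seed in seeds]
    return sum(results) / len(results)


# Illustrative usage with placeholder relation identifiers.
relation_ids = [f"R{i}" for i in range(30)]
train, validation, test = split_relations(relation_ids, m=5, seed=0)
```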
   As metrics, precision, recall and F1 are calculated. All metrics are computed in a macro setting, which
means that precision, recall and F1 are calculated for each relation and then averaged over all
relations.
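
   The macro averaging described above can be sketched as follows. The function name and the convention of returning 0 when a denominator is zero are assumptions of this sketch, as is the restriction to relations that occur in the gold labels.

```python
from collections import Counter


def macro_scores(gold, predicted):
    """Return macro-averaged (precision, recall, F1) over the relations in `gold`."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # p was predicted although it is wrong
            false_neg[g] += 1  # g was missed

    relations = sorted(set(gold))
    if not relations:
        return 0.0, 0.0, 0.0

    precisions, recalls, f1s = [], [], []
    for r in relations:
        tp, fp, fn = true_pos[r], false_pos[r], false_neg[r]
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    count = len(relations)
    return sum(precisions) / count, sum(recalls) / count, sum(f1s) / count


# Illustrative usage with made-up predictions.
gold = ["genre", "genre", "use", "use"]
predicted = ["genre", "use", "use", "use"]
macro_p, macro_r, macro_f1 = macro_scores(gold, predicted)
```

   Because F1 is averaged per relation, the macro F1 is in general not the harmonic mean of the macro precision and the macro recall.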
   We compare our method, called TMC-BERT, against several methods. CIM [6] solves the task as a
textual entailment problem where the relation descriptions and the input sentence are given to a Natural
Language Inference model to classify whether the input sentence entails the relation description. This
is done for all potential relations and the highest-scoring relation is taken. ZS-BERT [7] encodes the input
sentence as well as the relation descriptions into a dense vector space. A nearest-neighbor search
is conducted over all the encodings of the relation descriptions given the input sentence. The closest
relation encoding determines the final relation. Tran et al. (2022) [8] also encode the input sentence and relation
descriptions into a dense vector space. They additionally employ a contrastive-learning-inspired loss on
the input sentence and relation encodings. The final scoring is achieved by concatenating the relation
encoding and the sentence encoding and feeding the result into a linear layer. RE-Matching [9] encodes the
input sentence and relation descriptions as well but uses feature distillation to calculate a similarity
score based on more fine-grained feature interactions. RelationPrompt [3] relies on a generative model
to generate synthetic data as additional training samples. At the same time, the generative model is also
used to generate a relation given the sentence and the two entities as input. We compare against the
model with (RelationPrompt) and without (RelationPrompt NG) synthetic training data. MC-BERT [1]
models relation extraction, similar to us, as a multiple-choice problem where the input sentence and
the relation label are arranged together in a natural sentence, encoded and scored. DSP-ZRSC [10]
solves the problem via Discriminative Soft Prompting where the input text, the entities and all relation
labels are concatenated and fed into a prompt-discriminative language model, and each relation label is
scored. Tran et al. (2023) [11] solve it as a representation learning problem and introduce a second loss
term incorporating the degree of correlation between sentences and relations.
   BERT-base-cased was used as the model to stay comparable to MC-BERT. The model was fine-tuned
on two NVIDIA A6000s with a batch size of 48 and a learning rate of 5e-5.

² In our experiments we set 𝑛 = 5.

3.1. Results
As can be seen in Table 1, the incorporation of type-related information leads to a large increase in
performance in most settings in comparison to regular MC-BERT. On Wiki-ZSL, the F1
increases vary between 6 and nearly 8 points. The type-related information has a great impact on
both recall and precision. On FewRel, the performance increases when considering 5 or 15 unseen
relations, although the increases are less pronounced. In comparison to the current SOTA
method by Tran et al. [11], TMC-BERT achieves considerably higher performance when confronted with 15
unseen relations. This is the most complex setting, as far fewer examples and relations are available
during training while more potential relations are encountered during inference. Here, the additional
type information helps substantially. Furthermore, the inclusion of fine-grained type information is orthogonal
to the properties of the method by Tran et al. [11], so their method could benefit from it as well.

3.2. Ablation study
To examine what changes lead to the large increase in performance, we conducted an ablation study on
the incorporation of different kinds of information. Here, we differentiated between three cases:
   1. TMC-BERT
   2. TMC-BERT with the types of the head and tail entity removed (TMC-BERT w/o types)
   3. TMC-BERT with the description of the relation removed (TMC-BERT w/o desc.)
   As can be seen in Table 2, the addition of the relation description alone was the least beneficial type
of information. Adding information on entity types leads to a larger improvement, probably as the
pre-trained model already associates specific types with certain relation labels. Finally, the ablation
study shows that the relation description and fine-grained entity type information complement each
other, as using each separately does not lead to as large an increase in performance as using them
together.
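
   To make this comparison concrete, the F1 gains on Wiki-ZSL can be recomputed from the values reported in Tables 1 and 2. The sketch below treats MC-BERT as the variant without types and without descriptions, which is an assumption made here since its input template may differ in other details; the variable and function names are illustrative.

```python
# F1 scores on Wiki-ZSL copied from Tables 1 and 2, keyed by the number of unseen relations m.
mc_bert = {5: 82.11, 10: 73.38, 15: 66.40}           # MC-BERT (relation label only)
types_only = {5: 84.74, 10: 77.70, 15: 71.73}        # TMC-BERT w/o desc.
description_only = {5: 84.68, 10: 75.46, 15: 69.16}  # TMC-BERT w/o types
both = {5: 88.92, 10: 81.23, 15: 73.77}              # TMC-BERT


def f1_gains(baseline, variant):
    """Return the F1 difference between a variant and the baseline for every m."""
    return {m: round(variant[m] - baseline[m], 2) for m in baseline}


gains = {
    "types only": f1_gains(mc_bert, types_only),
    "description only": f1_gains(mc_bert, description_only),
    "both": f1_gains(mc_bert, both),
}
for name, per_m in gains.items():
    print(name, per_m)
```

   Under this reading, the gain from types alone (2.63, 4.32 and 5.33 points for 𝑚 = 5, 10 and 15) exceeds the gain from descriptions alone (2.57, 2.08 and 2.76 points) in every setting, and the combination yields the largest gain (6.81, 7.85 and 7.37 points).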

                                   Wiki-ZSL                   FewRel
  𝑚    Model                   P       R      F1         P       R      F1
  5    CIM                   49.63   48.81   49.22     58.05   61.92   59.92
       ZS-BERT               71.54   72.39   71.96     76.96   78.86   77.90
       Tran et al. (2022)    87.48   77.50   82.19     87.11   86.29   86.69
       RelationPrompt NG     51.78   46.76   48.93     72.36   58.61   64.57
       RelationPrompt        70.66   83.75   76.63     90.15   88.50   89.30
       RE-Matching           78.19   78.41   78.30     92.82   92.34   92.58
       DSP-ZRSC              94.1    77.1    84.8      93.4    92.5    92.9
       Tran et al. (2023)    94.50   96.48   95.46     96.36   96.68   96.51
       MC-BERT               80.28   84.03   82.11     90.82   90.13   90.47
       TMC-BERT              90.11   87.89   88.92     93.94   93.30   93.62
  10   CIM                   46.54   47.90   45.57     47.39   49.11   48.23
       ZS-BERT               60.51   60.98   60.74     56.92   57.59   57.25
       Tran et al. (2022)    71.59   64.69   67.94     64.41   62.61   63.50
       RelationPrompt NG     54.87   36.52   43.80     66.47   48.28   55.61
       RelationPrompt        68.51   74.76   71.50     80.33   79.62   79.96
       RE-Matching           74.39   73.54   73.96     83.21   82.64   82.93
       DSP-ZRSC              80.0    74.0    76.9      80.7    88.0    84.2
       Tran et al. (2023)    85.43   88.14   86.74     81.13   82.24   81.68
       MC-BERT               72.81   73.96   73.38     86.57   85.27   85.92
       TMC-BERT              81.21   81.27   81.23     84.42   84.99   85.68
  15   CIM                   29.17   30.58   29.86     31.83   33.06   32.43
       ZS-BERT               34.12   34.38   34.25     35.54   38.19   36.82
       Tran et al. (2022)    38.37   36.05   37.17     43.96   39.11   41.36
       RelationPrompt NG     54.45   29.43   37.45     66.49   40.05   49.38
       RelationPrompt        63.69   67.93   65.74     74.33   72.51   73.40
       RE-Matching           67.31   67.33   67.32     73.80   73.52   73.66
       DSP-ZRSC              77.5    64.4    70.4      82.9    78.1    80.4
       Tran et al. (2023)    64.68   65.01   65.30     66.44   69.29   67.82
       MC-BERT               65.71   67.11   66.40     80.71   79.84   80.27
       TMC-BERT              73.62   74.07   73.77     82.11   79.93   81.00

Table 1: Results on FewRel and Wiki-ZSL

                                   Wiki-ZSL                   FewRel
  𝑚    Model                   P       R      F1         P       R      F1
  5    TMC-BERT w/o desc.    85.56   84.07   84.74     93.96   93.26   93.61
       TMC-BERT w/o types    85.00   84.41   84.68     93.33   92.50   92.91
       TMC-BERT              90.11   87.89   88.92     93.94   93.30   93.62
  10   TMC-BERT w/o desc.    77.26   78.16   77.70     85.24   83.29   84.25
       TMC-BERT w/o types    74.89   76.05   75.46     85.16   83.36   84.24
       TMC-BERT              81.21   81.27   81.23     84.42   84.99   85.68
  15   TMC-BERT w/o desc.    72.33   71.16   71.73     79.22   76.46   79.79
       TMC-BERT w/o types    68.53   69.81   69.16     79.22   78.19   78.69
       TMC-BERT              73.62   74.07   73.77     82.11   79.93   81.00

Table 2: Ablation study on FewRel and Wiki-ZSL


3.3. Entity Linking impact
As it is not realistic that fine-grained type information is always available, we also evaluate the model
when identifying entity types using an entity linker (EL). For that, we train the model with known
entity types but evaluate with the entity types as retrieved by an entity linker. As an entity linker, we
use ReFinED [12]. As can be seen in Table 3, the performance diminishes in most settings when using types
identified through entity linking. On Wiki-ZSL, the performance still surpasses that of MC-BERT (Table 1)
in all three settings. On FewRel, the performance is preserved with five unseen relations but
drops more noticeably when having to predict 10 or 15 relations. One reason might be that the entity linking
performance is lower on FewRel than on Wiki-ZSL.

                                   Wiki-ZSL                   FewRel
  𝑚    Model                   P       R      F1         P       R      F1
  5    TMC-BERT              90.11   87.89   88.92     93.94   93.30   93.62
       TMC-BERT + EL         88.44   87.07   87.73     93.94   93.35   93.64
  10   TMC-BERT              81.21   81.27   81.23     84.42   84.99   85.68
       TMC-BERT + EL         81.16   81.22   81.18     84.88   83.43   84.14
  15   TMC-BERT              73.62   74.07   73.77     82.11   79.93   81.00
       TMC-BERT + EL         73.53   73.96   73.67     80.87   78.74   79.78

Table 3: Results on FewRel and Wiki-ZSL when using an entity linker



3.4. Case study
Table 4 illustrates two instances where the inclusion of type information or relation descriptions proved
beneficial. In the first case, specifying that MMORPG belongs to the video game genre facilitated the
correct classification of the genre relation. In the second example, highlighting that bass is a voice type
aligned the type label precisely with the voice type relation label. Additionally, the relation description
directly addressed the voice type of bass.

  Method                       TMC-BERT                                       MC-BERT
  Sentence with entity types   Gravity Corporation is a South Korean video game corporation primarily known for the
                               development of the MMORPG Ragnarok Online. {MMORPG: video game genre; Ragnarok
                               Online: video game}
  Classified relation          genre                                          manufacturer
  Description of relation      creative work’s genre or an artist’s field     main use of the subject (includes current
                               of work                                        and former usage)
  Sentence with entity types   Putnam Griswold (1875-1914) was an American opera singer (bass), born in Minneapolis,
                               Minnesota. {Putnam Griswold: human, bass: voice type}
  Classified relation          voice type                                     use
  Description of relation      person’s voice type. expected values:          main use of the subject (includes current
                               soprano, mezzo-soprano, contralto,             and former usage)
                               countertenor, tenor, baritone, bass
                               (and derivatives)

Table 4: Comparison of the predictions of TMC-BERT and MC-BERT on two different examples. The ground-truth
relations are genre and voice type, respectively. The interacting entities and their types are shown in
dictionaries following the sentences.



4. Related Work
Commonly, relation extraction is tackled as a classification problem. Usually, the input text is encoded and
a classification head is attached. To encode text, CNNs [13], RNNs [14] or transformers [15] are usually
employed. Recently, pre-trained models have been fine-tuned on the relation extraction task. Due to the
fixed classification head, such trained models are not flexible enough to handle new relations [16]. Hence,
when targeting zero-shot relation extraction, other methods are necessary. Representation-learning-based
methods [4, 8, 9, 11] try to embed the textual information and the relational information in
the same vector space. For that, relational information such as labels or descriptions is usually used to
get a representation of the relation. The goal is to learn representations such that the representation of
the true relation resides close to the representation of the text in the vector space while the representations
of false relations are far away. Recently, generative language models have been increasingly utilized for
the task [3, 10, 17, 18]. Here, the model is prompted with the input text as well as information on the
potential relations. The model is then fine-tuned to either generate the relation as expressed in the
input text or a full triple consisting of the two entities and the relation. For example, Chen et al. [18] model
it as solving a Masked Language Modelling problem. Also, generative models were applied to generate
synthetic training data for relation extraction [3].
   Type information was considered in previous works focusing on relation extraction but these works
either used very broad types or did not tackle zero-shot relation extraction [19, 20, 21].
   Some methods model the problem as a textual entailment problem [22, 23, 24]. Here, the idea is that
a model that is pre-trained on the textual entailment task is directly applied to the relation extraction
task. The assumption is that the model can identify whether the textual information entails the relation
description.
   The method by Lan et al. [1] models relation classification as a multiple-choice problem where the
text is encoded with relation information and a score is calculated. This is done for all relations and the
relation with the highest score is taken. We extend this approach.


5. Conclusion and Future Work
In this work, we examined the impact of fine-grained type information on the zero-shot relation
extraction problem. Different from past methods, we employed fine-grained type information as
additional information and showed that combining this information with the description of the relation
leads to a synergistic effect, improving the performance overall. We believe that this is the case because
the description information provides valuable ontological information on the domain and range of a
relation. This domain and range are then compared against the fine-grained type information of the
interacting entities. Furthermore, our ablation study confirmed that the increase in performance does indeed
stem from the combination of type and relation description information. Finally, we
studied the impact of using an entity linker to retrieve the entity types. While this leads to a decrease in
performance, the results often still considerably surpass the current SOTA in the most complex setting.
   In future work, we want to tackle multiple problems. First, it is not certain that one has access to
fine-grained type information during inference. Therefore, we want to examine whether the performance of
a trained entity typer is sufficient to produce similar results. Secondly, the current architecture follows a
cross-encoding approach. While this is not a problem when one encounters only a few relations during
inference, this is typically not the case in real-world use cases, where hundreds of different
relations could be encountered during inference. Cross-encoding the text with each one leads to
a substantial computational effort. A candidate generation module that pre-selects relations could reduce
this effort, and we want to examine whether such a
module might also benefit from fine-grained type information. Also, the training process currently
only trains the model by randomly sampling negative relations. Choosing the negative relations in a smarter
way might lead to additional improvement. Finally, the impact of fine-grained entity types from other
knowledge graphs needs to be evaluated.


Acknowledgments
This project was supported by the House of Computing and Data Science (HCDS) of the Hamburg
University within the Cross-Disciplinary Lab programme.
References
 [1] Y. Lan, D. Li, Y. Zhang, H. Zhao, G. Zhao, Modeling zero-shot relation classification as a multiple-
     choice problem, in: International Joint Conference on Neural Networks, IJCNN 2023, Gold Coast,
     Australia, June 18-23, 2023, IEEE, 2023, pp. 1–8. doi:10.1109/IJCNN54540.2023.10191459.
 [2] X. Han, H. Zhu, P. Yu, Z. Wang, Y. Yao, Z. Liu, M. Sun, Fewrel: A large-scale supervised few-
     shot relation classification dataset with state-of-the-art evaluation, in: E. Riloff, D. Chiang,
     J. Hockenmaier, J. Tsujii (Eds.), Proceedings of the 2018 Conference on Empirical Methods in
     Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, Association for
     Computational Linguistics, 2018, pp. 4803–4809. doi:10.18653/V1/D18-1514.
 [3] Y. K. Chia, L. Bing, S. Poria, L. Si, Relationprompt: Leveraging prompts to generate synthetic data for
     zero-shot relation triplet extraction, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.), Findings of the
     Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, Association
     for Computational Linguistics, 2022, pp. 45–57. doi:10.18653/V1/2022.FINDINGS-ACL.5.
 [4] C. Chen, C. Li, ZS-BERT: towards zero-shot relation extraction with attribute representation
     learning, in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tür, I. Beltagy, S. Bethard,
     R. Cotterell, T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North
     American Chapter of the Association for Computational Linguistics: Human Language Technologies,
     NAACL-HLT 2021, Online, June 6-11, 2021, Association for Computational Linguistics, 2021,
     pp. 3470–3479. doi:10.18653/V1/2021.NAACL-MAIN.272.
 [5] D. Sorokin, I. Gurevych, Context-aware representations for knowledge base relation extraction,
     in: M. Palmer, R. Hwa, S. Riedel (Eds.), Proceedings of the 2017 Conference on Empirical Methods
     in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017,
     Association for Computational Linguistics, 2017, pp. 1784–1789. doi:10.18653/V1/D17-1188.
 [6] T. Rocktäschel, E. Grefenstette, K. M. Hermann, T. Kociský, P. Blunsom, Reasoning about entailment
     with neural attention, in: Y. Bengio, Y. LeCun (Eds.), 4th International Conference on Learning
     Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings,
     2016. URL: http://arxiv.org/abs/1509.06664.
 [7] C. Chen, C. Li, ZS-BERT: towards zero-shot relation extraction with attribute representation
     learning, in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tür, I. Beltagy, S. Bethard,
     R. Cotterell, T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North
     American Chapter of the Association for Computational Linguistics: Human Language Technologies,
     NAACL-HLT 2021, Online, June 6-11, 2021, Association for Computational Linguistics, 2021,
     pp. 3470–3479. doi:10.18653/V1/2021.NAACL-MAIN.272.
 [8] V.-H. Tran, H. Ouchi, T. Watanabe, Y. Matsumoto, Improving discriminative learning for zero-shot
     relation extraction, in: Proceedings of the 1st Workshop on Semiparametric Methods in NLP:
     Decoupling Logic from Knowledge, 2022, pp. 1–6.
 [9] J. Zhao, W. Zhan, X. Zhao, Q. Zhang, T. Gui, Z. Wei, J. Wang, M. Peng, M. Sun, Re-matching:
     A fine-grained semantic matching method for zero-shot relation extraction, in: A. Rogers, J. L.
     Boyd-Graber, N. Okazaki (Eds.), Proceedings of the 61st Annual Meeting of the Association
     for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14,
     2023, Association for Computational Linguistics, 2023, pp. 6680–6691. doi:10.18653/V1/2023.ACL-LONG.369.
[10] B. Lv, X. Liu, S. Dai, N. Liu, F. Yang, P. Luo, Y. Yu, DSP: discriminative soft prompts for zero-shot
     entity and relation extraction, in: A. Rogers, J. L. Boyd-Graber, N. Okazaki (Eds.), Findings of the
     Association for Computational Linguistics: ACL 2023, Toronto, Canada, July 9-14, 2023, Association
     for Computational Linguistics, 2023, pp. 5491–5505. doi:10.18653/V1/2023.FINDINGS-ACL.339.
[11] V.-H. Tran, H. Ouchi, H. Shindo, Y. Matsumoto, T. Watanabe, Enhancing semantic correlation
     between instances and relations for zero-shot relation extraction, Journal of Natural Language
     Processing 30 (2023) 304–329.
[12] T. Ayoola, S. Tyagi, J. Fisher, C. Christodoulopoulos, A. Pierleoni, ReFinED: An efficient
     zero-shot-capable approach to end-to-end entity linking, in: A. Loukina, R. Gangadharaiah, B. Min
     (Eds.), Proceedings of the 2022 Conference of the North American Chapter of the Association for
     Computational Linguistics: Human Language Technologies: Industry Track, NAACL 2022, Hybrid:
     Seattle, Washington, USA + Online, July 10-15, 2022, Association for Computational Linguistics,
     2022, pp. 209–220. doi:10.18653/V1/2022.NAACL-INDUSTRY.24.
[13] D. Zeng, K. Liu, S. Lai, G. Zhou, J. Zhao, Relation classification via convolutional deep neural
     network, in: J. Hajic, J. Tsujii (Eds.), COLING 2014, 25th International Conference on Computational
     Linguistics, Proceedings of the Conference: Technical Papers, August 23-29, 2014, Dublin, Ireland,
     ACL, 2014, pp. 2335–2344. URL: https://aclanthology.org/C14-1220/.
[14] M. Miwa, M. Bansal, End-to-end relation extraction using LSTMs on sequences and tree structures,
     in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL
     2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers, The Association for Computer
     Linguistics, 2016. doi:10.18653/V1/P16-1105.
[15] Z. Zhong, D. Chen, A frustratingly easy approach for entity and relation extraction, in:
     K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tür, I. Beltagy, S. Bethard, R. Cotterell,
     T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North American
     Chapter of the Association for Computational Linguistics: Human Language Technologies,
     NAACL-HLT 2021, Online, June 6-11, 2021, Association for Computational Linguistics, 2021, pp. 50–61.
     doi:10.18653/V1/2021.NAACL-MAIN.5.
[16] S. Wu, Y. He, Enriching pre-trained language model with entity information for relation
     classification, in: W. Zhu, D. Tao, X. Cheng, P. Cui, E. A. Rundensteiner, D. Carmel, Q. He, J. X.
     Yu (Eds.), Proceedings of the 28th ACM International Conference on Information and Knowledge
     Management, CIKM 2019, Beijing, China, November 3-7, 2019, ACM, 2019, pp. 2361–2364.
     doi:10.1145/3357384.3358119.
[17] J. Ni, G. Rossiello, A. Gliozzo, R. Florian, A generative model for relation extraction and
     classification, CoRR abs/2202.13229 (2022). URL: https://arxiv.org/abs/2202.13229. arXiv:2202.13229.
[18] X. Chen, N. Zhang, X. Xie, S. Deng, Y. Yao, C. Tan, F. Huang, L. Si, H. Chen, KnowPrompt:
     Knowledge-aware prompt-tuning with synergistic optimization for relation extraction, in:
     F. Laforest, R. Troncy, E. Simperl, D. Agarwal, A. Gionis, I. Herman, L. Médini (Eds.), WWW ’22: The ACM
     Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022, ACM, 2022, pp. 2778–2788.
     doi:10.1145/3485447.3511998.
[19] C. Wu, L. Chen, UTBER: Utilizing fine-grained entity types to relation extraction with distant
     supervision, in: IEEE International Conference on Smart Data Services, SMDS 2020, Beijing, China,
     October 19-23, 2020, IEEE, 2020, pp. 63–71. doi:10.1109/SMDS49396.2020.00015.
[20] M. Koch, J. Gilmer, S. Soderland, D. S. Weld, Type-aware distantly supervised relation extraction
     with linked arguments, in: A. Moschitti, B. Pang, W. Daelemans (Eds.), Proceedings of the 2014
     Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29,
     2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, ACL, 2014, pp.
     1891–1901. doi:10.3115/V1/D14-1203.
[21] Y. Liu, K. Liu, L. Xu, J. Zhao, Exploring fine-grained entity type constraints for distantly supervised
     relation extraction, in: J. Hajic, J. Tsujii (Eds.), COLING 2014, 25th International Conference on
     Computational Linguistics, Proceedings of the Conference: Technical Papers, August 23-29, 2014,
     Dublin, Ireland, ACL, 2014, pp. 2107–2116. URL: https://aclanthology.org/C14-1199/.
[22] M. Rahimi, M. Surdeanu, Improving zero-shot relation classification via automatically-acquired
     entailment templates, in: B. Can, M. Mozes, S. Cahyawijaya, N. Saphra, N. Kassner, S. Ravfogel,
     A. Ravichander, C. Zhao, I. Augenstein, A. Rogers, K. Cho, E. Grefenstette, L. Voita (Eds.),
     Proceedings of the 8th Workshop on Representation Learning for NLP, RepL4NLP@ACL 2023,
     Toronto, Canada, July 13, 2023, Association for Computational Linguistics, 2023, pp. 187–195.
     doi:10.18653/V1/2023.REPL4NLP-1.16.
[23] O. Sainz, O. L. de Lacalle, G. Labaka, A. Barrena, E. Agirre, Label verbalization and entailment
     for effective zero and few-shot relation extraction, in: M. Moens, X. Huang, L. Specia, S. W. Yih
     (Eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing,
     EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, Association
     for Computational Linguistics, 2021, pp. 1199–1212. doi:10.18653/V1/2021.EMNLP-MAIN.92.
[24] A. Obamuyide, A. Vlachos, Zero-shot relation classification as textual entailment, in: Proceedings
     of the First Workshop on Fact Extraction and VERification (FEVER), 2018, pp. 72–78.