=Paper=
{{Paper
|id=Vol-3878/78_main_long
|storemode=property
|title=Benchmarking the Semantics of Taste: Towards the Automatic Extraction of Gustatory Language
|pdfUrl=https://ceur-ws.org/Vol-3878/78_main_long.pdf
|volume=Vol-3878
|authors=Teresa Paccosi,Sara Tonelli
|dblpUrl=https://dblp.org/rec/conf/clic-it/PaccosiT24
}}
==Benchmarking the Semantics of Taste: Towards the Automatic Extraction of Gustatory Language==
Teresa Paccosi (1,2,3), Sara Tonelli (1)
1 Fondazione Bruno Kessler, Via Sommarive, 18, Trento
2 Università degli studi di Trento, Via Calepina, 14, Rovereto
3 DHLab / KNAW Humanities Cluster, Oudezijds Achterburgwal 185, 1012 DK Amsterdam, The Netherlands
Abstract
In this paper, we present a benchmark containing texts manually annotated with gustatory semantic information. We employ
a FrameNet-like approach previously tested to address olfactory language, which we adapt to capture gustatory events. We
then propose an exploration of the data in the benchmark to show the possible insights brought by this type of approach,
addressing the investigation of emotional valence in text genres. Finally, we present a supervised system trained with the
taste benchmark for the extraction of gustatory information from historical and contemporary texts.
Keywords
Sensory semantics, gustatory language, information extraction, digital humanities
CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
Contact: tpaccosi@fbk.eu; teresa.paccosi@unitn.it (T. Paccosi); satonelli@fbk.eu (S. Tonelli)
ORCID: 0009-0009-2348-7556 (T. Paccosi); 0000-0001-8010-6689 (S. Tonelli)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
Despite the central role of nutrition in our lives, taste has often been classified as an inferior sense in the Western philosophical tradition. This downplayed role is reflected in the vocabulary used to describe the gustatory experience, which, together with smell, is characterized by a scarcity of domain-specific terms [1]. The difficulty in capturing the semantics of taste could help explain why there are few works in the fields of Natural Language Processing (NLP) and Digital Humanities (DH) that deal with this sense and, in particular, the language used to describe its experience. While there has been renewed interest in the automatic extraction of nutrients and ingredients from texts for health and medicinal purposes [2], less attention has been devoted to the development of tools and models focused on capturing the semantics of sensory experiences, especially in a diachronic fashion.
In this paper, we present an English benchmark for the study of gustatory language and a supervised system for the automatic extraction of taste-related events in English, which we trained using this benchmark. The benchmark was built to be a counterpart to the olfactory one presented in [3], with the idea of making the study of the language of these two senses comparable. The system is designed as a means to study the language used to describe the experience of tasting from both synchronic and diachronic perspectives. The selected formal representation for the semantics of taste is based on Frame Semantics [4], and the system is trained to identify the lexical units and the possible semantic roles contributing to the construction of a gustatory event. We present the results of the experiments and an exploration of the benchmark data, aiming to demonstrate the potential of frame-based analysis for sensory studies.
2. Related Work
In recent years, there has been a growing interest within the NLP community in developing resources designed to capture the sensory content of language [5]. In particular, in the framework of the three-year European project “Odeuropa” (https://odeuropa.eu/), aimed at preserving intangible cultural heritage, several works have focused on analyzing smell descriptions [6] and extracting olfactory information from texts. For instance, [3] created a manually annotated benchmark with smell events, which has been subsequently used to train a system for olfactory information extraction [7, 8]. The benchmark focuses on the language used to describe olfactory experiences and covers a period of four centuries (1600-1900), making it useful for historical research. An extension in this direction is SENSE-LM, a system for extracting sensory information from texts, which shows that combining language models with lexical resource-based approaches yields better results in extracting sensory references from texts compared to systems that do not integrate these two components [9]. The authors were the first to combine sensorimotor representations with the textual features of language models for the task of sensory information extraction in text documents. Even though they propose the system for all five senses, they only tested it on olfactory
and auditory language, using respectively the benchmark of [3] and an artificial dataset they generated with GPT-4 [10]. Most existing work on food representation in the field of NLP focuses on health-related applications. A notable work with a linguistic focus is [2], where the authors concentrate on identifying noun-compound headnouns for developing conversational agents in the e-commerce domain. They propose a supervised approach based on a neural sequence-to-sequence model to identify the most informative token in Italian food compound nouns, obtaining promising results despite the complexity of the task. Taste has also been addressed from a diachronic point of view in [11], in which the author reconstructs the evolution of food language, focusing on the history of some dishes and ingredients across continents using computational linguistic tools. Several studies have developed named-entity recognition (NER) models to automatically extract food entities for medicinal purposes and food science applications [12, 13], creating domain-specific corpora by sourcing data from culinary websites and online recipe books [14, 15].
3. Benchmark for Taste
The training data we use for the models in this paper is a benchmark created according to the annotation guidelines presented in [16]. The formalization adopted to annotate the benchmark is inspired by Frame Semantics [4] and its implementation through the FrameNet annotation project [17]. In FrameNet, events and situations are constructed as frames, structures that represent the knowledge necessary to understand the meaning of words. Frames include two main components, namely lexical units, domain-specific words or expressions that trigger the frame, and frame elements, domain-specific semantic roles usually attached as dependents to the lexical unit. In our case, taste events are captured through a so-called Gustatory frame, which is triggered in a document by Taste_Words (i.e., domain-specific lexical units). Each lexical unit is annotated in the benchmark together with the frame elements associated with it, which the taste extraction system should then identify automatically. For instance, in the sentence “[Slimy milk]_Taste_Source has an [unpleasant]_Quality taste”, the system has to identify the Taste_Word (‘taste’), and then the possible frame elements (in this case, Taste_Source and Quality). A list of the possible frame elements and their definitions is provided in Table 1.
Frame Element Definition
Taste_Source The food items that are ingested
Quality Any property used to describe the taste (usually adjectives)
Taste_Carrier Anything that can contain the taste source
Taster The person/animal who ingests the food
Evoked_Taste The taste that is evoked but is not present (e.g., it tastes like onions)
Location The place in which the food is tasted
Taste_Modifier An ingredient that can modify the perception of the taste of a taste source
Circumstances The condition or circumstance in which the taste event occurs
Effect Any effect provoked by the tasting experience
Table 1: List of Gustatory Frame Elements
The documents annotated in the benchmark cover 5 different domains or genres, almost evenly distributed, with 3 or 4 documents per century in every domain, for a total of 72 documents. The genres are: Literature, Science & Philosophy, Household & Recipes, Travel & Ethnography, and Medicine & Botany. To select the documents, we automatically searched for texts presenting a greater density of lexical units (taste words; the list is provided in Appendix A) across several English corpora and taste-related websites. The corpora from which we extracted the documents we annotated are: (1) Early English Books Online (EEBO, https://textcreationpartnership.org/tcp-texts/eebo-tcp-early-english-books-online/), a collection of documents published between 1475 and 1700 covering different domains such as literature, philosophy, politics, religion, geography, history, and mathematics; (2) Project Gutenberg (https://www.gutenberg.org/), a digitized archive of cultural works, containing different repositories, mainly in the literary domain; (3) medievalcookery.com (https://www.medievalcookery.com/etexts.html?England), a list of texts freely available online relating to medieval food and ancient cooking recipes; (4) foodsofengland.co.uk (http://www.foodsofengland.co.uk/references.htm), an online library which holds the complete texts of several cook books from 1390 to 1974; (5) Wikisource (https://en.wikisource.org/wiki/Main_Page), an online digital library of free-content textual sources managed by the Wikimedia Foundation; (6) British Library (https://data.bl.uk/digbks/), a collection of 65,227 digitised volumes from the 16th to the 19th century; (7) London Pulse
Medical Reports (https://wellcomelibrary.org/moh/about-the-reports/about-the-medical-officer-of-health-reports/), a collection of 5800 Medical Officer of Health reports from the Greater London area from 1848 to 1972.
In Table 2 we report the statistics of the annotated benchmark (note that in [16] we presented only a preliminary version of the benchmark containing around 1,400 Taste_Words). The most frequent frame element is the Taste_Source, followed by Quality and Taste_Modifier, which represent the core frame elements, while the rest of the frame elements are much sparser. Although the distribution of the frame elements is not balanced, the system is trained to extract the taste words and all 9 frame elements. Two expert linguists, trained on the guidelines in [16], annotated three documents from 1670, 1720, and 1920 to assess Inter-Annotator Agreement (IAA). The Krippendorff’s alpha score [18] at span level was 0.70, indicating moderate agreement.
Frame Elements (FEs) 1500 1600 1700 1800 1900 Overall
Taste_Words 440 2417 500 1498 803 5,648
Taste_Source 372 1627 375 1081 599 4,393
Quality 197 1495 255 881 489 1,732
Taste_Modifier 135 142 66 154 78 1,357
Taster 65 173 85 185 100 638
Evoked_Taste 20 127 31 53 16 247
Location 11 44 12 24 16 116
Taste_Carrier 9 38 9 26 12 98
Circumstances 19 206 38 228 82 656
Effect 24 56 32 34 31 174
Table 2: Statistics of the Taste Benchmark
4. Exploration of olfactory and gustatory benchmarks
It has been observed that words used to describe olfactory and gustatory experiences tend to appear more frequently in emotionally charged contexts and carry a stronger evaluative content compared to words related to other senses [19]. By ‘evaluative content’, we refer in this paper to the concept of ‘emotional valence’, which is defined as “the pleasantness of a word in terms of positive and negative meaning” ([1], p. 201). We therefore conducted an exploration of the gustatory benchmark to investigate the positive and negative connotations of gustatory events across different text genres. We performed the same analysis for olfactory events, using the olfactory benchmark of [3], in order to compare the outcome for the two senses. To perform this analysis, we first divided Taste_Words and Smell_Words into positive and negative. For this purpose, we used the categories proposed in the Historical Thesaurus of English: Savouriness and Unsavouriness for Taste, and Fragrant/Fragrance and Stench for Smell (in the categories at https://ht.ac.uk/category/: The world > physical sensation > Taste/Flavour > Savouriness & Unsavouriness; The world > physical sensation > Smell/Odour > Fragrant/Fragrance & Stench). This thesaurus contains almost every recorded word in English from medieval times to the present day, ordered into detailed hierarchies of meaning. In the Thesaurus, every category of the hierarchy is divided per part of speech (PoS). For our analysis, we manually selected all the nouns, adjectives and adverbs used in the period we cover with our documents, namely from the 16th to the 20th century. We then assigned the words labeled as Taste_Words and Smell_Words in the documents to one of the two categories (positive or negative) and calculated the normalized frequency of each category across different text genres. As reported in Section 3, the genres represented in the gustatory benchmark are: Literature, Science & Philosophy, Household & Recipes, Travel & Ethnography, Medicine & Botany. In the olfactory benchmark presented in [3], there are instead 10 different genres: Household & Recipes, Law & Regulations, Literature, Medicine & Botany, Perfumes & Fashion, Public health, Religion, Science & Philosophy, Theatre, Travel & Ethnography.
We display the output of this analysis in Fig. 1 (for taste words) and Fig. 2 (for smell words), which show which emotional valence prevails in each genre for the two senses. We observe that two genres exhibit opposite tendencies: medicine/botany shows a more negative orientation in the smell benchmark and a more positive one in the taste benchmark, whereas travel/ethnography is more positive concerning smell and more negative for taste (see Fig. 1 and Fig. 2, where light blue refers to negative valences and dark blue to positive ones). We then analyzed the most frequent smell / taste sources in the two selected genres to motivate why they exhibit
such a difference in emotional valence. We notice that smell sources in medicine/botany tend to be common to hospital and disease-related domains, with words such as ‘urine’ and ‘fetid bronchitis’, while taste sources more often belong to the realm of common food, with words such as ‘almonds’ and ‘apples’. As for travel/ethnography, among the most frequently described taste sources there are exotic and rare foods such as ‘coconut’ and ‘plantain’, which were likely unpleasant to the palates of foreign travelers. Smell sources tend to refer instead to plants, like ‘flowers’ or ‘roots’, hence usually pleasant or neutral to the noses of the writers. This analysis of the distribution of categories and sources in the genres underlines the importance of a frame-based analysis for understanding and comparing sensory descriptions, in particular their emotional valence.
Figure 1: Savoury (dark blue) and Unsavoury (light blue) frequencies of taste words in genres
Figure 2: Fragrant/Fragrance (dark blue) and Stench (light blue) frequencies of smell words in genres
5. System for Gustatory Information Extraction
The benchmark introduced in the previous sections is used to train a classifier whose goal is to detect gustatory information in English texts. The system is based on multi-task learning (Section 5.1), and is then compared with a “single task” classifier, which we consider our baseline (Section 5.2).
5.1. Multitask configuration
To build our system for gustatory information extraction, we adopted a multitask learning approach [20, 21], a configuration successfully tested for olfactory information extraction in [7, 8]. This approach treats the classification of lexical units and each frame element as different tasks. Additionally, we explored a “single task” classification approach, where both lexical units and frame elements are classified within a multiclass token classification task. The results of these experiments served as a baseline for evaluating the effectiveness of the multitask approach. In both configurations, we employed a transformer-based model fine-tuned for a token classification task [22]. This methodology has proved effective across various NLP tasks, including olfactory information extraction [8] and the extraction of food-related ingredients [13]. We experiment with the two configurations using monolingual (English) and multilingual versions of BERT and RoBERTa and an English historical model, MacBERTh. The models we use are listed below:
- English BERT: bert-base-cased (https://huggingface.co/google-bert/bert-base-cased) [23]
- Multilingual BERT (mBERT): bert-base-multilingual-cased (https://huggingface.co/google-bert/bert-base-multilingual-cased) [23]
- English historical model: MacBERTh (https://huggingface.co/emanjavacas/MacBERTh) [24]
- English RoBERTa: roberta-base (https://huggingface.co/FacebookAI/roberta-base) [25]
- Multilingual RoBERTa (RoBERTa xlm): xlm-roberta-large (https://huggingface.co/FacebookAI/xlm-roberta-base) [26]
We fine-tuned each model using the same data, maintaining identical training, validation, and test splits, and evaluated them using 5-fold cross-validation. Each fold contained 80% of the lexical units and their related frame elements for training, 10% for validation (dev), and 10% for testing. These splits were consistent across all configurations and not entirely random. This configuration ensured a balanced distribution of frame elements and comparability in every run. For labeling the data, we adopted the IOB (Inside-Outside-Beginning) labeling format, as used in [7, 8]. This method facilitates a comprehensive analysis of sentences and lexical expressions by
labeling each token with either Inside, Outside, or Beginning labels as appropriate. To fine-tune the models, we used MaChAmp [27], a specialized toolkit designed for multi-task fine-tuning scenarios. In this approach, each label classification is treated as a distinct task. This setup ensures that simpler tasks, such as recognizing lexical units, contribute as auxiliary tasks to more complex label classifications like “Circumstances” or “Effect”, which include entire sentences rather than individual words. MaChAmp enables the choice of different parameters, such as loss weight, epochs and batch size, and we tested different configurations (loss weights with different combinations over the labels [1, 0.75], epochs [10, 20, 30], and batch sizes [16, 32]). The results in Table 3 for the multitask approach share the configuration which yielded the best results. The configuration is the same for all the models and is reported in Appendix B.
Model Config. T_Word T_Source Quality Circum. Effect Evoked_T Loc. T_Carr. T_Modif. Taster
BERT multitask 0.917 0.537 0.780 0.413 0.196 0.457 0.379 0.111 0.781 0.518
BERT single 0.903 0.530 0.712 0.308 0.019 0.254 0.206 0.0 0.681 0.434
mBERT multitask 0.919 0.554 0.784 0.402 0.180 0.466 0.357 0.087 0.763 0.511
mBERT single 0.910 0.557 0.740 0.284 0.0 0.304 0.162 0.0 0.694 0.434
MacBERTh multitask 0.943 0.580 0.799 0.444 0.285 0.501 0.338 0.093 0.783 0.512
MacBERTh single 0.909 0.548 0.720 0.366 0.021 0.226 0.242 0.0 0.688 0.455
RoBERTa multitask 0.913 0.558 0.786 0.414 0.219 0.473 0.406 0.094 0.772 0.508
RoBERTa single 0.891 0.553 0.726 0.343 0.0 0.33 0.228 0.0 0.726 0.5
RoB.-xlm multitask 0.932 0.587 0.817 0.452 0.279 0.497 0.416 0.105 0.784 0.563
RoB.-xlm single 0.903 0.601 0.777 0.4 0.021 0.409 0.25 0.0 0.743 0.539
Table 3: Results (F1) of the classifiers on the lexical unit (T_Word) and 9 frame elements with single-task and multitask configurations (italics in the original layout marked the single-task rows, labeled “single” here). The results are the average of the F1 results of each label across the 5 folds.
5.2. “Single Task” configuration Baseline
Similar to the system for smell information extraction presented in [8], we designed our baseline approach as a single-task multiclass classification, where the model assigns one of 21 possible labels to each token. These labels include 20 representing either “begin” or “inside” of each lexical unit and frame element, and 1 label representing “outside”. As we did for the multitask approach, each model is fine-tuned with a token classification head on top (https://huggingface.co/docs/transformers/tasks/token_classification). During the training of each model, a hyperparameter search was conducted on the first fold of our data. The search space included learning rates [1e-5, 2e-5, 3e-5, 4e-5, 5e-5], batch sizes [8, 16, 32], and training epochs up to 20, with warmup applied for 10% of the training steps. After determining the optimal hyperparameters for each model, it is fine-tuned five times, each time with a different data fold, and the average scores were computed. We present the results for the single-task approach of each model in the rows labeled “single” in Table 3. We observe high performance variations across different frame elements, with the best results obtained for “Quality” and “Taste_Modifier”. This is probably due to the fact that their syntactic realization tends to be consistent in the different documents, with “Quality” mainly expressed by adjectives and “Taste_Modifier” by prepositional phrases introduced by “with”. On the contrary, classification results for “Taste_Source” are quite low despite it being the most frequent FE in the training set, probably because it can be expressed by many different role fillers and syntactic constructions. Upon reviewing the test and prediction results, we find that most mistakes concerning Taste_Source are due to a wrong span extent: for instance, the system predicts “the taste of [lollipop]” while the gold standard is “the taste [of lollipop]”. This issue is also likely reflected in the inter-annotator agreement (IAA) of the benchmark. In the future, we will consider alternative ways to evaluate text spans besides exact match, for instance by computing the cosine similarity between gold instances and system predictions.
Overall, MacBERTh is the best model for Taste_Word detection, but the different FEs are mostly detected with higher F1 using RoBERTa xlm. For this reason, we plan to adopt this model for our future research on gustatory language.
6. Conclusions and Future Direction
In this paper, we presented a benchmark for gustatory events containing manually annotated taste-related information, built as a counterpart to the one proposed in [3]. The benchmark is constructed with the same approach, adopting a frame-based methodological framework to
analyze sensory language. We emphasized the importance of frame-based analysis to capture sensory events by exploring the characterization of positive and negative valence in the benchmarks through the analysis of taste and smell words and sources. The analysis based on frames seems to bring relevant insights into capturing sensory valence from different perspectives, likely supporting the suitability of this approach to deal with humanistic inquiries. We then presented a supervised system to automatically extract taste-related frames, trained on this benchmark. This preliminary exploration and the results obtained with our experiments seem promising for future exploration with automatically extracted data. Indeed, the limited data of the benchmark are not enough to draw relevant conclusions, and for this reason we plan to use our system to extract more data and conduct large-scale analyses of the evolution of sensory information over time. The limited number of documents is likely a contributing factor to the significant discrepancies in accuracy among the different frame elements, and more instances are needed to enable good generalization. Future steps should involve increasing the number of documents and providing less sparse annotations, aiming for better temporal balance. The focus should be on annotating frame elements with lower scores and fewer instances in the benchmark, such as Taste_Carrier and Location. Additionally, alternative metrics and techniques should be employed to capture and explain performance variations across different models. As a further comparison, we also plan to assess the performance of general-purpose frame semantic parsers like LOME [28] on our benchmark.
7. Acknowledgments
Funded by the European Union under grant agreement 101088548 - TRIFECTA. Views and opinions expressed are however those of the author only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. The authors would also like to thank Marieke Van Erp, the head of the project, for her support.
References
[1] B. Winter, Sensory linguistics: Language, perception and metaphor, volume 20, John Benjamins Publishing Company, 2019.
[2] B. Magnini, V. Balaraman, S. Magnolini, M. Guerini, F. B. Kessler, T. Povo, What’s in a food name: Knowledge induction from gazetteers of food main ingredient, in: Proceedings of CLiC-it 2018, 2018, p. 241.
[3] S. Menini, T. Paccosi, S. Tonelli, M. Van Erp, I. Leemans, P. Lisena, R. Troncy, W. Tullett, A. Hürriyetoğlu, G. Dijkstra, et al., A multilingual benchmark to capture olfactory situations over time, in: Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change, 2022, pp. 1–10.
[4] C. J. Fillmore, Frame semantics and the nature of language, Annals of the New York Academy of Sciences 280 (1976) 20–32.
[5] S. S. Tekiroğlu, G. Özbal, C. Strapparava, A computational approach to generate a sensorial lexicon, in: Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex), Association for Computational Linguistics and Dublin City University, Dublin, Ireland, 2014, pp. 114–125. URL: https://aclanthology.org/W14-4716. doi:10.3115/v1/W14-4716.
[6] R. Brate, P. Groth, M. van Erp, Towards olfactory information extraction from text: A case study on detecting smell experiences in novels, in: Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, International Committee on Computational Linguistics, Online, 2020, pp. 147–155. URL: https://aclanthology.org/2020.latechclfl-1.18.
[7] S. Menini, T. Paccosi, S. S. Tekiroğlu, S. Tonelli, Scent mining: Extracting olfactory events, smell sources and qualities, in: Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, 2023, pp. 135–140.
[8] S. Menini, Semantic frame extraction in multilingual olfactory events, in: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024, pp. 14622–14627.
[9] C. Boscher, C. Largeron, V. Eglin, E. Egyed-Zsigmond, Sense-lm: A synergy between a language model and sensorimotor representations for auditory and olfactory information extraction, in: Findings of the Association for Computational Linguistics: EACL 2024, 2024, pp. 1695–1711.
[10] OpenAI, GPT-4 technical report, arXiv preprint arXiv:2303.08774 (2023).
[11] D. Jurafsky, The language of food: a linguist reads the menu, first ed., W. W. Norton & Company, New York, 2014.
[12] G. Cenikj, G. Popovski, R. Stojanov, B. K. Seljak, T. Eftimov, Butter: Bidirectional lstm for food named-entity recognition, 2020.
[13] R. Stojanov, G. Popovski, G. Cenikj, B. Koroušić Seljak, T. Eftimov, A fine-tuned bidirectional encoder representations from transformers model for food named-entity recognition: Algorithm development and validation, Journal of Medical Internet Research 23 (2021) e28229.
[14] G. Popovski, B. K. Seljak, T. Eftimov, Foodbase corpus: a new resource of annotated food entities, Database 2019 (2019) baz121.
[15] A. Wróblewska, A. Kaliska, M. Pawłowski, D. Wiśniewski, W. Sosnowski, A. Ławrynowicz, Tasteset–recipe dataset and food entities recognition benchmark, arXiv preprint arXiv:2204.07775 (2022).
[16] T. Paccosi, S. Tonelli, A new annotation scheme for the semantics of taste, in: Proceedings of the 20th Joint ACL-ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024, 2024, pp. 39–46.
[17] J. Ruppenhofer, M. Ellsworth, M. Schwarzer-Petruck, C. R. Johnson, J. Scheffczyk, FrameNet II: Extended theory and practice, Technical Report, International Computer Science Institute, 2016.
[18] K. Krippendorff, Computing Krippendorff’s alpha-reliability, 2011.
[19] B. Winter, Taste and smell words form an affectively loaded and emotionally flexible part of the English lexicon, Language, Cognition and Neuroscience 31 (2016) 975–988.
[20] R. Caruana, Multitask learning: A knowledge-based source of inductive bias, in: Proceedings of the Tenth International Conference on Machine Learning, Citeseer, 1993, pp. 41–48.
[21] R. Caruana, Multitask learning, Machine Learning 28 (1997) 41–75.
[22] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017).
[23] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.
[24] E. Manjavacas Arévalo, L. Fonteyn, MacBERTh: Development and evaluation of a historically pre-trained language model for English (1450-1950), in: Proceedings of the Workshop on Natural Language Processing for Digital Humanities (NLP4DH), Association for Computational Linguistics, 2021, pp. 23–36. URL: https://aclanthology.org/2021.nlp4dh-1.4.pdf.
[25] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
[26] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.
[27] R. Van Der Goot, A. Üstün, A. Ramponi, I. Sharaf, B. Plank, Massive choice, ample tasks (MaChAmp): A toolkit for multi-task learning in NLP, arXiv preprint arXiv:2005.14672 (2020).
[28] P. Xia, G. Qin, S. Vashishtha, Y. Chen, T. Chen, C. May, C. Harman, K. Rawlins, A. S. White, B. Van Durme, LOME: Large ontology multilingual extraction, in: D. Gkatzia, D. Seddah (Eds.), Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Online, 2021, pp. 149–159. URL: https://aclanthology.org/2021.eacl-demos.19. doi:10.18653/v1/2021.eacl-demos.19.
Part of Speech Lexical Units
Nouns Acidity, aftertaste, aroma, bitterness, dainty, delicacy, disgust, distaste, flavor, flavour, flavorful, flavour-
ful, flavoring, flavouring, flavorsome, flavoursome, flavorous, flavourous, gustation, insipidity, mistaste,
over-eating, palatableness, piquancy, pungency, rancidity, relish, rellish (obsolete), saltness, sapid-
ity, sapor, savor, savoriness, savour, sharpness, smack, smatch, sourness, sowreness (archaic form of
sourness), sweetness, tang, tarage, tartness, tast (obsolete), taste, tastelessness, tasting, unsavoriness,
unsavouriness
Adjectives Acid, acidic, appetizing, appetizing, bitter, bitter-sweet, bland, dainty, delectable, delicious, delight-
som(e), disgusting, flavorless, flavorful, flavourful, flavourless, flavoursome, gamy, indigestible, insipid,
juicy, mellow, palatable, piquant, pungent, racy, rancid, rank, salt/salty, sapid, savory, savoury, savourly,
seasoned, sharp, sour, soured, sower (archaic form of sour), spicy, stale, sweet, tangy, tart, tasteless,
tasty, toothsome, unpalatable, unsavor, unsavour, unsavoury, unsavory, unseasoned, unsweet, unsweet-
ened, wearish, wersh, yummy
Verbs Drink (up), drinking (up), drank (up), drunk (up), eat (up), ate (up), eateth (archaic), eaten (up),
eating (up), distaste, distasting, distasted, mistaste, mistasted, mistasting, partake, partaking, partook,
partaken, relish, relisheth (archaic), relishing, relished, season, seasoning, seasoned, smack, smacking,
smacked, smatch (obsolete), sweeten, sweetening, sweetened, taste, tasting, tasted
Adverbs Sweetly, sourly, tastefully, bitterly, tastingly, unsavourily, unsavourly, insipidly, savourously, savourily,
flavourfully
Table 4
Lexical units for Taste
Hyperparameter Value
β1, β2 0.9, 0.99
Dropout 0.2
Epochs 20
Batch Size 32
Learning Rate (LR) 0.0001
Decay Factor 0.38
Cut Fraction 0.3
All tasks loss weight 1
Table 5: Hyperparameter values used for the experiments that yielded the best results
Appendices
A. Lexical Units and Frame Elements
In Table 4, we display the list of lexical units or taste
words presented in [16].
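As a rough illustration of how a lexical-unit list such as the one in Table 4 could support the density-based document selection mentioned in Section 3, the following Python sketch counts taste-word matches per 1,000 tokens and ranks documents accordingly. The abbreviated word set, the regular-expression tokenizer, and the function name taste_density are assumptions made for this example; the paper does not describe the selection procedure at this level of detail.

```python
import re

# Abbreviated, illustrative subset of the lexical units in Table 4
# (assumption: the real list is longer and is matched more carefully).
TASTE_WORDS = {"taste", "tasted", "flavour", "flavor", "bitter",
               "sweet", "sour", "savoury", "relish"}

def taste_density(text: str, lexicon: set[str] = TASTE_WORDS) -> float:
    """Return the number of lexicon matches per 1,000 tokens of `text`."""
    # Lowercased alphabetic tokens, keeping hyphenated forms such as "bitter-sweet".
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in lexicon)
    return 1000.0 * hits / len(tokens)

# Hypothetical toy documents, ranked from highest to lowest density.
docs = {
    "recipe": "Season the broth until it has a sweet and sour taste.",
    "letter": "We arrived in London on Tuesday and met the mayor.",
}
ranked = sorted(docs, key=lambda name: taste_density(docs[name]), reverse=True)
```

Normalizing by document length keeps long texts from dominating the ranking simply because they contain more words overall.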
B. Hyperparameter Values
The hyperparameter settings for all our models are presented in Table 5. They are MaChAmp’s default hyperparameter values, with the loss weights set to 1 and 20 epochs of training.
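To make the label space that these models are trained on more concrete, the sketch below builds the 21-label IOB inventory of the single-task baseline (Section 5.2) and tags the example sentence from Section 3. The helper name spans_to_iob, the whitespace tokenization, and the token-index span format are assumptions for illustration only; in the multitask setup, each category would instead be predicted as its own task.

```python
# Lexical unit plus the nine frame elements listed in Table 1.
CATEGORIES = [
    "Taste_Word", "Taste_Source", "Quality", "Taste_Carrier", "Taster",
    "Evoked_Taste", "Location", "Taste_Modifier", "Circumstances", "Effect",
]

# One B- and one I- label per category, plus a single O label: 2 * 10 + 1 = 21.
LABELS = ["O"] + [f"{prefix}-{cat}" for cat in CATEGORIES for prefix in ("B", "I")]

def spans_to_iob(tokens, spans):
    """Convert (start, end_exclusive, category) token spans into IOB tags.

    Assumes the spans do not overlap, since a multiclass token classifier
    assigns exactly one label to each token.
    """
    tags = ["O"] * len(tokens)
    for start, end, cat in spans:
        tags[start] = f"B-{cat}"
        for i in range(start + 1, end):
            tags[i] = f"I-{cat}"
    return tags

# Example sentence from Section 3, split on whitespace (an assumption).
tokens = "Slimy milk has an unpleasant taste".split()
spans = [(0, 2, "Taste_Source"), (4, 5, "Quality"), (5, 6, "Taste_Word")]
tags = spans_to_iob(tokens, spans)
# tags == ['B-Taste_Source', 'I-Taste_Source', 'O', 'O', 'B-Quality', 'B-Taste_Word']
```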