<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>STLD, May</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Data⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matt Murtagh</string-name>
          <email>mmurtagh@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>PJ Wall</string-name>
          <email>wallp2@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Declan O'Sullivan</string-name>
          <email>declan.osullivan@tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Knowledge Graphs, Deep Learning, Large Language Models, Economic Data</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Statistics, Trinity College Dublin</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>26</volume>
      <issue>2024</issue>
      <abstract>
        <p>Accurate collection of economic data is crucial for metrics like the Consumer Price Index (CPI), informing policies on inflation and living costs. Traditional manual data collection methods from retail sources are labor-intensive and fraught with issues of scalability, accuracy, and data diversity. Our study introduces an RDFS-based framework aligned with COICOP, and a transformer model, 'PRICER', to automate the extraction and structuring of online retail data into RDF. By iteratively fine-tuning PRICER, first with a broad DBPedia and Wikipedia knowledge base and then with specific online retail data, we achieve significant efficiency and accuracy improvements in data collection. Notably, PRICER shows marked performance gains in precision and recall after task-specific conditioning, validating our approach for converting unstructured text to structured knowledge. This advancement facilitates streamlined economic data aggregation and highlights PRICER's adaptability for broader standardised data processing applications. Future work will focus on scaling the domain-specific price dataset, refining the model's conditioning, and exploring its potential for other forms of technical data.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Economic Data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Statistical measures, such as the Consumer Price Index (CPI), rely heavily on highly heterogeneous price and consumption data to provide insights into inflation rates and the cost of living, influencing monetary policies and market strategies worldwide. Traditionally, this data is collected manually from a variety of retail sources, a process that is not only time-consuming but also prone to inaccuracies and inconsistencies. The Classification of Individual Consumption by Purpose (COICOP) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], an international standard taxonomy developed by the United Nations, serves as a guideline for categorising such consumer goods and services. However, the application of COICOP in the face of highly heterogeneous and non-interoperable data formats presents significant challenges. In this context, the integration of advanced computational models and ontologies offers a promising avenue for transforming the landscape of economic data collection and analysis. By automating the extraction and structuring of data, we aim to enhance the efficiency, accuracy, and interoperability of these vital economic indicators, setting the stage for more informed decision-making and policy development with data available in almost real-time.
      </p>
      <p>⋆ Repository for this paper is available here.</p>
      <p>In this paper, we propose a simple RDFS schema for the construction and population of price
indices using an existing international standard. We introduce a framework for collecting and
processing economic data from the internet directly into a knowledge graph, structured using
the Resource Description Framework Schema (RDFS). This choice is due to the simplicity of
the schema requirements, consisting of two datatype properties and the eighty classes that
comprise the first module of the COICOP.</p>
      <p>
        We employ Mistral 7B, an open-source 7 billion parameter model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and fine-tune it using
Low Rank Adaptation (LoRA), a parameter-efficient fine-tuning (PEFT) method [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Using a
training dataset of prices collected automatically via a web crawler and manually annotated
with the constraints of the RDFS schema, we test the accuracy of the model and report our
results.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Transforming unstructured text into structured knowledge graphs has long been a significant
area of research, with notable contributions highlighting the potential of Open Information
Extraction and entity recognition techniques [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ]. Despite the established importance
of these methodologies, the application of Large Language Models (LLMs) as agents for the
collection and structuring of economic data represents a nascent and largely unexplored frontier.
Prior investigations into LLMs concerning economic data have predominantly centered around
enhancing the explainability of forecasts [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], deriving sentiment from financial narratives [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
providing investment guidance [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], or simulating macroeconomic agents [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. These
studies, however, focus on processing or interpreting data that is already in a structured or
semi-structured form in natural language, contrasting sharply with our ambition to structure
raw, unstructured data directly into knowledge graphs.
      </p>
      <p>
        The closest comparison to the current research is the Billion Prices Project [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], a now-defunct
project that created an alternative indicator for inflation statistics by scraping online price data
with particular utility towards countries or time periods where existing official statistics may
be unreliable [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. However, the Billion Prices Project still required manual classification,
identification of appropriate retailers and highly specified web scraping scripts in order to
function; most notably, all products and prices had to be manually categorised within the
bounds of the COICOP after collection [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. PRICER iterates on the goal of the Billion Prices
Project by specifying an RDFS for the COICOP and directly classifying unstructured data within
it.
      </p>
      <sec id="sec-3-1">
        <title>2.1. Classification of Individual Consumption by Purpose</title>
        <p>
          The Classification of Individual Consumption by Purpose (COICOP) is a hierarchical
framework developed by the United Nations Statistics Division to standardize the categorization of
goods and services consumed by individuals and households. Its primary aim is to facilitate
international comparisons of consumer behavior and inflation rates by providing a consistent
taxonomy for classifying consumer expenditure. COICOP divides consumer expenditures into
broad categories such as food and beverages, housing, health, education, and transportation,
which are further subdivided into more detailed classifications. This systematic approach
enables researchers, policymakers, and statisticians to aggregate and analyze data in a manner
that reflects the functional aspects of consumer spending, thereby enhancing the accuracy and
comparability of economic indicators such as the Consumer Price Index (CPI). The adoption of
COICOP across national statistical agencies underscores its significance in economic analysis,
offering a robust framework for understanding consumption patterns and guiding economic
policy decisions [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>For the purpose of this paper, we have transposed the first and largest section of the COICOP, covering food, beverages, and tobacco, into a simplified RDFS schema. This schema captures the hierarchical nature of the COICOP via a straightforward subclassing mechanism. We include just two properties in our schema: :hasPrice and :collectionDate. These are designed to demonstrate how a basic RDFS framework can structurally reflect a portion of the COICOP classification in a manageable way for this research context<sup>1</sup>.</p>
        <p><sup>1</sup> This schema is purely demonstrative; a more complete, semantically rich ontology is outside the scope of this paper. We make it available for download here.</p>
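        <p>As a minimal sketch, the schema pattern described above can be written down with RDFlib as follows. The two class names are illustrative stand-ins for the eighty COICOP classes; the namespace matches the http://example.org/coicop# namespace used in Section 3.3.</p>
        <preformat>
from rdflib import Graph, Literal, Namespace, RDF, RDFS, XSD

COICOP = Namespace("http://example.org/coicop#")

g = Graph()
g.bind("coicop", COICOP)

# Hierarchy via rdfs:subClassOf; two stand-in classes from the
# eighty-class food, beverages and tobacco module.
g.add((COICOP.FoodAndNonAlcoholicBeverages, RDF.type, RDFS.Class))
g.add((COICOP.Bread, RDF.type, RDFS.Class))
g.add((COICOP.Bread, RDFS.subClassOf, COICOP.FoodAndNonAlcoholicBeverages))

# The two datatype properties included in the schema.
g.add((COICOP.hasPrice, RDF.type, RDF.Property))
g.add((COICOP.hasPrice, RDFS.range, XSD.decimal))
g.add((COICOP.collectionDate, RDF.type, RDF.Property))
g.add((COICOP.collectionDate, RDFS.range, XSD.date))

# An example instance, as the model would emit it.
item = COICOP["item/1"]
g.add((item, RDF.type, COICOP.Bread))
g.add((item, COICOP.hasPrice, Literal("1.89", datatype=XSD.decimal)))
g.add((item, COICOP.collectionDate, Literal("2024-01-15", datatype=XSD.date)))

print(g.serialize(format="nt"))
        </preformat>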
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Transformers and Large Language Models</title>
        <p>
          NLP models have long used representations of text as vector embeddings to classify and understand paragraphs and documents of text, allowing the semantic meaning of words and subwords to be represented in a high-dimensional space [14, 15, 16]. The introduction of the self-attention mechanism, where an additional embedding value is given to represent the importance of every other word in a sequence to a given word, together with positional encodings that allow for highly parallelised processing, dramatically changed the field [17]. The introduction of transformer models such as BERT, trained on large diverse datasets, brought on the advent of pre-trained models that can be fine-tuned for specific tasks [18]. The high benchmark performance and proliferation of these new models led to a diversity of variants, making slight changes to the model structure for increased memory efficiency [19], larger datasets [20] and performance [21]. The scaling up of data and parameters used in these models has brought on the proliferation of LLMs. The release of GPT-3, a 175 billion parameter model, popularised the LLM not only in academic spheres but also in the public imagination, with strong task-agnostic performance and few-shot learning at scale, and far less necessity to fine-tune on specific tasks [22]. Concurrently, a plethora of open-source LLMs have been developed and released to the public at similar scale to commercial LLMs. Bloom, an open-source collaboration of major research players in the area, developed a 176B parameter model that is open for researchers to download, test, fine-tune and integrate into existing research [23]. Similarly, Falcon LLM produced a competitive openly available 7B parameter model and demonstrated that training a model on properly filtered and deduplicated web-based data alone can be as effective as training on a mix of such data and highly curated training datasets [24], a lesson that this paper utilises in its approach. Meta similarly released LLaMA, with sizes ranging from 7B to 65B, outperforming larger models such as GPT-3 on a number of NLP benchmarks [25]. Mistral 7B, the model used for this paper, is an open-source LLM that outperforms LLaMA models of sizes up to 34B on most tasks, incorporating a sliding window attention mechanism that allows for cheaper training [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Limitations of Large Language Models</title>
        <p>While LLMs are powerful tools and provide an exciting new avenue for research, they also have significant issues. One major flaw is the tendency for LLMs to "hallucinate", a phenomenon where output in response to a prompt is plausible-sounding but nonsensical or factually incorrect [26, 27], rather than admitting a lack of knowledge. These hallucinations are particularly problematic in fields that require highly accurate answers and where fact-checking might be a lengthy, time-limited process. This has led to LLMs being characterised as "Stochastic Parrots", mirroring human responses or generating code in a probable manner rather than reflecting some underlying meaning [28]. LLMs have achieved their scale largely due to the aforementioned large volumes of data produced and gathered on the internet. However, this has led to issues with models reflecting human biases in racist, sexist, extreme or other forms of derogatory or harmful dialogue found online and then incorporated into datasets without sufficient filtering or quality control [28]. Finally, the scale of LLMs requires long-term training with clusters of hundreds of power-intensive GPUs, meaning that training and running these models can come with significant environmental costs [28]. This incentivises the use of smaller, more efficient LLMs. While Mistral 7B, the model fine-tuned as part of this paper, is significantly smaller than many LLMs such as GPT, we acknowledge that the training of such models still incurs an environmental cost. Similarly, while the dataset used to fine-tune the model is extracted from the internet, it is far smaller than the large-scale datasets used for the primary training of these models, and can therefore be filtered more effectively.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Approach</title>
      <sec id="sec-4-1">
        <title>3.1. Model Overview and Structure</title>
        <p>This section introduces the structure and workflow of our model. The core objective is to
transform HTML data from online retailers into a knowledge graph format, adhering to our
COICOP RDFS schema. This includes the identification and extraction of product prices and
classifications. The foundational model, Mistral 7B, undergoes a two-phase fine-tuning process:
initially with DBPedia data to grasp information structuring, followed by domain-specific
adaptation using online retail data. Figure 1 depicts the overall architecture of PRICER. The
model employs a transformer-based approach, leveraging the capabilities of Mistral 7B for
semantic understanding and knowledge extraction. The fine-tuning process is bifurcated into
an initial phase focusing on general knowledge graph formation, and a subsequent phase
targeting the COICOP RDFS for retail data.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Task Specification, Data Preparation and Training</title>
        <p>As discussed, the specific task of PRICER is to generate RDF triples from HTML text scraped from online retailers, within the parameters defined in the COICOP RDFS schema. Specifically, the input data is text, while the output data are RDF triples. To achieve this, the Mistral 7B model is fine-tuned in two phases: task-specific fine-tuning and domain-specific fine-tuning. We outline both of these processes here.</p>
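        <p>For concreteness, the sketch below shows one way such a LoRA set-up could be wired together. The Hugging Face transformers and peft libraries, the checkpoint name, and all hyperparameters are illustrative assumptions on our part; the paper does not specify its tooling or configuration.</p>
        <preformat>
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # public Mistral 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Base weights stay frozen; only the low-rank adapter matrices train.
lora = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of 7B
        </preformat>
        <p>The same adapter set-up would serve both phases, trained first on the Wikipedia-DBPedia pairs and then on the annotated price data.</p>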
        <sec id="sec-4-2-1">
          <title>3.2.1. Phase 1: Task-Specific Fine Tuning</title>
          <p>The initial training phase is designed to prime the model with the capability to transform
unstructured textual information into structured data aligned with knowledge graph schemas.
This foundational training leverages a dataset of 18,000 pairs, each consisting of text extracted
from Wikipedia and the corresponding structured data in the form of DBPedia triples. The
objective is to imbue the model with an intrinsic understanding of how to discern and map
complex information from natural language text to a structured format that mirrors the organised
nature of knowledge graphs. This begins with the acquisition of textual data, for which we
utilise the Wikipedia API’s “random page” feature. This tool enables us to systematically retrieve
the full textual content from a diverse array of Wikipedia pages. The randomness ensures a
wide coverage of topics, which allows the model to learn from a broad spectrum of subjects
and contexts. Parallel to textual data extraction, we engage in the retrieval of structured data
corresponding to each Wikipedia page. This is achieved through the DBPedia SPARQL endpoint,
a query service that allows us to download the triples associated with the Wikipedia pages that
have been extracted. DBPedia, being a structured version of Wikipedia’s content based on the
Wikipedia page’s information box, offers a rich set of triples that represent the information on
the pages in a structured RDF format. For the purpose of this research, we make the assumption
that the information contained in the infoboxes on Wikipedia pages is directly related to the
text content of those pages. To ensure that the collected RDF data is valid, we employ RDFlib,
a Python library designed to work with RDF data. We drop any observations that cannot be
parsed by RDFlib. Upon successful extraction and validation, we proceed to pair each piece of
Wikipedia text with its corresponding set of DBPedia triples. This pairing forms the basis of
our training dataset, where the model learns to associate the unstructured text with structured
knowledge representations. Moreover, we manually conduct a final review to eliminate any
duplicates.</p>
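          <p>A minimal sketch of this collection loop is shown below. It assumes the public Wikipedia Action API and, for brevity, pulls the DBPedia triples by content negotiation on the resource URI rather than through the SPARQL endpoint used in the actual pipeline; error handling and rate limiting are omitted.</p>
          <preformat>
# Fetch one (Wikipedia text, DBpedia triples) training pair and keep it
# only if RDFlib can parse the triples, as described above.
import requests
from rdflib import Graph

WIKI_API = "https://en.wikipedia.org/w/api.php"

def random_article():
    """Return (title, plain text) for one random main-namespace page."""
    r = requests.get(WIKI_API, params={
        "action": "query", "generator": "random", "grnnamespace": 0,
        "prop": "extracts", "explaintext": 1, "format": "json",
    }).json()
    page = next(iter(r["query"]["pages"].values()))
    return page["title"], page.get("extract", "")

def dbpedia_triples(title):
    """Fetch the DBpedia triples for a Wikipedia title as N-Triples."""
    g = Graph()
    g.parse("http://dbpedia.org/resource/" + title.replace(" ", "_"))
    return g.serialize(format="nt")

dataset = []
title, text = random_article()
try:
    dataset.append({"text": text, "triples": dbpedia_triples(title)})
except Exception:
    pass  # drop observations whose RDF cannot be parsed
          </preformat>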
        </sec>
        <sec id="sec-4-2-2">
          <title>3.2.2. Phase 2: Domain-specific Fine-tuning</title>
          <p>The foundation of our domain-specific dataset begins with an automated data collection process.
Utilising Python scripts that integrate Selenium and BeautifulSoup, we devised a methodical
approach to navigate and extract content from web pages of a prominent Irish online retailer.
Selenium, a tool for browser automation, enables our scripts to interact with the web pages
dynamically to access the necessary product information. BeautifulSoup complements this
by parsing the HTML content retrieved by Selenium, allowing for efficient extraction of data
embedded within the web pages. Upon retrieving the HTML content, our focus shifts to the
extraction of relevant product information. Given the diverse structure of web pages, we employ
a filtering process to isolate the data of interest. This process begins with the identification
and extraction of content enclosed within "div" tags, a common HTML element used to group
block-level content on web pages. This step narrows the data to segments more likely to
contain product information.</p>
          <p>Following the initial isolation, we further refine the dataset through a targeted keyword
search. This search is designed to identify and retain content featuring key terms indicative of
retail-specific details, such as "price" and the euro symbol (€). This keyword filtering ensures
that the extracted excerpts are relevant to our objective, focusing on product prices and possible
classifications. With the filtered set of HTML product excerpts in hand, the next phase involves
the manual annotation of this data within the COICOP RDFS framework. Each excerpt is
thoroughly reviewed and labeled with appropriate COICOP RDFS labels. This encompasses
both product types and price details. Table 1 demonstrates an example of extracted text and
the corresponding RDF triples chosen to represent that text as part of the annotation process.
Following this process, we produced a dataset of 1000 observations of gathered data.</p>
          <p>[Table 1: Example of extracted text and the corresponding RDF triples chosen to represent it.]</p>
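          <p>A minimal sketch of this retrieval-and-filtering step is shown below; the retailer URL is a placeholder, and the keyword list is an illustrative assumption.</p>
          <preformat>
from bs4 import BeautifulSoup
from selenium import webdriver

KEYWORDS = ("price", "€")  # retail-specific terms used to filter excerpts

driver = webdriver.Firefox()  # any Selenium-supported browser
driver.get("https://www.example-retailer.ie/groceries")  # placeholder URL
html = driver.page_source     # HTML after dynamic content has loaded
driver.quit()

soup = BeautifulSoup(html, "html.parser")
excerpts = []
for div in soup.find_all("div"):  # isolate block-level content
    text = div.get_text(" ", strip=True)
    if any(k in text.lower() for k in KEYWORDS):
        excerpts.append(str(div))  # keep the raw HTML excerpt for annotation
          </preformat>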
          <p>For both the Wikipedia-DBpedia and prices datasets, consistent with the Mistral 7B format,
each observation is divided into the system prompt, user prompt and response in the format
System Prompt [INST] User Prompt [/INST] Response. "Text to RDF" was provided as the system
prompt common to all observations. The user prompt was the relevant text data, in this case
either the Wikipedia text data or the scraped price data. Finally, the response was either the
DBPedia triples or COICOP triples. These individual text observations were encoded using the
Mistral tokenizer.</p>
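          <p>A minimal sketch of this serialisation step, with invented example strings standing in for a scraped excerpt and its annotated triples:</p>
          <preformat>
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

def format_observation(text, triples):
    # "Text to RDF" is the system prompt common to all observations.
    return f"Text to RDF [INST] {text} [/INST] {triples}"

text_excerpt = "Wholegrain bread 800g now €1.89"         # invented example
target_triples = "(annotated N-Triples for the excerpt)"  # placeholder string
sample = format_observation(text_excerpt, target_triples)
tokens = tokenizer(sample, truncation=True, max_length=2048)
          </preformat>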
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Training Specification</title>
        <p>
          The training and evaluation were performed on a cluster of four NVIDIA RTX A5000 GPUs. We fine-tune the model using Low Rank Adaptation (LoRA), a parameter-efficient fine-tuning (PEFT) method [
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ]. This approach freezes the weights from previous training sessions, enabling the training of large models with reduced computational resources. Training time for the DBPedia sample was 12 hours, while training time for the smaller price dataset was just half an hour. The model was evaluated on the test dataset, which comprised 10 percent of the total scraped data gathered and labelled. Each row of text gathered from the product web page in the test dataset was tokenized and input to the model. The text output, formatted in N-Triples, was parsed with RDFlib for evaluation. This is a complex multi-label classification task: one row of input data results in numerous triple outputs. As such, to evaluate the model, we first provide the raw input text collected from the website for each observation of our test data. The generated triples are then recorded and parsed using RDFlib, and any duplicate triples are dropped. These triples are then compared against those that have been pre-labelled by hand. An observation is considered correct where the http://example.org/coicop#hasPrice relation or the http://www.w3.org/1999/02/22-rdf-syntax-ns#type relation has been correctly estimated. All triples, both those generated by the model and those in the manually labelled input datasets, are in N-Triples format.
        </p>
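        <p>A minimal sketch of this scoring logic with RDFlib is shown below; the set-based matching and per-relation filtering reflect our reading of the procedure described above, not the authors' exact implementation.</p>
        <preformat>
from rdflib import Graph, URIRef

HAS_PRICE = URIRef("http://example.org/coicop#hasPrice")
RDF_TYPE = URIRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

def scored_triples(ntriples_text):
    """Parse N-Triples output and keep only the relations being scored."""
    g = Graph()
    g.parse(data=ntriples_text, format="nt")  # unparseable output fails here
    return {t for t in g if t[1] in (HAS_PRICE, RDF_TYPE)}  # set drops duplicates

def precision_recall(gold_nt, predicted_nt):
    """Compare generated triples against the hand-labelled gold triples."""
    gold = scored_triples(gold_nt)
    pred = scored_triples(predicted_nt)
    correct = len(gold.intersection(pred))
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
        </preformat>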
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>The empirical evaluation of PRICER, as shown in the structured comparative analysis (Table 2),
demonstrates the influence of model conditioning through targeted fine-tuning. We analyse a
number of performance metrics across various model configurations, including precision, recall,
and F1 score. The pre-training with DBPedia and Wikipedia data emerges as a critical factor
in enhancing the model’s proficiency in transforming unstructured retail text into structured
knowledge graphs that align with the COICOP schema.</p>
      <p>The results from the model fine-tuned with 9,000 DBPedia and Wikipedia observations (M+DBP9K+P) indicate a significant peak in precision (0.66) and a robust recall (0.50), suggesting an optimal balance of general and domain-specific knowledge for our task. Notably, precision peaks at 9,000 observations, but a subsequent decline is observed when the pre-training data is further expanded to 18,000 observations (0.49). This trend prompts a critical examination of the relationship between data volume and model performance, particularly the balance between broadening the model's knowledge base and the potential for overfitting or diminishing returns on additional data.</p>
      <p><sup>2</sup> The base model has never seen data categorised in the schema and is therefore not expected to be able to complete this task. It is included here for demonstration.</p>
      <p>The iterative conditioning process, especially with an expansive corpus of 18,000 DBPedia and Wikipedia observations (M+DBP18K+P), marks a significant enhancement in the model's capabilities, outperforming both the base model and configurations fine-tuned with smaller datasets. However, the observed decline in precision from the M+DBP9K+P to the M+DBP18K+P configuration indicates a need for further investigation into data selection and model tuning methodologies.</p>
      <p>While the results indicate an improvement in the model’s ability to generate correctly
associated triples with task-specific tuning, achieving correct interpretations in approximately half
of the tested observations highlights substantial avenues for improvement. This observation
accentuates the necessity for ongoing refinement of the model, aiming to enhance its accuracy
and reliability.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Limitations</title>
      <p>While this work shows promise, it must be acknowledged that this has been achieved on a relatively simple schema with just eighty unique classes and two unique data properties. Additionally, the model is trained on HTML price data from only one online retailer. Thus, the model in its current form will not work as a general-purpose annotator for other forms of data, and will need to be expanded and tested with data from other online retailers to become more source-agnostic. Additionally, the model remains untested with a more semantically rich schema or ontology. We fully acknowledge these as current limitations of this model and intend to expand on these areas in future work.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion and Future Work</title>
      <p>This study presents a pioneering approach to the automation of economic data collection
through the development of an RDFS-based schema, aimed at streamlining the extraction and
structuring of online price and product data. By extending the COICOP taxonomy to an RDFS
schema and training a transformer model on online data categorized within it, this research
suggests an approach to enhance the efficiency and accuracy of data collection processes and
establishes a foundation for the application of such methodologies across various domains
requiring standardized data handling. The employment of Mistral 7B, fine-tuned with LoRA,
showcases the potential of leveraging large language models (LLMs) in the context of economic
data collection, providing a promising avenue toward the automation of complex data processing
tasks.</p>
      <p>Moving forward, our research will pivot around three key areas: significantly expanding our
datasets with a larger array of labeled price data and a richer collection from Wikipedia and
DBpedia to boost the model’s precision and understanding of economic complexities, while
exploring a diverse range of LLMs beyond Mistral 7B. Additionally, we intend to explore
performance on a more semantically rich schema or ontology containing more complex relationships
between objects as well as additional data features such as quantity and base pricing for CPI
calculation. This multifaceted approach aims to refine our model’s capability to accurately
capture the nuances of economic data, thereby enhancing vital indicators like the Consumer
Price Index. By testing various models for their efficacy in integrating and structuring economic
data within our RDFS-based schema, we anticipate identifying models that not only elevate
performance but also exhibit greater adaptability and efficiency in the dynamic field of economic
data analysis and the collection of official statistics.</p>
    </sec>
    <sec id="sec-8">
      <title>7. Acknowledgments</title>
      <p>The authors would like to thank the Science Foundation Ireland ADAPT Centre and the Centre for Research Training in Artificial Intelligence for funding and support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>United</given-names>
            <surname>Nations</surname>
          </string-name>
          <string-name>
            <surname>Statistics</surname>
          </string-name>
          , Classification of Individual Consumption According to Purpose (COICOP)
          <year>2018</year>
          ,
          <year>2018</year>
          . URL: https://unstats.un.org/unsd/classifications/unsdclassifications/ COICOP_
          <year>2018</year>
          _
          <article-title>-_pre-edited_white_cover</article-title>
          _version_-_
          <fpage>2018</fpage>
          -12-26.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          , D. d. l. Casas,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          , G. Lengyel,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          , L. Saulnier, Mistral 7B,
          <source>arXiv preprint arXiv:2310.06825</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wallis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Allen-Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          , Lora:
          <article-title>Low-rank adaptation of large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2106.09685</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <article-title>One-for-All: Generalized LoRA for ParameterEficient Fine-tuning</article-title>
          ,
          <source>arXiv preprint arXiv:2306.07967</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Martinez-Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>López-Arévalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Rios-Alvarado</surname>
          </string-name>
          ,
          <article-title>Openie-based approach for knowledge graph construction from text</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>113</volume>
          (
          <year>2018</year>
          )
          <fpage>339</fpage>
          -
          <lpage>355</lpage>
          . Publisher: Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hofart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Yosef</surname>
          </string-name>
          , I. Bordino,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fürstenau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pinkal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Spaniol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Taneva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thater</surname>
          </string-name>
          , G. Weikum,
          <article-title>Robust disambiguation of named entities in text,</article-title>
          <year>2011</year>
          , pp.
          <fpage>782</fpage>
          -
          <lpage>792</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Moro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raganato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <article-title>Entity linking meets word sense disambiguation: a unified approach, Transactions of the Association for Computational Linguistics 2 (</article-title>
          <year>2014</year>
          )
          <fpage>231</fpage>
          -
          <lpage>244</lpage>
          . Publisher: MIT Press One Rogers Street, Cambridge, MA 02142-1209, USA journals-info ….
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>X.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Temporal Data Meets LLM-Explainable Financial Time Series Forecasting</article-title>
          , arXiv preprint arXiv:
          <volume>2306</volume>
          .11025 (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y. Yang,</surname>
          </string-name>
          <article-title>FinBERT: A large language model for extracting information from financial text</article-title>
          ,
          <source>Contemporary Accounting Research</source>
          <volume>40</volume>
          (
          <year>2023</year>
          )
          <fpage>806</fpage>
          -
          <lpage>841</lpage>
          . Publisher: Wiley Online Library.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          , X.-Y. Liu,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Wang</surname>
          </string-name>
          , Fingpt:
          <article-title>Open-source financial large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2306.06031</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <article-title>Large language model-empowered agents for simulating macroeconomic activities</article-title>
          ,
          <source>arXiv preprint arXiv:2310.10436</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Cavallo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rigobon</surname>
          </string-name>
          ,
          <article-title>The billion prices project: Using online prices for measurement and research</article-title>
          ,
          <source>Journal of Economic Perspectives</source>
          <volume>30</volume>
          (
          <year>2016</year>
          )
          <fpage>151</fpage>
          -
          <lpage>178</lpage>
          . Publisher: American Economic Association 2014 Broadway, Suite 305,
          <string-name>
            <surname>Nashville</surname>
          </string-name>
          , TN
          <volume>37203</volume>
          -2418.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Cavallo</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Bertolotto, Filling the Gap in Argentina's Inflation Data</article-title>
          , Available at SSRN 2782104 (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>