<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Using ChatGPT to Refine Draft Conceptual Schemata in Supply-Driven Design of Multidimensional Cubes</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Stefano</forename><surname>Rizzi</surname></persName>
							<affiliation key="aff0">
<orgName type="institution">DISI - University of Bologna</orgName>
								<address>
									<addrLine>Viale Risorgimento, 2</addrLine>
									<postCode>40136</postCode>
									<settlement>Bologna</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Using ChatGPT to Refine Draft Conceptual Schemata in Supply-Driven Design of Multidimensional Cubes</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">C8B42C03C9D8561158C532859164836E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:08+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Conceptual design</term>
					<term>Multidimensional model</term>
					<term>Large Language Models</term>
					<term>ChatGPT</term>
					<term>Refinement</term>
					<term>Supply-driven design</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Refinement is a critical step in supply-driven conceptual design of multidimensional cubes because it can hardly be automated. In fact, it relies on the end-users' requirements on the one hand, and on the semantics of measures, dimensions, and attributes on the other. As a consequence, it is normally carried out manually by designers in close collaboration with end-users. The goal of this work is to check whether LLMs can act as facilitators for the refinement task, so as to let it be carried out entirely, or mostly, by end-users. The Dimensional Fact Model is the target formalism for our study; as a representative LLM, we use ChatGPT's model GPT-4o. To achieve our goal, we formulate two research questions aimed at understanding the basic competences of ChatGPT in refinement and investigating if they can be improved via prompt engineering. The results of our experiments show that, indeed, careful prompt engineering can significantly improve the accuracy of refinement, and that the residual errors can quickly be fixed via one additional prompt. However, we conclude that, at present, some involvement of designers in refinement is still necessary to ensure the validity of the refined schemata.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Conceptual design is a key step in the development of data warehouse (DW) systems and multidimensional databases, since it determines their information content and, ultimately, the set of queries they can answer. The goal is to create an implementation-independent representation of one or more cubes structured according to the multidimensional model, i.e., described in terms of measures, dimensions, and attribute hierarchies. A lot of research has been done over the last couple of decades on conceptual design of cubes, mainly distinguishing between supply-driven approaches, where the conceptual schema is determined starting from the schema of a source database, and demand-driven approaches, where it is created based on the end-users' requirements.</p><p>An advantage of supply-driven design over demand-driven design is that a draft conceptual schema can be obtained from the source schema in an automatic fashion, by applying an algorithm that essentially chases the functional dependencies coded in the source schema and uses them to arrange hierarchies <ref type="bibr" target="#b0">[1]</ref>. Although this significantly speeds up design, the draft schema must then be refined in the light of the end-users' requirements. Refinement mainly implies the following activities <ref type="bibr" target="#b1">[2]</ref>:</p><p>• Removing attributes that are deemed not interesting for analyses.</p><p>• Finding descriptive attributes, i.e., attributes that should not be used for aggregation while being useful for analyses (e.g., the name of a customer).</p><p>• Discretizing attributes with dense domains to make them usable for aggregation (e.g., the weight of a product).</p><p>• Finding optional attributes, i.e., attributes that are undefined for some instances of the hierarchy (e.g., the State attribute in a geographical hierarchy that also includes non-US nations). 
• Labeling measures based on whether the SUM operator can be used or not to aggregate them (e.g., the exchange rate of dollars to euros, which cannot be summed).</p><p>Unfortunately, these activities can hardly be automated by an algorithm because they rely on the end-users' requirements on the one hand, and on the semantics of measures, dimensions, and attributes as expressed by their names on the other. Thus, they must be carried out manually by designers in collaboration with end-users. This is a typical situation in software engineering where Large Language Models (LLMs) may come to the rescue. LLMs have proven to be a great tool for mimicking human linguistic abilities because of their capacity to learn from large corpora, which has had a disruptive effect in a number of fields, and more specifically in software engineering <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5]</ref>. In particular, the experiments on using LLMs for conceptual design <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7]</ref> showed that they can help designers with this task by producing draft solutions in a timely manner, although some human intervention is still necessary to guarantee the accuracy of the outcomes.</p><p>The goal of this work is to check whether LLMs can act as facilitators for the refinement of conceptual schemata of multidimensional cubes, so as to relieve designers of their role or even, if possible, let refinement be carried out entirely by end-users. The Dimensional Fact Model (DFM <ref type="bibr" target="#b0">[1]</ref>) is the target formalism for our study; as a typical LLM, we use ChatGPT's model GPT-4o <ref type="bibr" target="#b7">[8]</ref>, which has gained popularity for its smooth user interface and natural language generation capabilities <ref type="bibr" target="#b8">[9]</ref>. 
To achieve our goal, we formulate two research questions aimed at (i) understanding the basic competences of ChatGPT in the refinement of a draft DFM schema and (ii) investigating whether they can be improved via prompt engineering. An extended version of this work is available in <ref type="bibr" target="#b9">[10]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>An experiment to use LLMs for creating specifications from requirements documents in the realm of smart devices is described in <ref type="bibr" target="#b6">[7]</ref>. The authors contend that the fundamental skill of conceptual design is still lacking, but they acknowledge that LLMs are very useful in later phases of the development process, like creating class diagrams and generating source code. Additional experiments with ChatGPT for conceptual modeling are discussed in <ref type="bibr" target="#b5">[6]</ref>. The authors note that ChatGPT can rapidly produce an initial draft diagram from a natural language description; nevertheless, considerable modeling expertise is still needed to improve and verify the outcomes. The authors of <ref type="bibr" target="#b8">[9]</ref> describe an experiment they conducted using ChatGPT and come to the conclusion that while adding LLMs to human-driven conceptual design does not dramatically affect outcomes, it does greatly reduce the time required to complete the design by requiring fewer design steps. In <ref type="bibr" target="#b10">[11]</ref>, many conceptual schemata produced by an LLM are contrasted with a baseline of crowdsourced solutions. On average, it is shown that crowdsourced solutions are more innovative, whereas LLM-generated solutions are more practical. In <ref type="bibr" target="#b11">[12]</ref>, the benefits of utilizing LLMs to improve morphological analysis in conceptual design are examined. The tests demonstrate how LLMs give designers access to interdisciplinary knowledge; for optimal outcomes, LLMs and designers should work closely together and use smart prompt engineering. With regard to use case and domain modeling, <ref type="bibr" target="#b12">[13]</ref> examines how users engage with LLMs during conceptual modeling. 
Its primary conclusions speak to the need for specific prompt templates to assist users.</p><p>As to multidimensional conceptual design, the main types of methods in the literature are supply-driven (or data-driven), demand-driven (or requirement-driven), mixed, and query-driven. Supply-driven methods begin by designing conceptual schemata from the schemata of the data sources (such as relational schemata); end-user requirements influence design by enabling the designer to choose which data are important for making decisions and by figuring out how to structure them using the multidimensional model <ref type="bibr" target="#b13">[14]</ref>. Demand-driven techniques begin with identifying end-users' business requirements, and only then do they look into how to map these requirements onto the available data sources <ref type="bibr" target="#b14">[15]</ref>. Mixed techniques integrate requirements-driven and data-driven methods; here, both end-user requirements and data source schemata are used simultaneously <ref type="bibr" target="#b15">[16]</ref>. The set of OLAP queries that end-users are willing to formulate is the starting point for the creation of a multidimensional schema in query-driven approaches. These queries can be specified using SQL statements <ref type="bibr" target="#b16">[17]</ref>, MDX expressions <ref type="bibr" target="#b17">[18]</ref>, pivot tables <ref type="bibr" target="#b18">[19]</ref>, or query trees <ref type="bibr" target="#b19">[20]</ref>. Multidimensional modeling techniques are reviewed in <ref type="bibr" target="#b20">[21]</ref>, and their cost-benefit analysis is provided in <ref type="bibr" target="#b21">[22]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The investigation process</head><p>As stated in the Introduction, our goal in this work is to assess the performance of ChatGPT in the refinement of a draft DFM schema obtained by supply-driven design starting from a source relational schema. We take as a reference an advanced form of the DFM including, besides the basic constructs of fact, measure, dimension, and attribute, the advanced constructs of descriptive attributes, optional attributes, and additivity. In this form, a DFM schema is a graph whose root is the fact (represented as a box with the fact name, e.g., SALES, followed by a list of measures, e.g., Amount), whose other nodes are attributes (e.g., Product), represented as circles and connected by arcs representing many-to-one roll-up relationships, i.e., functional dependencies (FDs; for instance, Product → Category). Descriptive attributes are represented without a circle; optional attributes are dashed; a non-additive measure is represented by adding its aggregation operator to its name (e.g., ExchangeRate (AVG)).</p></div>
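The constructs just listed can be pictured as a small in-memory data structure. The following Python sketch is illustrative only: the class and field names are our own, not the paper's actual encoding of DFM schemata.

```python
from dataclasses import dataclass

@dataclass
class Attribute:
    name: str
    descriptive: bool = False  # drawn without a circle in the DFM
    optional: bool = False     # drawn with a dashed arc in the DFM

@dataclass
class DFMSchema:
    fact: str
    measures: dict    # measure name -> aggregation operator, or None if additive
    attributes: dict  # attribute name -> Attribute
    fds: set          # many-to-one roll-up arcs (FDs), e.g. ("Product", "Category")

# a fragment of the SALES example used in the text
sales = DFMSchema(
    fact="SALES",
    measures={"Amount": None, "ExchangeRate": "AVG"},  # ExchangeRate is non-additive
    attributes={a: Attribute(a) for a in ("Product", "Category")},
    fds={("Product", "Category")},
)
```

Arcs are stored as directed pairs, mirroring the graph reading of the schema where each arc is a functional dependency.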
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Research questions</head><p>We formulate the following research questions: RQ.1: Is ChatGPT capable of refining a draft DFM schema by (i) making attribute names more intuitive for endusers, (ii) showing additivity, (iii) finding descriptive attributes or discretizing them, (iv) finding optional attributes, (v) completing time hierarchies, and (vi) removing uninteresting attributes?</p><p>RQ.2: Can the performance of ChatGPT in refining a draft DFM schema be improved via prompt engineering?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Experiment design</head><p>Our experiment relies on five cornerstones, described in the following subsections.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Base criteria and technology</head><p>The criteria we follow for our experiment are listed below:</p><p>• Learning. For learning we adopt a prompt-based learning method, which is often used as an alternative to fine-tuning <ref type="bibr" target="#b22">[23]</ref>. Specifically, for RQ.1 we adopt 0-shot learning (which operates with no labeled examples); for RQ.2 we adopt few-shot learning and provide two training examples <ref type="bibr" target="#b23">[24]</ref>. To further improve learning, for RQ.2 we also employ the chain-of-thought technique <ref type="bibr" target="#b24">[25]</ref>, which includes a list of reasoning steps in the examples.</p><p>• Reproducibility. The lack of reproducibility of the tests is a significant challenge when working with LLMs because of their non-deterministic nature. The level of "creativity" of ChatGPT is governed by its temperature parameter; in principle, no creativity is required for refining draft schemata, so we set the temperature to 0 for every chat.</p><p>• Domain. The problem domain is acknowledged to be crucial for LLMs; the more domain knowledge an LLM has, the better the model it generates <ref type="bibr" target="#b25">[26]</ref>. All the examples we present describe actual domains, some of which are well-known (like purchases) while others are less common (like crossfit workouts).</p><p>• Conversation-awareness. The answers obtained from ChatGPT may depend heavily on the previous questions asked during a conversation. Thus, as also suggested in <ref type="bibr" target="#b25">[26]</ref>, we start a new chat for each case.</p><p>• Iteration. In all our tests, the first answer obtained is considered. 
However, keeping in mind that the refinement process is inherently iterative, in RQ.2 we tried to improve the first answer by further prompting ChatGPT with suggestions.</p><p>As to the technological environment, experiments have been carried out with ChatGPT's GPT-4o model.</p></div>
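The reproducibility and conversation-awareness criteria above can be sketched as the request configuration one would send to a chat-completion API. This is a hedged illustration: the field names follow the common OpenAI-style request shape, and the prompt strings are placeholders, not the paper's actual prompts.

```python
def new_refinement_chat(instruction_prompt: str) -> list:
    # conversation-awareness: each test case starts from a fresh chat history,
    # so earlier questions cannot influence the answer
    return [{"role": "system", "content": instruction_prompt}]

def chat_request(history: list, case_prompt: str) -> dict:
    # reproducibility: temperature 0, since no "creativity" is wanted
    # when refining draft schemata
    return {
        "model": "gpt-4o",
        "temperature": 0,
        "messages": history + [{"role": "user", "content": case_prompt}],
    }

req = chat_request(new_refinement_chat("You act as a data warehouse designer ..."),
                   "Refine the draft DFM schema below ...")
```

Each test case builds its own `new_refinement_chat`, so no state leaks between cases.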
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Input/output format</head><p>A draft DFM schema must be provided in input for each of our research questions, and a refined one must be provided as output. We employ YAML 1 , a human-readable data serialization language that is frequently used for configuration files and in applications where data is being saved or communicated, as a format to express DFM schemata. Since ChatGPT is familiar with YAML, it does not need any further instruction on the syntax of the language; nevertheless, it needs to be taught about the particular tags we added to denote multidimensional concepts (e.g., measures to introduce the list of measures).</p></div>
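To make the YAML coding concrete, here is a minimal serializer sketch. The `measures` and `dependencies` tags are the ones mentioned in the text; the arrow notation for FDs and the overall layout are our assumptions, not the paper's exact format.

```python
def dfm_to_yaml(fact: str, measures: list, dependencies: list) -> str:
    # minimal hand-rolled serialization of a DFM schema;
    # a real pipeline could use a YAML library such as PyYAML instead
    lines = [f"fact: {fact}", "measures:"]
    lines += [f"  - {m}" for m in measures]
    lines.append("dependencies:")
    lines += [f"  - {a} -> {b}" for a, b in dependencies]
    return "\n".join(lines)

print(dfm_to_yaml("SALES", ["Amount"], [("Product", "Category")]))
```

Since the format is plain indented text, ChatGPT only needs to be told what each tag denotes, exactly as the section explains.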
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Prompt templates</head><p>Following the suggestions given in <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>, the prompts we adopt during our chats are structured according to the following templates:</p><p>• Instruction prompts. These are used in RQ.1 and RQ.2 to assign ChatGPT a task and explain how to execute it. Their structure includes: (1) ROLE: the specific roles assigned to ChatGPT and to the user to provide a context for the task; (2) FORMAT: how the input and output (i.e., the draft and refined DFM schemata) should be coded; (3) TASK: the task assigned; (4) PROCEDURE (optional): the method suggested to perform the task; (5) EXAMPLE (optional): one or more example test cases, an explanation of the procedure suggested to solve them (according to the chain-of-thought principle), and the expected output.</p><p>• Case prompts. These are used in RQ.1 and RQ.2 to assign a specific task to ChatGPT. Their structure includes: (1) INPUT: the input of the task (a draft DFM schema coded in YAML); (2) TASK: the task assigned; (3) OUTPUT: the output required (a refined DFM schema coded in YAML).</p></div>
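The instruction-prompt template can be sketched as a simple builder that assembles the components in order. The function name and separator are illustrative assumptions; the component labels are the ones defined above.

```python
def instruction_prompt(role: str, fmt: str, task: str,
                       procedure: str = None, examples: tuple = ()) -> str:
    # assemble the template components in order; PROCEDURE and EXAMPLE are optional
    parts = [f"ROLE: {role}", f"FORMAT: {fmt}", f"TASK: {task}"]
    if procedure:
        parts.append(f"PROCEDURE: {procedure}")
    for ex in examples:  # chain-of-thought: each example carries its reasoning steps
        parts.append(f"EXAMPLE: {ex}")
    return "\n\n".join(parts)

basic = instruction_prompt("DW designer", "DFM schemata coded in YAML",
                           "Refine the draft schema")
```

Leaving `procedure` and `examples` empty yields the simple prompts used for RQ.1; filling them in yields the improved prompts used for RQ.2.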
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.4.">Test cases</head><p>We created a set of five test cases with increasing difficulty, based on some exercises in supply-driven design assigned to the students of a master's course in Business Intelligence. Each exercise provided a source relational schema; from this schema, a draft DFM schema was created in supply-driven mode using the FD-chasing algorithm in <ref type="bibr" target="#b0">[1]</ref>. The number of dimensions and measures in the test cases ranges from 3 to 5 and from 0 to 5, respectively, while the overall number of attributes (dimensions plus hierarchy levels) ranges from 10 to 34. Two of the test cases include shared hierarchies (i.e., nodes entered by two or more arcs, as is often the case with temporal hierarchies).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.5.">Evaluation of the results</head><p>Refinement is, to some extent, a subjective process because it largely depends on the end-user requirements. For instance, given a ProductWeight attribute, both making it descriptive and discretizing it into WeightRanges are reasonable refinements. As a consequence, creating a single ground truth for each test case is hardly feasible. So we had to proceed manually, by first identifying a set of feasible refinements for each part of each draft DFM schema, and then counting an error in the solution proposed by ChatGPT for each deviation from this set of feasible refinements.</p></div>
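The evaluation procedure just described amounts to counting deviations from per-part sets of feasible refinements. A minimal sketch, with illustrative part names and refinement labels:

```python
def count_errors(proposed: dict, feasible: dict) -> int:
    # one error for every schema part whose proposed refinement falls outside
    # the set of feasible refinements identified for it beforehand
    return sum(refinement not in feasible.get(part, set())
               for part, refinement in proposed.items())

# e.g., for ProductWeight both refinements below are deemed reasonable
feasible = {"ProductWeight": {"make descriptive", "discretize"}}
assert count_errors({"ProductWeight": "discretize"}, feasible) == 0
assert count_errors({"ProductWeight": "drop"}, feasible) == 1
```

Parts with no recorded feasible set default to empty, so any unexpected refinement there counts as an error as well.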
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Answer to RQ.1: Refinement</head><p>To put ChatGPT to the test on refinement, we fed it our five test cases. In order to enable a more precise evaluation of the abilities of ChatGPT, separate prompts are submitted for each refinement step entailed by RQ.1. Thus, for each test case we adopt a simple instruction prompt that explains the DFM constructs followed by a request to carry out a list of refinement steps; no PROCEDURE or EXAMPLE components are present to suggest to ChatGPT how to operate. Then we formulate a sequence of case prompts that (i) specify a draft DFM schema in input, (ii) assign as a task one single refinement step, and (iii) require a refined DFM schema in output. In the following we will separately consider each step and briefly review its result.</p><p>(i) Make names intuitive. The attribute and measure names in the draft DFM schema have the form RELATION_NAME.attributeName, where RELATION_NAME is a table of the source relational schema used for deriving the draft schema and attributeName is one of its attributes (see Figure <ref type="figure" target="#fig_2">1</ref>). Making these names more intuitive for end-users is mostly done correctly even if no specific procedure is suggested. In some cases the relation name was simply dropped, in others it was prefixed to the attribute name (e.g., SUPPLIER.name became SupplierName). In case C5, the most complex one, the shared hierarchy was mishandled and the direction of some FDs was inverted.</p><p>(ii) Label measures. ChatGPT is quite good at dealing with additivity. This is surprising, considering that this task is often not easy even for end-users. The main errors we found were syntactical: the measure was renamed in the YAML code under the "measure" tag but not under the "dependencies" tag, resulting in additional spurious nodes.</p><p>(iii) Find descriptive attributes. 
ChatGPT performs poorly in this task, with an average of almost four errors per test case. On the one hand, it does not know under which conditions an attribute should be made descriptive or discretized; on the other, it does not use the correct syntax as stated in the FORMAT section of the instruction prompt.</p><p>(iv) Find optional attributes. The identification of optional attributes is strictly related to the end-user requirements. Thus, for this refinement step the prompt simulates an end-user statement; for instance, "Not all regions have a state". As in the previous case, ChatGPT always fails in the syntax used (although it correctly identifies the optional attribute).</p><p>(v) Complete time hierarchies. Here, ChatGPT correctly adds Month → Year hierarchies to Date attributes. However, it always fails in recognizing and managing shared hierarchies (in C4 and C5).</p><p>(vi) Remove attributes. For the last refinement step, an indication from end-users about which attributes are deemed uninteresting for their analyses is required. Thus, as for optional attributes, the prompt simulates an end-user statement; for instance, "StoreId is not interesting to me". ChatGPT does not know how to correctly rearrange FDs after removing an attribute, so it makes an average of two errors per test case for this refinement step.</p><p>The number of errors made at each step for each test case is shown in Figure <ref type="figure">2</ref> (top). As an example, Figure <ref type="figure">3</ref> (top) shows the final DFM schema for test case C2; note that descriptive and optional attributes are not shown as such because the visualizer does not recognize the wrong YAML syntax for them, and that the graph is disconnected because some FDs were dropped. Overall, the performance is not very good but acceptable, with an average of 9 total refinement errors per test case. 
The errors clearly tend to increase with the complexity of the draft schema; the main problems are due to the YAML syntax and to the presence of shared hierarchies. The most critical refinement steps appear to be the identification of descriptive attributes and the removal of uninteresting attributes.</p></div>
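The FD rearrangement that ChatGPT gets wrong in step (vi) has a well-defined correct behavior: arcs must be reconnected transitively around the removed attribute. A minimal sketch, with illustrative attribute names:

```python
def remove_attribute(fds: set, target: str) -> set:
    # reconnect roll-up arcs transitively: with Product -> Category -> Dept,
    # removing Category must leave Product -> Dept
    incoming = {a for a, b in fds if b == target}
    outgoing = {b for a, b in fds if a == target}
    kept = {(a, b) for a, b in fds if target not in (a, b)}
    return kept | {(a, b) for a in incoming for b in outgoing}

fds = {("Product", "Category"), ("Category", "Dept")}
assert remove_attribute(fds, "Category") == {("Product", "Dept")}
```

Removing a leaf attribute simply drops its arc, since there is nothing downstream to reconnect to.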
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Answer to RQ.2: Improved refinement</head><p>To answer RQ.2 we incrementally crafted an instruction prompt by first trying to address the main issues that emerged in RQ.1, then progressively adding specific sentences to try to fix the residual (or new) errors. The ROLE, FORMAT, and TASK components are exactly as in RQ.1. However, for each refinement step, we added a PROCEDURE component to suggest to ChatGPT how to operate and an EXAMPLE component with two examples. The case prompts are exactly the same as those used for RQ.1.</p><p>No errors in additivity or optional attributes are made when the improved prompt is used. A single renaming error is made in C5, due to an unrecognized shared hierarchy. A few errors are made on time hierarchies, again due to shared hierarchies. Overall, the main causes of errors are related to descriptive/discretized attributes (in some cases, a few of them are not identified) and to the removal of uninteresting attributes (sometimes, arcs are not correctly repositioned). Noticeably, all these errors could be fixed in a single iteration via specific prompts, e.g., "Merge drop-off date and pick-up date into a single date node" to fix a shared hierarchy. In some cases, even generic prompts were used successfully to fix errors, e.g., "Some arcs are missing, please try again" to fix the FDs after removal. Figure <ref type="figure">3</ref> (bottom) shows the final DFM schema obtained for C2 after correcting two errors in descriptive attributes via an iteration prompt.</p><p>The results, in terms of number of errors made at each step, are summarized in Figure <ref type="figure">2</ref> (bottom). It appears that prompt engineering can significantly improve the accuracy of refinement, with the average number of total refinement errors per test case decreasing from 9 to 4. 
The main residual errors are related to the recognition of shared hierarchies and of descriptive/discretized attributes, as well as to the removal of uninteresting attributes. In our tests, all these errors could be fixed via an additional prompt that either explains exactly how to proceed, or simply suggests trying again while paying more attention to some specific aspect.</p></div>
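The corrective prompt for shared hierarchies mentioned in Section 3.4 ("Merge drop-off date and pick-up date into a single date node") amounts to collapsing two attribute nodes into one in the FD graph. A minimal sketch, with illustrative node names:

```python
def merge_nodes(fds: set, a: str, b: str, merged: str) -> set:
    # collapse two attribute nodes into one, so that the arcs entering
    # their roll-up targets form a single shared hierarchy
    rename = lambda x: merged if x in (a, b) else x
    return {(rename(s), rename(t)) for s, t in fds if rename(s) != rename(t)}

fds = {("DropOffDate", "Month"), ("PickUpDate", "Month"), ("Month", "Year")}
merged = merge_nodes(fds, "DropOffDate", "PickUpDate", "Date")
assert merged == {("Date", "Month"), ("Month", "Year")}
```

The filter on `rename(s) != rename(t)` discards any self-loop that merging would otherwise create.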
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>In this work we have investigated the capabilities of ChatGPT to cope with a specific task in conceptual design, namely, the refinement of draft DFM schemata obtained by supply-driven conceptual design of multidimensional data cubes, a task that is normally carried out manually by designers and end-users in close collaboration. It turned out that, although ChatGPT tends to mix the conceptual level (DFM) with the logical level (star/snowflake schemata), it can provide some acceptable results on test cases with different degrees of complexity using simple prompts. Noticeably, our tests show that, when prompts are enhanced with detailed instructions and examples, the results produced improve significantly in all cases. Indeed, when using an improved prompt the average number of errors per multidimensional concept across all test cases decreases from 0.5 to 0.2. In practice, the residual errors are still too many to state that no involvement of designers is necessary and that end-users can carry out refinement by directly interacting with an LLM. However, we can conclude that LLMs can significantly support designers in refinement, even considering that all residual errors in our tests could quickly be fixed via a simple additional prompt.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>DOLAP 2025: 27th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, co-located with EDBT/ICDT 2025, March 25, 2025, Barcelona, Spain stefano.rizzi@unibo.it (S. Rizzi) 0000-0002-4617-217X (S. Rizzi)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Draft schema for test case C2 (only the first four letters of relation names are shown)</figDesc><graphic coords="3,325.98,65.61,180.90,119.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 2 :Figure 3 :</head><label>23</label><figDesc>Figure 2: Number of errors in the refinement of draft DFM schemata (top: basic prompts, bottom: improved prompts)</figDesc><graphic coords="4,97.64,285.01,162.41,101.93" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://yaml.org/spec/history/2001-08-01.html</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Golfarelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rizzi</surname></persName>
		</author>
		<title level="m">Data warehouse design: Modern principles and methodologies</title>
				<imprint>
			<publisher>McGraw-Hill</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Multidimensional modeling driven from a domain language</title>
		<author>
			<persName><forename type="first">L</forename><surname>Antonelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bimonte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rizzi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Autom. Softw. Eng</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page">6</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">LLMs: Understanding code syntax and semantics for code analysis</title>
		<author>
			<persName><forename type="first">W</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Nie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<idno>CoRR abs/2305.12138</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">ChatGPT prompt patterns for improving code quality, refactoring, requirements elicitation, and software design</title>
		<author>
			<persName><forename type="first">J</forename><surname>White</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hays</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Spencer-Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">C</forename><surname>Schmidt</surname></persName>
		</author>
		<idno>CoRR abs/2303.07839</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">AI-driven software engineering -the role of conceptual modeling</title>
		<author>
			<persName><forename type="first">H</forename><surname>Fill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cabot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Maass</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Van Sinderen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Enterp. Model. Inf. Syst. Archit. Int. J. Concept. Model</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Conceptual modeling and large language models: Impressions from first experiments with ChatGPT</title>
		<author>
			<persName><forename type="first">H</forename><surname>Fill</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fettke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Köpke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Enterp. Model. Inf. Syst. Archit. Int. J. Concept. Model</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page">3</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Generating specifications from requirements documents for smart devices using large language models (LLMs)</title>
		<author>
			<persName><forename type="first">R</forename><surname>Lutze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Waldhör</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. HCI</title>
				<meeting>HCI<address><addrLine>Washington, DC, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="94" to="108" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Unlocking the potential of ChatGPT: A comprehensive exploration of its applications, advantages, limitations, and future directions in natural language processing</title>
		<author>
			<persName><forename type="first">W</forename><surname>Hariri</surname></persName>
		</author>
		<idno>CoRR abs/2304.02017</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Examining how the large language models impact the conceptual design with human designers: A comparative case study</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Duh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Hum. Comput. Interact</title>
		<imprint>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Using ChatGPT to refine draft conceptual schemata in supply-driven design of multidimensional cubes</title>
		<author>
			<persName><forename type="first">S</forename><surname>Rizzi</surname></persName>
		</author>
		<idno>arXiv:2502.02238v1</idno>
		<imprint>
			<date type="published" when="2025">2025</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Conceptual design generation using large language models</title>
		<author>
			<persName><forename type="first">K</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Grandi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>McComb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Goucher-Lambert</surname></persName>
		</author>
		<idno>CoRR abs/2306.01779</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A LLM-augmented morphological analysis approach for conceptual design</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Jing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. DRS</title>
				<meeting>DRS<address><addrLine>Boston, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="1" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">How are LLMs used for conceptual modeling? An exploratory study on interaction behavior and user perception</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Reinhartz-Berger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bork</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ER</title>
				<meeting>ER<address><addrLine>Pittsburgh, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="257" to="275" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Data-driven multidimensional design for OLAP</title>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SSDBM</title>
				<meeting>SSDBM<address><addrLine>Portland, OR, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="594" to="595" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A requirement-driven approach to the design and evolution of data warehouses</title>
		<author>
			<persName><forename type="first">P</forename><surname>Jovanovic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Simitsis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Mayorova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">44</biblScope>
			<biblScope unit="page" from="94" to="119" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Hybrid methodology for data warehouse conceptual design by UML schemas</title>
		<author>
			<persName><forename type="first">F</forename><surname>Di Tria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Lefons</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Tangorra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Softw. Technol</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="360" to="379" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Automatic validation of requirements to support multidimensional design</title>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data Knowl. Eng</title>
		<imprint>
			<biblScope unit="volume">69</biblScope>
			<biblScope unit="page" from="917" to="942" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Constructing OLAP cubes based on queries</title>
		<author>
			<persName><forename type="first">T</forename><surname>Niemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Nummenmaa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Thanisch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. DOLAP</title>
				<meeting>DOLAP<address><addrLine>Atlanta, Georgia, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="9" to="15" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Requirements-driven data warehouse design based on enhanced pivot tables</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bimonte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Antonelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rizzi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Req. Eng</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="43" to="65" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A conceptual query-driven design framework for data warehouse</title>
		<author>
			<persName><forename type="first">R</forename><surname>Nair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wilson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Srinivasan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. Jour. of Computer and Information Engineering</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="62" to="67" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">A survey of multidimensional modeling methodologies</title>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Data Warehous. Min</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="1" to="23" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Cost-benefit analysis of data warehouse design methodologies</title>
		<author>
			<persName><forename type="first">F</forename><surname>Di Tria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Lefons</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Tangorra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">63</biblScope>
			<biblScope unit="page" from="47" to="62" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Automated domain modeling with large language models: A comparative study</title>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Hernández López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mussbacher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Varró</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. MODELS</title>
				<meeting>MODELS<address><addrLine>Västerås, Sweden</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="162" to="172" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">B</forename><surname>Brown</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. NeurIPS</title>
				<meeting>NeurIPS</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Chain-of-thought prompting elicits reasoning in large language models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. NeurIPS</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Koyejo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Mohamed</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Agarwal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Belgrave</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Oh</surname></persName>
		</editor>
		<meeting>NeurIPS<address><addrLine>New Orleans, LA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">On the assessment of generative AI in modeling tasks: an experience report with ChatGPT and UML</title>
		<author>
			<persName><forename type="first">J</forename><surname>Cámara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Troya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Burgueño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vallecillo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Softw. Syst. Model</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="781" to="793" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
