<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Automatically Designing Machine Learning Models out of Natural Language</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
<persName><forename type="first">Ernesto</forename><forename type="middle">Luis</forename><surname>Estevanell-Valladares</surname></persName>
							<email>elev1@alu.ua.es</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Software and Computer Systems</orgName>
								<orgName type="institution">University of Alicante</orgName>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Faculty of Mathematics and Computer Science</orgName>
								<orgName type="institution">University of Havana</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Automatically Designing Machine Learning Models out of Natural Language</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">E7FA46AD725FAE3CB215424CDCF65418</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:05+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>AutoML</term>
					<term>Natural Language Processing</term>
					<term>Large Language Models</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The popularity of artificial intelligence has led to an increasing need for machine learning models tailored to specific needs. AutoML aims to automate the process of creating effective machine-learning solutions, but current systems are not versatile enough to meet this demand. While more flexible and extensible heterogeneous systems have overcome many limitations of traditional AutoML systems, they lack accessibility due to their programmatic interfaces. We propose a research project that addresses this issue by developing a heterogeneous AutoML system capable of producing optimal machine-learning pipelines through a natural language interface.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Machine learning has expanded rapidly, presenting researchers and practitioners with many new algorithms and data sets. However, selecting the most suitable strategy for a given problem has become increasingly complex, requiring extensive experimentation and technical expertise. AutoML has emerged as a solution to this problem by providing powerful tools to search through large spaces of machine-learning pipelines <ref type="bibr" target="#b0">[1]</ref>. Nevertheless, the range of possible techniques for natural language processing is vast, making it hard to combine and compare different algorithms. Thus, AutoML systems must agree on a standard protocol so that the output of any algorithm can serve as the input of any other.</p><p>To achieve the primary goal of AutoML, systems must have interfaces that are easy to use for those with limited computer science and machine learning knowledge. Furthermore, they should have strong generalization capabilities, so that they can be applied in various scenarios and produce machine learning models suitable for a wide range of applications. However, current AutoML systems focus on a specific set of algorithms, often tailored to a single library or toolkit <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4]</ref>. This reduces their ability to explore algorithms from different domains and to find optimal solutions to complex, multifaceted problems. In contrast, Heterogeneous AutoML systems generate learning solutions by mixing techniques from different domains <ref type="bibr" target="#b4">[5]</ref>. 
However, they do not provide natural interfaces for users who are new to programming or AutoML, and their execution requires preparing the environment, defining the problem in appropriate terms, and providing data in specific formats <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b6">7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>.</p><p>Using natural language as a user interface can significantly enhance the accessibility and user-friendliness of AutoML. Large Language Models (LLMs) are popular for their ability to process raw text and effectively identify patterns and connections in data <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>. However, these models can sometimes generate incorrect responses or struggle with inference tasks <ref type="bibr" target="#b11">[12]</ref>. Recent studies suggest that incorporating external knowledge significantly improves the performance of LLMs <ref type="bibr" target="#b12">[13]</ref>.</p><p>Researchers are exploring combinations of these techniques to address the limitations of both AutoML and LLMs. Shen et al. <ref type="bibr" target="#b13">[14]</ref> employed an LLM to process queries and plan learning tasks executed by pre-trained models. This approach has successfully integrated image, audio, and text prediction, classification, and processing capabilities into a single system. However, it does not address model selection or hyperparameter optimization, nor does it produce tuned learning models as standalone programs or allow exporting inference capabilities to arbitrary environments.</p><p>This research project consists of producing a Heterogeneous AutoML system that integrates natural language processing as its primary interface. 
The ultimate goal is to design a tool that generates optimal machine learning models that are flexible and adaptable to different contexts and heterogeneous situations. This leads to our Main Research Question: "In what way can we integrate Natural Language into a Heterogeneous AutoML process?".</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Most AutoML systems use a limited set of algorithms specific to a particular library or toolkit. This limitation hinders their ability to solve complex problems by exploring algorithms from different areas. On the other hand, Heterogeneous AutoML systems combine techniques from multiple domains to create better learning solutions. However, they lack a user-friendly interface for those new to programming or AutoML: using them involves defining the problem correctly, providing data in specific formats, and setting up the environment.</p><p>Table <ref type="table">1</ref> contrasts several existing AutoML systems with the system proposed in this research with respect to their ability to deal with heterogeneous scenarios. The evaluation focused on their capacity to handle diverse scenarios encompassing multiple algorithms. It is worth noting that this assessment was based solely on their ability to handle heterogeneous algorithms, without consideration of their overall performance, capacity, or applicability.</p><p>AutoML systems vary in capabilities and limitations depending on the specific learning libraries they are built on. Some, like Auto-Sklearn <ref type="bibr" target="#b2">[3]</ref>, Auto-Weka <ref type="bibr" target="#b3">[4]</ref>, and Auto-Keras <ref type="bibr" target="#b1">[2]</ref>, are restricted to using Scikit-learn <ref type="bibr" target="#b14">[15]</ref>, Weka <ref type="bibr" target="#b15">[16]</ref>, and Keras <ref type="bibr" target="#b16">[17]</ref>, respectively. Other systems, such as RECIPE <ref type="bibr" target="#b8">[9]</ref> and Hyperopt <ref type="bibr" target="#b6">[7]</ref>, can incorporate algorithms from different libraries but require a concrete implementation. 
TPOT <ref type="bibr" target="#b5">[6]</ref> and ML-Plan <ref type="bibr" target="#b7">[8]</ref> provide a more flexible approach, combining technologies from multiple learning libraries to create concrete implementations of learning pipelines. Most AutoML systems focus on supervised learning, although some, like Hyperopt, offer the potential to integrate unsupervised learning functionality.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>[Table 1 appears here. Its columns are: Multiple libraries, Multiple ML problems, Probabilistic, Extensible, Automatic Discovery, Multiobjective, Distributed, and Deployable Pipelines, plus the year of publication. The rows cover Hyperopt (2013), Auto-Weka 2.0 (2017), RECIPE (2017), TPOT (2018), ML-Plan (2018), Auto-Keras (2019), Auto-Sklearn 2.0 (2020), and AutoGOAL (2023); AutoGOAL is the only system marked with every listed capability.]</p><p>Table <ref type="table">1</ref> Comparison of several existing AutoML systems' capabilities to deal with heterogeneous machine learning problems. Entries marked with ≈ indicate that the system design enables the given capability, but we have no record of its implementation.</p><p>Systems like Auto-Sklearn and Auto-Keras benefit from a unified underlying API, while modularly designed systems such as ML-Plan allow the addition and modification of algorithms. Several systems can balance different objectives or metrics during optimization, which is relevant in many development and research scenarios. For example, Auto-Keras and ML-Plan use a weighted-sum approach to combine multiple evaluation metrics into a single objective function; both let users specify the metrics and the weight assigned to each, thereby controlling their relative importance in the optimization process.</p><p>Hyperopt, ML-Plan, Auto-WEKA, and Auto-Sklearn include mechanisms to distribute search processes and resources among multiple computers, optimizing search time and generating learning pipelines more quickly. To meet the goal of AutoML, the resulting solutions should be easily usable in arbitrary environments and applicable as portable machine learning algorithms outside the AutoML context. Systems such as Hyperopt, TPOT, and Auto-Sklearn allow exporting the best pipeline found during the search process as a Python script, while Auto-WEKA can export to Java <ref type="bibr" target="#b17">[18]</ref> code. 
Auto-Keras allows exporting models in various formats, including TensorFlow <ref type="bibr" target="#b18">[19]</ref>, PyTorch <ref type="bibr" target="#b19">[20]</ref>, and Keras <ref type="bibr" target="#b16">[17]</ref>.</p><p>Using probabilistic models to describe the space of possible pipelines is another interesting feature of AutoML systems. AutoML systems based on Bayesian optimization build an internal representation of the space of possible algorithm pipelines, which can be interpreted as assigning a probability distribution to each particular pipeline. This feature describes the algorithm pipeline space and allows researchers to gather additional information by analyzing which regions have higher or lower probabilities of generating effective pipeline components.</p><p>Previous research has addressed the limitations of AutoML systems. Estevez-Velarde et al. <ref type="bibr" target="#b20">[21]</ref> introduced the concept of Heterogeneous AutoML, a more general formulation of the AutoML problem. Additionally, they introduced AutoGOAL, a flexible and efficient system for Heterogeneous AutoML implemented in Python. With AutoGOAL, users can describe a specific machine-learning problem, its input and output requirements, and a set of objectives. The system then automatically finds the best pipeline of algorithms from various libraries, including Scikit-learn <ref type="bibr" target="#b14">[15]</ref>, NLTK <ref type="bibr" target="#b21">[22]</ref>, Keras <ref type="bibr" target="#b16">[17]</ref>, and Gensim <ref type="bibr" target="#b22">[23]</ref>. It is also customizable, allowing users to add and integrate new algorithms into the existing pipelines. AutoGOAL uses a Pareto-front approach to multiobjective optimization and an optimization process based on probabilistic grammatical evolution over context-free grammars <ref type="bibr" target="#b23">[24]</ref>.</p></div>
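As a minimal illustration of the Pareto-front approach to multiobjective optimization mentioned above (a generic sketch of the technique, not AutoGOAL's actual implementation; all names and scores are ours), the non-dominated set of candidate pipeline scores can be computed as follows:

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on every metric
    and strictly better on at least one (all metrics are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated subset of (metric_1, ..., metric_k) tuples."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Hypothetical (accuracy, speed) pairs for four candidate pipelines.
scores = [(0.90, 0.2), (0.85, 0.9), (0.90, 0.5), (0.70, 0.1)]
print(pareto_front(scores))  # -> [(0.85, 0.9), (0.90, 0.5)]
```

Unlike a weighted sum, the front preserves every trade-off, so the user can choose among mutually incomparable pipelines after the search rather than committing to fixed metric weights beforehand.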
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Proposed Research</head><p>AutoGOAL has achieved state-of-the-art performance compared with other AutoML systems and has been able to solve machine-learning tasks outside of supervised learning. Additionally, it can build complex pipelines that target difficult NLP tasks like Named Entity Recognition, connecting algorithms of different natures. This research project will use AutoGOAL as a baseline because of its Heterogeneous AutoML capabilities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Heterogeneous AutoML</head><p>According to Estevez-Velarde et al. <ref type="bibr" target="#b20">[21]</ref>, the space of all possible pipelines in the Heterogeneous AutoML problem can be represented as a graph 𝐺_𝐴. Each node represents a known algorithm 𝑎_𝑖, and an edge exists between every pair 𝑎_𝑖, 𝑎_𝑗 such that the output type of 𝑎_𝑖 is compatible with the input type of 𝑎_𝑗. Given a machine-learning task defined as a function that transforms an input type into an output type, we can build a specific search space graph 𝐺′_𝐴 that only models valid pipelines. To extract 𝐺′_𝐴, we introduce two additional nodes into 𝐺_𝐴: Input and Output. These nodes are connected to all algorithms capable of consuming the specific input or producing the desired output, respectively. Any path in 𝐺′_𝐴 connecting the Input and Output nodes yields a pipeline that addresses the machine-learning problem at hand.</p><p>A suitable computational implementation of this process requires solving the following problems:</p><p>1. Defining each algorithm and its respective input and output types, such that it is computationally feasible to determine whether two algorithms can be connected and to construct the graph. 2. Designing an optimization strategy that can effectively search the space of all pipelines, algorithms, and their hyperparameters under restricted computational resources.</p><p>AutoGOAL implements this compatibility function with a set of Semantic Type objects. Each semantic data type is a Python class that belongs to a hierarchy in which inheritance directly represents type compatibility (e.g., a Word can also be interpreted as Text, a more general type). The data types have a semantic interpretation beyond their underlying computational structure; for example, a string in computational terms can be a Document, a Sentence, or a Word. 
Each algorithm is implemented as a class with a run(input: Tin) -&gt; Tout method that performs the corresponding processing, potentially wrapping an underlying implementation from a machine learning library. Each algorithm's input and output types are specified using Semantic Types through the Tin and Tout annotations.</p><p>While this method for computing compatibility has advantages, it is rigid and difficult to maintain. Due to the closed nature of the type system, any new algorithm added to the AutoGOAL search space must match precise type definitions. Also, because adding new Semantic Types does not automatically update existing algorithm annotations, users must review every existing algorithm to identify which should be re-annotated. Moreover, this mechanism assumes a tree-like structure of type compatibility when there might be more complex relationships (e.g., a Stem might also be considered a Word, although these two types do not inherit from each other). This leads to an interesting question: "Can we model algorithm compatibility more openly?".</p><p>Recent proposals suggest using natural language to store information describing algorithms. Shen et al. <ref type="bibr" target="#b13">[14]</ref> use JSON documents, which are mainly text-based, to store information about pre-trained models. They parse natural language prompts into multiple tasks that an LLM matches with suitable algorithms. However, this tool does not address the AutoML problem, as it does not optimize model selection or hyperparameter configurations. In contrast, Zhang et al. <ref type="bibr" target="#b24">[25]</ref> aim to develop an AutoML system called AutoML-GPT, which uses LLMs to automatically train models on datasets from user inputs and descriptions. The LLMs serve as an automatic training system that establishes connections with models and processes inputs.</p></div>
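The type-compatibility mechanism and the path-based pipeline construction described above can be sketched as follows. This is a simplified illustration with hypothetical type and algorithm names, not AutoGOAL's actual API:

```python
# Simplified sketch of Semantic Types and pipeline construction.
# Type and algorithm names are illustrative, not AutoGOAL's actual API.

class Text: pass              # most general semantic type
class Document(Text): pass
class Sentence(Text): pass
class Word(Text): pass        # a Word can be used wherever Text is expected

def compatible(produced, expected):
    """An output type can feed an input slot if it is the same class or a subclass."""
    return issubclass(produced, expected)

# Each algorithm is summarized by its (Tin, Tout) annotation, mirroring
# the run(input: Tin) -> Tout signature described above.
ALGORITHMS = {
    "sent_splitter": (Document, Sentence),
    "tokenizer":     (Sentence, Word),
    "stemmer":       (Word, Word),
}

def valid_pipelines(tin, tout, max_len=3):
    """Enumerate algorithm sequences, i.e., paths from Input to Output in G'_A."""
    results = []

    def extend(path, current):
        if path and compatible(current, tout):
            results.append(list(path))
        if len(path) >= max_len:
            return
        for name, (a_in, a_out) in ALGORITHMS.items():
            if name not in path and compatible(current, a_in):
                path.append(name)
                extend(path, a_out)
                path.pop()

    extend([], tin)
    return results

print(valid_pipelines(Document, Word))
# -> [['sent_splitter', 'tokenizer'], ['sent_splitter', 'tokenizer', 'stemmer']]
```

The rigidity discussed above shows up here directly: a new Stem type would have to be wired into the class hierarchy by hand before any stem-producing algorithm could connect to word-consuming ones.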
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Research Questions</head><p>In this research, we propose the integration of text-based description cards for the algorithms in AutoGOAL. By leveraging the power of LLMs, we can match algorithms based on their descriptions. This method adds a vital degree of generalization that might overcome the previous limitations of AutoGOAL. Moreover, by substituting natural language for the current Semantic Type system, we open the tool's interface to user text prompts.</p><p>To achieve our main objective, we must address several questions within the proposal:</p><p>1. Which LLM should we use? 2. What language should we support in the interface? 3. What machine learning tasks should we target?</p><p>To answer the first question, we must conduct experiments to determine which LLM is most suitable for our project. One idea is to use two different LLMs, each fine-tuned for a specific purpose. For example, one LLM can help determine the compatibility between algorithms during the optimization process, while the other can identify problem definitions (input and output types and objective functions) from natural language for better user interaction with the system.</p><p>For the second question, we aim to develop an interface not tailored to a specific language, to make it fully inclusive. However, the performance of LLMs can vary significantly across languages due to differences in available training data. Therefore, we will compare the performance of multilingual models against language-specific models on our target tasks before deciding which approach to follow.</p><p>Finally, our main objective is to extend existing tools, specifically AutoGOAL, to saturate the definition of Heterogeneous AutoML, thus achieving more flexibility and integrating more tasks seamlessly. We can achieve this by using the compatibility function to discover algorithms that were once bound to a specific machine-learning problem but can be part of the solution to another. The diversity and number of available algorithms determine the range of tasks the system can solve.</p><p>The proposed system has valuable scientific, economic, and social implications. It can enhance our understanding of artificial intelligence and apply it to robotics and process automation. 
Economically, it can speed up the development of applications and decrease the cost and time required for building machine learning solutions.</p><p>From a social standpoint, an AutoML system based on natural language can improve the accessibility and ease of use of machine learning, especially in critical areas such as healthcare and education. Furthermore, automating the building of learning models can lessen the need for human intervention in repetitive and monotonous tasks, thus reducing the carbon footprint associated with computer system operations.</p></div>
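To make the description-card idea concrete, the following sketch matches a user prompt against natural-language algorithm cards. A simple bag-of-words cosine similarity stands in for the LLM-based matching we actually propose, and the cards and names are hypothetical:

```python
import math
from collections import Counter

# Hypothetical description cards: free-text summaries of what each
# algorithm does. In the proposed system an LLM would perform the
# matching; a bag-of-words cosine similarity stands in for it here.
CARDS = {
    "sentiment_classifier": "classifies the sentiment of text documents",
    "ner_tagger": "extracts named entities such as persons and places from text",
    "image_classifier": "assigns category labels to input images",
}

def vectorize(text):
    """Represent text as a bag of lowercased word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_algorithm(prompt):
    """Return the card whose description is most similar to the user prompt."""
    pv = vectorize(prompt)
    return max(CARDS, key=lambda name: cosine(pv, vectorize(CARDS[name])))

print(match_algorithm("find persons and places mentioned in a text"))
# -> ner_tagger
```

Replacing the bag-of-words stand-in with an LLM keeps the same interface but lets the matching exploit semantic similarity between paraphrases, which is exactly the openness the Semantic Type system lacks.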
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Proposed Experimentation</head><p>To evaluate the potential of our proposed system, we plan to develop a benchmark incorporating challenging tasks from multiple domains, such as vision, NLP, and audio. We will perform ablation studies to understand how different LLMs and optimization strategies contribute to the overall performance of our system. In addition, to make the evaluation more comprehensive, we will compare the new system with its previous version and with other state-of-the-art AutoML systems. Because the system will be more flexible, we also aim to test its capabilities against human adversaries in various challenges. Furthermore, we will explore the possibility of integrating human feedback into the learning process, which can provide valuable insights and lead to further improvement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and Future Work</head><p>The purpose of this publication is to present the research framework for a thesis investigating the intersection between AutoML and Large Language Models (LLMs). Our objective is to improve AutoML systems, making them more accessible, user-friendly, and versatile. To achieve this goal, we will begin by examining the current state of the art in this field. Subsequently, we will develop description cards for each algorithm available in AutoGOAL and also include new algorithms, such as pre-trained models from Hugging Face, along with their respective cards. The next step will be integrating an LLM to model the compatibility function between algorithms, thereby enabling a natural language interface for user interaction. Ultimately, our aim is to pave the way for more inclusive and efficient machine learning applications in various domains.</p></div>		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Automated Machine Learning</title>
		<author>
			<persName><forename type="first">Frank</forename><surname>Hutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lars</forename><surname>Kotthoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Joaquin</forename><surname>Vanschoren</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Auto-Keras: An Efficient Neural Architecture Search System</title>
		<author>
			<persName><forename type="first">Haifeng</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Qingquan</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xia</forename><surname>Hu</surname></persName>
		</author>
		<idno type="DOI">10.1145/3292500.3330648</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>ACM</publisher>
			<biblScope unit="page" from="1946" to="1956" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Auto-Sklearn 2.0: The Next Generation</title>
		<author>
			<persName><forename type="first">Matthias</forename><surname>Feurer</surname></persName>
		</author>
		<idno>arXiv:</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms</title>
		<author>
			<persName><forename type="first">Chris</forename><surname>Thornton</surname></persName>
		</author>
		<idno type="DOI">10.1145/2487575.2487629</idno>
	</analytic>
	<monogr>
		<title level="j">ACM</title>
		<imprint>
			<biblScope unit="page" from="847" to="855" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Automl strategy based on grammatical evolution: A case study about knowledge discovery from text</title>
		<author>
			<persName><forename type="first">Suilan</forename><surname>Estevez-Velarde</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 57th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4356" to="4365" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">TPOT: A Tree-Based Pipeline Optimization Tool for Automating Machine Learning</title>
		<author>
			<persName><forename type="first">Randal</forename><forename type="middle">S</forename><surname>Olson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jason</forename><forename type="middle">H</forename><surname>Moore</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-05318-5_8</idno>
		<imprint>
			<date type="published" when="2018">2019. 2018</date>
			<publisher>Springer</publisher>
			<biblScope unit="page" from="66" to="74" />
			<pubPlace>Cham</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn</title>
		<author>
			<persName><forename type="first">Brent</forename><surname>Komer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Bergstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Eliasmith</surname></persName>
		</author>
		<idno type="DOI">10.25080/MAJORA-14BD3278-006</idno>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="32" to="37" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">ML-Plan: Automated machine learning via hierarchical planning</title>
		<author>
			<persName><forename type="first">Felix</forename><surname>Mohr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marcel</forename><forename type="middle">Dominik</forename><surname>Wever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eyke</forename><surname>Hüllermeier</surname></persName>
		</author>
		<idno type="DOI">10.1007/S10994-018-5735-Z</idno>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">107</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="1495" to="1515" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">RECIPE: A Grammar-Based Framework for Automatically Evolving Classification Pipelines</title>
		<author>
			<persName><forename type="first">Alex</forename><forename type="middle">Guimarães Cardoso</forename><surname>de Sá</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-55696-3_16</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>Springer</publisher>
			<biblScope unit="page" from="246" to="261" />
			<pubPlace>Cham</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey</title>
		<author>
			<persName><forename type="first">Min</forename><surname>Bonan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2111.01243</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>cs.CL</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">A Systematic Evaluation of Large Language Models of Code</title>
		<author>
			<persName><forename type="first">Frank</forename><forename type="middle">F</forename><surname>Xu</surname></persName>
		</author>
		<idno type="DOI">10.48550/ARXIV.2202.13169</idno>
		<idno type="arXiv">arXiv:2202.13169</idno>
		<imprint>
			<date type="published" when="2022-02">Feb. 2022</date>
		</imprint>
	</monogr>
	<note>cs.PL</note>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Persistent Anti-Muslim Bias in Large Language Models</title>
		<author>
			<persName><forename type="first">Abubakar</forename><surname>Abid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maheen</forename><surname>Farooqi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Zou</surname></persName>
		</author>
		<idno type="DOI">10.1145/3461702.3462624</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Check your facts and try again: Improving large language models with external knowledge and automated feedback</title>
		<author>
			<persName><forename type="first">Baolin</forename><surname>Peng</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2302.12813</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace</title>
		<author>
			<persName><forename type="first">Yongliang</forename><surname>Shen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2303.17580</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine Learning in Python</title>
		<author>
			<persName><forename type="first">Fabian</forename><surname>Pedregosa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">85</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">WEKA: a machine learning workbench</title>
		<author>
			<persName><forename type="first">Geoffrey</forename><surname>Holmes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Donkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ian</forename><forename type="middle">H</forename><surname>Witten</surname></persName>
		</author>
		<idno type="DOI">10.1109/ANZIIS.1994.396988</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ANZIIS '94 - Australian New Zealand Intelligent Information Systems Conference</title>
		<imprint>
			<publisher>IEEE</publisher>
			<biblScope unit="page" from="357" to="361" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Keras: The Python Deep Learning library</title>
		<author>
			<persName><forename type="first">François</forename><surname>Chollet</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">The Java language specification</title>
		<author>
			<persName><forename type="first">James</forename><surname>Gosling</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000">2000</date>
			<publisher>Addison-Wesley Professional</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">TensorFlow: a system for large-scale machine learning</title>
		<author>
			<persName><forename type="first">Martín</forename><surname>Abadi</surname></persName>
		</author>
		<idno type="DOI">10.5555/3026877.3026899</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)</title>
		<imprint>
			<publisher>USENIX Association</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="265" to="283" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">PyTorch: An Imperative Style, High-Performance Deep Learning Library</title>
		<author>
			<persName><forename type="first">Adam</forename><surname>Paszke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="8026" to="8037" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Automatic discovery of heterogeneous machine learning pipelines: An application to natural language processing</title>
		<author>
			<persName><forename type="first">Suilan</forename><surname>Estevez-Velarde</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 28th International Conference on Computational Linguistics</title>
				<meeting>the 28th International Conference on Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3558" to="3568" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">NLTK: the natural language toolkit</title>
		<author>
			<persName><forename type="first">Edward</forename><surname>Loper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Steven</forename><surname>Bird</surname></persName>
		</author>
		<idno type="arXiv">arXiv:cs/0205028</idno>
		<imprint>
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Gensim: Statistical Semantics in Python</title>
		<author>
			<persName><forename type="first">Petr</forename><surname>Sojka</surname></persName>
		</author>
		<ptr target="gensim.org" />
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
	<note>Retrieved from</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A new grammatical evolution based on probabilistic context-free grammar</title>
		<author>
			<persName><forename type="first">Hyun-Tae</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chang</forename><forename type="middle">Wook</forename><surname>Ahn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems</title>
				<meeting>the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="1" to="12" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">AutoML-GPT: Automatic Machine Learning with GPT</title>
		<author>
			<persName><forename type="first">Shujian</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.02499</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
