<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Gröbner basis computation via learning</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hiroshi</forename><surname>Kera</surname></persName>
							<email>kera@chiba-u.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">Chiba University</orgName>
								<address>
									<addrLine>1-33 Yayoi-cho, Inage-ku, Chiba-shi</addrLine>
									<postCode>2638522</postCode>
									<settlement>Chiba</settlement>
									<country key="JP">Japan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yuki</forename><surname>Ishihara</surname></persName>
							<email>ishihara.yuki@nihon-u.ac.jp</email>
							<affiliation key="aff1">
								<orgName type="institution">Nihon University</orgName>
								<address>
									<addrLine>1-8-14 Kanda Surugadai, Chiyoda-ku</addrLine>
									<postCode>1018308</postCode>
									<settlement>Tokyo</settlement>
									<country key="JP">Japan</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tristan</forename><surname>Vaccon</surname></persName>
							<email>tristan.vaccon@unilim.fr</email>
							<affiliation key="aff2">
								<orgName type="laboratory">UMR 7252</orgName>
								<orgName type="institution" key="instit1">Université de Limoges</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">XLIM</orgName>
								<address>
									<settlement>Limoges</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kazuhiro</forename><surname>Yokoyama</surname></persName>
							<email>kazuhiro@rikkyo.ac.jp</email>
							<affiliation key="aff3">
								<orgName type="institution">Rikkyo University</orgName>
								<address>
									<addrLine>3-34-1, Nishi-Ikebukuro, Toshima-ku</addrLine>
									<postCode>1718501</postCode>
									<settlement>Tokyo</settlement>
									<country key="JP">Japan</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Gröbner basis computation via learning</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">3946AB559C3AFBC219193420359E4DB6</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Gröbner Bases, Machine Learning, Transformer Orcid 0000-0002-9830-0436 (H. Kera)</term>
					<term>0000-0003-4057-3703 (Y. Ishihara)</term>
					<term>0000-0003-4208-8349 (T. Vaccon)</term>
					<term>0000-0002-5072-7799 (K. Yokoyama)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Solving a polynomial system, or computing an associated Gröbner basis, has been a fundamental task in computational algebra. However, it is also known for its notorious doubly exponential time complexity in the number of variables in the worst case. This paper is the first to address the learning of Gröbner basis computation with Transformers. The training requires many pairs of a polynomial system and the associated Gröbner basis, raising two novel algebraic problems: random generation of Gröbner bases and transforming them into non-Gröbner ones, termed as backward Gröbner problem. We resolve these problems with 0-dimensional radical ideals, the ideals appearing in various applications. The experiments show that our dataset generation method is at least three orders of magnitude faster than a naive approach, overcoming a crucial challenge in learning to compute Gröbner bases, and Gröbner computation is learnable in a particular class.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Understanding the properties of polynomial systems and solving them have been a fundamental problem in computational algebra and algebraic geometry with vast applications in cryptography <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, control theory <ref type="bibr" target="#b2">[3]</ref>, statistics <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>, computer vision <ref type="bibr" target="#b5">[6]</ref>, systems biology <ref type="bibr" target="#b6">[7]</ref>, and so forth. Special sets of polynomials called Gröbner bases <ref type="bibr" target="#b7">[8]</ref> play a key role to this end. In linear algebra, the Gaussian elimination simplifies or solves a system of linear equations by transforming its coefficient matrix into the reduced row echelon form. Similarly, a Gröbner basis can be regarded as a reduced form of a given polynomial system, and its computation is a generalization of the Gaussian elimination to general polynomial systems. However, computing a Gröbner basis is known for its notoriously bad computational cost in theory and practice. It is an NP-hard problem with the doubly exponential worst-case time complexity in the number of variables <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b9">10]</ref>. Nevertheless, because of its importance, various algorithms have been proposed in computational algebra to obtain Gröbner bases in better runtime. Examples include Faugère's F4/F5 algorithms <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref> and M4GB <ref type="bibr" target="#b12">[13]</ref>.</p><p>In this study, we investigate Gröbner basis computation from a learning perspective, envisioning it as a practical compromise to address large-scale polynomial system solving and understanding, where mathematical algorithms are computationally intractable. The learning approach does not require explicit design of computational procedures, and we only need to train a model using a large amount of (non-Gröbner set, Gröbner basis) pairs. Further, if we restrict ourselves to a particular class of Gröbner bases (or associated ideals), the model may internally find some patterns useful for prediction. The success of learning indicates the existence of such patterns, which encourages the improvement of mathematical algorithms and heuristics. Several recent studies have already addressed mathematical tasks via learning, particularly using Transformers <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>. For example, <ref type="bibr" target="#b13">[14]</ref> showed that Transformers can learn symbolic integration simply by observing many ( d𝑓/d𝑥 , 𝑓 ) pairs in training.</p><p>The training samples are generated by first randomly generating 𝑓 and computing its derivative d𝑓/d𝑥 and/or by the reverse process.</p><p>However, a crucial challenge in the learning of Gröbner basis computation is that it is mathematically unknown how to efficiently generate many (non-Gröbner set, Gröbner basis) pairs. We need an efficient backward approach (i.e., solution-to-problem computation) because, as discussed above, the forward approach (i.e., problem-to-solution computation) is prohibitively expensive. To this end, we frame two problems: (i) a random generation of Gröbner bases and (ii) a backward transformation from a Gröbner basis to an associated non-Gröbner set. To our knowledge, neither of them has been addressed in the study of Gröbner bases because of the lack of motivations; all the efforts have been dedicated to the forward computation from a non-Gröbner set to Gröbner basis.</p><p>Tackling aforementioned two unexplored algebraic problems, we investigates the first learning approach to the Gröbner computation using Transformers and experimentally show its learnability uncovered two unexplored algebraic problems in the 0-dimensional case. Our experiments show that the proposed dataset generation is highly efficient and faster than a baseline method by three or four orders of magnitude. Further, we observe a learnability gap between polynomials on finite fields and infinite fields while predicting polynomial supports are more tractable. Full version of this paper can be found in <ref type="bibr" target="#b16">[17]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">New algebraic problems for dataset generation</head><p>Our notations and definitions follow <ref type="bibr" target="#b17">[18]</ref> except that we call power products of indeterminate terms instead of monomials. By Gröbner basis computation, we mean computation of reduced Gröbner bases. Our goal is to realize Gröbner basis computation through learning. To this end, we need a large training set {(𝐹 𝑖 , 𝐺 𝑖 )} 𝑚 𝑖=1 with finite polynomial set 𝐹 𝑖 ⊂ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] and Gröbner basis 𝐺 𝑖 of ideal ⟨𝐹 𝑖 ⟩. As the computation from 𝐹 𝑖 to 𝐺 𝑖 is computationally expensive in general, we instead resort to backward generation (i.e., solution-to-problem process); that is, we generate a Gröbner basis 𝐺 𝑖 randomly and transform it to non-Gröbner set 𝐹 𝑖 . Problems. 2.1 and 2.2 require the collections 𝒢 , ℱ to contain diverse polynomial sets. Thus, the algorithms for these problems should not be deterministic but should have some controllable randomness.</p><p>What makes the learning of Gröbner basis computation hard is that, to our knowledge, neither (i) a random generation of Gröbner basis nor (ii) the backward transform from Gröbner basis to non-Gröbner set has been considered in computational algebra. Its primary interest has been instead posed on Gröbner basis computation (i.e., forward generation), and nothing motivates the random generation of Gröbner basis nor the backward transform. Interestingly, machine learning now sheds light on them. Formally, we address the following problems for dataset generation.</p><p>In this paper, we tackle these problems in the case of radical 0-dimensional ideals. We first address Prob. 2.1 using the fact that 0-dimensional radical ideals are generally in shape position.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 2.3 (Shape position). Ideal</head><formula xml:id="formula_0">𝐼 ⊂ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] is called in shape position if some univariate polynomials ℎ, 𝑔 1 , … , 𝑔 𝑛−1 ∈ 𝑘[𝑥 𝑛 ] form the reduced ≺ lex -Gröbner basis of 𝐼 as follows. 𝐺 = {ℎ, 𝑥 1 − 𝑔 1 , … , 𝑥 𝑛−1 − 𝑔 𝑛−1 }. (2.1)</formula><p>Particularly, 0-dimensional radical ideals are almost always in shape position if 𝑘 is an infinite field or finite field with large field order <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>. With this fact, an efficient sampling of Gröbner bases of 0-dimensional radical ideals can be realized by sampling 𝑛 polynomials in 𝑘[𝑥 𝑛 ], i.e., ℎ, 𝑔 1 , … , 𝑔 𝑛−1 with ℎ ≠ 0. We have to make sure that the degree of ℎ is always greater than that of 𝑔 1 , … , 𝑔 𝑛−1 , which is necessary and sufficient for 𝐺 to be a reduced Gröbner basis. This approach involves efficiency and randomness, and thus resolving Prob. 2.1. To address Prob. 2.2, we consider the following problem. A similar question was studied without the Gröbner condition in <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22]</ref>. They provided an algebraic necessary and sufficient condition for the polynomial system of 𝐹 to have a solution outside the variety defined by 𝐺. This condition is expressed explicitly by multivariate resultants. However, strong additional assumptions are required: 𝐴, 𝐹 , 𝐺 are homogeneous, 𝐺 is a regular sequence, and in the end, ⟨𝐹⟩ = ⟨𝐺⟩ is only satisfied up to saturation. Thus, they are not compatible with our setting and method for Prob. 2.1. Our analysis gives the following results for the design 𝐴 to achieve ⟨𝐹 ⟩ = ⟨𝐺⟩ for the 0-dimensional case.</p><formula xml:id="formula_1">Theorem 2.5. Let 𝐺 = (𝑔 1 , … , 𝑔 𝑡 ) ⊤ be a Gröbner basis of a 0-dimensional ideal in 𝑘[𝑥 1 , … , 𝑥 𝑛 ]. Let 𝐹 = (𝑓 1 , … , 𝑓 𝑠 ) ⊤ = 𝐴𝐺 with 𝐴 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑠×𝑡 . 1. If ⟨𝐹 ⟩ = ⟨𝐺⟩, it implies 𝑠 ≥ 𝑛. 2. If 𝐴 has a left-inverse in 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑡×𝑠 , ⟨𝐹 ⟩ = ⟨𝐺⟩ holds.</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The equality ⟨𝐹 ⟩ = ⟨𝐺⟩ holds if and only if there exists a matrix 𝐵 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ]</head><p>𝑡×𝑠 such that each row of 𝐵𝐴 − 𝐸 𝑡 is a syzygy of 𝐺, where 𝐸 𝑡 is the identity matrix of size 𝑡.</p><p>We now assume ≺=≺ lex and 0-dimensional ideals in shape position. Then, 𝐺 has exactly 𝑛 generators. When 𝑠 = 𝑛, we have the following. Proposition 2.6. For any 𝐴 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑛×𝑛 with det(𝐴) ∈ 𝑘 ∖ {0}, we have ⟨𝐹 ⟩ = ⟨𝐺⟩.</p><p>As non-zero constant scaling does not change the ideal, we focus on 𝐴 with det(𝐴) = ±1 without loss of generality. Such 𝐴 can be constructed using the Bruhat decomposition 𝐴 = 𝑈 1 𝑃𝑈 2 , where 𝑈 1 , 𝑈 2 ∈ ST(𝑛, 𝑘[𝑥 1 , … , 𝑥 𝑛 ]) are upper-triangular matrices with all-one diagonal entries (i.e., unimodular upper-triangular matrices) and 𝑃 ∈ {0, 1} 𝑛×𝑛 denotes a permutation matrix. Noting that 𝐴 −1 satisfies 𝐴 −1 𝐴 = 𝐸 𝑛 , we have ⟨𝐴𝐺⟩ = ⟨𝐺⟩ from Thm. 2.5. Therefore, random sampling (𝑈 1 , 𝑈 2 , 𝑃) of unimodular upper-triangular matrices 𝑈 1 , 𝑈 2 and a permutation matrix 𝑃 resolves the backward Gröbner problem for 𝑠 = 𝑛. We extend this idea to the case of 𝑠 &gt; 𝑛 using a rectangular unimodular upper-triangular matrix </p><formula xml:id="formula_2">𝑈 2 = ( 𝑈 ′ 2 𝑂 𝑠−𝑛,𝑛 ) ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑠×𝑛 ,</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head><p>We present the efficiency of our dataset generation method and the learnability of Gröbner basis computation. The experiments were conducted with 48-core CPUs, 768GB RAM, and NVIDIA RTX A6000ada GPUs. Due to the space limitation, we cannot present full experimental setup. See the full version in <ref type="bibr" target="#b16">[17]</ref>.</p><p>1</p><p>We surcharge notations to mean that the set {𝑔 1 , … , 𝑔 𝑡 } defined by the vector 𝐺 is a ≺-Gröbner basis.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Runtime comparison (in seconds) of forward generation (F.) and backward generation (B.) of dataset 𝒟 𝑛 (𝔽 7 ) of size 1,000. The forward generation used either of the three algorithms provided in SageMath with the libSingular backend. We set a timeout limit to five seconds (added to the total runtime at every occurrence) for each Gröbner basis computation. The numbers with † and ‡ include the timeout for more than 13 % and 24 % of the runs, respectively. ≤3 and restricted to monomials and binomials. For ℚ, coefficients of all sampled polynomials were bounded as 𝑎/𝑏 with 𝑎, 𝑏 ∈ {−5, … , 5} and we only accepted 𝐹 with coefficients such as 𝑎, 𝑏 ∈ {−100, … , 100}. This restriction is required from our machine learning model and learning framework. For forward generation, we adopted three algorithms given by SageMath <ref type="bibr" target="#b22">[23]</ref> with the libSingular backend. For a fair comparison, forward generation computed Gröbner bases of the non-Gröbner sets given by the backward generation, leading to the identical dataset. As Tab. 1 shows, our backward generation is significant orders of magnitude faster than the forward generation. A sharp runtime growth is observed in the forward generation as the number of variables increases. Note that these numbers only show the runtime on 1,000 samples, while training typically requires millions of samples. Therefore, the forward generation is almost infeasible, and the proposed method resolves a bottleneck in the learning of Gröbner basis computation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Method</head><p>Learning results. We used a standard Transformer (e.g., 6 encoder/decoder layers and 8 attention heads) and a standard training setup. The batch size was set to 16, and models were trained for 8 epochs. Each polynomial set in the datasets is converted into a sequence using the prefix representation and the separator tokens. To make the input sequence length manageable for vanilla Transformers, we used simpler datasets 𝒟 − 𝑛 (𝑘) using 𝑈 1 , 𝑈 ′ 2 of a moderate density 𝜎 ∈ (0, 1]. This makes the maximum sequence length less than 5,000. Specifically, we used 𝜎 = 1.0, 0.6, 0.3, 0.2 for 𝑛 = 2, 3, 4, 5, respectively.</p><p>The training set has one million samples, and the test set has one thousand samples. Table <ref type="table" target="#tab_2">2</ref> shows that trained Transformers successfully compute Gröbner bases with moderate/high accuracy. Not shown here, but we found several examples in the datasets for which Transformer successfully compute Gröbner bases significantly faster than math algorithms. The accuracy shows that the learning is more successful on infinite field coefficients 𝑘 ∈ {ℚ, ℝ} than finite field ones 𝑘 = 𝔽 𝑝 . This may be a counter-intuitive observation because there are more possible coefficients in 𝐺 and 𝐹 for ℚ than 𝔽 𝑝 . Specifically, for 𝐺, the coefficient 𝑎/𝑏 ∈ ℚ is restricted to those with 𝑎, 𝑏 ∈ {−5, … , 5} (i.e., roughly 50 choices), and 𝑎, 𝑏 ∈ {−100, … , 100} (i.e., roughly 20,000 choices) for 𝐹. In contrast, there are only 𝑝 choices for 𝔽 𝑝 . The performance even degrades for the larger order 𝑝 = 31. Interestingly, the support accuracy shows that the terms forming the polynomial (i.e., the support of polynomial) are correctly identified well. Thus, Transformers have difficulty determining the coefficients in finite fields. Several studies have also reported that learning to solve a problem involving modular arithmetic may encounter some difficulties <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26]</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>This study proposed the first learning approach to a fundamental algebraic task, the Gröbner basis computation. While various recent studies have reported the learnability of mathematical problems by Transformers, we addressed the first problem with nontriviality in the dataset generation. Ultimately, the learning approach may be useful to address large-scale problems that cannot be approached by Gröbner basis computation algorithms because of their computational complexity. Transformers can output predictions in moderate runtime. The outputs may be incorrect, but there is a chance of obtaining a hint of a solution, as shown in our experiments. We believe that our study reveals many interesting open questions to achieve Gröbner basis computation learning.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Problem 2 . 1 (</head><label>21</label><figDesc>Random generation of Gröbner bases). Find a collection 𝒢 = {𝐺 𝑖 } 𝑚 𝑖=1 with the reduced Gröbner basis 𝐺 𝑖 ⊂ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] of ⟨𝐺 𝑖 ⟩, 𝑖 = 1, … , 𝑚. The collection should contain diverse bases, and we need an efficient algorithm for constructing them. Problem 2.2 (Backward Gröbner problem). Given a Gröbner basis 𝐺 ⊂ 𝑘[𝑥 1 , … , 𝑥 𝑛 ], find a collection ℱ = {𝐹 𝑖 } 𝜇 𝑖=1 of polynomial sets that are not Gröbner bases but ⟨𝐹 𝑖 ⟩ = ⟨𝐺⟩ for 𝑖 = 1, … , 𝜇. The collection should contain diverse sets, and we need an efficient algorithm for constructing them.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Problem 2 . 4 .</head><label>24</label><figDesc>Let 𝐼 ⊂ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] be a 0-dimensional ideal, and let 𝐺 = (𝑔 1 , … , 𝑔 𝑡 ) ⊤ ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑡 be its ≺-Gröbner basis with respect to term order ≺.1 Find a polynomial matrix 𝐴 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑠×𝑡 giving a non-Gröbner set 𝐹 = (𝑓 1 , … , 𝑓 𝑠 ) ⊤ = 𝐴𝐺 such that ⟨𝐹 ⟩ = ⟨𝐺⟩.Namely, we generate a set of polynomials 𝐹 = (𝑓 1 , … , 𝑓 𝑠 ) ⊤ from 𝐺 = (𝑔 1 , … , 𝑔 𝑡 ) ⊤ by 𝑓 𝑖 = ∑ 𝑡 𝑗=1 𝑎 𝑖𝑗 𝑔 𝑗 for 𝑖 = 1, … , 𝑠, where 𝑎 𝑖𝑗 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] denotes the (𝑖, 𝑗)-th entry of 𝐴. Note that ⟨𝐹 ⟩ and ⟨𝐺⟩ are generally not identical, and the design of 𝐴 such that ⟨𝐹 ⟩ = ⟨𝐺⟩ is of our question.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>where 𝑈 ′ 2 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] 𝑛×𝑛 is a unimodular upper-triangular matrix and𝑂 𝑠−𝑛,𝑛 ∈ 𝑘[𝑥 1 , … , 𝑥 𝑛 ] (𝑠−𝑛)×𝑛 is the zero matrix. The permutation matrix is now 𝑃 ∈ {0, 1} 𝑠×𝑠 . Our strategy is to compute 𝐹 = 𝑈 1 𝑃𝑈 2 𝐺, which only requires a sampling of 𝒪(𝑠 2 ) polynomials in 𝑘[𝑥 1 , … , 𝑥 𝑛 ], and 𝒪(𝑛 2 + 𝑠 2 )-times multiplications of polynomials.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>We constructed 12 datasets 𝒟 𝑛 (𝑘) for 𝑛 ∈ {2, 3, 4, 5} and 𝑘 ∈ {𝔽 7 , 𝔽 31 , ℚ} and measured the runtime of our backward generation and naive forward generation (i.e., Gröbner basis computation). In the backward generation, we sampled Gröbner bases of ideals in shape position. In this step, univariable polynomials were generically sampled in 𝑘[𝑥 1 , … , 𝑥 𝑛 ] ≤5 . Next, Gröbner bases were transformed to non-Gröbner sets based on Thm. 2.5. Random polynomials in Bruhat decomposition (i.e., 𝑈 1 and 𝑈 ′ 2 ) were sampled from 𝑘[𝑥 1 , … , 𝑥 𝑛 ]</figDesc><table><row><cell></cell><cell>𝑛 = 2</cell><cell>𝑛 = 3</cell><cell>𝑛 = 4</cell><cell>𝑛 = 5</cell></row><row><cell>F. (std)</cell><cell>4.65</cell><cell>129</cell><cell>873  †</cell><cell>1354  ‡</cell></row><row><cell>F. (slimgb)</cell><cell>4.67</cell><cell>149</cell><cell>712  †</cell><cell>1259  ‡</cell></row><row><cell>F. (stdfglm)</cell><cell>5.78</cell><cell>12.6</cell><cell>44.2</cell><cell>360</cell></row><row><cell>B. (ours)</cell><cell>.003</cell><cell>.005</cell><cell>.009</cell><cell>.014</cell></row><row><cell>Dataset generation.</cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Accuracy [%] / support accuracy [%] of Gröbner basis computation by Transformer on 𝒟 − 𝑛 (𝑘). In the support accuracy, two polynomials are considered identical if they consist of an identical set of terms (i.e., identical support), Note that the datasets for 𝑛 = 3, 4, 5 are here constructed using 𝑈 1 , 𝑈 ′ 2 with density 𝜎 = 0.6, 0.3, 0.2, respectively.</figDesc><table><row><cell>Ring</cell><cell>𝑛 = 2, 𝜎 = 1</cell><cell>𝑛 = 3, 𝜎 = 0.6</cell><cell>𝑛 = 4, 𝜎 = 0.3</cell><cell>𝑛 = 5, 𝜎 = 0.2</cell></row><row><cell>ℚ[𝑥 1 , … , 𝑥 𝑛 ]</cell><cell>94.6 / 97.9</cell><cell>96.1 / 98.6</cell><cell>96.2 / 98.6</cell><cell>91.8 / 97.9</cell></row><row><cell>𝔽 7 [𝑥 1 , … , 𝑥 𝑛 ]</cell><cell>66.6 / 76.6</cell><cell>78.8 / 87.6</cell><cell>80.9 / 91.1</cell><cell>83.2 / 91.4</cell></row><row><cell>𝔽 31 [𝑥 1 , … , 𝑥 𝑛 ]</cell><cell>44.7 / 82.7</cell><cell>58.5 / 89.3</cell><cell>73.9 / 93.9</cell><cell>80.0 / 93.4</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research was supported by JST ACT-X Grant Number JPMJAX23C8 and JSPS KAKENHI Grant Number JP22K13901. Yuta Kambe (Mitsubishi Electric Information Technology R&amp;D Center) is not included in the authors due to a technical reason at submission.</p></div>
			</div>


			<div type="availability">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>K. Yokoyama) GLOBE https://hkera.wordpress.com (H. Kera); https://researchmap.jp/yishihara (Y. Ishihara); https://www.unilim.fr/pages_perso/tristan.vaccon/ (T. Vaccon</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Algorithms for Solving Polynomial Systems</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">V</forename><surname>Bard</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>Springer US</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">MQ challenge: hardness evaluation of solving multivariate quadratic problems</title>
		<author>
			<persName><forename type="first">T</forename><surname>Yasuda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Dahan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-J</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Takagi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sakurai</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cryptology ePrint Archive</title>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m">Gröbner Bases in Control Theory and Signal Processing</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Park</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Regensburger</surname></persName>
		</editor>
		<imprint>
			<publisher>De Gruyter</publisher>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Algebraic algorithms for sampling from conditional distributions</title>
		<author>
			<persName><forename type="first">P</forename><surname>Diaconis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sturmfels</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Annals of Statistics</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="363" to="397" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Hibi</surname></persName>
		</author>
		<title level="m">Gröbner bases. Statistics and software systems</title>
				<meeting><address><addrLine>Tokyo</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Gröbner Basis Methods for Minimal Problems in Computer Vision</title>
		<author>
			<persName><forename type="first">H</forename><surname>Stewenius</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
		<respStmt>
			<orgName>Mathematics (Faculty of Engineering)</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Computer algebra in systems biology</title>
		<author>
			<persName><forename type="first">R</forename><surname>Laubenbacher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sturmfels</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">American Mathematical Monthly</title>
		<imprint>
			<biblScope unit="volume">116</biblScope>
			<biblScope unit="page" from="882" to="891" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem nulldimensionalen Polynomideal (An Algorithm for Finding the Basis Elements in the Residue Class Ring Modulo a Zero Dimensional Polynomial Ideal</title>
		<author>
			<persName><forename type="first">B</forename><surname>Buchberger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">of Symbolic Computation, Special Issue on Logic, Mathematics, and Computer Science: Interactions</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">3-4</biblScope>
			<biblScope unit="page" from="475" to="511" />
			<date type="published" when="1965">1965. 2006</date>
		</imprint>
		<respStmt>
			<orgName>Mathematical Institute, University of Innsbruck</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
	<note>English translation in J</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The complexity of the word problems for commutative semigroups and polynomial ideals</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">W</forename><surname>Mayr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Meyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Mathematics</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="page" from="305" to="329" />
			<date type="published" when="1982">1982</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The structure of polynomial ideals and Gröbner bases</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">W</forename><surname>Dubé</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIAM Journal on Computing</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="750" to="773" />
			<date type="published" when="1990">1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A new efficient algorithm for computing Gröbner bases (F4)</title>
		<author>
			<persName><forename type="first">J.-C</forename><surname>Faugère</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Pure and Applied Algebra</title>
		<imprint>
			<biblScope unit="volume">139</biblScope>
			<biblScope unit="page" from="61" to="88" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A new efficient algorithm for computing Gröbner bases without reduction to zero (F5)</title>
		<author>
			<persName><forename type="first">J.-C</forename><surname>Faugère</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation, ISSAC &apos;02</title>
				<meeting>the 2002 International Symposium on Symbolic and Algebraic Computation, ISSAC &apos;02<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="75" to="83" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">M4GB: An efficient Gröbner-basis algorithm</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">H</forename><surname>Makarim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stevens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC&apos;17</title>
				<meeting>the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC&apos;17<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="293" to="300" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Deep learning for symbolic mathematics</title>
		<author>
			<persName><forename type="first">G</forename><surname>Lample</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Charton</surname></persName>
		</author>
		<ptr target="https://openreview.net/forum?id=S1eZYeHFDS" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Neural symbolic regression that scales</title>
		<author>
			<persName><forename type="first">L</forename><surname>Biggio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bendinelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lucchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Parascandolo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 38th International Conference on Machine Learning</title>
				<meeting>the 38th International Conference on Machine Learning</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">139</biblScope>
			<biblScope unit="page" from="936" to="945" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Linear algebra with transformers</title>
		<author>
			<persName><forename type="first">F</forename><surname>Charton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions on Machine Learning Research</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Kera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ishihara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Kambe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vaccon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yokoyama</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2311.12904</idno>
		<title level="m">Learning to compute gröbner bases</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Cox</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Little</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>O'shea</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">Undergraduate Texts in Mathematics</title>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>Springer International Publishing</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Algebraic solution of systems of polynomial equations using Groebner bases</title>
		<author>
			<persName><forename type="first">P</forename><surname>Gianni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mora</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Applied Algebra, Algebraic Algorithms and Error-Correcting Codes</title>
				<meeting><address><addrLine>Berlin Heidelberg; Berlin, Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="1989">1989</date>
			<biblScope unit="page" from="247" to="257" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A modular method to compute the rational univariate representation of zero-dimensional ideals</title>
		<author>
			<persName><forename type="first">M</forename><surname>Noro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Yokoyama</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Symbolic Computation</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="page" from="243" to="263" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Resultant over the residual of a complete intersection</title>
		<author>
			<persName><forename type="first">L</forename><surname>Busé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Elkadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mourrain</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Pure and Applied Algebra</title>
		<imprint>
			<biblScope unit="volume">164</biblScope>
			<biblScope unit="page" from="35" to="57" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Busé</surname></persName>
		</author>
		<title level="m">Étude du résultant sur une variété algébrique</title>
				<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
		<respStmt>
			<orgName>Université Nice Sophia Antipolis</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Theses</note>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<ptr target="https://www.sagemath.org" />
		<title level="m">The Sage Developers, SageMath, the Sage Mathematics Software System (Version 10</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Power</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Burda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Edwards</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Babuschkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Misra</surname></persName>
		</author>
		<idno>arXiv abs/2201.02177</idno>
		<title level="m">Grokking: Generalization beyond overfitting on small algorithmic datasets</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Can transformers learn the greatest common divisor?</title>
		<author>
			<persName><forename type="first">F</forename><surname>Charton</surname></persName>
		</author>
		<idno>arXiv abs/2308.15594</idno>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Gromov</surname></persName>
		</author>
		<idno>arXiv abs/2301.02679</idno>
		<title level="m">Grokking modular arithmetic</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
