<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Greedy Algorithm for Sparse Monotone Regression</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alexey</forename><forename type="middle">R</forename><surname>Faizliev</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Saratov State University</orgName>
								<address>
									<settlement>Saratov</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alexander</forename><forename type="middle">A</forename><surname>Gudkov</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Saratov State University</orgName>
								<address>
									<settlement>Saratov</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sergei</forename><forename type="middle">V</forename><surname>Mironov</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Saratov State University</orgName>
								<address>
									<settlement>Saratov</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mikhail</forename><forename type="middle">A</forename><surname>Levshunov</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Saratov State University</orgName>
								<address>
									<settlement>Saratov</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Greedy Algorithm for Sparse Monotone Regression</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">9C650CC3F5AD233D3E857FB9980ED70B</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T20:35+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>greedy algorithms</term>
					<term>pool-adjacent-violators algorithm</term>
					<term>isotonic regression</term>
					<term>monotone regression</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The problem of constructing the best fitted monotone regression is an NP-hard problem and can be formulated as a convex programming problem with linear constraints. The paper proposes a simple greedy algorithm for finding a sparse monotone regression using a Frank-Wolfe-type approach. A software package for this problem is developed and implemented in R and C++. The proposed method is compared with the well-known pool-adjacent-violators algorithm (PAVA) using simulated data.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Recent years have seen an increasing interest in shape-constrained estimation in statistics <ref type="bibr" target="#b9">[10]</ref>. One such problem is the construction of a monotone regression: finding the best-fitting non-decreasing sequence of points for a given set of points in the plane. A survey of results on monotone regression can be found in the book by Robertson and Dykstra <ref type="bibr" target="#b24">[25]</ref>. The papers of <ref type="bibr">Barlow and Brunk [3]</ref>, Dykstra <ref type="bibr" target="#b15">[16]</ref>, Best and Chakravarti <ref type="bibr" target="#b3">[4]</ref>, and <ref type="bibr">Best [5]</ref> consider the problem of finding a monotone regression in quadratic and convex programming frameworks.</p><p>Using a mathematical programming approach, the works <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b20">21,</ref><ref type="bibr" target="#b30">31]</ref> have recently provided some new results on the topic. The papers <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b16">17]</ref> extend the problem to partial orders defined by the variables of a multiple regression. 
The paper <ref type="bibr" target="#b7">[8]</ref> investigates a dual active-set algorithm for regularized monotonic regression.</p><p>Monotone regression is widely used in mathematical statistics <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b9">10]</ref>; in the smoothing of empirical data <ref type="bibr" target="#b14">[15]</ref>; in shape-preserving approximation <ref type="bibr" target="#b18">[19]</ref>, <ref type="bibr" target="#b25">[26]</ref>, <ref type="bibr" target="#b29">[30]</ref>, <ref type="bibr" target="#b5">[6]</ref>, <ref type="bibr" target="#b26">[27]</ref>, <ref type="bibr" target="#b12">[13]</ref>; and in shape-preserving dynamic programming <ref type="bibr" target="#b8">[9]</ref>.</p><p>When constructing a monotone regression, we assume a relationship between a predictor x = (x_1, . . . , x_n) and a response y = (y_1, . . . , y_n). In the general case it is expected that</p><formula xml:id="formula_0">x_{i+1} − x_i ≠ const, x_i &lt; x_{i+1}, i = 1, . . . , n − 1.</formula><p>The work was supported by RFBR (grant 16-01-00507).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A sequence</head><formula xml:id="formula_1">z = (z_1, . . . , z_n) ∈ R^n is called monotone if z_i − z_{i−1} ≥ 0, i = 2, . . . , n.</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Denote by ∆_1^n</head><p>the set of all vectors from R^n which are monotone. The problem of constructing a monotone regression can be formulated as a convex programming problem with linear constraints as follows: it is necessary to find a vector z ∈ R^n with the lowest mean square error of approximation to the given vector y ∈ R^n under the condition of monotonicity of z:</p><formula xml:id="formula_2">f(z) = (1/n) Σ_{i=1}^{n} (z_i − y_i)² → min_{z ∈ ∆_1^n}.<label>(1)</label></formula><p>In many situations researchers have no information regarding the mathematical specification of the true regression function. Typically, this involves the y_i's being non-decreasing with the ordered x_i's. Such a situation is called isotonic regression. Isotonic (monotone) regression is a special case of k-monotone regression <ref type="bibr" target="#b23">[24]</ref>.</p><p>It is well known that problem (1) is an NP-hard problem <ref type="bibr" target="#b23">[24]</ref>. In this paper we present a simple greedy algorithm which employs a Frank-Wolfe-type approach for finding a sparse monotone regression. A software package for this problem is developed and implemented in R and C++.</p><p>For the convenience of solving problem (1), we move from the points z_i to their increments ζ_i, where</p><formula xml:id="formula_3">ζ_i = z_{i+1} − z_i, i = 1, . . . , n − 1, ζ_0 = z_1.</formula><p>Then the monotonicity of z corresponds to the non-negativity of the ζ_i's (except ζ_0). The proposed method is compared with the well-known pool-adjacent-violators algorithm (PAVA) using simulated data.</p></div>
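The change of variables from z to its increments ζ can be sketched as follows (an illustrative C++ fragment, not the authors' package; the function names are ours):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Map z = (z_1, ..., z_n) to its increments: zeta_0 = z_1, zeta_i = z_{i+1} - z_i.
std::vector<double> to_increments(const std::vector<double>& z) {
    std::vector<double> zeta(z.size());
    if (z.empty()) return zeta;
    zeta[0] = z[0];
    for (std::size_t i = 1; i < z.size(); ++i)
        zeta[i] = z[i] - z[i - 1];
    return zeta;
}

// Inverse map: z_i is the cumulative sum zeta_0 + ... + zeta_{i-1}.
std::vector<double> from_increments(const std::vector<double>& zeta) {
    std::vector<double> z(zeta.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < zeta.size(); ++i) {
        acc += zeta[i];
        z[i] = acc;
    }
    return z;
}
```

In this parameterization z is monotone exactly when ζ_1, . . . , ζ_{n−1} are non-negative, so the order constraints of problem (1) become simple sign constraints.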
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Algorithms for monotone regression</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">PAVA</head><p>A simple iterative algorithm for solving problem (1) is the Pool-Adjacent-Violators Algorithm (PAVA) <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b23">24]</ref>. The work <ref type="bibr" target="#b3">[4]</ref> examined a generalization of this algorithm. The paper <ref type="bibr" target="#b31">[32]</ref> studied this problem as the problem of identifying the active set and proposed a direct algorithm of the same complexity as PAVA (the dual algorithm).</p><p>PAVA computes a non-decreasing sequence of values z = (z_i)_{i=1}^{n} such that problem (1) is solved. In the simple monotone regression case we have the measurement pairs (x_i, y_i). Let us assume that these pairs are ordered with respect to the predictors. The following (Algorithm 1) is a pseudocode of PAVA for the problem. The generalized pool-adjacent-violators algorithm (GPAVA), a strict generalization of PAVA, was developed in the article <ref type="bibr" target="#b32">[33]</ref>.</p><p>The block values are expanded with respect to the observations i = 1, . . . , n such that the final result is the vector z of length n with elements z_i in increasing order <ref type="bibr" target="#b23">[24]</ref>. </p><formula xml:id="formula_4">z_r^{(l+1)} := s(z_r^{(l)})</formula><p>, where the solver s is the conditional (weighted) mean or (weighted) quantile;</p><formula xml:id="formula_5">• If z_{r+1}^{(l)} ≤ z_r^{(l)} then l := l + 1; until the z-blocks are increasing, i.e. z_{r+1}^{(l)} ≥ z_r^{(l)}</formula><p>for all r; • Return z; end</p></div>
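For concreteness, the pooling step of Algorithm 1 with the least-squares solver (the block mean) can be sketched in C++ as follows; this is an illustrative stack-based variant of PAVA, not the authors' implementation:

```cpp
#include <cstddef>
#include <vector>

// Pool-Adjacent-Violators with the mean solver: returns the non-decreasing z
// minimizing sum_i (z_i - y_i)^2 for observations y ordered by the predictor.
std::vector<double> pava(const std::vector<double>& y) {
    std::vector<double> mean;      // mean value of each block
    std::vector<std::size_t> len;  // number of observations in each block
    for (std::size_t i = 0; i < y.size(); ++i) {
        mean.push_back(y[i]);
        len.push_back(1);
        // Adjacent pooling: merge the last two blocks while their means violate order.
        while (mean.size() > 1 && mean[mean.size() - 2] > mean.back()) {
            std::size_t a = mean.size() - 2, b = mean.size() - 1;
            mean[a] = (mean[a] * len[a] + mean[b] * len[b]) / (len[a] + len[b]);
            len[a] += len[b];
            mean.pop_back();
            len.pop_back();
        }
    }
    // Expand the block means back to a vector of length n.
    std::vector<double> z;
    for (std::size_t b = 0; b < mean.size(); ++b)
        z.insert(z.end(), len[b], mean[b]);
    return z;
}
```

Each observation is pushed once and merged at most once, which gives the linear running time usually quoted for PAVA.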
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Frank-Wolfe type greedy algorithm</head><p>The Frank-Wolfe method (also called the conditional gradient method) solves constrained convex optimization problems in finite-dimensional vector spaces. The method was introduced in 1956. The original algorithm did not use a fixed step size and requires solving a linear programming subproblem at each iteration. The Frank-Wolfe method was further developed by Levitin and Polyak in 1966, and V.F. Demianov and A.M. Rubinov generalized it to the case of arbitrary Banach spaces in 1970 <ref type="bibr" target="#b13">[14]</ref>. Recently, Frank-Wolfe-type methods have attracted increased interest due to the possibility of obtaining sparse solutions, as well as their good scalability <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b22">23]</ref>. In particular, <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b33">34]</ref> studied algorithms for solving problems with penalty functions (instead of considering constrained optimization problems). Besides, the paper <ref type="bibr" target="#b33">[34]</ref> interlaces boosting with fixed-rank local optimization.</p><p>As mentioned above, for the computational convenience of problem (1), we moved from the points z_i to the increments</p><formula xml:id="formula_6">ζ_i = z_{i+1} − z_i, i = 1, . . . , n − 1, ζ_0 = z_1.</formula><p>Then problem (1) can be rewritten as follows:</p><formula xml:id="formula_7">g(ζ) := (1/n) Σ_{i=1}^{n} ( Σ_{j=0}^{i−1} ζ_j − y_i )² → min_{ζ ∈ S},<label>(2)</label></formula><p>where S denotes the set of all</p><formula xml:id="formula_8">ζ = (ζ_0, ζ_1, . . . , ζ_{n−1}) ∈ R^n such that ζ_0 ∈ R, (ζ_1, . . . , ζ_{n−1}) ∈ R_+^{n−1} and Σ_{j=0}^{n−1} ζ_j ≤ max_i y_i. Let ∇g(ζ) denote the gradient of the function g at the point ζ.</formula><p>It should be noted that for larger-scale problems the solution can become computationally quite challenging. 
In this regard, the present study proposes to use a greedy algorithm of Frank-Wolfe type for solving this problem.</p><p>The following (Algorithm 2) is a pseudocode of the Frank-Wolfe-type algorithm for problem (2).</p><p>The rate of convergence is estimated according to the following theorem.</p><p>Algorithm 2: Greedy algorithm for sparse monotone regression </p><formula xml:id="formula_9">g(ζ_t) − g* ≤ 4 √( n(n + 1)(2n + 1) / (6n²) ) (max_i y_i − min_i y_i)² / (t + 2),<label>(3)</label></formula><p>where g* is the optimal value of (2).</p><p>Proof. It is known <ref type="bibr" target="#b17">[18]</ref> that for all t ≥ 2:</p><formula xml:id="formula_11">g(ζ_t) − g* ≤ 2L (Diam(S))² / (t + 2),</formula><p>where L is the Lipschitz constant of ∇g and Diam(S) is the diameter of S. Let</p><formula xml:id="formula_12">∇²g(ζ) := ( ∂²g/∂ζ_0², ∂²g/∂ζ_1², . . . , ∂²g/∂ζ_{n−1}² ).</formula><p>It is well known that if ∇g is differentiable then its Lipschitz constant L satisfies the inequality</p><formula xml:id="formula_13">L ≤ sup_ζ ‖∇²g(ζ)‖_2. Then L ≤ sup_ζ √( Σ_{k=0}^{n−1} (∂²g/∂ζ_k²)² ) = (1/n) √( Σ_{k=1}^{n} (2(n − k + 1))² ) = (2/n) √( Σ_{k=1}^{n} k² ) = 2 √( n(n + 1)(2n + 1) / (6n²) ).<label>(4)</label></formula><p>It is also easy to prove that Diam(S) = √2 (max_i y_i − min_i y_i).</p><p>The disadvantage of this method is the dependence of the theoretical rate of convergence on the dimensionality of the problem. The papers <ref type="bibr" target="#b27">[28]</ref>, <ref type="bibr" target="#b19">[20]</ref>, <ref type="bibr" target="#b28">[29]</ref> suggest using the duality gap as a stopping criterion for Frank-Wolfe-type algorithms.</p></div>
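A minimal C++ sketch of Algorithm 2 is given below. It assumes, for simplicity, that y_i ≥ 0 so that the constraint ζ_0 ≥ 0 can be imposed as well; the feasible set then becomes the scaled simplex {ζ ≥ 0, Σ_j ζ_j ≤ max_i y_i}, whose vertices are the origin and the points (max_i y_i)·e_k, so the linear minimization step reduces to picking the coordinate with the most negative partial derivative. This is our reading of the algorithm, not the authors' code:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Frank-Wolfe-type greedy sketch for problem (2), assuming y_i >= 0 so that
// zeta_0 >= 0 may be imposed; the feasible set is then the scaled simplex
// { zeta >= 0, sum_j zeta_j <= ymax }, with vertices 0 and ymax * e_k.
std::vector<double> greedy_monotone(const std::vector<double>& y, int iters) {
    const std::size_t n = y.size();
    const double ymax = *std::max_element(y.begin(), y.end());
    std::vector<double> zeta(n, 0.0);
    for (int t = 0; t < iters; ++t) {
        // Residuals r_i = z_i - y_i, where z = cumulative sums of zeta.
        std::vector<double> r(n), grad(n);
        double acc = 0.0;
        for (std::size_t i = 0; i < n; ++i) { acc += zeta[i]; r[i] = acc - y[i]; }
        // grad[k] = (2/n) * sum_{i >= k} r_i, computed via suffix sums.
        double suffix = 0.0;
        for (std::size_t i = n; i-- > 0; ) { suffix += r[i]; grad[i] = 2.0 * suffix / n; }
        // Linear minimization oracle: the best vertex of the simplex.
        const std::size_t k = static_cast<std::size_t>(
            std::min_element(grad.begin(), grad.end()) - grad.begin());
        std::vector<double> vertex(n, 0.0);
        if (grad[k] < 0.0) vertex[k] = ymax;   // otherwise the origin is optimal
        const double alpha = 2.0 / (t + 2.0);  // step-size rule of Theorem 1
        for (std::size_t i = 0; i < n; ++i)
            zeta[i] += alpha * (vertex[i] - zeta[i]);
    }
    // Recover the monotone sequence z from the increments.
    std::vector<double> z(n);
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) { acc += zeta[i]; z[i] = acc; }
    return z;
}
```

Each iteration moves toward a single vertex, so at most one new nonzero increment appears per step; stopping after N iterations therefore caps the cardinality of the solution at N, which is the source of sparsity discussed below.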
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Empirical Result</head><p>The algorithms have been implemented both in R and C++. We compared the performance of the greedy algorithm (Algorithm 2) with the performance of PAVA (Algorithm 1) using simulated data sets.</p><p>It should be noted that PAVA is significantly faster for small-scale tasks in R. However, once the number of points exceeds roughly 2000, the greedy algorithm spends less time searching for a solution (Fig. <ref type="figure" target="#fig_0">1</ref>).</p><p>Tables 1 and 2 present empirical results for the PAVA and greedy algorithms on a simulated set of points. The simulated points are obtained as the values of the logarithm function with added normally distributed noise: A = {(x_i, y_i) : y_i = ln(x_0 + iΔx) + ϕ_i, ϕ_i ∼ N(0, 1), x_0 = 1, Δx = 1, i = 1, . . . , 10000}. The dimension of the problem is 10000 points. The tables contain information on the errors</p><formula xml:id="formula_16">(1/n) Σ_{i=1}^{n} (z_i − y_i)²,</formula><p>elapsed time, cardinality and the greedy algorithm's iteration number.</p><p>The results show that the error of the greedy algorithm approaches the error of PAVA as the number of iterations of the greedy algorithm increases. While PAVA is better than the greedy algorithm in terms of error, the greedy algorithm's solutions are sparser. It should be noted that the elapsed time for PAVA implemented in C++ is smaller than for the greedy algorithm. However, the greedy algorithm converges faster when the number of iterations is less than 700 for the implementations in R. Both algorithms obtain sparse solutions, but in the greedy algorithm, as opposed to PAVA, we can control the number of nonzero elements (cardinality). Generally, the greedy algorithm's cardinality increases by one at each iteration. 
Consequently, we should limit the number of iterations to obtain a sparser solution.</p><p>Figure <ref type="figure">2</ref> shows simulated points (N = 100) with logarithmic structure and the isotonic regressions, where the green line represents the greedy algorithm's isotonic regression. Table <ref type="table">1</ref> compares PAVA and the greedy algorithm (Greedy) on the simulated data (implementation in language C++):</p><formula xml:id="formula_17">A = {(x_i, y_i) : y_i = ln(x_0 + iΔx) + ϕ_i, ϕ_i ∼ N(0, 1), x_0 = 1, Δx = 1, i = 1, . . . , 10000}.</formula><p>The obtained empirical results for the greedy algorithm show that the rate of convergence on the considered examples is much higher than the theoretical estimates obtained in Theorem 1 suggest. </p></div>
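The simulated data set A and the reported metrics (error and cardinality) can be reproduced along the following lines (a sketch; the random seed and function names are ours):

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Simulate A: y_i = ln(x0 + i * dx) + phi_i, phi_i ~ N(0, 1).
std::vector<double> simulate(std::size_t n, double x0 = 1.0, double dx = 1.0,
                             unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> phi(0.0, 1.0);
    std::vector<double> y(n);
    for (std::size_t i = 0; i < n; ++i)
        y[i] = std::log(x0 + static_cast<double>(i + 1) * dx) + phi(gen);
    return y;
}

// The error metric reported in the tables: (1/n) * sum_i (z_i - y_i)^2.
double error(const std::vector<double>& z, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < z.size(); ++i)
        s += (z[i] - y[i]) * (z[i] - y[i]);
    return s / static_cast<double>(z.size());
}

// Cardinality: the number of nonzero entries of the increment vector.
std::size_t cardinality(const std::vector<double>& zeta, double tol = 1e-12) {
    std::size_t c = 0;
    for (double v : zeta)
        if (std::abs(v) > tol) ++c;
    return c;
}
```

With n = 10000 these helpers reproduce the setting of Tables 1 and 2, up to the realization of the noise.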
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>Our research proposes an algorithm for solving the problem of constructing the best fitted monotone regression by using the Frank-Wolfe method. The software was implemented in R and C++. We compared the performance of the greedy algorithm with the performance of PAVA using simulated data sets. While PAVA gives slightly smaller errors than the greedy algorithm, the greedy algorithm obtains significantly sparser solutions. The advantages of the greedy algorithm are its simplicity of implementation, the ability to control cardinality, and a lower elapsed time for the implementation in R on problems of large dimension.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Algorithm 1:</head><label>1</label><figDesc>Pool-Adjacent-Violators Algorithm (PAVA) begin • Let z_j^{(0)} := y_j be the start point, l = 0; • The index for the blocks is r = 1, . . . , B, where at step l = 0 we set B := n, i.e. each observation z_r^{(0)} forms a block; repeat • (Adjacent pooling) Merge values of z^{(l)} into blocks if z Solve f(z) for each block r, i.e., compute the update based on the solver, which gives z_r^{(l+1)}</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. The dependence of the CPU time on dimension of the problem of the greedy algorithm (green line, 100 iterations) and PAVA (red line) implemented in R.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Algorithm 2: Greedy algorithm for sparse monotone regression. begin • Let N be the number of iterations; the function g(ζ) and the feasible set S were defined above. Let ∇g(ζ) = (∂g/∂ζ_0, ∂g/∂ζ_1, . . . , ∂g/∂ζ_{n−1}) be the gradient of the function g at a point ζ, where ∂g/∂ζ_k = (2/n) Σ_{i=k}^{n} ( Σ_{j=0}^{i−1} ζ_j − y_i ), k = 0, . . . , n − 1; • Let the zero vector ζ_0 = (0, . . . , 0) be the start point, and let the counter t = 0; • while t &lt; N do • Calculate ∇g(ζ_t), the gradient of the function g at the point ζ_t; • Let ζ̃_t be the solution of the linear optimization problem ⟨∇g(ζ_t), ζ⟩ → min_{ζ ∈ S}, where ⟨∇g(ζ_t), ζ⟩ is a scalar product of vectors; • (Update step) Let ζ_{t+1} = ζ_t + α_t(ζ̃_t − ζ_t), α_t = 2/(t + 2), and then t := t + 1; • Recover the monotone sequence z = (z_1, . . . , z_n) from the vector of increments ζ_N; end</figDesc><note>Theorem 1. Let {ζ_t} be generated according to the Frank-Wolfe method (Algorithm 2) using the step-size rule α_t = 2/(t + 2). Then for all t ≥ 2: g(ζ_t) − g* ≤ 4 √( n(n + 1)(2n + 1) / (6n²) ) (max_i y_i − min_i y_i)² / (t + 2).</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 .</head><label>2</label><figDesc>Comparison of algorithms PAVA and greedy algorithm (Greedy) on an example of the simulated data (implementation in language R): A = {(x_i, y_i)}</figDesc><table><row><cell>Algorithm (number of iterations)</cell><cell>Error</cell><cell>Cardinality</cell><cell>Time</cell></row><row><cell>PAVA</cell><cell>0.994</cell><cell>82</cell><cell>4.28</cell></row><row><cell>Greedy (10)</cell><cell>1.169</cell><cell>6</cell><cell>0.09</cell></row><row><cell>Greedy (50)</cell><cell>1.011</cell><cell>30</cell><cell>0.33</cell></row><row><cell>Greedy (100)</cell><cell>0.999</cell><cell>41</cell><cell>0.63</cell></row><row><cell>Greedy (200)</cell><cell>0.996</cell><cell>57</cell><cell>1.23</cell></row><row><cell>Greedy (500)</cell><cell>0.995</cell><cell>76</cell><cell>3.08</cell></row><row><cell>Greedy (1000)</cell><cell>0.994</cell><cell>79</cell><cell>6.28</cell></row><row><cell>Greedy (2000)</cell><cell>0.994</cell><cell>82</cell><cell>12.67</cell></row><row><cell>Greedy (5000)</cell><cell>0.994</cell><cell>82</cell><cell>31.58</cell></row><row><cell>Greedy (10000)</cell><cell>0.994</cell><cell>82</cell><cell>60.91</cell></row></table><note>Fig. 2. Step functions obtained by the greedy algorithm (ε = 0.753) and PAVA (ε = 0.751).</note></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A fast scaling algorithm for minimizing separable convex functions subject to chain constraints</title>
		<author>
			<persName><forename type="first">R</forename><surname>Ahuja</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Orlin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Operations Research</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="784" to="789" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Efficient algorithms for non-convex isotonic regression through submodular optimization</title>
		<author>
			<persName><forename type="first">F</forename><surname>Bach</surname></persName>
		</author>
		<idno>hal-01569934</idno>
		<ptr target="https://hal.archives-ouvertes.fr/hal-01569934" />
		<imprint>
			<date type="published" when="2017-07">Jul 2017</date>
		</imprint>
	</monogr>
	<note type="report_type">Tech. Rep.</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The isotonic regression problem and its dual</title>
		<author>
			<persName><forename type="first">R</forename><surname>Barlow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Brunk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the American Statistical Association</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="140" to="147" />
			<date type="published" when="1972">1972</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Active set algorithms for isotonic regression: a unifying framework</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Best</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chakravarti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Mathematical Programming: Series A and B</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="425" to="439" />
			<date type="published" when="1990">1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Minimizing separable convex functions subject to simple chain constraints</title>
		<author>
			<persName><forename type="first">M</forename><surname>Best</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chakravarti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ubhaya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIAM Journal on Optimization</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="658" to="672" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Linear approximation method preserving kmonotonicity</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">I</forename><surname>Boytsov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Sidorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Siberian electronic mathematical reports</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="21" to="27" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A generalised PAV algorithm for monotonic regression in several variables</title>
		<author>
			<persName><forename type="first">O</forename><surname>Burdakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Grimvall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hussian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">COMPSTAT, Proceedings of the 16th Symposium in Computational Statistics</title>
				<editor>
			<persName><surname>Antoch</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="761" to="767" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A dual active-set algorithm for regularized monotonic regression</title>
		<author>
			<persName><forename type="first">O</forename><surname>Burdakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Sysoev</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Optimization Theory and Applications</title>
		<imprint>
			<biblScope unit="volume">172</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="929" to="949" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Shape-preserving dynamic programming</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">L</forename><surname>Judd</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Math. Meth. Oper. Res</title>
		<imprint>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="page" from="407" to="421" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Aspects of Shape-constrained Estimation in Statistics</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Chen</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Polynomial algorithms for isotonic regression</title>
		<author>
			<persName><forename type="first">V</forename><surname>Chepoi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cogneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Fichet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lecture Notes-Monograph Series</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="147" to="160" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Sparse greedy approximation, and the Frank-Wolfe algorithm</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">L</forename><surname>Clarkson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Algorithms</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="1" to="30" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Piecewise convex-concave approximation in the minimax norm</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Cullinan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Abstracts of Conference on Approximation and Optimization: Algorithms, Complexity, and Applications</title>
				<editor>
			<persName><forename type="first">I</forename><surname>Demetriou</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Pardalos</surname></persName>
		</editor>
		<meeting><address><addrLine>Athens, Greece</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017-06">June 29-30, 2017</date>
			<biblScope unit="page">4</biblScope>
		</imprint>
		<respStmt>
			<orgName>National and Kapodistrian University of Athens</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Approximate Methods in Optimization Problems</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">F</forename><surname>Demyanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Rubinov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Modern Analytic and Computational Methods in Science and Mathematics</title>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Case-control isotonic regression for investigation of elevation in risk around a point source</title>
		<author>
			<persName><forename type="first">Peter</forename><surname>Diggle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Tony</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Statistics in medicine</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1605" to="1613" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">An isotonic regression algorithm</title>
		<author>
			<persName><forename type="first">R</forename><surname>Dykstra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Statistical Planning and Inference</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="355" to="363" />
			<date type="published" when="1981">1981</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">An algorithm for isotonic regression for two or more independent variables</title>
		<author>
			<persName><forename type="first">R</forename><surname>Dykstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Robertson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Annals of Statistics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="708" to="719" />
			<date type="published" when="1982">1982</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">An algorithm for quadratic programming</title>
		<author>
			<persName><forename type="first">M</forename><surname>Frank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wolfe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Naval Research Logistics Quarterly</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="95" to="110" />
			<date type="published" when="1956">1956</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Shape-Preserving Approximation by Real and Complex Polynomials</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>Gal</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2008">2008</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">On the convergence of a greedy algorithm for the solution of the problem for the construction of monotone regression</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Gudkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Mironov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Faizliev</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Izv. Saratov Univ. (N. S.)</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="431" to="440" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>Ser. Math. Mech. Inform.</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Algorithms and error estimations for monotone regression on partially preordered sets</title>
		<author>
			<persName><forename type="first">J</forename><surname>Hansohm</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Multivariate Analysis</title>
		<imprint>
			<biblScope unit="volume">98</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1043" to="1050" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Conditional gradient algorithms for norm-regularized smooth convex optimization</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Harchaoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Juditsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nemirovski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Mathematical Programming: Series A and B</title>
		<imprint>
			<biblScope unit="volume">152</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="75" to="112" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Sparse Convex Optimization Methods for Machine Learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Jaggi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
		<respStmt>
			<orgName>ETH Zürich</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Isotone optimization in R: Pool-adjacent-violators algorithm (PAVA) and active set methods</title>
		<author>
			<persName><forename type="first">J</forename><surname>De Leeuw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Hornik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mair</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Statistical Software</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1" to="24" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">Order Restricted Statistical Inference</title>
		<author>
			<persName><forename type="first">T</forename><surname>Robertson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wright</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dykstra</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1988">1988</date>
			<publisher>John Wiley &amp; Sons</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title level="m" type="main">Local approximation by splines</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">T</forename><surname>Shevaldin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
			<publisher>UrO RAN</publisher>
			<pubPlace>Ekaterinburg</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Linear k-monotonicity preserving algorithms and their approximation properties</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Sidorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LNCS</title>
		<imprint>
			<biblScope unit="volume">9582</biblScope>
			<biblScope unit="page" from="93" to="106" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Duality gap analysis of weak relaxed greedy algorithms</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Mironov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LNCS</title>
		<imprint>
			<biblScope unit="volume">10556</biblScope>
			<biblScope unit="page" from="251" to="262" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Dual convergence estimates for a family of greedy algorithms in Banach spaces</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">P</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Mironov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">G</forename><surname>Pleshakov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LNCS</title>
		<imprint>
			<biblScope unit="volume">10710</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note>in press</note>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">On the saturation effect for linear shape-preserving approximation in Sobolev spaces</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sidorov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Miskolc Mathematical Notes</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1191" to="1197" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">An algorithm for isotonic regression with arbitrary convex distance function</title>
		<author>
			<persName><forename type="first">U</forename><surname>Stromberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Statistics &amp; Data Analysis</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="205" to="219" />
			<date type="published" when="1991">1991</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Isotonic regression: Another look at the changepoint problem</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">B</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Woodroofe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Mentz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biometrika</title>
		<imprint>
			<biblScope unit="volume">88</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="793" to="804" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Exact algorithms for isotonic regression and related</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">L</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Xing</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Physics: Conference Series</title>
		<imprint>
			<biblScope unit="volume">699</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="9" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Accelerated training for matrix-norm regularization: A boosting approach</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schuurmans</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NIPS&apos;12 Proceedings of the 25th International Conference on Neural Information Processing Systems</title>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="2906" to="2914" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
