Studying Mixed Normalization Strategies of Lambda Terms

Vladyslav Shramenko, Victoriya Kuznietcova, Grygoriy Zholtkevych
V.N. Karazin Kharkiv National University, Svobody sq., 4, Kharkiv, 61022, Ukraine

Abstract
In this work, we introduce a new approach to normalizing lambda terms. We define and study two randomized reduction strategies and compare their efficiency with that of the normal and applicative strategies on uniformly randomly generated lambda terms. In addition, we develop a framework for finding a randomized reduction strategy that is almost surely normalizing and whose efficiency is no worse than that of the existing reduction strategies on a given set of lambda terms.

Keywords
lambda calculus, reduction strategies, uniformly random reduction strategy, mixed reduction strategy, functional programming

1. Introduction
Functional programming has its roots in academia, evolving from the lambda calculus, a formal system of computation based only on functions [1]. Functional programming languages use the lambda calculus as a low-level programming language. Functional programming is widely used to solve applied problems of artificial intelligence, such as parallel processing of large data streams [2], modeling of large data arrays [3], big data analysis [4], working with machine learning models [5], decentralized systems and blockchain [6], advanced applications with interactive interfaces [7], and more. Therefore, with the development of high-tech information technologies, functional programming is becoming relevant not only for specialized problems but also for classical projects.
Compilers of functional programming languages perform a reduction process to obtain normal forms of lambda terms. Reduction strategies differ in the efficiency of this normalization process, and for large programs compilation can take a long time. There is therefore a desire to optimize the process of normalizing lambda terms.
The object of the study is the process of reducing lambda terms to obtain their normal forms. The subject of the study is lambda term reduction strategies. The aims of the study:
• to develop more sophisticated reduction strategies for normalizing lambda terms;
• to study the impact of randomness on the process of normalizing randomly generated lambda terms;
• to develop a framework that searches for a reduction strategy that finds the normal forms of a given class of lambda terms as quickly as possible.
To achieve these aims, the following tasks were formulated:
• to analyze the existing reduction strategies for lambda terms, their advantages and disadvantages;
• to define a randomized reduction strategy;
• to define a uniformly random reduction strategy;
• to define a mixed reduction strategy;
• to implement the reduction strategies as program code;
• to find the best reduction strategy for normalizing randomly generated lambda terms;
• to conduct simulation experiments to compare the reduction strategies.

2. Background

2.1. Reduction Process in Lambda Calculus
Functional languages are closely related to the lambda calculus and may be divided into two classes: strict and non-strict. Strict languages typically employ a call-by-value, left-to-right evaluation strategy for function arguments. Non-strict languages employ a graph-based form of the call-by-name evaluation strategy, in which, once the value of an argument has been calculated, this value replaces the expression representing the argument. This ensures that an argument is reduced at most once in the evaluation of a function call. Each of these methods of evaluation has an associated cost [8].
Functional languages perform no reduction under abstractions and thus reduce terms to weak normal forms only. In particular, call-by-value reduces to weak normal form, and call-by-name reduces to weak head normal form. These normal forms are defined in [9].
This section contains basic definitions of some well-known concepts related to the lambda calculus [10].
Definition 1. Let V = {v1, v2, …} be a set of variables. The lambda calculus is the language of lambda terms defined by the following grammar:
M, N → var | (M N) | (λ var. M), (1)
where var is a metavariable referring to elements of V, and M, N are lambda terms. We denote the set of lambda terms by Λ.
Definition 2. A lambda term K is called a β-reducible term, or β-redex, if K contains a subterm of the form
(λ var. M) N, (2)
where M, N ∈ Λ and var ∈ V. We denote the set of β-reducible lambda terms by Λβ ⊆ Λ.
Definition 3. We write K{M} for a lambda term K ∈ Λ that contains M ∈ Λ as a subterm. The relation of β-reduction →β ⊆ Λ × Λ is defined as
→β = {(K{(λx. M) N}, K{M[x := N]}) | K ∈ Λβ, M, N ∈ Λ}, (3)
where M[x := N] denotes the substitution of N for the free occurrences of x in M. We denote by ↠β the reflexive and transitive closure of →β.
Definition 4. A lambda term K is in normal form if K does not contain any β-redex, i.e. K ∉ Λβ. We denote by ΛN ⊆ Λ the set of lambda terms that have a normal form.
Definition 5. A reduction strategy R is a mapping Λ → Λ such that
∀M ∈ Λ: M ↠β R(M). (4)
We denote the mapping R by →R, and by ↠R the reflexive and transitive closure of →R.
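To make Definitions 1–5 concrete, the following minimal Python sketch (an illustration of ours, not the implementation used in the experiments) represents lambda terms as trees and contracts a single β-redex at a chosen position. For brevity it uses naive substitution and assumes bound variable names are chosen so that no variable capture can occur.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Lambda terms following grammar (1): variables, applications, abstractions.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"

Term = Union[Var, App, Lam]

def is_redex(t: Term) -> bool:
    """A β-redex has the form (λ var. M) N, cf. (2)."""
    return isinstance(t, App) and isinstance(t.fun, Lam)

def redex_paths(t: Term, path: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Enumerate paths to all redexes in leftmost-outermost (pre-order) order."""
    paths = [path] if is_redex(t) else []
    if isinstance(t, App):
        paths += redex_paths(t.fun, path + ("fun",))
        paths += redex_paths(t.arg, path + ("arg",))
    elif isinstance(t, Lam):
        paths += redex_paths(t.body, path + ("body",))
    return paths

def subst(t: Term, x: str, n: Term) -> Term:
    """Naive substitution t[x := n]; assumes no variable capture can occur."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, n), subst(t.arg, x, n))
    return t if t.var == x else Lam(t.var, subst(t.body, x, n))

def reduce_at(t: Term, path: Tuple[str, ...]) -> Term:
    """Contract the redex at the given path, cf. the β-reduction relation (3)."""
    if not path:
        return subst(t.fun.body, t.fun.var, t.arg)
    if isinstance(t, App) and path[0] == "fun":
        return App(reduce_at(t.fun, path[1:]), t.arg)
    if isinstance(t, App):
        return App(t.fun, reduce_at(t.arg, path[1:]))
    return Lam(t.var, reduce_at(t.body, path[1:]))
```

With these helpers, a strategy is simply a rule for picking one element of redex_paths(t) and applying reduce_at to it; the strategies of the next subsection differ only in that rule.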
2.2. Classification of Reduction Strategies
A given term can contain several redexes, hence several different β-reductions could be applied. A reduction strategy chooses between the different ways of reducing a term. Each strategy has its advantages and disadvantages and its own computational efficiency. Moreover, the efficiency of a reduction strategy depends on the lambda term: for some terms one strategy turns out to be more efficient, and for others, another. The normalization process can consist of a different number of steps, depending on the chosen strategy. The main types of reduction strategies are defined below [11].
Definition 6. A reduction strategy R is called a one-step strategy if
∀M ∈ Λ: M →β R(M). (5)
Otherwise, it is a many-step strategy.
Definition 7. A reduction strategy R is called a deterministic strategy if
∀M ∈ Λ ∃! N ∈ Λ: M →R N. (6)
Otherwise, it is a nondeterministic strategy.
Definition 8. A reduction strategy R is called a normalizing strategy if, whenever a lambda term has a normal form, R eventually reaches it:
∀M ∈ Λ, M =β N, N ∉ Λβ: M ↠R N. (7)
Definition 9. A reduction strategy R is called efficient if, for every M ∈ Λ, the term N with M →R N can be computed in polynomial time.
One-step strategies for term reduction include [12]:
• leftmost-innermost (LI): in each step the leftmost of the innermost redexes is reduced, where an innermost redex is a redex not containing any other redexes;
• leftmost-outermost (LO): in each step the leftmost of the outermost redexes is reduced, where an outermost redex is a redex not contained in any other redex;
• rightmost-innermost (RI): in each step the rightmost of the innermost redexes is reduced;
• rightmost-outermost (RO): in each step the rightmost of the outermost redexes is reduced.
Many-step reduction strategies include [12]:
• parallel-innermost: reduces all innermost redexes simultaneously; this is well defined because the redexes are pairwise disjoint;
• parallel-outermost: reduces all outermost redexes simultaneously; this is well defined because the redexes are pairwise disjoint;
• Gross-Knuth reduction, also called full substitution or Kleene reduction: all redexes in the term are reduced simultaneously.
The reduction strategies most commonly used in compilers for functional programming languages are LO (normal order) and RI (applicative order). The normal order reduction strategy is normalizing, which ensures that the normal form is obtained in a finite number of reductions whenever it exists. In addition, the LO strategy avoids useless work and reduces needed redexes only. However, it may reduce the same redex multiple times, duplicating work. For compilers of functional programming languages, the LO strategy therefore turns out to be suboptimal in terms of the number of reduction steps.
The applicative order reduction strategy is generally considered more efficient than the LO strategy for compiling functional programs, because programs often need to copy their arguments [13]. The RI strategy thus avoids duplicating work and does not reduce the same redex twice. However, the RI strategy may reduce redexes that are not needed for the resulting reduced lambda term. Also, in contrast to normal order, applicative order reduction may not terminate even when the term has a normal form.
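The contrast between LO and RI can be seen on a classical example: the term (λx. y) Ω, where Ω = (λx. x x)(λx. x x). The sketch below, reusing the Term helpers from Section 2.1 (lo_path, ri_path, and normalize are our illustrative names), shows LO reaching the normal form y in one step, while RI loops forever inside the discarded argument.

```python
# Uses Var, App, Lam, redex_paths, reduce_at from the sketch in Section 2.1.
D = Lam("x", App(Var("x"), Var("x")))
OMEGA = App(D, D)                      # Ω reduces only to itself: Ω →β Ω
T = App(Lam("x", Var("y")), OMEGA)     # (λx. y) Ω has the normal form y

def lo_path(paths):
    """Leftmost-outermost redex: the first one in pre-order."""
    return paths[0]

def ri_path(paths):
    """Rightmost-innermost redex: the last redex containing no other redex."""
    innermost = [p for p in paths
                 if not any(q != p and q[:len(p)] == p for q in paths)]
    return innermost[-1]

def normalize(t, pick, max_steps=50):
    """Reduce until no redex remains or the step budget runs out."""
    for step in range(max_steps):
        paths = redex_paths(t)
        if not paths:
            return t, step           # normal form reached in `step` steps
        t = reduce_at(t, pick(paths))
    return None, max_steps           # treated as divergent

print(normalize(T, lo_path))   # (Var(name='y'), 1): LO discards the useless Ω
print(normalize(T, ri_path))   # (None, 50): RI keeps reducing Ω in the argument
```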
2.3. The Problem of an Efficient Optimal Normalizing Strategy
Definition 10. Let F be a reduction strategy. We denote by L_F(M) the number of reduction steps needed to obtain the normal form of the term M using the strategy F.
Definition 11. Let F, G be reduction strategies. F ≤L G if
∀M ∈ Λ: L_F(M) ≤ L_G(M). (8)
Definition 12. A reduction strategy F is called L-optimal if F is a normalizing strategy and F ≤L R for every normalizing strategy R.
There are many different reduction strategies, and the question arises which one to choose in order to obtain the normal form of a lambda term as quickly as possible. The problem of choosing a redex leading to a reduction sequence of minimal length is undecidable for the lambda calculus. It has been proved that an L-optimal strategy exists, but that no L-optimal strategy is efficient [11].

2.4. Efficiency Criterion of Reduction Strategies
We measure the efficiency of a reduction strategy R as the expected value of a random variable whose values are the outcomes of the L_R function on uniformly randomly generated lambda terms of the same size.
We use the Boltzmann model to generate random lambda terms uniformly. A Boltzmann generator for the class Λ is built according to a recursive specification of the class Λ. The recurrence relation for S_{m,n} (the number of lambda terms of size n with at most m distinct free variables) allows us to define the function that generates lambda terms. More precisely, we construct bijections, called unranking functions, between the non-negative integers not greater than S_{m,n} and the binary lambda terms of size n with at most m distinct free variables. This approach is known as the recursive method, originating with Nijenhuis and Wilf [14].
Using the algorithm for the uniform generation of random lambda terms described in [14], we generated lambda terms whose tree representations have between 50 and 60 vertices. From the generated terms, we selected 100 terms that have a normal form. We determine the existence of a normal form using the normal reduction strategy: we reduce the lambda term and count the number of reduction steps, and if the reduction process does not finish within a limited number of steps, we assume that the term has no normal form. We denote the set of selected lambda terms by S.
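The selection of S and the efficiency criterion can be sketched as follows. Here random_term() stands for a uniformly random Boltzmann sampler as in [14] (not shown), normalize and lo_path are the helpers from the earlier sketches, and the step budget plays the role of the "limited number of steps" above.

```python
MAX_STEPS = 1000   # assumed budget: beyond this a term is treated as divergent

def sample_normalizing_terms(random_term, count=100):
    """Keep only generated terms that reach a normal form under the normal
    (leftmost-outermost) strategy within the step budget."""
    S = []
    while len(S) < count:
        t = random_term()
        nf, _ = normalize(t, lo_path, max_steps=MAX_STEPS)
        if nf is not None:          # normal form found within the budget
            S.append(t)
    return S

def efficiency(S, pick, n=1):
    """Efficiency criterion of this section: mean number of reduction steps
    over S (averaged over n runs for randomized strategies, cf. Section 3)."""
    total = sum(normalize(t, pick, MAX_STEPS)[1] for t in S for _ in range(n))
    return total / (len(S) * n)
```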
3. Studying Randomized Normalization Strategies

3.1. Concept of a Random Reduction Strategy
It is known that there is no strategy that is optimal for all lambda terms at once. However, a strategy may be optimal for a certain class of terms. It therefore makes sense to look for strategies that are optimal for a given class of lambda terms, i.e. strategies that quickly normalize the terms of that class.
A similar situation arises in game theory, where optimal strategies do not always exist among pure strategies. The problem is solved by introducing mixed strategies: each pure strategy is used with some probability, so the mixed strategy contains an element of randomness. As a result, it is guaranteed that the player's average payoff is maximized [15].
Randomized algorithms are algorithms that make random choices during their execution. In practice, a randomized program uses values produced by a random number generator to decide the next step at several branches. Randomized algorithms are typically used to reduce the running time (time complexity) or the memory used (space complexity) compared with a standard deterministic algorithm. For many problems, a randomized algorithm is the simplest, the fastest, or both [16].
There are two main types of randomized algorithms: Las Vegas algorithms and Monte Carlo algorithms. A Las Vegas algorithm may use randomness to speed up the computation, but it must always return the correct answer. Monte Carlo algorithms do not have this restriction: they are allowed to return wrong values, but only with a small probability, since otherwise the algorithm would not be of any use [17].
Adding randomness may improve the efficiency of the lambda term reduction process. We can also use machine learning or genetic algorithms to tune the probabilities of using different reduction strategies for a given class of lambda terms. This combines the advantages of various reduction strategies in a single strategy in the best way.
Definition 13. A random reduction strategy R is a mapping (Λ, ℕ) → Λ such that
∀M ∈ Λ, n ∈ ℕ: M ↠β R(M, n). (9)
Definition 14. Let R be a random reduction strategy. We denote by L_{R,n}(M) the average number of reduction steps needed to obtain the normal form of the term M over n runs of the strategy R.

3.2. Uniformly Random Reduction Strategy
The basic QuickSort algorithm has O(n²) running time on an array of n elements when we always choose the leftmost element of the array as the pivot. In the randomized QuickSort algorithm, where the pivot is a random element of the array, the expected running time is O(n log n) [18]. In the same spirit, we can use randomness in a reduction strategy by reducing an arbitrary redex. The easiest way is to use the uniform distribution and select, uniformly at random, a redex for reduction from all redexes of the given term.
Definition 15. The uniformly random reduction strategy UR is a mapping (Λ, ℕ) → Λ that reduces a redex selected uniformly at random from all redexes of the term:
∀M ∈ Λ, n ∈ ℕ: M →β UR(M, n). (10)
The implementation of the uniformly random reduction strategy consists of the following steps (a code sketch follows below):
1. Count the number of β-redexes in the given lambda term; denote this number by N.
2. Generate a random natural number R in the range [1, N].
3. Reduce the R-th redex.
One step of this algorithm runs in O(V) time, where V is the number of vertices of the tree representation of the lambda term. Hence the uniformly random reduction strategy is efficient.
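The three implementation steps map directly onto the earlier helpers: redex_paths both counts and indexes the N redexes, so a single uniform draw selects the redex to contract. A minimal sketch (ur_path and l_ur are our illustrative names):

```python
import random

def ur_path(paths, rng=random):
    """Uniformly random strategy UR: len(paths) is N, and randrange picks
    the (R-1)-th index uniformly, i.e. R ~ U{1, ..., N}."""
    return paths[rng.randrange(len(paths))]

# Example: estimate L_{UR,n}(M) as an average over n runs, cf. Definition 14.
def l_ur(term, n=100, max_steps=1000):
    return sum(normalize(term, ur_path, max_steps)[1] for _ in range(n)) / n
```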
3.3. Mixed Reduction Strategy
Different reduction strategies have their advantages and disadvantages, and for different lambda terms different strategies turn out to be optimal. As in game theory, we can introduce mixed strategies, meaning that we use each pure strategy with some probability. This can significantly increase the average speed of reduction of lambda terms from a given set.
Definition 16. Let R = {R1, R2, …, Rn} be a set of reduction strategies and p = (p1, p2, …, pn) with Σ_{i=1}^{n} p_i = 1. The mixed reduction strategy MR_{R,p} is a mapping (Λ, ℕ) → Λ that applies one of the reduction strategies R_i from R to reduce a redex, where p_i is the probability of choosing R_i.
Definition 17. For a given reduction strategy R, a function V: ΛN → ℝ is called a Lyapunov function if
∃n ∈ ℝ ∀M ∈ ΛN: V(M) ≥ n, (11)
∃ε > 0 ∀M ∈ ΛN ∩ Λβ: V(M) > V(R(M)) + ε, (12)
where V is extended to partial distributions as follows:
V(R(M)) = Σ_{m∈Λ} V(m) ℙ(M → m). (13)
Foster's theorem [19]. If a Lyapunov function V can be defined for a reduction strategy R, then R is normalizing and the average derivation length satisfies
∃ε > 0 ∀M ∈ ΛN: L_R(M) ≤ V(M)/ε. (14)
Theorem 1. Let R = {R1, R2, …, Rn} be a set of reduction strategies and p = (p1, p2, …, pn) with Σ_{i=1}^{n} p_i = 1. Let there exist i ∈ [1, n] such that R_i = LO is the normal order reduction strategy and p_i > 0. Then the mixed reduction strategy MR_{R,p} is almost surely normalizing.
Proof. To prove the theorem, we use Foster's theorem. Consider the function V = L_LO. The first condition of the Lyapunov function definition is obviously satisfied: ∀M ∈ ΛN: L_LO(M) ≥ 0.
Consider M ∈ ΛN ∩ Λβ. Let m = MR_{R,p}(M) and M →_{R_j} N_j for j ∈ [1, n]. Clearly, L_LO(M) = L_LO(N_i) + 1. The LO reduction strategy satisfies the inequality L_LO(K) ≤ L_LO(M) whenever M →β K; therefore L_LO(N_j) ≤ L_LO(M) for all j ∈ [1, n]. Hence
L_LO(m) = Σ_{j=1}^{n} p_j L_LO(N_j)
= p_i (L_LO(M) − 1) + Σ_{j≠i} p_j L_LO(N_j)
≤ p_i (L_LO(M) − 1) + Σ_{j≠i} p_j L_LO(M)
= p_i L_LO(M) − p_i + (1 − p_i) L_LO(M)
= L_LO(M) − p_i.
Thus L_LO(m) ≤ L_LO(M) − p_i, so for ε = p_i > 0 we have L_LO(M) ≥ L_LO(MR_{R,p}(M)) + ε. Hence V = L_LO is a Lyapunov function for the mixed reduction strategy MR_{R,p}, and by Foster's theorem MR_{R,p} is almost surely normalizing. ∎
The implementation of the mixed reduction strategy MR_{R,p} consists of the following steps (a code sketch follows below):
1. Generate a random float number r in the range [0, 1].
2. Find the smallest index i for which r ≤ Σ_{j=1}^{i} p_j.
3. Apply the reduction strategy R_i.
If all reduction strategies in R = {R1, R2, …, Rn} are efficient, then the mixed reduction strategy MR_{R,p} is also efficient.
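A sketch of these steps on top of the earlier pickers; itertools.accumulate yields the partial sums needed in step 2 (mixed_pick and mr_lo_ri are our illustrative names):

```python
import random
from itertools import accumulate

def mixed_pick(pickers, probs, paths, rng=random):
    """Mixed strategy MR_{R,p}: draw r, take the smallest i with
    r <= p_1 + ... + p_i, and let the picker for R_i choose the redex."""
    r = rng.random()                          # step 1: r in [0, 1)
    for pick, cum in zip(pickers, accumulate(probs)):
        if r <= cum:                          # step 2: smallest such i
            return pick(paths)                # step 3: apply R_i
    return pickers[-1](paths)                 # guard against float rounding

# Example: the two-strategy mixture MR_{{LO,RI},p} studied below.
def mr_lo_ri(paths, p_lo=0.85):
    return mixed_pick([lo_path, ri_path], [p_lo, 1.0 - p_lo], paths)
```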
The mixed reduction strategy MR_{R,p} involves choosing a probability vector p, which is a hyperparameter of the strategy. This raises the question of how to choose the probability vector so as to maximize the efficiency of reducing the lambda terms from a set S ⊆ ΛN to normal form. In this case we need to solve an optimization problem: minimize the total number of reduction steps required for the lambda terms from S by finding the best possible probability vector p for the strategy MR_{R,p}:
e(p) = Σ_{M∈S} L_{MR_{R,p}}(M) → min. (15)
A common way to find good combinations of hyperparameters is a grid search. The main disadvantage of grid search is the need for a complete enumeration of all combinations, which can take a very long time, and any grid search is limited to the subset of hyperparameter values we have defined. A genetic algorithm allows us to find a good combination in less time than an exhaustive grid search, and genetic algorithms can be used not only as an efficient way to search on a grid but also for direct search in the entire parameter space [20].
Genetic algorithms are based on the hypothesis that the optimal solution to a problem can be assembled from small structural elements, and individuals that contain some of the desirable structural elements are assigned a higher score. Repeated selection and crossover operations lead to the emergence of ever better individuals that pass these structural elements on to the next generation, perhaps combined with other successful structural elements. This creates a genetic pressure that directs the population towards an increasing number of individuals with the structural elements that form the optimal solution. As a result, each generation is better than the previous one and contains more individuals close to the optimal solution.
Consider the mixed reduction strategy MR_{{LO,RI},p} with probability vector p = (p_LO, p_RI), where p_RI = 1 − p_LO. For various values p_LO ∈ [0, 1], we calculated the efficiency of the resulting reduction strategies MR_{{LO,RI},p}. The results are shown in Fig. 1, which plots the efficiency of the mixed reduction strategy MR_{{LO,RI},p} against the probability p_LO of using the normal strategy.
Figure 1: Dependence of the efficiency of the mixed reduction strategy MR_{{LO,RI},p} on the probability p_LO of using the normal strategy
As seen from Fig. 1, the optimal probabilities of using the normal and applicative strategies in the mixed reduction strategy MR_{{LO,RI},p} for normalizing uniformly randomly generated lambda terms are p_LO = 0.85 and p_RI = 0.15. The expected number of reduction steps needed to normalize a uniformly randomly generated lambda term is
M_{MR_{{LO,RI}},(0.85,0.15)} = 8.45. (16)
In addition, the mixed reduction strategy MR_{{LO,RI},(0.85,0.15)} is almost surely normalizing by Theorem 1. Thus, we have found a mixed reduction strategy that is more efficient than the applicative and normal reduction strategies for normalizing uniformly randomly generated lambda terms, and it is almost surely normalizing.
Consider now the mixed reduction strategy MR_{{LO,RI,LI,RO,UR},p}, where p = (p1, p2, p3, p4, p5) and Σ_{i=1}^{5} p_i = 1. We used a genetic algorithm to find the best possible probability vector p. We defined an individual as a list of decimal numbers representing the desired probability vector. The fitness function of an individual p was the efficiency criterion of the mixed strategy described in Section 2.4. As the genetic selection, crossover, and mutation operators, we used tournament selection of size 2 in combination with elitism [21], Simulated Binary Crossover [22], and Polynomial Bounded Mutation [23], respectively. As the result, we found the probability vector
p = (0.72, 0.02, 0.02, 0.17, 0.07). (17)
Let us denote the obtained mixed strategy MR_{{LO,RI,LI,RO,UR},p} by MR_opt. The mixed reduction strategy MR_opt is almost surely normalizing by Theorem 1.
We can generalize this algorithm to an arbitrary set R of reduction strategies. Thus, we have developed a framework that allows us to find a mixed reduction strategy that normalizes a given class of lambda terms with close to the greatest efficiency. In addition, if the mixed strategy uses the normal strategy with non-zero probability, then it is almost surely normalizing on lambda terms that have a normal form.
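The search itself can be sketched with a deliberately simplified genetic algorithm. It keeps tournament selection of size 2 with elitism as described above but, for brevity, replaces Simulated Binary Crossover and Polynomial Bounded Mutation with plain arithmetic crossover and Gaussian jitter; li_path and ro_path are assumed pickers for the LI and RO strategies, analogous to lo_path and ri_path defined earlier.

```python
import random

def fitness(p, S, max_steps=1000):
    """e(p) from (15): total steps to normalize the sample S under MR_{R,p}.
    Note the value is itself random, since the mixture is randomized."""
    pick = lambda paths: mixed_pick(
        [lo_path, ri_path, li_path, ro_path, ur_path], p, paths)
    return sum(normalize(t, pick, max_steps)[1] for t in S)

def renorm(v):
    """Project a vector back onto the probability simplex (sum to 1)."""
    s = sum(v)
    return [x / s for x in v]

def ga_search(S, pop_size=30, generations=50, rng=random):
    pop = [renorm([rng.random() for _ in range(5)]) for _ in range(pop_size)]
    for _ in range(generations):
        best = min(pop, key=lambda p: fitness(p, S))
        nxt = [best]                              # elitism: keep the best vector
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 2), key=lambda p: fitness(p, S))
            b = min(rng.sample(pop, 2), key=lambda p: fitness(p, S))
            w = rng.random()                      # arithmetic crossover
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            i = rng.randrange(5)                  # mutate one coordinate
            child[i] = max(1e-9, child[i] + rng.gauss(0.0, 0.1))
            nxt.append(renorm(child))
        pop = nxt
    return min(pop, key=lambda p: fitness(p, S))
```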
3.4. Comparing the Efficiency of the Uniformly Random, Mixed, Normal, and Applicative Reduction Strategies
We want to compare the efficiency of the most common reduction strategies, LO (normal order) and RI (applicative order), with the UR (uniformly random) and MR_opt (the mixed strategy obtained in the previous section) reduction strategies on uniformly randomly generated lambda terms. Based on the set S of lambda terms generated in Section 2.4, we construct histograms of the distributions of the number of reduction steps required to obtain the normal form using the LO, RI, UR, and MR_opt reduction strategies, respectively (Fig. 2–5).
We checked hypotheses about the distribution laws of the values of the functions L_LO, L_RI, L_{UR,100}, and L_{MR_opt,100} on the uniformly randomly generated lambda terms from S. The following distribution laws were considered: normal, Cauchy, χ², exponential, generalized normal (exponential power), gamma, log-normal, and Rayleigh. The distribution laws that best describe the values of these functions are plotted in Fig. 2–5.
It was determined that the values of the function L_LO on the uniformly generated lambda terms are distributed according to the log-normal law with parameters μ = 1.65, σ = 1.06:
f(x; μ, σ) = (1 / (xσ√(2π))) exp(−(ln x − μ)² / (2σ²)). (18)
The expected value of this distribution is
M_LO = e^(μ + σ²/2) = 9.14. (19)
It was determined that the values of the function L_RI on the uniformly randomly generated lambda terms are distributed according to the log-normal law (18) with parameters μ = 2.04, σ = 0.68. The expected value of this distribution is
M_RI = e^(μ + σ²/2) = 9.66. (20)
Figure 2: Histogram of the distribution of the numbers of reduction steps L_LO(M) for the uniformly randomly generated lambda terms M ∈ S
Figure 3: Histogram of the distribution of the numbers of reduction steps L_RI(M) for the uniformly randomly generated lambda terms M ∈ S
In addition, 4 lambda terms were not normalized by RI, although they have a normal form: the applicative strategy did not terminate on these terms.
It was determined that the values of the function L_{UR,100} on the uniformly randomly generated lambda terms are distributed according to the log-normal law (18) with parameters μ = 1.92, σ = 0.8. The expected value of this distribution is
M_UR = 9.42. (21)
Figure 4: Histogram of the distribution of the numbers of reduction steps L_{UR,100}(M) for the uniformly randomly generated lambda terms M ∈ S
Figure 5: Histogram of the distribution of the numbers of reduction steps L_{MR_opt,100}(M) for the uniformly randomly generated lambda terms M ∈ S
Note also that, in contrast to the applicative strategy, the uniformly random strategy normalized all the terms.
It was determined that the values of the function L_{MR_opt,100} on the uniformly randomly generated lambda terms are distributed according to the log-normal law (18) with parameters μ = 1.67, σ = 0.96. The expected value of this distribution is
M_{MR_opt} = 8.39. (22)
Thus, from formulas (19)–(22) we obtain
M_RI > M_UR > M_LO > M_{MR_opt}
for uniformly generated lambda terms of the same size. We can see that the efficiency of the uniformly random reduction strategy is commensurate with the efficiency of the normal and applicative strategies, so the uniformly random strategy can be used for the lambda term reduction process in the same way as the normal or applicative strategies. The mixed strategy turned out to be the best among the considered strategies. In addition, the MR_opt strategy is almost surely normalizing, and all lambda terms were normalized using this strategy.
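As a consistency check on (19)–(22), the log-normal mean e^(μ + σ²/2) can be recomputed from the fitted parameters; the small differences from the reported values come from rounding of μ and σ.

```python
from math import exp

def lognormal_mean(mu, sigma):
    """Expected value of the log-normal law (18): exp(mu + sigma**2 / 2)."""
    return exp(mu + sigma ** 2 / 2)

for name, mu, sigma in [("LO", 1.65, 1.06), ("RI", 2.04, 0.68),
                        ("UR", 1.92, 0.80), ("MR_opt", 1.67, 0.96)]:
    print(name, round(lognormal_mean(mu, sigma), 2))
# Prints LO 9.13, RI 9.69, UR 9.39, MR_opt 8.42 -- matching (19)-(22) up to
# rounding and reproducing the ordering M_RI > M_UR > M_LO > M_MRopt.
```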
4. Conclusions
There is no universal strategy that is the most efficient for reducing every lambda term. All existing reduction strategies have various advantages and disadvantages, and the efficiency of a reduction strategy essentially depends on the lambda terms to which it is applied.
We defined a randomized reduction strategy for lambda terms. Two randomized reduction strategies have been defined and implemented as program code: a uniformly random and a mixed reduction strategy. The mixed reduction strategy was investigated, and it was proved that a mixed reduction strategy that uses the normal strategy with probability greater than 0 is almost surely normalizing. A framework has been developed that allows us to find a mixed reduction strategy that is almost surely normalizing and whose efficiency is no worse than that of the existing reduction strategies for normalizing a given class of lambda terms. Simulation experiments on normalizing uniformly randomly generated lambda terms using the mixed, uniformly random, normal, and applicative reduction strategies have been conducted.
The uniformly random reduction strategy is proposed for use, since it is comparable in efficiency with the normal and applicative strategies. Moreover, the uniformly random reduction strategy turned out to be better than the applicative one in terms of the number of normalized lambda terms.
The efficiency of the mixed reduction strategy is significantly better than the efficiency of the other considered reduction strategies for normalizing randomly generated lambda terms: the mixed reduction strategy combines the advantages of several reduction strategies in the best way. To sum up, randomized reduction strategies may be more efficient than standard non-randomized reduction strategies.

5. References
[1] Z. Hu, J. Hughes, and M. Wang, "How functional programming mattered," National Science Review, vol. 2, no. 3, pp. 349–370, Sep. 2015, doi: 10.1093/nsr/nwv042.
[2] J. Arora, S. Westrick, and U. A. Acar, "Provably space-efficient parallel functional programming," Proceedings of the ACM on Programming Languages, vol. 5, no. POPL, pp. 1–33, Jan. 2021, doi: 10.1145/3434299.
[3] K. Bazilevych et al., "Stochastic modelling of cash flow for personal insurance fund using the cloud data storage," International Journal of Computing, vol. 17, no. 3, pp. 153–162, Sep. 2018, doi: 10.47839/ijc.17.3.1035.
[4] L. Belcastro, R. Cantini, F. Marozzo, A. Orsino, D. Talia, and P. Trunfio, "Programming big data analysis: principles and solutions," Journal of Big Data, vol. 9, no. 1, Jan. 2022, doi: 10.1186/s40537-021-00555-2.
[5] D. Chumachenko et al., "Investigation of statistical machine learning models for COVID-19 epidemic process simulation: Random Forest, K-Nearest Neighbors, Gradient Boosting," Computation, vol. 10, no. 6, 86, 2022, doi: 10.3390/computation10060086.
[6] Q. Pan and X. Koutsoukos, "Building a blockchain simulation using the Idris programming language," Proceedings of the 2019 ACM Southeast Conference, Apr. 2019, doi: 10.1145/3299815.3314456.
[7] S. Yakovlev et al., "The concept of developing a decision support system for the epidemic morbidity control," CEUR Workshop Proceedings, vol. 2753, pp. 265–274, 2022.
[8] S. L. Peyton Jones, "Parallel implementations of functional programming languages," The Computer Journal, vol. 32, no. 2, pp. 175–186, 1989, doi: 10.1093/comjnl/32.2.175.
[9] P. Sestoft, "Demonstrating lambda calculus reduction," Electronic Notes in Theoretical Computer Science, vol. 45, pp. 424–432, 2001, doi: 10.1016/s1571-0661(04)80973-3.
[10] D. Wright, "Reduction types and intensionality in the lambda-calculus," 1992.
[11] E. Engeler, "H. P. Barendregt. The lambda calculus. Its syntax and semantics. Studies in logic and the foundations of mathematics," Journal of Symbolic Logic, vol. 49, no. 1, pp. 301–303, 1984, doi: 10.2307/2274112.
[12] J. Klop, "Term rewriting systems," 2021, 77 p.
[13] G. Mazzola, G. Milmeister, and J. Weissmann, Comprehensive Mathematics for Computer Scientists 2: Calculus and ODEs, Splines, Probability, Fourier and Wavelet Theory, Fractals and Neural Networks, Categories and Lambda Calculus. Springer Science & Business Media, 2004.
[14] K. Grygiel and P. Lescanne, "Counting and generating terms in the binary lambda calculus (extended version)," 2015.
[15] S. Tadelis, Game Theory: An Introduction. Princeton University Press, 2013.
[16] M. Mitzenmacher and E. Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[17] D. Fucci et al., "A longitudinal cohort study on the retainment of test-driven development."
[18] H. J. Karloff and P. Raghavan, "Randomized algorithms and pseudorandom numbers," Journal of the ACM, vol. 40, no. 3, pp. 454–476, 1993, doi: 10.1145/174130.174132.
[19] P. Bremaud, Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer, 1999.
[20] E. Wirsansky, Hands-On Genetic Algorithms with Python: Applying Genetic Algorithms to Solve Real-World Deep Learning and Artificial Intelligence Problems. Birmingham, UK: Packt Publishing Ltd, 2020.
[21] M. Yusoff and A. A. Othman, "Genetic algorithm with elitist-tournament for clashes-free slots of lecturer timetabling problem," Indonesian Journal of Electrical Engineering and Computer Science, vol. 12, no. 1, p. 303, 2018, doi: 10.11591/ijeecs.v12.i1.pp303-309.
[22] K. Deb and R. B. Agrawal, "Simulated binary crossover for continuous search space," 2014.
[23] K. Deb and D. Deb, "Analysing mutation schemes for real-parameter genetic algorithms," International Journal of Artificial Intelligence and Soft Computing, vol. 4, no. 1, p. 1, 2014, doi: 10.1504/ijaisc.2014.059280.