Reconstruction of multi-dimensional arrays in SAPFOR Nikita Kataev1[0000-0002-7603-4026], Vladislav Vasilkin2[0000-0002-7918-1849] 1 Keldysh Institute of Applied Mathematic RAS Miusskaya sq., 4, 125047, Moscow, Russia 2 Lomonosov Moscow State University, GSP-1, Leninskie Gory, 11999, Moscow, Russia kaniandr@gmail.com Abstract. Low-level representation of programs in the form of LLVM IR al- lows for various optimizations to improve the quality of program analysis in SAPFOR (System FOR Automated Parallelization). Being the same for differ- ent high-level languages, LLVM IR allows us to explore multilingual applica- tions. At the same time, it loses some features of the program, which are availa- ble in its higher level representation. One of these features is the multi- dimensional structure of the arrays. Multi-dimensional view improves the accu- racy of data dependency analysis, since the linearized representation of arrays with parametric sizes may not be an affine expression and it will not be possible to apply integer linear programming to analyze it. In addition, the use of multi- dimensional arrays allows us to use a multi-dimensional processor matrix and to parallelize a whole loop nests, rather than a single loop in the nest. This way, parallelism of a program is going to be increased. These opportunities are na- tively supported in the DVM system. This paper discusses the approach used in the SAPFOR system to recover the form of multi-dimensional arrays by their linearized representation in LLVM IR. The proposed approach has been suc- cessfully evaluated on various applications. Keywords: Program Analysis, Semi-automatic Parallelization, SAPFOR, DVM, LLVM. 1 Introduction The multi-dimensional arrays are widely used in many computational programs. For example, they enable descriptions of the object properties at various points in a multi- dimensional computing space. The DVM system [1, 2] relies on the multi-dimensional view of arrays to increase parallelization opportunity of a program. For example, it proposes specifications which simplify mapping of multi-dimensional arrays to a multi-dimensional grid of processors. Thus, to achieve sustainable performance of parallel programs it is necessary to analyze multi-dimensional data structures in the program. SAPFOR (System FOR Automated Parallelization) [3, 4] produces a parallel ver- sion of a program in a semi-automatic way according to DVMH model of the parallel programming for heterogeneous computational clusters. The DVMH model enables consistent distribution of data between processors of a computational cluster which Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 210 are represented as a multi-dimensional grid. The presence of align directive in a source code introduces the alignment rules which specify the location of elements of arrays relative to each other. These rules use affine expressions to set a correspond- ence between points of multi-dimensional index spaces of two objects. To avoid a communication overhead, the elements of different arrays which are used on the same iteration of the loop should be mapped to the same processor. Thus, accesses to arrays in loops and the form of subscript expressions in these accesses imply the alignment rules. Data dependence analysis is obviously required for program parallelization. The visibility of multi-dimensional view of data allows us to significantly increase the accuracy and reduce the computational complexity of data dependency analysis in many cases. To compute loop-carried data dependencies it is possible to perform the pairwise comparison of expressions which calculate the addresses of accessed ele- ments. If the expressions that calculate the offset relative to the beginning of the array are affine with respect to the induction variables of the enclosing loops, then compu- tation of data dependencies becomes NP-hard problem. In this case integer linear programming (ILP) solvers are applicable. Moreover, to reduce the complexity of the problem being solved, heuristics are used. If subscript expressions in a delinearized accesses satisfy some conditions (for example, expressions are SIV [5] (i.e. contains a single index variable)) the pairwise comparison of subscript expressions drastically reduce the complexity of the analysis. The knowledge of the multi-dimensional view of arrays makes it possible to recover subscript expressions from the linearized repre- sentation of array accesses. It should be noted that a parametric size of an array makes the linearized accesses non-affine so it becomes almost impossible to verify the absence of data dependen- cies. The SAPFOR system uses LLVM [6] intermediate representation (IR) to perform program analysis. This representation allows us to analyze programs for diverse pro- gramming languages (Fortran, C). So, it becomes possible to overcome multi-lingual issue essential for large-scale computational applications [7]. LLVM also provides the ability to hide program transformation from the user in order to improve the quality of the analysis [8, 9]. Implementation of LLVM-based analysis tool also eliminates the need to consider syntactic features of different programming languages and to analyze the diverse set of available language constructs. To preserve the relation with a higher level representation, the debug information can be used. For these purposes LLVM uses DWARF debugging file format (DWARF [10]) which is common for various languages. However, linearized view of array accesses impedes program analysis. Multi-dimensional view of arrays of known constant size is only visible in LLVM IR. For example, the type [100 x [200 x float]] can be used to declare an array of 100 * 200 elements. There are two purposes of array delinearization in the context of SAPFOR. The first one is the reconstruction of the array view which was multi-dimensional in the source code. And the second one is the search for equivalent the multi-dimensional view for arrays represented in the program source code in a linearized form. In this paper, we are primarily interested in the approach to recover multi-dimensional view 211 of arrays which were originally multi-dimensional. This follows from the restrictions which DVM system imposes on the programs: the use of multi-dimensional arrays in the source code is required. 2 Delinearization in LLVM The paper [11] proposes an approach to recovering the multi-dimensional view of arrays of parametric size. The presented approach is partially implemented in the context of LLVM and it was initially focused on the use in the Polly framework [12]. Polly is a high-level loop and data-locality optimizer. It can also exploit loop-level parallelism for systems with shared memory, including GPU. Polly performs IR-level optimization, therefore, the correspondence between delinearized arrays and the source code objects is not required. Loops can be optimized separately, so delineari- zation is required in the context of the evaluated loop only instead of the entire pro- gram. However, consistent delinearization for all accesses in the entire program is required to exploit parallelization opportunity in case of HPC systems with distributed memory. The paper [11] is focused on recovering the multi-dimensional view of arrays of parametric size only. Separately, Polly implements delinearization for arrays of known constant size. LLVM provides a simple array type to represent sequences of elements in memory. The array types can be nested, so definition of multi- dimensional arrays of known constant sizes is also supported. That means that in this case delinearization is not actually required, since the array type can be used to restore each subscript expression. If some of array dimensions have parametric sizes while other dimensions have known constant sizes the whole array is treated as an array of parametric size. So, the recovered number of dimensions and recovered multi-dimensional view may differ from the definition of array in the original program. It also should be noted that to implement delinearization Polly uses its internal data structures which are only partially included in the LLVM core. Thus, the use of the proposed algorithms in SAPFOR directly becomes impossible. Let us consider the main points of the approach presented in [11]. The linearized view of the access A [S_0] ... [S_N – 1] to the array A [D_0] ... [D_N – 1] is A + S_0 * D_1 *…* D_N–1 + … + S_N–1. (1) In this case, D_I is the size of the dimension I and S_I is the corresponding sub- script expression, I takes values from 0 to N – 1. This equation implies the following ones: C_N–1 * D_N–1=GCD(S_0 * D_1 * … * D_N–1, …, S_N–2 * D_N–1), (2) C_I * D_I=GCD(S_0 * D_1 * … * D_N–1, … S_I–1 * D_I * … D_N–1) / (D_I+1 *…* D_N–1), 0