=Paper=
{{Paper
|id=Vol-3041/488-493-paper-90
|storemode=property
|title=Implementing the Graph Model of the Spread of A Pandemic on GPUs
|pdfUrl=https://ceur-ws.org/Vol-3041/488-493-paper-90.pdf
|volume=Vol-3041
|authors=Vladimir Sudakov,Nikita Yashin
}}
==Implementing the Graph Model of the Spread of A Pandemic on GPUs==
Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 IMPLEMENTING THE GRAPH MODEL OF THE SPREAD OF A PANDEMIC ON GPUS Vladimir Sudakov1,2,a, Nikita Yashin1 1 Plekhanov Russian University of Economics 2 Keldysh Institute of Applied Mathematics (Russian Academy of Sciences) E-mail: asudakov@ws-dss.com Modeling the spread of viruses is an urgent task in modern conditions. In the created model, contacts between people are represented in the form of the Watz and Strogatz graph. We studied graphs with tens of thousands of vertices with a simulation period of six months. The paper proposes methods for accelerating computations on graph models using graphics processors. In the considered problem, there were two resource-intensive computational tasks: generating an adjacency matrix of a graph that simulates the presence of contacts between people and traversing this graph in order to simulate infection. The calculations were carried out in sequential mode and with acceleration on GPUs. The modeling system software is implemented using the Cuda, CuPy, PyTorch libraries. The calculations were carried out using the Tesla T4 graphics accelerator. Compared to computations without using graphics accelerators, their application gave a 7-fold increase in speed. Keywords: graph, pandemic, graphics accelerator, simulation, small world Vladimir Sudakov, Nikita Yashin Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 488 Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 1. Introduction Computing on graphs is a large cluster of problems with a wide range of applications. Modeling interactions between objects in a group is one such challenge. However, this type of problem is resource-intensive in the case of a graph with a huge number of vertices. For computations on graphs, an adjacency matrix is used, which in full form takes O(n2) time to complete a round trip, where n is the number of vertices. There are several ways to improve the execution speed of the algorithm - using the multi-threading approach on the CPU or porting the calculations to CUDA. This study explored the second approach. Behavioral comparison and speed of execution of the algorithm at the CPU and the GPU. Effectiveness of the technology was demonstrated on model of the interaction among people and the spread of the virus among them. 2. Literary review In 2016, the GraphOps library was written [1]. This library performs its calculations on FPGA, which made it possible to accelerate computations, but the basis of this work was the creation of a library that accelerates work with graphs for the end user. There have also been several studies on speeding up certain algorithms that run on graphs. Improvement of the GlimmerHHM algorithm [2] consisted in parallelization and transfer of computations to GPU and FPGA, which led to acceleration up to 34 and 200 times, respectively. Semantic search [3] has been successfully accelerated on both GPU and FPGA. It was study of the possible benefits FGPA front of CPU and GPU, but the results show accelerating work on the GPU in 5.6 times in comparison with a CPU. Most of these papers investigate the use of FGPA. However, the use of these devices requires a lot of labor costs for the development and availability of computing equipment with FGPA, while graphics accelerators are more common, it is easier to start working with them, and there are many different libraries in different languages that support using of GPU. Regarding the issues of modeling pandemics using graphic processors, it is worth highlighting the work [4]. There are offered effective approaches to the treatment of co-occurrence matrices for large graphs on the GPU, but there are no algorithms that are designed to prepare these graphs. In the context of the threatening spread of COVID-19, approaches based on SEIR models are actively developing [5]. Basically, such models presuppose a description of pandemics through a system of differential equations that describe the situation well under typical conditions, but do not allow assessing the consequences of certain decisions on isolating the population at the micro level. 3. Method The constructed model of virus propagation works on graphs. The vertices of the graph describe people, and the edges describe the presence of contacts between them through which the virus can be transmitted. Simulation is carried out with a fixed time step. Every unit of time, all the vertices of the graph are traversed and checked for a change in health status - whether the person has become infected, if he was not infected earlier, or whether the disease has passed to a new stage if the person is already infected. In total, a person can have four conditions, which are taken from the SEIR model [5]. The first condition (Susceptible) - the person is healthy, he is not a distributor, but can be infected. Most people have this condition initially. The second state (Exposed) - a person has become infected, and the disease is in the incubation stage. Such a person cannot infect anyone yet and cannot be more infected. This condition is obtained after a successful test for transmission of infection from carriers with whom there were contacts. The third state (Infectious) - a person's disease has passed into the active stage. Such a person can infect others and is the only way to directly transmit the infection, but he himself cannot get the disease more. The state is obtained after a certain time has elapsed in the second state. The fourth and final state of a person (Recovered) is recovered or passed away. A sick person is no 489 Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 longer contagious and cannot transmit the infection, and it is also believed that he has immunity and he himself can no longer become infected at a given simulation period. This state is obtained after a certain time in the third state. In the proposed model, it is proposed to separate the types of contacts. A separate graph is built for each type of contacts. Next, the union of the resulting graphs is constructed. Testing model was carried out on two types of contacts. The first is constant contact that does not appear or disappear over time, for example, communication with relatives. Such contacts are described in a permanent graph that is created once before the state updates start. The second is an accidental contact, it can appear and disappear absolutely by accident, for example, an occasional meeting in a transport. Such contacts are described in a graph that is rebuilt every unit of time. Each time unit, these two graphs are combined, and the spread of the virus is calculated from the resulting graph. For the software implementation of the model, the programming language Python3 was chosen. This language is very easy to learn, and has a huge variety of libraries, which allows you to write programs with a minimum of knowledge in the shortest possible time. Of the libraries used in the work – CuPy [6] and NetworkX [7], which were used to build graphs, and PyTorch [8], which can traverse graphs using their adjacency matrices. First, you need to build a graph of the first type contacts. The so-called “small world” concept proposed in work [9] is well suited for solving this problem. It is built using the Watts-Strogatz algorithm, in two steps. The first step is to build N vertices and build edges between each i-th and j-th vertices, if they satisfy the condition: 𝐾 𝐾 0 < |𝑖 − 𝑗| mod (𝑁 − 1 − ) ≤ , 2 2 where K is the number of edges at each vertex. The second step - each edge (i, j) with a chance β changes to (i, k), where k ≠ j, edge (i, k) does not exist yet, k is a random vertex. Thereafter must renew states. The first step in updating states is building a random graph. In this graph, each vertex has the same degree. There are two ways to construct such a graph. The first, using the CPU, is the fast_gnp_random_graph [10] function of the NetworkX library. The second, using the GPU, is the sparse.rand [11] function of the CuPy library. NetworkX, unlike CuPy, does not support using the power of graphics accelerators, so two different libraries are used. CuPy also creates directly an adjacency matrix, not a graph. The second stage of updating states is directly traversing all vertices. It is done with the help of library Pytorch, which can perform calculations as the CPU, as well as on the GPU. If a specific person is not yet infected, the number N of his infectious contacts is considered and according to the formula 𝑃 = 1 − (1 − 𝛼)𝑁 , where α is the probability of infection from one contact. If a person has just passed to the second or third stage of infection, then for him, using the Monte Carlo method, the time spent in this state is modeled using the Weibull distribution. The applicability of the Weibull distribution to this problem was substantiated in [12]. 4. Results To check the results of the program, four programs were written with different levels of use of the graphics accelerator. Each of the programs was wrapped in a separate function, and the execution time of each function was measured. For measurement accuracy, each function was run 50 times and the average running time was taken. To compare the efficiency on different sizes of matrices, only one variable changed in the functions - the number of people in the model. Efficiency was compared for 490 Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 different population sizes, and, accordingly, for different graph sizes. This value was measured in the range [1000, 17000] with a step of 1000. Первая функция, no_cuda, не использовала графического ускорителя, все вычисления проводились исключительно на процессоре. Вторая функция, torch_cuda, использовала графический ускоритель только для моделирования непосредственно распространения вируса. Матрица смежности строилась на обычном процессоре. Третья функция, graph_cuda, использовала графический ускоритель для построения матрицы смежности, а распространение между людьми вычисляется на обычном процессоре. Четвертая функция, all_cuda, использовала графический ускоритель для обоих задач – и для построения матрицы смежности, и для расчетов распространения вируса. The first function, no_cuda, did not use a graphics accelerator, all calculations were performed exclusively on the processor. The second function, torch_cuda, used a graphics accelerator only to simulate the actual spread of the virus. The adjacency matrix was built on a conventional processor. The third function, graph_cuda, used a graphics accelerator to build an adjacency matrix, and the spread between people was computed on a regular processor. The fourth function, all_cuda, used a graphics accelerator for both tasks - both for building an adjacency matrix and for calculating the spread of the virus. Computing devices used: CPU: Intel Xeon CPU @ 2.00GHz; GPU: NVIDIA Tesla T4 16GB. The calculation results are shown in Tab. 1. A comparative graph is also built based on these data (Fig.1). Table 1. Results of measurements of the speed of functions execution no_cuda torch_cuda graph_cuda all_cuda Population size 1000 3.200957 3.351706 0.710170 0.912640 2000 7.914960 7.242812 1.510238 0.848995 3000 13.984233 12.124245 3.181422 1.496793 4000 21.029639 17.249926 5.338851 2.455780 5000 28.600931 22.572831 8.708607 3.728163 6000 37.634567 28.611021 11.137490 4.775039 7000 47.314849 35.154100 15.447932 6.286974 8000 57.854738 41.903814 19.260485 7.980263 9000 69.675968 50.306944 25.033169 9.963441 10000 84.130254 57.087363 29.785555 12.199988 11000 97.206411 65.925612 38.764165 14.762600 12000 112.067288 73.430345 42.834090 17.492042 13000 126.610486 80.314935 51.497435 20.050080 14000 141.328987 87.860316 58.368204 23.021266 15000 176.068139 97.052003 68.845642 26.358311 16000 215.929255 109.848109 76.498892 29.992050 17000 244.363839 119.475280 88.965665 33.416545 491 Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 Figure 1. Comparing the change in the execution time of different functions with an increase in the size of the graph As a result, it can be seen that the use of a graphics accelerator gives a significant gain in computing speed. This gain becomes more significant as the graph size increases. How many times a full cuda program is faster than a full CPU program is shown in Tab. 2. Table 2. Effect of GPU computing for different graph sizes Population size Population size 1000 3.507361 2000 9.322746 3000 9.342799 4000 8.563324 5000 7.671588 6000 7.881520 7000 7.525854 8000 7.249728 9000 6.993163 10000 6.895929 11000 6.584640 12000 6.406758 13000 6.314712 14000 6.139062 15000 6.679796 16000 7.199550 17000 7.312660 On average, this value is approximately 7.152. Thus, the use of a graphics accelerator can give a 7-fold improvement in efficiency. 5. Conclusions The advantage of using graphics accelerators to increase the speed of computations on graphs was investigated. A new approach to modeling contacts in pandemics based on combining graphs of different types is proposed. The use of the “small world” graph for modeling permanent contacts is investigated. The premium rate of different libraries of working on graphics processors for solving problems of this type has been investigated. Result showed approbation evident an advantage of using GPU, on graphs of high dimensionality. The proposed model of the spread of the virus makes it possible to assess the consequences of decisions to limit the number of contacts of different types. 492 Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021 6. Acknowledgement The reported study was funded by RFBR and CNPq, FASIE, DBT, DST, MOST, NSFC, SAMRC according to the research project No. 20-51-80002. References [1] Tayo Oguntebi, Kunle Olukotun. GraphOps: A Dataflow Library for Graph Analytics Acceleration // FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field- Programmable Gate Arrays. Feb 2016. pp. 111–117 - DOI: 10.1145/2847263.2847337 [2] Nafsika Chrysanthou, Grigorios Chrysos, Euripides Sotiriades, Ioannis Papaefstathiou. Parallel accelerators for GlimmerHMM Bioinformatics Algorithm // Design, Automation and Test in Europe, Grenoble, France, March 14-18, 2011 - DOI: 10.1109/DATE.2011.5763024 [3] Abhinandan Majumdar, Hari Cadambi, Srimat T. Chakradhar, H.P. Graf. A parallel accelerator for semantic search // SASP '11: Proceedings of the 2011 IEEE 9th Symposium on Application Specific Processors, June 2011, pp. 122–128 - DOI: 10.1109/SASP.2011.5941090 [4] Peng Zou, Ya-shuai Lu, Ling-da Wu, Li-li Chen, Yi-ping Yao. Epidemic simulation of a large- scale social contact network on GPU clusters // SIMULATION, vol. 89, no. 10, Oct. 2013, pp. 1154– 1172 - DOI:10.1177/0037549713482026. [5] Wilfredo Angulo, José M. Ramírez, Dany De Cecchis, Juan Primera, Henry Pacheco, Eduardo Rodríguez-Román. A modified SEIR model to predict the behavior of the early stage in coronavirus and coronavirus-like outbreaks // Scientific reports vol. 11,1 16331. 11 Aug. 2021, – DOI:10.1038/s41598-021-95785-y [6] CuPy software available at: https://cupy.dev/ (accessed 20.08.2021) [7] NetworkX software available at: https://networkx.org/ (accessed 20.08.2021) [8] PyTorch software available at: http://pytorch.org/ (accessed 20.08.2021) [9] Watts, D. J.; Strogatz, S. H. Collective dynamics of 'small-world' networks // Nature 393. 1998. 440-442 - DOI:10.1038/30918. [10] Networkx function fast_gnp_random_graph, documentation available at: https://networkx.org/documentation/stable/reference/generated/networkx.generators.random_graphs.fa st_gnp_random_graph.html (accessed 20.08.2021) [11] Cupy function sparse.rand, documentation available at: https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.sparse.rand.html (accessed 20.08.2021) [12] Jing Qin, Chong You, Qiushi Lin, Taojun Hu, Shicheng Yu, Xiao-Hua Zhou. Estimation of incubation period distribution of COVID-19 using disease onset forward time: a novel cross-sectional and forward follow-up study // SCIENCE ADVANCES, 14 Aug 2020, Vol 6, Issue 33 - DOI: 10.1126/sciadv.abc1202 493