Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 EFFICIENCY MEASUREMENT SYSTEM FOR THE COMPUTING CLUSTER AT IHEP V. Ezhova a, V. Kotliar b Institute for High Energy Physics named by A.A. Logunov of National Research Centre “Kurchatov Institute”, Nauki Square 1, Protvino, Moscow region, Russia, 142281 E-mail: a Victoria.Ezhova@ihep.ru , b Viktor.Kotliar@ihep.ru Every day IHEP central computing cluster produce thousands of calculations related to research activities, both IHEP and GRID experiments. A lot of machine resources are expended on this work. So, we can estimate the size of the spent resources used for all types of tasks, make decisions for changing cluster configuration and to do the forecast for the work of the computer center in general. In this work you can see the calculations of the efficiency index and the graphical representation of work of a cluster on the basis of account information. It is one of the main tasks within work on creation of system of uniform monitoring of computer center of IHEP. Keywords: accounting system, Torque pbs, Elastic Search, Kibana © 2018 Victoria Ezhova, Viktor Kotliar 538 Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 1. Introduction In this work you can see a graphical representation of the work of the cluster based on the account information from the torque pbs cluster for real-time analysis. The work of the main scripts and a method for calculating cluster performance based on pbs data will be present in details, as well as the results of testing of some memory allocation systems will be described. We pursued the goal of improving the effectiveness of the cluster through the efficient allocation of resources, for example, by introducing additional libraries. 2. Review of the work performed and calculation of efficiency To start the task on the cluster, you might use the Torque PBS job management system. This system contains information about the necessary resources (the number of cluster nodes, the required amount of RAM and the necessary execution time). After starting the task, all this information is placed in a log file and stored on the server [1]. The python script sorts the accounting file of torque pbs into a python dict and throws off this information in JSON format. It uses the alogger.utils - small python library to parse resource manager logs and json module which can generate JSON from python objects and lists [2]. As next step bash script connects to Elastic Search (ES) and it sends JSON data to ES every minute to display it in the form of the schedule from Kibana (Figure 1). Figure 1. Overview of the efficiency measurement system After the efficiency indicator is calculated it is used in visualizations. Here is the information for the analysis: 1) cput – maximum operating time of the processor or CPU time; 2) ncpus – how many cpus each allocated node must have; 3) ppn – how many processes to allocate per each node; 4) walltime – time of performance of a task in hours; 5) efficiency – result on the basis of the previous values (percentage format). These data have been taken from the log pbs files [3]. The last indicator was calculated on the basis of all previous (Figure 2). Figure 2. Calculation of a new efficiency indicator in Kibanа 539 Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 Cput is the main reference point. If there is no such index it is set to default zero value. In order to calculate the efficiency index, cpu time is converted to the number type. The indicators ncpus and ppn were also taken into account and they are mutually exclusive. CPU time is divided into one of these values, if it presents. As a result, the calculated value is divided by the walltime of a task given by type number. Kibana tools allow you to display this data in the table (Figure 3). Figure 3. Job’s data information results 3. Kibana’s graphical representation Display of the data loaded into ES — it is work for Kibana. Kibana is an open source (Apache Licensed), browser based analytics and search dashboard for Elasticsearch. Several graphs were constructed reflecting the growth or decrease in the efficiency of resources use by a group of tasks based on efficiency indicator. The first one shows a vertical bar graph that reflects the number of tasks in the execution queue depending on the type (Figure 4). You can see that more tasks are related to the ihep-short queue. These tasks typically perform fast processing of small amounts of data and are characterized by a large number of input /output operations. Figure 4. Job’s queue and Job’s state information results Second shows the histogram of the statuses of tasks, the number of cancelled, interrupted, completed tasks (Figure 4). You can see that the overwhelming number of tasks are completed tasks, those that are currently executing and those that are in the execution queue. Another two graphs show the effectiveness of the accomplishing tasks of IHEP users and grid tasks (Figure 5). 540 Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 Figure 5. IHEP and GRID efficiency’s information results The last linear graph reflects several critical indicators at once: average efficiency level, maximum and minimum efficiency, Upper and Lower Standard Deviation of efficiency (Figure 6). Figure 6. Maximum and minimum efficiency information results 4. Memory allocation systems However, the memory usage of each process is bounded with Linux resources limitation. The memory management subsystem is one of the most important parts of the operating system. It was supposed to try different memory allocation systems on the computing nodes of the cluster and to check their effect on the efficiency and memory usage. For an experiment the tcmalloc package was configured on the cluster working nodes for one week. The tc in tcmalloc stands for thread cache — the mechanism through which this particular allocator is able to satisfy certain (often most) allocations lucklessly [4]. During tcmalloc usage it is seen on the graph that a sharp decline of cpu usage occurred in compare with previous gradual usage. With regard to the efficiency of using the cluster, the most noticeable decline is visible for Atlas LHC experiment (Figure 7). Figure 7. Tcmalloc - recession of efficiency The tcmalloc allocator does not return memory to the operating system. As more threads are used, the advantage of using lock-free algorithms shows that tcmalloc drops behind. As a conclusion - tcmalloc is expedient to use for a certain type of tasks, scripts and for IHEP cluster it is unsuitable. Jemalloc was tested for the same period (Figure 8). Jemalloc is a general purpose malloc implementation that emphasizes fragmentation avoidance and scalable concurrency support. Unlike 541 Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 tcmalloc focused on separate tasks, this library is intended for use on the cluster for the whole stream of tasks [5]. Figure 8. Jemalloc - recession of efficiency After inclusion of libjemalloc.so.1 library there was a growth of overall performance of the cluster (in particular CPU utilization). But jobs processing has failed - more than a half of jobs and about 27% of multicore jobs were interrupted (Figure 9). Figure 9. Jemalloc forced jobs to fail That means tcmalloc and jemalloc are not useful for IHEP cluster. 5. Cluster Management System In the future there is the plan to create additional control component for IHEP Cluster Management System (CMS) with objectives to analyse efficiency indicators of overall cluster performance and to manage the cluster in a way of improving resource usage efficiency. At this stage CMS consists of event-driven management system, configuration management system, monitoring and accounting system and a ChatOps technology which is used for the administration tasks. 6. Conclusions In this article it was calculated the efficiency indicator, which based on the accounting file of torque pbs. Also the examples of efficiency’s graphs were given. It will help to analyse efficiency of working cluster in the future. For an experiment the memory allocators were tested. In our case the result of tests was not useful for IHEP cluster. We will continue to improve the efficiency of the computing cluster. As future works it is planned to use Cluster Management System (CMS) with objectives to analyse efficiency indicators. 542 Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10 - 14, 2018 References [1] Ezhova V., Kotliar V. Accounting system for the computing cluster at IHEP // CEUR Workshop Proceedings. February 2017: Vol. 1787.- pp. 518-524 [2] Python library to parse resource manager logs [python aloger]. Available at: https://pypi.org/project/python-alogger/ (accessed 24.10.2018) [3] TORQUE Resource Manager [Torque home page]. Available at: http://docs.adaptivecomputing.com/torque/3-0-5/4.1queueconfig.php (accessed 24.10.2018) [4] TCMalloc : Thread-Caching Malloc [TCMalloc overview]. Available at: https://gperftools.github.io/gperftools/tcmalloc.html (accessed 24.10.2018) [5] Jemalloc Documentation [jemalloc home page]. Available at: http://jemalloc.net/ (accessed 24.10.2018) 543