Mathematical Support for Statistical Research Based on Informational Technologies

Roman Kaminskyi, Nataliia Kunanets and Antonii Rzheuskyi
Information Systems and Networks Department, Lviv Polytechnic National University, Stepan Bandera street, 32a, 79013, Lviv, Ukraine
Roman.M.Kaminskyi@lpnu.ua, nek.lviv@gmail.com, antonii.v.rzheuskyi@lpnu.ua

Abstract. The authors have developed the "case of the information manager", a set of tools for mathematical and statistical analysis intended for those who work with software only "at the user level" and implemented by means of the MS Excel spreadsheet processor. The "case of the information manager" includes the following methods: descriptive statistics, correlation analysis, classification and cluster analysis methods, the paired comparison method, Saaty’s method (analytic hierarchical procedure) and others.

Keywords: "case of the information manager", correlation analysis, ranking method, paired comparison method, Saaty’s analytic hierarchical procedure, hierarchical agglomerative cluster analysis.

1 Introduction
The use of information technologies in various areas of human activity has given rise to interest in applying mathematical and statistical methods in the humanities, in particular among library staff who act as information managers. Lacking strong mathematical training, they usually rely on simple methods to improve their work, using personal computers and the elementary mathematical tools of the MS Excel spreadsheet. This situation poses a relevant scientific problem: the development of a special manual on the application of mathematical methods for library workers. The aim of this work is to select simple and effective mathematical and statistical methods that support the decision-making of an information manager.
The first of these methods, which are supported by information managers, are the following: the correlation method, the ranking method, the paired comparison method, Saaty’s analytic hierarchical procedure and hierarchical agglomerative cluster analysis. Each of these methods is implemented as a simple step-by-step algorithm.

2 Correlation Analysis
This method was used to determine the differences between reading rooms with respect to their attendance by readers. The data are the number of visitors per room per month. Pairwise and partial correlation coefficients were determined. A partial coefficient such as R12.3 measures the relationship between rooms № 1 and № 2 with the influence of room № 3 excluded: R12.3 = (R12 − R13·R23) / √((1 − R13²)(1 − R23²)). The names of the reading rooms and the calculation results are given in Table 1.

Table 1. The results of correlation analysis.
Pairwise correlation coefficients
  Reading room № 1 and room of abstracts № 2       R12 = 0.689
  Reading room № 1 and room of patents № 3         R13 = 0.823
  Room of abstracts № 2 and room of patents № 3    R23 = 0.688
Partial correlation coefficients
  Rooms № 1 and № 2 relative to room № 3           R12.3 = 0.297
  Rooms № 1 and № 3 relative to room № 2           R13.2 = 0.664
  Rooms № 2 and № 3 relative to room № 1           R23.1 = 0.293

Thus, it can be concluded that rooms № 1 and № 3 are the most similar in terms of attendance: they have the highest pairwise coefficient, and their relationship remains strong (R13.2 = 0.664) after the influence of room № 2 is excluded.

3 Ranking Method
The ranking method is the simplest way to evaluate the impact of n factors by determining their weights, i.e. by establishing their ranks. The essence of the method is illustrated by the following example. It is necessary to determine the order in which 7 rules for the use of the readers' collection should appear in an instruction. A group of 5 specialists was invited to solve this problem. Each of them assessed the place of each rule on a 10-point scale. The results are shown in Table 2.

Table 2. Determination of the order of the instruction items.
Rules     Р1    Р2    Р3    Р4    Р5    Р6    Р7
Points    18    17    33    35    26    23    20
Weights   0.10  0.10  0.20  0.20  0.15  0.13  0.12

The rules should appear in the instruction in this order: Р4, Р3, Р5, Р6, Р7, Р1, Р2. Although Р1 scored higher than Р2 and Р4 higher than Р3, the weights shown for each of these pairs are equal as a result of rounding. Ranking is also the easiest method for evaluating staff: the work results of the employees are compared, and the employees are ranked from the best to the worst or vice versa. Ranking methods also make it possible to compare employees with each other according to selected criteria.

4 Paired Comparison Method
The paired comparison method is used when a group of objects must be arranged in a certain sequence, i.e. when the rank of each object must be determined according to preference. For example, a library offers 6 services, among which the most demanded must be identified. Three experts were invited. The weights of the alternatives determined by the experts are shown in Table 3.

Table 3. The weights of the alternatives.
V1     V2     V3     V4     V5     V6
0.207  0.184  0.156  0.076  0.187  0.191

Thus, according to the experts, the most demanded service is V1, followed by V6, V5, V2, V3 and V4.

5 Saaty’s Analytic Hierarchical Procedure
This method involves decomposing the problem into simple components and processing the judgments of the decision maker. The essence of the procedure is the quantitative expression of qualitative considerations. The structure of the problem is represented as a hierarchy whose top is the goal and whose lower levels are the criteria and the alternatives. The procedure includes pairwise comparisons of both the criteria and the alternatives. The quantitative values are estimates on a ratio scale. For example, five candidates for the post of librarian, О, І, S, N and J, were assessed against 5 criteria. Based on the interview results, a matrix of pairwise comparisons was compiled for each criterion and the normalized vector of local priorities was determined from it; these vectors form the columns of the matrix B, and the vector N is constructed for it.
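How a normalized vector of local priorities can be obtained from one matrix of pairwise comparisons may be sketched as follows. This is only an illustration under stated assumptions: the source does not say which computation was used in MS Excel, so the sketch uses the row geometric mean, a common approximation of the principal eigenvector in Saaty’s procedure. The function name local_priorities and the comparison values are hypothetical and are not the data of this study.

```python
import math


def local_priorities(matrix):
    """Approximate the priority vector of a reciprocal pairwise comparison
    matrix: take the geometric mean of each row, then normalize to sum 1."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments for three candidates on a single criterion.
# Entry [i][j] states how strongly candidate i is preferred to candidate j
# on Saaty's 1-9 ratio scale; entry [j][i] is its reciprocal.
comparisons = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = local_priorities(comparisons)
for name, w in zip(["A", "B", "C"], weights):
    print(f"candidate {name}: {w:.3f}")
```

One such vector is computed per criterion; placed side by side, the vectors give a matrix of local priorities in which every column sums to 1.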
To determine the global priorities, the vector U is constructed by multiplying the matrix B by the vector N: U = B × N. The results are shown in Table 4.

Table 4.
     B                                    N      U
О    0.262  0.143  0.510  0.029  0.090    0.138  0.149
І    0.075  0.495  0.033  0.059  0.255    0.113  0.109
S    0.507  0.047  0.064  0.502  0.138    0.160  0.179
N    0.129  0.074  0.130  0.148  0.478    0.154  0.124
J    0.028  0.241  0.263  0.263  0.039    0.113  0.118

The last column contains, as the matrix product, the values of the vector U of global priorities of the candidates for the position of librarian. Candidate S has the highest global priority. Thus, according to this analytic hierarchical procedure, candidate S is recommended for the post of librarian. This method is widely used in decision support.

6 Hierarchical Agglomerative Cluster Analysis
In the practice of library and information managers, the problem of splitting a group of objects into separate classes often arises. If the objects are described by identical sets of attributes, the problem can be solved by means of cluster analysis. Clustering is often the first step in data analysis. The essence of hierarchical agglomerative cluster analysis is that initially all elements are considered separate clusters; then the distances between them are determined, and the two closest are merged into a new cluster. The procedure is iterative and ends with the union of all objects in one cluster with a clearly defined structure, which is presented as a dendrogram. As an example, Fig. 1 shows, in the form of a dendrogram, the result of the cluster analysis of a group of 20 university libraries based on 12 indicators using the Euclidean distance. Potential clusters, i.e. groups of libraries similar to each other, are marked with horizontal dashed lines.

Fig. 1. Dendrogram of the splitting of the group of university libraries into clusters.
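The iterative merging described in this section may be sketched as follows. It is a minimal illustration, not the authors' MS Excel implementation: the source names only the Euclidean distance, so the single-linkage rule (distance between the closest members of two clusters) is an assumption, the function names are hypothetical, and the five two-indicator "libraries" are invented data rather than the 20 libraries of Fig. 1.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two objects given as attribute tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points):
    """Single-linkage agglomerative clustering (linkage rule is an assumption).
    Returns the merge steps as (cluster_a, cluster_b, distance), where a
    cluster is a sorted tuple of object indices."""
    clusters = [(i,) for i in range(len(points))]  # every object starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # distance between the closest members of the two clusters
                d = min(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


def cut(points, merges, height):
    """Clusters that exist when the dendrogram is cut at the given height.
    Single-linkage merge distances never decrease, so we can stop early."""
    clusters = {(i,) for i in range(len(points))}
    for a, b, d in merges:
        if d > height:
            break
        clusters -= {a, b}
        clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)


# Hypothetical data: 5 libraries described by 2 indicators each.
libraries = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 5.8), (9.0, 1.0)]

merges = agglomerate(libraries)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
print(cut(libraries, merges, 1.0))
```

Calling cut with a larger or smaller height returns fewer or more clusters, which corresponds to drawing a dashed line higher or lower on a dendrogram.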
As a result of splitting this group of libraries into clusters, there are 14 clusters at the lowest level, 4 at the next level, then 3 and then 2. That is, the number of clusters depends on the height of the level at which the dendrogram is cut.

7 Conclusions
The listed methods can be supplemented by many others, but even this set can significantly improve the efficiency of the information manager's work. The main principle of this study is to present each method starting with a detailed consideration of the solution of a specific task and a discussion of the result. Therefore, at each stage of the method used, it is necessary to give a detailed interpretation of the intermediate result obtained. Practical verification of the methods and discussion with their direct users confirmed the validity of this approach and the effectiveness of the methods in the practice of information managers.