
International Workshop on Data Mining and Knowledge Engineering, October

Designing of Information System for Semantic Analysis and Classification of Issues in Service Desk System

Ksenia Lokhacheva

Denis Parfenov

Maria Lapina

North-Caucasus Federal University, Pushkin St., 1, Stavropol, 355017, Russia

Orenburg State University, Prospekt Pobedy, 13, Orenburg, 460018, Russia

2020


The paper describes the design of an information system for semantic analysis and classification of issues in a Service Desk system. The concept of a Service Desk system and the problems of using it are described; several mathematical models and methods of text analysis and text classification are studied; the system usage options are analyzed, and a system scheme and a class diagram are constructed.


1. Introduction

Most companies work with clients in one way or another and provide a user support service. In addition, technical support of internal processes is of great importance for successful company management.

In [3], the negative consequences of poorly organized work of a Technical Support Department are described, namely:

• a lack of fixed areas of competence, which creates a misunderstanding of the importance of the functions performed;

• the risk of losing a particular user request among the total volume of requests and managers' orders because the request form is unregulated;

• high dependence of the company's work on a "key" specialist, which occurs when a certain type of work is regularly performed by one employee.

Service Desk systems can ensure high-quality interaction between all participants in the business process. The main tasks of a Service Desk system are receiving and processing requests: the client creates a request (ticket), and Service operators process it. Using a Service Desk system makes it possible to improve the work of all Service operators of the company.

The processes in Service Desk systems cover all the difficulties that arise in the work of the IT Department [4]:

• Incident Management
• Problem Management
• Change Management
• Release Management
• Service Level Management
• Financial Management
• Availability Management
• Capacity Management
• Continuity Management
• Information Security Management

Thus, given the described functions and tasks of a Service Desk system, it seems relevant to automate some of its processes using semantic analysis and request classification in order to predict the most likely solution to a problem without additional involvement of specialists.

2. Related works

Natural language text analysis involves two stages:

1. word embedding, which includes parsing, part-of-speech tagging, removal of stop words and digits, and stemming (or lemmatization);

2. model training on pre-labeled data and text classification.
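The first stage can be illustrated with a minimal Python sketch. It is not the implementation proposed in the paper: the stop-word list, the suffix-stripping rule used in place of a real stemmer or lemmatizer, and the function names `naive_stem`, `preprocess`, and `bag_of_words` are all assumptions made for illustration, and English text is used instead of Russian. Parsing and part-of-speech tagging are omitted.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real system would use a full one.
STOP_WORDS = {"the", "a", "an", "is", "to", "my", "on", "in", "and", "of", "it"}

# Naive suffix stripping used as a stand-in for a real stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(token: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop digits and stop words, then stem the remaining tokens."""
    # Only runs of letters are kept, so digits and punctuation are dropped here.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens: list[str], vocabulary: list[str]) -> list[int]:
    """Count how often each vocabulary term occurs in the token list."""
    return [tokens.count(term) for term in vocabulary]


tokens = preprocess("The printer is not printing 2 pages")
print(tokens)
print(bag_of_words(tokens, ["page", "print", "toner"]))
```

The resulting count vector is the kind of numeric representation that the second stage (model training and classification) takes as input.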

Because automatic processing of text information is becoming increasingly relevant and in demand, there is now a large number of studies on model training methods.

In [1] and [2], a comparative analysis of text classification methods is carried out. Both papers present a formal formulation of the text classification problem, describe classification methods, and compare classifier training methods based on machine learning, including the Bayes method, the k-nearest neighbors algorithm, the least squares method, the support vector machine, and methods based on artificial neural networks. The main criterion for evaluating classification quality was a combination of precision and recall. The study [1] concluded that the best balance of these characteristics is achieved with the support vector machine and the convolutional neural network. The Bayes method is among the fastest, but its accuracy varies between experiments. According to the study [2], the least squares method showed the best results in terms of recall, while the support vector machine was the best in terms of precision. A comparative analysis of the considered classification methods based on studies [1] and [2] is presented in table 1.
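To make the evaluation criteria concrete, the following Python sketch computes precision and recall for one class from lists of actual and predicted labels. The F1 score is shown as one common way to combine the two; the cited studies are not stated to use exactly this combination, and the labels and function names are hypothetical.

```python
def precision_recall(actual: list[str], predicted: list[str], label: str) -> tuple[float, float]:
    """Return (precision, recall) for a single class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == label and p == label)  # true positives
    fp = sum(1 for a, p in pairs if a != label and p == label)  # false positives
    fn = sum(1 for a, p in pairs if a == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed combination, see lead-in)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical labels for five requests.
actual = ["network", "network", "printer", "printer", "account"]
predicted = ["network", "printer", "printer", "printer", "network"]

p, r = precision_recall(actual, predicted, "printer")
print(p, r, f1_score(p, r))
```

Precision penalizes requests wrongly routed to a class, while recall penalizes requests of that class that were routed elsewhere, which is why the two are reported together.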

3. Problem statement

The goal is to design an information system for semantic analysis and classification of issues in a Service Desk system. Typically, Service Desk systems use a three-tier client-server architecture in which the client (user interface), application (hardware and software), and data (DB and DBMS) tiers are physically separated.

The options for using the Service Desk system are shown in figure 1.

We assume that each request submitted to the Service Desk system is pre-processed before it is added to the list of requests to be executed. The pre-processing consists of semantic analysis of the semi-structured data extracted from the issue, classification of the issue (finding the most appropriate executing Department (or team) within the Technical Support Department), and selection of a possible solution based on an analysis of the solutions of previously closed issues of the same category.
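The pre-processing step described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not fix a classifier, so a k-nearest-neighbors vote over word-count vectors with cosine similarity stands in for it, and the `ClosedIssue` type, the function names, and the sample data are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class ClosedIssue:
    text: str
    department: str
    solution: str


def vectorize(text: str) -> Counter:
    """Very simple vectorization: lower-cased word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def preprocess_request(text: str, history: list[ClosedIssue], k: int = 3) -> tuple[str, str]:
    """Return (department, possible solution) for a new request."""
    if not history:
        raise ValueError("no closed issues to learn from")
    query = vectorize(text)
    ranked = sorted(history, key=lambda i: cosine(query, vectorize(i.text)), reverse=True)
    # Classification: majority vote among the k most similar closed issues.
    department = Counter(i.department for i in ranked[:k]).most_common(1)[0][0]
    # Solution: taken from the most similar closed issue of the same category.
    solution = next(i.solution for i in ranked if i.department == department)
    return department, solution


history = [
    ClosedIssue("printer does not print", "Hardware", "Reinstall the printer driver"),
    ClosedIssue("printer paper jam", "Hardware", "Remove the jammed paper"),
    ClosedIssue("cannot log in to account", "Accounts", "Reset the password"),
    ClosedIssue("vpn connection drops", "Network", "Restart the VPN client"),
]
print(preprocess_request("office printer does not print pages", history))
```

Separating the department vote from the solution lookup mirrors the order in the text: the issue is classified first, and the solution is then chosen only among closed issues of that category.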

After pre-processing, the request is added to the list of requests to be executed by a specific Department. Employees of this Department can assign any request to themselves. If, after first reviewing the issue, the technical service operator agrees with the classification result, the operator can review the proposed solution and try to apply it. If the solution offered by the system does not help, the operator notes this in the issue description and proposes a new one. If at some point during execution it becomes clear that the classification was incorrect, the operator can unassign the issue and move it to the list of general open issues. After executing and closing an issue from the list of general open issues, the employee who executed it must leave appropriate comments on the task (about the executing Department and the correct solution).
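The request life cycle described in this paragraph can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not the design from the paper: the `Status` values, the `Ticket` class, and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    DEPARTMENT_QUEUE = "department queue"
    ASSIGNED = "assigned"
    GENERAL_OPEN = "general open issues"
    CLOSED = "closed"


@dataclass
class Ticket:
    department: str                      # Department chosen by the classifier
    solution: str                        # solution proposed by the system
    status: Status = Status.DEPARTMENT_QUEUE
    assignee: Optional[str] = None
    misclassified: bool = False
    notes: list[str] = field(default_factory=list)

    def _require_assigned(self) -> None:
        if self.status is not Status.ASSIGNED:
            raise ValueError("issue is not assigned to an operator")

    def assign(self, operator: str) -> None:
        """An employee takes the issue from a Department or general list."""
        if self.status not in (Status.DEPARTMENT_QUEUE, Status.GENERAL_OPEN):
            raise ValueError("only queued issues can be assigned")
        self.assignee, self.status = operator, Status.ASSIGNED

    def reject_solution(self, new_solution: str) -> None:
        """Record that the proposed solution did not help and offer a new one."""
        self._require_assigned()
        self.notes.append(f"Proposed solution did not help: {self.solution}")
        self.solution = new_solution

    def detach(self) -> None:
        """Classification was wrong: move the issue to the general open list."""
        self._require_assigned()
        self.assignee, self.status, self.misclassified = None, Status.GENERAL_OPEN, True

    def close(self, department: Optional[str] = None, solution: Optional[str] = None) -> None:
        """Close the issue; misclassified issues need the correct labels."""
        self._require_assigned()
        if self.misclassified:
            if department is None or solution is None:
                raise ValueError("executing Department and correct solution are required")
            self.department, self.solution = department, solution
            self.notes.append(f"Corrected: {department} / {solution}")
        self.status = Status.CLOSED


ticket = Ticket("Hardware", "Reinstall the printer driver")
ticket.assign("operator_1")
ticket.detach()                      # the issue was not a hardware problem
ticket.assign("operator_2")
ticket.close("Network", "Replace the network cable")
print(ticket.status, ticket.department, ticket.notes)
```

Requiring the corrected Department and solution on closing is what turns a misclassified issue into new labeled data for later retraining.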

As a result, the options for interacting with the proposed system are as shown in figure 2.

4. System design

The scheme of the system under development is shown in figure 3. Here, Issues Data, Vocabulary, and Marked Data Storage are components of the Data Storage.

The user submits a request to the Service Desk system, and its data is stored in the Issues Data storage. The entire request is then vectorized using vocabularies (databases) of the Russian language. The marked data is sent to the appropriate storage, from which the Issue Classification Module retrieves it. After classification, the current issue is indexed in order to update the information in the Marked Data Storage. In addition, after issue classification, a possible solution should be proposed. As the output, the system transforms the original request by adding the assigned task class, the executing Department, and the possible solution for the issue.

The class diagram of the system includes the following entities:

• InitialOrder, responsible for the initial information of a received issue. This entity contains the following attributes: the issue identification number (orderId); the issue body (orderBody); information about the issue author (author); information about the Department where the issue author works (authorDepartment), through which this entity is linked with the “Departments” DTO by an association relationship; and a list of tags that the author could add to the issue description to specify the problem (tags).

• TransformedOrder, responsible for information about the transformed request. This entity inherits the attributes of the InitialOrder entity and also has: a) the transformed issue identification number (newOrderId); b) the vector representation of the issue body (wordVec); c) the system-selected request type (class) (recomendedOrderType) and the actual request type (class) (actualOrderType), which link the TransformedOrder entity to the “OrderTypes” DTO; d) the system-selected Department whose employees could solve the issue (recomendedActorDepartment) and the actual Department whose employees solved the issue (actualActorDepartment), which associate the TransformedOrder entity with the “Departments” DTO; e) the system-selected issue solution (recomendedSolution) and the actual issue solution (actualSolution), which link the TransformedOrder entity to the “Solutions” DTO.

• OrderType, responsible for the classification type of the issue. The “OrderTypes” DTO is associated with the OrderType entity by an aggregation relationship and stores information about all possible order types (the “types” attribute).

• Solution, responsible for the type of issue solution. The “Solutions” DTO is associated with the Solution entity by an aggregation relationship and stores information about all possible types of request solutions (the “solutions” attribute).

These entities are the main components of Issues Data and the Marked Data Storage (figure 3).
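The entities listed above can be sketched as Python dataclasses. Attribute names (including their original spelling) follow the text; everything else is an assumption made for illustration: the `Department` entity, the `name` and `description` fields, the `departments` attribute of the “Departments” DTO, the concrete field types, and the choice to reference entities directly instead of going through the DTOs. Python is used here only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Department:
    name: str  # assumed attribute, not listed in the text


@dataclass
class OrderType:
    name: str  # assumed attribute


@dataclass
class Solution:
    description: str  # assumed attribute


# DTOs that aggregate all known entities of one kind.
@dataclass
class Departments:
    departments: list[Department]  # attribute name assumed


@dataclass
class OrderTypes:
    types: list[OrderType]


@dataclass
class Solutions:
    solutions: list[Solution]


@dataclass
class InitialOrder:
    orderId: int
    orderBody: str
    author: str
    authorDepartment: Department
    tags: list[str]


@dataclass
class TransformedOrder(InitialOrder):
    newOrderId: int
    wordVec: list[float]
    recomendedOrderType: OrderType
    recomendedActorDepartment: Department
    recomendedSolution: Solution
    # The actual values are unknown until the issue is closed.
    actualOrderType: Optional[OrderType] = None
    actualActorDepartment: Optional[Department] = None
    actualSolution: Optional[Solution] = None


order = TransformedOrder(
    orderId=1,
    orderBody="Printer does not print",
    author="j.smith",
    authorDepartment=Department("Accounting"),
    tags=["printer"],
    newOrderId=101,
    wordVec=[0.0, 1.0, 1.0],
    recomendedOrderType=OrderType("Hardware failure"),
    recomendedActorDepartment=Department("Hardware support"),
    recomendedSolution=Solution("Reinstall the printer driver"),
)
print(order.recomendedActorDepartment.name, order.actualSolution)
```

Keeping the recommended and actual values side by side in one record makes it straightforward to compare the system's output with the operator's final decision.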

The designed system will be implemented as a plug-in for Jira, one of the most widely known Service Desk systems.

5. Conclusion

The paper describes the design of an information system for semantic analysis and classification of issues in a Service Desk system. The following points are covered:

1. the concept of a Service Desk system and the problems of using it are described;

2. several mathematical models and methods of text analysis and text classification are studied;

3. the information system for semantic analysis and classification of issues in a Service Desk system is designed: the system usage options are analyzed, and a system scheme and a class diagram are constructed.

To implement this system, further research is necessary into vectorization methods, classification methods, and solution recommendation methods that are compatible with the Atlassian SDK.

Acknowledgments

The study was carried out with the financial support of a grant from the President of the Russian Federation for state support of leading scientific schools of the Russian Federation (NSh2502.2020.9).

References

[1] A.I. Kadhim "Survey on supervised machine learning techniques for automatic text classification." Artificial Intelligence Review 52.1 (2019): 273-292.

[2] A.K. Abasi, A.T. Khader, M.A. Al-Betar, S. Naim, S.N. Makhadmeh, Z.A.A. Alyasseri "Link-based multi-verse optimizer for text documents clustering." Applied Soft Computing 87 (2020): 106002.

[3] J. Kilpeläinen "Automating knowledge work of service desk: Machine learning model for software robot." (2019).

[4] S.P. Paramesh, K.S. Shreedhara "Automated IT service desk systems using machine learning techniques." Data Analytics and Learning. Springer, Singapore, 2019. 331-346.

[5] M. Younas, K. Wakil, M. Arif, A. Mustafa "An automated approach for identification of non-functional requirements using Word2Vec model." Int. J. Adv. Comput. Sci. Appl 10 (2019).

[6] C.D. Manning "The Stanford CoreNLP Natural Language Processing Toolkit." Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations (2014): 55-60.

[7] M. Dli "Application of Fuzzy Decision Trees for Rubricating Unstructured Electronic Text Documents." Proceedings of the IS-2019 Conference (2019): 108-118.

[8] K. Hu "Understanding the topic evolution of scientific literatures like an evolving city: Using Google Word2Vec model and spatial autocorrelation analysis." Information Processing & Management 56.4 (2019): 1185-1203.

[9] M.J. Kusner "From word embeddings to document distances." ICML'15: Proceedings of the 32nd International Conference on Machine Learning (2015): 957-966.