=Paper=
{{Paper
|id=Vol-2699/paper13
|storemode=property
|title=Improving Spare Part Search for Maintenance Services using Topic Modelling
|pdfUrl=https://ceur-ws.org/Vol-2699/paper13.pdf
|volume=Vol-2699
|authors=Anastasiia Grishina,Milosh Stolikj,Qi Gao,Milan Petkovic
|dblpUrl=https://dblp.org/rec/conf/cikm/GrishinaSGP20
}}
==Improving Spare Part Search for Maintenance Services using Topic Modelling==
Anastasiia Grishina^a, Milosh Stolikj^b, Qi Gao^b and Milan Petkovic^{a,b}
a Eindhoven University of Technology, Den Dolech 2, 5612 AZ, Eindhoven, The Netherlands
b Philips Research, High Tech Campus 34, 5656 AE, Eindhoven, The Netherlands

Abstract
To support the decision-making process in various industrial applications, many companies use knowledge management and Information Retrieval (IR). In an industrial setting, knowledge is extracted from data that is often stored in a semi-structured or unstructured format. As a result, Natural Language Processing (NLP) methods have been applied to a number of IR steps. In this work, we explore how NLP, and topic modelling in particular, can be used to improve the relevance of spare part retrieval in the context of maintenance services. The proposed methodology extracts topics from short maintenance service reports that also include part replacement data. The intuition behind the proposed methodology is that every topic should represent a specific root cause. Experiments were conducted on an ad-hoc retrieval system of service case descriptions and spare parts. The results show that our modification improves a baseline system, thus boosting the performance of maintenance service solution recommendation.

Keywords
Entity retrieval, spare part search, decision support, maintenance services, natural language processing, topic modelling

1. Introduction

Information retrieval systems are gaining importance in various industrial applications. We can observe the emergence of knowledge-based systems that support the decision-making process in construction, aviation, equipment maintenance and other areas [1, 2]. In these settings, knowledge is frequently extracted from data that is captured in legacy systems using natural language and stored in a semi-structured or unstructured format. As a result, linguistic and statistical NLP methods have been applied to a number of IR steps, such as document and query modelling, query expansion and search result clustering based on semantic similarities [3, 4, 5, 6].

In this work, we explore how NLP, and particularly topic modelling, can be used to improve spare part retrieval that serves the purpose of medical equipment maintenance. In particular, we focus on remote system diagnostics that takes place when the equipment malfunctions, i.e. stops working according to its specification. The problem may be resolved in several ways, one of which is the replacement of one or more (malfunctioning) parts. We conducted our research in the context of an ad-hoc entity retrieval system which helps engineers to search for relevant historical service reports and identify the most probable service solution. Therefore, target retrieval entities are equipment components, i.e. parts to be replaced. In practice, one case may require multiple parts to be replaced.

To address the challenge of spare part retrieval, we create an NLP pipeline that pre-processes short textual descriptions of maintenance activities and apply topic modelling to categorize the descriptions of past cases. From relevant maintenance service reports, the proposed methodology extracts topics, each of which may indicate a specific root cause. Once categorized, cases and parts are easier to examine and more relevant to a particular type of failure. An engineer can address topics sequentially and choose among parts related to the same topic. Therefore, we exploit term co-occurrences and their semantic correspondences using topic modelling to enhance the relevance of target entity retrieval. Although the use case assumes that a number of parts will ultimately be suggested based on past maintenance records, the problem statement does not fall under the vastly explored area of recommender systems, which involves user preference modelling.
Proceedings of the CIKM 2020 Workshops, October 19-20, 2020, Galway, Ireland
a.grishina@tue.nl (A. Grishina); m.stolikj@philips.com (M. Stolikj); q.gao@philips.com (Q. Gao); milan.petkovic@philips.com (M. Petkovic)
ORCID: 0000-0003-3139-0200 (A. Grishina)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

To evaluate the difference introduced by the proposed component, we use IR metrics that are customized to characterize the relevance and completeness of a set of retrieved entities. They measure how far in the list of search results all the required parts are present, indicate if at least one required entity is retrieved and whether all needed parts are present among the top K search results. The main contributions of the work are as follows:

• we enhance the performance of an industrial entity retrieval system by learning semantic correspondences between short historical descriptions of events associated with the entities;
• we approach the challenge of spare parts retrieval in remote system diagnostics and maintenance of industrial equipment using topic modelling to group extracted historical cases and parts under topics that should represent failure root causes;
• we evaluate the proposed method on a real world dataset using customized information retrieval metrics.

The remainder of this paper is organized as follows. We present the problem formulation and a baseline part retrieval system in Section 2. The methodology of combining the text mining pipeline and the entity retrieval process is described in Section 3. Section 4 is dedicated to a dataset description and methods implementation. We discuss experimental results in Section 5 and related work in Section 6. The paper is concluded by Section 7, where we also mention possible directions for future work.

2. Problem Description

In the scope of this work, entity descriptions are composed of equipment characteristics and represented by maintenance case reports registered in the retrieval system. Entities to be retrieved are the parts recommended for replacement to troubleshoot a machine referred to in a new malfunction report. Queries may contain various characteristics of a new maintenance case that should be treated by a maintenance service team. An entity, i.e. a spare part, is identified with a unique ID and is related to a case description. One historical maintenance case can have several parts associated with it; similarly, a new service case may require a set of different parts.

The knowledge base of maintenance cases is updated with the help of service engineers. They submit maintenance reports for every equipment failure or customer complaint as short technical texts, often in multiple languages (English and a locally spoken language). Each historical report includes a number of logs, such as the time of customer complaint registration, a textual description of maintenance activities and IDs of parts used to solve the issue. Hence, the reports might contain abbreviations, software logs sent by a machine, as well as natural language descriptions of a machine state on every step of the maintenance process. Closed cases are uploaded to the collection of historical cases that could be mined using the above mentioned ER system.

To present the setting in a formal way, let q be a query performed by a service engineer while working on a case. We will use the term query case to indicate such cases. Each query is associated with a single maintenance case. The list of parts replaced in a case c is P(c). We use C(q) to denote a list of cases retrieved for the query q. The set of parts replaced in all retrieved cases is denoted by P(q) = ∪_{c ∈ C(q)} P(c), and a set of ranked parts recommended for replacement is expressed by P_K(q) ⊆ P(q).

3. Methodology

The method proposed in this work combines a baseline entity retrieval setting and an add-on topic modelling component, as described below.

3.1. Baseline Entity Retrieval System

The baseline entity search system in question is empowered with a two-step retrieval mechanism. A database of entity descriptions lies in the foundation of the mechanism. It consists of entity descriptions retrieval followed by the final entity retrieval and ranking, as explained in detail below.

3.1.1. Retrieval of Entity Descriptions

At the first step of the entity search, the system retrieves relevant descriptions using a Vector Space Model (VSM) with the Okapi BM25 similarity score [7, 8]. VSM is a document and query representation model that converts texts to N-dimensional vectors of term weights, where N is the number of words in a dictionary. Terms are simply the words or groups of words present in the collection of documents. The dictionary is built from a text corpus and includes distinct terms. The intuition behind VSM is that retrieved documents will be ranked according to a similarity function computed for a query and a document, i.e. vectors in a vector space.

In the context of our problem description, for a query q containing keywords {q_i}_{i=1}^{n} and a maintenance case description d with fields {d_j}_{j=1}^{m}, the Okapi BM25 similarity score could be expressed as follows:

BM25(q, d) = Σ_{i=1}^{n} Σ_{j=1}^{m} IDF(q_i) · f(q_i, d_j) · (k1 + 1) / ( f(q_i, d_j) + k1 · (1 − b + b · L_dj / L_avgj) )    (1)

Here, f(q_i, d_j) is the frequency of the keyword q_i in a field d_j of the case description d.
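The fielded scoring of Eq. (1)-(2) can be sketched in plain Python. This is a minimal illustration, not the production system (which uses Elasticsearch): case descriptions are assumed to be dicts of named text fields, whitespace tokenization stands in for the real analysis chain, and the field names in the usage are hypothetical.

```python
import math

# Sketch of the fielded Okapi BM25 score of Eq. (1)-(2).
# `case` is one case description; `collection` is the list of all cases.
def bm25(query_terms, case, collection, k1=1.2, b=0.75):
    score = 0.0
    n_cases = len(collection)
    for q in query_terms:
        # n(q_i): number of case descriptions containing the term in any field
        df = sum(1 for c in collection
                 if any(q in c[f].split() for f in c))
        idf = math.log((n_cases - df + 0.5) / (df + 0.5))   # Eq. (2)
        for f in case:
            words = case[f].split()
            freq = words.count(q)                           # f(q_i, d_j)
            # L_avgj: average length of field f over the collection
            avg_len = sum(len(c[f].split()) for c in collection) / n_cases or 1.0
            norm = k1 * (1 - b + b * len(words) / avg_len)
            score += idf * freq * (k1 + 1) / (freq + norm)  # Eq. (1)
    return score
```

A case whose fields actually mention the query keywords then scores higher than one that does not, which is exactly the ranking criterion used in the first retrieval step.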
Here, L_dj is the length of the field d_j in terms of words, and L_avgj is the average length of the field j in descriptions of all cases in the collection C. Variables k1 and b are tuning parameters that control how much every new occurrence of a term impacts the score and the document length scaling, correspondingly. Inverse Document Frequency is calculated as:

IDF(q_i) = log( (N − n(q_i) + 0.5) / (n(q_i) + 0.5) ),    (2)

where N is the total number of cases, i.e. N = |C(q)|, and n(q_i) is the number of case descriptions that contain the query term q_i. Therefore, the case d_s1 ∈ C(q) is ranked higher than d_s2 ∈ C(q) iff BM25(q, d_s1) > BM25(q, d_s2).

3.1.2. Entity Retrieval and Ranking

The second step realizes the entity retrieval. It ranks spare parts associated with the retrieved cases based on the frequency of their occurrence and the rank of the case where they occur. Thus, the most frequent parts that occur in top ranked cases appear higher on the final list of retrieved parts than a part that appears the same number of times lower on the case list. Several proprietary filters are applied as well, but they do not affect the methodology. The algorithm for part recommendation is presented in Algorithm 1.

Algorithm 1 Part Recommendation
Input: query q associated with a maintenance case, number of parts to recommend K
Output: a list of recommended parts P_K(q)
  count ← {}    ▷ # occurrences of part combinations
  R(q) ← {}     ▷ retrieved parts
  P_K(q) ← {}   ▷ recommended parts
  for c ∈ C(q) do
    P(c) ← get_part_IDs(c)
    if P(c) ∈ R(q) then
      count(P(c)) ← count(P(c)) + 1
    else
      R(q) ← R(q) ∪ P(c)
      count(P(c)) ← 1
    end if
  end for
  sort(R(q), using=count(P(c)), order=DESC)
  for P(c) ∈ R(q) do
    P(q) ← P(q) ∪ P(c)
    drop_duplicates(P(q))
  end for
  P_K(q) ← top_K(P(q))
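Algorithm 1 can be sketched as follows. This is an illustrative reading of the pseudocode, assuming each retrieved case has already been reduced to its list of part IDs, with the cases given in retrieval rank order (ties between equally frequent part combinations then keep that order).

```python
from collections import Counter

# Sketch of Algorithm 1: count how often the same part combination
# reappears across the retrieved cases, rank combinations by frequency,
# then flatten into a deduplicated part list and keep the top K.
def recommend_parts(retrieved_cases, k):
    count = Counter()                      # occurrences of part combinations
    for parts in retrieved_cases:
        count[tuple(parts)] += 1
    ranked, seen = [], set()
    for combo, _ in count.most_common():   # most frequent combinations first
        for part in combo:                 # flatten, dropping duplicates
            if part not in seen:
                seen.add(part)
                ranked.append(part)
    return ranked[:k]                      # P_K(q)
```

`Counter.most_common` preserves insertion order among ties, so a combination first seen in a higher-ranked case stays ahead of an equally frequent one from a lower-ranked case.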
3.2. Topic Modelling Component

Transformation of the historical cases and parts retrieval pipeline is performed by adding a component that groups retrieved cases under a number of topics and ranks the parts within the topics. Figure 1 shows the baseline architecture (a) and the modification that includes the proposed topic modelling component (b).

The topic modelling component could be considered as an individual NLP pipeline with a number of steps. The pipeline includes tokenization, lemmatization, removal of stop phrases, building a dictionary of tokens, term weighting and topic modelling. Tokenization of the text refers to splitting it into units or tokens that represent individual words or sometimes groups of words [9]. The process of lemmatization involves finding the initial forms of the inflected words, also referred to as root forms or lemmas. A lemma is a word in its canonical form that exists in the dictionary of the used language. For example, the lemma for do, doing, did is the word do. Next, term weighting refers to assigning weights to tokens. We utilize term frequency or bag-of-words weights as a term weighting scheme. It associates a term with a weight proportional to the frequency of the term occurrence in the corpus of documents.

For topic modelling, we use Latent Dirichlet Allocation (LDA), one of the most popular algorithms for automatically extracting topics. LDA is based on a generative probabilistic language model [10]. The purpose of LDA is to learn the representation of a fixed number of topics and derive the topic distribution for every document in a collection. Every maintenance service case is assigned a topic according to the maximum probability of the case belonging to a topic.

4. Evaluation

In this section, we describe the real world dataset that is extracted from the baseline part retrieval system. We also discuss the metrics used to evaluate the performance of the baseline system and compare it to the configuration with the integrated topic modelling component.

Figure 1: Integration of the topic modelling component (b) in a baseline two-step document and part retrieval system (a).

Figure 2: Distribution of queries over the number of cases and parts retrieved in response to the queries.

4.1. Dataset Description
For our experiments, we use a proprietary dataset composed of historical maintenance cases. Textual fields of case descriptions have been aggregated into one field per maintenance case and serve as input to LDA during the training and testing stages. The majority of cases are written in mixed languages. Figure 2 presents the distribution of queries over their characteristics: the number of retrieved service cases, retrieved ranked parts to replace, and parts replaced in the query case. The majority of queries retrieved up to 200 similar case descriptions; however, this number could reach 1000 cases. The number of unique recommended parts retrieved from these cases was below 350 in general, while the majority of queries retrieved 0-10 parts. The number of parts required to treat a maintenance case associated with the query was equal to 5 or less in most of the query cases.

For building the LDA model, we use a subset of historical cases written in English. The training set contains data from 101,026 different maintenance cases. For the test set, we use a sample of 1,564 queries performed by service engineers, together with the corresponding cases returned as search results: (q, C(q)). Cases returned for the queries may have a non-empty intersection with the training dataset; however, the cases for which the queries had been created were excluded from the training set.

4.2. Evaluation metrics

Top K ranked parts are used to estimate the Success, Completeness, Recall and min_top_p metrics. Metric@K is computed for a set of retrieved parts |P_K(q)| ≤ K. The operator |·| applied to a set defines the count of set elements. The metrics are calculated as follows:

Completeness@K(q) = 1, if P(c) ⊆ P_K(q); 0, otherwise;

Success@K(q) = 1, if |P(c) ∩ P_K(q)| > 0; 0, if |P(c) ∩ P_K(q)| = 0;

Recall@K(q) = |P(c) ∩ P_K(q)| / |P(c)|;

min_top_p@K(q) = p, such that p ≤ K and Completeness@p(q) = 1.
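The four metrics above translate directly into code. A minimal sketch, where `P_c` is the set of parts actually replaced in the query case and `ranked_parts` is the ranked recommendation list:

```python
# Sketch of the evaluation metrics defined above.
def completeness_at_k(P_c, P_K):
    # 1 iff every consumed part appears among the top-K recommendations
    return 1 if set(P_c) <= set(P_K) else 0

def success_at_k(P_c, P_K):
    # 1 iff at least one consumed part appears among the recommendations
    return 1 if set(P_c) & set(P_K) else 0

def recall_at_k(P_c, P_K):
    # ratio of consumed parts that were recommended
    return len(set(P_c) & set(P_K)) / len(set(P_c))

def min_top_p(P_c, ranked_parts, k):
    # smallest p <= k whose top-p prefix already contains all consumed
    # parts; None ("null" in the paper) if no such p exists
    for p in range(1, k + 1):
        if completeness_at_k(P_c, ranked_parts[:p]):
            return p
    return None
```

For example, if the query case consumed parts {a, b} and the ranked list is [b, a, c], then min_top_p at K = 3 is 2, since the top-2 prefix already covers both parts.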
Completeness measures whether all the used parts were suggested for a troubleshooting report, Success shows if any consumed part was listed among the retrieved parts, and Recall indicates the ratio of retrieved parts that were consumed to the total number of consumed parts. An additional metric, min_top_p, is used to estimate how far in the list of retrieved parts one could find the full list of consumed parts in the query case; it returns null if such p does not exist. As a baseline, we use the initial part retrieval strategy and its statistics for the whole set of retrieved and ranked parts P_R(q). Once topics are computed, the metrics are estimated for parts associated with the cases in every topic t, i.e. a subset of cases and, therefore, parts: P_R(q)(t) = {P(c) | c ∈ C(q) & c ∈ t} instead of P_R(q). All the metrics presented in the paper are evaluated at top K retrieved parts, K = 5, 10.

We discard query cases that did not include information whether some parts were consumed or not (i.e. missing data). If a case did not require any part replacement, we utilize an artificial part called "No parts" and assign an ID to it. In this way, for query cases that were solved without part replacement it is possible to evaluate the performance of part retrieval. The top ranked part in this situation should be "No parts".

The algorithm is set up to learn a symmetric α, a document-topic prior, from data, as well as η, a topic-word prior. The number of iterations is fixed at 100. In addition, we set an empirical parameter for the ratio of English words appearing in the case description, R_EN = 30%. A topic will be derived by LDA trained on the entirely English corpus in case the description contains at least R_EN English words; otherwise the maintenance case will be marked as "topic undefined".
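The English-ratio routing described above can be sketched as follows. This is only an illustration of the R_EN rule: the stop phrases and the tiny English lexicon are hypothetical stand-ins (the actual system uses Gensim's stop word list and a proprietary corpus), lemmatization with spaCy is omitted, and `assign_topic` stands in for the trained LDA model's most-probable-topic lookup.

```python
import re

STOP_PHRASES = {"please advise", "see attached"}    # hypothetical examples
ENGLISH_WORDS = {"error", "tube", "replaced", "restart", "failed"}  # stand-in lexicon
R_EN = 0.30   # minimal ratio of English words for a case to receive a topic

def preprocess(text):
    # tokenization + stop-phrase removal; lemmatization (spaCy in the
    # paper's pipeline) is left out of this sketch
    text = text.lower()
    for phrase in STOP_PHRASES:
        text = text.replace(phrase, " ")
    return re.findall(r"[a-z]+", text)

def topic_or_undefined(text, assign_topic):
    # route a case to the English-trained LDA only if at least R_EN of
    # its tokens look English; otherwise mark it "topic undefined"
    tokens = preprocess(text)
    if not tokens:
        return "topic undefined"
    ratio = sum(t in ENGLISH_WORDS for t in tokens) / len(tokens)
    return assign_topic(tokens) if ratio >= R_EN else "topic undefined"
```

In the full pipeline, `assign_topic` would build a bag-of-words vector from the trained dictionary and return the topic with maximum probability for the case.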
4.3. Implementation

The first step of the initial ER system is powered by Elasticsearch [11]. It performs indexing of the documents in the knowledge base and retrieves them according to Okapi BM25 ranking with the default tuning parameters k1 = 1.2 and b = 0.75.

For the add-on topic modelling component, we utilize Python NLP libraries: Gensim [12] for all the steps including topic modelling, and spaCy [13] for lemmatization. One step that is also customized to the maintenance application is the removal of stop phrases. We use a collection of English stop words pre-defined by Gensim and corpus-specific common phrases, such as questionnaire forms repeated across the majority of cases, since question formulations do not characterize individual cases.

One characteristic of the LDA model is that it provides different topic distributions depending on the random seed used in its initialization. Therefore, every LDA model with the same set of parameters, except for the random seed, is computed several times; these will be referred to as runs further in the text. Afterwards, all the metrics are averaged over the runs to get consistent results and minimize the influence of the algorithm's stochastic behavior. Another control parameter is the number of topics, which spans from 2 to 20 in our experiments.

5. Results and Discussion

In this section, we compare the results of the initial ER architecture evaluation to the results of the modified architecture with the topic modelling component, as well as to the best possible results for the dataset of maintenance cases. We group queries by levels of generalization, which stands for the number of matched cases and retrieved parts in our setting. Moreover, since the number of topics is a hyper-parameter that is not learned via training, we discuss the estimation of a possible number of topics using NLP coherence metrics and compare it with observations of the retrieval system's performance.

5.1. Retrieval Performance at Top K Parts

The performance of maintenance cases and parts retrieval in the initial configuration of the part retrieval system (Baseline) and the configuration with the LDA topic modelling component (LDA) is evaluated using the above described metrics at different K. These results are also compared to the best possible results on the test dataset computed at K = ∞. We report a 95%-level confidence interval of the mean values of 5 runs with different random seeds for LDA initialization in Figure 3.
In addition, we show the ratio of test queries for which the metrics improved with the topic modelling component in comparison to the baseline implementation in Figure 4.

Comparing baseline results at different top K retrieved parts, it can be seen that the values of Success, Completeness, Recall and min_top_p increase with higher K and achieve the possible maximum at K = ∞. min_top_p@∞ is not the target value for this metric, since it is higher than the values of min_top_p@K for any K ≠ ∞, while the goal is to minimize it. Since we target the lowest min_top_p@K possible, this metric is improved when the average value decreases. Overall improvement is observed for the experimental configuration with the topic modelling component. For metrics evaluated at K = 10, the improvement reached 54.5%, 52.6% and 51.8% of the maximum possible improvement for Completeness, Recall and Success. It indicates that the introduced component effectively captures similar cases and, therefore, parts too. The performance improvement influenced by topic modelling is more prominent at smaller values of K, as can be seen from the difference between the average baseline values of Completeness, Recall and Success and those of LDA in Figure 3.

Figure 3: Comparison of different metrics computed for LDA and baseline results in a part retrieval task. Confidence interval of 95% is shown as a box around LDA values.
There is an increase in the ratio of improved queries for Completeness, Recall and Success calculated at smaller K, as depicted in Figure 4. For example, from less than 4% of queries for Recall@10 to around 5.45% for Recall@5. Turning now to the ratio of queries with improved min_top_p@K, it is higher for larger K, since the set of top ranked parts increases with greater K, as does the probability of finding all of the necessary parts among the top K parts. Yet, it is the metric with the most prominent progress according to the ratio of queries that were improved using topic modelling: 10.49% to 11.20% for the LDA configuration.

While for some queries the metrics were improved by the introduction of the LDA component, 0.007% to 0.5% of queries experienced deterioration of Completeness, Recall and Success at different K, and 0.8% to 3.2% of queries for min_top_p@K. This happens, for example, when a number of documents with the right parts suggestion do not appear in the same group. A possible solution (as well as a future work direction) is to integrate domain knowledge into the system and pre-define the number of topics and their characteristic terms to always appear in the same topic.

Figure 4: Ratio of queries for which the performance metrics improved by the topic modelling component. Confidence interval of 95% is shown as a box around LDA values.

5.1.1. Performance Evaluation for Queries Grouped Based on the Number of Retrieved Cases and Parts

The queries are grouped by the number of parts used in the query case and retrieved cases, as well as by the number of retrieved service cases, as demonstrated in Figure A in the Appendix. Similarly to Figure 3, the results are reported with the mention of a 95%-level confidence interval on average for the runs. We distinguish the queries made for service cases that did not require any part replacement and mark them as |P(c)| = 0. The groups of queries that benefited the most from the topic modelling component integration are the following:

1. queries with the number of retrieved cases |C(q)| > 100,
2. queries associated with cases that required 1 ≤ |P(c)| ≤ 10 parts,
3. queries with retrieved and ranked parts 10 < |P_R(q)| ≤ 100.

Therefore, the topic modelling has a positive effect on the queries that result in extensive lists of cases and, thus, parts appearing in those cases. Comparing this result to the distribution of queries in our experimental setting (Figure 2), the positive effect concerns the largest groups of queries.

Figure 5: Coherence metric C_v and IR metric Recall@K, K = 5, 10 computed for 2-20 topics derived by LDA.

5.2. Number of Topics

LDA requires the number of topics to be passed as an input parameter.
In some applications, this value is available as expert knowledge or is motivated by the dataset [14]. Alternatively, a set of coherence metrics could be used to indicate the semantic correspondences within and throughout the derived topics and to evaluate their quality [15]. When a target number of topics is unknown, it could be suggested by the elbow method applied to coherence measures. In our case, the coherence score C_v estimated over 5 LDA instantiations with 2-20 topics resulted in an elbow point between 5 and 9 topics, as shown in Figure 5. However, the best results of the IR evaluation metrics were obtained in the majority of experiments with LDA at K = 5 for 19 topics and at K = 10 for 14 topics, as also demonstrated for Recall@K in Figure 5. In general, the models perform well with 13 or more topics in our experiment. The impact of the number of topics in terms of the chosen evaluation metrics is observed on a smaller scale for 13 or more topics than for the number of topics from 2 to 12.
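The elbow selection described above can be sketched as follows. In the paper, the coherence scores would come from C_v computed over the trained LDA models (e.g. Gensim's `CoherenceModel`); here they are passed in as plain numbers, and the elbow is taken as the candidate with the largest drop in marginal coherence gain — one common reading of the elbow heuristic, not necessarily the authors' exact procedure.

```python
# Sketch of the elbow method over coherence scores.
# topic_counts: candidate numbers of topics (increasing);
# scores: the coherence value obtained for each candidate.
def elbow(topic_counts, scores):
    best_k, best_bend = topic_counts[1], float("-inf")
    for i in range(1, len(scores) - 1):
        # gain before this point minus gain after it: a large value
        # means the curve flattens here
        bend = (scores[i] - scores[i - 1]) - (scores[i + 1] - scores[i])
        if bend > best_bend:
            best_bend, best_k = bend, topic_counts[i]
    return best_k
```

Run on a coherence curve that rises steeply and then flattens, the function returns the topic count at the bend of the curve.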
Related Work has been proposed using text segmentation and scoring to serve the use case of Telecom Hardware Areas related to our research span across entity remote user assistance. The authors in [24] propose a retrieval and knowledge management in industrial two-step method for spare part demand forecasting applications that correspond to the scope of our work that predicts the number of repairs and the number while the use of topic modelling in IR is related to the of parts needed for a repair. Our work combines methodology used in this paper. processing of a historical case base, but is not focused on spare part demand forecasting for general 6.1. Entity Retrieval Overview planning. It rather considers individual maintenance cases and addresses a lower level of granularity. Entity retrieval (ER) is defined in [16] as βthe task of Processing of Technical Documents Studies answering queries with a ranked list of entities.β The apply NLP as a tool for extracting knowledge from area of entity retrieval is closely connected to IR and natural texts in industrial log mining [25, 22, 26], mining technical documentation [27], classification of In a number of research works, a combination of system failures and preventive maintenance [28, 23]. topic modelling and IR is applied to small texts [34]. The study [22] applies an NLP approach to For instance, the paper [35] describes a method that maintenance data concerning a part of the Swedish first pools similar tweets using an IR approach, railway system and identifies frequent failure cases merges relevant short texts in a larger document and on the railways. Text mining and NLP techniques are trains LDA model on concatenated documents thus applied in [23] to analyze and classify the obtaining richer topics. By contrast, our method construction site accidents using the data from addresses a domain-specific collection of short texts Occupational Safety and Health Administration. 
In written in so-called telegraph style with spelling this setting, an ensemble method was used to obtain mistakes and domain-related abbreviations. Tfidf matrix and a sequential quadratic parsing Search Results Clustering To date, several stud- method to assign weights to 5 classifiers. ies have investigated document and language models The work [29] focuses on building Machine based on topics and clusters. The work [36] explored Learning (ML) models to estimate future duration of a cluster-based retrieval of documents, a mechanism maintenance activities by identifying problem, that returns a relevant cluster of documents, and solution and items features via text mining for proposed two language models for ranking the pre-processing followed by neural networks and clusters of documents and smoothing the documents decision trees for prediction. NLP is used to mine using clusters. By contrast, some works cluster electronic documents composed of free-form text to search results using traditional ML, graph-based and extract terms of interest, the hierarchy of their con- rank-based clustering techniques [6, 37]. For texts and form a set of normalized terms including instance, Lingo algorithm [38] focuses on learning multi-word terms for further data analysis in [30]. phrases to represent clusters in a human-readable Therefore, problems addressed in maintenance way and then it discovers topics using Tfidf weight- services application domain are diverse in nature. ing, performs term-document matrix reduction with However, to the best of our knowledge the current SVD and matches the extracted phrases with topics. paper is the first attempt to use entity retrieval In comparison to these approaches, our work aims at techniques for spare part management. retrieving entities rather than documents and the user can explore all the retrieved parts within all the 6.3. Use of NLP and Topic Modelling in clusters instead of only one cluster. 
IR Systems The effectiveness of IR systems could be improved by 7. Conclusion topic modelling that mines term associations in a In this work, we explored a way of improving a spare collection of documents. Topic modelling could be part retrieval system for remote diagnostics and integrated to IR tasks to smooth the document model maintenance of medical equipment by applying topic with a document term prior estimated using term modelling to search results. The topic modelling distributions over topics [31]. The work [32] explores component was used to cluster the results of a the possibilities of modelling term associations as a baseline retrieval system and improve the relevance way of related terms integration into document of the search results. We aimed to support the models and proposes a model of probabilistic term decision-making process of maintenance service association using the joint probability of terms. A teams that searched in a historical collection of combination of term indexing and topic modelling troubleshooting reports and retrieved parts needed approaches is proposed in [33]. In the proposed for a new similar issue. model, every query term in a document is weighted The experimental dataset was constructed from using the LDA algorithm and IR indexing methods. query-result pairs pointing at the historical case base The best experimental results were obtained with and parts used in the cases. We adjusted several IR LDA-BM25 version. However, in this paper, the metrics to evaluate the results of spare part retrieval similarity is computed using a vector space model in the baseline architecture and the topic modelling and the retrieval results are combined using topic component modification. The major enhancement relations mined from a historical case base. was observed for the metric that estimated the Therefore, topic modelling is used as a clustering or minimum top ranked parts that were sufficient for grouping method on top of an ER system. 
the full treatment of a service case associated with a performed query.

A natural progression of this work is to apply on-line topic learning and automatically recommend the topic that performs best for a given query. Input from domain experts would help fix the number of topics and the characteristic terms that should appear under one topic. Furthermore, additional domain knowledge could be combined with the entity retrieval system under consideration to suggest actions beyond part replacement, such as troubleshooting tests for remote and on-site diagnostics.

Acknowledgments

The authors would like to acknowledge the gracious support of this work through the local authorities under grant agreement "ITEA-2018-17030-Daytime".

References

[1] G.-F. Liang, J.-T. Lin, S.-L. Hwang, E. M.-y. Wang, P. Patterson, Preventing human errors in aviation maintenance using an on-line maintenance assistance platform, International Journal of Industrial Ergonomics 40 (2010) 356–367. URL: https://linkinghub.elsevier.com/retrieve/pii/S0169814110000028. doi:10.1016/j.ergon.2010.01.001.
[2] E. Ruschel, E. A. P. Santos, E. d. F. R. Loures, Industrial maintenance decision-making: A systematic literature review, Journal of Manufacturing Systems 45 (2017) 180–194. URL: https://doi.org/10.1016/j.jmsy.2017.09.003. doi:10.1016/j.jmsy.2017.09.003.
[3] Z. Li, K. Ramani, Ontology-based design information extraction and retrieval, Artificial Intelligence for Engineering Design, Analysis and Manufacturing 21 (2007) 137–154. URL: https://www.cambridge.org/core/product/identifier/S0890060407070199/type/journal_article. doi:10.1017/S0890060407070199.
[4] K. Ponnalagu, Ontology-driven root-cause analytics for user-reported symptoms in managed IT systems, IBM Journal of Research and Development 61 (2017) 53–61. doi:10.1147/JRD.2016.2629319.
[5] M. Sharp, T. Sexton, M. P. Brundage, Toward Semi-autonomous Information Extraction for Unstructured Maintenance Data in Root Cause Analysis, in: IFIP Advances in Information and Communication Technology, volume 513, 2017, pp. 425–432. URL: http://link.springer.com/10.1007/978-3-319-66923-6_50. doi:10.1007/978-3-319-66923-6_50.
[6] H. Toda, R. Kataoka, M. Oku, Search Result Clustering Using Informatively Named Entities, International Journal of Human-Computer Interaction 23 (2007) 3–23. URL: http://www.tandfonline.com/doi/abs/10.1080/10447310701360995. doi:10.1080/10447310701360995.
[7] S. E. Robertson, S. Walker, K. S. Jones, M. M. Hancock-Beaulieu, Okapi at TREC-3, Proceedings of the Third Text REtrieval Conference (1994).
[8] C. D. Manning, P. Raghavan, H. Schütze, Introduction to Information Retrieval, Cambridge University Press, Cambridge, 2008. URL: http://ebooks.cambridge.org/ref/id/CBO9780511809071. doi:10.1017/CBO9780511809071.
[9] C. D. Manning, H. Schütze, Foundations of Statistical Natural Language Processing, The MIT Press, 1999. URL: https://nlp.stanford.edu/fsnlp/.
[10] D. M. Blei, A. Y. Ng, M. I. Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research 3 (2003) 993–1022.
[11] Elasticsearch B.V., Elasticsearch. URL: https://www.elastic.co/.
[12] R. Řehůřek, P. Sojka, Software Framework for Topic Modelling with Large Corpora, in: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, ELRA, Valletta, Malta, 2010, pp. 45–50.
[13] M. Honnibal, I. Montani, spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing, 2017. To appear.
[14] R. J. Gallagher, K. Reing, D. Kale, G. Ver Steeg, Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge, Transactions of the Association for Computational Linguistics 5 (2017) 529–542. URL: https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00078. doi:10.1162/tacl_a_00078.
[15] M. Röder, A. Both, A. Hinneburg, Exploring the Space of Topic Coherence Measures, in: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining - WSDM '15, ACM Press, New York, New York, USA, 2015, pp. 399–408. URL: http://dl.acm.org/citation.cfm?doid=2684822.2685324. doi:10.1145/2684822.2685324.
[16] K. Balog, Entity-Oriented Search, volume 39 of The Information Retrieval Series, Springer International Publishing, Stavanger, Norway, 2018. URL: https://eos-book.org. doi:10.1007/978-3-319-93935-3.
[17] S. Büttcher, C. L. A. Clarke, G. V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, The MIT Press, 2010.
[18] Z. A. Merrouni, B. Frikh, B. Ouhbi, Toward Contextual Information Retrieval: A Review And Trends, Procedia Computer Science 148 (2019) 191–200. URL: https://linkinghub.elsevier.com/retrieve/pii/S1877050919300365. doi:10.1016/j.procs.2019.01.036.
[19] S. P. Leo Kumar, Knowledge-based expert system in manufacturing planning: state-of-the-art review, International Journal of Production Research 57 (2019) 4766–4790. doi:10.1080/00207543.2018.1424372.
[20] S. Van der Auweraer, R. N. Boute, A. A. Syntetos, Forecasting spare part demand with installed base information: A review, International Journal of Forecasting (2019). doi:10.1016/j.ijforecast.2018.09.002.
[21] A. Kouznetsov, J. B. Laurila, C. J. Baker, B. Shoebottom, Algorithm for Population of Object Property Assertions Derived from Telecom Contact Centre Product Support Documentation, in: 2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications, IEEE, 2011, pp. 41–46. URL: http://ieeexplore.ieee.org/document/5763435/. doi:10.1109/WAINA.2011.135.
[22] C. Stenström, M. Aljumaili, A. Parida, Natural language processing of maintenance records data, International Journal of COMADEM 18 (2015) 33–37.
[23] F. Zhang, H. Fleyeh, X. Wang, M. Lu, Construction site accident analysis using text mining and natural language processing techniques, Automation in Construction 99 (2019) 238–248. URL: https://doi.org/10.1016/j.autcon.2018.12.016. doi:10.1016/j.autcon.2018.12.016.
[24] W. Romeijnders, R. Teunter, W. Van Jaarsveld, A two-step method for forecasting spare parts demand using information on component repairs, European Journal of Operational Research (2012). doi:10.1016/j.ejor.2012.01.019.
[25] R. Sipos, D. Fradkin, F. Moerchen, Z. Wang, Log-based predictive maintenance, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14, ACM Press, New York, New York, USA, 2014, pp. 1867–1876. URL: http://dl.acm.org/citation.cfm?doid=2623330.2623340. doi:10.1145/2623330.2623340.
[26] S. Agarwal, V. Aggarwal, A. R. Akula, G. B. Dasgupta, G. Sridhara, Automatic problem extraction and analysis from unstructured text in IT tickets, IBM Journal of Research and Development 61 (2017) 41–52. doi:10.1147/JRD.2016.2629318.
[27] K. Richardson, J. Kuhn, Learning semantic correspondences in technical documentation, ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1 (2017) 1612–1622. doi:10.18653/v1/P17-1148.
[28] K. Arif-Uz-Zaman, M. E. Cholette, L. Ma, A. Karim, Extracting failure time data from industrial maintenance records using text mining, Advanced Engineering Informatics 33 (2017) 388–396. URL: http://dx.doi.org/10.1016/j.aei.2016.11.004. doi:10.1016/j.aei.2016.11.004.
[29] M. Navinchandran, M. E. Sharp, M. P. Brundage, T. B. Sexton, Studies to predict maintenance time duration and important factors from maintenance workorder data, in: Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM, 2019. doi:10.36001/phmconf.2019.v11i1.792.
[30] A. Kao, N. B. Niraula, D. I. Whyatt, Text mining a dataset of electronic documents to discover terms of interest, 2020.
[31] L. Azzopardi, M. Girolami, C. van Rijsbergen, Topic based language models for ad hoc information retrieval, in: 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541), volume 4, IEEE, 2004, pp. 3281–3286. URL: http://ieeexplore.ieee.org/document/1381205/. doi:10.1109/IJCNN.2004.1381205.
[32] X. Wei, W. B. Croft, Modeling Term Associations for Ad-Hoc Retrieval Performance Within Language Modeling Framework, in: Advances in Information Retrieval, Springer Berlin Heidelberg, Berlin, Heidelberg, 2007, pp. 52–63. URL: http://link.springer.com/10.1007/978-3-540-71496-5_8. doi:10.1007/978-3-540-71496-5_8.
[33] F. Jian, J. X. Huang, J. Zhao, T. He, P. Hu, A Simple Enhancement for Ad-hoc Information Retrieval via Topic Modelling, in: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16, ACM Press, New York, New York, USA, 2016, pp. 733–736. URL: http://dl.acm.org/citation.cfm?doid=2911451.2914748. doi:10.1145/2911451.2914748.
[34] J. Qiang, Z. Qian, Y. Li, Y. Yuan, X. Wu, Short Text Topic Modeling Techniques, Applications, and Performance: A Survey, IEEE Transactions on Knowledge and Data Engineering 14 (2020) 1–17. URL: https://ieeexplore.ieee.org/document/9086136/. doi:10.1109/TKDE.2020.2992485.
[35] M. Hajjem, C. Latiri, Combining IR and LDA Topic Modeling for Filtering Microblogs, in: Procedia Computer Science, 2017. doi:10.1016/j.procs.2017.08.166.
[36] X. Liu, W. B. Croft, Cluster-based retrieval using language models, in: Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04, ACM Press, New York, New York, USA, 2004, pp. 1–8. URL: http://portal.acm.org/citation.cfm?doid=1008992.1009026. doi:10.1145/1008992.1009026.
[37] K. Sadaf, Web Search Result Clustering - A Review, International Journal of Computer Science & Engineering Survey 3 (2012) 85–92. URL: http://www.airccse.org/journal/ijcses/papers/3412ijcses07.pdf. doi:10.5121/ijcses.2012.3407.
[38] S. Osiński, J. Stefanowski, D. Weiss, Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition, in: Intelligent Information Processing and Web Mining, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004, pp. 359–368. URL: http://link.springer.com/10.1007/978-3-540-39985-8_37. doi:10.1007/978-3-540-39985-8_37.

A. Topic Modelling Component Performance Evaluation for Grouped Queries

Figure 1: Comparison of different metrics computed for LDA and baseline results in a part retrieval task. Queries are divided into groups using the number of retrieved cases, as well as used and retrieved parts. A confidence interval of 95% is shown as a box around LDA values.
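The metric highlighted in the conclusion, the minimum number of top-ranked parts sufficient for the full treatment of a service case, can be sketched as a small coverage function. The function name and the toy part lists are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch of the "minimum sufficient top-ranked parts" idea: the smallest
# cut-off k such that the top-k retrieved parts contain every part that
# was actually used in the service case. Names and data are illustrative.
def min_sufficient_rank(ranked_parts, used_parts):
    """Return the smallest k with used_parts covered by the top-k results,
    or None if the ranking never covers all used parts."""
    remaining = set(used_parts)
    if not remaining:
        return 0
    for k, part in enumerate(ranked_parts, start=1):
        remaining.discard(part)
        if not remaining:
            return k
    return None

# Toy example: the engineer needs two parts; the deepest one sits at rank 4.
ranked = ["gradient_coil", "xray_tube", "fuse", "power_cable"]
used = {"xray_tube", "power_cable"}
print(min_sufficient_rank(ranked, used))  # → 4
```

A lower value means the engineer has to inspect fewer search results before all required spare parts are found, which is why an improvement in this metric directly reflects better support for the maintenance workflow.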