PatentQuest: A User-Oriented Tool for Integrated Patent Search

Manajit Chakraborty, David Zimmermann and Fabio Crestani
Faculty of Informatics, Università della Svizzera italiana, Lugano, Switzerland

BIR 2021: 11th International Workshop on Bibliometric-enhanced Information Retrieval at ECIR 2021, April 1, 2021, online
manajit.chakraborty@usi.ch (M. Chakraborty); david.zimmermann@usi.ch (D. Zimmermann); fabio.crestani@usi.ch (F. Crestani)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Abstract
Patent search is a well-established research field. In existing patent search systems, a user needs to explicitly enter a set of keywords to retrieve a ranked list of results. Conventional patent search systems cannot be run directly from the user's text editor. Moreover, to the best of our knowledge, most practical systems do not leverage explicit user feedback and domain-specific context to enhance the quality of search results. In this paper, we describe a system that offers a single point of access for patent information coming from different sources, integrates the capability for user feedback, and supports searching from within the text editor itself without the need to switch applications. To explore the viability and effectiveness of such a system, we created and deployed it as a web-service plug-in for Microsoft Word® and conducted both a system and a user evaluation on a benchmark dataset.

Keywords
Patent Search, Integrated Search, Relevance Feedback, User Study, Add-in, Web Service

1. Introduction

In recent times, the scale of intellectual property rights, including patents, has seen an unprecedented increase in the global market. To keep up with this phenomenon, patent offices in several countries are trying to improve patent prosecution quality while minimising the time required to grant patent rights, without compromising the robustness of the patent evaluation structure. As such, the ownership of patents is fast becoming one of the most important measures of individual, business and national competitiveness. Hence, many companies have recently been encouraging and patenting the newest technologies in huge quantities. Compared to the increasing number of patent documents, the number of patent examiners and judges available to handle them is insufficient, and allocating the excessive workload to a limited workforce will inevitably deteriorate the quality of patent examination. It is therefore imperative for both applicants and examiners to be able to perform the manual patent examination process more quickly and more accurately than before.

Patent search serves several purposes. One of them is "prior-art search", which is required before patent filing or for the prevention of patent infringement. Patent search systems differ significantly from general-purpose search engines in both their purpose and their characteristics. There are various public patent search systems, such as PatentScope (https://www.wipo.int/patentscope/en/) and Espacenet (https://www.epo.org/searching-for-patents/technical/espacenet.html), and even commercial patent search systems such as Google Patents (https://patents.google.com/). However, these systems often come with a steep learning curve or are limited to their own data collections.
Moreover, since patent prior-art search involves specific legalese, various firms offer paid patent search services, such as PatentSight (https://www.lexisnexisip.com/products/), at a high price. For an inventor, especially a first-time patent applicant, prior-art search can therefore seem both overwhelming and expensive. To address these issues, we demonstrate the viability of a more user-oriented system that is free of cost. This is achieved by direct integration into the user's text editor, allowing for search without reformulating the text into a query, and by working hand in hand with the user through a feedback loop that leverages domain-specific context information. This goal is implemented in a functioning system prototype called PatentQuest. The system is deployed as a simple add-in to the online web service of Microsoft Word®. The advantage of such a system is that it incorporates patent search within the text editor itself, thus allowing us to harness the power of explicit user feedback while letting the user access the patent text content, all without the need to switch between applications. The prototype system was evaluated on the CLEF-IP 2011 Prior-Art Search track dataset [1] for system performance and efficiency, while a separate user study was conducted to gauge the system's usability and convenience.

2. Related Work

Patents pose several domain-specific challenges when it comes to information retrieval [2]. Further complications arise from the fact that patents are written in different languages, are semi-structured, and that the input for building a query can itself be a multi-page patent application [3]. To achieve better search performance, different techniques for query reformulation have been tested and applied [4]. One technique for query expansion is pseudo-relevance feedback (PRF), in which a first search based on an initial query is run, and features are then extracted from the best-scoring results to run a second search. Golestan Far et al. [5] produced an especially interesting result in their research on PRF. While they failed to demonstrate that their PRF techniques outperformed the baseline keyword search, they found that the baseline performance can be doubled if just one extra document is marked as relevant by the user, suggesting that interaction with the user is very powerful. Another approach to query expansion is the addition of synonyms or semantically related concepts to the given query terms. In the patent domain, different sources for this semantic information have been tested: synonyms have been extracted from the general dictionary WordNet (https://wordnet.princeton.edu/) or from the document corpus itself [6], domain-specific dictionaries have been built from examiners' search queries [7], and Wikipedia (www.wikipedia.org) articles have been exploited for related sentences [8]. All of the systems mentioned above achieve rather mixed results. The IPC classifications have been another source for query extension.
Verma and Varma [9] built a classification vector for all patents in the dataset and calculated the cosine similarity of documents based on these vectors; theirs was the best-performing system in the CLEF-IP 2011 Prior-Art Search track. Patents also contain citation information that can be exploited in different ways. Mahdabi et al. [10] extracted citations from the text of the patent application and added those citations directly to the search results. Mahdabi and Crestani [11, 12] used the citations to build citation networks and exploited information gained from those networks, among others using PageRank (see [13] for more details on PageRank).

3. PatentQuest

In our literature review, we did not come across any system that offers the flexibility of patent search or recommendation incorporated within a free-text document editor. In light of this, we built a prototype system called PatentQuest that provides users and inventors with an integrated tool that can fulfil their bibliographic needs while they formulate a patent document. The prototype is distributed as an add-in to the online version of Microsoft Word® under Microsoft Office 365. Microsoft Word allows any developer to build a piece of software and integrate it with Word as an add-in without much hassle, which drove us to prepare a patent search prototype for Word. Although we intend to build an add-in for the desktop version of Microsoft Word in the immediate future, our goal here is to provide a proof of concept of such a prototype and the advantages it brings. In this section, we describe the user interface and its characteristics.

The objective of the user interface is to keep the usage of the add-in as intuitive and self-explanatory as possible. After installing the add-in, the user sees an additional icon under the "Home" tab. Clicking the icon opens a side window through which the user interacts with the system. The advantage of such a window-based design is that it frees the user from unnecessary distraction while writing, as the window can simply be closed by re-clicking the add-in button on the tab. The user is provided with basic instructions on how to use the system before running a search: the user selects part of or the full text in the current editor window and clicks the Search button in the side window.

Queries can be issued in any of three languages: English, German or French. The search results show the English title of each patent, its document ID and an excerpt from the matching document with the query terms highlighted. By default, only the ten most relevant results are displayed, with a link at the bottom of the window that reveals up to 20 additional results. The results window is flexible, allowing the extended results to be hidden again with a link at the bottom. A sample screenshot of the interface with search results is presented in Figure 1. For each search performed, the search query is preserved for reference purposes.

Figure 1: User interface with search results. (a) Initial search screen. (b) Term highlighting in search results.

Clicking on the title of a patent opens a pop-up window displaying the content of the full document (Figure 2a). The editing space and the search results remain static in the background while the full text of the patent is displayed in the scrollable pop-up window.
This allows the user a comfortable reading experience without losing the search results or the text written so far in the editor window. The upper section of the display window is devoted to meta-data: the title of the patent in English and the document ID are shown at the top, followed by the file type (whether it is an application or a grant). In the lower section, the remaining relevant sections extracted from the patent document are displayed, such as the citations, the abstract, the claims and the description.

Additionally, each search result comes with a button to mark it as relevant (Figure 2b). Once a document has been marked as relevant, a panel appears above the search button, displaying the list of documents (sorted by document ID) selected by the user. The panel also allows the user to remove documents from the list again or to display them in full by clicking on their ID. The user has the option to issue a new search at any time; any documents marked as relevant are taken into account for the new search (see Section 5.1 for an explanation of relevance feedback). Once the user is satisfied with the search results, the side window can be closed again by clicking the cross button in the top right corner, and re-opened by clicking the add-in's button in the "Home" tab. As long as the application window is not closed and reloaded, the current session's search results and documents marked as relevant remain intact.

Figure 2: PatentQuest in use. (a) Pop-up screen with patent details. (b) Interface with user feedback information.

The option to select and mark documents from the search results as relevant (explicit relevance feedback) helps users drive their navigation in a specific direction and has previously been shown to improve prior-art search [14]. This is particularly helpful if an inventor is looking for similar or seminal patents on a specific topic.

4. Implementation Details

4.1. Dataset

We used the dataset from the CLEF-IP 2011 track (http://www.ifs.tuwien.ac.at/~clef-ip/download/2011/index.shtml) for building and evaluating our system. The dataset consists of over 1 million patents from the European Patent Office (https://www.epo.org/) prior to the year 2002 and an additional 400,000 patent applications published by the World Intellectual Property Organization (www.wipo.int), all in XML format. The elements of the XML files can be roughly divided into two categories: text fields with the contents of the patent, and fields with meta information. The text fields of a patent are Title, Abstract, Description and Claims. In the dataset, the text elements of a patent can be in one of three languages: English, German or French. Generally, the title of the patent is available in all three languages and the other text fields in only one. For each patent, several documents can be published, depending on the information available at the time of publishing. The documents are encoded with "A1", "A2", ... for the application phase and with "B1", "B2", ... for the granting phase; in some cases, the relevant information about a patent is spread over several such documents. Overall, the dataset comprises around 2.5 million files.
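To make this structure concrete, the following is a minimal sketch of how such a patent file might be parsed. The element names (invention-title, abstract, claims, description) and the attributes (ucid, kind, lang) are assumptions based on common patent XML conventions, not the exact CLEF-IP schema.

```python
# Minimal sketch of extracting text fields from a CLEF-IP-style patent XML file.
# Element and attribute names are assumed for illustration; the real schema may differ.
import xml.etree.ElementTree as ET

def extract_fields(path):
    root = ET.parse(path).getroot()
    doc = {
        "doc_id": root.get("ucid"),   # assumed attribute carrying the document ID
        "kind": root.get("kind"),     # e.g. "A1" (application) or "B1" (grant)
        "titles": {},                 # language -> title (titles exist in all three languages)
    }
    for title in root.iter("invention-title"):
        doc["titles"][title.get("lang", "en")] = (title.text or "").strip()
    for field in ("abstract", "claims", "description"):
        node = root.find(field)
        if node is not None:
            # Concatenate all text under the element, whatever its inner markup.
            doc[field] = " ".join(node.itertext()).strip()
    return doc
```

A loader of this kind would be run once over the roughly 2.5 million files to populate the search index described next.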
4.2. System Design

The system design can be broadly divided into two parts: (i) the front-end, or user interface, and (ii) the back-end, which handles query processing and the provision of results. In Figure 3, we present an overview of the implementation of the system.

Figure 3: Schematic view of system implementation and data flow.

4.2.1. User Interface

PatentQuest was motivated by the lack of an integrated search system within existing workflows. The system should therefore complement the creation of documents by suggesting relevant sources based on the text currently written by the user. This is beneficial because it saves the user both the time and the effort required to switch between applications to gather relevant sources to cite. An ideal system in this scenario should provide the user with all the functionality of a standard text editor while offering not only an integrated list of relevant search results but also the ability to add parts of a relevant source directly into the text. In addition, the system should be backwards compatible with existing documents. All of this suggests that the best approach is to adopt one of the more popular text editors currently in use by users and experts as a user interface that offers the possibility of customisation. In this light, Microsoft Word stands out as a favourable choice.

Microsoft offers two ways to create extensions for its "Office Suite" programs, each with its own caveats: (a) Office Add-ins and (b) COM/VSTO Add-ins. As mentioned earlier, our system is built as an Office Add-in, the newer format for creating add-ins for Microsoft Office products. All Office applications offer a JavaScript API ("Office JS") to access the contents of the document, together with a browser engine that runs in a side window of the application to render HTML5 and CSS and execute JavaScript. This form of add-in is cross-platform compatible, unlike COM/VSTO add-ins, which was the main reason it was chosen for the implementation of the prototype system. It also offers strong security by limiting the JavaScript add-in's access to the user's system. At the same time, the strict security measures implemented by Microsoft are the source of this format's biggest drawbacks. They force the add-in to be run as a web service: the Office application will only load an add-in that is served through the HTTPS protocol, making local standalone use of the add-in difficult. Furthermore, the distribution of the manifest file, which contains the information necessary for loading the add-in, is designed either for distribution within an organisation with a central IT infrastructure or through Microsoft AppSource, which requires authorisation from Microsoft. While this limits our system to the online version of Microsoft Word in the Office 365 suite, it still allows enough provision for both system and user evaluation. We aim to offer a standalone add-in in the near future. For the current prototype, the user side-loads the add-in: the user simply downloads an XML file containing the manifest with the required information and selects it through a file manager to integrate the add-in.

4.2.2. The back-end

Several reasons compelled us to deploy the system as a web service. Firstly, the chosen form of Microsoft Office add-in requires a connection to a secure web service over the HTTPS protocol.
Secondly, distributing the full Solr index (described below) to all end users would be impractical (around 40 GB), whereas a central service makes it much easier to keep the dataset up to date. Finally, setting up the system becomes extremely easy for the user: all the user has to do is load the manifest XML file into their Microsoft Word online installation. Flask (https://flask.palletsprojects.com/en/1.1.x/), a web application framework for Python, was chosen as the foundation for the implementation of the back-end and for exposing the APIs needed by the front-end to the web. The more advanced features of the system require query generation and reformulation, and refreshing the results on the fly. When a search request is triggered, the user interface sends two pieces of information to the Flask app in the back-end: the text selected as query input and the patent IDs of the documents marked as relevant by the user (relevance feedback). The user input text is used as the original query, which is then expanded in two stages using the documents marked as relevant:

• by adding the most important terms from the selected documents, and then
• by adding their IPC classifications to the original query.

The relative weights involved in the query expansion and reformulation have been determined empirically (see Section 5.1). The important terms are extracted using the "More Like This" (MLT) feature of Apache Solr (https://lucene.apache.org/solr/guide/6_6/morelikethis.html). In the same stage, the IPC classifications of the documents are collected. This information is then used to reformulate the user request and build the final query, which is again run against the search index.
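A minimal sketch of this request flow is shown below. It assumes a Solr core named "patents" with an "id" and a "text" field, an MLT handler registered at /mlt, and a /search route on the Flask app; these names, like the parameter values, are illustrative assumptions rather than the exact implementation, and the classification-boost stage is elided here (it is sketched separately in Section 5.1.2).

```python
# Sketch of the back-end search endpoint: the selected text is the original query,
# expanded with MLT terms from the documents the user marked as relevant.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
SOLR = "http://localhost:8983/solr/patents"  # assumed core name

def important_terms(doc_id, n=10):
    """Fetch the top MLT 'interesting terms' for one relevant document."""
    params = {
        "q": f"id:{doc_id}",
        "mlt.fl": "text",                # field to mine terms from
        "mlt.interestingTerms": "list",  # ask Solr to return the extracted terms
        "rows": 0,
    }
    resp = requests.get(f"{SOLR}/mlt", params=params).json()
    # Terms may come back prefixed with the field name ("text:term"); strip it.
    return [t.split(":", 1)[-1] for t in resp.get("interestingTerms", [])][:n]

@app.route("/search", methods=["POST"])
def search():
    data = request.get_json()
    query = data["selected_text"]
    for doc_id in data.get("relevant_ids", []):
        query += " " + " ".join(important_terms(doc_id))
    resp = requests.get(f"{SOLR}/select",
                        params={"q": query, "df": "text", "rows": 30}).json()
    return jsonify(resp["response"]["docs"])
```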
5. Evaluation

In this section, we describe the evaluation process and its results. The aim of the evaluation is to measure the quality of the search results as well as the usability of the system, and to gain insights into potential improvements. Hence, we conducted both a system evaluation and a user evaluation to gain a fair understanding of the strengths and limitations of our system. For the system evaluation, the CLEF-IP 2011 patent dataset is used to determine the optimal system parameters and to compare the system's performance with that of the systems participating in the CLEF-IP 2011 track. This is followed by an evaluation of the system by a test user group.

5.1. System Evaluation

In CLEF-IP 2011, the participants were provided with 3,973 topics in three languages (English, German and French). Various textual and non-textual elements (described in the following sections) were also employed by the winning systems of the CLEF-IP 2011 track and were the starting point for the queries tested below. We conducted several experiments to determine the optimal settings and weights for each parameter used in the design of the system, so as to achieve the best possible performance. Each of these empirical studies is presented below. The evaluation metrics used in all experiments were (i) Mean Average Precision (MAP) and (ii) Normalized Discounted Cumulative Gain (nDCG), the metrics used to judge the participating teams' performance in the CLEF-IP 2011 Prior-Art Search track.

5.1.1. Impact of combining different patent sections

As described in Section 4.1, a patent document consists of multiple sections. The first experiment therefore compares the search result metrics for queries generated from different combinations of sections, which in turn translates into varying query lengths. The query configurations are represented by an encoding in which a combination of letters describes the fields used: t: Title, a: Abstract, c: Claims, d: Description, ta: Title + Abstract, tc: Title + Claims, tac: Title + Abstract + Claims and tacd: Title + Abstract + Claims + Description.

Table 1
Determining the optimal combination of patent sections.

Section   MAP      nDCG     Avg. query length (#words)
t         0.0394   0.0851      9.39
a         0.0585   0.1205    106.09
c         0.0612   0.1236    978.95
d         0.0620   0.1205   5270.01
ta        0.0633   0.1298    114.48
tc        0.0623   0.1255    987.34
td        0.0622   0.1210   5278.40
tac       0.0635   0.1282   1092.43
tacd      0.0615   0.1202   6361.44

Figure 4: Determining the optimal combination of patent sections. (a) MAP vs. query length. (b) nDCG vs. query length.

Figures 4a and 4b show the impact of the choice and combination of patent sections on retrieval performance. For the queries "t" to "d", there is an increase in MAP when using longer text elements such as the description and the claims rather than the title or the abstract, but with rapidly declining marginal returns and at the expense of longer run times. When the title is combined with the abstract or the claims, the advantage of using the description subsides, despite the title adding only around nine words on average (the descriptions have an average length of around 5,270 words). Combining the description with a query that already contains the title and the abstract and/or the claims seems to add noise rather than useful information. This is confirmed by the run combining all elements, which performs worse on all metrics. The tendency of longer queries to do better in the prior-art search task at the expense of query speed confirms the results of [15]. As can be seen from Figures 4a and 4b, two combinations show the best search performance while having a reasonable query length: title and abstract ("ta"), and title, abstract and claims ("tac"). Hence, we proceed with these two combinations for the rest of the experimentation.

5.1.2. Impact of incorporating classification codes

As mentioned earlier, each patent document and each topic carry classification codes, which we used to improve retrieval performance by adding the IPC codes to the optimal queries generated in the previous step. Figures 5a and 5b show the results for different boost factors (weights) for the classification codes of the two best queries from the previous experiment ("ta" and "tac"), as well as for the title-only ("t") query.

Figure 5: Effect of adding classification codes on performance. (a) Class weight vs. MAP. (b) Class weight vs. nDCG.

We can observe that incorporating the IPC classification within the query adds considerable information that had not been captured by the text alone. Secondly, the boosting needs to be adapted to the query length. For instance, for a query built from the title and abstract of the patent application ("ta"; average length of 114.5 words), the optimal weight of the classifications is eight times the weight of the terms ("tacl_8"), while for a query consisting of title, abstract and claims ("tac"; average query length of 1092.4 words), the classifications should be assigned 32 times the weight of each term ("taccl_32").
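In Solr's Lucene query syntax, such a weighting can be expressed with the ^ boost operator. The snippet below sketches how a final query of this kind might be assembled; the field names ("text", "ipc") and the length threshold used to pick the boost are illustrative assumptions consistent with the empirical weights above.

```python
# Sketch: append IPC classification codes to the text query with a boost that
# grows with the query length, so long queries do not drown out the codes.
def build_boosted_query(terms: list[str], ipc_codes: list[str]) -> str:
    boost = 8 if len(terms) < 500 else 32  # empirical weights from this experiment
    text_part = "text:(" + " ".join(terms) + ")"
    ipc_part = " ".join(f'ipc:"{code}"^{boost}' for code in ipc_codes)
    return f"{text_part} {ipc_part}"

# e.g. build_boosted_query(["optical", "fibre"], ["G02B6/00"])
# -> 'text:(optical fibre) ipc:"G02B6/00"^8'
```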
Interestingly, after taking the classification codes into account, terms from the claims section no longer seem to add useful information compared to using just the title and abstract.

5.1.3. Impact of multi-lingual search

Next, we studied the effect of language on retrieval performance. As stated earlier, the dataset contains documents in three languages: English (en), German (de) and French (fr). Table 2 presents the results obtained.

Table 2
Effect of multi-lingual search on performance.

              MAP                         nDCG
Query type    Single lang.  Multi lang.   Single lang.  Multi lang.
tcl_2         0.0746        0.0735        0.1666        0.1653
tacl_8        0.0892        0.0886        0.1891        0.1959
taccl_32      0.0870        0.0913        0.1816        0.1928

The results differ significantly depending on the input language. We observed that queries in English produced the best results, closely followed by German, while the system struggled the most with queries in French. However, not only are the inputs given in three languages, but cross-language results are also expected: an input given in German might expect a result published in French, and vice versa. Since the best-faring query uses only the relatively short abstract and the title, an attempt was made to use machine translation to achieve better results. The titles themselves are usually already given in all three languages. For each input patent, machine translations of the abstract were created for the two missing languages using the Yandex translation API (https://tech.yandex.com/translate/). Employing a multi-lingual search, the performance of the system improved in most cases, and the best results were achieved when the combined query of title, abstract and a classification weight of 32 ("tacl_32") was used.

5.1.4. Impact of Relevance Feedback

In our next experiment, we wanted to determine whether relevance feedback could improve system performance even further. For this, we employed pseudo-relevance feedback (PRF) in two ways: (a) by selecting the top-2 relevant results returned by the optimised query and expanding it with them, and (b) by selecting the top-2 non-relevant results for query expansion. This experiment helped us address two questions: (i) whether our system is responsive to relevance feedback in the first place and (ii) what the optimal feedback weight is. Figures 6a and 6c compare the MAP performance against the term-weight and classification-code-weight boosting, while Figures 6b and 6d compare the nDCG performance.

Figure 6: Effect of pseudo-relevance feedback. (a) Term weight vs. PRF (MAP). (b) Term weight vs. PRF (nDCG). (c) Class weight vs. PRF (MAP). (d) Class weight vs. PRF (nDCG).

In both cases, we can clearly observe that positive relevance feedback improves retrieval performance considerably. In fact, the best run obtained by our system after positive relevance feedback on tacl_32 reached a MAP of 0.0905 and an nDCG of 0.205, comparable with the best-performing system in the CLEF-IP 2011 Prior-Art track (MAP = 0.097 [9]). While better results have been achieved in the meantime (e.g. by Mahdabi and Crestani [12]), it should be noted that our objective was not to provide a mechanism for the best possible search results, but to strike a balance between system performance and usability for general users while providing a novel integration (see http://www.dlib.org/dlib/november95/11croft.html).
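The positive-feedback variant can be sketched as follows: run the optimised query once, treat the top-2 hits as pseudo-relevant, and expand the query with their strongest terms. In this generic sketch, a plain frequency count stands in for Solr's MLT term selection used in the actual system, and the function and parameter names are assumptions.

```python
# Sketch of positive pseudo-relevance feedback: expand the query with the most
# frequent terms of the top-k results of a first retrieval pass.
from collections import Counter

def prf_expand(query: str, run_search, k: int = 2, n_terms: int = 10) -> str:
    """run_search(query) is assumed to return a ranked list of document texts."""
    top_docs = run_search(query)[:k]  # pseudo-relevant documents
    counts = Counter()
    for text in top_docs:
        counts.update(w.lower() for w in text.split() if len(w) > 3)
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return query + " " + " ".join(expansion)
```

The negative variant (b) differs only in that the expansion terms are taken from results assumed to be non-relevant, which, as the figures show, degrades rather than improves performance.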
5.2. User Evaluation

While the system evaluation provides a measurable account of system performance, the tool is designed for users, so it was imperative to conduct a user evaluation as well. In the absence of expert users, we resorted to a set of four users with high familiarity with IR systems. To evaluate the usability of the system, the "System Usability Scale" (SUS) was employed [16]; a minimal sketch of the standard SUS scoring formula is given at the end of this section. The SUS scores recorded for the four test users are summarised in Table 3.

Table 3
Summary statistics of the SUS score for the user group evaluation.

       Mean   Max.   Min.   Std. Dev.
SUS    72.5   92.5   50     18.14

From the table, we can observe that the standard deviation of the SUS is 18.14, which implies a wide range of perceptions of the system (between 50 and 92.5). Translated to the various SUS rating scales [17], the SUS of 72.5 corresponds to a grade of "C", or a qualitative description of "Good". This lies within the range of what users tend to deem acceptable; even small improvements could therefore yield substantial gains in the scoring.

Along with the SUS evaluation, the test users were asked to respond to an additional questionnaire consisting of two blocks of statements. The first block contains five extra statements about usability that are more specific to the system than the SUS statements; these were only presented to the participants after they had completed the SUS statements, so as not to influence or bias the questionnaire. The second block consists of three statements about the subjective quality of the search results. The summary statistics recorded for the questionnaire are presented in Table 4. At the end of the questionnaire, the test users were presented with two free-text fields for general feedback and overall suggestions for improving the system.

Table 4
Summary statistics of user group scoring of the additional statements.

Statement                                                                 Mean Score   Std. Dev.
The add-in is easy to install.                                            3.25         1.71
The system's response time to search requests is adequate.               3.75         1.89
The user interface has an appealing look and feel.                       2.75         1.50
The system can enhance the workflow of creating a new patent document.   3.75         1.26
The system is responsive to the user feedback loop.                      3.25         1.26
All search results shown are relevant.                                    3.25         0.50
The most relevant results show up on top.                                 3.00         0.82
The search brings up all documents that are relevant to the search.       3.25         0.96

From Table 4, we can observe that the usability of the system was again perceived very differently by different users. The user group's scoring confirms that the response time and the inclusion of search into the natural workflow of document creation are among the strengths of the system. On the other hand, the aesthetics of the interface received a below-average score, indicating room for improvement, such as changing the colouring or hiding the relevance-feedback button after a document has been marked relevant, for a more polished user interface. Overall, however, the user study substantiated our initial goal of building a prototype system capable of integrating patent search within the document editor, freeing the user from having to switch between workspaces.
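For reference, SUS scoring follows a fixed formula over the ten 1-5 Likert responses [16]; the sketch below restates that standard computation in code (it is a generic sketch, not code used in the study).

```python
# Standard SUS scoring (Brooke, 1996): odd-numbered items contribute (response - 1),
# even-numbered items contribute (5 - response); the sum is scaled by 2.5 to 0-100.
def sus_score(responses: list[int]) -> float:
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based: even index = odd item
                for i, r in enumerate(responses))
    return total * 2.5

# e.g. sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]) -> 100.0
```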
We have duly recorded the feedback and suggestions provided by the test users and intend to incorporate them in the next version of our system.

6. Conclusions and Future Work

Patent search continues to be an active research area. In this paper, we demonstrated a prototype system that integrates patent search within a popular text editor. The prototype, deployed as a Microsoft Word add-in, facilitates hassle-free integration into the text editor window, freeing the user from the need to switch between applications for prior-art search. The system also allows the user to provide relevance feedback for a precision-oriented search, while offering the added advantage of handling multiple languages. We tested and evaluated our system on a standard benchmark dataset from both the efficiency and the usability perspectives. We showed that adding domain-specific information such as IPC classification codes, along with machine-translated text contents for multi-lingual search, improved system performance. While the impact of explicit relevance feedback could not be determined quantitatively, we showed with the help of pseudo-relevance feedback that our system responds positively in the presence of correct relevant results. Moreover, the overall usability of the system was received quite favourably by the test user group.

While the prototype system was well received overall, there are further potential improvements to the design that need to be explored. Firstly, we would like to build an add-in for standalone local Word installations. Secondly, we aim to achieve better system performance by incorporating the lexical and semantic features of a patent document to account for several factors unique to patents, such as obfuscation. Thirdly, although the PageRank experiment (not discussed in this paper, for brevity) performed poorly in our case, we will continue to investigate and improve the integration of such network-flow metrics. Finally, we plan to incorporate the additional suggestions made by the test users to give the user interface an even more polished look.

References

[1] F. Piroi, M. Lupu, A. Hanbury, V. Zenz, CLEF-IP 2011: Retrieval in the intellectual property domain, 2011.
[2] M. Lupu, K. Mayer, J. Tait, A. Trippe, Current Challenges in Patent Information Retrieval, volume 37, 2017.
[3] W. Shalaby, W. Zadrozny, Patent retrieval: A literature review, Knowledge and Information Systems (2019) 631–660.
[4] G. Cabanac, I. Frommholz, P. Mayr, Bibliometric-enhanced information retrieval: 10th anniversary workshop edition, in: European Conference on Information Retrieval, Springer, 2020, pp. 641–647.
[5] M. Golestan Far, S. Sanner, M. R. Bouadjenek, G. Ferraro, D. Hawking, On term selection techniques for patent prior art search, 2015.
[6] W. Magdy, G. Jones, A study on query expansion methods for patent retrieval, in: Proceedings of the International Conference on Information and Knowledge Management, 2011.
[7] W. Tannebaum, P. Mahdabi, A. Rauber, Effect of log-based query term expansion on retrieval effectiveness in patent searching, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction, Springer International Publishing, Cham, 2015, pp. 300–305.
[8] B. Al-Shboul, S.-H. Myaeng, Query phrase expansion using Wikipedia in patent class search, in: M. V. M. Salem, K. Shaalan, F. Oroumchian, A. Shakery, H. Khelalfa (Eds.), Information Retrieval Technology, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 115–126.
[9] M. Verma, V. Varma, Exploring keyphrase extraction and IPC classification vectors for prior art search, in: CLEF (Notebook Papers/Labs/Workshop), 2011.
[10] P. Mahdabi, L. Andersson, A. Hanbury, F. Crestani, Report on the CLEF-IP 2011 experiments: Exploring patent summarization, volume 1177, 2011.
[11] P. Mahdabi, F. Crestani, The effect of citation analysis on query expansion for patent retrieval, Information Retrieval 17 (2013) 412–429.
[12] P. Mahdabi, F. Crestani, Query-driven mining of citation networks for patent citation retrieval and recommendation, in: Proceedings of the 2014 ACM International Conference on Information and Knowledge Management (CIKM 2014), 2014, pp. 1659–1668.
[13] S. Brin, L. Page, The anatomy of a large-scale hypertextual web search engine, Computer Networks and ISDN Systems 30 (1998) 107–117. Proceedings of the Seventh International World Wide Web Conference.
[14] S. Bashir, A. Rauber, Improving retrievability of patents in prior-art search, in: Advances in Information Retrieval, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 457–470.
[15] D. Becks, M. Eibl, J. Jürgens, J. Kürsten, T. Wilhelm-Stein, C. Womser-Hacker, Does patent IR profit from linguistics or maximum query length?, volume 1177, 2011.
[16] J. Brooke, SUS: A quick and dirty usability scale, in: Usability Evaluation in Industry, CRC Press, 1996. ISBN 9780748404605.
[17] A. Bangor, P. T. Kortum, J. T. Miller, Determining what individual SUS scores mean: Adding an adjective rating scale, Journal of Usability Studies 4 (2009) 114–123.