=Paper=
{{Paper
|id=Vol-1292/ipamin2014_paper9
|storemode=property
|title=Understanding Trends in the Patent Domain
|pdfUrl=https://ceur-ws.org/Vol-1292/ipamin2014_paper9.pdf
|volume=Vol-1292
|dblpUrl=https://dblp.org/rec/conf/konvens/StrussMSW14
}}
==Understanding Trends in the Patent Domain==
User Perceptions on Trends and Trend Related Concepts

* Julia M. Struß, University of Hildesheim, Institute for Information Science and Natural Language Processing, Marienburger Platz 22, 31141 Hildesheim, julia.struss@uni-hildesheim.de
* Thomas Mandl, University of Hildesheim, Institute for Information Science and Natural Language Processing, Marienburger Platz 22, 31141 Hildesheim, mandl@uni-hildesheim.de
* Michael Schwantner, FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, michael.schwantner@fiz-karlsruhe.de
* Christa Womser-Hacker, University of Hildesheim, Institute for Information Science and Natural Language Processing, Marienburger Platz 22, 31141 Hildesheim, womser@uni-hildesheim.de

Copyright © 2014 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. Published at Ceur-ws.org. Proceedings of the First International Workshop on Patent Mining and Its Applications (IPAMIN) 2014. Hildesheim. Oct. 7th. 2014. At KONVENS'14, October 8–10, 2014, Hildesheim, Germany.

===ABSTRACT===
Ongoing globalization, combined with increasing competition in research conducted at universities and other research institutes as well as in industry, emphasises the necessity of identifying trends at an early stage, not only in industry but also at universities and in governments. Patents are one of the resources to be considered, as most of the information they contain is not published anywhere else. Existing research focuses on the technical perspective of identifying trends in patents. This work addresses the user perspective of the problem: in particular, it examines the users' working environments, their understanding of trends, the underlying tasks and the user requirements regarding a trend mining system.

===Categories and Subject Descriptors===
D.2.1 [Software Engineering]: Requirements/Specifications; H.1.2 [Models and Principles]: User/Machine Systems – Human factors

===Keywords===
trend mining on patents, requirement analysis, semi-structured interviews

===1. INTRODUCTION===
The increasing competition in research conducted at universities and other research institutes as well as in industry, further intensified by increasing globalization, reinforces the importance of identifying new trends at an early stage. According to a study by Thomson Reuters [13], 70% to 90% of the information covered in patents, depending on the research area, is not published anywhere else. This huge information resource is also growing faster every year in terms of filed patents: according to the annual report of the European Patent Office for 2012, new records were observed for the third year in a row, with the largest growth in patent filings coming from Asian countries like China, Japan and Korea [3]. The number of filed patents increased by a further 2.8% in 2013 compared to 2012 [10].

In order to provide a system that supports the above mentioned target audience (for a more detailed description of the target audience see section 3) in planning their research strategies through (semi-)automated trend detection, one needs to understand the information needs and working environments of these user groups and, most importantly, their understanding of trends and their requirements regarding the functionality of a trend mining system. This paper reports the findings of a qualitative survey on this subject with both scientists who work with patents and information professionals from the patent domain.

The paper is organized as follows: related work is presented in the next section, before the methodology of this study is described in section 3. The subsequent sections present the results of the survey, followed by a discussion of the results in section 5 and some concluding remarks.

===2. RELATED WORK===
Several papers address the technical perspective of trend mining in the patent domain. Most of this work concentrates on identifying technology trends retrospectively, like [16], [6] and [2]. Other work considers related areas like the identification of patents with high novelty [4, 5] or engages with technology monitoring in patents [4].

Most work addresses the problem with machine learning techniques (e.g. [2, 9, 11, 17]), particularly clustering techniques (e.g. [1, 15]) and network analysis (e.g. [1, 2, 8, 12, 17]). In most works the final decision about the existence of a trend is left to the users, to whom the results are presented by different visualisation techniques (e.g. see [7]).

A wide range of features has been investigated in those works: terms selected based on their frequency, which is the most common choice (e.g. [14, 15]), adjective-noun pairs for potential technology features and verb-noun pairs for potential technology functions [17], noun and verb phrases [6], or subject-action-object relations (e.g. [2, 4]). However, most works do not present a sound evaluation of their approaches, or only evaluate selected steps of the complete process, because evaluation resources are missing. Instead, mostly case studies are performed.

To our knowledge no study on the understanding of trends and the informational background of the potential users of such a system has been conducted so far.

===3. METHOD===
We are interested in getting deeper insights into the users' understanding of trends as well as their requirements towards a trend mining system. For this reason, and due to the lack of prior studies in this area, we chose a qualitative approach and conducted semi-structured interviews.

In order to get a better idea of the working environment and the specific needs of information professionals in the patent domain, two pre-interviews were conducted with domain experts from a big information infrastructure institute that works with patents and offers software products for information professionals in the patent domain. Based on these pre-interviews the area of interest was narrowed down to the engineering sciences, as patent documents in chemistry-related domains add the challenge of handling chemical notations, which is out of the scope of the project in whose context this research is conducted.

Seven interviews were conducted subsequently. Three interview partners are scientists (SCI1–3) and four interview partners are information professionals (IP1–4), who either have a background as professional patent searchers (IP1, IP3), work in IP management (IP2) or work in a company offering different patent services to clients (IP4). Figure 1 shows the questions which were asked within the interviews; the questions were adapted to the respective group of the target audience (scientists and information professionals). The order of the questions was not necessarily kept during the interviews, but was adapted to the particular interview situation.

The interviews were audio recorded and transcribed afterwards (one interview partner did not permit audio recording, so the interview notes were used for further analysis), before they were analyzed with regard to the questions in figure 1.

* Please give some details on your personal background and your working environment∗ / on your research area∗∗.
* Do you selectively conduct trend searches / analysis or trend observations?
* Do you include patents in this search or analysis process?∗∗
* What kind of questions do you want to answer by these trend searches / analysis?
* How would you characterise such a trend or what makes a trend a trend in your working environment?
** What kind of shapes with regard to the trend curve are of interest?
** At which time points are those curves interesting?
** What are the time periods we are talking about (months, years)?
** Which time related fields should be used for measuring a trend?
** What is the subject of a trend in terms of content (the granularity level of the content)? Could you give an example?
** How would you measure such a trend?
** Where can one see a trend at first (what kind of publications)?
* How do you realize that a trend is developing?
* What does the result of a trend analysis look like?∗
* What strategy do you pursue when you do a trend analysis and what steps can you identify in the process?
* Which parts of a patent are most applicable or effective in this context?
* Which functions should a trend mining system offer?

∗ information professionals, ∗∗ scientists

Figure 1: Questions for the semi-structured interviews

===4. RESULTS===
In this section the insights from the interviews are presented. First the characteristics of trends as viewed by the interview partners are described, followed by the questions and work tasks in the area of trend analysis as well as strategies for trend analysis. Section 4.4 takes a closer look at the parts and sections of a patent which are important for trend mining. The section closes with an overview of the functions a trend mining system should offer according to the interview partners.

====4.1 Characteristics of Trends====
One main factor for recognizing a trend is an increasing number of publications in that area (SCI1, SCI2, IP1, IP3). IP1 points out that there needs to be a critical mass of patents before one can call it a trend, and suggests numbers between 20 and 50 with a stronger tendency towards 50. IP2 also gives some numbers, which range from 10 to 15, likewise with a tendency towards the higher value. These numbers can of course not be taken as strict rules, but they show that magnitudes can be quite different across disciplines. One reason for this can be seen in the size of the research area and another in the understanding of a trend with regard to the content and the granularity of interest.

Other factors for recognizing a trend in the context of the patent domain are the appearance of new IPC-classes or the frequent co-occurrence of IPC-classes from different areas of research assigned to the patents (IP1).

When it comes to time spans of trend evolutions the interview partners mostly agree that it is a matter of several years. IP2 gives the smallest time span, ranging from several months up to one or two years, IP3 also gives a range from about two years, whereas IP1, SCI1 and IP4 describe longer time periods between five years (IP1) and ten years (IP4), with IP4 emphasising that these numbers can be quite different from discipline to discipline.

Regarding the granularity of the abstraction level of the content in the context of trend analysis, the interview partners are mainly interested in two levels, which are not specific to one group of interview partners: on the one hand trends on the top level of an entire research area, and on the other hand detailed subject-specific or technical developments within a field of interest. SCI1 explains for example that a scientist usually knows the specific developments within their own research area, whereas it would be interesting to see trends of neighboring disciplines, which might inspire their own direction of research. In contrast, SCI3 focuses on the more subject-specific type of trends. As mentioned before, the information professionals are also interested in both types of trends. IP1 explains that customers who want to use a specific technology (e.g. SMEs) are more interested in IPC-class level trends, whereas enterprises wanting to control a commercialization process or to get full market coverage are interested in more fine grained information, for example on substance or technology level.

====4.2 Information Needs in Trend Analysis====
Trend searches or analyses are conducted with different aims or objectives and are guided by different questions. One question coming up in both groups of interview partners is whether it is worthwhile to engage with a specific research topic (SCI1, IP1, IP3), although there are different reasons behind this question. SCI1 is interested in knowing if there is a possibility of funding that is worth the effort of preliminary work and writing an application, as this process takes approximately 1.5 years. IP1 stresses the importance of knowing if the area is already covered by patents, and IP3 points out that, once patents are available to the public, their existence means that competitors have been working in the area for more than 1.5 years, due to the 18 month delay in publication.

Another question in the context of trend analysis regards the persons, research teams and companies already engaged in the area of interest. On the one hand the interview partners are interested in knowing how many of them there are (SCI1), on the other hand they are specifically interested in observing the competitors (IP2) or finding out how big the development team of a specific competitor is, as this is an indicator of how important a topic is to that competitor (IP3).

Other questions have a broader focus, e.g. they ask about the development of new technical fields (IP1) or the direction the development in a technical field is taking (IP1, IP4). There are also questions which deal with possible markets (IP1).

====4.3 Points of Interest in the Trend Evolution====
The characteristics and information needs presented above influence the points of interest within the development of a trend. Most interview partners agree that the beginning of a trend is a point in time when a trend becomes interesting (SCI1, IP1–4). This is especially the case if the reason for the analysis is to get involved in a specific area of research.

The information professionals also consider other points in the evolution of a trend as interesting and stress the dependence on the requests of the clients and customers (IP1, IP3, IP4). Some customers are interested in licencing a specific technology, which means it needs to be functional already, and therefore a later point in the evolution of the trend is interesting (IP1). IP4 describes a similar scenario and assigns descending trends to those customers. A descending trend curve with regard to patent applications does not mean that a trend is ending, but that the technology has reached a certain degree of maturity.

====4.4 Applicable Sections of a Patent for Trend Mining====
The question about applicable sections for trend mining aimed on the one hand at clarifying which date related fields should be used for trend mining and on the other hand which content related sections of a patent are best suited for trend mining.

Date related fields for patents include application dates, priority dates and publication dates. The application date refers to the date of the application at the patent office, whereas the publication date denotes the date when the patent was made available to the public, which can be up to 18 months after the application was handed in. If there are multiple applications to different patent offices for an invention, these patents form a patent family (for further details on patent families see for example http://www.intellogist.com/wiki/Patent_Families). The earliest application date of a patent family is denoted as the priority date.

Related work in trend mining on patents uses different date related fields to explore temporal developments. Some works choose the application date (e.g. [6]) while others prefer the publication date of a patent (e.g. [4, 7]). The interview partners mostly agreed that for trend mining the priority date would be the date related field of choice, although some acknowledge that one could use the application date (IP2, IP3). According to IP1 the publication date could be useful if the impact of an invention on an industrial sector is of interest.

With respect to the content related sections, a wide variety has been used in prior research: title and abstract have been used as well as claims and descriptions and varying combinations of these (e.g. see [2, 5, 12, 14, 17]). The same variety is also found in the interviews. Table 1 lists the content related sections suggested or excluded by the individual interview partners.

{| class="wikitable"
|+ Table 1: Content related sections of a patent (not) applicable for trend mining
! scientists !! information professionals
|-
| claims (SCI1); claims (SCI3) || first main claim, main claims (IP4); perh. claims (IP3)
|-
| description (SCI2, SCI3) || first page of the description (IP3); the replication of contents in the description dilutes the results (IP1)
|-
| figures (SCI3) || figures (IP2); perh. figures (not for informatics or telecommunications)
|-
| || edited / enhanced titles (IP1); titles (IP3)
|-
| || edited / enhanced abstracts (IP1); abstract (IP2, IP3); abstracts are too general (IP4)
|-
| || introduction, especially the task description (IP3)
|}

Opinions diverge especially when it comes to titles and abstracts. IP1 explains that it depends on the database whether these two fields can be used for determining the content of a patent: some providers of patent information offer added value like titles and abstracts manually rewritten according to the contents of a patent, which makes these a good data resource, while titles and abstracts taken directly from the patent application often form a bad base for analysis (IP1), as the applicants try to conceal the content and claim of a patent in order to keep it as broad as possible.

====4.5 Trend Analysis Strategy====
Besides their information needs and their understanding of a trend, the interview partners were also asked for their strategies with regard to trend searches and analysis.

IP1 describes strategies for both of the above mentioned trend types. When the interest is primarily in the first type of trends, e.g. within an IPC-class, he first creates a basic set of documents and then aggregates the patents with regard to their respective patent families in order to avoid duplicate counting of the same invention. If necessary the document set is further aggregated according to national patent families, and then the number of patents per year based on the priority date is calculated and visualized. The last step would be to select technology areas with growth above average and, if necessary, conduct further analysis.

For the second trend type IP1 proposes an iterative approach, involving the client at every stage of the process. Especially at the beginning, according to IP1, clients are not always able to explain their objectives or questions explicitly. Another point is that concept names used within one company might be different from those commonly used in patents, and there might also be some variety in the concept names found in the respective patents. Therefore, as a first step, a patent landscape of the domain of interest needs to be generated and then explored together with the client. This serves the goal of getting a common understanding of the task at hand and identifying aspects of a topic which are of special interest to the client. These identified areas are then further analyzed with text mining techniques like clustering. Interest in trends at this stage is mainly ascribed to SMEs.

IP3 describes how to get the basic document set for the analysis. He starts off with known competitor names and their publications, then looks at the IPC classes and might take those into consideration as well.

====4.6 Functions of a Trend Mining System====
At the end of the interview the scientists and information professionals were asked what kind of functions a trend mining system should possess. These range from possibilities to drill down within a research area to more specific areas and explore trends at every stage, to having an alert function informing about changes in a predefined area of interest (SCI1).

IP1 describes the ideal trend mining system as a system possessing two modes, one standard mode and one advanced mode for experts. Both modes should be transparent to the user and make interim results accessible in order to make the process comprehensible. The advanced mode should additionally allow the user to take actions at various steps during the process, like incorporating additional knowledge about the domain in question or defining the number of clusters that should be built during a clustering step.

Another important aspect is interactive visualisation of the results, enabling the user for example to zoom in for more details (IP3). IP1 also remarks that visualisations which help to understand the contents of a set of documents are a desirable feature and make it possible to explore results together with customers.

===5. DISCUSSION===
As this study is exploratory and involves only a small sample, its findings can only give first insights into the domain and a starting point for further research. Nevertheless, the variety of information needs and understandings of trends within just the field of engineering sciences emphasises the necessity of incorporating the target audience in the development process of a trend mining system.

The presented results show that there are quite a few differences in the understanding of trends or the characteristics that make a trend interesting to the target audience, although the interview partners mostly had a background in engineering.

Mainly two types of trends that are interesting to the target audience could be identified: trends at the top level of an entire research area or domain, and subject-specific or technical developments within a specific area of interest. The results also show that the time spans encompassing a trend can be quite different according to the content granularity of interest and the domain of interest.

Additionally the results of the interviews show that not only emerging trends are of interest to the target audience, but also trends which have reached their height or are even on a decreasing path, as this indicates that a technology has reached a stage where it can be used and licenced by other organisations to incorporate it in their own products.

The study also shows that research is needed with regard to the question of which content related sections of a patent are best applicable for trend mining, as almost every content related section has been named by at least one interview partner.

The findings show as well that, at least for some of the patent searchers, it is important to integrate their customers and clients in the trend mining process. Therefore a system with such a target audience should also incorporate visualisation techniques that allow for exploring analysis results together with clients and make it easy for a non-patent specialist to understand the results shown by the trend mining system.

===6. CONCLUSIONS===
This paper gives first insights into the user perspective of trend analysis in the patent domain. Besides showing different perspectives and understandings of trends as well as pointing out characteristics that make a trend interesting to the target audience within the area of engineering sciences, the study gives first insights into the underlying tasks and information needs of the target audience and some requirements regarding the functionality of a trend mining system in the patent domain.

The study also shows the necessity for further research when it comes to the question of which content related sections of a patent are applicable for trend mining, as neither the interviews nor related research give a clear picture on this aspect.

===7. ACKNOWLEDGMENTS===
This work was conducted as part of the project "Trendmining für die Wissenschaft" (translation: trend mining for sciences) (T4P), which is a joint project of FIZ Karlsruhe – Leibniz Institute for Information Infrastructure and the Institute for Information Science and Natural Language Processing at the University of Hildesheim and is funded by the Leibniz Association in the context of the Leibniz Competition.

===8. REFERENCES===
[1] P.-L. Chang, C.-C. Wu, and H.-J. Leu. Using Patent Analyses to Monitor the Technological Trends in an Emerging Field of Technology: a Case of Carbon Nanotube Field Emission Display. Scientometrics, 82(1):5–19, 2010.

[2] S. Choi, J. Yoon, K. Kim, J. Y. Lee, and C.-H. Kim. SAO Network Analysis of Patents for Technology Trends Identification: a Case Study of Polymer Electrolyte Membrane Technology in Proton Exchange Membrane Fuel Cells. Scientometrics, 88(3):863–883, 2011.

[3] European Patent Office. Annual Report 2012: Statistics and trends: Total European patent filings, 2013. Online available at: http://www.epo.org/about-us/annual-reports-statistics/annual-report/2012/statistics-trends/patent-filings_de.html, last accessed: 2013-09-16.

[4] J. M. Gerken. PatMining – Wege zur Erschließung textueller Patentinformationen für das Technologie-Monitoring. PhD thesis, Universität Bremen, Bremen, 2012.

[5] P. Hu, M. Huang, P. Xu, W. Li, A. K. Usadi, and X. Zhu. Finding Nuggets in IP Portfolios: Core Patent Mining Through Textual Temporal Analysis. In X. Chen, editor, Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pages 1819–1823, New York, NY, USA, 2012. ACM.

[6] Y. Kim, Y. Tian, Y. Jeong, R. Jihee, and S.-H. Myaeng. Automatic Discovery of Technology Trends from Patent Text. In Proceedings of the 2009 ACM Symposium on Applied Computing, pages 1480–1487, New York, NY, USA, 2009. ACM.

[7] C. Lee, J. Jeon, and Y. Park. Monitoring Trends of Technological Changes Based on the Dynamic Patent Lattice: A Modified Formal Concept Analysis Approach. Technological Forecasting and Social Change, 78(4):690–702, 2011.

[8] H. Park, K. Kim, S. Choi, and J. Yoon. A Patent Intelligence System for Strategic Technology Planning. Expert Systems with Applications, 40(7):2373–2390, 2013.

[9] W. M. Pottenger and T.-H. Yang. Detecting Emerging Concepts in Textual Data Mining. In M. W. Berry, editor, Computational Information Retrieval, pages 89–105. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2001.

[10] O. Schröder. Facts and Figures 2014. European Patent Office, 2014. Online available at: http://documents.epo.org/projects/babylon/eponet.nsf/0/125011cc1d9b8995c1257c92004b0728/$FILE/epo_facts_and_figures_2014_en.pdf, last accessed: 2014-06-24.

[11] M.-J. Shih, D.-R. Liu, and M.-L. Hsu. Mining Changes in Patent Trends for Competitive Intelligence. In T. Washio, E. Suzuki, K. M. Tin, and A. Inokuchi, editors, Advances in Knowledge Discovery and Data Mining, volume 5012 of Lecture Notes in Computer Science, pages 999–1005. Springer, Berlin and Heidelberg, 2008.

[12] J. Tang, B. Wang, Y. Yang, P. Hu, Y. Thao, X. Yan, B. Gao, M. Huang, P. Xu, W. Li, and A. K. Usadi. PatentMiner: Topic-driven Patent Analysis and Mining. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 2012. ACM.

[13] The Thomson Corporation. Global Patent Sources: An Overview of International Patents. Thomson Scientific, London, 6 edition, 2007. Online available at: http://ip-science.thomsonreuters.com/m/pdfs/mgr/global_patent_sources.pdf, last accessed: 2013-09-16.

[14] M.-Y. Wang, D.-S. Chang, and C.-H. Kao. Identifying Technology Trends for R&D Planning Using TRIZ and Text Mining. R&D Management, 40(5):491–509, 2010.

[15] B. Yoon and Y. Park. A Text-mining-based Patent Network: Analytical Tool for High-technology Trend. The Journal of High Technology Management Research, 15(1):37–50, 2004.

[16] B. Yoon and Y. Park. A Systematic Approach for Identifying Technology Opportunities: Keyword-based Morphology Analysis. Technological Forecasting and Social Change, 72(2):145–160, 2005.

[17] J. Yoon, S. Choi, and K. Kim. Invention Property-function Network Analysis of Patents: a Case of Silicon-based Thin Film Solar Cells. Scientometrics, 86(3):687–703, 2011.
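Illustrative note on section 4.5: the first strategy attributed to IP1 (collapse patents into patent families so that one invention is counted once, count families per priority year, then select technology areas with above-average growth) can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the procedure IP1 or the authors actually use. The record fields `family_id`, `ipc_class` and `priority_year`, the function names, and the sample data are all hypothetical; the paper does not define "growth" or "average", so the sketch assumes growth is the absolute difference in family counts between two years and the average is the arithmetic mean over all IPC classes.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions, not from the paper.
SAMPLE_PATENTS = [
    {"family_id": "F1", "ipc_class": "H01M", "priority_year": 2010},
    {"family_id": "F1", "ipc_class": "H01M", "priority_year": 2010},  # same invention, second office
    {"family_id": "F2", "ipc_class": "H01M", "priority_year": 2012},
    {"family_id": "F3", "ipc_class": "H01M", "priority_year": 2012},
    {"family_id": "F4", "ipc_class": "H01M", "priority_year": 2012},
    {"family_id": "F5", "ipc_class": "G06F", "priority_year": 2010},
    {"family_id": "F6", "ipc_class": "G06F", "priority_year": 2012},
]


def family_counts_per_year(patents):
    """Count patent families per IPC class and priority year.

    Each family is counted once, represented by its member with the
    earliest priority year, to avoid counting one invention repeatedly.
    """
    representatives = {}
    for patent in patents:
        family = patent["family_id"]
        current = representatives.get(family)
        if current is None or patent["priority_year"] < current["priority_year"]:
            representatives[family] = patent

    counts = defaultdict(lambda: defaultdict(int))
    for patent in representatives.values():
        counts[patent["ipc_class"]][patent["priority_year"]] += 1
    return {ipc: dict(years) for ipc, years in counts.items()}


def above_average_growth(counts, start_year, end_year):
    """Return IPC classes whose growth exceeds the mean growth.

    Assumption: growth = families in end_year minus families in start_year.
    """
    growth = {
        ipc: years.get(end_year, 0) - years.get(start_year, 0)
        for ipc, years in counts.items()
    }
    if not growth:
        return []
    mean_growth = sum(growth.values()) / len(growth)
    return sorted(ipc for ipc, value in growth.items() if value > mean_growth)


if __name__ == "__main__":
    counts = family_counts_per_year(SAMPLE_PATENTS)
    print(counts)
    print(above_average_growth(counts, 2010, 2012))
```

A real analysis would likely need a relative growth measure and a longer time window, since the critical mass and time spans reported in section 4.1 differ considerably between disciplines.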