FacetSearch: A Faceted Information Search and Exploration Prototype

Christin Katharina Kreutz, Peter Boesten, Alex Witry, and Ralf Schenkel

Trier University, 54286 Trier, DE
{kreutzch, s4peboes, s4alwitr, schenkel}@uni-trier.de

Abstract. Faceted search and exploration is effective and desirable when confronted with bibliographic data sets. As such data sets are usually extensive, a powerful and flexible information retrieval tool needs to be provided. We propose FacetSearch, a scalable prototype especially designed for 1) comparing and visualizing trends in corpora via a trend visualization, as well as 2) exploring data and relations between elements with a pivot operation. Metadata can be accessed in multiple dimensions with facets and filter rules on them to help refine search queries or navigate through a corpus. An evaluation showed the potential and user acceptance of our approach when compared to the textual search engine Semantic Scholar.

Keywords: Faceted Search · Data Exploration · Faceted Metadata · Information Visualization

1 Introduction

With the ever-growing amount and richness of bibliographic data, exploring and interpreting this information with the help of tools is becoming increasingly important. A simple textual search is not sufficient to access the collection in its full dimension if a user has no prior knowledge of the data being looked for. Faceted search can help overcome this by providing attributes with which queries can be refined or a corpus can be explored. Some works even point out that users prefer searching with facets instead of only textual search [24] and that exploratory visualization systems improve users' understanding of collections [11].

When comparing applicants for an open position in science, multiple aspects of researchers need to be considered.
Not only their number of publications but also their h-index, their domain, the conferences and journals they published in and their publication development over the years could be relevant. In literature research, examining the quality and impact of citations as well as getting a brief insight into papers is essential to estimate their usefulness and centrality. For scientists, exploring a corpus is important to detect interesting researchers and publications associated with certain topics.

All these tasks require a powerful information retrieval system. We introduce a prototype tailored to tackle such information needs: FacetSearch, a system especially directed at visualizing and comparing trends and relations in large bibliographic collections as well as giving the opportunity to explore them by means of faceted search.

This work is structured as follows: in Section 2, we introduce some basic definitions and our data set. Section 3 describes the FacetSearch interface in detail, and Section 4 evaluates it. We close with related work and a description of possible improvements for our preliminary system.

Fig. 1: Nodes symbolize publications, edges between papers symbolize citations. C1 and C2 are citations of P, R1 and R2 are references of P.

2 Terms and Definitions

2.1 Definition

There are different information needs when working with an information system. A simple search is about refining a result set from many to few elements with the goal of finding the most relevant element for a query. Exploration is a much less precise task with no specific goal other than to casually examine an information space and navigate between elements.

Metadata can be clustered in alternative hierarchical categories in the information space. These categories are called facets. Facets contain elements, which are the aggregated potential values that occur in the corresponding metadata dimension.
Queries on elements of facets are called filter rules; they generate and refine results. A rule set is composed of all currently active filter rules. Faceted search interfaces support exploration and search by suggesting fitting facet filter options based on the respective rule set to progressively specify the results.

In the context of a publication-centered faceted search system, facets could be author names, journals or keywords attributed to publications. Elements in a facet containing keywords could be database, machine learning and algorithm. A filter rule on a facet containing keywords could query for algorithm and retrieve publications attributed with this keyword.

For our use case of bibliographic data, we need to specify linkages between publications. We distinguish between citations and references as visualized in Figure 1. Citations of a paper are the publications that cite it (incoming links); references of a paper are the publications it cites (outgoing links).

Fig. 2: FacetSearch interface with (A) menu, (B) facet area, (C) filter, (D) item view and (E) trend visualization.

2.2 Data Set

Our prototype uses data from the dblp computer science bibliography. The dblp corpus contains bibliographic metadata concerning authors, publications as well as venues (journals and conferences) from the field of computer science and adjacent areas [10]. As of July 2018, it holds information on over 4.2 million publications and more than 2.1 million authors. This data set is enriched with information from Aminer [20], based on DOIs matched with dblp data or, where DOIs were not present, on matching paper titles and authors. For about 100,000 papers, full texts extracted from PDFs with Science Parse [29] are included.
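The enrichment step described above (match on DOI where one is present, otherwise fall back to agreement of title and authors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the prototype: the record fields (`doi`, `title`, `authors`), the normalization and the function names are hypothetical and chosen only for this example.

```python
import re


def normalize(text):
    """Lowercase and collapse non-alphanumeric runs so strings compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fallback_key(record):
    """Hypothetical fallback key: normalized title plus sorted normalized authors."""
    authors = tuple(sorted(normalize(a) for a in record["authors"]))
    return (normalize(record["title"]), authors)


def enrich(dblp_records, extra_records):
    """Attach supplementary metadata to dblp-style records.

    A DOI match is preferred; title-and-author matching is used only
    when the dblp-style record carries no DOI.
    """
    by_doi = {r["doi"].lower(): r for r in extra_records if r.get("doi")}
    by_key = {fallback_key(r): r for r in extra_records}

    enriched = []
    for rec in dblp_records:
        if rec.get("doi"):
            match = by_doi.get(rec["doi"].lower())
        else:
            match = by_key.get(fallback_key(rec))
        merged = dict(rec)
        if match is not None:
            # Supplementary data is partial, so copy only fields that exist.
            for field in ("abstract", "keywords", "citations", "references"):
                if field in match:
                    merged[field] = match[field]
        enriched.append(merged)
    return enriched
```

Records that match neither way are passed through unchanged, which mirrors the fact that the supplementary data is only partially available.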
In dblp, author names, publication titles and related venues are stored; the extended collection additionally contains citations, references, abstracts, full texts, keywords and topics of publications. All supplementary data is partial. Except for abstracts and full texts, available metadata is used as elements in facets. Topics are merged into one facet with keywords as these attributes are similar but often missing. The additional data is useful in indicating the quality of papers (citations and references) as well as providing further aspects to explore (abstracts, full texts) and redefine (keywords) the data.

– Papers By Author
– Cited Papers By Author
– Authors Cited By Author
– Authors By Paper
– Referenced Papers By Author
– Papers Referencing Paper
– Papers By Collection
– Papers Cited By Paper
– Papers Referencing Collection
– Authors By Collection
– Papers Cited By Collection
– Authors Referencing Author
– Papers By Keyword

Fig. 3: Facets available in FacetSearch.

3 The FacetSearch Interface

The FacetSearch interface can be partitioned into five main components as seen in Figure 2. At the top (A), a menu lets a user switch rule sets and select which facets are displayed. In the facet area (B), the currently chosen facets are shown for exploring the data. Above, in the filter (C), the current rule set is given. On the right, the item view (D) lists the elements retrieved by the query denoted by the active rule set. Lastly, the trend visualization (E) depicts the temporal development of elements from rule sets.

The arrangement of areas in the interface is highly influenced by FacetLens [9]. We designed the user interface according to Shneiderman et al.'s principle of direct manipulation, where the visual representation of data, physical interaction with objects and the reversibility of actions are key elements [16].

3.1 Program Components

Facets are named according to the convention X By Y.
Y is the type of metadata or elements displayed in the facet, X is the type of results of a filter rule on this facet. Facets working on citations are an exception; they are structured as X Cited By Y or X Referencing Y. Elements in facets are dynamically queried and displayed. Figure 3 gives an overview of facets available in FacetSearch; the repertoire of facets is based on previous work [9]. In our prototype, tool tips provide short descriptions of the facets. A menu lets users pick the three facets that are displayed. Via search and different sorting methods on facet elements or iteration through them, elements can be selected and combined with conjunction or disjunction as new filter rules. Among the sorting options, the facet Papers By Keyword deserves special mention, as its elements can be sorted by the average percentage growth per year of the keyword's occurrences in papers.

The facet Papers By Author contains all author names as elements. If the elements Ralf Schenkel and Gerhard Weikum are selected and combined by conjunction in a filter rule, a query for papers co-authored by Ralf Schenkel and Gerhard Weikum is initiated, and publications fitting all requirements of the current rule set are retrieved and listed in the item view. The elements in all facets are updated according to the restricted result set.

In addition to the traditional facets, two facets, Papers By Year and Papers By Type, are always visible; they do not contain dynamically generated elements. These facets refine the time interval and the desired publication type of results.

Users can switch between multiple rule sets. The filter rules contained in a rule set are combined by conjunction, and the resulting entries are listed in the item view. Users can switch between a publication and an author view of the results; the default view is determined by the X component of the facet on which the last filter rule was generated. A search is available on the item view.

Fig. 4: On the left: Detail view for publication Efficient Text Proximity Search. On the right: Detail view for author Ralf Schenkel. One of the buttons for the pivot operation is marked red.

If an element from the item view is selected, a detail view opens as pictured in Figure 4. It provides all available information for an author or a publication. Next to some attributes in the detail view, a button with an arrow symbol (marked red on the left detail view in Figure 4) offers a shortcut functionality of our prototype, the pivot operation. When clicking this button, a user breaks free from the current environment and creates a new rule set without dropping the old one. The only active filter rule in the new rule set contains the elements which the user pivoted on. By pivoting, users can quickly explore the corpus.

In the case shown on the left of Figure 4, clicking on the pivot button next to the five authors (Ralf Schenkel, Andreas Broschart, Seung-won Hwang, Martin Theobald and Gerhard Weikum) of the publication Efficient Text Proximity Search generates a new rule set. The user has to decide whether the authors are combined conjunctively or disjunctively. In the new rule set, the only active filter rule would be the selection of the five authors as elements from the facet Papers By Author. The item view then holds all publications fitting these requirements.

The development of publications over time from different rule sets can be visualized in the trend visualization. With two drop-down menus, the depicted interval can be determined. There are several trend types available for display: number of publications, number of citations and number of references. With a click on a trend bar, a window with publications fitting the rule set from the corresponding year opens. Trends of up to three rule sets can be compared in the trend visualization at the same time.

Fig. 5: Comparison of publications for researchers R1 (blue) and R2 (green) for keywords database, information retrieval and data mining. Height of bars indicates number of publications per year.

Fig. 6: Comparison of citations of publications from researchers R1 (blue) and R2 (green) for publications with keywords database, information retrieval and data mining. Height of bars gives number of citations per year.

The trend visualization can be used for the estimation of keywords' relevance. If the keyword logic programming is chosen in facet Papers By Keyword, the trend visualization shows the decreasing number of publications attributed with this term, which could hint at a vanishing importance of the topic. Selecting the keyword smartphone instead could show an increasing number of publications over the years, which could indicate the growing significance of this area in the current time frame.

3.2 Exemplary Application

One of the tasks FacetSearch is suited for is the comparison of applicants for vacancies in academia. Filling these positions is challenging, as only experts in the vacant field can fully assess the quality and impact of candidates' research. Let's assume there are two researchers R1 and R2 who apply for a professorship in database systems. In a first step, the number of publications of each applicant in this area could be compared; the area can be described by the keywords database, information retrieval and data mining. When searching for publications from these authors (via facet Papers By Author) with OR-ed keywords (via facet Papers By Keyword), the comparison of results in the trend visualization is depicted in Figure 5. Based on these publication trends, R1 would be the better candidate as R1 publishes more papers. Figure 6 shows the outcome if the number of citations of the publications of the authors is compared.
Even though R1 published more than R2, R2 is cited much more often, which indicates that R2 is more important than R1 and would be the more suitable candidate for the open position if no other aspects influence this decision.

Q1 Who are the co-authors of author A in his most cited publication? How many publications do they have together?
Q2 What is the most frequent keyword associated with author A between years Y1 and Y2?
Q3 How many publications has author A in collection C?
Q4 Which of the publications citing publication P is the most cited one?
Q5 Compare collection C1 and C2: which one contains more publications with keyword K between years Y1 and Y2?
Q6 Compare authors A1 and A2: who was cited more between years Y1 and Y2?

Fig. 7: Overview of types of questions used in the evaluation.

3.3 Implementation Details

FacetSearch is a prototype implemented in Java 7 using local indices which store bibliographic information. Elasticsearch is used for saving and searching data. One index stores information concerning authors (unique name, possible further names, h-index), the other one contains data regarding publications (title, unique author names, publication year, collection in which it was published, keywords, full text, citations, references, average h-index of authors) as far as available.

In Elasticsearch, the maximum result size of a query is set to 10,000 elements by default. As we did not change that value, it is the maximum number of elements returned in facets where Y = Author or Y = Paper if a search is performed on them. A search on authors could query for the primary unique name or a further name. Fitting names are looked up in the author index. For every author with a good score, the unique name is used in the publication index to test if the author has at least one paper matching the current rule set; if so, the author is considered further.
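The two-step author lookup described above can be sketched in plain Python. The in-memory lists stand in for the two Elasticsearch indices; the scoring function, the score threshold, the representation of filter rules as predicates and all names are assumptions made for illustration. Only the cap of 10,000 hits is taken from the text. In this sketch the cap is applied before the rule-set check, which is one way a fitting author could end up not being retrieved.

```python
MAX_RESULT_SIZE = 10_000  # default maximum result size mentioned in the text


def score(query, author):
    """Toy relevance score (assumption): share of query tokens found in any name."""
    names = [author["unique_name"]] + author.get("further_names", [])
    tokens = query.lower().split()
    if not tokens:
        return 0.0
    name_tokens = {t for n in names for t in n.lower().split()}
    return sum(t in name_tokens for t in tokens) / len(tokens)


def search_author_facet(query, author_index, publication_index, rule_set,
                        min_score=0.5, limit=MAX_RESULT_SIZE):
    """Return unique author names matching the query and the current rule set."""
    # Step 1: look up fitting names in the author index, best first, capped.
    scored = [(score(query, a), a) for a in author_index]
    hits = sorted((sa for sa in scored if sa[0] >= min_score),
                  key=lambda sa: -sa[0])[:limit]

    # Step 2: keep an author only if at least one paper matches every rule.
    result = []
    for _, author in hits:
        name = author["unique_name"]
        if any(name in p["authors"] and all(rule(p) for rule in rule_set)
               for p in publication_index):
            result.append(name)
    return result
```

With a very frequent search term, the first step can fill the capped hit list with authors who have no paper in the current rule set, so an author who would fit is cut off before the second step ever sees them.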
Search terms with high frequency such as John or the should be avoided or made more precise to prevent the case of an element existing and fitting the current rule set but not being retrieved.

4 Evaluation

A direct comparison of FacetSearch with its predecessors PaperLens [8] as well as FacetLens [9] is impossible as these systems are unavailable online. In order to evaluate the usefulness of FacetSearch compared to a classical information system, we invited ten volunteers (two of them female, nine of them with a background in computer science) to participate in a user study. Our goal was to compare the usability and user acceptance of the systems by measuring speed of performance, error rate and the subjective satisfaction of users [16].

4.1 Experimental Set-Up

Our comparative study used FacetSearch as a faceted information retrieval system and Semantic Scholar [27], an academic search engine used with textual search. Like FacetSearch, Semantic Scholar contains information on papers, authors, venues, citations, and keywords, indicating that the target audience for both systems is similar. Between participants, the order of the presented systems was alternated to compensate for learning effects. Test users were given tutorials of both systems before they started working on the respective tasks. A pre-evaluation with two different subjects showed the need for another training step between tutorial and evaluation with FacetSearch, as the time to learn [16] with this system was initially underestimated. Two basic training tasks were included before users worked on the evaluated questions. Participants solved these in training with the help of an instructor. The whole session lasted about one hour for each member of the evaluation group.

(a) Time per question answered correctly in seconds for FacetSearch and Semantic Scholar. [Plot: time in seconds for Q1 to Q6, each for FaSe and SeSc.]

(b) Number of right and wrong answers per system and question with mean of difficulty (diffmean) given by test users.

      FaSe                      SeSc
      right  wrong  diffmean    right  wrong  diffmean
Q1    8      2      3.5         9      1      4.1
Q2    9      1      2.4         2      8      4.3
Q3    9      1      2.4         9      1      3.2
Q4    8      2      4.9         6      4      4.9
Q5    9      1      3.2         8      2      3.4
Q6    3      7      3.9         7      3      4.4

Fig. 8: Comparison of evaluation results from FacetSearch (FaSe) and Semantic Scholar (SeSc).

Figure 7 gives an overview of the six types of questions posed for each of the systems; variables were changed between them. The first two questions were directed at correct handling of authors, the next two aimed at handling publications, and the last two required usage of the trend visualization. Not only the correctness of responses was measured; answering time was also recorded. Additionally, users were asked to rate the difficulty of solving the posed questions with the respective systems on a scale from 1 to 7 (1 being very easy, 7 being very difficult) and to judge functionalities of FacetSearch at the end.

4.2 Results and Discussion

FacetSearch. Some works observe a change in search habits when a grouping mechanism such as faceted search is available [4]. This corresponds with our observation of users requiring additional time and training before being able to fully grasp the functionality of FacetSearch in Q1. Q2 and Q3 were answered quickly and correctly by the test users. Because Q4 uses references, this task was difficult for most participants as the required facet Papers Referencing Paper behaves unintuitively. Choosing elements from this facet displays, in the item view, the publications that have the chosen elements as references. The next question Q5 was designed for users to look at the trend visualization, but most of them chose another path to determine the right answer. With the last task Q6, many test users made the mistake of limiting the observed years in the wrong place or of selecting the wrong trend type.
Our evaluation showed the adequacy of the prototype for the different tasks except for Q6, where a clearer way of restricting the years needs to be found.

Avg. Rating  Question
3.2          How intuitive was working with FacetSearch? (1 - not intuitive to 7 - very intuitive)
5.2          How did you like the implementation of search on facets? (1 - very bad to 7 - very good)
5.1          How did you like the implementation and usage of the trend visualization? (1 - very bad to 7 - very good)
5.0          How did you like searching for information using FacetSearch compared to Semantic Scholar? (1 - clearly worse to 7 - clearly better)

Table 1: User feedback given in the evaluation.

Semantic Scholar. With Semantic Scholar's intuitive operational concept, most users had no difficulties in answering Q1 and Q3 correctly. The second task Q2 was answered incorrectly by eight users because in this search engine, keywords are ordered without apparent criteria. Users were confronted with the challenge of being unable to sort citations by attribute in Q4. The next question Q5 hardly posed a problem except in the search for a collection, as it cannot be searched for directly in Semantic Scholar. The last task Q6 caused some confusion; several users had problems in finding and comparing the citation diagrams.

Comparison. In Figure 8, evaluation results from FacetSearch and Semantic Scholar are compared. Figure 8a gives an overview of the time needed for correctly answering each of the six questions, split by system. The plot indicates that, overall, participants were faster when using Semantic Scholar. Only when solving Q2 and Q6 was FacetSearch the faster option. Figure 8b quantifies the correct and incorrect answers as well as the subjective mean of difficulty for the questions. With FacetSearch, 76.67% of all tasks were finished correctly; with Semantic Scholar, only 68.33% of results were correct.
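The two percentages follow directly from the counts of correct answers in Figure 8b (ten participants, six questions per system). A short check, with variable and function names chosen only for this example:

```python
# Correct answers per question (Q1 to Q6) as reported in Figure 8b.
correct = {
    "FacetSearch": [8, 9, 9, 8, 9, 3],
    "Semantic Scholar": [9, 2, 9, 6, 8, 7],
}
PARTICIPANTS = 10


def correct_rate(counts, participants=PARTICIPANTS):
    """Share of correctly answered tasks in percent, rounded to two decimals."""
    return round(100 * sum(counts) / (participants * len(counts)), 2)


rates = {system: correct_rate(counts) for system, counts in correct.items()}
print(rates)
```

That is 46 of 60 tasks for FacetSearch and 41 of 60 for Semantic Scholar.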
The rating of subjective difficulty of solving the problems by test users led to a median of 3 for FacetSearch and 4 for Semantic Scholar. Participants considered the faceted search prototype easier to use for our exemplary tasks even though their completion took longer. This phenomenon of initial problems but successful and easy usage when confronting users with faceted search has also been observed in user studies, as the users' ability to use and understand a system evolves due to learning processes [11,17].

There are classes of tasks which cannot be solved by systems such as Semantic Scholar but are manageable with FacetSearch. Such tasks were not included in our evaluation as they would have resulted in failure for the text-based search engine. They include the following example problems: The best publication (e.g. with most citations) from a set of publications is hard to find as sorting options are missing. Exploration of publications from collections is difficult as no direct search is implemented for venues. Trends for referenced publications from an author or publication cannot be displayed, nor can citation trends for queries more complex than single authors.

User Feedback. The last part of the evaluation was a user feedback rating the intuitiveness and overall implementation of certain aspects of FacetSearch. Table 1 gives an overview of the assessments.

Some participants of the evaluation were confused because facet elements are simultaneously used for input and output. Filter rules are generated from them, but when a filter rule is added, elements dynamically change to match the query result. This behavior is complicated and a reason to classify FacetSearch as an expert system. One user fittingly noted that the more complex requirements for a system become, the more complex its operation becomes.
Other users remarked that they trusted the results of FacetSearch far more than those of Semantic Scholar because of the latter's inconsistent data, for example when searching for co-authors in author profiles. The stable structures in FacetSearch were complimented by many; the overall consensus was that with a sufficient period of professional training, users would be much faster with our prototype.

5 Related Work

FacetSearch is an extension of PaperLens [8], an information retrieval system specialized in visualizing trends and relationships in bibliographical data, and the faceted classification system FacetLens [9], where facets are ordered in tiles and elements in them can be selected gradually to refine queries. Just like most visualization and exploration tools for bibliographical data, they do not scale well. Some tools even come with a limited number of elements per facet [26,7], whereas our prototype is scalable and has no restricted value range. Tiles are used in other research to display distribution and overlap of searched terms in documents [5,14], but in our case, with only a small amount of textual data and separated search on each area, such a representation would not be meaningful for the most part.

Daffodil [3] is an information system sitting on top of several digital libraries which supports its users visually by providing so-called stratagems, high-level services which offer the possibility of depth search in different, possibly inconsistent views. Stratagems are the equivalent of facets in our approach. Working with facets has the advantage of being able to avoid empty result sets while also assuring consistency in the different views of the displayed data, as only facet elements fitting the current rule set are displayed.

Some works like FacetMap [18] and FacetedDBLP [1] support search and exploration on faceted data but hardly provide the possibility to visualize trends and relations in them like our trend visualization does.
PaperQuest [12] focuses on visualizing literature review research but does not cover exploration of authors or the intertwined network of venues as important factors in this process, which we can access by pivoting. The CiteVis system [19] displays citations of papers in a grid-based layout where it highlights searched-for publications, authors and keywords. In contrast to our approach, it is only suitable for small corpora and exploration on facets is unavailable. TIARA [22] can be used for displaying the temporal development of keyword clouds extracted from texts and offers a faceted exploration option. Unfortunately, the exploration is not as powerful and exhaustive as ours. Other research displays ego-centred networks to visualize and explore relations between authors, authors and venues, or keywords and venues in a temporal context but does not offer the possibility to define complex queries on papers [15].

Further systems such as PivotPaths [2], which visualizes the links between different facets as tripartite graphs and focuses on the exploration of connections, seem to scale better but confuse users when too many elements are displayed simultaneously. With Bibliography Explorer++ [25], where the exploration of publications and citations was the focal point, similar problems occurred in user studies. Although users were confused by the complexity of facets dealing with citations and references in our evaluation, FacetSearch's clear structure prevents users from being overwhelmed with too much information.

6 Conclusion and Future Work

We presented FacetSearch, an information retrieval prototype designed for search and exploration of bibliographic data via facets as well as display and comparison of trends in result sets by means of a trend visualization.
Our tool is able to keep up with text-based search engines such as Semantic Scholar according to our evaluation, although users thought of it as less intuitive (3.2). This observation is no surprise as there are many features which need understanding and experience. Most participants coped well with the usage of facets (5.2) and the trend visualization (5.1). Even compared to Semantic Scholar, our prototype was able to satisfy users (5.0). Nevertheless, FacetSearch is an expert system which requires a certain period of familiarization.

Further work such as PaperQuest [12] or Bow Tie Academic Search [6] uses colouring and shapes to present more information on citations while avoiding overloading the interface with numbers and thereby confusing users. These techniques could also be applicable for publications in the item or detail view. In PivotPaths [2], the visualization of connectedness of facets was valued by users; in our case, we could incorporate a graph representing the citations and references between collections and their associated keywords. Another area for improvement would be incorporating Doc2Vec [13] to extend and broaden the keyword search and present a wider selection of papers fitting the search terms, as this search is rather unspecific and the possible information gain exceeds the cost of displaying unfitting results [21].

Potential for future enhancements lies in the field of collections: Visualizing the citation/reference linkage graph of collections could provide useful insights. Venue ratings from Eigenfactor [23] or the CORE Conference/Journals DB [28] could be included so citations can be rated with them.

References

1. J. Diederich, W.-T. Balke: FacetedDBLP - Navigational Access for Digital Libraries. TCDL 4(2) (2008)
2. M. Dörk, N.H. Riche, G. Ramos, S. Dumais: PivotPaths: Strolling through Faceted Information Spaces. TVCG 18(2): 2709–2718 (2012)
3. N. Fuhr, C.-P. Klas, A. Schaefer, P. Mutschke: Daffodil: An Integrated Desktop for Supporting High-Level Search Activities in Federated Digital Libraries. ECDL 2002: 597–612
4. M. Hearst: Clustering versus faceted categories for information exploration. Commun. ACM 49(4): 59–61 (2006)
5. M. Hearst: TileBars: Visualization of Term Distribution Information in Full Text Information Access. CHI 1995: 59–66
6. T. Khazaei, O. Hoeber: Supporting academic search tasks through citation visualization and exploration. Int J Digit Libr 18(1): 59–72 (2017)
7. C. Lambeck, J. Wojdziak, R. Groh: Facet Lens - Local Exploration and Discovery in Globally Faceted Data Sets. DESIRE 2011: 85–88
8. B. Lee, M. Czerwinski, G. Robertson, B. Bederson: Understanding research trends in conferences using paperLens. CHI Extended Abstracts 2005: 1969–1972
9. B. Lee, G. Smith, D. Tan, et al.: FacetLens: Exposing Trends and Relationships to Support Sensemaking within Faceted Datasets. CHI 2009: 1293–1302
10. M. Ley: DBLP - some lessons learned. PVLDB 2(2): 1493–1500 (2009)
11. Y. Liu, S. Barlowe, Y. Feng, et al.: Evaluating exploratory visualization systems: A user study on how clustering-based visualization systems support information seeking from large document collections. Information Visualization 12(1): 25–43 (2012)
12. A. Ponsard, F. Escalona, T. Munzner: PaperQuest: A Visualization Tool to Support Literature Review. CHI Extended Abstracts 2016: 2264–2271
13. Q. Le, T. Mikolov: Distributed Representations of Sentences and Documents. ICML 2014: 1188–1196
14. H. Reiterer, G. Tullius, T. Mann: INSYDER: a content-based visual-information-seeking system for the Web. Int J Digit Libr 5(1): 25–41 (2005)
15. F. Reitz: A Framework for an Ego-centered and Time-aware Visualization of Relations in Arbitrary Data Repositories. arXiv preprint arXiv:1009.5183v1 (2010)
16. B. Shneiderman, C. Plaisant, M. Cohen, S. Jacobs, N. Elmqvist: Designing the User Interface - Strategies for Effective Human-Computer Interaction, 6th Edition. Pearson 2016
17. V. Sinha, D.R. Karger: Magnet: supporting navigation in semistructured data environments. SIGMOD '05: 97–106
18. G. Smith, M. Czerwinski, B. Meyers, D. Robbins, G. Robertson, D.S. Tan: FacetMap: A Scalable Search and Browse Visualization. TVCG 12(5): 797–804 (2006)
19. J. Stasko, J. Choo, Y. Han, M. Hu, H. Pileggi, R. Sadana, C.D. Stolper: CiteVis: Exploring Conference Paper Citation Data Visually. IEEE VIS 2013 (poster)
20. J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, Z. Su: Arnetminer: Extraction and mining of academic social networks. KDD 2008: 990–998
21. J. Teevan, S.T. Dumais, Z. Gutt: Challenges for Supporting Faceted Search in Large, Heterogeneous Corpora like the Web. HCIR 2008: 6–8
22. F. Wei et al.: TIARA: a visual exploratory text analytic system. KDD '10: 153–162
23. J.D. West, T.C. Bergstrom, C.T. Bergstrom: The Eigenfactor Metrics™: A Network Approach to Assessing Scholarly Journals. C&RL 71(3): 236–244 (2010)
24. K.-P. Yee, K. Swearingen, K. Li, M. Hearst: Faceted metadata for image search and browsing. CHI '03: 401–408
25. J. Zhang, G. Marchionini: Evaluation and evolution of a browse and search interface: Relation Browser++. dg.o '05: 179–188
26. J. Zhao, S. Drucker, D. Fisher, D. Brinkman: TimeSlice: interactive faceted browsing of timeline data. AVI 2012: 433–436
27. https://www.semanticscholar.org
28. http://www.core.edu.au/conference-portal
29. https://github.com/allenai/science-parse