Case-Based Comparison of Career Trajectories

Kedma Duarte (1), kedmaduarte@gmail.com

Rosina O. Weber (0), rosina@drexel.edu

Roberto C. S. Pacheco (1), pacheco@egc.ufsc.br

(0) College of Computing and Informatics, Drexel University, USA

(1) Graduate Program in Knowledge Engineering and Management, Federal University of Santa Catarina, Brazil

Data generated across time may not be easily comparable in its original form thus potentially leading to results that may be perceived as unfair to some. We investigate quality assessment of scholarly researchers from their curricula vitae (CVs) for processes such as hiring, promotion, and grant funding. In previous work, we demonstrated that case-based reasoning (CBR) offers advantages as a transparent methodology to assess researcher quality. Its benefits include consistency, transparency, ability to adapt to specific purposes, and ability to provide explanation. The problem we now face is how to preprocess the data from the CVs to compare researchers whose scholarly production is achieved under different conditions, different points in time, and span different career trajectory lengths. We propose strategies to deal with these aspects of time during preprocessing of the data for case representation. We use 1,000 CVs from the Brazilian Lattes database to illustrate.

case-based reasoning • time series • trajectory • career trajectory • curriculum vitae • normalization • recency

1 Introduction

There is a growing interest in relying on high quality profiling systems to conduct data studies to, as stated by Lane [ 1 ], “make science more scientific”. Researcher quality assessment is a crucial task because characteristics of research metrics steer science and technology decisions, ultimately steering progress, economics, and our way of life [ 2 ].

Unfortunately, private organizations have started to explore this niche and are now steering our future by offering research metrics that rely on incomplete and flawed automatically crawled data [ 3 ]. In response to this present state, a group of researchers gathered at the 2014 International Conference on Science and Technology Indicators to produce the Leiden Manifesto [ 4 ]—a set of 10 principles for research quality metrics that includes attention to transparency, flexibility, and context, amongst others.

At the 2016 International Conference on Science and Technology Indicators, we proposed a CBR approach to manipulate profiling data for researcher quality assessment [ 5 ]. Our CBR method can be tailored to specific contextual purposes to meet some of the objective principles from the manifesto because of its consistency, transparency, ability to adapt to specific purposes, and ability to provide explanation.

In the proposed methodology, CBR is used to classify candidate researchers as either fit or unfit for a purpose. Purposes are characterized by features that reflect specific jobs or promotions. Each entails a series of references of quality such as publications in a journal or conference that are considered more relevant than others. The characterization of the purpose comes from the users who adopt the methodology to classify CVs of applicants. The use of CBR in this task assumes that assessing quality ultimately implies predicting future success.

In this paper, we describe the CBR implementation and discuss three preprocessing steps that are required due to temporal aspects of the data. The first is a standard normalization step so that absolute volumes of scholarly production are replaced by relative values of productivity. This avoids comparing absolute numbers of production accomplished in years when conditions were different. The second aspect is recency. We analyze researchers’ accomplishments to assess whether more recent production is more predictive of quality than older production. The third refers to grouping the relative values of productivity depending on the lengths of career trajectories and recency. The CBR system, as it is implemented now, uses one aggregated data point for each attribute. Deciding how to group this data depends on directives from the users: whether they favor experience, favor productivity, or want both to have the same emphasis.

This paper’s intended contributions are to introduce the challenges stemming from using temporal data from CVs to assess researcher quality with CBR, and to propose preliminary strategies to address them. We illustrate these challenges and strategies with data from 1,000 CVs from the period 2001 to 2014 from the Brazilian Lattes database [ 6 ]. The expected value of these strategies is to address these time-related challenges in a way that preserves transparency and enables an easy-to-understand substantiation.

In the next section, we provide the background for this work, including how we proposed to use CBR for assessing researcher quality. We also mention a few related works in time and CBR, and in time-series prediction. In Section 3, we describe the challenges and our proposed strategies. We lay out directions of future work in Section 4.

2 Background

In this section, we introduce some of the concepts used in this paper. We start with normalization, move to time-series approaches, and then discuss some aspects of dealing with career trajectories. In the final subsection, we describe the CBR approach that motivates this work.

Normalization is a method applied before classification to equalize the ranges of features measured on different scales, so that the features keep the same proportion and become comparable [ 7 ]. Several techniques have been proposed to implement normalization (e.g., Min-Max Normalization, Linear Scaling to Unit Range, Median Normalization, and Z-Score Normalization), and many studies have investigated the relation between choosing the appropriate normalization technique and improving classification accuracy (e.g., [ 7 ][ 8 ]). These studies demonstrated that classification accuracy depends on the normalization method.
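As a brief illustration of two of the techniques named above, the following sketch implements Min-Max and Z-Score normalization over a single feature. The function names and the sample counts are hypothetical and are not taken from the cited studies.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Center values on their mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical yearly publication counts for one feature.
counts = [2, 4, 6, 8, 10]
print(min_max_normalize(counts))
print(z_score_normalize(counts))
```

Min-Max keeps the values inside a fixed range, whereas Z-Score expresses each value as a number of standard deviations from the mean, so the two can rank outliers quite differently.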

GenericPred [ 9 ] is a method for long-term time-series forecasting that addresses chaotic behaviors such as natural phenomena strongly dependent on initial conditions, which are many times unknown and consequently difficult to model and predict. The results of this approach demonstrated a significant gain in accuracy over traditional time series methods for both short and long-term predictions.

Time series built from bibliometric data have been used to discover distinguished researchers [ 10 ]. This approach is able to differentiate researchers who have contributed a significant achievement from those publishing a few papers over a long period.

Time-series data of renal transplantation patients has been used in case-based binary classification [ 11 ]. The approach compares time series of creatinine courses using a distance measure based on linear regression.

Dynamic time warping (DTW) [ 12 ] is a distance measure to compare temporal sequences based on dynamic programming. DTW is much more robust than measures based on the Euclidean distance [ 13 ] as it allows an elastic shifting of the time axis.
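To make the idea of elastic shifting concrete, here is a minimal sketch of the textbook DTW recurrence with an absolute-difference cost. It is not the specific variant studied in [ 12 ] or [ 13 ], and the function name and sample sequences are assumptions made for illustration.

```python
def dtw_distance(s, t):
    """Textbook dynamic time warping distance, O(len(s) * len(t)) time."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning s[:i] with t[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two hypothetical sequences with the same shape, one delayed by a step.
a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1, 0]
print(dtw_distance(a, b))  # 0.0
```

The two sequences have different lengths, so a point-by-point Euclidean comparison is not even defined for them, while DTW aligns the repeated leading value and reports a distance of zero.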

2.1 Career Trajectories

The terms career and trajectories are viewed as synonyms that describe the path from entering into the job market and its following steps [ 16 ]. Along the same lines, the career of a researcher has been described as a longitudinal account of an individual’s productivity [ 17 ]. Our focus in this paper is on career trajectories from the perspective of the productivity of researchers along their careers [ 18 ][ 19 ].

The consideration of time when studying career trajectories is important for the reliability of indicators and rankings. Previous indicators or metrics that attempted to define a fixed interval of years have been highly criticized [ 20 ]. The purpose of normalizing time intervals and using annual productivity when assessing researcher quality is to make the same transparent standards available to all researchers who are assessed.

Our main problem is that this process must be transparent and able to substantiate its fairness. The assessment has to clearly distinguish the biases that come from the description of purpose from the biases that originate in learning methods. The first issue we investigate is how to demonstrate whether an assessment can be fair when quality assessment is case-based, which requires comparison between researchers whose career trajectories span different intervals.

2.2 Purpose-Oriented Case-Based Researcher Quality Assessment

The purpose-oriented CBR approach classifies researchers as fit or unfit for the purpose of a target process (Fig. 1) such as hiring or promotion [ 5 ]. This method supports the Leiden Manifesto [ 4 ] by incorporating purpose in research metrics, aligned with the concept that quality means fitness for purpose [ 14 ].

A purpose-oriented approach requires users to input the purpose as a set of standards or examples. For instance, for a target process to hire a researcher for the federal university of Rio de Janeiro to work with the Zika virus, publications in local conferences where geographic issues are the focus may be considered of high importance when assessing quality of an applicant. Users can also indicate examples of fit and unfit researchers, which can then be used for weight learning.

The first parameter to be captured for a purpose p is the target interval of interest N, where n ∈ N is the year in question within the interval of years N that are to be included in the data from candidates to be considered for a given purpose. Years y of importance are y1 = Initial year, and yn = Final year.

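[Fig. 1. Purpose-oriented case-based classification. Input: CV data of an unclassified researcher. Classifier. Output: classified researcher, whose CV is either fit for a purpose or unfit for a purpose.]

$$Sim(c, c') = \sum_{j=1}^{m} w_j \cdot sim_j(a_j, a'_j) \qquad (1)$$

We define weights $w_1, \ldots, w_m$, $w_j \in [0, 1]$, for each purpose p. The local similarity measure between attributes $a_j$ and $a'_j$ used in the data in this article is defined by:

$$sim_j(a_j, a'_j) = 1 - \frac{|a_j - a'_j|}{d_j} \qquad (2)$$

where $d_j > 0$ is the maximum distance between $a_j$ and $a'_j$.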

This way, the case-based classification of fit or unfit does not assess similarity between time series but between flat cases, with weights on each attribute stemming from the characteristics of the purpose p.
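The comparison of flat cases can be sketched as follows, assuming a weighted sum of local similarities in which each local similarity is one minus the absolute attribute difference scaled by the maximum distance for that attribute. The retrieval step shown here (the label of the single most similar case) is an assumption, since the text does not specify it, and all names and values are hypothetical.

```python
def local_similarity(a, b, max_distance):
    """One minus the absolute difference scaled by the largest possible distance."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return 1.0 - abs(a - b) / max_distance


def global_similarity(case_a, case_b, weights, max_distances):
    """Weighted sum of local similarities over all attributes of two flat cases."""
    return sum(
        w * local_similarity(a, b, d)
        for a, b, w, d in zip(case_a, case_b, weights, max_distances)
    )


def classify(query, case_base, weights, max_distances):
    """Assumed retrieval step: return the label of the most similar stored case."""
    best = max(
        case_base,
        key=lambda entry: global_similarity(query, entry[0], weights, max_distances),
    )
    return best[1]


# Hypothetical case base: (attribute vector, label) pairs with two attributes.
case_base = [([0.9, 0.8], "fit"), ([0.2, 0.1], "unfit")]
weights = [0.7, 0.3]        # hypothetical purpose-specific weights
max_distances = [1.0, 1.0]  # attributes assumed to lie in [0, 1]
print(classify([0.8, 0.6], case_base, weights, max_distances))  # fit
```

Because the weights come from the purpose, the same pair of CVs can be close under one purpose and distant under another, which is what lets the method be tailored without changing the case representation.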

3 Challenges and Proposed Strategies

We use a simple example to illustrate the challenges in preprocessing data to populate cases for case-based researcher quality assessment from CV data. Suppose we plan to use our case-based quality assessment approach to classify applicants as either fit or unfit for a given purpose. One of the parameters for a job is the target interval of interest N, which delimits the years that are considered relevant to include in the examination of candidates. For example, a job opening seeking social media experts would probably not include accomplishments from candidates that predate the existence of social media. For data in Table 1, the target interval of interest is five years. Each line in Table 1 refers to the volume of one type of accomplishment (e.g., journal articles in one field and reputation) produced by job applicants (i.e., researchers).

Suppose the data in Table 1 on the left columns designated by aijn reflect all the items produced by each applicant. The maximum number of accomplishments varies each year. Only in Year 1 and Year 5, a maximum of five accomplishments was produced. In Year 3 however the maximum produced was three. We contend that there are external factors that may have contributed to higher and lower levels of productivity. One common example is a reduction of participation in conferences in periods of economic depression. We therefore normalize these absolute values and convert them into productivity rates, using Equation 3.

The results of the normalization are laid out in the right columns under āijn. They show how the same absolute value (e.g., when i = 3) can represent the maximum productivity in Year 3 and 60% in Year 1. For example, if using absolute values, the production of the applicant in the third row would be considered inferior to that of the applicant in the second row in Year 2, whereas relative values make them equal. This simple step is easy to describe to a broad audience and does not depend on the characterization of the purpose.
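The normalization step can be sketched as below. Equation 3 is not reproduced in this text, so the sketch assumes that each absolute value is divided by the maximum produced by any applicant in the same year, which is consistent with the example of a value of 3 being the maximum in Year 3 and 60% in Year 1. The counts are hypothetical and are not the values of Table 1.

```python
def normalize_by_yearly_max(volumes):
    """Convert absolute counts into productivity rates.

    volumes[i][n] is the absolute count for applicant i in year n.
    Each count is divided by the largest count observed in that year.
    """
    n_years = len(volumes[0])
    yearly_max = [max(row[n] for row in volumes) for n in range(n_years)]
    return [
        [row[n] / yearly_max[n] if yearly_max[n] > 0 else 0.0 for n in range(n_years)]
        for row in volumes
    ]


# Hypothetical counts for three applicants over a five-year target interval.
volumes = [
    [5, 4, 3, 2, 5],
    [3, 2, 3, 4, 1],
    [1, 4, 0, 2, 3],
]
rates = normalize_by_yearly_max(volumes)
print(rates[1][0], rates[1][2])  # 0.6 1.0
```

In this hypothetical data the second applicant produced three items in both Year 1 and Year 3, yet the rate is 0.6 in the first case and 1.0 in the second because the yearly maximum differs.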

3.2 Recency

The challenge with respect to recency stems from the notion that recent data may be perceived as more current and therefore more relevant in time series classification. This possible perception may lead to claims of injustice and therefore we need to establish a way to assess whether or not recent data is more influential. Note that in our proposed approach, the target process may dictate the importance of recent accomplishments. Assessing how influential recent data is would be required for implementations when the target process is neutral about recency.

Given our assumption that assessing quality implies predicting future success, it is consistent to interpret that data is influential or relevant when it is predictive. To do this, we take the target interval of interest N and set the last year aside as actual to provide outcome classes. The intuition is that if a given year’s data has cases that correctly predict the actual year then this year’s data are predictive and hence influential.

To demonstrate this proposed analysis, we start from a hypothetical purpose, namely, a job opening that seeks a researcher who is a successful collaborator. This hypothetical purpose was captured using rules that assigned more importance to publications and funded projects achieved in collaboration than to solo authored accomplishments. The data where we applied these rules to determine who was fit or unfit for a collaborative job was selected from the Brazilian Lattes database [ 6 ]. For this reason, some of the parameters used to create rules reflected that local culture. These data and weights were described in [ 5 ].

For the analysis we now describe, we use weights learned in [ 5 ]. The data we used in this analysis is new. We started from the entire Lattes database, which holds around 4 million CVs. From these, we selected 212,000 CVs of researchers with completed doctoral degrees. In order to work with dense data, we kept only researchers who were continuously productive throughout the target interval of interest, which we defined as 2001 to 2014, resulting in 50,000 CVs. We then kept CVs from researchers with a growing absolute number of accomplishments to eliminate researchers with periods of inactivity. This resulted in 20,000 CVs. For the analysis we show in this paper, we used a randomly selected sample of 1,000 CVs.

For the target interval of interest from 2001 to 2014, we set aside 2014 as actual to provide outcome classes. Our goal is to assess how predictive the data from years 2001 to 2013 are. For each year, we use our case-based implementation with leave-one-out cross validation (LOOCV) [ 21 ] to predict whether each researcher would be classified as fit or unfit for the hypothetical collaborative purpose described above. We compute for each researcher whether the classification using each year is correct (i.e., true positive, true negative) or incorrect (i.e., false positive, false negative).

Table 2 shows the average accuracy (AA) for all 1,000 researchers using their data from each year in the first row. The second and third rows present respectively accuracy of fit (i.e., ratio of true positives) and accuracy of unfit (i.e., ratio of true negatives).
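The per-year evaluation can be sketched as a leave-one-out loop that counts the four outcomes and derives accuracy from them as (TP + TN) divided by the total. The classifier passed in below is a hypothetical one-attribute nearest-neighbor stand-in, not the weighted case-based classifier of the paper, and the data are invented for illustration.

```python
def loocv_confusion(cases, labels, classify_fn):
    """Leave-one-out: classify each case using all other cases as the case base."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for i, (case, actual) in enumerate(zip(cases, labels)):
        base = [(c, l) for j, (c, l) in enumerate(zip(cases, labels)) if j != i]
        predicted = classify_fn(case, base)
        if predicted == "fit":
            counts["TP" if actual == "fit" else "FP"] += 1
        else:
            counts["TN" if actual == "unfit" else "FN"] += 1
    return counts


def accuracy(counts):
    """Share of correct classifications among all classifications."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


def nearest_neighbor(query, base):
    """Hypothetical stand-in classifier: label of the closest single-attribute case."""
    return min(base, key=lambda entry: abs(entry[0] - query))[1]


# Hypothetical data for one year n: one aggregated attribute per researcher,
# with labels taken from the year that was set aside as actual.
cases = [0.9, 0.8, 0.2, 0.1, 0.55]
labels = ["fit", "fit", "unfit", "unfit", "unfit"]
counts = loocv_confusion(cases, labels, nearest_neighbor)
print(counts, accuracy(counts))
```

Running this loop once per year from 2001 to 2013 would yield one set of counts per year, which is the kind of series that can then be compared across years.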

These results in Table 2 are difficult to interpret because we do not know whether the averages include the same or different researchers. To better understand these results, Fig. 2 plots true positives, true negatives, false positives, and false negatives. Positives are represented with continuous lines, and negatives are represented with dots. Light color is for true and dark for false. A consistent trend would have true positives and true negatives moving in a direction opposite to false positives and false negatives. This would mean that accuracy increased because, for example, true positives increased as false negatives decreased. If accuracy increased in more recent years, then we would have to increase the relative relevance of these years when aggregating these values. The lines in Fig. 2 do not show consistent results. Our conclusion is thus that there is no consistent trend supporting the interpretation that recent data is more or less predictive than older data. Hence, for every āijn with n ∈ N = {2001, ..., 2013}, we set gn = 1. Values for gn are used in the next step when values are aggregated.

3.3 Career Trajectories

A researcher ri ∈ R has a career trajectory CTi that reflects the researcher’s years of activity. For aggregating researchers’ production, we need:

CTi = y

Max(CTi) = ymax

Min(CTi) = ymin
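The text does not give the aggregation formulas, so the following is only a sketch of how one aggregated data point per attribute might be computed from ymin, ymax, and the recency weights gn. Both modes, the function name, and the sample trajectories are assumptions: summing is taken here to favor experience, and dividing by the trajectory length (ymax - ymin + 1) is taken to favor productivity.

```python
def aggregate(rates_by_year, recency_weights, mode="productivity"):
    """Collapse yearly productivity rates into one value for an attribute.

    rates_by_year: {year: normalized rate} for the years of a researcher's
    career trajectory that fall inside the target interval.
    recency_weights: {year: g_n}; missing years default to 1.0.
    """
    if not rates_by_year:
        return 0.0
    weighted = [recency_weights.get(y, 1.0) * r for y, r in rates_by_year.items()]
    if mode == "experience":
        # Assumed variant: a plain sum rewards longer trajectories.
        return sum(weighted)
    if mode == "productivity":
        # Assumed variant: divide by trajectory length, y_max - y_min + 1.
        y_max, y_min = max(rates_by_year), min(rates_by_year)
        return sum(weighted) / (y_max - y_min + 1)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical trajectories with the same yearly rate but different lengths.
senior = {2010: 0.5, 2011: 0.5, 2012: 0.5, 2013: 0.5}
junior = {2012: 0.5, 2013: 0.5}
g = {year: 1.0 for year in range(2001, 2014)}  # every g_n set to 1

print(aggregate(senior, g, "experience"), aggregate(junior, g, "experience"))
print(aggregate(senior, g, "productivity"), aggregate(junior, g, "productivity"))
```

Under these assumptions the two researchers tie on productivity while the longer trajectory scores higher on experience, which mirrors the user directive about favoring one, the other, or both.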

4 Concluding Remarks and Next Steps

This paper introduces time-related challenges faced when implementing CBR for researcher quality assessment. We propose a standard normalization to compare productivity instead of absolute volume of accomplishments, strategies to aggregate production across different career trajectories, and an analysis of predictiveness to address recency.

Given that there is no consensus on how many years should be used to assess researcher quality, we propose to use predictiveness of data within a target process context as a proxy to how influential it should be. We showed an illustrative example where data did not reveal variations in its level of predictiveness.

This work is very preliminary. The next step is to study different datasets to determine how to assess predictiveness and how to compute a measure of recency for when data reveals consistent trends.

The approach proposed in this paper aims to enhance the case-based researcher quality assessment proposed in [ 5 ] by adding weights within a target time interval when results of the recency assessment determine that more recent data is more or less predictive of the future and therefore should be considered more relevant.

Given that normalization strategies interfere with classification accuracy, we need to experiment with various purpose scenarios and normalization strategies to assess which have both high accuracy and acceptable substantiation. Along these lines, we will investigate DTW particularly when comparing career trajectories of different lengths.

This paper does not detail how the characterization of a purpose may be captured, which can be through examples, conditions, or a combination of these. We also limit the presentation to binary classification and do not discuss how to produce a ranking of the applicants. These are both topics for future work.

Acknowledgements

The authors thank the STELA Institute, particularly Rudger Taxweiler for his help collecting data. The first author is supported by Brazil’s Goiás Research Foundation (FAPEG) and the University of the State of Goiás (UEG) under agreement number 201310267000099. The authors also thank the reviewers for their suggestions.

References

1. Lane, J.: Let's make science metrics more scientific. Nature 464, 488-489 (2010)

2. Katz, J.S.: Scale-Independent Measures: Theory and Practice. In: 17th International Conference on Science and Technology Indicators. Montreal, Canada, 1-19 (2012)

3. Van Noorden, R.: Metrics: A profusion of measures. Nature 465(7300), 864-866 (2010)

4. Hicks, D., Wouters, P., Waltman, L., De Rijcke, S., Rafols, I.: Bibliometrics: The Leiden Manifesto for research metrics. Nature 520, 429-431 (2015)

5. Duarte, K., Weber, R., Pacheco, R.C.S.: Purpose-oriented metrics to assess researcher quality. In: 21st International Conference on Science and Technology Indicators (STI2016): Peripheries, frontiers and beyond. València, Spain, 1312-1314 (2016)

6. Pacheco, R.C.S., Kern, V.M., Salm, J.F., Packer, A.L., Murasaki, R., Amaral, L., Santos, L.D., Cabezas B., A.R.: Toward CERIF-ScienTI cooperation and interoperability. In: A.G.S. Asserson, E.J. Simons (Eds.) 8th International Conference on Current Research Information Systems. Leuven: Leuven University Press, 179-188 (2006)

7. Singh, B.K., Verma, K., Thoke, A.S.: Investigations on Impact of Feature Normalization Techniques on Classifier's Performance in Breast Tumor Classification. International Journal of Computer Applications 116(19), 11-15 (2015)

8. Jayalakshmi, T., Santhakumaran, A.: Statistical Normalization and Back Propagation for Classification. International Journal of Computer Theory and Engineering 3(1), 1-5 (2011)

9. Golestani, A., Gras, R.: Can we predict the unpredictable? Scientific Reports 4, 6834 (2014)

10. Kawamura, T., Yamashita, Y., Matsumura, K.: Research Activity Classification based on Time Series Bibliometrics. In: 21st International Conference on Science and Technology Indicators (STI2016): Peripheries, frontiers and beyond. València, Spain, 1456-1460 (2016)

11. Schlaefer, A., Schröter, K., Fritsche, L.: A case-based approach for the classification of medical time series. In: International Symposium on Medical Data Analysis. Springer Berlin Heidelberg, 258-263 (2001)

12. Myers, Rabiner: A comparative study of several dynamic time-warping algorithms for connected word recognition. The Bell System Technical Journal 60(7), 1389-1409 (1981)

13. Al-Naymat, G., Chawla, S., Taheri, J.: SparseDTW: a novel approach to speed up dynamic time warping. In: Proceedings of the Eighth Australasian Data Mining Conference. Australian Computer Society, Inc., 101, 117-127 (2009)

14. Juran, J.M., Godfrey, A.B.: Juran's Quality Handbook. 5th Edition. New York: McGraw-Hill (1999)

15. Duarte, K., Weber, R., Pacheco, R.C.S.: Conceptual data model for research collaborators. In: CIKI: VI International Conference on Knowledge and Innovation (2016)

16. Valenduc, Vendramin, Pedaci, Piersanti M.: Changing careers and trajectories. How individuals cope with organisational change and restructuring, WORKS report, HIVA-K.U. Leuven, Leuven (2009)

17. Dietz, J.S., Chompalov, I., Bozeman, B., Lane, E.O.N., Park, J.: Using the curriculum vita to study the career paths of scientists and engineers: An exploratory assessment. Scientometrics 49(3), 419-442 (2000)

18. Lee, S., Bozeman, B.: The impact of research collaboration on scientific productivity. Social Studies of Science 35(5), 673-702 (2005)

19. Unger, D.D., Rumrill, P.D.: An Assessment of Publication Productivity in Career Development and Transition for Exceptional Individuals 1978-2012. Career Development and Transition for Exceptional Individuals 36(1), 25-30 (2013)

20. Stuart, D.: Metrics for an increasingly complicated information ecosystem. Online Information Review 39(6), 848-854 (2015)

21. Stone, M.: Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. Series B (Methodological), 111-147 (1974)