
HealthRecSys’18, October 6, 2018, Vancouver, BC, Canada

Engagement Scoring for Care-gap Intervention Optimization

MohamadAli Torkamani

Malhar Jhaveri

Jynelle Mellen

Michael Brown-Hayes

James Chung

Penny Pan

Hakan Kardes


Keywords: Recommendation Impact, Care-gap Closure, Right Time Intervention


Timely preventative health screenings can be crucial for the early detection of serious disease or complications from chronic conditions, such as diabetes. Influencing the right people to obtain recommended screenings at the right time can result in significantly improved health outcomes. A recommended screening that has not been obtained is called a care-gap, and a completed screening is called a closed care-gap. The spectrum of individuals managing their own health care ranges from minimal engagement (no screenings in this case) to high engagement (timely screenings). Among those who do not obtain timely screenings, we have identified two types: individuals who will respond to outreach and those who will not. Therefore, our focus becomes identifying the right people (those who need to close a gap and are likely to respond) at the right time. This approach will maximize the effectiveness and impact of outreach. Our recommendation model generates a ranking in which the individuals who are most likely to close their care-gaps after an intervention are ranked first. Our method successfully detects patients who need a prompt, and our experimental results show that using this recommendation model can increase the number of closed gaps.

CCS CONCEPTS

• Applied computing → Consumer health; Health care information systems; • Information systems → Recommender systems;

1 GAPS IN CARE

Based on condition, age, and sex, the US Preventive Services Task Force (USPSTF) and the Centers for Disease Control and Prevention (CDC) publish guidelines for how Americans can best manage their health by performing needed preventive screenings. Care-gaps are the result of obstacles preventing patients and physicians from implementing care recommendations. Some barriers include misunderstanding of guidelines, lack of awareness, lack of proper transportation to clinics and hospitals, and fear of procedures such as colonoscopy.

HealthRecSys’18, October 6, 2018, Vancouver, BC, Canada. © 2018 Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

For breast cancer screenings, the USPSTF recommends that women 50-74 years of age receive one mammogram every 27 months. For colorectal cancer screenings, the USPSTF recommends that individuals 45-75 years of age receive either one fecal occult blood test (FOBT) every year, one flexible sigmoidoscopy every five years, or one colonoscopy every ten years [2, 6].

There are also specific guidelines for people with diabetes to help manage their care. The CDC encourages individuals with diabetes to receive, each year, at least one hemoglobin A1c (HbA1c) test to understand their average blood glucose levels, at least one dilated eye exam for early detection of retinal changes, and at least one nephropathy test to check kidney function [5].

The National Committee for Quality Assurance (NCQA) sets guidelines and evaluates the performance of every health plan based on those guidelines [1]. The Healthcare Effectiveness Data and Information Set (HEDIS) measures the performance of health plans based in part on care-gap closure rates for Breast Cancer Screening (BCS), Colorectal Cancer Screening (COL), and Comprehensive Diabetes Care (DIAB).

To improve HEDIS performance, health plans employ clinical staff to develop intervention plans for contacting patients with outreach intended to encourage them to close their care-gaps. Supporting this intervention are algorithms designed to identify members who should have received at least one of these screenings but have not done so.

The number of people covered by a health plan who are also eligible for the above-mentioned screenings can be quite large, and comprehensive outreach campaigns could take several months to complete. During such a campaign, everyone who is eligible for screening will receive a telephone call (except those who have opted out via the “Do Not Call” list). If several outreach attempts fail to reach a person, then a letter outlining the relevant information is mailed to their address.

In general, some portion of the population is not engaged in their own health. They may be resistant to interventions and may ignore phone calls and letters. This sub-population is less likely to close their care-gaps even if they are contacted several times via various channels. Some people might request that their names be added to the Do Not Call list. On the other hand, a complementary portion of the population is highly engaged in their health. They do not require interventions at all and will close their care-gaps on their own. Finally, there is a group of people who will likely close their care-gaps, but only after being contacted. One of the challenges of the healthcare system is to prioritize resource allocation to the patients who are at higher risk and are more likely to be impacted positively.

To address this problem, we have designed a machine learning recommender system that, for each person, ranks the more impactable care-gaps (i.e., the ones that are more likely to be closed) higher. Some screenings, such as colonoscopy and the diabetic eye exam, are harder to close. Our model can be used to frame a personalized message for each member based on their engagement score and the degree of difficulty in closing their open gaps. If a person is likely to close at least one care-gap, the system recommends this person to the care advocates for intervention.

2 MANIFEST SPACE

To create a reliable feature vector, we have used both direct feature extraction from the data and a collaborative filtering method for data imputation. We refer to the data observed in the claims record as manifests.

2.1 Data

We have used four manifest categories of data sources to create our features: pharmaceutical claims, specialties of the providers that patients have been visiting, diagnoses made, and the final services performed.

The data used for creating the model is derived from claims data with approximately 30 million rows covering over 3 million individuals, and contains information such as gender (male, female, and unknown) and dates of birth and death (if applicable). Based on a patient's age, we categorized people into twelve age groups (≤1, 1-4, 5-12, 13-17, 18-25, 26-35, 36-45, 46-55, 56-65, 66-75, 76-85, and ≥86). The data is aggregated into industry-standard quarters from 2012 to 2017.
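As an illustration of the age bucketing described above, the following minimal sketch maps an integer age to one of the twelve groups. The names `age_group`, `UPPER_BOUNDS`, and `LABELS` are hypothetical, and the boundary handling is an assumption: the listed groups ≤1 and 1-4 overlap at age 1, and this sketch assigns age 1 to the first group.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first eleven groups (assumption: age 1 -> "<=1").
UPPER_BOUNDS = [1, 4, 12, 17, 25, 35, 45, 55, 65, 75, 85]
LABELS = ["<=1", "1-4", "5-12", "13-17", "18-25", "26-35",
          "36-45", "46-55", "56-65", "66-75", "76-85", ">=86"]


def age_group(age):
    """Return the label of the age group containing an integer age in years."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_left finds the first bound that is >= age; ages above 85
    # fall past the last bound and land in the final ">=86" group.
    return LABELS[bisect_left(UPPER_BOUNDS, age)]
```

A lookup table with binary search keeps the group boundaries in one place, so a different grouping only requires editing the two lists.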

The features include everything that a member could claim from a payer (medical rehab, surgeries, treatment for health disorders, prescriptions, etc.) within the applicable period. In addition, each person might be present in the data set in multiple year/quarters.

To create an extended feature vector, we constructed a bag-of-words representation for the presence of every possible value that a manifest could have. For example, we used several therapeutic groupers for pharmaceutical data, and we used the number of times that the patient had a prescription for a specific medication in one calendar quarter as the corresponding feature for that drug. We used the same count-based representation for all other features.
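The count-based representation described above can be sketched as follows. This is only an illustration: the claim tuple layout, the function names, and the example tokens (`drug:metformin`, `dx:E11.9`) are hypothetical and are not taken from the source data.

```python
from collections import Counter
from datetime import date


def quarter_of(d):
    """Return (year, calendar quarter 1-4) for a date."""
    return d.year, (d.month - 1) // 3 + 1


def count_features(claims):
    """Build one bag-of-words Counter per (patient, year, quarter).

    claims: iterable of (patient_id, service_date, token) tuples, where a
    token stands for any manifest value (drug group, diagnosis, and so on).
    """
    features = {}
    for patient_id, service_date, token in claims:
        year, quarter = quarter_of(service_date)
        key = (patient_id, year, quarter)
        features.setdefault(key, Counter())[token] += 1
    return features


# Hypothetical claims for a single patient.
claims = [
    ("p1", date(2016, 1, 15), "drug:metformin"),
    ("p1", date(2016, 3, 2), "drug:metformin"),
    ("p1", date(2016, 4, 9), "dx:E11.9"),
]
features = count_features(claims)
```

Each key corresponds to one observation (patient, year, quarter), which matches the fact that a person may appear in several year/quarters.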

The feature vector was also augmented with patients' demographics in the calendar quarter of interest. This included their age, gender, and several features from United States census data based on their home neighborhood.

We hand-crafted some features that we expect to indicate health engagement. In particular, we constructed several features for medication adherence based on how timely patients are in refilling their recommended prescriptions.

2.2 Smoothing and Missing Value Imputation

A problem with claims data is that missing values do not necessarily mean that the patients have not had a manifest. Besides noise and human error, missing values could be caused by the complicated structure of the healthcare system in the United States. For example, a value not being present in a patient’s manifest could be due to multiple coverages or lack of eligibility for specific periods of time. To deal with this problem, we assume that similar patients require similar types of care. Also, many of the features co-occur. For example, many of the diagnoses are comorbidities that patients have at the same time, and certain drugs are always prescribed for specific ailments. As a result, the table of our features for all the patients at all times should form a low-rank matrix. We use a low-rank matrix completion approach both for filling in the missing values and for smoothing the features and removing the noise [4].
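To make the low-rank idea concrete, here is a deliberately simplified sketch that fits a rank-one model to the observed entries by alternating least squares and uses the fitted values as the completed, smoothed matrix. It is a stand-in for illustration only: it fixes the rank to one, has no robust outlier term, and is not the convex program used in this paper. The function name `rank1_complete` and the toy matrix are hypothetical.

```python
def rank1_complete(X, n_iters=200):
    """Fit X[i][j] ~ u[i] * v[j] on observed entries (None marks missing).

    Returns the full fitted matrix, which both fills missing entries and
    smooths observed ones.
    """
    n_rows, n_cols = len(X), len(X[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iters):
        # Least-squares update of each row factor over that row's observed cells.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den
        # Least-squares update of each column factor over that column's observed cells.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


# Toy rank-one matrix (outer product of [1, 2, 3] with itself), two cells missing.
X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, None],
     [3.0, None, 9.0]]
M = rank1_complete(X)
```

On this toy input the two missing cells are recovered from the co-occurrence pattern of the observed ones, which is the same intuition that motivates the low-rank assumption for claims features.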

Our approach is similar to the Robust PCA method by Candès et al. [3]. Let $X_{ij}$ be the observed value for feature $j$ for the $i$-th observation (i.e., patient, year, and quarter). We learn a low-dimensional approximation $M$ of the full matrix $X$ by solving the following nuclear norm minimization convex program:

$$
\begin{aligned}
\underset{M}{\text{minimize}} \quad & \|M\|_* + \lambda \sum_{ij} |X_{ij} - M_{ij}| \\
\text{subject to} \quad & \sum_{ij} (X_{ij} - M_{ij})^2 \le \epsilon
\end{aligned}
$$

Here $\|M\|_*$ is the nuclear norm of the smoothed data, which encourages $M$ to be low-rank. The term $\sum_{ij} |X_{ij} - M_{ij}|$ is the $\ell_1$ norm of the distance between the observations and their approximations; it allows the existence of some outlier values while pushing the matrix $M$ to be as close as possible to the observed values of the matrix $X$. From a statistical point of view, this term assumes that there is a sparse set of outliers that can be modeled using a Laplacian distribution. The constraint $\sum_{ij} (X_{ij} - M_{ij})^2 \le \epsilon$ encourages the closeness of $M$ and $X$ in general, and specifically limits the effects of Gaussian additive noise, such as variations in the number of prescriptions of the same medication written for similar people by different doctors. The hyperparameters $\lambda$ and $\epsilon$ are tuned by cross-validation on held-out samples, after the model training explained in the next section.

3 ENGAGEMENT MODEL

We designed an ensemble model that predicts the likelihood of closing care-gaps after a phone call. For the people who are more likely to close their care-gap after a call, we recommend a higher priority for outreach, because the data and model show that we can impact them.

3.1 Predictive modeling of care-gap closure using experimental data

Members with open care-gaps can be grouped into three groups (Figure 1):

(1) Members who will close their care-gaps by themselves.
(2) Those who will respond to an outreach by closing their care-gaps.
(3) Those who are not engaged with their healthcare and will not close their care-gaps, even after outreach.

With the experimental dataset described in the protocol below, we are able to estimate the following at an individual-member level:

(a) Probability of closing care-gaps without outreach.
(b) The increased likelihood of closing care-gaps after outreach, estimated using (1) and (2). In other words, we are able to measure the value added by contacting a person, and how the likelihood of care-gap closure increases accordingly.
(c) Probability of not closing care-gaps with outreach.
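The relationship between these three quantities can be sketched as below, assuming that two member-level probabilities are available: closure without outreach and closure with outreach. The function and key names are hypothetical, and treating (b) as a simple difference of the two probabilities is an assumption of this sketch.

```python
def outreach_quantities(p_close_without, p_close_with):
    """Derive quantities (a), (b), and (c) from two closure probabilities."""
    for p in (p_close_without, p_close_with):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return {
        # (a) probability of closing the care-gap without outreach
        "close_without_outreach": p_close_without,
        # (b) increase in closure likelihood attributable to outreach
        "uplift_from_outreach": p_close_with - p_close_without,
        # (c) probability of not closing the care-gap even with outreach
        "no_close_with_outreach": 1.0 - p_close_with,
    }


# Hypothetical member: 20% chance unprompted, 50% chance after a call.
example = outreach_quantities(0.2, 0.5)
```

A member with a large value for (b) is the kind of person for whom a call is most valuable, while large values of (a) or (c) suggest that outreach changes little.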

3.2 Implementation of the Score

The input to our model consists of the smoothed and imputed feature vectors, as well as gold standard targets from interventions in previous years. After feature imputation, we use a hybrid ensemble method consisting of a random forest and a support vector regression model to compute the probability of being impacted by an intervention.
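The text does not specify how the two models' outputs are combined. The sketch below assumes a weighted average of the two scores, with the regression output clipped to [0, 1] because a regression model is not bounded like a probability. The function name, the `rf_weight` parameter, and the example scores are hypothetical; in practice the scores would come from the trained models.

```python
def blend_and_rank(member_ids, rf_scores, svr_scores, rf_weight=0.5):
    """Blend two per-member scores and return members ranked best first."""
    if not (len(member_ids) == len(rf_scores) == len(svr_scores)):
        raise ValueError("inputs must have the same length")
    blended = []
    for member, rf, svr in zip(member_ids, rf_scores, svr_scores):
        svr_clipped = min(1.0, max(0.0, svr))  # keep the regression output in [0, 1]
        score = rf_weight * rf + (1.0 - rf_weight) * svr_clipped
        blended.append((member, score))
    # Highest blended score first: these members get outreach priority.
    return sorted(blended, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for three members.
ranked = blend_and_rank(["a", "b", "c"], [0.2, 0.8, 0.5], [0.4, 1.3, 0.3])
```

The ranked list is what would be handed to outreach staff as a prioritized call list.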

The output scores from the model were used to prioritize the member list for closing care-gaps for Medicare and Commercial lines of business.

The experimental study was designed as a randomized controlled study of Medicare members with an open care-gap for at least one of the following measures: annual wellness visit, colorectal cancer screening, and breast cancer screening. Randomization was done at the member level using stratification by engagement model score, i.e., samples were randomly selected from each decile of the score distribution. The study design is shown in Figure 2. The interventions were performed for 10,045 unique members.
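Decile-stratified randomization of this kind can be sketched as follows. This is an illustration under stated assumptions, not the study's actual protocol: the function name, the `treat_fraction` split, and the fixed seed are hypothetical, and the source does not state the ratio of intervention to control members.

```python
import random


def stratified_assignment(scores, treat_fraction=0.5, seed=0):
    """Randomly assign members to arms within each score decile.

    scores: dict mapping member_id -> engagement score.
    Returns a dict mapping member_id -> "intervention" or "control".
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # member ids, lowest score first
    n = len(ranked)
    assignment = {}
    for decile in range(10):
        # Slice out one tenth of the ranked members as a stratum.
        stratum = ranked[decile * n // 10:(decile + 1) * n // 10]
        rng.shuffle(stratum)
        n_treat = round(len(stratum) * treat_fraction)
        for position, member in enumerate(stratum):
            assignment[member] = "intervention" if position < n_treat else "control"
    return assignment


# Hypothetical population of 100 members with distinct scores.
scores = {f"m{i}": i / 100 for i in range(100)}
assignment = stratified_assignment(scores)
```

Stratifying by decile ensures that both arms contain members from the full range of engagement scores, so differences in closure rates are not driven by one arm having more highly engaged members.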

4 EXPERIMENTAL RESULTS

While the model can identify the likelihood of closing care-gaps, we are unable to calculate the likelihood of impacting care-gap closures with outreach directly from data. To jointly measure the performance of our model and the effectiveness of interventions, an experimental study was designed.

Out of 10,045 members, 9,768 members could be contacted. The distribution of measures for outreached members is shown in Figure 4 (highest for colorectal cancer screening, followed by wellness and breast cancer screenings).

Table 1 presents the effectiveness of the engagement model. Here, effectiveness is measured by comparing the average engagement score of members who closed their gaps with that of members who did not, for the respective eligible measures. For all three measures, members who closed the respective gaps had a significantly higher engagement score.
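The source does not name the statistical test behind this comparison. As one possible way to compare the two groups' mean scores, the sketch below computes Welch's t statistic, which does not assume equal variances. The function name and the example scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(closed_scores, open_scores):
    """Welch's t statistic for the difference in mean engagement score."""
    m_closed, m_open = mean(closed_scores), mean(open_scores)
    # Sample variances (n - 1 denominator) for each group.
    v_closed, v_open = variance(closed_scores), variance(open_scores)
    standard_error = sqrt(v_closed / len(closed_scores) + v_open / len(open_scores))
    return (m_closed - m_open) / standard_error


# Hypothetical engagement scores for members who closed or did not close a gap.
t_stat = welch_t([0.6, 0.7, 0.8], [0.2, 0.3, 0.4])
```

A large positive statistic indicates that members who closed their gaps had higher scores on average than a chance difference would explain; a p-value would additionally require the Welch degrees of freedom.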

In Table 1, we show the effectiveness of the model within the whole population, i.e., both the control and intervention groups combined. To study the performance of the model itself, we should also investigate how the members of the control group behaved regarding their open care-gaps in the period following the intervention campaign. To do so, we performed analyses similar to the process for Table 1, and we analyzed the intervention and control groups separately in Tables 2 and 4. As the results in Table 2 show, the model successfully identified the people who are engaged in their health and closed their care-gaps without being contacted during this campaign. Table 4 is consistent with Tables 1 and 2, and it shows that the model performed similarly well for the intervention group.

[Figure 4: Intervention outcome population. Intervention group (N=10,045): outreached (N=9,768), not outreached (N=277). Measures: Wellness Screening (N=7,796), BCS Screening (N=3,082), COL Screening (N=8,500).]

The distribution of the engagement score for members who closed the wellness gap versus those who did not is shown in Figure 5. The patients who closed the wellness measure had a significantly higher engagement score. Similar patterns were observed for the breast and colorectal cancer screening measures (Figure 5). This verifies that our recommender system successfully selected the impactable people for outreach.

Table 4 shows the effectiveness of interventions in closing gaps for the three measures. Lift is the difference in gap-closure percentage between the intervention and control groups. There is a statistically significant positive lift for the wellness and colorectal measures (absolute differences of +4.3% and +2.8%, and relative lifts of +20.97% and +22.22%, respectively). Lift for the breast cancer screening measure is negative but not statistically significant. The negative lift is partly because the measure is gender-specific while, during the randomization process, members in the control group were selected at the member level and not at the measure level.
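The arithmetic behind these lift figures can be sketched as below. The source defines lift as the difference in closure percentage; the assumption here is that relative lift is that difference divided by the control group's closure rate. The function names are hypothetical, and the closure rates in the example are back-solved from the reported figures under that assumption, not taken from the table.

```python
def lift(rate_intervention, rate_control):
    """Return (absolute, relative) lift of intervention over control."""
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    absolute = rate_intervention - rate_control
    relative = absolute / rate_control
    return absolute, relative


def implied_control_rate(absolute, relative):
    """Back-solve the control closure rate from the two reported lifts."""
    return absolute / relative


# Reported wellness figures: +4.3 points absolute, +20.97% relative.
wellness_control = implied_control_rate(0.043, 0.2097)
# Reported colorectal figures: +2.8 points absolute, +22.22% relative.
col_control = implied_control_rate(0.028, 0.2222)
# Round trip with rates rounded to three decimals.
wellness_abs, wellness_rel = lift(0.248, 0.205)
```

Under this assumption, the reported figures imply control closure rates of roughly 20.5% for the wellness measure and 12.6% for the colorectal measure.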

[Table 4: Effectiveness of interventions by Medicare measure (Wellness, COL, BCS). Columns: N contacted, N closed, closed (%), N control, N closed, closed (%). Recoverable values: 4,967; 5,369; 2,177; 3,854; 4,256; 1,490.]

6 CONCLUSION

Care-gap closure is not only financially important for the healthcare system, but it also directly helps patients’ well-being by identifying conditions at early stages.

Our proposed recommendation system generates an ordered prioritization of the patients who are more likely to be impacted by phone interventions. The results show that, in practice, people with high engagement scores are more likely to close their care-gaps after outreach. Our recommender system can prioritize the outreach and can differentiate who is likely to close their gaps after an intervention.

This model can be extended in several directions. For example, if we can afford to contact many people, we can use the system to personalize messages during the outreach. People who are not very engaged in their healthcare might be motivated by more incentives and encouragement, whereas highly engaged people might be willing to receive more information about other health-related activities.

We can also use this system to first contact people who need more time to close their care-gaps, and afterward reach out to people who will respond faster.

It is easy to add other measures to this system. The model is built on healthcare industry standards (e.g., ICD-10, CPT codes, therapeutic classes). Therefore, it can be used by a broader population. The engagement score can also be used as a proxy for general healthcare engagement for marketing applications.

REFERENCES

[1] Nancy Beaulieu and Arnold M Epstein. 2002. National Committee on Quality Assurance health-plan accreditation: predictors, correlates of performance, and market impact. Medical Care (2002), 325-337.

[2] Ned Calonge, Diana B Petitti, Thomas G DeWitt, Allen J Dietrich, Kimberly D Gregory, Russell Harris, George Isham, Michael L LeFevre, Roseanne M Leipzig, and Carol Loveland-Cherry. 2008. Screening for colorectal cancer: US Preventive Services Task Force recommendation statement. Annals of Internal Medicine 149, 9 (2008), 627-637.

[3] Emmanuel J Candès, Xiaodong Li, Yi Ma, and John Wright. 2011. Robust principal component analysis? Journal of the ACM (JACM) 58, 3 (2011), 11.

[4] Emmanuel Candès and Benjamin Recht. 2009. Exact matrix completion via convex optimization. Foundations of Computational Mathematics 9, 6 (2009), 717.

[5] Sarah Stark Casagrande, Catherine C Cowie, and Judith E Fradkin. 2013. Utility of the US Preventive Services Task Force criteria for diabetes screening. American Journal of Preventive Medicine 45, 2 (2013), 167-174.

[6] Laura Davisson. 2016. USPSTF breast cancer screening guidelines. West Virginia Medical Journal 112, 6 (2016), 29-32.