=Paper= {{Paper |id=Vol-2849/paper-13 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-2849/paper-13.pdf |volume=Vol-2849 |dblpUrl=https://dblp.org/rec/conf/swat4ls/AdiseshBMB19 }} ==None== https://ceur-ws.org/Vol-2849/paper-13.pdf
                    A Working Semantic Model for the Integration
                        of Occupation, Function and Health

                                        Anil Adisesh1, Hongchang Bao2,
                             Mohammad Sadnan Al Manir2,3 and Christopher J.O. Baker2,3

                                 1 Department of Medicine, Division of Occupational Medicine
                                   University of Toronto, Canada
                                   Anil.Adisesh[at]unityhealth.to
                                 2 Department of Computer Science
                                   University of New Brunswick, Saint John, Canada
                                 3 IPSNP Computing Inc
                                   {bakerc,hbao,sadnan.almanir}[at]unb.ca



                           Abstract. Occupation is an explanatory variable in health research that
                           is used to identify the degree to which exposures to environmental hazards
                           and working conditions are correlated with disease. Moreover, disease
                           and functional impairment can limit the employment options open to
                           patients. Despite the importance of these issues, many essential data sets
                           have yet to be integrated. In the current study we defined an integrated
                           semantic model and populated it with coded patient data representing disease
                           (ICD), functional impairment (ICF), occupation (NOC), and job attributes
                           (NOC Career Handbook). Patient responses to “What is your job?” were
                           coded to NOC automatically by a custom algorithm developed in previous work.
                           To validate the utility of the model, SPARQL queries and outputs were
                           prepared and discussed in the context of authentic physician and case
                           worker activities.


                   1     Introduction

                   Occupation is a widely used determinant in health research representing socioeconomic
                   status and class, as well as environmental exposures [1]. Despite this,
                   many data sets collected at point of care fail to record patients’ occupations,
                   limiting the reuse of pertinent data for applications relevant to occupational
                   associations of disease and patient outcomes.
                       Research studies in the area of Occupational Health (OH) are typically
                   targeted to specific lines of inquiry such as examining the burden of cancer
                   attributable to occupation [2,3]. Given the challenges in accurate and timely
                   recording of occupations in a standardized way, several studies have sought to
                   facilitate automated coding of occupations using standardised classifications of
                   occupations [4,5]. Although these approaches accelerate the recording of jobs, with up
                   to 70% accuracy, subsequent analyses using such coded data sets remain limited.
                   This is in part due to the need to define core objectives and integrate complex
                   data sets.




Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    Applications of occupation coding in OH practice include correlating: i) dis-
ease and occupation, and ii) functional impairment and fitness for work. For
instance, the occupations associated with silicosis are a major concern. Occupational
groups exposed to silica4 include construction labourers, heavy equipment
operators, plasterers and drywallers carrying out grinding, sandblasting, crushing,
chipping, mixing, and plowing, tasks that are common in many industries such
as mining, agriculture, and manufacturing [1,6]. Researchers are interested
in identifying a specific disease or chronic condition associated with a
type of employment. This type of investigation is an essential step towards the
evolution of health policy and decisions about workplace conditions, albeit chal-
lenging because of the distributed data sources and a lack of standardization of
the source data.
    In this work, we have sought to design an integrated model to support investigations
of both of these themes. The model leverages standardised classifications of disease
and functional impairment, patient data linked to the National Occupational
Classification (NOC) by an NLP-based grounding algorithm, and job attributes. With
a populated model we assess the suitability of the model for the target studies.


2     Model

In medicine, patients are primarily assessed with the goal of disease diagnosis
and treatment, whereas functional impairment assessments for chronic diseases
or injuries occurring at home, in the workplace, or during sports are a secondary
consideration for practitioners. However, assessments of function are essential when
determining cases of workplace compensation or health insurance claims, as well as
for rehabilitation guidance.
    Here we describe the kinds of entities relevant to OH that exist in the ap-
plication environment, along with classifications and groupings of entities, and
core relations between them. Figure 1 shows an integrated model consisting of
two graph models.
    In consultation with domain experts we identified the core concepts and data
used in OH, specific to the occupational dimensions of disease onset and assess-
ment of functional impairment. The functional descriptors capture the essential
physical abilities and aptitudes required to perform effectively in a given occupation.
The smaller model comprises the concepts Patient, Disease, FunctionalImpairment,
NOCCode, and NOCTitle, while the larger model comprises JobAttribute,
PhysicalActivity, Aptitude, CHNOCCode, and CHNOCTitle. Six further concepts
(Vision, Hearing, LimbCoordination, ColorDiscrimination, BodyPosition, and
Strength) are subclasses of PhysicalActivity, and nine other concepts are
subclasses of Aptitude. Instances of each model can be integrated via
the alignment of the concepts NOCTitle and CHNOCTitle.
    In the model, each instance of a Patient, represented by an identifier, is
diagnosedWith an instance of a Disease, which is represented by an ICD-10 code.
Instances of FunctionalImpairment are represented by ICF codes and are
caused by one or more instances of Disease (the causes relation). The job a patient
qualifiesFor is represented by an instance of NOCCode, and its corresponding
title is an instance of NOCTitle, linked by the hasTitle property. Instances
of physical activities (PhysicalActivity) and aptitudes (Aptitude) are partOf
a job attribute (JobAttribute), which is requiredBy each job instance of CHNOCCode
from the Career Handbook. The title of such a job is an instance of CHNOCTitle,
also linked by the hasTitle property.

4 https://www.carexcanada.ca/profile/silica_crystalline-occupational-exposures/

    Subsequent sections describe the data represented and how they were derived.
The model is designed to support multiple competency queries, detailed in
Section 5.




         Fig. 1. Integrated data model for occupation, function and health
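
The relations of the patient-centric part of Fig. 1 can be sketched as a small in-memory set of triples. This is only an illustration, not the authors' implementation: the prefixed identifiers (patid:1001, noc:7511, noctitle:7511) are hypothetical names, and the values are taken from the first row of Table 3. The helper walks diagnosedWith, qualifiesFor and hasTitle in the same order as a Q1-style competency query.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative triples only: relation names follow Fig. 1, values follow Table 3,
# and the prefixed identifiers are hypothetical.
TRIPLES: List[Triple] = [
    ("patid:1001", "rdf:type", ":Patient"),
    ("patid:1001", ":diagnosedWith", "icd10:E10_3"),
    ("patid:1001", ":qualifiesFor", "noc:7511"),
    ("icd10:E10_3", ":causes", "icf:b2101"),
    ("noc:7511", ":hasTitle", "noctitle:7511"),
    ("noctitle:7511", "rdfs:label", "Truck driver"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in TRIPLES:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def job_titles_for_disease(disease: str) -> List[str]:
    """Follow diagnosedWith -> qualifiesFor -> hasTitle -> rdfs:label."""
    titles = []
    for patient, _, _ in match(p=":diagnosedWith", o=disease):
        for _, _, noc in match(s=patient, p=":qualifiesFor"):
            for _, _, title_node in match(s=noc, p=":hasTitle"):
                for _, _, label in match(s=title_node, p="rdfs:label"):
                    titles.append(label)
    return titles


print(job_titles_for_disease("icd10:E10_3"))
```

A real deployment would hold these triples in an RDF store and use SPARQL, as in Section 5; the nested loops here only make the join order explicit.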



3   Description of Data

Patient Data Patient data was gathered as part of the Canadian Immunisa-
tion Research Network Community Acquired Pneumonia study [7] to investigate
occupational associations. Pre-existing example data for patients with Diabetes
mellitus was also used in this study.


Occupation Data For each patient the data set contained the fields “Current
Job Title”, and “Current Industry”. This data originates from free text entered
in response to the questions “What is your job title?” and “In which type of
industry do you work?”. The dataset of 566 patients included manually added
coding to NOC 2016 and NAICS (North American Industry Classification System).

Canadian National Occupational Classification The Canadian National
Occupational Classification (NOC) is the national reference on occupations in
Canada providing a standard taxonomy for labour market information and
employment-related program administration. NOC-2016 [6] is organized in a
four-level hierarchy: 10 broad occupational categories (first level), 46
major groups (second level), 140 minor groups (third level), and 500 unit groups
(fourth level), encoding more than 30,000 occupational titles. For example,
the occupation of cook is classified as follows: First Level: 6 Sales and service
occupations; Second Level: Major Group 63 - Service supervisors and specialized
service occupations; Third Level: 632 Chefs and cooks; Fourth Level: 6322
Cooks. Cooks are employed in restaurants, hotels, hospitals and other health
care institutions, central food commissaries, and educational institutions.
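
Because each level of the NOC 2016 hierarchy extends the code of its parent by one digit, the ancestors of a unit group can be read directly from its four-digit code. The following minimal sketch shows this for the cook example above; the function name and the dictionary keys are illustrative choices, not part of the NOC itself.

```python
def noc_levels(unit_group: str) -> dict:
    """Split a 4-digit NOC 2016 unit group code into its four hierarchy levels."""
    if len(unit_group) != 4 or not (unit_group.isascii() and unit_group.isdigit()):
        raise ValueError("expected a 4-digit NOC unit group code, got %r" % unit_group)
    return {
        "broad_category": unit_group[:1],  # first level, e.g. 6
        "major_group": unit_group[:2],     # second level, e.g. 63
        "minor_group": unit_group[:3],     # third level, e.g. 632
        "unit_group": unit_group,          # fourth level, e.g. 6322
    }


print(noc_levels("6322"))
```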

Career Handbook Data The Career Handbook is the counselling component
of the National Occupational Classification (NOC) [6] system. The handbook de-
tails worker characteristics and other occupation indicators and is used to help
people make informed career decisions. It includes information for each occupation
on the required aptitudes, physical activities, environmental conditions,
education/training, career progression and work settings.
    Aptitudes5 required for a person to learn the skills needed to perform job du-
ties are defined numerically on a scale from 0 to 5. These include general learning
ability, clerical perception, verbal ability, motor co-ordination, numerical ability,
finger dexterity, spatial perception, manual dexterity and form perception.
    Physical abilities6 include vision, colour discrimination, hearing, body position,
limb coordination and strength. For visual performance at work
there are four categories, each with an example: V1 - Close visual acuity (assembling
micro-circuit boards), V2 - Near vision (reading and interpreting drawings and
specifications), V3 - Near and far vision (installing shingles/tiles on roofs), and V4
- Total visual field (driving vehicles).

Disease The International Classification of Diseases (ICD) is maintained by the
World Health Organisation. It supports the identification of health trends and
statistics globally, and is the international standard for reporting diseases and
health conditions. The currently used version in many jurisdictions is ICD-10
although some continue to use ICD-9.
    A version of ICD-11 was released7 on 18 June 2018 to allow Member States
to prepare for implementation, including translating ICD into their national languages.
The ICD-11 for Mortality and Morbidity Statistics8 (Version: 04/2019)
allows visualisation of the coding hierarchy, e.g. for code 5A10 Type 1 diabetes
mellitus, the ancestors to the top are 05 Endocrine, nutritional or metabolic
diseases, Endocrine diseases, and Diabetes mellitus, in order.

5 http://noc.esdc.gc.ca/English/CH/AptitudesEnglish.aspx?ver=16&ch=03
6 http://noc.esdc.gc.ca/English/CH/PhysicalActivities.aspx?ver=16&ch=03
7 https://www.who.int/classifications/icd/en/
8 https://icd.who.int/browse11/l-m/en



Functional Impairment For functional impairment we used the International
Classification of Functioning, Disability and Health (ICF), which provides a comprehensive
and universally accepted framework to describe functioning, disability
and health. Specialised clinical use requires both Comprehensive and Brief
ICF Core Sets; for example, for diabetes mellitus there are 99 ICF categories in the
Comprehensive ICF Core Set and 33 second-level ICF categories in the Brief ICF Core Set. A
sample of the Comprehensive ICF Core Set for Diabetes Mellitus for the component
‘body functions’ is shown in Table 1:


                 ICF Code    ICF Code
                 2nd Level   3rd Level   ICF Category Title
                 b455                    Exercise tolerance functions
                             b4550       General physical endurance
                             b4551       Aerobic capacity
                             b4552       Fatiguability

Table 1. Sample of the comprehensive ICF core set for Diabetes Mellitus Categories
of the component ‘body functions’




4   NOC Data, Coding and Population of Semantic Model

Data imported to the model was acquired from original sources as described in
Section 3. Essential to the model is the grounding of free-text patient data to the
NOC. This is achieved using a coding algorithm [8] that iteratively
performs lookups in the NOC database until all the given job titles are matched
with one or more NOC codes. At each iteration, free-text inputs are pre-processed
by splitting job titles, removing stop words and stemming, followed by spelling
corrections or grammar checks. The accuracy of the algorithm for grounding to
4-digit NOC codes is 58.66 percent, based on benchmarking against manually coded
data from previous studies. Instances of NOCCode and NOCTitle are populated
to the model based on this algorithm.
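
The pre-processing steps named above (splitting job titles, removing stop words, stemming, then a lookup) can be illustrated with a deliberately simplified sketch. It is not the algorithm of [8]: the stop-word list, the plural-stripping stemmer and the three-entry lookup table are all hypothetical stand-ins, the codes are those appearing elsewhere in this paper, and the iterative retries, spelling correction and grammar checks are omitted.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "the"}

# Hypothetical lookup table keyed by normalised title; the codes are the ones
# used in this paper (6322 Cooks, 7241 Electrician, 7511 Truck driver).
NOC_INDEX = {
    "cook": "6322",
    "electrician": "7241",
    "truck driver": "7511",
}


def stem(token: str) -> str:
    """Very naive stemmer: strip a plural 's' (a real stemmer does far more)."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalise(fragment: str) -> str:
    """Lower-case, tokenise, drop stop words and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", fragment.lower())
    return " ".join(stem(t) for t in tokens if t not in STOP_WORDS)


def code_job_titles(free_text: str) -> dict:
    """Split a free-text answer into job titles and look each one up.

    Returns a mapping from normalised title to NOC code, or None when no
    match is found in the (hypothetical) index.
    """
    result = {}
    for fragment in re.split(r"[/,;&]", free_text):
        key = normalise(fragment)
        if key:
            result[key] = NOC_INDEX.get(key)
    return result


print(code_job_titles("Truck drivers / the Cook"))
```

Titles that remain unmatched (value None) are the cases that the iterative steps of the published algorithm would go on to handle.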
    Data from 500 Pneumonia cases was used to populate instances of Patient
and Disease. Authentic ICF codes were populated as instances of functional
impairments caused by the corresponding diseases. The Career Handbook data
contributes instances of PhysicalActivity and Aptitude which are partOf one
or more instances of JobAttribute and are among the sets of requirements to
qualify for a job.


5     Competency Queries

In this section we focus on illustrating the suitability of the model for occupation,
function and health using competency queries and offer further insights into the
needs of the target community. The following questions are illustrative of the
types of queries that may be of interest to a target user.
    Q1. What is the job classification for patients with disease X ?
    A 2019 report illustrates this type of question. A Colorado physician specializing
in occupational lung disease observed an increasing number of silicosis
cases in her practice. She undertook a review of electronic medical records for
a one-year period of patients with a silicosis diagnosis (ICD-10 code J62.8). Normal
rates of silicosis were two per year; however, during June 2017-December 2018,
seven cases9 of silicosis were identified, all among employees of stone fabrication
companies.
    The following SPARQL query represents such an investigation. It uses the
ICD-10 code J62.8 for a disease X as input and produces the NOC code for the
job (?noc_code) and the NOC title (?noc_title) as output. In this query, lines
1-4 show the prefixes, line 7 asserts J62.8 (Pneumoconiosis due
to other dust containing silica) as the disease, lines 8-10 assert a patient identified
by p1001 and its relations with the disease and the job s/he qualifies for, and lines 11-13
retrieve the values of the job code and the corresponding job title. The
results of the query are shown in Table 2.

1. PREFIX :      
2. PREFIX icd10: 
3. PREFIX patid: 
4. PREFIX rdfs: 
5. SELECT ?noc_code ?noc_title
6. WHERE {
7.   icd10:J62_8 a :Disease .
8.   patid:p1001 a :Patient ;
9.               :diagnosedWith icd10:J62_8 ;
10.              :qualifiesFor ?nc .
11.  ?nc :hasTitle ?nt ;
12.      rdfs:label ?noc_code .
13.  ?nt rdfs:label ?noc_title . }


     ?noc_code  ?noc_title
     8231       Construction trades helpers and labourers
     8231       Underground production and development miners
     6344       Jewellers, jewellery and watch repairers and related occupations
Table 2. Job classification (NOC code and NOC title) of patients with Pneumoconiosis
due to other dust containing silica (ICD-10 Code J62.8)

9 https://www.cdc.gov/mmwr/volumes/68/wr/mm6838a1.htm?s_cid=mm6838a1_x


    Q2. What is the disease and the job classification for patients with acquired
disability (resulting from a functional impairment)?
    A functional impairment for a patient may be caused by one or more diseases.
In this scenario, a case worker is interested in reviewing the job categories for
a given acquired disability resulting from a functional impairment. An example
of such a scenario may involve assessing visual impairment, the associated medical
conditions and the corresponding job titles. Knowing the category of job
(NOCCode) allows certain assumptions to be made about the capabilities required
for the patient’s job. This permits a review of the corresponding skills and
attributes typically required of the job and an early determination as to whether
the patient is likely to be able to return to the current job or a similar one.
    The SPARQL query below is capable of answering questions in such a sce-
nario. It uses the ICF code b2101 as an instance of :FunctionalImpairment
in line 5 to represent Visual field functions (i.e. seeing functions related to the
entire area that can be seen with fixation of gaze) as input. For all patients, the
query returns patient identifiers (lines 6-7), job codes and job titles (lines 9, 13-
15), the corresponding ICD-10 code of the disease which causes the impairment
(lines 8, 10-11), and the corresponding name of the disease (line 12). The output
in Table 3 lists 3 patients suffering from visual impairments.

1. PREFIX icf: 
2. PREFIX sc:   
3. SELECT ?patient_id ?noc_code ?noc_title ?icd_code ?disease_name
4. WHERE {
5.   icf:b2101 a :FunctionalImpairment .
6.   ?patient a :Patient ;
7.            rdfs:label ?patient_id ;
8.            :diagnosedWith ?icd ;
9.            :qualifiesFor ?nc .
10.  ?icd :causes icf:b2101 ;
11.        rdfs:label ?icd_code ;
12.        sc:name ?disease_name .
13.  ?nc :hasTitle ?nt ;
14.      rdfs:label ?noc_code .
15.  ?nt rdfs:label ?noc_title . }


    ?patient_id  ?noc_code  ?noc_title      ?icd_code  ?disease_name
    1001         7511       Truck driver    E10.3      Type 1 diabetes mellitus with ophthalmic complications
    1011         7442       Utility worker  H33.0      Retinal detachment with retinal break
    1083         7241       Electrician     H40.2      Primary angle-closure glaucoma
Table 3. List of patients, their job classification, ICD-10 codes and names of the
diseases causing functional impairment of the Visual field function (ICF code b2101)



    Q3. What jobs can a patient with vision impairment likely return to?
    In the same scenario as Q2, a case worker is again tasked with reviewing
the ‘Return to Work’ options for a patient with a recently acquired disability.
This time, taking the Truck Driver in Table 3 as an example, the case worker wants
to identify not just the category of job that the patient worked in, but also
whether the acquired disability arising from the specific functional impairment
(Visual field functions, ICF code b2101, caused by Type 1 diabetes mellitus
with ophthalmic complications, ICD code E10.3) may prevent the patient
from continuing in his/her job.
    An explicit mapping can be established between job attributes
such as physical abilities (PhysicalActivity) and the functional impairments
(FunctionalImpairment) described in Section 3. Based on assessments concerning
the level of vision impairment and the visual requirements for the jobs
shown in Table 3, the mappings that could be established were: Close
visual acuity (V1) maps to Visual acuity functions, other specified (ICF code
b21008); Near vision (V2) maps to Binocular acuity of near vision (ICF code
b21002); Near and far vision (V3) maps to Visual acuity functions (ICF code
b2100); and Total visual field (V4) maps to Visual field functions (ICF code
b2101).
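
The four mappings listed above amount to a small lookup table from Career Handbook vision levels to ICF codes. The sketch below encodes them and derives a simple compatibility check; the function names are illustrative, and the check is a simplifying assumption: it compares codes by exact match only and ignores the ICF hierarchy (for example, that b21002 and b21008 are refinements of b2100).

```python
# Vision level (V score) -> (Career Handbook label, mapped ICF code),
# transcribed from the mappings described in the text.
VISION_TO_ICF = {
    1: ("Close visual acuity", "b21008"),
    2: ("Near vision", "b21002"),
    3: ("Near and far vision", "b2100"),
    4: ("Total visual field", "b2101"),
}


def vision_levels_affected(icf_code: str) -> list:
    """Return the V scores whose mapped ICF code equals the given impairment."""
    return [level for level, (_, code) in VISION_TO_ICF.items() if code == icf_code]


def job_is_compatible(required_level: int, impairment_icf: str) -> bool:
    """A job is treated as compatible if its required vision level is not affected."""
    return required_level not in vision_levels_affected(impairment_icf)


print(vision_levels_affected("b2101"))
print(job_is_compatible(4, "b2101"), job_is_compatible(2, "b2101"))
```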
    For ‘Truck Drivers’, the mapping confirms that a visual requirement of V4 applies
to the job. We then explore alternative jobs to which the patient
could likely transition successfully. The target query must return
job titles whose vision requirements differ from those of the original job but whose
other JobAttributes are similar to those of a Truck Driver; the jobs listed must not
require the “Total visual field” level of vision.
    The SPARQL query below filters the integrated data: it excludes
all jobs requiring “Total visual field” (a V score of 4) and retains only results
matching ICF code b2101 and NOCCode value 7511. Lines 5-12 match
triples from the data model populated from the Career Handbook, while lines
13-20 match triples from the patient-centric populated data model.

1. PREFIX vision: 
2. SELECT DISTINCT ?noc ?noc_title ?icf_code ?ch_noc
3.                 ?ch_noctitle ?v_score ?att_name
4. WHERE {
5.   vision:v1001 a :Vision ;
6.                sc:ratingValue ?v_score ;
7.                rdfs:label ?att_name ;
8.                :partOf ?ja .
9.   ?ja :requiredBy ?chnoc .
10.  ?chnoc rdfs:label ?ch_noc ;
11.         :hasTitle ?chnt .
12.  ?chnt rdfs:label ?ch_noctitle .
13.  ?patient :diagnosedWith ?icd ;
14.           :qualifiesFor ?nc .
15.  ?nc rdfs:label ?noc ;
16.      :hasTitle ?nt .
17.  ?nt rdfs:label ?noc_title .
18.  ?icd :causes ?icf .
19.  ?icf a :FunctionalImpairment ;
20.       rdfs:label ?icf_code .
21.  FILTER(?v_score != 4 && ?icf_code="b2101" && ?noc="7511") }

    In Q3 we seek to accommodate the need to cross-reference impairment with
given job attributes in order to list jobs to which a patient might successfully
return, whether the same job or another. The results in Table 4
show the NOC code and impairment code for a patient (who is a Truck Driver),
as well as the Career Handbook code and title of the jobs
that the patient might return to with the given impairment, together with the
corresponding vision score in the job attributes and the vision attribute label.
Specifically, physicians and case workers like to query directly for impairments
that prevent a patient from returning to their most recent job, and to compare
these side by side with the physical attributes of a given job.


?noc  ?noc_title    ?icf_code  ?ch_noc  ?ch_noctitle                                      ?v_score  ?att_name
7511  Truck driver  b2101      5212.1   Conservation and restoration technicians          1         Close visual acuity
7511  Truck driver  b2101      7384.1   Gunsmiths                                         1         Close visual acuity
7511  Truck driver  b2101      7281.0   Bricklayers                                       3         Near and far vision
7511  Truck driver  b2101      2225.7   Lawn care specialists                             3         Near and far vision
7511  Truck driver  b2101      9414.1   Concrete products forming and finishing workers   2         Near vision
7511  Truck driver  b2101      5212.7   Picture framers                                   2         Near vision
Table 4. List of jobs (column ?ch noctitle) a Truck Driver can transition to with
different visual requirements other than Total visual field (V-4)
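
The filtering step behind Table 4 can be approximated in a few lines, assuming the Career Handbook pattern and the patient pattern have already been evaluated into rows. Since the two groups of triple patterns in the Q3 query share no variables, their solutions combine as a cross product before the FILTER is applied. The handbook rows with scores 1 to 3 are taken from Table 4 and the patient rows from Table 3; the row with a V score of 4 is a hypothetical placeholder added only to show the exclusion.

```python
from itertools import product

handbook_rows = [
    {"ch_noc": "7384.1", "ch_noctitle": "Gunsmiths", "v_score": 1,
     "att_name": "Close visual acuity"},
    {"ch_noc": "5212.7", "ch_noctitle": "Picture framers", "v_score": 2,
     "att_name": "Near vision"},
    {"ch_noc": "7281.0", "ch_noctitle": "Bricklayers", "v_score": 3,
     "att_name": "Near and far vision"},
    # Hypothetical placeholder row (not from the paper) for a V4 job.
    {"ch_noc": "HYPOTHETICAL", "ch_noctitle": "Job needing total visual field",
     "v_score": 4, "att_name": "Total visual field"},
]

patient_rows = [
    {"noc": "7511", "noc_title": "Truck driver", "icf_code": "b2101"},
    {"noc": "7241", "noc_title": "Electrician", "icf_code": "b2101"},
]


def return_to_work_options(noc: str, icf_code: str) -> list:
    """Cross product of patient and handbook rows, then the Q3 FILTER condition."""
    return [
        {**patient, **job}
        for patient, job in product(patient_rows, handbook_rows)
        if job["v_score"] != 4
        and patient["icf_code"] == icf_code
        and patient["noc"] == noc
    ]


options = return_to_work_options("7511", "b2101")
print([option["ch_noctitle"] for option in options])
```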


6     Discussion
In the current study we have assessed the core objectives in occupational medicine
and reviewed the target data that needs to be integrated to support these goals,
namely to explore occupation and disease associations, and to cross-reference
the NOC Career Handbook. With knowledge of a patient’s occupation, in the
form of a job title, integrated with information from disease diagnosis, it is possi-
ble to indicate likely functional impairments and consequent difficulties in work
performance.
    The preliminary model we developed appears to be fit for the initial purposes
and can support our sample queries. Instantiation of the model depended on
additional algorithmic computation for job coding (NOC code),
and our most complex query had to rely on expert-curated mappings between
ICF and the Career Handbook for details of attributes required in the given job.


    To our knowledge this is the first study integrating data specific to occu-
pational medicine using semantic technologies. The utility of the model is not
limited to OH, and it may find application in other areas such as human re-
sources or for government agencies to assist with accommodation of disability
or ensuring appropriate allocation of social benefits.
    In addition to the Canadian NOC, there are similar coding schemes in different
countries and jurisdictions, and established crosswalks between classifications.
The descriptors for job attributes will also be similar across countries. Moreover,
the disease and function classifications are international, being maintained by the
World Health Organisation. Therefore the model can be applied in any geographic
context using the same or different occupation descriptors.


References
1. Leslie A. MacDonald, Alex Cohen, Sherry Baron, and Cecil M. Burchfiel. Occu-
   pation as Socioeconomic Status or Environmental Exposure? A Survey of Practice
   Among Population-based Cardiovascular Studies in the United States. American
   Journal of Epidemiology, 169(12):1411–1421, 05 2009.
2. France Labrèche, Joanne Kim, Chaojie Song, Manisha Pahwa, Calvin B. Ge, Vic-
   toria H. Arrandale, Christopher B. McLeod, Cheryl E. Peters, Jérôme Lavoué,
   Hugh W. Davies, Anne-Marie Nicol, and Paul A. Demers. The current burden
   of cancer attributable to occupational exposures in Canada. Preventive Medicine,
   122:128 – 139, 2019. Burden of Cancer in Canada.
3. Mark P. Purdue, Sally J. Hutchings, Lesley Rushton, and Debra T. Silverman. The
   proportion of cancer attributable to occupational exposures. Annals of Epidemiol-
   ogy, 25(3):188 – 192, 2015. Causes of Cancer.
4. Daniel E Russ, Kwan-Yuet Ho, Joanne S Colt, Karla R Armenti, Dalsu Baris,
   Wong-Ho Chow, Faith Davis, Alison Johnson, Mark P Purdue, Margaret R Kara-
   gas, Kendra Schwartz, Molly Schwenn, Debra T Silverman, Calvin A Johnson, and
   Melissa C Friesen. Computer-based coding of free-text job descriptions to efficiently
   identify occupations in epidemiological studies. Occupational and Environmental
   Medicine, 73(6):417–424, 2016.
5. Igor Burstyn, Anton Slutsky, Derrick G. Lee, Alison B. Singer, Yuan An, and
   Yvonne L. Michael. Beyond Crosswalks: Reliability of Exposure Assessment Follow-
   ing Automated Coding of Free-Text Job Descriptions for Occupational Epidemiol-
   ogy. Annals of Work Exposures and Health, 58(4):482–492, 02 2014.
6. Employment and Social Development Canada and Statistics Canada. National occu-
   pational classification 2016. http://noc.esdc.gc.ca/English/noc/welcome.aspx?
   ver=16, 2016. [Online; accessed September-2019].
7. Canadian Immunisation Research Network. Serious outcomes surveillance (sos)
   network. http://cirnetwork.ca/network/serious-outcomes/, 2019. [Online; ac-
   cessed September 09, 2019].
8. Gary Bao, Christopher J.O. Baker, and Anil Adisesh. Development of the
   ASOC (Automated Semantic Occupation Coding) Algorithm. JMIR Preprints
   27/09/2019:16422, 2019. DOI: 10.2196/preprints.16422.



