=Paper=
{{Paper
|id=Vol-1795/paper22
|storemode=property
|title=Towards the Creation of the Cardiovascular Magnetic Resonance Quality Assessment Ontology (CMR-QA)
|pdfUrl=https://ceur-ws.org/Vol-1795/paper22.pdf
|volume=Vol-1795
|authors=Ernesto Jimenez-Ruiz,Valentina Carapella,Elena Lukaschuk,Nay Aung,Kenneth Fung,Jose Paiva,Mihir Sanghvi,Stefan Neubauer,Steffen Petersen,Ian Horrocks,Stefan Piechnik
|dblpUrl=https://dblp.org/rec/conf/swat4ls/Jimenez-RuizCLA16
}}
==Towards the Creation of the Cardiovascular Magnetic Resonance Quality Assessment Ontology (CMR-QA)==
Ernesto Jiménez-Ruiz<sup>1,4</sup>, Valentina Carapella<sup>2</sup>, Elena Lukaschuk<sup>2</sup>, Nay Aung<sup>3</sup>, Kenneth Fung<sup>3</sup>, Jose Paiva<sup>3</sup>, Mihir Sanghvi<sup>3</sup>, Stefan Neubauer<sup>1</sup>, Steffen Petersen<sup>3</sup>, Ian Horrocks<sup>4</sup>, Stefan Piechnik<sup>2</sup>

<sup>1</sup> Logic and Intelligent Data Group, Department of Informatics, University of Oslo, Norway

<sup>2</sup> Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Radcliffe Department of Medicine, University of Oxford, UK

<sup>3</sup> William Harvey Research Institute, NIHR Cardiovascular Biomedical Research Unit at Barts, Queen Mary University of London, UK

<sup>4</sup> Information Systems Group, Department of Computer Science, University of Oxford, UK

===1 Background===
UK Biobank<sup>5</sup> is a large-scale population study, started in 2006, that aims to improve the understanding, diagnosis and treatment of a wide range of diseases, such as cancer, stroke and cardiac pathologies [2]. Volunteers were recruited at the national level, reaching 500,000 participants aged 40-69 across the UK. All volunteers agreed to undergo a series of clinical tests and to have their health condition checked at follow-up. The protocol applied to the volunteers includes several clinical imaging modalities. In particular, a number of Cardiac Magnetic Resonance Imaging (CMR) sequences are employed to evaluate cardiac function: cine-MRI, tag-MRI, T1-mapping and blood flow imaging [3].

The work presented in this paper relates to a specific pilot study of 5,000 CMR scans limited to cine-MRI, the most common CMR modality in clinical practice. The results of this pilot study are about to be released to the public together with a first part of the associated data analysis. Data analysis for the 5,000 cine-MRI scans was carried out by a team of observers from two centres, OCMR and Barts Hospital, and consisted of two parts: image analysis and assessment of image quality.
Image analysis consists essentially of manual delineation of contours (also known as segmentation) of the four chambers of the heart, from which fundamental parameters of cardiac function are then computed. Quality assessment of the cine-MRI scans was carried out through a combination of free-text comments and numerical quality scores. Figure 1 highlights the two components of the analysis. The team managed quality assessment and general data analysis progress through a shared spreadsheet. For the purposes of our work, we focus only on the quality assessment data, that is, the combination of numerical quality scores and free-text annotations. The quality scores alone (1 = optimal, 2 = suboptimal, 3 = not analysable or not reliable) only provide a quick overall classification. The free-text annotations are rich in information but cannot be processed as easily and efficiently as the numerical scores.

<sup>*</sup> This paper is a shorter but more technical version of the paper "Towards the Semantic Enrichment of Free-text Annotation of Image Quality Assessment for UK Biobank Cardiac Cine MRI Scans" [1].

<sup>5</sup> http://www.ukbiobank.ac.uk/

Fig. 1. Example of analysis pipeline combining image analysis and image quality assessment. Quality scores are 1 = optimal, 2 = sub-optimal and 3 = unreliable or non-analysable.

In this ongoing work we aim to employ tools from the Semantic Web for the efficient structuring of free-text documentation. The semantics of the free-text annotations, which describe the quality of the image analysis, will be defined via a structured vocabulary or ontology, which we call CMR-QA (Cardiovascular Magnetic Resonance Quality Assessment).
The intended semantic layer (ontology, rules and data) will provide machine-readable data and will be a powerful tool for (i) fast and efficient processing of the free-text comments; (ii) automatic image quality assessment from such comments and generation of quality scores; (iii) evaluation of the quality of the free-text comments in terms of information completeness, ambiguity and variability; (iv) training purposes (e.g., showing preferred annotation styles for different types of images); (v) efficient semantic access (i.e., database querying) to the images by the UK Biobank target users, such as researchers in the field of automatic segmentation, or clinical researchers who need a specific subset as a control group in their study.

===2 Definition of the semantic layer===
Free-text annotations are prone to variability. This variability stems from natural human differences (e.g., the nomenclature used, differences in point of view), but it can also reflect different levels of professional expertise. For example, different observers can correctly flag as "LA off axis", "LA out of plane", "wrong LA plane", "2Ch out of plane" or "LA foreshortened" an image where the plane chosen to acquire a long axis view was not optimally aligned to measure left atrial (LA) volumes. The CMR-QA ontology aims to define the semantics of the free-text annotations in order to help identify commonalities and disagreements among different annotations. An ontology may reveal, as a source of variability, the use of different synonyms, or the use of narrower, broader or even sibling terms for the same kind of annotation. The use of an ontology (in combination with rules) may also reveal a source of ambiguity or incompleteness if some required information is not provided (e.g., the annotation "LA off axis" should always refer to the cardiac cycle where it was observed).
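To make the synonym, narrower, broader and sibling cases concrete, the following is a minimal Python sketch, not the CMR-QA implementation. The small hierarchy, the label table (including the label "wrong plane") and the function names are assumptions made for illustration; in particular, treating the five LA phrases above as labels of a single concept is only one possible modelling choice.

```python
# Hypothetical toy hierarchy (child -> parent); not the actual CMR-QA ontology.
PARENT = {
    "LA_offaxis": "Wrong_Orientation",
    "RA_offaxis": "Wrong_Orientation",
    "Wrong_Orientation": "Technical_Issue",
}

# Assumed label -> concept table; mapping all LA phrases to one concept is an assumption.
LABELS = {
    "la off axis": "LA_offaxis",
    "la out of plane": "LA_offaxis",
    "wrong la plane": "LA_offaxis",
    "2ch out of plane": "LA_offaxis",
    "la foreshortened": "LA_offaxis",
    "ra off axis": "RA_offaxis",
    "wrong plane": "Wrong_Orientation",
}


def ancestors(concept):
    """Return the chain of broader concepts, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def relate(text_a, text_b):
    """Describe how annotation text_a relates to annotation text_b."""
    a = LABELS.get(text_a.strip().lower())
    b = LABELS.get(text_b.strip().lower())
    if a is None or b is None:
        return "unknown"
    if a == b:
        return "synonym"
    if b in ancestors(a):
        return "narrower"  # text_a is more specific than text_b
    if a in ancestors(b):
        return "broader"   # text_a is more general than text_b
    if a in PARENT and PARENT[a] == PARENT.get(b):
        return "sibling"
    return "unrelated"


if __name__ == "__main__":
    print(relate("LA off axis", "2Ch out of plane"))
    print(relate("LA foreshortened", "RA off axis"))
```

A lookup of this kind only flags the relationship between two annotations; deciding whether a sibling pair is a genuine disagreement between observers would still require inspecting the image.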
====2.1 The ontology====
CMR-QA will include general knowledge about the domain;<sup>6</sup> for example, it will encode that the concepts CineMRI Scan and T1-Mapping Scan are kinds of MRI Scan. It will also encode more concrete knowledge about the quality assessment process; for example, wrong image plane orientation is a kind of technical issue. Similarly, RA off axis is a specific type of wrong image plane orientation, while mistriggering can be either a kind of technical issue (as an artefact) or a patient-related issue (as a pathology). The ontology also encodes relationships between concepts; for example, the property has technical issue relates concepts with types of technical issue. Non-logical knowledge in the form of lexical information can also be added to the ontology; for example, the ontology may include that RA out of plane is an alternative label for RA off axis. Equations (1)-(6) show the formalization of the above knowledge as ontology axioms.<sup>7</sup>

SubClassOf(CineMRI_Scan MRI_Scan) (1)

SubClassOf(Wrong_Orientation Technical_Issue) (2)

SubClassOf(RA_offaxis Wrong_Orientation) (3)

SubClassOf(Mistriggering ObjectUnionOf(Technical_Issue Patient_Issue)) (4)

ObjectPropertyRange(has_technical_issue Technical_Issue) (5)

AnnotationAssertion(alt_label RA_offaxis "RA out of plane") (6)

====2.2 The rules====
The ontology is being extended with (manually created) rules to infer implicit knowledge from the annotations. For example, if the free-text annotation includes the comment RA off-axis, then the comment necessarily refers to the horizontal long axis (HLA) view. Analogously, the free-text comment basal slice is missing implies a lack of coverage associated with the short axis (SA) view. The rules will also be used to infer the quality assessment scores. For example, Lack of coverage will always lead to a quality score for the right and left ventricles equal to or greater than 2 (i.e., suboptimal or unreliable).
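The rule examples above can be sketched in Python as a single forward-chaining pass over a set of facts. This is only an illustration under stated assumptions: facts are encoded as tuples, the predicate names has_issue_in_SA_view, min_RV_quality_score and min_LV_quality_score are hypothetical, and the "equal to or greater than 2" constraint is modelled as a lower bound on the score rather than as a fixed value.

```python
def apply_rules(facts):
    """Apply the three example rules once and return the newly inferred facts.

    Facts are tuples: (concept, individual) or (property, subject, object).
    """
    inferred = set()
    for fact in facts:
        if fact[0] != "has_technical_issue":
            continue
        _, mri, issue = fact
        # The rules are only stated for cine-MRI scans.
        if ("CineMRI_Scan", mri) not in facts:
            continue
        # RA off-axis necessarily refers to the HLA view.
        if ("RA_offaxis", issue) in facts:
            inferred.add(("has_issue_in_HLA_view", mri, issue))
        # Lack of coverage refers to the SA view and bounds both ventricle scores.
        if ("Lack_coverage", issue) in facts:
            inferred.add(("has_issue_in_SA_view", mri, issue))       # hypothetical name
            inferred.add(("min_RV_quality_score", mri, 2))           # hypothetical name
            inferred.add(("min_LV_quality_score", mri, 2))           # hypothetical name
    return inferred


if __name__ == "__main__":
    example = {
        ("CineMRI_Scan", "mri"),
        ("has_technical_issue", "mri", "issue1"),
        ("Lack_coverage", "issue1"),
        ("has_technical_issue", "mri", "issue2"),
        ("RA_offaxis", "issue2"),
    }
    for new_fact in sorted(apply_rules(example), key=str):
        print(new_fact)
```

A real deployment would more plausibly hand such rules to a datalog or OWL reasoner; the hand-written loop is only meant to show what each rule concludes from which facts.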
In addition, rules may also be used to reveal potential incompleteness or ambiguity. For example, we can classify the technical issue LA off axis as incomplete if the cardiac cycle is not indicated. Equations (7)-(9) show the formalization of some of the aforementioned rules.<sup>8</sup>

has_issue_in_HLA_view(mri, issue) ← CineMRI_Scan(mri) ∧ has_technical_issue(mri, issue) ∧ RA_offaxis(issue) (7)

has_RV_quality_score(mri, 2) ← CineMRI_Scan(mri) ∧ has_technical_issue(mri, issue) ∧ Lack_coverage(issue) (8)

IncompleteIssueDefinition(issue) ← RA_offaxis(issue) ∧ not observed_in_cardiac_cycle(issue, cc) (9)

<sup>6</sup> We could not find any ontology in BioPortal [4] meeting all our requirements, which indicates the need for a more specific ontology in this particular domain.

<sup>7</sup> We use OWL functional-style syntax: https://www.w3.org/TR/owl2-syntax/.

<sup>8</sup> We use datalog rules (a subset of Prolog). A rule of the form A(x) ← B(x) ∧ R(x, y) ∧ C(y) means that the combination of the concepts B and C via the relationship R (for the given individuals 'x' and 'y') implies that the individual 'x' is a member (or a type) of the concept A.

====2.3 The data====
The free-text content in the spreadsheet is rich in information but cannot be processed in an easy and efficient manner. In this work we are developing custom named entity recognition (NER) techniques to transform the free-text comments into semantically rich data according to CMR-QA. For example, the free-text comment "basal slice is missing. wrong planes ra/la" associated with a CineMRI Scan leads to the following seven ontology facts:

CineMRI_Scan(mri)
has_technical_issue(mri, issue1)
Lack_coverage(issue1)
has_technical_issue(mri, issue2)
RA_offaxis(issue2)
has_technical_issue(mri, issue3)
LA_offaxis(issue3)

where mri, issue1, issue2 and issue3 are the identifiers of the ontology data extracted from the annotations.
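As a rough illustration of how a comment could be turned into such facts, here is a toy pattern-based extractor in Python. It is not the custom NER technique under development: the regular expressions, the pattern table and the function name extract_facts are assumptions chosen so that the example comment above yields the seven facts listed.

```python
import re

# Hypothetical pattern table: each regex maps to the concepts it raises.
PATTERNS = [
    (re.compile(r"basal slice (is )?missing"), ["Lack_coverage"]),
    (re.compile(r"wrong planes? ra/la"), ["RA_offaxis", "LA_offaxis"]),
]


def extract_facts(comment, scan_id="mri"):
    """Split a comment into sentence-like chunks and emit ontology facts as tuples."""
    facts = [("CineMRI_Scan", scan_id)]
    counter = 0
    for chunk in comment.lower().split("."):
        chunk = chunk.strip()
        if not chunk:
            continue
        for pattern, concepts in PATTERNS:
            if not pattern.search(chunk):
                continue
            # Each matched concept becomes a fresh issue individual.
            for concept in concepts:
                counter += 1
                issue_id = f"issue{counter}"
                facts.append(("has_technical_issue", scan_id, issue_id))
                facts.append((concept, issue_id))
    return facts


if __name__ == "__main__":
    for fact in extract_facts("basal slice is missing. wrong planes ra/la"):
        print(fact)
```

Fixed patterns like these would miss most of the spelling and wording variants discussed in Section 2, which is the motivation for mapping text to ontology labels rather than to hard-coded strings.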
For example, issue1 is associated with the chunk of text "basal slice is missing" and represents a concrete individual of an observed Lack of coverage (i.e., issue1 is a member of the concept Lack_coverage, that is, Lack_coverage(issue1)).

These facts, together with the ontology and rules introduced in the previous sections, will lead via reasoning to new (implicit) facts. For example, using the ontology axioms in Equations (1)-(5) we can infer the new facts Technical_Issue(issue1) and MRI_Scan(mri). The rules in Equations (7)-(9) will also infer new knowledge (e.g., has_RV_quality_score(mri, 2) via rule (8) and has_issue_in_HLA_view(mri, issue2) via rule (7)) or raise potential warnings with regard to ambiguity or incompleteness (e.g., IncompleteIssueDefinition(issue2) and IncompleteIssueDefinition(issue3) via rule (9)).

===3 Discussion===
In this preliminary study we have set out the first axioms and rules to define a structured vocabulary associated with the quality assessment data. Preliminary experiments to automatically extract ontology data from the free-text annotations have also been conducted.

Related Work. There have been recent efforts to add a semantic layer describing the information within a biobank. Andrade et al. [5] envisaged the benefits of using ontologies for querying and searching the information in a biobank and across biobanks. Müller et al. [6] present an updated overview of the state of the art and open challenges for the description of, and interoperability across, biobanks, where the use of Semantic Web technologies will play a key role. Examples of concrete Semantic Web-based solutions in biobanks can also be found in [7, 8].

Future Work. As immediate future work, we plan to complete CMR-QA, define the necessary ontology rules and finalize the implementation of the text-mining techniques that extract ontology facts from the comments.
We also aim to integrate CMR-QA with other parts of UK Biobank where different ontologies and controlled vocabularies may be used, and with standard medical vocabularies such as SNOMED CT. Furthermore, we will perform an extensive evaluation to analyse the correctness of our approach. For example, (automatic) validation will be carried out by comparing the scores generated automatically by the rules with those manually assigned by the observers as part of their quality assessment.

Acknowledgements. SEP, SN and SP acknowledge the British Heart Foundation (BHF) for funding the manual analysis to create a cardiovascular magnetic resonance imaging reference standard for the UK Biobank imaging resource in 5,000 CMR scans (PG/14/89/31194, PI Petersen, 6/2015 to 5/2018). SKP, VC and SN were additionally funded by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre based at The Oxford University Hospitals Trust at the University of Oxford. EJR and IH were funded by the EC FP7 project Optique, and the EPSRC projects Score!, ED3 and DBOnto. EJR was also funded by the Centre for Scalable Data Access (SIRIUS) and the RCN project BigMed.

===References===
1. Carapella, V., et al.: Towards the Semantic Enrichment of Free-text Annotation of Image Quality Assessment for UK Biobank Cardiac Cine MRI Scans. In: MICCAI Workshop on Large-scale Annotation of Biomedical data and Expert Label Synthesis (LABELS). (2016)
2. Petersen, S.E., et al.: Imaging in population science: cardiovascular magnetic resonance in 100,000 participants of UK Biobank - rationale, challenges and approaches. Journal of Cardiovascular Magnetic Resonance 15(1) (2013) 1–10
3. Schulz-Menger, J., et al.: Standardized image interpretation and post processing in cardiovascular magnetic resonance: Society for Cardiovascular Magnetic Resonance (SCMR) board of trustees task force on standardized post processing. Journal of Cardiovascular Magnetic Resonance 15(1) (2013) 1–19
4. Fridman Noy, N., et al.: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research 37(Web-Server-Issue) (2009)
5. Andrade, A.Q., Kreuzthaler, M., Hastings, J., Krestyaninova, M., Schulz, S.: Requirements for semantic biobanks. Stud Health Technol Inform. 180 (2012) 569–573
6. Müller, H., et al.: State-of-the-Art and Future Challenges in the Integration of Biobank Catalogues. In: Smart Health - Open Problems and Future Challenges. (2015) 261–273
7. Pathak, J., et al.: Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank. J. Biomedical Semantics 3 (2012) 10
8. Brochhausen, M., et al.: Developing a semantically rich ontology for the biobank-administration domain. J. Biomedical Semantics 4 (2013) 23