Errors and Artefacts in Histopathological Imaging

Antony Galton 1∗, Shereen Fouad 2, Gabriel Landini 2 and David Randell 2
1 Department of Computer Science, University of Exeter, Exeter, UK
2 School of Dentistry, Institute of Clinical Sciences, University of Birmingham, UK
∗ To whom correspondence should be addressed: apgalton@ex.ac.uk

INTRODUCTION

Preparation of histological samples for digital imaging, followed by image formation and capture, forms only the start of an extended pipeline running from biopsy to diagnosis or analysis. Artefacts arising at these early stages form the best documented part of a heterogeneous catalogue of things that can go wrong in the course of following the pipeline. The literature on this subject mainly covers problems arising at the stage of specimen preparation and how these affect what is seen by the microscopist. With the advent of digital microscopy and telepathology, however, new kinds of digitisation artefacts or imaging errors can arise during slide image capture, and further types of error emerge when we process, interpret, and make inferences from the digital images. In this work we provide a classification and explanation of such phenomena, and of how, where, and why they arise in the imaging pipeline, so that they can at least be mitigated at the point at which they are generated.

[Figure: flow diagram. Human patient -> (biopsy, smear, ...) -> Tissue sample -> (slicing, mounting, staining) -> Microscope slide -> (image formation and capture) -> Digital image (pixel array) -> (segmentation) -> Segmented image (potentially meaningful sets of pixels) -> (interpretation and labelling) -> Histological model (image segments interpreted as depicting entities in tissue sample) -> (quantitation, evaluation) -> Diagnosis -> (treatment)]
Fig. 1. The Histological Imaging Pipeline

The histological imaging pipeline (Figure 1) comprises a sequence of stages leading from extraction of biological tissue from an organism to yield a tissue sample which is then prepared for imaging and segmentation, to the application of histological theory to interpret the segmented regions as depicting actual histological entities present in the original sample. This interpretation and labelling results in a histological model for the sample; only on the basis of such a model can diagnostic inferences be made, leading to the possibility of selecting the most appropriate treatment.

THE SYSTEM OF ONTOLOGICAL LEVELS

In order to classify the different types of error or artefact in the histological imaging pipeline, we adopt the ontological framework used in Galton et al. (2016), according to which each stage of the pipeline is characterised by an ontologically distinct assemblage of entities that are handled at that stage. We refer to these assemblages as levels; they form a series as shown in Figure 2.

Level 3: Histological Models. Labelled image segments interpretable as model cells, model nuclei, etc.
Level 2: Segmented Images. Image segments as candidate cells, candidate nuclei, etc.
Level 1: Captured Images. Pixel arrays.
Level 0: Biological Reality. Tissues, cells, nuclei, etc.; biopsy samples, histological preparations, etc.
Fig. 2. Ontological levels in histological imaging (after Galton et al. 2016)

The inhabitants of level 3 are image segments labelled with level 0 category names; as such, level 3 models are quite distinct from the level 0 entities they represent. Such models always represent simplified versions of reality, being two-dimensional representations of three-dimensional realities, capturing only a tiny fraction of the information that could potentially be extracted from those realities by a hypothetical “omniscient” histologist.

Level 0 comprises physical entities (real biological material), whereas levels 1 upwards comprise information entities, abstract patterns that may be instantiated in physical bearers (e.g., computer memory, screen display, hard-copy printout). We distinguish digital artefacts, arising from errors in the production of an information entity (at levels 1 and above), from non-digital artefacts, arising from errors occurring wholly within level 0.

LEVEL-DEPENDENT ERROR-TYPES

Here we present a brief overview of the kinds of errors that are encountered during the transition from each level to the next.

Level 0 to level 0.5. Errors here include tissue sampling errors, arising from the process of extracting tissue samples from organisms (e.g., destruction or degradation of samples, incorrectly targeted sampling, crush, splits, fragmentation, haemorrhage; tears and missing parts, scratches from a damaged microtome blade, tissue sections too thick, ill-chosen cut direction (Fig. 3a)); and tissue preparation errors, occurring during slicing, staining, and mounting (e.g., fixation failure, tissue shrinkage, folds (Fig. 3b), contamination with foreign matter or air bubbles, over- or understaining, faded stain; lack of stoichiometry of certain dyes; immunohistochemistry-related issues such as background staining and antibody cross-reactivity; misplaced tissue micro-array cores).

Level 0.5 to level 1. These are imaging errors, relating to image-formation in the imaging device (e.g., a microscope), or image-capture in the capture device (e.g., a camera). In each case we distinguish device errors and deployment errors:

                   Device errors                        Deployment errors
Image formation    Chromatic aberration (Fig. 3d)       Coverslip scratches
                   Spatial distortion                   Uneven background illumination (Fig. 3c)
Image capture      Bayer mask errors                    Thermal noise
                   “Hot” and “dead” pixels (Fig. 3e)    Interference and banding

[Figure: six image panels, (a) to (f)]
Fig. 3. Example errors from various stages of the imaging pipeline. Level 0 errors: (a) Apparent “islands” in epithelium arising from non-orthogonal cut direction. (b) Folded tissue. Level 1 errors: (c) Left: unevenly illuminated, non-white background; middle: background only; right: image corrected using (sample/brightfield)×255 for each colour channel. (d) Left: chromatic aberration; right: corrected image using similarity transform. (e) Image generated by 1-minute exposure of unilluminated field, showing “hot” pixels. A level 2 error: (f) Left: initial captured image; middle: undersegmented image, using histogram “minimum error” method; right: better segmentation using regional gradient method (Landini et al. 2016).

Level 1 to level 2. These are image-processing errors that occur during the process of manipulating the initially captured image in order to enable discovery of relevant information from it. Segmentation picks out some distinguished subset of pixels in the image and treats each of its connected components (segments) as an “object”. The goal is to find segments which depict level 0 entities. Errors occur when the technique used leads to segmented images that fail to correspond to reality. These include oversegmentation, where disconnected image segments derive from a single connected object in reality, and undersegmentation, where one segment represents a group of distinct objects in reality (Fig. 3f).

Level 2 to level 3. These are interpretation errors, involving incorrect labelling of level 2 entities by level 0 categories. Level 3 entities are histological models, represented as image segments labelled by histological categories in conformity with theoretical expectations (e.g., nuclei should be proper parts of their cell bodies). Often the segmentation must be manipulated before category labels can be conformably assigned; such resegmentation operations (Randell et al. 2013) may introduce other errors if not deployed carefully. Uncorrected errors from any earlier stage in the pipeline may result in histological models which, though theoretically acceptable, do not correspond to reality.

Mitigation of interpretation errors depends on tests based on prior theoretical understanding of the target entities, e.g., typical shape and size range of nuclei. Some tests can be embedded in the segmentation process itself, resulting in level 2 entities already more nearly conformable with level 3 (Landini et al. 2016).

Beyond level 3. This is a miscellaneous collection of histological inference errors, leading to a faulty diagnosis. These can arise from faulty or incorrectly used software systems (e.g., for computer-aided diagnosis) used in digitised image analysis. Errors may occur at any stage in the software development, from design, through implementation and testing, up to final deployment. Systematic consideration of such errors is relatively new to the histological imaging community, but in view of recent advances in the field it is important to recognise them as a significant class.

In pattern recognition algorithms, for example, histological images are represented by vector quantisation, where each object in the segmented image is characterised by a set of features. A variety of errors can arise from inappropriate choice of feature set.

In general these high-level errors arise if too much trust is placed in necessarily imperfect software; it should not be used “blindly” but under the scrutiny of a trained pathologist whose judgment can supplement or correct an otherwise highly automated process.

ACKNOWLEDGEMENTS

This work is supported by EPSRC grant EP/M023869/1 “Novel context-based segmentation algorithms for intelligent microscopy”.

REFERENCES

Galton, A., Landini, G., Randell, D. & Fouad, S. (2016), ‘Ontological levels in histological imaging’, in R. Ferrario & W. Kuhn, eds, ‘Formal Ontology in Information Systems’, IOS Press, pp. 271–284.
Landini, G., Randell, D., Fouad, S. & Galton, A. (2016), ‘Automatic thresholding from the gradients of region boundaries’, Journal of Microscopy.
Randell, D. A., Landini, G. & Galton, A. (2013), ‘Discrete mereotopology for spatial reasoning in automated histological image analysis’, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(3), 568–581.
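
SUPPLEMENTARY SKETCH: BACKGROUND CORRECTION

The caption of Fig. 3(c) describes correcting an unevenly illuminated, non-white background by computing (sample/brightfield)×255 for each colour channel. The following is a minimal sketch of that arithmetic in Python with NumPy, offered as an illustration under stated assumptions rather than as the authors' implementation. The names flat_field_correct and make_demo_images, the synthetic test images, the guard against zero-valued background pixels, and the rounding and clipping to 8 bits are all assumptions introduced here.

```python
import numpy as np


def flat_field_correct(sample, brightfield):
    """Compute (sample / brightfield) * 255 independently for each colour channel.

    Both inputs are H x W x 3 arrays; `brightfield` is an image of the
    background only, captured under the same illumination as `sample`.
    """
    sample = np.asarray(sample, dtype=np.float64)
    brightfield = np.asarray(brightfield, dtype=np.float64)
    if sample.shape != brightfield.shape:
        raise ValueError("sample and brightfield must have the same shape")
    # Assumption: zero-valued background pixels are replaced by 1 so that
    # the division is always defined.
    safe_background = np.where(brightfield > 0, brightfield, 1.0)
    corrected = sample / safe_background * 255.0
    # Assumption: the result is rounded and clipped back to the 8-bit range.
    return np.clip(np.rint(corrected), 0, 255).astype(np.uint8)


def make_demo_images(height=4, width=6):
    """Build a hypothetical sample/brightfield pair for demonstration.

    Illumination falls off from left to right and has a colour cast, so the
    background is neither even nor white. A small square "object" transmits
    half of the light in every channel.
    """
    ramp = np.linspace(240.0, 120.0, width)   # uneven illumination
    cast = np.array([1.0, 0.95, 0.85])        # non-white background
    brightfield = np.broadcast_to(
        ramp[None, :, None] * cast, (height, width, 3)
    )
    transmittance = np.ones((height, width, 3))
    transmittance[1:3, 2:4, :] = 0.5
    sample = brightfield * transmittance
    return (
        np.rint(sample).astype(np.uint8),
        np.rint(brightfield).astype(np.uint8),
    )


if __name__ == "__main__":
    sample, brightfield = make_demo_images()
    corrected = flat_field_correct(sample, brightfield)
    print(corrected[:, :, 0])
```

In this synthetic example the corrected background becomes uniformly white and the object takes a near-constant mid-grey value regardless of where it sits in the illumination gradient. Real images would additionally need a brightfield frame acquired with the same optics and exposure, which this sketch simply assumes is available.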