Using Self Organizing Maps to Visualize Age Related Changes in Lumbar Vertebrae and Intervertebral Discs

Atif Ali Khan*¹, Daciana Iliescu¹, Evor Hines¹, Charles Hutchinson², Robert Sneath³
¹School of Engineering, University of Warwick, ²Warwick Medical School, University of Warwick, ³University Hospital Coventry and Warwickshire, NHS Trust, Coventry, United Kingdom
{Atif.Khan; E.L.Hines; D.D.Iliescu; C.E.Hutchinson}@warwick.ac.uk, robert.sneath@uhcw.nhs.uk

Abstract

A human spine is a complicated structure of bones, joints, ligaments and muscles which all undergo a process of change with age. This paper describes the use of artificial intelligence to visualize and better understand the progressive and degenerative changes in the human lumbar spine. Visualizing this pattern of change helps in finding the correlations among the spinal features and in understanding how a change in one feature affects the others. The self-organizing map (SOM) is an efficient tool for visualization of multidimensional numerical data. It is capable of projecting high-dimensional data onto a regular, usually 2-dimensional, grid of neurons. In this paper, the SOM is used to visualize the pattern of change in lumbar spine features with varying age. The paper shows how information can be acquired from SOM representations and how the SOM can best be utilized in exploratory data visualization. Data from the lumbar spine MRIs of 61 patients (both male and female) were used in this study. The age of the patients ranged from 2 to 93 years. Vertebral height, disc height and disc signal intensity were recorded from the MRI scans. The SOM then transformed the larger feature space to a smaller one to obtain a more meaningful relation between the spinal features. Complexity is reduced and the data set is represented in the form of a 2D map, which is easier to understand and provides a visual description.

I. Introduction

The human spine is a complicated and key component of the human body. During the normal ageing process, the spine undergoes progressive and regressive changes which presumably follow a certain pattern. This research focuses explicitly on the study of progressive and degenerative changes occurring in the human lumbar spine with the normal ageing process. It concentrates on the identification and classification of age-related variations in the human spine with the help of Magnetic Resonance Images (MRI). These scans of the lumbar spine area belong to patients of different age groups. Back pain is usually associated with spine disorders. It is the second most common reason for visits to the doctor's clinic, outnumbered only by upper-respiratory infections [1, 2, 3]. Back pain is also one of the most common reasons for missed work. One-half of all working Americans admit to having back pain symptoms each year [1, 4]. Before finding the specific cause of back pain, it is important first to study the variation of spinal features with age and their correlation with one another [5]. These variations and correlations among the features were studied using data collected from University Hospital Coventry and Warwickshire, United Kingdom, in the form of magnetic resonance images (MRI) of the lumbar spine. Scoring of features (feature selection and extraction) was done under the expertise of an orthopedic surgeon and a radiologist. The model was designed and built using self-organizing maps.

A self-organizing map (SOM) is a special type of artificial neural network which is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. Unlike other artificial neural networks, a self-organizing map uses a neighborhood function to preserve the topological properties of the input space [6]. One of the most interesting aspects of SOMs is that they learn to classify data without supervision. In supervised training, by contrast, an input vector is presented to the network (typically a multilayer feedforward network) and the output is compared with a target vector. If they differ, the weights of the network are altered slightly to reduce the error in the output. This is repeated many times and with many sets of vector pairs until the network gives the desired output. Training a SOM requires no target vector; the SOM learns to classify the training data without any external supervision whatsoever [7]. To study the variations and correlations of the spinal features, a model was built which assigns each patient to the cluster that he or she resembles most on the basis of his or her spinal scores. This will help spine specialists to rank and categorize patients on the basis of their spinal scores. This research work will give spine specialists and patients a better overview of any abnormal behavior shown by the spine.

II. Lumbar Spine

A human spine consists of bones, joints, ligaments and muscles. There are a total of 33 vertebrae in the human spine: 7 in the neck (cervical region), 12 in the middle back (thoracic region), 5 in the lower back (lumbar region), 5 that are fused to form the sacrum and 4 coccygeal bones that form the tailbone [8]. The anatomy of the human spine is shown in Figure 1. The focus of this research work is the age-related changes in the lumbar spine area. This area consists of vertebrae L1, L2, L3, L4 and L5 and the intervertebral discs between these vertebrae.

Figure 1. Anatomy of human spine

III. Data Set

The data set used in this research was taken from the University Hospital Coventry and Warwickshire (UHCW), United Kingdom. The raw data is in the form of Magnetic Resonance Images (MRI), specifically of the lumbar spine area. The format of the data is Digital Imaging and Communications in Medicine (DICOM). These magnetic resonance images are the actual scans of the patients. Figure 2 shows the lumbar spine MRI.

Figure 2. Sagittal and axial views of the lumbar spine MRI

The information associated with each MRI scan is the age and gender of the patient, which is used for SOM modeling.

MRI scans of 61 patients were selected to develop an initial model. The age and gender distribution of the patients is shown in Table I. Ten groups were formed on the basis of age decades: G0: up to 10 years, G1: 11-20 years, G2: 21-30 years, G3: 31-40 years, G4: 41-50 years, G5: 51-60 years, G6: 61-70 years, G7: 71-80 years, G8: 81-90 years and G9: 91 years and above.

Table I. Age wise clustering of the samples

Age Group   Age (years)      Female   Male   Total
Group 0     10 and younger   3        1      4
Group 1     11 to 20         2        4      6
Group 2     21 to 30         4        2      6
Group 3     31 to 40         3        3      6
Group 4     41 to 50         3        2      5
Group 5     51 to 60         4        2      6
Group 6     61 to 70         6        2      8
Group 7     71 to 80         4        4      8
Group 8     81 to 90         4        3      7
Group 9     91 and over      2        3      5
Total                        35       26     61

Many notable features can be studied from a lumbar spine MRI scan. The scoring criteria were set to look initially at the vertebral height (L1, L2, L3, L4 and L5), disc height (T12-L1, L1-L2, L2-L3, L3-L4, L4-L5 and L5-S1) and disc signal (T12-L1, L1-L2, L2-L3, L3-L4, L4-L5 and L5-S1). These 17 spinal features were used as input to build and test the initial model. The features were measured and recorded from the lumbar spine MRIs of 61 patients.

Table II. Extracted features of 5 samples from lumbar spine MRIs

Feature            Level     1       2       3       4       5
Gender             m/f       f       f       m       m       f
Age                years     8       23      40      68      89
Vertebral height   L1        16.94   22.82   27.16   23.95   21.7
                   L2        17.34   22.98   27.16   23.57   22.06
                   L3        16.8    24.57   26.08   23.53   21.94
                   L4        17.34   24.65   27.85   23.53   21.33
                   L5        17.22   25.94   27.25   23.95   19.11
Disc height        T12-L1    5.95    7.51    9.33    9.48    4.45
                   L1-L2     7.43    9.92    11.41   12.13   6.3
                   L2-L3     7.75    10.22   13.05   13.27   5.35
                   L3-L4     8.34    10.84   12.67   15.15   4.69
                   L4-L5     8.0     9.06    11.83   15.41   7.15
                   L5-S1     6.84    11.13   7.71    10.74   5.35
Disc signal        T12-L1    272.4   132.5   189.4   138.8   61.9
                   L1-L2     268.6   126.1   180.8   127.9   69.6
                   L2-L3     255.1   123     185.2   120.2   43
                   L3-L4     307.6   104.4   208.7   129.9   75.1
                   L4-L5     263     95.3    138.4   137.6   67.2
                   L5-S1     260     109.3   52.6    57.4    89.6

IV. Methodology

Self-organizing maps (SOMs) are a data visualization technique invented by Teuvo Kohonen which reduces the dimensions of data through the use of self-organizing neural networks. Because humans cannot visualize high-dimensional data, this technique was created to help us understand such data. SOMs reduce dimensions by producing a map, usually of 1 or 2 dimensions, which plots the similarities of the data by grouping similar data items together. SOMs thus accomplish two things: they reduce dimensions and they display similarities. The proposed model takes 17-element input vectors arranged as columns in a matrix. The SOM groups or ranks each sample (patient) on the basis of similarities in its 17 features and assigns a certain location in the map to each sample. Figure 3 shows a step-by-step demonstration of the methodology used.

Figure 3. Steps involved in modeling

V. Self-Organizing Maps

The Self-Organizing Map (SOM) is a data visualization technique which helps to understand high-dimensional data by reducing data dimensions and displaying similarities among data. According to Teuvo Kohonen, the self-organizing map (SOM) is a new, effective software tool for the visualization of high-dimensional data. It converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. As it thereby compresses information while preserving the most important topological and metric relationships of the primary data items on the display, it may also be thought to produce some kind of abstractions [9].

A SOM involves two processes: training and mapping. In the training process, it constructs the map using input samples. After training, it automatically classifies a new input sample in the mapping process. The map consists of neurons, each of which is associated with a weight vector that has the same dimension as the input sample and with a position in the map. The neurons are arranged originally in physical positions according to a topology function, such as a grid, hexagonal or random topology. The purpose of a SOM is to detect regularities and correlations in its input, and also to recognize groups of similar input vectors [10, 11]. The neurons adapt their future responses to that input accordingly, in such a way that neurons physically near each other in the neuron layer respond to similar input vectors.

Figure 4. Structure of self-organizing map

With a SOM, clustering is performed by having several units compete for the current object. Once the data have been entered into the system, the network of artificial neurons is trained by providing information about the inputs. The unit whose weight vector is closest to the current object becomes the winning or active unit. During the training stage, the weight values associated with the input variables are gradually adjusted in an attempt to preserve the neighborhood relationships that exist within the input data set. The weights of the winning unit and of its neighbors are adjusted so that they move closer to the input object [12, 13]. SOM training is shown in Figure 5.

Figure 5. SOM training

The self-organization process involves four major components:

Initialization: All the connection weights are initialized with small random values.
Competition: For each input pattern, the neurons compute their respective values of a discriminant function, which provides the basis for competition. The particular neuron with the smallest value of the discriminant function is declared the winner.
Cooperation: The winning neuron determines the spatial location of a topological neighbourhood of excited neurons, thereby providing the basis for cooperation among neighbouring neurons.
Adaptation: The excited neurons decrease their individual values of the discriminant function in relation to the input pattern through suitable adjustment of the associated connection weights, such that the response of the winning neuron to the subsequent application of a similar input pattern is enhanced.

SOM Algorithm:

Unlike other learning techniques in neural networks, training a SOM requires no target vector. A SOM learns to classify the training data without any external supervision. Each node's weights are initialized. If the input space is D-dimensional (i.e. there are D input units), the input patterns can be written as:

$$x = \{x_i : i = 1, \dots, D\}$$

and the connection weights between the input units i and the neurons j in the computation layer can be written as:

$$w_j = \{w_{ji} : j = 1, \dots, N;\ i = 1, \dots, D\}$$

where N is the total number of neurons. To determine the best matching unit (BMU), one method is to iterate through all the nodes and calculate the Euclidean distance between each node's weight vector and the current input vector. The node with the weight vector closest to the input vector is tagged as the BMU. The Euclidean distance is given as:

$$d_j(x) = \sqrt{\sum_{i=1}^{D} (x_i - w_{ji})^2}$$

where x is the current input vector and w_j is the weight vector of node j.

Network Architecture

In a SOM, the network is created from a 2D lattice of 'nodes', each of which is fully connected to the input layer. Figure 6 shows a very small Kohonen network of 3 x 3 nodes connected to the input layer, shown in dark blue.

Figure 6. SOM network architecture

Each node has a specific topological position (an x, y coordinate in the lattice) and contains a vector of weights of the same dimension as the input vectors. That is to say, if the training data consists of vectors X of n dimensions, (x1, x2, x3, ..., xn), then each node will contain a corresponding weight vector W of n dimensions, (w1, w2, w3, ..., wn).

VI. Experimentation

The measurements taken from the lumbar MRIs of 61 patients were used to model the SOM. Each patient has 17 features, which were used as input to the model. These 17 input variables are the vertebral heights (5 variables), disc heights (6 variables) and disc signals (6 variables). Variables 1-5 are the vertebral heights (L1 to L5), variables 6-11 are the disc heights from T12/L1 to L5/S1 and variables 12-17 are the disc signals from T12/L1 to L5/S1. The vertebral heights, disc heights and disc signals have different ranges. The initial model was built without normalization of the inputs. Figure 7 shows the SOM model built on the basis of the 17 input variables without normalization. In this model, the final quantization error was 47.292 and the final topographic error was 0.00.

Figure 7. 17 variables SOM without normalization of inputs (panels: U-matrix, Variable1 to Variable17)

Figure 8. SOM and U-matrix without normalization of inputs

The SOM display has two separate parts.
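The best matching unit search, the four self-organization components and the quantization error quoted above can be illustrated with a small sketch. It is a minimal pure-Python illustration on synthetic data, not the model or toolbox used to obtain the results reported here. The rectangular lattice (the maps in this paper use hexagonal cells), the linearly decaying learning rate and neighborhood radius, the Gaussian neighborhood, the z-score normalization and all function names (`euclidean`, `best_matching_unit`, `zscore`, `train_som`, `quantization_error`) are assumptions made for the example.

```python
import math
import random


def euclidean(x, w):
    """Euclidean distance between an input vector x and a weight vector w."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def best_matching_unit(x, weights):
    """Competition: index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))


def zscore(data):
    """Assumed normalization: scale every feature to zero mean, unit variance."""
    columns = list(zip(*data))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM on a rectangular lattice; returns the weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialization: small random connection weights.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the samples
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            # Competition: find the lattice position of the winning neuron.
            br, bc = grid[best_matching_unit(x, weights)]
            for (r, c), w in zip(grid, weights):
                # Cooperation: Gaussian neighborhood centred on the winner.
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2.0 * sigma ** 2))
                # Adaptation: move the weight vector toward the input.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(data, weights):
    """Mean distance between each sample and the weight vector of its BMU."""
    return sum(euclidean(x, weights[best_matching_unit(x, weights)])
               for x in data) / len(data)


if __name__ == "__main__":
    # Synthetic two-feature data forming two groups; not the patient data.
    data = ([[0.1 * i, 0.1 * i] for i in range(10)]
            + [[10 + 0.1 * i, 10 + 0.1 * i] for i in range(10)])
    som = train_som(data, rows=2, cols=2)
    print("quantization error:", quantization_error(data, som))
```

Because the quantization error is a mean distance in the units of the inputs, calling `quantization_error` on raw data and on `zscore(data)` gives values on different scales.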
The two parts of the SOM display are the unified distance matrix, or U-matrix, and the component planes that are provided for the individual variables [14, 15]. The U-matrix allows examination of the overall cluster patterns in the input data set after the model has been trained [16, 17, 18]. Each hexagonal cell represents an individual neuron; the neurons are the mathematical linkages between the input and output layers.

The neurons are drawn into distinct clusters during model training. Relative distances between neuron clusters are displayed by the intensity of the colors, with dark color representing greater distance [19, 20]. In the U-matrix generated here, a strong cluster is apparent in the top half (dark blue) and another one in the middle and lower-middle part (light blue). This indicates that most of the input variables are covarying in one direction in n-dimensional space (where n is the number of input variables). A different trend is seen when the SOM is modeled with normalized data, as shown in Figure 9.

Figure 9. 17 variables SOM with normalized inputs

The SOM model without input normalization showed a final quantization error of 47.292. With normalized inputs, the final quantization error was reduced to 1.989 and the final topographic error was 0.033. This shows that SOM analysis with normalized input variables provides far more accurate and reliable results than analysis without normalization. The first map in Figure 10 is the unified distance matrix or U-matrix, which represents the overall behavior of the model. Variables 1 to 5 are the vertebral heights, variables 6 to 11 are the disc heights and variables 12 to 17 are the disc signal intensities of all 61 patients. The color of the units (neurons) in the map shows the behavior of the specific neuron. Similar colors indicate that the neurons are located close to one another, that is, similarity among the samples.

Figure 10. SOM and U-matrix with normalized inputs (panels: U-matrix, Variable1 to Variable17)

VII. Results

In the component planes for the individual variables, the color coding corresponds to the actual numerical values of the input variables, which are referenced in the scale bars adjacent to each plot. Blue colors show low values and red corresponds to high values. The relationships between the variables are visualized by comparing the color patterns of the individual maps. In this manner, the relationships between all of the variables entered into the model can be examined simultaneously or in pair-wise combinations.

Figure 11. Visualization of SOM U-matrix and variables (panels: U-matrix, Variable1 to Variable17)

Matching the color code of each variable in Figure 11 with the U-matrix, it can be seen that the vertebral heights L1, L2, L3, L4 and L5 (variables 1 to 5 respectively) do not correlate with age (dissimilarity with the U-matrix). The disc heights T12-L1, L1-L2, L2-L3, L3-L4, L4-L5 and L5-S1 (variables 6 to 11 respectively) show some correlation with age. However, the disc signals T12-L1, L1-L2, L2-L3, L3-L4, L4-L5 and L5-S1 (variables 12 to 17 respectively) show a strong correlation with age.

VIII. Conclusion

The objective of the SOM analysis was to observe the interrelationships that exist between the 17 variables that were tested and thereby provide a basis for more advanced analysis. The SOM does not replace existing statistical tools, but complements our ability to examine relationships between disparate types of variables in a visual presentation of the data. By visualizing the SOM results obtained with the normalized data set, it was concluded that lumbar spine vertebral height does not correlate with age, whereas disc height shows some correlation with age. Disc signal intensities of the lumbar spine show a strong correlation with age. In future, other spinal features will be incorporated to study the spinal aging process in more depth.

Acknowledgments

This project was partially supported by the Warwick Impact Fund, University of Warwick, United Kingdom. The authors would like to thank the University Hospital Coventry and Warwickshire (UHCW) NHS Trust, Coventry, United Kingdom, for providing valuable data in the form of magnetic resonance images of the lumbar spine.

References

[1] Vallfors B. Acute, Subacute and Chronic Low Back Pain: Clinical Symptoms, Absenteeism and Working Environment. Scan J Rehab Med Suppl 1985; 11:1-98.
[2] Back Health at Work. HSE 2005.
[3] Back Pain Patient Outcomes Assessment Team (BOAT). In MEDTEP Update, Vol. 1 Issue 1, Agency for Health Care Policy and Research, Rockville.
[4] Department of Health Statistics Division. The prevalence of back pain in Great Britain in 1998. London: Government Statistical Service, 1999.
[5] Atif Khan, Daciana Iliescu, Evor Hines, Charles Hutchinson, and Robert Sneath. "Neural Network Based Spinal Age Estimation Using Lumbar Spine Magnetic Resonance Images (MRI)." Proceedings of the 4th International Conference on Intelligent Systems, Modelling and Simulation (ISMS2013), 29-31 January, Bangkok, Thailand, pp. 88-93.
[6] Kohonen, Teuvo. "Self-Organized Formation of Topologically Correct Feature Maps." Biological Cybernetics 43, no. 1 (1982): 59-69.
[7] Kohonen, Teuvo. "Self-organizing maps (SOMs)", Vol. 30. Springer Verlag, 2001.
[8] McKenzie, Robin A., and S. May. "The lumbar spine", Spinal, 1981.
[9] Kohonen, Teuvo. "The self-organizing map." Neurocomputing 21.1 (1998): 1-6.
[10] Barlow, Horace B. "Unsupervised learning." Neural Computation 1, no. 3 (1989): 295-311.
[11] Carpenter, Gail A., and Stephen Grossberg, eds. "Pattern recognition by self-organizing neural networks". MIT Press, 1991.
[12] Kohonen, Teuvo. "The self-organizing map." Proceedings of the IEEE 78, no. 9 (1990): 1464-1480.
[13] Vesanto, Juha, and Esa Alhoniemi. "Clustering of the self-organizing map." IEEE Transactions on Neural Networks 11, no. 3 (2000): 586-600.
[14] Vesanto, Juha. "SOM-based data visualization methods." Intelligent Data Analysis 3, no. 2 (1999): 111-126.
[15] Vesanto, Juha, Johan Himberg, Esa Alhoniemi, and Juha Parhankangas. "Self-organizing map in Matlab: the SOM toolbox." In Proceedings of the Matlab DSP Conference, vol. 99, pp. 16-17. 1999.
[16] Ultsch, Alfred, and H. Peter Siemon. "Kohonen's Self Organizing Feature Maps for Exploratory Data Analysis." Proceedings of the International Neural Network Conference (INNC-90), Paris, France, July 9-13, 1990. 1. Dordrecht, Netherlands: Kluwer. pp. 305-308. ISBN 978-0-7923-0831-7 (0-7923-0831-X).
[17] Ultsch, Alfred. U*-Matrix: A tool to visualize clusters in high dimensional data. Department of Computer Science, University of Marburg, Technical Report Nr. 36:1-12, 2003.
[18] Ultsch, Alfred. "Maps for the visualization of high-dimensional data spaces." In Proc. Workshop on Self organizing Maps, pp. 225-230. 2003.
[19] Vesanto, Juha, Johan Himberg, Esa Alhoniemi, and Juha Parhankangas. SOM toolbox for Matlab 5. Helsinki, Finland: Helsinki University of Technology, 2000.
[20] Ong, Jason, and S. S. Raza. "Data mining using self-organizing kohonen maps: A technique for effective data clustering & visualization." In International Conference on Artificial Intelligence (IC-AI'99). 1999.