Studies of Anthropometrical Features using Machine Learning Approach

The Long Nguyen¹, Thu Huong Nguyen¹, and Aleksei Zhukov²

¹ Irkutsk State Technical University, Irkutsk, Russia
{thelongit88,thuhuongyb}@gmail.com
² Irkutsk State University, Irkutsk, Russia
zhukovalex13@gmail.com

Abstract. In this article we propose a novel approach to measuring anthropometrical features such as height, shoulder width, and the circumferences of the chest, hip and waist. Sub-pixel processing and the convex hull technique are used to measure the features efficiently from a 2D image. The SVM technique is used to classify men and women based on the measured features. The results of real data processing are presented.

Keywords: anthropometrical features, SVM, image processing, body size, support vector machine

1 Introduction

The development of efficient image recognition methods is an important field of computer science. Such methods rely on machine learning techniques that enable automatic scene analysis. Recently, automatic detection and feature extraction of human bodies has become widely used in many fields, such as non-contact measurement of body size [1], the construction of 3D models of humans [2,3,12], the analysis of human action [4] and pose estimation [5].

In this paper we propose a system for non-contact extraction of anthropometrical features based on the analysis of digital images of humans. Background subtraction is used to obtain the contour data, and edge detection operators are employed to detect the contour (silhouette). The approach combines face recognition, background subtraction, skin color detection and contour analysis. Convexity defects of the convex hull are then used to extract the anthropometrical features, and a support vector machine (SVM) is applied for gender classification.
Cluster analysis methods make it possible to analyze the space of feature vectors built from the obtained sizes and to find the clusters that correspond to the most characteristic anthropometric features of people. This can help to develop recommendations for the clothing industry and can also be used in other fields of natural science, such as axiology.

The novelty and motivation of our approach lie in using image processing techniques to measure anthropometric features efficiently for subsequent classification with machine learning techniques. As an application, the male/female classification task was examined.

2 Related Work

All methods for detecting body parts can be divided into two main categories: model based methods and learning methods. Model based approaches are "top-down" methods that rely on preliminary information, e.g. about the shapes of the human body in various poses [6]. Learning methods apply the principle of learning from data to extract useful knowledge from the data.

In [7] an effective approach to silhouette detection is presented. It represents the contour curves of the human body shape as an 8-connected Freeman chain code [8]. The classic background subtraction algorithm and the Canny edge detector are used to obtain the silhouette. The contour data is then divided into a number of segments, and special rules that measure the differences between the directions of the segments are used to extract feature points. This approach has several disadvantages, including high sensitivity to noise.

The body contour is also divided into parts in [9]. The convexity and curvature of the boundary of each contour segment are used to define the body parts.

In [6] an approach to body part segmentation in noisy silhouette images was proposed. A weighted radial histogram of distances and directions is used as the feature set. The authors also used a Hidden Markov Model (HMM) to model the silhouette as a sequence of body parts.
A general model is trained using shape context features extracted from labeled synthetic data.

In [10] the authors use a segmentation method based on sub-pixel data processing and relatively large regions, or segments. It detects salient upper and lower extremities from these segments and, at the same time, identifies potential head and torso positions. The outputs of both modules are combined into partial configurations of the body, and global constraints are applied to recover full body configurations.

3 Anthropometric Data Extraction

In the first part of the approach we detect the silhouette using the background subtraction method. Morphological operations, including erosion, dilation, opening and closing, are performed to reduce noise and to smooth the silhouette contour. A convex hull is created around the silhouette, and its convexity defects are used as the features for analysis. Specifically, these features are the start and end points of each convexity defect and the location of the defect point. To find the convex hull of a 2D point set, we use Sklansky's algorithm [14], which has $O(N \log N)$ complexity.

Fig. 1. Flowchart of machine learning with anthropological features extraction.

Background Subtraction Algorithm: The method is based on a comparison between two images, namely the foreground image (FG) and the background image (BG). The background image is captured when there is no object motion in the scene [9,10]. Let $FG(x, y)$ denote the intensity of the pixel with coordinates $(x, y)$ in the foreground image, which belongs to the interval $[0, 255]$, and let $BG(x, y)$ denote the intensity of the pixel with coordinates $(x, y)$ in the background image. A pixel with coordinates $(x, y)$ in the foreground image belongs to the object if it satisfies

$$|FG(x, y) - BG(x, y)| > T,$$

where $T$ is the threshold value, which is initialized with a predetermined value.
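As a hedged illustration of the thresholding rule above, the following minimal sketch builds a binary mask from two small grayscale images stored as nested lists. The function name `subtract_background`, the toy pixel values and the threshold of 30 are assumptions made for this example only; they are not taken from the authors' implementation.

```python
def subtract_background(fg, bg, threshold):
    """Return a binary mask: 1 where |FG - BG| > threshold, else 0.

    Assumption: pixels whose difference equals the threshold exactly are
    treated as background; the text does not specify this case.
    """
    if len(fg) != len(bg) or any(len(r) != len(s) for r, s in zip(fg, bg)):
        raise ValueError("foreground and background must have the same shape")
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(fg, bg)
    ]


# Toy 3x4 grayscale images with intensities in [0, 255] (made-up values).
background = [
    [10, 12, 11, 10],
    [10, 11, 12, 10],
    [11, 10, 10, 12],
]
foreground = [
    [10, 200, 210, 10],
    [12, 190, 205, 11],
    [11, 10, 40, 12],
]

mask = subtract_background(foreground, background, threshold=30)
for row in mask:
    print(row)
```

In practice the resulting mask would still be cleaned with the morphological operations mentioned earlier before the contour is extracted.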
Pixels are labeled 1 (object) if $|FG(x, y) - BG(x, y)| > T$ and 0 (not object) if $|FG(x, y) - BG(x, y)| < T$.

Sub-pixel Processing: To achieve better accuracy we employed sub-pixel processing. Conventional image processing is performed in units of 1 pixel, while the sub-pixel processing method performs position detection in units down to 0.01 pixels. This enables high accuracy position detection and expands the application range to precise part location and dimension measurement. As described in [18], the sub-pixel procedure iterates to find the sub-pixel accurate location of corners, as shown in Fig. 2.

The sub-pixel accurate corner locator is based on the observation that every vector from the center $q$ to a point $p$ located within a neighborhood of $q$ is orthogonal to the image gradient at $p$, subject to image and measurement noise.

Fig. 2. Sub-pixel corner location principle illustration (red arrows indicate the gradient direction).

Consider the expression

$$\epsilon_i = DI_{p_i}^{T} \cdot (q - p_i),$$

where $DI_{p_i}$ is the image gradient at one of the points $p_i$ in a neighborhood of $q$. The value of $q$ is to be found so that $\epsilon_i$ is minimized. A system of equations may be set up with $\epsilon_i$ set to zero:

$$\sum_i \left( DI_{p_i} \cdot DI_{p_i}^{T} \right) \cdot q - \sum_i \left( DI_{p_i} \cdot DI_{p_i}^{T} \cdot p_i \right) = 0,$$

where the gradients are summed within a neighborhood of $q$. Calling the first gradient term $G$ and the second gradient term $b$ gives

$$q = G^{-1} \cdot b.$$

The algorithm sets the center of the neighborhood window at this new center $q$ and then iterates until the center stays within a set threshold. In our work we use the OpenCV library to perform sub-pixel corner detection [18].

Fig. 3. Sub-pixel processing result.

In our approach sub-pixel processing helps us to find corners. It uses the dot product technique to refine them. The function works iteratively, refining the corners until the termination criterion is reached. Most sub-pixel algorithms require a good estimate of the location of the feature.
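The iteration $q = G^{-1} \cdot b$ described above can be sketched in a few lines of standard-library Python. This is only an illustration of the principle on synthetic data, not the OpenCV routine used in the paper: the function name `refine_corner`, the square window test, and the hand-made samples (points on two perpendicular edges with unit gradients) are all assumptions of this example.

```python
import math


def refine_corner(samples, q0, radius=3.0, eps=1e-6, max_iter=20):
    """Refine a corner estimate by repeatedly solving q = G^-1 * b.

    samples: list of ((px, py), (gx, gy)) pairs, i.e. a point and the image
             gradient at that point (synthetic here, hypothetical format).
    q0:      initial pixel-accurate corner estimate.
    """
    qx, qy = q0
    for _ in range(max_iter):
        gxx = gxy = gyy = bx = by = 0.0
        for (px, py), (gx, gy) in samples:
            # Use only points inside the window centred on the current q.
            if abs(px - qx) > radius or abs(py - qy) > radius:
                continue
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
            # b accumulates (g g^T) p for every point p in the window.
            bx += gx * gx * px + gx * gy * py
            by += gx * gy * px + gy * gy * py
        det = gxx * gyy - gxy * gxy
        if abs(det) < 1e-12:  # G is singular: no unique solution
            break
        new_x = (gyy * bx - gxy * by) / det
        new_y = (gxx * by - gxy * bx) / det
        shift = math.hypot(new_x - qx, new_y - qy)
        qx, qy = new_x, new_y
        if shift < eps:  # the centre stayed within the threshold
            break
    return qx, qy


# Synthetic corner at (2.3, 4.7): a horizontal edge (gradient along y)
# and a vertical edge (gradient along x) meeting at the corner.
corner = (2.3, 4.7)
samples = []
for t in (0.5, 1.0, 1.5, 2.0):
    samples.append(((corner[0] + t, corner[1]), (0.0, 1.0)))
    samples.append(((corner[0], corner[1] + t), (1.0, 0.0)))

refined = refine_corner(samples, q0=(2.0, 5.0))
print(refined)
```

Because every sample point lies exactly on an edge through the true corner, each term $DI_{p_i}^{T} \cdot (q - p_i)$ vanishes at the corner, so the refined estimate moves from the pixel-accurate start to the sub-pixel location.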
Without such an estimate, the algorithms may be attracted to noise instead of the desired features.

Features Extraction based on Convexity Hull Defects: In our approach the human body is described by convexity defect triangles. The body is represented by triangles with three vertices called the convexity defect start $(x_{ds}, y_{ds})$, the defect end $(x_{de}, y_{de})$ and the defect position point $(x_{dp}, y_{dp})$, labeled $P_1$, $P_2$ and $P_3$ respectively. We applied the convex hull method to the extracted contours and obtained many convexity defects, including areas with very small depth, even a depth of 0. These areas do not contain features to be extracted. We therefore defined the area of interest, which contains the parts of the body, as the area with depth > 50. That depth value was obtained empirically. This left 5 convex regions that satisfy the condition. Then we determined the region of the human body that contains the three parts: chest, waist and hips. We continued applying the convex hull to locate the waist. Once the coordinates of the points were determined, we calculated the distances between the points in pixels and finally converted the measurements into cm.

The convexity defect triangle is determined by $(x_{ds}, y_{ds})$, $(x_{de}, y_{de})$ and $(x_{dp}, y_{dp})$. A convexity defect is present wherever the contour of the object is away from the convex hull drawn around the same contour. The convexity defect computation gives a set of values for every defect in the form of a vector.

Fig. 4. Flowchart of feature extraction based on convex hull.

This vector contains the start and end points of the defect line on the convex hull. These points are given as indices into the coordinate points of the contour, so they can easily be retrieved from the contour vector using the start and end indices of the defect. The convexity defect also includes the index of the depth point in the contour and its depth value measured from the line.
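To make the defect vector described above concrete, here is a minimal sketch that computes (start index, end index, depth point index, depth) tuples for a toy contour and keeps only defects deeper than a threshold. It is an illustration under stated assumptions: the hull is built with Andrew's monotone chain rather than Sklansky's algorithm, the helper names are hypothetical, and the toy contour and `min_depth=1.0` stand in for a real silhouette and the empirical depth of 50.

```python
import math


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain hull (swapped in for Sklansky's algorithm)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity_defects(contour, min_depth):
    """Return (start_idx, end_idx, far_idx, depth) for each deep defect.

    Assumes the contour is an ordered list of distinct (x, y) points.
    """
    hull = set(convex_hull(contour))
    hull_idx = [i for i, p in enumerate(contour) if p in hull]
    if len(hull_idx) < 3:
        return []
    n = len(contour)
    defects = []
    for k, start in enumerate(hull_idx):
        end = hull_idx[(k + 1) % len(hull_idx)]
        a, b = contour[start], contour[end]
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        far_idx, depth = None, 0.0
        i = (start + 1) % n
        while i != end:  # contour points lying between two hull vertices
            d = abs(cross(a, b, contour[i])) / length
            if d > depth:
                far_idx, depth = i, d
            i = (i + 1) % n
        if far_idx is not None and depth > min_depth:
            defects.append((start, end, far_idx, depth))
    return defects


# Toy contour: a square with a notch cut into its top side.
contour = [(0, 0), (10, 0), (10, 10), (6, 10), (5, 4), (4, 10), (0, 10)]
defects = convexity_defects(contour, min_depth=1.0)
print(defects)
```

For this contour the only defect is the notch: its start and end are the two top corners of the square, and its depth point is the bottom of the notch, which together form the triangle $P_1$, $P_2$, $P_3$ discussed in the text.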
In fact, a person may be represented by many defect triangles, since the silhouette is piecewise convex. In this approach, however, we are interested in the two triangles with the largest area. These correspond to the leg-armpit-arm triangles and include the locations of the parts we are interested in: chest, waist and hips. Finally, we obtain the coordinates of the points on the body.

In this paper we therefore propose a simpler and cheaper system compared with other systems, see e.g. [16]. In our experiments we used a single digital camera and an A4 sheet (210 × 297 mm) for calibration. Source images for the method must be captured in a special way: with the given background and calibration sheet, and with the person standing straight with arms stretched. Three dimensions (chest circumference, waist circumference and hip circumference) were selected because of their relevance to clothing sizing and to human classification, which was the main purpose of our system.

Table 1 shows some of the results (from 50 measurements of people in the database) of measuring the sizes of the human body using the convex hull method, compared with the manual method.

Table 1. Results of measurement sizes of human body using convex hull method.

BODY SIZES   Manual method   Convex hull method
Chest        87.98 cm        88.12 cm
Waist        67.95 cm        68.05 cm
Hip          90.52 cm        91.68 cm

Chest        88.64 cm        89.02 cm
Waist        66.61 cm        67.13 cm
Hip          93.58 cm        94.93 cm

Chest        87.22 cm        88.15 cm
Waist        67.19 cm        67.96 cm
Hip          89.36 cm        89.01 cm

Chest        88.16 cm        89.46 cm
Waist        65.64 cm        66.42 cm
Hip          92.17 cm        93.02 cm

Chest        90.96 cm        91.23 cm
Waist        71.44 cm        70.82 cm
Hip          93.56 cm        94.19 cm

Chest        86.94 cm        87.12 cm
Waist        67.68 cm        68.12 cm
Hip          85.28 cm        84.56 cm

Measurement errors are mostly caused by camera resolution, loose clothing and environmental noise near the object. In our case, we used the basic phone camera of a Samsung Galaxy S4 with a resolution of 13 megapixels.
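The two numerical steps in this part, converting pixel distances to centimetres with the A4 sheet and comparing the two measurement methods, can be sketched as follows. The conversion assumes that the long side of the sheet (29.7 cm) is measured in the image and that the sheet lies at the same distance from the camera as the body; the function names are hypothetical. The value pairs are copied from Table 1 in (manual, convex hull) order.

```python
A4_LONG_SIDE_CM = 29.7  # long side of an A4 sheet (297 mm)


def pixels_to_cm(length_px, sheet_long_side_px,
                 sheet_long_side_cm=A4_LONG_SIDE_CM):
    """Convert a pixel distance to cm using the calibration sheet.

    Assumption: the sheet and the body are at the same distance from
    the camera, so one scale factor applies to both.
    """
    if sheet_long_side_px <= 0:
        raise ValueError("sheet size in pixels must be positive")
    return length_px * sheet_long_side_cm / sheet_long_side_px


# (manual, convex hull) pairs in cm, in the order they appear in Table 1.
TABLE_1 = {
    "Chest": [(87.98, 88.12), (88.64, 89.02), (87.22, 88.15),
              (88.16, 89.46), (90.96, 91.23), (86.94, 87.12)],
    "Waist": [(67.95, 68.05), (66.61, 67.13), (67.19, 67.96),
              (65.64, 66.42), (71.44, 70.82), (67.68, 68.12)],
    "Hip":   [(90.52, 91.68), (93.58, 94.93), (89.36, 89.01),
              (92.17, 93.02), (93.56, 94.19), (85.28, 84.56)],
}


def mean_absolute_error(pairs):
    """Average absolute difference between the two methods, in cm."""
    return sum(abs(manual - hull) for manual, hull in pairs) / len(pairs)


mae = {part: mean_absolute_error(pairs) for part, pairs in TABLE_1.items()}
for part, value in mae.items():
    print(f"{part}: mean absolute difference {value:.2f} cm")
```

For the rows shown in Table 1, the mean absolute difference stays below 1 cm for all three body parts, with the hip measurement showing the largest deviation.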
We recommend using a high-resolution camera with the flash turned on while the photos are captured, and people should wear tight clothes, in order to reduce noise and measurement errors as much as possible. In addition, we averaged over several measurements to reduce measurement and calibration errors.

The circumferential measures were generated by approximating the shape of the respective body part. For example, the neck circumference was approximated with an elliptical shape. The lengths of the major and minor axes were obtained from the front and side views. The chest circumference was determined by approximating the shape as a combination of a rectangle and an ellipse, using the method and formula given in [16].

4 Anthropometric Data Analysis

Based on the anthropometric feature extraction method described in the previous section, we collected the sizes of human body parts. The training set contains the sizes of 50 people (25 men and 25 women), and the test set contains the sizes of 18 people (10 men and 8 women). Each measurement contains 3 features for each person: chest, waist and hip circumferences.

4.1 Classification of Men/Women using Support Vector Machine (SVM)

To solve the classification problem we used a well-known machine learning method, the support vector machine (SVM), proposed by Vladimir Vapnik [11]. We chose the SVM as one of the most effective algorithms, with many real-world applications [17]. Our goal is to use a support vector machine for gender classification based on anthropometric data. As features we use three human body parameters: chest, waist and hip circumferences.

The LibSVM library [15] was used as the support vector machine implementation. The following main SVM parameters were chosen. A radial basis function with parameter γ = 0.333 was used as the kernel function. The SVM model parameters gamma (γ) and cost (C) were obtained empirically. The gamma parameter is needed for all types of kernels except linear.
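For readers unfamiliar with the role of γ, the radial basis function kernel in the form LIBSVM uses, $K(u, v) = \exp(-\gamma \lVert u - v \rVert^2)$, can be evaluated directly. The sketch below is illustrative only: the function name `rbf_kernel` is hypothetical, and the two feature vectors are simply the first two (chest, waist, hip) rows of the manual column of Table 1, used unscaled.

```python
import math

GAMMA = 0.333  # kernel parameter reported in the text


def rbf_kernel(u, v, gamma=GAMMA):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# Two (chest, waist, hip) vectors in cm, taken from Table 1 (manual method).
person_a = (87.98, 67.95, 90.52)
person_b = (88.64, 66.61, 93.58)

print(rbf_kernel(person_a, person_a))              # identical vectors
print(rbf_kernel(person_a, person_b))              # gamma = 0.333
print(rbf_kernel(person_a, person_b, gamma=0.001)) # much smaller gamma
```

A larger γ makes the kernel value fall off faster with distance, so each training point influences a smaller neighborhood; a smaller γ gives a smoother decision boundary.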
The constant C is a regularization term in the Lagrange formulation. The parameters were searched over supplied ranges (C: cost, γ: gamma) using the training set. The range for the gamma parameter is between 0.000001 and 0.1. For the cost parameter the range is from 0.1 to 10. It is important to understand the influence of these two parameters, because the accuracy of an SVM model largely depends on their selection. For example, if C is too large, there is a high penalty for non-separable points, and the model may store many support vectors and over-fit. If it is too small, the model may under-fit.

The results of this algorithm are shown in Fig. 5. The obtained test classification error for the current dataset is 20%.

Fig. 5. The result of classification by SVM. Blue dots: men, red dots: women, green dots: support vectors.

5 Conclusion

Human classification is a useful application for many future scenarios of human-computer interaction. The approach presented in this paper classifies people based on anthropometrical features using machine learning. The proposed approach makes it possible to extract a number of anthropometrical features from images, including the length of the arms, chest width, shoulder width, hips and leg length. In this article we propose a solution to the image-based male/female classification task using a support vector machine. The obtained test error is fairly large, but in future work we are going to examine more classifiers to choose the best one and to use bigger datasets to reduce the misclassification rate. Based on these results, we hope to build a solution that can help to solve tasks of the clothing industry.

References

1. Lin, Y.-L., Wang, M.-J.J.: Automatic Feature Extraction from Front and Side Images. In: International Conference on Industrial Engineering and Engineering Management, pp. 1949–1953. IEEE, Singapore (2008)
2. Lin, Y.-L., Wang, M.-J.J.: Constructing 3D Human Model from 2D Images. In: International Conference on Industrial Engineering and Engineering Management, pp. 1902–1906. IEEE, Xiemen (2010)
3. Lin, Y.-L., Wang, M.-J.J.: Constructing 3D Human Model from Front and Side Images. Expert Systems with Applications, Vol. 39, No. 5, 5012–5018 (2012)
4. Rahman, S.A., Cho, S.-Y., Leung, M.K.H.: Recognizing Human Actions by Analyzing Negative Spaces. IET Computer Vision, Vol. 6, No. 3, 197–213 (2012)
5. Pickup, D., Sun, X.-F., Rosin, P.L., Martin, R.R., Cheng, Z.-Q., Lian, Z., Aono, M., BenHamza, A., Bronstein, A., Bronstein, M., Bu, S., Castellani, U., Cheng, S., Garro, V., Giachetti, A., Godil, A., Han, J., Johan, H., Lai, L., Li, B., Li, C., Li, H., Litman, R., Liu, X., Liu, Z., Lu, Y., Tatsuma, A., Ye, J.: SHREC'14 Track: Shape Retrieval of Non-Rigid 3D Human Models. In: Proc. of the 7th Eurographics Workshop on 3D Object Retrieval, pp. 101–110. Eurographics, Strasbourg (2014)
6. Barnard, M., Matilainen, M., Heikkila, J.: Body Part Segmentation of Noisy Human Silhouette Images. In: Multimedia and Expo, IEEE International Conference on, pp. 1189–1192. IEEE, Hannover (2008)
7. Jiang, L., Yao, J., Li, B., Fang, F., Zhang, Q., Meng, M.Q.-H.: Automatic Body Feature Extraction from Front and Side Images. Journal of Software Engineering and Applications, Vol. 5, No. 12, 94–100 (2012)
8. Freeman, H.: On the Encoding of Arbitrary Geometric Configuration. IRE Transactions on Electronics Computers, Vol. EC-10, No. 2, 264–268 (1961)
9. Mittal, A., Zhao, L., Davis, L.S.: Human Body Pose Estimation using Silhouette Shape Analysis. In: Advanced Video and Signal Based Surveillance, 2003. Proceedings. IEEE Conference on, pp. 263–270. IEEE, USA (2003)
10. Mori, G., Ren, X., Efros, A.A., Malik, J.: Recovering Human Body Configurations: Combining Segmentation and Recognition. In: Computer Vision and Pattern Recognition. Proceedings of the 2004 IEEE Computer Society Conference on, Vol. 2, pp. 326–333. IEEE, Washington (2004)
11. Cortes, C., Vapnik, V.: Support-Vector Networks. Machine Learning, Vol. 20, No. 3, 273–297 (1995)
12. Han, E.: 3D Body-Scanning to Help Online Shoppers Find the Perfect Clothes Fit. The Sydney Morning Herald. National Newspaper (Australia) (2015)
13. MacQueen, J.B.: Some Methods for Classification and Analysis of Multivariate Observations. In: Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability 1, pp. 281–297. University of California Press, USA (2009)
14. Sklansky, J.: Finding the Convex Hull of a Simple Polygon. Pattern Recognition Letters, Vol. 1, No. 2, 79–83 (1982)
15. Chang, C.-C., Lin, C.-J.: LIBSVM - A Library for Support Vector Machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm
16. Kohlschetter, T.: Human Body Modelling by Development of the Automatic Landmarking Algorithm. Technical Report No. DCSE/TR-2012-11 (2012)
17. Wang, L. (ed.): Support Vector Machines: Theory and Applications. Vol. 177. Springer Science & Business Media (2005)
18. OpenCV 2.4.11.0 documentation, http://docs.opencv.org/modules/imgproc/doc/feature_detection.html