Shape Descriptor Based on Curvature

Abstract

Today, applications developed in the imaging area focus on image measurement and compression, facial recognition, blurring, or image enhancement, to name a few. Among these activities, shape descriptors play an important role in describing the topology of an object and determining its general and local characteristics. Despite the large number of studies carried out, a robust shape descriptor remains a difficult problem: the challenges that hinder shape recognition stem from transformations such as rotation and scaling, and from deformations caused by noise or occlusion. This article presents a descriptor based on curvature. The procedure reduces the dimensionality of the data so that the variability of the shapes can be described by a set of features; a neural network then associates these features to discriminate one type of object from another.

Share and Cite:

Romero-González, J.-A., Herrera-Navarro, M. and Jiménez-Hernández, H. (2022) Shape Descriptor Based on Curvature. Open Access Library Journal, 9, 1-13. doi: 10.4236/oalib.1108422.

1. Introduction

In recent years, researchers in various areas of computer vision have focused their efforts on developing algorithms capable of imitating human vision. To do this, they have developed artificial vision systems in which cameras are used to recognize patterns [1], monitor scenes [2], analyze trajectories [3], or infer motion [4]. Among these activities, shape descriptors play an important role in describing the topology of an object and determining its general and local characteristics.

A shape descriptor is an effective tool for recognition [5], classification [6], or identification [7], because shape is inherent to the understanding of an image and remains stable despite changes in the lighting, color, or texture of an object. Owing to these advantages, shape features have been widely used in various methods, such as polygon approximation [8], spatial correlation features [9], moments [10], spatial scaling methods [11], and shape transformations [12]. A robust shape descriptor is still a hard problem, and the challenges that make shape recognition difficult arise from transformations such as rotation and scaling, deformation caused by noise or occlusion [13], and undesirable effects caused by shadows [14].

In general, global descriptors can only distinguish shapes that differ greatly, owing to the simplicity of the properties they measure, whereas local descriptors focus on specific aspects of the shape and use them as discriminants. A shape can be described by different aspects such as the center of gravity [15], eccentricity [16], circularity ratio [17], elliptical variance [18] - [22], rectangularity [23], or area ratio [24], to name a few. Descriptors attempt to quantify a given shape by a set of numbers derived from its characteristic aspects, so a good robust descriptor must satisfy the following requirements:

1) Preservation of similarity between shapes of the same type.

2) Invariance to similarity transformations, such as translation, rotation, and scaling, as well as to affine transformations.

3) Noise resistance.

4) Occlusion invariance.

5) Statistical independence (compactness).

6) Reliability of the extracted features.

Techniques for shape representation are broad and are classified into contour-based methods [25] and region-based methods [14]. Contour-based methods depend on detecting features found on the shape's contour, such as points, edges, and lines [26], whereas region-based methods obtain a set of feature vectors from the regions occupied by objects [27]. In this paper, an analysis of shape curvature is presented, in which non-smooth points of the curve provide the feature elements used to train a multilayer perceptron neural network to classify the shapes in a scene. The paper is organized as follows. Section 2 describes related work. Section 3 presents the feature extraction. The experiments and results obtained with this method are shown in Section 4. Finally, conclusions are drawn in Section 5.

2. Related Work

The general process of a traditional object recognition algorithm is as follows: to evaluate the shape of an object, descriptors of the shape's characteristics are first extracted; then, a matching algorithm is applied to obtain the distance between descriptors; finally, recognition is completed through classification and metric learning. During this process, the descriptor's ability to distinguish shapes directly affects the recognition result; thus, most work on shape recognition has concentrated on descriptors.

An example is the skeleton, a structure that describes the topology and geometry of the shape of an object. Such a representation also allows a simpler and computationally effective way to analyze the properties of the shape. Current skeletonization techniques have properties such as homotopy, invariance, and smoothing [12], that is, the object can be represented as a continuous function that is invariant to scale; and reconstruction and detail preservation [21], that is, the object can be reconstructed from its structure by finding its main details, reducing the amount of processing required to analyze it.

Among these techniques, [1] obtains the topology of the object through an iterative process that labels the neighboring pixels of the border formed between the foreground and the background of the image; these pixels are removed and relabeled to generate a new border until the structure is thinned to one pixel. Similarly, an algorithm called ZS was developed in [5], which combines a subdivision of the image into two subfields of odd and even coordinates. Meanwhile, in [19] a parallel thinning algorithm based on the medial axis was developed for contour extraction, suppressing external contours until a thinning of one pixel width was achieved. For their part, the authors of [21] proposed an improved thinning algorithm for detecting moving objects, confirming the speed of the algorithm in their evaluation.
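The two-subiteration thinning scheme that the "ZS" algorithm of [5] builds on is commonly read as the classical Zhang-Suen method. A minimal sketch of that classical scheme (not the subfield variant of [5]) for a binary image given as a list of 0/1 rows:

```python
def zhang_suen_thinning(img):
    """Thin a binary image toward a 1-pixel-wide skeleton using the
    classical Zhang-Suen two-subiteration scheme.  Border pixels are
    assumed to be 0; `img` is modified in place and returned."""
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, len(img) - 1):
                for x in range(1, len(img[0]) - 1):
                    if img[y][x] != 1:
                        continue
                    # 8-neighbours P2..P9, clockwise starting from north
                    p = [img[y-1][x], img[y-1][x+1], img[y][x+1],
                         img[y+1][x+1], img[y+1][x], img[y+1][x-1],
                         img[y][x-1], img[y-1][x-1]]
                    b = sum(p)  # B(P1): number of non-zero neighbours
                    # A(P1): 0->1 transitions in the circular sequence
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1
                            for i in range(8))
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0 and p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0:
                        to_delete.append((y, x))
                    elif step == 1 and p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0:
                        to_delete.append((y, x))
            for y, x in to_delete:
                img[y][x] = 0
                changed = True
    return img
```

Deletions are collected per subiteration and applied afterwards, so every decision in a pass is made on the same snapshot of the image.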

Although skeletons provide an alternative that reduces processing complexity and works directly with the object's topology, these methods are not highly regarded because of their sensitivity to detail, and they can modify the object's topology. Other techniques therefore seek to locate characteristic elements in the structure of the scene that define the objects present under varying conditions and that explicitly represent the scales at which the different events occur. The complexity of these techniques stems from the fact that the changing conditions of spatial-scale image representations are caused by illumination changes, shadows, noise inherent to the camera, and scale changes. The method in [22] extracts features of large structures using a family of smoothed signals that represent the data at multiple scales, obtained by solving a diffusion equation defined as $\partial L / \partial t = \tfrac{1}{2} \nabla^2 L$, where L is the convolution of an image with a Gaussian operator, in order to construct a scale-invariant representation of the image structure. On the other hand, the method in [18] analyzes the correlation between typical conditions in the scene and the entropy patterns of a sequence of images, providing an abstraction from a scale-space blob to a gray-level blob at a single scale level; that is, the abstraction involves a projection from a four-dimensional scale-space blob to a three-dimensional gray-level blob, and a projection of the three-dimensional blob onto its two-dimensional support region.
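The smoothed-signal family described above is the classical Gaussian scale space: convolution with Gaussians of increasing width solves the diffusion equation, with scale parameter t = sigma². A minimal sketch, assuming NumPy and SciPy are available; the sigma values are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(image, sigmas):
    """Stack of progressively smoothed versions of `image`.

    Gaussian convolution is the standard solution of the diffusion
    equation dL/dt = 0.5 * laplacian(L) with t = sigma**2; this is the
    general scale-space identity, not the specific construction of [22]."""
    return np.stack([gaussian_filter(image, sigma=s) for s in sigmas])

rng = np.random.default_rng(0)
stack = scale_space(rng.random((64, 64)), sigmas=[1.0, 2.0, 4.0])
```

Each successive level suppresses finer structure, which is what makes features extracted at coarse levels stable under small-scale perturbations.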

In recent years, many methods have attempted to find an optimal one-to-one correspondence between the densely sampled boundary points of two shapes, and from there compute a measure of dissimilarity. Among these methods, shape context [13] is a classical one that is very popular among researchers. The shape context constructs a histogram for each contour point describing the distribution of the remaining points relative to that point, and these histograms are used to find point-to-point correspondences. The well-known shape description method proposed in [14] uses the inner distance instead of the Euclidean distance to construct the shape context, where the inner distance is the shortest path between two contour points within the constraints of the shape. They report a recognition rate of 94.13% on the Swedish Leaf dataset, higher than the 88.12% obtained by shape context [13]. Owing to IDSC's high classification accuracy, [6] successfully applied it to develop a computer vision system to aid in the identification of plant species. Height functions [20] are another shape description and matching method for optimal one-to-one matching. In this method, for each contour point of the object, a height function is constructed based on the distances of the other contour points to its tangent line. After smoothing the height function, a compact and robust shape descriptor is available for each contour point. In the shape matching stage, a dynamic programming algorithm finds the optimal match between the two shapes. The height function method has excellent discriminative power.
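The per-point histogram construction behind shape context can be sketched as follows; the bin counts and log-radius edges are illustrative choices, not the values of [13]:

```python
import numpy as np

def shape_context(points, index, n_r=5, n_theta=12):
    """Log-polar histogram of the positions of all other contour points
    relative to points[index] -- the descriptor idea of shape context.
    Radii are normalised by the mean distance for scale invariance."""
    p = points[index]
    rel = np.delete(points, index, axis=0) - p
    r = np.hypot(rel[:, 0], rel[:, 1])
    theta = np.arctan2(rel[:, 1], rel[:, 0])          # in [-pi, pi]
    r_norm = r / r.mean()
    # illustrative log-spaced radial edges; out-of-range radii are clipped
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    r_bin = np.clip(np.digitize(r_norm, r_edges) - 1, 0, n_r - 1)
    t_bin = ((theta + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta))
    np.add.at(hist, (r_bin, t_bin), 1)
    return hist

pts = np.array([[np.cos(t), np.sin(t)]
                for t in np.linspace(0, 2 * np.pi, 40, endpoint=False)])
h = shape_context(pts, 0)
```

Matching then compares such histograms across the two shapes to establish point-to-point correspondences.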

There are other shape analysis techniques that focus on the singular value decomposition (SVD) for object recognition, especially of people and cars; the foundations of the decomposition are reviewed in [26], and it has been extensively studied. Some of the applications developed have been in image measurement and compression, facial recognition, blurring, or image enhancement, to name a few. However, despite much research in the field of image processing, very little has been done on the use of singular values for identification.

3. Feature Extraction

Consider a 2D binary image represented by I(x) with dimensions m × n, where x = [x_m, y_n] denotes the position of a particular pixel. A uniform region A is then a collection of pixels with values in {0, 1} that defines a binary object (blob) in image I. Overlap between the regions A is not considered, as expressed in (1):

$\bigcap \{ A_i : A_i \subseteq I \} = \emptyset$ (1)

Now it is possible to determine the edge of region A as in (2):

$\delta A = A \circledast B = \{\, x \mid (B)_x \cap A \neq \emptyset \,\}$ (2)

The operation $\circledast$ is defined as $\circledast : I^{m \times n} \times I^{m \times n} \to I^{m \times n}$ such that $(A, B) \mapsto C$, where $c_{ij} = a_{ij} \circledast b_{ij}$ for all $A, B \in I^{m \times n}$.
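The operator in Equation (2) is reconstructed from garbled text, so the sketch below uses the common erosion-based definition of the internal boundary instead: the pixels of A that do not survive erosion by the structuring element B. This assumes SciPy is available:

```python
import numpy as np
from scipy.ndimage import binary_erosion

def boundary(A, B=np.ones((3, 3), dtype=bool)):
    """Internal morphological boundary of a binary region A: the pixels
    of A removed by erosion with structuring element B.  One standard
    way to extract the edge delta-A of a blob."""
    A = A.astype(bool)
    return A & ~binary_erosion(A, structure=B)

blob = np.zeros((7, 7), dtype=bool)
blob[1:6, 1:6] = True          # a 5x5 square region
edge = boundary(blob)
```

For the 5x5 square this yields its 16 perimeter pixels, all of which lie inside A, as the definition requires.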

Now consider Figure 1, which shows the radius of curvature and the center of curvature C for a segment of the curve in Figure 2. The vector Ti represents the tangent vector to the curve s, Ti+1 is the displaced tangent vector, and C is the center of curvature.

The distance between Ti and the center of curvature C is the radius of curvature d, defined as in (3):

$d = \lim_{\Delta\theta \to 0} \dfrac{\Delta r}{\Delta \theta} = \dfrac{dr}{d\theta}$ (3)

Figure 1. Radius of curvature.

Figure 2. Determination of characteristic points of the shape. Source: own elaboration.

where

$dr = \sqrt{1 + \left(\dfrac{dy}{dx}\right)^2}\, dx$ (4)

$\tan(\theta) = \dfrac{dy}{dx}$ (5)

Thus, Equation (3) becomes (6):

$d = \dfrac{\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx}{d\left(\tan^{-1}\left(\frac{dy}{dx}\right)\right)}$ (6)

The radius of curvature is defined as in (7).

$d = \dfrac{\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2}}{\frac{d^2 y}{dx^2}}$ (7)
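Equation (7) can be checked numerically. A sketch using central finite differences (step size h is an illustrative choice), verifying that a circle of radius R yields |d| = R everywhere:

```python
import math

def radius_of_curvature(f, x, h=1e-4):
    """Evaluate Equation (7), d = [1 + (y')^2]^(3/2) / y'', for y = f(x)
    using central finite differences for the derivatives."""
    dy = (f(x + h) - f(x - h)) / (2 * h)
    d2y = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return (1 + dy**2) ** 1.5 / d2y

R = 5.0
circle = lambda x: math.sqrt(R**2 - x**2)  # upper half of a circle of radius R
d = radius_of_curvature(circle, 0.7)
```

The sign of d depends on the sign of the second derivative; its magnitude is the geometric radius.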

Figure 2 shows a curve s in R², where $s = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$. Since the curvature of a curve is always zero on a line or line segment, a local maximum or minimum of the curvature is a feature of the shape.

To determine a feature candidate, three points (a, b, and c) of the curve s are selected, through which a circle C is drawn. The closer these points are to P, the smaller the radius of curvature r and the larger the curvature; this procedure is repeated until C converges at P.
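The radius of the circle through three contour samples can be computed from the triangle they form, using r = abc / 4K with K the triangle's area (a standard identity, shown here as an illustrative sketch):

```python
import math

def circumradius(a, b, c):
    """Radius of the circle through three points, used to approximate
    the osculating circle at a contour point.  Returns infinity for
    collinear points (zero curvature)."""
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    s = (ab + bc + ca) / 2
    # Heron's formula; clamp to 0 to guard against tiny negative rounding
    area = math.sqrt(max(s * (s - ab) * (s - bc) * (s - ca), 0.0))
    if area == 0.0:
        return math.inf
    return ab * bc * ca / (4 * area)

# three samples of a unit circle recover radius ~1 (curvature ~1)
pts = [(math.cos(t), math.sin(t)) for t in (0.0, 0.3, 0.6)]
r = circumradius(*pts)
```

As the three samples approach P, this radius converges to the radius of curvature at P.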

Because some edge contours may deviate from the actual shape, a threshold angle α is used. The angle β is computed from the tangents aT1 and aT2 on both sides of the feature P; if β is greater than the angle threshold α, the candidate feature is removed from the feature set.

If the three points are collinear and the radius of curvature is large, the feature is discarded. The proposed method can be summarized by the following algorithm:

1) Extract the outline of the shape by binary erosion or dilation, or any edge extraction method.

2) Determine the limiting, or osculating, circle to find the curvature at each pixel of the shape.

3) Calculate the radius of curvature r to determine the candidate features.

4) Calculate the angle β at each candidate feature obtained in step 3 and compare it to the angle threshold α to eliminate possible false candidates.

5) Construct a vector from the candidate features that were not removed, which is used to quantify the shape (see Figure 3).
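Steps 2-5 above can be sketched as follows. The parameters `r_max`, `alpha_deg`, and `step` are illustrative (the paper reports α = 165 but does not specify the others), and β is interpreted here as the interior angle at P between its two neighbouring chords, one plausible reading of the tangent construction:

```python
import math

def curvature_features(contour, r_max=5.0, alpha_deg=165.0, step=2):
    """Detect high-curvature feature candidates on a closed contour
    (list of (x, y) points).  For each point P, the osculating circle
    is approximated by the circumcircle of P and its two neighbours
    `step` positions away; collinear triples (zero curvature) and
    nearly straight candidates (beta > alpha) are rejected."""
    feats = []
    n = len(contour)
    for i in range(n):
        a, p, c = contour[(i - step) % n], contour[i], contour[(i + step) % n]
        # circumradius via Heron's formula: r = abc / 4K
        ab, bc, ca = math.dist(a, p), math.dist(p, c), math.dist(c, a)
        s = (ab + bc + ca) / 2
        area_sq = max(s * (s - ab) * (s - bc) * (s - ca), 0.0)
        if area_sq == 0.0:
            continue                       # collinear: discard (step 3)
        r = ab * bc * ca / (4 * math.sqrt(area_sq))
        if r > r_max:
            continue                       # radius too large: not a feature
        # interior angle beta at P between the two side chords (step 4)
        v1 = (a[0] - p[0], a[1] - p[1])
        v2 = (c[0] - p[0], c[1] - p[1])
        cos_b = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        beta = math.degrees(math.acos(max(-1.0, min(1.0, cos_b))))
        if beta > alpha_deg:
            continue                       # nearly straight: false candidate
        feats.append((i, r))               # step 5: keep (index, radius)
    return feats
```

On the contour of a square sampled at unit spacing, the four corners (and their immediate neighbours) are retained while points on the straight edges are discarded as collinear.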

Figure 3(a) shows the shape with the detected features superimposed, which correspond to the feature vector shown in Figure 3(b). The feature vector quantifies the shape so that it can be compared with other types of shapes, while Figure 4 compares the features found with the shape signature.

The feature vectors are quantified in n dimensions, $(x_1, x_2, \ldots, x_n)$, and the k-means algorithm associates them with k groups $C = \{C_1, C_2, \ldots, C_k\}$ by minimizing the distance of each vector to the centroid of its class:

$c = \arg\min_{C} \sum_{i=1}^{k} \sum_{x_j \in C_i} \lVert x_j - \mu_i \rVert^2$ (8)

This grouping is performed with the following steps:

Figure 3. Feature vector. Source: own elaboration.

Figure 4. Feature comparison with the signature shape. Source: own elaboration.

● Initialize k groups and set k random centroids.

● Assign each datum to the group with the closest centroid.

● Calculate the new centroid for each group, taking the position of the mean of the data belonging to the k groups as the new centroid.

● Repeat the steps of assigning data to groups and computing new centroids until convergence is reached.
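The steps above can be sketched in NumPy. The text initialises centroids at random; for reproducibility this sketch uses a deterministic farthest-point seeding instead, which is a common alternative and is labeled as such in the code:

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Minimal k-means sketch: seed centroids, assign each datum to the
    closest centroid, recompute centroids as group means, and repeat
    until convergence (or `iters` rounds)."""
    # deterministic farthest-point seeding (the paper's steps use random
    # centroids; this substitute makes the example reproducible)
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        # assign each datum to the group with the closest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each centroid as the mean of its group
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
labels, cents = kmeans(X, k=2)
```

On two well-separated synthetic clusters, the algorithm assigns each cluster's points to a single group.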

4. Experiments and Results

This section presents the experiments and the performance of the descriptor. First, the results of the scale and rotation tests are shown, followed by an evaluation against other descriptors, for which 5 sets of 20 images were evaluated using the MPEG7 CE-Shape-1 dataset [25]. The MPEG7 CE-Shape dataset consists of 1400 images in 70 classes, with each class containing 20 shapes. Examples of object shapes from the MPEG7 Part B dataset are shown in Figure 5.

The repeatability tests are shown in Figure 6, for which a subset of 20 images was taken. The tests include scaling the image in steps of 20% and rotations from 0˚ to 180˚. In the results, the number of features obtained decreases as the scale decreases; however, even though the algorithm is not designed for a scale space, most of the detected features persist. As an improvement, a scale-space formulation could be considered to reduce the error generated.

Figure 7 shows the tests carried out to evaluate detection performance based on the repeatability of the features belonging to the shape. Different sets of images were used to assess the average repeatability, including “Camel-1”, “Beetle-1”, “Deer-1”, and “Elephant-1”. Transformed images were generated by applying six different types of transformations to each original image.

Figure 5. Example of objects selected from each class from the MPEG7 Part B dataset [25].

Figure 6. Rotation and scaling of the form. Source: own elaboration.

The experimental results showed that the proposed descriptor has strong discriminability (see Table 1). In these experiments, the parameters of the algorithm were α = 165 and Bx = 3.

Figure 7. Repeatability of characteristic elements in different sets. Source: own elaboration.

Table 1. Quantified values of shapes.

Table 2. Scores of different descriptors on the MPEG-7.

The BER rates of different shape representation techniques found in the literature are compared with the proposed method in Table 2. The proposed descriptor is among those with the best classification percentage of the evaluated approaches.

5. Conclusion

This paper proposes a method to distinguish the shapes of different classes of objects and compare them with similar objects. The results show that the feature similarity accuracy is 85.4%, but the method lacks shape recovery. The main weakness of the algorithm is that the error in describing the shape increases as the image is scaled down. The reason for this error is that some objects yield few features, making it difficult to compare their shapes. To address this problem, scale-space analysis should be considered when obtaining the feature vectors.

Acknowledgements

This work has been partially carried out thanks to the support of the scholarship granted by Consejo Nacional de Ciencia y Tecnología (CONACyT).

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Abu-Ain, W., Abdullah, S.N.H.S., Bataineh, B., Abu-Ain, T. and Omar, K. (2013) Skeletonization Algorithm for Binary Images. Procedia Technology, 11, 704-709. https://doi.org/10.1016/j.protcy.2013.12.248
[2] Arjun, P. and Mirnalinee, T.T. (2018) An Efficient Image Retrieval System Based on Multi-Scale Shape Features. Journal of Circuits, Systems and Computers, 27, Article ID: 1850174. https://doi.org/10.1142/S0218126618501748
[3] Bhanu, B. and Peng, J. (2000) Adaptive Integrated Image Segmentation and Object Recognition. IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, 30, 427-441. https://doi.org/10.1109/5326.897070
[4] Blas, M.R., Agrawal, M., Sundaresan, A. and Konolige, K. (2008) Fast Color/Texture Segmentation for Outdoor Robots. 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, 22-26 September 2008, 4078-4085. https://doi.org/10.1109/IROS.2008.4651086
[5] Ben Boudaoud, L., Sider, A. and Tari, A. (2015) A New Thinning Algorithm for Binary Images. 2015 3rd International Conference on Control, Engineering & Information Technology, 25-27 May 2015, 1-6. https://doi.org/10.1109/CEIT.2015.7233099
[6] Chai, X., Wen, F. and Yuan, K. (2011) Fast Vision-Based Object Segmentation for Natural Landmark Detection on Indoor Mobile Robot. 2011 IEEE International Conference on Mechatronics and Automation, Beijing, 7-10 August 2011, 2232-2237. https://doi.org/10.1109/ICMA.2011.5986286
[7] Chen, L., Guo, B. and Sun, W. (2010) Obstacle Detection System for Visually Impaired People Based on Stereo Vision. 2010 4th International Conference on Genetic and Evolutionary Computing, Shenzhen, 13-15 December 2010, 723-726. https://doi.org/10.1109/ICGEC.2010.183
[8] Ciric, I., Cojbasic, Z., Nikolic, V. and Antic, D. (2013) Computationally Intelligent System for Thermal Vision People Detection and Tracking in Robotic Applications. 2013 11th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Services, Nis, 16-19 October 2013, 587-590. https://doi.org/10.1109/TELSKS.2013.6704447
[9] Cornea, N.D., Silver, D. and Min, P. (2007) Curve-Skeleton Properties, Applications, and Algorithms. IEEE Transactions on Visualization and Computer Graphics, 13, 530-548.
[10] Freitas, A.M., Torres, R. da S. and Miranda, P.A.V. (2016) TSS & TSB: Tensor Scale Descriptors within Circular Sectors for Fast Shape Retrieval. Pattern Recognition Letters, 83, 303-311. https://doi.org/10.1016/j.patrec.2016.06.005
[11] Gupta, S., Girshick, R., Arbeláez, P. and Malik, J. (2014) Learning Rich Features from RGB-D Images for Object Detection and Segmentation. European Conference on Computer Vision 2014, Zurich, 6-12 September 2014, 345-360. https://doi.org/10.1007/978-3-319-10584-0_23
[12] Hänsch, R. and Hellwich, O. (2008) Weighted Pyramid Linking for Segmentation of Fully-Polarimetric SAR Data. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 37, 95-100. http://www.isprs.org/proceedings/XXXVII/congress/7_pdf/2_WG-VII-2/05.pdf
[13] Ma, L., Wang, J., Bo, Z. and Wang, S. (2010) Automatic Floor Segmentation for Indoor Robot Navigation. ICSPS 2010: Proceedings of the 2010 2nd International Conference on Signal Processing Systems, Vol. 1, Dalian, 5-7 July 2010, 684-689. https://doi.org/10.1109/ICSPS.2010.5555399
[14] Weis, M., Rumpf, T., Gerhards, R. and Plümer, L. (2009) Comparison of Different Classification Algorithms for Weed Detection from Images Based on Shape Parameters. Image Analysis for Agricultural Products and Processes, 69, 53-64.
[15] Siricharoen, P., Aramvith, S., Chalidabhongse, T.H. and Siddhichai, S. (2010) Robust Outdoor Human Segmentation Based on Color-Based Statistical Approach and Edge Combination. 1st International Conference on Green Circuits and Systems, ICGCS 2010, Shanghai, 21-23 June 2010, 463-468. https://doi.org/10.1109/ICGCS.2010.5543017
[16] Otsu, N. (1979) A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9, 62-66. https://doi.org/10.1109/TSMC.1979.4310076
[17] Rocha, A. and Goldenstein, S. (2009) Classifiers and Machine Learning Techniques for Image Processing and Computer Vision. University of Campinas, Campinas.
[18] Tomasi, C. and Kanade, T. (1991) Detection and Tracking of Point Features. Technical Report CMU-CS-91-132, Carnegie Mellon University, Pittsburgh, 1-22.
[19] Viswanathan, G.K., Murugesan, A. and Nallaperumal, K. (2013) A Parallel Thinning Algorithm for Contour Extraction and Medial Axis Transform. 2013 IEEE International Conference on Emerging Trends in Computing, Communication and Nanotechnology, Tirunelveli, 25-26 March 2013, 606-610. https://doi.org/10.1109/ICE-CCN.2013.6528571
[20] Wang, B., Brown, D., Gao, Y. and Salle, J.L. (2015) MARCH: Multiscale-Arch-Height Description for Mobile Retrieval of Leaf Images. Information Sciences, 302, 132-148. https://doi.org/10.1016/j.ins.2014.07.028
[21] Xie, F., Xu, G., Cheng, Y. and Tian, Y. (2009) An Improved Thinning Algorithm for Human Body Recognition. Proceedings of 2009 IEEE International Workshop on Imaging Systems and Techniques, Shenzhen, 11-12 May 2009, 416-420. https://doi.org/10.1109/IST.2009.5071678
[22] Kaothanthong, N., Chun, J. and Tokuyama, T. (2016) Distance Interior Ratio: A New Shape Signature for 2D Shape Retrieval. Pattern Recognition Letters, 78, 14-21. https://doi.org/10.1016/j.patrec.2016.03.029
[23] Lindeberg, T. (1993) Detecting Salient Blob Like Image Structures and Their Scales with a Scale Space Primal Sketch a Method for Focus-of-Attention. International Journal of Computer Vision, 318, 283-318. https://doi.org/10.1007/BF01469346
[24] Zheng, Y., Guo, B., Chen, Z. and Li, C. (2019) A Fourier Descriptor of 2D Shapes Based on Multiscale Centroid Contour Distances Used in Object Recognition in Remote Sensing Images. Sensors, 19, Article No. 486. https://doi.org/10.3390/s19030486
[25] Latecki, L.J. (2006) Shape Data for the MPEG-7 Core Experiment CE-Shape-1. https://dabi.temple.edu/external/shape/MPEG7/dataset.html/
[26] Stewart, G. (1993) On the Early History of the Singular Value Decomposition. SIAM Review, 35, 551-566. https://doi.org/10.1137/1035134
[27] Sobiecki, A., Yasan, H., Jalba, A. and Telea, A. (2013) Qualitative Comparison of Contraction-Based on Curve Skeletonizations Methods. International Symposium on Mathematical Morphology and Its Applications to Signal and Image Processing 2013, Uppsala, 27-29 May 2013, 425-439. https://doi.org/10.1007/978-3-642-38294-9_36

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.