Survey on Clustering Techniques for Image Categorization Dataset

Abstract

Content-Based Image Retrieval (CBIR) performs automated classification of a queried image. It can relieve users of the laborious, time-consuming task of assigning metadata to images when working with massive image collections. A user's definition or description of an image is subjective, so the same image may belong to different categories as defined by different users. Human-based and computer-based categorization may produce different results because their categorization criteria differ; the latter depend on the dataset structure and the clustering technique. This paper aims to present ideas for planning the dataset structure and choosing the clustering algorithm for a CBIR implementation. The paper has five sections: Section 1 introduces the CBIR and QBE concepts, Section 2 lists related image categorization research, Section 3 describes the five types of image clustering, Section 4 gives a comparative analysis, and Section 5 concludes the study. The outcome of this paper should benefit CBIR developers across various applications.

Share and Cite:

Shukran, M. , Mohd Yunus, M. , Abdullah, M. , Isa, M. , Khairuddin, M. , Maskat, K. , Ismail, S. and Shibghatullah, A. (2022) Survey on Clustering Techniques for Image Categorization Dataset. Journal of Computer and Communications, 10, 177-185. doi: 10.4236/jcc.2022.106014.

1. Introduction

In the interaction between CBIR and humans, human-level image interpretation applies: a user who submits an image query expects CBIR to produce results that match their preferences. Human-based image interpretation depends on the user's psychological state and level of knowledge [1], while computer-based classification relies on clustering techniques. Computer-based image classification can be divided into three modules: an extraction module, a query module, and a retrieval module [2]. The computer interprets an image in the query module, where the extracted features are linked with suitable metadata such as keywords, captions, or hashtags (a manual metadata technique that has recently become popular among internet users).

Clustering techniques determine the suitable, related metadata for a queried image by referring to the similarity matrix of the vector data groups in a dataset. The reference dataset, also called the training dataset, contains a number of data vectors (the extracted features of images) that a clustering technique has divided into several groups, each carrying a different data property (such as an image keyword). Clustering is also a self-learning way for CBIR to find images in the dataset that are similar to the queried image. The method of querying with an image to find a set of similar images is known as Query by Example (QBE) [3], where CBIR returns results in the form of similar images or keywords defined for the image the user loaded.
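As a rough illustration of a QBE-style lookup, the sketch below ranks the entries of a small training dataset by their similarity to a query feature vector. The feature vectors, the keywords, the choice of cosine similarity, and the function names are all assumptions made for this example; they are not taken from any of the surveyed systems.

```python
import math

# Hypothetical training dataset: (feature vector, keyword) pairs.
# Both the vectors and the keywords are invented for illustration.
TRAINING_SET = [
    ([0.9, 0.1, 0.0], "car"),
    ([0.8, 0.2, 0.1], "car"),
    ([0.1, 0.9, 0.2], "cat"),
    ([0.0, 0.8, 0.3], "cat"),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def query_by_example(query_vector, dataset, top_n=2):
    """Return the top_n (similarity, keyword) pairs, most similar first."""
    scored = [(cosine_similarity(query_vector, vec), kw) for vec, kw in dataset]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Feature vector of a hypothetical queried image.
results = query_by_example([0.85, 0.15, 0.05], TRAINING_SET)
keywords = [kw for _, kw in results]
```

Here `keywords` holds the keywords of the two most similar training entries, which a CBIR system could return as suggested metadata for the queried image.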

However, a QBE-style query relies entirely on the data in the training dataset, so the accuracy of its results may vary. Accuracy may suffer when the training dataset contains too little data or when the dataset is improperly clustered.

2. Related Research

While most research focuses on clustering large amounts of data, as in [4], or multi-cluster data, as in [5], only a few researchers focus on data with complex structures, as in [6]. Human-level image categorization is complex when a CBIR system deals with many users' interpretations. Because CBIR is created to serve humans, it needs to accommodate human interpretation. ImageNet [7] is a crowd-sourced image dataset that categorizes images according to an English dictionary tree structure; it contains around 20 thousand categories, with hundreds of images in each category (Figure 1).

In practice, apart from publicly hosted CBIR systems such as Google Image [8], most private CBIR systems do not fully utilize or require a dataset with 20 thousand categories. For a special-purpose CBIR system, it is more reliable to include only the dataset related to that purpose. For instance, a dataset of car-related images should be enough for a car-finder CBIR system. Besides that enormous dictionary-based dataset, several other image datasets for research purposes are provided by various academic and research institutions. Those datasets offer content for specific purposes with a smaller number of categories. The main consideration is which type of dataset is adequate for a given CBIR system (Table 1).

Many more datasets than those listed are available as research samples, covering surveillance camera footage, animals, natural processes, and much more. Above all, the purpose of the CBIR system is the critical decision that determines which dataset content to adopt.

Figure 1. A part of ImageNet hierarchical dataset screenshot.

Table 1. Example list of specific content research dataset.

3. The Clustering Techniques

CBIR employs clustering techniques to categorize the data that reside in a dataset. Clustering techniques comprise a collection of clustering algorithms that are commonly divided into five categories [18] according to their clustering behavior. A clustering algorithm is a self-learning process through which CBIR links suitable metadata, such as captions and keywords, to a queried image. On a preloaded (or adopted) dataset, the clustering algorithm analyzes the data as matrices (mostly two-dimensional, occasionally three-dimensional) to find the cluster centers among the data vectors (Figure 2).

A cluster area is built around each cluster center, and every data point that resides in a cluster area is assigned membership of that cluster. Some clustering algorithms assign membership under the hard-clustering rule, where each data point is a member of exactly one cluster. Under the other rule, soft clustering, a data point can be a member of one or more clusters. Hard clustering is less complex than soft clustering, so it runs faster and consumes fewer resources. However, several CBIR applications require an image to belong to more than one category, so soft clustering cannot be abandoned.
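The difference between the two membership rules can be sketched for a single data point and a set of fixed cluster centers. The hard rule below picks the nearest center, while the soft rule uses the membership formula commonly associated with fuzzy c-means, with the fuzzifier `m` assumed to be 2. The centers, the sample point, and the function names are hypothetical and serve only as an illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_membership(point, centers):
    """Hard rule: index of the single nearest cluster center."""
    distances = [euclidean(point, c) for c in centers]
    return distances.index(min(distances))


def soft_membership(point, centers, m=2.0):
    """Soft rule: one membership degree per cluster, summing to 1.

    Uses the fuzzy c-means style formula; m must be greater than 1.
    """
    distances = [euclidean(point, c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # The point sits exactly on one or more centers: share membership there.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Two hypothetical cluster centers and one data point between them.
centers = [[0.0, 0.0], [4.0, 0.0]]
point = [1.0, 0.0]

hard = hard_membership(point, centers)   # a single cluster index
soft = soft_membership(point, centers)   # one degree per cluster
```

With these values the hard rule places the point only in the first cluster, whereas the soft rule keeps a small membership degree for the second cluster as well, which is what allows an image to sit in more than one category.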

During the query process, the clustering algorithm calculates the similarity matrices between the extracted features of the queried image and the data in the dataset. If the newly queried data need to be inserted into the dataset matrix, the clustering algorithm revises the cluster centers and cluster areas to reflect any changes (for partition-based clustering algorithms, this step is called an iteration). As a result, a cluster may expand or shrink, or a new cluster may be created. Table 2 shows that each clustering algorithm clusters data in its own way and produces a different result. This variety of clustering algorithms is needed to serve the many purposes and requirements of CBIR development.
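One simple way to picture the revision step is an incremental update: when a new vector is inserted, the nearest cluster center moves toward it, and a new cluster is opened if the vector is too far from every existing center. This is only a minimal sketch under stated assumptions; the distance threshold `new_cluster_distance`, the running-mean update, and the function name `insert_vector` are choices made for this example and do not correspond to any specific algorithm in Table 2.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def insert_vector(vector, centers, counts, new_cluster_distance=2.0):
    """Insert a vector and revise the clusters in place.

    centers: list of cluster centers (lists of floats)
    counts:  number of members per cluster, parallel to centers
    Returns the index of the cluster that received the vector.
    """
    if centers:
        distances = [euclidean(vector, c) for c in centers]
        nearest = distances.index(min(distances))
        if distances[nearest] <= new_cluster_distance:
            # Existing cluster grows: shift its center with a running mean.
            counts[nearest] += 1
            n = counts[nearest]
            centers[nearest] = [
                c + (v - c) / n for c, v in zip(centers[nearest], vector)
            ]
            return nearest
    # Too far from every center (or no clusters yet): create a new cluster.
    centers.append(list(vector))
    counts.append(1)
    return len(centers) - 1


# One hypothetical cluster with three members centered at the origin.
centers = [[0.0, 0.0]]
counts = [3]

first = insert_vector([1.0, 1.0], centers, counts)     # close to the cluster
second = insert_vector([10.0, 10.0], centers, counts)  # far from the cluster
```

The first insertion expands the existing cluster and shifts its center, while the second creates a new cluster, mirroring two of the outcomes described above.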

Figure 2. The clusters’ notions.

Table 2. List of clustering techniques.

4. Comparative Analysis

CBIR needs to be equipped with a dataset as reference material so that it can recognize a queried image and then place the data into the relevant cluster. The ability of CBIR to recognize an image and categorize a dataset relies on the clustering technique implemented. The dataset design affects how the data structure is presented. Some data need to be soft clustered, as in Table 3.

Table 3. Hard and soft clustered data presentation.

The dataset matrix used by CBIR is not arranged as cleanly as illustrated in Table 3; it needs to be clustered in order to assign cluster membership to each data point. No single clustering algorithm suits every dataset structure and requirement. Thus, the clustering algorithm must be selected carefully to suit the CBIR and dataset design. Table 4 compares some clustering algorithms that suit specific dataset structures.

When a clustering algorithm is wrongly chosen, its output will not meet the requirements and expectations. K-Means is regarded as the simplest algorithm [24]; it can execute the clustering process faster than other algorithms with minimal resources. For a CBIR system with an unknown number of categories, however, adopting K-Means can cause data to be wrongly categorized, because the number of clusters must be fixed in advance. On the other hand, when a CBIR system must enforce a strict number of categories and a strict cluster membership rule, implementing an algorithm other than a K-Means type might result in clustering that falls below expectations.
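The effect of fixing the number of clusters can be seen in a minimal K-Means sketch (Lloyd's iteration written in plain Python). The nine sample points, which form three visually separate groups, and the hand-picked initial centers are assumptions made for this illustration; a practical implementation would normally choose initial centers differently.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, initial_centers, max_iterations=100):
    """Plain K-Means: k is fixed by the number of initial centers."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center (hard rule).
        new_labels = [
            min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # memberships stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers


# Hypothetical data with three natural groups of three points each.
points = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
    [20.0, 10.0], [20.0, 11.0], [21.0, 10.0],
]

labels_k3, _ = k_means(points, [points[0], points[3], points[6]])
labels_k2, _ = k_means(points, [points[0], points[6]])
```

With three initial centers each group receives its own label. With only two, the second and third groups are forced to share a label, which is the kind of miscategorization that can occur when the number of categories is not known beforehand.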

Although clustering algorithms were developed long before the term big data became popular (around the 1990s), research on data clustering continues to emerge. Over time, CBIR needs to be equipped with more sophisticated clustering algorithms because image categorization requirements grow alongside CBIR applications. CBIR developers need to choose the clustering algorithm wisely, and a comparative table such as Table 4 should help.

Table 4. Two-dimensional matrix presentations of dataset designs.

5. Conclusion

In a CBIR development cycle, the purpose of the CBIR implementation needs to be decided first, because it determines the dataset design and, in turn, the clustering algorithm to adopt. Since no single clustering algorithm is the ultimate choice for all clustering purposes, it is up to the CBIR developer's judgment to select and adapt the best-fitting clustering algorithm. We hope the analysis and findings of this study serve as a useful reference for choosing a suitable clustering algorithm for CBIR development.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Ordonez, V., Deng, J., Choi, Y., Berg, A.C. and Berg, T.L. (2013) From Large Scale Image Categorization to Entry-Level Categories. Proceedings of the IEEE International Conference on Computer Vision.
[2] Cardoso, D.N.M., Muller, D.J., Alexandre, F., Neves, L.A.P., Trevisani, P.M.G. and Giraldi, G.A. (2013) Iterative Technique for Content-Based Image Retrieval Using Multiple SVM Ensembles. J Clerk Maxwell, A Treatise on Electricity and Magnetism, 2, 68-73.
[3] Zloof, M.M. (1977) Query by Example: A Database Language. IBM Systems Journal, 16, 324-343.
[4] Zhou, B., Khosla, A., Lapedriza, A., Torralba, A. and Oliva, A. (2017) Places: An Image Database for Deep Scene Understanding. Journal of Vision, 17, 296.
[5] Bora, D.J., Gupta, D. and Kumar, A. (2014) A Comparative Study between Fuzzy Clustering Algorithm and Hard Clustering Algorithm. arXiv: 1404.6059.
[6] Lv, Y.H., Ma, T., Tang, M., Cao, J., Tian, Y., Al-Dhelaan, A. and Al-Rodhaan, M. (2016) An Efficient and Scalable Density-Based Clustering Algorithm for Datasets with Complex Structures. Neurocomputing, 171, 9-22.
[7] ImageNet (2016) http://image-net.org/index
[8] Google, “Google Image” (2018) http://image.google.com
[9] Freitas, F.D.A., Peres, S.M., Lima, C.A.D.M. and Barbosa, F.V. (2014) Grammatical Facial Expressions Recognition with Machine Learning. FLAIRS Conference.
[10] Rothe, R., Timofte, R. and Gool, L.V. (2015) Dex: Deep Expectation of Apparent Age from a Single Image. Proceedings of the IEEE International Conference on Computer Vision Workshops.
[11] Heilbron, F.C., Escorcia, V., Ghanem, B. and Niebles, J.C. (2015) Activitynet: A Large-Scale Video Benchmark for Human Activity Understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[12] Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) Imagenet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems.
[13] Xiao, H., Rasul, K. and Vollgraf, R. (2017) Fashion-Mnist: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv: 1708.07747.
[14] LeCun, Y., Bottou, L., Bengio, Y. and Haffner, P. (1998) Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE.
[15] Kussul, E. and Baidyk, T. (2004) Improved Method of Handwritten Digit Recognition Tested on MNIST Database. Image and Vision Computing, 22, 971-981.
[16] Heitz, G., Elidan, G., Packer, B. and Koller, D. (2009) Shape-Based Object Localization for Descriptive Classification. In Advances in Neural Information Processing Systems.
[17] Digital Globe Inc. (2016) http://explore.digitalglobe.com/spacenet
[18] Fahad, A., AlShatri, N., Tari, Z., Alamri, A., Khalil, I., Zomaya, A.Y., Foufou, S. and Bouras, A. (2014) A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis. IEEE Transactions on Emerging Topics in Computing, 2, 267-279.
[19] Ester, M., Kriegel, H.-P., Sander, J. and Xu, X. (1996) A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. KDD, 96, 226-231.
[20] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 1-38.
[21] Campello, R.J.G.B., Moulavi, D. and Sander, J. (2013) Density-Based Clustering Based on Hierarchical Density Estimates. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Heidelberg, Berlin.
[22] Macqueen, J. (1967) Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability.
[23] Dunn, J.C. (1973) A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters.
[24] Rajalakshmi, K., Dhenakaran, D.S. and Roobin, N. (2015) Comparative Analysis of K-Means Algorithm in Disease Prediction. International Journal of Science, Engineering and Technology Research (IJSETR), 4, 1-3.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.