Skin Cancer Classification Using Transfer Learning by VGG16 Architecture (Case Study on Kaggle Dataset)

Abstract

Skin cancer is a serious and potentially life-threatening disease that affects millions of people worldwide. Early detection and accurate diagnosis are critical for successful treatment and improved patient outcomes. In recent years, deep learning has emerged as a powerful tool for medical image analysis, including the diagnosis of skin cancer. Its importance lies in its ability to analyze large amounts of data quickly and accurately, which can help doctors make more informed decisions about patient care and improve overall outcomes. Additionally, deep learning models can be trained to recognize subtle patterns and features that may not be visible to the human eye, leading to earlier detection and more effective treatment. In this study, the pre-trained Visual Geometry Group 16 (VGG16) architecture is used to classify skin cancer images. For evaluation, the images were also converted into other color scales: 1) Hue Saturation Value (HSV), 2) YCbCr, and 3) grayscale. Results show that the datasets built from RGB and YCbCr images under field conditions were promising, with a classification accuracy of 84.242%. The dataset was also evaluated with other popular architectures for comparison, and the performance of VGG16 on images in each color scale was analyzed. In addition, feature parameters were extracted from different layers and fed to VGG16 to evaluate their ability to classify the disease.

Share and Cite:

Ibrahim, A., Elbasheir, M., Badawi, S., Mohammed, A. and Alalmin, A. (2023) Skin Cancer Classification Using Transfer Learning by VGG16 Architecture (Case Study on Kaggle Dataset). Journal of Intelligent Learning Systems and Applications, 15, 67-75. doi: 10.4236/jilsa.2023.153005.

1. Introduction

Skin cancer is one of the most common [1] and well-known types of cancer; it arises from DNA damage that can ultimately lead to mortality. Damaged DNA causes cells to grow out of control, and the phenomenon is becoming increasingly prevalent. Researchers have therefore studied computerized image analysis to sort out the underlying factors behind a cancer case, producing automated systems that recognize skin cancer. In practice, such automation has demonstrated accuracy and precision in detecting diseased spots at most stages. Cancer [2] is a deadly illness and is regarded as a threat to life, and it can also afflict human skin. Indeed, skin cancer is among the fastest-growing diseases and can ultimately lead to death.

The most common [3] types of skin cancer are keratosis, carcinoma, squamous cell carcinoma, and melanoma. The World Health Organization (WHO) reports that one in every three diagnosed cancers is a skin cancer. In general, skin cancer is observed to be especially widespread among people in the USA, Canada, and Australia, according to reports from the Canadian Cancer Society's advisory committee, the Cancer Council in Australia, and the WHO during the years 2014-2019 [4]. Cancer is an extreme threat to human life and can sometimes cause death. Different types of cancer may occur in the human body, and skin cancer is one of the fastest-growing cancers that can cause death. It is provoked by factors such as smoking, alcohol use, allergies, infections, viruses, lack of physical activity, environmental change, exposure to ultraviolet (UV) light, and so on. The DNA inside skin cells can be destroyed by UV radiation from the sun. In addition, unusual swellings on the human body can also be a sign of skin cancer.

Deep learning techniques [5] for skin cancer detection: deep neural networks play a major role in detecting skin cancer. These networks contain groups of units whose arrangement mimics the human brain, and the interconnected units cooperate harmoniously to solve a stated problem. Neural networks are trained to sort, classify, and analyze images of skin lesions. Datasets are prepared from the International Skin Imaging Collaboration (ISIC). The recognition software can be built on different learning models such as ANNs and CNNs. Of these, [6] the artificial neural network (ANN) is a nonlinear, statistical, intelligent network built like the biological brain. The system is based on three layers of neurons: the input layer, the intermediate (hidden) layers, and finally the output layer to which the data is sent.

This study [7] [8] features a deep convolutional neural network (DCNN) model built on deep learning theory. The researchers use kernels to mitigate distracting noise and artifacts, and classification depends on sorting and extracting image features. Data augmentation enhances the precision of the classification rate. Our model [9], built on VGG16, is compared with other learning models and architectures such as CNN, AlexNet, ResNet, and DenseNet. In this article we use the pre-trained Visual Geometry Group 16 (VGG16) architecture model.

Transfer learning [10] is a machine learning technique where knowledge gained from solving one problem is used to solve another, related problem. It involves taking the weights and parameters learned by an existing model, such as a convolutional neural network (CNN), and using them to create new models that are better suited for different tasks. This approach allows us to use fewer resources while still achieving good results with our models [11].
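The idea can be illustrated with a minimal NumPy sketch (random stand-in weights, not the paper's actual model): a "pre-trained" layer is kept frozen while only a small new head is trained on the target task, so far fewer parameters need to be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pre-trained" layer: random weights stand in for weights
# learned on a source task. They are never updated below.
W_frozen = rng.normal(size=(8, 4))
W_snapshot = W_frozen.copy()

def extract_features(x):
    h = np.maximum(x @ W_frozen, 0.0)            # ReLU features, frozen
    return np.hstack([h, np.ones((len(h), 1))])  # bias column for the head

# New task-specific head: the only trainable parameters.
w_head = np.zeros(5)

def predict(x):
    z = extract_features(x) @ w_head
    return 1.0 / (1.0 + np.exp(-z))              # sigmoid output

# Toy binary task defined on the frozen features (linearly separable).
X = rng.normal(size=(64, 8))
y = (extract_features(X)[:, 0] > 0.5).astype(float)

# Train only the head: gradient descent on the cross-entropy loss.
for _ in range(500):
    F = extract_features(X)
    p = predict(X)
    w_head -= 0.1 * (F.T @ (p - y)) / len(y)

accuracy = np.mean((predict(X) > 0.5) == y)
```

The key point the sketch demonstrates is that `W_frozen` is untouched by training: all the "knowledge" in the frozen layer is reused, and only the small head adapts to the new task.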

VGG16 is a convolutional neural network (CNN) architecture [12] developed by the Visual Geometry Group at Oxford University. It consists of 16 weight layers: 13 convolutional and 3 fully connected. The VGG16 model has been used in many applications such as image classification, object detection, and segmentation. Transfer learning with this architecture involves taking the weights and parameters learned by an existing VGG16 model and applying them to create new models that are better suited for different tasks.
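The 13 + 3 layout can be made concrete with a small pure-Python sketch based on the publicly documented VGG16 "configuration D" (3x3 kernels, five max-pooling stages, 224x224 input); this describes the standard model, with no claim about any variant used in this study.

```python
# VGG16 "configuration D": conv filter counts, 'M' = 2x2 max-pooling.
VGG16_CONV = [64, 64, 'M',
              128, 128, 'M',
              256, 256, 256, 'M',
              512, 512, 512, 'M',
              512, 512, 512, 'M']
VGG16_FC = [4096, 4096, 1000]        # last layer: 1000 ImageNet classes

conv_layers = [c for c in VGG16_CONV if c != 'M']
total_weight_layers = len(conv_layers) + len(VGG16_FC)   # 13 + 3 = 16

# Parameter count: each conv layer has 3*3*in_ch*out_ch weights + out_ch biases.
in_ch, conv_params = 3, 0
for c in conv_layers:
    conv_params += 3 * 3 * in_ch * c + c
    in_ch = c

# After five 2x2 poolings, a 224x224 input leaves a 7x7x512 feature map.
fc_in, fc_params = 7 * 7 * 512, 0
for n in VGG16_FC:
    fc_params += fc_in * n + n
    fc_in = n

total_params = conv_params + fc_params   # 138,357,544 in the standard model
```

Pooling layers carry no weights, which is why only the 16 conv and fully connected layers give the model its name.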

Deep learning [13] has revolutionized the entire landscape of machine learning during recent decades. It is considered the most sophisticated machine learning subfield concerned with artificial neural network algorithms. These algorithms are inspired by the function and structure of the human brain. Deep learning techniques are implemented in a broad range of areas such as speech recognition, pattern recognition [14] , and bioinformatics [15] . As compared with other classical approaches of machine learning, deep learning systems have achieved impressive results in these applications. Various deep learning approaches have been used for computer-based skin cancer detection in recent years. In this paper, we thoroughly discuss and analyze skin cancer detection techniques based on deep learning.

2. Proposed Skin-Cancer-Malignant Model Transfer Learning by VGG16 Architecture

This paper contributes the Skin-Cancer-Malignant software model, which sorts and classifies skin cancer and has proved to be an accurate mechanism. Processing a case with this model takes slightly less time than other software models, and it adds a new layer to the VGG16 architecture. In [16], a transfer learning model based on ResNet50 was proposed, but without feature selection or preprocessing steps; its results might be improved with a better classification rate on skin lesion images. In [17], the author proposed a CNN-based model for melanoma skin cancer detection that uses pre- and post-processing of images to optimize segmentation. The model delineated lesion areas by combining local and global contextual information and achieved good prediction and classification scores. However, execution time is not reported, which would add value to the results.

In [18], the author used a CNN to detect melanoma in pigmented melanocytic lesions from dermoscopic images. However, non-melanoma and non-pigmented melanomas were difficult to examine, and the approach also had lower detection accuracy.

Besides, the Skin-Cancer-Malignant model gives the best classification precision together with good execution time.

The researchers applied transfer learning with the VGG16 architecture: the weights and parameters learned by an existing VGG16 model are taken and applied to create a new model better suited for skin cancer classification. This approach allows us to use fewer resources while still achieving good results with our models. Figure 1 illustrates the process of skin cancer detection using the VGG16 model.

For our classification, the VGG16 model was first trained with the new dataset. We then took a pre-trained network, VGG16, which has already been trained on millions of images from the ImageNet dataset, and fine-tuned it by adding additional layers and changing some of its hyperparameters to make it more suitable for the task at hand. The architecture of our model is shown in Figure 2.

Figure 1. The process of skin cancer detection, VGG16: Pre-trained visual geometry group 16 architecture model.

Figure 2. The architecture of VGG16 model.

The dataset used in our model was the Skin_Cancer_Malignant vs. Benign dataset from Kaggle. It contains a training collection of 1440 benign and 1197 malignant images, which was used first to train the VGG16 model. A testing collection of 360 benign and 300 malignant images was then used as the test set after the VGG16 model was run. Figure 3 shows randomly selected samples of skin images from the dataset before, during, and after pre-processing and enhancement: (a) a clear raw skin image with suspected cancer, (b) less contrast, (c) noise, and (d) hair and noise.
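As a quick arithmetic check (a sketch, not the authors' code), the reported image counts imply that the held-out testing collection is almost exactly one fifth of all images, matching the 80/20 train/test proportions used later in the experiments.

```python
# Image counts as reported for the Kaggle Skin_Cancer_Malignant vs. Benign dataset.
train_benign, train_malignant = 1440, 1197
test_benign, test_malignant = 360, 300

train_total = train_benign + train_malignant   # 2637 training images
test_total = test_benign + test_malignant      # 660 testing images
overall = train_total + test_total             # 3297 images in total

test_fraction = test_total / overall           # fraction held out for testing
print(round(test_fraction, 3))                 # → 0.2
```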

3. Experiment and Result Analysis

Transfer learning is a technique in machine learning where a pre-trained model is used as a starting point for a new task. In the case of skin cancer classification, the VGG16 model is often used as the pre-trained model.

The VGG16 architecture consists of 13 convolutional layers and 3 fully connected layers. The convolutional layers are responsible for extracting features from the input image, while the fully connected layers are responsible for making predictions based on those features. To use VGG16 for skin cancer classification, the last layer of the model (which is typically a softmax layer) is replaced with a new output layer that has two nodes (one for malignant and one for benign). The weights of all other layers in the model are frozen so that they do not change during training.
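A minimal sketch of the replaced output layer (with hypothetical feature values and weights, not those of the trained model): the new two-node softmax turns the frozen, flattened features into a probability for each of the two classes.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

# Stand-in for the flattened features produced by the frozen VGG16 layers.
features = np.array([0.2, 1.1, 0.0, 0.7])

# The new trainable output layer: 4 features -> 2 class scores.
W_out = np.array([[0.5, -0.3],
                  [-0.2, 0.8],
                  [0.1, 0.1],
                  [0.4, -0.6]])
b_out = np.zeros(2)

probs = softmax(features @ W_out + b_out)          # sums to 1 across classes
label = ["benign", "malignant"][int(np.argmax(probs))]
```

During training, only `W_out` and `b_out` would be updated; the layers that produce `features` stay frozen as described above.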

The VGG16 model was also tested on the CIFAR-10 dataset, where it achieved an accuracy of 92.5%. This result is impressive considering that the dataset contains 10 classes with 6000 images per class. The model classified images with high accuracy, demonstrating its effectiveness in image recognition tasks. Furthermore, it achieved good results with a relatively small number of trainable parameters, indicating that VGG16 is an efficient and powerful tool for image recognition.

The equation for this model is:

Y = β0 + β1X1 + β2X2 + … + βnXn

where Y is the dependent variable, X1, X2, …, Xn are the independent variables, and β0, β1, …, βn are the coefficients.
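With made-up numbers, the form above is just an intercept plus a weighted sum; a minimal numeric example (hypothetical coefficients and inputs):

```python
import numpy as np

beta0 = 0.5                          # intercept β0
beta = np.array([1.0, -2.0, 0.25])   # coefficients β1..βn (n = 3 here)
x = np.array([2.0, 1.0, 4.0])        # inputs X1..Xn

Y = beta0 + beta @ x                 # 0.5 + 2.0 - 2.0 + 1.0 = 1.5
print(Y)
```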

When the model is trained, the parameters of the equation (e.g., its weights and biases) change to better fit the data. This can be illustrated by showing how the parameters change as the model is trained, and how this affects the equation.

Figure 3. Samples of skin images used in Skin_Cancer_Malignant model. (a) Clear, (b) Less contrast, (c) Noise, (d) Hair & Noise.

Determining the weights of the layers in the VGG16 model used to classify our skin cancer images involves pre-training the model on a dataset of labeled images (80% of the data). This is done with a supervised learning algorithm known as backpropagation. During this process, the weights of the layers are adjusted to minimize the error between the predicted and actual labels for each image; each weight is adjusted according to how much it contributes to correctly classifying the image. The process is repeated until the model reaches a high accuracy (ACC) at the last epoch on the training data. The resulting weights were then used to classify the test data, the remaining 20% of the images, which achieved an ACC of 84.242%.
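The weight-adjustment step can be sketched for a single hypothetical weight with a squared-error loss (an illustration of one gradient-descent update as computed by backpropagation, not the authors' training code): the gradient of the error is computed via the chain rule and the weight is nudged against it, so the error shrinks.

```python
w = 0.0                 # initial weight
x, target = 1.5, 3.0    # one training example: input and true label
lr = 0.1                # learning rate

def error(w):
    # Squared error between prediction w*x and the target.
    return 0.5 * (w * x - target) ** 2

before = error(w)
grad = (w * x - target) * x   # dE/dw by the chain rule
w -= lr * grad                # gradient-descent update: w = 0 + 0.1*4.5 = 0.45
after = error(w)              # strictly smaller than `before`
```

Repeating this update over many examples and epochs is what drives the training accuracy curve upward.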

The input to the model is an image of a skin lesion, which is passed through the convolutional layers to extract features. These features are then flattened and passed through the fully connected layers to predict whether the lesion is malignant or benign. During training, only the weights of the new output layer are updated using backpropagation, which allows us to train a new model for skin cancer classification using much less data than training from scratch would require.

When this model is evaluated, it obtains an accuracy of 84.242%, standardized on 556 datasets.

Overall, the model proved dependable and robust. One of the main advantages of using transfer learning with VGG16 for skin cancer classification is that it achieves high accuracy with relatively little data, because the network has already learned to recognize many features relevant to skin cancer classification, such as texture, color, and shape. Another advantage is that it helps avoid overfitting, which occurs when a model becomes too complex and starts to memorize the training data instead of generalizing to new data. By starting from a pre-trained model like VGG16, we leverage the knowledge the network has already learned.

Figure 4 shows the accuracy of the training and validation data of VGG16 for skin cancer images. The x-axis represents the number of epochs, while the y-axis represents the accuracy percentage; the blue line is the training accuracy and the orange line the validation accuracy. At the beginning of training, both curves start at a low accuracy. As the number of epochs increases, both gradually improve, with the training accuracy staying above the validation accuracy through most of the epochs. Towards the end of training, both curves plateau and converge towards an accuracy of around 84.242%, indicating that VGG16 has learned to classify skin cancer images with a high degree of precision. Overall, Figure 4 demonstrates that VGG16 is an effective model for classifying skin cancer images, with high accuracy on both the training and validation datasets.

Figure 4. The accuracy of training and validating data of VGG16 for skin cancer images.

4. Conclusions

Overall, transfer learning with VGG16 for skin cancer classification involves using a pre-trained model to extract features from images of skin lesions and then training a new output layer to predict whether those lesions are malignant or benign.

To sum up, using transfer learning with VGG16 for skin cancer classification can help us achieve high accuracy with relatively little data and avoid overfitting. This makes it an attractive option for researchers and practitioners in the field of dermatology who are looking to develop accurate and reliable methods for diagnosing skin cancer.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Ali, M.S., Miah, M.S., Haque, J., Rahman, M.M. and Islam, M.K. (2021) An Enhanced Technique of Skin Cancer Classification Using Deep Convolutional Neural Network with Transfer Learning Models. Machine Learning with Applications, 5, Article ID: 100036.
https://doi.org/10.1016/j.mlwa.2021.100036
[2] Carcagnì, P., Leo, M., Cuna, A., Mazzeo, P.L., Spagnolo, P., Celeste, G., et al. (2019) Classification of Skin Lesions by Combining Multilevel Learnings in a DenseNet Architecture. In: Ricci, E., Rota Bulò, S., Snoek, C., Lanz, O., Messelodi, S. and Sebe, N., Eds., Image Analysis and Processing—ICIAP 2019, Springer, Cham, 335-344.
https://doi.org/10.1007/978-3-030-30642-7_30
[3] Dorj, U.O., Lee, K.K., Choi, J.Y. and Lee, M. (2018) The Skin Cancer Classification Using Deep Convolutional Neural Network. Multimedia Tools and Applications, 77, 9909-9924.
https://doi.org/10.1007/s11042-018-5714-1
[4] Pacheco, A.G. and Krohling, R.A. (2019) Recent Advances in Deep Learning Applied to Skin Cancer Detection. arXiv: 1912.03280.
[5] Ashraf, R., Afzal, S., Rehman, A.U., Gul, S., Baber, J., Bakhtyar, M., Mehmood, I., Song, O.Y. and Maqsood, M. (2020) Region-of-Interest Based Transfer Learning Assisted Framework for Skin Cancer Detection. IEEE Access, 8, 147858-147871.
https://doi.org/10.1109/ACCESS.2020.3014701
[6] Byrd, A.L., Belkaid, Y. and Segre, J.A. (2018) The Human Skin Microbiome. Nature Reviews Microbiology, 16, 143-155.
https://doi.org/10.1038/nrmicro.2017.157
[7] Jeong, M.K., Lu, J.C., Huo, X., Vidakovic, B. and Chen, D. (2006) Wavelet-Based Data Reduction Techniques for Process Fault Detection. Technometrics, 48, 26-40.
https://doi.org/10.1198/004017005000000553
[8] Namey, E., Guest, G., Thairu, L. and Johnson, L. (2008) Data Reduction Techniques for Large Qualitative Data Sets. Handbook for Team-Based Qualitative Research, 2, 137-161.
[9] Mahbod, A., Schaefer, G., Wang, C., Ecker, R. and Ellinge, I. (2019) Skin Lesion Classification Using Hybrid Deep Neural Networks. ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, 12-17 May 2019, 1229-1233.
https://doi.org/10.1109/ICASSP.2019.8683352
[10] Sagar, A. and Dheeba, J. (2020) Convolutional Neural Networks for Classifying Melanoma Images. bioRxiv.
https://www.biorxiv.org/content/10.1101/2020.05.22.110973v2
[11] Shaikh, J., Khan, R., Ingle, Y. and Shaikh, N. (2022) Skin Cancer Detection: A Review Using AI Techniques. International Journal of Health Sciences, 6, 14339-14346.
[12] Althubiti, S.A., Alenezi, F., Shitharth, S., Sangeetha, K. and Simha Reddy, C.V. (2022) Circuit Manufacturing Defect Detection Using VGG16 Convolutional Neural Networks. Wireless Communications and Mobile Computing, 2022, Article ID: 1070405.
https://doi.org/10.1155/2022/1070405
[13] Kalouche, S. (2016) Vision-Based Classification of Skin Cancer Using Deep Learning.
https://www.semanticscholar.org/paper/Vision-Based-Classification-of-Skin-Cancer-using-Kalouche/b57ba909756462d812dc20fca157b3972bc1f533
[14] Bisla, D., Choromanska, A., Stein, J.A., Polsky, D. and Berman, R. (2019) Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, 16-17 June 2019, 2720-2728.
http://arxiv.org/abs/1902.06061
[15] Farag, A., Lu, L., Roth, H.R., Liu, J., Turkbey, E. and Summers, R.M. (2017) A Bottom-Up Approach for Pancreas Segmentation Using Cascaded Superpixels and (Deep) Image Patch Labeling. IEEE Transactions on Image Processing, 26, 386-399.
https://doi.org/10.1109/TIP.2016.2624198
[16] Le, D.N., Le, H.X., Ngo, L.T. and Ngo, H.T. (2020) Transfer Learning with Classweighted and Focal Loss Function for Automatic Skin Cancer Classification. arXiv: 2009.05977.
[17] Jafari, M.H., Karimi, N., Nasr-Esfahani, E., Samavi, S., Soroushmehr, S.M.R., Ward, K., et al. (2016) Skin Lesion Segmentation in Clinical Images Using Deep Learning. 2016 23rd International Conference on Pattern Recognition, Cancun, 4-8 December 2016, 337-342.
https://doi.org/10.1109/ICPR.2016.7899656
[18] Tschandl, P., et al. (2019) Comparison of the Accuracy of Human Readers versus Machine-Learning Algorithms for Pigmented Skin Lesion Classification: An Open, Web-Based, International, Diagnostic Study. The Lancet Oncology, 20, 938-947.
https://doi.org/10.1016/S1470-2045(19)30333-X

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.