Dynamic Classification Using the Adaptive Competitive Algorithm for Breast Cancer Detection
1. Introduction
As defined by the World Health Organization (WHO), cancer encompasses a broad range of diseases that can develop in any tissue or organ of the body. It arises when abnormal cells proliferate uncontrollably, extending beyond their normal limits. These cells can invade surrounding tissues and organs, leading to their destruction.
Breast cancer is one of the most common malignant tumors affecting women worldwide, characterized by a high incidence rate, rapid progression, and the potential for late-stage detection. According to the WHO, breast cancer was diagnosed in 2.3 million women and caused approximately 670,000 deaths worldwide [1]. It remains a leading cause of premature mortality among women, with a particularly high burden in low- and middle-income countries.
The key aspect of breast cancer treatment lies in the accurate detection of its presence and the correct identification of the type of cancer, which is essential to determine the most effective treatment approach [2]. The precise classification of breast cancer is crucial for early detection, accurate diagnosis, and effective treatment, which can lead to its complete elimination in some cases. Additionally, correctly identifying benign tumors can help prevent patients from undergoing unnecessary medical interventions [3].
The use of machine learning in disease diagnosis has rapidly advanced in recent years. By analyzing collected data, machine learning techniques enable researchers to uncover patterns and insights that may not be easily detected through traditional observation or manual calculations. In medical diagnostics, AI-powered software can assist healthcare professionals in making more accurate assessments, significantly enhancing diagnostic precision and offering an additional layer of reliability and safety for patients [4].
Machine learning algorithms have played a significant role in the creation of predictive models that enhance decision-making in breast cancer diagnosis and prognosis [5]. One well-known method is clustering, which has been used in diverse fields such as machine learning and statistics. Clustering can be regarded as the most important unsupervised learning problem; it aims to identify patterns or structures within data by grouping similar objects into clusters based on their intrinsic characteristics. The concept of clustering was presented in an anthropology paper in 1954 [6] and is widely used in science and engineering for applications such as voice recognition, face detection, spam filtering, and oncological diagnosis.
Unsupervised clustering can be classified into three main groups: hierarchical [7], partitioning [8], and soft computing [9]. Among these, the artificial neural network (ANN) stands out as an intelligent system capable of unsupervised clustering.
Among various machine learning techniques, competitive learning plays a crucial role in unsupervised learning within ANNs. This concept is inspired by natural competition processes, where limited resources drive competitive adaptation [10]. Unsupervised ANNs involve dynamical systems that use energy-based functions, first introduced by John Hopfield in 1982 [11]. In optimization problems, an energy function represents the ANN’s state, which is determined by the weight values. The term “energy” is used because optimization models aim to minimize this function: lower energy levels correspond to greater system stability, while higher energy states indicate instability. The iterative minimization gradually reduces the network’s total energy, driving it toward a minimum and enabling effective classification of object features. The proposed adaptive competitive self-organizing (ACS) ANN model functions as a dynamic system, combining the principles of energy functions with the competitive characteristics of earlier models. This model was developed by [12] [13]. The core concept of the ACS model is derived from a related study that developed an associative memory system. Compared to models like SOM (self-organizing map) and ART (adaptive resonance theory), this model supports more complex implementations and does not have storage capacity limitations. The ACS model can be used for vector quantization (VQ), motivated by the model designed by [14] [15].
Artificial neural networks are widely utilized to address classification and clustering problems through the VQ approach. Vector quantization is used for data compression, especially for image, video, speech, and audio data, to reduce the dimensionality of the data and make it more compact. Nowadays, it is primarily used for classification, where it partitions large sets of multidimensional data (vectors) into groups, each represented by a “centroid” point [16]. Vector quantization is a traditional technique used to estimate a continuous probability density function (PDF), $p(x)$, of the vector variable $x$ by employing a limited set of prototypes [17]. Classification and compression of vector data are used together in supervised and unsupervised vector quantization problems [18]. Modern VQ data classifiers employ artificial neural networks, which consist of interconnected computational units referred to as neurons or nodes. These networks replicate the functioning of biological neural networks and are leveraged to address artificial intelligence challenges. Neural network systems strive to simplify the intricacy of biological neural systems and concentrate on essential information processing [19]. Biologically inspired learning techniques, including the self-organizing map (SOM) and neural gas (NG) vector quantizer, are effective approaches for vector quantization [20].
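The centroid idea behind vector quantization can be illustrated with a minimal sketch. The centroids and data points below are made-up values chosen for illustration, and `nearest_centroid` is a hypothetical helper, not part of the ACS model: each vector is replaced by the index of its closest centroid, which is the compression and labeling step described above.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(x, centroids[k]))


# Illustrative (assumed) codebook of two centroids in R^2.
centroids = [(0.0, 0.0), (5.0, 5.0)]

# Illustrative (assumed) data vectors to be quantized.
data = [(0.2, -0.1), (4.8, 5.3), (0.9, 0.4), (5.5, 4.6)]

# Encoding: each vector is stored as a single integer label.
codes = [nearest_centroid(x, centroids) for x in data]

# Decoding: each label is mapped back to its centroid (lossy).
reconstructed = [centroids[c] for c in codes]
```

Storing only `codes` plus the codebook is what makes VQ a compression scheme; the same labels double as cluster assignments.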
The self-organizing feature arises from a system of ordinary differential equations (ODEs). The ACS model is optimized using gradient descent (GD), which, although not the only optimization method, remains the most practical and efficient approach.
This paper is structured into four distinct sections, outlined as follows: Section 1, the introduction, gives an overview of the research. Section 2, the literature review, presents related work on breast cancer classification and the proposed dynamic model. Section 3 is concerned with the methods used for this research. Section 4 gives the conclusion and a concise summary, along with a critical analysis of the findings.
2. Literature Review
Numerous studies have explored the application of machine learning algorithms to various medical datasets, particularly in breast cancer prediction. Many of these techniques have demonstrated high accuracy in forecasting outcomes.
In 2015, [21] applied various machine learning techniques, including Support Vector Machine (SVM), artificial neural networks, Naïve Bayes classifier, and AdaBoost, for breast cancer prediction. To enhance efficiency, principal component analysis was utilized for feature space reduction.
In 2016, [22] compared the performance of machine learning algorithms such as SVM, Decision Tree (C4.5), Naïve Bayes, and k-NN. The evaluation focused on accuracy, precision, sensitivity, and specificity, with results showing that SVM achieved the highest accuracy (97.13%) and the lowest error rate.
Gupta et al. (2018) [23] applied machine learning techniques such as Linear Regression, Random Forest, Multi-layer Perceptron (MLP), and Decision Trees (DT) to the Wisconsin Breast Cancer dataset. Results showed that MLP achieved the highest prediction accuracy.
The paper by Yarabarla et al. (2019) [24] explores the application of machine learning in predicting breast cancer. The study highlights breast cancer as a leading health concern among women and emphasizes the importance of early detection to reduce mortality rates. The research focuses on the development of computer-aided detection (CAD) systems, which leverage machine-learning algorithms to enhance diagnostic accuracy. By training models with medical datasets, the study aims to improve breast cancer classification, allowing more precise and timely intervention.
In 2020, a study [25] compared five supervised machine learning models (SVM, KNN, Random Forests, ANN, and Logistic Regression) using the Wisconsin Breast Cancer dataset. ANN achieved the highest accuracy (98.57%), precision (97.82%), and F1 score (0.9890), outperforming the other models. These findings highlight ANN’s effectiveness in breast cancer prediction and diagnosis.
Another 2020 study [26] applied machine learning (ML) to breast cancer prognosis using data from 610 post-surgery patients. ANN and SVM models predicted recurrence and mortality with high accuracy (95.29% - 96.86%) and specificity (0.97 - 0.99), though sensitivity (0.35 - 0.64) requires improvement. ML shows promise as a clinical tool for breast cancer prognosis.
Ricciardi et al. [27] utilized a combination of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) for coronary artery disease classification. PCA was applied to generate new features, while LDA was used for classification, enhancing diagnostic accuracy, and improving patient outcomes.
The article [28] investigates the use of generative models, specifically Denoising Diffusion Probabilistic Models (DDPM) and Progressive Growing Generative Adversarial Networks (PGGANs), to augment imbalanced medical datasets. Their research demonstrates that DDPM-generated images significantly improve classification performance across various deep-learning architectures by reducing dataset imbalance and enhancing model robustness. This aligns with our approach in Dynamic Classification Using the Adaptive Competitive Algorithm for Breast Cancer Detection, where we explore adaptive techniques to optimize classification performance. The insights from synthetic data generation can complement our method by further improving dataset diversity and stability.
Prioritization and classification of data are crucial in both software development and healthcare. In 2025, [29] proposed a model for ranking app reviews to help developers respond efficiently, using adaptive classification techniques. Similarly, in breast cancer detection, dynamic classification models prioritize medical data for timely diagnosis. The principles of adaptive learning and data filtering in Jafari et al.’s work align with medical AI models, enhancing decision-making and accuracy in critical classifications.
In 2011, Cheng and Sayeh introduced a novel approach that uses dynamic systems for clustering and vector quantization, grounded in ordinary differential equations, with an emphasis on the potential for real-time usage. Demonstrated through two varied examples of pattern clustering, the model proves capable of handling diverse input patterns. A significant aspect of the research is the exploration and confirmation of the system’s stability, particularly through the identification and examination of equilibrium points linked to specific input patterns and their correlation with the system’s vigilance parameter. The system was implemented in two real-world scenarios, yielding results that align with the best reported findings, which demonstrates the effectiveness of the approach [30].
Sarafraz and Sayeh in 2018 introduced the Adaptive Competitive Self-organizing (ACS) model, suitable for real-time clustering and vector quantization. The model features a dynamic, self-adjusting structure that overcomes challenges like parasitic limit points for more accurate labeling. It performs as an unsupervised classifier, guided by gradient descent and Lotka-Volterra competition. Central to ACS is an energy function, ensuring stability at equilibrium points corresponding to cluster centroids. The model’s efficacy is confirmed through tests on various datasets, showing improved or comparable clustering performance [10].
Goga et al. [31] proposed an intelligent system, DiagBC, for breast cancer diagnosis, combining unsupervised learning (C-Means and Gaussian Mixture Model) for image segmentation and supervised learning (a modified DenseNet) for classification. The system was implemented at Magori Polyclinic in Niamey, diagnosing breast cancer into normal and abnormal classes. Their study highlights the significance of AI in medical imaging, leveraging deep learning for feature extraction and classification. The results suggest that integrating segmentation and classification techniques can enhance early detection and diagnosis, addressing challenges such as limited annotated data and the inefficiencies of traditional diagnostic methods.
The article [32] investigates the performance of algorithms such as Support Vector Machines (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Artificial Neural Networks (ANN) in diagnosing breast cancer. Their findings highlight the significance of hyperparameter tuning in improving classification accuracy, particularly in optimizing SVM kernel functions and ANN architectures. This study aligns with our proposed Adaptive Competitive Algorithm by demonstrating how machine learning can effectively classify malignant and benign breast cancer cases.
In contrast to the studies discussed above, which focused primarily on traditional classification techniques and supervised learning approaches for breast cancer diagnosis, this research introduces a novel vector quantization (VQ) model tailored for clustering high-dimensional medical data. Unlike conventional machine learning methods, the proposed model dynamically adapts to the data distribution over time, enhancing its capability to recognize patterns within complex datasets. By applying this gradient-based dynamical system to the Breast Cancer Wisconsin Diagnostic dataset [33], we aim to improve clustering accuracy and stability in medical diagnostics. This approach not only provides a fresh perspective on breast cancer classification but also demonstrates the potential of adaptive clustering techniques in handling multidimensional data, setting this study apart from existing machine learning applications in biomedical informatics.
3. Method
This section provides an overview of how a dynamical system can be conceptualized using the principles of vector quantization. The model is constructed from sets of ordinary differential equations (ODEs). The neural ordinary differential equation (Neural ODE) model has attracted considerable attention in time series analysis due to its ability to handle irregular time steps, where data points are not observed at equally spaced intervals. In the context of multi-dimensional time series analysis, the critical task is to perform evolutionary subspace clustering, which helps to cluster temporal data based on their evolving low-dimensional subspace structures [34]. The method is categorized as an unsupervised neural network (NN) because no prior knowledge about the input patterns is available. Its design is based on developing an energy function that has a finite number of local minima. These local minima represent clusters within the overall data structure, and when no input pattern is present, the energy function’s surface is flat. The clustering process in ACS (Adaptive Competitive Self-Organizing) involves two primary processes: learning and recalling. During the learning phase, the energy surface is modified as sets of weights competitively self-adjust to label or cluster an encountered input pattern. In the recalling phase, a newly exposed input pattern modifies the nearest existing cluster. The efficacy of the ACS model is showcased through simulation results on both real and artificial datasets, alongside comparisons with other widely recognized clustering methods. The ACS method exhibited superior clustering performance in certain categories and comparable overall performance.
3.1. Construction of an ACS NN Influenced by the Lorentzian Function
The energy function is based on a sum of Lorentzian functions, which are grouped into clusters of similar input patterns. The energy function V is designed to establish a single valley at each input pattern or group of similar input patterns; these valleys serve as quantization points that identify clusters. Clusters appear as valleys on the energy surface, and their characteristics are defined by specific parameters such as depth and vigilance. The model requires dynamic adjustment of all these parameters to achieve the highest level of self-adaptation. In general, for input patterns in an N-dimensional space, the energy function V is designed to represent M valleys centered at specific locations. The energy function V is defined as follows:
is defined based on the Lorentzian function:
The model includes a constant that promotes the formation of multiple valleys (clusters) and controls the model’s sensitivity to forming new clusters, while $\lambda$ (the vigilance parameter) roughly determines the radius of the created valley. $\lambda$ has a direct effect on the size of the valleys: as it grows, the valleys become narrower. A further parameter determines the depth of the valleys. When the value of the Lorentzian term approaches 1, it indicates that the weight $W_s$ is becoming a cluster for the input pattern $U_p$. This term moves toward unity as the distance between the input pattern and the cluster center decreases, thus indicating the respective cluster for the input pattern.
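As an illustration only, one form consistent with the verbal description above is sketched below. It is an assumption, not the authors' definition: the symbol $L_{sp}$, the index ranges, and the unit depth are hypothetical, and the constant and depth parameters mentioned in the text are deliberately left out.

```latex
% Assumed Lorentzian term for weight W_s and input pattern U_p:
L_{sp} = \frac{1}{1 + \lambda \,\lVert U_p - W_s \rVert^{2}},
\qquad 0 < L_{sp} \le 1 .

% Assumed energy as a (negative) sum of Lorentzian terms,
% with M weights and P input patterns:
V = -\sum_{s=1}^{M} \sum_{p=1}^{P} L_{sp} .

% Under this form, L_{sp} = 1/2 when \lVert U_p - W_s \rVert = 1/\sqrt{\lambda},
% so a larger \lambda gives a narrower valley.
```

Under this assumed form, $L_{sp} \to 1$ as $W_s$ approaches $U_p$, which matches the statement that the term moves toward unity as the distance decreases.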
To present the dynamic system, we express it using the gradient descent (GD) technique. The weights, denoted as $W_s \in \mathbb{R}^N$, are updated iteratively as they gradually converge towards the desired set of input patterns. These weight updates are obtained by applying the GD optimizer to the energy function V with respect to the weights, as described below:
can be expressed as:
The dynamic system is defined as:
The dynamics outlined in the above equation cause a weight to align with the valley’s center, which represents an equilibrium point of V and is situated in the midst of a group of similar input patterns $U_p$. The energy function V, being a continuously differentiable function in the state space, can be regarded as a form of Lyapunov function. We examine the system’s (model’s) stability, assuming the widely accepted condition of
.
To determine the minimum value of V, it is sufficient to examine a restricted region within the state space. As the state
moves towards infinity, the energy function V progressively increases but does not exceed its maximum conceivable value, expressed as
. This property signifies that the system’s energy is perpetually restricted and will not surpass this upper limit.
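The gradient-descent dynamics can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a single weight and an energy of the form V(W) = -Σ_p 1 / (1 + λ‖U_p − W‖²), omits the competition and depth terms, and uses made-up input patterns, learning rate, and starting point.

```python
def lorentzian(u, w, lam):
    """Assumed Lorentzian term 1 / (1 + lam * ||u - w||^2)."""
    d2 = sum((ui - wi) ** 2 for ui, wi in zip(u, w))
    return 1.0 / (1.0 + lam * d2)


def energy(patterns, w, lam):
    """Assumed energy: negative sum of Lorentzian terms over all patterns."""
    return -sum(lorentzian(u, w, lam) for u in patterns)


def gd_step(patterns, w, lam, lr):
    """One gradient-descent step on the assumed energy.

    dV/dW = -2 * lam * sum_p (U_p - W) * L_p^2, so descending the
    gradient pulls W toward nearby patterns.
    """
    grad = [0.0] * len(w)
    for u in patterns:
        l = lorentzian(u, w, lam)
        for i in range(len(w)):
            grad[i] += -2.0 * lam * (u[i] - w[i]) * l * l
    return [wi - lr * gi for wi, gi in zip(w, grad)]


# Illustrative (assumed) group of similar input patterns near (1, 1).
patterns = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9)]
w = [0.5, 0.5]          # assumed initial weight
lam, lr = 10.0, 0.01    # vigilance value from the text; assumed learning rate

energy_start = energy(patterns, w, lam)
for _ in range(2000):
    w = gd_step(patterns, w, lam, lr)
energy_end = energy(patterns, w, lam)
```

In this sketch the weight settles close to the middle of the group of patterns and the energy decreases along the way, which is the Lyapunov-style behavior the text describes.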
3.2. Execution of the Proposed Framework on Generated Data
The first data set illustrates the principal concept of the proposed ACS model as it could be employed in VQ projects. One of the primary uses of vector quantization (VQ) is data compression, where each cluster can be denoted by its corresponding label. Figure 1 shows 20 input patterns in $\mathbb{R}^2$ uniformly distributed on a logarithmic scale (blue dots). Six weights are distributed in Figure 1. The trajectories and final locations of the weights $W_s$ are shown, and the labels match the equilibrium points of the energy function at the final weight positions. The inputs are consistently quantized into six clusters at the end of this dynamic process. In both simulations, the corresponding $\lambda$ is equal to 10 and
.
Figure 1. ACS VQ.
The blue dots mark the input vectors that we want to label using the weights (“neurons”), which are denoted by the black dots. The input vector value was about 0.1, and the initial weight value, which was chosen randomly, is about 0.1, so our objective is for the weights to move toward the input vector value.
Figure 1 shows the movement of the weights towards the input vectors at each iteration; about 2000 iterations were needed to achieve a reasonable result. The final value of the weight was about 0.1, with a vigilance parameter value of 10, as stated above. One of the critical factors in this model is $\lambda$, which plays a key role in the existence, stability, and shape of the results. We want to find a range of values for $\lambda$ that causes the weights to move to the middle of similar input patterns to form a cluster.
3.3. Determination of Vigilance Parameters
The vigilance parameter is often denoted by lambda ($\lambda$) and plays an important role in the vector quantization (VQ) model. This parameter helps control the “radius” or width of the valley formed during the clustering process. When the value of $\lambda$ is small, it results in a broader valley. This essentially means the cluster will cover a larger area in the data space, potentially encompassing more points within its boundary. Conversely, a larger $\lambda$ value will result in a narrower valley, meaning that the resulting cluster is more tightly focused on the centroid.
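The effect of the vigilance parameter on valley width can be made concrete with a small sketch. It assumes a Lorentzian term of the form 1 / (1 + λd²), where d is the distance to the valley center; under that assumption the term falls to half its peak at d = 1/√λ. The function names are illustrative.

```python
import math


def lorentzian(distance, lam):
    """Assumed Lorentzian term as a function of distance to the center."""
    return 1.0 / (1.0 + lam * distance ** 2)


def half_depth_radius(lam):
    """Distance at which the assumed Lorentzian drops to 1/2."""
    return 1.0 / math.sqrt(lam)


# Compare the valley radius for a few vigilance values
# (10 and 30 are the values quoted in the text; 1 is for contrast).
radii = {lam: half_depth_radius(lam) for lam in (1.0, 10.0, 30.0)}
for lam, radius in radii.items():
    print(f"lambda = {lam:5.1f}  half-depth radius = {radius:.3f}")
```

Under this assumed form, multiplying λ by four halves the radius, which is one way to reason about how tightly a cluster hugs its centroid.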
Making this parameter dynamic enables larger clusters to have an increased radius, ultimately leading to improved clustering. This dynamic behavior can be mathematically represented as:
where
,
(vigilance parameter) shows the radius of the created valley
. As
becomes smaller, valley
becomes wider. In this case, the depth of valley
,
(
), can be represented as:
As a result, the system’s dynamics can be represented as:
In the context of vector quantization and learning, choosing $\lambda$ is difficult; solving this problem presents a significant challenge due to the highly nonlinear nature of the system. In this section, we aim to derive analytical solutions, but only for specific types of input patterns. A lower $\lambda$ might make the model more generalized (less sensitive to individual data points), whereas a higher $\lambda$ could make the model more specific (more sensitive to individual data points).
The correct choice of $\lambda$ depends on the specific application and the nature of the data. It may require some experimentation or a validation set to choose the best $\lambda$.
3.4. Learning Rate
Learning rate is one of the most critical hyperparameters in the context of training neural networks and other machine learning algorithms. It determines the step size at each iteration while moving towards a minimum of the loss function. Setting the correct learning rate can be the difference between a model that converges quickly, one that takes too long, or one that doesn’t converge at all.
At the heart of training a neural network is the iterative process of adjusting the network’s weights to minimize a defined loss function. This is done using optimization algorithms, most notably gradient descent. The gradient points in the direction of the steepest increase in the loss function. Thus, to reduce the loss, one needs to move in the opposite direction by a certain step size. This step size is defined by the learning rate. A very high learning rate can cause the algorithm to overshoot the minimum of the loss function, resulting in divergence. Instead of converging to a solution, the error might grow uncontrollably. A small learning rate, while ensuring that the model will eventually converge to a minimum, can be very slow. This is because the updates to the weights are minuscule, making the algorithm take a very long time to reach the minimum. A learning rate that is set appropriately can allow the algorithm to converge rapidly to a good solution. Given the sensitivity of model training to the learning rate, researchers have proposed several methods that adjust the learning rate during training. Algorithms like AdaGrad, RMSprop, and Adam are popular adaptive learning rate methods [35]. These methods modulate the learning rate based on the historical gradient information, allowing faster convergence and alleviating some of the sensitivities to the initial learning rate value. Another strategy involves gradually reducing the learning rate during training. The idea is to start with a larger learning rate (but not too large) to make rapid progress and then reduce it to fine-tune the weights. Common schedules include step decay, exponential decay, and one-cycle learning [36].
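The three regimes described above (divergence, slow progress, fast convergence) can be reproduced on a toy problem. The sketch below minimizes f(w) = w², whose gradient is 2w; the specific learning rates, step count, and the `exponential_decay` helper are illustrative choices, not values from this study.

```python
def run_gd(lr, steps=50, w0=1.0):
    """Plain gradient descent on f(w) = w**2 (gradient 2*w)."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w


def exponential_decay(lr0, decay, step):
    """Exponentially decayed learning rate: lr0 * decay**step."""
    return lr0 * decay ** step


def run_gd_with_decay(lr0, decay, steps=50, w0=1.0):
    """Gradient descent on f(w) = w**2 with a decaying learning rate."""
    w = w0
    for step in range(steps):
        w -= exponential_decay(lr0, decay, step) * 2.0 * w
    return w


w_too_high = run_gd(1.1)     # each step multiplies w by -1.2: diverges
w_too_low = run_gd(0.001)    # each step multiplies w by 0.998: very slow
w_good = run_gd(0.3)         # each step multiplies w by 0.4: fast
w_decayed = run_gd_with_decay(0.4, 0.9)  # large first steps, then finer ones
```

On this quadratic the update is w ← (1 − 2η)w, so the method converges only when |1 − 2η| < 1; real loss surfaces are less tidy, which is why adaptive methods and schedules are used in practice.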
3.5. Cluster Simulation
To assess the proficiency of the suggested model in addressing data clustering tasks, the renowned breast cancer dataset from the UCI machine learning repository (http://archive.ics.uci.edu/ml/) was considered.
The breast cancer dataset comprises 569 input patterns characterized by 30-dimensional real vectors (radius1, texture1, perimeter1, area1, smoothness1, compactness1, concavity1, concave_points1, symmetry1, fractal_dimension1, radius2, texture2, perimeter2, area2, smoothness2, compactness2, concavity2, concave_points2, symmetry2, fractal_dimension2, radius3, texture3, perimeter3, area3, smoothness3, compactness3, concavity3, concave_points3, symmetry3, fractal_dimension3).
This dataset is categorized into two classes (M = malignant, B = benign). The vigilance parameter ($\lambda$) is set to 30 in this example. Once the weights settle at the cluster centroids and the set values for
from the previously trained ANN (energy function) are secured, the input pattern dynamics can be activated for all input patterns on the energy surface. This action leads to a dynamic realignment of input patterns towards the previously determined class centers.
Using the confusion matrix, the accuracy is evaluated by comparing the class predictions identified by the ACS method (each row in the confusion matrix) with the true class of the inputs (each column in the matrix). The findings are summarized in Figure 2.
Figure 2. Confusion matrix of breast cancer data.
The confusion matrix for the breast cancer data indicates that 22 out of 357 benign cases were incorrectly classified, while in the malignant group, 16 out of 212 cases were misclassified. Therefore, the accuracy of the dynamic model on this dataset is 93%, compared to 90% for k-means. K-means clustering works best on datasets where the clusters are spherical and well separated. If the dataset has complex relationships or non-linear separability between classes, k-means might not capture these nuances effectively. Since the task is a binary classification for breast cancer detection, the cost of false negatives (missing a malignant tumor) is significantly higher than that of false positives [36] [37]. Therefore, other performance metrics, such as recall (sensitivity or true positive rate) and area under the ROC curve (AUC), have also been reported. A higher AUC (greater than 85%) indicates that the model can better separate malignant from benign cases. The recall is 92.5% (83% for k-means) and the AUC is 92% (89% for k-means), further supporting the superiority of our model.
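The reported accuracy and recall follow directly from the misclassification counts given for the confusion matrix. The short check below is a sketch that treats malignant as the positive class and derives the four confusion-matrix cells from those counts; the variable names are illustrative, and the AUC is not reproduced because it depends on scores that are not given in the text.

```python
# Counts taken from the text: 357 benign cases (22 misclassified)
# and 212 malignant cases (16 misclassified).
benign_total, benign_wrong = 357, 22
malignant_total, malignant_wrong = 212, 16

# Malignant is treated as the positive class (an assumption of this sketch).
true_positive = malignant_total - malignant_wrong   # malignant found
false_negative = malignant_wrong                    # malignant missed
true_negative = benign_total - benign_wrong         # benign recognized
false_positive = benign_wrong                       # benign flagged

total = benign_total + malignant_total

accuracy = (true_positive + true_negative) / total
recall = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)
precision = true_positive / (true_positive + false_positive)

print(f"accuracy    = {accuracy:.3f}")
print(f"recall      = {recall:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"precision   = {precision:.3f}")
```

The accuracy and recall computed this way agree with the 93% and 92.5% quoted above; specificity and precision are included only as additional context derived from the same counts.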
4. Conclusion
This study introduces the Adaptive Competitive Self-organizing (ACS) model as an effective and dynamic approach for clustering and vector quantization (VQ) in breast cancer classification. Unlike conventional machine learning methods, ACS leverages ordinary differential equations (ODEs) and a self-organizing mechanism that enables fast convergence without requiring external control parameters. By integrating gradient descent (GD) with competition dynamics, the model effectively minimizes parasitic limit points commonly found in artificial neural networks (ANNs), enhancing clustering stability and accuracy [38]. The ACS system dynamically adjusts to new input patterns, either refining existing clusters or forming new ones based on data distribution. Applied to the Breast Cancer Wisconsin Diagnostic dataset, the model successfully distinguished between benign and malignant cases, demonstrating its potential in medical diagnostics. This research sets itself apart from previous studies by employing an adaptive clustering technique rather than conventional classification algorithms, providing a more flexible and robust solution for handling high-dimensional medical data. The findings highlight the ACS model’s promise in breast cancer detection, classification, and early diagnosis, paving the way for more efficient, data-driven decision-making in healthcare applications.