Consistent and Specific Multi-View Functional Brain Networks Fusion for Autism Spectrum Disorder Diagnosis
1. Introduction
As a widespread neurodevelopmental disability, Autism Spectrum Disorder (ASD) is primarily characterised by difficulties with social interaction, communication disturbances, and restricted, repetitive interests [1]. The prevalence of ASD has been increasing in recent years [2], yet the pathological causes of its emergence and development remain unexplained. Currently, the diagnosis of individuals with ASD is based mainly on behavioural descriptions of symptoms and clinical observations, which are prone to subjectivity [3] [4]. There is thus an urgent need for reliable biomarkers for the early and objective diagnosis of patients with ASD [5] [6] [7] [8].
To tackle this challenge, different neuroimaging techniques have been applied to diagnostic research on ASD, including structural magnetic resonance imaging (sMRI) [9], functional magnetic resonance imaging (fMRI) [10], diffusion tensor imaging (DTI) [11], and electroencephalography (EEG) [12]. Among them, a large proportion of fMRI studies involve task-based and resting-state fMRI (rs-fMRI) data. Compared to task-based fMRI, rs-fMRI avoids the challenges of task design and the confounding effects of task execution, and is thus better suited to the early detection and diagnosis of ASD. Rs-fMRI data can be used to calculate functional connectivities (FCs) between different regions of interest (ROIs) by detecting synchronous temporal changes in blood-oxygen-level-dependent (BOLD) signals [13], thereby constructing a functional brain network (FBN).
To date, FBNs based on rs-fMRI have been an important source of biomarkers for ASD identification. Researchers have invested a great deal of effort in developing various FBN construction methods [14] [15] [16], such as Pearson's correlation (PC), Bayesian networks, the Granger causality model (GCM), the structural equation model (SEM), and so on. While these strategies can effectively estimate FBNs in certain cases, a single FBN constructed with one method, such as PC or GCM, often fails to adequately capture the subtle disruptions in functional brain organization caused by neuropsychiatric disorders [17], since a single construction strategy tends to impose a singular preference when modeling the relationships between ROIs.
Multi-view learning has been a popular topic in recent years [18] [19]; it uses features from multiple complementary views to represent an object integrally. In multi-view learning, the features obtained from different views generally model different priors, and fusing these features tends to yield a better representation, which to some extent mitigates the defects of single-view learning. Motivated by this idea, for studying functional brain abnormalities, we regard different FBNs constructed from the same brain as different "views". We expect each FBN view to capture different features (i.e., correlations between ROIs), and we fuse the multi-view FBNs to provide a more comprehensive description of the brain, with the aim of improving the performance of brain disease diagnosis.
In the last decade, several studies have explored constructing multiple FBNs and fusing them through multi-view learning for brain disease diagnosis [20] [21] [22]. For instance, Jie et al. [20] set multiple thresholds on PC-constructed FBNs to generate multi-threshold FBNs and then used a multi-kernel Support Vector Machine (SVM) to fuse them for classifying brain diseases. Huang et al. [21] constructed multiple FBNs with different levels of sparsity by setting different regularization parameters for a group-constrained sparse regression model, and then fused all FBNs to enhance the feature representation. Yang et al. [22] dynamically generated multiple FBNs by applying potentially different thresholds to different elements of PC-constructed FBNs, and the fused networks were then fed into an SVM to classify brain diseases. Although these methods have achieved satisfactory results in brain disease diagnosis, they generally obtain multi-view FBNs by setting different thresholds on a fixed FBN constructed by a single method such as PC or sparse representation (SR). However, such a fixed FBN only models one type of correlation between ROIs. For example, a PC-constructed network only captures the full correlation between ROIs, and an SR-constructed one corresponds to the partial correlations. That is to say, the aforementioned multi-view FBNs in essence model a single type of correlation between ROIs at different levels, which makes it difficult to obtain a good brain representation, since the brain is extremely complex.
To address this problem, we recently fused different types of FBNs by joint embedding [23], but the fused FBNs mainly captured the information common to the multiple FBNs, ignoring their unique parts. Therefore, in this paper, we propose a Consistent and Specific Multi-view FBNs Fusion (CSMF) method. Specifically, for each subject, we first construct multi-view FBNs using multiple typical construction strategies, including PC, SR, mutual information (MI), and correlation's correlation (CC), which encode full-correlation, partial-correlation, non-linear, and higher-order relationships between ROIs, respectively. We then decompose the multi-view FBNs into a common representation matrix capturing their consistency information and multiple specific matrices corresponding to their unique information. The common matrix and the average of the specific matrices are fused as the final FBN for ASD classification. Finally, the classification performance is validated on the Autism Brain Imaging Data Exchange (ABIDE) dataset.
The main contributions of this work are shown below.
1) We propose a novel multi-view FBNs fusion framework, CSMF, which fuses different types of relationships among ROIs, such as full correlation, partial correlation and higher-order correlation, instead of a single type of relationship at different levels as in existing fusion methods. This is more beneficial for obtaining a comprehensive brain representation.
2) In CSMF, the proposed fusion strategy operates in the latent representation spaces of the multi-view FBNs to simultaneously learn their consistent and specific information, rather than directly combining their original features as in most fusion methods. Such a fusion approach potentially captures the connections among ROIs more comprehensively.
3) Extensive experiments for ASD diagnosis on the ABIDE dataset have shown that our model outperforms several state-of-the-art fusion methods.
2. Materials and Method
The pipeline of our proposed framework is shown in Figure 1. It is divided into four steps: data preparation, multi-view FBNs construction, multi-view FBNs fusion, and feature selection and classification.
2.1. Data Preparation
We perform extensive experiments on two ASD datasets to validate the proposed method. The rs-fMRI scans for these datasets are obtained from the ABIDE database, which was collected from 17 different imaging sites and consists of 1112 subjects, including 539 ASD patients and 573 normal controls (NCs). Considering data volume, we select the two imaging sites with the largest numbers of subjects, New York University (NYU) and the University of Michigan (UM), each of which has more than 100 subjects. Specifically, the NYU site consists of 79 ASDs and 105 NCs, while the UM site has 66 ASDs and 77 NCs. In addition, images at different sites were acquired with different scanning equipment, which results in variations in scan times. Detailed demographic information is listed in Table 1.
Table 1. Demographic information of subjects at two sites. M and F indicate male and female.
Figure 1. Illustration of the proposed multi-view FBN fusion method, including four major steps: 1) Data preparation: preprocessing the rs-fMRI data; 2) Multi-view FBNs construction: calculating the correlation between each pair of ROIs using multiple methods (e.g., PC, SR, MI and CC); 3) Multi-view FBNs fusion: decomposing the multiple FBNs of each subject into a consistency matrix and specificity matrices, which are then fused; 4) ASD classification: classifying the fused FBNs with an SVM.
This study uses the Data Processing Assistant for Resting-State fMRI (DPARSFA) to pre-process the rs-fMRI data of all subjects [24]. The specific pre-processing steps are as follows: 1) slice-timing and head-motion correction; 2) alignment to the Montreal Neurological Institute (MNI) space; 3) temporal filtering (0.01–0.1 Hz); 4) nuisance signal regression. Finally, for each subject, the brain is divided into 116 predefined ROIs based on the automated anatomical labeling (AAL) atlas [25], and the mean time series of each ROI is extracted as its representative signal [26].
2.2. Preliminaries
Before introducing the consistent and specific multi-view FBNs fusion method, we first introduce some common notations used in this paper, summarized in Table 2.
2.3. Multi-View FBNs Construction
As mentioned earlier, we choose four typical methods to construct FBNs: PC [27], SR [28], MI [29], and CC [30]. These four methods capture different types of interactions between ROIs according to their different model preferences. Their formulations are as follows:
$$W_{ij} = \frac{(x_i - \bar{x}_i)^{\top}(x_j - \bar{x}_j)}{\|x_i - \bar{x}_i\|_2 \, \|x_j - \bar{x}_j\|_2}, \quad (1)$$

$$\min_{W} \sum_{i=1}^{n} \left( \left\| x_i - \sum_{j \neq i} W_{ij} x_j \right\|_2^2 + \lambda \sum_{j \neq i} |W_{ij}| \right), \ \mathrm{s.t.}\ W_{ii} = 0, \quad (2)$$

$$MI(x_i, x_j) = \sum p(x_i, x_j) \log \frac{p(x_i, x_j)}{p(x_i)\, p(x_j)}, \quad (3)$$

where $x_i \in \mathbb{R}^{t}$ is the average BOLD signal of the $i$-th ROI, $n$ is the number of ROIs, and $t$ is the number of time points. $\bar{x}_i$ is the average of all elements of $x_i$. Equation (1) gives the edge weight between two ROIs calculated by PC. Equation (2) obtains a sparse FBN by sparsely encoding the edges of the FBN through the $\ell_1$ penalty, avoiding the trivial solution via the constraint $W_{ii} = 0$. Equation (3) shows how MI calculates the FBN, where $p(x_i, x_j)$ is the joint probability distribution of $x_i$ and $x_j$, and $p(x_i)$ and $p(x_j)$ are the marginal probability distributions of $x_i$ and $x_j$, respectively. For CC, we treat each column of the PC-constructed FBN as a new variable and apply Equation (1) again to obtain the correlation's correlation.

Table 2. Some important symbolic notations used in this paper.
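As an illustration, the closed-form constructions above can be sketched in a few lines of NumPy. The function names and the histogram-based MI estimator are our own illustrative choices, not part of the original pipeline:

```python
import numpy as np

def pc_fbn(ts):
    """Pearson-correlation FBN (Eq. (1)) from a (T, n) time-series matrix."""
    return np.corrcoef(ts.T)          # (n, n) full-correlation matrix

def cc_fbn(ts):
    """Correlation's correlation FBN: correlate the columns of the PC matrix,
    capturing higher-order relations between ROIs."""
    return np.corrcoef(pc_fbn(ts).T)

def mi_fbn(ts, bins=16):
    """Simple histogram (plug-in) estimator of the MI FBN (Eq. (3))."""
    T, n = ts.shape
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            pxy, _, _ = np.histogram2d(ts[:, i], ts[:, j], bins=bins)
            pxy /= T                          # empirical joint distribution
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            nz = pxy > 0                      # avoid log(0)
            W[i, j] = W[j, i] = np.sum(
                pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))
    return W
```

Applying these to a (time points x ROIs) matrix of mean BOLD series yields one n x n FBN per method; an SR network would additionally require solving the L1-penalized regression of Equation (2), e.g. with a lasso solver.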
The above four methods calculate dependencies between pairs of ROIs. Among them, the FBNs constructed by PC, MI and CC are dense and contain some spurious connections. Therefore, in order to remove redundant information, we empirically set a group of sparsity thresholds [31], where, for example, a threshold of 90% means that the 10% weakest edges are filtered out of the FBN. For SR, the regularization parameter controls the sparsity of the FBN, and we set a corresponding range of candidate values [32]. Each method thus generates 11 FBNs per subject under different parameters, and the FBN corresponding to the best parameter of each method is selected for fusion. Specifically, we determine the optimal parameter value for each method on the training set via an inner leave-one-out cross-validation (LOOCV) loop.
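A proportional threshold of the kind described above can be sketched as follows; the function name and the quantile-based implementation are illustrative assumptions:

```python
import numpy as np

def sparsify(W, keep=0.90):
    """Keep the `keep` fraction of the strongest off-diagonal edges (by
    absolute weight) and zero the rest; e.g. keep=0.90 filters out the
    10% weakest edges of the FBN."""
    A = W.copy()
    np.fill_diagonal(A, 0.0)                      # no self-connections
    mag = np.abs(A[np.triu_indices_from(A, k=1)])
    cut = np.quantile(mag, 1.0 - keep)            # magnitude cutoff
    A[np.abs(A) < cut] = 0.0
    return A
```

For a symmetric input FBN the result stays symmetric, since the magnitude test is applied elementwise to both triangles.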
2.4. Multi-View FBNs Fusion
2.4.1. Formulation
Considering the limited modelling ability of a single-view FBN, in this subsection we fuse multiple different FBNs using multi-view subspace representation learning [33] [34] [35] [36]. The sub-representation of each FBN is given as follows:
$$X^{(v)} = X^{(v)} Z^{(v)} + E^{(v)}, \quad (4)$$

where $X^{(v)} \in \mathbb{R}^{n \times n}$ denotes the adjacency matrix of the $v$-th FBN, $Z^{(v)} \in \mathbb{R}^{n \times n}$ is the corresponding reconstruction representation matrix (also called the sub-representation matrix), $v = 1, \ldots, V$, where $n$ and $V$ are the numbers of ROIs and FBNs, respectively, and $E^{(v)}$ is the reconstruction error term.

To capture the consistency/common information across the multi-view FBNs as well as their specificity/uniqueness, we further decompose each sub-representation matrix as $Z^{(v)} = C + S^{(v)}$, where $C$ denotes the consistency information shared across FBNs and $S^{(v)}$ corresponds to the information specific to the $v$-th FBN. Equation (4) can then be further transformed into:

$$X^{(v)} = X^{(v)} \left( C + S^{(v)} \right) + E^{(v)}. \quad (5)$$

Moreover, we introduce prior information on the consistency matrix and the specificity matrices by imposing regularization. In particular, for the consistency matrix $C$, we explore the information shared between the different FBNs using a low-rank constraint via the nuclear norm. The specificity matrix $S^{(v)}$ of each FBN is constrained using the Frobenius norm. In addition, we assume that the noise obeys a Gaussian distribution and penalise the error term with the Frobenius norm. The final objective function can be formulated as follows:

$$\min_{C, \{S^{(v)}\}} \sum_{v=1}^{V} \left\| X^{(v)} - X^{(v)} \left( C + S^{(v)} \right) \right\|_F^2 + \alpha \|C\|_* + \beta \sum_{v=1}^{V} \left\| S^{(v)} \right\|_F^2, \quad (6)$$

where $\|\cdot\|_*$ denotes the nuclear norm, $\|\cdot\|_F$ is the Frobenius norm, and $\alpha, \beta > 0$ are trade-off parameters.

Thus, the consistent and the multiple specific representations across the multi-view FBNs can be obtained by solving objective (6). Using them, we obtain the final FBN according to the following formula:

$$F = C + \frac{1}{V} \sum_{v=1}^{V} S^{(v)}. \quad (7)$$

Furthermore, in order to make the final FBN $F$ symmetric, we rewrite Equation (7) in the following way:

$$F = \frac{1}{2} \left[ \left( C + \frac{1}{V} \sum_{v=1}^{V} S^{(v)} \right) + \left( C + \frac{1}{V} \sum_{v=1}^{V} S^{(v)} \right)^{\top} \right]. \quad (8)$$
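The fusion of Equations (7) and (8) amounts to one averaging step and one symmetrization step; a minimal sketch (with an assumed function name) is:

```python
import numpy as np

def fuse(C, S_list):
    """Final FBN (Eqs. (7)-(8)): consistent part plus the average of the
    specific parts, then symmetrized so F can serve as an adjacency matrix."""
    F = C + np.mean(S_list, axis=0)
    return 0.5 * (F + F.T)
```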
2.4.2. Optimization
The variables in Equation (6) are coupled with each other, so we adopt the Alternating Direction Minimization strategy [37] to solve it. To make the problem separable, we introduce an auxiliary variable $J$ in place of $C$ in the reconstruction term. Accordingly, Equation (6) can be re-expressed as follows:

$$\min_{C, J, \{S^{(v)}\}} \sum_{v=1}^{V} \left\| X^{(v)} - X^{(v)} \left( J + S^{(v)} \right) \right\|_F^2 + \alpha \|C\|_* + \beta \sum_{v=1}^{V} \left\| S^{(v)} \right\|_F^2, \ \mathrm{s.t.}\ J = C. \quad (9)$$
The Augmented Lagrange Multiplier (ALM) [37] method is then used to solve the above equation:

$$\mathcal{L} = \sum_{v=1}^{V} \left\| X^{(v)} - X^{(v)} \left( J + S^{(v)} \right) \right\|_F^2 + \alpha \|C\|_* + \beta \sum_{v=1}^{V} \left\| S^{(v)} \right\|_F^2 + \langle Y, J - C \rangle + \frac{\mu}{2} \| J - C \|_F^2, \quad (10)$$

where $\mu > 0$ is the penalty parameter and $Y$ is the Lagrange multiplier. In each iteration, each variable is updated as follows.
J-Subproblem. With the other variables fixed, the variable $J$ can be updated by optimizing the following minimization problem:

$$\min_{J} \sum_{v=1}^{V} \left\| X^{(v)} - X^{(v)} \left( J + S^{(v)} \right) \right\|_F^2 + \frac{\mu}{2} \left\| J - C + \frac{Y}{\mu} \right\|_F^2. \quad (11)$$

Taking the partial derivative with respect to $J$ and setting it to zero, $J$ can be updated as follows:

$$J = \left( 2 \sum_{v=1}^{V} X^{(v)\top} X^{(v)} + \mu I \right)^{-1} \left( 2 \sum_{v=1}^{V} X^{(v)\top} X^{(v)} \left( I - S^{(v)} \right) + \mu C - Y \right), \quad (12)$$

where $I$ is an identity matrix.
C-Subproblem. With the other variables fixed, the variable $C$ can be updated by optimizing the following minimization problem:

$$\min_{C} \alpha \|C\|_* + \frac{\mu}{2} \left\| C - \left( J + \frac{Y}{\mu} \right) \right\|_F^2. \quad (13)$$

This problem can be solved by Singular Value Thresholding (SVT), as follows:

$$C = U \, \mathcal{S}_{\alpha/\mu}(\Sigma) \, V^{\top}, \quad (14)$$

where $U \Sigma V^{\top}$ is the singular value decomposition (SVD) of $J + Y/\mu$, and $\mathcal{S}_{\tau}(\cdot)$ is the shrinkage operator, which is defined as

$$\mathcal{S}_{\tau}(x) = \mathrm{sign}(x) \max(|x| - \tau, 0). \quad (15)$$
$S^{(v)}$-Subproblem. With the other variables fixed, the variables $S^{(v)}$ can be updated by optimizing the following minimization problem:

$$\min_{\{S^{(v)}\}} \sum_{v=1}^{V} \left\| X^{(v)} - X^{(v)} \left( J + S^{(v)} \right) \right\|_F^2 + \beta \sum_{v=1}^{V} \left\| S^{(v)} \right\|_F^2. \quad (16)$$

Since the specificity matrix $S^{(v)}$ can be updated separately for each FBN, Equation (16) is rewritten as follows:

$$\min_{S^{(v)}} \left\| X^{(v)} - X^{(v)} \left( J + S^{(v)} \right) \right\|_F^2 + \beta \left\| S^{(v)} \right\|_F^2. \quad (17)$$

Taking the partial derivative with respect to $S^{(v)}$ and setting it to zero, $S^{(v)}$ can be updated as follows:

$$S^{(v)} = \left( X^{(v)\top} X^{(v)} + \beta I \right)^{-1} X^{(v)\top} X^{(v)} \left( I - J \right). \quad (18)$$
Update Multiplier. The multiplier $Y$ is updated by:

$$Y = Y + \mu \left( J - C \right). \quad (19)$$
The variables are iterated alternately until $\| J - C \|_{\infty}$ converges to a small number $\epsilon$ or the number of iterations reaches a maximum, where $\| \cdot \|_{\infty}$ denotes the maximum absolute value of the matrix elements.
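Under the updates above, the whole optimization is a short alternating loop. The following NumPy sketch is our illustrative reading of it; the function names, the fixed penalty mu, and the zero initializations are assumptions, not the authors' reference implementation:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: U * shrink(Sigma, tau) * Vt (Eqs. (14)-(15))."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)

def csmf(Xs, alpha=1.0, beta=1.0, mu=1.0, max_iter=100, eps=1e-6):
    """Alternating minimization for objective (6) via the ALM of Eq. (10).
    Xs: list of V adjacency matrices, each (n, n)."""
    V, n = len(Xs), Xs[0].shape[0]
    C = np.zeros((n, n)); J = np.zeros((n, n)); Y = np.zeros((n, n))
    Ss = [np.zeros((n, n)) for _ in range(V)]
    G = [X.T @ X for X in Xs]                       # cached Gram matrices
    I = np.eye(n)
    for _ in range(max_iter):
        # J-update (Eq. (12)): regularized least squares, proximity to C
        A = 2.0 * sum(G) + mu * I
        B = 2.0 * sum(G[v] @ (I - Ss[v]) for v in range(V)) + mu * C - Y
        J = np.linalg.solve(A, B)
        # C-update (Eq. (14)): singular value thresholding
        C = svt(J + Y / mu, alpha / mu)
        # S^(v)-update (Eq. (18)): ridge-type solve per view
        for v in range(V):
            Ss[v] = np.linalg.solve(G[v] + beta * I, G[v] @ (I - J))
        # multiplier ascent (Eq. (19)) and convergence check
        Y = Y + mu * (J - C)
        if np.abs(J - C).max() < eps:
            break
    return C, Ss
```

The returned consistency matrix and specificity matrices can then be combined as in Equations (7) and (8) to obtain the final fused FBN.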
2.5. Feature Selection and Classification
As aforementioned, for each subject, the fused FBN obtained with Equation (8) is utilized to identify subjects with ASD among the NCs, with the edge weights of the FBN serving as features. To alleviate the problem of high dimensionality, we employ the simplest feature selection method (a t-test with a fixed p-value threshold) and an SVM classifier (linear kernel with the default parameter) in our experiments.
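A hypothetical sketch of this selection-plus-classification step, using SciPy's two-sample t-test and scikit-learn's linear SVM; the function name and the p < 0.05 default here are our assumptions:

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.svm import SVC

def select_and_fit(train_feats, train_labels, p_thresh=0.05):
    """t-test screening on vectorized FBN edge weights, then a linear SVM.
    train_feats: (subjects, edges); train_labels: 1 = ASD, 0 = NC."""
    pos = train_feats[train_labels == 1]
    neg = train_feats[train_labels == 0]
    _, p = ttest_ind(pos, neg, axis=0)     # per-edge group comparison
    mask = p < p_thresh
    if not mask.any():                     # degenerate case: keep everything
        mask = np.ones(train_feats.shape[1], bool)
    clf = SVC(kernel="linear").fit(train_feats[:, mask], train_labels)
    return clf, mask
```

At test time, the same `mask` learned on the training fold must be applied to the held-out subject's features before calling `clf.predict`.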
3. Experiments
3.1. Experimental Setting
In our experiments, we evaluate all methods using leave-one-out cross-validation (LOOCV): in each fold, one subject is left out for testing, while the remaining subjects are used for training. On both imaging sites, the trade-off parameters α and β of our algorithm are tuned over the set [0.01, 0.1, 1, 10, 100].
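The LOOCV protocol can be sketched as follows (illustrative function name; the inner parameter-tuning loop over α and β is omitted for brevity):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

def loocv_accuracy(X, y):
    """Leave-one-out evaluation: each subject is the test set exactly once."""
    preds = np.empty_like(y)
    for tr, te in LeaveOneOut().split(X):
        clf = SVC(kernel="linear").fit(X[tr], y[tr])
        preds[te] = clf.predict(X[te])
    return (preds == y).mean()
```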
Four evaluation metrics are used to evaluate the classification performance of the different methods, including accuracy (ACC), sensitivity (SEN), specificity (SPE), and the area under the receiver operating characteristic curve (AUC).
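For reference, ACC, SEN, and SPE follow directly from the confusion-matrix counts, with ASD taken as the positive class; AUC would additionally require the classifier's decision scores (e.g. via `sklearn.metrics.roc_auc_score`). A minimal sketch:

```python
import numpy as np

def metrics(y_true, y_pred):
    """ACC, SEN (true-positive rate), SPE (true-negative rate) from labels,
    with ASD coded as the positive class (1)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / len(y_true)
    sen = tp / (tp + fn) if tp + fn else 0.0
    spe = tn / (tn + fp) if tn + fp else 0.0
    return acc, sen, spe
```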
3.2. Comparison Methods
We compare the proposed method with several state-of-the-art fusion strategies based on FBNs. The details are as follows:
• GKTN [20] : This method extracts features from multiple thresholded brain networks and uses multi-kernel learning to integrate them for classification.
• DTN [22] : This method dynamically constructs multiple FBNs using dynamic thresholds for each connection in the PC-based FBN. Then, the similarity network fusion (SNF) algorithm is used to fuse these FBNs.
• MNER [21] : This method generates multiple FBNs with different degrees of sparsity by varying the regularization parameters of the group-constrained regression model, and then fuses them by means of a multi-kernel combination.
• BMGF [38] : This method automatically learns the connectivity of brain regions by fusing a fully connected FBN and a 1-nearest-neighbor FBN, considering the influences of inter-subject variability and across-subject heterogeneity.
3.3. Results
Table 3 shows the classification results for all methods in the ASD vs. NC classification task at both the NYU and UM sites. From Table 3, it can be seen that our proposed method outperforms the compared methods in most cases.
Table 3. Performance of the proposed method and four comparison methods on the ASD versus NC classification task. The best classification performance is shown in bold.
Specifically, our method improves the classification accuracy by 2.18% and 3.34% on the NYU and UM sites respectively, compared to the best-performing compared method. This may be because the multi-view FBNs utilised in our method are constructed by several different strategies, which model different types of relationships among brain ROIs, such as full correlation, partial correlation, non-linear relationship, and higher-order correlation. Another possible reason is that our fusion strategy takes into account both the consistency and the individual heterogeneity information among the multi-view FBNs via multi-view subspace representation learning, which makes it easier to obtain a comprehensive representation of the brain.
4. Discussion
4.1. Ablation Study
In this section, we perform an ablation study to compare our proposed fusion method, CSMF, with two special cases: CMF, which only considers the consistency across all FBNs, and SMF, which only utilises the specificity of each FBN. The comparison results are shown in Table 4. Compared to CMF (SMF), our method improves the classification accuracy on the two sites by 3.39% (11.42%) and 9.28% (18.05%), respectively. This suggests that, by effectively combining the consistent and specific information of multi-view FBNs, our method can better characterize the functional connectivity of the brain and improve brain disease classification performance.
4.2. Sensitivity to Parameters
In this section, we discuss the effect of the model parameters α and β on the classification performance. In Equation (6), α and β balance the consistency constraint term and the specificity constraint term of the multiple FBNs, respectively. Figure 2 reports the classification accuracy of our method under different parameter settings on both the NYU and UM sites. We find that the classification performance decreases rapidly when α or β tends to 0, which is concordant with the results of the ablation study. Furthermore, favorable classification
Table 4. Classification results of the ablation study on consistency and specificity. The best classification performance is shown in bold.
Figure 2. Influence of the parameters α and β on classification performance on the ABIDE dataset, where (a) represents the NYU site and (b) represents the UM site.
results can be achieved when the two parameters are set to appropriate values for the different site data. This verifies that both the consistency information and the specificity information among multi-view FBNs are essential to the improvement of classification performance.
4.3. Influence of Proposed Fusion Strategy
In this section, to further verify the validity of the proposed fusion strategy, we compare CSMF with the single-view FBN cases (i.e., PC, SR, MI, CC) in Table 5. It can be observed that CSMF outperforms the other four approaches. For example, our method improves the classification accuracy by 8.35% and 10.00% over the second-best method on the NYU and UM data, respectively. These results demonstrate the existence of complementary information between different types of FBNs, which can improve the classification performance for brain diseases.
4.4. Identified Discriminative Features
As mentioned previously, we view the functional connectivities between ROIs as features for identifying ASDs among the NCs in the dataset. To visualize the features associated with ASD, we choose the 27 and 51 most discriminative features for the NYU and UM sites, respectively, based on a t-test with a p-value of 0.001, as shown in Figure 3. The thickness of the arcs indicates the discriminative power of the functional connectivities; the colors of the arcs are randomly generated purely for visual clarity. As shown in Figure 3, the ROIs associated with the top discriminative features on the NYU site include the right middle frontal gyrus, right hippocampus, right amygdala, and bilateral putamen; on the UM site, they include the bilateral thalamus and the right parahippocampal gyrus. These results are consistent with previous studies [39] [40] [41].
Figure 3. The most discriminating connections identified at (a) the NYU site and (b) the UM site. The thickness of the connection is inversely proportional to the p-value.
Table 5. Classification results for CSMF and the four single-view methods (i.e., PC, SR, MI, CC). The best classification performance is shown in bold.
5. Conclusion
In this paper, we propose a novel multi-view FBNs fusion method, named Consistent and Specific Multi-view FBN Fusion (CSMF). The method draws on the idea of multi-view subspace learning to reconstruct FBNs by fusing the consistency and specificity information among multiple different types of FBNs that capture different types of relationships among ROIs. The experimental results verified its effectiveness in identifying ASD patients on the two sites compared to several single-view FBN methods and state-of-the-art fusion strategies. However, the fusion process in CSMF is separated from the downstream task, such as brain disease diagnosis, which makes it difficult to guarantee that the fused FBN is beneficial to the final task. In future work, we will integrate the fusion process and the final task into a unified framework, so that downstream classification tasks can effectively guide the fusion of FBNs.
Fund
This work was partly supported by the National Natural Science Foundation of China (No. 62176112) and the Natural Science Foundation of Shandong Province (No. ZR202102270451).
Author Contribution
C. Zhang and C. Wang contributed equally to this work and share first authorship.