Identification of Common Gene Characteristics and Pathways between SARA-CoV-2 and Femoral Head Necrosis by Bioinformatics and Systems Biology Methods ()
1. Introduction
Coronaviruses are a diverse group of viruses that can infect various animals, including humans, and can cause respiratory and digestive system complications, significantly impacting individuals’ well-being [1]. The first cases of a new coronavirus, SARS-CoV-2, were reported in late 2019. The most frequently observed clinical symptoms include dry cough, fever, and shortness of breath. Additionally, some patients may exhibit other symptoms, including sore throat, headache, muscle pain, fatigue, and diarrhea, imposing a heavy burden on the health of the world’s population and on the economy [2] [3].
Studies have shown that the systemic inflammatory response caused by SARS-CoV-2 can adversely affect the human skeletal system, with ischemic osteonecrosis and osteoporosis being the main skeletal sequelae [4]. Osteonecrosis has been frequently reported in the literature on SARS, with the majority of these cases affecting the femoral head, but with a lower frequency of involvement of the knee joint, the humeral head, the talus, the calcaneus, the heel, and other anatomical sites [5]. Although prolonged and heavy use of steroids is an important factor in the development of ischemic osteonecrosis, it does not fully explain the development of osteonecrosis after SARS-CoV-2 infection. In addition, bone infarction and ischemic necrosis of large joints and long bones have been observed after infection with neocoronaviruses [6]. Therefore, the role of SARS-CoV-2 in the pathogenesis of ischemic osteonecrosis has aroused our research interest.
In the current study, four datasets were used to explore the relationship between SARS-CoV-2 and femoral head necrosis.
2. Methods
2.1. Data Collection
To determine the genetic interrelationship between SARS-CoV-2 and osteonecrosis of the femoral head (ONFH), we downloaded one microarray data GSE74089, and three RNA-seq dataset GSE157103, GSE152418 and GSE171110. For the training sets, we used GSE157103 and GSE152418. For the validation sets, we used GSE74089 and GSE171110. Figure 1 illustrates the General workflow diagram of this study.
2.2. Identification of Differentially Expressed Genes (DEGs) and Co-Expressed Genes in ONFH and SARS-CoV-2
Differential expression analysis was performed on the microarray and high-throughput sequencing datasets using the “Limma” (version 3.40.6) and “DESeq2” software packages, respectively. Significantly differentially expressed genes (DEGs) were screened in all datasets using threshold criteria (p-value < 0.05 and |logFC| > 1.0). Volcano and heat maps were used to visualize the expression profiles of DEGs. Genes co-expressed were identified and visualized by Venn diagram.
Figure 1. General workflow diagram for this study.
2.3. GO and KEGG Enrichment Analysis
We used KEGGrestAPI to obtain up-to-date gene annotations for the KEGG Pathway, and the GO annotations of the genes in the R package org.Hs.eg.db (version 3.1.0) as the background, mapped the genes to the background set, and performed the enrichment analysis using the R package clusterProfiler (version 3.14.3) for enrichment analysis to obtain the results of gene set enrichment. A minimum gene set of 5 and a maximum gene set of 5000 were set, and a p value of <0.05 and a FDR of <0.2 were considered statistically significant.
2.4. PPI Network Analysis and Hub Gene Identification
PPI networks of common DEGs were constructed using the STRING database. Setting the minimum required interaction confidence level of 0.4 was performed here to generate PPI networks. The resulting networks were constructed and visualized using the open source Cytoscape (v 3.7.2) platform. The top 10 pivotal genes were then analyzed using the algorithms MCC, MNC, Degree, and MCODE analysis of the Cytohubba plugin in the Cytoscape software.
2.5. TF Gene Interactions and TF-miRNA Co-Regulatory Networks
The NetworkAnalyst platform was used to identify TF gene interactions with identified hub genes and TF-miRNA co-regulatory networks. Networks generated from TF gene interactions were obtained from the ENCODE database. TF-miRNA co-regulatory interactions were collected from the RegNetwork repositor. Here Degree cutoff:1.0 was set to filter unimportant nodes to simplify the network and visualize it using NetworkAnalyst.
2.6. Prediction of Drug Candidates
We designed a drug molecule based on SARS-CoV-2 and the ONFH disease pivotal genes to design drug molecules using the Drug Signature Database (DSigDB), which consists of a set of 22,527 genes. Access to the DSigDB database was obtained through the Enrichr platform.
2.7. Datasets Validate the Diagnostic Value of Pivotal Genes
We validated the expression levels of the hub genes with external datasets GSE74089 and GSE171110. The ROC curves of the hub genes were then constructed and the area under the curve (AUC) of the pivotal genes was calculated, which indicates the diagnostic efficiency of the genes. The higher the value of the AUC, the higher the diagnostic efficiency of the genes (an AUC > 0.900 was considered to have accurate diagnostic prediction). p values of <0.05 were considered to be statistically significant.
2.8. Statistical Analysis
R 4.2.0 software and Adobe Illustrator 2022 were used for this study. Data are expressed as SD ± mean and were compared by using t-test or one-way ANOVA, respectively. Nonparametric tests were used when the data did not conform to normal distribution. ROC was used to assess AUC and predictive power. p values less than 0.05 were considered statistically significant.
3. Results
3.1. Differentially Expressed Genes (DEGs) and Co-Expressed Genes in ONFH and SARS-CoV-2
To investigate the interconnections between ONFH and SARS-CoV-2, we searched the corresponding human RNA-seq datasets and microarray datasets in NCBI and categorized the unordered genes in the datasets SARS-CoV-2 and ONFH. Volcano and heat maps were applied to reproduce DEGs visually (Figures 2(a)-(c)). Then we identified 35 upregulated common DEGs and 1 downregulated common DEGs.
3.2. GO and KEGG Enrichment Analysis
We performed gene ontology analyses in three categories: biological process (BP), cellular component (CC), and molecular function (MF), and chose the GO database as the source of annotations. The first 10 GO terms for each subsection are shown in Figure 3.
KEGG analysis has shown results in extracellular matrix receptor interactions, epithelial cell bacterial invasion, adhesive plaques, amoebiasis, antifolate resistance, carbohydrate digestion and absorption, tumor proteoglycans, small cell lung cancer, 5-hydroxytryptaminergic synapses, cell cycle and other pathways enriched (Figure 3(d)).
Figure 2. Volcano and heat maps of DEGs in among SARS-CoV-2 and ONFH. (a) and (b) demonstrate DEGs of different datasets. (c) shows Venn diagrams of DEGs among SARS-CoV-2 and ONFH.
Figure 3. Chord diagram of results for different GO terms as well as KEEG analysis. (a), (b) and (c) respectively show biological process (BP), cellular component (CC), and molecular function (MF). (d) shows KEGG pathways analysis. Each chord plot shows the top 10 pieces of information involved in each GO, KEGG term.
3.3. PPI Network Analysis and Hub Gene Identification
A total of 21 nodes and 53 edges were identified in the PPI network (Figure 4). The top 10 important genes were also calculated based on the method MCC, MNC, Degree in Cytoscape combined with Cytohubba plugin (Figure 4(b)). In addition, one tightly connected module containing 10 nodes and 40 edges with the highest node score (7.0) was obtained by MCODE plugin based on PPI network. Finally, the hub genes were obtained by taking the intersection of the genes contained in the key modules with the genes derived from the MCC, MNC, and Degree algorithms (Figure 4(c)).
3.4. TF Gene Interactions and TF-miRNA Co-Regulatory Networks
We used the NetworkAnalyst platform to identify TF gene interactions with identified hub genes and TF-miRNA co-regulatory networks. The interactions of the TF regulators with the hub genes are shown in Figure 5(a). Next, Figure 5(b) represents the interactions between miRNAs regulators and common DEGs.
3.5. Prediction of Drug Candidates
We identified 10 possible drug molecules by searching the transcriptome features in the DSigDB database and using Enrichr to extract the top 10 compounds (Figure 6).
Figure 4. PPI network analysis of co-expressed genes between SARS-CoV-2 and ONFH and hub gene identification (a) Common DEGs to construct the PPI network, (b) MCODE plugin obtained 1 tightly connected module, (c) Venn diagram showing 10 overlapping hub genes screened by 3 algorithms and MCODE.
Figure 5. TFs-hub gene interactions and TF-miRNA co-regulatory network (a) Hub gene and transcription factor interactions diagram, hub genes represented by red circles and transcription factors represented by blue square diamonds. (b) TF-miRNA co-regulatory network, circles represent hub genes, blue squares represent miRNAs, and pink diamonds represent transcription factors.
Figure 6. A predictive list of the top 10 drugs for patients with SARS-CoV-2 and ONFH.
Figure 7. Validation of hub genes and their diagnostic genes. (a) Validation of the expression levels of the pivotal genes in the validation set (ONFH validation set on top, SARS-CoV-2 validation set on bottom), (b) Diagnostic ROC curves of the 8 pivotal genes (SARS-CoV-2 on the left, ONFH on the right, AUC > 0.900 was considered to be accurate diagnostic value).
3.6. Datasets Validate the Diagnostic Value of Pivotal Genes
The expression levels of the genes were validated and eight genes with consistent expression levels were obtained (Figure 7). The diagnostic ability of the genes obtained from the above analysis was then analyzed by constructing ROC curves and using AUC values, the result of the SARS-CoV-2 and control groups showed AUC > 0.900 for CENPF, TYMS, CDKN3, MCM4, OIP5, UBE2T, DSCC1, suggesting that six of the key genes may have accurate diagnostic ability in both diseases.
4. Discussion
ONFH is a chronic bone disease that is considered one of the major complications following neocoronary pneumonia infection. However, the link between SARS-CoV-2 and ONFH is poorly understood. To the best of our knowledge, little has been reported in bioinformatics on the key genes and pathways between SARS-CoV-2 and ONFH; in this study, we applied a web-based database approach to investigate common hub genes in three RNA-seq datasets from patients with ONFH and SARS-CoV-2 and analyzed the associated pathways [7] [8].
In our study, we applied a gene ontology approach to screen 36 common DEGs. In order to deeply investigate the mechanisms of SARS-CoV-2. Based on the common DEGs, we performed an ontological analysis in terms of BP, MF, and CC [9]. In terms of biological processes, the top ten GO terms are cell cycle, tissue development, cell migration, secretion, cell motility, cellular localization, locomotion, cell development, cellular or subcellular component motility, and immune response (Figure 3(a)). Among molecular functions, identical protein binding, molecular function regulator, enzyme regulator activity, protein dimerization activity, extracellular matrix structural component, protein c-terminal binding, peptidase regulatory activity, enzyme inhibitor activity, small GTPase binding, and SH3 structural domain binding were the main terms (Figure 3(b)). Based on cellular components, secretory granules, secretory vesicles, chromosomal regions, vacuolar fraction, lysosomes, lysosomes, basement membrane, vacuolar lumen, transcription factor complexes and endoplasmic reticulum lumen are the most important terms (Figure 3(c)).
PPI network analysis is the most important step in this study. In this experiment, CENPF, TYMS, HMMR, CDKN3, MCM4, OIP5, PTTG1, UBE2T, FBXO5, and DSCC1 were identified as key genes for SARS-CoV-2 with ONFH (Figure 4).
CENPF, as a checkpoint protein, is involved in the regulation of the cellular cycle [10] [11]. In addition, CENPF is involved in the regulation of apoptosis and apoptosis plays an important role in the pathogenesis of both SARS-CoV-2 and ONFH [12] [13]. Therefore, CENPF may be one of the potential therapeutic targets for SARS-CoV-2 and ONFH.
TYMS is a key gene in the regulation of the cell cycle. Ayo et al. noted that the diversity of SNPs in TYMS may be associated with the risk of infection by some viruses [14]. At the same time, TYMS can also regulate apoptosis and angiogenesis [15].
HMMR not only plays an important role in the regulation of apoptosis as well as angiogenesis, but it also mediates macrophage polarization in hypoxic environments, which, in turn, affects the inflammatory response [16]. CDKN3 is a member of the protein phosphatase family and plays an important role in the regulation of the cell cycle. Haiya et al. found that CDKN2 may be associated with the regulation of inflammatory vesicles [17]. It has been noted that mutations in MCM4 are closely associated with the development of autoinflammatory disorders, which increases the risk of developing SARS-CoV-2 [18]. Furthermore, in osteogenesis, OIP5-AS1 overexpression promotes osteogenic differentiation and inhibits LPS-induced inflammation [19]. By the way, Knockdown of UBE2T results in a blockade of the cell cycle in the G2/S phase. In addition, UBE2T knockdown increases apoptosis [19].
PTTG1 has the ability to promote endothelial differentiation of CD34+ CD45+ cells. PTTG1 regulates the endothelial differentiation of CD45+ CD3+ cells through activation of the VEGF-bFGF/PI34K/AKT/eNOS signaling pathway, thus achieving the role of promoting pulmonary vascular barrier repair during acute lung injury [20]. In addition, inhibition of PTTG1 expression suppresses the expression of inflammatory factors and protects the extracellular matrix through the microtubule-associated protein kinase (MAPK) signaling pathway [21]. Furthermore, F-box protein plays a key role in a variety of cellular processes, including cell proliferation, apoptosis, and angiogenesis [22]. Moreover, DSCC1 can alter the sensitivity of cells to apoptosis [23].
Based on the above key genes, we used the NetworkAnalyst platform for identifying TF gene interactions with identified hub genes and TF-miRNA co-regulatory networks. From the TFs-gene and miRNA-gene interaction network analysis, 149 transcription factors (TFs) and 44 miRNA-regulatory features have been identified to be regulated through more than one common DEGs.
The TF-gene network consists of 159 nodes and 265 edges. The whole network consists of 149 TF genes and 10 key genes (Figure 5(a)). Regarding the TF-miRNA co-regulatory network, it includes 135 nodes and 151 edges, with a total of 53 miRNAs and 72 TF genes co-regulating key genes (Figure 5(b)).
In our study, drug molecules associated with key genes were identified and sorted by searching drug databases (Figure 6). We identified enterolactone, which is often associated with inflammation. It has been suggested that enterolactone may ameliorate oxidative tissue damage and inflammation in some forms of acute lung injury [24]. Furthermore, it has been suggested that Mucuna pruriens can increase femoral BMD as well as femoral BMD and femoral biomechanical strength in a mouse model [25]. In a clinical study, Marta et al. noted that low serum testosterone levels were associated with severe SARS-CoV-2 [26]. Liu pointed out that testosterone can optimize lung function by increasing expiratory volume, lung capacity, and improving oxygen consumption and respiratory muscle contraction [27]. Despite the fact that there have been many relevant findings, so far it is not possible to conclude on the beneficial or detrimental effects of testosterone on SARS-CoV-2, so more clinical trials and large-scale prospective studies are necessary to confirm potential associations in this regard. What is certain, however, is that testosterone can increase femoral bone density as well as muscle mass [28] [29].
Resveratrol, coumestrol, and calcitriol are also potential drugs. Resveratrol has excellent antioxidant, anti-inflammatory and anti-microbial properties. It can play a role in inhibiting the expression and activity of cyclooxygenase, regulating apoptosis, and inhibiting pro-inflammatory molecules by antagonizing signaling pathways [30] [31]. It has been noted that resveratrol and its synthetic resveratrol derivatives have been shown to inhibit SARS-CoV-2 replication and reduce its cytopathic effects [32]-[34]. In addition, resveratrol prevents femoral head necrosis by upregulating miR-146a and regulates osteogenic and osteoblastic differentiation via Wnt/FOXO and Sirt1/NF-κB pathways [35].
Coumestrol can ameliorate inflammation by reducing oxidative stress through the activation of SIRT1 as well as apoptosis, which may be useful for SARS-CoV-2-induced inflammatory response [36].
In addition, we evaluated the diagnostic ability of hub genes by constructing ROC curves (Figure 7). After validation on the GSE74089 and GSE171110 datasets, 8 genes with consistent expression levels were obtained (CENPF, TYMS, CDKN3, MCM4, OIP5, UBE2T, FBXO5, DSCC1). By constructing ROC curves and utilizing AUC values, the diagnostic efficacy of distinguishing these 8 genes for ONFH and control, as well as SARS-CoV-2 and control, was determined. The diagnostic accuracy of CENPF, TYMS, CDKN3, MCM4, UBE2T, FBXO5, and DSCC was higher in the ONFH and control groups, while the diagnostic accuracy of CENPF, TYMS, CDKN3, MCM4, OIP5, UBE2T, and DSCC1 was higher in the SARS-CoV-2 and control groups, indicating that six key genes (CENPF, TYMS, CDKN3, MCM4, UBE2T, and DSCC1) may have accurate diagnostic ability in both diseases.
It has to be mentioned that there are some limitations in this study. On the one hand, due to the scarcity of studies exploring the relationship between ONFH and SARS-CoV-2, we were only able to select four datasets from the GEO database for the bioinformatics study, which, in turn, imposed a certain limitation on the sample size of our data. On the other hand, although this study comprehensively and systematically analyzed the interconnection between SARS-CoV-2 and ONFH at the bioinformatics level, it lacked validation at the cellular and animal levels. Therefore, SARS-CoV-2 and ONFH can be explored at a deeper level in future studies.
5. Conclusion
In this study, we identified 10 key genes (CENPF, TYMS, HMMR, CDKN3, MCM4, OIP5, PTTG1, UBE2T, FBXO5, DSCC1) between SARS-CoV-2 and ONFH. Furthermore, we discussed the common biological process, analyzed TF gene interactions and TF-miRNA co-regulatory networks and predicted possible target drugs. Moreover, the diagnostic ability of the hub genes was analyzed by using ROC with GSE74089 and GSE171110 as validation sets. Since there is little literature on SARS-CoV-2 and ONFH, this study provides value in exploring the relationship between SARS-CoV-2 and ONFH, and the prevention and treatment of ONFH after SARS-CoV-2 infection.
Ethics and Consent
The data in this study came from a public database and therefore did not require ethical approval and patient/participant consent.
NOTES
*Huanning Yang and Yimin Zhu contribute equally to this article.
#Corresponding author.