ORIGINAL ARTICLE

Machine Learning-Driven Identification and In Vitro Validation of the APOBEC3B-ANLN Regulatory Axis in Adrenocortical Carcinoma

Jiadong Zhang1,2 · Xinyu Hu1,3 · Cong Wei1,2 · Wenyun Dong1 · Huanrui Hu1 · Rui Wang2,4 · Zhili Xiong1,2 · Chengyin Li2,4 · Jingling Zhao1,2

Received: 7 July 2025 / Accepted: 10 October 2025
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2026

Abstract

Background Adrenocortical carcinoma (ACC) is a rare, aggressive malignancy with limited diagnostic and therapeutic options. APOBEC3B (A3B) has emerged as a mutational driver in several cancers, but its downstream mechanisms remain unclear. We aimed to use bioinformatics methods, including machine learning, to elucidate the mechanism of A3B in ACC and to validate and explore it in depth in vitro.

Methods Through a comprehensive analysis of 311 samples, including differential expression analysis and weighted gene co-expression network analysis (WGCNA), we used the hub genes extracted from the key modules as the background for 113 machine learning methods. Genes with potential associations were evaluated using feature importance and SHAP analysis. In vitro studies included qRT-PCR, Western blotting, siRNA-mediated knockdown, overexpression rescue, scratch assays, and Transwell migration assays to assess effects on gene expression and cell motility.

Results Random Forest was selected as the optimal model and identified nine gene features centered on A3B and ANLN (AUC = 0.996). siRNA-mediated knockdown of A3B significantly reduced the mRNA and protein levels of ANLN (p < 0.001). ANLN overexpression rescued this phenotype: compared with A3B knockdown alone, the cell migration distance and the number of migrating cells were restored (p < 0.001).

Conclusions Our comprehensive omics and experimental methods have revealed the A3B-ANLN axis as a key mechanism of ACC and also provided predictive models and potential targets for the early diagnosis and therapeutic intervention of ACC.

Jiadong Zhang and Xinyu Hu contributed equally to this work.

✉ Chengyin Li lichengyin@hbhtcm.com

✉ Jingling Zhao jinglingzhao9@outlook.com

1 Hubei University of Chinese Medicine, 430065 Wuhan, China

2 Hubei Shizhen Laboratory, 430065 Wuhan, China

3 The First Clinical College of Zhejiang Chinese Medical University, 310053 Hangzhou, China

4 Hubei Provincial Hospital of Traditional Chinese Medicine, 430065 Wuhan, China

Introduction

Adrenocortical carcinoma (ACC) is an uncommon yet very aggressive neoplasm, with a 5-year survival rate of under 60%[1]. Early identification of ACC is challenging: cases are often missed when they present with symptoms such as hormonal imbalances (e.g., cortisol excess in Cushing's syndrome or abnormal androgen and estrogen levels) and are diagnosed only at advanced stages through imaging[2]. The conventional treatment for localized ACC is surgical resection; yet the postoperative recurrence rate can be as high as 60%, and patients with recurrence have a two-year death rate of 15%[3], [4]. Mitotane is presently the principal adjuvant therapy, while novel agents like cabozantinib and nivolumab have also shown initial success in treating ACC[5], [6], [7]. Regrettably, the commonly employed EDP-M regimen has demonstrated limited efficacy, with a complete remission rate of only 1.3%[8]; consistent findings from clinical and retrospective studies report modest progression-free survival and low response rates in ACC patients[9], [10]. These problems underscore the pressing necessity for novel therapeutic approaches.

Recent research has increasingly focused on precisely determining the molecular basis of ACC to identify new therapeutic opportunities. APOBEC3B (A3B), a DNA-editing enzyme, has been demonstrated to promote genomic instability and replication stress, fostering ACC generation[11], [12], [13]. According to studies showing that it is significantly overexpressed in ACC, A3B, along with GATA3, is also closely related to DNA damage and TP53 mutation rates[13]. These findings significantly clarify the molecular foundations of ACC genesis and development. Still, recent research generally focuses on single-omics analysis or specific molecular targets, thereby limiting a thorough understanding of the complex molecular characteristics of ACC.

This study integrates weighted gene co-expression network analysis (WGCNA), Mendelian randomization (MR), and machine learning into a synergistic framework to identify significant genes, construct robust diagnostic models, and analyze gene functions in ACC.

This research extensively explores significant genes, their upstream and downstream linkages, and potential pathways to ACC advancement, employing an analytical approach and experimental confirmation. The results should enhance early identification, accurate interventions, and the development of tailored treatments, thereby providing new insights into this rare cancer.

Results

This study identified key genes significantly associated with ACC by integrating differential expression analysis and weighted gene co-expression network analysis. In the differential expression analysis, the box plot and PCA plot indicated that batch effects were corrected in the dataset (Fig. 1A, B). Using the training dataset, 198 significantly differentially expressed genes (|log2FC| > 2, adj. P < 0.05) were identified, and their expression profiles were visualized with volcano plots and heatmaps (Fig. 1C, D). Subsequently, WGCNA was conducted on the TCGA-ACC dataset to construct a scale-free topology network. Two modules, “Blue” and “SteelBlue”, which together contained 4,358 genes, were identified as significantly associated with survival status, survival time, and tumor subtypes (Fig. 2A), with the soft power value determined to be 10 (Fig. 2B). After intersecting the DEGs and WGCNA results, 58 intersecting genes (hereafter, inter genes) were identified (Fig. 2C), suggesting that these genes might be involved in the onset and progression of ACC through specific functional mechanisms. Further functional enrichment analysis revealed that, in Gene Ontology (GO) analysis, these genes were enriched in terms including nuclear division, organelle fission, spindle, and microtubule binding. In contrast, the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis indicated they were associated with the cell cycle and progesterone-mediated oocyte maturation signaling pathways (Fig. 2D-F).

Fig. 1 Differential analysis results. (A, B) Boxplots and PCA plots before and after normalization. (A) Boxplots, (B) PCA plots. (C) Volcano plot of DEGs in the GEO datasets. Red dots: up-regulated; green: down-regulated; gray: nonsignificant. (D) Heatmap of the top 50 regulated genes

[Figure 1 graphic omitted. Recoverable labels: panels "Before batch correction" and "After batch correction"; boxplot axes Sample vs. Expression; PCA axis PC1; legends Project (GSE12368, GSE143383, GSE90713) and Type (Normal, Tumor); volcano axes logFC vs. -log10(adj.P.Val) with classes Down, Not, Up.]

Fig. 2 WGCNA module identification and functional enrichment analysis. (A) Correlation heatmap; the modules with red boxes are the two best-performing modules, including 4358 genes. (B) Soft threshold selection. (C, D) The results of GO and KEGG enrichment analysis for inter genes. (C) Circos plot of GO, (D) bubble plot of KEGG. (E) Venn diagram of the intersection of 4358 WGCNA genes and 198 DEGs

[Figure 2 graphic omitted. Recoverable labels and values follow.
Module-trait relationships (A): rows are module eigengenes, columns are traits; the first four column labels are futime, fustat, age, and gender, and the remaining four are only partly legible (Stage, M). Correlation coefficients for the two highlighted modules, in column order: MEblue: -0.40, 0.52, -0.097, -0.14, 0.44, 0.37, 0.38, 0.21; MEsteelblue: -0.085, 0.37, 0.021, -0.095, 0.37, 0.26, 0.35, 0.36.
Soft threshold (B): panels "Scale independence" (Scale Free Topology Model Fit, signed R^2 vs. Soft Threshold (power)) and "Mean connectivity" (Mean Connectivity vs. Soft Threshold (power)).
GO plot: ontologies Biological Process and Molecular Function; scales for Number of Genes and Rich Factor (0-1).
KEGG bubble plot: x-axis GeneRatio, color by p value, size by count; pathways: Progesterone-mediated oocyte maturation, Cell cycle, Oocyte meiosis, p53 signaling pathway, Cellular senescence, Human immunodeficiency virus 1 infection, Human T-cell leukemia virus 1 infection, Platinum drug resistance, FoxO signaling pathway, Hepatitis B, Motor proteins, Viral carcinogenesis, Apoptosis - multiple species.
Venn diagram: WGCNA only 4300, overlap 58, DEG only 140.]
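The intersection step described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: the gene statistics and module membership are made-up toy values, `GENE_X` and `GENE_Y` are placeholder names, and the function `select_degs` is hypothetical. The fold-change cutoff is left as a parameter because the text gives |log2FC| > 2 in Results and |log2FC| > 1.5 in Methods.

```python
from typing import Dict, Set, Tuple


def select_degs(stats: Dict[str, Tuple[float, float]],
                lfc_cutoff: float = 2.0,
                padj_cutoff: float = 0.05) -> Set[str]:
    """Return genes whose |log2FC| and adjusted P value pass both cutoffs."""
    return {gene for gene, (log2fc, padj) in stats.items()
            if abs(log2fc) > lfc_cutoff and padj < padj_cutoff}


# Toy values only (hypothetical): gene -> (log2FC, adjusted P value).
toy_stats = {
    "ANLN": (2.6, 1e-8),
    "CDK1": (3.1, 1e-10),
    "GENE_X": (0.4, 0.60),    # placeholder gene, fails both cutoffs
    "GENE_Y": (-2.4, 0.001),  # placeholder gene, down-regulated DEG
}
# Toy module membership standing in for the genes of the key WGCNA modules.
toy_module_genes = {"ANLN", "CDK1", "GENE_X"}

degs = select_degs(toy_stats)
# Genes that are both differentially expressed and in a key module.
inter_genes = sorted(degs & toy_module_genes)
print(inter_genes)
```

With real data, `toy_stats` would come from the differential expression table and `toy_module_genes` from the union of the selected module gene lists.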

Mendelian Randomization Analysis

Using two-sample Mendelian randomization (MR) analysis, this study identified 238 exposure genes that were significantly associated with ACC, derived from the analysis of FinnGen outcomes and the locally curated GWAS database. Further investigation revealed that A3B was present in both the 238 significant genes identified by MR and the inter genes identified in the preceding analysis (Fig. 3A), and its exposure data were obtained from https://opengwas.io/datasets/eqtl-a-ENSG00000179750. Figure 3B-D shows the MR results for A3B. The centrality scores of genes were calculated using the MCC algorithm, and the 25 hub genes with the highest scores were identified, all of which were upregulated (Fig. 3E, F). Notably, the target gene A3B did not appear among the central genes, suggesting it might influence ACC through non-traditional mechanisms. The MR results of 238 eQTLs, together with the odds ratio of A3B, as well as the pleiotropy and heterogeneity test tables, are provided in the supplementary materials (Supplementary Tables S3 to S5).

Fig. 3 Key gene identification and Mendelian randomization analysis. (A) Venn diagram of the intersection of WGCNA genes, DEGs, and MR genes. (B, C, D) The MR results of A3B. (B) SNP effect, (C) leave-one-out sensitivity analysis, (D) effect size. (E) The PPI network of inter genes. (F) The network diagram composed of the top 25 genes by MCC and A3B

[Figure 3 graphic omitted. Recoverable labels: Venn sets DEG, WGCNA, and MR, with A3B marked in the overlap; MR methods shown: inverse variance weighted, weighted median, weighted mode, MR Egger, simple mode; scatter axes: SNP effect on A3B vs. SNP effect on outcome; instruments: rs12157810, rs139271, rs74986034; panel titles: MR leave-one-out sensitivity analysis for 'A3B' on 'outcome' and MR effect size for 'A3B' on 'outcome'; legible network nodes include APOBEC3B, ANLN, CDK1, CCNB1, CCNB2, TOP2A, PTTG1, MAD2L1, DTL, TTK, BUB1, BUB1B, NUF2, NDC80, TPX2, CEP55, DLGAP5, RRM2, CENPF, KIF20A, HMMR, PBK, UBE2C, CCNA2, and TYMS.]
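Inverse variance weighted (IVW) estimation is one of the MR methods named in Fig. 3. It combines per-SNP Wald ratios, weighting each by the precision of its outcome association. The Python sketch below shows the standard fixed-effect IVW formula under stated assumptions: the SNP effect sizes are invented toy numbers (the actual summary statistics are in the supplementary tables), and the function name `ivw_estimate` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Toy summary statistics for three instruments (hypothetical values).
bx = [0.20, 0.30, 0.40]  # SNP effect on the exposure (gene expression)
by = [0.10, 0.15, 0.20]  # SNP effect on the outcome (log odds)
se = [0.05, 0.05, 0.05]  # standard error of each outcome effect

beta, beta_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta)
ci = (math.exp(beta - 1.96 * beta_se), math.exp(beta + 1.96 * beta_se))
print(f"beta={beta:.3f}, OR={odds_ratio:.2f}, 95% CI={ci[0]:.2f}-{ci[1]:.2f}")
```

In this toy input every SNP has the same Wald ratio (0.5), so the pooled estimate equals that ratio; real instruments disagree, which is why the heterogeneity and pleiotropy tests are reported alongside.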

Machine Learning-Based Diagnostic Model Development and Validation

To explore the potential mechanisms associated with A3B, we used the hub genes, together with A3B, as background gene variables. The GBM model yielded a slightly higher average AUC than the Random Forest (RF) model; however, all nine feature genes identified by the RF model were also included in the GBM model, which retained all the hub genes (Fig. 4A). To simplify the model and enhance the feasibility of clinical application, the RF model was ultimately chosen as the optimal model, with an average AUC of 0.996. In addition, the RF model demonstrated better generalizability in cross-validation and independent validation cohorts (Fig. 4B), and its built-in feature selection mechanism helps reduce the risk of overfitting caused by redundant variables, making it more suitable for practical diagnostic applications. The confusion matrices in Fig. 4D-F illustrate the actual performance of the model. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated from these confusion matrices, and the results are shown in Fig. 4C. Most estimates were close to 100%. Although the GSE19750 dataset had wider confidence intervals due to its smaller sample size, the overall predictive performance remained strong. All feature genes were significantly overexpressed in tumor samples (Fig. 4G, H), and the expression correlation analysis showed positive correlations among these genes (Fig. 4I). SHAP analysis and per-gene ROC curves revealed that CDK1, a cell cycle protein, and PTTG1, a pituitary tumor-transforming gene, made significant contributions to the diagnostic model. Their expression levels significantly influence the occurrence, development, and prognosis of ACC (Fig. 4J-L). These findings indicate that the expression levels of these nine feature genes may facilitate an early diagnosis of ACC.
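The four metrics follow directly from the cells of a 2×2 confusion matrix. As a worked check, the Python sketch below uses the counts read from the GSE10927 matrix in Fig. 4E (9 normal and 33 tumor samples classified correctly, 1 normal sample predicted as tumor, no tumor missed). The function name is illustrative, tumor is treated as the positive class, and confidence intervals are omitted.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV with tumor as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),  # tumors correctly called tumor
        "specificity": tn / (tn + fp),  # normals correctly called normal
        "ppv": tp / (tp + fp),          # predicted tumors that are tumors
        "npv": tn / (tn + fn),          # predicted normals that are normal
    }


# Counts as read from the GSE10927 confusion matrix (Fig. 4E).
metrics = diagnostic_metrics(tp=33, fp=1, tn=9, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

These values round to 1.00, 0.90, 0.97, and 1.00, which agrees with the GSE10927 point estimates shown in Fig. 4C.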

Multivariable Cox Regression Analysis and Nomogram Construction

After adjusting for age, gender, and stage, the forest plot in Fig. 5A, based on a continuous-variable Cox model, indicated that CCNB2, UBE2T, and CDK1 had the highest HRs (4.00, 4.64, and 4.04, respectively). Further Kaplan-Meier analysis (Fig. 5B-K) showed that patients with high CDK1 expression had the highest risk of death (HR = 7.27, 95% CI: 2.26-23.34, p < 0.001, Fig. 5F), followed by CCNB2 (HR = 6.10, 95% CI: 1.89-19.65, p = 0.00245, Fig. 5E) and A3B (HR = 4.74, 95% CI: 1.82-12.40, p = 0.00149, Fig. 5C). As shown in Fig. 6A, a prognostic nomogram was constructed using Cox regression by integrating clinical variables (gender, age, stage) and the expression levels of feature genes. Among these variables, TOP2A, ANLN, and UBE2T had the widest point ranges, indicating the strongest influence on risk scores. Figure 6B demonstrates good agreement between the predicted and observed survival probabilities at 1-, 3-, and 5-year follow-up. Figure 6C shows that patients in the high-risk group had significantly worse overall survival than those in the low-risk group (HR = 24.32, 95% CI: 5.65-104.71, p = 1.6 × 10⁻⁹). Figure 6D displays time-dependent ROC curves with excellent predictive performance, yielding AUCs of 0.89, 0.97, and 0.98 at 1, 3, and 5 years, respectively.
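Because a Cox hazard ratio and its Wald confidence interval are symmetric on the log scale, the standard error and p value can be recovered from a reported HR and 95% CI. The Python sketch below applies this to the CCNB2 entry of Fig. 5A (HR 4.00, 95% CI 2.18-7.32) as a sanity check. It assumes a Wald-type interval, which the text does not state explicitly, and the function name `wald_from_hr` is hypothetical.

```python
import math
from typing import Tuple


def wald_from_hr(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Recover (log HR, standard error, two-sided p) from an HR and its 95% CI.

    Assumes the interval was built as exp(beta +/- 1.96 * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = beta / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p


beta, se, p = wald_from_hr(4.00, 2.18, 7.32)
print(f"log HR={beta:.3f}, SE={se:.3f}, p={p:.2e}")
```

The recovered p value is close to 7 × 10⁻⁶, in line with the value listed for CCNB2 in Fig. 5A (7.19E-06); small differences are expected because the reported HR and CI are rounded.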

Identification of Upstream and Downstream Genes

PPI analysis revealed five feature genes associated with A3B within five network expansions. Additionally, the BioGRID database identified ANLN as the only gene in the overlap[14] (Fig. 7A). This suggests a potential upstream-downstream relationship between ANLN and A3B, which may influence the occurrence and prognosis of ACC. GSVA revealed that in the A3B high-expression group, pathways such as ether lipid metabolism, the FcεRI signaling pathway, and nicotinamide metabolism were highly enriched. Conversely, the ANLN high-expression group exhibited enrichment in pathways associated with regulation of the actin cytoskeleton, the TGF-β signaling pathway, and immunological modulation (Fig. 7B, C). Further GSEA revealed that the ANLN high-expression group was highly enriched in the cell cycle, DNA replication, and oocyte meiosis pathways. In contrast, the low-expression group was predominantly associated with cell adhesion molecules and the chemokine signaling pathway (Fig. 7D, E).

Rescue Experiment to Verify Upstream-Downstream Relationship

WB results showed a significant reduction in the protein levels of both A3B and ANLN after siA3B, suggesting that A3B may influence ANLN expression through direct or indirect regulation. Subsequently, in the ANLN overexpression experiment, protein levels of ANLN were restored, whereas those of A3B remained steady, further confirming the unidirectional regulatory effect of A3B on ANLN (Fig. 8A, B). All samples were processed in parallel under identical experimental conditions, including electrophoresis, membrane transfer, antibody incubation, and exposure. Therefore, the results are comparable across the different blots. The qPCR experiment showed trends similar to those of WB (Fig. 8C). Additionally, scratch and Transwell migration assays were performed to verify changes in cellular phenotypes. Under A3B knockdown, cell migration ability was significantly reduced, while overexpression of ANLN partially restored migration ability (Fig. 8D-M). These results suggest that A3B may regulate ACC cell migration through the downstream gene ANLN, thereby influencing tumor progression. The cell experimental results were consistent with the prior analyses.

Fig. 4 Diagnostic model construction, feature gene selection, and model interpretability. (A) Heatmap of model AUC values; darker colors represent higher AUC. The model with a green box is the best-performing model. (B) ROC curve plot. (C) Forest plot of performance metrics. (D-F) Confusion matrices. (G) Boxplot of feature genes. (H) Volcano plot with part of the feature genes labeled. (I) Correlation heatmap of feature genes in tumor groups. (J) ROC curve for each feature gene. (K) SHAP summary bar plot showing feature genes' importance. (L) SHAP summary dot plot showing feature impact on predictions

[Figure 4 graphic omitted. Recoverable values follow; the dataset order of the AUC columns is inferred from the RF row and the ROC panels.]

Fig. 4A, AUC by model
Model | Train | GSE10927 | GSE19750 | Mean
GBM | 1.000 | 0.997 | 1.000 | 0.999
RF | 1.000 | 0.988 | 1.000 | 0.996
Enet[alpha=0.8] | 0.989 | 0.982 | 1.000 | 0.990
Enet[alpha=0.4] | 0.986 | 0.982 | 0.994 | 0.987
Enet[alpha=0.9] | 0.980 | 0.982 | 1.000 | 0.987
Enet[alpha=0.1] | 0.984 | 0.982 | 0.994 | 0.987
Enet[alpha=0.5] | 0.984 | 0.982 | 0.994 | 0.987
Ridge | 0.978 | 0.982 | 1.000 | 0.987
Enet[alpha=0.3] | 0.984 | 0.982 | 0.994 | 0.987
Enet[alpha=0.7] | 0.979 | 0.979 | 1.000 | 0.986
Enet[alpha=0.6] | 0.977 | 0.979 | 1.000 | 0.985
Stepglm[both]+GBM | 1.000 | 0.988 | 0.966 | 0.985
RF+GBM | 1.000 | 0.991 | 0.960 | 0.984
RF+Stepglm[forward] | 0.983 | 0.976 | 0.989 | 0.983
Lasso | 0.985 | 0.973 | 0.989 | 0.982
Enet[alpha=0.2] | 0.993 | 0.982 | 0.972 | 0.982
RF+Enet[alpha=0.1] | 0.981 | 0.976 | 0.989 | 0.982
Lasso+plsRglm | 0.986 | 0.964 | 0.994 | 0.981
RF+Enet[alpha=0.2] | 0.979 | 0.976 | 0.989 | 0.981
Stepglm[backward]+GBM | 1.000 | 0.988 | 0.955 | 0.981
RF+Enet[alpha=0.3] | 0.978 | 0.976 | 0.989 | 0.981
RF+Ridge | 0.971 | 0.976 | 0.994 | 0.980
Lasso+NaiveBayes | 0.963 | 0.976 | 1.000 | 0.979
plsRglm | 0.977 | 0.964 | 0.994 | 0.978
Stepglm[both]+NaiveBayes | 0.959 | 0.976 | 1.000 | 0.978
Stepglm[backward]+NaiveBayes | 0.959 | 0.976 | 1.000 | 0.978
Stepglm[both]+Ridge | 0.970 | 0.976 | 0.989 | 0.978
Stepglm[backward]+Ridge | 0.970 | 0.976 | 0.989 | 0.978
RF+NaiveBayes | 0.967 | 0.973 | 0.994 | 0.978
Stepglm[backward]+XGBoost | 1.000 | 0.985 | 0.949 | 0.978
NaiveBayes | 0.954 | 0.973 | 1.000 | 0.976
Stepglm[both]+XGBoost | 0.999 | 0.979 | 0.949 | 0.976
Lasso+XGBoost | 1.000 | 0.971 | 0.946 | 0.972
XGBoost | 0.981 | 0.988 | 0.943 | 0.971
Stepglm[both]+plsRglm | 0.979 | 0.973 | 0.949 | 0.967
Stepglm[backward]+plsRglm | 0.979 | 0.973 | 0.949 | 0.967
RF+plsRglm | 0.979 | 0.961 | 0.960 | 0.967
RF+XGBoost | 1.000 | 0.970 | 0.929 | 0.966
RF+LDA | 0.975 | 0.979 | 0.926 | 0.960
Stepglm[both]+LDA | 0.973 | 0.973 | 0.920 | 0.955
Stepglm[backward]+LDA | 0.973 | 0.973 | 0.920 | 0.955
Lasso+GBM | 1.000 | 0.985 | 0.864 | 0.949
Stepglm[both]+Enet[alpha=0.8] | 0.996 | 0.982 | 0.858 | 0.945
Stepglm[backward]+Enet[alpha=0.8] | 0.996 | 0.982 | 0.858 | 0.945
Stepglm[both]+Enet[alpha=0.2] | 0.996 | 0.982 | 0.858 | 0.945
Stepglm[backward]+Enet[alpha=0.4] | 0.996 | 0.982 | 0.858 | 0.945
Stepglm[backward]+Enet[alpha=0.6] | 0.996 | 0.982 | 0.852 | 0.943
Stepglm[both]+Enet[alpha=0.4] | 0.997 | 0.982 | 0.835 | 0.938
Stepglm[both]+Enet[alpha=0.1] | 0.997 | 0.982 | 0.830 | 0.936
Stepglm[backward]+Enet[alpha=0.1] | 0.997 | 0.982 | 0.830 | 0.936
Stepglm[both]+Enet[alpha=0.3] | 0.997 | 0.982 | 0.830 | 0.936
Stepglm[backward]+Enet[alpha=0.3] | 0.997 | 0.982 | 0.830 | 0.936
Stepglm[both]+Enet[alpha=0.5] | 0.997 | 0.982 | 0.830 | 0.936
Lasso+LDA | 0.985 | 0.970 | 0.852 | 0.936
Stepglm[both]+Enet[alpha=0.9] | 0.997 | 0.982 | 0.824 | 0.934
Stepglm[backward]+Enet[alpha=0.9] | 0.997 | 0.982 | 0.824 | 0.934
Stepglm[backward]+Enet[alpha=0.2] | 0.997 | 0.982 | 0.824 | 0.934
Stepglm[both]+Enet[alpha=0.7] | 0.997 | 0.982 | 0.824 | 0.934
Stepglm[backward]+Enet[alpha=0.7] | 0.998 | 0.982 | 0.818 | 0.933
Stepglm[backward]+Enet[alpha=0.5] | 0.998 | 0.982 | 0.818 | 0.933
Stepglm[both]+Lasso | 0.998 | 0.982 | 0.812 | 0.931
Stepglm[both]+Enet[alpha=0.6] | 0.998 | 0.982 | 0.801 | 0.927
Stepglm[backward]+Lasso | 0.998 | 0.982 | 0.795 | 0.925
LDA | 0.988 | 0.927 | 0.801 | 0.905
SVM | 0.805 | 0.685 | 0.852 | 0.781
Stepglm[forward] | 1.000 | 0.911 | 0.384 | 0.765
Lasso+Stepglm[forward] | 1.000 | 0.842 | 0.443 | 0.762
Stepglm[both] | 1.000 | 0.835 | 0.420 | 0.752
Stepglm[backward] | 1.000 | 0.835 | 0.420 | 0.752
Stepglm[both]+SVM | 0.801 | 0.735 | 0.716 | 0.750
Stepglm[backward]+SVM | 0.801 | 0.735 | 0.716 | 0.750
Lasso+Stepglm[both] | 1.000 | 0.639 | 0.409 | 0.683
Lasso+Stepglm[backward] | 1.000 | 0.639 | 0.409 | 0.683
RF+SVM | 0.801 | 0.585 | 0.500 | 0.629
Lasso+SVM | 0.805 | 0.535 | 0.500 | 0.613

Fig. 4B, ROC curves of the RF model (Sensitivity vs. 1 - Specificity): Train AUC 1.000 (95% CI 1.000-1.000); GSE10927 AUC 0.988 (95% CI 0.955-1.000); GSE19750 AUC 1.000 (95% CI 1.000-1.000)

Fig. 4C, performance metrics, value (95% CI)
Dataset | Sensitivity | Specificity | PPV | NPV
Train | 1.00 (0.97-1.00) | 1.00 (0.79-1.00) | 1.00 (0.97-1.00) | 1.00 (0.79-1.00)
GSE10927 | 1.00 (0.89-1.00) | 0.90 (0.55-1.00) | 0.97 (0.85-1.00) | 1.00 (0.66-1.00)
GSE19750 | 0.91 (0.78-0.97) | 1.00 (0.40-1.00) | 1.00 (0.91-1.00) | 0.50 (0.16-0.84)

Fig. 4D-F, confusion matrices (counts)
Dataset | Predicted | Actual Normal | Actual Tumor
Train | Normal | 16 | 0
Train | Tumor | 0 | 127
GSE10927 | Normal | 9 | 0
GSE10927 | Tumor | 1 | 33
GSE19750 | Normal | 4 | 4
GSE19750 | Tumor | 0 | 40

Fig. 4J, single-gene AUC: CDK1 0.975; PTTG1 0.964; TOP2A 0.952; DTL 0.940; CCNB2 0.943; CCNB1 0.942; MAD2L1 0.911; UBE2T 0.943; ANLN 0.925

Fig. 4K, L, SHAP plots: x-axes mean(|SHAP value|) (average impact on model output magnitude) and SHAP value (impact on model output); features listed in the order CDK1, PTTG1, TOP2A, CCNB2, MAD2L1, DTL, CCNB1, UBE2T, ANLN

Fig. 5 Multivariable Cox analysis of feature genes in TCGA-ACC cohort. (A) Forest plot showing HRs and 95% CIs for each candidate gene, using gene expression as continuous variables (z-scores) and adjusting for age, sex, and tumor stage. (B-K) Adjusted Kaplan-Meier survival curves for ANLN (B), A3B (C), CCNB1 (D), CCNB2 (E), CDK1 (F), UBE2T (G), DTL (H), MAD2L1 (I), PTTG1 (J), and TOP2A (K)

[Figure 5 graphic omitted. Recoverable values follow; where a panel title and its HR line were separated in extraction, the assignment follows panel order.]

Fig. 5A, forest plot values
Gene | p value | Hazard Ratio (95% CI)
CCNB2 | 7.19E-06 | 4.00 (2.18-7.32)
UBE2T | 1.91E-05 | 4.64 (2.30-9.37)
DTL | 2.62E-05 | 3.48 (1.94-6.21)
CDK1 | 6.65E-05 | 4.04 (2.03-8.02)
A3B | 2.02E-04 | 2.85 (1.64-4.94)
TOP2A | 4.84E-04 | 2.69 (1.54-4.70)
PTTG1 | 9.38E-04 | 2.73 (1.51-4.95)
CCNB1 | 1.73E-03 | 2.38 (1.38-4.09)
ANLN | 6.02E-03 | 1.91 (1.20-3.04)
MAD2L1 | 2.16E-02 | 1.83 (1.09-3.06)

Fig. 5B-K, adjusted survival, High vs Low (x-axis: Time (days); y-axis: Adjusted survival probability)
Panel | Gene | Adjusted HR (95% CI) | p value
B | ANLN | 4.28 (1.61-11.33) | 0.00345
C | A3B | 4.74 (1.82-12.40) | 0.00149
D | CCNB1 | 3.41 (1.37-8.46) | 0.00824
E | CCNB2 | 6.10 (1.89-19.65) | 0.00245
F | CDK1 | 7.27 (2.26-23.34) | 0.000861
G | UBE2T | 2.98 (1.17-7.59) | 0.0224
H | DTL | 2.94 (1.21-7.13) | 0.0171
I | MAD2L1 | 2.90 (1.18-7.12) | 0.0201
J | PTTG1 | 4.37 (1.61-11.84) | 0.00376
K | TOP2A | 4.50 (1.69-11.96) | 0.0026

Fig. 6 Prognostic value of feature genes in TCGA-ACC cohort. (A) Nomogram integrating clinical and molecular variables. (B) Calibration curve. (C) Kaplan-Meier survival analysis for risk groups. (D) Time-dependent ROC curves at 1-, 3-, and 5-year OS

[Figure 6 graphic omitted. Recoverable labels: nomogram rows Points, Gender, Age, Stage, TOP2A, ANLN, MAD2L1, DTL, PTTG1, CCNB2, CDK1, UBE2T, A3B, CCNB1, Total points, Linear Predictor, and Probability at 365, 1095, and 1825 days; calibration axes Nomogram-predicted (%) vs. Observed (%); Kaplan-Meier panel by RiskScore (L, H) with HR = 24.32, 95% CI (5.65, 104.71), p = 1.6e-9; time-dependent ROC (Sensitivities vs. 1-Specificities) with AUC (95% CI) of 0.89 (0.67-1.00) at 365 days, 0.97 (0.93-1.00) at 1095 days, and 0.98 (0.95-1.00) at 1825 days.]

Materials and Methods

Selection of Transcriptomic Datasets

To study the molecular mechanisms and essential genes linked to ACC, the gene expression data used in this work were obtained from the GEO[15] (Gene Expression Omnibus, https://www.ncbi.nlm.nih.gov/geo) and TCGA GDC (The Cancer Genome Atlas, https://portal.gdc.cancer.gov) databases[16]. Five GEO datasets[17], [18], [19], [20], [21], totaling 234 samples, were included. Additionally, 77 RNA-seq samples from the TCGA database were analyzed. Batch effects in the GEO datasets were corrected using the ComBat function of the sva package, and background correction and normalization of microarray data were performed using the limma package. Two TCGA samples lacking survival time and other clinical information were excluded from the analysis. More detailed information is provided in Table 1.

Differential Analysis and WGCNA

The datasets GSE12368, GSE90713, and GSE143383 were merged to form the training group, while GSE10927 and GSE19750 were designated as the validation group, maintaining a 3:2 ratio between the two groups to ensure data balance. Differential expression analysis of the training group was carried out with the following criteria: |log2FC| > 1.5 and adj. P < 0.05. WGCNA was performed using the TCGA-ACC dataset. First, a weighted adjacency matrix was constructed, and the scale-free topology fit was computed under different power values. The minimum

Fig. 7 Pathway enrichment and functional analysis of key genes. (A) The results of the PPI network for A3B and ANLN. (B, C) GSVA. The t-values (from the t-test) indicate enrichment scores between low (green) and high (red) expression groups. (B) A3B, (C) ANLN. (D, E) GSEA. High expression: top; low expression: bottom. (D) ANLN, (E) A3B

[Fig. 7 panel labels: (A) PPI network centered on A3B and ANLN. (B, C) KEGG pathways plotted against the t value of GSVA score, grouped as Down / Not / Up. (D, E) GSEA curves of Running Enrichment Score and Ranked List Metric against Rank in Ordered Dataset. Terms enriched in the high-expression groups include GOBP chromosome segregation, mitotic sister chromatid segregation, and nuclear chromosome segregation; terms enriched in the low-expression groups include GOBP granulocyte migration, leukocyte chemotaxis, and humoral immune response.]

Fig. 8 Cell-based validation of the regulatory effects of ANLN and A3B on cell migration. (A-C) Western blot and qPCR detection of ANLN, A3B, and β-actin expression levels in different groups. (A) is composed of bands obtained from 3 blots. (B) WB quantitative data of A3B (left) and ANLN (right). (C) qPCR quantitative data of A3B (left) and ANLN (right). (D-G) Scratch assay at 0 h (yellow) and 24 h (green). (D) siNC, (E) siA3B, (F) siA3B+OE-ANLN, (G) siNC+OE-ANLN. (H) Quantitative data of scratch assay. (I-L) Transwell migration after 24 h. (I) siNC, (J) siA3B, (K) siA3B+OE-ANLN, (L) siNC+OE-ANLN. (M) Quantitative data of Transwell migration

[Fig. 8 panel labels: groups siNC, siA3B, siA3B+OE-ANLN, and siNC+OE-ANLN. (A) Bands: ANLN, 110 kDa; A3B, 46 kDa; β-actin, 42 kDa. (B) y-axes: relative ratio of A3B/β-actin and of ANLN/β-actin. (C) y-axes: gene expression of A3B and of ANLN. (H) y-axis: Migration Rate (%), with Migration Rate = [1 - (area 24 h / area 0 h)] × 100%. (M) y-axis: number of migrated cells.]

Table 1 Information of Datasets
Dataset            Platform                 Sample Size
GSE10927           GEO Database             43
GSE12368           GEO Database             18
GSE19750           GEO Database             48
GSE90713           GEO Database             63
GSE143383          GEO Database             62
ACC_RNAseq_fpkm    TCGA (Cancer Database)   77

power value corresponding to a signed R² > 0.9 was selected as the optimal value. The adjacency matrix was then converted into a topological overlap matrix (TOM), with the minimum number of genes per module set to 40. The DEGs and the WGCNA module genes were intersected to identify the intersecting genes (hereafter, inter genes). GO and KEGG functional enrichment analyses were conducted with the significance threshold set at adjusted P < 0.05.
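The selection rules in this subsection can be illustrated with a minimal Python sketch. It is not the authors' pipeline (their code is in the supplementary materials); the helper names, gene labels, and all numbers below are hypothetical and serve only to show the thresholds: the smallest power with R² > 0.9, the DEG filter |log2FC| > 1.5 with adj. P < 0.05, and the intersection with module genes.

```python
# Illustrative sketch only: helper names, gene labels, and numbers are hypothetical.

def pick_soft_threshold(fit_by_power, r2_cutoff=0.9):
    """Return the smallest power whose scale-free fit R^2 exceeds the cutoff."""
    passing = [power for power, r2 in fit_by_power.items() if r2 > r2_cutoff]
    return min(passing) if passing else None


def filter_degs(stats, lfc_cutoff=1.5, padj_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and adjusted P < padj_cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in stats.items()
        if abs(log2fc) > lfc_cutoff and padj < padj_cutoff
    }


# Hypothetical scale-free fit values for candidate powers.
fit_by_power = {1: 0.21, 2: 0.48, 3: 0.71, 4: 0.86, 5: 0.92, 6: 0.95}

# Hypothetical differential expression results: gene -> (log2FC, adjusted P).
deg_stats = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.8, 0.020),
    "GENE_C": (1.2, 0.001),   # fails the fold-change cutoff
    "GENE_D": (2.0, 0.200),   # fails the adjusted P cutoff
}

# Hypothetical genes belonging to the key WGCNA modules.
module_genes = {"GENE_A", "GENE_C", "GENE_E"}

power = pick_soft_threshold(fit_by_power)
degs = filter_degs(deg_stats)
inter_genes = degs & module_genes
print(power, sorted(degs), sorted(inter_genes))
```

Returning `None` when no power passes the cutoff makes the failure case explicit rather than silently falling back to a default.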

Mendelian Randomization

Genome-wide association study (GWAS) summary statistics for exposures were obtained from the MRC Integrative Epidemiology Unit (MRC-IEU) GWAS database[22] (https://gwas.mrcieu.ac.uk/), frozen as of August 4, 2024. In total, 50,043 GWAS datasets were screened, comprising 323,281 SNP-trait associations (Supplementary Tables S1 and S2), of which European-ancestry cohorts were prioritized for downstream analyses. Outcome data were sourced from release R11 of the FinnGen database[23] (https://www.finngen.fi/fi), accessible at https://r11.finngen.fi/pheno/C3_ADRENAL_GLAND_EXALLC. MR analysis was conducted in the R environment, using the TwoSampleMR package to merge and align the exposure and outcome data. Instrumental variables (IVs) were selected according to the following criteria: the standard GWAS significance threshold (P < 5 × 10⁻⁸), a linkage disequilibrium threshold of R² < 0.01, and an F-statistic greater than 10. SNPs were harmonized between the exposure and outcome datasets so that effect alleles were aligned. The primary MR analysis used the inverse-variance weighted (IVW) approach under a fixed-effect model. When Cochran's Q test indicated heterogeneity (P < 0.05), a multiplicative random-effects IVW model was applied. Sensitivity analyses included MR-Egger regression, weighted median estimation, and leave-one-out analysis. The MR-Egger intercept test was used to evaluate horizontal pleiotropy, and heterogeneity was assessed using Cochran's Q statistic. Candidate causal genes identified from the first-stage MR analysis were intersected with the WGCNA modules and DEGs. eQTL data were retrieved for the intersected set and subjected to a second-round MR analysis. Finally, forest plots, scatter plots, and funnel plots were generated. The code and parameter settings used for data processing and analysis are provided in the supplementary materials (S1).
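As a rough illustration of the IVW step, the sketch below computes a fixed-effect IVW estimate from per-SNP summary statistics, filters weak instruments with the common single-SNP approximation F ≈ (β/SE)², and reports Cochran's Q. It is a simplified Python stand-in, not the TwoSampleMR implementation the authors used; the function name, the dictionary keys, and the SNP values are hypothetical, and the random-effects option follows one common convention (inflating the standard error by √(Q/(n-1)) when that factor exceeds 1).

```python
from math import sqrt

# Illustrative sketch only: not the TwoSampleMR implementation.
# All SNP summary statistics below are hypothetical.


def ivw(snps, f_min=10.0, random_effects=False):
    """Inverse-variance weighted MR estimate from per-SNP summary statistics.

    Each SNP is a dict with exposure effect `bx` (SE `se_x`) and outcome
    effect `by` (SE `se_y`). Instruments with F = (bx / se_x)^2 <= f_min
    are dropped before estimation.
    """
    kept = [s for s in snps if (s["bx"] / s["se_x"]) ** 2 > f_min]
    weights = [s["bx"] ** 2 / s["se_y"] ** 2 for s in kept]
    ratios = [s["by"] / s["bx"] for s in kept]  # per-SNP Wald ratios
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviation of the Wald ratios from beta.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    if random_effects and len(kept) > 1:
        # Multiplicative random effects: inflate the SE, never shrink it.
        se *= sqrt(max(1.0, q / (len(kept) - 1)))
    return {"beta": beta, "se": se, "q": q, "n_snps": len(kept)}


snps = [
    {"bx": 0.10, "se_x": 0.01, "by": 0.050, "se_y": 0.02},
    {"bx": 0.20, "se_x": 0.02, "by": 0.100, "se_y": 0.02},
    {"bx": 0.15, "se_x": 0.01, "by": 0.075, "se_y": 0.03},
    {"bx": 0.01, "se_x": 0.01, "by": 0.200, "se_y": 0.02},  # F = 1, excluded
]

result = ivw(snps)
print(result)
```

The P value for Q would come from a chi-square distribution with n - 1 degrees of freedom; it is omitted here to keep the sketch free of third-party dependencies.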

Identification of Hub Genes Network

The inter genes in section 3.3 were imported into the STRING database [24] (https://string-db.org/), with the minimum interaction confidence score set to 0.4 to generate the PPI network. The PPI network data were then imported into Cytoscape software, and the centrality scores of genes within the network were calculated using the cytoHubba plugin. The MCC (Maximal Clique Centrality) algorithm was employed to select the top 25 genes with the highest scores, which were defined as hub genes.
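For readers unfamiliar with MCC, the following Python sketch shows the scoring idea on a toy network, assuming the usual definition in which a node's score is the sum of (|C| - 1)! over the maximal cliques C that contain it. It is not the cytoHubba code; the node labels and edges are hypothetical, and the clique enumeration uses a standard Bron-Kerbosch search with pivoting.

```python
from math import factorial

# Illustrative sketch only: a toy re-implementation of the MCC idea,
# not the cytoHubba plugin. Node names and edges are hypothetical.


def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph (Bron-Kerbosch with pivoting)."""

    def expand(r, p, x):
        if not p and not x:
            yield r
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def mcc_scores(edges):
    """MCC(v) = sum of (|C| - 1)! over the maximal cliques C that contain v."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = {v: 0 for v in adj}
    for clique in maximal_cliques(adj):
        for v in clique:
            scores[v] += factorial(len(clique) - 1)
    return scores


# Toy network: a triangle A-B-C plus a pendant edge C-D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
scores = mcc_scores(edges)
top = sorted(scores, key=scores.get, reverse=True)[:2]
print(scores, top)
```

In the study, the same ranking would be truncated at the top 25 genes rather than the top 2 used for this toy network.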

Machine Learning

An integrated framework comprising 12 machine learning models and 113 model combinations, each based on different hyperparameters, was applied to the training and validation groups described in section 3.1. These models were selected based on their proven effectiveness in cancer diagnosis and classification tasks in prior studies [25], [26], and the multi-model strategy was intended to ensure balanced performance, robustness, and interpretability. Hub genes were used as candidate variables, and the model combinations were fitted on the training group, with validation conducted on the two validation datasets. AUC values for the training and validation groups were calculated to assess model performance. The model with the highest average AUC was selected, and further analyses were conducted on this model and its feature genes, including the confusion matrix, gene-based ROC curve evaluation (geneROC), and SHAP (Shapley Additive Explanations) analysis to interpret the contribution of model features. Ultimately, the key variables in the best diagnostic model were defined as the feature genes used for subsequent research. The complete code and hyperparameters are provided in the supplementary materials (S1).
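The selection rule described above (highest average AUC across cohorts) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model names, cohort names, labels, and scores are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than with whatever library the authors used.

```python
# Illustrative sketch only: model names, cohorts, labels, and scores are hypothetical.


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc(predictions):
    """Average the AUC over cohorts; `predictions` maps cohort -> (labels, scores)."""
    values = [auc(labels, scores) for labels, scores in predictions.values()]
    return sum(values) / len(values)


# Hypothetical predicted probabilities for two candidate models.
models = {
    "model_a": {
        "train": ([0, 0, 1, 1], [0.10, 0.20, 0.80, 0.90]),
        "validation": ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    },
    "model_b": {
        "train": ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
        "validation": ([0, 1, 0, 1], [0.30, 0.20, 0.60, 0.70]),
    },
}

ranking = {name: mean_auc(preds) for name, preds in models.items()}
best = max(ranking, key=ranking.get)
print(ranking, best)
```

Averaging across training and validation cohorts penalizes models that fit the training data well but generalize poorly, which is the stated motivation for the multi-model strategy.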

Multivariable Cox Regression Analysis and Nomogram Construction

Clinical and transcriptomic data from the TCGA-ACC cohort were used to evaluate the association between candidate gene expression and overall survival (OS). OS time was recorded in days, and survival status was defined as alive or dead. Clinical covariates included age at diagnosis, sex, and tumor stage (Stage I-IV). Gene expression values were standardized as z-scores and included as continuous variables in multivariable Cox proportional hazards models.

Forest plots were generated based on these multivariable Cox model results. Results are reported as hazard ratios (HRs) with 95% CIs. In addition, we used the rms package to integrate clinical information with the expression data of the feature genes, constructed a nomogram using the Cox method, and evaluated the prognostic significance of these features in the TCGA-ACC dataset. Samples were stratified into high- and low-risk groups according to the median risk score; Kaplan-Meier curves were generated, and survival differences between groups were assessed using the log-rank test. Time-dependent ROC curves were plotted to evaluate predictive performance. The full code is provided in the supplementary materials (S1).
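The standardization and median-split steps can be sketched as follows. The Python below is a toy illustration, not the authors' R code: the gene labels, expression values, and Cox coefficients are hypothetical, the risk score is assumed to be the linear predictor (coefficient times z-score, summed over genes), and samples strictly above the median are labeled high risk.

```python
from statistics import mean, median, stdev

# Illustrative sketch only: gene labels, expression values, and coefficients
# are hypothetical and do not come from the study.


def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def risk_scores(expression, coefficients):
    """Linear predictor per sample: sum over genes of coefficient * z-score."""
    z = {gene: zscore(values) for gene, values in expression.items()}
    n_samples = len(next(iter(expression.values())))
    return [
        sum(coefficients[gene] * z[gene][i] for gene in expression)
        for i in range(n_samples)
    ]


def stratify(scores):
    """Label samples above the median risk score as high risk, the rest as low."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


expression = {
    "gene_1": [2.0, 4.0, 6.0, 8.0],
    "gene_2": [1.0, 3.0, 2.0, 6.0],
}
coefficients = {"gene_1": 0.5, "gene_2": 0.3}  # hypothetical log hazard ratios

scores = risk_scores(expression, coefficients)
groups = stratify(scores)
print([round(s, 3) for s in scores], groups)
```

The resulting group labels would then be passed to a survival library for the Kaplan-Meier and log-rank steps, which are outside the scope of this sketch.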

PPI, GSEA and GSVA Analysis

The STRING database was used in the PPI analysis to link the target genes with the feature genes. Genes that could be linked to the target genes within a maximum of five network expansions were selected. In addition, the BioGRID database[14] was searched for feature genes connected to the target genes; intersecting these results with those from the STRING database identified the key gene. We applied the GSVA and GSEA algorithms to analyze KEGG and GO pathway enrichment for the key gene and the target gene at different expression levels, using c2.cp.kegg.v7.4.symbols.gmt and c5.go.Hs.symbols.gmt. In the GSVA analysis, the "GSVA" R package was used with the ssgsea method to calculate pathway scores. These scores were then normalized using the formula:

$$\text{Normalized Score} = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{1}$$

Group comparisons were performed on the normalized scores, and significantly enriched pathways were identified using t-tests. For GSEA, the "clusterProfiler" R package was employed: average expression values were calculated for the high- and low-expression groups, and log2FC values were computed between them. Pathways with an adjusted P value < 0.05 were considered significant. Samples were divided into groups according to the median expression value of the gene of interest.
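A small worked example may help with the normalization in Eq. (1) and the subsequent group comparison. The Python sketch below is hypothetical throughout: the pathway scores and expression values are invented, and Welch's t statistic is used as an assumption, since the text does not state which t-test variant was applied.

```python
from math import sqrt
from statistics import mean, median, variance

# Illustrative sketch only: pathway scores and expression values are hypothetical,
# and the choice of Welch's t statistic is an assumption.


def min_max(xs):
    """Eq. (1): rescale scores to [0, 1]. Assumes the scores are not all equal."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def welch_t(a, b):
    """Welch's t statistic for the difference in means between groups a and b."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# One pathway score per sample, paired with the expression of the gene of interest.
pathway_scores = [0.2, 0.4, 0.6, 1.0, 1.2, 1.4]
gene_expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

normalized = min_max(pathway_scores)
cutoff = median(gene_expression)
high = [s for s, e in zip(normalized, gene_expression) if e > cutoff]
low = [s for s, e in zip(normalized, gene_expression) if e <= cutoff]

t_value = welch_t(high, low)
print([round(s, 3) for s in normalized], round(t_value, 3))
```

A positive t value here means the pathway score is higher in the high-expression group, matching the sign convention suggested by the Fig. 7 caption.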

Reagents and Materials

DMEM high-glucose medium, penicillin-streptomycin solution, and 0.25% trypsin solution were purchased from Procell (Wuhan, China). Fetal bovine serum (FBS) was obtained from ExCell Bio (Shanghai, China). PBS was purchased from Fuzhou Maixin Biotechnology Co., Ltd. RIPA lysis buffer, ECL detection reagent, phosphatase inhibitors, and BCA protein assay kits were obtained from Beyotime Biotechnology (China). PMSF was purchased from Biosharp (China). PVDF membranes were sourced from Millipore (USA). Transwell inserts were purchased from Corning Incorporated (USA). Primary antibodies against ANLN, A3B, and β-actin were obtained from Wuhan Sanying (China). HRP-conjugated goat anti-rabbit and goat anti-mouse secondary antibodies were purchased from Bioss (China), with catalog numbers bs-0295G and bs-0296G.

Cell Culture

The ACC cell line SW-13 (Cat. No.: CL-0451), kindly provided by Wuhan Pricella Biotechnology Co., Ltd., was cultured in DMEM high-glucose medium supplemented with 10% FBS and 1% penicillin-streptomycin solution. Cells were maintained in a humidified incubator at 37℃ with 5% CO2. For subculturing, cells were washed with PBS and digested using a 0.25% trypsin solution.

siRNA and Plasmid Construction and Transfection

The siRNA targeting A3B was designed based on its mRNA sequence (NCBI Reference Sequence: NM_001270411.2) using the siDirect 2.0 online tool, and the specific sequence (5′-GACCTACGATGAGTTTGAGTACT-3′) was checked for off-target effects using NCBI Primer-BLAST. The overexpression plasmid was constructed by amplifying the ANLN coding sequence (CDS) by PCR, followed by insertion into the pcDNA3.1 vector through the XhoI and EcoRI restriction enzyme sites. Correct construction of the plasmid was verified by restriction enzyme digestion and Sanger sequencing. The plasmid was then transfected into SW-13 cells. Cells were incubated at 37℃ with 5% CO2 for 48 hours before further experiments.

Western Blot Analysis

Cells were lysed in RIPA buffer supplemented with PMSF and phosphatase inhibitors. Lysates were centrifuged at 12,000 rpm for 5 minutes at 4℃, and the supernatant was collected for protein quantification using the BCA protein assay kit. Equal amounts of protein (30 µg) were loaded onto SDS-PAGE gels and transferred to PVDF membranes under constant voltage (25 V) for 10-15 minutes. Membranes were blocked with 5% non-fat milk at ambient temperature for 2 hours, followed by overnight incubation at 4℃ with the primary antibodies: ANLN (1:5000), A3B (1:1000), and β-actin (1:10000). The next day, membranes were washed four times with TBST (5 minutes each) and incubated with HRP-conjugated secondary antibodies (1:5000) at room temperature for 2 hours. Protein bands were visualized using the ECL detection reagent, and band intensities were quantified using ImageJ software.

Quantitative Real-Time PCR (qPCR)

qPCR was performed using SYBR Green on a Roche LightCycler 480 system. Each reaction consisted of 5.0 µL SYBR Green Mix (2×), 0.2 µL forward primer, 0.2 µL reverse primer, 0.5 µL cDNA, and 4.1 µL RNase-free water. β-Actin was used as an internal control. Relative expression levels were calculated using the 2^-ΔΔCt method. Primer sequences: ANLN (forward: 5′-TGCCAGGCGAGAGAATCTTC-3′; reverse: 5′-CGCTTAGCATGAGTCATAGACCT-3′) and A3B (forward: 5′-ACCCATCCTCTATGGTCGGA-3′; reverse: 5′-GCTTGAAATACACCTGGCCTC-3′).
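The 2^-ΔΔCt calculation can be made concrete with a short Python sketch. The Ct values below are hypothetical and chosen only to make the arithmetic easy to follow; the function name is likewise an illustration, not part of any instrument software.

```python
# Illustrative sketch only: the Ct values are hypothetical.


def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference), computed within each sample
    ddCt = dCt(treated sample) - dCt(control sample)
    """
    d_ct_sample = ct_target - ct_reference
    d_ct_control = ct_target_ctrl - ct_reference_ctrl
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical example: the target gene crosses threshold two cycles later in the
# knockdown sample than in the control, with an unchanged reference gene.
fold_change = relative_expression(
    ct_target=26.0, ct_reference=18.0,            # treated sample
    ct_target_ctrl=24.0, ct_reference_ctrl=18.0,  # control sample
)
print(fold_change)  # 0.25
```

A ΔΔCt of 2 therefore corresponds to expression at one quarter of the control level, and a control compared with itself gives exactly 1.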

Scratch Assay

SW-13 cells were seeded into 6-well plates at a density of 2 × 10^5 cells per well and incubated overnight at 37℃ with 5% CO2. The following day, a scratch was made using a 200 µL pipette tip. After washing three times with PBS to remove detached cells, serum-free medium was added, and cells were incubated for 24 hours at 37℃. Images of the wound area were captured at 0 and 24 hours using a microscope (Olympus IX51). Three images were randomly selected from each group, and the wound area was measured using ImageJ software to obtain the average scratch width at each time point. The migration rate was calculated as $\left(1 - \frac{\text{24 h width}}{\text{0 h width}}\right) \times 100\%$.
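As a quick numerical check of the migration-rate formula, the following Python sketch averages three hypothetical width measurements per time point and applies (1 - 24 h width / 0 h width) × 100%. The widths and the function name are invented for illustration.

```python
from statistics import mean

# Illustrative sketch only: the scratch widths (arbitrary units) are hypothetical.


def migration_rate(width_0h, width_24h):
    """Migration rate in percent: (1 - width at 24 h / width at 0 h) * 100."""
    return (1.0 - width_24h / width_0h) * 100.0


# Three images per time point, averaged before applying the formula.
widths_0h = [410.0, 400.0, 390.0]
widths_24h = [110.0, 100.0, 90.0]

rate = migration_rate(mean(widths_0h), mean(widths_24h))
print(rate)  # 75.0
```

A fully closed wound gives 100%, and a wound that has not changed gives 0%.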

Transwell Migration Assay

Transwell inserts were used to assess cell migration. SW-13 cells were suspended in serum-free medium, and 200 µL of the cell suspension was added to the upper chamber of each Transwell insert, while 800 µL of medium containing 10% FBS was added to the lower chamber as a chemoattractant. Cells were incubated at 37℃ with 5% CO2 for 24 hours. After non-migratory cells were removed from the upper surface of the membrane, migrated cells on the lower surface were fixed with methanol and stained with 0.1% crystal violet. Images were captured from five random fields per group, and the number of migrated cells was counted. Cell counts from the five fields were averaged, and three independent biological replicates were performed. Cell quantification was conducted using ImageJ with automated cell counting.

Statistical Analysis

The bioinformatics analyses were conducted in R version 4.3.3 and Python version 3.10. Data from the in vitro experiments were processed using GraphPad Prism 9.5.0. All results are presented as the mean ± standard deviation (SD) of at least three independent experiments. One-way ANOVA was used to evaluate statistical differences among groups. Significance levels are indicated as follows: * P < 0.05, ** P < 0.01, *** P < 0.001, and **** P < 0.0001.
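For orientation, the one-way ANOVA F statistic and the star notation can be sketched in plain Python. The measurements below are hypothetical, the function names are illustrative, and the P value itself (which requires the F distribution and was obtained by the authors in GraphPad Prism) is not computed here. The "ns" label for non-significant comparisons follows the annotation used in Fig. 8.

```python
from statistics import mean

# Illustrative sketch only: the group measurements are hypothetical.


def one_way_anova_f(groups):
    """F = (between-group SS / (k - 1)) / (within-group SS / (N - k))."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def significance_label(p):
    """Map a P value to the star notation defined in this section."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = one_way_anova_f(groups)
print(round(f_value, 3), significance_label(0.0005))
```

In this toy data set the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so F = 13.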

Discussion

ACC is a rare yet extremely aggressive cancer with inadequately characterized molecular features and pathogenic processes. Although surgical excision combined with mitotane chemotherapy is the prevailing standard treatment[27], high rates of recurrence and metastasis constrain therapeutic efficacy. This study employed MR, WGCNA, machine learning, and experimental validation to elucidate the involvement of A3B and its downstream gene, ANLN, in ACC, thereby addressing existing research gaps and laying a foundation for early diagnosis and targeted therapy.

The convergence of these analyses identified A3B as a pivotal gene that influences ACC development and patient prognosis. According to the GO and KEGG enrichment analyses, the hub-gene network is mainly involved in nuclear division, spindle activity, organelle fission, microtubule motor activity, and the cell cycle. These results suggest that the hub genes support cell proliferation and migration through mitotic processes. The high recurrence rate observed after surgical excision with mitotane therapy may be attributable to atypical nuclear division, which results in genetic instability and promotes relapse through the adaptability and proliferation of the remaining tumor cells.

The RF model performed best among the 113 model combinations and identified nine feature genes, including ANLN. Survival analysis, the PPI network, and the BioGRID database supported a link between ANLN and A3B. GSVA and GSEA indicated that these two genes were predominantly enriched in pathways including the cell cycle, DNA replication, leukocyte chemotaxis, and chromosome segregation. Prior research has demonstrated that A3B promotes genomic instability in breast cancer by elevating replication stress and altering chromosomal copy number[28]. Leukocyte chemotaxis pathways are essential in tumor cell migration, proliferation, and metastasis[29], [30]. Furthermore, research on breast cancer indicates that A3B expression depends on the cell cycle but is not directly linked to adverse clinical outcomes[31]. Our data suggest that A3B may indirectly affect ACC prognosis and survival via co-expression with ANLN. These findings point to a dual role of A3B in ACC: facilitating genomic instability by serving as a mutational source, and modulating ANLN to augment tumor proliferation and migration.

Western blot and qPCR results indicated that silencing of A3B (siA3B) markedly reduced ANLN protein levels, whereas overexpression of ANLN (OE-ANLN) failed to restore A3B expression, supporting ANLN as a downstream gene. Moreover, scratch and Transwell migration assays indicated that A3B knockdown significantly impeded ACC cell migration, while ANLN overexpression partially restored it. Notably, qPCR showed that A3B expression in the siNC+OE-ANLN group was elevated compared with the control, suggesting that ANLN overexpression might upregulate A3B. However, Western blot results indicated that A3B protein levels in the siNC+OE-ANLN group were slightly reduced compared with the control, yet remained higher than in the siA3B group. Taken together, these findings suggest that A3B silencing effectively downregulates ANLN, while ANLN overexpression may partially restore or promote A3B expression, implying a potential bidirectional regulatory interaction between A3B and ANLN that warrants further investigation.

Recent studies have reported that nuclear ANLN acts as a transcriptional regulator, clustering with RNA polymerase II (Pol II) at the transcription initiation stage and promoting target gene expression, which drew our attention[32]. McCann et al. showed that overexpression of A3B can significantly reduce R-loop levels and induce gene mutations by binding directly to R-loops and mediating C-to-U conversion[33]. Meanwhile, Sridhara et al. indicated that, in some tumor cells, the accumulation of R-loops can cause Pol II pausing and further induce SRPK2-dependent DDX23 phosphorylation and nucleation to prevent RNA-dependent genomic instability[34]. On this basis, we speculate that the reduction of R-loops caused by A3B overexpression leaves Pol II unable to pause properly while also inhibiting the DDX23-dependent genome stability repair mechanism. This process may promote Pol II escape and further accelerate the expression of ANLN through an as yet unidentified signaling pathway, ultimately affecting the entry of cell cycle proteins, such as CDK1, CCNB1, and CCNB2, into division and thereby affecting cell cycle progression[35].

Moreover, existing studies have reported that in breast cancer, the expression of estrogen receptor α (ERα) exhibits an oscillatory pattern, affecting Pol II elongation and regulation of the c-MYB proto-oncogene, which is closely linked to the development of various endocrine-related cancers[36]. Under normal physiological conditions, after estrogen (E2) activates ERα, the transcription efficiency of E2 target genes is relatively low owing to Pol II pause sites[37]. However, under high E2 levels and high A3B expression in ACC patients, A3B-mediated R-loop resolution may reduce or fail to properly trigger pausing at E2 target genes, potentially leading to accelerated Pol II escape. This may result in increased Pol II recruitment to the ANLN promoter region and enhanced transcriptional activity, thereby contributing to aberrantly high expression of E2 target genes. In breast cancer cells and even in normal cells, acute E2 stimulation can induce the formation of R-loops and temporarily suppress the transcription of E2-responsive genes[36], [38], [39]. However, this regulatory mechanism may be disrupted in ACC cells with high A3B expression, failing to prevent excessive Pol II escape and ANLN-mediated Pol II over-accumulation. Ultimately, this imbalance may become a key driver of carcinogenesis[40].

Taken together, these findings suggest a hypothetical model in which A3B overexpression modulates R-loop dynamics, disrupts Pol II pausing, and indirectly promotes ANLN transcription, leading to enhanced cell cycle progression and tumor growth.

Our study does not aim to alter the current surgical standard for ACC treatment, but to help reduce the rate of missed diagnoses through the diagnostic model. The identification and discussion of the A3B-ANLN axis provide a novel perspective for understanding the underlying molecular mechanisms of ACC. Nonetheless, this study has several limitations. (1) The number of negative samples in the GEO cohorts was relatively limited, which may affect the stability and generalizability of the model. (2) Most data in this study were derived from Western populations, which may introduce population stratification bias due to genetic heterogeneity. Differences in gene expression profiles, mutational landscapes, and clinical features between Western and Asian populations could affect model performance. We plan to validate the A3B-ANLN regulatory axis and the 9-gene diagnostic model in independent Asian cohorts and to assess whether population-specific calibration is needed. We also aim to further elucidate the underlying regulatory pathway in future studies.

Conclusions

Our study, for the first time, reveals the critical role of A3B in ACC through its regulation of the downstream gene ANLN. Based on the unprecedented integration of MR, WGCNA, and 113 machine learning methods, we systematically identified the principal ACC-related genes. Rigorous experimental validation further demonstrated the functional relationship between A3B and ANLN, comprehensively elucidating their roles and mechanisms at both the molecular level (regulatory interactions and protein expression) and the phenotypic level (cell proliferation and migration capacity). This study fills a significant gap in the molecular pathology of ACC, providing novel theoretical foundations and potential therapeutic targets for understanding ACC pathogenesis and advancing precision medicine.

Supplementary information The online version contains supplementary material available at https://doi.org/10.1007/s12020-025-04536-w.

Acknowledgements We gratefully acknowledge TCGA, GEO, MRC-IEU, and FinnGen for providing the data used in this study.

Author contributions J.Z. and X.H. wrote the main manuscript text. C.W. and W.D. prepared Figures 1-3 and performed data visualization. H.H. and R.W. conducted experiments and prepared Figures 4-7. Z.X. assisted in data validation and prepared Figure 8. C.L. and J.Z. supervised the project and contributed to the manuscript review and editing. Jiadong Zhang and Xinyu Hu contributed equally to this work and share first authorship. All authors reviewed and approved the final manuscript.

Funding This work was supported by the Sixth Batch of National Outstanding Clinical Talents in Traditional Chinese Medicine Training Program by the National Administration of Traditional Chinese Medicine (Grant No. SATCM Document on Personnel and Education No. 256 [2025]); the Huaword Biotech (Wuhan) Co., Ltd. Workstation under the 2025 Hubei University of Chinese Medicine Postgraduate Workstation Construction Project (Grant No. Traditional Chinese Medicine Research Document No. 50 [2025]); the Hubei Province International Science and Technology Cooperation Project (Grant No. 2024EHA017); the National Administration of Traditional Chinese Medicine National Famous Elderly Chinese Medicine Experts Inheritance Workshop Construction Project; and the Natural Science Foundation of Hubei Province (Grant No. 2022CFD023).

Data Availability The datasets analysed during the current study are available from the corresponding author on reasonable request.

Declarations

Ethics approval The ACC cell line SW-13 (Cat. No.: CL-0451) was kindly provided by Wuhan Pricella Biotechnology Co., Ltd.

Conflict of interest The authors declare no competing interests.

References

1. M. Fassnacht, G. Assie, E. Baudin, G. Eisenhofer, C. de la Fouchardiere, Adrenocortical carcinomas and malignant phaeo- chromocytomas: ESMO-EURACAN Clinical Practice Guide- lines for diagnosis, treatment and follow-up. Annals of Oncology: Official Journal of the European Society for Medical Oncology 31, 1476-1490 (2020)

2. C. de Ponthaud, M. Roy, S. Gaujoux, Adrenocortical carcinoma: What you at least should know. The British Journal of Surgery 111, znae177 (2024)

3. A. Calabrese, V. Basile, S. Puglisi, P. Perotti, A. Pia, Adjuvant mitotane therapy is beneficial in non-metastatic adrenocortical carcinoma at high risk of recurrence. European Journal of Endo- crinology 180, 387-396 (2019)

4. Y.S. Elhassan, B. Altieri, S. Berhane, D. Cosentini, A. Calabrese, S-GRAS score for prognostic classification of adrenocortical car- cinoma: An international, multicenter ENSAT study. European Journal of Endocrinology 186, 25-36 (2021)

5. H. Remde, L. Schmidt-Pennington, M. Reuter, L .- S. Landwehr, M. Jensen, Outcome of immunotherapy in adrenocortical carci- noma: A retrospective cohort study. European Journal of Endocri- nology 188, 485-493 (2023)

6. M. Terzolo, M. Fassnacht, P. Perotti, R. Libé, D. Kastelan, Adju- vant mitotane versus surveillance in low-grade, localised adre- nocortical carcinoma (ADIUVO): An international, multicentre, open-label, randomised, phase 3 trial and observational study. The Lancet. Diabetes & Endocrinology 11, 720-730 (2023)

7. M.T. Campbell, V. Balderrama-Brondani, C. Jimenez, G. Tamsen, L.P. Marcal, Cabozantinib monotherapy for advanced adrenocor- tical carcinoma: A single-arm, phase 2 trial. The Lancet. Oncol- ogy 25, 649-657 (2024)

8. C. Ghosh, J. Hu, E. Kebebew, Advances in translational research of the rare cancer type adrenocortical carcinoma. Nature Reviews. Cancer 23, 805-824 (2023)

9. M. Laganà, S. Grisanti, D. Cosentini, V.D. Ferrari, B. Lazzari, Efficacy of the EDP-M Scheme Plus Adjunctive Surgery in the Management of Patients with Advanced Adrenocortical Carci- noma: The Brescia Experience. Cancers 12, 941 (2020)

10. M. Uchihara, M. Tanioka, Y. Kojima, T. Nishikawa, K. Sudo, Clinical management and outcomes associated with etoposide, doxorubicin, and cisplatin plus mitotane treatment in metastatic adrenocortical carcinoma: A single institute experience. Interna- tional Journal of Clinical Oncology 26, 2275-2281 (2021)

11. J. Zou, C. Wang, X. Ma, E. Wang, G. Peng, APOBEC3B, a molecular driver of mutagenesis in human cancers. Cell & Bio- science 7, 29 (2017)

12. N. Kanu, M.A. Cerone, G. Goh, L .- P. Zalmas, J. Bartkova, DNA replication stress mediates APOBEC3 family mutagenesis in breast cancer. Genome Biology 17, 185 (2016)

13. S.K. Gara, M.V. Tyagi, D.T. Patel, K. Gaskins, J. Lack, GATA3 and APOBEC3B are prognostic markers in adrenocortical carci- noma and APOBEC3B is directly transcriptionally regulated by GATA3. Oncotarget 11, 3354-3370 (2020)

14. R. Oughtred, J. Rust, C. Chang, B .- J. Breitkreutz, C. Stark, The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci- ence 30, 187-200 (2021)

15. T. Barrett, S.E. Wilhite, P. Ledoux, C. Evangelista, I.F. Kim, NCBI GEO: Archive for functional genomics data sets-update. Nucleic Acids Research 41, D991-D995 (2013)

16. A.P. Heath, V. Ferretti, S. Agrawal, M. An, J.C. Angelakos, The NCI Genomic Data Commons. Nature Genetics 53, 257-262 (2021)

17. T.J. Giordano, R. Kuick, T. Else, P.G. Gauger, M. Vinco, Molecular classification and prognostication of adrenocortical tumors by transcriptome profiling. Clinical Cancer Research: An Official Journal of the American Association for Cancer Research 15, 668-676 (2009)

18. P.S.H. Soon, A.J. Gill, D.E. Benn, A. Clarkson, B.G. Robinson, Microarray gene expression and immunohistochemistry analyses of adrenocortical tumors identify IGF2 and Ki-67 as useful in differentiating carcinomas from adenomas. Endocrine-Related Cancer 16, 573-583 (2009)

19. C.R. Legendre, M.J. Demeure, T.G. Whitsett, G.C. Gooden, K.J. Bussey, Pathway Implications of Aberrant Global Methylation in Adrenocortical Cancer. PLoS ONE 11, e0150629 (2016)

20. I.D. Weiss, L.M. Huff, M.O. Evbuomwan, X. Xu, H.D. Dang, Screening of cancer tissue arrays identifies CXCR4 on adrenocortical carcinoma: Correlates with expression and quantification on metastases using 64Cu-plerixafor PET. Oncotarget 8, 73387-73406 (2017)

21. T. Fojo, L. Huff, T. Litman, K. Im, M. Edgerly, Metastatic and recurrent adrenocortical cancer is not defined by its genomic landscape. BMC Medical Genomics 13, 165 (2020)

22. B. Elsworth, M. Lyon, T. Alexander, Y. Liu, P. Matthews, The MRC IEU OpenGWAS data infrastructure. bioRxiv 2020.08.10.244293 (2020)

23. M.I. Kurki, J. Karjalainen, P. Palta, T.P. Sipilä, K. Kristiansson, FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508-518 (2023)

24. D. Szklarczyk, R. Kirsch, M. Koutrouli, K. Nastou, F. Mehryary, The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Research 51, D638-D646 (2023)

25. J. Tang, Y. Fang, Z. Xu, Establishment of prognostic models of adrenocortical carcinoma using machine learning and big data. Frontiers in Surgery 9, 966307 (2023)

26. R. Martin-Hernandez, S. Espeso-Gil, C. Domingo, P. Latorre, S. Hervas, Machine learning combining multi-omics data and network algorithms identifies adrenocortical carcinoma prognostic biomarkers. Frontiers in Molecular Biosciences 10, 1258902 (2023)

27. T. Else, A.C. Kim, A. Sabolch, V.M. Raymond, A. Kandathil, Adrenocortical Carcinoma. Endocrine Reviews 35, 282-326 (2014)

28. S. Venkatesan, M. Angelova, C. Puttick, H. Zhai, D.R. Caswell, Induction of APOBEC3 Exacerbates DNA Replication Stress and Chromosomal Instability in Early Breast and Lung Cancer Evolution. Cancer Discovery 11, 2456-2473 (2021)

29. S.J. Youngs, S.A. Ali, D.D. Taub, R.C. Rees, Chemokines induce migrational responses in human breast carcinoma cell lines. International Journal of Cancer 71, 257-266 (1997)

30. E. Loukinova, G. Dong, I. Enamorado-Ayalya, G.R. Thomas, Z. Chen, Growth regulated oncogene-alpha expression by murine squamous cell carcinoma promotes tumor growth, metastasis, leukocyte infiltration and angiogenesis by a host CXC receptor-2 dependent mechanism. Oncogene 19, 3477-3486 (2000)

31. D.W. Cescon, B. Haibe-Kains, T.W. Mak, APOBEC3B expression in breast cancer reflects cellular proliferation, while a deletion polymorphism is associated with immune activation. Proceedings of the National Academy of Sciences of the United States of America 112, 2841-2846 (2015)

32. Y.-F. Cao, H. Wang, Y. Sun, B.-B. Tong, W.-Q. Shi, Nuclear ANLN regulates transcription initiation related Pol II clustering and target gene expression. Nature Communications 16, 1271 (2025)

33. J.L. McCann, A. Cristini, E.K. Law, S.Y. Lee, M. Tellier, APOBEC3B regulates R-loops and promotes transcription-associated mutagenesis in cancer. Nature Genetics 55, 1721-1734 (2023)

34. S.C. Sridhara, S. Carvalho, A.R. Grosso, L.M. Gallego-Paez, M. Carmo-Fonseca, Transcription Dynamics Prevent RNA-Mediated Genomic Instability through SRPK2-Dependent DDX23 Phosphorylation. Cell Reports 18, 334-343 (2017)

35. A. Ikeya, M. Nakashima, M. Yamashita, K. Kakizawa, Y. Okawa, CCNB2 and AURKA overexpression may cause atypical mitosis in Japanese cortisol-producing adrenocortical carcinoma with TP53 somatic variant. PLoS ONE 15, e0231665 (2020)

36. C. Vantaggiato, M. Tocchetti, V. Cappelletti, A. Gurtner, A. Villa, Cell cycle dependent oscillatory expression of estrogen receptor-α links Pol II elongation to neoplastic transformation. Proceedings of the National Academy of Sciences of the United States of America 111, 9561-9566 (2014)

37. C.G. Danko, N. Hah, X. Luo, A.L. Martins, L. Core, Signaling Pathways Differentially Affect RNA Polymerase II Initiation, Pausing, and Elongation Rate in Cells. Molecular Cell 50, 212-222 (2013)

38. C.T. Stork, M. Bocek, M.P. Crossley, J. Sollier, L.A. Sanz, Co-transcriptional R-loops are the main cause of estrogen-induced DNA damage. eLife 5, e17548 (2016)

39. Y. Zhang, T. Liu, F. Yuan, L. Garcia-Martinez, K.D. Lee, The Polycomb protein RING1B enables estrogen-mediated gene expression by promoting enhancer-promoter interaction and R-loop formation. Nucleic Acids Research 49, 9768-9782 (2021)

40. A. Chimento, A. De Luca, M.C. Nocito, S. Sculco, P. Avena, SIRT1 is involved in adrenocortical cancer growth and motility. Journal of Cellular and Molecular Medicine 25, 3856-3869 (2021)

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.