International Journal of Molecular Sciences
MDPI
Article
Intratumoral Microbiota Correlates with AP-2 Expression: A Pan-Cancer Map with Cohort-Specific Prognostic and Molecular Footprints
Damian Kołat 1,2,*DD, Piotr Gromek 10D, Lin-Yong Zhao 3, Żaneta Kałuzińska-Kołat 1,2,4 (D, Mateusz Kciuk 1,2[D, Renata Kontek 20 and Elżbieta Płuciennik 1D
1 Department of Functional Genomics, Medical University of Lodz, 90-752 Lodz, Poland; piotr.gromek@stud.umed.lodz.pl (P.G.); zaneta.kaluzinska@umed.lodz.pl (Ż.K .- K.); mateusz.kciuk@biol.uni.lodz.pl (M.K.); elzbieta.pluciennik@umed.lodz.pl (E.P.)
2 Department of Molecular Biotechnology and Genetics, University of Lodz, 90-237 Lodz, Poland; renata.kontek@biol.uni.lodz.pl
3 Department of General Surgery, West China Hospital, Sichuan University, Chengdu 610041, China; 153795352@scu.edu.cn
4 Department of Biomedicine and Experimental Surgery, Medical University of Lodz, 90-136 Lodz, Poland
* Correspondence: damian.kolat@umed.lodz.pl
check for updates
Academic Editor: Apostolos Zaravinos
Received: 4 November 2025 Revised: 27 November 2025
Accepted: 28 November 2025 Published: 29 November 2025
Citation: Kołat, D .; Gromek, P .; Zhao, L .- Y .; Kałuzińska-Kołat, Ż .; Kciuk, M .; Kontek, R .; Płuciennik, E. Intratumoral Microbiota Correlates with AP-2 Expression: A Pan-Cancer Map with Cohort-Specific Prognostic and Molecular Footprints. Int. J. Mol. Sci. 2025, 26, 11587. https://doi.org/ 10.3390/ijms262311587
Copyright: @ 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).
Abstract
The AP-2 family is a group of key regulators in cancer, yet their relationship with in- tratumoral microbes remains undefined. The present pan-cancer workflow leveraged TCGA transcriptomic data to correlate expression of AP-2 representatives with bacterial abundance on the genus and species level, followed by cohort-specific survival modeling, clinical profiling, differential expression, weighted co-expression analysis, and chromatin proximity tests with AP-2 enrichment. Significant correlations between microbiota and AP- 2 were observed in 18 of 33 analyzed tumor types; TFAP2E-AS1 was most recurrent among AP-2 members, and Halomonas was most recurrent among genera. Further species-level verification and prognostic importance nominated three promising pairs: Paraburkholderia fungorum-TFAP2E in adrenocortical carcinoma (ACC), Actinomyces oris-TFAP2E in diffuse large B-cell lymphoma (DLBC), and Cutibacterium granulosum-TFAP2B in stomach ade- nocarcinoma (STAD). An attempt to define a consensus expression signature driven by microbiota and AP-2, yet independent of the specific species or family member, revealed genes regulating various biological processes and pathways. ACC and DLBC shared a consensus expression program, whereas STAD diverged; chromatin analysis showed AP-2 motifs near microbe-responsive genes in ACC and DLBC but not STAD, supporting cohort- specific regulation. Collectively, AP-2 family members emerge as plausible mediators of tumor microbiota-host interplay, warranting further mechanistic and translational research.
Keywords: microbiota; AP-2; pan-cancer; bioinformatics; prognosis; TFAP2E; TFAP2B; adrenocortical carcinoma; diffuse large B-cell lymphoma; stomach adenocarcinoma
1. Introduction
The human microbiome, a complex ecosystem of bacteria, viruses, fungi, and proto- zoa, colonizes diverse anatomical niches (e.g., gut, oral cavity, skin), orchestrating both homeostasis and disease [1,2]. While the majority of research has predominantly focused on the influence of gut microbiota on the efficacy of anti-cancer therapies (particularly immunotherapy) [3-5], emerging evidence reveals that microbial communities within
non-gastrointestinal tumors actively participate in disease development, progression, and clinical outcomes [6,7]. Advanced sequencing technologies have uncovered distinct micro- biomes across cancers, with bacterial composition varying not only between malignancies but also among patients with the same tumor type [8]. Intratumoral bacteria can influ- ence drug metabolism and alter the efficacy of checkpoint inhibitors, highlighting their therapeutic relevance [9,10]. Dysbiosis in tumors correlates with advanced disease stage, metastasis, and poor survival [11,12], suggesting that microbiome-targeted strategies could complement existing cancer therapies.
Microbes are recognized as integral components of the tumor microenvironment, where they modulate inflammation, angiogenesis, and immune evasion through direct microbial-host cell interactions or the secretion of metabolites [13-15]. In fact, the regu- lation of metabolic processes or the immune response is driven by transcriptional mech- anisms [16]. The microbiome regulates host gene expression through bidirectional in- teractions in cellular/ extracellular compartments [17]. Microbiome-derived stimuli (e.g., transcripts, metabolites, enzymes, pH changes) are recognized by host cells via extracellular receptors or intracellular entry. These stimuli modulate transcription factor (TF) binding, DNA methylation, chromatin accessibility, transcription, and alternative splicing, driving differential gene/protein expression. Conversely, host-derived proteins mutually shape microbial growth and transcriptional activity, establishing a feedback loop [17]. The mi- crobiota can regulate host gene transcription independently of chromatin accessibility, primarily through differential expression (DE) of TFs and enrichment of their binding sites in nucleosome-depleted cis-regulatory regions [18]. Beyond TF expression, the microbiota also alters TF binding activity and induces epigenetic modifications. Notably, microbial components can directly interact with TFs, as demonstrated for specific host-microbe molecular interfaces [18,19].
For instance, the microbiome inhibits the HNF4« transcription factor, impairing the regulation of inflammatory pathways and promoting pro-inflammatory states [19]. Simi- larly, the microbiota modulates key transcription factors (e.g., KLF4, AP-1, SP-1), underscor- ing systemic microbial-TF interactions [20]. Interestingly, one of the TF families important in carcinogenesis, but yet to be explored in relation to the microbiota, is the Activating enhancer-binding protein-2 (AP-2). Microbiome-related research on AP-2 members should be initiated; one of the premises behind this is H. pylori infection, which has been shown to disrupt cellular processes by activating the AP-1 family, a member of the same superclass as the AP-2 family [21,22]. Our recent works highlight the broader role of the AP-2 TFs in cancer and gastroenterological disorders, including gastric/colorectal tumors, metabolic diseases, and gut dysmotility [23,24]. The AP-2 family regulates embryogenesis (e.g., limb, eye, and facial development) under homeostatic conditions but exhibits dysregulated functionality in cancer, serving as a prognostic marker [25,26]. AP-2 encompasses five proteins, i.e., AP-2x, AP-26, AP-2y, AP-28, and AP-2€, which are encoded, respectively, by TFAP2A, TFAP2B, TFAP2C, TFAP2D, and TFAP2E genes. In addition, three genes affiliated with the long non-coding RNA (lncRNA) class have been identified, i.e., TFAP2A-AS1, TFAP2A-AS2, and TFAP2E-AS1 [27,28]. Structurally, AP-2 proteins belong to the basic Helix-Span-Helix class (bHSH) of Superclass-1, with a transactivation domain at the amino terminus and DNA-binding/dimerization motifs at the carboxyl terminus [27]. The HSH motif facilitates homo- and heterodimerization but requires post-translational modifica- tions (phosphorylation, sumoylation) and interactions with other proteins to modulate subcellular localization, DNA binding, or degradation [25,29]. AP-2 TFs bind GC-rich DNA motifs, with all members except AP-28 containing a proline-rich activation domain [30]. Previously, we indicated that AP-2 members regulate biological processes that underlie cancer hallmarks [31]. Moreover, these TFs are known to affect the immune system and
the effectiveness of related therapies [32], as well as invasiveness [33] or proliferation [23], i.e., aspects frequently discussed in the microbiota literature [11,34]. The importance of AP-2 in the diagnosis of selected tumors, its implication in a network of long non-coding RNAs and microRNAs, as well as the prognostic significance of its targets, certifies that the complexity of the entire family is worth investigating [27,35,36].
As mentioned above, no study has investigated the potential relationship between intratumoral microbiota abundance and AP-2 expression, which could lay the groundwork for future mechanistic research on relevant tumor types. Thus, we present the first analysis of the relationships between microbial abundance and the expression of AP-2 family members, which was preceded by a general pan-cancer assessment of bacterial composition and diversity. By integrating host-microbiome data across a human pan-cancer spectrum, we identified significant correlations between bacterial genera/species and AP-2 family members, as well as evaluated their prognostic and clinical relevance. Furthermore, gene expression patterns associated with these relationships were examined through differential expression and co-expression network analyses. By bridging microbiome and transcription factor research, this work reveals the biological outcomes underlying the microbiota-AP-2 relationship, which may influence cancer development and progression.
2. Results
2.1. Despite Each Tumor Type Having Its Own Bacterial Composition, Some Commonalities Were Noted
Prior to the investigation focused on the AP-2 family, each cohort was summarized in terms of its bacterial composition (Figure 1). Although heterogeneity between tumors or between patients within a single cohort was expected, it is worth noting that the most preva- lent genus in a given cohort also predominates in others, e.g., Bacillus is the foremost genus in BRCA, KIRC, LUAD, LUSC, OV, PCPG, PRAD, SKCM, THCA, and UCEC. Other prevail- ing genera are Pseudomonas (CHOL, KICH, LGG, PADD, TGCT, THYM, UVM), Paenibacillus (GBM, BLCA, CESC, KIRP, LIHC), Prevotella (ESCA, HNSC, STAD), Bacteroides (COAD, READ), and Actinoplanes (DLBC, MESO). In contrast, the most prevalent genera in ACC, SARC, and UCS are, respectively, Acinetobacter, Peptoclostridium, and Saccharomonospora. In some tumors, mutual exclusivity in bacterial composition can be observed, particularly in KIRP and LIHC, and to a lesser extent in DLBC and SKCM.
Cohorts were also investigated for their microbiota abundance in a pan-cancer spec- trum, as well as in terms of diversity indices (Figure 2). The top 50 most prevalent genera included those mentioned earlier (Bacillus, Pseudomonas, Paenibacillus, Prevotella, Bacteroides, and Acinetobacter), as well as bacteria that are less noticeable when analyzing each tumor separately, such as Escherichia, Klebsiella, Yersinia, and Vibrio. A prevalent characteristic of the tumor microbiome is a subtle decline in the Shannon index compared to control tissues. This finding is often accompanied by a decrease in the Simpson index, indicating domina- tion by a few taxa. Still, some tissues exhibit similar interquartile ranges, independent of the sample type, suggesting that certain tumors have a microbial composition resembling that of normal specimens. As for richness, some tumors, such as BRCA, COAD, OV, and READ, harbor two to three times more taxa than others.
ACC
BLCA
BRCA
CESC
CHOL
COAD
DLBC
ESCA
GBM
HNSC
KICH
KIRC
KIRP
LGG
LIHC
LUAD
LUSC
MESO
OV
PAAD
PCPG
PRAD
READ
SARC
SKCM
STAD
TGCT
THCA
THYM
UCEC
UCS
UVM
Int. J. Mol. Sci. 2025, 26, 11587
Relative abundance (%)
D
100
TCMbio
CH
Legend for Figure 2A -+
S
25
0
0
Shannon
100
g_Escherichia
BLC
g_Pseudomonas
ARCA
@_Staphylococcus
CESC
g_Salmonella
…
CHO
g_Klebsiella
m
0
caso
g_Ralstonia
OLBe
@_Pasteurella
1.00 -
100
@_Porphyrobacter
Coca
Simpson
CA
0.76-
GBar
g_Cutbacterium
OFPca
MASC
g_Burkholderia
0.50-
g_Corynebacterium
Eis
NON
Se
0.25-
APRO
g_Streptococcus
.. ..
COMO
@_Bradyrhizobium
0
Ao
G
0.00
OLBC
CARL
COCA
400
Log (Observed)
g_Yersinia g_Fusobacterium
A00
GBM
Chp
g_Clostridium
OUÇA
MASO
440
@_Halomonas
2
DACA
NICH
400
G_Lactococcus
0
…
GÃsp
ARO
MESO
g_Shigella
g_Enterobacter
g_Hydrogenophaga
90%
1/8
9
00
COND
LAty
440
-.. ..
980
…
g_Bacteroides
600
PCPO
COCA
BRAD
g_Prevotella
Log (Chao1)
900
Che
SLCA
004.
QUAD
1640
9_Vibrio
g_Sphingomonas
4
MAISO
LUST
-..
FAR
MARCA
g_Shinelia
May
MESO
SCH
@_Alicycliphilus
9
…
PESO
WHO1
TIPO
-
2700
… .
0
tipo
20
Boy
Evenness
=
g_Lactobacillus
40
-DA0
DLBC
LAME
CA
INCA
g_Priestia
-.. ..
g_Geobacillus
ESCA
40G
HM
CHO
—
“RO
2
g_Streptomy ces
g_Actinobacillus
GBM
16
Sample_type
ACC
ELCA
440
…
440
non_tumor
UC’s
=
MINSC
9_Actinomyces
g_Anaerococcus
NICH
4g
OKCHA
tumor
ORCA
UM
·
CESC
RESO
STAD
g_Aeromonas
TRO
2
g_Clostridioides
CHOL
Tpo
0%
TGCY
Shannon index
V
COMO
AMAL
140
PACA
8
g_Agrobacterium
CBC
50
SMY
2
9_Kinneretia
100
20
Sample
CE
g_Paenbacillus
ESCA
CANC
GBM
…
READ
B
CUAD
CC
90
g_Bacilus
..
ANSO
SARA
LUSC
Im
non_tumor
NACH
2
type
g_Moracella
g_Acinatobacter
NICH
MESO
STAD
0
KIRO
-
TI
g_Rothia
0%
E
g_Phocaeicola
A/go
For
A10
g_Cupriavidus
CAMI
PCPG
CA
8
g_Methylobacterium
LGG
3
G
PRAD
LINC
CEO
Figure 2. Pan-cancer view of bacterial abundance and diversity indices. (A) Top 50 most prevalent genera in a pan-cancer spectrum according to TCMbio data. (B) Evenness according to BIC. (C,D) Shannon
Simpson index of diversity
READ
Sample_type
9 8
q
g_Niala
CUAD
..
8
MARC
-
g_Ideonella
1
LUSc
tumor
_tumor
UM
40
%
SKCM
MESO
8
g_Neisseria g_Diaphorobacter
STAD
0%
I
0
8
$
.
IGCT
Simpson reciprocal in
lex
0
others
PAAD
THICA
PCPG
8
9h
m
8
PRAD
a
8
Sample_type
CEC
5
1
READ
S
VCS
0
2
MARC
non_tumor
8
8
BIC
6
5KCM
tumor
Un
18
8
STAD
0
8
8
0
IGCT
2%
8
8
THCA
Richness
500
B
8
400
0
THYM
8
V 9
CEO
2
a
€
20
48
0
kg
5 of 27
OCS
200
90
va
2
8
OVM
3
38
100
0
0
0
3 2
280
5g
De
9
8
28
20
8
de
9
40
8
4
q
Ve
%
CECA
20
188
8
VICA
8,
2
30
g
I
40
2%
2
38
00
42
40
I
ch
€40
F
4
100
20,
40
8
280
8
40
1980
30
8
3
8
26
00
de
96
0
4
ty
m
MESO
UD
9
Co
200
30
hy
20
6
Me
2
00
DE
”
80
0
indices according to TCMbio and BIC. (E,F) Simpson indices according to TCMbio and BIC. (G) log2-transformed abundances according to TCMbio. (H) Simpson’s reciprocal index accord- ing to BIC. (I) Chao1 index according to TCMbio. (J) Richness according to BIC. Data from TCMbio enables comparison between tumors and non-tumor specimens, whereas BIC data focus on differ- ences between tumor types.
2.2. Correlation of Microbiota Abundance and AP-2 Expression Revealed That More than Half of the Cohorts Contain Significant Relationships, with Some of Them Having Prognostic Importance
Focusing on the correlation between bacterial genus abundance and AP-2 family member expression, potential associations were observed in more than half of the cohorts, i.e., 18 of 33 tumor types (Table 1). Between cohorts, the most duplicative AP-2 family member was TFAP2E-AS1 (DLBC, ESCA, LAML, MESO, PCPG, READ, SKCM, UCS, UVM), while regarding genera, it was Halomonas (CHOL, DLBC, GBM, KICH, LIHC, SARC, SKCM). Others found in at least three cohorts were TFAP2B (CESC, CHOL, KICH, SARC, STAD, UCS), TFAP2D (GBM, LAML, LIHC, MESO, PCPG, SARC), TFAP2E (ACC, COAD, DLBC, KICH, READ), TFAP2A-AS1 (CHOL, DLBC, KICH, PCPG, READ), Meiothermus (CESC, DLBC, MESO), and Corynebacterium (DLBC, KICH, UCS). The cohort with the highest number of identified relationships was DLBC, while the fewest significant correlations were identified ex aequo in ACC, CESC, COAD, ESCA, LIHC, SKCM, STAD, and UVM.
| TCGA Tumor Cohort | Microbiota (Genus-Level) | AP-2 TF | Correlation Coefficient and Statistical Significance |
|---|---|---|---|
| ACC | Paraburkholderia | TFAP2E | r = 0.55; p < 0.0001 ( **** ) |
| CESC | Meiothermus | TFAP2B | r = 0.36; p < 0.0001 ( **** ) |
| CHOL | Halomonas | TFAP2B | r = 0.41; p < 0.05 (*) |
| Liberibacter | r = 0.45; p < 0.01 ( ** ) | ||
| Moraxella | TFAP2A-AS1 | r = 0.38; p < 0.05 (*) | |
| Sphingomonas | r = 0.35; p < 0.05 (*) | ||
| COAD | Brucella | TFAP2E | r = 0.38; p < 0.0001 ( **** ) |
| DLBC | Actinomyces | TFAP2E | r = 0.65; p < 0.0001 ( **** ) |
| TFAP2E-AS1 | r = 0.67; p < 0.0001 ( **** ) | ||
| Alcaligenes | TFAP2A-AS2 | r = 0.49; p < 0.001 ( *** ) | |
| Brucella | TFAP2A-AS2 | r = 0.31; p < 0.05 (*) | |
| Corynebacterium | TFAP2E-AS1 | r = 0.33; p < 0.05 (*) | |
| Cutibacterium | TFAP2A-AS2 | r = 0.32; p < 0.05 (*) | |
| Gordonia | TFAP2E | r = 0.39; p < 0.01 ( ** ) | |
| TFAP2E-AS1 | r = 0.41; p < 0.01 ( ** ) | ||
| TFAP2A-AS1 | r = 0.31; p < 0.05 (*) | ||
| TFAP2A-AS2 | r = 0.37; p < 0.05 (*) | ||
| Halomonas | TFAP2E-AS1 | r = 0.47; p < 0.001 ( *** ) | |
| Meiothermus | TFAP2E | r = 0.32; p < 0.05 (*) | |
| TFAP2A-AS1 | r = 0.38; p < 0.01 ( ** ) | ||
| TFAP2A-AS2 | r = 0.51; p < 0.001 ( *** ) | ||
| Paraburkholderia | TFAP2E-AS1 | r = 0.36; p < 0.05 (*) | |
| TFAP2A-AS2 | r = 0.51; p < 0.001 ( *** ) | ||
| Sphingomonas | TFAP2A | r = 0.47; p < 0.001 ( *** ) | |
| ESCA | Shewanella | TFAP2E-AS1 | r = 0.38; p < 0.0001 ( **** ) |
| TCGA Tumor Cohort | Microbiota (Genus-Level) | AP-2 TF | Correlation Coefficient and Statistical Significance |
|---|---|---|---|
| GBM | Clostridium | TFAP2D | r = 0.79; p < 0.0001 ( **** ) |
| Haemophilus | TFAP2C | r = 0.39; p < 0.0001 ( **** ) | |
| Halomonas | TFAP2D | r = 0.48; p < 0.0001 ( **** ) | |
| Klebsiella | TFAP2D | r = 0.57; p < 0.0001 ( **** ) | |
| KICH | Corynebacterium | TFAP2B | r = 0.97; p < 0.0001 ( **** ) |
| Halomonas | TFAP2E | r = 0.43; p < 0.001 ( *** ) | |
| TFAP2A-AS2 | r = 0.50; p < 0.0001 ( **** ) | ||
| Vibrio | TFAP2E | r = 0.32; p < 0.01 ( ** ) | |
| TFAP2A-AS1 | r = 0.72; p < 0.0001 ( **** ) | ||
| TFAP2A-AS2 | r = 0.47; p < 0.0001 ( **** ) | ||
| LAML | Aquitalea | TFAP2E-AS1 | r = 0.36; p < 0.0001 ( **** ) |
| Porphyrobacter | TFAP2D | r = 0.30; p < 0.001 ( *** ) | |
| Thiopseudomonas | TFAP2D | r = 0.33; p < 0.0001 ( **** ) | |
| LIHC | Halomonas | TFAP2D | r = 0.42; p < 0.0001 ( **** ) |
| MESO | Meiothermus | TFAP2D | r = 0.35; p < 0.01 ( ** ) |
| TFAP2E-AS1 | r = 0.31; p < 0.01 ( ** ) | ||
| PCPG | Escherichia | TFAP2D | r = 0.34; p < 0.0001 ( **** ) |
| Gordonia | TFAP2A-AS1 | r = 0.35; p < 0.0001 ( **** ) | |
| Mitsuaria | TFAP2E-AS1 | r = 0.33; p < 0.0001 ( **** ) | |
| READ | Aquirufa | TFAP2E | r = 0.35; p < 0.0001 ( **** ) |
| Moraxella | TFAP2A-AS1 | r = 0.32; p < 0.0001 ( **** ) | |
| Limnohabitans | TFAP2E-AS1 | r = 0.34; p < 0.0001 ( **** ) | |
| Polynucleobacter | TFAP2E-AS1 | r = 0.30; p < 0.0001 ( **** ) | |
| SARC | Halomonas | TFAP2D | r = 0.37; p < 0.0001 ( **** ) |
| Staphylococcus | TFAP2B | r = 0.38; p < 0.0001 ( **** ) | |
| SKCM | Halomonas | TFAP2E-AS1 | r = 0.31; p < 0.01 ( ** ) |
| STAD | Cutibacterium | TFAP2B | r = 0.51; p < 0.0001 ( **** ) |
| UCS | Corynebacterium | TFAP2B | r = 0.85; p < 0.0001 ( **** ) |
| Priestia | TFAP2A | r = 0.38; p < 0.01 ( ** ) | |
| Solimonas | TFAP2E-AS1 | r = 0.32; p < 0.05 (*) | |
| UVM | Staphylococcus | TFAP2E-AS1 | r = 0.50; p < 0.0001 ( **** ) |
p < 0.05 (*), p < 0.01 (*), p < 0.001 (*), p < 0.0001 (*).
Evaluating the influence of given TFAP2-microbiota pairs on patient prognosis re- vealed that the following bacterial genera or AP-2 family members affect at least one survival endpoint: (1) Paraburkholderia and TFAP2E in ACC; (2) Meiothermus and TFAP2B in CESC; (3) Sphingomonas and TFAP2A-AS1 in CHOL; (4) Brucella and TFAP2E in COAD; (5) Actinomyces and TFAP2E or TFAP2E-AS1 in DLBC; (6) Halomonas and TFAP2E-AS1 in DLBC; (7) Clostridium and TFAP2D in GBM; (8) Haemophilus and TFAP2C in GBM; (9) Vibrio and TFAP2A-AS1 or TFAP2A-AS2 in KICH; (10) Halomonas and TFAP2A-AS2 in KICH; (11) Halomonas and TFAP2D in LIHC; (12) Mitsuaria and TFAP2E-AS1 in PCPG; (13) Limnohabitans and TFAP2E-AS1 in READ; (14) Polynucleaobacter and TFAP2E-AS1 in READ; (15) Halomonas and TFAP2D in SARC; (16) Staphylococcus and TFAP2B in SARC; (17) Cutibacterium and TFAP2B in STAD; (18) Solimonas and TFAP2E-AS1 in UCS; as well as (19) Staphylococcus and TFAP2E-AS1 in UVM. Corresponding genus-level survival analysis is summarized in Supplementary File S1.
2.3. Species-Level Correlation Analysis and Assessment of Prognostic Significance Have Enabled the Selection of Promising Relationships in Four Cohorts
The aforementioned relationships were investigated at the species level of a given mi- crobiota, confirming that the majority of these relationships remained significant (Table 2). Occasionally, a single TFAP2-microbiota pair on the genus level yielded two or three pairs for investigation on the species level because specific bacteria were sufficiently abundant in a given cohort. For instance, the correlation between TFAP2E-AS1 and Staphylococcus in UVM yielded two relationships at the species level (incorporating S. aureus and S. epider- midis), whereas for TFAP2B and Cutibacterium in STAD, it yielded C. acnes, C. granulosum, and C. modestum. Some pairs were not found at the species level (Meiothermus and TFAP2B in CESC, as well as Sphingomonas and TFAP2A-AS1 in CHOL), making further analysis impossible. A few relationships, although present, did not meet an established threshold for the correlation coefficient and were thus excluded from subsequent study stages. Ex- cluded relationships were: (1) Brucella anthropi and TFAP2E in COAD; (2) Halomonas sp. JS92-SW72 and TFAP2E-AS1 in DLBC; (3) Haemophilus parainfluenzae and TFAP2C in GBM; (4) Halomonas sp. JS92-SW72 and TFAP2D in LIHC; (5) Polynucleobacter sp. Adler-ghost and TFAP2E-AS1 in READ; as well as (6) Halomonas sp. JS92-SW72 and TFAP2D in SARC.
| TCGA Tumor Cohort | Microbiota (Species-Level) | AP-2 TF | Correlation Coefficient and Statistical Significance |
|---|---|---|---|
| ACC | Paraburkholderia fungorum | TFAP2E | r = 0.50; p < 0.0001 ( **** ) |
| COAD | Brucella anthropi | TFAP2E | r = 0.20; p < 0.0001 ( **** ) |
| DLBC | Actinomyces oris | TFAP2E | r = 0.47; p < 0.001 ( *** ) |
| TFAP2E-AS1 | r = 0.46; p < 0.01 ( ** ) | ||
| Halomonas sp. JS92-SW72 | TFAP2E-AS1 | r = 0.29; p < 0.05 (*) | |
| GBM | Clostridium botulinum | TFAP2D | r = 0.32; p < 0.0001 ( **** ) |
| Haemophilus parainfluenzae | TFAP2C | r = 0.18; p < 0.05 (*) | |
| KICH | Halomonas sp. JS92-SW72 | TFAP2A-AS2 | r = 0.35; p < 0.01 ( ** ) |
| Vibrio anguillarum | TFAP2A-AS1 | r = 0.48; p < 0.0001 ( **** ) | |
| TFAP2A-AS2 | r = 0.39; p < 0.01 ( ** ) | ||
| LIHC | Halomonas sp. JS92-SW72 | TFAP2D | r = 0.18; p < 0.001 ( *** ) |
| PCPG | Mitsuaria sp. 7 | TFAP2E-AS1 | r = 0.38; p < 0.0001 ( **** ) |
| READ | Limnohabitans sp. 103DPR2 | TFAP2E-AS1 | r = 0.37; p < 0.0001 ( **** ) |
| Limnohabitans sp. 63ED37-2 | r = 0.35; p < 0.0001 ( **** ) | ||
| Polynucleobacter sp. Adler-ghost | r = 0.22; p < 0.01 ( ** ) | ||
| SARC | Halomonas sp. JS92-SW72 | TFAP2D | r = 0.25; p < 0.0001 ( **** ) |
| Staphylococcus aureus | TFAP2B | r = 0.33; p < 0.0001 ( **** ) | |
| STAD | Cutibacterium acnes | TFAP2B | r = 0.44; p < 0.0001 ( **** ) |
| Cutibacterium granulosum | r = 0.37; p < 0.0001 ( **** ) | ||
| Cutibacterium modestum | r = 0.63; p < 0.0001 ( **** ) | ||
| UCS | Solimonas sp. K1W22B-7 | TFAP2E-AS1 | r = 0.34; p < 0.05 (*) |
| UVM | Staphylococcus aureus | TFAP2E-AS1 | r = 0.38; p < 0.001 ( *** ) |
| Staphylococcus epidermidis | r = 0.72; p < 0.0001 ( **** ) |
p< 0.05 (*), p < 0.01 ( ** ), p < 0.001 ( *** ), p < 0.0001 ( **** ).
Subsequently, the impact of promising bacterial species on patient prognosis was evalu- ated (Supplementary File S2). Since all TFAP2-microbiota pairs showed a positive correlation (Tables 1 and 2), it was decided to select those exerting a significant influence on at least one
survival endpoint, with both bacterial abundance and AP-2 expression having the same effect on survival. Only the following five relationships met the requirements: (1) Paraburkholderia fungorum and TFAP2E in ACC; (2) Actinomyces oris and TFAP2E in DLBC; (3) Actinomyces oris and TFAP2E-AS1 in DLBC; (4) Halomonas sp. JS92-SW72 and TFAP2A-AS2 in KICH; as well as (5) Cutibacterium granulosum and TFAP2B in STAD. Since two relationships involved A. oris in KICH (paired with either TFAP2E or TFAP2E-AS1), it was decided to proceed with TFAP2E, given its superior statistical significance and correlation coefficient. Corresponding species-level survival analysis is summarized in Supplementary File S2.
2.4. Establishment of Representative Patient Groups Reaffirmed Survival Outcomes and Revealed Distinct Clinical Features
Patients from the four selected cohorts underwent intersection analysis to identify representative groups (with higher or lower microbiota abundance and AP-2 expression), which were subsequently evaluated for their prognostic outcomes (Figure 3). Although no significant results were found in KICH, the remaining three cohorts indicated that individuals with simultaneously higher microbiota abundance and AP-2 expression have unfavorable survival relative to those with concurrently lower abundance/expression, based on at least three significant endpoints.
ACC
52
Investigation
DLBC
25
Investigation
KICH
30
Investigation
of P. fungorum and TEAPE
al A. oris and TEAPZE
of H.sp. J592-SW72 and TFAPZA-AS2
STAD
172
Investigation
of C. granulosum
Intarsection Si20
Intersection Size
Intersection 8i20
Intersection Size
and TFAP2B
19
11
13
85
13
9
7
₫
57
51
,
Sic Sian
Sat Sice
Hazard ratio
0.12 (3.00. 0.000 095
4
Ratonunca
0.18:(0:20, 1.06) 0.06
· 13
.
329 13 37, 29:54 03
Overall survival (OS)
1.00-
1.004
1.00
Survival probability
Survival probability
Survival probability
Survival probability
.50
sa
150
2
Log-rank p =0.018
Q.25
Log-tank
ASS
Log-rank
p=0.26
0.95
Log-rank p=0.001
p=0.033
0.00
D.DO
0.00
LantY AP-3 dni
2000
1000
2000
3000 Tits&
4000
1000
9000
0
1000
4500
5000
1000
3000
Strom
Number at risk
Number at risk
Number at risk
Girata
Number at risk
15
a
2
C
2
:
9
1
10
10
3
50
0
32
8
Schoenfeld Individual Test p: 0.342
0
Schoenfeld Individual Test p: 0.1503
Schoenfeld Individual Test p: 0.0993
Schoenfeld Individual Test p: 0.8453
Beta[]]
Bata[D)
Betalt)
Betalt)
300
400
Time
1200
$800 2100
…
350
570
$300 2200 210
460
570
Disease-specific survival (DSS)
Time
150 800
880
110
390
THỦ0 3500
Hazard satio
PIHR
13
Masardi sarka
Harand ratko
PİHA
0.06 (0.01. 0:57) 0.02
·12
.
50
1.00
1.74 |0.16, 19.21)
7
1.00-
Survival probability
Fa.75
Survival probability
0.75
Survival probability
0.75
Survival probability
0.76
50
60
Log-rank
Log-rank
p= 0.018
1.25
Log-rank
Log-nank
higher AO-de supression ant higher micsobida abundance
p = 0.002
0.25
p= 0.85
0.25
+ higher ko-2 mpmission and higher microbios abundance
p = 0.02
higher ko- 2 epression antihigher micsoblou abundance
a.Da
0.00
0.00
1000
8000
4000
5000
1000
Number at risk
5000
9000
000
2000
3000
4000
5000
Number at risk
1000
3000
4000
Number at risk
Number at risk
12
2
:
6
25
11
2
6
U
O
19
10
TO
1
a
Strata
ER
0
7
0
BL
Schoenfeld Individual Test p: 0.342
Schoenfeld test unavailable due to data scarcity
Schoenfeld test unavailable due to data scarcity
Schoenfeld Individual Test p: 0.403
Bata(!)
… ..
…
-
1600
2100
14D
370
500
1500
1900
Disease-free interval (DFI)
Hazard ratio
P(HP)
Hazard ratio cannot be calculated because one of the groups has no events
Hazard ratio cannot be calculated because one of the groups has no events
Hscsd matia
sin sin ie is i
0.09 (0.01, 0 #7) 0.04
: 0.14 10.06. 9.34) =0.001
1./00
1.00-
Survival probabilny
0.75
urvival probability
4.75
Survival probabilny
Survival probabili
0.75
.50
1.50
54
50
S
Log-rank
12
Log-sank
Log-tank
Log-rank
p = 0.000
p = 0.77
1.25
p =0.22
0.29
p < 0.0001
D
0.00
1000
8000 Time 200
4000
5000
1000
4000
Number at risk
3000
Number at risk
Number at risk
2000
Tima
3000
1000
Number at risk
2000 Tìma
2000
5
7
2
9
Strat
15
9
2
5
2
A
9
a
00
Stala
2
?
,
g
0
Schoenfeld Individual Test p: 0.4887
Betalt)
Schoenfeld test unavailable due to data scarcity
Schoenfeld test unavailable due to data scarcity
5
Schoenfeld Individual Test p: 0.1011
Beta()
Progression-free interval (PFI)
190
210
440
710 98
130
220
250
Time
720
1000
Hazard ratio
Harari sofia
1.00-
1
1.00
Pokerunca
1.00
· 13
1.004
Survival probability
Survival probability
0.7%
Survival probability
Survival probability
0.75
50
66
50
025
25
21
Log-rank p=0.66
8
Log-tank
Log-rank
0.00
p = 0.00099
D.OD
p= 0.00031
6
1500
Time
4000
à
1000
3000
1000
1000
3900
4000
1000
Number at risk
Time
0
2000
3000
4000
Strata
Tima
13
1
2
0
Numbor at risk
Numbor at rick
e
11
8
2
8
;
5
Number at rie
19
10
0.
Strato
28
8
1
0
Schoenfeld Individual Test p: 0.8054
25
Schoenfeld Individual Test p: 0.7382
Schoenfeld Individual Test p: 0.1018
Schoenfeld Individual Test p: 0.1932
Bata(!)
Betalt)
Beta(!)
Betalt)
43
TT
100
150
0
300
470
850
330
1830
3000 8400
3800
-
150
170
1400 2009
Time
2200
97
190
270
370 440
Time
550
1000
1800
depicts the results of intersection analysis, while the remaining part concerns survival analysis, with specific endpoints (OS, DSS, DFI, PFI) in rows and tumor cohorts in columns.
ACC, DLBC, and STAD patients were further evaluated for their clinical picture, re- vealing that, among the included characteristics, there were more differences in qualitative than quantitative features (Figure 4). In the first cohort, individuals with simultaneously higher Paraburkholderia fungorum abundance and TFAP2E expression were more frequently characterized by higher tumor stage, C1A molecular subtype, metastasis, COC3 molecular subtype, and steroid phenotype with proliferation pattern. In DLBC, the group with higher Actinomyces oris abundance and TFAP2E expression consisted solely of females and tended to have less germinal-center B-cell-like (GCB) expression subtype, although the amount of unavailable data or unclassified expression subtype complicates the inference. Regarding STAD, patients with higher Cutibacterium granulosum abundance and TFAP2B expression showed more homozygous deletions and were less frequently characterized by MSI molec- ular subtype or more commonly had microsatellite stability; however, this is inconclusive due to the unavailability of data for some individuals. The remaining clinical features with less evident differences between patient groups are depicted in Figure S1.
TFAP2E & P. fungorum 4
TFAP2E \ A. oris !
TFAP2B + C. granulosum !
TFAP2E 1 P. fungorum t
ACC
TFAP2E 1 A. oris 1
DLBC
TFAP2B + C. granulosum t
STAD
Percentage
100
Percentage
100
ACC
80
80
DLBC
tumor stage
60
60
Stage II
40
40
Stage I
gender
Stage IV
20
20
female
Stage III
male
0
0
Percentage
100
Percentage
100
ACC
80
80
DLBC
gene
60
60
expression
molecular subtype
subgroup
40
40
NA
NA
C1B
20
20
Unclass
GCB
C1A
ABC
0
0
Percentage
100
Percentage
100
STAD
ACC
80
80
molecular
tumor classification
subtype
60
60
HM_SNV
40
40
NA
Prior primary
GS
metastasis
20
EBV
primary
20
MSI
CIN
0
0
ACC COC
Percentage
100
Percentage
100
80
80
STAD
subtyping
60
60
MSI
NA
status
COC3
40
40
COC1
NA
COC2
20
20
MSI_L
MSI_H
ACC
MSS
0
0
K4 cluster
Percentage
100
40
** (p=0.00605)
STAD
80
8
steroid_phenotype_low
steroid_phenotype_high+prolif. steroid_phenotype_high
Amount of deletions
60
20
Homozygous deletions
40
2
20
NA
o
0
depicted as a beanplot (p < 0.01 ( ** )), with small lines representing individual data points and a larger horizontal line indicating the median per group. Data acquired from the Genomic Data Commons (GDC) portal.
2.5. An Attempt to Identify a Consensus Expression Profile Dependent on AP-2 and Microbiota Uncovered Genes Regulating Various Biological Processes and Pathways
In addition to clinical picture comparison, it was decided to establish a consensus expression profile across cohorts, related to both microbiota and AP-2, but independent of specific species or family members. Simultaneous evaluation of three cohorts using WGCNA revealed one statistically significant gene module associated with microbiota and AP-2 status (Figure 5A). However, the correlation was less satisfactory than that observed in the joint analysis of ACC and DLBC without STAD (Figure 5B), suggesting that the expression profile changes in STAD are generally distinct from those of the other two tumors, and that some essential genes might not constitute a module. Thus, it was decided to employ DEA between patient groups in a given tumor (Supplementary File S3), followed by the intersection analyses between: (1) WGCNA modules from both approaches, or (2) WGCNA modules and DEA-derived differentially expressed genes. Overlapping significant modules identified 116 genes found in the “tan” module from the three-cohort WGCNA and the “midnightblue” module from the two-cohort approach (Figure 5C). In parallel, intersecting WGCNA and DEA data for ACC and DLBC revealed 223 upregulated and 226 downregulated genes that were considered essential for the manifestation of consensus expression profile (Figure 5D). Comparing the lists of 116 and 223 genes, only 18 were found in the “tan” module from the three-cohort WGCNA (Figure 5E), suggesting that expression changes in STAD are devoid of essential genes to the extent that they only partly resemble the consensus profile found in ACC and DLBC.
A
ACC & DLBC & STAD
Consensus module-trait relationships
B
ACC & DLBC
0.24*
-0.24*
tan
-0.59 **
0.59 **
greenyellow
-0.06
0.06
darkgreen
0.47*
-0.47*
midnightblue
-0.064
0.064
red
0.049
-0.049
lightcyan
grey
grey
Correlation
1
red
grey60
0.5
blue
0
cyan
-0.5
Module
green
-1
magenta
Module
midnightblue
pink
AP-2 1 Microbiota 1
AP-2 4
AP-2 1 Microbiota 1
AP-2 4 Microbiota 4
Microbiota 4
51
1952
223
0
226
1769
116
12
1662
0
0
95
0
98
18
205
0
0
0
0
0
0
0
C
2178
D
E
0
Module “midnightblue” from two-cohort WGCNA Module “greenyellow” from two-cohort WGCNA Module “tan” from three-cohort WGCNA
Module “midnightblue” from two-cohort WGCNA Module “greenyellow” from two-cohort WGCNA Genes upregulated in “AP-2 1 Microbiota t” group identified via limma found in ACC and DLBC cohorts Genes downregulated in “AP-2 t Microbiota t” group identified via limma found in ACC and DLBC cohorts
116 genes overlapping between “midnightblue” module of two-cohort WGCNA and “tan” module of three-cohort WGCNA
223 genes overlapping between “midnightblue” module of two-cohort WGCNA and genes upregulated in “AP-2 1 Microbiota t” group identified via limma found in ACC and DLBC cohorts
(C) Intersection analysis of significant gene modules from both WGCNA approaches. (D) Overlap of WGCNA and DEA data for ACC and DLBC cohorts. (E) Identification of a common part between the 116-gene and 223-gene sets found in two previous subpanels. p < 0.05 (*), p < 0.01 ( ** ).
To assess biological outcomes of such disparity, the above modules and overlapping genes were functionally annotated (Figures 6 and S2). Based on the “WikiPathways cancer” resource, genes residing in the “greenyellow” module are responsible for, e.g., angiogenesis, Wnt signaling, cell adhesion, and interleukin-1 signaling: phenomena that were not found in other gene sets. In contrast, the “midnightblue” module had mutual annotations with the “tan” module and a list of 116 overlapping genes, although with subtle differences. For instance, all three gene sets were linked to ATM signaling, as well as glycolysis and gluconeogenesis. However, the “midnightblue” module is more closely related to DNA mismatch repair and Notch signaling, rather than non-homologous end-joining and Ras signaling, as indicated in the ontology of the other two gene sets. This suggests that the expression profile alterations across three cohorts might entail a few similar biological outcomes, but differences between STAD patient groups do not involve phenomena unique to the “midnightblue” module specific to ACC and DLBC, e.g., metabolic reprogramming or TP53 regulatory network. Extended functional annotation of the same gene sets based on other ontological resources is summarized in Figure S2.
A
Regulation of Wnt B catenin signaling by small molecule compounds
Angiogenesis
Type II interferon signaling
PDGFR beta pathway
PI3K AKT mTOR signaling pathway and therapeutic opportunities in prostate cancer
Transcriptional activation by NRF2 in response to phytochemicals
EPO receptor signaling
NRF2 ARE regulation
Imatinib and chronic myeloid leukemia
Integrin mediated cell adhesion
IL 1 signaling pathway
Focal adhesion PI3K Akt mTOR signaling pathway
TGF beta signaling pathway
Non small cell lung cancer
2178 genes
Chemokine signaling pathway
0.0
Enrichment ratio
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
2.2
B
DNA mismatch repair
Metabolic reprogramming in colon cancer
Retinoblastoma gene in cancer
DNA damage response
G1 to S cell cycle control
ATM signaling in development and disease
4 hydroxytamoxifen dexamethasone and retinoic acids regulation of p27 expression
TP53 network
ATM signaling pathway
Cell cycle
DNA IR double strand breaks and cellular response via ATM
Glycolysis and gluconeogenesis
Fluoropyrimidine activity
Notch signaling
1885 genes
Non small cell lung cancer
Enrichment ratio
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
5.0
C
Non homologous end joining
ATM signaling in development and disease
Glycolysis and gluconeogenesis
DNA IR double strand breaks and cellular response via ATM
Clear cell renal cell carcinoma pathways
128 genes
Ras signaling
Enrichment ratio
0
5
10
15
20
25
30
35
10
15
50
D
Non homologous end joining
ATM signaling in development and disease
Glycolysis and gluconeogenesis
DNA IR double strand breaks and cellular response via ATM
Clear cell renal cell carcinoma pathways
116 genes
Ras signaling
Enrichment ratio
0
5
10
15
20
25
30
35
40
45
50
“midnightblue” from the two-cohort WGCNA. (C) Module “tan” from the three-cohort WGCNA. (D) Genes overlapping between “midnightblue” and “tan” modules. The rectangles on the left are colored according to the WGCNA palette. The last subfigure contains a two-color rectangle because it represents the overlap of the “midnightblue” and “tan” modules.
2.6. AP-2 Engagement of Microbiota-Responsive Chromatin in ACC and DLBC Contrasts with a Null Pattern in STAD
Ultimately, it was evaluated whether microbe-responsive genes show AP-2 motif and ChIP-overlap enrichment in nearby accessible chromatin, implying AP-2-mediated regulation of the microbiota-host transcriptional program. Across radii, ACC and DLBC displayed consistent enrichment of AP-2 motif sites that intersect the ChIP union in FG se- quences relative to GC-matched BG, with the strongest signal at ±750 kb (ACC OR = 1.281, p = 2.4 × 10-5; DLBC OR = 1.258, p = 7.1 × 10-6; empirical one-sided p ~ 1 × 10-4). Enrichment remained positive at ±1000 kb but with smaller effect sizes (ACC OR = 1.219, p = 1.2 × 10-4; DLBC OR=1.106, p=1.6 x 10-2). In STAD, the AP-2 family signal was uniformly absent (OR < 1 and p > 0.05 at all radii), indicating cohort-specific regulatory divergence (Table 3). These patterns were further verified (Table 4): overlaps between motif-assigned nearest genes and AP-2 regulon targets stabilized around ~12-14% (ACC, AP-2€ regulon) and ~10-11% (DLBC, AP-2€ regulon), whereas STAD (AP-2ß regulon) remained low (~2-3%).
| Cohort | Radius (kb) | Odds Ratio | Fisher p-Value | Empirical p-Value |
|---|---|---|---|---|
| ACC | 250 | 1.138 | 8.915 ×10-2 | 8.139 × 10-2 |
| 500 | 1.156 | 2.079 × 10-2 | 2.37 × 10-2 | |
| 750 | 1.281 | 2.403 × 10-5 | 1.00 × 10-4 | |
| 1000 | 1.219 | 1.227 × 10-4 | 2.00 × 10-4 | |
| DLBC | 250 | 1.241 | 3.496 × 10-3 | 2.90 × 10-3 |
| 500 | 1.167 | 4.839 × 10-3 | 4.40 × 10-3 | |
| 750 | 1.258 | 7.099 × 10-6 | 1.00 × 10-4 | |
| 1000 | 1.106 | 1.643 × 10-2 | 1.58 ×10-2 | |
| STAD | 250 | 0.835 | 8.373 × 10-1 | 2.167 × 10-1 |
| 500 | 0.831 | 9.068 ×10-1 | 1.195 × 10-1 | |
| 750 | 0.827 | 9.524 × 10-1 | 6.129 × 10-2 | |
| 1000 | 0.855 | 9.378 × 10-1 | 7.909 ×10-2 |
Odds ratios (FG vs. GC-matched BG) and two-sided Fisher p-values are shown alongside directional empirical one-sided p-values from 10,000 hypergeometric draws.
| Cohort | TF | 250 kb | 500 kb | 750 kb | 1000 kb |
|---|---|---|---|---|---|
| ACC | AP-2€ | 84/622 (13.5%) | 135/1058 (12.8%) | 184/1444 (12.7%) | 223/1822 (12.2%) |
| DLBC | AP-2€ | 60/544 (11.0%) | 100/939 (10.6%) | 137/1255 (10.9%) | 163/1538 (10.6%) |
| STAD | AP-26 | 3/134 (2.24%) | 8/239 (3.35%) | 12/374 (3.21%) | 17/469 (3.62%) |
3. Discussion
The present study establishes the first-ever characterization of the relationship between intratumoral microbiota abundance and AP-2 transcription factor expression across human neoplasms. Through systematic pan-cancer analysis, significant correlations were identified in 18 of 33 tumor types, with subsequent species-level investigation revealing prognostically relevant relationships in ACC, DLBC, and STAD. These findings pave the way for a novel investigative path in cancer biology, suggesting that microbial communities may influence tumor development through previously unexplored transcriptional regulatory mechanisms related to the AP-2 family.
The identification of TFAP2E-AS1 as the most frequently correlated AP-2 family mem- ber represents an encouraging discovery, particularly given its documented role in mi- croRNA regulation [37]. While direct evidence linking TFAP2E-AS1 to the microbiota remains absent, the established involvement of microRNAs in host-microbe interactions suggests potential indirect regulatory mechanisms that require further investigation [38,39]. The prominence of TFAP2E among frequently correlated family members aligns with its documented pleiotropic functions in gastrointestinal diseases, in which the gut microbiota plays an established pathogenic role [24]. This convergence suggests that TFAP2E may serve as a transcriptional mediator of microbiota-host interactions in gastrointestinal ma- lignancies. Though mechanistic insights are still lacking, the location of TFAP2E-AS1 and TFAP2E on the same cytogenetic band (1p34.3) suggests a feedback loop between them and a potential dependency of their expression, which may explain the prominence of both genes among frequently correlated family members. The absence of prior literature on the microbiota-TFAP2D relationship, despite correlations observed across six cohorts, under- scores the novelty of our findings. Given the limited functional characterization of AP-28 relative to other family members, these observations may guide future research on this understudied transcription factor. Previous studies have noted the involvement of TFAP2D in prostate and lung cancer [35,40], suggesting broader relevance across malignancies.
While the predominance of TFAP2C expression in ACC has been previously docu- mented [41], our work represents the first evidence suggesting that TFAP2E may also play a role in this tumor type. The association of TFAP2E and P. fungorum levels with aggres- sive clinical features (including C1A molecular subtype, COC3 classification, and steroid phenotype) aligns with molecular stratification systems, in which these ACC subtypes are associated with worse outcomes [42]. The DLBC findings are particularly compelling, given the extensive evidence for the prognostic significance of microbiota in this malig- nancy [43-48]. While the presence of Actinomyces in DLBC has been documented in the literature [49], its correlation with TFAP2E represents a novel observation. The potential association with reduced frequency of GCB subtype carries clinical relevance, as microbiota composition influences treatment response and survival in DLBC patients [44,45]. The increased number of females observed in a group with higher Actinomyces oris abundance and TFAP2E expression may reflect gender-specific immune-microbiome interactions, con- sistent with emerging evidence of gender-dependent effects of the microbiota on lymphoma progression [46]. For STAD, our findings conform with substantial literature demonstrating the critical role of microbiota in gastric cancer pathogenesis [50-53]. Although specific data on Cutibacterium granulosum are lacking, the putative relationship observed in our study between microsatellite stability and homozygous deletions suggests involvement in genomic instability mechanisms. Previous studies have indicated that microbiota composi- tion influences genomic stability in gastric cancer [52], and AP-2 family members regulate DNA damage response pathways [54], supporting potential mechanistic connections.
The divergent expression profiles between STAD and ACC/DLBC cohorts provide insights into heterogeneous interactions between AP-2 and microbiota across tumor types.
The unique STAD profile, involving TFAP2B rather than TFAP2E, suggests that different AP-2 family members mediate distinct transcriptional programs in response to microbial stimuli. Remarkably, the consensus between ACC and DLBC persists despite fundamental differences in bacterial species (Paraburkholderia vs. Actinomyces) and tumor origins (endocrine vs. hematologic). This convergence, in which TFAP2E was consistently involved in both cohorts and maintained a mutual WGCNA pattern across the specific bacterium, suggests that the identity of the responding AP-2 factor may be a more critical determinant of transcriptional outcomes than either tumor type or bacterial species. These observations, while derived from our most significant findings in only three cohorts with highly refined patient groups, warrant future investigation into whether the primary determinant of microbiota-induced transcriptional changes is not the tumor context or bacterial identity per se, but rather the AP-2 family member engaged in response to microbial stimuli. The FG-BG enrichment and gene-set intersections suggest that AP-2-dependent regulatory programs are selectively engaged in ACC and DLBC within the genomic neighborhoods of microbiota- linked genes, while such coupling is not evident in STAD. The stable OR pattern across radii in ACC and DLBC implies that relevant AP-2-associated elements are distributed across hundreds of kilobases rather than confined to immediate promoters, consistent with enhancer-rich landscapes and distal enhancer-promoter communication. The lack of enrichment in STAD suggests that AP-2 occupancy is either less central to gastric tumor regulatory wiring or operates in contexts not captured by the current microbe-associated gene sets.
Apart from the above most essential relationships, others identified in our initial screening could be elaborated using the literature. The correlation between Haemophilus and TFAP2C in GBM gains context from evidence linking Haemophilus to IDH1 status and glioma grade [55]. For Sphingomonas and TFAP2A-AS1 in CHOL, existing literature suggests potential roles of the aforementioned genus in cholangiocarcinoma pathogenesis [56]. The identification of Staphylococcus as a correlate of SARC and UVM parallels findings in lung cancer, where Staphylococcus levels are notably elevated in tumor tissues compared to healthy controls [57,58]. Staphylococcus exhibits bidirectional effects on carcinogenesis, with pathogenic strains potentially modulating immune responses through virulence factors [59]. Importantly, AP-2 transcription factors in lung cancer show significant correlations with immune cell infiltration, including CD8+ T cells, CD4+ T cells, and other tumor-infiltrating immune populations [60], suggesting potential mechanistic links between transcriptional regulation and immune modulation. This is consistent with Staphylococcus influencing CD8+ T cell infiltration and the immune microenvironment in breast cancer [61], indicating that Staphylococcus may exert universal immunomodulatory functions across multiple malignancies, potentially through interactions with transcriptional programs.
Transcriptomic insights, followed by functional annotation, revealed that microbiota- AP-2 relationships influence key signaling pathways and fundamental cellular processes, such as the DNA damage response and metabolic reprogramming. AP-2« directly regulates transcription of critical DNA repair genes such as TOP2A, NUDT1, POLD1, and PARP1 in hepatocellular carcinoma, thereby facilitating repair of oxidized DNA lesions [62]. AP-2y inhibits GADD45B expression, with overexpressed TFAP2C suppressing GADD45B and PMAIP1 at both mRNA and protein levels in non-small cell lung cancer cells [63]. The identification of ATM signaling and DNA repair pathways aligns with emerging evidence that intratumoral bacteria alter genomic stability by inducing DNA damage and muta- tions [48,64,65]. Regarding metabolic alterations, the microbiota influences tumor metabolic reprogramming, with changes in glycolysis and gluconeogenesis consistent with its estab- lished role in regulating metabolic pathways, including fatty acid synthesis and glutamate metabolism [51]. Moreover, AP-2 family members regulate metabolic processes through
their downstream targets, with TFAP2A shown to increase EZH2 expression by perturbing the activity of the nucleosome remodeling complex, contributing to the metabolic rewiring characteristic of cancer cells [23]. The divergence in pathway enrichment between cohorts (with STAD showing distinct patterns compared with ACC and DLBC) may reflect differ- ences in the AP-2 family members involved and thus distinct transcriptional programs. For instance, phenomena unique to ACC and DLBC (incorporating TFAP2E-related rela- tionships) included Notch signaling and the TP53 network, whereas alternative pathways concerned STAD (in which a TFAP2B-related relationship was noted). This specificity has implications for therapeutic targeting, as different AP-2 members may require distinct intervention strategies.
The consistent observation that elevated bacterial abundance and AP-2 expression correlate with unfavorable survival across multiple endpoints (OS, DSS, DFI, PFI) implies clinical utility. These markers could enhance existing prognostic systems, particularly in ACC, where molecular classification already guides therapy [66], and in DLBC, where microbiota profiling shows promise for patient stratification [43,44]. The species-specific nature of identified correlations opens possibilities for targeted interventions. While direct manipulation of intratumoral microbiota remains challenging, the documented effects of microbiota on treatment response in various cancers suggest potential therapeutic avenues [45,47]. The prominence of TFAP2E across multiple tumor types positions it as a putative target, particularly given its presence in the most promising relationships identified throughout this study, as well as its role in gastrointestinal malignancies, where microbiome-based interventions are advancing rapidly [24,50].
Some study limitations and key takeaways are worth mentioning. Despite being the first investigation to incorporate both microbiota and AP-2 data, the correlative nature of the analysis cannot establish causality, necessitating experimental validation through so- phisticated approaches such as co-culture systems, animal models, and relevant sequencing methods (e.g., Dual-Seq). The genus-level correlations inevitably aggregate heterogeneous species and therefore served only as an initial, broad screening step. It also explains why many PCC values are modest yet significant, which is why only relationships that remained consistent at the species level and in downstream analyses were further interpreted. The high number of significant correlations in some tumors during genus-level analysis (espe- cially in DLBC) may be due to a relatively small cohort size, and these findings require additional verification in larger cohorts in the future. An absence of significant survival effects in KICH, despite initially promising correlations, highlights the importance of mul- tilayered analysis in microbiome studies. Technically, one should consider the potential variability between sequencing methodologies and taxonomic classification systems across databases. In addition, strain-level variations within species, which can dramatically in- fluence host interactions, are challenging to assess in silico. Finally, TFAP2B/E ChIP-seq data are not available in public repositories at the time of analysis; thus, results rely on TFAP2A/C and their union as a proxy for family binding. This limitation affects all cohorts equally and is unlikely to create a divergence between ACC/DLBC and STAD.
All things considered, this work establishes a foundation for investigating microbiota- AP-2 interactions in cancer biology. Priority areas for future research include: (1) mech- anistic validation of identified relationships, particularly the TFAP2E-microbiota axis; (2) characterization of AP-2 antisense transcripts, which are mentioned prominently but lack precise functional studies; (3) investigation of indirect mechanisms, such as microRNA- mediated regulation; (4) expansion to additional tumors or cancer subtypes, particularly those with established microbiome involvement; and (5) development of microbiome- based biomarker signatures incorporating AP-2 members for clinical implementation. The potential for AP-2 family members to serve as therapeutic targets mediating microbiota
effects warrants investigation. Given the established role of microbiota in modulating im- munotherapy response, understanding how AP-2 transcription factors integrate microbial signals could lead to the development of novel therapeutic strategies. The influence of the microbiome on transcriptional programs and host gene expression opens new avenues for precision oncology, enabling integrated profiling of microbial communities and host transcriptional states to guide personalized interventions and improve the selection and effectiveness of combination therapies.
4. Materials and Methods
4.1. Assessment of Bacterial Composition and Diversity in a Pan-Cancer View
Bacterial composition data across all tumor types in The Cancer Genome Atlas (TCGA) [67] were obtained from the Bacteria in Cancer (BIC) database [68], and the relative abundance of the top 15 genera was visualized in composition barplots. BIC also served as a source of genus-level diversity data, presented as boxplots for all available indices: evenness, richness, Shannon index, Simpson index of diversity, and Simpson reciprocal index. In addition, bacterial composition was assessed across a pan-cancer spectrum for the top 50 genera using The Cancer Microbiota (TCMbio) database [3], which was also employed to compare alpha-diversity (Shannon, Simpson, Observed, Chao1) across tumors and non-tumor specimens.
4.2. Acquisition and Processing of Gene Expression and Microbiota Abundance Data
Data for all tumor cohorts available in TCGA (Table 5) were obtained from the Ge- nomic Data Commons repository [69] using the GDCquery() function of the TCGAbiolinks v2.34.1 R-package. Transcriptomic data were downloaded as raw counts generated by the Spliced Transcripts Alignment to a Reference (STAR) protocol, with the 38th major build of the human genome defined by the Genome Reference Consortium (hg38) selected as the reference. Corresponding clinical profiles were merged from both GDC (as part of GDCquery) and TCGA Clinical Data Resource (TCGA-CDR) [70], with the former used to derive general clinical features while the latter utilized as a source of prognostic end- points, i.e., overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). Regarding microbiota, genus-level and species-level abundance data of intratumoral bacteria were downloaded for all TCGA tumors from TCMbio, which utilizes the Kraken2 taxonomic classification system in its pipeline [3]. Data were filtered to investigate genera/species present in at least 20% of a given cohort and in patients having at least two non-zero counts. Among individuals with available gene expression and bacterial abundance data, counts were subjected to Trimmed Mean of M-values (TMM) normalization using the edgeR v4.4.2 R-package. Patients with values acquired through the TMM method were investigated in consecutive stages of the study.
| TCGA Cohort Abbreviation | Full Disease Name/Description | Number of Samples Included in This Study, Considering Microbiota Data Availability |
|---|---|---|
| ACC | Adrenocortical carcinoma | 76 |
| BLCA | Bladder urothelial carcinoma | 395 |
| BRCA | Breast invasive carcinoma | 1090 |
| CESC | Cervical and endocervical cancers | 293 |
| CHOL | Cholangiocarcinoma | 33 |
| COAD | Colon adenocarcinoma | 456 |
| TCGA Cohort Abbreviation | Full Disease Name/Description | Number of Samples Included in This Study, Considering Microbiota Data Availability |
|---|---|---|
| DLBC | Diffuse large B-cell lymphoma | 47 |
| ESCA | Esophageal carcinoma | 162 |
| GBM | Glioblastoma multiforme | 154 |
| HNSC | Head and neck squamous cell carcinoma | 500 |
| KICH | Kidney chromophobe | 65 |
| KIRC | Kidney renal clear cell carcinoma | 532 |
| KIRP | Kidney renal papillary cell carcinoma | 288 |
| LAML | Acute myeloid leukemia | 146 |
| LGG | Lower grade glioma | 510 |
| LIHC | Liver hepatocellular carcinoma | 361 |
| LUAD | Lung adenocarcinoma | 496 |
| LUSC | Lung squamous cell carcinoma | 495 |
| MESO | Mesothelioma | 85 |
| OV | Ovarian serous cystadenocarcinoma | 373 |
| PAAD | Pancreatic adenocarcinoma | 176 |
| PCPG | Pheochromocytoma and Paraganglioma | 179 |
| PRAD | Prostate adenocarcinoma | 484 |
| READ | Rectum adenocarcinoma | 166 |
| SARC | Sarcoma | 253 |
| SKCM | Skin cutaneous melanoma | 103 |
| STAD | Stomach adenocarcinoma | 375 |
| TGCT | Testicular germ cell tumors | 135 |
| THCA | Thyroid carcinoma | 502 |
| THYM | Thymoma | 116 |
| UCEC | Uterine corpus endometrial carcinoma | 545 |
| UCS | Uterine carcinosarcoma | 56 |
| UVM | Uveal melanoma | 79 |
4.3. Correlation Analysis
The Pearson correlation coefficient (PCC) between bacterial abundance and gene expression of AP-2 family members was assessed using the rcorr() function of the Hmisc v5.2-3 R-package. A threshold of p < 0.05 was applied via the which() function of the base v4.4.3 R-package to retain only statistically significant relationships. Since TFAP2- microbiota pairs were later subjected to various analyses, a PCC threshold of at least |0.3| was considered acceptable at this stage. Initial analysis focused on the correlation between genus-level abundance and AP-2 expression; however, the scope of the species- level analysis (see Section 2.3) was outlined following survival analysis (described below) and the interpretation of results for promising genera (see Section 2.2).
4.4. Survival Analysis
Optimal gene expression or bacterial abundance cutpoints to stratify patients into two groups based on OS, DSS, DFI, and PFI were established using the EvaluateCutpoints tool [71], and corresponding Kaplan-Meier curves were generated using the ggsurvplot() function of the survminer v0.5.0 R-package. As mentioned in the previous section, the scope of the survival analysis for bacterial data at both the genus and species levels depended on the results of the correlation analysis, with genus-level information utilized first (see Section 2.2) and species-level data used afterwards (see Section 2.3). During the genus-level approach, it was acceptable to proceed with relationships in which only one component (i.e., microbiota or an AP-2 member) evidently affected survival, with the other component showing at least one result approaching statistical significance. At the species-level stage, it was decided to select relationships that exerted a significant influence on at least one survival endpoint, with both bacterial abundance and AP-2 expression having the same effect on survival.
4.5. Intersection Analysis Followed by Assessment of Prognostic Outcomes
To establish representative groups (with higher or lower microbiota abundance and AP- 2 expression), an intersection analysis was performed using the UpSetR v1.4.0 R-package. Such groups of patients in selected cohorts (see Section 2.4) were subjected to extended survival analysis. In addition to Kaplan-Meier curves generated as in the above section, hazard ratios were assessed using the coxph() function of forestmodel v0.6.2 R-package, whereas the proportional hazards assumption in a Cox model was evaluated using the Schoenfeld residual test performed via the ggcoxdiagnostics() function of survminer v0.5.0 R-package.
4.6. Investigating Clinical Profile and Performing Differential Expression Analysis (DEA) Among Representative Groups of Patients
Representative groups were compared in terms of the clinical profile acquired from GDC, encompassing both qualitative and quantitative features, with the latter detailed in Section 4.11. The distribution of qualitative data was visualized as filled stackbars using the ggplot2 v3.5.1 R-package, while quantitative information was depicted as beanplots generated using the beanplot v1.2 R-package. The same groups of patients were subjected to differential expression analysis (DEA), which was conducted separately for each selected cohort using the limma v3.62.2 R-package, employing the limma-voom method [72,73]. The workflow incorporated normalization through the calcNormFactors() function and the elimination of lowly expressed transcripts (those with ≥5 counts per million in ≥1 library), followed by variance modeling via the voom() transformation. Model fitting was performed in limma using weighted least squares for individual genes through lmFit(), and log2FC values comparing the case group (characterized by higher microbiota abundance and AP-2 expression) against the control group (characterized by lower microbiota abundance and AP-2 expression) were calculated using the makeContrasts() function with default parameters. Empirical Bayes smoothing of standard errors was applied before identifying differentially expressed genes (defined by p < 0.05 and log2FC > 0.58 or ← 0.58) using the topTable() function.
4.7. Bootstrapping and Weighted Gene Co-Expression Network Analysis (WGCNA)
Given the data scarcity in one of the representative groups (only four patients in the DLBC cohort representing higher microbiota abundance and AP-2 expression), this group was subjected twice to the resampling bootstrap method [74,75] to acquire a relevant quantity of samples prior to Weighted gene co-expression network analysis (WGCNA) [76]. The latter was performed on all selected cohorts to obtain consensus modules across tu-
mors using the Biological network reconstruction omnibus (BioNERO) [77]. Read counts were subjected to variance stabilizing transformation [78] via the “vstransform” param- eter set to “TRUE” and filtered by variance using “variance_filter” set to “TRUE” with n = 10,000 during the data preprocessing. Consensus modules were identified using the consensus_trait_cor() and consensus_modules() functions, with the latter executed with the correlation method set to Pearson and a module merging threshold of 0.2 for a signed hybrid type of network. The transformation into an adjacency matrix was performed using BioNERO-estimated optimal power via the consensus_SFT_fit() function and with the scale-free topology fitting index (R2) > 0.8. Established gene-containing modules were correlated to a binary trait representing microbiota abundance and AP-2 expression, with samples denoted as “AP-2 1 Microbiota 1” or “AP-2 Į Microbiota Į”.
4.8. Overlapping DEA and WGCNA Data Alongside Gene Ontology
Given the expression profile disparity between one of the selected cohorts versus the others (see Section 2.5), both WGCNA modules and DEA-derived differentially expressed genes were overlapped using the draw.pairwise.venn() function of the VennDiagram v1.7.3 R-package. The aforementioned modules and overlapping genes were independently subjected to Over-Representation Analysis (ORA) via the WebGestaltR v0.4.6 R-package employing a set of functional annotation resources: Biological Process noRedundant, KEGG, PANTHER, Reactome, and WikiPathways cancer. After setting the reference at “genome” and redundancy removal at “weighted set cover”, up to the top 15 annotations were acquired for each input list of genes.
4.9. Proximity Analysis Around Microbiota-Responsive Genes and FG/BG Set Construction
Following identification of ACC, DLBC, and STAD as the most promising cohorts, it was assessed whether microbe-responsive genes show AP-2 motif and ChIP-overlap enrichment in nearby accessible chromatin, suggesting AP-2-mediated regulation of the microbiota-host transcriptional program. Microbiota-responsive gene sets (utilized to center TSS windows) were derived from various sources (more details in Appendix A.1). Analyses were performed at the cohort level across genomic windows centered on the transcription start sites (TSS) of these genes using four radii (±250, ±500, ±750, ±1000 kb). Accessible chromatin peak sets for ACC and STAD were obtained from the TCGA Pan-Cancer ATAC compendium [79], and the DLBC-oriented analysis utilized ENCODE GM12878 ATAC peaks [80] owing to the lack of DLBC data in the TCGA compendium. Files were converted to BED3 format prior to processing. TSS coordinates were taken from UCSC refFlat on hg38 [81,82]. For each cohort and radius, ATAC peaks were intersected with TSS-centered windows to define foreground (FG) peak windows near upregulated genes, as well as background (BG) peak windows near genes not upregulated. Foreground windows were length/GC matched to background windows to control base composition. GC content was computed with BEDTools nuc [83,84], and adaptive with-replacement matching (tolerances ~2-10% GC and ±20-100 bp) produced a BG set equal in size to FG.
4.10. AP-2 Motif Scanning, ChIP-Seq Integration, and Enrichment Testing
To score AP-2 binding motifs, 300-bp windows centered on peak midpoints (pm150) were converted to FASTA using UCSC twoBitToFa [85] against hg38. Position-weight ma- trices for TFAP2B and TFAP2E were taken from HOCOMOCO v13 [86]. Motif scanning was performed with FIMO (MEME Suite) [87] using default background and a per-site p-value threshold of 1 x 10-3; per-sequence maximum scores and motif-hit BEDs were retained for downstream analysis. AP-2 family binding tracks were compiled from the Cistrome Data Browser v3.0 [88]; ChIP-seq peaks were downloaded, converted to BED3, and harmonized to hg38 (where needed) using UCSC liftOver [89]. No peak sets were available for TFAP2B
or TFAP2E; therefore, analyses relied on TFAP2A and TFAP2C individually and on their union as a surrogate for shared AP-2 occupancy. Motif-ChIP overlap was quantified by intersecting FG and GC-matched BG motif hits with AP-2 ChIP union peaks (bedtools intersect -u [90]). Enrichment of FG relative to BG was tested with a 2 x 2 Fisher’s exact test and summarized as odds ratio (OR) and p-value; directional empirical one-sided p-values were computed by Monte-Carlo draws from the hypergeometric distribution (N = 10,000). Lists of genes related to AP-20 and AP-2€ were obtained from various sources (more details in Appendix A.2) and further used for overlap summaries. Included references were UCSC hg38.2bit and refFlat.txt [91,92].
4.11. Statistical Analysis
All analyses were performed in RStudio 2024.12.1 build 563 with R 4.4.3 (Posit Soft- ware, Boston, MA, USA). Correlations between microbiota abundance and AP-2 expression were quantified using Pearson’s correlation coefficient. Survival endpoints were evaluated using Kaplan-Meier analysis with data-driven cutpoints and Cox proportional hazards models. For quantitative clinical data, normality of distribution was determined by the Shapiro-Wilk test, followed by an unpaired t-test or a Wilcoxon test. DEA between rep- resentative patient groups was assessed using the limma-voom framework, as well as integrated with consensus WGCNA and ORA. Enrichment near microbiota-responsive genes was tested using matched foreground and background peak sets. Results with a p- value less than 0.05 were considered statistically significant. Details on the implementation of specific methods are provided in the above subsections.
5. Conclusions
This study provides the first-ever assessment of the relationships between intratumoral microbiota and AP-2 transcription factors, and their widespread identification suggests a novel regulatory axis in human cancer. Initial screening revealed significant correlations in 18 of 33 tumor types, with TFAP2E-AS1 emerging as the most frequently correlated AP-2 member, and Halomonas as the most prevalent genus. Other frequently correlated compo- nents included TFAP2B, TFAP2D, TFAP2E, TFAP2A-AS1, Meiothermus, and Corynebacterium, with the DLBC tumor cohort exhibiting the highest number of identified relationships. Subsequent species-level analysis and prognostic evaluation identified three most promis- ing relationships: Paraburkholderia fungorum and TFAP2E in ACC, Actinomyces oris and TFAP2E in DLBC, as well as Cutibacterium granulosum and TFAP2B in STAD. Across mul- tiple prognostic endpoints, patients with simultaneously elevated microbiota abundance and AP-2 expression had unfavorable survival compared with those with concurrently lower microbiota and AP-2 levels, which was accompanied by distinct clinical feature patterns. Consensus expression profiling, as well as proximity and enrichment analyses across these three cohorts, has revealed important biological insights. While ACC and DLBC exhibited enrichment of AP-2 motif sites around microbiota-responsive genes and shared a consensus transcriptional profile, STAD displayed uniformly absent AP-2 family signals and showed expression changes that only partially resembled the ACC/DLBC consensus. This divergence manifested functionally, with metabolic reprogramming and TP53 regulation network involvement evident in ACC and DLBC but notably absent in STAD, suggesting that different AP-2 family members (TFAP2E versus TFAP2B) mediate distinct transcriptional programs in response to microbial stimuli. While the mechanistic landscape remains to be elucidated, our findings position AP-2 family members as poten- tial mediators of microbiota-host interactions in cancer, opening new investigative and therapeutic avenues in the rapidly evolving field of cancer microbiome research.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ijms262311587/s1.
Author Contributions: Conceptualization, D.K .; methodology, D.K .; software, D.K .; writing-original draft preparation, D.K., P.G., and E.P .; writing-review and editing, D.K., P.G., L .- Y.Z., Ż.K .- K., M.K., R.K., E.P .; visualization, D.K. and P.G .; supervision, D.K .; project administration, D.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author.
Conflicts of Interest: The authors declare no conflicts of interest.
Appendix A
Appendix A.1. Establishment of Microbiota-Responsive Gene Sets
Three cohort-specific gene sets were assembled to represent host programs plausibly modulated by the focal microbes. For ACC, a published adrenocortical carcinoma tran- scriptomic study was used to define a panel of high-confidence genes altered following treatment with P. fungorum-containing bacterial supernatant [93]. Differentially expressed genes were filtered at q < 0.05, and the top 30 upregulated plus top 30 down-regulated symbols were retained (n = 60). For STAD, a literature-anchored, Cutibacterium-responsive host panel was curated and then restricted to tumor-related genes: (i) genes implicated in Cutibacterium (Propionibacterium)-driven signaling and M2-macrophage polarization (centered on TLR4-PI3K-AKT and immunomodulatory markers such as IL10, MRC1, CD163, ARG1, TGFB1) were drawn from gastric cancer study [94], and (ii) tumor genes positively associated with Propionibacteriaceae abundance in gastric mucosa were taken from an integrative microbiome-transcriptome study [95]; the union was intersected with genes upregulated in gastric carcinoma versus inflamed non-tumor mucosa, yielding a concise, tumor-relevant set (n = 16). For DLBC, due to the lack of an Actinomyces-related host DE catalog, a mechanism-anchored panel (n = 47) was curated to reflect the expected B-lineage response to Gram-positive, lipoteichoic-acid-bearing stimuli. The panel was centered on TLR2/TLR1/TLR6 → MYD88 → IRAK4/IRAK1 → TRAF6 → NF-KB/MAPK, with complementary sensors (TLR9, NOD2), downstream transcription factors (RELA/RELB/ /NFKB1/2; JUN/FOS; MAPK1/3/14; MAP2K3/6), B-cell activation/costimulation and BCR-PI3K nodes (CD69, CD86, CD40, CD19; SYK, LYN, BLNK, PLCG2, PIK3CD), as well as cytokines/chemokines (IL6, TNF, IL1B, CXCL8, CCL2/3/4, CXCL1), consistent with established TLR2-MyD88 biology and B-cell TLR literature [96,97].
Appendix A.2. Acquisition of Genes Related to AP-23 and AP-2€
Candidate gene sets related to AP-26 /TFAP2B and AP-2€/TFAP2E were assembled us- ing a unified multi-source approach that integrated TF-target annotations and co-expression data. Repositories comprised Enrichr [98], cBioPortal [99], CorrelationAnalyzeR [100], UALCAN [101], TFLink [102], GeneCards [103], Harmonizome [104], TFBSDB [105], and TFTG [106]. Lists from sources with available data for specific TF were merged and dedupli- cated, yielding 928 TFAP2B-associated and 2700 TFAP2E-associated genes.
References
1. Chen, B.Y .; Lin, W.Z .; Li, Y.L .; Bi, C .; Du, L.J .; Liu, Y .; Zhou, L.J .; Liu, T .; Xu, S .; Shi, C.J .; et al. Roles of oral microbiota and oral-gut microbial transmission in hypertension. J. Adv. Res. 2023, 43, 147-161. [CrossRef] [PubMed]
2. Adu-Oppong, B .; Thanert, R .; Wallace, M.A .; Burnham, C.D .; Dantas, G. Substantial overlap between symptomatic and asymp- tomatic genitourinary microbiota states. Microbiome 2022, 10, 6. [CrossRef] [PubMed]
3. Sheng, D .; Jin, C .; Yue, K .; Yue, M .; Liang, Y .; Xue, X .; Li, P .; Zhao, G .; Zhang, L. Pan-cancer atlas of tumor-resident microbiome, immunity and prognosis. Cancer Lett. 2024, 598, 217077. [CrossRef]
4. Thomas, A.M .; Fidelle, M .; Routy, B .; Kroemer, G .; Wargo, J.A .; Segata, N .; Zitvogel, L. Gut OncoMicrobiome Signatures (GOMS) as next-generation biomarkers for cancer immunotherapy. Nat. Rev. Clin. Oncol. 2023, 20, 583-603. [CrossRef]
5. Overacre-Delgoffe, A.E .; Bumgarner, H.J .; Cillo, A.R .; Burr, A.H.P .; Tometich, J.T .; Bhattacharjee, A .; Bruno, T.C .; Vignali, D.A.A .; Hand, T.W. Microbiota-specific T follicular helper cells drive tertiary lymphoid structures and anti-tumor immunity against colorectal cancer. Immunity 2021, 54, 2812-2824.e2814. [CrossRef] [PubMed]
6. Pevsner-Fischer, M .; Tuganbaev, T .; Meijer, M .; Zhang, S.H .; Zeng, Z.R .; Chen, M.H .; Elinav, E. Role of the microbiome in non-gastrointestinal cancers. World J. Clin. Oncol. 2016, 7, 200-213. [CrossRef]
7. Galeano Nino, J.L .; Wu, H .; LaCourse, K.D .; Kempchinsky, A.G .; Baryiames, A .; Barber, B .; Futran, N .; Houlton, J .; Sather, C .; Sicinska, E .; et al. Effect of the intratumoral microbiota on spatial and cellular heterogeneity in cancer. Nature 2022, 611, 810-817. [CrossRef]
8. Heymann, C.J.F .; Bard, J.M .; Heymann, M.F .; Heymann, D .; Bobin-Dubigeon, C. The intratumoral microbiome: Characterization methods and functional impact. Cancer Lett. 2021, 522, 63-79. [CrossRef]
9. Yousefi, Y .; Baines, K.J .; Maleki Vareki, S. Microbiome bacterial influencers of host immunity and response to immunotherapy. Cell Rep. Med. 2024, 5, 101487. [CrossRef]
10. Panebianco, C .; Andriulli, A .; Pazienza, V. Pharmacomicrobiomics: Exploiting the drug-microbiota interactions in anticancer therapies. Microbiome 2018, 6, 92. [CrossRef]
11. Zhao, L.Y .; Mei, J.X .; Yu, G .; Lei, L .; Zhang, W.H .; Liu, K .; Chen, X.L .; Kolat, D .; Yang, K .; Hu, J.K. Role of the gut microbiota in anticancer therapy: From molecular mechanisms to clinical applications. Signal Transduct. Target. Ther. 2023, 8, 201. [CrossRef] [PubMed]
12. Said, S.S .; Ibrahim, W.N. Gut Microbiota-Tumor Microenvironment Interactions: Mechanisms and Clinical Implications for Immune Checkpoint Inhibitor Efficacy in Cancer. Cancer Manag. Res. 2025, 17, 171-192. [CrossRef]
13. Fu, Y .; Li, J .; Cai, W .; Huang, Y .; Liu, X .; Ma, Z .; Tang, Z .; Bian, X .; Zheng, J .; Jiang, J .; et al. The emerging tumor microbe microenvironment: From delineation to multidisciplinary approach-based interventions. Acta Pharm. Sin. B 2024, 14, 1560-1591. [CrossRef] [PubMed]
14. Fujisaka, S .; Watanabe, Y .; Tobe, K. The gut microbiome: A core regulator of metabolism. J. Endocrinol. 2023, 256, e220111. [CrossRef]
15. Lambring, C.B .; Siraj, S .; Patel, K .; Sankpal, U.T .; Mathew, S .; Basha, R. Impact of the Microbiome on the Immune System. Crit. Rev. Immunol. 2019, 39, 313-328. [CrossRef]
6. Jiang, C .; Wan, S .; Hu, P .; Li, Y .; Li, S. Editorial: Transcriptional Regulation in Metabolism and Immunology. Front. Genet. 2022, 13, 845697. [CrossRef]
17. Nichols, R.G .; Davenport, E.R. The relationship between the gut microbiome and host gene expression: A review. Hum. Genet. 2021, 140, 747-760. [CrossRef]
18. Camp, J.G .; Frank, C.L .; Lickwar, C.R .; Guturu, H .; Rube, T .; Wenger, A.M .; Chen, J .; Bejerano, G .; Crawford, G.E .; Rawls, J.F. Microbiota modulate transcription in the intestinal epithelium without remodeling the accessible chromatin landscape. Genome Res. 2014, 24, 1504-1516. [CrossRef]
19. Davison, J.M .; Lickwar, C.R .; Song, L .; Breton, G .; Crawford, G.E .; Rawls, J.F. Microbiota regulate intestinal epithelial gene expression by suppressing the transcription factor Hepatocyte nuclear factor 4 alpha. Genome Res. 2017, 27, 1195-1206. [CrossRef] [PubMed]
20. Meisel, J.S .; Sfyroera, G .; Bartow-McKenney, C .; Gimblet, C .; Bugayev, J .; Horwinski, J .; Kim, B .; Brestoff, J.R .; Tyldsley, A.S .; Zheng, Q .; et al. Commensal microbiota modulate gene expression in the skin. Microbiome 2018, 6, 20. [CrossRef]
21. Kazmierczak-Siedlecka, K .; Ruszkowski, J .; Skonieczna-Zydecka, K .; Jedrzejczak, J .; Folwarski, M .; Makarewicz, W. Gastrointesti- nal cancers: The role of microbiota in carcinogenesis and the role of probiotics and microbiota in anti-cancer therapy efficacy. Cent. Eur. J. Immunol. 2020, 45, 476-487. [CrossRef]
22. Wingender, E .; Schoeps, T .; Haubrock, M .; Krull, M .; Donitz, J. TFClass: Expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 2018, 46, D343-D347. [CrossRef]
23. Jin, C .; Luo, Y .; Liang, Z .; Li, X .; Kolat, D .; Zhao, L .; Xiong, W. Crucial role of the transcription factors family activator protein 2 in cancer: Current clue and views. J. Transl. Med. 2023, 21, 371. [CrossRef]
24. Yu, Y.J .; Kolat, D .; Kaluzinska-Kolat, Z .; Liang, Z .; Peng, B.Q .; Zhu, Y.F .; Liu, K .; Mei, J.X .; Yu, G .; Zhang, W.H .; et al. The AP-2 Family of Transcription Factors-Still Undervalued Regulators in Gastroenterological Disorders. Int. J. Mol. Sci. 2024, 25, 9138. [CrossRef]
25. Eckert, D .; Buhl, S .; Weber, S .; Jager, R .; Schorle, H. The AP-2 family of transcription factors. Genome Biol. 2005, 6, 246. [CrossRef] [PubMed]
26. Wu, H.R .; Zhang, J. AP-2x expression in papillary thyroid carcinoma predicts tumor progression and poor prognosis. Cancer Manag. Res. 2018, 10, 2615-2625. [CrossRef]
27. . Kolat, D .; Kaluzinska, Z .; Bednarek, A.K .; Pluciennik, E. The biological characteristics of transcription factors AP-2x and AP-2y and their importance in various types of cancers. Biosci. Rep. 2019, 39, BSR20181928. [CrossRef] [PubMed]
28. Gao, Q.; Shi, Y.; Sun, Y.; Zhou, S.; Liu, Z.; Sun, X.; Di, X. Identification and verification of aging-related lncRNAs for prognosis prediction and immune microenvironment in patients with head and neck squamous carcinoma. Oncol. Res. 2023, 31, 35-61. [CrossRef] [PubMed]
29. Zhong, L .; Wang, Y .; Kannan, P .; Tainsky, M.A. Functional characterization of the interacting domains of the positive coactivator PC4 with the transcription factor AP-2x. Gene 2003, 320, 155-164. [CrossRef]
30. Zhao, F .; Satoda, M .; Licht, J.D .; Hayashizaki, Y .; Gelb, B.D. Cloning and characterization of a novel mouse AP-2 transcription factor, AP-26, with unique DNA binding and transactivation properties. J. Biol. Chem. 2001, 276, 40755-40760. [CrossRef]
31. Kolat, D .; Kaluzinska, Z .; Orzechowska, M .; Bednarek, A.K .; Pluciennik, E. Functional genomics of AP-2x and AP-2y in cancers: In silico study. BMC Med. Genom. 2020, 13, 174. [CrossRef]
32. Niu, C .; Wen, H .; Wang, S .; Shu, G .; Wang, M .; Yi, H .; Guo, K .; Pan, Q .; Yin, G. Potential prognosis and immunotherapy predictor TFAP2A in pan-cancer. Aging 2024, 16, 1021-1048. [CrossRef] [PubMed]
33. Zhang, Z .; Zhang, L .; Jia, L .; Cui, S .; Shi, Y .; Chang, A .; Zeng, X .; Wang, P. AP-2« suppresses invasion in BeWo cells by repression of matrix metalloproteinase-2 and -9 and up-regulation of E-cadherin. Mol. Cell. Biochem. 2013, 381, 31-39. [CrossRef]
34. Li, Z .; Xiong, W .; Liang, Z .; Wang, J .; Zeng, Z .; Kolat, D .; Li, X .; Zhou, D .; Xu, X .; Zhao, L. Critical role of the gut microbiota in immune responses and cancer immunotherapy. J. Hematol. Oncol. 2024, 17, 33. [CrossRef]
35. Kolat, D .; Zhao, L.Y .; Kciuk, M .; Pluciennik, E .; Kaluzinska-Kolat, Z. AP-26 Is the Most Relevant Target of AP-2 Family-Focused Cancer Therapy and Affects Genome Organization. Cells 2022, 11, 4124. [CrossRef]
36. Kolat, D .; Kaluzinska, Z .; Bednarek, A.K .; Pluciennik, E. Prognostic significance of AP-2x/y targets as cancer therapeutics. Sci. Rep. 2022, 12, 5497. [CrossRef]
37. Liu, M.; Jia, W.; Bai, L.; Lin, Q. Dysregulation of lncRNA TFAP2A-AS1 is involved in the pathogenesis of pulpitis by the regulation of microRNA-32-5p. Immun. Inflamm. Dis. 2024, 12, e1312. [CrossRef]
8. Li, M .; Chen, W.D .; Wang, Y.D. The roles of the gut microbiota-miRNA interaction in the host pathophysiology. Mol. Med. 2020, 26, 101. [CrossRef]
39. Prukpitikul, P .; Sirivarasai, J .; Sutjarit, N. The molecular mechanisms underlying gut microbiota-miRNA interaction in metabolic disorders. Benef. Microbes. 2024, 15, 83-96. [CrossRef] [PubMed]
40. Fraune, C .; Harms, L .; Buscheck, F .; Hoflmayer, D .; Tsourlakis, M.C .; Clauditz, T.S .; Simon, R .; Moller, K .; Luebke, A.M .; Moller- Koop, C .; et al. Upregulation of the transcription factor TFAP2D is associated with aggressive tumor phenotype in prostate cancer lacking the TMPRSS2:ERG fusion. Mol. Med. 2020, 26, 24. [CrossRef] [PubMed]
41. Frierson, H.F., Jr .; El-Naggar, A.K .; Welsh, J.B .; Sapinoso, L.M .; Su, A.I .; Cheng, J .; Saku, T .; Moskaluk, C.A .; Hampton, G.M. Large scale molecular analysis identifies genes with altered expression in salivary adenoid cystic carcinoma. Am. J. Pathol. 2002, 161, 1315-1323. [CrossRef]
2. Assie, G .; Letouze, E .; Fassnacht, M .; Jouinot, A .; Luscap, W .; Barreau, O .; Omeiri, H .; Rodriguez, S .; Perlemoine, K .; Rene-Corail, F .; et al. Integrated genomic characterization of adrenocortical carcinoma. Nat. Genet. 2014, 46, 607-612. [CrossRef]
43. Li, Z .; Shi, J .; Wu, X .; Yan, S .; Gao, Z .; Liu, Y. Gut microbiota is associated with the disease characteristics of patients with newly diagnosed diffuse large B-cell lymphoma. Am. J. Cancer Res. 2025, 15, 2285-2300. [CrossRef]
44. Yoon, S.E .; Kang, W .; Choi, S .; Park, Y .; Chalita, M .; Kim, H .; Lee, J.H .; Hyun, D.W .; Ryu, K.J .; Sung, H .; et al. The influence of microbial dysbiosis on immunochemotherapy-related efficacy and safety in diffuse large B-cell lymphoma. Blood 2023, 141, 2224-2238. [CrossRef]
45. Lin, Z .; Mao, D .; Jin, C .; Wang, J .; Lai, Y .; Zhang, Y .; Zhou, M .; Ge, Q .; Zhang, P .; Sun, Y .; et al. The gut microbiota correlate with the disease characteristics and immune status of patients with untreated diffuse large B-cell lymphoma. Front. Immunol. 2023, 14, 1105293. [CrossRef]
46. Xu, Y .; Shi, C .; Qian, J .; Yu, X .; Wang, S .; Shao, L .; Yu, W. The gut microbiota is altered significantly in primary diffuse large b-cell lymphoma patients and relapse refractory diffuse large b-cell lymphoma patients. Clin. Transl. Oncol. 2025, 27, 2347-2353. [CrossRef] [PubMed]
47. Yijia, Z .; Li, X .; Ma, L .; Wang, S .; Du, H .; Wu, Y .; Yu, J .; Xiang, Y .; Xiong, D .; Shan, H .; et al. Identification of intratumoral microbiome-driven immune modulation and therapeutic implications in diffuse large B-cell lymphoma. Cancer Immunol. Immunother. 2025, 74, 131. [CrossRef] [PubMed]
48. Yoon, S.E .; Kang, W .; Chalita, M .; Lim, J .; Kim, W.S .; Kim, S.J. Comprehensive Understanding of Gut Microbiota in Treatment Naïve Diffuse Large B Cell Lymphoma Patients. Blood 2021, 138, 2409. [CrossRef]
49. Zhang, Y .; Han, S .; Xiao, X .; Zheng, L .; Chen, Y .; Zhang, Z .; Gao, X .; Zhou, S .; Yu, K .; Huang, L .; et al. Integration analysis of tumor metagenome and peripheral immunity data of diffuse large-B cell lymphoma. Front. Immunol. 2023, 14, 1146861. [CrossRef]
50. Zhou, D .; Xiong, S .; Xiong, J .; Deng, X .; Long, Q .; Li, Y. Integrated analysis of the microbiome and transcriptome in stomach adenocarcinoma. Open Life Sci. 2023, 18, 20220528. [CrossRef]
51. Rodriguez, R.M .; Menor, M .; Hernandez, B.Y .; Deng, Y .; Khadka, V.S. Bacterial Diversity Correlates with Overall Survival in Cancers of the Head and Neck, Liver, and Stomach. Molecules 2021, 26, 5659. [CrossRef] [PubMed]
52. Wen, Y .; Wang, Y .; Huang, Y .; Liu, Z .; Hui, C. PLVAP protein expression correlated with microbial composition, clinicopathological features, and prognosis of patients with stomach adenocarcinoma. J. Cancer Res. Clin. Oncol. 2023, 149, 7139-7153. [CrossRef]
53. Hu, Y.L .; Pang, W .; Huang, Y .; Zhang, Y .; Zhang, C.J. The Gastric Microbiome Is Perturbed in Advanced Gastric Adenocarcinoma Identified Through Shotgun Metagenomics. Front. Cell. Infect. Microbiol. 2018, 8, 433. [CrossRef]
54. Wang, W .; Lv, L .; Pan, K .; Zhang, Y .; Zhao, J.J .; Chen, J.G .; Chen, Y.B .; Li, Y.Q .; Wang, Q.J .; He, J .; et al. Reduced expression of transcription factor AP-2« is associated with gastric adenocarcinoma prognosis. PLOS ONE 2011, 6, e24897. [CrossRef]
55. Wen, Y .; Feng, L .; Wang, H .; Zhou, H .; Li, Q .; Zhang, W .; Wang, M .; Li, Y .; Luan, X .; Jiang, Z .; et al. Association Between Oral Microbiota and Human Brain Glioma Grade: A Case-Control Study. Front. Microbiol. 2021, 12, 746568. [CrossRef]
56. Elvevi, A .; Laffusa, A .; Gallo, C .; Invernizzi, P .; Massironi, S. Any Role for Microbiota in Cholangiocarcinoma? A Comprehensive Review. Cells 2023, 12, 370. [CrossRef]
57. Yu, H .; Du, Y .; He, Y .; Sun, Y .; Li, J .; Jia, B .; Chen, J .; Peng, X .; An, T .; Li, J .; et al. Lactate production by tumor-resident Staphylococcus promotes metastatic colonization in lung adenocarcinoma. Cell Host Microbe 2025, 33, 1089-1105.e7. [CrossRef]
58. Li, R .; Li, J .; Zhou, X. Lung microbiome: New insights into the pathogenesis of respiratory diseases. Signal Transduct. Target. Ther. 2024, 9, 19. [CrossRef]
59. Wei, Y .; Sandhu, E .; Yang, X .; Yang, J .; Ren, Y .; Gao, X. Bidirectional Functional Effects of Staphylococcus on Carcinogenesis. Microorganisms 2022, 10, 2353. [CrossRef] [PubMed]
60. Szmajda-Krygier, D .; Krygier, A .; Zebrowska-Nawrocka, M .; Pietrzak, J .; Swiechowski, R .; Wosiak, A .; Jelen, A .; Balcerczak, E. Differential Expression of AP-2 Transcription Factors Family in Lung Adenocarcinoma and Lung Squamous Cell Carcinoma-A Bioinformatics Study. Cells 2023, 12, 667. [CrossRef] [PubMed]
61. Liu, C.C .; Grencewicz, D .; Chakravarthy, K .; Li, L .; Liepold, R .; Wolf, M .; Sangwan, N .; Tzeng, A .; Hoyd, R .; Jhawar, S.R .; et al. Breast tumor microbiome regulates anti-tumor immunity and T cell-associated metabolites. bioRxiv 2024. [CrossRef]
62. Wang, C .; Zhao, Z .; Zhao, Y .; Zhao, J .; Xia, L .; Xia, Q. Macroscopic inhibition of DNA damage repair pathways by targeting AP-2x with LEI110 eradicates hepatocellular carcinoma. Commun. Biol. 2024, 7, 342. [CrossRef]
63. Do, H .; Kim, D .; Kang, J .; Son, B .; Seo, D .; Youn, H .; Youn, B .; Kim, W. TFAP2C increases cell proliferation by downregulating GADD45B and PMAIP1 in non-small cell lung cancer cells. Biol. Res. 2019, 52, 35. [CrossRef]
64. Cao, Y .; Xia, H .; Tan, X .; Shi, C .; Ma, Y .; Meng, D .; Zhou, M .; Lv, Z .; Wang, S .; Jin, Y. Intratumoural microbiota: A new frontier in cancer development and therapy. Signal Transduct. Target. Ther. 2024, 9, 15. [CrossRef]
65. Yang, L .; Li, A .; Wang, Y .; Zhang, Y. Intratumoral microbiota: Roles in cancer initiation, development and therapeutic efficacy. Signal Transduct. Target Ther. 2023, 8, 35. [CrossRef]
66. Li, Y .; Zhang, D .; Wang, M .; Jiang, H .; Feng, C .; Li, Y.X. Intratumoral microbiota is associated with prognosis in patients with adrenocortical carcinoma. iMeta 2023, 2, e102. [CrossRef]
67. The Cancer Genome Atlas Research Network; Weinstein, J.N .; Collisson, E.A .; Mills, G.B .; Shaw, K.R .; Ozenberger, B.A .; Ellrott, K .; Shmulevich, I .; Sander, C .; Stuart, J.M. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 2013, 45, 1113-1120. [CrossRef]
68. Chen, K.P .; Hsu, C.L .; Oyang, Y.J .; Huang, H.C .; Juan, H.F. BIC: A database for the transcriptional landscape of bacteria in cancer. Nucleic Acids Res. 2023, 51, D1205-D1211. [CrossRef] [PubMed]
69. Jensen, M.A .; Ferretti, V .; Grossman, R.L .; Staudt, L.M. The NCI Genomic Data Commons as an engine for precision medicine. Blood 2017, 130, 453-459. [CrossRef]
70. Liu, J .; Lichtenberg, T .; Hoadley, K.A .; Poisson, L.M .; Lazar, A.J .; Cherniack, A.D .; Kovatich, A.J .; Benz, C.C .; Levine, D.A .; Lee, A.V .; et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell 2018, 173, 400-416.e411. [CrossRef] [PubMed]
71. Ogluszka, M .; Orzechowska, M .; Jedroszka, D .; Witas, P .; Bednarek, A.K. Evaluate Cutpoints: Adaptable continuous data distribution system for determining survival in Kaplan-Meier estimator. Comput. Methods Programs Biomed. 2019, 177, 133-139. [CrossRef] [PubMed]
72. Ritchie, M.E .; Phipson, B .; Wu, D .; Hu, Y .; Law, C.W .; Shi, W .; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [CrossRef] [PubMed]
73. Law, C.W .; Chen, Y .; Shi, W .; Smyth, G.K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014, 15, R29. [CrossRef] [PubMed]
74. Al Seesi, S .; Tiagueu, Y.T .; Zelikovsky, A .; Mandoiu, I.I. Bootstrap-based differential gene expression analysis for RNA-Seq data with and without replicates. BMC Genom. 2014, 15 (Suppl. 8), S2. [CrossRef]
75. Kulesa, A .; Krzywinski, M .; Blainey, P .; Altman, N. Sampling distributions and the bootstrap. Nat. Methods 2015, 12, 477-478. [CrossRef]
76. Qu, X .; Wu, S .; Gao, J .; Qin, Z .; Zhou, Z .; Liu, J. Weighted gene co expression network analysis (WGCNA) with key pathways and hub-genes related to micro RNAs in ischemic stroke. IET Syst. Biol. 2021, 15, 93-100. [CrossRef]
77. Almeida-Silva, F .; Venancio, T.M. BioNERO: An all-in-one R/Bioconductor package for comprehensive and easy biological network reconstruction. Funct. Integr. Genom. 2022, 22, 131-136. [CrossRef]
78. Love, M.I .; Huber, W .; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550. [CrossRef]
79. Corces, M.R .; Granja, J.M .; Shams, S .; Louie, B.H .; Seoane, J.A .; Zhou, W .; Silva, T.C .; Groeneveld, C .; Wong, C.K .; Cho, S.W .; et al. The chromatin accessibility landscape of primary human cancers. Science 2018, 362, eaav1898. [CrossRef]
80. Consortium, E.P. An integrated encyclopedia of DNA elements in the human genome. Nature 2012, 489, 57-74. [CrossRef]
81. Perez, G .; Barber, G.P .; Benet-Pages, A .; Casper, J .; Clawson, H .; Diekhans, M .; Fischer, C .; Gonzalez, J.N .; Hinrichs, A.S .; Lee, C.M .; et al. The UCSC Genome Browser database: 2025 update. Nucleic Acids Res. 2025, 53, D1243-D1249. [CrossRef] [PubMed]
82. Bai, Y .; Kinne, J .; Ding, L .; Rath, E.C .; Cox, A .; Naidu, S.D. Identification of genome-wide non-canonical spliced regions and analysis of biological functions for spliced sequences using Read-Split-Fly. BMC Bioinform. 2017, 18, 382. [CrossRef]
83. Quinlan, A.R. BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Curr. Protoc. Bioinform. 2014, 47, 11.12.11-11.12.34. [CrossRef]
84. Heilbrun, E.E .; Tseitline, D .; Wasserman, H .; Kirshenbaum, A .; Cohen, Y .; Gordan, R .; Adar, S. The epigenetic landscape shapes smoking-induced mutagenesis by modulating DNA damage susceptibility and repair efficiency. Nucleic Acids Res. 2025, 53, gkaf048. [CrossRef]
85. Capriotti, E .; Fariselli, P. PhD-SNPg: Updating a webserver and lightweight tool for scoring nucleotide variants. Nucleic Acids Res. 2023, 51, W451-W458. [CrossRef]
86. Vorontsov, I.E .; Eliseeva, I.A .; Zinkevich, A .; Nikonov, M .; Abramov, S .; Boytsov, A .; Kamenets, V .; Kasianova, A .; Kolmykov, S .; Yevshin, I.S .; et al. HOCOMOCO in 2024: A rebuild of the curated collection of binding models for human and mouse transcription factors. Nucleic Acids Res. 2024, 52, D154-D163. [CrossRef] [PubMed]
87. Grant, C.E .; Bailey, T.L .; Noble, W.S. FIMO: Scanning for occurrences of a given motif. Bioinformatics 2011, 27, 1017-1018. [CrossRef]
88. Taing, L .; Dandawate, A .; L’Yi, S .; Gehlenborg, N .; Brown, M .; Meyer, C.A. Cistrome Data Browser: Integrated search, analysis and visualization of chromatin data. Nucleic Acids Res. 2024, 52, D61-D66. [CrossRef] [PubMed]
89. Park, K.J .; Yoon, Y.A .; Park, J.H. Evaluation of Liftover Tools for the Conversion of Genome Reference Consortium Human Build 37 to Build 38 Using ClinVar Variants. Genes 2023, 14, 1875. [CrossRef]
90. Layer, R.M .; Skadron, K .; Robins, G .; Hall, I.M .; Quinlan, A.R. Binary Interval Search: A scalable algorithm for counting interval intersections. Bioinformatics 2013, 29, 1-7. [CrossRef]
91. Schmid-Burgk, J.L .; Gao, L .; Li, D .; Gardner, Z .; Strecker, J .; Lash, B .; Zhang, F. Highly Parallel Profiling of Cas9 Variant Specificity. Mol. Cell 2020, 78, 794-800.e8. [CrossRef] [PubMed]
92. Hsu, F.M .; Wang, C.R .; Chen, P.Y. Reduced Representation Bisulfite Sequencing in Maize. Bio-Protocol 2018, 8, e2778. [CrossRef]
93. Chai, X .; Wang, J .; Li, H .; Gao, C .; Li, S .; Wei, C .; Huang, J .; Tian, Y .; Yuan, J .; Lu, J .; et al. Intratumor microbiome features reveal antitumor potentials of intrahepatic cholangiocarcinoma. Gut Microbes 2023, 15, 2156255. [CrossRef]
94. Li, Q .; Wu, W .; Gong, D .; Shang, R .; Wang, J .; Yu, H. Propionibacterium acnes overabundance in gastric cancer promote M2 polarization of macrophages via a TLR4/PI3K/ Akt signaling. Gastric Cancer 2021, 24, 1242-1253. [CrossRef]
95. Park, C.H .; Hong, C .; Lee, A.R .; Sung, J .; Hwang, T.H. Multi-omics reveals microbiome, host gene expression, and immune landscape in gastric carcinogenesis. iScience 2022, 25, 103956. [CrossRef]
96. de Groen, R.A.L .; Schrader, A.M.R .; Kersten, M.J .; Pals, S.T .; Vermaat, J.S.P. MYD88 in the driver’s seat of B-cell lymphomagenesis: From molecular mechanisms to clinical implications. Haematologica 2019, 104, 2337-2348. [CrossRef]
97. Isaza-Correa, J.M .; Liang, Z .; van den Berg, A .; Diepstra, A .; Visser, L. Toll-like receptors in the pathogenesis of human B cell malignancies. J. Hematol. Oncol. 2014, 7, 57. [CrossRef]
98. Xie, Z .; Bailey, A .; Kuleshov, M.V .; Clarke, D.J.B .; Evangelista, J.E .; Jenkins, S.L .; Lachmann, A .; Wojciechowicz, M.L .; Kropiwnicki, E .; Jagodnik, K.M .; et al. Gene Set Knowledge Discovery with Enrichr. Curr. Protoc. 2021, 1, e90. [CrossRef]
99. Gao, J .; Aksoy, B.A .; Dogrusoz, U .; Dresdner, G .; Gross, B .; Sumer, S.O .; Sun, Y .; Jacobsen, A .; Sinha, R .; Larsson, E .; et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal 2013, 6, pl1. [CrossRef] [PubMed]
100. Miller, H.E .; Bishop, A.J.R. Correlation AnalyzeR: Functional predictions from gene co-expression correlations. BMC Bioinform. 2021, 22, 206. [CrossRef] [PubMed]
101. Chandrashekar, D.S .; Karthikeyan, S.K .; Korla, P.K .; Patel, H .; Shovon, A.R .; Athar, M .; Netto, G.J .; Qin, Z.S .; Kumar, S .; Manne, U .; et al. UALCAN: An update to the integrated cancer data analysis platform. Neoplasia 2022, 25, 18-27. [CrossRef]
102. Liska, O .; Bohar, B .; Hidas, A .; Korcsmaros, T .; Papp, B .; Fazekas, D .; Ari, E. TFLink: An integrated gateway to access transcription factor-target gene interactions for multiple species. Database 2022, 2022, baac083. [CrossRef] [PubMed]
103. Stelzer, G .; Rosen, N .; Plaschkes, I .; Zimmerman, S .; Twik, M .; Fishilevich, S .; Stein, T.I .; Nudel, R .; Lieder, I .; Mazor, Y .; et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr. Protoc. Bioinform. 2016, 54, 1.30.1-1.30.33. [CrossRef] [PubMed]
104. Diamant, I .; Clarke, D.J.B .; Evangelista, J.E .; Lingam, N .; Ma’ayan, A. Harmonizome 3.0: Integrated knowledge about genes and proteins from diverse multi-omics resources. Nucleic Acids Res. 2025, 53, D1016-D1028. [CrossRef] [PubMed]
105. Plaisier, C.L .; O’Brien, S .; Bernard, B .; Reynolds, S .; Simon, Z .; Toledo, C.M .; Ding, Y .; Reiss, D.J .; Paddison, P.J .; Baliga, N.S. Causal Mechanistic Regulatory Network for Glioblastoma Deciphered Using Systems Genetics Network Analysis. Cell Syst. 2016, 3, 172-186. [CrossRef]
106. Zhou, X .; Zhou, L .; Qian, F .; Chen, J .; Zhang, Y .; Yu, Z .; Zhang, J .; Yang, Y .; Li, Y .; Song, C .; et al. TFTG: A comprehensive database for human transcription factors and their targets. Comput. Struct. Biotechnol. J. 2024, 23, 1877-1885. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.