
Pan-cancer analysis identifies RNA helicase DDX1 as a prognostic marker

Baocai Gao1 · Xiangnan Li1 · Shujie Li2 · Sen Wang1 · Jiaxue Wu1 · Jixi Li1

Received: 12 August 2021 / Revised: 29 October 2021 / Accepted: 1 November 2021 / Published online: 19 January 2022 © International Human Phenome Institutes (Shanghai) 2021

Abstract

The DEAD-box RNA helicase (DDX) family plays a critical role in the growth and development of multiple organisms. DDX1 is involved in mRNA/rRNA processing and maturation, virus replication and transcription, hormone metabolism, tumorigenesis, and tumor development. However, how DDX1 functions in various cancers remains unclear. Here, we explored the potential oncogenic roles of DDX1 across 33 tumors with The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) databases. DDX1 is highly expressed in breast cancer (BRCA), cholangiocarcinoma (CHOL), and colon adenocarcinoma (COAD), but it is lowly expressed in renal cancers, including kidney renal clear cell carcinoma (KIRC), kidney chromophobe (KICH), and kidney renal papillary cell carcinoma (KIRP). Low expression of DDX1 in KIRC is correlated with a good prognosis of overall survival (OS) and disease-free survival (DFS). Highly expressed DDX1 is linked to a poor prognosis of OS for adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), KICH, and liver hepatocellular carcinoma (LIHC). Also, the phosphorylation level of residue Ser481 of DDX1 was enhanced in BRCA and ovarian cancer (OV) but decreased in KIRC. Immune infiltration analysis showed that DDX1 expression affected CD8+ T cells, and it was significantly associated with MSI (microsatellite instability), TMB (tumor mutational burden), and ICT (immune checkpoint blockade therapy) in tumors. In addition, the depletion of DDX1 dramatically affected the cell viability of human tumor-derived cell lines. Analysis of the CancerSEA database suggested that DDX1 could affect the DNA repair pathway and the RNA transport/DNA replication processes during tumorigenesis. Thus, our pan-cancer analysis revealed that DDX1 had complicated impacts on different cancers and might act as a prognostic marker for cancers such as renal cancer.

Keywords RNA helicase · DDX1 · Pan-cancer analysis · Survival analysis · Prognostic marker

Introduction

Members of the DEAD-box (DDX) RNA helicase superfamily are involved in various cellular processes, including RNA splicing, ribosome biogenesis, RNA transport, degradation, and protein translation (Gibbons et al. 1997; Jarmoskaite and Russell 2014; Godbout et al. 1998; Hondele et al. 2019; Heerma et al. 2017). DDX family proteins contain nine conserved motifs, notably the Walker B motif,

Highlights

A first pan-cancer analysis of DDX1 in normal and cancer samples

Highly expressed DDX1 is linked to a poor prognosis of overall survival in cancers including ACC, BLCA, KICH, and LIHC

The phosphorylation level of S481 is enhanced in several tumors, including BRCA and OV, but decreases in KIRC

In tumors, the expression level of DDX1 is associated with CD8+ T-cell infiltration, MSI, TMB, and ICT

RNA transport- and DNA replication-associated processes are involved in the oncogenic role of DDX1

✉ Jixi Li

lijixi@fudan.edu.cn

1 State Key Laboratory of Genetic Engineering, School of Life Sciences, MOE Engineering Research Center of Gene Technology, Shanghai Engineering Research Center of Industrial Microorganisms, Fudan University, Shanghai 200438, China

2 Kunming Institute of Physics, Kunming 650223, China

characterized by the DEAD (Asp-Glu-Ala-Asp) sequence (Hall and Matson 1999). As many RNA helicases can regulate mRNA translation in cancer cells and are involved in tumorigenesis, inhibition of DDXs can be exploited for anticancer therapeutics (Heerma et al. 2017; Bol et al. 2015).

Human DDX1 plays a vital role in the metabolism of RNAs located in the cell nucleus. It participates in mRNA/miRNA processing (Mitkova et al. 2003; Zhong et al. 2018) and in NF-κB-mediated transcription (Khemici and Linder 2018; Tang et al. 2018). In addition, DDX1 plays key roles in the replication of HIV-Rev (Edgcomb et al. 2012). Moreover, DDX1 forms a complex with DDX21, DHX36, and TIR domain-containing Adapter Molecule 1 (TRIF), which can sense and recognize dsRNA and activate the expression of NF-κB and type I interferon. DDX1 was initially found in neuroblastoma (NB) and retinoblastoma (RB) cell lines and tumors (Tanner and Linder 2001). Also, DDX1-deficient cells have defects in colony- and sphere-forming capacity in vitro and in tumorigenesis in nude mice (Tanaka et al. 2018). The DDX1 expression level is elevated in multiple cancers, including germ cell tumors, NB, RB, glioblastoma, BRCA, and colorectal cancer. However, how DDX1 functions in various cancers remains unclear.

Here, we performed a systematic pan-cancer analysis of the RNA helicase DDX1 using the TCGA and GTEx databases. Through gene and protein differential expression analysis, patient survival analysis, genetic alteration analysis, DNA methylation analysis, protein phosphorylation analysis, immune infiltration analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, and gene ontology (GO) analysis, we found that DDX1 has oncogenic roles related to RNA transport/DNA replication processes and is significantly associated with the prognosis of renal cancers, including KIRC, KICH, and KIRP. These findings might help in treating RNA helicase-related cancers.

Methods

Gene expression analysis

We entered DDX1 in the ‘Gene_DE’ module of the TIMER2 (Tumor immune estimation resource, version 2) website (http://timer.cistrome.org/) and examined the expression difference of DDX1 between tumor and adjacent normal tissues for the different tumors or specific tumor subtypes of the TCGA project (Li et al. 2020).

For certain tumors without normal tissues or with highly limited normal tissues, we used the ‘Expression Analysis-Box Plots’ module of the GEPIA2 webserver (http://gepia2.cancer-pku.cn/#analysis) to obtain the box plots of the expression difference between the tumor tissues and the corresponding normal tissues of the GTEx database (Tang et al. 2019). Additionally, the violin plots of the DDX1 expression in different pathological stages (stage I, stage II, stage III, and stage IV) of all TCGA tumors were obtained via the ‘Pathological Stage Plot’ module of GEPIA2. The log2 [TPM (Transcripts per million) + 1] transformed expression data were applied for the violin plot analysis.
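The log2(TPM + 1) transformation mentioned above can be illustrated with a minimal Python sketch. The function name and the TPM values are hypothetical and are not taken from GEPIA2 or from the study data; the sketch only shows the arithmetic of the transform, in which the added 1 keeps zero-expression samples defined.

```python
import math


def log2_tpm_plus_one(tpm_values):
    """Return log2(TPM + 1) for each non-negative TPM value."""
    transformed = []
    for tpm in tpm_values:
        if tpm < 0:
            raise ValueError("TPM values must be non-negative")
        # Adding 1 maps TPM = 0 to 0 instead of minus infinity.
        transformed.append(math.log2(tpm + 1.0))
    return transformed


# Hypothetical TPM values for one gene across four samples.
example_tpm = [0.0, 1.0, 7.0, 255.0]
print(log2_tpm_plus_one(example_tpm))  # [0.0, 1.0, 3.0, 8.0]
```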

The UALCAN portal (http://ualcan.path.uab.edu/analysis-prot.html), an interactive web resource for analyzing cancer omics data, was used to conduct gene expression analysis of TCGA data and protein expression analysis of the CPTAC (Clinical proteomic tumor analysis consortium) dataset (Chandrashekar et al. 2017; Cerami et al. 2012).

Raw counts of RNA-sequencing data (level 3) and corresponding clinical information from ACC, BLCA, KICH, LIHC, KIRC, KIRP, UVM, and ovarian serous cystadenocarcinoma were obtained from the TCGA dataset (https://portal.gdc.cancer.gov/) in January 2020; the acquisition and application of the data complied with the relevant guidelines and policies. The Sankey diagram was built with the R software package alluvial. All the above analyses and R packages were run in R (R Foundation for Statistical Computing) version 4.0.3 (Zeng et al. 2019).

Survival prognosis analysis

We used the ‘Survival Map’ module of GEPIA2 to obtain the OS and DFS significance map data of DDX1 across all TCGA tumors. Cutoff-high (50%) and cutoff-low (50%) values were used as the expression thresholds for splitting the high-expression and low-expression cohorts. The log-rank test was used for hypothesis testing, and the survival plots were obtained through the ‘Survival Analysis’ module of GEPIA2.
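As an illustration of the median split and the log-rank comparison described above, the following Python sketch implements the standard two-group log-rank test using only the standard library. It is not GEPIA2's implementation, which the text does not describe; the function names, the rule that samples exactly at the median go to the low group, and the toy data are all assumptions made for the example.

```python
import math
from statistics import median


def split_by_median(expression):
    """Return True for samples above the median (high group), else False."""
    cutoff = median(expression)
    # Assumption: values equal to the median are placed in the low group.
    return [value > cutoff for value in expression]


def logrank_test(times, events, is_high):
    """Two-group log-rank test.

    times   -- follow-up time for each patient
    events  -- 1 if the event was observed, 0 if the patient was censored
    is_high -- True for the high-expression group, False for the low group
    Returns (chi_square, p_value) with one degree of freedom.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        at_risk_high = sum(1 for x, h in zip(times, is_high) if x >= t and h)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        deaths_high = sum(
            1 for x, e, h in zip(times, events, is_high)
            if x == t and e == 1 and h
        )
        fraction_high = at_risk_high / at_risk
        observed_minus_expected += deaths_high - deaths * fraction_high
        if at_risk > 1:
            # Hypergeometric variance of the number of events in the high group.
            variance += (deaths * fraction_high * (1.0 - fraction_high)
                         * (at_risk - deaths) / (at_risk - 1))
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical cohort: eight patients, all with observed events.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [10, 12, 14, 16, 2, 3, 4, 5]
events = [1, 1, 1, 1, 1, 1, 1, 1]
groups = split_by_median(expression)
print(logrank_test(times, events, groups))
```

In this toy cohort the high-expression patients have the shortest follow-up times, so the test returns a small p-value; with real data the censoring indicators would come from the clinical records.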

Univariate and multivariate Cox regression analyses were performed to identify the proper terms to build the nomogram. A forest plot was used to show the p-value, HR (Hazard ratio), and 95% CI (Confidence interval) of each variable through the ‘forestplot’ R package. A nomogram was developed based on the multivariate Cox proportional hazards analysis results to predict the X-year overall recurrence. The nomogram provided a graphical representation of the factors, which can be used to calculate the risk of recurrence for an individual patient from the points associated with each risk factor through the ‘rms’ R package.
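The relationship between a Cox regression coefficient and the HR with its 95% CI shown in a forest plot can be sketched as follows. This is a generic Wald-interval calculation written in Python for illustration, not the authors' R code; the function name and the input values are hypothetical.

```python
import math


def hazard_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Convert a Cox coefficient (log hazard ratio) into HR and a Wald CI.

    z = 1.96 corresponds to an approximate 95% confidence interval.
    """
    hazard_ratio = math.exp(coefficient)
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return hazard_ratio, lower, upper


# Hypothetical coefficient and standard error for one covariate.
hr, ci_low, ci_high = hazard_ratio_with_ci(coefficient=0.69, standard_error=0.25)
print(f"HR = {hr:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 indicates that the covariate is significantly associated with the hazard at roughly the 0.05 level.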

Genetic alteration analysis

After logging into the cBioPortal website (https://www.cbioportal.org/) (Cerami et al. 2012; Gao et al. 2013), we chose the ‘TCGA Pan-Cancer Atlas Studies’ in the ‘Quick select’ section and entered ‘DDX1’ to query the genetic alteration characteristics of DDX1. The results of the alteration frequency, mutation type, and CNA (Copy number alteration) across all TCGA tumors were observed in the ‘Cancer Types Summary’ module. We also used the ‘Comparison’ module to obtain the overall, disease-free, progression-free, and disease-specific survival differences for the TCGA cancer cases with or without DDX1 genetic alteration. Kaplan-Meier plots with log-rank p-values were generated as well.

DNA methylation analysis

The DNMIVD (DNA Methylation Interactive Visualization Database) (http://119.3.41.228/dnmivd/index/) is a comprehensive annotation and interactive visualization database for DNA methylation profiles of diverse human cancers, constructed with high-throughput microarray data from the TCGA and GTEx databases. It also integrates some data from the Pancan-meQTL and HACER (Human ACtive Enhancer to interpret Regulatory variants) databases (Ding et al. 2020).

MEXPRESS (https://www.mexpress.be/index.html) is a data visualization tool designed for the easy visualization of TCGA expression, DNA methylation, and clinical data, as well as the relationships between them (Koch et al. 2015). We used the DNMIVD database and the MEXPRESS approach to investigate the potential association between DNA methylation of DDX1 and different tumor pathogenesis in the TCGA project.

Immune infiltration analysis

The tumor RNA-seq data from 33 different types of tumors were downloaded from the Genomic Data Commons (GDC) data portal of the TCGA database (https://portal.gdc.cancer.gov/). Each tumor has mRNA expression data from matched normal tissue samples. For reliable immune score evaluation, we used immunedeconv, an R software package that integrates six state-of-the-art algorithms: TIMER, xCell, MCP-counter, CIBERSORT, EPIC, and quanTIseq. The immune checkpoint-related genes analyzed include SIGLEC15 (Sialic acid-binding Ig-like lectin 15), PD-L1 (CD274, Programmed cell death 1 ligand 1), HAVCR2 (Hepatitis A virus cellular receptor 2), PD1 (Programmed death 1), CTLA4 (Cytotoxic T-lymphocyte-associated protein 4), LAG3 (Lymphocyte-activation gene 3), and PDCD1LG2 (Programmed cell death 1 ligand 2).

TMB (Tumor mutational burden) was derived from the Immune Landscape of Cancer (Huang et al. 2019). MSI (Microsatellite instability) was derived from the Landscape of Microsatellite Instability (Bonneville et al. 2017). We used the Spearman rank correlation coefficient to describe the correlation between quantitative variables that were not normally distributed. Unless otherwise stated, the rank-sum test was used to compare two groups of data, and a p-value of <0.05 was considered statistically significant. The data are visualized as a heatmap and a scatter plot.
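The Spearman rank correlation coefficient used here is the Pearson correlation of the rank-transformed values. A minimal Python sketch, assuming average ranks for ties and using hypothetical function names and toy data (not the study's expression, TMB, or MSI values), is:

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements: monotonic but not linear.
gene_expression = [1.0, 2.0, 3.0, 4.0]
mutation_burden = [1.0, 4.0, 9.0, 16.0]
print(spearman(gene_expression, mutation_burden))  # 1.0
```

Because only the ordering of the values matters, the coefficient is unaffected by skewed or otherwise non-normal distributions, which is the reason for choosing it here.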

We first searched for DDX1 on the STRING website (https://string-db.org/) using the query of a single protein name (‘DDX1’) and organism (‘Homo sapiens’) (Szklarczyk et al. 2019). Subsequently, we set the following main parameters: the minimum required interaction score was 0.150, the meaning of network edges was evidence, and the active interaction source was experiments. Finally, the available experimentally determined DDX1-binding proteins were obtained.

We used the ‘Similar Gene Detection’ module of GEPIA2 to obtain the top 100 DDX1-correlated targeting genes based on the datasets of all TCGA tumors and normal tissues (Tang et al. 2019). We also applied the ‘correlation analysis’ module of GEPIA2 to perform a pairwise gene Pearson correlation analysis of DDX1 and selected genes. The log2 TPM was applied for the dot plot. The p-value and the correlation coefficient (r) were indicated. Moreover, we used the ‘Gene_Cor’ module of TIMER2 to supply the heatmap data of the selected genes, which contains the partial correlation (Cor) and p-value in the purity-adjusted Spearman’s rank correlation test. Jvenn (http://jvenn.toulouse.inra.fr/app/index.html) was used to conduct an intersection analysis to compare the DDX1-binding and DDX1-correlated genes (Bardou et al. 2014). WebGestalt (WEB-based Gene Set Analysis Toolkit) is a functional enrichment analysis web tool (http://www.webgestalt.org/) (Liao et al. 2019). We therefore combined the datasets from Jvenn and WebGestalt to perform KEGG pathway analysis.
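The intersection step performed with Jvenn amounts to a set intersection of two gene lists. The Python sketch below illustrates the idea; the function name and the gene symbols are placeholders rather than the actual DDX1-binding or DDX1-correlated genes, and the normalization of case and whitespace is an assumption added for robustness.

```python
def intersect_gene_sets(binding_proteins, correlated_genes):
    """Return the sorted gene symbols present in both lists."""
    # Normalize symbols so that case or stray spaces do not hide a match.
    normalized_binding = {gene.strip().upper() for gene in binding_proteins}
    normalized_correlated = {gene.strip().upper() for gene in correlated_genes}
    return sorted(normalized_binding & normalized_correlated)


# Placeholder symbols standing in for the two gene lists.
binding = ["GENE_A", "gene_b ", "GENE_C"]
correlated = ["GENE_B", "GENE_C", "GENE_D"]
print(intersect_gene_sets(binding, correlated))  # ['GENE_B', 'GENE_C']
```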

Knockout of DDX1 in human tumor-derived cell lines

The CRISPR-Cas9 system has enabled precise genome-scale identification of genes essential for the proliferation and survival of cancer cells. However, Cas9-mediated DNA cleavage produces a gene-independent antiproliferative effect that confounds such measurement of genetic dependency. CERES is a tool to estimate gene-dependency levels from CRISPR-Cas9 essentiality screens while accounting for this effect (Meyers et al. 2017). To define a cancer dependency map, we used genome-scale CRISPR-Cas9 essentiality screens across 342 cancer cell lines and applied CERES to these data. The process is as follows: log in to the DepMap (Cancer Dependency Map) database (Tsherniak et al. 2017), search for the DDX1 gene, and select CRISPR (DepMap 21Q2 Public+Score) as the data set in the Perturbation Effects option.

CancerSEA analysis

CancerSEA (Cancer single-cell state atlas) is the first dedicated database to comprehensively explore distinct functional states of cancer cells at the single-cell level (http://biocc.hrbmu.edu.cn/CancerSEA/) (Yuan et al. 2019). CancerSEA portrays a cancer single-cell functional state atlas, involving 14 functional states (stemness, invasion, metastasis, proliferation, EMT (Epithelial-Mesenchymal Transition), angiogenesis, apoptosis, cell cycle, differentiation, DNA damage, DNA repair, hypoxia, inflammation, and quiescence) of 41,900 single cancer cells from 25 cancers. In addition, it allows querying which functional states are associated with the gene (or gene list) of interest in different cancers. The process is as follows: log in to CancerSEA, select ‘search’, enter ‘DDX1’ in ‘input a gene’, and then run the search.

Results

Gene expression analysis of human DDX1

The Human Protein Atlas (HPA, https://www.proteinatlas.org/) was used to explore human DDX1 (NP_004930.1) expression based on mass-spectrometry-based proteomics, transcriptomics, and systems biology (Uhlén et al. 2015; Thul et al. 2017; Uhlen et al. 2017). First, we reviewed the basic information on human DDX1 (NM_004939.3, NP_004930.1). Through the HPA, we found that DDX1 is ubiquitously expressed in cells, tissues, and organs. The transcription level of DDX1 is highest in skeletal muscle, but the overall difference among tissues is negligible. Therefore, DDX1 expression shows low tissue specificity (data not shown).

Next, TIMER2 was used to perform statistical analysis on clinical tumor samples in the TCGA and GTEx databases. In CHOL (p<0.001), COAD (p<0.001), ESCA (esophageal carcinoma) (p<0.001), GBM (glioblastoma multiforme) (p<0.05), HNSC (head and neck squamous cell carcinoma) (p<0.001), LIHC (p<0.001), LUAD (lung adenocarcinoma) (p<0.001), LUSC (lung squamous cell carcinoma) (p<0.001), STAD (stomach adenocarcinoma) (p<0.001), DLBC (lymphoid neoplasm diffuse large B-cell lymphoma) (p<0.05), and THYM (thymoma) (p<0.05), DDX1 had a higher expression level than in the corresponding normal tissues (Fig. 1a-b). The expression levels of DDX1 were down-regulated in SKCM (skin cutaneous melanoma) (p<0.001), THCA (thyroid carcinoma) (p<0.001), KIRC (p<0.001), KICH (p<0.001), and KIRP (p<0.001) (Fig. 1a-b). To further verify the results, we compared clinical tumor samples and found a high degree of similarity between the TCGA and the GTEx databases (Fig. S1).

To further identify the correlation between DDX1 expression and the pathological stages of cancers, GEPIA2 and UALCAN were used to explore different cancers, including KIRC (p<0.01), LIHC (p<0.001), and UCS (p<0.05) (Fig. 1c). The results showed that the expression level of DDX1 is down-regulated in THCA and the renal cancers KICH, KIRP, and KIRC, but it is up-regulated in COAD, HNSC, LIHC, LUAD, CHOL, ESCA, LUSC, and STAD (Figs. 1d, S2). Next, a Sankey diagram was used to examine the differences in DDX1 expression among different tumors and tumor stages; the results showed that the expression level of DDX1 was correlated with the age, gender, pTNM stage, and prognosis of tumor patients.

As DDX1 function depends mainly on its helicase activity (Ribeiro et al. 2018), we evaluated the protein expression level of DDX1 in different tumors through the CPTAC database. DDX1 protein is highly expressed in BRCA, OV, and LUAD, but it has lower expression in KIRC and uterine corpus endometrial carcinoma (UCEC) than in normal tissues (Fig. 2a). Also, in KIRC and BRCA, the expression level of DDX1 protein in tumor patients was significantly higher than that in normal tissues, as evidenced by using different DDX1 antibodies (Fig. 2b). To further understand whether DDX1 protein expression differs among tumors, the individual cancer stages were compared. The results showed that DDX1 protein was significantly higher than in normal samples in stages 2 and 3 of BRCA (p<0.001) and OV (p<0.001) (Figs. S3a-S3b), while no difference existed in colon cancer (Fig. S3c). However, the expression of DDX1 was significantly lower than that in the normal samples in KIRC (p<0.001) from stage 1 to stage 4 (Fig. S3d). In stage 1/2/3 of LUAD, the expression level of DDX1 was significantly higher than in the normal samples (p<0.001) (Fig. S3e). The mass-spectrometry-based proteomic data from the CPTAC confirmatory/discovery cohorts were used to categorize 532 cases into ten different pan-cancer subtypes (K1-K10); DDX1 expression differed significantly between normal samples and different subtypes of BRCA, KIRC, and UCEC (p<0.001) (Figs. S3f-S3h). Therefore, the expression levels of the DDX1 protein differ among tumors and patient stages. DDX1 might be used as a clinical diagnostic marker in BRCA, OV, LUAD, and KIRC.

Survival prognosis analysis of DDX1

As the expression level of DDX1 might be related to the prognosis of tumor patients, the cancer cases from the TCGA and GTEx datasets were divided into high-expression and low-expression groups. Highly expressed DDX1 is linked to a poor prognosis of OS in cancers including ACC

Fig. 1 The expression level of the DDX1 gene in different tumors. a The expression status of the DDX1 gene in different cancers or specific cancer subtypes was analyzed through TIMER2. * p<0.05; ** p<0.01; *** p<0.001. b The tumors DLBC, GBM, LGG, SKCM, TGCT, and THYM in the TCGA and the corresponding normal tissues in the GTEx database were compared, respectively. The box plot data were supplied. * p<0.05. c The expression levels of the DDX1 gene in KIRC, LIHC, and UCS were analyzed by the main pathological stages (stages I, II, III, and IV). Log2 (TPM+1) was applied for log-scale. d The expression levels of the DDX1 gene in KICH, KIRP, KIRC, HNSC, LIHC, and LUAD on different cancer stages were analyzed through UALCAN. KICH: Normal-vs-Stage I/II/III, p<0.001; KIRP: Normal-vs-Stage I/II/III/IV, p<0.001; KIRC: Normal-vs-Stage I/II/III/IV, p<0.001; HNSC: Normal-vs-Stage I/II/III/IV, p=0.002, Normal-vs-Stage II/III/IV, p<0.001; LIHC: Normal-vs-Stage I/II/III, p<0.001; LUAD: Normal-vs-Stage I/II/III/IV, p<0.001

[Figure 1 panels: a TCGA dataset, DDX1 expression level (log2 TPM) in tumor and normal samples across TCGA cancer types; b TCGA+GTEx dataset, expression log2 (TPM+1) for DLBC, GBM, LGG, SKCM, TGCT, and THYM; c TCGA dataset, DDX1 expression log2 (TPM+1) by stage I-IV for KIRC (F value = 5.02, Pr(>F) = 0.00193), LIHC (F value = 6.56, Pr(>F) = 0.000252), and UCS (F value = 3.79, Pr(>F) = 0.0155); d TCGA dataset, transcript per million in normal tissue and by stage for KICH, KIRP, KIRC, HNSC, LIHC, and LUAD]
(p=0.0078), BLCA (p=0.0089), KICH (p=0.0085), and LIHC (p=0.00087) (Fig. 3a). However, high expression of DDX1 is correlated with a good prognosis of KIRC in OS (p<0.001) and in disease-free survival (DFS) (p=0.0043) (Fig. 3a-b). Also, highly expressed DDX1 is linked to a poor prognosis of DFS in ACC (p=0.0046), KICH (p=0.028), KIRP (p=0.031), LIHC (p=0.0018), and UVM (p=0.033) (Fig. 3b). Further analysis showed that DDX1 is a protective factor for KIRC patients but a risk factor for KICH, ACC, and LIHC patients (Figs. 3, S4). The above data indicate that DDX1 expression is differentially associated with the prognosis of patients with different cancers.

Fig. 2 The expression levels of DDX1 protein in different tumors. a The expression levels of the DDX1 protein were analyzed by the CPTAC dataset between normal tissue and primary tumor tissue of BRCA, OV, LUAD, KIRC, and UCEC. b Antibodies (HPA034502, HPA034503) were labeled with DAB (3,3’-diaminobenzidine). The section was furthermore counterstained with hematoxylin to enable visualization of microscopical features for renal cancer and breast cancer

[Figure 2 panels: a CPTAC dataset, DDX1 protein Z-value in normal vs. primary tumor samples for BRCA, OV, LUAD, KIRC, and UCEC (each P<0.001); b immunohistochemistry images of normal and cancer tissue for renal cancer and breast cancer, stained with antibodies HPA034502 and HPA034503]

Genetic alteration analysis of DDX1

To identify whether the genetic changes of DDX1 can affect the occurrence of tumors and the prognosis of patients, we used the cBioPortal website (https://www.cbioportal.org/) to analyze the data from the TCGA. Based on the cancer study (Fig. 4a), point mutations of DDX1 occur mainly in uterine cancer, bladder cancer, lung squamous, stomach cancer, lung adenocarcinoma, melanoma, mesothelioma, and colorectal cancer. Also, amplification occurs primarily in uterine carcinosarcoma, uterine cancer, bladder cancer, lung squamous, liver cancer, ovarian cancer, GBM, and ACC (Fig. 4a). On the other hand, fusion mutation appears less frequently, mainly in bladder cancer and head and neck cancer (Fig. 4a). Based on the cancer type (Fig. S5a), of all the tumors with genetic alteration, endometrial carcinoma has the highest percentage (5.63%), followed by cervical adenocarcinoma (4.35%) and bladder urothelial carcinoma (4.14%). The main types of genetic alteration in endometrial carcinoma are point mutation and amplification. Deep deletion is the major type of genetic alteration in mature B-cell neoplasms. Thus, the primary genetic change types of DDX1 are point mutation and amplification in most tumors.

Moreover, missense mutation of DDX1 is the primary type of genetic alteration (Fig. 4b). There is a total of 93 cases of the missense type, accounting for 80.17% (93/116) (Fig. 4b). Further analysis showed that the genetic changes of DDX1 have an adverse effect on DFS (p=0.0329) and DSS (disease-specific survival) (p=0.0385), but no significant effects on OS (p=0.228) and PFS (progression-free survival, p=0.135) (Fig. 4c). Next, the results were analyzed in LUSC, which showed that the genetic changes of DDX1 have an adverse effect on DFS (p<0.001) and DSS (p=0.0254) (Fig. S5b). Taken together, we explored the potential association between genetic alteration of DDX1 and the clinical survival prognosis of different cancers.

DNA methylation analysis of DDX1

DNA methylation regulates gene expression in different cancers. In 22 tumors from the DNMIVD database, DNA methylation had a negative correlation with DDX1 expression level. The strongest correlation was in LUAD (Pearson R = -0.22166, Pearson p = 1.0384e-06; Spearman R = -0.289865, Spearman p = 1e-10) (Fig. 5a-b). The DNA methylation levels were correlated with a good prognosis in BLCA, CESC, and SKCM when grouped by the median survival (Fig. 5c), whereas they were correlated with a poor prognosis in LUSC. Next, analysis with the CNCB-NGDC and MEXPRESS showed that the DNA methylation of DDX1 was negatively correlated with the expression level in GBM and LGG (CNCB-NGDC Members and Partners 2021).

Protein phosphorylation analysis of DDX1

Dysfunction of protein phosphorylation might result in severe outcomes in different cancers. Thus, the phosphorylation analysis of DDX1 was performed with BRCA, OV, KIRC, and UCEC based on the CPTAC database (Fig. 6). According to the CPTAC database information, DDX1 has three phosphorylation sites in BRCA (S481, S486, S632). By comparing normal tissues and cancer tissues, we found that only S481 has a significant difference (p<0.001), while S486 (p=0.085) and S632 (p=0.786) are not statistically different. The phosphorylation level of S481 was higher than in normal tissues in BRCA. This indicates that the phosphorylation of S481 is most clinically significant in BRCA. Also, the phosphorylation level of S481 increased in OV tumor tissues (p<0.001), while it decreased in KIRC (p<0.001). In UCEC, there was no significant difference in the phosphorylation level of S481 between normal tissues and tumor tissues (p=0.085). In summary, the phosphorylation level of S481 was higher than in normal tissues in BRCA and OV, but lower in KIRC (Fig. 6). Thus, the occurrence of tumors might be accompanied by the enhancement of S481 phosphorylation of DDX1. Furthermore, the phosphorylation level of DDX1 varies to different degrees among different tumor grades, individual cancer stages, and pan-cancer subtypes (Fig. 6). The phosphorylation data for DDX1 from 4440 TCGA tumor samples of 15 cancer types on the PhosphoSitePlus website showed that many phosphorylation sites were mainly located in the low-complexity region (LCR) of the DDX1 protein (Fig. 4b). As the LCRs of DDX family proteins can affect the protein conformation and function (Chen et al. 2020), the phosphorylation level in this region might play an essential role in tumorigenesis.

Immune infiltration analysis of DDX1

The tumor microenvironment (TME) contributes to the modulation of tumor progression (Litchfield et al. 2021). Thus, the TIMER, CIBERSORT, CIBERSORT-ABS, QUANTISEQ, XCELL, MCPCOUNTER, and EPIC algorithms were used to investigate the potential relationship between the infiltration level of different immune cells and DDX1 gene expression in different cancers from TCGA. Overall, the expression level of DDX1 is correlated with the infiltration of different immune cells in many cancers. DDX1 expression is negatively correlated with the immune infiltration of CD8+ T-cells in ESCA, LUAD, LUSC, sarcoma (SARC), SKCM, testicular germ cell tumors (TGCT), and UCEC, but it is positively correlated with CD8+ T-cells in KICH, PAAD, pheochromocytoma and paraganglioma (PCPG), and UVM (Fig. S6a).

The inhibitory immunoreceptors PD-1, CTLA4, LAG3, TIM3, TIGIT, and BTLA are named immune checkpoints, referring to molecules that act as gatekeepers of immune responses (He and Xu 2020). ICT has become a powerful weapon in fighting cancer (Brahmer et al. 2012; Zhang and Zhang 2020). MSI and TMB represent a valuable estimation of tumor neoantigen load (Rizvi et al. 2015; Hugo et al. 2016). DDX1 expression positively correlated with MSI in READ, TGCT, and STAD (Fig. S6b), but it negatively correlated with MSI in DLBC (Fig. 7a, S6b). Also, DDX1 expression positively correlated with TMB in STAD, LUAD, and SKCM, but it negatively correlated with TMB in THCA (Fig. 7b, S7). As the expression level of DDX1 is positively correlated with MSI and TMB in STAD, further analysis with STAD patients showed a significant association between DDX1 and MLH1 (MutL Homolog 1,

Fig. 3 Correlation between DDX1 gene expression and survival prognosis of different cancers. The GEPIA2 tool was used to perform OS a and DFS b analyses of different tumors in TCGA. The survival map and the Kaplan-Meier curves were shown, respectively

[Figure 3 panels: a OS (overall survival) survival map of DDX1, log10(HR), across TCGA cancer types, with Kaplan-Meier curves (percent survival vs. months, low vs. high DDX1 groups) for ACC, BLCA, LIHC, KICH, and KIRC; b DFS (disease-free survival) survival map of DDX1, log10(HR), across TCGA cancer types, with Kaplan-Meier curves for ACC, KICH, KIRP, LIHC, UVM, KIRC, and OV]

Fig. 4 Mutation feature of DDX1 in different tumors in TCGA. The alteration frequency with mutation type a and mutation site b were displayed, respectively. Each color denoted a mutation type. Mutation (green): point mutation. c The potential correlation between the mutation status of DDX1 and overall survival, disease- free survival, progression-free survival, and disease-specific survival of different cancers were analyzed using the cBio- Portal tool

a

Mutation

Fusion

Amplification

Deep Deletion

Alteration Frequency

6%

4%

2%

Mutation data

CNA data

Uterine CS (TCGA PanCan 2018)

Uterine (TCGA PanCan 2018)

Bladder (TCGA PanCan 2018)

Lung squ (TCGA PanCan)

Stomach (TCGA PanCan 2018)

Lung adeno (TCGA PanCan)

Melanoma (TCGA PanCan 2018)

Liver (TCGA PanCan)

Cervical (TCGA PanCan)

Mesothelioma (TCGA PanCan 2018)

DLBC (TCGA PanCan)

Ovarian (TCGA PanCan 2018)

GBM (TCGA PanCan)

Colorectal (TCGA PanCan)

Head & neck (TCGA PanCan)

ACC (TCGA PanCan 2018) Breast Invasive Carcinoma Breast

Esophagus (TCGA PanCan)

pRCC (TCGA PanCan)

Prostate (TCGA PanCan 2018)

LGG (TCGA PanCan)

AML (TCGA PanCan)

Sarcoma (TCGA PanCan 2018)

ccRCC (TCGA PanCan)

b

Case number with alteration

93

Missense

21

Truncating

3

0

Inframe

2

Fusion

0 Other

0

SPRY

Helicase_C

0

100

200

300

400

500

600

740 aa

C

100%

P=0.228

Overall

100%

P=0.135

Progression Free

90%

Altered group

90%

80%

Unaltered group

80%

Altered group

Percent Survival

70%

Percent Survival

Unaltered group

70%

60%

60%

50%

50%

40%

40%

30%

30%

20%

20%

10%

10%

0%

0

40

80

120

160

200

240

280

320

360

0%

Overall Survival (Months)

0

40

Progression Free Survival (Months)

80

120

160

200

240

280

320

360

100%

P=0.0329

Disease Free

100%

P=0.0385

Disease-specific

90%

Altered group

90%

80%

Unaltered group

Altered group

Percent Survival

80%

Percent Survival

Unaltered group

70%

70%

60%

60%

50%

50%

40%

40%

30%

30%

20%

20%

10%

10%

0%

40

120

160

200

240

280

0%

0

80

0

40

80

120

160

200

240

280

320

360

Disease Free Survival (Months)

Disease-specific Survival (Months)

Cor=0.256), MSH2 (Muts Homolog 2, Cor=0.762), MSH6 (Muts Homolog 6, Cor=0.717), and PMS2 (Pms1 Homolog 2, Cor=0.551) (Fig. S6d). The above results indicate that the expression level of the DDX1 gene is correlated with MSI and TMB.

Next, we analyzed the correlation between DDX1 and eight immune checkpoints in different tumors. DDX1 positively correlates with the immune checkpoints in UVM, READ, PCPG, PAAD, and LIHC, but negatively correlates with immune checkpoints in THCA, TGCT, LUSC, LUAD, and CESC (Fig. 7c). Previous reports showed that DDX1 is involved in the regulation of CXCL12 (Stromal cell-derived factor 1), CXCL10 (C-X-C motif chemokine 10), CCR5 (C-C chemokine receptor type 5), and CXCL9 (C-X-C motif chemokine 9) (Nagel et al. 2004). However, DDX1 had weak correlations with these immune checkpoints in tumor-adjacent normal samples, while it had a significant positive correlation with CXCL9 (Fig. S6e).

Fig. 5 The effects of DDX1 methylation on tumor occurrence and prognosis. a Boxplots of expression levels for DDX1 in LUAD. b The Beta value indicated DNA methylation level ranging from 0 (unmethylated) to 1 (fully methylated). Different beta value cut-offs were considered to indicate hyper-methylation (Beta-value: 0.7-0.5) or hypo-methylation (Beta-value: 0.3-0.25). c Correlation between DDX1 methylation and survival prognosis in BLCA, LUSC, SKCM, and CESC is shown, respectively

[Figure panels not reproduced. a LUAD-DDX1 normalized read count in normal (n=59) vs. tumor (n=526) samples, P=0.021. b LUAD-DDX1 expression (FPKM) vs. mean beta value of the promoter, Pearson r = -0.22, P=1.04e-06. c Survival probability vs. survival time (months) for the high and low methylation groups (OS or PFI) in BLCA, LUSC, SKCM, and CESC.]
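The beta-value cut-offs described for Fig. 5 can be illustrated with a small classifier. This is only a sketch under stated assumptions: the caption gives ranges of cut-offs (0.7-0.5 for hyper-methylation, 0.3-0.25 for hypo-methylation), so the defaults of 0.5 and 0.25 used here are one possible choice, and the function name `classify_beta` is hypothetical.

```python
def classify_beta(beta, hyper_cutoff=0.5, hypo_cutoff=0.25):
    """Label a beta value (0 = unmethylated, 1 = fully methylated).

    The default cut-offs are assumptions picked from the ranges quoted
    in the Fig. 5 caption; they are parameters so they can be changed.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta value must lie between 0 and 1")
    if beta >= hyper_cutoff:
        return "hyper-methylated"
    if beta <= hypo_cutoff:
        return "hypo-methylated"
    return "intermediate"


# Hypothetical promoter beta values (illustration only).
for beta in (0.05, 0.12, 0.30, 0.70):
    print(beta, classify_beta(beta))
```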

The tumor mutation database IntOGen was used to identify cancer genes and pinpoint their putative mechanism of action across tumor types (Martínez et al. 2020). We found that DDX1 is mainly positively correlated with the cancer driver genes TP53 (Cellular tumor antigen p53), APC (Adenomatous polyposis coli protein), BRAF (Serine/threonine-protein kinase B-raf), ARID1A (AT-rich interactive domain-containing protein 1A), KRAS (GTPase KRas), LRP1B (Low-density lipoprotein receptor-related protein 1B), MLL2 (Histone-lysine N-methyltransferase 2D), MLL3 (Myeloid/lymphoid or mixed-lineage leukemia 3), PIK3CA (Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform), and PTEN (Phosphatase and tensin homolog deleted on chromosome ten) (Fig. 7d). Also, DDX1 has a strong positive correlation with APC (Cor=0.871), KRAS (Cor=0.877), and MLL3 (Cor=0.844) in UVM (Fig. 7d). In summary, there is a close relationship between DDX1 and tumor immunity, MSI, TMB, and ICT.

Fig. 6 Phosphorylation analysis of DDX1 protein in different tumors. a DDX1 phosphoprotein expression profile in BRCA is shown based on the sample types. b DDX1 proteomic expression profiles in OV were shown. The Z-values were compared based on the sample types (left), tumor grade (middle), and the individual cancer stages (right). c DDX1 proteomic expression profiles in KIRC were shown. The Z-values were compared based on the sample types (left), the tumor grade (middle), and the individual cancer stages (right). d DDX1 proteomic expression profiles in UCEC were shown. The Z-values were compared based on the sample types (left), the tumor grade (middle), and the individual cancer stages (right). The Z-values represented standard deviations from the median across samples for the given cancer type. Log2 Spectral count ratio values from CPTAC were first normalized within each sample profile, then normalized across samples

[Figure panels not reproduced. CPTAC dataset, Z-value box plots for DDX1 phosphorylation sites. a BRCA: S481, S486, and S632 in normal (n=18) vs. primary tumor (n=125) samples. b OV: S481 in normal (n=19) vs. primary tumor (n=84) samples, by tumor grade, and by cancer stage. c KIRC: S481 in normal (n=83) vs. primary tumor (n=110) samples, by tumor grade, and by cancer stage. d UCEC: S481 in normal (n=31) vs. primary tumor (n=100) samples and by tumor grade.]

To further investigate the molecular mechanism of the DDX1 gene in tumorigenesis, we screened out the DDX1-binding proteins and the DDX1 expression-correlated genes for a series of pathway enrichment analyses. Based on the STRING tool, we obtained a total of 50 DDX1-binding proteins, which were supported by experimental evidence (Fig. 8a). We used the GEPIA2 tool to combine all tumor expression data of TCGA and obtained the top 100 genes that correlated with DDX1 expression. As shown in Fig. 8b, the DDX1 expression level was positively correlated with that of SLC4A1AP (Kanadaptin) (r=0.62), RDH14 (Retinol dehydrogenase 14) (r=0.57), E2F6 (Transcription factor E2F6) (r=0.57), NOL10 (Nucleolar protein 10) (r=0.57), DHX9 (ATP-dependent RNA helicase A) (r=0.54), and HNRNPK (Heterogeneous nuclear ribonucleoprotein K) (r=0.53) (all genes with p<0.001). All six genes were positively correlated with DDX1 in multiple tumors (p<0.05) using the TIMER2 tool (Fig. 8c-d). Also, the expression level of DDX1 has a positive correlation with FAM98B (Protein FAM98B), FAM98A (Protein FAM98A), and PPP1R8 (Nuclear inhibitor of protein phosphatase 1) in various types of tumors analyzed by using the TIMER2 tool (Fig. S8a).
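The overlap between the DDX1-binding proteins and the DDX1 expression-correlated genes is a plain set intersection. The sketch below uses short, partial gene lists for illustration only (the full lists of 50 and 100 genes are not reproduced here), and the variable names are hypothetical.

```python
# Partial, illustrative subsets; the complete STRING and GEPIA2 lists
# (50 and 100 genes, respectively) are not reproduced here.
binding_proteins = {"FAM98A", "FAM98B", "PPP1R8", "RTCB", "C14orf166", "CDC5L"}
correlated_genes = {"SLC4A1AP", "RDH14", "E2F6", "NOL10", "DHX9", "HNRNPK",
                    "FAM98A", "FAM98B", "PPP1R8"}

# Genes present in both sets (the overlapping region of a Venn diagram).
shared = sorted(binding_proteins & correlated_genes)
print(shared)

# Genes found in only one of the two sets.
binding_only = sorted(binding_proteins - correlated_genes)
correlated_only = sorted(correlated_genes - binding_proteins)
```

With these subsets the intersection contains FAM98A, FAM98B, and PPP1R8, the three genes named as common to both analyses.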

Fig. 7 Correlation analysis between DDX1 expression and immune infiltration in different cancers. a-b Correlation analysis of DDX1 gene expression and TMB or MSI is shown, respectively. The horizontal axis in the figure represented the expression distribution of the gene, and the ordinate was the expression distribution of the TMB or MSI score. The density curve on the right represented the distribution trend of the TMB or MSI score. The upper density curve represented the distribution trend of the gene. The top side represented the correlation p-value, correlation coefficient, and correlation calculation method. c The correlation between DDX1 expression and cancer-driver genes in different cancers was analyzed. The color scale indicated the Spearman correlation. d The corresponding heatmap data in the exact cancer types were displayed. The red box showed the high expression renal cancers KIRC, KIRP, and KICH

[Figure panels not reproduced. a MSI score vs. log2(DDX1 expression) in READ, DLBC (Cor = -0.50), TGCT, and STAD. b TMB score vs. log2(DDX1 expression) in STAD, THCA (Cor = -0.24), LUAD, and SKCM. c-d Heatmaps of Spearman correlation (p ≤ 0.05 vs. p > 0.05) across TCGA cancer types, one for CD274, CTLA4, HAVCR2, LAG3, PDCD1, PDCD1LG2, SIGLEC15, and TIGIT, and one for TP53, APC, ARID1A, BRAF, KRAS, LRP1B, MLL2, MLL3, PIK3CA, and PTEN.]

To further understand the pathway distribution of these related genes, we integrated the two sets of gene data for KEGG and GO enrichment analysis. Multiple pathways, including RNA transport, homologous recombination, and the Fanconi anemia pathway, might be involved in the effect of DDX1 on tumor pathogenesis (Figs. 8e, S8b). Based on biological process (BP) category analysis (Figs. 8f, S8c), DDX1 may participate in the occurrence of tumors through the regulation of metabolic processes, RNA splicing, protein translation, and other pathways. Based on cellular component (CC) category analysis (Figs. 8g, S8d), these genes are mainly located in the nuclear chromosome, chromosomal region, and cytoplasmic ribonucleoprotein granule. Based on molecular function (MF) category analysis (Figs. 8h, S8e), these genes have single-stranded DNA binding activity, helicase activity, and catalytic activity.
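Enrichment analyses of this kind are commonly based on an over-representation test. The sketch below assumes a one-sided hypergeometric test, which is a typical choice but is not stated in the text; the function name and all counts are hypothetical and do not correspond to the values in Fig. 8.

```python
from math import comb


def enrichment_p_value(background, in_pathway, query, overlap):
    """One-sided hypergeometric P(X >= overlap).

    background: number of genes in the background set
    in_pathway: background genes annotated to the pathway
    query:      number of genes in the submitted gene list
    overlap:    query genes annotated to the pathway
    """
    upper = min(in_pathway, query)
    total = comb(background, query)
    tail = sum(
        comb(in_pathway, i) * comb(background - in_pathway, query - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts (illustration only).
background, in_pathway, query, overlap = 20000, 170, 150, 13
gene_ratio = overlap / query  # the "Gene Ratio" axis of an enrichment dot plot
p_value = enrichment_p_value(background, in_pathway, query, overlap)
print(f"gene ratio = {gene_ratio:.3f}, p = {p_value:.3g}")
```

In practice the raw p-values of all tested terms would then be adjusted for multiple testing, which is what an adjusted P scale in an enrichment plot reflects.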

Knocking out of DDX1 in different cell lines

Next, the data from CRISPR-Cas9 screens across 558 cancer cell lines (https://depmap.org) were analyzed for the ‘Essentiality’ of the DDX1 gene (Meyers et al. 2017). The ‘Essentiality’ of a gene was estimated using CERES, which estimates gene-dependency scores (gDS). ‘Strictly essential genes’, whose deletion severely affects cell viability, have gDS < -1. ‘Strictly non-essential genes’, whose deletion does not affect cell viability, have gDS > 0. Genes with gDS between -1 and 0 are ‘potentially non-essential’, indicating that the deletion affects cell viability to some degree but is not lethal. In all knockout cell lines, DDX1 had a CERES score of less than 0, and only a few cell lines had a score of less than -1, indicating that DDX1 has a vital role in tumor cell lines (Fig. S9). Also, cell lines from blood, gastric, peripheral nervous system, soft tissue, and upper aerodigestive tissues have gDS < -1. The RNA interference (RNAi) results are similar (Fig. S9a). In conclusion, depletion of DDX1 affected the cell viability of human tumor-derived cell lines.
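The three gDS categories defined above translate directly into a threshold rule. The following sketch applies it to a handful of hypothetical scores; the cell-line labels and values are invented for illustration, and treating the boundary values -1 and 0 as 'potentially non-essential' is an assumption.

```python
from collections import Counter


def classify_gds(score):
    """Map a gene-dependency score (gDS) to the categories used in the text."""
    if score < -1:
        return "strictly essential"
    if score > 0:
        return "strictly non-essential"
    # Scores from -1 to 0; including the boundaries here is an assumption.
    return "potentially non-essential"


# Hypothetical DDX1 scores per cell line (illustration only).
scores = {
    "cell_line_A": -0.42,
    "cell_line_B": -1.15,
    "cell_line_C": -0.78,
    "cell_line_D": -0.05,
}

labels = {name: classify_gds(value) for name, value in scores.items()}
summary = Counter(labels.values())
print(summary)
```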

To further explore the mechanistic pathways of DDX1 in tumors, CancerSEA was used to analyze reported tumor single-cell sequencing data (Fig. S9b). We found that DDX1 has the strongest positive correlation with the DNA repair pathway (Cor=0.3) in BRCA, followed by Cell Cycle (Cor=0.26) in high-grade glioma (HGG) and Invasion (Cor=0.26) in BRCA. DDX1 has the strongest negative correlation with Differentiation (Cor=-0.29) in OV, followed by Metastasis (Cor=-0.28) in colorectal cancer (CRC) and Inflammation (Cor=-0.26) in CRC. The above results verify that DDX1 participates in multiple pathways in tumors.

Discussion

Pan-cancer analysis has been successfully used to identify diagnostic and prognostic markers in many cancers (Cheng et al. 2021; Ju et al. 2021; Chatrath et al. 2020; Mitra et al. 2020; Berger et al. 2018; Robichaux et al. 2019; Turajlic et al. 2017). Here, we explored the potential oncogenic roles of DDX1 across 33 tumors based on the TCGA and GTEx databases and systematically showed that differences in DDX1 expression levels, genetic changes, DNA methylation levels, and protein phosphorylation modifications could affect and regulate the immune cells, change the tumor microenvironment, and affect tumors by participating in the RNA transport pathway (Figs. 1-8). DDX1 is highly expressed in many tumors, including BRCA, CHOL, and COAD, in agreement with previous reports (Tanaka et al. 2018) (Fig. 1). Also, the low expression of DDX1 is related to the poor prognosis of OV patients (Fig. 3), similar to the previous report (Li et al. 2008). Most importantly, we found that DDX1 is lowly expressed in renal cancers, including KIRC, KICH, and KIRP. Low expression of DDX1 in KIRC is correlated with a good prognosis of OS and DFS (Fig. 3). Thus, DDX1 might be a protective factor for KIRC patients and a risk factor for KICH, ACC, and LIHC patients.
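The survival comparisons between high- and low-expression groups rest on Kaplan-Meier estimates. The sketch below shows the estimator on hypothetical data, assuming that patients are split at the median expression value (the cut-off is an assumption, as it is not stated here); the function names and all numbers are invented for illustration.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) index lists using the median as cut-off (assumed)."""
    cutoff = median(expression)
    high = [i for i, v in enumerate(expression) if v > cutoff]
    low = [i for i, v in enumerate(expression) if v <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 if the event (e.g. death) was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up in months and event indicators (illustration only).
months = [5, 8, 12, 12, 20]
observed = [1, 0, 1, 1, 0]
curve = kaplan_meier(months, observed)
print(curve)

# Hypothetical expression values used only to show the grouping step.
high, low = split_by_median([2.1, 7.5, 3.3, 8.0, 6.4, 1.9])
```

Each group's curve would be estimated separately in this way, and the two curves would then be compared, for example with a log-rank test.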

More evidence showed that genetic changes of DDX1 are associated with different cancers; for example, changes in gene copy number and abnormal amplification of DDX1 with the MYCN oncogene in neuroblastoma are associated with poor prognosis in many tumors (Bayani et al. 2000; De Preter et al. 2002). The genetic alteration analysis showed that the primary genetic change types of DDX1 are point mutation and amplification in most tumors. Especially in LUSC, the genetic alteration of DDX1 had an adverse effect on DFS (p<0.001) and DSS (p=0.0254) (Fig. S5b). Taken together, we explored the potential association between genetic alteration of DDX1 and the clinical survival prognosis of different cancers.

The genetic alteration analysis showed that amplification is the primary genetic change in different tumors (Fig. 4). Therefore, we conducted a methylation analysis and found that DNA methylation negatively correlated with DDX1 expression level in cancers (Fig. 5). The methylation levels of DDX1 have distinct effects on the prognosis of different cancers. We found that the DNA methylation level was significantly lower than that of normal tissues in SKCM, while it was significantly higher than that of normal tissues in LUSC. However, there was no significant difference in BLCA and CESC. That may be one of the reasons for the different prognostic outcomes. Our analysis shows that a high DNA methylation level of DDX1 is related to a good prognosis of patients with LUSC, suggesting that DDX1 gene activity may be regulated through DNA methylation levels, which cause related changes in chromosome structure, DNA conformation, and DNA stability and thereby regulate the gene expression level of DDX1. All the data above explain the difference in DDX1 expression in different cancers from an epigenetic perspective. Unfortunately, there are no available data on DDX1 protein methylation.

Fig. 8 Enrichment analysis for DDX1-related genes. a Protein-protein interaction network between the available experimentally determined DDX1-binding proteins was analyzed using the STRING tool. b The expression correlation between DDX1 and selected targeting genes, including SLC4A1AP, RDH14, E2F6, NOL10, DHX9, and HNRNPK is shown. c Heat map of the Pearson correlation between the six genes and DDX1 for pan-cancers (red: positive correlation; blue: negative correlation) is shown. d The Venn diagram showed two types of crossover genes, including FAM98B, FAM98A, and PPP1R8. e The KEGG pathway was analyzed by using the DDX1-binding and interacted genes. f-h The GO enrichment analysis was performed by using the DDX1-binding and interacted genes

[Figure panels not reproduced. a STRING network of DDX1-binding proteins. b Scatter plots of log2(TPM) against log2(DDX1 TPM) for SLC4A1AP (R=0.62), E2F6 (R=0.57), NOL10 (R=0.57), RDH14 (R=0.57), DHX9 (R=0.54), and HNRNPK (R=0.53). c Correlation heatmap across TCGA cancer types for DHX9, E2F6, HNRNPK, NOL10, RDH14, and SLC4A1AP. d Venn diagram of correlated and interacted genes, with FAM98B, FAM98A, and PPP1R8 in the overlap. e KEGG dot plot (gene ratio, adjusted P, count). f-h GO dot plots for biological process (BP), cellular component (CC), and molecular function (MF).]

In addition, DDX1 can promote the expression of a subset of miRNAs in a mesenchymal ovarian cancer subtype, facilitated by ataxia telangiectasia mutated (ATM)-mediated phosphorylation (Han et al. 2014). We found that Ser481 of DDX1 had an enhanced phosphorylation level in BRCA and OV but a decreased level in KIRC (Fig. 6). Thus, phosphorylated DDX1 might act as a regulatory protein to induce miRNA expression and be involved in the occurrence of multiple tumors. The increase in phosphorylation level in BRCA may also result from the increased expression level of DDX1 protein (Fig. 2). Therefore, the occurrence of tumors will be accompanied by the enhancement of S481 phosphorylation of DDX1, which has an important physiological role, and the specific regulatory mechanisms are worthy of further study. Immune infiltration analysis showed that DDX1 expression affected CD8+ T cells and was significantly associated with MSI, TMB, and ICT in tumors (Fig. 7). Therefore, DDX1 may become a potential therapeutic prediction target when combined with PD-1/PD-L1 and other immunotherapies. Loss of DDX1 caused rRNA processing defects, thereby activating the ribosome stress-p53 pathway (Suzuki et al. 2021). By KEGG and GO analysis, we further identified that DDX1 might participate in the regulation of tumorigenesis through RNA transport, homologous recombination, the Fanconi anemia pathway, and other pathways (Fig. 8). Two published databases (DepMap and CancerSEA) were used to verify that DDX1 plays an essential function in tumor cell lines and is related to the cancer driver genes (TP53, APC, ARID1A, BRAF, KRAS, LRP1B, MLL2, MLL3, PIK3CA, and PTEN) (Fig. S9).

In conclusion, the pan-cancer study of DDX1 offers a comprehensive understanding of the oncogenic roles of DDX1 across different tumors, suggesting that DDX1 might be a prognostic marker for renal cancers and a clinical diagnostic marker in BRCA, OV, LUAD, and KIRC.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s43657-021-00034-x.

Acknowledgements This work was supported by grants from the National Natural Science Foundation of China (82071782) and the Shanghai Committee of Science and Technology (20XD1400800) to JL.

Authors’ contribution JL conceived and supervised the study. JL and BG wrote the manuscript. BG, XL, SL, SW, and JW conducted the analysis, interpreted the results, and critically reviewed the manuscript.

Data availability The datasets generated during and/or analyzed during the current study are available in the TCGA and GTEx. The TCGA (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga) and GTEx (https://www.genome.gov/Funded-Programs-Projects/Genotype-Tissue-Expression-Project) are public databases. The patients involved in the database have obtained ethical approval. Users can download relevant data free of charge for research and publish relevant articles.

Code availability Not applicable.

Declarations

Conflicts of interest This study is based on open-source data, and there are no ethical issues and other conflicts of interest. The authors declare that they have no competing interests.

Consent to publication Not applicable.

Consent to participate Not applicable.

Ethics approval Not applicable.

References

Bardou P, Mariette J, Escudié F, Djemiel C, Klopp C (2014) jvenn: an interactive Venn diagram viewer. BMC Bioinformatics 15:293. https://doi.org/10.1186/1471-2105-15-293

Bayani J, Zielenska M, Marrano P, Kwan Ng Y, Taylor MD, Jay V, Rutka JT, Squire JA (2000) Molecular cytogenetic analysis of medulloblastomas and supratentorial primitive neuroectodermal tumors by using conventional banding, comparative genomic hybridization, and spectral karyotyping. J Neurosurg 93:437-448. https://doi.org/10.3171/jns.2000.93.3.0437

Berger AC, Korkut A, Kanchi RS, Hegde AM, Lenoir W, Liu W, Liu Y, Fan H, Shen H, Ravikumar V, Rao A, Schultz A, Li X, Sumazin P, Williams C, Mestdagh P, Gunaratne PH, Yau C, Bowlby R, Robertson AG, Tiezzi DG, Wang C, Cherniack AD, Godwin AK, Kuderer NM, Rader JS, Zuna RE, Sood AK, Lazar AJ, Ojesina AI, Adebamowo C, Adebamowo SN, Baggerly KA, Chen TW, Chiu HS, Lefever S, Liu L, Mackenzie K, Orsulic S, Roszik J, Shelley CS, Song Q, Vellano CP, Wentzensen N; Cancer Genome Atlas Research Network, Weinstein JN, Mills GB, Levine DA, Akbani R (2018). A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers. Cancer Cell 33:690-705.e9. https://doi.org/10.1016/j.ccell.2018.03.014

Bol GM, Vesuna F, Xie M, Zeng J, Aziz K, Gandhi N, Levine A, Irving A, Korz D, Tantravedi S, Heerma van Voss MR, Gabrielson K, Bordt EA, Polster BM, Cope L, van der Groep P, Kondaskar A, Rudek MA, Hosmane RS, van der Wall E, van Diest PJ, Tran PT, Raman V (2015). Targeting DDX3 with a small molecule inhibitor for lung cancer therapy. EMBO Mol Med 7:648-669. https://doi. org/10.15252/emmm.201404368.

Bonneville R, Krook MA, Kautto EA, Miya J, Wing MR, Chen HZ, Reeser JW, Yu L, Roychowdhury S (2017). Landscape of Micro- satellite Instability Across 39 Cancer Types. JCO Precis Oncol 2017: PO.17.00073. https://doi.org/10.1200/PO.17.00073.

Brahmer JR, Tykodi SS, Chow LQ, Hwu WJ, Topalian SL, Hwu P, Drake CG, Camacho LH, Kauh J, Odunsi K, Pitot HC, Hamid O, Bhatia S, Martins R, Eaton K, Chen S, Salay TM, Alaparthy S, Grosso JF, Korman AJ, Parker SM, Agrawal S, Goldberg SM, Pardoll DM, Gupta A, Wigginton JM (2012) Safety and activity of anti-PD-L1 antibody in patients with advanced cancer. N Engl J Med 366:2455-2465. https://doi.org/10.1056/NEJMoa1200694

Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacob- sen A, Byrne CJ, Heuer ML, Larsson E, Antipin Y, Reva B, Gold- berg AP, Sander C, Schultz N (2012) The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2:401-404. https://doi.org/10.1158/ 2159-8290.CD-12-0095

Chandrashekar DS, Bashel B, Balasubramanya SAH, Creighton CJ, Ponce-Rodriguez I, Chakravarthi BVSK, Varambally S (2017) UALCAN: a portal for facilitating tumor subgroup gene expres- sion and survival analyses. Neoplasia 19:649-658. https://doi.org/ 10.1016/j.neo.2017.05.002

Chatrath A, Przanowska R, Kiran S, Su Z, Saha S, Wilson B, Tsune- matsu T, Ahn JH, Lee KY, Paulsen T, Sobierajska E, Kiran M, Tang X, Li T, Kumar P, Ratan A, Dutta A (2020) The pan-cancer landscape of prognostic germline variants in 10,582 patients. Genome Med 12:15. https://doi.org/10.1186/s13073-020-0718-7

Chen Z, Li Z, Hu X, Xie F, Kuang S, Zhan B, Gao W, Chen X, Gao S, Li Y, Wang Y, Qian F, Ding C, Gan J, Ji C, Xu XW, Zhou Z, Huang J, He HH, Li J (2020) Structural Basis of Human Heli- case DDX21 in RNA Binding, Unwinding, and Antiviral Signal Activation. Adv Sci (weinh) 7:2000532. https://doi.org/10.1002/ advs.202000532

Cheng S, Li Z, Gao R, Xing B, Gao Y, Yang Y, Qin S, Zhang L, Ouy- ang H, Du P, Jiang L, Zhang B, Yang Y, Wang X, Ren X, Bei JX, Hu X, Bu Z, Ji J, Zhang Z (2021) A pan-cancer single-cell tran- scriptional atlas of tumor infiltrating myeloid cells. Cell 184:792- 809.e23. https://doi.org/10.1016/j.cell.2021.01.010

CNCB-NGDC Members and Partners (2021) Database resources of the national genomics data center, China national center for bioinfor- mation in 2021. Nucleic Acids Res 49:D18-D28. https://doi.org/ 10.1093/nar/gkaa1022

De Preter K, Speleman F, Combaret V, Lunec J, Laureys G, Eussen BH, Francotte N, Board J, Pearson AD, De Paepe A, Van Roy N, Vandesompele J (2002) Quantification of MYCN, DDX1, and NAG gene copy number in neuroblastoma using a real-time quan- titative PCR assay. Mod Pathol 15:159-166. https://doi.org/10. 1038/modpathol.3880508

Ding W, Chen J, Feng G, Chen G, Wu J, Guo Y, Ni X, Shi T (2020) DNMIVD: DNA methylation interactive visualization database. Nucleic Acids Res 48:D856-D862. https://doi.org/10.1093/nar/gkz830

Edgcomb SP, Carmel AB, Naji S, Ambrus-Aikelin G, Reyes JR, Saphire AC, Gerace L, Williamson JR (2012) DDX1 is an RNA-dependent ATPase involved in HIV-1 Rev function and virus replication. J Mol Biol 415:61-74. https://doi.org/10.1016/j.jmb.2011.10.032

Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, Sun Y, Jacobsen A, Sinha R, Larsson E, Cerami E, Sander C, Schultz N (2013) Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 6:pl1. https://doi.org/10.1126/scisignal.2004088

Gibbons RJ, Bachoo S, Picketts DJ, Aftimos S, Asenbauer B, Bergoffen J, Berry SA, Dahl N, Fryer A, Keppler K, Kurosawa K, Levin ML, Masuno M, Neri G, Pierpont ME, Slaney SF, Higgs DR (1997) Mutations in transcriptional regulator ATRX establish the functional significance of a PHD-like domain. Nat Genet 17:146-148. https://doi.org/10.1038/ng1097-146

Godbout R, Packer M, Bie W (1998) Overexpression of a DEAD box protein (DDX1) in neuroblastoma and retinoblastoma cell lines. J Biol Chem 273:21161-21168. https://doi.org/10.1074/jbc.273.33.21161

Hall MC, Matson SW (1999) The Escherichia coli MutL protein physically interacts with MutH and stimulates the MutH-associated endonuclease activity. J Biol Chem 274:1306-1312. https://doi.org/10.1074/jbc.274.3.1306

Han C, Liu Y, Wan G, Choi HJ, Zhao L, Ivan C, He X, Sood AK, Zhang X, Lu X (2014) The RNA-binding protein DDX1 promotes primary microRNA maturation and inhibits ovarian tumor progression. Cell Rep 8:1447-1460. https://doi.org/10.1016/j.celrep.2014.07.058

He X, Xu C (2020) Immune checkpoint signaling and cancer immunotherapy. Cell Res 30:660-669. https://doi.org/10.1038/s41422-020-0343-4

Heerma van Voss MR, van Diest PJ, Raman V (2017) Targeting RNA helicases in cancer: the translation trap. Biochim Biophys Acta Rev Cancer 1868:510-520. https://doi.org/10.1016/j.bbcan.2017.09.006

Hondele M, Sachdev R, Heinrich S, Wang J, Vallotton P, Fontoura BMA, Weis K (2019) DEAD-box ATPases are global regulators of phase-separated organelles. Nature 573:144-148. https://doi.org/10.1038/s41586-019-1502-y

Huang TX, Fu L (2019) The immune landscape of esophageal cancer. Cancer Commun (Lond) 39:79. https://doi.org/10.1186/s40880-019-0427-z

Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, Berent-Maoz B, Pang J, Chmielowski B, Cherry G, Seja E, Lomeli S, Kong X, Kelley MC, Sosman JA, Johnson DB, Ribas A, Lo RS (2016) Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165:35-44. https://doi.org/10.1016/j.cell.2016.02.065

Jarmoskaite I, Russell R (2014) RNA helicase proteins as chaperones and remodelers. Annu Rev Biochem 83:697-725. https://doi.org/10.1146/annurev-biochem-060713-035546

Ju M, Bi J, Wei Q, Jiang L, Guan Q, Zhang M, Song X, Chen T, Fan J, Li X, Wei M, Zhao L (2021) Pan-cancer analysis of NLRP3 inflammasome with potential implications in prognosis and immunotherapy in human cancer. Brief Bioinform 22:bbaa345. https://doi.org/10.1093/bib/bbaa345

Khemici V, Linder P (2018) RNA helicases in RNA decay. Biochem Soc Trans 46:163-172. https://doi.org/10.1042/BST20170052

Koch A, De Meyer T, Jeschke J, Van Criekinge W (2015) MEXPRESS: visualizing expression, DNA methylation and clinical TCGA data. BMC Genomics 16:636. https://doi.org/10.1186/s12864-015-1847-z

Li L, Monckton EA, Godbout R (2008) A role for DEAD box 1 at DNA double-strand breaks. Mol Cell Biol 28:6413-6425. https://doi.org/10.1128/MCB.01053-08

Li T, Fu J, Zeng Z, Cohen D, Li J, Chen Q, Li B, Liu XS (2020) TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res 48:W509-W514. https://doi.org/10.1093/nar/gkaa407

Liao Y, Wang J, Jaehnig EJ, Shi Z, Zhang B (2019) WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Nucleic Acids Res 47:W199-W205. https://doi.org/10.1093/nar/gkz401

Litchfield K, Reading JL, Puttick C, Thakkar K, Abbosh C, Bentham R, Watkins TBK, Rosenthal R, Biswas D, Rowan A, Lim E, Al Bakir M, Turati V, Guerra-Assunção JA, Conde L, Furness AJS, Saini SK, Hadrup SR, Herrero J, Lee SH, Van Loo P, Enver T, Larkin J, Hellmann MD, Turajlic S, Quezada SA, McGranahan N, Swanton C (2021) Meta-analysis of tumor- and T cell-intrinsic mechanisms of sensitization to checkpoint inhibition. Cell 184:596-614.e14. https://doi.org/10.1016/j.cell.2021.01.002

Martínez-Jiménez F, Muiños F, Sentís I, Deu-Pons J, Reyes-Salazar I, Arnedo-Pac C, Mularoni L, Pich O, Bonet J, Kranas H, Gonzalez-Perez A, Lopez-Bigas N (2020) A compendium of mutational cancer driver genes. Nat Rev Cancer 20:555-572. https://doi.org/10.1038/s41568-020-0290-x

Meyers RM, Bryan JG, McFarland JM, Weir BA, Sizemore AE, Xu H, Dharia NV, Montgomery PG, Cowley GS, Pantel S, Goodale A, Lee Y, Ali LD, Jiang G, Lubonja R, Harrington WF, Strickland M, Wu T, Hawes DC, Zhivich VA, Wyatt MR, Kalani Z, Chang JJ, Okamoto M, Stegmaier K, Golub TR, Boehm JS, Vazquez F, Root DE, Hahn WC, Tsherniak A (2017) Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat Genet 49:1779-1784. https://doi.org/10.1038/ng.3984

Mitkova AV, Khopde SM, Biswas SB (2003) Mechanism and stoichiometry of interaction of DnaG primase with DnaB helicase of Escherichia coli in RNA primer synthesis. J Biol Chem 278:52253-52261. https://doi.org/10.1074/jbc.M308956200

Mitra R, Adams CM, Jiang W, Greenawalt E, Eischen CM (2020) Pan-cancer analysis reveals cooperativity of both strands of microRNA that regulate tumorigenesis and patient survival. Nat Commun 11:968. https://doi.org/10.1038/s41467-020-14713-2

Nagel JE, Smith RJ, Shaw L, Bertak D, Dixit VD, Schaffer EM, Taub DD (2004) Identification of genes differentially expressed in T cells following stimulation with the chemokines CXCL12 and CXCL10. BMC Immunol 5:17. https://doi.org/10.1186/1471-2172-5-17

Ribeiro de Almeida C, Dhir S, Dhir A, Moghaddam AE, Sattentau Q, Meinhart A, Proudfoot NJ (2018) RNA helicase DDX1 converts RNA G-quadruplex structures into R-loops to promote IgH class switch recombination. Mol Cell 70:650-662.e8. https://doi.org/10.1016/j.molcel.2018.04.001

Rizvi NA, Hellmann MD, Snyder A, Kvistborg P, Makarov V, Havel JJ, Lee W, Yuan J, Wong P, Ho TS, Miller ML, Rekhtman N, Moreira AL, Ibrahim F, Bruggeman C, Gasmi B, Zappasodi R, Maeda Y, Sander C, Garon EB, Merghoub T, Wolchok JD, Schumacher TN, Chan TA (2015) Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348:124-128. https://doi.org/10.1126/science.aaa1348

Robichaux JP, Elamin YY, Vijayan RSK, Nilsson MB, Hu L, He J, Zhang F, Pisegna M, Poteete A, Sun H, Li S, Chen T, Han H, Negrao MV, Ahnert JR, Diao L, Wang J, Le X, Meric-Bernstam F, Routbort M, Roeck B, Yang Z, Raymond VM, Lanman RB, Frampton GM, Miller VA, Schrock AB, Albacker LA, Wong KK, Cross JB, Heymach JV (2019) Pan-cancer landscape and analysis of ERBB2 mutations identifies poziotinib as a clinically active inhibitor and enhancer of T-DM1 activity. Cancer Cell 36:444-457.e7. https://doi.org/10.1016/j.ccell.2019.09.001

Suzuki T, Katada E, Mizuoka Y, Takagi S, Kazuki Y, Oshimura M, Shindo M, Hara T (2021) A novel all-in-one conditional knockout system uncovered an essential role of DDX1 in ribosomal RNA processing. Nucleic Acids Res 49:e40. https://doi.org/10.1093/nar/gkaa1296

Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, Mering CV (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 47:D607-D613. https://doi.org/10.1093/nar/gky1131

Tanaka K, Ikeda N, Miyashita K, Nuriya H, Hara T (2018) DEAD box protein DDX1 promotes colorectal tumorigenesis through transcriptional activation of the LGR5 gene. Cancer Sci 109:2479-2489. https://doi.org/10.1111/cas.13661

Tang J, Chen H, Wong CC, Liu D, Li T, Wang X, Ji J, Sung JJ, Fang JY, Yu J (2018) DEAD-box helicase 27 promotes colorectal cancer growth and metastasis and predicts poor survival in CRC patients. Oncogene 37:3006-3021. https://doi.org/10.1038/s41388-018-0196-1

Tang Z, Kang B, Li C, Chen T, Zhang Z (2019) GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res 47:W556-W560. https://doi.org/10.1093/nar/gkz430

Tanner NK, Linder P (2001) DExD/H box RNA helicases: from generic motors to specific dissociation functions. Mol Cell 8:251-262. https://doi.org/10.1016/s1097-2765(01)00329-x

Thul PJ, Åkesson L, Wiking M, Mahdessian D, Geladaki A, Ait Blal H, Alm T, Asplund A, Björk L, Breckels LM, Bäckström A, Danielsson F, Fagerberg L, Fall J, Gatto L, Gnann C, Hober S, Hjelmare M, Johansson F, Lee S, Lindskog C, Mulder J, Mulvey CM, Nilsson P, Oksvold P, Rockberg J, Schutten R, Schwenk JM, Sivertsson Å, Sjöstedt E, Skogs M, Stadler C, Sullivan DP, Tegel H, Winsnes C, Zhang C, Zwahlen M, Mardinoglu A, Pontén F, von Feilitzen K, Lilley KS, Uhlén M, Lundberg E (2017) A subcellular map of the human proteome. Science 356:eaal3321. https://doi.org/10.1126/science.aal3321

Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, Gill S, Harrington WF, Pantel S, Krill-Burger JM, Meyers RM, Ali L, Goodale A, Lee Y, Jiang G, Hsiao J, Gerath WFJ, Howell S, Merkel E, Ghandi M, Garraway LA, Root DE, Golub TR, Boehm JS, Hahn WC (2017) Defining a cancer dependency map. Cell 170:564-576.e16. https://doi.org/10.1016/j.cell.2017.06.010

Turajlic S, Litchfield K, Xu H, Rosenthal R, McGranahan N, Reading JL, Wong YNS, Rowan A, Kanu N, Al Bakir M, Chambers T, Salgado R, Savas P, Loi S, Birkbak NJ, Sansregret L, Gore M, Larkin J, Quezada SA, Swanton C (2017) Insertion-and-deletion-derived tumour-specific neoantigens and the immunogenic phenotype: a pan-cancer analysis. Lancet Oncol 18:1009-1021. https://doi.org/10.1016/S1470-2045(17)30516-8

Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson Å, Kampf C, Sjöstedt E, Asplund A, Olsson I, Edlund K, Lundberg E, Navani S, Szigyarto CA, Odeberg J, Djureinovic D, Takanen JO, Hober S, Alm T, Edqvist PH, Berling H, Tegel H, Mulder J, Rockberg J, Nilsson P, Schwenk JM, Hamsten M, von Feilitzen K, Forsberg M, Persson L, Johansson F, Zwahlen M, von Heijne G, Nielsen J, Pontén F (2015) Proteomics. Tissue-based map of the human proteome. Science 347:1260419. https://doi.org/10.1126/science.1260419

Uhlen M, Zhang C, Lee S, Sjöstedt E, Fagerberg L, Bidkhori G, Benfeitas R, Arif M, Liu Z, Edfors F, Sanli K, von Feilitzen K, Oksvold P, Lundberg E, Hober S, Nilsson P, Mattsson J, Schwenk JM, Brunnström H, Glimelius B, Sjöblom T, Edqvist PH, Djureinovic D, Micke P, Lindskog C, Mardinoglu A, Ponten F (2017) A pathology atlas of the human cancer transcriptome. Science 357:eaan2507. https://doi.org/10.1126/science.aan2507

Yuan H, Yan M, Zhang G, Liu W, Deng C, Liao G, Xu L, Luo T, Yan H, Long Z, Shi A, Zhao T, Xiao Y, Li X (2019) CancerSEA: a cancer single-cell state atlas. Nucleic Acids Res 47:D900-D908. https://doi.org/10.1093/nar/gky939

Zeng D, Li M, Zhou R, Zhang J, Sun H, Shi M, Bin J, Liao Y, Rao J, Liao W (2019) Tumor microenvironment characterization in gastric cancer identifies prognostic and immunotherapeutically relevant gene signatures. Cancer Immunol Res 7:737-750. https://doi.org/10.1158/2326-6066.CIR-18-0436

Zhang Y, Zhang Z (2020) The history and advances in cancer immunotherapy: understanding the characteristics of tumor-infiltrating immune cells and their therapeutic implications. Cell Mol Immunol 17:807-821. https://doi.org/10.1038/s41423-020-0488-6

Zhong W, Li Z, Zhou M, Xu T, Wang Y (2018) DDX1 regulates alternative splicing and insulin secretion in pancreatic β cells. Biochem Biophys Res Commun 500:751-757. https://doi.org/10.1016/j.bbrc.2018.04.147