Original Article MMP9 in pan-cancer and computational study to screen for MMP9 inhibitors

Xianjie Ai1, Xinyu Wang2, Taotao Ren1, Zhong Li1, Bo Wu1, Ming Li1

1Lower Extremity Division, Orthopedic Trauma Department, Honghui Hospital, Xi’an Jiaotong University, Youyi East Road No. 555, Beilin District, Xi’an, Shaanxi, China; 2Department of Orthopaedic Trauma, Center of Orthopaedics and Traumatology, The First Hospital of Jilin University, Street Xinmin 71, Changchun, Jilin, China

Received May 21, 2024; Accepted September 26, 2024; Epub November 15, 2024; Published November 30, 2024

Abstract: Purpose: Matrix metalloproteinase 9 (MMP9), which is associated with extracellular matrix degradation and remodeling, promotes tumor invasion and metastasis and regulates the activity of cell adhesion molecules and cytokines. This study evaluated MMP9 in pan-cancer and screened for compounds and drug candidates that can inhibit it. Methods: MMP9 expression in pan-cancer tissues was evaluated in a pan-cancer dataset from the University of California Santa Cruz database, along with the correlation between MMP9 and the tumor microenvironment (TME), RNA modification genes, and tumor mutation burden. MMP9 crystal structures were downloaded, and a ligand-based pharmacophore model was constructed. A machine learning model was constructed for further screening. The identified compounds were imported into Discovery Studio 4.5 for absorption, distribution, metabolism, and excretion (ADME) and toxicity prediction. Molecular docking was used to characterize the binding affinity and binding mechanism between the compounds and MMP9, and the stability of the ligand-receptor complexes was assessed. Results: MMP9 expression levels differed between tumor tissues. Prognostic analysis showed that high MMP9 expression indicates poor survival and tumor progression in glioma (GBMLGG), pan-kidney (KIPAN; KICH+KIRC+KIRP), uveal melanoma (UVM), low-grade glioma (LGG), adrenocortical carcinoma (ACC), and liver hepatocellular carcinoma (LIHC). MMP9 expression in GBMLGG, KIPAN, UVM, LGG, ACC, and LIHC was positively correlated with the TME. The ligand-based pharmacophore model and the machine learning model identified 49 small molecules. ADME and toxicity prediction identified CHEMBL82047 and CHEMBL381163 as potential MMP9 inhibitors, and both showed robust binding affinity with MMP9. The resulting complexes were stable in a simulated natural environment. Conclusion: CHEMBL82047 and CHEMBL381163 are ideal candidate compounds for inhibiting MMP9. The findings of this study will contribute to the design and improvement of MMP9-targeting drugs.

Keywords: MMP9, pan-cancer, ligand-based pharmacophore model, machine learning model, virtual screening, molecular dynamics simulation

Introduction

Tumors are formed when normal cells proliferate and differentiate abnormally under the action of various initiating and promoting factors. Tumors, especially malignant ones, destroy normal tissues and organs and can cause gradual organ dysfunction, leading to organ failure or death through compression, consumption, or destruction [1]. Cancer is one of the leading causes of death worldwide [2], with an extremely low cure rate in both developed and developing countries [3, 4]. Although great progress has been made in cancer therapy, many patients still have poor prognoses and low survival rates. Thus, novel therapeutic methods and drugs are urgently needed.

Matrix metalloproteinase-9 (MMP9), a member of the zinc-dependent endopeptidase family, is a gelatinase involved in a variety of biological processes (e.g., proteolytic extracellular matrix (ECM) degradation, cell-ECM and cell-cell interactions, and cell surface cleavage activities). In addition, it degrades and regulates ECM proteins and releases bioactive proteins, including cytokines, chemokines, and growth factors [5, 6]. MMP9 degrades type IV collagen and disrupts basement membranes, a process associated with tumor invasion and metastasis. The expression level of MMP9 mRNA is significantly higher in nasopharyngeal carcinoma tissues than in normal nasopharyngeal tissues, and MMP9 overexpression accelerates tumor growth by inducing angiogenesis and enhances local cell invasion and metastasis by degrading the ECM [7]. In esophageal cancer, MMP9 overexpression is significantly correlated with the depth of tumor infiltration, lymphatic infiltration, lymph node metastasis, and the degree of pathological differentiation [7]. The ECM is a key component of the local tumor microenvironment (TME) and undergoes extensive remodeling during breast cancer evolution. MMP9 is reported to be a key player in ECM remodeling during cancer initiation and progression through a variety of mechanisms [8].

Currently, several chemotherapeutic agents target MMP9. MMP9-IN-1, a highly selective MMP9 inhibitor with oral efficacy [9, 10], selectively inhibits MMP9 to control the development, progression, invasion, and metastasis of nasopharyngeal carcinoma, but it also affects the function of the human respiratory system and reduces the activity of other proteases and cytokines because of its strong inhibitory effect [7, 11]. JNJ0966 is another highly selective MMP9 inhibitor that blocks the conversion of the MMP9 zymogen to a catalytically active enzyme [12]. However, it is currently only used in scientific research. Other MMP9 inhibitors exist, but they act on a broad range of targets and therefore cause more side effects. Therefore, novel MMP9-targeting drugs are needed.

This study combined a pharmacophore model and a machine learning model to screen for novel MMP9 inhibitors. Pharmacophores are combinations of characterized three-dimensional structural elements [13, 14] and have been used to design and screen new drugs on the basis of specific ligand structures [15, 16]. Machine learning is used to predict or classify drugs using data analysis [17] and is helpful in many fields, such as clinical data processing [18, 19]. We explored the role of MMP9 in pan-cancer and assessed the relevance of MMP9 to the tumor immune microenvironment and mRNA modifications. We then constructed a pharmacophore model and a machine learning model to screen for inhibitors of MMP9, followed by absorption, distribution, metabolism, and excretion (ADME) and toxicity analysis, protein-ligand docking, and molecular dynamics (MD) simulation. This research provides a novel investigation strategy and a group of therapeutic candidates for MMP9, which might serve as a strong foundation for further inhibitor research.

Methods

Analysis of the expression level of MMP9 in pan-cancer datasets

The unified and standardized pan-cancer dataset TCGA (The Cancer Genome Atlas) Pan-Cancer (PANCAN, N = 10535, G = 60499) was downloaded from the University of California Santa Cruz (UCSC) database (https://xenabrowser.net/). The expression data of the ENSG00000100985 (MMP9) gene were extracted from each sample, and the samples from normal solid tissue, primary blood-derived cancer - peripheral blood, and primary tumors were selected, followed by log2(x+0.001) transformation of each expression value. Cancers with fewer than three samples were excluded. The difference in expression between normal and tumor samples in each tumor was calculated using R software (version 3.6.4), and significance analysis was performed using unpaired Wilcoxon rank sum and signed rank tests. Finally, a plot showing the differences in MMP9 expression between cancers was created.
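The transformation and the unpaired comparison described above can be illustrated with a minimal Python sketch. It is not the R workflow used in this study: the function names are hypothetical, and the p-value uses the large-sample normal approximation of the rank-sum statistic without a tie correction, so it is only approximate for small or heavily tied groups.

```python
import math
from statistics import NormalDist


def log2_transform(values, pseudocount=0.001):
    """Apply the log2(x + 0.001) transformation to raw expression values."""
    return [math.log2(v + pseudocount) for v in values]


def rank_sum_test(tumor, normal):
    """Unpaired two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Returns (U statistic of the tumor group, approximate p-value).
    Assumption: normal approximation, average ranks for ties,
    no tie correction of the variance.
    """
    n1, n2 = len(tumor), len(normal)
    pooled = sorted(list(tumor) + list(normal))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_tumor = sum(rank_of[v] for v in tumor)
    u_tumor = rank_sum_tumor - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_tumor - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_tumor, p_value
```

In practice, a cohort with only a handful of normal samples gives such a test very little power, which is relevant when interpreting non-significant comparisons.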

Identification of the correlation between MMP9 expression levels and survival in pan-cancer

Two metrics (overall survival [OS] and progression-free survival [PFS]) were selected from TCGA samples to investigate the association between MMP9 expression and patient outcomes. A high-quality TCGA prognostic dataset was obtained from a TCGA prognosis study previously published in Cell; cancers with fewer than 10 samples and samples with a follow-up time of less than 30 days were excluded. The R software package “survival” was used to fit Cox regression models, summarized in a forest plot, to analyze the relationship between MMP9 gene expression and survival in each tumor. The patients with each tumor type in the TCGA dataset were divided into two groups according to the best cut-off value of MMP9 to compare the prognostic differences. The prognostic differences between the two groups were further analyzed using the “survfit” function of the R software package “survival”, and the log-rank test was used to evaluate whether the prognostic differences between the groups were significant.
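As an illustration of the grouping and comparison steps above, the following Python sketch computes a two-group log-rank statistic and scans candidate cut-offs. It is a simplified stand-in for the R “survival” workflow, not the procedure used in this study: the function names, the exhaustive scan, and the `min_group` parameter are assumptions, since the text does not state how the best cut-off was chosen.

```python
import math


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Chi-square statistic (1 degree of freedom) of the two-group log-rank test.

    events are 1 (event observed) or 0 (censored).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for x, _, _ in data if x >= t)              # at risk, both groups
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)  # at risk, group A
        d = sum(1 for x, e, _ in data if x == t and e)         # events at t
        d_a = sum(1 for x, e, g in data if x == t and e and g == 0)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def logrank_p(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def best_cutoff(expression, times, events, min_group=3):
    """Scan observed expression values and return (cutoff, chi2) with the largest
    log-rank statistic. Patients with expression <= cutoff form the low group."""
    best = None
    for c in sorted(set(expression)):
        low = [i for i, x in enumerate(expression) if x <= c]
        high = [i for i, x in enumerate(expression) if x > c]
        if len(low) < min_group or len(high) < min_group:
            continue
        chi2 = logrank_chi2([times[i] for i in low], [events[i] for i in low],
                            [times[i] for i in high], [events[i] for i in high])
        if best is None or chi2 > best[1]:
            best = (c, chi2)
    return best
```

Because the cut-off is chosen to maximize the separation between groups, the resulting log-rank p-value is optimistic and is best read as descriptive.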

Association between MMP9 expression and the TME in pan-cancer

The gene expression profiles of each tumor were extracted separately, and the expression profiles were mapped to “Gene Symbol”. The R software package “ESTIMATE” was used to calculate the stromal, immune, and ESTIMATE scores of each patient with each tumor type according to gene expression. The corr.test function of the R software package “psych” was used to calculate the Pearson correlation coefficient between gene expression and the immune infiltration and immune cell infiltration scores in each tumor to determine whether they were significantly correlated.
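For reference, the Pearson correlation coefficient used here can be written out in a few lines of Python. This is a sketch with hypothetical function names, not the “psych” implementation; the second function returns the t statistic (n - 2 degrees of freedom) on which the usual significance test of r is based.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero; undefined when |r| == 1."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

Here x would hold the MMP9 expression values and y the corresponding immune, stromal, or ESTIMATE scores of the same patients.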

Correlation between MMP9 expression and mRNA-modifying genes in pan-cancer

The expression data of MMP9 and of the marker genes of three types of RNA modification (m1A, m5C, and m6A) in each sample were extracted. Primary blood-derived cancer - peripheral blood samples and primary tumor samples were selected, all normal samples were filtered out, and each expression value was transformed. The Pearson correlation coefficients between MMP9 and the RNA modification marker genes were then calculated. These data were used to estimate the role of RNA modifications in cancer using the gene expression dataset and to further summarize their therapeutic potential for abnormal deposition in cancer.

Association between MMP9 expression and tumor mutation burden in pan-cancer

Simple nucleotide variation data were downloaded from the database and processed. A simple nucleotide variation dataset was used to plot the mutational landscape of MMP9 in four tumor types. Tumor mutation burden (TMB) scores were calculated using the mutation data of the four tumor types from TCGA, and patients were divided into low-TMB and high-TMB groups according to the TMB score quartile. Differentially expressed genes (DEGs) were identified between the low- and high-TMB groups.
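A minimal Python sketch of quartile-based TMB grouping is given below. Several details are assumptions, because the text does not specify them: TMB is taken as mutations per megabase with a hypothetical exome size of 38 Mb, the low group is the bottom quartile, and the high group is the top quartile. The function names are illustrative only.

```python
from statistics import quantiles


def tmb_scores(mutation_counts, exome_mb=38.0):
    """Convert per-sample mutation counts to mutations per megabase.

    exome_mb is an assumed capture size, not a value taken from the study.
    """
    return {sample: count / exome_mb for sample, count in mutation_counts.items()}


def split_by_quartile(scores):
    """Return (low, high) sample lists: at or below the first quartile,
    and at or above the third quartile."""
    q1, _median, q3 = quantiles(list(scores.values()), n=4)
    low = sorted(s for s, v in scores.items() if v <= q1)
    high = sorted(s for s, v in scores.items() if v >= q3)
    return low, high
```

Samples between the two quartiles are left out of both groups under this assumption; splitting at a single quartile boundary would be an equally plausible reading of the text.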

Construction and verification of pharmacophore models

Pharmacophore models are useful for screening ideal compounds, and two types of pharmacophore models are known: structure-based pharmacophore models derived directly from the X-ray structures of protein-ligand complexes and ligand-based pharmacophore models derived from the structures of known active compounds. The crystal structures of human MMP9 in complex with different ligands (Protein Data Bank [PDB] IDs: 2OW0, 2OW1, 4H3X, and 4WZV) were analyzed using LigandScout v4.3, which provides automated construction of three-dimensional pharmacophores. LigandScout identifies 3D chemical features: hydrogen bond donors (HBDs) and acceptors (HBAs) are represented as directed vectors, and negatively and positively ionizable groups are represented as spheres. Lipophilic regions are likewise indicated by spheres. In addition, to improve selectivity, the LigandScout model incorporates spatial information about the binding region for each candidate inhibitor. Pharmacophore signatures were entered into the web server Pharmit (http://pharmit.csb.pitt.edu/) to search for and identify small molecules that bind to the target molecule (MMP9 receptor) on the basis of structural and chemical similarities between small molecules. Using the PDB codes as input, 1,752,844 candidate small molecules were obtained. Then, a deep learning model was built with DeepScreening (http://deepscreening.xielab.net/) for further screening, and the performance of the model was evaluated using test loss, accuracy, recall, precision, the F1-score (F1), and the Matthews correlation coefficient (MCC).
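The evaluation metrics listed above (other than the loss) follow directly from the confusion matrix of a binary classifier. The Python sketch below shows their standard definitions; the function name is hypothetical, and it does not reproduce how DeepScreening computes or reports them.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and MCC for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}
```

MCC ranges from -1 to 1, whereas the other four metrics range from 0 to 1, which is why an MCC near 0.7 can accompany accuracy and F1 values near 0.9.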

ADME and toxicity prediction

The ADME module of Discovery Studio 4.5 was used to calculate the ADME properties of the selected compounds, including their water solubility, blood-brain barrier permeability, cytochrome P450 2D6 (CYP2D6) inhibition, hepatotoxicity, human intestinal absorption, and plasma protein binding levels. The TOPKAT module of Discovery Studio 4.5 was used to calculate the toxicity and other properties of the potential compounds, such as the National Toxicology Program rodent carcinogenicity, the Ames mutagenicity, the developmental toxicity potential, the median oral lethal dose (LD50), and the chronic oral lowest observed adverse effect level (LOAEL) in rats. These pharmacological properties were considered when selecting appropriate drug candidates for MMP9.

Figure 1. Mind map of this study.
[Figure 1 image not reproduced. Flowchart with two branches starting from MMP9. Pan-cancer Analysis: Gene Expression, Prognostic Analysis, Immune Infiltration Analysis, mRNA Modification Analysis, SNP Analysis. Virtual Screening: 1,752,844 molecules, Pharmacophore Model, 230 molecules, Deep Learning Model, 49 molecules, ADMET, 2 molecules, Protein-molecule Docking and MD.]

Protein molecule docking

Molecular docking was performed using the Glide module of the Schrödinger suite to obtain the active conformations of small molecules interacting with the MMP9 receptor. Top-ranked compounds from the pharmacophore screening were prepared in Maestro using the LigPrep module to obtain the starting structures for docking. Ligand-receptor interactions included hydrogen bond interactions, van der Waals interactions, π-π stacking interactions, and ionic interactions. The molecular docking results were analyzed according to the binding energy (kcal/mol) between the small molecules and amino acid residues and the number of binding interactions.

Molecular dynamics simulation

The best binding conformations of the ligand-MMP9 complexes among the potential compounds predicted by the molecular docking program were submitted to MD simulation using Discovery Studio 4.5. The ligand-receptor complex was placed into an orthorhombic box and solvated with an explicit periodic boundary solvation water model. To simulate the physiological environment, sodium chloride was added to the system at an ionic strength of 0.145. The CHARMM force field was then applied to the system, with ligands parameterized by analogy. For this system, the following simulation protocol was applied: 1,000 steps of steepest descent and conjugate gradient minimization; a 5 ps equilibration simulation at 300 K (slowly heated over 2 ps from an initial temperature of 50 K) and atmospheric pressure; and a 25 ps MD simulation (production mode) under NPT conditions (constant pressure and temperature). The particle mesh Ewald (PME) algorithm was used to calculate long-range electrostatics, and the linear constraint solver (LINCS) algorithm was used to constrain all bonds involving hydrogen. With the initial complex conformation as the reference, the trajectories of the root mean square deviation (RMSD), potential energy, and structural features were determined using the Discovery Studio 4.5 analyze trajectory protocol.
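The RMSD used to monitor the trajectory can be sketched in Python as follows. This is an illustrative calculation with hypothetical function names, not the Discovery Studio protocol, and it assumes that every frame has already been superimposed on the reference so that no alignment step is needed.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two conformations.

    Each conformation is a list of (x, y, z) tuples with atoms in the same order.
    Assumption: the conformations are already aligned to each other.
    """
    if len(coords) != len(reference):
        raise ValueError("conformations must contain the same number of atoms")
    squared = sum((a - b) ** 2
                  for atom, ref_atom in zip(coords, reference)
                  for a, b in zip(atom, ref_atom))
    return math.sqrt(squared / len(coords))


def trajectory_rmsd(frames):
    """RMSD of every frame relative to the first frame (the initial complex)."""
    reference = frames[0]
    return [rmsd(frame, reference) for frame in frames]
```

A complex is usually regarded as stable when this curve levels off rather than drifting upward over the production run.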

Results

MMP9 expression in pan-cancer

The complete data analysis process is depicted in Figure 1. We analyzed the expression data of 26 cancer types and found that MMP9 was highly expressed in the vast majority of tumor samples. Expression differed significantly between tumor and normal samples in most tumors, including glioblastoma multiforme (GBM), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), lung adenocarcinoma (LUAD), colon adenocarcinoma (COAD), colon adenocarcinoma/rectum adenocarcinoma (COADREAD), breast invasive carcinoma (BRCA), esophageal carcinoma (ESCA), stomach and esophageal carcinoma (STES), kidney renal papillary cell carcinoma (KIRP), pan-kidney cohort (KIPAN), stomach adenocarcinoma (STAD), prostate adenocarcinoma (PRAD), uterine corpus endometrial carcinoma (UCEC), head and neck squamous cell carcinoma (HNSC), kidney renal clear cell carcinoma (KIRC), lung squamous cell carcinoma (LUSC), liver hepatocellular carcinoma (LIHC), rectum adenocarcinoma (READ), pheochromocytoma and paraganglioma (PCPG), bladder urothelial carcinoma (BLCA), kidney chromophobe carcinoma (KICH), and cholangiocarcinoma (CHOL) (P < 0.05). MMP9 was highly expressed in brain low-grade glioma (LGG), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), and pancreatic adenocarcinoma (PAAD); however, because of the small sample size of the control group (normal), no significant differences were detected. Furthermore, the expression of MMP9 in thyroid carcinoma (THCA) did not differ significantly from that of normal samples (Figure 2A).

Figure 2. Pan-cancer analysis of MMP9 expression. A. Differential expression of MMP9 between tumor and normal tissues in pan-cancer analysis. MMP9 expression correlates with overall survival time (OS). B. Survival curves of MMP9 expression in GBMLGG, KIPAN and UVM. L represents the low MMP9 expression group, and H represents the high MMP9 expression group. C. Pan-cancer cohort (GBMLGG, KICH, KIRC, KIRP, KIPAN and UVM). Correlation between MMP9 expression and immune scores.
[Figure 2 image not reproduced. Panel A: MMP9 expression (y-axis: Expression) in tumor (T) and normal (N) samples; cohorts with sample counts (T, N): GBM (153, 5), GBMLGG (662, 5), LGG (509, 5), CESC (304, 3), LUAD (513, 109), COAD (288, 41), COADREAD (380, 51), BRCA (1092, 113), ESCA (181, 13), STES (595, 49), KIRP (288, 129), KIPAN (884, 129), STAD (414, 36), PRAD (495, 52), UCEC (180, 23), HNSC (518, 44), KIRC (530, 129), LUSC (498, 109), LIHC (369, 50), THCA (504, 59), READ (92, 10), PAAD (178, 4), PCPG (177, 3), BLCA (407, 19), KICH (66, 129), CHOL (36, 9). Panel B: overall survival curves (y-axis: survival probability) for GBMLGG (HR = 6.26, 95% CI 4.79-8.18, p = 1.2e-51), KIPAN (HR = 1.76, 95% CI 1.35-2.31, p = 2.9e-5), and UVM (HR = 4.92, 95% CI 2.08-11.66, p = 6.7e-5). Panel C: ImmuneScore plotted against MMP9 expression for TCGA-GBMLGG (N = 656, r = 0.50, p = 1.1e-43), TCGA-LGG (N = 504, r = 0.35, p = 2.3e-16), TCGA-KIPAN (N = 878, r = 0.47, p = 2.0e-48), TCGA-KIRC (N = 528, r = 0.32, p = 1.1e-13), TCGA-UVM (N = 79, r = 0.59, p = 9.1e-9), and TCGA-ACC (N = 77, r = 0.29, p = 0.01).]


Pan-cancer prognostic analysis of MMP9

To further explore the association between MMP9 and the prognosis of pan-cancer, we performed prognostic analysis on 39 cancer types. The OS results (Supplementary Figure 1A) showed that for glioma (GBMLGG), KIPAN, uveal melanoma (UVM), LGG, adrenocortical carcinoma (ACC), liver hepatocellular carcinoma (LIHC), BLCA, and testicular germ cell tumors (TGCT), higher MMP9 expression was associated with a lower survival rate (P < 0.05). For skin cutaneous melanoma (SKCM) and SKCM-M, higher MMP9 expression was associated with a higher survival rate, suggesting that MMP9 is a beneficial factor for these two tumor types (P < 0.05). For the other 28 tumors, expression was not significantly associated with survival (P > 0.05). We also plotted the survival curves of GBMLGG, KIPAN, UVM, LGG, ACC, KIRC, LIHC, BLCA, and TGCT (Figure 2B, Supplementary Figure 1B, 1C). In addition, we analyzed the PFS of pan-cancer (Supplementary Figure 1D) and found that for GBMLGG, KIPAN, KIRC, UVM, LGG, ACC, THCA, GBM, and KICH, higher MMP9 expression was associated with faster tumor progression (P < 0.05); for lymphoid neoplasm diffuse large B-cell lymphoma (DLBC) and ovarian serous cystadenocarcinoma (OV), higher MMP9 expression was associated with slower tumor progression, suggesting that MMP9 is a suppressor of tumor development in these cancers (P < 0.05). For the other 28 tumors, MMP9 expression was not significantly associated with tumor progression (P > 0.05). In summary, higher MMP9 expression was associated with a lower survival rate and faster tumor progression in GBMLGG, KIPAN, UVM, LGG, ACC, and LIHC.

Correlation between MMP9 expression, the TME, and immune infiltration

The TME is composed of various components, such as immune cells, non-immune stromal cells, and ECM proteins, including innate immune cells, adaptive immune cells, extracellular immune factors, and cell surface molecules. The TME, also referred to as the tumor immune microenvironment (TIME), has unique internal interactions and plays an important role in tumor biology [20, 21]. To further explore the correlation between MMP9 and tumor immune infiltration, we performed immune analysis on six tumor types. We found that MMP9 expression in GBMLGG, KIPAN, UVM, LGG, ACC, and LIHC was positively correlated with the immune score, ESTIMATE score, and stromal score (Figure 2C, Supplementary Figure 2).

In addition, we analyzed the correlation of MMP9 expression with immune cells in each tumor (Figure 3A). We found that macrophages were significantly associated with MMP9 expression. Specifically, M0 macrophages were significantly positively correlated with MMP9 expression in all six tumors; classically activated M1 macrophages were positively correlated with MMP9 expression in GBMLGG, KIPAN, UVM, LGG, and ACC; and alternatively activated M2 macrophages were positively correlated with MMP9 expression in GBMLGG and LGG. A high abundance of macrophages leads to the release of more cytokines, such as epidermal growth factor (EGF), which promote the metastasis and invasion of cancer cells [22, 23]; this may explain the high correlation between MMP9 expression and metastasis. Monocytes were negatively correlated with MMP9 expression in five tumors but not in ACC, suggesting that the ability to recognize and kill tumor cells was inhibited [24]. Activated natural killer cells were negatively correlated with MMP9 expression in GBMLGG, KIPAN, KIRC, and ACC, indicating that their ability to kill tumor cells decreases when tumors express more MMP9. Furthermore, MMP9 expression was positively correlated with regulatory T cells (Tregs), which can suppress the immune system, in GBMLGG, KIPAN, UVM, LGG, and KIRC [25].

Correlation of MMP9 expression with RNA modification genes

Chemical RNA modifications play an important role in fundamental cellular processes, such as cell differentiation, protein production, cell signaling, and the maintenance of circadian rhythms [26, 27], and these modifications can be critical for tumor-suppressive or tumor-promoting effects. We found that MMP9 expression in GBMLGG was positively correlated with most m1A modification genes, with significant differences between tumor types; ALKBH3 was positively associated with MMP9 expression in four tumors (GBMLGG, KIPAN, ACC, and LGG), with statistically significant differences between tumors (Figure 3B). ALKBH3 can promote the proliferation, migration, and invasion of cancer cells [28]. In m6A modification (Figure 3C), MMP9 expression was positively correlated with most genes in GBMLGG, with significant differences between tumor types. TRMT61A was positively correlated with MMP9 expression in four tumors (GBMLGG, KIPAN, ACC, and LGG), with statistically significant differences between tumors (Figure 3C). The tRNA m1A58 methyltransferase is composed of the RNA-binding component TRMT6 and the catalytic component TRMT61A; it is crucial for maintaining tRNA stability, affects translation initiation, and has profound effects on various biological processes [29]. In m5C modification (Figure 3D), MMP9 expression was positively correlated with most genes in GBMLGG, with significant differences between tumor types. DNMT3B was positively correlated with MMP9 in four tumors (GBMLGG, KIPAN, ACC, and LGG), with significant differences between tumors (Figure 3D). DNMT3B is involved in de novo DNA methylation in embryonic stem cells and early embryos. It is overexpressed in several human tumors and is an indicator of early tumor recurrence and poor prognosis in hepatocellular carcinoma [30].

Figure 3. A. Pan-cancer cohort (GBMLGG, KICH, KIRC, KIRP, KIPAN and UVM). Analysis of the relationship between MMP9 expression and immune cell infiltration. B-D. Correlation between m1A, m5C and m6A mRNA modification genes and the expression of MMP9 in the pan-cancer cohort (GBMLGG, KICH, KIRC, KIRP, KIPAN and UVM).
[Figure 3 image not reproduced. Panel A: heat map of correlation coefficients (color scale, with p-value scale) between MMP9 expression and 22 immune cell types (naive B cells through neutrophils) in TCGA-LGG (N = 504), TCGA-KIPAN (N = 878), TCGA-KIRC (N = 528), TCGA-GBMLGG (N = 656), TCGA-UVM (N = 79), and TCGA-ACC (N = 77); asterisks mark significant correlations. Panels B-D: heat maps of correlation coefficients (with p-value scale) between MMP9 expression and RNA modification genes, grouped by type (writer, reader, eraser), in GBMLGG (N = 662), UVM (N = 79), ACC (N = 77), LGG (N = 509), KIPAN (N = 884), and KIRC (N = 530).]

Correlation of MMP9 expression with TMB

We further performed single nucleotide polymorphism (SNP) analysis by dividing patients into two groups: a high MMP9 expression group and a low MMP9 expression group. In LGG (Supplementary Figure 3A), the genes IDH1, TP53, and ATRX had high mutation frequencies (> 20%), and EGFR, MYH13, EPPK1, MYO15A, SI, KIAA1109, CDH17, SLCO1B1, SYNE2, CFAP47, SSPO, and ZFHX4 also had higher mutation rates and more mutation types in the high MMP9 expression group. In KIRC (Supplementary Figure 3B), the genes VHL and PBRM1 had high mutation frequencies (> 20%), and the mutation types were mostly missense mutations, frameshift deletion mutations, nonsense mutations, splice site mutations, and in-frame insertions. THSD7B, ADGRV1, XPO7, LAMC2, and UBR4 also had higher mutation rates and more mutation types in the high MMP9 expression group. TP53, CTNNB1, and MUC16 showed high mutation frequencies (> 20%) in ACC (Supplementary Figure 3C) as well as higher mutation rates in the high MMP9 expression group. DST, FAT4, ASXL3, CNTNAP5, and NF1 also had higher mutation rates and more mutation types in the high MMP9 expression group. In UVM (Supplementary Figure 3D), the genes GNAQ, GNA11, BAP1, and SF3B1 had high mutation frequencies (> 20%), whereas BAP1 had a higher mutation frequency in the high MMP9 expression group. Finally, SF3B1 and EIF1AX showed higher mutation rates in patients with high MMP9 expression.

Construction and validation of the pharmacophore model

To further screen for novel inhibitors of MMP9, we constructed a ligand-based pharmacophore model. We first analyzed the crystal structures (PDB IDs: 2OW0, 2OW1, 4H3X, and 4WZV) to identify the major residues of the MMP9 receptor (Figure 4A-D) and the physicochemical interaction patterns between the small active molecules and the target protein, and then mapped these interactions to 3D pharmacophore features (e.g., hydrogen bonds, lipophilic contacts, and ionic or aromatic interactions).

As shown in Figure 4A, the crystal structure 2OW0 exhibited two hydrophobic interactions, binding with the residues TYR423, LEU397, LEU418, VAL398, and ZN444. Two hydrogen bond acceptors were found with ALA189, GLN402, HOH503, HOH608, and LEU188. In addition, a positively ionizable region was detected. The crystal structure 2OW1 (Figure 4B) exhibited two hydrophobic interactions, binding with the residues VAL398, LEU418, TYR423, LEU397, and ZN444. Five hydrogen bond acceptors were found with LEU188, HOH593, HOH557, ALA189, and GLN402, and three hydrogen bond donors were also observed, along with a positively ionizable region. The crystal structure 4H3X (Figure 4C) exhibited two hydrophobic interactions, binding with the residues LEU243, TYR248, VAL223, and ZN301. Two hydrogen bond acceptors were found with LEU188 and ALA189, and three hydrogen bond donors with ALA189, HIS226, and HOH415 were also observed, along with a positively ionizable region. The crystal structure 4WZV (Figure 4D) exhibited two hydrophobic interactions, binding with the residues TYR245, MET247, ZN302, VAL223, and TYR248. Four hydrogen bond acceptors were found with ALA191, HOH401, LEU188, and ALA189, and hydrogen bond donors with ALA189, HIS230, and GLU227 were also observed, along with a positively ionizable region. As shown in Supplementary Figure 4A-D, these ligands interacted most strongly with the amino acid residue HIS401 (H401).


Figure 4. Chemical structure formula and pharmacophore analysis of (A) 2OW0, (B) 2OW1, (C) 4H3X and (D) 4WZV. Chemical features of the co-crystal structures were analyzed for summarizing common features. Red arrows indicate hydrogen bond acceptors, green arrows indicate hydrogen bond donors and yellow spheres indicate hydrophobes. (E) Evaluation index of deep learning model.
[Figure 4 image not reproduced. Panels A-D: two-dimensional ligand structures with the interacting binding-site residues, water molecules, and zinc ions labeled (for example, LEU188, ALA189, HIS401, GLN402, and ZN444). Panel E: accuracy, precision, and AUC (y-axis, 0.00-1.00) plotted against training epoch (x-axis, 0-30).]

Virtual screening

We performed a prospective virtual screening (VS) of a database of natural-origin compounds and synthetic drugs, in which fitted values served as pharmacology-based screening criteria. After removing duplicates, we identified 230 small molecules sharing the same pharmacophore among 1,752,844 small molecules. We then built a deep learning model with MMP9 and 3479 small molecules and validated it. The accuracy, precision, and area under the curve (AUC) of the model gradually stabilized as the number of epochs increased, settling at around 0.9 (Figure 4E). Recall and F1 also stabilized at around 0.9, whereas loss and MCC stabilized at around 0.45 and 0.7, respectively (Supplementary Figure 4E-H). After screening with the machine learning model, 49 of the 230 small molecules (score = 1) were retained.
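The evaluation indices named above (accuracy, precision, recall, F1, and MCC) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `classification_metrics` and the example counts are hypothetical. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # MCC uses all four cells, so it remains informative for imbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical validation counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC penalizes both false positives and false negatives, it can sit well below accuracy and F1 for the same classifier, which is in line with the lower plateau reported for MCC.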

ADME and toxicity prediction

Pharmacokinetic analysis is an important step in identifying effective compounds during drug discovery, and pharmacokinetic properties play a key role in drug design (Supplementary Table 1). Water solubility predictions (defined in water at 25 °C) indicated that 33 compounds were soluble in water. In addition, 21 compounds showed good human intestinal absorption levels. Furthermore, 40 compounds were predicted to be highly bound to plasma proteins, whereas the rest were not. Cytochrome P450 2D6 (CYP2D6) is an important enzyme involved in drug metabolism, and all 49 compounds were predicted to be non-inhibitors of CYP2D6. Regarding hepatotoxicity, seven compounds were predicted to be nontoxic. CHEMBL82047 and CHEMBL381163 were predicted to have good water solubility, intestinal absorption, and plasma protein binding and to be non-inhibitors of CYP2D6 without hepatotoxicity (Supplementary Table 2). We also conducted a comprehensive assessment of the safety of these small molecules; the results showed that CHEMBL82047 and CHEMBL381163 are predicted to be non-mutagenic and to have lower Ames mutagenicity, rodent carcinogenicity, and developmental toxicity potential than the other compounds.
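The selection logic described above can be sketched as a simple filter over predicted ADME levels. The five rows below are transcribed from Supplementary Table 1 as read from its flattened form (one digit per column); the thresholds in `passes_filter` are illustrative assumptions rather than the cutoffs used by the authors, and the function and variable names are hypothetical.

```python
# Columns: compound, solubility level, BBB level, CYP2D6, hepatotoxicity,
# absorption level, PPB level (values as read from Supplementary Table 1).
ROWS = [
    ("CHEMBL344828", 3, 3, 0, 1, 0, 0),
    ("CHEMBL82047", 3, 3, 0, 0, 0, 1),
    ("CHEMBL381163", 3, 4, 0, 0, 1, 1),
    ("CHEMBL433171", 2, 4, 0, 0, 0, 1),
    ("CHEMBL234529", 3, 4, 0, 0, 3, 1),
]


def passes_filter(row, min_solubility=3, max_absorption=1):
    """Return True if a compound meets the assumed ADME criteria.

    Assumed thresholds (not stated in the source): solubility level >= 3,
    absorption level <= 1, predicted CYP2D6 non-inhibitor (0),
    predicted non-hepatotoxic (0), and highly plasma protein bound (1).
    """
    _, solubility, _bbb, cyp2d6, hepatotoxicity, absorption, ppb = row
    return (
        solubility >= min_solubility
        and absorption <= max_absorption
        and cyp2d6 == 0
        and hepatotoxicity == 0
        and ppb == 1
    )


selected = [row[0] for row in ROWS if passes_filter(row)]
print(selected)
```

Under these assumed thresholds, only CHEMBL82047 and CHEMBL381163 pass among the five example rows; different cutoffs would give a different shortlist.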

Protein molecular docking

To further study how the small molecules bind to the protein, we carried out molecular docking experiments (Figures 5, 6A, 6B; Supplementary Figure 5A-D). As shown in Table 1, CHEMBL82047 and CHEMBL381163 have higher binding affinity for the protein than the drugs JNJ0966 and MMP9-IN-1. Supplementary Figure 5E, 5F shows the π-dependent interactions and hydrogen bonds determined by the structural calculations. These calculations showed that CHEMBL82047 forms four pairs of hydrogen bonds and four pairs of π-related interactions with MMP9 residues. CHEMBL381163 forms four pairs of hydrogen bonds and seven pairs of π-related interactions with MMP9 residues (Tables 2 and 3).
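The comparison in Table 1 amounts to ranking compounds by docking energy. A minimal sketch follows, using the four values from Table 1 and assuming that a more negative CDOCKER potential energy corresponds to more favorable binding; the variable names are hypothetical.

```python
# CDOCKER potential energies as listed in Table 1.
cdocker_energy = {
    "CHEMBL82047": -12.164,
    "CHEMBL381163": -11.623,
    "JNJ0966": -6.629,
    "MMP9-IN-1": -8.618,
}
reference_drugs = ("JNJ0966", "MMP9-IN-1")

# Assumption: lower (more negative) energy means more favorable binding.
ranked = sorted(cdocker_energy, key=cdocker_energy.get)

# Candidates that score better than the best-scoring reference drug.
best_reference = min(cdocker_energy[name] for name in reference_drugs)
better_than_references = [
    name
    for name in ranked
    if name not in reference_drugs and cdocker_energy[name] < best_reference
]

print(ranked)
print(better_than_references)
```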

Molecular dynamics simulation

Molecular dynamics simulation is a method for simulating the physical motion trajectories and states of atoms and molecules based on Newtonian mechanics. We built a molecular dynamics simulation module to evaluate the stability of the small molecule-protein complexes under natural environmental conditions. Figure 6C, 6D shows the potential energy and RMSD plots for each complex. The trajectory of each complex reached equilibrium, and the potential energy and RMSD of the CHEMBL82047-MMP9 and CHEMBL381163-MMP9 complexes reached a steady state over time. This indicates that the complexes can exist stably in the natural environment.
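For readers unfamiliar with the RMSD measure, the sketch below computes the root-mean-square deviation between a trajectory frame and a reference structure. It assumes both coordinate sets list the same atoms in the same order and are already superimposed; no alignment step is performed, and the function name `rmsd` and the coordinates are illustrative only.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation between two equally sized sets of 3D coordinates."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, frame):
        squared_sum += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared_sum / len(reference))


# Hypothetical two-atom example: every atom is displaced by the vector (0, 3, 4).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 3.0, 4.0), (1.0, 3.0, 4.0)]
print(rmsd(reference, frame))
```

An RMSD trace that flattens over simulation time, as described for both complexes, means that successive frames stop drifting away from the reference structure.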

Discussion

Tumors are among the leading causes of death worldwide [2], and MMP9 is a reported cancer biomarker [6] that promotes tumor invasion and metastasis, greatly contributing to the occurrence and development of tumors [5, 6]. Although great progress has been made in the design and development of drugs targeting MMP9, these drugs have many shortcomings. This study systematically assessed the expression pattern and prognostic value of MMP9 in pan-cancer and screened for specific MMP9-targeting drugs.

We found that the expression level of MMP9 differed significantly between tumor samples and normal samples in most of the 26 cancers investigated. Higher MMP9 expression was associated with poorer survival and tumor progression in GBMLGG, KIPAN, UVM, LGG, ACC, and LIHC. These findings are consistent with previous reports; for example, elevated MMP9 expression in breast cancer has been identified as a predictor of shortened patient survival [31-33], and MMP9 also acts as a prognostic biomarker for thyroid cancer [34]. To further confirm the correlation between MMP9 expression and tumors, we performed immune infiltration analysis on the six abovementioned tumors with high MMP9 expression in TCGA. We found that MMP9 expression in patients with tumors was significantly correlated with the stromal score, immune score, and ESTIMATE score. We also examined the relationship between MMP9 expression and the infiltration of 22 immune cell subtypes and found that the level of immune cell infiltration was significantly correlated with MMP9 expression in most cancer types. This suggests that immune escape occurs in patients with tumors with high MMP9 expression and sheds light on the mechanism of MMP9 in tumors. For example, macrophages were significantly positively associated with all six tumors, and high macrophage infiltration promotes cancer initiation and malignant progression. During tumorigenesis, macrophages create a mutagenic and growth-promoting inflammatory environment; as tumors progress to malignancy, macrophages stimulate angiogenesis, enhance tumor cell migration and invasion, and suppress antitumor immunity [34]. Monocytes were negatively associated with the six tumors, and their ability to generate antitumor effectors and activate antigen-presenting cells was suppressed [24]. NK cells were also significantly inhibited in these six tumors, as was their ability to directly kill tumor cells and release soluble factors affecting innate and adaptive immune responses. In the TME, Tregs can be induced and differentiated from conventional T cells; they have strong immunosuppressive functions, inhibit antitumor immunity, and promote the occurrence and development of tumors, which also explains why Treg levels are positively correlated with these tumor types [35]. Activated CD4 memory T cells can suppress anticancer immunity, thereby hindering protective immune surveillance and effective antitumor immune responses in tumor hosts and promoting tumor development and progression. This finding is consistent with the results of a previous study, in which activated CD4 memory T cell expression was positively correlated with tumors [36].

Figure 5. Schematic drawing of interactions between ligands and MMP9. (A) Ligand interaction diagram of the CHEMBL82047-MMP9 complex. (B) Ligand interaction diagram of the CHEMBL381163-MMP9 complex. (C) CHEMBL82047-MMP9 complex. (D) Schematic of intermolecular interactions of the predicted binding modes of CHEMBL82047 with MMP9. (E) CHEMBL381163-MMP9 complex. (F) Schematic of intermolecular interactions of the predicted binding modes of CHEMBL381163 with MMP9. Interaction types shown in the panel legends: van der Waals, conventional hydrogen bond, carbon hydrogen bond, alkyl, Pi-alkyl, Pi-Pi stacked, Pi-sulfur, and Pi-sigma.

Figure 6. Schematic of intermolecular interactions of the predicted binding modes of (A) CHEMBL82047 with MMP9 and (B) CHEMBL381163 with MMP9. Results of molecular dynamics simulation of the CHEMBL82047-MMP9 and CHEMBL381163-MMP9 complexes. (C) Potential energy (kcal/mol) over time (ps). (D) Average backbone root-mean-square deviation (RMSD) over time (ps).

Table 1. CDOCKER potential energy of compounds

| Compound | CDOCKER potential energy |
| --- | --- |
| CHEMBL82047 | -12.164 |
| CHEMBL381163 | -11.623 |
| JNJ0966 | -6.629 |
| MMP9-IN-1 | -8.618 |

Exploring the mutational landscape of MMP9 in different cancers further, we found that UVM, KIRC, ACC, and LGG, four types of tumors with high MMP9 expression, had much higher mutation numbers and more mutation types than normal tissues. This is also consistent with MMP9 promoting tumorigenesis and tumor development. TMB reflects the number of cancer mutations, and a higher TMB generally indicates better immunotherapy outcomes. Mutations are processed as neoantigens and presented to T cells by major histocompatibility complex (MHC) proteins, and a higher TMB results in more neoantigens, increasing the chances of T-cell recognition and improving immunotherapy efficacy [37].

Table 2. Hydrogen bond interaction parameters for each compound with MMP9 residues

| Receptor | Compound | Donor Atom | Receptor Atom | Distances (Å) |
| --- | --- | --- | --- | --- |
| 2OW1 | CHEMBL82047 | LEU188:H | UNK900:O2 | 1.84939 |
| | | ALA189:H | UNK900:O2 | 2.5637 |
| | | GLN402:HE22 | UNK900:O5 | 2.04371 |
| | | UNK900:H29 | ALA189:O | 2.21685 |
| | CHEMBL381163 | LEU188:H | UNK900:O3 | 2.73564 |
| | | GLN402:HE22 | UNK900:O7 | 1.84541 |
| | | HIS411:HD1 | UNK900:O4 | 2.63353 |
| | | UNK900:H22 | ALA189:O | 2.04562 |
| | JNJ0966 | LEU188:H | UNK900:N3 | 2.7175 |
| | | UNK900:H1 | MET422:O | 2.01943 |
| | MMP9-IN-1 | GLN227:HE21 | UNK900:N2 | 2.6066 |
| | | UNK900:H11 | TYR245:O | 1.90625 |

Table 3. π-related interaction parameters for each compound with MMP9

| Receptor | Compound | Donor Atom | Receptor Atom | Distances (Å) |
| --- | --- | --- | --- | --- |
| 2OW1 | CHEMBL82047 | HIS401 | UNK900 | 4.2886 |
| | | UNK900:C15 | LEU187 | 4.82839 |
| | | UNK900 | LEU188 | 5.32252 |
| | | UNK900 | VAL398 | 4.8719 |
| | CHEMBL381163 | UNK900:H4 | HIS401 | 2.77631 |
| | | UNK900:C1 | LEU397 | 4.16409 |
| | | UNK900:C14 | LEU187 | 4.4138 |
| | | PHE110 | UNK900:C15 | 4.14676 |
| | | HIS411 | UNK900 | 5.04727 |
| | | TYR423 | UNK900:C1 | 4.81758 |
| | | UNK900 | LEU188 | 4.80899 |
| | JNJ0966 | HIS401 | UNK900 | 4.00463 |
| | | UNK900 | TYR423 | 5.5835 |
| | | ALA189 | UNK900:C9 | 3.89591 |
| | | UNK900:C9 | LEU188 | 4.59029 |
| | | UNK900:C9 | VAL398 | 4.42465 |
| | | UNK900 | LEU188 | 4.21123 |
| | | UNK900 | VAL398 | 4.71999 |
| | | UNK900 | LEU397 | 5.13481 |
| | | UNK900 | LEU418 | 5.36355 |
| | MMP9-IN-1 | UNK900 | LEU243 | 5.32463 |

Although MMP9 is highly expressed in most tumors and closely related to tumor metastasis, only a few drugs specifically target MMP9, and they have many limitations. JNJ0966 is a specific inhibitor of MMP9; it is reportedly involved in the progression and development of various diseases, and it can regulate a series of physiological response processes in the body by regulating the expression of MMP9. However, as mentioned above, JNJ0966 is currently only used in scientific research [12]. Similarly, MMP9-IN-1, a specific MMP9-targeting drug, has not entered large-scale clinical use because of several drawbacks, such as respiratory system inhibition [7, 11]. Although the mechanism of action of MMP9 in tumor progression is relatively clear, the application of existing drugs is unsatisfactory. Therefore, it is necessary to use various cell biology experiments and other methods to screen for and develop new drugs targeting MMP9.

We virtually screened 1,752,844 small-molecule compounds in a database of natural-source compounds and synthetic drugs. By constructing a pharmacophore model, we identified 230 small-molecule compounds sharing the same pharmacophore. We then used a machine learning model to narrow these to 49 small-molecule compounds with high predicted binding affinity for MMP9 and pooled them for further study.

The ADME and toxicity prediction results indicated that CHEMBL82047 and CHEMBL381163 had good water solubility, absorption levels, and plasma protein binding properties, with no hepatotoxicity and low Ames mutagenicity, rodent carcinogenicity, and developmental toxicity, indicating their potential as ideal compounds. We then performed docking analysis, and the results showed that CHEMBL82047 and CHEMBL381163 had higher binding affinity for MMP9 than JNJ0966 and MMP9-IN-1. Because these two compounds form more chemical bonds with MMP9 than JNJ0966 and MMP9-IN-1, they have a higher interaction force and more stable binding with MMP9, which may enhance their inhibition of MMP9, thereby improving the tumor-killing effect. Finally, we conducted a molecular dynamics simulation, and the results showed that the potential energy and RMSD of these complexes reached a steady state over time, indicating that the two complexes remain stable in natural environments.

In conclusion, MMP9 is highly expressed in most cancers. Higher MMP9 expression in GBMLGG, KIPAN, UVM, LGG, ACC, and LIHC is associated with poorer survival and tumor progression. In these six tumor types, higher MMP9 expression is also associated with increased infiltration of immune cells, such as macrophages and regulatory T cells, and with more RNA modifications. In UVM, LGG, ACC, and LIHC, higher MMP9 expression indicates that the tumor has a higher TMB. A total of 49 candidate inhibitors of MMP9 were identified with a ligand-based pharmacophore model and a machine learning model. CHEMBL82047 and CHEMBL381163 have good water solubility, absorption levels, and plasma protein binding properties. They also have low Ames mutagenicity, rodent carcinogenicity, and developmental toxicity potential, with no hepatotoxicity. These molecules have a high binding affinity for MMP9 and are stable in the natural environment. Therefore, CHEMBL82047 and CHEMBL381163 show potential as MMP9-inhibiting drugs.

Acknowledgements

This work was supported by the Natural Science Foundation of Shaanxi Province (2022JQ-299) and the Research Program of Xi’an Health Commission (2021yb26).

Disclosure of conflict of interest

None.

Address correspondence to: Drs. Ming Li and Bo Wu, Lower Extremity Division, Orthopedic Trauma Department, Honghui Hospital, Xi’an Jiaotong University, Youyi East Road No. 555, Beilin District, Xi’an, Shaanxi, China. E-mail: limingguke123@163.com (ML); bocai527@163.com (BW)

References

[1] Hossain SMM, Khatun L, Ray S and Mukhopadhyay A. Pan-cancer classification by regularized multi-task learning. Sci Rep 2021; 11: 24252.

[2] Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J and Jemal A. Global cancer statistics, 2012. CA Cancer J Clin 2015; 65: 87-108.

[3] Hu J, Xu J, Feng X, Li Y, Hua F and Xu G. Differential expression of the TLR4 gene in pan-cancer and its related mechanism. Front Cell Dev Biol 2021; 9: 700661.

[4] Roche L, Danieli C, Belot A, Grosclaude P, Bouvier AM, Velten M, Iwaz J, Remontet L and Bossard N. Cancer net survival on registry data: use of the new unbiased Pohar-Perme estimator and magnitude of the bias with the classical methods. Int J Cancer 2013; 132: 2359-2369.


[5] Liu N, Wang X, Wu H, Lv X, Xie H, Guo Z, Wang J, Dou G, Zhang C and Sun M. Computational study of effective matrix metalloproteinase 9 (MMP9) targeting natural inhibitors. Aging (Albany NY) 2021; 13: 22867-22882.

[6] Huang H. Matrix metalloproteinase-9 (MMP-9) as a cancer biomarker and MMP-9 biosensors: recent advances. Sensors (Basel) 2018; 18: 3249.

[7] Liu Z, Li L, Yang Z, Luo W, Li X, Yang H, Yao K, Wu B and Fang W. Increased expression of MMP9 is correlated with poor prognosis of nasopharyngeal carcinoma. BMC Cancer 2010; 10: 270.

[8] Owyong M, Chou J, van den Bijgaart RJ, Kong N, Efe G, Maynard C, Talmi-Frank D, Solomonov I, Koopman C, Hadler-Olsen E, Headley M, Lin C, Wang CY, Sagi I, Werb Z and Plaks V. MMP9 modulates the metastatic cascade and immune landscape for breast cancer anti-metastatic therapy. Life Sci Alliance 2019; 2: e201800226.

[9] Tamura Y, Watanabe F, Nakatani T, Yasui K, Fuji M, Komurasaki T, Tsuzuki H, Maekawa R, Yoshioka T, Kawada K, Sugita K and Ohtani M. Highly selective and orally active inhibitors of type IV collagenase (MMP-9 and MMP-2): N-sulfonylamino acid derivatives. J Med Chem 1998; 41: 640-649.

[10] Dufour A, Sampson NS, Li J, Kuscu C, Rizzo RC, Deleon JL, Zhi J, Jaber N, Liu E, Zucker S and Cao J. Small-molecule anticancer compounds selectively target the hemopexin domain of matrix metalloproteinase-9. Cancer Res 2011; 71: 4977-4988.

[11] Song Z, Wang J, Su Q, Luan M, Chen X and Xu X. The role of MMP-2 and MMP-9 in the metastasis and development of hypopharyngeal carcinoma. Braz J Otorhinolaryngol 2021; 87: 521-528.

[12] Scannevin RH, Alexander R, Haarlander TM, Burke SL, Singer M, Huo C, Zhang YM, Maguire D, Spurlino J, Deckman I, Carroll KI, Lewandowski F, Devine E, Dzordzorme K, Tounge B, Milligan C, Bayoumy S, Williams R, Schalk-Hihi C, Leonard K, Jackson P, Todd M, Kuo LC and Rhodes KJ. Discovery of a highly selective chemical inhibitor of matrix metalloproteinase-9 (MMP-9) that allosterically inhibits zymogen activation. J Biol Chem 2017; 292: 17963-17974.

[13] Ravichandran S, Singh N, Donnelly D, Migliore M, Johnson P, Fishwick C, Luke BT, Martin B, Maudsley S, Fugmann SD and Moaddel R. Pharmacophore model of the quercetin binding site of the SIRT6 protein. J Mol Graph Model 2014; 49: 38-46.

[14] Vaidyanathan J, Vaidyanathan TK and Ravichandran S. Computer simulated screening of dentin bonding primer monomers through analysis of their chemical functions and their spatial 3D alignment. J Biomed Mater Res B Appl Biomater 2009; 88: 447-457.

[15] Pascual R, Almansa C, Plata-Salamán C and Vela JM. A new pharmacophore model for the design of sigma-1 ligands validated on a large experimental dataset. Front Pharmacol 2019; 10: 519.

[16] Lindvall M, McBride C, McKenna M, Gesner TG, Yabannavar A, Wong K, Lin S, Walter A and Shafer CM. 3D pharmacophore model-assisted discovery of novel CDC7 inhibitors. ACS Med Chem Lett 2011; 2: 720-723.

[17] Auslander N, Gussow AB and Koonin EV. Incorporating machine learning into established bioinformatics frameworks. Int J Mol Sci 2021; 22: 2903.

[18] Taniguchi H, Sato H and Shirakawa T. A machine learning model with human cognitive biases capable of learning from small and biased datasets. Sci Rep 2018; 8: 7397.

[19] Deo RC. Machine learning in medicine. Circulation 2015; 132: 1920-1930.

[20] Fu T, Dai LJ, Wu SY, Xiao Y, Ma D, Jiang YZ and Shao ZM. Spatial architecture of the immune microenvironment orchestrates tumor immunity and therapeutic response. J Hematol Oncol 2021; 14: 98.

[21] Binnewies M, Roberts EW, Kersten K, Chan V, Fearon DF, Merad M, Coussens LM, Gabrilovich DI, Ostrand-Rosenberg S, Hedrick CC, Vonderheide RH, Pittet MJ, Jain RK, Zou W, Howcroft TK, Woodhouse EC, Weinberg RA and Krummel MF. Understanding the tumor immune microenvironment (TIME) for effective therapy. Nat Med 2018; 24: 541-550.

[22] Siveen KS and Kuttan G. Role of macrophages in tumour progression. Immunol Lett 2009; 123: 97-102.

[23] Lewis CE and Pollard JW. Distinct role of macrophages in different tumor microenvironments. Cancer Res 2006; 66: 605-612.

[24] Ugel S, Canè S, De Sanctis F and Bronte V. Monocytes in the tumor microenvironment. Annu Rev Pathol 2021; 16: 93-122.

[25] Bazewicz CG, Dinavahi SS, Schell TD and Robertson GP. Aldehyde dehydrogenase in regulatory T-cell development, immunity and cancer. Immunology 2019; 156: 47-55.

[26] Jonkhout N, Tran J, Smith MA, Schonrock N, Mattick JS and Novoa EM. The RNA modification landscape in human disease. RNA 2017; 23: 1754-1769.

[27] Gao L, Chen R, Sugimoto M, Mizuta M, Kishimoto Y and Omori K. The impact of m1A methylation modification patterns on tumor immune microenvironment and prognosis in oral squamous cell carcinoma. Int J Mol Sci 2021; 22: 10302.


[28] Chen Z, Qi M, Shen B, Luo G, Wu Y, Li J, Lu Z, Zheng Z, Dai Q and Wang H. Transfer RNA demethylase ALKBH3 promotes cancer progression via induction of tRNA-derived small RNAs. Nucleic Acids Res 2019; 47: 2533-2545.

[29] Wang Y, Wang J, Li X, Xiong X, Wang J, Zhou Z, Zhu X, Gu Y, Dominissini D, He L, Tian Y, Yi C and Fan Z. N(1)-methyladenosine methylation in tRNA drives liver tumourigenesis by regulating cholesterol metabolism. Nat Commun 2021; 12: 6314.

[30] Lai SC, Su YT, Chi CC, Kuo YC, Lee KF, Wu YC, Lan PC, Yang MH, Chang TS and Huang YH. DNMT3b/OCT4 expression confers sorafenib resistance and poor prognosis of hepatocellular carcinoma through IL-6/STAT3 regulation. J Exp Clin Cancer Res 2019; 38: 474.

[31] Joseph C, Alsaleem M, Orah N, Narasimha PL, Miligy IM, Kurozumi S, Ellis IO, Mongan NP, Green AR and Rakha EA. Elevated MMP9 expression in breast cancer is a predictor of shorter patient survival. Breast Cancer Res Treat 2020; 182: 267-282.

[32] Xue Q, Cao L, Chen XY, Zhao J, Gao L, Li SZ and Fei Z. High expression of MMP9 in glioma affects cell proliferation and is associated with patient survival rates. Oncol Lett 2017; 13: 1325-1330.

[33] Niu H, Li F, Wang Q, Ye Z, Chen Q and Lin Y. High expression level of MMP9 is associated with poor prognosis in patients with clear cell renal carcinoma. PeerJ 2018; 6: e5050.

[34] Zarkesh M, Zadeh-Vakili A, Akbarzadeh M, Fanaei SA, Hedayati M and Azizi F. The role of matrix metalloproteinase-9 as a prognostic biomarker in papillary thyroid cancer. BMC Cancer 2018; 18: 1199.

[35] Li C, Jiang P, Wei S, Xu X and Wang J. Regulatory T cells in tumor microenvironment: new mechanisms, potential therapeutic strategies and future prospects. Mol Cancer 2020; 19: 116.

[36] Togashi Y, Shitara K and Nishikawa H. Regulatory T cells in cancer immunosuppression - implications for anticancer therapy. Nat Rev Clin Oncol 2019; 16: 356-371.

[37] Jardim DL, Goodman A, de Melo Gagliato D and Kurzrock R. The challenges of tumor mutational burden as an immunotherapy biomarker. Cancer Cell 2021; 39: 154-173.


Supplementary Figure 1. (A) MMP9 expression correlates with overall survival time (OS). Forest plots showing the correlations between OS and MMP9 expression across 39 types of cancers. (B, C) Survival curves of MMP9 expression in LGG, ACC, KIRC, LIHC, BLCA and TGCT. L represents the low MMP9 expression group, and H represents the high MMP9 expression group. (D) Forest plots showing the correlations between progression-free survival time (PFS) and MMP9 expression across 39 types of cancers.

(A) Overall survival. Forest plot x-axis: Log2(Hazard Ratio (95% CI)).

| CancerCode | p value | Hazard Ratio (95% CI) |
| --- | --- | --- |
| TCGA-GBMLGG (N=619) | 1.4e-41 | 1.28 (1.23, 1.33) |
| TCGA-KIPAN (N=855) | 4.5e-6 | 1.15 (1.08, 1.22) |
| TCGA-UVM (N=74) | 7.4e-6 | 1.63 (1.32, 2.02) |
| TCGA-LGG (N=474) | 2.8e-5 | 1.14 (1.07, 1.22) |
| TCGA-ACC (N=77) | 4.0e-4 | 1.38 (1.15, 1.66) |
| TCGA-KIRC (N=515) | 4.7e-4 | 1.14 (1.06, 1.22) |
| TCGA-LIHC (N=341) | 5.9e-3 | 1.11 (1.03, 1.20) |
| TCGA-BLCA (N=398) | 7.5e-3 | 1.08 (1.02, 1.15) |
| TCGA-TGCT (N=128) | 0.03 | 2.98 (1.04, 8.51) |
| TCGA-PAAD (N=172) | 0.07 | 1.11 (0.99, 1.24) |
| TCGA-GBM (N=144) | 0.11 | 1.08 (0.98, 1.18) |
| TCGA-KICH (N=64) | 0.13 | 1.37 (0.92, 2.06) |
| TCGA-LUAD (N=490) | 0.27 | 1.05 (0.96, 1.14) |
| TCGA-SARC (N=254) | 0.30 | 1.04 (0.97, 1.11) |
| TCGA-THCA (N=501) | 0.32 | 1.14 (0.88, 1.47) |
| TCGA-UCS (N=55) | 0.49 | 1.06 (0.90, 1.24) |
| TCGA-HNSC (N=509) | 0.52 | 1.03 (0.95, 1.11) |
| TCGA-LUSC (N=468) | 0.56 | 1.03 (0.94, 1.12) |
| TCGA-MESO (N=84) | 0.58 | 1.03 (0.93, 1.14) |
| TCGA-READ (N=90) | 0.60 | 1.09 (0.79, 1.51) |
| TCGA-KIRP (N=276) | 0.64 | 1.03 (0.91, 1.16) |
| TCGA-SKCM-P (N=97) | 0.72 | 1.04 (0.85, 1.27) |
| TCGA-COADREAD (N=368) | 0.74 | 1.02 (0.91, 1.14) |
| TCGA-COAD (N=278) | 0.81 | 1.02 (0.90, 1.15) |
| TCGA-LAML (N=144) | 0.86 | 1.01 (0.95, 1.07) |
| TCGA-SKCM (N=444) | 0.02 | 0.94 (0.89, 0.99) |
| TCGA-SKCM-M (N=347) | 0.04 | 0.94 (0.89, 1.00) |
| TCGA-OV (N=407) | 0.07 | 0.95 (0.89, 1.01) |
| TCGA-DLBC (N=44) | 0.18 | 0.80 (0.57, 1.12) |
| TCGA-UCEC (N=166) | 0.23 | 0.90 (0.77, 1.07) |
| TCGA-CHOL (N=33) | 0.40 | 0.91 (0.74, 1.13) |
| TCGA-BRCA (N=1044) | 0.41 | 0.97 (0.89, 1.05) |
| TCGA-STES (N=547) | 0.42 | 0.97 (0.90, 1.05) |
| TCGA-STAD (N=372) | 0.52 | 0.97 (0.88, 1.07) |
| TCGA-THYM (N=117) | 0.64 | 0.89 (0.55, 1.45) |
| TCGA-PRAD (N=492) | 0.68 | 0.92 (0.63, 1.34) |
| TCGA-PCPG (N=170) | 0.87 | 0.97 (0.68, 1.39) |
| TCGA-ESCA (N=175) | 0.93 | 0.99 (0.88, 1.13) |
| TCGA-CESC (N=273) | 0.99 | 1.00 (0.88, 1.14) |

(B, C) Kaplan-Meier curves (y-axis: survival probability; x-axis: overall survival; numbers at risk shown for the L and H groups).

| Cancer | p value | HR (95% CI) |
| --- | --- | --- |
| LGG | 2.9e-4 | 1.96 (1.35, 2.83) |
| ACC | 1.0e-3 | 3.61 (1.60, 8.14) |
| KIRC | 1.8e-3 | 1.61 (1.19, 2.18) |
| LIHC | 2.4e-4 | 1.96 (1.36, 2.83) |
| BLCA | 7.1e-3 | 1.56 (1.13, 2.16) |
| TGCT | 2.8e-3 | 2.98 (1.04, 8.51) |

(D) Progression-free survival. Forest plot x-axis: Log2(Hazard Ratio (95% CI)).

| CancerCode | p value | Hazard Ratio (95% CI) |
| --- | --- | --- |
| TCGA-GBMLGG (N=616) | 5.7e-32 | 1.22 (1.18, 1.27) |
| TCGA-KIPAN (N=845) | 2.4e-8 | 1.18 (1.12, 1.25) |
| TCGA-KIRC (N=508) | 3.6e-7 | 1.21 (1.12, 1.30) |
| TCGA-UVM (N=73) | 3.5e-4 | 1.41 (1.17, 1.70) |
| TCGA-LGG (N=472) | 2.4e-3 | 1.09 (1.03, 1.15) |
| TCGA-ACC (N=76) | 6.1e-3 | 1.25 (1.06, 1.47) |
| TCGA-THCA (N=499) | 6.7e-3 | 1.23 (1.06, 1.43) |
| TCGA-GBM (N=143) | 0.02 | 1.13 (1.02, 1.24) |
| TCGA-KICH (N=64) | 0.03 | 1.43 (1.04, 1.97) |
| TCGA-SARC (N=250) | 0.07 | 1.05 (0.99, 1.12) |
| TCGA-BLCA (N=397) | 0.07 | 1.05 (1.00, 1.12) |
| TCGA-PCPG (N=168) | 0.10 | 1.18 (0.97, 1.44) |
| TCGA-PAAD (N=171) | 0.12 | 1.09 (0.98, 1.21) |
| TCGA-UCS (N=55) | 0.28 | 1.09 (0.93, 1.27) |
| TCGA-LUSC (N=467) | 0.29 | 1.06 (0.95, 1.17) |
| TCGA-BRCA (N=1043) | 0.35 | 1.04 (0.96, 1.14) |
| TCGA-KIRP (N=273) | 0.55 | 1.03 (0.93, 1.15) |
| TCGA-PRAD (N=492) | 0.56 | 1.04 (0.92, 1.17) |
| TCGA-READ (N=88) | 0.57 | 1.09 (0.80, 1.49) |
| TCGA-LUAD (N=486) | 0.60 | 1.02 (0.95, 1.10) |
| TCGA-ESCA (N=173) | 0.61 | 1.03 (0.92, 1.16) |
| TCGA-MESO (N=82) | 0.61 | 1.03 (0.91, 1.17) |
| TCGA-TGCT (N=126) | 0.62 | 1.05 (0.86, 1.29) |
| TCGA-LIHC (N=340) | 0.72 | 1.01 (0.95, 1.08) |
| TCGA-DLBC (N=43) | 7.0e-4 | 0.69 (0.54, 0.88) |
| TCGA-OV (N=407) | 0.02 | 0.94 (0.89, 0.99) |
| TCGA-STAD (N=375) | 0.10 | 0.92 (0.83, 1.01) |
| TCGA-SKCM (N=434) | 0.11 | 0.96 (0.92, 1.01) |
| TCGA-CESC (N=273) | 0.17 | 0.92 (0.81, 1.04) |
| TCGA-UCEC (N=166) | 0.18 | 0.91 (0.80, 1.04) |
| TCGA-CHOL (N=33) | 0.20 | 0.88 (0.72, 1.07) |
| TCGA-STES (N=548) | 0.21 | 0.95 (0.88, 1.03) |
| TCGA-SKCM-M (N=338) | 0.23 | 0.97 (0.92, 1.02) |
| TCGA-HNSC (N=508) | 0.31 | 0.96 (0.89, 1.04) |
| TCGA-SKCM-P (N=96) | 0.33 | 0.91 (0.76, 1.10) |
| TCGA-COAD (N=275) | 0.39 | 0.95 (0.85, 1.07) |
| TCGA-COADREAD (N=363) | 0.66 | 0.98 (0.88, 1.09) |
| TCGA-THYM (N=117) | 0.69 | 0.94 (0.71, 1.25) |


Supplementary Figure 2. Pan-cancer cohort (GBMLGG, KICH, KIRC, KIRP, KIPAN and UVM). Correlation between MMP9 expression and pan-cancer (A) ESTIMATE score and (B) stromal score. Panels plot ESTIMATEScore (A) or StromalScore (B) against MMP9 expression for TCGA-GBMLGG (N=656), TCGA-LGG (N=504), TCGA-KIPAN (N=878), TCGA-UVM (N=79), TCGA-KIRC (N=528), and TCGA-ACC (N=77), each annotated with a correlation coefficient (r) and p value.


Supplementary Figure 3. Correlation between MMP9 expression and tumor mutation burden in (A) LGG, (B) KIRC, (C) ACC and (D) UVM. Each panel shows the mutation count (MutCount), the sample group (LowExp or HighExp), and per-gene mutation frequencies with p values. The legend distinguishes missense, nonsense, nonstop, frameshift insertion, frameshift deletion, in-frame insertion, in-frame deletion, splice site, and translation start site mutations.


Supplementary Figure 4. The interaction between MMP9 and MMP9's inhibitors (A) 20WO (B) 20W1 (C) 4H3X and (D) 4WZV. (E-H) Evaluation index of deep learning model. Loss, Recall, MCC and F1.

[Figure: panels A-D show the MMP9 binding-site residues surrounding each co-crystallized inhibitor; panels E-H plot Loss, Recall, MCC, and F1 against Epoch (0 to 30).]
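The recall, MCC, and F1 curves in panels F-H are standard binary-classification metrics. As a minimal sketch of how they are defined, the snippet below computes them from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the study; the authors' training code is not shown in this document.

```python
import math


def recall(tp: int, fn: int) -> float:
    """Fraction of true positives that were recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts for one epoch (illustration only).
tp, tn, fp, fn = 40, 45, 5, 10
print(f"recall={recall(tp, fn):.3f}")
print(f"f1={f1_score(tp, fp, fn):.3f}")
print(f"mcc={mcc(tp, tn, fp, fn):.3f}")
```

Unlike recall and F1, MCC also uses the true-negative count, which makes it more informative when active and inactive compounds are imbalanced.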

Supplementary Table 1. Absorption, distribution, metabolism, and excretion properties of compounds
| Compound | PubChem CID | Solubility level | BBB level | CYP2D6 | Hepatotoxicity | Absorption level | PPB level |
|---|---|---|---|---|---|---|---|
| CHEMBL344828 | 10764489 | 3 | 3 | 0 | 1 | 0 | 0 |
| CHEMBL2425940 | 73293197 | 3 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL139884 | 10502046 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL381554 | 44409390 | 3 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL2425944 | 73293200 | 4 | 4 | 0 | 1 | 3 | 0 |
| CHEMBL82047 | 10738924 | 3 | 3 | 0 | 0 | 0 | 1 |
| CHEMBL196647 | 44402021 | 4 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL381163 | 44409365 | 3 | 4 | 0 | 0 | 1 | 1 |
| CHEMBL206481 | 44409389 | 3 | 4 | 0 | 1 | 2 | 0 |
| CHEMBL207776 | 21304710 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL138643 | 23523890 | 2 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL382227 | 44411830 | 2 | 4 | 0 | 1 | 1 | 1 |
| CHEMBL419503 | 44325156 | 4 | 4 | 0 | 1 | 3 | 0 |
| CHEMBL252711 | 44445823 | 4 | 4 | 0 | 1 | 3 | 0 |
| CHEMBL433171 | 21130561 | 2 | 4 | 0 | 0 | 0 | 1 |
| CHEMBL1801052 | 9847113 | 2 | 4 | 0 | 1 | 1 | 1 |
| CHEMBL234529 | 25181080 | 3 | 4 | 0 | 0 | 3 | 1 |
| CHEMBL126004 | 10389610 | 3 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL236167 | 23655323 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL429800 | 23656291 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL358812 | 10549612 | 4 | 4 | 0 | 1 | 2 | 0 |
| CHEMBL1801395 | 22707860 | 2 | 4 | 0 | 1 | 1 | 1 |
| CHEMBL1916211 | 57403331 | 2 | 2 | 0 | 1 | 0 | 1 |
| CHEMBL1770697 | 20620715 | 2 | 4 | 0 | 1 | 2 | 1 |
| CHEMBL47728 | 44291532 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL303082 | 44306344 | 2 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL71227 | 44309863 | 2 | 4 | 0 | 1 | 1 | 1 |
| CHEMBL1770712 | 20620688 | 3 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL164980 | 11070343 | 3 | 4 | 0 | 1 | 1 | 0 |
| CHEMBL44045 | 44289352 | 3 | 3 | 0 | 0 | 0 | 1 |
| CHEMBL362797 | 22644895 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL561625 | 45269631 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL35606 | | 3 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL2425935 | 73292710 | 3 | 4 | 0 | 1 | 1 | 0 |
| CHEMBL2204827 | 71459505 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL369302 | 22644965 | 3 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL1771223 | 54587429 | 2 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL92778 | 9913479 | 2 | 4 | 0 | 1 | 1 | 1 |
| CHEMBL292671 | 44299758 | 3 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL1771216 | 20620240 | 3 | 4 | 0 | 0 | 3 | 1 |
| CHEMBL1801431 | 10280852; 46939559 | 2 | 4 | 0 | 1 | 2 | 1 |
| CHEMBL381505 | 44409164 | 3 | 4 | 0 | 1 | 2 | 1 |
| CHEMBL1771222 | 54580544 | 2 | 4 | 0 | 1 | 2 | 1 |
| CHEMBL1771215 | 10483139 | 2 | 4 | 0 | 1 | 3 | 1 |
| CHEMBL1771221 | 54583511 | 2 | 4 | 0 | 1 | 2 | 1 |
| CHEMBL42771 | 44289604 | 3 | 3 | 0 | 1 | 0 | 1 |
| CHEMBL1801398 | 46938727 | 2 | 4 | 0 | 1 | 0 | 1 |
| CHEMBL44250 | 10572544 | 3 | 3 | 0 | 0 | 0 | 1 |
| CHEMBL338007 | 10789711 | 4 | 4 | 0 | 1 | 1 | 0 |

Aqueous solubility level: 0 (extremely low); 1 (very low, but possible); 2 (low); 3 (good). Blood-brain barrier (BBB) level: 0 (very high penetrant); 1 (high); 2 (medium); 3 (low); 4 (undefined). Cytochrome P450 2D6 (CYP2D6): 0 (non-inhibitor); 1 (inhibitor). Hepatotoxicity: 0 (nontoxic); 1 (toxic). Human intestinal absorption level: 0 (good); 1 (moderate); 2 (poor); 3 (very poor). Plasma protein binding (PPB) level: 0 (weak binding); 1 (strong binding).
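To make the level codes easier to read, the following minimal sketch maps one table row to the labels given in the legend. The dictionary layout and the `decode` helper are hypothetical conveniences, not part of Discovery Studio or of the authors' workflow. Level codes that the legend does not define are reported as such rather than guessed.

```python
# Level legends transcribed from the footnote of Supplementary Table 1.
LEGENDS = {
    "solubility": {0: "extremely low", 1: "very low, but possible", 2: "low", 3: "good"},
    "bbb": {0: "very high penetrant", 1: "high", 2: "medium", 3: "low", 4: "undefined"},
    "cyp2d6": {0: "non-inhibitor", 1: "inhibitor"},
    "hepatotoxicity": {0: "nontoxic", 1: "toxic"},
    "absorption": {0: "good", 1: "moderate", 2: "poor", 3: "very poor"},
    "ppb": {0: "weak", 1: "strong"},
}
FIELDS = tuple(LEGENDS)  # same order as the table columns


def decode(levels):
    """Map six integer levels (in table column order) to legend labels."""
    if len(levels) != len(FIELDS):
        raise ValueError("expected six levels, one per table column")
    return {
        field: LEGENDS[field].get(level, "not listed in legend")
        for field, level in zip(FIELDS, levels)
    }


# Rows for the two compounds highlighted in the abstract, as printed in the table.
rows = {
    "CHEMBL82047": (3, 3, 0, 0, 0, 1),
    "CHEMBL381163": (3, 4, 0, 0, 1, 1),
}
for name, levels in rows.items():
    print(name, decode(levels))
```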

Supplementary Table 2. Toxicities of compounds
| Compound | PubChem CID | Mouse NTP (female) | Mouse NTP (male) | Rat NTP (female) | Rat NTP (male) | AMES | DTP |
|---|---|---|---|---|---|---|---|
| CHEMBL344828 | 10764489 | 0 | 0.68 | 0.809 | 0.383 | 1 | 0.978 |
| CHEMBL2425940 | 73293197 | 0 | 0 | 1 | 1 | 1 | 0 |
| CHEMBL139884 | 10502046 | 0 | 0.016 | 0 | 0.701 | 0.396 | 0.999 |
| CHEMBL381554 | 44409390 | 0 | 0 | 0 | 0 | 0.844 | 1 |
| CHEMBL2425944 | 73293200 | 0 | 0.109 | 1 | 1 | 1 | 0.001 |
| CHEMBL82047 | 10738924 | 0 | 0.964 | 0.439 | 1 | 0.99 | 0.11 |
| CHEMBL196647 | 44402021 | 0.59 | 0 | 1 | 1 | 1 | 1 |
| CHEMBL381163 | 44409365 | 0 | 0 | 0.062 | 0.023 | 0.964 | 1 |
| CHEMBL206481 | 44409389 | 0 | 0 | 0 | 0 | 0.623 | 1 |
| CHEMBL207776 | 21304710 | 0 | 0.003 | 0 | 0.898 | 0.498 | 0.946 |
| CHEMBL138643 | 23523890 | 0 | 1 | 1 | 1 | 1 | 1 |
| CHEMBL382227 | 44411830 | 0.032 | 0.05 | 1 | 1 | 0 | 1 |
| CHEMBL419503 | 44325156 | 0 | 0.004 | 1 | 1 | 0.995 | 1 |
| CHEMBL252711 | 44445823 | 0 | 0 | 1 | 1 | 1 | 1 |
| CHEMBL433171 | 21130561 | 0.898 | 0.005 | 0.953 | 1 | 1 | 1 |
| CHEMBL1801052 | 9847113 | 0.076 | 0.352 | 1 | 1 | 0.424 | 1 |
| CHEMBL234529 | 25181080 | 0.985 | 0.028 | 1 | 0 | 1 | 0.946 |
| CHEMBL126004 | 10389610 | 0 | 0.891 | 1 | 1 | 0.137 | 0.002 |
| CHEMBL236167 | 23655323 | 0.003 | 0.012 | 1 | 1 | 1 | 0.434 |
| CHEMBL429800 | 23656291 | 0.003 | 0.012 | 1 | 1 | 1 | 0.434 |
| CHEMBL358812 | 10549612 | 0 | 0.006 | 1 | 1 | 0.639 | 0.003 |
| CHEMBL1801395 | 22707860 | 0 | 0.002 | 0.999 | 1 | 0.948 | 1 |
| CHEMBL1916211 | 57403331 | 0 | 0 | 0 | 1 | 1 | 0.98 |
| CHEMBL1770697 | 20620715 | 0.006 | 0.821 | 1 | 1 | 0 | 1 |
| CHEMBL47728 | 44291532 | 0 | 0 | 0 | 0 | 0.777 | 0 |
| CHEMBL303082 | 44306344 | 1 | 0.636 | 1 | 1 | 0.999 | 0.521 |
| CHEMBL71227 | 44309863 | 0 | 0.001 | 0.993 | 1 | 0.983 | 1 |
| CHEMBL1770712 | 20620688 | 0.018 | 0.998 | 1 | 1 | 0.92 | 1 |
| CHEMBL164980 | 11070343 | 0 | 1 | 0 | 1 | 1 | 0.273 |
| CHEMBL44045 | 44289352 | 0 | 0.752 | 0.999 | 0.999 | 0 | 1 |
| CHEMBL362797 | 22644895 | 0 | 0.001 | 0 | 0 | 0.071 | 0.649 |
| CHEMBL561625 | 45269631 | 0 | 0.009 | 0.001 | 0.984 | 1 | 0 |
| CHEMBL35606 | | 0 | 0.002 | 0 | 1 | 0.968 | 0 |
| CHEMBL2425935 | 73292710 | | | | | | |
| CHEMBL2204827 | 71459505 | 0.025 | 0.364 | 0.998 | 1 | 1 | 0.829 |
| CHEMBL369302 | 22644965 | 1 | 0.63 | 1 | 1 | 1 | 0.002 |
| CHEMBL1771223 | 54587429 | 0 | 0.227 | 1 | 1 | 0.374 | 0 |
| CHEMBL92778 | 9913479 | 1 | 0.005 | 1 | 1 | 1 | 1 |
| CHEMBL292671 | 44299758 | 0.987 | 1 | 0.903 | 1 | 1 | 0.002 |
| CHEMBL1771216 | 20620240 | 0 | 0.997 | 1 | 1 | 0.689 | 0 |
| CHEMBL1801431 | 10280852; 46939559 | 0 | 0.001 | 0.992 | 1 | 0.588 | 1 |
| CHEMBL381505 | 44409164 | 0 | 0.17 | 0 | 0.712 | 0.024 | 0.998 |
| CHEMBL1771222 | 54580544 | 0 | 0.776 | 1 | 1 | 1 | 0.813 |
| CHEMBL1771215 | 10483139 | 0 | 0.24 | 1 | 1 | 0.334 | 0 |
| CHEMBL1771221 | 54583511 | 0 | 0.187 | 1 | 1 | 1 | 0.996 |
| CHEMBL42771 | 44289604 | 0 | 0.875 | 1 | 0.899 | 0.001 | 1 |
| CHEMBL1801398 | 46938727 | 0 | 0.024 | 0.966 | 1 | 0 | 1 |
| CHEMBL44250 | 10572544 | 0.001 | 0.611 | 1 | 1 | 0.002 | 1 |
| CHEMBL338007 | 10789711 | 0 | 0.998 | 1 | 1 | 1 | 1 |

NTP (rodent carcinogenicity): < 0.3 (non-carcinogen); > 0.7 (carcinogen). AMES (mutagenicity): < 0.3 (non-mutagen); > 0.7 (mutagen). DTP (developmental toxicity potential): < 0.3 (nontoxic); > 0.7 (toxic).
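The cutoffs above amount to a simple three-way rule, sketched below. The `classify` helper and its labels are hypothetical. The footnote does not say how scores between 0.3 and 0.7 are treated, so the sketch reports them as "unclassified" as an explicit assumption. The example scores are illustrative and are not rows from the table.

```python
def classify(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Apply the footnote cutoffs to one predicted probability."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score < low:
        return "negative"  # non-carcinogen / non-mutagen / nontoxic
    if score > high:
        return "positive"  # carcinogen / mutagen / toxic
    return "unclassified"  # assumption: the footnote leaves this range undefined


# Illustrative scores only (hypothetical, not taken from the table).
for endpoint, score in {"NTP": 0.05, "AMES": 0.85, "DTP": 0.5}.items():
    print(endpoint, score, classify(score))
```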

Supplementary Figure 5. Schematic drawings of the interactions between the control drugs and MMP9. (A) JNJ0966-MMP9 complex; (B) JNJ0966 with MMP9; (C) MMP-9-IN-1-MMP9 complex; (D) MMP-9-IN-1 with MMP9. Schematics of the intermolecular interactions in the predicted binding modes of (E) JNJ0966 with MMP9 and (F) MMP-9-IN-1 with MMP9.

[Figure: panels A-D show each complex in two views related by a 90° rotation, with 2D interaction diagrams labeling the contacting chain A residues. Interaction legend for panels A-B (JNJ0966): van der Waals, conventional hydrogen bond, carbon hydrogen bond, Pi-sulfur, Pi-Pi stacked, Pi-Pi T-shaped, alkyl, Pi-alkyl. Interaction legend for panels C-D (MMP-9-IN-1): van der Waals, conventional hydrogen bond, carbon hydrogen bond, halogen (fluorine), Pi-sulfur, Pi-alkyl. Panels E-F show the predicted binding modes.]