PNAS NEXUS

A genome-wide RNAi screen for novel CIN genes using human artificial chromosome

Mikhail Liskovykhᵃ,*, Natalia Y. Kochanovaᵇ, Chih-Yuan Chiangᶜ, Anjali Dhallᵃ, Vasilisa Aksenovaᵈ, Yu-Chi Chenᶜ, William C. Reinholdᵃ, Mary Dassoᵈ, Anish Thomasᵃ, Ken Chih-Chien Chengᶜ, Yves Pommierᵃ, William C. Earnshawᵇ, Vladimir Larionovᵃ,* and Natalay Kouprinaᵃ,*

ᵃDevelopmental Therapeutics Branch, National Cancer Institute, NIH, Bethesda, MD 20892, USA

ᵇWellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom

ᶜFunctional Genomics Laboratory, National Center for Advancing Translational Sciences, NIH, Rockville, MD 20850, USA

ᵈDivision of Molecular and Cellular Biology, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA

*To whom correspondence should be addressed: Email: mikhail.liskovykh@nih.gov (M.L.); Email: larionov@mail.nih.gov (V.L.); Email: kouprinn@mail.nih.gov (N.K.)

Edited By Shibu Yooseph

Abstract

Chromosome instability (CIN) remains among the most important problems in modern cancer research. In this study, we conducted a genome-wide RNAi screen to identify genes that contribute to CIN. To achieve this, we used a human artificial chromosome in a novel sensitized screen to measure CIN. We screened 18,658 genes for their roles in maintaining chromosomal stability and identified 834 candidates as potential CIN genes. A secondary RNAi screen identified 44 genes with the most pronounced CIN phenotypes. In guilt-by-association analysis using a published set of 8,498 proteins across a panel of 949 cancer cell lines, this cohort of 44 genes displayed a striking correlation with mitotic regulators. Furthermore, altered expression of these proteins was associated with a poor prognosis across multiple cancer types. Specifically, downregulation of AMY2B, ALAD, PDGFRA, PPIE, VEZF1, and TTC19 is associated with poor survival in two or more of the following: small cell lung cancer, lung adenocarcinoma, adrenocortical carcinoma, ovarian cancer, and breast cancer. The genes identified in this screen hold potential as prognostic markers for patient survival across several cancer types and could potentially serve as targets for the development of new therapeutic approaches aimed at mitigating CIN.

Keywords: chromosome instability, human artificial chromosomes, mitosis

Significance Statement

In this work, we analyzed a genome-wide library of 18,658 genes to check their role in chromosome instability (CIN) using artificial chromosome technology. The outcome of this study reveals newly identified W-CIN genes that offer potentially important targets for a better understanding of mitosis as a biological process both from a basic science point of view and from a clinical point of view as new biomarkers and diagnostic genes linked to patient survival.

Introduction

Chromosome instability (CIN) is a result of problems with chromosome segregation during cell division (1). Such segregation errors can cause cell death or lead to the production of abnormal daughter cells that are potentially dangerous to the organism, e.g. by predisposing cells to cancer progression (2) or causing problems in early neurodevelopment (3). The CIN phenotype has complex causes and mostly comprises two major subgroups: S-CIN (structural aberration CIN), which ranges from point mutations to a variety of chromosomal rearrangements, and W-CIN (whole chromosome CIN), which involves whole chromosome gains and losses (1). In this study, we present an assay that mainly focuses on W-CIN.

A comprehensive list of genes whose dysfunction leads to CIN would be a significant step toward a deeper understanding of mitotic regulation in normal and cancer cells. To date, the study of CIN genes in humans has been fragmented, with subsets of CIN genes typically discovered as a byproduct of studies focused on DNA repair and replication (4, 5), duplication of centrosomes (6, 7), connections between kinetochore and microtubules (8, 9), chromatid cohesion (10-13), and cell cycle checkpoints (14-16). Among the causes for W-CIN, the most established are DNA damage (17), replication problems (18), and abnormal microtubule dynamics (8, 19). The incompleteness of our knowledge about CIN genes and their functions is demonstrated by the fact that new CIN genes are reported every year (20, 21). To date, there has been no systematic high-throughput search for human W-CIN genes because of the lack of a suitable experimental approach.

Previously, several groups described collections of mutants with impaired mitotic chromosome transmission in the budding yeast Saccharomyces cerevisiae. Spencer et al. identified a collection of ctf mutants (chromosome transmission fidelity) using an assay in which chromosome loss was detected by changes in colony color to monitor the inheritance of an artificially engineered nonessential yeast chromosome (22, 23). Independently, Larionov’s group described another collection of CHL (chromosome loss) genes using a yeast haploid strain disomic for chromosome III and heterozygous at the mating type locus. Mutations in these genes exhibited an increased frequency of bipolar mating, caused by the loss of one of the copies of chromosome III (24, 25). Using a multiply marked supernumerary chromosome III as an indicator, Andrew Hoyt’s group isolated yeast mutants that display increased rates of chromosome loss. These so-called CIN genes regulate microtubule-mediated processes (26).

Such screens have yet to be designed for high-throughput use in human cells, in part due to the lack of a nonessential chromosome whose segregation could be readily monitored. Instead, high-throughput studies have typically focused on the identification of essential genes and their interacting partners in synthetic lethality screens, e.g. in HAP1 haploid cells (27).

Inspired by the yeast experiments described above (22-26, 28), we applied a novel approach specifically designed for human cells (29) to screen as many human genes as possible for those whose loss of function results in W-CIN. Our system uses a high-throughput imaging assay based on a nonessential synthetic human artificial chromosome (HAC) carrying a dual cassette that simultaneously expresses two destabilized versions of the green fluorescent protein (GFP; dGFP) (29). The HAC, which was assembled from synthetic alpha-satellite repeats, contains a functional centromere that allows its stable inheritance as a nonessential chromosome (30). Critically, this HAC segregates accurately but is more sensitive to external disturbances (like drugs or siRNA) than the native human chromosomes (29, 31). The rate of spontaneous HAC loss is 7 × 10⁻³ per cell division, as measured by the accumulation of nonfluorescent cells during growth in the absence of selection. Nonfluorescent cells arise primarily through the loss of the GFP-marked HAC: in a previous study (29), control experiments using FISH demonstrated that the absence of GFP was caused by HAC loss.
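The per-division loss rate quoted above can be related to the fraction of fluorescent cells with a simple retention model. The sketch below assumes that each cell loses the HAC independently with a constant probability r per division, so that the GFP-positive fraction after n divisions is P_n = P_0 (1 - r)^n. This model, the function names, and the choice of 20 divisions are illustrative assumptions, not necessarily the exact calculation used in ref. (29).

```python
def retained_fraction(p0: float, rate: float, divisions: float) -> float:
    """GFP-positive fraction after `divisions` divisions, assuming a constant
    per-division loss probability `rate` (illustrative model)."""
    return p0 * (1.0 - rate) ** divisions


def loss_rate_per_division(p0: float, pn: float, divisions: float) -> float:
    """Invert the model: estimate the per-division loss probability from the
    GFP-positive fractions before (p0) and after (pn) `divisions` divisions."""
    if not (0.0 < pn <= p0 <= 1.0) or divisions <= 0:
        raise ValueError("need 0 < pn <= p0 <= 1 and divisions > 0")
    return 1.0 - (pn / p0) ** (1.0 / divisions)


if __name__ == "__main__":
    spontaneous = 7e-3  # spontaneous loss rate reported in the text
    pn = retained_fraction(1.0, spontaneous, divisions=20)  # 20 is hypothetical
    print(f"GFP-positive fraction after 20 divisions: {pn:.3f}")
    print(f"recovered rate: {loss_rate_per_division(1.0, pn, 20):.4f}")
```

Under this model, a knockdown that raises the loss probability shows up as a faster decline of the GFP-positive fraction over the same number of divisions.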

The rate of HAC loss in our test-system (29) is ~10-fold higher than that of natural chromosomes. This makes the HAC a sensitive model for studying a W-CIN phenotype in cancer cells (32). Thus, in addition to being nonessential it is also sensitized for chromosome missegregation, meaning that alterations in its segregation fidelity can be detected with a reasonable frequency. As shown previously, this HAC is a powerful tool with multiple applications (33, 34), including cloning and expression of full-length genes (i.e. both exons and introns) (35-37), chromosome cluster construction (38), centromere/kinetochore function and regulation studies (39-41), synthetic biology applications (42, 43), and gene therapy in mutant cell lines (44, 45).

Recently, we successfully applied a similar HAC-based system to design a reproducible and sensitive screen focusing on the role of human protein kinases in chromosome transmission. That study identified a novel set of kinases whose knockdown resulted in increased W-CIN (29). Given the success of that assay, we here applied an analogous approach in a genome-wide siRNA screen of all genes to identify those whose knockdown causes W-CIN. The present study has identified a number of novel W-CIN genes, and we have examined their association with patient prognosis across multiple cancer types.

Results

Primary whole-genome siRNA screening identifies a set of novel CIN candidate genes

To identify new human CIN genes, we adapted a previously optimized HAC-based approach for measuring CIN (Fig. 1A) (29). In short, an HT1080 parental cell population carrying the dGFP/HAC is green during cell cycle progression. If cells lose the HAC due to a missegregation event, they lose the GFP signal within 6 h (29). This system, therefore, allows rapid detection of chromosome missegregation.

HT1080 are fibrosarcoma cells possessing chromosome arm-level copy number alterations (DepMap portal data) (46, 47), as well as gene mutations including those in NRAS (46-48) and p53 (49, 50). Under conditions that generate DNA breakage, such as methotrexate (MTX) treatment, which inhibits dihydrofolate reductase (DHFR), HT1080 cells can acquire resistance to MTX via DHFR gene amplification. This requires progression through the S-phase (50). Furthermore, rearrangement of the RET oncogene was detected in HT1080 cells in response to X-irradiation (51). It is therefore possible that endogenous mutations or the introduction of the additional reporter chromosome into HT1080 cells could potentially increase the baseline of CIN and genomic rearrangements, as well as affect survival upon siRNA treatment. This is why we carefully analyzed the results of transfection with control siRNAs in order to accurately quantify the specific increase in CIN upon silencing of screened genes.

Our initial screen used an siRNA library targeting 18,658 human genes (Thermo Fisher). This library has three individual siRNAs for each target. We applied the following three criteria to analyze the readout of this primary screen (Supplementary Excel File S1) (52-54). First, we excluded all genes whose depletion yielded values below the scrambled siRNA (negative control [NC]) threshold (points below the x axis in Fig. 1B). This reduced the pool of candidates to 9,000 genes (Fig. 1C). Second, we excluded the most cytotoxic siRNAs, which blocked cell proliferation or resulted in extensive cell death. Note that major kinetochore/centromere proteins and mitotic regulators, including CENPC, CENPN, AURKB, MAD1L1, MAD2L1, and PLK1, were removed by this filter. This provided an additional internal control, because loss of centromere function is expected to cause cells to significantly slow proliferation or die rather than survive and accumulate transformative changes. This narrowed our search to 2,673 genes (Fig. 1C). Third, to minimize possible false-positive and off-target results, we focused only on genes where all three siRNAs passed the statistical t-test (Fig. 1B and C), as described in previous studies (29, 31). This strategy is designed to yield the most robust and significant outcome, at the risk of false negatives (i.e. missing genes whose knockdown promotes mild levels of CIN). However, alternative filters could be applied to analyze readouts of the screen using our data provided in the Supplementary files. Thus, the data presented in this paper will be useful for subsequent data mining and follow-up studies.
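The three criteria can be sketched as a small filtering routine. This is a minimal illustration under stated assumptions: the record layout (`SirnaResult`), the use of the per-gene mean as the readout compared with the NC threshold, the rule that any cytotoxic siRNA removes a gene, and the 0.05 significance cutoff are all hypothetical choices, not details given in the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SirnaResult:
    hac_loss: float   # HAC loss per cell division measured for this siRNA
    p_value: float    # p-value from a t-test against the negative control
    cytotoxic: bool   # True if the siRNA blocked proliferation or killed cells


def select_hits(screen, nc_threshold, alpha=0.05):
    """Apply the three filters in the order given in the text.

    `screen` maps gene symbol -> list of three SirnaResult records.
    Returns the gene lists remaining after each filter.
    """
    # Filter 1: keep genes whose mean readout is above the NC threshold.
    above_nc = [g for g, s in screen.items()
                if mean(r.hac_loss for r in s) > nc_threshold]
    # Filter 2: drop genes with any cytotoxic siRNA (assumed rule).
    viable = [g for g in above_nc
              if not any(r.cytotoxic for r in screen[g])]
    # Filter 3: require all three siRNAs to be individually significant.
    hits = [g for g in viable
            if all(r.p_value < alpha for r in screen[g])]
    return above_nc, viable, hits
```

Requiring all three siRNAs to pass is the conservative choice described above: it suppresses off-target hits but can discard genes with mild phenotypes.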

The primary siRNA screen identified 834 genes that comprise our primary CIN gene list (Fig. 1C and Supplementary Excel File S2). These genes represent a wide variety of functional classes (cell cycle regulators, ATP metabolism, immune response, cytoskeleton, DNA metabolism, transcription factors, structural proteins, RNA metabolism, transporters, proteases, signaling proteins, transferases, binding proteins), suggesting that maintenance of chromosome stability is a complex process requiring a multilayered regulatory network. From this primary list of CIN genes, 28 genes had previously been annotated as connected to chromosomal instability (GO terms). These genes (Supplementary Excel File S2, green labeling) serve as an internal positive control, indicating that the test-system can indeed identify CIN genes.

Fig. 1. Schematics of the alphoid-HAC approach and results of primary genome-wide siRNA screening. A) Schematic of the alphoid-HAC-based system used to measure CIN in cells. An HT1080 cell line carrying the alphoid-HAC with a combination of two GFP degrons was used to conduct the screening [for details, please see Ref. (29)]. The parental cell population is green during cell cycle progression. As soon as cells lose the HAC due to a missegregation event, they lose the GFP signal within 6 h (29). Thus, screening can be performed in a high-throughput manner in 384-well format. After siRNA transfection, cells can be fixed, and the numbers of GFP-positive and GFP-negative nuclei can be counted and presented as a probability of HAC loss per cell division. B) Distribution of the whole-genome siRNA screening results based on Z-score. C) Pipeline for selecting the primary CIN gene list and the 250 topmost genes for the focused secondary siRNA screening. The pipeline contains several filters to select the hits. Filter 1: we selected genes above the NC threshold line. Filter 2: we selected only genes where all three siRNAs showed a significant effect. Filter 3: we selected only noncytotoxic siRNAs. Thus, we were able to identify the primary CIN gene list. Later, we focused on the 250 topmost genes based on the HAC loss per cell division parameter. The panel on the right represents the classification of primary hits based on GO terms, including the list of internal positive controls.

[Fig. 1 panel content. A) DAPI and GFP images of control, siRNA 1, siRNA 2, and siRNA 3 wells on a 384-well imaging plate; y axis: probability of HAC loss per cell division. B) y axis: Z-score; x axis: genes of the 18,658-gene library; labeled hits include MPLKIP, CENPB, BUB1B, and MCM6. C) 18,658-gene library; Filter 1: 9,000 genes; Filter 2: 2,673 genes; Filter 3: 834 genes, divided into the 250 topmost genes and the remaining 584 genes. Classification of the 834 primary hits: cell cycle regulators/centromere; ATP metabolism; protein binding; immune response; cytoskeleton/adhesion; DNA metabolism; transferases/hydrolases/ligases; transcription factors; kinases/phosphatases; structural proteins; signaling proteins; RNA metabolism; transporters; proteases; unannotated. Internal positive controls: MPLKIP, CENPB, MCM2, MCM3, MCM6, POLE, POLR2L, SUPV3L1, TIPIN, XRCC3, TOP3B, KIF23, KIF3B, KLC3, TUBA4A, NEK2, PLK4, BUB1B, PPP1R12A, PPP2R1B, PPP2R2A, PTK2B, YWHAG, RBX1, SMC1A, SMC3, GMNN, PTTG1.]

Published work from the Hieter lab revealed 692 genes linked to CIN in whole yeast-genome screening (55). Homologs of 23 of these were found in our primary list of human CIN genes [Supplementary Excel File S2, (Comparison with yeast CIN genes sheet)]. Combined with the set of GO-term positive controls, this brought the total number of genes previously linked to accurate chromosome segregation to 51 (6% of the 834 genes). Thus, our whole-genome siRNA screen with the dGFP/HAC system identified a list of 783 candidate genes whose association with CIN was previously unsuspected. Validation of these candidates and discovery of their mechanisms of action offer a potential research avenue for future studies of CIN.

Secondary siRNA screening to select CIN genes for further analysis

To select a group of genes for further analysis, we subjected the primary CIN gene list to a secondary siRNA screen. For this, we chose the 250 CIN genes (Fig. 1C) that revealed the highest level of HAC loss (Fig. 2A) per cell division [as described in (29, 31)]. For each candidate gene, we again used three independent siRNAs that were different from those used in the primary screen and originated from a different vendor (Horizon Discovery, Supplementary Excel File S3). Two prior studies (56, 57) emphasized the importance of chemical modifications and synthesis quality in producing off-target effects in RNAi screens. For instance, residual chemicals from the synthesis process, such as phosphoramidites, can interfere with the specificity of the siRNA or trigger immune responses, leading to unintended effects (56, 57). Changing the siRNA vendor for the secondary screen is one way to reduce these effects.

To analyze readouts of the secondary screen, we applied the same strategy used for primary screen analysis, resulting in the identification of 44 genes most strongly associated with CIN in our study (the secondary CIN gene list).

It is worth noting that another 584 genes from the primary screen (Fig. 1C) are potential candidates for further study.

We compared the lists of CIN genes from the primary and secondary screens to the Achilles and CRISPR post-Chronos common essential gene lists across cancer cell lines obtained from DepMap (47, 58-61). In total, 118 of the 834 primary screen genes and 11 of the 44 secondary screen hits were on one or more of the essential gene lists (Fig. S1).

Depletion of 44 novel CIN genes causes mitotic abnormalities in normal RPE1 cells

To check the effect of siRNA depletion on natural chromosomes, we performed experiments using nontransformed RPE1 cells. Although depletion of 37 of 44 candidate CIN genes had no significant effect on the overall rate of proliferation or mitotic index (Fig. 2B, Supplementary Excel File S4), depletion of all 44 genes induced an elevated number of micronuclei (MNi) and abnormal mitoses (Fig. 2C). MNi are formed when chromosome fragments or whole chromosomes lag behind at anaphase during cell division and are not incorporated into the nucleus with the bulk of the segregated chromatids. Thus, MNi are commonly scored as a marker for chromosome missegregation (62). Upon further detailed analysis, we found that RNAi depletion of 29 of the 44 genes resulted in a significant increase in MNi formation (Supplementary Excel File S4).
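The legend of Fig. 2 reports Fisher’s exact test for these comparisons. The following is a minimal sketch of a two-sided Fisher’s exact test on a 2×2 table of cell counts. Whether the authors used a one- or two-sided test is not stated, so the two-sided form here is an assumption, and the counts in the usage example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: siRNA-treated / control; columns: cells with / without micronuclei.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: 30 of 500 treated cells vs 10 of 500 control cells with MNi.
    p = fisher_exact_two_sided(30, 470, 10, 490)
    print(f"p = {p:.4g}, significant at 0.05: {p < 0.05}")
```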

Depletion of 12 of these candidate genes resulted in a significant increase in abnormal mitoses (Fig. 2C). The correlation between MNi formation and abnormal mitoses was almost perfect (Fig. 2C), confirming that abnormal mitosis is a cause of MNi formation.

To determine the mechanism responsible for the elevated formation of MNi and abnormal mitoses, we carried out a detailed examination of mitotic progression following depletion of each of the 44 genes using confocal microscopy with immunostaining for alpha-tubulin and histone H3S10ph (see Materials and methods). In all cases, we observed formation of lagging chromosomes in anaphase, chromosome breakage or kinetochore attachment defects, with consequent formation of MNi in telophase (Fig. 3; description of observed mitotic abnormalities can be found in Supplementary Excel File S4, abnormal mitoses sheet). This confirms that depletion of each of the 44 candidate CIN genes causes chromosome segregation errors in normal RPE1 cells.

Additionally, to check the possible connection of increased abnormal mitosis with DNA damage, we performed phospho-gamma-H2AX staining in RPE1 cells treated with siRNAs against genes identified in the secondary screen (Fig. S2, Supplementary Excel File S4). Depletion of 24 out of 44 genes demonstrated an increased level of gamma-H2AX foci formation, suggesting that DNA damage might be the driver of CIN for some genes.

Correlation analysis revealed a link between CIN genes, kinetochore proteins, and cell cycle progression

To further query the function of the proteins identified in the secondary screen using an independent method, we performed a protein correlation (guilt-by-association) analysis using a large cancer proteomics dataset that reported abundances of 8,498 proteins across 949 human cancer cell lines (63). Initially, we calculated the correlation of maxLFQ proteomic values for every protein from the secondary screen that was available in the published dataset versus every protein in that dataset. Next, for every screen protein we ranked the correlations from the highest to the lowest. We observed that in the case of particular proteins (Fig. 4C), the top proteins in the rankings were often related to kinetochore, centromere, cell division or cell cycle (an example of top-200 correlation ranking is presented in the Supplementary Excel File S7). This revealed an increased degree of coexpression at the protein level between the proteins identified in the secondary screen and proteins involved in cell cycle and cell division pathways. A similar coexpression strategy previously identified a number of critical proteins involved in cell division (64).
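The ranking step can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson correlation over cell lines where both proteins were quantified is an assumption here, as are the function names and the minimum of three shared observations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation over positions where both values are present (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None  # too few shared cell lines to correlate
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # constant abundance, correlation undefined
    return sxy / sqrt(sxx * syy)


def rank_partners(query, abundances, top=200):
    """Rank all other proteins by correlation with `query`, highest first.

    `abundances` maps protein -> list of abundance values, one per cell line.
    """
    scored = []
    for other, values in abundances.items():
        if other == query:
            continue
        r = pearson(abundances[query], values)
        if r is not None:
            scored.append((other, r))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]
```

The top of each ranked list can then be inspected for GO annotations such as kinetochore or cell cycle, which is the guilt-by-association step.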

To generalize and quantify our findings, we correlated maxLFQ values for the 21 proteins that were identified in the secondary screen and present in the cancer dataset versus levels of proteins with GO terms associated with the kinetochore, centromere, cell division, and cell cycle. As a control, we calculated similar correlations for all the proteins present in the cancer dataset and plotted the distributions of correlations. We observed that, in general, levels of the 21 proteins from the secondary screen correlate more strongly with levels of proteins having the mitosis-associated GO terms “kinetochore,” “centromere,” “cell division,” and “cell cycle” than do the levels of bulk proteins in the dataset (Fig. 4A and B). When the correlation profiles of individual proteins identified in the screen were analyzed, most of them (e.g. USP47, ALDOA) exhibited positive correlations with proteins having mitosis-associated GO terms (Fig. 4C). However, several candidate proteins were anticorrelated with the mitosis-associated reference proteins. Similarly, the levels of proteins identified in the CIN screen correlated with levels of proteins whose absence leads to synthetic lethality with kinetochore, centromere, cell division, and cell cycle proteins (65) (Fig. S3).

Fig. 2. Results of secondary focused siRNA screening and the effect of individually depleted genes on normal RPE1 cells. A) Distribution of the focused secondary siRNA screening results based on Z-score. B) Percentage of mitotic cells (left bar of each pair) in comparison with MNi (right bar of each pair) after siRNA depletion in normal RPE1 cells. Names of target genes are displayed on the x axis, percentage on the y axis. NC stands for negative control (scrambled siRNA); asterisks indicate statistical significance (P < 0.05, Fisher’s exact test). C) Percentage of MNi (left bar of each pair) in comparison with abnormal mitoses (right bar of each pair) after siRNA depletion in normal RPE1 cells. Names of target genes are displayed on the x axis, percentage on the y axis. NC stands for negative control (scrambled siRNA); asterisks indicate statistical significance (P < 0.05, Fisher’s exact test).

The results of this correlation analysis support our initial hypothesis that the 44 CIN genes identified in this study are likely to be involved in cell cycle and cell division-related pathways and important for proper mitotic progression and regulation.
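The legend of Fig. 4 mentions two summaries of these correlation distributions: a Kolmogorov-Smirnov comparison and the fraction of correlations above 0.2. A minimal sketch of both is given below. It computes only the two-sample KS statistic, not its p-value (for which a library routine such as `scipy.stats.ks_2samp` could be used), and the function names are illustrative.

```python
from bisect import bisect_right


def fraction_above(values, threshold=0.2):
    """Fraction of correlation coefficients strictly above `threshold`."""
    return sum(v > threshold for v in values) / len(values)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        # Empirical CDF value of each sample at x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

Here `sample_a` would hold the correlations of the screen proteins with a GO-term set and `sample_b` the correlations of all proteins with the same set.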

Low expression of CIN genes correlates with poor survival prognosis for genomically unstable cancer types

CIN is a hallmark of 80-90% of all solid tumors (66, 67). Among those, the most prone to aneuploidy, genome doubling and altered genomes in metastasis include small cell lung cancer (SCLC), adrenocortical carcinoma (ACC), colon adenocarcinoma (COAD), lung adenocarcinoma (LUAD), ovarian cancer, and breast cancer (67, 68). We hypothesized that downregulation of genes identified in our secondary screening might have a prognostic impact in these cancers. To link our findings with patient prognosis, we performed bioinformatic analysis on CellMiner databases (see Materials and methods). Indeed, lower expression of a number of the CIN genes correlated with poor prognosis for patient survival (Figs. 5 and S4, and Supplementary Excel File S5) in several different types of cancer, as follows: TFRC, SYMPK, and TMEM127 in SCLC; ATP13A4, YTHDF2, OMG, GABRR3, STK19, and VEZF1 in LUAD; FBXO32, REPS2, MLEC, and GSS in ACC; USP47, SOX14, and YEATS4 in breast cancer; and SENP6 in ovarian cancer. ALAD and TTC19 downregulation correlated with poor patient survival in both LUAD and ACC. Furthermore, AMY2B, PDGFRA, and PPIE downregulation correlated with poor patient survival in breast cancer plus ACC, COAD, and ovarian cancer, respectively.

[Fig. 2 panel content. A) y axis: Z score; x axis: the 250 genes of the secondary screen. B) and C) y axes: % mitosis, % micronuclei, and % abnormal mitosis; x axis: the 44 target genes: LPA, MMP15, YTHDF2, ADRM1, ALAD, AMY2B, FAM210A, ATP13A4, REPS2, SNX33, OMG, KCNH8, STK19, TFRC, SOST, C2orf42, PDGFRA, PKIA, SYMPK, GABRR3, GSS, SLC36A4, MLEC, TTC19, DHODH, SOX14, ICE2, DDX39, VEZF1, ABHD4, FBXO32, OR5A1, TMEM127, PSMG1, BPIFB4, PPIE, IFNA7, ALDOA, YEATS4, MAP3K13, SENP6, C4orf45, USP47, GGPS1.]

Fig. 3. Observation of mitotic abnormalities in normal RPE1 cells with siRNA depletion of target genes. Immunostaining of normal RPE1 cells during mitotic progression following siRNA depletion of target genes. The panel shows chromosome distribution during major mitotic phases (prophase, metaphase, anaphase, and telophase). Yellow arrows point to mitotic abnormalities, such as lagging chromosomes, chromosome bridges, chromosome loss or misattachment. Names of target genes are displayed in white; NC stands for negative control (scrambled siRNA). Magenta staining is against phosphorylated histone H3. Green staining is against beta-tubulin.

[Fig. 3 panel content: images of prophase, metaphase, anaphase, and telophase for each of the same 44 target genes.]

Fig. 4. Correlation analysis for all proteins in a large cancer proteome dataset or the subset of proteins revealed in this CIN screen and present in that dataset. A) The plot shows density ridgeline plots of correlation distributions for all proteins (turquoise) or CIN candidates (salmon) against proteins in that dataset with GO terms for kinetochore, centromere, cell division and cell cycle. The Kolmogorov-Smirnov test was performed to compare the distributions. B) Fractions of correlations >0.2 from (A). C) Correlation analysis (density ridgeline plots) for specific proteins revealed in this CIN screen and found in the large cancer proteome dataset versus proteins in that dataset with GO terms for kinetochore, centromere, cell division, and cell cycle (salmon) or proteins in that dataset that lack those GO terms (turquoise).

[Fig. 4 panel content. A) Ridgeline plots of correlation distributions (x axis: correlations, -0.5 to 1.0) for all proteins, chromosome instability proteins, and their overlay, one row per GO term (kinetochore, centromere, cell division, cell cycle). B) y axis: fraction (0.00 to 0.20); x axis: cell cycle, cell division, centromere, kinetochore. C) Columns: cell cycle, cell division, centromere, kinetochore (x axis: correlations distribution, -0.5 to 1.0); one row per protein: Q9Y5A9.YTHD2_HUMAN, O95456.PSMG1_HUMAN, O95619.YETS4_HUMAN, O00148.DX39A_HUMAN, Q16186.ADRM1_HUMAN, Q9UNP9.PPIE_HUMAN, Q96K76.UBP47_HUMAN, P04075.ALDOA_HUMAN, O95749.GGPPS_HUMAN, P48637.GSHB_HUMAN, Q14119.VEZF1_HUMAN, P13716.HEM2_HUMAN, Q92797.SYMPK_HUMAN, P16234.PGFRA_HUMAN, P61925.IPKA_HUMAN, Q6DKK2.TTC19_HUMAN, P51511.MMP15_HUMAN, P02786.TFR1_HUMAN, Q96ND0.F210A_HUMAN, Q02127.PYRD_HUMAN, Q14165.MLEC_HUMAN.]

Taking these findings together, we conclude that low expression levels of the candidate CIN genes identified in our secondary screen might serve as a diagnostic marker to detect predisposition for cancer or as a prognostic marker predictive of the probability of patient survival.
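The survival comparisons in Fig. 5 are Kaplan-Meier (KM) curves. For readers unfamiliar with the estimator, a minimal sketch follows. It is a generic textbook KM estimator, not the CellMiner implementation, and it does not include the statistical test used to compare the low- and high-expression groups.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    `times`: follow-up time per patient; `events`: True if the patient died
    at that time, False if the observation was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Running the estimator separately on patients with low and high expression of a given gene would yield the paired curves of the kind plotted for each gene.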

Depletion of CIN genes slowed proliferation of the genetically unstable U2OS cell line

It has been suggested for more than a decade that, because CIN cancer cells already show elevated chromosome instability, artificially increasing that instability might be used to selectively eliminate them via synthetic lethality (69, 70).

Fig. 5. Patient survival analysis. A) Kaplan-Meier (KM) survival curves for SCLC, showing that lower expression of each gene correlates with poorer patient survival; from left to right: TFRC, SYMPK, and TMEM127. B) KM survival curves for ACC, showing that lower expression of each gene correlates with poorer patient survival; from top to bottom: FBXO32, AMY2B, REPS2, MLEC, ALAD, GSS, and TTC19. C) KM survival curves for LUAD, showing that lower expression of each gene correlates with poorer patient survival; from top to bottom: ALAD, ATP13A4, TTC19, YTHDF2, OMG, GABRR3, STK19, and VEZF1.

[Fig. 5 panel content. Each panel plots survival probability (y axis) against time (x axis) for the low- and high-expression groups, with a number-at-risk table below. P-values shown in the panels: SCLC: TFRC 0.0144, SYMPK 0.022, TMEM127 0.022. ACC: FBXO32 0.00093, AMY2B 0.0012, REPS2 0.0036, MLEC 0.021, ALAD 0.033, GSS 0.05, TTC19 0.05. LUAD: ALAD 0.016, ATP13A4 0.0056, TTC19 0.0071, YTHDF2 0.016, OMG 0.021, GABRR3 0.036, STK19 0.039, VEZF1 0.043.]

To explore this hypothesis, we compared cell growth of relatively genetically stable (RPE1) and unstable (U2OS) cells during depletion of six of our top CIN genes (GGPS1, USP47, C4orf45, SENP6, MAP3K13, and YEATS4). As illustrated in Fig. 6A, depletion of these mRNAs in RPE1 cells resulted in slowed proliferation and an increased level of mitotic abnormalities (see

Fig. 6. Effect of siRNA depletion of CIN genes on RPE1 and U2OS cells. A) Growth curves of RPE1 cells after siRNA treatment against GGPS1 (orange), USP47 (dark green), C4orf45 (blue), SENP6 (purple), MAP3K13 (green), and YEATS4 (violet) in comparison with NC scrambled siRNA (dark blue). Days after treatment are displayed on the x axis and the number of cells in the sample (×1,000) on the y axis. B) Growth curves of U2OS cells after siRNA treatment against GGPS1 (orange), USP47 (dark green), C4orf45 (blue), SENP6 (purple), MAP3K13 (green), and YEATS4 (violet) in comparison with NC scrambled siRNA (dark blue). Axes are as in A. C) Mitotic abnormalities in U2OS cells after siRNA depletion of target genes. U2OS cells were immunostained during mitotic progression under siRNA depletion of the target genes. The panel shows chromosome distribution during the major mitotic phases (prophase, metaphase, anaphase, and telophase). Yellow arrows point to mitotic abnormalities such as lagging chromosomes, chromosome bridges, chromosome loss, or misattachment. Names of target genes are displayed in white; NC stands for negative control (scrambled siRNA). Magenta staining is against phosphorylated histone H3. Green staining is against beta-tubulin. D) Percentage of MNi (orange bars) in comparison with abnormal mitoses (AMitoses) (blue bars) after siRNA depletion in U2OS cells. Names of target genes are displayed on the x axis and percentages on the y axis. NC stands for negative control (scrambled siRNA). One asterisk indicates P < 0.05 and two asterisks indicate P < 0.0001 (Fisher's exact test).


Figs. 2C and 3). This is consistent with our hypothesis that downregulation of CIN genes could potentially contribute to cell transformation. By contrast, in the genetically less stable U2OS cells, depletion of these CIN genes led to a more pronounced slowdown in proliferation (Fig. 6B). Additionally, micronucleation and abnormal mitoses occurred at much higher levels in U2OS cells than in RPE1 cells (Fig. 6C and D; Tables S1 and S2).

Thus, the proliferation rate of cells predisposed to CIN can be drastically slowed by the downregulation of these candidate CIN genes. As a caveat, many other genotypic differences between U2OS and RPE1 cells could contribute to this response to siRNA treatment. Nonetheless, it appears likely that targeted elevation of CIN in cancer cells with a high level of basal CIN might be useful in cancer treatment.

Some inhibitors for proteins encoded by candidate CIN genes act similarly to the siRNA

Since siRNA depletion acts by decreasing the mRNA level of its targets and consequently reducing the amount of the protein in the cell, we tested whether small molecules that target the same proteins can also induce CIN by directly disrupting protein function.

Among the 44 top genes from our secondary screening, small-molecule inhibitors were available for 12: ALAD (3 compounds); ALDOA (4 compounds); AMY2B (15 compounds); GGPS1 (6 compounds); KCNH8 (4 compounds); LPA (1 compound); PDGFRA (1 compound); PKIA (20 compounds); STK19 (1 compound); TFRC (1 compound); USP47 (3 compounds); and YTHDF2 (1 compound) (for details, see Supplementary Excel File S6). To measure the effect of these inhibitors on chromosome loss, we applied the HAC-based system in a manner similar to our primary and secondary siRNA screens. Significant effects (P < 0.0001) on CIN were observed for ALAD with 3-(2-aminoethyl)-4-(aminomethyl)heptanedioic acid and artenimol; PDGFRA with fostamatinib; PKIA with 5-(2-methylpiperazine-1-sulfonyl)isoquinoline, myristic acid, Y-27632 dihydrochloride, AT-7867, A-443654, and (S)-H-1152 (hydrochloride); and USP47 with USP7/USP47 inhibitor (Fig. 7). Thus, inhibition of at least some of these proteins can cause an increase in CIN.

Identification of currently used inhibitors of CIN proteins that significantly increase chromosome missegregation suggests a possible avenue to target and leverage the CIN phenotype in cancer cells.
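The Fig. 7 legend reports Fisher's exact test with Bonferroni's correction. As an illustration only, the following minimal Python sketch shows how such a comparison could be computed for one compound against the DMSO control. It is not the authors' code: the 2×2 layout (GFP-negative vs. GFP-positive cell counts in treated vs. control wells), the function names, and the example counts are all assumptions made for this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value for n_tests comparisons."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts (invented for illustration, not data from the paper):
# rows = treated / control, columns = GFP-negative / GFP-positive cells.
treated = (120, 380)
control = (40, 460)
raw_p = fisher_exact_two_sided(*treated, *control)
adjusted_p = bonferroni(raw_p, n_tests=60)  # 60 compounds were tested (Fig. 7)
print(f"raw P = {raw_p:.3g}, Bonferroni-adjusted P = {adjusted_p:.3g}")
```

The two-sided P-value here sums all tables that are no more probable than the observed one, which is one common convention; other definitions of a two-sided exact test give slightly different values.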

Future insights into gene function

Looking ahead, we emphasize that all of the newly identified CIN genes merit further investigation. We suggest one approach that could be followed. STRING analysis (https://string-db.org/) can be informative for identifying potential targets for proteins encoded by CIN genes. When these proteins are compared with the existing pool of genes involved in cell cycle/mitosis regulation, the analysis can reveal potential interaction partners. As an example, mapping the gene YEATS4 reveals a link to the mitotic spindle assembly checkpoint/centromere gene cluster. Interestingly, the yeast ortholog of YEATS4 (YAF9) interacts with MAD2L1 (71, 72). Fluorescence-activated cell sorting (FACS) analysis (Fig. S5) strengthens the suggestion that the human protein YEATS4 is involved in the MAD2L1 regulatory network, because the YEATS4 depletion profile is similar to the previously described MAD2L1 depletion profile (73). Further work in this direction could illuminate the role of the candidate CIN genes in mitosis.

Discussion

A comprehensive understanding of the spectrum of mechanisms that can lead to cancerous transformation of normal cells is a necessary step towards effective prevention, diagnosis, and therapy. High-throughput studies addressing this problem have been done in model organisms such as yeast (74, 75). However, work on human cells (55, 76, 77) has been hampered by the lack of an appropriate model for detection of CIN. Here, we have investigated the involvement of 18,658 human genes in maintaining chromosome stability using a nonessential HAC as a sensor.

The assay that monitors the segregation of the HAC provides a way to detect minor changes in chromosome homeostasis that could potentially lead to cellular transformation. Although most studies of cancerous transformation focus on a few key genes whose misregulation has a dramatic effect on cell behavior, such as p53 (78) or c-Myc (79), our cells in fact accumulate a myriad of small defects over their lifetime. When these defects are combined, or in some cases on their own, the result can be transformation of normal cells into cancer. Our primary and secondary screen results identify potential targets, primarily in nonessential genes, for future studies of mitotic regulation. The role of these genes in chromosome segregation is a fertile area for future studies.

The screen used loss of the GFP signal as a readout of chromosome loss or rearrangement. An important limitation is therefore its inability to identify drivers involved in chromosome amplification, even though key amplification events are detected in many cancers (46). The test system employed here cannot clearly distinguish such HAC gain effects, which are obscured by natural fluctuations in GFP expression within the cell population. We also cannot exclude loss of GFP through internal rearrangements of the GFP locus on the HAC caused by S-CIN; thus, further validation of the targets identified here using a technique such as FISH will be important in future studies. Finally, use of an siRNA library has certain limitations due to potential variation in quantitative gene knockdown by individual siRNAs.

Our secondary screen identified a cohort of candidate genes whose depletion was most effective at inducing chromosome loss. The fact that downregulation of these putative CIN genes resulted in an elevated number of abnormal mitoses and, consequently, MNi (Figs. 2 and 3) strongly supports the specificity and utility of our high-throughput primary screen. The range of severity of the CIN phenotypes supports our hypothesis that we detected genes whose depletion causes minor disturbances in mitosis. For example, genes to the left of C4orf42 in Fig. 2C did not show statistical significance in the formation of MNi or abnormal mitoses, but following depletion of those proteins, the level of mitotic abnormalities is higher than in the NC, and the HAC loss per cell division is also high and significant. This suggests that although some gene knockdowns are not strong enough on their own to affect the behavior of natural chromosomes, they are sufficient to cause missegregation of the sensitized HAC. They should therefore be regarded as potential factors enhancing the risk of aberrant mitoses when combined with other factors.

Intriguingly, among the top targets we identified SENP6 as having a strong connection to CIN. SENP6 was previously described as a key kinetochore regulator (80, 81) but was not linked to CIN. Our work complements the previous findings on the role of SENP6 in mitosis. Clearly, the other newly identified CIN genes described here offer fertile ground for discovery of novel proteins and pathways involved in mitotic chromosome segregation. Powerful new computational technologies such as AlphaFold or biochemical

Fig. 7. Measurement of HAC loss per cell division in a drug screen. Sixty compounds were tested for their effect on CIN. The graph represents all tested compounds. The concentration is 10 µM by default; if a significant effect was observed at more than one concentration, each concentration is specified. Statistically significant enrichment of HAC loss per cell division (P < 0.0001, Fisher's exact test with Bonferroni's correction) is labeled with an asterisk. Names of target genes with drug names and concentrations are displayed on the x axis and HAC loss per cell division on the y axis. The green bar is the NC (DMSO). Yellow bars are the positive control (taxol) previously described in Ref. (31).

[Fig. 7 plot: probability of HAC loss per cell division (y axis, 0.1 to 0.7) for each tested compound, grouped by target (x axis): the paclitaxel positive control, ALAD, ALDOA, AMY2B, GGPS1, KCNH8, LPA, PDGFRA, PKIA, STK19, TFRC, USP47, and YTHDF2.]

analyses such as cross-linking mass spectrometry can be used to predict and confirm protein interactions and map the molecular ecosystem within which these proteins normally function to promote accurate chromosome segregation.

Since CIN is observed in ~80-90% of all solid tumors (66, 67), the CIN genes described here are potential candidates for early markers for cancer diagnosis. Our previous discovery of PINK as a kinase involved in CIN (29) led to the clinical usage of PINK as a new (ovarian cancer) diagnostic marker (82). Given the scale of our new study compared to the kinase screen, the CIN genes described here might make a substantial contribution to future clinical diagnosis of various types of cancer. Considering the correlation between downregulation of CIN genes and poor prognosis for patient survival in the six most genomically unstable cancer types (SCLC, ACC, COAD, LUAD, ovarian cancer, and breast cancer), analysis of the expression level of these genes (especially downregulation) may be clinically informative. For example, our correlation analysis of expression across many cancer cell lines, considered together with synthetic lethality data sets, suggests that treatments targeting combinations of the CIN genes identified here with their synthetic lethal counterparts might offer promise in the clinic.

Interestingly, one of the potential drugs we found to increase CIN is fostamatinib, a compound used clinically to treat the blood disorder thrombocytopenia (83). Our findings suggest that fostamatinib might have a second application in cancer treatment. Thus, the HAC-based approach described here might be applied in the discovery of new potential therapeutics affecting CIN. Overall, development of tests screening for levels of the spectrum of genes identified here may offer a potential step towards personalized diagnosis and treatment of patients with cancer.

Materials and methods

Cell culture

All cell culture media, components, and supplements were purchased from Life Technologies (USA) and Sigma (USA), unless otherwise specified. HT1080 cells carrying HAC/dGFP were routinely maintained in a 5% CO2 atmosphere in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum, 100 U/mL penicillin, 100 mg/mL streptomycin, and 2 mmol/L L-glutamine in the presence of 10 mg/mL Blasticidine S. U2OS cells were routinely maintained in a 5% CO2 atmosphere in alphaMEM supplemented with 10% fetal bovine serum, 100 U/mL penicillin, 100 mg/mL streptomycin, and 2 mmol/L L-glutamine. RPE1 cells were routinely maintained in a 5% CO2 atmosphere in DMEM supplemented with 10% fetal bovine serum, 100 U/mL penicillin, 100 mg/mL streptomycin, and 2 mmol/L L-glutamine. U2OS and RPE1 cells were obtained from ATCC.

Immunocytochemistry

The protocol below describes the ICC procedure for one well of a 24-well plate. Where necessary, volumes were scaled up proportionally. Cultured cells were washed with PBS and fixed in 4% paraformaldehyde (PFA) in PBS for 15 min at RT. Cells were rinsed two times quickly with PBS at RT. Cells were blocked in 200 µL of 5% bovine serum albumin (BSA) in PBS-TT (PBS plus 0.5% Tween 20, 0.1% Triton X-100) for 30 min at RT. Cells were washed three times with PBS-T (PBS plus 0.1% Tween 20) for 5 min. Cells were stained with primary antibodies (dilution according to the manufacturer's protocol) in 200 µL of 1% BSA in PBS-TT at RT for 2 h. The samples were washed three times with PBS-T for 5 min. Cells were stained with secondary antibodies (dilution according to the manufacturer's protocol) in 200 µL of 1% BSA in PBS-TT at RT for 1 h. The samples were washed three times in PBS-T for 5 min. The samples were counterstained with 4',6-diamidino-2-phenylindole (DAPI) and mounted with mounting media (ProLong Diamond Antifade Mountant with DAPI, Life Technology, P36962). The samples obtained were analyzed using a Zeiss LSM780 confocal microscope system at the LRBGE Fluorescence Imaging Facility (NIH). Images were analyzed using Fiji software. A list of antibodies is available in Table S3.

siRNA transfection (24-well format)

The protocol below describes the siRNA transfection procedure for one well of a 24-well plate. Where necessary, volumes were scaled up proportionally. The genes of interest were knocked down using siRNAs purchased from Dharmacon (USA). For siRNA treatment, 12.5 × 10³ cells/well were seeded in 24-well plates one day before the experiment. Cells were transfected with each siRNA (working concentration 17 nM) using Lipofectamine RNAiMAX (Thermofisher) following the manufacturer's protocol. Cells were grown without Blasticidine for 96 h after transfection. After 96 h, the cells were collected or fixed depending on further analysis.

Micronucleation assay

The MNi assay was performed as described (29) with minor changes. Quadruplicate cultures of cells in 24-well plates were exposed to different siRNAs or to scrambled siRNA as an NC. After 72 h of cultivation, Cytochalasin B was added to a final concentration of 4.5 µg/mL for 24 h. The cells were trypsinized and 5 × 10³ cells were spun down onto cytoslides (Shandon, # 5991056) at 1,000 rpm for 1 min in a Cytospin 3 (Shandon). The slides were air-dried for 5 min, fixed with Diff-Quick fixative for 5 min, stained in Diff-Quick solution C (Eosin Y) (Electron Microscopy Sciences, # 26096) for 10 s, rinsed in distilled water, and dried for 5 min. Coverslips were mounted with ProLong Diamond Antifade Mountant with DAPI (Thermofisher, # P36962). About 100 binucleated cells on each slide were scored for the presence of MNi.

Whole-genome high-throughput siRNA screen

A 384-well plate-based assay was optimized to identify siRNAs that influence CIN. Genome-wide libraries (Thermofisher) comprising 59,569 synthetic siRNAs targeting 18,658 unique human genes in total were arrayed in 384-well plates. Each well contained ~10 ng siRNA per gene with an average of 3 siRNAs per gene. Each plate also contained positive controls (PINK1 and PRKCE siRNAs) and negative siRNA controls. The library was introduced into the HT1080 dGFP/HAC (A1) cell line by a high-throughput transfection process (29). Ninety-six hours post-transfection, cells were fixed with a final concentration of 4% PFA, and the GFP and nuclear staining signals were quantified. All steps were performed using a Viafill liquid dispensing system.
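The library figures above can be sanity-checked with a few lines of arithmetic. The sketch below (Python, illustrative only) recomputes the mean number of siRNAs per gene from the stated totals and gives a lower bound on the number of 384-well plates. The number of control wells per plate is not stated in the text, so `control_wells_per_plate` is a hypothetical parameter, as are the function names.

```python
import math

N_SIRNAS = 59_569       # synthetic siRNAs in the library (from the text)
N_GENES = 18_658        # unique human genes targeted (from the text)
WELLS_PER_PLATE = 384   # plate format (from the text)


def mean_sirnas_per_gene(n_sirnas=N_SIRNAS, n_genes=N_GENES):
    """Average number of siRNAs per targeted gene."""
    return n_sirnas / n_genes


def min_plates(n_samples, control_wells_per_plate=0,
               wells_per_plate=WELLS_PER_PLATE):
    """Lower bound on plates needed for n_samples sample wells.

    control_wells_per_plate is an assumption of this sketch; the text does
    not say how many wells per plate were reserved for controls.
    """
    usable = wells_per_plate - control_wells_per_plate
    if usable <= 0:
        raise ValueError("no wells left for samples")
    return math.ceil(n_samples / usable)


print(f"mean siRNAs per gene: {mean_sirnas_per_gene():.2f}")
print("plates if every siRNA has its own well:", min_plates(N_SIRNAS))
print("plates if siRNAs are pooled per gene:", min_plates(N_GENES))
```

Whether siRNAs were arrayed individually or pooled per gene is not clear from this section, so both cases are printed.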

High-throughput imaging

Fixed and stained plates were imaged using an Opera Phenix high-content screening system with 20× water immersion objectives and a 16-bit sCMOS camera with pixel binning set to 2 (1,080 × 1,080 px). For the DAPI channel, a 405-nm laser source and a 435-480-nm bandpass acquisition filter were used. For the GFP channel, a 488-nm laser source and a 500-550-nm bandpass acquisition filter were used. The DAPI and GFP channels were acquired sequentially at a single focal plane in one field of view per well. Images were directly transferred to Columbus for the analysis pipeline.

High-content image analysis

The images were analyzed using the PerkinElmer Columbus Image Analysis System (2.9.1). Nuclei were segmented using the DAPI channel. A filter was applied to exclude nuclei that were cut off by the image border. The mean green fluorescence intensity in each nucleus was measured in the GFP channel. Cells with a mean GFP fluorescence intensity >200 AU, an empirically determined threshold that was kept constant for all plates in the screen, were classified as GFP+ cells. The percentage of GFP+ cells was used as a proxy for measuring HAC loss. Well-level data were exported as tab-separated text files for data normalization and hit selection.
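The classification step described above can be sketched in a few lines of Python. This is an illustration of the thresholding logic only, not the Columbus pipeline: the function name and the example intensities are hypothetical, and only the 200 AU threshold comes from the text.

```python
GFP_THRESHOLD_AU = 200.0  # empirically determined threshold stated in the text


def percent_gfp_positive(nuclear_gfp_means, threshold=GFP_THRESHOLD_AU):
    """Percentage of cells whose mean nuclear GFP intensity exceeds threshold.

    nuclear_gfp_means: one mean GFP intensity (AU) per segmented nucleus,
    after nuclei touching the image border have been excluded.
    """
    if not nuclear_gfp_means:
        raise ValueError("no nuclei in this well")
    positive = sum(1 for value in nuclear_gfp_means if value > threshold)
    return 100.0 * positive / len(nuclear_gfp_means)


# Hypothetical per-nucleus intensities for two wells (not data from the paper).
wells = {
    "negative_control": [850.0, 640.0, 120.0, 910.0, 700.0],
    "test_siRNA": [90.0, 150.0, 610.0, 180.0, 75.0],
}
for name, intensities in wells.items():
    print(name, f"{percent_gfp_positive(intensities):.1f}% GFP+")
```

A lower percentage of GFP+ cells relative to the negative control is then read as increased HAC loss.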

Drug screen

Selected compounds were acoustically dispensed by an Echo Acoustic Liquid Handler (Beckman Coulter Life Sciences) into 384-well PhenoPlates (Revvity). Each compound was plated at a 7-point concentration range with 1:3 dilution. Paclitaxel was used as a positive control. The HT1080 dGFP/HAC (A1) cell line was trypsinized and dispensed in 40 µL of growth medium using a Multidrop Combi dispenser at a density of 450 cells per well to allow compounds to be present during the exponential growth phase. Cells were fixed with a final concentration of 4% PFA, and the GFP and nuclear staining signals were quantified.
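A 7-point series with 1:3 dilution can be written out explicitly. The Python sketch below assumes a top concentration of 10 µM, taken from the default concentration named in the Fig. 7 legend; whether 10 µM was the top of every series is an assumption, and the function name is hypothetical.

```python
def dilution_series(top_conc_um, n_points=7, fold=3.0):
    """Concentrations (in µM) of an n-point serial dilution, highest first."""
    return [top_conc_um / fold ** step for step in range(n_points)]


def format_conc(conc_um):
    """Render a concentration in µM, or in nM when below 1 µM."""
    if conc_um >= 1.0:
        return f"{conc_um:.2f} µM"
    return f"{conc_um * 1000:.1f} nM"


# Assumed top concentration of 10 µM (see the lead-in above).
series = dilution_series(10.0)
print(", ".join(format_conc(c) for c in series))
```

Under this assumption the computed steps are close to the nominal labels on the Fig. 7 x axis (3 µM, 1 µM, 370 nM, 123 nM, 41 nM, 13 nM), which appear to be rounded.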

Overlap of the screen genes with essential genes

Post-Chronos CRISPR and Achilles common essential gene lists [files “CRISPR_common_essentials.csv” and “Achilles_common_essentials.csv” from DepMap v. 22Q2 (47)] were downloaded, and their overlap with the screen genes was plotted in R (“ggVennDiagram”) (84).
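The overlap itself is a set operation. The authors plotted it in R with ggVennDiagram; the Python sketch below is a stand-in that only computes the region sizes such a Venn diagram would display. The gene labels are placeholders, and the "SYMBOL (ID)" label format handled by `symbol` is an assumption about how the downloaded lists are written.

```python
def symbol(label):
    """Strip a trailing ' (ID)' suffix, e.g. 'GENE_C (3)' -> 'GENE_C'.

    The 'SYMBOL (ID)' format is an assumption of this sketch.
    """
    return label.split(" (")[0].strip()


def venn_counts(screen, crispr, achilles):
    """Sizes of the Venn regions that involve the screen gene set."""
    screen = {symbol(g) for g in screen}
    crispr = {symbol(g) for g in crispr}
    achilles = {symbol(g) for g in achilles}
    return {
        "screen_only": len(screen - crispr - achilles),
        "screen_and_crispr_only": len((screen & crispr) - achilles),
        "screen_and_achilles_only": len((screen & achilles) - crispr),
        "all_three": len(screen & crispr & achilles),
    }


# Placeholder gene labels (hypothetical, not real screen hits or essentials).
screen_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
crispr_essentials = ["GENE_C (3)", "GENE_D (4)", "GENE_E (5)"]
achilles_essentials = ["GENE_D (4)", "GENE_E (5)", "GENE_F (6)"]
counts = venn_counts(screen_genes, crispr_essentials, achilles_essentials)
print(counts)
```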

Correlation analysis coupled to GO term analysis

The main part of the statistical analysis was performed with R (85).

The list of chromosomal instability genes was converted into Uniprot IDs and matched to the IDs from Ref. (63). Missing values in the proteomic data (Table S2 of Ref. (63)) were imputed as zeros. Afterwards, Spearman correlations were calculated between vectors consisting of MaxLFQ values across the cell lines for different proteins; these are referred to in the manuscript as correlations between protein MaxLFQ values. NAs in the correlation matrix were substituted by zeros. GO terms of centromere, kinetochore, cell cycle, and cell division were matched to the protein list from the reference through Uniprot. Correlations of CIN proteins vs. correlations of all proteins from the cancer proteome dataset against proteins having the respective GO terms were calculated and plotted with the “ggridges” package (86). Correlations of individual chromosomal instability proteins against proteins marked or not marked by a respective GO term were also calculated and plotted accordingly. Fractions of correlations were plotted with ggplot2 (87).
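The correlation step can be illustrated with a small, dependency-free Python sketch. It is not the authors' R script (which is linked under Data Availability); it only mirrors the three operations named above: missing values imputed as zeros, Spearman correlation computed as the Pearson correlation of average ranks, and undefined correlations replaced by zero. All function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def spearman_zero_imputed(x, y):
    """Spearman correlation after imputing missing values (None) as zero."""
    x_filled = [0.0 if v is None else float(v) for v in x]
    y_filled = [0.0 if v is None else float(v) for v in y]
    rho = pearson(average_ranks(x_filled), average_ranks(y_filled))
    # Undefined correlations are set to zero, as described for the matrix.
    return 0.0 if math.isnan(rho) else rho


# Hypothetical MaxLFQ values for two proteins across four cell lines.
protein_a = [1.0, None, 3.0, 4.0]
protein_b = [4.0, 3.0, 2.0, 1.0]
print(spearman_zero_imputed(protein_a, protein_b))
```

Note that zero imputation treats an undetected protein as the lowest value in each vector, which affects the ranks and therefore the correlation.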

To compile a list of proteins that are synthetic lethal with proteins marked by the respective GO terms, the SynLethDB 2.0 database (65) was downloaded and filtered for synthetic lethal interactions detected in low-throughput experiments and CRISPR screens. The density ridgeline plots of correlations of CIN proteins or all proteins from the cancer proteome dataset against proteins that are synthetic lethal with GO term proteins, as well as fractions of correlations, were plotted as above.

Patient survival data collection and analysis

In this analysis, the RNA-seq expression profiles of cancer patients were obtained from cBioPortal (https://www.cbioportal.org/datasets). The survival analysis was conducted on six cancer types with the highest prevalence of chromosomal instability: TCGA-OV (n = 303), TCGA-COAD (n = 453), TCGA-LUAD (n = 507), TCGA-ACC (n = 79), BREAST METABRIC (n = 1980), and SCLC (U Cologne) (n = 77). The univariate overall survival analysis was performed using the Cox proportional hazards (Cox-PH) regression model from the “survival” package in R (v 4.2.3). Differences between the survival distributions of the high-risk and low-risk groups were assessed using the log-rank test and reported as the P-value and hazard ratio (HR). An HR > 1 indicates worse patient survival, an HR < 1 indicates improved survival, and an HR = 1 indicates no effect on survival. Patients were divided into high-risk and low-risk groups at the median cut-off, and the groups were represented graphically using Kaplan-Meier (KM) survival curves.
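The median split, Kaplan-Meier estimate, and log-rank comparison described above can be sketched without external libraries. The Python code below is a simplified stand-in for the R “survival” workflow, not the authors' analysis: it does not fit a Cox model or estimate an HR, ties at the median are assigned to the low group by assumption, and all names and example values are hypothetical.

```python
import math
from statistics import median


def median_split(expression):
    """Label each patient 'high' if above the cohort median, else 'low'."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(time, survival)] at each time with a death.

    events: 1 = death observed, 0 = censored.
    """
    curve, survival = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


def log_rank(times, events, groups, reference="low"):
    """Two-group log-rank test; returns (chi-square, P-value) with 1 df."""
    observed = expected = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)
        n_ref = sum(1 for u, g in zip(times, groups) if u >= t and g == reference)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d_ref = sum(1 for u, e, g in zip(times, events, groups)
                    if u == t and e == 1 and g == reference)
        observed += d_ref
        expected += d * n_ref / n
        if n > 1:
            variance += d * (n_ref / n) * (1 - n_ref / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of the chi-square distribution with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical cohort: expression, follow-up time (months), event indicator.
expression = [2.1, 8.4, 3.3, 9.0, 1.7, 7.2]
months = [14, 60, 22, 75, 9, 48]
died = [1, 0, 1, 1, 1, 0]
labels = median_split(expression)
print(kaplan_meier(months, died))
print(log_rank(months, died, labels))
```

In practice a maintained survival library would be preferable to this hand-rolled version, which is kept short to make the individual steps visible.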

Data Availability

All data for this manuscript are available in Supplementary materials. A script used for the correlation analysis in this manuscript is available at: https://github.com/NatashaKochanova/Chromosome-instability-correlation-analysis/.

Acknowledgments

We thank the Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression Fluorescence Imaging Facility (NIH) and, in particular, Dr. Karpova for instructions, consultations, and help with the use of an LSM-780 Zeiss microscopy imaging system. We thank Dr. Georg Kustatscher (University of Edinburgh) for his advice and guidance on correlation analysis. We thank Dr. Samuel Corless (University of Edinburgh) for his advice on DepMap database usage.

Supplementary Material

Supplementary material is available at PNAS Nexus online.

Funding

This work was supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research, USA (V.L. and N.K.), National Center for Advancing Translational Sciences (C.Y.C., Y.C.C., and C.C.C.), NIH intramural research grant FY21-NCI-03 (M.L., V.L.), NICHD intramural program ZIAHD008954 (M.D.), and a Wellcome Principal Research Fellowship by Wellcome (W.C.E.; grant number 107022).

Author Contributions

Mikhail Liskovykh (Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing-original draft, Project administration, Writing-review & editing), Natalia Y. Kochanova (Data curation, Formal analysis, Writing-original draft, Writing-review & editing), Chih-Yuan Chiang (Data curation, Formal analysis), Anjali Dhall (Data curation, Formal analysis), Vasilisa Aksenova (Data curation), Yu-Chi Chen (Data curation, Formal analysis), William C. Reinhold (Data curation, Formal analysis, Supervision, Validation, Writing-review & editing), Mary Dasso (Supervision), Anish Thomas (Supervision, Writing-review & editing), Ken Chih-Chien Cheng (Supervision), Yves Pommier (Supervision, Writing-review & editing), William C. Earnshaw (Supervision, Funding acquisition, Writing-review & editing), Vladimir Larionov (Conceptualization, Supervision, Funding acquisition, Writing-review & editing), and Natalay Kouprina (Supervision, Writing-original draft).

References

1 Sansregret L, Vanhaesebroeck B, Swanton C. 2018. Determinants and clinical implications of chromosomal instability in cancer. Nat Rev Clin Oncol. 15:139-150.

2 Bach D-H, Zhang W, Sood AK. 2019. Chromosomal instability in tumor initiation and development. Cancer Res. 79:3995-4002.

3 Rohrback S, Siddoway B, Liu CS, Chun J. 2018. Genomic mosaicism in the developing and adult brain. Dev Neurobiol. 78:1026-1048.

4 Burrell RA, et al. 2013. Replication stress links structural and numerical cancer chromosomal instability. Nature. 494:492-496.

5 McManus KJ, Barrett IJ, Nouhi Y, Hieter P. 2009. Specific synthetic lethal killing of RAD54B-deficient human colorectal cancer cells by FEN1 silencing. Proc Natl Acad Sci U S A. 106:3276-3281.

6 Lentini L, Amato A, Schillaci T, Di Leonardo A. 2007. Simultaneous Aurora-A/STK15 overexpression and centrosome amplification induce chromosomal instability in tumour cells with a MIN phenotype. BMC Cancer. 7:212.

7 Cheng X, Shen Z, Yang J, Lu S-H, Cui Y. 2008. ECRG2 disruption leads to centrosome amplification and spindle checkpoint defects contributing chromosome instability. J Biol Chem. 283:5888-5898.

8 Bakhoum SF, Genovese G, Compton DA. 2009. Deviant kinetochore microtubule dynamics underlie chromosomal instability. Curr Biol. 19:1937-1942.

9 Green RA, Kaplan KB. 2003. Chromosome instability in colorectal tumor cells is associated with defects in microtubule plus-end attachments caused by a dominant mutation in APC. J Cell Biol. 163:949-961.

10 Sajesh BV, Lichtensztejn Z, McManus KJ. 2013. Sister chromatid cohesion defects are associated with chromosome instability in Hodgkin lymphoma cells. BMC Cancer. 13:391.

11 Barber TD, et al. 2008. Chromatid cohesion defects may underlie chromosome instability in human colorectal cancers. Proc Natl Acad Sci U S A. 105:3443-3448.

12 Jallepalli PV, Lengauer C. 2001. Chromosome segregation and cancer: cutting through the mystery. Nat Rev Cancer. 1:109-117.

13 Wirth KG, et al. 2006. Separase: a universal trigger for sister chromatid disjunction but not chromosome cycle progression. J Cell Biol. 172:847-860.

14 Cahill DP, et al. 1998. Mutations of mitotic checkpoint genes in human cancers. Nature. 392:300-303.

15 Tutaj H, Pogoda E, Tomala K, Korona R. 2019. Gene overexpression screen for chromosome instability in yeast primarily identifies cell cycle progression genes. Curr Genet. 65:483-492.

16 Engeland K. 2018. Cell cycle arrest through indirect transcriptional repression by p53: I have a DREAM. Cell Death Differ. 25:114-132.

17 Bakhoum SF, Kabeche L, Murnane JP, Zaki BI, Compton DA. 2014. DNA-damage response during mitosis induces whole-chromosome missegregation. Cancer Discov. 4:1281-1289.

18 Bohly N, et al. 2022. Increased replication origin firing links replication stress to whole chromosomal instability in human cancer. Cell Rep. 41:111836.

19 Bakhoum SF, Thompson SL, Manning AL, Compton DA. 2009. Genome stability is ensured by temporal control of kinetochore-microtubule dynamics. Nat Cell Biol. 11:27-35.

20 Hong C, et al. 2022. cGAS-STING drives the IL-6-dependent survival of chromosomally instable cancers. Nature. 607:366-373.

21 Pan H, et al. 2021. Discovery of candidate DNA methylation cancer driver genes. Cancer Discov. 11:2266-2281.

22 Spencer F, Gerring SL, Connelly C, Hieter P. 1990. Mitotic chromosome transmission fidelity mutants in Saccharomyces cerevisiae. Genetics. 124:237-249.

23 Yuen KWY, et al. 2007. Systematic genome instability screens in yeast and their potential relevance to cancer. Proc Natl Acad Sci U S A. 104:3925-3930.

24 Kouprina NY, Pashina OB, Nikolaishwili NT, Tsouladze AM, Larionov VL. 1988. Genetic control of chromosome stability in the yeast Saccharomyces cerevisiae. Yeast. 4:257-269.

25 Larionov VL, et al. 1987. The stability of chromosomes in yeast. Curr Genet. 11:435-443.

26 Hoyt MA, Stearns T, Botstein D. 1990. Chromosome instability mutants of Saccharomyces cerevisiae that are defective in microtubule-mediated processes. Mol Cell Biol. 10:223-234.

27 Gisler S, Maia ARR, Chandrasekaran G, Kopparam J, van Lohuizen M. 2020. A genome-wide enrichment screen identifies NUMA1-loss as a resistance mechanism against mitotic cell-death induced by BMI1 inhibition. PLoS One. 15:e0227592.

28 Sears DD, Hegemann JH, Hieter P. 1992. Meiotic recombination and segregation of human-derived artificial chromosomes in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 89:5296-5300.

29 Liskovykh M, et al. 2019. A novel assay to screen siRNA libraries identifies protein kinases required for chromosome transmission. Genome Res. 29:1719-1732.

30 Nakano M, et al. 2008. Inactivation of a human kinetochore by specific targeting of chromatin modifiers. Dev Cell. 14:507-522.

31 Lee H-S, et al. 2013. A new assay for measuring chromosome instability (CIN) and identification of drugs that elevate CIN in cancer cells. BMC Cancer. 13:252.

32 Lee H-S, et al. 2016. Effects of anticancer drugs on chromosome instability and new clinical implications for tumor-suppressing therapies. Cancer Res. 76:902-911.

33 Kouprina N, et al. 2018. Human artificial chromosome with regulated centromere: a tool for genome and cancer studies. ACS Synth Biol. 7:1974-1989.

34 Kouprina N, Tomilin AN, Masumoto H, Earnshaw WC, Larionov V. 2014. Human artificial chromosome-based gene delivery vectors for biomedicine and biotechnology. Expert Opin Drug Deliv. 11:517-535.

35 Kim J-H, et al. 2011. Human artificial chromosome (HAC) vector with a conditional centromere for correction of genetic deficiencies in human cells. Proc Natl Acad Sci U S A. 108:20048-20053.

36 Kononenko AV, et al. 2014. A portable BRCA1-HAC (human artificial chromosome) module for analysis of BRCA1 tumor suppressor function. Nucleic Acids Res. 42:e164.

37 Kouprina N, Lee NCO, Kononenko AV, Samoshkin A, Larionov V. 2015. From selective full-length genes isolation by TAR cloning in yeast to their expression from HAC vectors in human cells. Methods Mol Biol. 1227:3-26.

38 Liskovykh M, et al. 2023. Actively transcribed rDNA and distal junction (DJ) sequence are involved in association of NORs with nucleoli. Cell Mol Life Sci. 80:121.

39 Kononenko AV, et al. 2015. Generation of a conditionally self-eliminating HAC gene delivery vector through incorporation of a tTAVP64 expression cassette. Nucleic Acids Res. 43:e57.

40 Pesenti E, et al. 2018. Generation of a synthetic human chromosome with two centromeric domains for advanced epigenetic engineering studies. ACS Synth Biol. 7:1116-1130.

41 Pesenti E, et al. 2020. Analysis of complex DNA rearrangements during early stages of HAC formation. ACS Synth Biol. 9:3267-3287.

42 Logsdon GA, et al. 2019. Human artificial chromosomes that bypass centromeric DNA. Cell. 178:624-639.e19.

43 Liskovykh M, et al. 2015. Stable maintenance of de novo assembled human artificial chromosomes in embryonic stem cells and their differentiated progeny in mice. Cell Cycle. 14:1268-1273.

44 Ponomartsev SV, et al. 2020. Human alphoid(tetO) artificial chromosome as a gene therapy vector for the developing hemophilia A model in mice. Cells. 9:879.

45 Sinenko SA, et al. 2018. Transfer of synthetic human chromosome into human induced pluripotent stem cells for biomedical applications. Cells. 7:261.

46 Tsherniak A, et al. 2017. Defining a cancer dependency map. Cell. 170:564-576.e16.

47 Broad DepMap. 2022. DepMap 22Q2 Public. figshare. Dataset.

48 Hall A, Marshall CJ, Spurr NK, Weiss RA. 1983. Identification of transforming gene in two human sarcoma cell lines as a new member of the ras gene family located on chromosome 1. Nature. 303:396-400.

49 Anderson MJ, Fasching CL, Stanbridge EJ, Casey G. 1994. Evidence that wild-type TP53, and not genes on either chromosome 1 or 11, controls the tumorigenic phenotype of the human fibrosarcoma HT1080. Genes Chromosomes Cancer. 9:266-281.

50 Paulson TG, Almasan A, Brody LL, Wahl GM. 1998. Gene amplification in a p53-deficient cell line requires cell cycle progression under conditions that generate DNA breakage. Mol Cell Biol. 18:3089-3100.

51 Ito T, et al. 1993. In vitro irradiation is able to cause RET oncogene rearrangement. Cancer Res. 53:2940-2943.

52 Baird TD, et al. 2018. ICE1 promotes the link between splicing and nonsense-mediated mRNA decay. Elife. 7:e33178.

53 Söhle J, et al. 2012. Identification of new genes involved in human adipogenesis and fat storage. PLoS One. 7:e31193.

54 Moser R, et al. 2014. Functional kinomics identifies candidate therapeutic targets in head and neck cancer. Clin Cancer Res. 20:4274-4288.

55 Duffy S, et al. 2016. Overexpression screens identify conserved dosage chromosome instability genes in yeast and human cancer. Proc Natl Acad Sci U S A. 113:9967-9976.

56 Bartoszewski R, Sikorski AF. 2019. Editorial focus: understanding off-target effects as the key to successful RNAi therapy. Cell Mol Biol Lett. 24:69.

57 Friedrich M, Aigner A. 2022. Therapeutic siRNA: state-of-the-art and future perspectives. BioDrugs. 36:549-571.

58 Meyers RM, et al. 2017. Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat Genet. 49:1779-1784.

59 Dempster JM, et al. Extracting biological insights from the project Achilles genome-scale CRISPR screens in cancer cell lines. bioRxiv 720243, 31 July 2019, preprint: not peer reviewed.

60 Dempster JM, et al. 2021. Chronos: a cell population dynamics model of CRISPR experiments that improves inference of gene fitness effects. Genome Biol. 22:343.

61 Pacini C, et al. 2021. Integrated cross-study datasets of genetic dependencies in cancer. Nat Commun. 12:1661.

62 Fenech M. 2006. Cytokinesis-block micronucleus assay evolves into a “cytome” assay of chromosomal instability, mitotic dysfunction and cell death. Mutat Res. 600:58-66.

63 Goncalves E, et al. 2022. Pan-cancer proteomic map of 949 human cell lines. Cancer Cell. 40:835-849.e8.

64 Walker MG. 2001. Drug target discovery by gene expression analysis: cell cycle genes. Curr Cancer Drug Targets. 1:73-83.

65 Wang J, et al. 2022. SynLethDB 2.0: a web-based knowledge graph database on synthetic lethality for novel anticancer drug discovery. Database (Oxford). 2022:baac030.

66 Quinton RJ, et al. 2021. Whole-genome doubling confers unique genetic vulnerabilities on tumour cells. Nature. 590:492-497.

67 Nguyen B, et al. 2022. Genomic characterization of metastatic patterns from prospective clinical sequencing of 25,000 patients. Cell. 185:563-575.e11.

68 Bielski CM, et al. 2018. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat Genet. 50:1189-1195.

69 Janssen A, Kops GJPL, Medema RH. 2009. Elevating the frequency of chromosome mis-segregation as a strategy to kill tumor cells. Proc Natl Acad Sci U S A. 106:19108-19113.

70 Swanton C, et al. 2009. Chromosomal instability determines taxane response. Proc Natl Acad Sci U S A. 106:8671-8676.

71 Collins SR, et al. 2007. Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature. 446:806-810.

72 Daniel JA, Keyes BE, Ng YPY, Freeman CO, Burke DJ. 2006. Diverse functions of spindle assembly checkpoint genes in Saccharomyces cerevisiae. Genetics. 172:53-65.

73 Orr B, Bousbaa H, Sunkel CE. 2007. Mad2-independent spindle assembly checkpoint activation and controlled metaphase-anaphase transition in Drosophila S2 cells. Mol Biol Cell. 18:850-863.

74 Stirling PC, et al. 2011. The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet. 7:e1002057.

75 Stirling PC, et al. 2012. Mutability and mutational spectrum of chromosome transmission fidelity genes. Chromosoma. 121:263-275.

76 Hutchins JRA, et al. 2010. Systematic analysis of human protein complexes identifies chromosome segregation proteins. Science. 328:593-599.

77 Neumann B, et al. 2010. Phenotypic profiling of the human genome by time-lapse microscopy reveals cell division genes. Nature. 464:721-727.

78 Liu Y, Su Z, Tavana O, Gu W. 2024. Understanding the complexity of p53 in a new era of tumor suppression. Cancer Cell. 42:946-967.

79 Dhanasekaran R, et al. 2022. The MYC oncogene - the grand orchestrator of cancer growth and immune evasion. Nat Rev Clin Oncol. 19:23-36.

80 Mitra S, et al. 2020. Genetic screening identifies a SUMO protease dynamically maintaining centromeric chromatin. Nat Commun. 11:501.

81 Mukhopadhyay D, Arnaoutov A, Dasso M. 2010. The SUMO protease SENP6 is essential for inner kinetochore assembly. J Cell Biol. 188:681-692.

82 Zheng F, et al. 2023. PINK1-PTEN axis promotes metastasis and chemoresistance in ovarian cancer via non-canonical pathway. J Exp Clin Cancer Res. 42:295.

83 Connell NT, Berliner N. 2019. Fostamatinib for the treatment of chronic immune thrombocytopenia. Blood. 133:2027-2030.

84 Gao CH, Dusa A. 2024. ggVennDiagram: a ‘ggplot2’ implement of venn diagram.

85 R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, 2022.

86 Wilke C. 2024. ggridges: ridgeline plots in ‘ggplot2’.

87 Wickham H. ggplot2: elegant graphics for data analysis. Springer-Verlag, New York, 2016.
