THE LANCET Diabetes & Endocrinology

Supplementary appendix

This appendix formed part of the original submission and has been peer reviewed. We post it as supplied by the authors.

Supplement to: Bancos I, Taylor A E, Chortis V, et al. Urine steroid metabolomics for the differential diagnosis of adrenal incidentalomas in the EURINE-ACT study: a prospective test validation study. Lancet Diabetes Endocrinol 2020; published online July 23. https://doi.org/10.1016/S2213-8587(20)30218-7.

Supplementary Appendix

“Urine steroid metabolomics in the differential diagnosis of adrenal incidentalomas: A prospective test validation study”

List of ENSAT EURINE-ACT Investigators (listed in alphabetical order by country and institution)

Australia

· School of Computing and Information, University of Melbourne, Melbourne, Australia (Stephan Glöckner, Richard O. Sinnott, Anthony Stell)

Brazil

· Adrenal Unit, Division of Endocrinology and Metabolism, Hospital das Clinicas, University of São Paulo Medical School, Institute of Cancer of São Paulo, São Paulo, Brazil (Maria Candida B. V. Fragoso)

Croatia

· Department of Endocrinology, University Hospital Centre Zagreb, Zagreb, Croatia (Darko Kastelan, Ivana Dora Pupovac, Bojana Simunov)

France

· Department of Endocrinology, Hôpital Haut Lévêque, CHU de Bordeaux, Pessac, France (Sarah Cazenave, Magalie Haissaguerre, Antoine Tabarin)

· National Expert Centre for Rare Adrenal Cancers, Cochin Hospital, Institut Cochin, Institut National de la Santé et de la Recherche Médicale Unité 1016, René Descartes University, Paris, France (Jérôme Bertherat, Rossella Libé)

Germany

· Endocrinology in Charlottenburg, Berlin, Germany (Tina Kienitz, Marcus Quinkler)

· Institute of Clinical Chemistry and Laboratory Medicine, University Hospital Carl Gustav Carus, Technical University, Dresden, Germany (Katharina Langton, Graeme Eisenhofer)

· Medizinische Klinik and Poliklinik IV, Ludwig-Maximilians-Universität München, Munich, Germany (Felix Beuschlein, Christina Brugger, Martin Reincke, Anna Riester, Ariadni Spyroglou)

· Division of Endocrinology and Diabetes, Department of Internal Medicine I, University Hospital, University of Würzburg, and Comprehensive Cancer Centre Mainfranken, University of Würzburg, Würzburg, Germany (Stephanie Burger-Stritt, Timo Deutschbein, Martin Fassnacht, Stefanie Hahner, Matthias Kroiss, Cristina L. Ronchi)

Greece

· Department of Endocrinology, Diabetes and Metabolism, Evangelismos Hospital, Athens, Greece (Sotiria Palimeri, Stylianos Tsagarakis, Ioanna Tsirou, Dimitra Vassiliadi)

Italy

· Department of Clinical and Biological Sciences, San Luigi Hospital, University of Turin, Turin, Italy (Vittoria Basile, Elisa Ingargiola, Giuseppe Reimondo, Massimo Terzolo)

· Department of Experimental and Clinical Biomedical Sciences, University of Florence, Florence, Italy (Letizia Canu, Massimo Mannelli)

The Netherlands

· Department of Internal Medicine, Maxima Medisch Centrum, Eindhoven, The Netherlands (Hester Ettaieb, Harm R. Haak, Thomas M. Kerkhofs)

· Department of Health Services Research, and CAPHRI School for Public Health and Primary Care, Maastricht University, The Netherlands (Harm R. Haak)

· Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, Groningen, The Netherlands (Michael Biehl)

· Department of Internal Medicine, Division of Endocrinology, Erasmus Medical Centre, University Medical Centre Rotterdam, Rotterdam, The Netherlands (Richard A. Feelders, Johannes Hofland, Leo J. Hofland)

Norway

· Department of Clinical Science, University of Bergen, and Department of Medicine, Haukeland University Hospital, Bergen, Norway (Marianne A. Grytaas, Eystein S. Husebye, Grethe A. Ueland)

Poland

· Department of Internal Medicine and Endocrinology, Medical University of Warsaw, Warsaw, Poland (Urszula Ambroziak, Tomasz Bednarczuk, Agnieszka Kondracka, Magdalena Macech, Malgorzata Zawierucha)

Portugal

· Department of Endocrinology, University Hospital of Coimbra, Coimbra, Portugal (Isabel Paiva)

Republic of Ireland

· School of Medicine, National University of Ireland Galway (NUIG), Galway, Republic of Ireland (M. Conall Dennedy, Ahmed Sajwani)

· Department of Endocrinology, Beaumont Hospital, Dublin, and the Royal College of Surgeons in Ireland, Dublin, Republic of Ireland (Mark Sherlock)

· Department of Endocrinology, St. Vincent’s University Hospital, Dublin, and School of Medicine, University College Dublin, Dublin, Republic of Ireland (Rachel K. Crowley)

Serbia

· Department for Obesity, Reproductive and Metabolic Disorders, Clinic for Endocrinology, Diabetes and Metabolic Diseases, Clinical Centre of Serbia, Faculty of Medicine, University of Belgrade, Belgrade, Serbia (Miomira Ivovic, Ljiljana Marina)

United Kingdom

· Institute of Applied Health Research, University of Birmingham, Birmingham, UK (Jonathan J. Deeks, Alice J. Sitch)

· Institute of Metabolism and Systems Research, University of Birmingham, and Centre for Endocrinology, Diabetes and Metabolism, Birmingham Health Partners, Birmingham, UK (Wiebke Arlt, Irina Bancos, Vasileios Chortis, Lorna C. Gilligan, Beverly A. Hughes, Katharina Lang, Hannah E. Ivison, Carl Jenkinson, Konstantinos Manolopoulos, Donna M. O’Neil, Michael W. O’Reilly, Thomas G. Papathomas, Alessandro Prete, Cristina L. Ronchi, Cedric H.L. Shackleton, Angela E. Taylor)

· Department of Endocrinology, Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK (Wiebke Arlt, Miriam Asia, Vasileios Chortis, Katharina Lang, Konstantinos N. Manolopoulos, Michael W. O’Reilly, Alessandro Prete, Cristina L. Ronchi)

· Department of Hepato-Pancreato-Biliary and Liver Transplant Surgery, Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK (Robert P. Sutcliffe)

· Department of Radiology, Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK (Peter Guest)

· Department of Pathology, Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK (Kassiani Skordilis)

United States of America

· Division of Endocrinology, Metabolism and Nutrition, Mayo Clinic, Rochester, MN, USA (Irina Bancos, Cristian Bancos, Alice Chang, Caroline J. Davidge-Pitts, Danae A. Delivanis, Dana Erickson, Neena Natt, Todd B. Nippoldt, Melinda Thomas, William F. Young Jr.)

· UCSF Benioff Children’s Hospital Oakland Research Institute, Oakland, California, CA, USA (Cedric H.L. Shackleton)

Supplementary Methods

Sample size calculation

The study had a recruitment target of 2000 participants at an expected ACC rate of 5%, determined by sample size calculations based on the results of the proof-of-principle study1. Based on a conservative estimate that 5% of tumours would be ACC, observing 100 ACC cases would allow a sensitivity of 95% to be estimated with a 95% confidence interval of width less than 10%. This sample size would have over 99% power to detect a difference of 3% (87% vs 90% assuming positive correlation of errors) in specificity at the 5% significance level, allowing for 10% loss to follow-up, thus would provide ample precision to be able to estimate benefits through reduced false positives. The prevalence of ACC in the final data was 4.9% (98/2017). Final recruitment was 2169 with 2017 used in the final analysis, due to exclusion of selectively recruited patients and samples lost during processing, storage or transport.
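The precision claim above can be checked numerically. The following is a minimal sketch, assuming the exact (Clopper-Pearson) binomial interval named in the Statistical analysis section is the method meant here; the function names are illustrative and not taken from the study code.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _solve_decreasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so p must increase
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(k, n, alpha=0.05):
    """Two-sided Clopper-Pearson interval for k successes in n trials."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# 100 ACC cases with an observed sensitivity of 95% (95 of 100 detected)
lower, upper = exact_binomial_ci(95, 100)
width = upper - lower
print(f"95% CI: {lower:.3f} to {upper:.3f}, width {width:.3f}")

# Observed prevalence in the final analysis cohort
prevalence = 98 / 2017
print(f"ACC prevalence: {prevalence:.1%}")
```

Under these assumptions the interval width stays below 10 percentage points, in line with the statement above.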

Patient recruitment

The European Network for the Study of Adrenal Tumours (ENSAT) database (https://registry.ensat.org) was established in 2008 and serves as a complete virtual research environment for adrenal tumour research, with password-protected access to the database by all registered full ENSAT members who have approval of their local ethics committee. Access to the ENSAT database occurs through a unified, security-driven portal that allows targeted upload of pseudonymised patient data. In the EURINE-ACT study, all clinical data were prospectively collected and recorded in the ENSAT registry. Variables included demographic data, mode of adrenal tumour discovery and clinical presentation, tumour diameter and imaging characteristics of the adrenal mass, results of endocrine testing, clinical and radiological follow-up data, data on surgery and histopathology, and availability of biomaterial. For bilateral adrenal masses, we selected the larger tumour diameter and the more unfavourable imaging characteristic for analysis. Histopathology and radiological assessment were carried out and recorded locally; all participating referral centres had local access to specialist radiologists and histopathologists highly experienced in the diagnosis and differential diagnosis of adrenal tumours. A diagnosis of ACC was made based on multifactorial scoring systems set out for the diagnosis of adrenal cortical carcinoma in the WHO Classification of Tumours of Endocrine Organs2, including a Weiss Score of 3 or above for conventional adrenocortical carcinomas, in accordance with the Histopathology Reporting Guide for Carcinoma of the Adrenal Cortex by the International Collaboration on Cancer Reporting3.

Prior to enrolment in EURINE-ACT, all participants underwent biochemical exclusion for the presence of pheochromocytoma4. All participants underwent standardised endocrine assessment for exclusion of clinically overt Cushing’s syndrome and primary aldosteronism, which were diagnosed according to standard guidelines5,6. We recorded the presence of mild autonomous cortisol secretion in patients with (1) a lack of clinical features indicative of overt Cushing’s syndrome (e.g. proximal myopathy, dorsocervical and supraclavicular fat pads, broad striae), and (2) failure to suppress morning serum cortisol to less than 50 nmol/L (1.8 µg/dL) after administration of 1 mg dexamethasone orally at 11 pm the preceding night (1 mg dexamethasone suppression test), as defined by recent guidelines7. Each patient provided a 24-hour urine sample and the volume of the 24-h collection was recorded. The samples were aliquoted in the recruitment centre on the day of collection and stored locally at -20℃ before transport on dry ice to the University of Birmingham, UK, for mass spectrometry analysis in the Steroid Metabolome Analysis Core of the Institute of Metabolism and Systems Research. Upon receipt, samples were catalogued and compared to the sample list sent by the local recruitment centre and transferred to -20℃ storage until analysis. 24-h urines were accepted as accurately collected if their total collection volume was >1000 mL and/or if 24-h urinary creatinine excretion was within the reference range.
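The two numerical criteria in this paragraph can be written down compactly. This is an illustrative sketch only: the function names are hypothetical, the molar mass of cortisol (362.46 g/mol) is used for the unit conversion, and "and/or" in the acceptance rule is read as a logical OR.

```python
CORTISOL_MOLAR_MASS_G_PER_MOL = 362.46  # molar mass of cortisol

def cortisol_nmol_l_to_ug_dl(nmol_per_l):
    """Convert serum cortisol from nmol/L to µg/dL."""
    ng_per_l = nmol_per_l * CORTISOL_MOLAR_MASS_G_PER_MOL
    return ng_per_l / 1000.0 / 10.0  # ng -> µg, then per L -> per dL

def fails_dexamethasone_suppression(post_dex_cortisol_nmol_l):
    """True if morning cortisol is NOT suppressed to less than 50 nmol/L."""
    return post_dex_cortisol_nmol_l >= 50.0

def has_mild_autonomous_cortisol_secretion(post_dex_cortisol_nmol_l,
                                           overt_cushing_features):
    """Criteria (1) and (2) from the text, combined."""
    return (not overt_cushing_features
            and fails_dexamethasone_suppression(post_dex_cortisol_nmol_l))

def collection_accepted(volume_ml, creatinine_within_reference_range):
    """Accept if volume > 1000 mL and/or creatinine is in the reference range."""
    return volume_ml > 1000 or creatinine_within_reference_range

threshold_ug_dl = cortisol_nmol_l_to_ug_dl(50)
print(f"50 nmol/L = {threshold_ug_dl:.1f} µg/dL")
```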

Urinary steroid metabolite profiling by liquid chromatography-tandem mass spectrometry

From each 24-h urine collection, we aliquoted 400 µL of urine and added 40 µL of internal standard solution (10 µg/mL), containing deuterated steroid standards (DHEA-d6, Cortisol-d4 [Sigma Aldrich, Gillingham, UK]; Etio-d5, THE-d5, THS-d5 [Isosciences, Ambler, USA]). To negate dilution effects, if the 24-h urine collection volume exceeded 2500 mL, we increased the sample volume to 800 µL of urine. Samples were then hydrolysed to release the steroids from their sulfate and glucuronide conjugates, after addition of 440 µL deconjugation mixture, containing 0.2 M acetate buffer (prepared at pH 4.8-5.0), 3.3 mg/mL ascorbate, and 67 U/mL of a sulfatase/glucuronidase enzyme mix derived from Helix pomatia (Sigma Aldrich, Gillingham, UK), followed by heating at 60℃ for 3 hours. Thereafter, the solution was allowed to cool, followed by solid phase extraction using Sep Pak C18 cartridges (96-well plate, 100 mg sorbent per cartridge [Biotage, Hengoed, UK]). The cartridges were washed with 1 mL LC-MS grade methanol (Greyhound Chromatography, Birkenhead, UK) and 1 mL LC-MS grade water (Fisher Scientific, Loughborough, UK). Next, the urine sample was passed over the cartridge and a further wash was performed with 1 mL LC-MS grade water. Following this, steroids were eluted with 1 mL of LC-MS grade methanol. This steroid fraction was dried under nitrogen at 55℃ and reconstituted in 125 µL of 50/50 LC-MS grade methanol/water, vortexed for 5 minutes and centrifuged at 1793 x g prior to mass spectrometry analysis.

A Waters Xevo mass spectrometer with an ACQUITY ultra high performance liquid chromatography (UPLC) system with a HSS T3, 1.8 µm, 1.2x50 mm column (heated at 60℃) was used to analyse the steroids. A 20 µL sample injection volume was used. The mobile phase consisted of LC-MS grade methanol and water, both with 0.1% formic acid. Elution of steroids was achieved at a flow rate of 600 µL per minute, starting at 45% methanol held for one minute, followed by a linear gradient to 80% methanol at 8.5 minutes. The column was then washed at 98% methanol and re-equilibrated at the starting gradient prior to the next injection.
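For orientation, the elution part of the gradient can be expressed as a piecewise-linear function of time. This is a sketch of the stated programme only; the function name is hypothetical, and the durations of the 98% wash and re-equilibration steps are not given in the text, so they are deliberately left out.

```python
def methanol_percent(t_min):
    """Methanol percentage during elution at time t_min (minutes after injection).

    45% is held for the first minute, then a linear ramp reaches 80% at 8.5 min.
    Times outside 0 to 8.5 min are rejected because the wash and
    re-equilibration timings are not specified in the text.
    """
    if t_min < 0 or t_min > 8.5:
        raise ValueError("only the 0 to 8.5 min elution window is described")
    if t_min <= 1.0:
        return 45.0
    return 45.0 + (80.0 - 45.0) * (t_min - 1.0) / (8.5 - 1.0)

FLOW_RATE_UL_PER_MIN = 600
elution_volume_ml = FLOW_RATE_UL_PER_MIN * 8.5 / 1000.0  # mobile phase used up to 8.5 min

for t in (0.0, 1.0, 4.75, 8.5):
    print(f"t = {t:4.2f} min: {methanol_percent(t):.1f}% methanol")
```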

All 15 steroids (Suppl. Fig. 2) were detected in positive ionisation mode. For positive identification and quantification of a steroid, the analyte had to have two matching multiple reaction monitoring (MRM) mass transitions (precursor/product transitions) and an identical retention time relative to an authentic steroid standard. Steroids were quantified against a calibration series using standard concentrations of each steroid standard ranging from 10 to 5000 ng/mL, with inclusion of a blank, prepared in steroid-free synthetic urine matrix (Sigma, Gillingham, UK) and processed as above. Each steroid concentration was calculated relative to an assigned internal standard. Prior to analysis, we had validated the method by assessing specificity, sensitivity, accuracy, precision, linearity, limit of detection (LOD), limit of quantification (LOQ), reproducibility, absolute recovery, and matrix effects (Suppl. Table 3).

Machine learning and classification algorithm

In order to obtain a classifier system for the discrimination of ACC from ACA, we had analysed an independent, retrospectively collected data set comprising 24-h urine steroid excretion data from 139 patients (99 ACA, 40 ACC) (Taylor AE et al., unpublished, available on request). Steroid excretion in those 139 retrospectively collected patients had been measured by the same LC-MS/MS method employed in this study, yielding values for 15 distinct urinary steroid metabolites (Suppl. Table 2), including seven steroids previously described as part of the “malignant steroid fingerprint” identified by machine learning analysis of steroid data obtained by gas chromatography-mass spectrometry1. All steroid excretion values were log-transformed and subsequently z-score normalised with respect to the means and standard deviations observed in the data set. The resulting set of 15-dimensional vectors $x=(x_1, x_2, \ldots, x_{15})$, together with the class membership, served as input for the machine learning analysis. We employed a variant of Learning Vector Quantization (LVQ)8 for the computational analysis, Generalised Matrix Relevance LVQ (GMLVQ)9-11, which represents classes in terms of typical prototypes wACA and wACC. For the comparison of a specific vector x with a prototype $w=(w_1, w_2, \ldots, w_{15})$, a distance measure of the form $d(x,w)=\sum_{i,j} (x_i-w_i)\, A_{ij}\, (x_j-w_j)$ is employed. The prototypes wACA and wACC as well as the matrix A of coefficients $A_{ij}$ are determined in a cost function based training process; mathematical details of this approach have been previously described10-12 and it has been applied to multi-steroid data in our urine steroid metabolomics proof-of-principle study1. We employed a publicly available implementation of GMLVQ using default parameters9. The training process was repeated for 1000 randomly selected subsets of 90% of the data and wACA, wACC and A were obtained as averages over these 1000 runs.
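The preprocessing and the distance measure described above can be sketched in a few lines. This is not the implementation used in the study: the function names are hypothetical, the natural logarithm is an assumption (the base is not stated), and the example uses two-dimensional toy vectors rather than the 15 steroid metabolites.

```python
import math

def log_zscore(values, means, sds):
    """Log-transform each value, then z-score it with training means and SDs.

    Assumption: natural logarithm; means and SDs refer to the log-transformed
    training data.
    """
    return [(math.log(v) - m) / s for v, m, s in zip(values, means, sds)]

def gmlvq_distance(x, w, A):
    """d(x, w) = sum over i, j of (x_i - w_i) * A_ij * (x_j - w_j)."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    n = len(diff)
    return sum(diff[i] * A[i][j] * diff[j] for i in range(n) for j in range(n))

# Toy example in two dimensions (values are illustrative, not study data)
x_toy = [1.0, 2.0]
w_toy = [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
relevance = [[2.0, 1.0], [1.0, 2.0]]

d_identity = gmlvq_distance(x_toy, w_toy, identity)    # squared Euclidean distance
d_relevance = gmlvq_distance(x_toy, w_toy, relevance)  # weighted by the matrix A
print(d_identity, d_relevance)
```

With the identity matrix the measure reduces to the squared Euclidean distance; off-diagonal entries of A let pairs of steroids contribute jointly.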

In contrast to other machine learning algorithms, GMLVQ yields an interpretable, white box classification algorithm10. Numerical values of the coefficients wACA, wACC and Aij as well as the parameters of the z-score transformation are available upon request from the corresponding author.

In this paper, we followed to a large extent the set-up and methodology of our previous analysis of GC-MS steroid data1, in which we provided proof-of-principle for the urine steroid metabolomics approach and also demonstrated that the GMLVQ approach was superior to other statistical or machine learning approaches, namely logistic regression or linear discriminant analysis (LDA).

Similar to our paper utilizing GMLVQ analysis of GC-MS steroid excretion data1, machine learning analysis of the 24-h urinary steroid metabolite excretion in the retrospective cohort (99 ACA, 40 ACC) identified the 11-deoxycortisol metabolite THS as the steroid most relevant for the differentiation of ACC from ACA, followed by the pregnenolone and 17-hydroxypregnenolone metabolites 5-PD and 5-PT; however, the diagnostic algorithm used for interpretation of the steroid excretion data in this paper employed the data of all 15 urinary steroid metabolites measured by LC-MS/MS (Suppl. Table 2) for diagnostic classification.

A new sample, not contained in the training set, is to be classified according to the following prescription:

· The steroid metabolite excretion data are log-transformed and z-score normalised with the (publicly available) means and standard deviations obtained from the training data. This yields a vector y representing the sample.

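· With the (publicly available) parameters of the classifier wACA, wACC and A, the quantities $d(y,w^{\mathrm{ACA}})=\sum_{i,j} (y_i-w_i^{\mathrm{ACA}})\, A_{ij}\, (y_j-w_j^{\mathrm{ACA}})$ and $d(y,w^{\mathrm{ACC}})=\sum_{i,j} (y_i-w_i^{\mathrm{ACC}})\, A_{ij}\, (y_j-w_j^{\mathrm{ACC}})$ are computed.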

· The corresponding score is given as s(y)=1/(1+exp([d(y,wACC)-d(y,wACA)]/10)), which satisfies 0<s(y)<1.
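The last step of the prescription above can be illustrated as follows. This is a minimal sketch with a hypothetical function name; the two distances are taken as given inputs, and the example values are arbitrary.

```python
import math

def usm_score(d_acc, d_aca):
    """s(y) = 1 / (1 + exp([d(y, wACC) - d(y, wACA)] / 10)).

    A sample closer to the ACC prototype than to the ACA prototype
    (d_acc < d_aca) receives a score above 0.5.
    """
    return 1.0 / (1.0 + math.exp((d_acc - d_aca) / 10.0))

score_equidistant = usm_score(5.0, 5.0)   # equally far from both prototypes
score_near_acc = usm_score(0.0, 20.0)     # much closer to the ACC prototype
score_near_aca = usm_score(20.0, 0.0)     # much closer to the ACA prototype
print(score_equidistant, score_near_acc, score_near_aca)
```

The division by 10 only rescales the distance difference before the logistic function is applied, so the ranking of samples by score is the same as the ranking by d(y,wACA)-d(y,wACC).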

Along these lines, the resulting classifier system was applied to the urine steroid metabolite excretion data from the prospective EURINE-ACT cohort. A particular binary GMLVQ classifier can be specified by considering a threshold value θ and assigning data with s(y) ≤ θ to class ACA, while samples with s(y) > θ are classified as ACC. By variation of the threshold parameter, θ-dependent sensitivities and specificities and, therefore, the full Receiver Operating Characteristics (ROC) can be determined13 (Suppl. Fig. 2).
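The threshold sweep can be sketched as below. The scores and labels are toy values chosen for illustration, not study data, and the function names are hypothetical; the area under the curve is computed with the rank-based (Mann-Whitney) formulation, which may differ from the estimator used for Suppl. Fig. 2.

```python
def sens_spec(scores, is_acc, theta):
    """Sensitivity and specificity when s(y) > theta is called ACC."""
    tp = sum(1 for s, pos in zip(scores, is_acc) if pos and s > theta)
    tn = sum(1 for s, pos in zip(scores, is_acc) if not pos and s <= theta)
    n_pos = sum(1 for pos in is_acc if pos)
    n_neg = len(is_acc) - n_pos
    return tp / n_pos, tn / n_neg

def roc_points(scores, is_acc):
    """(theta, sensitivity, specificity) for theta = 0 and each observed score."""
    thetas = [0.0] + sorted(set(scores))
    return [(theta,) + sens_spec(scores, is_acc, theta) for theta in thetas]

def auroc(scores, is_acc):
    """Probability that a random ACC score exceeds a random non-ACC score."""
    pos = [s for s, p in zip(scores, is_acc) if p]
    neg = [s for s, p in zip(scores, is_acc) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data: five samples, three of them ACC
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_is_acc = [True, True, False, True, False]

points = roc_points(toy_scores, toy_is_acc)
toy_auroc = auroc(toy_scores, toy_is_acc)
for theta, sens, spec in points:
    print(f"theta={theta:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```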

Machine learning-based classifier system and definition of risk thresholds (ACC vs. Non-ACC)

Next, we defined urine steroid metabolomics (USM) score thresholds in order to assign a sample to one of three classes: high risk (USM-HR), moderate risk (USM-MR), and low risk (USM-LR) of ACC. The corresponding thresholds were selected to ensure that the post-test probability of ACC in the high risk group was at least 65% and at least 10% in the moderate risk group. These cut-offs were obtained through a modified Delphi process involving 21 clinicians from 12 countries; all of them are members of the ENSAT EURINE-ACT Investigator group and have long-standing expertise in the management of patients with adrenal tumours including ACC. The experts were asked to identify the post-test probabilities that in their opinion were required to confidently recommend surgery (high risk group), individualised management with surgery or biopsy (moderate risk group), or no further treatment (low risk group). These cut-offs were then applied to the USM scores obtained from the prospective EURINE-ACT data.
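The three-class assignment can be sketched as follows. The numerical score thresholds are not stated in this section, so the two values below are placeholders chosen purely for illustration, as are the function names and the toy counts; the strict inequalities are likewise an assumption.

```python
# Placeholder thresholds, NOT the values used in the study
THETA_MODERATE = 0.3
THETA_HIGH = 0.7

def usm_risk_class(score, theta_moderate=THETA_MODERATE, theta_high=THETA_HIGH):
    """Map a USM score in (0, 1) to USM-LR, USM-MR or USM-HR."""
    if score > theta_high:
        return "USM-HR"
    if score > theta_moderate:
        return "USM-MR"
    return "USM-LR"

def post_test_probability(n_acc_in_group, n_in_group):
    """Proportion of ACC among all samples assigned to a risk group."""
    return n_acc_in_group / n_in_group

def meets_delphi_targets(p_high, p_moderate):
    """Check the stated targets: at least 65% (high) and at least 10% (moderate)."""
    return p_high >= 0.65 and p_moderate >= 0.10

# Toy counts for illustration only
p_high = post_test_probability(70, 100)
p_moderate = post_test_probability(15, 100)
print(usm_risk_class(0.85), usm_risk_class(0.5), usm_risk_class(0.1))
print(meets_delphi_targets(p_high, p_moderate))
```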

Additional machine learning-based classifier system to differentiate four types of adrenal masses (ACC, ACA, OM, OB)

Our machine learning-based classifier was designed and developed to solve a two-class problem, i.e. differentiate ACC from ACA. However, following completion of the ENSAT EURINE-ACT recruitment, we realised that non-selective recruitment of a very large number of patients actually results in four classes of adrenal masses: ACC, ACA, but also other benign (OB) and other malignant (OM) adrenal masses. Therefore, we trained an additional multi-class GMLVQ system aiming at the discrimination of these four classes (ACA, OB, ACC, OM) represented by one prototype each. This was done by selecting class-balanced random subsets of 65 samples from each class from the prospective data for training and determination of the average performance of the obtained system over 50 such randomised training processes. Analogously, a further classifier was obtained by considering only ACA, OB, and OM samples. Results of this post hoc analysis using both of these classifiers are described in the Supplementary Results section and displayed in Suppl. Fig. 4.
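Drawing class-balanced training subsets can be sketched as below. The class sizes are those of the final analysis cohort; the sample identifiers, the function name and the fixed seed are hypothetical, and the GMLVQ training step itself is not shown.

```python
import random

def balanced_subset(samples_by_class, n_per_class=65, rng=None):
    """Draw n_per_class samples without replacement from every class."""
    rng = rng or random.Random()
    return {label: rng.sample(items, n_per_class)
            for label, items in samples_by_class.items()}

# Hypothetical sample identifiers, with class sizes from the final analysis cohort
cohort = {
    "ACC": [f"ACC-{i}" for i in range(98)],
    "ACA": [f"ACA-{i}" for i in range(1767)],
    "OB": [f"OB-{i}" for i in range(87)],
    "OM": [f"OM-{i}" for i in range(65)],
}

rng = random.Random(0)  # fixed seed so the illustration is reproducible
training_sets = [balanced_subset(cohort, 65, rng) for _ in range(50)]
sizes = [sum(len(v) for v in subset.values()) for subset in training_sets]
print(len(training_sets), sizes[0])
```

Because the OM class contains exactly 65 samples, every OM sample enters each training subset, while the other classes are subsampled.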

Diagnostic strategy

Each of the index tests (tumour size, imaging characteristics, USM risk score) was considered individually and as a test strategy in combination with one or both other index tests. Strategies including tumour diameter use the diameter as the first test. Participants with tumour diameter above the threshold of 4 cm and/or positive imaging characteristics were classed as ‘high risk of ACC’ if they had a high urine steroid metabolomics score and as ‘moderate risk of ACC’ if they had a moderate urine steroid metabolomics score. Participants with a tumour diameter below the threshold and/or negative imaging and/or low urine steroid metabolomics score were considered ‘low risk of ACC’. As outlined in the above section on the definition of ACC risk thresholds, the urine steroid metabolomics score cut-offs were applied according to the outcome of a modified Delphi consensus to achieve a post-test probability of ACC in line with the expectation of the 21 expert clinicians who participated in the Delphi process. Secondary analyses evaluated alternative strategy combinations identifying participants with malignant adrenal masses other than ACC (other malignant, OM).
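One possible reading of the combined strategy with all three index tests is sketched below. The "and/or" wording admits more than one interpretation; this sketch assumes that every test included in the strategy must be positive before the USM class determines the risk category, and that any negative test yields low risk. The function name and the strict "greater than 4 cm" comparison are assumptions.

```python
DIAMETER_THRESHOLD_CM = 4.0

def triple_test_risk(diameter_cm, imaging_positive, usm_class):
    """Combine tumour diameter, imaging and USM class into one ACC risk label.

    usm_class is one of "USM-HR", "USM-MR", "USM-LR".
    """
    if diameter_cm <= DIAMETER_THRESHOLD_CM:  # diameter is applied as the first test
        return "low risk of ACC"
    if not imaging_positive:
        return "low risk of ACC"
    if usm_class == "USM-HR":
        return "high risk of ACC"
    if usm_class == "USM-MR":
        return "moderate risk of ACC"
    return "low risk of ACC"

examples = [
    (6.5, True, "USM-HR"),
    (6.5, True, "USM-MR"),
    (6.5, False, "USM-HR"),
    (2.0, True, "USM-HR"),
]
for case in examples:
    print(case, "->", triple_test_risk(*case))
```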

Statistical analysis

Characteristics of participants were described for each target condition defined by the reference standard, with continuous and categorical data presented as median [lower and upper quartiles] and n (%), respectively. For each test, results were tabulated against the reference standard diagnosis: ACC and non-ACC, as well as for the subcategories of non-ACC of ACA (adrenocortical adenoma); OB (Other Benign); and OM (Other Malignant). For each index test, we computed the percentage of ACC cases with each test result (giving sensitivity for a positive result for a binary test); the percentage of non-ACC cases with each test result (giving specificity for a negative result for a binary test); and the likelihood ratio for each test result. We also computed the proportion with ACC with each test result to estimate the probability of ACC (the positive predictive value [for positive test result], 1 - negative predictive value [for negative test results]). All results were expressed using 95% confidence intervals, computed using the exact binomial method for proportions and using Wald-based methods for likelihood ratios.
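The accuracy measures listed above can be computed from a two-by-two table as sketched here. The counts are hypothetical, the function names are illustrative, and the confidence interval uses the common Wald interval on the log likelihood ratio, which is an assumption about what "Wald-based" refers to.

```python
import math

def accuracy_measures(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and 1 - NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "one_minus_npv": fn / (fn + tn),
    }

def positive_lr_with_ci(tp, fn, fp, tn, z=1.96):
    """Positive likelihood ratio with a Wald interval on the log scale.

    Requires non-zero tp and fp; zero cells need a continuity correction,
    which is not handled in this sketch.
    """
    lr = (tp / (tp + fn)) / (fp / (fp + tn))
    se_log = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    return lr, lr * math.exp(-z * se_log), lr * math.exp(z * se_log)

# Hypothetical counts, for illustration only
tp, fn, fp, tn = 80, 20, 100, 900
measures = accuracy_measures(tp, fn, fp, tn)
lr, lr_low, lr_high = positive_lr_with_ci(tp, fn, fp, tn)
print(measures)
print(f"LR+ = {lr:.2f} (95% CI {lr_low:.2f} to {lr_high:.2f})")
```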

Supplementary Results

Results of endocrine function assessment

Of the 1767 participants with benign adrenocortical adenomas (ACA), 913 (51.7%) showed no evidence of adrenal hormone excess based on the assessments carried out at the clinical recruitment centres (Suppl. Table 5). Primary aldosteronism was diagnosed in 153 (8.7%) ACAs, with N=118 (77.1%) diagnosed non-incidentally, mostly due to treatment-resistant or hypokalemic hypertension. By contrast, the majority of the 77 patients with overt adrenal Cushing’s syndrome were diagnosed incidentally (N=46, 59.7%) (Suppl. Table 5). Mild autonomous cortisol secretion was diagnosed in 602 participants with ACA, with incidental discovery of the adrenal mass in 95% (Suppl. Table 5). Forty-five (46%) of the 98 ACC patients presented with clinical signs and symptoms of adrenal hormone excess; however, results of routine biochemistry showed evidence of hormone excess in 76 patients (78%). Isolated glucocorticoid and adrenal androgen excess was documented in 18 and 13 patients, respectively. Combined glucocorticoid and adrenal androgen excess was documented in 34 patients. Eight ACC patients had evidence of aldosterone excess and 13 had 17β-estradiol excess. Though all patients had undergone exclusion of pheochromocytoma prior to inclusion in EURINE-ACT, histopathology revealed 10 patients with pheochromocytoma, which were biochemically silent, i.e. not detected by plasma or urinary metanephrine analysis.

Post hoc analysis of diagnostic accuracy in a more stringently selected patient cohort

To create an even more stringently selected patient cohort for an additional post hoc analysis, we analysed the pattern of steroid excess, as assessed by routine serum biochemistry, and the clinical and radiological presentation to identify those patients who were readily identifiable as ACC or ACA based on these parameters (Suppl. Fig. 3). We defined patients identifiable as ACC (Suppl. Table 13) as those who had either a steroid excess pattern that was aberrant, i.e. not of typical adrenal origin (e.g. estradiol), or mixed steroid excess (any combination of steroid classes other than glucocorticoid and mineralocorticoid co-secretion, which is also regularly observed in benign adrenal tumours1,12). In addition, we considered patients presenting with a large adrenal mass and extra-adrenal metastases as likely ACC (Suppl. Table 13). Similarly, we excluded 22 ACA patients from the post hoc analysis cohort, as they had bilateral macronodular adrenal hyperplasia with isolated cortisol excess and, thus, were readily identifiable as benign (Suppl. Fig. 3 and Suppl. Table 13). The results of the post hoc analysis of this even more stringently selected patient cohort (N=1940, including 43 ACC) again revealed a higher positive predictive value for urine steroid metabolomics as compared to routinely used imaging tests (Suppl. Table 14).

Adrenal masses other than ACC and ACA

In addition to 98 ACC (4.9%) and 1767 ACA (87.6%), the EURINE-ACT cohort comprised 87 participants (4.3%) with a benign adrenal mass other than ACA (other benign, OB) and 65 participants (3.2%) with a malignant adrenal mass other than ACC (other malignant, OM). Bilateral adrenal masses were diagnosed in 368 ACA, 9 OB, and 7 OM tumours; the EURINE-ACT cohort did not comprise any bilateral masses with different underlying pathologies.

Surgical removal of the adrenal mass was performed in 50 of 65 OM tumours (77%) and in 59 of 87 OB tumours (68%) (Suppl. Table 5). Discovery of the adrenal mass upon follow-up imaging carried out as part of screening or monitoring of a previously known non-adrenal malignancy was an exclusion criterion for EURINE-ACT participation. However, histopathology (either following adrenalectomy or biopsy) revealed adrenal metastases of non-adrenal primary tumours in 39 of the 65 OM tumours (60%) (Suppl. Tables 5 and 6); this was almost equally split between metastases of a subsequently newly diagnosed non-adrenal primary tumour, and the late occurrence of metastases of a previously diagnosed non-adrenal primary tumour for which the patient no longer underwent follow-up monitoring.

The majority of the malignant masses other than ACC had positive imaging characteristics (63 of 65, 97%), most had a tumour diameter greater than 4 cm (46 of 65, 71%), and only a minority had a USM-HR score (7 of 65, 11%). OM tumours could not be reliably differentiated from the other two Non-ACC classes, ACA and OB tumours, with any of the combined test strategies (Suppl. Tables 15-17).

The EURINE-ACT study was designed to validate a GMLVQ algorithm that had been developed to address a two-class problem, the differentiation of malignant ACC from benign ACA. However, the final analysis cohort of 2017 patients also comprised 65 OM and 87 OB tumours, hence two additional classes. Therefore, we used the prospective data to train two additional algorithms. The first differentiated between all four tumour classes (ACC, ACA, OM, OB); this showed excellent separation of the ACC group prototype from the prototypes of the other three classes, which, however, appeared indistinguishable from each other (Suppl. Figure 4A). We then trained a three-class algorithm for the differentiation of the three Non-ACC classes (ACA, OM, OB). This algorithm achieved some degree of separation between ACA, OM, and OB prototypes, with 42-50% of the respective group participants correctly identified (Suppl. Figure 4B); however, this performance would be deemed insufficient for use as a clinically relevant diagnostic test.

Supplementary Figures

Suppl. Fig. 1: Recruitment of the ENSAT EURINE-ACT Final Analysis Cohort over the 66-month prospective recruitment period (January 2011 to July 2016). The graphs depict cumulative recruitment (Panel A) and annual recruitment (Panel B) as well as number of recruiting centres per year (Panel C) for the final EURINE-ACT analysis cohort (N=2017) recruited by 14 ENSAT centres from 11 countries.

[Figure not reproduced. Panel A, Cumulative Recruitment (number of patients): 2011, 194; 2012, 360; 2013, 538; 2014, 1039; 2015, 1582; 2016, 2017. Panel B, Annual Recruitment (number of patients): 2011, 194; 2012, 166; 2013, 178; 2014, 501; 2015, 543; 2016, 435. Panel C, Number of Participating Centers (number of centers per year, 2011 to 2016).]

Suppl. Fig. 2: Receiver operating characteristic (ROC) curve illustrating the accuracy of urine steroid metabolomics (USM) in detecting ACC. Area Under the ROC curve (AUROC) provided as median and 95% confidence interval, taking into account all 15 urinary steroid metabolites measured by LC-MS/MS.

[Figure not reproduced. ROC curve with sensitivity plotted against specificity; AUROC=94.6% (92.2%, 96.9%).]

Suppl. Fig. 3: Flowchart illustrating the selection of the Post hoc EURINE-ACT Analysis Cohort from the Final EURINE-ACT Analysis Cohort. For this purpose, we excluded all ACCs that were readily identifiable as ACC either by steroid pattern (mixed or aberrant steroid excess) or clinical presentation (large adrenal mass with extra-adrenal metastases), and all ACAs identifiable as benign due to presentation with bilateral macronodular adrenal hyperplasia and isolated cortisol excess.

24-h urine analysis by urine steroid metabolomics, N=2017 (98 ACC; 4.9%) = Final EURINE-ACT Analysis Cohort

Excluded:

· Bilateral macronodular adrenal hyperplasia with cortisol excess (=presumed ACA): 22 Non-ACC

· Adrenal mass with mixed or aberrant steroid excess (=presumed ACC): 42 ACC (incl. 13 met.)

· Adrenal mass with metastasis and isolated or no steroid excess (=presumed ACC): 13 ACC

24-h urine analysis by urine steroid metabolomics, N=1940 (43 ACC; 2.2%) = Post hoc EURINE-ACT Analysis Cohort

Suppl. Fig. 4: Results of multi-class GMLVQ systems trained and applied to prospective urine steroid metabolome data for the differentiation of adrenocortical carcinoma (ACC) and the three Non-ACC classes. The GMLVQ algorithm prospectively validated in the EURINE-ACT study had been trained to differentiate ACC from ACA. However, due to the large number of patients recruited in the prospective study, its population also comprised other benign (OB) and other malignant (OM) adrenal masses. Therefore, we trained additional GMLVQ algorithms to see if this would achieve further differentiation. For the results in Panel A, a GMLVQ system was obtained by training on a class-balanced randomised set of 260 samples. We display the projections of a particular training set on the two leading eigenvectors of the resulting relevance matrix A. The corresponding prototype vectors are displayed as large symbols; small filled circles represent the steroid metabolome of individual patients, with each steroid metabolome comprising 15 distinct urinary steroid metabolites measured by LC-MS/MS. The confusion matrix below the graph summarises the average percentages of misclassifications over 50 randomly selected training sets. This shows that ACC samples clearly separate from all other tumour groups, while the three non-ACC groups display significant overlap. In Panel B, analogous results are shown for a three-class GMLVQ system trained from reduced data sets comprising only ACA, OM and OB tumours. The separation of the three classes appears slightly better than in the four-class setup of Panel A, but still displays significant overlap.

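[Panel A: two-dimensional projection plot of the four-class GMLVQ system; legend: ACC, other malignant, other benign, ACA]

Panel A confusion matrix (percentage classified as)

| True class | ACC | Other malignant | Other benign | ACA |
| --- | --- | --- | --- | --- |
| ACC | 80.1 | 2.7 | 12.0 | 5.2 |
| Other malignant | 4.3 | 37.4 | 38.7 | 19.6 |
| Other benign | 3.4 | 20.0 | 52.6 | 24.0 |
| ACA | 4.6 | 21.2 | 33.6 | 40.6 |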

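[Panel B: two-dimensional projection plot of the three-class GMLVQ system; legend: other malignant, other benign, ACA]

Panel B confusion matrix (percentage classified as)

| True class | Other malignant | Other benign | ACA |
| --- | --- | --- | --- |
| Other malignant | 43.8 | 36.7 | 19.5 |
| Other benign | 24.8 | 49.8 | 25.4 |
| ACA | 24.8 | 33.5 | 41.7 |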
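The averaged confusion matrices reported for Suppl. Fig. 4 can be illustrated with a short sketch. This is not the authors' code: the function names, the class codes and the two toy runs are assumptions made purely for illustration. Only the idea of expressing each row as percentages of the true class and averaging over repeated training runs is taken from the caption.

```python
from collections import Counter

# Class codes used for illustration only (hypothetical ordering).
CLASSES = ["ACC", "OM", "OB", "ACA"]


def confusion_percentages(true_labels, predicted_labels, classes=CLASSES):
    """Row-normalised confusion matrix: each row gives the percentage of
    samples of one true class assigned to each predicted class."""
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = []
    for t in classes:
        row_total = sum(counts[(t, p)] for p in classes)
        matrix.append([
            100.0 * counts[(t, p)] / row_total if row_total else 0.0
            for p in classes
        ])
    return matrix


def average_matrices(matrices):
    """Element-wise mean of several equally shaped matrices."""
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[r][c] for m in matrices) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two toy "training runs" with made-up predictions (not study data).
truth = ["ACC", "ACC", "OM", "OB", "ACA", "ACA"]
run_1 = ["ACC", "OB", "OM", "ACA", "ACA", "ACA"]
run_2 = ["ACC", "ACC", "OB", "OB", "ACA", "OM"]

averaged = average_matrices([
    confusion_percentages(truth, run_1),
    confusion_percentages(truth, run_2),
])
for name, row in zip(CLASSES, averaged):
    print(name, row)
```

In the appendix the averaging runs over 50 randomly selected training sets rather than the two toy runs used here.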

Supplementary Tables

Suppl. Table 1: Imaging test combinations performed in the EURINE-ACT participants (N=2017) prior to clinical decision on whether the adrenal mass was regarded as benign or malignant. Imaging modalities were carried out according to standard guidelines and included non-contrast computed tomography (CT; N=1549) with measurement of attenuation in homogeneous tumours, magnetic resonance imaging (MRI; N=342) with chemical shift analysis, fluorodeoxyglucose-positron emission tomography (FDG-PET; N=161) as well as CT contrast washout studies (N=21) and follow-up CT (N=664). Thirty-seven patients did not undergo additional imaging after the initial contrast CT that had led to the discovery of a large adrenal mass; histopathology confirmed 32 ACA and 5 other benign tumours.
Non- contrast CTMRIFDG- PETCT Contrast WashoutFollow-up CT at 6 monthsContrast CT only + histologyFrequency%
X ☒X ☒X ☒X ☒110.5
X ☒X ☒X ☒80.4
X ☒X ☒X ☒412.0
X ☒X ☒552.7
X ☒X ☒X ☒351.7
X ☒X ☒582.9
☒ X☒ X150.7
X ☒X ☒36017.8
X ☒96647.9
X ☒☒ XX ☒30.1
☒ X☒ X30.1
☒ X☒ X371.8
X ☒1849.1
☒ X☒ X221.1
☒ X211.0
X ☒60.3
X ☒1557.7
☒ X371.8
Suppl. Table 2: Nomenclature and origin of the 15 urinary steroid metabolites quantified by the multi-steroid profiling assay utilizing liquid chromatography-tandem mass spectrometry (LC-MS/MS; single run of 7 minutes duration). * indicates steroids comprised in the "malignant steroid fingerprint" indicative of ACC identified by machine learning in our previous retrospective proof-of-principle study (10).

| Abbreviation | Common name | Chemical name | Metabolite of |
| --- | --- | --- | --- |
| An | Androsterone | 5α-androstan-3α-ol-17-one | Androstenedione, testosterone, 5α-dihydrotestosterone |
| Etio* | Etiocholanolone | 5β-androstan-3α-ol-17-one | Androstenedione, testosterone |
| 11β-OH-An | 11β-hydroxyandrosterone | 5α-androstan-3α,11β-diol-17-one | 11β-hydroxyandrostenedione |
| DHEA | Dehydroepiandrosterone | 5-androsten-3β-ol-17-one | DHEA, DHEAS |
| 5-PT* | Pregnenetriol | 5-pregnene-3β,17,20α-triol | 17-hydroxypregnenolone |
| 5-PD* | Pregnenediol | 5-pregnene-3β,20α-diol | Pregnenolone |
| PD* | Pregnanediol | 5β-pregnane-3α,20α-diol | Progesterone |
| 17HP* | 17-hydroxypregnanolone | 5β-pregnane-3α,17α-diol-20-one | 17-hydroxyprogesterone |
| PT* | Pregnanetriol | 5β-pregnane-3α,17α,20α-triol | 17-hydroxyprogesterone |
| THS* | Tetrahydro-11-deoxycortisol | 5β-pregnane-3α,17α,21-triol-20-one | 11-deoxycortisol |
| F | Cortisol | 4-pregnene-11β,17,21-triol-3,20-dione | Cortisol |
| 11β-OHEtio | 11β-hydroxyetiocholanolone | 5β-androstan-3α,11β-diol-17-one | Cortisol |
| E | Cortisone | 4-pregnene-17α,21-diol-3,11,20-trione | Cortisone |
| THE | Tetrahydrocortisone | 5β-pregnane-3α,17,21-triol-11,20-dione | Cortisone |
| β-cortolone | β-cortolone | 5β-pregnane-3α,17,20β,21-tetrol-11-one | Cortisone |
Suppl. Table 3: Validation data of the LC-MS/MS multi-steroid profiling assay, including limit of detection (LOD), limit of quantitation (LOQ), reproducibility, accuracy and precision at low (L), medium (M) and high (H) concentrations, matrix effects, and absolute recovery. Low, medium and high concentrations: cortisol, cortisone, β-cortolone, 11βOHEt, 11βOHAn 50, 500 and 500 ng/ml; THE, THS, Etio, An 250, 500, 5000 ng/ml; 5-PT 10, 100, 500 ng/ml; DHEA 10, 100, 250 ng/ml; 5PD 10, 50, 500 ng/ml; 17HP 50, 100, 250 ng/ml; PT 100, 500, 5000 ng/ml; PD 50, 100, 500 ng/ml. RSD, relative standard deviation = (standard deviation*100)/average.
SteroidLOD (ng/mL)LOQª (ug/24hr)Reproducibility (RSD%)Accuracy (RSD %)Precision (RSD %)Matrix Effects (%)Absolute Recovery (%)
LMHLMH
E0.320166.55.63.11.92.71.6-0.00399
F0.72068.33.02.34.82.61.0-0.003102
THE5.230511124.42.22.03.00.03103
B-cortolone1.6205144.06.0147.05.40.007102
11OHEt56.89612185.07.0175.07.08.0-04108
11ßOHAn30.610514185.88.313106.0-9.0-04122
5PT2.443619123.97.9125.50.00394
DHEA3.422229.0101.613177.50.000796
THS31.455182.12.54.75.36.94.4-0.00296
Etio5.41043.53.33.74.45.03.60.02102
5PD8.1551029162.614115.3-0.00374
An2.71363.03.04.25.84.75.10.008100
17HP1.62412159.23.815130.48.0-0596
PT1.411066.33.24.36.17.34.60.00496
PD15.2881321188.52316110.00485

ª based on an average signal-to-noise ratio of 10:1 in patient samples
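The RSD definition given in the caption of Suppl. Table 3 can be written as a small helper. This is a minimal sketch: the function name and the replicate values are hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption, since the appendix does not state which estimator was applied.

```python
from statistics import mean, stdev


def relative_sd_percent(values):
    """RSD = (standard deviation * 100) / average.

    Assumption: sample standard deviation (n - 1 denominator)."""
    return stdev(values) * 100 / mean(values)


# Hypothetical replicate measurements in ng/ml (not study data).
replicates = [9.0, 10.0, 11.0]
print(relative_sd_percent(replicates))
```

Swapping `stdev` for `statistics.pstdev` would give the population form of the same formula.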

Suppl. Table 4: Characteristics of centre-specific patient recruitment during the 66 months of prospective recruitment to ENSAT EURINE-ACT. The table lists the centre-specific recruitment periods, the number of patients recruited per centre, and the number and prevalence of adrenocortical carcinoma (ACC) cases for each of the 21 clinical centres that agreed to participate in EURINE-ACT. On this basis, centres 1-14 (total recruitment N=2068) were classified as non-selective recruitment centres, while the remaining seven centres 15-21 (total recruitment N=101) were classified as selective. Centres with selective enrolment did not pursue consecutive enrolment as requested by the protocol, but prioritized enrolment of patients with large and indeterminate tumours, resulting in a high proportion of ACC that did not reflect routine caseload. The consent rate was >95% of all patients approached in each of the participating centres.
| Centre No | Recruitment period (months) | Number of eligible patients consented (mean N/year) | Number of ACC patients (% of total n) | Centre Recruitment Classification |
| --- | --- | --- | --- | --- |
| 1 | 36 | 278 (93) | 15 (5.4) | Non-selective |
| 2 | 60 | 264 (53) | 15 (5.7) | Non-selective |
| 3 | 48 | 246 (62) | 4 (1.6) | Non-selective |
| 4 | 48 | 234 (59) | 7 (3.0) | Non-selective |
| 5 | 60 | 225 (45) | 9 (4.0) | Non-selective |
| 6* | 60 | 158 (32) | 21 (13.3) | Non-selective |
| 7* | 48 | 153 (38) | 14 (9.2) | Non-selective |
| 8 | 54 | 132 (29) | 5 (3.8) | Non-selective |
| 9 | 36 | 93 (31) | 1 (1.1) | Non-selective |
| 10* | 36 | 73 (24) | 8 (11.0) | Non-selective |
| 11 | 24 | 69 (35) | 0 (0) | Non-selective |
| 12 | 24 | 52 (26) | 1 (1.9) | Non-selective |
| 13 | 30 | 51 (20) | 3 (5.9) | Non-selective |
| 14 | 24 | 40 (20) | 0 (0) | Non-selective |
| 15 | 36 | 24 (8.0) | 5 (20.8) | Selective |
| 16 | 36 | 20 (6.7) | 2 (10.0) | Selective |
| 17 | 30 | 19 (7.6) | 6 (31.6) | Selective |
| 18 | 36 | 17 (5.7) | 8 (53.0) | Selective |
| 19 | 30 | 17 (6.8) | 6 (35.0) | Selective |
| 20 | 48 | 3 (0.75) | 3 (100) | Selective |
| 21 | 48 | 1 (0.25) | 1 (100) | Selective |

* One of three major ACC specialist centres in Europe, thus, higher ACC rate than other centres
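The derived columns of Suppl. Table 4 follow from simple arithmetic, sketched below. The helper names are hypothetical; the per-centre patient counts are copied from the table, and the formulas (patients divided by recruitment years, ACC cases divided by patients) are an assumption about how the bracketed values were obtained, checked here only against centre 1 and the group totals quoted in the caption.

```python
def recruits_per_year(n_patients, months):
    """Mean number of patients recruited per year of recruitment."""
    return n_patients / (months / 12)


def acc_prevalence_percent(n_acc, n_patients):
    """ACC cases as a percentage of all patients recruited at a centre."""
    return 100 * n_acc / n_patients


# Patients recruited per centre, as listed in Suppl. Table 4.
non_selective = [278, 264, 246, 234, 225, 158, 153, 132, 93, 73, 69, 52, 51, 40]
selective = [24, 20, 19, 17, 17, 3, 1]

# Centre 1: 278 patients over 36 months, 15 ACC cases.
print(round(recruits_per_year(278, 36)))          # 93
print(round(acc_prevalence_percent(15, 278), 1))  # 5.4
print(sum(non_selective), sum(selective))         # 2068 101
```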

Suppl. Table 5: Distribution of Pathologies in the EURINE-ACT cohort. Underlying pathologies are listed as N (%) for the whole cohort and for incidentally and non-incidentally discovered tumours, respectively, as well as for those adrenal tumours that underwent surgical removal (adrenalectomy group).
Entries are Number (% of total) for the main categories and Number (% of group) for their sub-categories.

| Pathology | All participants | Non-incidentally discovered tumours | Incidentally discovered tumours | Adrenal masses surgically removed (Adrenalectomy Group) |
| --- | --- | --- | --- | --- |
| Total number | 2017 (100) | 331 (16.4) | 1686 (83.6) | 563 (27.9) |
| Adrenocortical carcinoma (ACC) | 98 (4.9) | 55 (16.6) | 43 (2.6) | 84 (14.9) |
| Benign adrenocortical adenomas (ACA) | 1767 (87.6) | 254 (76.7) | 1513 (89.7) | 370 (65.7) |
| Non-functioning adenoma | 913 (51.7) | 71 (28.0) | 842 (55.7) | 81 (21.9) |
| Mild autonomous cortisol secretion | 602 (34.1) | 30 (11.8) | 572 (37.8) | 105 (28.4) |
| Aldosterone-producing adenoma | 153 (8.7) | 118 (46.5) | 35 (2.3) | 119 (32.2) |
| Cortisol-producing adenoma | 77 (4.4) | 31 (12.2) | 46 (3.0) | 51 (13.8) |
| Bilateral macronodular hyperplasia with cortisol excess | 22 (1.2) | 4 (1.6) | 18 (1.2) | 14 (3.8) |
| Other malignant (OM) masses | 65 (3.2) | 9 (2.7) | 56 (3.3) | 50 (8.9) |
| Metastasis | 39 (60.0) | 9 (100.0) | 30 (53.6) | 30 (60.0) |
| Primary Adrenal Lymphoma | 8 (12.3) | 0 (0.0) | 8 (14.3) | 4 (8.0) |
| Leiomyosarcoma | 5 (7.7) | 0 (0.0) | 5 (8.9) | 4 (8.0) |
| Angiosarcoma | 4 (6.2) | 0 (0.0) | 4 (7.1) | 4 (8.0) |
| Liposarcoma | 4 (6.2) | 0 (0.0) | 4 (7.1) | 3 (6.0) |
| Neuroblastoma | 2 (3.1) | 0 (0.0) | 2 (3.6) | 2 (4.0) |
| Sarcoma | 2 (3.1) | 0 (0.0) | 2 (3.6) | 2 (4.0) |
| Castleman | 1 (1.5) | 0 (0.0) | 1 (1.8) | 1 (2.0) |
| Other benign (OB) masses | 87 (4.3) | 13 (3.9) | 74 (4.4) | 59 (10.5) |
| Myelolipoma | 28 (32.2) | 3 (23.1) | 25 (33.8) | 17 (28.8) |
| Cyst | 17 (19.5) | 2 (15.4) | 15 (20.3) | 12 (20.3) |
| Pheochromocytoma | 10 (11.5) | 1 (7.7) | 9 (12.2) | 8 (13.6) |
| Ganglioneuroma | 8 (9.2) | 3 (23.1) | 5 (6.8) | 8 (13.6) |
| Hemangioma | 8 (9.2) | 1 (7.7) | 7 (9.5) | 7 (11.9) |
| Hematoma | 8 (9.2) | 1 (7.7) | 7 (9.5) | 2 (3.4) |
| Schwannoma | 2 (2.3) | 1 (7.7) | 1 (1.4) | 2 (3.4) |
| Lymphangioma | 2 (2.3) | 0 (0.0) | 2 (2.7) | 1 (1.7) |
| Hepatic adenoma | 1 (1.1) | 1 (7.7) | 0 (0.0) | 0 (0.0) |
| Pseudocyst | 1 (1.1) | 0 (0.0) | 1 (1.4) | 1 (1.7) |
| Stromal tumour | 1 (1.1) | 0 (0.0) | 1 (1.4) | 1 (1.7) |
| Angiolipoma | 1 (1.1) | 0 (0.0) | 1 (1.4) | 0 (0.0) |
Suppl. Table 6: Reference standards applied to the adrenal masses in the EURINE-ACT participants (N=2017).
| | Adrenocortical adenomas (ACA) | Adrenocortical carcinomas (ACC) | Other malignant (OM) masses | Other benign (OB) masses |
| --- | --- | --- | --- | --- |
| Total, n | 1,767 | 98 | 65 | 87 |
| Histopathology, n (%) | 370 (21%) | 91 (93%)* | 65 (100%)* | 59 (68%) |
| No histopathology, n (%) | 1397 (79%) | 7 (7%)** | 0 (0%) | 28 (32%) |
| Number of patients with imaging follow-up >6 months, n (%) | 577 (33%) | 0 (0%) | 0 (0%) | 16 (18%) |
| Duration (months) of imaging follow-up, median (IQR) | 17 (10, 72) | N/A | N/A | 19 (10, 92) |
| Number of patients with clinical follow-up# only >12 months, n (%) | 820 (46%) | 0 (0%) | 0 (0%) | 12 (14%) |
| Duration (months) of clinical follow-up#, median (IQR) | 24 (12, 26) | N/A | N/A | N/A*** |

* Seven ACC patients and 15 patients with OM masses underwent biopsy only of the adrenal mass (no adrenalectomy). Within the OM group, biopsy revealed metastasis of non-adrenal primary tumours (N=9), primary adrenal lymphoma (N=4), liposarcoma (N=1), and leiomyosarcoma (N=1).

** Seven ACC patients presented with a large adrenal tumour, metastases and steroid hormone excess, indicative of ACC, with no feasible alternative diagnosis, which allows for the diagnosis of ACC without histopathology (13).

*** Twelve benign adrenal myelolipomas were unequivocally diagnosed by their characteristic imaging findings (7) and, therefore, did not undergo further follow-up.

# clinical follow up was defined as a face-to-face visit including a physical examination.

Suppl. Table 7: Sensitivity and specificity (%) when using different cut-offs for maximum tumour diameter (N=2017) and tumour attenuation measured in Hounsfield Units (HU) or tumour heterogeneity on unenhanced computed tomography (CT; N=1549). * one-sided 97.5% Confidence Interval (CI).
| Criteria for positive test result | True Positive/ACC; Sensitivity % (95% CI) | True Negative/Non-ACC; Specificity % (95% CI) |
| --- | --- | --- |
| Tumour diameter (N=2017) | | |
| ≥2cm | 98/98; 100.0 (96.3, 100.0)* | 576/1919; 30.0 (28.0, 32.1) |
| ≥4cm | 96/98; 98.0 (92.8, 99.8) | 1527/1919; 79.6 (77.7, 81.3) |
| ≥6cm | 82/98; 83.7 (74.8, 90.4) | 1793/1919; 93.4 (92.2, 94.5) |
| Unenhanced CT (N=1549) | | |
| HU≥10 & not heterogeneous | 33/98; 33.7 (24.4, 43.9) | 950/1451; 65.5 (63.0, 67.9) |
| HU>20 & not heterogeneous | 32/98; 32.7 (23.5, 42.9) | 1183/1451; 81.5 (79.4, 83.5) |
| Heterogeneous | 65/98; 66.3 (56.1, 75.6) | 1429/1451; 98.5 (97.7, 99.0) |
| HU≥10 or heterogeneous | 98/98; 100.0 (96.3, 100.0)* | 928/1451; 64.0 (61.4, 66.4) |
| HU>20 or heterogeneous | 97/98; 99.0 (94.4, 100.0) | 1161/1451; 80.0 (77.9, 82.0) |
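The proportions and intervals in Suppl. Table 7 can be approximated with the sketch below. The appendix does not name the interval method, so the exact (Clopper-Pearson) binomial interval used here is an assumption; it was chosen because, for 98/98, its lower bound at the one-sided 97.5% level equals 0.025^(1/98), which matches the 96.3% shown in the table. The function names are hypothetical and the bisection solver is only one of several ways to invert the binomial distribution.

```python
from math import exp, lgamma, log


def percent(count, total):
    """Sensitivity or specificity as a percentage of the relevant group."""
    return 100 * count / total


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space
    so that large n does not overflow."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def solve_increasing(f, iters=60):
    """Bisection root of a function that increases from negative to positive
    on the open interval (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed method).
    For k == n the upper limit is 1 and the lower limit is one-sided."""
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    upper = 1.0 if k == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Tumour diameter >=2cm: 98 of 98 ACC test positive.
low, high = exact_ci(98, 98)
print(round(percent(98, 98), 1), round(100 * low, 1), round(100 * high, 1))

# Tumour diameter >=4cm: sensitivity 96/98, specificity 1527/1919.
print(round(percent(96, 98), 1), exact_ci(96, 98))
print(round(percent(1527, 1919), 1), exact_ci(1527, 1919))
```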
Suppl. Table 8: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging characteristics by tumour diameter and underlying diagnosis (Non-ACC [N=1919] vs. ACC [N=98]).
| Diagnosis | Tumour diameter | Imaging characteristics | USM Low Risk | USM Moderate Risk | USM High Risk | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Non-ACC | <4cm | Negative | 762 | 428 | 93 | 1283 |
| Non-ACC | <4cm | Positive | 129 | 97 | 18 | 244 |
| Non-ACC | <4cm | Total | 891 | 525 | 111 | 1527 |
| Non-ACC | ≥4cm | Negative | 147 | 72 | 21 | 240 |
| Non-ACC | ≥4cm | Positive | 69 | 58 | 25 | 152 |
| Non-ACC | ≥4cm | Total | 216 | 130 | 46 | 392 |
| ACC | <4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | <4cm | Positive | 0 | 1 | 1 | 2 |
| ACC | <4cm | Total | 0 | 1 | 1 | 2 |
| ACC | ≥4cm | Negative | 0 | 0 | 1 | 1 |
| ACC | ≥4cm | Positive | 2 | 12 | 81 | 95 |
| ACC | ≥4cm | Total | 2 | 12 | 82 | 96 |
Suppl. Table 9: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging characteristics on CT (unenhanced tumour attenuation >20HU or tumour heterogeneity were considered as indicative of ACC [=positive]; total N=1549) by tumour diameter and underlying diagnosis (Non-ACC [N=1451] vs. ACC [N=98]).

| Diagnosis | Tumour diameter | Unenhanced CT >20HU or heterogeneity | USM Low Risk | USM Moderate Risk | USM High Risk | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Non-ACC | <4cm | Negative | 598 | 320 | 63 | 981 |
| Non-ACC | <4cm | Positive | 96 | 68 | 12 | 176 |
| Non-ACC | <4cm | Total | 694 | 388 | 75 | 1157 |
| Non-ACC | ≥4cm | Negative | 113 | 54 | 13 | 180 |
| Non-ACC | ≥4cm | Positive | 48 | 47 | 19 | 114 |
| Non-ACC | ≥4cm | Total | 161 | 101 | 32 | 294 |
| ACC | <4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | <4cm | Positive | 0 | 1 | 1 | 2 |
| ACC | <4cm | Total | 0 | 1 | 1 | 2 |
| ACC | ≥4cm | Negative | 0 | 0 | 1 | 1 |
| ACC | ≥4cm | Positive | 2 | 12 | 81 | 95 |
| ACC | ≥4cm | Total | 2 | 12 | 82 | 96 |
Suppl. Table 10: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging with MRI chemical shift analysis (total N=342) by tumour diameter and underlying diagnosis (Non-ACC [N=341] vs. ACC [N=1]).
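| Diagnosis | Tumour diameter | MRI | USM Low Risk | USM Moderate Risk | USM High Risk | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Non-ACC | <4cm | Negative | 115 | 75 | 28 | 218 |
| Non-ACC | <4cm | Positive | 27 | 24 | 5 | 56 |
| Non-ACC | <4cm | Total | 142 | 99 | 33 | 274 |
| Non-ACC | ≥4cm | Negative | 20 | 9 | 6 | 35 |
| Non-ACC | ≥4cm | Positive | 18 | 9 | 5 | 32 |
| Non-ACC | ≥4cm | Total | 38 | 18 | 11 | 67 |
| ACC | <4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | <4cm | Positive | 0 | 0 | 0 | 0 |
| ACC | <4cm | Total | 0 | 0 | 0 | 0 |
| ACC | ≥4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | ≥4cm | Positive | 0 | 0 | 1 | 1 |
| ACC | ≥4cm | Total | 0 | 0 | 1 | 1 |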
Suppl. Table 11: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging by FDG-PET (total N=161) by tumour diameter and underlying diagnosis (Non-ACC [N=159] vs. ACC [N=2]).
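| Diagnosis | Tumour diameter | PET | USM Low Risk | USM Moderate Risk | USM High Risk | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Non-ACC | <4cm | Negative | 55 | 32 | 10 | 97 |
| Non-ACC | <4cm | Positive | 6 | 7 | 0 | 13 |
| Non-ACC | <4cm | Total | 61 | 39 | 10 | 110 |
| Non-ACC | ≥4cm | Negative | 23 | 10 | 3 | 36 |
| Non-ACC | ≥4cm | Positive | 4 | 7 | 2 | 13 |
| Non-ACC | ≥4cm | Total | 27 | 17 | 5 | 49 |
| ACC | <4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | <4cm | Positive | 0 | 0 | 0 | 0 |
| ACC | <4cm | Total | 0 | 0 | 0 | 0 |
| ACC | ≥4cm | Negative | 0 | 0 | 0 | 0 |
| ACC | ≥4cm | Positive | 0 | 0 | 2 | 2 |
| ACC | ≥4cm | Total | 0 | 0 | 2 | 2 |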
Suppl. Table 12: Intermediate testing steps for strategies combining the three index tests (tumour diameter, imaging characteristics, and urine steroid metabolomics [USM]). Measures reported with 95% confidence intervals. USM urine steroid metabolomics; OB Other Benign; OM Other Malignant; shaded areas show results for Non-ACC results by group; ; Sensitivity; ¿ Specificity.
ACCNon ACCACAOBOMTotal% of ACC cases% of non-ACC casesLikelihood RatioPost-test probability of ACC (per 100)
Double test strategy: Tumour diameter AND Imaging characteristics
Results of second test (Imaging characteristics) for participants with Tumour diameter ≥4cm (N=488)
Tumour ≥4cm AND Imaging characteristics positive95152832445247+99.0 (94.3, 100.0)38.8 (33.9, 43.8)2.6 (2.3, 2.9)38.5 (32.4, 44.8)
Tumour ≥4cm AND Imaging characteristics negative12402132612411.0 (0.0, 5.7)$61.2 (56.2, 66.1)0.02 (0.00, 0.12)0.4 (0.0, 2.3)
Total963922965046488
Double test strategy: Tumour diameter AND Urine Steroid Metabolomics (USM)
Results of second test (USM) for participants with Tumour diameter ≥4cm (N=488)
Tumour ≥4cm AND USM High Risk (HR) score8246336712885.4 (76.7, 91.8)11.7 (8.7, 15.3)7.3 (5.5, 9.7)64.1 (55.1, 72.3)
Tumour ≥4cm AND USM Moderate Risk (MR) score1213085252014212.5 (6.6, 20.8)33.2 (28.5, 38.1)0.38 (0.22, 0.65)8.5 (4.4, 14.3)
Tumour ≥4cm AND USM Low Risk (LR) score221617819192182.1 (0.3, 7.3)55.1 (50.0, 60.1)0.04 (0.01, 0.15)0.9 (0.1, 3.3)
Total963922965046488
Double test strategy: Urine steroid metabolomics (USM) AND Imaging characteristics
Result of second test (Imaging characteristics) for participants with USM-HR or -MR (N=908)
USM-HR AND Imaging characteristics positive8243352612585.4 (76.7, 91.8)5.3 (3.9, 7.1)16.1 (11.9, 21.8)65.6 (56.6, 73.9)
USM-MR AND Imaging characteristics positive1315597302816813.5 (7.4, 22.0)19.1 (16.4, 22.0)0.71 (0.42, 1.20)7.7 (4.2, 12.9)
USM-HR/-MR AND Imaging characteristics negative16145892416151.0 (0.0, 5.7)75.6 (72.5, 78.5)0.01 (0.00, 0.10)0.1 (0.0, 0.9)
Total968127215635908
Result of second test (USM) for participants with positive Imaging characteristics (N=493)
Imaging positive AND USM-HR8243352612584.5 (75.8, 91.1)10.9 (8.0, 14.3)7.8 (5.8, 10.5)65.6 (56.6, 73.9)
Imaging positive AND USM-MR1315597302816813.4 (7.3, 21.8)39.1 (34.3, 44.1)0.34 (0.20, 0.58)7.7 (4.2, 12.9)
Imaging positive AND USM-LR219815712292002.1 (0.3, 7.3)0.34 (0.01, 0.16)0.04 (0.01, 0.16)1.0 (0.1, 3.6)
Total973962894463493
Triple test strategy: Tumour diameter AND USM test AND Imaging characteristics
Result of third test (Imaging characteristics) for participants with Tumour diameter ≥4cm AND USM-HR or -MR (N=270)
Tumour ≥4cm AND USM-HR AND Imaging positive8125172610686.2 (77.5, 92.4)14.2 (9.4, 20.3)6.1 (4.2, 8.8)76.4 (67.2, 84.1)
Tumour ≥4cm AND USM-MR AND Imaging positive12582315207012.8 (6.8, 21.2)33.0 (26.1, 40.4)0.39 (0.22, 0.68)17.1 (9.2, 28.0)
Tumour ≥4cm AND USM-HR/-MR AND Imaging negative19378141941.1 (0.0, 5.8)52.8 (45.2, 60.4)0.02 (0.00, 0.14)1.1 (0.0, 5.8)
Total941761183127270
Result of third test (USM) for participants with Tumour diameter ≥4cm AND positive Imaging characteristics (N=247)
Tumour ≥4cm AND Imaging positive AND USM-HR8125172610685.3 (76.5, 91.7)16.4 (10.9, 23.3)5.2 (3.6, 7.5)76.4 (67.2, 84.1)
Tumour ≥4cm AND Imaging positive AND USM-MR12582315207012.6 (6.7, 21.0)38.2 (30.4, 46.4)0.33 (0.19, 0.58)17.1 (9.2, 28.0)
Tumour ≥4cm AND Imaging positive AND USM-LR26943719712.1 (0.3, 7.4)45.4 (37.3, 53.7)0.05 (0.01, 0.18)2.8 (0.3, 9.8)
Total95152832445247
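The likelihood ratio and post-test probability columns of Suppl. Table 12 can be reproduced from the counts in each row, as sketched below. The function names are hypothetical, and the formulas are the standard definitions rather than a description of the authors' software; they are checked here against the row "Tumour ≥4cm AND Imaging characteristics positive" (95 of 96 ACC, 152 of 392 Non-ACC).

```python
def likelihood_ratio(acc_in_category, acc_total, non_acc_in_category, non_acc_total):
    """Proportion of ACC cases with this result divided by the
    proportion of Non-ACC cases with the same result."""
    return (acc_in_category / acc_total) / (non_acc_in_category / non_acc_total)


def post_test_probability(acc_in_category, non_acc_in_category):
    """Share of ACC among all participants with this result."""
    return acc_in_category / (acc_in_category + non_acc_in_category)


def post_test_from_pretest(pretest_probability, lr):
    """Same quantity via odds: post-test odds = pre-test odds * LR."""
    odds = pretest_probability / (1 - pretest_probability) * lr
    return odds / (1 + odds)


lr = likelihood_ratio(95, 96, 152, 392)
direct = post_test_probability(95, 152)
via_odds = post_test_from_pretest(96 / 488, lr)

print(round(lr, 1))            # 2.6
print(round(100 * direct, 1))  # 38.5
print(abs(direct - via_odds) < 1e-12)
```

Both routes give the same post-test probability because the pre-test probability (96/488) is taken from the same subgroup as the counts.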
Suppl. Table 13: Tumour-related steroid secretion patterns according to clinical presentation and results of routine serum biochemistry in the EURINE-ACT participants. We compared ACC vs. Non-ACC sub-divided by maximum tumour diameter (< or ≥6cm). Data are presented as N (%). Twenty-six ACCs (26.5%) were metastatic upon first diagnosis of the adrenal mass, numbers of metastatic tumours indicated per steroid category.
Tumour-related steroid secretion patternACC <6cm (N=22)ACC ≥6cm (N=76)Non-ACC <6cm (N=1808)Non-ACC ≥6cm (N=111)
Clinically overt and biochemical signs of steroid excess
Mixed (or aberrant) steroid excess4 (22.7%)24 (34.2%) 12/24 metastatic0 (0%)0 (0%)
Isolated cortisol excess2 (9.1%) 1/2 metastatic10 (13.2%) 6/10 metastatic74 (4.1%)3 (2.7%)
Isolated aldosterone excess0 (0%)0 (0%)153 (8.5%)0 (0%)
Isolated androgen excess2 (9.1%)0 (0%)0 (0%)0 (0%)
No clinical but biochemical signs of steroid excess
Mixed (or aberrant) steroid excess4 (18.2%)10 (13.2%) 1/10 metastatic0 (0%)0 (0%)
Isolated cortisol excess4 (18.2%)4 (5.3%)604 (33.4%)*20 (18.0%)*
Isolated aldosterone excess0 (0%)1 (1.3%)0 (0%)0 (0%)
Isolated androgen excess1 (4.6%)7 (9.2%) 2/7 metastatic0 (0%)0 (0%)
Neither clinical nor biochemical signs of steroid excess
No steroid excess4 (18.2%)18 (23.7%) 4/18 metastatic978 (54.1%)87 (8.4%)

* In the Non-ACC category, 21 participants with maximum tumour diameter <6cm (1.2%) and one patient ≥6cm (0.9%) presented with bilateral macronodular adrenal hyperplasia and biochemical cortisol excess; these are included in the overall numbers in the respective category.

Suppl. Table 14: Post hoc sensitivity analysis. Performance of tests and test strategies in the EURINE-ACT cohort (N=1940) after exclusion of patients identifiable as ACC or ACA by steroid pattern or clinical presentation. Measures reported with 95% confidence intervals. USM, urine steroid metabolomics; ACC, adrenocortical carcinoma; ACA, adrenocortical adenoma; OB, Other Benign; OM, Other Malignant; the shaded areas show results for the Non-ACC sub-groups ACA, OB, and OM; ; Sensitivity; ¿ Specificity.
ACC (43)Non ACC (1897)ACA (1745)OB (87)OM (65)Total (1940)% of ACC cases% of Non-ACC casesLikelihood Ratio (LR)Post-test probability of ACC (per 100)
SINGLE TEST STRATEGIES
Single test: Tumour Diameter
Positive≥4cm423832875046425+97.7 (87.7, 99.9)20.2 (18.4, 22.1)4.8 (4.4, 5.4)9.9 (7.2, 13.1)
Negative<4cm115141458371915152.3 (0.1, 12.3)¿79.8 (77.9,81.6)0.03 (0.01, 0.20)0.1 (0.0, 0.4)
Single test: Imaging characteristics
PositivePositive423942874463436+97.7 (87.7, 99.9)20.8 (19.0, 22.7)4.7 (4.3, 5.2)9.6 (7.0, 12.8) 0.1 (0.0, 0.4)
NegativeNegative11503145843215042.3 (0.1, 12.3)į79.2 (77.3,81.0)0.03 (0.00, 0.20)
Single test: Urine Steroid Metabolomics (USM)
High ModerateHigh Risk of ACC (USM-HR)331551417718876.7 (61.4, 88.2)8.2 (7.0, 9.5)9.4 (7.5, 11.7)17.6 (12.4, 23.8)
Moderate Risk of ACC (USM-MR)10648571492865823.3 (11.8, 38.6)34.2 (32.0, 36.3)0.7 (0.4, 1.2)1.5 (0.7, 2.8)
LowLow Risk of ACC (USM-LR)210941033313010940.0 (0.0, 8.2)*57.7 (55.4, 59.9)-0.0 (0.0, 0.4)*
COMBINED TEST STRATEGIES
Double test strategy: Tumour Diameter AND Imaging characteristics
Positive NegativeTumour diameter ≥4cm AND Imaging characteristics positive41151822445192+95.3 (84.2, 99.4)8.0 (6.8, 9.3)12.0 (10.1, 14.2)21.4 (15.8, 27.8)
Tumour diameter <4cm AND/OR Imaging characteristics negative4.7 (0.6, 15.8)¿92.0 (90.7,93.2)0.05 (0.01, 0.20)0.1 (0.0, 0.4)
21746166363201748
Double test strategy: Tumour Diameter AND Urine Steroid Metabolomics (USM)
HighTumour diameter ≥4cm AND USM high risk of ACC (USM-HR)334532677876.7 (61.4, 88.2)2.4 (1.7, 3.2)32.4 (23.2, 45.1)42.3 (31.2, 54.0)
ModerateTumour diameter ≥4cm AND USM moderate risk of ACC (USM-MR)912883252013720.9 (10.0, 36.0)6.7 (5.7, 8.0)3.1 (1.7, 5.7)6.6 (3.1, 12.1)
LowTumour diameter <4cm AND/OR
USM low risk of ACC (USM-LR)117241630563817252.3 (0.1, 12.3)90.9 (89.5, 92.1)0.03 (0.02, 0.18)0.1 (0.0, 0.3)
Double test strategy: Imaging characteristics AND Urine Steroid Metabolomics (USM)
HighImaging characteristics positive AND USM high risk of ACC (USM-HR)324234267474.4 (58.8, 86.5)2.2 (1.6, 3.0)33.6 (23.8, 47.5)43.2 (31.8, 55.3)
ModerateImaging characteristics positive AND USM moderate risk of ACC (USM-MR)10 15496302816423.3 (11.8, 38.6)8.1 (6.9, 9.4)2.9 (1.6, 5.0)6.1 (0.0, 0.3)
LowImaging characteristics negative AND/OR USM low risk of ACC (USM-LR)1 17011615553117022.3 (0.1, 12.3)89.7 (88.2, 91.0)0.03 (0.01, 0.18)0.1 (0.0, 0.3)
Triple test strategy: Tumour Diameter AND USM test AND Imaging characteristics
HighTumour diameter ≥4cm AND USM high risk of ACC (USM-HR) AND Imaging characteristics positive32 2517265774.4 (58.8, 86.5)1.3 (0.9, 1.9)56.5 (36.8, 86.5)56.1 (42.4, 69.3)
Moderate LowTumour diameter ≥4cm AND USM moderate risk of ACC (USM-MR) AND Imaging characteristics positive Tumour diameter <4cm AND/OR USM low AND/OR Imaging characteristics negative9 57 2 181522 170615 7020 3966 181720.9 (10.0, 36.0) 4.7 (0.6, 15.8)3.0 (2.3, 3.9) 95.7 (94.7, 96.5)7.0 (3.7, 13.1) 0.05 (0.01, 0.19)13.6 (6.4, 24.3) 0.1 (0.0, 0.4)
Suppl. Table 15: Performance of tests and test strategies when detecting malignant adrenal masses other than adrenocortical carcinoma (other malignant; OM). Measures reported with 95% confidence intervals. USM, urine steroid metabolomics; OM, Other Malignant; OB, Other Benign; ACA, adrenocortical adenoma; ACC, adrenocortical carcinoma; Non-OM =ACA+OB+ACC; shaded areas show results for Non-OM results by group; ; Sensitivity; ¿Specificity.
OM (65)Non-OM (1952)ACA (1767)OB ACC (87) (98)Total% of OM cases% of non-OM casesLikelihood RatioPost-test probability of OM (per 100)
Single test strategy: Tumour Diameter
Positive ≥4cm46442296509648870.8 (58.2, 81.4)22.6 (20.8, 24.6)3.1 (2.6, 3.7)9.4 (7.0, 12.4)
Negative <4cm1915101471372152929.2 (18.6, 41.8)77.4 (75.4, 79.2)0.38 (0.26, 0.55)1.2 (0.7, 1.9)
Single test strategy: Imaging characteristics
Positive Imaging characteristics positive63430289449749396.9 (89.3, 99.7)22.0 (20.2, 23.9)4.4 (4.0, 4.8)12.8 (10.0, 16.1)
Negative Imaging characteristics negative21522147843115243.1 (0.4, 10.7)78.0 (76.1, 79.8)0.04 (0.01, 0.15)0.1 (0.0, 0.5)
Single test strategy: Urine steroid metabolomics (USM)
HighUSM High Risk score (USM-HR)723314378324010.8 (4.4, 20.9)11.9 (10.5, 13.5)0.9 (0.44, 1.84)2.9 (1.2, 5.9)
ModerateUSM Moderate Risk score (USM-MR)28640578491366843.1 (30.8, 56.0)32.8 (30.7, 34.9)1.31 (0.99, 1.75)4.2 (2.8, 6.0)
Low USM Low Risk score (USM-LR)3010791046312110946.2 (33.7, 59.0)55.3 (53.0, 57.5)0.83 (0.64, 1.09)2.7 (1.8, 3.8)
Double test strategy: Tumour Diameter AND Imaging characteristics
PositiveTumour ≥4cm AND Imaging characteristics positive4520283249524769.2 (56.6, 80.1)10.3 (9.0, 11.8)6.7 (5.4, 8.2)18.2 (13.6, 23.6)
Negative Tumour <4cm AND/OR Imaging characteristics negative2017501684633177030.8 (20.0, 43.4)89.7 (88.2, 91.0)0.34 (0.24, 0.49)1.1 (0.7, 1.7)
Double test strategy: Tumour Diameter AND USM
HighTumour ≥4cm AND USM-HR71213368212810.8 (4.4, 20.9)6.2 (5.2, 7.4)1.7 (0.9, 3.6)5.5 (2.2, 10.9)
ModerateTumour >4cm AND USM-MR2012285251214230.8 (19.9, 43.4)6.3 (5.2, 7.4)4.92 (3.29, 7.37)14.1 (8.8, 20.9)
Low Tumour <4cm AND/OR USM-LR3817091649564174758.5 (45.6, 70.6)87.6 (86.0, 89.0)0.67 (0.54, 0.82)2.2 (1.5, 3.0)
Double test strategy: Imaging characteristics AND USM
HighImaging characteristics positive AND USM high6119352821259.2 (3.5, 19.0)6.1 (5.1, 7.3)1.8 (0.9, 3.6)4.8 (1.9, 10.2)
ModerateImaging characteristics positive AND USM moderate2814097301316843.1 (30.8, 56.0)7.2 (6.1, 8.4)6.01 (4.35, 8.29)16.7 (11.4, 23.2)
Low Imaging characteristics negative AND/OR USM low31169316355531724(47.7 (35.1, 60.5)86.7 (85.1, 88.2)0.55 (0.43, 0.71)1.8 (1.2, 2.5)
Triple test strategy: Tumour Diameter AND Imaging characteristics AND USM
HighTumour ≥4cm AND Imaging positive AND USM-HR6100172811069.2 (3.5, 19.0)5.1 (4.2, 6.2)1.8 (0.8, 4.0)5.7 (2.1, 11.9)
ModerateTumour ≥4cm AND Imaging positive AND USM-M/-L3910266221414160.0 (47.1, 72.0)5.2 (4.3, 6.3)11.5 (8.7, 15.1)27.7 (20.5, 35.8)
LowTumour <4cm OR Imaging negative AND any USM result2017501684633177030.8 (19.9, 43.4)89.7 (88.2, 91.0)0.34 (0.24, 0.49)1.1 (0.7, 1.7)
Suppl. Table 16: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging characteristics by tumour diameter and diagnosis in Non-ACC participants (benign adrenocortical adenoma [ACA; N=1767], other benign [OB; N=87] and other malignant [OM; N=65] adrenal masses).
ACA Tumour diameter <4cm USMOB Tumour diameter <4cm USMOM Tumour diameter <4cm USM
Imaging characteristicsLow RiskModerate RiskHigh RiskTotalLow RiskModerate RiskHigh RiskTotalLow RiskModerate RiskHigh RiskTotal
Negative754419921265791171001
Positive1147418206515020108018
Total86849311014711224137118019
ACA Tumour diameter ≥4cm USMOB Tumour diameter ≥4cm USMOM Tumour diameter ≥4cm USM
Imaging characteristicsLow RiskModerate RiskHigh RiskTotalLow RiskModerate RiskHigh RiskTotalLow RiskModerate RiskHigh RiskTotal
Negative135621621312104260011
Positive432317837152241920645
Total178853329619256501920746
Suppl. Table 17: Cross tabulations of the results of urine steroid metabolomics (USM) risk score classification (low, moderate, or high risk of adrenocortical carcinoma [ACC]) and imaging characteristics on unenhanced CT (Total N=1549) by tumour diameter and diagnosis in Non-ACC participants with either benign adrenocortical adenoma [ACA; N=1328], other benign [OB; N=63], or other malignant [OM; N=60] adrenal masses.
ACA Tumour diameter <4cm USMOBOM
Tumour diameter <4cm USMTumour diameter <4cm USM
Unenhanced CT >20HU or heterogeneityLow RiskTotalTotal
Moderate RiskHigh RiskLow RiskModerate RiskHigh RiskLow RiskModerate RiskHigh RiskTotal
Negative59231562969551110000
Positive8247121414130171719541
Total6743627411109181281719541
ACA Tumour diameter ≥4cm USMOB Tumour diameter ≥4cm USMOM Tumour diameter ≥4cm USM
Unenhanced CT >20HU or heterogeneityTotalTotal
Low RiskModerate RiskHigh RiskLow RiskModerate RiskHigh RiskLow RiskModerate RiskHigh RiskTotal
Negative10245121591191211001
Positive2619145959014108018
Total12864262181618135118019

STARD Checklist

TITLE OR ABSTRACT
1Identification as a study of diagnostic accuracy using at least one measure of accuracy (such as sensitivity, specificity, predictive values, or AUC)1
ABSTRACT
2Structured summary of study design, methods, results, and conclusions (for specific guidance, see STARD for Abstracts)5
INTRODUCTION
3Scientific and clinical background, including the intended use and clinical role of the index test8
4Study objectives and hypotheses9
METHODS
Study design5Whether data collection was planned before the index test and reference standard were performed (prospective study) or after (retrospective study)10
Participants6Eligibility criteria10
7On what basis potentially eligible participants were identified (such as symptoms, results from previous tests, inclusion in registry)10
8Where and when potentially eligible participants were identified (setting, location and dates)10
9Whether participants formed a consecutive, random or convenience series10
Test methods10aIndex test, in sufficient detail to allow replication11,12, Suppl.
10bReference standard, in sufficient detail to allow replication11,12
11Rationale for choosing the reference standard (if alternatives exist)12
12aDefinition of and rationale for test positivity cut-offs or result categories of the index test, distinguishing pre-specified from exploratory11,12
12bDefinition of and rationale for test positivity cut-offs or result categories of the reference standard, distinguishing pre-specified from exploratory12
13aWhether clinical information and reference standard results were available to the performers/readers of the index test11,12
13bWhether clinical information and index test results were available to the assessors of the reference standard11,12
Analysis14Methods for estimating or comparing measures of diagnostic accuracy12, Suppl.
15How indeterminate index test or reference standard results were handled11,12,Suppl
16How missing data on the index test and reference standard were handledSuppl.
17Any analyses of variability in diagnostic accuracy, distinguishing pre-specified from exploratorySuppl.
18Intended sample size and how it was determined12
RESULTS
Participants19Flow of participants, using a diagram28 (Fig 1)
20Baseline demographic and clinical characteristics of participants14
21aDistribution of severity of disease in those with the target condition14
21bDistribution of alternative diagnoses in those without the target condition14
22Time interval and any clinical interventions between index test and reference standardSuppl.
Test results23Cross tabulation of the index test results (or their distribution) by the results of the reference standard24,25,29, 30, Suppl.
24Estimates of diagnostic accuracy and their precision (such as 95% confidence intervals)15,16,17
25Any adverse events from performing the index test or the reference standardN/A
DISCUSSION
26Study limitations, including sources of potential bias, statistical uncertainty, and generalisability20
27Implications for practice, including the intended use and clinical role of the index test19,20,21
OTHER INFORMATION
28Registration number and name of registry10
29Where the full study protocol can be accessedN/A
30Sources of funding and other support; role of funders13

Supplementary References

1. Arlt W, Biehl M, Taylor AE, et al. Urine steroid metabolomics as a biomarker tool for detecting malignancy in adrenal tumors. J Clin Endocrinol Metab 2011; 96(12): 3775-84.

2. WHO Classification of Tumours of Endocrine Organs. 4th ed: International Agency for Research on Cancer (IARC); Lyon, France; 2017.

3. Giordano TJ, Berney D, de Krijger RR, et al. Carcinoma of the Adrenal Cortex Histopathology Reporting Guide. International Collaboration on Cancer Reporting. Sydney, Australia; 2019.

4. Lenders JW, Duh QY, Eisenhofer G, et al. Pheochromocytoma and paraganglioma: an endocrine society clinical practice guideline. J Clin Endocrinol Metab 2014; 99(6): 1915-42.

5. Nieman LK, Biller BM, Findling JW, et al. The diagnosis of Cushing’s syndrome: an Endocrine Society Clinical Practice Guideline. J Clin Endocrinol Metab 2008; 93(5): 1526-40.

6. Funder JW, Carey RM, Fardella C, et al. Case detection, diagnosis, and treatment of patients with primary aldosteronism: an endocrine society clinical practice guideline. J Clin Endocrinol Metab 2008; 93(9): 3266-81.

7. Fassnacht M, Arlt W, Bancos I, et al. Management of adrenal incidentalomas: European Society of Endocrinology Clinical Practice Guideline in collaboration with the European Network for the Study of Adrenal Tumors. Eur J Endocrinol 2016; 175(2): G1-G34.

8. Kohonen T. Self-Organizing Maps. 2nd ed. Berlin: Springer; 1997.

9. Biehl M. A no-nonsense GMLVQ demo code (Version 2.3). http://www.cs.rug.nl/~biehl/gmlvq.

10. Biehl M, Schneider P, Smith DJ, et al. Matrix relevance LVQ in steroid metabolomics based classification of adrenal tumors. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning 2012 25-27 April 2012 Bruges (Belgium); 2012.

11. Schneider P, Biehl M, Hammer B. Adaptive relevance matrices in learning vector quantization. Neural Comput 2009; 21(12): 3532-61.

12. Biehl M, Hammer B, Villmann T. Prototype-based models in machine learning. Wiley Interdiscip Rev Cogn Sci 2016; 7(2): 92-111.

13. Fawcett T. An introduction to ROC analysis. Pattern Recogn Lett 2006; 27(8): 861-74.