1. INTRODUCTION
Immune checkpoint inhibitors (ICIs) have led to a groundbreaking shift in cancer therapy by significantly improving treatment outcomes across various malignancies [1, 2]. These treatments have been approved for various cancers, including malignant melanoma, non-small cell lung cancer (NSCLC), and renal cell carcinoma (RCC), each of which shows a distinct epidemiological pattern and response to ICIs [3]. ICIs address critical needs in oncology by targeting cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), programmed cell death protein 1 (PD-1), and programmed cell death ligand 1 (PD-L1). Research has underscored the efficacy of CTLA-4 inhibitors in melanoma treatment, in which they significantly increase survival rates [4]. Similarly, PD-1/PD-L1 inhibitors have revolutionized lung cancer treatment: analysis of PD-1 and PD-L1 expression levels enables personalized strategies and enhances the precision of immunotherapy [5].
ICIs have also shown promise in treating other cancers, such as triple-negative breast cancer and HER2-positive breast cancer with PD-1 inhibitors, and in improving overall survival in platinum-resistant head and neck squamous cell carcinoma with nivolumab [6, 7]. For patients with certain types of metastatic RCC, a combination of nivolumab and ipilimumab is now the preferred initial treatment, taking into account patient tolerance and quality of life [8]. However, despite their broad efficacy, ICIs exhibit unique response patterns and a range of tumor responses that differ from those with traditional therapies, such as chemotherapy, radiotherapy, and targeted therapy. Challenges include delayed responses, pseudo-progression, hyper-progression, and immune-related adverse events (irAEs), which can cause severe endocrine toxicity affecting the thyroid, adrenal glands, and pancreas [9, 10]. These complications may significantly affect patient well-being and survival, thus underscoring the need for careful patient monitoring and a deeper understanding of the epidemiological data associated with ICIs and cancer treatment.
In recent years, emerging treatment strategies based on ICIs, such as combinations with chemotherapy, anti-angiogenic drugs, and dual ICI therapies, have entered clinical practice [11]. Extensive clinical research evaluating these combination therapies has demonstrated that both chemotherapy with ICIs and dual ICI treatments substantially improve patient outcomes and prognosis [12–14]. Notably, dual ICI treatments, particularly ipilimumab and nivolumab, have shown objective response rates of 58–71% in phase III clinical trials for metastatic clear cell renal cell carcinoma [15, 16]. The enhanced therapeutic effects are attributed to the concurrent blockade of CTLA-4 and PD-1/PD-L1 pathways, thereby fostering synergistic anti-cancer effects. Nonetheless, although these combination therapies may offer synergistic benefits, they can also significantly increase toxicity levels [17]. This concern is particularly relevant to ICI-induced endocrine toxicity: focused research has extensively explored this toxicity, yet current studies have often been constrained by their narrow scope and methods, thus limiting their applicability to the diverse and evolving array of ICI treatments. This limitation is particularly salient in the comparison of safety profiles across multiple ICI strategies—a task for which traditional meta-analysis is often inadequate. To address this gap, this study leverages the sophisticated capabilities of network meta-analysis (NMA), an analytical approach uniquely suited for comparing multiple interventions in an integrated manner. Unlike traditional meta-analyses, NMA enables the comparison of multiple treatments both directly and indirectly, thus offering a comprehensive safety ranking for specific endocrine dysfunctions across a broad range of ICI therapies. The use of NMA methods represents an important advance over previous studies, by enabling a more nuanced understanding of ICI-related endocrine toxicity.
This study aimed to provide a robust, quantitative safety ranking of endocrine toxicities induced by various ICIs, in both monotherapy and combination therapy settings. Our analysis both augments current understanding of ICI-induced endocrine toxicity and offers clinicians a decision-making tool for personalizing ICI regimens while minimizing associated risks.
2. METHODS
2.1 Search strategy
We conducted a literature search of PubMed, Embase, the Cochrane Library, and ClinicalTrials.gov to identify relevant studies published in English through December 2022. Our search strategy used the following keywords: ipilimumab, tremelimumab, pembrolizumab, nivolumab, camrelizumab, atezolizumab, durvalumab, avelumab, CTLA-4, PD-1, PD-L1, and immune checkpoint inhibitor. The search strategy is detailed in Table S1.
2.2 Inclusion and exclusion criteria
To select eligible studies, we followed the population, intervention, comparison, and outcome (PICO) framework of evidence-based medicine and applied specific inclusion and exclusion criteria. We included randomized controlled trials (RCTs) of ICI monotherapy or combination therapy that reported data on immune-related endocrine toxicity, such as hyperthyroidism, hypothyroidism, hypophysitis, adrenal insufficiency, and thyroiditis. When multiple studies reported data on the same trial, we included only the most complete and updated data. Two investigators independently screened all titles, abstracts, and full texts for eligibility, and any discrepancies were resolved through consultation. In this process, we excluded non-English articles, as well as reviews, case reports, editorials, meta-analyses, letters to the editor, and conference records.
2.3 Data extraction
We extracted the following information from the included studies: clinical trial NCT number; year of study; trial location; trial phase; type of cancer; total number of patients; treatment regimen; median follow-up time; number of patients experiencing immune-related hyperthyroidism, hypothyroidism, hypophysitis, adrenal insufficiency, or thyroiditis; and total sample size. To ensure consistency and accuracy, we extracted research data and detailed content by using standardized electronic spreadsheets.
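As an illustration of how the extracted arm-level counts can feed the effect measure used in Section 2.5, the following minimal sketch computes an odds ratio with a Woolf-type 95% confidence interval for two arms of one trial. The record layout, the field names, and the 0.5 continuity correction for zero cells are assumptions made for this example; they are not taken from the study's spreadsheets or from the Stata routines that were actually used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmRecord:
    """One treatment arm of one trial (hypothetical field names)."""
    trial: str
    treatment: str
    events: int      # patients with the endocrine event of interest
    n_safety: int    # patients in the safety analysis population


def odds_ratio(a: ArmRecord, b: ArmRecord, z: float = 1.959964):
    """Return (OR, lower, upper) for arm a versus arm b.

    Uses the standard error of the log OR (Woolf method). If any cell of
    the 2x2 table is zero, 0.5 is added to every cell (an assumption of
    this sketch, not a documented choice of the study).
    """
    cells = [a.events, a.n_safety - a.events, b.events, b.n_safety - b.events]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    e1, ne1, e2, ne2 = cells
    log_or = math.log((e1 * ne2) / (ne1 * e2))
    se = math.sqrt(sum(1.0 / c for c in cells))
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    ici = ArmRecord("EXAMPLE-1", "ICI", events=20, n_safety=200)
    ctl = ArmRecord("EXAMPLE-1", "conventional", events=5, n_safety=200)
    print(odds_ratio(ici, ctl))
```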
2.4 Quality assessment
We used the Cochrane Risk of Bias Assessment Tool to evaluate the potential risk of bias in these clinical trials. This tool evaluates the quality of clinical studies in seven dimensions: random sequence generation (selection bias), allocation concealment (selection bias), blinding to intervention (performance bias), blinding to outcome assessment (detection bias), incomplete outcome data (attrition bias), selective reporting (reporting bias), and other biases. Study quality was classified as low risk of bias (+), high risk of bias (-), or unclear (?).
2.5 Statistical analysis
We used Stata 16.0 to perform the NMA and to generate evidence network diagrams for the various ICI treatment regimens. Outcomes were binary and were summarized as odds ratios (ORs) with 95% confidence intervals (CIs); P values less than 0.05 were considered significant. We assessed heterogeneity among studies by using direct meta-analysis and the I² statistic, and selected fixed-effects or random-effects models accordingly: an I² ≥50% was taken to indicate heterogeneity large enough to influence the results, and a random-effects model was used; otherwise, a fixed-effects model was used. For closed loops, we tested for inconsistency between direct and indirect evidence by using the OR and its 95% CI; a fixed-effects model was used if the 95% CI contained 0 and P was greater than 0.05. We evaluated consistency and inconsistency with the node-splitting method, with a P value greater than 0.05 indicating good consistency. We estimated overall treatment rankings with the surface under the cumulative ranking curve (SUCRA) and a league table for each endocrine-related adverse effect. We used funnel plots to assess potential publication bias among the included studies.
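The model-selection rule described above can be sketched as follows. This is an illustrative Python example, not the Stata code used in the study: it pools study-level log ORs by inverse-variance weighting, computes Cochran's Q and I², and switches to a random-effects estimate when I² is at least 50%. The DerSimonian-Laird estimator for the between-study variance is an assumption of this sketch, because the text does not state which estimator was used.

```python
import math


def pool(log_ors, ses, i2_threshold=50.0):
    """Pool log odds ratios, choosing the model from I² (in percent).

    log_ors: study-level log ORs; ses: their standard errors.
    Returns a dict with the chosen model, pooled log OR, its SE, and I².
    """
    if len(log_ors) != len(ses) or len(log_ors) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effects (inverse-variance) estimate.
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)

    # Cochran's Q and I² = max(0, (Q - df) / Q) * 100.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    if i2 < i2_threshold:
        return {"model": "fixed", "log_or": fixed,
                "se": math.sqrt(1.0 / sum(w)), "i2": i2}

    # Random effects with the DerSimonian-Laird tau² (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    random_est = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    return {"model": "random", "log_or": random_est,
            "se": math.sqrt(1.0 / sum(w_re)), "i2": i2}
```

Exponentiating `log_or` and `log_or ± 1.96 × se` gives the pooled OR and its 95% CI on the scale reported in the league tables.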
3. RESULTS
3.1 Literature selection process and characteristics of the included literature
Figure 1 illustrates the flowchart of the study selection process. Initially, we identified 2,942 potentially relevant studies through electronic searches of PubMed, Embase, the Cochrane Library, and ClinicalTrials.gov. After screening the titles and abstracts, we assessed 406 articles for eligibility. Ultimately, we included 55 head-to-head RCTs comprising 32,522 patients in the quantitative NMA synthesis; 11 were phase II trials and 44 were phase III trials. We analyzed data on endocrine toxicity from 48 studies on hypothyroidism, 26 studies on hyperthyroidism, 16 studies on hypophysitis, 15 studies on adrenal insufficiency, and 11 studies on thyroiditis. The detailed baseline characteristics of the included studies are presented in Table 1.
Baseline characteristics of 55 randomized controlled trials for network meta-analysis by cancer type.
| Study ID | Author | Year | Region | Phase | Cancer type | Total No. | Safety analysis No. | Arm | Treatment | NCT No. |
|---|---|---|---|---|---|---|---|---|---|---|
| KEYNOTE 006 [18] | Robert | 2015 | MN | III | MM | 834 | 278 | 1 | PEM 10 mg/kg every 2 weeks | 01866319 |
| | | | | | | | 277 | 2 | PEM 10 mg/kg every 3 weeks | |
| | | | | | | | 256 | 3 | IPI 3 mg/kg every 3 weeks | |
| OAK [19] | Rittmeyer | 2017 | Japan | III | NSCLC | 101 | 56 | 1 | ATE 1,200 mg every 3 weeks | 02008227 |
| | | | | | | | 45 | 2 | DTX 75 mg/m2 every 3 weeks | |
| CheckMate 037 [20] | Larkin | 2017 | MN | III | MM | 405 | 268 | 1 | NIV 3 mg/kg every 2 weeks | 01721746 |
| | | | | | | | 102 | 2 | DAC 1,000 mg/m2 or CBP AUC = 6 + PTX 175 mg/m2 every 3 weeks | |
| CheckMate 025 [21] | Motzer | 2015 | MN | III | RCC | 821 | 406 | 1 | NIV 3 mg/kg every 2 weeks | 01668784 |
| | | | | | | | 397 | 2 | EVE 10 mg orally once daily | |
| KEYNOTE 045 [22] | Bellmunt | 2017 | MN | III | UC | 542 | 266 | 1 | PEM 200 mg every 3 weeks | 02256436 |
| | | | | | | | 255 | 2 | PTX, DTX, or VIN | |
| CA184-162 [23] | Bang | 2017 | MN | III | GC/GEJC | 114 | 57 | 1 | IPI 10 mg/kg every 3 weeks | 01585987 |
| | | | | | | | 45 | 2 | Best supportive care | |
| CA184-169 [24] | Ascierto | 2017 | MN | III | MM | 727 | 364 | 1 | IPI 10 mg/kg every 3 weeks | 01515189 |
| | | | | | | | 362 | 2 | IPI 3 mg/kg every 3 weeks | |
| KEYNOTE 024 [25] | Reck | 2016 | MN | III | NSCLC | 305 | 154 | 1 | PEM 200 mg every 3 weeks | 02142738 |
| | | | | | | | 150 | 2 | PT-based chemotherapy | |
| KEYNOTE 021 [26] | Langer | 2016 | USA, Taiwan | II | NSCLC | 123 | 59 | 1 | PEM 200 mg + CBP (AUC = 5) + PMT 500 mg/m2 every 3 weeks | 02039674 |
| | | | | | | | 62 | 2 | CBP (AUC = 5) + PMT 500 mg/m2 every 3 weeks | |
| CheckMate 069 [27] | Hodi | 2016 | USA, France | III | MM | 142 | 94 | 1 | NIV 1 mg/kg + IPI 3 mg/kg every 3 weeks | 01927419 |
| | | | | | | | 46 | 2 | IPI 3 mg/kg every 3 weeks | |
| KEYNOTE 010 [28] | Herbst | 2016 | MN | II/III | NSCLC | 1,034 | 339 | 1 | PEM 2 mg/kg every 3 weeks | 01905657 |
| | | | | | | | 343 | 2 | PEM 10 mg/kg every 3 weeks | |
| | | | | | | | 309 | 3 | DTX 75 mg/m2 every 3 weeks | |
| POPLAR [29] | Fehrenbacher | 2016 | MN | II | NSCLC | 287 | 142 | 1 | ATE 1,200 mg every 3 weeks | 01903993 |
| | | | | | | | 135 | 2 | DTX 75 mg/m2 every 3 weeks | |
| KEYNOTE 002 [30] | Ribas | 2015 | MN | II | MM | 540 | 178 | 1 | PEM 2 mg/kg every 3 weeks | 01704287 |
| | | | | | | | 179 | 2 | PEM 10 mg/kg every 3 weeks | |
| | | | | | | | 171 | 3 | ICC (PTX and CBP, PTX, CBP, DTIC oral TEM) | |
| A3671009 [31] | Ribas | 2013 | MN | III | MM | 655 | 325 | 1 | TRE 15 mg/kg every 90 days | 00257205 |
| | | | | | | | 316 | 2 | TEM 200 mg/m2 every 4 weeks or DTIC 1,000 mg/m2 every 3 weeks | |
| JAVELIN Bladder 100 [32] | Powles | 2020 | | III | UC | 689 | 344 | 1 | AVE 10 mg/kg every 2 weeks | 02603432 |
| | | | | | | | 345 | 2 | Best supportive care | |
| JAVELIN Renal 101 [33] | Motzer | 2019 | MN | III | RCC | 886 | 442 | 1 | AVE 10 mg/kg every 2 weeks + AXI 5 mg twice daily | 02684006 |
| | | | | | | | 444 | 2 | SUN 50 mg orally once daily | |
| CheckMate 067 [34] | Hodi | 2018 | MN | III | MM | 937 | 313 | 1 | NIV 1 mg/kg + IPI 3 mg/kg every 3 weeks | 01844505 |
| | | | | | | | 313 | 2 | NIV 3 mg/kg every 2 weeks | |
| | | | | | | | 311 | 3 | IPI 3 mg/kg every 3 weeks | |
| CheckMate 511 [35] | Lebbe | 2019 | MN | III | MM | 358 | 180 | 1 | NIV 3 mg/kg + IPI 1 mg/kg every 3 weeks | 02714218 |
| | | | | | | | 178 | 2 | NIV 1 mg/kg + IPI 3 mg/kg every 3 weeks | |
| NRG GY003 [36] | Zamarin | 2020 | USA | II | OC | 100 | 49 | 1 | NIV 3 mg/kg every 2 weeks | 02498600 |
| | | | | | | | 51 | 2 | NIV 3 mg/kg + IPI 1 mg/kg every 3 weeks | |
| ANNONC 152 [37] | Ferris | 2020 | MN | III | HNSC | 274 | 78 | 1 | DUR 10 mg/kg every 2 weeks | 02369874 |
| | | | | | | | 104 | 2 | DUR 20 mg/kg + TRE 1 mg/kg every 4 weeks | |
| | | | | | | | 92 | 3 | SOC (CET, taxane, MTX or FLU) | |
| KEYNOTE 252 [38] | Long | 2019 | MN | III | MM | 706 | 353 | 1 | PEM 200 mg every 3 weeks + epacadostat 100 mg twice daily | 02752074 |
| | | | | | | | 352 | 2 | PEM 200 mg every 3 weeks | |
| OpACIN neo [39] | Rozeman | 2019 | MN | II | MM | 89 | 30 | 1 | IPI 3 mg/kg + NIV 1 mg/kg every 3 weeks | 02977052 |
| | | | | | | | 30 | 2 | IPI 1 mg/kg + NIV 3 mg/kg every 3 weeks | |
| | | | | | | | 26 | 3 | IPI 3 mg/kg every 3 weeks | |
| KEYNOTE 407 [40] | Paz-Ares | 2018 | MN | III | NSCLC | 559 | 278 | 1 | PEM 200 mg every 3 weeks + CBP (AUC = 6) + PTX 200 mg/m2 or Nab-PTX 100 mg/m2 | 02775435 |
| | | | | | | | 280 | 2 | CBP (AUC = 6) + PTX 200 mg/m2 or Nab-PTX 100 mg/m2 | |
| KEYNOTE 042 [41] | Mok | 2019 | MN | III | NSCLC | 1,274 | 636 | 1 | PEM 200 mg every 3 weeks | 02220894 |
| | | | | | | | 615 | 2 | ICC (PT-based chemotherapy) | |
| KEYNOTE 048 [42] | Burtness | 2019 | MN | III | HNSC | 882 | 300 | 1 | PEM 200 mg every 3 weeks | 02358031 |
| | | | | | | | 276 | 2 | PEM 200 mg every 3 weeks + CBP (AUC = 5) or CIS 100 mg/m2 and 5-FLU 1,000 mg/m2 per day for 4 consecutive days | |
| | | | | | | | 287 | 3 | CET 400 mg/m2 loading dose, then 250 mg/m2 per week + CBP (AUC = 5) or CIS 100 mg/m2 and 5-FLU 1,000 mg/m2 per day for 4 consecutive days | |
| KEYNOTE 189 [43] | Gadgeel | 2020 | MN | III | NSCLC | 616 | 405 | 1 | PEM 200 mg every 3 weeks + PMT 500 mg/m2 every 3 weeks + CBP (AUC = 5) or CIS 75 mg/m2 every 3 weeks | 02578680 |
| | | | | | | | 202 | 2 | PMT 500 mg/m2 every 3 weeks + CBP (AUC = 5) or CIS 75 mg/m2 every 3 weeks | |
| KEYNOTE 522 [44] | Schmid | 2022 | MN | III | BC | 1,174 | 783 | 1 | PEM 200 mg every 3 weeks + PTX 80 mg/m2 + CBP 5 mg/m2 | 03036488 |
| | | | | | | | 389 | 2 | PTX 80 mg/m2 + CBP 5 mg/m2 | |
| KEYNOTE 024 [45] | Reck | 2019 | MN | III | NSCLC | 305 | 154 | 1 | PEM 200 mg every 3 weeks | 02142738 |
| | | | | | | | 150 | 2 | PT-based chemotherapy | |
| IMpassion130 [46] | Schmid | 2019 | MN | III | BC | 902 | 453 | 1 | ATE 840 mg days 1 and 15 of each 28-day cycle + Nab-PTX 100 mg/m2 days 1, 8, and 15 of each 28-day cycle | 02425891 |
| | | | | | | | 437 | 2 | Nab-PTX 100 mg/m2 days 1, 8, and 15 of each 28-day cycle | |
| KEYNOTE 177 [47] | André | 2020 | MN | III | CRC | 307 | 153 | 1 | PEM 200 mg every 3 weeks | 02563002 |
| | | | | | | | 143 | 2 | mFOLFOX6 or FOLFIRI + BEV 5 mg/kg every 2 weeks or CET 400 mg/m2 over 2 hours, then 250 mg/m2 over 1 hour weekly in each 2-week cycle | |
| CheckMate 214 [48] | Motzer | 2019 | MN | III | RCC | 1,096 | 547 | 1 | NIV 3 mg/kg + IPI 1 mg/kg every 3 weeks for four doses, followed by NIV 3 mg/kg every 2 weeks | 02231749 |
| | | | | | | | 535 | 2 | SUN 50 mg orally once daily for 4 weeks | |
| KEYNOTE 040 [49] | Cohen | 2018 | MN | III | HNSC | 480 | 246 | 1 | PEM 200 mg every 3 weeks | 02252042 |
| | | | | | | | 234 | 2 | MTX 40 mg/m2 days 1, 8, and 15 of each 3-week cycle, DTX 75 mg/m2 day 1 of each 3-week cycle or CET 400 mg/m2 loading dose on day 1 and 250 mg/m2 on days 8 and 15 of cycle 1, followed by CET 250 mg/m2 on days 1, 8, and 15 of each subsequent 3-week cycle | |
| IMmotion 151 [50] | Rini | 2019 | MN | III | RCC | 915 | 454 | 1 | ATE 1,200 mg every 3 weeks + BEV 15 mg/kg every 3 weeks | 02420821 |
| | | | | | | | 461 | 2 | SUN 50 mg orally once daily for 4 weeks | |
| KEYNOTE 045 [51] | Bellmunt | 2017 | MN | III | UC | 542 | 266 | 1 | PEM 200 mg every 3 weeks | 02256436 |
| | | | | | | | 255 | 2 | PTX 175 mg/m2, DTX 75 mg/m2, or VIN 320 mg/m2 every 3 weeks | |
| KEYNOTE 062 [52] | Shitara | 2020 | MN | III | GC | 763 | 254 | 1 | PEM 200 mg every 3 weeks | 02494583 |
| | | | | | | | 250 | 2 | PEM 200 mg every 3 weeks + CIS 80 mg/m2 + FLU 800 mg/m2/d day 1 to 5 or CAP 1,000 mg/m2 twice daily day 1–14 every 3 weeks | |
| | | | | | | | 244 | 3 | CIS 80 mg/m2/d on day 1 + FLU 800 mg/m2/d on days 1–5 or CAP 1,000 mg/m2 twice daily | |
| IMspire 150 [53] | Gutzmer | 2020 | MN | III | MM | 777 | 230 | 1 | ATE 840 mg + VEM 960 mg twice per day + COM 60 mg once daily | 02908672 |
| | | | | | | | 281 | 2 | VEM 960 mg twice per day + COM 60 mg once daily | |
| IMpassion 031 [54] | Mittendorf | 2020 | MN | III | BC | 455 | 164 | 1 | ATE 840 mg + Nab-PTX 125 mg/m2 every week + DOX 60 mg/m2 and CTX 600 mg/m2 every 2 weeks | 03197935 |
| | | | | | | | 167 | 2 | Nab-PTX 125 mg/m2 every week, followed by DOX 60 mg/m2 and CTX 600 mg/m2 every 2 weeks | |
| CheckMate 227 [55] | Hellmann | 2018 | MN | III | NSCLC | 1,189 | 396 | 1 | NIV 3 mg/kg every 2 weeks + IPI 1 mg/kg every 6 weeks | 02477826 |
| | | | | | | | 396 | 2 | NIV 240 mg every 2 weeks | |
| | | | | | | | 397 | 3 | PT-based chemotherapy | |
| CheckMate 9 LA [56] | Paz-Ares | 2021 | MN | III | NSCLC | 1,150 | 358 | 1 | NIV 360 mg every 3 weeks + IPI 1 mg/kg every 6 weeks + PT-based chemotherapy | 03215706 |
| | | | | | | | 349 | 2 | PT-based chemotherapy | |
| KEYNOTE 426 [57] | Powles | 2020 | MN | III | RCC | 861 | 429 | 1 | PEM 200 mg every 3 weeks + conventional therapy | 02853331 |
| | | | | | | | 425 | 2 | SUN 50 mg once daily | |
| KEYNOTE 604 [58] | Rudin | 2020 | MN | III | SCLC | 453 | 223 | 1 | PEM 200 mg every 3 weeks + VP16 100 mg/m2 + (AUC = 5) or CIS 75 mg/m2 | 03066778 |
| | | | | | | | 223 | 2 | VP16 100 mg/m2 + (AUC = 5) or CIS 75 mg/m2 | |
| CheckMate 066 [59] | Ascierto | 2018 | MN | III | MM | 418 | 206 | 1 | NIV 3 mg/kg every 2 weeks | 01721772 |
| | | | | | | | 205 | 2 | DAC 1,000 mg/m2 every 3 weeks | |
| I-SPY2 Trial [60] | Nanda | 2020 | MN | II | BC | 250 | 69 | 1 | PEM 200 mg every 3 weeks + PTX 80 mg/m2 every week + DOX 60 mg/m2 + CTX 600 mg/m2 every 2–3 weeks | 01042379 |
| | | | | | | | 181 | 2 | PTX 80 mg/m2 every week + DOX 60 mg/m2 + CTX 600 mg/m2 every 2–3 weeks | |
| CheckMate 238 [61] | Ascierto | 2020 | MN | III | MM | 905 | 452 | 1 | NIV 3 mg/kg every 2 weeks | 02388906 |
| | | | | | | | 453 | 2 | IPI 10 mg/kg every 3 weeks | |
| JAVELIN Gastric 300 [62] | Bang | 2018 | MN | III | GC | 371 | 184 | 1 | AVE 10 mg/kg every 2 weeks | 02625623 |
| | | | | | | | 177 | 2 | PTX 80 mg/m2 or IRI 150 mg/m2 every week | |
| CASPIAN [63] | Goldman | 2020 | MN | III | SCLC | 972 | 266 | 1 | DUR 1,500 mg + TRE 75 mg every 3 weeks + PT + VP16 80–100 mg/m2 | 03043872 |
| | | | | | | | 265 | 2 | DUR 1,500 mg every 3 weeks + PT + VP16 | |
| | | | | | | | 266 | 3 | PT + VP16 | |
| IMblaze 370 [64] | Eng | 2019 | MN | III | CRC | 363 | 183 | 1 | ATE 840 mg + COM 60 mg once daily | 02788279 |
| | | | | | | | 90 | 2 | ATE 1,200 mg every 3 weeks | |
| | | | | | | | 90 | 3 | REG 160 mg once daily | |
| E1609 [65] | Ahmad | 2019 | MN | III | MM | 1,018 | 516 | 1 | IPI 3 mg/kg every 3 weeks | 01274338 |
| | | | | | | | 503 | 2 | IPI 10 mg/kg every 3 weeks | |
| Jing Nie et al. [66] | Nie | 2019 | China | II | HL | 86 | 19 | 1 | CAM 200 mg every 3 weeks | 03250962 |
| | | | | | | | 67 | 2 | CAM 200 mg every 3 weeks + DTB 10 mg/d day 1–5 | |
| PROLUNG [67] | Arrieta | 2020 | Mexico | II | NSCLC | 78 | 40 | 1 | PEM 200 mg every 3 weeks + DTX 75 mg/m2 every 3 weeks | 02574598 |
| | | | | | | | 38 | 2 | DTX 75 mg/m2 every 3 weeks | |
| KATE2 [68] | Emens | 2020 | MN | II | BC | 200 | 132 | 1 | ATE 1,200 mg + TDM-1 3.6 mg/kg | 02924883 |
| | | | | | | | 68 | 2 | Trastuzumab emtansine 3.6 mg/kg | |
| CameL [69] | Zhou | 2020 | China | III | NSCLC | 419 | 205 | 1 | CAM 200 mg every 3 weeks + CBP (AUC = 5) + PMT 500 mg/m2 every 3 weeks | 03134872 |
| | | | | | | | 207 | 2 | CBP (AUC = 5) + PMT 500 mg/m2 every 3 weeks | |
| ESCORT [70] | Huang | 2020 | China | | EC | 457 | 228 | 1 | CAM 200 mg every 2 weeks | 03099382 |
| | | | | | | | 220 | 2 | DTX 75 mg/m2 every 3 weeks or IRI 180 mg/m2 every 2 weeks | |
| CheckMate 9ER [71] | Choueiri | 2021 | MN | III | RCC | 651 | 323 | 1 | NIV 240 mg every 2 weeks + CZT 40 mg once daily | 03141177 |
| | | | | | | | 328 | 2 | SUN 50 mg once daily | |
| KEYNOTE 181 [72] | Kojima | 2020 | MN | III | EC | 628 | 314 | 1 | PEM 200 mg every 3 weeks | 02559687 |
| | | | | | | | 296 | 2 | PTX 80–100 mg/m2 every week or DTX 75 mg/m2 every 3 weeks or IRI 180 mg/m2 every 2 weeks | |
CTCAE = Common Terminology Criteria for Adverse Events; TrAE = treatment-related adverse event; MN = multinational; PD-L1 = programmed cell death ligand 1; MM = melanoma; NIV = nivolumab; IPI = ipilimumab; DAC = dacarbazine; PEM = pembrolizumab; NR = not reported; ICC = investigator’s choice chemotherapy; CBP = carboplatin; AUC = area under the curve; PT = platinum; PTX = paclitaxel; TEM = temozolomide; ATE = atezolizumab; CIS = cisplatin; GEM = gemcitabine; ETO = VP16; DTX = docetaxel; CR = concurrent regimen; PR = phased regimen; MTX = methotrexate; CET = cetuximab; EVE = everolimus; VIN = vinflunine; NSCLC = non-small cell lung cancer; RCC = renal cell carcinoma; GC = gastric carcinoma; GEJC = gastro-esophageal junction carcinoma; UC = urothelial cancer; PMT = pemetrexed; AVE = avelumab; AXI = axitinib; SUN = sunitinib; OC = ovarian carcinoma; TRE = tremelimumab; SOC = standard of care; HNSC = head and neck squamous carcinoma; FLU = fluorouracil; BC = breast carcinoma; BEV = bevacizumab; CRC = colorectal carcinoma; CAP = capecitabine; VEM = vemurafenib; COM = cobimetinib; CTX = cyclophosphamide; DOX = doxorubicin; SCLC = small cell lung cancer; REG = regorafenib; HL = Hodgkin lymphoma; CAM = camrelizumab; DTB = decitabine; CZT = cabozantinib; EC = esophageal carcinoma; IRI = irinotecan; DUR = durvalumab.
3.2 Quality assessment results of the included studies
Among the 55 included studies, 69.1% met the criteria for random sequence generation, 47.3% reported allocation concealment, 43.7% implemented blinding to the intervention, 96.4% implemented blinding to the outcome assessment, and 90.9% had complete outcome data. Furthermore, as shown in Figure S1, selective reporting bias was not detected in 98.2% of the included studies. These results suggest that the studies included in this meta-analysis are of relatively high quality.
3.3 Network meta-analysis results
3.3.1 Network plot
Figure 2 displays the network plots for five immune-related endocrine toxicity events: hypothyroidism, hyperthyroidism, hypophysitis, adrenal insufficiency, and thyroiditis. The 11 ICI treatment strategies in the analysis included four PD-1 inhibitors, three PD-L1 inhibitors, and two CTLA-4 inhibitors. The ICI monotherapy options included pembrolizumab (n = 2,675), ipilimumab (n = 1,809), nivolumab (n = 1,411), avelumab (n = 528), tremelimumab (n = 325), camrelizumab (n = 307), atezolizumab (n = 288), and durvalumab (n = 78). ICI combination therapy included one ICI plus conventional therapy (n = 6,078), two ICIs plus conventional therapy (n = 624), and two ICIs (n = 2,043). In addition, 10,786 patients received conventional therapy.
3.3.2 Inconsistency testing
The global inconsistency test revealed no significant inconsistency for hypothyroidism, hyperthyroidism, hypophysitis, adrenal insufficiency, and thyroiditis (Table S2), with all P values greater than 0.05. The I² statistic showed high heterogeneity in hypothyroidism, hyperthyroidism, hypophysitis, and adrenal insufficiency, with I² values of 85.7%, 75.4%, 73.4%, and 70.5%, respectively (Figures S2–S6). The closed-loop inconsistency test, which compares direct and indirect evidence within each loop, also showed no significant inconsistency, because the 95% CIs for all adverse reactions included 0 (Tables S3–S7). The node-splitting method was used to test for inconsistency between each direct and indirect comparison; all P values were greater than 0.05, indicating no significant inconsistency (Tables S8–S12). Therefore, a consistency model was chosen for the NMA.
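The logic behind these inconsistency checks can be illustrated with a simplified sketch for a single closed loop A-B-C. It forms an indirect estimate of A versus B through the common comparator C (the Bucher method) and then tests whether the direct and indirect log ORs differ. This is a teaching example under stated assumptions, not the Stata loop-specific or node-splitting routine used in the study; the function names and the normal approximation are choices made for this illustration.

```python
import math
from statistics import NormalDist


def indirect_estimate(log_or_ac, se_ac, log_or_bc, se_bc):
    """Bucher indirect log OR of A vs B via common comparator C, with its SE."""
    return log_or_ac - log_or_bc, math.sqrt(se_ac ** 2 + se_bc ** 2)


def inconsistency_test(direct, se_direct, indirect, se_indirect, z=1.959964):
    """Compare direct and indirect log ORs for the same contrast.

    Returns (difference, (lower, upper), two-sided P value). A 95% CI that
    includes 0 and P > 0.05 are read as no significant inconsistency.
    """
    diff = direct - indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(diff / se)))
    return diff, (diff - z * se, diff + z * se), p
```

With this convention, the criteria reported above correspond to a confidence interval for the difference that spans 0 and a P value above 0.05.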
3.3.3 League table and cumulative probability ranking
ICI combination therapies, including one ICI plus conventional therapy, dual ICIs plus conventional therapy, and dual ICIs, were associated with a higher incidence of hypothyroidism than conventional therapy (Figure 3a). Treatment with dual ICIs plus conventional therapy resulted in a significantly greater risk of hypothyroidism than treatment with one ICI plus conventional therapy. Among the ICI monotherapies, nivolumab, pembrolizumab, camrelizumab, and durvalumab were associated with significantly greater incidence of hypothyroidism than conventional therapy, whereas ipilimumab, avelumab, and atezolizumab were not associated with elevated risk of hypothyroidism. Nivolumab and pembrolizumab were associated with significantly greater risk of hypothyroidism than ipilimumab, whereas no significant difference in the incidence of hypothyroidism was observed among the other ICI monotherapy strategies. SUCRA analysis indicated that ipilimumab had the best safety ranking for hypothyroidism (probability = 78.6%) among treatment regimens involving ICIs, and was followed by avelumab (77.3%), one ICI in combination with conventional therapy (67.7%), two ICIs (58.4%), atezolizumab (48.8%), nivolumab (38.7%), pembrolizumab (34.6%), camrelizumab (27.6%), durvalumab (12.2%), and dual ICIs in combination with conventional therapy (9.1%) (Figure 3b).

Comparative analysis of hypothyroidism in a consistency model-based network meta-analysis (NMA) (A) league table and (B) probability ranking chart. Panel A presents a league table delineating the pooled odds ratios (OR) and corresponding 95% confidence intervals (CI) for drug-induced immune-related hypothyroidism across different treatment regimens. Statistically significant outcomes are highlighted in bold. Panel B depicts probability ranking curves, which quantify the likelihood of each treatment achieving a specified rank in terms of hypothyroidism risk reduction, ranging from lowest to highest. Abbreviations: ICI = immune checkpoint inhibitor.
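For readers unfamiliar with SUCRA, the following sketch shows how a SUCRA value is obtained from the rank probabilities that underlie a probability ranking chart. It assumes that rank 1 denotes the safest treatment and that the rank probabilities for a treatment sum to 1; the function name and the example probabilities are hypothetical and are not taken from the study's output.

```python
def sucra(rank_probs):
    """Surface under the cumulative ranking curve for one treatment.

    rank_probs[j] is the probability that the treatment holds rank j + 1
    (rank 1 = safest). With a treatments, SUCRA is the mean of the
    cumulative probabilities over ranks 1 .. a - 1, so it equals 1 for a
    treatment that is certainly best and 0 for one that is certainly worst.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("at least two ranks are required")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)


if __name__ == "__main__":
    # Made-up rank probabilities for a three-treatment network.
    print(f"{100 * sucra([0.6, 0.3, 0.1]):.1f}%")
```

Multiplying the result by 100 gives the percentage scale on which the rankings are reported.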
As with hypothyroidism, ICI combination therapies, compared with conventional therapy, were associated with varying degrees of hyperthyroidism risk (Figure 4a). The risk of hyperthyroidism was higher with dual ICIs plus conventional therapy than with one ICI plus conventional therapy. Monotherapy with a PD-1 inhibitor (nivolumab or pembrolizumab) or a CTLA-4 inhibitor (ipilimumab) was associated with significantly greater risk of hyperthyroidism than conventional therapy, whereas treatment with a PD-L1 inhibitor (atezolizumab) was not associated with increased risk. Moreover, nivolumab and pembrolizumab were associated with a significantly greater risk of hyperthyroidism than ipilimumab. The safety ranking for hyperthyroidism among treatment regimens involving ICIs indicated that a PD-1/PD-L1 inhibitor plus conventional therapy had the highest probability (78.6%), followed by pembrolizumab (67.7%), nivolumab (73.3%), two ICIs (60.4%), ipilimumab (37.3%), and two ICIs plus conventional therapy (5.9%) (Figure 4b).

Comparative analysis of hyperthyroidism in a consistency model-based network meta-analysis (NMA) (A) league table and (B) probability ranking chart. Panel A presents a league table delineating the pooled odds ratios (OR) and corresponding 95% confidence intervals (CI) for drug-induced immune-related hyperthyroidism across different treatment regimens. Statistically significant outcomes are highlighted in bold. Panel B depicts probability ranking curves, which quantify the likelihood of each treatment achieving a specified rank in terms of hyperthyroidism risk reduction, ranging from lowest to highest. Abbreviations: ICI = immune checkpoint inhibitor.
On the basis of the findings presented in Figure 5a, ipilimumab and pembrolizumab were associated with significantly greater risk of hypophysitis than conventional therapy. Moreover, ipilimumab was associated with a greater risk of hypophysitis than pembrolizumab, nivolumab, or a PD-1/PD-L1 inhibitor plus conventional therapy. Furthermore, treatment with two ICIs was associated with a greater risk of hypophysitis than treatment with one ICI plus conventional therapy, as well as ipilimumab, pembrolizumab, or nivolumab monotherapy. In the cumulative probability ranking for hypophysitis (Figure 5b), conventional therapy had the best safety ranking (95.6%), and was followed by nivolumab (70.0%), one ICI plus conventional therapy (67.5%), pembrolizumab (55.9%), tislelizumab (33.2%), ipilimumab (24.1%), and dual ICI therapy (3.7%).

Comparative analysis of hypophysitis in a consistency model-based network meta-analysis (NMA) (A) league table and (B) probability ranking chart. Panel A presents a league table delineating the pooled odds ratios (OR) and corresponding 95% confidence intervals (CI) for drug-induced immune-related hypophysitis across different treatment regimens. Statistically significant outcomes are highlighted in bold. Panel B depicts probability ranking curves, which quantify the likelihood of each treatment achieving a specified rank in terms of hypophysitis risk reduction, ranging from lowest to highest. Abbreviations: ICI = immune checkpoint inhibitor.
Combination therapy with two ICIs, or with two ICIs plus conventional therapy, was associated with greater risk of adrenal insufficiency than conventional therapy (Figure 6a). Ipilimumab and pembrolizumab were also associated with significantly greater risk of adrenal insufficiency than conventional therapy, whereas nivolumab was not associated with a significantly elevated risk. Moreover, no significant difference was observed in the incidence of adrenal insufficiency among ICI monotherapies. The cumulative probability ranking for adrenal insufficiency, ordered from highest to lowest (Figure 6b), indicated that conventional therapy had the best safety ranking, at 97.5%, and was followed by one ICI in combination with conventional therapy (74.0%), pembrolizumab (58.3%), nivolumab (54.5%), two ICIs in combination (32.2%), ipilimumab (17.9%), and two ICIs in combination with conventional therapy (15.5%).

Comparative analysis of adrenal insufficiency in a consistency model-based network meta-analysis (NMA) (A) league table and (B) probability ranking chart. Panel A presents a league table delineating the pooled odds ratios (OR) and corresponding 95% confidence intervals (CI) for drug-induced immune-related adrenal insufficiency across different treatment regimens. Statistically significant outcomes are highlighted in bold. Panel B depicts probability ranking curves, which quantify the likelihood of each treatment achieving a specified rank in terms of adrenal insufficiency risk reduction, ranging from lowest to highest. Abbreviations: ICI = immune checkpoint inhibitor.
Treatment with one ICI plus conventional therapy, or with two ICIs, was associated with significantly greater risk of thyroiditis than conventional therapy (Figure 7a). Additionally, pembrolizumab was associated with greater risk of thyroiditis than conventional therapy. SUCRA analysis also indicated that treatment with two ICIs had the lowest safety ranking, and thus the highest risk, for thyroiditis (8.4%) (Figure 7b).

Comparative analysis of thyroiditis in a consistency model-based network meta-analysis (NMA) (A) league table and (B) probability ranking chart. Panel A presents a league table delineating the pooled odds ratios (OR) and corresponding 95% confidence intervals (CI) for drug-induced immune-related thyroiditis across different treatment regimens. Statistically significant outcomes are highlighted in bold. Panel B depicts probability ranking curves, which quantify the likelihood of each treatment achieving a specified rank in terms of thyroiditis risk reduction, ranging from lowest to highest. Abbreviations: ICI = immune checkpoint inhibitor.
3.3.4 Subgroup analysis
We conducted a subgroup analysis of endocrine toxicity, including hypothyroidism, hyperthyroidism, and hypophysitis, in patients with melanoma treated with ICIs (Figures S7–S9).
According to the results shown in Figure S7A, treatments with one ICI plus conventional therapy, or with monotherapies comprising nivolumab or pembrolizumab, were associated with greater incidence of hypothyroidism than conventional therapy alone. Moreover, a combination of two ICIs or pembrolizumab alone was associated with a significantly greater risk than treatment with ipilimumab alone. Other treatment comparisons did not reveal any notable differences. According to SUCRA analysis (Figure S7B), conventional therapy (95.7%) was safest, and was followed by ipilimumab (81.0%), two ICIs (44.6%), one ICI plus conventional therapy (35.8%), nivolumab (31.8%), and pembrolizumab (11.1%).
No significant differences in the incidence of hyperthyroidism were observed among treatment strategies ( Figure S8A ). The safety ranking for hyperthyroidism among treatment regimens involving ICIs indicated that ipilimumab (88.6%) ranked highest, and was followed by pembrolizumab (49.2%), two ICIs (42.1%), and nivolumab (20.1%) ( Figure S8B ).
Regarding hypophysitis, both ipilimumab and dual ICI therapies were associated with significantly greater risk than one ICI plus conventional therapy, nivolumab, and pembrolizumab ( Figure S9A ). Notably, the risk with dual ICI therapy was significantly higher than that with ipilimumab alone. No significant differences emerged in other treatment comparisons. The cumulative probability curve for hypophysitis, ranked from lowest to highest risk, was as follows: one ICI plus conventional therapy (83.7%), nivolumab (77.5%), pembrolizumab (62.6%), ipilimumab (25.8%), and two ICIs (0.4%) ( Figure S9B ).
A subgroup analysis was also conducted for patients with NSCLC and four endocrine toxicities: hypothyroidism, hyperthyroidism, hypophysitis, and thyroiditis ( Figures S10–S13 ).
One ICI plus conventional therapy, pembrolizumab, tremelimumab, two ICIs plus conventional therapy, nivolumab, and dual ICI therapy were all associated with greater incidence of hypothyroidism than conventional therapy ( Figure S10A ). Notably, pembrolizumab, two ICIs plus conventional therapy, and two ICIs were associated with a significantly greater risk of hypothyroidism than one ICI plus conventional therapy. No significant differences were observed in other treatment comparisons. The cumulative probability curve area for hypothyroidism, ranked from lowest to highest risk, was as follows: conventional therapy (99.5%), one ICI plus conventional therapy (78.1%), atezolizumab (70.6%), pembrolizumab (54.2%), tremelimumab (38.7%), two ICIs plus conventional therapy (23.9%), nivolumab (21.3%), and dual ICI therapy (13.6%) ( Figure S10B ).
In the context of hyperthyroidism, pembrolizumab was associated with significantly greater risk than treatment with one ICI plus conventional therapy or conventional therapy alone ( Figure S11A ). Other treatment comparisons did not show significant differences. The cumulative probability curve for hyperthyroidism, from lowest to highest risk, was as follows: conventional therapy (87.6%), atezolizumab (60.1%), one ICI plus conventional therapy (46.9%), and pembrolizumab (5.4%) ( Figure S11B ).
Our analysis revealed no significant differences in hypophysitis risk across various treatment modalities ( Figure S12A ). The cumulative probability curve for hypophysitis, from lowest to highest risk, was as follows: conventional therapy (91.9%), pembrolizumab (38.2%), and one ICI plus conventional therapy (19.9%) ( Figure S12B ).
On the basis of the findings presented in Figure S13A , pembrolizumab was associated with notably greater risk of thyroiditis than conventional therapy. No other treatment comparisons showed significant differences. SUCRA analysis indicated that conventional therapy (93.0%) was the safest option regarding thyroiditis among the regimens compared, followed by one ICI plus conventional therapy (42.1%) and pembrolizumab (14.9%) ( Figure S13B ).
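The SUCRA percentages reported throughout this section summarize rank probabilities. As a minimal sketch of the arithmetic, assuming a hypothetical matrix of rank probabilities in which rank 1 denotes the safest regimen, the SUCRA value of each treatment can be computed as the mean of its cumulative rank probabilities over the first a - 1 ranks, where a is the number of treatments. The function name and the numbers below are illustrative only and are not taken from this study.

```python
def sucra(rank_probs):
    """Compute SUCRA values from rank probabilities.

    rank_probs[j][k] is the (hypothetical) probability that treatment j
    takes rank k + 1, where rank 1 is the safest. Each row must sum to 1.
    """
    n_ranks = len(rank_probs[0])
    scores = []
    for probs in rank_probs:
        cumulative = 0.0  # probability of being among the top k ranks
        total = 0.0       # running sum of cumulative probabilities
        for p in probs[:-1]:  # the last rank always has cumulative probability 1
            cumulative += p
            total += cumulative
        scores.append(total / (n_ranks - 1))
    return scores


# Made-up rank probabilities for three treatments (not study data)
example = [
    [0.80, 0.15, 0.05],
    [0.15, 0.60, 0.25],
    [0.05, 0.25, 0.70],
]
print([round(s, 3) for s in sucra(example)])
```

With these illustrative inputs, the first treatment receives the highest SUCRA (roughly 0.875) and would therefore rank as safest under the convention used here, in which a higher SUCRA indicates a safer regimen.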
3.3.5 Sensitivity analysis
To assess the stability of the NMA results, we conducted a sensitivity analysis by systematically excluding each study from the safety data set. Although most of the sensitivity analysis data aligned with the broader trend, several exceptions, such as those from the E1609 and CheckMate 238 studies, deviated from the conventional range. These deviations might have arisen from specific sample characteristics, stringent selection criteria, or unique methodological approaches. For instance, the E1609 study used a two-step hierarchical approach for assessing ipilimumab’s efficacy and safety at various dosages in patients with melanoma. These specialized trial designs might have introduced selection bias if their initial screening phases were not randomized, thus potentially affecting the generalizability of the findings. Nonetheless, our overall conclusions indicated resilience against individual study variances, and a consistent overall trend was maintained and was unaffected by the exceptions ( Figures S14–S18 ).
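The leave-one-out logic described above can be sketched as follows. This is a simplified illustration, not the consistency model used in this NMA: it assumes a single pairwise comparison pooled with a fixed-effect inverse-variance method, and the study names, log odds ratios, and variances are hypothetical.

```python
import math


def pooled_or(effects):
    """Fixed-effect inverse-variance pooled OR from (log_or, variance) pairs."""
    weights = [1.0 / var for _, var in effects]
    pooled_log = sum(w * log_or for w, (log_or, _) in zip(weights, effects)) / sum(weights)
    return math.exp(pooled_log)


def leave_one_out(studies):
    """Return {study name: pooled OR after excluding that study}."""
    results = {}
    for excluded in studies:
        remaining = [effect for name, effect in studies.items() if name != excluded]
        results[excluded] = pooled_or(remaining)
    return results


# Hypothetical studies: name -> (log odds ratio, variance of the log odds ratio)
studies = {
    "Study A": (math.log(2.0), 0.10),
    "Study B": (math.log(2.0), 0.20),
    "Study C": (math.log(4.0), 0.10),
}
for name, value in leave_one_out(studies).items():
    print(f"excluding {name}: pooled OR = {value:.2f}")
```

A study whose exclusion shifts the pooled estimate markedly, as the hypothetical Study C does here, is the kind of exception that warrants closer inspection of its design and population.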
3.3.6 Bias detection
Regarding bias detection, six studies on hypothyroidism and one study on hyperthyroidism fell outside the confidence limits of the funnel plot, potentially because of heterogeneity among studies. The remaining studies were distributed approximately symmetrically, thereby suggesting a possible small sample size effect or relatively low risk of publication bias, as depicted in Figure S19 .
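As a rough sketch of how studies falling outside funnel plot limits might be flagged, the example below assumes pseudo 95% confidence limits drawn around a fixed-effect pooled log odds ratio, so that a study lies outside the funnel when its effect differs from the pooled effect by more than 1.96 standard errors. The procedure and the data are assumptions for illustration and may differ from the software routine used to produce Figure S19.

```python
def outside_funnel(studies, z=1.96):
    """Return names of studies outside pseudo 95% funnel limits.

    studies maps a study name to (log_or, standard_error); all values
    used below are hypothetical.
    """
    weights = {name: 1.0 / se ** 2 for name, (_, se) in studies.items()}
    pooled = sum(weights[name] * log_or for name, (log_or, _) in studies.items())
    pooled /= sum(weights.values())
    return [
        name
        for name, (log_or, se) in studies.items()
        if abs(log_or - pooled) > z * se
    ]


# Hypothetical effects: name -> (log odds ratio, standard error)
funnel_data = {
    "S1": (0.5, 0.20),
    "S2": (0.6, 0.25),
    "S3": (0.4, 0.30),
    "S4": (2.0, 0.30),
}
print(outside_funnel(funnel_data))
```

In this made-up data set, only S4 lies outside the limits; in practice, such outliers can reflect between-study heterogeneity rather than publication bias alone.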
4. DISCUSSION
The advent of ICIs has revolutionized cancer therapeutics and ushered in a new era of effective treatments that mobilize the immune system to target malignancies. Although this progress is encouraging, the risks of irAEs, specifically endocrine toxicity, remain an underexplored challenge that merits rigorous investigation [73–75]. Endocrine toxicity, one of the most common irAE types, requires attention throughout the entire course of cancer treatment [76]. Evidence-based synthesis and evaluation of endocrine toxicity associated with various ICI treatment strategies via NMA can provide clinicians and patients with a more informed perspective regarding the risks and benefits of various ICI treatment options.
Although prior meta-analyses have reported elevated risks of endocrine dysfunction associated with ICIs compared with chemotherapy or placebos, our study is unprecedented in its comprehensive comparison of specific ICI regimens. Unlike previous studies with a narrower focus, our NMA accommodates the newest ICI therapies, including camrelizumab, the first PD-1 inhibitor approved in China for liver cancer, and combination therapies, such as durvalumab plus tremelimumab. This broad analysis enables clinicians to gauge endocrine toxicity risks across multiple ICI therapies, thus providing an invaluable tool for individualized treatment planning.
Our NMA of 55 RCTs involving 32,522 patients provides a detailed exploration of endocrine toxicity across a spectrum of ICI treatments. Our findings reaffirm the heightened risk of endocrine toxicity in patients receiving ICI therapy, with particular emphasis on the unique toxicity profiles of different ICIs and their combinations. First, the study demonstrated that dual ICI combinations generally pose greater risks than monotherapies, probably because of synergistic immune activation and longer treatment durations [77, 78]. This finding has immediate ramifications for treatment protocols, by indicating the need for vigilant monitoring and preemptive interventions to mitigate toxicity. Second, PD-1/PD-L1 inhibitors are more likely than CTLA-4 inhibitors to induce thyroid-related adverse events, such as hypothyroidism and hyperthyroidism. This finding might be due to the high expression of PD-L1 in the follicular cells of the thyroid gland; this expression is positively correlated with the severity of autoimmune thyroid diseases [79, 80]. In contrast, CTLA-4 inhibitors were associated with hypophysitis, a finding indicating a need for specialized monitoring and therapeutic strategies for patients receiving these drugs. Previous studies have shown that CTLA-4 protein is expressed in the pituitary gland, which plays a key role in regulating hormone secretion [81, 82]. Therefore, CTLA-4 inhibitors may interfere with the normal function of the pituitary gland and lead to a decrease in, or cessation of, anterior pituitary hormone secretion [83–85]. In addition, CTLA-4 inhibitors may cause expansion and activation of T cells, and consequently exacerbate the body’s autoimmune response and aggravate the severity of hypophysitis. However, tremelimumab, another CTLA-4 inhibitor, has shown a better safety profile and was not found to significantly increase the risk of hypophysitis. Notably, the studies included in the analysis of tremelimumab are relatively limited, and further research is required to confirm this conclusion. These granular insights may facilitate more precise tailoring of treatments, thereby enabling clinicians to make educated therapeutic choices while considering individual patient vulnerabilities.
Notably, nivolumab had a favorable safety profile, comparable to that of conventional therapy, in terms of adrenal insufficiency. In contrast, pembrolizumab, ipilimumab, two ICIs, and two ICIs plus conventional therapy were associated with a significantly greater risk of adrenal dysfunction. Moreover, pembrolizumab, PD-1/PD-L1 inhibitors plus conventional therapy, and two ICIs increased the risk of thyroiditis, although to varying degrees.
In our comprehensive analysis of endocrine toxicity associated with ICIs, we extensively reviewed and contrasted the findings from a broad array of previous studies with our own results. Initially, we examined articles addressing the overall safety of ICIs, particularly those detailing outcomes associated with endocrine functions. For instance, an NMA on the overall safety of ICIs in cancer treatment has demonstrated that nivolumab is associated with a greater risk of hypothyroidism than ipilimumab, a finding consistent with our results [20]. Furthermore, another NMA on the overall safety of ICIs in solid tumors has highlighted that durvalumab combined with conventional therapy presents a greater risk of hyperthyroidism than pembrolizumab with conventional therapy [86]. Our more detailed study specifically focused on comparing durvalumab and pembrolizumab as monotherapies, and identified no significant difference in the risk of hyperthyroidism between these treatments.
Expanding our focus, we analyzed research on ICI-related adverse endocrine reactions. We discovered a lack of network meta-analyses directly targeting ICI-induced endocrine adverse reactions, thus emphasizing the innovative and crucial nature of our work. Our comparison of our network meta-analysis with several existing meta-analyses revealed both consistencies and differences. PD-1/PD-L1 inhibitors were consistently implicated in a higher incidence of thyroid-related adverse events, such as hypothyroidism and hyperthyroidism, than CTLA-4 inhibitors [20, 87]. Likewise, multiple studies have reported a significant risk of hypophysitis associated with ipilimumab, in agreement with our findings [87]. Nonetheless, our study’s conclusions diverge from the findings in some previous research. For instance, whereas some studies have indicated a greater risk of hyperthyroidism with PD-1 inhibitors than with PD-L1 inhibitors, our analysis did not indicate this distinction [87]. Given the expanded number of studies included in our research, our findings must be juxtaposed with those of prior research, to identify and discuss the nuances and broader implications of these similarities and differences.
However, several study limitations must be acknowledged. First, our study focused on English-language publications and did not analyze influential factors such as sex, age, and geographical distribution. Second, the scarcity of studies specifically investigating endocrine adverse events induced by durvalumab or tremelimumab necessitates further research with larger sample sizes to substantiate the findings regarding the incidence of these events after treatment with these drugs. Third, our research demonstrated low overall inconsistency yet relatively high heterogeneity among studies. This finding might have been due to the biological diversity of cancer types, variations in patient demographics, the complexity of treatment regimens, and differing statistical methods. Future efforts will be aimed at exploring the sources of this heterogeneity, to minimize it and bolster the rigor of our findings.
Additionally, most included studies were clinical trials for drug registration, involving patients selected based on strict criteria. This selection might limit the applicability of our results to a broader patient spectrum, including older individuals and those with multiple comorbidities. These inherent limitations underscore the need for further, more inclusive research.
Despite these constraints, our study carefully compared endocrine toxicity risks across various ICI regimens, thus adding depth to the growing body of ICI research. Our findings equip clinicians with a sophisticated tool for risk assessment and management, thereby augmenting the safety and efficacy of ICI treatments. Future research should build on this groundwork, integrating real-world data and focusing on underrepresented populations to foster a comprehensive understanding of ICI-related endocrine toxicity.
5. CONCLUSION
In conclusion, ICIs have been associated with an elevated risk of endocrine toxicity, a risk further amplified with the use of dual ICI regimens. Specifically, CTLA-4 inhibitors are notably associated with hypophysitis, whereas PD-1/PD-L1 inhibitors are more commonly associated with thyroid-related conditions such as hyperthyroidism, hypothyroidism, and thyroiditis. Interestingly, nivolumab has a safety profile similar to that of conventional therapies in terms of adrenal dysfunction, in contrast to the heightened risks associated with other ICI treatments. In the context of melanoma, the safety profile of ICIs generally aligns with broader study findings. Pembrolizumab tends to induce hypothyroidism more frequently than ipilimumab, whereas ipilimumab is associated with a greater risk of hypophysitis than PD-1 inhibitors, such as pembrolizumab and nivolumab. In NSCLC, pembrolizumab is associated with a relatively less favorable safety profile, including elevated risks of hypothyroidism, hyperthyroidism, and thyroiditis. Our study provides crucial evidence-based insights aimed at refining the risk-benefit calculus and contributing to the development of safer and more efficacious ICI therapies in clinical settings.