Executable cross-cohort benchmarking of NSCLC immunotherapy biomarkers reveals robust transfer of tumor mutational burden
Reliable biomarkers for immune checkpoint therapy in non-small-cell lung cancer (NSCLC) remain difficult to validate across cohorts and treatment regimens. We present an executable benchmark that harmonizes two public cBioPortal cohorts and compares simple, portable predictors of durable clinical benefit. The discovery cohort comprised 195 evaluable anti-PD-(L)1 monotherapy cases from nsclc_pd1_msk_2018; the validation cohort comprised 75 evaluable PD-1 plus CTLA-4 combination cases from nsclc_mskcc_2018. The skill performs checksum-verified data acquisition, deterministic preprocessing, nonparametric and Fisher exact tests, repeated cross-validation, and external validation. Tumor mutational burden (TMB) was significantly higher in patients with durable clinical benefit in both cohorts (p=0.0095 discovery; p=0.0066 validation). In external validation, a TMB-only model achieved an AUC of 0.683, whereas a sparse six-gene mutation panel achieved an AUC of 0.579. The highest external AUC (0.717) came from a model combining TMB, clinical covariates, and PD-L1, but PD-L1 was missing for 65.6% of discovery patients. These results support TMB as the most portable biomarker in this benchmark and indicate that sparse mutation panels do not transfer robustly.
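
Two of the steps named above, checksum-verified acquisition and AUC-based evaluation, can be illustrated with a minimal sketch. It is not the benchmark's actual code: the function names `sha256_matches` and `rank_auc` are hypothetical, the choice of SHA-256 is an assumption (the abstract does not state which hash is used), and the TMB values are toy numbers, not data from either cohort. The AUC is computed in its rank form, the probability that a randomly chosen patient with durable benefit has a higher score than a randomly chosen patient without it, which is the quantity underlying the Mann-Whitney U statistic.

```python
import hashlib


def sha256_matches(payload: bytes, expected_hex: str) -> bool:
    """Return True when the SHA-256 digest of payload equals expected_hex.

    Assumption: SHA-256 stands in for whatever checksum the benchmark uses.
    """
    return hashlib.sha256(payload).hexdigest() == expected_hex.lower()


def rank_auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as 0.5.

    This equals the Mann-Whitney U statistic divided by the number of
    positive/negative pairs.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy TMB values for illustration only; not taken from either cohort.
toy_tmb_benefit = [12.0, 8.5, 15.2, 6.1]
toy_tmb_no_benefit = [4.0, 6.1, 9.0, 3.2]
toy_auc = rank_auc(toy_tmb_benefit, toy_tmb_no_benefit)

# A download would be accepted only if its digest matches a pinned value.
toy_payload = b"PATIENT_ID\tTMB\n"
toy_pinned_digest = hashlib.sha256(toy_payload).hexdigest()
toy_download_ok = sha256_matches(toy_payload, toy_pinned_digest)
```

For a single continuous predictor such as TMB, this rank form needs no fitted model, which is one reason a TMB-only score is straightforward to carry from one cohort to another.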


