Segmentation variability and radiomics stability for predicting triple-negative breast cancer subtype using magnetic resonance imaging

Cama, Isabella; Campi, Cristina; Piana, Michele; Garbarino, Sara
2025-01-01

Abstract

Purpose: Many studies caution against using radiomic features that are sensitive to contouring variability in predictive models for disease stratification. Consequently, metrics such as the intraclass correlation coefficient (ICC) are recommended to guide feature selection based on stability. However, the direct impact of segmentation variability on the performance of predictive models remains underexplored. We examine how segmentation variability affects both feature stability and predictive performance in the radiomics-based classification of triple-negative breast cancer (TNBC) using breast magnetic resonance imaging.

Approach: We analyzed 244 images from the Duke dataset, introducing segmentation variability through controlled modifications of manual segmentations. For each segmentation mask, explainable radiomic features were selected using Shapley Additive exPlanations and used to train logistic regression models. Feature stability across segmentations was assessed via ICC, Pearson’s correlation, and reliability scores quantifying the relationship between segmentation variability and feature robustness.

Results: Model performance in predicting TNBC does not differ significantly across the varying segmentations. The most explanatory and predictive features exhibit decreasing ICC as segmentation accuracy decreases. However, their predictive power remains intact because their low ICC is accompanied by a high Pearson’s correlation. No shared numerical relationship is found between feature stability and segmentation variability among the most predictive features.

Conclusions: Moderate segmentation variability has a limited impact on model performance. Although incorporating peritumoral information may reduce feature reproducibility, it does not compromise predictive utility. Notably, feature stability is not a strict prerequisite for predictive relevance, highlighting that exclusive reliance on ICC or other stability metrics for feature selection may inadvertently discard informative features.
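The distinction between low ICC and high Pearson's correlation can be illustrated with a small synthetic sketch. It is not the authors' pipeline: the feature values are simulated, the scale and shift applied to the "altered segmentation" values are arbitrary assumptions, and ICC(2,1) (two-way random effects, absolute agreement, single measurement) is chosen here only for illustration, since the abstract does not state which ICC form was used. The helper name `icc_2_1` is hypothetical.

```python
import numpy as np


def icc_2_1(x):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    x has shape (n_subjects, k_segmentations).
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # per-subject means
    col_means = x.mean(axis=0)  # per-segmentation means

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-segmentation mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


rng = np.random.default_rng(0)
n_subjects = 244  # matches the cohort size only for scale; values are synthetic

# Hypothetical feature computed on the reference segmentation.
feature_ref = rng.normal(0.0, 1.0, size=n_subjects)
# Same feature on an altered segmentation: systematically rescaled and shifted
# (assumed effect), plus a small amount of noise.
feature_alt = 3.0 * feature_ref + 5.0 + rng.normal(0.0, 0.1, size=n_subjects)

pearson = np.corrcoef(feature_ref, feature_alt)[0, 1]
icc = icc_2_1(np.column_stack([feature_ref, feature_alt]))

print(f"Pearson r = {pearson:.3f}")
print(f"ICC(2,1)  = {icc:.3f}")
```

In this sketch the altered feature is almost a linear function of the reference one, so the ranking of subjects is preserved and Pearson's correlation stays close to 1, while the systematic shift and rescaling penalize absolute agreement and drive the ICC down. A linear classifier trained on standardized features would see nearly the same information in both cases, which is consistent with the abstract's observation that low ICC does not by itself imply a loss of predictive power.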
Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1276896