Modeling power-based variable selection for rigorous one-class classification with SIMCA

Cristina Malegori; Paolo Oliveri
2025-01-01

Abstract

The present study proposes a variable selection algorithm for Soft Independent Modeling of Class Analogy (SIMCA) in the context of rigorous one-class classification (OCC), using Modeling Power (MP) as a central internal metric. The algorithm integrates three complementary criteria: correlation between MP and class compactness, MP non-growth rate across successive principal components, and a minimum MP threshold. These criteria were applied to three experimental datasets: UV–Vis spectra of edible oils, NIR spectra of Argentinean green teas, and HPLC–CAD chromatographic profiles of olive oils. In all cases, the goal was to enhance model parsimony and interpretability without compromising classification performance. The selected variables were chemically meaningful, aligning with known spectral or chromatographic regions associated with key compositional markers. Comparative analyses between the proposed Modeling Power Selector with SIMCA (MPS-SIMCA) and traditional full-spectrum SIMCA showed equivalent or improved classification performance. Additionally, MPS-SIMCA achieved superior model compactness and interpretability, supporting the feasibility of variable selection based solely on internal class structure. This approach offers a robust and interpretable alternative for class modeling in food authentication tasks where only target-class samples are reliably available.
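To make the role of MP more concrete, the following is a minimal sketch of the minimum-MP-threshold criterion only. It assumes the classical SIMCA definition of modeling power, MP_j = 1 - s_j(residual) / s_j(total), computed from a PCA model of the target class. The function names, the degrees-of-freedom correction, and the threshold value of 0.5 are illustrative assumptions and are not taken from the study. The other two criteria (correlation with class compactness and MP non-growth rate) are not sketched because the abstract does not specify them in enough detail.

```python
import numpy as np


def modeling_power(X, n_components):
    """Per-variable modeling power for a PCA model of one class.

    Assumes the classical SIMCA definition MP_j = 1 - s_res_j / s_tot_j;
    the degrees-of-freedom correction below is one common convention,
    not necessarily the one used in the study.
    """
    n_samples = X.shape[0]
    Xc = X - X.mean(axis=0)                       # column-centre the class data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                       # loadings: variables x components
    E = Xc - Xc @ P @ P.T                         # residuals after projection
    s_res = np.sqrt((E ** 2).sum(axis=0) / (n_samples - n_components - 1))
    s_tot = np.sqrt((Xc ** 2).sum(axis=0) / (n_samples - 1))
    return 1.0 - s_res / s_tot


def select_by_min_mp(X, n_components, threshold=0.5):
    """Return indices of variables whose MP reaches the (hypothetical) threshold."""
    mp = modeling_power(X, n_components)
    return np.flatnonzero(mp >= threshold)


if __name__ == "__main__":
    # Synthetic target class: three variables share one latent factor,
    # two variables are pure noise (illustrative data only).
    rng = np.random.default_rng(0)
    t = rng.normal(size=50)
    structured = np.outer(t, [1.0, 2.0, 3.0]) + 0.01 * rng.normal(size=(50, 3))
    noise = rng.normal(size=(50, 2))
    X = np.hstack([structured, noise])
    print(modeling_power(X, n_components=1))
    print(select_by_min_mp(X, n_components=1))
```

In this toy setting the variables driven by the shared latent factor receive MP values close to 1, while the noise variables stay near 0 and are discarded by the threshold. A full selector along the lines described above would combine this filter with the two remaining criteria.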
Files in this item:
Pires-Schneider_ACA_2025.pdf
Access: open access
Type: publisher's version
Size: 6.03 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1265897