A Machine Learning-Driven CRM Approach for Identifying Member Churn in a Brazilian Agro-Industrial Cooperative: A Practical Case Study

Converti, Attilio
2026-01-01

Abstract

This study addresses member churn in a Brazilian agro-industrial cooperative by operationalizing a leakage-aware, governance-aligned machine-learning protocol within the organization’s Customer Relationship Management (CRM) system. Using real-world CRM data under confidentiality constraints, we followed a KDD-based workflow comprising: (i) multi-source integration; (ii) targeted preprocessing with explicit handling of severe class imbalance via undersampling; (iii) a unified validation scheme with stratified cross-validation, hyperparameter search, and controlled AutoML benchmarking; (iv) comparison of tabular learners (Random Forest, XGBoost, and Support Vector Classifier) and a voting ensemble; and (v) SHAP-based explainability to support transparent decision-making. Class rebalancing substantially improved minority-class performance; for instance, recall for the “Inactive” class increased from 0.27 to 0.74 with SVC. Across ten folds, AutoML achieved a competitive mean ROC-AUC (0.8844), followed by XGBoost (0.8690) and Random Forest (0.8660); global metrics supported operational feasibility (accuracy 0.79–0.80; ROC-AUC up to 0.8876), while the ensemble delivered comparable discrimination (ROC-AUC 0.8845) with a modest precision gain. SHAP analyses yielded business-coherent drivers and enabled actionable, instance-level communication in the CRM. The resulting microservices-based module exposes ranked churn propensities and explanations in dashboards for risk stratification and prioritization of retention actions. Overall, the work provides an interpretable, reproducible, and production-ready methodological blueprint for predictive CRM in seasonal cooperative environments under governance and confidentiality constraints.
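
As an illustration of how undersampling and stratified cross-validation can be combined without leakage, the following minimal sketch rebalances only the training portion of each fold and leaves every test fold at its natural class ratio. This is an assumption about what "leakage-aware" implies here; the abstract does not state where in the pipeline the undersampling is applied. The function names, the "Active" label, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign every row index to one of k folds, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin across the folds.
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    return folds


def undersample(indices, labels, seed=0):
    """Randomly drop rows until every class matches the size of the rarest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n_min = min(len(v) for v in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


def leakage_aware_splits(labels, k=10, seed=0):
    """Yield (train, test) index lists; only the training side is rebalanced."""
    folds = stratified_folds(labels, k, seed)
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield undersample(train_idx, labels, seed + f), sorted(test_idx)


# Hypothetical toy data: 90 "Active" members and 10 "Inactive" ones.
labels = ["Active"] * 90 + ["Inactive"] * 10
splits = list(leakage_aware_splits(labels, k=5))
```

Each training list can then be passed to any classifier, while metrics such as recall and ROC-AUC are computed on the untouched test fold, so the reported scores reflect the original class imbalance.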
Files in this record:
A38b.pdf (open access)
Description: Main article
Type: Publisher's version
Size: 2.01 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1290796