
Human-Guided Learning and Control for Contact-Rich Robotic Manipulation: An Interoperable Toolchain

BAJRAMI, ALBIN
2026-04-17

Abstract

Continuous, contact-rich manipulation is increasingly relevant for industrial applications such as surface finishing and polishing, where value is created through sustained, controlled interaction between tool and workpiece. Despite its practical importance, reliable execution remains challenging because key process parameters, including contact conditions, friction, surface properties, tool compliance, and wear, can vary during operation, making purely model-based tuning brittle. This thesis investigates whether automated, learning-based readaptation, specifically Reinforcement Learning (RL), can improve robustness under such variability while preserving the structure and safety of classical control. To address the problem from multiple fronts, the thesis bridges Learning from Demonstration (LfD) and Reinforcement Learning within a human-in-the-loop, deployment-oriented workflow built around NVIDIA Isaac Lab. A low-cost, open-source LfD pipeline based on multi-view RGB sensing is introduced to capture operator demonstrations, reconstruct task-relevant motion, and retarget it to the robot through constrained inverse kinematics and safety-aware filtering, producing executable trajectories and waypoints that are imported into Isaac Lab as task references. Within the same Isaac Lab framework, a modular contact-rich simulation environment is developed to study interaction dynamics and to train structured RL policies (PPO) layered on operational-space control, enabling adaptive behaviours such as impedance modulation under explicit constraints and targeted randomization of contact conditions. Experiments on a polishing benchmark show more consistent task execution and improved robustness to changing contact conditions when RL is combined with structured control and demonstration-derived references.
Finally, a PCA-based analysis is introduced to reframe policy evaluation in physical terms: by extracting dominant modes from interaction and control signals, it highlights which variables drive contact behaviour and how they couple, providing a principled way to reason about what the agent must learn to regulate effectively beyond scalar performance metrics.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1294936
