Learning Contact-Rich Loco-Manipulation

Tolomei, Simone

The physical world admits no action at a distance. As robots step out of structured factories and into dynamic, unstructured settings, such as domestic assistance and environmental monitoring, their autonomy depends on their ability to make and exploit contact with the world. True physical intelligence therefore requires mastering this interaction along two tightly coupled dimensions: deciding where to make contact (e.g., footholds, pushes, grasps, supports) and learning how to exploit it to move, stabilise, and accomplish tasks reliably. Although physical contact has been studied extensively, handling it in practice often still requires tailoring, ranging from simplified contact abstractions to task-specific heuristics and rules. This can work well in controlled conditions, but may become tedious as interactions diversify and operating conditions become less predictable. From a model-based perspective, contact introduces hybrid dynamics and discontinuities, complicating planning and control. From a model-free perspective, contact remains challenging because meaningful interactions are sparse and hard to discover through unguided exploration, making learning inefficient and fragile. Motivated by these challenges, in this thesis I target learning contact-rich loco-manipulation: the ability of a robot to coordinate locomotion and manipulation through purposeful, intermittent contacts with the environment. Within this view, locomotion and manipulation are not separate problems but two sides of the same question: how to choose where and how to touch the world, and how to leverage those contacts over time to generate stable motion, effective interaction, and task completion.
I begin with locomotion-oriented contact selection, learning footholds that account for foot shape and contact patch geometry; I study how mechanical design choices, including joint compliance, affect robustness in legged locomotion; and I develop learning-based control pipelines to achieve reliable movement on hardware. Turning to manipulation, I shift the focus from the ground to the object. I address the geometric challenge of extracting collision-free grasp priors in severely cluttered scenes and for novel reconfigurable grippers, while also exploring how learning-based policies can expand a manipulator's skills from static pick-and-place to dynamic throwing. Finally, these conceptual tracks converge into a unified, learning-based architecture for non-prehensile loco-manipulation. By embedding geometric contact priors into a multi-critic reinforcement learning framework, I introduce a strategy that explicitly guides a legged manipulator's exploration toward meaningful physical interactions, progressively annealing this guidance to recover task-optimal, whole-body control. Validated through extensive simulation studies and real-world deployment across diverse quadrupedal and manipulation platforms, this thesis offers a comprehensive blueprint for equipping machines with the resilient, contact-rich autonomy required to operate in the wild.