New Paper accepted @ RA-L!


Self-Supervised Multisensory Pretraining for Contact-Rich Robot Reinforcement Learning
Rickmer Krohn, Vignesh Prasad, Gabriele Tiboni, Georgia Chalvatzaki

TL;DR: Multisensory pretraining enhances RL for contact-rich tasks by learning expressive representations through masked autoencoding.

[Figure: A robotic arm positioned over a blue box, with icons representing vision, touch, and proprioception feeding into a network labeled 'MSDP'.]

Contact-rich robot manipulation demands tight integration of vision, force, and proprioception. Our new work, Self-Supervised Multisensory Pretraining for Contact-Rich Robot Reinforcement Learning, introduces MSDP — a framework that uses masked autoencoding and cross-modal sensor fusion to learn expressive multisensory representations, paired with a novel asymmetric actor-critic architecture for efficient real-robot RL.
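To make the masked-autoencoding idea concrete, here is a minimal, dependency-free sketch of cross-modal token masking, the step that forces a fused encoder to reconstruct one modality from another. The function name, modality names, and mask ratio are illustrative assumptions, not the paper's implementation:

```python
import random

def mask_multisensory_tokens(tokens_per_modality, mask_ratio=0.5, seed=0):
    """Randomly hide a fraction of sensor tokens across all modalities.

    tokens_per_modality: dict mapping modality name -> list of tokens.
    Returns (visible, masked): visible tokens would feed the encoder,
    masked tokens become reconstruction targets for the decoder.
    (Illustrative sketch; not the authors' code.)
    """
    rng = random.Random(seed)
    # Pool tokens from every modality so masking is cross-modal:
    # the model may need to infer touch from vision, or vice versa.
    pool = [(m, t) for m, toks in tokens_per_modality.items() for t in toks]
    rng.shuffle(pool)
    n_masked = int(len(pool) * mask_ratio)
    masked, visible = pool[:n_masked], pool[n_masked:]
    return visible, masked

# Toy observation: image patches, force-torque readings, joint positions.
obs = {
    "vision": [f"img_{i}" for i in range(8)],
    "force_torque": [f"ft_{i}" for i in range(4)],
    "proprio": [f"q_{i}" for i in range(4)],
}
visible, masked = mask_multisensory_tokens(obs, mask_ratio=0.5)
print(len(visible), len(masked))  # half of the 16 tokens are masked
```

In an actual pretraining loop, the visible tokens would pass through a fusion encoder (e.g. a transformer) and a lightweight decoder would regress the masked tokens, yielding representations that bind vision, force, and proprioception before any RL begins.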
MSDP achieves ~90% success on challenging manipulation tasks using only 6,000 real-robot interactions, with the full pipeline completing in under 55 minutes. Adding a force-torque sensor alone improves performance by 14%. The method is robust to sensor noise, variable stiffness, external disturbances, and varying lighting conditions.

Check out the website for robot videos: https://msdp-pearl.github.io
