H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation

Abstract

Human hands possess remarkable dexterity and have long served as a source of inspiration for robotic manipulation. In this work, we propose H-InDex, a human Hand-Informed visual representation learning framework for solving difficult Dexterous manipulation tasks. Our framework consists of three stages: (i) pre-training representations with 3D human hand pose estimation, (ii) offline adaptation of the representations with self-supervised keypoint detection, and (iii) reinforcement learning with exponential moving average BatchNorm. Together, the last two stages modify only 0.36% of the parameters of the pre-trained representation, which preserves the knowledge acquired during pre-training. We empirically study 12 challenging dexterous manipulation tasks and find that our method substantially outperforms the previous state-of-the-art method as well as recent visual foundation models for motor control.
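
To make stage (iii) more concrete, the following is a minimal sketch of an exponential moving average (EMA) update of BatchNorm running statistics. It is an illustration under stated assumptions, not the authors' implementation: the class name `EmaBatchNorm1d`, the helper `ema_update`, the momentum value, and the single-feature setup are all hypothetical, and the sketch assumes that the only quantities changing are the running mean and variance.

```python
from statistics import fmean, pvariance


def ema_update(running, batch_value, momentum):
    """Blend a running statistic with a fresh batch statistic."""
    return (1.0 - momentum) * running + momentum * batch_value


class EmaBatchNorm1d:
    """Single-feature BatchNorm that adapts only its running statistics.

    Hypothetical simplification: there are no learnable weights, so the
    layer changes purely through the EMA of batch mean and variance.
    """

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum  # assumed value, chosen for illustration
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def update(self, batch):
        # Move the running statistics a small step toward the batch statistics.
        self.running_mean = ema_update(self.running_mean, fmean(batch), self.momentum)
        self.running_var = ema_update(self.running_var, pvariance(batch), self.momentum)

    def normalize(self, batch):
        # Normalize with the running statistics, not the batch statistics.
        scale = (self.running_var + self.eps) ** 0.5
        return [(x - self.running_mean) / scale for x in batch]


bn = EmaBatchNorm1d(momentum=0.1)
for _ in range(200):
    bn.update([4.0, 6.0])  # batch mean 5.0, population variance 1.0

normalized = bn.normalize([4.0, 6.0])
```

After repeated updates on the same batch, the running mean drifts from its initial value toward the batch mean, so the normalized outputs become approximately centered. A smaller momentum would make this drift slower and the statistics more stable.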

Publication
In Conference on Neural Information Processing Systems (NeurIPS), 2023
Yuyao Liu
Undergraduate Student of Computer Science

Yao Class, IIIS, Tsinghua University