Robotic Grasp Detection Using Structure Prior Attention and Multiscale Features

Authors: Lu Chen, Mingdi Niu, Jing Yang, Yuhua Qian, Zhuomao Li, Keqi Wang, Tao Yan, Panfeng Huang

Abstract:

Most available grasp detection methods directly predict grasp configurations with deep neural networks in which all features are extracted and utilized indiscriminately, which dilutes the features that are truly useful for grasping. Inspired by the three-section structural pattern observed in human-labeled graspable rectangles, we first design a structure prior attention (SPA) module that uses a two-dimensional encoding to enhance local patterns and a self-attention mechanism to reallocate the distribution of grasp-specific features. The SPA module is then integrated with fundamental feature-extraction modules and residual connections to achieve both implicit and explicit feature fusion; the resulting unit serves as the building block of our U-Net-like grasp detection network. The network takes RGB-D images as input and outputs image-size feature maps from which grasp configurations can be determined. Extensive comparative experiments on five public datasets show that our method surpasses other approaches in detection accuracy, achieving 99.2%, 96.1%, 98.0%, 86.7%, and 92.6% on the Cornell, Jacquard, Clutter, VMRD, and GraspNet datasets, respectively. According to visual evaluation metrics and a user study, the quality maps generated by our method exhibit a more concentrated distribution of high-confidence grasps and clearer discrimination from the background. In addition, its effectiveness is verified by robotic grasping in real-world scenarios, where it achieves a higher success rate.
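The abstract describes the architecture only at a high level. As a rough illustration of the idea, the minimal PyTorch sketch below shows what a structure-prior attention block with a residual fusion path could look like; it is a sketch under assumptions, not the authors' implementation. The class name SPABlock, the learnable two-dimensional encoding, the head count, and the fusion scheme are all illustrative choices.

```python
import torch
import torch.nn as nn

class SPABlock(nn.Module):
    """Illustrative structure-prior-attention block (assumed design):
    a learnable 2-D encoding highlights local structural patterns,
    self-attention reallocates grasp-specific features, and a residual
    path fuses the attended features with the input."""

    def __init__(self, channels: int, height: int, width: int, heads: int = 4):
        super().__init__()
        # Learnable 2-D encoding added to the feature map (assumption:
        # the paper's "two-dimensional encoding" is stood in for here
        # by a per-position learnable tensor).
        self.pos = nn.Parameter(torch.zeros(1, channels, height, width))
        self.norm = nn.LayerNorm(channels)
        # channels must be divisible by heads.
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        z = x + self.pos                          # inject structural prior
        seq = self.norm(z.flatten(2).transpose(1, 2))  # (B, H*W, C) tokens
        out, _ = self.attn(seq, seq, seq)         # self-attention reweighting
        out = out.transpose(1, 2).reshape(b, c, h, w)
        return x + out                            # residual feature fusion
```

Because the block preserves spatial resolution (for example, SPABlock(64, 56, 56) maps a (2, 64, 56, 56) tensor to the same shape), it can be stacked inside an encoder-decoder network that ultimately emits image-size output maps, consistent with the U-Net-like design the abstract describes.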

Keywords:

Grasp detection, multiscale features, robotic grasping, self-attention, structure prior attention