# Lidar Perception

## Introduction

Apollo 7.0 provides a new LiDAR-based obstacle detection model named Mask-Pillars. It is based on PointPillars and improves the original version in two respects. First, a residual attention module is introduced into the encoder of the backbone to learn a mask and enhance the feature map in a residual way. Second, pillar-level supervision is applied after the decoder of the backbone; it is only performed in the training stage. The training data for pillar-level supervision is generated by composing the distributions of foreground obstacle pillars. In experimental validation, Mask-Pillars achieves higher performance than PointPillars on both the KITTI and Waymo datasets, especially in obstacle recall.

## Architecture

Here we mainly focus on the modifications relative to PointPillars:

### Attention Module

Although LiDAR can collect high-quality point cloud data, some obstacles may contain only a small number of points due to occlusion or distance. Therefore, we introduce an attention layer on the FPN encoder module to enhance the features, following [Residual Attention Network for Image Classification](https://arxiv.org/abs/1704.06904). Since the FPN has three feature maps with different resolutions, our attention module acts on all three feature maps at the same time. More details about the network architecture are shown in the figure below, where S represents the Sigmoid function and the function F is given by Formula 1:

$$
F(x) = (1 + M(x)) * T(x) \tag{1}
$$

$T(x)$ is the output of the backbone and $M(x)$ is the output of the attention module.

### Pillar-level supervision

In order to improve the recall of the network, we introduce a pillar-level supervision mechanism in the training stage. We notice that segmentation algorithms typically have high recall rates because of their pixel-level supervision. Therefore, we borrow the idea from segmentation networks by adding pillar-level supervision of the foreground pillars that represent obstacles.
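
For reference, Formula 1 from the Attention Module section can be sketched in a few lines of plain Python. This is only an illustration, not the Apollo implementation: the function names `sigmoid` and `residual_attention` are hypothetical, the feature map is reduced to a single-channel 2D grid, and $M(x)$ is assumed to be the Sigmoid of raw mask logits, so it lies in the range 0 to 1.

```python
import math


def sigmoid(v):
    """Logistic function; maps a real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def residual_attention(trunk, mask_logits):
    """Apply F(x) = (1 + M(x)) * T(x) element-wise.

    trunk       -- T(x), a 2D grid (list of rows) of backbone features
    mask_logits -- raw attention outputs; M(x) = sigmoid(mask_logits)
                   (assumption made for this sketch)
    """
    if len(trunk) != len(mask_logits):
        raise ValueError("trunk and mask must have the same number of rows")
    enhanced = []
    for t_row, m_row in zip(trunk, mask_logits):
        if len(t_row) != len(m_row):
            raise ValueError("trunk and mask rows must have the same length")
        enhanced.append(
            [(1.0 + sigmoid(m)) * t for t, m in zip(t_row, m_row)]
        )
    return enhanced


# Toy 2x2 feature map: a logit of 0 gives M = 0.5, so features scale by 1.5.
trunk = [[1.0, 2.0], [0.0, -4.0]]
mask_logits = [[0.0, 0.0], [5.0, 0.0]]
enhanced = residual_attention(trunk, mask_logits)
print(enhanced)
```

Because of the residual term, a mask value close to 0 leaves the backbone feature nearly unchanged instead of suppressing it, while a mask value close to 1 roughly doubles it.
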
The feature maps are supervised by the pillar supervision data before they are fed into the detection module. The supervision data are represented as obstacle distributions and are generated simply by composing the Gaussian distributions of the obstacle pillars in the training point cloud. The network structure of the final FPN is shown in the figure below.
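
The generation of the supervision data described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Apollo training code: the function name `gaussian_pillar_target` is hypothetical, each foreground pillar is assumed to contribute an isotropic 2D Gaussian centred on its grid cell, the width `sigma` is an arbitrary choice, and overlapping Gaussians are assumed to be composed with an element-wise maximum (the text does not specify the composition rule).

```python
import math


def gaussian_pillar_target(height, width, fg_pillars, sigma=1.0):
    """Build a pillar-level supervision map on a height x width pillar grid.

    fg_pillars -- iterable of (row, col) indices of foreground obstacle
                  pillars (hypothetical input format)
    sigma      -- standard deviation of each Gaussian, in pillar units
                  (assumed value)
    """
    target = [[0.0] * width for _ in range(height)]
    for pillar_row, pillar_col in fg_pillars:
        for r in range(height):
            for c in range(width):
                dist_sq = (r - pillar_row) ** 2 + (c - pillar_col) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                # Assumed composition rule: keep the strongest response.
                if value > target[r][c]:
                    target[r][c] = value
    return target


# Two foreground pillars on a 5x5 grid.
target = gaussian_pillar_target(5, 5, [(1, 1), (3, 3)], sigma=1.0)
for row in target:
    print(" ".join(f"{v:.2f}" for v in row))
```

With the maximum as the composition rule, every foreground pillar cell has a target value of exactly 1 and the values decay smoothly towards the background, so the map can be compared directly against a predicted foreground probability per pillar.
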