Abstract
We introduce HmPEAR, a novel dataset crafted for advancing research in 3D Human Pose Estimation (3DHPE) and Human Action Recognition (HAR), with a primary focus on outdoor environments. This dataset offers a synchronized collection of imagery, LiDAR point clouds, 3D human poses, and action categories. In total, the dataset encompasses over 300,000 frames collected from 10 distinct scenes and 25 diverse subjects. Among these, 250,000 frames contain 3D human pose annotations captured using an advanced motion capture system and further optimized for accuracy. Furthermore, the dataset annotates 40 types of daily human actions, resulting in over 6,000 action clips. Through extensive experimentation, we demonstrate the quality of HmPEAR and highlight the challenges it presents to current methodologies. Additionally, we propose baselines leveraging sequential images and point clouds for 3DHPE and HAR, which underscore the mutual reinforcement between the two tasks, highlighting the potential for cross-task synergies.