Download

Please send an e-mail to yanmnn@stu.xmu.edu.cn, including your contact details (title, full name, organization, and country) and your purpose for downloading the dataset. By sending the e-mail, you accept the following License.

Data structure

  • The CIMI4D dataset is divided into two parts: XMU and ChangSha.
  • # The data structure for an XMU sequence
    ├── root_folder
       ├── sequence_img/              # Video downsampled to 20 fps and aligned to the point clouds
       ├── sequence_pos.csv           # Original IMU data (positions)
       ├── sequence_rot.csv           # Original IMU data (rotations)
       ├── sequence.bvh               # Original human motion at 100 fps
       ├── Sparse_scene_mesh.ply
       ├── Sparse_scene_points.pcd
       ├── LiDAR2Cam_e.txt            # LiDAR-to-camera extrinsics and camera intrinsics
       ├── lidar_p.txt                # LiDAR-to-world (mocap) extrinsics; T is in mm
       ├── Dense_scene_mesh.ply
       ├── Dense_scene_points.pcd
       ├── sequence.pkl
       |      └── 'beta'                 # SMPL shape parameters of the subject
       |      └── 'gt_pose_3D'           # Ground-truth 3D pose
       |      └── 'gt_pose_2D'           # Ground-truth 2D pose
       |      └── 'gt_trans'             # Ground-truth global translation (trajectory)
       |      └── 'human_point_clouds'   # LiDAR points on the human body
       |      └── 'IMU_pose'             # Pose from the IMU mocap suit
       |      └── 'IMU_trans'            # Translation from the IMU mocap suit
       |      └── 'frame_num'            # Frame number
       ├── others...
    
    
    # The data structure for a ChangSha sequence
    ├── root_folder
       ├── sequence_pos.csv           # Original IMU data (positions)
       ├── sequence_rot.csv           # Original IMU data (rotations)
       ├── sequence.bvh               # Original human motion at 100 fps
       ├── Sparse_scene_points.pcd
       ├── sequence.json              # Per-sequence human info
       ├── sequence_label.h5py
       |      └── 'beta'                 # SMPL shape parameters of the subject
       |      └── 'gt_pose_3D'           # Ground-truth 3D pose
       |      └── 'gt_pose_2D'           # Ground-truth 2D pose
       |      └── 'gt_trans'             # Ground-truth global translation (trajectory)
       |      └── 'human_point_clouds'   # LiDAR points on the human body
       |      └── 'IMU_pose'             # Pose from the IMU mocap suit
       |      └── 'IMU_trans'            # Translation from the IMU mocap suit
       |      └── 'frame_num'            # Frame number
       ├── others...
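
  • A minimal sketch for loading the per-sequence annotations (key names as in the trees above; "sequence" stands in for a concrete sequence name, and we assume the pickle holds a dict of arrays):

    import pickle

    import h5py
    import numpy as np

    # XMU: annotations are stored in a Python pickle.
    with open("sequence.pkl", "rb") as f:
        anno = pickle.load(f)
    beta = anno["beta"]            # SMPL shape parameters
    gt_trans = anno["gt_trans"]    # global trajectory (x, y, z)

    # ChangSha: the same keys are stored in an HDF5 file.
    with h5py.File("sequence_label.h5py", "r") as f:
        beta = np.asarray(f["beta"])
        gt_pose_3d = np.asarray(f["gt_pose_3D"])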
    
    

    Tips:

  • Because the ChangSha data involves national-level professional athletes, our confidentiality agreement prevents us from releasing the corresponding RGB data. This RGB data is not mentioned in the paper and is not one of its contributions.
  • For each sequence we provide beta, the key SMPL shape information of the human body; it is stored in the .pkl file for XMU and in the .h5py file for ChangSha (see the loading sketch above). The volunteers agreed to the disclosure of their SMPL body data.
  • Due to privacy concerns, XMU_1023_ym002_V1_0 does not provide RGB data.
  • Holds. Because the hold manufacturers refuse to let us release their patented hold shapes, we cannot provide the contact points directly. However, we provide the source code of the algorithm we use to extract rock points from the scene. If you need hold information and spatial locations, run Holds_Extract_ASC_XMU.py.
  • In Holds_Extract_ASC_XMU.py you can adjust the DBSCAN and RANSAC parameters to get the extraction quality you need; the visualization in the code can help you tune them. The inverse of the code's result gives the rock points of each wall. A sketch of this kind of pipeline follows these tips.
  • Scene mesh. The mesh scenes we provide come from Poisson reconstruction of the point-cloud scenes; the reconstruction quality can be inspected in the .ply file in XMU_1023_zpc001_V1_1. However, researchers reported that errors in the Poisson reconstruction make directly using the scene mesh misleading, so we removed the mesh scenes from the other sequences. If you need a scene mesh, please generate it from the point-cloud scene (a sketch follows these tips).
  • Render to 2D image (see the projection sketch after these tips):
    1. As mentioned in the paper, to unify the coordinate systems we first transform both the mocap and the LiDAR coordinate systems into the world coordinate system.
      To simplify calibration between the initial image and the initial LiDAR frame, we register the point-cloud scenes in the RGB and LiDAR coordinate systems.
      To render, you therefore need to transform the human body from the world coordinate system into the LiDAR coordinate system, and then into the camera coordinate system.
      The RT matrix from the LiDAR coordinate system to the world coordinate system is in lidar_p.txt; the LiDAR-to-camera extrinsic matrix and the camera intrinsic matrix are in LiDAR2Cam_e.txt.
    2. The x, y, z of trans form the global trajectory.
      *Tip: the translation T in the extrinsic matrix in lidar_p.txt is in millimetres; divide T by 1000 to convert it to metres.
    3. bbox is given as the center coordinates of the box plus its width and height (cx, cy, w, h).
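  • For orientation, here is a minimal sketch (not the released Holds_Extract_ASC_XMU.py) of a RANSAC-plus-DBSCAN pipeline of the kind described in the holds tip, using Open3D; the input file name and all thresholds are placeholders to tune:

    import numpy as np
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("Sparse_scene_points.pcd")

    # RANSAC: fit the dominant wall plane; its inliers are the flat wall.
    plane, wall_idx = pcd.segment_plane(distance_threshold=0.03,
                                        ransac_n=3,
                                        num_iterations=1000)

    # Inverting the plane fit keeps the points protruding from the wall:
    # holds and rock features.
    protrusions = pcd.select_by_index(wall_idx, invert=True)

    # DBSCAN: group the protruding points into individual clusters.
    labels = np.asarray(protrusions.cluster_dbscan(eps=0.05, min_points=20))
    print(f"{labels.max() + 1} candidate hold clusters")

    # Visualize to tune distance_threshold, eps, and min_points.
    o3d.visualization.draw_geometries([protrusions])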
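  • If you do need a scene mesh, a minimal Open3D sketch for regenerating one from a point-cloud scene (the normal-estimation radius and Poisson depth are assumptions to tune; inspect the result for the reconstruction errors mentioned above):

    import open3d as o3d

    pcd = o3d.io.read_point_cloud("Dense_scene_points.pcd")

    # Poisson reconstruction requires oriented normals.
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

    # depth controls resolution: higher preserves detail but amplifies noise.
    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)
    o3d.io.write_triangle_mesh("scene_mesh.ply", mesh)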
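  • A sketch of the projection chain from steps 1-3 (the exact text layout of lidar_p.txt and LiDAR2Cam_e.txt is not specified here, so the matrices are taken as already-parsed NumPy arrays; the function name is ours):

    import numpy as np

    def project_world_to_image(verts_w, world_from_lidar, cam_from_lidar, K):
        """Project (N, 3) world-frame points to pixel coordinates.

        world_from_lidar: 4x4 RT from lidar_p.txt, with T already
                          divided by 1000 (mm -> m).
        cam_from_lidar:   4x4 LiDAR-to-camera extrinsics (LiDAR2Cam_e.txt).
        K:                3x3 camera intrinsics (also in LiDAR2Cam_e.txt).
        """
        lidar_from_world = np.linalg.inv(world_from_lidar)
        verts_h = np.hstack([verts_w, np.ones((len(verts_w), 1))])  # homogeneous
        cam_pts = (cam_from_lidar @ lidar_from_world @ verts_h.T)[:3]
        uv = (K @ cam_pts) / cam_pts[2:3]      # perspective divide
        return uv[:2].T                        # (N, 2) pixel coordinates

    # Usage (replace with the parsed matrices and world-frame SMPL vertices):
    # uv = project_world_to_image(verts, world_from_lidar, cam_from_lidar, K)

    A (cx, cy, w, h) bbox converts to corner form as
    (cx - w/2, cy - h/2, cx + w/2, cy + h/2).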

    Citation

    @inproceedings{yan2023cimi4d,
      title={CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions},
      author={Yan, Ming and Wang, Xin and Dai, Yudi and Shen, Siqi and Wen, Chenglu and Xu, Lan and Ma, Yuexin and Wang, Cheng},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      pages={12977--12988},
      month={June},
      year={2023}
    }
    

    Commercial licensing

    Please email us:
    siqishen@xmu.edu.cn
    clwen@xmu.edu.cn
    cwang@xmu.edu.cn