# 3D Reconstruction meets Semantics
Part of the ECCV 2018 workshop [3D Reconstruction meets Semantics](http://trimbot2020.webhosting.rug.nl/events/3drms/) is a challenge on combining 3D and semantic information in complex scenes.
To this end, we release a challenging outdoor dataset captured by a robot driving through a semantically rich garden that contains fine geometric details.
A multi-camera rig mounted on top of the robot enables the use of both stereo and motion-stereo information.
Precise ground truth for the 3D structure of the garden has been obtained with a laser scanner, and accurate pose estimates for the robot are available as well.
Ground truth semantic labels and ground truth depth from a laser scan will be used for benchmarking the quality of the 3D reconstructions.
## Reconstruction Challenge
Given a set of images and their known camera poses, the goal of the challenge is to create a semantically annotated 3D model of the scene.
To this end, it will be necessary to compute depth maps for the images and then fuse them together (potentially while incorporating information from the semantics) into a single 3D model.
We provide the following data for the challenge:
* A set of synthetic training sequences consisting of
  * calibrated images with their camera poses,
  * ground truth semantic annotations for a subset of these images,
  * a semantically annotated 3D point cloud depicting the area of the training sequence.
* A set of synthetic testing sequences consisting of calibrated images with their camera poses.
* A set of real validation sequences consisting of calibrated images with their camera poses.
## Data
_IMPORTANT_: Please install [git lfs](https://git-lfs.github.com/) before cloning this repository to retrieve PLY files.
_NOTE_: Due to a bug in the GitLab server, valid PLY files are not included in ZIP archives downloaded via the web link. You can still download them individually via the web.
* File [`labels.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/labels.yaml) - semantic label definition list
* File [`colors.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/colors.yaml) - label color definition (for display)
* File [`calibration/camchain-DDDD.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/camchain-2017-05-16-09-53-50.yaml) - camera rig calibration (for real data)
### Training (Synthetic data)
| Sequence | frames | annotated frames |
| -------- | ------ | ----- |
| clear_0224 | 1000 | 500 |
| cloudy_0224 | 1000 | 500 |
| overcast_0224 | 1000 | 500 |
| sunset_0224 | 1000 | 500 |
| twilight_0224 | 1000 | 500 |
| _Total_ | 5000 | 2500 |
* File `model_RRRR_SSSS.ply` - point cloud of scene SSSS with semantic labels (field `scalar_s`) at resolution RRRR
  * Higher-resolution point clouds are available upon request (too large for this repository)
* Folders `EEEE_SSSS` - sequences rendered from scene SSSS in environment EEEE
  * Subfolders `vcam_X`
    * Files `vcam_X_fXXXXX_gtr.png` - GT annotation with label-set IDs (indexed bitmap)
    * Files `vcam_X_fXXXXX_undist.png` - color image (RGB, undistorted)
    * Files `vcam_X_fXXXXX_over.png` - overlay of the annotation on the greyscale image (for display)
    * Files `vcam_X_fXXXXX_cam.txt` - camera parameters (f, c, q, t)
    * Files `vcam_X_fXXXXX_dmap.bin` - depth map (binary matrix with image dimensions, single float in IEEE-BE format)
    * Files `vcam_X_fXXXXX_dmap.png` - depth map (visualization)
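For convenience, a minimal Python sketch for loading the annotation and depth-map files above. It assumes the stated formats: `_gtr.png` is an indexed (palette) bitmap whose pixel values are the label IDs, and `_dmap.bin` is a raw row-major matrix of IEEE big-endian `float32` values with the image dimensions; the file names are illustrative.

```python
import numpy as np
from PIL import Image

def load_labels(gtr_path):
    """Read an indexed _gtr.png; returns an (H, W) array of label IDs."""
    img = Image.open(gtr_path)   # expected mode 'P' (indexed bitmap)
    return np.asarray(img)

def load_depth(bin_path, width, height):
    """Read a _dmap.bin file ('>f4' = IEEE 754 big-endian float32)."""
    depth = np.fromfile(bin_path, dtype='>f4')
    return depth.reshape(height, width)  # row-major layout assumed

# Illustrative file names; dimensions taken from the matching color image.
labels = load_labels('vcam_0_f00001_gtr.png')
w, h = Image.open('vcam_0_f00001_undist.png').size
depth = load_depth('vcam_0_f00001_dmap.bin', w, h)
```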
#### Cameras
There are five camera pairs arranged in a pentagonal rig. The stereo pairs are `cam_0/cam_1`, `cam_2/cam_3`, `cam_4/cam_5`, `cam_6/cam_7`, `cam_8/cam_9`.
The pose format is `fx fy cx cy qw qx qy qz tx ty tz`, where `q` is the quaternion denoting the camera orientation and `t` is the camera translation.
The transformation from world to camera coordinates is given as `[R(q)|t]`, where `R(q)` is the rotation matrix corresponding to quaternion `q`.
The poses are also rendered in PLY files: single environments in `cams_EEEE_SSSS_fXXXX.ply` and all jointly in `cams_all_SSSS_fXXXX.ply`.
Points correspond to camera centers (inner circle of the rig) and viewing directions (outer circle).
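As an illustration, here is a minimal sketch of parsing a `*_cam.txt` file and applying the world-to-camera transform `[R(q)|t]` described above; it assumes the eleven values are stored whitespace-separated in the order `fx fy cx cy qw qx qy qz tx ty tz`.

```python
import numpy as np

def quat_to_rot(qw, qx, qy, qz):
    """Rotation matrix R(q) for a unit quaternion (w, x, y, z)."""
    return np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])

def load_cam(path):
    """Parse one camera file into intrinsics K, rotation R, translation t."""
    fx, fy, cx, cy, qw, qx, qy, qz, tx, ty, tz = np.loadtxt(path).ravel()
    K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]])
    return K, quat_to_rot(qw, qx, qy, qz), np.array([tx, ty, tz])

def project(X_world, K, R, t):
    """Map a 3D world point to pixel coordinates via [R(q)|t] and K."""
    X_cam = R @ X_world + t        # world -> camera coordinates
    u = K @ (X_cam / X_cam[2])     # perspective division, then intrinsics
    return u[:2]
```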
### Testing (Synthetic data)
| Sequence | frames |
| -------- | ------ |
| clear_0288 | 1000 |
| cloudy_0288 | 1000 |
| overcast_0288 | 1000 |
| sunset_0288 | 1000 |
| twilight_0288 | 1000 |
| _Total_ | 5000 |
* Folders `EEEE_SSSS` - sequences rendered from scene SSSS in environment EEEE
  * Subfolders `vcam_X`
    * Files `vcam_X_fXXXXX_undist.png` - color image (RGB, undistorted)
    * Files `vcam_X_fXXXXX_cam.txt` - camera parameters (f, c, q, t)
### Validation (Real data)
| Sequence | cameras | range | frames |
| -------- | ------- | ----- | ------ |
| test_around_garden | cam_0, cam_1, cam_2, cam_3 | 140:10:1480 | 268 |
* Subfolders `uvc_camera_cam_X`
  * Files `uvc_camera_cam_X_fXXXXX_undist.png` - undistorted color image (RGB)
  * Files `uvc_camera_cam_X_fXXXXX_cam.txt` - camera parameters (f, c, q, t)
* For `cam_1` and `cam_3`, no annotation is provided, i.e. the `_gtr` and `_over` files are missing and `_undist` is greyscale only.
## Evaluation
We will evaluate the following measures:
* Reconstruction accuracy in % for a set of distance thresholds (similar to [1,2])
* Reconstruction completeness in % for a set of distance thresholds (similar to [1,2])
* Semantic quality as the % of triangles that are correctly labeled
We will use distance thresholds of 1cm, 2cm, 3cm, 5cm, and 10cm.
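The official evaluation scripts are not part of this repository; the following sketch merely illustrates how accuracy and completeness at a threshold can be computed when both the reconstruction and the ground truth are treated as point clouds (`rec` and `gt` are placeholder arrays).

```python
import numpy as np
from scipy.spatial import cKDTree

def accuracy_completeness(rec, gt, tau):
    """Percent of rec points within tau of gt (accuracy) and of gt
    points within tau of rec (completeness); tau in meters."""
    acc = (cKDTree(gt).query(rec)[0] <= tau).mean() * 100
    comp = (cKDTree(rec).query(gt)[0] <= tau).mean() * 100
    return acc, comp

rec = np.random.rand(1000, 3)  # placeholder reconstructed points
gt = np.random.rand(1000, 3)   # placeholder ground-truth points
for tau in [0.01, 0.02, 0.03, 0.05, 0.10]:  # the thresholds above
    print(tau, accuracy_completeness(rec, gt, tau))
```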
#### References
* [1] Seitz et al., A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR 2006
* [2] Schöps et al., A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos, CVPR 2017
## Submission Categories
This year we accept submissions in several categories covering semantics and geometry, either jointly or separately.
For example, if your pipeline first computes semantics and geometry independently and then fuses them, we can compare how much the fused result improves accuracy.
Once you have created the results in one or more categories below, please follow [instructions on the website](http://trimbot2020.webhosting.rug.nl/events/3drms/challenge/) to submit them.
### A. Semantic mesh
To submit to the challenge, please create a semantically annotated 3D triangle mesh from the test and validation sequences.
* The mesh should be stored in the [PLY text format](http://paulbourke.net/dataformats/ply/).
* The file should store for each triangle a color corresponding to the triangle’s semantic class (see the [`calibrations/colors.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/colors.yaml) file for the mapping between semantic classes and colors).
* Semantic labels 'Unknown' and 'Background' are only used in 2D images and should not be present in the submitted 3D mesh, i.e. only label values 1-8 are valid.
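For reference, a minimal sketch of writing such a mesh in the ASCII PLY format with one color per triangle; the array shapes are assumptions, and the colors must follow the `colors.yaml` mapping.

```python
def write_semantic_ply(path, vertices, faces, face_colors):
    """vertices: (N, 3) floats; faces: (M, 3) vertex indices;
    face_colors: (M, 3) uint8 RGB, one class color per triangle."""
    with open(path, 'w') as f:
        f.write('ply\nformat ascii 1.0\n')
        f.write(f'element vertex {len(vertices)}\n')
        f.write('property float x\nproperty float y\nproperty float z\n')
        f.write(f'element face {len(faces)}\n')
        f.write('property list uchar int vertex_indices\n')
        f.write('property uchar red\nproperty uchar green\nproperty uchar blue\n')
        f.write('end_header\n')
        for x, y, z in vertices:
            f.write(f'{x} {y} {z}\n')
        for (a, b, c), (r, g, b2) in zip(faces, face_colors):
            f.write(f'3 {a} {b} {c} {r} {g} {b2}\n')
```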
### B. Geometric Mesh
Same as above, but a PLY mesh without semantic annotations.
### C. Semantic Image Annotations
Create a set of semantic image annotations for all views in the test sequences, using the same filename convention and PNG format as in the training part. Upload them in a single ZIP archive.
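A minimal sketch of saving a per-pixel label map as an indexed PNG follows; it assumes `colors.yaml` parses to a `{label_id: [r, g, b]}` mapping (the actual schema may differ).

```python
import numpy as np
import yaml
from PIL import Image

def save_annotation(labels, colors_yaml, out_path):
    """labels: (H, W) array of class IDs; writes an indexed PNG."""
    with open(colors_yaml) as f:
        colors = yaml.safe_load(f)  # assumed schema: {label_id: [r, g, b]}
    palette = [0] * 768             # 256 RGB triplets, default black
    for cid, rgb in colors.items():
        palette[3 * cid:3 * cid + 3] = rgb
    img = Image.fromarray(labels.astype(np.uint8), mode='P')
    img.putpalette(palette)
    img.save(out_path)
```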
## Contact
For questions and requests, please contact `rtylecek@inf.ed.ac.uk`.
## Credits
Dataset composed by @hale and @rtylecek.
Please report any errors via [issue tracker](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/issues/new) or via email to rtylecek@inf.ed.ac.uk.
### Acknowledgements
Production of this dataset was supported by EU project TrimBot2020.
Radim Tylecek, Torsten Sattler, Thomas Brox, Marc Pollefeys, Robert B. Fisher, Theo Gevers.
3D Reconstruction meets Semantics – Reconstruction Challenge, ECCV Workshops, September 2018.
URL: http://trimbot2020.webhosting.rug.nl/events/3drms/challenge/
## Appendix: Camera Rig Calibration
Excerpt of [`calibration/camchain-2017-05-16-09-53-50.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/camchain-2017-05-16-09-53-50.yaml), the camera rig calibration for the real data:

```yaml
camfix:
  rotation: [-0.3711, 0.0050, -0.9286, 0.0269]
  translation: [0.12, 0.015, -0.145]
cam0:
  cam_overlaps: [1, 2, 3, 8, 9]
  camera_model: pinhole
  distortion_coeffs: [-0.3587690871187596, 0.11078947299389359, -0.0021780792358551846, 0.0011356593244950483]
  distortion_model: radtan
  intrinsics: [543.6696561242189, 537.5408824639999, 380.0151007157061, 234.36556730449118]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_0/image_raw
cam1:
  T_cn_cnm1:
  - [0.9998527383095952, 0.017137738662598448, 0.0008942082137564826, -0.030880484313990136]
  - [-0.01714093820943339, 0.9998462287513096, 0.003702308614529358, -6.527514320719734e-05]
  - [-0.0008306215127587759, -0.003717090974042763, 0.9999927466249874, 0.00018972059463724382]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: [0, 2, 3, 8, 9]
  camera_model: pinhole
  distortion_coeffs: [-0.353144426187912, 0.10573328445472395, -0.002580762364064637, 0.0015643763611592502]
  distortion_model: radtan
  intrinsics: [538.3775780565093, 532.5535724693466, 371.25616245221073, 231.21357156768667]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_1/image_raw
cam2:
  T_cn_cnm1:
  - [0.31149151654244916, -0.008955800913476598, 0.9502067294815791, 0.0872829353960657]
  - [0.024766494567423795, 0.999692413651424, 0.00130339350448885, 0.003005051981533158]
  - [-0.9499261317960058, 0.023127293784270305, 0.3116175097989347, -0.05394550944575083]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: [0, 1, 3, 4, 5]
  camera_model: pinhole
  distortion_coeffs: [-0.3605185378853076, 0.10936954674509232, -0.0011088324128469991, -0.0006382486700628317]
  distortion_model: radtan
  intrinsics: [544.3804118498692, 535.9312835741073, 390.81200139738917, 277.588460638854]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_2/image_raw
cam3:
  T_cn_cnm1:
  - [0.9999710111137126, 0.005760671439309794, -0.004979116063980995, -0.029261320719829526]
  - [-0.005824399502668552, 0.9999000760733924, -0.012880770118605354, -5.78080083917376e-06]
  - [0.004904416646614305, 0.01290939708055193, 0.9999046425356589, -0.00046465688915452234]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: [0, 1, 2, 4, 5]
  camera_model: pinhole
  distortion_coeffs: [-0.3582955697364296, 0.10784625374915544, -0.0011753169412302125, -0.0010040787737917879]
  distortion_model: radtan
  intrinsics: [546.0203967139188, 537.7169111899459, 378.1028461045431, 278.09712634461846]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_3/image_raw
cam4:
  T_cn_cnm1:
  - [0.30044419537433736, -0.013617213839615833, 0.9537021846221735, 0.09044782290677211]
  - [0.013407574100583864, 0.9998595824238323, 0.01005248188110284, 0.00019429929490094445]
  - [-0.9537051548684202, 0.00976662287992749, 0.3005845815322484, -0.061156190749617846]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: [2, 3, 5, 6, 7]
  camera_model: pinhole
  distortion_coeffs: [-0.3479975244082098, 0.09933410234514833, -0.0027142736958379373, 0.00026979496805150355]
  distortion_model: radtan
  intrinsics: [533.4739221963415, 527.1088992238627, 379.0655341096242, 270.04278009985524]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_4/image_raw
cam5:
  T_cn_cnm1:
  - [0.9999367427960977, -0.004446100243154904, 0.010331630991170066, -0.028209322476417057]
  - [0.004422381245588988, 0.9999875358420928, 0.002317478059665759, 0.00027974464974978795]
  - [-0.010341805955854646, -0.0022716410513519266, 0.99994394177698, -0.0002968089722215753]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: [2, 3, 4, 6, 7]
  camera_model: pinhole
  distortion_coeffs: [-0.3426973060943818, 0.09567394656768138, -0.0016701255650504673, 0.0019889641475287294]
  distortion_model: radtan
  intrinsics: [538.5113849764855, 532.4898097922554, 401.1057053033241, 264.9341702656638]
  resolution: [752, 480]
  rostopic: /uvc_camera/cam_5/image_raw
cam6:
  T_cn_cnm1:
  - [0.32477675901267106, -0.011975076282617955, 0.9457148906267944, 0.08715597431777547]
  - [0.004965546829508832, 0.9999276490363629, 0.01095627844183401, -0.002686163318926537]
  - [-0.9457776695132466, 0.001137646973591924, 0.32481272390325805, -0.06318980746172721]
  - [0.0, 0.0, 0.0, 1.0]
```