# 3D Reconstruction meets Semantics 

Part of the ECCV 2018 workshop [3D Reconstruction meets Semantics](http://trimbot2020.webhosting.rug.nl/events/3drms/) is a challenge on combining 3D and semantic information in complex scenes. 
To this end, a challenging outdoor dataset, captured by a robot driving through a semantically-rich garden that contains fine geometric details, is released. 
A multi-camera rig is mounted on top of the robot, enabling the use of both stereo and motion stereo information. 
Precise ground truth for the 3D structure of the garden has been obtained with a laser scanner and accurate pose estimates for the robot are available as well. 
Ground truth semantic labels and ground truth depth from a laser scan will be used for benchmarking the quality of the 3D reconstructions.

## Reconstruction Challenge
Given a set of images and their known camera poses, the goal of the challenge is to create a semantically annotated 3D model of the scene. 
To this end, it will be necessary to compute depth maps for the images and then fuse them together (potentially while incorporating information from the semantics) into a single 3D model.

We provide the following data for the challenge:
* A set of synthetic training sequences consisting of
  * calibrated images with their camera poses,
  * ground truth semantic annotations for a subset of these images,
  * a semantically annotated 3D point cloud depicting the area of the training sequence.
* A set of synthetic testing sequences consisting of calibrated images with their camera poses.
* A set of real validation sequences consisting of calibrated images with their camera poses.

## Data

_IMPORTANT_: Please install [git lfs](https://git-lfs.github.com/) before cloning this repository to retrieve PLY files.

_NOTE_: Due to a bug in the GitLab server, valid PLY files are not included when the repository is downloaded via the ZIP web link. You can still download them individually via the web interface.

* File [`labels.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/labels.yaml) - semantic label definition list
* File [`colors.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/colors.yaml) - label color definition (for display)
* File [`calibration/camchain-DDDD.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/camchain-2017-05-16-09-53-50.yaml) - camera rig calibration (for real data)

### Training (Synthetic data)

| Sequence | frames | annotated frames |
| -------- | ------ | ----- | 
| clear_0224  |  1000  | 500 |
| cloudy_0224  |  1000  | 500 |
| overcast_0224  |  1000 | 500 |
| sunset_0224  |  1000 | 500 | 
| twilight_0224  |  1000 | 500 | 
| _Total_ | 5000 | 2500 | 

* File `model_RRRR_SSSS.ply` - point cloud of scene SSSS with semantic labels (field `scalar_s`) at resolution RRRR
    * Higher resolution point clouds are available upon request (too large for this repository) 
* Folders `EEEE_SSSS` - sequences rendered from scene SSSS in environment EEEE
* Subfolders `vcam_X`
    * Files `vcam_X_fXXXXX_gtr.png` - GT annotation with label set IDs (indexed bitmap)
    * Files `vcam_X_fXXXXX_undist.png` - color image (RGB, undistorted)
    * Files `vcam_X_fXXXXX_over.png` - overlay of annotation over greyscale image (for display)
    * Files `vcam_X_fXXXXX_cam.txt` - camera parameters (f,c,q,t)
    * Files `vcam_X_fXXXXX_dmap.bin` - depth map (binary matrix with image dimensions, single-precision floats in big-endian IEEE format)
    * Files `vcam_X_fXXXXX_dmap.png` - depth map (visualization)

#### Depth data

Matlab code to read `_dmap.bin` files:
```matlab
fd = fopen('training/clear_0001/vcam_0/vcam_0_f00001_dmap.bin','r');
A = fread(fd,[480 640],'single','ieee-be');
fclose(fd);
```
Python:
```python
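import numpy as np

# Big-endian single floats ('>f4'); column-major reshape matches the Matlab fread above.
with open('training/clear_0001/vcam_0/vcam_0_f00001_dmap.bin', 'rb') as f:
    x = np.fromfile(f, dtype='>f4', sep='')
    a = np.reshape(x, [480, 640], order='F')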
```

#### Cameras

There are five camera pairs arranged in a pentagonal rig. The stereo pairs are `cam_0/cam_1`, `cam_2/cam_3`, `cam_4/cam_5`, `cam_6/cam_7`, `cam_8/cam_9`. 

The pose format is `fx fy cx cy qw qx qy qz tx ty tz`, where `q` is the quaternion denoting the camera orientation and `t` is the camera translation. 
The transformation from world to camera coordinates is given as `[R(q)|t]`, where `R(q)` is the rotation matrix corresponding to quaternion `q`.
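
As a rough illustration of the pose format above, the following Python sketch parses the eleven values, builds `R(q)` and projects a world point into the image. It assumes that the values are whitespace-separated, that `q` is a unit quaternion in the Hamilton convention with the scalar part first, and that the undistorted images follow a standard pinhole model. The function names are hypothetical and not part of the dataset tools.

```python
import numpy as np

def quat_to_rot(qw, qx, qy, qz):
    """Rotation matrix R(q) for a quaternion (Hamilton convention, assumed)."""
    q = np.array([qw, qx, qy, qz], dtype=float)
    qw, qx, qy, qz = q / np.linalg.norm(q)  # normalize to guard against rounding
    return np.array([
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ])

def parse_pose(values):
    """Split 'fx fy cx cy qw qx qy qz tx ty tz' into intrinsics K, rotation R and translation t."""
    fx, fy, cx, cy, qw, qx, qy, qz, tx, ty, tz = map(float, values)
    K = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])
    return K, quat_to_rot(qw, qx, qy, qz), np.array([tx, ty, tz])

def project(K, R, t, X_world):
    """Map a 3D world point to pixel coordinates via x_cam = R(q) X + t."""
    X_cam = R @ X_world + t
    uvw = K @ X_cam
    return uvw[:2] / uvw[2]

def camera_center(R, t):
    """Camera center in world coordinates, i.e. the point that maps to the camera origin."""
    return -R.T @ t

# Hypothetical pose values, not taken from the dataset.
K, R, t = parse_pose("500 500 320 240 1 0 0 0 0 0 2".split())
pixel = project(K, R, t, np.array([0.0, 0.0, 0.0]))
```

Since `[R(q)|t]` maps world to camera coordinates, the camera center in world coordinates is `-R(q)^T t`, which is what `camera_center` returns.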

The poses are also rendered in PLY files: single environments in `cams_EEEE_SSSS_fXXXX.ply` and all environments jointly in `cams_all_SSSS_fXXXX.ply`. 
Points correspond to camera centers (inner circle of the rig) and viewing direction (outer circle).  

### Testing (Synthetic data)

| Sequence | frames | 
| -------- | ------ | 
| clear_0288  |  1000  | 
| cloudy_0288  |  1000  | 
| overcast_0288  |  1000 | 
| sunset_0288  |  1000 | 
| twilight_0288  |  1000 | 
| _Total_ | 5000 | 

* Folders `EEEE_SSSS` - sequences rendered from scene SSSS in environment EEEE
  * Subfolders `vcam_X`
    * Files `vcam_X_fXXXXX_undist.png` - color image (RGB, undistorted)
    * Files `vcam_X_fXXXXX_cam.txt` - camera parameters (f,c,q,t)


### Validation (Real data)

| Sequence | cameras | range | frames |
| -------- | ------- | ------ | ------- |
| test_around_garden  | cam_0, cam_1, cam_2, cam_3   | 140:10:1480 | 268 | 

* Subfolders `uvc_camera_cam_X`
    * Files `uvc_camera_cam_X_fXXXXX_undist.png` - undistorted color image (RGB)
    * Files `uvc_camera_cam_X_fXXXXX_cam.txt` - camera parameters (f,c,q,t)

* For `cam_1` and `cam_3` no annotation is provided, i.e. the `_gtr` and `_over` files are missing, and `_undist` is greyscale only.

## Evaluation

We will evaluate the following measures:
* Reconstruction accuracy in % for a set of distance thresholds (similar to [1,2])
* Reconstruction completeness in % for a set of distance thresholds (similar to [1,2])
* Semantic quality in % of the triangles that are correctly labeled.

We will use distance thresholds of 1cm, 2cm, 3cm, 5cm, and 10cm. 
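
The official evaluation procedure is not described in this README, so the following Python sketch only illustrates the general idea behind accuracy and completeness as used in [1,2], applied to point sets. It assumes coordinates in metres and uses a brute-force nearest-neighbour search; the function names are hypothetical.

```python
import numpy as np

# 1, 2, 3, 5 and 10 cm, assuming the models are expressed in metres.
THRESHOLDS_M = (0.01, 0.02, 0.03, 0.05, 0.10)

def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst (brute force)."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def accuracy_completeness(reconstruction, ground_truth, thresholds=THRESHOLDS_M):
    """Return {threshold: (accuracy %, completeness %)} for two (N, 3) point arrays."""
    d_rec = nearest_distances(reconstruction, ground_truth)  # reconstruction -> ground truth
    d_gt = nearest_distances(ground_truth, reconstruction)   # ground truth -> reconstruction
    return {
        thr: (100.0 * float(np.mean(d_rec <= thr)), 100.0 * float(np.mean(d_gt <= thr)))
        for thr in thresholds
    }

# Tiny made-up example: two reconstructed points, two ground truth points.
rec = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
gt = np.array([[0.0, 0.0, 0.015], [0.0, 0.0, 5.0]])
scores = accuracy_completeness(rec, gt)
```

Accuracy counts reconstructed points that lie close to the ground truth, while completeness counts ground truth points that are covered by the reconstruction. The brute-force search needs memory proportional to the product of both point counts, so a spatial index such as `scipy.spatial.cKDTree` would be a more practical choice for full-size point clouds.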


#### References

* [1] Seitz et al., A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR 2006
* [2] Schöps et al., A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos, CVPR 2017

## Submission Categories

This year we accept submissions in several categories: semantics and geometry, either joint or separate. 
For example, if you have a pipeline that first computes semantics and geometry independently and then fuses them, we can compare how the fused result improved accuracy.

Once you have created the results in one or more categories below, please follow the [instructions on the website](http://trimbot2020.webhosting.rug.nl/events/3drms/challenge/) to submit them.

### A. Semantic Mesh

In order to submit to the challenge, please create a semantically annotated 3D triangle mesh from the test sequence and validation sequence. 
* The mesh should be stored in the [PLY text format](http://paulbourke.net/dataformats/ply/). 
* The file should store for each triangle a color corresponding to the triangle’s semantic class (see the [`calibration/colors.yaml`](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/blob/master/calibration/colors.yaml) file for the mapping between semantic classes and colors). 
  * Semantic labels 'Unknown' and 'Background' are only for 2D images and should not be present in the submitted 3D mesh, i.e. only values 1-8 are valid.
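
A minimal sketch of writing such a file in Python is shown below. It stores one RGB color per face using the common `red`/`green`/`blue` face properties. The exact property names expected by the evaluation are not stated here, so treat the header layout as an assumption. The function name and the color in the example are placeholders; real colors should be taken from `colors.yaml`.

```python
import io

def write_semantic_ply(stream, vertices, triangles, face_colors):
    """Write an ASCII PLY triangle mesh with one RGB color per triangle."""
    if len(triangles) != len(face_colors):
        raise ValueError("need exactly one color per triangle")
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        f"element face {len(triangles)}",
        "property list uchar int vertex_indices",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    stream.write("\n".join(header) + "\n")
    for x, y, z in vertices:
        stream.write(f"{x} {y} {z}\n")
    # Each face line: vertex count, three vertex indices, then the class color.
    for (i, j, k), (r, g, b) in zip(triangles, face_colors):
        stream.write(f"3 {i} {j} {k} {r} {g} {b}\n")

# Single triangle with a placeholder color (not a value from colors.yaml).
buffer = io.StringIO()
write_semantic_ply(
    buffer,
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    triangles=[(0, 1, 2)],
    face_colors=[(10, 20, 30)],
)
```

To write to disk instead of a string buffer, pass a file opened in text mode, for example `open('mesh.ply', 'w')`.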

### B. Geometric Mesh
Same as above, but PLY mesh without semantic annotations.

### C. Semantic Image Annotations
Create a set of semantic image annotations for all views in the test sequences, using the same filename convention and PNG format as in the training part. Upload them in a single ZIP archive.
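
One possible way to assemble the archive in Python is sketched below. It assumes that the annotations are stored under a results folder mirroring the `EEEE_SSSS/vcam_X/` layout and use the `_gtr.png` suffix from the training data. The required archive layout is defined by the submission instructions on the website, so this structure and the function name are assumptions.

```python
import zipfile
from pathlib import Path

def zip_annotations(results_dir, archive_path, pattern="*_gtr.png"):
    """Pack all annotation PNGs under results_dir into one ZIP, keeping relative paths."""
    results_dir = Path(results_dir)
    files = sorted(results_dir.rglob(pattern))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store e.g. 'clear_0288/vcam_0/vcam_0_f00001_gtr.png' inside the archive.
            archive.write(path, arcname=path.relative_to(results_dir).as_posix())
    return len(files)
```

The function returns the number of files it packed, which gives a quick check against the expected number of views.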


## Contact

For questions and requests, please contact `rtylecek@inf.ed.ac.uk`.

## Credits

Dataset composed by @hale and @rtylecek.

Please report any errors via the [issue tracker](https://gitlab.inf.ed.ac.uk/3DRMS/Challenge2018/issues/new) or via email to rtylecek@inf.ed.ac.uk.

### Acknowledgements

Production of this dataset was supported by EU project TrimBot2020.

Please cite the following when using the dataset: 


    @techreport{tylecek2018rms,
      author={Radim Tylecek and Torsten Sattler and Thomas Brox and Marc Pollefeys and Robert B. Fisher and Theo Gevers},
      title={3D Reconstruction meets Semantics – Reconstruction Challenge},
      institution={ECCV Workshops}, 
      month={September},
      year={2018},
      URL={http://trimbot2020.webhosting.rug.nl/events/3drms/challenge/}
    }