The purpose of this toolbox is to evaluate global registration algorithms in the context of scene reconstruction. Local fragments are first constructed from short term segments of RGB-D scans using RGB-D odometry. Each fragment is represented as a point cloud $\mathbf{P}_{i}$. Its pose is denoted as a 4 × 4 transformation matrix $\mathbf{T}_{i}$. Consider a pair of non-consecutive fragments $(\mathbf{P}_{i}, \mathbf{P}_{j})_{(i+1 \lt j)}$. They may overlap in space and create a loop closure constraint although the transformation that aligns them cannot be directly computed from the odometry. Therefore, we need a global registration algorithm that tests non-consecutive fragment pairs and computes a transformation $\mathbf{T}_{ji}$ that aligns fragment $\mathbf{P}_{j}$ to fragment $\mathbf{P}_{i}$ if the alignment yields sufficient overlap.
This toolbox is designed to evaluate such global registration algorithms. An algorithm is required to take a set of fragments as input, and test all pairs of non-consecutive fragments $(\mathbf{P}_{i}, \mathbf{P}_{j})_{(i+1 \lt j)}$ — consecutive fragment pairs are not considered in this evaluation. If the registration algorithm decides that a pair of fragments can be aligned (our guideline is to have at least 30% overlap), it needs to output a transformation matrix $\mathbf{T}_{ji}$ to be evaluated. All the transformation matrices of successfully aligned fragment pairs are stored in a .log file (see the File Format section). The three numbers in the metadata of transformation matrix $\mathbf{T}_{ji}$ are: fragment index $i$, fragment index $j$, and the total number of fragments.
An example output of a global registration algorithm is as follows:
4 6 57
0.931813 -0.151833 0.329652 1.2445
-0.0433474 0.855229 0.516434 -0.78035
-0.36034 -0.49551 0.790332 0.711526
0 -0 -0 1
7 10 57
-0.0657651 -0.0242835 0.99754 1.44585
-0.983248 0.171894 -0.0606383 3.72015
-0.169998 -0.984816 -0.0351812 3.10725
0 0 0 1
1 3 57
0.554312 0.236264 -0.798071 1.46312
-0.2289 0.965163 0.126745 0.205479
0.800214 0.112423 0.589083 -1.43515
-0 -0 -0 1
7 11 57
0.977749 -0.0522724 0.20316 -0.0895188
0.202217 -0.0228047 -0.979075 2.13087
0.0558116 0.998373 -0.011727 -1.28311
0 0 0 1
......
Here are the fragments from four sequences that should be used for testing:
Scene | Fragments (PLY or PCD) | Evaluation File | Source | |
---|---|---|---|---|
Living Room 1 | PLY (390M) | PCD (394M) | ZIP (1M) | Dataset |
Living Room 2 | PLY (332M) | PCD (336M) | ZIP (1M) | |
Office 1 | PLY (282M) | PCD (286M) | ZIP (1M) | |
Office 2 | PLY (330M) | PCD (334M) | ZIP (1M) |
Code
For an example of complete evaluation on all four sequences refer to https://github.com/qianyizh/ElasticReconstruction/blob/master/Matlab_Toolbox/Example/demo_mrRegistrationEvaluation.m
The following Matlab script evaluates the registration result stored in "result.log".
gt = mrLoadLog( 'gt.log' );
gt_info = mrLoadInfo( 'gt.info' );
result = mrLoadLog( 'result.log' );
[ recall, precision ] = mrEvaluateRegistration( result, gt, gt_info );
Ground Truth
The ground truth data is created from synthesized data. Local fragments $\{\mathbf{P}_{i}\}$ are constructed within a 3m × 3m × 3m volume with 6mm resolution. The fragments are created from depth images synthesized with a noise model measured on a real PrimeSense camera. The fragments are fused using the ground truth trajectory. Since the synthesized depth images are affected by both noise and intrinsic camera distortion, the constructed fragments do not perfectly represent the true geometry. This is consistent with artifacts present in real data.
Let $\mathbf{T}^{*}_{i}$ be the ground truth pose of fragment $\mathbf{P}_{i}$, derived from the ground truth trajectory. Each pair of non-consecutive fragments $(\mathbf{P}_{i}, \mathbf{P}_{j})_{(i+1 \lt j)}$ is tested using the ground truth transformation $\mathbf{T}^{*}_{ji}=\mathbf{T}^{*-1}_{i}\mathbf{T}^{*}_{j}$ to see if their overlap is larger than 30% of $\min\{|\mathbf{P}_{i}|,|\mathbf{P}_{j}|\}$. To perform the evaluation efficiently, each fragment is uniformly downsampled at 5cm resolution. For each point $\mathbf{q}$ in the downsampled fragment $\mathbf{P}_{j}$, we look for the nearest point of $\mathbf{T}^{*}_{ji}\mathbf{q}$ in the downsampled $\mathbf{P}_{i}$, denoted as $\mathbf{p}$. If the distance between $\mathbf{T}^{*}_{ji}\mathbf{q}$ and $\mathbf{p}$ is less than 7.5cm, point pair $(\mathbf{p},\mathbf{q})$ is considered a ground-truth correspondence and is included in ${\mathcal K}^{*}_{ij}$. We define $(\mathbf{P}_{i}, \mathbf{P}_{j})$ to be a ground-truth loop closure if the size of ${\mathcal K}^{*}_{ij}$ is larger than 30% of the size of downsampled $\mathbf{P}_{i}$ or $\mathbf{P}_{j}$.
For each synthesized scan, we create a set of fragments $\mathbb{P}=\{\mathbf{P}_{i}\}$ and a set of ground-truth loop closures $\mathbb{L}^{*}$. Each loop closure has a ground-truth transformation $\mathbf{T}^{*}_{ji}$ and a ground-truth correspondence set ${\mathcal K}^{*}_{ij}$.
Evaluation
As described in the introduction, to be evaluated, a global registration algorithm needs to output a set of loop closures $\mathbb{L}$ with transformations $\{\mathbf{T}_{ji}\}$. A loop closure is a true positive if and only if it satisfies the following conditions:
- It is also a loop closure in $\mathbb{L}^{*}$, and
- $\mathbf{T}_{ji}$ brings correspondences in $ {\mathcal K}^{*}_{ij}$ sufficiently close. Specifically, we require the RMSE of the ground truth correspondences to be below a threshold $\tau=0.2$m: