HW3: 3D reconstruction¶

Late day image:

No description has been provided for this image

Q1: 8-point and 7-point algorithm (40 points)¶

(A1) F matrix using 8-point algorithm (15 points)¶

Given two images from the Co3D dataset, you need to implement the 8-point algorithm for estimating the fundamental matrix.

Submission

  • Brief explanation of your implementation.
  1. Normalize the input points (pts1 and pts2):

    • Apply scaling and translation to normalize the point coordinates. Compute the transformation matrices $ T $ and $ T' $:

    $$ \hat{\mathbf{p}} = T \mathbf{p}, \quad \hat{\mathbf{p'}} = T' \mathbf{p'} $$

  2. Setup the eight-point algorithm equation:

    • Form the linear system $ A \mathbf{f} = 0 $, where each row of $ A $ is derived from the epipolar constraint for each point pair:

    $$ u_i' (f_{11} u_i + f_{12} v_i + f_{13}) + v_i' (f_{21} u_i + f_{22} v_i + f_{23}) + (f_{31} u_i + f_{32} v_i + f_{33}) = 0 $$

  3. Solve for the least squares solution using SVD:

    • Solve $ A \mathbf{f} = 0 $ by performing Singular Value Decomposition (SVD) on $ A $. The solution for $ \mathbf{f} $ corresponds to the smallest singular value of $ A $:

    $$ A = U S V^{\top}, \quad \mathbf{f} = V_{\text{last column}} $$

  4. Enforce the singularity condition:

    • Use the function _singularize to enforce the rank-2 constraint on $ F $. This is done by setting the smallest singular value of $ F $ to zero.
  5. Refine the fundamental matrix:

    • Use the refineF function to iteratively refine the computed fundamental matrix $ F $ to better fit the point correspondences.
  6. Unscale the fundamental matrix:

    • Undo the normalization by applying the inverse transformations to the fundamental matrix:

    $$ F = T'^{\top} F T $$

This process results in the final fundamental matrix $ F $ computed from the given point correspondences.

  1. Epipolar lines: Show lines from fundamental matrix over the two images. See the following example figure:
Image
bench No description has been provided for this image
remote No description has been provided for this image
  • bench fundamental matrix:
[[-1.19265833e-07 -2.53255522e-06  2.07584109e-04]
 [-3.96465204e-07  2.27096267e-07  2.48867750e-02]
 [-1.06890748e-03 -2.30052117e-02  1.00000000e+00]]
  • remote fundamental matrix:
[[ 7.19006698e-07, -1.90583125e-07,  2.36215166e-03],
 [-1.88159055e-06, -8.60510729e-07, -8.45685523e-03],
 [-3.98058321e-03,  9.46500248e-03,  1.00000000e+00]]

(A2) E matrix using 8-point algorithm (5 points)¶

Given the estimated fundamental matrix F (from above) and intrinsic matrices K1 and K2 (that we provide as intrinsics.npz), you need to compute the essential matrix E.

Submission

  • Brief explanation of your implementation.

After getting the fundamental matrix $ F $ from the 8-point algorithm, the essential matrix $ E $ can be computed using the camera intrinsics $ K $ and $ K' $: $$ E = K'^{\top} F K $$

  • Provide your estimated E.
  • bench:
[[ -0.18528359  -3.93441194  -0.95959081]
 [ -0.61592238   0.35280189  30.82505761]
 [ -1.60139078 -30.55364717   0.44620549]]
  • remote:
[[  1.01667679  -0.26948489   2.9461563 ]
 [ -2.66057249  -1.21676375 -11.37262789]
 [ -5.82753537  10.6169466    0.40187929]]

(B) 7-point algorithm (20 points)¶

Since the fundamental matrix only has 7 degrees of freedom, it is possible to calculate F using only 7 point correspondences. This requires solving a polynomial equation. In this question, you will implement the 7-point algorithm.

Submission

  • Brief explanation of your implementation.
  1. Normalize the input points (pts1 and pts2):

    • Normalize the points by scaling and translating them to improve numerical stability. Compute the normalization transformation matrix $ T $ and $ T' $ :

    $$ \hat{\mathbf{p}} = T \mathbf{p}, \quad \hat{\mathbf{p'}} = T' \mathbf{p'} $$

  2. Setup the seven-point algorithm equation:

    • Form a linear system $ A \mathbf{f} = 0 $, where each row of $ A $ comes from the epipolar constraint for each of the seven point correspondences. The fundamental matrix $ F $ has 9 unknowns, but only 7 equations from 7 points. This leads to a two-dimensional null space, meaning we get two possible solutions:

    $$ u_i' (f_{11} u_i + f_{12} v_i + f_{13}) + v_i' (f_{21} u_i + f_{22} v_i + f_{23}) + (f_{31} u_i + f_{32} v_i + f_{33}) = 0 $$

  3. Solve for the least squares solution using SVD:

    • Perform Singular Value Decomposition (SVD) on the matrix $ A $ to find its null space. This results in a decomposition:

    $$ A = U S V^{\top} $$ The solution $ \mathbf{f} $ lies in the null space of $ A $, which is spanned by the last two right singular vectors.

  4. Pick the last two column vectors of $ V^{\top} $ (the two null space solutions $ f_1 $ and $ f_2 $):

    • Extract the last two columns of $ V^{\top} $, which correspond to two possible fundamental matrix solutions $ f_1 $ and $ f_2 $:

    $$ F_1 = V[:, -2], \quad F_2 = V[:, -1] $$

  5. Use the singularity constraint to solve for the cubic polynomial:

    • The actual fundamental matrix is a linear combination of $ F_1 $ and $ F_2 $. Let:

    $$ F = \alpha F_1 + (1 - \alpha) F_2 $$ Since the fundamental matrix must have rank 2, we enforce the singularity condition:

    $$ \text{det}(F) = \text{det}(\alpha F_1 + (1 - \alpha) F_2) = 0 $$ Expanding this determinant results in a cubic polynomial in $ \alpha $. Solve for $ \alpha $ using a root-finding method like np.polynomial.polynomial.polyroots, which can give either one or three real solutions.

  6. Unscale the fundamental matrices and return $ F $ as an array:

    • For each solution of $ \alpha $, compute the corresponding fundamental matrix $ F $. Finally, unscale the fundamental matrices using the inverse normalization matrices:

    $$ F = T'^{\top} F T $$ Return all valid fundamental matrices as an array.

  • Epipolar lines: Similar to the above, you need to show lines from fundamental matrix over the two images.
Image
hydrant No description has been provided for this image
ball No description has been provided for this image

Q2: RANSAC with 7-point and 8-point algorithm (20 points)¶

In some real world applications, manually determining correspondences is infeasible and often there will be noisy correspondences. Fortunately, the RANSAC method can be applied to the problem of fundamental matrix estimation.

Data

In this question, you will use the image sets released in q1a and q1b and calculate the F matrix using both 7-point and 8-point algorithm with RANSAC. The given correspondences corresp_noisy.npz consists potential inlier matches. Within each .npz file, the fields pts1 and pts2 are N × 2 matrices corresponding to the (x, y) coordinates of the N points in the first and second image respectively.

Hint

  • There are around 40-70% of inliers in the provided data.
  • Pick the number of iterations and tolerance of error carefully to get reasonable F.

Submission

  • Brief explanation of your RANSAC implementation and criteria for considering inliers.

Overview: RANSAC (Random Sample Consensus) is used to robustly estimate the fundamental matrix $ F $ by iteratively selecting random subsets of point correspondences and computing $ F $ that maximizes the number of inliers.

Steps of My Implementation:

  1. Random Sampling: Randomly select a subset of point correspondences (7 or 8 points) from $ \text{pts1} $ and $ \text{pts2} $.

  2. Compute Fundamental Matrix: Use the selected points to compute a candidate fundamental matrix $ F $ via the 7-point or 8-point algorithm.

  3. Compute Epipolar Distance: For each correspondence $ (\mathbf{p}_i, \mathbf{p}'_i) $, compute the epipolar constraint: $$ \mathbf{p'}_i^{\top} F \mathbf{p}_i = 0 $$ and calculate the distance between the points and their epipolar lines: $$ d_i = \frac{| \mathbf{p'}_i^{\top} F \mathbf{p}_i |}{\sqrt{F_{1,1}^2 + F_{1,2}^2}} \quad \text{and} \quad \frac{| \mathbf{p}_i^{\top} F^{\top} \mathbf{p}'_i |}{\sqrt{F_{2,1}^2 + F_{2,2}^2}} $$

  4. Inlier Criterion: Points are considered inliers if the squared distance $ d_i^2 $ is below a predefined threshold.

  5. Iterative Process: Repeat the above steps over multiple iterations, maximizing the number of inliers. After the iterations, recompute $ F $ using all inliers and return the final fundamental matrix.

  • Report your best solution and plot the epipolar lines -- show lines from fundamental matrix that you calculate over the inliers.
algorithm Inliers Image
bench 7-point 1080 No description has been provided for this image
[[ 9.12589426e-11 -9.25176400e-07 -3.26863716e-04]
 [-1.62804006e-06  8.18252401e-07  2.24850752e-02]
 [-6.21559155e-04 -2.12371159e-02  1.00000000e+00]]
algorithm Inliers Image
bench 8-point 1080 No description has been provided for this image
[[-1.61723371e-07 -3.47081782e-06  2.18034126e-04]
 [ 5.21790918e-07  6.17984694e-08  2.45279466e-02]
 [-1.06431753e-03 -2.26498389e-02  1.00000000e+00]]
algorithm Inliers Image
remote 7-point 47 No description has been provided for this image
[[ 1.49153926e-06 -3.01441304e-06  3.02911749e-03]
 [-1.77921621e-07 -1.72764728e-07 -9.19690639e-03]
 [-4.71380786e-03  1.01636776e-02  1.00000000e+00]]
algorithm Inliers Image
remote 8-point 46 No description has been provided for this image
[[-8.94700974e-06  6.66574370e-05 -6.52321708e-03]
 [-9.12214359e-05 -4.62806417e-05 -7.64409145e-02]
 [-9.06291964e-04  1.28135271e-01  1.00000000e+00]]
algorithm Inliers Image
hydrant 7-point 1594 No description has been provided for this image
[[ 1.12370790e-07 -4.02591395e-06 -1.28907307e-03]
 [ 5.55487867e-06  2.08854009e-08 -1.70094306e-02]
 [ 9.60503306e-04  1.67957275e-02  1.00000000e+00]]
algorithm Inliers Image
hydrant 8-point 1592 No description has been provided for this image
[[ 1.13106117e-07 -3.99512637e-06 -1.30380392e-03]
 [ 5.51985895e-06  2.51270098e-08 -1.69831856e-02]
 [ 9.78082855e-04  1.67649349e-02  1.00000000e+00]]
algorithm Inliers Image
ball 7-point 967 No description has been provided for this image
[[-2.24027463e-08 -3.05020992e-06 -6.75884687e-03]
 [ 3.63607723e-06 -9.32444360e-07 -1.51329974e-02]
 [ 9.13140772e-03  1.50536884e-02  1.00000000e+00]]
algorithm Inliers Image
ball 8-point 966 No description has been provided for this image
[[ 2.24385619e-09 -3.08823282e-06 -6.82130935e-03]
 [ 3.69338748e-06 -9.59930333e-07 -1.51329889e-02]
 [ 9.14890000e-03  1.50662478e-02  1.00000000e+00]]
  • Visualization (graph plot) of % of inliers vs. # of RANSAC iterations (see the example below). You should report such plots for both, the 7-pt and 8-pt Algorithms in the inner loop of RANSAC.
  • bench:

    No description has been provided for this image
  • remote:

    No description has been provided for this image
  • hydrant:

    No description has been provided for this image
  • ball:

    No description has been provided for this image

Q3: Triangulation (20 points)¶

Given 2D correspondences and 2 camera matrices, your goal is to triangulate the 3D points.

Submission

  • Brief explanation of your implementation.
  1. Form matrix $ A $ for every input point:

    • For each corresponding point $ \mathbf{p}_1 $ from pts1 and $ \mathbf{p}_2 $ from pts2, and the projection matrices $ P_1 $ and $ P_2 $, form the matrix $ A $ as:

    $$ A = \begin{bmatrix} p_1 \cdot P_1[2,:] - P_1[0,:] \\ p_1 \cdot P_1[2,:] - P_1[1,:] \\ p_2 \cdot P_2[2,:] - P_2[0,:] \\ p_2 \cdot P_2[2,:] - P_2[1,:] \end{bmatrix} $$

  2. Solve for the least square solution using SVD:

    • Solve the linear system $ A \mathbf{X} = 0 $ using Singular Value Decomposition (SVD) to compute the 3D point $ \mathbf{X} $ in homogeneous coordinates:

    $$ A = U S V^\top $$ The solution $ \mathbf{X} $ corresponds to the last column of $ V $.

  3. Convert to non-homogeneous coordinates:

    • Convert the 3D point from homogeneous coordinates $ \mathbf{X}_h $ to non-homogeneous coordinates:

    $$ \mathbf{X} = \frac{\mathbf{X}_h}{\mathbf{X}_h[3]} $$

  4. Visualize the 3D points:

    • After converting the points, visualize the 3D points $ \mathbf{X} $ by plotting them in a 3D space.
  • A colored point cloud as below:
No description has been provided for this image

Q4: Reconstruct your own scene! (20 points)¶

For this part, you will run an off-the-shelf incremental SfM toolbox such as COLMAP on your own captured multi-view images. Please submit a gif of the reconstructed 3d structure and the location of cameras.

For this reconstruction, you can choose your own data. This data could either be a sequence having rigid objects, any object (for e.g. a mug or a vase in your vicinity), or any scene you wish to reconstruct in 3D.

Submissions

  • Multi-view input images.
  • A gif to visualize the reconstruction of the scene and location of cameras (extrinsics).
  • Run this on at least 2 sequences / objects / scenes
Example Multi-view images Output
No description has been provided for this image No description has been provided for this image
No description has been provided for this image No description has been provided for this image